--- Log opened Wed Jul 16 00:00:21 2014 | ||
dalias | stekern, malloc-brk-fail is known to fail for static | 02:31 |
---|---|---|
stekern | dalias: oh, actually, it's only the static case that fails | 02:38 |
stekern | what's the reason for that? | 02:38 |
dalias | it's a limitation in __simple_malloc that gets used because there's no realloc or free | 02:57 |
stekern | ah | 03:02 |
stekern | did you see the note about microblaze bits/stat.h btw? | 03:03 |
dalias | when we made malloc work without brk, i forgot to do __simple_malloc too. but it's unlikely that static programs would have a brk issue anyway | 03:03 |
dalias | i might fix it later anyway tho | 03:03 |
dalias | yes i saw that | 03:03 |
dalias | but i think microblaze is working | 03:03 |
dalias | are you sure it uses asm-generic? | 03:03 |
dalias | i'll check it... | 03:04 |
stekern | if this doesn't mean it does, then I'm confused: http://lxr.free-electrons.com/source/arch/microblaze/include/uapi/asm/stat.h | 03:05 |
dalias | hm | 03:06 |
dalias | the test for microblaze passes... | 03:22 |
dalias | ahh i suspect the test does not catch the breakage | 03:23 |
stekern | which test? it was sem_open that broke for me | 03:24 |
dalias | src/functional/stat.c | 03:24 |
stekern | all offsets except st_ino are correct | 03:24 |
dalias | hmm it looks like sem_open probably succeeds or fails at random depending on junk with the wrong st_ino offset | 03:26 |
dalias | i can't get it to fail on microblaze with qemu user here | 03:26 |
stekern | interesting, I guess I was lucky that it failed for me then ;) | 03:30 |
dalias | :) | 03:31 |
stekern | what host are you on? | 03:31 |
dalias | ? | 03:35 |
stekern | nah, that can't be it... I was thinking the stat conversion in qemu-user would make the data always correct | 03:35 |
stekern | but I looked up how it's done, and it clears the target stat struct first | 03:36 |
stekern | ...so there shouldn't even be junk there... | 03:36 |
stekern | ah, actually... I was looking at the wrong place, microblaze qemu-user copies st_ino to both offsets... | 03:42 |
dalias | oh? haha | 03:43 |
dalias | why? | 03:43 |
stekern | http://git.qemu.org/?p=qemu.git;a=blob;f=linux-user/syscall.c;h=a50229d0d72fc68966515fcf2bc308b833a3c032;hb=HEAD#l4949 | 03:46 |
stekern | http://git.qemu.org/?p=qemu.git;a=blob;f=linux-user/syscall_defs.h;h=c9e6323905486452f518102bf40ba73143c9d601;hb=HEAD#l1469 | 03:46 |
stekern | no idea | 03:47 |
dalias | ..... | 03:48 |
dalias | qemu has it wrong | 03:48 |
dalias | they truncate the earlier one to 32 bits :/ | 03:48 |
dalias | despite it apparently being the correct one | 03:48 |
dalias | this looks really bad | 03:50 |
stekern | yeah, wonder where that has even come from? | 03:51 |
dalias | no idea | 03:55 |
dalias | emailing several lists about it | 03:57 |
dalias | shall i cc you? | 03:57 |
stekern | out of curiosity, I looked in the kernel's git log, and the kernel stat64 has never looked like that | 03:58 |
stekern | sure, it'd be interesting to hear if there is any explanation for it | 03:58 |
dalias | what address? | 03:58 |
stekern | stefan.kristiansson@saunalahti.fi | 03:59 |
dalias | i think we need to add an arch-specific __stat_fixup function... | 04:00 |
dalias | if nothing else, mips seems to still be broken | 04:00 |
dalias | mips idiotically has 32-bit dev_t still | 04:00 |
dalias | and there's padding for plenty more | 04:00 |
dalias | but the padding is situated such that when we define it as 64-bit in userspace, the lo/hi halves are backwards on big endian | 04:01 |
dalias | maybe there'll be a way to work around this nasty microblaze qemu/kernel mismatch too with such a function | 04:02 |
dalias | tho i doubt it | 04:02 |
dalias | stekern, i think the broken files on our side came from arm | 04:08 |
dalias | which is where many/most ports were initially forked from | 04:08 |
dalias | (for arm, this stat.h is right) | 04:09 |
stekern | ah, right. the initial commit of stat.h for microblaze is identical to the arm one | 04:11 |
stekern | the math errors are because I don't have a fenv implementation | 04:18 |
stekern | microblaze qemu-user was actually correct prior to this 'fix'... http://git.qemu.org/?p=qemu.git;a=commitdiff;h=a523eb06ec3fb2f4f4f4d362bb23704811d11379 | 05:01 |
maxpaln | Life is so much easier with console output :-) | 09:10 |
maxpaln | is there a safe way to write to the console during the boot sequence - printk() seems to cause exceptions depending on where it sits. | 09:10 |
stekern | maxpaln: printk *should* be safe in most places | 10:14 |
chan1 | hello, someone please help! | 12:24 |
chan1 | I was following http://openrisc.net/toolchain-build.html. | 12:24 |
chan1 | built the toolchain easy way, | 12:24 |
chan1 | then, am I supposed to go directly to building Busybox? (skipping install linux headers, stage 2 gcc, and uClibc) | 12:24 |
chan1 | That's what I did and I have an error building busybox. | 12:24 |
stekern | chan1: http://juliusbaxter.net/openrisc-irc/%23openrisc.2014-07-07.log.html#t13:22 | 12:27 |
maxpaln | stekern: Taken to using pr_info() - it seems safer, although I have to be honest, I am using it blindly - it was the way I output to the UART when debugging the Ethernet PHY drivers several months ago. | 12:27 |
stekern | maxpaln: pr_info() is just a wrapper to printk | 12:28 |
stekern | http://lxr.free-electrons.com/source/include/linux/printk.h#L226 | 12:28 |
maxpaln | Oh, odd - it seems to cause fewer exceptions! Oh well.... | 12:28 |
maxpaln | out of interest, I get periodic I-TLB Miss exceptions - I am not paying particular attention to these at the moment as they get handled safely. But are they indicative of deeper issues? | 12:29 |
stekern | maxpaln: I would guess it's just old mr Heisenbug that is visiting you ;) | 12:29 |
maxpaln | :-) | 12:29 |
stekern | TLB-miss exceptions are perfectly normal, and completely expected | 12:30 |
maxpaln | excellent - finally a break! :-) | 12:30 |
stekern | the TLB is just a pagetable cache, and the TLB-miss happens when the cache doesn't contain the pagetable entry for the requested address | 12:30 |
chan1 | stekern:oh thank you :-) | 12:35 |
stekern | you're welcome | 12:37 |
maxpaln | stekern: that's what I figured - but nice to have an assumption confirmed every now and again :-) | 12:45 |
chan1 | stekern : sorry but the link for the precompiled binary 1.0rc1 for CentOS-5.5 x86_64 doesn't work. I'll try installing from source. (someone please check why the link is dead..) | 12:55 |
chan1 | I mean in this page http://opencores.org/or1k/OpenRISC_GNU_tool_chain | 12:55 |
stekern | chan1: I was pointing you to: http://opencores.org/or1k/OpenRISC_GNU_tool_chain#Linux_.28uClibc.29_toolchain_.28or1k-linux-uclibc.29 | 13:16 |
stekern | not the old precompiled toolchain | 13:16 |
maxpaln | I am getting a little stuck - Linux is pausing during boot after the 'Mount-cache hash table entries: 1024' message | 13:26 |
maxpaln | tracing through I can see it gets stuck during the initialisation of something (not sure what the actual construct is) - the function call stack looks like this: | 13:28 |
maxpaln | start_kernel->rest_init->kernel_init->kernel_init_freeable->do_pre_smp_initcalls->do_one_initcall | 13:31 |
maxpaln | It enters this function twice - the first time to execute spawn_ksoftirqd() from address c01c17e0 - this one completes fine | 13:32 |
maxpaln | the second time it executes init_workqueues() from c01c1f38 | 13:33 |
maxpaln | tracing a combination of debug printk's and watching the instructions in HW I have traced the code as far as init_worker_pool() - unfortunately adding printk's into this function seems to hang the Linux boot earlier so I am back to tracing through HW | 13:34 |
maxpaln | Does anyone have a bird's-eye view on this - what is the kernel actually doing at this stage? | 13:35 |
maxpaln | It would be useful to have a broader appreciation as it might point me at the root cause a little quicker than my current strategy | 13:35 |
stekern | maxpaln, presumably there is still some hw bug that causes this, right? | 13:49 |
stekern | and you have random crashes when you add debug printks | 13:50 |
stekern | it's not fun, but the way I'd move forward in such cases is to take a build where it crashes in a non-subtle way and start inspecting that from the hardware side | 13:51 |
maxpaln | ah, so you think the crash from printk is causing this - interesting | 13:52 |
maxpaln | or at least - the boot hang with printk is indicative of a HW bug | 13:53 |
maxpaln | interesting, hadn't thought of that | 13:53 |
maxpaln | I agree the problem is likely to be HW | 13:53 |
maxpaln | but it could also be that I haven't correctly configured the Linux kernel per the HW | 13:53 |
maxpaln | I am pretty happy the memory controller is doing the right thing now | 13:54 |
maxpaln | and the basic ORPSOC is the same as the one I have previously had working on the ECP3 silicon (predecessor to the current silicon I am using) | 13:54 |
maxpaln | but there have been numerous minor changes | 13:55 |
maxpaln | I can print to the UART during boot - I am using that a lot at the moment | 13:55 |
maxpaln | I am getting hangs when placing the printk's at certain points - | 13:56 |
maxpaln | Inspecting on the HW side is straightforward - I am tracing through the instructions in HW, comparing against the disassembled kernel and using printks as a guide. It's working so far | 13:56 |
stekern | what are the certain points? | 14:02 |
stekern | and, does the same kernel work in or1ksim? | 14:03 |
maxpaln | yep, it simulates in or1ksim | 14:20 |
maxpaln | I haven't really been paying attention to the points at which the printk causes the boot to hang | 14:20 |
maxpaln | I hadn't really made the connection when it last happened | 14:20 |
maxpaln | I just assumed there was a genuine reason why printk wouldn't work during some functions | 14:21 |
maxpaln | I will try and add one to init_workqueues() - I have traced the code as far as here, I think this caused the problem last time I tried it. | 14:22 |
maxpaln | hmmm, that seems to work | 14:30 |
maxpaln | i'll return to using the printk's and take note of the location that causes problems when it arises again | 14:31 |
maxpaln | ok, I need to pop out for an hour or so - but I have tracked the system hang to init_workqueues - the following call never gets returned: | 14:38 |
maxpaln | system_unbound_wq = alloc_workqueue("events_unbound", WQ_UNBOUND, | 14:38 |
maxpaln | WQ_UNBOUND_MAX_ACTIVE); | 14:38 |
maxpaln | I am not sure I understand what this code is doing - will look a little later when I get back. | 14:38 |
rah | http://hardware.slashdot.org/story/14/07/16/1218238/sricambridge-opens-cheri-secure-processor-design | 14:42 |
olofk | rah: I think some of the guys from that project were at last year's orconf | 15:54 |
rah | is that good because these people with money are paying attention to openrisc, or bad because they went on and developed their own core anyway? :-) | 16:07 |
stekern | dalias: from my point of view the or1k port is pretty much in shape to be merged, how do you want me to move forward with it? Should I post a patch to the musl ml? | 17:44 |
stekern | I've already squashed the commits together and added a quite descriptive commit message: https://github.com/skristiansson/musl-or1k/commit/a937ef3a8e4dac07fbd4e7e7777aaa0552780dc0 | 17:45 |
dalias | i'll take a look | 17:49 |
dalias | you can go ahead and post to the mailing list tho if you like | 17:49 |
stekern | great | 17:49 |
stekern | sure | 17:49 |
blueCmd_ | olofk: stekern: http://bluecmd.github.io/ | 19:44 |
blueCmd_ | I'm playing with jekyll and github pages. the way it works is that you have a git repository with static files that is then used to generate a static webpage that github (or another provider) serves | 19:45 |
blueCmd_ | pros: it's git, so we can accept pull requests and stuff like that. cons: harder to edit | 19:45 |
blueCmd_ | pro: it's not opencores.net and it's not a wiki | 19:51 |
blueCmd_ | hm, who has all this money? http://opencores.org/donation | 19:55 |
stekern | http://opencores.org/donation,faq | 19:56 |
stekern | "then the money will used to upgrade the server hardware and Opencores" | 19:56 |
stekern | haven't you noticed the vast improvements? | 19:57 |
blueCmd_ | stekern: ah. I guess the downtime is due to all the upgrades they are making | 19:57 |
stekern | must be | 19:57 |
blueCmd_ | stekern: do you know if someone owns OpenCores as a trademark? | 20:00 |
blueCmd_ | it says on the website that it's a registered trademark, but I don't believe that | 20:01 |
stekern | I have no idea | 20:06 |
dalias | stekern, does or1k have fpu and fenv that will eventually be supported? or is it all soft-float? | 21:26 |
stekern | dalias: it does, but not all implementations have support for it | 21:29 |
dalias | i see | 21:29 |
dalias | so are there separate hard and soft float abis? | 21:29 |
dalias | or is the calling convention the same either way? | 21:29 |
stekern | yes, same calling convention | 21:30 |
dalias | nice | 21:30 |
stekern | there are no separate fpu regs and so on | 21:30 |
dalias | very very nice | 21:30 |
dalias | so we don't need two separate abi variant subarchs for that | 21:30 |
stekern | right | 21:30 |
dalias | btw what about endianness? just one or both? | 21:31 |
stekern | well, in theory, the architecture is bi-endian. but in practice, there are no little-endian implementations | 21:32 |
dalias | i see | 21:33 |
stekern | and while there is *some* little endian support in the toolchains, it's far from complete | 21:33 |
dalias | so for now we can just treat it as fixed big-endian | 21:33 |
dalias | if little is needed later it can be added as the non-default subarch | 21:33 |
stekern | yup | 21:33 |
dalias | so it looks like there's no need for any subarchs right now | 21:33 |
dalias | that makes adding it nice and clean :) | 21:33 |
dalias | how was the jmp_buf size issue handled? | 21:41 |
dalias | stekern, also, in __unmapself... | 21:42 |
dalias | you don't load any args for the syscalls. i'm guessing the args are already in the right registers for munmap | 21:42 |
dalias | but for SYS_exit, the arg should be 0 | 21:43 |
dalias | actually maybe it doesn't matter | 21:43 |
dalias | i don't think this code is reachable if the exiting thread is the last thread | 21:43 |
dalias | and the exit code to SYS_exit is only seen for the last exiting thread | 21:43 |
stekern | dalias: hmm, yes. looking at the other archs, it is a bit of a mix whether 0 is loaded as the arg | 21:52 |
stekern | regarding jmp_buf size, blueCmd_ and I discussed that and decided to change glibc to reflect what musl does | 21:59 |
blueCmd_ | (y) | 22:00 |
stekern | the glibc port is still in an experimental state | 22:00 |
blueCmd_ | I was very hard to convince | 22:00 |
blueCmd_ | at this point I think musl might be more stable | 22:00 |
dalias | :) | 22:07 |
dalias | stekern, why are the SYS_ macros not in the same order as the __NR_ ones? | 22:30 |
blueCmd_ | http://bluecmd.github.io/or1k.html - that turned out quite nice for a simple ODT -> HTML | 23:30 |
--- Log closed Thu Jul 17 00:00:23 2014 |
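
Note on the st_ino discussion (03:24 to 03:48): the sketch below tries to show why a libc header with the wrong st_ino offset could pass under qemu-user yet misbehave on a real kernel. Everything in it is an assumption made for illustration: the offsets, the buffer size, and the names `kernel_fill`, `emulator_fill`, and `libc_read_ino` are hypothetical and are not taken from the kernel, musl, or qemu sources. It also ignores the 32-bit truncation dalias mentions and simply writes the full value to both slots.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical offsets for a toy stat buffer; not a real ABI layout. */
enum { EARLY_INO_OFF = 8, LATE_INO_OFF = 24, DEMO_STAT_SIZE = 32 };

static void put_u64(unsigned char *buf, size_t off, uint64_t v)
{
    memcpy(buf + off, &v, sizeof v);
}

static uint64_t get_u64(const unsigned char *buf, size_t off)
{
    uint64_t v;
    memcpy(&v, buf + off, sizeof v);
    return v;
}

/* "Kernel" in this sketch: writes the inode only at the early offset.
   The rest of the buffer keeps whatever junk byte it was seeded with. */
static void kernel_fill(unsigned char *buf, uint64_t ino, unsigned char junk)
{
    memset(buf, junk, DEMO_STAT_SIZE);
    put_u64(buf, EARLY_INO_OFF, ino);
}

/* "Emulator" in this sketch: clears the buffer, then copies the inode
   to both offsets, as stekern describes for microblaze qemu-user. */
static void emulator_fill(unsigned char *buf, uint64_t ino)
{
    memset(buf, 0, DEMO_STAT_SIZE);
    put_u64(buf, EARLY_INO_OFF, ino);
    put_u64(buf, LATE_INO_OFF, ino);
}

/* A libc built with the wrong header reads the inode at the late offset. */
static uint64_t libc_read_ino(const unsigned char *buf)
{
    return get_u64(buf, LATE_INO_OFF);
}
```

With these toy functions, `libc_read_ino` returns the right inode after `emulator_fill` but only junk after `kernel_fill`, which would be consistent with the test passing for dalias under qemu while sem_open broke for stekern.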
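
Note on the mips dev_t remark (04:00 to 04:01): a rough sketch of the problem dalias describes, where the kernel stores a 32-bit device number followed by padding and userspace reads the same bytes as one 64-bit value. The helper names (`host_is_big_endian`, `read_dev_as_u64`, `demo_dev_fixup`) are hypothetical and this is not the `__stat_fixup` function proposed in the log; it only assumes the padding after the 32-bit field is zero.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 on a little-endian one. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Simulates the mismatch: the "kernel" writes a 32-bit dev followed by
   zeroed padding, and "userspace" reads those 8 bytes as one 64-bit value. */
static uint64_t read_dev_as_u64(uint32_t kernel_dev)
{
    unsigned char raw[8] = {0};
    uint64_t v;
    memcpy(raw, &kernel_dev, sizeof kernel_dev);
    memcpy(&v, raw, sizeof v);
    return v;
}

/* Hypothetical fixup: on big endian the 32-bit value landed in the high
   half of the 64-bit field, so swap the halves. Little endian is left alone. */
static uint64_t demo_dev_fixup(uint64_t v)
{
    if (host_is_big_endian())
        return (v >> 32) | (v << 32);
    return v;
}
```

On a little-endian host `read_dev_as_u64` already yields the original number and the fixup does nothing; on a big-endian host the raw read is shifted up by 32 bits and the swap brings it back. Whether a real arch-specific fixup would look like this is left open in the log.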