--- Log opened Thu Jul 10 00:00:12 2014 | ||
stekern | dalias: I wish I knew that ;) but this is what I'm observing: after the call to pthread_create(), I can see the parent arriving here: https://github.com/skristiansson/musl-or1k/blob/master/src/thread/or1k/clone.s#L19 | 02:18 |
---|---|---|
stekern | but then it seems the child never gets scheduled or something, since it never arrives at that same place | 02:19 |
dalias | stekern, how do you observe this? | 03:24 |
dalias | this isn't the first thread test, is it? | 03:24 |
dalias | so clone has already worked at least once... | 03:25 |
stekern | dalias: we have a feature in or1ksim that can dump the entire instruction trace, I turned that on at entry on main() in pthread_cancel.c and then I grep for the address of that instruction in clone.s | 03:42 |
stekern | and yes, the first test worked | 03:42 |
stekern | and I also tested commenting the first test out, just to sanity check that the second test wouldn't work when run as the first test | 03:43 |
stekern | it doesn't | 03:46 |
dalias | the syscall failing to return makes no sense... | 03:48 |
stekern | I agree | 03:48 |
stekern | it's of course entirely possible that this is caused by some latent kernel bug | 03:49 |
stekern | since obviously the circumstances have to be right for it to happen (since the first clone works) | 03:49 |
dalias | well if you can dump the entire instruction trace... | 03:50 |
dalias | what happens to the new thread in kernelspace? | 03:50 |
dalias | does it ever get scheduled? | 03:50 |
stekern | yes, the answer is probably in that trace, it's just a matter of digging through it ;) | 03:51 |
dalias | yes | 03:51 |
dalias | digging thru traces is no fun :/ | 03:52 |
stekern | I don't know if it ever gets scheduled, that's what I had planned on trying to find out next | 03:52 |
stekern | but, just looking at what the circumstances might be, the difference between the first and the second test is that the first waits for the child thread to start before calling pthread_cancel | 03:55 |
stekern | I've managed to compile strace against musl now too, that'll help debugging slightly | 04:02 |
stekern | this patch helped a lot in getting it to build: http://git.opensde.net/opensde/package-nopast/tree/base/musl/pkg/strace/strace-musl.patch | 04:04 |
stekern | I had done half of what's in it before I found it though :/ | 04:05 |
dalias | :-p | 04:25 |
dalias | i had forgotten opensde but yeah there are several places you should check for patches for stuff like that before spending time redoing it | 04:28 |
dalias | sabotage, alpine, ... | 04:29 |
stekern | ok, will do that in the future ;) | 04:37 |
dalias | going to sleep now. catch you later. let me know if you make any sense of this :-p | 05:10 |
stekern | dalias: sleep well, I'll let you know when I find something of interest | 05:16 |
stekern | olofk_: how do you convert that? | 06:22 |
stekern | I have opened it in quartus | 06:22 |
stekern | besides, I can't find anything but high level information in that? | 06:25 |
stekern | or is that just some interconnect thing? | 06:25 |
maxpaln | FYI, I have traced this to a bug in the memory controller - either my logic, the DDR3 IP or the memory itself. At the point of the fault being created, I see a write to memory of 0x0 then a read from the same address of 0x8701FFEC. | 14:54 |
maxpaln | thanks for your help in getting me to this point | 14:55 |
-!- guilherme is now known as Guest71251 | 15:18 | |
stekern | maxpaln: great that your headache paid off in the end! | 15:35 |
maxpaln | yeah, although now that I've found the bug fixing it might be tricky. If I'm lucky it will be someone else's :-) | 16:33 |
stekern | dalias: making baby step progress on the bug, if nothing else I'm starting to get fairly familiar with parts of the code in src/thread. | 20:34 |
stekern | the bug itself is the definition of a heisenbug, it disappears, or partly changes nature if I add debug printfs | 20:34 |
stekern | anyway, I have a direct question you might be able to answer | 20:35 |
stekern | this call to __timedwait() will cause a FUTEX_WAIT with the timeout argument as 0 (wait indefinitely) | 20:37 |
stekern | where do I find the FUTEX_WAKE for that? | 20:38 |
stekern | 'this call' = http://git.musl-libc.org/cgit/musl/tree/src/thread/pthread_join.c#n11 | 20:38 |
dalias | src/internal/futex.h i think | 21:02 |
dalias | btw is it possible that some of the args to clone are still being passed wrong? | 21:02 |
dalias | if the child_tid_ptr arg is wrong, that would of course prevent pthread_join from working | 21:05 |
stekern | anything is of course possible, but I've stared at those enough to believe that they should be correct now | 21:05 |
dalias | that's what makes the kernel futex_wake the tid address atomically with respect to the thread exiting | 21:06 |
stekern | yes | 21:08 |
stekern | this is what a strace'd clone syscall looks like: clone(child_stack=0x300e9d18, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|0x400000, parent_tidptr=0x300e9d3c, tls=0x300e9de4, child_tidptr=0x300e9d3c) = 52 | 21:11 |
dalias | yeah that looks right | 21:13 |
stekern | ah, here's where the FUTEX_WAKE comes from: http://lxr.free-electrons.com/source/kernel/fork.c#L794 | 21:31 |
dalias | yes | 21:41 |
dalias | sorry i misread your question :( | 21:41 |
dalias | <stekern> where do I find the FUTEX_WAKE for that? | 21:41 |
dalias | i misread that as FUTEX_WAIT and assumed you were just asking where the constant was defined | 21:41 |
dalias | i could have told you it was in kernel/fork.c right away :) | 21:41 |
stekern | heh, I'm not completely incompetent with grep ;) | 21:42 |
blueCmd_ | stekern: jeremybennett_: https://github.com/bluecmd/or1ksim/commit/8ccb1f1677402e9103322ecba60d0370cea2bded.patch for loopback support for or1ksim | 22:15 |
--- Log closed Thu Jul 10 22:32:56 2014 | ||
--- Log opened Thu Jul 10 22:33:16 2014 | ||
-!- Irssi: #openrisc: Total of 25 nicks [0 ops, 0 halfops, 0 voices, 25 normal] | 22:33 | |
-!- Irssi: Join to #openrisc was synced in 23 secs | 22:33 | |
--- Log closed Fri Jul 11 00:00:14 2014 |
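
Editor's note on the strace'd clone() call quoted at 21:11: strace left one flag bit undecoded (0x400000). The sketch below decodes the whole flag word. It assumes the flag values from Linux's include/uapi/linux/sched.h, where 0x00400000 is CLONE_DETACHED. The helper name `decode_clone_flags` and the constant `STRACE_FLAGS` are hypothetical, and 0x7D0F00 is simply the bitwise OR of the ten flags shown in that line, not a value taken from the log.

```python
# Clone flag values as defined in Linux's include/uapi/linux/sched.h.
# Only the flags that appear in the strace line are listed here.
CLONE_FLAGS = {
    0x00000100: "CLONE_VM",
    0x00000200: "CLONE_FS",
    0x00000400: "CLONE_FILES",
    0x00000800: "CLONE_SIGHAND",
    0x00010000: "CLONE_THREAD",
    0x00040000: "CLONE_SYSVSEM",
    0x00080000: "CLONE_SETTLS",
    0x00100000: "CLONE_PARENT_SETTID",
    0x00200000: "CLONE_CHILD_CLEARTID",
    0x00400000: "CLONE_DETACHED",
}


def decode_clone_flags(flags):
    """Return (names, leftover).

    names    -- known flag names set in `flags`, in ascending bit order
    leftover -- any bits of `flags` that are not in CLONE_FLAGS
    """
    names = [name for bit, name in sorted(CLONE_FLAGS.items()) if flags & bit]
    known = sum(bit for bit in CLONE_FLAGS if flags & bit)
    return names, flags & ~known


# Hypothetical constant: the OR of every flag in the strace line,
# including the raw 0x400000 that strace printed as a number.
STRACE_FLAGS = 0x7D0F00

if __name__ == "__main__":
    names, leftover = decode_clone_flags(STRACE_FLAGS)
    print("|".join(names), hex(leftover))
```

A nonzero `leftover` would point at a flag bit outside this table, which is the kind of mismatch dalias was asking about when he wondered whether some clone arguments were still being passed wrong.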