--- Log opened Fri Jul 11 00:00:14 2014 | ||
-!- Netsplit *.net <-> *.split quits: jonmasters, rah, heroux, Amadiro, ysionneau, xlro, fotis2, zama, arokux, olofk_, (+21 more, use /NETSPLIT to show all of them) | 01:27 | |
-!- Netsplit over, joins: LoneTech, jeremy_bennett, rah, rokka, xlro, trevorman, olofk_, simoncook, FreezingCold, stekern (+21 more) | 02:08 | |
olofk_ | maxpaln: Hi! Finally caught up with you :) | 13:51 |
---|---|---|
maxpaln | Hi | 14:16 |
maxpaln | olofk_: How can I help? | 14:17 |
olofk_ | maxpaln: I wanted to invite you to the yearly OpenRISC conference, but I realized that I don't have any contact information | 14:20 |
maxpaln | aha - you can send it via email to matt.holdsworth@latticesemi.com | 14:20 |
maxpaln | is email good enough? | 14:20 |
olofk_ | Perfect. Thank you | 14:21 |
maxpaln | great - when is it? | 14:21 |
maxpaln | or perhaps I should wait for the invite [the suspense is too much :-)] | 14:21 |
olofk_ | HAha | 14:21 |
olofk_ | October 11-12 in Munich | 14:22 |
maxpaln | Great - I look forward to it. I will almost certainly be there. | 14:22 |
olofk_ | Excellent! Happy to hear | 14:22 |
maxpaln | Bad news BTW - it is my bug :-( I was hoping to pass this onto someone else!!!! | 14:24 |
olofk_ | Ahh.. the most common kind of bug unfortunately :( | 14:26 |
maxpaln | yeah, it's almost as if logic simulation isn't as thorough as running in HW! :-) Oh, well - this is definitely a job for next week. Have a good weekend all! | 14:27 |
olofk_ | You too! | 14:28 |
stekern | dalias: I've found out what caused the pthread_cancel failure. | 17:00 |
stekern | actually, there were two things | 17:00 |
stekern | the first is a gcc bug causing this code: http://git.musl-libc.org/cgit/musl/tree/src/stdio/__lockfile.c#n9 | 17:00 |
stekern | to be turned into this: http://pastie.org/9378579#29-36 | 17:00 |
stekern | notice that it just loops depending on r14 | 17:01 |
stekern | I'll need to dig into why gcc does that, instead of this ugly local work-around I have: http://pastie.org/9378635 | 17:03 |
stekern | the second problem has to do with how our linux port handles the sig return syscall, it's checking for pending signals, so it would just loop around in the cancel_handler signal handler (since the cancel signal always was pending) | 17:04 |
-!- Netsplit *.net <-> *.split quits: _franck_, ams, heroux, ssvb | 17:41 | |
-!- heroux_ is now known as heroux | 17:41 | |
dalias | stekern, it needs to check against the correct signal mask | 18:03 |
dalias | musl masks the signal in the signal handler via modifying the ucontext_t | 18:04 |
dalias | so maybe your definition of the ucontext_t is wrong | 18:04 |
dalias | or maybe something is ignoring this signal mask | 18:04 |
dalias | as for the __lockfile loop, my guess is that you have a wrong or missing constraint in the asm | 18:05 |
dalias | in atomic.h | 18:05 |
dalias | but i don't see any obvious mistakes | 18:06 |
dalias | oh... | 18:09 |
dalias | yes it's buggy | 18:10 |
dalias | you can't use "r"(p) | 18:10 |
dalias | you have to use "m"(*p) | 18:10 |
dalias | and it should be output+input i think | 18:11 |
dalias | alternatively volatile asm can be used; i think that's also safe | 18:11 |
dalias | but accessing "m"(*p) is more semantically correct | 18:11 |
stekern | oh, I remembered that I would have had a volatile there, but I don't | 18:20 |
stekern | dalias: I'm not sure I follow what you mean, doesn't the signal only get masked once you've hit a cancelation point? | 18:26 |
dalias | the signal handler re-raises its own signal, but also sets the bit in the saved signal mask in the ucontext_t | 18:27 |
dalias | this is necessary for proper behavior with nested signal handlers | 18:27 |
dalias | e.g. if the main flow of execution is at a cancellation point and interrupted by a signal handler, then the cancellation signal interrupts the signal handler | 18:27 |
dalias | cancellation cannot be acted on immediately (unless the signal handler is also at a cancellation point) | 18:28 |
dalias | but it needs to be acted on when the signal handler returns to the main flow of execution | 18:28 |
ysionneau | dalias: how can you say that p is input and output? | 18:29 |
ysionneau | you put it twice? | 18:29 |
dalias | there's a form for input+output, i think it's a + sign or something | 18:30 |
dalias | i forget the constraint syntax but it's in the gcc manual | 18:30 |
dalias | however as long as it's volatile having it as both input and output is non-essential | 18:30 |
dalias | output-only seems to work on other archs | 18:31 |
dalias | but input+output would be preferable semantically | 18:31 |
ysionneau | in gcc manual I see an example with a variable being output and input | 18:31 |
ysionneau | it seems they just put it twice | 18:31 |
ysionneau | the "m" constraint is for memory addresses, right? not registers | 18:33 |
ysionneau | I think the l.swa and l.lwa use register and not addresses so I don't understand why you say he should use "m"(*p) ? | 18:34 |
dalias | they take memory addresses | 18:37 |
dalias | of an object in memory that they act upon | 18:37 |
dalias | i'm not sure what the rules are for "m" expressions on or1k | 18:37 |
ysionneau | so this is supposed to be semantic and not syntactic? | 18:38 |
ysionneau | I mean, syntactically, it's not a memory address but a register that the opcode is operating on | 18:38 |
dalias | if "m" allows some advanced addressing expressions that aren't valid for l.swa and l.lwa, there should be a separate arch-specific "m" variant that guarantees the address will come in a single register | 18:38 |
ysionneau | but indeed semantically it's operating on a memory address | 18:38 |
dalias | if all memory address expressions are single registers, "m"(*p) and "r"(p) should be equivalent except that the former tells gcc that an object at the address is being accessed, rather than the value of the pointer being used purely as a value | 18:39 |
dalias | this affects the types of transformations (moving and duplicating or deduplicating the asm) are valid | 18:40 |
dalias | erm that sentence was ungrammatical, but hopefully it made sense | 18:41 |
ysionneau | which are valid <= ? | 18:45 |
ysionneau | I never used the "m" constraint so far, won't the "*p" be replaced by the value at address pointed to by "p" ? | 18:49 |
ysionneau | I mean, what's the purpose of the '*' here? | 18:51 |
dalias | no | 19:13 |
dalias | "r"(*p) would load the value at address p into a register and pass it to the asm | 19:13 |
dalias | "m"(*p) results in the asm argument being an address expression that refers to the object *p | 19:14 |
dalias | for x86 this can include advanced address expressions like 16(%eax,%ecx,4) | 19:14 |
dalias | i'm not sure what it can include for or1k | 19:14 |
dalias | basically the %n corresponding to "m" operands should expand to a string that's valid for the source/dest for a load/store instruction | 19:16 |
dalias | (or, on archs like x86 where other instructions can access memory directly, those instructions too) | 19:16 |
ysionneau | dalias: ok thanks for the explanation :) | 19:23 |
ysionneau | dalias: so if I understand correctly, if you have "int a;" you can just put "m"(a) and you don't need any &a at all, right? | 19:24 |
dalias | right | 19:24 |
ysionneau | whereas with "r" you would need "r"(&a) | 19:24 |
dalias | if you wanted the address, right | 19:24 |
ysionneau | ok, thanks :) | 19:24 |
dalias | anyway "r"(&a) would likely be broken | 19:24 |
ysionneau | yep as I understand | 19:24 |
dalias | since the compiler might move loads/stores to the object across the asm | 19:24 |
dalias | making the asm volatile and adding a "memory" clobber _probably_ avoids this | 19:25 |
ysionneau | hummm there is still something I don't get | 19:25 |
dalias | but imo using "m" is a nicer way of representing the fact that the asm accesses the object | 19:25 |
ysionneau | since "r"(&a) puts the address of a into a register | 19:25 |
ysionneau | and since this address anyway cannot change (only the value can change) | 19:26 |
ysionneau | I don't get how it can be broken | 19:26 |
dalias | example | 19:26 |
dalias | int a, b; | 19:26 |
dalias | a = 42; | 19:26 |
stekern | dalias: fwiw, the powerpc port has probably the same problem. (I probably used that as a template) | 19:27 |
dalias | __asm__ ("load %0, %1" : "=r"(b) : "r"(&a)); | 19:27 |
dalias | printf("%d\n", b); | 19:27 |
dalias | stekern, ok thanks i'll check it out | 19:27 |
dalias | ysionneau, in that example, the compiler has no reason it can't move the store a=42; across the asm | 19:28 |
dalias | in fact it can possibly even eliminate it entirely | 19:28 |
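The two ways of telling the compiler about the access that come up in this discussion can be sketched in portable form. This is only an illustration, assuming GCC or Clang extended asm: the empty asm templates stand in for a real load such as l.lwa, and the function names are made up for the example.

```c
/* Sketch only: GCC/Clang extended asm. The empty templates emit no
   instruction; they are placeholders for a real load/store. */

/* Variant 1: the address is passed as a plain value in a register.
   The volatile qualifier and the "memory" clobber are what tell the
   compiler that memory may be accessed, so the store a = 42 should
   not be moved past the asm or dropped. */
static int read_with_memory_clobber(void)
{
    int a = 42;
    __asm__ __volatile__ ("" : : "r"(&a) : "memory");
    return a;
}

/* Variant 2: "m"(a) names the object itself as an input operand, so
   the compiler knows the asm reads a and must have stored 42 first.
   No & is needed; the operand is the object, not its address. */
static int read_with_m_operand(void)
{
    int a = 42;
    __asm__ __volatile__ ("" : : "m"(a));
    return a;
}
```

Both functions return 42 here because the placeholder asm does nothing; the point is the operand lists, not the result.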
ysionneau | ok, but if you put volatile | 19:28 |
ysionneau | it's forbidden, right? | 19:28 |
dalias | i think so, but i don't really understand the semantics of volatile asm | 19:28 |
ysionneau | to me volatile means there can be no optimization trespassing the barrier of the volatile block | 19:28 |
ysionneau | ah maybe I'm wrong here | 19:30 |
ysionneau | https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile | 19:30 |
ysionneau | dalias: so I'm safe using "r"(&a) only if I manipulate it as an address and never dereference it | 19:30 |
ysionneau | for instance I can add an offset to it, and store it to another pointer etc | 19:31 |
dalias | ysionneau, yeah, i think so. this is very confusing imo, and i'm not sure why glibc did it that way... | 19:31 |
ysionneau | ok, here I stop bothering you with that ;) | 19:32 |
ysionneau | thanks again | 19:32 |
dalias | imo having a "memory" clobber and access to addresses should make gcc treat the asm as potentially accessing the memory... | 19:32 |
stekern | and just for the record, both volatile and "m"(*p) works | 19:32 |
dalias | but it's not defined that way, or else this is just a big gcc bug that's been around for a long time on many archs... :/ | 19:32 |
dalias | stekern, is it possible to use "+m" or whatever the notation for "input+output" is ? | 19:33 |
stekern | it's not complaining about it at least | 19:34 |
dalias | :) | 19:34 |
dalias | we should definitely check out the ppc case. it's likely broken too on newer gcc at least | 19:35 |
stekern | but the m without the + was already enough to make the bug in __lockfile go away, so I can't say if it has any extra effect ;) | 19:35 |
dalias | since it's volatile i think the + is mostly extraneous | 19:36 |
dalias | but i'd rather be explicit | 19:36 |
stekern | you mean since p is volatile? yeah, maybe | 19:36 |
dalias | *nod* | 19:40 |
dalias | one reason i'd rather not rely on that though is that i'm not clear on the meaning of volatile-qualified pointers when the underlying object is not volatile | 19:41 |
dalias | e.g. if i do | 19:41 |
dalias | int x; volatile int *p = &x; ... use *p ... | 19:41 |
dalias | does the compiler have to make accesses to *p as volatile? or can it use the fact that it knows that the pointed-to object is non-volatile to optimize? | 19:42 |
dalias | imo it's a matter of whether the qualification of the lvalue through which the object is accessed is what matters.... | 19:42 |
dalias | ...or whether the qualification of p as pointer-to-volatile just means that it's allowed to point to either volatile or non-volatile objects | 19:43 |
dalias | the latter is how const-qualified pointers work, which is what makes me suspect the same may be true for volatile | 19:43 |
dalias | but i've never found an authoritative answer one way or the other on the matter | 19:44 |
stekern | yeah, I'm not sure either | 19:54 |
stekern | not that my answer would have been anywhere near authoritative either ;) | 19:57 |
dalias | :) | 20:01 |
dalias | so did all the failures go away now? :) | 20:02 |
stekern | I need to read what you said in the backlog and see if I have misunderstood something about the cancellation signal, but I patched the kernel to not just check for pending signals in the sig_return handler and pthread_cancel passes with that | 20:04 |
stekern | so I'm at least on the right track there. | 20:04 |
stekern | when I'm done with that, there are some IPC failures as well (and I haven't run all tests yet, so there might be more to play with after that) | 20:05 |
dalias | stekern, did you check if ucontext_t is defined right? | 20:09 |
dalias | i don't think your kernel patch is right, but it's possible there's a kernel bug still | 20:09 |
dalias | can you point me to the relevant kernel code? | 20:10 |
stekern | yeah, I know my kernel patch isn't right, I just did that to check my theory | 20:11 |
dalias | *nod* | 20:12 |
dalias | signal.c has if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set))) | 20:13 |
dalias | which looks right | 20:13 |
dalias | so i wonder if musl's idea of the address is wrong | 20:13 |
dalias | ok it's possibly a kernel bug or possibly just a clash in interpretation of types | 20:14 |
stekern | let's investigate that | 20:14 |
dalias | mcontext_t (sigcontext) has an oldmask member | 20:14 |
dalias | and there's also the sigset_t in the containing ucontext_t object | 20:15 |
dalias | maybe oldmask is something else other than a signal mask? | 20:15 |
dalias | it's only 32 bits so it can't be a signal mask | 20:15 |
dalias | oldmask seems entirely unused | 20:19 |
dalias | so i doubt that's the issue | 20:19 |
dalias | what you might do is set a weird signal mask (e.g. put some ascii text in it ;-) | 20:20 |
dalias | and then check the uc_sigmask from a signal handler to see if the mask matches what you expect | 20:20 |
dalias | all i can figure is that maybe some type is defined wrong such that the uc_sigmask offset in ucontext_t is wrong in musl | 20:20 |
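The diagnostic suggested here (install an unusual signal mask, then inspect uc_sigmask from inside a handler) might look roughly like the following. It is a minimal sketch using only standard POSIX calls; the choice of signals and the names handler, expected and check_saved_mask are assumptions made for the example, not code from musl or the kernel.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

static sigset_t expected;                      /* the "weird" mask we install */
static volatile sig_atomic_t mask_matches = -1;

/* Compare the mask saved in the ucontext_t with the mask that was
   active when the signal arrived. */
static void handler(int sig, siginfo_t *si, void *ctx)
{
    ucontext_t *uc = ctx;
    int ok = 1;
    (void)sig; (void)si;
    for (int s = 1; s < 32; s++) {
        if (s == SIGKILL || s == SIGSTOP) continue;   /* cannot be blocked */
        if (sigismember(&uc->uc_sigmask, s) != sigismember(&expected, s))
            ok = 0;
    }
    mask_matches = ok;
}

/* Returns 1 if uc_sigmask matched, 0 if it did not, -1 on setup failure. */
static int check_saved_mask(void)
{
    struct sigaction sa;
    sigset_t old;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, 0)) return -1;

    sigemptyset(&expected);
    sigaddset(&expected, SIGHUP);
    sigaddset(&expected, SIGUSR2);
    sigaddset(&expected, SIGWINCH);
    if (sigprocmask(SIG_SETMASK, &expected, &old)) return -1;

    raise(SIGUSR1);                            /* handler runs before raise returns */
    sigprocmask(SIG_SETMASK, &old, 0);
    return mask_matches;
}
```

If the libc's idea of the uc_sigmask offset disagrees with the kernel's, a check like this would be expected to report a mismatch.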
stekern | yes, and I understand how cancel_handler works now (at last too) ;) | 20:23 |
stekern | I'll take a deeper look at what you suggested | 20:24 |
stekern | btw, another question: our CANCEL_REG_IP will point at the instruction *after* the l.sys instruction, won't that break this? https://github.com/skristiansson/musl-or1k/blob/master/src/thread/cancel_impl.c#L49 | 20:25 |
stekern | just adding a l.nop after the l.sys would fix that if that's the case I guess | 20:26 |
dalias | *sigh* i found the problem | 20:28 |
dalias | the uapi headers are wrong and don't match the actual api the kernel uses... | 20:28 |
dalias | at least that seems to be the case... just a sec | 20:28 |
dalias | maybe not | 20:29 |
stekern | https://github.com/skristiansson/musl-or1k/blob/master/src/thread/or1k/syscall_cp.s#L16 <- the l.sys I'm speaking about | 20:32 |
dalias | but the real pt_regs and the struct ptrace.h exposes to userspace don't match | 20:33 |
dalias | hm, what's your concern? | 20:33 |
stekern | that the syscall will not be treated as a cancellation point, since the pc that the cancel_handler sees will be outside cp_begin/cp_end | 20:34 |
stekern | (this is completely unrelated to the sigmask discussion, just to be clear ;)) | 20:36 |
dalias | under what conditions? | 20:37 |
dalias | if the syscall has completed (or will return with EINTR), the signal handler _should_ see a pc outside the range | 20:37 |
dalias | it should only see a pc in the range if the syscall is going to resume after the signal handler returns | 20:38 |
dalias | (or if the syscall wasn't even started yet -- there's a tiny window of possibility for that too) | 20:38 |
stekern | yes, but if I understand things correctly, it will not work in the "syscall is going to resume after the signal handler returns" case | 20:39 |
dalias | why not? | 20:39 |
dalias | in that case pc should point to the syscall instruction | 20:40 |
stekern | since the context store upon syscall entry will store the l.sys+4 address | 20:40 |
blueCmd_ | why hasn't someone created an AXI4-Lite <-> Wishbone module? | 20:40 |
dalias | that should be decremented if the syscall is to be restarted | 20:40 |
stekern | aaaaah! of course | 20:41 |
dalias | how else would the kernel store the knowledge that the syscall needs to be restarted? | 20:41 |
dalias | :) | 20:41 |
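The pc reasoning in this exchange can be written down as a small model. This is a sketch under stated assumptions: fixed 4-byte instructions, a cp_end label placed directly after the l.sys, and a half-open range check; the helper names are hypothetical and this is not the code from cancel_impl.c or the kernel.

```c
#include <stdint.h>

#define INSN_BYTES 4u   /* assumption: fixed 32-bit instruction width */

/* Assumed form of the handler's test: is pc inside [cp_begin, cp_end)? */
static int in_cancel_window(uintptr_t pc, uintptr_t cp_begin, uintptr_t cp_end)
{
    return pc >= cp_begin && pc < cp_end;
}

/* Syscall entry saves the address after l.sys. If the kernel is going
   to restart the syscall, it winds the saved pc back onto the l.sys,
   which is what the signal handler then sees. */
static uintptr_t pc_seen_by_handler(uintptr_t sys_insn, int will_restart)
{
    uintptr_t saved = sys_insn + INSN_BYTES;
    return will_restart ? saved - INSN_BYTES : saved;
}
```

With cp_end sitting right after the l.sys, a restarting syscall lands inside the window and a completed one lands outside it, which is the behaviour described above.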
dalias | hmm in entry.S, the kernel loads the stack pointer as the arg to _sys_rt_sigreturn | 20:42 |
dalias | which is treated as a pointer to pt_regs | 20:42 |
stekern | mmm | 20:46 |
dalias | but as far as i can tell that's wrong... | 20:47 |
dalias | sp seems to point to an rt_sigframe, not pt_regs | 20:48 |
dalias | but maybe i'm missing something... | 20:49 |
stekern | yes, that looks odd... | 20:52 |
stekern | or does it? r1 that is passed to _sys_rt_sigreturn points to pt_regs | 20:56 |
stekern | then the pre-context-switch sp (that is pointing to a rt_sigframe) is loaded from that | 20:57 |
dalias | how does r1 come to point to pt_regs when the signal handler returns and the restorer thunk makes the rt_sigreturn syscall? | 21:06 |
dalias | oh maybe this is a new pt_regs from the syscall entry point | 21:07 |
stekern | yes | 21:07 |
dalias | i see | 21:07 |
dalias | so this pt_regs struct is in kernelspace, saved by the syscall entry point | 21:07 |
stekern | right | 21:08 |
dalias | and the regs->sp is what the stack pointer in userspace pointed to at the time of the syscall | 21:08 |
dalias | which is the rt_sigframe | 21:08 |
dalias | so i don't see what's wrong. | 21:08 |
dalias | the uc_sigmask is read from userspace and loaded | 21:09 |
stekern | I'll amuse myself with passing some fun values in uc_sigmask (as you suggested earlier) and see if they come through right | 21:10 |
dalias | ok | 21:10 |
dalias | my best guess is that there's some stupid mismatch so musl is setting the wrong bit | 21:11 |
olofk_ | blueCmd_: Yeah, I've been thinking about writing a bridge for both axi4 and axi4lite, but haven't gotten around to it | 21:24 |
olofk_ | axi4lite is probably way easier | 21:25 |
olofk_ | Full axi4 would probably require Wishbone b4 to be somewhat efficient | 21:25 |
-!- olofk_ is now known as olofk | 21:25 | |
blueCmd_ | olofk: cool, in that case I think I will start working on it | 21:46 |
blueCmd_ | I have multiple cores in my design that have AXI4-Lite ports, so it makes sense | 21:47 |
olofk | Yeah, the axi4 family is getting quite popular | 21:50 |
olofk | Its main drawback is the license | 21:50 |
blueCmd_ | I haven't signed anything so.. :P | 21:50 |
olofk | And the insane amount of signals for full axi4 | 21:50 |
blueCmd_ | surprisingly hard to find even simple axi4 cores to use as a test against | 21:52 |
blueCmd_ | found http://opencores.org/project,axi_slave but that uses the dead RobustVerilog | 21:52 |
stekern | this uc_sigmask isn't coming through as it should at all | 22:34 |
dalias | looks like something is broken in the kernel then :-p | 22:36 |
dalias | or in musl's structs for or1k mcontext, etc | 22:36 |
dalias | ...... | 22:36 |
dalias | look in bits/signal.h | 22:37 |
dalias | the definition of mcontext_t has inconsistent size depending on #if defined(_GNU_SOURCE) || defined(_BSD_SOURCE) | 22:37 |
dalias | that's certainly a bug | 22:37 |
dalias | it should be 35, not 41 | 22:37 |
dalias | and likewise for gregset_t i think | 22:37 |
dalias | (34 rather than 40) | 22:37 |
dalias | stekern, you still there? | 22:46 |
dalias | i think i found your bug | 22:46 |
dalias | i'm guessing you were counting bytes rather than longs, or something | 22:46 |
stekern | yes, I see | 22:48 |
stekern | counting wrong, that's what I was doing in any case ;) | 22:49 |
dalias | my guess is you counted bytes for the extra 2 registers, then counted longs for the oldmask :-p | 22:50 |
stekern | yeah, I must have counted pc+sr as bytes... | 22:50 |
stekern | yup, this looks much better | 22:53 |
dalias | :) | 22:54 |
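Why a miscounted register array breaks the signal mask can be shown with offsetof. The layouts below are hypothetical and heavily simplified (the real ucontext_t has more members before the mcontext); only the counts 35 and 41 come from the discussion, read here as 32 registers plus pc, sr and oldmask versus the same with pc and sr counted as 8 bytes.

```c
#include <stddef.h>

/* HYPOTHETICAL simplified layouts, not the real or1k headers. */
struct mcontext_good { unsigned long regs[35]; };  /* 32 + pc + sr + oldmask */
struct mcontext_bad  { unsigned long regs[41]; };  /* 32 + 8 "bytes" + oldmask */

struct ucontext_good { struct mcontext_good mc; unsigned long sigmask[2]; };
struct ucontext_bad  { struct mcontext_bad  mc; unsigned long sigmask[2]; };

/* How far the signal mask moves when the register array is too long. */
static size_t sigmask_skew(void)
{
    return offsetof(struct ucontext_bad, sigmask)
         - offsetof(struct ucontext_good, sigmask);
}
```

The skew is six longs, so userspace would set its bit well past the place the kernel reads the mask from.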
stekern | what a silly mistake... that's so typically me... on the bright side, I wouldn't have learned as much as I did if I wouldn't have made that silly mistake | 22:55 |
blueCmd_ | stekern: feel free to backport everything to glibc as well ............... | 22:55 |
dalias | i don't think this issue affects glibc. it's just in stekern's bits/signal.h for musl | 22:56 |
stekern | blueCmd_: backport my silly mistakes? nah, that doesn't sound like a good idea =P | 22:56 |
blueCmd_ | I mean, I'm super excited about doing that - but I wouldn't feel nice robbing you of the opportunity | 22:56 |
dalias | otoh backporting musl's cancellation to glibc would be very nice, since glibc's is unusably racy | 22:56 |
blueCmd_ | stekern: well, skip the mistakes and just port the fixes :P | 22:56 |
dalias | (see http://ewontfix.com/16/) | 22:57 |
blueCmd_ | dalias: the amount of work you guys are putting in musl puts glibc to shame | 22:57 |
blueCmd_ | I hacked it together so I could run "normal" apps, that's about it | 22:57 |
stekern | dalias: yes, I really liked the way you've implemented the cancellation points in musl | 22:58 |
dalias | stekern, it took 2 or 3 tries to come up with this; i scrapped the first few implementations of cancellation and replaced them completely | 22:58 |
blueCmd_ | dalias: are there any plans / interest from the Debian community to adopt musl as its primary libc? | 22:58 |
dalias | i doubt it | 22:59 |
dalias | might happen in the _really_ long term :) | 22:59 |
blueCmd_ | what about gentoo? | 22:59 |
dalias | but debian is a HUGE distro and pretty conservative in what they do | 22:59 |
blueCmd_ | is it runnable with gentoo? | 22:59 |
blueCmd_ | I guess it is | 22:59 |
dalias | there's a gentoo stage-whatever (i forget how their 'stage' numbering works) with musl as the system libc | 22:59 |
dalias | not sure how mature it is tho | 23:00 |
blueCmd_ | cool. olofk will have to start porting it then :) | 23:00 |
dalias | and alpine linux 3.x is using musl as the system libc | 23:00 |
blueCmd_ | dalias: well, we wouldn't want anything unstable | 23:00 |
dalias | i've got alpine on my laptop now; it's pretty nice | 23:00 |
blueCmd_ | Debian for or1k crashes almost < 10 times per hour, so we're pretty proud over that | 23:00 |
blueCmd_ | ;) | 23:01 |
dalias | ;-) | 23:01 |
blueCmd_ | It's kind of stable if you don't do anything | 23:01 |
dalias | alpine's package selection is somewhat spartan, but it's fast, light, security-oriented, and has a responsive developer community | 23:01 |
blueCmd_ | unless you run stekern's SMP | 23:01 |
blueCmd_ | dalias: how big is the system footprint? | 23:03 |
stekern | dalias: btw, due to another silly mistake (I think I hadn't saved the file in between compiles), I have to retract the comment about "+m" working. it didn't like that at all | 23:05 |
dalias | bluecmd_, my /usr is ~650 megs with xfce-based X setup. /lib is another 300 megs, mostly firmware and kernel modules | 23:05 |
dalias | stekern, oh? | 23:05 |
stekern | just "m" works though | 23:05 |
dalias | weird | 23:06 |
dalias | did you put the +m in output or input? | 23:06 |
dalias | it would need to be in the output operands section, not the input one | 23:06 |
dalias | despite it declaring an operand that's both input and output | 23:06 |
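The placement described here can be illustrated with a placeholder asm. This is a sketch assuming GCC or Clang extended asm; the empty template stands in for a real l.lwa/l.swa sequence and touch_in_place is a made-up name.

```c
/* Sketch only: the empty template is a placeholder for a real
   read-modify-write sequence. */
static int touch_in_place(volatile int *p)
{
    /* "+m"(*p) goes in the OUTPUT list, between the first and second
       colon, even though it declares *p as both read and written.
       GCC rejects a '+' constraint in the input list. */
    __asm__ __volatile__ ("" : "+m"(*p));
    return *p;
}
```

Because the placeholder asm changes nothing, the function simply returns the value already stored at p.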
stekern | ah, let's try again then | 23:06 |
dalias | :) | 23:06 |
dalias | bluecmd_, so you can have a pretty good useful gui system in ~1gb | 23:08 |
blueCmd_ | dalias: indeed | 23:08 |
dalias | for servers, firewalls, etc. of course you can get it much much smaller | 23:08 |
stekern | that it did like better | 23:09 |
dalias | one of the things i'm most impressed with is how fast firefox starts | 23:09 |
dalias | i really want to try to understand it | 23:09 |
dalias | since in theory glibc's dynamic linker has all sorts of trickery to improve firefox load time | 23:09 |
dalias | and musl's is completely naive | 23:09 |
dalias | so i'm guessing either something fancy they do actually hurts performance a lot, or that the influence of dynamic linking in load time is over-hyped | 23:11 |
stekern | blueCmd_: bah, my SMP WIP is stable, I had it running top for two days! | 23:17 |
stekern | hmm, pthread_robust is timing out. | 23:24 |
dalias | :( | 23:29 |
stekern | well, some fun left for tomorrow then, way past bedtime here now =P | 23:35 |
stekern | dalias: thanks again for all the help tracking down my bugs | 23:35 |
dalias | np | 23:47 |
--- Log closed Sat Jul 12 00:00:15 2014 |