IRC logs for #openrisc Friday, 2014-07-11

--- Log opened Fri Jul 11 00:00:14 2014
-!- Netsplit *.net <-> *.split quits: jonmasters, rah, heroux, Amadiro, ysionneau, xlro, fotis2, zama, arokux, olofk_, (+21 more, use /NETSPLIT to show all of them)01:27
-!- Netsplit over, joins: LoneTech, jeremy_bennett, rah, rokka, xlro, trevorman, olofk_, simoncook, FreezingCold, stekern (+21 more)02:08
olofk_maxpaln: Hi! Finally caught up with you :)13:51
maxpalnolofk_: How can I help?14:17
olofk_maxpaln: I wanted to invite you to the yearly OpenRISC conference, but I realized that I don't have any contact information14:20
maxpalnaha - you can send it via email to matt.holdsworth@latticesemi.com14:20
maxpalnis email good enough?14:20
olofk_Perfect. Thank you14:21
maxpalngreat - when is it?14:21
maxpalnor perhaps I should wait for the invite [the suspense is too much :-)]14:21
olofk_October 11-12 in Munich14:22
maxpalnGreat - I look forward to it. I will almost certainly be there.14:22
olofk_Excellent! Happy to hear14:22
maxpalnBad news BTW - it is my bug :-( I was hoping to pass this onto someone else!!!!14:24
olofk_Ahh.. the most common kind of bug unfortunately :(14:26
maxpalnyeah, it's almost as if logic simulation isn't as thorough as running in HW! :-) Oh, well - this is definitely a job for next week. Have a good weekend all!14:27
olofk_You too!14:28
stekerndalias: I've found out what caused the pthread_cancel failure.17:00
stekernactually, there were two things17:00
stekernthe first is a gcc bug causing this code:
stekernto be turned into this:
stekernnotice that it just loops depending on r1417:01
stekernI'll need to dig into why gcc does that, instead of this ugly local work-around I have:
stekernthe second problem has to do with how our linux port handles the sig return syscall: it's checking for pending signals, so it would just loop around in the cancel_handler signal handler (since the cancel signal was always pending)17:04
-!- Netsplit *.net <-> *.split quits: _franck_, ams, heroux, ssvb17:41
-!- heroux_ is now known as heroux17:41
daliasstekern, it needs to check against the correct signal mask18:03
daliasmusl masks the signal in the signal handler via modifying the ucontext_t18:04
daliasso maybe your definition of the ucontext_t is wrong18:04
daliasor maybe something is ignoring this signal mask18:04
daliasas for the __lockfile loop, my guess is that you have a wrong or missing constraint in the asm18:05
daliasin atomic.h18:05
daliasbut i don't see any obvious mistakes18:06
daliasyes it's buggy18:10
daliasyou can't use "r"(p)18:10
daliasyou have to use "m"(*p)18:10
daliasand it should be output+input i think18:11
daliasalternatively volatile asm can be used; i think that's also safe18:11
daliasbut accessing "m"(*p) is more semantically correct18:11
stekernoh, I remembered that I would have had a volatile there, but I don't18:20
stekerndalias: I'm not sure I follow what you mean, doesn't the signal only get masked once you've hit a cancellation point?18:26
daliasthe signal handler re-raises its own signal, but also sets the bit in the saved signal mask in the ucontext_t18:27
daliasthis is necessary for proper behavior with nested signal handlers18:27
daliase.g. if the main flow of execution is at a cancellation point and interrupted by a signal handler, then the cancellation signal interrupts the signal handler18:27
daliascancellation cannot be acted on immediately (unless the signal handler is also at a cancellation point)18:28
daliasbut it needs to be acted on when the signal handler returns to the main flow of execution18:28
ysionneaudalias: how can you say that p is input and output?18:29
ysionneauyou put it twice?18:29
daliasthere's a form for input+output, i think it's a + sign or something18:30
daliasi forget the constraint syntax but it's in the gcc manual18:30
daliashowever as long as it's volatile having it as both input and output is non-essential18:30
daliasoutput-only seems to work on other archs18:31
daliasbut input+output would be preferable semantically18:31
ysionneauin gcc manual I see an example with a variable being output and input18:31
ysionneauit seems they just put it twice18:31
ysionneauthe "m" constraint is for memory addresses, right? not registers18:33
ysionneauI think the l.swa and l.lwa use register and not addresses so I don't understand why you say he should use "m"(*p) ?18:34
daliasthey take memory addresses18:37
daliasof an object in memory that they act upon18:37
daliasi'm not sure what the rules are for "m" expressions on or1k18:37
ysionneauso this is supposed to be semantic and not syntactic ?18:38
ysionneauI mean, syntactically, it's not a memory address but a register that the opcode is operating on18:38
daliasif "m" allows some advanced addressing expressions that aren't valid for l.swa and l.lwa, there should be a separate arch-specific "m" variant that guarantees the address will come in a single register18:38
ysionneaubut indeed semantically it's operating on a memory address18:38
daliasif all memory address expressions are single registers, "m"(*p) and "r"(p) should be equivalent except that the former tells gcc that an object at the address is being accessed, rather than the value of the pointer being used purely as a value18:39
daliasthis affects the types of transformations (moving and duplicating or deduplicating the asm) are valid18:40
daliaserm that sentence was ungrammatical, but hopefully it made sense18:41
ysionneauwhich are valid <= ?18:45
ysionneauI never used the "m" constraint so far, won't the "*p" be replaced by the value at address pointed to by "p" ?18:49
ysionneauI mean, what's the purpose of the '*' here?18:51
dalias"r"(*p) would load the value at address p into a register and pass it to the asm19:13
dalias"m"(*p) results in the asm argument being an address expression that refers to the object *p19:14
daliasfor x86 this can include advanced address expressions like 16(%eax,%ecx,4)19:14
daliasi'm not sure what it can include for or1k19:14
daliasbasically the %n corresponding to "m" operands should expand to a string that's valid for the source/dest for a load/store instruction19:16
dalias(or, on archs like x86 where other instructions can access memory directly, those instructions too)19:16
ysionneaudalias: ok thanks for the explanation :)19:23
ysionneaudalias: so if I understand correctly, if you have "int a;" you can just put "m"(a) and you don't need any &a at all, right?19:24
ysionneauwhereas with "r" you would need "r"(&a)19:24
daliasif you wanted the address, right19:24
ysionneauok, thanks :)19:24
daliasanyway "r"(&a) would likely be broken19:24
ysionneauyep as I understand19:24
daliassince the compiler might move loads/stores to the object across the asm19:24
daliasmaking the asm volatile and adding a "memory" clobber _probably_ avoids this19:25
ysionneauhummm there is still something I don't get19:25
daliasbut imo using "m" is a nicer way of representing the fact that the asm accesses the object19:25
ysionneausince "r"(&a) puts the address of a into a register19:25
ysionneauand since this address anyway cannot change (only the value can change)19:26
ysionneauI don't get how it can be broken19:26
daliasint a, b;19:26
daliasa = 42;19:26
stekerndalias: fwiw, the powerpc port has probably the same problem. (I probably used that as a template)19:27
dalias__asm__ ("load %0, %1" : "=r"(b) : "r"(&a));19:27
daliasprintf("%d\n", b);19:27
daliasstekern, ok thanks i'll check it out19:27
daliasysionneau, in that example, the compiler has no reason it can't move the store a=42; across the asm19:28
daliasin fact it can possibly even eliminate it entirely19:28
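
A minimal sketch of the point above, assuming an x86 GCC or Clang toolchain. The `load` mnemonic in the log is pseudo-assembly, so `movl` stands in for it here, with a plain C fallback on other targets; the function name `load_via_asm` is made up for illustration. Passing the object as "m"(a) rather than its address as "r"(&a) tells the compiler that the asm reads `a`, so the store has to happen first.

```c
/* Hypothetical helper: stores to `a`, then reads it back through inline asm. */
int load_via_asm(void)
{
    int a, b;

    a = 42;
#if defined(__i386__) || defined(__x86_64__)
    /* "m"(a) names the object itself as an input operand, so the compiler
       must complete the store a = 42 before the asm executes. With
       "r"(&a) only the pointer value would be an input, and the store
       could be moved across the asm or dropped. */
    __asm__ ("movl %1, %0" : "=r"(b) : "m"(a));
#else
    /* Fallback for targets where the x86 mnemonic does not apply. */
    b = a;
#endif
    return b;
}
```
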
ysionneauok, but if you put volatile19:28
ysionneauit's forbidden, right?19:28
daliasi think so, but i don't really understand the semantics of volatile asm19:28
ysionneauto me volatile means there can be no optimization trespassing the barrier of the volatile block19:28
ysionneauah maybe I'm wrong here19:30
ysionneaudalias: so I'm safe using "r"(&a) only if I manipulate it as an address and never dereference it19:30
ysionneaufor instance I can add an offset to it, and store it to another pointer etc19:31
daliasysionneau, yeah, i think so. this is very confusing imo, and i'm not sure why glibc did it that way...19:31
ysionneauok, here I stop bothering you with that ;)19:32
ysionneauthanks again19:32
daliasimo having a "memory" clobber and access to addresses should make gcc treat the asm as potentially accessing the memory...19:32
stekernand just for the record, both volatile and "m"(*p) work19:32
daliasbut it's not defined that way, or else this is just a big gcc bug that's been around for a long time on many archs... :/19:32
daliasstekern, is it possible to use "+m" or whatever the notation for "input+output" is ?19:33
stekernit's not complaining about it at least19:34
daliaswe should definitely check out the ppc case. it's likely broken too on newer gcc at least19:35
stekernbut the m without the + was already enough to make the bug in __lockfile go away, so I can't say if it has any extra effect ;)19:35
daliassince it's volatile i think the + is mostly extraneous19:36
daliasbut i'd rather be explicit19:36
stekernyou mean since p is volatile? yeah, maybe19:36
daliasone reason i'd rather not rely on that though is that i'm not clear on the meaning of volatile-qualified pointers when the underlying object is not volatile19:41
daliase.g. if i do19:41
daliasint x; volatile int *p = &x; ... use *p ...19:41
daliasdoes the compiler have to make accesses to *p as volatile? or can it use the fact that it knows that the pointed-to object is non-volatile to optimize?19:42
daliasimo it's a matter of whether the qualification of the lvalue through which the object is accessed is what matters....19:42
dalias...or whether the qualification of p as pointer-to-volatile just means that it's allowed to point to either volatile or non-volatile objects19:43
daliasthe latter is how const-qualified pointers work, which is what makes me suspect the same may be true for volatile19:43
daliasbut i've never found an authoritative answer one way or the other on the matter19:44
stekernyeah, I'm not sure either19:54
stekernnot that my answer would have been anywhere near authoritative either ;)19:57
daliasso did all the failures go away now? :)20:02
stekernI need to read what you said in the backlog and see if I have misunderstood something about the cancellation signal, but I patched the kernel to not just check for pending signals in the sig_return handler and pthread_cancel passes with that20:04
stekernso I'm at least on the right track there.20:04
stekernwhen I'm done with that, there are some IPC failures as well (and I haven't run all tests yet, so there might be more to play with after that)20:05
daliasstekern, did you check if ucontext_t is defined right?20:09
daliasi don't think your kernel patch is right, but it's possible there's a kernel bug still20:09
daliascan you point me to the relevant kernel code?20:10
stekernyeah, I know my kernel patch isn't right, I just did that to check my theory20:11
daliassignal.c has         if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))20:13
daliaswhich looks right20:13
daliasso i wonder if musl's idea of the address is wrong20:13
daliasok it's possibly a kernel bug or possibly just a clash in interpretation of types20:14
stekernlet's investigate that20:14
daliasmcontext_t (sigcontext) has an oldmask member20:14
daliasand there's also the sigset_t in the containing ucontext_t object20:15
daliasmaybe oldmask is something else other than a signal mask?20:15
daliasit's only 32 bits so it can't be a signal mask20:15
daliasoldmask seems entirely unused20:19
daliasso i doubt that's the issue20:19
daliaswhat you might do is set a weird signal mask (e.g. put some ascii text in it ;-)20:20
daliasand then check the uc_sigmask from a signal handler to see if the mask matches what you expect20:20
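
The check suggested above could look roughly like this on a POSIX system. It is only a sketch: the names `probe_handler` and `saved_mask_matches` are invented, and blocking SIGUSR2 as the recognizable marker is an arbitrary choice. The handler looks at the mask saved in the `ucontext_t` it receives, which should be the mask that was in effect when the signal arrived.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <ucontext.h>

static volatile sig_atomic_t saw_usr2; /* marker signal found in saved mask */
static volatile sig_atomic_t saw_usr1; /* should stay 0: SIGUSR1 was unblocked */

/* Records which signals appear in the mask saved in the ucontext_t. */
static void probe_handler(int sig, siginfo_t *si, void *ctx)
{
    ucontext_t *uc = ctx;

    (void)sig;
    (void)si;
    saw_usr2 = (sigismember(&uc->uc_sigmask, SIGUSR2) == 1);
    saw_usr1 = (sigismember(&uc->uc_sigmask, SIGUSR1) == 1);
}

/* Returns 1 if the handler saw the expected saved mask, 0 if not,
   and -1 if setting up the experiment failed. */
int saved_mask_matches(void)
{
    struct sigaction sa, old_sa;
    sigset_t marker, old_mask;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = probe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0)
        return -1;

    /* Install a recognizable mask: only SIGUSR2 is blocked. */
    sigemptyset(&marker);
    sigaddset(&marker, SIGUSR2);
    if (sigprocmask(SIG_SETMASK, &marker, &old_mask) != 0)
        return -1;

    /* SIGUSR1 is unblocked, so the handler runs before raise() returns. */
    raise(SIGUSR1);

    /* Put the previous mask and handler back. */
    sigprocmask(SIG_SETMASK, &old_mask, 0);
    sigaction(SIGUSR1, &old_sa, 0);

    return saw_usr2 && !saw_usr1;
}
```

If the libc's idea of the `uc_sigmask` offset disagreed with the kernel's, the handler would be reading the wrong bytes and a probe like this would be expected to return 0.
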
daliasall i can figure is that maybe some type is defined wrong such that the uc_sigmask offset in ucontext_t is wrong in musl20:20
stekernyes, and I understand how cancel_handler works now (at last too) ;)20:23
stekernI'll take a deeper look at what you suggested20:24
stekernbtw, another question: our CANCEL_REG_IP will point at the instruction *after* the l.sys instruction; won't that break this?
stekernjust adding a l.nop after the l.sys would fix that if that's the case I guess20:26
dalias*sigh* i found the problem20:28
daliasthe uapi headers are wrong and don't match the actual api the kernel uses...20:28
daliasat least that seems to be the case... just a sec20:28
daliasmaybe not20:29
stekern <- the l.sys I'm speaking about20:32
daliasbut the real pt_regs and the struct ptrace.h exposes to userspace don't match20:33
daliashm, what's your concern?20:33
stekernthat the syscall will not be treated as a cancellation point, since the pc that the cancel_handler sees will be outside cp_begin/cp_end20:34
stekern(this is completely unrelated to the sigmask discussion, just to be clear ;))20:36
daliasunder what conditions?20:37
daliasif the syscall has completed (or will return with EINTR), the signal handler _should_ see a pc outside the range20:37
daliasit should only see a pc in the range if the syscall is going to resume after the signal handler returns20:38
dalias(or if the syscall wasn't even started yet -- there's a tiny window of possibility for that too)20:38
stekernyes, but if I understand things correctly, it will not work in the "syscall is going to resume after the signal handler returns" case20:39
daliaswhy not?20:39
daliasin that case pc should point to the syscall instruction20:40
stekernsince the context store upon syscall entry will store the l.sys+4 address20:40
blueCmd_why hasn't someone created an AXI4-Lite <-> Wishbone module?20:40
daliasthat should be decremented if the syscall is to be restarted20:40
stekernaaaaah! of course20:41
daliashow else would the kernel store the knowledge that the syscall needs to be restarted?20:41
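
A toy model of the restart argument, with invented addresses and function names, and assuming 4-byte instructions. It is not kernel or musl code; it only shows why rewinding the saved pc by one instruction puts it back inside the cp_begin/cp_end range that the cancel handler tests.

```c
#include <stdint.h>

/* Hypothetical layout of a cancellable syscall stub. */
static const uint32_t CP_BEGIN = 0x1000; /* start of the stub */
static const uint32_t SYS_INSN = 0x1008; /* address of the l.sys instruction */
static const uint32_t CP_END   = 0x100c; /* first address after l.sys */

/* Syscall entry saves the address of the instruction after l.sys. */
uint32_t pc_at_entry(void)
{
    return SYS_INSN + 4;
}

/* pc the signal handler would see: rewound by one instruction when the
   syscall is going to be restarted, left alone otherwise. */
uint32_t pc_seen_by_handler(int will_restart)
{
    uint32_t pc = pc_at_entry();

    if (will_restart)
        pc -= 4;
    return pc;
}

/* Range test of the kind the cancel handler applies to the saved pc. */
int in_cancel_range(uint32_t pc)
{
    return pc >= CP_BEGIN && pc < CP_END;
}
```

In this model a completed (or EINTR) syscall leaves the pc at CP_END, outside the range, while a syscall that will restart is seen at SYS_INSN, inside it.
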
daliashmm in entry.S, the kernel loads the stack pointer as the arg to _sys_rt_sigreturn20:42
daliaswhich is treated as a pointer to pt_regs20:42
daliasbut as far as i can tell that's wrong...20:47
daliassp seems to point to an rt_sigframe, not pt_regs20:48
daliasbut maybe i'm missing something...20:49
stekernyes, that looks odd...20:52
stekernor does it? r1 that is passed to _sys_rt_sigreturn points to pt_regs20:56
stekernthen the pre-context-switch sp (that is pointing to a rt_sigframe) is loaded from that20:57
daliashow does r1 come to point to pt_regs when the signal handler returns and the restorer thunk makes the rt_sigreturn syscall?21:06
daliasoh maybe this is a new pt_regs from the syscall entry point21:07
daliasi see21:07
daliasso this pt_regs struct is in kernelspace, saved by the syscall entry point21:07
daliasand the regs->sp is what the stack pointer in userspace pointed to at the time of the syscall21:08
daliaswhich is the rt_sigframe21:08
daliasso i don't see what's wrong.21:08
daliasthe uc_sigmask is read from userspace and loaded21:09
stekernI'll amuse myself with passing some fun values in uc_sigmask (as you suggested earlier) and see if they come through right21:10
daliasmy best guess is that there's some stupid mismatch so musl is setting the wrong bit21:11
olofk_blueCmd_: Yeah, I've been thinking about writing a bridge for both axi4 and axi4lite, but haven't gotten around to it21:24
olofk_axi4lite is probably way easier21:25
olofk_Full axi4 would probably require Wishbone b4 to be somewhat efficient21:25
-!- olofk_ is now known as olofk21:25
blueCmd_olofk: cool, in that case I think I will start working on it21:46
blueCmd_I have multiple cores in my design that have AXI4-Lite ports, so it makes sense21:47
olofkYeah, the axi4 family is getting quite popular21:50
olofkIts main drawback is the license21:50
blueCmd_I haven't signed anything so.. :P21:50
olofkAnd the insane amount of signals for full axi421:50
blueCmd_surprisingly hard to find even simple axi4 cores to use as a test against21:52
blueCmd_found axi_slave, but that uses the dead RobustVerilog21:52
stekernthis uc_sigmask isn't coming through as it should at all22:34
daliaslooks like something is broken in the kernel then :-p22:36
daliasor in musl's structs for or1k mcontext, etc22:36
daliaslook in bits/signal.h22:37
daliasthe definition of mcontext_t has inconsistent size depending on #if defined(_GNU_SOURCE) || defined(_BSD_SOURCE)22:37
daliasthat's certainly a bug22:37
daliasit should be 35, not 4122:37
daliasand likewise for gregset_t i think22:37
dalias(34 rather than 40)22:37
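
The arithmetic can be checked with a sketch like the one below. The struct and typedef names are stand-ins made up for this note, not the actual header contents, and the 41 is reconstructed under the assumption of a 4-byte long: 32 general-purpose registers, plus pc and sr counted as 8 bytes instead of 2 words, plus 1 word for oldmask gives 41, where the right count is 32 + 2 + 1 = 35.

```c
/* Hypothetical "named" view: what a feature-test macro would expose. */
struct mcontext_named {
    struct {
        unsigned long gpr[32]; /* general-purpose registers */
        unsigned long pc;
        unsigned long sr;
    } regs;
    unsigned long oldmask;
};

/* Hypothetical "opaque" view: same storage with no member names. */
struct mcontext_opaque {
    unsigned long regs[35]; /* 32 + 2 + 1 words, not 41 */
};

/* Register set alone: 32 + 2 words, not 40. */
typedef unsigned long gregset_sketch[34];

/* Both views must have the same size, or every member that follows the
   mcontext in ucontext_t (such as uc_sigmask) lands at a different offset. */
_Static_assert(sizeof(struct mcontext_named) == sizeof(struct mcontext_opaque),
               "mcontext views differ in size");
_Static_assert(sizeof(gregset_sketch) ==
               sizeof(((struct mcontext_named *)0)->regs),
               "gregset size does not match the register block");

/* Number of words in the named view. */
unsigned long mcontext_words(void)
{
    return sizeof(struct mcontext_named) / sizeof(unsigned long);
}
```
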
daliasstekern, you still there?22:46
daliasi think i found your bug22:46
daliasi'm guessing you were counting bytes rather than longs, or something22:46
stekernyes, I see22:48
stekerncounting wrong, that's what I was doing in any case ;)22:49
daliasmy guess is you counted bytes for the extra 2 registers, then counted longs for the oldmask :-p22:50
stekernyeah, I must have counted pc+sr as bytes...22:50
stekernyup, this looks much better22:53
stekernwhat a silly mistake... that's so typically me... on the bright side, I wouldn't have learned as much as I did if I hadn't made that silly mistake22:55
blueCmd_stekern: feel free to backport everything to glibc  as well ...............22:55
daliasi don't think this issue affects glibc. it's just in stekern's bits/signal.h for musl22:56
stekernblueCmd_: backport my silly mistakes? nah, that doesn't sound like a good idea =P22:56
blueCmd_I mean, I'm super excited about doing that - but I wouldn't feel nice robbing you of the opportunity22:56
daliasotoh backporting musl's cancellation to glibc would be very nice, since glibc's is unusably racy22:56
blueCmd_stekern: well, skip the mistakes and just port the fixes :P22:56
blueCmd_dalias: the amount of work you guys are putting in musl puts glibc to shame22:57
blueCmd_I hacked it together so I could run "normal" apps, that's about it22:57
stekerndalias: yes, I really liked the way you've implemented the cancellation points in musl22:58
daliasstekern, it took 2 or 3 tries to come up with this; i scrapped the first few implementations of cancellation and replaced them completely22:58
blueCmd_dalias: are there any plans / interest from the Debian community to adopt musl as its primary libc?22:58
daliasi doubt it22:59
daliasmight happen in the _really_ long term :)22:59
blueCmd_what about gentoo?22:59
daliasbut debian is a HUGE distro and pretty conservative in what they do22:59
blueCmd_is it runnable with gentoo?22:59
blueCmd_I guess it is22:59
daliasthere's a gentoo stage-whatever (i forget how their 'stage' numbering works) with musl as the system libc22:59
daliasnot sure how mature it is tho23:00
blueCmd_cool. olofk will have to start porting it then :)23:00
daliasand alpine linux 3.x is using musl as the system libc23:00
blueCmd_dalias: well, we wouldn't want anything unstable23:00
daliasi've got alpine on my laptop now; it's pretty nice23:00
blueCmd_Debian for or1k crashes almost < 10 times per hour, so we're pretty proud over that23:00
blueCmd_It's kind of stable if you don't do anything23:01
daliasalpine's package selection is somewhat spartan, but it's fast, light, security-oriented, and has a responsive developer community23:01
blueCmd_unless you run stekern's SMP23:01
blueCmd_dalias: how big is the system footprint?23:03
stekerndalias: btw, due to another silly mistake (I think I hadn't saved the file in between compiles), I have to retract the comment about "+m" working. it didn't like that at all23:05
daliasbluecmd_, my /usr is ~650 megs with xfce-based X setup. /lib is another 300 megs, mostly firmware and kernel modules23:05
daliasstekern, oh?23:05
stekernjust "m" works though23:05
daliasdid you put the +m in output or input?23:06
daliasit would need to be in the output operands section, not the input one23:06
daliasdespite it declaring an operand that's both input and output23:06
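
For reference, a small sketch of where the read-write operand goes, assuming an x86 GCC or Clang toolchain (with a plain C fallback elsewhere). The helper name `bump` is invented, and `addl` is only a stand-in for whatever instruction actually updates the object. The "+m" operand sits in the first (output) list even though the asm also reads it.

```c
/* Hypothetical helper: increments *p from inside inline asm. */
void bump(int *p)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "+m"(*p) is listed among the outputs; the '+' marks the object as
       both read and written. The input list is left empty. */
    __asm__ __volatile__ ("addl $1, %0" : "+m"(*p) : : "cc");
#else
    /* Fallback for targets where the x86 mnemonic does not apply. */
    *p += 1;
#endif
}
```
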
stekernah, let's try again then23:06
daliasbluecmd_, so you can have a pretty good useful gui system in ~1gb23:08
blueCmd_dalias: indeed23:08
daliasfor servers, firewalls, etc. of course you can get it much much smaller23:08
stekernthat it did like better23:09
daliasone of the things i'm most impressed with is how fast firefox starts23:09
daliasi really want to try to understand it23:09
daliassince in theory glibc's dynamic linker has all sorts of trickery to improve firefox load time23:09
daliasand musl's is completely naive23:09
daliasso i'm guessing either something fancy they do actually hurts performance a lot, or that the influence of dynamic linking in load time is over-hyped23:11
stekernblueCmd_: bah, my SMP WIP is stable, I had it running top for two days!23:17
stekernhmm, pthread_robust is timing out.23:24
stekernwell, some fun left for tomorrow then, way past bedtime here now =P23:35
stekerndalias: thanks again for all the help tracking down my bugs23:35
--- Log closed Sat Jul 12 00:00:15 2014
