IRC logs for #openrisc Sunday, 2014-02-16

--- Log opened Sun Feb 16 00:00:41 2014
01:27 <stekern> pgavin: interesting thought, I never considered the option of invalidating the branch prediction tables
01:28 <stekern> mor1kx actually has a branch predictor too, but it's just a simple static one right now, I've been planning to add a dynamic one to it though
01:46 <stekern> ysionneau: to be honest, the hw page tree walker didn't improve performance by huge amounts
01:47 <stekern> one reason is that the tlb handler and the pagetables will likely be in cache, and currently the hw pagewalker doesn't go through the cache to walk the tables
01:51 <stekern> speaking of which, I took a look at the WIP lm32 netbsd port, I didn't quite get the tlb miss handler implementation
05:19 <pgavin> stekern: right now my BTB has logic to iterate through each entry and invalidate it at reset time
05:19 <pgavin> but that's not quite consistent with other aspects of the ISA
05:26 <stekern> is it common for architectures to expose the branch predictor like that?
05:27 <stekern> one disadvantage with it is that it somewhat defines how the implementation should look
05:30 <stekern> exposing my ignorance here - why is it better to explicitly invalidate them than to have "random" predictions?
05:32 <stekern> if it's just a matter of avoiding 'x'es in simulations, then I'd brush it off as a simulation issue, and I think it should be handled by simulation-specific logic
05:50 <pgavin> stekern: it's not really any better
05:50 <pgavin> but my pipeline depends on the BTB never returning an incorrect target
05:51 <pgavin> so the valid bits in the btb need to be flushed
05:51 <pgavin> but if the BPB initially has random values it's not really a big deal
05:53 <pgavin> also, on a context switch the BTB will need to be flushed
05:53 <pgavin> because it uses virtual addresses
05:54 <pgavin> I suppose the btb flush could be tied to one of the l.*sync instructions somehow
09:33 <stekern> pgavin: hmm, ok, I see
09:36 <stekern> why would virtual addresses matter though? it's just pc relative addresses
09:36 <stekern> ah, or you mean the branch source
09:37 <ysionneau> 02:51 < stekern> speaking of which, I took a look at the WIP lm32 netbsd port, I didn't quite get the tlb miss handler implementation < it was very wrong until yesterday or 2 days ago
09:37 <ysionneau> I committed quite a few commits these last days
09:37 <stekern> ysionneau: I looked at it after your commits yesterday
09:37 <ysionneau> I had a hard time getting it "working"
09:37 <stekern> why do you need to save that much context though?
09:37 <ysionneau> ok, I agree it might not be obvious and might not be coded the best way
09:38 <stekern> I haven't looked too deeply, I admit ;)
09:38 <ysionneau> stekern: I could save less, but since I am calling _do_real_tlb_miss_handling which is in C
09:38 <ysionneau> then I preferred saving everything, to avoid any future issue
09:38 <stekern> yeah, that was part of my question, what's the difference between fake and real?
09:38 <ysionneau> but indeed _do_real_tlb_miss_handling is just using sp r1 r2 r3 and not much more
09:39 <ysionneau> fake is used during very early boot
09:39 <stekern> ok, so those are the boot tlb miss handlers
09:39 <ysionneau> fake is replaced by real at the end of pmap_bootstrap() (in lm32/lm32/lm32_pmap.c)
09:39 <ysionneau> fake only does PA = va - base_virt + base_phys
09:39 <stekern> but what is the C function doing?
09:39 <ysionneau> C is doing the page table lookup
09:40 <ysionneau> I could do it in assembly
09:40 <ysionneau> but then I'm pretty sure anyway that I will need someday to call do_fault() or some internal netbsd mechanism
09:40 <ysionneau> for instance when a user space process wants to execute a non-executable page
09:41 <stekern> yeah, that was my third question, at what point is the pagefault handler called
09:41 <ysionneau> I think this is handled in machine independent C code
09:41 <ysionneau> for now I don't call it, which is "bad"
09:41 <ysionneau> but as far as the boot is going right now I don't need it
09:41 <stekern> could be, in Linux the pagefault handler is a bit "boiler platey"
09:42 <stekern> ok, fair enough, that straightens out my question marks - you have a C function as the tlb miss handler for easier debugging during development
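
[Editor's sketch] The early-boot ("fake") translation ysionneau describes is a fixed linear offset, PA = va - base_virt + base_phys. The snippet below is a minimal behavioral model of that formula only. The base addresses, the page size, and the function names are hypothetical values picked for illustration; none of them are taken from the lm32 NetBSD port.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only

# Hypothetical kernel bases; the real port defines its own values.
BASE_VIRT = 0xC0000000
BASE_PHYS = 0x40000000


def fake_tlb_translate(va, base_virt=BASE_VIRT, base_phys=BASE_PHYS):
    """Early-boot translation: PA = va - base_virt + base_phys."""
    # Mask to 32 bits to model a 32-bit address space.
    return (va - base_virt + base_phys) & 0xFFFFFFFF


def fake_tlb_entry(va):
    """Return the (virtual page number, physical page number) pair that a
    boot-time miss handler would load into the TLB for this address."""
    pa = fake_tlb_translate(va)
    return va // PAGE_SIZE, pa // PAGE_SIZE
```

No page table is consulted here, which is why this handler can run before pmap_bootstrap() has finished; the "real" handler replaces the arithmetic with a page table lookup.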
09:48 <pgavin> stekern: the BTB is used to avoid adding the branch PC to the immediate
09:48 <pgavin> it's written to after the target is calculated
09:48 <pgavin> it made the most sense to use virtual addresses for both
09:49 <pgavin> writing the physical address of the target to the BTB would require translating the target address before the write
09:49 <pgavin> plus there would be other difficulties
09:51 <stekern> ok, I understand. And I agree, if you're going to save something, it should be virtual addresses
09:55 <ysionneau> stekern: I could certainly improve performance by only saving 2 or 3 registers and then check if I need to call a C function, then save the remaining regs only if I need to
09:55 <ysionneau> but for now: I need to get the kernel running more than getting it to run fast :)
13:22 <stekern> ysionneau: sure, I'm not condemning that, I was just confused about the heavy-weightness of it, and thought the C-function did something more involved than just walking the pagetables
13:24 <stekern> pgavin: so if I understand your implementation right - you need to know if the value in the BTB is valid early, and when you have calculated the target, you can't start backing out of a wrong value in it, right?
13:26 <stekern> are you only using the BTB for conditional jumps, or are you buffering other branches as well?
13:28 <stekern> I guess you could snoop the immucr for writes to invalidate the BTB, to get back to your original problem
13:29 <ysionneau> ok I understand :)
17:08 -!- Netsplit *.net <-> *.split quits: jonmasters
19:08 <pgavin> stekern: the only reason the pipeline requires the BTB always produce the correct target is because I don't want to compare the BTB target with the calculated target
19:08 <pgavin> it's really just to avoid extra logic
19:09 <pgavin> the BTB buffers conditional and unconditional direct jumps/branches
19:10 <pgavin> I only write a target to the BTB if the branch was taken but predicted not taken
19:10 <pgavin> (I think)
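
[Editor's sketch] A small behavioral model of the BTB policy pgavin describes: entries are tagged with the virtual PC of the branch, carry a valid bit, are written only when a branch was taken but predicted not taken, and are all invalidated on reset or context switch. The direct-mapped organization, the entry count, the full-PC tag, and every name below are assumptions made for illustration, not details of pgavin's design.

```python
class BTB:
    """Direct-mapped branch target buffer, indexed and tagged by virtual PC."""

    def __init__(self, entries=16):
        self.entries = entries
        self.flush()

    def flush(self):
        # Clear every valid bit. Needed at reset, and on a context switch,
        # because the tags are virtual addresses.
        self.valid = [False] * self.entries
        self.tag = [0] * self.entries
        self.target = [0] * self.entries

    def _index(self, pc):
        # Assumes 4-byte instructions, so the low two PC bits carry no information.
        return (pc >> 2) % self.entries

    def lookup(self, pc):
        """Return the stored target for this branch PC, or None on a miss."""
        i = self._index(pc)
        if self.valid[i] and self.tag[i] == pc:
            return self.target[i]
        return None

    def update(self, pc, target, taken, predicted_taken):
        # Write only when the branch was taken but predicted not taken.
        if taken and not predicted_taken:
            i = self._index(pc)
            self.valid[i], self.tag[i], self.target[i] = True, pc, target
```

Because the pipeline uses a hit without comparing it against the calculated target, a stale entry left over from another address space would redirect fetch to a wrong address with nothing to catch it; in this model that is the case flush() exists for.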
19:17 <pgavin> so another thing is that I want to support L1 caches with associativity > 2
19:17 <pgavin> and the LRU state table needs to be flushed
19:18 <pgavin> with associativity == 2 the LRU state is only 1 bit, but higher associativities have a special encoding that needs to be initialized at reset
19:23 <pgavin> I suppose the LRU table entries could be flushed when the CBIRs are written
19:24 <pgavin> but I think there would be more serious problems if those entries aren't flushed and an entry is accessed than there would be for e.g. the mor1kx
20:35 <pgavin> actually now that I've thought about it a bit I think the LRU logic would work correctly even if all bits were initially random
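
[Editor's sketch] pgavin does not say which LRU encoding his cache uses, so the model below swaps in tree pseudo-LRU purely as an illustration of a replacement state that tolerates random initial bits. It uses ways - 1 bits per set (1 bit for 2 ways, which matches the log), and every bit pattern is a legal state. The class and method names are hypothetical.

```python
class TreePLRU:
    """Tree pseudo-LRU state for one cache set (ways must be a power of two)."""

    def __init__(self, ways, bits=None):
        assert ways >= 2 and ways & (ways - 1) == 0
        self.ways = ways
        # ways - 1 tree bits in heap order; any initial pattern is usable.
        self.bits = list(bits) if bits is not None else [0] * (ways - 1)

    def victim(self):
        """Follow the tree bits from the root to the way to replace."""
        node = 0
        while node < self.ways - 1:
            node = 2 * node + 1 + self.bits[node]  # 0 -> left child, 1 -> right child
        return node - (self.ways - 1)

    def touch(self, way):
        """Record an access: point every bit on the path away from this way."""
        node = way + self.ways - 1
        while node > 0:
            parent = (node - 1) // 2
            self.bits[parent] = 0 if node == 2 * parent + 2 else 1
            node = parent
```

With this encoding an uninitialized table only makes the first few victim choices arbitrary. An encoding that stores a full access ordering can contain bit patterns that correspond to no ordering, which may be the concern behind initializing it at reset.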
--- Log closed Mon Feb 17 00:00:43 2014
