IRC logs for #openrisc Sunday, 2014-02-16

--- Log opened Sun Feb 16 00:00:41 2014
stekern	pgavin: interesting thought, I never considered an option of invalidating the branch prediction tables	01:27
stekern	mor1kx actually has a branch predictor too, but it's just a simple static one right now, I've been planning to add a dynamic one to it though	01:28
stekern	ysionneau: to be honest, the hw page tree walker didn't improve performance by huge amounts	01:46
stekern	one reason is that the tlb handler and the pagetables will likely be in cache and currently the hw pagewalker doesn't go through the cache to walk the tables	01:47
stekern	speaking of which, I took a look at the WIP lm32 netbsd port, I didn't quite get the tlb miss handler implementation	01:51
pgavin	stekern: right now my BTB has logic to iterate through each entry and invalidate it at reset time	05:19
pgavin	but that's not quite consistent with other aspects of the ISA	05:19
stekern	is it common for architectures to expose the branch predictor like that?	05:26
stekern	one disadvantage of it is that you somewhat define what the implementation should look like	05:27
stekern	exposing my ignorance here - why is it better to explicitly invalidate them than to have "random" predictions?	05:30
stekern	if it's just a matter of avoiding 'x'es in simulations, then I'd brush it off as a simulation issue and I think it should be handled by simulation-specific logic	05:32
pgavin	stekern: it's not really any better	05:50
pgavin	but my pipeline depends on the BTB never returning an incorrect target	05:50
pgavin	so the valid bits in the btb need to be flushed	05:51
pgavin	but if the BPB initially has random values it's not really a big deal	05:51
pgavin	also, on a context switch the BTB will need to be flushed	05:53
pgavin	because it uses virtual addresses	05:53
pgavin	I suppose the btb flush could be tied to one of the l.*sync instructions somehow	05:54
stekern	pgavin: hmm, ok, I see	09:33
stekern	why would virtual addresses matter though? it's just pc relative addresses	09:36
stekern	ah, or you mean the branch source	09:36
ysionneau	02:51 < stekern> speaking of which, I took a look at the WIP lm32 netbsd port, I didn't quite get the tlb miss handler implementation < it was very wrong until yesterday or 2 days ago	09:37
ysionneau	I committed quite a few changes these last few days	09:37
stekern	ysionneau: I looked at it after your commits yesterday	09:37
ysionneau	I had a hard time getting it "working"	09:37
stekern	why do you need to save that much context though?	09:37
ysionneau	ok, I agree it might not be obvious and might not be coded the best way	09:37
stekern	I haven't looked too deeply I admit ;)	09:38
ysionneau	stekern: I could save less, but since I am calling _do_real_tlb_miss_handling which is in C	09:38
ysionneau	then I preferred saving everything, to avoid any future issue	09:38
stekern	yeah, that was part of my question, what's the difference between fake and real?	09:38
ysionneau	but indeed _do_real_tlb_miss_handling is just using sp r1 r2 r3 and not much more	09:38
ysionneau	fake is used during very early boot	09:39
stekern	ok, so that's boot tlb miss handlers	09:39
ysionneau	fake is replaced by real at the end of pmap_bootstrap() (in lm32/lm32/lm32_pmap.c)	09:39
ysionneau	fake only does PA = va - base_virt + base_phys	09:39
stekern	but what is the C function doing?	09:39
ysionneau	C is doing the page table lookup	09:39
ysionneau	I could do it in assembly	09:40
ysionneau	but then I'm pretty sure anyway that I will need someday to call do_fault() or some internal netbsd mechanism	09:40
ysionneau	for instance when a user space process wants to execute a non-executable page	09:40
stekern	yeah, that was my third question, at what point is the pagefault handler called	09:41
ysionneau	I think this is handled in machine-independent C code	09:41
ysionneau	for now I don't call it, which is "bad"	09:41
ysionneau	but as far as the boot is going right now I don't need it	09:41
stekern	could be, in Linux the pagefault handler is a bit "boiler platey"	09:41
stekern	ok, fair enough, that straightens out my question marks - you have a C function as the tlb miss handler for easier debugging during development	09:42
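The "fake" versus "real" TLB miss handling discussed above can be sketched as follows. This is only an illustration, not the lm32 NetBSD code: the 4 KiB page size, the flat dictionary standing in for the page table, and the function names fake_tlb_miss and real_tlb_miss are all assumptions made for the example. The only part taken from the log is the boot-time formula PA = va - base_virt + base_phys.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def fake_tlb_miss(va, base_virt, base_phys):
    """Boot-time handler: a fixed linear offset, as quoted in the log."""
    return va - base_virt + base_phys


def real_tlb_miss(va, page_table):
    """Post-bootstrap handler: look the virtual page up in a page table.

    page_table is a hypothetical flat mapping {virtual page number:
    physical frame number}. Returns None when there is no mapping,
    which is where a page fault handler would have to take over.
    """
    vpn = va >> PAGE_SHIFT
    pfn = page_table.get(vpn)
    if pfn is None:
        return None
    return (pfn << PAGE_SHIFT) | (va & PAGE_MASK)


if __name__ == "__main__":
    # Hypothetical kernel mapping: virtual 0xC0000000 -> physical 0x40000000
    print(hex(fake_tlb_miss(0xC0001234, 0xC0000000, 0x40000000)))
    print(real_tlb_miss(0xC0001234, {}))
```

Returning None for an unmapped page marks the point where, per the discussion, a call into the machine-independent fault handling would eventually be needed.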
pgavin	stekern: the BTB is used to avoid adding the branch PC to the immediate	09:48
pgavin	it's written to after the target is calculated	09:48
pgavin	it made the most sense to use virtual addresses for both	09:48
pgavin	writing the physical address of the target to the BTB would require translating the target address before the write	09:49
pgavin	plus there would be other difficulties	09:49
stekern	ok, I understand. And I agree, if you're going to save something, it should be virtual addresses	09:51
ysionneau	stekern: I could certainly improve performance by only saving 2 or 3 registers and then check if I need to call a C function, then save the remaining regs only if I need to	09:55
ysionneau	but for now: I need to get the kernel running more than getting it to run fast :)	09:55
stekern	ysionneau: sure, I'm not condemning that, I was just confused about the heavy-weightness of it, and thought the C-function did something more involved than just walking the pagetables	13:22
stekern	pgavin: so if I understand your implementation right - you need to know if the value in the BTB is valid early, and when you have calculated the target, you can't start backing out of a wrong value in it, right?	13:24
stekern	are you only using the BTB for conditional jumps, or are you buffering other branches as well?	13:26
stekern	I guess you could snoop the immucr for writes to invalidate the BTB, to get back to your original problem	13:28
ysionneau	ok I understand :)	13:29
-!- Netsplit .net <-> .split quits: jonmasters		17:08
pgavin	stekern: the only reason the pipeline requires the BTB always produce the correct target is because I don't want to compare the BTB target with calculated target	19:08
pgavin	it's really just to avoid extra logic	19:08
pgavin	the BTB buffers conditional and unconditional direct jumps/branches	19:09
pgavin	I only write a target to the BTB if the branch was taken but predicted not taken	19:10
pgavin	(I think)	19:10
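A small behavioural model of the BTB as pgavin describes it: entries carry valid bits, are looked up by the virtual branch PC, are written only when a branch was taken but predicted not taken, and are invalidated on reset or context switch. The direct-mapped organisation, the entry count, the use of the full PC as a tag, and all names below are assumptions for illustration; the log does not give these details.

```python
class BTB:
    """Hypothetical direct-mapped branch target buffer model."""

    def __init__(self, entries=16):
        self.entries = entries
        self.valid = [False] * entries
        self.tag = [0] * entries
        self.target = [0] * entries

    def _index(self, pc):
        # Assumption: 4-byte instructions, low PC bits select the entry
        return (pc >> 2) % self.entries

    def lookup(self, pc):
        """Return the stored target, or None if there is no valid match."""
        i = self._index(pc)
        if self.valid[i] and self.tag[i] == pc:
            return self.target[i]
        return None

    def update(self, pc, target, taken, predicted_taken):
        """Write an entry only for taken-but-predicted-not-taken branches."""
        if taken and not predicted_taken:
            i = self._index(pc)
            self.valid[i] = True
            self.tag[i] = pc
            self.target[i] = target

    def flush(self):
        """Clear all valid bits, e.g. at reset or on a context switch."""
        self.valid = [False] * self.entries


if __name__ == "__main__":
    btb = BTB()
    btb.update(0x1000, 0x2000, taken=True, predicted_taken=False)
    print(btb.lookup(0x1000))
    btb.flush()
    print(btb.lookup(0x1000))
```

Because the tags are virtual addresses, an entry written by one address space could match a different branch in another, which is why the flush on context switch matters in this model.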
pgavin	so another thing is that I want to support L1 caches with associativity > 2	19:17
pgavin	and the LRU state table needs to be flushed	19:17
pgavin	with associativity == 2 the LRU state is only 1 bit, but higher associativities have a special encoding that needs to be initialized at reset	19:18
pgavin	I suppose the LRU table entries could be flushed when the CBIRs are written	19:23
pgavin	but I think there would be more serious problems if those entries aren't flushed and an entry is accessed than there would be for e.g. the mor1kx	19:24
pgavin	actually now that I've thought about it a bit I think the LRU logic would work correctly even if all bits were initially random	20:35
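The last remark can be checked on a simple replacement scheme. The log does not say which LRU encoding pgavin uses, so the sketch below swaps in a 4-way tree pseudo-LRU (three state bits per set) purely as an assumed stand-in. For that scheme, every one of the eight possible initial bit patterns is a legal state, a just-touched way is never chosen as the victim, and the state no longer depends on the initial value once all four ways have been touched.

```python
from itertools import product


def touch(bits, way):
    """Update the tree-PLRU bits after an access to `way` (0..3).

    bits = (b0, b1, b2): b0 selects the victim half (0 = ways 0/1,
    1 = ways 2/3), b1 selects the victim within ways 0/1, b2 within 2/3.
    Each bit on the accessed path is set to point away from `way`.
    """
    b0, b1, b2 = bits
    if way < 2:
        b0 = 1             # next victim comes from the other half
        b1 = 1 - way       # and, within this half, from the other way
    else:
        b0 = 0
        b2 = 1 - (way - 2)
    return (b0, b1, b2)


def victim(bits):
    """Follow the bits down the tree to the way that would be replaced."""
    b0, b1, b2 = bits
    return b1 if b0 == 0 else 2 + b2


if __name__ == "__main__":
    for start in product((0, 1), repeat=3):
        state = start
        for way in range(4):
            state = touch(state, way)
        print(start, "->", state, "victim", victim(state))
```

A true LRU encoding with illegal states would not necessarily share this property, so this only shows that the claim holds for the assumed tree scheme.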
--- Log closed Mon Feb 17 00:00:43 2014
