--- Log opened Wed Apr 03 00:00:58 2013 | ||
stekern | juliusb: good, I feared I was missing something fundamental | 04:01 |
---|---|---|
stekern | I almost got register reading in decode stage working now (i.e. rf addr from fetch and result ready in decode) | 04:02 |
stekern | but I just realised that I actually will only need to read the B port there | 04:04 |
stekern | register bypassing is becoming a real mess though... | 07:31 |
@juliusb | stekern: ah that's a nice optimisation you can achieve (only having to read port B) | 12:32 |
@juliusb | to be honest, I'm wondering about maybe doing a register file in flops | 12:33 |
@juliusb | it's common in all of the CPUs I've seen. *ahem* | 12:33 |
@juliusb | it makes things easier | 12:33 |
@juliusb | but it is 1k flops | 12:33 |
@juliusb | hmm..... | 12:33 |
@juliusb | but yeah, in ASIC it seems like what is used | 12:33 |
@juliusb | and can probably make life easier | 12:33 |
@juliusb | and... for cappuccino, it's already aimed to be big anyway | 12:33 |
@juliusb | it doesn't matter so much if you have 1k more flops | 12:34 |
stekern | mmm | 12:38 |
stekern | an alternative I have considered is to add 2 read ports to each rf ram | 12:38 |
stekern | in (most) FPGAs, you get those at no extra cost | 12:38 |
stekern | because the biggest headache I have at the moment is that I want to read the RF both from decode and execute | 12:40 |
stekern | well, actually, it'd just be add 1 read port to RF B | 12:41 |
stekern | but I'm doing the thing with both ports first, because I have a feeling that bugs will fall out faster and easier that way | 12:41 |
@juliusb | cool | 12:42 |
stekern | but even with flops, you don't get away from register forwarding issues anyway | 12:42 |
stekern | I think I'll stall in this kind of situation anyway: l.ori r3, r0, 0x100; l.jr r3 | 12:47 |
stekern | I could of course try to connect the ALU output straight to the branch target, but I've got a feeling that might be slow | 12:48 |
stekern | and it will not work on this: l.muli r3, r3, 2; l.jr r3 | 12:51 |
stekern | without waiting for the muli to be valid | 12:52 |
stekern | so easiest is probably to just push out a bubble in all cases where the instruction in decode is going to write to the read register B in the fetch instruction | 12:54 |
stekern | all the tests pass at least now | 13:08 |
@juliusb | interesting case where you'd invoke a multiply instead of an l.sll :P but yeah, I see your point | 14:03 |
stekern | heh, yeah, the example was a bit convoluted, but the point was that you'd have to have stall logic in there anyways | 15:49 |
@juliusb | yep | 15:50 |
mor1kx | [mor1kx] skristiansson pushed 1 new commit to master: https://github.com/openrisc/mor1kx/commit/f8a7fdc1a8d156af0923ffec687a4421d9e962d8 | 16:23 |
mor1kx | mor1kx/master f8a7fdc Stefan Kristiansson: dcache: declare write_pending before it is used | 16:23 |
stekern | juliusb: have you tried running mor1kx on ml501? | 16:24 |
stekern | ok, linux boots to with some minor adjustments | 19:01 |
stekern | time to start moving around branches | 19:01 |
stekern | *too | 19:01 |
stekern | fun corner cases fall out when things speed up... | 21:52 |
stekern | rfe->to jump instructions especially | 21:52 |
stekern | the edge of the branch signal gets lost | 21:56 |
stekern | I think it's time to sleep on that problem... | 21:57 |
--- Log closed Thu Apr 04 00:00:00 2013 |
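
The bubble rule stekern describes at 12:54 (stall whenever the instruction in decode will write the register that the instruction in fetch reads through port B) can be sketched as a small behavioural model. This is only an illustration under stated assumptions, not the mor1kx RTL: the `Insn` record, its `rd`/`rb` fields, and the `needs_bubble` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Insn:
    """Hypothetical, simplified instruction record (not a mor1kx structure)."""
    name: str
    rd: Optional[int] = None  # destination register, None if nothing is written
    rb: Optional[int] = None  # register read through port B, None if unused


def needs_bubble(decode: Optional[Insn], fetch: Optional[Insn]) -> bool:
    """Return True when a bubble should be inserted.

    Assumed rule, taken from the discussion: stall if the instruction
    currently in decode writes the register that the instruction in
    fetch wants to read on port B.
    """
    if decode is None or fetch is None:
        return False  # an empty stage cannot cause a hazard
    if decode.rd is None or fetch.rb is None:
        return False  # no writeback, or port B is not read
    return decode.rd == fetch.rb


# The two sequences mentioned in the log:
ori = Insn("l.ori", rd=3)    # l.ori  r3, r0, 0x100
muli = Insn("l.muli", rd=3)  # l.muli r3, r3, 2
jr = Insn("l.jr", rb=3)      # l.jr   r3

print(needs_bubble(ori, jr))
print(needs_bubble(muli, jr))
print(needs_bubble(Insn("l.ori", rd=4), jr))
```

Both sequences from the log trip the same check, which matches the point made at 15:49: the stall logic is needed regardless of whether a faster ALU-to-branch-target path is added for the simple case.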