--- Log opened Thu Sep 01 00:00:51 2016 | ||
-!- Netsplit *.net <-> *.split quits: fotis2, simoncook | 04:58 | |
-!- Netsplit over, joins: simoncook, fotis2 | 05:03 | |
olofk | Has anyone seen this? http://opencores.org/forum,OpenRISC,0,5746,0#1472416478 | 09:50 |
---|---|---|
olofk | It's about mor1kx crashing on a simple program with floating point | 09:51 |
wallento | crashing=floating point exception, right? | 09:56 |
wallento | seems someone should have a look at the vcd | 09:57 |
olofk | I'm trying to reproduce it | 10:07 |
olofk | But I don't get any fp instructions even if I compile with -mhard-float | 10:07 |
olofk | And I guess that should be -fsingle-precision-constant, not -lsingle-precision-constant | 10:08 |
olofk | Ah.. sorry. I did get fp instructions | 10:13 |
olofk | wallento: I can reproduce it now too | 10:24 |
olofk | We could just add code to gcc that detects if someone wants to add 3.2 to a number and convert that to +1.0 + 2.2 :) | 10:25 |
wallento | or make an errata: "never add 3.2 to 3.2" ;) | 10:25 |
olofk | Where do I even begin to look for this exception signal? | 10:27 |
wallento | in the control | 10:28 |
wallento | there will be a floating exception signal, but this is what we know already | 10:28 |
wallento | I think we have to go into the FPU for that | 10:28 |
wallento | what can be FPU exceptions? | 10:28 |
wallento | can you share the vcd? | 10:29 |
wallento | at mega or so ;) | 10:29 |
olofk | https://www.dropbox.com/s/ggwbmebgi4zu0dk/mega_illegal_warez.vcd?dl=0 | 10:35 |
wallento | okay, gtg | 11:14 |
wallento | hi ontoshko, it's your request on the forum, right? | 11:14 |
wallento | olofk: I can see that there is the fpcsr that differs between the first two and the third op | 11:21 |
wallento | it's 100 instead of 000 | 11:21 |
wallento | so without even knowing how this part works, I would trace the origin of this | 11:21 |
wallento | :) | 11:21 |
ontoshko | yes | 11:26 |
ontoshko | wallento, yes | 11:26 |
wallento | great, olofk was able to reproduce the error and I had a look into the VCD | 11:26 |
wallento | I have to leave soon, but will have a more detailed look tomorrow. never touched this part, but I hope to learn something new ;) | 11:27 |
ontoshko | wallento, I have a sample where fpcsr=0x80 (ZF) leads to a crash | 11:37 |
SMDhome1 | olofk: hi, could you remind me how to turn on tracing for fusesoc? | 14:39 |
SMDhome1 | huh, I see it, thanks | 14:39 |
olofk | wallento: Yeah. I saw that too. | 15:42 |
olofk | Looking at the spec, bit 0 means that the FP exception is enabled | 15:43 |
olofk | bit 8 means that the result was inexact. Don't think this should be a reason to cause an exception though | 15:43 |
olofk | So the culprit is the fpu_allf signal. God knows what that one is good for though | 15:44 |
olofk | ok, cpu_allf is a slice of the incoming fpcsr signal | 15:48 |
olofk | So the issue seems to be that the exception is caused by the inexact flag being set | 15:48 |
olofk | I would like to know if the inexact flag is supposed to cause an exception. Looks so in the spec, but then I guess we need to either mask it in sw or take care of it in the exception handler, because most fp operations will be inexact | 15:51 |
olofk | Let's wait for bandvig to return | 15:51 |
olofk | aha. Looks like bandvig made an optional extension to mask these flags, which is disabled by default | 15:54 |
olofk | That's a more sane option, but it would require an update of the spec | 15:54 |
olofk | Yes! Setting iverilog_options = -DSIM -DOR1K_FPCSR_MASK_FLAGS | 15:57 |
olofk | in mor1kx-generic.core seems to work | 15:57 |
olofk | So once again, the spec works against us :/ | 16:00 |
olofk | Wonder how or1200 does it | 16:00 |
olofk | And because the mask stuff is not in the spec, the only thing we can do is take care of it in the exception handler if we want to be compatible with the spec | 16:01 |
SMDhome1 | olofk: do you know if anyone has tried to implement orvdx64 extension? | 16:02 |
olofk | SMDhome1: Pretty sure the answer is no | 16:04 |
SMDhome1 | olofk: Ok, I'll keep that in mind, thanks | 16:04 |
SMDhome1 | olofk: and what about vector support in the toolchain? | 16:05 |
kc5tja | .......................................................... | 16:34 |
kc5tja | Gaahhh...stupid cat. :( | 16:34 |
ZipCPU | Looks like I need to start over on that DDR3 memory controller a ... what is it, fifth time? | 16:56 |
ZipCPU | I got to studying what the MIG was doing, and discovered that my chip can't handle the speed I wanted the data lines running at. | 16:57 |
kc5tja | :-( | 16:57 |
ZipCPU | So ... I'm now dropping the data rate down to 640 Mb/s. | 16:57 |
ZipCPU | The system clock speed is now going to drop from 200 MHz down to 80MHz. | 16:57 |
ZipCPU | Perhaps I can still interface it with the rest of the system running at 160MHz ... but 200MHz would require a clock crossing FIFO, and I don't want to incur that penalty. | 16:58 |
kc5tja | Is there no way to start slow and ramp up the clock speed upon successful testing? | 16:58 |
ZipCPU | You mean ... to see if you can overclock the chip and whether or not it would work if you did so? | 16:59 |
kc5tja | You could use it for that, I guess. I was thinking that you could start with something like a test at 10MHz, then at 20MHz, then at 40MHz, then 80MHz, then 160MHz, etc. | 17:00 |
kc5tja | The idea being if it works at 80MHz and not 160MHz, it might still work at 120MHz. | 17:01 |
kc5tja | Basically use a binary search for the highest frequency you can drive it and have it still work. | 17:01 |
ZipCPU | Yeah, ... the problem is I'm stuck between two tight constraints. The DDR3 memory controller will not work with a clock slower than 3.3ns. | 17:01 |
kc5tja | You should be able to home in on the top clock in 8 or fewer testing cycles. | 17:01 |
ZipCPU | At least, according to spec. | 17:01 |
ZipCPU | According to the Arty's spec, speed grade -1L, the Arty cannot handle a DDR3 clock faster than 3ns. | 17:02 |
ZipCPU | That leaves me stuck in some rather tight windows. | 17:02 |
ZipCPU | Either that or I need to cross clock domains. | 17:02 |
ZipCPU | See: https://forums.xilinx.com/t5/Memory-Interfaces/Slow-DDR3-SDRAM-on-an-Arty/m-p/719652 | 17:05 |
kc5tja | So, 3.3ns translates to 303MHz. Not seeing how you can avoid multiple clock domains. | 17:06 |
ZipCPU | There's a sweet spot near 80MHz that I might work at, if I work the controller at a 4:1 rate (4 commands per system clock) | 17:08 |
ZipCPU | Then each memory command is at a clock of 320MHz, or 3.125ns. | 17:08 |
ZipCPU | While I'd love to do this using the controller at the 2:1 rate, so I could get in a 160MHz system clock, that breaks the Artix-7 spec again. | 17:08 |
ZipCPU | I will say one thing nice about Xilinx and their memory interface generator: they give you all the code. This can be very useful if you want to understand what they did, so you can ... repeat key features from it. | 17:52 |
mor1kx | [mor1kx] skristiansson closed pull request #39: Add ORFPX32 commands at monitor trace (master...master) https://github.com/openrisc/mor1kx/pull/39 | 22:54 |
--- Log closed Fri Sep 02 00:00:53 2016 |
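
Note on the FPCSR thread (11:21 to 16:01): the values quoted in the log can be decoded with a small sketch like the one below. The positions of FPEE (bit 0), ZF (0x80) and the inexact flag (bit 8, i.e. 0x100) are taken from the conversation. The other flag names and positions are assumptions recalled from the OpenRISC 1000 architecture manual and should be checked against it. `would_trap` and its `mask` parameter are hypothetical helpers that only model the behaviour described in the log (any set flag raises an exception while FPEE is on, unless masked); they are not the mor1kx RTL or the actual `OR1K_FPCSR_MASK_FLAGS` implementation.

```python
# FPCSR bit layout. FPEE, ZF and IXF positions come from the log;
# the remaining names/positions are assumptions to verify in the spec.
FPEE = 1 << 0  # floating point exception enable

FLAGS = {
    "OVF": 1 << 3,   # overflow (assumed)
    "UNF": 1 << 4,   # underflow (assumed)
    "SNF": 1 << 5,   # signalling NaN (assumed)
    "QNF": 1 << 6,   # quiet NaN (assumed)
    "ZF": 1 << 7,    # zero, 0x80 as quoted by ontoshko
    "IXF": 1 << 8,   # inexact, 0x100 as seen in the VCD
    "IVF": 1 << 9,   # invalid (assumed)
    "INF": 1 << 10,  # infinity (assumed)
    "DZF": 1 << 11,  # divide by zero (assumed)
}
ALL_FLAGS = sum(FLAGS.values())


def decode_fpcsr(value):
    """Return the names of all status flags set in an FPCSR value."""
    return [name for name, bit in FLAGS.items() if value & bit]


def would_trap(value, mask=ALL_FLAGS):
    """Hypothetical model: trap if FPEE is set and any unmasked flag is set."""
    return bool(value & FPEE) and bool(value & ALL_FLAGS & mask)


if __name__ == "__main__":
    print(decode_fpcsr(0x100))
    print(decode_fpcsr(0x80))
    # Masking out the inexact flag, in the spirit of the optional extension
    print(would_trap(0x101, mask=ALL_FLAGS & ~FLAGS["IXF"]))
```

With the inexact flag masked out, an inexact result such as 3.2 + 3.2 would no longer count as a trap condition in this model, which matches what olofk observed after enabling the mask option.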
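
Note on kc5tja's suggestion (17:00 to 17:01): the "double the clock until it fails, then binary search" idea could be sketched roughly as follows. The function name, the `works` callback, and the default limit and resolution are all hypothetical; the sketch also assumes that a design which works at some frequency works at every lower one.

```python
def find_max_frequency(works, start_mhz=10.0, limit_mhz=1280.0, resolution_mhz=1.0):
    """Return the highest frequency (MHz) found for which works(freq) is true.

    `works` is a caller-supplied test, e.g. a memory test run at that clock.
    Returns None if even the starting frequency fails.
    """
    if not works(start_mhz):
        return None

    lo = start_mhz      # highest frequency known to pass
    hi = lo * 2         # next candidate

    # Phase 1: keep doubling (10, 20, 40, 80, 160, ...) until a test fails.
    while hi <= limit_mhz and works(hi):
        lo, hi = hi, hi * 2

    if hi > limit_mhz:
        # No failure observed below the limit; report the last passing value.
        return lo

    # Phase 2: binary search between the last pass (lo) and first fail (hi).
    while hi - lo > resolution_mhz:
        mid = (lo + hi) / 2
        if works(mid):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # Pretend the hardware works up to 125 MHz.
    print(find_max_frequency(lambda mhz: mhz <= 125.0))
```

As ZipCPU points out in the log, DDR3 also has a lower bound on the clock (a period no slower than 3.3ns per spec), so a search like this only makes sense inside the window where the part is specified to work at all.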