--- Log opened Tue Nov 25 00:00:43 2014 | ||
olofk | Limb: Remind me, what is the issue you are seeing? | 08:08 |
---|---|---|
olofk | ysionneau: Still ready for DFI? :) | 08:32 |
_franck__ | http://fr.slideshare.net/Yi-Chiao/porting-free-rtos-on-openrisc | 10:12 |
_franck__ | it looks good. Too bad there is no source code. | 10:13 |
olofk | _franck_: Nice find! | 10:17 |
olofk | I'll drop him a mail | 10:18 |
_franck__ | I've tried but haven't got any answer. But I'm not as famous as you ! You could have more luck | 10:19 |
olofk | :) | 10:19 |
olofk | You mean that I'm famous for hunting down and killing people who don't answer my e-mails? :) | 10:20 |
_franck__ | yeah that's what I meant :) | 10:21 |
_franck__ | ll | 10:22 |
jeremybennett_ | Is there a problem with OpenRISC SVN at the moment? FuseSoC is struggling here. | 10:25 |
ysionneau | olofk: yes! | 11:02 |
olofk | jeremybennett_: There's often a problem with OpenCores SVN nowadays | 11:19 |
olofk | ysionneau: I was wondering if there is really any benefit in doing a two phase DDR implementation. I switched over to a single phase, and got it mostly working now. Any thoughts | 11:20 |
olofk | ? | 11:20 |
ysionneau | how I understand it, it's easier to have <wb_width> bits moved in/out sdram in one system cycle | 11:24 |
olofk | How wide is your RAM? | 11:25 |
ysionneau | so if your sdram dq_width is 16 and your wb_width is 32 then 1 phase is OK | 11:25 |
ysionneau | but if your dq_width is 8 then you might wanna have 2 phases | 11:25 |
olofk | My dq width is 16 and wb width = 32, but every write consumes 4x16 words, so I need to scale it up to 64 bits to utilize the full bandwidth | 11:26 |
ysionneau | or you can use a wishbone downconverter ( https://github.com/m-labs/migen/blob/master/migen/bus/wishbone.py#L115 ) | 11:26 |
olofk | For writing I would need an upsizer | 11:27 |
ysionneau | 12:25 < olofk> How wide is your RAM? < depends on the system, for now my only DDR system's dq width is 32 (2 x16 chips) | 11:27 |
ysionneau | but to test my controller I cannot have more than 32 bits right now so I reduced the number of used dq pins to x8 | 11:27 |
olofk | So for every burst you need to supply 128 bits of data, right? If you are using BL4 | 11:28 |
olofk | ah ok | 11:28 |
ysionneau | the thing is, in our systems we set BL so that we get wb_size in 1 system clock | 11:28 |
ysionneau | so here I set BL4 in my x8 system | 11:29 |
ysionneau | but all of this is totally implementation dependent :/ | 11:29 |
olofk | But that would only work for x8 and x4 DQ widths, if you have a 32 bit wb | 11:29 |
ysionneau | it only works for dqx8 here, or dqx4 if I use a DownConverter which translates the 1 32bit access in 2 16 bits accesses | 11:31 |
ysionneau | I mean, I think the more phases the better, you get more bandwidth, but I also think it's getting harder to integrate | 11:33 |
ysionneau | it tends to be easier with dq_width x nphases <= wb_width | 11:33 |
olofk | Yes, I'm starting to see the benefits of using several phases if you can use the native data width all the way to the phy | 11:34 |
ysionneau | yes, then you set BL = 2*nphases for DDR | 11:34 |
ysionneau | or 1*nphases for SDR | 11:34 |
olofk | But the phy got complicated enough so I'm settling for a single phase with data_width = 4*dq_width | 11:34 |
olofk | with BL=4 | 11:35 |
olofk | for a 16-bit RAM | 11:35 |
ysionneau | so you have 2 system cycles to retrieve the data right? | 11:35 |
olofk | Lots of restrictions here :) | 11:35 |
ysionneau | here basically your system clock == your ddr clock ? | 11:35 |
olofk | Yes | 11:36 |
ysionneau | ok | 11:36 |
ysionneau | yes I think it's the easier way to do it | 11:36 |
ysionneau | 2 phases is a little bit more complex but not that much and it gives more bandwidth | 11:36 |
olofk | But I will probably need to do this a better way to increase the bandwidth | 11:36 |
olofk | Problem is then that I can only send 32*sys_clk bps anyway | 11:37 |
olofk | Well, that's not entirely true if I have multiple masters | 11:38 |
olofk | Anyway. Hoping to be done with my tech-independent DFI Phy soon. Would be interesting to see how hard it would be to integrate that in misoc | 11:41 |
ysionneau | you've done it in Migen? | 11:45 |
olofk | No. Pure verilog, but I guess it could be wrapped in a module | 11:49 |
olofk | Or if someone wants to rewrite it, it's not very many lines of code | 11:50 |
Limb | olofk: the cellular ram interface works with anything smaller than size=4096 in wb_intercon.conf but anything larger causes CRC errors in openOCD and doesn't work | 22:02 |
olofk | Limb: Do you have the source available somewhere? | 22:04 |
Limb | https://github.com/Limb/orpsoc-cores/blob/nexys4/systems/nexys4/rtl/verilog/cellram_ctrl.v | 22:07 |
Limb | That's unmodified from juliusb original code for the nexys 3 | 22:08 |
olofk | Limb: Do you know if there are any simulation models available for the cell ram? | 22:10 |
Limb | olofk: not to my knowledge | 22:10 |
Limb | I increased the read and write cycles and got it to work at 8192, but it wasn't writing correct values | 22:12 |
olofk | That smells like timing issues. Can you send me your .twr file that's generated by ISE? | 22:13 |
Limb | I will do so in about an hour | 22:14 |
olofk | I don't know a thing about Cellular RAMs, but is the clock really supposed to be 0? | 22:17 |
Limb | olofk: it has a standard SRAM interface, and I don't know much about them either. I have the data sheet on my desk at home. Are you referencing a specific line of code? | 22:19 |
olofk | https://github.com/Limb/orpsoc-cores/blob/nexys4/systems/nexys4/rtl/verilog/cellram_ctrl.v#L127 | 22:21 |
olofk | I found a simulation model by the way. I can try to set up a testcase and see what happens | 22:21 |
Limb | Would you mind linking it olofk? | 22:23 |
olofk | The model is available here http://www.micron.com/parts/psram/cellularram/mt45w8mw16bgx-701-it?pc={BD8A72EA-2DC2-4B88-846E-7B59997A2D97} | 22:23 |
olofk | I like Micron. They have great models | 22:23 |
olofk | Not like stupid Winbond assholes that only supplies encrypted models. Grrr!!! | 22:24 |
Limb | olofk: during asynch mode clock is static low | 22:26 |
olofk | aha | 22:27 |
olofk | Have you tried lowering the clock frequency by the way? | 22:27 |
Limb | https://github.com/juliusbaxter/mor1kx-dev-env/blob/master/boards/xilinx/nexys3/rtl/verilog/orpsoc_top/orpsoc_top.v#L1130 | 22:29 |
Limb | I'm not sure if these lines would affect it, I don't have them in my code | 22:30 |
Limb | And I haven't tried lowering the clock, I figured lowering the nexys4's 100 MHz to 50 MHz would keep it in line with the nexys3 | 22:31 |
olofk | It makes more sense to let wb_intercon handle that arbiter stuff | 22:31 |
Limb | Is that part of the arbiter? I couldn't tell if it was or doing some translation between the WB interface and cell ram | 22:33 |
olofk | I think that's just 2-port wishbone arbiter | 22:34 |
olofk | But with priority on the data bus. Shouldn't matter in this case though | 22:34 |
olofk | Limb: Going to bed now, but I set up a quick test bench. Not sure it does what it's supposed to do, but I'm getting a lot of warnings about timing violations here | 23:03 |
Limb | olofk: http://pastebin.com/vZVxVuig that's my twr file | 23:17 |
--- Log closed Wed Nov 26 00:00:44 2014 |
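
The burst-length and width arithmetic discussed between 11:24 and 11:37 can be sketched in a few lines of Python. This is only an illustration of the rules of thumb quoted in the log (BL = 2*nphases for DDR, 1*nphases for SDR, and a Wishbone bus moving at most wb_width bits per system clock). The function names are hypothetical and belong to neither Migen nor MiSoC, and the 100 MHz system clock in the last line is an assumed figure, not one given for the DDR setup.

```python
def burst_length(nphases: int, ddr: bool = True) -> int:
    """Burst length that delivers one full transfer per system clock.

    Follows the rule from the log: BL = 2*nphases for DDR, 1*nphases for SDR.
    """
    return (2 if ddr else 1) * nphases


def bits_per_burst(dq_width: int, bl: int) -> int:
    """Total data bits moved by one burst on a dq_width-bit wide memory."""
    return dq_width * bl


def sys_cycles_per_burst(bl: int, nphases: int, ddr: bool = True) -> int:
    """System clock cycles needed to move one burst (rounded up)."""
    beats_per_cycle = (2 if ddr else 1) * nphases
    return -(-bl // beats_per_cycle)  # ceiling division


def wishbone_peak_bps(wb_width: int, sys_clk_hz: int) -> int:
    """Upper bound for a single Wishbone master: wb_width bits per clock."""
    return wb_width * sys_clk_hz


if __name__ == "__main__":
    # ysionneau's test setup: x8 DQ, 2 phases, DDR, 32-bit Wishbone.
    bl = burst_length(nphases=2, ddr=True)
    print("x8, 2 phases: BL =", bl, "bits per burst =", bits_per_burst(8, bl))

    # olofk's setup: x16 DQ, single phase, BL=4, so data_width = 4*dq_width.
    print("x16, BL4: bits per burst =", bits_per_burst(16, 4),
          "system cycles per burst =", sys_cycles_per_burst(4, nphases=1))

    # Assumed 100 MHz system clock, for illustration only.
    print("32-bit Wishbone peak:", wishbone_peak_bps(32, 100_000_000), "bps")
```

Under these assumptions the x8 two-phase case fills a 32-bit Wishbone word in one system clock, while the x16 single-phase BL4 case produces 64 bits spread over two system clocks, which matches the "2 system cycles" question at 11:35.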