--- Log opened Sun Sep 11 00:00:06 2016 | ||
shorne | ZipCPU|Laptop: 25 clocks to return data? That's horrible. Any insight into where the latency is, i.e. what is between the requestor and the memory (wb-bus, sdram_controller, sdram)? Can you see the timing all the way to sdram? | 03:50 |
---|---|---|
ZipCPU|Laptop | shorne: I cannot see the latency all the way to the memory right now, but ... | 07:07 |
ZipCPU|Laptop | the MIG leaves its source code lying around. It may be possible to examine it and draw some conclusions. | 07:07 |
ZipCPU|Laptop | I, personally, wouldn't be surprised if the difference had something to do with how difficult the AXI | 07:08 |
ZipCPU|Laptop | bus is to interact with. | 07:08 |
ZipCPU|Laptop | Another measure of interest: My entire design takes about 14.5k LUTs. Of these, 5.6k or about 40% | 07:10 |
ZipCPU|Laptop | OF MY ENTIRE DESIGN is spent by the MIG. | 07:10 |
ZipCPU|Laptop | On some designs I've worked on--this would cost more logic than the entire chip has. | 07:14 |
shorne | ZipCPU|Laptop: why not use wb_sdram_ctrl? or another memory interface? | 08:12 |
shorne | ease of use? | 08:13 |
ZipCPU | shorne: Good questions. | 08:27 |
ZipCPU | I got most of the way through building a wishbone compliant alternative. | 08:28 |
ZipCPU | I got all the logic working 2-3 times, and after every time discovered something I'd messed up. | 08:28 |
ZipCPU | When I went to start over the last time, I got to thinking: If I build this, it's only going to work for a couple of devices, but if I build a (better) wishbone to axi translator, that would work on every device. | 08:29 |
ZipCPU | While I intend to return to the wbddr3 design, I noticed some things about Xilinx --- their entire approach to DDR3 interaction is via undocumented IO features: PHASERs, phaser controllers, IOPLLs, and more. | 08:30 |
ZipCPU | Yet ... while undocumented, their MIG does offer one example design one might follow. | 08:30 |
ZipCPU | Taking that approach will likely take me a lot of time. The improved/pipelined wishbone to axi converter allows me to get my design moving before I've invested that kind of time. | 08:31 |
ZipCPU | Still ... don't be surprised if I do invest that kind of time. | 08:32 |
ZipCPU | This actually provides me with a *new* reason why you want a resource efficient CPU: because the Xilinx Memory Interface Generator is going to hog *ALL* your remaining logic!!! | 08:42 |
-!- heroux_ is now known as heroux | 12:55 | |
olofk | ZipCPU: Yes. MIG has been one of the most common targets for unspeakable words during my career | 16:05 |
olofk | Regarding the latency, I know that they tell you to use their native (UI) interface instead of AXI4 if you need full speed. But I can't possibly see how their AXI layer could add 25-9 extra cycles | 16:06 |
Laksen | If it's using wide IDs and normal memory ordering it would need a heap of bookkeeping logic | 16:51 |
olofk | That could definitely be a reason | 16:54 |
Laksen | Decoding wrap operations, burst length chopping, resizing, ensuring enough space in all outstanding transaction ID RX FIFOs, ensuring enough data is received in all available ID slots | 16:54 |
Laksen | AXI is beautiful in some ways, but very complicated in certain other ways :| | 16:56 |
olofk | Yeah. I love especially the handshaking. Been using AXI4Stream quite a lot | 16:57 |
olofk | AXI4 is more or less five streams :) | 16:57 |
Laksen | AXI4 streaming is great | 16:59 |
Laksen | But it just seems like there are some loose ends in the AXI4 full spec | 16:59 |
Laksen | Or not loose as such, just weird. Like if there's a decoding error early in a burst you have to just drain up to 256 words from the data pipe and only then report an error | 17:01 |
olofk | Error handling is hard | 17:04 |
Laksen | Very true | 17:06 |
--- Log closed Mon Sep 12 00:00:08 2016 |
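
Laksen's point about decode errors in a burst can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the names `WriteBeat`, `make_burst` and `sink_write_burst` are hypothetical and do not come from the MIG or any other AXI core. It models a slave that knows from the address phase that the write cannot be decoded, yet still has to accept every data beat up to the one marked last before it may return the error response.

```python
from dataclasses import dataclass

RESP_OKAY = 0b00       # AXI response encoding: normal success
RESP_DECERR = 0b11     # AXI response encoding: decode error
MAX_AXI4_BURST = 256   # longest AXI4 INCR burst, in beats


@dataclass
class WriteBeat:
    """One write-data transfer (hypothetical model type)."""
    data: int
    last: bool  # plays the role of WLAST


def make_burst(length):
    """Build a burst of `length` beats with only the final beat marked last."""
    if not 1 <= length <= MAX_AXI4_BURST:
        raise ValueError("burst length must be between 1 and 256 beats")
    return [WriteBeat(data=i, last=(i == length - 1)) for i in range(length)]


def sink_write_burst(beats, address_ok):
    """Accept a whole write burst and return (response, beats_accepted).

    Even when the address failed to decode, every beat is still accepted
    and simply discarded. The response is produced only after the beat
    flagged as last, which is the behaviour described in the chat.
    """
    accepted = 0
    for beat in beats:
        accepted += 1  # handshake completes; data is stored or dropped
        if beat.last:
            response = RESP_OKAY if address_ok else RESP_DECERR
            return response, accepted
    raise ValueError("burst ended without a beat marked last")


if __name__ == "__main__":
    response, drained = sink_write_burst(make_burst(256), address_ok=False)
    print(f"response={response:#04b}, beats drained before reporting={drained}")
```

In this model the error is known before the first data beat arrives, but the master only learns about it after the entire burst has been drained, which is the awkwardness being discussed.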