--- Log opened Wed Jun 29 00:00:15 2016 | ||
stekern | olof: yeah, perhaps | 03:22 |
---|---|---|
olof | I've been meaning to make a memory-to-memory DMA from my wb_streamer components | 06:25 |
olof | I just realized that by connecting them the other way around, I can make a FIFO that stores data in a memory with a wishbone port | 06:26 |
ZipCPU|Laptop | olof: Sounds like fun. Did you ever get a chance to look at the DMA I built a while back? It's most definitely a memory to memory DMA, although unlike what I think you are suggesting, both ends are on the same WB bus. | 06:53 |
olof | ZipCPU|Laptop: No, I didn't look at that one. Is it part of ZipCPU or do you have it in a separate repo? | 08:20 |
ZipCPU | It's in the ZipCPU repo on OpenCores, in trunk/rtl/peripherals/wbdmac.v. Don't let that dissuade you: other things depend upon it rather than it depending upon other things. | 08:22 |
ZipCPU | The other big dependence it has on the ZipCPU repo is that the documentation for how to use it is found within the ZipCPU Spec. | 08:22 |
olof | I have no problem with dependencies. That was one of the reasons why I started FuseSoC after all :) | 08:27 |
ZipCPU | olof: One thing kind of special about it--it uses an internal buffer, and works in stages. The first stage copies memory into its internal buffer, the second stage copies it out. | 08:35 |
ZipCPU | The size of the internal buffer is fixed, although you can choose how much to transfer in any given "burst". | 08:35 |
ZipCPU | Interrupts can be used to trigger each "burst" of transfers. | 08:36 |
olof | Yeah, I guess you need an internal buffer to get some efficiency | 08:39 |
ZipCPU | Efficiency: all transfers use WB pipeline mode, so aside from setting up and tearing down the bus, transfers are accomplished in two clocks each--one for reading, one for writing. | 08:41 |
ZipCPU | Yes, this is using the B4 version of the WB spec, not the B3 version. I never heard back from you on the legalities of B4 after I asked on OpenCores ... | 08:42 |
olof | ZipCPU: Sorry about that. I try to do too many things at the same time. Forgot about that | 08:43 |
olof | I would like to resolve this at orconf actually. Richard Herveille, who is listed as the keeper of the standard will be there | 08:43 |
ZipCPU | Not bad--I look forward to meeting everyone there. I head out this afternoon to get my passport ... | 08:44 |
ZipCPU | Still need to figure out hotel reservations ... but, like I said, I look forward to meeting everyone there. | 08:44 |
olof | The problem is that ORSoC, who owns OpenCores, put out a new version of the standard. When they did that, they ripped out the text that said that the specification is free, and replaced it with text that says that you're not allowed to copy parts of the spec and other things | 08:44 |
SMDwrk | ZipCPU you can check airbnb for a place to stay | 08:45 |
ZipCPU | SMDwrk: Nice ... that's the response I was fishing for ;) | 08:45 |
olof | So, to sort things out, I would like to have a new version of the spec, based on b3, probably with the changes from b4, but keeping it as an open standard | 08:46 |
ZipCPU | SMDwrk: Wait, though ... Airbnb is a site not a place ... I was looking for an ideal place where I wouldn't necessarily need to order a cab to commute. | 08:46 |
olof | And let the b4 spec die | 08:46 |
ZipCPU | olof: Really? Let the B4 spec die?? | 08:46 |
olof | ZipCPU: Yes. As long as we have an alternative spec that is free to use | 08:47 |
ZipCPU | Ok, I suppose, but can I at least keep the single clock transfers? I *love* that pipeline mode. | 08:47 |
SMDwrk | ZipCPU but you can choose a place to stay there | 08:47 |
olof | Yes, that's what I'm saying. We keep the additions, but we make a new spec based on the free b3 spec | 08:47 |
olof | We can't use b4 for anything, since ORSoC claims that they own it | 08:48 |
ZipCPU | Other things for updating: STB should imply CYC, so that if (STB) would be identical to if ((STB)&&(CYC)) ... | 08:48 |
olof | I don't agree with that | 08:49 |
ZipCPU | No? | 08:49 |
olof | It makes it hard to do bursts through an arbiter | 08:49 |
olof | Unless you also add an id tag | 08:49 |
olof | cyc means that you own the bus | 08:49 |
olof | stb means that the data in this cycle is valid | 08:49 |
ZipCPU | Oh, I think I get the problem: I'm not suggesting that you get rid of CYC. Yes, it is needed by the bus arbiter to know when to open and close the bus. | 08:50 |
ZipCPU | I'm just suggesting that the arbiter, interconnect, and masters make certain that STB is always low unless CYC is also high. | 08:50 |
ZipCPU | It's just a performance issue in the slave--I can get my slaves to run faster if I don't need to also check CYC at the same time. | 08:50 |
olof | That's what I usually do :) | 08:51 |
olof | Only act on cyc | 08:51 |
olof | or wait | 08:51 |
olof | what would the difference be? | 08:51 |
ZipCPU | Ahm ... no. A slave should act on STB. An arbiter should act on CYC. | 08:51 |
ZipCPU | The difference is in performance. If STB implies CYC as well, so that STB will never be high when CYC is low, then I can use less logic in the slave. | 08:52 |
ZipCPU | As an example, in a memory you might write: if ((STB)&&(WE)) mem <= DATA; | 08:52 |
olof | one gate :) | 08:52 |
ZipCPU | This is simpler than, if ((CYC)&&(STB)&&(WE)) mem <= DATA; | 08:53 |
ZipCPU | Or, also in the same memory, ACK <= STB rather than ACK <= (CYC)&&(STB); | 08:53 |
ZipCPU | While you might argue the performance difference isn't that much in this memory example, checking for both CYC and STB can become prohibitive in more complex designs. | 08:54 |
olof | Well, you would still need to do the and-ing in the arbiter instead. Any smart synthesis tool would end up with the same logic | 08:54 |
ZipCPU | Let's see ... in my arbiter, I have a line "assign o_stb = (r_a_owner) ? i_a_stb : i_b_stb;" ... no reference to CYC there, except elsewhere where I determine whether A is the owner or not. | 08:55 |
olof | I just duplicate my stb to all slaves | 08:56 |
olof | No need for a mux, since I control the gating with cyc instead | 08:56 |
olof | I think we will end up with basically the same code whichever way we do it | 08:57 |
ZipCPU | REALLY??? I've been doing the opposite: duplicating CYC to all of the slaves, and gating the STB with whether or not the slave device is being addressed. | 08:57 |
olof | I'm not surprised actually. We have had a lot of discussions trying to interpret the wishbone spec in this channel over the years :) | 08:58 |
ZipCPU | <GRIN> | 08:58 |
olof | Let's just say that there is more than one way to read it | 08:58 |
olof | Honestly, I think the bus is severely underspecified and not all corner cases are thought of | 08:58 |
ZipCPU | Perhaps I should share one? | 08:58 |
ZipCPU | What happens when you address a device, issue a bus command for the device, and then address a second one with a different delay requirement? | 08:59 |
olof | For example, you're not allowed to change sel during a burst. It turned out that this made it impossible to write an efficient bus resizer | 08:59 |
ZipCPU | The "ACKs" might come back at different delays, and land on top of each other. | 08:59 |
olof | This is where cyc should help you. You shouldn't drop cyc until the ack has returned, right? | 09:00 |
ZipCPU | Correct. | 09:00 |
olof | I won! | 09:00 |
ZipCPU | <GRIN--AGAIN> | 09:00 |
ZipCPU | But ... if you have multiple requests flying through the fabric, one right after another, pipelined mode, then CYC stays high until all are complete. | 09:01 |
ZipCPU | If those requests cross devices, you might end up with multiple ACKs coming back on the same clock. | 09:01 |
olof | I have to be honest and say that I've stayed away from pipelined mode. It's a necessary addition, since wishbone is total crap as soon as you need to do a CDC, but I still haven't done designs with it | 09:02 |
ZipCPU | Ok. Fair enough. I've been using it extensively and have fallen in love with it. | 09:03 |
olof | But there is an interesting side note here. Are you allowed to do two bursts without lowering cyc between them? | 09:03 |
olof | IIRC we discovered some nasty case around this area about a year ago | 09:04 |
ZipCPU | Good question. | 09:05 |
ZipCPU | Here's a corollary: What defines the limits of a burst? What if the master wants to execute a 16 burst request, but does so with various stalls between the bursts? | 09:05 |
ZipCPU | STB might look like: 10010010001001001000011 ... just as a fun example, with CYC valid from the beginning until the last ACK. Is that valid? I think it should be. | 09:06 |
ZipCPU | At the same time, I've restricted the burst modes I've dealt with to those where the address lines are either constant or incrementing. I've also limited the bursts to one device at a time, never crossing slave devices. | 09:09 |
olof | Yes, that's how I would use cyc and stb. Just raise cyc until it's all done, and lower stb in case I don't have a next valid data to send | 09:12 |
ZipCPU | I'm also dealing with some devices with rather long pipelines. For these, I use ((STB_O)&&(~STALL_I)) as the condition to determine whether I should move on to the next request. | 09:14 |
ZipCPU | As I recall, sending multiple requests in flight at once was a problem under B3 ... I'm just not familiar enough with B3 to state what that problem was (or is). | 09:15 |
ZipCPU | Yeah, here it is: block READ as an example, the master cannot start the cycle for the next word of data until ACK is asserted by the slave. | 09:19 |
ZipCPU | The slave cannot assert ACK, however, until its output DATA is valid. | 09:19 |
ZipCPU | Hence, when I try to get a SDRAM chip up and running, I can run the SDRAM in a pipelined mode to read/write one word every clock cycle. | 09:20 |
ZipCPU | But, if I need to wait from one request to the next, I can't start reading the second word until I've responded to the first read request. | 09:20 |
olof | ZipCPU: I found parts of a wishbone discussion that stekern and I had 1.5 years ago. It's incredibly confusing and I'm not sure we worked everything out | 09:21 |
ZipCPU | Hence the CAS latency would be applied to every word I tried to read, rather than just the first word. | 09:21 |
olof | http://www.juliusbaxter.net/openrisc-irc/%23openrisc.2014-12-02.log.html | 09:21 |
olof | Also the day before | 09:21 |
olof | ZipCPU: Yes, this is one area where wishbone (b3) is horrible | 09:22 |
ZipCPU | I got a point! | 09:22 |
olof | stekern worked around this in our usual SDRAM controller by introducing a small multiport cache in front of the RAM | 09:22 |
olof | So every time we read a word, we ask the SDRAM for a larger chunk | 09:23 |
ZipCPU | So let's remember, then, for our discussion at ORCONF that we want a pipelined mode that can transfer one datum per clock, and that allows multiple transactions to be in flight at any given time. | 09:24 |
olof | As I said before, we want the additions from b4 (I guess), but having it as a free standard | 09:25 |
ZipCPU | So that's the beginning outline of the discussion. I look forward to being a part of it! | 09:25 |
olof | I hope that the CERN guys show up. They were the ones originally requesting this update to the spec | 09:26 |
ZipCPU | stekern: I'm looking for the most recent/up to date OpenRisc architecture spec. Would I find that at https://github.com/openrisc/doc? | 15:41 |
SMDhome | ZipCPU I think so | 16:53 |
ZipCPU | Thanks! | 16:53 |
SMDhome | ZipCPU also check this http://opencores.org/or1k/Architecture_Specification | 16:55 |
ZipCPU | I thought you guys had left opencores ...? That's why I was asking about the most "up to date" spec. | 16:56 |
SMDhome | ZipCPU Olof or wallento could answer more precisely, but 1.1 specification is the latest afaik | 17:01 |
SMDhome | plus opencores is the only place I could find proposals about 1.2 revision | 17:01 |
ZipCPU | Ok ... it took me a while to find it there, but I eventually found an ODT file to look through. | 17:03 |
ZipCPU | The OpenRISC ISA specifies floating point instructions for both single and double precision floating point. Is this an extension to the basic processor, or a requirement of any OpenRISC processor? | 18:38 |
--- Log closed Thu Jun 30 00:00:17 2016 |
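
Two of the timing points raised in the log, ACKs from different slaves landing on the same clock and the CAS latency being paid per word without pipelining, can be sketched with a small Python timing model. This is only a rough illustration under stated assumptions, not code from the ZipCPU or OpenRISC projects: the function names, the fixed per-slave latency, the absence of stalls, and the "one extra clock per word" cost in classic mode are all hypothetical simplifications.

```python
from collections import defaultdict


def ack_schedule(requests, latency):
    """Map each clock to the slaves that would ACK on that clock.

    requests: list of (issue_clock, slave_name) pairs, pipelined mode.
    latency:  dict slave_name -> clocks from accepted request to ACK
              (assumed fixed per slave; real slaves may vary).
    """
    acks = defaultdict(list)
    for issue_clock, slave in requests:
        acks[issue_clock + latency[slave]].append(slave)
    return dict(acks)


def colliding_acks(requests, latency):
    """Return only the clocks on which more than one ACK comes back."""
    schedule = ack_schedule(requests, latency)
    return {clk: slaves for clk, slaves in schedule.items() if len(slaves) > 1}


def classic_read_clocks(words, latency):
    """Clocks for a block read with one request outstanding at a time.

    Assumption: each word pays the full slave latency, plus one clock
    for the master to present the next request after seeing the ACK.
    """
    return words * (latency + 1)


def pipelined_read_clocks(words, latency):
    """Clocks for a block read with one request accepted per clock.

    Assumption: the slave never stalls, so the latency is paid once.
    """
    return words + latency


if __name__ == "__main__":
    # Hypothetical slaves: "a" answers after 3 clocks, "b" after 2.
    lat = {"a": 3, "b": 2}
    print(colliding_acks([(0, "a"), (1, "b")], lat))
    print(classic_read_clocks(8, 3), pipelined_read_clocks(8, 3))
```

In this model, requests issued to "a" on clock 0 and to "b" on clock 1 both ACK on clock 3, which is the collision described in the log. An 8-word read at a latency of 3 costs 32 clocks in the classic model versus 11 when pipelined.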