IRC logs for #openrisc Thursday, 2016-12-15

--- Log opened Thu Dec 15 00:00:28 2016
-!- ZipCPU_ is now known as ZipCPU08:27
arand_Does the dcache addressing behaviour differ depending on if it's disabled or enabled?10:03
olofk_shorne: You're in the news today :)
olofk_kc5tja: Yes, there are some plans and discussions for future Wishbone. A packet mode has been suggested as one part of a new spec16:42
olofk_I was actually in a video conference a few days ago with a bunch of people from different particle accelerators around the world, who are using wishbone as well and want to make improvements16:42
olofk_We will try to move things forward with wishbone, but it hasn't risen high enough on the list of priorities yet16:43
ZipCPUolofk_: You mean ... bonus points are available for anyone who can build something faster/better/cheaper ... first?  ;)16:45
olofk_ZipCPU: Especially cheaper. We must bring down the price on wishbone quite a bit :)16:48
ZipCPUMaybe I could get you something that would cost 4x fewer $?16:49
olofk_That would be fantastic! If you do that I'll give you a 70% discount on FuseSoC16:49
ZipCPUI wonder if the UART-WB converter might offer a possible form.  Every transaction takes up to 36 bits on a bus.  The first four bits define what sort of transaction it is, the other 32 are available for data.16:52
ZipCPUOne word can be used to set an address, the next (several) to write to the address, or (if so commanded) to read a number of values.16:53
ZipCPUHence, a read request would require 2-words to be transmitted, one for the address, one for the number of items to read.  (Whether or not to increment the address between reads is captured in there too.)16:54
ZipCPUA write request would require one address word, and then one word per word written to the bus.  The length is determined by the number of words in the message.  One bit of each word tells you whether or not to increment the address along the way.16:54
ZipCPUPerhaps an extra word would need to be added to the beginning of every packet as well, identifying the node and giving an ID for the transaction.16:55
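The 36-bit word format described above (4 bits of transaction type, 32 bits of data) can be sketched roughly as follows. The command codes and function names are hypothetical, chosen only for illustration; the log does not give the actual encoding, and folding the increment flag into the command code is an assumption.

```python
# Hypothetical 4-bit transaction codes. The log does not define real values.
CMD_SET_ADDR = 0x1   # payload is a bus address
CMD_WRITE = 0x2      # payload is data, address stays fixed
CMD_WRITE_INC = 0x3  # payload is data, address increments afterwards
CMD_READ = 0x4       # payload is a read count, address stays fixed
CMD_READ_INC = 0x5   # payload is a read count, address increments

def pack_word(cmd, payload):
    """Pack a 4-bit command and a 32-bit payload into one 36-bit word."""
    if not 0 <= cmd < 16:
        raise ValueError("command must fit in 4 bits")
    if not 0 <= payload < 2**32:
        raise ValueError("payload must fit in 32 bits")
    return (cmd << 32) | payload

def unpack_word(word):
    """Split a 36-bit word back into (command, payload)."""
    return (word >> 32) & 0xF, word & 0xFFFFFFFF

def read_request(addr, count, increment=True):
    """Two words: one for the address, one for the number of items to read."""
    cmd = CMD_READ_INC if increment else CMD_READ
    return [pack_word(CMD_SET_ADDR, addr), pack_word(cmd, count)]

def write_request(addr, values, increment=True):
    """One address word, then one word per value written to the bus."""
    cmd = CMD_WRITE_INC if increment else CMD_WRITE
    return [pack_word(CMD_SET_ADDR, addr)] + [pack_word(cmd, v) for v in values]
```

The write length is implied by the number of words in the message, so no explicit length field appears in this sketch.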
olofk_Doing like that would make serializing more natural, and I guess that's why you chose this format for the bridge16:58
olofk_(out of curiosity: have you looked at etherbone, which is a standard for transmitting WB over ethernet?)16:58
ZipCPUNo ... I didn't know there was such, as I've been tempted to build my own ethernet over wishbone protocol.16:59
olofk_anyway, back to your idea17:00
ZipCPUOkay ... so the width of any channel would need to be the word size, plus the word size / 8, plus two bits.17:03
ZipCPUThe word size is obvious, one bit for each data bit to be sent.17:03
ZipCPUThe word size / 8 shouldn't be too hard to figure out: One bit for each of the select lines.17:04
ZipCPUThe last two bits are (for each word written) a '1' to specify a write (as opposed to anything else), and a 0/1 to specify whether or not the address should be incremented between writes.17:04
ZipCPUThus, for a 32-bit data channel, you would need 42 bits total.  Realistically, you'd need two more: one for STB, another for STALL.17:05
ZipCPUSTB traveling with the packet, STALL going against the current.17:06
ZipCPUProbably needs more specific thought/documentation, but ... it should be quite doable.17:06
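The width rule stated above (word size, plus word size / 8 select bits, plus two control bits, optionally plus STB and STALL) can be written out as a small calculation. The function name is hypothetical. Note that the rule as itemized gives 38 bits for a 32-bit word (40 with STB and STALL); the figure of 42 in the log may count bits that are not itemized in the conversation.

```python
def channel_width(word_bits, with_handshake=False):
    """Channel width per the rule in the discussion.

    word_bits       one bit per data bit
    word_bits // 8  one bit per byte-select line
    2               write flag and address-increment flag
    2 (optional)    STB (forward) and STALL (reverse)
    """
    if word_bits <= 0 or word_bits % 8 != 0:
        raise ValueError("word size must be a positive multiple of 8")
    width = word_bits + word_bits // 8 + 2
    if with_handshake:
        width += 2
    return width
```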
ZipCPUYou could even define words for particular responses, such as passing around an "interrupt", "bus error", "bus reset" has taken place, or a "lock request" and/or "lock clear" signal.17:11
ZipCPUIf you keep the signal lines at 6 bits or fewer, then a 6-LUT should process them nicely, no?  ;)17:11
olofk_I haven't given that idea much thought, but I did create a bus about two years ago with some other ideas17:15
ZipCPURealistically, the purpose of such a bus might be to handle packet switched requests, which would then be turned into WB actions at their destinations.17:16
olofk_I called it CAMD as a working name. The idea there was to separate Command/Address and Data/Mask into separate buses17:16
olofk_And have a third channel for responses17:16
ZipCPUWouldn't you want an economy of wires?  Since you'd be using a packet approach, wouldn't you wish to start with an address at the beginning of the packet?17:17
ZipCPUThe reverse channel could be identical, save for a bit in the Start of Packet word specifying that it is a return response.17:18
olofk_One argument against serializing (as in your proposal) is that data might well be the width of a cache line or DDR burst (say 256 bits) while you still want a 32-bit address space17:18
olofk_For off-chip serializing is important, and I think we should consider that use case from the beginning, but we also need to optimize for on-chip behaviour17:18
ZipCPUIf the address is smaller than the amount of data, stuff the bits--no problem.  If the data width is smaller than the address width, use multiple clocks to send the address.17:19
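A rough sketch of that suggestion: when the channel is wider than the address, the address fits in one zero-padded beat, and when it is narrower, the address goes out over several clocks. The function name and the most-significant-first ordering are assumptions, not something the discussion settles.

```python
def address_beats(addr, addr_bits, data_bits):
    """Split an address into data_bits-wide beats, most significant first.

    If data_bits >= addr_bits the result is a single beat whose upper
    bits are zero (the "stuff the bits" case).
    """
    if not 0 <= addr < (1 << addr_bits):
        raise ValueError("address does not fit in addr_bits")
    n_beats = -(-addr_bits // data_bits)  # ceiling division
    mask = (1 << data_bits) - 1
    return [(addr >> (data_bits * i)) & mask for i in reversed(range(n_beats))]
```

For example, a 32-bit address on an 8-bit channel takes four beats, while the same address on a 256-bit channel takes one.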
olofk_I will do my best to find compelling arguments against your idea, and hopefully I'll come up with none :)17:23
olofk_There are a few of course, but I'm not sure they are that strong17:23
ZipCPUSo, a read request might look something like:
olofk_1. You will lose bandwidth every time you send a new address instruction (but you can make sure to write large bursts to make that penalty small)17:25
olofk_2. It will add logic to all slaves to decode the combined data/address bus17:25
olofk_(not sure that is a lot however)17:25
olofk_3. It will not be backwards-compatible (but I find it hard to come up with any scheme that is a significant improvement and still retains backwards compatibility)17:26
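The bandwidth cost in point 1 can be estimated with a simple ratio, assuming one address word of overhead per burst and one data word per clock (both simplifications, and the function name is hypothetical):

```python
def bus_efficiency(burst_len, overhead_words=1):
    """Fraction of transferred words that carry data rather than addressing."""
    if burst_len <= 0:
        raise ValueError("burst length must be positive")
    return burst_len / (burst_len + overhead_words)
```

Under these assumptions a single-word write uses half the channel for the address, while a 15-word burst spends only 1/16 of its words on addressing.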
ZipCPUAs for backwards compatibility, I already have something similar to what would be needed to bridge between this and the current WB (B4, pipelined ;)17:27
ZipCPUThe big problem you will have is the stall line on the reverse, since nothing in the current WB allows responses to stall.17:27
olofk_Oh yes. Any new scheme would definitely need flow control on the return channel17:28
olofk_I would probably still want to handle valid/ready flags outside of the payload17:29
olofk_So that the underlying medium can take care of this17:29
olofk_But now I need to sleep :)17:30
ZipCPUMaybe kc5tja and I will solve it before you wake back up.17:30
kc5tjaolofk_: Yay!17:32
ZipCPUkc5tja: Have you followed the discussion so far?17:33
kc5tjaI just read up.17:36
ZipCPUAny thoughts?  You were looking for a packet based ... WB replacement?17:37
kc5tjaI'd like to see one eventually, but I don't need it now.17:39
kc5tjaBut I think I need to clarify what I mean by packets.17:39
kc5tjaAXI4 transmits entire packets in a single clock (well, req/ack) cycle.17:39
kc5tjaThe availability of five different channels means that a single interconnect can be participating in five different transactions at any given time.17:40
ZipCPUNot really.  They transmit entire packet requests in a single clock.  The packets responses still take about one clock per data word.17:40
kc5tjaSure, if you're streaming.  AXI4 supports 1024- and 2048-bit wide buses, which is an entire cache-line.17:40
ZipCPUDo you envision a single interconnect to handle five different transactions per clock, or TDM those transactions in a more traditional packet structure?17:40
kc5tjaBoth, for different applications.  Multitasking links for more efficient use of on-chip resources, and TDM for off-chip access.17:41
kc5tjaBut, like I said, if you tweak WB to meet the former spec, you'll end up with AXI4 *anyway*.17:42
ZipCPUYeah, I'd like to avoid creating AXI4.17:43
kc5tjaThe thing is, AXI4's spec allows concurrent transactions on multiple channels, so it's actually possible to bridge (non-pipelined) Wishbone to AXI4 pretty easily.  It just involves a ton of wiring.17:52
kc5tjaYou don't even need a state machine as long as word widths are the same.17:52
ZipCPUYeah ... except for AXI4 burst requests ... ;)17:56
ZipCPUI just finished building an AXI4-WB bridge, and those burst requests were what made the whole thing difficult.17:57
kc5tjaHow so?17:58
ZipCPUIn the end, I needed a FIFO with four pointers in it: One to reference the request that was received from AXI.  This included any expanded burst request.  The second pointer was to WB requests that had been issued.  The third to WB ACK's received (and possibly data), and the fourth kept track of responses given.17:59
ZipCPUI'm not looking forward to debugging the timing errors associated with reading and writing from/to those FIFOs.18:00
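The four-pointer FIFO described above can be modelled in software roughly as below. This is only a behavioural sketch under assumed names, not the bridge's actual logic: each pointer is a free-running counter, and the invariant is resp <= acked <= issued <= req <= resp + depth.

```python
class FourPointerFifo:
    """Behavioural model of a FIFO tracked by four pointers."""

    def __init__(self, depth):
        self.depth = depth
        self.slots = [None] * depth
        self.req = 0     # requests accepted from AXI (bursts already expanded)
        self.issued = 0  # requests issued on Wishbone
        self.acked = 0   # Wishbone ACKs received (with any read data)
        self.resp = 0    # responses returned to AXI

    def ready(self):
        """False when full: the input READY line would have to drop."""
        return self.req - self.resp < self.depth

    def accept(self, addr):
        if not self.ready():
            return False
        self.slots[self.req % self.depth] = {"addr": addr, "data": None}
        self.req += 1
        return True

    def issue(self):
        """Return the next address to put on Wishbone, or None if idle."""
        if self.issued == self.req:
            return None
        addr = self.slots[self.issued % self.depth]["addr"]
        self.issued += 1
        return addr

    def ack(self, data=None):
        if self.acked == self.issued:
            raise RuntimeError("ACK with no outstanding request")
        self.slots[self.acked % self.depth]["data"] = data
        self.acked += 1

    def respond(self):
        """Return the oldest completed entry, or None if none is ready."""
        if self.resp == self.acked:
            return None
        entry = self.slots[self.resp % self.depth]
        self.resp += 1
        return entry
```

A slot is freed only when its response has been returned, so a long burst keeps `ready()` false until earlier entries drain.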
kc5tjaYou need FIFOs with pipelined WB too.18:00
kc5tjaThe master must store a transaction in a FIFO so that it knows what transaction the next ACK applies to.18:00
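A minimal sketch of that bookkeeping, assuming ACKs come back in the same order the requests were issued (as with a single pipelined slave). The class and method names are hypothetical.

```python
from collections import deque

class PipelinedMaster:
    """Tracks outstanding requests so each ACK can be matched to one."""

    def __init__(self):
        self.outstanding = deque()

    def request(self, tag):
        # Remember what was asked for at the moment it is issued.
        self.outstanding.append(tag)

    def on_ack(self, data=None):
        # The oldest outstanding request is the one this ACK answers.
        if not self.outstanding:
            raise RuntimeError("ACK received with nothing outstanding")
        return self.outstanding.popleft(), data
```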
kc5tjaAre you targeting AXI3 or AXI4?18:00
ZipCPUI mean, consider this, what if you receive a request for a 256-long burst.  If your FIFO is 16 long, it will be a while before you can accept any more requests, so you'll need to drop the READY lines on the input channels.18:00
ZipCPUHmm ... yeah, I do have other bus FIFO's wandering around, but I have tried to keep them out of the interconnect.18:02
ZipCPUIn general, only the source master is maintaining any FIFO's.18:02
ZipCPUAnd ... not all of my source masters maintain FIFOs.18:03
kc5tjaWhen the Kestrel advances to the point of needing caches, the CPU suddenly finds itself in a different clock domain from the rest of the system.  FIFOs are compulsory for that case, which is why my cached version of the CPU will use B4 not B3.  :)18:03
ZipCPU|LaptopHello gnawzie!22:19
gnawzieWhat's up?22:19
ZipCPU|LaptopIt's late at night here, and I'm going to lose it soon.22:20
ZipCPU|LaptopHow about you?  You must be just getting up for the day?22:20
gnawzieNo it's about 6:20pm now22:20
gnawzieI'm thinking of using openrisc for a drone flight controller22:20
ZipCPU|LaptopAhh ... PST.  EST here.22:20
ZipCPU|LaptopWait ... 6:20pm isn't PST, ... that's not in the US at all is it?22:21
gnawzieits alaska22:22
ZipCPU|LaptopOk, that makes sense.22:24
ZipCPU|LaptopJust saw your question on ##fpga.  "Is OpenRISC any good"?22:24
ZipCPU|LaptopThis is the forum to chat with folks who use it.  Only problem is, most of those folks are on European time.22:24
ZipCPU|LaptopWe'll see them in the morning.22:24
ZipCPU|LaptopWhat are you looking for?22:25
gnawziejust a bit overwhelmed by the complexity22:25
ZipCPU|LaptopYeah ... tell me about it.  I understand completely!22:26
ZipCPU|LaptopWhat level are you looking to understand OpenRISC at?22:26
ZipCPU|LaptopUser/application level?  Bare metal?  Verilog?  Compile/tools level?22:26
ZipCPU|LaptopI ask because the complexity at each level is different.  Some are easier than others.22:27
gnawzieI really don't know the structure of those I have no direction lol22:28
ZipCPU|LaptopWell, okay, let's try this a different way: what do you want to accomplish?22:29
ZipCPU|LaptopIn many ways, OpenRISC is a tool.  Which tool you use depends upon the task at hand.22:30
gnawzieOkay... that makes sense then. It's not a core you can simply instantiate and use22:32
ZipCPU|LaptopWell ... I might disagree with that.  Many people have just instantiated it and used it.22:32
ZipCPU|LaptopPlacing it into your own design will require building a bus with peripherals on it, connecting the core to that bus, and perhaps even a debug core to that bus.22:33
ZipCPU|LaptopThen you would load instructions into the CPU, and off you go!22:33
ZipCPU|LaptopOk, back to that load instructions part ... that will require building the toolchain, compiler, etc., and then using it to build your program.  All quite doable.22:34
ZipCPU|LaptopWhile the core will run Linux, I think you'll find that many of the folks on this forum aren't running Linux on it.22:34
ZipCPU|LaptopThe other thing to know is that many of the folks on this forum have made those tasks REALLY easy.22:39
ZipCPU|LaptopHowever, I need to head to bed for the night.  Feel free to ask further questions, but do please stick around for the answers.22:40
ZipCPU|LaptopYou might not notice them till sometime tomorrow morning.22:41
--- Log closed Fri Dec 16 00:00:29 2016
