--- Log opened Thu Dec 15 00:00:28 2016
-!- ZipCPU_ is now known as ZipCPU | 08:27 |
arand_ | Does the dcache addressing behaviour differ depending on if it's disabled or enabled? | 10:03 |
olofk_ | shorne: You're in the news today :) http://phoronix.com/scan.php?page=news_item&px=Linux-4.10-OpenRISC | 16:41 |
olofk_ | kc5tja: Yes, there are some plans and discussions for future Wishbone. A packet mode has been suggested as one part of a new spec | 16:42 |
olofk_ | I was actually in a video conference a few days ago with a bunch of people from different particle accelerators around the world, who are using wishbone as well and want to make improvements | 16:42 |
olofk_ | We will try to move things forward with wishbone, but it hasn't risen high enough on the list of priorities yet | 16:43 |
ZipCPU | olofk_: You mean ... bonus points are available for anyone who can build something faster/better/cheaper ... first? ;) | 16:45 |
olofk_ | ZipCPU: Especially cheaper. We must bring down the price on wishbone quite a bit :) | 16:48 |
ZipCPU | Maybe I could get you something that would cost 4x fewer $? | 16:49 |
olofk_ | That would be fantastic! If you do that I'll give you a 70% discount on FuseSoC | 16:49 |
ZipCPU | LOL. | 16:50 |
ZipCPU | I wonder if the UART-WB converter might offer a possible form. Every transaction takes up to 36 bits on a bus. The first four bits define what sort of transaction it is, the other 32 are available for data. | 16:52 |
ZipCPU | One word can be used to set an address, the next (several) to write to the address, or (if so commanded) to read a number of values. | 16:53 |
ZipCPU | Hence, a read request would require two words to be transmitted, one for the address, one for the number of items to read. (Whether or not to increment the address between reads is captured in there too.) | 16:54 |
ZipCPU | A write request would require one address word, and then one word per word written to the bus. The length is determined by the number of words in the message. One bit of each word tells you whether or not to increment the address along the way. | 16:54 |
ZipCPU | Perhaps an extra word would need to be added to the beginning of every packet as well, identifying the node and giving an ID for the transaction. | 16:55 |
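The word format ZipCPU describes above can be sketched in a few lines. This is a hedged illustration, not the actual debug-bus encoding: the command codes (CMD_SET_ADDR and friends) and the exact field positions are assumptions made up for this sketch; only the 4-bit-command-plus-32-bit-payload split and the two-word read / address-plus-data write shape come from the discussion.

```python
# Hypothetical command codes -- the real converter's encoding may differ.
CMD_SET_ADDR = 0x1
CMD_WRITE    = 0x2
CMD_READ     = 0x3

def pack_word(cmd, payload):
    """Pack a 4-bit command and a 32-bit payload into one 36-bit word."""
    assert 0 <= cmd < 16 and 0 <= payload < (1 << 32)
    return (cmd << 32) | payload

def read_request(addr, count):
    """A read takes two words: one sets the address, one gives the count."""
    return [pack_word(CMD_SET_ADDR, addr), pack_word(CMD_READ, count)]

def write_request(addr, data):
    """A write takes one address word plus one word per value written;
    the burst length is simply the number of words in the message."""
    return [pack_word(CMD_SET_ADDR, addr)] + [pack_word(CMD_WRITE, d) for d in data]
```

Note how the write burst needs no explicit length field, which is the point being made above.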
olofk_ | Doing like that would make serializing more natural, and I guess that's why you chose this format for the bridge | 16:58 |
olofk_ | (out of curiosity. Have you looked at etherbone, which is a standard for transmitting WB over ethernet?) | 16:58 |
ZipCPU | No ... I didn't know there was such, as I've been tempted to build my own ethernet over wishbone protocol. | 16:59 |
olofk_ | http://www.ohwr.org/projects/etherbone-core | 17:00 |
olofk_ | anyway, back to your idea | 17:00 |
ZipCPU | Okay ... so the width of any channel would need to be the word size, plus the word size / 8, plus two bits. | 17:03 |
ZipCPU | The word size is obvious, one bit for each data bit to be sent. | 17:03 |
ZipCPU | The word size / 8 shouldn't be too hard to figure out: One bit for each of the select lines. | 17:04 |
ZipCPU | The last two bits are (for each word written) a '1' to specify a write (as opposed to anything else), and a 0/1 to specify whether or not the address should be incremented between writes. | 17:04 |
ZipCPU | Thus, for a 32-bit data channel, you would need 42 bits total. Realistically, you'd need two more: one for STB, another for STALL. | 17:05 |
ZipCPU | STB traveling with the packet, STALL going against the current. | 17:06 |
ZipCPU | Probably needs more specific thought/documentation, but ... it should be quite doable. | 17:06 |
ZipCPU | You could even define words for particular responses, such as passing around an "interrupt", a "bus error", a notice that a "bus reset" has taken place, or a "lock request" and/or "lock clear" signal. | 17:11 |
ZipCPU | If you keep the signal lines at 6-bits or less, then a 6-LUT should process them nicely, no? ;) | 17:11 |
olofk_ | I haven't given that idea much thought, but I did create a bus about two years ago with some other ideas | 17:15 |
ZipCPU | Realistically, the purpose of such a bus might be to handle packet switched requests, which would then be turned into WB actions at their destinations. | 17:16 |
olofk_ | I called it CAMD as a working name. The idea there was to separate Command/Address and Data/Mask into separate buses | 17:16 |
olofk_ | And have a third channel for responses | 17:16 |
ZipCPU | Wouldn't you want an economy of wires? Since you'd be using a packet approach, wouldn't you wish to start with an address at the beginning of the packet? | 17:17 |
ZipCPU | The reverse channel could be identical, save for a bit in the Start of Packet word specifying that it is a return response. | 17:18 |
olofk_ | One argument against serializing (as in your proposal) is that data might well be the width of a cache line or DDR burst (say 256 bits) while you still want a 32-bit address space | 17:18 |
olofk_ | For off-chip, serializing is important, and I think we should consider that use case from the beginning, but we also need to optimize for on-chip behaviour | 17:18 |
ZipCPU | If the address is smaller than the amount of data, stuff the bits--no problem. If the data width is smaller than the address width, use multiple clocks to send the address. | 17:19 |
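ZipCPU's reply above reduces to a one-line ceiling division: a narrow address rides in a single wide beat (with the spare bits stuffed), while a wide address is split over several clocks of a narrow channel. A sketch, with the function name my own invention:

```python
def address_beats(addr_width, data_width):
    """Clock cycles needed to carry an address over the data channel.
    Uses ceiling division: an address narrower than the data width
    still occupies one full beat."""
    return max(1, -(-addr_width // data_width))
```

So olofk_'s 256-bit-data/32-bit-address case costs one beat, while e.g. a 64-bit address over a 32-bit channel would take two.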
olofk_ | I will do my best to find compelling arguments against your idea, and hopefully I'll come up with none :) | 17:23 |
olofk_ | There are a few of course, but I'm not sure they are that strong | 17:23 |
ZipCPU | So, a read request might look something like: http://imgur.com/a/Scq30 | 17:24 |
olofk_ | 1. You will lose bandwidth every time you send a new address instruction (but you can make sure to write large bursts to make that penalty small) | 17:25 |
olofk_ | 2. It will add logic to all slaves to decode the combined data/address bus | 17:25 |
olofk_ | (not sure that is a lot however) | 17:25 |
olofk_ | 3. It will not be backwards-compatible (but I find it hard to come up with any scheme that is a significant improvement and still retains backwards compatibility) | 17:26 |
ZipCPU | As for backwards compatibility, I already have something similar to what would be needed to bridge between this and the current WB (B4, pipelined ;) | 17:27 |
ZipCPU | The big problem you will have is the stall line on the reverse, since nothing in the current WB allows responses to stall. | 17:27 |
olofk_ | Oh yes. Any new scheme would definitely need flow control on the return channel | 17:28 |
olofk_ | I would probably still want to handle valid/ready flags outside of the payload | 17:29 |
olofk_ | So that the underlying medium can take care of this | 17:29 |
olofk_ | But now I need to sleep :) | 17:30 |
ZipCPU | Rgr. | 17:30 |
ZipCPU | Maybe kc5tja and I will solve it before you wake back up. | 17:30 |
kc5tja | olofk_: Yay! | 17:32 |
ZipCPU | kc5tja: Have you followed the discussion so far? | 17:33 |
kc5tja | I just read up. | 17:36 |
ZipCPU | Any thoughts? You were looking for a packet based ... WB replacement? | 17:37 |
kc5tja | I'd like to see one eventually, but I don't need it now. | 17:39 |
kc5tja | But I think I need to clarify what I mean by packets. | 17:39 |
kc5tja | AXI4 transmits entire packets in a single clock (well, req/ack) cycle. | 17:39 |
kc5tja | The availability of five different channels means that a single interconnect can be participating in five different transactions at any given time. | 17:40 |
ZipCPU | Not really. They transmit entire packet requests in a single clock. The packet responses still take about one clock per data word. | 17:40 |
kc5tja | Sure, if you're streaming. AXI4 supports 1024- and 2048-bit wide buses, which is an entire cache-line. | 17:40 |
ZipCPU | Do you envision a single interconnect to handle five different transactions per clock, or TDM those transactions in a more traditional packet structure? | 17:40 |
kc5tja | Both, for different applications. Multitasking links for more efficient use of on-chip resources, and TDM for off-chip access. | 17:41 |
kc5tja | But, like I said, if you tweak WB to meet the former spec, you'll end up with AXI4 *anyway*. | 17:42 |
ZipCPU | Yeah, I'd like to avoid creating AXI4. | 17:43 |
kc5tja | The thing is, AXI4's spec allows concurrent transactions on multiple channels, so it's actually possible to bridge (non-pipelined) Wishbone to AXI4 pretty easily. It just involves a ton of wiring. | 17:52 |
kc5tja | You don't even need a state machine as long as word widths are the same. | 17:52 |
ZipCPU | Yeah ... except for AXI4 burst requests ... ;) | 17:56 |
ZipCPU | I just finished building an AXI4-WB bridge, and those burst requests were what made the whole thing difficult. | 17:57 |
kc5tja | How so? | 17:58 |
ZipCPU | In the end, I needed a FIFO with four pointers in it: One to reference the request that was received from AXI. This included any expanded burst request. The second pointer was to WB requests that had been issued. The third to WB ACK's received (and possibly data), and the fourth kept track of responses given. | 17:59 |
ZipCPU | I'm not looking forward to debugging the timing errors associated with reading and writing from/to those FIFOs. | 18:00 |
kc5tja | You need FIFOs with pipelined WB too. | 18:00 |
kc5tja | The master must store a transaction in a FIFO so that it knows what transaction the next ACK applies to. | 18:00 |
kc5tja | Are you targeting AXI3 or AXI4? | 18:00 |
ZipCPU | I mean, consider this, what if you receive a request for a 256-long burst. If your FIFO is 16 long, it will be a while before you can accept any more requests, so you'll need to drop the READY lines on the input channels. | 18:00 |
ZipCPU | (AXI4) | 18:00 |
ZipCPU | Hmm ... yeah, I do have other bus FIFOs wandering around, but I have tried to keep them out of the interconnect. | 18:02 |
ZipCPU | In general, only the source master is maintaining any FIFOs. | 18:02 |
ZipCPU | And ... not all of my source masters maintain FIFOs. | 18:03 |
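The four-pointer FIFO and the READY backpressure problem described in the last few exchanges can be modeled in a few lines. This is a hedged sketch, not ZipCPU's actual bridge: the class and pointer names are invented here, and only the four roles of the pointers (request received, WB request issued, WB ACK returned, response delivered) and the rule of dropping READY when a burst cannot fit come from the discussion.

```python
class BurstFifo:
    """Toy model of a four-pointer FIFO inside an AXI4-to-WB bridge."""

    def __init__(self, depth=16):
        self.depth    = depth
        self.received = 0   # beats accepted from AXI (bursts expanded)
        self.issued   = 0   # Wishbone requests placed on the bus
        self.acked    = 0   # Wishbone ACKs (and data) returned
        self.replied  = 0   # responses handed back to the AXI master

    def space(self):
        # Entries free between the newest accepted beat and the oldest
        # beat whose response has not yet been delivered.
        return self.depth - (self.received - self.replied)

    def ready(self, burst_len):
        # Drop READY on the input channel when a whole burst cannot be
        # absorbed -- e.g. a 256-beat burst against a 16-deep FIFO.
        return burst_len <= self.space()
```

This makes the 256-vs-16 example above concrete: the bridge must deassert READY and drain responses before it can accept the burst.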
kc5tja | When the Kestrel advances to the point of needing caches, the CPU suddenly finds itself in a different clock domain from the rest of the system. FIFOs are compulsory for that case, which is why my cached version of the CPU will use B4 not B3. :) | 18:03 |
gnawzie | hello | 22:19 |
ZipCPU|Laptop | Hello gnawzie! | 22:19 |
gnawzie | What's up? | 22:19 |
ZipCPU|Laptop | It's late at night here, and I'm going to lose it soon. | 22:20 |
gnawzie | lol | 22:20 |
ZipCPU|Laptop | How about you? You must be just getting up for the day? | 22:20 |
gnawzie | No it's about 6:20pm now | 22:20 |
gnawzie | I'm thinking of using openrisc for a drone flight controller | 22:20 |
ZipCPU|Laptop | Ahh ... PST. EST here. | 22:20 |
ZipCPU|Laptop | Wait ... 6:20pm isn't PST, ... that's not in the US at all is it? | 22:21 |
gnawzie | its alaska | 22:22 |
ZipCPU|Laptop | Ok, that makes sense. | 22:24 |
ZipCPU|Laptop | Just saw your question on ##fpga. "Is OpenRISC any good"? | 22:24 |
gnawzie | haha | 22:24 |
ZipCPU|Laptop | This is the forum to chat with folks who use it. Only problem is, most of those folks are on European time. | 22:24 |
ZipCPU|Laptop | We'll see them in the morning. | 22:24 |
gnawzie | okay | 22:24 |
ZipCPU|Laptop | What are you looking for? | 22:25 |
gnawzie | just a bit overwhelmed by the complexity | 22:25 |
ZipCPU|Laptop | Yeah ... tell me about it. I understand completely! | 22:26 |
ZipCPU|Laptop | What level are you looking to understand OpenRISC at? | 22:26 |
ZipCPU|Laptop | User/application level? Bare metal? Verilog? Compile/tools level? | 22:26 |
ZipCPU|Laptop | I ask because the complexity at each level is different. Some are easier than others. | 22:27 |
gnawzie | I really don't know the structure of those; I have no direction lol | 22:28 |
ZipCPU|Laptop | Well, okay, let's try this a different way: what do you want to accomplish? | 22:29 |
ZipCPU|Laptop | In many ways, OpenRISC is a tool. Which tool you use depends upon the task at hand. | 22:30 |
gnawzie | Okay... that makes sense then. It's not a core you can simply instantiate and use | 22:32 |
ZipCPU|Laptop | Well ... I might disagree with that. Many people have just instantiated it and used it. | 22:32 |
ZipCPU|Laptop | Placing it into your own design will require building a bus with peripherals on it, connecting the core to that bus, and perhaps even a debug core to that bus. | 22:33 |
ZipCPU|Laptop | Then you would load instructions into the CPU, and off you go! | 22:33 |
ZipCPU|Laptop | Ok, back to that load instructions part ... that will require building the toolchain, compiler, etc., and then using it to build your program. All quite doable. | 22:34 |
ZipCPU|Laptop | While the core will run Linux, I think you'll find that many of the folks on this forum aren't running Linux on it. | 22:34 |
gnawzie | yeah | 22:38 |
ZipCPU|Laptop | The other thing to know is that many of the folks on this forum have made those tasks REALLY easy. | 22:39 |
ZipCPU|Laptop | However, I need to head to bed for the night. Feel free to ask further questions, but do please stick around for the answers. | 22:40 |
ZipCPU|Laptop | You might not notice them till sometime tomorrow morning. | 22:41 |
--- Log closed Fri Dec 16 00:00:29 2016 |
Generated by irclog2html.py 2.15.2 by Marius Gedminas - find it at mg.pov.lt!