--- Log opened Thu Aug 11 00:00:20 2016 | ||
olofk_ | blueCmd_: Happy birthday :) | 08:05 |
---|---|---|
olofk_ | shorne: Ah ok. That makes sense | 08:06 |
blueCmd_ | olofk_: thanks :D | 09:09 |
mafm | congrats blueCmd_ ! | 09:22 |
mafm | btw, blueCmd_, you appear in the "special thanks" section of my talk in the risc-v conf :P | 09:23 |
-!- Netsplit *.net <-> *.split quits: Xueming, bandvig, _franck__ | 09:26 | |
blueCmd_ | mafm: I did see that! olofk_ tweeted that I was famous - thanks a lot man, made my day :D | 09:36 |
-!- blueCmd_ is now known as blueCmd | 09:37 | |
mafm | blueCmd: well, all I can say is... credit where credit is due :D | 09:41 |
blueCmd | thanks :P | 09:55 |
-!- Netsplit over, joins: bandvig | 10:00 | |
mafm | blueCmd: I expect you to join me in 2038 or so, when gcc support for risc-v is upstreamed :) | 10:57 |
blueCmd | haha, will do! | 10:58 |
ZipCPU | olofk: I just got an extended Quad SPI flash controller up and running. You should find the test bench for this core easier to incorporate into FuseSoC--should you so wish. | 18:48 |
ZipCPU | You'll find the (new) code on OpenCores, in the qspiflash project. | 18:48 |
andrzejr | ZipCPU, what HW do you use for testing (board, QSPI slave)? | 18:49 |
ZipCPU | I've been testing this new controller on the Arty platform. | 18:49 |
ZipCPU | The older controller, also in the same directory, has been tested on Digilent's Basys-3 and CMod-S6 platforms. | 18:50 |
ZipCPU | You can actually find the hardware files, both Verilog and C++, that I've been using in the openarty project on OpenCores. | 18:50 |
andrzejr | wow, that's a lot of good stuff | 18:54 |
ZipCPU | Anything you are looking for in there? | 18:54 |
andrzejr | I see you have WB<->QSPI Flash bridge, nice touch. | 18:54 |
ZipCPU | It was actually necessary on the old controller to integrate into the CMod-S6. The WB-QSPI bridge allowed the ZipCPU I was running to run from flash memory just as it would any other memory save that flash was slow. | 18:55 |
andrzejr | In the past I was working on nexys4 board. It uses a QSPI flash as well. | 18:55 |
ZipCPU | I think the older controller should work with the Nexys-4 board. Almost all of those flash chips are nearly the same. | 18:56 |
ZipCPU | The newer controller ... took some more work. It had a different set of registers, a bit of a different interface, and I wanted to run the flash at 100MHz instead of 50MHz. | 18:56 |
andrzejr | Is GPL a deliberate decision? | 18:58 |
ZipCPU | Yes. | 18:58 |
ZipCPU | With an appropriate incentive I could make an individual incentive ... ;) | 18:58 |
ZipCPU | individual exception ... (individual incentive ... gosh--sometimes I don't believe what I type!) | 19:00 |
andrzejr | I'm not against GPL in general, but it just doesn't work for HW. That is, there is no way to legally make an open HW. The main culprit is the definition of "system libraries". | 19:14 |
andrzejr | Did you write your own DDR3 controller? | 19:18 |
kc5tja | ZipCPU: The Quad-SPI controller, does it allow you to expose an SPI flash chip to the host CPU as if it were a parallel flash chip? | 19:37 |
kc5tja | (forgive me if you already mentioned this; scrollback is not easy for me to read at the moment) | 19:37 |
kc5tja | n/m | 19:38 |
kc5tja | I see that you already answered my question. :) | 19:38 |
ZipCPU | kc5tja: Sorry --- stepped away for supper. Yes, you can run programs directly from the Quad SPI flash controller if you wish. Sure, it's slow, but you can do it. | 19:42 |
kc5tja | Nice. | 19:43 |
ZipCPU | andrzejr: The DDR3 controller is (sadly) a work in progress. I <thought> I was almost there: I had a simulation built, a simulator running, fully operational test benches, etc. | 19:43 |
ZipCPU | Then I discovered I made a *big* mistake in understanding the spec, so ... now I get to start that over from scratch. | 19:43 |
kc5tja | I'd have to ask for a license exemption though for Kestrel project (MPLv2 instead of GPL). MPLv2 is compatible with GPL, but the reverse relationship doesn't hold. | 19:45 |
andrzejr | How do you deal with SerDeses and phase interpolators? Do you use ones from Xilinx (encrypted)? | 19:45 |
ZipCPU | kc5tja: Let's discuss it offline--you can e-mail me at dgisselq (at) opencores.org. | 19:46 |
kc5tja | ZipCPU: Way too early yet. I'll chat w/ you when the time is right. | 19:46 |
kc5tja | Won't be until next year at least. :( | 19:46 |
ZipCPU | andrzejr: For the DDR3 controller, I am going to need to use xSERDES elements. I thought I was going to be able to get away with xDDR elements only, and was ... corrected along the way. | 19:47 |
ZipCPU | kc5tja: Under the current license, you should be able to at least make your Kestrel computer, even with the Quad-SPI flash controller, and then know if it works. | 19:48 |
ZipCPU | It's when things come to distribution that things get ... interesting. | 19:48 |
kc5tja | Right. | 19:48 |
kc5tja | OK, I'm giving up on implementing a native RISC-V CPU for now. The debugging complexity is just intractable. | 21:59 |
ZipCPU | Really? | 21:59 |
* kc5tja goes back to using a Forth CPU, and writing a RISC-V software emulator instead. | 21:59 | |
kc5tja | It's well and truly beyond my skills. | 21:59 |
ZipCPU | Is this specific to RISC-V? | 21:59 |
kc5tja | I've tried four different times, each attempting a different approach to tackle the problem. | 22:00 |
kc5tja | No, to RISC in general. It's just too complex for me to wrap my head around. | 22:00 |
ZipCPU | How are you trying to handle debugging? | 22:01 |
kc5tja | I'd rather see a working computer, even if it utterly under-performs, than show up to the RISC-V workshop empty-handed a third time. :( | 22:01 |
kc5tja | Unit tests. | 22:01 |
kc5tja | Test benches. | 22:01 |
ZipCPU | But ... you've done CPU's before. How and why are RISC CPU's different? | 22:02 |
kc5tja | I've done MISC cores before. | 22:02 |
kc5tja | Those are 300 to 600 lines of Verilog for the _complete_ core. | 22:02 |
kc5tja | RISC at a minimum takes up several thousand lines of Verilog, no part of which is easily modularizable. | 22:02 |
kc5tja | That's why I'm going to go back to using a Forth core. | 22:02 |
kc5tja | This is something I know I can get working. | 22:03 |
ZipCPU | Hmm ... I like to think of my own core as being a RISC core, and I was able to separate some of the parts into modules--certainly not all. | 22:03 |
ZipCPU | The memory, divide, ALU, instruction decoder, and prefetch all have separate cores from the main core. | 22:03 |
ZipCPU | The main core, though, is still ... pretty ugly. | 22:04 |
ZipCPU | (Ugly = 1.8k lines of code) | 22:04 |
ZipCPU | But ... hold on, let me understand something ... is this because of the large number of instructions the RISC-V CPU supports? | 22:04 |
ZipCPU | Have you seen my CPU ISA complexity comparison? | 22:06 |
ZipCPU | kc5tja: You might find http://opencores.org/websvn,filedetails?repname=zipcpu&path=%2Fzipcpu%2Ftrunk%2Fdoc%2Forconf.pdf of value. | 22:14 |
kc5tja | The # of insns isn't that much; only 56 or so for my CPU design. | 22:14 |
ZipCPU | I put it together to try to capture the ... complexity of the various CPU ISA's out there. | 22:14 |
kc5tja | And from an instruction decode POV, I only have 24ish minterms, so that's not much. | 22:14 |
kc5tja | But, for some reason, I'm running into a situation where I deliberately feed it a malformed instruction, and the "undefined" flag isn't being set. | 22:15 |
kc5tja | I'm like, if I can't even get THAT working, there's no hope. | 22:15 |
ZipCPU | How many bits does it take to determine that an instruction is malformed or undefined? | 22:15 |
kc5tja | Ultimately, all 32, because CSRRx instructions throw an exception if you give an unsupported CSR register. | 22:16 |
kc5tja | But, typically, 9 to 12 bits. | 22:16 |
ZipCPU | Ouch! | 22:16 |
kc5tja | Yeah. | 22:17 |
kc5tja | The instruction encoding is very obviously optimized for ASIC layout, and not at all for FPGAs. | 22:17 |
ZipCPU | This is absolutely fascinating ... it sounds like your experience is starting to confirm one of my "theses" regarding the ZipCPU. | 22:18 |
kc5tja | So, right now, I'm planning on just implementing a wide Forth CPU, using block RAMs as a ROM to hold the RISC-V emulator, so external modules never know the difference. | 22:18 |
ZipCPU | Thesis: modern FPGA-based soft-core CPU's have instruction sets that are *way* too complex. | 22:19 |
kc5tja | Once THAT is working, which I HOPE I can repeat a success on, then I hope to implement RISC-V instruction decode-to-stack-ops approach, like x86 pre-decodes to RISC-ops. | 22:19 |
ZipCPU | You mean, like Just-In-Time hardware compilation? | 22:20 |
kc5tja | That should let me get back a lot of the lost performance. | 22:20 |
kc5tja | Yep | 22:20 |
ZipCPU | And you think that will be ... <gasp> simpler?? | 22:20 |
kc5tja | All RISC-V instructions have a corresponding (and static) equivalent Forth code sequence. | 22:20 |
kc5tja | No. | 22:20 |
kc5tja | It will be more complex overall. | 22:21 |
ZipCPU | Hmm ... quick question: how many lines of code are you working with in your (now) failed attempt at RISC-V? | 22:21 |
kc5tja | But if I can get a *success* on the Forth CPU idea (that will be simpler for sure), the instruction predecoder should not be too much harder. | 22:21 |
kc5tja | Let me check. | 22:21 |
kc5tja | Disclaimer: I have modules written but they're not even wired up yet. | 22:22 |
ZipCPU | (Understood) | 22:22 |
ZipCPU | SLOC (Software Lines of Code) measures are like MIPS (Meaningless Indicators of Processor Performance). They're both easy to measure, both feel good to report, and both are probably just about as meaningless--but I'm asking anyway :) | 22:24 |
kc5tja | 261 for instruction decode, ALU, and GPRs combined, but 140 for instruction decode. | 22:24 |
kc5tja | ALU and GPRs work -- those I've tested separately. | 22:24 |
kc5tja | So the bug is _somewhere_ in instruction decode. | 22:24 |
ZipCPU | How many lines of code determine what constitutes "undefined"? | 22:25 |
kc5tja | Almost all of them when you expand the logic equations. :) | 22:25 |
ZipCPU | Yikes! That bad? | 22:26 |
kc5tja | Let me just point you to the source. | 22:26 |
ZipCPU | Do you have it posted somewhere? | 22:26 |
kc5tja | https://github.com/sam-falvo/polaris/blob/V3/rtl/verilog/polaris.v | 22:29 |
kc5tja | line 126 is where undefined_o is assigned. | 22:30 |
ZipCPU | This's gotta be harder than it looks: do you know what values each of those components has taken on? i.e., which component thinks you have a valid instruction? | 22:32 |
ZipCPU | Ooh ... lines 26-54 ... painful! | 22:32 |
ZipCPU | What are you using for a simulator? Does it give you insight into what is going on within the core? | 22:34 |
kc5tja | Icarus Verilog | 22:36 |
ZipCPU | Hmmm ... am I being spoiled by using Verilator, and getting full insight into all of the wires and registers internal to my core? | 22:39 |
ZipCPU | kc5tja: Gonna need to chat later, my eyes aren't staying open. | 22:40 |
kc5tja | I could use something like GtkWave, but with logic as simple as that (and, while voluminous, it IS simple logic), I shouldn't need to. | 22:41 |
kc5tja | ZipCPU: 'night! | 22:42 |
--- Log closed Fri Aug 12 00:00:21 2016 |
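
Note on the "undefined instruction" discussion above: the point that legality can depend on fields scattered across the whole 32-bit word can be illustrated with a small software model. This is only a sketch, not the logic in polaris.v. It assumes the standard RV32I field positions (opcode in bits 6:0, funct3 in bits 14:12, funct7 in bits 31:25, CSR address in bits 31:20) and covers only three opcode groups. The function name `is_defined` and the default `supported_csrs` set are hypothetical, chosen for illustration.

```python
# Sketch of an "is this instruction defined?" check for a SUBSET of RV32I.
# Assumption: only OP, OP-IMM and the CSRRx forms of SYSTEM are modelled;
# every other opcode is reported as undefined, which a real core would not do.

OP = 0b0110011       # register-register ALU instructions
OP_IMM = 0b0010011   # register-immediate ALU instructions
SYSTEM = 0b1110011   # CSRRx instructions live here

# funct3 values for CSRRW, CSRRS, CSRRC, CSRRWI, CSRRSI, CSRRCI
CSR_FUNCT3 = (0b001, 0b010, 0b011, 0b101, 0b110, 0b111)


def is_defined(insn, supported_csrs=frozenset({0x340})):
    """Return True if the 32-bit word is a defined instruction in this subset.

    supported_csrs is a hypothetical set of implemented CSR addresses;
    0x340 (mscratch) is used purely as an example.
    """
    opcode = insn & 0x7F
    funct3 = (insn >> 12) & 0x7
    funct7 = (insn >> 25) & 0x7F

    if opcode == OP:
        if funct7 == 0b0000000:
            return True                       # ADD, SLL, SLT, ... AND
        # funct7 = 0100000 is only legal for SUB (000) and SRA (101)
        return funct7 == 0b0100000 and funct3 in (0b000, 0b101)

    if opcode == OP_IMM:
        if funct3 == 0b001:                   # SLLI
            return funct7 == 0b0000000
        if funct3 == 0b101:                   # SRLI / SRAI
            return funct7 in (0b0000000, 0b0100000)
        return True                           # ADDI etc. accept any immediate

    if opcode == SYSTEM and funct3 in CSR_FUNCT3:
        csr = (insn >> 20) & 0xFFF            # upper 12 bits select the CSR
        return csr in supported_csrs

    return False
```

With this model, 0x003100B3 (add x1, x2, x3) is accepted, while 0x403110B3 (the SLL funct3 paired with the SUB/SRA funct7) is rejected. A CSRRW to an address outside `supported_csrs` is rejected too, which is the case where the upper bits of the word also matter.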