--- Log opened Mon Oct 20 00:00:45 2014 | ||
olofk | stekern: All wb_ports in wb_sdram_ctrl have their own cache, and they are coherent between all ports, right? | 07:05 |
---|---|---|
stekern | olofk: that's the intent, yes | 07:11 |
stekern | olofk: I played my first couple of hours of broken age yesterday btw ;) | 07:12 |
olofk | stekern: Woohoo!! Was it fun? :) | 07:12 |
stekern | yes, a bit on the easy side perhaps | 07:12 |
stekern | and I can't say I'm super fond of the graphic style | 07:13 |
olofk | Yes. It wasn't that hard, but amusing. I like the fox :) | 07:13 |
stekern | isn't it a wolf? | 07:13 |
stekern | our daughter enjoyed watching while I played though | 07:13 |
olofk | ah ok. Was almost a year since I played it | 07:14 |
olofk | I bought these on humble bundle some weeks ago. Much more enjoyable graphics http://www.wadjeteyegames.com/games/blackwell-legacy/ | 07:14 |
stekern | yeah, that sounds right, think it was about a year ago that I bought it ;) | 07:15 |
olofk | About the sdram, I've been thinking a lot about wide wb buses to increase bandwidth with many masters | 07:15 |
olofk | And that got me wondering if it ever happens that one port caches something that is used by another port | 07:16 |
olofk | I mean, in the common case we only have data and instruction buses to the RAM, and they would never get stuff from the same memory range, right? | 07:17 |
stekern | umm, and then you've got DMA accesses | 07:17 |
olofk | Yes. With DMA I guess that the caches get used more | 07:18 |
stekern | (blackwell) yeah, that seems to have a retro touch to it ;) gotta buy that one too | 07:20 |
stekern | but TOMI had modern graphics that I liked | 07:22 |
olofk | Ah.. never played that. I still haven't completely accepted COMI though :) | 07:24 |
stekern | COMI was good, EMI was a disaster | 07:25 |
stekern | I've completely missed this too: http://en.wikipedia.org/wiki/Back_to_the_Future:_The_Game | 07:26 |
olofk | Me too | 07:27 |
stekern | it sucks that telltale doesn't do linux versions though | 07:28 |
olofk | ah.. they're the ones who did the Sam'n'Max shorties too | 07:29 |
stekern | yup | 07:29 |
olofk | Regarding the SDRAM again. My grand plan was to split out the wb_port arbitration stuff to a separate system cache component, add proper arbitrary wishbone resize blocks and expose a single DQ_WIDTH*BURST_LENGTH wishbone port from the memory controller | 07:34 |
olofk | Benefits: 1) System cache can be used for all memory controllers. 2) If we add support in mor1kx, we can pull in a full cache line in one transaction | 07:35 |
stekern | 1) sounds good | 07:39 |
stekern | 2) is problematic, since there's no portable way to create block rams with separate data ports | 07:40 |
stekern | sizes | 07:40 |
olofk | (2) Where is that a problem? In mor1kx or the system cache? | 07:45 |
stekern | in mor1kx | 07:46 |
olofk | ah ok | 07:46 |
olofk | Currently trying to raise Fmax for mor1kx. Found critical path from dltb_match_regs to pc_fetch. Disabled dmmu and got 7MHz lower Fmax. What? | 07:47 |
stekern | that's probably due to the phase of the moon | 07:48 |
olofk | ahh.. or because the generate statement says if (FEATURE_DATACACHE!="NONE") | 07:50 |
olofk | Never really disabled it | 07:50 |
stekern | yes, it's confusing... | 07:50 |
stekern | it's on my todo list to document all the parameters in the top module with the valid values they can obtain | 07:52 |
olofk | In theory it could be handled with a common function that returns 1 for "ENABLED", "TRUE", and 0 for "DISABLED", "NONE" and some more values. Problem there is that Icarus still doesn't support static functions :( | 07:53 |
stekern | I'm also working on putting together 'templates' with parameter setups for different purposes | 07:54 |
stekern | this is a minimal baremetal setup: http://pastie.org/9662421 | 07:54 |
olofk | stekern: I've been considering the same thing. My approach however is to have a config file that spits out a verilog wrapper | 07:55 |
olofk | That could also instantiate the debug interface and hook it up depending on if the debug unit parameter is set | 07:57 |
olofk | And other things | 07:57 |
olofk | Disabling ?MMU gives me a critical path in the arbiter. Feels much better to know it's my fault for it being slow :) | 07:58 |
stekern | ;) | 07:59 |
olofk | stekern: I slightly improved the timing of wb_arbiter | 09:03 |
stekern | by? | 09:10 |
olofk | https://github.com/bmartini/verilog-arbiter/pull/2 | 09:10 |
olofk | Precalculating the index inside of verilog-arbiter | 09:11 |
stekern | does it make more sense to do it inside the arbiter instead of outside it? | 09:13 |
olofk | Hmm.. no | 09:13 |
stekern | I mean, it seems to me that all the signals that would be needed to do it outside of it are already exposed | 09:13 |
olofk | This is wrong | 09:13 |
olofk | crap | 09:13 |
stekern | heh ;) | 09:14 |
olofk | The idea is right, but I made a mistake in the implementation | 09:14 |
olofk | It should be sel <= ff1(token & request); | 09:15 |
olofk | That we can only do inside the module | 09:15 |
stekern | ah, that makes a lot more sense | 09:15 |
olofk | Can I just amend the patch and force push it to update the pull request? | 09:16 |
olofk | Yeah, that worked | 09:18 |
stekern | you do remember that $clog2 is broken in ISE? | 09:19 |
olofk | Yes, but I ignored it. Maybe it's too early to ignore it | 09:22 |
stekern | does vivado evaluate it correctly? | 09:24 |
olofk | ISE > 13.2 IIRC | 09:42 |
olofk | At least >=14.x | 09:43 |
stekern | ah, ok. I thought it still was broken | 09:49 |
olofk | Sometime in 2047 when I get the versioning support going in FuseSoC I intend to make it possible to require certain versions of the tools as well | 09:50 |
olofk | We have workarounds for at least some versions of Icarus, Verilator and ISE right now | 09:51 |
hansfbaier | What do you guys think about Risc-V means for OpenRisc? It looks very attractive to me. | 09:56 |
hansfbaier | But the toolchain does not build on Ubuntu stable. | 09:56 |
hansfbaier | s/about/about what/ | 09:56 |
hansfbaier | (Read the orconf slides) | 09:56 |
hansfbaier | That Rock processor performance is very impressive | 09:57 |
hansfbaier | (Benchmark against ARM A) | 09:57 |
hansfbaier | A5 IIRC | 09:57 |
stekern | the DMIPS or the area or what performance? | 10:04 |
olofk | hansfbaier: The impression right now is that it looks like a well thought out 64-bit architecture, but it's currently way too big and slow to run on most FPGAs | 10:23 |
olofk | Does anyone have a wb register stage that I can use? | 10:28 |
stekern | what do you need that for? | 10:30 |
olofk | Getting a critical path on the slave data from adv_debug_sys | 10:32 |
olofk | There's something that resembles an endian converter in there that seems to use a lot of logic | 10:33 |
olofk | So I only really need to register the return s2m data path | 10:34 |
hansfbaier | stekern: it seemed pretty impressive in area as well as performance | 10:36 |
hansfbaier | olofk: Wasn't aware of that... Looked pretty good for the ASIC (also speed wise). | 10:38 |
hansfbaier | olofk: How many LUTs would that thing need? | 10:38 |
hansfbaier | olofk: or cells rather | 10:39 |
olofk | hansfbaier: Don't remember. I know that a few others have done FPGA synthesis, and IIRC it was about 10 times bigger than mor1kx | 10:39 |
hansfbaier | olofk: wow, that's quite a bit | 10:39 |
hansfbaier | olofk: Would be nice if OpenRISC had kind of a 'thumb' mode. That seems to be the real killer feature in the embedded world. | 10:40 |
hansfbaier | s/the/one/ | 10:41 |
olofk | Yes. That together with removal of delay slots and a few other things were proposed for or2k, the OpenRISC 1000 successor | 10:42 |
hansfbaier | Ah | 10:42 |
hansfbaier | olofk: Risc-V runs on the Zynq 7020 with 85 kCells | 10:46 |
hansfbaier | But that thing is EXPENSIVE | 10:47 |
hansfbaier | It probably would run on the SocKit too | 10:47 |
olofk | I got a 7020 on my Parallella | 10:47 |
hansfbaier | Yes that should do | 10:48 |
olofk | stekern: Regarding moving the port stuff from wb_sdram_ctrl, I would probably want to move the CDC closer to the pads, and do a lot more in the wb_clk domain. Given that it's usually slower than sdram_clk, do you think we'll lose any performance? | 10:49 |
stekern | hansfbaier: mor1kx does get about the same DMIPS number | 10:54 |
stekern | olofk: ah, wb as in wishbone? | 10:57 |
olofk | Yes | 11:05 |
hansfbaier | stekern: Ah thanks | 11:10 |
hansfbaier | great | 11:10 |
olofk | wallento: I see some action in newlib. Nice! | 13:22 |
wallento | yes, I try to wrap stuff up now | 13:23 |
wallento | and: I did continuous integration, i.e., automatic builds | 13:23 |
wallento | https://lis.ei.tum.de/jenkins/job/build_newlib_toolchain/ | 13:23 |
wallento | available at: http://lis.ei.tum.de/pub-download/openrisc-builds/unstable/ | 13:23 |
wallento | and put some info together (for the landing page then): http://wallento.github.io/or1k-newlib/ | 13:24 |
olofk | Great that you have split out the gdb instructions | 13:25 |
wallento | I suppose that's the only thing left in or1k-src second step, correct? | 13:26 |
olofk | Should be | 13:26 |
wallento | I will also put up some other continuous stuff (uclibc, musl, some hardware stuff?) | 13:27 |
wallento | if you have anything that's a candidate, I can give you proxy access to our jenkins instance (with a few computers as slaves), I just don't want to expose this to the outside world, therefore the public jenkins just mirrors at the moment | 13:28 |
olofk | Regression testing mor1kx would be nice for example. Could be done with verilator against or1k-tests. Unfortunately they don't all pass right now | 13:29 |
wallento | yes, but I can already add it if you want to, are there any instructions? | 13:30 |
olofk | Nothing automatic right now, but it's basically fusesoc sim --build-only --force --sim=verilator mor1kx-generic && for i in $tests; do fusesoc sim --sim=verilator mor1kx-generic --elf-load=$i; done | 13:31 |
wallento | okay, nice, I will give it a try | 13:32 |
olofk | Me tooI just don't want to expose this to the outside | 13:33 |
olofk | world, therefore the public jenkins just mirrors at the moment | 13:33 |
olofk | 13:29 < olofk> Regression testing mor1kx would be nice for example. Could be | 13:33 |
olofk | done with verilator against or1k-tests. Unfortunately they don't | 13:33 |
olofk | Sorry. Baby at the keyboard. Left the computer unattended for ~1 minute | 13:34 |
stekern | she seems pretty competent at copy-pasting | 13:41 |
wallento | would qualify for a PhD already ;) | 13:42 |
stekern | lol | 13:42 |
wallento | what is the last state of spr-defs.h? there was some discussion around this I remember | 13:43 |
wallento | do we have an automatically generated one under BSD? | 13:43 |
stekern | i think pgavin did something, but I don't know if he ever published it | 13:45 |
olofk | You can use the shareware version of my C#-implementation of spr-defs | 14:04 |
olofk | But seriously. It's just the spec written down as a .h file. Must be regarded extremely trivial | 14:05 |
olofk | I can see spr-defs.h in or1ksim, or1ktrace, orpmon, orpsocv2, adv_debug_sys, misoc, barebox, newlib and sim | 14:15 |
olofk | Probably a few more places as well | 14:15 |
olofk | Ah ok. It's called spr_defs.h in linux | 14:16 |
wallento | mmh, is any of those BSD? | 14:24 |
wallento | ;) | 14:24 |
wallento | I think it's trivial, but it has a GPL header | 14:24 |
wallento | with damjan and jeremy | 14:24 |
olofk | stekern: Am I just making this up, or did you add an option to disable the npc spr to improve timing? | 18:19 |
stekern | olofk: I did some tests with pulling out npc spr when the debug unit is disabled | 18:20 |
stekern | so I might have mentioned it | 18:20 |
olofk | ah ok. We still need it for debugging. Then it's not an option anyway | 18:20 |
stekern | but it was mostly for area, not speed | 18:20 |
stekern | are you seeing some critical paths through it? | 18:21 |
olofk | One of the most critical paths is from spr_sr[5] to pc_fetch[31], so I took a wild guess | 18:21 |
stekern | sr != npc | 18:22 |
olofk | ah right. | 18:23 |
olofk | supervision register | 18:23 |
stekern | I have never seen that path | 18:24 |
olofk | Have you synthesized during full moon? | 18:24 |
stekern | you can easily register spr_sr[5], but I bet it's more interesting what's in between | 18:25 |
stekern | which is a lot... | 18:25 |
olofk | Not very handy with Quartus sta tool yet. How can I get it to show me the whole path? | 18:25 |
stekern | depends on the target... | 18:25 |
stekern | I actually tend to use the quartus gui when looking at critical paths | 18:26 |
stekern | it'll give you the failing paths and then you can open them in ta | 18:27 |
olofk | Then I need to constrain harder to make it fail :) | 18:27 |
stekern | ah, ok | 18:27 |
stekern | well, where do you see the path? | 18:28 |
olofk | In .sta.rpt | 18:28 |
olofk | Slow 1200mV 85C Model Setup: 'wb_clk' | 18:28 |
stekern | I'm trying to get vice to start | 18:31 |
stekern | ...without huge success | 18:35 |
stekern | ah, now it starts to work... but it's sloooow ;) | 18:57 |
olofk | vice? | 19:35 |
stekern | http://1drv.ms/ZDcTDF | 19:47 |
olofk | haha | 19:47 |
olofk | stekern: What's preferred, B3_READ_BURSTING or B3_REGISTERED_FEEDBACK? | 19:50 |
stekern | this starts too at least: http://1drv.ms/ZDdntv | 19:51 |
stekern | B3_REGISTERED_FEEDBACK | 19:52 |
stekern | B3_READ_BURSTING only makes sense for espresso | 19:52 |
olofk | cool | 19:52 |
olofk | Looks like the wishbone clock just needed to be pushed a little. Constraining it to 50MHz gives me Fmax ~70MHz. Setting it to 75MHz gives me 88MHz | 20:08 |
olofk | Interesting. 100MHz seems to work too | 20:13 |
olofk | (without DMMU, IMMU and store buffer | 20:14 |
olofk | ) | 20:14 |
stekern | you should be able to go to 100 MHz with the store buffer too | 20:19 |
olofk | That's good | 20:20 |
olofk | Running at 100MHz gives me the benefit of avoiding a CDC to the SDRAM | 20:20 |
olofk | ah wait. I still need a phase shifted sdram clock, right? | 20:21 |
olofk | Or why would I need that? | 20:21 |
stekern | not sure | 20:21 |
stekern | I just finished broken age | 20:21 |
olofk | Cool. Did you like it? | 20:22 |
stekern | yeah, when is act 2 coming? | 20:22 |
olofk | I just got an update yesterday about the next episode | 20:22 |
olofk | They are hoping to get it out this year at least. Seemed like they were close to an alpha release | 20:22 |
olofk | Is mor1kx_rf_cappuccino|ctrl_hazard_a to mor1kx_execute_ctrl_cappuccino|ctrl_alu_result_o[12] a sensible critical path btw? | 20:23 |
stekern | yes | 20:23 |
olofk | Good | 20:24 |
olofk | Oh well. Time for some Blackwell legacy and sleep now | 20:24 |
olofk | But first reenable the store buffer | 20:24 |
olofk | 98.94MHz. Close but no cigar | 20:29 |
olofk | ctrl_lsu_adr_o[19] to the icache tag ram | 20:31 |
olofk | One more try with smaller icache. | 20:32 |
stekern | I bet that goes through intercon? | 20:33 |
stekern | blackwell legacy purchased and installed | 20:34 |
olofk | :) | 20:35 |
olofk | Smaller cache did it! Now I can sleep | 20:36 |
--- Log closed Tue Oct 21 00:00:46 2014 |
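
The 07:53 idea of one common helper that maps feature strings to 1 or 0 can be sketched as below. This is a Python model rather than Verilog, and not code from mor1kx: the name `feature_enabled` is hypothetical, only the four strings named in the log are handled, and raising an error on anything else is an assumption.

```python
# Hypothetical helper modelling the "common function" discussed at 07:53.
# Only the strings mentioned in the log are recognised; the log says there
# would be "some more values", which are deliberately not guessed here.
_ENABLED_VALUES = {"ENABLED", "TRUE"}
_DISABLED_VALUES = {"DISABLED", "NONE"}


def feature_enabled(value: str) -> int:
    """Return 1 for an enabling string, 0 for a disabling one."""
    if value in _ENABLED_VALUES:
        return 1
    if value in _DISABLED_VALUES:
        return 0
    # Assumption: unknown strings are rejected instead of silently
    # being treated as enabled or disabled.
    raise ValueError(f"unrecognised feature value: {value!r}")
```

With such a helper, a check like `FEATURE_DATACACHE!="NONE"` would not quietly treat "DISABLED" as enabled, which is the confusion described at 07:50.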
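
The corrected expression from 09:15, `sel <= ff1(token & request)`, can be modelled in a few lines. This is a Python sketch for illustration, not the code in verilog-arbiter: it assumes `ff1` means "index of the lowest set bit", and returning 0 when no bit is set is an assumption as well. The names `next_select` and `width` are hypothetical.

```python
def ff1(bits: int, width: int) -> int:
    """Index of the lowest set bit in the low `width` bits of `bits`.

    Assumption: returns 0 when no bit is set.
    """
    for i in range(width):
        if (bits >> i) & 1:
            return i
    return 0


def next_select(token: int, request: int, width: int) -> int:
    """Model of `sel <= ff1(token & request)`.

    `token` and `request` are bit vectors, one bit per port. Masking the
    requests with the token first is what needs the token, which is why
    olofk says this can only be done inside the module.
    """
    return ff1(token & request, width)
```

For example, with a token on port 2 and requests on ports 1 and 2, `next_select(0b0100, 0b0110, 4)` picks index 2, whereas taking `ff1` of the raw request vector alone would pick port 1.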