--- Log opened Fri Apr 04 00:00:50 2014
blueCmd | olofk: yes, rebase -i is the way to do that! | 01:32 |
---|---|---|
analognoise | Hi Openrisc aficionados | 02:42 |
stekern | olofk: ($time) yes | 03:54 |
stekern | olofk: sure, show me. or if you fell ready with it, just push it. If I feel that I have something I could add to it, I can post a patch against that. | 03:55 |
stekern | feel | 03:56 |
olofk_ | stekern: Doh... I thought I had pastied it. | 05:40 |
stekern | =) | 05:44 |
stekern | I though you were just a teaser, wanting to build up the tension before you show it ;) | 05:45 |
stekern | +t | 05:45 |
olofk_ | stekern: Isn't there an option to choose rf ram implementation in mor1kx? | 07:39 |
stekern | no, currently there is only ram | 07:41 |
stekern | I've planned to remove the mor1kx_rf_ram and use mor1kx_simple_dpram_sclk instead | 07:43 |
stekern | and then at the same time add options to allow for a reg based rf as well | 07:43 |
stekern | are you running out of blockrams? | 07:44 |
olofk_ | No. Just playing with the critical path for a Virtex-6 | 07:46 |
stekern | ah, right | 07:47 |
olofk_ | But right now I see that the critical path is here | 07:47 |
olofk_ | Source: mor1kx0/mor1kx_cpu/cappuccino.mor1kx_cpu/mor1kx_lsu_cappuccino/mor1kx_store_buffer/read_pointer[1] (FF) Destination: mor1kx0/mor1kx_cpu/cappuccino.mor1kx_cpu/mor1kx_fetch_cappuccino/pc_fetch[31] (FF) | 07:47 |
olofk_ | 16 levels of logic | 07:47 |
olofk_ | Can I get away without the store buffer? | 07:48 |
stekern | ok, but then it should be simple to just add a 'ramstyle' attribute | 07:49 |
olofk_ | Yeah, I did that for the rf ram | 07:49 |
olofk_ | I just thought there was a top-level switch for it as well | 07:49 |
stekern | no, the store buffer isn't optional as it is, and I don't think it goes through the storebuffer, just originates from it, right? | 07:50 |
stekern | what you can do, is to add the 'ramstyle' attribute to that too, and reduce the depth of the fifo | 07:50 |
olofk_ | It uses the full flag | 07:50 |
olofk_ | So ramstyle probably won't help there | 07:51 |
stekern | ah, ok | 07:51 |
stekern | I see | 07:51 |
stekern | but... the statement remains, that path would be there anyway (I think), just that it would originate from something else | 07:52 |
stekern | because, what that does, is that the full flag is connected to the stall signal | 07:53 |
stekern | and without the storebuffer, the stall signal would be asserted by the fact that the lsu is busy | 07:54 |
stekern | and the actual problem is that the stall signal is the critical path | 07:54 |
stekern | at least, that's what I think is happening there | 07:55 |
olofk_ | It uses plenty of logic to generate the full flag, so I'll see if I can register that | 08:00 |
olofk_ | http://pastie.org/private/uzcxalekpnybsyuiqukq | 08:01 |
stekern | ah.. right.. it originated from the read_pointer, not the full flag | 08:02 |
olofk_ | Shouldn't need all that logic. ~3.5ns (over half of my timing budget) to generate the full glag | 08:03 |
stekern | yes, and it should be possible to register it | 08:03 |
stekern | it's *gag* btw | 08:04 |
olofk_ | :) | 08:04 |
stekern | but, then you'll need to make the actual full flag an "almost full" flag | 08:05 |
olofk_ | IIRC there is some trickery you can do to make the full flag registered | 08:08 |
olofk_ | But I think it costs some extra pointer registers | 08:08 |
stekern | hmm, why do you need to use trickery? | 08:10 |
stekern | isn't it just: always @(posedge clk) full <= we & (one_spot_left) | zero_spot_left; | 08:11 |
olofk_ | You need to detect reads as well, don't you? | 08:12 |
stekern | where the "almost full" would be the one_spot_left | 08:12 |
stekern | ah, well.. I was thinking we can afford the one extra cycle of latency there | 08:14 |
stekern | I think, with a fifo depth of about 16 it "never" gets full anyways | 08:15 |
olofk_ | 640 words depth will be enough for everyone | 08:15 |
stekern | haha, yes! | 08:15 |
stekern | that's at least what my empirical testing showed when I was working on the storebuffer | 08:16 |
olofk_ | and yes. What you described was the kind of "trickery" I had in mind. Precalculating an extra pointer | 08:16 |
stekern | ah, ok. Yes | 08:17 |
olofk_ | And I think you're right. An extra cycle of latency when we are almost full and write and read at the same time is probably a good trade-off | 08:17 |
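A minimal Python sketch of the registered full flag discussed above, written as a cycle-level behavioural model rather than the actual mor1kx store buffer RTL. The names `RegisteredFullFifo`, `tick` and `count_stale_full_cycles` are made up for illustration; only the update rule `full <= we & one_spot_left | zero_spot_left` comes from the conversation. Since reads are ignored when the flag is computed, it can stay high for one cycle after space has been freed, which is the extra cycle of latency being traded for a shorter path.

```python
import random


class RegisteredFullFifo:
    """Occupancy-only model of a FIFO whose full flag is a register."""

    def __init__(self, depth):
        self.depth = depth
        self.count = 0       # number of stored entries
        self.full = False    # registered flag, visible one cycle later

    def tick(self, we, re):
        """Advance one clock cycle with write-enable `we` and read-enable `re`."""
        we = bool(we) and not self.full      # the writer stalls on the registered flag
        re = bool(re) and self.count > 0     # cannot read from an empty FIFO
        one_spot_left = self.count == self.depth - 1
        zero_spot_left = self.count == self.depth
        # Rule from the discussion: reads are deliberately not considered.
        next_full = (we and one_spot_left) or zero_spot_left
        self.count += int(we) - int(re)
        self.full = next_full


def count_stale_full_cycles(depth, cycles, seed=0):
    """Run random traffic; return how often full is set although space exists."""
    rng = random.Random(seed)
    fifo = RegisteredFullFifo(depth)
    stale = 0
    for _ in range(cycles):
        fifo.tick(rng.random() < 0.7, rng.random() < 0.5)
        assert fifo.count <= depth                 # never overflows
        assert fifo.full or fifo.count < depth     # flag is never optimistic
        if fifo.full and fifo.count < depth:
            stale += 1                             # the pessimistic extra cycle
    return stale


if __name__ == "__main__":
    print("stale full cycles:", count_stale_full_cycles(depth=4, cycles=10000))
```

The two assertions inside the loop capture the property that matters: the flag may be pessimistic for a cycle, but it is never low while the FIFO is actually full.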
olofk_ | Yep. That killed the worst path. Unfortunately I got slightly worse timing at some other place instead :) | 08:30 |
stekern | that's the voodoo and black magic talking | 08:31 |
olofk_ | Yep | 08:32 |
olofk_ | How critical is the picsr signal? | 08:32 |
olofk_ | Is that external interrupts that we can afford to delay a cycle? | 08:32 |
stekern | I had a fun game with quartus at work last week, the project showed no timing errors, but changing the signals captured by signaltap made the project 'make or break' | 08:33 |
olofk_ | Yeah. That's so annoying. | 08:33 |
olofk_ | Especially when the build time is a few hours | 08:34 |
stekern | should be possible to register picsr without problems | 08:35 |
stekern | luckily this was a small project, with ~20 min build time | 08:35 |
stekern | but where do you get a critical path from that? it's already registered | 08:36 |
olofk_ | http://pastie.org/private/ryzhtuxtzif34ov3iswcg | 08:37 |
olofk_ | picsr is only part of the problem, but I figured it might be harder to register it later on | 08:37 |
stekern | ah | 08:38 |
stekern | https://github.com/openrisc/mor1kx/blob/master/rtl/verilog/mor1kx_ctrl_cappuccino.v#L889 | 08:38 |
stekern | that OR can definitely be registered | 08:38 |
olofk_ | ah c00l. I'll try that then | 08:38 |
olofk_ | Can't run any tests, so I'm just happy coding :) | 08:39 |
olofk_ | hmm.. how do you set a register to a static value? | 08:40 |
stekern | ? | 08:41 |
olofk_ | Can you do "assign r = 0" with registers? | 08:42 |
juliusb | I think it must be within an laways statement | 08:43 |
juliusb | s/laways/always/ | 08:43 |
stekern | no, and old icarus can't even handle: always @(*) r = 0; | 08:43 |
stekern | so you got to do: always @(posedge clk) r <= 0; | 08:44 |
juliusb | !!?? that's a fundamentally different thing. That's describing it as a synchronous not a combinatorial assignment | 08:45 |
juliusb | I presume olof wants a combinatorial thing | 08:45 |
juliusb | or are you meaning for synthesis this is the best way to do it? | 08:45 |
stekern | no, he wants a static value | 08:45 |
stekern | and old versions of icarus can't handle the prior statement, so you have to work around it with the synchronous statement if you'd want to be compatible with those | 08:46 |
stekern | if you don't care about that - always @(*) r = 0; is fine | 08:46 |
juliusb | Fair enough. | 08:47 |
juliusb | I guess older versions of icarus don't support the SV always_comb stuff? | 08:47 |
stekern | is there any kind of SV support in icarus? | 08:48 |
stekern | I doubt any synthesis tool would infer synchronous logic on always @(posedge clk) r <= 0; instead of just tying the signal to '0' too ;) | 08:49 |
stekern | the bug in icarus is related to that there are no changing signals in that statement, so the sensitivity list becomes empty | 08:53 |
stekern | I wonder if: always @(*) if (rst) r = 0; else r = 0; | 08:54 |
stekern | would work as well | 08:55 |
stekern | ... but that's of course not any prettier as a workaround... | 08:57 |
olofk_ | stekern: Is it a known fact that all critical paths end up in pc_fetch? | 10:02 |
olofk_ | The picsr pipelining helped raise fmax from ~128 to ~131MHz btw | 10:02 |
stekern | don't know if it's a known fact, but the mux here: https://github.com/openrisc/mor1kx/blob/master/rtl/verilog/mor1kx_fetch_cappuccino.v#L228 is heavy and addr_valid contains the stall signal (padv_i) | 10:08 |
stekern | so it's not really surprising | 10:08 |
stekern | olofk_: (131) with what parameters is that? | 10:12 |
olofk_ | stekern: All default | 10:14 |
olofk_ | No caches | 10:14 |
stekern | do a test with parameter FEATURE_OVERFLOW= "NONE"; | 10:15 |
olofk_ | Sure | 10:15 |
stekern | if that doesn't change things, try FEATURE_ADDC= "NONE"; | 10:16 |
stekern | iirc one of those is a real bottleneck | 10:16 |
olofk_ | It could be nice to have a few predefined "profiles". Like extremely small, runs linux, high IPC and so on | 10:17 |
stekern | yup, I agree | 10:17 |
olofk_ | I'll bring the numbers when I have them | 10:17 |
olofk_ | 162MHz without changing the parameters you suggested | 10:45 |
olofk_ | I'm building this stand alone in an empty FPGA, so I need to pipeline the paths to and from the pads to get more realistic values | 10:46 |
stekern | ok, so that was with staring at the screen, holding your thumbs and whispering "voodoo and black magic, work in my favour" repeatedly? | 10:46 |
olofk_ | Yes! That and adding an extra pipeline stage between the pad and du_stall_i | 10:47 |
stekern | ah, ok =) | 10:47 |
stekern | the latter probably had less to do with it than the staring and whispering | 10:50 |
olofk_ | Now all my 10 worst paths are from mor1kx_execute_ctrl_cappuccino/ctrl_except_syscall_o or mor1kx_fetch_cappuccino/fetch_exception_taken_o to mor1kx_fetch_cappuccino/pc_fetch | 10:50 |
olofk_ | Do you think OVERFLOW/ADDC will help with those? | 10:51 |
stekern | overflow is connected to the exception | 10:52 |
olofk_ | Ok, at least this is black magic. Setting FEATURE_OVERFLOW = "NONE" lowered fmax to 145MHz | 11:02 |
stekern | heh | 11:04 |
stekern | that sounds like proper black magic | 11:05 |
stekern | maybe you forgot to hold your thumbs this time? | 11:06 |
olofk_ | ah.. yes. I'll rerun and give you more accurate numbers | 11:07 |
stekern | or your stare wasn't fierce enough? | 11:07 |
_franck_web_ | posts in the opencores forum are not lost, they just have a ~2 days latency | 11:31 |
olofk_ | ahh :) | 11:34 |
olofk_ | What the hell... suddenly the RF becomes block ram again | 11:44 |
olofk_ | stekern: Can I expect the caches to become the critical path if I turn them on? | 11:58 |
stekern | probably | 11:58 |
olofk_ | Damn caches. Can't live with 'em. Can't live without 'em | 11:58 |
kyrre_ | Hi, im trying to compile gdb --with-or1ksim=/opt/or1ksim and im adding the path to lsim by LDFLAGS=-L/opt/or1ksim/lib LIBS='-lsim'. This produces an error with undefined ref to "ceil" so i added -lm to LIBS, and since i dont know where the -lm flag is set in the order in the makefile i added "-Wl,--no-as-needed"... still this doesnt help me :/ | 12:03 |
_franck_web_ | kyrre_: you don't need to set LDFLAGS and LIBS, --with-or1ksim=/opt/or1ksim should be enough (and in your case --enable-sim should be enough as you are using the default location for or1ksim) | 12:11 |
_franck_web_ | https://github.com/openrisc/or1k-src/blob/or1k/gdb/configure.ac#L274 | 12:11 |
kyrre_ | ok, thats what i had earlier, but when i tried to compile with the library from or1ksim into a freertos application it didnt recognize the library format. And on the openrisc forums Jeremy Bennett suggested to reinstall the gdb while telling it the location of or1ksim | 12:13 |
_franck_web_ | what do you want to do exactly ? gdb sim target is mostly used while running the testsuite | 12:14 |
stekern | library from or1ksim into a freertod application? | 12:16 |
stekern | s/freertod/freertos | 12:16 |
kyrre_ | im writing an exception handling scheme for FreeRTOS which will use a hardware module that is being added to the or1200.... so i need to write an interrupt routine that can call the exception function. But i should test this interrupt routine on or1ksim first | 12:16 |
kyrre_ | so im trying to get the or1ksim to generate interrupts, but i might have misunderstood something | 12:17 |
stekern | ok, that doesn't answer why you would try to link or1ksim into your freertos application | 12:17 |
kyrre_ | so i can use the "generate interrupt"-function | 12:18 |
kyrre_ | void or1ksim_interrupt (int i) [‘or1ksim.h’]: Generate an interrupt on interrupt line i. (from the user manual for or1ksim) | 12:18 |
stekern | ok, I see. You're not supposed to use that from your or1k application | 12:20 |
kyrre_ | ok, i took a wild guess since i couldnt find much information on the subject | 12:21 |
stekern | it's completely unrelated to gdb too | 12:21 |
kyrre_ | do you have a link or a hint on how to start with it? | 12:21 |
stekern | so, what you need to do is to write a model of your hardware in or1ksim, and then let that call the or1ksim_interrupt() function | 12:22 |
kyrre_ | hm... | 12:22 |
kyrre_ | how do i approach this? i wasnt all that much wiser after reading(skimming) through the user manual | 12:24 |
stekern | ...and that will then generate an interrupt exception that will trigger your interrupt routine in the freertos application | 12:25 |
kyrre_ | ya, thats exactly what i want to do | 12:25 |
stekern | I'd approach it by looking at the or1ksim code, but I'm pragmatic kind of guy... | 12:25 |
stekern | +a | 12:25 |
kyrre_ | ok so i have to actually alter some of the or1ksim source code? | 12:26 |
kyrre_ | there is no "include this module" option in or1ksim right? | 12:26 |
olofk_ | Stupid fucking synthesis tool. Don't try to turn my code into fucking block RAM! | 12:26 |
stekern | well, since there is a library, I guess the idea is that you can write an external module that calls that or1ksim_interrupt function | 12:27 |
kyrre_ | ye, that was kind of my guess also, but im totally lost on how to add this to the simulator | 12:28 |
stekern | but that will be a *host* application, not a target application | 12:29 |
kyrre_ | ok | 12:29 |
kyrre_ | haha, im totally clueless :p and how do i add a host application? | 12:31 |
kyrre_ | on another note, unless i "--disable-or1ksim" the gdb compilation returns with a linker error complaining about that missing library | 12:32 |
stekern | I think the whole lib thing is just for driving the simulator from another application (like a test-suite) and that's probably not what you want | 12:32 |
kyrre_ | ok | 12:33 |
stekern | I'd just add the model for your hardware in or1ksim/peripheral and look at for example 16450.c and apply the interesting parts for your peripheral from that | 12:33 |
stekern | I don't know why you're messing about with the or1ksim options to gdb? | 12:34 |
stekern | all you want is to run or1ksim with support for your hardware added as a peripheral and have or1ksim run your freertos program | 12:35 |
kyrre_ | me neither | 12:35 |
kyrre_ | ye | 12:35 |
kyrre_ | ill try looking into how to add a second application to or1ksim | 12:35 |
stekern | no, you want to add a peripheral | 12:36 |
stekern | add code to or1ksim, recompile it and install it | 12:36 |
olofk_ | or1ksim is a simulator that runs a program | 12:36 |
kyrre_ | ok | 12:36 |
kyrre_ | just have to get my toolchains back in order first | 12:37 |
olofk_ | aha! Seems like synplify creates different logic based on the clock frequency. Had a misspelled clock name, so it assumed I only wanted 1 MHz | 12:54 |
_franck_web_ | olofk_: I think I messed up things in simulation with my "remove verilog.py" | 12:56 |
_franck_web_ | stupid me | 12:56 |
olofk_ | _franck_web_: No worries. | 12:57 |
olofk_ | stekern: Minimum period: 5.960ns{1} (Maximum frequency: 167.785MHz) | 12:58 |
stekern | that's a nice number, I wonder how it would compare to a pentium running at that speed | 13:09 |
_franck_web_ | olofk_: is there a better way to fix my error: http://pastie.org/private/3glyfaqyw0l6kei3dgsglw ? | 13:12 |
olofk_ | _franck_web_: There are a few different ways. Not sure which one is best. | 13:13 |
olofk_ | You could make a private class in simulator that only has include_dirs, src_files, tb_src_files etc | 13:14 |
_franck_web_ | I thought about that. It's your call | 13:15 |
olofk_ | I don't have any strong opinions. We should just do something that works, and if it turns out to be a bad solution, we can change it later | 13:16 |
olofk_ | But the internal class can be nice, because it generates a smaller diff | 13:16 |
olofk_ | (I think) | 13:16 |
_franck_web_ | ok I prepare a patch. Yes, patch will be smaller. | 13:16 |
olofk_ | stekern: Minimum period: 4.966ns{1} (Maximum frequency: 201.369MHz) | 13:17 |
olofk_ | Timing errors: 0 Score: 0 (Setup/Max: 0, Hold: 0) | 13:17 |
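The fmax figures quoted from the timing reports are just the reciprocal of the minimum period. A small Python sketch of that arithmetic (the helper name `fmax_mhz` is made up for illustration):

```python
def fmax_mhz(min_period_ns):
    """Convert a minimum clock period in nanoseconds to a frequency in MHz."""
    # 1 / (period in ns) gives GHz, so scale by 1000 to get MHz.
    return 1000.0 / min_period_ns


if __name__ == "__main__":
    # Minimum periods reported in the log above.
    for period_ns in (5.960, 4.966):
        print(f"{period_ns:.3f} ns -> {fmax_mhz(period_ns):.3f} MHz")
```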
stekern | what have you changed? | 13:19 |
olofk_ | Just changed the constraints from 6ns to 5ns clock period | 13:19 |
olofk_ | A very interesting thing is that the largest logic depth now is 8, where I had 18 or 19 before | 13:20 |
olofk_ | I could try to go higher, but I want to enable caches and stuff to make it a more realistic implementation | 13:21 |
stekern | do you have anything connected to the i/d busses? | 13:22 |
olofk_ | No. Just a few pipeline stages. | 13:23 |
olofk_ | Are they unregistered? | 13:23 |
stekern | no, they should be pretty well cut-off from the cpu | 13:23 |
olofk_ | Trying to enable Data cache now. What other values than "NONE" are valid? | 13:27 |
stekern | "ENABLED" | 13:29 |
stekern | or anything except "NONE" will be interpreted as that | 13:29 |
stekern | but the common practice has been "ENABLED"/"NONE" for boolean parameters | 13:30 |
olofk_ | I think you should change that to "INDEED"/"HARDLY". Sounds more polite | 13:35 |
olofk_ | What the hell? When I enabled data cache, synplify once again started to implement the RF as Block RAM | 13:48 |
_franck_web_ | our fusesoc testsuite could be as simple as this for now: http://pastie.org/8994161 | 13:50 |
stekern | olofk_: INDEED is already implemented | 13:51 |
stekern | as well as AYE | 13:53 |
olofk_ | _franck_web_: That's a good idea. We just have to make sure we don't break or1200-generic too often :) | 13:53 |
olofk_ | stekern: Is that AYE_CACHE and DEE_CACHE? | 13:54 |
stekern | olofk_: aye | 13:56 |
olofk_ | Enabling caches pulled down the freq a lot. ~120MHz with DATACACHE enabled | 14:07 |
_franck_web_ | the fusesoc testbench has found its first problem: python3.3 doesn't like get_verilator_root() | 14:14 |
_franck_web_ | stekern: verilator does not like "unsigned long t" in your verilator_tb_utils | 14:26 |
_franck_web_ | http://pastie.org/private/4j2oaobt1ymhrvcmyqxxba | 14:26 |
_franck_web_ | changing to "double" does the trick | 14:26 |
stekern | I'd say it's better to use the vluint32_t or vluint64_t then | 14:30 |
_franck_web_ | sure | 14:32 |
stekern | are you on a 32-bit machine, or what is different from my setup (where it worked with the unsigned long) | 14:33 |
_franck_web_ | yes, 32bits | 14:33 |
stekern | I actually intended to have unsigned long long there, but I must have forgotten one long | 14:34 |
stekern | so, vluint64_t is probably the best choice | 14:35 |
_franck_web_ | olofk: stekern : what do you think of a very small dummy system to test our Quartus and ISE build system ? | 14:51 |
_franck_web_ | (that would be just a simple counter) | 14:53 |
_franck_web_ | or maybe we could put test systems in fusesoc itself | 15:35 |
stekern | I've used the unittest unit testing framework at work a bit, I think it might be a good idea to use that somehow | 15:38 |
olofk | _franck__: Yes. A dummy system would be good for the tests. | 15:42 |
stekern | _franck__: where is the difference in 3.3 in the get_verilator_root? | 16:05 |
stekern | I mean, where do the ' come from? | 16:07 |
stekern | oh... byte string and str incompatibilities between python2.7 and python3 is a real pain... | 16:43 |
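A short Python sketch of the bytes versus str pitfall mentioned above. It is a plausible reading of where the stray `'` came from, not a reproduction of fusesoc's actual `get_verilator_root()`: the helper `tool_output_as_text` and the child command that prints a path are stand-ins invented for this example. Under Python 3, `subprocess.check_output` returns `bytes`, and formatting that object as text yields `b'...'` including the quotes.

```python
import subprocess
import sys


def tool_output_as_text(cmd):
    """Run a command and return its stdout as stripped text."""
    raw = subprocess.check_output(cmd)    # bytes on Python 3
    return raw.decode("utf-8").strip()    # explicit decode gives a real str


if __name__ == "__main__":
    # Stand-in for an external tool that prints an install path.
    cmd = [sys.executable, "-c", "print('/usr/share/verilator')"]
    raw = subprocess.check_output(cmd)
    print(str(raw))                   # on Python 3: the b'...' form, quotes included
    print(tool_output_as_text(cmd))   # the plain path
```

Decoding explicitly at the point where the tool output is read keeps the rest of the code working on text only.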
mor1kx | [mor1kx] skristiansson pushed 1 new commit to master: https://github.com/openrisc/mor1kx/commit/1a99eb1c9a89ae4e9bca2e296b5c4608fc97fd54 | 19:28 |
mor1kx | mor1kx/master 1a99eb1 Stefan Kristiansson: remove cache enable debug prints... | 19:28 |
stekern | _franck__: I didn't quite understand your answer, did you mean "You can patch it" or "I can patch it"? | 20:00 |
_franck__ | I meant (but forgot some words) "stekern, please do it" | 20:01 |
stekern | ah, sure | 20:02 |
stekern | I wouldn't have cared if you just had picked up what I suggested, but since it's my mess, I guess it serves me to clean it up ;) | 20:03 |
_franck__ | don't take it like this at all :) | 20:04 |
_franck__ | stekern: don't you have a depend on elf-loader in your mor1kx-generic.core ? (may have misread) | 20:12 |
stekern | nope | 20:15 |
stekern | since I depend on verilator_tb_utils, elf-loader will be pulled in from that | 20:16 |
_franck__ | ok | 20:17 |
olofk | Most of the pull requests taken care of for fusesoc. Thanks for all the patches | 21:45 |
_franck__ | I hope we didn't break too many things ;) | 21:47 |
_franck__ | I'm a bit strict about style: you should have removed "Warning" from the pr_warn :) | 21:49 |
olofk | Doh. Sorry. I missed that | 21:49 |
_franck__ | (should be more strict about what I commit) | 21:49 |
olofk | I'm a little bit too tired now. Should probably stay away from the keyboard :) | 21:49 |
_franck__ | I'll change this when we find something else here | 21:49 |
olofk | Yeah. That's fine | 21:50 |
_franck__ | we would be more efficient if we were working during the day :( | 21:50 |
olofk | _franck__: That's just a crazy dream :) | 21:59 |
olofk | Please comment if there is something missing in the IseSection in my latest commit | 21:59 |
olofk | Time to sleep now | 22:00 |
olofk | good night | 22:00 |
--- Log closed Sat Apr 05 00:00:51 2014 |