IRC logs for #openrisc Friday, 2014-04-04

--- Log opened Fri Apr 04 00:00:50 2014
blueCmdolofk: yes, rebase -i is the way to do that!01:32
analognoiseHi OpenRISC aficionados02:42
stekernolofk: ($time) yes03:54
stekernolofk: sure, show me. or if you feel ready with it, just push it. If I feel that I have something I could add to it, I can post a patch against that.03:55
olofk_stekern: Doh... I thought I had pastied it.05:40
stekernI thought you were just a teaser, wanting to build up the tension before you show it ;)05:45
olofk_stekern: Isn't there an option to choose rf ram implementation in mor1kx?07:39
stekernno, currently there is only ram07:41
stekernI've planned to remove the mor1kx_rf_ram and use mor1kx_simple_dpram_sclk instead07:43
stekernand then at the same time add options to allow for a reg based rf as well07:43
stekernare you running out of blockrams?07:44
olofk_No. Just playing with the critical path for a Virtex-607:46
stekernah, right07:47
olofk_But right now I see that the critical path is here07:47
olofk_  Source:               mor1kx0/mor1kx_cpu/cappuccino.mor1kx_cpu/mor1kx_lsu_cappuccino/mor1kx_store_buffer/read_pointer[1] (FF)   Destination:          mor1kx0/mor1kx_cpu/cappuccino.mor1kx_cpu/mor1kx_fetch_cappuccino/pc_fetch[31] (FF)07:47
olofk_16 levels of logic07:47
olofk_Can I get away without the store buffer?07:48
stekernok, but then it should be simple to just add a 'ramstyle' attribute07:49
olofk_Yeah, I did that for the rf ram07:49
olofk_I just thought there was a top-level switch for it as well07:49
stekernno, the store buffer isn't optional as it is, and I don't think it goes through the storebuffer, just originates from it, right?07:50
stekernwhat you can do, is to add the 'ramstyle' attribute to that too, and reduce the depth of the fifo07:50
olofk_It uses the full flag07:50
olofk_So ramstyle probably won't help there07:51
stekernah, ok07:51
stekernI see07:51
stekernbut... the statement remains, that path would be there anyway (I think), just that it would originate from something else07:52
stekernbecause, what that does, is that the full flag is connected to the stall signal07:53
stekernand without the storebuffer, the stall signal would be asserted by the fact that the lsu is busy07:54
stekernand the actual problem is that the stall signal is the critical path07:54
stekernat least, that's what I think is happening there07:55
olofk_It uses plenty of logic to generate the full flag, so I'll see if I can register that08:00
stekernah.. right.. it originated from the read_pointer, not the full flag08:02
olofk_Shouldn't need all that logic. ~3.5ns (over half of my timing budget) to generate the full glag08:03
stekernyes, and it should be possible to register it08:03
stekernit's *gag* btw08:04
stekernbut, then you'll need to make the actual full flag an "almost full" flag08:05
olofk_IIRC there is some trickery you can do to make the full flag registered08:08
olofk_But I think it costs some extra pointer registers08:08
stekernhmm, why do you need to use trickery?08:10
stekernisn't it just: always @(posedge clk) full <= we & (one_spot_left) | zero_spot_left;08:11
olofk_You need to detect reads as well, don't you?08:12
stekernwhere the "almost full" would be the one_spot_left08:12
stekernah, well.. I was thinking we can afford the one extra cycle of latency there08:14
stekernI think, with a fifo depth of about 16 it "never" gets full anyways08:15
olofk_640 words depth will be enough for everyone08:15
stekernhaha, yes!08:15
stekernthat's at least what my empirical testing showed when I was working on the storebuffer08:16
olofk_and yes. What you described was the kind of  "trickery" I had in mind. Precalculating an extra pointer08:16
stekernah, ok. Yes08:17
olofk_And I think you're right. An extra cycle of latency when we are almost full and write and read at the same time is probably a good trade-off08:17
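The registered full flag that stekern and olofk_ converge on above can be modelled cycle by cycle. The following is a minimal Python sketch, not the mor1kx store buffer code: the class name `RegisteredFullFifo`, the `clock` method and the decision to treat `we` as a write that has already been gated by the flag are assumptions made for illustration. It tracks occupancy only and shows the trade-off discussed: because the flag ignores reads, it can stay set for one cycle longer than strictly necessary, but it is never clear while the FIFO is actually full.

```python
class RegisteredFullFifo:
    """Occupancy-only model of a FIFO whose full flag is a register."""

    def __init__(self, depth):
        self.depth = depth
        self.count = 0      # entries currently stored
        self.full = False   # registered flag, updated on each clock edge

    def clock(self, we, re):
        """Advance one clock cycle with write request `we` and read request `re`."""
        # Combinational terms, evaluated on the state before the clock edge.
        one_spot_left = self.count == self.depth - 1   # the "almost full" condition
        zero_spot_left = self.count == self.depth
        do_write = we and not self.full                # assumption: writes are gated by the flag
        do_read = re and self.count > 0

        # full <= we & one_spot_left | zero_spot_left  (reads deliberately ignored)
        next_full = (do_write and one_spot_left) or zero_spot_left

        # Register updates.
        self.count += int(do_write) - int(do_read)
        self.full = next_full


fifo = RegisteredFullFifo(depth=4)
for _ in range(4):
    fifo.clock(we=True, re=False)     # fill the FIFO
filled = (fifo.count, fifo.full)

fifo.clock(we=False, re=True)         # one read frees a slot
after_read = (fifo.count, fifo.full)  # flag still set: the one pessimistic cycle

fifo.clock(we=False, re=False)
recovered = (fifo.count, fifo.full)   # flag clears one cycle later
```

The same pessimistic cycle appears when a write and a read arrive together with one slot left: the occupancy stays at `depth - 1`, yet the flag is set for the following cycle.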
olofk_Yep. That killed the worst path. Unfortunately I got slightly worse timing at some other place instead :)08:30
stekernthat's the voodoo and black magic talking08:31
olofk_How critical is the picsr signal?08:32
olofk_Is that external interrupts that we can afford to delay a cycle?08:32
stekernI had a fun game with quartus at work last week, the project showed no timing errors, but changing the signals captured by signaltap made the project 'make or break'08:33
olofk_Yeah. That's so annoying.08:33
olofk_Especially when the build time is a few hours08:34
stekernshould be possible to register picsr without problems08:35
stekernluckily this was a small project, with ~20 min build time08:35
stekernbut where do you get a critical path from that? it's already registered08:36
olofk_picsr is only part of the problem, but I figured it might be harder to register it later on08:37
stekernthat OR can definitely be registered08:38
olofk_ah c00l. I'll try that then08:38
olofk_Can't run any tests, so I'm just happy coding :)08:39
olofk_hmm.. how do you set a register to a static value?08:40
olofk_Can you do "assign r = 0" with registers?08:42
juliusbI think it must be within an always statement08:43
stekernno, and old icarus can't even handle: always @(*) r = 0;08:43
stekernso you got to do: always @(posedge clk) r <= 0;08:44
juliusb!!?? that's a fundamentally different thing. That's describing it as a synchronous not a combinatorial assignment08:45
juliusbI presume olof wants a combinatorial thing08:45
juliusbor are you meaning for synthesis this is the best way to do it?08:45
stekernno, he wants a static value08:45
stekernand old versions of icarus can't handle the prior statement, so you have to work around it with the synchronous statement if you'd want to be compatible with those08:46
stekernif you don't care about that - always @(*) r = 0; is fine08:46
juliusbFair enough.08:47
juliusbI guess older versions of icarus don't support the SV always_comb stuff?08:47
stekernis there any kind of SV support in icarus?08:48
stekernI doubt any synthesis tool would infer a synchronous logic on always @(posedge clk) r <= 0; instead of just tying the signal to '0' too ;)08:49
stekernthe bug in icarus is related to that there are no changing signals in that statement, so the sensitivity list becomes empty08:53
stekernI wonder if: always @(*) if (rst) r = 0; else r = 0;08:54
stekernwould work as well08:55
stekern... but that's of course not any prettier as a workaround...08:57
olofk_stekern: Is it a known fact that all critical paths end up in pc_fetch?10:02
olofk_The picsr pipelining helped raise fmax from ~128 to ~131MHz btw10:02
stekerndon't know if it's a known fact, but the mux here: is heavy and addr_valid contains the stall signal (padv_i)10:08
stekernso it's not really surprising10:08
stekernolofk_: (131) with what parameters is that?10:12
olofk_stekern: All default10:14
olofk_No caches10:14
stekerndo a test with parameter FEATURE_OVERFLOW= "NONE";10:15
stekernif that doesn't change things, try FEATURE_ADDC= "NONE";10:16
stekerniirc one of those is a real bottleneck10:16
olofk_It could be nice to have a few predefined "profiles". Like extremely small, runs linux, high IPC and so on10:17
stekernyup, I agree10:17
olofk_I'll bring the numbers when I have them10:17
olofk_162MHz without changing the parameters you suggested10:45
olofk_I'm building this stand alone in an empty FPGA, so I need to pipeline the paths to and from the pads to get more realistic values10:46
stekernok, so that was with staring at the screen, holding your thumbs and whispering "voodoo and black magic, work in my favour" repeatedly?10:46
olofk_Yes! That and adding an extra pipeline stage between the pad and du_stall_i10:47
stekernah, ok =)10:47
stekernthe latter probably had less to do with it than the staring and whispering10:50
olofk_Now all my 10 worst paths are from mor1kx_execute_ctrl_cappuccino/ctrl_except_syscall_o or mor1kx_fetch_cappuccino/fetch_exception_taken_o to mor1kx_fetch_cappuccino/pc_fetch10:50
olofk_Do you think OVERFLOW/ADDC will help with those?10:51
stekernoverflow is connected to the exception10:52
olofk_Ok, at least this is black magic. Setting FEATURE_OVERFLOW = "NONE" lowered fmax to 145MHz11:02
stekernthat sounds like proper black magic11:05
stekernmaybe you forgot to hold your thumbs this time?11:06
olofk_ah.. yes. I'll rerun and give you more accurate numbers11:07
stekernor your stare wasn't fierce enough?11:07
_franck_web_posts in the opencores forum are not lost, they just have a ~2 days latency11:31
olofk_ahh :)11:34
olofk_What the hell... suddenly the RF becomes block ram again11:44
olofk_stekern: Can I expect the caches to become the critical path if I turn them on?11:58
olofk_Damn caches. Can't live with 'em. Can't live without 'em11:58
kyrre_Hi, im trying to compile gdb --with-or1ksim=/opt/or1ksim and im adding the path to libsim with LDFLAGS=-L/opt/or1ksim/lib LIBS='-lsim'. This produces an error with an undefined reference to "ceil", so i added -lm to LIBS, and since i dont know where the -lm flag ends up in the link order in the makefile i also added "-Wl,--no-as-needed"... still this doesnt help me :/12:03
_franck_web_kyrre_: you don't need to set LDFLAGS and LIBS, --with-or1ksim=/opt/or1ksim should be enough (and in your case --enable-sim should be enough as you are using the default location for or1ksim)12:11
kyrre_ok, thats what i had earlier, but when i tried to compile with the library from or1ksim into a freertos application it didnt recognize the library format. And on the openrisc forums Jeremy Bennett suggested to reinstall the gdb while telling it the location of or1ksim12:13
_franck_web_what do you want to do exactly ? gdb sim target is mostly used while running the testsuite12:14
stekernlibrary from or1ksim into a freertos application?12:16
kyrre_im writing an exception handling scheme for FreeRTOS which will use a hardware module that is being added to the or1200.... so i need to write an interrupt routine that can call the exception function. But i should test this interrupt routine on or1ksim first12:16
kyrre_so im trying to get the or1ksim to generate interrupts, but i might have misunderstood something12:17
stekernok, that doesn't answer why you would try to link or1ksim into your freertos application12:17
kyrre_so i can use the "generate interrupt"-function12:18
kyrre_void or1ksim_interrupt (int i) [‘or1ksim.h’]: Generate an interrupt on interrupt line i. (from the user manual for or1ksim)12:18
stekernok, I see. You're not supposed to use that from your or1k application12:20
kyrre_ok, i took a wild guess since i couldnt find much information on the subject12:21
stekernit's completely unrelated to gdb too12:21
kyrre_do you have a link or a hint on how to start with it?12:21
stekernso, what you need to do is to write a model of your hardware in or1ksim, and then let that call the or1ksim_interrupt() function12:22
kyrre_how do i approach this? i wasnt all that much wiser after reading(skimming) through the user manual12:24
stekern...and that will then generate an interrupt exception that will trigger your interrupt routine in the freertos application12:25
kyrre_ya, thats exactly what i want to12:25
stekernI'd approach it by looking at the or1ksim code, but I'm a pragmatic kind of guy...12:25
kyrre_ok so i have to actually alter some of the or1ksim source code?12:26
kyrre_there is no "include this module" option in or1ksim right?12:26
olofk_Stupid fucking synthesis tool. Don't try to turn my code into fucking block RAM!12:26
stekernwell, since there is a library, I guess the idea is that you can write an external module that calls that or1ksim_interrupt function12:27
kyrre_ye, that was kind of my guess also, but im totally lost on how to add this to the simulator12:28
stekernbut that will be a *host* application, not a target application12:29
kyrre_haha, im totally clueless :p and how do i add a host application?12:31
kyrre_on another note, unless i pass "--disable-or1ksim" the gdb compilation returns with a linker error complaining about that missing library12:32
stekernI think the whole lib thing is just for driving the simulator from another application (like a test-suite) and that's probably not what you want12:32
stekernI'd just add the model for your hardware in or1ksim/peripheral and look at for example 16450.c and apply the interesting parts for your peripheral from that12:33
stekernI don't know why you're messing about with the or1ksim options to gdb?12:34
stekernall you want is to run or1ksim with support for your hardware added as a peripheral and have or1ksim run your freertos program12:35
kyrre_me neither12:35
kyrre_ill try looking into how to add a second application to or1ksim12:35
stekernno, you want to add a peripheral12:36
stekernadd code to or1ksim, recompile it and install it12:36
olofk_or1ksim is a simulator that runs a program12:36
kyrre_just have to get my toolchains back in order first12:37
olofk_aha! Seems like synplify creates different logic based on the clock frequency. Had a misspelled clock name, so it assumed I only wanted 1 MHz12:54
_franck_web_olofk_: I think I messed up things in simulation with my "remove"12:56
_franck_web_stupid me12:56
olofk__franck_web_: No worries.12:57
olofk_stekern:    Minimum period:   5.960ns{1}   (Maximum frequency: 167.785MHz)12:58
stekernthat's a nice number, I wonder how it would compare to a pentium running at that speed13:09
_franck_web_olofk_: is there a better way to fix my error: ?13:12
olofk__franck_web_: There are a few different ways. Not sure which one is best.13:13
olofk_You could make a private class in simulator that only has include_dirs, src_files, tb_src_files etc13:14
_franck_web_I thought about that. It's your call13:15
olofk_I don't have any strong opinions. We should just do something that works, and if it turns out to be a bad solution, we can change it later13:16
olofk_But the internal class can be nice, because it generates a smaller diff13:16
olofk_(I think)13:16
_franck_web_ok I prepare a patch. Yes, patch will be smaller.13:16
olofk_stekern:    Minimum period:   4.966ns{1}   (Maximum frequency: 201.369MHz)13:17
olofk_Timing errors: 0  Score: 0  (Setup/Max: 0, Hold: 0)13:17
stekernwhat have you changed?13:19
olofk_Just changed the constraints from 6ns to 5ns clock period13:19
olofk_A very interesting thing is that the largest logic depth now is 8, where I had 18 or 19 before13:20
olofk_I could try to go higher, but I want to enable caches and stuff to make it a more realistic implementation13:21
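The minimum-period and maximum-frequency pairs quoted from the timing reports above are related by a plain reciprocal. A small Python sketch of that conversion follows; the function name `fmax_mhz` is made up for illustration, and the two inputs are the periods reported in the log.

```python
def fmax_mhz(min_period_ns):
    """Convert a minimum clock period in nanoseconds to a maximum frequency in MHz."""
    # 1 / (period in ns) is a frequency in GHz; multiply by 1000 for MHz.
    return 1000.0 / min_period_ns


for period_ns in (5.960, 4.966):
    print(f"{period_ns:.3f} ns -> {fmax_mhz(period_ns):.3f} MHz")
```

Run on the two reported periods, this reproduces the 167.785 MHz and 201.369 MHz figures from the reports.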
stekerndo you have anything connected to the i/d busses?13:22
olofk_No. Just a few pipeline stages.13:23
olofk_Are they unregistered?13:23
stekernno, they should be pretty well cut-off from the cpu13:23
olofk_Trying to enable Data cache now. What other values than "NONE" are valid?13:27
stekernor anything except "NONE" will be interpreted as that13:29
stekernbut the common practice has been "ENABLED"/"NONE" for boolean parameters13:30
olofk_I think you should change that to "INDEED"/"HARDLY". Sounds more polite13:35
olofk_What the hell? When I enabled data cache, synplify once again started to implement the RF as Block RAM13:48
_franck_web_our fusesoc testsuite could be as simple as this for now:
stekernolofk_: INDEED is already implemented13:51
stekernas well as AYE13:53
olofk__franck_web_: That's a good idea. We just have to make sure we don't break or1200-generic too often :)13:53
olofk_stekern: Is that AYE_CACHE and DEE_CACHE?13:54
stekernolofk_: aye13:56
olofk_Enabling caches pulled down the freq a lot. ~120MHz with DATACACHE enabled14:07
_franck_web_the fusesoc testbench has found its first problem: python3.3 doesn't like get_verilator_root()14:14
_franck_web_stekern: verilator does not like "unsigned long t" in your verilator_tb_utils14:26
_franck_web_changing to "double" does the trick14:26
stekernI'd say it's better to use the vluint32_t or vluint64_t then14:30
stekernare you on a 32-bit machine, or what is different from my setup (where it worked with the unsigned long)14:33
_franck_web_yes, 32bits14:33
stekernI actually intended to have unsigned long long there, but I must have forgotten one long14:34
stekernso, vluint64_t is probably the best choice14:35
_franck_web_olofk: stekern : what do you think of a very small dummy system to test our Quartus and ISE build system ?14:51
_franck_web_(that would be just a simple counter)14:53
_franck_web_or maybe we could put test systems in fusesoc itself15:35
stekernI've used the unittest unit testing framework at work a bit, I think it might be a good idea to use that somehow15:38
olofk_franck__: Yes. A dummy system would be good for the tests.15:42
stekern_franck__: where is the difference in 3.3 in the get_verilator_root?16:05
stekernI mean, where do the ' come from?16:07
stekernoh... byte string and str incompatibilities between python2.7 and python3 is a real pain...16:43
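The stray quotes stekern asks about are consistent with a bytes object being formatted as text under Python 3. The following is a minimal sketch of that behaviour, not the fusesoc code: the child command and the path `/opt/verilator` are made up for illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for an external tool that prints an install path.
cmd = [sys.executable, "-c", "print('/opt/verilator')"]

raw = subprocess.check_output(cmd)   # a bytes object under Python 3
as_text = str(raw)                   # repr-style text: keeps the b prefix and the quotes
path = raw.decode().strip()          # decode to str first, then drop the trailing newline
```

Under Python 2.7 the same `check_output` call returns a `str`, so code that builds paths from the raw value works there and only picks up the extra `b'...'` wrapping once it runs on Python 3.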
mor1kx[mor1kx] skristiansson pushed 1 new commit to master:
mor1kxmor1kx/master 1a99eb1 Stefan Kristiansson: remove cache enable debug prints...19:28
stekern_franck__: I didn't quite understand your answer, did you mean "You can patch it" or "I can patch it"?20:00
_franck__I meant (but forgot some words) "stekern, please do it"20:01
stekernah, sure20:02
stekernI wouldn't have cared if you just had picked up what I suggested, but since it's my mess, I guess it serves me to clean it up ;)20:03
_franck__don't take it like this at all :)20:04
_franck__stekern: don't you have a depend on elf-loader in your mor1kx-generic.core ? (may have misread)20:12
stekernsince I depend on verilator_tb_utils, elf-loader will be pulled in from that20:16
olofkMost of the pull requests taken care of for fusesoc. Thanks for all the patches21:45
_franck__I hope we didn't break too many things ;)21:47
_franck__I'm a bit strict about style: you should have removed "Warning" from the pr_warn :)21:49
olofkDoh. Sorry. I missed that21:49
_franck__(should be more strict about what I commit)21:49
olofkI'm a little bit too tired now. Should probably stay away from the keyboard :)21:49
_franck__I'll change this when we find something else here21:49
olofkYeah. That's fine21:50
_franck__we would be more efficient if we were working during the day :(21:50
olofk_franck__: That's just a crazy dream :)21:59
olofkPlease comment if there is something missing in the IseSection in my latest commit21:59
olofkTime to sleep now22:00
olofkgood night22:00
--- Log closed Sat Apr 05 00:00:51 2014
