--- Log opened Tue May 24 00:00:20 2016 | ||
SMDwrk | Yep, clearing mem with zeroes helped | 01:34 |
---|---|---|
SMDwrk | Also tried to init ram with 0xabcdabcd. It worked. | 01:44 |
SMDwrk | For now ram contains "xxxxx", and when such value gets fetched it can poison pipeline. What do you think if we initialize ram with random values or 0xFFs? I don't speak about zeroes because it would be too easy. | 01:44 |
olof | SMDwrk: Ah cool. That's good to know. I still would like to know why the software is trying to read uninitialized memory, but having an option to clear the ram is probably good to have too | 02:52 |
SMDwrk | olof I thought gcc should have a key to init vars with zeroes but can't find one | 02:52 |
olof | SMDwrk: I'm experimenting a bit with objcopy. It might be possible to do something like this | 02:58 |
olof | or1k-elf-objcopy -O elf32-or1k --gap-fill 0 dhrystone_10.elf dhrystone_10.elf2 | 02:58 |
olof | That should fill the gaps between sections with a value | 02:59 |
olof | Also might be worth trying other options such as --pad-to | 02:59 |
SMDwrk | Gap-fill didn't work for me( | 03:26 |
olof | SMDwrk: Well, now that we know that it works, I'm fine with an option to clear the mem on init | 03:27 |
olof | However, I think we could put it in the testbench for now, just before the elf-loading | 03:27 |
SMDwrk | olof I'll discuss that problem with my colleagues when they come to work | 03:27 |
olof | cool | 03:27 |
SMDwrk | And that's why this problem doesn't show up with verilator | 03:28 |
olof | I would assume so | 03:28 |
olof | Have you been able to pinpoint which read from RAM causes the error? Would be interesting to know why the sw does this in the first place | 03:29 |
SMDwrk | yep, let me search for the pic | 03:30 |
SMDwrk | http://imgur.com/LOPfHkU here | 03:30 |
SMDwrk | 0x178f8 or something like this. I don't know why we read mem which has not been written yet and wasn't initialized also. | 03:32 |
SMDwrk | but we might get it from insn trace | 03:32 |
SMDwrk | That 0x4e4700xx we get because we previously wrote 3 bytes to that cell, but LSByte is still undefined | 03:34 |
olof | Hard to tell if this is a CPU or SW problem. As long as we don't rely on the result of the last byte, it should be ok to have it uninitialized | 03:38 |
olof | So either mor1kx looks at the LSByte, even though it shouldn't care about its value | 03:38 |
olof | Or the sw assumes that the last byte is 0 | 03:38 |
SMDwrk | olof yep, but in real hw uninit bit(byte) is either 0 or 1, but not x | 03:38 |
SMDwrk | >Or the sw assumes that the last byte is 0 | 03:39 |
SMDwrk | That could be one problem, yes | 03:39 |
olof | Yeah. True. It's starting to become more of an academic problem :) | 03:39 |
olof | SMDwrk: I'll push an option to clear the RAM for the mor1kx-generic system during simulation | 03:40 |
SMDwrk | There is question that should be answered: why sw reads from mem which has not been written. Either it thinks it's already initialized then we have to find out why it hasn't, or sw reads random mem address and expects to read "trash" from there, but it reads a bunch of "xxxx" and then pipeline is crashed. In the last case we should init mem | 03:41 |
olof | Yep. That's what I think too | 03:42 |
olof | But to get around the problem for now, I'll add a command-line option to clear the mem for that system | 03:42 |
SMDwrk | Thanks! | 03:42 |
SMDwrk | I guess people most of the time use verilator | 03:43 |
olof | Yeah, for pure mor1kx simulations at least. When you're having peripherals and more clocks it's getting trickier to use verilator | 03:44 |
olof | Could also put in an option to set the init value to anything, but we can do that later if needed | 03:50 |
SMDwrk | olof what do you think about adding total cycles passed for the simulation after it ends? | 03:50 |
olof | It's a good idea. I often wonder how many cycles that has been run. I think that could be implemented as a new option in vlog_tb_utils | 03:51 |
SMDwrk | olof, huh, I think I have an idea | 03:51 |
SMDwrk | Maybe it's cache issue: we could read let's say 0x1234 address and it will fill cache with 1238 and so on, thus we see that mem reads | 03:53 |
SMDwrk | but next question why we use one of that results | 03:53 |
olof | SMDwrk: I'm about to test my fix now. What is supposed to happen without the clearing? Will it just hang, or do I get a wrong result? | 03:57 |
SMDwrk | For that dhrystone binary and current pipeline implementation it gets into infinite loop after "int comp should be 17" | 03:58 |
olof | ah cool. | 03:58 |
olof | Looks like my patch works. I'm past the Int_Comp test now | 03:58 |
olof | yep. dhrystone completed | 03:59 |
SMDwrk | olof do you have that .vcd file I posted? | 04:01 |
olof | No, I haven't looked at it | 04:01 |
SMDwrk | Ok, then I'll update issue ticket with things I find interesting to look at | 04:01 |
SMDwrk | My question now is: if we store to cache some non-init values(i.e. mem locations which are never written, now we can do that, it's ok i guess), why then we use that values from cache. And I have some values to search/look for in that .vcd | 04:03 |
olof | SMDwrk: I pushed two patches now | 04:06 |
SMDwrk | Thanks! | 04:06 |
olof | Just add --clear_ram at the end of the argument list | 04:06 |
SMDwrk | olof also I suggest to add tick number to the trace log. It could help to estimate what's going on in the pipeline | 04:18 |
olof | SMDwrk: Hmm.. I thought there already was one, but maybe that was for the old or1200 monitor | 04:20 |
olof | There's a slightly ugly way you could do something similar right now actually | 04:20 |
olof | If you set the heartbeat parameter, e.g. add --heartbeat=100, you will get an indication every 100th time unit | 04:21 |
olof | If you also do --trace_enable and --trace_to_screen, you will dump the trace to the screen, together with the heartbeat | 04:21 |
olof | Not saying it's pretty, but at least it should work with what is there now | 04:22 |
SMDwrk | I'd like to relate .vcd and insn trace to get idea what code executes at 996500 time and why | 04:22 |
olof | Yeah. That's nice to have. | 04:23 |
olof | I got some ideas how to do that, but they require a little effort | 04:23 |
SMDwrk | Maybe we could expand that insn trace file with extra info(i.e. stall signals, cache misses and etc) which will help to debug things in future. | 04:24 |
SMDwrk | I think I could do it myself later when I study code and make PCU | 04:25 |
olof | Yes. We could put a lot of nice stuff in there. Feel free to hack it if you want. It's in mor1kx/bench/verilog/mor1kx_monitor.v I think | 04:25 |
SMDwrk | olof buggy code is in strlen. I don't know how it's implemented in libc, but I guess there's a while loop until \0 is met. And for some reason it doesn't stop where it's intended and that's why it reads mem and cache further, gets corrupted values and hangs | 04:43 |
olof | SMDwrk: Good find! | 04:44 |
olof | I wonder if they skip writing \0 and just assume that it's zero to begin with | 04:45 |
SMDwrk | It's more like a guess now but looks reasonable for me) Yep, that's the question. Where should I look for it, gcc? | 04:45 |
olof | I think you would find strlen somewhere in newlib | 04:46 |
olof | But I guess gcc would be responsible for \0-terminating the string | 04:47 |
SMDwrk | olof, yep every string goes with \0 and gcc doesn't rely something is already init with zero. So we need to find out why \0 is missing. | 04:54 |
SMDwrk | oh, I think I get it | 04:56 |
SMDwrk | So, the answer is the following. There is \0 in 0x4e4700xx, but that "xx" ruins everything | 04:59 |
SMDwrk | If it's initialized, we can correctly detect \0 and exit from strlen(), while it has "xx" in it pipeline gets mad and can't correctly calculate result. | 05:01 |
olof | ah ok | 05:08 |
olof | That's a mor1kx bug then | 05:08 |
stekern | umm, how does the 'xx' come to ruin things? | 05:10 |
SMDwrk | I think here: assign {adder_carryout, adder_result} = a + b_mux + {{OPTION_OPERAND_WIDTH-1{1'b0}}, carry_in}; if a is 0x112233xx result would be 0xXXXXXXXX | 05:11 |
stekern | but isn't the memory position read with a byte load? | 05:12 |
olof | stekern: For strlen, I assume it would be much quicker to read words and shift | 05:12 |
olof | or mask | 05:13 |
SMDwrk | I guess it's done with lwz, yep | 05:13 |
stekern | maybe, but then the 'xx' would be masked out | 05:13 |
SMDwrk | Code from strlen: l.lwz r5,0(r4); l.ori r2,r2,0xfeff; l.add r6,r5,r2; | 05:15 |
SMDwrk | Let's assume r5 is loaded with 0x112233xx | 05:15 |
SMDwrk | then add will fail as I described, right? | 05:16 |
stekern | mm, that seems right | 05:17 |
stekern | but where is that string located then? | 05:17 |
stekern | is it the last thing in memory (or in some section) | 05:17 |
stekern | if so, I would say it'd be a bug in the program loader | 05:17 |
olof | stekern: Or end of stack perhaps? | 05:17 |
SMDwrk | Or it could be copied from bss(?, where all static vars are) to the mem and then compared? | 05:18 |
SMDwrk | and while we copy, we update mem on byte basis | 05:19 |
olof | Hmm.. do we track memory writes in the trace log? | 05:19 |
stekern | olof: all stack writes are 4 byte aligned, so the data has to have been read from uninitialized memory in the first place then too (I think at least) | 05:19 |
olof | stekern: True | 05:20 |
olof | I also think that end of the section is a possibility | 05:22 |
olof | but otoh, the elf-loader also only writes full words | 05:22 |
olof | So it should zero-extend | 05:22 |
stekern | hmmm | 05:23 |
stekern | SMDwrk: can you look up where the string is located? | 05:23 |
SMDwrk | I guess it's in .rodata | 05:30 |
SMDwrk | http://pastebin.com/u956y7n1 "blabla SOME blabla" | 05:31 |
SMDwrk | I think we do smth like memcopy(strlen()) or strcpy() and copy given string by byte and then we finish with 0x4e4700xx in ram/cache(http://imgur.com/LOPfHkU) | 05:37 |
stekern | can't look at pastebin from here... | 05:45 |
stekern | but that could of course be the case, if it's copied byte by byte from one place to another | 05:46 |
stekern | not sure how you would prevent the x'es in any other way than initializing the memory to something first | 05:47 |
SMDwrk | why don't you want to initialize mem? | 05:50 |
stekern | it might take a long time to initialize all of it if you have a large mem | 05:52 |
SMDwrk | You can init it only for iverilog and tell verilator to treat non-init values either as 0, or 1, or even random values each time you build project | 05:53 |
olof | stekern: But we might be able to solve this in mor1kx. | 07:25 |
stekern | how, when the problem is that x'es are read from uninitialized memory? | 07:26 |
olof | stekern: But isn't the problem really that mor1kx looks at the x bits, even though it shouldn't? | 07:38 |
stekern | no, the code is doing an add, it has to use the 'x' bits for that | 07:38 |
stekern | if it would have been a byte load or an AND operation, then I'd claim it's a mor1kx bug | 07:39 |
stekern | we could have a insert-random-bit-on-x module on the bus to avoid having to initialize the memory ;) | 07:41 |
olof | ah ok. Hmm... but if there is something non-zero in that location, we will get the wrong result, right? | 07:43 |
stekern | no, I don't think so | 07:43 |
stekern | if *that* would be the case, then it's a bug in the sw | 07:43 |
olof | But if we do an add, where part of one of the operands is unknown... how would that ever work reliably? | 07:44 |
SMDwrk | we could initialize mem with random values and still get correct result, problem is with 'xx'. So either we init mem with any values, or we insert x-to-smth on the mem bus | 07:45 |
stekern | olof: the code sequence that SMDwrk showed was: l.lwz r5,0(r4); l.ori r2,r2,0xfeff; l.add r6,r5,r2; | 07:47 |
olof | And the data @r4 has xx in LSByte? | 07:49 |
stekern | yes | 07:49 |
olof | Then I still don't understand how we could trust the contents of r6 after the add, if part of one of the operands could be anything | 07:50 |
SMDwrk | I think we can't. | 07:51 |
olof | Then I would say it's a software bug | 07:52 |
SMDwrk | but hw is responsible for that xx being in LSByte, not sw | 07:52 |
stekern | yes, but since they are 'xx' in practice they could be random | 07:53 |
SMDwrk | that's more like particular sim error | 07:54 |
stekern | ? | 07:54 |
SMDwrk | I mean on verilator you won't see that kind of error and also on real hw/fpga. | 07:55 |
stekern | but then lsb != 0 should be fine | 07:56 |
stekern | otherwise you might see the error on hw | 07:56 |
SMDwrk | why? lsb could be anything except 'x', because in 0x4e4700xx there is already \0 | 07:57 |
SMDwrk | actually we just don't care about last byte | 07:58 |
stekern | exactly, then lsb != 0 should be fine | 07:59 |
olof | It works fine in icarus with RAM initialized to 0xff | 07:59 |
SMDwrk | lsb != x should be fine | 08:00 |
olof | Yeah. I agree. Just trying to figure out where in mor1kx it breaks | 08:00 |
SMDwrk | it breaks on add when we can't predict what add(0x112233xx, 12341234) would be and then we get xxxxxxxx as result and then pipe goes mad | 08:01 |
stekern | no, lsb != 'x' shouldn't be fine when fed to an add operation | 08:06 |
stekern | no, scratch that, I'm not concentrating... | 08:07 |
stekern | I read that as lsb == 'x' ;) | 08:07 |
GeneralStupid | hi, is there a list of papers about openrisc (like from TU munich)? | 10:24 |
SMDwrk | GeneralStupid: do you need arch manual or smth else? | 10:31 |
-!- Netsplit *.net <-> *.split quits: mafm, pecastro | 10:42 | |
GeneralStupid | SMDwrk: i need to find a master thesis and i have three options, doing smthg interesting with openrisc or doing something in medical robotics or doing something boring | 10:42 |
SMDwrk | GeneralStupid: I think you'd better ask wallento or others from universities, and I don't know european requirements for the thesis | 10:44 |
GeneralStupid | SMDwrk its my first master thesis, so i dont know them, too. :) | 10:45 |
SMDwrk | GeneralStupid: I mean in russia your thesis must have some r&d part and some math | 10:47 |
SMDwrk | I'd say this paper one looks like great thesis to me http://www.eecg.utoronto.ca/~moshovos/ACA06/readings/two-level-bpred.pdf | 10:47 |
GeneralStupid | a bit too short as a master thesis :) | 10:48 |
GeneralStupid | i really would like to spend some time for this project, because i have no free time atm... | 10:49 |
SMDwrk | you could use bigger font size ;) | 10:50 |
GeneralStupid | nice idea :D | 10:51 |
SMDwrk | I can teach you a lot how to enlarge your thesis :3 | 10:52 |
SMDwrk | s/teach/tell , I guess | 10:54 |
GeneralStupid | i hope i dont need that -.- but i still have 2 or 3 months to decide what to do | 11:04 |
-!- Netsplit *.net <-> *.split quits: mafm, pecastro | 11:57 | |
-!- Netsplit over, joins: mafm, pecastro | 12:06 | |
shorne | Hi All, wallento, olofk there are still 2 pull requests pending for gdb and newlib. Want me to do anything with them before they are merged? I think once we get through all the test results we can work to push upstream. | 19:06 |
shorne | But no harm merging, so others can get them, unless I need to fix my commit messages or something. | 19:06 |
shorne | Still working through test results. I grouped them by failure count here: | 19:07 |
shorne | https://gist.github.com/stffrdhrn/f96fbc9c0a94b6299dd50fa1f79640c6 | 19:07 |
--- Log closed Wed May 25 00:00:22 2016 |
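
The x-propagation argument from the 05:11 to 05:16 and 07:38 to 07:39 exchanges can be illustrated with a small model. This is only a sketch, not mor1kx code: the `Word` class and the helpers `add` and `bit_and` are hypothetical names, and the constant 0xfefefeff is an assumption (the log only shows the `l.ori r2,r2,0xfeff` half of it). It mimics the Verilog rule that an arithmetic `+` with any unknown operand bit yields an all-unknown result, while a bitwise `&` with a known 0 still gives a known 0.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass(frozen=True)
class Word:
    """A 32-bit value; bits set in `xmask` are unknown ('x')."""
    value: int
    xmask: int = 0

    def __str__(self):
        # Print nibble by nibble; a nibble with any unknown bit shows as 'x'.
        digits = []
        for shift in range(28, -4, -4):
            if (self.xmask >> shift) & 0xF:
                digits.append("x")
            else:
                digits.append(format((self.value >> shift) & 0xF, "x"))
        return "0x" + "".join(digits)


def add(a, b):
    """Model Verilog '+': any unknown operand bit makes the whole result unknown."""
    if a.xmask or b.xmask:
        return Word(0, MASK32)
    return Word((a.value + b.value) & MASK32)


def bit_and(a, b):
    """Model Verilog '&': a known 0 on either side forces a known 0 result bit."""
    known_zero = (~a.value & ~a.xmask) | (~b.value & ~b.xmask)
    xmask = (a.xmask | b.xmask) & ~known_zero & MASK32
    value = a.value & b.value & ~xmask & MASK32
    return Word(value, xmask)


# The word from the log: three bytes written, LSByte never initialized.
r5 = Word(0x4E470000, xmask=0x000000FF)
r2 = Word(0xFEFEFEFF)  # assumed constant, see the note above

print(r5)                                # 0x4e4700xx
print(add(r5, r2))                       # 0xxxxxxxxx
print(bit_and(r5, Word(0xFFFFFF00)))     # 0x4e470000
```

This matches stekern's point: a byte load or an AND would mask the unknown byte away, whereas the add has to consume it, so the simulator cannot produce a known result.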
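
The conclusion reached around 07:57 to 08:00 ("lsb != x should be fine") can be checked with a word-at-a-time strlen sketch. This is not the newlib source; it assumes the common zero-byte test `(w - 0x01010101) & ~w & 0x80808080`, where adding 0xfefefeff is the same as subtracting 0x01010101 modulo 2^32. The function names and the test string `STRING` are hypothetical; the string is chosen only so that its last word looks like 0x4e4700xx on a big-endian machine.

```python
ONES = 0x01010101
HIGHS = 0x80808080
MASK32 = 0xFFFFFFFF


def has_zero_byte(word):
    """True if any of the four bytes of a 32-bit word is 0x00."""
    return ((word - ONES) & ~word & HIGHS & MASK32) != 0


def strlen_wordwise(mem, start=0):
    """Length of the NUL-terminated string in `mem`, scanned one big-endian word at a time.

    Assumes `start` is word aligned and a terminating NUL exists in `mem`.
    """
    pos = start
    # Skip whole words until one of them contains a zero byte.
    while True:
        word = int.from_bytes(mem[pos:pos + 4], "big")
        if has_zero_byte(word):
            break
        pos += 4
    # Locate the exact position of the terminator inside that word.
    while mem[pos] != 0:
        pos += 1
    return pos - start


# Whatever concrete value the byte after the terminator holds, the result is the same.
lengths = {strlen_wordwise(b"STRING\x00" + bytes([junk])) for junk in range(256)}
print(lengths)  # {6}
```

With any real byte value after the `\0` the length comes out right, which is consistent with the observation that the run works in icarus once RAM is initialized to 0xff and never fails in verilator. Only the simulator's 'x' state breaks the add.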