IRC logs for #openrisc Wednesday, 2014-07-30

--- Log opened Wed Jul 30 00:00:42 2014
stekernblueCmd: cool, we should do a port ;)02:49
stekernwhat's another little ball to juggle?02:50
stekernhmm, this is odd... the pointer into the .dynamic section matches up when is loaded, but the contents do not03:31
stekernthe content matches something in .debug_abbrev03:32
stekerndalias: yes, this is strange... I haven't got to the bottom of it yet03:55
daliaswhat's the issue?03:56 segfaults when it tries to work out the deps for libgcc_s.so03:58
stekernand it does so because the content at what should be .dynamic is not what it should be, so it treats some 'random' data as DT_NEEDED04:01
daliasis the mmap call being made wrong?04:09
daliasmaybe your mmap2 syscall has wacky behavior with respect to the units for the offset04:10
daliasiirc some arch ports were confused about whether it's in units of pages or a fixed 4k unit04:10
daliasimo it should alway04:10
daliasimo it should always be the latter04:10
stekerndalias: hmm, ok... I'll take a look at that, but wouldn't one expect the offsets between the 'random' data (I can find that in the elf) and the actual .dynamic to be something like multiples of 4k then?04:31
stekernhere it's 0x1363404:31
daliasah probably not that then :)04:50
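The arithmetic behind that conclusion can be checked quickly. This is only a sketch of the reasoning in the exchange above: the single input is the 0x13634 distance stekern reported, and the helper name `is_unit_multiple` is made up for illustration.

```python
def is_unit_multiple(distance, unit=4096):
    """Return True if `distance` is a whole number of `unit`-byte blocks."""
    return distance % unit == 0


# Distance between the 'random' data and the real .dynamic, from the log.
distance = 0x13634

# Remainder inside a 4k block; a pure unit mix-up would leave 0 here.
print(hex(distance % 4096))
print(is_unit_multiple(distance))
```

A non-zero remainder only says the two locations are not a whole number of 4k blocks apart; it does not by itself rule out other mmap-related causes.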
daliasbtw it's best not to have libgcc_s :)04:50
HeshamWhich simulator (other than or1ksim) do OpenRISC folks use for debugging the cores? And can I use it to load complex software like RTEMS and debug both HW and SW via it?04:56
HeshamI want to simulate an entire SoC for example and its software, providing all features I can get from HW design that or1ksim does not support.04:58
stekernHesham: there's qemu too05:08
stekernwhat peripherals do you miss in or1ksim?05:08
HeshamI noticed that QEMU is kinda newly added, does it support the full features of a HW SoC, i.e., orpsoc?05:09
stekernthen there's of course rtl-simulations05:09
HeshamI miss the memory management architecture, MMU at or1ksim05:09
Heshamit's not fully implemented as in HW, multi-level page tables is an example05:10
HeshamYes, I am thinking of something like icarus and connect it to GDB, and load RTEMS applications there05:10
HeshamWill this get some attention of the community?05:11
stekernit's possible to do that yes05:11
HeshamI mean, would anyone want to try to run it?05:11
stekernmulti-level page tables, well... I don't know if you should bother with that? it's only mor1kx that supports that05:13
HeshamAnd by using Icarus, I guess I can write software as it runs on real HW (mor1k and fuseSoC for example) but it will be slow05:13
stekernyes, you can use verilator for a faster cycle-accurate model05:13
HeshamWell, in RTEMS, being used in embedded systems, we do not need large pages like 16 MB, and 8 KB needs two-level page tables, right?05:14
stekernbut again, what are you missing in or1ksim?05:14
stekernyes, two-level pagetables for 8KB05:14
HeshamReal-time clock, I do not get accurate clock interrupts when configuring the tick timer to tick every 1000ms for example05:15
HeshamWe need 8 KB pages, or 1 MB (ARM targets in RTEMS that implement MMU manager used 1 MB pages), so I need two-level page tables which or1ksim does not offer05:16
stekernHesham: ah, now I understand what you meant. two-level 8KB pagetables are supported by or1ksim, one-level 16MB is not.05:17
stekernI thought when you said 'multi-level' you meant using several different setups (one-level and two-level) at the same time05:19
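To make the one-level versus two-level distinction concrete, here is a minimal sketch of how a 32-bit virtual address could be split for a two-level walk with 8 KB pages. The 8/11/13-bit split is an assumption derived from the sizes mentioned in the chat (16 MB covered per first-level entry, 8 KB per page), and `split_vaddr` is a hypothetical helper, not code from or1ksim or mor1kx.

```python
PAGE_SHIFT = 13   # 8 KB pages -> 13 offset bits
L1_SHIFT = 24     # assumed: each first-level entry covers 16 MB

L1_BITS = 32 - L1_SHIFT          # 8 bits  -> 256 first-level entries
L2_BITS = L1_SHIFT - PAGE_SHIFT  # 11 bits -> 2048 second-level entries


def split_vaddr(vaddr):
    """Split a 32-bit virtual address into (l1_index, l2_index, offset)."""
    l1_index = (vaddr >> L1_SHIFT) & ((1 << L1_BITS) - 1)
    l2_index = (vaddr >> PAGE_SHIFT) & ((1 << L2_BITS) - 1)
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    return l1_index, l2_index, offset


l1, l2, off = split_vaddr(0x12345678)
print(hex(l1), hex(l2), hex(off))
```

With a one-level 16 MB mapping only the first index would be used, and the remaining 24 bits would form the offset.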
HeshamWe were considering that mixing too, to provide an API for dynamic memory allocation and setting up page attributes dynamically too.05:20
stekern(tick timer) yeah, to get accurate ticks/clock-cycles you'll need a cycle-accurate simulation05:21
stekernI'd use verilator for that05:21
HeshamWith the 128 TLB entries, we can map at most 1 MB of memory; the TLB entries would be statically reloaded at startup. But being an RTOS, we want to avoid TLB misses as much as we can, so 1 MB will be small05:22
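The 1 MB figure follows directly from the entry count and the page size. A small sketch of that calculation, where `tlb_reach` is a hypothetical helper name and the numbers are the ones quoted in the chat:

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_size):
    """Bytes of memory that can be mapped without a TLB miss."""
    return entries * page_size


# 128 entries with 8 KB pages, as discussed above.
print(tlb_reach(128, 8 * KIB) // MIB)    # reach in MB

# Same entry count with 16 MB pages, for comparison.
print(tlb_reach(128, 16 * MIB) // MIB)   # reach in MB
```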
HeshamCan I use verilator with GDB?05:22
stekernor the verilated model ;)05:22
HeshamThis way I can simulate the entire SoC and large software like RTEMS with accurate clocks?05:23
stekernI wouldn't call RTEMS 'large', but yes05:23
stekernyou can boot Linux in a verilated model too, so shouldn't be a problem05:24
stekernbut what's the 'entire SoC' in your case05:25
HeshamI want to share as much code as possible for RTEMS/or1k with different BSPs.05:25
HeshamFor example, mor1k core, UART driver, tick timer, FP unit, and maybe multi-core?05:25
stekernmor1kx doesn't have FPU05:26
Heshamlet's make multi-core the least priority, I noticed someone has implemented a two-core system05:26
stekernotherwise, yes all that is possible in the verilated model05:26
HeshamIs there a tutorial for setting up this verilated model?05:27
HeshamI came across this one which uses Icarus ->
stekernI don't think there are any step-by-step guides, maybe it's a task for you to do ;)05:28
stekernyou'll need this: and this:
HeshamSure, I intend to write a blog post when I make this work as I did with or1ksim/RTEMS05:28
stekernthen you'll need verilator:
HeshamYes that's to build the SoC, what about verilated model and GDB parts?05:29
stekernthen to run a verilated simulation: fusesoc sim --force --sim=verilator mor1kx-generic --elf-load=/path/to/elf05:30
HeshamThis way I can simulate a SoC with its software, and (as I have an Atlys board) I can put the design and RTEMS app on this FPGA board with no big effort, right?05:31
stekernto connect gdb you'll need to use this option: '-j, --jtag-server[=PORT]   Enable openocd JTAG server, opt. specify PORT'05:31
stekernthe mor1kx-generic SoC and the atlys SoC have some differences, but from the peripherals that you listed (uart) they should be identical05:33
HeshamThanks a lot stekern, I will work on providing a new RTEMS BSP (perhaps called verilator).05:33
HeshamAnd tick-timer?05:34
stekernor or1k-generic? ;)05:34
HeshamAlso, PIC05:34
stekernyes, tick-timer and PIC are embedded in the CPU core05:34
stekern(i.e. mor1kx)05:34
HeshamPIC is not fully implemented in or1ksim05:34
stekernyes it is05:35
Heshamor1k-generic provides more features right?05:35
stekernlike what for example?05:35
HeshamI am just inquiring05:35
HeshamI remember I read that mor1k is mini core of or1k05:36
Heshamlike FP for example ?05:36
stekernok, I'm mixing up what you are referring to.05:37
HeshamI thought that or1k-generic embeds the or1200 core, not mor1k05:38
stekernmor1kx doesn't implement all the features of the or1k architecture, no. But there isn't any implementation that does that05:38
stekernor1k-generic was just a suggestion for your BSP05:38
HeshamAh, I misunderstood that, sorry :)05:38
stekern*a name-suggestion05:38
stekernor1200 has some features that are not implemented in mor1kx (like FPU and MAC)05:39
HeshamAnd or1200, can it be built using fuseSoC and run on verilator?05:39
stekernmor1kx has some features that are not implemented in or1200 (like hw pagewalk, multi-way caches, support for more tlb entries than 64 etc)05:40
stekernand it's a lot faster05:40
stekernyes, or1200-generic can be used in the same fashion as mor1kx-generic05:41
HeshamI will use mor1k as it seems to be simple and robust, and I think more people use it than or120005:42
stekernI support that decision ;)05:42
stekernI actually got a priv-mail from someone that is planning on porting the or1200 fpu to mor1kx05:43
HeshamGreat, we can wait for that while working on providing more drivers and architecture support on RTEMS05:45
HeshamDo you remember about the old RTEMS/or1k port (back to 2005)?05:45
stekernthat was before I entered this project, but I know that there was some old rtems port05:46
HeshamIt was deleted by 2005, I hope that this new port will be good, so any suggestions for making it successful from you (and OpenRISC community) would be appreciated ;)05:48
sb0stekern, btw, what exactly are those SPRs that you think are responsible for the large resource usage in mor1kx vs. lm32?07:16
stekernsb0: reviewing that again, I think that statement is not so much true for the setup you are using07:19
stekerni.e. a lot of them are actually already opted out07:19
stekernI still have a couple of local patches where I tried commenting out the remaining ones (that are mandatory by the spec, but could be opted out and still have a functional core)07:25
stekernbut it didn't make a huge difference07:25
stekernI don't think there's a lot of low-hanging fruit left, it's just a matter of combing through everything and doing micro-optimizations all over07:28
stekern...and sometimes unexpected things fall out, like the PIPELINED mul implementation, I don't quite understand why that made such a difference area wise07:29
sb0how did lm32 (and microblaze and nios) do it?07:37
stekern'do it' == become small?07:40
stekernhow should I know? I didn't sit with the design teams ;) but I bet that they did some optimization iterations on those too07:42
Heshamstekern: I have installed fusesoc and verilator, and compiled a simple hello.elf with the bare-metal or1k-elf-gcc. When I type the command you provided, "fusesoc sim --force --sim=verilator mor1kx-generic --elf-load hello.elf", the simulator starts, gets to the part where it loads hello.elf, and fails with an error saying it is unable to load the elf file07:43
HeshamLoading hello.elf07:44
HeshamCan't open hello.elf07:44
HeshamError loading elf file07:44
HeshamTOP.v.mor1kx0.bus_gen.ibus_bridge: Wishbone bus IF is B3_REGISTERED_FEEDBACK07:44
HeshamTOP.v.mor1kx0.bus_gen.dbus_bridge: Wishbone bus IF is B3_REGISTERED_FEEDBACK07:44
stekernsb0: how large was lm32 now again (in the videomixer soc)?07:44
sb02400 LUTs07:44
sb0mor1kx is, like, 5k07:45
stekernbut that's with the PIPELINED mul iirc07:47
sb0which doesn't meet timing...07:48
stekernyes.. but the difference between the PIPELINED and THREESTAGE isn't 2.5k LUTs07:48
stekernthat's with THREESTAGE07:50
sb0well, using lm32 does remove 1600 LUTs from the soc07:53
sb0and I'm sure of this number07:53
stekerninteresting. so some logic increase somewhere else then07:54
stekerndo you have an mrp around?07:54
stekernI can do a build myself if not though07:55
sb0not on this laptop... you can test on misoc with: ./ -t simple -Ot cpu lm32/or1k build-bitstream07:56
sb0sorry, -Ot cpu_type07:57
sb0that 1600 figure is from last month, it should be a bit smaller with the couple new optimizations you made since then07:58
stekernok, no problem, I'll take a look07:59
sb0thx :)08:00
HeshamI am able to run mor1kx and hello.elf using fusesoc now and I got the exit message successfully. However the printf("hello"); does not emit anything to the console, am I missing something?08:12
stekernsb0: there was a simpler explanation, you remembered the lut count for lm32 wrong. it's 148408:14
stekernso, a lot of micro-optimizations left to do ;)08:17
stekernthere are of course subtle differences between the archs all over the place, like that you can't turn off caches at runtime in lm32. That will save you a big mux in the fetch unit08:18
stekernHesham: no, that should work08:20
stekern(fetch and lsu unit)08:21
Heshamfusesoc sim --force --sim=verilator mor1kx-generic does this command fetch the UART core?08:24
HeshamNot sure what's the problem, this is the message I got from the previous command08:25
HeshamProgram header 0: addr 0x00000000, size 0x0001117408:25
HeshamProgram header 1: addr 0x00013174, size 0x00000A8008:25
HeshamTOP.v.mor1kx0.bus_gen.ibus_bridge: Wishbone bus IF is B3_REGISTERED_FEEDBACK08:25
HeshamTOP.v.mor1kx0.bus_gen.dbus_bridge: Wishbone bus IF is B3_REGISTERED_FEEDBACK08:25
HeshamSuccess! Got NOP_EXIT. Exiting (27316)08:25
HeshamSimulation ended at PC = 0000d760 (27317)08:25
Heshamhello.elf simple printf("Hello"); from the main08:26
HeshamIs the UART address 0x9000000 as usual?08:28
HeshamThanks stekern for all your help. Everything is working fine now, the problem was some conflicts with my toolchain.09:01
blueCmdstekern: *TODO list explores*09:10
-!- simoncoo1 is now known as simoncook09:11
maxpalnstekern: FYI - the test is now passing :-)09:14
maxpalnA fresh pair of eyes did indeed solve the problem. Although I can't be sure exactly what caused the issue - I cleaned up several things that were wrong. The result is a passing simulation :-)09:15
maxpalnstarting a broader regression now09:20
stekernblueCmd: d?09:22
stekernHesham and maxpaln: great! =P09:22
blueCmdstekern: yes, explodes09:36
blueCmdexplores would be something though09:37
-!- Netsplit *.net <-> *.split quits: aburgess, jonmasters, olofk09:44
-!- Netsplit over, joins: aburgess09:45
stekernhmm... I get fun results with mmap..10:43
stekernnm... I'm holding it wrong11:01
-!- clopez_ is now known as clopez11:05
stekernok, holding it right yields fun result too... and consistent with what happens in musl11:07
stekernok... so this whole thing is most likely related to page size and mmap211:10
stekern-'most likely'11:14
stekernwhat a mess...11:20
stekernI'd consider that a kernel bug11:20
stekernblueCmd: back to the original problem with R_OR1K_INSN_REL_26 in, does the glibc dynlinker resolve those?11:24
stekernblueCmd: follow-up question - is the result of that correct?12:09
blueCmdstekern: for this case or in general?12:18
blueCmdeither way I don't know, but if it isn't correct in general I would think I would see more crashes12:19
stekernboth, but for this case is of course more interesting12:19
stekernwell, from what I understand, R_OR1K_INSN_REL_26 shouldn't be present in dynlibs. and the compiler will not produce code that will result in such relocations, so I don't think it is that common12:20
stekernok, probably something else, hacking around those relocations I get a bus error12:41
stekernso, maybe I'm now at the same point as you ;)12:42
blueCmdstekern: :D13:21
blueCmdnice, Matjaz responded to my GCC email. I think we're only missing Jungsook now14:03
stekernthis bus error seems to come from a failure in my test program...14:18
stekernblueCmd: can you give me the program that fails to call __umodsi3?14:26
blueCmdstekern: a binary?14:46
blueCmdstekern: I'm afraid that will cost you license fees14:50
blueCmdstekern: should use modsi at least14:52
blueCmdI don't have any broken chroots open atm so I couldn't test14:53
blueCmdlet me know if you want something more exact14:53
stekernthat's fine, I might not need it14:53
stekernok, my rudimentary __umodsi3 test works now14:57
stekernblueCmd: did you recompile rdfind after you added the .size statement?14:58
stekernprobably, since you got the __modsi in the got14:59
stekernso, then I don't know why it doesn't work for you15:03
stekernblueCmd: can you test running this:
stekerncompiled with: or1k-linux-gcc -O2 umodsi.c -shared-libgcc -o umodsi15:11
stekernand if that fails, test applying this hack to or1k.S:
stekernit's against openrisc/or1k-gcc, so your changes are there too15:12
blueCmdstekern: no, I didn't recompile15:28
blueCmdI don't want to recompile all packages that linked to libgcc15:28
stekernyou only need to recompile the once that link to it dynamically15:32
stekernbut the file you gave me didn't look broken like mine before I added the .size info15:33
stekernand it complained about missing that info when compiling15:34
blueCmdstekern: apt-cache rdepends libgcc1 | wc -l15:35
blueCmdso forgive me for not wanting to recompile "only" the packages affected :)15:35
stekernyou didn't read what i said15:36
blueCmdstekern: well, I answered the 1st thing. I didn't quite understand the two follow-up sentences15:37
blueCmdand I assumed 's/once/ones'15:37
stekernbut from that I assume that when you say that you haven't recompiled, you mean that your rdfind is compiled with 4.8.x?15:39
stekernmoving back, are you saying that none of those 4466 link against libgcc statically?15:41
blueCmdstekern: yes15:41
blueCmdstekern: statically would imply a build-dependency, not a binary dependency15:42
blueCmdI'm not sure if this is the whole tree or just the first level dependencies though, let me check that15:42
blueCmdbut I don't know why we are discussing this, surely maintaining the same structure of exported functions is trivial?15:43
stekernbut either way, you'd only have to recompile the ones that were built against the libgcc that was missing @function and .size15:44
blueCmd(apparently not, but what I'm saying is that this has to be something small)15:44
blueCmdstekern: I have no such binaries15:44
stekernexactly, so no issue15:44
blueCmdbut things still crash with @function and .size15:44
stekernbut I thought you had rebuilt the apps you noticed failing15:44
stekernyes, I got that... So something is up with that15:45
blueCmdright. no, I'm waay too lazy for that15:45
blueCmdyou should know that15:45
stekernso... I need to build a libgcc with the old C versions, compile my test-prog against that and then try to run that against the asm libgcc15:46
blueCmdyes, that is what I would do as the next step15:46
blueCmdbut going back to my laziness, that is why I involved you15:47
blueCmdbut it's nice to know that you have the same issue now at least (or something that looks related)15:47
stekernright now, I have no issues15:50
stekern...well not related to this at least..15:51
blueCmdwhat was that crash you were talking about earlier?15:54
stekernI have 'resolved' all crashes15:59
stekern1) was due to that our sys_mmap2 in Linux is buggy16:00
stekern2) was due to the R_OR1K_INSN_REL_26, musl doesn't resolve them16:02
stekern3) was due to the missing @function and .size16:03
blueCmdI might have been sloppy looking at the final crash. it seems to go past umodsi3 now and go to strcasecmp16:09
blueCmdI'm going to rebuild gcc and glibc and see what happens16:09
blueCmdtakes about a day to build gcc and about 6 hours to build glibc16:10
blueCmdso, I'll be right back!16:10
stekerngood, because you'd anyway be on your own, the test app built against C libgcc worked when dynlinking against asm libgcc16:11
stekernblueCmd: musl takes about 1.5 minute to build, without -j16:14
blueCmdstekern: natively? ;)16:15
stekernsure, on a really fast or1k machine ;)16:15
stekernbut, no, cross-build16:15
blueCmdstekern: yeah, I cannot run with -j and I need to compile it natively for debian16:18
blueCmdrebuilding my whole development setup takes ~10 min from zero to working rootfs otherwise, so that's fine16:18
blueCmdstekern: why would you link to libgcc statically btw?16:24
poke53281blueCmd: Try to start qemu with the no flags option. I am pretty sure your compile time will be significantly reduced..16:28
blueCmdpoke53281: oh, what flag is that?16:32
stekernblueCmd: because otherwise stekern might break the dynamic one16:34
stekernjoking aside, try compiling my test without the -shared-libgcc flag, and it will statically link libgcc16:35
blueCmdmaybe it is standard to statically link it and that's why the whole system isn't broken and only a handful16:36
blueCmdand libgcc contains some other thing that the libraries need16:36
stekerndalias: any suggestions what to do with a broken sys_mmap2...?17:02
blueCmdstekern: you're not speaking of the PAGE_SIZE = 14 and not = 13 thing are you?17:03
blueCmdor whatever that constant is called17:04
stekernbut no17:04
blueCmdor 13 / 1217:04
stekernour page size is 8KB17:05
* blueCmd 's greatest strength is his attention to details17:05
stekernnot 14 bytes ;)17:05
blueCmdsomething like that17:05
blueCmdit's 2<<1417:05
stekernno, it's 2<<13 == 819217:05
stekernyou tricked me17:06
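For reference, the shift arithmetic in this exchange is easy to get wrong by one bit. A quick sketch, assuming the usual convention that the page size is 1 shifted left by a page shift of 13 (the name `PAGE_SHIFT` here is only illustrative):

```python
PAGE_SHIFT = 13

# An 8 KB page is 1 shifted left by 13.
page_size = 1 << PAGE_SHIFT
print(page_size)          # 8192

# Shifting 2 instead of 1 doubles the result.
print(2 << PAGE_SHIFT)    # 16384
```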
blueCmdyeah, don't look at the numbers - i'm buzzing around about the fact that IIRC we're not using the same as 'all the other' arches17:07
stekernyeah, but that's not an issue per se17:07
stekernthe issue is that mmap2 should take the offset argument in multiples of 4kb, but our kernel implementation interprets the offset argument as 8kb17:08
stekern*as multiples of 8kb17:09
stekernat least if we're going to follow this:
blueCmdTHERE! that was annoyingly hard to find17:10
stekernyes, I know, both our glibc and uclibc are correcting for the kernel bug17:11
blueCmdright. I always thought there was a grander idea behind that17:11
blueCmdmore than a bug17:11
stekernto comfort ourselves, arc seems to be even more broken...17:12
stekernarc can be configured to have 4/8/16KB page tables, and mmap2 follows that configuration...17:13
blueCmdlet's not do that17:13
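The effect of the unit mismatch described above can be illustrated with plain arithmetic. This is a sketch under the assumptions stated in the chat (the caller passes the offset in 4 KB units, the or1k kernel of the time multiplies it by 8 KB); the function names are hypothetical and no real syscall is made.

```python
UNIT_CALLER = 4096   # unit a caller following the 4k convention would use
UNIT_KERNEL = 8192   # unit the kernel applied on or1k, per the discussion


def to_pgoff(byte_offset, unit):
    """Convert a byte offset into the unit count passed to mmap2."""
    if byte_offset % unit:
        raise ValueError("offset is not a multiple of the mmap2 unit")
    return byte_offset // unit


def offset_seen_by_kernel(pgoff, unit):
    """Byte offset the kernel ends up mapping for a given unit count."""
    return pgoff * unit


wanted = 0x4000                                     # offset the caller wants
pgoff = to_pgoff(wanted, UNIT_CALLER)               # caller divides by 4k
mapped = offset_seen_by_kernel(pgoff, UNIT_KERNEL)  # kernel multiplies by 8k
print(hex(wanted), hex(mapped))
```

With mismatched units the kernel maps from twice the intended file offset, which fits the symptom of unrelated data showing up where .dynamic was expected.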
blueCmdstekern: do you know how syscalls are compiled for glibc? I find the way they do it kind of hilarious17:36
blueCmda pain to debug if you don't know how it's done, but yeah17:36
daliasstekern, is sys_mmap2 broken?18:15
daliasah i see...18:16
daliasthe kernel _REALLY_ needs comments in the source to warn porters about this issue18:16
daliassys_mmap2 should always be using 4k units18:16
daliasit especially should not use a runtime-variable pagesize18:16
daliasif you want to keep the constant 8k size then we need to make musl aware of it18:17
daliasso far musl does not support any of the wacky archs where mmap2 behaves badly18:17
daliasFYI, microblaze _ALMOST_ had a horrible regression in this area a few months back18:18
stekernyeah, I don't think we'll get much forward trying to force a change in the kernel18:18
daliasbut i caught the message on the glibc list that was about to break it and make an ABI-compat nightmare18:18
daliasand we got it reverted before it got out into the wild :)18:18
stekerndalias: well, welcome to wacky or1k land ;)18:19
stekernor perhaps :/18:19
daliasstekern, so for or1k, mmap2 shift matches pagesize?18:20
daliasand pagesize is runtime-variable (obtained from auxv) right?18:20
daliasnote that PAGE_SIZE, not PAGE_SHIFT, is what we get...18:21
daliasso now we need division in mmap() :-p18:21
stekernyes ;)18:24
stekernbut on a practical level, our pagesize isn't really variable, it's always 8KB18:24
daliasbut from an abi standpoint we need to treat it as such18:26
stekernnot really, it's defined to be 8KB in the ABI18:26
blueCmdit has "always" been 8K18:26
daliaswell why doesn't limits.h define PAGE_SIZE then? :)18:44
daliasright now musl treats it as an arch where the page size is runtime variable18:44
daliasi would be very happy to define it to 8k and treat that as a permanent part of the abi tho, if you are18:44
daliasvariable pagesize is stupid18:44
blueCmdwhat is the constant supposed to be called? PAGE_SIZE?18:45
blueCmdmy x86_64 doesn't have a constant for that18:46
blueCmdit has one in sys/user.h though18:46
daliasbluecmd, that's glibc. they removed PAGE_SIZE on all archs because of their HURD mentality that everything is variable/unlimited-limit18:47
daliasthey also removed PATH_MAX for the same reason18:48
daliason musl, PAGE_SIZE exists for most archs in bits/limits.h18:48
ysionneauwell isn't the page size variable on most modern MMU ?18:50
ysionneauor maybe this feature isn't used on any OS?18:51
daliasi dunno, probably not18:51
daliasif it's variable at runtime on a given piece of hardware, then it doesn't have to be variable in the ABI18:52
daliasyou just pick the best (smallest) available option18:52
daliason the other hand, if the choice of page size is limited by the hardware, and the same ISA has different page sizes on different hardware18:52
daliasthen the ABI for this ISA has to support variable pagesize18:52
daliasthis is the case with mips18:53
stekerndalias: (PAGE_SIZE in limits.h) it very well can be, I just noticed today that we're using the runtime variable18:57
daliasstekern, making it constant generates mildly better code in a few places18:59
stekernyeah, I'm happy to make it constant. the only reason it isn't already is an oversight from my side.19:02
daliasstekern, ok. so there are two issues. 1. add SYSCALL_MMAP2_UNIT macro with value 8k for or1k19:24
dalias2. define PAGE_SIZE to 8k19:24
stekerndalias: right19:35
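The two items dalias lists could look roughly like the following in a libc's mmap wrapper. This is a sketch in Python rather than musl's actual C code: the per-arch table, the `x86` entry, and the function name are assumptions for illustration, and only the 8k values for or1k come from the chat.

```python
DEFAULT_MMAP2_UNIT = 4096

# Hypothetical per-arch overrides; or1k's 8k values are from the discussion.
ARCH_CONFIG = {
    "or1k": {"mmap2_unit": 8192, "page_size": 8192},
    "x86": {"page_size": 4096},
}


def mmap2_pgoff(arch, byte_offset):
    """Return the offset argument to pass to mmap2 for the given arch."""
    unit = ARCH_CONFIG.get(arch, {}).get("mmap2_unit", DEFAULT_MMAP2_UNIT)
    if byte_offset % unit:
        # An offset that is not unit-aligned cannot be expressed at all.
        raise ValueError("offset not aligned to mmap2 unit")
    return byte_offset // unit


print(mmap2_pgoff("or1k", 0x4000))   # in 8k units
print(mmap2_pgoff("x86", 0x4000))    # in the default 4k units
```

Keeping the unit as a compile-time constant per arch, rather than deriving it from a runtime page size, avoids the division dalias jokes about earlier, since a constant power of two reduces to a shift.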
daliasbtw some interesting history, at least as i understand it19:48
daliasi think the push to #undef PAGE_SIZE came around the time people started getting interested in huge pages on x86[_64]19:48
daliasbecause there was a serious misunderstanding of how to use them right at first, and of what PAGE_SIZE means19:49
daliasand some people assumed you would run processes with a page_size of 2MB or 2GB for using huge pages...19:49
dalias(which would of course be idiotic; you'd use all your ram in no time, and swapping becomes totally impractical)19:50
daliasPAGE_SIZE is simply the mmap granularity19:53
poke53281and I wrote a manual a few months ago in the chat.23:12
blueCmdpoke53281: ooh, nice - yes I remember this now when reading the message23:35
--- Log closed Thu Jul 31 00:00:43 2014
