IRC logs for #openrisc Monday, 2014-05-12

--- Log opened Mon May 12 00:00:44 2014
stekernblueCmd: yeah, I agree, I should probably have split that in two ;)03:46
stekernand not agile?! you should know that I pair programmed it with my daughter!03:47
stekernshe had all kinds of ideas about pinky pies, princesses and unicorns that I had to fight off!03:49
olofkstekern: Why? :(05:40
olofkWe have way too few unicorns in the project05:41
stekernolofk: hmm, maybe you're right, perhaps I was too hasty brushing off her ideas as nonsense..06:09
stekernshe usually comes with strong arguments though, the other day she proclaimed "I'm a princess, and I will hit you in the face!"06:10
blueCmdstekern: you're a brave man fighting off a princess that will hit you in the face06:23
blueCmdhttp://s2.hubimg.com/u/6748285_f520.jpg < just add that in the commit message or something06:24
stekernif the generic gpio driver pulled off getting weird ascii art mainline, so can I!06:33
stekernhttp://lxr.free-electrons.com/source/drivers/gpio/gpio-generic.c06:33
olofkhaha. I hadn't seen that :)06:41
stekernhmm, I wonder what is best when implementing an atomic test_and_set_bit07:07
stekernunconditionally setting the bit, or actually test before setting?07:08
stekernthe test is a couple of more instructions, but you can avoid looping on a failing l.swa07:08
stekern...but otoh, with the couple of more instructions, the risk of an l.lwa interruption increases when the test fails.07:10
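The two options weighed above can be sketched in portable C. This is only an illustration, assuming C11 atomics as a stand-in for the l.lwa/l.swa pair; the function names tas_bit_unconditional and tas_bit_test_first are hypothetical and are not taken from the kernel branch under discussion.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Variant A: always do the atomic read-modify-write, even if the bit
 * is already set. Shorter, but every call ends in a store. */
static bool tas_bit_unconditional(unsigned nr, _Atomic unsigned long *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    unsigned long old = atomic_fetch_or(&addr[nr / BITS_PER_WORD], mask);
    return (old & mask) != 0;
}

/* Variant B: test first and skip the store when the bit is already set.
 * Costs an extra test per attempt, but a set bit never triggers a
 * store-conditional that might fail and force a retry. */
static bool tas_bit_test_first(unsigned nr, _Atomic unsigned long *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    _Atomic unsigned long *word = &addr[nr / BITS_PER_WORD];
    unsigned long old = atomic_load(word);

    while (!(old & mask)) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(word, &old, old | mask))
            return false;   /* we set the bit; it was clear before */
    }
    return true;            /* bit was already set; nothing written */
}
```

Both variants return the previous state of the bit. Which one is faster on a given core depends on how often the bit is already set and how costly a failed store-conditional is, which is exactly the open question in the discussion.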
wallentostekern: I rebased and our thread is lost now. find the updated pull request here: https://github.com/openrisc/or1k-src/pull/707:37
stekernwallento: yes, that's a bit annoying that github does that :(07:41
wallentoi also searched around a bit what people suggest for modifications of pull requests07:41
stekernbut as for the pull request itself, I think it looks good now07:42
wallentothat would be an awesome thing, as the basic idea of pull requests is that they typically need changes :)07:42
stekernI've created a "newlib" team and added you as a member to it, you should be able to push your changes to or1k-src yourself07:42
wallentoperfect, thank you07:43
stekernthe only way I can come up with to "work-around" the github behaviour is to push the new changes to a different branch, open a new pull-request and then close the old one07:44
wallentoyes, that is also an option, but then people might get even more confused. I think the best is handling this properly on the github site07:44
wallentolike detecting force writes on branches in pull requests07:45
wallentobut I think they may come up with something, seeing what else they got going07:46
stekerntrue07:52
wallentoI worked on the snooping, may commit it the next days07:53
wallentowe changed a lot and especially made the cache handle writes, so I re-worked it for the current mor1kx cache07:53
wallentoproblem is you instantly have to react on a snoop, but I think there are no problems with it so far. But it needs some testing now..07:54
stekernwallento: oh, that sounds cool. I'm playing with the Linux stuff currently07:55
stekernI've got a WIP branch of it here: http://git.openrisc.net/cgit.cgi/stefan/linux/log/?h=smp07:56
stekernI haven't got very far with it yet ;)07:58
wallentoyeah, see my recent mail on the scratch reg issue08:00
stekernah, it hasn't arrived here yet, but will when it does08:00
wallentobest would indeed be something that does not require significant arch spec or ABI changes08:01
wallentobest would be if it can be checked with UPR08:01
wallentoor so08:01
stekernyes, or just a CONFIG_08:01
stekern(in the kernel case)08:02
stekernchecking uprs isn't really an option, since you'd need free regs to check it runtime ;)08:02
wallentoi mean at boot time then08:02
wallentolike linux multicore will simply not boot (or not boot the other cores)08:02
stekernyes... but what will you do with the information?08:02
stekernah... ok you mean like that08:03
wallentoif there is openrisc silicon out there, this would be a sane check I think08:03
stekernyeah, that could of course work (and is not exclusive to the CONFIG_ option)08:04
wallentojust gave a second thought to the snooping, turned out it is +8 lines only :)08:22
wallentoi will push it as soon as I am sure that really no state transition is missed08:22
LoneTechstekern: thought just a bit more on the PLT. it would seem to me that its function is to keep trampolines so function pointers are still simple pointers whether going to static or dynamic symbols08:36
LoneTechbut the existence of a nameN value implies that we can find sufficient information with one instruction; we can pass the GOT+nameN address instead of reloc_offsetN and save one word per PLT entry08:38
LoneTechthe whole construct is a kludge to work around the linker not being able to lock a code page (in which case the PLT section would not need to look in a separate GOT table)08:39
LoneTechoh my. or32_trampoline_init seems to set up (some sort of) closures by writing code on the stack, per the documentation08:53
LoneTechwhere do I find the actual code generation for the PLT?08:58
stekernLoneTech: do you mean this? https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/elf32-or1k.c;h=956ec387ec14f7361e95812f72e7463e0a21b592;hb=HEAD#l3009:00
LoneTechthat looks right, yes. thanks :)09:00
LoneTechI think I can't cut it down to less than 12 bytes09:05
stekernyes, and a note on that, I was mostly concentrated on "making it work" rather than "making it optimized" when I implemented those09:06
LoneTechunderstandable09:06
stekern=)09:06
stekernfor instance, I guess you could have different sized entries for the pic/non-pic case, but I took the easy path and made them same sized, even if that meant some 'useless' padding in the PIC case09:08
LoneTechI'm not used to seeing ld.so in /usr09:16
stekernno.. I don't think that's even used09:17
blueCmdstekern: cmpxchg works for 1, 2 and 4 now.09:17
stekernso, 3 is missing?09:19
stekernor were there more?09:19
blueCmdwell, do you really want to support 3 byte operations? ;)09:20
blueCmdin filenames it would be -1, -2 and -309:20
stekernhaha, ok09:20
blueCmdand -4 and -5 are for 64 and 128 bit operations09:20
stekernI though you were speaking about the filenames09:21
stekern+t09:21
blueCmdyes, I can be a bit vague at times09:23
blueCmdit's wonderful that searching for 'openrisc 1000 architecture' gets me to the old site and requires 4 clicks to get to the pdf09:31
blueCmdand the second hit on google is a very old mirror of the document09:31
stekernit's not a remedy for the problems you point out, but I usually go to http://opencores.org/or1k and click on the link there when I want the arch spec09:37
stekernI don't know why we don't have a direct link to the document there though..?09:38
olofkOne down! http://opencores.org/or1k/Architecture_Specification#Atomic_operations09:44
stekernI should perhaps push the updated spec too...09:48
olofkstekern: Will that be rev 1.0-1 ?09:55
olofkIIRC juliusb had some ideas on how to keep arch spec and implementation versions in syn09:55
olofkc10:01
LoneTechit's rather sad how many hardcoded offsets go into the linker code10:01
stekernolofk: 1.1-010:20
stekernI think the last digit can only be used for changes that do not change the architecture10:21
stekernlike fixing typos, reorganizing text, clarifications etc10:21
stekern... and the implementations have an arch ver reg, that claims what version of the arch spec it's compatible with10:29
stekern(well, at least the post 1.0 implementations)10:30
olofkah yes. This is definitely a cause for updating the arch rev10:31
stekern*ver10:33
stekernlast digit is the rev ;)10:33
stekernmajor.minor-rev10:33
olofkah fuck it. I'm not going to comment on this topic anymore. It's way over my head :)10:33
blueCmdhm10:35
blueCmdadd_fetch and sub_fetch are not trivial to implement for 1 and 2 bytes10:35
blueCmdi guess it will need to be something like: l.lwa, l.srl, l.and 0xff, l.add, l.and 0xff, l.sll, l.or (combine l.lwa and the result of the computation), l.swa10:42
blueCmdand there needs to be a xor or something to clear the old value before the l.or10:42
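The sequence described above (load the word, shift the byte down, add, mask, shift back, clear the old byte, merge, store conditionally) can be sketched in C. This is a minimal sketch, assuming a C11 compare-exchange loop in place of l.lwa/l.swa; add_fetch_u8 and its lane numbering (lane 0 = bits 0..7) are hypothetical choices for illustration, not the GCC implementation being worked on.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically add val to one byte inside a 32-bit word and return the
 * new byte value. Only the containing word is accessed atomically. */
static uint8_t add_fetch_u8(_Atomic uint32_t *word, unsigned lane, uint8_t val)
{
    unsigned shift = lane * 8;
    uint32_t mask = (uint32_t)0xff << shift;
    uint32_t old = atomic_load(word);              /* role of l.lwa */
    uint32_t merged;
    uint8_t result;

    do {
        /* shift down, mask, add, mask again (the cast truncates) */
        result = (uint8_t)(((old >> shift) & 0xff) + val);
        /* clear the old byte, then merge the shifted result */
        merged = (old & ~mask) | ((uint32_t)result << shift);
        /* role of l.swa: retry if the word changed in between */
    } while (!atomic_compare_exchange_weak(word, &old, merged));

    return result;
}
```

For example, starting from the word 0x11FF3344, adding 1 to lane 2 wraps that byte to 0x00 and leaves the word at 0x11003344: the carry does not leak into the neighbouring byte.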
stekernhmm... is the opencores web svn slow, or why doesn't the commit I just did show?10:52
stekernolofk: as for the de0_nano sim question, the answer is as simple as - There is no support for sim with de0_nano implemented yet10:54
stekernat least I haven't got around to add the support for it10:55
olofkaha :)10:59
olofkI guess FuseSoC should detect this11:00
olofkand I guess we should add a tb11:00
olofkstekern: Remind me about request #46. It was updated after I commented on adding an if to check for an env var, right?11:02
olofk(starting to understand the irritation over dropped conversations on github)11:03
olofkAH.. Just found something good about git starting ranges after the specified commit11:07
olofkI can do git cherry-pick HEAD..FETCH_HEAD after I have fetched a branch for a pull request11:08
stekernolofk: yes, #46 should be good to go now11:10
olofkI assumed that and pulled it already :)11:11
stekernblueCmd: it's time for your daily SMP commit: http://git.openrisc.net/cgit.cgi/stefan/linux/commit/?h=smp&id=52a3c9208c1ca003999f032f0ec2e0fc0927165511:11
stekernthat link revealed a bit of whitespace damage in that commit I think11:13
blueCmdstekern: lgtm :)11:15
blueCmdI appreciate the daily commits11:15
stekernnext milestone - spinlocks11:16
blueCmdI have no idea how to make this nice..11:17
blueCmdMy problem:11:17
blueCmdI have four insns: op_and_fetch, fetch_and_op, nand_and_fetch, fetch_and_nand11:17
blueCmdop is a iterator for [plus, minus, ior, xor, and] (might have forgotten a couple, not important)11:18
blueCmdthose work nicely for 4 byte operations11:18
blueCmdI can do some optimization for things that are not arithmetic (two masked logical operations will not touch things outside the mask, 3 & 3 = 3, 3 + 3 = 6 however)11:19
blueCmdsince I have 4 insns any way I go will either 1) make it slower for the 4 byte case, or 2) explode the insns into at least 8 insns11:20
blueCmdThis would be much nicer if I could just have a skeleton and use C instead of the weird RTL templating shit11:21
stekernhaha11:24
stekernI didn't get the "I have 4 insns", where's that restriction?11:24
blueCmdit's not a restriction per se, but that's how I implemented it11:31
blueCmdI might be able to consolidate them down to 2. hm11:32
blueCmdop_and_fetch and fetch_and_op are quite similar. I expanded them to 2 different insns since I got some weird errors, but I found a bug in the register constraints now, so it might be worth revisiting11:36
rahis it possible to test in-core power management strategies, like power gating, on an FPGA?11:55
rahI'm guessing not?11:55
blueCmdrah: I think it is11:57
blueCmdat least for stuff like "disabling the clock to part X"11:58
rahah right11:58
rahthat's clock gating11:58
blueCmdright, yeah.11:58
blueCmdpower gating would be harder I guess11:58
rahwaye11:59
raherr.. aye :-)11:59
blueCmdthat said, I doubt the whole fpga is powered all the time, so maybe it has some way of doing 'block level' power gating11:59
rah"Power gating uses low-leakage PMOS transistors as header switches to shut off power supplies to parts of a design in standby or sleep mode. NMOS footer switches can also be used as sleep transistors. ... Typically, high-Vt sleep transistors are used for power gating, in a technique also known as multi-threshold CMOS (MTCMOS). The sleep transistor sizing is an important design parameter."12:01
rahhttps://en.wikipedia.org/wiki/Power_gating12:01
rahwhich all sounds very much like it's specific to the silicon circuit12:01
rahI don't understand how FPGAs work but it sounds like this kind of fine-grained power gating is done at a lower level than the usual IP put into an FPGA12:02
rahout12:03
rahjust out of interest, are there any openrisc implementations that do clock gating?12:03
blueCmdstekern: stop cherrypicking and repeating the correct thing all the time12:04
blueCmdnow I'm down to 2 insn12:04
stekernblueCmd: wut?12:05
stekern;)12:05
blueCmd:D12:05
stekernwhere's the cherries?12:07
blueCmdthe cherrypicking would be 'I have 4 insns'12:10
blueCmd(which, interestingly enough, were the first words I wrote)12:11
* stekern is completely lost now12:11
stekernbear in mind, I don't have the code you are poking around in front of me ;)12:12
blueCmdstekern: I know! just forget about what I said :P12:13
blueCmdyou helped me, that's all you need to know :)12:13
stekerngreat! =)12:13
blueCmdstekern: waste 1 instruction (actual CPU instruction) and make the code easier to follow - or duplicate a bunch of code and save 1 insn12:14
stekernumm, depends how often that insn is used ;)12:15
blueCmdit's in the critical section between lwa and swa12:15
stekernok, maybe it's worthwhile to have a bit of dirty code for that12:15
stekern...but, since I asked a similar question this morning, you should know that I don't really know :P12:16
blueCmdhaha12:16
blueCmdmore details:12:16
blueCmda = 1 byte, b = 1 byte. a OP b should be one byte. that is true for all logical but not arithmetical operations12:17
blueCmdfor arithmetical I need to mask the result12:17
blueCmddoing that mask for all insns or just the arithmetical ones is what I'm wondering about.12:17
stekerntough call ;)12:22
blueCmdgonna go with easy code for now12:24
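The masking question above can be made concrete with a small C sketch. It assumes a hypothetical byte lane at bits 8..15 of a 32-bit word (LANE_MASK and the lane_* helpers are illustrative names only) and shows why a logical operation with a zero-extended operand stays inside the lane while an addition needs its result masked.

```c
#include <stdint.h>

#define LANE_MASK 0x0000ff00u   /* hypothetical: byte lane at bits 8..15 */

/* Logical op: the shifted operand is zero outside the lane, so XOR
 * cannot change any bit outside it. No extra mask is needed. */
static uint32_t lane_xor(uint32_t word, uint8_t b)
{
    return word ^ ((uint32_t)b << 8);
}

/* Arithmetic op without masking: a carry out of the lane corrupts
 * the neighbouring byte. */
static uint32_t lane_add_unmasked(uint32_t word, uint8_t b)
{
    return word + ((uint32_t)b << 8);
}

/* Arithmetic op with the result masked back into the lane, at the
 * cost of the extra AND in the critical section. */
static uint32_t lane_add_masked(uint32_t word, uint8_t b)
{
    uint32_t sum = (word & LANE_MASK) + ((uint32_t)b << 8);
    return (word & ~LANE_MASK) | (sum & LANE_MASK);
}
```

With the word 0x00AAFF55 and an operand of 1, the unmasked add yields 0x00AB0055 (the 0xAA neighbour is bumped to 0xAB), while the masked add yields 0x00AA0055. Applying the mask to every operation keeps one code path; applying it only to arithmetic saves one instruction for the logical ones.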
olofkrah: There are a lot of silicon-agnostic power management things that can be done14:11
olofkAnd all the FPGA vendors have some power analyzing tool. Especially Lattice are very focused on low-power FPGA14:11
olofkBut for the OpenRISC and its IP cores there hasn't really been enough demand for it yet14:12
raholofk: I see14:17
rahthanks14:17
raholofk: can I ask what silicon-agnostic things are there beyond clock gating?14:18
olofkrah: I was mostly thinking about clock gating actually14:20
olofkWell, lowering clock speeds would of course help too14:21
LoneTechthere are other ways to reduce toggling14:22
LoneTechlike unrolling sequences so you can slow the registers14:23
LoneTechgate level design is rather different from fpgas though; you can do things like use transmission gates to not drive unused paths, leaving their input capacitance to hold it. in fpga design you're mostly limited to needing a mux, which sometimes translates to a lut, and it hurts timing, power and area.14:27
olofkLoneTech: True.14:27
olofkBut there are some easy targets, like making sure to only enable block RAM when you actually need them. I suspect a lot of those are always on14:28
LoneTechyes14:28
rahI see14:44
mohessaidhello, I am right now facing a "segmentation fault" problem when running asterisk in or1ksim. is the problem due to system configuration (must have root privileges)?15:25
mohessaidasterisk was compiled with the glibc toolchain and linux (with asterisk on it) with the or1k-elf toolchain. I tried to launch it at startup using inittab and the same thing was happening.15:29
rah"Another notable feature is a rich set of SIMD instructions intended for digital signal processing."16:19
rahhttps://en.wikipedia.org/wiki/OpenRISC16:19
rahOpenRISC has a rich set of SIMD instructions? :-)16:20
stekern"yes", but no implementations of them ;)16:20
rahstekern: are you joking or are there really SIMD instructions in the spec?16:21
* rah has never read it16:21
rah"OpenRISC Vector/DSP eXtension (ORVDX64)"16:24
rahZOMG!16:24
stekernok, micro-hdmi cable *check*, micro-sdcard *check*, let's see if we can blow some life into the parallella17:03
juliusbquick Q, does fusesoc support running modelsim?17:11
stekernquick A, yes! ;)17:15
juliusbcool, a grep for modelsim in fusesoc/ gave me 1 hit17:16
stekernI think it required two lines to be commented out if you want to run it on a 64-bit machine though17:16
juliusbOK, so I'll install it with the VM that I'm preparing for this workshop17:16
juliusbthis bloody simulator takes up 4GB! Insane17:17
stekernwallentos mor1kx-dualcore demo worked with modelsim at least (and with verilator too)17:17
stekernthat's the only fusesoc thing I've tried together with modelsim17:18
olofkI'm running modelsim from time to time in FuseSoC17:27
olofkInteresting with just one hit in the code base. I take that as a good thing and say it's because the amount of modelsim-specific stuff is very little17:29
olofkBut I'm getting 10 hits17:29
stekernolofk: I tried running fusesoc in cygwin today btw17:30
stekern...didn't go too well17:30
olofkGood to know. I have wanted to try that out, but I haven't got access to a windows machine17:31
olofkCan you run cygwin in wine? :)17:31
stekernI'm not sure what was the problem, but the svn checkouts didn't create all the files for some reason17:31
stekerncould be the windows svn client too17:31
olofkThat's weird17:31
stekernI'll take a closer look at some point17:32
olofkSomeone else noticed an issue some months ago. Setting up the quartus files for de0_nano worked fine, but the windows version expected \\ instead of17:39
olofk(I wrote that in #opencores by mistake)17:39
stekernyeah, I didn't get that far17:42
stekernmaybe I'd be better off with the cygwin svn version17:43
stekernI first had some intentions trying to get things to run natively17:43
stekernbut gnuwin32 is just too outdated to be useful anymore...17:44
olofkWhat's gnuwin32?17:44
stekerngnu tools natively compiled for windows17:44
stekernhttp://gnuwin32.sourceforge.net/17:45
olofkah ok17:47
stekernthey're (or were) great, it's a pity no-one is maintaining them anymore17:48
olofkotoh, I wouldn't rule out the shaky openrisc repository either17:52
stekerntrue, it can have been a momentary issue right then when I tried it17:54
rfajardoBlueCmd, compiling Linux with the development toolchain didn’t solve the problem: “Warning: Ethernet packet length 0 too small.”17:54
olofkrfajardo: You're holding it wrong17:56
stekernrfajardo: are you actually trying to use the ethernet with a tun/tap device or does that happen even if you disable ethoc?17:59
rfajardoI didn’t try to disable it17:59
rfajardoIs ethoc the driver, or the configuration for the simulator?18:01
stekerntry disabling it (comment it out) in arch/openrisc/boot/dts/or1ksim.dts18:02
rfajardoI will try that soon, thanks.18:04
-!- rfajardo is now known as rschmidlin18:32
olofk:)18:33
_franck_olofk: are you the one who knows how the openrisc ethernet MAC works ?21:31
_franck_can we have a tx bd len not multiple of 4 ? I'm asking this because the wb master hardcodes wb_sel = 4'hf21:37
_franck_and if I send this ARP packet: http://pastie.org/9169818 last two bytes are not read from the memory21:38
_franck_must be a software bug. With Linux I get an extra 32 bits read.21:45
_franck_(I'm working on barebox)21:48
--- Log closed Tue May 13 00:00:46 2014
