IRC logs for #openrisc Saturday, 2014-09-27

--- Log opened Sat Sep 27 00:00:12 2014
poke53282stekern: I need 5 minutes of your time.
poke53282Can you review this. Only the most severe errors.01:00
poke53282libffi is nothing more than a hack. :)01:01
poke53282It's Ok if you don't understand anything in this patch ;)01:09
stekernpoke53282: I'll take a look at some time during the day today04:22
stekernor rather, I might take several looks during the day04:36
stekernwhy do you need to save r14-r16 here?04:36
stekernwell, I see that you use them, but why do you need to use r14-r20? Can't you just save r5-r8 directly?04:38
poke53282r14-r20 are saved because they are used to save some parameters during two function calls.04:39
poke53282the stack pointer is not really safe here. So I cannot access these variables later without more logic.04:40
poke53282So, everything, which I need after the function call is saved in registers.04:40
stekernok, fair enough04:41
poke53282the function chain is the following.04:41
poke53282ffi_status ffi_prep_cif_machdep,  ffi_call(C) ->ffi_call_SYSV(Assembler)->ffi_prep_args-> ret to ffi_call_SYSV(Assembler) -> function call -> ret to ffi_call_SYSV(Assembler) and handle the ret value04:43
poke53282I copied a lot from others. So, yes, this is the way I should handle this.04:44
poke53282the closure is a separate part.04:44
poke53282ffi_call -> arrays with args to C function ,      ffi_prep_closure_loc -> C function to array with args  (uses a trampoline)04:45
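The two directions described above (arrays of arguments to a C function, and a C function back to arrays of arguments) can be tried from Python, whose ctypes module is built on libffi. This is only a sketch of the idea, not the OpenRISC port under review: the names BINOP, py_add and c_add are made up for illustration, and it assumes the local libffi build supports closures.

```python
import ctypes

# C signature being modelled: int (*)(int, int)
BINOP = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)

seen = []  # records the arguments the Python side receives

def py_add(a, b):
    # Reached through the closure: the raw C arguments have been
    # unpacked into Python values before this runs.
    seen.append((a, b))
    return a + b

# Wrapping a Python callable makes ctypes set up a closure with a
# small trampoline, so that it looks like an ordinary C function.
c_add = BINOP(py_add)

# Calling the wrapper packs the arguments into an array and performs
# a foreign call on the trampoline address.
result = c_add(2, 3)
print(result, seen)
```

The call goes out through the foreign-call path and comes back through the closure path, so one round trip exercises both halves of the chain.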
poke53282and yes, there is at least one function which makes an assumption about gcc. Not sure how safe this is. But other architectures do the same.04:47
poke53282I mean ffi_closure_SYSV  and the assembler code in the beginning and the end.04:50
poke53282and yes, it took me days to understand how ffi works.04:51
poke53282and I am not sure how we should call the abi04:57
poke53282there are in principle two options "sysv" and "eabi"04:57
-!- Netsplit *.net <-> *.split quits: enghong, ssvb, sb0, poke53282, FreezingCold, rjo, mboehnert, bentley``17:37
-!- Netsplit *.net <-> *.split quits: jonmasters, rah, Amadiro, xlro, zama, wallento, pecastro, blueCmd, martinboehnert, LoneTech, (+15 more, use /NETSPLIT to show all of them)17:39
-!- Netsplit over, joins: martinboehnert, FreezingCold, blueCmd, mithro, sb0, kiwichris, wallento, pecastro, ssvb, jeremybennett (+23 more)17:46
poke53282any news, stekern? Otherwise I will pull.18:13
poke53282I will make a pull request I mean18:14
stekernpoke53282: I've been away all day, so haven't had a chance to take a look18:29
stekernbut if you feel confident about it, don't let me slow you down18:29
poke53282Ahh well. They have waited one year, they can wait a few additional hours.18:34
stekernpoke53282: ok, I'll take a closer look then ;)18:57
stekernI just added support to change the samplerate to my i2s core, prboom with sound works now!18:58
poke53282any programs, which I should compile. timidity maybe?19:03
poke53282Or do you know nice programs related to sound?19:03
poke53282mod player, sid player.19:03
poke53282what's the latency?19:06
poke53282what's the number of interrupts per second?19:09
stekernperiods of 1024 samples are being used, and the sample freq is 2205019:20
stekernsorry, 1024 bytes I mean19:21
stekernwhich makes 128 samples at 2 channels that are of format S3219:21
poke53282that means around 172 interrupts? That's bad.19:23
stekernbad for who? =)19:23
poke53282for me.19:23
stekernmaybe you can force larger periods by the hw constraints19:23
poke53282and my emulator. 50 interrupts per second is possible right now.19:23
poke53282of course with some tricks 200 interrupts per second is possible. But I don't want to add more logic.19:24
poke53282A latency of 50ms-100ms is fine for me.19:24
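The interrupt figures above follow from simple arithmetic on the period size. A minimal sketch, assuming one interrupt per period (the usual ALSA arrangement, but not stated in the log); the function names are made up for illustration, the numbers are the ones quoted above.

```python
import math

PERIOD_BYTES = 1024     # period size quoted in the log
CHANNELS = 2            # stereo
BYTES_PER_SAMPLE = 4    # S32 = signed 32-bit samples
SAMPLE_RATE = 22050     # Hz

FRAME_BYTES = CHANNELS * BYTES_PER_SAMPLE  # one sample for every channel

def frames_per_period(period_bytes):
    return period_bytes // FRAME_BYTES

def interrupts_per_second(period_bytes, rate=SAMPLE_RATE):
    # Assumption: the hardware raises one interrupt per period.
    return rate / frames_per_period(period_bytes)

def period_ms(period_bytes, rate=SAMPLE_RATE):
    # Playback time covered by a single period.
    return 1000.0 * frames_per_period(period_bytes) / rate

def period_bytes_for(max_interrupts, rate=SAMPLE_RATE):
    # Smallest period, in bytes, that stays at or below the given rate.
    return math.ceil(rate / max_interrupts) * FRAME_BYTES

print(frames_per_period(PERIOD_BYTES))      # frames in one period
print(interrupts_per_second(PERIOD_BYTES))  # interrupts per second
print(period_ms(PERIOD_BYTES))              # milliseconds per period
print(period_bytes_for(50))                 # period size for 50 interrupts/s
```

With these numbers a 1024-byte period holds 128 frames and lasts under 6 ms, which gives the roughly 172 interrupts per second mentioned above. Staying at 50 interrupts per second would need periods of about 3.5 KB at the same rate and sample format.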
stekernI get buffer underruns in mi2 at 22050, maybe it'll work with 1102519:26
stekernI still haven't done a wb interface to the i2s core, so I have to rebuild the FPGA image when I want to change the sample rate =P19:27
poke53282Ahh, that's fine for now.19:28
poke53282buffer underruns? You mean, that the converting step takes too much time?19:30
olofkstekern: I don't like that you're getting buffer underruns. What do you think of the chances of transmitting continuous data with a 30.72MHz I/O Clock? :)19:31
stekernpoke53282, olofk: it means that the cpu can't feed the buffers fast enough19:38
stekernpoke53282: ffi_prep_closure_loc uses r13, gcc use r13 for internal temp storage, is that a problem?19:39
poke53282Well, that's the one function which makes some assumptions. r13 is set in the trampoline. Then the header of ffi_prep_closure is executed. And we hope that r13 survives this step.19:40
poke53282the trampoline function is that at tramp[0] = .....19:42
poke53282So if the program executes a function which looks like an ordinary C function like myfunction(arg1, arg2, ...) it executes the trampoline instead, which then executes the ffi_prep_closure_loc which analyzes the arguments and the stack.19:43
poke53282I am not really satisfied with this function. But others do the same crazy stuff.19:44
poke53282There is another option to use the register parameters instead and save the registers on the stack before.19:46
stekernyes, but I'm particularly speaking about r1319:48
poke53282Oh, I can take any temporary register here.19:48
poke5328231?  29?19:49
stekernsince gcc take that into use as a temp reg for "internal" use19:49
poke53282I also use r15 and r1719:49
stekernnot sure if that matters at all, just sharing the knowledge ;)19:49
poke53282and I am not sure if gcc protects this register if I use it like in this function.19:50
poke53282so, the preamble of the C function usually doesn't use the temporary registers.19:50
poke53282However, I am a little bit afraid about the instruction shaking.19:50
stekernbuilding instructions by putting together hex numbers into a buffer makes me think no ;)19:51
poke53282to make them more independent.19:51
stekernanyway, apart from that it looks good as far as I can see19:51
poke53282I like this. It's so easy with OpenRISC to build such functions. And it is readable in my opinion.19:51
poke53282the other option is to write the trampoline in sysv.S and use offsets.19:52
poke53282But that's so terrible with handling the offsets.19:52
poke53282any idea why unwinding does not work?19:54
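The "hex numbers into a buffer" approach discussed above can be sketched as follows. This is a rough illustration, not the code from the patch: the opcode values are taken from the OpenRISC 1000 instruction encodings as best I recall them, the six-word layout and the helper names are hypothetical, r13 is the register named in the log, and using r15 as the jump register is an assumption.

```python
def l_movhi(rd, imm16):
    # l.movhi rD,K : load K into the upper 16 bits of rD
    return (0x06 << 26) | (rd << 21) | (imm16 & 0xFFFF)

def l_ori(rd, ra, imm16):
    # l.ori rD,rA,K : rD = rA | K
    return (0x2A << 26) | (rd << 21) | (ra << 16) | (imm16 & 0xFFFF)

def l_jr(rb):
    # l.jr rB : jump to the address held in rB
    return (0x11 << 26) | (rb << 11)

L_NOP = 0x15000000  # l.nop, used here to fill the jump delay slot

def load_address(rd, addr):
    # Two instructions build a full 32-bit constant in rD.
    return [l_movhi(rd, addr >> 16), l_ori(rd, rd, addr)]

def build_trampoline(closure_addr, handler_addr, closure_reg=13, jump_reg=15):
    # Hypothetical layout: pass the closure address in one temporary
    # register, then jump to the common handler through another.
    words = load_address(closure_reg, closure_addr)
    words += load_address(jump_reg, handler_addr)
    words += [l_jr(jump_reg), L_NOP]
    return words

for word in build_trampoline(0x12345678, 0xCAFEBABE):
    print(f"{word:08x}")
```

Because every instruction is one 32-bit word with fixed fields, the buffer can be filled with plain shifts and ORs. The cost is the one raised above: the handler has to read the temporary register before the compiler-generated prologue reuses it.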
poke53282I guess we use the eh_frame for this and not the frame pointer?19:54
poke53282I tried it for several hours. No success.19:54
poke53282try{} catch{} does not work.19:54
-!- mithro_ is now known as mithro20:44
poke53282Just a small advertisement on the github page.20:49
olofkpoke53282: Awesome :)21:12
poke53282Done with imagemagick. It's such a nice tool.21:14
stekern@11025 mi2 runs without underruns21:39
poke53282mmh, but 11kHz is not really hifi.21:40
olofkMaybe it's time to start working on wider buses21:40
poke53282you can send two samples at a time with 16 Bit.21:40
olofkOr perhaps not. I just figured that bus congestion might be a problem21:40
stekernpoke53282: yes, but I'm not sure if that will help21:41
stekern(that much)21:42
stekernwriting 16-bit values into 32-bit aligned offsets or 16-bit aligned offset will not make a huge difference21:43
poke53282yeah, probably not. Does alsa use floating point for resampling?21:43
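For reference, the idea of sending two 16-bit samples per 32-bit word looks roughly like this. The word layout (left sample in the upper half, big-endian) is an assumption and not something the log says about the i2s core; the helper names are made up for illustration.

```python
import struct

SAMPLE_RATE = 22050  # Hz, from the log
CHANNELS = 2

def pack_s16_pair(left, right):
    # Assumed layout: left sample in the upper 16 bits, right in the lower.
    return struct.unpack(">I", struct.pack(">hh", left, right))[0]

def bytes_per_second(bytes_per_sample, rate=SAMPLE_RATE, channels=CHANNELS):
    return rate * channels * bytes_per_sample

print(f"{pack_s16_pair(1, -1):08x}")
print(bytes_per_second(4))  # S32 stereo
print(bytes_per_second(2))  # S16 stereo
```

Packing halves the bytes moved per second (176400 versus 88200 at 22050 Hz stereo), but it leaves the cost of generating each sample untouched.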
stekernolofk: we should do that, but the problem is not that the cpu isn't fast enough to write the values into the buffer, nor is the problem that the wb_streamer isn't fast enough to pull the data out of the memory21:48
poke53282With your current approach, there is no need for resampling. But maybe ALSA also does some mixing and some other conversions like volume.21:48
stekernthe problem is that the data has to be generated before it can be written into the buffer21:48
stekernpoke53282: I don't think that is very resource hungry in such a case21:49
stekernplaying a .wav without resampling, takes around 4-5% cpu21:50
poke53282do you have problems with speaker-test?21:50
poke53282and with?21:50
stekernyes, with the sine test21:50
stekern(because the sine wave is generated real time with floating point)21:50
poke53282with resampling I mean21:51
poke53282you can increase all your buffers.21:52
poke53282this will increase the latency but give the application more time.21:52
stekernthat will only help if the application has spare time21:55
poke53282does monkey island and prboom run significantly slower when you enable sound?21:56
stekernprboom just slightly, and that works with 22050, but we only have sound fx enabled there21:57
poke53282I guess mplayer will have the same problems.21:57
stekernmi2 and dott is a lot slower if I run it on single-core21:57
stekernbut audio is threaded, so on multi-core it doesn't make much of a difference21:58
poke53282:) Looks like you implemented smp right the correct time.21:58
poke53282... at the right time22:00
stekernprboom has that disabled due to some bug22:02
--- Log closed Sun Sep 28 00:00:13 2014
