--- Log opened Sat Sep 27 00:00:12 2014 | ||
poke53282 | stekern: I need 5 minutes of your time. https://github.com/s-macke/libffi/commit/0f316ab7c11b8315a838a6ae4645d36ff2c46f4c | 00:59 |
---|---|---|
poke53282 | Can you review this. Only the most severe errors. | 01:00 |
poke53282 | libffi is nothing more than a hack. :) | 01:01 |
poke53282 | It's Ok if you don't understand anything in this patch ;) | 01:09 |
stekern | poke53282: I'll take a look at some time during the day today | 04:22 |
stekern | or rather, I might take several looks during the day | 04:36 |
stekern | https://github.com/s-macke/libffi/commit/0f316ab7c11b8315a838a6ae4645d36ff2c46f4c#diff-5da41f37094d6c7103f9b01811117a6cR43 | 04:36 |
stekern | why do you need to save r14-r16 here? | 04:36 |
poke53282 | Thanks | 04:38 |
stekern | well, I see that you use them, but why do you need to use r14-r20? Can't you just save r5-r8 directly? | 04:38 |
poke53282 | r14-r20 are saved because they are used to save some parameters during two function calls. | 04:39 |
poke53282 | the stack pointer is not really safe here. So I cannot access these variables later without more logic. | 04:40 |
poke53282 | So, everything, which I need after the function call is saved in registers. | 04:40 |
stekern | ok, fair enough | 04:41 |
poke53282 | the function chain is the following. | 04:41 |
poke53282 | ffi_status ffi_prep_cif_machdep, ffi_call(C) ->ffi_call_SYSV(Assembler)->ffi_prep_args-> ret to ffi_call_SYSV(Assembler) -> function call -> ret to ffi_call_SYSV(Assembler) and handle the ret value | 04:43 |
poke53282 | I copied a lot from others. So, yes, this is the way I should handle this. | 04:44 |
poke53282 | the closure is a separate part. | 04:44 |
poke53282 | ffi_call -> arrays with args to C function , ffi_prep_closure_loc -> C function to array with args (uses a trampoline) | 04:45 |
poke53282 | and yes, there is at least one function which makes an assumption about gcc. Not sure how safe this is. But other architectures do the same. | 04:47 |
poke53282 | I mean ffi_closure_SYSV and the assembler code in the beginning and the end. | 04:50 |
poke53282 | and yes, it took me days to understand how ffi works. | 04:51 |
poke53282 | and I am not sure how we should call the abi | 04:57 |
poke53282 | there are in principle two options "sysv" and "eabi" | 04:57 |
-!- Netsplit *.net <-> *.split quits: enghong, ssvb, sb0, poke53282, FreezingCold, rjo, mboehnert, bentley`` | 17:37 | |
-!- Netsplit *.net <-> *.split quits: jonmasters, rah, Amadiro, xlro, zama, wallento, pecastro, blueCmd, martinboehnert, LoneTech, (+15 more, use /NETSPLIT to show all of them) | 17:39 | |
-!- Netsplit over, joins: martinboehnert, FreezingCold, blueCmd, mithro, sb0, kiwichris, wallento, pecastro, ssvb, jeremybennett (+23 more) | 17:46 | |
poke53282 | any news, stekern? Otherwise I will pull. | 18:13 |
poke53282 | I will make a pull request I mean | 18:14 |
stekern | poke53282: I've been away all day, so haven't had a chance to take a look | 18:29 |
stekern | but if you feel confident about it, don't let me slow you down | 18:29 |
poke53282 | Ahh well. They have waited one year, they can wait a few additional hours. | 18:34 |
stekern | poke53282: ok, I'll take a closer look then ;) | 18:57 |
stekern | I just added support to change the samplerate to my i2s core, prboom with sound works now! | 18:58 |
poke53282 | Perfect. | 19:02 |
poke53282 | any programs, which I should compile. timidity maybe? | 19:03 |
poke53282 | Or do you know nice programs related to sound? | 19:03 |
poke53282 | mod player, sid player. | 19:03 |
poke53282 | what's the latency? | 19:06 |
poke53282 | what's the number of interrupts per second? | 19:09 |
stekern | periods of 1024 samples are being used, and the sample freq is 22050 | 19:20 |
stekern | sorry, 1024 bytes I mean | 19:21 |
stekern | which makes 128 samples at 2 channels that are of format S32 | 19:21 |
poke53282 | that means around 172 interrupts? That's bad. | 19:23 |
stekern | bad for who? =) | 19:23 |
poke53282 | for me. | 19:23 |
stekern | maybe you can force larger periods by the hw constraints | 19:23 |
poke53282 | and my emulator. 50 interrupts per second is possible right now. | 19:23 |
poke53282 | of course with some tricks 200 interrupts per second is possible. But I don't want to add more logic. | 19:24 |
poke53282 | A latency of 50ms-100ms is fine for me. | 19:24 |
stekern | I get buffer underruns in mi2 at 22050, maybe it'll work with 11025 | 19:26 |
stekern | I still haven't done a wb interface to the i2s core, so I have to rebuild the FPGA image when I want to change the sample rate =P | 19:27 |
poke53282 | Ahh, that's fine for now. | 19:28 |
poke53282 | buffer underruns? You mean, that the converting step takes too much time? | 19:30 |
olofk | stekern: I don't like that you're getting buffer underruns. What do you think of the chances of transmitting continuous data with a 30.72MHz I/O Clock? :) | 19:31 |
stekern | poke53282, olofk: it means that the cpu can't feed the buffers fast enough | 19:38 |
stekern | poke53282: ffi_prep_closure_loc uses r13, gcc uses r13 for internal temp storage, is that a problem? | 19:39 |
poke53282 | Well, that's the one function which makes some assumptions. r13 is set in the trampoline. Then the header of ffi_prep_closure is executed. And we hope that r13 survives this step. | 19:40 |
poke53282 | the trampoline function is that at tramp[0] = ..... | 19:42 |
poke53282 | So if the program executes a function which looks like an ordinary C function like myfunction(arg1, arg2, ...) it executes the trampoline instead, which then executes the ffi_prep_closure_loc which analyzes the arguments and the stack. | 19:43 |
poke53282 | I am not really satisfied with this function. But others do the same crazy stuff. | 19:44 |
poke53282 | There is another option to use the register parameters instead and save the registers on the stack before. | 19:46 |
stekern | yes, but I'm particularly speaking about r13 | 19:48 |
poke53282 | Oh, I can take any temporary register here. | 19:48 |
poke53282 | 31? 29? | 19:49 |
stekern | since gcc takes that into use as a temp reg for "internal" use | 19:49 |
poke53282 | I also use r15 and r17 | 19:49 |
stekern | not sure if that matters at all, just sharing the knowledge ;) | 19:49 |
poke53282 | and I am not sure if gcc protects this register if I use it like in this function. | 19:50 |
poke53282 | so, the preamble of the C function usually doesn't use the temporary registers. | 19:50 |
poke53282 | However, I am a little bit afraid about the instruction shaking. | 19:50 |
stekern | building instructions by putting together hex numbers into a buffer makes me think no ;) | 19:51 |
poke53282 | to make them more independent. | 19:51 |
stekern | anyway, apart from that it looks good as far as I can see | 19:51 |
poke53282 | I like this. It's so easy with OpenRISC to build such functions. And it is readable in my opinion. | 19:51 |
poke53282 | the other option is to write the trampoline in sysv.S and use offsets. | 19:52 |
poke53282 | But that's so terrible with handling the offsets. | 19:52 |
poke53282 | any idea why unwinding does not work? | 19:54 |
poke53282 | I guess we use the eh_frame for this and not the frame pointer? | 19:54 |
poke53282 | I tried it for several hours. No success. | 19:54 |
poke53282 | try{} catch{} does not work. | 19:54 |
-!- mithro_ is now known as mithro | 20:44 | |
poke53282 | http://jor1k.com/jor1k.gif | 20:49 |
poke53282 | Just a small advertisement on the github page. | 20:49 |
olofk | poke53282: Awesome :) | 21:12 |
poke53282 | Done with imagemagick. It's such a nice tool. | 21:14 |
stekern | @11025 mi2 runs without underruns | 21:39 |
olofk | stekern: | 21:40 |
poke53282 | mmh, but 11kHz is not really hifi. | 21:40 |
olofk | Maybe it's time to start working on wider buses | 21:40 |
poke53282 | you can send two samples at a time with 16 Bit. | 21:40 |
olofk | Or perhaps not. I just figured that bus congestion might be a problem | 21:40 |
stekern | poke53282: yes, but I'm not sure if that will help | 21:41 |
stekern | (that much) | 21:42 |
stekern | writing 16-bit values into 32-bit aligned offsets or 16-bit aligned offset will not make a huge difference | 21:43 |
poke53282 | yeah, probably not. Does alsa use floating point for resampling? | 21:43 |
stekern | resampling? | 21:44 |
stekern | olofk: we should do that, but the problem is not that the cpu isn't fast enough to write the values into the buffer, nor is the problem that the wb_streamer isn't fast enough to pull the data out of the memory | 21:48 |
poke53282 | With your current approach, there is no need for resampling. But maybe alsa does some mixing and some other conversions like volume. | 21:48 |
stekern | the problem is that the data has to be generated before it can be written into the buffer | 21:48 |
stekern | poke53282: I don't think that is very resource hungry in such a case | 21:49 |
stekern | playing a .wav without resampling, takes around 4-5% cpu | 21:50 |
poke53282 | do you have problems with speaker-test? | 21:50 |
poke53282 | and with? | 21:50 |
stekern | yes, with the sine test | 21:50 |
stekern | (because the sine wave is generated real time with floating point) | 21:50 |
poke53282 | with resampling I mean | 21:51 |
poke53282 | you can increase all your buffers. | 21:52 |
poke53282 | this will increase the latency but give the application more time. | 21:52 |
stekern | that will only help if the application has spare time | 21:55 |
poke53282 | does monkey island and prboom run significantly slower when you enable sound? | 21:56 |
stekern | prboom just slightly, and that works with 22050, but we only have sound fx enabled there | 21:57 |
poke53282 | I guess mplayer will have the same problems. | 21:57 |
stekern | mi2 and dott is a lot slower if I run it on single-core | 21:57 |
stekern | but audio is threaded, so on multi-core it doesn't make much of a difference | 21:58 |
poke53282 | :) Looks like you implemented smp right the correct time. | 21:58 |
poke53282 | ... at the right time | 22:00 |
stekern | prboom has that disabled due to some bug | 22:02 |
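
The interrupt-rate estimate from 19:20 to 19:24 can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function names `period_stats` and `min_period_bytes` are made up for this note, and it assumes one interrupt per completed period, with S32 taken as 4 bytes per sample.

```python
import math


def period_stats(period_bytes, channels, bytes_per_sample, sample_rate_hz):
    """Return (frames per period, interrupts per second, period length in ms).

    Assumes the hardware raises exactly one interrupt per completed period.
    """
    frame_bytes = channels * bytes_per_sample       # one sample for every channel
    frames_per_period = period_bytes // frame_bytes
    interrupts_per_second = sample_rate_hz / frames_per_period
    period_ms = 1000.0 * frames_per_period / sample_rate_hz
    return frames_per_period, interrupts_per_second, period_ms


def min_period_bytes(max_interrupts_per_second, channels, bytes_per_sample,
                     sample_rate_hz):
    """Smallest period size in bytes that stays at or below the interrupt limit."""
    frames = math.ceil(sample_rate_hz / max_interrupts_per_second)
    return frames * channels * bytes_per_sample


if __name__ == "__main__":
    # Numbers quoted in the log: 1024-byte periods, 2 channels, S32, 22050 Hz.
    print(period_stats(1024, 2, 4, 22050))
    # Period size needed to stay at 50 interrupts per second at the same rate.
    print(min_period_bytes(50, 2, 4, 22050))
```

With the values from the log this gives 128 frames per period and about 172 interrupts per second, matching the estimate at 19:23. Staying at 50 interrupts per second at 22050 Hz would need periods of at least 441 frames, which is 3528 bytes for two S32 channels.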
--- Log closed Sun Sep 28 00:00:13 2014 |
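
The two directions described at 04:45 (ffi_call going from argument arrays to a C function, and a closure going from a C-style call back to argument arrays via a trampoline) can be tried from Python, whose ctypes module is built on libffi. This is a sketch on whatever host runs it, not the OpenRISC port under review; the names `BINARY_INT_FN`, `add`, and `calls` are invented for the example.

```python
import ctypes

# C prototype: int (*)(int, int). CFUNCTYPE builds a C function pointer type.
BINARY_INT_FN = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)

calls = []  # records the arguments the closure hands back to Python


def add(a, b):
    """Plain Python function that ends up behind a C-callable pointer."""
    calls.append((a, b))
    return a + b


# Closure direction: wrapping a Python callable yields a real C function
# pointer backed by a libffi closure (a small generated trampoline).
c_add = BINARY_INT_FN(add)

# Forward direction: calling the pointer marshals the Python arguments into
# the C calling convention, jumps to the trampoline, and lands in add().
result = c_add(2, 3)

# The trampoline has an ordinary code address, like any C function.
address = ctypes.cast(c_add, ctypes.c_void_p).value

if __name__ == "__main__":
    print(result, calls, hex(address))
```

Which registers the trampoline uses to pass the closure pointer is decided by each libffi backend, which is the part the r13 question at 19:39 is about.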