IRC logs for #openrisc Friday, 2014-11-14

--- Log opened Fri Nov 14 00:00:27 2014
poke53282The last two hours I tried to figure out the speed benefit of dynamic recompilation in Javascript. In my example it is a factor of five in Firefox, but just a factor of two in Chrome and IE.  Let's see.04:45
stekernpoke53282: my boys are complaining about chrome being slow when they play some games, so they tend to use firefox for that06:58
stekernI've currently strayed into a non-openrisc project... porting scummvm to windows phone 8.106:59
poke53282:) Yes. I think Chrome is only good for non-optimized code. The first few months when I started working on jor1k Chrome was always the fastest. For the last 1.5 years, this has no longer been the case.07:17
poke53282ScummVM doesn't exist for Windows Phone yet?07:17
poke53282I thought ScummVM was ported to every device. Like Doom :)07:18
stekernnope, it's not ported. a big factor why, is that there's no SDL 1.2 port for windows phone07:51
stekernthere's a 2.0 one, but that's highly experimental, and since scummvm uses 1.2, it doesn't help much07:51
stekernthe port I'm working on doesn't use SDL at all, so it's quite a bit of work07:52
olofkstekern: The ScummVM port for Sailfish is SDL2-based07:56
olofkSailfish runs Wayland07:56
olofkSo that's why they need SDL207:56
olofkSo there must be an SDL2-port of ScummVM available somewhere I guess07:57
olofkIf you want to cheat :)07:57
poke53282Maybe it is easier to port Dosbox08:06
poke53282The lesson I learned today: Optimizing the top 20 hot spots (~200 instructions) in the Linux kernel doesn't change anything. Instead managing the heuristics to find those hot spots increases the loading time by 50% :(08:24
olofkpoke53282: I did an analysis of the most executed functions during linux boot a while ago08:26
poke53282At the moment I make heuristics from one jump instruction (inclusive) to the next. The average code block has 6 instructions. But probably I have to optimize on the function level in the end.08:28
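A minimal sketch of the kind of per-block heuristic described here, assuming a block is identified by the address of its first instruction and becomes "hot" after a fixed number of entries. The class name `BlockProfiler`, the method `enter`, and the threshold value are hypothetical and not taken from jor1k.

```javascript
// Hypothetical sketch: count how often each code block is entered.
// A block is assumed to run from one jump target up to and including
// the next jump instruction, and is keyed by its start address.
const HOT_THRESHOLD = 1000; // assumed value, would need tuning

class BlockProfiler {
  constructor(threshold = HOT_THRESHOLD) {
    this.threshold = threshold;
    this.counts = new Map(); // block start address -> entry count
    this.hot = new Set();    // start addresses that crossed the threshold
  }

  // Call once each time execution enters the block starting at `pc`.
  // Returns true exactly once, at the moment the block becomes hot.
  enter(pc) {
    if (this.hot.has(pc)) return false;
    const n = (this.counts.get(pc) || 0) + 1;
    if (n >= this.threshold) {
      this.counts.delete(pc); // the counter is no longer needed
      this.hot.add(pc);
      return true;
    }
    this.counts.set(pc, n);
    return false;
  }
}
```

Every block entry pays for a map lookup, which is one plausible source of the bookkeeping overhead mentioned above; counting at function granularity would mean fewer, larger units to track.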
olofkIIRC memcpy and memset were the biggest contributors. I did some experiments to optimize that, and by stealing the memset code from Microblaze, it dropped down from the top 1008:28
olofkI wonder if I still have the results somewhere08:30
poke53282yes, our memcopy is a bit ... let's say simple.08:31
olofkWe could say that it's optimized for code size :)08:31
olofkCan you use DMA for memcpy easily under Linux btw?08:32
poke53282exactly :)08:32
olofkI mean if you have a DMA component that you can use. We have one in verilog now08:32
poke53282Not sure. Does it make sense? Probably only for sizes > 1MB.08:32
olofkTrue. I haven't got a clue how large the common memcpy is08:33
poke53282The overhead for < 10kB is probably too much.08:33
olofkAs long as you can let the CPU do other things in the meantime, I think that the overhead would be quite small. Setting up the DMA only requires a few instructions, but the code paths might be longer08:34
olofkDo SDRAM and DDR2 use the same commands? I'm getting a bit interested in reusing the state machine from wb_sdram_ctrl in my DDR2 controller08:36
poke53282Interrupt handling, scheduling. All that is overhead.08:37
poke53282I don't have a clue about those memory chips.08:37
olofkYeah, it's probably more overhead than I first thought, but still interesting to know if the Linux kernel can be set up to use it for larger blocks08:38
poke53282By the way. At the moment my code tries to find hotspots and exchanges those spots by some compiled Javascript lines. For example like this:
poke53282This might include the answer08:40
olofkpoke53282: What was the original instruction sequence that you replaced?08:44
olofkAnd DMA for memcpy seems not worth the trouble, even though I think that our break-even block size would be much smaller than 1MB since we have smaller caches than an x8608:45
poke53282I already found an error :)08:45
poke53282This is the work of around 5 hours. So still a lot to do. And tons of ways to optimize. But maybe we might get Descent working with a good framerate :)08:49
olofkWoohoo!!! Keep working :)08:51
poke53282But I hope, that the browsers will survive such a torture. With the current way I have to recompile 10000-20000 functions just to boot the kernel.08:51
poke53282blocks, not functions.08:52
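A rough sketch of what recompiling a block into JavaScript and caching the result could look like. The instruction objects below are a toy format, not the real OpenRISC encoding, and `emitLine`, `getCompiledBlock`, and the `compiled` cache are hypothetical names; only `new Function` and `Map` are standard JavaScript.

```javascript
// Toy instruction format (an assumption, not real OpenRISC decoding):
//   { op: "addi", rd, ra, imm }  means  r[rd] = r[ra] + imm
//   { op: "add",  rd, ra, rb }   means  r[rd] = r[ra] + r[rb]
function emitLine(ins) {
  switch (ins.op) {
    case "addi":
      return `r[${ins.rd}] = (r[${ins.ra}] + ${ins.imm}) | 0;`;
    case "add":
      return `r[${ins.rd}] = (r[${ins.ra}] + r[${ins.rb}]) | 0;`;
    default:
      throw new Error("unsupported op: " + ins.op);
  }
}

// Cache of generated functions, keyed by block start address,
// so each block is translated at most once.
const compiled = new Map();

function getCompiledBlock(pc, instructions) {
  let fn = compiled.get(pc);
  if (fn === undefined) {
    const body = instructions.map(emitLine).join("\n");
    fn = new Function("r", body); // r is the register file
    compiled.set(pc, fn);
  }
  return fn;
}
```

With `r` as an `Int32Array(32)` and `r[1] = 5`, running the block `addi r2, r1, 3` followed by `add r3, r2, r1` leaves `r[2] === 8` and `r[3] === 13`. Because of the cache, 10000-20000 blocks would mean that many `new Function` calls in total, not one per execution.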
poke53282But I have enough ideas to prevent such cases.08:52
poke53282Let's see08:52
poke53282mv poke /home/bed && sleep 2880008:54
stekernolofk: hmm, can't seem to find that sailfish port09:03
stekernbut it's probably still more viable to do a 'pure' port, since the existing wp SDL port is experimental09:04
stekernand besides that, SDL wouldn't help me with what I'm currently busy with, file system operations09:12
olofkstekern: Here's the scummvm port for Sailfish if you decide to switch over to SDL2
olofk_florent_, ysionneau : I'm reading the DFI spec now, and I get the impression that only the Read and Write Interfaces need two phases, and that I can use a single phase for the Control Interface. Any thoughts?11:12
olofkI'm thinking that signals such as address, ras, cas and we will be the same in both phases anyway11:14
_florent_hi olofk, in fact you have to:11:32
_florent_- choose wrcmdphase and wrphase according to your write latency11:32
_florent_- choose rdcmdphase and rdphase according to your read latency11:32
_florent_- PRECHARGE, ACTIVATE or REFRESH can be done on any phase11:32
_florent_(Hope I'm not mixing things between DDR, DDR2, DDR3...)11:33
olofkah ok.11:40
olofkI see it now. From the spec : "The PHY must be able to accept a command on all phases to be DFI compliant. If the MC is only using certain phases, the11:41
olofkPHY must be appropriately connected to properly interpret the command stream"11:41
-!- rah_ is now known as rah11:48
stekernolofk: so how do I get the sources from that?13:33
stekernnevermind, I found it13:42
stekernok, it's based on some random commit and no commit history...14:06
stekernI think I'll try to compile the SDL 2.0 port and build against that and see what happens14:07
--- Log closed Sat Nov 15 00:00:29 2014
