IRC logs for #openrisc Saturday, 2015-09-12

--- Log opened Sat Sep 12 00:00:40 2015
wbxstekern: hi09:52
andrzejrI'm having problems booting to code in RAM compiled with gcc+newlib. It works for hand-written assembly code.10:27
andrzejrWhat is the meaning of OFFSET and BOOTROM_ADDR and what are their correct values? ATM, I'm using 0 for both.10:28
andrzejralso, should I upload the .hex file (it contains some data before the actual code) or .vh file (this starts with actual code)10:30
andrzejr.elf has a reset vector at 0x100, in .vh the same code starts at 0x0, in .hex an extra header is added so the code starts at 0x140. Which file should I download to RAM?11:59
olofkandrzejr: Are you talking about the bootloaders in or1k_bootloaders?14:07
andrzejrolofk, I am trying to compile and load "hello.c" (newlib example). I'm writing either .vh or .hex directly to RAM and then jump to 0x00000000.15:16
andrzejrthis works with my own assembly code and doesn't work with newlib. I suspect this may be caused by boot vectors.15:16
olofkandrzejr: Is this in simulation, or hw?16:27
olofkAnd 0x100 is the default OpenRISC reset vector16:28
olofkOnly thing that uses 0x0 are my handwritten ASM bootloaders, because they can't afford to waste 0x100 bytes16:29
olofkWhat are the .hex files you are talking about? Are they generated by or1k-elf-objcopy?16:30
andrzejrolofk, 0x100 matches contents of .elf file but after or1k-elf-objcopy->mkimage->or1k-elf-objcopy I get a .hex file which seems to contain an extra 0x40-byte header17:48
andrzejrI will modify my debug UART code to download .elf file directly (currently it requires a .vh file)17:51
andrzejrunfortunately I don't have a functioning JTAG thanks to always helpful guys at Digilent :-/17:52
andrzejrlooks like converting .elf to .hex directly (single or1k-elf-objcopy run) maintains the 0x100 offset. Now I only need a way of uploading .hex file to the chip.18:13
olofkandrzejr: mkimage inserts a 64 byte (0x40) header18:39
olofkI created the spi_uimage_loader to parse that header and copy the contents to RAM18:41
olofkI actually started working on a HEX loader a while ago18:41
andrzejrI see, is that related to u-boot?18:41
olofkYes, I decided to use the u-boot format since there are already tools available to create an image18:41
olofkSo I didn't have to invent anything myself18:41
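[Editor's note] A minimal sketch of what parsing that header involves, assuming the legacy u-boot image layout (64 bytes, all multi-byte fields big-endian, magic 0x27051956). This is not the code from spi_uimage_loader; the function names `build_uimage` and `parse_uimage` are hypothetical and exist only for this illustration.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956
# Seven big-endian 32-bit words, four single-byte fields, a 32-byte name.
HEADER_FMT = ">7I4B32s"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 64 bytes, i.e. 0x40


def build_uimage(payload, load, entry, name=b"demo"):
    """Hypothetical helper: prepend a legacy-style header to payload."""
    dcrc = zlib.crc32(payload)
    # The header CRC is computed with its own field set to zero.
    fields = [UIMAGE_MAGIC, 0, 0, len(payload), load, entry, dcrc,
              0, 0, 0, 0, name]
    hcrc = zlib.crc32(struct.pack(HEADER_FMT, *fields))
    fields[1] = hcrc
    return struct.pack(HEADER_FMT, *fields) + payload


def parse_uimage(blob):
    """Validate the header and return (info, payload)."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("image shorter than the header")
    (magic, hcrc, timestamp, size, load, entry, dcrc,
     os_id, arch, img_type, comp, name) = struct.unpack(
        HEADER_FMT, blob[:HEADER_SIZE])
    if magic != UIMAGE_MAGIC:
        raise ValueError("bad magic: 0x%08x" % magic)
    # Recompute the header CRC with the hcrc field (bytes 4..7) zeroed.
    zeroed = blob[:4] + b"\x00" * 4 + blob[8:HEADER_SIZE]
    if zlib.crc32(zeroed) != hcrc:
        raise ValueError("header CRC mismatch")
    payload = blob[HEADER_SIZE:HEADER_SIZE + size]
    if len(payload) != size or zlib.crc32(payload) != dcrc:
        raise ValueError("payload truncated or data CRC mismatch")
    info = {"size": size, "load": load, "entry": entry,
            "name": name.rstrip(b"\x00").decode("ascii", "replace")}
    return info, payload


if __name__ == "__main__":
    image = build_uimage(b"\x15\x00\x00\x00" * 4, load=0x0, entry=0x100)
    info, payload = parse_uimage(image)
    print(info, len(payload))
```

With a 0x40-byte header in front, code linked at 0x100 ends up at file offset 0x140, which matches the offset seen in the .hex file earlier in this log.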
andrzejrI tried hexloader.S yesterday. It didn't work but there were multiple other issues18:41
olofkI don't think I ever finished hexloader18:42
olofkBut I started working on it for the same reason that you are having. No easy way to just upload programs18:42
olofkAt least when you don't have JTAG access18:43
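[Editor's note] A rough sketch of what a HEX loader has to do, assuming the .hex files are in Intel HEX format (the log does not say which objcopy output target was used). This is not hexloader.S; `parse_ihex` is a hypothetical name used only here.

```python
def parse_ihex(text):
    """Return {address: byte} for the data records in Intel HEX text."""
    memory = {}
    base = 0  # upper address bits set by extended-address records
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError("line %d: missing ':'" % lineno)
        raw = bytes.fromhex(line[1:])
        if len(raw) < 5 or len(raw) != raw[0] + 5:
            raise ValueError("line %d: bad record length" % lineno)
        # All bytes including the trailing checksum sum to 0 modulo 256.
        if sum(raw) & 0xFF:
            raise ValueError("line %d: bad checksum" % lineno)
        count = raw[0]
        addr = int.from_bytes(raw[1:3], "big")
        rtype = raw[3]
        data = raw[4:4 + count]
        if rtype == 0x00:    # data record
            for i, value in enumerate(data):
                memory[base + addr + i] = value
        elif rtype == 0x01:  # end of file
            break
        elif rtype == 0x02:  # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif rtype == 0x04:  # extended linear address
            base = int.from_bytes(data, "big") << 16
        # Start-address records (0x03, 0x05) are ignored in this sketch.
    return memory


if __name__ == "__main__":
    sample = ":0401000018000000E3\n:00000001FF\n"
    mem = parse_ihex(sample)
    print(hex(min(mem)), len(mem))
```

A loader running on the target would do the same per-record work, writing each data byte to RAM instead of a dictionary, so code linked at the 0x100 reset vector lands at 0x100.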
andrzejrso what is "the best" solution for the first stage bootloader? copy u-boot from spi flash to ram and start from there?18:43
olofkYes18:43
andrzejrhave we got any examples of a minimum u-boot setup?18:43
olofkI haven't used u-boot for years unfortunately. I tend to boot linux directly or run bare-metal apps via jtag18:44
olofkBut _franck_ is the expert here, and others are using it as well18:44
andrzejrI assume loading u-boot or linux works the same, does it?18:44
olofkYes18:45
olofkI am having some problems with my SPI loader unfortunately. It works great on the de0 nano, and it works great in simulation on lx9 microboard (spartan6-based)18:47
olofkBut I haven't managed to make it load anything from SPI on the real lx9 board18:48
olofkProblem is once again that I'm completely blind since I can't connect through JTAG18:48
olofkSo I don't know if my SPI image is bad, if it fails to read the Flash, if it fails to write to memory or if it just fails to start executing18:49
andrzejrI had similar problems - any issues with booting and I'm back to simulations or rs232-syscon18:49
olofkI really can't understand why Digilent insists on keeping their proprietary JTAG interface. Opening that up would probably solve a bunch of problems18:50
andrzejrOn Xilinx I had to output several SCK pulses with CS off. This is because the clock is shared with FPGA config.18:51
olofkaha. Interesting18:51
andrzejrBtw, my half-word wb_resize fixes are faulty. I will push a new version once I confirm it is working. Basically, reverted most of my changes and left only half-word stuff.18:58
olofkcool. I haven't had time to look at them18:58
olofkFor the spi loader, I should probably add some instructions to read out the chip ID and print it to the UART to get a confirmation that it can read the Flash properly18:59
andrzejrIt will never work perfectly because if we wanted to write 8b registers with a 32b instruction this (in the worst case) would have to be split into 4 bus accesses.18:59
olofkYeah. Resizing is awkward on Wishbone19:00
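[Editor's note] The worst case mentioned above, one 32-bit write becoming four byte-wide accesses, can be sketched as follows. This is an illustrative model, not the wb_resize RTL. It assumes big-endian byte lanes (select bit 3 covers data[31:24] and the lowest byte address), and `split_write` is a hypothetical name.

```python
def split_write(addr, data, sel=0b1111):
    """Split one 32-bit write into (byte_address, byte_value) accesses.

    Assumes big-endian lanes: sel bit 3 <-> data[31:24] <-> addr + 0.
    Only lanes enabled in sel produce an access.
    """
    if addr % 4:
        raise ValueError("address must be word aligned")
    accesses = []
    for lane in range(4):        # lane 0 is the most significant byte
        bit = 3 - lane
        if sel & (1 << bit):
            value = (data >> (8 * bit)) & 0xFF
            accesses.append((addr + lane, value))
    return accesses


if __name__ == "__main__":
    for byte_addr, value in split_write(0x90000000, 0x11223344):
        print(hex(byte_addr), hex(value))
```

A half-word write (for example sel=0b1100) produces two accesses, and a full-word write produces four.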
olofkWhen I did the upsizer I had to break the Wishbone spec to make it work19:00
olofkYou're not allowed to change wb_sel during a burst according to the spec (which I think is just stupid)19:01
andrzejrI would really like some basic UART console with read/write bus access and perhaps some boot choices. But as far as I understand that is pretty much what u-boot does.19:02
andrzejrwhy upsizing required changing wb_sel?19:03
andrzejr(ah, linear mode)19:03
olofkActually all burst modes where you don't send upsizing_factor*n words19:06
andrzejrwhat was on the other end of your upsizer? CAMD?19:07
olofkwishbone19:07
olofkCAMD would probably be easier, but I needed wishbone in both ends for that one19:08
olofkIn the end I gave up and solved it another way19:09
andrzejrI see. IMHO the only way to handle it is to split the burst into separate transactions. This is how I handle bursts in WB->UI bridge.19:09
olofkAt least that's much much easier19:12
olofkMy code ended up a complete mess and doesn't really work https://github.com/olofk/wb_intercon/blob/master/rtl/verilog/wb_upsizer.v19:12
olofkNot doing that again :)19:13
andrzejrI initially had more ambitious plans for the WB->UI bridge but after several failed attempts and our discussions about CAMD I have decided to first make a bridge that works and implement a high performance bus later.19:16
andrzejrespecially since, when we start from a 32b bus with ACK signals (WB), there is not much we can do to improve performance.19:17
andrzejrYesterday I noticed that the same program in DDR2 RAM runs ~20x slower than in WB_ROM (no caching, no bursts in both cases). This delay is mostly memory latency.19:19
andrzejrwith this much latency the best we can do is to schedule reads in cache controller and process results asynchronously. The bus should not enforce any acknowledgements or it will slow down everything else19:22
olofkThe read cache that stekern implemented in wb_sdram_ctrl helps a bit with latency problems for burst reads. I've been meaning to rip that out and put on a CAMD interface19:26
olofkBut if the CPU uses CAMD natively (as you have proposed), then it won't be needed at all I guess19:27
olofkRegarding http://pastebin.com/gG51rp5V , were you thinking that the instruction bus should go to the "async bus", and the data bus directly to wishbone?19:30
andrzejrCache in DRAM controller could be useful for multi-core systems but it would have to be bigger than the L1 cache.19:30
olofkYes, that's true. But this was just supposed to be a small burst cache. A large L2 cache is hard to motivate in an FPGA as it would eat tons of block RAM19:31
andrzejrno, I meant both buses go to WB when caching is disabled (a fallback). Once caches are enabled, cache controllers should use the async bus exclusively.19:32
olofkaha19:32
olofkI know that MicroBlaze uses a similar approach and separates cached and non-cached accesses to different physical buses19:33
andrzejrtrue, but that small cache was to work around a deficiency of WB.19:33
olofkExactly19:33
olofkI like the idea of cached and non-cached buses. My one problem with that is a project I did a few years ago for a dual-core system where the SW guys wanted to have half the RAM shared between both CPUs. The bus hierarchy for that became somewhat....complicated19:35
olofkoh... or do you mean that every access should go through the async bus once the cache is on?19:38
olofkMicroBlaze lets you specify which memory range should be cached19:40
andrzejrIn the CPU we could use the async bus only, if that helps. But that would require more changes, not only cache controllers but also instruction and data fetching.19:42
andrzejrSDRAM should IMHO be attached to the async bus only, so any sharing could be implemented at this interface. RAM access from WB would have to go through a bridge (just like they do now: WB->UI)19:43
olofkThat makes sense20:08
daliasstekern, do you (or anyone) know if anyone's working on getting or1k support upstream in gcc?20:31
stekerndalias: blueCmd has been the main driving force, and as olofk mentioned the other day, the biggest current obstacle is getting the copyright assignment from one last guy21:05
dalias:/21:07
daliasneed to make some small changes to entry point stuff in musl affecting all archs again21:11
daliasand i was thinking it would be convenient if gcc upstream just worked :)21:12
olofkHas anyone booked hotel for orconf yet?21:25
--- Log closed Sun Sep 13 00:00:41 2015
