--- Log opened Sat Sep 12 00:00:40 2015 | ||
wbx | stekern: hi | 09:52 |
---|---|---|
andrzejr | I'm having problems booting to code in RAM compiled with gcc+newlib. It works for hand-written assembly code. | 10:27 |
andrzejr | What is the meaning of OFFSET and BOOTROM_ADDR and what are their correct values? ATM, I'm using 0 for both. | 10:28 |
andrzejr | also, should I upload the .hex file (it contains some data before the actual code) or the .vh file (this starts with the actual code)? | 10:30 |
andrzejr | .elf has a reset vector at 0x100, in .vh the same code starts at 0x0, in .hex an extra header is added so the code starts at 0x140. Which file should I download to RAM? | 11:59 |
olofk | andrzejr: Are you talking about the bootloaders in or1k_bootloaders? | 14:07 |
andrzejr | olofk, I am trying to compile and load "hello.c" (newlib example). I'm writing either .vh or .hex directly to RAM and then jump to 0x00000000. | 15:16 |
andrzejr | this works with my own assembly code and doesn't work with newlib. I suspect this may be caused by boot vectors. | 15:16 |
olofk | andrzejr: Is this in simulation, or hw? | 16:27 |
olofk | And 0x100 is the default OpenRISC reset vector | 16:28 |
olofk | Only thing that uses 0x0 are my handwritten ASM bootloaders, because they can't afford to waste 0x100 bytes | 16:29 |
olofk | What are the .hex files you are talking about? Are they generated by or1k-elf-objcopy? | 16:30 |
andrzejr | olofk, 0x100 matches the contents of the .elf file, but after or1k-elf-objcopy->mkimage->or1k-elf-objcopy I get a .hex file which seems to contain an extra 0x40-byte header | 17:48 |
andrzejr | I will modify my debug UART code to download .elf file directly (currently it requires a .vh file) | 17:51 |
andrzejr | unfortunately I don't have a functioning JTAG thanks to always helpful guys at Digilent :-/ | 17:52 |
andrzejr | looks like converting .elf to .hex directly (single or1k-elf-objcopy run) maintains the 0x100 offset. Now I only need a way of uploading .hex file to the chip. | 18:13 |
olofk | andrzejr: mkimage inserts a 0x40 (64) byte header | 18:39 |
olofk | I created the spi_uimage_loader to parse that header and copy the contents to RAM | 18:41 |
olofk | I actually started working on a HEX loader a while ago | 18:41 |
andrzejr | I see, is that related to u-boot? | 18:41 |
olofk | Yes, I decided to use the u-boot format since there are already tools available to create an image | 18:41 |
olofk | So I didn't have to invent anything myself | 18:41 |
andrzejr | I tried hexloader.S yesterday. It didn't work but there were multiple other issues | 18:41 |
olofk | I don't think I ever finished hexloader | 18:42 |
olofk | But I started working on it for the same reason that you are having. No easy way to just upload programs | 18:42 |
olofk | At least when you don't have JTAG access | 18:43 |
andrzejr | so what is "the best" solution for the first stage bootloader? copy u-boot from spi flash to ram and start from there? | 18:43 |
olofk | Yes | 18:43 |
andrzejr | have we got any examples of a minimum u-boot setup? | 18:43 |
olofk | I haven't used u-boot for years unfortunately. I tend to boot linux directly or run bare-metal apps via jtag | 18:44 |
olofk | But _franck_ is the expert here, and others are using it as well | 18:44 |
andrzejr | I assume loading u-boot or linux works the same, does it? | 18:44 |
olofk | Yes | 18:45 |
olofk | I am having some problems with my SPI loader unfortunately. It works great on the de0 nano, and it works great in simulation on lx9 microboard (spartan6-based) | 18:47 |
olofk | But I haven't managed to make it load anything from SPI on the real lx9 board | 18:48 |
olofk | Problem is once again that I'm completely blind since I can't connect through JTAG | 18:48 |
olofk | So I don't know if my SPI image is bad, if it fails to read the Flash, if it fails to write to memory or if it just fails to start executing | 18:49 |
andrzejr | I had similar problems - any issues with booting and I'm back to simulations or rs232-syscon | 18:49 |
olofk | I really can't understand why Digilent insists on keeping their proprietary JTAG interface. Opening that up would probably solve a bunch of problems | 18:50 |
andrzejr | On Xilinx I had to output several SCK pulses with CS off. This is because the clock is shared with FPGA config. | 18:51 |
olofk | aha. Interesting | 18:51 |
andrzejr | Btw, my half-word wb_resize fixes are faulty. I will push a new version once I confirm it is working. Basically, reverted most of my changes and left only half-word stuff. | 18:58 |
olofk | cool. I haven't had time to look at them | 18:58 |
olofk | For the spi loader, I should probably add some instructions to read out the chip ID and print it to the UART to get a confirmation that it can read the Flash properly | 18:59 |
andrzejr | It will never work perfectly because if we wanted to write 8b registers with a 32b instruction this (in the worst case) would have to be split into 4 bus accesses. | 18:59 |
olofk | Yeah. Resizing is awkward on Wishbone | 19:00 |
olofk | When I did the upsizer I had to break the Wishbone spec to make it work | 19:00 |
olofk | You're not allowed to change wb_sel during a burst according to the spec (which I think is just stupid) | 19:01 |
andrzejr | I would really like some basic UART console with read/write bus access and perhaps some boot choices. But as far as I understand that is pretty much what u-boot does. | 19:02 |
andrzejr | why did upsizing require changing wb_sel? | 19:03 |
andrzejr | (ah, linear mode) | 19:03 |
olofk | Actually all burst modes where you don't send upsizing_factor*n words | 19:06 |
andrzejr | what was on the other end of your upsizer? CAMD? | 19:07 |
olofk | wishbone | 19:07 |
olofk | CAMD would probably be easier, but I needed wishbone in both ends for that one | 19:08 |
olofk | In the end I gave up and solved it another way | 19:09 |
andrzejr | I see. IMHO the only way to handle it is to split the burst into separate transactions. This is how I handle bursts in WB->UI bridge. | 19:09 |
olofk | At least that's much much easier | 19:12 |
olofk | My code ended up a complete mess and doesn't really work https://github.com/olofk/wb_intercon/blob/master/rtl/verilog/wb_upsizer.v | 19:12 |
olofk | Not doing that again :) | 19:13 |
andrzejr | I initially had more ambitious plans for the WB->UI bridge but after several failed attempts and our discussions about CAMD I have decided to first make a bridge that works and implement a high performance bus later. | 19:16 |
andrzejr | especially since, when we start from a 32b bus with ACK signals (WB), there is not much we can do to improve performance. | 19:17 |
andrzejr | Yesterday I noticed that the same program in DDR2 RAM runs ~20x slower than in WB_ROM (no caching, no bursts in both cases). This delay is mostly memory latency. | 19:19 |
andrzejr | with this much latency the best we can do is to schedule reads in the cache controller and process results asynchronously. The bus should not enforce any acknowledgements or it will slow down everything else | 19:22 |
olofk | The read cache that stekern implemented in wb_sdram_ctrl helps a bit with latency problems for burst reads. I've been meaning to rip that out and put on a CAMD interface | 19:26 |
olofk | But if the CPU uses CAMD natively (as you have proposed), then it won't be needed at all I guess | 19:27 |
olofk | Regarding http://pastebin.com/gG51rp5V , were you thinking that the instruction bus should go to the "async bus", and the data bus directly to wishbone? | 19:30 |
andrzejr | Cache in DRAM controller could be useful for multi-core systems but it would have to be bigger than the L1 cache. | 19:30 |
olofk | Yes, that's true. But this was just supposed to be a small burst cache. A large L2 cache is hard to justify in an FPGA as it would eat tons of block RAM | 19:31 |
andrzejr | no, I meant both buses go to WB when caching is disabled (a fallback). Once caches are enabled, cache controllers should use the async bus exclusively. | 19:32 |
olofk | aha | 19:32 |
olofk | I know that MicroBlaze uses a similar approach and separates cached and non-cached accesses to different physical buses | 19:33 |
andrzejr | true, but that small cache was to work around a deficiency of WB. | 19:33 |
olofk | Exactly | 19:33 |
olofk | I like the idea of cached and non-cached buses. My one problem with that is a project I did a few years ago for a dual-core system where the SW guys wanted to have half the RAM shared between both CPUs. The bus hierarchy for that became somewhat....complicated | 19:35 |
olofk | oh... or do you mean that every access should go through the async bus once the cache is on? | 19:38 |
olofk | MicroBlaze lets you specify which memory range should be cached | 19:40 |
andrzejr | In the CPU we could use the async bus only, if that helps. But that would require more changes, not only cache controllers but also instruction and data fetching. | 19:42 |
andrzejr | SDRAM should IMHO be attached to the async bus only, so any sharing could be implemented at this interface. RAM accesses from WB would have to go through a bridge (just like they do now: WB->UI) | 19:43 |
olofk | That makes sense | 20:08 |
dalias | stekern, do you (or anyone) know if anyone's working on getting or1k support upstream in gcc? | 20:31 |
stekern | dalias: blueCmd has been the main driving force, and as olofk mentioned the other day, the biggest current obstacle is getting the copyright assignment from one last guy | 21:05 |
dalias | :/ | 21:07 |
dalias | need to make some small changes to entry point stuff in musl affecting all archs again | 21:11 |
dalias | and i was thinking it would be convenient if gcc upstream just worked :) | 21:12 |
olofk | Has anyone booked a hotel for orconf yet? | 21:25 |
--- Log closed Sun Sep 13 00:00:41 2015 |
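
The mkimage header discussed around 18:39 can be inspected on the host before anything is written to flash or RAM. Below is a minimal sketch in Python, assuming the legacy u-boot image layout (a 64-byte, i.e. 0x40, header with big-endian fields and magic 0x27051956). The names `parse_uimage_header` and `build_test_image` are made up for illustration; they are not part of mkimage or spi_uimage_loader.

```python
import struct
import zlib

# Assumed legacy u-boot image header layout: seven big-endian 32-bit
# words, four single-byte fields and a 32-byte name (64 bytes total).
HEADER_FORMAT = ">7I4B32s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
UIMAGE_MAGIC = 0x27051956


def parse_uimage_header(image):
    """Return (header fields, payload) for a legacy uImage blob."""
    if len(image) < HEADER_SIZE:
        raise ValueError("image shorter than the header")
    (magic, _hcrc, _time, size, load, entry, dcrc,
     _os, _arch, _type, _comp, name) = struct.unpack(
        HEADER_FORMAT, image[:HEADER_SIZE])
    if magic != UIMAGE_MAGIC:
        raise ValueError("bad magic: 0x%08x" % magic)
    payload = image[HEADER_SIZE:HEADER_SIZE + size]
    if len(payload) != size:
        raise ValueError("payload truncated")
    # The payload CRC is assumed to be a standard CRC-32 (as in zlib).
    if zlib.crc32(payload) & 0xFFFFFFFF != dcrc:
        raise ValueError("payload CRC mismatch")
    header = {"size": size, "load": load, "entry": entry,
              "name": name.rstrip(b"\0").decode("ascii", "replace")}
    return header, payload


def build_test_image(payload, load, entry, name=b"hello"):
    """Hypothetical helper: wrap payload in a header for self-testing.

    The header CRC is left at zero, which real mkimage does not do.
    """
    dcrc = zlib.crc32(payload) & 0xFFFFFFFF
    header = struct.pack(HEADER_FORMAT, UIMAGE_MAGIC, 0, 0, len(payload),
                         load, entry, dcrc, 0, 0, 0, 0, name)
    return header + payload
```

Under these assumptions the payload starts 0x40 bytes into the image, which is consistent with code linked at 0x100 showing up at 0x140 in a .hex file generated from the mkimage output. A loader that copies only the payload to the load address and then jumps to the entry point would not see that shift.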
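
The worst case mentioned at 18:59, one 32-bit write turning into four 8-bit bus accesses, can be illustrated with a small host-side model. This is a sketch only, assuming big-endian byte lanes (select bit 3 covers data bits 31:24 and maps to the lowest byte address); `split_write` is a hypothetical name and this is not the wb_resize logic.

```python
def split_write(addr, data, sel):
    """Split one 32-bit write into byte-wide writes.

    addr is the word-aligned address, data the 32-bit value and sel a
    4-bit byte-select mask. Returns a list of (byte_address, byte_value)
    pairs, one per selected lane, in ascending address order.
    Assumes big-endian lanes: sel bit 3 <-> data[31:24] <-> addr + 0.
    """
    if addr % 4 != 0:
        raise ValueError("address must be word-aligned")
    accesses = []
    for lane in (3, 2, 1, 0):
        if sel & (1 << lane):
            byte_value = (data >> (8 * lane)) & 0xFF
            accesses.append((addr + (3 - lane), byte_value))
    return accesses
```

With all four select bits set, the single write becomes four narrow accesses; a half-word write (two select bits) becomes two. On a little-endian bus the lane-to-address mapping would be reversed.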