--- Log opened Sat Feb 07 00:00:30 2015

Nick | Message | Time |
---|---|---|
GeneralStupid | hi | 10:03 |
stekern | olofk: to be picky // is c++-style comments | 10:15 |
stekern | but I know we have them in our asm elsewhere | 10:15 |
stekern | you never figured out the module problem? | 10:16 |
GeneralStupid | i have some questions, iam pretty new at hardware design. i want a softcore with custom Co Processor(s) . Is OpenRISC and Coprocessors over wishbone a good idea? | 10:58 |
GeneralStupid | and i dont need an operating system. i want it more to act like a µC | 10:59 |
knz | GeneralStupid: that might work | 11:01 |
knz | depends on how fast you want your coprocessor to be | 11:01 |
GeneralStupid | faster than a C solution :) | 11:19 |
GeneralStupid | knz: what other options do i have? | 11:20 |
GeneralStupid | knz: editing the openrisc is no option for me. | 11:20 |
GeneralStupid | AFAIK, i could "extend" the OpenRISC instruction register ( thats how nios does it http://www.altera.com/devices/processor/nios2/images/nios2-cust-instruc_fig1.gif ) | 11:22 |
GeneralStupid | and i could write an implementation with register reads and interrupts - that could be fast but it's not a very nice design IMHO | 11:23 |
GeneralStupid | knz: but i want to stay at something which is a "default" way in openrisc... And i want to use the standard tools | 11:25 |
olofk | stekern: Yeah, I looked at other openrisc asm code that uses //, but I can change it | 12:48 |
olofk | I did not figure out the module problem however. Tried to put the defines in string.h inside __KERNEL__, but that didn't help | 12:49 |
olofk | GeneralStupid: It depends on the nature of your coprocessor | 12:49 |
olofk | Can you give us some ideas what it's supposed to do | 12:50 |
olofk | But a separate core on the wishbone bus would be the most standard solution. If you are going to move a lot of data you would probably benefit from having DMA | 12:51 |
olofk | I haven't used the custom instructions. It feels a bit messy in my opinion, and it's hard to find use cases where it would be the best solution | 12:52 |
olofk | stekern: I really can't figure out what is wrong with the module problem. Found a similar issue with a m68k kernel and another module. In that case the problem was that they hadn't included string.h, so it relied on the built-in memset I think | 12:54 |
olofk | But that is not the case here | 12:55 |
olofk | I should see if I get the same error with musl | 12:55 |
olofk | Or another module that uses memset | 12:55 |
GeneralStupid | olofk: It should be a small project for my college... If i do something like this i get a DE2-115 (Altera Cyclone IV) for free. | 13:04 |
GeneralStupid | olofk: for me personally i want pattern recognition with a webcam. | 13:04 |
GeneralStupid | olofk: but a simple example with a benchmark at the end would be enough for me. For example laplace pyramids in C and in Hardware - and compare them. | 13:05 |
olofk | Without really knowing the algorithms, I would say that a core with DMA would be your friend here. It sounds like you will take in a large amount of data, do a number of (paralell?) computations and spit out a relatively small amount of data | 13:08 |
GeneralStupid | i think the speedup will be larger with a parallel algorithm :-D | 13:09 |
olofk | Spelling that word is always a guesswork for me :) | 13:09 |
GeneralStupid | it is... But normally everything should be faster with a dedicated piece of hardware (if done right)... The only thing to measure for me is how many times? | 13:11 |
GeneralStupid | At the moment iam implementing a gammatone Filter for my work in VHDL (for simulink ... :-D ) | 13:11 |
olofk | yikes | 13:12 |
GeneralStupid | i have no real results at the moment, but in software i iterate over every band... In hardware i would have 32 parallel bands :) | 13:12 |
olofk | I did my master thesis on executing simulink models inside labview on FPGA. A piece of crap thesis where I spent 90% of the time fighting against stupid proprietary tools | 13:13 |
GeneralStupid | olofk: NICE! | 13:13 |
GeneralStupid | olofk: is your master thesis available? :) | 13:13 |
GeneralStupid | olofk: iam fighting more with the tools, too :) | 13:13 |
olofk | GeneralStupid: I think so, but it's from 2009 so things have probably changed since then, and I didn't put a lot of love into that thesis | 13:14 |
olofk | http://publications.lib.chalmers.se/records/fulltext/124113.pdf | 13:15 |
olofk | Anyway. Back to your problem. I think we can conclude that doing this with a dedicated core using DMA would speed things up. Your question is, how much. Is that right? | 13:15 |
GeneralStupid | olofk: that's what i want to measure. Iam computer scientist not electronic engineer. So i want a compromise between usability and speedup. | 13:16 |
GeneralStupid | olofk: thanks for the thesis. | 13:17 |
olofk | Usability is for cowards ;) | 13:18 |
GeneralStupid | olofk: in my imagination i write a software product in C (because it's readable and quickly done). Then i do profiling and think about what kind of instructions make sense to develop in hardware... | 13:19 |
olofk | ah ok. Makes sense | 13:19 |
olofk | In my experience, you want to do the split between sw and hw at a level where you don't have to move so much data back and forth | 13:20 |
olofk | as a general rule | 13:20 |
olofk | So try to find places where you have a small set of data that you do a lot of calculations on | 13:22 |
GeneralStupid | olofk: ok. that sounds comprehensible | 13:22 |
olofk | Be also aware that floating point in FPGA hardware can be quite expensive, so if you need that, see if you can get away with fixed point instead | 13:23 |
GeneralStupid | i will use fix point | 13:26 |
GeneralStupid | we have a library that makes it a bit easier to work with FixPoint | 13:27 |
olofk | cool | 13:27 |
GeneralStupid | i dont know if its open source. :) | 13:29 |
GeneralStupid | If i reduce the data which needs to be transferred, then the Wishbone bus would be the best decision? Because it would be the most standard way? | 13:30 |
olofk | Yes, if you're going to use OpenRISC | 13:30 |
GeneralStupid | olofk: i looked at a lot of cores... | 13:30 |
GeneralStupid | i want open source, stability and something which is easy to use. | 13:31 |
olofk | Well, OpenRISC it is then :) | 13:31 |
olofk | Or lm32 | 13:31 |
olofk | I saw that you mentioned VHDL. There are some benefits switching to verilog here I would say | 13:33 |
GeneralStupid | olofk: then my other important question. at the moment i have only read about OpenRISC with an Operating System. I want to use it without an Operating System and more like a µC. | 13:33 |
GeneralStupid | olofk: why exactly? | 13:33 |
olofk | You can do that. We have a toolchain based on the newlib C library for bare-metal | 13:33 |
olofk | If you go all verilog, you can use tools like verilator, which will generate an extremely fast model of your RTL code, which allows you to do a lot of testing very quickly | 13:34 |
GeneralStupid | olofk: i dont like both of them, they are description languages ... | 13:34 |
olofk | If you are mixing VHDL you will need an expensive simulator capable of doing mixed language | 13:34 |
olofk | Yeah. Sure. sb0 is the one behind migen, which is for doing RTL stuff in Python | 13:35 |
olofk | But migen generates verilog in the end | 13:36 |
GeneralStupid | olofk: i thought about testing the co processor separately from the openrisc | 13:37 |
olofk | Yes. That makes sense | 13:37 |
GeneralStupid | Ok, that really helps me a lot. | 13:38 |
olofk | But doing together is cool if you want to elaborate with the sw/hw separation | 13:38 |
GeneralStupid | and openrisc is only verilog? | 13:39 |
olofk | yes | 13:39 |
GeneralStupid | (some cores generate their HDL...) | 13:39 |
olofk | Well, there are VHDL implementations of OpenRISC too, but I'm not sure how good they are | 13:39 |
GeneralStupid | I dont exactly know what board i get, it will be a Terasic DE2 or DE2-115 board. | 13:39 |
GeneralStupid | olofk: no, i want the stable and good "official" openrisc implementation. | 13:40 |
olofk | I think we have OpenRISC-based systems available for those boards | 13:40 |
olofk | Then it's verilog | 13:40 |
GeneralStupid | ok, that would have been the next question. What is to be configured? For example how to address the complete RAM or to use the embedded multipliers ... | 13:42 |
olofk | Not sure I understand the question | 13:44 |
GeneralStupid | olofk: for example, the DE2 has 8 Mb RAM and embedded multipliers (DSP). How can i tell the OpenRISC to use them? So that i get the maximum performance my board can offer. | 13:45 |
olofk | The verilog is written to take advantage of resources such as on-chip RAM and multipliers | 13:50 |
olofk | So no configuration needed for that | 13:50 |
olofk | Got to go | 13:50 |
GeneralStupid | nice! | 13:50 |
GeneralStupid | really nice | 13:50 |
GeneralStupid | Ok, lets conclude: I want "bare-metal", that's supported by the OpenRISC Toolchain. Then i want to do some calculations on a CoProcessor, that could be done over the wishbone bus and is supported in OpenRISC (Master, Slave). I want to use the maximum i can get with the DE2 Board - that is automatically supported in OpenRISC | 13:56 |
GeneralStupid | I want open source, it is also. | 13:56 |
GeneralStupid | What's left is... How? Are there some more resources for Documentations? | 13:57 |
-!- Netsplit *.net <-> *.split quits: rhythmx, zama_, ssvb, simoncook | 14:45 | |
-!- Netsplit over, joins: simoncook | 14:45 | |
-!- Netsplit over, joins: rhythmx | 14:46 | |
-!- Netsplit over, joins: ssvb | 14:46 | |
stekern | olofk: ok, I can take a look at 8 | 14:52 |
stekern | bah, slow internet... | 14:53 |
stekern | olofk: ok, I can take a look at the module problem | 14:53 |
stekern | olofk: http://pastie.org/9895402 | 15:05 |
stekern | you need that | 15:05 |
olofk | stekern: Nice! Did you try it? | 17:12 |
olofk | And it would probably have taken me more than 12 minutes to find the solution :) | 17:13 |
GeneralStupid | olofk: how did you like the university you studied? | 17:13 |
olofk | GeneralStupid: Education was good, but my motivation wasn't. I quit three times, but it turned out I wasn't very good at anything else :) | 17:14 |
GeneralStupid | olofk: i need to get 23 to have enough motivation for learning... | 17:15 |
GeneralStupid | needed to get, would be right or? | 17:15 |
GeneralStupid | probably i need one semester more and i would like to go to sweden for that time. | 17:19 |
olofk | GeneralStupid: Where are you from? | 17:19 |
GeneralStupid | olofk: germany, hanover | 17:21 |
olofk | That's not too far away from here | 17:23 |
GeneralStupid | no but thats fine for me. I dont need USA, Austria or New Zealand :) | 17:27 |
olofk | No, Austria is a bit far away I guess | 17:39 |
olofk | stekern: Tested with the defconfig. Works now with export_symbol | 17:39 |
GeneralStupid | olofk: australia... | 19:14 |
--- Log closed Sun Feb 08 00:00:32 2015 |
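
The fixed-point advice in the log (olofk at 13:23, GeneralStupid at 13:26) can be illustrated with a small software reference model. This is only a sketch under assumptions that do not come from the log: the signed Q16.16 format and the helper names `to_fixed`, `to_float` and `fixed_mul` are hypothetical, and this is not the fixed-point library GeneralStupid mentions. It shows that a fixed-point multiply needs only an integer multiply and a shift, which is what maps well onto FPGA multiplier blocks.

```python
# Assumed format: signed Q16.16 (16 integer bits, 16 fractional bits).
FRAC_BITS = 16
SCALE = 1 << FRAC_BITS


def to_fixed(x: float) -> int:
    """Quantize a float to the nearest Q16.16 integer (hypothetical helper)."""
    return int(round(x * SCALE))


def to_float(q: int) -> float:
    """Convert a Q16.16 integer back to a float (hypothetical helper)."""
    return q / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values with an integer multiply and a shift.

    The raw product carries 32 fractional bits, so shifting right by
    FRAC_BITS restores the Q16.16 scale. Python's >> rounds toward
    negative infinity for negative values.
    """
    return (a * b) >> FRAC_BITS


if __name__ == "__main__":
    a = to_fixed(1.5)
    b = to_fixed(-2.25)
    # Both inputs are exactly representable, so no precision is lost here.
    print(to_float(fixed_mul(a, b)))
```

Python integers never overflow, so this model does not show the wrap-around or saturation behaviour of real hardware; a C or HDL version would need a 64-bit intermediate product and an explicit overflow policy. A model like this could still serve as the reference when benchmarking a C implementation against a hardware one.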