World: r3wp
[rebcode] Rebcode discussion
Geomol 11-Feb-2008 [2313] | I plan to do more tests in the coming days. Or maybe some of you wanna test too! |
GiuseppeC 11-Feb-2008 [2314] | Script Error: rebcode has no value |
Geomol 11-Feb-2008 [2315x2] | And this is on a 1.2 GHz G4 powerpc on my iBook. |
Giuseppe, make sure you run REBOL 1.3.50! | |
GiuseppeC 11-Feb-2008 [2317] | I am running view 2.7.5 |
Oldes 11-Feb-2008 [2318] | You must have a version with the old rebcode |
Henrik 11-Feb-2008 [2319] | Only 1.3.50 and one more version have rebcode, not later versions. |
GiuseppeC 11-Feb-2008 [2320] | Ok, I will try later |
Geomol 11-Feb-2008 [2321x4] | This win version should work: http://www.rebol.net/builds/031/rebview1350031.exe |
OS X version: http://www.rebol.net/builds/024/rebview1350024.tar.gz Linux version: http://www.rebol.net/builds/042/rebview1350042.tar.gz | |
For 6502 asm documentation, I use the "BBC Advanced User Guide" found here: http://www.nvg.ntnu.no/bbc/docs.php3 | |
I'll see, if I can make a good ram-file to test with. And maybe a little wrapper, so it's possible to monitor, what's going on in the emulation. | |
BrianH 11-Feb-2008 [2325] | Oldes, the older rebcode version wasn't slower, it just had fewer features. We had to change the naming convention of the opcodes to add the features. |
Geomol 11-Feb-2008 [2326] | Just to clear things up regarding performance: this is an emulation of a 1 MHz CPU. It requires quite some computing power to emulate another CPU. To give a hint: an instruction like INX, which increments the X register by 1, requires 2 cycles on the 6502. So a 1 MHz 6502 can do half a million of those instructions each second. In my emulator, that INX instruction becomes 11 rebcode instructions plus 6 rebcode instructions to control the loop, a total of 17 rebcode instructions. And it takes less than half a second to do 1 million of those, which is like a 4 MHz 6502. So with this initial test, I'd say rebcode is usable. |
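The arithmetic in the message above can be checked with a short Python sketch. The function and constant names are illustrative, and the 0.5 s figure is the upper bound quoted in the message ("less than half a second"), so the 4 MHz result is a lower bound on the equivalent clock.

```python
CYCLES_PER_INX = 2  # 6502 cycles for one INX, as stated in the message


def native_inx_per_second(clock_hz):
    """INX instructions per second on a real 6502 at the given clock."""
    return clock_hz / CYCLES_PER_INX


def equivalent_clock_mhz(instructions, seconds):
    """Clock (in MHz) a real 6502 would need to match the emulator's INX rate."""
    return instructions * CYCLES_PER_INX / seconds / 1_000_000


print(native_inx_per_second(1_000_000))        # 500000.0
print(equivalent_clock_mhz(1_000_000, 0.5))    # 4.0
```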
btiffin 11-Feb-2008 [2327] | John; still can't get into 6502.org but the site holds quite a bit of source code, from snippets to floating point math by Steve Wozniak. |
Geomol 12-Feb-2008 [2328x6] | A first version of a MOS 6502 workbench tool is ready: do http://www.fys.ku.dk/~niclasen/rebol/language/m6502wb.r It'll load the 6502 assembler and emulator. It's a tool to assemble 6502 assembler programs to machine code and run them with the rebcode emulator. It's possible to see the 6502 registers and flags. Both asm6502.r and em6502.r have been updated. |
You'll need REBOL 1.3.50 to run this!!! | |
Scroll the 65kb ram with arrow-keys, page-up/down, home and end. | |
It works like this:
1) Write some 6502 asm in the text area. Example: lda #&80
2) Press the button "Assemble". Now you can see the opcodes in the ram at address 0000.
3) Press the button "Begin" to run the emulator with the produced machine code and see the results in the registers and flags. | |
Hm, probably a bad idea to use arrows to navigate ram, because it makes them not work in the text area. | |
A performance test program:
    lda #0
    sta &1001
.l1
    lda #0
    sta &1002
.l2
    lda #0
    sta &1003
.l3
    lda &1003
    adc #1
    sta &1003
    lda &1003
    bne l3
    lda &1002
    adc #1
    sta &1002
    lda &1002
    bne l2
    lda &1001
    adc #1
    sta &1001
    lda &1001
    bne l1
It takes 40s to run on a BBC emulator emulating a 1MHz 6502. It took around 14s using the rebcode emulator on my 1.2 GHz G4, and it took 9.5s using the rebcode emulator on my 2.4GHz Pentium 4. | |
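One detail the thread does not discuss: on a 6502, ADC adds the carry flag as well as the operand, and the program above never executes CLC. The Python sketch below models only the three counters and the carry flag to count passes through the innermost loop. It assumes standard ADC semantics and that nothing else touches carry. The `bits` parameter is a hypothetical knob to keep the run short. Whether this affects the timings reported in the thread is not established here.

```python
def count_inner_iterations(bits=8, carry=0):
    """Count passes through the innermost loop (.l3) of the test program.

    Models three memory counters of `bits` width plus the carry flag,
    with ADC #1 computing value + 1 + carry.
    """
    modulus = 1 << bits

    def adc_1(value, carry_in):
        total = value + 1 + carry_in
        return total % modulus, 1 if total >= modulus else 0

    inner_passes = 0
    m1 = 0
    while True:                      # .l1
        m2 = 0
        while True:                  # .l2
            m3 = 0
            while True:              # .l3
                m3, carry = adc_1(m3, carry)
                inner_passes += 1
                if m3 == 0:          # bne l3 falls through on zero
                    break
            m2, carry = adc_1(m2, carry)   # carry is still set from the wrap
            if m2 == 0:
                break
        m1, carry = adc_1(m1, carry)
        if m1 == 0:
            break
    return inner_passes


print(count_inner_iterations(bits=4))  # 1024, versus 16**3 = 4096 if carry were ignored
```

In this model the wrap of an inner counter leaves carry set, so the next outer counter steps by 2. With 8-bit counters the same pattern would give 256 * 128 * 128 = 4'194'304 inner passes rather than 16'777'216.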
Geomol 13-Feb-2008 [2334x3] | A similar rebcode performance test program might look like:
ram: make binary! 3
insert/dup ram #"^(00)" 3
looptest: rebcode [/local a] [
    set a 0
    pokez ram 0 a
label l1
    set a 0
    pokez ram 1 a
label l2
    set a 0
    pokez ram 2 a
label l3
    pickz a ram 2
    add a 1
    pokez ram 2 a
    eq a 256
    braf l3
    pickz a ram 1
    add a 1
    pokez ram 1 a
    eq a 256
    braf l2
    pickz a ram 0
    add a 1
    pokez ram 0 a
    eq a 256
    braf l1
]
It does 16'777'216 loops and takes less than 3 seconds on my 1.2 GHz G4. |
To sum it up: A 1MHz 6502 takes 40 sec to do 16'777'216 loops of this kind. Emulating the 6502 using rebcode can do the same thing in 14 sec (on a 1.2 GHz G4) and in 9.5 sec (on a 2.4 GHz P4). A pure rebcode program (no emulation) can do the 'same' 16'777'216 loops in around 2.7 sec on a 1.2 GHz G4. So a conclusion might be, that programming in rebcode is like having a 40 / 2.7 = 15 MHz cpu (if run on a 1.2 GHz G4). Is this a correct conclusion? | |
Is it known how many CPU clocks each rebcode instruction uses on average? | |
Henrik 13-Feb-2008 [2337] | sounds pretty slow? |
Geomol 13-Feb-2008 [2338x2] | I'm not sure. |
This is just one single test using only a few of the available instructions. To get a better view, more tests are needed. I made a similar loop in C, compiled it with gcc, and it runs around 6 times faster than the pure rebcode version. For now I wouldn't call rebcode slow, but not blazingly fast either. | |
Pekr 13-Feb-2008 [2340] | and R3 rebcode is going to be even slower .... |
Geomol 13-Feb-2008 [2341] | There's something wrong with my comparison with a 1MHz 6502. I counted the number of cycles in the inner loop and found 17 cycles. A 1MHz 6502 can then do 1'000'000 / 17 * 40 = 2'352'941 loops in 40 seconds. But the BeebEm emulator made 16.7 million loops in that time. It should have taken 285 sec. So programming in rebcode is more like a 107 MHz cpu in this test. (It's probably not correct to measure it this way.) |
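The cycle arithmetic in the message above, written out as a Python sketch. The per-instruction breakdown in the comment uses the standard 6502 timings for these addressing modes and is one way to arrive at the 17 cycles quoted. It counts only the inner loop and takes the rounded 2.7 s rebcode figure from earlier in the thread.

```python
LOOPS = 256 ** 3            # 16'777'216 inner-loop passes, as claimed in the thread
# lda abs (4) + adc #imm (2) + sta abs (4) + lda abs (4) + bne taken (3)
INNER_LOOP_CYCLES = 4 + 2 + 4 + 4 + 3
CLOCK_HZ = 1_000_000        # 1 MHz 6502
REBCODE_SECONDS = 2.7       # pure rebcode loop on the 1.2 GHz G4

loops_in_40s = CLOCK_HZ * 40 // INNER_LOOP_CYCLES
seconds_needed = LOOPS * INNER_LOOP_CYCLES / CLOCK_HZ
equivalent_mhz = seconds_needed / REBCODE_SECONDS

print(INNER_LOOP_CYCLES)          # 17
print(loops_in_40s)               # 2352941
print(round(seconds_needed, 1))   # 285.2
print(round(equivalent_mhz, 1))   # 105.6
```

With the rounded 2.7 s timing this gives roughly 106 MHz, close to the 107 MHz quoted, which presumably used a slightly smaller unrounded timing.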
BrianH 13-Feb-2008 [2342] | Rebcode is a higher-level language than 6502 assembler. Perhaps a peephole optimizer can rewrite your generated rebcode into better equivalent rebcode. |
Steeve 13-Feb-2008 [2343] | Geomol, I had a look at your emulator code. I think performance could be improved if you delay updating the flags until they are actually used. |
Geomol 13-Feb-2008 [2344] | Good idea! Do you have previous experience with emulators like this, because I have none. |
Steeve 13-Feb-2008 [2345x2] | In fact the engine is very similar to the Z80 one. I think we could make a meta-emulator using external data sheets (one for 6502, one for Z80) |
I made a Z80 emulator using rebcode (not complete), you can see it in galaga.r on rebol.org | |
Geomol 13-Feb-2008 [2347] | Ah, that was you. Someone mentioned that one lately. |
Steeve 13-Feb-2008 [2348x3] | ah BrianH, I remember that you made the same proposal for my Z80 emu (peephole optimization) |
hard to do | |
interesting to do on ROMs (static analysis before launching the code) but not valuable in RAM because the code can be modified | |
Geomol 13-Feb-2008 [2351] | Steeve, about flags: e.g. the zero flag Z (bit 1 of P). Instead of setting it each time A, X or Y becomes zero, I could save any of those (A, X or Y) in a variable, and then test on that var and set the flag correctly, if and when the flag is actually used. Is that what you mean? |
Steeve 13-Feb-2008 [2352x3] | and limited because on 6502 for example, many branches are computed (not static) |
Yes Geomol, that's it | |
Flags are calculated on the last accumulator value, if I'm not mistaken | |
Geomol 13-Feb-2008 [2355] | ok. One optimization I'm considering is to cross-compile 6502 opcodes to rebcode, instead of emulating the 6502. That won't work with self-modifying code, and branches will be a problem. So it's hard, but I think it might work. |
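A rough Python sketch of the difference between emulating and translating ahead of time, assuming a toy machine with only two real 6502 opcodes (INX = &E8, TXA = &8A) and no branches. The function names and the dictionary-based state are illustrative and are not taken from em6502.r.

```python
INX, TXA = 0xE8, 0x8A  # real 6502 opcode bytes for these two instructions


def op_inx(state):
    state["x"] = (state["x"] + 1) & 0xFF


def op_txa(state):
    state["a"] = state["x"]


HANDLERS = {INX: op_inx, TXA: op_txa}


def interpret(memory):
    """Emulation: look up (decode) each opcode at the moment it executes."""
    state = {"a": 0, "x": 0}
    for pc in range(len(memory)):
        HANDLERS[memory[pc]](state)
    return state


def translate(memory):
    """Cross-compilation: decode once, producing a list of handlers."""
    return [HANDLERS[opcode] for opcode in memory]


def run_translated(handlers):
    state = {"a": 0, "x": 0}
    for handler in handlers:
        handler(state)
    return state


program = bytearray([INX, INX, TXA])
compiled = translate(program)
print(interpret(program))        # {'a': 2, 'x': 2}
print(run_translated(compiled))  # {'a': 2, 'x': 2}

# Self-modifying code: once memory changes, the translation is stale.
program[1] = TXA
print(interpret(program))        # {'a': 1, 'x': 1}
print(run_translated(compiled))  # {'a': 2, 'x': 2}, still the old program
```

The last two lines show the self-modifying-code problem mentioned in the thread: the interpreter sees the changed byte, while the translated handlers do not.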
Steeve 13-Feb-2008 [2356x2] | in theory |
I'll give you an example with the TAX opcode:
; updating flags in real time
label TAX
    seti X A
    eq X 0
    either [or P 2] [and P 253]
    seti i X
    and i 128
    eq i 128
    either [or P 128] [and P 127]
    bra continue
; delay the calculation of flags
label TAX
    seti X A
    or maskA (2 + 128) ; remember that we have to recalculate zero and negative flags using A, but don't do it now
    bra continue | |
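The deferred-flag idea can be sketched in Python as follows. This is a simplified variant, not Steeve's code: instead of a per-register mask such as maskA, it remembers the last value that the Z and N flags depend on and only folds it into P when something reads the status register. The class and method names are hypothetical.

```python
ZERO_FLAG = 0x02       # bit 1 of P
NEGATIVE_FLAG = 0x80   # bit 7 of P


class LazyFlagsCpu:
    def __init__(self):
        self.a = 0
        self.x = 0
        self.p = 0            # status register; Z/N bits may be stale
        self.pending = None   # value Z and N must be recomputed from, if any

    def lda_immediate(self, value):
        self.a = value & 0xFF
        self.pending = self.a     # defer the flag update

    def tax(self):
        self.x = self.a
        self.pending = self.x     # defer the flag update

    def status(self):
        """Bring Z and N up to date, then return P."""
        if self.pending is not None:
            self.p &= ~(ZERO_FLAG | NEGATIVE_FLAG) & 0xFF
            if self.pending == 0:
                self.p |= ZERO_FLAG
            if self.pending & 0x80:
                self.p |= NEGATIVE_FLAG
            self.pending = None
        return self.p

    def bne_taken(self):
        """A branch is one of the places where the flags are really needed."""
        return not (self.status() & ZERO_FLAG)


cpu = LazyFlagsCpu()
cpu.lda_immediate(0x80)
cpu.tax()
print(cpu.p)             # 0, flags not computed yet
print(cpu.status())      # 128, N set once the flags are requested
```

The saving comes from instruction sequences that overwrite the flags several times before any branch or status read, since only the last value is ever evaluated.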
Geomol 13-Feb-2008 [2358] | Thanks! |
Steeve 13-Feb-2008 [2359] | You get the idea? ;-) |
Geomol 13-Feb-2008 [2360] | Yup! :) |
Steeve 13-Feb-2008 [2361x2] | Do you think that using PC as an offset (integer) instead of as a series could be faster? |
I should do a test before saying that | |