All times are UTC




Posted: Mon Nov 09, 2020 11:45 am
Joined: Mon Nov 09, 2020 11:24 am
Posts: 12
Hi, I'm trying to run a 6502 core on the Sega Saturn; my first test is a Tetris arcade game.
I've tested several 6502 cores, but they were always too slow, so I took the fastest one I found and tried to tweak it a bit:

- use computed goto
- remove the wait-for-IRQ handling (or something like that)
- include lookup tables from K5602
- rewrite the read/write functions

There was a huge speed increase, from 19 to 43 fps (the target is 60 fps), but it's not enough.
"My" 6502 is here:

https://github.com/vbt1/fba_saturn/tree/master/m6502

I'd like some comments on this core (what's good, what's bad).
Do you have any advice for improving speed while keeping compatibility?
- speed hacks
- common hints
- slow opcodes that need asm
- any other ideas

For now I've tried everything I could; I lack ideas for improving this core.
Feel free to answer my post. Depending on how this C core performs, I'll decide whether or not to try NES emulation on the Saturn.


Posted: Mon Nov 09, 2020 12:12 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10802
Location: England
Welcome! I don't know anything about the Saturn's SH2 CPU or the quality of the C compiler, so I can't say much about what performance you might hope for.

I'll take a look at the m6502 you're using - that's a new one for me. I'll add it to the reference section too.
(Do you have an upstream link for the source?)

However, I'd suggest you take a look at Ian Piumarta's lib6502, as that has always looked good to me. There's a slightly enhanced fork here:
https://github.com/ZornsLemma/lib6502-sf
and the original is here:
https://www.piumarta.com/software/lib6502/


Posted: Mon Nov 09, 2020 12:24 pm
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
This looks really cool!

I see you use static inline in some places. Have you looked at the assembly output to make sure it's actually inlined? Sometimes the compiler ignores it.
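
One way to take the guesswork out of this, sketched under assumptions: GCC's `-S` option writes out the assembly so you can look for call instructions, `-Winline` warns when a function declared inline could not be inlined, and the `always_inline` attribute asks the compiler to inline regardless of its heuristics. The helper names `read8` and `lda_abs` below are made up for illustration and are not taken from the m6502 core.

```c
#include <assert.h>
#include <stdint.h>

/* always_inline is a GCC/Clang attribute; fall back to plain inline elsewhere. */
#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

/* Hypothetical memory-read helper of the kind every opcode handler calls. */
FORCE_INLINE uint8_t read8(const uint8_t *mem, uint16_t addr)
{
    return mem[addr];
}

/* Hypothetical handler body: LDA absolute, with the little-endian operand
   stored at pc and pc + 1. Returns the byte that would be loaded into A. */
uint8_t lda_abs(const uint8_t *mem, uint16_t pc)
{
    assert(mem != 0);
    uint16_t lo = read8(mem, pc);
    uint16_t hi = read8(mem, (uint16_t)(pc + 1));
    return read8(mem, (uint16_t)(lo | (hi << 8)));
}
```

If the generated assembly for `lda_abs` contains no call to `read8`, the helper was inlined.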

Probably not affecting your speed much but why does BRA take a condition code?

If you have the extra space, you could try making a copy of your dispatch loop on line 201 at the end of every op code function. That way you have one jump per instruction instead of two and (hopefully) only get one pipeline stall instead of two.
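
A minimal sketch of that idea, assuming GCC's labels-as-values extension (computed goto) and a toy machine with only two real opcodes; it is not the m6502 core, and the names `ToyCpu`, `run_threaded` and `DISPATCH` are made up for illustration. Each handler ends with its own copy of the fetch-and-jump, so there is one indirect branch per emulated instruction and no jump back to a central loop.

```c
#include <assert.h>
#include <stdint.h>

/* Toy CPU state: just enough for LDA #imm (0xA9) and INX (0xE8). */
typedef struct { uint8_t a, x; uint16_t pc; } ToyCpu;

/* The dispatch is replicated at the end of every handler.
   &&label and goto *ptr are GNU extensions (GCC, Clang), not standard C. */
#define DISPATCH() goto *table[mem[cpu->pc++]]

void run_threaded(ToyCpu *cpu, const uint8_t *mem)
{
    static void *table[256];
    static int table_ready;

    assert(cpu != 0 && mem != 0);

    if (!table_ready) {
        for (int i = 0; i < 256; i++)
            table[i] = &&op_stop;      /* any other opcode stops the toy */
        table[0xA9] = &&op_lda_imm;
        table[0xE8] = &&op_inx;
        table_ready = 1;
    }

    DISPATCH();                        /* initial entry */

op_lda_imm:
    cpu->a = mem[cpu->pc++];           /* fetch the immediate operand */
    DISPATCH();

op_inx:
    cpu->x++;
    DISPATCH();

op_stop:
    return;
}
```

The cost is code size, since the dispatch sequence is repeated once per handler; whether it helps on the SH2 would have to be measured.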


Posted: Mon Nov 09, 2020 2:14 pm
Joined: Mon Nov 09, 2020 11:24 am
Posts: 12
BigEd wrote:
Welcome! I don't know anything about the Saturn's SH2 CPU or the quality of the C compiler, so I can't say much about what performance you might hope for.

I dream of 20-25% more speed.
Quote:
I'll take a look at the m6502 you're using - that's a new one for me. I'll add it to the reference section too.
(Do you have an upstream link for the source?)

Only this one, on GitHub.
Quote:
However, I'd suggest you take a look at Ian Piumarta's lib6502, as that has always looked good to me. There's a slightly enhanced fork here:
https://github.com/ZornsLemma/lib6502-sf
and the original is here:
https://www.piumarta.com/software/lib6502/

Not yet, I'll check it! Thanks!
Druzyek wrote:
This looks really cool!

I see you use static inline in some places. Have you looked at the assembly output to make sure it's actually inlined? Sometimes the compiler ignores it.

I didn't check the asm; I can only say the code is bigger and faster. I tried with and without, so I would say it's inlined.
Quote:
Probably not affecting your speed much but why does BRA take a condition code?

It's reused in multiple opcodes (BRA, BCC, BEQ, etc.).
Quote:
If you have the extra space, you could try making a copy of your dispatch loop on line 201 at the end of every op code function. That way you have one jump per instruction instead of two and (hopefully) only get one pipeline stall instead of two.

Yes, I have extra space; I'll try this!


Posted: Mon Nov 09, 2020 4:26 pm
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
vbt wrote:
BigEd wrote:
Welcome! I don't know anything about the Saturn's SH2 CPU or the quality of the C compiler, so I can't say much about what performance you might hope for.

I dream of 20-25% more speed.
Probably a silly question, but are your compiler flags set to optimize for speed?

One thing you could consider, if you have the RAM, is translating the machine code from 6502 to SH2 as the program executes. There's no way to reliably convert a whole program at once, since you can't tell what's code and what's data, but you could monitor the program counter and convert instructions at addresses that have been executed at least once. Then just watch out for self-modifying code, which I imagine is less common on the NES.
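
The bookkeeping for that could look roughly like this. It is only a sketch: the translator itself is reduced to a counter, a real one would store a pointer to generated SH2 code rather than a flag, and the names `Machine`, `visit` and `store` are invented here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 0x10000u

typedef struct {
    uint8_t  mem[MEM_SIZE];
    uint8_t  translated[MEM_SIZE]; /* 1 once the instruction at this address is translated */
    unsigned translations;         /* how many times the (stub) translator ran */
} Machine;

void machine_init(Machine *m)
{
    assert(m != 0);
    memset(m, 0, sizeof *m);
}

/* Called when the program counter reaches pc: translate on first visit only. */
void visit(Machine *m, uint16_t pc)
{
    if (!m->translated[pc]) {
        /* hypothetical: emit SH2 code for the instruction at pc here */
        m->translated[pc] = 1;
        m->translations++;
    }
    /* hypothetical: jump into the cached translation here */
}

/* Every emulated store goes through this, so self-modifying code is caught. */
void store(Machine *m, uint16_t addr, uint8_t value)
{
    unsigned a = addr;

    m->mem[a] = value;
    /* A 6502 instruction is at most 3 bytes long, so a write to addr can
       change an instruction that starts at addr, addr - 1 or addr - 2. */
    for (unsigned back = 0; back < 3 && back <= a; back++)
        m->translated[a - back] = 0;
}
```

Invalidating on every store is the simplest policy; if the code lives in ROM, as it mostly does on the NES, the check can be limited to stores that land in RAM.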


Posted: Mon Nov 09, 2020 4:59 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10802
Location: England
This might be of interest
https://github.com/Xenomega/NESgen
Static Recompilation of NES ROMs to C code

and maybe
Static recompilation of 6502 machine code from NES ROMs
viewtopic.php?f=2&t=2541

and this post with links
viewtopic.php?p=43049#p43049

maybe see also
A very fast BBC Micro emulator.
https://github.com/scarybeasts/beebjit


Posted: Mon Nov 09, 2020 7:39 pm
Joined: Mon Nov 09, 2020 11:24 am
Posts: 12
It's optimized with -O2; -O3 gives bad results.
I have 1 MB (a bit less) available for static recompilation; if that's enough, I can try.


Posted: Tue Nov 10, 2020 4:23 am
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
vbt, where did you get the copyright message you have in your files on github?


Posted: Tue Nov 10, 2020 7:28 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10802
Location: England
(Just to calibrate expectations: the Saturn has a pair of 28MHz SH2 processors, which are 32-bit RISC with 16 registers. The NES runs at 1.79MHz, and an emulator will need to emulate not only the CPU but also the PPU and APU, and any graphics format conversion. I suspect this will be pretty tricky!)
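
To put rough numbers on that budget, here is a small sketch that uses only the figures quoted in this thread (28 MHz host, 1.79 MHz NES CPU, 60 fps target). The function names are made up, and the model assumes one SH2 doing nothing but CPU emulation.

```c
#include <assert.h>

/* Host (SH2) cycles available for each emulated 6502 cycle. */
double host_cycles_per_guest_cycle(double host_hz, double guest_hz)
{
    assert(guest_hz > 0.0);
    return host_hz / guest_hz;
}

/* Emulated 6502 cycles that must be executed for each video frame. */
double guest_cycles_per_frame(double guest_hz, double fps)
{
    assert(fps > 0.0);
    return guest_hz / fps;
}
```

With these inputs, `host_cycles_per_guest_cycle(28e6, 1.79e6)` comes to about 15.6 SH2 cycles per 6502 cycle, and `guest_cycles_per_frame(1.79e6, 60.0)` to about 29,800 emulated cycles per frame, before any time is spent on the PPU, the APU or graphics conversion.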


Posted: Tue Nov 10, 2020 10:40 am
Joined: Mon Nov 09, 2020 11:24 am
Posts: 12
Druzyek wrote:
vbt, where did you get the copyright message you have in your files on github?

Most are from the original FBA (FinalBurn Alpha), which no longer exists; it became FBNeo.


Posted: Tue Nov 10, 2020 10:42 am
Joined: Mon Nov 09, 2020 11:24 am
Posts: 12
BigEd wrote:
(Just to calibrate expectations: the Saturn has a pair of 28MHz SH2 processors, which are 32-bit RISC with 16 registers. The NES runs at 1.79MHz, and an emulator will need to emulate not only the CPU but also the PPU and APU, and any graphics format conversion. I suspect this will be pretty tricky!)

For me it's possible: I already did a full-speed Master System emulator and the Z80 runs well, so why not the 6502?


Posted: Tue Nov 10, 2020 11:19 am
Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
vbt wrote:
BigEd wrote:
(Just to calibrate expectations: the Saturn has a pair of 28MHz SH2 processors, which are 32-bit RISC with 16 registers. The NES runs at 1.79MHz, and an emulator will need to emulate not only the CPU but also the PPU and APU, and any graphics format conversion. I suspect this will be pretty tricky!)

For me it's possible: I already did a full-speed Master System emulator and the Z80 runs well, so why not the 6502?

Depends how you compare the performance. A Z80 instruction takes between 4 and 23 clock cycles, so it executes fewer instructions per second (on average) than a 6502 (2 to 7 cycles/op), but many of those operate on 16-bit values, which makes it more efficient than a 6502 at some tasks.

This is why a 2MHz 6502 can outperform a 4MHz Z80.
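
A crude way to see the bounds, using only the cycle counts given above and assuming every instruction costs the same number of cycles (real programs mix cheap and expensive instructions, and the helper name is made up):

```c
#include <assert.h>

/* Instructions per second for a CPU clocked at clock_hz when every
   instruction takes cycles_per_op cycles (a deliberately crude model). */
double instr_per_sec(double clock_hz, double cycles_per_op)
{
    assert(cycles_per_op > 0.0);
    return clock_hz / cycles_per_op;
}
```

At the fast end, `instr_per_sec(2e6, 2)` and `instr_per_sec(4e6, 4)` both give 1,000,000 instructions per second. At the slow end, the 2MHz 6502 manages about 286,000 (7 cycles/op) against about 174,000 for the 4MHz Z80 (23 cycles/op). None of this accounts for how much work each instruction does.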

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


Posted: Tue Nov 10, 2020 11:25 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10802
Location: England
It will certainly be interesting to see progress! Those 6502 system simulators which aim to be cycle accurate are somewhat more expensive to run, so there's also that question: how accurate do you need to be, or choose to be? For example, running a scanline's worth of instructions and then catching up with the I/O is a lot less expensive, I think.
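
The scanline approach can be sketched like this, assuming NTSC NES timing (341 PPU dots per scanline, 3 dots per CPU cycle, 262 scanlines per frame). The CPU and video functions are stand-ins that only count, and every name is made up for illustration.

```c
#include <assert.h>

enum {
    DOTS_PER_SCANLINE   = 341, /* NTSC NES PPU dots per scanline */
    DOTS_PER_CPU_CYCLE  = 3,   /* NTSC NES: 3 PPU dots per CPU cycle */
    SCANLINES_PER_FRAME = 262
};

typedef struct {
    long cpu_cycles;     /* total CPU cycles executed so far */
    long lines_rendered; /* total scanlines the video side has caught up on */
    long dot_budget;     /* dots the CPU may still run; negative = overshoot */
} Emu;

/* Stand-in for the real core: pretends every instruction takes 2 cycles
   and returns the number of cycles consumed. */
int cpu_step(Emu *e)
{
    e->cpu_cycles += 2;
    return 2;
}

/* Stand-in for the PPU/I/O catch-up for one scanline. */
void video_render_scanline(Emu *e)
{
    e->lines_rendered++;
}

void run_frame(Emu *e)
{
    assert(e != 0);
    for (int line = 0; line < SCANLINES_PER_FRAME; line++) {
        /* Grant one scanline of time, then run whole instructions until
           it is used up. Overshoot stays in dot_budget and is deducted
           from the next line, so the timing does not drift. */
        e->dot_budget += DOTS_PER_SCANLINE;
        while (e->dot_budget > 0)
            e->dot_budget -= cpu_step(e) * DOTS_PER_CPU_CYCLE;
        video_render_scanline(e);
    }
}
```

The video side only sees CPU writes once per line, so effects that depend on changes in the middle of a scanline are lost; that is the accuracy being traded for speed.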


Posted: Tue Nov 10, 2020 5:46 pm
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
vbt wrote:
Druzyek wrote:
vbt, where did you get the copyright message you have in your files on github?
Most are from the original FBA (FinalBurn Alpha), which no longer exists; it became FBNeo.
I will steal this for my own projects if you have no objections.

BitWise wrote:
vbt wrote:
BigEd wrote:
(Just to calibrate expectations: the Saturn has a pair of 28MHz SH2 processors, which are 32-bit RISC with 16 registers. The NES runs at 1.79MHz, and an emulator will need to emulate not only the CPU but also the PPU and APU, and any graphics format conversion. I suspect this will be pretty tricky!)
For me it's possible: I already did a full-speed Master System emulator and the Z80 runs well, so why not the 6502?
Depends how you compare the performance. A Z80 instruction takes between 4 and 23 clock cycles, so it executes fewer instructions per second (on average) than a 6502 (2 to 7 cycles/op), but many of those operate on 16-bit values, which makes it more efficient than a 6502 at some tasks.

This is why a 2MHz 6502 can outperform a 4MHz Z80.
Looks like the Sega Master System had a 4MHz Z80, so it's in the same ballpark! I recently looked into the SH4 variant that comes in a calculator I bought a few weeks ago, which is a descendant of the SH2. Single-cycle instructions on 32-bit registers should be really fast!


Posted: Tue Nov 10, 2020 6:21 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10802
Location: England
Edit: I think we need a catch-all topic for license discussions: here's one. That replaces the comment I was about to make.

