 Post subject: Speed Freaks
PostPosted: Tue Sep 03, 2002 6:05 pm 
Are you a speed freak? Do you know of, or have you written, some blindingly fast and clever routine?
Here are some examples:
Calculate an 8 bit Julia iteration (involving complex numbers) in just *14 cycles*. The record was set by Chris Jam, of C64 "finely-sliced demo PAL" fame. He then beat his own record by 2 cycles by rederiving the formula, without cheating (such as by removing a 2 cycle CLC).
The 40 cycle 8 bit unsigned multiply, which any self-respecting hardcoder has reinvented. But have you made the version with a 44 cycle worst case, a 12 cycle best case, and an average in the 30s? Hint: use a jmp table...
How about the 190 cycle 16 bit unsigned multiply? Hint: just like the 8 bit, but done in optimal arrangement.
Finally, how about a 105 cycle 16/8 unsigned divide? Divide has proven to be slightly harder than multiply. Can you beat it at full (err...) accuracy?
See the routine below. Of course, using log tables you can make this quite fast, and the same goes for log, exp, sqr, sin and other functions...
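8 bit unsigned divide

        ldx #8          ; 8 quotient bits
        lda dividend
div     cmp divisor     ; carry set if A >= divisor
        bcc s           ; A < divisor: skip the subtract, shift in a 0
        sbc divisor     ; carry is set here, so this is exactly A - divisor
s       rol quotient    ; shift the result bit (carry) into quotient
        rol             ; shift A left; bit 0 comes from quotient's old bit 7
        dex
        bne div

timing in cycles:
overhead 5
"0" bit 18
"1" bit 20
range 148 to 164
average 156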
Unrolled, with the dividend already in the A register and the last iteration optimized: 105 cycles.
Each unrolled bit averages 14 cycles.
dividend, quotient, and divisor are 8 bit values stored in zero page.
The remainder is left in A.
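
The loop figures above can be rechecked with a little arithmetic. The Python sketch below is only a tally, not part of the routine: it assumes the usual 6502 cycle counts for zero-page operands and branches that do not cross a page boundary, and the names `CYCLES` and `divide_timing` are made up for illustration. It does not try to reproduce the 105 cycle total, since that depends on exactly how the last iteration is optimized.

```python
# Cycle counts for the 6502 instructions in the loop, assuming zero-page
# operands and branches that stay within one page.
CYCLES = {
    "ldx_imm": 2, "lda_zp": 3, "cmp_zp": 3, "sbc_zp": 3,
    "rol_zp": 5, "rol_a": 2, "dex": 2,
    "branch_taken": 3, "branch_not_taken": 2,
}

def divide_timing(bits=8):
    c = CYCLES
    overhead = c["ldx_imm"] + c["lda_zp"]
    shifts = c["rol_zp"] + c["rol_a"]
    loop_end = c["dex"] + c["branch_taken"]
    # Pass that branches around the sbc.
    skip = c["cmp_zp"] + c["branch_taken"] + shifts
    # Pass that falls through and executes the sbc.
    subtract = c["cmp_zp"] + c["branch_not_taken"] + c["sbc_zp"] + shifts
    # The last bne is not taken, which saves one cycle over the whole loop.
    saved = c["branch_taken"] - c["branch_not_taken"]
    return {
        "overhead": overhead,
        "looped_pass": (skip + loop_end, subtract + loop_end),
        "looped_range": (overhead + bits * (skip + loop_end) - saved,
                         overhead + bits * (subtract + loop_end) - saved),
        "unrolled_pass": (skip, subtract),
    }

timing = divide_timing()
print(timing)
```

Under these assumptions a looped pass costs 18 or 20 cycles, the whole loop 148 to 164, and an unrolled pass (no dex/bne) 13 or 15, which is where the 14 cycle average per bit comes from.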
Note: dividend/divisor must be < 2, so this is not a fully general
routine. To handle the full range, you must pre-adjust the dividend
to bring it into the right range, then fix up the quotient afterwards. This is
true of any shift-and-subtract divide. True divides are much slower.
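
To see what the range restriction means in practice, here is a small Python model of a shift-and-subtract loop of this shape. It is a sketch under stated assumptions, not the author's code: the subtract happens only when A >= divisor, quotient is cleared before the loop, and all registers are 8 bits wide. The function name `shift_subtract_divide` is hypothetical. In this model the quotient comes out as a fixed-point value with one integer bit and seven fraction bits, i.e. floor(128 * dividend / divisor), which is why the ratio has to stay below 2.

```python
def shift_subtract_divide(dividend, divisor, bits=8):
    """Model an 8-bit restoring shift-and-subtract divide loop."""
    a = dividend & 0xFF
    quotient = 0                      # assumption: cleared before the loop
    for _ in range(bits):
        carry = 1 if a >= divisor else 0      # cmp divisor
        if carry:
            a = (a - divisor) & 0xFF          # sbc divisor with carry set
        out = (quotient >> 7) & 1             # rol quotient: old bit 7 out,
        quotient = ((quotient << 1) | carry) & 0xFF   # result bit in
        a = ((a << 1) | out) & 0xFF           # rol A: bit 7 of A is lost
    return quotient, a

print(shift_subtract_divide(3, 2))   # (192, 0): 192/128 = 1.5
print(shift_subtract_divide(1, 3))   # (42, 4): 42/128 is about 0.328
```

Two things show up in the model. Because the loop shifts A once more after the last quotient bit, A ends up holding twice the remainder of 128 * dividend / divisor; dropping that final shift leaves the remainder itself. And for divisors above 128 the left shift of A can overflow eight bits (try 200 / 201), so the model is only exact for divisors up to 128.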
I have written two faster routines, but this is the fastest
algorithmic version. One of them uses decision tree optimization, takes
about 56 cycles, and requires over 256 bytes of code. The fastest version
requires tables which I cannot calculate without a computer.
I cannot even easily specify how to find them, since they do not come
from a formula.

_________________
c128,2xc64/64cycle/65cycleVIC,2x1541/8k,1571,1581


 Post subject: Re: Speed Freaks
PostPosted: Tue Sep 03, 2002 6:51 pm 
Hi,

This is exactly the kind of thing that the 6502.org Source Code Repository exists for. I'm very interested in collecting all kinds of generic 6502 source code like this for inclusion in the repository. Whatever you have, please send it to me or post it here.

Regards,
Mike

_________________
- Mike Naberezny (mike@naberezny.com) http://6502.org

