Re: Fast prime sieve implementation
Posted: Fri Oct 13, 2023 7:24 pm
I've done a 6502 implementation of B now, because it's the simplest one to do in assembler: https://github.com/gfoot/primesieve6502 ... mesieve2.s
However, surprisingly it seems to come in slower than my original 30-based one - it takes about twice as long as the optimised version, and I didn't expect that. Perhaps I did something silly in the code here.
Actually it's a good opportunity to try out hoglet's profiler - it showed that about 15% of time is spent in "clear_sieve", 15% in "do_sieve", and about 60% in "init_new_sieve", mostly in init_new_sieve's "elimloop". This is the loop eliminating multiples of low primes from newly-initialised sieves. I suspect it's particularly expensive for the small primes that appear many times in the sieves - 3, 5, 7, 11 - and avoiding this was the idea behind the 1155 trick in the BASIC version. Perhaps that would be more effective here than it was in BASIC.
Here's some of the output from the profiler - 030c is the start of init_new_sieve's elimloop:
I'm continually amazed by how useful hoglet's decoder manages to be!
However, surprisingly it seems to come in slower than my original 30-based one - it takes about twice as long as the optimised version, and I didn't expect that. Perhaps I did something silly in the code here.
Actually it's a good opportunity to try out hoglet's profiler - it showed that about 15% of time is spent in "clear_sieve", 15% in "do_sieve", and about 60% in "init_new_sieve", mostly in init_new_sieve's "elimloop". This is the loop eliminating multiples of low primes from newly-initialised sieves. I suspect it's particularly expensive for the small primes that appear many times in the sieves - 3, 5, 7, 11 - and avoiding this was the idea behind the 1155 trick in the BASIC version. Perhaps that would be more effective here than it was in BASIC.
Here's some of the output from the profiler - 030c is the start of init_new_sieve's elimloop:
Code: Select all
030c : 1613590 cycles ( 3.847899%) 806795 ins (2.00 cpi) ********************
030e : 4033975 cycles ( 9.619747%) 806795 ins (5.00 cpi) **************************************************
0310 : 1613590 cycles ( 3.847899%) 806795 ins (2.00 cpi) ********************
0311 : 2420385 cycles ( 5.771848%) 806795 ins (3.00 cpi) ******************************
0313 : 2420385 cycles ( 5.771848%) 806795 ins (3.00 cpi) ******************************
0315 : 2420385 cycles ( 5.771848%) 806795 ins (3.00 cpi) ******************************
0317 : 2420385 cycles ( 5.771848%) 806795 ins (3.00 cpi) ******************************
0319 : 2420385 cycles ( 5.771848%) 806795 ins (3.00 cpi) ******************************
031b : 2420385 cycles ( 5.771848%) 806795 ins (3.00 cpi) ******************************
031d : 1613590 cycles ( 3.847899%) 806795 ins (2.00 cpi) ********************
031f : 2415542 cycles ( 5.760299%) 806795 ins (2.99 cpi) *****************************