We (André and I) posted about this previously.
In addition to commercial and type-in offerings, three implementations have jumped out in the recent discussion:
- a version for the Atom was ported by Dave to the Beeb's second processor, and then converted to use Tony Finch's List Life algorithm, which had to be ported from C to 6502. This turns out to be nice and fast and offers a large 32k by 32k universe.
- Litwr brought in his Xlife-8 implementations, which use some of the tactics from the Unix program Xlife, again ported from C to 6502. This ends up having a smaller universe, but is the fastest we've seen, especially on small patterns.
- A friend of mine made available his 6502 code from 1988, which uses a different list-based approach, turning out to be faster than Dave's version but with a universe of 1k by 1k. There are ways to improve this, which Dave is presently pursuing. Indeed, Dave has a great diagram of the innards of the present code - see attached.
Here's a screen from Xlife-8:

Here's a screen from Life88:

Here's a video of some patterns running on these programs. Here's Xlife-8 running on a C64.
What we note, then, is
- the fastest evaluation is on a finite grid
- a list structure allows for a much larger universe
- a combination of a list with a bitmap gives the best result
- modern life programs use huge datastructures, huge tables, and 32-bit or 64-bit arithmetic, none of which are much help for a 6502.
- it's interesting to find a good way to partition computation and display, on a two-processor Acorn BBC.
I expect there's a lot more to be said, about algorithms used and ideas for improving them!