It does convert without using decimal mode?...
Yes, that's correct. Decimal mode is not used.
I notice, I think, that Bruce cut off the output just before the point where the algorithm needs an extra wrinkle. Which is only fair - he had a deadline!
Actually, I finished this quite a while ago. With a division routine that uses an 8-bit denominator, you can only get 7 more digits before the denominator (of the first call to DIV) exceeds 8 bits. I wrote a version that buffers the previous digit (with 8 bits, you don't reach the point where you need to count nines), and it's a little longer, but not much. (Counting nines is longer still, but not a lot.) Since there wasn't much difference between the two versions, I chose the shorter one to (hopefully) encourage as many people as possible to try it.
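For anyone who wants to see what "buffering the previous digit" and "counting nines" look like outside of 6502 assembly, here is a minimal Python sketch. It assumes the algorithm in question is the textbook Rabinowitz-Wagon style spigot; it is not a transcription of the 6502 program, and the names `pi_spigot`, `predigit` and `nines` are made up for illustration.

```python
def pi_spigot(n):
    """Return leading decimal digits of pi as a string, e.g. '31415...'.

    Sketch of a textbook spigot with a held-back digit and a nines counter.
    Trailing nines that are still unresolved at the end are not emitted,
    so the result can be slightly shorter than n.
    """
    length = 10 * n // 3 + 1      # number of mixed-radix places to carry
    a = [2] * length              # pi = 2.2222... in the mixed radix
    out = []
    predigit = None               # digit held back until it is safe to emit
    nines = 0                     # run of 9s held back behind predigit

    for _ in range(n):
        carry = 0
        # Multiply every place by 10 and normalise from right to left.
        for i in range(length - 1, 0, -1):
            x = 10 * a[i] + carry
            q, a[i] = divmod(x, 2 * i + 1)
            carry = q * i
        x = 10 * a[0] + carry
        q, a[0] = divmod(x, 10)

        if q == 9:
            nines += 1            # cannot decide yet, a carry may follow
        elif q == 10:
            # Carry ripples back: bump the held digit, the 9s become 0s.
            out.append(str(predigit + 1))
            out.append("0" * nines)
            predigit, nines = 0, 0
        else:
            # No carry possible any more: release what was held.
            if predigit is not None:
                out.append(str(predigit))
            out.append("9" * nines)
            predigit, nines = q, 0

    out.append(str(predigit))
    return "".join(out)
```

Calling `print(pi_spigot(30))` prints the leading digits with no decimal point. Dropping the `nines` counter gives the "buffer one digit" variant, and dropping `predigit` as well gives the shortest version, which is only correct until the first `q == 10` shows up.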
There's a (somewhat mathematical) paper here explaining the algorithm - which also gives us a clue as to how far we can get with 8-bit bytes, and how much further we might get with 16-bit bytes. The mul and div routines would need their loop counts of 8 changed to 16, at least.
A 65org16 version (which would have a 32-bit intermediate result, although 24 bits is sufficient, and in which "P" would be an array of 16-bit values rather than 8-bit values) with buffering and counting can generate over 250 times as much output as an 8-bit version (and over 300 times as much as the above). Or, as you suggested, you could just change the LDY #8 to LDY #16 in MUL and DIV and be done with it.
And there was me thinking that it must use some previously unknown secret feature of the 6502 that made it start doing arithmetic in base pi.
2.222... = pi in base pi/(pi-2) (≈ 2.752), not base pi.
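A quick numeric check of that, as a sketch: in a fixed base b, 2.222... is the geometric series 2 + 2/b + 2/b^2 + ..., which sums to 2b/(b-1), and setting that equal to pi gives b = pi/(pi-2). The variable names below are just for illustration.

```python
import math

# Base in which the repeating expansion 2.222... equals pi.
b = math.pi / (math.pi - 2)

# Closed form of the geometric series 2 + 2/b + 2/b**2 + ...
closed_form = 2 * b / (b - 1)

# Partial sum of the same series; 200 terms is far more than needed.
partial = sum(2 / b**k for k in range(200))

print(b, closed_form, partial)
```

Both `closed_form` and `partial` should agree with `math.pi` to within floating-point rounding.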