BigEd wrote:
Thinking about the 10x problem, without actually coding it, I felt that it should be possible to use two 256 byte tables - usually the fastest way! Each byte when multiplied by ten gives you two bytes: the LSB is a result byte and the MSB is to be held over to add into the next byte up.
One of my design goals is that I'd like to be able to have a version of this without too many extra constraints as part of an interpreter that runs in 4K of RAM with no ROM. (The 8080 and 6800 versions would be a direct competitor to Altair (MS) 4K BASIC. By 1977 there were a ton of 6502 computer variants, but were the interpreter a third party product for an Apple I, PET 2001 or Apple II, it would also want to fit into 4 KB RAM, though it might be able to use a few ROM routines.)
That said, though it wouldn't be something I'd be looking at before getting smaller versions working, yes, I should keep tables and the like in mind as a possibility for some systems where I might have extensive ROM space. So thanks for that.
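For the record, the two-table approach is easy to model in a higher-level language. Here's a rough Python sketch (table and function names are my own, little-endian byte order assumed): each byte of the bignum indexes two precomputed 256-byte tables holding the low and high bytes of 10×b.

```python
# Sketch of the two-table x10: for each byte value b, 10*b fits in
# 16 bits, so precompute both halves once.
X10_LO = [(10 * b) & 0xFF for b in range(256)]   # result byte
X10_HI = [(10 * b) >> 8 for b in range(256)]     # byte held over to the next byte up

def x10_by_table(num):
    """Multiply a little-endian byte array by 10 using the tables."""
    out, carry = [], 0
    for b in num:
        t = X10_LO[b] + carry            # low table plus held-over high byte
        out.append(t & 0xFF)
        carry = X10_HI[b] + (t >> 8)     # high table plus any ripple carry
    if carry:
        out.append(carry)                # grow by one byte on overflow
    return out
```

The catch, of course, is that the tables themselves cost 512 bytes, which is why this only makes sense where ROM space is plentiful.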
_____________________________________________________
Before replying to your other comments, here's a caveat: I still seem to be having great difficulty wrapping my head around all the corners of multi-precision two's-complement arithmetic (and that's been a source of many of my problems). But I'll present some thoughts based on my experience writing the code linked/attached above. You can see the initial versions of the x10 multiply (bi_x10u) and the digits-only decimal read (bi_read_udec) in this commit. (You can see the whole history of my work on this on the dev/cjs/191126/bi-decimal branch in the repo, though I warn you that much of it is a hot mess and may make your head hurt.)
Quote:
It's possible that multiplying a negative number is a whole different thing....
As it turns out, not at all, as far as I can tell.
I had originally thought that the x10 multiply was unsigned, as you can see from my early code, where the routine was called bi_x10u. I realized days later, near the end of all this coding, that in classic 6502 style it actually turns out to be both signed and unsigned: you're free to look at it either way. The only trick (and it shares this in common with the 6502 ADC and SBC instructions) is that if you are treating the value as signed and you overflow or underflow the MSB by one bit, the value is still correct as long as you sign-extend it with an additional byte. (For signed values with ADC/SBC, the overflow flag indicates loss of the sign bit. With those instructions the value is still always correct when sign-extended with another byte, because you can never overflow the value itself: the A register holds 8 bits, but a signed input is never more than 7 bits in absolute value.*)
I thought I'd explained this in the comments, but it turns out I hadn't, so I've added the following to the header comment for that function:
Code:
; Like the 6502 ADC/SBC instructions, this does not care whether you are
; multiplying a signed or unsigned value; you decide which way you want
; to treat it. This allows you an extra bit of precision if you have
; separately kept track of the sign of the input; you can safely overflow
; the sign bit out of the MSB and sign-extend afterwards. For example,
; input [$EB, $38] results in [$30, $30]. If you were considering your
; input to be unsigned (60216₁₀) it overflowed and the result is
; incorrect. But if you were considering your input to be signed
; (-5320₁₀) only the sign bit "overflowed"; you know the output is
; negative and can simply sign-extend to $FF3030 to have the
; correct result.
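To convince myself the example holds, here's a quick Python model of a fixed-width ×10 (a sketch of the idea, not the actual routine), using a little-endian byte array as the bignum representation:

```python
def x10_fixed(num):
    """Multiply a little-endian byte array by 10 in fixed width,
    dropping the final carry, as an in-place 6502 routine would."""
    out, carry = [], 0
    for b in num:
        t = b * 10 + carry
        out.append(t & 0xFF)
        carry = t >> 8
    return out                      # carry out of the MSB is discarded

def as_signed(byts):
    """Interpret a little-endian byte array as a two's-complement value."""
    n = int.from_bytes(bytes(byts), "little")
    bits = 8 * len(byts)
    return n - (1 << bits) if n & (1 << (bits - 1)) else n

# -5320 is $EB38, i.e. [0x38, 0xEB] little-endian.
result = x10_fixed([0x38, 0xEB])    # gives [0x30, 0x30]: wrong read unsigned,
extended = result + [0xFF]          # but sign-extending gives $FF3030 = -53200
```

Note that x10_fixed never looks at the sign at all; only the caller's interpretation (and the sign-extension byte) differs between the signed and unsigned cases.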
Quote:
...but in the case of initialising a bignum from a decimal string, I'd be inclined first to build the positive value from the digits and then invert the bignum if it's to be negative. One could, possibly, compress this into a single pass for the negative case...
And that's just how I was first inclined, too. This was exactly the way I'd started my original implementation, which is why bi_read_decdigits was originally named bi_read_udec and did only adds. My intent was, exactly as you say, to invert the value during the normalization pass, where I have to copy the read-digits result to remove redundant leading sign bytes anyway. Unfortunately I ran into difficulty with this, and so I rewrote the digits reader to read positive or negative values, making it longer along the way because I now needed separate code for the add loop and the subtract loop.
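The two-loop approach can be modelled in Python roughly as follows (the name, fixed four-byte width, and structure are my own simplifications, not the actual bi_read_decdigits code): for negative input, each digit is subtracted instead of added, so the two's-complement value is built up directly.

```python
def read_dec_digits(digits, negative=False):
    """Accumulate ASCII decimal digits into a little-endian byte array,
    adding each digit for positive input and subtracting for negative,
    so a negative value is built directly in two's-complement form."""
    acc = [0, 0, 0, 0]                  # fixed 32-bit accumulator for the sketch
    for ch in digits:
        d = ord(ch) - ord("0")
        carry = 0                       # x10 pass; carry out of the MSB dropped
        for i in range(len(acc)):
            t = acc[i] * 10 + carry
            acc[i] = t & 0xFF
            carry = t >> 8
        step = -d if negative else d    # ADC-style add or SBC-style subtract
        for i in range(len(acc)):
            t = acc[i] + step
            acc[i] = t & 0xFF
            step = (t >> 8) if t >= 0 else -1   # carry or borrow to propagate
            if step == 0:
                break
    return acc
```

In the 6502 version the add and subtract propagation really are two separate ADC and SBC loops, which is where the extra code size comes from.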
Looking at it now, I think you're right: it should be possible always to do an unsigned digits read and then do a straight two's-complement inversion (EOR #$FF and add one) during the normalization pass, followed by a sign-extension byte if necessary. But I'm still not quite wrapping my head around the details of that. Also, I'm not sure there's actually any saving in code size, since the instructions saved in the digits reader might be lost to the additional complexity of the negation in the normalization copy pass. And it might even be slower, since per-byte calculations would now be done during normalization that previously had already been done for us by the separate SBC loop in the read-digits pass.
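The inversion-during-copy idea would look something like this in Python (again my own sketch, assuming a little-endian byte array): complement each byte as it's copied and propagate an initial carry of one, which is exactly two's-complement negation done in a single pass.

```python
def negate_copy(num):
    """Two's-complement negate a little-endian byte array while copying:
    complement each byte (EOR #$FF) and add one via a running carry."""
    out, carry = [], 1
    for b in num:
        t = (b ^ 0xFF) + carry
        out.append(t & 0xFF)
        carry = t >> 8
    return out
```

The "sign-extension byte if necessary" part comes in when the unsigned read left the top bit of the MSB set (e.g. reading "200" into a single byte): the negated copy then needs an extra $FF byte appended to keep the value negative.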
__________
* Bruce Clark's "The Overflow (V) Flag Explained" probably explains exactly this, but I have to admit I didn't really understand it from my first reading of the article, or even after my first go in detail at doing 6502 arithmetic.