Page 1 of 2
short binary to hex char using bcd mode
Posted: Fri Apr 16, 2021 8:03 pm
by fachat
That is probably old news here but I didn't know and found it clever use of BCD mode.
https://twitter.com/adumont/status/1381 ... 02785?s=19
The code is brilliant. For 0-9 the first step gets $90 to $99, which are the uppermost values BCD can represent in a byte.
For 10-15 this gets, through the overflow, then $00 to $05, plus carry.
Adding $40 in the second step results in $30-$39 and $41-$46 (due to the carry).
There is no branch but sed and cld add cycles. Still have to compare with a classic conversion.
But nevertheless clever code.
André
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 6:29 am
by BigDumbDinosaur
That algorithm has been around for decades. When I wrote Supermon 816 I considered that method, as well as the following, which I first encountered back in the late 1970s:
Code: Select all
.0000010 cmp #10 (2)
bcc .0000020 (2, 3 or 4*)
;
adc #'f' (2)
;
.0000020 eor #'0' (2)
*6502/65C02 only if branching over a page boundary
The numbers in parentheses are Ø2 cycles. The total for the above would require 8 cycles, worst case. It would run slightly faster on the 65C816 in native mode, 7 cycles, best case.
In contrast, the other algorithm:
Code: Select all
sed (2)
clc (2)
adc #$90 (2 or 3*)
adc #$40 (2 or 3*)
cld (2)
*3 cycles if a 65C02
The above would take a constant 10 cycles on an NMOS part or 65C816, or 12 cycles on a 65C02.
In both cases, it is assumed the nybble to be converted is in .A in the least-significant-nybble position.
EDIT: Fixed my cycle counting. 
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 6:34 am
by BillG
That technique is as old as dirt. This was used in at least two places within the DDT debugger of CP/M-80:
Code: Select all
04E9 C6 90 [7] 00350 adi 90h
04EB 27 [4] 00351 daa
04EC CE 40 [7] 00352 aci 40h
04EE 27 [4] 00353 daa
The 6502 version:
Code: Select all
0200 18 [2] 00003 clc
0201 F8 [2] 00004 sed
0202 69 90 [2] 00005 adc #$90
0204 69 40 [2] 00006 adc #$40
0206 D8 [2] 00007 cld
The brute force version:
Code: Select all
0207 18 [2] 00009 clc
0208 69 30 [2] 00010 adc #'0'
020A C9 3A [2] 00011 cmp #'9'+1
020C 90 02 (0210) [2/3] 00012 blo 2f
00013
020E 69 06 [2] 00014 adc #'A'-'0'-10-1
00015
0210 00016 2
The "clever" method is one cycle faster for 0..9 and two bytes smaller.
The fastest way on the 6502 at the expense of more bytes is:
Code: Select all
0210 AA [2] 00018 tax
0211 BD 0214 [4/5] 00019 lda Hexits,X
00020
00021 ;
00022 ;
00023 ;
00024
0214 3031323334353637 00025 Hexits db '0123456789ABCDEF'
021C 3839414243444546
The 8080 and 6800 cannot do that...
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 6:37 am
by BillG
That algorithm has been around for decades. When I wrote Supermon 816 I considered that method, as well as the following, which I first encountered back in the late 1970s:
Code: Select all
.0000010 cmp #10 (2)
bcc .0000020 (2, 3 or 4*)
;
adc #'f' (2)
;
.0000020 eor #'0' (2)
*6502/65C02 only if branching over a page boundary
The numbers in parentheses are Ø2 cycles. The total for the above on a 6502/65C02, worst case, would be 10 cycles, or 8 cycles, best case. On the 65C816 in native mode, worst case would be 9 cycles, or 8-cycles, best case.
Ooh, that is good. I have never seen that before...
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 7:08 am
by BigDumbDinosaur
The 6502 version:
Code: Select all
0200 18 [2] 00003 clc
0201 F8 [2] 00004 sed
0202 69 90 [2] 00005 adc #$90
0204 69 40 [2] 00006 adc #$40
0206 D8 [2] 00007 cld
Decimal addition on the 65C02 of an immediate mode operand costs 3 cycles, not 2. So there is a performance hit if using the 'C02. In a single conversion, it's no big deal. However, if dumping a page of memory to the screen, there would be 512 conversions for the data, plus 64 conversions for the addresses associated with the data, assuming 16 bytes dumped per line. Total cycle consumption for conversion with the 'C02 would be 6912. In contrast, it would be 5760 with an NMOS 6502 or the 65C816 running in native mode. The numbers, of course, don't include the overhead of JSR and RTS, or the shifting and masking required to extract a nybble from a byte.
That said, the branch-and-add algorithm I illustrated is generally faster on the 65C816, since that MPU doesn't incur a cycle penalty when branching across a page boundary (one of the sneaky benefits of having a 16-bit ALU). Also, there is one less instruction to execute, for an automatic 2 cycle saving.
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 7:24 am
by BillG
From
https://twitter.com/LIV2/status/1381885385085755392
who got it from
http://www.6502.org/tutorials/decimal_mode.html#6
Code: Select all
0224 F8 [2] 00027 sed
0225 C9 0A [2] 00028 cmp #10
0227 69 30 [2] 00029 adc #'0'
0229 D8 [2] 00030 cld
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 7:50 am
by BigDumbDinosaur
Funny how the stuff posted here finds its way to other places, where it is then occasionally claimed to be original thinking.
Virtually every 6502 algorithm I've seen posted here or elsewhere was known at least 40 years ago.
Code: Select all
0224 F8 [2] 00027 sed
0225 C9 0A [2] 00028 cmp #10
0227 69 30 [2] 00029 adc #'0'
0229 D8 [2] 00030 cld
Again, addition (and subtraction) in decimal mode on a 65C02 costs an extra cycle, ergo the ADC #'0' is a 3-cycle instruction, not 2. So the total cycle consumption of the above with the 'C02 would be 9 cycles. Ironically, it would be faster on the NMOS processor.
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 8:14 am
by BigEd
Or we could be happy that people are discovering and sharing knowledge about programming the 6502.
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 9:03 am
by GARTHWILSON
Keep 'em coming! There are occasionally still ones that are new to me, after 40 years of this. It reminds me of something a boss I had 30 years ago said, which was that in the early days of programmable calculators when memory was very limited, they'd have contests to see how short of a program people could come up with to do a certain job; and just when they thought they had absolutely the shortest possible program, in came someone with one that was only half as long, and by that time it was hard to tell why it even worked at all.
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 6:31 pm
by drogon
I like reading these sorts of threads - mostly because it highlights my old saying of ask 10 programmers, get 11 answers...
It also demonstrates just how varied our world is with different people coming up with different ideas and styles - for different reasons....
My own coding style is often somewhat verbose and easy to read - and that's intentional. Until I'm really pressed for space I tend to not code for space efficiency (or even speed - until I really need it) so my preferred hex print code looks like:
Code: Select all
;********************************************************************************
; oHex4: oHex8:
; Output the A register as 1 or 2 hex digits
; Preserves X, Y, A has last character printed.
;********************************************************************************
_oHex8:
pha ; Temp. save
lsr a ; A := A >> 4
lsr a
lsr a
lsr a
jsr _oHex4 ; Print top 4 bits as hex
pla ; Restore A and fall into ...
_oHex4:
and #$0F ; Isolate bottom 4 bits
phy
tay
lda hexTable,y
ply
jmp osWrch ; and return
hexTable: .byte "0123456789ABCDEF"
It's easy to read and follow, but yes, is a bit longer and has some extra guards in it to preserve Y - which in a more tighter system may be eliminated, but, for me, its easy to read now and re-read years later and easy to understand.
(and interesting choice of me using Y here - vs the use of X above - no difference in this instance, but just an interesting thing I note)
When I (re) started my 6502 love, I write my own 'wozmon' - it was much bigger than the Woz original, even bigger than the 2K one in the Apple II, but back then what were the constraints? RAM and ROM were very expensive, so Woz had to save every byte he could. I don't have that restriction, so is that making me lazy? Who knows.
I'd actually suggest I'm more lazy because I've never bothered with decimal mode on the 6502 - I've never, ever had a use for it in it's intended use-case, but it's very interesting to see little hacks like this.
-Gordon
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 8:23 pm
by fachat
Haha, I knew it would be old news here
But interestingly for a while here I still thought it would be the shortest one, but then came the BCD version here with 8 bytes
http://www.6502.org/tutorials/decimal_mode.html#6
I'm interested what the comment on that page relates to regarding "relying on undocumented behaviour"? The decimal mode is well documented, so why should this be undocumented?
André
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 8:30 pm
by BigEd
I think decimal mode is documented when the hex digits are in the range 0-9, but when they are A-F, the machine does whatever it does - and that's the sort of thing where different implementations can differ.
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 8:43 pm
by fachat
That algorithm has been around for decades. When I wrote Supermon 816 I considered that method, as well as the following, which I first encountered back in the late 1970s:
Code: Select all
.0000010 cmp #10 (2)
bcc .0000020 (2, 3 or 4*)
;
adc #'f' (2)
;
.0000020 eor #'0' (2)
*6502/65C02 only if branching over a page boundary
The numbers in parentheses are Ø2 cycles. The total for the above on a 6502/65C02, worst case, would be 10 cycles, or 8 cycles, best case. On the 65C816 in native mode, worst case would be 9 cycles, or 8-cycles, best case.
Ooh, that is good. I have never seen that before...
Ha, I see BDD has edited out the max 10 cycles in the original post that I tripped over here

Because if branch is taken the ADC does not count, so it's 8 max.
André
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 8:58 pm
by fachat
So, let's compare the different versions
1. BCD version
- size: 6
- NMOS: 8 cycles
- CMOS: 9 cycles (ADC in decimal mode is once add. cycle)
Note: behaviour in decimal mode for values 10-15 typically work, but are actually not defined.
2. Branch version
Code: Select all
CMP #$0A
BCC SKIP
ADC #$66
SKIP EOR #$30
- size: 8
- NMOS: 7 cycles for digits 0-9, +1 if page is crossed on branch. 8 cycles for digits A-F
- CMOS: 7 cycles for digits 0-9, +1 if page is crossed on branch. 8 cycles for digits A-F
Note: different versions exist where e.g. the EOR is replaced with ADC (and values adapted) but they have the same properties.
3. Table version
Code: Select all
AND #$0F
TAY
LDA HTAB,y
...
HTAB .BYT "0123456789ABCDEF"
- size: 22 (assuming HTAB is not in zeropage, in which it would be 21, but 16 of them ZP)
- NMOS: 8 cycles (+1 if index crosses page boundary)
- CMOS: 8 cycles (+1 if index crosses page boundary)
Note: additional code and cycles are needed to protect YR (typically easier per byte, not nibble)
Summary
- The BCD version is the shortest one with 6 bytes, with the additional benefit of using constant time (per CPU type) of 8 or 9 cycles, but relying on potentially undocumented behaviour
- The Branch version is the fastest one with 7-8 cycles, with variable execution time, with 8 byte size.
- The table version is the largest, but has constant time of 8 (+1 if page crossing, which can be controlled by placing code) without BCD mode
Re: short binary to hex char using bcd mode
Posted: Sat Apr 17, 2021 9:55 pm
by BillG
3. Table version
Code: Select all
AND #$0F
TAY
LDA HTAB,y
...
HTAB .BYT "0123456789ABCDEF"
- size: 22 (assuming HTAB is not in zeropage, in which it would be 21, but 16 of them ZP)
- NMOS: 8 cycles (+1 if index crosses page boundary)
- CMOS: 8 cycles (+1 if index crosses page boundary)
Note: additional code and cycles are needed to protect YR (typically easier per byte, not nibble)
Don't the other methods also need to do the "AND $0F"?
Also, if we do both nybbles in a byte, the cost of the table is only an additional 8 bytes per instance.