String Comparison for PASCAL style Strings

Wheels
Posts: 2
Joined: 30 Jul 2005

Post by Wheels »

Source code is in MERLIN format but should be easy to adapt.

    1 * String Comparison Routine for PASCAL strings
    2 * INPUT *
    3 * Pointer to First String in STRING1, STRING1+1
    4 * Pointer to Second String in STRING2, STRING2+1
    5 * STRING1 and STRING2 must be located on the ZERO PAGE
    6 * TEMP can be anywhere in memory
    7 * OUTPUT *
    8 * If String1 < String2 then carry clear; else carry set
    9 * If String1 = String2 then Z set; else Z clear
   10 * Destroys A-REG and Y-REG; X-REG unchanged
   11 STRING1  EQU   6
   12 STRING2  EQU   8
   13 TEMP     EQU   9
   14 PSTRCMP  LDY   #0
   15          LDA   (STRING1),Y
   16          CMP   (STRING2),Y
   17          BCC   :1
   18          LDA   (STRING2),Y
   19 :1       STA   TEMP
   20          LDY   #1
   21 :2       LDA   (STRING1),Y
   22          CMP   (STRING2),Y
   23          BNE   :3
   24          INY
   25          CPY   TEMP
   26          BCC   :2
   27          LDY   #0
   28          LDA   (STRING1),Y
   29          CMP   (STRING2),Y
   30 :3       RTS
A slightly shorter and faster 65C02 version of this code can be created by removing lines 14 and 27, and removing the ",Y" from the end of every line EXCEPT 21 and 22.
Wheels
Posts: 2
Joined: 30 Jul 2005

Post by Wheels »

Also, if anyone can modify this routine to deal with C-style strings of any length, I would greatly appreciate seeing it.
kc5tja
Posts: 1706
Joined: 04 Jan 2003

Post by kc5tja »

Wheels wrote:
Also, if anyone can modify this routine to deal with C-style strings of any length, I would greatly appreciate seeing it.
Untested... (in ACME syntax)

T0 = $00
T1 = $02
T2 = $04

strcmp:
   ; T0 = input string 1 (255 chars or less)
   ; T1 = input string 2 (255 chars or less)
   ; 
   ; on exit, flags should be set as expected for CMP.
   ;  C flag clear if string 1 < string 2
   ;  Z flag set if string 1 = string 2
   ;
   ; I treat all registers as effectively destroyed.
   ;  A is trashed outright.
   ;  X is untouched, but should never depend on that anyway.
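   ;  Y contains the index where the loop stopped: the first mismatch
   ;    or the first terminating NULL. When the strings match up to
   ;    that NULL, this is the length (same semantics as strlen()).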
 

   ldy #$00

.loop:
   lda (T1),y
   beq .different   ; Terminating NULLs always break the loop
   lda (T0),y
   beq .different   ; Terminating NULLs always break the loop
   cmp (T1),y
   bne .different   ; Mismatches also break the loop
   iny
   jmp .loop

.different:
   lda (T0),y   ; reset the flags accordingly
   cmp (T1),y
   rts

;
; As a bonus: strlen -- get length of string, excluding NULL byte.
; Since we're relying on strcmp() to get the length, it's not as fast
; as a native implementation of strlen(), and it destroys T1 too.
;

strlen:
   lda T0
   sta T1
   lda T0+1
   sta T1+1
   jmp strcmp
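
A caller might use the routine along these lines. This is only a sketch in the same ACME syntax, and just as untested as the routine itself; the labels compare_example, str_a, str_b, .equal, .less and .greater are made-up names for illustration, not part of the code above.

```asm
; Hypothetical caller, assuming strcmp, T0 and T1 as defined above.
compare_example:
   lda #<str_a      ; low byte of string 1's address
   sta T0
   lda #>str_a      ; high byte of string 1's address
   sta T0+1
   lda #<str_b      ; same again for string 2
   sta T1
   lda #>str_b
   sta T1+1
   jsr strcmp
   beq .equal       ; Z set: strings are identical
   bcc .less        ; C clear: string 1 < string 2
.greater:           ; neither: string 1 > string 2
   lda #$01
   rts
.equal:
   lda #$00
   rts
.less:
   lda #$ff
   rts

str_a: !text "apple", 0   ; zero-terminated test data
str_b: !text "apply", 0
```

With this test data the strings first differ at index 4 ('e' against 'y'), so the .less path is taken. BEQ has to be tested before BCC, as above.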
dclxvi
Posts: 362
Joined: 11 Mar 2004

Post by dclxvi »

In the length-byte-before-string comparison routine above:

1. With the EQUs above, TEMP overlaps with the high byte of the STRING2 pointer.
2. It doesn't compare the last byte of the string when the length byte is greater than 1.
3. If both length bytes are $00, it can loop endlessly.
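
To make point 1 concrete, here is a sketch of a zero-page layout in which the scratch byte no longer shares an address with the pointer. The choice of $0A is only an assumption; any free byte outside $06-$09 would do.

```asm
STRING1  EQU   6      ; pointer uses $06 (low) and $07 (high)
STRING2  EQU   8      ; pointer uses $08 (low) and $09 (high)
TEMP     EQU   $0A    ; assumed free byte, clear of STRING2+1 at $09
```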

A couple of optimizations are also possible: (a) X can be used instead of TEMP (so X is no longer preserved), and (b) the LDY #1 can be eliminated. Here is the updated code:

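   LDY #0
   LDA (STRING1),Y ; A = length of string 1
   CMP (STRING2),Y
   BCC :1          ; string 1 is shorter: keep its length
   LDA (STRING2),Y ; otherwise use the length of string 2
:1 TAX             ; X = number of bytes to compare
   BEQ :3          ; nothing to compare: decide on the lengths
:2 INY
   LDA (STRING1),Y
   CMP (STRING2),Y
   BNE :4          ; mismatch: flags are already set
   DEX
   BNE :2
:3 LDA (STRING1,X) ; X = 0 here, so this reads the length bytes
   CMP (STRING2,X)
:4 RTS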
Here is a slightly optimized version of the routine that compares strings (of any length) terminated by $00. Same inputs and outputs as above.

   LDY #0
:1 LDA (STRING1),Y
   BEQ :2
   CMP (STRING2),Y
   BNE :3          
   INY             ; note: C=1 since CMP was equal
   BNE :1
   INC STRING1+1
   INC STRING2+1
   BCS :1          ; always branches
:2 CMP (STRING2),Y
:3 RTS
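
For what it's worth, a call to the $00-terminated routine could look something like the sketch below. It assumes the routine has been given an entry label on its LDY #0 line, here called ZSTRCMP; that label, along with CALLER, STRA, STRB, SAME and LESS, is a hypothetical name and not part of the code above. STRING1 and STRING2 are assumed to be the zero-page EQUs from the first listing.

```asm
* Hypothetical caller; ZSTRCMP is an assumed entry label.
CALLER   LDA   #<STRA      ; low byte of first string's address
         STA   STRING1
         LDA   #>STRA      ; high byte of first string's address
         STA   STRING1+1
         LDA   #<STRB      ; same again for the second string
         STA   STRING2
         LDA   #>STRB
         STA   STRING2+1
         JSR   ZSTRCMP
         BEQ   SAME        ; Z set: strings are equal
         BCC   LESS        ; C clear: first string is lower
         LDA   #1          ; otherwise first string is higher
         RTS
SAME     LDA   #0
         RTS
LESS     LDA   #$FF
         RTS
STRA     ASC   'CAT'
         DFB   0           ; $00 terminator
STRB     ASC   'CATS'
         DFB   0
```

Since the routine does INC STRING1+1 and INC STRING2+1 whenever Y wraps, the pointers may have changed on return for strings longer than 255 bytes, so it is safest to reload both before every call, as this sketch does.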