I've done it as a less generalised function, Yuri, based on what I need to achieve rather than an exact copy of strncasecmp:
Code: Select all
is_token:
        ldy #0
istk_1:
        lda (tok_ptr),y
        beq istk_x          ; while *tok_ptr != 0
        lda (txt_ptr),y
        ora #0x20           ; fold the text to lower case
        cmp (tok_ptr),y     ; does it match?
        bne istk_no         ; nope
        iny
        bra istk_1          ; else back for next
istk_no:
        ldy #0
istk_x:
        rts
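is_token relies on two pointers initialised off-stage: txt_ptr points into a block of text which _may_ be the start of a token, and tok_ptr points to the particular zero-terminated token string to compare it against. The text byte is ORed with 0x20 before the compare, so letters match regardless of case provided the token strings are stored in lower case. If the text matches, it returns with Y holding the length of the token text (not counting the zero); if not, Y = 0.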
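For anyone who reads C more easily than 65C02, here is a rough equivalent of is_token. It is only a sketch of the logic above, not a drop-in replacement: the parameter names and the return-by-value are my own choices, and it assumes the token string is stored in lower case.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the token length if the text at txt starts with the
 * zero-terminated token tok (ignoring letter case), else 0.
 * Assumption: tok is stored in lower case. */
size_t is_token(const char *txt, const char *tok)
{
    size_t y = 0;

    assert(txt != NULL && tok != NULL);
    while (tok[y] != '\0') {               /* beq istk_x */
        if ((txt[y] | 0x20) != tok[y])     /* ora #0x20 / cmp / bne istk_no */
            return 0;
        y++;
    }
    return y;                              /* length, not counting the zero */
}
```

One thing the C version makes easy to see: the fold is applied to every byte, not only letters, so for instance 0x0d becomes 0x2d ('-'). Whether that matters depends on what is in the token table.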
That's a helper function for match_token:
Code: Select all
match_token:
        phx
        ldx #LET            ; token count
        sta txt_ptr
        sty txt_ptr+1       ; pointer to text to match
        lda #lo tokens
        sta tok_ptr
        lda #hi tokens
        sta tok_ptr+1       ; pointer to start of token table
matk_1:
        jsr is_token        ; a match yet?
        cpy #0
        bne matk_yes        ; yes
matk_2:
        iny                 ; else
        lda (tok_ptr),y
        bne matk_2          ; scan for rest of token
        sec                 ; <--- not a bug! the carry adds 1 to skip the zero
        tya                 ; find the next token in list
        adc tok_ptr
        sta tok_ptr
        bcc matk_3
        inc tok_ptr+1
matk_3:
        inx
        cpx #LAST_KW
        bcc matk_1          ; back for next token
        txa                 ; we've tried them all: a = LAST_KW
        ldy #0
        bra matk_x
matk_yes:                   ; we found a match - y has the token length
        txa                 ; how many tests?
        ora #0x80           ; set high bit as token enum
matk_x:
        plx
        rts                 ; and back
which checks the text at txt_ptr against each token in the token table in turn. If one matches, it returns the associated token value in A (0x80 upwards) and the length of the token in Y; if none matches, it returns LAST_KW in A and zero in Y. (And I've just noticed that the last ora #0x80 is no longer required since I fixed a bug earlier!)
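The table walk in match_token can be sketched in C along the same lines. Treat everything concrete here as an assumption for illustration: the three-entry table, the values of LET and LAST_KW, and returning the length through a pointer are all made up, and the real table is obviously longer.

```c
#include <assert.h>
#include <stddef.h>

enum { LET = 0x80, LAST_KW = 0x83 };       /* hypothetical values */

/* Hypothetical table: zero-terminated strings packed back to back. */
static const char tokens[] = "let\0if\0print";

/* Same contract as is_token: token length on a match, else 0. */
static size_t is_token(const char *txt, const char *tok)
{
    size_t y = 0;
    while (tok[y] != '\0') {
        if ((txt[y] | 0x20) != tok[y])
            return 0;
        y++;
    }
    return y;
}

/* Returns the token value, or LAST_KW if nothing matches.
 * *len receives the token length, or 0 if nothing matches. */
int match_token(const char *txt, size_t *len)
{
    const char *tok = tokens;
    int x;

    assert(txt != NULL && len != NULL);
    for (x = LET; x < LAST_KW; x++) {
        size_t y = is_token(txt, tok);
        if (y != 0) {
            *len = y;
            return x;
        }
        while (tok[y] != '\0')             /* matk_2: scan to the terminator */
            y++;
        tok += y + 1;                      /* sec / adc: step over the zero too */
    }
    *len = 0;
    return LAST_KW;
}
```

The `tok += y + 1` line is the C spelling of the sec before the adc: the extra one steps past the terminating zero to the first character of the next token.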
And _that_ is another helper function for squish_buffer, which:
- Reads the line number and converts it to an int16_t
- Moves the text that follows the line-number digits and any whitespace after them so that it starts at buffer[3]
- Inserts the int16_t line number at the start of the buffer
- Scans the line, checking every character to see whether it might be the start of a keyword (unless it's in quoted text)
- Replaces every keyword with its token code, and shuffles the rest of the line down to fill the gap if necessary (memmove)
- and finally goes and has a cup of tea because it's tired and thirsty after all that work!
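The steps above can be sketched in C as follows. This is a sketch under assumptions, not the real routine: the keyword table is a small stand-in holding only the five tokens from the example line in this post (with the token values seen in its hex dump), match_kw is a hypothetical simplified matcher, the line number is held unsigned just to keep the byte extraction simple, and buffer[2] is left at zero because its purpose is not covered here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Stand-in keyword table (hypothetical layout, values from the example). */
static const struct { const char *text; uint8_t code; } kw[] = {
    { "print", 0x82 }, { "for", 0x89 }, { "to", 0x8a },
    { "next", 0x8b }, { "=", 0x92 },
};

/* Returns the keyword length and sets *code on a match, else returns 0. */
static size_t match_kw(const uint8_t *txt, uint8_t *code)
{
    for (size_t i = 0; i < sizeof kw / sizeof kw[0]; i++) {
        size_t n = strlen(kw[i].text), y = 0;
        while (y < n && (txt[y] | 0x20) == (uint8_t)kw[i].text[y])
            y++;
        if (y == n) {
            *code = kw[i].code;
            return n;
        }
    }
    return 0;
}

/* buf holds a CR-terminated line and needs room for a 3-byte header.
 * Returns the length of the squished line, including the CR. */
size_t squish_buffer(uint8_t *buf)
{
    size_t in = 0, len = 0, end;
    uint16_t line = 0;
    int quoted = 0;

    assert(buf != NULL);
    while (isdigit(buf[in]))                   /* read the line number */
        line = (uint16_t)(line * 10 + (buf[in++] - '0'));
    while (buf[in] == ' ')                     /* skip whitespace after it */
        in++;
    while (buf[in + len] != '\r')              /* measure the rest of the line */
        len++;
    len++;                                     /* keep the CR */
    memmove(buf + 3, buf + in, len);           /* text now starts at buf[3] */
    buf[0] = (uint8_t)(line & 0xff);           /* line number, low byte first */
    buf[1] = (uint8_t)(line >> 8);
    buf[2] = 0;                                /* not filled in by this sketch */
    end = 3 + len;

    for (size_t i = 3; buf[i] != '\r'; i++) {
        uint8_t code;
        size_t n;

        if (buf[i] == '"')
            quoted = !quoted;
        if (quoted || (n = match_kw(buf + i, &code)) == 0)
            continue;
        buf[i] = code;                         /* keyword becomes one byte */
        memmove(buf + i + 1, buf + i + n, end - (i + n));
        end -= n - 1;                          /* line is now n-1 bytes shorter */
    }
    return end;
}
```

Doing the memmove before writing the header matters when the line number is only one or two digits long, since the text then has to move up rather than down.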
So the (invalid code, but convenient to show) input line
Code: Select all
1234 for q = 1 to 10; print q; next
looks like this in the buffer (after the line number has been converted and inserted)
Code: Select all
d2 04 00 66 6f 72 20 71 20 3d 20 31 20 74 6f 20 31 30 3b 20 70 72 69 6e 74 20 71 3b 20 6e 65 78 74 0d
and is further converted to
Code: Select all
d2 04 1a 89 20 71 20 92 20 31 20 8a 20 31 30 3b 20 82 20 71 3b 20 8b 0d
         |           |           |                 |              |_ next
         |           |           |                 |_ print
         |           |           |_ to
         |           |_ =
         |_ for
(In the final product there is only one keyword/token per line, except for 'for/to'.)
Neil
