Neolithic Tiny Basic

Yuri · Post by **Yuri** » Sun Nov 24, 2024 11:12 am

Untested strcasecmp()

Assuming the pointers to the two strings are at ZP 0,1 and 2,3, and that the strings are 7-bit ASCII and null terminated

Using ZP 4 as temp space

Code: Select all

strcasecmp:
    phy
    ldy #0
cmp_loop:
    lda (2),y ; Second string, via the pointer at ZP 2,3
    beq cmp_end ; Null terminator
    and #%01011111 ; Remove case bit
    sta 4
    lda (0),y ; First string, via the pointer at ZP 0,1
    beq cmp_end ; Null terminator
    and #%01011111 ; Remove case bit
    iny
    sec ; Double check this, I tend to get this bit mixed up.
    sbc 4
    beq cmp_loop; Characters are the same
cmp_end:
    ply
    rts ; Signed result in A register

EDIT: This isn't the most robust way to do it. Pretty sure there are some edge cases that will break this function.
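
For checking the assembly against a known-good model, here is a minimal C sketch of the intended behaviour. It folds only the letters a-z and keeps comparing until either string ends, which covers two of the likely edge cases: strings of different lengths, and non-letters that a blanket `and #%01011111` would alter (a space becomes zero, `{` becomes `[`). The name `ascii_casecmp` is hypothetical, and this is a reference for testing rather than a translation of the routine above.

```c
#include <stddef.h>

/* Fold a 7-bit ASCII letter to upper case; leave everything else alone. */
static unsigned char fold(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c & 0x5F) : c;
}

/* Returns <0, 0 or >0, in the manner of the C library's strcasecmp(). */
int ascii_casecmp(const char *s1, const char *s2)
{
    for (size_t i = 0; ; i++) {
        unsigned char a = fold((unsigned char)s1[i]);
        unsigned char b = fold((unsigned char)s2[i]);
        if (a != b || a == 0)        /* mismatch, or both strings ended */
            return (int)a - (int)b;
    }
}
```

Feeding the same string pairs to both versions is a quick way to find where the assembly disagrees.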

barnacle · Post by **barnacle** » Sun Nov 24, 2024 12:16 pm

I've done it as a less generalised function, Yuri, based on what I need to achieve rather than an exact copy of strncasecmp:

Code: Select all

is_token:
	ldy #0
istk_1:
	lda (tok_ptr),y
	beq istk_x				; while *tok_ptr != 0
		lda (txt_ptr),y
		ora #0x20
		cmp (tok_ptr),y		; does it match?
		bne istk_no			; nope
			iny
			bra istk_1		; else back for next 
istk_no:
	ldy #0
istk_x:
	rts

is_token has two pointers initialised off-stage: txt_ptr points into a block of text which _may_ be the start of a token, and compares it against a particular zero-terminated string in tok_ptr. If it matches, it returns y with the length of the token text (not the zero) or if not, y = 0.
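
As a cross-check, the same logic can be sketched in C. This assumes the token strings are stored in lower case (inferred from the `ora #0x20`); the parameter names are illustrative.

```c
#include <stdint.h>

/* Returns the token length if `txt` starts with the zero-terminated,
   lower-case `tok`, otherwise 0. */
uint8_t is_token(const char *txt, const char *tok)
{
    uint8_t y = 0;
    while (tok[y] != 0) {
        if ((txt[y] | 0x20) != tok[y])   /* force bit 5, as ORA #0x20 does */
            return 0;
        y++;
    }
    return y;
}
```

One thing the model makes visible: forcing bit 5 also maps CR (0x0D) to `-` (0x2D), so a single-character `-` token, if the table has one, would match a line end.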

That's a helper function for match_token:

Code: Select all

match_token:
	phx
	ldx #LET				; token count
	sta txt_ptr
	sty txt_ptr+1			; pointer to text to match
	
	lda #lo tokens
	sta tok_ptr
	lda #hi tokens
	sta tok_ptr+1			; pointer to start of token table

matk_1:
	jsr is_token			; a match yet?
	cpy #0
	bne matk_yes			; yes: Y holds the token length
matk_2:
		iny					; else
		lda (tok_ptr),y
		bne matk_2			; scan for rest of token
		sec					; <--- not a bug! the carry adds 1 to step over the zero
		tya					; find the next token in list
		adc tok_ptr
		sta tok_ptr
		bcc matk_3
			inc tok_ptr+1	
matk_3:
		inx
		cpx #LAST_KW
		bcc matk_1			; back for next token		
	ldy #0
	bra matk_x				; we've tried them all
matk_yes:					; we found a match - y has the token length
	txa						; how many tests?
	ora #0x80				; set high bit as token enum
matk_x:
	plx
	rts						; and back

which checks all the tokens in the token table against the text in txt_ptr, and again, if it finds one, returns the associated token value in A (0x80 upwards) or LAST_KW if not found, again with the length of the token, or zero, in Y. (And I've just noticed that the last ora #0x80 is no longer required since I fixed a bug earlier!)
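
A rough C model of the table walk, for comparison. The three-entry table and the values of `FIRST_KW` and `LAST_KW` are hypothetical stand-ins (the real table and enum are not shown in the thread), and the helper is repeated so the sketch stands alone.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical table, packed as zero-terminated lower-case strings. */
static const char tokens[] = "let\0print\0for";
enum { FIRST_KW = 0x80, LAST_KW = FIRST_KW + 3 };

/* Length of `tok` if `txt` starts with it (case-insensitive), else 0. */
static uint8_t is_token(const char *txt, const char *tok)
{
    uint8_t y = 0;
    while (tok[y] != 0) {
        if ((txt[y] | 0x20) != tok[y])
            return 0;
        y++;
    }
    return y;
}

/* Returns the token code and sets *len, or LAST_KW with *len = 0. */
uint8_t match_token(const char *txt, uint8_t *len)
{
    const char *tok = tokens;
    for (uint8_t code = FIRST_KW; code < LAST_KW; code++) {
        *len = is_token(txt, tok);
        if (*len != 0)
            return code;
        tok += strlen(tok) + 1;      /* the +1 skips the zero, like SEC/ADC */
    }
    *len = 0;
    return LAST_KW;
}
```

Because the first match wins, a keyword that is a prefix of a later one would shadow it; whether that matters depends on the order of the real table.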

And _that_ is another helper function for squish_buffer, which:

- reads the line number and converts it to int16_t
- moves all the post-digit and post-whitespace text to start at buffer[3]
- inserts the int16_t line number
- scans the line, checking every character to see whether it might be the start of a keyword (unless it's in quoted text)
- replaces every keyword with the token code, and shuffles the code down to fill the space if necessary (memmove)
- and finally goes and has a cup of tea because it's tired and thirsty after all that work!
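
The first three steps can be sketched in C as follows. This is an illustration only: the function name `squish_header` is hypothetical, the line is assumed to be CR-terminated with room to grow by a byte or two, and the length byte is simply left at zero for a later pass.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CR 0x0D

/* Replace "digits + blanks" at the start of buf with a 3-byte header:
   line number low byte, high byte, length placeholder.
   Returns the new total length, including the CR. */
size_t squish_header(uint8_t *buf)
{
    size_t i = 0;
    uint16_t line = 0;

    while (buf[i] >= '0' && buf[i] <= '9')        /* read the line number */
        line = (uint16_t)(line * 10 + (buf[i++] - '0'));
    while (buf[i] == ' ')                         /* skip the whitespace */
        i++;

    size_t n = 0;                                 /* payload length */
    while (buf[i + n] != CR)
        n++;
    n++;                                          /* count the CR too */

    memmove(buf + 3, buf + i, n);                 /* left, right or nowhere */
    buf[0] = (uint8_t)(line & 0xFF);              /* low byte first */
    buf[1] = (uint8_t)(line >> 8);
    buf[2] = 0;                                   /* length: filled in later */
    return n + 3;
}
```

`memmove` is used rather than `memcpy` because source and destination overlap whenever the text shifts by only a byte or two.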

So the (invalid code, but convenient to show) input line

Code: Select all

1234 for q = 1 to 10; print q; next

looks like this in the buffer (after the line number has been converted and inserted)

Code: Select all

                              
d2 04 00 66 6f 72 20 71 20 3d 20 31 20 74 6f 20 31 30 3b 20 70 72 69 6e 74 20 71 3b 20 6e 65 78 74 0d

and is further converted to

Code: Select all

d2 04 1a 89 20 71 20 92 20 31 20 8a 20 31 30 3b 20 82 20 71 3b 20 8b 0d
         |           |           |                 |              |_ next
         |           |           |                 |_ print
         |           |           |_ to
         |           |_ =
         |_ for

(In the final product there is only one keyword/token per line (except for 'for/to'))

Neil

barnacle · Post by **barnacle** » Sat Nov 30, 2024 7:25 am

Wow, list was a long one - and it doesn't even do ranges yet (though that's easy to add later). I need it for testing that I'm storing lines correctly in memory. First time I have a routine long enough that I need a jmp instead of a bxx for the outer wrapper...

Code: Select all

listt:
	phx
	LYA MEM_START
	stz indent
	stz list_first
	stz list_first+1
	lda #0xff
	sta list_last
	lda #0x7f
	sta list_last+1
	;where = find_line(first_line);	// so get its address
	lda list_first
	ldy list_first+1
	jsr find_line
	sta list_where
	sty list_where+1
	;found = * (uint16_t *) where;	// and read the number from memory
	jsr star_int16
	sta list_found
	sty list_found+1
	;while ((found > 0) && (found <= last_line))
list_1:
		lda list_found
		ora list_found+1
		bne list_1a
		jmp	list_x					; line zero: end of program
list_1a:
			lda list_found
			ldy list_found+1
			phy
			pha						; found on stack
			lda list_last
			ldy list_last+1
			jsr compgt				; carry set if Y:A > TOS
			bcs list_2
				pla
				pla					; clean stack
				jmp list_x			; and break
list_2:		
		;printf ("%5d ", found);
		pla							; get the found line back and clean
		ply							; the stack
		sec
		jsr putn
		;next_line = where + *(uint8_t *)(where + 2);
		ldy #2
		lda (list_where),y
		clc
		adc list_where
		sta list_next
		lda list_where+1		
		adc #0
		sta list_next+1				; list_next points to start of next line
		;where += 3;					// move to payload
		clc
		lda list_where
		adc #3
		sta list_where
		bcc list_4
			inc list_where+1		; and now to start of text
list_4:
		;ch = * where;
		lda (list_where)
		tax							; first character
		;for (int q = 0; q < indent; q++)
			ldy indent
			beq list_6
list_5:
			;// print the indent, if there is one
			;printf ("  ");
			lda #' '
			jsr putchar
			lda #' '
			jsr putchar
			dey
			bne list_5
list_6:
		;if ((ch == IF) || (ch == FOR) || (ch == DO))
		cpx #IF
		beq list_7
		cpx #FOR
		beq list_7
		cpx #DO
		bne list_8
list_7:		;indent++;
			inc indent
list_8:		
		;if ((ch == ENDIF) || (ch == NEXT) || (ch == WHILE))
		cpx #ENDIF
		beq list_9
		cpx #NEXT
		beq list_9
		cpx #WHILE
		bne list_10
list_9:
			;indent--;
			dec indent
			bpl list_10
				stz indent		; don't go negative!
list_10:
		;while (ch != CR)
		cpx #CR
		beq list_20
			;if (ch < 0x80)				// it's ascii, print it
			txa
			bmi list_11
				;printf ("%c", * where);
				jsr putchar
				bra list_12
list_11:
				;else						// it's a token
				;printf ("%s",tokens[ch - 0x80]);
				jsr print_token_by_val
list_12:
		;where ++;
		inc list_where
		bne list_19
			inc list_where+1
list_19:
		;	ch = * where;
		lda (list_where)
		tax
		jmp list_10
list_20:
		;printf ("%c",CR);
		jsr crlf
		;where = next_line;
		lda list_next
		sta list_where
		ldy list_next+1
		sty list_where+1
		;found = * (uint16_t *) where;
		jsr star_int16
		sta list_found
		sty list_found+1
		jmp list_1
list_x:	
	
	plx
	rts

Of course, that initial bne/jmp could be a beq to the exit of the routine above, but I don't want to do that yet.

And it turns out I do have a line store issue, but that's still unfinished...

Neil

p.s. going travelling for a couple of weeks; I won't have the hardware with me so expect no updates for a while.

barnacle · Post by **barnacle** » Wed Jan 01, 2025 10:48 am

The line store issue is driving me crazy; something obvious that I haven't yet resolved. I had thought the issue might be to do with a 16-bit comparison routine, but I've done a lot of checking and as far as I can see that routine - which sets the carry flag if y:a is greater than or equal to a stacked variable - is working just fine.

Hmph.

It's all a bit complex: storeline has to check the existing memory to find a line which is either numbered zero (i.e. end of memory), or which is equal to or greater than the incoming line number.
- if the number found is zero, insert the new line at the end of memory and adjust all the associated pointers
- if the number found is greater than the incoming line number, memmove a space and insert the new line in order
- if the number is the same as the incoming line number, either delete the existing line or replace it (depending on whether there's any text on the incoming line).
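
A compact C model of those three cases may help when checking the assembly. It assumes the record layout implied by the list routine (two-byte line number, one length byte covering the whole record, payload ending in CR, and a zero line number marking the end of the program). `MEM_SIZE`, `mem_end`, `line_at` and the "header plus bare CR means delete" test are assumptions made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 256
static uint8_t mem[MEM_SIZE];        /* all zero: an empty program */
static size_t  mem_end = 0;          /* offset of the 0000 end marker */

static uint16_t line_at(size_t where)
{
    return (uint16_t)(mem[where] | (mem[where + 1] << 8));
}

/* First record numbered zero (end) or >= line; the compare is unsigned. */
static size_t find_line(uint16_t line)
{
    size_t where = 0;
    while (line_at(where) != 0 && line_at(where) < line)
        where += mem[where + 2];
    return where;
}

/* rec = [lo][hi][len][payload... CR]. Returns 0, or -1 if out of room. */
int store_line(const uint8_t *rec)
{
    uint16_t line  = (uint16_t)(rec[0] | (rec[1] << 8));
    size_t   len   = rec[2];
    size_t   where = find_line(line);

    if (line_at(where) == line) {    /* same number: remove the old line */
        size_t old = mem[where + 2];
        memmove(mem + where, mem + where + old, mem_end + 2 - (where + old));
        mem_end -= old;
    }
    if (len <= 4)
        return 0;                    /* no text: that was a delete */
    if (mem_end + 2 + len > MEM_SIZE)
        return -1;

    /* open a gap (just the end marker moves when appending) and copy in */
    memmove(mem + where + len, mem + where, mem_end + 2 - where);
    memcpy(mem + where, rec, len);
    mem_end += len;
    return 0;
}
```

Treating "replace" as "delete, then insert at the same place" keeps the routine to two memmoves, at the cost of moving the tail of the program twice.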

I'm implementing it a bit at a time; so far just the first. But weird stuff happens: if I enter a series of two digit line numbers, all is well. If I enter a series of four digit line numbers, all appears to be well. But if I enter a three digit line number, the first one is fine but subsequent ones crash horribly...

Which has led me down all sorts of rabbit holes. For example, a routine 'squishbuffer' compresses and tokenises the initial line buffer in place, in particular, replacing the n-digit ascii line number with a sixteen-bit line number and an eight-bit length byte. So depending on the number of bytes between the start of the line and the first non-blank payload byte, the remainder of the line has to be moved either left, right, or nowhere. An obvious place to screw it up... but again, that all appears to be working.

Meh.

Neil

barrym95838 · Post by **barrym95838** » Wed Jan 01, 2025 6:40 pm

When I was getting ready to share VTL02 I had a last-minute panic attack when I realized that I could append and delete program lines perfectly, but if I tried to edit or insert program lines in between others I would slowly but surely corrupt the program text. It turned out to be an off-by-one in a y-register copy loop that sprinkled an unwanted byte at offset 255, which didn't matter when appending lines, but mattered greatly when editing lines inside a non-trivial program. A few breakpoints inside my line editor caught the issue.

barnacle · Post by **barnacle** » Thu Jan 02, 2025 1:22 pm

I think I found it... the routine which looks for a specific line in the main memory required an unsigned 16-bit comparison, not the signed one I had used.

Always tricky when there is no OS or debugging on the 65c02 SBC, so it's all a question of sprinkling in memory or processor status views in the code, building it, and trying again. And of course the routine which adds new lines requires the routine to find new lines, which itself requires the memory to have some lines in it to be tested... it's all a bit circular, sometimes.

Neil

barnacle · Post by **barnacle** » Wed Jan 08, 2025 10:42 am

I have a signed comparison which returns true on A >= B.
I need A > B.
I _think_ I can do this by swapping A and B and inverting the result (hmm, Boolean logic with variables?)

A > B true if !(B >= A)
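
In C terms the idea looks like this; `cmp_ge` is only a stand-in for the existing routine and both names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the existing routine: true when a >= b (signed). */
static bool cmp_ge(int16_t a, int16_t b) { return a >= b; }

/* A > B derived from it by swapping the operands and inverting. */
bool cmp_gt(int16_t a, int16_t b) { return !cmp_ge(b, a); }
```

If the routine reports its result in the carry flag, the inversion costs nothing: the caller branches on the opposite condition (BCC instead of BCS).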

Neil

barrym95838 · Post by **barrym95838** » Wed Jan 08, 2025 4:22 pm

barnacle wrote:

A > B true if !(B >= A)

Seems correct, with no obvious corner cases. There's nearly always the -32768 issue, depending on your actual compare implementation.
viewtopic.php?f=3&t=7550&p=106203#p106203

barnacle · Post by **barnacle** » Wed Jan 08, 2025 7:43 pm

Yeah, well, 32768 isn't a real number in signed 2's complement

Nice to have another pair of eyes on it. I didn't want to write a separate routine if I could avoid it.

Thanks,

Neil

drogon · Post by **drogon** » Wed Jan 08, 2025 8:36 pm

barnacle wrote:

Yeah, well, 32768 isn't a real number in signed 2's complement

But -32768 is ...

As another way to do it (mostly to try to save space), I adopted a way in my own TB that I "borrowed" from another - which is to do an equality test followed by a subtraction and use the results of those to set 'flags' like a virtual status register - and from those 2 flags everything else can be inferred ...

So...

Code: Select all

;********************************************************************************
;* multiCmp:
;*      Routine to do all the tests at once and set bits in
;*      a register.
;*      Return the result in A.
;********************************************************************************

.proc   multiCmp

        ldx     arithPtr                ; Get arith stack pointer

; Now, run a series of compares to set various flags...

        lda     #0
        sta     num

;  Start by subtracting

        sec
        lda     arithStack-4,x
        sbc     arithStack-2,x
        sta     regAL
        lda     arithStack-3,x
        sbc     arithStack-1,x
        sta     regAH

; Test for zero

        ora     regAL
        bne     notZero
        lda     #fEQ
        sta     num
notZero:

; Test for <

        lda     num
        ldy     regAH
        bpl     notLt

; The result was negative, so LT, ...

        ora     #fLT
        sta     num
notLt:
        rts
.endproc

and it's used like this, e.g.

Code: Select all

;********************************************************************************
;* GTR:
;*      Greater than
;********************************************************************************

.proc   GTR
        jsr     multiCmp
        and     #fLT | fEQ
        beq     true
        bne     false
.endproc


;********************************************************************************
;* GEQ:
;*      Greater than or equal
;********************************************************************************

.proc   GEQ
        jsr     multiCmp
        and     #fLT
        beq     true
        bne     false
.endproc

(similar for the < and <= tests)
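
For readers who want to experiment with the flag scheme off-target, here is a small C model of it. The bit values of `fEQ` and `fLT` and the function names are assumptions; `a` is the deeper stack entry and `b` the top one, matching the order of the subtraction above.

```c
#include <stdbool.h>
#include <stdint.h>

enum { fEQ = 1, fLT = 2 };           /* hypothetical bit values */

/* One subtraction, two flags: zero result, and sign of the result. */
uint8_t multi_cmp(int16_t a, int16_t b)
{
    uint16_t diff = (uint16_t)((uint16_t)a - (uint16_t)b);  /* 16-bit wrap */
    uint8_t flags = 0;
    if (diff == 0)
        flags |= fEQ;
    if (diff & 0x8000)
        flags |= fLT;
    return flags;
}

bool op_gt(int16_t a, int16_t b) { return (multi_cmp(a, b) & (fLT | fEQ)) == 0; }
bool op_ge(int16_t a, int16_t b) { return (multi_cmp(a, b) & fLT) == 0; }
bool op_lt(int16_t a, int16_t b) { return (multi_cmp(a, b) & fLT) != 0; }
bool op_le(int16_t a, int16_t b) { return (multi_cmp(a, b) & (fLT | fEQ)) != 0; }
```

Since only the sign of the difference is examined, operands far enough apart to overflow give the wrong answer: the model reports 32767 < -1. That is the same family of corner case as the -32768 issue mentioned earlier in the thread.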

I have some conditionally assembled code to explicitly do the = and <> tests which adds a few more bytes but does speed up most programs I tested it with.

In my BCPL bytecode VM everything is expanded out with macros for speed and the > test looks like:

Code: Select all

;********************************************************************************
; JGR:
;       JGR Ln
;       JGR Ln IF B > A DO PC := Ln
;********************************************************************************

        opId    "JGR  "

.proc   ccJGR
        .a16
        .i16

; Compare B > A ?

        doCmpBA
        bmi     doJmp

        incPcNext

doJmp:  takeJump

.endproc

And the doCmpBA macro is:

Code: Select all

.macro  doCmpBA
        .a16
        .i16
        lda     regA+0  ; regA - regB
        cmp     regB+0
        lda     regA+2
        sbc     regB+2
        bvc     :+      ; N eor V
        eor     #$8000
:
.endmacro
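
The "N eor V" trick in the macro can be modelled in C to see why it survives overflow. This sketch works on 16-bit values rather than the macro's two-word registers, and `signed_lt` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a < b as signed 16-bit values, using only flag arithmetic. */
bool signed_lt(uint16_t a, uint16_t b)
{
    uint16_t diff = (uint16_t)(a - b);
    bool n = (diff & 0x8000) != 0;                 /* N: sign of the result */
    /* V: the operands' signs differ and the result's sign differs from a's */
    bool v = (((a ^ b) & (a ^ diff)) & 0x8000) != 0;
    return n != v;                                 /* N eor V */
}
```

With 0x7FFF and 0xFFFF (32767 and -1) the subtraction overflows, N and V are both set, and the result is correctly "not less than".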

See also: http://www.6502.org/tutorials/compare_beyond.html

-Gordon

barnacle · Post by **barnacle** » Sun Jan 12, 2025 3:03 pm

Still contemplating the best option here: I need signed comparisons (<, <=, =, !=, >=, >) for basic execution, and also unsigned version of some of those for the actual program control... it's a lot of comparisons and a lot of flavours. It's easy in 8-bit but 16-bit complicates things.

Not to mention that sometimes I'm comparing against immediate values, or zero, and sometimes between two variables. And sometimes those variables are only eight bits comparing against sixteen, with a sign extension, or not...

Hmph.

Neil

barrym95838 · Post by **barrym95838** » Mon Jan 13, 2025 8:04 pm

Make line numbers >32767 illegal, and a significant percentage of your problem is solved ...

barnacle · Post by **barnacle** » Mon Jan 13, 2025 8:54 pm

They already are, but all that does is help me for line numbers, not, for example, to calculate whether the space available is greater than the line length... or indeed if a memory address is greater than another.

Things get tricky, and I'm missing something...

2 > 4? 2 - 4 = -2 (FFFE), negative, so false
4 > 2? 4 - 2 = 2 (0002), positive, so true.
So far so good.
But:
2 > 65534? 2 - 65534 = 0004, positive, so true!
65534 > 2? 65534 - 2 = FFFC, negative, so false!

I want to use my existing signed subtraction to save code space, but to handle a 16-bit unsigned subtraction the sign bit ends up at bit 16. So if I extend my subtraction to save the 16th bit somewhere then I can check that and the 15th bit to see signed and unsigned comparison at the same time. I think.

Just to complicate things, the subtraction is done by inverting the subtrahend and adding it to the minuend, which works fine at 16 bits signed but I need to check what happens to the final carry.
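
What happens to the final carry can be checked with a few lines of C that do the same invert-and-add in 17 bits. The type and function names here are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint16_t diff; bool carry; } sub16_t;

/* a - b computed as a + ~b + 1, kept wide enough to see the carry out. */
sub16_t sub16(uint16_t a, uint16_t b)
{
    uint32_t wide = (uint32_t)a + (uint16_t)~b + 1u;
    sub16_t r = { (uint16_t)wide, (wide >> 16) != 0 };
    return r;
}
```

The carry out of bit 15 is set exactly when no borrow was needed, that is, when a >= b as unsigned numbers. That matches the 65xx convention for SBC, so the "17th bit" is available for free if the addition is done with ADC.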

Neil

White Flame · Post by **White Flame** » Mon Jan 13, 2025 9:28 pm

You're doing unsigned math, so you want to look at just the carry, not the result sign (which has more to do with overflow). The actual signed results will be more than 16 bits wide, so your results are truncated and look wrong. But carry is actually that 16th bit.

2 > 4? 2 - 4 clears carry, so false
4 > 2? 4 - 2 sets carry, so true
2 > 65534? 2 - 65534 clears carry, so false
65534 > 2? 65534 - 2 sets carry, so true

The carry basically reflects (on subtraction) whether or not it crossed under 0, which is semantically what you want to test.
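
A small C model of the carry test, with hypothetical helper names. One detail to watch: the carry is also set when the operands are equal, so on its own it answers >= rather than >.

```c
#include <stdbool.h>
#include <stdint.h>

/* Carry after a 16-bit SEC/SBC: set when no borrow was needed. */
static bool carry_after_sub(uint16_t a, uint16_t b)
{
    return (((uint32_t)a - b) & 0x10000u) == 0;
}

bool u_ge(uint16_t a, uint16_t b) { return carry_after_sub(a, b); }
bool u_lt(uint16_t a, uint16_t b) { return !carry_after_sub(a, b); }

/* Strictly greater: swap the operands and look for a borrow. */
bool u_gt(uint16_t a, uint16_t b) { return !carry_after_sub(b, a); }
```

For a strict "greater than", either swap the operands as here or add a zero test on the result.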

(corrected, thanks!)

barrym95838 · Post by **barrym95838** » Mon Jan 13, 2025 10:57 pm

2 - 4 sets borrow, which clears carry on the 65xx, as I am certain you're aware.
