Re: Assembler with structured conditionals
Posted: Mon Apr 29, 2019 10:00 pm
by JimBoyd
BTW, I'll quickly inject a comment of my own. Ragsdale's assembler (and yours, I think) checks to see that conditionals are properly paired, which means (for example) that BEGIN leaves both an address and an error-check code on the stack and UNTIL removes these two items. Usually that's fine, but in certain situations the error-check code becomes a nuisance, leading to extra SWAPs, ROTs, etc.
I'd like to add that the compiler security isn't meant to be restrictive. It is modeled on high level Forth control flow structures, where the security is there to make sure that >MARK is resolved by >RESOLVE and <MARK is resolved by <RESOLVE . There is no assembler version of the mark and resolve words, but in the assembler, any word that needs a forward branch or jump resolved can be paired with any word that resolves a forward branch or jump. Any word that leaves an address for a backward branch or jump can be paired with any word that uses that address for a backward branch or jump. For example, ELSE, , ELIF, , and WHILE, are defined like this:
Code:
: ELSE, ( ADR1 CS -- ADR2 CS )
[COMPILE] AHEAD, 2SWAP
[COMPILE] THEN, ; IMMEDIATE
: ELIF, ( ADDR1 CS1 -- ADDR2 CS2 )
[COMPILE] IF, 2SWAP
[COMPILE] THEN, ; IMMEDIATE
: WHILE, ( A1 C1 -- A2 C2 A1 C1 )
[COMPILE] IF, 2SWAP ; IMMEDIATE
This means that, oddly, any number of ELSE, or ELIF, clauses can appear between an IF, and a THEN, . But then, the point is not to impose restrictions, just to make sure that forward and backward references are properly resolved.
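The pairing rule described above can be sketched in a standard Forth. This is a toy model, not Fleet Forth's source: the names ?PAIRS , FWD-MARK , FWD-RESOLVE , BACK-MARK and BACK-RESOLVE and the security values 1 and 2 are made up for illustration, and the "branch" here is just a cell holding an absolute address.

```forth
\ Toy model of mark/resolve with a security number (all names hypothetical).
: ?PAIRS ( n1 n2 -- )  <> ABORT" CONTROL FLOW NOT PAIRED" ;

\ Forward reference: reserve a cell now, patch it later.
: FWD-MARK     ( -- addr 1 )  HERE 0 ,  1 ;
: FWD-RESOLVE  ( addr 1 -- )  1 ?PAIRS  HERE SWAP ! ;

\ Backward reference: remember the target now, use it later.
: BACK-MARK    ( -- addr 2 )  HERE 2 ;
: BACK-RESOLVE ( addr 2 -- )  2 ?PAIRS  , ;
```

In this model any word built on FWD-MARK can be closed by any word built on FWD-RESOLVE , while handing a backward mark to FWD-RESOLVE aborts. That is the whole extent of the checking.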
Re: Assembler with structured conditionals
Posted: Tue Apr 30, 2019 9:12 pm
by JimBoyd
That's why I've got CS-DUP CS-DROP CS-SWAP and CS-ROT which also work with high level Forth conditionals. I just treat the address and security number as a single unit.
Yes, you have a solution of sorts but it's not one I like much, and I thought perhaps I could do a little better.
Unfortunately, after refreshing myself on the subject (it's been years since I involved myself with this) I have no dramatic improvement to offer. In my notes I did find the following...
Code:
: IF[WITHIN], ( like IF, but used within an assembler structured conditional )
>R >R [COMPILE] IF,
R> R> ; IMMEDIATE
... but this is less flexible than what you're doing, and certainly no prettier.
Now that I have an Auxiliary stack, there is also CS>A , control stack to auxiliary, and A>CS , auxiliary to control stack, to transfer control flow data ( address and security code ) between the control flow stack ( data stack ) and the auxiliary stack. They also work with high level code since all the control flow data in Fleet Forth ( address and security code ) are two cells. Even the "sys" used by colon and semicolon ( as well as CODE and END-CODE ) is two cells.
Code:
: -- sys M,79 "colon"
sys is balanced with its corresponding ;
; -- C,I,79 "semi-colon"
sys -- (compiling)
Stops compilation of a colon definition, allows the <name>
sys is balanced with its corresponding :
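Because every piece of control flow data is two cells, the CS- words and the transfer words can be very small. The following is a rough sketch in a standard Forth, not the Fleet Forth source. CS-DUP as an immediate 2DUP comes from this thread; the other three CS- definitions are assumed by analogy, and the auxiliary stack ( AUX-STACK , AUX-SP , AUX-PUSH , AUX-POP ) is a hypothetical stand-in for Fleet Forth's.

```forth
\ Control flow data modeled as a two cell unit: ( addr code ).
: CS-DUP   2DUP  ; IMMEDIATE
: CS-DROP  2DROP ; IMMEDIATE
: CS-SWAP  2SWAP ; IMMEDIATE
: CS-ROT   2ROT  ; IMMEDIATE

\ A small auxiliary stack (hypothetical layout: 16 cells, grows upward).
CREATE AUX-STACK  16 CELLS ALLOT
VARIABLE AUX-SP   AUX-STACK AUX-SP !
: AUX-PUSH ( x -- )  AUX-SP @ !  1 CELLS AUX-SP +! ;
: AUX-POP  ( -- x )  -1 CELLS AUX-SP +!  AUX-SP @ @ ;

\ Move one ( addr code ) unit to the auxiliary stack and back.
: CS>A ( addr code -- )  SWAP AUX-PUSH AUX-PUSH ; IMMEDIATE
: A>CS ( -- addr code )  AUX-POP AUX-POP SWAP ; IMMEDIATE
```

On a system whose own control flow items are not two cells on the data stack, these definitions only work on the toy pairs shown here, not on that system's real IF or BEGIN data.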
Re: Assembler with structured conditionals
Posted: Wed May 05, 2021 11:52 pm
by JimBoyd
(For ZP, there's LDA_ZP,Y.)
And what op-code do you comma in for that one?

In the past it has slipped my mind that there is no LDA ZP,Y instruction. I wondered why my PICK was two bytes bigger than I thought it would be.
It turns out that the NMOS 6502 has only two instructions which have the ZP,Y addressing mode: LDX and STX .
This is one more thing I like about my assembler: I don't need to think about which instructions have a zero page addressing mode, or whether the address is small enough, unless I'm counting bytes ( or cycles ) to find the better of two or more ways to write a primitive.
For instructions with the ABSOLUTE, ABSOLUTE,X or ABSOLUTE,Y addressing modes, if the address will fit in a single byte and the instruction supports the zero page version, then the assembler will use the zero page version.
[Edit: Fixed a mistake]
Re: Assembler with structured conditionals
Posted: Sun Jun 13, 2021 9:46 pm
by JimBoyd
I mentioned here that I was experimenting with removing the trailing commas from my assembler's mnemonics and control flow words.
After almost four months, I can report that I do not miss the old version with the trailing commas.
Re: Assembler with structured conditionals
Posted: Sun May 14, 2023 9:32 pm
by JimBoyd
I've uploaded to the head post the plain text source for the latest version of Fleet Forth's assembler in a zip file.
The control flow data is the same size as the control flow data for Fleet Forth's high level control flow words. The control flow data is just an address followed by a security number. The security code is only there to be certain the branches are properly resolved. This is similar to how the Forth-83 words >MARK , >RESOLVE , <MARK and <RESOLVE work.
For an example, here is the source for Fleet Forth's word BLK> from the system loader. It takes an absolute block number and returns the relative block number (relative to the drive where it is stored) and the drive number.
Code:
CODE BLK> ( BLK#1 -- BLK#2 DR# )
BEGIN
SEC 1 ,X LDA
0 1 DR+ SPLIT NIP # SBC
CS WHILE
1 ,X STA INY
0 RAM 0 1 DR+ / # CPY
0= UNTIL
THEN
TYA
APUSH JMP END-CODE
And the disassembly:
Code:
SEE BLK>
BLK>
14216 SEC
14217 1 ,X LDA
14219 8 # SBC
14221 14230 BCC
14223 1 ,X STA
14225 INY
14226 8 # CPY
14228 14216 ^^ BNE
14230 TYA
14231 3176 JMP APUSH
18
OK
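For readers who don't follow 6502 assembly, the loop in BLK> can be paraphrased in high level Forth. This is only a sketch inferred from the disassembly above, not part of Fleet Forth: the constants come from the `8 # SBC` and `8 # CPY` lines ( 8 in the high byte is 2048 blocks ), the code word appears to rely on .Y being zero on entry, and the name BLK>HL is made up.

```forth
\ High level paraphrase of BLK> (hypothetical name, constants from the disassembly).
: BLK>HL ( blk#1 -- blk#2 dr# )
   0                           \ drive counter, plays the role of .Y
   BEGIN
      OVER 2048 U< 0=          \ carry set: block number >= 2048 ?
      OVER 8 <  AND            \ stop once the counter reaches 8
   WHILE
      1+  SWAP 2048 - SWAP     \ next drive, block number - 2048
   REPEAT ;
```

The code word tests the counter after incrementing it rather than at the top of the loop, but both versions stop under the same conditions.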
The largest CODE word written for Fleet Forth is the find primitive (FIND) .
Note: CS-DUP is just an immediate version of 2DUP .
Code:
CODE (FIND) ( ADR VOC -- ADR2 F )
DEY N 1- STY
SEC
2 ,X LDA 2 # SBC N 2+ STA
3 ,X LDA 0 # SBC N 3 + STA
0 ,X LDY 1 ,X LDA
INX INX XSAVE STX
BEGIN
N STY N 4 + STY
N 1+ STA N 5 + STA
BEGIN
0 # LDY
N )Y LDA TAX INY
N )Y LDA
0= NOT WHILE
N 1+ STA N STX INY
N )Y LDA
SBIT $1F OR # AND
N 2+ )Y EOR
CS-DUP 0= UNTIL // CONTINUE
BEGIN
INY
N 2+ )Y LDA
N )Y EOR
0= NOT UNTIL
$7F # AND
0= UNTIL
XSAVE LDX SEC
TYA N ADC 0 ,X STA
0 # LDA N 1+ ADC 1 ,X STA
2 # LDY
N )Y LDA IBIT # AND
0= NOT IF
LABEL PUSH.ONE // 1
1 # LDA
LABEL APUSH
0 # LDY
LABEL AYPUSH
DEX DEX
LABEL AYPUT
0 ,X STA 1 ,X STY
NEXT JMP
THEN
LABEL PUSH.TRUE // -1
$FF # LDA TAY
AYPUSH 0= NOT BRAN
THEN
3 # LDY
N 4 + )Y LDA
N 1- AND
0= NOT WHILE
TAX DEY
N 4 + )Y LDA
TAY TXA
0= UNTIL // ALWAYS BRANCH
THEN
XSAVE LDX
LABEL PUSH.FALSE // 0
0 # LDA
APUSH 0= BRAN
END-CODE
// AYPUSH & AYPUT
// .A HOLDS LO BYTE
// .Y HOLDS HI BYTE
Here is the disassembly.
Code:
SEE (FIND)
(FIND)
3086 DEY
3087 132 STY N 1-
3089 SEC
3090 2 ,X LDA
3092 2 # SBC
3094 135 STA N 2+
3096 3 ,X LDA
3098 0 # SBC
3100 136 STA N 3 +
3102 0 ,X LDY
3104 1 ,X LDA
3106 INX
3107 INX
3108 141 STX XSAVE
3110 133 STY N
3112 137 STY N 4 +
3114 134 STA N 1+
3116 138 STA N 5 +
3118 0 # LDY
3120 133 )Y LDA N
3122 TAX
3123 INY
3124 133 )Y LDA N
3126 3192 BEQ
3128 134 STA N 1+
3130 133 STX N
3132 INY
3133 133 )Y LDA N
3135 63 # AND
3137 135 )Y EOR N 2+
3139 3118 ^^ BNE
3141 INY
3142 135 )Y LDA N 2+
3144 133 )Y EOR N
3146 3141 ^^ BEQ
3148 127 # AND
3150 3118 ^^ BNE
3152 141 LDX XSAVE
3154 SEC
3155 TYA
3156 133 ADC N
3158 0 ,X STA
3160 0 # LDA
3162 134 ADC N 1+
3164 1 ,X STA
3166 2 # LDY
3168 133 )Y LDA N
3170 64 # AND
3172 3187 BEQ
3174 1 # LDA
3176 0 # LDY
3178 DEX
3179 DEX
3180 0 ,X STA
3182 1 ,X STY
3184 2160 JMP NEXT
3187 255 # LDA W 1+
3189 TAY
3190 3178 ^^ BNE AYPUSH
3192 3 # LDY
3194 137 )Y LDA N 4 +
3196 132 AND N 1-
3198 3208 BEQ
3200 TAX
3201 DEY
3202 137 )Y LDA N 4 +
3204 TAY
3205 TXA
3206 3110 ^^ BNE
3208 141 LDX XSAVE
3210 0 # LDA
3212 3176 ^^ BEQ APUSH
128
OK
Although everything in Fleet Forth's kernel is built with a metacompiler, the metacompiler's assembler is based on Fleet Forth's assembler.
The assembler's control flow words are not that different from their high level counterparts. Some require a condition code to assemble the correct branch while the high level versions require a flag on the data stack.
Here is a high level example of control flow in Fleet Forth using CS-DUP to manipulate the control flow data.
Code:
: QUIT ( -- )
[COMPILE] [
BEGIN
RP!
['] LIT (IS) WHERE
CR QUERY INTERPRET
STATE @ 0=
CS-DUP UNTIL // CONTINUE
." OK"
AGAIN -;
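The CS-DUP UNTIL idiom used as a "continue" can be tried on a standard Forth as well. The sketch below swaps in a different technique from Fleet Forth's: it builds CS-DUP from the standard CS-PICK ( Programming-Tools extension word set ) instead of 2DUP , since control flow items are not two cells on every system. SUM-EVEN is a made-up example word.

```forth
\ CS-DUP built on the standard CS-PICK (not Fleet Forth's 2DUP version).
: CS-DUP ( C: dest -- dest dest )  0 CS-PICK ; IMMEDIATE

\ Sum the even numbers from n-1 down to 1 (hypothetical example).
: SUM-EVEN ( n -- sum )
   0 SWAP                            \ ( sum i )
   BEGIN
      1- DUP 0> 0= IF  DROP EXIT  THEN
      DUP 1 AND 0=                   \ ( sum i even? )
   CS-DUP UNTIL                      \ CONTINUE: odd values branch back to BEGIN
      TUCK + SWAP                    \ only even values get here
   AGAIN ;
```

UNTIL consumes the copy of the BEGIN address and AGAIN consumes the original, so the structure still balances.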
Re: Assembler with structured conditionals
Posted: Sun May 14, 2023 11:37 pm
by JimBoyd
If I implement the structured conditionals, I might do something like:
Code:
HEX
CODE ACLOSE ( -- ) \ Close all open files.
STX_ZP XSAVE C,
BEGIN,
LDY_ZP 98 C, \ (but give a name, as Jeff suggests)
0<>
WHILE,
LDA_ABS 258 C, \ LAST OPENED
JSR FFC3 , \ CLOSE
REPEAT,
LDX_ZP XSAVE C,
JMP NEXT ,
I've been re-reading this thread and the commas at the end of the control flow words are, in my opinion, annoying.
If I were to implement an assembler for a Forth without vocabularies, I'd try the following:
Code:
: BEGIN
STATE @
IF [COMPILE] BEGIN EXIT THEN
<ASSEMBLER VERSION OF BEGIN> ; IMMEDIATE
The high level control flow words are redefined. BEGIN performs the original high level function of BEGIN if STATE is compiling. If STATE is interpreting, it assembles.
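Here is a minimal runnable version of that idea for a standard Forth, offered as a sketch only. It uses POSTPONE where the Forth-83 code above uses [COMPILE] , and ASM-BEGIN is a hypothetical stand-in for the assembler's version of BEGIN that just leaves the current dictionary address.

```forth
\ ASM-BEGIN is a hypothetical stand-in for the assembler's BEGIN.
: ASM-BEGIN ( -- addr )  HERE ;

\ Inside this definition, BEGIN still refers to the original word.
: BEGIN
   STATE @ IF  POSTPONE BEGIN  EXIT  THEN   \ compiling: original behavior
   ASM-BEGIN ; IMMEDIATE                    \ interpreting: assembling
```

This relies on the assembler running with STATE set to interpreting, as it would between CODE and END-CODE.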
Re: Assembler with structured conditionals
Posted: Sun Apr 28, 2024 9:18 pm
by JimBoyd
I just noticed part of the source code for my Forth assembler failed to take advantage of the pearl of Forth.
There are eight words which set the addressing mode.
Code:
VARIABLE MODE
: .A ( -- ) MODE OFF ;
: X) ( -- ) 1 MODE ! ;
: )Y ( -- ) 2 MODE ! ;
: # ( -- ) 3 MODE ! ;
: ) ( -- ) 7 MODE ! ;
: MEM ( -- ) 8 MODE ! ;
: ,X ( -- ) 9 MODE ! ;
: ,Y ( -- ) #10 MODE ! ;
The version with CREATE DOES> is smaller.
Code:
VARIABLE MODE
: AM ( N -- )
CREATE ( N -- )
C,
DOES> ( -- )
C@ MODE ! ;
0 AM .A 1 AM X) 2 AM )Y
3 AM # 7 AM ) 8 AM MEM
9 AM ,X $A AM ,Y
Ragsdale also overlooked this opportunity to use CREATE DOES> with his assembler.
Code:
VARIABLE MODE 2 MODE !
: .A 0 MODE ! ; : # 1 MODE ! ; : MEM 2 MODE ! ;
: ,X 3 MODE ! ; : ,Y 4 MODE ! ; : X) 5 MODE ! ;
: )Y 6 MODE ! ; : ) F MODE ! ;
Re: Assembler with structured conditionals
Posted: Sun Jun 23, 2024 7:19 pm
by JimBoyd
Some of the instructions have an absolute addressing mode (full 16 bit address) and a zero page addressing mode ( 8 bit address ).
Code:
ZERO PAGE:      ABSOLUTE:
0 ,X LDA        $101 ,X LDA
$8D LDA         $256 ,X LDA
My assembler has two words to define all the opcodes which are not branching instructions. CPU0 defines the instructions which have the implied addressing mode. They have a single opcode and do not have an operand.
CPU1 defines all the other instructions except the branches. The branches are handled by the control flow words and the word BRAN .
TABLE is actually two tables of eleven bytes each. One byte for each addressing mode except implied and relative.
The words defined by CPU1 access the variable MODE to determine which addressing mode is to be used. These are the words which set the addressing modes and the value each one stores in MODE .
Code:
.A 0
X) 1
)Y 2
# 3
) 7
MEM 8
,X 9
,Y 10
Mode 0, operations on the accumulator, does not have an operand to assemble.
Modes 1 through 6 have a one byte operand.
Modes 7 through 10 have a two byte operand.
MEM is executed to set the addressing mode to absolute addressing (no indexing) when the assembler is started by CODE , SUBR or ;CODE . The words defined by CPU1 also leave MODE set to absolute addressing. Also notice that the modes set by ,X and ,Y default to the absolute rather than the zero page versions.
The words defined by CPU1 have a parameter field with a byte and a two byte cell.
Code:
HEX
073E 61 CPU1 ADC
073E 21 CPU1 AND
.
.
.
The byte is what I call a base opcode. The cell is a bitmap. The eleven lowest bits are the bitmap of supported addressing modes for the instruction. The high bit determines which table is used. The value of MODE is also the offset into the tables for a value to be added to the base opcode to get the actual opcode for that instruction and addressing mode.
Code:
CREATE TABLE
$0008 , $0810 , $1404 , $2C14 ,
$1C0C , $0818 , $1000 , $0400 ,
$1414 , $0C2C , $1C1C ,
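To make the lookup concrete, here is a sketch of it for a standard Forth. It is not the assembler's source. TABLE-BYTES holds the same 22 bytes as TABLE , written out one byte at a time in the low-byte-first order the 6502 uses for 16 bit cells, so it does not depend on the host's cell size. OPCODE and TABLE-BYTES are hypothetical names.

```forth
HEX
\ The same bytes as TABLE, listed individually (6502 cells are low byte first).
CREATE TABLE-BYTES
   08 C, 00 C, 10 C, 08 C, 04 C, 14 C, 14 C, 2C C, 0C C, 1C C, 18 C,  \ first table
   08 C, 00 C, 10 C, 00 C, 04 C, 14 C, 14 C, 2C C, 0C C, 1C C, 1C C,  \ second table

\ Return the opcode for a mode, or -1 if the bitmap does not allow that mode.
: OPCODE ( bmap bop mode -- opcode | -1 )
   >R  OVER 1 R@ LSHIFT AND 0=
   IF  2DROP R> DROP -1 EXIT  THEN
   SWAP 8000 AND IF 0B ELSE 0 THEN   \ high bit of the bitmap picks the second table
   R> + TABLE-BYTES + C@ + ;
DECIMAL
```

With the bitmap and base opcode from `073E 61 CPU1 ADC` , mode 3 ( # ) gives $69 and mode 10 ( ,Y ) gives $79, while mode 0 ( .A ) is rejected because bit 0 of the bitmap is clear.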
A word defined by CPU1 calls the word Z (perhaps I should call that one ZP instead)
Code:
: Z ( OA? ADR -- OA? BOP BMAP )
COUNT SWAP @
MODE @ 7 > 0EXIT
2PICK $100 U< 0EXIT
1 MODE @ 2- 2-
LSHIFT OVER AND 0EXIT
-4 MODE +! ;
If MODE is less than eight, Z exits. At this point, if Z has not exited, the mode is one of MEM , ,X or ,Y . If the address on the stack is greater than $FF, Z exits. If the instruction does not have a zero page version of the specified mode, Z exits. If Z has not exited at this point, four is subtracted from the value in MODE , changing the addressing mode from the absolute version to the zero page version.
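The same decision can be sketched in a standard Forth, which lacks the Fleet Forth words 0EXIT and 2PICK . This is an illustration of the three tests only: the name ZP? is made up and the stack effect is simplified to take just the operand and the bitmap.

```forth
VARIABLE MODE

\ Switch MODE to the zero page version when all three tests pass.
: ZP? ( operand bmap -- )
   MODE @ 8 <           IF  2DROP EXIT  THEN   \ not MEM ,X or ,Y
   SWAP $100 U< 0=      IF  DROP  EXIT  THEN   \ address does not fit in a byte
   1 MODE @ 4 - LSHIFT AND 0=  IF  EXIT  THEN  \ no zero page version
   -4 MODE +! ;
```

With ADC's bitmap $073E, an operand of 0 in mode 9 ( ,X ) changes MODE to 5, while mode 10 ( ,Y ) is left alone because bit 6 of the bitmap is clear.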
This part of the assembler works fine.
I had not considered the possibility of someone using a value larger than what can fit in one byte with an addressing mode that takes a one byte operand. Regardless of the size of the number on the data stack, only the low byte is used with these instructions. This could allow some errors to get through.
It never occurred to me that someone might try something like $1234 )Y LDA , which assembles the same thing as $34 )Y LDA .
I personally have always used a zero page address with the indexed indirect X) and indirect indexed )Y addressing modes and a byte sized number with the immediate # addressing mode. Nevertheless, ignoring the high byte is a bug. A small correction to the source for CPU1 eliminates it.
The original source for CPU1
Code:
: CPU1 ( BMAP BOP -- )
CREATE C, , DOES>
Z DUP 1 MODE @ MEM DUP>R
LSHIFT AND
[ FORTH ] 0= [ ASSEMBLER ]
ABORT" NON VALID ADDRESSING MODE"
[ FORTH ] 0< [ ASSEMBLER ]
$B AND R@ + TABLE + C@ + C,
R> ?DUP 0EXIT
7 U< IF C, EXIT THEN , ;
And the improvement.
Code:
: CPU1 ( BMAP BOP -- )
CREATE C, , DOES>
Z DUP 1 MODE @ MEM DUP>R
LSHIFT AND
[ FORTH ] 0= [ ASSEMBLER ]
ABORT" NON VALID ADDRESSING MODE"
[ FORTH ] 0< [ ASSEMBLER ]
$B AND R@ + TABLE + C@ + C,
R> ?DUP 0EXIT
7 U<
IF
SPLIT ABORT" OPERAND TOO BIG"
C, EXIT
THEN
, ;
SPLIT splits the value on the data stack into its low byte and high byte, each taking a full sixteen bit cell on the data stack. If a single byte operand is to be assembled and the high byte is anything but zero, the child word of CPU1 aborts with the message 'OPERAND TOO BIG'.
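SPLIT is not a standard word, so here is a sketch of it and of the check for a standard Forth. This is an approximation, not Fleet Forth's definition: on a host with cells wider than sixteen bits it treats every bit above the low byte as the "high byte". BYTE-OPERAND is a hypothetical helper that isolates the test added to CPU1 .

```forth
\ SPLIT as described above; anything above the low byte lands in hi.
: SPLIT ( u -- lo hi )  DUP $FF AND  SWAP 8 RSHIFT ;

\ Leave the low byte, or abort if the operand does not fit in one byte.
: BYTE-OPERAND ( u -- lo )  SPLIT ABORT" OPERAND TOO BIG" ;
```

`$34 BYTE-OPERAND` leaves $34, while `$1234 BYTE-OPERAND` aborts instead of silently assembling $34.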
As I said, I never considered someone trying something like $1234 )Y LDA or $5678 # LDA . I think this covers all the errors in my Forth assembler.