Self Modifying Code
Re: Self Modifying Code
MichaelM wrote:
[...]SMC in the same sense as possible in FORTH where the instruction stream can be physically changed?
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html
Re: Self Modifying Code
MichaelM wrote:
I've been following this discussion with some interest. I can't say that I have a strong opinion one way or another. Generally, I will fall on the side of avoiding SMC. Many of the reasons given here reinforce the idea that SMC should be avoided.
SMC doesn't really need to be avoided as long as the programmer is careful. In the case of SCAN and SKIP, the scan/skip loop is exclusive to SCAN and SKIP; no other code alters it. SCAN sets it to scan for a given character, and SKIP sets it to skip all leading occurrences of a given character. No code modification takes place while the loop is running. It doesn't incur much of a speed penalty ( since the extra code is outside the loop ), and it's also almost two dozen bytes smaller than if SCAN and SKIP each had their own loop, assuming that SKIP jumps into the end of SCAN right at the first DEX, instruction. SMC saves 40 bytes versus the case where SCAN and SKIP are independent of each other.
Re: Self Modifying Code
It is possible to use SMC in the case of SCAN and SKIP without labels, but some might find it a little unsettling.
The stack comments with AH1 AH2 BEG etc. show what is on the control flow stack ( data stack ) after the control flow word executes.
As I said, some people might not be thrilled by the creative ( mis- ) use of the control flow structures.
Code: Select all
// SCAN
HEX
CODE SCAN ( A1 L1 C -- A2 L2 )
0= # LDA,
AHEAD, ( AH1 )
-3 ALLOT // JUST NEED CONTROL FLOW DATA
BAD STA,
3 # LDA, SETUP JSR,
AHEAD, ( AH1 AH2 )
BEGIN, ( AH1 AH2 BEG )
N 4 + INC,
0= IF, N 5 + INC, THEN,
N 2+ LDA,
0= IF, N 3 + DEC, THEN,
N 2+ DEC,
CS-SWAP THEN, // SECOND AHEAD JUMPS TO HERE
N 2+ LDA, N 3 + ORA, ( AH1 BEG )
0= NOT WHILE, ( AH1 IFW BEG )
N 4 + )Y LDA, N EOR, .A ASL,
// THE ADDRESS OF THE FIRST AHEAD ( THE ADDRESS OF STA,) GETS FIXED UP BY THE FOLLOWING LINE
CS-ROT THEN, ( IFW BEG )
0= UNTIL, // STORE DESIRED BRANCH HERE
THEN, ( )
DEX, DEX,
N 4 + LDA, 0 ,X STA,
N 5 + LDA, 1 ,X STA,
N 2+ LDA, PHA,
N 3 + LDA,
PUSH JMP, END-CODE
// SKIP
CODE SKIP ( A1 L1 C -- A2 L2 )
0= NOT # LDA,
' SCAN @ 3 + @ STA,
' SCAN @ 5 + JMP, END-CODE
Re: Self Modifying Code
Dr Jefyll wrote:
Interesting that you wrote an assembler using Bill Ragsdale's syntax. The code in Bill's version is interesting to say the least! I remember there was one particular word -- UPMODE -- which seemed to demonstrate rather a lot (arguably too much) cleverness!
Code: Select all
HEX
: NOT 20 + ;
Code: Select all
HEX
: NOT 20 XOR ;
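The two definitions differ only in how they flip a branch condition. As a rough illustration, in Python rather than Forth, the sketch below applies both to the eight 6502 conditional-branch opcodes. It assumes, as the comments in the SCAN and SKIP listings in this thread indicate, that `0=` leaves the BNE opcode D0 and `0= NOT` leaves the BEQ opcode F0. The names `not_add` and `not_xor` are made up for the example, and masking the sum to one byte is also an assumption.

```python
# 6502 conditional branches come in pairs that differ only in bit 5 (hex 20).
BRANCHES = {0x10: "BPL", 0x30: "BMI", 0x50: "BVC", 0x70: "BVS",
            0x90: "BCC", 0xB0: "BCS", 0xD0: "BNE", 0xF0: "BEQ"}

def not_add(opcode):
    """Model of  : NOT 20 + ;  (hex), truncated to one byte (assumption)."""
    return (opcode + 0x20) & 0xFF

def not_xor(opcode):
    """Model of  : NOT 20 XOR ;  (hex)."""
    return opcode ^ 0x20

for opcode, name in BRANCHES.items():
    added = BRANCHES[not_add(opcode)]
    toggled = BRANCHES[not_xor(opcode)]
    print(f"{name}: 20 + gives {added}, 20 XOR gives {toggled}")
```

Both forms agree when bit 5 starts out clear (BNE becomes BEQ either way). Only the XOR form undoes itself, so applying it twice gives back the original branch.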
Someday I would like to build my own single board computer based on the 65C02 or the 65816. When that happens, I would like to port my Forth to the SBC. I took a good look at Ragsdale's word M/CPU so I could figure out how to add more opcodes to the assembler. I decided it would be easier to write my own version of M/CPU. That way, I would know how to add new opcodes to support other members of the 6500 family.
I agree that UPMODE displayed too much cleverness. My assembler works just fine without it and my index table is ten bytes smaller than his.
Re: Self Modifying Code
MichaelM wrote:
At a lunch time discussion with a colleague, we came to the conclusion that the function code was pre-compiled, i.e. static. The operating data of the function was somehow dynamically constructed and at run-time linked to the relevant instruction stream. Can I assume that your definition of SMC is such that since the "functional" behavior is static, then the resulting dynamically constructed function is not SMC in the same sense as possible in FORTH where the instruction stream can be physically changed?
Simply, if I were to "disassemble" a piece of code at a specific address at the beginning of a program, then disassemble that same address range again later, and the two listings were different, then that's SMC (modulo the relocation of code by a loader or linker or anything like that).
JimBoyd's example is a common piece of code that gets changed based on who invokes it.
So (and I have not looked at your code in any detail), rather than having something like:
Code: Select all
DOIT:
LDA FLAG
BEQ $1
JSR DOTHIS
JMP $2
$1: JSR DOTHAT
$2: RTS
you would have something like:
Code: Select all
_DOIT:
JSR _DOTHIS
RTS
DOTHAT:
LDA #<_DOTHAT
STA _DOIT+1
LDA #>_DOTHAT
STA _DOIT+2
JMP _DOIT
DOTHIS:
LDA #<_DOTHIS
STA _DOIT+1
LDA #>_DOTHIS
STA _DOIT+2
JMP _DOIT
THAT is SMC to me. My earliest attempts at 6502 used SMC for a block move routine, changing the LDA and STA addresses (I wasn't too well versed in the indirect and indexed addressing modes at the time).
This is quite different from passing pointers to compiled code around to be invoked as is done with dynamic languages (or just function pointers in, say, C).
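The "disassemble it twice and compare" test can be sketched with a toy memory image. This is a Python model only, not 6502 code. The addresses 0300, 0400 and 0500 and the helper names are hypothetical; the JSR and RTS opcode values are the real ones.

```python
memory = bytearray(0x10000)          # 64 KB address space, all zeros

DOIT = 0x0300                        # hypothetical address of _DOIT
DOTHIS_TARGET = 0x0400               # hypothetical address of _DOTHIS
DOTHAT_TARGET = 0x0500               # hypothetical address of _DOTHAT
JSR, RTS = 0x20, 0x60                # real 6502 opcodes

def assemble_doit(target):
    """Lay down  _DOIT: JSR target / RTS  (operand is little-endian)."""
    memory[DOIT:DOIT + 4] = bytes([JSR, target & 0xFF, target >> 8, RTS])

def patch_doit(target):
    """What DOTHIS and DOTHAT do: overwrite the operand at _DOIT+1 and _DOIT+2."""
    memory[DOIT + 1] = target & 0xFF
    memory[DOIT + 2] = target >> 8

def snapshot():
    """Copy of the four bytes a disassembler would read at _DOIT."""
    return bytes(memory[DOIT:DOIT + 4])

assemble_doit(DOTHIS_TARGET)
before = snapshot()
patch_doit(DOTHAT_TARGET)
after = snapshot()
print(before.hex(), after.hex(), "SMC" if before != after else "not SMC")
# prints: 20000460 20000560 SMC
```

Keeping the target in a separate variable and jumping through it would leave these four bytes unchanged, which is the distinction being drawn with function pointers.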
The two main complaints against SMC are ROMability and clarity.
The Fig-Forth thing is actually done in zero page, so this aspect alone doesn't affect ROMability (I don't know if the stock FIG is ROMmable as is or not, but if it isn't, it's not because of this). FIG actually puts the JMP code into RAM, lying in wait for the address later, during cold boot.
If Fig didn't use the technique of populating the address of an indirect jump, it could have simply played games with the stack and invoked RTS.
Clarity is the biggest thing. Most folks don't do it just because of that, no matter how well documented it is. It's most routinely done in copy protection logic (which, by definition, is designed to hinder clarity).
As a rule, I tend to not like code that doesn't do what the source code says it does. It's not a universal truth, it's just a guideline I tend to favor.
In Java, it's quite possible to see something like:
Code: Select all
public class X {
public void thingI() {
System.out.println("Hi there!");
}
public static void main(String args[]) {
X x = new X();
x.thing();
}
}
Shenanigans are involved, of course, but it's possible.
Debugging code like that is like Bugs Bunny playing "Those Endearing Young Charms" on a "piana".
https://www.youtube.com/watch?v=gUsJXwE73QU
Re: Self Modifying Code
I'm surprised nobody noticed something about SKIP in my example of SMC.
This:
Code: Select all
// SKIP
CODE SKIP ( A1 L1 C -- A2 L2 )
0= NOT # LDA, // LOAD OPCODE F0 ( BEQ )
' SCAN @ 3 + @ STA,
' SCAN @ 5 + JMP, END-CODE
can be streamlined to this:
Code: Select all
CODE SKIP ( A1 L1 C -- A2 L2 )
0= NOT # LDA, // LOAD OPCODE F0 ( BEQ )
' SCAN @ 2+ JMP, END-CODE
whartung wrote:
As a rule, I tend to not like code that doesn't do what the source code says it does. It's not a universal truth, it's just a guideline I tend to favor.
In my example, SCAN does exactly what the source code says it does. Storing the type of branch in the code simply sets it back to doing what the source code says, because SKIP changes it slightly, and it's not too difficult to see what SKIP does.
Imagine if you will a hypothetical extension to the 6510 processor ( the one in the C64 ). An extra flag is added, the branch invert flag. When this flag is cleared all branches work normally. When this flag is set all branches operate in the opposite sense. With such a processor, SCAN would clear the branch invert flag. SKIP would set it then jump into SCAN to the instruction after the clear branch invert instruction. This version of SCAN and SKIP would not be SMC. My version of SCAN and SKIP uses SMC to accomplish the same code reuse on the 6510 that would be possible on the hypothetical extended 6510.
I'm not thinking about SMC that twists the code beyond all recognition, rather to allow better code reuse as with SCAN and SKIP or to work around limitations such as what NEXT does to work around the 6502 ( and 6510 ) not having a double indirect jump.
I realize that my version, with SMC, would not work in ROM. Here is a version I might use in a ROM based Forth for the 65C02 or even a cartridge for the C64 to keep the size down. It depends on how badly I would need to save memory ( to fit the system in ROM ).
Code: Select all
// SCAN
HEX
CODE SCAN ( AD1 N1 C -- AD2 N2 )
DEY,
N 6 + STY, 0 # LDY, // SETUP NEEDS Y TO BE ZERO
3 # LDA, SETUP JSR,
AHEAD,
BEGIN,
N 4 + INC,
0= IF, N 5 + INC, THEN,
N 2+ LDA,
0= IF, N 3 + DEC, THEN,
N 2+ DEC,
CS-SWAP THEN,
N 2+ LDA, N 3 + ORA,
0= NOT WHILE,
N 4 + )Y LDA, N EOR, .A ASL,
0= NOT IF, 0FF # LDA, THEN, // BE SURE RESULT IN ACCUMULATOR IS 0 OR FF
N 6 + EOR, // SCAN INVERTS THE RESULT, SKIP DOES NOT
0= NOT UNTIL, // SO THE TEST IS REVERSED
THEN,
DEX, DEX,
N 4 + LDA, 0 ,X STA,
N 5 + LDA, 1 ,X STA,
N 2+ LDA, PHA,
N 3 + LDA,
PUSH JMP, END-CODE
CODE SKIP ( A1 L1 C -- A2 L2 )
// SKIP HAS NO PARAMETER FIELD. ITS CFA POINTS ONE BYTE INTO SCAN
// AVOIDING THE DEY, INSTRUCTION
-2 ALLOT
' SCAN @ 1+ , END-CODE
Notice the extra clock cycles added to the loop.
I thought of another version to keep the size down that doesn't use SMC and shouldn't incur as many extra cycles in the loop as this one does, but the SMC version reads easier. This other version wasn't pretty. Trust me.
[Edit: I forgot to remove the screen number ( and comment line ) in this example the first time. Oops. Yes, my Forth system has source in blocks. ]
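For anyone who wants the intent without reading the assembler, here is a rough Python model of one loop shared by SCAN and SKIP. The `invert` argument plays the role of the hypothetical branch invert flag described above. The function names and the test string are invented for the sketch, and it says nothing about cycle counts.

```python
def scan_loop(text, addr, length, char, invert):
    """Shared loop. invert=False behaves like SCAN, invert=True like SKIP."""
    while length > 0:
        matched = text[addr] == char
        if matched != invert:        # SCAN stops on a match, SKIP on a mismatch
            break
        addr += 1
        length -= 1
    return addr, length

def scan(text, addr, length, char):
    """( A1 L1 C -- A2 L2 ) advance to the first occurrence of char."""
    return scan_loop(text, addr, length, char, invert=False)

def skip(text, addr, length, char):
    """( A1 L1 C -- A2 L2 ) advance past leading occurrences of char."""
    return scan_loop(text, addr, length, char, invert=True)

line = b"  ab cd"
start, remaining = skip(line, 0, len(line), ord(" "))
print(start, remaining)                          # prints: 2 5
print(scan(line, start, remaining, ord(" ")))    # prints: (4, 3)
```

The SMC version gets the same sharing by patching the branch opcode once, outside the loop, instead of testing a flag on every pass.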
Re: Self Modifying Code
Dr Jefyll wrote:
I apologize if I've drifted off-topic. This post has more to do with working around Ragsdale's structured conditionals than with self-modifying code per se.
Cheers,
Jim
Re: Self Modifying Code
BTW, there is a discussion of self modifying code in general, rather than Forth specific, in the Programming sub-forum here.
Re: Self Modifying Code
GARTHWILSON wrote:
How 'bout a separate SMC stack? It wouldn't need much space.
There could be words to move single values from the data stack to the aux stack and back. Analogous to >R and R> , they could be called >A and A> .
There could also be words to move two cells between the data and aux stacks: 2>A and 2A> .
Words to move control flow data from the control flow stack ( which on most implementations is probably just the data stack ) to the aux stack and back could be CS>A and A>CS .
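A minimal Python model of the proposed words, just to pin down the stack effects. It assumes 2>A and 2A> keep the pair in the same order, the way the standard 2>R and 2R> do; that detail is an assumption, not something stated above. The Python names are stand-ins for the Forth words.

```python
data_stack = []
aux_stack = []

def to_a():          # >A   ( x -- )       ( A: -- x )
    aux_stack.append(data_stack.pop())

def a_from():        # A>   ( -- x )       ( A: x -- )
    data_stack.append(aux_stack.pop())

def two_to_a():      # 2>A  ( x1 x2 -- )   ( A: -- x1 x2 )
    x2 = data_stack.pop()
    x1 = data_stack.pop()
    aux_stack.extend([x1, x2])

def two_a_from():    # 2A>  ( -- x1 x2 )   ( A: x1 x2 -- )
    x2 = aux_stack.pop()
    x1 = aux_stack.pop()
    data_stack.extend([x1, x2])

data_stack.extend([10, 20, 30])
to_a()               # park 30 on the aux stack
two_to_a()           # park 10 20 as a pair
two_a_from()         # bring the pair back in the same order
a_from()             # bring 30 back
print(data_stack, aux_stack)   # prints: [10, 20, 30] []
```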
Re: Self Modifying Code
I don't think SMC is as useful with high level code. The only example I could think of involves a deferred word's 'vector', for lack of a better word, setting the deferred word to another vector.
Case in point: Fleet Forth's (ABORT") , the word compiled by ABORT" , executes the word WHERE . WHERE shows the location of the error ( or tries to ). If loading a block causes an error, WHERE will try to load that block to show the error, causing yet another error ( recursively! ).
One solution was to try something like the following:
Code: Select all
DEFER (WHERE)
: SHOW.WHERE
['] NOOP IS (WHERE)
// SHOW THE LOCATION OF THE ERROR
//
//
;
: WHERE
(WHERE)
['] SHOW.WHERE IS (WHERE) ;
(WHERE) is set to a no-op by SHOW.WHERE . If an error occurs while SHOW.WHERE is running, WHERE gets executed, but does nothing more than reset (WHERE) back to SHOW.WHERE so it's ready for the next error.
It was actually easier to take care of this with a flag variable like so:
Code: Select all
VARIABLE WHERE? TRUE WHERE? !
: WHERE
WHERE? @
IF
WHERE? OFF
// SHOW THE LOCATION OF THE ERROR
//
//
THEN
WHERE? ON ;
So I'm not sure how useful self modifying high level Forth is. Does anyone have another example of high level Forth self modifying code?
Cheers,
Jim
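The deferred-word version can be modelled in Python to show the re-entrancy guard at work. This is only a sketch: the class and attribute names are invented, the initial arming of the vector is an assumption (the listing above does not show it), and a real nested error would abort rather than return the way the nested call does here.

```python
class ErrorReporter:
    """Toy model of WHERE with a deferred (WHERE) vector."""

    def __init__(self):
        self.where_vector = self.show_where   # (WHERE) starts out armed (assumption)
        self.shown = []                       # record of locations displayed
        self.fail_while_showing = False       # simulate an error inside SHOW.WHERE

    def noop(self):
        pass

    def show_where(self):
        self.where_vector = self.noop         # ['] NOOP IS (WHERE)
        self.shown.append("error location")
        if self.fail_while_showing:
            self.where()                      # the nested error runs WHERE again

    def where(self):
        self.where_vector()                   # (WHERE)
        self.where_vector = self.show_where   # ['] SHOW.WHERE IS (WHERE)

reporter = ErrorReporter()
reporter.fail_while_showing = True
reporter.where()
print(reporter.shown)                         # prints: ['error location']
```

Without the swap to the no-op, the nested call would recurse until the stack ran out. With it, the location is reported once and the vector is armed again afterwards.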
Re: Self Modifying Code
What about manipulating the return stack to control program flow?
I was just reading Dynamically Structured Codes by M. L. Gassanenko. Does this count as self modifying code?
Re: Self Modifying Code
JimBoyd wrote:
What about manipulating the return stack to control program flow?
As we know, the stuff between BEGIN and AGAIN would ordinarily keep happening forever. But in this case, somewhere between BEGIN and AGAIN a condition is tested, and eventually the mystery word is allowed to execute, causing top-of-R to be dropped. Because it's a colon definition, the mystery word concludes with SEMIS which of course invokes the un-nest sequence. And instead of un-nesting to the mystery word's caller (ie, the word with the BEGIN AGAIN loop), we un-nest to that word's caller.
You might wonder whether the BEGIN AGAIN shouldn't just be replaced with a BEGIN WHILE REPEAT but this particular situation doesn't seem amenable to properly structured conditionals, and the R> DROP is used as a GOTO of sorts! A similar need may arise in other situations; I don't suppose the FIG example is unique.
That concludes the summary. To be explicit I'll need to identify the situation, and that entails explaining some cleverness in a different department. The word containing R> DROP has a name one character long, and that character is an ASCII Null -- $00. Since Null is unprintable, the FIG Glossary lists the word as X, and interested parties can find it under that pseudonym.
As part of Forth's startup, QUIT calls INTERPRET and INTERPRET is the word with the BEGIN AGAIN loop I mentioned. The loop uses -FIND to get fragments of input text one by one and either compile or execute them. When no more text is available, a fetch from the buffer yields only a null, which is the end-of-buffer marker. -FIND dutifully does a search attempting to find a word in the dictionary named null, and the search is successful! Unprintable/X/Null is executed, and it's a case of, "Scotty, beam me up!"
-- Jeff
( QUIT is in Screen 54 of the FIG Forth source. INTERPRET is in Screen 52, and X is in Screen 45.)
Re: Self Modifying Code
Fleet Forth does something similar. When WORD parses the text stream, whether a block, the text input buffer, or a string, it places a counted string at HERE and appends a blank. When the text stream is exhausted, WORD places a count of zero and no characters; it still appends a blank. The blank name is found, and it is immediate. In Fleet Forth, the blank name has a code field that points to EXIT , but no body, making it an alias for EXIT . Why waste memory on a colon definition when an alias for EXIT will work?
What about the technique used by M. L. Gassanenko? What do you think of it?
Re: Self Modifying Code
I don't want to get hung up on assembler control flow workarounds, but since SCAN and SKIP were already mentioned, here is what I think is the best solution since I added an Auxiliary stack to Fleet Forth.
Given the way my INTERPRET is defined ( with EXECUTE called by INTERPRET rather than by some word called by INTERPRET ), if I squeeze the code enough to fit both >A and A> on the same source screen I wouldn't even need to use the Auxiliary stack. It just wouldn't be portable.
Code: Select all
SCR# 38
// SCAN
HEX
CODE SCAN ( AD1 N1 C -- AD2 N2 )
0= # LDA, // LOAD D0 ( BNE )
HERE 1+ >A
BAD STA,
3 # LDA, SETUP JSR,
AHEAD,
BEGIN,
N 4 + INC,
0= IF, N 5 + INC, THEN,
N 2+ LDA,
0= IF, N 3 + DEC, THEN,
N 2+ DEC,
CS-SWAP THEN,
N 2+ LDA, N 3 + ORA,
SCR# 39
// SCAN SKIP
0= NOT WHILE,
N 4 + )Y LDA, N EOR, .A ASL,
HERE A> !
0= UNTIL,
THEN,
DEX, DEX,
N 4 + LDA, 0 ,X STA,
N 5 + LDA, 1 ,X STA,
N 2+ LDA, PHA,
N 3 + LDA,
PUSH JMP, END-CODE
CODE SKIP ( A1 L1 C -- A2 L2 )
0= NOT # LDA, // LOAD F0 (BEQ)
' SCAN @ 2+ JMP, END-CODE
Re: Self Modifying Code
Dr Jefyll wrote:
JimBoyd wrote:
What about manipulating the return stack to control program flow?
Code: Select all
: ENTER >R ; \ ( tcf-addr -- ) call the threaded code fragment at tcf-addr
: SUCC COMPILE R@ COMPILE ENTER ; IMMEDIATE
: FAIL COMPILE R> COMPILE DROP COMPILE EXIT ; IMMEDIATE
: 1-10 ( --> i --- i --> ) \ generate numbers from 1 to 10
0 BEGIN 1+ DUP 11 <
WHILE SUCC \ call the continuation, of type ( i -- i )
REPEAT
DROP
FAIL ; \ exit the code fragment that contains the continuation
: //2 ( i --> i --- i --> i ) \ filter even numbers
DUP 2 MOD 0=
IF SUCC \ call the continuation, of type ( i -- i );
\ (in the case of //2 we could just exit)
THEN
FAIL ; \ exit the code fragment that contains the continuation
: .even1-10 ( -- ) 1-10 //2 DUP . ;
Code: Select all
.even1-10 2 4 6 8 10 ok
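In the Forth above, SUCC uses the return address on top of the return stack as a continuation and calls it, and FAIL drops that address and exits. As a loose analogy only ( Python cannot touch its own return stack ), the same generate-and-filter pipeline can be written with explicit continuation arguments. The names `one_to_ten`, `even_only` and `printed` are invented for the sketch.

```python
def one_to_ten(succ):
    """Like 1-10: call the continuation once per number, then give up (FAIL)."""
    for i in range(1, 11):
        succ(i)

def even_only(succ):
    """Like //2: pass a number on to the continuation only if it is even."""
    def step(i):
        if i % 2 == 0:
            succ(i)
    return step

printed = []
one_to_ten(even_only(printed.append))   # plays the part of  1-10 //2 DUP .
print(*printed)                         # prints: 2 4 6 8 10
```

Nothing in the Python version is modified at run time. The Forth version does not change any compiled code either; it only redirects control through the return stack.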