
All times are UTC




Posted: Tue Aug 23, 2022 10:39 pm

Joined: Wed Jun 29, 2022 2:15 am
Posts: 44
In a video interview, when asked if the 6502 was RISC, Bill Mensch said it was neither RISC nor CISC, but a CPU with addressable registers. He was, of course, talking about the 256 addresses in the zero page.

Meanwhile, over at Apple, Woz and team used most of those memory addresses as global variables, choosing the zero page because that saved some precious bytes of ROM, and because when Woz wrote the first version of the ROM for the Apple board kit (a.k.a. Apple I) he didn't have enough memory for the 40x24 page of text.

At the same time, over at Atari, the Atari 2600 used half of zero page to address the TIA chip, likely choosing that address space because the game ROMs were only 4K in size. The other half of the zero page was all of RAM, including the stack, and thus again really just global variables.

Are there any 1970s use cases of the 6502 that used zp as registers?

What I'm wondering is how much of the use of that space as globals was due to the syntax of 6502 assembly? LDA $03 doesn't look or feel any different from LDA $303. What if instead the syntax were LDA R3? E.g.
Code:
LDX  #2
LDA $FF82,X
STA R1
INX
LDA $FF82,X
STA R2
LDY $801
LDA (R1),Y
STA R129

I posit that if we change the syntax of accessing zp without changing any of the actual behavior, then we'd look upon those 256 bytes as 256 registers rather than just another 256 bytes of memory.
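
A minimal sketch of such a syntax layer, assuming a hypothetical pre-pass that rewrites every Rn operand as the matching zero-page address before an ordinary assembler sees the line. The function name `expand_registers` and the R0..R255 range are assumptions made for illustration and are not taken from any real assembler:

```python
import re

# Hypothetical register syntax: R0..R255 name zero-page addresses $00..$FF.
_REG = re.compile(r"\bR(\d+)\b")

def expand_registers(line):
    """Rewrite every Rn operand in one source line as a zero-page address."""
    def to_zp(match):
        n = int(match.group(1))
        if n > 0xFF:
            raise ValueError(f"R{n} is outside zero page (R0..R255)")
        return f"${n:02X}"
    # Only touch the code part; leave any ';' comment as written.
    code, sep, comment = line.partition(";")
    return _REG.sub(to_zp, code) + sep + comment

source = ["LDA $FF82,X", "STA R1", "LDA (R1),Y", "STA R129"]
for src in source:
    print(f"{src:<14} -> {expand_registers(src)}")
```

The generated opcodes are the same either way; only the spelling of the operand changes, which is the point of the question.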

Agree? Disagree? Other thoughts on the 259 registers of the 6502, only three of which have names?


Posted: Tue Aug 23, 2022 11:21 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
It's always an interesting topic. One would have to remember that two-byte "registers" used for indirects take two addresses, and if R3 were one of these, then the next one would have to be R5, not R4.

Woz's SWEET-16 had 16 virtual 16-bit registers, 32 bytes in total, located in the zero page of the Apple II's real, physical memory map (at $00–$1F), according to Wikipedia (where I looked it up since, although I've been aware of SWEET-16, I never really got familiar with it).

ZP is the natural place for a data stack too, using X as the stack pointer, and some cells will contain addresses to access, as in LDA (2,X), which is done all the time in Forth.
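
To make the LDA (2,X) case concrete, here is a small sketch that models how the 6502 forms the effective address in (zp,X) mode. The stack pointer value and the memory contents are made up for the example:

```python
def indexed_indirect(mem, zp, x):
    """Effective address for LDA (zp,X): the pointer is fetched from zero page."""
    ptr = (zp + x) & 0xFF          # the index wraps within zero page
    lo = mem[ptr]
    hi = mem[(ptr + 1) & 0xFF]     # the high byte also wraps within zero page
    return lo | (hi << 8)

mem = bytearray(0x10000)
x = 0xF0                             # assumed data stack pointer: top cell at $F0/$F1
mem[0xF2], mem[0xF3] = 0x34, 0x12    # second cell on the stack holds the address $1234
mem[0x1234] = 0x99

addr = indexed_indirect(mem, 2, x)   # models LDA (2,X)
print(hex(addr), hex(mem[addr]))
```

With 2-byte cells, an operand of 0 reaches the top cell and 2 reaches the one below it, so the stack cell itself acts as the pointer without being copied anywhere first.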

The only time I've put code in ZP was a short routine when I was using instructions' operands as variables in SMC, to eliminate a level of indirection.

Regardless, I'm sure there are possible useful techniques that have yet to be thought of, to improve the power of the software.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Wed Aug 24, 2022 1:58 am

Joined: Wed Jan 08, 2014 3:31 pm
Posts: 578
Given the addressing modes only available in zero page, it shouldn't be treated the same as other RAM. I like splitting it in half, using one half as a data stack indexed by the X reg and the other half for register-like variables.

But it's also possible to locate a data stack in page 2 and use all of page zero as register-like variables.


Posted: Wed Aug 24, 2022 2:43 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
Martin_H wrote:
But it's also possible to locate a data stack in page 2

...as long as that stack doesn't contain addresses you need for indirect access like the (ZP,X) addressing mode since there's no (abs,X) addressing mode.

Posted: Wed Aug 24, 2022 7:15 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
It's common enough for some of the zero page locations to be used by different routines for different purposes. When that happens, I'd say those locations look more like registers than like global variables.

Edit: it's also often seen to have a software subsystem like floating point use some area of zero page both as working space and as a way to pass data in and out. Again, somewhat like registers.

I think it might be a stretch to try to make a strict dichotomy of it. Perhaps instead make a taxonomy of the various ways in which zero page is used.


Posted: Wed Aug 24, 2022 10:15 am

Joined: Wed Jan 08, 2014 3:31 pm
Posts: 578
GARTHWILSON wrote:
...as long as that stack doesn't contain addresses you need for indirect access like the (ZP,X) addressing mode since there's no (abs,X) addressing mode.

True, and I have used that addressing mode on occasion. But more often than not I need to use (zp),y to access multiple bytes via the pointer, so I pop it to one of my register-like variables.
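
A rough model of that pattern, assuming a data stack of 2-byte cells indexed by X and a scratch pointer pair at $10/$11 (both locations are hypothetical). The pointer is popped into the scratch pair once, then (zp),Y walks the bytes behind it:

```python
def indirect_indexed(mem, zp, y):
    """Effective address for LDA (zp),Y: zero-page pointer plus Y."""
    lo = mem[zp & 0xFF]
    hi = mem[(zp + 1) & 0xFF]
    return ((lo | (hi << 8)) + y) & 0xFFFF

mem = bytearray(0x10000)
mem[0x3000:0x3003] = b"ABC"            # data the pointer refers to
x = 0xF0                               # assumed data stack pointer
mem[x], mem[x + 1] = 0x00, 0x30        # top stack cell holds the address $3000

PTR = 0x10                             # hypothetical register-like pair $10/$11
mem[PTR], mem[PTR + 1] = mem[x], mem[x + 1]   # LDA 0,X / STA PTR / LDA 1,X / STA PTR+1
x += 2                                 # INX / INX drops the cell

data = bytes(mem[indirect_indexed(mem, PTR, y)] for y in range(3))
print(data)
```

The copy costs a few instructions up front, but after that each byte is one LDA (PTR),Y with only Y changing.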


Posted: Wed Aug 24, 2022 2:50 pm

Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
a while ago i did make some macros for a more RISC-like alternative instruction set for the 6502, it's obviously slower than just using the Accumulator, but i found the idea neat (plus an FPGA could have the ZP as an actual register file and implement these instructions in hardware to be much much faster)

example code:
Code:
    LDR R0, #$42        ; Load $42 into ZP Address $00
    LDR R1, #$80        ; Load $80 into ZP Address $01
    CLC
    ADC R2, R1, R0      ; Add the values from ZP Address $00 and $01 together, and store the result in ZP Address $02
    TFR R5, R2          ; "TransFeR" (ie copy) the value from ZP Address $02 into ZP Address $05
    PHR R2              ; Push the value in ZP Address $02 onto the Stack
    STR $1234, R5       ; Store the value from ZP Address $05 into Memory at location $1234
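
The post does not show what the macros expand to. The sketch below is one plausible expansion, written as a tiny translator, and it is an assumption rather than the actual macro set: every pseudo-instruction goes through A, and Rn is taken to mean zero-page address n. The helper names `zp` and `expand` are made up for the example:

```python
def zp(reg):
    """Map a hypothetical register name such as 'R5' to a zero-page operand."""
    if not (reg.startswith("R") and reg[1:].isdigit() and int(reg[1:]) <= 0xFF):
        raise ValueError(f"bad register {reg!r}")
    return f"${int(reg[1:]):02X}"

def expand(line):
    """Expand one pseudo-instruction into a list of native 6502 instructions."""
    mnemonic, _, rest = line.strip().partition(" ")
    ops = [op.strip() for op in rest.split(",")] if rest else []
    if mnemonic == "LDR" and len(ops) == 2:    # LDR Rd, #imm
        return [f"LDA {ops[1]}", f"STA {zp(ops[0])}"]
    if mnemonic == "TFR" and len(ops) == 2:    # TFR Rd, Rs
        return [f"LDA {zp(ops[1])}", f"STA {zp(ops[0])}"]
    if mnemonic == "ADC" and len(ops) == 3:    # ADC Rd, Ra, Rb
        return [f"LDA {zp(ops[1])}", f"ADC {zp(ops[2])}", f"STA {zp(ops[0])}"]
    if mnemonic == "PHR" and len(ops) == 1:    # PHR Rs
        return [f"LDA {zp(ops[0])}", "PHA"]
    if mnemonic == "STR" and len(ops) == 2:    # STR addr, Rs
        return [f"LDA {zp(ops[1])}", f"STA {ops[0]}"]
    return [line.strip()]                      # anything else passes through, e.g. CLC

program = ["LDR R0, #$42", "LDR R1, #$80", "CLC",
           "ADC R2, R1, R0", "TFR R5, R2", "PHR R2", "STR $1234, R5"]
for src in program:
    print(f"{src:<16} -> {' / '.join(expand(src))}")
```

Under this assumption A is clobbered by every pseudo-instruction, so code written this way would treat A purely as a scratch value.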


Posted: Wed Aug 24, 2022 9:44 pm

Joined: Fri Apr 15, 2022 1:56 pm
Posts: 47
Location: San Antonio, TX, USA
65LUN02 wrote:
What I'm wondering is how much of the use of that space as globals was due to the syntax of 6502 assembly? [...] I posit that if we change the syntax of accessing zp without changing any of the actual behavior, then we'd look upon those 256 bytes as 256 registers rather than just another 256 bytes of memory.
The AVR architecture has 32 8-bit registers, which can be used alone or in pairs depending on the operation, so it is similar in some ways to zero page, but with the registers named R0 through R31. Even so, years ago when programming an AVR in assembler (as a model helicopter flight controller), I ended up using most of the registers for global variables, to minimize code size and maximize performance. Especially in resource-constrained environments, I think programmers will use the resources they have available in the most effective way regardless of the naming.


Posted: Thu Aug 25, 2022 3:47 am

Joined: Wed Jun 29, 2022 2:15 am
Posts: 44
Yes, I agree that 256 registers (or 128 2-byte registers) seems excessive for most use cases. I'm thus not advocating to remove the existing LDA $0F syntax, but to add a LDA R15 alternative, covering R0-R255. Coders would deal themselves with however many bytes each R might hold, just as they deal with $04 being one byte and ($04),Y being two.

I have this working in my playroom assembler/compiler. It handles not only LDA $03 and LDA R3, but also a half-level higher syntax like A = M@$1234 or M$1234 += X or M$1234 += M$2345.

I'm halfway through implementing those expressions. Given addition can only happen in A, I'm using R0-R3 truly as registers to save and restore A, and I expect I'll be using a few more as I expand the expressions to three arguments. I know Garth and others on this forum like Forth, and would implement all this in a data stack, but I'm using the opportunity to instead see what happens if I treat the CPU as having a ton of registers, ignoring the stack completely.

@Proxy, I like your three register RISC idea. What does the macro for ADC R2, R1, R0 translate to? LDA R1, ADC R0, STA R2? If so, then in this pseudo-RISC chip do you simply never load or store A itself, instead truly using it as the accumulator? Either way, that is an interesting idea for my half-step higher level assembler/compiler: simply assume that A can get clobbered by any line of code and instead rely on R0-R255 (or some reasonable subset) plus X and Y.

For the assembly part of my testbed, that could mean outlawing half of the mnemonics: LDA, STA, ORA, ADC, and replacing them with something like LDR, STR, ORR, ADR that each take two arguments, Rn followed by the argument that would otherwise go with the traditional A register version of the mnemonic. Or mixing it into the mnemonic, e.g. LDR4 $1234 or STR7 ($80),Y.


Posted: Thu Aug 25, 2022 4:39 am

Joined: Fri Dec 12, 2008 10:40 pm
Posts: 1007
Location: Canada
I have to read this entire thread, but it seems like you want to implement a virtual machine in the new assembler over and above the native 6502 capabilities WRT registers. Am I close?

If you did not use up all of page 0, that would be awesome!

I have some reading to do...

_________________
Bill


Posted: Thu Aug 25, 2022 4:24 pm

Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
I think the thread just kinda slipped into that "Virtual Machine" thing.
Though i like the idea.
To refine it a bit more, how about the "registers" are limited to 32 in total (R0-R31). With word sized registers on top of them, so R0 and R1 form W0, R2 and R3 form W1, etc. which gives you a total of 16x 16-bit registers (W0-W15).
So in addition to the regular 8-bit RISC like instructions there would be a much smaller set of 16-bit instructions intended for the W registers (simple stuff like Load from Memory, Store to Memory, Push to Stack, Pull from Stack).


Posted: Thu Aug 25, 2022 4:46 pm

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
Proxy wrote:
I think the thread just kinda slipped into that "Virtual Machine" thing.
Though i like the idea.
To refine it a bit more, how about the "registers" are limited to 32 in total (R0-R31). With word sized registers on top of them, so R0 and R1 form W0, R2 and R3 form W1, etc. which gives you a total of 16x 16-bit registers (W0-W15).


So.... Sweet16?

There are also:

https://github.com/AcheronVM/acheronvm

https://github.com/dschmenk/PLASMA#portable-vm

I'm using the BCPL Cintcode/bytecode VM in my projects - this is a 32-bit VM with 8-bit instructions and uses as a base 3 x 32-bit registers (boringly called A, B and C), in zero/direct page, but it needs a few others like a program counter, globals (G) and stack pointer (K) as well as some temporaries for arithmetic.

I'm sure there are many others..

Quote:
So in addition to the regular 8-bit RISC like instructions there would be a much smaller set of 16-bit instructions intended for the W registers (simple stuff like Load from Memory, Store to Memory, Push to Stack, Pull from Stack).


The issue I've had with stuff like sweet-16 is the overhead of switching in and out of it - same for the '816 when it comes to handling byte data in 16 bit mode (and vice versa).

All good fun, though...

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Posted: Thu Aug 25, 2022 5:38 pm

Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
yea basically Sweet16 but with the 3 Operand instruction format that a lot of RISC CPUs use.
also who said anything about mode switching? my idea was to just have extra pseudo instructions specifically for 16-bit operations, no switching required.

for example:
LDR0 $1234 loads a byte from Address $1234 into R0
LDW0 $1234 loads 2 bytes, one from Address $1234 into R0, and the other from $1235 into R1
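
As a sketch of how such a pseudo-instruction might expand without any mode switch, assuming Wn overlays R(2n) and R(2n+1) in zero page with the low byte in the even register, as the example above describes. The function names are hypothetical:

```python
def word_pair(n):
    """Zero-page addresses (low, high) of hypothetical word register Wn."""
    if not 0 <= n <= 15:
        raise ValueError("only W0..W15 exist in this sketch")
    return 2 * n, 2 * n + 1

def expand_ldw(n, addr):
    """Native 6502 instructions that a 'LDWn addr' pseudo-instruction could emit."""
    lo, hi = word_pair(n)
    return [
        f"LDA ${addr:04X}",                  # low byte from addr ...
        f"STA ${lo:02X}",                    # ... into the even register
        f"LDA ${(addr + 1) & 0xFFFF:04X}",   # high byte from addr+1 ...
        f"STA ${hi:02X}",                    # ... into the odd register
    ]

for instruction in expand_ldw(0, 0x1234):
    print(instruction)
```

Because the expansion is plain inline code, 8-bit and 16-bit pseudo-instructions can be mixed freely, at the price of A being overwritten each time.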

but the more i think about it the more i kinda just want to make an expanded version of Sweet16 with this 3 Operand format so that an Accumulator is no longer needed.
i'll have to see if i can make a spec sheet of this idea. if i get something together i'll open a new thread.


Posted: Thu Aug 25, 2022 5:49 pm

Joined: Mon Feb 15, 2021 2:11 am
Posts: 100
I was having thoughts along similar lines when I started thinking up possible uses for old chips like the 74S181. First I was thinking use a few to make a simplified 6502-type CPU, then I thought of a simpler project tying a real 6502 to a register file at the bottom of zero page and a '181-based ALU, with a Sweet16-like interface to control it. As I started to realize how much physical space a bunch of '181's, shift registers, and register file chips would actually take, plus A and B data buses, not to mention costs on some of the parts, I started to hesitate. Then I noticed there are Am2901's and imitators and successors still to be found.

One of the things I've noticed is that, convenient as it would be to have a set of registers available, with an 8-bit data bus the number of cycles needed to get the instructions and send them to the ALU/coprocessor from the 6502 was often excessive for any two or three operand instructions involving many registers. Full RISC-style three-operand instructions, allowing for 16 opcodes and 16 registers, would require two bytes of instruction. A two operand instruction format with 6 bits of opcode and two 5 bit register numbers fits nicely into 16 bits, too. Two operand instructions could be squeezed into 12 bits, which might make them amenable to transfer from 6502 to coprocessor via bus sniffing + illegal opcodes, in 3-4 6502 cycles. Then there's conditional branching based upon results. Hardware can help again: an illegal opcode that checks a coprocessor flags register and turns the next 2-3 bytes read into NOP's instead of BR/JMP instructions if the conditions are not met might help. Reading the same flag register via a VIA, testing it, and then conditionally branching would work, but would likely be slower. Dr Jefyll's Kim Klone, Don Lancaster's Cheap Video, and some of the Byte and Dr. Dobb's articles from the late 70's and early 80's provide some interesting ideas for exploiting the illegal opcodes. I can provide a few links if anybody cares - I've been compiling them.
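
The bit budgets above can be checked with a short sketch. The field order (opcode first, then register numbers) and the sample values are assumptions for illustration, and the 12-bit case is read here as a 4-bit opcode plus two 4-bit register numbers:

```python
def pack(fields):
    """Pack (value, width) pairs, most significant field first, into one integer."""
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def total_bits(fields):
    """Number of bits the whole instruction word needs."""
    return sum(width for _, width in fields)

# Three operands, 16 opcodes, 16 registers: 4 + 4 + 4 + 4 = 16 bits.
three_op = [(0x3, 4), (2, 4), (1, 4), (0, 4)]
# Two operands, 6-bit opcode, two 5-bit register numbers: 6 + 5 + 5 = 16 bits.
two_op_wide = [(0x21, 6), (17, 5), (30, 5)]
# Two operands, 16 opcodes, 16 registers: 4 + 4 + 4 = 12 bits.
two_op_narrow = [(0x3, 4), (2, 4), (1, 4)]

for name, fields in [("three_op", three_op),
                     ("two_op_wide", two_op_wide),
                     ("two_op_narrow", two_op_narrow)]:
    bits = total_bits(fields)
    print(f"{name:<14} {bits:>2} bits  {pack(fields):0{bits}b}")
```

The two 16-bit formats each cost two instruction fetches on an 8-bit bus, while the 12-bit format leaves four bits of a two-byte sequence free for a marker such as an illegal opcode prefix.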

A 16-bit VM or pseudo-instructions for a RISC-like syntax, strictly in software, seems like it would slow things down. The speed at which the software runs, not the developer's time, obviously.

Maybe a hybrid of hardware and software? Perhaps with an ability to toggle the RAM and data bus between 8 bit and 16 bit access? Unaligned accesses would still need two reads/writes, though. Perhaps add enough to the coprocessor to at least fetch its own instructions when it is told to run, so it could run some 16-bit subroutines?

Just brainstorming electronically.


Posted: Thu Aug 25, 2022 8:09 pm

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
Proxy wrote:
yea basically Sweet16 but with the 3 Operand instruction format that a lot of RISC CPUs use.
also who said anything about mode switching? my idea was to just have extra pseudo instructions specifically for 16-bit operations, no switching required.


I was thinking more along the lines of the overhead of JSR SW16 ... RET and comparing it to mode switching on the '816, but if you can do it all in macros then that's eliminated. I think the AcheronVM works all in macros too.

Quote:
for example:
LDR0 $1234 loads a byte from Address $1234 into R0
LDW0 $1234 loads 2 bytes, one from Address $1234 into R0, and the other from $1235 into R1

but the more i think about it the more i kinda just want to make an expanded version of Sweet16 with this 3 Operand format so that an Accumulator is no longer needed.
i'll have to see if i can make a spec sheet of this idea. if i get something together i'll open a new thread.


Sounds interesting - keep us posted!

Cheers,

-Gordon
