
All times are UTC




Post new topic Reply to topic  [ 91 posts ]  Go to page Previous  1, 2, 3, 4, 5 ... 7  Next
Author Message
PostPosted: Wed Mar 29, 2023 4:36 am 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8234
Location: Midwestern USA
jgharston wrote:
Yes, I'd have (S,n) instead of (X,n) addressing mode. In 40+ years I think I've only ever used (X,n) once...

My virtual QUART driver makes extensive use of (<dp>,X) addressing. In fact, implementing an efficient driver, one that could support 115.2 Kbps on all four channels, coming and going, would have been quite difficult without (<dp>,X).

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Wed Mar 29, 2023 8:49 am 

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
gfoot wrote:
Does anybody know how relatively expensive different changes would have been?


Based on my own experience with CPU core design, I would say the most expensive thing would be to make changes to an already chosen path, especially because there was so much manual labor involved in chip design back then. When you first start with the design, you're overwhelmed with choices, and it's very hard to pick an optimal direction. Once the design starts to crystallize, you get a better idea of what you want, but then you don't really want to go back and start over.


PostPosted: Wed May 10, 2023 1:47 pm 

Joined: Sun Mar 19, 2023 2:04 pm
Posts: 137
Location: about an hour outside of Springfield
Add a Z register for X, Y, Z operations (yes, it can be serialized, but we live in a 3D world and I think it would help).
Put the stack on the IC if possible, as a form of L1 cache RAM (for speed and simplicity).

...general compatibility and interoperability with existing standards


PostPosted: Wed May 10, 2023 4:20 pm 

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
drogon wrote:
Or that was the thought... Did anyone actually use [decimal mode]?

Yup! Took me hours of debugging before I figured out it was accidentally getting turned on. :-)

gfoot wrote:
I wonder about NMI, now that you mention it...

I love NMI; it lets me get back into the monitor when something's gone broken in my code with interrupts on.

jgharston wrote:
I'd also ensure JSR pushed the proper return address, not address-1.

I've always suspected the reason for doing that was to save a cycle on JSR, though now that I look more closely at it I can't really see how it would do that. Maybe it avoids having to make an "increment PC or not?" decision when executing RTS? Anybody want to pop it into Visual 6502 and give a cycle-by-cycle account of JSR and RTS?

As for a more general comment, it's always worth thinking about instruction decoding when considering what features you might want to add. I believe that saving on decoding circuitry was the reason that the 6502 didn't have a BRA instruction: they'd already filled up an entire 3-bit "column" with the conditional branch instructions, and decoding one more branch instruction from a different column would have added considerable expense, compared to just using a single column.

_________________
Curt J. Sampson - github.com/0cjs


PostPosted: Wed May 10, 2023 4:28 pm 

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
cjs wrote:
I've always suspected the reason for doing that was to save a cycle on JSR, though now that I look more closely at it I can't really see how it would do that. Maybe it avoids having to make an "increment PC or not?" decision when executing RTS? Anybody want to pop it into Visual 6502 and give a cycle-by-cycle account of JSR and RTS?


It just naturally works out that way. The CPU first reads the LSB of the destination address, then pushes the PC, then reads the MSB, and then jumps to the new location. If you wanted to read the LSB, read the MSB, and then push the PC, you'd need an extra 8 bits to store the MSB somewhere.
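The ordering described above can be sketched in Python. This is a minimal model under stated assumptions, not a cycle-accurate simulation: the `Mini6502` class and its method names are hypothetical, and only PC, S, and memory are modelled. It shows why the pushed value ends up being "return address minus one": the PC is pushed while it still points at the last byte of the JSR instruction.

```python
class Mini6502:
    """Hypothetical toy model: just PC, the 8-bit stack pointer S, and 64 KB of memory."""

    def __init__(self):
        self.mem = bytearray(0x10000)
        self.pc = 0
        self.s = 0xFF

    def push(self, value):
        # The stack lives in page 1 and grows downwards.
        self.mem[0x0100 + self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF

    def pull(self):
        self.s = (self.s + 1) & 0xFF
        return self.mem[0x0100 + self.s]

    def jsr(self):
        # On entry PC points at the JSR opcode.
        self.pc = (self.pc + 1) & 0xFFFF   # opcode fetched, PC -> target LSB
        lsb = self.mem[self.pc]            # read LSB of the destination
        self.pc = (self.pc + 1) & 0xFFFF   # PC -> target MSB (last byte of the JSR)
        self.push(self.pc >> 8)            # push PCH
        self.push(self.pc)                 # push PCL: this is return address - 1
        msb = self.mem[self.pc]            # only now read the MSB of the destination
        self.pc = (msb << 8) | lsb         # jump

    def rts(self):
        lo = self.pull()
        hi = self.pull()
        # RTS compensates by incrementing the pulled address.
        self.pc = (((hi << 8) | lo) + 1) & 0xFFFF


cpu = Mini6502()
cpu.mem[0x0200:0x0203] = bytes([0x20, 0x34, 0x12])  # JSR $1234 at $0200
cpu.pc = 0x0200
cpu.jsr()
pushed = (cpu.mem[0x01FF] << 8) | cpu.mem[0x01FE]   # value left on the stack
target = cpu.pc
cpu.rts()
resumed = cpu.pc
```

In this sketch the MSB never needs a holding register, because it is read on the last step and goes straight into the PC.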


PostPosted: Wed May 10, 2023 5:08 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10836
Location: England
It's very much a teaching moment: JSR as it is saves needing an extra byte of storage. That was the level of price pressure on the design!


PostPosted: Wed May 10, 2023 5:29 pm 

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
And the LSB of the destination address is temporarily stored in the S register to save another byte.


PostPosted: Wed May 10, 2023 5:38 pm 

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
Arlet wrote:
The CPU first reads the LSB of the destination address, then pushes the PC, then reads the MSB, and then jumps to the new location. If you wanted to read the LSB, read the MSB, and then push the PC, you'd need an extra 8 bits to store the MSB somewhere.

Ah, of course! That's a classic 6502 optimisation; it had just never occurred to me that it would push between reading the LSB and the MSB of the target address.

I love learning this kind of stuff about the 6502 (and about old microcomputers in general). You really come out with a completely different viewpoint about what's sensible and what isn't when you learn what's going on at those lower levels.

_________________
Curt J. Sampson - github.com/0cjs


PostPosted: Sat May 27, 2023 4:26 pm 

Joined: Wed Jun 26, 2013 9:06 pm
Posts: 56
Even though the 6502 was supposed to be the original RISC CPU, later RISC CPUs like ARM went in a completely different direction, with fewer addressing modes but more registers. I always wondered what would've happened if the makers of the 6502 had taken the approach of more registers but fewer addressing modes, while still managing to keep it within the same price and transistor count.

For example, getting rid of indirect addressing modes, but using 16-bit address registers instead. I do wonder how much that would've changed the overall transistor count.


PostPosted: Sat May 27, 2023 6:51 pm 

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
Aaendi wrote:
Even though the 6502 was supposed to be the original RISC CPU...

It was not supposed to be, and it was not. Much as I enjoy my little joke about the 6502 being "a RISC CPU with one register," there are several issues with that.

1. "RISC" wasn't even a concept until 1980, or perhaps 1979 or late 1978 in unpublished work. That's years after the 6502.

2. RISC is distinguished from CISC, and arguably it was the VAX-11 (announced in '77, two years after the 6502 introduction) that provided the CISC standard of comparison, with all the extra, specialised instructions it added over the PDP-11.

3. RISC focuses on register-to-register operations, with registers being generic. The 6502 is very much a memory-oriented processor, even more than the 8080. For example, the 8080, while its operations tend to focus on the accumulator, is generic in the counterparty register: you can do things like `XRA A` (`XOR A` in Z80 mnemonics), which is simply something you can't do on the 6502. This was also quite efficient; `XRA A`/`XOR A` is the standard idiom for loading 0 into the accumulator, taking only one byte and one M-cycle (four T-cycles, IIRC, the minimum), whereas the 6502 must do `LDA #0`, taking two bytes and two cycles.

4. It's clear from much of the 6502 instruction coding that reducing the number of instructions was not about being RISC, but simply saving transistors. For example, it would be great to have a `BRA` instruction as the 6800 did (so great that they added it back in the 65C02!), but with a hard limit of 8 relative branch instructions to keep the coding of the particular branch instruction to three bits, there simply wasn't room for it. (Consider which branch instruction you would remove to make room for `BRA`, and why.)
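The three-bit branch coding mentioned in point 4 can be illustrated with a short Python sketch. The opcode values are the standard NMOS 6502 ones; the function and table names are hypothetical. All eight conditional branches share the low five bits `10000`, the top two bits pick the flag, and bit 5 gives the flag value that takes the branch.

```python
FLAGS = ["N", "V", "C", "Z"]  # selected by opcode bits 7-6

MNEMONICS = {
    ("N", 0): "BPL", ("N", 1): "BMI",
    ("V", 0): "BVC", ("V", 1): "BVS",
    ("C", 0): "BCC", ("C", 1): "BCS",
    ("Z", 0): "BNE", ("Z", 1): "BEQ",
}


def decode_branch(opcode):
    """Return (mnemonic, flag, value_that_branches), or None if not in the branch column."""
    if opcode & 0x1F != 0x10:        # low five bits must be 10000
        return None
    flag = FLAGS[opcode >> 6]        # which status flag is tested
    value = (opcode >> 5) & 1        # branch when the flag equals this value
    return MNEMONICS[(flag, value)], flag, value


# Decode every opcode in the column: $10, $30, $50, ... $F0.
column = [decode_branch(op)[0] for op in range(0x10, 0x100, 0x20)]
```

All eight slots in the column are taken, which is the point being made: the 65C02's `BRA` had to go elsewhere (opcode `$80`), and `decode_branch(0x80)` returns `None` in this sketch.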

Quote:
For example, getting rid of indirect addressing modes, but using 16-bit address registers instead. I do wonder how much that would've changed the overall transistor count.

Well, using 16-bit address registers instead of the zero page as pseudo-registers is still indirect addressing. And my guess is that it would have increased the transistor count by rather a bit, given that they felt it was worthwhile even to make the stack pointer just 8 bits, unlike the 6502's inspiration, the 6800.

And you can't even just change the X register to be 16 bits instead of having indirect ZP indexed addressing; one of the improvements over the 6800 (which had a 16-bit X register) made by the 6502 design team was to essentially have multiple (inefficient, because zero-page) index registers. If you've ever programmed a 6800, you'll well remember how inefficient it was to do things (particularly copies) with only one index register, and the additional pain caused to program design from not being able to push it on to the stack.

The 6502 has a marvelously efficient design. It's not easily appreciated until you first have programmed both it and the 6800, and second have gained some understanding of how the 6502 works internally. It took me many, many years to start to properly appreciate it.

_________________
Curt J. Sampson - github.com/0cjs


PostPosted: Sat May 27, 2023 7:30 pm 

Joined: Tue Nov 10, 2015 5:46 am
Posts: 217
Location: Kent, UK
cjs wrote:
3. RISC focuses on register-to-register operations, with registers being generic. The 6502 is very much a memory-oriented processor, even more than the 8080. For example, the 8080, while its operations tend to focus on the accumulator, is generic in the counterparty register: you can do things like `XRA A` (`XOR A` in Z80 mnemonics), which is simply something you can't do on the 6502. This was also quite efficient; `XRA A`/`XOR A` is the standard idiom for loading 0 into the accumulator, taking only one byte and one M-cycle (four T-cycles, IIRC, the minimum), whereas the 6502 must do `LDA #0`, taking two bytes and two cycles.

Register-to-register operations are the key, here. As the CPU frequency increases, the latency to memories becomes a bottleneck. Even an L2 cache hit in a modern ARM core has a latency you'd like to avoid if you can.

So keeping your working data in registers and doing your ALU and branch decisions based on them keeps the pipelines moving at full speed.

And when you do have to go to memory, burst-reading cache line sized units, and then keeping a cache hierarchy is important, as are techniques such as speculation to keep the memory interface busy and data moving even if the program is getting its current needs met from register-to-register or cache hits. All of this acts to decouple the instruction retirement rate from the memory latency.

The 6502 does its best work when the memory system can keep up, and service a transaction on every cycle, and even then it's only on the bus for half a cycle, so with a 70ns SRAM I think that means the 6502 maxes out at around 7MHz. Let me know if I'm mistaken about that.
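The back-of-envelope figure above can be checked with a few lines of Python. This is a deliberately simplified sketch: it assumes the memory access must fit entirely within the half cycle the CPU is on the bus, and it ignores address setup, decoding delays, and data setup and hold times, so a real system would top out somewhat lower. The function name and the `bus_fraction` parameter are hypothetical.

```python
def max_clock_mhz(access_ns, bus_fraction=0.5):
    """Highest clock (MHz) at which a memory with the given access time
    still fits in the fraction of each cycle available for the access."""
    period_ns = access_ns / bus_fraction   # full clock period needed
    return 1000.0 / period_ns              # convert ns period to MHz


estimate = max_clock_mhz(70)               # 70 ns SRAM, half a cycle on the bus
```

With these assumptions, 70 ns in half a cycle means a 140 ns period, or a little over 7 MHz, which matches the estimate in the post.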


PostPosted: Sat May 27, 2023 8:15 pm 

Joined: Sun Mar 19, 2023 2:04 pm
Posts: 137
Location: about an hour outside of Springfield
I am interested in the hub-bub about using zero page as 'registers'.

I again advocate for a 3rd "Z" register that can be even further reduced from Y, as Y is to X, as X is to A.
This would not be a full Z register, just something to help the robotic automation of Detroit auto factories.

I'd also try like crazy to get some of the base, low-level memory (zero page, stack) onto the die wherever possible.
I'd also try to get more I/O 'Chip Select' pins for more advanced chip-to-chip communication.
More Vectors in the upper band maybe.


PostPosted: Sat May 27, 2023 9:20 pm 

Joined: Tue Nov 10, 2015 5:46 am
Posts: 217
Location: Kent, UK
wayfarer wrote:
I am interested in the hub-bub about using zero page as 'registers'.

Architecturally the 6502 uses zero-page as a stand-in for real registers, but of course they're not real registers and they require explicit instructions to bring into and out of the processor.

I say "architecturally", as it's intrinsic to the design. There are no general-purpose address registers in the 6502 (i.e. registers that hold an address for the purpose of performing indirect accesses), like, for example, the HL, BC, DE, IX, IY regs in the Z80, the X, Y regs in the 6809, the SI, DI, BP regs in the 8086, or the IX reg in the 6800. In the 6502 such indirect accesses are accomplished by pairs of bytes in page zero, primarily via the (ZP),Y address mode, with the Y register providing an offset and an efficient way to advance the pointer using INY vs. having to increment the 16-bit value in memory with an INC/BNE/INC sequence.
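A rough Python sketch of the two approaches described above: reading through a zero-page pointer with a Y offset, and advancing the 16-bit pointer itself the way an INC/BNE/INC sequence would. The function names and the example addresses are hypothetical, and page-crossing cycle penalties are not modelled.

```python
def read_indirect_y(mem, zp, y):
    """Model of LDA (zp),Y: fetch a 16-bit pointer from zero page, then add Y."""
    base = mem[zp] | (mem[(zp + 1) & 0xFF] << 8)
    return mem[(base + y) & 0xFFFF]


def inc_pointer(mem, zp):
    """Model of INC zp / BNE skip / INC zp+1: bump the 16-bit pointer in memory."""
    mem[zp] = (mem[zp] + 1) & 0xFF
    if mem[zp] == 0:                        # low byte wrapped, so carry into the high byte
        hi = (zp + 1) & 0xFF
        mem[hi] = (mem[hi] + 1) & 0xFF


mem = bytearray(0x10000)
mem[0x20], mem[0x21] = 0xFE, 0x12           # pointer at $20/$21 holds $12FE
mem[0x1300] = 0x42

via_y = read_indirect_y(mem, 0x20, 2)       # $12FE + 2 = $1300, pointer untouched

inc_pointer(mem, 0x20)                      # pointer becomes $12FF
inc_pointer(mem, 0x20)                      # pointer becomes $1300
via_ptr = read_indirect_y(mem, 0x20, 0)
```

Both routes reach the same byte; the Y route leaves the pointer alone and costs a single INY per step, which is why it is the usual choice for runs of up to 256 bytes.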

It's somewhat cumbersome, but remember at the time (1975) the 6502's minimal design was dramatically cheaper than the other available CPUs, and its aggressive price point arguably unleashed the entire home computer industry. Worth it.


PostPosted: Sat May 27, 2023 9:46 pm 

Joined: Tue Nov 10, 2015 5:46 am
Posts: 217
Location: Kent, UK
My prioritized wish list:
    1. ADD/SUB without carry.
    2. (ZP),X instead of (ZP,X).
    3. Unconditional BRA.

2% of MSBASIC is SEC/CLC. Not a dramatic amount, but personally I find it cumbersome to need one of these before an ADC/SBC.
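To make the SEC/CLC point concrete, here is a small Python model of binary-mode ADC (decimal mode is not modelled, and the function names are hypothetical). Because ADC always adds the carry flag, a single-byte add needs a CLC first, while a multi-byte add relies on the carry being passed from byte to byte.

```python
def adc(a, operand, carry):
    """Binary-mode ADC on 8-bit values: returns (result, carry_out)."""
    total = a + operand + carry
    return total & 0xFF, total >> 8


def add16(x, y):
    """16-bit add built from two ADCs: CLC once, then let the carry ripple."""
    lo, c = adc(x & 0xFF, y & 0xFF, 0)      # CLC before the low byte
    hi, c = adc(x >> 8, y >> 8, c)          # the high byte uses the carry as-is
    return (hi << 8) | lo, c


clean = adc(0x10, 0x20, 0)                  # carry cleared first
stale = adc(0x10, 0x20, 1)                  # carry left set by earlier code
wide = add16(0x12FF, 0x0001)
```

The `stale` case is the bug a forgotten CLC produces, and `add16` is the case the carry-in design is optimised for.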

I've never used (ZP,X), so I'd rather have two independent indirect offsets. The counter to this is that certain types of application may suffer (e.g. Forth).

There's not much to say about BRA (2 bytes) vs. JMP (3 bytes)... one less byte. Unless position independence is important... in which case perhaps a BSR would be more valuable.

Overall none of these are game-changers... which most likely means I haven't written enough 6502 to have better ideas.


PostPosted: Sun May 28, 2023 2:22 am 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8234
Location: Midwestern USA
sark02 wrote:
My prioritized wish list...(ZP),X instead of (ZP,X)...I've never used (ZP,X)...

Guess you haven’t done much device driver programming. :D

Code:
;   -----------------------
;   SIO Receiver Processing
;   -----------------------
;
00D2CF  85 4A                  sta sioirqst          ;save channel IRQ status word
00D2D1  A0 00                  ldy #0                ;1st channel to process
;
00D2D3  46 4A         .0000010 lsr sioirqst          ;this channel interrupting?
00D2D5  90 32                  bcc .0000040          ;no, skip it
;
00D2D7  98                     tya                   ;copy channel index &...
00D2D8  0A                     asl                   ;make channel pointer...
00D2D9  AA                     tax                   ;offset
00D2DA  A9 40                  lda #nxpcresr         ;clear any RxD...
00D2DC  81 18                  sta (siocr,x)    <--- ;overrun (ugly hack)
;
00D2DE  A1 10         .0000020 lda (siosr,x)    <--- ;get receiver status
00D2E0  89 01                  bit #nxprxdr          ;any data?
00D2E2  F0 11                  beq .0000030          ;no, done with channel
;
00D2E4  A1 20                  lda (siofif,x)   <--- ;get datum from channel &...
00D2E6  EB                     xba                   ;protect it
00D2E7  B5 30                  lda sioputrx,x        ;get RX queue 'put' pointer
00D2E9  1A                     inc                   ;bump it
00D2EA  D5 28                  cmp siogetrx,x        ;any room in queue?
00D2EC  F0 F0                  beq .0000020          ;no, discard datum
;
00D2EE  EB                     xba                   ;recover datum &...
00D2EF  81 30                  sta (sioputrx,x) <--- ;store in queue
00D2F1  F6 30                  inc sioputrx,x        ;adjust queue 'put' pointer &...
00D2F3  80 E9                  bra .0000020          ;go back for more
;
00D2F5  38            .0000030 sec
00D2F6  B5 30                  lda sioputrx,x        ;compute datums...
00D2F8  F5 28                  sbc siogetrx,x        ;in queue
00D2FA  C9 E7                  cmp #s_siohwm         ;reach queue fill level?
00D2FC  90 0B                  bcc .0000040          ;no, move on to next channel
;
00D2FE  BD 0B EF               lda siotstab,x        ;yes, stop...
00D301  04 48                  tsb siorxst           ;incoming data flow
00D303  D0 04                  bne .0000040          ;data flow previously stopped
;
00D305  A9 90                  lda #nxpcrrsd         ;tell channel to...
00D307  81 18                  sta (siocr,x)    <--- ;deassert RTS
;
00D309  C8            .0000040 iny                   ;next channel
00D30A  CC 0C C0               cpy io_quart+nx_urega ;all channels processed?
00D30D  D0 C4                  bne .0000010          ;no

Note all the highlighted instructions (marked with arrows), in which specific registers in the four channels of the two DUARTs in POC V1.3’s hardware are easily selected with a single index value. Writing the above code without the benefit of (<dp>,X) addressing would have been possible, of course. However, lacking (<dp>,X) to quickly select each channel’s register set, it would have been necessary to set up the pointers each time the ISR ran in response to a virtual QUART interrupt, adding to both code size and total execution time.
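The channel-selection idea can be modelled in a few lines of Python. This is only a sketch of the addressing mode, not of the driver: the direct-page address `$10` for `siosr` is read off the `A1 10` encoding in the listing, but the device register addresses, the status values, and the function names are made up for illustration, and the model uses simple 8-bit zero-page wrapping rather than full 65C816 direct-page behaviour.

```python
SIOSR = 0x10  # direct-page address of the status-register pointer table (from "A1 10")


def read_indexed_indirect(mem, dp, x):
    """Model of LDA (dp,X): X selects which 16-bit pointer in the table to follow."""
    ptr = (dp + x) & 0xFF                            # zero-page wrap (simplification)
    addr = mem[ptr] | (mem[(ptr + 1) & 0xFF] << 8)   # fetch the selected pointer
    return mem[addr]


def install_pointer_table(mem, dp, addresses):
    """Done once at startup: one little-endian pointer per channel."""
    for ch, addr in enumerate(addresses):
        mem[dp + 2 * ch] = addr & 0xFF
        mem[dp + 2 * ch + 1] = addr >> 8


def channel_status(mem, channel):
    x = channel << 1                                 # TYA / ASL / TAX in the listing
    return read_indexed_indirect(mem, SIOSR, x)


mem = bytearray(0x10000)
status_regs = [0xC001, 0xC009, 0xC011, 0xC019]       # hypothetical device addresses
install_pointer_table(mem, SIOSR, status_regs)
for i, reg in enumerate(status_regs):
    mem[reg] = 0x80 + i                              # fake per-channel status bytes

statuses = [channel_status(mem, ch) for ch in range(4)]
```

The pointer table is filled in once, and after that a single doubled channel index reaches the right register on any channel, which is the saving the post describes.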

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!

