65C02 TSB and TRB

leeeeee
In Memoriam
Posts: 347
Joined: 30 Aug 2002
Location: UK

Post by leeeeee »

Greetings all.

All the datasheets I have for the 65C02 instruction set say that TSB and TRB operate as follows ..

TSB does A OR M -> M, sets/clears Zb on the result

TRB does ~A AND M -> M, sets/clears Zb on the result

.. but having investigated this on a real 65C02 core (CCU3000 single chip micro) it seems to do this ..

TSB sets/clears Zb on the result of A AND M then does A OR M -> M

TRB sets/clears Zb on the result of A AND M then does ~A AND M -> M

Is this right? Or is there some different 'third way' that I'm as yet unaware of?

Cheers,

Lee.
GARTHWILSON
Forum Moderator
Posts: 8773
Joined: 30 Aug 2002
Location: Southern California

Post by GARTHWILSON »

Here it is out of WDC's programming manual which every 6502/65816 hobbyist or professional should have:

For TSB: "Logically OR together the value in the accumulator with the data at the effective address specified by the operand. Store the result at the memory location. ..."

and for TRB: "Logically AND together the _complement_ of the value in the accumulator with the data at the effective address specified by the operand. Store the result at the memory location. ..."

For both, it says Z is set or cleared based on a second, separate operation: the accumulator is ANDed with the original memory contents, just as BIT does, and this is the same for TSB and TRB. This test affects only Z, and its result is not kept anywhere else.

Garth
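
The behaviour described above can be traced with concrete values. This is only a sketch: the address `flags` and the bit patterns are made up for illustration, and the `label = value` equate syntax depends on the assembler in use.

```asm
flags   = $0200          ; hypothetical RAM location

        lda #%00010001
        sta flags        ; M = %00010001

        lda #%00110000   ; mask of bits to set
        tsb flags        ; test: A AND M = %00010000, nonzero, so Z=0
                         ; then: M = A OR M = %00110001

        lda #%00001100   ; mask of bits to clear
        trb flags        ; test: A AND M = %00000000, so Z=1
                         ; then: M = (NOT A) AND M = %00110001 (unchanged)

        lda #%00000001
        trb flags        ; test: A AND M = %00000001, nonzero, so Z=0
                         ; then: M = %00110000
```

In each case Z reports on the memory contents as they were before the instruction changed them, and A still holds the mask afterwards.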
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Post by barrym95838 »

Hi, all.

I figured that it was more appropriate to ask my questions in this thread rather than subject the other thread to more off-topic pollution.

Sub-topic 1: The Rockwell 'c02 seems to use 32 opcodes to set, reset, and test-and-branch individual zero-page bits. Does anyone here use these? It seems to me that they would be very fast and compact for flags, but nearly useless for I/O, unless port(s) were mapped into zero-page, a la 6510. Does anyone have any commented code snippets to share, showing how these guys work? (links would suffice)

Sub-topic 2: The WDC 'c02 seems to use just four opcodes to test-and-set and test-and-reset up to 8 bits anywhere in RAM. I know that Garth and BDD have used these, but I would be interested in knowing if there is a good reason for them only changing the Z flag. Why not N and V as well, like their close cousin, BIT? I'm guessing that there is a reason, but is it to make the hardware less complicated, or the hypothetical software using it less complicated, or both, or neither? Does anyone have any commented code snippets to share, showing how these guys work? (links would suffice)

Sub-topic 3: The WDC and Rockwell versions don't exist together on any design, as far as I know, due to functional overlap. If a hypothetical processor design implemented the Rockwell-like set with full-RAM addressing, do you guys think that it would be preferable to the WDC-like set?

Thanks,

Mike

Post by GARTHWILSON »

barrym95838 wrote:
I figured that it was more appropriate to ask my questions in this thread rather than subject the other thread to more off-topic pollution.
Wow, revival of an 11-year-old topic! That's partly why we keep the archives though-- to avoid having to repeat stuff, and to review what was learned in the past.
Quote:
Sub-topic 1: The Rockwell 'c02 seems to use 32 opcodes to set, reset, and test-and-branch individual zero-page bits. Does anyone here use these? It seems to me that they would be very fast and compact for flags, but nearly useless for I/O, unless port(s) were mapped into zero-page, a la 6510. Does anyone have any commented code snippets to share, showing how these guys work? (links would suffice)
There were some 6502-based microcontrollers with I/O in ZP which took advantage of BBS, BBR, SMB, and RMB. I have never had I/O in ZP, so I have had no real use for these. I think it would be more useful to have them for absolute addressing only than to have them for ZP addressing only. If you have the op codes available on the 65m32, I'd say go for it.
Quote:
Sub-topic 2: The WDC 'c02 seems to use just four opcodes to test-and-set and test-and-reset up to 8 bits anywhere in RAM. I know that Garth and BDD have used these, but I would be interested in knowing if there is a good reason for them only changing the Z flag. Why not N and V as well, like their close cousin, BIT? I'm guessing that there is a reason, but is it to make the hardware less complicated, or the hypothetical software using it less complicated, or both, or neither? Does anyone have any commented code snippets to share, showing how these guys work? (links would suffice)
I don't know why it doesn't do N & V as well. As I think about my own uses for TSB & TRB, I don't see much extra need for affecting N & V at the same time, probably because I have never tested bits while setting or clearing bits.
Quote:
Sub-topic 3: The WDC and Rockwell versions don't exist together on any design, as far as I know, due to functional overlap. If a hypothetical processor design implemented the Rockwell-like set with full-RAM addressing, do you guys think that it would be preferable to the WDC-like set?
WDC did add the Rockwell instructions to their '02 over 20 years ago, so they do have both. They do not reside in the same columns in the opcode table. However, if the instructions couldn't coexist, then I'd say the RMB, SMB, BBS, and BBR would be preferable, as long as they're not limited to ZP or DP. I have used TSB & TRB for more than one bit at a time though, which SMB & RMB cannot do. An example is at http://wilsonminesco.com/6502primer/SPI.ASM .
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
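
For readers who have not seen the two instruction groups side by side, here is a small sketch. The zero-page address `flag` is hypothetical, and the mnemonic spelling (`smb3 flag` as opposed to something like `smb 3,flag`) may differ between assemblers.

```asm
flag    = $20            ; hypothetical zero-page flag byte

        smb3 flag        ; Rockwell: set bit 3; A and the status flags are untouched
        bbr3 flag,skip   ; branch if bit 3 is clear (not taken here)
        rmb3 flag        ; clear bit 3 again
skip
        lda #%00000110   ; WDC: the mask goes in A
        tsb flag         ; set bits 1 and 2 in one instruction
        trb flag         ; A still holds the mask, so this clears them again
```

The Rockwell forms encode the bit number in the opcode and work on one bit of a zero-page byte; TSB and TRB take a mask in A, so they can handle several bits at once and also accept absolute addresses.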

Post by barrym95838 »

Thanks, Garth! I knew that I could count on you. Your link to the TRB & TSB code was perfect, and has caused me to lean toward them over the Rockwell stuff, for a few reasons.

1) The assembly language is more '6502'-like, which is important to me.
2) The ability to set or reset more than one bit at a time seems useful.
3) The op-code footprint is smaller.

Regarding the flag behavior: I see that you are using these instructions as simple outputs in your SPI example, but it's possible that there could be a use for knowing something about the location that you're modifying. I'm thinking of semaphores and the like, but it also occurs to me that some I/O ports have a bi-directional quality that could benefit from this too.
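
A minimal sketch of the semaphore idea, assuming a single processor where the only competition comes from interrupt handlers. The names `lock`, `LOCKBIT`, `trylock` and `unlock` are hypothetical. Because the test and the update happen inside one instruction, an interrupt cannot slip in between them; with more than one bus master, something would also have to keep the others off the bus during the read-modify-write.

```asm
lock    = $0300          ; hypothetical lock byte in RAM
LOCKBIT = %00000001      ; hypothetical: bit 0 means "taken"

; Try to take the lock.
; Returns Z=1 if we got it, Z=0 if it was already held.
trylock lda #LOCKBIT
        tsb lock         ; Z from A AND old M, then bit 0 of lock is set
        rts              ; RTS leaves the flags alone

; Release the lock.
unlock  lda #LOCKBIT
        trb lock         ; clear bit 0 (Z=0 here means it had been set)
        rts
```

A caller would do `jsr trylock` followed by `beq` to the code that needs the resource, or `bne` to a "busy" path.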

The finer control offered by the Rockwell stuff might be able to implement algorithms for compression or encryption more economically (especially when combined with indexed addressing, which is automatically included in the 65m32), but I don't know enough about these algorithms to decide whether the benefits would outweigh the cost of the additional op-code space.

I certainly don't have room for both, so unless someone offers a compelling argument in favor of the Rockwell stuff (with example code), I'm going to set my sights on TSB/TRB.

Mike

Post by GARTHWILSON »

barrym95838 wrote:
Thanks, Garth! I knew that I could count on you. Your link to the TRB & TSB code was perfect, and has caused me to lean toward them over the Rockwell stuff, for a few reasons.
Amazing. I know it's a pain to look at someone else's code, so I try to make it as readable as possible, and still wonder if anyone will read it.
Quote:
but it also occurs to me that some I/O ports have a bi-directional quality that could benefit from this too
like the 6522 VIA.
Quote:
The finer control offered by the Rockwell stuff might be able to implement algorithms for compression or encryption more economically (especially when combined with indexed addressing, which is automatically included in the 65m32), but I don't know enough about these algorithms to decide whether the benefits would outweigh the cost of the additional op-code space.
I've never done any real compression, but I'm working on another realtime multitasking project on a PIC (horrors-- why can't they put a 6502 or '816 in it?) and the variable space is tight if I want to stay in RAM bank 0 nearly full time to keep the software simpler and more efficient. I have a lot of flag variables, and I use a whole byte for each. It's possible to put 8 flags in a byte of course, but it gets a little messier.
Quote:
I certainly don't have room for both, so unless someone offers a compelling argument in favor of the Rockwell stuff (with example code), I'm going to set my sights on TSB/TRB.
No room for both? I guess it's TRB & TSB then. I would use RMB, SMB, BBS, and BBR plenty in I/O if they were made to do abs instead of ZP addressing. That lack of abs addressing there though will probably mean no one here has code already written using them, since putting I/O in ZP takes too much address-decoding logic and takes up too much of ZP.
White Flame
Posts: 704
Joined: 24 Jul 2012

Post by White Flame »

barrym95838 wrote:
The finer control offered by the Rockwell stuff might be able to implement algorithms for compression or encryption more economically (especially when combined with indexed addressing, which is automatically included in the 65m32), but I don't know enough about these algorithms to decide whether the benefits would outweigh the cost of the additional op-code space.
I've done my fair share of both; however, the compression I've done on the 6502 end of things has been tailored to the chip's capabilities, keeping bit twiddling requirements as cheap as possible.

Encryption algorithms are numeric processes that generally work with integer byte values, and don't involve much of any bit twiddling.

Compression often involves dealing with streams of variable-length numbers encoded in bytes. On the reading end, a zero test is useless for "get the next 5-bit number out of a byte stream", and we have the AND instruction anyway to pull masked values. Plus, decompression doesn't need to modify the values it's reading, so the modification part of TxB is superfluous for this use.

Writing to such a stream (assuming the current byte was initialized to zero) could be done with TSB, again as long as the addressing mode is flexible. Compression is slow anyway, writing has to deal with output values crossing byte boundaries regardless, and compression is often handled on a cross-development host instead of on the 6502, so that really doesn't need this particular micro-optimization. The amount of work saved is dwarfed by everything else going on.
BigDumbDinosaur
Posts: 9427
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Post by BigDumbDinosaur »

I've never put RMB, SMB, BBS, and BBR to use, primarily for the same reasons offered by Garth, plus the 65C816 doesn't implement them. I do have a fair number of TRBs and TSBs sprinkled through the POC firmware, and these instructions will get more use as I write code to implement my 816NIX filesystem and develop a lightweight kernel to go with it. TRB and TSB can be useful in manipulating bitmaps (using self-modifying code—neither instruction has an indexed addressing mode), and it so happens that I use several bitmaps in the 816NIX internal structure to allocate and release inodes and data blocks. A two instruction sequence, such as:

Code:

         lda #%00100000
         tsb bmaddr
is somewhat faster and tidier than:

Code:

         lda bmaddr
         and #%00100000
         sta bmaddr
That said, the lack of indexing on the TRB and TSB instructions can complicate things. I could write:

Code:

         lda bmbase,x
         and #%00100000
         sta bmbase,x
and not have to modify the operand of a TSB instruction, which can be messy.

Either way, the accumulator is going to get clobbered. Decisions, decisions...
x86?  We ain't got no x86.  We don't NEED no stinking x86!
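
When the bit to change is only known at run time, one way to live without an indexed TSB is to use indexed ORA with a small mask table. The following is a sketch under assumptions, not code from the POC firmware: the bitmap is taken to be 32 bytes (256 bits), the address given for `bmbase` is invented, `setbit` and `masktab` are hypothetical names, and `lsr a` and `.byte` may be spelled differently in a given assembler.

```asm
bmbase  = $0400          ; hypothetical start of a 32-byte bitmap

; Set the bit whose number (0-255) is in A. Uses A, X and Y.
setbit  pha              ; save the bit number
        and #%00000111   ; low three bits pick the bit within the byte
        tay
        pla
        lsr a            ; upper five bits pick the byte
        lsr a
        lsr a
        tax
        lda bmbase,x
        ora masktab,y    ; OR in the single-bit mask
        sta bmbase,x
        rts

masktab .byte $01,$02,$04,$08,$10,$20,$40,$80
```

Unlike TSB, this gives no indication of whether the bit was already set; a BIT-style test would have to be added separately if that matters.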
Rob Finch
Posts: 465
Joined: 29 Dec 2002
Location: Canada

Post by Rob Finch »

Code:

         lda bmbase,x
         and #%00100000
         sta bmbase,x
RTF65002 can do that sort of thing with bitmap instructions as:

Code:

lda #5      ; set the 5th bit
bms bmbase,x    ; relative to the bmbase,x

Post by barrym95838 »

Nice, Rob!

It looks like you did a hybrid of the Rockwell and WDC sets, by putting the bit number in A, instead of the AND/OR mask. I was considering something similar, but I haven't made up my mind yet. How much time did you spend considering the relative benefits of your idea vs. the multi-bit WDC and multi-opcode Rockwell plans? I'm almost never sure until I mock up some translations of working code.

I believe that you previously mentioned breaking the 8-bit op-code barrier for the RTF65002, so you probably have a lot more room to expand than I do, unless I follow the advice of a couple of friends and steal a bit or two from my embedded constant. I have some stubborn tendencies, so I don't know if that will wind up being in the cards or not. I need to pull the trigger soon, though!

Mike

Post by BigDumbDinosaur »

Rob Finch wrote:

Code:

         lda bmbase,x
         and #%00100000
         sta bmbase,x
RTF65002 can do that sort of thing with bitmap instructions as:

Code:

lda #5      ; set the 5th bit
bms bmbase,x    ; relative to the bmbase,x

That's an analog of the RMB and SMB 65C02 instructions, which don't exist in the 65C816. The significant improvement, I think, is that conceivably the mask in .A could have multiple bits set.

My interests, however, are in using the actual WDC parts, rather than FPGA implementations.

The crux of the problem with TRB and TSB (as well as RMB, SMB and the BBx group) is that none have an indexed addressing mode. These instructions are very useful in manipulating bitwise flags (I use several such flags in the POC's firmware, especially in the TIA-232 drivers) but lacking indexing, use of any of these bit-twiddlers on a bitmap becomes problematic. I envision one solution that would utilize self-modifying code. However, doing so would negate much of the efficiency of the TRB or TSB instruction, as an entire address would have to be set rather than merely adjusting .X or .Y.
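
To make the cost concrete, here is a rough sketch of what the self-modifying approach could look like. It assumes the code runs from RAM and that a hypothetical zero-page pointer `bmptr` already holds the address of the bitmap byte; the cycle counts in the comments are the usual 65C02 figures.

```asm
bmptr   = $10            ; hypothetical zero-page pointer to the target byte

        lda bmptr        ; 3 cycles
        sta patch+1      ; 4: low byte of the TSB operand
        lda bmptr+1      ; 3
        sta patch+2      ; 4: high byte of the TSB operand
        lda #%00100000   ; 2
patch   tsb $ffff        ; 6: placeholder operand, overwritten above
```

That comes to about 22 cycles, against about 11 for an `lda abs,x` / `ora #` / `sta abs,x` sequence with no page crossing. The `$ffff` placeholder is there so the assembler picks the three-byte absolute form of TSB rather than the zero-page one.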

Post by barrym95838 »

BigDumbDinosaur wrote:
... A two instruction sequence, such as:

Code:

         lda #%00100000
         tsb bmaddr
is somewhat faster and tidier than:

Code:

         lda bmaddr
         and #%00100000
         sta bmaddr
I don't think those sequences are equivalent anyway, BDD. I think that a direct translation of the first snippet would be:

Code:

        lda  #%00100000
        bit  bmaddr
        php
        ora  bmaddr
        sta  bmaddr
        plp
which further reinforces your point ... unless I'm hopelessly confused (it certainly wouldn't be the first time).

Mike B.
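
By the same reasoning, a TRB on the same location could be spelled out along the following lines. This is a sketch in the same style as the translation above, not an exact replacement.

```asm
        lda  #%00100000
        bit  bmaddr      ; Z from A AND M, as TRB would set it
        php              ; keep that Z across the next instructions
        eor  #$ff        ; A = complement of the mask
        and  bmaddr      ; clear the masked bit(s)
        sta  bmaddr
        plp              ; restore the Z produced by BIT
```

Neither expansion is a perfect stand-in: BIT with a memory operand also copies bits 7 and 6 of the memory byte into N and V, and A ends up holding something other than the mask, whereas TSB and TRB change only Z and leave A alone.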

Post by BigDumbDinosaur »

barrym95838 wrote:
BigDumbDinosaur wrote:
... A two instruction sequence, such as:

Code:

         lda #%00100000
         tsb bmaddr
is somewhat faster and tidier than:

Code:

         lda bmaddr
         and #%00100000
         sta bmaddr
I don't think those sequences are equivalent anyway, BDD. I think that a direct translation of the first snippet would be:

Code:

        lda  #%00100000
        bit  bmaddr
        php
        ora  bmaddr
        sta  bmaddr
        plp
which further reinforces your point ... unless I'm hopelessly confused (it certainly wouldn't be the first time).

Mike B.
Right! Must've been one of those late night posts. :lol: I should have written:

Code:

         lda #%00100000
         tsb bmaddr
is somewhat faster and tidier than:

Code:

         lda bmaddr
         ora #%00100000
         sta bmaddr
...and I wasn't trying to account for the BIT-like effect of TSB.

That the Boolean test occurs before TRB or TSB actually changes the memory location can be very useful. Consider, for example, this code fragment from POC V1.1's UART driver:

Code:

         lda tiatstab,x        ;transmitter status bit mask
         trb tiatxst           ;transmitter enabled?
         beq .0000020          ;yes
;
         lda #nxpcrtxe         ;no
         ldy #nx_cr            ;point at command register &...
         stasi .chan           ;enable transmitter
;
.0000020 ...
Here a datum to be transmitted has been written into the UART's FIFO, and now the code checks whether the transmitter has been shut down. If it was, it is restarted. The TRB TIATXST instruction does double duty: it clears a "transmitter is disabled" flag bit (in direct page) and tells us whether the transmitter is already enabled, in which case the flag bit would have been 0 when TRB was executed. Only if the flag bit was 1 would we write the "enable transmitter" mask to the UART's command register. So although TRB is slightly slower than BIT, the code is more compact and on average slightly faster.

Incidentally, STASI .CHAN is a stack pointer relative indirect indexed macro-instruction, generating the equivalent of STA (.CHAN,S),Y. The driver keeps ephemeral data on the stack, primarily channel indices, since the UART has two channels. That way one piece of code can service either channel. The .CHAN value is a stack pointer offset that is local to the routine in which this code is located.