6502.org Forum

All times are UTC




Posted: Sun Jan 17, 2010 10:10 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
It is possible that the extra read access which occurs in the dead cycle is having a side-effect?

From this document which includes information on cycle-by-cycle timings:
Code:
 Zero page indexed addressing
     Write instructions (STA, STX, STY, SAX)

        #   address  R/W description
       --- --------- --- -------------------------------------------
        1     PC      R  fetch opcode, increment PC
        2     PC      R  fetch address, increment PC
        3   address   R  read from address, add index register to it
        4  address+I* W  write to effective address

       Notes: I denotes either index register (X or Y).

              * The high byte of the effective address is always zero,
                i.e. page boundary crossings are not handled.

  Absolute indexed addressing
     Write instructions (STA, STX, STY, SHA, SHX, SHY)

        #   address  R/W description
       --- --------- --- ------------------------------------------
        1     PC      R  fetch opcode, increment PC
        2     PC      R  fetch low byte of address, increment PC
        3     PC      R  fetch high byte of address,
                         add index register to low address byte,
                         increment PC
        4  address+I* R  read from effective address,
                         fix the high byte of effective address
        5  address+I  W  write to effective address

       Notes: I denotes either index register (X or Y).

              * The high byte of the effective address may be invalid
                at this time, i.e. it may be smaller by $100. Because
                the processor cannot undo a write to an invalid
                address, it always reads from the address first.

(While the access in the dead cycle is officially under-documented as 'internal operation' we know that it has to be something, and these descriptions, which are presumably from observation, make sense if you consider the necessary internal operations of the machine.)
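A minimal Python sketch of the absolute indexed write sequence tabulated above, for anyone who wants to see which addresses appear on the bus. The function name and the example PC value are assumptions made for illustration, and the model follows the documented 6502 sequence only; it says nothing about what an '816 actually drives during its internal-operation cycle.

```python
def sta_abs_x_bus_cycles(pc, base, x):
    """List (cycle, address, R/W) for STA base,X, following the table above.

    pc is the address of the opcode, base the 16-bit operand, x the 8-bit index.
    """
    low_sum = (base & 0x00FF) + (x & 0xFF)
    # Cycle 4 uses the summed low byte with the original high byte,
    # so a carry out of the low byte is not yet reflected.
    uncorrected = (base & 0xFF00) | (low_sum & 0x00FF)
    effective = (base + (x & 0xFF)) & 0xFFFF
    return [
        (1, pc & 0xFFFF, "R"),          # fetch opcode
        (2, (pc + 1) & 0xFFFF, "R"),    # fetch low byte of address
        (3, (pc + 2) & 0xFFFF, "R"),    # fetch high byte of address
        (4, uncorrected, "R"),          # read before write, high byte may be $100 low
        (5, effective, "W"),            # the actual store
    ]


for cycle, address, rw in sta_abs_x_bus_cycles(0xE268, 0xD000, 0x05):
    print(cycle, f"${address:04X}", rw)
```

With a base of $D000 and X = 5 there is no carry, so in this model cycle 4 reads the very register that cycle 5 then writes, which is the kind of extra access that could have a side-effect on an I/O device.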

(I keep losing that document and I find it quite difficult to find each time I need it...)


Posted: Sun Jan 17, 2010 10:50 am
Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
Is the order in which the hardware registers are loaded important?

Your simple code sets the registers in the order 4, 7, 2:
Code:
          ldx #reg4                   ;register offset
          lda #%01100011              ;parameter
          sta device,x
          ldx #reg7                   ;register offset
          lda #%00001100              ;parameter
          sta device,x
          ldx #reg2                   ;register offset
          lda #%01100011              ;parameter
          sta device,x

But your table version loops through the table backwards, decrementing Y, so the order is 2, 7, 4:
Code:
regtab   .byte reg4                   ;a device register offset (e.g., $04)
         .byte reg7                   ;ditto
         .byte reg2                   ;ditto

Might be worth reversing the data or changing the code to increment Y instead.
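To make the ordering point concrete, here is a small Python sketch of a loop that starts Y at the last table index and counts down to zero, as the table version does. The numeric offsets are hypothetical stand-ins for reg4, reg7 and reg2; only the order matters.

```python
# Hypothetical register offsets, used only to show the order of the writes.
reg4, reg7, reg2 = 0x04, 0x07, 0x02
regtab = [reg4, reg7, reg2]


def write_order(table):
    """Return the register offsets in the order a DEY/BPL loop visits them."""
    order = []
    y = len(table) - 1           # start at the last table entry
    while y >= 0:                # BPL keeps looping while Y is not negative
        order.append(table[y])   # LDX regtab,Y picks the register offset
        y -= 1                   # DEY
    return order


print(write_order(regtab))                   # table as listed
print(write_order(list(reversed(regtab))))   # table stored backwards
```

The first call visits the registers as 2, 7, 4; storing the table in reverse restores the 4, 7, 2 order of the straight-line code.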

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


Posted: Sun Jan 17, 2010 5:35 pm
Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
BigEd wrote:
It is possible that the extra read access which occurs in the dead cycle is having a side-effect?


Good question, and good call -- I didn't think to consider this, but I'm sure with time I would have arrived at this conclusion sooner or later.

He is using a 65816, which, while it has all the timing characteristics of a 6502, makes detecting "internal operation" easier thanks to the VPA and VDA signals. If (VPA,VDA) = internal_operation, then you can inhibit the generation of chip select signals.


Posted: Sun Jan 17, 2010 6:24 pm
Joined: Tue Jul 05, 2005 7:08 pm
Posts: 1043
Location: near Heidelberg, Germany
BigDumbDinosaur wrote:
To verify my suspicion, I wrote the following linear alternative to the above loop and burned it into another ROM:

Code:
          ldx #reg4                   ;register offset
          lda #%01100011              ;parameter
          sta device,x
          ldx #reg7                   ;register offset
          lda #%00001100              ;parameter
          sta device,x
          ldx #reg2                   ;register offset
          lda #%01100011              ;parameter
          sta device,x




Did you test this exact code also with replacing the

Code:
          sta device,x

with
Code:
          sta device+reg<n>

i.e. replacing the STA absolute indexed with STA absolute, to verify that the problem isn't, say, in the order you write the bytes to the device? OTOH, your linear alternative already writes the values in the opposite order from your looping code (where Y is decremented). But it would help determine whether the abs,X really is part of the problem.

Besides, your linear alternative writes a different value into reg2 than your looping version. Might that also be a problem?

André

(edit: removed extra quote I forgot to remove the first time)


Posted: Sun Jan 17, 2010 6:41 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
kc5tja wrote:
At first, I thought your code might have had an off-by-one error. In an attempt to locate it, I annotated your code with proof assertions. Setting aside the effects of the tmstd procedure, I've succeeded in proving that your loop code is free of bugs.

Is it possible to post the code for tmstd as a next step? Process of elimination boils things down to either hardware bug or a bug with tmstd.


Code:
I define a universe REGS to include the set of all valid DUART register offsets.
A distinguished value $FF signifies a known-invalid, but otherwise intentional,
register offset.

I define a universe PARMS to include the set of all values relevant to DUART
register settings.

The notation |x| indicates the size of x, in bytes.  (c.f. the algebraic notation
for vector magnitude a.k.a. vector length)


inireg   ldy #parmtab-regtab-1
         lda #$ff
                            ; (0 <= Y) && (Y < |regtab|)
inireg01 cmp regtab,y       ; LOOP INVARIANT: (0 <= Y) && (Y < |regtab|) && (A is REGS)
         bne inireg02

         jsr tmstd
                            ; (0 <= Y) && (Y < |regtab|) implies (0 < Y+1) && (Y+1 <= |regtab|)
inireg02 ldx regtab,y       ; (0 < Y+1) && (Y+1 <= |regtab|)
         lda parmtab,y      ; (0 < Y+1) && (Y+1 <= |regtab|) && (X is REGS)
         sta device,x       ; (0 < Y+1) && (Y+1 <= |regtab|) && (A is PARMS) && (X is REGS)
         txa                ; (0 < Y+1) && (Y+1 <= |regtab|) && (X is REGS)
         dey                ; (0 < Y+1) && (Y+1 <= |regtab|) && (A is REGS)
         bpl inireg01       ; (0 <= Y) && (Y < |regtab|) && (A is REGS)
                            ; LOOP INVARIANT HOLDS

         rts                ; 0 > Y

TMSTD (Ten MilliSecond Time Delay) watches a zero page location that is decremented with each watchdog timer interrupt, which is set up to occur at 10 millisecond intervals. I substituted a "do nothing" loop with the Y-register to verify that TMSTD wasn't somehow involved. It wasn't.

Anyhow, my hunch as to what was going on proved to be correct and after a little cut 'n patch session (ain't wirewrap wire and tiny flush-cutters great?), the problem has been solved. I will separately post what I did to fix it, as I think it may be very instructive to others who are working on '816 designs.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Sun Jan 17, 2010 6:44 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
BigEd wrote:
It is possible that the extra read access which occurs in the dead cycle is having a side-effect?

Not the read access itself. I will shortly be posting what my analysis turned up and how I fixed it.

Posted: Sun Jan 17, 2010 6:46 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
BitWise wrote:
Is the order in which the hardware registers are loaded important?

In some cases, yes. The tables are organized backwards so the registers are loaded in the correct order.

Posted: Sun Jan 17, 2010 6:50 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
fachat wrote:
BigDumbDinosaur wrote:
To verify my suspicion, I wrote the following linear alternative to the above loop and burned it into another ROM:

Code:
          ldx #reg4                   ;register offset
          lda #%01100011              ;parameter
          sta device,x
          ldx #reg7                   ;register offset
          lda #%00001100              ;parameter
          sta device,x
          ldx #reg2                   ;register offset
          lda #%01100011              ;parameter
          sta device,x




Did you test this exact code also with replacing the

Code:
          sta device,x

with
Code:
          sta device+reg<n>

i.e. replacing the STA absolute indexed with STA absolute, to verify that the problem isn't, say, in the order you write the bytes to the device? OTOH, your linear alternative already writes the values in the opposite order from your looping code (where Y is decremented). But it would help determine whether the abs,X really is part of the problem.

The original ROM wrote to the absolute address of the DUART's registers, which worked fine.

Quote:
Besides, your linear alternative writes a different value into reg2 than your looping version. Might that also be a problem?

André


The example I gave was strictly for illustration of the method. I wasn't trying to reproduce the entire data table.

Posted: Sun Jan 17, 2010 8:36 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
Earlier I mentioned that I had encountered a weird bug in my POC unit which seemed to be related to the use of STA $ABS,X to write into the DUART registers. Several here had come close to guessing the possible cause:

BigEd wrote:
It is possible that the extra read access which occurs in the dead cycle is having a side-effect?

You're getting extremely warm. :)

kc5tja wrote:
He is using a 65816, which, while it has all the timing characteristics of a 6502, makes detecting "internal operation" easier thanks to the VPA and VDA signals. If (VPA,VDA) = internal_operation, then you can inhibit the generation of chip select signals.

Bingo! :P

Recall I mentioned the watchdog timer (WDT) was working properly using the STA $ABS,X method to load its registers and that the scope confirmed that the WDT was behaving as it should. That started me thinking about the DUART and how it may react to invalid address bus conditions. This prompted me to do some more digging and analysis of the DUART's timing characteristics, which ultimately pointed me to the cause of the problem and the solution.

The DUART is clocked by a 3.6864 MHz signal which, in the POC, is derived from a TTL can oscillator—a crystal could also be used. The DUART clock not only drives the baud rate generator (BRG) and a 32 bit counter/timer (which I am not using), it produces the internal timing signals that regulate the entire device's operation, much as the Ø2 clock regulates the MPU's operation. Considering this, although write operations to the DUART's setup registers are in sync with Ø2, the device's reaction to them is slaved to the 3.6864 MHz clock. This behavior explains why consecutive writes to the same register must, in some cases, be separated by a delay—it is conceivable that if the MPU is fast enough, the second write may occur before the DUART has reacted to the first one, due to the latency of the DUART's clock signal.
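As a rough illustration of the clock relationship described above, this Python sketch compares the DUART clock period with the Ø2 period at the rates mentioned in this thread. It is plain arithmetic on the stated frequencies; the helper names are made up for the example, and it does not model how many of its own clocks the DUART actually needs to act on a write.

```python
DUART_CLOCK_HZ = 3_686_400   # 3.6864 MHz, from the post


def period_ns(freq_hz):
    """Length of one clock period in nanoseconds."""
    return 1e9 / freq_hz


def duart_clocks_per_phi2_cycle(phi2_hz):
    """How many DUART clock periods fit into a single Ø2 cycle."""
    return period_ns(phi2_hz) / period_ns(DUART_CLOCK_HZ)


for phi2_hz in (500_000, 1_000_000, 8_000_000):
    print(f"Ø2 = {phi2_hz / 1e6:g} MHz: "
          f"{period_ns(phi2_hz):.1f} ns per cycle, "
          f"{duart_clocks_per_phi2_cycle(phi2_hz):.3f} DUART clocks per Ø2 cycle")
```

At 8 MHz a whole Ø2 cycle is shorter than half a DUART clock period (about 271 ns), which fits the remark that a fast MPU can get ahead of the device.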

Analysis of the DUART's timing diagram caused me to suspect that the problem was occurring late in the third clock cycle of the STA $ABS,X operation, a time when the address bus is being diddled by the MPU and is not yet valid. Latency within the DUART, coupled with a possibly invalid A0-A7 after A8-A15 had been asserted, could cause first one register and then another to be selected, totally confusing the DUART. This would not occur with STA $ABS, as no effective address computation is required for the latter instruction.

Therefore, it seemed prudent to qualify device selection so it could only occur after the fourth cycle had started and at a time when A0-A15 would truly be valid and reflect the final address. My original I/O decoding logic selected a device only according to the address, using a 74AC138 decoder. The revised decoding logic still uses the 'AC138, but qualifies its selection with the MPU's VDA (valid data address) and VPA (valid program address) outputs, preventing selection until after the fourth cycle of the STA $ABS,X instruction has started, at which time a valid address is guaranteed.

The following information is in the '816 data sheet and is what I used to work out the logic:

Code:
VDA   VPA   BUS STATE
 0     0    Internal Operation Address and Data Bus available. The Address Bus may be invalid.
 0     1    Valid program address-may be used for program cache control.
 1     0    Valid data address-may be used for data cache control.
 1     1    Opcode fetch-may be used for program cache control and single step control.

From the perspective of selecting devices for I/O, the third condition is the one to use. Condition one (both VDA and VPA low) is where the trouble was occurring, as that was when the MPU was fiddling with the address bus and confusing the DUART.
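The qualification described above can be sketched as a truth function in Python. This is only a behavioural model of the idea, not the actual 'AC138 wiring: the `io_select` name and the 256-byte window size are assumptions, while the $D000 base and the "VDA high, VPA low" condition come from the text.

```python
# Bus states keyed by (VDA, VPA), as given in the data sheet excerpt above.
BUS_STATE = {
    (0, 0): "internal operation (address bus may be invalid)",
    (0, 1): "valid program address",
    (1, 0): "valid data address",
    (1, 1): "opcode fetch",
}


def io_select(address, vda, vpa, io_base=0xD000, io_size=0x0100):
    """True when the I/O device should be selected.

    The address must fall in the I/O window AND the bus state must be
    'valid data address' (VDA = 1, VPA = 0). io_size is a hypothetical value.
    """
    in_window = io_base <= address < io_base + io_size
    return in_window and vda == 1 and vpa == 0


for (vda, vpa), meaning in BUS_STATE.items():
    print(vda, vpa, io_select(0xD005, vda, vpa), meaning)
```

An address inside the window is ignored during an internal-operation cycle and only selects the device once the MPU flags it as a valid data address.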

A little PCB surgery set up the new logic. First I tested with a ROM having ordinary STA $ABS code to verify that the POC was still upright with a pulse following surgery. Once that had been established, I modified the code to use STA $ABS,X, using identical parameters and in the exact same order:

Code:
1324    ;CONFIGURE DUART
1325    ;
1326    E264  A2 05        cfgacia  ldx #dr_imr
1327    E266  A9 00                 lda #%00000000
1328    E268  9D 00 D0              sta io_acia,x         ;mask all IRQ sources
1329    E26B  A2 02                 ldx #dr_cra
1330    E26D  A9 20                 lda #%00100000
1331    E26F  9D 00 D0              sta io_acia,x         ;reset RxD
1332    E272  20 EE E2              jsr tmstd             ;10 ms time delay
1333    E275  A2 02                 ldx #dr_cra
1334    E277  A9 30                 lda #%00110000
1335    E279  9D 00 D0              sta io_acia,x         ;reset TxD
1336    E27C  20 EE E2              jsr tmstd
1337    E27F  A2 02                 ldx #dr_cra
1338    E281  A9 10                 lda #%00010000
1339    E283  9D 00 D0              sta io_acia,x         ;reset MSR pointer
1340    E286  20 EE E2              jsr tmstd
1341    E289  A2 00                 ldx #dr_msra
1342    E28B  A9 93                 lda #%10010011
1343    E28D  9D 00 D0              sta io_acia,x         ;mode 1...
1344    ;
1345    ;   [7]   1:    enable RTS mode
1346    ;   [6]   0:    IRQ on RxD ready
1347    ;   [5]   0:    character error mode
1348    ;   [4,3] 10:   no parity check
1349    ;   [2]   0:    even parity (not checked)
1350    ;   [1,0] 11:   8 bit data format
1351    ;
1352    E290  A2 00                 ldx #dr_msra
1353    E292  A9 17                 lda #%00010111
1354    E294  9D 00 D0              sta io_acia,x         ;mode 2...
1355    ;
1356    ;   [7,6] 00:   duplex channel mode
1357    ;   [5]   0:    TxD RTS mode off
1358    ;   [4]   1:    TxD CTS mode on
1359    ;   [3-0] 0111: stop bit
1360    ;
1361    E297  A2 01                 ldx #dr_csra
1362    E299  A9 CC                 lda #%11001100
1363    E29B  9D 00 D0              sta io_acia,x         ;clock select...
1364    ;
1365    ;   [7-4] 1100: 38.4 Kb/s TxD
1366    ;   [3-0] 1100: 38.4 Kb/s RxD
1367    ;
1368    E29E  A2 02                 ldx #dr_cra
1369    E2A0  A9 80                 lda #%10000000
1370    E2A2  9D 00 D0              sta io_acia,x         ;assert RTS
1371    E2A5  20 EE E2              jsr tmstd
1372    E2A8  A2 02                 ldx #dr_cra
1373    E2AA  A9 05                 lda #%00000101
1374    E2AC  9D 00 D0              sta io_acia,x         ;command...
1375    ;
1376    ;   [7-4] 0000: no operation
1377    ;   [3,2] 01:   enable transmitter
1378    ;   [1,0] 01:   enable receiver
1379    ;
1380    E2AF  A2 04                 ldx #dr_acr
1381    E2B1  A9 00                 lda #%00000000
1382    E2B3  9D 00 D0              sta io_acia,x         ;aux control...
1383    ;
1384    ;   [7]   0:    select BRG #1 (38.4 Kbps max)
1385    ;   [6-4] 000:  C/T setup (not used)
1386    ;   [3-0] 0000: IP[0-3] IRQs disabled
1387    ;
1388    E2B6  A2 0D                 ldx #dr_opcr
1389    E2B8  A9 00                 lda #%00000000
1390    E2BA  9D 00 D0              sta io_acia,x         ;output port config
1391    E2BD  A2 05                 ldx #dr_imr
1392    E2BF  A9 03                 lda #%00000011
1393    E2C1  9D 00 D0              sta io_acia,x         ;enabled IRQ sources...
1394    ;
1395    ;   [7]   0:    IP0-IP3 state change off
1396    ;   [6]   0:    ch B break change off
1397    ;   [5]   0:    ch B RHR ready/FIFO full off
1398    ;   [4]   0:    ch B THR ready off
1399    ;   [3]   0:    C/T ready off
1400    ;   [2]   0:    ch A break change off
1401    ;   [1]   1:    ch A RHR ready/FIFO full on
1402    ;   [0]   1:    ch A THR ready on
1403    ;
1404    E2C4  60                    rts


The board booted with the above code, displayed the POST screen and cheerfully accepted typed input. This was with the POC running on a 1 MHz Ø2 clock. Next, I changed the oscillator and boosted Ø2 to 8 MHz. Everything worked fine. In fact, I programmed one of the function keys on the Wyse 60 to emit a complete sentence as fast as the terminal could physically send the characters and held the key down to ram data through the POC as fast as possible (about 200 chars per sec). Every line of text was echoed error free.

With the CBAT I/O test confirming that the new logic was working, I replaced the above setup code with the data tables and register loading loop. It all appears to work as it should.

The interesting question is why was only the DUART affected? I concluded it was a combination of it being driven by a clock signal that is asynchronous to Ø2, as well as the device's inherently slow operation when writing to the setup registers. You may well ask why didn't I slow down Ø2 to see if it was a case of a slow device not reacting quickly enough to A0-A15. Well, I went so far as to put a 1 MHz oscillator in the board, which resulted in Ø2 being 500 kHz. The error persisted, although in a different way, which is what ultimately pointed me in the right direction.

I think the conclusion that can be drawn from this little contretemps is that the VDA and VPA outputs should be part of the address decoding logic of any '816 design. Clearly, Bill Mensch anticipated some address bus hinkiness with the '816 when he designed it, and decided to provide outputs that would unambiguously indicate when it was safe to select hardware.

Looks like it's time for me to permanently revise the schematic and the board layout.


Last edited by BigDumbDinosaur on Sun Feb 21, 2010 7:14 am, edited 2 times in total.

Posted: Sun Jan 17, 2010 8:55 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Interesting - good to see you had a fix.

It sounds like you wouldn't have the same fix available to you if you were using a 65C02. You might have to slow down the CPU (a lot) to let the DUART distinguish the two consecutive writes.

Or, perhaps, you could take advantage of the first write lacking the carry into the high byte, by basing your accesses at the top of the previous page:
Code:
  STA CFFF,X
instead of
Code:
  STA D000,X
and of course arranging for X to be one greater. Would that work?
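A quick Python sketch of what this trick would do to the addresses, assuming the uncorrected-high-byte behaviour documented for the 6502 also applies here (which is unverified for the '816). The function name is hypothetical.

```python
def dummy_and_final(base, x):
    """Return (address before the high byte is fixed, final address) for base,X."""
    low_sum = (base & 0x00FF) + (x & 0xFF)
    dummy = (base & 0xFF00) | (low_sum & 0x00FF)   # carry not yet applied
    final = (base + (x & 0xFF)) & 0xFFFF
    return dummy, final


for reg in (0x00, 0x02, 0x05, 0x0D):
    old = dummy_and_final(0xD000, reg)         # STA $D000,X with X = reg
    new = dummy_and_final(0xCFFF, reg + 1)     # STA $CFFF,X with X = reg + 1
    print(f"reg ${reg:02X}: "
          f"D000 base -> ${old[0]:04X} then ${old[1]:04X}; "
          f"CFFF base -> ${new[0]:04X} then ${new[1]:04X}")
```

In this model every access based at $CFFF crosses a page, so the early address lands in page $CF and only the final write reaches $D0xx. That only helps if nothing sensitive is decoded in page $CF.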


Posted: Sun Jan 17, 2010 9:37 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
BigEd wrote:
Interesting - good to see you had a fix.

It sounds like you wouldn't have the same fix available to you if you were using a 65C02.

It may be the 'C02 doesn't monkey with A0-A15 in the same fashion as the '816 during the second-to-last cycle of the STA $ABS,X instruction. If so, no special hardware logic would be required, as the DUART would see a stable address at the onset of the final cycle. I have no way to determine if this is the case, as the bus behavior of the '816 is the same in both modes (the POC ROM currently runs the MPU in emulation mode).

Quote:
You might have to slow down the CPU (a lot) to let the DUART distinguish the two consecutive writes.

That would mean somehow altering Ø2 on the fly to some submultiple of its normal rate. That's assuming that timing per se is the culprit. My fix on the POC indirectly indicates that timing was not behind the DUART's wacky behavior with STA $ABS,X. It was strictly a case of address bus behavior late in the instruction execution.

Quote:
Or, perhaps, you could take advantage of the first write lacking the carry into the high byte, by basing your accesses at the top of the previous page:
Code:
  STA CFFF,X
instead of
Code:
  STA D000,X
and of course arranging for X to be one greater. Would that work?

Dunno. That would require that register offsets be one-based, which is not an issue in itself, but would add one cycle to every STA $ABS,X access. It might actually make things worse.

Posted: Thu Jan 21, 2010 3:08 am
Joined: Thu Mar 11, 2004 7:42 am
Posts: 362
BigDumbDinosaur wrote:
Quote:
Or, perhaps, you could take advantage of the first write lacking the carry into the high byte, by basing your accesses at the top of the previous page:
Code:
  STA CFFF,X
instead of
Code:
  STA D000,X
and of course arranging for X to be one greater. Would that work?

Dunno. That would require that register offsets be one-based, which is not an issue in itself, but would add one cycle to every STA $ABS,X access. It might actually make things worse.


Unlike non-writing instructions (LDA, AND, CMP, etc.), STA abs,X always takes 5 cycles whether it crosses a page boundary or not.


Posted: Thu Jan 21, 2010 6:16 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
dclxvi wrote:
Unlike non-writing instructions (LDA, AND, CMP, etc.), STA abs,X always takes 5 cycles whether it crosses a page boundary or not.

Really! In which data sheet did you see that? :?: Here's an excerpt from the W65C816S data sheet, a document with which I and several others around here have more than passing familiarity:

Absolute,X 4 (1,5) 4 (1,3,5)

and notes 1,3 and 5:

1. Page boundary, add 1 cycle if page boundary is crossed when forming address.
3. M = 0 or X = 0, 16 bit operation, add 1 cycle, add 1 byte for immediate.
5. Read-Modify-Write, add 2 cycles for M = 1, add 3 cycles for M = 0.


Please refer to table 3-1 on page 25 of the data sheet for more info.

I'm currently running my POC system in 'C02 emulation mode while I debug the hardware. The address of the DUART is $D000. Therefore, since I'm doing an STA $D000,X, only note 1 would apply and the instruction execution time will be four clock cycles, or 500 nanoseconds (Ø2 = 8 MHz). A page boundary crossing with this instruction ain't gonna happen, right?

You can rest assured that I spent plenty of time studying the '816 timing diagram and instruction execution specs as I was designing my POC. I won't go so far as to say I've got it all memorized, but I'm getting close. :)

In any case, the resolution of the problem I was experiencing had to do with address bus behavior right at the beginning of the final cycle of the STA $D000,X instruction. At that point in time, an incomplete address is asserted on A0-A15. The address doesn't become valid until the MPU's VPA signal is low and VDA is high. Whether that final cycle was number four, number five or number 16,777,215 isn't relevant.

Posted: Thu Jan 21, 2010 7:27 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
BigDumbDinosaur wrote:
dclxvi wrote:
Unlike non-writing instructions (LDA, AND, CMP, etc.), STA abs,X always takes 5 cycles whether it crosses a page boundary or not.

Really! In which data sheet did you see that? :?: Here's an excerpt from the W65C816S data sheet
...


Hmm, indeed the current datasheet says there's a page-crossing penalty. Other docs say as Bruce does. So, I ran a test on a real 65816, about 4 million repeats on a 2MHz system:
    NOP : 21.87s - known to be 2 cycles
    LDA abs,X, no page crossing: 26.24s - therefore 4 cycles
    LDA abs,X, page crossing: 28.42s - therefore 5 cycles
    STA abs,X, no page crossing: 28.42s - again 5 cycles
    STA abs,X, page crossing: 28.42s - same again
To me this makes sense: in the 4th cycle of a normal load, the CPU performs the dummy load and also discovers there's no carry. So the next cycle can be the next fetch. For a store, it's not safe for the 4th cycle to be a dummy write, so it has to be a read. Therefore the earliest a write can safely be made - whether or not there's a carry - is the 5th cycle.
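For anyone who wants to redo the arithmetic, here is a minimal Python sketch of how the cycle counts fall out of the timings listed above. It is a reconstruction, not the method actually used: it assumes the loop overhead is the same in every run and calibrates the time per cycle on the two cases taken as known (NOP = 2 cycles, LDA abs,X without page crossing = 4 cycles).

```python
# Measured run times in seconds, copied from the list above.
timings_s = {
    "NOP": 21.87,
    "LDA abs,X, no page crossing": 26.24,
    "LDA abs,X, page crossing": 28.42,
    "STA abs,X, no page crossing": 28.42,
    "STA abs,X, page crossing": 28.42,
}

NOP_CYCLES = 2
LDA_NO_CROSS_CYCLES = 4   # calibration assumption

# Extra run time contributed by one extra cycle per loop iteration.
seconds_per_cycle = (
    timings_s["LDA abs,X, no page crossing"] - timings_s["NOP"]
) / (LDA_NO_CROSS_CYCLES - NOP_CYCLES)


def cycles(name):
    """Estimated cycle count of the instruction under test."""
    return NOP_CYCLES + (timings_s[name] - timings_s["NOP"]) / seconds_per_cycle


for name in timings_s:
    print(f"{name}: {cycles(name):.2f} cycles")

# At 2 MHz, one cycle per iteration costs 0.5 us, so this implies the repeat count.
print(f"implied repeats: {seconds_per_cycle * 2_000_000:,.0f}")
```

Both STA cases round to 5 cycles, and the implied repeat count comes out near 4.4 million, in line with "about 4 million repeats".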

The datasheet is probably wrong(*1), unless the design was recently updated, in which case my older parts need an older datasheet.

I checked, and an NMOS 6502 performs exactly the same. (The above tests are in emulation mode, I repeated them and got the same cycle counts in native mode.)

Note that the datasheet doesn't seem to have room to describe operations with 2-byte loads or stores - it's over-simplified.(*2)

I'm tempted to say
    if it's not in the datasheet, it isn't specified - but you might need to know, so measure it
    if it is in the datasheet, it might be wrong - so measure it
    if you get a new batch, the answer might have changed - so measure again

but I'm not going to!

Cheers
Ed

Edit:
(*1) Bruce found the section which gives the correct info.
(*2) Ditto - the table BDD consulted is not the detailed one.


Last edited by BigEd on Sat Jan 23, 2010 11:16 pm, edited 1 time in total.

Posted: Thu Jan 21, 2010 9:50 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
BigEd wrote:
Hmm, indeed the current datasheet says there's a page-crossing penalty.

Hmm...the 65C02 data sheet also says STA $ABS,X takes four clock cycles if a page boundary isn't crossed. However, the MOS 6510 data sheet says five cycles. Go figure. Looks like my earlier response may be in error. :)
