Posted: Thu Nov 01, 2018 11:55 am
Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
I'm planning to connect an SPI device to my W65C265SXB board. The '265 has several built-in ports but no shift buffers, so bit bashing is the only way to go. My first thought is to go with something like this as the core of the SPI routine:
Code:
         0000DF25          = SPI_DDR         .equ    $df25           ; Port5 DDR
         0000DF21          = SPI_DR          .equ    $df21           ; Port5
                             
         00000001          = SPI_MISO        .equ    $01
         00000002          = SPI_MOSI        .equ    $02
         00000004          = SPI_SCK         .equ    $04
         00000008          = SPI_SS          .equ    $08
                             
                             
                             SpiSend:
                                             short_ai
                           +                 .longa  off
                           +                 .longi  off
00:F05F  E230              +                 sep     #$30            ; Make all registers 8-bit
00:F061  A208              :                 ldx     #8
                                             repeat
00:F063  0A                :                  asl    a
00:F064  EB                :                  xba
00:F065  A902              :                  lda    #SPI_MOSI       ; Set MOSI
00:F067  1C21DF            :                  trb    SPI_DR
00:F06A  9003              :                  if cs
00:F06C  0C21DF            :                   tsb   SPI_DR
                                              endif
00:F06F  A904              :                  lda    #SPI_SCK        ; Set SCK hi
00:F071  0C21DF            :                  tsb    SPI_DR
00:F074  A901              :                  lda    #SPI_MISO
00:F076  2C21DF            :                  bit    SPI_DR          ; Read MISO     
00:F079  F003              :                  if ne
00:F07B  EB                :                   xba
00:F07C  1A                :                   inc   a
00:F07D  EB                :                   xba
                                              endif
00:F07E  A904              :                  lda    #SPI_SCK        ; Set SCK lo
00:F080  1C21DF            :                  trb    SPI_DR
00:F083  EB                :                  xba
00:F084  CA                :                  dex
00:F085  D0DC              :                 until eq
00:F087  60                :                 rts

This uses B to hold the byte being shifted out/in while A is used to hold the masks for port bits for SCK, MISO and MOSI.
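
As a way to check the logic of the loop without hardware, here is a minimal Python model of it. It is only a sketch: `spi_exchange`, `read_miso` and `slave_sending` are hypothetical names, and the port writes are reduced to comments. The variable `shift` plays the role of the B accumulator.

```python
def spi_exchange(tx_byte, read_miso):
    """Shift tx_byte out MSB first and return the byte shifted in.

    read_miso(mosi_bit) is a hypothetical stand-in for the slave: it is
    called while SCK is high and returns the level (0 or 1) on MISO.
    """
    shift = tx_byte & 0xFF               # the byte held in B
    for _ in range(8):                   # ldx #8 ... dex / until eq
        carry = shift >> 7               # asl a: MSB falls into carry
        shift = (shift << 1) & 0xFF      #        and bit 0 becomes 0
        mosi = carry                     # trb, then tsb if carry is set
        # SCK goes high here and MISO is sampled while it is high
        if read_miso(mosi):              # bit SPI_DR / if ne
            shift += 1                   # inc a: set the cleared bit 0
        # SCK goes low again
    return shift


def slave_sending(byte):
    """Hypothetical slave that ignores MOSI and returns `byte` MSB first."""
    bits = iter((byte >> i) & 1 for i in range(7, -1, -1))
    return lambda mosi: next(bits)


print(hex(spi_exchange(0x3C, lambda mosi: mosi)))    # MOSI looped back to MISO
print(hex(spi_exchange(0x00, slave_sending(0xA5))))
```

With MOSI looped back to MISO the routine returns the byte it sent, which is a handy first test on real hardware too.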

Anyone got any ideas for a faster algorithm? There may be some flexibility on the pin assignments. I need to check if any of port 5 is involved with the banked RAM addressing.

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


Posted: Thu Nov 01, 2018 12:02 pm
Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
Hmm, on second thoughts, maybe C could be used to hold the last bit shifted in:
Code:
         0000DF25          = SPI_DDR         .equ    $df25           ; Port5 DDR
         0000DF21          = SPI_DR          .equ    $df21           ; Port5
                             
         00000001          = SPI_MISO        .equ    $01
         00000002          = SPI_MOSI        .equ    $02
         00000004          = SPI_SCK         .equ    $04
         00000008          = SPI_SS          .equ    $08
                             
                             
                             SpiSend:
                                             short_ai
                           +                 .longa  off
                           +                 .longi  off
00:F05F  E230              +                 sep     #$30            ; Make all registers 8-bit
00:F061  A208              :                 ldx     #8
00:F063  18                :                 clc
                                             repeat
00:F064  2A                :                  rol    a
00:F065  EB                :                  xba
00:F066  A902              :                  lda    #SPI_MOSI       ; Set MOSI
00:F068  1C21DF            :                  trb    SPI_DR
00:F06B  9003              :                  if cs
00:F06D  0C21DF            :                   tsb   SPI_DR
                                              endif
00:F070  A904              :                  lda    #SPI_SCK        ; Set SCK hi
00:F072  0C21DF            :                  tsb    SPI_DR
00:F075  A901              :                  lda    #SPI_MISO
00:F077  2C21DF            :                  bit    SPI_DR          ; Read MISO     
00:F07A  18                :                  clc
00:F07B  F001              :                  if ne
00:F07D  38                :                   sec
                                              endif
00:F07E  A904              :                  lda    #SPI_SCK        ; Set SCK lo
00:F080  1C21DF            :                  trb    SPI_DR
00:F083  EB                :                  xba
00:F084  CA                :                  dex
00:F085  D0DD              :                 until eq
00:F087  2A                :                 rol     a
00:F088  60                :                 rts
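
The reason the extra `rol a` after the loop is needed can be seen in a small Python model. This is a sketch under the assumption that A and C behave as a 9-bit ring; `rol` and `spi_exchange_rol` are hypothetical names, and the port accesses are reduced to a list of MOSI levels.

```python
def rol(a, carry):
    """8-bit rotate left through carry, like ROL A with an 8-bit accumulator."""
    wide = (a << 1) | carry
    return wide & 0xFF, wide >> 8


def spi_exchange_rol(tx_byte, miso_bits):
    """Return (received byte, MOSI levels sent) for eight MISO samples."""
    a, carry = tx_byte & 0xFF, 0       # clc before the loop
    sent = []
    for miso in miso_bits:             # eight passes
        a, carry = rol(a, carry)       # previous MISO bit in, next data bit out
        sent.append(carry)             # MOSI follows carry
        carry = miso                   # clc, then sec if MISO is high
    a, carry = rol(a, carry)           # final rol brings in the eighth MISO bit
    return a, sent


received, sent = spi_exchange_rol(0xC3, [1, 0, 1, 0, 0, 1, 0, 1])
print(hex(received), sent)
```

Each pass through the loop only rotates in the bit sampled on the previous pass, so after eight passes the last MISO bit is still sitting in carry. The ninth rotate moves it into bit 0 and pushes the initial cleared carry back out.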

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


Posted: Thu Nov 01, 2018 2:51 pm
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
BitWise wrote:
Anyone got any ideas for a faster algorithm? There may be some flexibility on the pin assignments.

Hi, Andrew,

have you done a cycle count to evaluate throughput? I've only glanced at your code.

Here are some SPI routines that can input at a rate of one bit every 18 or 19 CPU cycles. For SPI output it's 17 or 18 cycles per bit. I suspect that's as fast as it can be done, although I'd be happy to be proven wrong. (If memory consumption is a non-issue then further improvement might be possible.)

Pin assignments do play a role -- Garth and I like to use bit0 as an output to the clock bit. There are some other significant tricks, but I won't explain them again here.

-- Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Thu Nov 01, 2018 4:32 pm
Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
After reading the '265SXB manual some more I think this is the best I'm going to be able to do given the hardware constraints on the board
Code:
                             ;===============================================================================
                             ; Bit Bashed SPI
                             ;-------------------------------------------------------------------------------
                             
                             ; SPI communication is bit bashed using the spare bits in Port4 as shown in the
                             ; following table:
                             ;
                             ; +---+-----------------------------------+-----------------------------------+
                             ; | # | SXB Function                      | SPI Function                      |
                             ; +---+-----------------------------------+-----------------------------------+
                             ; | 0 | /NMI                              |                                   |
                             ; | 1 | /IRQ                              | /INT from CH376 module            |
                             ; +---+-----------------------------------+-----------------------------------+
                             ; | 2 | Spare                             | /SS to CH376 module               |
                             ; +---+-----------------------------------+-----------------------------------+
                             ; | 3 | FA15 (ROM Bank)                   |                                   |
                             ; | 4 | FAMS (ROM Bank)                   |                                   |
                             ; +---+-----------------------------------+-----------------------------------+
                             ; | 5 | Spare                             | SCK                               |
                             ; | 6 | Spare                             | MOSI                              |
                             ; | 7 | Spare                             | MISO                              |
                             ; +---+-----------------------------------+-----------------------------------+
                             
         00000020          = SPI_SCK         .equ    $20
         00000040          = SPI_MOSI        .equ    $40
         00000080          = SPI_MISO        .equ    $80
                             
                             SpiInit:
                                             short_ai
                           +                 .longa  off
                           +                 .longi  off
00:F040  E230              +                 sep     #$30            ; Make all registers 8-bit
00:F042  A920              :                 lda     #SPI_SCK        ; Set SCK as lo output
00:F044  0C24DF            :                 tsb     PDD4
00:F047  1C20DF            :                 trb     PD4
00:F04A  A940              :                 lda     #SPI_MOSI       ; Set MOSI as an output
00:F04C  0C24DF            :                 tsb     PDD4           
00:F04F  A980              :                 lda     #SPI_MISO       ; Set MISO as an input
00:F051  1C24DF            :                 trb     PDD4
00:F054  60                :                 rts
                             
                             SpiSend:
                                             short_ai
                           +                 .longa  off
                           +                 .longi  off
00:F055  E230              +                 sep     #$30            ; Make all registers 8-bit
00:F057  A208              :                 ldx     #8
                                             repeat
00:F059  2A                :                  rol    a               ; Shift out a data bit
00:F05A  EB                :                  xba
00:F05B  A940              :                  lda    #SPI_MOSI       ; Set MOSI
00:F05D  1C20DF            :                  trb    PD4
00:F060  9003              :                  if cs
00:F062  0C20DF            :                   tsb   PD4
                                              endif
00:F065  A920              :                  lda    #SPI_SCK        ; Set SCK hi
00:F067  0C20DF            :                  tsb    PD4
00:F06A  18                :                  clc
00:F06B  2C20DF            :                  bit    PD4             ; Read MISO into carry 
00:F06E  1001              :                  if mi
00:F070  38                :                   sec
                                              endif
00:F071  1C20DF            :                  trb    PD4             ; Set SCK lo
00:F074  EB                :                  xba
00:F075  CA                :                  dex
00:F076  D0E1              :                 until eq
00:F078  2A                :                 rol     a               ; Rotate in last data bit
00:F079  60                :                 rts

I've decided to use PORT4 instead of PORT5 so I can use hardware flow control on the UART lines but this means there is less flexibility on the pins, in particular bit 0 is unavailable for the clock so the INC/DEC trick is not possible.

The CH376 module I want to connect via SPI has an interrupt output which I can connect to the /IRQ line on the 265 and it seems to be controllable via the EIER register (bit 7 - although one table in the data sheet says bit 3).

It looks like this means around 54 cycles per bit, but my code is currently both transmitting and receiving a data bit on each iteration. If I can use send-only and receive-only routines like yours, I can squeeze quite a lot out.
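
As a rough cross-check of the per-bit figure, the SpiSend loop above can be tallied in a few lines of Python. The cycle counts are assumptions rather than numbers from the post: native mode, 8-bit registers, absolute addressing, no wait states, and branches costing 2 cycles when not taken and 3 when taken.

```python
# Instructions executed on every pass through the loop.
FIXED = [
    ("rol a", 2), ("xba", 3), ("lda #SPI_MOSI", 2), ("trb PD4", 6),
    ("lda #SPI_SCK", 2), ("tsb PD4", 6), ("clc", 2), ("bit PD4", 4),
    ("trb PD4", 6), ("xba", 3), ("dex", 2), ("bne (taken)", 3),
]


def cycles_per_bit(mosi_bit, miso_bit):
    """Cycles for one loop pass, given the bit sent and the bit received."""
    total = sum(cycles for _, cycles in FIXED)
    # "if cs": bcc falls through (2) into tsb PD4 (6), or is taken (3)
    total += (2 + 6) if mosi_bit else 3
    # "if mi": bpl falls through (2) into sec (2), or is taken (3)
    total += (2 + 2) if miso_bit else 3
    return total


for mosi in (0, 1):
    for miso in (0, 1):
        print(f"MOSI={mosi} MISO={miso}: {cycles_per_bit(mosi, miso)} cycles")
```

Under those assumptions the loop costs between 47 and 53 cycles per bit depending on the data, which is in line with the estimate of around 54.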

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


Posted: Thu Nov 01, 2018 5:29 pm
Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
BitWise wrote:
I've decided to use PORT4 instead of PORT5 so I can use hardware flow control on the UART lines but this means there is less flexibility on the pins, in particular bit 0 is unavailable for the clock so the INC/DEC trick is not possible.

Yea, I was going to suggest P4. As I recall (not having looked at it in some time), P4 was not completely available, so would be well suited to use a couple of the remaining pins for SPI.
Quote:
The CH376 module I want to connect via SPI has an interrupt output which I can connect to the /IRQ line on the 265 and it seems to be controllable via the EIER register (bit 7 - although one table in the data sheet says bit 3).

I am eager to hear about your results with the CH376 module.
Quote:
It looks like this means around 54 cycles per bit, but my code is currently both transmitting and receiving a data bit on each iteration. If I can use send-only and receive-only routines like yours, I can squeeze quite a lot out.

Is it required to be full duplex for each bit? Or can you "send 8 bits" and then "receive 8 bits" in blocks?


Posted: Thu Nov 01, 2018 6:26 pm
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
BitWise wrote:
my code is currently both transmitting and receiving a data bit on each iteration. If I can use send-only and receive-only routines like yours, I can squeeze quite a lot out.
Understood. For your application, will it be necessary to send and receive simultaneously? (I know some SPI chips, like the 16is750 SPI UART I was playing with, don't require this.)

Quote:
I've decided to use PORT4 instead of PORT5 so I can use hardware flow control on the UART lines but this means there is less flexibility on the pins, in particular bit 0 is unavailable for the clock so the INC/DEC trick is not possible.
[Attachment: '265 port 4.png]
This is Port 4, am I right? It looks as if all 8 bits have dedicated functions -- or can those functions be switched off? (and are you willing to switch them off?) Which Port 4 bits can be made to behave like bits on the port of a 6522 VIA (ie, with both a Data Reg bit and a DDR bit) ?

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Thu Nov 01, 2018 9:36 pm
Joined: Mon Sep 14, 2015 8:50 pm
Posts: 110
Location: Virginia USA
Hi Bitwise,

To speed things up somewhat, perhaps you can save your direct register on the stack and set your direct page to $DF00 while you're doing I/O and then reset your direct register on exit.

Cheers,
Andy


Posted: Thu Nov 01, 2018 10:18 pm
Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
Quote:
This is Port 4, am I right? It looks as if all 8 bits have dedicated functions -- or can those functions be switched off? (and are you willing to switch them off?) Which Port 4 bits can be made to behave like bits on the port of a 6522 VIA (ie, with both a Data Reg bit and a DDR bit) ?

The pins other than /NMI and /IRQ relate to the parallel interface bus, which I am not using, so I believe I can re-purpose them for SPI.
Quote:
I am eager to hear about your results with the CH376 module.

I was a lot more impressed with the CH376 until I realised it only supported one open file at a time.

It also seems it would be better to connect the CH376's BZ (Busy) output to the /IRQ pin but use it as a digital input. The MISO line can be made to be the /INT signal when /SS is hi.
Quote:
To speed things up somewhat, perhaps you can save your direct register on the stack and set your direct page to $DF00 while you're doing I/O and then reset your direct register on exit.

I have considered that. As I want to have the caller's data bank in my 1MB SRAM expansion, remapping direct page would make it easier to copy disk blocks to/from the caller's buffer (e.g. LDA/STA $0000,Y would access the caller's data bank).

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


Posted: Thu Nov 01, 2018 10:26 pm
Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
BitWise wrote:
I was a lot more impressed with the CH376 until I realised it only supported one open file at a time.

Oh no!

That's really disappointing.

I get it, sorta, not having looked in detail at the command set, if you're just streaming a file, but...yea, honestly that's a real blow.


Posted: Fri Nov 02, 2018 7:58 am
Joined: Mon Sep 14, 2015 8:50 pm
Posts: 110
Location: Virginia USA
Hi Bitwise,

There's also 64 bytes of RAM in the $DF00 page ($DF80-$DFBF) that you can use/borrow for direct page addressing. There are 6 bytes ($DFBA-$DFBF) available according to WDC's 265iromlist.pdf listing.

Cheers,
Andy


Posted: Fri Nov 02, 2018 2:28 pm
Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
My first cut of the code seems to work and achieves around 100 kHz at 3.68 MHz. It should be a little bit faster with direct page remapped and zero page instructions instead of absolute.
[Attachment: SXB SPI.png]

The CH376 is active and mounts a USB thumb drive but I'm not accessing it yet. I don't think the /INT signal is being correctly mapped on the MISO line (by the second command 0B/16/10) and I may not bother with it as the BZ signal is easier to use (e.g. if BZ is hi then wait). Mounting my USB stick takes 0.34 sec at end of which BZ goes lo.

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


Posted: Sat Nov 03, 2018 10:31 am
Joined: Mon Sep 14, 2015 8:50 pm
Posts: 110
Location: Virginia USA
Hi Bitwise,

I believe you can use pin 40 as an I/O pin by being sure BCR6 and BCR5 are set to zero (NMIB and ABORTB disabled) and then you can use PD4/PDD4 to bit bang pin 40. Unless you are using NMIB...

Cheers,
Andy


Posted: Sat Nov 03, 2018 2:49 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8406
Location: Midwestern USA
handyandy wrote:
There's also 64 bytes of RAM in the $DF00 page ($DF80-$DFBF) that you can use/borrow for direct page addressing. There are 6 bytes ($DFBA-$DFBF) available according to WDC's 265iromlist.pdf listing.

Direct page accesses incur a one clock cycle penalty per instruction if DP doesn't start exactly on a page boundary, effacing the speed advantage of DP addressing modes.
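
To put numbers on that penalty, here is a small Python sketch. The cycle counts are assumptions taken from the usual 65816 timing tables for an 8-bit accumulator, and `$DF80` is just a hypothetical unaligned value of D chosen for illustration; the list of accesses is the longest path through the SpiSend loop posted earlier.

```python
# Cycles in absolute addressing mode with an 8-bit accumulator.
ABSOLUTE = {"tsb": 6, "trb": 6, "bit": 4}


def direct_page_cycles(op, d_register):
    """Cycles for `op` in direct page mode with the given 16-bit D register."""
    penalty = 1 if d_register & 0x00FF else 0   # low byte of D is not zero
    return ABSOLUTE[op] - 1 + penalty


# Port accesses on the longest path through the SpiSend loop.
accesses = ["trb", "tsb", "tsb", "bit", "trb"]

for d in (0xDF00, 0xDF80):
    saved = sum(ABSOLUTE[op] - direct_page_cycles(op, d) for op in accesses)
    print(f"D=${d:04X}: {saved} cycles saved per bit")
```

With D on a page boundary every port access is one cycle shorter; with an unaligned D the direct page forms cost the same as absolute, so only the code size improves.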

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Sat Nov 03, 2018 5:33 pm
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
BigDumbDinosaur wrote:
Direct page accesses incur a one clock cycle penalty per instruction if DP doesn't start exactly on a page boundary, effacing the speed advantage of DP addressing modes.
Yup. There's a substantial speedup available, but only if you set DP to align with a page boundary.

Earlier I mentioned bit-bang SPI inputting at a rate of 18 or 19 CPU cycles per bit or outputting at 17 or 18 cycles per bit (the figures are data dependent). This already speedy throughput increases another 12.5% (approx) if you use Direct Page accesses (for 6502, Zero page accesses) -- specifically, each of the cycles-per-bit figures gets reduced by 2.
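
The arithmetic behind that percentage can be sketched in Python. The function names are hypothetical, and the 1 MHz clock is only an illustrative assumption.

```python
def bits_per_second(clock_hz, cycles_per_bit):
    """Raw bit rate of a bit-banged loop, ignoring per-byte overhead."""
    return clock_hz / cycles_per_bit


def speedup(cycles_before, cycles_after):
    """Fractional throughput gain from shortening the per-bit loop."""
    return cycles_before / cycles_after - 1


for before in (17, 18, 19):
    after = before - 2                    # direct page saves 2 cycles per bit
    rate = bits_per_second(1_000_000, after) / 1000
    print(f"{before} -> {after} cycles: +{speedup(before, after):.1%}, "
          f"{rate:.1f} kbit/s at 1 MHz")
```

Going from 18 to 16 cycles per bit gives exactly 12.5%; the other cases land slightly either side of that, which is why the figure is approximate.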

Even without DP / Z-pg accesses, bit-banged SPI can be surprisingly effective. Talking to an SPI UART, even a 1 MHz 6502 can bit-bang fast enough to achieve 19.2 or even 38.4 kbaud on the asynch connection -- and of course faster CPUs exceed this performance.

One final note: the routines I use are written for 6502 / 'C02, and it seems doubtful that '816-specific code would be any faster. But, by design, my routines don't receive and transmit simultaneously. If there were a requirement for that then the B Accumulator might make an '816 faster than a 6502 / 'C02. Apologies for going slightly OT, talking about bit-banging but not on a W65C265.

-- Jeff

(edits -- last paragraph)

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Wed Nov 14, 2018 11:33 am
Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
After much hair pulling I finally got my CH376S to read the MBR block from a USB flash drive this morning.

The datasheet for the CH376S is appalling and there is very little example software for it on the web - especially in SPI mode.

I'm using the SPI code from earlier posts with some additional signals for slave select and a BUSY input. I'm not using the interrupt signal. The BUSY signal is simpler and I wait for it to go low after sending a data byte to the module.

Initialising the CH376S is easy enough:
1. Use CHECK_EXIST to ensure the module is connected.
2. Use SET_USB_MODE with arg $06 to select USB mode.
3. Send DISK_CONNECT.
4. Send DISK_MOUNT.

To read a sector based on LBA number:
1. Send DISK_READ with the LBA and sector count (1).
2. Use GET_STATUS to check that the status is USB_INT_DISK_READ.
3. Send RD_USB_DATA0 to read the next block size and data (should be $40 followed by 64 bytes).
4. If there is more data left to read, send DISK_RD_GO and go back to 3.
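
The read steps can be sketched as a short Python model. Everything here is hypothetical scaffolding rather than the actual driver: `FakeCH376` stands in for the SPI transport and simply replays a canned sector, commands and status values are passed by name instead of by opcode, the little-endian LBA byte order is an assumption, and the wait on BUSY after each byte is left out.

```python
BLOCK = 64       # bytes returned per RD_USB_DATA0, as described in the post
SECTOR = 512     # eight blocks make up one sector


class FakeCH376:
    """Stand-in for the SPI transport; replays one canned sector."""

    def __init__(self, sector):
        self.sector = sector
        self.offset = 0
        self.pending = []    # values waiting to be read back
        self.log = []        # command names in the order they were sent

    def send(self, name, *args):
        self.log.append(name)
        if name == "GET_STATUS":
            self.pending = ["USB_INT_DISK_READ"]
        elif name == "RD_USB_DATA0":
            chunk = self.sector[self.offset:self.offset + BLOCK]
            self.offset += len(chunk)
            self.pending = [len(chunk), *chunk]

    def read(self):
        return self.pending.pop(0)


def read_sector(dev, lba):
    """Follow the four read steps for a single sector."""
    # Step 1: LBA (byte order assumed little-endian) plus a sector count of 1
    dev.send("DISK_READ", *lba.to_bytes(4, "little"), 1)
    # Step 2: confirm the module has data ready
    dev.send("GET_STATUS")
    if dev.read() != "USB_INT_DISK_READ":
        raise IOError("unexpected status after DISK_READ")
    data = bytearray()
    while True:
        # Step 3: a length byte followed by that many data bytes
        dev.send("RD_USB_DATA0")
        size = dev.read()
        if size == 0:
            raise IOError("module returned an empty block")
        data.extend(dev.read() for _ in range(size))
        if len(data) >= SECTOR:
            return bytes(data)
        # Step 4: ask for the next block and go round again
        dev.send("DISK_RD_GO")


canned = bytes(range(256)) * 2
dev = FakeCH376(canned)
assert read_sector(dev, 0) == canned
print(dev.log.count("RD_USB_DATA0"), "blocks,", dev.log.count("DISK_RD_GO"), "DISK_RD_GO")
```

Keeping the transport behind `send` and `read` means the same flow could be exercised against a fake like this before being pointed at the real bit-bashed routines.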

The attached image is analyser output for a sector read. You can see the initial LBA sending at the start and the eight 64 byte blocks for the sector.
[Attachment: CH376 SPI.png]

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs

