I'm following the thread about the 3 chip design, and the challenge of loading an initial image into a 6502 system with a minimal controller, with great interest. So I'll take the time to elaborate a little more on the boot loader as it is implemented in ROMulus the 2nd and, in principle, in the 1st as well. Although this method will not fit into 3 chips, a minimal version will at least fit into a 4 chip design (CPU, RAM, MCU, GLUE) and uses a reasonable number of pins. Depending on the requirements the GLUE will fit into an SPLD (GAL16V8 or GAL22V10); in my case, as the GLUE does much more, the logic is part of a CPLD (ATF1508). I recommend that you have a look at the design file already attached to this thread.
The following signals are controlled by the AVR:
!RES this pin is constantly set to 0. Instead of de-asserting !RES, the AVR switches the pin between input and output, so it is effectively an open drain output and requires a pull-up. The advantage is that you can have a switch in parallel
PHI2 this pin uses OC2B of the AVR used here. Depending on the mode, the AVR control program either toggles the state manually using sbi and cbi instructions or it activates Timer 2 to create a steady PHI2 clock
IML this signals to the CPLD (or whatever glue logic you use) which memory decoding scheme shall be active. When IML is set, all reads from the 65C816 are directed to the internal byte buffer of the CPLD and all writes go to the memory. Memory is then addressed linearly as one 512kbyte RAM covering the first 8 banks. Note that IML is implemented as a bit in the internal control byte of the CPLD. The control byte can be written using the WRC clock
WRK this signal latches the value on PORTA into the internal byte buffer of the CPLD
PORTA this is the data port of the AVR, in this design it connects to the CPLD to write internal registers, one of these is the byte register which can be read by the 65xxx
If you use this method for a minimal system, you will typically just connect PORTA to the data bus, and of course you also need to add some instructions to control the direction of PORTA when the CPU performs a write cycle. Using a PIC controller with a parallel port, this could also be handled in the GLUE logic, as the port of the PIC can be tri-stated in this mode. Or you could add a buffer between the MCU and the data bus. In any case this design in a minimal setup would connect 11 pins of the MCU to the system: RES, PHI2, IML and D0..7. That is of course many more pins than in the 3 chip design challenge, but very few when compared to solutions that connect all bus signals to the MCU. If the pin count is critical the data bus could also be connected via a shift register, but this will then make the system at least a 5 chip design. But I think 11 pins is quite good, and there are no pull-ups/pull-downs on the data bus, hence the system is capable of higher clock rates, in my case 11MHz.
At the beginning the AVR initialises the ports and starts in a known state with IML set in the CPLD control register
Code:
;--------------------------------------------------------------------------
;
; PORTA A0..A7
;
; Multiple Purpose Output Port, this port is connected to the CPLD to
; the address lines of the DPRAM and to the upper video RAM address
; latch.
;
out PORTA, ff
out DDRA, ff
;--------------------------------------------------------------------------
;
; SYSCTL: This is a register that holds the current value of the CPLD control register
;
; These flags are used to control the 65C816 system.
;
; IML, VBL, WE, GR, SHFT, CA, OA, TAPE
;
.set TAPE = 0 ; Simulate TAPE input
.set OA = 1 ; Open Apple Key
.set CA = 2 ; Closed Apple Key
.set SHFT = 3 ; Shift Key
.set GR = 4 ; Graphics
.set WE = 5 ; 0=Write Enable RAM
.set VBL = 6 ; Vertical Blank
.set IML = 7 ; 1=Set Initial Machine Load
;
;
;
ldi temp, (1<<IML) | (1<<WE)
out PORTA, temp
nop
sbi PORTB, WRC
cbi PORTB, WRC ; Latch Control Bits into CPLD
in SYSCTL, PORTA ; keep a copy
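To make the bit layout concrete, here is a small host-side Python sketch (not part of the AVR firmware) that rebuilds the control byte from the bit positions listed above. The helper names `control_byte` and `describe` are made up for illustration only.

```python
# Bit positions of the CPLD control byte, copied from the .set lines above.
TAPE, OA, CA, SHFT, GR, WE, VBL, IML = range(8)
NAMES = ["TAPE", "OA", "CA", "SHFT", "GR", "WE", "VBL", "IML"]

def control_byte(*bits):
    """Return a byte with the given bit positions set (hypothetical helper)."""
    value = 0
    for bit in bits:
        value |= 1 << bit
    return value

def describe(value):
    """List the names of the flags that are set in a control byte."""
    return [name for bit, name in enumerate(NAMES) if value & (1 << bit)]

initial = control_byte(IML, WE)      # same value as the ldi above
after_iml = initial ^ (1 << IML)     # IML toggled off, WE left unchanged

print(f"initial   = 0x{initial:02X} {describe(initial)}")
print(f"after IML = 0x{after_iml:02X} {describe(after_iml)}")
```

The second value is what the control byte would hold once IML is toggled off at the end of the load, assuming no other flag has been changed in between.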
Then it will disable the clock and assert reset
Code:
;--------------------------------------------------------------------------
;
; PORTB VPB, RDV, WRK, WRC, INH, CE, RES, SHLD
;
.equ VPB = 0 ; Vector Pull from 65xxx
.equ RDV = 1 ; Read Video Control bits, active low!
.equ WRK = 2 ; Write Byte Buffer/Keyboard Clock normally high
.equ WRC = 3 ; Write Control Bits Latch Enable normally low
.equ INH = 4 ; Inhibit Video Pixel Clock OC0B
.equ CE = 5 ; Video RAM Chip Enable
.equ RES = 6 ; 65xxx reset
.equ SHLD= 7 ; SHLD video shift register connected to OC3B
;
; Reset pin is set low and normally is configured as input. The normal
; pull-up on the reset line will pull the signal high. To activate
; reset we just set the pin as output which will pull the line low.
;
#define RESETPIN DDRB, RES
;
ldi temp, (1<<RDV) | (1<<WRK) | (0<<WRC) | (1<<CE)
out PORTB, temp
ldi temp, (1<<RDV) | (1<<WRK) | (1<<WRC) | (1<<INH) | (1<<CE) | (1<<SHLD)
out DDRB, temp
Now we can initialise the 65xxx system. For this we execute the following code
Code:
imlprepare:
;
; Stop the clock
;
cbi PORTD, PHI2 ; Make sure we end with PHI2=Low
sts TCCR2B, zero ; Stop the timer
sts TCCR2A, zero
sts OCR2A, zero
;
sbi RESETPIN
;
; After Reset is asserted (pulled low) the 65816 requires two
; full PHI2 cycles to reset.
;
sbi PORTD, PHI2 ; Note that consecutive sbi/cbi
cbi PORTD, PHI2 ; create a clock period of 4 AVR
sbi PORTD, PHI2 ; cycles, in our case 5.5MHz.
cbi PORTD, PHI2
sbi PORTD, PHI2
cbi PORTD, PHI2
sbi PORTD, PHI2
;
; De-assert Reset when PHI2 is high so the 65816 will detect
; the rising edge of reset at the next falling edge of PHI2
;
cbi RESETPIN
nop
nop
out PORTA, SYSCTL
sbi PORTA, IML
sbi PORTB, WRC
cbi PORTB, WRC ; Latch Control Bits into CPLD
in SYSCTL, PORTA
cbi PORTD, PHI2 ; this for some reason does not
; detect reset being de-asserted
;
; It seems that the number of cycles after a reset is not well
; and accurately documented in the W65C816 manual from WDC. How-
; ever it is in line with the Synertek manual of the 6502
;
; http://archive.6502.org/datasheets/synertek_hardware_manual.pdf
;
; which states that after RESET is de-asserted (high) the 6502 will
; delay 6 cycles and then fetch the program counter, so we wait
; 6 cycles before emitting the start address of IML.
;
sbi PORTD, PHI2
cbi PORTD, PHI2 ; This detects that reset is de-asserted
sbi PORTD, PHI2
cbi PORTD, PHI2 ; Cycle 1
sbi PORTD, PHI2
cbi PORTD, PHI2 ; Cycle 2
sbi PORTD, PHI2
cbi PORTD, PHI2 ; Cycle 3
sbi PORTD, PHI2
cbi PORTD, PHI2 ; Cycle 4
sbi PORTD, PHI2
cbi PORTD, PHI2 ; Cycle 5
ret
;
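As a cross-check of the sequence above, the following host-side Python sketch replays the pin changes of imlprepare and counts the PHI2 falling edges seen while reset is asserted and after it is released. It is only a bookkeeping model of the listing (the event list is transcribed by hand), not a simulation of the 65C816.

```python
# Host-side model of the PHI2/RES sequence in imlprepare (illustrative only).
# Each entry is (signal, level); RES is active low.
sequence = (
    [("PHI2", 0), ("RES", 0)]                 # stop clock, assert reset
    + [("PHI2", 1), ("PHI2", 0)] * 3          # three full cycles in reset
    + [("PHI2", 1), ("RES", 1), ("PHI2", 0)]  # release reset while PHI2 is high
    + [("PHI2", 1), ("PHI2", 0)] * 6          # detection edge plus cycles 1..5
)

def falling_edges(events):
    """Count PHI2 falling edges seen with RES low and with RES high."""
    phi2, res = 0, 1
    counts = {"in_reset": 0, "after_release": 0}
    for signal, level in events:
        if signal == "RES":
            res = level
        else:
            if phi2 == 1 and level == 0:
                counts["in_reset" if res == 0 else "after_release"] += 1
            phi2 = level
    return counts

counts = falling_edges(sequence)
print(counts)
```

So the listing gives the CPU three full PHI2 cycles with reset asserted, one more than the two it asks for, and seven falling edges after release before the first byte is supplied.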
Now the 65xxx processor is fully initialised. Next the CPU expects the reset vector and instructions, so we provide the necessary data. A small subroutine is used to place a byte into the byte buffer and strobe PHI2
Code:
imlbyte:
out PORTA, temp ; Set value
cbi PORTB, WRK ; Reset WRK
sbi PORTB, WRK ; Load with leading edge
sbi PORTD, PHI2 ; PHI2 = high
nop
nop
nop
cbi PORTD, PHI2 ; PHI2 = low
ret
The following routine takes a ROM image in flash and copies it to the RAM address that later becomes the ROM. It expects the pointer Z to point to the flash image containing the 65C816 ROM image. X is expected to be set up with the starting address of the ROM image as seen by the 65C816. In our case the ROM image must be copied into BANK 0 or BANK 4 depending on the value in X. This of course depends on how the CPLD maps the memory addresses. At the moment this routine just copies a ROM image from the starting address up to $FFFF. Unlike ROMulus the 1st, I now use the store absolute long instruction so I can write to the complete 24-bit address range of the 65C816. As this works in emulation mode as well, I don't need to switch into native mode here. The loop starts with sending an address. When the loop is entered for the first time, this is the moment when the 65C816 expects the address from the reset vector.
Code:
wloop:
;
ldi temp, 0x00 ; The first time we need to provide
rcall imlbyte
;
ldi temp, 0x03 ; a RESET vector, any address will do so
rcall imlbyte ; we use the same address as in the JMP.
;
ldi temp, LDA_ ; Load <immediate>
rcall imlbyte
;
lpm temp, Z+ ; next byte from ROM image
rcall imlbyte
;
ldi temp, STAL_ ; Store <absolutelong>
; ldi temp, STA_ ; Store <absolute>
rcall imlbyte
;
mov temp, xl ; to the current pointer
rcall imlbyte
;
mov temp, xh ; of the ROM image destination
rcall imlbyte
;
mov temp, zero ; CX ROM is in Bank 0
cpi xh, 0xD0
brlo wbank4
ldi temp, 0x04 ; D0..F8 ROM is in Bank 4
wbank4:
rcall imlbyte
;
ldi temp, 0xff ; the write cycle, the value is not
rcall imlbyte ; relevant
;
ldi temp, JMP_ ; Jump to start of loop
rcall imlbyte ;
;
adiw xh:xl, 1 ; Next ROM destination address
brne wloop ; if not overflow to 0 then continue
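The byte stream this loop feeds to the CPU can be sketched on a host in Python. This is a minimal model under some assumptions: the symbols LDA_, STAL_ and JMP_ are taken to be the standard 65C816 opcodes $A9, $8F and $4C, the function name `iml_stream` is invented for this example, and unlike the AVR loop the generator also stops when the supplied image runs out.

```python
# 65C816 opcodes assumed to be behind the LDA_, STAL_ and JMP_ symbols.
LDA_IMM = 0xA9    # LDA #imm
STA_LONG = 0x8F   # STA absolute long
JMP_ABS = 0x4C    # JMP absolute

def iml_stream(image, start):
    """Yield the bytes placed in the byte buffer, one per PHI2 cycle.

    image: bytes of the ROM image, start: 16-bit destination of the first
    byte. Like the AVR loop, copying stops when the address wraps past $FFFF.
    """
    addr = start
    for value in image:
        bank = 0x00 if (addr >> 8) < 0xD0 else 0x04
        yield from (
            0x00, 0x03,          # reset vector first, JMP operand afterwards
            LDA_IMM, value,
            STA_LONG, addr & 0xFF, addr >> 8, bank,
            0xFF,                # dummy byte during the CPU write cycle
            JMP_ABS,
        )
        addr = (addr + 1) & 0xFFFF
        if addr == 0:
            break

stream = list(iml_stream(bytes([0xEA, 0x60]), 0xCFFF))
print(" ".join(f"{b:02X}" for b in stream))
```

Each ROM byte costs ten PHI2 cycles. The example crosses the $CFFF/$D000 boundary, so the first store goes to bank 0 and the second to bank 4.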
Now we have loaded the ROM image into the memory section that will later be mapped as ROM and will typically be write protected. To activate the ROM we first stop the clock, clear IML, assert RESET and then start the clock
Code:
;
; Toggle IML in the control byte of the CPLD
;
imliml:
sbi GPIOR0, LOCKA
out PORTA, SYSCTL
sbi PINA, IML ; writing 1 to a PINA bit toggles the PORTA bit
sbi PORTB, WRC
cbi PORTB, WRC ; Latch Control Bits into CPLD
in SYSCTL, PORTA
cbi GPIOR0, LOCKA
ret
;
; Start the clock, the value in a1l is set to the desired rate
;
imlclock:
lds temp, a1l
tst temp
breq imlstop
ori temp, 0x01 ; 1 = 11MHz, 0x14 is ~1MHz, 0xC8 is ~ 100kHz
sts OCR2A, temp
lsr temp
sts OCR2B, temp
ldi temp, (1<<WGM22) | (0<<CS22) | (0<<CS21) | (1<<CS20)
sts TCCR2B, temp
ldi temp, (0<<COM2A1) | (0<<COM2A0) | (1<<COM2B1) | (1<<COM2B0) | (1<<WGM21) | (1<<WGM20)
sts TCCR2A, temp
ret
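The rate values in the comment can be checked with a little arithmetic. The mode bits select fast PWM with OCR2A as TOP and a prescaler of 1, so PHI2 should run at F_CPU / (OCR2A + 1). The Python sketch below mirrors the register arithmetic of imlclock; the 22MHz AVR clock is my assumption, inferred from "4 AVR cycles, in our case 5.5MHz", and `phi2_settings` is a made-up helper name.

```python
F_CPU = 22_000_000  # assumed AVR clock, inferred from "4 AVR cycles = 5.5MHz"

def phi2_settings(a1l):
    """Mirror the register arithmetic of imlclock for a non-zero rate value."""
    ocr2a = (a1l | 0x01) & 0xFF   # ori temp, 0x01: force an odd TOP value
    ocr2b = ocr2a >> 1            # lsr temp: compare value at half of TOP
    freq = F_CPU / (ocr2a + 1)    # fast PWM, prescaler 1, TOP = OCR2A
    return ocr2a, ocr2b, freq

for rate in (0x01, 0x14, 0xC8):
    ocr2a, ocr2b, freq = phi2_settings(rate)
    print(f"a1l=0x{rate:02X} OCR2A={ocr2a} OCR2B={ocr2b} PHI2={freq / 1e6:.3f} MHz")

manual_phi2 = F_CPU / 4  # sbi + cbi take 2 AVR cycles each
print(f"manual sbi/cbi clock: {manual_phi2 / 1e6:.1f} MHz")
```

Forcing OCR2A to an odd value makes the period an even number of timer ticks, so with OCR2B at half of TOP the high and low phases of PHI2 should come out equal.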
The AVR must wait at least 2 clock cycles before de-asserting RESET. As IML has been cleared, the memory mapping is switched to the normal SBC layout and the ROM image in RAM is mapped to the upper addresses (in our case $C100..FFFF). Once RESET has been de-asserted the CPU will restart using the "ROM" image.