FT816 Core
Posted: Thu Nov 27, 2014 8:37 am
by Rob Finch
Work has been started on a 65c816 compatible core called FT816. I've managed to get a small test program working for it.
Sources for the '816 core are in Github:
http://github.com/robfinch/Cores/tree/m ... T816/trunk
There is a sample LED/Switch test system called FT816Sys.v as a top module.
Next level is an mpu module which has chip select decoding, called FT816mpu.v
Next level down is the cpu itself FT816.v
Code:
cpu W65C816S
.org $E000
start:
clc ; switch to '816 mode
xce
rep #$30 ; set 16 bit regs & mem
ndx 16
mem 16
lda #$0070 ; program chip selects for I/O
sta $F000 ; at $007000
lda #$0071
sta $F002
ldy #$0000
.st0001:
ldx #$0000
.st0002:
inx
bne .st0002
lda $7100
sta $7000
iny
bra .st0001
.org $FFFC
dw $E000
Re: FT816 Core
Posted: Thu Nov 27, 2014 12:54 pm
by MichaelM
Rob:
Thanks for sharing. Had a look at the project. There's certainly a lot of work that's already been done. I'm certainly looking forward to reading about your progress on this project.
Re: FT816 Core
Posted: Thu Nov 27, 2014 11:45 pm
by Rob Finch
There's certainly a lot of work that's already been done.
I was able to make use of the '816 portion of a previous core (RTF65003) in order to speed development. Large parts of the core were 'already done' and working. One of the nice things about working in an HDL rather than with real chips is that it's easier to cut and paste. One can leverage existing projects.
The core is fairly large, but it should fit into an slx9 or xc3s500 part (it's about 5000 LUTs). It's also fairly fast; I've got it running at 64MHz. It outputs chip selects for slower memory and I/O parts, where the speed of the addressed area is controllable. The chip-select options are 1/32 and 1/4 of the clock rate.
Re: FT816 Core
Posted: Fri Nov 28, 2014 2:47 am
by MichaelM
Rob:
I noticed that your approach made many of the address and ALU calculations in parallel. From that I gathered that you were optimizing for speed rather than area/size. I think that you are succeeding in meeting your apparent goal by getting the part to operate around 64MHz. I think it will make the core quite attractive, especially for the speed hounds among us.
I also saw that you were using a shift register for the clock divider. That's a nice touch. It is a simple and effective approach to get your 1/32 and 1/4 clocks for the addressed devices. It is certainly easier to maintain the speed of the core by using shift registers instead of counters for dividers. The extra logic required for counting and decoding is far less efficient than your shift register approach.
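The shift-register divider idea can be sketched in a few lines of Python. This is a generic one-hot ring model, not a transcription of the FT816 source; the function name and the choice of tapping the last stage are assumptions made for illustration.

```python
def ring_divider(length, cycles):
    """Model a one-hot ring (shift-register) divider.

    Returns the value of the tapped stage on each clock; it is 1 once
    every `length` clocks, so it can act as a divided clock enable.
    """
    ring = [1] + [0] * (length - 1)    # a single '1' circulates through the register
    out = []
    for _ in range(cycles):
        out.append(ring[-1])           # tap the last stage as the enable
        ring = [ring[-1]] + ring[:-1]  # rotate one position per clock
    return out

pulses_div4 = ring_divider(4, 32)      # 1/4 rate enable
pulses_div32 = ring_divider(32, 64)    # 1/32 rate enable
```

No adder or comparator is involved: each stage only needs the output of its neighbour, which is why this style tends to be kind to timing.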
I noticed that you made an update earlier today. It appeared related to the return address differences between BRK traps, interrupts, and subroutine calls. I had thought about making my 65C02 core compatible with the idiosyncratic behavior of the 6502/65C02 for subroutine calls, BRK traps, and interrupts. Instead, I opted to make the return addresses for all three the same so that my RTS and RTI microroutines both increment the return address by 1 before the fetch of the instruction.
Is that what you were attempting to do, and if so, why not make RTS and RTI behave in a common manner with respect to the return address?
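For reference, the return-address difference being discussed can be modelled as below. This is a minimal sketch of the documented 6502/65C02 behaviour in 16-bit terms; the function names are hypothetical and the stack itself is left out.

```python
def jsr_pushed(pc_of_jsr):
    # JSR abs is 3 bytes; the value pushed is the address of its last byte
    return (pc_of_jsr + 2) & 0xFFFF

def rts_target(pulled):
    # RTS adds 1 to the pulled value before fetching the next instruction
    return (pulled + 1) & 0xFFFF

def irq_pushed(pc_next):
    # a hardware interrupt pushes the address of the next instruction unchanged
    return pc_next & 0xFFFF

def rti_target(pulled):
    # RTI resumes at exactly the pulled address
    return pulled & 0xFFFF
```

Both paths end up at the same place, but the value sitting on the stack differs by one, which matters to any code that pulls the return address and uses it directly.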
Re: FT816 Core
Posted: Fri Nov 28, 2014 6:33 am
by Rob Finch
I noticed that your approach made many of the address and ALU calculations in parallel. From that I gathered that you were optimizing for speed rather than area/size. I think that you are succeeding in meeting your apparent goal by getting the part to operate around 64MHz. I think it will make the core quite attractive, especially for the speed hounds among us.
Yes, that's a speed optimization, and also a coding simplicity optimization (development time). The core could probably be made more compact by making better reuse of some of the components.
Is that what you were attempting to do, and if so, why not make RTS and RTI behave in a common manner with respect to the return address?
I was worried about breaking existing software. I think I've got the behaviour the same as the 6502/65816. There are some pieces of software that pull and increment the return address. For instance to access inline parameters. It cost extra logic to remain 100% software compatible. One thing that's different is the PC increment. I think on the '816 only the low order 16 bits increment. In FT816 all 24 bits increment. This means that software that relies on wrapping around at the end of a bank is broken.
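The PC increment difference can be illustrated with a small Python model. It is only a sketch of the two behaviours as described above (bank-wrapping on the '816, linear 24-bit on FT816); the function names are made up for the example.

```python
def next_pc_65816(pbr, pc):
    """Only the low 16 bits increment; the program bank register stays put."""
    return pbr, (pc + 1) & 0xFFFF

def next_pc_linear24(pbr, pc):
    """All 24 bits increment, as described for FT816."""
    addr = (((pbr << 16) | pc) + 1) & 0xFFFFFF
    return addr >> 16, addr & 0xFFFF
```

The two only disagree at the last byte of a bank: from $01:FFFF the first wraps to $01:0000 while the second moves on to $02:0000.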
Re: FT816 Core
Posted: Sat Nov 29, 2014 7:32 am
by Rob Finch
Numerous bug fixes have taken place over the past couple of days. A sign the core is still in early development. But it is running code. An attempt to get strings to display onscreen is currently in progress. Clearscreen works but strings are coming out partially garbled.
Triple byte incrementing pointers are on the table tonight. On the 816 when the memory is set to 16 bits, data operations become 16 bit. That includes the memory based operations like INC and ROR. A nice to have feature would be triple-byte increments (24 bit) for zero page pointers. The question is how to implement. My thought is to have a range of zero page reserved that automatically responds to increment and decrement operations with 24 bit operations rather than 16 bit ones. Suppose the range was $20 to $2F. INC $20 would increment across three bytes ($20,$21 and $22), rather than two. It sounds simple to do, but it requires manipulating 24 bit values in the core. It might be nice to have 24 bit shifts / rotates as well.
Re: FT816 Core
Posted: Sat Nov 29, 2014 11:29 pm
by Dr Jefyll
It sounds simple to do, but it requires manipulating 24 bit values in the core.
When you mentioned triple-byte increments, I misunderstood, maybe, and pictured an operation performed byte-serially in three pieces. Doing it that way means if there's no carry you can take an early exit and save cycles (or one, at least). If the triple-byte value lives in 8-bit external RAM, that idea seems like a win. But I think you're anticipating Direct Page will be in on-chip RAM, is that right? So you can grab 24 bits at once? If 16 bits at once is easier, you could still break the 24-bit operation down into two pieces.
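A byte-serial increment with an early exit might look like the following Python model. It assumes a little-endian 24-bit value in byte-wide memory; the function name and the returned write count are illustrative only.

```python
def inc24_byte_serial(mem, addr):
    """Increment a little-endian 24-bit value in place, one byte at a time.

    Returns the number of bytes written, to show the saving from the
    early exit when a byte does not carry into the next one.
    """
    writes = 0
    for i in range(3):
        mem[addr + i] = (mem[addr + i] + 1) & 0xFF
        writes += 1
        if mem[addr + i] != 0:   # no carry out of this byte: stop here
            break
    return writes
```

In the common case only the low byte is touched; all three bytes are written only when the low two are both $FF.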
But I wonder whether, deep in its heart, FT816 wants to be a native 24-bit machine. (Or 32/24-bit.) I mean with downgraded 8/16-bit modes to match the 65816.
-- Jeff
Re: FT816 Core
Posted: Sun Nov 30, 2014 2:21 am
by Rob Finch
When you mentioned triple-byte increments, I misunderstood, maybe, and pictured an operation performed byte-serially in three pieces. Doing it that way means if there's no carry you can take an early exit and save cycles (or one, at least). If the triple-byte value lives in 8-bit external RAM, that idea seems like a win. But I think you're anticipating Direct Page will be in on-chip RAM, is that right? So you can grab 24 bits at once? If 16 bits at once is easier, you could still break the 24-bit operation down into two pieces.
Yes, the increment is byte serial. I modified the core to skip the store on a RMW instruction if the high-byte didn't change.
I think I'm scrapping the triple-byte increment mode in favor of a couple of 24-bit counter I/O devices located in zero page. I coded the counters at the MPU level, leaving the CPU alone. With the CPU clock so fast, 24-bit counters rather than 16-bit ones would be more useful. It's also possible to trigger a count cycle in software, so the counters could be used as interpretive pointers as well. I have one counter operating as a down counter to generate periodic interrupts. It takes 19 bits to divide down to 100Hz from the cpu clock.
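The counter width for a periodic interrupt follows directly from the clock ratio. The sketch below is a worked calculation only; the clock driving the counter is not stated in the post, so the two frequencies used are assumptions.

```python
def divider_bits(clock_hz, tick_hz):
    """Return (reload value, counter width in bits) for a down counter."""
    reload = clock_hz // tick_hz                 # clocks per tick
    return reload, (reload - 1).bit_length()     # bits needed to hold 0..reload-1

at_50mhz = divider_bits(50_000_000, 100)   # assumed clock: 50 MHz
at_64mhz = divider_bits(64_000_000, 100)   # assumed clock: the 64 MHz core clock
```

A 19-bit counter covers clocks up to about 52.4 MHz at 100 Hz, and 64 MHz would need 20 bits. Either way the count is well past 16 bits.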
But I wonder whether, deep in its heart, FT816 wants to be a native 24-bit machine. (Or 32/24-bit.) I mean with downgraded 8/16-bit modes to match the 65816.
I have to resist temptations on that one. The thought of full 24 bit registers did cross my mind.
I've managed to find some free opcode space in the 65816 instruction set - it's the branches. A branch displacement of $FF is a no-no, so the branch opcodes could be reused if the displacement is $FF, to mean something else. I've thought of using them as prefix codes and maybe stealing some from Michael's cpu core.
Re: FT816 Core
Posted: Mon Dec 01, 2014 3:25 am
by Rob Finch
The $FF branch displacements have been allocated to long forms for the branches. Code below shows how it's working. Long branching has been put to use in a parser for terminal emulation. Long branching has been made a core parameter should it not be desired. The following code was assembled with the Finitron 65816 assembler.
Code:
00E0E0 DisplayChar:
00E0E0 29 FF 00 AND #$0FF
00E0E3 24 3C BIT EscState
00E0E5 30 FF 87 00 LBMI processEsc
00E0E9 C9 08 00 CMP #BS
00E0EC F0 FF 31 01 LBEQ doBackSpace
00E0F0 C9 91 00 CMP #$91 ; cursor right
00E0F3 F0 FF 7F 01 LBEQ doCursorRight
00E0F7 C9 93 00 CMP #$93 ; cursor left
00E0FA F0 FF 84 01 LBEQ doCursorLeft
00E0FE C9 90 00 CMP #$90 ; cursor up
00E101 F0 FF 86 01 LBEQ doCursorUp
00E105 C9 92 00 CMP #$92 ; cursor down
00E108 F0 FF 88 01 LBEQ doCursorDown
00E10C C9 99 00 CMP #$99 ; delete
00E10F F0 FF 35 01 LBEQ doDelete
00E113 C9 0D 00 CMP #CR
00E116 F0 44 BEQ doCR
00E118 C9 0A 00 CMP #LF
00E11B F0 44 BEQ doLF
00E11D C9 94 00 CMP #$94
00E120 F0 FF 46 01 LBEQ doCursorHome ; cursor home
00E124 C9 1B 00 CMP #ESC
00E127 D0 05 BNE .0003
00E129 64 3C STZ EscState ; put a -1 in the escape state
00E12B C6 3C DEC EscState
00E12D 60 RTS
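The encoding in the listing above can be decoded with a short Python sketch. It assumes that the two bytes after the $FF marker are a signed, little-endian 16-bit displacement measured from the end of the 4-byte instruction; that reading appears consistent with the bytes shown, but it is inferred from the listing rather than taken from the core's documentation.

```python
def decode_branch(addr, code):
    """Return (length, target) for a relative branch located at addr.

    code holds the opcode byte followed by its operand bytes.
    """
    disp8 = code[1]
    if disp8 == 0xFF:
        # long form: $FF marker, then a 16-bit displacement (assumed signed)
        disp = int.from_bytes(code[2:4], "little", signed=True)
        length = 4
    else:
        # ordinary short form: signed 8-bit displacement
        disp = disp8 - 0x100 if disp8 & 0x80 else disp8
        length = 2
    # the target is kept to 16 bits here, which is a simplification
    return length, (addr + length + disp) & 0xFFFF

lbmi = decode_branch(0xE0E5, bytes([0x30, 0xFF, 0x87, 0x00]))  # LBMI processEsc
beq = decode_branch(0xE116, bytes([0xF0, 0x44]))               # BEQ doCR
```

The short forms are untouched; only the otherwise useless displacement value is given a new meaning.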
The FT816 test system runs Supermon816 but there are some display issues still. These are believed to be a software problem with the terminal emulation and not a softcore problem. That is not to say the core doesn't have bugs - the latest fix was to JMP (indirect) - but the core seems more stable the last day or so.
Re: FT816 Core
Posted: Mon Dec 01, 2014 6:16 am
by BigDumbDinosaur
The FT816 test system runs Supermon816 but there are some display issues still. These are believed to be a software problem with the terminal emulation and not a softcore problem. That is not to say the core doesn't have bugs - the latest fix was to JMP (indirect) - but the core seems more stable the last day or so.
The download version of SuperMon 816 has a WYSE 60 driver in it. The mumbo-jumbo that makes it work is well-commented and can be readily changed to support a different terminal type.
Re: FT816 Core
Posted: Mon Dec 01, 2014 7:34 am
by barrym95838
Code:
00E0E0 DisplayChar:
00E0E0 29 FF 00 AND #$0FF
00E0E3 24 3C BIT EscState
00E0E5 30 FF 87 00 LBMI processEsc
00E0E9 C9 08 00 CMP #BS
...
How exactly does this work, Rob?
I don't know how your native mode works, so I have to ask the following (please forgive my ignorance if it's already been explained elsewhere):
1) The 8-bit BIT $3C instruction ANDs the value in A with the 8-bit value in $3C, using it to set or clear Z. It transfers the contents of bits 6 and 7 in $3C to V and N, respectively, right?
2) The 16-bit BIT $3C instruction would then AND the value in A with the 16-bit value in $3C and $3D, little-endian style, and use it to set or clear Z. It would then transfer the contents of bits 14 and 15 of the contents of $3C and $3D to V and N, respectively, right?
3) Where are bits 14 and 15 of the 16-bit value "in" $3C? Are they not in bits 6 and 7 of $3D? I don't know how you implemented your ZP variable EscState, but is it possible that your terminal emulation bug lies in this vicinity?
Mike
Re: FT816 Core
Posted: Mon Dec 01, 2014 10:00 am
by Rob Finch
The download version of SuperMon 816 has a WYSE 60 driver in it. The mumbo-jumbo that makes it work is well-commented and can be readily changed to support a different terminal type.
I was referring to a probable bug in my own code to emulate a WYSE 60 terminal, not Supermon. I had read through the comments. Supermon816 works great. I found one display bug in the text video controller I had set up - readback of character attribute codes wasn't working. Nothing to do with the '816 emulation.
How exactly does this work, Rob?
I don't know how your native mode works, so I have to ask the following (please forgive my ignorance if it's already been explained elsewhere):
1) The 8-bit BIT $3C instruction ANDs the value in A with the 8-bit value in $3C, using it to set or clear Z. It transfers the contents of bits 6 and 7 in $3C to V and N, respectively, right?
2) The 16-bit BIT $3C instruction would then AND the value in A with the 16-bit value in $3C and $3D, little-endian style, and use it to set or clear Z. It would then transfer the contents of bits 14 and 15 of the contents of $3C and $3D to V and N, respectively, right?
3) Where are bits 14 and 15 of the 16-bit value "in" $3C? Are they not in bits 6 and 7 of $3D? I don't know how you implemented your ZP variable EscState, but is it possible that your terminal emulation bug lies in this vicinity?
You are right on with #1 and #2. There are no modes to this core - it's a straight '816 emulation. So it's supposed to work exactly like the '816 would. Accumulator and memory are both set to 16 bits before the subroutine is called. There are two bytes of zero page reserved for the EscState flag. $3C,$3D. The bit test is using the MSB as a flag that an escape sequence is present. The flag is set to -1 later in code if an ESC char is present. The program does get into processing escape sequences, as things like reverse video, and cursor on/off seem to work. It's just that the display goes nuts when I try running Supermon's disassemble function. It does disassemble code, but it's all over the screen rather than being neatly laid out. I could post the code, but it's a little long.
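The 16-bit BIT behaviour agreed on above can be modelled in Python as follows. This is a sketch of the flag rules only, with hypothetical helper names; `stz_dec16` stands in for the STZ/DEC pair that stores -1 in the two-byte EscState.

```python
def bit16(a, mem, addr):
    """Flags produced by BIT with 16-bit accumulator and memory."""
    m = mem[addr] | (mem[addr + 1] << 8)   # little-endian operand
    return {"Z": (a & m) == 0,             # Z from A AND M
            "N": bool(m & 0x8000),         # N copies bit 15 of memory
            "V": bool(m & 0x4000)}         # V copies bit 14 of memory

def stz_dec16(mem, addr):
    """STZ then DEC on a 16-bit location leaves $FFFF (-1) there."""
    value = (0 - 1) & 0xFFFF
    mem[addr] = value & 0xFF
    mem[addr + 1] = value >> 8
```

With both bytes at $FF, bit 15 lives in bit 7 of $3D, so the N flag tested by the branch is set as intended.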
Re: FT816 Core
Posted: Mon Dec 01, 2014 10:09 am
by Rob Finch
Code for the terminal emulation.
Vars:
- VIDBUF is the text video memory, laid out PC style with high byte attribute, and low byte screen char
- VideoPos is an index into video memory.
- NormAttr is the normal display attribute in use
- VIDREGS is the text controller's video register set; registers +13 and +14 hold an index for the cursor position
Code:
;------------------------------------------------------------------------------
; Display a character on the screen device
;------------------------------------------------------------------------------
;
DisplayChar:
AND #$0FF
BIT EscState
LBMI processEsc
CMP #BS
LBEQ doBackSpace
CMP #$91 ; cursor right
LBEQ doCursorRight
CMP #$93 ; cursor left
LBEQ doCursorLeft
CMP #$90 ; cursor up
LBEQ doCursorUp
CMP #$92 ; cursor down
LBEQ doCursorDown
CMP #$99 ; delete
LBEQ doDelete
CMP #CR
BEQ doCR
CMP #LF
BEQ doLF
CMP #$94
LBEQ doCursorHome ; cursor home
CMP #ESC
BNE .0003
STZ EscState ; put a -1 in the escape state
DEC EscState
RTS
.0003:
ORA NormAttr
PHA
LDA VideoPos
ASL
TAX
PLA
STA VIDBUF,X
LDA CursorX
INA
CMP #$56 ; NOTE: $56 is hex (86 decimal); the other column checks here use decimal 55/56
BNE .0001
STZ CursorX
LDA CursorY
CMP #$30 ; NOTE: $30 is hex (48 decimal); doLF and doCursorDown compare against decimal 30
BEQ .0002
INA
STA CursorY
BRL SyncVideoPos
.0002:
JSR SyncVideoPos
BRL ScrollUp
.0001:
STA CursorX
BRL SyncVideoPos
doCR:
STZ CursorX
BRL SyncVideoPos
doLF:
LDA CursorY
CMP #30
LBEQ ScrollUp
INA
STA CursorY
BRL SyncVideoPos
processEsc:
LDX EscState
CPX #-1
BNE .0006
CMP #'T' ; clear to EOL
BNE .0003
LDA VideoPos
ASL
TAX
LDY CursorX
.0001:
CPY #55
BEQ .0002
LDA #' '
ORA NormAttr
STA VIDBUF,X
INX
INX
INY
BNE .0001
.0002:
STZ EscActive ; NOTE: every other path clears EscState; EscActive appears nowhere else in this listing
RTS
.0003:
CMP #'W'
BNE .0004
STZ EscState
BRL doDelete
.0004:
CMP #'`'
BNE .0005
LDA #-2
STA EscState
RTS
.0005:
CMP #'('
BNE .0008
LDA #-3
STA EscState
RTS
.0008:
STZ EscState
RTS
.0006:
CPX #-2
BNE .0007
STZ EscState
CMP #'1'
LBEQ CursorOn
CMP #'0'
LBEQ CursorOff
RTS
.0007:
CPX #-3
BNE .0009
CMP #ESC
BNE .0008
LDA #-4
STA EscState
RTS
.0009:
CPX #-4
BNE .0010
CMP #'G'
BNE .0008
LDA #-5
STA EscState
RTS
.0010:
CPX #-5
BNE .0008
STZ EscState
CMP #'4'
BNE .0011
LDA NormAttr
; Swap the high nybbles of the attribute
XBA
SEP #$30 ; set 8 bit regs
NDX 8 ; tell the assembler
MEM 8
ROL
ROL
ROL
ROL
REP #$30 ; set 16 bit regs
NDX 16 ; tell the assembler
MEM 16
XBA
AND #$FF00
STA NormAttr
RTS
.0011:
CMP #'0'
BNE .0012
LDA #$BF00 ; Light Grey on Dark Grey
STA NormAttr
RTS
.0012:
LDA #$BF00 ; Light Grey on Dark Grey
STA NormAttr
RTS
doBackSpace:
LDY CursorX
BEQ .0001 ; Can't backspace anymore
LDA VideoPos
ASL
TAX
.0002:
LDA VIDBUF,X
STA VIDBUF-2,X
INX
INX
INY
CPY #56
BNE .0002
.0003:
LDA #' '
ORA NormAttr
STA VIDBUF,X
DEC CursorX
BRL SyncVideoPos
.0001:
RTS
; Deleting a character does not change the video position so there's no need
; to resynchronize it.
doDelete:
LDY CursorX
LDA VideoPos
ASL
TAX
.0002:
CPY #55
BEQ .0001
LDA VIDBUF+2,X
STA VIDBUF,X
INX
INX
INY
BRA .0002
.0001:
LDA #' '
ORA NormAttr
STA VIDBUF,X
RTS
doCursorHome:
LDA CursorX
BEQ doCursor1
STZ CursorX
BRA SyncVideoPos
doCursorRight:
LDA CursorX
CMP #55
BEQ doRTS
INA
doCursor2:
STA CursorX
BRA SyncVideoPos
doCursorLeft:
LDA CursorX
BEQ doRTS
DEA
BRA doCursor2
doCursorUp:
LDA CursorY
BEQ doRTS
DEA
BRA doCursor1
doCursorDown:
LDA CursorY
CMP #30
BEQ doRTS
INA
doCursor1:
STA CursorY
BRA SyncVideoPos
doRTS:
RTS
HomeCursor:
LDA #0
STZ CursorX
STZ CursorY
; Synchronize the absolute video position with the cursor co-ordinates.
;
SyncVideoPos:
LDA CursorY
ASL
TAX
LDA LineTbl,X
CLC
ADC CursorX
STA VideoPos
STA VIDREGS+13 ; Update the position in the text controller
RTS
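The addressing done by SyncVideoPos and the `ASL / TAX / STA VIDBUF,X` sequences can be sketched in Python. The contents of LineTbl are not shown in the post, so the 56-column, 31-row table below is an assumption inferred from the limits used in the code, and the names are hypothetical.

```python
COLS = 56   # assumption: inferred from the CPY #55 / CPY #56 limits
ROWS = 31   # assumption: inferred from the CMP #30 row limit
LINE_TBL = [row * COLS for row in range(ROWS)]   # hypothetical LineTbl contents

def sync_video_pos(cursor_x, cursor_y):
    """VideoPos = LineTbl[CursorY] + CursorX, counted in character cells."""
    return LINE_TBL[cursor_y] + cursor_x

def vidbuf_offset(video_pos):
    """Byte offset into VIDBUF: each cell is one 16-bit word (the ASL)."""
    return video_pos * 2

def cell(attr, ch):
    """Build a screen word: high byte attribute, low byte character."""
    return (attr & 0xFF00) | (ch & 0x00FF)
```

For example, row 2, column 5 gives cell index 117 and byte offset 234 under these assumed dimensions.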
Re: FT816 Core - SPH01
Posted: Fri Oct 02, 2015 6:55 pm
by Rob Finch
Fixed a subtle bug in the core today. The stack pointer high byte was not being set to 01h when a switch to emulation mode occurs. Instead the core forced stack address calculations in emulation mode to use '01h' as the stack page without modifying SPH. This bug did not appear to break any software that was tested so far. However when switching back to native mode from emulation mode, the SPH register didn't contain an 01h. Instead it would contain the previous value set in native mode. It is now fixed to switch to page 01h.
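The intended behaviour after the fix can be modelled with a tiny Python class. It only tracks the carry flag, the emulation flag and the stack pointer; the class and method names are made up for the sketch, and the other side effects of a mode switch are left out.

```python
class Cpu:
    def __init__(self):
        self.sp = 0x01FF   # 16-bit stack pointer
        self.e = True      # emulation flag (set at reset)
        self.c = False     # carry flag

    def xce(self):
        # exchange the carry and emulation flags
        self.c, self.e = self.e, self.c
        if self.e:
            # entering emulation mode: SPH itself is forced to $01
            self.sp = 0x0100 | (self.sp & 0x00FF)

cpu = Cpu()
cpu.xce()            # CLC / XCE: switch to native mode
cpu.sp = 0x7FF0      # native-mode code moves the stack
cpu.c = True
cpu.xce()            # SEC / XCE: back to emulation mode
sp_in_emulation = cpu.sp
cpu.c = False
cpu.xce()            # CLC / XCE: native again
sp_back_in_native = cpu.sp
```

With the register itself rewritten, the value seen after returning to native mode is $01F0 rather than the stale $7FF0.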
I have not tested this fix yet. I need to upgrade my toolset to be Windows 10 compatible.
Re: FT816 Core / FT832 Core
Posted: Fri Oct 30, 2015 3:51 pm
by Rob Finch
I've copied the FT816 core and am adding 32-bit support to create the FT832 core. It will be backwards compatible. As suggested by others on the board, the FT832 will have all 32-bit registers in native mode. The program bank register is turned into a code segment register, and the data bank register is turned into a data segment register. To switch to emulation mode from native mode, a long jump (JML) instruction will be used that specifies both the program counter and the code segment. In this case the code segment value will be $FFFFFFFF to switch to 8-bit emulation or $FFFFFFFE to switch to 16-bit emulation. A jump instruction is used because the code segment needs to be zeroed on the switch, and the program's execution address forced to a known value. In native 32-bit mode the eight-bit displacement in an instruction will be shifted left twice before being put to use. This will allow more efficient use of zero page (extending it to 1kB). Also, in native mode, long address mode instructions use 32-bit addresses rather than 24-bit ones. Otherwise the instruction set remains the same. It is necessary to flip bits in the status register to get byte or 16-bit operations from native mode.
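The scaled displacement can be shown with a one-line Python model. How the result is combined with any base register is not described above, so this sketch covers only the shift itself, and the function name is hypothetical.

```python
def dp_offset_native32(disp8):
    """Eight-bit displacement shifted left twice (multiplied by four)."""
    return (disp8 & 0xFF) << 2

offsets = [dp_offset_native32(d) for d in range(256)]
window_bytes = max(offsets) + 4   # last 32-bit slot starts at $3FC
```

The 256 possible displacements land on 4-byte boundaries from $000 to $3FC, which is where the 1kB figure comes from.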