Posted: Tue Nov 22, 2011 6:17 pm
by ElEctric_EyE
...if you want to clock external devices with the same clock as your internal clock, you need to bring that out with a DDR FF, and the rest of the signals with regular FFs.
Yes, I've done what you suggested and it appears to clear the screen MUCH faster than I've seen at similar 20MHz speeds on the Spartan 3. Maybe my timing was off using the Spartan 3? More experimentation will tell. But at this point I am able to change colors from green to blue (for some reason not red; red shows as blue) by changing variables in my clear screen routine.
I'll have to do more testing, it is not 100% yet. I am changing some other variables now, like the number of bytes it sends to the TFT, so it should clear everything on the screen down to the last controlled block of data, to ensure the correct amount of data is being sent. Still testing...
Also, for some reason, when I try to do a SIM of the project, it gets stuck in a loop and I cannot halt it. I have remembered to comment out the 'ifdef SIM' & 'endif' statements when trying to run a SIM. Still testing...
Posted: Tue Nov 22, 2011 6:24 pm
by ElEctric_EyE
AH! Now I am in business with ISim! I forgot it does not synth the DCM without extra measures...
Also, Red data works. I must've forgotten to update the ROM in my excitement.
I'm focusing on the amount of data being sent right now; it seems to be exceeding the boundary of the display for some reason.
Posted: Tue Nov 22, 2011 8:05 pm
by ElEctric_EyE
It is working correctly after all! My timing must have been off in my old designs when I was experimenting with the Spartan 3.
I've modified the 8-bit CLRSCR to be 16-bit now. The speed at which the TFT is clearing @20MHz is like when I had it running @38MHz! The TFT is actually rated @50MHz, but with the 7.5ns prop delay of the Spartan 3 it could only run @38MHz. I'm sure with the Spartan 6 I can run reliably @40MHz.
Just FYI, ISE13.2 Timespec is showing a delay of O2Internal @13.9ns and top speed of 71.9MHz, with a 4Kx16 zero page, 4Kx16 stack, and 16Kx16 ROM. No other cores are present yet.
Also, I tried hooking the reset line on the MCP2200 (USB to UART) back up without the USB cord plugged in providing power, and it does indeed foul the Reset circuit. I will just tie it high with a pull-up and hope it works correctly when the time comes to test it.
Here's the new 16-bit TFT CLRSCR routine. I am unsure why I have to increase the 'Y' value to 5:
Code:
CLRSCR: PHA
TXA
PHA
TYA
PHA
LDA #$2A ;set x address
STA DCOM
LDA #$00 ;start
STA DDAT
LDA #$00
STA DDAT
LDA #$02 ;end
STA DDAT
LDA #$7F
STA DDAT
LDA #$2B ;set y address
STA DCOM
LDA #$00 ;start
STA DDAT
LDA #$00
STA DDAT
LDA #$01 ;end
STA DDAT
LDA #$DF
STA DDAT
LDA #$2C ;prepare to send display data
STA DCOM
LDY #$05 ;5 passes x 65536 = 327,680, enough to cover 640x480 = 307,200 pixels (4 passes = 262,144 is too few)
AA: LDX #$0000
AB: LDA SCRCOL1
STA DDAT
LDA SCRCOL2
STA DDAT
LDA SCRCOL3
STA DDAT
DEX
BNE AB
DEY
BNE AA
PLA
TAY
PLA
TAX
PLA
DONE: JMP DONE
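On the question of why 'Y' has to be 5: here is a rough sketch of the loop arithmetic in Python (not 65Org16 code). It assumes the X register is 16 bits wide, so `LDX #$0000` followed by `DEX`/`BNE` runs the inner loop 65536 times, and that each inner pass sends exactly one pixel (the three DDAT writes). The names are made up for illustration.

```python
import math

# Window set by the routine: column end $027F, row end $01DF (start is 0 for both).
X_END = 0x027F
Y_END = 0x01DF
WIDTH = X_END + 1    # 640
HEIGHT = Y_END + 1   # 480

# Assumption: 16-bit X, so starting at 0 and decrementing to 0 takes 65536 passes.
INNER_PASSES = 0x10000

pixels = WIDTH * HEIGHT                          # 307,200
outer_passes = math.ceil(pixels / INNER_PASSES)  # smallest Y that covers the screen
overshoot = outer_passes * INNER_PASSES - pixels # pixels sent past the last one

print(WIDTH, HEIGHT, pixels, outer_passes, overshoot)
```

Under those assumptions 4 outer passes only reach 262,144 pixels, so 5 is the smallest value that covers the whole window, at the cost of 20,480 extra pixel writes. What the TFT controller does with the extra writes is not something this sketch can tell.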
My current list of immediate priorities:
1) Try to increase O2 to 40MHz.
2) Add back in Arlet's improvements for his 6502 core, 1 mod at a time and retest the 65Org16.b core.
Posted: Tue Nov 22, 2011 9:31 pm
by ElEctric_EyE
While trying to approach higher speeds for O2, I noticed behavior would change when I put my finger close to the Reset button, which is on the opposite side of the DS1818. This is leading me, once again, to focus on the Reset circuit as correct operation was not observed above 24MHz... I guess I got lucky picking an arbitrary 20MHz starting point.
Right now I have no external pullup on the Reset line. Tomorrow I will experiment and add a 1K external pullup, as suggested on the DS1818 datasheet. Maybe retry connecting the unpowered MCP2200 Reset...
Posted: Wed Nov 23, 2011 4:22 pm
by ElEctric_EyE
After adding a 1K pull-up and reconnecting the MCP2200 Reset, the system seems to run properly at ~28MHz max ((100MHz/7)x2); anything faster and the TFT does not init properly...
Next I would like to add in the software that displays characters and add in Arlet's UART core, and the PS2 keyboard core. Then test 2 unidirectional terminals, the local keyboard plugged into the PS2, and the remote terminal from the PC going through the MCP2200 USB to UART utilizing the Br@y Terminal. I was able to get 12.5MHz to the MCP clk, close enough, although the baud rate may be skewed a bit. The Tx & Rx LEDs were lit...
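The clock figures above can be checked with a small Python sketch. It assumes the synthesized clock is simply input x multiply / divide, which is how the (100MHz/7)x2 figure reads. The 12MHz nominal clock for the MCP2200 is an assumption not stated in this thread, and the function names are made up for illustration.

```python
from fractions import Fraction

def synth_clock_mhz(clkin_mhz, multiply, divide):
    """Output frequency of a multiply/divide clock synthesizer, in MHz."""
    return Fraction(clkin_mhz) * multiply / divide

def clock_error_percent(actual_mhz, nominal_mhz):
    """How far the actual clock is from nominal, as a percentage."""
    return float((Fraction(actual_mhz) - nominal_mhz) / nominal_mhz * 100)

o2 = synth_clock_mhz(100, 2, 7)        # the (100MHz/7)x2 setting
mcp_clk = synth_clock_mhz(100, 1, 8)   # one way to reach 12.5MHz (hypothetical pair)

# Assumption: the MCP2200 expects a 12MHz clock.
skew = clock_error_percent(mcp_clk, 12)

print(round(float(o2), 2), float(mcp_clk), round(skew, 2))
```

If the baud rate scales directly with the input clock, which is also an assumption, the skew would be about 4%.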
Posted: Wed Nov 23, 2011 6:09 pm
by ElEctric_EyE
This bit of software is what I made back in May 2011 for the 6502SoC to display a character with the PLTCHR routine when reading keypress data from the PS2 Core, no interrupts.
Before calling that routine however, I made it so that I had to preset some variables using the ATTBUTE routine. Neat graphics stuff, I really enjoyed writing it...
With 8 more bits available, using the 65Org16 as the processor, I look forward to doing even more with this routine. I can definitely use different X and Y sizes, which I had originally sacrificed for other bit settings. I see though, most free bits will be devoted to color.
Attribute Routine (Call 1st)
Code:
ATTBUTE LDA chrattr ;get color from bits 0,1,2,3
AND #$0f
ASL
ASL ;multiply by 4 for easy indexing
TAX
LDA coltable,x
STA pxlcol1
INX
LDA coltable,x
STA pxlcol2
INX
LDA coltable,x
STA pxlcol3
LDA chrattr ;check bits 4,5,6 for size
AND #$70
LSR
LSR
LSR
LSR ;make size 1x through 7x, no size 0!
STA xwidth
STA ywidth
LDA chrattr ;check font bit 7: as coded below, 1=3x5, 0=C64
AND #$80
CMP #$80
BEQ n64
LDA #$08 ;C-64 8x8 Character set
STA chrxlen
STA chrylen
LDA #$e0 ;starting @ $e000
STA chrvecH
JMP porc
n64 LDA #$04 ;3x5 character set
STA chrxlen
LDA #$05
STA chrylen
LDA #$e5 ;starting @ $e500
STA chrvecH
porc LDA PE ;test PE for plot or clear
BNE plot2
LDA scrcol1
STA tmpcol1
LDA scrcol2
STA tmpcol2
LDA scrcol3
STA tmpcol3
RTS
plot2 LDA pxlcol1
STA tmpcol1
LDA pxlcol2
STA tmpcol2
LDA pxlcol3
STA tmpcol3
RTS
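For anyone following the bit layout, here is a rough Python model of how ATTBUTE unpacks `chrattr`. It follows the branch logic exactly as posted (bit 7 set takes the `n64` path and loads the 3x5 set). The function name and dictionary keys are made up for illustration and are not part of the routine.

```python
def decode_attribute(chrattr):
    """Unpack the 8-bit attribute the way the ATTBUTE listing does."""
    color_index = chrattr & 0x0F      # bits 0-3: color
    table_offset = color_index << 2   # ASL, ASL: 4 bytes per coltable entry
    size = (chrattr & 0x70) >> 4      # bits 4-6: 1x..7x (0 is not a valid size)
    if chrattr & 0x80:                # bit 7 set: BEQ n64 is taken
        font = {"name": "3x5", "chrxlen": 4, "chrylen": 5, "chrvecH": 0xE5}
    else:                             # bit 7 clear: falls through to the C-64 set
        font = {"name": "C64", "chrxlen": 8, "chrylen": 8, "chrvecH": 0xE0}
    return {"color_index": color_index, "table_offset": table_offset,
            "xwidth": size, "ywidth": size, "font": font}

print(decode_attribute(0x15))  # color 5, size 1x, C-64 font
print(decode_attribute(0xA3))  # color 3, size 2x, 3x5 font
```

Only three of the four bytes in each `coltable` entry are read (one per color component); the fourth is just padding so the index is a simple shift.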
Plot Character (Call 2nd)
Code:
PLTCHR LDA #$2c ;PLTCHR
STA dcom
PHA ;Plot Character Subroutine variable (1-7) H and V size
TYA ;save all reg's
PHA
TXA
PHA
CACALC LDA chrvecH ;CACALC Character Address Calculate Subroutine
PHA
CLC
LDA chr
LDY #$03
shift3 ASL ;multiply character by 8 to get char pointer LSB
DEY
BNE shift3
STA chrvecL
CLC
LDA chr
LDY #$05
shift4 LSR
DEY
BNE shift4
TAX
BEQ ninc
cph2 INC chrvecH
DEX
BNE cph2
ninc LDY #$00 ;vertical bit count
loop7 LDX ywidth
loop4 TXA
PHA
CLC
LDX #$00 ;horizontal bit count
LDA (chrvecL),y ;character data addr, $E000(c64) or $E500(3x5)
loop5 ASL ;Shift left
BCC nonpxl ;branch on no bit
pxl2 PHA ;save character data
TXA
PHA
LDX xwidth
xwp LDA tmpcol1
STA ddat
LDA tmpcol2
STA ddat
LDA tmpcol3
STA ddat
DEX
BNE xwp
PLA
TAX
PLA
INX
CPX chrxlen
BNE loop5 ;test for horizontal max
JMP next
nonpxl PHA ;save character data
TXA
PHA
LDX xwidth
xwnp LDA scrcol1 ;load RED background color
STA ddat
LDA scrcol2 ;load GREEN background color
STA ddat
LDA scrcol3 ;load BLUE background color
STA ddat
DEX
BNE xwnp
PLA
TAX
PLA ;reload character data to accumulator
INX
CPX chrxlen
BNE loop5
next PLA
TAX
DEX
BNE loop4
INY
CPY chrylen
BNE loop7
PLA ;reset character pointers
STA chrvecH
LDA #$00
STA chrvecL
PLA
TAX
PLA ;reload all reg's
TAY
PLA
RTS
Posted: Sat Nov 26, 2011 8:34 pm
by ElEctric_EyE
I'm in the middle of converting the PLTCHR routine. Immediately I see I can dispose of the byte manipulations for the CACALC routine, where I have to multiply a character inputted by 8 in order to index the pixel data. With the 65Org16, I can multiply up to 8192 by 8 and not have to use indirect indexed mode. I won't be approaching anywhere near 8K here though. If I didn't need the precious block RAM, I would consider using a 16x16 character set. Maybe in time I could store it in FLASH, when I get the SPI core working.
EDIT: Also I see some corrections I need to make on the original 6502 PLTCHR routine.
First mistake at the very beginning, where an LDA/STA was erroneously done to the TFT register before pushing Accumulator, X, & Y values onto the stack.
Second mistake is using a BNE loop to do a x8 multiply in a graphics routine. It needs to be as fast as possible, so instead of the loop, I will do consecutive ASL, ASL, ASL.
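To make the pointer math concrete, here is a small Python model of both address calculations. The 8-bit version mirrors the split-byte work in CACALC (low byte from three shifts, the lost upper bits added to `chrvecH`). The 16-bit version is the single shift-and-add a 16-bit accumulator allows. The function names are made up, and the $CA00 base in the example is only an assumed value.

```python
def glyph_addr_8bit(base_high, ch):
    """8-bit 6502 style: build the 16-bit pointer one byte at a time."""
    low = (ch << 3) & 0xFF                 # three ASLs, bits pushed past bit 7 are lost
    high = (base_high + (ch >> 5)) & 0xFF  # five LSRs recover them, added to the high byte
    return (high << 8) | low

def glyph_addr_16bit(base, ch):
    """16-bit accumulator style: ASL, ASL, ASL, then add the base."""
    return (base + ((ch << 3) & 0xFFFF)) & 0xFFFF

# Largest character index whose x8 product still fits in 16 bits.
MAX_INDEX = 0xFFFF // 8

print(hex(glyph_addr_8bit(0xE0, 0x41)))    # 0xe208
print(hex(glyph_addr_16bit(0xCA00, 0x21))) # 0xcb08
print(MAX_INDEX)                           # 8191
```

So "up to 8192" works out to 8192 character codes, 0 through 8191, before the multiply overflows 16 bits.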
Posted: Tue Nov 29, 2011 12:50 am
by ElEctric_EyE
As I wrestle just a bit with indexing issues (my brain slowly flexes into software mode...), converting my original code from 8bit data - 16bit address to 16bit data - 32bit addresses, it occurs to me why some people here on this forum are keen on a contiguous 32bit structure, i.e. 32bit data and address buses. There would be no need for indirect indexed mode, which, while an asset for limited 8-bit machines and maybe also for 16-bit machines, is unnecessary for 32-bit machines (assuming 4GB is enough addressable memory?).
I cannot pursue this right now, however I would like to pursue this 65Org32 Core in due course, based on BigEd's original mod of Arlet's NMOS 6502 Core. The mods for 32bit operation should not be too difficult.
The reset/IRQ/NMI vectors would have to be adjusted for the 32bit structure optimally. Also, the upper 32bits of the 64bit address lines would have to be truncated, as there is no need for 2^64 memory yet. Also, eliminating all indirect indexed mode opcodes may actually raise the top speed of a 65Org32 Core...
In a software routine to plot pixels, if a 32-bit index register could be directly transferred to a 32-bit memory location, especially with a max of 4GB worth of pixels, I can see the value in it, most definitely. This would cover a large resolution when sending "bytes" to a display of the sort I am working on currently.
Also, I am trying to find a thread where kc5tja had posted his thoughts about extending the 6502 architecture, and very astutely pointed out that for 16bit byte and 32bit bytes the 6502extended CPUs would waste a certain percentage of bits during translation... Still trying to find the link to that thread.
In the meantime, this is another interesting thread.
Posted: Tue Nov 29, 2011 3:26 am
by GARTHWILSON
It seems like you would still need indirect indexed addressing. I don't think of it being tied to the fact that the data bus and registers are only half the width of the address bus. It is especially helpful for accessing data whose location is not known until the time the program is running. I suspect André Fachat made much use of it in his GeckOS for 6502 with multithreading and dynamic memory management. Hopefully he will see this and speak up. It would be nice to not have the base address limited to a 256-byte page; and having 32-bit everything means all 4 gigawords of address space are in zero page, so to speak.
Posted: Tue Nov 29, 2011 12:34 pm
by ElEctric_EyE
I see there would still need to be an indirect indexed mode (I'm done wrestling with this fact on a certain portion of my code, I realize I need it). But there would be an issue with indirect indexed if you truncate the upper 32 address bits and kept the NMI/RES/IRQ vectors @ $FFFFFFFA-$FFFFFFFF.
Maybe a program that didn't pay attention to this limitation would just wrap around the memory, which would be a tolerable side effect IMO.
EDIT: would not wouldn't
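A tiny Python sketch of the wrap-around being described, for a hypothetical 65Org32 whose address is truncated to 32 bits. Nothing here is implemented anywhere. It only assumes that pointer-plus-index is computed modulo 2^32, and the names are made up.

```python
ADDR_MASK = 0xFFFFFFFF  # assumption: addresses truncated to 32 bits

def indexed_address(pointer, index):
    """Effective address of pointer + index, wrapping at 2^32."""
    return (pointer + index) & ADDR_MASK

# A pointer near the vectors at $FFFFFFFA-$FFFFFFFF with a modest index
# lands back at the bottom of memory instead of in a higher 4GB bank.
print(hex(indexed_address(0xFFFFFFFA, 0x10)))  # 0xa
```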
Posted: Tue Nov 29, 2011 4:33 pm
by Arlet
With 32 bit operands, and a 32 bit address space, at least you could get rid of absolute and absolute-indexed modes.
Posted: Tue Nov 29, 2011 6:26 pm
by kc5tja
It seems like you would still need indirect indexed addressing.
Why?
Nearly all contemporary RISC processors make do with exactly two addressing modes: Ra+Rb and Ra+offset, of which Ra+offset is used far more often. Many RISC processors don't bother with implementing the Ra+Rb mode, such as MIPS.
The Ra+offset mode would translate into direct-page-indexed (e.g., dp,X or dp,Y) in the 65xx architecture. I'm curious to learn where the extra level of indirection would prove especially useful.
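To make the comparison concrete, here is a rough Python model of the two effective-address calculations on a hypothetical machine with 32-bit registers and addresses. The memory dictionary, the function names and the example addresses are all made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def ea_indirect_indexed(memory, dp, y):
    """(dp),Y: fetch a pointer stored at dp, then add Y."""
    return (memory[dp] + y) & MASK32

def ea_direct_indexed(dp, x):
    """dp,X: add X to the operand; with a 32-bit X this is Ra+offset."""
    return (dp + x) & MASK32

# A record whose location is only known at run time, and a field 8 words into it.
record = 0x00123400
memory = {0x10: record}  # pointer parked in memory for the (dp),Y case

print(hex(ea_indirect_indexed(memory, 0x10, 8)))  # pointer in memory, Y = field offset
print(hex(ea_direct_indexed(8, record)))          # pointer in X, operand = field offset
```

Both reach the same word. The difference is whether the run-time pointer lives in memory or in a register, which is where the question of the extra level of indirection comes in.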
Posted: Tue Nov 29, 2011 8:00 pm
by BigEd
I've started a new thread, as this seems like a good but separable discussion.
(Feel free to re-post in there, especially if my summary isn't adequate)
Cheers
Ed
Posted: Tue Nov 29, 2011 8:06 pm
by ElEctric_EyE
Thanks for doing that Ed! I am about to post some progress on some code I previously posted, which I modified and successfully tested for the 65Org16 today, but also did not want to take away from a useful discussion of the 65Org32. Now we all can continue!
Posted: Tue Nov 29, 2011 9:27 pm
by ElEctric_EyE
Today, I made some progress...
Besides just clearing the TFT screen with the 65Org16 Core, it is now running successfully @ 40MHz (TFT & FPGA)! I previously mentioned I was only able to run @24MHz max, and I also noticed 110mV of noise on the RESET circuit. I've since added an external 1K pull-up resistor, and this NHD640x480TFT is working at a more normal 40MHz max with the Spartan 6.
I've modified the 8bit PLTCHR routine I previously posted above to work with the 65Org16. This validates in my mind that this 16bit mod'd Core even works. I had some small doubts, very small, but all of the errors I've found so far have been my own programming errors. I am still modifying the ATTBUTE routine for some extra features of 16bits. But using the original ATTBUTE with the 16bit PLTCHR routine below, I have successfully used the 16bit CLRSCR and 16bit PLTCHR to plot different sizes of the 2 different character sets.
Sorry about the indentation discrepancy. I am a newbie to PSPAD text editor.
Code:
PLTCHR PHA ;Plot Character Subroutine variable (1-7) H and V size
TYA ;save all reg's
PHA
TXA
PHA
LDA #$2C ;Prepare TFT to Plot
STA DCOM
CACALC SEC
LDA CHR
SBC #$20 ;first 32 ascii characters "undefined"
BCS nnull
LDA #$00 ;make undefined char's a defined zero
nnull ASL A
ASL A
ASL A ;multiply character by 8 to get char pointer
CLC
ADC CHRVECL ;add pointer to base either CA00 (C64) or CD00(3x5)
STA CHRVECL
ninc LDY #$00 ;vertical bit count
loop7 LDX YWIDTH
loop4 TXA
PHA
CLC
LDX #$00 ;horizontal bit count
LDA (CHRVECL),y ;character data addr, $FFFFCA00(c64) or $FFFFCD00(3x5)
ASL A ;
ASL A ;
ASL A ;
ASL A ;
ASL A ;
ASL A ;
ASL A ;
ASL A ;shift out upper 8 bits, don't care for 8-bit byte character font
loop5 ASL A ;Shift left
BCC nonpxl ;branch on 'blank' bit
pxl2 PHA ;save character data
TXA
PHA
LDX XWIDTH
xwp LDA TMPCOL1
STA DDAT ;plot RED pixel TFT data
LDA TMPCOL2
STA DDAT ;plot GREEN pixel TFT data
LDA TMPCOL3
STA DDAT ;plot BLUE pixel TFT data
DEX
BNE xwp
PLA
TAX
PLA
INX
CPX CHRXLEN
BNE loop5 ;test for horizontal max
JMP next
nonpxl PHA ;save character data
TXA
PHA
LDX XWIDTH
xwnp LDA SCRCOL1
STA DDAT ;plot RED "blank" pixel TFT data
LDA SCRCOL2
STA DDAT ;plot GREEN "blank" pixel TFT data
LDA SCRCOL3
STA DDAT ;plot BLUE "blank" pixel TFT data
DEX
BNE xwnp
PLA
TAX
PLA ;reload character data to accumulator
INX
CPX CHRXLEN
BNE loop5
next PLA
TAX
DEX
BNE loop4
INY
CPY CHRYLEN ;test for vertical max
BNE loop7
PLA
TAX
PLA
TAY
PLA ;reload all reg's
RTS
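For readers who want to follow the loop nest without tracing the stack juggling, here is a rough Python model of what the 16-bit PLTCHR does with the font data. It assumes each font row sits in the low byte of a 16-bit word, as the eight `ASL A` instructions suggest. `True` stands for a pixel drawn in TMPCOL and `False` for one drawn in SCRCOL. The function names are made up.

```python
def expand_row(word, chrxlen, xwidth):
    """One font row: drop the upper byte, then shift out chrxlen bits, MSB first."""
    data = (word << 8) & 0xFFFF              # eight ASL A: upper byte is discarded
    pixels = []
    for _ in range(chrxlen):                 # loop5: one font bit per pass
        carry = (data >> 15) & 1             # ASL A moves the top bit into carry
        data = (data << 1) & 0xFFFF
        pixels.extend([bool(carry)] * xwidth)  # xwp / xwnp: repeat XWIDTH times
    return pixels

def expand_glyph(rows, chrxlen, xwidth, ywidth):
    """Whole character: each font row is sent YWIDTH times (loop4 inside loop7)."""
    out = []
    for word in rows:
        for _ in range(ywidth):
            out.append(expand_row(word, chrxlen, xwidth))
    return out

# Row 1010 from a 4-wide font at 2x horizontal scale.
print(expand_row(0x00A0, 4, 2))
```

The pixel stream comes out row by row, so the model relies on the display window having been set to the scaled character size beforehand, which the listing above does not show.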