Posted: Tue Nov 22, 2011 6:17 pm
by ElEctric_EyE
Arlet wrote:
...if you want to clock external devices with the same clock as your internal clock, you need to bring that out with a DDR FF, and the rest of the signals with regular FFs.
Yes, I've done what you suggested and it appears to clear the screen MUCH faster than I've seen at similar 20MHz speeds on the Spartan 3. Maybe my timing was off using the Spartan 3? More experimentation will tell. But at this point I am able to change colors from green to blue (for some reason not red; red shows as blue) by changing variables in my clear screen routine.

I'll have to do more testing, it is not 100% yet. I am changing some other variables now like the length of bytes it sends to the TFT, so it should clear everything on the screen down to the last controlled block of data, to ensure the correct amount of data is being sent. Still testing...

Also, for some reason, when I try to do a SIM of the project, it gets stuck in a loop and I cannot halt it. I have remembered to comment out the 'ifdef SIM' & 'endif' statements when trying to run a SIM. Still testing...

Posted: Tue Nov 22, 2011 6:24 pm
by ElEctric_EyE
AH! Now I am in business with ISim! I forgot it does not synth the DCM without extra measures...

Also, Red data works. I must've forgotten to update the ROM in my excitement.

Focusing on the amount of data being sent right now; it seems to be exceeding the boundary of the display for some reason.

Posted: Tue Nov 22, 2011 8:05 pm
by ElEctric_EyE
It is working correctly after all! My timing must have been off in my old designs when I was experimenting with the Spartan 3.

I've modified the 8-bit CLRSCR to be 16-bit now. The speed at which the TFT is clearing @20MHz is like when I had it running @38MHz! The TFT is actually rated @50MHz, but with the 7.5ns prop delay of the Spartan 3 it could only run @38MHz. I'm sure with the Spartan 6 I can run reliably @40MHz.

Just FYI, ISE13.2 Timespec is showing a delay of O2Internal @13.9ns and top speed of 71.9MHz, with a 4Kx16 zero page, 4Kx16 stack, and 16Kx16 ROM. No other cores are present yet.

Also, I tried hooking the reset line on the MCP2200 (USB to UART) back up without the USB cord plugged in providing power, and it does indeed foul the Reset circuit. I will just tie it high with a pull-up and hope it works correctly when the time comes to test it.

Here's the new 16-bit TFT CLRSCR routine. I am unsure why I have to increase the 'Y' value to 5:

Code:

CLRSCR:	PHA
		TXA
		PHA
		TYA
		PHA
		LDA #$2A	;set x address
		STA DCOM
		LDA #$00	;start
		STA DDAT
		LDA #$00
		STA DDAT
		LDA #$02	;end
		STA DDAT
		LDA #$7F
		STA DDAT
	
		LDA #$2B	;set y address
		STA DCOM
		LDA #$00	;start
		STA DDAT
		LDA #$00
		STA DDAT
		LDA #$01	;end
		STA DDAT
		LDA #$DF
		STA DDAT
	
		LDA #$2C	;prepare to send display data
		STA DCOM
		LDY #$05	;5x65536 >= 640x480 = 307,200 pixels (4x65536 = 262,144 falls short)
AA:		LDX #$0000
AB:		LDA SCRCOL1
		STA DDAT
		LDA SCRCOL2
		STA DDAT
		LDA SCRCOL3
		STA DDAT
		DEX
		BNE AB
		DEY
		BNE AA
		PLA
		TAY
		PLA
		TAX
		PLA
DONE:		JMP DONE
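
On the question of why 'Y' has to be 5: a quick arithmetic check, written as a Python sketch rather than 65Org16 code. It assumes each pass of the inner loop (the three DDAT writes) sends exactly one pixel and that X is a full 16-bit counter, so `LDX #$0000` followed by `DEX`/`BNE` runs 65,536 times. The constant names are made up for illustration.

```python
import math

# Assumptions (not stated in the listing): one pixel per inner-loop pass,
# and a 16-bit X register that wraps from $0000 through $FFFF.
WIDTH, HEIGHT = 640, 480           # from the $027F (639) and $01DF (479) end addresses
PIXELS_PER_OUTER_PASS = 0x10000    # inner loop count when X starts at $0000

total_pixels = WIDTH * HEIGHT
outer_passes = math.ceil(total_pixels / PIXELS_PER_OUTER_PASS)
overshoot = outer_passes * PIXELS_PER_OUTER_PASS - total_pixels

print(total_pixels)   # pixels on the panel
print(outer_passes)   # value Y must hold to cover them all
print(overshoot)      # extra pixels sent past the end of the frame
```

Under those assumptions, four passes cover only 262,144 pixels, so the fifth pass is needed and the routine sends somewhat more data than one full frame.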
My current list of immediate priorities:
1) Try to increase O2 to 40MHz.
2) Add back in Arlet's improvements for his 6502 core, 1 mod at a time and retest the 65Org16.b core.

Posted: Tue Nov 22, 2011 9:31 pm
by ElEctric_EyE
While trying to approach higher speeds for O2, I noticed behavior would change when I put my finger close to the Reset button, which is on the opposite side of the DS1818. This is leading me, once again, to focus on the Reset circuit as correct operation was not observed above 24MHz... I guess I got lucky picking an arbitrary 20MHz starting point.

Right now I have no external pullup on the Reset line. Tomorrow I will experiment and add a 1K external pullup, as suggested on the DS1818 datasheet. Maybe retry connecting the unpowered MCP2200 Reset...

Posted: Wed Nov 23, 2011 4:22 pm
by ElEctric_EyE
After adding a 1K pull-up and reconnecting the MCP2200 Reset, the system seems to run properly at ~28MHz max ((100MHz/7)x2); anything faster and the TFT does not init properly...
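
For reference, the clock arithmetic in the post can be checked with a small Python sketch. It assumes the DCM output is simply the input clock times a multiplier over a divider; `clkfx_mhz` is a hypothetical helper name, not anything from the ISE tools.

```python
def clkfx_mhz(clkin_mhz, multiply, divide):
    """Synthesized clock in MHz, assuming f_out = f_in * multiply / divide."""
    return clkin_mhz * multiply / divide

# The setting quoted in the post: (100MHz / 7) x 2
print(round(clkfx_mhz(100, 2, 7), 2))

# One multiplier/divider pair that would give the 40MHz target
print(clkfx_mhz(100, 2, 5))
```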

Next I would like to add in the software that displays characters, and add in Arlet's UART core and the PS2 keyboard core. Then test 2 unidirectional terminals: the local keyboard plugged into the PS2, and the remote terminal from the PC going through the MCP2200 USB to UART utilizing the Br@y Terminal. I was able to get 12.5MHz to the MCP clk, close enough, although the baud rate may be skewed a bit. The Tx & Rx LEDs were lit...
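
A rough estimate of that baud skew, as a Python sketch. It assumes the MCP2200 expects a nominal 12MHz clock and that its baud rate scales linearly with the clock actually supplied; both are assumptions for illustration, and the names are hypothetical.

```python
NOMINAL_CLK_MHZ = 12.0   # assumption: the clock the MCP2200 is designed around
ACTUAL_CLK_MHZ = 12.5    # the clock mentioned in the post

skew_pct = (ACTUAL_CLK_MHZ / NOMINAL_CLK_MHZ - 1) * 100

def skewed_baud(nominal_baud):
    """Baud rate actually produced if it scales with the supplied clock."""
    return round(nominal_baud * ACTUAL_CLK_MHZ / NOMINAL_CLK_MHZ)

print(round(skew_pct, 2))
for baud in (9600, 19200, 115200):
    print(baud, skewed_baud(baud))
```

If the linear-scaling assumption holds, the error is a little over 4%, which is near the edge of what many UART receivers tolerate.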

Posted: Wed Nov 23, 2011 6:09 pm
by ElEctric_EyE
This bit of software is what I made back in May 2011 for the 6502SoC, to display a character with the PLTCHR routine when reading keypress data from the PS2 Core, no interrupts.
Before calling that routine, however, I made it so that I had to preset some variables using the ATTBUTE routine. Neat graphics stuff, I really enjoyed writing it...

With 8 more bits available, using the 65Org16 as the processor, I look forward to doing even more with this routine. I can definitely use different X and Y sizes, which I had originally sacrificed for other bit settings. I see, though, that most free bits will be devoted to color.
Attribute Routine (Call 1st)

Code:

ATTBUTE	LDA chrattr	;get color from bits 0,1,2,3
	AND #$0f	
	ASL
	ASL		;multiply by 4 for easy indexing
	TAX
	LDA coltable,x
	STA pxlcol1
	INX
	LDA coltable,x
	STA pxlcol2
	INX
	LDA coltable,x
	STA pxlcol3
		
	LDA chrattr	;check bits 4,5,6 for size
	AND #$70
	LSR
	LSR
	LSR
	LSR		;make size 1x through 7x, no size 0!	
	STA xwidth
	STA ywidth
	
	LDA chrattr	;check font bit 7: 1=3x5, 0=C64 (matches the BEQ n64 branch below)
	AND #$80
	CMP #$80
	BEQ n64
	LDA #$08	;C-64 8x8 Character set
	STA chrxlen
	STA chrylen
	LDA #$e0	;starting @ $e000
	STA chrvecH
	JMP porc
n64	LDA #$04	;3x5 character set
	STA chrxlen
	LDA #$05
	STA chrylen
	LDA #$e5	;starting @ $e500
	STA chrvecH
	
porc    LDA PE		;test PE for plot or clear
	BNE plot2
	LDA scrcol1
	STA tmpcol1
	LDA scrcol2
	STA tmpcol2
	LDA scrcol3
	STA tmpcol3
	RTS
plot2	LDA pxlcol1
	STA tmpcol1
	LDA pxlcol2
	STA tmpcol2
	LDA pxlcol3
	STA tmpcol3
	RTS
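
As a cross-check of the bit layout ATTBUTE works from, here is a Python sketch that decodes a `chrattr` byte the same way the listing does. The function and key names are hypothetical. The font choice follows the branch as coded (bit 7 set takes the `n64` path, i.e. the 3x5 set).

```python
def decode_chrattr(chrattr):
    """Split an 8-bit attribute byte into the fields ATTBUTE extracts."""
    colour_index = chrattr & 0x0F          # bits 0-3
    table_offset = colour_index << 2       # two ASLs: 4 table bytes per colour
    size = (chrattr & 0x70) >> 4           # bits 4-6, expected to be 1..7
    font = "3x5" if chrattr & 0x80 else "C64 8x8"   # bit 7, per the BEQ n64 branch
    return {
        "colour_index": colour_index,
        "table_offset": table_offset,
        "size": size,
        "font": font,
    }

print(decode_chrattr(0xB5))
```

Note that a size field of 0 is not rejected here, just as the listing does not reject it; the comment in the routine warns against using it.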
Plot Character (Call 2nd)

Code:

PLTCHR		LDA #$2c	;PLTCHR
		STA dcom
		PHA		;Plot Character Subroutine variable (1-7) H and V size
		TYA		;save all reg's
		PHA
		TXA
		PHA 		
		
CACALC		LDA chrvecH	;CACALC Character Address Calculate Subroutine
		PHA
		CLC		
		LDA chr		
		LDY #$03	
shift3		ASL		;multiply character by 8 to get char pointer LSB
		DEY		
		BNE shift3	
		STA chrvecL	
		CLC		
		LDA chr	
		LDY #$05	
shift4		LSR		
		DEY		
		BNE shift4	
		TAX		
		BEQ ninc	
cph2		INC chrvecH	
		DEX		
		BNE cph2	
ninc		LDY #$00	;vertical bit count
loop7		LDX ywidth	
loop4		TXA
		PHA

		CLC
		LDX #$00	;horizontal bit count
		LDA (chrvecL),y	;character data addr, $E000(c64) or $E500(3x5)
loop5		ASL		;Shift left
		BCC nonpxl	;branch on no bit
pxl2		PHA		;save character data
		TXA
		PHA
		LDX xwidth	
xwp		LDA tmpcol1	
		STA ddat	
		LDA tmpcol2	
		STA ddat	
		LDA tmpcol3	
		STA ddat
		DEX
		BNE xwp
		PLA
		TAX
		PLA
		INX
		CPX chrxlen	
		BNE loop5	;test for horizontal max
		
		JMP next
		
nonpxl		PHA		;save character data
		TXA
		PHA
		LDX xwidth
xwnp		LDA scrcol1	;load RED background color
		STA ddat	
		LDA scrcol2	;load GREEN background color
		STA ddat	
		LDA scrcol3       ;load BLUE background color
		STA ddat
		DEX
		BNE xwnp
		PLA
		TAX
		PLA		;reload character data to accumulator
		INX		
		CPX chrxlen	
		BNE loop5	
		
next		PLA		
		TAX
		DEX		
		BNE loop4	
		INY		
		CPY chrylen	
		BNE loop7	
		
		PLA		;reset character pointers
		STA chrvecH	
		LDA #$00	
		STA chrvecL	
		PLA		
		TAX
		PLA		;reload all reg's
		TAY
		PLA		
		RTS

Posted: Sat Nov 26, 2011 8:34 pm
by ElEctric_EyE
I'm in the middle of converting the PLTCHR routine. Immediately I see I can dispose of the byte manipulations for the CACALC routine, where I have to multiply an input character by 8 in order to index the pixel data. With the 65Org16, I can multiply any character code below 8192 by 8 in a single 16-bit word, with no separate high-byte adjustment. I won't be approaching anywhere near 8K here though. If I didn't need the precious block RAM, I would consider using a 16x16 character set. Maybe in time I could store it in FLASH, when I get the SPI core working.

EDIT: Also I see some corrections I need to make on the original 6502 PLTCHR routine.
First mistake at the very beginning, where an LDA/STA was erroneously done to the TFT register before pushing Accumulator, X, & Y values onto the stack.

Second mistake is using a BNE loop to do a x8 multiply in a graphics routine. It needs to be as fast as possible, so instead of the loop, I will do consecutive ASL, ASL, ASL.
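
To make the CACALC arithmetic concrete, here is a Python sketch of both cases: the 8-bit version, which splits character x 8 into a low byte and a page increment, and the 16-bit version, where the product fits in one word. The function names are hypothetical, and the $E0 base page is taken from the 8-bit ATTBUTE listing.

```python
def glyph_address_8bit(base_hi, chr_code):
    """8-bit CACALC: low byte from three ASLs, page count from five LSRs."""
    lo = (chr_code << 3) & 0xFF                # carry out of the byte is discarded
    hi = (base_hi + (chr_code >> 5)) & 0xFF    # INC chrvecH once per 256 bytes
    return (hi << 8) | lo

def glyph_offset_16bit(chr_code):
    """16-bit case: three ASLs on a 16-bit accumulator, no page fix-up."""
    return (chr_code << 3) & 0xFFFF

# Character $41 in the font at $E000: both forms agree with base + 8 * code
print(hex(glyph_address_8bit(0xE0, 0x41)))
print(hex(0xE000 + glyph_offset_16bit(0x41)))

# The single-word product holds up to code 8191; 8192 wraps to zero
print(glyph_offset_16bit(8191), glyph_offset_16bit(8192))
```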

Posted: Tue Nov 29, 2011 12:50 am
by ElEctric_EyE
As I wrestle just a bit with indexing issues (my brain slowly flexes into software mode...), converting my original code from 8bit data - 16bit address to 16bit data - 32bit address, it occurs to me why some people here on this forum are keen on a contiguous 32bit structure, i.e. 32bit data and address buses. There would be no need for indirect indexed mode, which, while an asset for limited 8-bit machines and maybe also for 16-bit machines, is unnecessary for 32-bit machines (assuming 4GB is enough addressable memory?).

I cannot pursue this right now; however, I would like to pursue this 65Org32 Core in due course, based on BigEd's original mod of Arlet's NMOS 6502 Core. The mods for 32bit operation should not be too difficult.

The reset/IRQ/NMI vectors would have to be adjusted for the 32bit structure, optimally. Also, the upper 32bits of the 64bit address lines would have to be truncated, as there is no need for 2^64 memory yet. Also, eliminating all indirect indexed mode opcodes may actually raise the top speed of a 65Org32 Core...

In a software routine to plot pixels, if a 32-bit index register could be directly transferred to a 32-bit memory location, especially with a max of 4GB worth of pixels, I can see the value in it, most definitely. This would cover a large resolution when sending "bytes" to a display of the sort I am working on currently.

Also, I am trying to find a thread where kc5tja had posted his thoughts about extending the 6502 architecture, and very astutely pointed out that for 16bit bytes and 32bit bytes the extended 6502 CPUs would waste a certain percentage of bits during translation... Still trying to find the link to that thread.

In the meantime, this is another interesting thread.

Posted: Tue Nov 29, 2011 3:26 am
by GARTHWILSON
It seems like you would still need indirect indexed addressing. I don't think of it being tied to the fact that the data bus and registers are only half the width of the address bus. It is especially helpful for accessing data whose location is not known until the time the program is running. I suspect André Fachat made much use of it in his GeckOS for 6502 with multithreading and dynamic memory management. Hopefully he will see this and speak up. It would be nice to not have the base address limited to a 256-byte page; and having 32-bit everything means all 4 gigawords of address space are in zero page, so to speak.

Posted: Tue Nov 29, 2011 12:34 pm
by ElEctric_EyE
I see there would still need to be an indirect indexed mode (I'm done wrestling with this fact on a certain portion of my code; I realize I need it). But there would be an issue with indirect indexed if you truncated the upper 32 address bits and kept the NMI/RES/IRQ vectors @ $FFFFFFFA-$FFFFFFFF.
Maybe a program that didn't pay attention to this limitation would just wrap around the memory, which would be a tolerable side effect IMO.

EDIT: would not wouldn't

Posted: Tue Nov 29, 2011 4:33 pm
by Arlet
With 32 bit operands, and a 32 bit address space, at least you could get rid of absolute and absolute-indexed modes.

Posted: Tue Nov 29, 2011 6:26 pm
by kc5tja
GARTHWILSON wrote:
It seems like you would still need indirect indexed addressing.
Why?

Nearly all contemporary RISC processors make do with exactly two addressing modes: Ra+Rb and Ra+offset, of which Ra+offset is used far more often. Many RISC processors, such as MIPS, don't bother implementing the Ra+Rb mode.

The Ra+offset mode would translate into direct-page-indexed (e.g., dp,X or dp,Y) in the 65xx architecture. I'm curious to learn where the extra level of indirection would prove especially useful.
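
To make the two modes under discussion concrete, here is a Python sketch of how each forms its effective address. It is a simplified model with hypothetical function names: `ea_indirect_indexed` follows the 8-bit 6502 `(zp),Y` rule, and `ea_reg_offset` models a register wide enough to hold a whole address.

```python
def ea_indirect_indexed(mem, zp, y):
    """(zp),Y on an 8-bit 6502: fetch a 16-bit pointer from zero page, add Y."""
    ptr = mem[zp] | (mem[(zp + 1) & 0xFF] << 8)
    return (ptr + y) & 0xFFFF

def ea_reg_offset(ra, offset, width_bits=32):
    """Ra+offset: a full-width register plus a constant, no memory fetch."""
    return (ra + offset) & ((1 << width_bits) - 1)

# Pointer $E200 stored at zero-page locations $10/$11, index 3
mem = {0x10: 0x00, 0x11: 0xE2}
print(hex(ea_indirect_indexed(mem, 0x10, 3)))

# The same target when the pointer already sits in a wide register
print(hex(ea_reg_offset(0xE200, 3)))
```

Both reach the same address. The difference is whether the run-time pointer lives in memory, costing an extra fetch, or in a register.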

Posted: Tue Nov 29, 2011 8:00 pm
by BigEd
I've started a new thread, as this seems like a good but separable discussion.

(Feel free to re-post in there, especially if my summary isn't adequate)

Cheers
Ed

Posted: Tue Nov 29, 2011 8:06 pm
by ElEctric_EyE
Thanks for doing that Ed! I am about to post some progress on some code I previously posted, which I modified and successfully tested for the 65Org16 today, but also did not want to take away from a useful discussion of the 65Org32. Now we all can continue!

Posted: Tue Nov 29, 2011 9:27 pm
by ElEctric_EyE
Today, I made some progress...

For starters, the 65Org16 Core is clearing the TFT screen successfully @40MHz (TFT & FPGA)! I previously mentioned that I was only able to run @24MHz max, and I had also noticed 110mV of noise on the RESET circuit. I've since added an external 1K pull-up resistor, and this NHD640x480TFT is working at a more normal 40MHz max with the Spartan 6.

I've modified the 8bit PLTCHR routine I previously posted above to work with the 65Org16. This validates in my mind that this 16bit mod'd Core even works. I had some small doubts, very small, but all of the errors I've found so far have been my own programming errors. I am still modifying the ATTBUTE routine for some extra features of 16bits. But using the original ATTBUTE with the 16bit PLTCHR routine below, I have successfully used the 16bit CLRSCR and 16bit PLTCHR to plot different sizes of the 2 different character sets.

Sorry about the indentation discrepancy. I am a newbie to PSPAD text editor.

Code:

PLTCHR      PHA             ;Plot Character Subroutine variable (1-7) H and V size
            TYA             ;save all reg's
            PHA
            TXA
            PHA
            LDA #$2C        ;Prepare TFT to Plot
            STA DCOM

CACALC      SEC
            LDA CHR
            SBC #$20        ;first 32 ascii characters "undefined"
            BCS nnull
            LDA #$00        ;make undefined char's a defined zero
nnull       ASL A
            ASL A
            ASL A           ;multiply character by 8 to get char pointer
            CLC
            ADC CHRVECL     ;add pointer to base either CA00 (C64) or CD00(3x5)
            STA CHRVECL
ninc        LDY #$00        ;vertical bit count
loop7       LDX YWIDTH
loop4       TXA
            PHA

            CLC
            LDX #$00        ;horizontal bit count
            LDA (CHRVECL),y ;character data addr, $FFFFCA00(c64) or $FFFFCD00(3x5)
            ASL A
            ASL A
            ASL A
            ASL A
            ASL A
            ASL A
            ASL A
            ASL A           ;shift out upper 8 bits, don't care for 8-bit byte character font
loop5       ASL A           ;Shift left
            BCC nonpxl      ;branch on 'blank' bit
pxl2        PHA             ;save character data
            TXA
            PHA
            LDX XWIDTH
xwp         LDA TMPCOL1
            STA DDAT        ;plot RED pixel TFT data
            LDA TMPCOL2
            STA DDAT        ;plot GREEN pixel TFT data
            LDA TMPCOL3
            STA DDAT        ;plot BLUE pixel TFT data
            DEX
            BNE xwp
            PLA
            TAX
            PLA
            INX
            CPX CHRXLEN
            BNE loop5       ;test for horizontal max
            JMP next

nonpxl      PHA             ;save character data
            TXA
            PHA
            LDX XWIDTH
xwnp        LDA SCRCOL1
            STA DDAT        ;plot RED "blank" pixel TFT data
            LDA SCRCOL2
            STA DDAT        ;plot GREEN "blank" pixel TFT data
            LDA SCRCOL3
            STA DDAT        ;plot BLUE "blank" pixel TFT data
            DEX
            BNE xwnp
            PLA
            TAX
            PLA             ;reload character data to accumulator
            INX
            CPX CHRXLEN
            BNE loop5

next        PLA
            TAX
            DEX
            BNE loop4
            INY
            CPY CHRYLEN     ;test for vertical max
            BNE loop7

            PLA
            TAX
            PLA
            TAY
            PLA             ;reload all reg's
            RTS
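
For anyone following the loop structure, here is a Python model of what PLTCHR streams to the display. It is a sketch under assumptions: each pixel is represented by a single colour value (the routine itself sends three DDAT writes per pixel), font rows are 8-bit values scanned MSB first, and the width/height scale factors are at least 1. The function name and arguments are hypothetical.

```python
def plot_glyph(rows, chr_x_len, x_width, y_width, fg, bg):
    """Return the pixel stream in the order the nested loops would send it."""
    stream = []
    for row_bits in rows:                   # one font row per Y step (CPY CHRYLEN)
        for _ in range(y_width):            # repeat the row to scale vertically (loop4)
            for bit in range(chr_x_len):    # scan bits MSB first (loop5)
                lit = (row_bits >> (7 - bit)) & 1
                # repeat the pixel to scale horizontally (xwp / xwnp)
                stream.extend([fg if lit else bg] * x_width)
    return stream

# One row %1010xxxx of a 4-wide font, doubled horizontally
print("".join(plot_glyph([0b10100000], 4, 2, 1, "F", ".")))
```

The stream length is rows x y_width x chr_x_len x x_width pixels, which gives a quick way to estimate how long a character takes to plot at a given size.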