6502.org Forum

All times are UTC

Posted: Sun May 12, 2024 6:01 pm
Joined: Mon Aug 30, 2021 11:52 am
Posts: 287
Location: South Africa
Yuri wrote:
On a lark I connected the RD/WR pins straight to my 139's PHI2 qualified RWb/WRb pins, and it just worked.....
Cool, I'm glad that worked. The problem with datasheets is that they tend to be (understandably) quite conservative. The suggested write hold after WRX rises is given as 15ns, and the '816 typically only holds data for about 10ns after PHI2 falls. However, I've noticed that signals often only need to be held for a few nanoseconds rather than what the datasheet recommends. But as always, the caveat remains: this is hobby land and going out of spec is probably fine, but maybe stay in spec if you're in industry.

Just for Ardis, if you haven't come across it yet. This is the '139 RWB to /RD /WD circuit Yuri mentioned:
[Attachment: Read Write 139.png]
It converts the 816's RWB signal into a pair that can be used with the Read and Write pins on an SRAM or such. You don't need the 74xx125 line driver if you don't need to respect BE. You also don't need the 74xx02 NOR gate if you're happy not turning off /RD and /WD when there is no valid address. But personally I would keep it.


Posted: Sun May 12, 2024 8:02 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8509
Location: Midwestern USA
AndrewP wrote:
This is the '139 RWB to /RD /WD circuit Yuri mentioned...

Qualification of RWB with VDA and VPA is unnecessary—doing so merely consumes silicon.  VDA and VPA are best used to qualify chip selects, especially for I/O hardware that may react badly to false address bus states that occur during so-called dead cycles.  My very first POC unit was a victim of address bus hinkiness until I bodge-wired VDA and VPA into I/O decoding.

The generation of /RD and /WD can be accomplished with a single 74xx00 NAND, e.g.:

[Attachment: read_write_qualify_alt.gif, "Qualified Read/Write w/74AC00"]

The above shows use of a 74AC00, but any of the 74-series CMOS types would be fine, e.g., 74AHC00.
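
Since the attachment is not reproduced here, the following is only a sketch of one common way to do this with NAND gates, and it is an assumption rather than a transcription of the schematic: one gate inverts RWB, and two more gates combine PHI2 with RWB and with the inverted RWB. The function names are made up for illustration. It just prints the truth table so the active-low behaviour can be checked.

```python
# Truth-table model of a NAND-only read/write qualifier.
# ASSUMPTION: gate arrangement is a common 74xx00 hookup, not taken from the
# attachment. Signals are modelled as 0/1; /RD and /WD are active low.

def nand(a, b):
    """Two-input NAND gate."""
    return 0 if (a and b) else 1

def qualify(phi2, rwb):
    """Return (/RD, /WD) for the given PHI2 and RWB levels."""
    not_rwb = nand(rwb, rwb)     # gate 1 wired as an inverter
    rd_n = nand(phi2, rwb)       # gate 2: low only when PHI2 high and reading
    wd_n = nand(phi2, not_rwb)   # gate 3: low only when PHI2 high and writing
    return rd_n, wd_n

if __name__ == "__main__":
    print("PHI2 RWB | /RD /WD")
    for phi2 in (0, 1):
        for rwb in (0, 1):
            rd_n, wd_n = qualify(phi2, rwb)
            print(f"   {phi2}   {rwb} |   {rd_n}   {wd_n}")
```

Both strobes stay high while PHI2 is low, so nothing is read or written during the half of the cycle when the address bus may still be settling.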

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Fri May 17, 2024 1:48 am
Joined: Thu Dec 07, 2023 2:30 am
Posts: 19
AndrewP wrote:
Throwing out a couple of thoughts:

As you're using an 8bit data bus I'd imagine the fastest way to communicate with an LCD would be via that 8bit bus. I know I've seen LCD displays that use a parallel interface but it's not something I've ever looked into so they might be quite expensive or just not match the 816's architecture nicely.

I bring up speed because even 320x240x16bpp @ 60fps is a lot of data for an '816 to handle unassisted.

Let's say you're running 14MHz at 320x200x8bpp. The '816 - using block move instructions - will just barely be able to draw an entire frame 30 times a second. Crossing to more than 320x200 (say 320x240x8) means that the video memory is going to be split over two banks. And that's adding more complicated calculations to a CPU that's already only barely keeping up.

All is not lost though. The 816's block move instruction takes 7 cycles per 8bit pixel whereas if you use a dedicated hardware blitter you could get that down to 2 (or even 1) cycles per pixel. You can also use a colour palette if you want to keep 16bit or even 24bit colour but also still keep a pixel width of only 8bits.
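
The numbers in the two paragraphs above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes the CPU does nothing but copy pixels, uses the 7 cycles per byte quoted for the block move, and treats the 2-cycle blitter as hypothetical. The function names are invented for the example.

```python
# Rough frame budget for copying a full frame with the CPU or a blitter.
# ASSUMPTION: no other CPU work, no wait states, one byte per pixel.

BANK_SIZE = 64 * 1024  # bytes in one '816 bank

def frames_per_second(clock_hz, width, height, cycles_per_pixel):
    """Full-frame copies per second at the given cost per pixel."""
    cycles_per_frame = width * height * cycles_per_pixel
    return clock_hz / cycles_per_frame

def frame_bytes(width, height, bytes_per_pixel=1):
    """Size of one frame in bytes."""
    return width * height * bytes_per_pixel

cpu_fps = frames_per_second(14_000_000, 320, 200, 7)   # block move
blit_fps = frames_per_second(14_000_000, 320, 200, 2)  # hypothetical blitter

if __name__ == "__main__":
    print(f"block move at 14MHz: {cpu_fps:.2f} fps")
    print(f"2-cycle blitter:     {blit_fps:.2f} fps")
    print("320x200x8 fits in one bank:", frame_bytes(320, 200) <= BANK_SIZE)
    print("320x240x8 fits in one bank:", frame_bytes(320, 240) <= BANK_SIZE)
```

320x200 is 64,000 bytes and just fits in a 65,536-byte bank, while 320x240 is 76,800 bytes and does not, which is where the bank-crossing overhead comes from.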

For a preliminary build I'd suggest 320x200x8 is a really good starting point.

Ardis wrote:
Makes the full 24-bit address bus of the 65C816 available to software developers (currently using the de-multiplex setup suggested in the 65C816 datasheet, but it might not be suitable for higher CPU speeds)
I wouldn't worry about the time it takes to capture the 816's bank address. Plenty of logic families have latches that have sub 10ns propagation times. I use LVC in a 3.3V setup but you could use AC or similar if you're running with a 5V setup. Probably AHC is what you're looking for if HC seems too slow.

Personally I fully intend to take Plasmo's 40MHz crown with a 50MHz '816 with the complete 24bit bus available. But that's just smack talk for now as I keep running into problems :lol:


Regarding the CPU usage for drawing to the LCD, I don't intend for the '816 to draw directly to the LCD. My plan is to have a dedicated graphics chip, acting similarly to the SNES PPU chips (maybe with a bit of SuperFX functionality if budget and battery life allows.) As I don't have the resources to do custom silicon, it will be on an FPGA, but I want to avoid having one so powerful that people repeatedly ask "why don't you just run your entire handheld on the FPGA?" to which I'd have to answer "it defeats the purpose of what I'm trying to do!"

The graphics will be tile based, much like a Game Boy, with the CPU only sending instructions to tell the graphics chip to move tiles around and modify them (load tiles, change palette, etc.) while the bulk of the graphical work is done on that chip. That should free up quite a few CPU cycles for game logic. The graphics chip would be drawing an image assembled in VRAM to the LCD.

Yuri wrote:
AndrewP wrote:
Personally I fully intend to take Plasmo's 40MHz crown with a 50MHz '816 with the complete 24bit bus available. But that's just smack talk for now as I keep running into problems :lol:


*wonders how the poor CPU doesn't just catch fire at that speed*

I'd like to do my own retro gaming system and I'm shooting for about the same resolution (320x240 @ 15bpp), however, my plan is to use an FPGA to develop something along the lines of an SNES PPU.

My sprites and tiles would likely all still use a single byte per pixel. (Being relatively small in size, I foresee a single byte for a whole tile/sprite selecting a palette bank)

This of course all works well if you're sending the pixel data down a VGA interface. (I think the OP was suggesting maybe doing that? Not sure how that works in a small LCD form factor.)

Of course the other part of what made the SNES really shine was that it had several different buses and a lot of different DMA operations that would go on during the blanking intervals of the picture.


That is what I'm hoping to do here, though I'm more familiar with the Game Boy architecture than I am the SNES. I should note that I do plan to have it draw pixel by pixel, scanline by scanline, so VBlank interrupts can be done.

I am open to suggestions for other LCDs if the ILI9341 is not suitable for the planned graphical implementation. The only part that is set in stone is a 65c816 in QFP (though I'll be using a DIP one for breadboard prototyping.)


Posted: Fri May 17, 2024 7:29 am
Joined: Tue Feb 28, 2023 11:39 pm
Posts: 257
Location: Texas
Ardis wrote:
I am open to suggestions for other LCDs if the ILI9341 is not suitable for the planned graphical implementation. The only part that is set in stone is a 65c816 in QFP (though I'll be using a DIP one for breadboard prototyping.)


Might need to do some research on that. Poking around on Digikey I found this:
https://www.digikey.com/en/products/det ... R/22531874

If I'm reading the specs right (and that's a BIG IF), then I believe it can do ~60 FPS when the dot clock is run at 8MHz. Looks like it has several modes of operation, but importantly it appears you can send 24-bits/pixel data on a single clock pulse. It also appears to have a mode where you can operate it on an 8-bit bus through the green channel pins, assuming the pins on your FPGA are at a premium.
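
As a sanity check on the dot-clock figure, here is a small sketch of the arithmetic. It assumes a 320x240 panel with one pixel per dot clock, which is the resolution discussed in this thread rather than something read from the linked part's datasheet, and the blanking numbers in the last example are placeholders, not real panel timings.

```python
# Dot clock versus frame rate for a parallel RGB panel.
# ASSUMPTION: 320x240 visible area, one pixel per dot clock.

def clocks_per_frame(dot_clock_hz, fps):
    """Dot clocks available for each frame at a target frame rate."""
    return dot_clock_hz / fps

def max_fps(dot_clock_hz, width, height, h_blank=0, v_blank=0):
    """Frame rate when every line and frame includes blanking clocks."""
    total_clocks = (width + h_blank) * (height + v_blank)
    return dot_clock_hz / total_clocks

if __name__ == "__main__":
    visible = 320 * 240
    budget = clocks_per_frame(8_000_000, 60)
    print(f"visible pixels per frame: {visible}")
    print(f"clocks per frame at 60 fps: {budget:.0f}")
    print(f"upper bound with no blanking: {max_fps(8_000_000, 320, 240):.1f} fps")
    # Placeholder blanking values, only to show how the rate drops.
    print(f"with 80/22 blanking: {max_fps(8_000_000, 320, 240, 80, 22):.1f} fps")
```

An 8MHz dot clock leaves more clocks per frame than there are visible pixels, so roughly 60 FPS is plausible once blanking intervals are added, but the real figure depends on the panel's timing table.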


Posted: Fri May 24, 2024 8:33 pm
Joined: Thu Dec 07, 2023 2:30 am
Posts: 19
Yuri wrote:
Ardis wrote:
I am open to suggestions for other LCDs if the ILI9341 is not suitable for the planned graphical implementation. The only part that is set in stone is a 65c816 in QFP (though I'll be using a DIP one for breadboard prototyping.)


Might need to do some research on that. Poking around on Digikey I found this:
https://www.digikey.com/en/products/det ... R/22531874

If I'm reading the specs right (and that's a BIG IF), then I believe it can do ~60 FPS when the dot clock is run at 8MHz. Looks like it has several modes of operation, but importantly it appears you can send 24-bits/pixel data on a single clock pulse. It also appears to have a mode where you can operate it on an 8-bit bus through the green channel pins, assuming the pins on your FPGA are at a premium.


I haven't yet settled on a specific FPGA, but I do need to make sure it has enough data and address pins to access VRAM, receive instructions from the CPU and send data to the LCD. I'm hoping to keep the design simple, so I am hoping to give the FPGA direct connections to everything it needs. Though I am curious about the feasibility of having a second 65c816 or a microcontroller with a 16-bit 65c816 as its core being set up as the graphics controller.

I can't find the specific article I found, but there was someone who was aiming for similar graphical performance (320x240@60fps with 16-bit color) using a Lattice iCE40HX series FPGA, so that is currently one of the options I'm considering.


Posted: Sat May 25, 2024 4:04 am
Joined: Tue Feb 28, 2023 11:39 pm
Posts: 257
Location: Texas
Ardis wrote:
I haven't yet settled on a specific FPGA, but I do need to make sure it has enough data and address pins to access VRAM, receive instructions from the CPU and send data to the LCD. I'm hoping to keep the design simple, so I am hoping to give the FPGA direct connections to everything it needs. Though I am curious about the feasibility of having a second 65c816 or a microcontroller with a 16-bit 65c816 as its core being set up as the graphics controller.

I can't find the specific article I found, but there was someone who was aiming for similar graphical performance (320x240@60fps with 16-bit color) using a Lattice iCE40HX series FPGA, so that is currently one of the options I'm considering.



I have not settled on one yet for my design either, though right now I'm using a Cyclone 5 (which seems to be the Cadillac of FPGAs)

As for using the 65816 as a graphics processor itself, that can be done. Super Mario RPG did this with the SNES where the cart had a second 65816 used just to do graphics processing. (Though that version of it had some additional niceties, like multiply and divide instructions)


Posted: Sat May 25, 2024 12:24 pm
Joined: Thu Dec 07, 2023 2:30 am
Posts: 19
Yuri wrote:
Ardis wrote:
I haven't yet settled on a specific FPGA, but I do need to make sure it has enough data and address pins to access VRAM, receive instructions from the CPU and send data to the LCD. I'm hoping to keep the design simple, so I am hoping to give the FPGA direct connections to everything it needs. Though I am curious about the feasibility of having a second 65c816 or a microcontroller with a 16-bit 65c816 as its core being set up as the graphics controller.

I can't find the specific article I found, but there was someone who was aiming for similar graphical performance (320x240@60fps with 16-bit color) using a Lattice iCE40HX series FPGA, so that is currently one of the options I'm considering.



I have not settled on one yet for my design either, though right now I'm using a Cyclone 5 (which seems to be the Cadillac of FPGAs)

As for using the 65816 as a graphics processor itself, that can be done. Super Mario RPG did this with the SNES where the cart had a second 65816 used just to do graphics processing. (Though that version of it had some additional niceties, like multiply and divide instructions)

I thought I had read somewhere that the PPU was built from a 65c816 as well, but with extra features added, but can't find the source to confirm it again.

Alternately, just finding a secondary CPU that has multiply and divide and using that to run the graphics could work. Having those might allow me to run some SuperFX-esque graphics, too.

I'm trying to avoid using an FPGA that is powerful enough to be the entire system as I feel like that would defeat the purpose. Any programmable logic chips I use would be used as stand-ins for custom chips that I cannot afford to design and manufacture, but I would want the functions they perform to be in line with what chips of the 16-bit era were capable of.


Posted: Sat May 25, 2024 6:27 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8545
Location: Southern California
Ardis wrote:
Yuri wrote:
Alternately, just finding a secondary CPU that has multiply and divide and using that to run the graphics could work. Having those might allow me to run some SuperFX-esque graphics, too.

If you can dedicate a couple of megabytes for tables (or less, if you don't need all the tables), you can use my large look-up tables for 16-bit scaled-integer math, and it will be like having a 16-bit fixed-point/scaled-integer math coprocessor.  (Do note that scaled-integer is a wider field than mere fixed-point.)  There's no interpolation involved, because all the answers are there, pre-calculated; so it's fast.  The routine for the '816 to get a sine for example, accurate to all 16 bits, takes 23 cycles (2.3µs @ 10MHz), or 35 cycles (3.5µs @ 10MHz) if you include the JSR & RTS pair, which is extremely fast compared to having to calculate the sine function with a lot of multiplications and divisions.  It's even far faster than using the CORDIC algorithm.  16 bits is plenty accurate for graphics for your display size.  http://wilsonminesco.com/16bitMathTables/ has the introduction article and links to the files holding the actual tables in Intel Hex format, and descriptions of how they were calculated and how to use them, and rational-number approximations, and how Intel Hex files work.  The server logs say this part of my site gets a lot of traffic, although it doesn't generate any discussion.

Functions included are (and I hope your monitor doesn't wrap these lines):
Code:
   file name    table size    comments
   SQUARE.HEX     256KB    partly for multiplication.  32-bit output
   INVERT.HEX     256KB    partly for division, to multiply by the inverse.  32-bit output.
   SIN.HEX        128KB    sines, also for cosines and tangents
   ASIN.HEX       128KB    arcsines, also for arccosines
   ATAN.HEX        64KB    ends at 1st cell of LOG2.HEX (next)
   LOG2.HEX       128KB    also for logarithms in other bases
   ALOG2.HEX      128KB    also for  antilogs  in other bases
   LOG2-A.HEX     128KB    logs of 1 to 1+65535/65536 (ie, 1.9999847), first range for LOG2(X+1) where X starts at 0
   ALOG2-A.HEX    128KB    antilogs of 0 to 65535/65536 (ie, .9999847), the first range for (2^x)-1
   LOG2-B.HEX     128KB    logs of 1 to 1+65535/1,048,576 (ie, 1.06249905), a 16x zoom-in range for LOG2(X+1)
   ALOG2-B.HEX    128KB    antilogs of 0 to 65535/1,048,576 (ie, .06249905), a 16x zoom-in range for (2^x)-1
   SQRT1.HEX       64KB    square roots,  8-bit truncated output
   SQRT2.HEX       64KB    square roots,  8-bit  rounded  output
   SQRT3.HEX      128KB    square roots, 16-bit  rounded  output
   BITREV.HEX     128KB    set of bit-reversing tables, up to 14-bit, particularly useful for FFTs
   BITREV15.HEX   128KB    15-bit bit-reversing table (not included in EPROM)
   MULT.HEX       128KB    multiplication table like you had in 3rd grade, but up to 255x255
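
To illustrate the general idea of a full-resolution lookup table (one pre-calculated answer for every possible 16-bit input, so no interpolation is needed), here is a minimal sketch. The scaling is an assumption made for this example only: the input treats 65536 steps as a full circle and the output is a signed value scaled by 32767. The actual tables on the linked site define their own formats, so check the descriptions there before relying on any of this.

```python
# Sketch of a 64K-entry, 16-bit sine table and its lookup.
# ASSUMPTION: input 0..65535 spans one full circle; output is sin * 32767.
import math

ENTRIES = 1 << 16  # one entry for every 16-bit input value

def build_sine_table():
    """Pre-calculate every answer once, as a ROM image would hold them."""
    return [round(math.sin(2.0 * math.pi * i / ENTRIES) * 32767)
            for i in range(ENTRIES)]

SINE = build_sine_table()

def sin16(angle16):
    """Sine by direct indexing: no multiplications at run time."""
    return SINE[angle16 & 0xFFFF]

def cos16(angle16):
    """Cosine from the same table, offset by a quarter circle."""
    return SINE[(angle16 + 0x4000) & 0xFFFF]

def cycles_to_us(cycles, clock_hz):
    """Convert a cycle count to microseconds at a given clock."""
    return cycles / clock_hz * 1e6

table_bytes = ENTRIES * 2  # two bytes per 16-bit entry

if __name__ == "__main__":
    print("table size:", table_bytes // 1024, "KB")
    print("sin16(0x4000) =", sin16(0x4000))
    print("lookup time at 10MHz:", round(cycles_to_us(23, 10_000_000), 1), "us")
```

65,536 entries of two bytes each come to 128KB, which matches the size listed for SIN.HEX above.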


_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Sun May 26, 2024 12:03 pm
Joined: Sun May 26, 2024 11:59 am
Posts: 1
Ardis wrote:
The reason I'm aiming for 60 frames per second is to have it be on par with other handhelds of the era like the Game Gear and Game Boy which both ran and drew at 60 frames per second. That said, the CPU won't be sending anything directly to the LCD. I intend to offload as much of the graphical processing onto the FPGA as possible (the CPU would just be telling the FPGA to move tiles around and read tiles from the ROM into VRAM for the most part.) I'm looking to make it treat graphical data similarly to how a Game Boy organizes it, though with more RAM allocated to it. I'm trying to avoid leaning too much on the FPGA, using it as a stand-in for a custom graphics chip I don't have the money to have built.

Can the SPI interface handle 320x240 with 16-bit color at 60 frames per second? If so, I'd rather use that than the 18-bit parallel RGB LCD I'm currently using.

I might use an Arduino LCD shield for testing, actually, because it would be easier to attach to a breadboard than the ribbon cable of the LCD I'm currently looking at. I have one lying around that I picked up years ago but never used. Just need to figure out how it works. I'm trying to avoid relying on things that are too modern like a MicroSD card. I intend to have a flash memory chip to hold the firmware since I have a GQ-4X programmer already.

The first phase was to get something working on a breadboard. Just a simple program that writes "Hello World" to the screen.

There will be voltage regulators because no combination of the parts I'm looking at settles on a single voltage, so I suspect I'll have voltage lines for 1.8V (or whatever the FPGA needs), 3.3V and 5V going through the board.


Given your goal of achieving 60 FPS to match handhelds like the Game Gear and Game Boy, and your strategy of offloading graphics processing to the FPGA, what specific graphical operations are you planning to handle with the FPGA? Additionally, how do you plan to manage synchronization between the CPU and FPGA for smooth frame updates?


Posted: Sun Jun 02, 2024 10:21 am
Joined: Sun Jul 11, 2021 9:12 am
Posts: 155
Ardis wrote:
I'm trying to avoid using an FPGA that is powerful enough to be the entire system as I feel like that would defeat the purpose. Any programmable logic chips I use would be used as stand-ins for custom chips that I cannot afford to design and manufacture, but I would want the functions they perform to be in line with what chips of the 16-bit era were capable of.


Personally, I wouldn’t get hung up on the thought of “a too powerful FPGA”. As soon as any gatekeeper sees mention of an FPGA, they’ll take aim at you. It’s a given.

I’d be more worried about getting your system up and running and then see what you can do to optimise, etc… You have a massive task ahead of you just to achieve what you want to do, so put your focus towards that.

Just my 2c.


Posted: Mon Jun 03, 2024 11:58 pm
Joined: Thu Dec 07, 2023 2:30 am
Posts: 19
AmbreeDrew wrote:
Given your goal of achieving 60 FPS to match handhelds like the Game Gear and Game Boy, and your strategy of offloading graphics processing to the FPGA, what specific graphical operations are you planning to handle with the FPGA? Additionally, how do you plan to manage synchronization between the CPU and FPGA for smooth frame updates?


The graphics processor would handle tasks like loading graphics from ROM into VRAM, altering tiles (position, palette, etc), moving them and drawing the composite image of the tile/sprite layers from VRAM onto the LCD.
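
For readers less familiar with tile-based video, the per-scanline work described above can be sketched in software. Everything about the layout here is an assumption for illustration (8x8 tiles, one byte per pixel, a single background layer, no sprites or scrolling); the names are hypothetical and this is not the planned hardware design.

```python
# Software model of composing one scanline from a tile map.
# ASSUMPTION: 8x8 tiles, 8bpp colour indices, one layer, no scrolling.

TILE_SIZE = 8

def render_scanline(y, tile_map, tiles, palette, width):
    """Return the colours for row y by walking the tile map left to right."""
    tile_row = y // TILE_SIZE   # which row of the tile map
    fine_y = y % TILE_SIZE      # which row inside each tile
    line = []
    for x in range(width):
        tile_index = tile_map[tile_row][x // TILE_SIZE]
        colour_index = tiles[tile_index][fine_y][x % TILE_SIZE]
        line.append(palette[colour_index])
    return line

# Tile 0 is blank, tile 1 is a checkerboard.
BLANK = [[0] * TILE_SIZE for _ in range(TILE_SIZE)]
CHECKER = [[(cx + cy) & 1 for cx in range(TILE_SIZE)] for cy in range(TILE_SIZE)]
TILES = [BLANK, CHECKER]
PALETTE = [0x0000, 0x7FFF]      # 15-bit colours: black, white
TILE_MAP = [[0, 1]]             # one row of two tiles = 16x8 pixels

if __name__ == "__main__":
    for y in range(TILE_SIZE):
        row = render_scanline(y, TILE_MAP, TILES, PALETTE, 16)
        print("".join("#" if c else "." for c in row))
```

Moving a tile only means changing one entry in the tile map, which is why the CPU can stay out of the per-pixel work.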

I was considering more advanced features like some kind of SuperFX-esque 3D graphics mode, but I might want to simplify this project instead unless it gets to the point where I can start trying to form a team around it and someone on this hypothetical team feels they are up to the task. It certainly sounds far beyond my knowledge right now and getting to that point would take a long time.

J64C wrote:
Ardis wrote:
I'm trying to avoid using an FPGA that is powerful enough to be the entire system as I feel like that would defeat the purpose. Any programmable logic chips I use would be used as stand-ins for custom chips that I cannot afford to design and manufacture, but I would want the functions they perform to be in line with what chips of the 16-bit era were capable of.


Personally, I wouldn’t get hung up on the thought of “a too powerful FPGA”. As soon as any gatekeeper sees mention of an FPGA, they’ll take aim at you. It’s a given.

I’d be more worried about getting your system up and running and then see what you can do to optimise, etc… You have a massive task ahead of you just to achieve what you want to do, so put your focus towards that.

Just my 2c.

Gatekeepers don't bother me. They can scoff at using an FPGA all they want, but I'd sooner use an FPGA than rip video chips out of other hardware that is no longer in production or try to source something that has been discontinued for 20 years.

The FPGA is a last resort if I can't use some other off the shelf solution to accomplish the video processor function with new chips (like a second 65C816 running different firmware.) However, if I do go the FPGA route, I do plan to design these in such a way that ordering an ASIC from this design is theoretically possible (even if prohibitively expensive at the volumes most hobbyists can afford.)


Posted: Tue Jul 30, 2024 11:14 am
Joined: Sun Jul 11, 2021 9:12 am
Posts: 155
How did you go? End up making a start on this?


Posted: Tue Jul 30, 2024 8:37 pm
Joined: Tue Feb 28, 2023 11:39 pm
Posts: 257
Location: Texas
Ardis wrote:
I thought I had read somewhere that the PPU was built from a 65c816 as well, but with extra features added, but can't find the source to confirm it again.


From what I've heard the original NES (and later SNES) PPU were originally based around the design of the TMS9918A (or one of the later revisions thereof)

Super Mario RPG added a second 65816 at 4x the clock or something to do graphics processing.

Quote:
Alternately, just finding a secondary CPU that has multiply and divide and using that to run the graphics could work. Having those might allow me to run some SuperFX-esque graphics, too.


The interesting thing about the SNES's PPU is it did some basic matrix math on 2D graphic planes. That's how it pulled off some of the early "3D" fx that it did.
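
The kind of 2D matrix math meant here can be sketched as a per-pixel affine transform: each screen coordinate is pushed through a 2x2 matrix around a centre point to find which texture pixel to fetch. The sketch below uses 8.8 fixed-point coefficients (256 = 1.0) as an assumption for illustration; it does not claim to reproduce the exact register behaviour of the SNES hardware.

```python
# Affine screen-to-texture mapping with an 8.8 fixed-point matrix.
# ASSUMPTION: generic fixed-point sketch, not exact SNES register semantics.

ONE = 256  # 1.0 in 8.8 fixed point

def affine_8_8(x, y, a, b, c, d, cx, cy):
    """Map screen (x, y) to texture (u, v) using matrix [[a, b], [c, d]]."""
    dx, dy = x - cx, y - cy
    u = ((a * dx + b * dy) >> 8) + cx
    v = ((c * dx + d * dy) >> 8) + cy
    return u, v

if __name__ == "__main__":
    # Identity matrix: the picture is unchanged.
    print(affine_8_8(10, 20, ONE, 0, 0, ONE, 0, 0))
    # Coefficients of 0.5 sample every other texel, i.e. a 2x zoom.
    print(affine_8_8(20, 0, ONE // 2, 0, 0, ONE // 2, 0, 0))
    # Quarter-turn rotation about the origin.
    print(affine_8_8(10, 0, 0, -ONE, ONE, 0, 0, 0))
```

Changing the matrix on every scanline is what turns a flat plane into a perspective-looking floor.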

The SuperFX chip expanded on that and did 3D matrix math in parallel. (What would later go on to become the modern day GPU)

Anyhow, getting back to the 65816 in the SNES. The Super Mario RPG version had extra instructions for multiply and divide which helped it along. Otherwise one of the common tricks with SNES programming was to make the PPU do the multiply for you because it had the ability to do that in hardware. But the caveat was that you couldn't be in Mode 7 when you did.



I came up with a hardware multiply that uses shift registers and the like and would complete in about 8 cycles. With FPGAs or even CPLDs these days you probably could just load up a quarter square table and it would be pretty darn quick.
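
For anyone who hasn't met the quarter-square trick: it relies on the identity a*b = floor((a+b)^2/4) - floor((a-b)^2/4), so a multiply becomes two table lookups and a subtraction. The sketch below models it for 8-bit operands; the table size and names are choices made for this example, not a description of any particular FPGA or CPLD implementation.

```python
# Quarter-square multiplication for unsigned 8-bit operands.
# a + b and a - b always share parity, so the two floor() errors cancel.

# floor(n*n/4) for n = 0..510 covers every possible sum of two bytes.
QUARTER_SQUARES = [(n * n) // 4 for n in range(511)]

def multiply_8x8(a, b):
    """Multiply two bytes using only lookups, an add and two subtracts."""
    assert 0 <= a <= 255 and 0 <= b <= 255
    return QUARTER_SQUARES[a + b] - QUARTER_SQUARES[abs(a - b)]

if __name__ == "__main__":
    print(multiply_8x8(12, 34))
    print(multiply_8x8(255, 255))
    # Exhaustive check over all 65,536 operand pairs.
    ok = all(multiply_8x8(a, b) == a * b
             for a in range(256) for b in range(256))
    print("all products correct:", ok)
```

The table needs only 511 entries of 16 bits each, which is small compared with a full 256x256 product table.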

