PostPosted: Tue Feb 01, 2022 7:19 pm 
Offline

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
CountChocula wrote:
I have actually thought about this… I was thinking of adding a separate counter for the memory, which I think might also be used to prevent any memory from being used during the blanking interval (as I could just disable the counters when /BLANK is low). This opens up some interesting opportunities, because it would allow 320x200 to fit in a single 64kB chip. Have you tried anything similar? I worry about keeping the two counters properly synchronized.

I did this in one of my earlier circuits, that was a lower-frequency PAL one - so I could output 640x256 with the framebuffer embedded in 32K of general purpose RAM, like the BBC Micro used to do. It also allows for hardware scrolling as you just initialise the counters to some non-zero value.

Chad pointed out though that this doesn't work as well when double-scanning - I'd forgotten about that, oops! 256x240 wouldn't suffer though as the horizontal width is a power of two - you wouldn't need separate visible-pixel counters for that one. It is very appealing.
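To make the memory arithmetic concrete, here is a minimal sketch comparing the two addressing schemes discussed above. It assumes 8 bits per pixel (suggested by the "320x240x8" naming later in the thread); the function names are illustrative only.

```python
def packed_bytes(width, height, bits_per_pixel):
    """RAM needed when a separate memory counter steps only through visible pixels."""
    return width * height * bits_per_pixel // 8


def padded_bytes(width, height, bits_per_pixel):
    """RAM needed when each row starts on a power-of-two boundary (X/Y counters used directly)."""
    row = width * bits_per_pixel // 8
    stride = 1 << (row - 1).bit_length()  # round the row length up to a power of two
    return stride * height


packed_320x200 = packed_bytes(320, 200, 8)   # 64000, fits in 65536
padded_320x200 = padded_bytes(320, 200, 8)   # 512 * 200 = 102400, does not fit
packed_256x240 = packed_bytes(256, 240, 8)   # 61440
padded_256x240 = padded_bytes(256, 240, 8)   # 61440, no padding because 256 is a power of two
```

With a 256-pixel row the two figures are identical, which is why that mode would not need separate visible-pixel counters.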

Quote:
Also—I hear you on the 640x400 woes. My monitor handles it, but it thinks it's a much weirder resolution (like, 720x480, which isn't even a real VGA resolution).

Yep, I think all my monitors said 720x400, kind of a widescreen resolution. It worked OK, but the pixel frequency is wrong, which can lead to some blurring or inconsistent pixel widths when the LCD display quantizes the pixels.
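A rough illustration of why a monitor can mistake one mode for the other: the figures below are the standard VGA dot clocks and line totals (25.175 MHz with 800 clocks per line for 640-wide modes, 28.322 MHz with 900 clocks per line for 720-wide text mode), not numbers taken from this thread. Both give almost exactly the same horizontal frequency, so the sync signals alone cannot tell the monitor which pixel clock is in use.

```python
def line_rate_hz(pixel_clock_hz, clocks_per_line):
    """Horizontal scan frequency for a given dot clock and total line length."""
    return pixel_clock_hz / clocks_per_line


rate_640 = line_rate_hz(25_175_000, 800)   # 31468.75 Hz
rate_720 = line_rate_hz(28_322_000, 900)   # about 31468.9 Hz
difference_hz = abs(rate_640 - rate_720)   # well under 1 Hz
```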

Quote:
I was actually thinking about this last night after I wrote my post… I did verify that the reset bit is being flipped on the last pixel, but it's occurred to me that the sync ROM operates at 1/8th the clock speed, which means that, as far as the monitor is concerned, the count restarts seven pixels too early, so I wonder whether that might be the problem. I will try to reset at the beginning of the lines instead and see if being one pixel too late has a similar effect in the other direction.

Yes, that is tricky. Maybe this only works well for me because I am actually using asynchronous counters. Buffering the EPROM outputs through another 74HC574 may also affect this as it adds a layer of delay to the reset signal.

Quote:
I have had a bit of trouble sourcing 74AC163s, btw… very hard to find unless you want to buy 1,200 of them :-)

I'm sure I could find a use for them all. e.g. six counters per sprite, 200 sprites?!

Quote:
Yes, I figure it's something like that… I'm going to hook it up to the 'scope and trigger it on the transition of the /OE lines between the counter buffers and the PIC latches to see what happens on the bus… it might be that I'm underestimating the time it takes for the signals to stabilize.

The ICs will specify some constraints, assuming a certain amount of bus capacitance, but especially on a breadboard you're likely to have a lot more capacitance and also inductance if you use long loopy wires.

I'd expect that if you start WE too early you might write to the wrong address first; if you end it too late you'd also write to the wrong address, and may also cause temporary corruption of the visual display ("snow" while writing is occurring, that goes away after the writing finishes). If the duration of WE is too short, it may fail to write at all, or write to the right address but with the wrong value. It's possible that you can tweak the timing and see which of these effects happen. It's tricky though to accurately adjust the timing at this frequency using discrete logic. I recently switched to a simple PLD (ATF16V8) and of course it makes things much easier.

Quote:
I remember watching that video! I think I may have to do the same thing (crazy to think that there isn't any kind of readily-available software that will do these kind of simulations automatically… you'd think that by now this kind of functionality would be fairly routine.)

There probably is good software for this, I just don't know what it is :)


PostPosted: Tue Feb 01, 2022 7:31 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8505
Location: Midwestern USA
CountChocula wrote:
As for the '163s… as I mentioned, I can't seem to find any AC ones, though I just bought some 'AC161s, so maybe I'll experiment with those. FWIW, the frequency is within spec from what the datasheet says (Nexperia quotes minimum 27MHz at 4.5V, and typical 51MHz at 5V), so hopefully it's not too big a deal.

Have you looked here?

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Last edited by BigDumbDinosaur on Tue Feb 01, 2022 7:50 pm, edited 1 time in total.

PostPosted: Tue Feb 01, 2022 7:43 pm 
Offline
User avatar

Joined: Sun Nov 07, 2021 4:11 pm
Posts: 101
Location: Toronto, Canada
BigDumbDinosaur wrote:
CountChocula wrote:
As for the '163s… as I mentioned, I can't seem to find any AC ones, though I just bought some 'AC161s, so maybe I'll experiment with those. FWIW, the frequency is within spec from what the datasheet says (Nexperia quotes minimum 27MHz at 4.5V, and typical 51MHz at 5V), so hopefully it's not too big a deal.

Have you looked here and here?


Hey BDD! I did, but unfortunately Mouser's shipping costs to Canada are really high—it's $20 unless you spend $100, and I don't need that many '163s right now :-)


PostPosted: Tue Feb 01, 2022 7:51 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8505
Location: Midwestern USA
CountChocula wrote:
BigDumbDinosaur wrote:
CountChocula wrote:
As for the '163s… as I mentioned, I can't seem to find any AC ones, though I just bought some 'AC161s, so maybe I'll experiment with those. FWIW, the frequency is within spec from what the datasheet says (Nexperia quotes minimum 27MHz at 4.5V, and typical 51MHz at 5V), so hopefully it's not too big a deal.

Have you looked here and here?


Hey BDD! I did, but unfortunately Mouser's shipping costs to Canada are really high—it's $20 unless you spend $100, and I don't need that many '163s right now :-)

Digi-Key lists the part in SOIC. Dunno if you can work with that, but they are readily available.



PostPosted: Tue Feb 01, 2022 7:54 pm 
Offline
User avatar

Joined: Sun Nov 07, 2021 4:11 pm
Posts: 101
Location: Toronto, Canada
gfoot wrote:
Chad pointed out though that this doesn't work as well when double-scanning - I'd forgotten about that, oops! 256x240 wouldn't suffer though as the horizontal width is a power of two - you wouldn't need separate visible-pixel counters for that one. It is very appealing.


Interesting… what I was envisioning was separate pixel counters for horizontal and vertical coordinates—e.g. 2 '163s to count 9 bits' worth of X coordinates, and a '590 to count 8 bits of Y coordinates, with the ROM outputting separate HRESET and VRESET signals. This would not fit in 64kB of RAM, but it would allow for 240px of vertical resolution.

Come to think of it, if I used a 128kB ROM, I might actually manage to make this work without the need for two separate sets of counters. Hmmmm, interesting……
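A quick sketch of the address arithmetic behind the separate X/Y counter idea. The bit layout (Y in the high bits, X in the low 9 bits) is an assumption made for illustration; the thread does not say how the counter outputs are wired to the RAM.

```python
X_BITS = 9  # enough for 0..319
Y_BITS = 8  # enough for 0..239


def ram_address(x, y):
    """Concatenate the Y and X counter outputs into one RAM address (assumed layout)."""
    return (y << X_BITS) | x


highest = ram_address(319, 239)          # last visible pixel
address_bits = highest.bit_length()      # 17 address lines
fits_in_64k = highest < 64 * 1024        # False
fits_in_128k = highest < 128 * 1024      # True
```

Seventeen address bits is why this arrangement overflows a 64kB part but sits comfortably in 128kB.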

Quote:
Quote:
I have had a bit of trouble sourcing 74AC163s, btw… very hard to find unless you want to buy 1,200 of them :-)

I'm sure I could find a use for them all. e.g. six counters per sprite, 200 sprites?!


Sure… but where do we get the power plant required to power the circuit? ;-)

Quote:
Quote:
Yes, I figure it's something like that… I'm going to hook it up to the 'scope and trigger it on the transition of the /OE lines between the counter buffers and the PIC latches to see what happens on the bus… it might be that I'm underestimating the time it takes for the signals to stabilize.

I recently switched to a simple PLD (ATF16V8) and of course it makes things much easier.


TBH, I've been resisting using programmable logic… I was experimenting with it when Chad and I first talked about our respective circuits, but I really don't like dealing with WinCUPL!


PostPosted: Tue Feb 01, 2022 7:57 pm 
Offline
User avatar

Joined: Sun Nov 07, 2021 4:11 pm
Posts: 101
Location: Toronto, Canada
BigDumbDinosaur wrote:
CountChocula wrote:
BigDumbDinosaur wrote:
CountChocula wrote:
As for the '163s… as I mentioned, I can't seem to find any AC ones, though I just bought some 'AC161s, so maybe I'll experiment with those. FWIW, the frequency is within spec from what the datasheet says (Nexperia quotes minimum 27MHz at 4.5V, and typical 51MHz at 5V), so hopefully it's not too big a deal.

Have you looked here and here?


Hey BDD! I did, but unfortunately Mouser's shipping costs to Canada are really high—it's $20 unless you spend $100, and I don't need that many '163s right now :-)

Digi-Key lists the part in SOIC. Dunno if you can work with that, but they are readily available.


I noticed that… I wonder if I could wire up an adapter for the breadboard, like I did with the RAM. Thanks for the pointer!


PostPosted: Fri Feb 04, 2022 3:49 am 
Offline
User avatar

Joined: Sun Nov 07, 2021 4:11 pm
Posts: 101
Location: Toronto, Canada
Hey folks—

I've managed to redesign the interface so that it provides 320x240 resolution natively, and fix all the bugs that I had.

In the end, this took relatively few changes to the circuit, without increasing the chip count. All I did was to use a separate set of counters for the X and Y coordinates, which requires a larger signal EEPROM but about the same amount of RAM as before. I also interfaced the PIC directly with the RAM's data bus, which makes both reads and writes possible.

Switching to '161 counters fixed the distortion at the top of the frame (thanks, George, for the suggestion!), and the spurious pixels turned out to be a problem with power, rather than timing. I added a 4.7µF cap near the SRAM chip, and they went away (they still occasionally occur when the PIC programmer is connected to the breadboard, but not when the circuit is freestanding).

Here's the new diagram (I've attached both colour and b/w versions below):

[Attachment: VGA-320x240x8.png, 1.28 MiB]


I also had a chance to add a couple more primitives to the graphics library, as well as the ability to draw text using a (pretty terrible-looking) 5x8 bitmap font.

Here's the obligatory demo (which definitely looks better on video):

[Attachment: 320x240demo.gif, 33.68 MiB]


I think this is good enough for a PCB so that I can start working out an interface with my SBC. There are still some substantial optimizations that I can make to the pixel rendering algorithm; if I run the demo without syncing the PIC with the blanking interval, you can see just how much faster it can run, though at the cost of some artifacts (other than the sync, this is the same exact code as above):

[Attachment: 320x240_unhinged.gif, 10.48 MiB]


In any case, I've also become convinced that, at some point or other, I will need to add a text-only mode, because the graphics are cool, but this would be unusable with any kind of textual data.

Cheers!


—Marco


Attachments:
VGA-320x240x8-colour.pdf [357.79 KiB]
VGA-320x240x8-bw.pdf [352.29 KiB]
PostPosted: Fri Feb 04, 2022 8:06 am 
Offline

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
Very nice. In the setup where you don't wait for the blanking, how do you instead synchronise the PIC and the OE signals of the address latches? The artifacts on the screen shouldn't appear if you time the writes to occur when CX1 is low - this is how my circuits all work, as the 6502's clock is linked to the equivalent of CX1.


PostPosted: Fri Feb 04, 2022 2:33 pm 
Offline
User avatar

Joined: Sun Nov 07, 2021 4:11 pm
Posts: 101
Location: Toronto, Canada
gfoot wrote:
Very nice. In the setup where you don't wait for the blanking, how do you instead synchronise the PIC and the OE signals of the address latches? The artifacts on the screen shouldn't appear if you time the writes to occur when CX1 is low - this is how my circuits all work, as the 6502's clock is linked to the equivalent of CX1.


Thanks, George. The PIC currently runs on its internal 64MHz oscillator, and is completely decoupled from the video circuit's clock. Thus, in “speedy” mode, the PIC just takes control of the bus whenever it wants to, which could be (and usually is) smack in the middle of a scanline—hence all the snow. The monitor doesn't lose sync because the /OE line of U10 is tied to the BLANK signal coming from the ROM, and so the monitor doesn't have to put up with any of the PIC's shenanigans during the blanking intervals.

Even if I tied the PIC to CX1, I suspect that it would take a bit of doing to time the writes so that they can only occur on the low side of the video clock… the PIC that I'm using doesn't have an external memory interface, and so driving /WE high and low is a discrete operation performed in software. This means that the strobe cannot happen in the same cycle, and so we'd need some mechanism that resynchronizes it with the video clock cycle—something like “please strobe /WE the next time you can,” so to speak.

This is probably not impossible (in fact, I think this is something you mentioned you were working on at some point), but it's way off the edge of the map when it comes to my present abilities, so I'm not quite sure where to start. Off the top of my head, the way I imagine this could work is I could use two D-latches (say, like in a '74) to put the main 25.175MHz clock in quadrature, at which point we'd have an edge at every 90º of the signal. The PIC could then set up addresses and data on the latches, flag a pending write, and wait for it to take place:

  • At 0º, bring the counters on the bus.
  • At 90º, bring CP on the video signal latch high to capture the current colour byte and present it to the monitor.
  • At 180º, bring the PIC on the bus.
  • At 270º, if there is a pending write, pulse /WE.

I'm not sure how one would go about isolating the edges from each other—i.e.: how do you prevent something that needs to happen only at 90º from also happening at 270º? Similarly, I wouldn't know how to pulse /WE for less than 1/4 of the clock cycle. I would also need to use much faster RAM, I suspect, though 10ns is widely available and pretty inexpensive once you cross the line where you steadfastly refuse to use SMD parts. Lots of uncertainty here!
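One possible way to keep the phases apart, offered only as a sketch and not as what anyone in the thread built: two D flip-flops wired as a 2-bit Johnson counter step through four distinct states, and decoding each state gives a signal that is true in exactly one quarter of the cycle. The catch is that the counter has to be clocked four times per pixel period. The action strings simply restate the bullet list above.

```python
def johnson_states(steps):
    """Yield (q0, q1) for a 2-bit Johnson counter: q0 <- not q1, q1 <- q0."""
    q0, q1 = 0, 0
    for _ in range(steps):
        yield q0, q1
        q0, q1 = 1 - q1, q0


# Each counter state maps to exactly one phase, so 90 and 270 cannot be confused.
PHASES = {(0, 0): 0, (1, 0): 90, (1, 1): 180, (0, 1): 270}

ACTIONS = {
    0: "counters drive the address bus",
    90: "clock the video latch",
    180: "PIC latches drive the address bus",
    270: "pulse /WE",
}


def schedule(steps, pending_write):
    """List (phase, action) pairs; the /WE slot stays idle with no write pending."""
    result = []
    for state in johnson_states(steps):
        phase = PHASES[state]
        if phase == 270 and not pending_write:
            result.append((phase, "idle"))
        else:
            result.append((phase, ACTIONS[phase]))
    return result
```

In hardware the decode is just a two-input AND per phase (with inverted inputs where needed), which is the kind of thing a small PLD absorbs easily.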

An optimization that I think I can implement now is something that I already alluded to earlier in the thread: right now, the PIC has to wait for a full scanline before writing a pixel; otherwise, there is no guarantee that there will be enough time left for the write cycle to complete before the next visible field begins. This can be fixed by having the ROM emit an auxiliary signal (called MBLANK in the schematic) that allows for enough time before the end of a blanking interval to account for the critical section of the PIC code that performs the actual write. This would probably speed up things quite considerably, since right now the design is limited to ~31,500 pixels per second. This is something that I haven't implemented yet mostly because it requires pulling the EEPROM out of the circuit to reprogram it, and I'm a little worried about accidentally disconnecting too many cables :-)
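For reference, the ~31,500 figure falls straight out of the line rate when only one pixel is written per scanline. The sketch below assumes standard 640x480 timing (800 pixel clocks per line at the 25.175MHz clock mentioned above); the 800 is the usual VGA total, not a number quoted in the thread.

```python
PIXEL_CLOCK_HZ = 25_175_000
CLOCKS_PER_LINE = 800          # assumed: standard VGA line total
WIDTH, HEIGHT = 320, 240

lines_per_second = PIXEL_CLOCK_HZ / CLOCKS_PER_LINE    # 31468.75
pixels_per_second = lines_per_second * 1               # one write per scanline
full_screen_seconds = WIDTH * HEIGHT / pixels_per_second
```

At that rate, touching every pixel of a 320x240 frame once takes a little under two and a half seconds, which gives a feel for how much a wider write window could buy.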

Cheers!


PostPosted: Fri Feb 04, 2022 6:57 pm 
Offline

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
CountChocula wrote:
  • At 180º, bring the PIC on the bus.

Rather than putting the PIC on the data bus, I would probably add an extra 8-bit D flipflop to latch the data that the CPU wants to output, and let the CPU write directly to this whenever it wants to. So the CPU interface is just to write an address, then write a data byte, and it can kind of forget about it after that so long as it doesn't write anything else for a while. Essentially it's a 1-deep FIFO. Unless the CPU is actually sharing a clock with the video system (which is actually what I do in practice) I wouldn't trouble the CPU with trying to synchronize with the video clock for anything, just make sure the FIFO can cope.

It's important that the video circuit, that's reading the FIFO, doesn't try to read it while it's being updated. My plan for this was to have a D flipflop that the CPU sets when it writes the data byte (it can be automatic). Then the video circuit can use that to determine whether there is data to be written, and it can reset the flipflop to say that the data was written. Potentially the CPU can then read that state back to check the FIFO is empty, before writing new data.

It seems prudent for the video circuit to also buffer this flipflop, sample its state in advance of a potential write cycle and only execute the write cycle if the flipflop was set in advance. So that adds up to a one cycle delay to the write, but seems like it would make things more stable, and avoid triggering the write behaviour halfway through a cycle without doing the right setup first.
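A behavioural sketch of the 1-deep FIFO described above, written as a software model rather than a circuit. The class and method names are made up for illustration, and the model ignores real-world concerns such as metastability when the flag changes close to the sampling edge.

```python
class WriteMailbox:
    """Models the address/data latches, the 'pending' flip-flop and its buffered copy."""

    def __init__(self):
        self.address = 0
        self.data = 0
        self.pending = False   # set by the CPU write, cleared by the video circuit
        self.sampled = False   # video side's copy of the flag, taken one cycle ahead

    def cpu_write(self, address, data):
        """CPU side: refuse the write if the previous one has not been consumed."""
        if self.pending:
            return False
        self.address, self.data = address, data
        self.pending = True
        return True

    def video_cycle(self, ram):
        """Video side: write only if the flag was already seen set on the previous cycle."""
        wrote = False
        if self.sampled:
            ram[self.address] = self.data
            self.pending = False
            wrote = True
        self.sampled = self.pending  # sample the flag for the next cycle
        return wrote
```

A short usage example: after `cpu_write(0x1234, 0x5A)`, the first `video_cycle` only samples the flag, and the second one performs the write and frees the mailbox, which is the up-to-one-cycle delay mentioned above.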

This doesn't solve the timing problem of how you fit all the reads and writes into the video clock cycle, and it can be tough. I do all this at half the speed you are using, which makes it easier, but a lot of the parts I'm using are also slower and my timings are borderline. I would think in times though, rather than angles, and try to fit it to the clock cycle afterwards.

My general approach is to identify which things I need to control directly - such as enabling/disabling various chips writing to the buses, or enabling/disabling the RAM's write enable line. I think of these as synchronous events, which I'm going to trigger based on the clock. Then between these, various things happen asynchronously - such as the address bus settling down after I change what is driving it, the RAM getting used to a new address on the bus and starting to output it, or the latch having a setup time during which its input data should remain steady. I don't need to explicitly say when one of these transitions into the other - it doesn't need to happen on a clock boundary. But I need to add them all up and make sure I don't trigger the latch until enough time has passed for all these asynchronous things to have occurred in sequence.

My spreadsheet is mostly about summarizing the times for these stages from the datasheets, identifying which things I need to directly control, and determining which time periods can't start until after other time periods end. If something is based on a synchronous event then I also need to round it up to the next clock cycle (about 40ns). That's a big deal for me because there are only four clock cycles in total; it's even worse for you because there are only two!
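The spreadsheet idea can be sketched in a few lines: add up the asynchronous delays that must elapse in sequence, then round up to the next clock edge to find when the synchronous event can safely fire. The delay values below are placeholders chosen for illustration, not figures from any datasheet.

```python
import math

CLOCK_NS = 40  # approximate period of the ~25MHz clock


def earliest_edge(delays_ns, clock_ns=CLOCK_NS):
    """Return (total delay, clock edges needed, time of the first usable edge)."""
    total = sum(delays_ns.values())
    edges = math.ceil(total / clock_ns)
    return total, edges, edges * clock_ns


# Hypothetical chain for a read: these numbers are made up.
read_chain = {
    "bus driver enable": 12,
    "address bus settle": 5,
    "RAM access time": 55,
    "latch setup time": 3,
}

total_ns, edges_needed, latch_at_ns = earliest_edge(read_chain)
```

With these placeholder values the chain takes 75ns, so the latch cannot be clocked before the second edge at 80ns; shaving the chain below 40ns would free a whole cycle.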

In general though, yes, it's tricky and requires quite a bit of thought to get these writes happening between pixels. But very effective if you can do it!

Quote:
I'm not sure how one would go about isolating the edges from each other—i.e.: how do you prevent something that needs to happen only at 90º from also happening at 270º?

The easiest way is using a PLD, they are very effective for this.

Otherwise you can use flipflops to delay low-frequency signals by one high-frequency period. e.g. if you use a '163 style counter to generate ~12.5MHz, ~6.25MHz, etc from your ~25MHz clock, you can also phase shift the 6.25MHz by 40ns by re-sampling it based on the original 25MHz clock using a D flipflop. And you can shift it by a little over 20ns by re-sampling it based on the inverted 25MHz clock. This is what I used to do before moving to a PLD.

Quote:
Similarly, I wouldn't know how to pulse /WE for less than 1/4 of the clock cycle.

One way I've considered (but not tried) to synthesize more clock edges is to invert the clock signal, as I mentioned above, so you now have twice as many rising edges. i.e. if your crystal clock rises once every 40ns, then if you invert it then you have another signal that also rises once every 40ns, but delayed by a little over 20ns, so overall you have clock edges that you can trigger from roughly every 20ns.
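A small sketch of what the inverted clock buys you. It assumes a 40ns period, a 50% duty cycle, and a 3ns inverter delay; the delay is a placeholder, since the real value depends on the logic family.

```python
PERIOD_NS = 40.0
INVERTER_DELAY_NS = 3.0  # placeholder propagation delay


def rising_edges(period_ns, count, offset_ns=0.0):
    """Times of the first `count` rising edges of a clock starting at offset_ns."""
    return [offset_ns + i * period_ns for i in range(count)]


direct = rising_edges(PERIOD_NS, 3)
inverted = rising_edges(PERIOD_NS, 3, PERIOD_NS / 2 + INVERTER_DELAY_NS)

merged = sorted(direct + inverted)
gaps = [later - earlier for earlier, later in zip(merged, merged[1:])]
```

Under these assumptions the combined edges alternate between 23ns and 17ns spacing rather than an even 20ns, so the inverter delay eats into the shorter of the two windows.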

Quote:
An optimization that I think I can implement now is something that I already alluded to earlier in the thread: right now, the PIC has to wait for a full scanline before writing a pixel; otherwise, there is no guarantee that there will be enough time left for the write cycle to complete before the next visible field begins. This can be fixed by having the ROM emit an auxiliary signal (called MBLANK in the schematic) that allows for enough time before the end of a blanking interval to account for the critical section of the PIC code that performs the actual write. This would probably speed up things quite considerably, since right now the design is limited to ~31,500 pixels per second. This is something that I haven't implemented yet mostly because it requires pulling the EEPROM out of the circuit to reprogram it, and I'm a little worried about accidentally disconnecting too many cables :-)

Yes, that sounded like a good idea and a good use of extra ROM output lines that otherwise go unused.


PostPosted: Sat Feb 05, 2022 1:35 am 
Offline
User avatar

Joined: Sun Nov 07, 2021 4:11 pm
Posts: 101
Location: Toronto, Canada
gfoot wrote:
CountChocula wrote:
  • At 180º, bring the PIC on the bus.

Rather than putting the PIC on the data bus, I would probably add an extra 8-bit D flipflop to latch the data that the CPU wants to output, and let the CPU write directly to this whenever it wants to. So the CPU interface is just to write an address, then write a data byte, and it can kind of forget about it after that so long as it doesn't write anything else for a while. Essentially it's a 1-deep FIFO. Unless the CPU is actually sharing a clock with the video system (which is actually what I do in practice) I wouldn't trouble the CPU with trying to synchronize with the video clock for anything, just make sure the FIFO can cope.


Agreed—that's what I meant. Of course, that makes the video RAM write-only from the PIC's perspective unless you somehow also use a buffer for reading, which I suppose wouldn't be impossible.

Quote:
It's important that the video circuit, that's reading the FIFO, doesn't try to read it while it's being updated. My plan for this was to have a D flipflop that the CPU sets when it writes the data byte (it can be automatic). Then the video circuit can use that to determine whether there is data to be written, and it can reset the flipflop to say that the data was written. Potentially the CPU can then read that state back to check the FIFO is empty, before writing new data.

It seems prudent for the video circuit to also buffer this flipflop, sample its state in advance of a potential write cycle and only execute the write cycle if the flipflop was set in advance. So that adds up to a one cycle delay to the write, but seems like it would make things more stable, and avoid triggering the write behaviour halfway through a cycle without doing the right setup first.


Ah, interesting… I'd have to think through this a bit to picture how it would actually work in practice. If you ever get around to making it work, I'd love to take a peek at the schematic :-)

Quote:
My spreadsheet is mostly about summarizing the times for these stages from the datasheets, identifying which things I need to directly control, and determining which time periods can't start until after other time periods end. If something is based on a synchronous event then I also need to round it up to the next clock cycle (about 40ns). That's a big deal for me because there are only four clock cycles in total; it's even worse for you because there are only two!


Well, the idea I came up with was to start with a 50MHz base clock (I wonder whether that would be “good enough” when divided by 2 to generate a stable VGA signal), then use combinational logic to divide it up into 8 10ns “slices” that can be combined into an alternating sequence of arbitrary length… I was playing around earlier today (because, of course, now I can't leave this be), and it's not too hard to come up with a model where you can figure out just about any combination of timings:

[Attachment: Screen Shot 2022-02-04 at 20.26.37.png, 180.22 KiB]


Of course, this doesn't take propagation times into consideration, but if you manage to do it all in a PLD the timing should at least be consistent throughout, as long as you don't have to use more than one macrocell per signal.
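The attached screenshot is not reproduced here, so the following is only a guess at the kind of model meant: it takes the post's figures at face value (eight 10ns slices per repeating window) and describes each control signal by the set of slices in which it is high. The signal names and slice choices are hypothetical.

```python
SLICE_NS = 10
SLICES = 8  # one repeating 80ns window


def waveform(active_slices):
    """Level (0 or 1) of a signal in each slice of the window."""
    return [1 if i in active_slices else 0 for i in range(SLICES)]


def pulses(levels):
    """Return (start_ns, width_ns) for every run of high slices."""
    found, start = [], None
    for i, level in enumerate(levels + [0]):  # trailing 0 closes a final run
        if level and start is None:
            start = i
        elif not level and start is not None:
            found.append((start * SLICE_NS, (i - start) * SLICE_NS))
            start = None
    return found


pixel_clock = waveform({0, 1, 4, 5})   # hypothetical: 20ns high, 20ns low
we_strobe = waveform({6})              # hypothetical: a single 10ns pulse

pixel_clock_pulses = pulses(pixel_clock)
we_strobe_pulses = pulses(we_strobe)
```

A model like this makes it easy to check that two signals never overlap, though, as noted above, it says nothing about propagation delays.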

Quote:
Otherwise you can use flipflops to delay low-frequency signals by one high-frequency period. e.g. if you use a '163 style counter to generate ~12.5MHz, ~6.25MHz, etc from your ~25MHz clock, you can also phase shift the 6.25MHz by 40ns by re-sampling it based on the original 25MHz clock using a D flipflop. And you can shift it by a little over 20ns by re-sampling it based on the inverted 25MHz clock. This is what I used to do before moving to a PLD.


I think that's more or less what I've come up with above as well, though I suspect your approach is more elegant. The issue is that you have to start with 4x your target clock—or, at least, I can't figure out how to do it with a 2x clock speed, short of introducing artificial delays that aren't tied to the clock, which feels like it would be too dependent on the vagaries of the individual components.

