6502.org Forum

All times are UTC




Posted: Thu Oct 28, 2021 12:09 pm
Joined: Sun Sep 19, 2021 11:21 am
Posts: 38
I thought I'd just share my experiences so far of attempting to wrap my head around the myriad of address decoding options a 6502 home brew designer has available to them. Doing this allows me to give something back to the tyros like me as well as inviting advice from the more experienced members of the forum.

I have read Garth Wilson's resources at least 3 times now. Each time I take more in, as I am better prepared to absorb concepts that I was blind to on previous readings. I note that Garth (as do others) hints that beginners get too hung up on wringing every byte out of the 64K on offer, and many posts describe beginner decoding solutions as too granular in their approach. I've thought more about which approach I should take, and here I open up my inexperienced thoughts. I suppose it will always be a trade-off between what is possible and what is practical.

Taking the extremes first, conceptually, I expect that the finest granularity would be individual byte mapping of the available 64k address space. In my mind, I think that you could do this with a 64k x 8 bit ROM as the address decoder. The 16 address lines would allow an individual 8 bit number to be placed on the 8 data pins of the ROM - those numbers being any of the 8 following values (based on active low logic) to select a device...
Code:
11111110, 11111101, 11111011, 11110111,  11101111, 11011111, 10111111, 01111111

Could these 8 outputs instead be fed into 2 x 4-to-16 decoders (demultiplexers), giving access to 32 devices? I suppose so, in theory?

I make no claims about the viability of this, as I fully acknowledge my ignorance of the CPU/ROM timings, but I would suspect it is possible at low operating frequencies, given that ROM access time is typically around 70 ns and a 1 MHz 6502 has a 1000 ns cycle.
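
A minimal Python sketch of the ROM-as-decoder idea, not a tested design: it fills a 64K table with one active-low select byte per address. The bit assignments (`SELECT_BITS`) and the names `MEMORY_MAP` and `build_decoder_rom` are made up for illustration; the map itself is borrowed from the GAL listing further down this post.

```python
# Sketch: a 64K x 8 lookup "ROM" used as an address decoder.
# Which data bit drives which device is an assumption for illustration.
SELECT_BITS = {"RAM": 0, "CS1": 1, "CS2": 2, "CS3": 3, "CS4": 4, "ROM": 5}

MEMORY_MAP = [  # (first address, last address, device)
    (0x0000, 0x5FFF, "RAM"),
    (0x6000, 0x63FF, "CS1"),
    (0x6400, 0x67FF, "CS2"),
    (0x6800, 0x6BFF, "CS3"),
    (0x6C00, 0x6FFF, "CS4"),
    (0x8000, 0xFFFF, "ROM"),
]


def build_decoder_rom():
    """Return 65536 bytes; each byte has at most one bit low (selected)."""
    rom = bytearray([0xFF] * 0x10000)  # 0xFF = nothing selected
    for first, last, device in MEMORY_MAP:
        pattern = 0xFF & ~(1 << SELECT_BITS[device])  # one bit pulled low
        for address in range(first, last + 1):
            rom[address] = pattern
    return bytes(rom)


rom = build_decoder_rom()
for address in (0x0000, 0x6000, 0x6400, 0x7000, 0xFFFC):
    print(f"{address:04X} -> {rom[address]:08b}")
```

Unmapped addresses stay at 0xFF, so no device is selected there. Because the table is indexed by the full address, any byte could be given its own device, which is the "finest granularity" case described above.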

The other extreme (other than an all RAM or all ROM configuration) is the split which would just use A15 on the address line. This has a granularity of 32K. However, it would be slightly boring with no input/output functionality!

In general, my understanding is that granularity is 2^(16 - bits) bytes, where "bits" is the number of address lines used in the decoding solution (in other words, 2 raised to the number of address lines left undecoded). For my design, I am looking at using A15-A10: 6 bits decoded, leaving A9-A0 undecoded, so a granularity of 1024 bytes (2^10) across 64 (2^6) blocks. To me, this means I can divide my memory map up into 1K "chunks". Using WinCupl, I produced the necessary logic for A15-A10, which allows a simpler version of the Daryl Rictor address decoding idea using a GAL, as I have a couple of these already.
Code:
RAM = ADDRESS:['h'0000..5FFF];   /* 24K RAM      */
CS1 = ADDRESS:['h'6000..63FF];   /* IO         */
CS2 = ADDRESS:['h'6400..67FF];   /* IO         */
CS3 = ADDRESS:['h'6800..6BFF];   /* IO         */
CS4 = ADDRESS:['h'6C00..6FFF];   /* IO         */
ROM = ADDRESS:['h'8000..FFFF];   /* 32K ROM      */
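
The granularity arithmetic can be checked with a few lines of Python. This is only a sketch; the helper names (`granularity`, `block_count`, `is_aligned`) are invented here, and the regions are copied from the listing above.

```python
ADDRESS_BITS = 16  # width of the 6502 address bus


def granularity(decoded_lines):
    """Block size in bytes when only the top `decoded_lines` lines are decoded."""
    return 1 << (ADDRESS_BITS - decoded_lines)


def block_count(decoded_lines):
    """Number of separately selectable blocks."""
    return 1 << decoded_lines


REGIONS = {
    "RAM": (0x0000, 0x5FFF),
    "CS1": (0x6000, 0x63FF),
    "CS2": (0x6400, 0x67FF),
    "CS3": (0x6800, 0x6BFF),
    "CS4": (0x6C00, 0x6FFF),
    "ROM": (0x8000, 0xFFFF),
}


def is_aligned(first, last, decoded_lines):
    """True if a region starts and ends on a block boundary."""
    size = granularity(decoded_lines)
    return first % size == 0 and (last + 1) % size == 0


for lines, label in ((1, "A15 only"), (6, "A15-A10"), (12, "A15-A4")):
    print(f"{label:8s}: {block_count(lines):4d} blocks of {granularity(lines):5d} bytes")

for name, (first, last) in REGIONS.items():
    print(name, "fits a 1K grid:", is_aligned(first, last, 6))
```

Every region in the map fits the 1K grid that A15-A10 gives; dropping A10 from the decode (5 lines, 2K blocks) would no longer separate CS1 from CS2.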

I also wrote a Python program to test/simulate this logic for all the possible combinations of inputs for A15-A10. I reproduce the "interesting" section here, where one can see the different devices getting "activated". Note the 4 blocks ($7000-$7FFF) where nothing is activated: 4K (at my granularity) not assigned. Trade-off.
Code:
5C00 ->   _RAM = 0   _CS1 = 1   _CS2 = 1   _CS3 = 1   _CS4 = 1   _ROM = 1   
6000 ->   _RAM = 1   _CS1 = 0   _CS2 = 1   _CS3 = 1   _CS4 = 1   _ROM = 1   
6400 ->   _RAM = 1   _CS1 = 1   _CS2 = 0   _CS3 = 1   _CS4 = 1   _ROM = 1   
6800 ->   _RAM = 1   _CS1 = 1   _CS2 = 1   _CS3 = 0   _CS4 = 1   _ROM = 1   
6C00 ->   _RAM = 1   _CS1 = 1   _CS2 = 1   _CS3 = 1   _CS4 = 0   _ROM = 1   
7000 ->   _RAM = 1   _CS1 = 1   _CS2 = 1   _CS3 = 1   _CS4 = 1   _ROM = 1   
7400 ->   _RAM = 1   _CS1 = 1   _CS2 = 1   _CS3 = 1   _CS4 = 1   _ROM = 1   
7800 ->   _RAM = 1   _CS1 = 1   _CS2 = 1   _CS3 = 1   _CS4 = 1   _ROM = 1   
7C00 ->   _RAM = 1   _CS1 = 1   _CS2 = 1   _CS3 = 1   _CS4 = 1   _ROM = 1   
8000 ->   _RAM = 1   _CS1 = 1   _CS2 = 1   _CS3 = 1   _CS4 = 1   _ROM = 0

Based on this logic from WinCupl...
Code:
    "(A13 and A14) or A15",                                           # _RAM
    "A15 or not(A14) or not (A13) or A12 or A11 or A10",              # _CS1
    "A15 or not(A14) or not (A13) or A12 or A11 or not(A10)",         # _CS2
    "A15 or not(A14) or not (A13) or A12 or not(A11) or A10",         # _CS3
    "A15 or not(A14) or not (A13) or A12 or not(A11) or not(A10)",    # _CS4
    "not(A15)"                                                        # _ROM

I think I'll look at how discrete logic could be used here as an interesting exercise as it "only" comprises NOTs, ORs and one AND gate.
I appreciate that, with the same kit, I could already implement Daryl's solution (using A15-A4) but I like working things out for myself and I hope this post could add to the knowledge base for those like me starting from scratch in the world of 6502 systems design.
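
The test program itself isn't shown in this post, so the following is a guess at its shape rather than the original: a short Python sketch that evaluates the six WinCupl equations above for all 64 combinations of A15-A10, checks that no two devices are ever selected together, and prints the same "interesting" rows. The names `EQUATIONS`, `decode` and `active` are invented here.

```python
# Active-low outputs as functions of the address bits, transcribed from the
# WinCupl equations above. `a` maps an address bit number to True/False.
EQUATIONS = {
    "_RAM": lambda a: (a[13] and a[14]) or a[15],
    "_CS1": lambda a: a[15] or not a[14] or not a[13] or a[12] or a[11] or a[10],
    "_CS2": lambda a: a[15] or not a[14] or not a[13] or a[12] or a[11] or not a[10],
    "_CS3": lambda a: a[15] or not a[14] or not a[13] or a[12] or not a[11] or a[10],
    "_CS4": lambda a: a[15] or not a[14] or not a[13] or a[12] or not a[11] or not a[10],
    "_ROM": lambda a: not a[15],
}


def decode(address):
    """Evaluate every equation for one address; 0 means the device is selected."""
    bits = {n: bool((address >> n) & 1) for n in range(10, 16)}
    return {name: int(bool(eq(bits))) for name, eq in EQUATIONS.items()}


def active(address):
    """Names of the outputs that are low (selected) at this address."""
    return [name for name, level in decode(address).items() if level == 0]


# Exhaustive check: one 1K block per combination of A15-A10.
for block in range(64):
    address = block << 10
    assert len(active(address)) <= 1, f"bus conflict at {address:04X}"

# Reproduce the rows around the I/O area.
for address in range(0x5C00, 0x8001, 0x0400):
    row = "   ".join(f"{name} = {level}" for name, level in decode(address).items())
    print(f"{address:04X} ->   {row}")
```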


Posted: Thu Oct 28, 2021 12:39 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Thanks for sharing your thoughts and explorations. It's common advice to go ahead and build one of the standard known-working designs, and there is merit in that advice, but it's also good to work things out for yourself.

It might be worth noting that back in the day a ROM could well have been the slowest component in the system, and ROMs of capacity 2k might have been the largest, and so we see that the humble '138 decoder is a much more practical proposition for decoding.

You are quite right that everything is tradeoffs. If you have GALs and ROMs, you might well use them.

Good to hear that you've studied the materials available before venturing forth, too.


Posted: Thu Oct 28, 2021 1:40 pm
Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
Re-reading your post - rather than thinking of a 1 µs (1000 ns) access time in a 1 MHz system, it's better to assume you have a little less than half that to activate your chip decode mechanism. In simple terms, the 6502 does everything external on one half cycle and everything internal on the other half. (This is how a lot of early video systems worked: the CPU had one half cycle and video had the other. It's also why you needed faster RAM for a 6502 than for other 8-bit CPUs of the time.)
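
Taking that half-cycle rule of thumb at face value, a rough budget can be sketched in Python. This is a simplification, not datasheet timing: it assumes a 50% duty cycle, the 15 ns decode delay is a made-up placeholder, the 70 ns access time is the figure from the first post, and the function names are invented here.

```python
def half_cycle_ns(clock_hz):
    """Length of one clock phase in nanoseconds, assuming a 50% duty cycle."""
    return 1e9 / clock_hz / 2


def margin_ns(clock_hz, decode_delay_ns, access_time_ns):
    """Time left in the half cycle after address decoding and device access."""
    return half_cycle_ns(clock_hz) - decode_delay_ns - access_time_ns


for mhz in (1, 2, 4, 8):
    left = margin_ns(mhz * 1_000_000, decode_delay_ns=15, access_time_ns=70)
    verdict = "ok" if left > 0 else "too slow"
    print(f"{mhz} MHz: half cycle {half_cycle_ns(mhz * 1_000_000):6.1f} ns, "
          f"margin {left:6.1f} ns ({verdict})")
```

A real design would also subtract the CPU's own setup and hold requirements from the datasheet, so the true margin is smaller than this suggests.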

It might also be worthwhile looking at some classic old systems to see how they did it and what their design choices might have been.

E.g. The Apple II in its original form - up to 48KB RAM, 16KB ROM, although a chunk of that ROM space taken out for hardware IO, also completely switchable to RAM - see "Language Card", etc. 4 fixed regions for graphics inside the lower 48KB region.

BBC Micro - even though it was some 3-4 years after the Apple II, only 32KB of RAM, but also 32KB of ROM, again less a small section for IO but a little more finely decoded than in the Apple II. One region for graphics from $8000 downwards, size dependent on video mode, from 1KB for a 40x25 text-only mode to 20KB for a high resolution graphics mode with 2 colours or a lower resolution mode with 8 colours.

And so on.

And think how best to arrange your system's memory usage while working with the 6502 - e.g. a 6502 ideally needs RAM from $0000 through $01FF (zero page and stack), and it needs ROM at the very top for the reset and interrupt vectors, so it sort of makes sense to have ROM at the top, RAM at the bottom and IO somewhere in-between...

Some systems decode the bottom 256 bytes of RAM further and put IO down there, because accessing that region (zero page) is faster. It increases decode logic and reduces zero page usage a little, but if your application needs every cycle it can get for IO then this is the way to go.

And look at some historical IO chips - the 6532 has 128 bytes of RAM, as well as some IO ports and one system I made way back mapped the RAM from $0000 through $7FFF with that 128 bytes area mirrored at every 128 byte boundary. This gave me a handful of bytes for the stack and for zero page and data usage all inside the same chip that was doing the IO.
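
The mirroring can be pictured with a small Python sketch. It covers the 128 bytes of RAM only and assumes the chip simply ignores address lines above A6 across $0000-$7FFF; how the RAM and the I/O registers were told apart on that board isn't stated, and `riot_ram_offset` is a name invented here.

```python
RIOT_RAM_SIZE = 128  # bytes of RAM inside a 6532


def riot_ram_offset(address):
    """Physical RAM byte reached by a CPU address in the mirrored region."""
    if not 0x0000 <= address <= 0x7FFF:
        raise ValueError("address is outside the mirrored RAM region")
    return address & (RIOT_RAM_SIZE - 1)  # only A6..A0 reach the chip


# Zero page grows up from offset 0, the stack grows down from offset 0x7F,
# and both live in the same 128 physical bytes.
for address in (0x0000, 0x0010, 0x0080, 0x0190, 0x01FF, 0x7FFF):
    print(f"${address:04X} -> RAM byte {riot_ram_offset(address):#04x}")
```

Note that $0010 and $0190 land on the same physical byte, so zero page variables and the stack have to be kept apart by convention.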

It's also worthwhile looking at the actual chips you're thinking of interfacing - e.g. the 6522 VIA has 16 registers so needs the lower 4 address lines and 16 bytes' worth of IO space, but it also has 2 chip select inputs which allows multiple devices in the same region - and as the 6551 ACIA does the same, then it's an easy win to simplify the peripheral decoding in systems using these 2 chips.

I currently just have one peripheral in my Ruby systems - a 65C22 VIA but to keep my life simple I decoded the whole of $FExx for it, so it appears repeated over that range. I could easily split that region into 2 regions with the inclusion of A7 into the decode logic (a single GAL).
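
A quick Python sketch of what decoding the whole of $FExx means for a 16-register VIA, and of how bringing A7 into the decode would split the page in two. The region labels "lower" and "upper" and both function names are invented here; the real GAL equations aren't shown in this thread.

```python
IO_PAGE = 0xFE  # high byte of the decoded I/O page


def via_register(address):
    """VIA register index for an address in $FExx, or None outside the page."""
    if address >> 8 != IO_PAGE:
        return None
    return address & 0x0F  # only A3..A0 reach the VIA, so it repeats every 16 bytes


def split_region(address):
    """Which half of the page an address falls in once A7 joins the decode."""
    if address >> 8 != IO_PAGE:
        return None
    return "upper" if address & 0x80 else "lower"


for address in (0xFE00, 0xFE0F, 0xFE10, 0xFE7F, 0xFE80, 0xFEFF):
    print(f"${address:04X}: register {via_register(address):2d}, {split_region(address)} half")
```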

And on the subject of GALs - use them if you're happy with them (is my view). My 65C02 board does all the decoding and RAM write qualifying in a single GAL - the 65C02 has a single R/W output signal, but the RAM needs separate /Read and /Write signals ... (Actually, there are 2 RAM chips in my 6502 system so it further decodes each 32KB RAM region for their respective CS signals).
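
One common way to derive the two RAM strobes from R/W and the clock, sketched in Python as a truth table. This is an assumption about the general technique, not the equations in that GAL: both strobes are gated by PHI2 here, and `ram_strobes` is a name invented for the sketch.

```python
def ram_strobes(phi2, rw):
    """Return (rd_n, wr_n), both active low, from PHI2 and the CPU's R/W line.

    rw = 1 means the CPU is reading, rw = 0 means it is writing.
    """
    rd_n = 0 if (phi2 and rw) else 1      # read only while PHI2 is high
    wr_n = 0 if (phi2 and not rw) else 1  # write only while PHI2 is high
    return rd_n, wr_n


print("PHI2 R/W  /RD /WR")
for phi2 in (0, 1):
    for rw in (0, 1):
        rd_n, wr_n = ram_strobes(phi2, rw)
        print(f"   {phi2}   {rw}    {rd_n}   {wr_n}")
```

Gating the write strobe with PHI2 keeps /WR inactive during the first half of the cycle, while the address lines are still settling.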

For an old-school pure TTL type solution, something else to look at might be an 8-bit comparator IC. Using A8-A15 as one set of inputs and a set of DIP switches as the other, then you could have any 256 byte range in the top 32KB as IO with that range then being decoded further with e.g. a single 74x138. You'll still need to generate the separate /Rd and /Wr signals for the RAM, as well as the RAM and ROM /CS signals, and qualify writes to RAM with the clock... (Which is why I did it all in a GAL)
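
The comparator-plus-'138 arrangement behaves roughly like this Python sketch. Which address lines feed the '138 isn't specified above, so A7-A5 (eight 32-byte slots) is an assumption, as are the function names.

```python
def io_page_selected(address, dip_switches):
    """Comparator: true when A15..A8 equal the 8-bit DIP switch setting."""
    return (address >> 8) == (dip_switches & 0xFF)


def decode_138(address, dip_switches):
    """Eight active-low outputs of a 3-to-8 decoder enabled by the comparator."""
    outputs = [1] * 8
    if io_page_selected(address, dip_switches):
        outputs[(address >> 5) & 0x07] = 0  # A7..A5 pick one 32-byte slot
    return outputs


SWITCHES = 0xFE  # example setting: I/O page at $FE00
for address in (0xFE00, 0xFE20, 0xFEE0, 0x1234):
    print(f"${address:04X} -> {decode_138(address, SWITCHES)}")
```

Moving the I/O page is then just a matter of changing `SWITCHES`, which is the attraction of the DIP switch approach.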

Let us know your thoughts though - always good to read about new ideas and projects.

Cheers,

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Posted: Thu Oct 28, 2021 5:56 pm
Joined: Sun Sep 19, 2021 11:21 am
Posts: 38
drogon wrote:
...but it also has 2 chip select inputs which allows multiple devices in the same region...

Gordon, thanks for responding. I sense that the above is important but I don't understand it - although I did wonder why there were 2 chip selects on the 65c22 (CS1 & CS2B) and why they were opposite logic. Can someone explain why or point me to a resource that I could read up on, please? I want to understand the "allows multiple devices in the same region" aspect of what you said.


Posted: Thu Oct 28, 2021 7:29 pm
Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
DRG wrote:
drogon wrote:
...but it also has 2 chip select inputs which allows multiple devices in the same region...

Gordon, thanks for responding. I sense that the above is important but I don't understand it - although I did wonder why there were 2 chip selects on the 65c22 (CS1 & CS2B) and why they were opposite logic. Can someone explain why or point me to a resource that I could read up on, please? I want to understand the "allows multiple devices in the same region" aspect of what you said.


The VIA is "live" when CS1 is high AND CS2B is low. Other combinations effectively disconnect it from the bus.

So in my Ruby system, I hard wire CS1 to +5v and CS2B is taken from the address decoder which takes that signal low for any address in the range $FE00 through $FEFF. (The signal is called /IO in my schematics)

To use 2 VIAs, I can use address line A4: both VIAs take /IO on CS2B, one takes A4 on CS1 and the other takes an inverted A4 on CS1. (CS1 is active high, so simply swapping /IO and A4 between the two pins would enable that VIA whenever /IO is high, i.e. outside the IO region.) That activates one VIA at $FE00 through $FE0F and the other at $FE10 through $FE1F (and reflections up that range, so the first VIA is active at $FE20 through $FE2F and so on). So 2 devices for one decoded output.

Same for a VIA and ACIA, or other devices that have a similar scheme.
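
The select logic can be tried out in a few lines of Python. This sketch assumes both VIAs take /IO on CS2B, with A4 on one CS1 and an inverted A4 (a hypothetical extra inverter) on the other, because a VIA with /IO wired to its active-high CS1 would respond outside the I/O page. The labels "VIA A" and "VIA B" and the function names are invented here.

```python
def via_selected(cs1, cs2b):
    """A 6522 is on the bus only when CS1 is high AND CS2B is low."""
    return cs1 == 1 and cs2b == 0


def io_n(address):
    """Active-low /IO from the address decoder: low for $FE00-$FEFF."""
    return 0 if address >> 8 == 0xFE else 1


def selected_vias(address):
    a4 = (address >> 4) & 1
    not_a4 = 1 - a4  # assumed inverter on A4
    wiring = {
        "VIA A": via_selected(cs1=not_a4, cs2b=io_n(address)),
        "VIA B": via_selected(cs1=a4, cs2b=io_n(address)),
    }
    return [name for name, on in wiring.items() if on]


for address in (0xFE00, 0xFE0F, 0xFE10, 0xFE20, 0xFE30, 0x1210):
    print(f"${address:04X} -> {selected_vias(address)}")

# Why a plain pin swap fails: /IO on CS1 and A4 on CS2B selects the chip
# whenever /IO is high and A4 is low, e.g. at $1200, far from the I/O page.
print("swapped wiring at $1200:", via_selected(cs1=io_n(0x1200), cs2b=0))
```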

Hope that helps.

-Gordon



Posted: Fri Oct 29, 2021 1:49 am
Joined: Wed Feb 13, 2013 1:38 pm
Posts: 589
Location: Michigan, USA
May I share one possible (untested) two chip "discrete logic" concept with capabilities similar to your GAL solution? It uses one variety of the 8-bit Comparator ICs Gordon mentioned along with a 74HC139 dual 2-to-4 line decoder IC to provide RAM and ROM selects, four I/O selects spanning a single (jumper configurable) 256 byte I/O page, and PHI2 qualified /RD and /WR signals. The circuit also uses a 64K RAM and 64K of a 128K Flash ROM and provides more flexibility in the way you partition memory. Basically, you install jumpers to select the I/O page location almost anywhere in address space with RAM mapped below and ROM mapped above that I/O page. Example drawings below.
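
Since the drawings don't survive as text, here is a rough behavioural sketch in Python of the map described above: a jumper-set I/O page with RAM below it and ROM above it. It is an interpretation, not the circuit itself. The split of the page into four 64-byte selects on A7 and A6 is an assumption, and the names `decode`, `IO0` to `IO3` and `io_page` are invented here.

```python
def decode(address, io_page):
    """Return which device a 16-bit address selects for a given I/O page.

    io_page is the jumper setting compared against A15..A8 (0x00-0xFF).
    """
    page = address >> 8
    if page == io_page:
        # Four I/O selects spanning the 256-byte page (assumed 64 bytes each).
        return f"IO{(address >> 6) & 0x03}"
    # A magnitude comparator gives "below" and "above" for free.
    return "RAM" if page < io_page else "ROM"


JUMPERS = 0x80  # example setting: I/O page at $8000
for address in (0x0000, 0x7FFF, 0x8000, 0x8040, 0x80C0, 0x8100, 0xFFFC):
    print(f"${address:04X} -> {decode(address, JUMPERS)}")
```

Changing `JUMPERS` moves the RAM/ROM boundary along with the I/O page, which is where the extra flexibility in partitioning memory comes from.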

Good luck. Have fun. Mike, K8LH


Attachments:
temp2.png
Beater 07c.png


Last edited by Michael on Sun Sep 15, 2024 6:41 pm, edited 2 times in total.
Posted: Fri Oct 29, 2021 2:20 am
Joined: Tue Mar 05, 2013 4:31 am
Posts: 1385
As you apparently have some PLDs and a way to program them, I'd be inclined to go with that. It's certainly flexible and you can easily change your system configuration.

For address decoding, you should probably look at what devices you plan to support initially. As an example:

Some typical 65xx devices: 65C51 ACIA uses 4 address locations, 65C22 VIA uses 16 address locations.

Looking at some other I/O devices, here's a few to consider:

NXP UART/DUART, generally 16 address locations for a SC28L92 (older SCC2691 uses 8)
Maxim Realtime Clock DS1511/1501 has 32 address locations, but not all are used (check the datasheet)
IDE interface has 2 selects and 3 address lines, so there are 8 locations on CS0 and only 2 on CS1.

I'm using an ATF22V10, which is configured for:
1- RAM select
1- ROM select
5- I/O selects at 32 bytes wide each
Phase 2 qualified Read and Write signals

Needless to say, you can easily change your entire address map simply by changing the PLD configuration.

Hope this helps!

_________________
Regards, KM
https://github.com/floobydust


Posted: Thu Aug 01, 2024 2:30 pm
Joined: Tue Sep 26, 2023 11:09 am
Posts: 109
Michael, thanks for the diagrams. I was exploring the idea of switching from '688 which I read about in one of your other posts, to the '682 (adding the P>Q flag) and then discovered this post. I know you said it's untested but I had a couple of questions.

1. Here you don't explicitly create a /LO flag so presumably the RAM is enabled at the same time as I/O but relies on MRD/MWR being disabled via the IO select in the '139? Is that right?

2. I thought the ROM normally didn't need a clock-qualified read, but maybe I'm misremembering. I guess it doesn't hurt?

3. In my current setup I use a '139 downstream of a VIA, with a couple of port A bits wired to '139 inputs and used to select one of several devices for I/O on port B (SD, keyboard, LCD). I like the IO strobe idea - can I read more about that somewhere? For example could I run the '139 in parallel with the VIA rather than downstream and save the port A bits? ie. VIA would be enabled by /IO in a full 256 byte page, with A0...A3 mapping 16 different 16byte blocks to it. Depending on which 16 byte block you accessed, the '139 would determine which device is active on VIA's port B during the access. Am I understanding that properly? I'd have to think about whether the instantaneous nature of the strobe is problematic: I guess the device would only be active during each read/write instruction to a relevant IO address, rather than persistently based on port A? The LCD probably doesn't care. Input from the SD and keyboard are run thru '595 shift registers but maybe that's fine if '139 is just disabling SR output until the via is actually trying to read it?


Posted: Fri Aug 02, 2024 2:23 am
Joined: Wed Feb 13, 2013 1:38 pm
Posts: 589
Location: Michigan, USA
pdragon wrote:
Michael, thanks for the diagrams. I was exploring the idea of switching from '688 which I read about in one of your other posts, to the '682 (adding the P>Q flag) and then discovered this post. I know you said it's untested but I had a couple of questions.
Quote:
1. Here you don't explicitly create a /LO flag so presumably the RAM is enabled at the same time as I/O but relies on MRD/MWR being disabled via the IO select in the '139? Is that right?
Yes, that's correct...
Quote:
2. I thought the ROM normally didn't need a clock-qualified read, but maybe I'm misremembering. I guess it doesn't hurt?
ROM read operations don't normally need to be clock qualified but it shouldn't matter in that example.
Quote:
3. In my current setup I use a '139 downstream of a VIA, with a couple of port A bits wired to '139 inputs and used to select one of several devices for I/O on port B (SD, keyboard, LCD). I like the IO strobe idea - can I read more about that somewhere? For example could I run the '139 in parallel with the VIA rather than downstream and save the port A bits? ie. VIA would be enabled by /IO in a full 256 byte page, with A0...A3 mapping 16 different 16byte blocks to it. Depending on which 16 byte block you accessed, the '139 would determine which device is active on VIA's port B during the access. Am I understanding that properly? I'd have to think about whether the instantaneous nature of the strobe is problematic: I guess the device would only be active during each read/write instruction to a relevant IO address, rather than persistently based on port A? The LCD probably doesn't care. Input from the SD and keyboard are run thru '595 shift registers but maybe that's fine if '139 is just disabling SR output until the via is actually trying to read it?
Not sure I understand what you're describing. Sorry.


Posted: Fri Sep 20, 2024 6:39 pm
Joined: Tue Sep 26, 2023 11:09 am
Posts: 109
More complete description of my progress with this design over here: viewtopic.php?f=4&t=8173

TL;DR using the w65c02 phi1 output to qualify /mw was occasionally causing phantom writes because the rising /mw edge lagged phi2 too much: about 20ns with an HC-family '139 demux. It might work using AHC139 but is probably still not theoretically "in spec" (within 10ns) since phi1 already lags phi2 by 7-8ns. Halving an oscillator using a flipflop gives symmetric phi2, /phi2 inputs which work nicely to qualify the /mw - HC139 now works for me in practice and would be solidly "in spec" using AHC139.

