16-bit wide memory?
Recently I have been working on CPU microcoding for another project and it hit me that something similar could be done with a 6502. If LDA and STA reference static 20-bit addresses, why not store the highest 4 bits in a second EEPROM? A whole section of code could be marked to work referring to a certain 64k page. For functions that need to operate in any page, a 4 bit buffer would still need to be loaded but the 5th bit of the second EEPROM could activate the buffer. The remaining 3 bits could be used to activate peripherals eliminating a lot of decoding logic.
Has anyone done something like this before?
- GARTHWILSON
- Forum Moderator
- Posts: 8773
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Re: 16-bit wide memory?
It sounds like partly what Jeff did with his KimKlone, but he trapped unused op codes to fool the processor into doing things it wasn't made to do. He did a great job.
http://laughtonelectronics.com/arcana/K ... mmary.html
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Re: 16-bit wide memory?
Thanks, Garth. I hope the KimKlone description is adequate but I have a nagging feeling I should supply more code examples. One more item on my to-do list!
Meanwhile, questions can be directed to the KK topic here.
I like the idea; just let me check that I understand it! The second EEPROM would attach to the address bus, just like any normal program memory, but its 8 data outputs would feed some custom logic of your own, not the CPU data bus. The effect is to supply extra instruction bits, in this case devoted to increasing the address space. Using just four of the extra EEPROM bits you'd get a 20-bit space containing sixteen 64-K banks.
The main challenge I see is determining when (ie; during which bus cycles) to access the expanded space. Taking the example of a LDA abs instruction, the total duration is 4 cycles. During the first 3 cycles, instruction bytes are being fetched (from bank 0, let's say). Then on the 4th cycle we have the data access, and it's then that the alternative bank becomes active, for one cycle only. Your idea is spot-on; just remember the custom logic needs to be smart enough to manage the timing -- ie; to know when the bus is fetching code (from one specified bank) and when it's accessing data (according to the alternative bank specification in the extra instruction bits).
Microcode is the "bells and whistles" way of managing this but in fact a simple shift register or counter will suffice. That's if you're willing to choose just one address mode for your expanded accesses. For example, absolute and z-pg-indirect mode are both good choices. Either of these could supply most or all the functionality you need.
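To make the counter idea concrete, here is a minimal software model, not a circuit: a counter is cleared on the opcode fetch (SYNC) and the bank bits are enabled only on the cycle where an absolute-mode LDA or STA performs its data access. The function name, the `DATA_CYCLE` table and the trace format are assumptions made up for illustration, as is the choice to store the bank bits at the opcode's address in the second EEPROM.

```python
# Hypothetical model of "a simple counter will suffice" for absolute mode.
# Maps an opcode to the cycle (0 = opcode fetch) on which its data access occurs.
DATA_CYCLE = {0xAD: 3, 0x8D: 3}  # LDA abs, STA abs: data access on the 4th cycle


def bank_per_cycle(cycles, shadow_rom):
    """cycles: list of (sync, address, data) tuples, one per bus cycle.
    shadow_rom: dict mapping a code address to its extra bank bits.
    Returns the bank value driven onto the extra address lines each cycle."""
    banks = []
    count = 0
    opcode = None
    latched = 0
    for sync, addr, data in cycles:
        if sync:
            # Opcode fetch: clear the counter, remember opcode and bank bits.
            count = 0
            opcode = data
            latched = shadow_rom.get(addr, 0) & 0x0F
        else:
            count += 1
        # The bank is active only on the data cycle of a supported opcode.
        active = DATA_CYCLE.get(opcode) == count
        banks.append(latched if active else 0)
    return banks


# LDA $ABCD located at $8000, with bank 3 stored beside the opcode.
trace = [
    (True, 0x8000, 0xAD),   # opcode fetch
    (False, 0x8001, 0xCD),  # operand low byte
    (False, 0x8002, 0xAB),  # operand high byte
    (False, 0xABCD, 0x42),  # data access
]
print(bank_per_cycle(trace, {0x8000: 0x03}))  # prints [0, 0, 0, 3]
```

Restricting the table to one address mode is what keeps the counter this simple; every extra mode adds another entry and possibly another data cycle to track.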
Another challenge is devising a means to write to the EEPROM. But, with all that extra address space available, that's a problem that should be easy to solve -- pretty much a slam-dunk, I would think!
Keep us posted; projects like this can be wonderfully thought-provoking!
cheers,
Jeff
Druzyek wrote:
why not store the highest 4 bits in a second EEPROM
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html
Re: 16-bit wide memory?
Thanks for the link Garth. I have read through the whole KimKlone site several times because it is so neat and I link to it a lot when people talk about what can be done with a 6502.
I think you understand me. Basically I would have a 512k SRAM. The 6502 would drive A0-A15 of the SRAM and A0-A15 of the second EEPROM also. D0-D2 of the EEPROM would drive A16-A18 of the SRAM. In assembler I would write something like "LDA $03ABCD" and some kind of macro would assemble LDA $ABCD for the first EEPROM and store $03 at the same address in the second EEPROM. It seems like the 4 unused bits should be enough so I don't need any decoding logic other than a demultiplexer. "STA $40ABCD" would feed address 4 to the demultiplexer and select the 5th peripheral for writing. Basically it seems like an easy way to handle chip selects without using a CPLD.
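A rough sketch of what such a macro step might do, written as a small Python helper rather than a real assembler feature. The name `assemble_far`, the six-hex-digit operand syntax and the decision to place the extra byte at the opcode's address (with zeros beside the operand bytes) are all assumptions for illustration.

```python
import re

# Absolute-mode opcodes for the two instructions discussed here.
OPCODES = {"LDA": 0xAD, "STA": 0x8D}


def assemble_far(line, pc):
    """Split e.g. 'LDA $03ABCD' into bytes for the two EEPROMs.
    Returns (main, shadow): dicts mapping address -> byte."""
    m = re.fullmatch(r"\s*(LDA|STA)\s+\$([0-9A-Fa-f]{6})\s*", line)
    if m is None:
        raise ValueError(f"unsupported line: {line!r}")
    mnemonic = m.group(1)
    operand = int(m.group(2), 16)
    high = operand >> 16      # third address byte goes to the second EEPROM
    low16 = operand & 0xFFFF  # ordinary 16-bit absolute address
    main = {pc: OPCODES[mnemonic], pc + 1: low16 & 0xFF, pc + 2: low16 >> 8}
    shadow = {pc: high, pc + 1: 0, pc + 2: 0}
    return main, shadow


main, shadow = assemble_far("LDA $03ABCD", 0x8000)
print([hex(b) for b in main.values()])  # prints ['0xad', '0xcd', '0xab']
print(hex(shadow[0x8000]))              # prints 0x3
```

For "STA $40ABCD" the shadow byte would be $40, so its high nibble (4) is what would reach the demultiplexer.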
This is a good point too. I think whatever software I use to split the extra address byte off can figure that out if I know what each instruction does cycle for cycle.
It also seems like this kind of scheme would work for expanding ROM or RAM but not both at once. One or the other would still need an external latch of some kind to hold the third address byte but that could easily be one of the peripherals selectable with the high 4 bits.
Alternatively, both bytes of data could be stored on the same chip and latched out on alternating clock cycles (the 6502 being clocked every other cycle) but that would probably be a speed bottleneck.
Quote:
I like the idea; just let me check that I understand it!
Quote:
ie; to know when the bus is fetching code (from one specified bank) and when it's accessing data
- GARTHWILSON
Re: 16-bit wide memory?
Do look into the 65816 though. It handles the larger memory much more gracefully.
Re: 16-bit wide memory?
I agree with Garth. Your approach is hardly the most direct solution if what you're seeking is an expanded address space. On the other hand, you do stand to learn a lot -- even if your project is never completed. (Many of my own projects fall in that category -- educational though incomplete!)
I don't see how driving A16-A18 of the SRAM directly from the EEPROM can work. You need some sort of latch or other storage, as you imply in the second quote (above). You need the latch because, on the cycle when the SRAM is accessed, the extra EEPROM bits no longer contain the information you want. All you have is the value the cpu outputs on A15-A0, and that won't tell you what you want to know (namely, what the extra EEPROM bits contained a moment before).
Regarding the LDA & STA instructions you'll be using, it is absolutely essential to learn what the cpu does, cycle by cycle -- no way can you gloss over this!
Luckily, there's a thorough and easily readable description in Appendix A of the MCS6500 Family Hardware Manual.
-- Jeff
Druzyek wrote:
D0-D2 of the EEPROM would drive A16-A18 of the SRAM.
Druzyek wrote:
It also seems like this kind of scheme would work for expanding ROM or RAM but not both at once. One or the other would still need an external latch of some kind to hold the third address byte
Re: 16-bit wide memory?
I think I see something...
I think the idea here is that the code being run is in one ROM (byte-wide, of course) and there's another shadow ROM next to it. So, for each of the 3 bytes of a LDA or STA with an absolute address, there's another 8 bits of information available. Now, Jeff is right to point out that we have to understand what happens cycle by cycle. There are 3 cycles in which we read from the main ROM, and from the second one, and then (soon) there's a cycle (or so) when the 6502 puts out the absolute address and reads or writes the data we're referencing.
We need to store some part of the extra information somehow and bring it into play exactly during the cycle(s) that the absolute address is in use. Jeff's concern is in keeping track of which cycles those are: in all other cycles we need to put zero out on the extra bank-select addresses, in order to keep running the code we want and with the zero page, stack and i/o devices that we have.
Here's the idea: use the 6502's SYNC signal to detect the opcode fetch. That's a single unambiguous cycle in which we can capture the extra 8 bits of address information. (We're assuming all along that the extra address info is static.) But we still have to figure out exactly when to use it...
OK, so we upgrade to an '816 in emulation mode (the 65C02 has no VPA or VDA pins), and now we have the VPA and VDA outputs. We AND them to detect the instruction fetch, and store the extra info in our latch. Now we can use
(VDA and not VPA)
to detect data accesses - accesses which are not opcode or operands - and augment the address bus with the contents of the latch, instead of with zero.
All code will run from bank zero, and almost all of the auxiliary ROM will be zeros too, but where we want to access other banks we can do it by putting the bank number next to the opcode in question.
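The VPA/VDA rule above can be sketched as a small software model. This is only an illustration under stated assumptions: the function name and trace format are invented, the latch is taken to be 4 bits wide, and the per-cycle VPA/VDA values for LDA abs are the ones I would expect from the '816 datasheet's cycle table.

```python
def extended_bank(cycles):
    """cycles: list of (vpa, vda, shadow_bits) per bus cycle, where
    shadow_bits is whatever the auxiliary ROM outputs on that cycle.
    Returns the bank bits presented to memory on each cycle."""
    latch = 0
    out = []
    for vpa, vda, shadow_bits in cycles:
        if vpa and vda:
            # Opcode fetch: capture the extra bits sitting next to the opcode.
            latch = shadow_bits & 0x0F
        # Only pure data accesses get the latched bank; all else sees bank 0.
        out.append(latch if (vda and not vpa) else 0)
    return out


# LDA abs with bank 3 stored beside the opcode.
lda_abs = [
    (1, 1, 0x03),  # opcode fetch: latch captures 3
    (1, 0, 0x00),  # operand low byte
    (1, 0, 0x00),  # operand high byte
    (0, 1, 0xFF),  # data access: the ROM output here is meaningless
]
print(extended_bank(lda_abs))  # prints [0, 0, 0, 3]
```

The last cycle shows why the latch is needed: by then the auxiliary ROM is being addressed by the data address, so its output no longer holds the bank number.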
Is that something like what you had in mind, Druzyek?
Now we can bring into play the other ideas: supply ourselves with a second latch, which we write to as an I/O device, and use a fifth bit from the auxiliary ROM to choose between the original latch for static bank selection and the new latch for dynamic bank selection. (We're heading towards having a few latches and calling them segment selectors, perhaps...)
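As a sketch of that selection, assuming (purely for illustration) that bit 4 of the auxiliary ROM byte is the "fifth bit" and that both latches are 4 bits wide:

```python
def select_bank(shadow_byte, dynamic_latch):
    """shadow_byte: byte captured from the auxiliary ROM at opcode fetch.
    dynamic_latch: value previously written by software to the I/O latch.
    Returns the 4-bit bank number to use for the data access."""
    use_dynamic = bool(shadow_byte & 0x10)  # assumed position of the fifth bit
    chosen = dynamic_latch if use_dynamic else shadow_byte
    return chosen & 0x0F


print(select_bank(0x03, 0x0A))  # static selection, prints 3
print(select_bank(0x10, 0x0A))  # dynamic selection, prints 10
```

In hardware this would be a 2-to-1 multiplexer per bank bit, with the fifth ROM bit as the select input.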
(It's still true that the '816 is cleaner, and indeed you can use static 24-bit addressing even in emulation mode, but that doesn't spoil the idea.)
Cheers
Ed
Druzyek wrote:
If LDA and STA reference static 20-bit addresses, why not store the highest 4 bits in a second EEPROM? A whole section of code could be marked to work referring to a certain 64k page.
Re: 16-bit wide memory?
Excellent, Ed. It's essential that the added circuitry know what's happening on every cycle, and the '816's VDA-VPA pins supply a much better indication than the SYNC pin does (on 6502/65c02). It's not a 100% solution but it's close. VDA:VPA equal 1:0 in the last cycle of LDA and STA, as desired, but they also equal 1:0 at other times such as during stack accesses and during the fetch of zero-page indirect pointers. The circuit could decode the opcode to overcome this remaining ambiguity.
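Here is one way the opcode decoding might look as a software model, offered as a sketch rather than a tested design. It assumes a hypothetical table giving, per opcode, how many VDA-only cycles to skip before the bank applies (two pointer-fetch cycles for LDA (zp), none for LDA/STA abs), and it assumes emulation mode with 8-bit data. Opcodes missing from the table never get a bank, which covers stack accesses.

```python
# Number of data-only cycles (VDA=1, VPA=0) to skip before banking applies.
SKIP_BEFORE_BANKING = {
    0xAD: 0,  # LDA abs
    0x8D: 0,  # STA abs
    0xB2: 2,  # LDA (zp): two zero-page pointer fetches come first
}


def gated_bank(cycles):
    """cycles: list of (vpa, vda, data, shadow_bits) per bus cycle.
    Returns the bank bits presented to memory on each cycle."""
    latch = 0
    skip = None  # None means the current opcode is never banked
    out = []
    for vpa, vda, data, shadow_bits in cycles:
        if vpa and vda:
            # Opcode fetch: capture bank bits and look up the opcode.
            latch = shadow_bits & 0x0F
            skip = SKIP_BEFORE_BANKING.get(data)
            out.append(0)
        elif vda and not vpa and skip is not None:
            if skip == 0:
                out.append(latch)  # the real data access
            else:
                skip -= 1          # pointer fetch: stay in bank 0
                out.append(0)
        else:
            out.append(0)
    return out


lda_zp_ind = [(1, 1, 0xB2, 5), (1, 0, 0x10, 0),
              (0, 1, 0xCD, 0), (0, 1, 0xAB, 0), (0, 1, 0x42, 0)]
pha = [(1, 1, 0x48, 5), (0, 0, 0x00, 0), (0, 1, 0x00, 0)]
print(gated_bank(lda_zp_ind))  # prints [0, 0, 0, 0, 5]
print(gated_bank(pha))         # prints [0, 0, 0]
```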
Yes, except you may wish to avoid the word "segment" for fear of any unwelcome association with the concepts embodied in the x86 model!

But I agree. At the minimum we want to address code and data. Do we expand them both? If there's only one latch (for extended address of the data) then we're left with nowhere to store an extended address for code. That means all code must reside in bank zero.
The '816 (used as intended, I mean) solves this by using two registers: the Program Bank Register (PBR) and the Data Bank Register (DBR). The KimKlone arrangement differs somewhat, offering one near code-/data-bank register and a selection of three far Data Bank Registers (which are a lot faster & easier to load than the DBR, and which tend to compensate for the lack of 816-style features such as 3-byte zero-pg pointers).
On a related topic, allow me to draw attention to the 74_670 dual-port 4x4 register-file. This little chip is highly suitable for storing & recalling bank addresses, as well as for other applications. I've started a new topic here.
-- Jeff
BigEd wrote:
[...] choose between the original latch for static bank selection and the new latch for dynamic bank selection. (We're heading towards having a few latches and calling them segment selectors, perhaps...)
[Attachment: 670 symbol.gif]
Last edited by Dr Jefyll on Fri Jan 09, 2015 12:09 am, edited 1 time in total.
Re: 16-bit wide memory?
(Glad I wasn't hallucinating... but I don't think we do need the cycle-counting, at least for direct absolute addressing modes. For any indirect modes, the mechanism won't work for the reasons you give, but we just have to specify bank 0 in that case and at least everything works in an expected way. Absolute indexed gets a free pass.)
Re: 16-bit wide memory?
Quote:
Do look into the 65816 though. It handles the larger memory much more gracefully.
Quote:
I don't see how driving A16-A18 of the SRAM directly from the EEPROM can work.
Quote:
Is that something like what you had in mind, Druzyek?
Quote:
On a related topic, allow me to draw attention to the 74_670 dual-port 4x4 register-file
Recently I got an EEPROM programmer going pretty easily with an MSP430 and in the same week gave WinCUPL a go for the ATF1508 I bought. WinCUPL keeps crashing and is generally unpleasant to use.
Re: 16-bit wide memory?
I think the crucial realisation is that a single line of assembly code can take several cycles just to fetch, and several cycles to act. In the case of JSR, those different things are even interleaved. I recommend, of course, visual6502 for exploring this. For example
http://visual6502.org/JSSim/expert.html ... f&steps=50
(If you want to stick with the NMOS 6502, or you want to handle other addressing modes, then the above comments about needing to count each instruction length apply. Jeff's KimKlone shows that it can be done!)
- GARTHWILSON
Re: 16-bit wide memory?
Quote:
Again, Garth's suggestion is a good one but for most people I'm betting the 65816 falls through the cracks because it doesn't qualify for the nostalgia category and falls short in the horsepower category.
As for nostalgia, there weren't many home computers that used the '816; but it is a natural upgrade to the '02, so you don't have to start from scratch when you learn it. To say it still has the 6502 flavor is an understatement.
Re: 16-bit wide memory?
Why does the 65816 fall short speed-wise?
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: 16-bit wide memory?
banedon wrote:
Why does the 65816 fall short speed-wise?
x86? We ain't got no x86. We don't NEED no stinking x86!
- GARTHWILSON
Re: 16-bit wide memory?
banedon wrote:
Why does the 65816 fall short speed-wise?