6502.org

Posted: **Sat Nov 23, 2024 4:37 pm**

BigEd wrote:

Why is it advantageous to copy ROM content to RAM and then page out ROM for an all-RAM system?

With the various Acorn systems it means you have an all-RAM memory map with simplified address decoding. For example, the 6502 and Z80 copros have 64K of RAM (with the 6502 having an I/O window at FEFx). On RESET, any read from anywhere reads from the ROM; after a read from (either I/O or low memory, can't remember), the ROM is disabled. The ROM code starts by copying the ROM to "itself", then disables the ROM and continues running in RAM, so everything ends up in RAM. Part of the client code depends on this for rapidly changing the NMI vector for data transfer. Similarly, on the Z80 and 6809, the API vectors are the actual address fields of the API JMP entries.

So it means all the address lines simply go to both memory devices, with a flipflop that directs READ to either ROM or RAM (and an FEFx decode to select I/O, which is just a multiple-input NAND).
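The point about API vectors being the address fields of JMP entries only works because the entries live in RAM. A minimal sketch of the idea follows. It assumes a 6502-style `JMP absolute` (opcode $4C, address stored low byte first) purely for illustration; the Z80 and 6809 use different opcodes, and the 6809 stores the address high byte first. The function names and the entry address used in any test are hypothetical, not taken from the Acorn client code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Memory = std::array<std::uint8_t, 65536>;  // all-RAM 64K address space

constexpr std::uint8_t kJmpAbs = 0x4C;  // 6502 JMP absolute opcode

// Lay down a 3-byte "JMP target" entry: opcode, address low, address high.
void write_jmp_entry(Memory& mem, std::uint16_t entry, std::uint16_t target) {
    mem[entry] = kJmpAbs;
    mem[static_cast<std::uint16_t>(entry + 1)] = static_cast<std::uint8_t>(target & 0xFF);
    mem[static_cast<std::uint16_t>(entry + 2)] = static_cast<std::uint8_t>(target >> 8);
}

// Re-point an existing entry by rewriting only its two address bytes.
// This is only possible when the entry sits in RAM.
void set_vector(Memory& mem, std::uint16_t entry, std::uint16_t target) {
    assert(mem[entry] == kJmpAbs);  // only patch something that is a JMP
    mem[static_cast<std::uint16_t>(entry + 1)] = static_cast<std::uint8_t>(target & 0xFF);
    mem[static_cast<std::uint16_t>(entry + 2)] = static_cast<std::uint8_t>(target >> 8);
}

// Read the address field back, i.e. the current value of the "vector".
std::uint16_t get_vector(const Memory& mem, std::uint16_t entry) {
    assert(mem[entry] == kJmpAbs);
    const std::uint8_t lo = mem[static_cast<std::uint16_t>(entry + 1)];
    const std::uint8_t hi = mem[static_cast<std::uint16_t>(entry + 2)];
    return static_cast<std::uint16_t>(lo | (hi << 8));
}
```

With the entry in ROM, the two stores in `set_vector` would have no effect, which is why this style of vectoring goes hand in hand with an all-RAM map.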

Posted: **Sat Nov 23, 2024 4:51 pm**

DRG wrote:

what is doing the switching out if they occupy the same space, i.e. what is controlling the chip selects?

Does this help?

Code:

           +-----+
           |     +--------------+
           | R S +---+          |
           +-+-+-+   |          |
             | |     |          |
RESET--------+ |     |          |
               |     |          |          
   +--------+  |  +--+---+    +-+---+    +-----+
   |        |  |  |  CS  |    | CS  |    |     |
   |     A15|--+--|A15   |    |     |    |     |
   |     A14|=====|A14   |====|     |    |     |
   |      ..|     |      |    |..   |====|     |
   |      A0|=====|A0    |====|A0   |====|     |
   |        |     |      |    |     |    |     |
   |      D7|=====|      |====|     |====|     |
   | CPU  ..|     | RAM  |    | ROM |    | I/O |
   |      D0|=====| 64K  |====|     |====|     |
   |        |     |      |    |     |    |     |
   |     R/W|-----|      |----|     |----|     |
   |        |     |      |    |     |    |     |
   |        |     +------+    +-----+    +-----+

(I/O address decoding omitted)

(Could do with a TT font instead of having to use (code) formatting)

Edit: See also https://mdfs.net/Info/Comp/6809/CoPro.txt

Posted: **Sat Nov 23, 2024 6:19 pm**

jgharston wrote:

(Could do with a TT font instead of having to use (code) formatting)

It's a little more work, but you can use special characters to get it smoother:

Code:

           ┌─────┐
           │     ├──────────────┐
           │ R S ├───┐          │
           └─┬─┬─┘   │          │
             │ │     │          │
RESET────────┘ │     │          │
               │     │          │          
   ┌────────┐  │  ┌──┴───┐    ┌─┴───┐    ┌─────┐
   │        │  │  │  CS  │    │ CS  │    │     │
   │     A15├──┴──┤A15   │    │     │    │     │
   │     A14╞═════╡A14   ╞════╡     │    │     │
   │      ..│     │      │    │..   ╞════╡     │
   │      A0╞═════╡A0    ╞════╡A0   ╞════╡     │
   │        │     │      │    │     │    │     │
   │      D7╞═════╡      ╞════╡     ╞════╡     │
   │ CPU  ..│     │ RAM  │    │ ROM │    │ I/O │
   │      D0╞═════╡ 64K  ╞════╡     ╞════╡     │
   │        │     │      │    │     │    │     │
   │     R/W├─────┤      ├────┤     ├────┤     │
   │        │     │      │    │     │    │     │
   │        │     └──────┘    └─────┘    └─────┘

Posted: **Sat Nov 23, 2024 7:34 pm**

DRG wrote:

Also, I am not wedded to I/O being in page FE. But it does beg the question where is best to map I/O (and I’m guessing there is no universal answer but... depends). Hence, I can move it to page 02 if that makes it easier in terms of system configuration.

You’re correct: there is no universal answer. “Tradition” usually has placed I/O in the $Dxxx range, right below the 8K ROM that used to be common in 6502 systems. For example, the Commodore 64 had its I/O hardware there, and ROM at $Exxx-$Fxxx. That was how I built my POC V1.0 and V1.1 units as well. Such an arrangement tends to simplify decoding—fewer gates needed, and also exposes a maximum amount of contiguous RAM.

Starting with POC V1.2, I rearranged things to place I/O at $00Cxxx and ROM from $00Dxxx to $00Fxxx. That change allowed me to increase ROM to 12KB to accommodate a more-extensive firmware without any holes or interruptions in the ROM memory map. As one of the design goals was to have no more than two gate delays between the address bus and any one chip select, I limited decoding granularity by using only A8-A15, which created 256 bytes of address space per I/O device. Contiguous bank $00 RAM was slightly reduced in size as a consequence. That loss was largely effaced in V1.3 with the addition of extended RAM, producing a total of 112KB available for code and data.

Given that you are using a GAL instead of discrete gates for decoding, you have more flexibility, but should avoid the temptation to create an I/O “island” in the middle of RAM or ROM, especially if you plan to give consideration to shadowing your ROM. I’d be inclined to place the I/O at $CC00, and decode down to A7. Using A7-A15, you can break up your I/O block into 128-byte segments; reserving $CC00-$CFFF gives eight such segments, enough for six total I/O devices with two to spare. You could then map RAM from $0000-$CBFF, which would be a contiguous 51KB. This would leave you with 12KB of ROM, starting at $D000.
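As a rough illustration of the map suggested above, here is a small sketch of the decode as a pure function of the address. It looks only at A7-A15, as the GAL would, so the low seven address bits never affect the result. The type and function names are made up for the example, and the real equations would of course be written in CUPL rather than C++.

```cpp
#include <cassert>
#include <cstdint>

enum class Region { Ram, Io, Rom };

struct Decode {
    Region region;
    int io_slot;  // 0..7 for an I/O access, -1 otherwise
};

// Decode one address for the map: RAM $0000-$CBFF, I/O $CC00-$CFFF, ROM $D000-$FFFF.
Decode decode(std::uint16_t addr) {
    if (addr >= 0xD000) {
        return {Region::Rom, -1};            // 12KB of ROM
    }
    if (addr >= 0xCC00) {
        const int slot = (addr - 0xCC00) >> 7;  // one slot per 128 bytes
        assert(slot >= 0 && slot < 8);
        return {Region::Io, slot};           // 1KB I/O block, eight slots
    }
    return {Region::Ram, -1};                // 51KB of contiguous RAM
}
```

Every address inside a slot selects the same device, so a chip with 16 registers simply repeats eight times within its 128 bytes.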

In a 6502-based system, 12KB is a lot of ROM—well-written 6502 machine code tends to be compact. 12KB is how much I have in POC V1.3, and it has the entire firmware, which consists of a detailed POST with memory sizing, four-channel serial I/O, SCSI I/O, timekeeping with 10 millisecond granularity (and a date range from 1752 to 9999), and a BIOS API with 33 user-accessible functions. Along with all that is the entirety of the Supermon 816 machine language monitor, including an S-record loader. There is still some room in the ROM.

Quote:

On the matter of copying ROM to RAM, however, and in borrowing from JFK - the greater my knowledge increases, the more my ignorance unfolds. RAM & ROM would occupy the same memory addresses but can be switched allowing ROM to occupy the space first and then RAM to take this space in the final configuration. This leads me to ask some questions:

what is doing the switching out if they occupy the same space, i.e. what is controlling the chip selects?

If you follow my suggestion to reduce decoding granularity to A7-A15, you will have three now-uncommitted pins in your GAL. One of those pins can be used as an input that you would connect to the data bus (I’d use D7). A second pin can be used to generate a wait-state enable signal to control an external function that halts the MPU for one or two clock cycles (your GAL doesn’t have enough resources to implement wait-stating in itself, so at least an external flip-flop is needed).

The third pin would act as the SPLD equivalent of buried logic by being a node for a latch. That latch would be set or cleared according to the input at the pin connected to the data bus, with a write to a fake I/O address set up in your I/O block. The state of that latch would be included in the ROM address range decoding to determine if an access to that range selects RAM or ROM when read—it would always select RAM when written. Furthermore, if that range always selects RAM, it can be write-protected to prevent accidental firmware corruption after it has been shadowed.
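The latch behaviour described above can be modelled in a few lines. This is only a sketch under stated assumptions: the fake I/O device is placed in the last 128-byte slot ($CF80), D7 = 1 is taken to mean "ROM out", and write-protection is applied whenever the ROM is switched out. None of those choices come from the post; the polarity and the slot are hypothetical and would be fixed by the actual GAL equations.

```cpp
#include <cassert>
#include <cstdint>

enum class Target { Ram, Rom, Io, None };

constexpr std::uint16_t kLatchSlot = 0xCF80;  // hypothetical fake I/O slot (A7-A15 decoded)

struct ShadowDecoder {
    bool rom_in = true;  // latch state after reset: ROM visible for reads

    // Returns which device responds to one bus access and updates the latch.
    Target access(std::uint16_t addr, bool is_write, std::uint8_t data = 0) {
        if (addr >= 0xD000) {                       // ROM address range
            if (is_write) {
                // Writes never reach the ROM. While ROM is still in, they fall
                // through to the RAM underneath (this is how the copy works);
                // once ROM is out, the shadow copy is write-protected.
                return rom_in ? Target::Ram : Target::None;
            }
            return rom_in ? Target::Rom : Target::Ram;
        }
        if (addr >= 0xCC00) {                       // I/O block
            if (is_write && (addr & 0xFF80) == kLatchSlot) {
                rom_in = (data & 0x80) == 0;        // assumed polarity: D7 = 1 switches ROM out
            }
            return Target::Io;
        }
        return Target::Ram;
    }
};
```

A boot routine would copy $D000-$FFFF onto itself while `rom_in` is true, then write a byte with D7 set to the fake I/O address.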

Quote:

the reason for doing this is to dispense with "slow" ROM - so how do you work with ROM initially with a slow clock speed and then increase the clock when it’s RAM only? Is this what wait-states are for and can be programmed into the 22V10C?

Wait-states are usually used to effectively prolong the Ø2 clock’s high phase to give slow peripherals time to respond. Negating the C02’s RDY signal causes the MPU to stop at the next high-to-low clock transition and hold the buses and RWB in the Ø2-high state—driving RDY low stops the MPU’s internal clock in the high phase. There are some technical problems with using RDY, and my preferred method is to instead control the clock generator.
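To get a feel for when wait-states are needed at all, a back-of-envelope timing budget can help. The sketch below assumes a simplified read cycle: the address becomes valid some time after Ø2 falls, the glue logic adds its delay, the device needs its access time, and data must be stable some time before the next Ø2 fall. It also assumes each wait-state adds one whole Ø2 period and that the chip select is not gated by Ø2. All field names are invented for the example, and every number must come from the relevant datasheets; nothing here is a measured or quoted figure.

```cpp
#include <cassert>
#include <cstdint>

// All times in picoseconds so the arithmetic stays in integers.
struct BusTiming {
    std::int64_t cycle_ps;       // Ø2 period
    std::int64_t addr_setup_ps;  // Ø2 fall to address valid (MPU datasheet)
    std::int64_t data_setup_ps;  // read data valid before Ø2 fall (MPU datasheet)
    std::int64_t decode_ps;      // glue-logic delay to the chip select
};

// Whole extra Ø2 periods needed for a device with the given access time.
int wait_states(const BusTiming& t, std::int64_t access_ps) {
    assert(t.cycle_ps > 0);
    const std::int64_t needed =
        t.addr_setup_ps + t.decode_ps + access_ps + t.data_setup_ps;
    const std::int64_t cycles = (needed + t.cycle_ps - 1) / t.cycle_ps;  // ceiling division
    return cycles <= 1 ? 0 : static_cast<int>(cycles - 1);
}
```

If the clock is stretched instead of using RDY, the same budget applies; only the granularity of the stretch changes.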

Quote:

in the context for posts in this thread, is there a difference between "shadow" RAM and "banked" RAM?

In the context of a 65C02, banked RAM usually refers to a range of RAM that normally lies outside of the MPU’s 64KB address space, but may be mapped into “visible” address space in relatively-small segments. A portion of the MPU’s address space, usually 4KB or 8KB in size, is allocated to act as a window into the banked RAM. Additional decoding and latching is needed to determine where in the banked RAM the window is “looking.” Banking RAM in this fashion was common in the heyday of eight-bit, 6502-powered home computers but, in my view, is far from ideal and best avoided—the 65C816 handles large quantities of RAM far more efficiently.
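The window mechanism described above amounts to a simple address translation. The following sketch assumes an 8KB window at $8000 and a one-byte bank register, with the banked RAM modelled as sitting above the base 64KB in one flat physical space. The window position, the constants and the function name are all hypothetical choices for the example.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint16_t kWindowBase = 0x8000;   // hypothetical window position
constexpr std::uint32_t kWindowSize = 0x2000;   // 8KB window
constexpr std::uint32_t kBankedBase = 0x10000;  // banked RAM starts above the base 64KB

// Translate a CPU address plus the latched bank number to a physical address.
std::uint32_t physical_address(std::uint16_t cpu_addr, std::uint8_t bank) {
    if (cpu_addr >= kWindowBase) {
        const std::uint32_t offset = static_cast<std::uint32_t>(cpu_addr - kWindowBase);
        if (offset < kWindowSize) {
            assert(offset < kWindowSize);
            // Inside the window: the bank latch supplies the upper address bits.
            return kBankedBase + bank * kWindowSize + offset;
        }
    }
    return cpu_addr;  // everywhere else the address passes straight through
}
```

With an 8-bit bank register and an 8KB window this model reaches 2MB of banked RAM, but only 8KB of it is visible at any moment, which is the inconvenience the post is pointing at.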

“Shadow RAM” is RAM that is at the same address as ROM. If ROM is copied into that RAM and then switched out, the firmware in ROM is now running in RAM and from the MPU’s perspective, nothing will seem to have changed. With the firmware running in the much-faster RAM, wait-stating firmware accesses can be avoided, boosting system performance. Also, as Bill noted, you can tinker with the firmware without having to reprogram your ROM.

Posted: **Sun Nov 24, 2024 4:02 am**

BigDumbDinosaur wrote:

... Starting with POC V1.2, I rearranged things to place I/O at $00Cxxx and ROM from $00Dxxx to $00Fxxx. That change allowed me to increase ROM to 12KB to accommodate a more-extensive firmware without any holes or interruptions in the ROM memory map.

Woz came to that same pragmatic conclusion in early 1977 when he shifted his attention from the Apple 1 to the ][ and ][+. The only fly in the ointment came from hard wiring the two TTL-based hi-res graphics frame buffers at $2xxx/$3xxx and $4xxx/$5xxx, because RAM was very pricey at the time, and he wanted a 16K system to be hi-res capable. BASIC programs with a "large" footprint that also wanted to use the buffers for their intended purpose had to step carefully around them. He could've added another "soft-switch" or two in the $C0xx range to make the buffers more flexible, but he was probably too busy writing high-density firmware and designing clever peripherals.

Posted: **Sun Nov 24, 2024 3:31 pm**

There's some fantastic information in the contributions coming through this thread, so many thanks to everyone. It'll take some synthesising on my part - I think America has an apt expression for this: "it's like trying to drink from a fire hose!"

BDD, your last post is taking some parsing for me. I'm fine with the bit about an I/O "island" in the memory map, granularity and shadow/banked RAM. I do keep having to go over the rest, however, regarding wait states and latches. I note jgharston kindly posted a schematic for the RAM/ROM chip selects and that uses an SR latch where the Set is connected to A15. I need to sit in a dark, quiet room and think about these D7/A15 suggestions and what they achieve.

BigDumbDinosaur wrote:

your GAL doesn’t have enough resources to implement wait-stating in itself, so at least an external flip-flop is needed.

Is that because of all the address decoding logic it's already doing, or because the ATF22V10C just isn't capable of such a function at all? I do have other ATF22V10Cs available if that assists.

In attempting to follow how this works from reset, my thinking is along these lines...

For the purpose of this, I am making use of a theoretical 65C02 system here that has 64K RAM and 16K ROM (overlapping the same address space with RAM) and I/O below ROM.

Code:

Memory Map...
0000 - FFFF RAM
BF00 - BFFF I/O
C000 - FFFF ROM

At reset (I'll use a DS1813), the reset pin of the 65C02 is held low for around 150ms, after which it goes high.
If this is also connected to an SR latch (as per jgharston's schematic), this high signal activates the reset of the SR latch and Q/QB become low/high.
Q is connected to the CSB of ROM, thereby directing any memory reads from C000 to FFFF to ROM. This would allow the reset vector at FFFC/FFFD to point to the code in ROM that copies everything over to RAM.

And, that's where I grind to a halt. I don't understand how the RAM is activated so I can copy from ROM to RAM. I think I *may* have made an erroneous assumption that the code being copied from ROM is going into the same place in RAM. Whereas, is it being copied from C000/FFFF to, say, 4000/7FFF where the RAM/ROM don't overlap? I have this suspicion this is where the SR latch attached to A15 comes in.

But then, once the copying from ROM to RAM has completed, how is the ROM "deactivated" for good (or until the next reset)? Also, in this system, are we running the clock at a constant speed (i.e. one good for RAM only so, say, 16 MHz) but utilising wait-states when dealing with ROM?

Dave

Posted: **Sun Nov 24, 2024 4:37 pm**

When doing the ROM-to-RAM copy, it is important to keep both the RAM and ROM chip selects enabled, but with only the ROM's output enabled while the RAM's output is disabled.

Take advantage of the 128K RAM having two chip selects: one active high, the other active low. Tying the active-high chip select to the clock means the RAM will only be active when the 6502 is putting out a valid address and data (skip the dummy access issue for now), so you can connect RWB straight to the RAM's write enable. Controlling the RAM's output enable then decides whether it drives the bus or is just a write-only device. You may be uncomfortable with EVERY write (including I/O writes) going into RAM somewhere, but that's not really a problem other than slightly more power consumption.

Let's ground the ROM's chip select so it is always active. Yes, it will consume a bit more power, but not significantly for a CMOS part. The 22V10 only controls the output enables of the RAM and ROM. A flip-flop in the 22V10 is cleared on reset, which enables the ROM's output and disables the RAM's output. This allows reads to come from ROM while the copy is written into RAM. The flip-flop is set by a write to a 'magic' address (somewhere in I/O space); the ROM is then swapped out and the program is running in RAM.

Depending on how many address lines go into the 22V10, I/O takes a small chunk of memory space. The I/O select should feed back into the decoding of the RAM output enable in the 22V10, so that RAM and I/O are never both enabled.
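For anyone stuck on how code can copy ROM "onto itself", a tiny bus model may make it click. The sketch below uses the map from Dave's post (I/O at $BF00-$BFFF, ROM at $C000-$FFFF) and assumes the ROM output is only enabled inside the ROM range; the magic address $BFFF, the struct and the function names are all hypothetical. Reads are steered by the flip-flop, while every write lands in RAM regardless.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint16_t kIoBase  = 0xBF00;  // I/O page from the example map
constexpr std::uint16_t kRomBase = 0xC000;  // ROM overlaps RAM from here up
constexpr std::uint16_t kMagic   = 0xBFFF;  // hypothetical 'magic' I/O address

struct Board {
    std::array<std::uint8_t, 65536> ram{};
    std::array<std::uint8_t, 0x4000> rom{};  // 16K ROM image
    bool rom_out = false;                    // 22V10 flip-flop, cleared on reset

    static bool is_io(std::uint16_t a) { return a >= kIoBase && a < kRomBase; }
    bool rom_oe(std::uint16_t a) const { return !rom_out && a >= kRomBase; }
    bool ram_oe(std::uint16_t a) const { return !is_io(a) && !rom_oe(a); }

    std::uint8_t read(std::uint16_t a) const {
        if (rom_oe(a)) return rom[a - kRomBase];
        if (ram_oe(a)) return ram[a];
        return 0xFF;                         // I/O devices are not modelled
    }
    void write(std::uint16_t a, std::uint8_t d) {
        ram[a] = d;                          // every write goes into RAM somewhere
        if (a == kMagic) rom_out = true;     // swap ROM out until the next reset
    }
};

// What the boot code does: read each ROM byte and write it back to the same address.
void shadow_rom(Board& b) {
    for (std::uint32_t a = kRomBase; a <= 0xFFFF; ++a) {
        const std::uint16_t addr = static_cast<std::uint16_t>(a);
        b.write(addr, b.read(addr));         // read comes from ROM, write lands in RAM
    }
    b.write(kMagic, 0);                      // hit the magic address
    assert(b.rom_out);
}
```

After `shadow_rom`, a read of $FFFC returns the same byte as before, but it now comes from RAM and can be changed by an ordinary store.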

Sitting in a dark, quiet room and thinking about things is good for you.

The 22V10 is a very powerful device, far more useful than just an address decoder. For example, you can build a 6502, Z80, or 68008 computer including a serial port with just a CPU, RAM, and a 22V10.
https://www.retrobrewcomputers.org/doku ... 6580r0home

Bill

Posted: **Sun Nov 24, 2024 5:18 pm**

A microcontroller boot loader with a simple "blind interface" could copy ROM to RAM at start-up then start the 6502 with a 1, 2, 4, or 8 MHz clock.

Example Arduino code excerpts;

Code:

  /******************************************************************************
   *  load 64K RAM from 64K 'A' or 'B' half of ROM at 'power-up' or 'reset' at  *
   *  a nice leisurely 1-MHz rate.                                              *
   *                                                                            */
   void loader()                      // ****************************************
   { addr = 0x0000;                   // start address $0000                    *
     uReset();                        // reset CPU (synchronize micro to cpu)   *
     do                               // copy 64K ROM minus I/O page to RAM     *
     { wrRAM(rdROM());                // ROM -> RAM (0000..BFFF & C100..FFFF)   *
       if(++addr == 0xC000)           // skip over I/O area ($C000..$C0FF)      *
         addr = 0xC100;               //                                        *
     } while(addr);                   // until roll-over to 0 (full 64K range)  *
  /*                                                                            *
   *  6502 "RUN Mode" ~ reset 6502, start clock, enable RAM                     *
   *                                                                            */
     res(0);                          // reset = 0;                             *
     beginClock(8);                   // start 1, 2, 4, or 8 MHz CPU clock      *
     busInp();                        // disconnect uC from data bus (hi-z)     *
     PORTD = 0b11111100;              // d5-d0 pull-ups                         *
     PORTB = 0b00011011;              // d7-d6 pull-ups, -/res/clk/sck/d7/d6    *
     PORTC = 0b00101010;              // -/-/rom/ram/sys/run/led/led (green)    *
     res(1);                          // reset = 1 -> 6502 'run'                *
   }                                  // ****************************************

Code:

   byte rdROM()                       // ****************************************
   { uPush(0x4C);                     //  jmp $1000  reset PC (avoid I/O area)  *
     uPush(lo(0x1000));               //   "                                    *
     uPush(hi(0x1000));               //   "                                    *
     uPush(0xAD);                     //  lda <abs>                             *
     uPush(lo(addr));                 //   "         abs address lo             *
     uPush(hi(addr));                 //   "         abs address hi             *
     return uPull(rom);               //   "         6502 read op'              *
   }                                  // ****************************************
   void wrRAM(byte data)              // ****************************************
   { uPush(0xA9);                     //  lda <imm>                             *
     uPush(data);                     //   "                                    *
     uPush(0x8D);                     //  sta <abs> (uses global 'addr' var)    *
     uPush(lo(addr));                 //   "         abs address lo             *
     uPush(hi(addr));                 //   "         abs address hi             *
     uPull(ram);                      //   "         6502 write op'             *
   }                                  // ****************************************

   void wrRAM(u16 addr,byte data)     // **********{ overload function }*********
   { uPush(0xA9);                     //  lda <imm>                             *
     uPush(data);                     //   "                                    *
     uPush(0x8D);                     //  sta <abs>                             *
     uPush(lo(addr));                 //   "         address lo                 *
     uPush(hi(addr));                 //   "         address hi                 *
     uPull(ram);                      //   "         6502 write op'             *
   }                                  // ****************************************

Example SBC (the Nano is a full-blown ROM Emulator / Programmer in this $10 SBC design);

Posted: **Sun Nov 24, 2024 6:48 pm**

Michael wrote:

A microcontroller boot loader with a simple "blind interface" could copy ROM to RAM at start-up then start the 6502 with a 1, 2, 4, or 8 MHz clock.

That's very interesting, Michael. Do you have anything like a github link to the code? It would help me understand what's going on here. I don't follow how the Nano can do the copying when it is not connected to any address lines!

Dave

Posted: **Sun Nov 24, 2024 8:12 pm**

DRG wrote:

Michael wrote:

A microcontroller boot loader with a simple "blind interface" could copy ROM to RAM at start-up then start the 6502 with a 1, 2, 4, or 8 MHz clock.

That's very interesting, Michael. Do you have anything like a github link to the code? It would help me understand what's going on here. I don't follow how the Nano can do the copying when it is not connected to any address lines!

Dave

Hey, Dave. I don't have a github but I hope to publish info' for a Ben Eater 6502 compatible board with built-in ROM Emulator / Programmer in coming months. I'll try to work on a simple boot loader 'sketch' for you during the Holidays.

The Nano uses the reset and clock lines to sync' to each 6502 clock cycle and presents to the 6502 as a smart ROM or stealth ROM of sorts. The Nano simply places instructions and data onto the bus at the correct intervals (cycles) to reset the 6502 and to execute instructions. The 6502 is actually doin' all the work, setting address lines, writing RAM, etc. The Nano knows when to read or write data on the bus and when to enable/disable RAM and ROM during each clock cycle depending on the instruction sequence. In 6502 "run mode" the Nano is simply providing a system clock and an econo' reset function.
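To make the "blind interface" idea a little more concrete, here is a host-side sketch of the byte streams involved, modelled on the `rdROM()` and `wrRAM()` excerpts earlier in the thread. It only builds the bytes the micro would place on the data bus during the 6502's fetch cycles; it does not model the clocking, the bus direction switching, or the `JMP $1000` that the original prepends to keep the program counter away from the I/O area. The function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kLdaImm = 0xA9;  // LDA #imm  (2 cycles, both are fetches)
constexpr std::uint8_t kLdaAbs = 0xAD;  // LDA abs   (3 fetches + 1 read cycle)
constexpr std::uint8_t kStaAbs = 0x8D;  // STA abs   (3 fetches + 1 write cycle)

// Bytes fed to the 6502 so that it stores 'data' at 'addr'.
// On the cycle after the last byte, the 6502 itself drives the address and
// data, and RAM is enabled to capture the write.
std::vector<std::uint8_t> write_sequence(std::uint16_t addr, std::uint8_t data) {
    return {kLdaImm, data, kStaAbs,
            static_cast<std::uint8_t>(addr & 0xFF),
            static_cast<std::uint8_t>(addr >> 8)};
}

// Bytes fed to the 6502 so that it reads from 'addr'.
// On the cycle after the last byte, ROM is enabled and the micro samples the bus.
std::vector<std::uint8_t> read_sequence(std::uint16_t addr) {
    return {kLdaAbs,
            static_cast<std::uint8_t>(addr & 0xFF),
            static_cast<std::uint8_t>(addr >> 8)};
}
```

The micro never needs address lines of its own: the operand bytes it feeds in are the address, and the 6502 puts them on the bus during the final cycle of each instruction.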

Good luck. Happy Holidays...

Posted: **Mon Nov 25, 2024 1:55 am**

I think you guys are starting to get overly complicated with all this “let’s boot from a µcontroller...” stuff. You need to focus on building your basic design first. Then you can add the gingerbread.

Posted: **Mon Nov 25, 2024 6:36 am**

I'm inclined to agree with BDD here.

There is a problem common to every microprocessor design: getting the code in there in the first place and making sure it stays there when you turn the power off. The traditional way to do this is with non-volatile memory - ROM, PROM, EPROM, EEPROM, FLASH, whatever - but there is always the problem of getting data into that memory.

So your workflow is something along the lines of

write your software
assemble it to a hex file of some flavour
use an eprom programmer (or similar) to program your prom
physically carry the prom from the programmer and put it in your 6502 board
test
repeat as necessary until the test passes

The huge advantage that single board microcontrollers have over microprocessors - up to and including 64-bit monsters - is that they have non-volatile memory on board that can be easily programmed from a laptop and a USB cable. That's basically what you're trying to emulate with the various techniques that have been discussed both in this thread and elsewhere on the forum.

The problem is that when it doesn't work - and first versions of anything non-trivial never work - you don't know whether it was because the software broke, or because your hardware design is faulty. The way to get around that is to use the simplest possible hardware design - I favour my own variant of Grant Searle's design[1] - with nothing fancy beyond the basic I/O.

When you have that working, you will have proved your workflow _and_ your basic design, and then your next stage design can improve on/add to your original as you discover more issues, but at the same time, better techniques.

Don't forget that when you see Ben Eater or Matt Regan or James Sharman show you a design on Youtube, you're seeing a finished version (with perhaps enough mistakes to give it a realistic flavour) and not the approaches that didn't work...

Neil

[1] It's a six-chip design using a 6851 for serial IO to an FTDI USB/serial adaptor cable. The bottom half of the memory space is all ram, the top quarter - from 0xc000 - is eeprom, and there is an excessively large IO space in between the two (of which a whole two bytes are used!)

You may have seen my two threads on developing a Tiny Basic in the programming section. In the first section I developed the language in C on this host laptop to the stage that I have at least the obvious bugs out. Now I'm slowly converting that design into 65c02 code with a very similar program structure, to minimise the probable bugs I will introduce. But even with the simple processor design, my first bit of code was to prove I could get data out of the serial port, then that I could get data in. Then a handful of helper routines which won't appear in the final code to provide some debugging help: output the accumulator as a hex pair, show the contents of the processor registers, show a block of memory etc... you see how it builds, a little at a time, with all the 'what did I do wrong _this_ time?' minimised.

And I've programmed this eeprom probably several hundred times so far... hint: to preserve the pins of the eeprom if you don't have a ZIF socket on both the programmer and the system board, put it in a normal dip socket and move it complete with the socket. If you bend a pin on that it's easy and cheap to replace.

Posted: **Mon Nov 25, 2024 3:38 pm**

barnacle wrote:

I'm inclined to agree with BDD here.

Well, that’s a ringing endorsement!

Just kidding!

Quote:

There is a problem common to every microprocessor design: getting the code in there in the first place and making sure it stays there when you turn the power off...

POC V1.0 used the “traditional” UV-erasable EPROM and as I worked on fleshing out the firmware, I acquired more EPROMs to reduce downtime waiting for the EPROM eraser to do its work. As I have a lot of EPROMs, I am still using them, even though flash memory, e.g., 39SF010, is less hassle. One of these days...

Quote:

The huge advantage that single board microcontrollers have over microprocessors...is that they have non-volatile memory on board...That's basically what you're trying to emulate with the various techniques that have been discussed...when it doesn't work...you don't know whether it was because the software broke, or because your hardware design is faulty. The way to get around that is to use the simplest possible hardware design...

That’s exactly my point. The beauty of using an EPROM or a flash device like the aforementioned 39SF010 is its interface is trivial to attach to a 65C02 and get working. As soon as a µcontroller gets into the picture, bus sharing and other potentially-tricky hardware matters that aren’t present with a basic ROM are now part of the design. In a first build, that’s exactly the sort of complexity that should be avoided.

Posted: **Mon Nov 25, 2024 5:16 pm**

You're both aligned with my thinking, to be honest, as this exact observation was running through my head - whilst I am genuinely interested in learning how Michael's system works (as I have often wondered how to do something like this) and plasmo's descriptions/explanations, I am acutely aware of the limits of my (current) abilities. I am guilty of pursuing the notion of a "fast" system (rather than concentrating on sub-8MHz) and that dragged us off-topic.

Consequently, I can confirm my next build is STILL pretty much what I set out in the first post other than (as a consequence of this thread and all the great input and contributions) it'll be a 48K RAM/16K ROM with the I/O at the top of RAM. I have an AS6C4008 in my box of parts and there are 5 x W24512AK-15 DIP-32 64K X 8 on their way to me so I can address this greater RAM capacity.

CPU: WDC65C02S6TPG-14
RAM: AS6C4008-55PCN / W24512AK-15
ROM: 39SF010A-70
VIA x2: WDC65C22S6TPG-14
Glue Logic: 2 x ATF22V10C-10PU
Comms: Arduino Nano (clone)

What I need to do now (I think) is write the new WinCUPL code for the 22V10C for the new memory map and draw the schematic. Which will be interesting, as I've not done much of this. I'm underway using KiCad and it has seemed quite simple so far, if laborious. Pins, individual lines or buses - so many ways! Initially, I've gone with drawing individual lines for the buses.

Dave

Posted: **Mon Nov 25, 2024 6:20 pm**

In my humble (ha!) opinion, every connection with its own wire can be a pain, and just labelling pins is definitely a pain.

The bus construct in Kicad doesn't really do anything - since the wires are named at each end anyway, they're automatically connected - but it does simplify tracing the general idea of where a group of signals goes...

e.g.

Attachment: grant.pdf (100.15 KiB)

Personally, I like the default Kicad colours but some of our members have difficulty seeing some colours, so I post monochrome as a courtesy.

Neil

Let's start at the very beginning...
