Interfacing real 65C02 to an FPGA

jfoucher · Post by **jfoucher** » Wed Feb 17, 2021 5:13 pm

Hi all,
I would like to be able to interface a WDC65C02 to an FPGA, specifically write to the FPGA block ram from the 6502.
The FPGA is presented to the 6502 as an IO device at address $FF80 (for example). One chip select line, 3 address lines, the 8 data lines, the PHI2 line and the RWB line are all inputs to the FPGA.

I am trying to make the FPGA save the data it receives in its block RAM when I write to it from the 6502. I have tried at least 5 different ways, and I always get unreliable data. I have looked at many timing diagrams for the 6502 write cycle, and my Verilog code seems to conform to them. The address is latched on a positive clock edge, and the data on the negative clock edge.

The 6502 test code is very simple:

Code:

    ldx #0
loop:
    stx $FF80
    inx
    ldy #$10
    jsr delay
    bra loop

or something to that effect.
I have tried delays between 0 and about 2000 cycles, but I don't think that should affect anything.
One of the possible issues is with the level converters. I have tried both TXS0108 and 74LVC4245 with the same results.

Here is some of the current Verilog code:

Code:

    // CLK_100M is a 100 MHz clock generated by the on-chip PLL
    always @(posedge CLK_100M) begin
        // This sets the command register from the address lines when the chip is enabled and the 6502 is writing
        // (the enabled and write signals are inverted ENB and RWB)
        if ((PHI2 == 1'b1) && (enabled == 1'b1) && (write == 1'b1)) begin
            command_reg <= REG;
        end
    end
    always @(posedge CLK_100M) begin
        // this saves the data from the 6502 during the whole of PHI2 high and stops when it goes low
        if ((PHI2 == 1'b1) && (enabled == 1'b1) && (write == 1'b1) && (save_data == 1'b0)) begin
            command_data <= DATA;
        end
        else if (PHI2 == 1'b0) begin
            // Here I do something like this :
            bram[command_reg] <= command_data;
        end
    end

I have tried many different ways, including something like this:

Code:

always @(posedge PHI2)
    command_reg <= REG;

always @(negedge PHI2)
    command_data <= DATA;

but always with mostly garbage results.

I have tried with the 6502 running at 12, 10, and 1.8 MHz (the only 3 oscillator values I have).

Does anybody have any idea what I'm doing wrong? I'm a complete newb with FPGAs (I'm using an iCE40, by the way), so any pointers will be more than welcome.

Thanks

BigEd · Post by **BigEd** » Wed Feb 17, 2021 6:22 pm

Quick question: how do you know the stores are not working? You presumably have some kind of read-out as well. Can you tell whether it's a failure on writes, or on reads?

hoglet · Post by **hoglet** » Wed Feb 17, 2021 6:29 pm

I have a few suggestions:
- post a minimal - but complete - example of the Verilog that shows the issue, rather than just selected fragments
- give some more information about the hardware, including the FPGA part and/or development board being used, and whether any level shifters are being used.
- post some photos of your hardware setup

jfoucher · Post by **jfoucher** » Wed Feb 17, 2021 6:49 pm

BigEd wrote:

Quick question: how do you know the stores are not working? You presumably have some kind of read-out as well. Can you tell whether it's a failure on writes, or on reads?

I am only doing writes for the time being. I want to get that working first to prove the concept. The end goal is to design a VGA out chip with a parallel interface to the 6502. So the read out is the screen. When I set the block ram to certain values from the verilog, the display works fine, so I know that the output from the FPGA to the screen is ok.

jfoucher · Post by **jfoucher** » Wed Feb 17, 2021 6:55 pm

hoglet wrote:

I have a few suggestions:
- post a minimal - but complete - example of the Verilog that shows the issue, rather than just selected fragments
- give some more information about the hardware, including the FPGA part and/or development board being used, and whether any level shifters are being used.
- post some photos of your hardware setup

Yes, will do! I'm a bit short on time these days, but I realize this information will be essential in debugging this issue.

FPGA dev board is an UPduino 3 (iCE40UP5K based)

Level shifters: I have mostly tested with TXS0108.

Hardware is all on perfboard. Will post pics soon.

Regarding the code, I will roll back to something that was mostly working and post it here.

maded2 · Post by **maded2** » Thu Feb 18, 2021 5:17 am

Have you tried always @(posedge PHI2)? Otherwise you will be sampling at 100MHz a signal that is clocking at a much lower frequency, which means you will get many triggers.

BTW, I am also trying to interface my m68k build with an FPGA using the 0108 level shifter, and I am having lots of issues with the level shifter not triggering (maybe because the signal is not strong enough to drive the 0108).

John West · Post by **John West** » Thu Feb 18, 2021 9:50 am

Life is much easier if you only use one edge of one clock, so you're doing the right thing with your always @(posedge CLK_100M). Feed the same clock into every FF, and use their clock enable inputs to decide which cycles to act on. Mixing clocks can be done, but it adds a lot of pain that probably isn't necessary.

You do still have a mixed clock design though. Since you're sampling at 100MHz, you're adding 10ns to the worst-case hold time on address and data. And the 65C02 datasheet informs me that the minimum hold time is 10ns. I'd add sampledAddress, sampledData, and so on, latching them on the appropriate edge of PHI2.

Also sample PHI2 at 100MHz, and use the sampled version internally. You don't want some parts of your design thinking it's 1 while other parts, further from the pin, still think it's 0. If you want things to happen on an edge of PHI2, keep the previous value and use PHI2 = 0 and PHI2 != prevPHI2 as your condition.

So something like this (I've never used Verilog; this is almost guaranteed to be wrong):

Code:

always @(negedge PHI2) begin
	sampledAddress <= externalAddress;
	sampledData <= externalData;
	sampledRWB <= externalRWB;
	sampledEnableB <= externalEnableB;
end

always @(posedge CLK_100M) begin
	prevPHI2 <= delayedPHI2;
	delayedPHI2 <= sampledPHI2;
	sampledPHI2 <= externalPHI2;

	if ((delayedPHI2 == 0) && (delayedPHI2 != prevPHI2) && (sampledEnableB == 0) && (sampledRWB == 0)) begin
		bram[sampledAddress] <= sampledData;
	end
end

The delay on PHI2 is to ensure that you aren't trying to read the other signals too soon after they're sampled. Since the two clocks are unrelated, the outputs of their FFs can change at any time during the 100MHz cycle. Wait a whole cycle before looking at them, and they're much more likely to be stable. The FPGA documentation might have something to say about metastability somewhere, and it would be worth reading that.

dmsc · Post by **dmsc** » Thu Feb 18, 2021 1:32 pm

Hi!

jfoucher wrote:

Hi all,

Code:

    // CLK_100M is a 100 MHz clock generated by the on-chip PLL
    always @(posedge CLK_100M) begin
        // This sets the command register from the address lines when the chip is enabled and the 6502 is writing
        // (the enabled and write signals are inverted ENB and RWB)
        if ((PHI2 == 1'b1) && (enabled == 1'b1) && (write == 1'b1)) begin
            command_reg <= REG;
        end
    end
    always @(posedge CLK_100M) begin
        // this saves the data from the 6502 during the whole of PHI2 high and stops when it goes low
        if ((PHI2 == 1'b1) && (enabled == 1'b1) && (write == 1'b1) && (save_data == 1'b0)) begin
            command_data <= DATA;
        end
        else if (PHI2 == 1'b0) begin
            // Here I do something like this :
            bram[command_reg] <= command_data;
        end
    end

Do you have a synchronizer on all the CPU signals? As you are transitioning clock domains, you will get metastable states if you don't.

In theory, you only need a synchronizer on PHI2, as all the other signals will latch on PHI2sync, but you need to be careful with the bus timings.

Then, the output of the synchronizer should be the transitions detected on PHI2, not the actual PHI2 value. This will make the rest of the code easier: you just do your logic under "if (PHI2syncUP)" or "if (PHI2syncDOWN)" depending on your needs.
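
A minimal sketch of such a synchronizer, assuming a single fast FPGA clock. The module name `phi2_sync` and the port names are made up for illustration; `phi2_up` and `phi2_down` play the role of PHI2syncUP and PHI2syncDOWN above.

```verilog
// Sketch only: module and signal names are hypothetical.
module phi2_sync (
    input  wire clk,        // fast FPGA clock (e.g. the 100 MHz PLL output)
    input  wire phi2_pin,   // raw PHI2 from the 6502, asynchronous to clk
    output wire phi2_up,    // one-clk pulse after a rising edge of PHI2
    output wire phi2_down   // one-clk pulse after a falling edge of PHI2
);
    // sync[0] may go metastable; sync[1] has had a full clk period to settle.
    // sync[2] holds the previous settled value, kept for edge detection.
    reg [2:0] sync = 3'b000;

    always @(posedge clk)
        sync <= {sync[1:0], phi2_pin};

    assign phi2_up   =  sync[1] & ~sync[2];
    assign phi2_down = ~sync[1] &  sync[2];
endmodule
```

Note that the strobes arrive two to three `clk` periods after the real edge. At 100 MHz that is 20 to 30 ns, longer than the 10 ns hold time quoted earlier in the thread, so the data bus would probably have to be captured earlier (for instance by registering it on every `clk` while the synchronized PHI2 is high and using an older sample). That is the kind of bus-timing care mentioned above.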

Have Fun!

BigEd · Post by **BigEd** » Thu Feb 18, 2021 2:33 pm

There's a way to avoid the hazard of asynchronous clocks internal and external to the FPGA, which might be applicable in this case. That is, use a single high speed clock in the FPGA and derive a low speed clock for the CPU which you drive out from the FPGA. It may be that the FPGA can take the original crystal's clock as input and multiply it up with a PLL.

With the FPGA in charge and everything synchronous to the high speed clock, you can sample signals - at the appropriate time - with confidence.
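
As a rough sketch of this approach, the divider below derives PHI2 from the fast clock and also produces a strobe marking the last fast-clock cycle before PHI2 falls. The module name `phi2_gen`, the `HALF` parameter and the `sample` strobe are assumptions for illustration, not anything from the thread.

```verilog
// Sketch only: names and the default divider value are hypothetical.
module phi2_gen #(
    parameter HALF = 4          // clk cycles per PHI2 half period (1..256)
) (
    input  wire clk,            // fast FPGA clock
    output reg  phi2 = 1'b0,    // clock driven out to the 6502
    output wire sample          // high during the last clk cycle of PHI2 high
);
    reg [7:0] count = 8'd0;

    always @(posedge clk) begin
        if (count == HALF - 1) begin
            count <= 8'd0;
            phi2  <= ~phi2;     // toggle once per half period
        end else begin
            count <= count + 8'd1;
        end
    end

    // Logic enabled by this strobe captures the bus on the same clk edge
    // that makes PHI2 fall, so the phase relationship is known by design.
    assign sample = phi2 && (count == HALF - 1);
endmodule
```

With a 100 MHz `clk`, `HALF = 4` would give a 12.5 MHz PHI2. Pin and level-shifter delays still have to be checked against the bus timing, since they shift when the 6502 actually sees each edge.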

jfoucher · Post by **jfoucher** » Thu Feb 18, 2021 5:31 pm

BigEd wrote:

There's a way to avoid the hazard of asynchronous clocks internal and external to the FPGA, which might be applicable in this case. That is, use a single high speed clock in the FPGA and derive a low speed clock for the CPU which you drive out from the FPGA. It may be that the FPGA can take the original crystal's clock as input and multiply it up with a PLL.

With the FPGA in charge and everything synchronous to the high speed clock, you can sample signals - at the appropriate time - with confidence.

OK, that sounds like a workable idea. Since I'd rather not have the computer rely on the FPGA, I am going to look at having the FPGA take its clock input from the CPU's PHI2 clock instead.

dmsc wrote:

Do you have a synchronizer on all the CPU signals? As you are transitioning clock domains, you will get metastable states if you don't.

I am not sure what a synchroniser is, would you mind explaining?

John West wrote:

Wait a whole cycle before looking at them, and they're much more likely to be stable

Ok so I sample the address, data and RW on PHI2 negative edge. Then I wait one cycle until I use them. I think my code already does this (or did at some point) but I'll have to check.

As requested in a previous post, I have attached some pictures of my hardware setup.
The FPGA is at the bottom left, the VGA output at the bottom, and just above are the serial chip and 2 level shifters (from left to right).

I have also posted the current non-working code here: https://gist.github.com/jfoucher/7915b8 ... c67ccfdd85

I think there might also be an issue with reading and writing the block RAM at the same time (the iCE40 BRAM is not dual port, AFAIK). To mitigate this I have tried using a FIFO as well as double buffering, neither of which I could get to work reliably.
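
A note on the block RAM concern: the iCE40 block RAM primitive (SB_RAM40_4K) exposes separate read and write ports, each with its own address and clock, so a memory with one write port and one read port can be described as in the sketch below. The module name `vram`, the port names and the 512 x 8 size are assumptions for illustration, and whether a given toolchain maps this onto block RAM should be checked in its synthesis report.

```verilog
// Sketch only: names and sizes are hypothetical.
module vram #(
    parameter AW = 9,   // address width (512 entries)
    parameter DW = 8    // data width
) (
    input  wire          wclk,   // write-side clock (6502 interface logic)
    input  wire          we,     // write enable, one wclk cycle per store
    input  wire [AW-1:0] waddr,
    input  wire [DW-1:0] wdata,
    input  wire          rclk,   // read-side clock (video logic)
    input  wire [AW-1:0] raddr,
    output reg  [DW-1:0] rdata   // registered read, valid one rclk later
);
    reg [DW-1:0] mem [0:(1<<AW)-1];

    // Write port
    always @(posedge wclk)
        if (we) mem[waddr] <= wdata;

    // Read port
    always @(posedge rclk)
        rdata <= mem[raddr];
endmodule
```

Reading an address while it is being written may return either the old or the new value, which for a display buffer usually shows up as at most a one-frame glitch.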

BigEd · Post by **BigEd** » Thu Feb 18, 2021 5:35 pm

Taking the cpu clock into the FPGA and using the PLL to multiply up... that might help, in that it makes the two clocks synchronous, but you no longer know the phase relationship, so if the FPGA samples something as it is changing, you still have a potential problem. (I think!)

dmsc · Post by **dmsc** » Fri Feb 19, 2021 12:26 pm

Hi!

jfoucher wrote:

dmsc wrote:

Do you have a synchronizer on all the CPU signals? As you are transitioning clock domains, you will get metastable states if you don't.

I am not sure what a synchroniser is, would you mind explaining?

It is a fundamental part of digital circuit design when you have two clock domains interacting. This is fairly common: any signal coming to the FPGA from an outside circuit with its own clock will require proper synchronization. Even a simple UART needs this.

This is needed because FPGA logic needs a stable signal on its inputs before a clock edge. If the signal changes just at the clock edge, the output of the flip-flop will be in an undefined state: a metastable state.

See: https://www.intel.com/content/dam/www/p ... bility.pdf

The most common synchronizer is a two flip-flop cascade, implemented in your FPGA logic; that should be enough for your design. You should read about metastability and clock synchronization before attempting any FPGA design that interacts with any signal outside the FPGA. Here are more examples of different synchronizer circuits: https://www.edn.com/synchronizer-techni ... ocs-fpgas/

Have Fun!

unclouded · Post by **unclouded** » Mon Feb 22, 2021 7:20 am

I interfaced an iCE40 to a WDC65C02 here:

https://github.com/neilstockbridge/6502 ... /master/R2

It looks like I clocked the FPGA at 12 MHz and then used a (run-time scalable) divider to produce PHI2 for the 6502. I think I used 3.3 V for the 6502 to obviate the level shifters.

The clock scaler is `ClockScaler` in `top.v`:

https://github.com/neilstockbridge/6502 ... top.v#L209

There are a few peripherals such as a primitive UART and an SPI Master (a cut-down 65SPI), reset control for power-up and (byte granular) address decoding here:

https://github.com/neilstockbridge/6502 ... oder.v#L46

jfoucher · Post by **jfoucher** » Mon Mar 01, 2021 7:54 pm

I managed to make it work by bringing the CPU clock (12.5MHz) into the FPGA and driving the PLL from it. This seems to solve most issues when the CPU is writing to the FPGA. However, I have not yet been able to read data from the FPGA... But at least that's half of it done.

unclouded, thanks for the code, I will take a look at it, but my situation is different in that I prefer the CPU clock to be provided outside of the FPGA so that the computer can still run without it.

kakemoms · Post by **kakemoms** » Mon Apr 12, 2021 1:06 pm

Interesting topic! I haven't been so active here lately.. mostly because I have a zillion other projects and stuff going on.

I did exactly this some years ago. I connected a MachXO3 FPGA to an NMOS 6502 (in a VIC-20) and used the internal memory of the XO3 as expansion memory for the 6502. It worked fairly well, and I wrote a lot of software and added a 6502MMU core that uses undefined opcodes (on the NMOS side) to access up to 1MiB of RAM. That was SRAM that was read through the XO3 (but I also used the RAM internal to it).

With respect to syncing the 1MHz (more or less) bus of the NMOS part to the XO3, I used a 100MHz clock in the XO3 which was driven by an external 16.384MHz crystal. The internal clocks of the FPGA are silicon based and not very accurate.

The most demanding thing with such a setup is that the NMOS clock transition is not very sharp. It goes from "0" to "1" (and back) with a lot of slew. The fast logic of modern FPGAs (and the like) gets confused during the transition, since its input can flip back and forth. Thus you may end up with a lot of "0"s and "1"s on any transition. Most of this can be filtered out, but not all of it, so your design needs to be quite redundant to account for that.
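
One possible way to filter such a slow edge, sketched under the assumption that the input has already been through a synchronizer: the output only changes once the last `N` samples all agree. The module name `edge_filter`, the port names and the default `N` are made up for illustration and are not the filter used in the design described here.

```verilog
// Sketch only: names and the default N are hypothetical.
module edge_filter #(
    parameter N = 4             // consecutive equal samples required (N >= 2)
) (
    input  wire clk,            // fast FPGA clock
    input  wire in_sync,        // noisy input, already synchronized to clk
    output reg  out = 1'b0      // cleaned-up copy of the input
);
    reg [N-1:0] hist = {N{1'b0}};   // the last N samples, newest in bit 0

    always @(posedge clk) begin
        hist <= {hist[N-2:0], in_sync};
        if (&hist)              // all N samples high
            out <= 1'b1;
        else if (~|hist)        // all N samples low
            out <= 1'b0;
        // otherwise keep the previous value while the input chatters
    end
endmodule
```

The cost is extra latency of roughly `N` fast-clock cycles on every edge, which has to be budgeted against the bus timing.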

With an internal 100MHz clock you can count from 0 to 99 within a 1MHz clock cycle. The clock of old NMOS circuits is not very accurate, so after a certain number of counts, you need to wait for an edge on the actual 1MHz clock. This edge can vary to some degree, so you need to be certain that you read (or write) at the correct time to get the NMOS6502 to read or write values.

I also used a 74245 buffer of some sort since my fpga runs at 3.3V and the 6502 at around 4V-5V. You may not need this, but it is probably a good idea to drive enough current into the long PCB lines of the old NMOS board. Also put on some capacitors to help the buffer to drive them.

Anyway, good luck!
