Posted: Thu Mar 08, 2012 4:55 pm
by Arlet
I just hooked up the board to 3.3V, and attached the JTAG cable. Impact recognizes both devices, so that's good.
I also managed to use one of the DCM blocks to generate a 12MHz USB clock, so I could run the MCP2200 device. I don't have a UART block in the FPGA yet, but I made a simple RX-TX loopback device, and saw the echo from my keystrokes in the terminal program.
BTW, what speed grade is the FPGA ?
Posted: Thu Mar 08, 2012 6:49 pm
by ElEctric_EyE
Sweet! So you got Microchip's program to set the MCP2200 for the bitrate?
It's a -2 speed grade...
Posted: Thu Mar 08, 2012 7:00 pm
by Arlet
Sweet! So you got Microchip's program to set the MCP2200 for the bitrate?
No, I'm using a Linux PC (my main desktop machine) running Ubuntu. It comes with a built-in driver for the MCP2200, and it will let me open the port like any other serial port, including setting baud rates. So, I don't need the Microchip program. I'm not sure if there's a way to set the LED properties, but I don't really care about that right now.
Next step is to include the CPU and the UART interface. I'm just going to use the standard 8-bit 6502 core for now, and focus on the peripherals.
Posted: Fri Mar 09, 2012 1:14 am
by ElEctric_EyE
Nice...
BTW, the SDRAM is the highest speed grade from Micron. A 167MHz part #MT48LC16M16A2P-6A:D
So far in this project this part has not been tested.
Posted: Sat Mar 10, 2012 2:00 pm
by Arlet
I do. This is just the part defining pin locations. Bottom half is SDRAM:
Code: Select all
//NET "DQ0" LOC = P119;
//NET "DQ1" LOC = P118;
//NET "DQ2" LOC = P117;
//NET "DQ3" LOC = P116;
//NET "DQ4" LOC = P115;
//NET "DQ5" LOC = P114;
//NET "DQ6" LOC = P112;
//NET "DQ7" LOC = P111;
These signals are in the wrong order.
Posted: Sat Mar 10, 2012 2:21 pm
by ElEctric_EyE
You are right. I had forgotten to change that from V1.0. Checking... There are other changes I made to the SDRAM routing and forgot to update.
Code: Select all
//NET "DQ0" LOC = P111;
//NET "DQ1" LOC = P112;
//NET "DQ2" LOC = P114;
//NET "DQ3" LOC = P115;
//NET "DQ4" LOC = P116;
//NET "DQ5" LOC = P117;
//NET "DQ6" LOC = P118;
//NET "DQ7" LOC = P119;
Posted: Sat Mar 10, 2012 2:37 pm
by Arlet
I made a simple interface to access the SDRAM from the 6502 core. For now, it only supports single byte random reads. Because of the way SDRAM works, it takes 6 extra cycles to read a byte (row activation, CAS delay, and pad/trace delay). During this time, I use the RDY signal to halt the core. This means that the SDRAM isn't particularly fast. However, the fact that it's running at 100 MHz means that it still only takes 70 ns for a memory access, which makes it competitive with a 14MHz 6502 running from SRAM.
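The timing comparison can be checked with a little arithmetic. This is only a sketch: it assumes an access costs one normal cycle plus the 6 wait cycles, and the function names are made up for illustration.

```python
def cycle_time_ns(clock_mhz):
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def access_time_ns(clock_mhz, extra_cycles, base_cycles=1):
    """Time for one access: the normal cycle plus the wait states (assumed model)."""
    return (base_cycles + extra_cycles) * cycle_time_ns(clock_mhz)


sdram_ns = access_time_ns(100, 6)   # 100 MHz core, 6 extra cycles per read
sram_6502_ns = cycle_time_ns(14)    # one cycle of a 14 MHz 6502 on SRAM

print(f"SDRAM read at 100 MHz: {sdram_ns:.1f} ns")
print(f"14 MHz 6502 cycle:     {sram_6502_ns:.1f} ns")
```

Seven cycles at 10 ns each gives the 70 ns figure, which is just under the roughly 71.4 ns cycle of a 14 MHz part.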
It's possible to reduce the access time by a few cycles (in most cases), by not precharging the row after the read. This would avoid the row activation on the next read if it happens to be in the same row.
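To get a feel for the trade-off, here is a rough behavioral model of the two policies. The split of the 6 extra cycles into T_RCD, T_RP and CAS_AND_PAD is an assumption (the post only gives the total), the row mapping in row_of is hypothetical, and refresh and multiple banks are ignored.

```python
# Assumed cycle costs; only the 6-cycle total comes from the post.
T_RCD = 2        # ACTIVATE to READ delay (assumption)
T_RP = 2         # PRECHARGE time (assumption)
CAS_AND_PAD = 4  # CAS latency plus pad/trace delay (assumption)


def row_of(addr):
    """Hypothetical mapping: 512 consecutive addresses share one row."""
    return addr >> 9


def extra_cycles_closed_page(addresses):
    """Row is precharged after every read, so every read pays for ACTIVATE."""
    return [T_RCD + CAS_AND_PAD for _ in addresses]


def extra_cycles_open_page(addresses):
    """Row is left open; a hit skips ACTIVATE, a miss must PRECHARGE first."""
    open_row = None
    costs = []
    for addr in addresses:
        row = row_of(addr)
        if row == open_row:
            cost = CAS_AND_PAD                  # row hit
        elif open_row is None:
            cost = T_RCD + CAS_AND_PAD          # nothing open yet
        else:
            cost = T_RP + T_RCD + CAS_AND_PAD   # row miss
        open_row = row
        costs.append(cost)
    return costs


trace = [0x2000, 0x2001, 0x2002, 0x2400, 0x2401]
print("closed page:", extra_cycles_closed_page(trace))
print("open page:  ", extra_cycles_open_page(trace))
```

With these assumed numbers a row hit saves two cycles, but a miss on an open row costs two more than before, which is why the gain only shows up "in most cases".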
Ideally, the SDRAM needs to be accompanied by a block RAM cache, but that's a project for later.
Because I don't have the write logic implemented yet, I only read whatever random data happens to be in the SDRAM. But, using the scope, I can see that it's the correct data. It's also stable when I touch the DQ signals with my finger.
Next step is to add write logic, and run some read/write tests.
Posted: Sat Mar 10, 2012 2:43 pm
by BigEd
Great to hear about this!
Posted: Sat Mar 10, 2012 3:15 pm
by Arlet
I added some write logic, and the first results look promising. I wrote a simple loop to write 00..FF to an SDRAM location, and read it back, and dump the result to the UART. It shows all the bytes as you would expect them.
However, I spotted a nasty performance issue related to the dummy reads we were talking about in the other thread. For instance, when you do a STA (zp), Y, the core goes through states INDY0, INDY1, INDY2, and INDY3. In the INDY2 state, the preliminary address is already put on the AB-bus, while the core adds a possible carry to the MSB. In the INDY3 state, the final address is presented on the bus, and the WE is asserted. Because the 6502 core doesn't have an Output Enable signal, the memory interface assumes that the core wants to read from the preliminary address in INDY2, and stalls the core for 6 cycles while it fetches the data, which is then discarded.
Everything still works, but it's a waste of cycles. Perhaps adding an OE signal to the core would be a good investment, although it's not always easy to see what reads are dummy reads. For instance, when doing LDA (zp), Y, the read during INDY2 is only discarded when there's a page boundary crossing.
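As a rough illustration of when the INDY2 read is actually needed, here is a small behavioral model. It only covers the (zp),Y case described above, it is not taken from the core, and the function names are invented for this sketch.

```python
def indy2_address(base, y):
    """Preliminary address in INDY2: low byte already summed, high byte not yet fixed."""
    lo = (base & 0xFF) + y
    carry = lo > 0xFF
    prelim = (base & 0xFF00) | (lo & 0xFF)
    return prelim, carry


def indy2_output_enable(is_store, base, y):
    """Hypothetical OE for the INDY2 cycle.

    The read is only useful for a load that does not cross a page boundary.
    Stores never use it, and a page crossing means the address is still wrong.
    """
    _, carry = indy2_address(base, y)
    return (not is_store) and (not carry)


# LDA (zp),Y without page crossing: the INDY2 read is the real one.
print(indy2_output_enable(False, 0x2010, 0x05))
# LDA (zp),Y crossing a page: the INDY2 read is discarded.
print(indy2_output_enable(False, 0x20F0, 0x20))
# STA (zp),Y: the INDY2 read is always a dummy.
print(indy2_output_enable(True, 0x2010, 0x05))
```

A memory interface that sees OE low during INDY2 could skip the SDRAM access and leave RDY high for that cycle.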
Posted: Sat Mar 10, 2012 4:14 pm
by Arlet
Here's my project so far:
SDRAM demo.
I've added all my Verilog files. The file 'main.v' is the top level design. It uses DCM_SP to make the SDRAM clock, which is slightly delayed with respect to the normal clock to meet the SDRAM setup times.
The 'sdram.v' file is a slightly modified version of my existing SDRAM controller. It uses separate read and write channels, which made more sense for my old project, but it still works here. Ideally, the SDRAM should get a R/W bus for the CPU, and an additional read-only bus for video data. For now, it works, and it was the least amount of work.
In 'sdramif.v' you'll find a simple interface to hook up the existing SDRAM controller with the 6502 memory bus. SDRAM is mapped between $2000 and $AFFF. I have block RAM between $0-$1FFF, and some peripherals at $B000. Since the SDRAM block is the only one to use RDY, I just made a straight connection.
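The memory map can be summarized in a small software model. This is a sketch, not a transcription of 'sdramif.v': the region names are made up, and since the post does not say how far the peripheral area extends, everything from $B000 up is simply labeled "other".

```python
def decode(addr):
    """Return which block answers a 16-bit CPU address (illustrative model)."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("address out of 16-bit range")
    if addr < 0x2000:
        return "bram"    # $0000-$1FFF: block RAM
    if addr < 0xB000:
        return "sdram"   # $2000-$AFFF: SDRAM
    return "other"       # $B000 and up: peripherals start here; extent not stated


def may_stall(addr):
    """Only the SDRAM block drives RDY, so only those accesses can insert waits."""
    return decode(addr) == "sdram"


for a in (0x0000, 0x1FFF, 0x2000, 0xAFFF, 0xB000):
    print(f"${a:04X} -> {decode(a)}, may stall: {may_stall(a)}")
```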
The file 'mt48lc16m16a2.v' contains a Micron model for the SDRAM, which can be used for simulation.
Posted: Sat Mar 10, 2012 8:45 pm
by Arlet
By the way, on my board there was a short between via and trace. Find TP4 on the board, and follow straight down to get to the via. It was touching the SDA trace.
Some more progress... after a bit of struggle, I managed to access the CS4954 through the I2C interface. At first, it didn't respond, and the datasheet wasn't very helpful. After some puzzling, I realized it probably needed a clock input to drive the I2C state machine.
When I applied the 27MHz video clock, the device suddenly responded to its I2C address (hex 00). Tomorrow, I'll try to program some registers, and see if I can get some video output.
By the way, I didn't bother with an I2C peripheral in the FPGA. I just defined the SCL/SDA signals as open-drain GPIO pins, controlled by software. This is easy to implement and good enough for the CS4954, which only needs to be configured once.
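For readers who have not done software I2C before, the idea looks roughly like this. It is a Python model rather than the 6502 code actually used; the pin callbacks, the FakePins test double and the byte value are all hypothetical, and real code would need delays to respect the bus timing. Writing 1 to a pin means "release the line", writing 0 means "pull it low", as with open-drain outputs.

```python
class BitBangI2C:
    """Master-side I2C over two open-drain pins (illustrative sketch)."""

    def __init__(self, set_scl, set_sda, get_sda):
        self.set_scl, self.set_sda, self.get_sda = set_scl, set_sda, get_sda

    def start(self):
        # START: SDA falls while SCL is high.
        self.set_sda(1)
        self.set_scl(1)
        self.set_sda(0)
        self.set_scl(0)

    def stop(self):
        # STOP: SDA rises while SCL is high.
        self.set_sda(0)
        self.set_scl(1)
        self.set_sda(1)

    def write_byte(self, value):
        # Shift out 8 bits, MSB first; SDA only changes while SCL is low.
        for i in range(7, -1, -1):
            self.set_sda((value >> i) & 1)
            self.set_scl(1)
            self.set_scl(0)
        # Release SDA so the slave can pull it low to acknowledge.
        self.set_sda(1)
        self.set_scl(1)
        acked = self.get_sda() == 0
        self.set_scl(0)
        return acked


class FakePins:
    """Test double: records SDA on each rising SCL edge and always ACKs."""

    def __init__(self):
        self.scl, self.sda, self.sampled = 1, 1, []

    def set_scl(self, level):
        if level and not self.scl:
            self.sampled.append(self.sda)
        self.scl = level

    def set_sda(self, level):
        self.sda = level

    def get_sda(self):
        return 0


pins = FakePins()
i2c = BitBangI2C(pins.set_scl, pins.set_sda, pins.get_sda)
i2c.start()
acked = i2c.write_byte(0x42)   # arbitrary example byte
i2c.stop()
print(acked, pins.sampled[:8])
```

On the real board the three callbacks would read and write the GPIO register bits that control SCL and SDA.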
If you want to interface the touch screen IC, a hardware I2C state machine would probably be easier. However, instead of using a general purpose I2C peripheral, it would be nicer to have a dedicated peripheral that communicates with the TSC2003 without software assistance. Something that can poll the touch screen coordinates at full speed, and leave the coordinates in a register for the CPU to read out. To access the CS4954, the I2C controller could be turned off, and the SCL/SDA lines placed under software control.
Posted: Sat Mar 10, 2012 10:40 pm
by ElEctric_EyE
By the way, on my board there was a short between via and trace. Find TP4 on the board, and follow straight down to get to the via. It was touching the SDA trace.
I see that on the layout, the SDA trace is not perfectly parallel and dips down real close to the via. The program should have given me a warning. I've gotten them before when vias were too close to traces. I'll add that to my list of changes for V1.2...
Nice progress you're making! Very quick!
I've been busy at work all day and am just now catching up.
Posted: Sun Mar 11, 2012 1:36 am
by ElEctric_EyE
...However, the fact that it's running at 100 MHz means that it still only takes 70 ns for a memory access, which makes it competitive with a 14MHz 6502 running from SRAM...
Now this is pretty impressive if you're using this SDRAM @100MHz and shifting pixels out at that rate for the video? even though the RAM is accessed by the CPU at much lower rates?
...Everything still works, but it's a waste of cycles. Perhaps adding an OE signal to the core would be a good investment, although it's not always easy to see what reads are dummy reads. For instance, when doing LDA (zp), Y, the read during INDY2 is only discarded when there's a page boundary crossing.
Surely the OE signal for the DO[15:0] (databus out) can be implemented for a certain pattern of opcodes with no problem.
Posted: Sun Mar 11, 2012 7:37 am
by Arlet
Now this is pretty impressive if you're using this SDRAM @100MHz and shifting pixels out at that rate for the video? even though the RAM is accessed by the CPU at much lower rates?
Yes, that's no problem. Video can access the SDRAM in longer bursts, which will be much more efficient. Of course, if the CPU wants to access the SDRAM in the middle of such a burst, it will experience a few extra wait states.
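A back-of-the-envelope model shows why bursts help. It assumes the same 6 cycles of overhead per access as the single-byte reads, and one word transferred per cycle after that; both are simplifications, and the function name is made up.

```python
def bus_efficiency(burst_length, overhead_cycles=6):
    """Fraction of bus cycles that carry data for one burst (assumed model)."""
    return burst_length / (burst_length + overhead_cycles)


for n in (1, 8, 64):
    print(f"burst of {n:2d}: {bus_efficiency(n):.0%} of cycles carry data")
```

Under these assumptions a single read uses one cycle in seven for data, while a longer burst spreads the same overhead over many transfers.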
And if you want lots of sprites in SDRAM, it can get really busy on the bus, retrieving all the data. Of course, it's up to the programmer to decide how to use the system.
If you just want a simple text screen, with no graphics, you could even use two block RAMs (one for the screen contents, the other for the character ROM). Then you can get something like this:
Image no longer available: http://ladybug.xs4all.nl/arlet/fpga/text1.jpg
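The two-block-RAM text screen amounts to a double lookup per pixel. The sketch below models it in software; the 80-column layout, the 8x8 glyph size and the names are assumptions for illustration, not details of the design shown in the picture.

```python
COLS = 80               # characters per text line (assumption)
GLYPH_W = GLYPH_H = 8   # glyph size in pixels (assumption)


def pixel(screen_ram, char_rom, x, y):
    """Return 1 if the pixel at (x, y) is lit.

    screen_ram holds one character code per cell, char_rom holds
    GLYPH_H bytes per character, one byte per glyph row, MSB leftmost.
    """
    col, row = x // GLYPH_W, y // GLYPH_H
    code = screen_ram[row * COLS + col]                   # first block RAM
    glyph_row = char_rom[code * GLYPH_H + y % GLYPH_H]    # second block RAM
    return (glyph_row >> (GLYPH_W - 1 - x % GLYPH_W)) & 1


screen_ram = bytearray(COLS * 30)
char_rom = bytearray(256 * GLYPH_H)
char_rom[1 * GLYPH_H + 0] = 0b10000000   # character 1: single dot, top left
screen_ram[0] = 1                        # put it in the first cell
screen_ram[1 * COLS + 1] = 1             # and in row 1, column 1

print(pixel(screen_ram, char_rom, 0, 0), pixel(screen_ram, char_rom, 1, 0))
```

In hardware the same two lookups would be pipelined, with the pixel counters supplying x and y.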
Posted: Mon Mar 12, 2012 8:12 am
by Arlet
Not much luck with the CS4954 producing a good video signal. I've initialised a few registers to produce color bars (there is a built-in color bar test mode), and I generate the hsync/vsync pulses as well as the 27 MHz clock.
I see bars on the TV, but they're not very stable, and they're black and white. I am not sure what the problem is.