6502.org
http://forum.6502.org/

Video/sprites for the 65org16 Dev. board
http://forum.6502.org/viewtopic.php?f=10&t=2107
Page 2 of 4

Author:  ElEctric_EyE [ Mon Apr 09, 2012 12:47 am ]
Post subject: 

Arlet wrote:
...The relative positioning would be a new feature, and what it does is make the X/Y coordinate work as an offset with respect to the previous sprite.

So, you could make a sprite list where the first sprite (in the top left corner) has an absolute position. All the other sprites would be programmed (only one time) with a relative offset to this first one. If the CPU then moves the first sprite, all the others would automatically follow...

This is the result of the SDRAM pipeline?

Author:  Arlet [ Mon Apr 09, 2012 6:31 am ]
Post subject: 

ElEctric_EyE wrote:
This is the result of the SDRAM pipeline?


No, right now I have a list of sprites that look like this:
Code:
X=0, Y=00, Sprite=0
X=0, Y=16, Sprite=0
X=0, Y=32, Sprite=0
...
X=0, Y=208, Sprite=2
X=0, Y=224, Sprite=2

That's a list of 14 sprites in the Mario game, forming the first column of tiles. Sprite=0 means a 16x16 image of sky, and Sprite=2 is a 16x16 image of the rocks. This is then followed by 16 more columns, at X=16, X=32, and so on.

Now, if I want to scroll the entire play field 1 pixel to the left, the CPU has to go through that list, and subtract #1 from all the X coordinates. Like this:
Code:
X=-1, Y=00, Sprite=0
X=-1, Y=16, Sprite=0
X=-1, Y=32, Sprite=0
...
X=-1, Y=208, Sprite=2
X=-1, Y=224, Sprite=2

With my new feature, the first column would look like this:
Code:
X=00, Y=00, Sprite=0
X+=0, Y+=16, Sprite=0
X+=0, Y+=16, Sprite=0
...
X+=0, Y+=16, Sprite=2
X+=0, Y+=16, Sprite=2

The "Y+=16" notation means that the sprite is positioned 16 pixels lower than the sprite before that. This option would be encoded using an extra bit in the sprite descriptor. Now, if you want to move all the sprites one pixel to the left, the CPU only has to change the first one:
Code:
X=-1, Y=00, Sprite=0
X+=0, Y+=16, Sprite=0
X+=0, Y+=16, Sprite=0
...
X+=0, Y+=16, Sprite=2
X+=0, Y+=16, Sprite=2

The other 13 sprites in this list would automatically be shifted as well, because they are positioned at -1 + 0 = -1.
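
A small Python model may make the relative mode easier to follow. It is only a sketch of the idea, not the renderer's logic: the tuple layout `(relative, x, y, image)` and the name `resolve_positions` are made up for illustration, and the real descriptor encoding (the extra bit) is not shown in this thread.

```python
def resolve_positions(sprites):
    """Turn a sprite list with relative entries into absolute positions.

    Each entry is (relative, x, y, image). When `relative` is True, x and y
    are offsets from the previous sprite's resolved position.
    """
    resolved = []
    prev_x, prev_y = 0, 0
    for relative, x, y, image in sprites:
        if relative:
            x, y = prev_x + x, prev_y + y
        resolved.append((x, y, image))
        prev_x, prev_y = x, y
    return resolved


# First column: one absolute sprite, then relative ones 16 pixels apart.
column = [(False, 0, 0, 0)] + [(True, 0, 16, 0)] * 3

# Scrolling one pixel left only touches the first entry.
scrolled = [(False, -1, 0, 0)] + column[1:]
```

Resolving `scrolled` moves every sprite in the column to X=-1, although only the first entry was changed.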

Author:  Arlet [ Tue Apr 10, 2012 5:41 am ]
Post subject: 

I've been trying to modify the video rendering pipeline. Instead of a shared memory between the cs4954 module and the sprite renderer, I've moved to a local memory in the sprite renderer, and an async FIFO between the two blocks.

This means there's no more line-by-line lockstep operation between generating the pixel values, and displaying them. The renderer starts generating a new screen when it gets a signal, and then generates the whole screen as fast as possible. The data is then sent through a FIFO. When the FIFO gets full, the renderer pauses.

This design makes it easy to send the pixel data to SDRAM instead of the screen, or even to my UART port for a low-speed (but 100% digital) video capture. The new local memory inside the sprite module uses both ports in parallel, so it can do single cycle read/modify/writes on the data. This will be useful for alpha blending.
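
To illustrate the flow control described above (the renderer runs flat out and pauses whenever the FIFO is full), here is a toy Python simulation. It is a single-clock sketch, not the async FIFO from the design: the function name, the FIFO depth and the display rate are all hypothetical.

```python
from collections import deque


def simulate(total_pixels, fifo_depth, display_period):
    """Toy model: the renderer produces one pixel per cycle unless the FIFO
    is full; the display side consumes one pixel every `display_period`
    cycles. Returns (pixels received in order, cycles the renderer paused).
    """
    fifo = deque()
    produced = 0
    received = []
    stalls = 0
    cycle = 0
    while len(received) < total_pixels:
        # Display side drains at its own, slower rate.
        if cycle % display_period == 0 and fifo:
            received.append(fifo.popleft())
        # Renderer runs as fast as it can, pausing when the FIFO is full.
        if produced < total_pixels:
            if len(fifo) < fifo_depth:
                fifo.append(produced)
                produced += 1
            else:
                stalls += 1
        cycle += 1
    return received, stalls
```

With a slow consumer the renderer spends most cycles paused, but the pixels still arrive complete and in order, which is all the downstream block needs.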

There's still a bug in the code. Sometimes there's some jitter in the screen, which I suspect is a problem somewhere at the corner cases of the FIFO empty/full handling. Since it appears to be dependent on exact timing between CS4954-generated hsync/vsync and the FPGA, it is hard to reproduce in sims.

Author:  Arlet [ Wed Apr 11, 2012 9:28 am ]
Post subject:  Re:

Arlet wrote:
There's still a bug in the code. Sometimes there's some jitter in the screen, which I suspect is a problem somewhere at the corner cases of the FIFO empty/full handling. Since it appears to be dependent on exact timing between CS4954-generated hsync/vsync and the FPGA, it is hard to reproduce in sims.

Ah, found it. The 'fifo_full' signal was declared as an output instead of an input, so the renderer never detected that the FIFO was full. The weird part is that ISE didn't give me an error message for connecting two outputs together, probably because the signal was only assigned to in one of the modules. The simulator apparently just turned the output into an input, so it worked correctly in simulation.

Author:  Arlet [ Wed Apr 11, 2012 8:21 pm ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

I tried the YUV844 format, but I didn't really like how it looked for subtle color changes, such as skin tones. I decided to add an RGB555 -> YUV conversion, and see if RGB555 looked better. I think it does. Here's the result of a test image:
Image
I think it's pretty good, considering it's only 256x256 pixels, and the image wasn't dithered. With dithering, and at double the resolution, it should look even better.

Here's the code for the RGB->YUV conversion. It's actually quite straightforward, and ISE does a good job of optimizing out all the multiplications.
Code:
always @(posedge clk54) begin
        Yr <= 13'd66  * r;
        Yg <= 13'd129 * g;
        Yb <= 13'd25  * b;

        Ur <= 13'd38  * r;
        Ug <= 13'd74  * g;
        Ub <= 13'd112 * b;

        Vr <= 13'd112 * r;
        Vg <= 13'd94  * g;
        Vb <= 13'd18  * b;
end

always @(posedge clk54) begin
        Ys <=  Yr + Yg + Yb + 16;
        Us <=  Ub - Ug - Ur + 16;
        Vs <=  Vr - Vg - Vb + 16;
end

// Combinational stage: blocking assignments are the usual choice in always @*
always @* begin
        y = Ys[12:5] + 16;
        u = Us[12:5] + 128;
        v = Vs[12:5] + 128;
end

This is also running with the new rendering code.
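
For checking the arithmetic outside the simulator, the conversion posted above can be mirrored in a few lines of Python. This is a sketch under assumptions that the post does not state: the intermediate sums are taken to be 13-bit two's complement values (suggested by the `[12:5]` slices) and `y`, `u`, `v` are taken to be 8 bits wide. The function name is made up.

```python
def rgb555_to_yuv(r, g, b):
    """Bit-level model of the posted conversion for 5-bit r, g, b (0..31)."""
    # Stages 1 and 2: constant multiplications and sums. The +16 acts as
    # rounding, since 32 is divided out by the [12:5] slice below.
    ys = 66 * r + 129 * g + 25 * b + 16
    us = 112 * b - 74 * g - 38 * r + 16
    vs = 112 * r - 94 * g - 18 * b + 16

    def bits_12_5(value):
        # Model the [12:5] slice of a 13-bit two's complement value.
        return (value & 0x1FFF) >> 5

    # Stage 3: add the offsets and wrap to 8 bits (assumed output width).
    y = (bits_12_5(ys) + 16) & 0xFF
    u = (bits_12_5(us) + 128) & 0xFF
    v = (bits_12_5(vs) + 128) & 0xFF
    return y, u, v
```

Under these assumptions black maps to (16, 128, 128) and full white to (229, 128, 128), so luma stays a little below the usual 235 ceiling because the 5-bit maximum is 31 rather than 32.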

Author:  Arlet [ Sat Apr 14, 2012 3:12 pm ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

After fixing a few more bugs in the new sprite renderer (this is probably the most complex piece of Verilog I've written so far), I added a transparency feature, and made a simple test with 10 sprites, 64x64 pixels each.

See it here on YouTube

Author:  ElEctric_EyE [ Sun Apr 15, 2012 12:08 am ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

That is awesome! I have a couple questions:
1) Is there a delay on the fastest of those green orbs? I'm curious what max speed is.

2) What speeds are your CPU and the SDRAM clock right now?

How cool would this be when all these sprites could 'page flip' their images, so instead of an orb, maybe a rotating 3D vector cube changing colors?!

Author:  Arlet [ Sun Apr 15, 2012 5:59 am ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

There's no "delay" on any of the sprites, except the inherent delay in the frame rate. PAL has 50 fields per second. For every field, you can change the position of all of the sprites by writing new X/Y coordinates. The slowest orbs have the coordinates changed by 1 pixel, and the fastest go by 5 pixels. You can make them even faster by skipping more pixels, but the more pixels you skip the less natural the motion will seem. NTSC has 60 fields per second, so they'll go slightly faster. Because of the fixed frame rate, you'll always be limited in the choice of smooth speeds.
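
As a quick worked example of the speeds mentioned above, the on-screen speed is just the per-field step times the field rate. The helper name is hypothetical, and 60 is the nominal NTSC rate used in this post.

```python
PAL_FIELDS = 50    # fields per second
NTSC_FIELDS = 60   # nominal fields per second


def pixels_per_second(step, fields_per_second):
    """Sprite speed when its coordinate changes by `step` pixels every field."""
    return step * fields_per_second


slowest_pal = pixels_per_second(1, PAL_FIELDS)
fastest_pal = pixels_per_second(5, PAL_FIELDS)
fastest_ntsc = pixels_per_second(5, NTSC_FIELDS)
```

So the orbs in the demo cover 50 to 250 pixels per second on PAL, and the only smooth speeds available are whole multiples of the field rate.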

Everything is still clocked at 100 MHz.

By the way, the CPU interfaces with the video module by accessing a bunch of registers. For now, I have the following interfaces. There's a separate interface to control the CS4954:
Code:
$B030 = Left border width
$B032 = Top border width
$B034 = Status/Control (VSYNC bit etc)

The Status/Control byte is used to synchronize the animation to the frame rate. There is a status bit that indicates the current field is finished, so the CPU can modify the sprite tables. There's also a control bit that the CPU can set to indicate it is ready with the changes, and the new field can be rendered.
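
The handshake above could be modelled roughly as follows. Everything specific here is an assumption for illustration: the post does not give the bit positions, so `FIELD_DONE` and `CPU_READY` are hypothetical masks, and `VideoModel` is only a stand-in for the hardware side.

```python
FIELD_DONE = 0x01   # hypothetical status bit: current field is finished
CPU_READY = 0x02    # hypothetical control bit: sprite tables are updated


class VideoModel:
    """Stand-in for the hardware side of the Status/Control register."""

    def __init__(self):
        self.status = FIELD_DONE
        self.fields_rendered = 0

    def write_control(self, value):
        if value & CPU_READY:
            # The hardware renders the next field, then reports it finished.
            self.status &= ~FIELD_DONE
            self.fields_rendered += 1
            self.status |= FIELD_DONE


def animate_one_field(video, update_sprites):
    """CPU side: wait for the field to finish, update, then release it."""
    while not (video.status & FIELD_DONE):
        pass  # on real hardware this would poll the register
    update_sprites()
    video.write_control(CPU_READY)
```

The point of the ordering is that the sprite tables are only touched between "field finished" and "ready", so the renderer never sees a half-updated list.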

To control the sprites, there are a number of tables, each 512 bytes long, with 1 byte per sprite. The first set of 4 tables determine the position/image of the sprite:
Code:
$C000-$C1FF = X coordinate (LSB)
$C200-$C3FF = Y coordinate
$C400-$C5FF = Image number
$C600-$C7FF = Extra/Reserved bits (currently has bit 9 of X coordinate, and enable bit)

In my demo, image number 0 is the green ball, so if you do the following:
Code:
LDA #10
STA $C000 ; X = 10
STA $C200 ; Y = 10
LDA #0
STA $C400 ; image = 0
LDA #2         
STA $C600 ; extra = 2 (sprite enable)

Then you'll get a green ball at coordinates (10,10). To move the ball, just write the new coordinates to the $C000/$C200 registers. To change the appearance of the ball, just change the image number. For instance, you can make a "boing ball" demo, by preparing a set of 8 images at different angles of rotation, and then cycle the image register through 0..7 to make the ball appear to rotate.
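
For sprites other than number 0, the same writes apply at an offset into each table. The Python sketch below computes them. The table bases and the enable value 2 come from the listing above; the position of the extra X bit inside the "extra" byte is not stated, so `X_HIGH_BIT` is an assumption, as is the function name.

```python
X_TABLE = 0xC000      # X coordinate (LSB)
Y_TABLE = 0xC200      # Y coordinate
IMAGE_TABLE = 0xC400  # image number
EXTRA_TABLE = 0xC600  # extra bits

SPRITE_ENABLE = 0x02  # matches "extra = 2 (sprite enable)" in the example
X_HIGH_BIT = 0x01     # assumption: position of the extra X bit is not stated


def sprite_writes(n, x, y, image, enabled=True):
    """Return the (address, value) writes that place sprite n (0..511)."""
    if not 0 <= n < 512:
        raise ValueError("sprite number out of range")
    extra = SPRITE_ENABLE if enabled else 0
    if x & 0x100:
        extra |= X_HIGH_BIT
    return [
        (X_TABLE + n, x & 0xFF),
        (Y_TABLE + n, y & 0xFF),
        (IMAGE_TABLE + n, image & 0xFF),
        (EXTRA_TABLE + n, extra),
    ]
```

For sprite 0 at (10,10) with image 0 this yields the same four stores as the assembly example.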

To describe the images there are some further tables:
Code:
$C800 - $C9FF = Image width
$CA00 - $CBFF = Image height
$CC00 - $CDFF = X offset in bitmap
$CE00 - $CFFF = Y offset in bitmap
$D000 - $D7FF = Bitmap origin in SDRAM (3 bytes address, plus 8 reserved bits)
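
A sketch of how the addresses for one image might be derived from this map. It assumes 512 image slots, one byte per image in the first four tables and consecutive 4-byte entries in the origin table. The byte order inside an origin entry is not given here, so the sketch only returns the entry's base address. All names are hypothetical.

```python
IMAGE_TABLES = {
    "width": 0xC800,
    "height": 0xCA00,
    "x_offset": 0xCC00,
    "y_offset": 0xCE00,
}
ORIGIN_TABLE = 0xD000
ORIGIN_ENTRY_BYTES = 4  # 3 address bytes plus 1 reserved byte


def image_addresses(n):
    """Return the register addresses that describe image n (0..511)."""
    if not 0 <= n < 512:
        raise ValueError("image number out of range")
    addrs = {name: base + n for name, base in IMAGE_TABLES.items()}
    addrs["origin"] = ORIGIN_TABLE + ORIGIN_ENTRY_BYTES * n
    return addrs
```

With 512 entries of 4 bytes the origin table ends exactly at $D7FF, which matches the range listed above.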

Author:  Arlet [ Sun Apr 15, 2012 9:36 pm ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

Here's an example of how you can use a small bitmap to define a larger sprite. The background is a 256x256 sprite, but only has a 64x64 bitmap definition. As a result, the bitmap is tiled to fill the entire 256x256 space. By using the X/Y bitmap offsets, it is possible to scroll the bitmap inside the sprite.
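
The tiling and scrolling can be pictured as a modulo lookup. This Python sketch assumes the offsets are simply added to the pixel position and wrapped at the bitmap size. That is consistent with the description but is not taken from the Verilog, and the function name is made up.

```python
def bitmap_pixel(sx, sy, bitmap_w, bitmap_h, x_offset=0, y_offset=0):
    """Map a pixel inside the sprite to a pixel in the (smaller) bitmap."""
    bx = (sx + x_offset) % bitmap_w
    by = (sy + y_offset) % bitmap_h
    return bx, by
```

For a 64x64 bitmap, sprite pixel (70, 130) reads bitmap pixel (6, 2), and stepping the X offset by one each field scrolls the pattern sideways without touching the bitmap data.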

http://www.youtube.com/watch?v=VwXqZmVg-lU

Author:  ElEctric_EyE [ Mon Apr 16, 2012 12:42 am ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

Ok, so your CPU is running at 100MHz as is your SDRAM video controller. I assume the CPU is your original 8 bit version?

I would like to experiment with your SDRAM code.
I hope I'm not being too presumptuous here!... On my system, I would like to use the current version of 65016.b core, although it's only spec'd at ~90MHz.

Author:  Arlet [ Mon Apr 16, 2012 5:32 am ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

ElEctric_EyE wrote:
Ok, so your CPU is running at 100MHz as is your SDRAM video controller. I assume the CPU is your original 8 bit version?

Yes, everything is running at 100 MHz. I've never tried running things at different clock frequencies, and honestly I have no idea how to do it, except in the "easy" cases where the clock domains can be separated by a dual port RAM. The core is a slightly modified version of the 8 bit original. I have removed a few cycles at the cost of an extra adder in the address calculator. Also, I moved the stack page into page zero. I didn't need so much zero page/stack, so now I have extra room for code.
Quote:
I would like to experiment with your SDRAM code

Attached. Note that the sdramif.v module is made for an 8 bit CPU. For the 65org16 core you'll need to modify a few things.
Code:
        sdram_wr_data <= { DOH, DO };
        sdram_addr <= { ABH, AB[7:0] };

Should be changed into:
Code:
        sdram_wr_data <= DO;
        sdram_addr <= AB[23:0];

And some of the signals need to be changed from 8 to 16 bit.
You can also remove this code here:
Code:
always @(posedge clk)
    if( ctrl & WE )
        case( AB[1:0] )
            0: ABH[ 7:0] <= DO;
            1: ABH[15:8] <= DO;
            2: DOH       <= DO;
        endcase
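
For anyone wondering what that block does on the 8-bit CPU: it latches the upper address bits and the upper data byte, so that a narrow bus can reach the wide SDRAM. A rough Python model, with the widths inferred from the slices (ABH taken as 16 bits, DOH as 8 bits) and a made-up class name:

```python
class SdramWindow:
    """Model of the 8-bit CPU's latches for the upper address and data bits."""

    def __init__(self):
        self.abh = 0   # 16-bit upper address latch
        self.doh = 0   # 8-bit upper data latch (assumed width)

    def write_ctrl(self, reg, value):
        """Mirror the case statement: reg is AB[1:0], value is DO."""
        value &= 0xFF
        if reg == 0:
            self.abh = (self.abh & 0xFF00) | value
        elif reg == 1:
            self.abh = (self.abh & 0x00FF) | (value << 8)
        elif reg == 2:
            self.doh = value

    def sdram_addr(self, ab):
        """{ ABH, AB[7:0] }: 24-bit SDRAM address."""
        return (self.abh << 8) | (ab & 0xFF)

    def sdram_wr_data(self, do):
        """{ DOH, DO }: 16-bit write data."""
        return (self.doh << 8) | (do & 0xFF)
```

A CPU with a 16-bit data bus and a wide address bus can drive both values directly, which is why the latches become unnecessary.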


By the way, I'm not too happy with the design of the sdramif module. It's a bit of a quick hack. It seems to work okay, but I would like to replace it with a cleaner design at some point.

Attachments:
File comment: Verilog sources for SDRAM controller
65org16dev.zip [13.92 KiB]

Author:  ElEctric_EyE [ Mon Apr 16, 2012 11:45 am ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

I'll have to use an older 100MHz version of .b core. Thanks so much for your contributions, it is much appreciated! I have a little troubleshooting left on my board, then I can dive in. I've put much time recently into understanding and expanding your original 6502 core, but I think I need a rest from that now.

Author:  Arlet [ Mon Apr 16, 2012 12:04 pm ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

ElEctric_EyE wrote:
I'll have to use an older 100MHz version of .b core.

Not necessarily. The code I have attached will work fine at 90 MHz or even lower. The only issues I can think of are: the generation of the 24 MHz and 54 MHz clocks for USB and video, which are both derived from the 100 MHz oscillator, and also the SDRAM refresh time.

The SDRAM refresh can be modified in the sdram.v file. Replace the number 768 with a smaller number to get more frequent refresh cycles.
Code:
        refresh_count <= refresh_count - 16'd768;


ETA: with 768 you get a nominal refresh cycle every 768 / 100 = 7.68 usecs. The SDRAM requires 8192 refresh cycles every 64 ms. I have a mechanism in the SDRAM controller where it can delay up to 32 refresh cycles when it's busy. So, worst case I get 8192 refresh cycles in (8192+32)*7.68 = 63.2 ms.
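
The same arithmetic can be redone for other clock speeds with a short Python helper. It uses only the numbers given above (8192 refresh cycles per 64 ms, up to 32 postponed cycles). The function names are made up, and it assumes the constant is simply the number of clock cycles between refreshes.

```python
REFRESH_CYCLES = 8192     # refresh cycles required per window
WINDOW_MS = 64.0          # length of that window
MAX_DELAYED = 32          # refreshes the controller may postpone when busy


def worst_case_ms(count, clock_mhz):
    """Time to complete all refresh cycles, allowing for postponed ones."""
    interval_us = count / clock_mhz
    return (REFRESH_CYCLES + MAX_DELAYED) * interval_us / 1000.0


def max_count(clock_mhz):
    """Largest refresh constant that still fits in the 64 ms window."""
    return int(WINDOW_MS * 1000.0 * clock_mhz / (REFRESH_CYCLES + MAX_DELAYED))
```

Under these assumptions, keeping 768 at 90 MHz would stretch the worst case to about 70 ms, so the constant would need to drop to roughly 700 or less.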

Author:  ElEctric_EyE [ Mon Apr 16, 2012 9:00 pm ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

Arlet wrote:
... Also, I moved the stack page into page zero. I didn't need so much zero page/stack, so now I have extra room for code.

This is true of the 65Org16.x too!
This is definitely a wise decision IMO. I will modify my testbench and cpu and see if there is any speed increase!

Author:  ElEctric_EyE [ Thu Jun 07, 2012 7:40 pm ]
Post subject:  Re: Video/sprites for the 65org16 Dev. board

Just to finish up my last comment, I didn't notice any speed increase when I put the stack and zero page in the same 64K block with the stack on top. 1Kx16 each...

For bitmap generation only, no sprites, using the SDRAM, can I use 4 modules as connected here?

Sorry, this is a poor quality picture. I'll make a larger one tonight.

EDIT: Added a larger pic just now, will take some time to update.

All times are UTC