Posted: Sat Mar 24, 2012 6:07 am
by Arlet
If the 4M devices are really $140, are you sure you want to allow room for a whole board? And do you feel comfortable running 100MHz data over these long buses? If the various sizes are pin compatible, it seems that 1 socket should be plenty, and then you can decide the size based on budget.
Also, I was wondering... we have these boards now with big 16MB SDRAM, but no Flash memory to load it. Are you planning to put some Flash on another board? For a next revision, I'd think it would be good to replace the Xilinx configuration PROM chip with a big serial Flash that can store both the FPGA configuration bits and user data.
The FPGA we use has 32 BRAMs, that's quite a bit. If you use 8 for local memory, and 8 for a cache, there's still 16 left for special purposes. With a 16kB cache in front of it, the SDRAM should be fast enough for nearly all applications, especially if you can dedicate some critical code to the local memory. For applications where speed is still an issue, it's always possible to reprogram the FPGA.
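As a quick check of that budget, here is a back-of-envelope sketch. It assumes 2 kB of data per block RAM, which is only inferred from "8 BRAMs for a 16kB cache"; the names are made up for illustration.

```python
# Back-of-envelope BRAM budget. BRAM_BYTES is an assumption inferred from
# "8 BRAMs for a 16kB cache"; check the data sheet for the actual device.
BRAM_TOTAL = 32
BRAM_BYTES = 2 * 1024

budget = {"local memory": 8, "cache": 8}
budget["special purposes"] = BRAM_TOTAL - sum(budget.values())

for use, count in budget.items():
    print(f"{use:17s} {count:2d} BRAMs = {count * BRAM_BYTES // 1024:2d} kB")
```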
You also have to take into account that even fast external RAM is going to be slower than BRAM.
Posted: Sat Mar 24, 2012 11:44 am
by ElEctric_EyE
I thought I put the serial FLASH on your board! It's on the opposite side of the SDRAM. J2 goes to /CE. A jumper short on the right will enable it. If I forgot to put it on there, here's the part #'s for Digikey:
Code:
Qty  Price  Description         Part #
1    $1.24  8Mb SPI Flash       SST25VF080B-80-4I
10   $0.20  .050" jumper        S9014E-03-ND
10   $0.24  .050" jumper short  S9345-ND
And the constraints:
Code:
NET "SCLK" LOC = P21; //P_GCLK
NET "MOSI" LOC = P16; //P_GCLK
NET "MISO" LOC = P23; //P_GCLK
Posted: Sat Mar 24, 2012 12:00 pm
by Arlet
ElEctric_EyE wrote: I thought I put the serial FLASH on your board! It's on the opposite side of the SDRAM. J2 goes to /CE. A jumper short on the right will enable it. If I forgot to put it on there, here's the part #'s for Digikey:
Aha, no, that space is unpopulated on my board. I'll order the Flash, and try it out. Thanks!
Oops, sorry, I was looking at the wrong corner of the board. The Flash chip is in fact mounted. By the way, when I was looking to order the part, I noticed there's also a 32Mbit version in the same package.
Posted: Sat Mar 24, 2012 7:35 pm
by Arlet
Some more progress. The font is suffering somewhat from not being anti-aliased. It's not too bad in the picture, but up close to the screen some letters are a bit rough around the edges.
Anti-aliasing is a possibility, but would require more memory for the font. Right now, the bitmaps for all the letters fit in a single block RAM.
The rendering engine uses 4 memories:
- one to hold the text string
- one to describe the properties of each letter (width, height, x/y offset, and advance)
- one that holds a table of start pointers to each of the bitmaps.
- one to hold all the bitmaps.
To speed up the rendering, all these memories are accessed in parallel, so it would be hard to put all of that in external memory. Since the font bitmaps require the most memory, it would be an option to only put those in external memory, and keep the other ones as block RAMs. All the block RAMs are still available to the CPU, so the parts that aren't used can be used as normal memory.
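To make the roles of the four memories concrete, here is a rough software model. It is only a sketch: the two-glyph font, the field names and the 1-bit-per-entry bitmap layout are made up for illustration, and the lookups that run in parallel in the FPGA happen one after another here.

```python
from dataclasses import dataclass

@dataclass
class Glyph:
    width: int
    height: int
    x_off: int
    y_off: int    # rows from the top of the line to the glyph's first row
    advance: int  # how far the pen moves after this glyph

# The four memories (hypothetical contents).
text = "IL"                                              # 1: text string
props = {"I": Glyph(1, 3, 0, 0, 2),                      # 2: glyph properties
         "L": Glyph(2, 3, 0, 0, 3)}
start = {"I": 0, "L": 3}                                 # 3: start pointers
bitmaps = [1, 1, 1,                                      # 4: 'I', 1x3
           1, 0, 1, 0, 1, 1]                             #    'L', 2x3, row-major

def render(text, width, height):
    """Draw text into a width x height frame buffer of 0/1 pixels."""
    fb = [[0] * width for _ in range(height)]
    pen_x = 0
    for ch in text:
        g = props[ch]
        ptr = start[ch]
        for row in range(g.height):
            for col in range(g.width):
                if bitmaps[ptr + row * g.width + col]:
                    fb[g.y_off + row][pen_x + g.x_off + col] = 1
        pen_x += g.advance  # proportional spacing comes from the advance field
    return fb

for line in render(text, 5, 3):
    print("".join("#" if p else "." for p in line))
```

Only the `bitmaps` array grows with font size, which is why it is the natural candidate for external memory.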
Image no longer available: http://ladybug.xs4all.nl/arlet/fpga/proportional.png
Posted: Sat Mar 24, 2012 9:16 pm
by BigEd
Posted: Sat Mar 24, 2012 9:44 pm
by Arlet
I've seen the hqx results before, and they look impressive, but I'm not sure if it can be applied here. In my example, the bitmaps for the font are not scaled.
Also, for other applications, it's unlikely you'd want to scale up graphics for output on a TV, which already has a low resolution. It looks like it would be better suited for going the other way, displaying original low-resolution TV output on a high resolution monitor.
Posted: Sat Mar 24, 2012 9:52 pm
by BigEd
I was thinking you'd apply the 2x algorithm in order to over sample and then anti alias, without needing a higher resolution font. If that makes sense.
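A minimal sketch of that idea, with one big substitution: it uses the much simpler Scale2x (EPX) rules instead of hq2x, just to show the oversample-then-average flow. The function names and the staircase test glyph are made up, and edge pixels are clamped.

```python
def scale2x(img):
    """Scale2x (EPX) upscale of a 2D list of 0/1 pixels; edges are clamped."""
    h, w = len(img), len(img[0])

    def px(y, x):
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = [[0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            e = img[y][x]
            # b = above, d = left, f = right, hh = below
            b, d, f, hh = px(y - 1, x), px(y, x - 1), px(y, x + 1), px(y + 1, x)
            out[2 * y][2 * x] = d if d == b and b != f and d != hh else e
            out[2 * y][2 * x + 1] = f if b == f and b != d and f != hh else e
            out[2 * y + 1][2 * x] = d if d == hh and d != b and hh != f else e
            out[2 * y + 1][2 * x + 1] = f if hh == f and d != hh and b != f else e
    return out

def coverage(img2x):
    """Average each 2x2 block back to one pixel; result is a grey level 0..4."""
    return [[img2x[y][x] + img2x[y][x + 1] + img2x[y + 1][x] + img2x[y + 1][x + 1]
             for x in range(0, len(img2x[0]), 2)]
            for y in range(0, len(img2x), 2)]

glyph = [[1, 0, 0],
         [1, 1, 0],
         [1, 1, 1]]
for row in coverage(scale2x(glyph)):
    print(row)
```

The output has the same size as the original bitmap, but the pixels along the diagonal edge get intermediate grey levels instead of a hard step, so the font memory itself stays at 1 bit per pixel.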
Posted: Sat Mar 24, 2012 10:11 pm
by Arlet
I see... it could be useful for scaling in X direction too.
Because I'm not using interlaced output, the vertical resolution is quite low (240 for PAL). On the other hand, the horizontal resolution is fairly high (720). As a result, the pixels are about half as wide as they are high. For the demonstration, I used a 36-pixel font, but reduced the height by 50%. The alternative would be to use an 18-pixel font, but stretch the width by a factor of 2 with a good upscaler algorithm. It would have to be a modified version of the hqx algorithm, though.
Writing a hqx version in verilog does seem like a fairly big project, though :)

Posted: Sat Mar 24, 2012 11:58 pm
by ElEctric_EyE
Arlet wrote: ... By the way, when I was looking to order the part, I noticed there's also a 32Mbit version in the same package.
Must be from another company and probably not as fast as 80MHz. What part # were you looking at?
Posted: Sun Mar 25, 2012 6:14 am
by Arlet
Arlet wrote: ... By the way, when I was looking to order the part, I noticed there's also a 32Mbit version in the same package.
ElEctric_EyE wrote: Must be from another company and probably not as fast as 80MHz. What part # were you looking at?
Here's the one: SST25VF032B-80-4I-S2AF. I found it on the Farnell site.
Aha, I see the difference. The 32Mb version is only available in the 200 mil SOIC package, while you have used the 150 mil narrow SOIC package.
Posted: Sun Mar 25, 2012 6:29 am
by BigEd
Arlet wrote: I see... it could be useful for scaling in X direction too.
Writing a hqx version in verilog does seem like a fairly big project, though :)
Hmm ... Seems like Jonathon W. Donaldson did it last year and posted source (edit... No, no source, my mistake, sorry)
http://nesdev.parodius.com/bbs/viewtopi ... 2752#82752
https://rm-rfroot.net/nes_fpga/
But the discussion does include a big simplification of hq2x due to symmetry.
(not for a 2d variant of course)
Cheers Ed
Posted: Sun Mar 25, 2012 6:48 am
by Arlet
I was going to ask about the source just when I noticed you had edited your post.
I think I'll pass for right now. I would like to start using external memory for video, and if I can get that to work, there won't be a need for upscaling. In any case, the upscaling algorithm is going to require some block RAMs itself. If you just have one or two fonts, it's probably just as effective to use more RAM for the font, and keep them in higher resolution.
The NES-FPGA project looks quite impressive...
Posted: Sun Mar 25, 2012 8:14 am
by Arlet
A cool trick you can do with the rendering engine. By doing a carriage return without a line feed, you can go back to the beginning of the same line, and write stuff on top of the old text.
Notice the difference between the 2nd and 3rd line. In the 2nd line, the text was printed first, and the underscores written over it. In the 3rd line, I did it the other way around, so the text sits on top of the underscores.
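Here is a toy model of why the order matters. It assumes the engine only writes the set pixels of each glyph and leaves the rest untouched, which is a guess based on the behaviour described above; the 2x3 font and the colour letters are invented.

```python
WIDTH, HEIGHT = 4, 3

# Hypothetical 2x3 glyphs: 'j' reaches into the bottom row, '_' only uses it.
FONT = {"j": ["01", "01", "11"], "_": ["00", "00", "11"]}

def render(segments):
    """segments is a list of (text, colour); '\r' moves the pen to column 0."""
    fb = [["."] * WIDTH for _ in range(HEIGHT)]
    pen_x = 0
    for text, colour in segments:
        for ch in text:
            if ch == "\r":  # carriage return without line feed: same line
                pen_x = 0
                continue
            for y, row in enumerate(FONT[ch]):
                for x, bit in enumerate(row):
                    if bit == "1":  # only set pixels are written
                        fb[y][pen_x + x] = colour
            pen_x += len(FONT[ch][0])
    return ["".join(r) for r in fb]

text_first = render([("jj", "T"), ("\r__", "U")])   # underscores end up on top
under_first = render([("__", "U"), ("\rjj", "T")])  # text ends up on top
print("\n".join(text_first), "\n")
print("\n".join(under_first))
```

In the first case the bottom row is all underscore colour; in the second the descenders of the text overwrite it.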
Image no longer available: http://ladybug.xs4all.nl/arlet/fpga/underlined.jpg
Posted: Sun Mar 25, 2012 8:33 am
by BigEd
I understand why you'd defer the hqx idea, but why would you expect to need more BRAMs? I haven't looked very carefully, but doesn't it just inspect a few near neighbours? No need for a line buffer, if you can do enough lookups in real time.
Posted: Sun Mar 25, 2012 8:44 am
by Arlet
Well, you'd need neighbors from the lines above and below. That means you'd have to keep 3 scan lines in memory. Also, I believe the algorithm requires some interpolation tables. I don't know how big they are, but at the least they'd require another RAM.
Ah, I see what you mean. If you're just using it for fonts, you can look up the neighbor pixels in the font bitmaps.
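For the font-only case, the neighbour lookup could be as simple as the sketch below. It assumes each glyph bitmap is addressable as rows of 0/1 pixels and that anything outside the glyph counts as background; the function name and test glyph are made up.

```python
def neighbourhood(bitmap, x, y):
    """3x3 window around (x, y); pixels outside the glyph read as 0."""
    h, w = len(bitmap), len(bitmap[0])
    return [[bitmap[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0
             for nx in (x - 1, x, x + 1)]
            for ny in (y - 1, y, y + 1)]

glyph = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]
for row in neighbourhood(glyph, 0, 0):
    print(row)
```

No scan line buffers are involved: every neighbour comes straight from the glyph's own bitmap, at the cost of extra reads per output pixel.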