enso started me down this path, so I thought I'd post the results of my investigations.
First, it is definitely possible to use Xilinx's Data2MEM tool to change the contents of one or more block RAMs.
The key to getting Data2MEM to do this is to define the memory space and to correctly associate the netlist names of the block RAMs with their locations in the routed design.
This post (the one at the end of the chain from Walter Dvorak) put me on my way. The example in the Data2MEM User's Guide was good, but the issue was finding the locations of the block RAMs that would not impact performance too much. The syntax of the Data2MEM BMM (Block RAM Memory Map) file is pretty straightforward and consistent with other Xilinx tools. The post at the link gave a significant clue on how to determine the post-PAR locations of the block RAMs which will contain the program image. Walter Dvorak indicates that if the BMM file provided to the tools does not lock (LOC =) or place (PLACE =) the memory blocks, but simply defines the address range, the partitioning (bus layout), and the type, then BitGen will generate a new BMM with the _bd suffix appended to the filename you gave. (The file will be written by BitGen in the same directory that contains your BMM file.)
The following is my BMM file (for my M16C5x PIC16C5x-compatible soft-core microcomputer project), which is part of the project like the UCF (User Constraints File):
Code:
////////////////////////////////////////////////////////////////////////////////
//
// Block RAM Memory Map file: M16C5x Block RAM (4096 x 12) Program Memory
//
////////////////////////////////////////////////////////////////////////////////
ADDRESS_SPACE PROM RAMB16 [0x00000000:0x000017FF]
BUS_BLOCK
Mram_PROM1 [ 3:0] LOC = X0Y3;
Mram_PROM2 [ 7:4] LOC = X0Y4;
Mram_PROM3 [11:8] LOC = X0Y5;
END_BUS_BLOCK;
END_ADDRESS_SPACE;
The keywords ADDRESS_SPACE and END_ADDRESS_SPACE; define a range of addresses that will be loaded with user-provided data if the "address" of the input data fits within the range. Following ADDRESS_SPACE, a name for the address range is required. It is simply a name, or tag, that has no relation to the netlist. In this case, I named the address range PROM because that's the name of the ROM/RAM variable in my source file. Following the name is a memory type. This is the type of memory that is used. In other words, it's the name given by Xilinx to the various block RAMs they have in their products. Spartan II and Virtex family parts implement 4kb block RAMs, which Xilinx refers to as RAMB4. The Spartan 3/3E/3A/3AN and Spartan 6 implement 18kb block RAMs. Thus, RAMB18 would indicate to Data2MEM that you will be using the parity and the data lines, and RAMB16 indicates that the parity lines will not be used. To create a 4096 x 12 PROM for my PIC16C5x-compatible processor core, I need three block RAMs each organized as 4096 x 4. So I specified the memory type of the PROM address space as RAMB16 instead of RAMB18. (NOTE: in my designs I infer ROMs/RAMs. I use the recommended coding styles found in the "Light Bulb" tool of ISE.)
Following the memory type declaration you have to define the address range (in bytes) represented by your block RAM memory. In my case, the three block RAMs actually provide 6kB of storage. I initially made an error in specifying my memory array as having an address range [0x0000_0000:0x0000_0FFF], or 4096 12-bit words. When Translate read my BMM file, it reported that it expected an address range of 0x1800 (6kB) instead of the 0x1000 (4kW) I had specified. Thus, I had to set the address range of the PROM address space to [0x0000_0000:0x0000_17FF].
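The byte count that Translate asked for can be checked with a little arithmetic. The following is a minimal Python sketch, not anything from the Xilinx tools; the function name and its parameters are made up for illustration. It assumes the range is simply the total number of data bits in the array divided by 8.

```python
def bmm_byte_range(depth_words, lanes, lane_bits):
    """Return (first, last) byte address covered by a block RAM array.

    The size is counted in bytes of block RAM data storage,
    not in processor words (hypothetical helper, for illustration).
    """
    total_bits = depth_words * lanes * lane_bits
    total_bytes = total_bits // 8
    return 0, total_bytes - 1


# 4096 words deep, three block RAMs, 4 bits from each one
first, last = bmm_byte_range(depth_words=4096, lanes=3, lane_bits=4)
print(f"[0x{first:08X}:0x{last:08X}]")  # prints [0x00000000:0x000017FF]
```

The 4096 x 12 array is 49152 bits, or 6144 (0x1800) bytes, which is why the upper bound is 0x17FF and not 0x0FFF.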
Inside the ADDRESS_SPACE block, you need to specify the bit/byte lanes and the distribution/arrangement of the block RAMs. The example in Appendix A of the Data2MEM User's Guide shows a 2k x 64 arrangement. In that example, they are targeting RAMB4 block RAMs which support a 512 x 8 organization. With the 8 byte lanes required, 8 RAMB4 block RAMs are specified in each of four BUS_BLOCKs. Each BUS_BLOCK represents 512 x 64, for a full range of 4kB per BUS_BLOCK, or 16kB for the entire address space, which consists of 32 (512 x 8) block RAMs.
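The numbers for the User's Guide example can be reproduced the same way. This is just the arithmetic from the paragraph above written out in Python; the constant names are mine.

```python
BRAM_DEPTH = 512      # words per RAMB4 in the 512 x 8 organization
BRAM_WIDTH = 8        # bits per word
LANES_PER_BLOCK = 8   # eight byte lanes make up the 64-bit bus
BUS_BLOCKS = 4        # four BUS_BLOCKs stacked to reach 2k deep

bytes_per_bus_block = BRAM_DEPTH * BRAM_WIDTH * LANES_PER_BLOCK // 8
total_bytes = bytes_per_bus_block * BUS_BLOCKS
total_brams = LANES_PER_BLOCK * BUS_BLOCKS
total_depth = BRAM_DEPTH * BUS_BLOCKS

print(bytes_per_bus_block)       # 4096  (4kB per BUS_BLOCK)
print(total_bytes)               # 16384 (16kB address space)
print(total_brams, total_depth)  # 32 2048
```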
In the case of my little core, the arrangement I want is 4096 x 12. Therefore, I assigned three 4-bit bus lanes. To do this I need to know the correct netlist name of each of the block RAMs automatically inferred by the synthesizer. My solution to this problem was twofold. First, I expected the Xilinx synthesizer to synthesize a multi-bit component and assemble its parts into aggregate/composite components LSB/lsb first. Therefore, I expected that the synthesizer would assign the LSN (least significant nibble), i.e. bits 3:0, to the first, or lowest numbered, block RAM.
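To make the lane assignment concrete, here is a small Python sketch of how one 12-bit program word is spread over the three 4-bit lanes. The lane table mirrors the bit ranges in my BMM file; the function name is hypothetical.

```python
# (msb, lsb) handled by each block RAM, least significant nibble first
LANES = [(3, 0), (7, 4), (11, 8)]


def split_into_lanes(word, lanes=LANES):
    """Return the slice of `word` that each lane's block RAM stores."""
    out = []
    for msb, lsb in lanes:
        width = msb - lsb + 1
        out.append((word >> lsb) & ((1 << width) - 1))
    return out


print([f"{n:X}" for n in split_into_lanes(0xCFF)])  # ['F', 'F', 'C']
```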
Second, rather than using the floorplanner to manually assign the location of the three block RAMs, I simply ran the synthesizer and PAR tools (without the BMM included/added to the project) and then looked in the routed FPGA (with the FPGA Editor) for the three block RAMs that I knew would be used for implementing my program memory array. I then took the assigned names and pasted them into the BUS_BLOCK section of my BMM file, assigned the three 4-bit bit lanes, and reran the tools with the updated BMM added to the project to generate the my_BMM_bd.bmm file that BitGen outputs when my BMM is provided without LOC constraints for the block RAMs.
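If you would rather generate the unconstrained BMM than type it, something like the following Python sketch would do. It assumes that an unconstrained entry is simply the line from my BMM file with the LOC part dropped (name, bit range, semicolon); the function name and arguments are hypothetical.

```python
def make_bmm(name, mem_type, last_byte, lanes):
    """Build BMM text with no LOC/PLACE constraints.

    lanes is a list of (netlist_name, msb, lsb) tuples.
    Assumption: an unconstrained entry is 'name [msb:lsb];'.
    """
    lines = [
        f"ADDRESS_SPACE {name} {mem_type} [0x00000000:0x{last_byte:08X}]",
        "    BUS_BLOCK",
    ]
    for inst, msb, lsb in lanes:
        lines.append(f"        {inst} [{msb}:{lsb}];")
    lines += ["    END_BUS_BLOCK;", "END_ADDRESS_SPACE;"]
    return "\n".join(lines) + "\n"


text = make_bmm("PROM", "RAMB16", 0x17FF,
                [("Mram_PROM1", 3, 0), ("Mram_PROM2", 7, 4), ("Mram_PROM3", 11, 8)])
print(text, end="")
```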
In addition to using Project - New Source, Project - Add Source, or Project - Add Copy of Source to add the BMM file to the project, a modification is required to the process parameters of the netlist translator, Translate. So after adding the BMM file to the project, Translate's "Other NgdBuild Command Line Options" must be set to access the BMM file. In ISE, open the process properties pop-up window and select the Translate Properties option. At the bottom of the Translate properties window/pane is the "Other Ngdbuild Command Line Options" property. By default it is blank. Set the property to something like "-bm myBMM<.bmm>". (The file extension is assumed to be bmm, so it does not have to be included in the filename unless it's a BMM file which uses another file extension.)
Generate the Mapped, Placed, and Routed design, and then run BitGen to generate the configuration bitstream file. If the BMM file provided does not include LOC or PLACE constraints, BitGen will find and output the locations of the placed block RAMs in the my_BMM_bd.bmm file. I copied those placements into my BMM file and converted them to LOC = constraints instead of the PLACE = constraints that BitGen output.
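The conversion is a one-line search and replace. As a minimal Python sketch (the function name is mine, and the pattern is an assumption: it accepts either PLACE = or PLACED = so that it works whichever spelling your _bd file uses):

```python
import re

# Assumption: placements appear as "PLACE = XnYn" or "PLACED = XnYn"
_PLACEMENT = re.compile(r"\bPLACED?\s*=")


def place_to_loc(bmm_text):
    """Rewrite BitGen's placement annotations as LOC constraints."""
    return _PLACEMENT.sub("LOC =", bmm_text)


sample = "Mram_PROM1 [3:0] PLACE = X0Y3;"
print(place_to_loc(sample))  # Mram_PROM1 [3:0] LOC = X0Y3;
```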
There is likely a simpler way to get the netlist names for the block RAMs, but the above process was successfully applied. The Advanced HDL Analysis section of the Synthesis report also provides the names of the synthesized block RAMs. What it does not do is provide the name of the various block RAMs in a synthesized component consisting of multiple block RAMs. But armed with that information, you can find (using FPGA Editor) the netlist names of the various block RAMs in a multi-RAMBx component.
With the BMM file in its final form, the next step is to generate a data file that Data2MEM can use to modify the FPGA configuration image. Data2MEM accepts two file types for the data file: (1) ELF - Executable and Linkable Format; and (2) MEM. In the case of the MicroBlaze soft-core processor, the Xilinx tools emit ELF files. In my case, neither the Kingswood assembler that I use for my M65C02 soft-core, nor the MPLAB assembler that I use for my M16C5x soft-core emit ELF files. Instead, both emit various (EPROM programming) output file formats. I use the binary output format with the Kingswood A65 assembler, and Intel Hex output format with the MPLAB PIC assembler.
Thus, to patch the bitstream file with Data2MEM I needed to create a data file compatible with the MEM file format. The MEM file format is very simple. It is essentially a file containing ASCII Hex addresses and data. Addresses are preceded by an @ symbol, and are separated from the data by a space or a newline. Multiple data elements may be placed on a single line if they are separated by spaces. Their addresses are assumed to be sequential until another @address is provided.
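The format is simple enough that a few lines of Python can read it. The sketch below only covers what is described above (@address markers and whitespace-separated hex data); it is not a full implementation of whatever Data2MEM accepts, and the function name is hypothetical. It groups the data under the @address that precedes it and makes no assumption about how far the address advances per data element.

```python
def parse_mem(text):
    """Return a list of (start_address, [data tokens]) segments."""
    segments = []
    for token in text.split():
        if token.startswith("@"):
            # A new @address opens a new run of sequential data
            segments.append((int(token[1:], 16), []))
        elif segments:
            segments[-1][1].append(token)
        else:
            raise ValueError("data found before the first @address")
    return segments


print(parse_mem("@0000 FFC 500\n600 @0010\nE0C"))
# [(0, ['FFC', '500', '600']), (16, ['E0C'])]
```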
Both the SMRTool microprogram support tool and the Bin2Txt utility that I have released directly generate MEM-like files that I simply read with the Verilog $readmemh() system function to initialize all memories in any of my designs. These files, generally known as memory initialization files, have an even simpler format than MEM files. They are simply either ASCII Hex (0..9, A..F) or ASCII Binary (1, 0) files. Thus, I decided to use the opcode output file option from MPLAB instead of the Intel Hex file to implement the required MEM file. I figured that the order of the ASCII hex nibbles (characters) would be the same as that of the Verilog memory initialization file. That assumption was completely wrong.
In fact, the order of the nibbles in each data word expected by Data2MEM is completely reversed. The following is the initial few lines from the MEM file I used:
Code:
@0000
FFC
500
600
E0C
A20
700
80C
F20
FE2
80A
And the following code fragment is that of the memory initialization file that Verilog's $readmemh() system function loads to initialize the same block RAMs during synthesis:
Code:
CFF
005
006
C0E
02A
007
C08
02F
2EF
A08
The two notable differences are the @0000 at the start of the MEM file, and the reversal of all the nibbles.
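The same conversion can be scripted instead of done by hand in an editor. The following is a minimal Python sketch, assuming the memory initialization file holds nothing but whitespace-separated hex words (no comments, no @address lines); the function name is mine.

```python
def readmemh_to_mem(init_text, start_address=0):
    """Convert $readmemh-style hex words into MEM text for my layout.

    Each word's nibbles are reversed (CFF -> FFC) and a single
    @address line is placed at the top of the output.
    """
    words = init_text.split()
    lines = [f"@{start_address:04X}"]
    lines += [w[::-1] for w in words]
    return "\n".join(lines) + "\n"


init = "CFF\n005\n006\nC0E\n"
print(readmemh_to_mem(init), end="")
# @0000
# FFC
# 500
# 600
# E0C
```

The nibble reversal matches what I saw with my three 4-bit lanes; I have not checked whether other bus layouts need the same treatment.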
Thankfully, UltraEdit has a column mode, and with a few simple keystrokes it's easy to take the first (leftmost) column (normally interpreted as the most significant nibble by everyone but Data2MEM) and swap it with the last (rightmost) column. With this contortion made, I changed the baud rate constant and used Data2MEM to patch the bitstream by running Data2MEM from the command line:
Code:
<path>data2mem -bm my.bmm -bd code.mem -bt my<.bit> -o b new<.bit>
Downloading the modified bitstream into the FPGA resulted in execution of the test program as expected, with the changed baud rate in effect.
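If the patch step is part of a build script, the command above can be wrapped in a few lines of Python. This is only a sketch: the function names are hypothetical, the options are exactly the ones from my command line, and it assumes data2mem is on the PATH when dry_run is turned off.

```python
import subprocess


def data2mem_command(bmm, mem, bit_in, bit_out, exe="data2mem"):
    """Build the argument list for the command line shown above."""
    return [exe, "-bm", bmm, "-bd", mem, "-bt", bit_in, "-o", "b", bit_out]


def patch_bitstream(bmm, mem, bit_in, bit_out, dry_run=True):
    """Return the command; run it only when dry_run is False."""
    cmd = data2mem_command(bmm, mem, bit_in, bit_out)
    if not dry_run:
        # Raises CalledProcessError if data2mem exits with an error
        subprocess.run(cmd, check=True)
    return cmd


print(" ".join(patch_bitstream("my.bmm", "code.mem", "my.bit", "new.bit")))
# data2mem -bm my.bmm -bd code.mem -bt my.bit -o b new.bit
```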
Rather than running Data2MEM from a command line, it is possible to set an option in BitGen, and get BitGen to automatically run Data2MEM prior to the generation of its output bitstream file. Like setting the option for Translate to use a BMM file, you can set an option in BitGen to update block RAMs using an ELF/MEM file. (Note: most of the discussions/examples in the Data2MEM User's Guide are given as if an ELF file will always be used. I ignored that and simply specified the data files with the .mem file extension. As the name of the utility implies, MEM files were likely the first file type supported by the tool.) To have BitGen patch one or more block RAMs in the bitstream file with an update, open the process options window for BitGen ("Generate Programming File"), and set the "Other Bitgen Command Line Options" (in the General Options category) to something like: -bd myMEM.mem.
Apparently BitGen uses the BMM file specified in Translate to automatically call Data2MEM behind the scenes. Using Data2MEM in either manner, the patching of a design's block RAMs prior to downloading is a great time saver. Furthermore, it avoids the need to re-synthesize the design and re-run PAR. Thus, if the changes are all restricted to the contents of block RAMs, the potential failure of PAR to meet timing can be avoided on large designs that require several MPPR/SmartXplorer passes to close timing.
(Note: with an ELF file, it may be possible to do the operation in real-time as the bitstream file is being loaded. But it appears that there is a caveat to this option: you have to be connected to an FPGA with a hard-core that shows up as part of the JTAG chain. Since that's not the case here, I have not pursued this option any further.)
Summary:
(1) Determine the netlist names of the block RAMs to be patched.
(2) Create a BMM file and attach it to the project file list. (LOC/PLACE constraints optional at this time)
(3) Set Translate's "Other Ngdbuild Command Line Options" to refer to the BMM file: -bm myBMM<.bmm>
(4) Run Synthesis, MAP and PAR, and BitGen. Examine BitGen's myBMM_bd.bmm file, and constrain the block RAMs defined in myBMM.bmm to match.
(5) Create MEM file to patch into bitstream file.
(6) Set BitGen's "Other Bitgen Command Line Options" to enable real-time patching: -bd myMEM.mem
(7) Re-run MAP, PAR, and BitGen, and download patched bitstream file to target. (At this point it's possible to create a PROM file for the FPGA.)
I believe the preceding discussions and the summary process checklist above are accurate. Hope this is useful to those of us using embedded soft-cores in FPGAs. I expect that I will save a considerable amount of time in working with my M65C02/M16C5x development board as a result of this discovery.