brain:
I would also recommend the Verilog 2001 argument list syntax:
Code:
module fulladder #(
    parameter WIDTH = 8
)(
    input      [WIDTH - 1:0] a,
    input      [WIDTH - 1:0] b,
    input                    c_in,
    output reg [WIDTH - 1:0] result,
    output reg               c_out
);
    // Combinational logic: use a blocking assignment (=), not non-blocking (<=).
    always @(*)
        {c_out, result} = a + b + c_in;
endmodule
It provides a more self-contained definition of the module's inputs and outputs, and also allows parameterization of the module's port list.
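As a minimal sketch of what that parameterization buys you, the default WIDTH can be overridden at instantiation with a named parameter assignment. The wrapper name `adder5_top` and its port names are hypothetical, chosen only for illustration; the block assumes the `fulladder` module above is compiled alongside it.

```verilog
// Hypothetical wrapper; adder5_top and its port names are assumptions
// for illustration and are not part of the original design.
module adder5_top (
    input  [4:0] x,
    input  [4:0] y,
    input        carry_in,
    output [4:0] sum,
    output       carry_out
);
    // Override the default WIDTH = 8 with a named parameter assignment.
    fulladder #(
        .WIDTH(5)
    ) u_adder (
        .a      (x),
        .b      (y),
        .c_in   (carry_in),
        .result (sum),
        .c_out  (carry_out)
    );
endmodule
```

Because the port widths are declared in terms of WIDTH inside the header, nothing else in `fulladder` needs to change when the width does.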
I know that you recognize a CPLD is a fundamentally different device than an FPGA. It is specifically designed to support wide logic functions. Its input array, the AND array, typically handles the true and complement forms of at least 36 logic variables. This is a much larger set of combinatorial variables than the four to six that a modern FPGA LUT handles. The CPLD's input array is followed by a summing array, the OR array, where the various product terms (pterms) are combined. The width of the ORs in this array is typically not equal to the number of pterms each CPLD macrocell can generate, e.g. 54 pterms in some architectures. Thus, when the number of pterms needed to define a Sum-Of-Products (SOP) combinatorial equation exceeds that fundamental limit, the OR gates must be combined in some manner. Because the pterms are so wide (think high capacitance and therefore large delay), there is a performance penalty to be paid if the output of an OR is fed back into the AND array in order to reach another OR input to combine x pterms with y pterms.
Different CPLD architectures solve this issue in various ways, but the fact remains that CPLDs generally do not have the functionality of PLAs (Programmable Logic Arrays), where the width of the OR gates equals the number of AND terms. This feature of PLAs allows them to implement any SOP equation, but it comes at a significant penalty in overall speed; nothing good is ever free.
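To make the pterm idea concrete, here is a rough sketch of a single full-adder bit written out by hand in SOP form. The module name `sop_bit` is hypothetical, and this does not model any particular CPLD's arrays; it only shows how each AND group corresponds to one pterm and how the ORs play the role of the summing array.

```verilog
// Hypothetical module for illustration only; sop_bit is not part of the
// original design and does not model any specific CPLD architecture.
module sop_bit (
    input  a,
    input  b,
    input  c_in,
    output sum,
    output c_out
);
    // Each parenthesized AND of true/complement inputs is one product term.
    // The sum bit (a XOR b XOR c_in) needs 4 pterms in plain SOP form.
    assign sum = (~a & ~b &  c_in)
               | (~a &  b & ~c_in)
               | ( a & ~b & ~c_in)
               | ( a &  b &  c_in);

    // The carry out (majority function) needs 3 pterms.
    assign c_out = (a & b) | (a & c_in) | (b & c_in);
endmodule
```

Even one bit costs seven pterms when written this way, and every additional bit has to fold the carry of all lower bits into its own equation, which is where the growth comes from.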
For your project, the fundamental architecture of the CPLD is part of your problem. Most FPGAs include specialized circuits in each logic cell which are intended specifically to implement fast adders using specially designed ripple carry logic. I synthesized your module for an FPGA and it yielded 8 LUTs as the logic required to implement an 8-bit full adder. I also synthesized it for the case WIDTH=13, and the synthesizer indicated that 13 LUTs were required. Both of these results are what I would intuitively expect.
No CPLD architecture with which I am familiar provides the built-in adder support features found in most FPGA logic cells. Thus, the synthesizer is left with the task of generating the full adder and inter-bit carry logic directly in the AND-OR arrays. The general limitations I discussed above regarding the architecture will then be evident. As you increase the width, and with no load for the 8-bit component, the synthesizer will combine the two full adders into a single 13-bit full adder. The combinatorial SOP equations for such a component will use a large number of pterms.
I set your module up in ISE 10.1i SP3, targeting an XC9572-7PC84 device. With a width of 8, the number of macrocells used was 15/72 and the number of pterms used was 145/360. Increasing the width to 13 yielded 28/72 macrocells and 231/360 pterms. The number of macrocells required to fit your function is reasonable. There appears to be some optimization taking place in the 8-bit case, and some additional growth in the feedback requirements in the 13-bit case.
I think it is instructive to review the generated equations to see if things appear to be as would be expected if you'd generated the equations yourself. (Sometimes you learn something, and sometimes you catch specification errors.) For the 8-bit case, the tool reported the following equations for result[0] and result[7]:
Code:
assign result[0] = !(b[0] XOR (c_in && a[0]) || (!c_in && !a[0]));
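One way to sanity-check a reported equation like this is a small exhaustive testbench. The sketch below rewrites the tool's XOR as Verilog's `^` and assumes the XOR spans the whole OR expression, which is my reading of the grouping and not something the report states explicitly. The testbench name and signal names are hypothetical.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench; names are illustrative only.
module tb_result0;
    reg     a0, b0, c_in;
    integer i, errors;

    // Tool equation, with XOR written as ^ and ASSUMED to apply to the
    // entire OR expression (an XNOR of c_in and a0).
    wire reported = !(b0 ^ ((c_in && a0) || (!c_in && !a0)));

    // Reference sum bit of a one-bit full adder.
    wire expected = a0 ^ b0 ^ c_in;

    initial begin
        errors = 0;
        // Walk all 8 input combinations.
        for (i = 0; i < 8; i = i + 1) begin
            {a0, b0, c_in} = i[2:0];
            #1;
            if (reported !== expected)
                errors = errors + 1;
        end
        if (errors == 0)
            $display("PASS: reported equation matches a ^ b ^ c_in");
        else
            $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

Under that grouping the two expressions agree for all eight input combinations, since the inner OR is the XNOR of c_in and a[0].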
The equation for bit 0 above is what I would expect for a full adder. Note the use of the XOR operator. This is an optimization possible because the CPLD macrocell includes an XOR gate just for this type of condition. If I remember the architecture correctly, that XOR gate only has two inputs, so as the number of bits grows, some of the XOR operations required will have to be synthesized with more pterms and OR terms. The equation for bit 7 below is downright scary, and I can imagine its complexity will only increase as the full adder is extended to 13 bits. (I also looked at the equation for the carry out, and my recommendation to you will be to drop the carry out of the last stage of any cascaded modules. I might even go so far as to recommend that you not cascade an 8-bit and a 5-bit module, and just go for a single 13-bit full adder.)
Code:
assign result[7] = Madd_AUX_1_addsub0000_Mxor_Result[7]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<7>__xor0000_D
XOR (c_in && b[0] && b[1] && a[2] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (c_in && b[0] && a[1] && a[2] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (b[0] && b[1] && a[0] && a[2] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (b[0] && a[0] && a[1] && a[2] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (c_in && b[0] && b[1] && b[2] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (c_in && b[0] && b[2] && a[1] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (c_in && b[1] && a[0] && a[2] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (b[0] && b[1] && b[2] && a[0] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (b[0] && b[2] && a[0] && a[1] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (a[6] && !Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (a[5] && !Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (a[4] && !Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (b[1] && b[2] && a[1] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (b[1] && a[1] && a[2] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (c_in && b[1] && b[2] && a[0] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (c_in && b[2] && a[0] && a[1] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (c_in && a[0] && a[1] && a[2] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (a[3] && !Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D)
|| (b[2] && a[2] && Madd_AUX_1_addsub0000_Mxor_Result[3]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<3>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[4]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<4>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[5]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<5>__xor0000_D && Madd_AUX_1_addsub0000_Mxor_Result[6]__xor0000/Madd_AUX_1_addsub0000_Mxor_Result<6>__xor0000_D);
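Regarding the recommendation above to use a single 13-bit adder and drop the final carry out, here is a rough simulation sketch comparing an 8-bit plus 5-bit cascade against a single 13-bit instance. The testbench and instance names are hypothetical, the 8/5 split is taken from the discussion above, and `fulladder` is repeated here only so the sketch compiles on its own.

```verilog
`timescale 1ns/1ps
// Copy of the module from the top of this post, repeated so the sketch
// is self-contained.
module fulladder #(
    parameter WIDTH = 8
)(
    input      [WIDTH - 1:0] a,
    input      [WIDTH - 1:0] b,
    input                    c_in,
    output reg [WIDTH - 1:0] result,
    output reg               c_out
);
    always @(*)
        {c_out, result} = a + b + c_in;
endmodule

// Hypothetical testbench; names are illustrative only.
module tb_cascade_vs_single;
    reg  [12:0] a, b;
    reg         c_in;
    wire [12:0] sum_cascade, sum_single;
    wire        carry_lo;
    integer     i, errors;

    // Cascade: the low 8 bits feed their carry into a 5-bit upper stage.
    fulladder #(.WIDTH(8)) u_lo (
        .a(a[7:0]), .b(b[7:0]), .c_in(c_in),
        .result(sum_cascade[7:0]), .c_out(carry_lo));
    fulladder #(.WIDTH(5)) u_hi (
        .a(a[12:8]), .b(b[12:8]), .c_in(carry_lo),
        .result(sum_cascade[12:8]), .c_out());   // final carry out dropped

    // Single 13-bit adder, carry out left unconnected.
    fulladder #(.WIDTH(13)) u_one (
        .a(a), .b(b), .c_in(c_in),
        .result(sum_single), .c_out());

    initial begin
        errors = 0;
        // Random vectors; $random is truncated to each register's width.
        for (i = 0; i < 1000; i = i + 1) begin
            a = $random; b = $random; c_in = $random;
            #1;
            if (sum_cascade !== sum_single)
                errors = errors + 1;
        end
        if (errors == 0)
            $display("PASS: cascade and single adder agree");
        else
            $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

This only checks functional equivalence in simulation; whether the single instance actually fits with fewer pterms on a given CPLD is something only the fitter report can tell you.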