UPDATE: A better synthesizable matrix multiplier is available here.
Here is the Verilog code for a simple matrix multiplier. The input matrices are fixed at 2 by 2, so the output matrix is also 2 by 2. Every matrix element is 8 bits wide, including the elements of the result, so each output entry keeps only the low 8 bits of its sum of products.
Verilog (as opposed to SystemVerilog) does not allow arrays as input or output ports, so I have flattened each matrix into a one-dimensional 32-bit vector at the ports. Inside the module, 2 by 2 arrays of 8-bit elements (3D, if you count the bit width as a dimension) serve as temporary variables; they are loaded from the inputs at the beginning of the always block.
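To make the flattened layout concrete, here is a minimal sketch of how one element can be picked out of the 32-bit vector. The module `mat_elem` and its port names are hypothetical and not part of the design in this post; the sketch only assumes the same packing order as the concatenations used in the multiplier, with element (0,0) in the most significant byte.

```verilog
// Hypothetical helper, not part of the original design:
// selects element (row, col) from a flattened 2x2 matrix of 8-bit values.
module mat_elem(M, row, col, elem);
    input  [31:0] M;     // flattened matrix, same packing as ports A and B
    input         row;   // row index, 0 or 1
    input         col;   // column index, 0 or 1
    output [7:0]  elem;  // selected 8-bit element

    // Packing order: (0,0) -> M[31:24], (0,1) -> M[23:16],
    //                (1,0) -> M[15:8],  (1,1) -> M[7:0].
    // The indexed part-select "-: 8" takes 8 bits downward from the base bit.
    assign elem = M[31 - 8*(2*row + col) -: 8];
endmodule
```

The indexed part-select is a Verilog-2001 feature, so this sketch assumes a tool that accepts that revision of the language.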
The matrix multiplier is also synthesisable. When synthesised for a Virtex-4 FPGA using Xilinx XST, the maximum combinational path delay was 9 ns.
Matrix multiplier:
//Module for calculating Res = A*B
//Where A, B and Res are 2 by 2 matrices.
module Mat_mult(A,B,Res);
//input and output ports.
//Each port is 32 bits wide: 2*2 = 4 elements, each of which is 8 bits wide.
input [31:0] A;
input [31:0] B;
output [31:0] Res;
//internal variables
reg [31:0] Res;
reg [7:0] A1 [0:1][0:1];
reg [7:0] B1 [0:1][0:1];
reg [7:0] Res1 [0:1][0:1];
integer i,j,k;
always@ (A or B)
begin
//Initialize the matrices-convert 1 D to 3D arrays
{A1[0][0],A1[0][1],A1[1][0],A1[1][1]} = A;
{B1[0][0],B1[0][1],B1[1][0],B1[1][1]} = B;
i = 0;
j = 0;
k = 0;
{Res1[0][0],Res1[0][1],Res1[1][0],Res1[1][1]} = 32'd0; //initialize to zeros.
//Matrix multiplication
for(i=0;i < 2;i=i+1)
for(j=0;j < 2;j=j+1)
for(k=0;k < 2;k=k+1)
Res1[i][j] = Res1[i][j] + (A1[i][k] * B1[k][j]);
//final output assignment - 3D array to 1D array conversion.
Res = {Res1[0][0],Res1[0][1],Res1[1][0],Res1[1][1]};
end
endmodule
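Because `Res1` is only 8 bits wide, the module above wraps each result modulo 256 whenever a sum of products exceeds 255. If full-precision results are needed, one possible variation is sketched below. The module name `Mat_mult_wide` and the 68-bit output port are assumptions made for illustration, not the author's design. Each product of two 8-bit values needs 16 bits, and the sum of two such products needs 17 bits, so every result element is widened to 17 bits.

```verilog
// Hypothetical variant, not the module from this post:
// result elements are widened to 17 bits so no product is truncated.
module Mat_mult_wide(A,B,Res);
    input  [31:0] A;    // 4 elements x 8 bits
    input  [31:0] B;    // 4 elements x 8 bits
    output [67:0] Res;  // 4 elements x 17 bits
    reg    [67:0] Res;
    reg [7:0]  A1   [0:1][0:1];
    reg [7:0]  B1   [0:1][0:1];
    reg [16:0] Res1 [0:1][0:1];
    integer i,j,k;

    always @(A or B)
    begin
        // Unpack the flattened inputs, element (0,0) in the top byte.
        {A1[0][0],A1[0][1],A1[1][0],A1[1][1]} = A;
        {B1[0][0],B1[0][1],B1[1][0],B1[1][1]} = B;
        {Res1[0][0],Res1[0][1],Res1[1][0],Res1[1][1]} = 68'd0;
        for(i=0;i < 2;i=i+1)
            for(j=0;j < 2;j=j+1)
                for(k=0;k < 2;k=k+1)
                    // The 17-bit left-hand side sets the expression width,
                    // so the multiply and add are evaluated at 17 bits.
                    Res1[i][j] = Res1[i][j] + (A1[i][k] * B1[k][j]);
        // Repack the 17-bit results into the flattened output.
        Res = {Res1[0][0],Res1[0][1],Res1[1][0],Res1[1][1]};
    end
endmodule
```

The largest possible entry is 2 * 255 * 255 = 130050, which fits in 17 bits. The wider datapath will likely change the synthesis results, so the 9 ns figure quoted earlier should not be assumed to carry over.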
Testbench Code:
module tb;
// Inputs
reg [31:0] A;
reg [31:0] B;
// Outputs
wire [31:0] Res;
// Instantiate the Unit Under Test (UUT)
Mat_mult uut (
.A(A),
.B(B),
.Res(Res)
);
initial begin
// Apply Inputs
A = 0; B = 0; #100;
A = {8'd1,8'd2,8'd3,8'd4};
B = {8'd5,8'd6,8'd7,8'd8};
end
endmodule
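The testbench above only applies stimulus, so the result has to be read off the waveform. As a rough sketch, a self-checking version could compare the output against a hand-computed value. The module name `tb_check` and the extra 100 time units of settling delay are assumptions. For A = [1 2; 3 4] and B = [5 6; 7 8], the product is [1*5+2*7, 1*6+2*8; 3*5+4*7, 3*6+4*8] = [19 22; 43 50].

```verilog
// Hypothetical self-checking testbench; the name tb_check is an assumption.
module tb_check;
    // Inputs
    reg [31:0] A;
    reg [31:0] B;
    // Outputs
    wire [31:0] Res;

    // Instantiate the Unit Under Test (UUT)
    Mat_mult uut (
        .A(A),
        .B(B),
        .Res(Res)
    );

    initial begin
        A = 0; B = 0; #100;
        // A = [1 2; 3 4], B = [5 6; 7 8]
        A = {8'd1,8'd2,8'd3,8'd4};
        B = {8'd5,8'd6,8'd7,8'd8};
        #100; // let the combinational output settle
        // Expected product: [19 22; 43 50], packed in the same order as the inputs.
        if (Res === {8'd19,8'd22,8'd43,8'd50})
            $display("PASS: Res = %0d %0d %0d %0d",
                     Res[31:24], Res[23:16], Res[15:8], Res[7:0]);
        else
            $display("FAIL: Res = %h", Res);
        $finish;
    end
endmodule
```

All four expected entries are below 256, so the 8-bit result elements do not overflow for this stimulus.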
Simulation waveform:
The code was simulated using Xilinx ISE 13.1. The following waveform verifies that the design is working correctly.