Pages

Saturday, November 4, 2017

Count the number of 1's in a Binary number - Verilog Implementation with Testbench

Suppose you have a binary number, how do you count the number of one's in it? There are more than one way to do it. We will see two ways to do it and compare their performances.

Let's take a 16 bit binary number. The output of our design will have a width of 5 bits, to include the maximum value of output, which is 16("10000").

For example,

Input = "1010_0010_1011_0010" =>  Output = "00111" ( 7 in decimal)
Input = "1110_1010_1001_1010" =>  Output = "01001" ( 9 in decimal)
Input = "0011_0110_1000_1011" =>  Output = "01000" ( 8 in decimal)

Design 1 - A simple Verilog code can be written to achieve this:

module num_ones_for(
    input [15:0] A,
    output reg [4:0] ones
    );

integer i;

always@(A)
begin
    ones = 0;  //initialize count variable.
    for(i=0;i<16;i=i+1)   //check for all the bits.
        if(A[i] == 1'b1)    //check if the bit is '1'
            ones = ones + 1;    //if its one, increment the count.
end

endmodule

The code isn't that big and the logic is easy to understand. But it's less efficient and uses much more resources than the customized design I will present next.

Another way to achieve our purpose, would be to add all the bits in our input. Think of it as a sequence of 16 one bit-adders. The zeros in the input vector will not change the sum, and effectively we get the sum as the number of ones in the vector.

***Design 2 -See the Verilog code below to get what I meant:

module num_ones_for(
    input [15:0] A,
    output reg [4:0] ones
    );

integer i;

always@(A)
begin
    ones = 0;  //initialize count variable.
    for(i=0;i<16;i=i+1)   //for all the bits.
        ones = ones + A[i]; //Add the bit to the count.
end

endmodule


Testbench(same for both the designs):

module tb;

    // Inputs
    reg [15:0] A;

    // Outputs
    wire [4:0] ones;

    // Instantiate the Unit Under Test (UUT)
    num_ones_for uut (
        .A(A), 
        .ones(ones)
    );

    initial begin
        A = 16'hFFFF;   #100;
        A = 16'hF56F;   #100;
        A = 16'h3FFF;   #100;
        A = 16'h0001;   #100;
        A = 16'hF10F;   #100;
        A = 16'h7822;   #100;
        A = 16'h7ABC;   #100;   
    end
      
endmodule

Well, design 2 looks much more simpler than design 1 and the synthesis results showed that its much more faster and uses less number of LUT's. See the table below to see how the performance have improved.

Comparison:-

Name of Design
Combination delay
Number of Slice LUT’s used
Design 1
5.632 ns
47
Design 2
2.597 ns
20
As you can see Design 2 is much better than Design 1.

12 comments:

  1. Awesome. The way u have compaed also.

    ReplyDelete
  2. Explaination was really good.
    Can you please tell me why is it necessary to count the number of 1s in any binary code and what are it's applications?

    ReplyDelete
  3. For 2nd one where is logic for counting 1
    U r just traversing through A[]

    ReplyDelete
    Replies
    1. This below line of code is basically incrementing the count when there is a bit set in A

      ones = ones + A[i]; //Add the bit to the count.

      Delete
  4. ones is a register or net type?

    ReplyDelete
  5. And also how to write code for zeroes counting?

    ReplyDelete
    Replies
    1. you can do this and write count_zeros = 16 - count_ones; (for 16 bit input). you can also try count_zeros = count_zeros + ~A[i];

      Delete
  6. Synthesizable RTL code should not depend on the initial value of registers, as hardware synthesis tools may not preserve the initial value during optimization and implementation hence you cannot make ones = 0;

    ReplyDelete
    Replies
    1. Here the initialization is done within an always statement. So whenever a new input is received at signal A, the signal "ones" will be initialized.

      Delete
    2. The tool can't infer the Hardware when driving reg to 0 (gnd) or 1 (vdd), (ex: for if-else statement, the tool infers a MUX) it is not a good practice. I agree that the second method posted here is relatively fast, but its not a good practice as far as I know. Let me know if I'm wrong, cheers

      Delete
    3. Lets look at two codes:

      module test_initial(
      input [1:0] A,
      output reg [1:0] ones
      );

      integer i;

      always@(A)
      begin
      ones = 0; //initialize count variable.
      ones = ones + A[0]; //Add the bit to the count.
      ones = ones + A[1]; //Add the bit to the count.
      end

      endmodule

      and the 2nd one with a slight modification. we will initialize "ones" to 1.

      always@(A)
      begin
      ones = 1; //initialize count variable.
      ones = ones + A[0]; //Add the bit to the count.
      ones = ones + A[1]; //Add the bit to the count.
      end

      Are you saying, the synthesized hardware will be the same in both cases?

      Delete
    4. What I'm saying is reg ones =0/1 can't be synthesized into hardware. We don't have gnd/vdd pins in the synthesized hardware.

      Delete