Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In VHDL ..... how to count leading zeros of vector?

Tags:

vhdl

fpga

I'm working in a VHDL project and I'm facing a problem to calculate the length of vector. I know there is length attribute of a vector but this not the length I'm looking for. For example, I have std_logic_vector

    E : std_logic_vector(7 downto 0);  

then

    E <= "00011010";

so, len = E'length = 8 but I'm not looking for this. I want to calculate len after discarding the left most zeros , so len = 5;

I know that I can use for loop by checking "0"s bits from left to right and stop if "1" bit occur. But that's not efficient, because I have 1024 or more of bits and that will slow my circuit. So is there is any method or algorithm to calculate the length in efficient way? Such as using combinational gates of log(n) level of gates, ( where n = number of bits ).

like image 841
Moha Blugrana Avatar asked May 13 '13 12:05

Moha Blugrana


2 Answers

What you do with your "bit counting" very similar to the logarithm (base 2).

This is commonly used in VHDL to figure out how many bits are required to represent a signal. For example if you want to store up to N elements in RAM, the number of bits required for addressing that RAM is ceil(log2(N)). For this I use:

function log2ceil(m:natural) return natural is
begin -- note: for log(0) we return 0
    for n in 0 to integer'high loop
        if 2**n >= m then
            return n;
        end if;
    end loop;
end function log2ceil;

Usually, you want to do this at synthesis time with constants, and speed is no concern. But you can also generate FPGA logic, if that's really what you want.

As others have mentioned, a "for" loop in VHDL is just used to generate a lookup table, which may be slow due to long signal paths, but still only takes a single clock. What can happen is that your maximum clock frequency goes down. Usually this is only a problem if you operate on vectors larger than 64bit (you mentioned 1024 bits) and clocks faster than 100MHz. Maybe the synthesizer already told you that this is your problem, otherwise I suggest you try first.

Then you have to split up the operation over multiple clocks, and store some intermediate result into a FF. (I would upfront forget about trying to outsmart the synthesizer by rearranging your code. A lookup-table is a table. Why should it matter how you generate the values in this table? But make sure you tell the synthesizer about "don't care" values if you have them.)

If speed is your concern, use the first clock to check all 16bit blocks in parallel (independent of each other), and then use a second clock cycle to combine the results of all 16bit blocks into a single result. If the amount of FPGA logic is your concern, implement a state machine that checks a single 16bit block at every clock cycle.

But be careful that you don't re-invent the CPU while doing that.

like image 74
11 revs Avatar answered Oct 05 '22 23:10

11 revs


The problem with using a loop is that when you synthesize you might get a very long chain of logic.

Another way to look at your problem is to find the index of the most significant set bit. To do this you can use a priority encoder. The nice thing about this is you can make a large priority encoder by using smaller priority encoders in a tree structure, so the delay is O(log N) instead of O(N).

Here is a 4 bit priority encoder: http://en.wikibooks.org/wiki/VHDL_for_FPGA_Design/Priority_Encoder You can make a 16 bit priority encoder using 5 of these blocks, then a 256 bit encoder from five 16 bit encoders, etc.

But since you have so many bits it is going to be fairly huge.

like image 42
Bull Avatar answered Oct 05 '22 23:10

Bull