GitHub - josyb/recurse: Recursion in MyHDL is easy!

#Recursion in MyHDL#

Today, June 15th, 2015, I was going through a heap of papers, as we recently moved offices from Leuven (home of Stella Artois) to Hoegaarden (home of the famous 'witbier'), when I stumbled on a printout of this Design Advantage from Comit Systems Inc. titled: Recursive VHDL structures in FPGA synthesis by Rangarajan Sundaravaradan.
So I indulged myself in my favourite pastime: applying recursive programming in hardware.
Recursion is easy: I learned the rules from a book that I, some 30 years ago, recovered from the garbage bin at a previous employer. The book is LISP, A Gentle Introduction to Symbolic Computation by David S. Touretzky (ISBN 0-06-046657-X). Chapter 8, pages 191 to 213, is devoted to recursion. On page 200 the dragon (every decent book on programming has one) enumerates the three rules of recursion:

1. Find out how to take just one step
2. Break each journey down into one step plus a smaller journey
3. Know when to stop

Let's call this Rule Set A.

I rephrased those rules, except the last one:

1. Find out how to divide the task into two smaller tasks
2. Keep on dividing those tasks into smaller tasks
3. But know when to stop

Let's call this Rule Set B.

As I said above, recursion is easy; writing a recursive VHDL function to, e.g., count the bits in a std_logic_vector is a relatively easy task, especially if you stick to Rule Set A:

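  function count_bits(s : std_logic_vector) return natural is
  begin
    if (s'length = 1) then
      return to_integer(unsigned(s));
    else
      -- numeric_std has no to_integer for a single std_logic, so take a one-bit slice;
      -- add the lowest bit, then recurse on the remaining upper bits
      return to_integer(unsigned(s(s'low downto s'low))) + count_bits(s(s'high downto s'low + 1));
    end if;
  end function;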

But the result doesn't differ much from the standard for-loop approach:

  function count_bits(s : std_logic_vector) return natural is
    variable sum : integer;
  begin
    sum := 0;
    for i in 0 to (s'high - s'low) loop
      -- one-bit slice, as numeric_std has no to_integer for a single std_logic
      sum := sum + to_integer(unsigned(s(s'low + i downto s'low + i)));
    end loop;
    return sum;
  end function;

When synthesised both will end up in a staircase: one might be going up, the other might be going down, but either eventually delivers the sum.

For Loop

Recursive Function

Now, using Rule Set B, we get something totally different:

  function count_bits(s : std_logic_vector) return natural is
  begin
    if (s'length = 1) then
      return to_integer(unsigned(s));
    else
      return count_bits(s(s'high downto s'low + s'length / 2))
           + count_bits(s(s'low + s'length / 2 - 1 downto s'low));
    end if;
  end function;

We get a bottom-up binary tree! One important advantage of this last approach is that it has fewer levels of logic: in this case 3 instead of 7. When the clock frequency in the FPGA gets high (and when doesn't it?) the last approach will be a lot faster. Imagine that you are not adding up the one bits in a vector but the elements of an array of std_logic_vectors.
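
As a rough illustration, assuming that every addition counts as one level of logic, the following Python sketch mirrors the two recursive VHDL functions on a plain list of bits and also reports the depth of the resulting adder structure. The function names are made up for this example and appear nowhere in the repository.

```python
def count_bits_linear(bits):
    """Rule Set A: one step plus a smaller journey. Returns (sum, adder levels)."""
    if len(bits) == 1:
        return bits[0], 0
    rest_sum, rest_levels = count_bits_linear(bits[1:])
    # the new adder sits behind the whole remaining chain
    return bits[0] + rest_sum, rest_levels + 1


def count_bits_tree(bits):
    """Rule Set B: divide into two halves. Returns (sum, adder levels)."""
    if len(bits) == 1:
        return bits[0], 0
    half = len(bits) // 2
    upper_sum, upper_levels = count_bits_tree(bits[half:])
    lower_sum, lower_levels = count_bits_tree(bits[:half])
    # both halves are evaluated side by side, so only the deeper one counts
    return upper_sum + lower_sum, max(upper_levels, lower_levels) + 1


bits = [1, 0, 1, 1, 0, 1, 1, 0]
print(count_bits_linear(bits))  # (5, 7)
print(count_bits_tree(bits))    # (5, 3)
```

For eight bits the chain needs 7 levels and the tree only 3, which is the difference mentioned above.
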
Now exactly such a task, like adding up the 9 elements in a 3 by 3 convolution (say, to do Gaussian filtering), calls for a pipelined version. Recursive functions are inherently not pipelineable, as you cannot infer registers inside a VHDL function (to my knowledge). And that is why, at the time, I printed that document and kept it: it talks about using recursive components. I refer you to the PDF, but it essentially comes down to instantiating the entity itself in its own architecture. I tried to copy and paste the code from the document, but that only partially succeeded and left quite some editing to do. As I switched to MyHDL several months ago, I decided to do this in MyHDL instead of VHDL, and consequently abandoned the editing.

Here is the MyHDL code:

import myhdl

import hdlutils

def sumbits(Clk, Reset, D, Q):
    ''' a recursive pipelined implementation'''
    LWIDTH_D = len(D)
    if LWIDTH_D > 2:
        # recurse by splitting things up
        # integer division: the lower half takes the extra bit when the width is odd
        LWIDTH_L = LWIDTH_D - LWIDTH_D // 2
        dupper, dlower = [myhdl.Signal(myhdl.intbv(0)[LWIDTH_L:]) for _ in range(2)]
        lql, lqu = [myhdl.Signal(myhdl.intbv(0)[hdlutils.widthr(LWIDTH_L):]) for _ in range(2)]
        supper = sumbits(Clk, Reset, dupper, lqu)
        slower = sumbits(Clk, Reset, dlower, lql)

        @myhdl.always_comb
        def split():
            ''' this will expand on the left in case the input data-size is uneven '''
            dupper.next = D[: LWIDTH_L]
            dlower.next = D[LWIDTH_L:]

        @myhdl.always_seq(Clk.posedge, Reset)
        def rtlr():
            ''' the result is the sum of the previous branches '''
            Q.next = lqu + lql

        return supper, slower, split, rtlr

    else:
        # know when to stop
        @myhdl.always_seq(Clk.posedge, Reset)
        def rtl2():
            ''' the result is the sum of the two (terminal) leaves '''
            Q.next = D[1] + D[0]
        return rtl2

It looks more complicated than the simple VHDL functions we defined earlier, but it matches up to the construct in the paper.

The VHDL (or Verilog, you choose) conversion results in this schematic:

Let's inspect the source:

import myhdl

import hdlutils

Obviously we import the MyHDL library. Notice that we refrained from using the wildcard from myhdl import *.
hdlutils is my library with utility functions. Unfortunately there is no official library, so everybody brings their own. Here we only see the usage of widthr(), but in the code I call on some simulation helper functions too. widthr(number) returns the number of bits needed to represent the given number (e.g. 7 requires 3 bits and 8 needs 4).
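
The source of hdlutils is not shown here, so the following is only a guess at what widthr() could look like for non-negative integers, written to match the two examples given above. Treat it as a hypothetical stand-in, not as the actual library function.

```python
def widthr(number):
    """Number of bits needed to represent a non-negative integer.

    Hypothetical stand-in for hdlutils.widthr(); the real helper may differ.
    """
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # int.bit_length() returns 0 for 0, so reserve at least one bit
    return max(1, number.bit_length())


print(widthr(7))  # 3
print(widthr(8))  # 4
```

In sumbits() this sizes the partial sums: a 4-bit half can contain up to 4 one bits, and representing the value 4 takes 3 bits.
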

def sumbits(Clk, Reset, D, Q):
    ''' a recursive pipelined implementation'''
    LWIDTH_D = len(D)

The first thing we do on entrance is find out how many elements we have in our input data. The strategy in recursion (using Rule Set B) is to cut the task up into smaller tasks and thus the size of the input data evolves with each recursion.

    if LWIDTH_D > 2:
        # recurse by splitting things up
        LWIDTH_L = LWIDTH_D - LWIDTH_D // 2
        dupper, dlower = [myhdl.Signal(myhdl.intbv(0)[LWIDTH_L:]) for _ in range(2)]
        lql, lqu = [myhdl.Signal(myhdl.intbv(0)[hdlutils.widthr(LWIDTH_L):]) for _ in range(2)]
        supper = sumbits(Clk, Reset, dupper, lqu)
        slower = sumbits(Clk, Reset, dlower, lql)

        @myhdl.always_comb
        def split():
            ''' this will expand on the left in case the input data-size is uneven '''
            dupper.next = D[: LWIDTH_L]
            dlower.next = D[LWIDTH_L:]

        @myhdl.always_seq(Clk.posedge, Reset)
        def rtlr():
            ''' the result is the sum of the previous branches '''
            Q.next = lqu + lql

        return supper, slower, split, rtlr

These are Rules 1 and 2 of our Rule Set B.

1. We simply split each input into two sets. This is achieved by:

    dupper, dlower = [myhdl.Signal(myhdl.intbv(0)[LWIDTH_L:]) for _ in range(2)]

    @myhdl.always_comb
    def split():
        ''' this will expand on the left in case the input data-size is uneven '''
        dupper.next = D[: LWIDTH_L]
        dlower.next = D[LWIDTH_L:]

If the actual input set has an odd number of elements, we end up with one even and one odd set. We could pass these on to the next recursion as they are, but that would create an imbalance in the number of registers along the two paths and thus deliver incorrect results. So we expand the smaller set with one extra element on the MSB side. In the example we are treating here that element is a single bit, and the padding happens automatically in MyHDL when assigning an intbv to a larger one. If you are adding up intbv's instead, you have to pad with a zero-valued intbv (which is a bit more work, and which I will show in the not too distant future).

2. We recurse for each of the two sets:

    lql, lqu = [myhdl.Signal(myhdl.intbv(0)[hdlutils.widthr(LWIDTH_L):]) for _ in range(2)]
    supper = sumbits(Clk, Reset, dupper, lqu)
    slower = sumbits(Clk, Reset, dlower, lql)

    @myhdl.always_seq(Clk.posedge, Reset)
    def rtlr():
        ''' the result is the sum of the previous branches '''
        Q.next = lqu + lql

    return supper, slower, split, rtlr

We recurse for each newly split-off set and finally add the two returned results in an @always_seq process that will register the result.

And finally, Rule 3:

    else:
        # know when to stop
        @myhdl.always_seq(Clk.posedge, Reset)
        def rtl2():
            ''' the result is the sum of the two (terminal) leaves '''
            Q.next = D[1] + D[0]
        return rtl2

When the input data only contains two elements we return the registered sum of them.
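
To sanity-check the reasoning about splitting, padding and register balance, the structure of sumbits() can be mirrored on plain Python integers. This is only a behavioural sketch: sumbits_model() is a name invented for this example, it does not use MyHDL, and it merely counts one clock cycle of latency for every registered addition.

```python
def sumbits_model(value, width):
    """Return (number of one bits, pipeline latency in clock cycles)."""
    if width > 2:
        lwidth = width - width // 2           # same split point as LWIDTH_L
        upper = value >> lwidth               # may be one bit narrower than lwidth
        lower = value & ((1 << lwidth) - 1)
        # both halves are treated as lwidth bits wide, i.e. the upper half
        # is zero-padded on the MSB side when the width is odd
        sum_upper, lat_upper = sumbits_model(upper, lwidth)
        sum_lower, lat_lower = sumbits_model(lower, lwidth)
        # equal widths give equal register depth in both branches
        assert lat_upper == lat_lower
        return sum_upper + sum_lower, lat_upper + 1   # the rtlr register
    # terminal leaf: registered sum of the two remaining bits (rtl2)
    return ((value >> 1) & 1) + (value & 1), 1


print(sumbits_model(0b1011011, 7))   # (5, 3)
print(sumbits_model(0b11111111, 8))  # (8, 3)
```

A 7-bit and an 8-bit input both come out with a latency of three cycles, because the padding makes every branch at a given level equally wide.
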

That's all.

Note: I learned today that Christopher Felton has another fine example using recursion in MyHDL. Check it out too! It is a fully worked-out module so you have to dig into def fft_core(): to find the little gem.
