Working with Behavioral Compiler: Some Helpful Tips
By David Black, High-Level Design Wizard, Qualis Design Corporation

I've worked with many of our clients on Behavioral Compiler designs and have accumulated a few helpful tips for users of Synopsys BC. Many engineers have found BC extremely effective in reducing design time (5x) when used appropriately. Use these tips to save yourself time and stress.

Tip Index

  1. BC TIP: Initialize variables close to their usage site
  2. BC TIP: Use bc_view
  3. BC TIP: Misuse of Verilog 'disable' can make life tough
  4. BC TIP: Make I/O stand out using smart naming conventions
  5. BC TIP: Add a debug output register
  6. BC TIP: Verilog tasks not useful with BC... yet
  7. BC TIP: Appropriate designs for BC
  8. BC TIP: Simplifying constraints on fast handshakes
  9. BC TIP: Checklist of common BC traps
  10. BC TIP: Back-to-back loops need a clock
  11. SYNOPSYS TIP: Synopsys manpages at the UNIX prompt

  1. BC TIP: Initialize variables close to their usage site

    Behavioral Compiler needs variables to be initialized properly to give good results. If you fail to initialize the entire variable in one assignment, BC doesn't recognize the initialization as valid. Failure to initialize results in warning messages, may create simulation mismatches, and may create unnecessary registers.

        Warning: Variable 'MAIN/ERROR_CONDITON/x_loop_connect' is not initialized (HLS-155)
    The following is an example that might puzzle designers expecting more than the tool currently delivers:
        reg [31:0] x;
        ...
        while (COND) begin :WHILE
          if (COND) begin
            x[31:8] = a;
            end //if
          else begin
            x[31:8] = b;
            end //else
          x[7:0] = 0;
          data_out <= x;
          @(posedge clock);
          end //while
    BC doesn't recognize that x was fully initialized, because the assignment is split between x[31:8] and x[7:0]. Simply adding 'x = 0;' directly after the 'while' and before the 'if' saves a register and a warning message. Note that when BC sees a loop, it inserts a register unless it can determine that the old value is of no use. Initialization solves this. Alternatively, you can assign the register to itself in all branches of a conditional, which is painful.
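
    As a minimal sketch of that fix, the fragment above can be wrapped in a small module with the whole-variable initialization in place. The module name, the port list, and the run_I and sel_I controls are assumptions made only so the example is complete; they are not from any real design.

```verilog
// Sketch only: module name, ports, and the run_I/sel_I controls are assumptions.
module init_example (clock, run_I, sel_I, a_I, b_I, data_O);
  input         clock, run_I, sel_I;
  input  [23:0] a_I, b_I;
  output [31:0] data_O;
  reg    [31:0] data_O;
  reg    [31:0] x;

  always begin :MAIN
    @(posedge clock);
    while (run_I) begin :WHILE
      x = 0;                 // initialize ALL of x in one assignment
      if (sel_I) begin
        x[31:8] = a_I;
        end //if
      else begin
        x[31:8] = b_I;
        end //else
      x[7:0] = 0;
      data_O <= x;
      @(posedge clock);
      end //while
  end
endmodule
```

    The later part-select assignments still overwrite the initial zero, so the simulated behavior is unchanged; the extra statement is there only so the tool can see one complete initialization.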

    I recommend eliminating all BC warning messages. Only occasionally is it acceptable to leave the warnings.

  2. BC TIP: Use bc_view

    Behavioral Compiler supports a graphical analysis tool called bc_view. It's part of the BC package, but is not enabled by default. To turn on bc_view, be sure to put the following in your scheduling scripts:

       bc_enable_analysis_info = true       /* must occur before analyze & elaborate */
    It makes the database slightly larger, but it makes debugging a lot easier. bc_view also makes scheduling reports much more readable. Anyone who has used BC for a real-world design knows the schedule report can be quite unmanageable. The latest BC release has features to help isolate sections of the report table, and the cross-highlighting between source code, scheduling table, and state graph is getting much better. The on-line documentation includes a manual on how to use bc_view.

    Two minor related tips:

    1. Make sure your DISPLAY environment variable is set correctly. You can use the dc_shell command set_unix_variable to do this during a session if need be.
    2. bc_view seems to work best if the display host is of the same architecture as the executing host. This is likely to be a problem only for folks using a load balancer or other remote queuing tools.

  3. BC TIP: Misuse of Verilog 'disable' can make life tough

    Behavioral Compiler supports the use of Verilog's 'disable' to emulate VHDL's 'exit' and 'next'. This works if you follow the examples closely; however, overuse can lead to problems. There are five common mistakes.

    MISTAKE 1. Disabling the wrong block

        begin :LOOP forever begin :BODY
           @(posedge clock);
           if (COND1) disable BODY; // equivalent to VHDL 'NEXT'
           if (COND2) disable LOOP; // equivalent to VHDL 'EXIT'
        end end
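    All too often, engineers leave out the enclosing begin-end pair and put the loop label on the inner block. It seems more natural; however, Verilog rules dictate that the inner begin block is only the body of the forever, not the loop itself. Disabling it ends the current iteration, and the forever keeps running.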
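
    A small simulation-only sketch can make the difference between the two labels visible. The module name, the counter n, and the clock generator are illustrative assumptions, and the code is meant for a simulator rather than for BC.

```verilog
// Simulation-only sketch; module and signal names are illustrative.
module disable_demo;
  reg     clock;
  integer n;

  // Free-running clock for the demonstration.
  initial begin
    clock = 0;
    forever #5 clock = ~clock;
  end

  initial begin
    n = 0;
    // The outer labeled block encloses the forever; the inner label
    // names only the loop body.
    begin :LOOP forever begin :BODY
      @(posedge clock);
      n = n + 1;
      if (n < 3)  disable BODY;   // like VHDL 'next': start the next pass
      if (n == 3) disable LOOP;   // like VHDL 'exit': leave the loop
    end end
    $display("left the loop after %0d clocks", n);
    $finish;
  end
endmodule
```

    If the second test disabled BODY instead of LOOP, the forever would simply start another pass and the $display would never be reached.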

    MISTAKE 2. Asserting outputs and disabling without an intervening clock

       request_out <= 1;
       @(posedge clock);
       begin :LOOP forever begin
         if (acknowledge_in == 1'b1) begin
           request_out <= 0;
           disable LOOP;
         end
         @(posedge clock);
       end end
    Verilog-XL simulation will reveal that request_out never gets set to zero. This differs from VCS, which provides the expected zero. Verilog-XL behaves this way because all events scheduled within a block that gets disabled are removed from the event queue. In this case, 0 was scheduled to be placed on request_out, but the disable cancelled it. Verilog semantics leave this behavior unspecified, so both simulators are within legal bounds. Inserting @(posedge clock) before the disable fixes the problem from both a simulation and a synthesis point of view.
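
    A minimal sketch of the corrected handshake, wrapped in a module so it can be simulated on its own. The module name and the surrounding always block are assumptions; only the loop body follows the fragment above.

```verilog
// Sketch only: the module wrapper and its name are assumptions.
module handshake_fix (clock, acknowledge_in, request_out);
  input  clock, acknowledge_in;
  output request_out;
  reg    request_out;

  always begin :MAIN
    request_out <= 1;
    @(posedge clock);
    begin :LOOP forever begin
      if (acknowledge_in == 1'b1) begin
        request_out <= 0;
        @(posedge clock);   // let the non-blocking update complete first
        disable LOOP;       // nothing is pending on request_out any more
      end
      @(posedge clock);
    end end
  end
endmodule
```

    Because the update to request_out has already taken effect by the time the disable executes, the result no longer depends on how a simulator treats pending non-blocking assignments in a disabled block.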

    MISTAKE 3. Attempting to disable more than one level of hierarchy

    Verilog allows disabling any block from a simulation point of view. Unfortunately, BC does not support exiting more than a single level of loop hierarchy. This should be addressed in a future BC version (time unspecified).

    MISTAKE 4. Disabling a labeled block not associated with a loop

    Verilog semantics allow for disabling many things. Intuitively, disabling a block is nice as an error escape mechanism. In code:

       begin :CODE_SEQUENCE
         ...
         if (ERROR_CONDITION) begin
           error_flag = 1;
           disable CODE_SEQUENCE;
           end
         ...
         if (ERROR_CONDITION) begin
           error_flag = 1;
           disable CODE_SEQUENCE;
           end
         ...
         if (ERROR_CONDITION) begin
           error_flag = 1;
           disable CODE_SEQUENCE;
           end
         ...
         if (ERROR_CONDITION) begin
           error_flag = 1;
           disable CODE_SEQUENCE;
           end
         ...
       end //CODE_SEQUENCE
    Unfortunately, Synopsys does not support this at the present time. For some designs it appears to work occasionally (can you say "feature" with a sly grin?).

    A work-around involves setting a bogus variable to true and using a loop that ends with a test that unconditionally exits. Unfortunately, there is a drawback as discussed in Mistake #5 below.

    MISTAKE 5. Too many disables leads to long schedule times

    BC runs into a complexity problem when the number of states and transitions gets too large. Because BC searches for the best place to schedule each operation, the search space can grow exponentially as states and transitions are added, and scheduling slows down accordingly. This is another view of Synopsys' recommendation to keep the number of operations under 150 (an artificial number): a large number of operations OR transitions means a large number of combinations to consider, and the number of combinations directly affects the tool's performance.

  4. BC TIP: Make I/O stand out using smart naming conventions

    Behavioral Compiler places specific constraints on all I/O. Borrowing VHDL terminology, all I/O are referred to as 'signals', whereas internal temporaries and registers are referred to as variables. Any net that crosses the process (always block) boundary is designated a 'signal'. Because all I/O is scheduled, it is important to be able to spot all I/O quickly in your source code. Outputs are assigned with non-blocking assignments, which helps. Inputs are recognizable only by where they appear.

    I recommend that BC designers combine a suffix-based naming convention with explicitly copying inputs into temporary variables. The naming convention is simply to append _O (oh) to output signals and _I to input signals.

    Thus:

        data_O <= value;
        @(posedge clock);
        ack = ack_I;
        while (ack == 0) begin
          ack = ack_I;
          @(posedge clock);
          end
    The convention makes it easier for you and others to see the I/O and be alert to BC restrictions.

  5. BC TIP: Add a debug output register

    Behavioral Compiler code is sometimes difficult to debug if problems are found at the gate level. This happens often for teams using emulation as part of their methodology. Part of the reason is the level of abstraction: BC designs an efficient FSM that can be hard to follow. I recommend adding a debug_State output port (registered by default) and assigning it unique values throughout the code. Ideally, this register should be conditionally compiled in (use of the m4 preprocessor is recommended). This allows the register to be used for emulation or early prototypes and removed on subsequent "clean" compilations.

        // Verilog snippets...
        module Whopper (... , debug_State_o );
        ...
        output [7:0] debug_State_o; reg [7:0] debug_State_o; // COMMENT OUT FOR FINAL GATES
        ...
        debug_State_o <= 0; // COMMENT OUT FOR FINAL GATES
        @(posedge clock);
        ...
        if (CONDITION) begin
          ...
          debug_State_o <= 1; // COMMENT OUT FOR FINAL GATES
          @(posedge clock);
          end
        else begin
          ...
          debug_State_o <= 2; // COMMENT OUT FOR FINAL GATES
          @(posedge clock);
          end
        ...
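
    For teams not using m4, the same conditional inclusion can be sketched with the standard Verilog `ifdef directive. This is a hedged alternative rather than the recommended m4 flow: it assumes your tool version accepts `ifdef, and the DEBUG_STATE macro, the start_I and done_O ports, and the module body are hypothetical.

```verilog
// Sketch only: DEBUG_STATE, start_I, and done_O are hypothetical names.
`define DEBUG_STATE            // comment out for final gates

module Whopper (clock, start_I, done_O
`ifdef DEBUG_STATE
                , debug_State_o
`endif
               );
  input  clock, start_I;
  output done_O;
  reg    done_O;
`ifdef DEBUG_STATE
  output [7:0] debug_State_o;
  reg    [7:0] debug_State_o;
`endif

  always begin :MAIN
    done_O <= 0;
`ifdef DEBUG_STATE
    debug_State_o <= 0;        // unique value for this point in the code
`endif
    @(posedge clock);
    if (start_I) begin
      done_O <= 1;
`ifdef DEBUG_STATE
      debug_State_o <= 1;
`endif
      @(posedge clock);
      end
    else begin
`ifdef DEBUG_STATE
      debug_State_o <= 2;
`endif
      @(posedge clock);
      end
  end
endmodule
```

    With the `define removed, the port, its register, and every assignment to it disappear together, so there is no risk of commenting out only some of the lines.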

  6. BC TIP: Verilog tasks not useful with BC... yet

    The latest release of Behavioral Compiler announced some really cool features. Not the least of these was the promise of using tasks as a level of hierarchy. What the announcement failed to point out was the uselessness of the current implementation. Yes, you can use tasks; however, you may not use signals (I/O) within the tasks. That is unfortunate because a natural application of tasks would be to create I/O handlers (e.g., Read_PCI, Write_PCI, Send_Packet, etc.).

    The reason tasks don't handle I/O is twofold. First, Verilog specifies that task inputs are copied in once, at the time of invocation, and outputs are copied out once, when the task finishes (returns). So if you write:

        task Read;
        input  [15:0] addr;
        output  [7:0] data;
        output [15:0] addr_port_o;
        input   [7:0] data_port_i;
        ...
        endtask
        ...
        Read(fifo,result,real_addr_o,real_data_i);
    the signal real_addr_o will be written at the END of the task, and the input data will be read at the BEGINNING of the task invocation! Second, the natural response of directly accessing the I/O ports from inside the task does not work either: BC issues an error message stating that side effects are not supported. The good news is that Synopsys is working to remedy this in a future version of BC (time unspecified as usual).
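
    The copy-in and copy-out timing of task arguments can be observed with a small simulation-only sketch. All names here (task_io_demo, bus_data, req, and the simplified Read task) are illustrative assumptions and are not the PCI-style handlers mentioned above.

```verilog
// Simulation-only sketch; all names are illustrative.
module task_io_demo;
  reg       clock;
  reg [7:0] bus_data;    // stands in for an input port
  reg [7:0] result;
  reg       req;         // stands in for an output port

  // Free-running clock for the demonstration.
  initial begin
    clock = 0;
    forever #5 clock = ~clock;
  end

  // The "port" changes while the task is still running.
  initial #12 bus_data = 8'h22;

  // Sample the "output port" in the middle of the task.
  initial #10 $display("during the task: req=%b", req);

  task Read;
    input  [7:0] data_port_i;   // copied in once, at invocation
    output [7:0] data;
    output       req_o;         // copied out once, at return
    begin
      req_o = 1;                // not visible outside until the task ends
      @(posedge clock);
      @(posedge clock);
      data = data_port_i;       // still the value seen at invocation
    end
  endtask

  initial begin
    req      = 0;
    bus_data = 8'h11;
    Read(bus_data, result, req);
    $display("after the task: result=%h bus_data=%h req=%b",
             result, bus_data, req);
    $finish;
  end
endmodule
```

    The request does not reach req until the task returns, and result holds the value bus_data had when the task was called, not the value present when the task sampled it. That is the opposite of what an I/O handler needs.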

  7. BC TIP: Appropriate designs for BC

    This topic is a difficult one to approach, but there are some key ideas.


  8. BC TIP: Simplifying constraints on fast handshakes

    What follows is a technique to simplify applying cycle constraints to fast handshake loops. It is a form of defensive coding that guards against future changes that could break proper scheduling of those loops.

    STEP 1: In the source code, label all fast handshake loops using a unique and consistent convention, for example a FASTLOOP_ prefix. In several existing designs, this has already been done for other reasons.

        EXAMPLE VERILOG SOURCE CODE:
    
        read_req_o <= 1'b1;
        @(posedge clock);
        begin: FASTLOOP_READ_BUS forever begin
            data = data_i;
            if (bus_grant_i == 1'b1) begin
                read_req_o <= 1'b0;
                @(posedge clock);
                disable FASTLOOP_READ_BUS;
            end
            @(posedge clock);
        end end
    STEP 2: Add Synopsys dc_shell script find() commands immediately after elaboration that locate all your fast handshake loops. Follow this with set_cycles constraints to keep the loop lengths to 1. The following assumes "FASTLOOP_" was used as part of the labeling scheme described in step 1.
        /* Immediately following elaboration */
        $LOOP_LIST = find("cell","*FASTLOOP_*",-hierarchy)
        foreach ($LOOP,$LOOP_LIST) {
            set_cycles 1 -from_begin $LOOP -to_end $LOOP
        }/*endforeach*/
    This constrains the body of each loop to be one cycle long. Since the initial output occurs on a clock boundary just prior to entering the loop, there can be no more than a single cycle from activating the output to entering the loop. Similarly, the exit path will take only a single cycle. If code is later added that implies more than a single clock cycle, BC will report a scheduling failure.

    That's all there is to it. A similar technique may be applied to other situations. The advantage of using naming conventions for this should be obvious: no scripting commands in the HDL source code.

  9. BC TIP: Checklist of common BC traps

    The following is a list of common traps encountered by folks using BC. Hopefully it will be of use on future designs. Some traps are technical and others are psychological. They are presented in no particular order, except that those I have encountered most frequently are marked with an asterisk (*).

        01. Disabling the wrong named block *
        02. Mixing blocking and non-blocking assignments *
        03. Leaving out clocks in one branch of a case/if (usually else) *
        04. Attempting to disable more than one level of hierarchy *
        05. Getting lost with indentation of begin/end blocks *
        06. Adding clocks unnecessarily
        07. Casual mixing of RTL/BC code
        08. Omitting simulation of behavior before scheduling
        09. Too many operations as a result of unrolled for loops
        10. Missing DesignWare-Foundation libraries
        11. Overlooking bc_time_design reports of multicycle candidates
        12. Missing clock edges on entering/leaving loops
        13. Failure to adhere to signal naming conventions
        14. Not reviewing clock edge usage
        15. Cognitive dissonance
        16. Wrong version of Synopsys tools
        17. Ignoring warnings/error messages from bc_check and/or schedule
        18. Overlooking clock edges in m4 macros
        19. Overzealous desire to save clock cycles (+)
        20. Overzealous desire to save registers and gates (+)
        21. Not labeling begin/end blocks rigorously
        22. Overlooking clocks because coding style hides them
        23. Mistakes due to lack of sleep from overworking
        24. Not running small experiments on new coding styles
        25. Using styles discouraged or not supported officially
        26. Back-to-back loops add a clock in superstate causing mismatches

  10. BC TIP: Back-to-back loops need a clock

    Sometimes folks like to code back-to-back loops without an intervening clock edge. Unfortunately, BC doesn't support this, and in superstate_fixed scheduling mode it will add the clock for you. If your code expects this to be a tight interface, the extra clock will cause simulation mismatches.

    Example of problem:

      // Get two bytes from the interface
      request_o <= 1;
      @(posedge clock);
      begin :LOOP1 forever begin
        data1 = data_i;
        if (ack_i == 1) begin
          @(posedge clock);
          disable LOOP1;
          end
        @(posedge clock);
      end end //LOOP1
      // BC will require a clock here
      // Superstate will silently insert a clock
      begin :LOOP2 forever begin
        data2 = data_i;
        if (ack_i == 1) begin
          request_o <= 0;
          @(posedge clock);
          disable LOOP2;
          end
        @(posedge clock);
      end end //LOOP2
    The solution described by Synopsys is to wrap the two loops into one. For this example, the solution is straightforward. The BC style guide indicates an alternative.

    Example of the solution:

      // Get two bytes from the interface
      count = 2;                          // bytes still to be received
      request_o <= 1;
      @(posedge clock);
      begin :LOOP forever begin
        if (ack_i == 1) begin
          if (count == 2) data1 = data_i; // first byte
          else            data2 = data_i; // second byte
          count = count - 1'b1;
          if (count == 0) begin
            request_o <= 0;
            @(posedge clock);
            disable LOOP;
            end
          end
        @(posedge clock);
      end end //LOOP

  11. SYNOPSYS TIP: Synopsys manpages at the UNIX prompt

    Synopsys manpages are available and may be used at the UNIX command line quite easily. The following script will do the trick:

    #!/bin/csh
    #
    # @(#)$Info: dcman script to display Synopsys manpages. $
    #
    # NOTE: This requires that $SYNOPSYS point to the Synopsys
    #       root directory. In dc_shell use 'list synopsys_root'
    #       to determine the correct setting if you don't know
    #       this already.
    #
    setenv MANPATH "$SYNOPSYS/doc/syn/man"
    man $*
    exit 0
    Example: get the manpage for the set_structure command
      % dcman set_structure
    Example: get the manpage for an error message (LINT-47)
      % dcman LINT/LINT-47
-------
Brought to you by the Qualis Library
http://www.qualis.com/library/