| TOC  
    To Do (part 1, basic timing)
    To Do (part 2, Performance)
    To Do (part 3, Pipelining)
    Check Off
    Report
       | 
     | 
    
     Explore some of the timing issues discussed in class, such as register-to-register
    delay and clock-to-out delay. Also, take a first look at the use of pipelining.
    Background
    In this lab exercise you will use all of the analysis modes available in the Timing
    Analyzer tool. These modes are available from the Analysis menu once the Timing
    Analyzer is active. The analysis modes are:
      - Delay Matrix: this is the mode that you have used in previous exercises. It
        gives the combinational delays from input pins to output pins of the mapped device.
        These delays are known as pin-to-pin delays. Note that if all outputs are
        registered, then the only combinational paths from inputs to outputs are the paths
        from the clock to each output. This type of delay is known as clock-to-out delay.
 
      - Setup Matrix: this mode reports the setup/hold times for all registered
        inputs (inputs that go to the data pin of a latch or flip-flop).
 
      - Registered Performance: this mode examines all of the register-to-register
        delay paths, finds the longest path, and uses the inverse of that delay to report a
        maximum clock speed for the design. Individual delay paths can be listed from within
        this mode.
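    The Registered Performance calculation described above can be sketched in a few
    lines of Python. This is only an illustration: the path names and delay values are
    made up, and the real numbers must be read from the Timing Analyzer.

```python
# Hypothetical register-to-register path delays in nanoseconds.
# These values are invented for illustration only.
path_delays_ns = {
    "reg_a -> reg_sum": 12.5,
    "reg_b -> reg_sum": 10.0,
    "reg_c -> reg_sum": 8.0,
}

def max_clock_mhz(delays_ns):
    """Return (name of the longest path, maximum clock frequency in MHz)."""
    name = max(delays_ns, key=delays_ns.get)
    longest_ns = delays_ns[name]
    # A period of T ns corresponds to a frequency of 1000 / T MHz.
    return name, 1000.0 / longest_ns

critical, fmax = max_clock_mhz(path_delays_ns)
print(f"critical path: {critical}, fmax = {fmax:.1f} MHz")
```

    With these made-up numbers the 12.5 ns path sets the limit, which corresponds to
    an 80.0 MHz clock.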
 
     
    
    To Do (part 1, basic timing)
    Create a schematic from which you can measure the following:
      - Minimum register-to-register delay. Measure this with two DFFs in which the
        output of the first DFF goes through a minimum amount of combinational logic
        to the input of the second DFF. The minimum amount of logic is one level of
        gating, as represented by one LUT (a four-input Boolean function). You
        will probably need 2 or 3 DFFs feeding the logic gate, with the gate output going to a
        DFF.
 
      - Minimum clock-to-out delay. You should be able to measure this from the
        schematic used above.
 
      - Minimum setup time. You should be able to measure this from the schematic
        used above.
 
      - Minimum pin-to-pin delay for combinational logic.  You can add a simple
        2-input gate (any function) to the above schematic to get this measurement. The inputs and
        outputs are unregistered.
 
     
    Map this schematic to the FLEX 10K family, device EPF10K20RC240-4. (For the
    Professional edition of the software, you will need to click the 'show all speed
    grades' button on the device menu to get the "-4" speed grade.) Use the FAST
    synthesis option.
    
    To Do (part 2, Performance)
    Take the saturating adder that you created in Lab #2 and used in Lab #4. Place
    DFFs on the inputs and on the outputs so that all inputs and outputs are registered. You
    might find the parameterized macro LPM_DFF or the VHDL file (dff8.vhd)
    useful for adding these flip-flops. If you use LPM_DFF, you will need to turn
    off some of the optional input pins. The only input pins you need are DATA, ACLR, and
    CLOCK.
    Determine the maximum clock frequency for this design using the same device and
    synthesis mode as in part 1.
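    As a rough behavioral picture of what registering the inputs and outputs does, the
    Python sketch below models the registered adder one clock edge at a time. It
    assumes an unsigned 8-bit adder that clamps at 255; your Lab #2 adder may be
    defined differently (for example, signed saturation), so treat the function
    names and the arithmetic here as placeholders.

```python
def saturating_add_u8(a, b):
    """Unsigned 8-bit add that clamps at 255 instead of wrapping (assumed behavior)."""
    return min(a + b, 255)

def registered_adder(stimulus):
    """Clock-by-clock model: input DFFs -> adder -> output DFF.

    stimulus: list of (a, b) pairs presented at the pins, one pair per clock.
    Returns the value held by the output register after each clock edge.
    """
    reg_a = reg_b = reg_sum = 0  # all registers start cleared, as after ACLR
    seen = []
    for a, b in stimulus:
        # Every flip-flop updates on the same edge, so the output register
        # captures the sum of the OLD input-register values.
        reg_sum, reg_a, reg_b = saturating_add_u8(reg_a, reg_b), a, b
        seen.append(reg_sum)
    return seen

print(registered_adder([(100, 100), (200, 100), (1, 2), (0, 0)]))
# prints [0, 200, 255, 3]
```

    Each sum shows up at the output register one clock edge after its operands are
    captured by the input registers, and the 200 + 100 case clamps to 255.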
    
    
    To Do (part 3, Pipelining)
      - Use the LPM_MULT parameterized module, and configure it to do an 8-bit x 8-bit multiply
        with an 8-bit product. Place DFFs on the inputs and DFFs on the outputs, and
        determine the maximum clock frequency using the same device and synthesis mode as
        in part 1.
 
      - Use the previous schematic, and change the LPM_MULT parameters so that the multiplier
        is pipelined with a latency of 1. To do this, mark the 'clock' input as 'used'
        and set the value of LPM_PIPELINE to 1. You will need to connect the clock
        input of the multiplier to the clock input of the design. Determine the maximum
        clock frequency using the same device and synthesis mode as in part 1. Repeat this
        procedure for LPM_PIPELINE=2 and LPM_PIPELINE=4, and determine the maximum clock
        frequency for each.
 
      - For the schematic with LPM_PIPELINE=4, create a waveform file that exercises the
        multiplier, and verify that the multiplier operates correctly with a latency of 4.
        Your waveform file MUST feed at least 50 vectors through the multiplier.
        DEMONSTRATE this simulation to the TA.
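    When checking the waveform, it can help to have a simple reference for what
    "a latency of 4" looks like. The Python sketch below is a behavioral model only,
    not the LPM_MULT implementation: it keeps the full product rather than an 8-bit
    one, it models only the multiplier's internal pipeline stages (any DFFs you
    place on the inputs and outputs add further clock cycles), and the function
    name and the stimulus pattern are made up for illustration.

```python
from collections import deque

def simulate_pipelined_multiplier(vectors, latency):
    """Model a multiplier whose result appears `latency` clocks after its inputs.

    vectors: list of (a, b) pairs, one pair applied per clock edge.
    Returns one output per clock; None means no valid result has emerged yet.
    """
    pipe = deque([None] * latency)  # one slot per pipeline stage
    outputs = []
    for a, b in vectors:
        pipe.append(a * b)              # new product enters the pipeline
        outputs.append(pipe.popleft())  # oldest entry reaches the output
    return outputs

# 50 hypothetical input vectors, matching the minimum count required above.
vectors = [(i % 256, (3 * i + 7) % 256) for i in range(50)]
out = simulate_pipelined_multiplier(vectors, latency=4)

# The product of the inputs applied at clock n appears at clock n + 4.
for n, (a, b) in enumerate(vectors[:-4]):
    assert out[n + 4] == a * b
print(out[:6])
```

    The first four outputs are None because nothing has reached the end of the
    pipeline yet; after that, a new result emerges on every clock.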
 
     
    
    Check Off
    You must DEMONSTRATE the simulation of your multiplier schematic with LPM_PIPELINE=4
    to the TA.
    
    Report
    You must hand in plots of all schematics and report all timing measurements.
    Hand in a partial plot of the multiplier simulation that shows the latency between a
    set of input values and the correct output value. EXPLAIN how pipelining can be used
    to increase maximum clock speed at the cost of latency.
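    One way to start thinking about this trade-off is an idealized model in which the
    combinational logic is split evenly into register-bounded stages, and each stage
    adds a fixed register overhead (clock-to-out plus setup). The Python sketch below
    uses that model with made-up delay values; real designs rarely split evenly, so
    compare it against your measured numbers rather than relying on it.

```python
def pipeline_tradeoff(logic_delay_ns, overhead_ns, stages):
    """Idealized model: logic split evenly into `stages` register-bounded pieces.

    Returns (maximum clock frequency in MHz, input-to-output time in ns).
    """
    period_ns = overhead_ns + logic_delay_ns / stages
    fmax_mhz = 1000.0 / period_ns
    total_ns = stages * period_ns  # time for one result to pass through every stage
    return fmax_mhz, total_ns

# Hypothetical numbers: 40 ns of logic, 5 ns of register overhead per stage.
for stages in (1, 2, 4):
    fmax, total = pipeline_tradeoff(logic_delay_ns=40.0, overhead_ns=5.0, stages=stages)
    print(f"{stages} stage(s): fmax = {fmax:.1f} MHz, input-to-output time = {total:.1f} ns")
```

    In this model the clock frequency rises as stages are added, while the time for a
    single result to pass through grows, because the register overhead is paid once
    per stage.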
       |