



| Technology,<br>inverter ratio | Tilos solution | path<br>delay<br>(ps) | diff (ps) | inv FO4<br>delay<br>(ps) | diff in<br>FO4<br>delays |
|-------------------------------|----------------|-----------------------|-----------|--------------------------|--------------------------|
| 0.35<br>2/1                   | absolute       | 624                   |           |                          |                          |
|                               | +/- 1.0%       | 690                   | +66       | 97                       | 0.68                     |
| 0.35<br>1.4/1                 | absolute       | 635                   |           |                          |                          |
|                               | +/- 1.0%       | 701                   | +66       | 100                      | 0.66                     |
| 0.18<br>2/1                   | absolute       | 300                   |           |                          |                          |
|                               | +/- 1.0%       | 326                   | +27       | 38                       | 0.71                     |
| 0.18<br>1.4/1                 | absolute       | 300                   |           |                          |                          |
|                               | +/- 1.0%       | 327                   | +27       | 35                       | 0.77                     |



- If we want the best performance, then tweak for the best possible solution if this is the critical path and nothing less than the fastest is acceptable.
- If we express the delay difference in FO4 delays, then a difference greater than 1 FO4 delay is unacceptable
  - levels of logic are precious

solution?

- much work is expended to reduce the # of FO4 delays between registers; each FO4 delay is a logic level
- giving up one entire logic level worth of delay because of poor timing optimization is not acceptable
- Less than 1 FO4 delay and greater than 0.5 FO4 delay is a grey area
- Within 0.5 FO4 delay of the best known solution is good.
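
The thresholds above can be written down as a small check. This is a minimal Python sketch, not the author's script; the function name `classify_fo4_gap` is hypothetical, and treating a gap of exactly 1 FO4 as "grey area" and exactly 0.5 FO4 as "good" is an assumption, since the slide does not say which side the boundaries fall on.

```python
def classify_fo4_gap(delay_ps, best_delay_ps, fo4_ps):
    """Express a delay difference in FO4 delays and grade it.

    delay_ps      -- path delay of the solution being judged (ps)
    best_delay_ps -- path delay of the best known solution (ps)
    fo4_ps        -- inverter FO4 delay for the same technology (ps)
    """
    gap = (delay_ps - best_delay_ps) / fo4_ps
    if gap > 1.0:
        verdict = "unacceptable"   # a whole logic level given away
    elif gap > 0.5:
        verdict = "grey area"
    else:
        verdict = "good"
    return gap, verdict


if __name__ == "__main__":
    # 0.35, 2/1 row of the table: 690 ps vs. 624 ps, FO4 = 97 ps
    gap, verdict = classify_fo4_gap(690, 624, 97)
    print(f"{gap:.2f} FO4 -> {verdict}")
```

With the 0.35, 2/1 numbers from the table (690 ps against 624 ps, FO4 of 97 ps) the gap is 0.68 FO4, which lands in the grey area.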

BR 6/00


#### Methodology: Leffort

- Leffort optimization only requires N (number of stages), logical effort values, and load values
  - Perl script used to compute values
- Load values expressed in transistor widths
  - for 2/1 inverter, load = 3; for 2/2 nand2, load = 4
  - for 1.4/1 inverter, load = 2.4; for 1.4/2 nand2, load = 3.4
- 'g' value for nand2 measured as discussed in notes
- Number of stages = 5
- Two iterative methods used
  - #1: Ignore branching effort for first iteration (this under-sizes gates initially, and assumes smallest possible branch effort)
  - #2: Assume gate sizes initially = 1X, compute branching effort based on this (this over-sizes gates initially, and assumes largest possible branch effort)
  - Both approaches gave same results.
- After gates were sized, rounded to nearest 0.5X size.
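
A single pass of the sizing calculation can be sketched as follows. This is Python rather than the author's Perl, and it uses the textbook logical-effort equations (equal stage effort f = (G·B·H)^(1/N), sizes propagated backward from the load), which is an assumption about what the script computed. The function names and the three-inverter example path are hypothetical; only the 2/1 inverter load of 3 and the 0.5X rounding come from the slide.

```python
import math


def leffort_input_caps(g, c_in, c_load, b=None):
    """Return (stage effort f, input load of each stage).

    g      -- logical effort of each stage, first to last
    c_in   -- input load of the first stage (transistor widths)
    c_load -- load driven by the last stage (transistor widths)
    b      -- branching effort per stage; None means ignore branching
              (method #1, first iteration)
    """
    n = len(g)
    b = b if b is not None else [1.0] * n
    path_effort = math.prod(g) * math.prod(b) * (c_load / c_in)
    f = path_effort ** (1.0 / n)          # equal effort per stage
    caps = [0.0] * n
    c_next = c_load
    for i in reversed(range(n)):          # work backward from the load
        caps[i] = g[i] * b[i] * c_next / f
        c_next = caps[i]
    return f, caps


def round_to_step(size, step=0.5):
    """Round a gate size to the nearest multiple of step (halves round up)."""
    return math.floor(size / step + 0.5) * step


if __name__ == "__main__":
    # Hypothetical path: three 2/1 inverters (load = 3 per 1X) driving 192.
    f, caps = leffort_input_caps([1.0, 1.0, 1.0], c_in=3.0, c_load=192.0)
    sizes = [round_to_step(c / 3.0) for c in caps]
    print(round(f, 3), sizes)
```

For the hypothetical three-inverter path the stage effort is 4 and the sizes come out as 1X, 4X, 16X. Iterating with branching means recomputing `b` from the sizes just obtained and calling the function again until the sizes stop changing.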


| Technology,<br>inverter ratio | method<br>(leffort: g nand2) | gate size<br>s0 | gate size<br>s1 | gate size<br>s2 | gate size<br>s3 | delay<br>(ps) | diff in<br>FO4s |
|-------------------------------|------------------------------|-----|-----|-----|-----|-----|------|
| 0.35<br>2/1                   | tilos (abs)                  | 3.0 | 4.0 | 6.0 | 4.0 | 624 |      |
|                               | leffort (1.27)               | 1.5 | 1.5 | 3.5 | 5.0 | 699 | 0.7  |
| 0.35<br>1.4/1                 | tilos                        | 3.5 | 4.0 | 6.0 | 5.0 | 635 |      |
|                               | leffort (1.3)                | 1.5 | 1.5 | 3.5 | 5.0 | 724 | 0.8  |
| 0.18<br>2/1                   | tilos                        | 2.0 | 3.5 | 6.0 | 5.5 | 300 |      |
|                               | leffort (1.24)               | 1.5 | 1.5 | 3.5 | 5.0 | 328 | 0.74 |
| 0.18<br>1.4/1                 | tilos                        | 3.0 | 4.0 | 6.0 | 5.0 | 300 |      |
|                               | leffort (1.3)                | 1.5 | 1.5 | 3.5 | 5.0 | 328 | 0.8  |

| Technology,<br>inverter ratio | method<br>(leffort: g nand2) | gate size<br>s0 | gate size<br>s1 | gate size<br>s2 | gate size<br>s3 | delay<br>(ps) | diff in<br>FO4s |
|-------------------------------|------------------------------|-----|-----|-----|-----|-----|-----|
| 0.35<br>2/1                   | tilos                        | 2.0 | 2.0 | 3.0 | 2.5 | 690 |     |
|                               | leffort (1.27)               | 1.5 | 1.5 | 3.5 | 5.0 | 699 | 0.0 |
| 0.35<br>1.4/1                 | tilos                        | 2.5 | 2.0 | 3.0 | 2.5 | 701 |     |
|                               | leffort (1.3)                | 1.5 | 1.5 | 3.5 | 5.0 | 724 | 0.2 |
| 0.18<br>2/1                   | tilos                        | 2.0 | 2.0 | 3.0 | 2.5 | 326 |     |
|                               | leffort (1.24)               | 1.5 | 1.5 | 3.5 | 5.0 | 328 | ~   |
| 0.18<br>1.4/1                 | tilos                        | 2.0 | 2.0 | 3.0 | 2.5 | 327 |     |
|                               | leffort (1.3)                | 1.5 | 1.5 | 3.5 | 5.0 | 328 | ~   |

#### Comments

- Tilos required about 30 iterations for absolute solution, about 15 iterations for +/- 1% path difference solution, significant simulation time
  - +/- 1% solution did not use any gate sizes > 3.5 -- this is not good, why have the larger gates?
- Leffort required approximately 5 iterations, negligible computation time
- Tilos results were consistently better than Leffort (approx 10%) at absolute solution, about the same for +/- 1% path difference solution
  - Tilos always sized stage S3 (inverter) larger than S4
  - Leffort always sized stage S3 (inverter) smaller than S4

#### Improving Leffort?

- The main problem with Leffort seemed to be that it missed sizing S3 larger than S4
  - Either under-estimating the load seen by S3 or over-estimating the drive capability of S3
  - Should not be under-estimating load, since the load calculation is consistent between stages
  - Must be over-estimating the drive capability of S3
- Why?
  - Leffort is a linear model applied to a non-linear process
  - Electrical effort (Cout/Cin) is supposed to be proportional to the delay associated with an external load; this is a linear factor
  - However, there are diminishing returns on delay improvement as you increase a gate size
  - This should be reflected in the Leffort model in some way, perhaps as a non-linear scaling factor for electrical effort based on load size




#### A Plea --- Learn Perl or Something Similar!!

- I was able to do all four cases and investigate other issues (such as absolute vs +/- 1% in Tilos) because I used Perl.
- Occasionally, an Engineer will have to write a program!
  - Not all problems can be solved by spreadsheets
- What types of problems might require programming?
  - Generate complex data streams as input to another problem
  - Parse/collect information out of large data files
  - Write a program that runs other programs in a regression test
  - Convert data files from one format to another
- Many programs are throw-away code: use once to complete a task, then forget about it.
- Usually an Engineer is under time pressure
  - Need to become very familiar with a 'favorite' programming language, and use it enough to become time efficient.


## How did I use Perl?

I had a file called *run\_params.sp* that defined the gate sizes for each stage, the model file ('tscm\_0\_35.model'), and the inverter ratio. This file was included by my main spice file (tilos.sp).

The Perl script had internal variables for gate sizes - the basic iteration loop in the script was:

1. Generate the file "run\_params.sp".
2. Run hspice.
3. Read the delay from the hspice output file.
4. Calculate new gate sizes based on delay, go to step 1, and continue until the tilos algorithm is complete.

I also wrote a Perl script for Leffort - it calculated sizes based on the leffort model, then ran Hspice at the end to get the delay with Leffort sizes.
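
The four-step loop above might look like the sketch below. It is written in Python instead of the author's Perl, and several things in it are assumptions rather than facts from the slides: the `.param` names, the `tphl` measurement name and its `name= value` format in the listing, running `hspice tilos.sp` with the listing on standard output, and the update rule. The update shown is a plain accept/reject search in 0.5X steps, used as a stand-in for the actual Tilos rule. All function names are hypothetical.

```python
import re
import subprocess


def format_params(sizes, model_file, ratio):
    """Step 1: text of run_params.sp (parameter names are hypothetical)."""
    lines = [f".include '{model_file}'", f".param ratio={ratio}"]
    lines += [f".param {name}={value}" for name, value in sizes.items()]
    return "\n".join(lines) + "\n"


def read_delay(listing, name="tphl"):
    """Step 3: pull one measured value out of the simulator listing."""
    pattern = rf"\b{re.escape(name)}\s*=\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
    match = re.search(pattern, listing)
    if match is None:
        raise ValueError(f"measurement {name!r} not found")
    return float(match.group(1))


def simulate(sizes, model_file, ratio, deck="tilos.sp"):
    """Steps 1-3: write the include file, run hspice, return the delay."""
    with open("run_params.sp", "w") as handle:
        handle.write(format_params(sizes, model_file, ratio))
    result = subprocess.run(["hspice", deck], capture_output=True,
                            text=True, check=True)
    return read_delay(result.stdout)


def greedy_size(sizes, measure, step=0.5, max_passes=50):
    """Step 4 (stand-in rule): bump one gate by `step`, keep the change
    only if the measured delay improves, stop when a full pass over all
    gates accepts nothing."""
    sizes = dict(sizes)
    best = measure(sizes)
    for _ in range(max_passes):
        improved = False
        for name in list(sizes):
            trial = dict(sizes, **{name: sizes[name] + step})
            delay = measure(trial)
            if delay < best:
                sizes, best, improved = trial, delay, True
        if not improved:
            break
    return sizes, best
```

Passing the simulator in as the `measure` callable keeps the search separate from hspice, so the loop can be tried on a made-up delay function first. A real run would look like `greedy_size(start_sizes, lambda s: simulate(s, model_file, ratio))`.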

#### Some Assumptions

- Programming is not the main task in your engineering workday
- Most programs you write are small, throw-away programs - less than 100 lines
  - Use a program once or twice, then forget about it
- Computer environment is either Unix-based or Windows-based
- Many work environments use both Windows and Unix
  - Windows for productivity tools (Spreadsheets, Word processing, Powerpoint)
  - UNIX workstations/servers for compute-intensive jobs


#### Desirable Features of a Programming Language for 'throw-away' code

- Powerful - get a lot done with a little code
- Flexible - be able to do many different types of tasks: GUIs, string processing, program control, number crunching, etc.
- Well documented
- Large user base, so external libraries and examples are readily available
- Portable - be able to run on different systems, so you can carry your favorite code with you when you change jobs
- Readily available ("free" is the best!)

#### Compiled Programming Languages

- C, C++, Fortran are traditional compiled languages
- Pros
  - High performance code
  - Portable between systems
  - Free C, C++, Fortran compilers from the Free Software Foundation
- Cons
  - Usually have to write a lot of code to get even simple tasks done
  - Non-standard extension libraries, which means you have to move your favorite library from system to system
  - Must compile source code on the target system before execution.
  - GUI interfaces are Operating System dependent
  - Support for 'scripting' (i.e., controlled execution of other programs) is minimal and Operating System dependent
- Best for large, complex tasks -- but may not be the best choice for simple tasks (i.e., < 100 lines of code)


## UNIX Shell Scripting Languages

- All UNIX shells (csh, bash, ksh, etc.) support a scripting language
  - Fairly primitive features, no powerful operators or data handling features
- Pros
  - Built in to the shell, always available for use
  - Ideal for scripting duties (control of other programs)
- Cons
  - Very slow - at least 100x slower than compiled code
  - No GUI capabilities
  - Only useful for Unix applications

## Visual Basic

- Visual Basic is the most common scripting/tool extension language on Windows platforms
- Pros
  - Integrated with Windows productivity tools (spreadsheets, etc.), so it can be used to extend their base capabilities
  - Very nice GUI building capabilities
  - Complex data types, powerful library functions
  - Has scripting capabilities
  - Decent performance (about 10x slower than compiled C/C++)
- Cons
  - Only works under Windows OS
  - Development environment costs \$\$\$
- Might be the best choice for throw-away code if you never touch a Unix system

## Java

- Portable object-oriented programming language
- Pros
  - Powerful data structures, functions
  - Portable between Unix/Windows
  - Free development environment
  - Powerful GUI building
  - Useful for Web page enhancement
  - Decent performance (about 10x slower than compiled C/C++)
- Cons
  - Limited scripting, string processing
  - Object-oriented programming model is possibly overkill for simple throw-away programs

## Perl

- Pros
  - Many public libraries available for tasks such as HTML processing and database access
  - Extremely large user base - the scripting language of choice under Unix
- Cons
  - No GUI building capabilities
  - Code may be unreadable after you write it!
- In my opinion, the best choice for throw-away code development under a Unix environment, and perhaps a combined Windows/Unix environment.





| nand_2 | inverter_1 | TPHL delay (e-10 s) | Accept (best size) /<br>reject | Notes |
|--------|------------|---------------------|--------------------------------|-------|
| 1x     | 4x         | 2.07                | Start                          |       |
| 1x     | 4.5x       | 2.97                | Reject                         |       |
| 1.5x   | 4x         | 2.87                | Accept                         | Next step, notice that inverter_3 started where it left off in previous step |
| 1.5x   | 4.5x       | 2.86                | Reject                         |       |
| 2x     | 4x         | 2.84                | Accept                         |       |
| 2x     | 4.5x       | 2.82                | Accept                         |       |
| 2.5x   | 4x         | 2.83                | Reject                         |       |
| 2x     | 5x         | 2.82                | Reject                         |       |
| 2.5x   | 4.5x       | 2.81                | Accept                         |       |

| nand_1 | nand_2 | TPHL delay (e-10 s) | Accept (best size) /<br>reject | Notes        |
|--------|--------|---------------------|--------------------------------|--------------|
| 1x     | 3x     | 2.78                | Start                          | Final sizing |
| 1x     | 3.5x   | 2.81                | Reject                         |              |
| 1.5x   | 3x     | 2.67                | Accept                         |              |
| 1.5x   | 3.5x   | 2.68                | Reject                         |              |
| 2x     | 3x     | 2.59                | Accept                         |              |

| Final gate sizes                         |            |
|------------------------------------------|------------|
| nand_1<br>nand_2<br>inverter_3<br>nand_3 | 4x<br>5.5x |