Microelectronic Technologies and Applications
Unit 1: CMOS Digital Logic
This chapter reviews aspects of ASICs and the process involved in designing
and fabricating them using the common CMOS process. This leads to the design
and implementation of common logic cells found in most digital designs including
combinational, sequential, and I/O cells.
1.1 Introduction
An Application Specific Integrated Circuit (ASIC) is one type of silicon
integrated circuit (IC) or chip. Other types are memories (DRAM and SRAM),
microprocessors and standard devices. In essence ASICs are devices made for
a specific application such as a mobile phone.
Commercial ICs are enclosed in a package, which can be made from either ceramic
or plastic. The silicon chip itself is typically 1cm square and contains several
million transistors. A 2-input NAND gate requires 4 transistors in CMOS, so
the number of gate equivalents is often quoted instead (typically a million gates
in 0.13µm technology).
For more information on the development of ICs read the first section of
chapter 1 of the textbook (Application-Specific Integrated Circuits
by Michael John Smith, ISBN: 0201500221, Publisher: Addison Wesley, web: www-ee.eng.hawaii.edu/~msmith/ASICs/HTML/ASICs.htm).
There are many types of ASICs so we will now have a brief look at each.
1.1.1 Types of ASICs
All ASICs are fabricated on silicon wafers (typically 8-12 inches in diameter)
by building up layers with different semiconductor characteristics in order
to produce transistors. The areas of each layer are defined by masking,
a process akin to very high definition photographic printing. Typically
lines with a minimum width of 0.13µm can be produced, hence the technology is referred
to as 0.13µm. The wafers are then cut into individual dies and mounted in
IC packages. In total the production of an ASIC typically takes 8-10 weeks.
A full custom chip is made specifically and uniquely for one application;
it is very expensive and only economic at very high volumes (millions/year).
To make ASICs economic at lower volumes the semi-custom concept was introduced,
where many applications share the same basic configuration of logic cells
and only the final interconnect stage differs between
the different chips. There are several types of semi-custom ASIC;
in succeeding sections we will look briefly at each type.
1.1.2 Full custom ASICs
In full custom the designer essentially works at the transistor level, designing
sub-systems and systems to meet the specification precisely. The chip may
have up to 100M transistors so very complex designs can be made. The designer may use
predefined cells if they are available and suitable but, in general, all or
part of the design has to be designed in this way. Therefore full custom
is slow (even with the use of modern CAD tools) and typical design times for
a large full custom chip can be several hundred man-years or more. It follows
that full custom designs are very expensive and the initial cost
can only be recouped if the chip volumes are very high.
As the electrical characteristics of digital systems are usually not very critical
for correct operation, full custom design of logic systems is not normally required these days.
However in analogue and mixed analogue/digital systems where the matching
and electrical characteristics are still very critical full custom is still
quite commonly used.
For more info read section 1.1.1 in the text book.
1.1.3 Standard cell based ASIC (or cell-based IC, CBIC)
This approach uses pre-defined and characterised standard logic cells (gates, flip-flops etc.)
and larger blocks such as processors.
The designer builds up the design (which may be very large, typically
over 1M gates) from these cells using powerful CAD tools. This makes the whole
process quicker and cheaper, so designs implemented in this way are economic
at lower volumes (typically 100k units/year) than full custom, albeit at a
loss of some flexibility for the designer.
Reference Texts
For more information read section 1.1.2 in the text book
1.1.4 Gate array ASIC (channelless array or sea of gates array)
In this case the transistors are arranged as a regular array on the ASIC
and the designer has only to design the final interconnect pattern using logic
cells in a cell library and powerful CAD tools. See Figure 1.2. One type,
known as Field Programmable Gate Arrays (FPGAs), can be programmed by the
user using a fuse-type technology as used in PLDs (see next section). For
further information on FPGAs read section 1.1.8 in the textbook.
There are many variants of the basic gate array, including a channelled array
(where the rows of cells are separated by channels, essentially to make the
interconnect easier) and a structured gate array (which combines some aspects
of CBIC and the gate array).
In essence gate arrays remove more of the design process from the ASIC designer,
making the process simpler and quicker at a loss of further flexibility. Hence
complex designs are possible in gate arrays, which are economic at lower volumes
still (typically 20k units/year or lower).
For more information on gate arrays read sections 1.1.3 - 1.1.6 in the
textbook.
1.1.5 Programmable logic devices (PLDs)
PLDs have been around for quite some time. They are based on ROM or PROM
technology, are limited to a small number of gate equivalents, typically 1000,
and are very easy to program using PLD software such as PALASM or CUPL.
Essentially all PLDs consist of an array of OR gates and an array of AND gates
with some macrocells, which are usually latches or flip-flops. Depending on
the type, one or other or both arrays are user programmable.
PLDs are very simple to use and reduce time/costs to very low levels making
a PLD solution economic at 100 gates or lower.
For more information please read section 1.1.7 in the textbook.
1.2 ASIC Design Flow
The sequence of steps in designing an ASIC (design flow) is shown in Figure
1.3. The steps will now be listed and a brief description given of each.
- Design entry: using either schematic entry or, more commonly now, a Hardware Description
Language such as VHDL or Verilog.
- Logic synthesis: use an HDL and a logic synthesis tool to produce a netlist, which is
a description of the components and their connections.
- System partitioning: divide the system into sensible blocks.
- Simulation: check the system for functionality.
- Floorplanning: arrange the blocks on the chip.
- Placement: decide on the locations of cells in a block.
- Routing: make the connections between cells and blocks.
- Circuit extraction: determine the resistances and capacitances of the interconnect.
- Post layout simulation: check the functionality again with the Rs and Cs of the interconnect
included.
Reference Texts
For further information please read section 1.2 in the textbook.
1.3 Case Study
Please read the case study on the Sun Microsystems SPARCstation in section
1.3 in the textbook.
1.4 ASIC Cost Comparisons
The economics of using ASICs in a product is an important consideration.
Different ASIC solutions have different fixed costs and variable costs. To
make realistic cost comparisons, costs must be up-to-date as they change often.
In the example we use 'typical' costs to illustrate the differences.
- Read sections 1.4, 1.4.1, 1.4.2 in the textbook relating to product costs
for different ASIC solutions.
- Figure 1.11 shows a break-even analysis for different ASIC types.
- Familiarise yourself with the constituent parts of fixed costs (spreadsheet
figure 1.12) - section 1.4.3 and variable costs (spreadsheet figure 1.14)
- section 1.4.4 in the textbook.
- Understand how the costs change with technology advances and product
maturity, see figure 1.15
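The break-even idea behind the cost comparison can be sketched in a few lines of Python. This is a minimal illustration only: the function names and all cost figures are hypothetical round numbers chosen for the example, not the values used in the textbook's spreadsheets. Total cost is modelled as fixed cost plus variable (per-part) cost times volume, and two solutions break even where their total costs are equal.

```python
def total_cost(fixed_cost, unit_cost, volume):
    """Total product cost = fixed cost + variable cost per part * number of parts."""
    return fixed_cost + unit_cost * volume


def break_even_volume(fixed_a, unit_a, fixed_b, unit_b):
    """Volume at which solutions A and B cost the same.

    Assumes A has the lower fixed cost and the higher unit cost
    (e.g. a programmable part versus a masked gate array).
    """
    if unit_a <= unit_b:
        raise ValueError("A must have the higher unit cost for a break-even to exist")
    return (fixed_b - fixed_a) / (unit_a - unit_b)


# Hypothetical figures (assumptions for illustration only)
FPGA_FIXED, FPGA_UNIT = 20_000, 40   # low fixed cost, expensive parts
MGA_FIXED, MGA_UNIT = 80_000, 10     # higher fixed cost, cheaper parts

volume = break_even_volume(FPGA_FIXED, FPGA_UNIT, MGA_FIXED, MGA_UNIT)
print(f"Break-even at {volume:.0f} parts")
```

Below the break-even volume the solution with the lower fixed cost is cheaper; above it the solution with the lower part cost wins.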
1.5 Cell Libraries
- Read section 1.5 in the textbook.
- Understand the importance of the cell libraries in ASIC design.
- Be familiar with the different types of cell libraries eg. physical,
behavioural etc. and their uses in the design process.
1.6 Websites of Interest
Internal & External Links
Flextronics
EDN
EDAC
Semiconductor Industries Association
Sematech
Fabless Semiconductor Association
1.7 CMOS Inverter
Reference Texts
See chapter 2 in the textbook
- Based on CMOS transistors.
- A CMOS transistor is essentially a switch with four terminals: gate (G),
source (S), drain (D) and substrate or bulk (B).
- Two types of CMOS transistors - n-channel and p-channel.
- Positive logic assumed - VDD is logic 1 (say +5V) and VSS is logic 0
(say 0V).
- n-channel transistor is ON with logic 1 at the gate and OFF with logic
0 at the gate.
- p-channel transistor is ON with logic 0 at the gate and OFF with logic
1 at the gate - see figure 2.1
- CMOS inverter formed by connecting n-channel and p-channel transistors
in series between VDD and VSS
- Operation is that inverter output is at logic 0 for logic 1 input and
at logic 1 for logic 0 input - see CMOS interactive exercise.
- In both logic states one transistor is OFF, hence no current flows
and power dissipation is very low (virtually zero) in both states. Major
advantage of CMOS!
- Other gates eg. NAND/NOR and more complex structures can be easily
designed - see figure 2.2 in the textbook.
- Theory of CMOS transistor operation complex - see sections 2.1, 2.1.1,
2.1.2, 2.1.4 in the textbook.
- Theory gives V-I equations for the n-channel transistor in saturation (VDS > VGS - VtN)
as equation 2.12 and in the linear region (VDS < VGS - VtN)
as equation 2.9.
- Equations 2.15 apply for the p-channel transistor.
- All V-I equations contain a β term (gain factor), which equals k (process
transconductance) times the width/length ratio of the transistor.
- β is important because it allows the current in a CMOS transistor to be varied by varying
device geometry (W/L) as well as terminal voltages.
- Theory and practical measurements agree well (see figure 1.4). Short-channel
transistors are normal in most ASIC devices.
- CAD program SPICE often used to characterise transistors or gates.
Parameters for a generic 0.5µm CMOS process given in Table 2.1 for
example.
- Due to transistor operation logic levels can be either strong or weak.
- n-channel transistor gives a strong '0' logic level but a weak '1' -
see section 2.1.4 and Figure 2.5 in the textbook.
- p-channel transistor gives a strong '1' logic level but a weak '0'.
- Use both transistors together in CMOS gates to give strong '0' and '1'
levels.
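The role of the gain factor in the V-I equations can be sketched numerically. The following is a minimal Python sketch assuming the standard long-channel (square-law) form of the n-channel equations; the function name and the default values for the threshold voltage and process transconductance are illustrative assumptions, not figures taken from Table 2.1.

```python
def nmos_drain_current(vgs, vds, w_over_l=1.0, vt=0.65, k_process=80e-6):
    """Square-law drain current (amps) of an n-channel transistor.

    vt (volts) and k_process (A/V^2) are assumed, illustrative values.
    Short-channel effects and subthreshold conduction are ignored.
    """
    beta = k_process * w_over_l          # gain factor = k * (W/L)
    overdrive = vgs - vt
    if overdrive <= 0:
        return 0.0                       # transistor is OFF
    if vds < overdrive:
        # Linear region: VDS < VGS - Vt
        return beta * (overdrive - vds / 2) * vds
    # Saturation: VDS >= VGS - Vt
    return 0.5 * beta * overdrive ** 2


# Doubling W/L doubles the current at the same terminal voltages
i_narrow = nmos_drain_current(vgs=3.0, vds=3.0, w_over_l=1.0)
i_wide = nmos_drain_current(vgs=3.0, vds=3.0, w_over_l=2.0)
print(i_narrow, i_wide)
```

The two regions meet smoothly at VDS = VGS - Vt, which is a useful sanity check on any hand-written version of the equations.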
1.8 CMOS Process
- See section 2.2 in the textbook
- CMOS is the most common fabrication process for ASICs. It is illustrated
in Figure 2.6 in the text book.
- The various mask/layer names together with MOSIS (US design house) mask
labels are given in Table 2.2
- Using these names Figure 2.7 in the textbook shows the layers required
to achieve a typical standard cell layout given in figure 1.3 (p8), together
with the complete cell layout and the phantom layout often used in ASIC
designs.
- 'Wells' of the opposite type of semiconductor to the substrate are used
in CMOS to allow the fabrication of p-type and n-type transistors on the
same substrate.
- Single, twin and triple-well processes available.
- In general the more wells in a process the more control over transistor
properties.
- In all cases n-wells must be connected to the most positive part of the
circuit (VDD) to ensure that substrate/source drain junctions are not
forward biased and p-wells must be connected to the most negative part of
the circuit (VSS) for the same reason.
- Often substrate connections not shown on circuit schematics but vital
for correct circuit operation.
- CMOS process for circuit depicted in figure 2.7 described in pages 52-55
of the textbook.
- Sheet resistance is a measure of the doping concentration of a semiconductor
layer: it is inversely proportional to the concentration.
- Sheet resistance is measured in ohms/square since layers are very shallow
compared to their widths/lengths.
- Typical values range from 1.1kΩ/sq for an n-well to 30 x 10^-3 Ω/sq
for metal - see Tables 2.3 and 2.4 in the textbook for an example set.
- Contact resistance (CR) - metal/silicon - often significant and process
steps taken to reduce CR and also improve contact reliability (see Tables
2.5 and 2.6 in the textbook).
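As a small worked example of ohms/square, the resistance of a rectangular track is the sheet resistance multiplied by the number of squares (length divided by width). The Python sketch below uses the two typical sheet resistance values quoted above; the function name and the track dimensions are assumptions made up for the illustration.

```python
def track_resistance(sheet_resistance, length, width):
    """Resistance (ohms) of a rectangular track.

    sheet_resistance is in ohms/square; length and width must share the
    same unit, because only their ratio (the number of squares) matters.
    """
    squares = length / width
    return sheet_resistance * squares


N_WELL = 1.1e3   # ohms/square, typical value quoted in the notes
METAL = 30e-3    # ohms/square, typical value quoted in the notes

# Hypothetical dimensions in micrometres
print(track_resistance(N_WELL, length=10, width=2))    # 5 squares of n-well
print(track_resistance(METAL, length=1000, width=1))   # 1000 squares of metal
```

Note that scaling length and width by the same factor leaves the resistance unchanged, which is why the unit is "per square".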
1.9 CMOS Design Rules
- Design rules are a comprehensive set of rules, derived by a foundry, which
state the minimum geometry of layers and their relations to other layers
to ensure correct circuit functionality after fabrication by the foundry.
- Often given in terms of λ (conventionally half the minimum feature size, which
is currently 0.13µm) so that the rules are scalable as technology develops and
feature size reduces.
- MOSIS is a US-based foundry service used by academic institutions; its
design rules (version 7) are shown in figure 2.11 and Tables 2.7, 2.8 and 2.9
in the textbook.
- Design rules are only of concern in full-custom ASIC designs; in other
ASICs the design rules are built into the layout and checking software
and it is therefore impossible for designers to contravene them.
1.10 Logic Cells
1.10.1 Combinational
Reference Texts
See section 2.4 in the textbook.
- Basic combinational gates (NAND, NOR, etc) can be made in CMOS (figure
2.2 in the textbook) as already discussed.
- More complex combinational cells comprising several gate combinations
such as AND-OR-INVERT (AOI) and OR-AND-INVERT (OAI) are much more efficient
in CMOS and often used in combinational design - see Table 2.10 in the textbook.
- Numbering notation based on number of inputs at first level and second
level often used - see Figure 2.12 in the textbook.
- Design procedure (pushing bubbles!) using networks of transistors (stacks)
used.
- Illustrated for the AOI221 in figure 2.13 of the textbook.
- Different hole and electron mobilities give rise to different transistor
gain factors βn and βp (equations 2.11 and 2.15 in the textbook).
- Equalise by adjusting (W/L) ratio of n and p type transistors to make
βn and βp equal (same drive strength) - see section 2.4.2 in the textbook.
- Cells in a library available in a range of drive strengths.
- For transistors in series or parallel, design procedures are more complicated
but essentially:
- for transistors in parallel, make all the lengths unity and add the
widths
- for transistors in series, make all the widths unity and add the lengths.
- For example applied to AOI221 in figure 2.13c in the textbook.
- Alternative combinational design approach based on CMOS transmission
gate (TX) exists for simple gates. (section 2.4.3 and figure 2.14
in the textbook).
- More efficient in terms of number of transistors but other considerations
may be important ie. charge sharing which may require extra buffering and
therefore extra transistors.
- Can design a 2/1 multiplexer using TX gates (figure 2.15 in the textbook)
- Comparison with design using OAI22 cell (figure 2.16 in the textbook)
shows little difference but for larger MUXs differences can become significant.
- Can then design EXC-OR cell from a 2/1 MUX and an OR gate (section 2.4.4
in the textbook).
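The series/parallel sizing rule given earlier in this section can be checked with a short calculation. In the Python sketch below each transistor is described by its shape factor W/L; the helper names and the example sizes are assumptions for illustration and are not read from figure 2.13c. Parallel transistors (equal lengths, widths add) combine by adding shape factors, while series transistors (equal widths, lengths add) combine by adding the reciprocals.

```python
def parallel_shape(shapes):
    """Equivalent W/L of transistors in parallel: the widths add."""
    return sum(shapes)


def series_shape(shapes):
    """Equivalent W/L of transistors in series: the lengths add."""
    return 1.0 / sum(1.0 / s for s in shapes)


# Two series n-channel transistors, each drawn with W/L = 2, behave like
# a single transistor with W/L = 1 (hypothetical sizes).
print(series_shape([2.0, 2.0]))

# Two parallel transistors with W/L = 1 each behave like one with W/L = 2.
print(parallel_shape([1.0, 1.0]))
```

This is why transistors in a series stack are drawn wider: a stack of n devices needs each device n times wider to match the drive strength of a single transistor.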
1.10.2 Sequential
Reference Texts
See section 2.5 in the textbook
- Synchronous design using a single system clock is nearly always used
since it is safer, compatible with CAD tools and usually guarantees that
the ASIC will work as simulated. Asynchronous designs are becoming popular
but are more difficult.
- Sequential cells have a memory or storage feature
- Simple Latches are transparent (ie. changes at inputs appear at the Q
output when the clock is high)
- Flip-flops are more complex (require at least two latches)
- Design a latch from TX gates - operation illustrated in Figure 2.17 in
the textbook.
- Design a flip-flop from two D latches (see fig 2.18 in the textbook)
- Clocked inverter easily designed from an inverter and one TX gate (see
fig 2.19 in the textbook) which can then replace inverters in latches and
FFs
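The difference between a transparent latch and a flip-flop built from two latches can be shown with a simple behavioural model. This Python sketch is an assumption-laden illustration: the class names are invented, and the clock polarity (master transparent when the clock is low, slave transparent when it is high) is chosen for the example and may differ from the circuit in figure 2.18.

```python
class DLatch:
    """Level-sensitive D latch: Q follows D while enable is high."""

    def __init__(self):
        self.q = 0

    def update(self, d, enable):
        if enable:
            self.q = d        # transparent: input passes to output
        return self.q         # otherwise the stored value is held


class DFlipFlop:
    """Master-slave flip-flop made from two D latches."""

    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d, clk):
        # Master is transparent while clk is low, slave while clk is high,
        # so Q only changes when clk goes from low to high.
        m = self.master.update(d, enable=not clk)
        return self.slave.update(m, enable=bool(clk))


ff = DFlipFlop()
for d, clk in [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]:
    print(d, clk, ff.update(d, clk))
```

With the clock held high, changes at D do not reach Q in the flip-flop, whereas a single latch with its enable high would pass them straight through.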
1.10.3 Datapath Logic Cells
Reference Texts
See section 2.6 in the textbook
- Datapath structures exploit the regularity of functions such as adders,
multipliers, subtractors, barrel shifters etc. to give efficient VLSI implementations.
For example, the outputs of a 1-bit full adder are usually expressed as:
Sum = A ⊕ B ⊕ CIN
and Cout = A.B + A.CIN + B.CIN
- These can be expressed in terms of the PARITY function (where the output
is 1 if an odd number of inputs are 1) and the MAJORITY function
(where the output is 1 if the majority of the three inputs are 1) as:
Sum = PARITY (A,B,CIN)
and Cout = MAJORITY (A,B,CIN).
- Hence, using these functions, form efficient single logic cells as shown in
figure 2.20a in the textbook.
- Layout in figure 2.20c with data running horizontally and control signals
vertically and array structure shown in figure 2.20d, all in the textbook.
- Extend easily to a 4-bit full adder (ADD4) by connecting four full adder
cells as shown in figure 2.20b in the textbook.
- Datapath refers to the layout of buswide logic operating as in the ADD
element described earlier (datapath cell or element).
- Datapath cells are usually stored in a library and are the same size
so more complex datapath cells can be created (expandable).
- Usually orientated so that increasing size in bits grows the datapath
in height; adding different elements to increase the function grows the
datapath in width.
- Datapath implementations are regular and interconnect included in the
cells.
- Disadvantages are control overheads and the requirement to pre-design
the datapath cells themselves.
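The PARITY/MAJORITY form of the full adder, and its extension to ADD4, can be sketched in Python as follows. The function names are chosen for this illustration; the 4-bit adder simply chains four 1-bit cells through their carries, which is one straightforward reading of the structure described above rather than a model of the textbook layout.

```python
def parity(a, b, c):
    """1 if an odd number of the inputs are 1."""
    return (a + b + c) % 2


def majority(a, b, c):
    """1 if at least two of the three inputs are 1."""
    return 1 if (a + b + c) >= 2 else 0


def full_adder(a, b, cin):
    """Return (sum, cout) for a 1-bit full adder."""
    return parity(a, b, cin), majority(a, b, cin)


def add4(a, b, cin=0):
    """4-bit adder built from four full adder cells; returns (sum, cout)."""
    carry = cin
    total = 0
    for i in range(4):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


print(add4(0b0110, 0b0111))   # 6 + 7
```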
1.10.4 Datapath Cells
1.10.4.1 Adder
Reference Texts
See section 2.6.1 and 2.6.2 in the textbook
- Symbols given in figure 1.21
- Table 2.11 reviews the common binary arithmetic operations (add, subtract
etc) for the four common binary number representations (unsigned, signed
magnitude, ones' complement, two's complement).
- Often addition is in terms of generate G(i) and propagate P(i) signals
- see section 2.6.2 in the textbook for an explanation and equivalences.
- Form a ripple carry adder (RCA) conventionally (figure 1.22a) or using
the generate/propagate approach (figure 1.22b).
- Other faster adders are based on carry-save (CSA) (figure 2.23 in the
textbook), carry-bypass (CBA), carry-skip and, the most well-known, carry-lookahead (CLA).
- Brent-Kung adder reduces the delay and increases the regularity of CLAs
(figure 2.24 in the textbook).
- Fastest adders are based on carry-select leading to the conditional sum
adder (CSA) (see figure 2.25 in the textbook).
- Graphs of normalized delays versus number of bits (figure 2.26a in the
textbook) show ripple-carry to be the slowest, carry-select faster and
carry-save the fastest.
- Graphs of areas versus bits (figure 2.26b in the textbook) show that
ripple-carry and carry-save take up about the same area
whereas carry-select takes up about twice as much area.
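The generate/propagate approach can be illustrated with a small carry-lookahead model. The Python sketch below assumes the common definitions G(i) = A(i).B(i) and P(i) = A(i) ⊕ B(i), which may differ in detail from the equivalences listed in the textbook; the function names are invented for the example. Each carry is computed directly from the G and P signals and the carry-in, instead of waiting for the previous stage's carry.

```python
def carry_into(i, g, p, cin):
    """Carry into bit i, expanded in terms of G, P and cin only."""
    # Term in which cin propagates through every lower bit
    carry = cin
    for k in range(i):
        carry &= p[k]
    # Terms in which bit j generates a carry that propagates up to bit i
    for j in range(i):
        term = g[j]
        for k in range(j + 1, i):
            term &= p[k]
        carry |= term
    return carry


def cla_add(a, b, cin=0, width=4):
    """Carry-lookahead addition; returns (sum, cout)."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # generate
    p = [x ^ y for x, y in zip(a_bits, b_bits)]   # propagate
    total = 0
    for i in range(width):
        total |= (p[i] ^ carry_into(i, g, p, cin)) << i
    return total, carry_into(width, g, p, cin)


print(cla_add(0b1011, 0b0110))
```

In hardware the expanded carry terms are evaluated in parallel, which is where the speed advantage over the ripple-carry adder comes from; the cost is the rapidly growing number of terms as the width increases.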
1.10.4.2 Multipliers
Reference Texts
See section 2.6.4 in the textbook
- Multiplication is a series of additions and shifts.
- Use a number of adders to form an array multiplier - see figure 2.27
in the textbook for a 6-bit multiplication illustration.
- Performance determined by the number of partial products and the addition
of partial products.
- Use canonical signed-digit vectors (CSDs) to reduce the number of add/subtract
operations and replace some additions by shifts.
- Further improvement by using Booth encoding (partial products reduced
by a factor of 2 improves speed and area utilisation).
- Improve speed still further by using Wallace-trees and Dadda multipliers
(Figures 2.28, 2.29 and 2.30 in the textbook)
- Several considerations apply in the choice of parallel multiplier architecture
eg. overall speed, power dissipation, implementation (cell or full custom),
pipelining etc.
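To see how a canonical signed-digit vector reduces the number of add/subtract operations, the following Python sketch recodes a non-negative integer into digits from {-1, 0, +1} with no two adjacent non-zero digits. The function names are illustrative assumptions, and the recoding shown is the usual non-adjacent-form procedure rather than a method taken from the textbook.

```python
def to_csd(n):
    """Return the CSD digits of a non-negative integer, least significant first."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n % 4)   # +1 if n mod 4 == 1, -1 if n mod 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def csd_multiply(x, multiplier):
    """Multiply using one shifted add or subtract per non-zero CSD digit."""
    result = 0
    for shift, d in enumerate(to_csd(multiplier)):
        if d == 1:
            result += x << shift
        elif d == -1:
            result -= x << shift
    return result


digits = to_csd(15)
print(digits, sum(1 for d in digits if d))   # 15 = 16 - 1
print(csd_multiply(7, 15))
```

Multiplying by 15 in plain binary needs four partial products, whereas the CSD form 16 - 1 needs only one shift-and-add and one subtraction.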
1.10.4.3 Other datapath operations
Reference Texts
See section 2.6.6 in the textbook
- Combinational and sequential datapath cells eg. NAND/NOR and FFs/latches
available and operate identically to standard forms.
- Subtractors and adder/subtractors essentially adders with modified control
lines.
- Barrel shifters rotate or shift input data stream by a specified amount
- used in floating point arithmetic.
- Other floating point arithmetic operators include leading-one detectors,
priority encoders, accumulators and registers of various types.
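A barrel shifter is commonly organised as a cascade of stages, each of which either passes the data unchanged or rotates it by a fixed power of two, selected by one bit of the shift amount. The Python sketch below models a left-rotating version under the assumption that the word width is a power of two; the function name and the staged structure are illustrative and are not taken from the textbook.

```python
def barrel_rotate_left(value, amount, width=8):
    """Rotate `value` left by `amount` bits using log2(width) stages.

    Assumes width is a power of two; only the low log2(width) bits of
    `amount` are used, so the rotation is taken modulo the width.
    """
    mask = (1 << width) - 1
    value &= mask
    stage = 0
    while (1 << stage) < width:
        if (amount >> stage) & 1:
            step = 1 << stage            # this stage rotates by 1, 2, 4, ...
            value = ((value << step) | (value >> (width - step))) & mask
        stage += 1
    return value


print(bin(barrel_rotate_left(0b10010011, 3)))
```

An 8-bit rotation therefore needs only three stages (rotate by 1, 2 and 4), regardless of the shift amount requested.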
1.10.5 I/O Cells
Reference Texts
See section 2.7 in the textbook.
- Popular tri-state bi-directional output buffer available (Figure 1.33)
- Many other buffers available and easily designed
- I/O cells have to be designed to withstand electrostatic discharge (ESD),
i.e. static electricity brought about by handling. Require input pads to be tied
to structures that clamp the input voltage to below the gate breakdown voltage
(around 10V for a 100Å gate oxide)
1.10.6 Compiled Cells
Reference Texts
See section 2.8 in the textbook.
- Essentially automatic layout tools which generate regular structures
of variable sizes such as RAM, ROM, multipliers.
- Often linked to a model compiler (for behavioural verification) and a
netlist compiler (for layout level verification).
- Complete system produces blocks that are correct by design.
Self Assessment Questions
Please attempt problems 1.2 - 1.6 in section 1.7 of the textbook.
Self Assessment Questions
Please attempt questions 2.3, 2.4, 2.5, 2.6, 2.7 (not f), 2.12, 2.14(i),
2.18, 2.19, 2.21, 2.23, 2.24, 2.25, 2.27, 2.28, 2.29, 2.30, 2.31 (first
part) and 2.34.
Updated 20.06.06 RA