FPGAs are manufactured by several companies. Each vendor offers device families that differ in device architecture, programming technology, internal signal routing, power, logic capacity, voltage, input/output support, packaging, and so on. Additional resources such as ALUs, memory, decoders, and fixed multipliers may be available. Essentially, however, they are all variations of a general architecture with programmable interconnections, I/O blocks, logic blocks, and clock resources.
This unit covers the various programming technologies used with PLDs and examines the general internal structure of FPGA devices.
There are two main categories of FPGA device: reprogrammable and one-time programmable (OTP). Four common technologies are used to program an FPGA device. These are summarised in the table below.
| Programming Technology | Description |
|---|---|
| EPROM | This method is the same as EPROM based memories. It is non-volatile but must be programmed out of circuit. |
| EEPROM | This method is the same as for EEPROM memories. It is non-volatile but can be reconfigured. The device must be out of circuit to be programmed. |
| Anti-fuse | This type of device is programmed by 'growing' fusible links in the interconnect. Configuration is non-volatile and can be programmed in circuit. This device can only be programmed once (OTP). |
| SRAM | This type of device is configured by an external device at power up. This could be non-volatile memory or even a microprocessor. The device can be configured in circuit and the configuration is volatile. That is, the configuration must be downloaded again each time power has been removed from the device. |
EPROM stands for Erasable Programmable Read Only Memory. This technology is based on a standard MOS transistor with an additional floating gate that is isolated between layers of oxide. An EPROM transistor is shown in Fig 1.

In its unprogrammed state the floating gate is uncharged. To program the EPROM, a high voltage of around 12V is applied between the gate and the drain. This voltage causes the transistor to be turned hard on. Excited electrons are pushed through the thin oxide layer and trapped on the floating gate, giving it a negative charge. This charge remains when the programming voltage has been removed and persists for around 20 years under normal operating conditions. The stored charge on the floating gate inhibits normal operation of the transistor. We can use this characteristic to form a memory cell as shown in Fig 2.
When the cells are unprogrammed, making a row line active places every column line connected to that row through a transistor in a logic 0 state. When a cell is programmed, the charge on its floating gate disables the transistor, so the cell appears to store a logic 1.

An EPROM cell is erased by discharging the cell's floating gate. The energy required to do this is provided by an ultraviolet (UV) light source. The EPROM device is manufactured with a quartz window on the top to allow the UV light to penetrate. Once the device has been programmed it is normal to cover this window with adhesive tape to prevent accidental erasure. To erase the device it is first removed from the circuit, the quartz window is uncovered, and the device is placed in an enclosed container with a high intensity UV source.
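The program, read, and erase behaviour described above can be modelled in a few lines of Python. This is a behavioural sketch only, not a circuit simulation. The class name `EpromArray` and its methods are hypothetical, and the logic levels follow the convention used in this unit (an unprogrammed cell reads 0, a programmed cell reads 1).

```python
class EpromArray:
    """Behavioural model of a small EPROM array (hypothetical names)."""

    def __init__(self, rows, cols):
        # True means the cell's floating gate holds trapped charge.
        self.charged = [[False] * cols for _ in range(rows)]

    def program(self, row, col):
        # Programming can only add charge to a cell; it cannot remove it.
        self.charged[row][col] = True

    def read_row(self, row):
        # Convention from this unit: uncharged reads 0, charged reads 1.
        return [1 if cell else 0 for cell in self.charged[row]]

    def uv_erase(self):
        # UV exposure discharges every floating gate in the device at once.
        for row in self.charged:
            for col in range(len(row)):
                row[col] = False


eprom = EpromArray(rows=2, cols=4)
eprom.program(0, 1)
eprom.program(0, 3)
word_after_program = eprom.read_row(0)

eprom.uv_erase()
word_after_erase = eprom.read_row(0)
```

Note that the model has no way to clear a single cell: as with the real device, the only route back to the unprogrammed state is a bulk erase.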
Electrically Erasable Programmable Read Only Memory (EEPROM, sometimes written as E2PROM) is similar to EPROM but has the distinct advantage that the cell can be erased electrically. This is accomplished by adding a second transistor to the memory cell, as shown in Fig 3.

In EEPROMs:
Anti-fuse is a technology where each configurable path has an associated programmable link. In its unprogrammed state the anti-fuse has a very high impedance. The device is supplied in this state, in which the silicon linking the two metal tracks acts as an insulator. Applying pulses of relatively high voltage and current converts the insulating silicon to conducting polysilicon, effectively 'growing' a link between the metal tracks. This is illustrated in Fig 4. Once this link has been grown it cannot be removed, so this technology is known as one-time-programmable or OTP.

SRAM or static RAM is one type of semiconductor memory. The other is dynamic RAM or DRAM. A DRAM memory cell is simply formed from a capacitor / transistor pair. The term dynamic is used because the capacitor loses its charge and must therefore be periodically refreshed to keep its data. This refreshing operation is very complex and requires a large amount of additional circuitry. The extra cost of the refresh circuitry can only be justified if the memory device is very large. For these reasons DRAM is not applicable to programmable logic devices. Instead, static RAM is a popular medium for configuring FPGAs. The term static implies that once the cell has been programmed to a particular state it will remain unchanged until it is either re-programmed or power is removed. An SRAM cell consists of several transistors configured as a latch whose output drives an additional control transistor.
The four major programming technologies for PLDs have been discussed. These technologies continue to be developed and new ones introduced. Two such technologies are Magnetic RAM and Flash EPROM. EPROM and EEPROM are widely used in programming SPLDs. EEPROM is used for some FPGAs. Anti-fuse and SRAM are widely used in FPGAs. The three FPGA programming technologies are summarised in the table below.
| Feature | EEPROM | Anti-fuse | SRAM |
|---|---|---|---|
| Reprogrammable | Yes | No | Yes |
| Reprogramming Speed | Medium | N/A | Fast |
| Volatile | No | No | Yes |
| External Configuration File | No | No | Yes |
| Prototyping | Good | Bad | Good |
| Instant-On | Yes | Yes | No |
| Security | Good | Good | Poor |
| Configuration Cell | Medium | Small | Large |
| Power Consumption | Medium | Low | Medium |
| Radiation Susceptibility | Low | Low | High |
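The comparison table can be used as a simple lookup when shortlisting a technology for a project. The sketch below encodes four rows of the table (Reprogrammable, Volatile, External Configuration File, and Instant-On) as a Python dictionary and filters on them. The names `TECHNOLOGIES` and `shortlist` are hypothetical and chosen for illustration only.

```python
# Four rows of the comparison table, keyed by programming technology.
TECHNOLOGIES = {
    "EEPROM": {
        "reprogrammable": True, "volatile": False,
        "external_config": False, "instant_on": True,
    },
    "Anti-fuse": {
        "reprogrammable": False, "volatile": False,
        "external_config": False, "instant_on": True,
    },
    "SRAM": {
        "reprogrammable": True, "volatile": True,
        "external_config": True, "instant_on": False,
    },
}


def shortlist(**required):
    """Return the technologies whose features match every requirement."""
    return [
        name
        for name, features in TECHNOLOGIES.items()
        if all(features[key] == value for key, value in required.items())
    ]


# Prototyping needs a device that can be reprogrammed.
for_prototyping = shortlist(reprogrammable=True)

# A fielded product may need to work immediately at power up.
for_instant_on = shortlist(volatile=False, instant_on=True)
```

The first query returns EEPROM and SRAM, which matches the Prototyping row of the table; the second returns EEPROM and anti-fuse.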
Examples of SRAM FPGA families include the following:
Examples of antifuse FPGA families include the following:
Examples of flash FPGA families include the following:
Examples of hybrid flash/SRAM FPGA families include the following:
By far the most common programming technologies employed on FPGAs at present are anti-fuse and SRAM.
SRAM based FPGAs have the advantage of being re-programmable. This is a very attractive feature during the development and debugging of an application. SRAM based FPGAs can also be reprogrammed while in the system. This allows the device to be upgraded in the field very easily. SRAM FPGAs allow small memories to be constructed, although large memories are not at present cost effective. SRAM FPGAs can also be used as a re-configurable computing medium, where computers contain FPGAs and algorithms can be compiled and run in the FPGA.
The re-programmability of an SRAM based FPGA is also a disadvantage in some application areas. An example is a military application where the device is required to be non-volatile and immune to radiation and power glitches. Anti-fuse technology would be appropriate for this type of application.
The biggest disadvantages of an SRAM based programmable device are the amount of real estate consumed by the programmable latches and the loss of the configuration data when power is removed, which makes re-programming necessary on each power up.
Anti-fuse devices also have the advantage of lower power consumption than SRAM devices. Anti-fuse FPGAs also offer a security advantage over SRAM devices. Since SRAM devices need to be reprogrammed at power up, these systems always need an external memory device on board. It is therefore relatively easy for someone to copy the design by simply reading the bit stream delivered by the external device. It is impossible to read the configuration data from an anti-fuse device.
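The volatility of an SRAM based device can be illustrated with a small behavioural model. This is only a sketch under simple assumptions: the class `SramFpga`, its methods, and the two-byte `bitstream` value are all hypothetical, and a real configuration interface is considerably more involved.

```python
class SramFpga:
    """Behavioural model of SRAM configuration volatility (hypothetical)."""

    def __init__(self):
        # The configuration latches hold nothing until a download occurs.
        self.config = None

    def power_up(self, external_memory):
        # An external device (for example non-volatile memory) downloads
        # the bit stream into the configuration latches.
        self.config = bytes(external_memory)

    def power_down(self):
        # SRAM is volatile, so the configuration is lost with the power.
        self.config = None

    def is_configured(self):
        return self.config is not None


bitstream = b"\x5a\xa5"  # placeholder data, not a real bit stream

fpga = SramFpga()
fpga.power_up(bitstream)
configured_after_boot = fpga.is_configured()

fpga.power_down()
configured_after_power_loss = fpga.is_configured()

fpga.power_up(bitstream)  # the download must be repeated on every power up
configured_after_reboot = fpga.is_configured()
```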
FPGAs are produced by several manufacturers, each offering many different families of devices. Some of these families have been fine-tuned for a particular application area in an effort to address the competitive environment that exists between producers. Despite these differences there are also similarities between devices in terms of design architecture and features. The basic architecture associated with many FPGAs is covered in this section. The three main components of a typical FPGA are programmable interconnect, I/O blocks, and logic blocks. These and the more advanced features are illustrated in Fig 5. The first three sub-sections that follow cover the basic components of an FPGA device. The later sections cover the more advanced features of devices.

FPGA logic blocks have different architectures between the various manufacturers. They may even differ between families from the same manufacturer. It is not surprising therefore that each manufacturer calls its logic blocks by a different name. Some of the names you will see are logic cell, slice, macrocell, and logic element. In this module we will simply refer to this element as a logic block. A typical logic block contains one or more look up tables, one or more flip flops, carry logic, and signal routing muxes. A simplified logic block is illustrated in Fig 6. This logic block consists of a 4-input look up table (LUT), mux and control logic, and a flip flop.

A look up table (LUT) with N inputs can implement any Boolean function of N or fewer inputs. Most LUT architectures have four inputs. The LUT is simply a memory element, so the delay through it is constant regardless of the Boolean function implemented. Figure 7 shows the relationship between a Boolean function implemented as logic gates and a LUT.
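The idea that a LUT is just a small memory addressed by its inputs can be sketched in Python. The helper names `make_lut` and `lut_read`, and the example function, are hypothetical and chosen for illustration; they do not come from any vendor tool.

```python
def make_lut(func, n_inputs=4):
    """Precompute the truth table of func as a list of 2**n_inputs bits."""
    table = []
    for address in range(2 ** n_inputs):
        # Bit i of the address is the value applied to input i.
        inputs = [(address >> i) & 1 for i in range(n_inputs)]
        table.append(func(*inputs) & 1)
    return table


def lut_read(table, *inputs):
    """Evaluate the function with a single indexed read of the table."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return table[address]


# Hypothetical example function: y = (a AND b) OR (c XOR d)
def example(a, b, c, d):
    return (a & b) | (c ^ d)


lut = make_lut(example)
y = lut_read(lut, 1, 1, 0, 0)
```

Evaluating the function is a single memory read whatever `example` computes. This is why the delay through a LUT does not depend on the complexity of the function it implements.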

The output from the look up table can either feed the flip flop or can directly feed out of the logic block. Logic blocks can also be grouped together to form a larger structure.
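The choice between a registered and a direct output can be modelled as a LUT followed by a flip flop and an output mux. The sketch below is a simplified behavioural model, assuming a 4-input LUT stored as a 16-entry list; the class `LogicBlock` and its attribute names are hypothetical.

```python
class LogicBlock:
    """Simplified logic block: 4-input LUT, flip flop, and output mux."""

    def __init__(self, lut_table, registered):
        self.lut_table = lut_table    # 16 bits, one per input combination
        self.registered = registered  # mux select: True = flip flop output
        self.q = 0                    # flip flop state
        self._d = 0                   # current LUT output

    def evaluate(self, a, b, c, d):
        # The four inputs form the address into the look up table.
        address = a | (b << 1) | (c << 2) | (d << 3)
        self._d = self.lut_table[address]
        return self.output()

    def clock(self):
        # On a clock edge the flip flop captures the LUT output.
        self.q = self._d

    def output(self):
        # The mux selects either the registered or the direct LUT output.
        return self.q if self.registered else self._d


and4 = [0] * 15 + [1]  # truth table of a 4-input AND gate

direct = LogicBlock(and4, registered=False)
direct_out = direct.evaluate(1, 1, 1, 1)

latched = LogicBlock(and4, registered=True)
before_clock = latched.evaluate(1, 1, 1, 1)
latched.clock()
after_clock = latched.output()
```

The direct block responds as soon as its inputs change, while the registered block only presents the new value after the next clock edge.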
Two types of logic block were initially offered by the FPGA companies: fine-grained and coarse-grained. Fine-grained structures contain very few logic components and compare to fine-grained ASICs. Each of these logic blocks was used to implement only a very simple function, for example a simple logic gate and a storage element. This type of structure was gradually phased out in favour of the coarse-grained FPGA, in which a logic block may contain several 4-input LUTs, several muxes, several flip flops, and some fast carry logic. The move from fine-grained to coarse-grained structures came about because of the large number of connections required into and out of each block of a fine-grained structure. As the granularity increases, the number of connections into the blocks decreases relative to the amount of functionality. Some companies are now offering very coarse-grained devices where a logic block is capable of executing very complex functions, for instance an FIR digital filter.
Figure 8 shows a Cyclone II Logic Element. The features of each Logic Element are:
The clock input to the flip flops can only come from the global clock signal. This is a change from earlier architectures, which allowed flip flops to be clocked from the output of combinational logic. That allowed asynchronous designs, which created many problems that will be discussed in Unit 5.
A Cyclone II I/O structure is shown in Fig 9. The I/Os support the following features.
Cyclone II device IOEs contain a bidirectional I/O buffer and three registers for complete embedded bidirectional single data rate transfer. The IOE contains one input register, one output register, and one output enable register.
The input register can be used for fast setup times and output registers for fast clock-to-output times. Additionally, the output enable (OE) register can be used for fast clock-to-output enable timing. The Quartus II software automatically duplicates a single OE register that controls multiple output or bidirectional pins. You can use IOEs as input, output, or bidirectional pins.
The specific I/O standard is implemented on the FPGA through the selection in the design tool. This is usually done when assigning pin locations for specific signals with the particular design tool suite.
Signals need to be routed between the logic blocks and the input output blocks. There are typically three levels or components of routing resources. These are local, switch, and long and are illustrated in Fig 10.
The local routing resource connects a logic block 'locally' to its closest neighbouring logic blocks. This allows a more complex function that will not fit into a single logic block to be mapped across one or several other logic blocks.
The second routing resource is the switch matrix. The switch matrix also connects to the logic blocks (not shown for simplicity) and allows signals to be routed through either 90 or 180 degrees, so that logic blocks that are relatively far from each other can be connected. The disadvantage of this is that each signal passed through a switch matrix is subject to some delay. If a signal passes through many switch matrices, the routing delay could be larger than the actual delay of the logic being implemented.
The third type of routing resource, long, is used to connect critical logic blocks that are physically far from each other without causing too much delay. These would be used for critical path logic. These long lines can also be used as buses.
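The trade-off between switched routing and long lines comes down to simple arithmetic. The sketch below uses invented delay figures purely for illustration. The constants `LUT_DELAY_PS`, `SWITCH_DELAY_PS`, and `LONG_LINE_DELAY_PS` are assumptions and are not taken from any data sheet.

```python
# Assumed delays in picoseconds, for illustration only.
LUT_DELAY_PS = 500        # delay through one logic block
SWITCH_DELAY_PS = 300     # delay added by each switch matrix on the route
LONG_LINE_DELAY_PS = 800  # delay of one long line spanning the same distance


def switched_route_delay(hops):
    """Routing delay for a signal that passes through `hops` switch matrices."""
    return hops * SWITCH_DELAY_PS


via_switches = switched_route_delay(6)
via_long_line = LONG_LINE_DELAY_PS

# With these figures the routing delay exceeds the logic delay itself.
routing_dominates = via_switches > LUT_DELAY_PS
```

With these assumed numbers, six switch matrices add 1800 ps of routing delay, more than three times the logic delay, whereas the long line covers the same distance in 800 ps.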
Only a few routing channels are shown for simplicity. The actual number of routing channels varies between FPGA families and manufacturers.
Logic blocks can also be used for routing purposes. If the design is large and complex there may not be sufficient conventional routing resources. In this case a logic block can be used to simply pass on the signal to another line without any logic change. This increases the routing resources but will introduce additional delays.

Dedicated I/O blocks with special high drive clock buffers are distributed around the FPGA device. These are known as clock drivers. These buffers connect to clock input pads and drive the clock signals onto global clock lines. These clock lines are designed for low skew and fast propagation times. The global clock lines generally start in the middle of the chip and branch out into smaller regions of the device. FPGA devices are generally divided into four or more clocking regions. The master global clock can feed all the clocking regions. Regional clocking may also be available, which feeds only a particular clocking region. This is illustrated in Fig 11. All clocks are distributed around the chip using steering logic and buffers. Distributing clock signals around the chip in this way ensures that all the flip flops are clocked at almost exactly the same time.

There are two memory resources available within FPGAs, distributed and block memory. Distributed memory makes use of look up tables (LUTs) which are in fact implementations of SRAM memory blocks. Block memory is made up of dedicated SRAM blocks within the FPGA.
Higher specification devices typically have larger dedicated memory blocks. Block RAM stores relatively large amounts of data more efficiently than distributed RAM. Distributed RAM is better used for buffering small amounts of data anywhere along the data path.
The big advantage of distributed RAM is that it is everywhere on the chip, which means that the Place and Route tool can put it near the logic that it is driving.
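Because a LUT is itself a small SRAM, several LUTs side by side form a small word-wide memory. The sketch below assumes 4-input LUTs, each holding 16 bits, and builds a 16-word RAM by using one LUT per data bit. The class names `LutRam` and `DistributedRam` are hypothetical.

```python
class LutRam:
    """One 4-input LUT used as a 16 x 1-bit RAM."""

    def __init__(self):
        self.bits = [0] * 16

    def write(self, address, bit):
        self.bits[address & 0xF] = bit & 1

    def read(self, address):
        return self.bits[address & 0xF]


class DistributedRam:
    """16 words of `width` bits, built from `width` LUT RAMs in parallel."""

    def __init__(self, width):
        self.luts = [LutRam() for _ in range(width)]

    def write(self, address, value):
        # Each LUT stores one bit position of the word.
        for i, lut in enumerate(self.luts):
            lut.write(address, (value >> i) & 1)

    def read(self, address):
        return sum(lut.read(address) << i for i, lut in enumerate(self.luts))


ram = DistributedRam(width=8)
ram.write(3, 0xA5)
stored = ram.read(3)
untouched = ram.read(4)
```

An 8-bit wide, 16-word buffer costs eight LUTs in this model, which shows why distributed RAM suits small buffers and why larger memories are better placed in block RAM.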
Multipliers can be implemented by connecting many programmable logic blocks together. Although this is perfectly acceptable, there are serious speed constraints because of delays in the routing matrix. Since many applications require multipliers and multipliers built from logic blocks are slow, many FPGA manufacturers now provide dedicated hardwired multipliers on chip. These multipliers typically accept two 18-bit words as inputs and produce a 36-bit product. The dedicated multipliers are generally located close to the block RAM because it is often necessary to store the result of the operation.
As FPGA devices and architectures continue to evolve and applications become more and more demanding, many new advanced features are now being offered by manufacturers. This is especially the case since many companies now want their product to be first on the market and are therefore making the move from ASIC to FPGA.
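The 36-bit product width follows directly from the operand width, as a short check shows. The sketch below treats both operands as unsigned 18-bit values; the variable names are chosen for illustration.

```python
WIDTH = 18

# Largest unsigned value that fits in an 18-bit word.
max_operand = (1 << WIDTH) - 1

# Largest product the multiplier can ever produce.
max_product = max_operand * max_operand

# Number of bits needed to hold that product.
product_bits = max_product.bit_length()
```

In general the product of two N-bit words needs at most 2N bits, so an 18 x 18 multiplier needs a 36-bit output to avoid overflow.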
Some of the advanced features currently being offered are:
Use the internet to download typical FPGA data sheets from different manufacturers and examine the structures and features for each one.
Identify the range of characteristics for currently available FPGA devices in terms of number of pins, number of logic cells, and number of flip flops.
What are the two main types of FPGA?
What type of FPGA would be more suited to rapid system prototyping?
Investigate what is meant by 'clock skew'.
Updated 25/09/2008 KS