FPGA Class Notes (Student Copy)

The Islamia University of Bahawalpur, Pakistan University College of Engineering & Technology

FPGA Based System Design (Telecommunication Engineering)

FPGA Based System Design

What is FPGA?

FPGA=Field Programmable Gate Array.

Gate Array: an array (matrix) of logic gates that can be arranged to perform any possible (combinational and/or sequential) logic function.

Programmable: they can be programmed to perform any function.

Field Programmable: they can be programmed “in the field”, that is, the device as it comes out of the production line is not committed to any specific functionality. This greatly shortens project turnaround time and allows fast prototyping of large IC projects.

“A device that can be reprogrammed / reconfigured to perform a required task”

A Field Programmable Gate Array (FPGA) is a digital integrated circuit (IC) that contains configurable (programmable) blocks of logic along with configurable (programmable) interconnects between these blocks.

Evolution of Programmable Devices

Programmable devices have gone through a long evolution to reach the complexity that they have today. Programmable logic devices (PLDs) are divided into 3 basic architecture types: SPLD, CPLD and FPGA.



PROMs

The first and simplest PLDs were PROMs, which appeared in 1970.

A PROM is an array logic device with a fixed AND-plane and a programmable OR-plane.

Programmable fusible links

PROM can be used to implement any block of combinational logic

Limited Inputs & Outputs.
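The reason a PROM can implement any combinational function is that the inputs act as an address and the stored word acts as the outputs. The following is a minimal Python sketch, assuming a hypothetical 3-input, 2-output PROM programmed as a full adder; the names program_prom and read_prom are illustrative only and do not come from any vendor tool.

```python
from itertools import product

def program_prom(num_inputs, func):
    """Build the PROM contents: one stored word per input combination.

    The fixed AND-plane decodes every one of the 2**num_inputs addresses;
    the programmable OR-plane is modelled by the stored output words.
    """
    contents = []
    for bits in product((0, 1), repeat=num_inputs):
        contents.append(func(*bits))
    return contents

def read_prom(contents, *inputs):
    """Look up the output word for the given input bits (MSB first)."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return contents[address]

def full_adder(a, b, cin):
    """Function to be stored (illustrative): returns (sum, carry_out)."""
    total = a + b + cin
    return (total & 1, total >> 1)

prom = program_prom(3, full_adder)
print(len(prom))                 # 8 stored words for 3 inputs
print(read_prom(prom, 1, 1, 0))  # (0, 1)
```

The table doubles in size with every extra input, which is one way to see why PROMs are limited in the number of inputs they can support.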

PAL (Programmable Array Logic)

Introduced in the late 1970s.

Opposite to PROMs.

Programmable AND array and predefined OR array

Faster than PLA because only one of their arrays is programmable.

GAL (Generic Array Logic)

Similar to PAL

Generic array logic (GAL) devices offered sophisticated CMOS electrically erasable (E²) variations on the PAL concept.

PLA (Programmable Logic Array)

PLA is an array with programmable AND-plane and programmable OR-plane.

PLAs were one-time programmable chips containing AND and OR gates, able to implement simple logic functions.

A PLA device can be defined by three parameters:

o number of inputs,

o number of AND gates (terms),

o number of OR gates (= number of outputs).
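To make the two programmable planes concrete, here is a small Python sketch of a PLA, assuming a hypothetical device with 2 inputs, 3 AND terms and 2 outputs programmed as a half adder. The data layout (dictionaries for terms, index lists for outputs) is an assumption made for illustration.

```python
def pla_eval(and_plane, or_plane, inputs):
    """Evaluate a PLA for one input vector.

    and_plane: list of product terms; each term maps input index -> required
               value (1 = true literal, 0 = complemented literal). Inputs not
               listed are "don't care" for that term.
    or_plane:  list of outputs; each output lists the term indices it ORs.
    """
    terms = [all(inputs[i] == v for i, v in term.items()) for term in and_plane]
    return [int(any(terms[t] for t in out)) for out in or_plane]

# Hypothetical programming: inputs are (a, b).
and_plane = [
    {0: 1, 1: 0},  # term 0: a AND NOT b
    {0: 0, 1: 1},  # term 1: NOT a AND b
    {0: 1, 1: 1},  # term 2: a AND b
]
or_plane = [
    [0, 1],  # output 0 = term 0 OR term 1 (sum, i.e. XOR)
    [2],     # output 1 = term 2           (carry, i.e. AND)
]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, pla_eval(and_plane, or_plane, [a, b]))
```

The three defining parameters appear directly: the length of the input vector, the number of entries in and_plane, and the number of entries in or_plane.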

CPLD

The next step up from the SPLD in the evolution and complexity of programmable devices is the CPLD, or complex programmable logic device. A CPLD contains many SPLD-like (PAL-like) devices interconnected via a programmable switch matrix. These SPLD-like devices are called logic blocks, and each contains many SPLD-like macrocells. The general block diagram of a CPLD is shown in the figure.



FPGA (Field Programmable Gate Array)

A more advanced programmable logic device than the CPLD is the Field Programmable Gate Array (FPGA). An FPGA is more flexible than a CPLD, allows more complex logic implementations, and can be used to implement digital circuits that use the equivalent of several million logic gates.

An FPGA is like a CPLD, except that its logic blocks, which are linked by wiring channels, are much smaller than those of a CPLD, and there are far more of them. FPGA logic blocks consist of smaller logic elements. A logic element has only one flip-flop, which is individually configured and controlled. The logic complexity of a logic element is only about 10 to 20 equivalent gates. A further enhancement in the structure of FPGAs is the addition of memory blocks that can be configured as general-purpose RAM.

The figure shows the general structure of an FPGA. As shown in the figure, an FPGA is an array of many logic blocks that are linked by horizontal and vertical wiring channels. FPGA RAM blocks can also be used for logic implementation, or they can be configured to form memories of various word sizes and address spaces. Linking of logic blocks with the I/O cells and with the memories is done through wiring channels. Within logic blocks, smaller logic elements are linked by local wires.
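The idea that one RAM block can form memories of various word sizes and address spaces comes down to dividing a fixed number of bits in different ways. A rough Python sketch follows, assuming a 16,384-bit block; the capacity and the list of widths are assumptions chosen for illustration, since real block sizes and legal widths are device specific.

```python
def ram_configurations(total_bits, widths):
    """List (depth, width) organisations for a RAM block of fixed capacity."""
    configs = []
    for width in widths:
        if total_bits % width == 0:
            configs.append((total_bits // width, width))
    return configs

# Assumed capacity and widths, for illustration only.
for depth, width in ram_configurations(16384, [1, 2, 4, 8, 16]):
    print(f"{depth} words x {width} bits")
```

A wider word halves the number of addressable words each time the width doubles, so the designer trades address space against word size.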



Fine-, Medium- and Coarse-Grained Architectures:

The concept refers to the elementary block that can be programmed. In fine-grained architectures it is a very simple logic function (e.g. NAND gate, FF). In coarse-grained architectures there is more logic (e.g. 4-input LUT + MUXes + FFs + fast carry logic).

Fine-grained architectures require more connections (they might be acceptable for very irregular logic, as in state machines). For this reason, most FPGAs are coarse- or medium-grained.

Programming an FPGA:

An FPGA can be programmed through various technologies such as fusible-link, anti-fuse, SRAM-based and FLASH-based.

Fusible link: The device is built with a large number of fuses, all of which are initially intact. The process of removing fuses (eliminating the connections) is referred to as programming the device.

Anti-fuse Technologies: Similar to fuses, but in this case the connections are initially missing everywhere, and programming creates the required ones. Anti-fuses integrate more easily with existing manufacturing processes, so they were preferred to fuses. The technology uses an amorphous silicon column (non-conductive) that can be converted to a polysilicon via by passing a sufficiently large current through it. The disadvantage of anti-fuse is that programming physically modifies the device, forcing it into a configuration that cannot be modified further. Devices based on this technology are One Time Programmable (OTP). This might be a bad choice if you want to reuse the device for prototyping different designs. The advantage is that the new connections do not add much delay (low-resistance wires). Only a few devices nowadays use anti-fuse.



EEPROM/FLASH based Technology: EEPROM is electrically erasable and programmable. An EEPROM cell is about 2.5 times larger than an equivalent EPROM cell. The name Flash reflects its fast erasure compared to EPROM. Using FLASH guarantees re-programmability and, at the same time, non-volatility. Furthermore, the manufacturing process is more standard than that of anti-fuses.

SRAM-based Technology: Static Random Access Memory is one of the two types of RAM. Programming is achieved by storing a programming bit in a static RAM cell that controls a switch (in the simplest case a MOS transistor). If the bit is 1, the switch is closed; if it is 0, the switch is open.
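As a rough illustration of SRAM bits controlling switches, the Python sketch below models one wire that can be driven from several candidate sources, each through its own switch. The function name and the rule that at most one switch may be closed are modelling assumptions, not a description of any particular device.

```python
def routed_signal(config_bits, sources):
    """Return the value seen on a wire driven through SRAM-controlled switches.

    config_bits[i] == 1 closes the switch from sources[i] onto the wire;
    0 leaves it open. This model allows at most one closed switch.
    """
    closed = [i for i, bit in enumerate(config_bits) if bit == 1]
    if not closed:
        return None  # no switch closed: the wire is left undriven
    if len(closed) > 1:
        raise ValueError("more than one source drives the wire")
    return sources[closed[0]]

sources = [0, 1, 1, 0]  # current logic values on four candidate wires
print(routed_signal([0, 1, 0, 0], sources))  # 1: second source connected
print(routed_signal([0, 0, 0, 1], sources))  # 0: rewritten to fourth source
```

Changing the routing only requires rewriting the stored bits, which is why the same device can be reconfigured repeatedly.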

The advantage of SRAM-based programming is that the same device can be reconfigured an indefinite number of times, as the configuration is stored in memory cells that can be rewritten. The main disadvantage is that programming has to occur every time the device is powered up, thus delaying the startup of the system. It also increases area and delay (an SRAM cell plus MOS switch is much larger than a poly via and introduces a larger delay). SRAM-based FPGAs are the dominant FPGAs (Xilinx, Altera).



Comparison of Different Technology Types

Feature | SRAM | Anti-fuse | E2PROM / FLASH

Technology | State-of-the-art | One or more generations behind | One or more generations behind

Reprogrammable | Yes (in system) | No | Yes (in-system or offline)

Reprogramming speed (including erasing) | Fast | - | 3x slower than SRAM

Volatile (must be programmed on power-up) | Yes | No | No

Requires external configuration file | Yes | No | No

Good for prototyping | Yes (very good) | No | Yes (reasonable)

Instant on | No | Yes | Yes

Size of configuration cell | Large (six transistors) | Very small | Medium-small (two transistors)

Power consumption | Medium | Low | Medium

Types of FPGA

FPGAs can be categorized as:

Anti-fuse/fuse-based (e.g. Actel)

SRAM-based (e.g. Xilinx, Altera)

SRAM-based FPGAs are the most popular because of their speed and reconfigurability.

The widely used Xilinx FPGAs are typically SRAM-based devices.

SRAM-based FPGA

Can be programmed many times

Must be programmed at power up, usually by means of external memory.

Can be secured by encrypting the configuration bitstream.

Anti-fuse based FPGA

Can only be programmed off-line

Non-volatile (configuration remains when power is off)

No additional memory needed, which reduces cost

Immune to radiation and noise

Very low power consumption

Uses less area than SRAM-based devices

Extra programming circuitry required

Main disadvantage: one-time programmable (OTP), hence SRAM-based devices are mostly used



Configuring FPGAs:

Configuring an FPGA consists of “loading” the correct bits into the various programmable parts of the device: LUTs, inter-block connections, switchboxes, input/output selections, etc. Irrespective of the type of technology, this means that large quantities of configuration bits need to be transferred to the device. There are multiple techniques for performing this transfer.

Configuring an Anti-fuse device: This requires a special device programmer to which the device is attached, and which is in turn connected to a computer that uploads the configuration file. All the anti-fuses to be activated are accessed sequentially and “burned”. The programming cannot be done in-system (on the final board) because the special programmer is needed, and no reprogramming is possible. This is the easiest configuration technique.

Configuring an SRAM based FPGA: Because the device loses its configuration at every power-off, more complicated strategies are necessary to make sure that programming can be performed “in-system” at every power-on cycle (or possibly even multiple times during a single session). An adequate representation of the configuration logic is that of a very long shift register that connects all the programming bits in a chain and can be accessed externally through a serial input and a serial output. Clearly, such serial programming can take a long time: for example, with 25M bits and a 25 MHz programming clock it takes one second, which is ages from a digital perspective. Note that if a LUT is used as a distributed RAM or a shift register, it can be reset to any preferred initial value. Similarly, the embedded RAMs (e-RAMs) are part of the programming bits and can be initialized.
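The shift-register view of the configuration logic can be sketched in a few lines of Python. This is a simplified model, assuming a hypothetical 8-bit chain; real chains hold millions of bits and carry framing and control information that is ignored here.

```python
def shift_in(chain_length, bitstream):
    """Shift a serial bitstream into a configuration chain, one bit per clock.

    Returns (chain, clocks). The first bit shifted in ends up at the far end
    of the chain; the bit dropped from the far end is the serial output.
    """
    chain = [0] * chain_length
    clocks = 0
    for bit in bitstream:
        chain = [bit] + chain[:-1]  # every cell passes its value to the next
        clocks += 1
    return chain, clocks

chain, clocks = shift_in(8, [1, 0, 1, 1, 0, 0, 1, 0])
print(chain)   # [0, 1, 0, 0, 1, 1, 0, 1]
print(clocks)  # 8
```

Because one bit enters per clock, the number of clocks equals the number of configuration bits, which is why 25M bits at 25 MHz take one second.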

In many cases, multiple chains are in fact present that can be selectively programmed. This allows interesting tricks such as configuring only portions of the device. A quick re-initialization is typically possible for the FFs. This of course does not re-initialize the e-RAMs or d-RAMs (distributed RAMs); those original bits are “lost”.

Configuration Modes: In general, from the external interface, various modes of configuration are possible. The desired mode is selected using dedicated configuration pins (typically 3):

Serial Load with FPGA as Master: The configuration is stored in an external PROM (now typically a dedicated FLASH device). Such a PROM has a single data-out pin that is hardwired to the programming data input of the FPGA. The FPGA provides a reset and a clock that tell the PROM to start sequentially outputting its content. In this configuration, if multiple devices need to be programmed, the data output of the Master FPGA can be connected to the data input of the next one, so that a single PROM suffices for the whole board. The successive FPGAs in the daisy chain have to be configured as Slaves.

Parallel Load with FPGA as Master: To speed up configuration, data is transferred in parallel (8 bits or more). In this case the FPGA also provides the address to the PROM, which therefore does not need to be a special device. This technique was very popular in the early days of FPGAs, when dedicated devices were expensive. Nowadays, there are special devices that do not require the address from the FPGA.

Parallel Load with FPGA as Slave: In this case the control of the transfer is handed to an external device, for example a microprocessor that generates all necessary control signals. The advantage is that the microprocessor can choose the specific configuration file to program, and therefore adapt the FPGA to the system's needs.

Serial Load with FPGA as Slave: The microprocessor (or other device) still reads the programming bits from the memory, but then serializes them into a 1-bit input. This also allows daisy chaining of FPGAs with the first used as master. When a microprocessor is used, the advantage is that fewer FPGA pins are hardwired for configuration, so more pins remain available for general-purpose I/O.

JTAG Programming: Most FPGAs are also furnished with a JTAG (Joint Test Action Group) interface that is traditionally used to facilitate device testing by implementing boundary scan. In FPGAs, additional JTAG commands connect the configuration chain to the JTAG chain, thus allowing programming through JTAG.

Embedded Processor: It is possible to connect the embedded processor JTAG to the external JTAG chain, so that configuration can start by configuring the processor core, which is then left with the responsibility of completing the configuration of the device.
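To compare the serial and parallel modes above, the following Python sketch estimates the transfer time as a function of data bus width. It reuses the 25M-bit, 25 MHz figures from the serial example in these notes and assumes, for simplicity, that both modes run at the same clock rate and that there is no protocol overhead; the function name load_time_ms is illustrative.

```python
def load_time_ms(num_bits, clock_hz, bus_width=1):
    """Time to transfer the configuration, in milliseconds.

    bus_width is the number of data bits moved per clock:
    1 for the serial modes, 8 (or more) for the parallel modes.
    """
    clocks = -(-num_bits // bus_width)  # ceiling division
    return 1000.0 * clocks / clock_hz

BITS = 25_000_000   # configuration size from the serial example
CLOCK = 25_000_000  # 25 MHz configuration clock

for mode, width in [("serial", 1), ("parallel x8", 8)]:
    print(f"{mode}: {load_time_ms(BITS, CLOCK, width):.0f} ms")
# serial: 1000 ms
# parallel x8: 125 ms
```

Under these assumptions an 8-bit bus cuts the load time by a factor of eight, at the cost of more pins dedicated to configuration.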



Structure of FPGA

A Xilinx Logic CELL:

The core building block in a modern FPGA from Xilinx is called the Logic Cell (LC). It is the most elementary programmable logic element in a Xilinx device. It is composed of:

A 4-input LUT, which can also act as a 16x1 RAM or a 16-bit shift register (Spartan-3 uses 4-input LUTs)

A multiplexer

A flip-flop

Clock, clock enable and set/reset inputs, and two outputs (registered and non-registered)
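The dual use of a 4-input LUT as logic and as a 16x1 RAM follows from its structure: it is simply 16 stored bits addressed by the four inputs. A minimal Python sketch is given below; the class LUT4 and its method names are illustrative assumptions, not Xilinx primitives.

```python
class LUT4:
    """4-input look-up table: 16 configuration bits, one per input combination."""

    def __init__(self, init_bits):
        if len(init_bits) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 bits")
        self.bits = list(init_bits)

    @classmethod
    def from_function(cls, func):
        """Fill the table by evaluating func(a, b, c, d) for all 16 inputs."""
        bits = []
        for addr in range(16):
            a, b, c, d = (addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
            bits.append(int(func(a, b, c, d)))
        return cls(bits)

    def read(self, a, b, c, d):
        """Logic mode: the inputs form an address into the table."""
        return self.bits[(a << 3) | (b << 2) | (c << 1) | d]

    def write(self, a, b, c, d, value):
        """RAM mode: the same 16 cells used as a 16x1 memory."""
        self.bits[(a << 3) | (b << 2) | (c << 1) | d] = value & 1

lut = LUT4.from_function(lambda a, b, c, d: (a and b) or (c ^ d))
print(lut.read(1, 1, 0, 0))  # 1
print(lut.read(0, 0, 1, 1))  # 0
```

Any function of four inputs fits in the same 16 bits, so the delay through the LUT does not depend on how complicated the function is.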

SLICE:

The next step up the hierarchy is the Slice. A Slice contains two Logic Cells. The LUT and MUX of each LC have separate inputs and outputs, while the LCs in a slice share the control inputs (set/reset, clock and clock enable).



Configurable Logic Blocks (CLBs):

Multiple slices are grouped in one configurable logic block (CLB). Each CLB can contain two or four slices. A CLB is connected to other CLBs using programmable interconnect.

The logic block hierarchy is as follows:

LC --------> Slice (2 LC) ---------> CLB (4 slices)

The reason for this hierarchy is interconnect speed: interconnects between LCs within a slice are the fastest, interconnects between neighboring slices within a CLB are slightly slower, and interconnects between CLBs are slower still.
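The hierarchy gives a quick way to estimate logic capacity from a CLB count. The short Python sketch below uses the 2 LCs per slice and 4 slices per CLB stated in these notes; the 100-CLB device is a made-up example, not a real part.

```python
LCS_PER_SLICE = 2   # from the notes: a slice contains two logic cells
SLICES_PER_CLB = 4  # from the notes: a CLB contains (up to) four slices

def logic_cells(num_clbs, slices_per_clb=SLICES_PER_CLB, lcs_per_slice=LCS_PER_SLICE):
    """Total logic cells = CLBs x slices per CLB x logic cells per slice."""
    return num_clbs * slices_per_clb * lcs_per_slice

print(logic_cells(100))                    # 800
print(logic_cells(100, slices_per_clb=2))  # 400
```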

Embedded RAMs:

A lot of applications require the use of memory. FPGAs include relatively large chunks of embedded RAM called block RAM. The capacity of the block RAM varies from a few hundred thousand bits to several million bits depending on the chip. The blocks can be used for a variety of purposes.



Embedded Multipliers, Adders, MACs, etc:

Some functions, like multipliers, are slow if they are implemented by connecting a large number of programmable logic blocks together; therefore many FPGAs incorporate special hardwired multiplier blocks.

Some FPGAs provide embedded adder blocks, and some also include embedded MAC (multiply-and-accumulate) blocks.
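A multiply-and-accumulate step computes acc + a * b, and repeating it over a sequence gives a sum of products, the core operation of FIR filters and dot products. The Python sketch below only illustrates the arithmetic; the function names are illustrative, and a hardware MAC block would operate on fixed-width operands, which is not modelled here.

```python
def mac(accumulator, a, b):
    """One multiply-accumulate step: accumulator + a * b."""
    return accumulator + a * b

def dot_product(xs, ys):
    """Sum of products built from repeated MAC steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc

print(dot_product([1, 2, 3], [4, 5, 6]))  # 32
```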

General-Purpose I/O:

FPGA packages can have 1000 or more pins, arranged across the base of the package. The general-purpose I/O signals of an FPGA are divided into a number of banks. Each bank can be configured individually to support a particular I/O standard.



IP core (intellectual property core)

An IP (intellectual property) core is a block of logic or data that is used in making a field programmable gate array (FPGA) or application-specific integrated circuit (ASIC) for a product. As essential elements of design reuse, IP cores are part of the growing electronic design automation (EDA) industry trend towards repeated use of previously designed components. Ideally, an IP core should be entirely portable, i.e. easy to insert into any vendor technology or design methodology. Universal asynchronous receiver/transmitters (UARTs), central processing units (CPUs), Ethernet controllers, and PCI interfaces are all examples of IP cores.

IP cores fall into one of three categories: hard cores, firm cores, or soft cores.

Hard cores

Built-in resources

Cannot modify design

Speed and size cannot be enhanced

Device dependent

Best suited for plug-and-play applications

Less portable and flexible

Example: Block RAM, Multiplier, PowerPC etc.

Soft Core

Fully flexible

Design can be modified

Device Independent

Code can be modified

Example: Divider, UART, Microcontroller, MicroBlaze etc.

Firm Core

Partially flexible

Design cannot be modified

Implementation can be modified

Features of Soft & Hard cores

FPGA Characteristics

In brief, FPGAs can be specified and compared using the following:

Number of Logic Cells (number of 4-input LUTs and associated flip-flops).

Number (and size) of embedded RAM blocks

Number (and size) of embedded Multipliers.

Number (and size) of embedded adders.

Number (and size) of MACs (Multiply and Accumulate).

Availability of Hardware embedded Microprocessor Cores

Number of inputs and number of pins, etc.

FPGA Manufacturers

FPGA devices are produced by a number of semiconductor companies like:

Xilinx

o Spartan Series

o Virtex Series

Actel

Altera

Atmel

Lattice and QuickLogic

FPGA Device Families

Spartan-3 Package
