Embedded FPGA

Embedded Control Using FPGA

Seminar Report

Submitted in partial fulfillment of the requirements

for the degree of

Master of Technology

by

V. Sornam Viswanathan

Roll No: 05323005

under the guidance of

Prof. P.S.V Nataraj


Interdisciplinary Programme in Systems and Control

Engineering

Indian Institute of Technology, Bombay

Mumbai

November 2005

Acknowledgements

I wish to thank my guide Prof. P.S.V Nataraj for his valuable guidance.

I also wish to thank Mr. M. Suresh of Systems and Control Engineering for

all his help.

Abstract

FPGA stands for Field Programmable Gate Array. It is an integrated circuit that can be configured by the user in order to implement digital logic functions of varying complexities. FPGAs can be used very effectively for control purposes in processes demanding very low loop cycle times. The implementation of a digital controller in an FPGA can be parallel, resulting in very high speeds of operation. This fact enables FPGAs to score over general purpose computing chips like DSP chips, which have a limited number of Multiplier ACcumulator (MAC) units that can be used for the controller design. This report looks at the serial and parallel implementations of a PID controller on an FPGA as suggested by [1]. A PID controller was specifically chosen so that the issues and the trade offs involved in the implementation of controllers could be understood. A more efficient implementation of a PID controller using Distributed Arithmetic (DA) is then looked at. This makes use of the Look Up Table (LUT) rich characteristics of the FPGA. The final part of the report discusses recent trends in FPGA based control, like reconfiguration and co-design implementation of controllers. Mention is made of the implementation of a fuzzy logic based controller for the truck backer-upper problem as suggested by [2]. The implementation of a Model Predictive Controller using co-design techniques as suggested by [3] is also looked into.

Contents

1 Introduction
1.1 FPGA Architecture: An Overview
1.1.1 FPGA Architecture
1.1.2 Programmable Logic Block
1.1.3 Interconnect Resources
1.1.4 Input Output Blocks (IOB)

2 FPGA In Control
2.1 FPGA In Control-Motivation
2.1.1 FPGA vs DSP processors

3 FPGA In Control-Case Studies
3.1 FPGA based PID control
3.1.1 Parallel Design
3.1.2 Serial Design
3.2 A More Efficient PID Implementation

4 Recent Trends In FPGA Based Control
4.1 Reconfigurable FPGA In Control Applications
4.2 Co-design Implementation of Controllers

5 Conclusion

List of Figures

1.1 FPGA Schematic [4]
1.2 Xilinx FPGA-CLB Schematic [8]
1.3 FPGA Interconnection schematic [6]
2.1 Implementation of filter on a Conventional DSP [5]
2.2 Implementation of filter on a FPGA [5]
2.3 Diagram illustrating loop cycle time [4]
3.1 Parallel Implementation of PID in FPGA [1]
3.2 Serial Implementation of PID in FPGA [1]
3.3 DA Implementation Schematic [7]
4.1 Compile Time Reconfiguration [2]
4.2 Global Run Time Reconfiguration [2]
4.3 Local Run Time Reconfiguration [2]

Chapter 1

Introduction

FPGA based control is a problem, the solution for which involves the synergy of developments in control as well as FPGA technology. In order to effectively understand the need for FPGA based embedded control, the capabilities of FPGAs and the reasons for using FPGA based embedded controllers in the midst of a bewildering array of embedded controllers must be analyzed. Hence in the first chapter the architecture of the FPGA is studied. The second chapter deals with the motivation for the use of FPGAs in control.

1.1 FPGA Architecture: An Overview

FPGA is an acronym for Field Programmable Gate Array. It belongs to a class of user programmable digital devices called Programmable Logic Devices (PLDs). A programmable logic device is an integrated circuit that the user can configure in many ways, enabling the implementation of digital logic functions of varying sizes and complexities. PLDs can be classified into various categories:

1. Simple programmable logic devices (SPLD)

(a) Programmable logic array (PLA): A programmable logic array is an integrated circuit that contains two levels of programmable logic: an AND plane and an OR plane.

(b) Programmable array logic (PAL): A PAL is an integrated circuit that contains a programmable AND plane followed by a fixed OR plane.

2. Complex Programmable Logic Device (CPLD)

3. Field Programmable Gate Array (FPGA)

1.1.1 FPGA Architecture

The typical FPGA consists of the following components:

1. Programmable Logic blocks

2. Interconnection Resources

3. Input output blocks

The general schematic of an FPGA is as shown in the figure :

Figure 1.1: FPGA Schematic [4]


1.1.2 Programmable Logic Block

The programmable logic block in a typical FPGA consists of Configurable Logic Blocks (CLBs). The CLB can be realized in many ways, one of them being the Look Up Table (LUT) based CLB. The LUT is a one bit wide memory array. The memory address lines are the inputs to the LUT and the one bit output is the LUT output. Thus a LUT with K inputs acts as a 2^K by 1 bit memory, and the user can directly implement any K input function by programming the function's truth table into the LUT [8].

Figure 1.2: Xilinx FPGA-CLB Schematic [8]

Figure 1.2 shows a generalized CLB that can be used for implementing a wide range of logic functions of up to nine inputs, two separate four input logic functions, and many other possibilities. The CLB also has a D flip flop that can be used to implement sequential logic functions. The CLB also has features that support the integration of entire systems. It has certain specialized circuitry that enables it to perform arithmetic operations like addition, multiplication etc. in a fast and efficient manner. Users can also configure the LUTs in the CLB as read/write RAM locations. Some FPGAs also allow configuration of their LUTs as dual port RAMs, with one write and two read inputs. The chips also include very wide AND planes around the periphery of the CLB array to facilitate the implementation of wide decoders. Some modern FPGAs also include entire microcontrollers on the chip, enabling easier implementation of complicated logic functions on a single chip. This is especially suited for control applications [8].
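The idea that a K input LUT is simply a 2^K by 1 bit memory addressed by its inputs can be illustrated with a small software model. The following is only a sketch: the function names `program_lut` and `read_lut` and the majority function used as an example are assumptions made for illustration, and do not correspond to any vendor tool or to the CLB of [8].

```python
def program_lut(func, k):
    """Return the 2**k entry truth table (the LUT contents) of a k-input Boolean function."""
    table = []
    for address in range(2 ** k):
        # Bit i of the memory address is taken as input i of the LUT.
        inputs = [(address >> i) & 1 for i in range(k)]
        table.append(func(*inputs) & 1)
    return table


def read_lut(table, inputs):
    """Evaluate the LUT: the inputs form the memory address, the stored bit is the output."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return table[address]


def majority(a, b, c):
    """Illustrative 3-input function: output is 1 when at least two inputs are 1."""
    return int(a + b + c >= 2)


# "Programming" the LUT means storing the truth table of the function.
lut = program_lut(majority, 3)
```

Once the table has been stored, evaluating the function needs no logic gates at all, only a memory read such as `read_lut(lut, [1, 0, 1])`. Changing the stored bits changes the function without changing the hardware.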

1.1.3 Interconnect Resources

The other important feature that decides the performance of the FPGA and its suitability for control applications is its interconnect resources. This is because the interconnection resources allow the implementation of an entire digital system by providing a means of connecting the individual circuits (subsystems) that have been implemented on different CLBs in an FPGA. The interconnect resources in a typical FPGA can be classified as [9]:

1. General Purpose Interconnects: Signals between CLBs and Input Output Blocks (IOBs) can be routed through switch matrices as they travel along the horizontal and vertical interconnect lines.

2. Direct Interconnects: Adjacent CLBs are interconnected directly.

3. Long Lines: Long lines provide for high fan out, low skew distribution of signals that must travel relatively long distances. They span the entire length or width of the interconnect area. They are typically used for clock signals.

FPGA interconnects are normally unsegmented; i.e. each wiring segment

spans only one logic block before it terminates in a switch box. A switch

box is a switching matrix that contains programmable interconnections to

all the wiring segments that terminate inside it. By turning on some of the

programmable switches within a switch box, longer paths can be constructed

[6]. Figure 1.3 shows a typical FPGA interconnection scheme.


Figure 1.3: FPGA Interconnection schematic [6]

1.1.4 Input Output Blocks (IOB)

The IOB provides the interface between the FPGA and real world signals. The IOB consists broadly of I/O pads. Each I/O pad connects to one of the pins on the IC package so that external signals can be input to or output from the array of logic cells. The IOB also contains tristate buffers, which enable signals to be input to and output from the logic array. Flip flops are provided so that the input and output values can be stored within the IOB. Each IOB also has a variety of other features, like programmability of the input threshold to respond to either TTL or CMOS logic levels. It also incorporates slew rate control of the output signal and includes internal pull up resistors to avoid floating inputs [9].

The FPGA can be a fine grained or a coarse grained device. A fine grained FPGA consists of a large number of small width programmable logic resources that can be used to implement a variety of functions. A typical example of such an FPGA would be the Atmel AT40K. A coarse grained FPGA like the Xilinx Virtex series consists of a smaller number of more powerful logic blocks built from LUTs and flip flops. Modern FPGAs also come with features like Low Voltage Differential Signalling (LVDS) and support programmability of the input threshold to respond to LVTTL, LVCMOS etc. They also provide Digitally Controlled Impedance (DCI) features. Most FPGAs also include Peripheral Component Interconnect (PCI) support, by which they can be connected to a general purpose computer or form a part of a larger development board. FPGAs are also JTAG compliant, i.e. they support the IEEE 1149.1-1990 boundary scan architecture, which enables test data to be serially loaded into the device and test results to be serially read out. JTAG can also be used for loading configuration bit streams into the FPGA. Another important feature that FPGAs possess is In System Programming (ISP), which enables the FPGA to be programmed while it is a part of the end target system. This eliminates the need to physically remove the chip from the system and makes programming easy.


Chapter 2

FPGA In Control

2.1 FPGA In Control-Motivation

The question that we must answer before we proceed is: with a plethora of embedded devices available for digital control, why must one go in for embedded control using FPGAs? This can be answered by looking at the following advantages that FPGAs possess. Most of the computations in control involve two operations: the multiply operation and the accumulate operation. Together these operations are called Multiply ACcumulate (MAC) operations. The computational load on any kind of digital controller is at its maximum when it is performing these operations. Hence the sampling rate, and hence the speed, is limited by the rate at which the device performs these computations. In a general purpose microprocessor the processor's resources are held up while it is busy performing these MAC operations, and the speed or the sampling rate is decided by the latency of these instructions.


2.1.1 FPGA vs DSP processors

In order to ensure a fair comparison between FPGAs and general purpose processors, let us examine the implementation of a digital filter. Many of the controllers that are designed are ultimately implemented as digital filters. Hence, in order to illustrate the power of the FPGA, let us look at the specific implementation of a 256 tap filter on a typical DSP processor and on an FPGA. The conventional DSP processor is a general purpose programmable device that typically has 1-4 MAC units along with barrel shifters and other circuits optimized for efficient computations [5]. Figure 2.1 gives a good idea of the factors involved.

Figure 2.1: Implementation of filter on a Conventional DSP [5]

The conventional DSP is a serial device. Let us for the time being assume that it has a single MAC unit. A 256 tap filter involves 256 MAC operations per sample. Hence with a single MAC unit completing one MAC operation per clock cycle, it takes 256 clock cycles for each output to be computed in a typical DSP processor. In order to improve the system throughput, we have to look at other options like using a higher frequency clock. This increases the system complexity and the cost. Moreover, the chances of clock skew problems occurring are also higher with high frequency clocks.

On the other hand, let us look at the same filter implemented on a typical FPGA. Refer to figure 2.2. This shows the most important feature of an FPGA: parallelism. The FPGA contains a large number of gates and millions of transistors. Hence we can implement the filter in a parallel manner as shown. The implementation consists of 256 registers and 256 multiplier units, along with adders that sum the partial products. Hence what took 256 clock cycles in a DSP can be completed in a single clock cycle in an FPGA, provided the device has enough resources for all 256 multipliers. This results in a tremendous improvement in the time taken to compute each output sample.
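The difference between the two approaches can be made concrete with a small cycle counting model. This is a sketch under stated assumptions, not a model of any particular DSP or FPGA: one MAC operation is counted as one clock cycle, the fully parallel structure is counted as a single cycle, and the coefficient and sample values are arbitrary placeholders.

```python
def fir_serial(coeffs, window):
    """Single MAC unit model: one multiply-accumulate per clock cycle."""
    acc = 0
    cycles = 0
    for c, x in zip(coeffs, window):
        acc += c * x   # one MAC operation
        cycles += 1    # counted as one clock cycle (modelling assumption)
    return acc, cycles


def fir_parallel(coeffs, window):
    """Fully parallel model: all products are formed side by side, then summed."""
    products = [c * x for c, x in zip(coeffs, window)]  # one multiplier per tap
    return sum(products), 1                             # counted as one clock cycle


TAPS = 256
coeffs = [1] * TAPS          # placeholder coefficients (assumption)
window = list(range(TAPS))   # placeholder for the last 256 input samples (assumption)

y_serial, serial_cycles = fir_serial(coeffs, window)
y_parallel, parallel_cycles = fir_parallel(coeffs, window)
```

Both functions return the same filter output; only the cycle count differs (256 against 1 in this model). A practical parallel filter would usually pipeline the adder tree, which adds latency but still delivers one output per clock cycle.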

Figure 2.2: Implementation of filter on a FPGA [5]

Now let us look at some of the other features that FPGA based embedded control offers. The speed of a control system impacts its performance, stability, robustness and disturbance rejection characteristics. Faster control systems are typically more stable, easier to tune, and less susceptible to changing conditions and disturbances. To provide stable and robust control, a control system must be able to measure the process variable and set an actuator output command within a fixed period of time. The computational performance of the FPGA is so high that the control loop rate is typically limited only by the sensors, actuators, and I/O modules. This is in stark contrast to traditional control systems, where the processing performance was typically the limiting factor.

One of the most important parameters involved in the performance measurement of digital control systems is the loop cycle time. Loop cycle time is the time taken to execute one cycle of the control loop [4]. It is the time that elapses between sampling the output, computing the controller output according to the control algorithm, and sending the control signal to the actuator. In figure 2.3, T is the loop cycle time.

Figure 2.3: Diagram illustrating loop cycle time [4]

Because of the inherent parallelism present in the FPGA, very low loop cycle times are possible. Another common measure of control system performance and robustness is jitter, which is a measure of the variation of the actual loop cycle time from the desired loop cycle time. In general purpose operating systems such as Windows, the jitter is unbounded, so closed loop control system stability cannot be guaranteed [4]. Processor-based control systems with real-time operating systems are commonly able to guarantee control loop jitter of less than 100 microseconds. In FPGA based systems the control loop does not need to share hardware resources with other tasks, and control loops can be precisely timed using the FPGA clock. The jitter of FPGA-based control loops depends on the accuracy of the FPGA clock source. It is typically of the order of picoseconds.
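As a small illustration of the definition of jitter given above, the following sketch computes the worst case deviation of the measured loop cycle time from the desired one. The function name `loop_jitter` and the time stamps are hypothetical; they are not measurements from [4].

```python
def loop_jitter(timestamps, desired_period):
    """Worst case deviation of the actual loop cycle time from the desired loop cycle time."""
    periods = [t1 - t0 for t0, t1 in zip(timestamps, timestamps[1:])]
    return max(abs(p - desired_period) for p in periods)


# Hypothetical loop start times in microseconds for a loop meant to run every 1000 us.
timestamps_us = [0, 1000, 2030, 2990, 4000]
jitter_us = loop_jitter(timestamps_us, 1000)
```

Here the measured cycle times are 1000, 1030, 960 and 1010 microseconds, so the worst case jitter is 40 microseconds.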

The FPGA can be used effectively as a prototyping device in order to get the control algorithms fine tuned and running correctly. The wide range of design tools available for FPGAs makes it easy to build a prototype of the control algorithm that we wish to implement, and to understand and refine issues like timing and signal integrity. One can even design the controller in a control systems design package like MATLAB and use the VHDL or Verilog description of the controller thus generated to program the FPGA prototyping board. The FPGA thus plays a very important role in prototyping the controller, even if the ultimate goal is the creation of an Application Specific Integrated Circuit (ASIC) controller for the application at hand. The FPGA has another advantage in that the design cycle time for the controller is shorter for an FPGA than for an ASIC. In some cases it may be more economical for the controller to be implemented in an FPGA rather than an ASIC. Depending on the application, an FPGA implementation can also consume less power than a microprocessor based controller, although an equivalent ASIC would normally consume less power than the FPGA. The FPGA design flow consists of the steps of creation, simulation, verification, synthesis, placement and routing of the design. A lot of computer based tools are available for this purpose, which is yet another argument in the FPGA's favour. Hence we can safely arrive at a justification for the use of FPGAs in control applications.


Chapter 3

FPGA In Control-Case Studies

3.1 FPGA based PID control

The PID control algorithm is one of the most commonly used control algorithms in industry. The controller output is computed in continuous time as follows:

$$u(t) = k_p \left[ e(t) + \frac{1}{T_i} \int_0^t e(\tau)\,d\tau + T_d \frac{de(t)}{dt} \right] \tag{3.1}$$

where $k_p$ is the proportional gain, $T_i$ is the reset time and $T_d$ is the derivative time. Wei Zhao et al. [1] suggested an implementation of the PID controller on an FPGA. They implemented both parallel and serial PID designs and compared them on the basis of resource utilization, speed and power consumption. Equation (3.1) is discretised and the following equation is obtained:

$$u(n) = k_p e(n) + k_i \sum_{j=0}^{n} e(j) + k_d \left( e(n) - e(n-1) \right) \tag{3.2}$$

where $k_i = k_p T / T_i$ is the integral coefficient, $k_d = k_p T_d / T$ is the derivative coefficient and $T$ is the sampling period. This form is called the position form of the PID algorithm.


An alternative is to compute $u(n)$ from the past output $u(n-1)$ and a correction term $\Delta u(n)$. This approach is often called the velocity form of the PID algorithm. The first step is to write $u(n-1)$ using equation (3.2):

$$u(n-1) = k_p e(n-1) + k_i \sum_{j=0}^{n-1} e(j) + k_d \left( e(n-1) - e(n-2) \right) \tag{3.3}$$

Then the correction term is calculated as

$$\Delta u(n) = u(n) - u(n-1) \tag{3.4}$$

$$\Delta u(n) = k_0 e(n) + k_1 e(n-1) + k_2 e(n-2) \tag{3.5}$$

where

$$k_0 = k_p + k_i + k_d, \qquad k_1 = -k_p - 2k_d, \qquad k_2 = k_d$$

The current control output is calculated as

$$u(n) = u(n-1) + \Delta u(n) \tag{3.6}$$

$$u(n) = u(n-1) + k_0 e(n) + k_1 e(n-1) + k_2 e(n-2) \tag{3.7}$$
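The step from the position form to the coefficients of the velocity form can be written out explicitly. The working below is a sketch that assumes the summation in (3.2) runs up to and includes the current error $e(n)$, which is the reading consistent with the integral coefficient appearing in the coefficient of $e(n)$. Subtracting the expression for $u(n-1)$ from that for $u(n)$, the two sums differ only by the term $k_i e(n)$, so that

$$\Delta u(n) = k_p \left[ e(n) - e(n-1) \right] + k_i e(n) + k_d \left[ e(n) - 2e(n-1) + e(n-2) \right]$$

Collecting the terms in $e(n)$, $e(n-1)$ and $e(n-2)$ gives

$$\Delta u(n) = (k_p + k_i + k_d)\, e(n) + (-k_p - 2k_d)\, e(n-1) + k_d\, e(n-2)$$

from which the three coefficients of the velocity form can be read off directly.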

In a software implementation, equation (3.7) avoids accumulation of all past errors and enables smooth switching from manual to automatic modes of operation. Initially the single channel implementation of equation (3.7) is used. The equation is decomposed into its basic operations. Here $p$ and $p_d$ refer to the controlled variable and its desired value (set point) respectively, and $p_0$, $p_1$, $p_2$, $s_1$, $s_2$ are temporary variables.


$$e(n) = p + (-p_d) \tag{3.8}$$

$$p_0 = k_0 \, e(n) \tag{3.9}$$

$$p_1 = k_1 \, e(n-1) \tag{3.10}$$

$$p_2 = k_2 \, e(n-2) \tag{3.11}$$

$$s_1 = p_0 + p_1 \tag{3.12}$$

$$s_2 = p_2 + u(n-1) \tag{3.13}$$

$$u(n) = s_1 + s_2 \tag{3.14}$$

The above equations can be implemented in both parallel and serial designs. In the parallel design, each basic operation has its own arithmetic unit, either an adder or a multiplier. In the serial design, which is mainly composed of sequential logic, all operations share only one adder and one multiplier.
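The decomposition into basic operations can be checked with a short software model. The sketch below is not the hardware design of [1]: it is a floating point model in which the class name `VelocityPID`, the helper `position_pid` and the zero initial conditions are assumptions made for illustration. The error e(n) is taken as an input, and the sum in the position form (3.2) is assumed to include the current error.

```python
class VelocityPID:
    """Velocity form PID built from the basic operations (3.9) to (3.14)."""

    def __init__(self, kp, ki, kd):
        # Coefficients of equation (3.7).
        self.k0 = kp + ki + kd
        self.k1 = -kp - 2 * kd
        self.k2 = kd
        self.e1 = 0.0  # e(n-1), assumed zero initially
        self.e2 = 0.0  # e(n-2), assumed zero initially
        self.u1 = 0.0  # u(n-1), assumed zero initially

    def step(self, e):
        p0 = self.k0 * e        # (3.9)
        p1 = self.k1 * self.e1  # (3.10)
        p2 = self.k2 * self.e2  # (3.11)
        s1 = p0 + p1            # (3.12)
        s2 = p2 + self.u1       # (3.13)
        u = s1 + s2             # (3.14)
        # Shift the stored values, as the registers would at the end of a sample.
        self.e2, self.e1, self.u1 = self.e1, e, u
        return u


def position_pid(kp, ki, kd, errors):
    """Position form (3.2), used here only as a cross-check."""
    outputs, acc, prev = [], 0.0, 0.0
    for e in errors:
        acc += e  # running sum of all errors so far, including the current one
        outputs.append(kp * e + ki * acc + kd * (e - prev))
        prev = e
    return outputs


errors = [1.0, 0.5, -0.25, 0.0]
pid = VelocityPID(kp=2.0, ki=0.5, kd=0.25)
u_velocity = [pid.step(e) for e in errors]
u_position = position_pid(2.0, 0.5, 0.25, errors)
```

With identical gains and zero initial conditions the two forms produce the same output sequence, but the velocity form stores only two past errors and one past output. The three multiplications and three additions inside `step` correspond to separate arithmetic units in a parallel design and to time slots on shared units in a serial design.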

3.1.1 Parallel Design

The parallel design implemented by Wei Zhao et al. [1] consists of 4 adders and 3 multipliers, corresponding to the basic operations in equations (3.8) through (3.14). The implementation diagram is shown in figure 3.1. The other circuitry includes registers for latching initial and intermediate values of the error and controller output signals. The implementation also includes value limitation logic that keeps the signals generated by the control logic within limits that the physical device can bear.

3.1.2 Serial Design

In order to minimize the area and the resources consumed by the design, the serial design consists of only one adder and one multiplier [1]. Figure 3.2 shows the schematic of the implementation. The other parts of the implementation include registers, multiplexers and circuits for arithmetic operations. They are commonly referred to as the datapath circuits. Registers are used to store intermediate results. Because the single adder and multiplier are used in a time shared manner, a control unit is necessary. This is a finite state machine that sets the select lines of the multiplexers, thereby changing the inputs to the arithmetic circuits.

Figure 3.1: Parallel Implementation of PID in FPGA [1]

Figure 3.2: Serial Implementation of PID in FPGA [1]

Wei Zhao et al. [1] conducted a wide variety of function and performance tests on these implementations. The results of those tests that have relevance to our problem are presented.

1. Resource Utilization: It was found that the serial implementation consumed far fewer resources on the FPGA than the parallel implementation. Even though the serial implementation includes a control unit, it was found to need far fewer CLBs.

2. Speed: Wei Zhao et al. performed a detailed analysis of the speed of the various implementations using the Xilinx timing analyser and found that in each design there were two timing concerns. The first is the control clock frequency, which controls the timing cycles of the PID algorithm. The second is the sampling frequency, which corresponds to the rate at which the control algorithm generates control signals. This depends on whether the implementation is serial or parallel. For the parallel implementation, which is essentially a combinational logic implementation, the sampling frequency and the control clock frequency are the same. This is a result of the inherently parallel nature of such an implementation. The serial implementation, on the other hand, requires four clock cycles to complete the basic operations specified in equations (3.8) through (3.14). Hence the sampling frequency of the serial implementation is 1/4 of the control clock frequency.

3. Power Dissipation: The power dissipation increased as the sampling frequency was increased. At reasonable sampling frequencies there was no difference between the parallel and serial designs, even though the parallel design was expected to be more power efficient because of its much lower clock frequency.

3.2 A More Efficient PID Implementation

In the previous section we looked at an implementation of a PID controller based on multipliers and adders. But when we are implementing PID controllers in LUT rich FPGAs, any design that does not make use of the memory rich characteristics of the FPGA is not an optimal implementation. It should, however, be mentioned that this type of PID implementation is inherently architecture specific: it is more efficient only in those kinds of FPGA that are rich in LUTs. Y.F. Chang [7] suggested an improved implementation of a PID controller based on Distributed Arithmetic (DA) concepts. DA techniques are very efficient LUT based design techniques; they exploit the abundance of LUTs present in the FPGA. The continuous PID equation (3.1) is modified as follows in order to avoid spikes in the output caused by the derivative term. These spikes occur when the user changes the set point abruptly. If the derivative term acts on the set point, then a sudden change in the set point results in spikes in the output.

$$U(s) = K \left[ b\,U_c(s) - Y(s) + \frac{1}{sT_i} \left( U_c(s) - Y(s) \right) - \frac{sT_d}{1 + sT_d/N}\, Y(s) \right] \tag{3.15}$$

In equation (3.15) it is advantageous to allow only a fraction $b$ of the command signal to act on the proportional part. Here $K$ is the proportional gain, $T_i$ is the integral time, $T_d$ is the derivative time and $N$ limits the high frequency gain of the derivative term. $U_c$ is the set point, $Y$ is the process value and $U$ is the controller output. This equation was proposed by [10]. Discretising equation (3.15), using forward differences for the integral term and backward differences for the derivative term, one has:

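$$u(kT) \approx P(kT) + I(kT) + D(kT) \tag{3.16}$$

where $k$ denotes the $k$-th sampling instant and

$$P(kT) = K \left( b\,u_c(kT) - y(kT) \right) \tag{3.17}$$

$$I(kT) = I((k-1)T) + \frac{KT}{T_i} \left( u_c((k-1)T) - y((k-1)T) \right) \tag{3.18}$$

$$D(kT) = \frac{T_d}{T_d + NT}\, D((k-1)T) - \frac{K T_d N}{T_d + NT} \left( y(kT) - y((k-1)T) \right) \tag{3.19}$$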

Here $y(kT)$ is the output at the current instant and $y((k-1)T)$ is the output at the previous instant. $u_c$ is the desired output of the system. $I((k-1)T)$ is the value of the integral term at the previous instant and $D((k-1)T)$ is the value of the derivative term at the previous instant. $K$, $b$, $T_i$, $T_d$ and $N$ are controller parameters and $T$ is the sampling time. The direct implementation of the above equations requires 5 multipliers, 5 adder subtractors and 4 delay elements. The multiplier based design is not efficient for FPGA implementation because the FPGA has only a limited number of CLBs for implementing these logic circuits. A better implementation is the DA based implementation. Consider equations (3.17) to (3.19). Assuming that $u_c(kT)$, $u_c((k-1)T)$, $y(kT)$ and $y((k-1)T)$ are $m$ bit numbers and that $[j]$ denotes the $j$-th bit of these numbers, we obtain the following equations:

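$$P(kT) = \sum_{j=0}^{m-1} \left( K b\, u_c(kT)[j] - K\, y(kT)[j] \right) 2^j \tag{3.20}$$

$$I(kT) = \sum_{j=0}^{m-1} \left( I((k-1)T)[j] + \frac{KT}{T_i} \left( u_c((k-1)T)[j] - y((k-1)T)[j] \right) \right) 2^j \tag{3.21}$$

$$D(kT) = \sum_{j=0}^{m-1} \left( \frac{T_d}{T_d + NT}\, D((k-1)T)[j] - \frac{K T_d N}{T_d + NT} \left( y(kT)[j] - y((k-1)T)[j] \right) \right) 2^j \tag{3.22}$$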

The results of $K b\, u_c(kT)[j] - K\, y(kT)[j]$, $I((k-1)T)[j] + \frac{KT}{T_i} \left( u_c((k-1)T)[j] - y((k-1)T)[j] \right)$ and $\frac{T_d}{T_d + NT} D((k-1)T)[j] - \frac{K T_d N}{T_d + NT} \left( y(kT)[j] - y((k-1)T)[j] \right)$ are precomputed for every combination of the input bits and stored in look up tables. Using the three LUTs and the corresponding shift add accumulators, the P(kT), I(kT) and D(kT) terms can be computed in m clock cycles. The main advantage of this method is the fact that it utilizes the LUT rich feature of the FPGA for computing the control effort.
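The way a look up table and a shift add accumulator replace the multipliers can be illustrated for the proportional term of equation (3.20). The sketch below is a software model under simplifying assumptions, not the design of [7]: the inputs are treated as unsigned m bit integers, the parameter values are arbitrary, and the names `build_p_lut` and `da_proportional` are hypothetical. A hardware design would additionally have to fix a fixed point format and handle signed values.

```python
def build_p_lut(K, b):
    """Precompute K*b*uc_bit - K*y_bit for the four combinations of (uc_bit, y_bit)."""
    return {(uc_bit, y_bit): K * b * uc_bit - K * y_bit
            for uc_bit in (0, 1) for y_bit in (0, 1)}


def da_proportional(lut, uc, y, m):
    """Shift add accumulation of equation (3.20): one LUT access per bit, m steps in total."""
    acc = 0
    for j in range(m):
        uc_bit = (uc >> j) & 1
        y_bit = (y >> j) & 1
        # Scaling by 2**j is a wired shift in hardware, so no multiplier is needed.
        acc += lut[(uc_bit, y_bit)] * (1 << j)
    return acc


K, b, m = 4, 0.5, 8   # illustrative controller parameters and word length (assumptions)
lut = build_p_lut(K, b)
uc, y = 100, 37       # unsigned 8 bit set point and process value (assumptions)

p_da = da_proportional(lut, uc, y, m)
p_direct = K * (b * uc - y)   # direct evaluation of equation (3.17) for comparison
```

The loop runs m times, which mirrors the m clock cycles quoted above, and the only arithmetic inside it is a table look up, a shift and an addition. The table here has just four entries because the proportional term depends on two input bits per step.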

Figure 3.3 shows the DA implementation schematic for this particular implementation.

Figure 3.3: DA Implementation Schematic [7]

It consists of four delay blocks, 3 LUTs, 3 accumulators and 2 adders. Delay blocks 1 and 2 are used to obtain $u_c((k-1)T)$ and $y((k-1)T)$ respectively, whereas delay blocks 3 and 4 are used to obtain $D((k-1)T)$ and $I((k-1)T)$. Three LUTs and ACCs are used to provide the terms P(kT), I(kT) and D(kT) respectively. Each ACC consists of an accumulator and an adder subtractor pair. Finally, two adders produce the sum of P(kT), I(kT) and D(kT). This implementation produces one output every m+1 clock cycles, i.e. m clock cycles to compute U and one more clock cycle to update the I((k-1)T) and D((k-1)T) terms. Thus the DA based implementation consumes far fewer logic resources than the parallel multiplier based design. The price is speed: the design using DA would require 14 clock cycles per output, in comparison to the design based on multipliers, which would take just a single clock cycle. Since power consumption depends upon the clock frequency, the reduced power consumption that comes with a reduced clock frequency would be advantageous in those applications which can tolerate the increased loop cycle time resulting from the predominantly serial implementation of the DA based controller.


Chapter 4

Recent Trends In FPGA Based Control

4.1 Reconfigurable FPGA In Control Applications

In recent years, the specifications for control systems have grown to include a certain degree of intelligence. They vary from specifications requiring a certain amount of fault tolerance to operation under varying operating conditions [11]. These systems must also be capable of intelligent sensor selection, remote monitoring and operation, and must be capable of implementing sophisticated control algorithms that require adaptation. Hence, in order to meet these specifications, one has to look at a new approach in terms of either hardware software co-design or reconfigurable hardware like FPGAs that allow such hardware/software co-design to take place. By systematically partitioning the system, functionality requiring large amounts of reconfiguration can be allocated reconfigurable resources on an FPGA, thereby ensuring that the above mentioned objectives are met. This is especially useful in certain kinds of fault tolerant systems. Suppose the system detects the occurrence of a fault; then a new configuration can be loaded (either partially or fully) so that the fault is taken care of (either remedied or bypassed) and the control system performance is not affected. Reconfiguration can be broadly classified into two types:

Compile Time Reconfiguration

This is the normal kind of offline reconfiguration that most FPGAs support. Suppose the architecture of the controller or system requires a change; the FPGA is taken offline and the new netlist is loaded onto the target device. In fact, in most kinds of SRAM based FPGA this is a routine feature. The SRAM based FPGA is a volatile device. Its configuration settings have to be loaded at power up from a non volatile device like a CPLD or an EEPROM. Hence this is in itself a sort of compile time reconfiguration. The concept of compile time reconfiguration is illustrated in figure 4.1 [2].

Figure 4.1: Compile Time Reconfiguration [2]

Run Time Reconfiguration

This is a reconfiguration technique in which the algorithm to be implemented is split into many time independent partitions and the FPGA configuration is changed on the fly. Run time reconfiguration is in turn classified into global and local reconfiguration [2]. Global reconfiguration refers to the process in which the configuration of the whole FPGA is changed. Local reconfiguration is the process in which the configuration of only some parts of the FPGA is changed, while the configuration of the remaining parts remains the same. This feature is supported only by certain FPGAs.

Figure 4.2: Global Run Time Reconfiguration [2]

As shown in figure 4.2, the FPGA as a whole goes through a series of configuration changes, from A to B and finally to C. Hence the name global run time reconfiguration. Refer to figure 4.3.

Figure 4.3: Local Run Time Reconfiguration [2]

Here the first configuration and the second configuration share a common partition A, whereas partition B of the first configuration is reconfigured into C and D in the second configuration. The configuration of the FPGA is changed only locally; the configurations retain at least one common part throughout, hence the name local reconfiguration. This kind of reconfiguration scheme is used in order to change the control algorithm effectively and also to incorporate a certain degree of fault tolerance into the control system. Suppose there is a failure in a particular sensor while configuration A,B of figure 4.3 is running. Partition B can then be rewired into C and D such that the failed sensor is effectively wired out of the circuit and the control system still continues to function properly. Adaptation mechanisms can also be incorporated effectively into the scheme. An implementation of a fuzzy logic controller on a reconfigurable FPGA system for the truck backer-upper application, using the above mentioned reconfiguration techniques, can be found in [2].
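The bookkeeping behind local run time reconfiguration can be pictured with a very small software model. This is purely illustrative: the function `reconfigure_locally`, the partition contents and the fault scenario are assumptions, and the sketch says nothing about how a real device loads partial bit streams.

```python
def reconfigure_locally(active, replace, new_partitions):
    """Swap one partition for new ones; every other partition keeps its configuration."""
    updated = {name: cfg for name, cfg in active.items() if name != replace}
    updated.update(new_partitions)
    return updated


# First configuration of figure 4.3: partitions A and B (contents are placeholders).
config_1 = {"A": "control law", "B": "interface to sensors 1 and 2"}

# On a hypothetical failure of sensor 1, B is reconfigured into C and D; A is untouched.
config_2 = reconfigure_locally(
    config_1,
    "B",
    {"C": "interface to sensor 2", "D": "estimator replacing sensor 1"},
)
```

In a global reconfiguration the whole dictionary would be replaced instead, which corresponds to the sequence of complete configurations A, B, C of figure 4.2.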

4.2 Co-design Implementation of Controllers

The next important and evolving aspect of FPGA based control is hardware software co-design using FPGAs. An application of this approach in the area of Model Predictive Control (MPC) is illustrated in [3]. Hardware software co-design is a new paradigm in which a microprocessor or microcontroller is embedded in an FPGA. Control algorithms that require a large number of computationally involved operations, such as matrix manipulations, cannot be implemented effectively on a set up based on a single microprocessor, as the microprocessor gets bogged down while performing these operations. Here the parallel architecture of the FPGA can be exploited to build a matrix coprocessor for these computations, while the general purpose microprocessor embedded in the chip performs the other operations. This is more efficient and still retains its system on chip nature, because the processor and the FPGA logic come bundled on a single chip. The model predictive controller suggested by [3] gives precisely such an implementation. It involves co-design with a microcontroller and an FPGA. The microcontroller bears the load of performing the higher level operations of the algorithm, while the matrix processor handles the computationally intensive ones. The matrix processor does not have to deal with fetching instructions and data, as that aspect is taken care of by the microcontroller. Currently the hardware implementation consists of two different boards, one that accommodates the microcontroller and one the FPGA, which communicate via a 20-bit bus (a 16-bit data bus and 4 control signals). But since both of them are described in an HDL, they can later be targeted to a single FPGA in order to arrive at a System On Programmable Chip (SOPC) design.
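The division of labour between the microcontroller and the matrix processor can be sketched in software as follows. This is a toy model under stated assumptions and is not the design of [3]: only the bus widths (a 16-bit data word and 4 control signals) come from the text, while the opcodes, the transfer sequence and the class and function names are hypothetical.

```python
# Toy model of the microcontroller / matrix coprocessor split. Only the bus
# widths (16 data bits, 4 control signals) come from the text; the opcodes
# and the transfer sequence are hypothetical and not taken from [3].

OP_LOAD_MATRIX, OP_LOAD_VECTOR, OP_MULTIPLY = 0x1, 0x2, 0x3  # assumed 4-bit codes

class MatrixCoprocessor:
    """Stands in for the FPGA side: buffers operands, multiplies on request."""

    def __init__(self, size):
        self.size = size
        self.matrix, self.vector, self.result = [], [], []

    def bus_write(self, control, data):
        """Accept one bus transfer: 4 control bits plus one 16-bit data word."""
        if not (0 <= control < 2**4 and 0 <= data < 2**16):
            raise ValueError("transfer does not fit the 20-bit bus")
        if control == OP_LOAD_MATRIX:
            self.matrix.append(data)
        elif control == OP_LOAD_VECTOR:
            self.vector.append(data)
        elif control == OP_MULTIPLY:
            n = self.size
            # In hardware the n row products could be computed in parallel;
            # results are truncated to 16 bits to fit the data bus.
            self.result = [
                sum(self.matrix[i * n + j] * self.vector[j] for j in range(n)) & 0xFFFF
                for i in range(n)
            ]
            self.matrix, self.vector = [], []

    def bus_read(self):
        """Return the next 16-bit result word."""
        return self.result.pop(0)

def microcontroller_multiply(coprocessor, matrix, vector):
    """Software side: sequence the transfers, leave the arithmetic to the FPGA."""
    for row in matrix:
        for value in row:
            coprocessor.bus_write(OP_LOAD_MATRIX, value)
    for value in vector:
        coprocessor.bus_write(OP_LOAD_VECTOR, value)
    coprocessor.bus_write(OP_MULTIPLY, 0)
    return [coprocessor.bus_read() for _ in range(len(matrix))]

# Multiply a 2x2 matrix by a 2-element vector through the modelled bus.
product = microcontroller_multiply(MatrixCoprocessor(2), [[1, 2], [3, 4]], [5, 6])
```

The point of the sketch is the partitioning: the microcontroller routine contains only sequencing, while all multiply and accumulate work sits behind the bus interface, which is the part that would map onto the parallel FPGA fabric.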

Chapter 5

Conclusion

An introduction to the architecture of an FPGA was presented in the first chapter, along with a mention of the two main kinds of commercially available FPGA: coarse grained and fine grained. Features of modern FPGAs such as Low Voltage Differential Signalling (LVDS) and JTAG compliance were also looked at. The next chapter dealt with the need for embedded control using FPGA by comparing system design for control purposes using FPGA with system design using DSP processors. It also focused on the main advantages of using FPGA, such as reconfigurability and parallelism, along with reduced loop cycle time and reduced jitter. The usefulness of the FPGA as a prototyping device was also illustrated. The role played by FPGA based system design towards the ultimate goal of an ASIC based implementation of the controller was also dealt with. Advantages of FPGA based design such as reduced design cycle time and lower power consumption were also mentioned.

The third chapter looked at the implementation of a PID controller as suggested by Wei Zhao et al. [1], first in a conventional way and next by using Distributed Arithmetic concepts as suggested by Y. F. Chang et al. [7]. This was done in order to obtain a feel for the issues and trade offs that one faces while designing embedded controllers using FPGA. The PID controller was specifically chosen as the control algorithm is relatively simple and it gives an idea of the issues that must be taken care of while attempting system on chip implementations. The next chapter looked at recent trends in FPGA based control, namely reconfigurable FPGAs. Specifically, global and local reconfiguration were compared, and compile time and run time reconfiguration strategies were contrasted. These offer wide scope for implementing fault tolerant and adaptive control strategies in an SOC environment. In the next section the concept of co-design was looked at with reference to the implementation of an SOPC MPC controller, as suggested by [3]. This method, in concept, gives a way of implementing a Model Predictive Controller on an FPGA target.

Bibliography

[1] Wei Zhao, Byung Hwa Kim, Amy C. Larson and Richard M. Voyles. FPGA implementation of closed loop control system for small scale robot. In Proceedings, 12th International Conference on Advanced Robotics (ICAR 05), pages 70-77, 2005.

[2] Daijin Kim. An implementation of fuzzy logic controller on the reconfigurable FPGA system. IEEE Transactions on Industrial Electronics, Vol. 47, No. 3, 2000.

[3] Panagiotis Vouzis, Leonidas G. Bleris, Mayuresh V. Kothare and Mark Arnold. Towards a co-design implementation of a system for model predictive control. In Proceedings, Annual Meeting, American Institute of Chemical Engineers, Cincinnati Convention Center, Cincinnati, OH, November 2005.

[4] National Instruments: http://www.ni.com. FPGA based control: Millions of transistors at your command, 2004.

[5] Xilinx. Comparing and Contrasting FPGA and Microprocessor System Design and Development, July 2004.

[6] Wikipedia: http://www.wikipedia.org. Field programmable gate array, 2005.

[7] Y. F. Chang, M. Moallem and W. Wang. Efficient implementation of PID control algorithm using FPGA technology. In Proceedings, 43rd IEEE Conference on Decision and Control, 2004.

[8] Stephen Brown. FPGA and CPLD architectures: A tutorial. IEEE Design and Test of Computers, 1996.

[9] Charles H. Roth Jr. Digital System Design Using VHDL. Brooks/Cole, 1998.

[10] K. J. Astrom and B. Wittenmark. Computer Controlled Systems: Theory and Practice. Prentice-Hall, Inc., Upper Saddle River, NJ, 3rd edition, 1997.

[11] R. Muthuraman, A. Fajebe and S. Commuri. Intelligence in embedded control - a case study. Region 5 Conference: Annual Technical and Leadership Workshop, pages 125-130, April 2004.
