Embedded FPGA


  • Embedded Control Using FPGA

    Seminar Report

    Submitted in partial fulfillment of the requirements

    for the degree of

    Master of Technology

    by

    V. Sornam Viswanathan

    Roll No: 05323005

    under the guidance of

    Prof. P.S.V Nataraj

    Interdisciplinary Programme in Systems and Control Engineering

    Indian Institute of Technology, Bombay

    Mumbai

    November 2005

  • Acknowledgements

    I wish to thank my guide Prof. P.S.V Nataraj for his valuable guidance.

    I also wish to thank Mr. M. Suresh of Systems and Control Engineering for

    all his help.

  • Abstract

    FPGA stands for Field Programmable Gate Array. It is an integrated circuit that can be configured by the user to implement digital logic functions of varying complexity. FPGAs can be used very effectively for the control of processes that demand very short loop cycle times. A digital controller can be implemented in parallel on an FPGA, resulting in very high speeds of operation. This enables FPGAs to score over general-purpose computing chips such as DSP chips, which have a limited number of Multiply ACcumulate (MAC) units available for the controller design. This report looks at the serial and parallel implementations of a PID controller on an FPGA as suggested by [1]. A PID controller was chosen specifically so that the issues and trade-offs involved in implementing controllers could be understood. A more efficient implementation of the PID controller using Distributed Arithmetic (DA) is then examined; it makes use of the Look Up Table (LUT) rich character of the FPGA. The final part of the report surveys recent trends in FPGA-based control, such as reconfiguration and co-design implementation of controllers. Mention is made of the implementation of a fuzzy logic controller for the truck backer-upper problem as suggested by [2], and of a Model Predictive Controller implemented using co-design techniques as suggested by [3].

  • Contents

    1 Introduction
    1.1 FPGA Architecture: An Overview
    1.1.1 FPGA Architecture
    1.1.2 Programmable Logic Block
    1.1.3 Interconnect Resources
    1.1.4 Input Output Blocks (IOB)
    2 FPGA In Control
    2.1 FPGA In Control-Motivation
    2.1.1 FPGA vs DSP processors
    3 FPGA In Control-Case Studies
    3.1 FPGA based PID control
    3.1.1 Parallel Design
    3.1.2 Serial Design
    3.2 A More Efficient PID Implementation
    4 Recent Trends In FPGA Based Control
    4.1 Reconfigurable FPGA In Control Applications
    4.2 Co-design Implementation of Controllers
    5 Conclusion

  • List of Figures

    1.1 FPGA Schematic [4]
    1.2 Xilinx FPGA-CLB Schematic [8]
    1.3 FPGA Interconnection schematic [6]
    2.1 Implementation of filter on a Conventional DSP [5]
    2.2 Implementation of filter on a FPGA [5]
    2.3 Diagram illustrating loop cycle time [4]
    3.1 Parallel Implementation of PID in FPGA [1]
    3.2 Serial Implementation of PID in FPGA [1]
    3.3 DA Implementation Schematic [7]
    4.1 Compile Time Reconfiguration [2]
    4.2 Global Run Time Reconfiguration [2]
    4.3 Local Run Time Reconfiguration [2]

  • Chapter 1

    Introduction

    FPGA-based control is a problem whose solution draws on developments in both control and FPGA technology. To understand the need for FPGA-based embedded control, one must analyse the capabilities of FPGAs and the reasons for choosing them from among a bewildering array of embedded controllers. Hence the first chapter studies the architecture of the FPGA, and the second chapter deals with the motivation for using FPGAs in control.

    1.1 FPGA Architecture: An Overview

    FPGA is an acronym for Field Programmable Gate Array. It belongs to a class of user-programmable digital devices called Programmable Logic Devices (PLDs). A programmable logic device is an integrated circuit that the user can configure in many ways, enabling the implementation of digital logic functions of varying sizes and complexities. PLDs can be classified into the following categories:

    1. Simple Programmable Logic Device (SPLD)

    (a) Programmable logic array (PLA): A PLA is an integrated circuit that contains two levels of programmable logic: an AND plane and an OR plane.

    (b) Programmable array logic (PAL): A PAL is an integrated circuit that contains a programmable AND plane followed by a fixed OR plane.

    2. Complex Programmable Logic Device (CPLD)

    3. Field Programmable Gate Array (FPGA)

    1.1.1 FPGA Architecture

    The typical FPGA consists of the following components:

    1. Programmable Logic blocks

    2. Interconnection Resources

    3. Input output blocks

    The general schematic of an FPGA is shown in Figure 1.1.

    Figure 1.1: FPGA Schematic [4]

  • 1.1.2 Programmable Logic Block

    The programmable logic in a typical FPGA is organised as Configurable Logic Blocks (CLBs). A CLB can be realized in many ways, one of them being the Look Up Table (LUT) based CLB. The LUT is a one-bit-wide memory array. The memory address lines are the inputs to the LUT and the one-bit output is the LUT output. Thus a LUT with K inputs acts as a 2^K by 1 bit memory, and the user can directly implement any K-input function by programming the function's truth table into the LUT [8].

    Figure 1.2: Xilinx FPGA-CLB Schematic [8]

    Figure 1.2 shows a generalized CLB that can be used to implement certain logic functions of up to nine inputs, two separate four-input logic functions, and many other combinations. The CLB also has a D flip-flop that can be used to implement sequential logic functions. It has features that support the integration of entire systems, and specialized circuitry that enables it to perform arithmetic operations such as addition and multiplication in a fast and efficient manner. Users can also configure the LUTs in the CLB as read/write RAM. Some FPGAs also allow their LUTs to be configured as dual-port RAMs, with one write port and two read ports. The chips also include very wide AND planes around the periphery of the CLB array to facilitate the implementation of wide decoders. Some modern FPGAs also include entire microcontrollers on the chip, which makes it easier to implement complicated functions on a single chip. This is especially suited for control applications [8].
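    The statement that a K-input LUT is simply a 2^K by 1 bit memory holding a truth table can be illustrated with a small software model. The following is only a sketch: the function names and the 3-input majority function are assumptions chosen for illustration and do not come from [8].

```python
def program_lut(func, k):
    """Return the 2**k one-bit memory contents that realise `func`."""
    table = []
    for address in range(2 ** k):
        # Bit i of the address is input i of the logic function.
        inputs = [(address >> i) & 1 for i in range(k)]
        table.append(func(*inputs) & 1)
    return table


def read_lut(table, inputs):
    """Evaluate the function: the inputs act as the memory address lines."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return table[address]


# Hypothetical example function: a 3-input majority vote.
def majority(a, b, c):
    return (a & b) | (b & c) | (a & c)


lut = program_lut(majority, 3)
print(lut)                      # the 8 stored truth-table bits
print(read_lut(lut, [1, 0, 1]))  # majority of (1, 0, 1)
```

    Changing the logic function only changes the stored bits, not the structure, which is why the same LUT can serve as any K-input function.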

    1.1.3 Interconnect Resources

    The other important feature that decides the performance of an FPGA and its suitability for control applications is its interconnect resources. The interconnect allows an entire digital system to be implemented by connecting the individual circuits (subsystems) that have been implemented on different CLBs in the FPGA. The interconnect resources in a typical FPGA can be classified as follows [9]:

    1. General Purpose Interconnects: Signals between CLBs and Input Output Blocks (IOBs) are routed through switch matrices as they travel along the horizontal and vertical interconnect lines.

    2. Direct Interconnects: Adjacent CLBs are interconnected directly.

    3. Long Lines: Long lines provide high fan-out, low-skew distribution of signals that must travel relatively long distances. They span the entire length or width of the interconnect area and are typically used for clock signals.

    General-purpose FPGA interconnect is normally segmented, i.e. each wiring segment spans only one logic block before it terminates in a switch box. A switch box is a switching matrix that contains programmable interconnections to all the wiring segments that terminate inside it. By turning on some of the programmable switches within a switch box, longer paths can be constructed [6]. Figure 1.3 shows a typical FPGA interconnection scheme.

  • Figure 1.3: FPGA Interconnection schematic [6]

    1.1.4 Input Output Blocks (IOB)

    The IOB provides the interface between the FPGA and real-world signals. Each IOB is built around an I/O pad, which connects to one of the pins on the IC package so that external signals can be input to or output from the array of logic cells. The IOB also contains tristate buffers, which allow signals to be input to and output from the logic array. Flip-flops are provided so that input and output values can be stored within the IOB. Each IOB also has a variety of other features, such as programmability of the input threshold to respond to either TTL or CMOS logic levels. It also incorporates slew rate control of the output signal and includes internal pull-up resistors to avoid floating inputs [9].

    An FPGA can be a fine-grained or a coarse-grained device. A fine-grained FPGA consists of a large number of small programmable logic resources that can be combined to implement a variety of functions. A typical example of such an FPGA is the Atmel AT40K. A coarse-grained FPGA such as the Xilinx Virtex series consists of a smaller number of more powerful logic blocks built from LUTs and flip-flops. Modern FPGAs also offer features like Low Voltage Differential Signalling (LVDS) and support programmable input thresholds for standards such as LVTTL and LVCMOS. They also provide Digitally Controlled Impedance (DCI) features. Most FPGAs also include Peripheral Component Interconnect (PCI) support, through which they can be connected to a general-purpose computer or form part of a larger development board. FPGAs are also JTAG compliant, i.e. they support the IEEE 1149.1-1990 boundary scan architecture, which enables test data to be serially loaded into the device and test results to be serially read out. JTAG can also be used to load configuration bit streams into the FPGA. Another important feature is In System Programming (ISP), which enables the FPGA to be programmed while it is part of the end target system. This eliminates the need to physically remove the chip from the system and makes reprogramming easy.

  • Chapter 2

    FPGA In Control

    2.1 FPGA In Control-Motivation

    The question that must be answered before we proceed is this: with a plethora of embedded devices available for digital control, why should one choose embedded control using an FPGA? The answer lies in the following advantages that FPGAs possess. Most of the computation in control involves two operations: multiplication and accumulation. Together these are called Multiply ACcumulate (MAC) operations. The computational load on any kind of digital controller is at its maximum while it performs these operations. Hence the sampling rate, and therefore the speed of the controller, is limited by the rate at which the device performs them. In a general-purpose microprocessor the processor's resources are tied up while it performs the MAC operations, and the achievable sampling rate is decided by the latency of these instructions.

  • 2.1.1 FPGA vs DSP processors

    To ensure a fair comparison between FPGAs and general-purpose processors, let us examine the implementation of a digital filter. Many controllers are ultimately implemented as digital filters. Hence, to illustrate the power of the FPGA, let us look at the implementation of a 256-tap filter on a typical DSP processor and on an FPGA. The conventional DSP processor is a general-purpose programmable device that typically has 1-4 MAC units along with barrel shifters and other circuits optimized for efficient computation [5]. Figure 2.1 gives a good idea of the factors involved.

    Figure 2.1: Implementation of filter on a Conventional DSP [5]

    The conventional DSP is a serial device. Let us assume for the time being that it has a single MAC unit. A 256-tap filter involves 256 MAC operations per sample. Hence, with a single MAC unit, a typical DSP processor takes 256 clock cycles to compute each output. To improve the system throughput, we have to look at other options such as using a higher-frequency clock generator. This increases the system complexity and the cost. Moreover, clock skew is more likely to become a problem at high clock frequencies.

    On the other hand, let us look at the same filter implemented on a typical FPGA, as in Figure 2.2. This shows the most important feature of an FPGA: parallelism. The FPGA contains a large number of gates and millions of transistors, so the filter can be implemented in a parallel manner as shown. The implementation consists of 256 registers and 256 multiplier units, followed by the addition of the partial products. Hence what took 256 clock cycles in a DSP can be completed in a single clock cycle in an FPGA. This results in a tremendous reduction in the latency of each output sample.
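    The cycle counts quoted above can be reproduced with a small software model. This is only a sketch under the same simplifying assumptions as the text (one MAC per clock cycle on the DSP, all taps evaluated within one clock cycle on the FPGA); the coefficient and sample values are hypothetical.

```python
def fir_serial(coeffs, samples):
    """Single MAC unit: one multiply-accumulate per clock cycle."""
    acc, cycles = 0, 0
    for c, x in zip(coeffs, samples):
        acc += c * x   # one MAC operation
        cycles += 1
    return acc, cycles


def fir_parallel(coeffs, samples):
    """One multiplier per tap, followed by an adder summing the products."""
    products = [c * x for c, x in zip(coeffs, samples)]  # all taps at once
    return sum(products), 1  # modelled as a single clock cycle


TAPS = 256
coeffs = [1] * TAPS           # hypothetical moving-average coefficients
samples = list(range(TAPS))   # hypothetical contents of the delay line

y_serial, serial_cycles = fir_serial(coeffs, samples)
y_parallel, parallel_cycles = fir_parallel(coeffs, samples)
print(serial_cycles, parallel_cycles, y_serial == y_parallel)
```

    Both versions compute the same output; only the number of clock cycles per output differs. The model ignores the fact that a long combinational path may force a lower clock frequency in the parallel case.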

    Figure 2.2: Implementation of filter on a FPGA [5]

    Now let us look at some of the other features that FPGA-based embedded control offers. The speed of a control system affects its performance, stability, robustness and disturbance rejection characteristics. Faster control systems are typically more stable, easier to tune, and less susceptible to changing conditions and disturbances. To provide stable and robust control, a control system must be able to measure the process variable and set an actuator output command within a fixed period of time. The computational performance of the FPGA is so high that the control loop rate is limited only by the sensors, actuators and I/O modules. This is in stark contrast to traditional control systems, where the processing performance was typically the limiting factor.

    One of the most important parameters in measuring the performance of digital control systems is the loop cycle time. Loop cycle time is the time taken to execute one cycle of the control loop [4]. It is the time that elapses from sampling the process output, through computing the controller output according to the control algorithm, to sending the control signal to the actuator. In Figure 2.3, T is the loop cycle time.

    Figure 2.3: Diagram illustrating loop cycle time [4]

    Because of the inherent parallelism of the FPGA, very low loop cycle times are possible. Another common measure of control system performance and robustness is jitter, which is the variation of the actual loop cycle time from the desired loop cycle time. In general-purpose operating systems such as Windows the jitter is unbounded, so closed-loop control system stability cannot be guaranteed [4]. Processor-based control systems with real-time operating systems are commonly able to guarantee control loop jitter of less than 100 microseconds. In FPGA-based systems the control loop does not need to share hardware resources with other tasks, and control loops can be timed precisely using the FPGA clock. The jitter of an FPGA-based control loop depends on the accuracy of the FPGA clock source and is typically of the order of picoseconds.
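    As an illustration of the definition of jitter given above, the sketch below computes the worst-case deviation of the measured loop period from the desired one. The function name and the timestamp values are hypothetical, and taking the maximum absolute deviation is just one reasonable way to summarise the variation.

```python
def loop_jitter(timestamps, desired_period):
    """Worst-case deviation of the measured loop period from the desired
    period, in the same time unit as the inputs."""
    periods = [t1 - t0 for t0, t1 in zip(timestamps, timestamps[1:])]
    return max(abs(p - desired_period) for p in periods)


# Hypothetical loop start times in microseconds for a 1000 us loop.
stamps_us = [0, 1000, 2003, 2998, 4000]
print(loop_jitter(stamps_us, 1000))  # → 5
```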

    The FPGA can also be used effectively as a prototyping device to get control algorithms fine-tuned and running correctly. The wide range of design tools available for FPGAs makes it easy to build a prototype of the control algorithm and to understand and refine issues such as timing and signal integrity. One can even design the controller in a control systems design package such as MATLAB and download the VHDL or Verilog description of the controller thus generated on to an FPGA prototyping board. The FPGA thus plays a very important role in prototyping the controller, even if the ultimate goal is the creation of an Application Specific Integrated Circuit (ASIC) controller for the application at hand. Another advantage is that the design cycle time for a controller is shorter on an FPGA than on an ASIC. In some cases it may be more economical to implement the controller in an FPGA rather than an ASIC. An FPGA-based controller can also consume less power than a microprocessor-based controller performing the same computation. The FPGA design flow consists of creation, simulation, verification, synthesis, placement and routing of the design. Many computer-based tools are available for these steps, which is yet another argument in the FPGA's favour. Hence the use of FPGAs in control applications is well justified.

  • Chapter 3

    FPGA In Control-Case Studies

    3.1 FPGA based PID control

    The PID control algorithm is one of the most commonly used control algorithms in industry. The controller output is computed in continuous time as follows:

    \[ u(t) = k_p \left[ e(t) + \frac{1}{T_i} \int_0^t e(\tau)\,d\tau + T_d \frac{de(t)}{dt} \right] \tag{3.1} \]

    Here k_p is the proportional gain, T_i is the reset time and T_d is the derivative time. Wei Zhao et al. [1] suggested an implementation of the PID controller on an FPGA. They implemented both parallel and serial PID designs and compared them on the basis of resource utilization, speed and power consumption. Equation (3.1) is discretised to obtain the following equation:

    \[ u(n) = k_p e(n) + k_i \sum_{j=0}^{n} e(j) + k_d \left( e(n) - e(n-1) \right) \tag{3.2} \]

    where k_i = k_p T / T_i is the integral coefficient, k_d = k_p T_d / T is the derivative coefficient and T is the sampling period. This form is called the position form of the PID algorithm. An alternative is to compute u(n) from the past output u(n-1) and a correction term \Delta u(n). This approach is often called the velocity form of the PID algorithm. The first step is to write u(n-1) using equation (3.2):

    \[ u(n-1) = k_p e(n-1) + k_i \sum_{j=0}^{n-1} e(j) + k_d \left( e(n-1) - e(n-2) \right) \tag{3.3} \]

    The correction term is then

    \[ \Delta u(n) = u(n) - u(n-1) \tag{3.4} \]
    \[ \Delta u(n) = k_0 e(n) + k_1 e(n-1) + k_2 e(n-2) \tag{3.5} \]

    where

    \[ k_0 = k_p + k_i + k_d, \qquad k_1 = -k_p - 2k_d, \qquad k_2 = k_d \]

    The current control output is calculated as

    \[ u(n) = u(n-1) + \Delta u(n) \tag{3.6} \]
    \[ u(n) = u(n-1) + k_0 e(n) + k_1 e(n-1) + k_2 e(n-2) \tag{3.7} \]

    In a software implementation, equation (3.7) avoids accumulating all past errors and enables smooth switching between manual and automatic modes of operation. Initially, a single-channel implementation of equation (3.7) is considered. The equation is decomposed into its basic operations. Here p and p_d refer to the controlled variable and its desired value (set point) respectively, and p_0, p_1, p_2, s_1, s_2 are temporary variables.

    \[ e(n) = p_d + (-p) \tag{3.8} \]
    \[ p_0 = k_0 \, e(n) \tag{3.9} \]
    \[ p_1 = k_1 \, e(n-1) \tag{3.10} \]
    \[ p_2 = k_2 \, e(n-2) \tag{3.11} \]
    \[ s_1 = p_0 + p_1 \tag{3.12} \]
    \[ s_2 = p_2 + u(n-1) \tag{3.13} \]
    \[ u(n) = s_1 + s_2 \tag{3.14} \]

    These equations can be implemented in both parallel and serial designs. In the parallel design, each basic operation has its own arithmetic unit, either an adder or a multiplier. In the serial design, which is mainly composed of sequential logic, all operations share a single adder and a single multiplier.
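    The decomposition into basic operations can be checked in software before committing it to hardware. The sketch below follows equations (3.8) to (3.14) step by step and compares the result with the position form (3.2). It is an illustrative model, not the design of [1]: the function names, gains and measurement values are hypothetical, the error is assumed to be set point minus measured value, all initial states are assumed to be zero, and the value limitation logic is omitted.

```python
def make_velocity_pid(kp, ki, kd):
    """Return a step function following equations (3.8)-(3.14)."""
    k0 = kp + ki + kd
    k1 = -kp - 2 * kd
    k2 = kd
    state = {"e1": 0, "e2": 0, "u1": 0}  # e(n-1), e(n-2), u(n-1)

    def step(p, pd):
        e = pd + (-p)            # (3.8)
        p0 = k0 * e              # (3.9)
        p1 = k1 * state["e1"]    # (3.10)
        p2 = k2 * state["e2"]    # (3.11)
        s1 = p0 + p1             # (3.12)
        s2 = p2 + state["u1"]    # (3.13)
        u = s1 + s2              # (3.14)
        # Shift the delay registers for the next sample.
        state["e2"], state["e1"], state["u1"] = state["e1"], e, u
        return u

    return step


def position_pid(errors, kp, ki, kd):
    """Position form (3.2) for the latest sample, used as a reference."""
    e_n = errors[-1]
    e_prev = errors[-2] if len(errors) > 1 else 0
    return kp * e_n + ki * sum(errors) + kd * (e_n - e_prev)


pid = make_velocity_pid(kp=2, ki=1, kd=3)   # hypothetical integer gains
setpoint = 10
measurements = [0, 4, 7, 9, 10]             # hypothetical process values
outputs = [pid(p, setpoint) for p in measurements]
print(outputs)
```

    In the parallel design every commented line corresponds to its own adder or multiplier, whereas in the serial design the same lines are executed one after another on shared units.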

    3.1.1 Parallel Design

    The parallel design implemented by Wei Zhao et al. [1] consists of 4 adders and 3 multipliers, corresponding to the basic operations in equations (3.8) through (3.14). The implementation is shown in Figure 3.1. The remaining circuitry includes registers for latching the initial and intermediate values of the error and controller output signals. The implementation also includes value limitation logic that keeps the signals generated by the control logic within limits that the physical device can bear.

    3.1.2 Serial Design

    To minimize the area and the resources consumed by the design, the serial design uses only one adder and one multiplier [1]. Figure 3.2 shows the schematic of the implementation. The other parts of the implementation include registers, multiplexers and circuits for arithmetic operations, commonly referred to as the datapath circuits. Registers are used to store intermediate results.

    Figure 3.1: Parallel Implementation of PID in FPGA [1]

    Figure 3.2: Serial Implementation of PID in FPGA [1]

    Because the single adder and multiplier are used in a time-shared manner, a control unit is needed: a finite state machine that sets the select lines of the multiplexers, thereby changing the inputs to the arithmetic circuits. Wei Zhao et al. [1] conducted a wide variety of functional and performance tests on these implementations. The results that are relevant to our problem are presented below.

    1. Resource Utilization: The serial implementation was found to consume far fewer resources on the FPGA than the parallel implementation. Even though the serial implementation includes a control unit, it required far fewer CLBs.

    2. Speed: Wei Zhao et al. analysed the speed of the implementations in detail using the Xilinx timing analyser and found two timing concerns in each design. The first is the control clock frequency, which controls the timing cycles of the PID algorithm. The second is the sampling frequency, which is the rate at which the control algorithm generates control signals and depends on whether the implementation is serial or parallel. For the parallel implementation, which is essentially combinational logic, the sampling frequency and the control clock frequency are the same; this is a result of its inherently parallel nature. The serial implementation, on the other hand, requires four clock cycles to work through the basic operations specified in equations (3.8) through (3.14). Hence its sampling frequency is 1/4 of the control clock frequency.

    3. Power Dissipation: The power dissipation increased as the sampling frequency was increased. At reasonable sampling frequencies there was no difference between the parallel and serial designs, even though the parallel design was expected to be more power efficient because it needs a much lower clock frequency for the same sampling frequency.

    3.2 A More Efficient PID Implementation

    In the previous section we looked at an implementation of a PID controller based on multipliers and adders. However, when PID controllers are implemented in LUT-rich FPGAs, a design that does not exploit the memory-rich character of the FPGA is not optimal. It should be mentioned that this type of PID implementation is more efficient only on FPGAs that are rich in LUTs; it is inherently architecture specific. Y. F. Chang [7] suggested an improved implementation of a PID controller based on Distributed Arithmetic (DA) concepts. DA techniques are very efficient LUT-based design techniques; they exploit the abundance of LUTs in the FPGA. The continuous PID equation (3.1) is modified as follows to avoid spikes in the output caused by the derivative term. These spikes occur when the user changes the set point abruptly: if the derivative term acts on the set point, a sudden change in the set point results in spikes in the output.

    \[ U(s) = K \left[ b\,U_c(s) - Y(s) + \frac{1}{sT_i} \left( U_c(s) - Y(s) \right) - \frac{sT_d}{1 + sT_d/N}\, Y(s) \right] \tag{3.15} \]

    In equation (3.15) it is advantageous to allow only a fraction b of the command signal to act on the proportional part. Here K is the proportional gain, T_i is the integral time and T_d is the derivative time. U_c is the set point, Y is the process value and U is the controller output. This equation was proposed by [10]. Discretising equation (3.15) using forward differences for the integral term and backward differences for the derivative term, one has:

    \[ u(kT) = P(kT) + I(kT) + D(kT) \tag{3.16} \]

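    where k denotes the k-th sampling instant and

    \[ P(kT) = K \left( b\,u_c(kT) - y(kT) \right) \tag{3.17} \]
    \[ I(kT) = I((k-1)T) + \frac{KT}{T_i} \left( u_c((k-1)T) - y((k-1)T) \right) \tag{3.18} \]
    \[ D(kT) = \frac{T_d}{T_d + NT}\, D((k-1)T) - \frac{K T_d N}{T_d + NT} \left( y(kT) - y((k-1)T) \right) \tag{3.19} \]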

    Here y(kT) is the output at the current instant, y((k-1)T) is the output at the previous instant, and u_c is the desired output of the system. I((k-1)T) is the value of the integral term at the previous instant and D((k-1)T) is the value of the derivative term at the previous instant. K, b, T_i, T_d and N are controller parameters, and T is the sampling time. The direct implementation of the above equations requires 5 multipliers, 5 adder-subtractors and 4 delay elements. The multiplier-based design is not efficient for FPGA implementation because the FPGA has a limited number of CLBs for implementing these logic circuits. A better alternative is the DA-based implementation. Consider equations (3.17) to (3.19). Assuming that u_c(kT), u_c((k-1)T), y(kT) and y((k-1)T) are m-bit numbers and that [j] denotes the j-th bit of these numbers, we obtain the following equations:

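    \[ P(kT) = \sum_{j=0}^{m-1} \left( K b\, u_c(kT)[j] - K\, y(kT)[j] \right) 2^j \tag{3.20} \]
    \[ I(kT) = \sum_{j=0}^{m-1} \left( I((k-1)T)[j] + \frac{KT}{T_i} \left( u_c((k-1)T)[j] - y((k-1)T)[j] \right) \right) 2^j \tag{3.21} \]
    \[ D(kT) = \sum_{j=0}^{m-1} \left( \frac{T_d}{T_d + NT}\, D((k-1)T)[j] - \frac{K T_d N}{T_d + NT} \left( y(kT)[j] - y((k-1)T)[j] \right) \right) 2^j \tag{3.22} \]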

    The bracketed terms in equations (3.20) to (3.22), i.e. K b u_c(kT)[j] - K y(kT)[j], I((k-1)T)[j] + (KT/T_i)(u_c((k-1)T)[j] - y((k-1)T)[j]) and (T_d/(T_d+NT)) D((k-1)T)[j] - (K T_d N/(T_d+NT))(y(kT)[j] - y((k-1)T)[j]), are precomputed for every combination of input bits and stored in look up tables. Using the three LUTs and the corresponding shift-add accumulators, the P(kT), I(kT) and D(kT) terms can be computed in m clock cycles. The main advantage of this method is that it exploits the LUT-rich nature of the FPGA for computing the control effort.
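    The LUT and shift-add accumulator scheme can be illustrated for the proportional term of equation (3.20). The sketch below is a software model under simplifying assumptions that are not taken from [7]: the inputs are unsigned m-bit integers, the coefficients standing in for Kb and -K are hypothetical integers, and the accumulator adds left-shifted LUT outputs rather than using the right-shifting accumulator usually found in hardware.

```python
def build_lut(coeffs):
    """Precompute sum(c_i * b_i) for every combination of input bits b_i."""
    n = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(2 ** n)]


def da_dot_product(lut, inputs, m):
    """Bit-serial evaluation of sum(c_i * x_i) for unsigned m-bit inputs."""
    acc = 0
    for j in range(m):  # one clock cycle per bit position
        # Bit j of every input word forms the LUT address.
        addr = sum(((x >> j) & 1) << i for i, x in enumerate(inputs))
        acc += lut[addr] << j  # shift-add accumulate with weight 2**j
    return acc


# Hypothetical integer coefficients standing in for K*b and -K in (3.20).
K, Kb = 4, 3
lut_p = build_lut([Kb, -K])   # address bit 0: u_c bit, address bit 1: y bit
uc, y, m = 200, 120, 8        # hypothetical 8-bit set point and output
print(lut_p)
print(da_dot_product(lut_p, [uc, y], m))
```

    No multiplier appears anywhere in the loop: each clock cycle needs only a table look-up, a shift and an addition, at the cost of m cycles per result.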

    Figure 3.3 shows the DA implementation schematic for this particular implementation.

    Figure 3.3: DA Implementation Schematic [7]

    It consists of 4 delay blocks, 3 LUTs, 3 accumulators (ACCs) and 2 adders. Delay blocks 1 and 2 are used to obtain u_c((k-1)T) and y((k-1)T) respectively, whereas delay blocks 3 and 4 are used to obtain D((k-1)T) and I((k-1)T). The three LUTs and ACCs provide the terms P(kT), I(kT) and D(kT) respectively. Each ACC consists of an accumulator and an adder-subtractor pair. Finally, two adders produce the sum of P(kT), I(kT) and D(kT). Each output takes m+1 clock cycles: m clock cycles to compute U and one more clock cycle to update the I((k-1)T) and D((k-1)T) terms. Thus the DA-based implementation consumes far fewer logic resources than the parallel multiplier-based design. On the other hand, the DA design would require 14 clock cycles to compute each output, whereas the design based on multipliers would take just a single clock cycle. Since power consumption depends on the clock frequency, the reduced resource usage and power consumption of the DA design are advantageous in those applications that can tolerate the increased loop cycle time resulting from the predominantly serial implementation of the DA-based controller.
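    The cycle counts quoted for the three designs translate into loop cycle times as sketched below. The clock period is a hypothetical value, and the sketch assumes for simplicity that all three designs run from the same clock, which need not hold in practice because the long combinational path of a fully parallel design may limit its clock frequency.

```python
def loop_cycle_time_ns(clock_period_ns, cycles_per_output):
    """Time to produce one controller output, in nanoseconds."""
    return clock_period_ns * cycles_per_output


CLOCK_PERIOD_NS = 20  # hypothetical 50 MHz control clock

# Clock cycles per output as quoted in the text for each design.
cycles = {
    "parallel multiplier-based": 1,
    "serial (one adder, one multiplier)": 4,
    "distributed arithmetic": 14,
}

times = {name: loop_cycle_time_ns(CLOCK_PERIOD_NS, n)
         for name, n in cycles.items()}
for name, t in times.items():
    print(f"{name}: {t} ns")
```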


  • Chapter 4

    Recent Trends In FPGA Based Control

    4.1 Reconfigurable FPGA In Control Applications

    In recent years, the specifications for control systems have grown to include a certain degree of intelligence. They range from requiring a certain amount of fault tolerance to operating under varying operating conditions [11]. These systems must also be capable of intelligent sensor selection and of remote monitoring and operation, and must be able to implement sophisticated control algorithms that require adaptation. To meet these specifications, one has to look at new approaches: either hardware/software co-design, or reconfigurable hardware such as FPGAs that allows this kind of co-design to take place. By systematically partitioning the system, functionality that requires a large amount of reconfiguration can be assigned to reconfigurable resources on an FPGA, thereby ensuring that the above objectives are met. This is especially useful in certain kinds of fault-tolerant systems. If the system detects the occurrence of a fault, a new configuration can be loaded (either partially or fully) so that the fault is taken care of (either remedied or bypassed) and the control system performance is not affected. Reconfiguration can be broadly classified into two types:

    Compile Time Reconfiguration

    This is the normal kind of offline reconfiguration that most FPGAs support. If the architecture of the controller or system requires a change, the FPGA is taken offline and the new netlist is loaded on to the target device. In fact, in most SRAM-based FPGAs this is a routine feature. The SRAM-based FPGA is a volatile device: its configuration settings have to be loaded at power-up from a non-volatile device such as a CPLD or an EEPROM. Hence this is in itself a sort of compile time reconfiguration. The concept of compile time reconfiguration is illustrated in Figure 4.1 [2].

    Figure 4.1: Compile Time Reconfiguration [2]

    Run Time Reconfiguration

    This is a reconfiguration technique in which the algorithm to be
    implemented is split into many time independent partitions and the FPGA
    configuration is changed on the fly. Run time reconfiguration is in turn
    classified into global and local reconfiguration [2]. Global
    reconfiguration refers to the process in which the configuration of the
    whole FPGA is changed. Local reconfiguration is the process in which the
    configuration is changed only in some parts of the FPGA, while the
    configuration of the remaining parts stays the same. Local reconfiguration
    is supported only by certain FPGAs.

    As shown in figure 4.2, the FPGA as a whole goes through a series of
    configuration changes, from A to B and finally to C; hence the name global
    run time reconfiguration.

    Figure 4.2: Global Run Time Reconfiguration [2]

    Refer to figure 4.3. Here the first and second configurations share a
    common partition A, whereas partition B of the first configuration is
    reconfigured into C and D in the second configuration. The configuration
    of the FPGA is changed only locally: the configurations retain at least
    one common part throughout, hence the name local reconfiguration.

    Figure 4.3: Local Run Time Reconfiguration [2]

    This kind of reconfiguration scheme is used to change the control
    algorithm effectively and also to incorporate a certain degree of fault
    tolerance into the control system. Suppose a sensor fails while
    configuration A, B of figure 4.3 is running. Partition B can then be
    rewired into C and D such that the failed sensor is effectively wired out
    of the circuit and the control system continues to function properly.
    Adaptation mechanisms can also be incorporated into the scheme. An
    implementation of a fuzzy logic controller on a reconfigurable FPGA system
    for the truck backer-upper application, using the above mentioned
    reconfiguration techniques, can be found in [2].
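
    The sensor failure scenario described above can be illustrated with a
    small behavioural model. It is a sketch under stated assumptions, not the
    implementation of [2]: the partitions are modelled as plain Python
    functions, the sensor names s1 and s2, the averaging in partition B and
    the proportional law in partition A are all hypothetical placeholders.

```python
from typing import Dict, Optional

Readings = Dict[str, Optional[float]]  # sensor name -> value; None marks a failed sensor


def partition_a(measurement: float, setpoint: float = 1.0, gain: float = 2.0) -> float:
    """Common partition A: a placeholder proportional control law."""
    return gain * (setpoint - measurement)


def partition_b(readings: Readings) -> float:
    """Partition B: averages both sensors, so it needs both to be healthy."""
    return (readings["s1"] + readings["s2"]) / 2.0


def partition_c(readings: Readings) -> Dict[str, float]:
    """Partition C: passes on only the sensors that still work."""
    return {name: value for name, value in readings.items() if value is not None}


def partition_d(healthy: Dict[str, float]) -> float:
    """Partition D: averages whatever partition C passed on."""
    return sum(healthy.values()) / len(healthy)


class Fabric:
    """Toy model of an FPGA as a set of named, loaded partitions."""

    def __init__(self):
        self.loaded = {"A": partition_a, "B": partition_b}

    def global_reconfigure(self, new_config):
        # Global run time reconfiguration: every partition is replaced.
        self.loaded = dict(new_config)

    def local_reconfigure(self, remove, add):
        # Local run time reconfiguration: only the named partitions change;
        # all other partitions (here A) are left untouched.
        for name in remove:
            del self.loaded[name]
        self.loaded.update(add)

    def step(self, readings: Readings) -> float:
        faulty = any(value is None for value in readings.values())
        if faulty and "B" in self.loaded:
            # Wire the failed sensor out by swapping B for C and D.
            self.local_reconfigure(["B"], {"C": partition_c, "D": partition_d})
        if "B" in self.loaded:
            measurement = self.loaded["B"](readings)
        else:
            measurement = self.loaded["D"](self.loaded["C"](readings))
        return self.loaded["A"](measurement)


if __name__ == "__main__":
    fabric = Fabric()
    print(fabric.step({"s1": 0.4, "s2": 0.6}), sorted(fabric.loaded))
    print(fabric.step({"s1": 0.4, "s2": None}), sorted(fabric.loaded))
```

    After the second call the loaded partitions are A, C and D. Partition A
    is the same object before and after the fault, which mirrors the common
    partition retained in local reconfiguration.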

    4.2 Co-design Implementation of Controllers

    The next important and evolving aspect of FPGA based control is
    hardware/software co-design using FPGAs. An application of this approach
    in the area of Model Predictive Control (MPC) is illustrated in [3].
    Hardware/software co-design is a paradigm in which a microprocessor or
    microcontroller is embedded in an FPGA. Control algorithms that require a
    large number of computationally involved operations, such as matrix
    manipulations, cannot be implemented effectively on a setup based on a
    single microprocessor, as the microprocessor gets bogged down while
    performing these operations. Here the parallel architecture of the FPGA
    can be exploited to develop a matrix coprocessor for these computations,
    while the general purpose microprocessor embedded in the chip performs the
    other operations. This is more efficient and still retains the system on
    chip nature of the design, because the processor and the FPGA fabric come
    bundled on a single chip. The model predictive controller suggested by [3]
    gives precisely such an implementation. It involves co-design with a
    microcontroller and an FPGA. The microcontroller bears the load of the
    higher level operations of the algorithm, while the matrix processor
    performs the computationally intensive ones. The matrix processor does not
    have to deal with fetching instructions and data, as that aspect is taken
    care of by the microcontroller. Currently the hardware implementation
    consists of two different boards, one accommodating the microcontroller
    and the other the FPGA, that communicate via a 20 bit bus (a 16-bit data
    bus and 4 control signals). Since both are described in an HDL, they can
    later be targeted to a single FPGA in order to arrive at a System On
    Programmable Chip (SOPC) design.
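
    The division of labour between the microcontroller and the matrix
    processor can be sketched as follows. The model is illustrative only and
    is not taken from [3]: the 4 control signals are assumed to carry a
    command code, the command names are hypothetical, and values are assumed
    to travel over the 16-bit data bus as Q8.8 fixed point words.

```python
FRAC_BITS = 8  # assumed Q8.8 fixed point so that each value fits a 16-bit word

# Hypothetical command codes carried on the 4 control signals.
LOAD_MATRIX, LOAD_VECTOR, START = 0x1, 0x2, 0x3


def to_word(x: float) -> int:
    """Encode a real number as a 16-bit two's complement Q8.8 word."""
    return int(round(x * (1 << FRAC_BITS))) & 0xFFFF


def from_word(word: int) -> float:
    """Decode a 16-bit two's complement Q8.8 word."""
    if word & 0x8000:
        word -= 0x10000
    return word / (1 << FRAC_BITS)


class MatrixCoprocessor:
    """Behavioural stand-in for the FPGA matrix processor.

    It never fetches instructions or data itself; it only reacts to
    transfers that the microcontroller drives over the bus.
    """

    def __init__(self, rows: int, cols: int):
        self.rows, self.cols = rows, cols
        self.matrix, self.vector, self.result = [], [], []

    def bus_write(self, control: int, data: int) -> None:
        if not (0 <= control < 16 and 0 <= data < (1 << 16)):
            raise ValueError("transfer does not fit the 4 + 16 bit bus")
        if control == LOAD_MATRIX:
            self.matrix.append(from_word(data))   # row-major order
        elif control == LOAD_VECTOR:
            self.vector.append(from_word(data))
        elif control == START:
            # In hardware the row products could be computed in parallel.
            self.result = [
                sum(self.matrix[r * self.cols + c] * self.vector[c]
                    for c in range(self.cols))
                for r in range(self.rows)
            ]
            self.matrix, self.vector = [], []

    def bus_read(self) -> int:
        return to_word(self.result.pop(0))


def microcontroller_matvec(coproc, matrix, vector):
    """Microcontroller side: feeds operands, starts the job, reads results."""
    for row in matrix:
        for x in row:
            coproc.bus_write(LOAD_MATRIX, to_word(x))
    for x in vector:
        coproc.bus_write(LOAD_VECTOR, to_word(x))
    coproc.bus_write(START, 0)
    return [from_word(coproc.bus_read()) for _ in matrix]


if __name__ == "__main__":
    coproc = MatrixCoprocessor(rows=2, cols=2)
    print(microcontroller_matvec(coproc, [[1.5, -2.0], [0.25, 4.0]], [2.0, 0.5]))
```

    Because the coprocessor is reached only through bus_write and bus_read,
    the same interface would hold whether the two parts sit on separate
    boards or are later merged onto a single FPGA.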

  • Chapter 5

    Conclusion

    An introduction to the architecture of an FPGA was presented in the first
    chapter, along with a mention of the two main kinds of commercially
    available FPGA: coarse grained and fine grained. Features of modern FPGAs
    such as Low Voltage Differential Signalling (LVDS) and JTAG compliance
    were also looked at. The next chapter dealt with the need for embedded
    control using FPGAs by comparing system design for control purposes using
    an FPGA with system design using DSP processors. It also focused on the
    main advantages of using an FPGA, such as reconfigurability and
    parallelism, along with reduced loop cycle time and reduced jitter. The
    usefulness of the FPGA as a prototyping device was illustrated, and the
    role played by FPGA based system design towards the ultimate goal of an
    ASIC based implementation of the controller was dealt with. The
    advantages of FPGA based design, such as reduced design cycle time and
    lower power consumption, were also mentioned.

    The third chapter looked at the implementation of a PID controller as
    suggested by Wei Zhao et al. [1], first in a conventional way and then
    using Distributed Arithmetic concepts as suggested by Y. F. Chang et al.
    [7]. This was done in order to obtain a feel for the issues and trade-offs
    that one faces while designing embedded controllers using FPGAs. The PID
    controller was specifically chosen because the control algorithm is
    relatively simple and it gives an idea of the issues that must be taken
    care of when attempting system on chip implementations. The next chapter
    looked at recent trends in FPGA based control, namely reconfigurable
    FPGAs. Compile time and run time reconfiguration strategies were
    contrasted, and global and local reconfiguration were compared. These
    offer wide scope for implementing fault tolerant and adaptive control
    strategies in an SOC environment. In the following section the concept of
    co-design was looked at with reference to the implementation of an SOPC
    MPC controller, as suggested by [3]. This method, in concept, gives a way
    of implementing a Model Predictive Controller on an FPGA target.

  • Bibliography

    [1] Wei Zhao, Byung Hwa Kim, Amy C. Larson and Richard M. Voyles.
    FPGA implementation of closed loop control system for small scale
    robot. In Proceedings, 12th International Conference on Advanced
    Robotics (ICAR 05), pages 70-77, 2005.

    [2] Daijin Kim. An implementation of fuzzy logic controller on the
    reconfigurable FPGA system. IEEE Transactions on Industrial Electronics,
    Vol. 47, No. 3, 2000.

    [3] Panagiotis Vouzis, Leonidas G. Bleris, Mayuresh V. Kothare and Mark
    Arnold. Towards a co-design implementation of a system for model
    predictive control. In Proceedings, Annual Meeting, American Institute
    of Chemical Engineers, Cincinnati Convention Center, Cincinnati, OH,
    November 2005.

    [4] National Instruments: http://www.ni.com. FPGA based control: Mil-

    lions of transistors at your command, 2004.

    [5] Xilinx. Comparing and Contrasting FPGA and Microprocessor System
    Design and Development, July 2004.

    [6] Wikipedia: http://www.wikipedia.org. Field programmable gate array,

    2005.

    [7] Y. F. Chang, M. Moallem and W. Wang. Efficient implementation of PID
    control algorithm using FPGA technology. In Proceedings, 43rd IEEE
    Conference on Decision and Control, 2004.

    [8] Stephen Brown. FPGA and CPLD architectures: A Tutorial. IEEE

    Design And Test Of Computers, 1996.

    [9] Charles H Roth Jr. Digital System Design Using VHDL. Brooks/Cole,

    1998.

    [10] K. J. Astrom and B. Wittenmark. Computer Controlled Systems. Theory

    and Practice. Prentice-Hall, Inc., Upper Saddle River, NJ, 3rd edition,

    1997.

    [11] R. Muthuraman, A. Fajebe and S. Commuri. Intelligence in embedded
    control - a case study. Region 5 Conference: Annual Technical and
    Leadership Workshop, pages 125-130, April 2004.
