Chapter 3 CPUs

Transcript of Chapter 3 CPUs

Page 1: Chapter 3 CPUs

Chapter 3

CPUs

Professor Chung-Ta King, Department of Computer Science, National Tsing Hua University

(Slides are taken from the textbook slides)

Page 2: Chapter 3 CPUs

CPUs-2

Outline

Input and Output Mechanisms
Supervisor Mode, Exceptions, Traps
Memory Management
Performance and Power Consumption

Page 3: Chapter 3 CPUs

CPUs-3

I/O devices

An embedded system usually includes some input/output devices.
A typical I/O interface to the CPU:

[Figure: CPU connected to the device's status register and data register, which control the device mechanism]

Page 4: Chapter 3 CPUs

CPUs-4

Application: 8251 UART

Universal asynchronous receiver/transmitter (UART): provides serial communication.
8251 functions are usually integrated into a standard PC interface chip.
Allows many communication parameters:
Baud (bit) rate: e.g. 9600 bits/sec
Number of bits per character
Parity/no parity
Even/odd parity
Length of stop bit (1, 1.5, 2 bits)

Page 5: Chapter 3 CPUs

CPUs-5

8251 CPU interface

[Figure: CPU connected to the 8251 through an 8-bit status register and an 8-bit data register; the 8251's serial port transmits/receives each character as a start bit, data bits 0 through n-1, and a stop bit, with no character on the line between frames]

Page 6: Chapter 3 CPUs

CPUs-6

Programming I/O

Two types of instructions can support I/O:
special-purpose I/O instructions
memory-mapped load/store instructions
Intel x86 provides in, out instructions.
Most other CPUs use memory-mapped I/O.
Memory-mapped I/O assigns addresses to I/O registers.

Page 7: Chapter 3 CPUs

CPUs-7

ARM memory-mapped I/O

Define a location for the device:

DEV1    EQU 0x1000

Read/write code:

    LDR r1, =DEV1    ; set up device address
    LDR r0, [r1]     ; read DEV1
    MOV r0, #8       ; set up value to write
    STR r0, [r1]     ; write value to device

Page 8: Chapter 3 CPUs

CPUs-8

SHARC memory-mapped I/O

Device must be in external memory space (above 0x400000). Use DM to access it:

I0 = 0x400000;
M0 = 0;
R1 = DM(I0,M0);

Page 9: Chapter 3 CPUs

CPUs-9

Peek and poke

Traditional C interfaces:

int peek(char *location) {
    return *location;
}

void poke(char *location, char newval) {
    (*location) = newval;
}
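These work in the textbook's setting, but on real hardware the compiler must not cache, merge, or reorder device-register accesses. A minimal sketch of volatile-qualified variants; the register address is the slide's example value, not a real device:

```c
#include <stdint.h>

/* Volatile-qualified peek/poke: volatile tells the compiler that each
   access has a side effect, so device-register reads and writes are
   never optimized away. 0x1000 is the slide's example address. */
#define DEV1_REG ((volatile uint8_t *)0x1000)

static inline uint8_t peek8(volatile uint8_t *location) {
    return *location;              /* one real load per call */
}

static inline void poke8(volatile uint8_t *location, uint8_t newval) {
    *location = newval;            /* one real store per call */
}
```

On a host machine dereferencing DEV1_REG would fault; the functions themselves work on any volatile byte.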

Page 10: Chapter 3 CPUs

CPUs-10

Busy/wait output

Simplest way to program a device: use instructions to test when the device is ready.

#define OUT_CHAR   0x1000  /* device data register */
#define OUT_STATUS 0x1001  /* device status register */

current_char = mystring;
while (*current_char != '\0') {
    poke(OUT_CHAR, *current_char);
    while (peek(OUT_STATUS) != 0);  /* busy wait */
    current_char++;
}

Page 11: Chapter 3 CPUs

CPUs-11

Simultaneous busy/wait input and output

while (TRUE) {
    /* read */
    while (peek(IN_STATUS) == 0);
    achar = (char)peek(IN_DATA);
    /* write */
    poke(OUT_DATA, achar);
    poke(OUT_STATUS, 1);
    while (peek(OUT_STATUS) != 0);
}

Page 12: Chapter 3 CPUs

CPUs-12

Interrupt I/O

Busy/wait is very inefficient:
CPU can't do other work while testing the device
Hard to do simultaneous I/O
Interrupts allow a device to change the flow of control in the CPU, causing a subroutine call to handle the device.

[Figure: CPU with PC and IR connected over data/address lines to the device's status and data registers, plus interrupt request and interrupt acknowledge lines between device and CPU]

Page 13: Chapter 3 CPUs

CPUs-13

Interrupt behavior

Based on the subroutine call mechanism. An interrupt forces the next instruction to be a subroutine call to a predetermined location. The return address is saved so the foreground program can resume.

Page 14: Chapter 3 CPUs

CPUs-14

Interrupt physical interface

CPU and device are connected by the CPU bus. CPU and device handshake:
device asserts interrupt request
CPU checks the interrupt request line at the beginning of each instruction cycle
CPU asserts interrupt acknowledge when it can handle the interrupt
CPU fetches the next instruction from the interrupt handler routine

Page 15: Chapter 3 CPUs

CPUs-15

Example: character I/O handlers

void input_handler() {
    achar = peek(IN_DATA);
    gotchar = TRUE;
    poke(IN_STATUS, 0);
}

void output_handler() {
}

Page 16: Chapter 3 CPUs

CPUs-16

Example: interrupt-driven main program

main() {
    while (TRUE) {
        if (gotchar) {
            poke(OUT_DATA, achar);
            poke(OUT_STATUS, 1);
            gotchar = FALSE;
        }
    }
}

Page 17: Chapter 3 CPUs

CPUs-17

Example: interrupt I/O with buffers

Queue for characters, managed by head and tail pointers:

[Figure: circular character queue with head and tail pointers, shown before and after a character 'a' is added]

Page 18: Chapter 3 CPUs

CPUs-18

Buffer-based input handler

void input_handler() {
    char achar;
    if (full_buffer()) error = 1;
    else {
        achar = peek(IN_DATA);
        add_char(achar);
    }
    poke(IN_STATUS, 0);
    if (nchars == 1) {
        poke(OUT_DATA, remove_char());
        poke(OUT_STATUS, 1);
    }
}
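The handler above relies on helpers (full_buffer, add_char, remove_char, nchars) that the slides never show. A minimal sketch of a matching circular queue; the capacity of 8 is an arbitrary assumption:

```c
#include <stdbool.h>

#define QUEUE_SIZE 8          /* assumed capacity */

static char queue[QUEUE_SIZE];
static int head = 0;          /* index of next character to remove */
static int tail = 0;          /* index of next free slot */
static int nchars = 0;        /* characters currently queued */

bool full_buffer(void)  { return nchars == QUEUE_SIZE; }
bool empty_buffer(void) { return nchars == 0; }

void add_char(char c) {
    queue[tail] = c;
    tail = (tail + 1) % QUEUE_SIZE;
    nchars++;
}

char remove_char(void) {
    char c = queue[head];
    head = (head + 1) % QUEUE_SIZE;
    nchars--;
    return c;
}
```

In real interrupt code nchars is shared between the handler and the foreground program, so foreground accesses must disable interrupts around the read-modify-write.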

Page 19: Chapter 3 CPUs

CPUs-19

Debugging interrupt code

What if you forget to save and restore registers in the handler?
The foreground program can exhibit mysterious bugs.
Bugs will be hard to repeat: they depend on interrupt timing.

Page 20: Chapter 3 CPUs

CPUs-20

Priorities and Vectors

Two mechanisms allow us to make interrupts more specific:
Priorities determine which interrupt gets the CPU first
Vectors determine what code (handler routine) is called for each type of interrupt
The mechanisms are orthogonal: most CPUs provide both.

Page 21: Chapter 3 CPUs

CPUs-21

Prioritized interrupts

[Figure: devices 1..n assert interrupt request lines L1..Ln to the CPU; the CPU returns an interrupt acknowledge]

Page 22: Chapter 3 CPUs

CPUs-22

Interrupt prioritization

Masking: an interrupt with priority lower than the current priority is not recognized until the pending interrupt is complete.
Non-maskable interrupt (NMI): highest priority, never masked. Often used for power-down.

Page 23: Chapter 3 CPUs

CPUs-23

Interrupt Vectors

Allow different devices to be handled by different code.
Require an additional vector line from device to CPU.

[Figure: after the interrupt request/acknowledge handshake, the device sends a vector to the CPU; the vector indexes an interrupt vector table whose entries point to handlers 0..3]

Page 24: Chapter 3 CPUs

CPUs-24

Generic interrupt mechanism

Assume priority selection is handled before this point.

[Flowchart: continue execution until an interrupt is present; ignore it unless its priority exceeds the current priority; otherwise acknowledge, then wait for the vector (a timeout causes a bus error); finally call table[vector]]

Page 25: Chapter 3 CPUs

CPUs-25

Interrupt sequence

CPU acknowledges request
Device sends vector
CPU calls handler
Software processes request
CPU restores state to foreground program

Page 26: Chapter 3 CPUs

CPUs-26

Sources of interrupt overhead

Handler execution time
Interrupt mechanism overhead
Register save/restore
Pipeline-related penalties
Cache-related penalties

Page 27: Chapter 3 CPUs

CPUs-27

ARM interrupts

ARM7 supports two types of interrupts:
Fast interrupt requests (FIQs)
Interrupt requests (IRQs)
FIQ takes priority over IRQ.
The interrupt table starts at location 0.

Page 28: Chapter 3 CPUs

CPUs-28

ARM interrupt procedure

CPU actions:
Save PC; copy CPSR to SPSR.
Force bits in CPSR to record the interrupt.
Force PC to vector.
Handler responsibilities:
Restore proper PC.
Restore CPSR from SPSR.
Clear interrupt disable flags.

Page 29: Chapter 3 CPUs

CPUs-29

ARM interrupt latency

Worst-case latency to respond to an interrupt is 27 cycles:
Two cycles to synchronize the external request.
Up to 20 cycles to complete the current instruction.
Three cycles for data abort.
Two cycles to enter the interrupt handling state.

Page 30: Chapter 3 CPUs

CPUs-30

SHARC interrupt structure

Interrupts are vectored and prioritized.
Priorities are fixed: reset highest, user SW interrupt 3 lowest.
Vectors are also fixed; each vector is an offset in the vector table. The table starts at 0x20000 in internal memory, 0x40000 in external memory.

Page 31: Chapter 3 CPUs


SHARC interrupt sequence

Start: must be executing or in IDLE/IDLE16.
1. Output appropriate interrupt vector address.
2. Push PC value onto PC stack.
3. Set bit in interrupt latch register.
4. Set IMASKP to current nesting state.

Page 32: Chapter 3 CPUs


SHARC interrupt return

Initiated by the RTI instruction.
1. Return to address at top of PC stack.
2. Pop PC stack.
3. Pop status stack if appropriate.
4. Clear bits in interrupt latch register and IMASKP.

Page 33: Chapter 3 CPUs


SHARC interrupt performance

Three stages of response:
1 cycle: synchronization and latching
1 cycle: recognition
2 cycles: branching to vector
Total latency: 3 cycles.
Multiprocessor vector interrupts have 6-cycle latency.

Page 34: Chapter 3 CPUs

CPUs-34

Outline

Input and Output Mechanisms
Supervisor Mode, Exceptions, Traps
Memory Management
Performance and Power Consumption

Page 35: Chapter 3 CPUs

CPUs-35

Supervisor mode

May want to provide protective barriers between programs, e.g., to avoid memory corruption.
Need supervisor mode to manage the various programs.
SHARC does not have a supervisor mode.

Page 36: Chapter 3 CPUs

CPUs-36

ARM supervisor mode

Use the SWI instruction to enter supervisor mode, similar to a subroutine call:
SWI CODE_1
Sets PC to 0x08.
The argument to SWI is passed to supervisor-mode code to request various services.
Saves CPSR in SPSR.

Page 37: Chapter 3 CPUs

CPUs-37

Exception

Exception: internally detected error.
Exceptions are synchronous with instructions but unpredictable.
Build the exception mechanism on top of the interrupt mechanism.
Exceptions are usually prioritized and vectorized.
A single instruction may generate more than one exception.

Page 38: Chapter 3 CPUs

CPUs-38

Trap

Trap (software interrupt): an exception generated by an instruction. Used to call supervisor mode.
ARM uses the SWI instruction for traps.
SHARC offers three levels of software interrupts, called by setting bits in the IRPTL register.

Page 39: Chapter 3 CPUs

CPUs-39

Co-processor

Co-processor: an added function unit that is called by instruction, e.g. for floating-point operations.
A co-processor instruction can cause a trap and be handled by software (if no such co-processor exists).
ARM allows up to 16 co-processors.

Page 40: Chapter 3 CPUs

CPUs-40

Outline

Input and Output Mechanisms
Supervisor Mode, Exceptions, Traps
Memory Management
Performance and Power Consumption

Page 41: Chapter 3 CPUs

CPUs-41

Caches and CPUs

[Figure: CPU sends addresses to a cache controller and exchanges data with it; the controller checks the cache and falls back to main memory on a miss]

Memory access speed is falling further and further behind CPU speed.
Cache: reduce the speed gap.

Page 42: Chapter 3 CPUs

CPUs-42

Cache operation

May have caches for:
instructions
data
data + instructions (unified)
Memory access time is no longer deterministic.
Cache hit: required location is in cache.
Cache miss: required location is not in cache.
Working set: set of locations used by the program in a time interval.

Page 43: Chapter 3 CPUs

CPUs-43

Types of cache misses

Compulsory (cold): location has never been accessed.
Capacity: working set is too large.
Conflict: multiple locations in the working set map to the same cache entry.

Page 44: Chapter 3 CPUs

CPUs-44

Memory system performance

Cache performance benefits:
Keep frequently accessed locations in the fast cache.
Cache retrieves more than one word at a time, so sequential accesses are faster after the first access.
Let h = cache hit rate, t_cache = cache access time, t_main = main memory access time.
Average memory access time:

t_av = h * t_cache + (1 - h) * t_main
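The formula can be checked with made-up numbers; a 90% hit rate, 1-cycle cache, and 50-cycle main memory are assumptions for illustration:

```c
/* Average memory access time: t_av = h*t_cache + (1-h)*t_main.
   E.g. 0.9*1 + 0.1*50 = 5.9 cycles: even a 10% miss rate
   dominates the average. */
double avg_access_time(double h, double t_cache, double t_main) {
    return h * t_cache + (1.0 - h) * t_main;
}
```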

Page 45: Chapter 3 CPUs

CPUs-45

Multi-level caches

Let h1 = L1 hit rate, h2 = rate of accesses that miss in L1 but hit in L2. Average memory access time:

t_av = h1 * t_L1 + h2 * t_L2 + (1 - h1 - h2) * t_main

[Figure: CPU backed by an L1 cache, then an L2 cache, then main memory]
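With the hit rates defined per access as above, the two-level formula extends the single-level one; the numbers below are assumptions for illustration:

```c
/* Two-level average access time:
   t_av = h1*t_L1 + h2*t_L2 + (1 - h1 - h2)*t_main,
   where h1 and h2 are fractions of ALL accesses that hit in L1
   and in L2 respectively.
   E.g. h1=0.9, h2=0.08, t_L1=1, t_L2=10, t_main=50:
   0.9 + 0.8 + 0.02*50 = 2.7 cycles. */
double avg_access_time2(double h1, double h2,
                        double t_l1, double t_l2, double t_main) {
    return h1 * t_l1 + h2 * t_l2 + (1.0 - h1 - h2) * t_main;
}
```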

Page 46: Chapter 3 CPUs

CPUs-46

Replacement policies

Replacement policy: strategy for choosing which cache entry to throw out to make room for a new memory location.
Two popular strategies:
Random
Least-recently used (LRU)

Page 47: Chapter 3 CPUs

CPUs-47

Cache organizations

Fully associative: any memory location can be stored anywhere in the cache (almost never implemented).
Direct-mapped: each memory location maps onto exactly one cache entry.
N-way set-associative: each memory location can go into one of N sets.

Page 48: Chapter 3 CPUs

CPUs-48

Direct-mapped cache

[Figure: the address splits into tag, index, and offset; the index selects a cache block holding a valid bit, a tag (e.g. 0xabcd), and a block of data bytes; a hit is signaled when the stored tag equals the address tag and the valid bit is set, and the offset selects the byte value]

Page 49: Chapter 3 CPUs

CPUs-49

Direct-mapped cache locations

Many locations map onto the same cache block. Conflict misses are easy to generate:
Array a[] uses locations 0, 1, 2, …
Array b[] uses locations 1024, 1025, 1026, …
Operation a[i] + b[i] generates conflict misses if locations 0 and 1024 map to the same cache block.
Write operations:
Write-through: immediately copy write to main memory.
Write-back: write to main memory only when the location is removed from the cache.
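The a[]/b[] conflict can be seen by computing cache indices directly. A sketch assuming a 1 Kbyte direct-mapped cache with 16-byte blocks (the geometry is an assumption):

```c
#include <stdint.h>

#define OFFSET_BITS 4   /* 16-byte blocks */
#define INDEX_BITS  6   /* 64 lines -> 1 Kbyte cache */

/* Which cache line an address maps to. */
uint32_t cache_index(uint32_t addr) {
    return (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
}

/* Tag stored alongside the line to identify the address. */
uint32_t cache_tag(uint32_t addr) {
    return addr >> (OFFSET_BITS + INDEX_BITS);
}
```

Addresses 0 (a[0]) and 1024 (b[0]) yield the same index but different tags, so each access evicts the other: a conflict miss on every iteration of the a[i] + b[i] loop.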

Page 50: Chapter 3 CPUs

CPUs-50

Set-associative cache

A set of direct-mapped caches:

[Figure: sets 1..n checked in parallel; their hit and data outputs are combined]

Page 51: Chapter 3 CPUs

CPUs-51

Memory Management Units

A memory management unit (MMU) translates addresses.
MMUs are not common in embedded systems, which rarely have secondary storage.

[Figure: CPU issues logical addresses; the MMU translates them into physical addresses for main memory; data moves between main memory and secondary storage by swapping]

Page 52: Chapter 3 CPUs

CPUs-52

Memory management tasks

Allows programs to move in physical memory during execution.
Allows virtual memory:
memory images kept in secondary storage;
images returned to main memory on demand during execution.
Page fault: a request for a location not resident in memory, which generates an exception.

Page 53: Chapter 3 CPUs

CPUs-53

Address translation

Requires some sort of register/table to allow arbitrary mappings of logical to physical addresses.
Two basic schemes:
segmented
paged
Segmentation and paging can be combined (as in the x86).

Page 54: Chapter 3 CPUs

CPUs-54

Segment address translation

[Figure: the CPU's logical address is added to the segment base address to form the physical address sent to memory; a range check against the segment's lower and upper bounds raises a range error on violation]
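The figure's datapath is only a few lines of code. A sketch; the struct layout and error convention are assumptions:

```c
#include <stdint.h>

typedef struct {
    uint32_t base;   /* segment base address */
    uint32_t limit;  /* highest valid logical address in the segment */
} segment;

/* Translate a logical address: add it to the segment base after a
   range check. Returns 0 on success, -1 on range error. */
int seg_translate(const segment *s, uint32_t logical, uint32_t *physical) {
    if (logical > s->limit)
        return -1;                 /* range check failed */
    *physical = s->base + logical;
    return 0;
}
```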

Page 55: Chapter 3 CPUs

CPUs-55

Page address translation

[Figure: the CPU's logical address is split into page number and offset; the page number indexes the page table to get the page base, which is concatenated with the offset to form the physical address sent to memory]
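The same lookup as code, assuming a flat page table and 4 Kbyte pages (both assumptions for illustration):

```c
#include <stdint.h>

#define PAGE_BITS 12   /* 4 Kbyte pages */

/* page_table[i] holds the physical page number for logical page i. */
uint32_t page_translate(const uint32_t *page_table, uint32_t logical) {
    uint32_t page   = logical >> PAGE_BITS;
    uint32_t offset = logical & ((1u << PAGE_BITS) - 1);
    /* concatenate the page base with the unchanged offset */
    return (page_table[page] << PAGE_BITS) | offset;
}
```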

Page 56: Chapter 3 CPUs

CPUs-56

Page table organizations

[Figure: a flat page table indexes page descriptors directly; a tree organization reaches each page descriptor through multiple table levels]

Page 57: Chapter 3 CPUs

CPUs-57

Caching address translations

Large translation tables require main memory accesses.
TLB: a cache for address translations. Typically small.

Page 58: Chapter 3 CPUs

CPUs-58

ARM & SHARC memory management

ARM memory region types:
section: 1 Mbyte block
large page: 64 kbytes
small page: 4 kbytes
An address is marked as section-mapped or page-mapped.
Two-level translation scheme.
SHARC does not have an MMU.

Page 59: Chapter 3 CPUs

CPUs-59

ARM address translation

[Figure: the translation table base register is concatenated with the address's first index to select a first-level table descriptor; that descriptor is concatenated with the second index to select a second-level descriptor, which combines with the offset to form the physical address]

Page 60: Chapter 3 CPUs

CPUs-60

Outline

Input and Output Mechanisms
Supervisor Mode, Exceptions, Traps
Memory Management
Performance and Power Consumption

Page 61: Chapter 3 CPUs

CPUs-61

Performance Acceleration

Three factors can substantially improve system performance:
Pipelining
Superscalar execution
Caching
Need to take advantage of them where possible.
But they also complicate performance analysis.

Page 62: Chapter 3 CPUs

CPUs-62

Pipelining

Several instructions are executed simultaneously at different stages of completion.
Various conditions can cause pipeline stalls that reduce utilization:
branches
memory system delays
data hazards, etc.
Both ARM and SHARC have 3-stage pipelines:
fetch instruction from memory;
decode opcode and operands;
execute.

Page 63: Chapter 3 CPUs

CPUs-63

ARM pipeline execution

              cycle 1   cycle 2   cycle 3   cycle 4   cycle 5
add r0,r1,#5  fetch     decode    execute
sub r2,r3,r6            fetch     decode    execute
cmp r2,#3                         fetch     decode    execute

Page 64: Chapter 3 CPUs

CPUs-64

Pipeline performance

Latency: time it takes for an instruction to get through the pipeline.
Throughput: number of instructions executed per time period.
Pipelining increases throughput without reducing latency.
Pipeline stall: if a step cannot be completed in the same amount of time, the pipeline stalls. Bubbles introduced by a stall increase latency and reduce throughput.

Page 65: Chapter 3 CPUs

CPUs-65

Data stall

Multi-cycle execution causes a data stall (LDMIA: load multiple):

                  cycle 1   cycle 2   cycle 3    cycle 4    cycle 5   cycle 6
ldmia r0,{r2,r3}  fetch     decode    ex ld r2   ex ld r3
sub r2,r3,r6                fetch     decode                ex sub
cmp r2,#3                             fetch      decode               ex cmp

Page 66: Chapter 3 CPUs

CPUs-66

Control stalls

Branches often introduce stalls (branch penalty).
Stall time may depend on whether the branch is taken.
May have to squash instructions that already started executing.
Don't know what to fetch until the condition is evaluated.

Page 67: Chapter 3 CPUs

CPUs-67

ARM pipelined branch

                  cycle 1   cycle 2   cycle 3   cycle 4   cycle 5   cycle 6   cycle 7
bne foo           fetch     decode    ex bne    ex bne    ex bne
sub r2,r3,r6                fetch     decode    (squashed)
foo add r0,r1,r2                                          fetch     decode    ex add

Page 68: Chapter 3 CPUs

CPUs-68

Delayed branch

To increase pipeline efficiency, a delayed branch mechanism requires that the n instructions after a branch always be executed, whether or not the branch is taken.
SHARC supports delayed and non-delayed branches:
Specified by a bit in the branch instruction
2-instruction branch delay slot

Page 69: Chapter 3 CPUs

CPUs-69

Example: SHARC code scheduling

L1=5;
DM(I0,M1)=R1;
L8=8;
DM(I8,M9)=R2;

The CPU cannot use a DAG on the cycle just after loading the DAG's register, because both need the same internal bus. The CPU performs a NOP between the register assignment and the DM access.

Page 70: Chapter 3 CPUs

CPUs-70

Rescheduled SHARC code

L1=5;
L8=8;
DM(I0,M1)=R1;
DM(I8,M9)=R2;

Avoids two NOP cycles.

Page 71: Chapter 3 CPUs

CPUs-71

Example: ARM execution time

Determine the execution time of an FIR filter:

for (i = 0; i < N; i++)
    f = f + c[i] * x[i];

Only the branch in the loop test may take more than one cycle: BLT loop takes 1 cycle in the best case, 3 in the worst case.
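The loop in full, as plain C; wrapping it in a function (name and int arithmetic are assumptions) makes the code the timing analysis is applied to concrete:

```c
/* FIR filter: f = sum of c[i]*x[i] over N taps. */
int fir(const int c[], const int x[], int N) {
    int f = 0;
    for (int i = 0; i < N; i++)
        f = f + c[i] * x[i];   /* one multiply-accumulate per tap */
    return f;
}
```

Counting the instructions in the compiled loop body, plus the 1-to-3-cycle BLT, gives the per-iteration cycle count used in the analysis.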

Page 72: Chapter 3 CPUs

CPUs-72

Superscalar execution

A superscalar processor can execute several instructions per cycle, using multiple pipelined data paths.
Programs execute faster, but it is harder to determine how much faster.

Page 73: Chapter 3 CPUs

CPUs-73

Superscalar Processor

[Figure: a control unit dispatches instruction 1 and instruction 2 to two instruction units that share a common register file]

Page 74: Chapter 3 CPUs

CPUs-74

Data dependencies

Execution time depends on operands, not just the opcode.
A superscalar CPU checks data dependencies dynamically:

add r2,r0,r1
add r3,r2,r5

[Figure: dataflow graph; r0 and r1 feed the first add producing r2, which together with r5 feeds the second add producing r3: a data dependency on r2]

Page 75: Chapter 3 CPUs

CPUs-75

Memory system performance

Caches introduce indeterminacy in execution time, which depends on the order of execution.
Cache miss penalty: added time due to a cache miss.
Several reasons for a miss: compulsory, conflict, capacity.

Page 76: Chapter 3 CPUs

CPUs-76

CPU power consumption

Most modern CPUs are designed with power consumption in mind to some degree.
Power vs. energy:
Power is energy consumption per unit time.
Heat depends on power consumption.
Battery life depends on energy consumption.

Page 77: Chapter 3 CPUs

CPUs-77

CMOS power consumption

Voltage drops: power consumption proportional to V².
Toggling: more activity means more power.
Leakage: basic circuit characteristic; can be eliminated only by disconnecting power.

Page 78: Chapter 3 CPUs

CPUs-78

CPU power-saving strategies

Reduce power supply voltage.
Run at lower clock frequency.
Disable function units with control signals when not in use.
Disconnect parts from the power supply when not in use to eliminate leakage currents.

Page 79: Chapter 3 CPUs

CPUs-79

Power management styles

Static power management: does not depend on CPU activity.
Example: user-activated power-down mode
Dynamic power management: based on CPU activity.
Example: disabling function units

Page 80: Chapter 3 CPUs

CPUs-80

Application: PowerPC 603

Provides doze, nap, and sleep modes for static power management.
Dynamic power management features:
Can shut down unused execution units.
Cache organized into subarrays to minimize the amount of active circuitry.

Page 81: Chapter 3 CPUs

CPUs-81

PowerPC 603 activity

Percentage of time units are idle for SPEC integer/floating-point benchmarks:

unit              SPECint92   SPECfp92
D cache           29%         28%
I cache           29%         17%
load/store        35%         17%
fixed-point       38%         76%
floating-point    99%         30%
system register   89%         97%

Idle units are turned off by switching off their clocks.
Pipeline stages can be turned on or off.

Page 82: Chapter 3 CPUs

CPUs-82

Power-down costs

Going into a power-down mode costs:
time
energy
Must determine whether entering the mode is worthwhile.
Can model CPU power states with a power state machine.
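The worthwhile-or-not decision reduces to an energy comparison. A sketch under an assumed model: transitions cost t_overhead at power p_overhead, and the mode saves (p_run - p_mode) for the t_idle the CPU stays in it:

```c
#include <stdbool.h>

/* Entering a power-down mode pays off when the energy saved while
   idle exceeds the energy spent entering and leaving the mode.
   Powers in mW and times in ms, so both sides are in microjoules. */
bool powerdown_worthwhile(double p_run, double p_mode,
                          double p_overhead, double t_overhead,
                          double t_idle) {
    double saved = (p_run - p_mode) * t_idle;   /* energy saved in mode */
    double cost  = p_overhead * t_overhead;     /* transition energy */
    return saved > cost;
}
```

With SA-1100-like numbers (run 400 mW, sleep 0.16 mW, 160 ms wakeup assumed at roughly run power), sleeping pays off only for idle periods longer than about 160 ms.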

Page 83: Chapter 3 CPUs

CPUs-83

Application: StrongARM SA-1100

The processor takes two supplies:
VDD is the main 3.3V supply (can be switched on and off)
VDDX is 1.5V (always remains on)
Three power modes:
Run: normal operation.
Idle: stops the CPU clock, with logic still powered.
Sleep: shuts off most chip activity; entering takes 3 steps, each about 30 µs; wakeup takes over 10 ms.

Page 84: Chapter 3 CPUs

CPUs-84

SA-1100 power state machine

[Figure: three states with powers Prun = 400 mW, Pidle = 50 mW, Psleep = 0.16 mW. Transitions: run to idle 10 µs, idle to run 10 µs, run to sleep 90 µs, idle to sleep 90 µs, sleep to run 160 ms]

Page 85: Chapter 3 CPUs

CPUs-85

Outline

Input and Output Mechanisms
Supervisor Mode, Exceptions, Traps
Memory Management
Performance and Power Consumption
Example Design: Data Compressor

Page 86: Chapter 3 CPUs

CPUs-86

Goals

Compress data transmitted over a serial line.
Receives byte-size input symbols.
Produces output symbols packed into bytes.
We will build only the software module here.

Page 87: Chapter 3 CPUs

CPUs-87

Collaboration diagram for compressor

[Diagram: :input sends 1..n input symbols to :data compressor, which sends 1..m packed output symbols to :output]

Page 88: Chapter 3 CPUs

CPUs-88

Huffman coding

Early statistical text compression algorithm. Select non-uniform-size codes:
Use shorter codes for more common symbols.
Use longer codes for less common symbols.
To allow decoding, codes must have unique prefixes: no code can be a prefix of a longer valid code.

Page 89: Chapter 3 CPUs

CPUs-89

Huffman example

character   P
a           .45
b           .24
c           .11
d           .08
e           .07
f           .05

[Figure: Huffman tree built by repeatedly merging the two least probable nodes; internal node probabilities P = .12, .19, .31, .55, 1]

Page 90: Chapter 3 CPUs

CPUs-90

Example Huffman code

Read the code from root to leaves:

a   1
b   01
c   0000
d   0001
e   0010
f   0011
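The table can be exercised directly. A sketch that maps symbols to the bit strings above and concatenates them, representing bits as characters for clarity; the function names are assumptions:

```c
#include <string.h>

/* Code table from the example. The codes are prefix-free, so their
   concatenation decodes unambiguously. */
const char *huff_code(char c) {
    switch (c) {
    case 'a': return "1";
    case 'b': return "01";
    case 'c': return "0000";
    case 'd': return "0001";
    case 'e': return "0010";
    case 'f': return "0011";
    default:  return "";    /* symbol not in table */
    }
}

/* Encode a string of symbols into 'out'; caller sizes the buffer. */
void huff_encode(const char *s, char *out) {
    out[0] = '\0';
    for (; *s; s++)
        strcat(out, huff_code(*s));
}
```

Encoding "bad" yields "01" "1" "0001" = "0110001": 7 bits instead of the 24 an 8-bit character code would use.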

Page 91: Chapter 3 CPUs

CPUs-91

Huffman coder requirements table

name                  data compression module
purpose               code module for Huffman compression
inputs                encoding table, uncoded byte-size inputs
outputs               packed compression output symbols
functions             Huffman coding
performance           fast
manufacturing cost    N/A
power                 N/A
physical size/weight  N/A

Page 92: Chapter 3 CPUs

CPUs-92

Building a specification

The collaboration diagram shows only steady-state input/output. A real system must:
Accept an encoding table.
Allow a system reset that flushes the compression buffer.

Page 93: Chapter 3 CPUs

CPUs-93

data-compressor class

data-compressor
  buffer: data-buffer
  table: symbol-table
  current-bit: integer
  encode(): boolean, data-buffer
  flush()
  new-symbol-table()

Page 94: Chapter 3 CPUs

CPUs-94

Data-compressor behaviors

encode: takes a one-byte input, generates packed encoded symbols and a Boolean indicating whether the buffer is full.
new-symbol-table: installs a new symbol table in the object, throws away the old table.
flush: returns the current state of the buffer, including the number of valid bits in the buffer.

Page 95: Chapter 3 CPUs

CPUs-95

Auxiliary classes

data-buffer
  databuf[databuflen]: character
  len: integer
  insert()
  length(): integer

symbol-table
  symbols[nsymbols]: data-buffer
  len: integer
  value(): symbol
  load()

Page 96: Chapter 3 CPUs

CPUs-96

Auxiliary class roles

data-buffer holds both packed and unpacked symbols. The longest Huffman code for 8-bit inputs is 256 bits.
symbol-table indexes the encoded version of each symbol. load() puts data into a new symbol table.

Page 97: Chapter 3 CPUs

CPUs-97

Class relationships

[Diagram: data-compressor has a 1-to-1 association with symbol-table and a 1-to-1 association with data-buffer]

Page 98: Chapter 3 CPUs

CPUs-98

Encode behavior

[Flowchart for encode: look up the input symbol and add its code to the buffer; if the buffer fills, create a new buffer, add the remaining bits to it, and return true; otherwise return false]

Page 99: Chapter 3 CPUs

CPUs-99

Insert behavior

[Flowchart for insert: if the input symbol does not fill the buffer, pack it into this buffer; if it does, pack the bottom bits into this buffer and the top bits into the overflow buffer; then update the length]

Page 100: Chapter 3 CPUs

CPUs-100

Program design

In an object-oriented language, we can reflect the UML specification in the code more directly.
In a non-object-oriented language, we must either:
add code to provide object-oriented features, or
diverge from the specification structure.

Page 101: Chapter 3 CPUs

CPUs-101

C++ classes

class data_buffer {
    char databuf[databuflen];
    int len;
    int length_in_chars() { return len / bitsperbyte; }
public:
    void insert(data_buffer, data_buffer&);
    int length() { return len; }
    int length_in_bytes() { return (int)ceil(len / 8.0); }
    int initialize();
    ...

Page 102: Chapter 3 CPUs

CPUs-102

C++ classes, cont'd.

class data_compressor {
    data_buffer buffer;
    int current_bit;
    symbol_table table;
public:
    boolean encode(char, data_buffer&);
    void new_symbol_table(symbol_table);
    int flush(data_buffer&);
    data_compressor();
    ~data_compressor();
};

Page 103: Chapter 3 CPUs

CPUs-103

C code

struct data_compressor_struct {
    data_buffer buffer;
    int current_bit;
    sym_table table;
};
typedef struct data_compressor_struct data_compressor, *data_compressor_ptr;

boolean data_compressor_encode(data_compressor_ptr mycmptrs, char isymbol, data_buffer *fullbuf) ...

Page 104: Chapter 3 CPUs

CPUs-104

Testing

Test by encoding, then decoding:

[Diagram: input symbols feed the encoder; the encoder's output and the symbol table feed the decoder; the decoder's output is compared with the original input symbols to produce the result]

Page 105: Chapter 3 CPUs

CPUs-105

Code inspection tests

Look at the code for potential problems:
Can we run past the end of the symbol table?
What happens when the next symbol does not fill the buffer? Does fill it?
Do very long encoded symbols work properly? Very short symbols?
Does flush() work properly?