SungKyunKwan Univ.
VADA Lab.
L3: Lower Power Design Overview (2)
Prof. Jun-Dong Cho, Sungkyunkwan University
http://vlsicad.skku.ac.kr
• Low-Power Design Flow developed at LIS
Low Power Design Flow I
[Figure: flow diagram. Blocks: System-Level Specification; Function Partitioning and HW/SW Allocation; System-Level Power Analysis; Behavioral Description; Power-driven Behavioral Transformation; Behavioral-Level Power Analysis; Power-Conscious Behavioral Description; High-Level Synthesis and Optimization; RT-Level Power Analysis; Software Functions; Processor Selection; Software Optimization; Software-Level Power Analysis. Output: To RT-Level Design]
Low Power Design Flow II
[Figure: flow diagram. Blocks: RT-level Description; RTL mapping; RTL Library; RTL Macrocells; Data-path; Controller; Processor; Control and Steering Logic; Memory; High-Level Synthesis and Optimization; Logic Synthesis and Optimization; Standard cell Library; Gate-level Description; Gate-Level Power Analysis; Switch-level Description; Switch-Level Power Analysis]
Execution unit idle time (PowerPC 603)
[Figure: bar chart of idle time, axis 0 to 100, for the Load/Store, Fixed Point, Floating Point, and Special Register units under SPECfp92 and SPECint92]
System Integration
Power Consumption in Multimedia Systems
• LCD: 54.1%, HDD: 16.8%, CPU: 10.7%, VGA/VRAM: 9.6%, SysLogic: 4.5%, DRAM: 1.1%, Others: 3.2%
• 5-55 Mode:
– Display mode: CPU is in sleep mode (55 minutes), LCD (VRAM + LCDC)
– CPU mode: Display is idle (5 minutes), looking up data (data retrieval)
• Handwriting recognition: biggest power (memory, system bus active)
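The breakdown above implies that a saving in one component only counts in proportion to that component's share of the total. A minimal sketch of that arithmetic follows; the shares are the figures from the slide, while the function name `system_saving` and the 50% reduction are assumptions chosen for illustration.

```python
# Component shares of total system power, in percent (figures from the slide).
POWER_SHARE = {
    "LCD": 54.1, "HDD": 16.8, "CPU": 10.7, "VGA/VRAM": 9.6,
    "SysLogic": 4.5, "DRAM": 1.1, "Others": 3.2,
}


def system_saving(component: str, component_reduction: float) -> float:
    """Percent of total system power saved when one component's power
    drops by the given fraction (0.0 to 1.0) and everything else is unchanged.

    Hypothetical helper, not part of the original material.
    """
    return POWER_SHARE[component] * component_reduction


if __name__ == "__main__":
    # Illustrative assumption: each component's power is halved in turn.
    for name in ("CPU", "LCD"):
        saved = system_saving(name, 0.5)
        print(f"Halving {name} power saves {saved:.2f}% of the total")
```

Under these figures, halving the display power is worth roughly five times as much as halving the CPU power, which is why the display dominates the 5-55 usage mode.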
Reducing Waste
• Locality of reference
• Demand-driven / Data-driven computation
• Application-specific processing
• Preservation of data correlations
• Distributed processing
Energy-Efficient Design
1) Reduce the supply voltage
– The switching energy drops quadratically with the supply voltage
– This drop is accompanied by reduced circuit speed
2) Minimize the switching capacitance
– Exploit locality of reference with distributed computational structures, minimizing global interactions
– Enforce a demand-driven policy that eliminates switching activity in unused modules
– Preserve temporal correlation in data streams by minimizing the degree of hardware sharing
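The quadratic dependence in point 1 can be made concrete with the usual CMOS switching-energy expression E = C * Vdd^2. The sketch below is illustrative only: the 1 pF load and the 5 V to 3.3 V step are assumed values, and the accompanying loss of speed is not modeled.

```python
def switching_energy(c_farads: float, vdd: float) -> float:
    """Energy drawn from the supply per 0 -> 1 output transition: C * Vdd^2."""
    return c_farads * vdd ** 2


def energy_ratio(vdd_new: float, vdd_old: float) -> float:
    """Switching energy at vdd_new relative to vdd_old for the same capacitance."""
    return (vdd_new / vdd_old) ** 2


if __name__ == "__main__":
    c_load = 1e-12  # assumed 1 pF switched capacitance (illustrative)
    print(f"E at 5.0 V: {switching_energy(c_load, 5.0):.3e} J")
    print(f"E at 3.3 V: {switching_energy(c_load, 3.3):.3e} J")
    print(f"ratio: {energy_ratio(3.3, 5.0):.4f}")
```

Because the capacitance cancels in the ratio, the relative saving depends only on the two supply voltages.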
Switching Activity
Eliminating Redundant Computations
Power saving concepts
• Work with parallel computation and low frequency.
• Reduce pipe stages to save registers (try to avoid hazards).
• Disable input toggling when the block is in the idle state.
• Work with minimum gate size to reduce the toggle current.
• For outputs with large fanouts, speed up the transition to reduce the short-circuit current (invest toggle current in order to save short-circuit current).
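The first concept, parallel computation at low frequency, pays off only when the lower clock rate lets the supply voltage drop as well. A rough sketch using the first-order model P = a * C * Vdd^2 * f is shown below. The 2.9 V supply and the 7.5% capacitance overhead are illustrative assumptions, and the sketch simply assumes the lower voltage still meets timing at the reduced clock rate.

```python
def dynamic_power(c_eff: float, vdd: float, freq: float, activity: float = 1.0) -> float:
    """First-order dynamic power model: activity * C * Vdd^2 * f."""
    return activity * c_eff * vdd ** 2 * freq


def parallel_power_ratio(n_units: int, vdd_ref: float, vdd_low: float,
                         overhead: float = 0.0) -> float:
    """Power of n parallel units clocked at f/n and supplied with vdd_low,
    relative to a single unit at frequency f and supply vdd_ref.

    overhead is the assumed fractional extra capacitance for routing and
    multiplexing (hypothetical parameter, not from the original slides).
    """
    c_ref, f_ref = 1.0, 1.0  # normalized reference values
    p_ref = dynamic_power(c_ref, vdd_ref, f_ref)
    c_par = n_units * c_ref * (1.0 + overhead)
    p_par = dynamic_power(c_par, vdd_low, f_ref / n_units)
    return p_par / p_ref


if __name__ == "__main__":
    # Same voltage: duplicating hardware at half the clock saves nothing.
    print(parallel_power_ratio(2, 5.0, 5.0))
    # Assumed lower voltage made possible by the relaxed clock period.
    print(parallel_power_ratio(2, 5.0, 2.9, overhead=0.075))
```

With the voltage unchanged the ratio is 1, so the saving comes entirely from the quadratic voltage term, partly offset by the extra capacitance.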
Power Management
• DPM (Dynamic Power Management): stops the clock switching of a specific unit generated by clock generators. The clock regenerators produce two clocks, C1 and C2. The logic: 0.3%, 10-20% of power savings.
• SPM (Static Power Management): saves the power dissipated in the steady mode. When the system (or subsystem) remains idle for a significant period of time, the entire chip (or subsystem) is shut down.
• Identify power-hungry modules and look for opportunities to reduce power.
• If f is increased, one has to increase the transistor size or Vdd.
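The SPM idea of shutting a subsystem down after a long idle period can be sketched as a simple timeout policy. The function `spm_energy`, its parameters, and all power numbers below are hypothetical and only illustrate the trade-off between idle power and wake-up cost.

```python
from typing import Iterable


def spm_energy(busy_trace: Iterable[bool], idle_timeout: int,
               p_active: float, p_idle: float, p_off: float,
               wake_cost: float = 0.0) -> float:
    """Total energy (power units * time steps) for a timeout-based shutdown policy.

    busy_trace   one bool per time step, True when the subsystem has work
    idle_timeout number of consecutive idle steps tolerated before shutdown
    wake_cost    one-off energy paid when a shut-down subsystem is restarted
    """
    energy = 0.0
    idle_run = 0
    powered = True
    for busy in busy_trace:
        if busy:
            if not powered:
                energy += wake_cost  # restart penalty
                powered = True
            idle_run = 0
            energy += p_active
        else:
            idle_run += 1
            if powered and idle_run > idle_timeout:
                powered = False  # idle for too long: shut down
            energy += p_idle if powered else p_off
    return energy


if __name__ == "__main__":
    trace = [True] * 2 + [False] * 6 + [True] * 2  # assumed activity pattern
    with_spm = spm_energy(trace, idle_timeout=2, p_active=10, p_idle=4, p_off=0, wake_cost=5)
    without = spm_energy(trace, idle_timeout=100, p_active=10, p_idle=4, p_off=0, wake_cost=5)
    print(with_spm, without)
```

The policy only helps when the idle period is long enough for the saved idle energy to exceed the wake-up cost, which matches the "significant period of time" condition above.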
Power Management
• Use the right supply and the right frequency for each part of the system. If one has to wait for the occurrence of some input, only a small circuit needs to wait and wake up the main circuit when the input occurs.
• Another technique is to reduce the basic frequency for tasks that can be executed slowly.
• PowerPC 603 is a 2-issue processor (2 instructions read at a time) with 5 parallel execution units. 4 modes:
– Full on mode for full speed
– Doze mode, in which the execution units are not running
– Nap mode, which also stops the bus clocking
– Sleep mode, which stops the clock generator with or without the PLL (20-100 mW)
• Superpipelined MIPS R4200: 5-stage pipeline, MIPS R4400: 8 stages, 2 execution units, f/2 in reduced mode.
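The four PowerPC 603 modes listed above form an ordered sequence in which each deeper mode stops one more part of the chip. The following sketch encodes that ordering; the `Mode` enum and the `deepest_mode` selector are hypothetical names used for illustration and do not correspond to any real PowerPC interface.

```python
from enum import Enum


class Mode(Enum):
    FULL_ON = "full on"  # everything running at full speed
    DOZE = "doze"        # execution units not running
    NAP = "nap"          # bus clocking stopped as well
    SLEEP = "sleep"      # clock generator stopped


def deepest_mode(need_execution: bool, need_bus_clock: bool,
                 need_clock_generator: bool) -> Mode:
    """Pick the deepest mode that keeps every still-needed resource alive.

    Hypothetical policy: each argument states whether that resource must
    remain active during the upcoming idle period.
    """
    if need_execution:
        return Mode.FULL_ON
    if need_bus_clock:
        return Mode.DOZE
    if need_clock_generator:
        return Mode.NAP
    return Mode.SLEEP


if __name__ == "__main__":
    # Example: nothing to execute, but the bus must keep clocking.
    print(deepest_mode(False, True, True))
```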
TI
• Two DSPs, TMS320C541 and TMS320C542, reduce power, chip count, and system cost for wireless communication applications.
• C54X DSPs, 2.7V, 5V, Low-Power Enhanced Architecture DSP (LEAD) family: with three different power-down modes, these devices are well suited for wireless communications products such as digital cellular phones, personal digital assistants, and wireless modems; low power on voice coding and decoding.
• The TMS320LC548 features:
– 15-ns (66 MIPS) or 20-ns (50 MIPS) instruction cycle times
– 3.0- and 3.3-V operation
– 32K 16-bit words of RAM and 2K 16-bit words of boot ROM on-chip
– Integrated Viterbi accelerator that reduces the Viterbi butterfly update to four instruction cycles for GSM channel decoding
– Powerful single-cycle instructions (dual operand, parallel instructions, conditional instructions)
– Low-power standby modes
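For readers unfamiliar with the term, a Viterbi butterfly update is one add-compare-select step on a pair of trellis states. The sketch below is a generic textbook formulation with antipodal branch metrics, not a description of the TI accelerator; the function name and the sample metrics are assumptions.

```python
from typing import Tuple


def acs_butterfly(metric_a: float, metric_b: float,
                  branch: float) -> Tuple[Tuple[float, int], Tuple[float, int]]:
    """One add-compare-select butterfly (larger metric wins).

    metric_a, metric_b  path metrics of the two predecessor states
    branch              branch metric; the two branches into each successor
                        are assumed to carry +branch and -branch
    Returns ((metric, decision), (metric, decision)) for the two successor
    states, where decision 0 means the path through predecessor a survived.
    """
    # Add: extend both predecessor paths into each successor state.
    cand0 = (metric_a + branch, metric_b - branch)
    cand1 = (metric_a - branch, metric_b + branch)
    # Compare and select: keep the better candidate and record the choice.
    d0 = 0 if cand0[0] >= cand0[1] else 1
    d1 = 0 if cand1[0] >= cand1[1] else 1
    return (cand0[d0], d0), (cand1[d1], d1)


if __name__ == "__main__":
    print(acs_butterfly(10, 7, 2))
```

A decoder repeats this step for every state pair at every received symbol, which is why shortening it to a few instruction cycles matters for both throughput and energy.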
Low-power embedded system design
• low-power embedded applications: PDAs, mobile phones, etc.
• power-efficient processor cores (ARM)
• cache/memory organization for low power
• power management on embedded system chips, comparative analysis of power drawn by subsystems (CPU, hard disk, display, and standby) of notebooks
High level optimization for low power
• use of parallel and/or pipelined structures,
• the choice of data representations,
• the exploitation of signal correlations,
• the synchronization of signals for glitching minimization, and an accurate analysis of the shared resources.
• At the algorithmic level, applying arithmetic and logic transformations to the block diagram
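The effect of data representation on switching activity can be seen by counting bit toggles on a bus. The sketch below compares two's complement with sign-magnitude for a small signal that alternates in sign; the sample sequence, the 8-bit width, and the helper names are illustrative assumptions.

```python
from typing import List, Sequence


def twos_complement(value: int, bits: int = 8) -> int:
    """Encode a signed integer as a two's-complement bit pattern."""
    return value & ((1 << bits) - 1)


def sign_magnitude(value: int, bits: int = 8) -> int:
    """Encode a signed integer as sign bit plus magnitude.
    Assumes abs(value) fits in bits - 1 bits."""
    sign = (1 << (bits - 1)) if value < 0 else 0
    return sign | abs(value)


def count_toggles(words: Sequence[int]) -> int:
    """Total number of bit flips between consecutive words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


if __name__ == "__main__":
    samples: List[int] = [1, -1, 2, -2]  # assumed small signal around zero
    tc = count_toggles([twos_complement(s) for s in samples])
    sm = count_toggles([sign_magnitude(s) for s in samples])
    print(f"two's complement toggles: {tc}, sign-magnitude toggles: {sm}")
```

In two's complement every sign change flips most of the upper bits, while sign-magnitude flips only the sign bit and the few low bits that actually differ, so signals that hover around zero toggle far fewer wires in the second encoding.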
VLSI Signal Processing Design Methodology
• pipelining, parallel processing, retiming, folding, unfolding, look-ahead, relaxed look-ahead, and approximate filtering
• bit-serial, bit-parallel and digit-serial architectures, carry save architecture
• redundant and residue systems
• Viterbi decoder, motion compensation, 2D-filtering, and data transmission systems
Power-hungry Applications
• Signal Compression: HDTV Standard, ADPCM, Vector Quantization, H.263, 2-D motion estimation, MPEG-2 storage management
• Digital Communications: Shaping Filters, Equalizers, Viterbi decoders, Reed-Solomon decoders