Impact of Power-Management Granularity on The Energy-Quality Trade-off for Soft And Hard Real-Time Applications
International Symposium on System-on-Chip, 2008
A. Milutinovic, K. Goossens, and G.J.M. Smit
Advisor: Shiann-Rong Kuang
Speaker: Hao-Yi Jheng (鄭浩逸)
2009.2.26
Outline
  Introduction
  Application model
  Work and slack
  Policy
  Conservativeness and Granularity
  Experimental Results
  Conclusions
Application model
The paper evaluates two power-management policies at several granularities on an MPEG4 application, measuring energy and quality (deadline misses).
Granularity (N): how often the operating point changes (once every N frames)
Hard real-time applications
  Do not allow any frame to miss its deadline
  Use conservative power management
Soft real-time applications
  Allow a limited number of frames to miss their deadlines
  Use non-conservative power management
Work and slack
Work ($w_i$): the number of processor cycles needed by frame $i$
Actual execution time: $\text{acet}_i = w_i / f_i$
Deadline (frame period): $T = 1/f_{FR}$
Relative deadline: $\text{acet}_i \le T$
  A relative deadline miss means that frame $i$ alone takes longer than its deadline
Relative slack ($r$): $r_i = T - \text{acet}_i$
Absolute deadline: $\sum_{j=0}^{i} \text{acet}_j \le (i+1)\,T$
  An absolute deadline miss means that the cumulative execution time of frames 0 to $i$ exceeds the total deadline
Absolute slack ($s$): $s_i = (i+1)\,T - \sum_{j=0}^{i} \text{acet}_j$
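The work and slack definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes frames are indexed from 0, that `T` is the frame period in seconds, and that the absolute deadline of frame i is (i+1)·T. The function name and the sample numbers are hypothetical and do not come from the paper.

```python
from itertools import accumulate

def slacks(work, freq, T):
    """Return (relative, absolute) slack lists for frames 0..n-1.

    work[i]: cycles needed by frame i
    freq[i]: processor frequency (Hz) used for frame i
    T:       frame period in seconds (assumed equal for all frames)
    """
    acet = [w / f for w, f in zip(work, freq)]       # actual execution times
    relative = [T - a for a in acet]                 # r_i = T - acet_i
    cumulative = accumulate(acet)                    # running sum of acet_0..acet_i
    absolute = [(i + 1) * T - c for i, c in enumerate(cumulative)]
    return relative, absolute

# Hypothetical numbers: three frames at 100 MHz, 25 fps (T = 40 ms).
work = [2_000_000, 5_000_000, 3_000_000]
freq = [100e6] * 3
T = 0.04
rel, abs_ = slacks(work, freq, T)
```

In this example frame 1 misses its relative deadline (its relative slack is negative), but its absolute slack stays positive because frame 0 finished early. That is the distinction between the two kinds of deadline miss: the absolute slack is the running sum of the relative slacks.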
Conservative Policy
Conservative power-management policy:
  Does not introduce any deadline misses compared to operating at $f_{max}$.
Non-conservative power-management policy:
  Some frames may miss their deadlines.
Policy
Perfect predictor policy (non-conservative):
  Accurately predicts the workload of the next N frames and scales the frequency to the average those frames require
  $f_{avg} = \sum_{j=0}^{N-1} w_{i \cdot N + j} \,/\, (N\,T)$ for group $i$
Proven slack policy (conservative):
  Proven slack: the cumulative slack of the frames before the group
  Assumes that the next N frames all require the worst-case work, but uses all the proven slack of the previous group to reduce the processor frequency
  $f_i = N \cdot \max_{j}(w_j) \,/\, (N\,T + s_{i-1})$ for group $i$, where $s_{i-1}$ denotes the proven slack available at the end of group $i-1$
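The two policies can be compared with a short Python sketch. It rests on several assumptions: the worst-case work `w_max` is known in advance, the proven slack starts at zero, and the slack carried into a group is the absolute slack left at the end of the previous group. All names and numbers are hypothetical.

```python
def perfect_predictor_freq(work, group, N, T):
    """Average frequency for one group, given exact knowledge of its work."""
    frames = work[group * N:(group + 1) * N]
    return sum(frames) / (N * T)

def proven_slack_freq(w_max, N, T, proven_slack):
    """Frequency that fits N worst-case frames into N*T plus the proven slack."""
    return N * w_max / (N * T + proven_slack)

def run_proven_slack(work, w_max, N, T):
    """Apply the proven-slack rule group by group; return frequencies and final slack."""
    slack = 0.0
    freqs = []
    for start in range(0, len(work), N):
        f = proven_slack_freq(w_max, N, T, slack)
        group = work[start:start + N]
        elapsed = sum(w / f for w in group)      # real time spent on this group
        slack += len(group) * T - elapsed        # slack carried to the next group
        freqs.append(f)
    return freqs, slack

# Hypothetical workload: four frames, groups of N = 2, T = 40 ms.
work = [2_000_000, 3_000_000, 1_000_000, 2_000_000]
N, T, w_max = 2, 0.04, 4_000_000
pp = [perfect_predictor_freq(work, g, N, T) for g in range(len(work) // N)]
freqs, final_slack = run_proven_slack(work, w_max, N, T)
```

With no slack available, the proven-slack rule starts at `w_max / T`, the frequency needed for worst-case frames. Because no frame needs more than `w_max` cycles, the carried slack never becomes negative, which is what makes the policy conservative in this sketch.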
Experimental Results (1/5)
An MPEG4 decoder running on an ARM946 at 86 MHz
25 frames per second (fps) at a resolution of 176×144 pixels
Experimental Results (2/5)
Energy savings w.r.t. operating at $f_{max}$ are around 30% for granularities of 1-128 frames
2% cost for the power management
Above 128 frames, the energy of the proven-slack policy rises linearly
Experimental Results (3/5)
The proven-slack policy cannot always exploit the accumulated slack
Average slack: $\frac{1}{S}\sum_{i=0}^{S-1} s_i$, for a sequence of S frames
Worst-case slack: $\max_{i=0}^{S-1} s_i$, for a sequence of S frames
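As a small worked example, both slack statistics can be computed from a list of per-frame absolute slack values. The function name and the sample values below are hypothetical.

```python
def slack_stats(abs_slack):
    """Return (average slack, worst-case slack) over a sequence of S frames."""
    S = len(abs_slack)
    average = sum(abs_slack) / S
    worst_case = max(abs_slack)   # largest accumulated slack, as on the slide
    return average, worst_case

# Hypothetical absolute slack values in seconds for S = 4 frames.
avg, worst = slack_stats([0.02, 0.01, 0.03, 0.02])
```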
Experimental Results (4/5)
Perfect predictor policy:
  A 95% quality improvement costs only 3% additional energy
  Optimum is 13000 mJ
Experimental Results (5/5)
Many frames can be processed in the range of 240-250 MHz.
Conclusions
1. A long tail in the work distribution results in a steep quality improvement: from almost 0% to almost 100% at an additional energy cost of only 3%.
2. The proven-slack policy offers 100% quality at only 0.3% more energy than the perfect-predictor policy, which is a theoretical upper bound and hard to achieve in practice.
3. The energy of the policies increases by only 2% when increasing the granularity to 128 frames.
Conclusions
Non-conservative
Conservative
Tardiness: (sum of frame delay times / number of frames) / deadline
FRV: $FRV = \frac{1}{N}\sum_{i=1}^{N}\left(fps_{act}^{i} - fps_{target}\right)^2$, with $fps_{act}^{i} = 1/T_i$
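The two quality metrics on this slide can be sketched as follows. The slide's formulas are only partly legible, so this is an assumption-heavy sketch: a frame's delay is taken to be the time by which it finishes past its deadline (zero if it is on time), the normalizing "deadline" is taken to be the frame period `T`, and FRV is taken to be the mean squared deviation of the per-frame fps from the target fps. All names and numbers are hypothetical.

```python
def tardiness(finish_times, deadlines, T):
    """(sum of frame delay times / number of frames) / deadline period T."""
    delays = [max(0.0, f - d) for f, d in zip(finish_times, deadlines)]
    return (sum(delays) / len(delays)) / T

def frv(frame_times, fps_target):
    """Assumed FRV: mean squared deviation of per-frame fps from the target."""
    deviations = [(1.0 / t - fps_target) ** 2 for t in frame_times]
    return sum(deviations) / len(deviations)

# Hypothetical data: three frames at 25 fps (T = 40 ms), one finishing 10 ms late.
T = 0.04
tard = tardiness([0.03, 0.09, 0.12], [0.04, 0.08, 0.12], T)

# Hypothetical per-frame times: one frame at 25 fps, one at 20 fps.
variation = frv([0.04, 0.05], fps_target=25.0)
```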
Comparison
Progress report
Advisor: Shiann-Rong Kuang
Speaker: Hao-Yi Jheng
2009.2.23
Outline
  Adaptive Inter-compensation
  How to choose voltage/frequency level
  Adaptive Experimental Result
  Future Work
How to choose voltage/frequency level
5.83 3.57 1.16 1.52 1.30 0.08 0.97
Why inter-compensation is needed
Inter-compensation
PID
  Corrects the predicted work $w$ from the prediction error, using a proportional gain $K_p$, an integral term $T_I$, and a derivative term $T_D$ over a window $D$
Adaptive inter-compensation
  If the previous frame's predicted cycle count was too high (more cycles than needed), the predicted voltage level of the current frame decreases by one
  else the predicted voltage level of the current frame does not change
  If ( ) = 2000
  else = 27000
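The first adaptive rule on this slide can be sketched in Python. The sketch assumes that "more cycles" means the previous frame's predicted cycle count exceeded its actual cycle count, and that voltage levels are integer indices where a smaller index means a lower voltage. The function name, the `lowest_level` clamp, and the numbers are hypothetical. The 2000/27000 rule is not sketched because its condition is not legible on the slide.

```python
def next_voltage_level(current_level, predicted_cycles, actual_cycles, lowest_level=0):
    """Step down one voltage level after an over-prediction, else keep the level."""
    if predicted_cycles > actual_cycles:
        # Previous frame was over-predicted: lower the level by one (clamped).
        return max(lowest_level, current_level - 1)
    return current_level

level = 3
# Previous frame was predicted at 120k cycles but needed only 100k.
after_over = next_voltage_level(level, predicted_cycles=120_000, actual_cycles=100_000)
# Previous frame was predicted at 90k cycles but needed 100k: no change.
after_under = next_voltage_level(after_over, predicted_cycles=90_000, actual_cycles=100_000)
```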
Inter-compensation
Experimental Result
Energy (e+08)
        No-inter  2000      27000     adaptive
API_00  2.13389   1.89694   2.10778   1.98991
API_01  1.41421   1.18232   1.25112   1.23007
API_02  2.57939   2.20497   2.34232   2.29719
API_03  1.65572   1.4108    1.49139   1.45527
API_04  2.20379   1.88178   2.06792   1.99084
API_05  1.24353   1.04672   1.16125   1.11097
FRV
        No-inter  2000      27000     adaptive
API_00  66.2636   32.0008   76.9116   39.8287
API_01  35.9665   8.86423   0.541534  0.281196
API_02  24.9081   6.53828   1.00831   1.28403
API_03  41.9968   12.2053   0.341697  1.0757
API_04  18.3523   7.35752   3.91522   1.03591
API_05  25.4673   26.3545   1.5618    3.66423
Future Work
We need the hardware GM and RM cycle counts to verify the experimental results
A driver is needed to dump the GM and RM cycle counts for prediction