Maeda, Sill Torres: CLEVER
CLEVER: Cross-Layer Error Verification Evaluation and Reporting
Rafael Kioji Vivas Maeda, Frank Sill Torres
Federal University of Minas Gerais (UFMG)
School of Engineering
Belo Horizonte, Brazil
Focus / Principal idea:
System Health Management approach for Embedded Systems / SoCs
Outline
1. Motivation
2. Preliminaries
3. CLEVER
4. Verification Environment
5. Conclusion
Motivation: System Complexity
Rising complexity of Embedded Systems / Systems-on-Chip (SoC)
[Figure: number of processing engines per SoC, SoC memory size, and SoC logic size, 2011 to 2026; axis scale 1,000 to 7,000. Source: ITRS, 2013]
Motivation: Faults
Technology scaling leads to a considerable increase in:
– Temporary faults
– Aging and permanent faults
[Figure: FIT (failures in 10^9 h) of Altera FPGA families by technology node: Stratix (130 nm), Stratix II (90 nm), Stratix III (65 nm), Stratix IV (40 nm), Stratix V (25 nm); vertical axis from 0 to 80. Source: Altera, Reliability Report 56, 2013]
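The FIT unit in the chart counts failures per 10^9 device-hours. A minimal sketch of how such a value converts to a failure rate and a mean time to failure, assuming a constant failure rate (exponential lifetime model). The function names and the sample value of 50 FIT are illustrative and are not read off the Altera chart.

```python
HOURS_PER_FIT_BASE = 1e9   # 1 FIT = 1 failure per 10^9 device-hours
HOURS_PER_YEAR = 8760      # 365 days * 24 h


def fit_to_failure_rate(fit):
    """Return the failure rate in failures per device-hour."""
    return fit / HOURS_PER_FIT_BASE


def fit_to_mttf_hours(fit):
    """Return the mean time to failure in hours (constant-rate assumption)."""
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return HOURS_PER_FIT_BASE / fit


def expected_failures(fit, devices, hours):
    """Expected number of failures in a fleet of devices over the given hours."""
    return fit_to_failure_rate(fit) * devices * hours


if __name__ == "__main__":
    fit = 50.0  # hypothetical value, for illustration only
    print("MTTF in years:", fit_to_mttf_hours(fit) / HOURS_PER_YEAR)
    print("Expected failures, 10,000 devices, 1 year:",
          expected_failures(fit, devices=10_000, hours=HOURS_PER_YEAR))
```

Even a FIT value that looks small per device can translate into several failures per year across a large fleet, which is one reason the trend across technology nodes matters.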
Preliminaries: Reliability
Technique classification
– Avoidance (e.g., Triple Modular Redundancy)
– Detection and Recovery (e.g., Rollback)
– Prediction (e.g., PHM, S.M.A.R.T.)
Prognostics and Health Management (PHM)
– Runtime monitoring
– Remaining Useful Lifetime (RUL) estimation and extension
Preliminaries: Remaining Useful Lifetime (RUL)
[Figure: failure rate λ over time in operation, showing the curves λold(t) and λnew(t), the level λacc, the time t, and the resulting tRUL_old and tRUL_new. Source: www.wikipedia.com]
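The RUL idea can be sketched numerically: the remaining useful lifetime is the time left until the failure rate reaches the level λacc. The sketch below rests on several assumptions that are not in the slides: λacc is read as the maximum acceptable failure rate, the wear-out phase is modeled with a Weibull hazard, and the old and new curves differ only in their scale parameter. All names and numbers are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate; shape > 1 models wear-out (rising failure rate)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def time_to_threshold(hazard, lam_acc, t_max, tol=1e-6):
    """Earliest t in [0, t_max] with hazard(t) >= lam_acc, found by bisection.

    Assumes hazard is non-decreasing. Returns None if the threshold is not
    reached within t_max.
    """
    if hazard(t_max) < lam_acc:
        return None
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if hazard(mid) >= lam_acc:
            hi = mid
        else:
            lo = mid
    return hi


def remaining_useful_lifetime(hazard, lam_acc, t_now, t_max):
    """Time left from t_now until the hazard reaches lam_acc (never negative)."""
    t_end = time_to_threshold(hazard, lam_acc, t_max)
    if t_end is None:
        return None
    return max(0.0, t_end - t_now)


if __name__ == "__main__":
    lam_old = lambda t: weibull_hazard(t, shape=2.0, scale=10.0)
    lam_new = lambda t: weibull_hazard(t, shape=2.0, scale=20.0)  # reduced stress
    for label, lam in (("old", lam_old), ("new", lam_new)):
        rul = remaining_useful_lifetime(lam, lam_acc=0.1, t_now=2.0, t_max=100.0)
        print(f"RUL ({label}):", round(rul, 3))
```

In this toy setting, a flatter failure-rate curve pushes the crossing point with λacc further out, which corresponds to the extension from tRUL_old to tRUL_new.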
CLEVER: Origination of Approach
Prediction of possible system failure is important for future SoCs
Single-layer solutions have limited effectiveness and efficiency
Straightforward system integration is required
CLEVER: Cross-Layer Error Verification Evaluation and Reporting
CLEVER: Architecture
Sensors
– Sensing Device
– Communication
Processing Unit (PU)
– Data acquisition
– Prediction
– Scheduler
Memories
Sensor Bus
System Bus
CLEVER: Architecture - Sensor
Two principal parts
– Sensing device
– Communication with PU
Sensing on different levels:
– Physical / electrical (Temp., Voltage, …)
– Architectural (NBTI, detected faults, …)
– System (active time, load, …)
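The split of a sensor into a sensing device and a communication part, and the three sensing levels, can be sketched as a small data model. This is an illustration only: the class names, fields, and sample values are hypothetical and do not describe the authors' SystemC implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Layer(Enum):
    """The three sensing levels named on the slide."""
    PHYSICAL = "physical/electrical"
    ARCHITECTURAL = "architectural"
    SYSTEM = "system"


@dataclass(frozen=True)
class Reading:
    """Message a sensor sends to the PU (hypothetical format)."""
    sensor_id: int
    layer: Layer
    quantity: str
    value: float


class Sensor:
    """A sensing device (the callable) plus a communication part (read)."""

    def __init__(self, sensor_id: int, layer: Layer, quantity: str,
                 sense: Callable[[], float]):
        self.sensor_id = sensor_id
        self.layer = layer
        self.quantity = quantity
        self._sense = sense  # stands in for the sensing device

    def read(self) -> Reading:
        """Communication part: package the sensed value for the PU."""
        return Reading(self.sensor_id, self.layer, self.quantity, self._sense())


def group_by_layer(readings):
    """Group readings per layer so cross-layer data can be evaluated together."""
    grouped = {layer: [] for layer in Layer}
    for reading in readings:
        grouped[reading.layer].append(reading)
    return grouped


if __name__ == "__main__":
    sensors = [
        Sensor(0, Layer.PHYSICAL, "temperature_c", lambda: 71.5),
        Sensor(1, Layer.ARCHITECTURAL, "detected_faults", lambda: 3.0),
        Sensor(2, Layer.SYSTEM, "load", lambda: 0.62),
    ]
    for layer, items in group_by_layer(s.read() for s in sensors).items():
        print(layer.value, [(r.quantity, r.value) for r in items])
```

Keeping the sensing function separate from the message format means a new sensor type only has to supply a different callable, while the PU-facing interface stays the same.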
CLEVER: Architecture - Processing Unit
– Sensor data acquisition
– Error Prediction
– Arbitration
– Interface to Operating System (optional)
– Memory access
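How the PU blocks might interact can be sketched as follows. Several points are assumptions: arbitration is modeled as round-robin over the sensors, memory is a bounded history per sensor, and error prediction is a stand-in threshold check on the mean of stored samples, since the slides list the actual prediction algorithm as a next step. All names and limits are hypothetical.

```python
from collections import deque


class ProcessingUnit:
    """Toy PU: arbitration, sensor data acquisition, memory, and prediction."""

    def __init__(self, sensors, limits, history=4):
        # sensors: dict mapping a sensor name to a zero-argument callable
        # limits:  dict mapping a sensor name to its warning threshold
        self._sensors = sensors
        self._limits = limits
        self._order = deque(sensors)  # round-robin order (an assumption)
        self._memory = {name: deque(maxlen=history) for name in sensors}

    def arbitrate(self):
        """Select the next sensor to be served and advance the rotation."""
        name = self._order[0]
        self._order.rotate(-1)
        return name

    def acquire(self):
        """Read the arbitrated sensor and store the sample in memory."""
        name = self.arbitrate()
        value = self._sensors[name]()
        self._memory[name].append(value)
        return name, value

    def predict(self):
        """Placeholder prediction: names whose mean sample exceeds the limit."""
        warnings = []
        for name, samples in self._memory.items():
            if samples and sum(samples) / len(samples) > self._limits[name]:
                warnings.append(name)
        return warnings


if __name__ == "__main__":
    pu = ProcessingUnit(
        sensors={"temp_c": lambda: 95.0, "load": lambda: 0.4},
        limits={"temp_c": 85.0, "load": 0.9},
    )
    for _ in range(4):
        print("acquired:", pu.acquire())
    print("warnings:", pu.predict())
```

An optional operating system interface could consume the list returned by `predict`, for example to migrate tasks away from a component that is flagged.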
CLEVER: Architecture – OS Integration (optional)
CLEVER: Verification Flow
– SystemC implementation
– Communication based on TLM (Transaction Level Modeling)
– Verification based on Message Sequence Chart (MSC)
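The step from logged transactions to a message sequence chart can be illustrated with a small sketch. It is not the authors' TLM2MSC tool and does not use the SystemC TLM API: the transaction record, the textual MSC format, and the component names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One logged transaction (hypothetical record layout)."""
    time_ns: int
    initiator: str
    target: str
    command: str   # "READ" or "WRITE"
    address: int


def to_msc(transactions):
    """Render transactions as text lines of a sequence chart, ordered by time."""
    lines = []
    for tx in sorted(transactions, key=lambda t: t.time_ns):
        lines.append(
            f"{tx.time_ns} ns: {tx.initiator} -> {tx.target} : "
            f"{tx.command} @0x{tx.address:04X}"
        )
    return lines


def check_sequence(transactions, expected):
    """Compare the observed (initiator, target, command) order to a reference."""
    ordered = sorted(transactions, key=lambda t: t.time_ns)
    observed = [(t.initiator, t.target, t.command) for t in ordered]
    return observed == expected


if __name__ == "__main__":
    log = [
        Transaction(20, "PU", "Memory", "WRITE", 0x10),
        Transaction(10, "PU", "Sensor0", "READ", 0x0),
    ]
    print("\n".join(to_msc(log)))
    print(check_sequence(log, [("PU", "Sensor0", "READ"),
                               ("PU", "Memory", "WRITE")]))
```

Comparing the observed message order against a reference sequence is one simple way a chart-based check could flag a communication error in the model.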
CLEVER: Verification – TLM2MSC
Conclusion
Increasing design complexity and fault probability demand solutions
PHM solutions permit prediction of (probable) system failure
CLEVER: Cross-layer approach for Error Detection and Reporting
System architecture of CLEVER defined
Feasibility of the CLEVER architecture verified by simulation
Next steps:
– Implementation of prediction algorithm
– Test case
Thank you!
OptMAlab / ART
www.asic-reliability.com