Formalizing the Resilience of Open Dynamic Systems
Kazuhiro Minami (ISM), Tenda Okimoto (NII), Tomoya Tanjo (NII), Nicolas Schwind (NII), Hei Chan (NII), Katsumi Inoue (NII), and Hiroshi Maruyama (ISM)
October 26, 2012, JAWS 2012
7/31/2012 Kazuhiro Minami
Many disastrous incidents show that we cannot build systems that fully resist unexpected events
Lehman financial shock
3.11 nuclear disasters
2003 Northeast blackout
9.11
We should aim to build a resilient system
Taoi-cho, Miyagi Pref.
http://www.bousaihaku.com/cgi-bin/hp/index2.cgi?ac1=B742&ac2=&ac3=1574&Page=hpd2_view
http://fullload.jp/blog/2011/04/post-265.php
We formalize Bruneau's "Resilience Triangle" based on Dynamic Constraint Satisfaction Problems (DCSPs)
[Figure: Resilience Triangle. Vertical axis: Service Level (0 to 100); horizontal axis: Time. Labels: Resistance (degree of damage) and Recovery (time for recovery).]
Why DCSP?
• Model open systems
– Members join or go away dynamically
• Model changing conditions
[Figure: ecological environment with a variable $X_1$, the sea level $C_t$, and the land height $f(X_1)$]
DCSP – A time series of CSPs
Each CSP in the series has three parts: Variables, Domains, Constraint
• Variables, domains, and a constraint all change over time!
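A minimal sketch of a DCSP as a time-indexed list of CSPs, assuming integer-valued variables. The names `CSP`, `Configuration`, and `dcsp`, and the toy constraints, are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# A configuration maps each variable name to its current value.
Configuration = Dict[str, int]


@dataclass
class CSP:
    variables: List[str]                         # variables present at this time step
    domains: Dict[str, Set[int]]                 # allowed values per variable
    constraint: Callable[[Configuration], bool]  # True when the constraint holds


# A DCSP is a time series of CSPs: dcsp[t] is the CSP at time t.
# Here a second variable joins at t = 1 and the constraint changes too.
dcsp: List[CSP] = [
    CSP(["x1"], {"x1": {0, 1}}, lambda cfg: cfg["x1"] == 1),
    CSP(["x1", "x2"], {"x1": {0, 1}, "x2": {0, 1, 2}},
        lambda cfg: cfg["x1"] + cfg["x2"] >= 2),
]
```

Keeping the variable list inside each time step is what lets members join or go away between steps.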
Configuration and fitness
• Each variable takes a value from its domain
• A set of value assignments to the variables is a configuration of the system at time t
• A configuration is fit iff it satisfies the constraint $C_t$
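The fitness test can be sketched as follows, assuming a configuration is a dictionary from variable names to values and the constraint is a Boolean function. The helper names `is_valid` and `is_fit` and the sample data are hypothetical.

```python
from typing import Callable, Dict, Set

Configuration = Dict[str, int]


def is_valid(config: Configuration, domains: Dict[str, Set[int]]) -> bool:
    """Every variable is assigned, and each value lies in its domain."""
    return config.keys() == domains.keys() and all(
        config[var] in domains[var] for var in domains
    )


def is_fit(config: Configuration,
           domains: Dict[str, Set[int]],
           constraint: Callable[[Configuration], bool]) -> bool:
    """A configuration is fit when it is valid and satisfies the constraint."""
    return is_valid(config, domains) and constraint(config)


# Illustrative data only.
domains = {"x1": {0, 1}, "x2": {0, 1, 2}}
constraint = lambda cfg: cfg["x1"] + cfg["x2"] >= 2
```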
K-Recoverable
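• A configuration sequence in a dynamic system is k-recoverable if there is no subsequence of more than k consecutive configurations where all the configurations are unfit
[Figure: timeline with Event 1 and Event 2; the configurations are unfit after an event and then fit again]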
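A small sketch of a k-recoverability check over a recorded fitness timeline. It assumes the reading that k bounds the length of any run of consecutive unfit configurations; the function names and the sample timeline are hypothetical.

```python
from typing import Sequence


def longest_unfit_run(fitness: Sequence[bool]) -> int:
    """Length of the longest run of consecutive unfit (False) entries."""
    longest = run = 0
    for fit in fitness:
        run = 0 if fit else run + 1
        longest = max(longest, run)
    return longest


def is_k_recoverable(fitness: Sequence[bool], k: int) -> bool:
    """Assumed reading: no run of more than k consecutive unfit configurations."""
    return longest_unfit_run(fitness) <= k


# True = fit, False = unfit; two events, each followed by a recovery.
timeline = [True, False, False, True, True, False, True]
```

With this timeline, `is_k_recoverable(timeline, 2)` holds and `is_k_recoverable(timeline, 1)` does not.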
Example: Resilient Spacecraft RS-1
Components: each takes a value from the domain {Green, Red}
Fitness: Every component is Green
Conditions on external events:
1. Each event affects at most k components
2. Next event is at least k days apart
Adaptation Strategy:
• The engineer fixes one component a day
RS-1 is k-Recoverable
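The RS-1 claim can be explored with a small simulation. This is a sketch under stated assumptions, not the authors' model: each day an event may strike first, the engineer then fixes one Red component, and fitness is observed at the end of the day. The function names, the random event schedule, and the 0.5 event probability are all hypothetical.

```python
import random
from typing import List


def simulate_rs1(n_components: int, k: int, days: int, seed: int = 0) -> List[bool]:
    """Return the end-of-day fitness of RS-1 for each simulated day (requires k <= n_components)."""
    rng = random.Random(seed)
    red = set()          # indices of components that are currently Red
    fitness = []
    next_event_day = 0   # earliest day on which another event may occur
    for day in range(days):
        # Condition 2: events are at least k days apart.
        if day >= next_event_day and rng.random() < 0.5:
            # Condition 1: an event affects at most k components.
            red.update(rng.sample(range(n_components), rng.randint(1, k)))
            next_event_day = day + k
        if red:
            red.pop()    # the engineer fixes one component a day
        fitness.append(not red)  # fit iff every component is Green
    return fitness


def longest_unfit_run(fitness: List[bool]) -> int:
    longest = run = 0
    for fit in fitness:
        run = 0 if fit else run + 1
        longest = max(longest, run)
    return longest
```

Under this ordering all damage from one event is repaired before the next event is allowed, so no unfit run exceeds k days.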
We actually need formal ways to represent accidental failures and adaptation strategies
• Transitional Constraint (TC): captures laws, causality, and non-deterministic events
• Adaptation Strategy (AS): represents actions taken by the system itself
Spacecraft Example again
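[Figure: two transitions of the spacecraft configuration. In the first, the Transitional Constraint introduces component failures and the Adaptation Strategy follows. In the second, nothing happened under the Transitional Constraint and the Adaptation Strategy follows.]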
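One way to sketch the two notions for the spacecraft: the transitional constraint is modeled as a function returning every configuration the environment may produce next (non-deterministic, including "nothing happened"), and the adaptation strategy as a deterministic function applied by the system. The encoding of a configuration as the set of Red component indices and all names below are assumptions for illustration.

```python
from itertools import combinations
from typing import FrozenSet, Set

# A configuration is encoded as the set of indices of Red components.
Config = FrozenSet[int]


def transitional_constraint(red: Config, n: int, k: int) -> Set[Config]:
    """All configurations one event may lead to: between 0 and k Green components fail."""
    green = [c for c in range(n) if c not in red]
    successors = set()
    for m in range(k + 1):               # m = 0 covers "nothing happened"
        for failed in combinations(green, m):
            successors.add(red | frozenset(failed))
    return successors


def adaptation_strategy(red: Config) -> Config:
    """The engineer fixes one component (here, the lowest-numbered Red one)."""
    if not red:
        return red
    return red - {min(red)}
```

Separating the two functions keeps what the environment may do apart from what the system chooses to do.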
We can easily integrate the notion of l-Resistance to get our resilience definition
• Express a constraint $C_t$ as the intersection of multiple $C_t^i$ for $i = 1$ to $M_t$
• Define the service level as a weighted sum of the satisfied constraints $C_t^i$
• l-Resistance guarantees an upper bound on the service degradation
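A sketch of the weighted-sum service level. It assumes that degradation is measured as the total weight minus the current service level, and that l-resistance means this degradation never exceeds l; both are readings of the bullets above, not definitions confirmed by the slides. The weights and component names are illustrative.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Config = Dict[str, int]
WeightedConstraint = Tuple[int, Callable[[Config], bool]]


def service_level(config: Config, parts: Sequence[WeightedConstraint]) -> int:
    """Weighted sum over the sub-constraints that the configuration satisfies."""
    return sum(weight for weight, holds in parts if holds(config))


def degradation(config: Config, parts: Sequence[WeightedConstraint]) -> int:
    """Assumed measure: full service (sum of all weights) minus the current level."""
    return sum(weight for weight, _ in parts) - service_level(config, parts)


def is_l_resistant(configs: List[Config], parts: Sequence[WeightedConstraint], l: int) -> bool:
    """Assumed reading: degradation never exceeds l along the sequence."""
    return all(degradation(cfg, parts) <= l for cfg in configs)


# Illustrative sub-constraints with weights summing to 100.
parts = [
    (50, lambda cfg: cfg["power"] == 1),
    (30, lambda cfg: cfg["comms"] == 1),
    (20, lambda cfg: cfg["sensor"] == 1),
]
history = [
    {"power": 1, "comms": 1, "sensor": 1},
    {"power": 1, "comms": 0, "sensor": 1},
]
```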
What’s Next?
• Proactive resilience verification algorithm
– Find stable solutions by utilizing knowledge of transitional constraints
• Another formalization based on Distributed Constraint Optimization Problems (DCOPs)
– Defining multiple utility functions might be more practical
• Study common resilience strategies:
– Diversity, Adaptability, Redundancy and Altruism
Adaptability Example: Ant Colony on the Shore
$X_1$: Location of the colony
Fitness: fit if $f(X_1) > C_t$
Sea level $C_t$ goes up every l days
[Figure: shore profile showing the sea level $C_t$ and the land height $f(X_1)$]
Adaptation Strategy: one rule if (unfit), another otherwise
This ant colony is 1-resilient if
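A toy simulation of the ant colony. Every number and rule here is an assumption made for illustration: a linear land height, a sea level that rises by one unit every l days, and a strategy that moves the colony inland by a fixed step when it is unfit and stays put otherwise. The strategy and the resilience condition on the slide are not reproduced.

```python
from typing import List


def land_height(x: int) -> int:
    """Hypothetical shore profile: height grows linearly with distance inland."""
    return x


def simulate_colony(days: int, l: int, x0: int = 1, step: int = 2) -> List[bool]:
    """Return the daily fitness of the colony, observed before it adapts."""
    x = x0      # X1: location of the colony
    sea = 0     # Ct: current sea level
    fitness = []
    for day in range(days):
        if day > 0 and day % l == 0:
            sea += 1                   # sea level goes up every l days
        fit = land_height(x) > sea     # fit iff f(X1) > Ct
        fitness.append(fit)
        if not fit:
            x += step                  # assumed rule: move inland when unfit
    return fitness
```

With `l = 3` and the default step, every unfit day in a 30-day run is followed by a fit day.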
Diversity Example: Space Colony
• Colony of n robots
• Each robot has ten binary features (e.g., 2-leg/4-leg, flying/non-flying, …)
– E.g., <0110111011>
• Constraint C: a subset of $\{0,1\}^{10}$ (the set of all 1,024 configurations)
– C: "fit" configurations
– A robot is fit if its configuration is in C
Resource
• Resource Reserve R
– Fit robots contribute to build up R
– A robot consumes one unit for reconfiguring its one feature
• The colony is resilient if robots can survive a series of changing constraints $C_1, C_2, \ldots, C_t, \ldots$
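A sketch of the bookkeeping for robots and the resource reserve, with each robot stored as a 10-bit integer. The one-unit cost per feature flip comes from the slide; the amount a fit robot contributes (one unit per step here) is an assumption, as are the function names and sample data.

```python
from typing import List, Set


def is_fit(robot: int, fit_set: Set[int]) -> bool:
    """A robot is fit if its configuration is in C."""
    return robot in fit_set


def update_reserve(reserve: int, robots: List[int], fit_set: Set[int], flips: int) -> int:
    """One time step of the reserve R.

    Assumption: each fit robot adds one unit. Each feature flip costs one unit.
    """
    gained = sum(1 for robot in robots if is_fit(robot, fit_set))
    return reserve + gained - flips


# Illustrative colony of three robots and a constraint C with two fit configurations.
robots = [0b0110111011, 0b0000000000, 0b1111111111]
C = {0b0110111011, 0b1111111111}
```

Starting from a reserve of 5, a step with three feature flips leaves `update_reserve(5, robots, C, flips=3)` at 4.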
Notes on Adaptation Strategies
• Local vs Global
– Local: Each robot makes its own decision independently from others
– Global: There is a global coordination. Every robot must follow the order
– Mixed
• Complete vs Incomplete knowledge on C
– Complete knowledge: max 10 steps to become fit again
– Incomplete knowledge: probabilistic (max 1023 steps if the landscape is stable)
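The 10-step bound under complete knowledge follows from the Hamming distance between 10-bit configurations, which is at most 10. A minimal sketch, assuming robots are stored as integers and C is non-empty; the function names are hypothetical.

```python
from typing import Set


def hamming_distance(a: int, b: int) -> int:
    """Number of features in which two configurations differ."""
    return bin(a ^ b).count("1")


def steps_to_fit(robot: int, fit_set: Set[int]) -> int:
    """Fewest single-feature flips needed to reach some fit configuration.

    With complete knowledge of C the robot can head for the nearest fit
    configuration. Raises ValueError if fit_set is empty.
    """
    return min(hamming_distance(robot, target) for target in fit_set)
```

For example, a robot at `0b0000000000` whose only fit configuration is `0b1111111111` needs all 10 flips.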
Notes on Constraints
• Topological continuity
– If $x, y \in C$, there is $x_1 (= x), x_2, \ldots, x_k (= y)$ s.t. $x_i \in C$ and hamming_distance($x_i$, $x_{i+1}$) = 1
• Semi continuity
– There are only a small number of isolated regions
• Small change vs disruptive change
– Small: only neighbors are added/deleted
– Disruptive: non-small
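Topological continuity can be checked with a breadth-first search over configurations that differ in exactly one feature. This is a sketch assuming configurations are 10-bit integers; `N_FEATURES` and the function names are hypothetical.

```python
from collections import deque
from typing import List, Set

N_FEATURES = 10  # ten binary features per robot


def neighbors(config: int) -> List[int]:
    """Configurations at Hamming distance 1: flip each feature in turn."""
    return [config ^ (1 << bit) for bit in range(N_FEATURES)]


def is_topologically_continuous(fit_set: Set[int]) -> bool:
    """True if any two members of C are linked by a path of Hamming-1 steps inside C."""
    if not fit_set:
        return True  # vacuously continuous
    start = next(iter(fit_set))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors(current):
            if nxt in fit_set and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(fit_set)
```

The set `{0b000, 0b001, 0b011}` is continuous, while `{0b000, 0b011}` is not because its two members are two flips apart with no fit configuration between them.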
Conclusions
• Formal definition of resilience based on DCSPs
– Integrate the notions of Resistance and Recoverability
– Represent open systems in a changing environment
• Need to develop additional formalism to define various classes of transitional constraints and adaptation strategies
• Plan to apply our model to systems in different domains
12/10/16 Kazuhiro Minami 18
12/08/20 Hiroshi Maruyama 19
Any Questions?
For more information, please visit our project web site at
systemsresilience.org