Data survivability vs. security in information systems

1

DATA SURVIVABILITY VS. SECURITY IN INFORMATION SYSTEMS

Gregory Levitin, Kjell Hausken, Heidi A. Taboada, David W. Coit

Presented by Chia-Ling Lee

2

Agenda

Introduction

Data survivability/data security model

Two multiple objective optimization models

Multiple objective optimization

Multiple objective genetic algorithm (MOGA)

Examples

Conclusion

3

Introduction

A defender may suffer theft and/or destruction of information.

To prevent theft, the defender can separate the information into multiple blocks and store these blocks on multiple resources.

Separating the information increases its vulnerability, as the destruction of any block makes it unusable.

To prevent destruction, the defender can create multiple copies of each block and store these copies on multiple resources.

Creating multiple copies increases the possibility of theft.

4

Introduction

This paper considers the conflicting information survivability and security requirements in systems with data separation based on a secret sharing approach.

5

Data survivability/data security model

How does a defender separate information into blocks, and make copies of blocks, to prevent theft and destruction?

The defender separates information into n blocks and stores these blocks on n different resources to protect information from theft.

If the attacker succeeds in destroying any block of information, the defender loses the information integrity and cannot use it. The defender enhances the data survivability by creating mi copies of each block i of the information and storing these copies on mi different resources.

6


In order to destroy the information, the attacker should destroy all mi copies of any block i.

In order to steal the information, the attacker should steal at least one copy of each block.

[Figure: Block 1, Block 2, Block 3]

7


Creating more blocks makes information more difficult to steal, but easier to destroy.

Creating more copies of each block makes information more difficult to destroy, but easier to steal.

This trade-off can be treated as a multiple objective optimization problem or as a constrained optimization problem.

8


We assume that the events of data destruction at different resources are independent, which presumes that different resources use different methods of data protection.

9


Formula 1:

Block 1 destruction: 0.1*0.1*0.1 (the block is lost only in this situation, when all three of its copies are destroyed)

V = 1-[1-(0.1*0.1*0.1)]*[1-(0.1*0.1*0.1)]*[1-(0.1*0.1*0.1)]

When the defender is solely concerned about information destruction, the defender prefers maximum mi and minimum n (n=1).

If the attacker gets access to one copy, he can see the entire record (theft), but it is difficult to get access to all copies (destroy the record).
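The arithmetic of Formula 1 can be sketched in a few lines of Python. This is a minimal illustration, assuming independent destruction events as stated earlier; the function name `destruction_probability` and the nested-list layout `v[i][j]` (probability that copy j of block i is destroyed) are choices made here, not notation from the slides.

```python
from math import prod

def destruction_probability(v):
    """v[i][j]: probability that copy j of block i is destroyed (independent events)."""
    # A block survives unless every one of its copies is destroyed.
    block_survives = [1.0 - prod(copies) for copies in v]
    # The information is destroyed as soon as at least one block is lost.
    return 1.0 - prod(block_survives)

# Slide example: 3 blocks, 3 copies each, every copy destroyed with probability 0.1.
V = destruction_probability([[0.1, 0.1, 0.1]] * 3)
print(f"V = {V:.6f}")
```

Adding a copy to a block multiplies that block's loss probability by a factor below 1, while adding a block multiplies the overall survival probability by a factor below 1, which is why maximum mi and minimum n are preferred here.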

10


Formula 2:

When i=1: 1-(1-w11)(1-w12)(1-w13)

W = [1-(1-0.1)(1-0.1)(1-0.1)]*[1-(1-0.1)(1-0.1)(1-0.1)]*[1-(1-0.1)(1-0.1)]

When the defender is solely concerned about information theft, the defender prefers minimum mi and maximum n.

If the attacker gets access to one fragment, the record gets destroyed, but it is difficult to get access to all the fragments (steal the record).
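The theft calculation of Formula 2 can be sketched the same way. This is an illustrative sketch, assuming independent theft events; the function name `theft_probability` and the layout `w[i][j]` (probability that copy j of block i is stolen) are introduced here for the example. The input reproduces the factors written on the slide, where the third block has two terms.

```python
from math import prod

def theft_probability(w):
    """w[i][j]: probability that copy j of block i is stolen (independent events)."""
    # A block is stolen if at least one of its copies is stolen.
    block_stolen = [1.0 - prod(1.0 - p for p in copies) for copies in w]
    # The information is stolen only if every block is stolen.
    return prod(block_stolen)

# Factors as written on the slide: two blocks with three copies, one with two.
W = theft_probability([[0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1]])
print(f"W = {W:.6f}")
```

Each extra copy raises the chance that a given block is stolen, and each extra block adds another factor below 1 to the product, which is why minimum mi and maximum n are preferred for theft.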

11


The defender usually has resource constraints and cannot deploy arbitrarily many blocks and copies.

Example: N = m*n = 32

Min V subject to W < 0.5

Answer: (m,n) = (8,4), W = 0.11, V = 0.89
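A constrained problem of this size can be solved by simply enumerating every way to split the N resources into blocks and copies. The sketch below assumes identical resources and an equal number of copies per block; the per-copy probabilities v = 0.7 and w = 0.3 are illustrative assumptions, since the slide does not state the values it used, and the function names are hypothetical.

```python
def destruction_probability(v, copies, blocks):
    # Every block has `copies` identical copies, each destroyed with probability v.
    return 1.0 - (1.0 - v ** copies) ** blocks

def theft_probability(w, copies, blocks):
    # Every block has `copies` identical copies, each stolen with probability w.
    return (1.0 - (1.0 - w) ** copies) ** blocks

def best_layout(total, v, w, max_theft):
    """Minimize V over all copies*blocks == total, subject to W < max_theft."""
    best = None
    for copies in range(1, total + 1):
        if total % copies:
            continue  # only exact factorizations of the resource budget
        blocks = total // copies
        V = destruction_probability(v, copies, blocks)
        W = theft_probability(w, copies, blocks)
        if W < max_theft and (best is None or V < best[2]):
            best = (copies, blocks, V, W)
    return best

# Assumed values: v = 0.7, w = 0.3, budget of 32 resources, W must stay below 0.5.
copies, blocks, V, W = best_layout(32, v=0.7, w=0.3, max_theft=0.5)
print(f"copies per block = {copies}, blocks = {blocks}, V = {V:.2f}, W = {W:.2f}")
```

Under these assumed values the search selects 8 blocks of 4 copies each, with V of about 0.89 and W of about 0.11. Layouts with more copies per block violate the theft constraint, and layouts with more blocks have a higher V.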

12


If the defender seeks to address both concerns, it must strike a compromise.

13

Two multiple objective optimization models

The first model is to determine a data/information storage solution to simultaneously minimize the probabilities of information destruction and data theft.

The second model adds an additional objective to minimize cost of information storage.

Since both models have multiple conflicting objective functions, the goal will be to determine a set of non-dominated Pareto solutions.
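The notion of a non-dominated solution can be made concrete with a small filter. This is a generic sketch for minimization problems, not the procedure used in the paper; the candidate (V, W) pairs are made-up values for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (V, W) pairs for five storage solutions.
candidates = [(0.10, 0.50), (0.20, 0.20), (0.50, 0.10), (0.30, 0.30), (0.60, 0.60)]
front = pareto_front(candidates)
print(front)
```

The last two candidates are dropped because (0.20, 0.20) is better in both objectives; the three that remain trade V against W and cannot be ranked without further preferences.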

14

Each resource k, k=1,...,K, has a known cost ck, a destruction probability vk and a theft probability wk.

Therefore, for any copy j of block i stored on resource f(i,j), there are associated probabilities vf(i,j) and wf(i,j).


15

Problem 1: minimize V and W

Problem 2: minimize V, W and the cost C
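One way to write the two problems, consistent with the worked examples for V and W earlier in the deck, is given below. The symbols v, w, c, f(i,j), n and mi follow the slides; writing the cost C as the sum of the costs of all resources used is an assumption made here, since the slides do not show the cost expression.

```latex
\begin{align*}
V(f) &= 1 - \prod_{i=1}^{n}\Bigl(1 - \prod_{j=1}^{m_i} v_{f(i,j)}\Bigr) \\
W(f) &= \prod_{i=1}^{n}\Bigl(1 - \prod_{j=1}^{m_i} \bigl(1 - w_{f(i,j)}\bigr)\Bigr) \\
C(f) &= \sum_{i=1}^{n}\sum_{j=1}^{m_i} c_{f(i,j)} \qquad \text{(assumed form)} \\
\text{Problem 1:}\quad & \min_{f}\; \bigl(V(f),\, W(f)\bigr) \\
\text{Problem 2:}\quad & \min_{f}\; \bigl(V(f),\, W(f),\, C(f)\bigr)
\end{align*}
```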


16

Multiple objective optimization

There are two general approaches to determine solutions to multiple objective problems.

The first approach is to combine mathematically all of the objective functions into one composite objective function, often a cost or utility function. It applies when specific, defensible and precise objective function weights, equivalent cost conversion factors, or utility functions can be determined.

The second approach is to determine the Pareto set of solutions.
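The first approach can be illustrated with a weighted sum, which is one common way to build a composite objective. The sketch below is generic; the candidate (V, W) values, the labels and the weights are made up for illustration.

```python
def weighted_sum(objectives, weights):
    """Collapse several objective values into one composite score."""
    return sum(wt * val for wt, val in zip(weights, objectives))

# Hypothetical (V, W) values for three non-dominated storage solutions.
candidates = {"A": (0.10, 0.50), "B": (0.20, 0.20), "C": (0.50, 0.10)}

def pick(weights):
    """Return the label of the candidate with the lowest composite score."""
    return min(candidates, key=lambda name: weighted_sum(candidates[name], weights))

print(pick((0.5, 0.5)))  # equal concern for destruction and theft
print(pick((0.9, 0.1)))  # destruction weighted far more heavily
```

With equal weights the balanced solution B scores lowest, while the destruction-heavy weights select A. The choice depends entirely on the weights, which is why this approach needs weights that can be defended.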

17

Multiple objective optimization

Advantages of the Pareto set approach:

1) the decision-maker has several alternatives, and the final selection can incorporate additional information or concerns not included in the original problem formulation

2) further investigation of the Pareto set allows for the explicit comparison and consideration of the various trade-offs among objective functions

3) there is often a "knee" solution that balances the relative trade-offs that would not be apparent otherwise

Disadvantages of the Pareto set approach:

the decision-maker can get too many solutions to consider before selecting a preferred solution to implement. Taboada et al. present a methodology to minimize the number of Pareto-optimal solutions to consider.

18

Multiple objective genetic algorithm (MOGA)

For this study, we determine solutions to the problem using a MOGA called MOEA-DAP (multiple objective evolutionary algorithm-design allocation problem).

MOGAs are ideally suited to multiple objective optimization because they are able to capture multiple near-Pareto-optimal solutions in a single GA run and may exploit similarities of solutions by recombination.

MOGAs are search methods that take their inspiration from natural selection and survival of the fittest in the biological world. MOGAs follow the fundamentals of single-objective GAs.

19

The main difference between single-objective GAs and MOGAs is that MOGAs concentrate their efforts on the strategies used for selection and diversification.

The two fundamental goals in MOGA design are guiding the search towards the Pareto set and keeping a diverse set of non-dominated solutions.


20

MOEA-DAP differs from other multiple objective evolutionary algorithms in the crossover operation performed and in the fitness assignment.

In the crossover step, several offspring are created through multi-parent recombination. Thus, the mating pool contains a great amount of diversity of solutions.

Two different methods are used to assign fitness to the solutions.

The first fitness is intended to maintain population diversity.

The second fitness aims to select those individuals which are more dominating.
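The idea of multi-parent recombination can be sketched generically. The slides do not specify the operator used by MOEA-DAP, so this is not its actual crossover; it is a simple uniform variant assumed here for illustration, in which each block of the child is copied from a randomly chosen parent. The function name and the example parents are hypothetical.

```python
import random

def multi_parent_crossover(parents, rng):
    """Build one child by copying each block from a randomly chosen parent."""
    # Simplification: only positions present in every parent are recombined.
    length = min(len(p) for p in parents)
    return [list(rng.choice(parents)[i]) for i in range(length)]

# Three hypothetical parents; each inner list holds the resource types of one block.
parents = [
    [[3, 5], [3, 3, 3]],
    [[5, 5, 5], [1, 3]],
    [[3], [5, 3]],
]
rng = random.Random(0)  # seeded so the run is repeatable
child = multi_parent_crossover(parents, rng)
print(child)
```

Drawing from more than two parents lets a single offspring mix block layouts from several solutions, which is one way a mating pool can gain diversity.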


21

The GA chromosome used for this study is a string of n substrings, one per block, each consisting of m elements that give the resource where each copy of that block is stored.

For example, consider a GA chromosome vector a, where aij is the type of resource where copy j of block i is located (aij represents f(i,j)).

The chromosome can represent a variable number of blocks.
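Decoding such a chromosome into objective values can be sketched as follows. The resource table is entirely hypothetical (the slides do not reproduce the example data), and treating the cost as the sum of the costs of all resources used is an assumption made here.

```python
from math import prod

# Hypothetical resource table: type -> (cost c_k, destruction prob. v_k, theft prob. w_k).
RESOURCES = {
    1: (10, 0.10, 0.30),
    3: (10, 0.20, 0.10),
    5: (20, 0.10, 0.20),
}

def evaluate(chromosome):
    """chromosome: list of blocks, each a list of resource types (a_ij = f(i,j))."""
    # Destruction: at least one block loses all of its copies.
    V = 1.0 - prod(1.0 - prod(RESOURCES[k][1] for k in block) for block in chromosome)
    # Theft: at least one copy of every block is stolen.
    W = prod(1.0 - prod(1.0 - RESOURCES[k][2] for k in block) for block in chromosome)
    # Cost: assumed to be the sum of the costs of all resources used.
    C = sum(RESOURCES[k][0] for block in chromosome for k in block)
    return V, W, C

# Two blocks: the first stored on resource types 3 and 5, the second on three type-3 resources.
V, W, C = evaluate([[3, 5], [3, 3, 3]])
print(f"V = {V:.5f}, W = {W:.5f}, C = {C}")
```

Because the blocks are plain lists of different lengths, the same representation covers a variable number of blocks and a variable number of copies per block.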


22

Examples

Each example was formulated and solved as both Problem 1 and Problem 2, with two and three objective functions respectively.

For both examples, there was no defined cost constraint. The examples demonstrate that this approach can be used to find an approximation of the Pareto set of promising solutions.

23

Example (1)

There is a maximum of five blocks.

MOEA-DAP was first run for Problem P1 with a population size of 100 for 10 generations.

There were 79 solutions in the obtained approximation of the Pareto set.

24

Example (1)

[Figure: obtained approximation of the Pareto set; axes: Min V, Min W]

25

Example (1)

Point 1: (3,5,5,3) (3,5,3,5) (3,5,4,3,1) (1,3,5,3)

Both objective functions are deemed to be important.

V=0.00368 (information destruction), W=0.00388 (information theft)

26

Example (1)

Point 2: (5,5,5,5)

Point 2 favors the minimization of probability of destruction.

(Max mi, min n=1)

V=0.00010 (information destruction), W=0.5904 (information theft)

27

Example (1)

Point 3: (3,3) (3,3,3) (3,5) (3,5) (3,3,3)

Point 3 favors the minimization of the probability of theft.

(max n=5, min mi)

V=0.1894 (information destruction), W=0.0011 (information theft)

28

Example (1)

Notice that all three of these selected Pareto solutions use primarily resources of types 3 and 5, because resource 3 provides the lowest wij and resource 5 (along with 1) provides the lowest vij.

29

Example (1)

The first example was also solved as Problem P2 with three objective functions, including cost.

MOEA-DAP was run for this problem with a population size of 200 for 10 generations.

There were 154 solutions in the obtained approximation of the Pareto set.

30

Example (1)

Point 1: (1) (3,3,3)

V=0.00368 (information destruction), W=0.388 (information theft), C=30

31

Example (2)

There is a maximum of ten blocks.

MOEA-DAP was run with a population size of 120 for 5 generations.

There were 73 solutions in the obtained approximation of the Pareto set.

32

Example (2)

The solution uses the maximum number of blocks, but relatively few copies, to provide a low W.

V=0.0364, W=2.39E-5

33

Example (2)

The second example was then solved as Problem P2 with three objective functions, including the cost.

MOEA-DAP was run with a population size of 200 for 10 generations.

There were 166 solutions in the obtained approximation of the Pareto set.

34

Example (2)

This is a solution with very low cost and very low probability of information theft.

V=0.4845, W=7.2E-7, C=30

35

Conclusion

We consider a defender who seeks to store information securely. An attacker may steal or destroy the information, and we demonstrate that these are two conflicting concerns.

We show that to prevent information destruction, the defender prefers to maximize the number of parallel copies of each block, regardless of how many blocks in series there are.

To prevent information theft, the defender prefers to maximize the number of separated blocks, regardless of how many copies in series there are.

36

Conclusion

Two multiple objective optimization models are developed. These minimize the probabilities of information destruction and data theft, and minimize cost.

Using a multiple objective evolutionary algorithm, we determine how to distribute an optimal number of blocks and copies of blocks among the resources.

Further research could be devoted to multi-objective optimization of complex tree-structured information storage topologies and to systems with multi-level protection (defense-in-depth architecture).

37

Thanks for your attention
