Web Intrusion Detection with Bayesian Network by Kanatoko AVTokyo 2013.5 English Slide

Post on 03-Jul-2015


Copyright (c) Bitforest Co., Ltd.

 

 

Web Intrusion Detection with Bayesian Network

Kanatoko
Chief Tech Officer, Bitforest Co., Ltd.
@kinyuka
http://www.jumperz.net/
http://www.scutum.jp/

02/17/141


Who am I?

– Kanatoko
– Web Application Firewall developer
– My mission: building an accurate WAF
  • Reduce false positives/false negatives


Bayes’ theorem

– Used when we want to calculate P(A|B) when P(B|A) is known
– P(B|A): the probability of event B given event A
– Not so hard to understand
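
For reference, the theorem the slide refers to can be written out as below. The expansion of P(B) over A and its complement is the standard law of total probability, added here for convenience; it is not shown on the slide.

```latex
$$
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)},
\qquad
P(B) = P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A)
$$
```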


What is Bayesian Network?

– A probabilistic graphical model (a type of statistical model) that represents a set of random variables and their conditional dependencies via a graph (Wikipedia)


(Figure: example network with the nodes “AVTokyo”, “Hacker”, “Drunken”, and “Beer in hand”)


Famous sprinkler example


• Nodes and edges represent cause and effect
• Probabilities are shown as tables (CPT: conditional probability table)
• Observations (= evidence) are used as input to nodes
• Unobservable nodes are used as output (= what we want to know)
• “Grass is wet. What is the probability it rained?”
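
The question in the last bullet can be answered by summing the joint distribution over the unobserved node. The following is a minimal sketch in Python, not the Weka demo from the talk. The CPT values are the ones commonly quoted for the textbook sprinkler example and are an assumption here, since the slide's tables are not in this transcript; the function names are hypothetical.

```python
from itertools import product

# Assumed CPT values (commonly quoted textbook numbers, not from the slide).
P_RAIN = 0.2
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.4}
P_WET_GIVEN = {  # key: (sprinkler, rain)
    (False, False): 0.0,
    (False, True): 0.8,
    (True, False): 0.9,
    (True, True): 0.99,
}


def joint(rain, sprinkler, wet):
    """Joint probability P(rain, sprinkler, wet) as a product of CPT entries."""
    p = P_RAIN if rain else 1 - P_RAIN
    p_s = P_SPRINKLER_GIVEN_RAIN[rain]
    p *= p_s if sprinkler else 1 - p_s
    p_w = P_WET_GIVEN[(sprinkler, rain)]
    p *= p_w if wet else 1 - p_w
    return p


def p_rain_given_wet():
    """P(rain | grass is wet), summing out the sprinkler node."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    denominator = sum(
        joint(r, s, True) for r, s in product((True, False), repeat=2)
    )
    return numerator / denominator


print(round(p_rain_given_wet(), 4))
```

With these assumed tables, the evidence "grass is wet" raises the probability of rain from the 20% prior to roughly 36%.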


Weka

– OSS, Java, data mining software
– GUI/lib/tools
– (Sprinkler demo)


Web Intrusion Detection with Bayesian Network


• Probability that the HTTP request is an attack: 1%
• Probability that the HTTP request is NOT an attack: 99%
• Probability that the HTTP request contains ‘alert’ given that the request is an attack: 8%
• Probability that the HTTP request does NOT contain ‘alert’ given that the request is an attack: 92%
• Probability that the HTTP request contains ‘alert’ given that the request is NOT an attack: 0.2%
• Probability that the HTTP request does NOT contain ‘alert’ given that the request is NOT an attack: 99.8%

What is the probability that the HTTP request is an attack? 1%

What is the probability that the HTTP request is an attack, given that the HTTP request contains ‘alert’? 28.8%
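
The 28.8% figure follows directly from Bayes' theorem applied to the numbers on this slide. A minimal sketch of the arithmetic in Python (the function and parameter names are hypothetical, chosen for illustration):

```python
def posterior(prior, p_evidence_given_attack, p_evidence_given_normal):
    """P(attack | evidence) via Bayes' theorem for a binary attack/normal class."""
    numerator = prior * p_evidence_given_attack
    denominator = numerator + (1 - prior) * p_evidence_given_normal
    return numerator / denominator


# Values from the slide: 1% prior, 8% of attacks and 0.2% of normal
# requests contain 'alert'.
p_attack_given_alert = posterior(0.01, 0.08, 0.002)
print(f"{p_attack_given_alert:.1%}")
```

That is, 0.0008 / (0.0008 + 0.00198), so seeing ‘alert’ moves the estimate from 1% to about 28.8%.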


Spam filter and Naïve Bayes


Building Accurate Intrusion Detection System / Web Application Firewall

– Signature-based (blacklist)
  • If ‘alert’ then die!
  • Simple and has some advantages
    – Clear
    – Performance: stable / fast enough
    – Maintainable / human readable
  • Disadvantage: high false positive rate
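
To make the false positive problem concrete, here is a minimal sketch of the "if ‘alert’ then die" approach in Python. The signature list and the sample requests are hypothetical; a real WAF rule set would be far larger.

```python
# Hypothetical one-entry blacklist, mirroring the "if 'alert' then die" rule.
SIGNATURES = ["alert"]


def is_blocked(request):
    """Block the request if any signature appears anywhere in it."""
    lowered = request.lower()
    return any(signature in lowered for signature in SIGNATURES)


print(is_blocked("q=<script>alert(1)</script>"))  # an XSS attempt
print(is_blocked("q=weather alert for Tokyo"))    # a harmless search
```

Both requests are blocked, because the rule only sees the substring and cannot weigh how likely the request is to be an attack.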


Building Accurate Intrusion Detection System / Web Application Firewall(cont)

– Threshold model (vs. simple signature/blacklist model)
  • Increment/decrement scores on each signature match
  • Treated as an attack when the total score exceeds a certain threshold
  • Low false positives (good)
  • Hard to change and maintain (bad)
  • Example rule 1: score +5 on ‘UNION’
  • Example rule 2: score +5 on ‘SELECT’
  • When both ‘UNION’ and ‘SELECT’ are found… score +10?
  • Example rule 3: score +20 on ‘UNION and SELECT’
  • Too complicated
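
The three example rules can be sketched as follows. This is an illustrative Python version, not the author's implementation; the threshold of 15 and the function names are assumptions made for the example.

```python
THRESHOLD = 15  # hypothetical value; the slide does not give one


def score(request):
    """Sum the scores of the three example rules from the slide."""
    text = request.upper()
    has_union = "UNION" in text
    has_select = "SELECT" in text
    total = 0
    if has_union:
        total += 5   # rule 1
    if has_select:
        total += 5   # rule 2
    if has_union and has_select:
        total += 20  # rule 3: extra score for the combination
    return total


def is_attack(request):
    """Flag the request when the total score exceeds the threshold."""
    return score(request) > THRESHOLD


print(score("id=1 UNION SELECT password FROM users"))
print(score("q=select your plan"))
```

Rule 3 stacks on top of rules 1 and 2, so the combined request scores 30 rather than 20. Every new combination rule interacts with the existing ones like this, which is the maintenance problem the slide points at.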


Building Accurate Intrusion Detection System / Web Application Firewall(cont)

– Threshold model (vs. simple signature/blacklist model)

– Score +5 on ‘Alert’ (XSS)
– Score +5 on ‘UNION’ (SQLi)
– Score +10 on “Alert UNION”?
– Should distinguish XSS and SQLi (classes)


Building Accurate Intrusion Detection System / Web Application Firewall(cont)

– Bayesian Network
  • Resolves almost all problems of the threshold model


Advantages of Bayesian Network

– Complicated relations can be modeled as a network (GUI)
– Computation result is expressed as a probability
– Easy to maintain
– Corresponds to expert knowledge


Complicated relations can be modeled as network (GUI)

– One-to-many and weak/strong relations can be expressed
– Models can be developed in a GUI tool and then used to compute the probabilities
– We use the Weka Bayesian Network Editor
– Example: XSS/CMS
– Example: VA/User in Japan
– Example: ‘eval’ and programming languages (Java, Ruby, JavaScript, Perl, PHP…)


Computation result is expressed as probability

– ‘UNION’ only (not special)
– ‘SELECT’ only (not special)
– Both ‘UNION’ and ‘SELECT’ (should be marked)
– The probability of the ‘rare case’ is calculated as high by Bayes’ theorem
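
The effect described above can be reproduced with the simplest possible network: one "attack" node with two keyword nodes as children (a naive Bayes structure). The sketch below is in Python; all probabilities are made-up illustrative values, not figures from the talk, and the names are hypothetical.

```python
# Assumed prior and CPT entries (illustrative only, not from the talk).
PRIOR_ATTACK = 0.01
# keyword -> (P(keyword present | attack), P(keyword present | normal))
P_KEYWORD = {"UNION": (0.6, 0.02), "SELECT": (0.6, 0.03)}


def p_attack(observed):
    """P(attack | keyword observations), keywords independent given the class."""
    like_attack = PRIOR_ATTACK
    like_normal = 1 - PRIOR_ATTACK
    for keyword, (p_if_attack, p_if_normal) in P_KEYWORD.items():
        present = observed[keyword]
        like_attack *= p_if_attack if present else 1 - p_if_attack
        like_normal *= p_if_normal if present else 1 - p_if_normal
    return like_attack / (like_attack + like_normal)


for union, select in [(True, False), (False, True), (True, True)]:
    p = p_attack({"UNION": union, "SELECT": select})
    print(f"UNION={union} SELECT={select}: {p:.3f}")
```

With these assumed numbers, either keyword alone leaves the attack probability low, while the rare combination of both pushes it well above 80%. No separate combination rule is needed; the result falls out of the probabilities.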


Easy to maintain

– Intermediate nodes (mediating variables) play an important role

– Updating the values in a CPT changes the results in the way we expect

– Can be improved little by little because, unlike a neural network, it is not a black box


Corresponds to expert knowledge

“If A and B, then maybe C …”

Is expressed as probability

Similarity between human decision making process and Bayesian Network


Conclusion

Bayesian Network can be used to make decisions based on observations

If “a human (expert) can detect attacks”

Then we want the computer to do that, too

Use Bayesian Network!


We’re hiring!

– Bitforest Co., Ltd.
– Web Application Security Expert
– Data Science Expert
– Contact: @kinyuka
