Milan, 7 April 2009
Quale sicurezza nelle reti IP? (What security in IP networks?)
Network Intrusion Detection Systems & Network Anomaly Detection: a research perspective
Stefano Giordano
Internet Society (ISOC) & Università di Pisa
Dipartimento di Ingegneria dell'Informazione: Elettronica, Informatica, Telecomunicazioni
Telecommunication Networks Research Group
Network Processors
Radisys ENP2611 PCI Board:
Intel IXP2400 (8 cores, 2.5 Gbps) at 600 MHz, three Gigabit Ethernet ports, one 10/100 Ethernet port
ADI Engineering Roadrunner board:
Intel IXP2350 Network Processor at 900 MHz
Two Gigabit Ethernet ports, one 10/100 Ethernet port
Traffic Measurements
• Develop a device capable of:
– Performing packet capturing at high speed (>1 Gbps) without loss
– Performing packet time‐stamping with high accuracy (no interrupt latency, no packet length noise)
– Performing packet processing in a scalable and flexible architecture
• The device should be:
– Reasonably cheap
– Compatible with existing libpcap based applications
– User friendly
– Capable of performing any kind of high level packet processing at wire speed and on‐line
– Sufficiently accurate to allow traffic characterization
A PC/NP based Traffic Measurements System
[Figure: system architecture; labels: Configuration & Management, "Batch frames", Splitter (inbound, outbound)]
Network processor side application
• μengine side ("Data plane")
– Packet timestamping
– Packet classification
– Batch frame crafting
• XScale side ("Management and Control Plane")
– Handle exception
– Configuration
– Communication with User interface
Batch frame
[Figure: batch frame layout; labels: Ethernet header, fields of 8 bytes and 64+ bytes, FCS]
Ethernet header:
Type: 0x9000
SRC: sending interface
DST: receiving PC MAC
The mechanism
At arrival time, each packet is timestamped and moved to DRAM. Then the packet is transmitted to the classification microengine.
Each packet is dropped or assigned a flowID by the classification microengine; in this last case a packet digest data structure is created and the fields STAMP and F are set.
The batch frame builder microengine matches the flowID with the number of bytes to be stripped and the destination PC MAC in charge of processing that flowID.
The packet digest is copied into the proper batch frame together with the first n bytes of the packet.
[Figure: one batch frame per destination, MAC 0, MAC 1, ..., MAC 31]
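The data path above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the digest layout (STAMP, flowID, original length, fragment length), the toy classifier and the flow table are hypothetical, since the slides only say that the digest carries STAMP and F and that batch frames use Ethernet type 0x9000.

```python
import struct

ETH_TYPE_BATCH = 0x9000            # frame type given in the slides
# Hypothetical digest layout: STAMP (8 bytes), flowID, original length, fragment length
DIGEST = struct.Struct("!QHHH")

# Hypothetical configuration: flowID -> (bytes to keep, destination PC MAC)
FLOW_TABLE = {
    1: (64, bytes.fromhex("020000000001")),
    2: (96, bytes.fromhex("020000000002")),
}

def classify(packet):
    """Toy classifier (assumption): flow 1 for IPv4/TCP, flow 2 for IPv4/UDP, else drop."""
    if len(packet) < 34 or packet[12:14] != b"\x08\x00":
        return None
    return {6: 1, 17: 2}.get(packet[23])   # byte 23 is the IPv4 protocol field

def add_to_batch(batches, packet, stamp):
    """Append digest + first n bytes to the batch of the PC in charge of the flow."""
    flow_id = classify(packet)
    if flow_id is None:
        return False                        # packet dropped
    keep, dst_mac = FLOW_TABLE[flow_id]
    fragment = packet[:keep]
    digest = DIGEST.pack(stamp, flow_id, len(packet), len(fragment))
    batches.setdefault(dst_mac, bytearray()).extend(digest + fragment)
    return True

def finish_batch(dst_mac, src_mac, payload):
    """Prepend the Ethernet header (DST, SRC, type 0x9000) to a batch payload."""
    return dst_mac + src_mac + struct.pack("!H", ETH_TYPE_BATCH) + bytes(payload)
```

Keeping one growing buffer per destination MAC mirrors the idea that each PC of the cluster receives only the flows it is in charge of.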
XScale application
• Configuration
– Classifier
– FlowID space management
– Fragment length management
• Timestamp UTC calibration
– Timestamp is the value of a counter
– Need correspondence to Timestamp UTC
• Client/Server
– Management of the clients running on the PCs of the cluster
– Forward command from users to NP
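The counter-to-UTC calibration can be pictured as a linear mapping anchored at one reference pair. The sketch below is an assumption about how such a mapping could look, not the XScale code: the tick rate and the reference values are hypothetical, and counter wrap-around is ignored.

```python
def make_calibration(counter_ref, utc_ref, ticks_per_second):
    """Return a function mapping raw counter values to UTC seconds.

    counter_ref and utc_ref are one (counter, UTC) pair sampled together;
    ticks_per_second is the counter frequency (hypothetical value below).
    """
    def to_utc(counter):
        return utc_ref + (counter - counter_ref) / ticks_per_second
    return to_utc

# Usage: a counter assumed to tick at 600 MHz, calibrated at counter value 1000.
to_utc = make_calibration(1000, 1239100000.0, 600_000_000)
one_second_later = to_utc(1000 + 600_000_000)
```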
PC side application: user space
• User space:
– NP communication (TCP/IP)
– User interface
• Currently a PHP interface for classification
– Kernel space communication (IOCTL System Call)
Kernel space: the mechanism
A new "network layer" is registered for the 0x9000 Ethernet frame type.
A virtual interface card is registered for each possible value of flowID (mon0 to mon64k if a single PC receives all the batch frames).
A new empty sk_buff is allocated.
It is timestamped with the UTC timestamp corresponding to the STAMP field of the digest.
The packet fragment address is copied into the data field of the sk_buff structure (zero copy).
The net_device field is set to the virtual interface corresponding to the flowID field of the digest.
Finally the len field of the sk_buff is set to the original length of the captured packet and the netif_rx function is called to process the sk_buff.
PC side application: kernel space
• A new "network layer" is registered inside the Linux kernel
• A virtual interface card is registered for each possible value of flowID (mon0 to mon64k if a single PC receives all the batch frames)
• Each Ethernet frame with type 0x9000 is steered to this layer
• The driver dissects the batch frame and creates a new sk_buff data structure for each packet digest:
– Every sk_buff is timestamped with the STAMP field of the digest
– Each fragment is copied in the data field of the sk_buff struct
– The sk_buff is sent to the virtual interface with index equal to the flowID found in the digest
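The dissection step can be mimicked in user space to make the logic concrete. The following is a sketch, not the kernel driver: the digest layout (STAMP, flowID, original length, fragment length) is a hypothetical one chosen for the example, per-interface lists stand in for the monN virtual interfaces, and the FCS is assumed to have been stripped already.

```python
import struct
from collections import defaultdict

ETH_HEADER_LEN = 14
ETH_TYPE_BATCH = 0x9000
# Hypothetical digest layout: STAMP, flowID, original length, fragment length
DIGEST = struct.Struct("!QHHH")

def dissect_batch_frame(frame):
    """Split a batch frame into per-flow records keyed by interface name."""
    (eth_type,) = struct.unpack_from("!H", frame, 12)
    if eth_type != ETH_TYPE_BATCH:
        raise ValueError("not a batch frame")
    queues = defaultdict(list)      # stands in for the monN virtual interfaces
    offset = ETH_HEADER_LEN
    while offset + DIGEST.size <= len(frame):
        stamp, flow_id, orig_len, frag_len = DIGEST.unpack_from(frame, offset)
        offset += DIGEST.size
        fragment = frame[offset:offset + frag_len]
        offset += frag_len
        # "len" carries the original packet length, as in the sk_buff description
        queues[f"mon{flow_id}"].append(
            {"stamp": stamp, "len": orig_len, "data": fragment}
        )
    return queues
```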
Testbed
[Figure: testbed with optical splitter, Radisys ENP2611 and Spirent AX4000]
Traffic generators: BRUTE
• Create traffic flows with different models
• Schedule start and end of flow generation
• Set a wide range of parameters
• Introduce new traffic models (C language)
• Saturate Fast Ethernet Link (any frame size)
• Saturate Gigabit Ethernet Link with frame > 256 bytes
• Great accuracy
[Figure: two plots, mgen and brute, for an exponential distribution with lambda=1000; horizontal axis from 0 to 0.004]
Nicola Bonelli, Stefano Giordano, Gregorio Procissi, Raffaello Secchi, BRUTE: A High Performance and Extensibile Traffic Generator, Int'l Symposium on Performance of Telecommunication Systems (SPECTS'05), vol. 1, pp. 222-227, Philadelphia 2005
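For readers unfamiliar with the traffic model in the plots, the sketch below draws exponentially distributed inter-departure times with lambda=1000. It is not BRUTE code (BRUTE models are written in C); the function name and the seed are assumptions made for the example.

```python
import random

def exponential_departures(rate, count, seed=0):
    """Return `count` departure times whose gaps are exponential with mean 1/rate."""
    rng = random.Random(seed)       # fixed seed so the run is repeatable
    now = 0.0
    times = []
    for _ in range(count):
        now += rng.expovariate(rate)
        times.append(now)
    return times

# Usage: 20000 departures at lambda = 1000 packets per second.
times = exponential_departures(1000.0, 20000)
mean_gap = times[-1] / len(times)   # expected to be close to 1/1000 s
```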
An NP based traffic generator: BRUNO
• Packet Generator:
– 1.48 Mpkts/s
Best paper at SPECTS08
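The 1.48 Mpkts/s figure coincides with the theoretical Gigabit Ethernet packet rate for minimum-size frames. The slide does not state the frame size, so the 64-byte frame below is an assumption; the 20 bytes of overhead are the standard preamble (8 bytes) plus inter-frame gap (12 bytes).

```python
def max_packet_rate(link_bps, frame_bytes, overhead_bytes=20):
    """Upper bound on packets per second for back-to-back Ethernet frames."""
    return link_bps / ((frame_bytes + overhead_bytes) * 8)

# Gigabit Ethernet with 64-byte frames (assumed frame size).
gige_min_frame_rate = max_packet_rate(1e9, 64)
```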
Attack evolution
[Figure: attack sophistication versus intruder knowledge, 1990 to 2003. Labels on the chart:]
• email propagation of malicious code
• "stealth"/advanced scanning techniques
• DDoS attacks
• increase in worms
• widespread attacks using NNTP to distribute attack
• sophisticated command & control
• widespread attacks on DNS infrastructure
• executable code attacks (against browsers)
• antiforensic techniques
• automated widespread attacks
• GUI intruder tools
• hijacking sessions
• home users targeted
• distributed attack tools
• increase in wide-scale Trojan horse distribution
• Internet social engineering attacks
• automated probes/scans
• widespread denial-of-service attacks
• techniques to analyze code for vulnerabilities without source code
• Windows-based remote controllable Trojans (Back Orifice)
• packet spoofing
© 2005 Carnegie Mellon University (Lawrence R. Rogers, Author)
Firewall Basics: Stateful versus Deep Inspection
• Stateful Packet Inspection looks only at headers
– Equivalent to Post Office examining To/From, and the package type (envelope, tube, box…)
– Good for preventing unauthorized users and service types
• Deep Packet Inspection inspects ALL content
– Equivalent to Post Office examining entire contents and making a forwarding decision based on what it finds
– Required for Anti‐virus, Intrusion Prevention, Spyware, Anti‐Spam, Web and Email Content Filtering
[Figure: Ethernet frame. Header layers: Ethernet, Internet Protocol (IP), Transmission Control Protocol (TCP). Application layer: Email (SMTP, POP3, IMAP), Web (HTTP/S), File Xfer (FTP, Gopher), Newsgroups, Host Sessions, Directory Services… Extents marked: Stateful Packet Inspection, Deep Packet Inspection]
Regular Expressions
• Flexible way to describe patterns in IDS/IPSs
– Example: for detecting Yahoo Messenger traffic
^(ymsg|ypns|yhoo).?.?.?.?.?.?.?[lwt].*\xc0\x80
– More expressive than strings:
• Strings + special symbols: . * + [ ] |
– Pretty simple to understand and write a RE
• Used in many payload scanning applications
– L7‐filter: protocol identifiers
– Bro: intrusion patterns
– SNORT: intrusion patterns
– CISCO devices: intrusion patterns
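As a quick illustration of payload scanning with such a signature, the sketch below applies the Yahoo Messenger pattern to raw bytes with Python's re module. It assumes the seven optional wildcards are immediately followed by the [lwt] class; the sample payloads are made up for the example and are not real protocol captures.

```python
import re

# Byte-oriented pattern; DOTALL lets "." match any byte, including newlines.
YAHOO = re.compile(rb"^(ymsg|ypns|yhoo).?.?.?.?.?.?.?[lwt].*\xc0\x80", re.DOTALL)

def looks_like_yahoo(payload: bytes) -> bool:
    """Return True when the payload matches the signature."""
    return YAHOO.search(payload) is not None

# Hypothetical payloads, for illustration only.
sample_hit = b"ymsg" + bytes(7) + b"w" + b"user" + b"\xc0\x80"
sample_miss = b"GET / HTTP/1.1\r\n"
```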
.*FAs new directions
• Largely investigated field
– .*FA size is not an issue anymore
– Solutions for higher speed are investigated now
– More powerful techniques than “dumb” .*FAs are required to provide more functionalities
Regexes and [ND]FA
• Two ways to perform regex‐matching:
– Non‐deterministic Finite Automata (NFAs)
• We must keep track of multiple active states/transitions:
– High memory bandwidth, low memory size
– Deterministic Finite Automata (DFAs)
• A single active state => large number of states as a result of all the possible real combinations of NFA states
– Single memory access per character, high memory requirements
Non‐deterministic Finite Automata
• NFA:
– Many active states at the same time
– Very compact structure
– Mostly used in HW
[Figure: NFA for the regexes abc and a.*x, stepped over an input until the signature a.*x is matched]
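A hand-built NFA for the two regexes makes the "many active states" point concrete. The state numbering and the unanchored start state (a self-loop on any character) are assumptions of this sketch, since the slide's figure did not survive extraction.

```python
ANY = None  # wildcard label

# state -> list of (label, next_state); state 0 is the unanchored start state
NFA = {
    0: [(ANY, 0), ("a", 1), ("a", 4)],
    1: [("b", 2)],
    2: [("c", 3)],
    3: [],                      # accepting state for abc
    4: [(ANY, 4), ("x", 5)],
    5: [],                      # accepting state for a.*x
}
ACCEPT = {3: "abc", 5: "a.*x"}

def nfa_scan(text):
    """Return (matched signatures, largest number of simultaneously active states)."""
    active = {0}
    matched = set()
    max_active = 1
    for ch in text:
        nxt = set()
        for state in active:
            for label, target in NFA[state]:
                if label is ANY or label == ch:
                    nxt.add(target)
        active = nxt
        max_active = max(max_active, len(active))
        matched |= {ACCEPT[s] for s in active if s in ACCEPT}
    return matched, max_active
```

Every input character requires visiting all active states, which is why the memory bandwidth is high even though the structure itself is small.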
Deterministic Finite Automata
• DFA:
– Single active state
– Potentially very large memory requirements
– Preferred in SW
[Figure: DFA for the regexes abc and a.*x, stepped over an input until the signature a.*x is matched]
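The DFA for the same two regexes can be obtained from an NFA by the standard subset construction. The sketch below assumes a hand-built NFA with an unanchored start state and folds every character other than a, b, c, x into a single "?" symbol to keep the table small; both choices are assumptions of the example, not taken from the slide.

```python
ANY = None
NFA = {
    0: [(ANY, 0), ("a", 1), ("a", 4)],
    1: [("b", 2)],
    2: [("c", 3)],
    3: [],
    4: [(ANY, 4), ("x", 5)],
    5: [],
}
ACCEPT = {3: "abc", 5: "a.*x"}
ALPHABET = "abcx?"   # "?" stands for every other character

def nfa_step(states, ch):
    """Set of NFA states reachable from `states` on character `ch`."""
    return frozenset(t for s in states for label, t in NFA[s]
                     if label is ANY or label == ch)

def build_dfa():
    """Subset construction: each DFA state is a set of NFA states."""
    start = frozenset({0})
    table, todo = {}, [start]
    while todo:
        cur = todo.pop()
        if cur not in table:
            table[cur] = {ch: nfa_step(cur, ch) for ch in ALPHABET}
            todo.extend(table[cur].values())
    return start, table

def dfa_scan(start, table, text):
    """One table lookup per character; returns the matched signatures."""
    cur, matched = start, set()
    for ch in text:
        cur = table[cur][ch if ch in "abcx" else "?"]
        matched |= {ACCEPT[s] for s in cur if s in ACCEPT}
    return matched

start, table = build_dfa()
```

For these two signatures the construction stays small, but each DFA state encodes a combination of NFA states, which is where larger rule sets can grow quickly.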
[ND]FA Summary
• NFAs are mostly used in hardware implementations (FPGA) where we can easily perform multiple accesses to different pieces of memory in parallel
• DFAs are preferred in software implementations, but we still have a couple of issues:
– Size:
• A small number of regexes (when combined) can lead to an exponential number of states in the corresponding DFA: STATE BLOWUP
• A naïve encoding is largely redundant (256 transitions/state, 32 bit/transition)
– Speed:
• 1 access per byte can prevent the performance from reaching very high Gbps figures
δFA
• Target: reduce transitions
• Ficara et al., CCR SIGCOMM, Oct. 2008
• Simple idea from "signal processing" world:
– In "child" nodes, store only the transitions which are different wrt the parent node.
• Char‐State compression: encode each transition with a variable number of bits
• Large memory reduction: >90%
• Encoding (128 bits wide reads):
– Slightly more than 1 mem. access per byte
– 1 cached access to CS compression table
• actually required only on a limited percentage of transitions
δFA lookup example
[Figure: δFA for the regexes a+, b+c and c*d+, stepped over the input a b c]
States traversed: 2 3 5 => 3 (DFA: 3, D2FA: 4)
Total N_transitions: 8 (DFA: 20, D2FA: 9)
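The idea of storing only the transitions that differ from the parent can be sketched as follows. The 5-state transition table over {a, b, c, d} is an assumption chosen to be consistent with the numbers quoted on the slide (20 DFA transitions, 8 δFA transitions, states 2 3 5 on input a b c); the original figure is not available, and the code is an illustration rather than the authors' implementation.

```python
# Assumed DFA: state -> {char -> next state}; state 1 is the start state.
DFA = {
    1: {"a": 2, "b": 3, "c": 1, "d": 4},
    2: {"a": 2, "b": 3, "c": 1, "d": 4},
    3: {"a": 2, "b": 3, "c": 5, "d": 4},
    4: {"a": 2, "b": 3, "c": 1, "d": 4},
    5: {"a": 2, "b": 3, "c": 1, "d": 4},
}

def build_delta_fa(dfa, start):
    """Keep, for each state, only transitions that differ from some parent's."""
    stored = {state: {} for state in dfa}
    stored[start] = dict(dfa[start])        # the start state keeps everything
    for parent, row in dfa.items():
        for child in set(row.values()):
            for ch, nxt in dfa[child].items():
                if row[ch] != nxt:
                    stored[child][ch] = nxt
    return stored

def delta_fa_run(stored, start, text):
    """Walk the δFA, patching a local transition set with each state's deltas."""
    local = dict(stored[start])
    state, path = start, [start]
    for ch in text:
        state = local[ch]
        local.update(stored[state])         # overwrite only what changed
        path.append(state)
    return path

def dfa_run(dfa, start, text):
    state, path = start, [start]
    for ch in text:
        state = dfa[state][ch]
        path.append(state)
    return path

stored = build_delta_fa(DFA, 1)
total_stored = sum(len(row) for row in stored.values())
```

The lookup still follows exactly one state per character; the saving comes from the smaller number of stored transitions.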
DFA vs. D2FA vs. δFA at a glance
DFA: 5 states, 20 transitions, 1 traversal/char
D2FA: 5 states, 9 transitions, 1 to 2 traversals/char
δFA: 5 states, 8 transitions, 1 traversal/char
Bloom Filters for pattern matching
Because of their limited memory requirements, BFs are used in:
• Approximate Cache Reconciliation
• Deep Packet Inspection Applications
Pattern Matching is a heavy task
Divided in 2 phases:
• Randomized (fast)
• Exact (slow)
BF/CBF introduction
• Probabilistic structure:
– Trade memory for certainty
• Very compact structure
• Does not allow deletions
• False positives f = 2^-k when m = nk/ln 2
• CBFs allow deletions:
• Bits are expanded to counters
[Figure: Insert x with h1(x)=1, h2(x)=6, h3(x)=0 turns the bit array 0 0 0 1 0 0 1 1 0 into 1 1 0 1 0 0 1 1 0. Lookup y with h1(y)=3, h2(y)=0, h3(y)=5]
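A counting Bloom filter fits in a few lines. The sketch below is illustrative only: deriving the k hash functions from salted SHA-256 digests is an assumption made for the example, and the helper false_positive_rate uses the usual approximation (1 - e^(-kn/m))^k, which reduces to 2^-k when m = nk/ln 2.

```python
import hashlib
import math

class CountingBloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.counters = [0] * m          # a plain BF would use single bits

    def _positions(self, item):
        # k positions from salted SHA-256 digests (illustrative choice)
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def delete(self, item):
        # Only valid for items that were actually inserted.
        for pos in self._positions(item):
            if self.counters[pos] > 0:
                self.counters[pos] -= 1

    def __contains__(self, item):
        return all(self.counters[pos] > 0 for pos in self._positions(item))

def false_positive_rate(n, m, k):
    """Approximate false positive probability for n items, m cells, k hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k
```

A positive answer only means "possibly present", which is why the randomized phase is followed by an exact one.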
ML‐CCBF (1)
• Fact:
– P(Φ) of counter Φ in a CBF is Poiss(n*k/(m‐1)) and
– P(Φ>j) < P(Φ=j‐1)
• Idea:
– Huffman encoding (optimal for independent symbols)
– 0 => 0
– 1 => 10
– 2 => 110
– 3 => 1110
– …
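The code table above maps a counter value c to c ones followed by a zero. A minimal sketch of the encoding and decoding, using bit strings for readability (a real implementation would pack bits), is:

```python
def encode_counters(counters):
    """Encode each counter c as c ones followed by a terminating zero."""
    return "".join("1" * c + "0" for c in counters)

def decode_counters(bits):
    """Inverse of encode_counters: count the ones before each zero."""
    counters, run = [], 0
    for bit in bits:
        if bit == "1":
            run += 1
        else:
            counters.append(run)
            run = 0
    return counters

# Usage with a small, made-up counter array.
encoded = encode_counters([1, 2, 0, 3, 1])
```

Because most counters in a CBF are 0 or 1, most codewords are one or two bits long.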
ML‐CCBF (2)
• Fact (1): CPUs and NPs provide the "popcount" instruction
• Fact (2): lookup is much more frequent than insertions/deletions
• First attempt:
CBF: 1 2 0 3 1
Huffman‐CBF: 1 0 1 1 0 0 1 1 1 0 1 0
• Idea: Multilayer structure
[Figure: multilayer structure built from the Huffman‐CBF above; the first layer, 1 1 0 1 1, is labelled "BF!"]
Lookup example: h(x) = 3
popcount(110) = 2
popcount(10) = 1
popcount(0) = 0
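One reading of the multilayer structure, consistent with the lookup example, is that layer 1 holds one bit per counter (is it non-zero?), layer 2 holds one bit per set bit of layer 1 (is the counter above 1?), and so on, with popcount giving the index into the next layer. The sketch below follows that reading; it is an interpretation of the slide, uses Python lists instead of packed bitmaps, and replaces the popcount instruction with sum().

```python
def build_layers(counters):
    """Split counters into layers: layer j says whether each surviving counter exceeds j."""
    layers = []
    alive = list(counters)
    threshold = 0
    while alive:
        layers.append([1 if c > threshold else 0 for c in alive])
        alive = [c for c in alive if c > threshold]
        threshold += 1
    return layers

def lookup(layers, index):
    """Recover the counter at `index` by descending through the layers."""
    value = 0
    for layer in layers:
        if layer[index] == 0:
            return value
        value += 1
        index = sum(layer[:index])   # popcount of the bits before `index`
    return value

layers = build_layers([1, 2, 0, 3, 1])
```

A membership query only needs the first layer, which behaves like a plain Bloom filter; deeper layers are touched only when the counter value itself is needed.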
What is an IDS?
IDS Taxonomy
Misuse vs Anomaly Based
System 1: Markovian Models
Homogeneous First Order MC
Non Homogeneous MC
High order MC
Mixture Transition Distribution
State Space Reduction
Parameters Estimation
MC ‐ Detection Phase
Experimental Results
Spin‐off
• Netresults
www.netresults.it
• CUBIT
www.cubitlab.com