PFQ@ 10th Italian Networking Workshop (Bormio)
![Page 1: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/1.jpg)
Running Monitoring Applications on Accelerated Capture Engines
Nicola Bonelli
N. Bonelli, R.G. Garroppo, L. Gazzarrini, S. Giordano, G. Procissi, F. Russo, G. Volpi
![Page 2: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/2.jpg)
Agenda
• Capture engines overview
• What's new in PFQ (2.0)
• Accelerated pcap library
  – PF_RING, PF_RING+DNA, NETMAP, PFQ
• Pcap-perf: a tool for benchmarking pcap apps
• Experimental results
![Page 3: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/3.jpg)
Speed matters…
![Page 4: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/4.jpg)
Accelerated Capture Engine
• Linux is provided with a default capture engine: the PF_PACKET socket
• Because of speed, other capture engines emerged:
  – 2004: PF_RING
    • designed for single core, better performance than the PF_PACKET of the time
  – 2011: PFQ
    • first to address multi-core architectures and multi-queue NICs (Best Paper Award @PAM2012)
  – 2012: PF_RING-DNA
    • accelerated drivers (Intel)
  – 2012: NetMap
    • accelerated drivers (Intel, Broadcom) (Best Paper Award @Usenix ATC'12)
![Page 5: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/5.jpg)
… but what happens on these tracks?
![Page 6: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/6.jpg)
What's new in PFQ 2.0
• From capture engine to monitoring framework…
• Improved performance
  – ~14.8 Mpps with a single user-space thread
• Improved features:
  – compliant with a plethora of NICs: pfq-omatic
  – monitoring groups and classes
  – in-kernel extensible engine for packet steering: dispatching, copying, cloning, filtering
  – native bindings: C, C++11, Haskell (more to come)
  – per-group filtering: BPF, vlan (un-tagging)
  – pcap library
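The ~14.8 Mpps figure can be read against the theoretical packet rate of a 10 Gbit/s Ethernet link carrying minimum-size frames. The sketch below works out that bound; it assumes standard Ethernet framing (20 bytes of preamble, start-of-frame delimiter and inter-frame gap per frame), and the names `kWireOverheadBytes` and `max_packet_rate` are hypothetical, not part of PFQ.

```cpp
#include <cassert>
#include <cstdint>

// Bytes that occupy the wire for every frame but are not part of the frame:
// 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
constexpr std::uint64_t kWireOverheadBytes = 7 + 1 + 12;

// Upper bound on packets per second for a link of `link_bps` bits per second
// carrying frames of `frame_bytes` bytes (frame length including the FCS).
inline double max_packet_rate(std::uint64_t link_bps, std::uint64_t frame_bytes)
{
    assert(frame_bytes > 0);
    const double bits_per_frame =
        8.0 * static_cast<double>(frame_bytes + kWireOverheadBytes);
    return static_cast<double>(link_bps) / bits_per_frame;
}
```

With these assumptions, `max_packet_rate(10000000000ULL, 64)` evaluates to roughly 14.88 million packets per second, and 128-byte frames give roughly 8.45 million.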
![Page 7: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/7.jpg)
Feature comparison

| | PF_PACKET | PF_RING 5.x | PF_RING-DNA | NETMAP - 0813 | PFQ 2.0 |
|---|---|---|---|---|---|
| NIC | * | *, PF-AWARE (Intel, Broadcom) | only Intel 1/10G | Intel 1/10G, forcedeth | * accelerated |
| Driver compat. | * | yes, non accel. | no | no | yes, dynamic |
| multi-core | - | Hardware (RSS) | Hardware (RSS) | Hardware (RSS) | Hw RSS + soft |
| multi-queue | yes (poor) | yes | yes | yes | yes |
| native binding | C | C | C | C | C, C++11, Haskell, Java, Python |
| groups | - | - | - | - | yes |
| class | - | - | - | - | yes |
| concurrent mon. | yes | yes | commercial ? | - | yes |
| clustering | - | yes | - | - | yes (MT, group) |
| steering | - | - | commercial | - | yes (MT, group) |
| STM state | - | - | - | - | work in progress |
![Page 8: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/8.jpg)
Feature comparison

| | PF_PACKET | PF_RING 5.x | PF_RING-DNA | NETMAP - 0813 | PFQ 2.0 |
|---|---|---|---|---|---|
| Pcap library | yes | yes | yes | buggy/incomplete | yes |
| BPF (filters) | yes (MT) | yes (MT) | yes (user-space) | - | yes (MT, group) |
| vlan filters | - | yes | yes (hw Intel) | - | yes (MT, group) |
| vlan untagging | - | - | - | - | yes (MT, soft.) |
| Intel hw filters | - | yes | yes | - | No |
| bloom filters | - | - | - | - | work in progress |
![Page 9: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/9.jpg)
Accelerated PCAP library
• The pcap library is the de facto standard interface for packet capture
• Accelerated capture engines provide their own pcap library:
  – Both PF_RING and PF_RING-DNA provide a complete accelerated version
  – NetMap provides experimental and incomplete pcap support
    • BPF is missing
• PFQ provides a complete implementation
  – PFQ C-API mapped over the pcap interface wherever possible, implemented as environment variables otherwise
  – Clustering is enabled by specifying multiple NICs in colon-separated fashion, steering by means of the PFQ_STEER variable

PFQ_GROUP=10 PFQ_STEER=ipv4-addr tcpdump -n -i eth2:eth3
PFQ_GROUP=10 PFQ_STEER=ipv4-addr tcpdump -n -i eth2:eth3
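To illustrate what steering by IPv4 address could mean when several sockets share a group, here is a minimal sketch of one plausible dispatch rule. It is not PFQ's in-kernel steering function: the name `steer_ipv4_addr`, the mixing constants, and the choice of a direction-independent hash are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not PFQ code): pick which of `n_sockets` sockets in a
// group receives a packet, looking only at its IPv4 source and destination.
inline std::size_t steer_ipv4_addr(std::uint32_t src_ip,
                                   std::uint32_t dst_ip,
                                   std::size_t n_sockets)
{
    assert(n_sockets > 0);
    // XOR is commutative, so A->B and B->A traffic lands on the same socket.
    std::uint32_t h = src_ip ^ dst_ip;
    // Cheap integer mixing so that nearby addresses spread across sockets.
    h ^= h >> 16;
    h *= 0x45d9f3bU;
    h ^= h >> 16;
    return static_cast<std::size_t>(h % n_sockets);
}
```

Under this assumed rule, every packet exchanged between the same pair of hosts is delivered to the same socket, which is what a per-host monitoring application would typically need.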
![Page 10: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/10.jpg)
Pcap-perf
• Pcap-perf is a C++11 application designed for benchmarking capture engines through pcap interfaces
• Support for multiple threads, BPF filters and plug-ins:

| plug-in | kind |
|---|---|
| Null | packet counter |
| IP checksum | light CPU computation |
| MD5 | CPU computation |
| SHA256 | heavy CPU computation |
| Bloom Filter | memory (linear) |
| Protocol Classification | memory tree |
| TCP/UDP flow counter | memory (std::unordered_set) |
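As an illustration of the memory-bound kind of plug-in, the following is a minimal sketch of a TCP/UDP flow counter built on `std::unordered_set`. It is not the pcap-perf source: `FlowKey`, `FlowKeyHash` and `FlowCounter` are hypothetical names, and the sketch assumes the 5-tuple has already been parsed out of the packet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>

// Hypothetical 5-tuple identifying a flow (not taken from pcap-perf).
struct FlowKey {
    std::uint32_t src_ip;
    std::uint32_t dst_ip;
    std::uint16_t src_port;
    std::uint16_t dst_port;
    std::uint8_t  proto;    // 6 = TCP, 17 = UDP

    bool operator==(const FlowKey& o) const {
        return src_ip == o.src_ip && dst_ip == o.dst_ip &&
               src_port == o.src_port && dst_port == o.dst_port &&
               proto == o.proto;
    }
};

// Combine the two halves of the key into one hash value.
struct FlowKeyHash {
    std::size_t operator()(const FlowKey& k) const noexcept {
        const std::uint64_t addrs =
            (static_cast<std::uint64_t>(k.src_ip) << 32) | k.dst_ip;
        const std::uint64_t rest =
            (static_cast<std::uint64_t>(k.src_port) << 24) |
            (static_cast<std::uint64_t>(k.dst_port) << 8) | k.proto;
        const std::size_t h1 = std::hash<std::uint64_t>{}(addrs);
        const std::size_t h2 = std::hash<std::uint64_t>{}(rest);
        return h1 ^ (h2 + 0x9e3779b9U + (h1 << 6) + (h1 >> 2));
    }
};

class FlowCounter {
public:
    // Record one packet; returns true if it opened a flow not seen before.
    // Precondition: the caller passes only TCP or UDP packets.
    bool on_packet(const FlowKey& k) {
        assert(k.proto == 6 || k.proto == 17);
        return flows_.insert(k).second;
    }
    // Number of distinct flows seen so far.
    std::size_t flows() const { return flows_.size(); }

private:
    std::unordered_set<FlowKey, FlowKeyHash> flows_;
};
```

Each packet costs one hash-table lookup, so with randomized addresses the working set grows quickly and the plug-in stresses memory rather than the CPU.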
![Page 11: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/11.jpg)
Test-bed and measurements
• 6-core Intel Xeon X5650 @ 2.67 GHz, 16 GB RAM + Intel 82599 10G (Debian Wheezy)
• Accelerated drivers
  – PF_RING: ixgbe 3.11.33 PF_RING-aware
  – PF_RING-DNA: ixgbe 3.10.16-DNA driver
  – Netmap: ixgbe driver shipped with the netmap package
  – PFQ: Intel ixgbe 3.11.33 vanilla, recompiled through pfq-omatic
• Best interrupt affinity (MSI-X)
  – 4 or 5 kernel threads (NAPI) bound to fixed cores (RSS), 1 or 2 user-space threads bound to other core(s)
• Traffic is generated with randomized IP addresses, 64/128-byte UDP packets
  – using both PF_DIRECT and PF_RING-DNA

[Figure: hosts "mascara" and "monsters" connected by a 10 Gb link]
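Binding a user-space thread to a fixed core, as in the test-bed setup, can be done on Linux with the scheduler affinity calls. The sketch below covers only the user-space side (interrupt affinity for the NIC queues is configured separately); the helper names `first_allowed_cpu` and `pin_current_thread` are hypothetical, and the code assumes Linux with glibc.

```cpp
#include <cassert>
#include <sched.h>   // sched_getaffinity, sched_setaffinity (Linux/glibc)

// Return the lowest-numbered CPU the calling thread may run on, or -1 on error.
inline int first_allowed_cpu()
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

// Restrict the calling thread to a single core; returns true on success.
inline bool pin_current_thread(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // A pid of 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

A capture thread would call `pin_current_thread` once at start-up, choosing a core that is not already serving one of the NIC's receive queues.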
![Page 12: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/12.jpg)
Counting packets is useless
(native speed)

    uint64_t counter = 0;
    for(;;)
    {
        counter++;
    }
![Page 13: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/13.jpg)
1 thread user-space (Intel 10G)
![Page 14: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/14.jpg)
pcap library
![Page 15: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/15.jpg)
Pcap library, 1 thread counter
![Page 16: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/16.jpg)
Pcap, 1 thread counter, BPF=udp
![Page 17: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/17.jpg)
Pcap, 1 thread counter, BPF=http || udp
![Page 18: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/18.jpg)
pcap-perf
![Page 19: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/19.jpg)
pcap-perf
![Page 20: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/20.jpg)
pcap-perf with BPF = udp
![Page 21: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/21.jpg)
pcap-perf (2 threads)
![Page 22: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/22.jpg)
tcpdump
![Page 23: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/23.jpg)
tcpdump -s 64 -i dev -w /ramdisk/dump.pcap ([email protected])
![Page 24: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/24.jpg)
tcpdump -s 138 -i dev -w /ramdisk/dump.pcap (100M@~8Mpps)
![Page 25: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/25.jpg)
tcpdump -i dev -w /ramdisk/dump.pcap vlan (5 Gbps)
![Page 26: PFQ@ 10th Italian Networking Workshop (Bormio)](https://reader033.fdocument.pub/reader033/viewer/2022051400/558e02531a28ab7f6c8b45f4/html5/thumbnails/26.jpg)
tcpdump -i dev -w /ramdisk/dump.pcap ip host 192.168.0.10 (VoIP call)