2003-MMLAB-TR-05
7/31/2019 2003-MMLAB-TR-05
1/134
ATHENS UNIVERSITY OF ECONOMICS AND BUSINESS
DEPARTMENT OF COMPUTER SCIENCE
Peer-To-Peer Wireless Network Confederation
Traffic Logging Subsystem
Name: Pantelis Frangoudis
Student Number: 3990108
Supervisor: Prof. G.C. Polyzos
Supervising Assistant: E.C. Efstathiou
Athens, September 2003
Summary
This document is the report for my diploma thesis project. It deals with the
Peer-To-Peer Wireless Network Confederation (P2PWNC) and, in particular, with an
Administrative Domain's Local Traffic Logging/Accounting and Monitoring System.
These terms are discussed more thoroughly in the following chapters.
As stated in [1], [2], [3] and [4], a Peer-To-Peer Wireless Network Confederation
(P2PWNC) is a community of WLAN Administrative Domains (ADs) that offer
network access to each other's registered users. An AD provides internet services to
P2PWNC users of other ADs to compensate for the services that its own registered
users enjoy from other ADs when they roam. This roaming scheme is decentralized,
leaving ADs to make their own decisions on the amount of resources they contribute to
the P2PWNC system. An AD is composed of several modules, such as the WLAN
control module, the user authentication module, the P2PWNC module, etc.
This document deals with the traffic logging and analysis subsystem, which can be
considered part of the WLAN Control Module of an AD. The information this
subsystem makes available can, however, also be used by other modules.
The traffic logging subsystem is divided into two parts. The first is a
network packet capturing, analysis and logging daemon. It captures packets
that pass through the network interface to which a wireless LAN access point is
connected and analyzes them. The purpose of this analysis is to gather aggregate
traffic statistics and application layer protocol information about the P2PWNC users.
The second part of the system is an XML-based statistics retrieval and exchange
protocol. It is a client-server protocol designed for the retrieval and exchange of the
statistics that the packet logging and analysis daemon generates. Clients do not have
direct access to the database where the statistics are stored. Instead, they issue
properly formed requests (XML documents) to a server, which, in turn, queries the
database and returns the results to the clients in messages of a protocol-specified
format. Apart from the specification of the protocol, the implementation of a typical
server and client is presented and discussed.
As this is the first version of the system, further improvement is expected to be
possible or needed on some topics. Future work must be done on the packet logging
and analysis daemon's performance as well as on its traffic analysis capabilities.
As far as statistics retrieval, exchange and presentation are concerned, security
issues are of greater importance. Also, extending the statistics
exchange protocol and developing servers and clients with more functionality and
operating system independence would enhance statistics retrieval and presentation
capabilities.
Table of Contents
Chapter 1
INTRODUCTION
1.1 About Peer-To-Peer Networking
1.2 Peer-To-Peer Wireless Network Confederation
1.2.1 System Overview and Terminology
1.2.2 Modules and Subsystems
1.2.2.1 WLAN Control Module
1.2.2.2 Authentication and User Identification Module
1.2.2.3 Local AD Services Module
1.2.2.4 Internet Connectivity Module
1.2.2.5 P2PWNC Management Module
1.2.2.6 Local P2PWNC Policy Module
1.2.3 Administrative Domain's Local Traffic Accounting and Monitoring System
Chapter 2
TRAFFIC LOGGING AND ANALYSIS SUBSYSTEM
2.1 System Testbed
2.1.1 Hardware
2.1.2 Operating System
2.1.3 Other software
2.2 Tools Used
2.3 Logging Subsystem General Architecture
2.3.1 Login state
2.3.2 Connected State
2.3.3 Logout state
2.3.4 Communication with the authentication module
2.3.4.1 Interprocess Communication (IPC) in Linux Environments
2.3.4.2 IPC implementation in the Traffic Logging Subsystem
2.3.4.3 Shared Memory Allocation and Handling
2.4 Traffic logging and analysis subsystem implementation
2.4.1 Packet Capturing Systems
2.4.2 Libpcap architecture and principles
2.4.3 Libpcap Advantages
2.4.4 Libpcap Performance Issues
2.4.5 Data Storage
2.4.5.1 Available Information
2.4.5.2 MySQL Database Scheme
2.5 The Packet Capturing and Analysis Daemon
2.5.1 Packet Capturing Daemon Architecture
2.5.1.1 Protocol Header Stripping
2.5.1.2 User Packet Matching
2.5.1.3 Application Layer Protocol Determination
2.5.2 Application Layer Protocol-specific Statistics Extraction
2.5.2.1 Connection Tracking Principles
2.5.2.2 HTTP Tracking
2.5.2.2.1 Typical HTTP Scenario
2.5.2.2.2 HTTP Request Tracking Algorithm
2.5.2.3 FTP Tracking
2.5.2.3.1 Typical FTP Scenario
2.5.2.3.2 FTP Connection Tracking Algorithm
2.5.2.4 SMTP Tracking
2.5.2.4.1 Typical SMTP Scenario
2.5.2.4.2 SMTP Connection Tracking Algorithm
2.5.2.5 POP3 Tracking
2.5.2.5.1 Typical POP3 Scenario
2.5.2.5.2 POP3 Connection Tracking Algorithm
2.6 Demonstration
Chapter 3
AN XML BASED PROTOCOL FOR STATISTICS RETRIEVAL AND EXCHANGE
3.1 Introduction
3.2 System Testbed
3.2.1 Hardware
3.2.2 Operating System
3.2.3 Other software
3.3 Tools Used
3.4 Protocol Description
3.4.1 Introduction
3.4.2 ABNF Specification
3.4.3 Protocol semantics
3.4.3.1 REQUEST
3.4.3.1.1 Request version and header information
3.4.3.1.2 Request fields
3.4.3.1.3 Request criteria
3.4.3.2 RESPONSE
3.4.3.2.1 Response version and header information
3.4.3.2.2 Response pairs
3.4.4 Client functions
3.4.4.1 Login Function
3.4.4.2 Logout Function
3.4.4.3 Password changing function
3.4.4.4 Statistics retrieval function
3.4.5 Server functions
3.4.5.1 Server sequence of actions
3.4.5.2 Login Function
3.4.5.3 Logout Function
3.4.5.4 Password changing function
3.4.5.5 Statistics retrieval function
3.4.6 Security Issues
3.4.6.1 Protocol Specifications
3.4.6.2 Suggestions about security
3.4.7 Protocol usage scenario
3.4.8 Advantages and disadvantages of this approach
3.5 Client and Server Implementation
3.5.1 Introduction
3.5.2 XML parsing
3.5.3 Server implementation
3.5.3.1 Architecture of the server program
3.5.3.2 Login and Logout Functions
3.5.3.3 Password changing function
3.5.3.4 Statistics retrieval function
3.5.4 Client Implementation
3.5.4.1 Login Function
3.5.4.2 Logout Function
3.5.4.3 Password changing function
3.5.4.4 Statistics retrieval function
3.6 Demonstration
Chapter 4
FUTURE WORK
4.1 Traffic Logging and Analysis Subsystem
4.1.1 Performance issues
4.1.2 Traffic Analysis
4.2 Statistics Exchange Protocol
4.2.1 Security issues
4.2.2 Protocol extension
4.2.3 Presentation and monitoring issues
Conclusions
APPENDIX
Installation of the programs
Setting up the database
Configuration
Execution and usage
REFERENCES
LINKS
Figures
Figure 1. P2PWNC Architecture
Figure 2. Administrative Domain's Linux Box
Figure 3. Traffic Logging Subsystem High Level View
Figure 4. Login State
Figure 5. Logout State
Figure 6. Subsystems IPC communication using Shared Memory
Figure 7. shmAppendUser(): determining the new node's position
Figure 8. removeFromList(): shifting nodes back
Figure 9. After removeFromList()
Figure 10. Libpcap architecture
Figure 11. Packet Capturing Daemon Architecture
Figure 12. Ethernet Packet Format
Figure 13. FTP command reply sequence
Figure 14. POP3 command reply sequence
Figure 15. FTP stored statistics
Figure 16. SMTP stored statistics
Figure 17. Statistics Client and Server Architecture and Interconnection with the Traffic Logging Subsystem
Figure 18. XML Parser Benchmark Test
Figure 19. Login Form
Figure 20. General Statistics
Figure 21. Aggregate traffic statistics
Figure 22. HTTP statistics
Figure 23. FTP statistics
Figure 24. SMTP statistics
Figure 25. POP3 statistics
Figure 26. Password changing form
Figure 27. Connected administrator information
Figure 28. About box
Chapter 1
INTRODUCTION
1.1 About Peer-To-Peer Networking
A Peer-To-Peer (P2P) network is a network composed of autonomous and equivalent
entities. This is a fairly old concept in the area of Computer Networks and
Communications. A pure P2P system is decentralized: there is no central
entity coordinating communication and interaction between peers. Today, there are
numerous examples of the Peer-To-Peer network model. Most of them, like Kazaa
(www.kazaa.com) and Gnutella, are file sharing systems.
1.2 Peer-To-Peer Wireless Network Confederation
1.2.1 System Overview and Terminology
A Peer-To-Peer Wireless Network Confederation (P2PWNC) is a community of
WLAN Administrative Domains (ADs) that offer network access to each other's
registered users. It is a decentralized roaming scheme. This
means that there is no central coordinating entity, nor any bilateral contracts to control
the parties' behavior. The peers of this network are the different Administrative
Domains, which are autonomous as to the amount of resources they contribute and
their participation level in the confederation. The main goal of the P2PWNC system
is to provide ubiquitous network access to its members. For an AD, the ability of its
registered users to roam and enjoy free network access outweighs the cost (in
resources, performance, etc.) of providing access to visitors (users registered with
other ADs of the P2PWNC).
As mentioned before, the main goal of this system is ubiquitous, cheap, fast and
secure access to network resources and particularly internet services. Wireless LANs
provide the best way to achieve this. Wireless infrastructure is easier to deploy, so a
high-coverage network that satisfies the above demands can be created quickly and
cost-effectively. A P2PWNC, whose components are WLAN Administrative
Domains, incorporates the IEEE 802.11 technology. This is a standard growing
steadily in popularity. It offers easy access to the network as well as security (although
its security is a topic of ongoing discussion). Furthermore, it is relatively cheap and
fast to deploy.
Being a pure P2P system, the P2PWNC lacks the notion of a central controlling entity
which can enforce a uniform policy among peers as to the usage and availability of
resources and participation. As in other peer-to-peer systems like
Gnutella, the problem of free riding is encountered. That is, an AD's members can
consume many of other ADs' resources, while this AD offers few resources to other
ADs' members. The equivalent case in a file sharing P2P system like Kazaa
would be the following: a Kazaa user downloads many files from other users,
while he shares very few of his own files. The solution to this problem is to offer
incentives for peers to contribute to the system by imposing community-wide rules.
Such rules can control a peer's behavior. Rule breaking would result in some sort of
punishment, while rule compliance would be beneficial for both the peer (who would
be rewarded) and the whole community (it would function more regularly and
the system would be used more efficiently). Rule enforcement is based on a distributed
accounting model which is briefly described below.
Although peers (ADs) are autonomous as to the usage and availability of their
resources, the system imposes distributed constraint structures so that peers have an
incentive to conform to the community rules. Every time a peer receives or offers
service, messages are exchanged and distributed accounting records are updated. The
messages exchanged include signed receipts that prove the provision of the service.
Therefore, forging the global accounting statistics of the system becomes harder.
A peer can easily deduce the rate of consumption of any other peer by inspecting
and aggregating these receipts. Although forging
the statistics is still possible, distributed accounting makes it possible to gather an
aggregated opinion about a peer by querying other peers about the services they
offered to it or received from it, and thus to assess its reputation.
The combination of the peer-to-peer nature of the P2PWNC and the set of community-wide
rules described above offers the system the following advantages over other solutions:
- Scalability
- Decentralization
- Flexibility and low complexity
- Economic efficiency
The architecture of a P2PWNC is shown in Figure 1.
[Diagram: three Administrative Domains (AD Black, AD Grey, AD White), each shown with its AP network and its members]
Figure 1. P2PWNC Architecture
1.2.2 Modules and Subsystems
An Administrative Domain consists of several modules, each with its own
functionality. These modules are:
1.2.2.1 WLAN Control Module
The WLAN control module manages the Access Point (AP) network and shapes
traffic coming from, or destined to, APs (and ultimately User Agents, UAs). It
consists of the bandwidth control, the traffic logging and other local subsystems.
The bandwidth control subsystem is responsible for allocating portions of the
available bandwidth to the users visiting the AD and requesting services (that is,
users of this AD, as well as visitors from other ADs of the P2PWNC). The traffic
logging and monitoring subsystem is described in detail in this document.
1.2.2.2 Authentication and User Identification Module
This module checks UA credentials and then decides what services the UA is
authorized to access. The decision is enforced by the WLAN control module. Users
registered to the ADs of the P2PWNC have a username that is unique within their
domain. Usernames have the format "user@domain". Every time a user wishes to
use the internet services or other facilities made available by an Administrative Domain, an
authentication/identification process takes place. If the user has the necessary
credentials (certificates or a username-password pair) to be served, he is
assigned a dynamic IP address. After the authentication / login process has been
successfully carried out, the user is identified using the assigned IP address. While
he is logged in, each packet traversing the router with this IP address as its source /
destination is considered to be sent / received by this user. After he has logged
out, the IP address that was used for identifying him is released (and
can therefore be assigned to another user) and is of no further significance for user
accounting purposes.
1.2.2.3 Local AD Services Module
This module offers other local services (PSTN VoIP gateway, web cache, etc.).
1.2.2.4 Internet Connectivity Module
The Internet Connectivity module, as its name implies, is the Administrative
Domain's gateway to the Internet.
1.2.2.5 P2PWNC Management Module
This module implements all the high-level Peer-To-Peer functionality of the system
(generic service provision, rules enforcement, etc.).
1.2.2.6 Local P2PWNC Policy Module
This module encapsulates the strategy of an AD as a participant in a P2PWNC (the amount of
resources offered to visitors, the request rate allowed for its own members, etc.).
The above modules can be divided into those that would exist in any typical
WLAN AD, even if it wasn't participating in a P2PWNC (User Authentication,
WLAN Control, Internet Connectivity, Local Services) and those that have to do with
communication between peers and are a distinctive characteristic of a Peer-To-Peer
Wireless Network Confederation member (P2PWNC Management Module, Local
P2PWNC Policy Module).
1.2.3 Administrative Domain's Local Traffic Accounting and Monitoring System
This is the AD subsystem that this document deals with in the following
chapters. It is composed of two parts. The first is the traffic logging and
analysis and local user accounting system. The second is an XML-based protocol
for the retrieval and exchange of the statistics generated by the logging/accounting
subsystem.
Chapter 2
TRAFFIC LOGGING AND ANALYSIS SUBSYSTEM
The traffic logging subsystem is in fact a packet capturing and analysis daemon. This
daemon is responsible for capturing packets from a defined network device that
belongs to the router and analyzing them so that it can gather aggregate traffic
statistics as well as application-specific information about the P2PWNC users. The
information is grouped by application protocol. For example, for the HTTP protocol
the statistics available include the HTTP requests users have made and in particular
the host name, the request method (GET, POST, etc.), the user agent (e.g. Mozilla,
Microsoft IE, etc.) and the request URI. In a similar way, the traffic logging daemon
can track information about other application level protocols (FTP, SMTP, POP3).
Data is stored in a MySQL database.
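The HTTP fields listed above can be illustrated with a minimal sketch of pulling them out of a request payload. This is not the tracking algorithm of the thesis (that is described in section 2.5.2.2); the names http_info, header_value and parse_http_request are hypothetical, and the sketch assumes the payload has already been copied into a NUL-terminated buffer holding one complete request.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical record for the per-request fields named in the text. */
struct http_info {
    char method[16];
    char uri[256];
    char host[128];
    char user_agent[128];
};

/* Copy the value of header `name` (e.g. "Host:") into out; 1 if found. */
static int header_value(const char *payload, const char *name,
                        char *out, size_t outlen)
{
    size_t nlen = strlen(name);
    const char *line = payload;

    assert(outlen > 0);
    while ((line = strstr(line, "\r\n")) != NULL) {
        line += 2;                       /* start of the next line */
        if (line[0] == '\r' || line[0] == '\0')
            break;                       /* blank line: end of headers */
        if (strncasecmp(line, name, nlen) == 0) {
            const char *v = line + nlen;
            size_t n;

            while (*v == ' ' || *v == '\t')
                v++;                     /* skip optional whitespace */
            n = strcspn(v, "\r\n");
            if (n >= outlen)
                n = outlen - 1;          /* truncate over-long values */
            memcpy(out, v, n);
            out[n] = '\0';
            return 1;
        }
    }
    return 0;
}

/* Parse a NUL-terminated request; returns 0 on success, -1 otherwise. */
int parse_http_request(const char *payload, struct http_info *info)
{
    memset(info, 0, sizeof *info);
    /* Request line: METHOD SP URI SP HTTP-version */
    if (sscanf(payload, "%15s %255s", info->method, info->uri) != 2)
        return -1;
    header_value(payload, "Host:", info->host, sizeof info->host);
    header_value(payload, "User-Agent:", info->user_agent,
                 sizeof info->user_agent);
    return 0;
}
```

A real capture delivers raw, length-delimited bytes that may split a request across TCP segments, so a production parser would work on lengths rather than on C strings.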
2.1 System Testbed
In order for the traffic logging subsystem to work properly, the following are required.
2.1.1 Hardware
The system was developed and tested on an Intel Pentium III (800 MHz) computer
with 256 Mbytes of RAM. It was also tested on an Intel Celeron (500 MHz) with 64
Mbytes of RAM. These systems included an 802.11b Access Point and a Network
Interface Card for Internet connectivity. The experiments included two Compaq
N610c laptop machines, with 512 Mbytes of RAM and a ZoomAir 4100 802.11b card
in ad hoc mode.
2.1.2 Operating System
The operating system used was RedHat Linux 8.0. However, the packet logging
subsystem was also tried on earlier as well as more recent versions of RedHat (7.3,
9.0) and on Mandrake Linux. The kernel version the system has been tested on is
2.4.18-14. The Compaq laptops were running RedHat 9.0, kernel 2.4.21 (see
LINKS [11]).
2.1.3 Other software
MySQL must be installed for data storage. The system was developed with version
4.0.15, although earlier versions work fine, too. Also, libpcap version 0.7.1 (or later) is
needed by the packet capturing daemon.
2.2 Tools Used
The packet capturing daemon is libpcap-based. Therefore, the libpcap version 0.7.1
packages had to be installed (see LINKS section, [2]).
The MySQL version used was 4.0.15 ([1], LINKS section). Compiling programs
with MySQL support requires that the packages MySQL-devel-4.0.15-0 and
MySQL-client-4.0.15-0 are installed. However, the system works properly with both earlier
and later versions of these packages.
The programs were implemented in the C language and the compiler used was gcc.
The editor used was mainly KWrite.
The debugger used was gdb.
For database viewing, phpMyAdmin was used ([7]). phpMyAdmin is a web
interface to MySQL written in PHP.
2.3 Logging Subsystem General Architecture
The logging subsystem is located on a Linux box that functions as the Administrative
Domain's router. This Linux box is responsible for the AD's control and includes the
authentication module, the traffic shaping module, etc. It has two network interfaces.
One is the 802.11b access point and the other is a network card that serves as the
AD's gateway to the Internet. Users within range of the wireless access point
receive / send packets through the access point. These packets have arrived from / are
routed to the Internet gateway network card.
The packet capturing daemon captures every single packet that the watched network
interface sends or receives. However, not every packet captured is
important for the system. We only care about packets that are sent or received by
users who are registered to the P2PWNC and are online at the moment. This means that
there has to be a mechanism for distinguishing which packets are important for accounting
purposes and should be further analyzed and which should be ignored by the system.
This is shown in Figure 3, which describes at a high level what
happens when a user sends a network packet.
[Diagram: wireless users connect through the access point (A.P.) to the Linux box (routing, accounting, traffic shaping, etc.), which connects to the Internet gateway]
Figure 2. Administrative Domain's Linux Box
The above example involved a user registered to an AD of the P2PWNC who was
causing network traffic through the AD. The system figured out that a packet it
captured belonged to him and the next step was to further analyze the packet (its
header and possibly its payload data) and update the database statistics. For example,
if it was an FTP packet, the system would increment the total FTP upload statistics by
the length of this packet and, if the packet payload carried some extra information
about the FTP connection (e.g. the user's FTP account name or his password), this
information would be recorded in the database. Capturing a packet that is to be
received by the user is an almost identical case. If the daemon captures a packet
which is not found to belong to any online registered user, the packet is ignored and no
further analysis takes place.
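The decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the daemon's code: the user_stats structure and the find_user and account_packet functions are hypothetical, addresses are compared as dotted-quad strings, and the counters live in memory, whereas the thesis stores its statistics in MySQL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory counters for one online user. */
struct user_stats {
    char ipaddr[20];            /* dotted-quad address assigned at login */
    unsigned long bytes_up;     /* bytes sent by the user */
    unsigned long bytes_down;   /* bytes received by the user */
};

/* Return the online user owning `ip`, or NULL if there is none. */
static struct user_stats *find_user(struct user_stats *users, size_t n,
                                    const char *ip)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(users[i].ipaddr, ip) == 0)
            return &users[i];
    return NULL;
}

/*
 * Account one captured packet of `len` bytes.
 * Returns 1 if it matched an online user, 0 if it should be ignored.
 */
int account_packet(struct user_stats *users, size_t n,
                   const char *src_ip, const char *dst_ip,
                   unsigned long len)
{
    struct user_stats *u;
    int matched = 0;

    assert(users != NULL && src_ip != NULL && dst_ip != NULL);
    if ((u = find_user(users, n, src_ip)) != NULL) {
        u->bytes_up += len;     /* packet sent by an online user */
        matched = 1;
    }
    if ((u = find_user(users, n, dst_ip)) != NULL) {
        u->bytes_down += len;   /* packet destined to an online user */
        matched = 1;
    }
    return matched;
}
```

Both addresses are checked because a packet exchanged between two online users counts as upload for one and download for the other.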
There are three discrete user states that can be distinguished in this system: the
"login" state, the "connected" state and the "logout" state.
[Diagram: a packet sent by user@domain (source IP xxx.xxx.xxx.xxx) is captured and its source / destination IP is checked; if the packet belongs to a registered user who is online, it is analyzed and the database statistics are updated, otherwise it is ignored]
Figure 3. Traffic Logging Subsystem High Level View
2.3.1 Login state
The term "login state" refers to the first step of a user who wishes to use
the internet services provided by an AD of the P2PWNC. During the login state,
user authentication / identification takes place. The user who wants to connect makes a
login request, sending his credentials (a username-password pair or a
certificate of some kind). If he meets the requirements to log in, he is assigned a
(dynamic) IP address. After the successful IP assignment, the AD's authentication
module has to make the necessary updates in the database. Namely, it has to update the
newly logged-in user's database record with the assigned IP address, the MAC address
of the user's network interface (802.11 wireless network interface card) and a timestamp.
Also, the authentication module has to notify the packet capturing module of the login
event. For this purpose, there is a dynamic list of the online users, which resides in a
memory segment shared by the two modules (the way the authentication module
keeps the packet capturing daemon informed of the users who are logged in at
any moment is discussed thoroughly in section 2.3.4). From now on, the
user is identified by the (username, IP address) pair and he enters the "connected"
state. What takes place during the login state is shown in Figure 4.
2.3.2 Connected State
The connected state is the state the user enters as soon as he has successfully passed
the login state. At this time, the user can be identified by the (username, IP address)
pair. During the connected state the user can make use of the internet services and
facilities an Administrative Domain offers. User accounting starts at the moment
he logs in and stops when he exits the connected state. During this
state, every packet the user sends / receives is captured and examined. After the
examination and extraction of any useful information, user statistics are updated, as
mentioned before.
2.3.3 Logout state
This is the last state a user can reach. When a user issues a logout request or, more
frequently, when a user moves out of the coverage area of the Administrative
Domain's WLAN, he enters the logout state. What the system has to do in this case
is similar to what it does in the login state. The database record that shows the user's
status (online / offline), his temporary IP and his MAC address, as well as the online
users list, have to be updated. The user status is set to offline and the node referring
to him in the online users list is removed. These actions are shown in Figure 5.
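The three states can be summarized as a small transition function. The sketch below is only an illustration: the thesis names the login, connected and logout states, while the extra ST_OFFLINE state, the event names and the functions next_state and is_accounted are assumptions introduced here.

```c
#include <assert.h>

/* ST_OFFLINE is an assumed resting state; the other three are from the text. */
enum user_state { ST_OFFLINE, ST_LOGIN, ST_CONNECTED, ST_LOGOUT };

/* Hypothetical events that drive the transitions. */
enum user_event {
    EV_LOGIN_REQUEST,    /* user sends credentials */
    EV_AUTH_OK,          /* credentials accepted, IP address assigned */
    EV_AUTH_FAILED,      /* credentials rejected */
    EV_LEAVE,            /* logout request or user left the coverage area */
    EV_RECORDS_UPDATED   /* database and online users list updated */
};

/* Return the state that follows `s` when event `e` occurs. */
enum user_state next_state(enum user_state s, enum user_event e)
{
    switch (s) {
    case ST_OFFLINE:
        return e == EV_LOGIN_REQUEST ? ST_LOGIN : s;
    case ST_LOGIN:
        if (e == EV_AUTH_OK)
            return ST_CONNECTED;
        if (e == EV_AUTH_FAILED)
            return ST_OFFLINE;
        return s;
    case ST_CONNECTED:
        return e == EV_LEAVE ? ST_LOGOUT : s;
    case ST_LOGOUT:
        return e == EV_RECORDS_UPDATED ? ST_OFFLINE : s;
    }
    assert(!"unknown state");
    return s;
}

/* Packets are accounted only while the user is in the connected state. */
int is_accounted(enum user_state s)
{
    return s == ST_CONNECTED;
}
```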
[Diagram: username@somedomain sends a login request with credentials to the Authentication Module, which updates the database (DB) and the online users list held in shared memory; the Traffic Logging Subsystem reads that list]
Figure 4. Login State
2.3.4 Communication with the authentication module
The previous section made clear that, in order for the system to function
properly, a means of communication between the authentication and the traffic
logging modules is needed. It is crucial that the packet capturing daemon is notified of
login / logout events so that it knows all the users that are logged in at any
moment. For the period of time the user is in the connected state, he is identified by
the pair consisting of his username (which has the format "user@domain") and the
dynamic IP address that has been assigned to him by the authentication module.
What the system needs is a way of interprocess communication between
the process that logs users in and the one responsible for user accounting. The
authentication module must have a way of sending the traffic logging module
information about any changes in the state of users. The minimum information needed
is the username, the (assigned) IP address of the user and an indicator of his new state
(logged in or logged out). The section that follows briefly discusses
communication between processes in a Linux environment.
2.3.4.1 Interprocess Communication (IPC) in Linux Environments
There are numerous ways of achieving IPC in a Linux system. Some of them are the
following.
- Signals
- Pipes
- Message Queues
- Sockets
- Threads
- Shared Memory
Signals are events that may be delivered to a process by the same or a different
process. Usually, signals are used to notify a process of an exceptional event.
Examples of signals are SIGINT, which is sent to a process when a user presses
Ctrl+C, SIGTERM, which is delivered to a process when a user kills it,
SIGUSR1 and SIGUSR2, which are user defined, SIGSEGV, which is raised when
there is a memory violation in the process (segmentation fault), etc.
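A minimal sketch of signal-based notification, using the POSIX sigaction interface, shows both how a process is told that something happened and why the signal alone says nothing more. The names on_usr1, install_usr1_handler and usr1_pending_event are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t got_usr1 = 0;

/* The handler only learns which signal arrived, nothing about its cause. */
static void on_usr1(int signo)
{
    (void)signo;
    got_usr1 = 1;
}

/* Install the SIGUSR1 handler; returns 0 on success, -1 on error. */
int install_usr1_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Return 1 if SIGUSR1 arrived since the last call, and clear the flag. */
int usr1_pending_event(void)
{
    int seen = got_usr1;

    got_usr1 = 0;
    return seen;
}
```

Another process would deliver the event with kill(pid, SIGUSR1). The receiver learns only that an event occurred, not which user or IP address it concerns.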
[Diagram: a logout notification for username@somedomain and its IP address reaches the Authentication Module, which updates the database (DB) and the online users list held in shared memory (setting the user offline and removing him from the online users list); the Traffic Logging Subsystem reads that list]
Figure 5. Logout State
Pipes can be regarded as files, which can be named or unnamed. A pipe is a one-way
communication channel between two processes. Unnamed pipes are used mainly for
communication between a parent and a child (forked) process. Named pipes are
more appropriate for communication between different programs that share the same
file system.
Sockets are a more general IPC mechanism than pipes. They can be
considered logical files that provide two-way communication. Usually, sockets
are used in network and distributed programming. Some socket types are used for
communication between kernel and userspace (e.g. netlink sockets).
Message queues are similar to pipes. However, they allow messages to be tagged
with specific message types, so messages of different types can be
exchanged. Unlike sockets, they can only be used for communication between
processes running on the same machine. Message queues and pipes were mainly used
in older UNIX systems and their use in modern programs has started to
decline.
Threads, often called lightweight processes, share the fundamental parts of the
process they belong to, that is its code, data, open files and signal handlers, while
each thread has its own stack. There are both user-level and kernel-level threads.
Finally, a way of IPC in Linux and UNIX environments is by using shared memory.
As its name implies, shared memory is a memory segment to which more than one
process can have access. The shared memory segment is created by one process and
other processes can access it, provided that they have access privileges to that segment.
This is a fast way of IPC and it is appropriate for cases when processes need to use a
shared resource (memory).
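The System V calls behind this mechanism can be sketched as follows. The wrapper names seg_create, seg_attach, seg_detach and seg_remove are hypothetical, and the sketch uses IPC_PRIVATE to stay self-contained; two unrelated programs would instead agree on a key (for example one produced by ftok) so that both can find the same segment.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create a new segment of `size` bytes; returns its id, or -1 on error. */
int seg_create(size_t size)
{
    assert(size > 0);
    return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/* Map the segment into this process; returns NULL on error. */
void *seg_attach(int id)
{
    void *p = shmat(id, NULL, 0);

    return p == (void *)-1 ? NULL : p;
}

/* Unmap a previously attached segment; returns 0 on success. */
int seg_detach(const void *p)
{
    return shmdt(p);
}

/* Mark the segment for destruction once every process has detached. */
int seg_remove(int id)
{
    return shmctl(id, IPC_RMID, NULL);
}
```

Every attachment of the same segment sees the same bytes, possibly at a different virtual address, which is why data placed in a segment is easier to share when it refers to its parts by offset or index rather than by raw pointer.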
2.3.4.2 IPC implementation in the Traffic Logging Subsystem
The IPC method chosen in this system is shared memory. This approach was
considered more appropriate because it was relatively simple to implement and closer
to the nature of the problem. That is, the two processes need to share the same
resource, which is the list of the online users. This list resides in a memory segment
that is accessible by both processes.
Signals were not suitable for this communication, because a signal cannot carry
all the information needed from one process to another. Signals could only notify the
traffic logging module of a login / logout event, but could not give any information
about the name and the IP address of the user the event referred to.
The authentication and the traffic logging modules do not communicate directly.
Instead, there is a middle level between the two subsystems, as Figure 6 shows.
shmhandleuser is in fact the process that implements the middle level between the
two modules. Every time a login / logout event takes place, the authentication module
must call this process, which adds / removes the user to / from the list of online users
(which is in the shared memory segment).
The presence of this middle level between the two subsystems is needed so that they
are as independent of one another as possible. Every time a user logs in and is
assigned an IP address, or a user logs out and his IP address is released, the
authentication module can call the external program shmhandleuser via
system(), one of the exec functions or a similar call. The arguments of shmhandleuser are:
- the P2PWNC user's username
- the user's IP address (newly assigned or released IP)
- a flag (0 or 1) indicating whether a logout or a login event, respectively, has taken place.
For example, if the authentication module program was written in C, a call of the
following format would be issued:
/* authentication program code */
.
.
.
system("shmhandleuser username@domain xxx.xxx.xxx.xxx 1");
.
.
.
/* more code */
The above piece of code adds the user username@domain with the assigned IP
address xxx.xxx.xxx.xxx to the list of online users (it is assumed that the
authentication module has already carried out all database updates concerning the new
user).
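Because system() passes its argument through a shell, a username containing shell metacharacters could change the command that is run. As an alternative sketch, and not the approach documented in the thesis, the authentication module could start the helper with fork and execlp so that each argument is passed verbatim. The function name notify_logger is hypothetical, and the helper is assumed to take the three arguments listed above in that order.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `helper user ip flag` without a shell and wait for it to finish.
 * Returns the helper's exit status, 127 if it could not be executed,
 * or -1 if fork/waitpid failed or the helper was killed by a signal.
 */
int notify_logger(const char *helper, const char *user, const char *ip,
                  int logged_in)
{
    pid_t pid;
    int status;

    assert(helper != NULL && user != NULL && ip != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: replace the process image with the helper program. */
        execlp(helper, helper, user, ip, logged_in ? "1" : "0",
               (char *)NULL);
        _exit(127);             /* only reached if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A login event would then be reported with a call such as notify_logger("shmhandleuser", "username@domain", "xxx.xxx.xxx.xxx", 1), with the real address in place of the placeholder.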
The authentication program could, of course, have been written in another
programming language. In that case, that language's equivalent call for running an
external program should be issued. The
generality and independence of this approach lie in the fact that the communication
module is not bound to the authentication module. Therefore, even if the
authentication module were rewritten from scratch, no changes would be needed
in the middle level. The only thing the programmer would have to do is
include the shmhandleuser call in his code every time a login / logout event
takes place. Also, this approach enables the creation of a central entity that can
control the whole AD system. One could create a controlling module which would
coordinate the authentication, the traffic logging and other AD modules. In such a
case, for example, the controlling module could search the database where user
information is stored on a regular basis (e.g. every second) to find out whether a new user
has logged into the system or whether logouts have taken place. After that, the controlling
module would issue shmhandleuser calls for every user that has entered / exited the
system.
[Diagram: the Authentication Module calls the shmhandleuser function, which updates the list of the online users in the shared memory segment; the Traffic Logging Daemon reads that list]
Figure 6. Subsystems IPC communication using Shared Memory
The list of the online users, as may have become clear, is located in a memory segment that is shared between the packet capturing daemon and the shmhandleuser
program.
This list is implemented as a kind of linked list. It is composed of nodes which have
the following format:
struct usrnode {
    /* User List Node */
    char username[200];
    char ipaddr[20];
    int count;
    int updated;
    int pos;
    struct usrnode *next;
};
typedef struct usrnode unode;
usrnode: This struct is a node of the online users list data structure
username: User's username (usually of the format: username@domain)
ipaddr: The assigned IP address
count: Number of nodes currently in the list. This field only makes sense for the head
of the list
pos: Position of the node in the list
next: Pointer to the next node of the list.
The above structure, as well as the list handling functions, are defined and
implemented in the usrlist.h file.
If the pcap daemon program wishes to find out whether a newly captured packet was
sent by, or is to be received by, an online user, it has to search the user list to
check whether the packet's source or destination IP address matches any of the users in
the list. This way the packet capturing daemon is always informed of the
users that are online at any time and is instantly notified of any change in a user's
state (online / offline).
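The lookup described above can be sketched as follows. This is only an illustration: the function names find_by_ip and classify_packet and the direction enum are hypothetical, and the convention that a source match counts as upload and a destination match as download is an assumption based on the upload / download statistics kept per user.

```c
#include <stddef.h>
#include <string.h>

struct usrnode {
    char username[200];
    char ipaddr[20];
    int count;
    int updated;
    int pos;
    struct usrnode *next;
};
typedef struct usrnode unode;

enum direction { DIR_NONE = 0, DIR_UPLOAD = 1, DIR_DOWNLOAD = 2 };

/* Linear search over n consecutive nodes; returns the match or NULL. */
static const unode *find_by_ip(const unode *nodes, int n, const char *ip)
{
    int i;

    for (i = 0; i < n; i++) {
        if (strcmp(nodes[i].ipaddr, ip) == 0) {
            return &nodes[i];
        }
    }
    return NULL;
}

/* A source match means the online user sent the packet (upload);
 * a destination match means the user will receive it (download). */
enum direction classify_packet(const unode *nodes, int n,
                               const char *src_ip, const char *dst_ip,
                               const unode **user)
{
    const unode *u = find_by_ip(nodes, n, src_ip);

    if (u) {
        *user = u;
        return DIR_UPLOAD;
    }
    u = find_by_ip(nodes, n, dst_ip);
    if (u) {
        *user = u;
        return DIR_DOWNLOAD;
    }
    *user = NULL;
    return DIR_NONE;
}
```

The search is linear in the number of online users, which is bounded by the size of the shared memory segment.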
As mentioned before, there are two processes that can have access to the shared
memory block. Of these two processes, only shmhandleuser normally writes to that
block. The packet capturing daemon (packet_cap process) only reads from that
memory area (apart from its startup phase, described below). That is, the daemon only searches the list of users located there and
does not modify it. The other process is the one that adds and
removes nodes from the list. As a result, the synchronization problems are less
serious, because the two processes do not try to write to the
same segment at the same time.
The way the two processes work in terms of the shared memory is as follows.
- The packet capturing daemon (packet_cap) first creates the shared memory segment:
/* creating mem segment */
mid = shmget(M_KEY, MAXUSERS*sizeof(unode), IPC_CREAT|PERMS);
if (mid == -1) {
    printf("ERROR GETTING MEM..\n");
    exit(1);
}
[ code taken from packet_cap.c ]
The above function (shmget) takes three parameters. The first is the key of the shared
memory segment. The second argument is the size of the shared memory block that
will be allocated. In this case, we have to allocate enough space for the
maximum number of users our system permits (MAXUSERS, which in our case has the value 2000).
The third argument includes the flags that control access to the shared
memory block. IPC_CREAT indicates that shmget must create a new shared memory
segment, whereas PERMS defines access rights to the block (in our case, it is 0666).
Then, packet_cap must map the shared memory block to its own address space. This
is achieved with the following call:
usrListHead = (unode*)shmat(mid, NULL, 0);
usrListHead is a pointer to a unode struct which is declared as static in another part of
the program (static unode* usrListHead;) and represents the head of the users list
residing in the shared memory. usrListHead is in fact a dummy node. It is used for
access to the users list. Its username and ipaddr members have no value. The most
important thing is that the count member of usrListHead reports the number of nodes
in the list. Also, the pos member, which indicates the position of the node in the list, has a zero value. A call to the shmat function attaches the shared memory segment
identified by mid to the address space of the calling process and returns a pointer to
that memory area.
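The create / attach / detach / remove life cycle of a System V shared memory segment can be tried out in isolation. The following is a minimal sketch, not the daemon's code: it uses IPC_PRIVATE and mode 0600 instead of the M_KEY and PERMS values, the function name shm_roundtrip is hypothetical, and both attachments live in one process purely for demonstration. Note that shmat reports failure by returning (void *)-1 rather than NULL, so the sketch checks for that value.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/* Creates a private segment of `size` bytes, writes msg through one
 * attachment, reads it back through a second (read-only) attachment,
 * then detaches both and removes the segment.
 * Returns 0 on success and a negative step code on failure. */
int shm_roundtrip(size_t size, const char *msg)
{
    int mid;
    char *writer;
    char *reader;
    int same;

    if (strlen(msg) + 1 > size) {
        return -1;
    }
    mid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (mid == -1) {
        return -2;
    }
    writer = shmat(mid, NULL, 0);
    if (writer == (void *)-1) {
        shmctl(mid, IPC_RMID, NULL);
        return -3;
    }
    reader = shmat(mid, NULL, SHM_RDONLY);
    if (reader == (void *)-1) {
        shmdt(writer);
        shmctl(mid, IPC_RMID, NULL);
        return -4;
    }

    strcpy(writer, msg);                 /* visible through both mappings */
    same = (strcmp(reader, msg) == 0);

    shmdt(reader);
    shmdt(writer);
    if (shmctl(mid, IPC_RMID, NULL) == -1) {
        return -5;
    }
    return same ? 0 : -6;
}
```

The two attachments generally receive different addresses, which is the same situation two separate processes are in when each calls shmat with a NULL address.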
Following that, the packet_cap process has to search the database to check if there are
any online users. It issues the following SQL query:
SELECT u_username, u_ip_addr FROM users WHERE u_online='y'
The above fields are self-explanatory. The packet capturing daemon then checks the
query results and adds the users that are found online to the online users list (in the shared memory). This requires a call to the shmAppendUser function, which is
declared in the usrlist.h file.
/* usrListHead is a unode*, so adding count already advances
   count * sizeof(unode) bytes */
shmAppendUser(usrListHead, usrListHead + usrListHead->count,
              row[0], row[1]);
The above function adds a new user at the end of the online users list, in the shared
memory block. The first argument of the function is the pointer to the head of the user
list (usrListHead) as it was described before. The second parameter is the exact
position in the shared memory block where the new node will be placed (the new
node has to be placed inside the shared memory block and, in particular, at the end of the list). The third parameter is the new node's username and the fourth the new
node's ipaddr. The above is the only case in which the packet capturing daemon actually
writes to the shared memory segment. In all other cases, the daemon only reads. The
reason why the packet capturing program has to check the database for online users
on its startup is that the authentication module may already be running at the moment
that the traffic logging module starts up. This means that
users may have already been assigned an IP address (as was said before, the database updates concerning the users' status are a task that the
authentication module is responsible for carrying out).
- Every time a login or logout event takes place, shmhandleuser is called by the
authentication module. shmhandleuser gets a descriptor of the shared memory
segment where the online users list is located in a similar way as the packet capturing
program did:
mid = shmget(M_KEY, MAXUSERS*sizeof(unode), PERMS);
if (mid == -1) {
    printf("ERROR GETTING MEM..\n");
    exit(1);
}
[ code taken from shmhandleuser.c ]
mid is the shared memory descriptor returned by the shmget function. The parameters
this function takes were described in the previous section. It should be noted that the
third argument of the function contains only the access permissions to the memory
block, while in the previous case there was also the flag IPC_CREAT, which indicated that
shmget was creating the memory segment.
Then, the shmhandleuser program must obtain a pointer to the shared memory area. This
is achieved by calling the shmat function, which was also described in the previous section.
After attaching the memory block to the address space of the calling process,
shmhandleuser decides what to do with the specified user. According to the flag
specified as the last argument of shmhandleuser, the program either adds the user to, or removes
the user from, the online users list. This is shown in the next code fragment:
if (atoi(argv[3])) {
    /* mem is a unode*, so mem + count is the first free node slot */
    shmAppendUser(mem, mem + mem->count, argv[1], argv[2]);
}
else {
    removeFromList(mem, argv[1]);
}
[ code taken from shmhandleuser.c ]
Obviously, if the third argument is non-zero, the user with the username specified by
the first argument of the program and the ipaddr specified by the second command
line argument is appended to the users list.
In case the third argument (flag) is zero, the user is removed from the list. The
function that removes users is removeFromList. Its first parameter is the pointer to the shared
memory block (the head of the list) and the second one is the username field of the node that is to be removed.
Finally, shmhandleuser must detach the shared memory segment it has attached to its
address space. This is achieved by a call to shmdt:
/* detaching mem block.. */
shmdt(mem);
The parameter shmdt takes is the pointer to the shared memory block, which was
acquired by the shmat call.
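Since shmhandleuser receives its input from another program, its three command line arguments could be checked before the list is touched. The following is a minimal sketch of such a check. The struct user_event type and the parse_event_args function are hypothetical, and the restriction to dotted-decimal IPv4 addresses and to the flag values 0 and 1 is an assumption of the sketch.

```c
#include <arpa/inet.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

struct user_event {
    const char *username;
    const char *ipaddr;
    int login;                /* 1 = login, 0 = logout */
};

/* Validates "prog <username> <ip> <0|1>".
 * Returns 0 and fills *ev on success, -1 otherwise. */
int parse_event_args(int argc, char *argv[], struct user_event *ev)
{
    struct in_addr addr;
    char *end;
    long flag;

    if (argc != 4) {
        return -1;
    }
    /* the username must fit in the 200-byte field of a list node */
    if (argv[1][0] == '\0' || strlen(argv[1]) >= 200) {
        return -1;
    }
    /* a valid dotted-decimal address always fits in ipaddr[20] */
    if (inet_pton(AF_INET, argv[2], &addr) != 1) {
        return -1;
    }
    flag = strtol(argv[3], &end, 10);
    if (end == argv[3] || *end != '\0' || (flag != 0 && flag != 1)) {
        return -1;
    }

    ev->username = argv[1];
    ev->ipaddr = argv[2];
    ev->login = (int)flag;
    return 0;
}
```

Unlike atoi, strtol makes it possible to distinguish a flag of 0 from an argument that is not a number at all.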
- After discussing what happens on packet_cap startup and what takes place
every time a login / logout event occurs, it is time to discuss what actions the
packet capturing daemon has to perform when it terminates. Normal program
termination takes place when the packet capturing daemon is sent the SIGINT or
SIGTERM signal, that is, when a user presses Ctrl-C or issues the kill
%processid command. In such a case, the signal handling function is called. Its
prototype is:
void termhandler(int sig);
The parameter sig is the signal code of the received signal (SIGINT or SIGTERM).
The operations termhandler performs (as far as the shared memory is concerned) are
shown in the next piece of code:
shmdt(usrListHead);
if (shmctl(mid, IPC_RMID, NULL)) {
    printf("ERROR REMOVING MEM...\n");
    exit(1);
}
else {
    printf("Shared memory segment removed successfully. mid: %d\n", mid);
}
[ code taken from packet_cap.c ]
First the shared memory block is detached from the process's address space and then
the block (with the mid descriptor) is removed by calling the function shmctl, passing the
flag IPC_RMID as the second argument.
In case of abnormal program termination (e.g. a SIGSEGV signal), there is a
utility program called killmem which removes the shared memory segment created by the packet capturing program with the shmget call.
In the above functions, we made use of the variables M_KEY and PERMS. These
variables are static:
static int M_KEY;
static int PERMS;
Their values are read from the packet capturing daemon's configuration file
(packet_cap.conf). The function that reads the information stored in the configuration
file and assigns values to the appropriate variables is called parseConf. It is a void function
with the following prototype:
void parseConf();
This function searches for the configuration file in a defined path:
#define PACKET_CAP_CONFFILE "/etc/packet_cap.conf"
If the configuration file is not in this location, then the parseConf function searches
for the file in the current working directory, that is, the directory the daemon is started from (typically where the application's binary is located). This is
shown in the next piece of code, which is in the body of the parseConf function:
if (!(f = fopen(PACKET_CAP_CONFFILE, "r+"))) {
    if (!(f = fopen("packet_cap.conf", "r+"))) {
        printf("ERROR OPENING CONF FILE\n");
        return;
    }
}
[ code taken from packet_cap.c ]
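The fallback search can be generalized to an ordered list of candidate paths. The following is a minimal sketch under stated assumptions: the helper names open_first_readable and parse_int_setting are hypothetical, and the NAME=VALUE line syntax is an assumption, since the format of packet_cap.conf is not shown here. The sketch opens the file in read-only mode, which is sufficient when the file is only read.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the first file in paths[] that can be opened for reading,
 * or NULL if none can. The caller closes the returned stream. */
FILE *open_first_readable(const char *const paths[], size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        FILE *f = fopen(paths[i], "r");
        if (f) {
            return f;
        }
    }
    return NULL;
}

/* Parses one "NAME=VALUE" line with an integer value. The base is
 * auto-detected, so 0666 is read as octal. Returns 1 on success. */
int parse_int_setting(const char *line, const char *name, int *out)
{
    size_t len = strlen(name);
    const char *value;
    char *end;
    long v;

    if (strncmp(line, name, len) != 0 || line[len] != '=') {
        return 0;
    }
    value = line + len + 1;
    v = strtol(value, &end, 0);
    if (end == value) {
        return 0;               /* no digits after '=' */
    }
    *out = (int)v;
    return 1;
}
```

A caller could pass the two locations mentioned above, "/etc/packet_cap.conf" and "packet_cap.conf", in that order.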
2.3.4.3 Shared Memory Allocation and Handling
In the above section there was much discussion about communication between the
traffic logging and the authentication modules via the middle level implemented by
the shmhandleuser process. There was a detailed description of what happens
when the packet capturing program starts and terminates, as well as the steps taken
when a login / logout event takes place. However, little was mentioned about the user
list implementation and the way shared memory list handling functions work.
The user list, as mentioned before, consists of usrnode (or unode) structures. It could
be considered as a linked list, but its implementation is dependent on the nature of
shared memory and thus differs a lot from typical linked list implementations.
The main difference from typical linked list implementations is that the valid address
space for a node of the list is limited to the shared memory segment created by the
shmget call and attached to the process's address space by shmat. Therefore, great
care must be taken so that every node is located inside the shared memory block
allocated for the processes.
A call to the malloc function for a new node pointer (unode*) would return an address
which would be inside the address space of the calling process, but certainly outside
the shared memory area. Obviously, this node would not be accessible by the other
process sharing the resource (user list) as its address would be out of the valid address
space where the second process has access.
A solution to this problem is the following.
The program responsible for the creation of the shared memory segment (packet_cap,
in our case) allocates space for MAXUSERS (consecutive) unode structs. MAXUSERS
is the maximum number of users allowed by the traffic logging subsystem to be
logged in at the same time. When a new user has to be added to the list, a call to
shmAppendUser is issued. The first parameter is the user list's head pointer. The
second is a unode pointer which points to the place in memory where the new node will
be placed. In order to ensure that the new node's address is inside the allocated shared
memory block, we pass as the second parameter of shmAppendUser the address of the
first empty place in the shared memory segment where a unode can be placed. The
address of this position is:
usrListHead + usrListHead->count
that is, usrListHead->count * sizeof(unode) bytes past the start of the segment (pointer
arithmetic on a unode* is already scaled by sizeof(unode)). Here usrListHead is the pointer to the head of the list, a dummy node which stores
the number of nodes in the list (usrListHead->count). The following figure will help
to better understand this method.
Obviously, the list's head is included in the number of nodes in the list. It is also
obvious that, by using this method of adding nodes, the nodes of the list will always be
placed in consecutive positions. However, when removing a node, all nodes that are
located after it must be shifted back so that no empty slots exist before the end of
the list (and the existing nodes remain in consecutive positions). The code of
the node appending function follows.
void shmAppendUser(unode* usrlist, unode* newnode, char un[], char ipaddr[]) {
    unode* cur;
    if (!usrlist) {
        return;
    }
    if (usrFindInListByName(usrlist, un)) {
        /* if the node is already in the list then return */
        return;
    }
    /* last node currently in the list; unode* arithmetic already
       advances in units of sizeof(unode) */
    cur = usrlist + (usrlist->count - 1);
    strcpy(newnode->username, un);
    strcpy(newnode->ipaddr, ipaddr);
    newnode->updated = 0;
    newnode->next = NULL;
    cur->next = newnode;
    usrlist->count++;
    newnode->pos = usrlist->count - 1;
}
[ code taken from usrlist.h ]
Figure 7. shmAppendUser() - Determining the new node's position (layout shown: head, unode1, ..., last unode; the new node is placed usrListHead->count * sizeof(unode) bytes past usrListHead)
In order to remove a node from the list, one has to call the removeFromList function,
which is the following.
int removeFromList(unode* usrlist, char username[]) {
    unode* cur;
    unode* prev;
    int i = 0;
    int j = 0;
    cur = usrlist;
    prev = usrlist;
    if (!strcmp(cur->username, username)) { /* removing head node */
        if (usrlist->count > 1) {
            (usrlist + 1)->count = usrlist->count - 1;
            usrlist = usrlist + 1;
        }
        else {
            usrlist = NULL;
        }
        return 1;
    }
    for (i = 0; i < usrlist->count; i++) {
        /* unode* arithmetic advances one node at a time */
        if (!strcmp((cur + 1)->username, username)) {
            /* shift the nodes that follow back by one position */
            for (j = (cur + 1)->pos; j < usrlist->count; j++) {
                memcpy(cur + 1, cur + 2, sizeof(unode));
                (cur + 1)->pos -= 1;
                cur++;
            }
            bzero(usrlist + usrlist->count, sizeof(unode));
            usrlist->count--;
            return 1; /* found */
        }
        cur++;
        prev = cur;
    }
    if (!strcmp(cur->username, username)) {
        prev->next = NULL;
        bzero(cur, sizeof(unode));
        usrlist->count--;
        return 1; /* found and removed */
    }
    return 0; /* not found */
}
[ code taken from usrlist.h ]
As said before, after removing a node, the nodes that are located after it have to be
shifted back by one position so that no empty nodes can be found before the end of
the list. The following figures demonstrate node removal.
Figure 8. removeFromList() shifting nodes back (before the removal the list holds: head, unode1, unode2, unode3, unode4)
After the node removal the list will be in the state shown in the figure below.
Figure 9. After removeFromList() (the list holds: head, unode1, unode3, unode4)
In this implementation of the list of online users, search by user name and by IP
address are supported. The search by user name is implemented in the
usrFindInListByName function. This function takes as parameters the head of the list
and the username of the user. The code for this function is the following.
unode* usrFindInListByName(unode* usrlist, char username[]) {
    unode* cur;
    int i = 0;
    cur = usrlist;
    for (i = 0; i < usrlist->count; i++) {
        if (!strcmp(cur->username, username)) {
            /* User found */
            return cur;
        }
        /* advance by one node, i.e. sizeof(unode) bytes */
        cur++;
    }
    /* user not found */
    return NULL;
}
[ code taken from usrlist.h ]
The function that performs the search based on the IP address of the user is similar. The
difference lies in the search criterion. In the IP based search, obviously, the IP address
given as an argument is compared to the ipaddr field of the list's unode structures.
If one takes a closer look at the above pieces of code, one will realize that list
traversal is not performed by following the next pointers. In fact, knowing that nodes
are stored in a serial form and that they have a fixed length, we can determine the next
node's position by moving the pointer that points to the current node
sizeof(unode) bytes forward.
It seems that this list implementation resembles that of a static array of unode
structures. It cannot be considered fully dynamic, as its length cannot exceed
MAXUSERS members and its nodes are positioned in a serial way. In the typical
linked list implementation, nodes can be physically located anywhere within the
address space of the process that has created the list and list traversal is
implemented by following the next pointers of the nodes. In this implementation,
though, the typical way of traversing the list is supported too.
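The array-like view of the list can be made explicit by indexing node slots directly. The following is a minimal sketch of that idea, not the code of usrlist.h: the names list_init, list_find, list_append and list_remove are hypothetical, and the sketch assumes that the dummy head occupies slot 0 and is itself counted in count, so that count is also the index of the first free slot. The next pointers are left unused here, because an address stored by one process is only meaningful to another process if both have the segment attached at the same address.

```c
#include <string.h>

struct usrnode {
    char username[200];
    char ipaddr[20];
    int count;
    int updated;
    int pos;
    struct usrnode *next;
};
typedef struct usrnode unode;

/* Slot 0 is the dummy head and is counted in head->count. */
void list_init(unode *head)
{
    memset(head, 0, sizeof *head);
    head->count = 1;
}

unode *list_find(unode *head, const char *username)
{
    int i;

    for (i = 1; i < head->count; i++) {
        if (strcmp(head[i].username, username) == 0) {
            return &head[i];
        }
    }
    return NULL;
}

/* Returns 1 if the user was added, 0 if full, duplicate or too long. */
int list_append(unode *head, int capacity, const char *username, const char *ip)
{
    unode *n;

    if (head->count >= capacity || list_find(head, username)) {
        return 0;
    }
    if (strlen(username) >= sizeof head->username ||
        strlen(ip) >= sizeof head->ipaddr) {
        return 0;
    }
    n = head + head->count;   /* count * sizeof(unode) bytes past the head */
    memset(n, 0, sizeof *n);
    strcpy(n->username, username);
    strcpy(n->ipaddr, ip);
    n->pos = head->count;
    head->count++;
    return 1;
}

/* Removes the user and shifts later nodes back by one slot. */
int list_remove(unode *head, const char *username)
{
    unode *n = list_find(head, username);
    int idx;
    int i;

    if (!n) {
        return 0;
    }
    idx = (int)(n - head);
    memmove(n, n + 1, (size_t)(head->count - idx - 1) * sizeof *n);
    head->count--;
    for (i = idx; i < head->count; i++) {
        head[i].pos = i;
    }
    memset(head + head->count, 0, sizeof *head);   /* clear the freed slot */
    return 1;
}
```

In the daemon's setting, head would be the pointer returned by shmat and capacity would be MAXUSERS.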
2.4 Traffic logging and analysis subsystem implementation
In this section, the traffic logging subsystem's
implementation is described in detail. Issues of design and development will be discussed.
2.4.1 Packet Capturing Systems
In the Linux world there are numerous attempts ([5], [6], [7], [8], [9], [10], [12]) to
create traffic logging systems, whose base is packet capturing and filtering. The need
for such systems is as old as computer networks themselves. Usually, traffic logging
and monitoring systems are used for accounting reasons, for network security and for the inspection of problems in the network's operation.
Some traffic logging systems or architectures are the following:
- SYSLOG: Kernel level logging via the iptables LOG target, whose messages are written to the system log (syslog). It can log some
information in files of a specific format and it is a relatively old way of logging, with
little information about network traffic and more about the operating system
state.
- ULOG: (Links [10], [12]) Userspace logging via the iptables ULOG target. This
provides more functionality than SYSLOG. It can do more refined logging and it is
much more flexible. In order to make it work, the administrator of the Linux system
has to give the appropriate iptables commands. Then, by running a daemon (the
creator of ULOG, Harald Welte, has written such a daemon program, called ulogd)
the traffic is filtered according to the filtering criteria specified in the iptables
commands and it is logged and analyzed. Ulogd offers the opportunity to specify the
level of the packet analysis the administrator of the system wishes to have by loading
appropriate plugins. Also, the administrator can configure ulogd in such a way that it
can log data to numerous output formats, including MySQL or PostgreSQL databases.
Ulogd makes use of netlink sockets for the communication between kernel and
userspace. Kernel / userspace switching is the biggest drawback of ulogd, as it suffers
from severe packet loss in high speed networks. Furthermore, due to the fact that it is
a logging method that has recently emerged and its use is not widespread, there is not
much information about it on the web. Also, its documentation is relatively poor.
Finally, ulogd does not offer a straightforward way of packet payload examination.
One has to develop his own interpreter plugin to do further packet analysis.
- Direct kernel level logging using mmap. This works as follows. First, the program
maps (via mmap) the network interface on a circular buffer. Then, a loop begins, in
which packets are read and analyzed and exported information is stored.
- Network Monitoring via SNMP. SNMP (Simple Network Management Protocol) is
a protocol used for (relatively) low level monitoring of the traffic of a network
interface. It uses MIBs (Management Information Bases) which describe the information
to be monitored (e.g. IP traffic volume, open ports, etc.).
- Libpcap based userspace logging. Libpcap is a system-independent interface for
user-level packet capturing. It provides a portable framework for low-level network
monitoring. This is the base for the packet capturing and analysis daemon described in
this document.
2.4.2 Libpcap architecture and principles
Libpcap supports a packet filtering mechanism based on the architecture of the BSD
Packet Filter (BPF). BPF is described in the 1993 Winter USENIX paper "The BSD
Packet Filter: A New Architecture for User-level Packet Capture" (see reference [9]).
The figure below shows how libpcap works.
The BPF packet filter is a human readable expression which sets the packet filtering
criteria. For example, if the BPF filter is "tcp" then only TCP traffic will be captured.
The filter can have more detailed expressions, including other protocols or port
numbers. If no filter is specified, then all traffic will be captured. The filter expression is
then compiled by libpcap into a filter program, which is applied
to intercepted packets.
A program using libpcap generally has to take the following steps:
- First, it has to determine the network interface that is to be watched. The network
interface name can be defined from a string (e.g. dev = "eth0") or we can let pcap
provide us with the name of an interface. This can be achieved with the
pcap_lookupdev function. Its prototype is:
char *pcap_lookupdev(char *errbuf)
- Then, it has to initialize pcap. Therefore, the function pcap_open_live has to be
called. This is the function where we actually tell pcap which network device is to be
sniffed. The pcap_open_live function prototype is as follows:
pcap_t *pcap_open_live(char *device, int snaplen,
int promisc, int to_ms, char *errbuf)
device: the device to be sniffed
snaplen: maximum number of bytes to capture per packet
promisc: if set to TRUE, the interface is set in promiscuous mode
to_ms: read timeout in milliseconds
errbuf: buffer to store errors
- The following step is to set the expression to be used for traffic filtering. That is, we
have to specify a rule set according to which packets will be filtered. For example, we may want to examine only packets going to port 21 or only TCP packets. The set of
rules must be converted to a format that pcap can understand. This task is performed
by the function pcap_compile. This function's prototype is as follows:
int pcap_compile(pcap_t *p, struct bpf_program *fp,
char *str, int optimize, bpf_u_int32 netmask)
The above function compiles the string str into a filter program, pointed to by fp. fp is
a pointer to a bpf_program struct and is filled in by pcap_compile. The next step is to
apply the filter. The function pcap_setfilter is responsible for applying the filter.
Figure 10. Libpcap architecture (components shown: Ethernet device driver, BPF driver and BPF inside the O.S. kernel, the protocol stack (IP, TCP, ...), and the userspace application, which receives a copy of each packet)
- Then, we tell pcap to enter its primary execution loop. Every time a new packet gets
sniffed a callback function already defined is called. In this callback function packet
analysis and data logging takes place. The call that tells pcap to enter that loop is
pcap_loop, one of the parameters of which is the name of the callback function
described above.
- The final step is to close the pcap session. However, the loop described in the above step runs indefinitely (it is stopped only in case of an error). The solution to this problem
that has been implemented in this packet capturing daemon is to perform the session
closing (as well as other tasks that do not have to do with pcap, such as freeing global
pointers, closing the database handle, etc.) when the program receives the SIGINT or
SIGTERM signal. That is, if the program runs in the background and the
user/administrator issues a Linux kill %processid command, the process receives the
SIGTERM signal, so the above tasks are performed before exiting the program.
2.4.3 Libpcap Advantages
One of the most important advantages of libpcap is its high portability. Libpcap
programs can be easily ported to most Unix/Linux systems, as well as Windows (the
equivalent for Win32 is winpcap).
Libpcap offers a low level mechanism for capturing packets. Programs can get a copy
of the packet that was intercepted at the Network Interface Card. Then, after stripping
it of its Ethernet, IP and TCP headers (which can be used for accounting reasons) the
program can deduce application level protocol information by examining the actual
packet's payload. This is harder to achieve using other methods of traffic logging.
For example, ULOG/ulogd does not offer the opportunity for packet payload
examination with its default modules. One has to develop his own plugins and embed
them in the existing ulogd body (ulogd is extensible by plugins for packet interpretation and data output to files and databases). Libpcap's way of doing this is
much more straightforward. The same problem is encountered when using SNMP.
SNMP was not designed for detailed application-level monitoring.
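The header stripping mentioned above can be illustrated on a raw frame buffer. The following is a minimal sketch, assuming an Ethernet II frame that carries IPv4 directly (no VLAN tag) and handling only TCP at the transport layer. The struct flow type and the parse_frame function are hypothetical and do not reproduce the daemon's packet analysis code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct flow {
    uint8_t  src_ip[4];
    uint8_t  dst_ip[4];
    uint8_t  protocol;       /* IP protocol number, 6 = TCP */
    uint16_t ip_total_len;   /* total length field of the IP header */
    uint16_t src_port;       /* 0 unless the packet is TCP */
    uint16_t dst_port;
    size_t   payload_off;    /* offset of the TCP payload, 0 if not TCP */
};

/* Reads a 16-bit big-endian (network order) value. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 if the frame is truncated or not IPv4. */
int parse_frame(const uint8_t *pkt, size_t len, struct flow *out)
{
    const size_t ip_off = 14;            /* Ethernet II header length */
    size_t ihl, l4, doff;

    if (len < ip_off + 20 || be16(pkt + 12) != 0x0800) {
        return -1;                       /* EtherType 0x0800 = IPv4 */
    }
    if ((pkt[ip_off] >> 4) != 4) {
        return -1;
    }
    ihl = (size_t)(pkt[ip_off] & 0x0F) * 4;   /* IP header length in bytes */
    if (ihl < 20 || len < ip_off + ihl) {
        return -1;
    }
    out->ip_total_len = be16(pkt + ip_off + 2);
    out->protocol = pkt[ip_off + 9];
    memcpy(out->src_ip, pkt + ip_off + 12, 4);
    memcpy(out->dst_ip, pkt + ip_off + 16, 4);
    out->src_port = 0;
    out->dst_port = 0;
    out->payload_off = 0;

    if (out->protocol == 6) {
        l4 = ip_off + ihl;
        if (len < l4 + 20) {
            return -1;
        }
        out->src_port = be16(pkt + l4);
        out->dst_port = be16(pkt + l4 + 2);
        doff = (size_t)(pkt[l4 + 12] >> 4) * 4;   /* TCP header length */
        if (doff < 20 || len < l4 + doff) {
            return -1;
        }
        out->payload_off = l4 + doff;
    }
    return 0;
}
```

The bytes from payload_off onward are what an application-level analysis (for example of HTTP or POP3 commands) would inspect.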
Furthermore, libpcap is a standard and well-tested way of packet capturing.
There are lots of traffic logging, accounting and monitoring applications which are
based on libpcap and are widely used. The most well known are probably tcpdump
and Ethereal. Other similar applications are ntop, Snort, etc. The fact that there are
both Microsoft Windows and UNIX/Linux versions of most of the above products is
further evidence of libpcap's portability.
2.4.4 Libpcap Performance Issues
Libpcap applications run at user level. As a result, data has to be exchanged with the
kernel, where the packet is intercepted. This can prove quite costly. Libpcap data
exchange between the kernel and user applications is carried out via system calls,
which are time consuming. Another serious cause of delay is the fact that there have
to be multiple copies of the packet from the moment it is actually intercepted until the moment a copy of it has been delivered to the userland packet capturing application.
These delays result in packet loss. Packet loss, in turn, results in loss of statistics
data or erroneous information about the network traffic.
The problem of packet loss is more severe in high speed networks. However, as far as
an AD of the P2PWNC is concerned, the network interface whose traffic has to be logged
is the other end of an 802.11b access point, which has a
relatively low bitrate compared to wired networks. In
particular, 802.11b was designed for a speed of 11 Mbit/sec, although the actual bitrate
offered is usually about half of that (or less). Taking into consideration that this bitrate will be divided among a number of users (according to the traffic shaping / bandwidth
control policy) and probably not all of it will be in use, the problems our libpcap-based
packet capturing daemon will face will be less important.
2.4.5 Data Storage
The medium selected for the storage of the traffic statistics is a MySQL database.
There are numerous reasons that support the choice of this database system. First of
all, it is free for non-commercial use. Second, it is a well tested database system, with
known efficiency and speed. Third, a C API is available, which is very well
documented. Also, reasons of compatibility with other modules of the AD (e.g.
authentication module) made the use of MySQL preferable.
2.4.5.1 Available Information
The aim of the traffic logging subsystem was to provide the system with network
traffic statistics of the P2PWNC registered users. The information that is of
significance to the system is IP traffic and application-level protocol information per
user. The following table lists the statistics that are logged, grouped by the network
protocol they refer to.
Network Protocol   Available Statistics
IP                 Total IP Upload Volume (Bytes)
                   Total IP Download Volume (Bytes)
                   Total HTTP/HTTPS Upload Volume (Bytes)
                   Total HTTP/HTTPS Download Volume (Bytes)
                   Total FTP Upload Volume (Bytes)
                   Total FTP Download Volume (Bytes)
                   Total SMTP Upload Volume (Bytes)
                   Total SMTP Download Volume (Bytes)
                   Total POP3 Upload Volume (Bytes)
                   Total POP3 Download Volume (Bytes)
                   Total TELNET Upload Volume (Bytes)
                   Total TELNET Download Volume (Bytes)
                   Total SSH Upload Volume (Bytes)
                   Total SSH Download Volume (Bytes)
FTP                FTP Host the User Connected To
                   FTP User Account (FTP User Name)
                   FTP Account Password
                   FTP Connection Count (to the above host using the specified user account)
HTTP               HTTP Request Method
                   HTTP Request Host
                   HTTP Request URI
                   HTTP Request User Agent
SMTP               SMTP Sender
                   SMTP Receiver
                   SMTP Subject
POP3               POP3 Server
                   POP3 Account (user name)
                   POP3 Password (for the above account)
                   Using APOP (true if the user uses the APOP authentication scheme)
                   POP3 Connection Count (to the above server using the above user name)
                   POP3 Message Subject
                   POP3 Message Sender
                   POP3 Message Length
                   POP3 Message ID
                   Times the above POP3 message has been retrieved by the user
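The per-protocol volume counters of the IP group can be kept with a small accumulator. The following is a minimal sketch under stated assumptions: protocols are recognized from the server-side TCP port using the standard well-known ports (21 for FTP control, 22 for SSH, 23 for TELNET, 25 for SMTP, 80 and 443 for HTTP/HTTPS, 110 for POP3), which is not necessarily how the daemon identifies them, and the type and function names are hypothetical.

```c
#include <stdint.h>

enum app_proto {
    APP_OTHER, APP_HTTP, APP_FTP, APP_SMTP, APP_POP3, APP_TELNET, APP_SSH,
    APP_COUNT
};

struct user_counters {
    uint64_t total_ul;               /* total IP upload volume (bytes) */
    uint64_t total_dl;               /* total IP download volume (bytes) */
    uint64_t proto_ul[APP_COUNT];    /* upload volume per protocol */
    uint64_t proto_dl[APP_COUNT];    /* download volume per protocol */
};

/* Maps a server-side TCP port to one of the logged protocols. */
enum app_proto classify_port(uint16_t server_port)
{
    switch (server_port) {
    case 21:  return APP_FTP;        /* control connection only */
    case 22:  return APP_SSH;
    case 23:  return APP_TELNET;
    case 25:  return APP_SMTP;
    case 80:
    case 443: return APP_HTTP;
    case 110: return APP_POP3;
    default:  return APP_OTHER;
    }
}

/* Adds one packet of ip_bytes to the user's total and to the counter
 * of the protocol that server_port maps to. */
void account_packet(struct user_counters *c, int is_upload,
                    uint16_t server_port, uint32_t ip_bytes)
{
    enum app_proto p = classify_port(server_port);

    if (is_upload) {
        c->total_ul += ip_bytes;
        c->proto_ul[p] += ip_bytes;
    } else {
        c->total_dl += ip_bytes;
        c->proto_dl[p] += ip_bytes;
    }
}
```

Such in-memory counters could then be written to the database periodically instead of once per packet.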
2.4.5.2 MySQL Database Schema
The schema of the database that stores the above information is the following:
#
# `admin` table structure
#
CREATE TABLE admin (
adm_username varchar(80) NOT NULL default '',
adm_pass varchar(80) NOT NULL default '',
adm_logged_in enum('y','n') NOT NULL default 'n',
adm_real_name varchar(80) NOT NULL default '',
adm_ipaddr varchar(30) NOT NULL default '',
adm_last_login date NOT NULL default '0-0-0-0',
PRIMARY KEY (adm_username)
) TYPE=MyISAM;
# --------------------------------------------------------
#
# `ftp` table structure
#
CREATE TABLE ftp (
f_username varchar(80) NOT NULL default '',
f_ftp_host varchar(80) NOT NULL default '',
f_ftp_user_name varchar(80) NOT NULL default '',
f_ftp_pass varchar(30) NOT NULL default '',
f_ftp_c_count int(11) NOT NULL default '0',
PRIMARY KEY (f_username,f_ftp_host,f_ftp_user_name)
) TYPE=MyISAM;
# --------------------------------------------------------
#
# `http` table structure
#
CREATE TABLE http (
h_id bigint(20) NOT NULL auto_increment,
h_username varchar(255) default NULL,
h_host varchar(255) default NULL,
h_method int(11) default NULL,
h_uri varchar(255) default NULL,
h_user_agent varchar(255) default NULL,
PRIMARY KEY (h_id)
) TYPE=MyISAM;
# --------------------------------------------------------
#
# `pop3` table structure
#
CREATE TABLE pop3 (
p_id varchar(80) NOT NULL default '',
p_username varchar(80) NOT NULL default '',
p_pop3_srv varchar(40) NOT NULL default '',
p_pop3_user varchar(40) NOT NULL default '',
p_sender varchar(255) NOT NULL default '',
p_msg_subject varchar(255) NOT NULL default '',
p_date varchar(255) NOT NULL default '',
p_msg_length bigint(20) NOT NULL default '0',
p_times_retrieved int(11) NOT NULL default '0',
PRIMARY KEY (p_id,p_username,p_pop3_user)
) TYPE=MyISAM;
# --------------------------------------------------------
#
# `pop3_users` table structure
#
CREATE TABLE pop3_users (
pu_username varchar(80) NOT NULL default '',
pu_pop3_srv varchar(80) NOT NULL default '',
pu_pop3_username varchar(80) NOT NULL default '',
pu_pop3_pass varchar(40) NOT NULL default '',
pu_using_apop tinyint(4) NOT NULL default '0',
pu_pop3_conn_count bigint(20) NOT NULL default '0',
PRIMARY KEY (pu_username,pu_pop3_srv,pu_pop3_username)
) TYPE=MyISAM;
# --------------------------------------------------------
#
# `smtp` table structure
#
CREATE TABLE smtp (
sm_username varchar(255) NOT NULL default '',
sm_smtp_from varchar(255) NOT NULL default '',
sm_smtp_to varchar(255) NOT NULL default '',
sm_subject varchar(255) NOT NULL default ''
) TYPE=MyISAM;
# --------------------------------------------------------
#
# `user_stats` table structure
#
CREATE TABLE user_stats (
ust_username varchar(80) NOT NULL default '',
ust_real_name varchar(80) NOT NULL default '',
ust_domain varchar(80) NOT NULL default '',
ust_online enum('y','n') NOT NULL default 'y',
ust_priv enum('y','n') NOT NULL default 'y',
ust_total_ul bigint(20) NOT NULL default '0',
ust_total_dl bigint(20) NOT NULL default '0',
ust_total_http_ul bigint(20) NOT NULL default '0',
ust_total_http_dl bigint(20) NOT NULL default '0',
ust_total_ftp_ul bigint(20) NOT NULL default '0',
ust_total_ftp_dl bigint(20) NOT NULL default '0',
ust_total_smtp_ul bigint(20) NOT NULL default '0',
ust_total_smtp_dl bigint(20) NOT NULL default '0',
ust_total_telnet_ul bigint(20) NOT NULL default '0',
ust_total_telnet_dl bigint(20) NOT NULL default '0',
ust_total_pop3_ul bigint(20) NOT NULL default '0',
ust_total_pop3_dl bigint(20) NOT NULL default '0',
ust_total_ssh_ul bigint(20) NOT NULL default '0',
ust_total_ssh_dl bigint(20) NOT NULL default '0',
PRIMARY KEY (ust_username)
) TYPE=MyISAM;
# --------------------------------------------------------
#
# `users` table structure
#
CREATE TABLE users (
u_username varchar(255) NOT NULL default '',
u_real_name varchar(255) NOT NULL default '',
u_ip_addr varchar(255) NOT NULL default '',
u_mac_addr varchar(255) NOT NULL default '',
u_online enum('y','n') NOT NULL default 'n',
PRIMARY KEY (u_username)
) TYPE=MyISAM;
Running this MySQL script creates the database in which the traffic logging
subsystem stores the information it extracts from the captured packets. A brief
description of the database's tables follows.
- Table ftp
This table stores information about the FTP connections the user has made.
f_username: The P2PWNC user name of the user
f_ftp_host: The FTP host the user connects to
f_ftp_user_name: The user name used to connect to the FTP host
f_ftp_pass: The most recent password for this FTP account that has been captured by
the system
f_ftp_c_count: The number of connections the user has made to the specified host
using the f_ftp_user_name account name
The primary key of this table is the triplet (f_username, f_ftp_host, f_ftp_user_name).
Every record of this table represents the connections a user makes to a specific FTP
account.
- Table http
This table stores information about the HTTP requests a user has made.
h_id: An auto-increment field which identifies the table records
h_username: The P2PWNC user name
h_host: The host to which the user has made this HTTP request
h_method: The HTTP request method
h_uri: The URI of the request
h_user_agent: The user agent that was used to issue the HTTP request
The primary key of this table is the h_id field. Each record of this table represents a
single HTTP request. The field h_method holds the request method (HEAD, GET,
POST, etc.) as an integer code that stands for the method type. The system can deal
only with the GET, POST and HEAD request types. The method codes are defined in the
file httplist.h.
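The integer encoding of the request method can be illustrated with a minimal C sketch. The numeric values and all names below are assumptions made for illustration only; the actual codes are the ones defined in httplist.h, which is not reproduced here.

```c
#include <string.h>

/* Hypothetical method codes; the real values are defined in httplist.h. */
enum http_method {
    HTTP_METHOD_UNKNOWN = 0,
    HTTP_METHOD_GET     = 1,
    HTTP_METHOD_POST    = 2,
    HTTP_METHOD_HEAD    = 3
};

/*
 * Map the start of an HTTP request line (e.g. "GET /index.html HTTP/1.1")
 * to a method code. Only GET, POST and HEAD are recognised, matching the
 * three request types the text says the system handles.
 */
int http_method_code(const char *request_line)
{
    if (request_line == NULL)
        return HTTP_METHOD_UNKNOWN;
    if (strncmp(request_line, "GET ", 4) == 0)
        return HTTP_METHOD_GET;
    if (strncmp(request_line, "POST ", 5) == 0)
        return HTTP_METHOD_POST;
    if (strncmp(request_line, "HEAD ", 5) == 0)
        return HTTP_METHOD_HEAD;
    return HTTP_METHOD_UNKNOWN;
}
```

A value produced this way is what would be stored in the h_method column. Any other method falls through to the "unknown" code in this sketch.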
- Table pop3_users
This table stores data about the POP3 connections users make to specific POP3 accounts.
pu_username: The P2PWNC user name
pu_pop3_srv: The POP3 server the user has connected to
pu_pop3_username: The username of the POP3 account
pu_pop3_pass: The most recent password for this account that has been captured
pu_using_apop: Set to 1 if the user uses the APOP authentication method for
this account
pu_pop3_conn_count: Number of connections this P2PWNC user has made to this
account
The primary key of this table is the triplet (pu_username, pu_pop3_srv,
pu_pop3_username). Every record of this table represents the connections a user
makes to a specific POP3 account.
- Table pop3
This is the second database table that deals with the POP3 protocol. Whereas the
previous table includes information about POP3 accounts and connections, this table
stores information about the messages themselves.
p_id: The message ID of the e-mail message, as it can be found in the mail header.
p_username: The P2PWNC user name
p_pop3_srv: The POP3 server the user has connected to (same as in the pop3_users
table)
p_pop3_user: The username of the POP3 account (same as in the pop3_users table)
p_sender: The sender of the e-mail
p_msg_subject: The subject of the e-mail
p_date: Date the mail was sent (as it can be found in the mail header)
p_msg_length: Message length, if available. If it is not available, the value 1 is
stored in the database
p_times_retrieved: How many times this message has been retrieved by the
P2PWNC user.
The primary key of this table consists of the fields p_id, p_username and
p_pop3_srv. This means that every record of this table refers to a specific message
(with message ID p_id), as it was retrieved by the P2PWNC user p_username
from the POP3 server p_pop3_srv.
- Table smtp
This table holds information about mail messages sent via SMTP.
sm_username: P2PWNC user name
sm_smtp_from: Sender of the mail (e-mail address)
sm_smtp_to: Receiver of the mail (e-mail address)
sm_subject: Mail subject (if available)
- Table users
This table stores user identification information and is supposed to be updated
by the authentication module. In particular, each time a user logs in to the system, the
authentication module is supposed to update the record that refers to the specific user
with the newly assigned dynamic IP address and the user's MAC address, and to set
the field u_online to 'y'. Also, when the packet capturing daemon starts up, it
executes the following SQL statement:
SELECT u_username, u_ip_addr FROM users WHERE u_online='y'
as described in a previous chapter, to determine which users are already logged in to the
system.
u_username: The P2PWNC user name
u_real_name: The user's real name
u_ip_addr: The IP address the user has been assigned. This value is of significance
mainly when the user is online
u_mac_addr: The MAC address of the user's NIC
u_online: A flag indicating the user's state. It has the value 'y' when the user is
online. Otherwise it has the value 'n'.
The primary key of this table is the user's P2PWNC user name (u_username field).
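The way the daemon could turn the result of the startup query into an in-memory list of online users, and later look a user up by IP address, can be sketched as follows. This is only an illustration under stated assumptions: the array of rows stands in for the rows of the users table, and the struct and function names are hypothetical rather than taken from the daemon's source.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one row of the `users` table. */
struct user_row {
    const char *u_username;
    const char *u_ip_addr;
    char        u_online;      /* 'y' or 'n' */
};

/* Hypothetical node of the in-memory online users list. */
struct online_user {
    char username[256];
    char ip_addr[256];
    struct online_user *next;
};

/*
 * Build a linked list holding only the rows with u_online = 'y',
 * i.e. the rows the startup SELECT statement would return.
 * Returns the list head (NULL if nobody is online). If an allocation
 * fails, the list built so far is returned.
 */
struct online_user *build_online_list(const struct user_row *rows, size_t n)
{
    struct online_user *head = NULL;
    for (size_t i = 0; i < n; i++) {
        if (rows[i].u_online != 'y')
            continue;
        struct online_user *node = calloc(1, sizeof *node);
        if (node == NULL)
            break;
        /* calloc zeroed the buffers, so the copies stay NUL-terminated. */
        strncpy(node->username, rows[i].u_username, sizeof node->username - 1);
        strncpy(node->ip_addr, rows[i].u_ip_addr, sizeof node->ip_addr - 1);
        node->next = head;
        head = node;
    }
    return head;
}

/* Linear search by dotted-decimal IP string; NULL if no online user has it. */
const struct online_user *find_online_by_ip(const struct online_user *head,
                                            const char *ip)
{
    for (; head != NULL; head = head->next)
        if (strcmp(head->ip_addr, ip) == 0)
            return head;
    return NULL;
}

/* Release every node of the list. */
void free_online_list(struct online_user *head)
{
    while (head != NULL) {
        struct online_user *next = head->next;
        free(head);
        head = next;
    }
}
```

A linear search is enough for a sketch; with many simultaneous users a hash table keyed on the IP address would probably be a better fit.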
- Table user_stats
This table, apart from some user identification information, includes the aggregate IP
traffic statistics. That is, it stores the volume of the user's traffic by protocol.
ust_username: P2PWNC username
ust_real_name: User's real name
ust_domain: Administrative Domain the user is registered to
ust_online: Flag indicating whether the user is online
ust_priv: Flag indicating whether the user has extra privileges. This field is not used
by the system and exists only for possible future use.
ust_total_ul: Total volume of the uploaded IP traffic the user has caused
ust_total_dl: Total volume of the downloaded IP traffic of the user
ust_total_http_ul: Total volume of the user's HTTP uploads
ust_total_http_dl: Total volume of the user's HTTP downloads
ust_total_ftp_ul: Total volume of the user's FTP uploads
ust_total_ftp_dl: Total volume of the user's FTP downloads
ust_total_smtp_ul: Total volume of the user's SMTP uploads
ust_total_smtp_dl: Total volume of the user's SMTP downloads
ust_total_telnet_ul: Total volume of the user's TELNET uploads
ust_total_telnet_dl: Total volume of the user's TELNET downloads
ust_total_pop3_ul: Total volume of the user's POP3 uploads
ust_total_pop3_dl: Total volume of the user's POP3 downloads
ust_total_ssh_ul: Total volume of the user's SSH uploads
ust_total_ssh_dl: Total volume of the user's SSH downloads
Just like the users table, the primary key of this table is the P2PWNC username
(ust_username). The volume of traffic is always calculated in bytes. As one can see,
the primary keys of the tables users and user_stats have the same semantics.
This means that the two tables could be merged. However, for reasons of
compatibility with other AD modules and of independence between the
modules, keeping two separate tables was preferable. As described
before, it is the duty of the authentication module to update the database with the user's
state (online / offline) and identification information. However, the authentication
module does not deal with traffic logging issues, and the traffic logging module only
reads identification and user state information. That is why the users table is
bound to the authentication module, while the user_stats table is bound to the traffic
logging and accounting module.
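The per-protocol byte counters of user_stats can be pictured as a small in-memory structure that is updated for every packet attributed to a user. The sketch below is illustrative only: the type and function names are hypothetical, and it assumes that the overall totals count every packet while the per-protocol counters grow only for the six tracked protocols.

```c
#include <stdint.h>

/* Protocols tracked per user; the names are illustrative. */
enum proto {
    PROTO_HTTP, PROTO_FTP, PROTO_SMTP,
    PROTO_TELNET, PROTO_POP3, PROTO_SSH,
    PROTO_COUNT
};

enum direction { DIR_UPLOAD = 0, DIR_DOWNLOAD = 1 };

/* Hypothetical in-memory mirror of the byte counters in one user_stats row. */
struct user_counters {
    uint64_t total[2];                    /* ust_total_ul / ust_total_dl */
    uint64_t per_proto[PROTO_COUNT][2];   /* ust_total_<proto>_ul / _dl  */
};

/*
 * Account for one packet: the overall total for the direction always grows,
 * while the per-protocol counter grows only for a tracked protocol.
 * Pass a negative value (or PROTO_COUNT) for untracked traffic.
 */
void account_packet(struct user_counters *c, int proto,
                    enum direction dir, uint64_t bytes)
{
    c->total[dir] += bytes;
    if (proto >= 0 && proto < PROTO_COUNT)
        c->per_proto[proto][dir] += bytes;
}
```

All values are byte counts, in line with the statement that traffic volume is always calculated in bytes.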
- Table admin
This table holds information useful for the traffic statistics server, which is described
in detail in the third chapter of the document.
adm_username: Administrator's username
adm_pass: Administrator's password
adm_logged_in: A flag indicating whether the administrator is logged in using a
client program
adm_real_name: The administrator's real name
adm_ipaddr: The current IP address of the client the administrator is using
adm_last_login: The date of the administrator's last login
Obviously, this table has nothing to do with the packet capturing daemon program.
The data it stores only concern the statistics exchange server and client
programs, which will be discussed in Chapter 3. One thing that should be mentioned
now is that the primary key of this table is the field adm_username. This implies that
more than one person can have administrative rights on the statistics server
/ client.
As a remark on the database schema, it should be mentioned that the table fields are
named after the table they belong to. Specifically, each field name starts with an
abbreviation of the table name, followed by an underscore (for example, f_ for the
ftp table and ust_ for the user_stats table).
2.5 The Packet Capturing and Analysis Daemon
2.5.1 Packet Capturing Daemon Architecture
Figure 11 demonstrates the operations that are performed on a captured
packet. It can be considered as a flow diagram of the callback function that is invoked
every time the libpcap-based daemon captures a packet. These operations will
now be described in detail.
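The port check at the end of the flow in Figure 11 amounts to a lookup from a TCP port number to a protocol class. A minimal sketch of that lookup, using the port list given in the figure, could look as follows; the enum and function names are hypothetical and not taken from the daemon's source.

```c
/* Protocol classes from Figure 11; the enum names are illustrative. */
enum traffic_class {
    CLASS_OTHER = 0,
    CLASS_FTP,
    CLASS_SSH,
    CLASS_TELNET,
    CLASS_SMTP,
    CLASS_HTTP,    /* HTTP and HTTPS are grouped together */
    CLASS_POP3
};

/* Classify a TCP port (in host byte order) using the port list of Figure 11. */
enum traffic_class classify_tcp_port(unsigned int port)
{
    switch (port) {
    case 20: case 21:  return CLASS_FTP;
    case 22:           return CLASS_SSH;
    case 23:           return CLASS_TELNET;
    case 25:           return CLASS_SMTP;
    case 80: case 443: return CLASS_HTTP;
    case 110:          return CLASS_POP3;
    default:           return CLASS_OTHER;
    }
}
```

Port fields in a TCP header are stored in network byte order, so a caller would convert them with ntohs before passing them to a function like this one.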
2.5.1.1 Protocol Header Stripping
As shown in Figure 11, the first step is to strip the packet of its Ethernet, IP and TCP
headers. The remainder is the payload of the packet (the data that are useful for
application-level statistics extraction). The daemon treats these headers as fixed-length
structures, which simplifies their extraction, as the following code shows.
const struct ethhdr *ethernet; /* The Ethernet header */
const struct iphdr *ip; /* The IP header */
const struct tcphdr *tcp; /* The TCP header */
const char *payload; /* Packet payload */
int size_ethernet = sizeof(struct ethhdr);
int size_ip = sizeof(struct iphdr);
int size_tcp = sizeof(struct tcphdr);
/* Stripping headers */
ethernet = (struct ethhdr*)(packet);
ip = (struct iphdr*)(packet + size_ethernet);
tcp = (struct tcphdr*)(packet + size_ethernet + size_ip);
payload = (u_char *)(packet + size_ethernet + size_ip + size_tcp);
[ code taken from packet_cap.c ]
[Figure 11. Packet Capturing Daemon Architecture. Flow diagram: the Ethernet, IP and
TCP headers of a captured packet are stripped; the packet's IP addresses are checked
against the online users list in shared memory; a packet that does not belong to an
online user is ignored; otherwise the packet is classified as sent or received by the
user and its TCP source / destination port is checked (20 / 21 - FTP, 22 - SSH,
23 - TELNET, 25 - SMTP, 80 / 443 - HTTP/HTTPS, 110 - POP3); the results of the packet
analysis (protocol statistics) are stored in the DB.]
packet is a pointer to the packet captured (which is handled as an unsigned char
array). The definitions of the Ethernet, IP and TCP headers are located in the
linux/if_ether.h, netinet/ip.h and netinet/tcp.h files.
A brief explanation of the above method of extracting packet headers follows.
As mentioned, the packet is handled as a string of unsigned chars. Packet
manipulation takes place in the body of the callback function that is passed as a
parameter to pcap_loop. In this program, the call to pcap_loop has the
following format:
pcap_loop(handle, 0, (pcap_handler)updateUserStats, NULL);
handle refers to the pcap handle we have acquired by the call to pcap_open_live
(the pcap session opening function).
updateUserStats is our packet handling callback function. Such a callback function
has a specified prototype. In this case, the definition of updateUserStats is as follows:
void updateUserStats(u_char *args, const struct pcap_pkthdr *header,
const u_char *packet)
The parameter the packet capturing daemon program is most interested in is packet.
This parameter is an unsigned char array containing all of the sniffed packet
data. It is a collection of other structures (protocol headers and packet
payload) rather than a string. In fact, it is the serialised version of these structures.
In order to strip the packet of its headers, the program has to perform some type
casting tasks. As mentioned before, packet is a pointer to the start of the packet
structure. Making use of the fact that Ethernet, IP and TCP headers, as defined in
linux/if_ether.h, netinet/ip.h and netinet/tcp.h are of fixed length, we can acquire
pointers to the start of the Ethernet, IP and TCP headers. First, we get a pointer to the
beginning of the Ethernet header, which points to the start of the captured packet. This
pointer is cast to a pointer to struct ethhdr:
ethernet = (struct ethhdr*)(packet);
The IP header starts immediately after the end of the Ethernet header.
Therefore, the beginning of the IP header lies exactly sizeof(struct ethhdr) bytes
after the start of the Ethernet frame. In a similar way, we can calculate the position of
the pointer to the start of the TCP header (if we refer to a TCP / IP Packet) by adding
the size of the ethernet and ip header structures (in bytes) to the value of the position
of the packet start pointer in memory and typecast this pointer to a tcphdr structure.
Finally, we can find out the exact position in the packet where the payload data start
in the same way and typecast the pointer to u_char*.
ip = (struct iphdr*)(packet + sizeof(struct ethhdr));
tcp = (struct tcphdr*)(packet + sizeof(struct ethhdr) + sizeof(struct
iphdr));
payload = (u_char *)(packet + sizeof(struct ethhdr) + sizeof(struct
iphdr) + sizeof(struct tcphdr) );
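The offsets above treat the IP and TCP headers as fixed-size structures. Both headers may carry options, in which case their actual length is given by the IHL field of the IP header and the data offset field of the TCP header, each counted in 32-bit words. The following sketch is not taken from packet_cap.c; it is a hedged alternative that computes the payload offset from those two fields by reading raw bytes. The function name and the constant are made up for illustration, and the caller is assumed to know already that the frame is untagged Ethernet carrying IPv4 and TCP.

```c
#include <stddef.h>

#define ETH_HEADER_LEN 14   /* untagged Ethernet II header: 6 + 6 + 2 bytes */

/*
 * Return the offset of the TCP payload inside an Ethernet/IPv4/TCP frame,
 * or -1 if the frame is too short or its length fields are invalid.
 * Header lengths come from the IHL and data offset fields, not from sizeof.
 */
long tcp_payload_offset(const unsigned char *packet, size_t caplen)
{
    if (packet == NULL || caplen < ETH_HEADER_LEN + 20)
        return -1;

    const unsigned char *ip = packet + ETH_HEADER_LEN;
    size_t ip_len = (size_t)(ip[0] & 0x0F) * 4;     /* IHL, in 32-bit words */
    if (ip_len < 20 || caplen < ETH_HEADER_LEN + ip_len + 20)
        return -1;

    const unsigned char *tcp = ip + ip_len;
    size_t tcp_len = (size_t)(tcp[12] >> 4) * 4;    /* data offset, in words */
    if (tcp_len < 20 || caplen < ETH_HEADER_LEN + ip_len + tcp_len)
        return -1;

    return (long)(ETH_HEADER_LEN + ip_len + tcp_len);
}
```

For a packet without options this yields 14 + 20 + 20 = 54, the same offset as the fixed-size computation; the two approaches differ only when options are present.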
The next figure demonstrates the format of an Ethernet packet and how one can
perform the above steps to get the protocol headers out of the captured packet.
2.5.1.2 User Packet Matching
After the protocol headers have been extracted, the system has to determine whether
the packet was sent or received by a registered online user. This is achieved by
checking the packet's source and destination IP addresses.
The matching between users and packets is carried out as follows:
First, the pointers to the unode structures that will represent the sender and the receiver of
the intercepted packet (if they can be found in the online users list) must be declared.
unode usrsnd; /* packet sender */
unode usrrcv; /* packet receiver */
Then, given the source and destination IP addresses of the packet, the program
searches the online users list (which resides in the shared memory segment) to find out
whether either address corresponds to the IP address of an online user. If
such users are found in the list, the packet capturing program goes on to log
statistics of the traffic they cause. If no such user is found, both the usrsnd and
usrrcv pointers have a NULL value. The calls to the function that searches the users
list are the following.
usrsnd = usrFindInListByIp(usrListHead,(char*)inet_ntoa(ip-> ip_src.s_addr));
usrrcv = usrFindInListByIp(usrListHead,
sizeof (struct ethhdr)
IP Header
sizeof (struct iphdr)
TCP Header
sizeof (struct tcphdr)
Packet Payloa