Botnet Dr. 許 富 皓


Page 1: Botnet Dr. 許  富  皓

1

Botnet

Dr. 許 富 皓

Page 2: Botnet Dr. 許  富  皓

2

Botnet [Trend Micro]

Page 3: Botnet Dr. 許  富  皓

3

Historical List of Botnets (1) [wiki]

Page 4: Botnet Dr. 許  富  皓

4

Historical List of Botnets (2) [wiki]

Page 5: Botnet Dr. 許  富  皓

5

Definition of a Botnet

A botnet (zombie army or drone army) refers to a pool of compromised computers that are under the command of a single hacker, or a small group of hackers, known as a botmaster.

Page 6: Botnet Dr. 許  富  皓

6

Definition of a Bot

A bot refers to a compromised end-host, or a computer, which is a member of a botnet.

Page 7: Botnet Dr. 許  富  皓

7

The First Bot Generation Malware – PrettyPark [F-Secure]

The first bot generation malware, PrettyPark worm, appeared in 1999.

A critical difference between PrettyPark and previous worms is that it makes use of IRC as a means to allow a botmaster to remotely control a large pool of compromised hosts.

Its revolutionary idea of using IRC as a discrete and extensible method for Command and Control (C&C) was soon adopted by the black hat community.

Page 8: Botnet Dr. 許  富  皓

8

How Fast Could Your Computer Be Compromised?

Based on the observation of an unpatched version of Windows 2000 or Windows XP located within a dial-in network of a German ISP:

Normally it takes only a couple of minutes before it is successfully compromised.

On average, the expected lifespan of the honeypot is less than ten minutes. After this small amount of time, the honeypot is often successfully exploited by automated malware.

The shortest compromise time was only a few seconds: once we plugged the network cable in, an SDBot compromised the machine via an exploit against TCP port 135 and installed itself on the machine.

Page 9: Botnet Dr. 許  富  皓

9

Sizes of Botnets [Wikipedia]

Some botnets consist of only a few hundred bots.

In contrast to this, several large botnets with up to 50,000 hosts were also observed.

Botnets with several hundred thousand hosts have been reported in the past.

Kraken botnet: on April 13, 2008, there were 495,000 computers in the Kraken botnet [Damballa].

Storm botnet [Enright]

Conficker: 10,000,000 [F-Secure]

Page 10: Botnet Dr. 許  富  皓

10

A Host May Be Infected by Several Botnets Simultaneously

A home computer infected by 16 different bots has been found.

Page 11: Botnet Dr. 許  富  皓

11

Taxonomy of Botnets

Attacking behavior

C&C models

Rallying mechanisms

Communication protocols

Observable botnet activities

Evasion techniques

Page 12: Botnet Dr. 許  富  皓

12

Attacking Behavior [Paul Bächer et al.]

Distributed Denial-of-Service Attacks

Spamming

Sniffing Traffic

Keylogging

Spreading new malware

Installing Advertisement Addons

Google AdSense abuse

Manipulating online polls/games

Mass identity theft

Page 13: Botnet Dr. 許  富  皓

13

Distributed Denial-of-Service Attacks (1)

Often botnets are used for Distributed Denial-of-Service (DDoS) attacks.

A DDoS attack is an attack on a computer system or network that causes a loss of service to users, typically the loss of network connectivity and services, by consuming the bandwidth of the victim network or overloading the computational resources of the victim system.

Page 14: Botnet Dr. 許  富  皓

14

Distributed Denial-of-Service Attacks (2)

Further research showed that botnets are even used to run commercial DDoS attacks against competing corporations: Operation Cyberslam documents the story of Jay R. Echouafni and Joshua Schichtel alias EMP.

Echouafni was indicted on August 25, 2004 on multiple charges of conspiracy and causing damage to protected computers.

He worked closely together with EMP, who ran a botnet to send bulk mail and also carried out DDoS attacks against the spam blacklist servers.

In addition, they took Speedera - a global on-demand computing platform - offline when they ran a paid DDoS attack to take a competitor's website down.

Page 15: Botnet Dr. 許  富  皓

15

Proxy

Some bots offer the possibility to open a SOCKS v4/v5 proxy on a compromised machine.

SOCKS is a generic proxy protocol for TCP/IP-based networking applications (SOCKS v5 is specified in RFC 1928).

Page 16: Botnet Dr. 許  富  皓

16

Spamming

After having enabled the SOCKS proxy, this machine can then be used for nefarious tasks such as spamming.

With the help of a botnet and thousands of bots, an attacker is able to send massive amounts of spam mails.

Often that spam you are receiving was sent from, or proxied through, an old Windows computer at home.

In addition, this can of course also be used to send phishing-mails since phishing is a special case of spam.

Some bots also implement a special function to harvest email-addresses.

Page 17: Botnet Dr. 許  富  皓

17

Botnets Responsible for 87% of 2009 Global Spam Mail [Yahan Wu]

According to a report released by Symantec, botnets send out more than 87 percent of all unsolicited mail, equating to around 151 billion emails a day.
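The two figures above imply a total daily spam volume. The following is a small worked calculation, assuming (this is an assumption, not stated on the slide) that the 87 percent share and the 151 billion figure refer to the same daily average.

```python
# Figures quoted on the slide (Symantec report, 2009).
botnet_share = 0.87           # fraction of unsolicited mail sent by botnets
botnet_spam_per_day = 151e9   # emails per day attributed to botnets

# Implied totals, assuming both figures describe the same daily average.
total_spam_per_day = botnet_spam_per_day / botnet_share
other_spam_per_day = total_spam_per_day - botnet_spam_per_day

print(f"total spam per day:      {total_spam_per_day / 1e9:.1f} billion")
print(f"non-botnet spam per day: {other_spam_per_day / 1e9:.1f} billion")
```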

Page 18: Botnet Dr. 許  富  皓

18

Spam Capacity of Some Notorious Botnets

Name         Est. Bot #                         Spam Capacity
Conficker    10,000,000+                        10 billion/day
Kraken       495,000                            9 billion/day
Srizbi       450,000                            60 billion/day
Bobax        185,000                            9 billion/day
Rustock      150,000                            30 billion/day
Cutwail      125,000                            16 billion/day
Storm        85,000 (only 35,000 send email)    3 billion/day
Donbot       80,000                             500 million/day
Grum         50,000                             2 billion/day
Onewordsub   40,000                             ?
Mega-D       35,000                             10 billion/day
Nucrypt      20,000                             5 billion/day
Wopla        20,000                             600 million/day
Spamthru     12,000                             350 million/day
Crime Art    10,000                             250 million/day
SilverNet    Unknown                            Unknown
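The table shows that size and spam capacity are not proportional. As a rough illustration, the sketch below divides capacity by the estimated bot count for a few rows copied from the table; the function name `per_bot_rate` is only illustrative, and the result is an average that ignores the fact that not every bot sends mail.

```python
# (name, estimated bots, spam capacity per day), copied from the table above.
botnets = [
    ("Srizbi", 450_000, 60e9),
    ("Rustock", 150_000, 30e9),
    ("Conficker", 10_000_000, 10e9),
    ("Storm", 85_000, 3e9),
]

def per_bot_rate(bots, capacity):
    """Average number of messages per bot per day."""
    return capacity / bots

# Rank botnets by how much spam each bot sends on average.
ranked = sorted(botnets, key=lambda b: per_bot_rate(b[1], b[2]), reverse=True)
for name, bots, capacity in ranked:
    print(f"{name:10s} {per_bot_rate(bots, capacity):>12,.0f} msgs/bot/day")
```

On these numbers a small botnet such as Rustock sends far more mail per bot than the much larger Conficker.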

Page 19: Botnet Dr. 許  富  皓

19

Sniffing Traffic

Bots can also use a packet sniffer to watch for interesting clear-text data passing by a compromised machine.

The sniffers are mostly used to retrieve sensitive information like usernames and passwords.

If a machine is compromised more than once and is therefore a member of more than one botnet, packet sniffing makes it possible to gather the key information of the other botnet. Thus it is possible to "steal" another botnet.

Page 20: Botnet Dr. 許  富  皓

20

Keylogging

If the compromised machine uses encrypted communication channels (e.g. HTTPS or POP3S), then just sniffing the network packets on the victim's computer is useless since the appropriate key to decrypt the packets is missing.

With the help of a keylogger it is very easy for an attacker to retrieve sensitive information.

An implemented filtering mechanism further helps in stealing secret data, e.g. "I am only interested in key sequences near the keyword 'paypal.com'".

And if you imagine that this keylogger runs on thousands of compromised machines in parallel, you can imagine how quickly PayPal accounts are harvested.

Page 21: Botnet Dr. 許  富  皓

21

Spreading New Malware

In most cases, botnets are used to spread new bots.

This is very easy since all bots implement mechanisms to download and execute a file via HTTP or FTP.

Spreading an email virus using a botnet is also effective from an attacker's point of view.

A botnet with 10,000 hosts which acts as the start base for the mail virus allows very fast spreading and thus causes more harm.

Page 22: Botnet Dr. 許  富  皓

22

Installing Advertisement Addons

Botnets can also be used to gain financial advantages.

This works by setting up a fake website with some advertisements:

The operator of this website negotiates a deal with some hosting companies that pay for clicks on ads.

With the help of a botnet, these clicks can be "automated" so that instantly a few thousand bots click on the pop-ups.

This process can be further enhanced if the bot hijacks the start-page of a compromised machine so that the "clicks" are executed each time the victim uses the browser.

Page 23: Botnet Dr. 許  富  皓

23

Google AdSense Abuse

A similar abuse is also possible with Google's AdSense program:

AdSense offers companies the possibility to display Google advertisements on their own website and earn money this way.

The company earns money due to clicks on these ads, for example per 10,000 clicks in one month.

An attacker can abuse this program by leveraging his botnet to click on these advertisements in an automated fashion and thus artificially increment the click counter.

This kind of usage for botnets is relatively uncommon, but not a bad idea from an attacker's perspective.

Page 24: Botnet Dr. 許  富  皓

24

Loss Caused by Click Fraud [Catherine Holahan]

On average, consultants estimate that between 14% and 15% of clicks are fraudulent.

Page 25: Botnet Dr. 許  富  皓

Retrieve a URL from Old Version of Google Search Results

25

Page 26: Botnet Dr. 許  富  皓

26

Google Search Page

Page 27: Botnet Dr. 許  富  皓

27

Google Search Result Page

Page 28: Botnet Dr. 許  富  皓

28

Source HTML File of the Google Search Result Page

Page 29: Botnet Dr. 許  富  皓

29

Ampersands (&'s) in URLs [Liam Quinn ]

Always use &amp; in place of & when writing URLs in HTML:

E.g.: <a href="foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en">...</a>
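The rule above can be applied mechanically. The sketch below uses Python's standard `html.escape` to turn the raw query string from the example into a value that is safe to place inside an `href` attribute; the variable names are illustrative. A parameter such as `copy` is the classic hazard, because an unescaped `&copy` can be read as the copyright entity.

```python
from html import escape, unescape

# Raw URL as a program would build it (plain ampersands between parameters).
url = "foo.cgi?chapter=1&section=2&copy=3&lang=en"

# Escape it before embedding it in HTML; quote=True also escapes quote characters.
href = escape(url, quote=True)
anchor = f'<a href="{href}">...</a>'
print(anchor)

# Decoding the escaped attribute value recovers the original URL.
round_trip = unescape(href)
```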

Page 30: Botnet Dr. 許  富  皓

30

Click Fraud (1) - Use the Browser’s URL Field

Page 31: Botnet Dr. 許  富  皓

Retrieve a URL from Latest Version of Google Search Results – using Chrome

31

Page 32: Botnet Dr. 許  富  皓

Move the cursor over the hyperlink

32

Page 33: Botnet Dr. 許  富  皓

33

Click the right button of the mouse

Page 34: Botnet Dr. 許  富  皓

34

Choose Inspect element from the pop-up menu

Page 35: Botnet Dr. 許  富  皓

35

Click Fraud (2) – Connect to the Google Server Directly

Attackers could launch the same attacks by opening an HTTP connection to a Google server and sending the URL in the previous slide to that server directly.

Page 36: Botnet Dr. 許  富  皓

36

Click Fraud (3) - Use Fake Page (1)

Page 37: Botnet Dr. 許  富  皓

37

Click Fraud (3) - Use Fake Page (2) [Mr. 東 ]

Page 38: Botnet Dr. 許  富  皓

38

Click Fraud (3) - Use Fake Page (3)

Page 39: Botnet Dr. 許  富  皓

39

Manipulating online Polls/Games

Since every bot has a distinct IP address, every vote will have the same credibility as a vote cast by a real person.

Online games can be manipulated in a similar way.

Currently we are aware of bots being used that way, and there is a chance that this will become more important in the future.

Page 40: Botnet Dr. 許  富  皓

40

Mass Identity Theft

Often the combination of different functionality described above can be used for large scale identity theft, one of the fastest growing crimes on the Internet.

Bogus emails ("phishing mails") that pretend to be legitimate (such as fake PayPal or banking emails) ask their intended victims to go online and submit their private information.

These fake emails are generated and sent by bots via their spamming mechanism.

These same bots can also host multiple fake websites pretending to be eBay, PayPal, or a bank, and harvest personal information.

Just as quickly as one of these fake sites is shut down, another one can pop up.

In addition, keylogging and sniffing of traffic can also be used for identity theft.

Page 41: Botnet Dr. 許  富  皓

41

What Is IRC, and How Does It Work? [David Caraballo et al.]

IRC (Internet Relay Chat) provides a way of communicating in real time with people from all over the world.

It consists of various separate networks (or "nets") of IRC servers, machines that allow users to connect to IRC.

The largest nets are EFnet (the original IRC net, often having more than 32,000 people at once), Undernet, IRCnet, DALnet, and NewNet.

Page 42: Botnet Dr. 許  富  皓

42

IRC Client

Generally, the user (such as you) runs a program (called a "client") to connect to a server on one of the IRC nets.

The server relays information to and from other servers on the same net.

Recommended clients:

UNIX/shell: ircII

Windows: mIRC

Macintosh clients

Page 43: Botnet Dr. 許  富  皓

43

IRC Bot [Wikipedia]

An IRC bot is a set of scripts or an independent program that connects to Internet Relay Chat as a client, and so appears to other IRC users as another user.

It differs from a regular client in that instead of providing interactive access to IRC for a human user, it performs automated functions.

Page 44: Botnet Dr. 許  富  皓

44

IRC Channels

Once connected to an IRC server on an IRC network, you will usually join one or more "channels" and converse with others there.

On IRC, channels are where people meet and chat. You may know them as "chat rooms".

Channel names usually begin with a #, as in #irchelp.

Conversations may be public (where everyone in a channel can see what you type) or private (messages between only two people, who may or may not be on the same channel).
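On the wire, the difference between a public and a private conversation is simply the target of the message. The sketch below parses one IRC protocol line in the general form used by the IRC RFCs (optional `:prefix`, command, parameters, optional trailing text after " :") and checks whether the target is a channel. The nicknames, host names and function names are hypothetical, and the parser is a minimal illustration rather than a complete implementation of the protocol.

```python
def parse_irc_line(line):
    """Split one IRC protocol line into (prefix, command, params)."""
    prefix = None
    if line.startswith(":"):
        # The sender prefix runs up to the first space.
        prefix, line = line[1:].split(" ", 1)
    if " :" in line:
        # Everything after " :" is a single trailing parameter (the message text).
        head, trailing = line.split(" :", 1)
        params = head.split() + [trailing]
    else:
        params = line.split()
    command = params.pop(0)
    return prefix, command, params

def is_channel_message(params):
    """True when the first parameter (the target) looks like a channel name."""
    return bool(params) and params[0].startswith("#")

public = parse_irc_line(":alice!a@example.org PRIVMSG #irchelp :hello there")
private = parse_irc_line(":alice!a@example.org PRIVMSG bob :hi")
print(public, is_channel_message(public[2]))
print(private, is_channel_message(private[2]))
```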

Page 45: Botnet Dr. 許  富  皓

45

Scheme of an IRC-Network [wikipedia]

[Figure legend: normal clients, bots, bouncers]

Page 46: Botnet Dr. 許  富  皓

46

Command and Control (C&C) System

C&C works as follows.

A botmaster sets up a C&C server, typically an IRC server.

After a bot virus infects a host, it will connect back to the C&C server and wait for the botmaster’s command.

In a typical IRC botnet, the bot will join a certain IRC channel to listen to messages from its master.

Page 47: Botnet Dr. 許  富  皓

47

Categories of C&C

C&C systems can be roughly categorized into three different models:

the centralized model

the peer-to-peer (P2P) model

the random model

P.S.: Given the quickly evolving nature of botnets, future botnets may use new command and control systems that are completely different from any of these.

Page 48: Botnet Dr. 許  富  皓

48

Centralized C&C Model

In the centralized model, a botmaster selects a single high bandwidth host to be the contacting point (C&C server) of all the bots.

The C&C server, usually a compromised computer as well, would run certain network services such as IRC, HTTP, etc.

When a new computer is infected by a bot, it will join the botnet by initiating a connection to the C&C server.

Once joined to the appropriate C&C server channel, the bot would then wait on the C&C server for commands from the botmaster.

Botnets may have mechanisms to protect their communications. For example, IRC channels may be protected by passwords known only to bots and their masters to prevent eavesdropping.

Page 49: Botnet Dr. 許  富  皓

49

Popularity of the Centralized C&C Model

The centralized model is the predominant C&C model used by early botnets.

Many well known bots, such as AgoBot, SDBot and RBot, fall into the category of the centralized C&C model.

Page 50: Botnet Dr. 許  富  皓

50

Why the Centralized C&C Model (1) ?

Due to the rich variety of software tools (e.g., IRC bot scripts on IRC servers and IRC bots), the centralized C&C model is rather simple to implement and customize.

Notice that a botmaster can easily control thousands of bots using the centralized model.

Botmasters are profit driven; hence, they are more interested in the centralized C&C model which allows them to control as many bots as possible and maximize their profit.

Page 51: Botnet Dr. 許  富  皓

51

Why the Centralized C&C Model (2) ?

Message latency in the centralized model is small.

Therefore, it is easy for botmasters to coordinate botnets and launch attacks.

Page 52: Botnet Dr. 許  富  皓

52

Drawback of the Centralized C&C Model

The C&C server is the crucial place where most of the conversation happens.

Therefore, the C&C server is the weakest link in a botnet.

If we can manage to discover and destroy the C&C server, the entire botnet will be gone.

Page 53: Botnet Dr. 許  富  皓

53

Motivation for a P2P-Based C&C Model

Some botnet authors have started to build alternative botnet communication systems, which are more resilient to failures in the network.

An interesting C&C paradigm exploits the idea of P2P communication.

For instance, certain variants of Phatbot have used P2P communication as a means to control botnets.

References on P2P: [Kazaa] [Mac_P2P] [P2P network] [CS_NCTU] [DHT_ACT] [DHT_Duke] [DHT_wiki]

Page 54: Botnet Dr. 許  富  皓

54

P2P Applications [ACT]

[Figure: timeline (1999, 2000, 2001, 2002, …) of P2P applications: Napster, Gnutella, LimeWire, Morpheus, FastTrack, Kazaa, iMesh & Grokster, eDonkey, DC++, OverNet, BitTorrent, eXeem, eDonkey2000]

Page 55: Botnet Dr. 許  富  皓

55

Features of the P2P-Based C&C Model

Compared with the centralized C&C model, the P2P based C&C model is much harder to discover and destroy.

Since the communication system doesn’t heavily depend on a few selected servers, destroying a single, or even a number of bots, won’t necessarily lead to the destruction of an entire botnet.

Because of this, the P2P based C&C model has been used increasingly in botnets.

Page 56: Botnet Dr. 許  富  皓

56

Constraints of the P2P C&C Model (1)

Existing P2P systems only support conversations of small user groups, usually in the range of 10-50 users.

The group size supported by P2P systems is too small compared to the size of centralized C&C botnets, in which a botnet of 1000 compromised hosts is still on the small side.

Page 57: Botnet Dr. 許  富  皓

57

Constraints of the P2P C&C Model (2)

Existing P2P systems guarantee neither message delivery nor bounded propagation latency.

Therefore, if using P2P communication, a botnet would be harder to coordinate than those which use centralized C&C models.

Page 58: Botnet Dr. 許  富  皓

58

Trend of the P2P C&C Model

The above two constraints have limited the wider adoption of P2P based communication in botnets.

As the knowledge on implementing P2P based botnets accumulates, new P2P-based botnets, which overcome the above limitations, may appear.

As such, more and more botnets will move to use P2P based communication since it is more robust than centralized C&C communication.

Page 59: Botnet Dr. 許  富  皓

59

Timeline of Peer-to-Peer Protocols and Bots [Grizzard et al.]

Date     Name            Type               Distinguishing Description
12/1993  EggDrop         Non-Malicious Bot  Recognized as early popular non-malicious IRC bot
04/1998  GTbot Variants  Malicious Bot      IRC bot based on mIRC executables and scripts
05/1999  Napster         Peer-to-Peer       First widely used hybrid central and peer-to-peer service
11/1999  Direct Connect  Peer-to-Peer       Variation of Napster hybrid model
03/2000  Gnutella        Peer-to-Peer       First decentralized peer-to-peer protocol
09/2000  eDonkey         Peer-to-Peer       Used checksum directory lookup for file resources
03/2001  FastTrack       Peer-to-Peer       Use of supernodes within the peer-to-peer architecture
05/2001  WinMX           Peer-to-Peer       Proprietary protocol similar to FastTrack
06/2001  Ares            Peer-to-Peer       Has ability to penetrate NATs with UDP punching

Page 60: Botnet Dr. 許  富  皓

60

Timeline of Peer-to-Peer Protocols and Bots

Date     Name             Type           Distinguishing Description
07/2001  BitTorrent       Peer-to-Peer   Uses bandwidth currency to foster quick downloads
04/2002  SDbot Variants   Malicious Bot  Provided own IRC client for better efficiency
10/2002  Agobot Variants  Malicious Bot  Incredibly robust, flexible, and modular design
04/2003  Spybot Variants  Malicious Bot  Extensive feature set based on Agobot
05/2003  WASTE            Peer-to-Peer   Small VPN-style network with RSA public keys
09/2003  Sinit            Malicious Bot  Peer-to-peer bot using random scanning to find peers
11/2003  Kademlia         Peer-to-Peer   Uses distributed hash tables for decentralized architecture
03/2004  Phatbot          Malicious Bot  Peer-to-peer bot based on WASTE
03/2006  SpamThru         Malicious Bot  Peer-to-peer bot using custom protocol for backup
04/2006  Nugache          Malicious Bot  Peer-to-peer bot connecting to predefined peers
01/2007  Peacomm          Malicious Bot  Peer-to-peer bot based on Kademlia

Page 61: Botnet Dr. 許  富  皓

61

Random C&C Model

In the proposed random C&C model, a bot will not actively contact other bots or the botmaster.

Rather, a bot would listen for incoming connections from its botmaster.

To launch attacks, a botmaster would scan the Internet to discover its bots.

When a bot is found, the botmaster will issue commands to the bot.

Although this C&C model has not been used in real world botnets, it is potentially interesting to certain future types of botnets that want high survivability.

Page 62: Botnet Dr. 許  富  皓

62

Constraints of Random C&C Model

While such a C&C model is easy to implement and highly resilient to discovery and destruction, the model intrinsically has a scalability problem and is difficult to use for large scale, coordinated attacks.

Page 63: Botnet Dr. 許  富  皓

63

Rallying Mechanisms

Page 64: Botnet Dr. 許  富  皓

64

Rallying Mechanisms

Rallying mechanisms are critical for botnets to discover new bots and rally them under their botmasters.

Page 65: Botnet Dr. 許  富  皓

65

Hard-coded IP Address

A common method used to rally new bots works like this:

A bot includes hard-coded C&C server IP addresses in its binary.

When the bot initially infects a computer, the computer will connect back to the C&C server using the hard-coded server IP address that is contained in the binary code.

Page 66: Botnet Dr. 許  富  皓

66

Drawbacks of Hard-coded IP Address

The problem with using hard-coded IP addresses is that the C&C server can be easily detected and the communication channel easily blocked.

If a C&C server is "disconnected" in this fashion, a botnet may be completely deactivated.

Because of this, hard-coded server IP addresses are used less often by recent variants of bots.
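From the defender's side, a hard-coded address is easy to block because it never changes. The sketch below, using Python's standard `ipaddress` module, checks outbound flows against a blocklist of known C&C addresses. The blocklist entries are documentation addresses and the flow records are made up for illustration.

```python
import ipaddress

# Hypothetical blocklist of known C&C addresses (documentation ranges, not real servers).
BLOCKLIST = [
    ipaddress.ip_network("192.0.2.10/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def is_blocked(dst_ip):
    """True when the destination falls inside any blocklisted network."""
    addr = ipaddress.ip_address(dst_ip)
    return any(addr in net for net in BLOCKLIST)

# (internal source, external destination) pairs, e.g. taken from a firewall log.
flows = [
    ("10.0.0.5", "192.0.2.10"),
    ("10.0.0.6", "203.0.113.7"),
    ("10.0.0.7", "198.51.100.23"),
]

# Internal hosts that tried to reach a blocklisted address are likely infected.
suspects = sorted({src for src, dst in flows if is_blocked(dst)})
print(suspects)
```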

Page 67: Botnet Dr. 許  富  皓

67

Dynamic DNS Domain Name

Bots today often include hard-coded domain names, assigned by dynamic DNS providers.

Page 68: Botnet Dr. 許  富  皓

68

Benefit of Dynamic DNS Domain Name (1)

The benefit of using dynamic DNS is that, if a C&C server is shut down by authorities, the botmaster can easily resume his/her control by creating a new C&C server somewhere else and updating the IP address in the corresponding dynamic DNS entry.

When connections to the old C&C server fail, the bots will perform DNS queries and be redirected to the new C&C server.

This DNS redirection behavior is often known as herding.

Page 69: Botnet Dr. 許  富  皓

69

Benefit of Dynamic DNS Domain Name (2)

Using dynamic DNS names, a botmaster can retain control of its botnet when an existing C&C server fails to function.

Sometimes, a botmaster will also update the dynamic DNS entry periodically to shift the locations of the command and control server, making the detection harder.

Page 70: Botnet Dr. 許  富  皓

70

Distributed DNS Service

Some of the newer botnet breeds run their own distributed DNS service at locations that are out of the reach of law enforcement or other authorities.

Bots include the addresses of these DNS servers and contact these servers to resolve the IP addresses of C&C servers.

Many times, these DNS services are chosen to run at high port numbers in order to evade the detection by security devices at gateways.

The botnets using distributed DNS service to rally their bots are the hardest to detect and destroy, compared with other types of botnets discussed.

Page 71: Botnet Dr. 許  富  皓

71

Communication Protocols

Page 72: Botnet Dr. 許  富  皓

72

Communication Protocols

Bots communicate with each other and their botmasters following certain well-defined network protocols.

In most cases, botnets don’t create new network protocols for their communication.

Instead, they use existing communication protocols that are implemented by publicly available software tools, e.g., the IRC protocol itself and the publicly available software implementations of IRC servers and clients.

Page 73: Botnet Dr. 許  富  皓

73

The Importance of Understanding the Botnet Comm. Protocols

First, their communication characteristics provide an understanding of the botnets’ origins and the possible software tools being used.

Secondly, understanding the communication protocols helps security researchers to decode the conversations which happen among bots and their masters.

Page 74: Botnet Dr. 許  富  皓

74

Common Botnet Communication Protocols

IRC Protocol

HTTP Protocol

P2P Protocol

… and so on.

Page 75: Botnet Dr. 許  富  皓

75

Evasion Techniques

Page 76: Botnet Dr. 許  富  皓

76

Evasion Techniques – for AV and IDS

A variety of techniques are used by botnets to evade AV and signature based IDS systems, e.g., sophisticated executable packers, rootkits, etc.

These evasion techniques improve the survivability of botnets and the success rate of compromising new hosts.

Page 77: Botnet Dr. 許  富  皓

77

Evasion Techniques – Communication (1)

Additionally, botnets have also added (and continue to add) new mechanisms to hide traces of their communication, e.g. fast-flux.

Some botnets are moving away from IRC, since IRC traffic is increasingly monitored in an effort to detect botnets.

Instead, botnets are starting to use modified IRC protocols or other protocols altogether (e.g., HTTP, VoIP) for their communication channels.

Page 78: Botnet Dr. 許  富  皓

78

Evasion Techniques – Communication (2)

Encryption schemes are also being used to prevent the content from being revealed.

Certain state-of-the-art botnets even use covert channel communications such as TCP and ICMP tunneling, and even IPv6 tunneling.

There have also been techniques that use Skype and IM to support communication.

Page 79: Botnet Dr. 許  富  皓

79

Observable Activities

Page 80: Botnet Dr. 許  富  皓

80

Other Observable Activities

In order to detect the presence of botnets, we need to discover abnormal behaviors exhibited by botnets.

The observable behaviors of botnets can be categorized into three types:

network-based behavior

host-based behavior

global correlated behavior

Page 81: Botnet Dr. 許  富  皓

81

Network-based Behaviors

Botmasters need to communicate with their bots and launch attacks. When performing these functions, botnets will generate certain observable network traffic patterns that we can use to detect individual bots and their C&C servers.

1. Observable Communication

2. Observable Attacking Traffic

Page 82: Botnet Dr. 許  富  皓

82

Observable Communication (1)

Since botnets often use IRC and HTTP to communicate with their bots, observable IRC and HTTP traffic with abnormal patterns can be used to indicate the presence of bots and the C&C servers. For example:

inbound/outbound IRC traffic to an interior enterprise network where IRC service is not allowed

IRC conversations that follow certain syntax conventions that humans don’t readily understand
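The first indicator can be checked with a simple policy filter over flow records. The sketch below is a minimal illustration: the internal address range, the port list (the conventional IRC ports 6660-6669 and 6697) and the flow tuples are all assumptions, and a bot using a non-standard port would not be caught this way.

```python
import ipaddress

INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")   # assumed enterprise range
IRC_PORTS = set(range(6660, 6670)) | {6697}          # conventional IRC ports (assumption)

def irc_policy_violations(flows):
    """Return internal hosts seen on either end of a flow that touches an IRC port."""
    hits = set()
    for src, sport, dst, dport in flows:
        if sport in IRC_PORTS or dport in IRC_PORTS:
            for host in (src, dst):
                if ipaddress.ip_address(host) in INTERNAL_NET:
                    hits.add(host)
    return sorted(hits)

# (source IP, source port, destination IP, destination port)
flows = [
    ("10.1.2.3", 49152, "203.0.113.9", 6667),    # outbound IRC
    ("10.1.2.4", 49153, "203.0.113.10", 443),    # ordinary HTTPS
    ("203.0.113.11", 6697, "10.1.2.5", 50000),   # inbound traffic from an IRC port
]

violations = irc_policy_violations(flows)
print(violations)
```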

Page 83: Botnet Dr. 許  富  皓

83

Observable Communication (2)

Many botnets use dynamic DNS domain names to locate their C&C servers. Thus, abnormal DNS queries may also be used to detect botnets.

In some instances, hosts are found to query for improper domain names (e.g., cheese.dns4biz.org, butter.dns4biz.org), which can indicate a high probability that these hosts are compromised.

The next logical step in this methodology would be to attempt to glean the IP addresses of their C&C servers in observable traffic streams.

If further detective work reveals that the IP address associated with a particular domain name keeps changing periodically, it can provide an even stronger indication of the presence of a botnet.
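The last indicator, a name whose address keeps changing, can be sketched as a pass over a DNS answer log. The log format, the addresses and the threshold of three distinct addresses are assumptions chosen for illustration; legitimate services such as content delivery networks also rotate addresses, so a hit is a reason to investigate, not proof.

```python
from collections import defaultdict

def changing_domains(dns_log, min_distinct_ips=3):
    """Map each name seen with at least min_distinct_ips addresses to those addresses."""
    ips = defaultdict(set)
    for _timestamp, name, ip in dns_log:
        ips[name.lower()].add(ip)
    return {name: sorted(addrs) for name, addrs in ips.items()
            if len(addrs) >= min_distinct_ips}

# (timestamp, queried name, answered address); all values are made up.
dns_log = [
    (100, "cheese.dns4biz.org", "192.0.2.1"),
    (200, "www.example.com", "203.0.113.50"),
    (300, "cheese.dns4biz.org", "198.51.100.2"),
    (400, "www.example.com", "203.0.113.50"),
    (500, "cheese.dns4biz.org", "203.0.113.3"),
]

flagged = changing_domains(dns_log)
print(flagged)
```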

Page 84: Botnet Dr. 許  富  皓

84

Observable Communication (3)

Moreover, botnets may exhibit additional network abnormalities that allow us to discover them.

One example is that bots are usually idle most of the time in a connection, yet respond faster than a human being at the keyboard surfing the web.

Another example is that some communication traffic originated by botnets is more "bursty" than normal traffic.

So, botnets can potentially be discovered by monitoring network traffic flow.
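One simple way to put a number on "bursty" is the coefficient of variation of the gaps between packets: evenly spaced traffic scores near zero, while long idle periods broken by rapid bursts score high. This metric is a choice made here for illustration, not one named in the slides, and the timestamps are made up.

```python
from statistics import mean, pstdev

def burstiness(timestamps):
    """Coefficient of variation of inter-arrival gaps; higher means burstier."""
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return 0.0
    return pstdev(gaps) / mean(gaps)

steady = [0, 10, 20, 30, 40, 50]                     # one packet every 10 seconds
bursty = [0, 0.1, 0.2, 0.3, 60, 60.1, 60.2, 120]     # short bursts, long idle periods

print(burstiness(steady), burstiness(bursty))
```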

Page 85: Botnet Dr. 許  富  皓

85

Observable Attacking Traffic

The traffic generated by botnets allows us to discover their presence. For example,

When launching DDoS TCP SYN flood attacks, botnets can send out a large number of invalid TCP SYN packets with fake source IP addresses.

Therefore, if a network monitoring device finds a large number of outbound TCP SYN packets that have invalid source IP addresses (i.e., IP addresses that should not come from the internal network), it would indicate that some internal hosts may be compromised and actively participating in a DDoS attack.

Similarly, if an internal host is found to send out phishing e-mails, there is an indication that the host is infected by bots as well.
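The SYN flood check described above amounts to egress filtering: count outbound SYN packets whose source address does not belong to the internal network. The sketch below assumes packets have already been reduced to (source IP, TCP flag string) pairs captured on the outbound interface; the internal range and the sample packets are illustrative.

```python
import ipaddress

INTERNAL_NETS = [ipaddress.ip_network("10.0.0.0/8")]   # assumed internal address space

def count_spoofed_syns(packets):
    """Count outbound pure-SYN packets whose source is not an internal address."""
    count = 0
    for src, flags in packets:
        is_syn = "S" in flags and "A" not in flags      # SYN without ACK
        internal = any(ipaddress.ip_address(src) in net for net in INTERNAL_NETS)
        if is_syn and not internal:
            count += 1
    return count

# (source IP, TCP flags) observed leaving the network; all values are made up.
packets = [
    ("10.0.0.5", "S"),        # legitimate connection attempt
    ("198.51.100.7", "S"),    # spoofed source
    ("203.0.113.8", "S"),     # spoofed source
    ("10.0.0.6", "SA"),       # SYN-ACK reply from an internal server
    ("192.0.2.99", "A"),      # not a SYN
]

spoofed = count_spoofed_syns(packets)
print(spoofed)
```

A monitor would raise an alert when this count exceeds a site-specific threshold over a short interval.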

Page 86: Botnet Dr. 許  富  皓

86

Host Based Behavior

Bots compromise computers and hide their presence just like many older computer viruses.

Therefore, they exhibit certain observable behaviors at compromised hosts, as viruses do.

When executing, bots will make sequences of system/library calls, e.g.

modifying system registries and system files

creating network connections

disabling antivirus programs

The sequences of system/library calls made by bots are often different from legitimate programs and applications.
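A common way to compare call sequences is to break them into short sliding windows (n-grams) and measure how many windows of a new trace never occur in traces of known-good programs. The sketch below illustrates that idea; it is one possible technique, not necessarily the one the slides have in mind, and the call names are hypothetical placeholders rather than real API names.

```python
def ngrams(calls, n=3):
    """Set of all length-n sliding windows over a call sequence."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}

def unseen_fraction(trace, baseline, n=3):
    """Fraction of the trace's n-grams that never appear in the baseline."""
    observed = ngrams(trace, n)
    if not observed:
        return 0.0
    return len(observed - baseline) / len(observed)

# Baseline built from a trace of a legitimate program (hypothetical call names).
baseline = ngrams(["open", "read", "close", "open", "read", "close"])

normal = ["open", "read", "close"]
odd = ["open_registry", "write_registry", "connect", "kill_process"]

print(unseen_fraction(normal, baseline), unseen_fraction(odd, baseline))
```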

Page 87: Botnet Dr. 許  富  皓

87

Global Correlated Behaviors

Perhaps botnet behavior observed in a global snapshot is the most interesting one from the viewpoint of detection efficiency.

Those global behavioral characteristics are often tied to the fundamental structures and mechanisms of botnets.

Consequently, they are unlikely to change from botnet to botnet unless the structures and mechanisms of botnets themselves are redesigned and re-implemented.

As a result, these globally observable behaviors are the most valuable to detect families of botnets.

Page 88: Botnet Dr. 許  富  皓

88

Global Correlated Behaviors – DNS Traffic (1)

Many botnets use dynamic DNS entries to track their C&C servers.

As a new C&C server is built, the related DNS entry will be updated to the IP address of the new C&C server. Therefore, bots will find the location of the new C&C server.

Botmasters may herd their botnets to different C&C servers’ locations periodically to prevent detections.

Page 89: Botnet Dr. 許  富  皓

89

Global Correlated Behaviors – DNS Traffic (2)

When a botmaster updates its dynamic DNS entry for a C&C server, there would be an observable global behavior on the Internet. Specifically:

bots are disconnected from the old C&C server

bots will query their DNS server for the new IP address of the domain name, resulting in an increase of DNS queries to this DNS entry globally

Page 90: Botnet Dr. 許  富  皓

90

Global Correlated Behaviors – DNS Traffic (3)

Therefore, if a network monitor discovers that a dynamic DNS entry is updated and that the update is followed by a significant number of DNS queries to this entry, then there is a high probability that this dynamic DNS domain name is being used by botnet C&C servers.

Such a feature is unlikely to change whether a botnet is using IRC for communication or using HTTP for communication, unless the communication structure is changed.
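The "update followed by a surge of queries" pattern can be sketched as a comparison of query counts just before and just after the update. The window size, the factor of five and the sample counts below are arbitrary illustrative choices, not values from the slides.

```python
def update_followed_by_spike(query_counts, update_index, window=3, factor=5.0):
    """True when queries right after an update far exceed the level just before it."""
    before = query_counts[max(0, update_index - window):update_index]
    after = query_counts[update_index:update_index + window]
    if not before or not after:
        return False
    baseline = sum(before) / len(before)
    # Guard against a near-zero baseline making every small count look like a spike.
    return max(after) >= factor * max(baseline, 1.0)

# Queries per time interval for one dynamic DNS name; the entry changes at index 4.
herded = [4, 5, 3, 6, 80, 120, 40, 5]
quiet = [4, 5, 3, 6, 5, 4, 6, 5]

print(update_followed_by_spike(herded, 4), update_followed_by_spike(quiet, 4))
```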

Page 91: Botnet Dr. 許  富  皓

91

Domain Name System [Wikipedia]

Page 92: Botnet Dr. 許  富  皓

92

Domain Name System

A lookup mechanism for translating hostnames into IP addresses and vice-versa.

DNS provides the naming standard for IP-based networks.

A globally distributed, loosely coherent, scalable, reliable, dynamic database.

Comprised of three components:

A “name space” (domain)

Servers (name servers) making that name space available

Resolvers (clients) which query the servers about the name space

Page 93: Botnet Dr. 許  富  皓

93

Domain

Domains are “namespaces”.

Everything below .com is in the com domain.

Everything below ripe.net is in the ripe.net domain and in the net domain.
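The nesting of domains follows directly from the labels of a name, read right to left. As a small illustration (the function name is hypothetical), the sketch below lists every domain that a given name belongs to.

```python
def enclosing_domains(name):
    """List the name itself and every parent domain, most specific first."""
    labels = name.rstrip(".").lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]

domains = enclosing_domains("www.ripe.net")
print(domains)
```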

Page 94: Botnet Dr. 許  富  皓

94

Domain Name Space

The domain name space consists of a tree of domain names.

Page 95: Botnet Dr. 許  富  皓

95

Zone

The tree sub-divides into zones beginning at the root zone.

A DNS zone is a subset of the hierarchical domain name structure of the DNS.

Every DNS zone must be assigned a set of authoritative name servers that are installed in NS records in the parent zone.

A single name server can host several zones.

Page 96: Botnet Dr. 許  富  皓

96

Page 97: Botnet Dr. 許  富  皓

97

Delegated Subzone

Administrative responsibility over any zone may be divided, thereby creating additional zones.

Authority for a portion of the old space is said to be delegated, usually in the form of sub-domains, to another name server and administrative entity.

The old zone ceases to be authoritative for the new zone.

Page 98: Botnet Dr. 許  富  皓

98

Comparison of a DNS Zone and DNS Domain [Microsoft] – (1)

Domain name servers store information about part of the domain name space called a zone.

The name servers are authoritative for a particular zone.

A single name server can be authoritative for many zones.

Page 99: Botnet Dr. 許  富  皓

99

Comparison of a DNS Zone and DNS Domain [Microsoft] – (2)

Understanding the difference between a zone and a domain is sometimes confusing.

A zone is simply a portion of a domain.

Page 100: Botnet Dr. 許  富  皓

100

Comparison of a DNS Zone and DNS Domain [Microsoft] – (3)

For example, the domain Microsoft.com may contain all of the data for Microsoft.com, Marketing.microsoft.com, and Development.microsoft.com.

Page 101: Botnet Dr. 許  富  皓

101

Comparison of a DNS Zone and DNS Domain [Microsoft] – (4)

However, the zone Microsoft.com contains only information for Microsoft.com and references to the authoritative name servers for the subdomains.

The zone Microsoft.com can contain the data for subdomains of Microsoft.com if they have not been delegated to another server. For example:

Marketing.microsoft.com may manage its own delegated zone.

Development.microsoft.com may be managed by the parent, Microsoft.com.
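The zone/domain distinction can be made concrete with a small Python sketch. It assumes the set of zones is known in advance (here the two zones from the example, written in lower case); the function name `zone_for` is hypothetical. The zone holding a name is taken to be the longest zone name that is a suffix of it.

```python
def zone_for(name, zones):
    """Return the zone that holds `name`, or None if no known zone covers it.

    Walks from the full name toward the root and returns the first (that is,
    the longest) enclosing name that appears in `zones`.
    """
    labels = name.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    return None


# Marketing has its own delegated zone; Development is kept in the parent zone.
zones = {"microsoft.com", "marketing.microsoft.com"}

print(zone_for("www.marketing.microsoft.com", zones))
print(zone_for("development.microsoft.com", zones))
```

Both names are in the Microsoft.com domain, but only the second one is served from the Microsoft.com zone.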

Page 102: Botnet Dr. 許  富  皓

102

Comparison of a DNS Zone and DNS Domain [Microsoft] – (5)

[Figure: Microsoft.com with subdomains Development.Microsoft.com and Marketing.Microsoft.com; regions labeled "Microsoft.com domain", "Microsoft.com zone", and "Marketing.Microsoft.com domain and zone".]

Page 103: Botnet Dr. 許  富  皓

103

Comparison of a DNS Zone and DNS Domain [Microsoft] – (6)

If there are no subdomains, then the zone and domain are essentially the same.

In this case the zone contains all data for the domain.

Page 104: Botnet Dr. 許  富  皓

104

Domain Name Formulation (1)

A domain name consists of one or more parts, technically called labels, that are conventionally concatenated, and delimited by dots, such as example.com.

Page 105: Botnet Dr. 許  富  皓

105

Domain Name Formulation (2)

The right-most label conveys the top-level domain.

For example, the domain name www.example.com belongs to the top-level domain com.

Page 106: Botnet Dr. 許  富  皓

106

Domain Name Formulation (3)

The hierarchy of domains descends from right to left; each label to the left specifies a subdivision, or subdomain of the domain to the right.

For example, the label example specifies a subdomain of the com domain, and www is a subdomain of example.com.

This tree of subdivisions may have up to 127 levels.
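The right-to-left hierarchy can be illustrated by splitting a name into its labels. This is a minimal sketch; the helper names `parent_domains` and `top_level_domain` are made up for the illustration.

```python
def parent_domains(name):
    """List the name itself and every enclosing domain, most specific first."""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def top_level_domain(name):
    """Return the right-most label of a domain name."""
    return name.rstrip(".").split(".")[-1]


print(parent_domains("www.example.com"))
print(top_level_domain("www.example.com"))
```

Each entry in the returned list is a subdomain of the entry that follows it.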

Page 107: Botnet Dr. 許  富  皓

107

Domain Name Formulation (4)

A hostname is a domain name that has at least one IP address associated with it.

For example, the domain names www.example.com and example.com are also hostnames, whereas the com domain is not.

Page 108: Botnet Dr. 許  富  皓

108

Structure of the Domain Space – Top Level Domains

Immediately below the root are the Top Level Domains. These consist of

country code Top Level Domains (ccTLDs)

and generic Top Level Domains (gTLDs).

The ccNSO and the GNSO decide the contents of the ccTLDs and gTLDs, respectively.

Page 109: Botnet Dr. 許  富  皓

109

Structure of the Domain Space – Second Level Domains

Below these domains, you have the second level domain names.

These domain names are usually "delegated" by the administrators of the relevant TLD, which means that someone else is responsible for administering that part of the name space.

e.g. the administrators of .ie delegated the domain linux.ie to the Irish Linux Users Group, which means that ILUG are now responsible for administering the domain in any way they see fit without reference to the administrators of .ie.

Once a domain is delegated, the administrators of the domain are responsible for making changes within that domain.

Page 110: Botnet Dr. 許  富  皓

110

Top Level Domain (TLD) Types

Page 111: Botnet Dr. 許  富  皓

111

General TLDs (1)

Page 112: Botnet Dr. 許  富  皓

112

General TLDs (2)

Page 113: Botnet Dr. 許  富  皓

113

DNS Servers and Their Layout

The DNS consists of a hierarchical set of DNS servers.

Each zone (domain) or subzone (subdomain) has one or more authoritative DNS servers that publish information about that zone (domain), and the name servers of any zones (domains) "beneath" it.

The hierarchy of authoritative DNS servers matches the hierarchy of zones (domains).

At the top of the hierarchy stand the root servers: the servers to query when looking up (resolving) a top-level domain name.
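How a lookup walks this hierarchy can be sketched with a toy simulation. Everything in it is hypothetical: the server names, the `SERVERS` table, and the address 192.0.2.80 are invented for the illustration, and no real DNS traffic is involved. Each toy server either answers a name or refers the query to the server for the longest matching delegated domain.

```python
# Toy data: each server either holds answers or refers to a more specific server.
SERVERS = {
    "root": {"refer": {"org": "org-server"}},
    "org-server": {"refer": {"example.org": "ns1.example.org"}},
    "ns1.example.org": {"answer": {"www.example.org": "192.0.2.80"}},
}


def resolve(name, start="root", max_steps=10):
    """Follow referrals from `start`; return (address or None, servers visited)."""
    server = start
    path = [server]
    for _ in range(max_steps):
        data = SERVERS[server]
        if name in data.get("answer", {}):
            return data["answer"][name], path
        # Delegated domains on this server that cover the queried name.
        matches = [d for d in data.get("refer", {})
                   if name == d or name.endswith("." + d)]
        if not matches:
            return None, path
        server = data["refer"][max(matches, key=len)]
        path.append(server)
    return None, path


address, visited = resolve("www.example.org")
print(address, visited)
```

The list of visited servers mirrors the hierarchy of zones: root first, then the top-level domain, then the authoritative server for the domain itself.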

Page 114: Botnet Dr. 許  富  皓

114

DNS Name Server

A DNS name server is a server that stores the DNS records for a zone (domain name), such as

address (A) records,

name server (NS) records,

and mail exchanger (MX) records,

and responds with answers to queries against its database.

Page 115: Botnet Dr. 許  富  皓

115

DNS Server Categories

Server Type: Definition

Root: Any server that acts as a central lookup for other servers to depend on, and does not rely on other servers for name server zone information.

Authoritative: Any server that hosts zones (domains) and returns zone information publicly.

Resolver: A server that performs domain queries for end users but does not host zones (domains) or zone information.

Page 116: Botnet Dr. 許  富  皓

116

Root Name Servers

The top of the hierarchy is served by the root name servers, the servers to query when looking up (resolving) a top-level domain name (TLD).

Page 117: Botnet Dr. 許  富  皓

117

Anycast

Anycast is a network addressing and routing methodology in which datagrams from a single sender are routed to the topologically nearest node in a group of potential receivers all identified by the same destination address.

Page 118: Botnet Dr. 許  富  皓

118

Names of Root Name Servers

While only 13 names are used for the root name servers, there are many more physical servers.

The 13 names are in the form letter.root-servers.net, where letter ranges from A to M.

Each operator uses redundant computer equipment to provide reliable service even if a hardware or software failure occurs.

The C, F, I, J, K, L and M servers now exist in multiple locations on different continents, using anycast address announcements to provide decentralized service.

As a result, most of the physical root servers are now outside the United States.

Page 119: Botnet Dr. 許  富  皓

119

Root Server Addresses [wikipedia][root-servers]

Page 120: Botnet Dr. 許  富  皓

120

Map of all 123 DNS root server instances (including local Anycast instances) at the end of 2006.

Page 121: Botnet Dr. 許  富  皓

121

Authoritative Name Server

The Domain Name System distributes the responsibility of assigning domain names and mapping those names to IP addresses by designating authoritative name servers for each zone (domain).

Authoritative name servers are assigned to be responsible for their particular zones (domains), and in turn can assign other authoritative name servers for their sub-zones (sub-domains).

This mechanism has made the DNS distributed and fault tolerant and has helped avoid the need for a single central register to be continually consulted and updated.

Page 122: Botnet Dr. 許  富  皓

122

Responses of Authoritative Name Servers

An authoritative name server only returns answers to queries about domain names that have been specifically configured by the administrator of the server.

Page 123: Botnet Dr. 許  富  皓

123

Master and Slave Server

An authoritative name server can either be a master server or a slave server.

A master server is a server that stores the original (master) copies of all zone records.

A slave server uses an automatic updating mechanism of the DNS protocol in communication with its master to maintain an identical copy of the master records.

Page 124: Botnet Dr. 許  富  皓

124

Name Server Delegation

Name servers in delegations are identified by name, rather than by IP address.

This means that a resolving name server must issue another DNS request to find out the IP address of the server to which it has been referred.

Page 125: Botnet Dr. 許  富  皓

125

Circular Dependencies and Glue Records

If the name given in the delegation is a subdomain of the domain for which the delegation is being provided, there is a circular dependency.

In this case the name server providing the delegation must also provide one or more IP addresses for the authoritative name server mentioned in the delegation.

This information is called glue.

The delegating name server provides this glue in the form of records in the additional section of the DNS response, and provides the delegation in the authority section of the response.

Page 126: Botnet Dr. 許  富  皓

126

Example

Consider the domain example.org. Assume that the authoritative name server for example.org is ns1.example.org.

A computer trying to resolve www.example.org will first have to resolve ns1.example.org.

Since ns1 is also under example.org, resolving ns1.example.org requires resolving example.org, which is a circular dependency.

To break the dependency, the name server for the org top level domain includes glue along with the delegation for example.org.

The glue records are A and/or AAAA records that provide IP addresses for ns1.example.org. The resolver uses one or more of these IP addresses to satisfy the circular dependency, which allows it to communicate with ns1.example.org and finish resolving the DNS query.
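The condition that makes glue necessary can be written as a one-line check. This is a sketch: the function name `needs_glue` is hypothetical, and the test is simply whether the name server's own name lies inside the domain being delegated.

```python
def needs_glue(delegated_domain, nameserver):
    """True when the name server's name is inside the domain being delegated."""
    domain = delegated_domain.lower().rstrip(".")
    ns = nameserver.lower().rstrip(".")
    # Match on whole labels, so "notexample.org" is not treated as "example.org".
    return ns == domain or ns.endswith("." + domain)


print(needs_glue("example.org", "ns1.example.org"))   # name server inside the domain
print(needs_glue("example.org", "ns1.example.net"))   # name server elsewhere
```

When the check is true, the parent zone must supply A and/or AAAA records for the name server along with the delegation.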

Page 127: Botnet Dr. 許  富  皓

127

DNS Resolution Sequence (1)

Page 128: Botnet Dr. 許  富  皓

128

DNS Resolution Sequence (2)

root domain server

Page 129: Botnet Dr. 許  富  皓

129

Record Caching

Because of the large volume of requests generated in the DNS for the public Internet, the designers wished to provide a mechanism to reduce the load on individual DNS servers.

To this end, the DNS resolution process allows for caching of records for a period of time after an answer.

This entails the local recording and subsequent consultation of the copy instead of initiating a new request upstream.

Page 130: Botnet Dr. 許  富  皓

130

TTL

The time for which a resolver caches a DNS response is determined by a value called the time to live (TTL) associated with every record.

The TTL is set by the administrator of the DNS server handing out the authoritative response.

The period of validity may vary from just seconds to days or even weeks.
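A resolver-side cache that honors the TTL can be sketched as follows. The class name `DnsCache` and its methods are hypothetical; the clock is injectable so that expiry can be demonstrated without waiting, and real resolvers are considerably more involved.

```python
import time


class DnsCache:
    """Minimal TTL cache keyed by (name, record type)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, rtype) -> (expiry time, data)

    def put(self, name, rtype, data, ttl):
        """Store `data` for `ttl` seconds."""
        self._entries[(name, rtype)] = (self._clock() + ttl, data)

    def get(self, name, rtype):
        """Return cached data, or None if it is absent or its TTL has expired."""
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._entries[(name, rtype)]
            return None
        return data


# Demonstration with a fake clock (seconds), so no real waiting is needed.
now = [0.0]
cache = DnsCache(clock=lambda: now[0])
cache.put("www.example.com", "A", "192.0.2.10", ttl=180)
now[0] = 179.0
print(cache.get("www.example.com", "A"))   # still cached
now[0] = 180.0
print(cache.get("www.example.com", "A"))   # expired, a new upstream query is needed
```

A short TTL such as 180 seconds forces clients back to the authoritative server every few minutes, which is what makes rapid record changes visible quickly.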

Page 131: Botnet Dr. 許  富  皓

131

Resource Record

A Resource Record (RR) is the basic data element in the domain name system.

Each record has a type (A, MX, etc.), an expiration time limit, a class, and some type-specific data.

Resource records with the same name, class, and type form a resource record set (RRset).

Page 132: Botnet Dr. 許  富  皓

132

RR (Resource record) Fields

Page 133: Botnet Dr. 許  富  皓

133

TYPE Field

TYPE is the record type.

It indicates the format of the data and it gives a hint of its intended use. For example:

the A record is used to translate from a domain name to an IPv4 address;

the NS record lists which name servers can answer lookups on a DNS zone;

the MX record specifies the mail server used to handle mail for a domain specified in an e-mail address.

Page 134: Botnet Dr. 許  富  皓

134

Zone File [wikipedia]

A Domain Name System (DNS) zone file is a text file that describes a DNS zone.

A zone file is a sequence of entries for resource records.

Each line is a text description that defines a single resource record (RR).
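A single zone file line can be turned into a structured record with a few lines of Python. This sketch assumes the fully spelled-out five-field form `name ttl class type data`; it does not handle directives such as `$TTL`, omitted fields, or records that span several lines. The names `ResourceRecord` and `parse_record` are hypothetical.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    name: str     # owner name
    ttl: int      # time to live in seconds
    rclass: str   # class, normally IN
    rtype: str    # record type such as A, NS, or MX
    rdata: str    # type-specific data


def parse_record(line):
    """Parse one 'name ttl class type data' line; text after ';' is a comment."""
    text = line.split(";", 1)[0].strip()
    if not text:
        return None
    name, ttl, rclass, rtype, rdata = text.split(None, 4)
    return ResourceRecord(name, int(ttl), rclass.upper(), rtype.upper(), rdata)


print(parse_record("www.example.com. 3600 IN A 192.0.2.10 ; web server"))
print(parse_record("example.com. 3600 IN MX 10 mail.example.com."))
```

The type-specific data is kept as a single string because its layout depends on the record type, as the MX line shows.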

Page 135: Botnet Dr. 許  富  皓

135

A Zone File Example

Page 136: Botnet Dr. 許  富  皓

136

Fast Flux [Riden][SSAC]

Page 137: Botnet Dr. 許  富  皓

137

Fast-flux Service Networks

A fast-flux service network is a network of compromised computer systems with public DNS records that are constantly changing, in some cases every few minutes.

These constantly changing architectures make it much more difficult to track down criminal activities and shut down their operations.

Page 138: Botnet Dr. 許  富  皓

138

Goal of Fast-Flux

The goal of fast-flux is for a fully qualified domain name (such as www.example.com) to have multiple (hundreds or even thousands) IP addresses assigned to it.

These IP addresses are swapped in and out of flux with extreme frequency, using a combination of round-robin IP addresses and a very short Time-To-Live (TTL) for any given particular DNS Resource Record (RR).

Page 139: Botnet Dr. 許  富  皓

139

Web Request – Normal Network

Page 140: Botnet Dr. 許  富  皓

140

Web Request – Fast Flux

Page 141: Botnet Dr. 許  富  皓

141

DNS Resolution – Single Flux

Page 142: Botnet Dr. 許  富  皓

142

DNS Resolution – Double Flux

Page 143: Botnet Dr. 許  富  皓

143

DNS Resolution – Double Flux

Page 144: Botnet Dr. 許  富  皓

144

Build a Fast-Flux Service Network (1)

Fast flux users often register domain names for their illegal activities at an accredited registrar or reseller.

In one form of attack, the fast flux customer registers

a domain name (for a flux service network) to host illegal web sites (boguswebsitesexample.tld)

and a second domain name (or several) for a flux service network to provide name resolution service (nameserverservicenetwork.tld).

Page 145: Botnet Dr. 許  富  皓

145

Build a Fast-Flux Service Network (2)

The fast flux service network operator uses automated techniques to rapidly change name server information in the registration records maintained by the registrar for these domains.

In particular, the fast flux service network operator

changes the IP addresses of the domain's name servers to point to different hosts in the domain nameserverservicenetwork.tld (the domain in charge of providing IP information for hosts in the domain boguswebsitesexample.tld)

and sets the times to live (TTLs) in the address records for these name servers to a very small value (1-3 minutes is common).

Page 146: Botnet Dr. 許  富  皓

146

Build a Fast-Flux Service Network (3)

Resource records associated with a name server domain used in fast flux hosting might appear in a TLD zone file as:

$TTL 180

boguswebsitesexample.tld. NS NS1.nameserverservicenetwork.tld

boguswebsitesexample.tld. NS NS2.nameserverservicenetwork.tld

…

NS1.nameserverservicenetwork.tld. A 10.0.0.1

NS2.nameserverservicenetwork.tld. A 10.0.0.2

Page 147: Botnet Dr. 許  富  皓

147

Build a Fast-Flux Service Network (4)

Note that the time-to-live (TTL) for the resource records is set very low (in the example, 180 seconds).

When the TTL expires, the fast flux service network operator's automation assures that a new set of A records for name servers replaces the existing set:

$TTL 180

boguswebsitesexample.tld. NS NS1.nameserverservicenetwork.tld

boguswebsitesexample.tld. NS NS2.nameserverservicenetwork.tld

…

NS1.nameserverservicenetwork.tld. A 192.168.0.123

NS2.nameserverservicenetwork.tld. A 10.10.10.233

Page 148: Botnet Dr. 許  富  皓

148

Build a Fast-Flux Service Network (5)

Records associated with the illegal web site might appear in a zone file hosted on a DNS bot in the nameserverservicenetwork.tld network as:

boguswebsitesexample.tld. 180 IN A 192.168.0.1

boguswebsitesexample.tld. 180 IN A 172.16.0.99

boguswebsitesexample.tld. 180 IN A 10.0.10.200

boguswebsitesexample.tld. 180 IN A 192.168.140.11

Page 149: Botnet Dr. 許  富  皓

149

Build a Fast-Flux Service Network (6)

Note again that the time-to-live (TTL) for each A resource record is set very low (in the example, 180 seconds).

When the TTL expires, the resource records would be automatically modified to point to other bots that host this illegal web site. Only minutes later, the zone file might read:

boguswebsitesexample.tld. 180 IN A 192.168.168.14

boguswebsitesexample.tld. 180 IN A 172.17.0.199

boguswebsitesexample.tld. 180 IN A 10.10.10.2

boguswebsitesexample.tld. 180 IN A 192.168.0.111
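The turnover between the two snapshots of the web site's A records can be measured with a short Python sketch. The record lines are the ones shown on the slides; the helper names `a_records` and `churn` are hypothetical, and the churn ratio is only one possible indicator, not a detection method taken from the source.

```python
def a_records(zone_text):
    """Map each owner name to the set of IPv4 addresses in its A records."""
    result = {}
    for line in zone_text.strip().splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[3] == "A":
            result.setdefault(fields[0], set()).add(fields[4])
    return result


def churn(before, after):
    """Fraction of the addresses in `after` that were absent from `before`."""
    if not after:
        return 0.0
    return len(after - before) / len(after)


first = """
boguswebsitesexample.tld. 180 IN A 192.168.0.1
boguswebsitesexample.tld. 180 IN A 172.16.0.99
boguswebsitesexample.tld. 180 IN A 10.0.10.200
boguswebsitesexample.tld. 180 IN A 192.168.140.11
"""

later = """
boguswebsitesexample.tld. 180 IN A 192.168.168.14
boguswebsitesexample.tld. 180 IN A 172.17.0.199
boguswebsitesexample.tld. 180 IN A 10.10.10.2
boguswebsitesexample.tld. 180 IN A 192.168.0.111
"""

name = "boguswebsitesexample.tld."
old = a_records(first)[name]
new = a_records(later)[name]
print(churn(old, new))   # 1.0: every address was replaced
```

A name whose whole address set is replaced within a few TTL periods, together with a very low TTL, matches the pattern the slides describe.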