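Fast community structure identification of small world networks
(A fast method for discovering community structure in small world networks)
Department of Human System Science, Graduate School of Decision Science and
Technology, Tokyo Institute of Technology
Student ID 12M55087, 池 光龍
Master's thesis, 2014. Supervisor: Associate Professor 脇田建
Submission date: July 24, 2014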
September 10, 2014
Abstract
Detecting communities is of great importance for research and applications in
various disciplines where systems are often represented as networks. Existing
work mainly focuses on creating new metrics or on technical improvements that
raise the processing power (mainly speed) and accuracy of community detection.
This paper focuses on the efficiency of community detection, which can also be
one way of improving processing power. For this purpose, we utilize the small
world property, which is also the reason community structures exist, and
propose a measure that indicates this efficiency. In addition, by proposing two
further countermeasures, we finally built a heuristic that detects communities
with very high efficiency, although it outperforms the existing method only on
some of the networks. As a result, although our method failed to improve
processing power, we saw the possibility of improvement using only the small
world property and, furthermore, the possibility of creating a new
high-quality algorithm.
Contents
1 Introduction
1.1 Main contribution
1.2 Outline of the thesis
2 Social networks
2.1 Social networks
2.2 Small world networks
2.3 Scale-free networks
3 Community detection
3.1 Community detection
3.2 Modularity-based community detection
3.3 Louvain method
4 Proposed method
4.1 Basic idea
4.2 Modularity computation efficiency
4.3 Max neighbor community selection
4.4 Changed neighbor community selection
4.5 Hybrid heuristic
4.6 Implementation
5 Evaluation
5.1 Modularity computation efficiency
5.2 Time and modularity
5.3 Community contents
6 Summary
6.1 Summary
6.2 Future works
Bibliography
Glossary
$Q$: Modularity of the whole network.
$\Delta Q_{v_i c_i}$: Change in modularity for one node-neighbor community
pair when the node $v_i$ moves to the neighbor community $c_i$.
$C$: The set of communities in a network, with
$V = \bigcup_{i=0}^{|C|} C_i$ and $C_i \cap C_j = \emptyset$ for $i \neq j$,
$i, j < |C|$.
$E$: The set of edges in a network, $E \subset (V \times V)$.
$G$: A network, $G = \{V, E\}$.
$K$: The set of degrees in a network,
$K = \{k_i \mid k_i = \sum_{j=0}^{|V|} e_{i,j},\ e_{i,j} \in E,\ i, j < |V|\}$.
$V$: The set of nodes in a network.
$|U \to W|$: All available edges between the node sets $U$ and $W$, where
$U, W \subset V$ and $|U \to W| = |(U \times W) \cap E|$.
$|\{c_i\} \to V|$: The total number of edges of community $c_i$.
$|\{c_i\} \to \{c_j\}|$: The weight of the edges in the network that connect
nodes of community $i$ with nodes of community $j$.
$|\{v_i\} \to \{c_j\}|$: The weight of the edges that connect node $v_i$ and
community $c_j$.
$|\{v_i\} \to \{v_j\}|$: The weight of the edge between nodes $v_i$ and $v_j$.
List of Tables
3.1 Comparison of some community optimization methods
4.1 Changes in the number of moved nodes per Pass
5.1 Empirical Datasets
5.2 Synthetic Datasets
5.3 Modularity of all datasets
5.4 Time of dataset scale free
5.5 Datasets1
5.6 Datasets2
5.7 Datasets3
List of Figures
2.1 An example of network
2.2 An example of social network
2.3 An example of adjacency matrix
2.4 Community Structure in a network
2.5 An example of power-law distribution
3.1 Examples of modularity
3.2 Louvain method
3.3 Computation details of Louvain method
4.1 A virtual network including three communities
4.2 Intermediate state of Louvain method
4.3 Modularity change comparison between Louvain and Basic idea
4.4 MCE comparison between Louvain and Basic idea
4.5 The possible movement of one node's unconnected nodes
5.1 Modularity of DBLP
5.2 MCE of DBLP
5.3 Modularity of Web-Google
5.4 MCE of Web-Google
5.5 Modularity of YouTube
5.6 MCE of YouTube
5.7 Modularity of Pokec
5.8 MCE of Pokec
5.9 Modularity of Small world
5.10 MCE of Small world
5.11 Modularity of Scale free
5.12 MCE of Scale free
5.13 MCE changing of a high modularity scale-free network
5.14 Time(s) of each Heuristics
5.15 NMI when only average degree is different
5.16 NMI when only modularity is different
Chapter 1
Introduction
Scientists are interested in a wide variety of interacting systems that are
made up of individuals or components (collections of individuals) connected
together in some way. For instance, human societies are collections of people
or organizations connected by acquaintance or social interaction, such as
business relationships, friendship [12], sexual relationships [33], and so on.
The Internet [18] is a collection of computers and other devices connected by
physical lines, such as optical fiber lines. In the World Wide Web, homepages
are connected by hyperlinks, while in an electric power grid [47], generating
stations and switching substations are connected by high-voltage lines.
Facing these systems, some people are interested in the individuals or
components themselves, for example how a company should approach its
sustainable development [16, 36], or how to build a webpage that attracts more
page views. Others are interested in the connections, for example how to build
and maintain good human relations [8], or what the optimal distance is between
two substations of an electric power grid in a specific area. However, there
are also people who are interested in studying the pattern of connections
between individuals or components, such as which power grid topology has the
strongest capacity to respond to natural disasters, or what kinds of people or
organizations connect to each other easily.
In order to comprehend these patterns of network connections easily, it is
convenient to use graphs to represent them in a given system. A graph is a
simple form made up of just points and lines connecting pairs of points. Here,
we call a point a node and a line an edge [ See figure 2.1 ]. In addition, a
network can be represented as an adjacency matrix. An $n \times n$ adjacency
matrix can represent a network with $n$ nodes, where the matrix element
$a_{i,j}$ indicates the number of edges from node $i$ to node $j$. The
interacting systems above can then be represented as a graph or an adjacency
matrix, where the individuals or components of the system are the nodes and
the connections are the edges (graphs are described in detail in section 2.1).
Using these graphs and adjacency matrices, one can observe the nature of the
networks, or learn about their structure by analyzing them in various ways.
The first use of a graph dates back to Euler's solution of the Königsberg
bridges problem in 1736 [17]. Since then, scientists from a wide variety of
fields have studied networks and tried to solve all kinds of problems [6]. For
example, the shortest path problem [4, 13] asks for the shortest path (minimum
number of edges) between two nodes; the graph coloring problem [28, 23] asks
for a coloring of the nodes of a graph such that no two adjacent nodes share
the same color; and the graph partition problem [32, 20] asks for a division
of a graph into several components such that the number of edges between
separated components is small.
On the other hand, some mathematicians and physicists focus on statistical
properties that most networks seem to share. One of them is the scale-free
property [3]. The degree of a node in a network is the number of other nodes
to which it is connected, and the degree distribution is the probability
distribution of these degrees over the whole network. The scale-free property
states that the degree distribution of many networks follows a power law, with
a small number of nodes having very high degree while the others have low
degree [3, 15]. Another is the small world property. It consists of two parts:
short average distance and community structure. Short average distance means
that the distance between two nodes in most networks is short [38, 62], while
community structure [24] means that most networks contain many groups of
nodes, where nodes in the same group are densely connected but nodes in
different groups are sparsely connected.
Community structure is one of the most important properties found in many
networks. A community is a collection made up of a subset of the nodes in a
network. As mentioned above, its characteristic is that, over the whole
network, nodes in the same community are densely connected, while nodes in
different communities are sparsely connected [ See figure 2.4 ]. A network
consists of many such communities. The earliest paper on communities is Stuart
Rice's paper from 1927, which found communities (called parties or blocs at
that time) in small political bodies, based on the similarity of their voting
patterns [53]. After that, scientists in a variety of fields found community
structures in many networks, and it turned out that these community structures
are not just arbitrary collections of nodes but meaningful ones, e.g. video
topics in a YouTube network [22], research topics in a collaboration network
[42], global economic entities in a worldwide air transportation network [27],
genres in an IMDb (Internet Movie Database) network [19], and cycles and
pathways in metabolic networks [26, 48].
Since the community structures in networks are meaningful, it is necessary to
identify and analyze them. However, the capacity to identify communities by
hand is limited, especially when the network becomes large, although the
accuracy of the result will be high. Therefore, many community detection
methods that can be run automatically by computers have been proposed.
In the community detection literature, there are many kinds of research
activities. They can be classified by the type of network: bipartite networks,
multilayer networks, and homogeneous networks. A bipartite network is a
network whose nodes can be divided into two independent sets $U$ and $V$ such
that edges are only allowed between a node $u_i \in U$ and a node $v_i \in V$.
For instance, the Amazon network is made up of a set of users and a set of
goods, connected by purchasing relationships. A community detection method for
bipartite networks can detect communities that contain nodes from both $U$ and
$V$ [35, 57]. A multilayer network is a more complicated concept that contains
several subclasses. A simple subclass is a network whose nodes are connected
in different patterns under different viewpoints or at different times. For
instance, relations among employees in a company form an acquaintance network
and a superior-subordinate relationship network, and these two networks may
make up a multilayer network [61]. In addition, these relationship networks
form another multilayer network, since the relationships may change over time.
A community detection method for multilayer networks can detect communities
that span multiple layers [39, 40].
The third type is the homogeneous network. Homogeneous networks are
traditional networks that are commonly used in research. Unlike in bipartite
networks, any pair of nodes in the network can be connected. A homogeneous
network contains only one layer, and its nodes are connected under one kind of
interaction. It is valuable to analyze these networks since they simplify the
problem and there are many networks whose topology does not change often.
Community detection research is one of the hottest topics in the network
literature. There are several kinds of approaches. The spectral bisection
method is based on the eigenvectors of the Laplacian matrix built from the
adjacency matrix of a network, while belief propagation divides networks using
the probability that a node belongs to a partition [46]. Hierarchical
clustering is another approach, which builds communities from the similarity
of node pairs [31]. Newman et al. proposed edge betweenness, which focuses on
the number of shortest paths between pairs of nodes that run along an edge,
and modularity, which focuses on the proportion of edges inside communities
versus edges between communities [24, 45].
Detection methods for homogeneous networks mainly focus on accuracy and
processing power (mainly time and scale), since one wants to identify the
communities in a network correctly and the data size of networks keeps
growing [24, 5, 55, 60, 10, 43].
In this research, we focus on processing power. However, unlike existing work
that mainly focuses on creating new metrics or technical improvements to raise
the processing power of community detection, we focus on the efficiency of
community detection as a way of improving processing power, and utilize the
small world property (one of the most important properties of networks
themselves) to achieve this goal. In addition, some technical countermeasures
are introduced to improve the processing power further.
1.1 Main contribution
Main contributions of the thesis can be summarized as follows:
Small world
As mentioned above, the small world property is one of the most important
properties of networks, since it is the reason community structures exist.
However, most previous approaches [25, 30, 9, 44, 10, 60, 5, 55] focus on the
community structures themselves instead of on the reason for their existence,
which is also important and valuable when detecting the structures in
networks. Thus, in this thesis, we approach the community detection problem
from the small world property itself, not from an engineering viewpoint. We
propose possible ways of utilizing the small world property in the community
detection problem, and additionally verify their validity by building some
heuristics and analyzing their results.
Modularity Computation Efficiency ( MCE )
To enhance the processing power of community detection methods, creating new
metrics [44, 51, 6, 46, 49, 30, 24] or attempting technical improvements
[44, 10, 60, 5, 55] have been the common approaches thus far. In this thesis,
however, we attempt to achieve the goal in another way: by improving the
efficiency of community detection, which is a simple but effective approach.
For instance, in the Louvain method [5], one can observe that many useless
computations are performed when detecting communities, which affects the
processing power; one can therefore improve processing power just by reducing
these useless computations. Several countermeasures are taken to reduce
useless computation, and the Modularity Computation Efficiency measure is
proposed to evaluate whether these countermeasures are effective. Here,
modularity [45] is a metric for evaluating how good the detected community
structures are, and in order to reach the best modularity value, which means a
good detection result, modularity computations must be performed iteratively.
Many useless computations are performed during this iteration, and we hope to
reduce this useless part while obtaining the same result as the original
method. Modularity computation efficiency is therefore defined as the quotient
of the amount of modularity change and the number of computations. Using it,
one can evaluate the efficiency of a countermeasure without depending on
hardware, such as computer memory or hard disk.
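As a rough illustration of the measure described above, the following Python
sketch computes modularity computation efficiency as the modularity gained
divided by the number of modularity computations performed. The function name
`modularity_computation_efficiency` and the sample numbers are assumptions made
for illustration only; they are not measurements from this thesis, and the
measure itself is treated in detail in chapter 4.

```python
def modularity_computation_efficiency(modularity_change, num_computations):
    """Modularity gained per modularity computation (illustrative sketch)."""
    if num_computations <= 0:
        raise ValueError("num_computations must be positive")
    return modularity_change / num_computations


# Hypothetical numbers: two runs reach the same modularity gain of 0.5,
# but the second one needs only a quarter of the computations.
baseline = modularity_computation_efficiency(0.5, 4000)
reduced = modularity_computation_efficiency(0.5, 1000)
print(baseline, reduced)
```

Because the value depends only on counts and modularity, two heuristics can be
compared on different machines without timing them.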
Effectiveness
After adding three valid countermeasures, we greatly improved modularity
computation efficiency over the whole process of community detection, while
keeping modularity at the same level as the Louvain method.
1.2 Outline of the thesis
The rest of this thesis is organized as follows. In chapter 2, we first
introduce social networks, including small world networks and scale-free
networks. Small world networks are the subject of our research. Chapter 3
reviews representative existing work in the community detection literature.
The main work of our research is described in chapters 4 and 5. Chapter 4 is
the heart of this thesis; it introduces the basic idea of our research, two
countermeasures that each overcome a drawback, and a metric for evaluating
community detection efficiency in detail. Through the work in chapter 4, we
aim to make community detection more efficient. Chapter 5 demonstrates the
effect of the proposed method in various ways, such as modularity computation
efficiency, time, modularity, and the quality of the detected communities.
Both synthetic networks and empirical networks are studied there. Finally, the
conclusion and future work are given in chapter 6.
Chapter 2
Social networks
In this chapter, we discuss social networks, including their definition,
features, and so on. Small world networks are described concretely, because
the small world property is not only the reason social networks contain
community structure, but also the basis of our proposal. In section 2.3, we
also discuss scale-free networks, another important feature of social
networks.
2.1 Social networks
Figure 2.1: An example of a network with 5 vertices and 7 edges.
Figure 2.2: A classic social network. It represents friendship between members
of a karate club. [65]
A graph is just a simple representation of some structure used in a variety of
scientific fields, consisting of points and lines that connect pairs of
points. In network science, the points are referred to as nodes or vertices,
and the lines are referred to as edges. Mathematically, a graph is represented
as $G = \{V, E\}$, where $V$ is the node set and $E$ is the edge set
($E \subset (V \times V)$). Furthermore, when doing calculation or analysis,
one can use an adjacency matrix to represent which nodes of a network are
adjacent to which other nodes [ See figure 2.3 ].
$$G = \begin{pmatrix}
0 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 0
\end{pmatrix}$$
Figure 2.3: An example of an adjacency matrix. The network on the left (nodes
1 to 5) is represented as the matrix on the right. The matrix element
$g_{i,j} = 1$ if nodes $i$ and $j$ are connected, and $g_{i,j} = 0$ otherwise.
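The adjacency matrix of Figure 2.3 can be reproduced with a few lines of
Python. This is a minimal sketch: the edge list is read off the matrix in the
figure, and the helper name `adjacency_matrix` is an assumption introduced
here for illustration.

```python
def adjacency_matrix(n, edges):
    """Build the n x n adjacency matrix of an undirected graph.

    Nodes are numbered 1..n; entry [i-1][j-1] is 1 if i and j are connected.
    """
    g = [[0] * n for _ in range(n)]
    for i, j in edges:
        g[i - 1][j - 1] = 1
        g[j - 1][i - 1] = 1  # undirected: the matrix is symmetric
    return g


# Edge list read off the matrix in Figure 2.3 (5 nodes, 7 edges).
edges = [(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (4, 5)]
G = adjacency_matrix(5, edges)
for row in G:
    print(row)

# The degree of node i is the sum of row i of the matrix.
degrees = [sum(row) for row in G]
```

Summing a row gives the degree of the corresponding node, which is how the
degree set used in section 2.3 can be obtained from the matrix.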
Using this simple form, one can represent a complex problem in a simple way,
for instance the famous Seven Bridges of Königsberg problem, and thinking this
way often leads to new and useful insights. Graphs are used in many fields;
examples include the food chain, a network linked by predator-prey
relationships, and the World Wide Web, a network of linked web pages.
Moreover, when it comes to human societies, a network linked by friendship or
social interaction is used to analyze the pattern of connections between
components. We call such networks social networks.
A social network is a social structure made up of individuals, organizations,
and the relationships created by their social activity. The components
mentioned above are therefore individuals or organizations, which are the
nodes or vertices of the network, while the relationships are the connections
between components, which are the edges of the network. Sociologists also
refer to the nodes or vertices as actors and the edges as ties. In this
thesis, we will use vertices and edges.
Social networks are created and changed by human social activity or social
interaction. Conversely, however, the connections in a social network affect
how people learn, form opinions, and gather news, as well as other less
obvious phenomena, such as the spread of disease [41]. Therefore, it is
necessary to analyze social networks in order to learn more about human
society and to serve it. There are many methods for research and application,
such as community detection, which we describe in detail in chapter 3.
One property of social networks, compared with other types of networks, is
that they vary with the types of connections of which they are composed. In
other words, because there are many different possible definitions of an edge,
we can build different kinds of networks from different points of view,
including friendship, acquaintance patterns, contacts between business people,
movie actors and musicians, and even criminal contacts between terrorists,
drug users, and so on [41].
A second property is that social networks exhibit the small world property and
a scale-free distribution. Perhaps these cannot be regarded as properties
specific to social networks, because the small world property and the
scale-free distribution also exist in other types of networks according to
Watts and Strogatz's research [62].
2.2 Small world networks
As mentioned in section 2.1, although the small world property is not unique
to social networks, we will only consider it in the context of social
networks, because they are the main subject of our research.
In human society, we have a tendency to associate and bond with similar
others, called homophily. On the other hand, we also have a tendency to
connect with different individuals, called heterophily. Homophily is stronger
than heterophily in affecting human activity. Because of this, the probability
of becoming friends with a friend's friend is higher than that of becoming
friends with a complete stranger. We thus form groups with others who are
similar to us. Obviously, we have stronger connections with others inside the
group than with those outside it. In other words, if we represent a specific
society as a network, treating individuals as nodes and friendships as edges,
there will exist many such groups, whose nodes probably share common
properties and/or play similar roles within the network. This is in fact one
of the most important components of the small world property, and we call this
group structure community structure. Figure 2.4 shows a schematic example of a
network with community structure.
Figure 2.4: Community Structure in a network.
We can also express the relationship between these structures and the network
mathematically. If $C$ is the set of communities in a network $G$, then
$$V = \bigcup_{i=0}^{|C|} C_i, \qquad C_i \cap C_j = \emptyset \quad (i \neq j;\ i, j < |C|).$$
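The two conditions above (the communities together cover V, and no node
belongs to two communities) can be checked mechanically. The following is a
minimal sketch; the function name `is_partition` and the toy node sets are
assumptions for illustration.

```python
def is_partition(nodes, communities):
    """Return True if the communities are pairwise disjoint and cover all nodes."""
    seen = set()
    for community in communities:
        if seen & community:  # some node already belongs to another community
            return False
        seen |= community
    return seen == set(nodes)


V = {1, 2, 3, 4, 5, 6}
print(is_partition(V, [{1, 2, 3}, {4, 5}, {6}]))     # True
print(is_partition(V, [{1, 2, 3}, {3, 4, 5}, {6}]))  # False: node 3 appears twice
print(is_partition(V, [{1, 2, 3}, {4, 5}]))          # False: node 6 is not covered
```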
Another important component of the small world property is short path length.
Short path length means that two randomly selected nodes in a network can be
connected to each other by a path whose average length is short. Here, path
length is counted in edges: one step of a path is one edge connecting two
nodes in the network. This property was first verified by Stanley Milgram
[38]. After that, many researchers conducted various experiments to verify it;
for instance, Peter Dodds repeated Milgram's experiment on a large scale,
involving 60000 e-mail chains [14]. In addition, there are some interesting
applications of this property, such as the Erdős number in the mathematics
community and the Bacon number in the acting community.
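To make the notion of average path length concrete, the sketch below measures
it with breadth-first search on a small ring of 12 nodes, and then again after
adding a single long-range shortcut. The ring network, the shortcut, and the
helper names are assumptions chosen for illustration; they are not one of the
datasets used in this thesis.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length (in edges) over all reachable ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def ring(n):
    """Ring of n nodes in which each node is linked to its two neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


network = ring(12)
before = average_path_length(network)

# Add one shortcut between two opposite nodes of the ring.
network[0].add(6)
network[6].add(0)
after = average_path_length(network)
print(before, after)
```

A single shortcut already lowers the average, which is the effect that small
world network models rely on.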
In summary, networks with the small world property are called small world
networks. A certain category of small world networks was identified by Duncan
Watts and Steven Strogatz in 1998 [63]. The small world property is found not
only in social networks but also in other networks, such as road maps, food
chains, and electric power grids. In addition, not all social networks have
the small world property. Some of them have a scale-free distribution,
described in the next section, and some have both a scale-free distribution
and the small world property.
In this thesis, we utilize the small world property to implement our fast
community structure identification algorithm, and the WS model, a small world
network generation model proposed by Duncan J. Watts and Steven Strogatz, is
used to verify the correctness of our proposal and to evaluate the performance
of our algorithm.
2.3 Scale-free networks
In the study of networks, the degree of a node is the number of edges it has,
and the probability distribution of these degrees over the whole network is
called the degree distribution. Suppose we use the symbol $K$ to represent the
set of degrees. Then it can be written as
$K = \{k_i \mid k_i = \sum_{j=0}^{n} e_{i,j},\ e_{i,j} \in E,\ n = |V|,\ i, j < n\}$,
and the degree distribution $P(k)$ is defined as a function of the degree $k$.
For instance, a Bernoulli random graph, in which each pair of the $n$ nodes is
connected with independent probability $p$, has a binomial distribution of
degrees $k$:
$$P(k) = \binom{n-1}{k} p^{k} (1-p)^{n-1-k} \qquad (2.1)$$
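Equation (2.1) can be evaluated directly. The sketch below tabulates P(k) for
a Bernoulli random graph and checks two basic facts: the probabilities sum to
one and the mean degree equals (n - 1)p. The parameter values n = 100 and
p = 0.05 and the function name are assumptions chosen for illustration.

```python
from math import comb


def binomial_degree_distribution(n, p):
    """P(k) for k = 0..n-1 in a Bernoulli random graph with n nodes (Eq. 2.1)."""
    return [comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k) for k in range(n)]


n, p = 100, 0.05
P = binomial_degree_distribution(n, p)

total_probability = sum(P)
mean_degree = sum(k * pk for k, pk in enumerate(P))
print(total_probability, mean_degree, (n - 1) * p)
```

The distribution is concentrated around its mean, which is why a random graph
has a meaningful typical degree.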
Regarding the degree distribution, in 1999 Albert-László Barabási found that
some nodes had many more connections than others and that the network as a
whole had a power-law degree distribution in some social networks, biological
networks, and the World Wide Web [1]. After that, a large portion of
real-world networks were found to have power-law degree distributions, of the
form
$$P(K = k) \propto k^{-\gamma} \qquad (2.2)$$
where $\gamma$ is a parameter which should be greater than 2.
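Equation (2.2) implies that doubling the degree always multiplies the
probability by the same factor, 2 to the power of minus gamma, whatever the
starting degree. This is one way to see why such a distribution has no
characteristic scale. The sketch below checks the property numerically; the
cutoff `k_max`, the exponent value 2.5, and the function name are assumptions
chosen for illustration.

```python
def power_law_distribution(gamma, k_max):
    """Discrete power law with P(k) proportional to k**(-gamma) on k = 1..k_max."""
    weights = [k ** (-gamma) for k in range(1, k_max + 1)]
    total = sum(weights)
    return {k: w / total for k, w in enumerate(weights, start=1)}


gamma = 2.5
P = power_law_distribution(gamma, 1000)

# The ratio P(2k) / P(k) does not depend on k.
ratio_small = P[10] / P[5]
ratio_large = P[200] / P[100]
print(ratio_small, ratio_large, 2 ** (-gamma))
```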
In a random graph, there exists a scale of degrees, which can be embodied by
an "average node", since most nodes in a random graph have almost the same
degree and very few nodes have a degree much greater than the average. Under a
power-law distribution, however, there is no "average node" that can
characterize the degree of the network, because of the existence of a few very
high-degree nodes. Thus one cannot choose a scale to represent this class of
networks [3]. When the degree distribution of a network follows a power law,
we call it a scale-free network [2]. This means that in a scale-free network,
a few nodes have a degree that greatly exceeds the average, while most others
have low degree [ See figure 2.5 ]. We call the highest-degree nodes hubs.
Many real networks are conjectured to be scale-free, and a few have been
claimed to be so, such as the World Wide Web, biological networks, airline
networks, some financial networks [56], and so on. In social networks,
researchers have also discovered scale-free networks, for instance sexual
relationships among people in Sweden, e-mail networks [15], networks of
scientific papers connected by citations [52], and collaboration networks
connected by co-author relationships [42].
Figure 2.5: An example of a power-law distribution [3]. Here, the number of
links k is the node degree. Most nodes have small degrees, while a few nodes
(hubs) have a large number of connections.
In this thesis, we use the BA model, a scale-free network generation model
proposed by Albert-László Barabási, to generate scale-free networks for
evaluating the performance of our algorithm.
Chapter 3
Community detection
In this chapter, we give an overview of community detection, including its
necessity, objectives, types, applications, and so on. In addition, we
introduce modularity-based community detection research, one of the hottest
lines of work in the community detection area. Finally, we describe concretely
the Louvain method, which is the base algorithm of our research.
3.1 Community detection
In section 2.2, we noted that community structures exist in small world
networks. Community structure refers to the occurrence of sets of nodes that
are connected more densely internally than with the rest of the network
[ Figure 2.4 ] and that probably share common properties and/or play similar
roles within the network. Because community structure is formed by social
interaction in social networks, communities in a social network may represent
real social groupings, for instance a music interest group in a university
friendship network, a research topic in a citation network, or a category in a
web network. Identifying these communities can therefore help us understand
and exploit these networks more effectively. Furthermore, by identifying
communities we can discover previously unknown communities, reduce the problem
scale, and create data structures that store and handle large-scale networks
efficiently. We can even reduce computational cost and improve the accuracy of
systems, for example by using community detection as preprocessing in the item
recommendation systems of online retailers (e.g., www.amazon.com).
Community structures in small-scale networks can be detected simply by hand,
while in large-scale networks this is impossible. Moreover, community
structures in large-scale networks may hold more information than those in
small ones. Therefore, it is desirable to detect these structures with
computer algorithms. These algorithms are called community detection,
community finding, community identification, or clustering.
Community detection is one of the hottest research topics in network science.
Weiss and Jacobson were the first to carry out an analysis of community
structure, trying to separate groups by studying the relationships between
members of a government agency [21]. After that, the problem appeared in
various forms in several disciplines, and many kinds of community detection
algorithms were proposed.
• Hierarchical clustering
Hierarchical clustering is a traditional technique for finding communities in
social networks. Its basic idea comes from the hierarchical structure
displayed by many social networks. For example, a company is composed of
departments, a department is composed of teams, and a team is composed of
staff members. The method repeats two steps, computing the similarity or
distance between clusters or nodes and merging similar clusters or nodes into
one cluster, until the whole network is merged into one cluster.
• Statistical inference
Methods based on statistical inference aim to build a generative model by
deducing properties of network data sets, and then fit it to actual networks to
find community structures. The stochastic block model [29] is one of the most
frequently referenced models in this literature. It divides a network into
groups of equivalent nodes, using, for example, structural equivalence, in which
nodes that have the same neighbors are regarded as equivalent, or regular
equivalence, in which nodes have similar connection patterns to nodes of
other classes. There are also other methods with good performance, such as
belief propagation [25] and agglomerative Monte Carlo [50].
This class of methods can not only find community structures in social
networks but can also be used to generate network data.
• Dynamic methods
This class of methods uses processes running on the network, such as a random
walk. The basic idea is that, if a network has strong community structure, a
random walker will spend a long time inside a community because of the high
density of internal edges. Various approaches follow this idea, for instance
Zhou's [67] "global attractor" and "local attractor", Zhou and Lipowsky's
net-walk, and Van Dongen's [59] Markov Cluster Algorithm.
• Divisive methods
Divisive methods focus on the edges of a network. They repeatedly find the edges
that connect nodes of different communities and remove them. These two steps
stop when the network has been decomposed into several disconnected
components, each of which forms a community. A classic divisive method is the
algorithm proposed by Girvan and Newman [24], who introduced the concept of
betweenness, including edge betweenness, random-walk betweenness and
current-flow betweenness. The edge betweenness of an edge is the number of
shortest paths between all node pairs that pass through the edge. The
random-walk betweenness of an edge is the frequency with which a random walker
running on the network crosses the edge. Current-flow betweenness regards the
network as a resistor network whose edges have unit resistance, and is defined
as the amount of current carried by the edge when a voltage difference is
applied between any two nodes in the network. Using these betweenness measures,
the algorithm gradually finds and removes the edges between communities, and
thereby finds the community structure of the network.
• Others
There are many other kinds of community detection methods as well. Depending
on the network type, there are methods for bipartite networks [66] and
multipartite networks. Detection of overlapping community structure is also
an active research topic. Meanwhile, modularity maximization methods are among
the most widely used, although their drawbacks are known. These methods are
described in more detail in Sections 3.2 and 3.3, because our proposal is an
extension of this literature.
3.2 Modularity-based community detection
As mentioned in the previous section, modularity-based methods are among the most
widely researched community detection methods. Before discussing these methods,
we start with modularity [45] itself.
Community detection methods partition a network into the communities that exist
in it. Because the communities are detected by machine, a question of quality
naturally arises. In other words, although we hope to detect a community
structure as close to the real one as possible, the quality of the detected result
differs between methods, and we do not know how good a given algorithm is. The
problem is more serious when detecting an unknown community structure. To
distinguish "good" from "bad" community detection results, we therefore need
measures that represent quality.
Modularity [45], proposed by Newman and Girvan, is one of the most popular
quality functions. With it, one can evaluate how good a detected structure is.
The basic idea is as follows: since a random graph has no community structure,
one can compare the fraction of intra-community (within-community) edges under
the current partition of a network with the value expected for a random network.
Modularity can then be written as follows:
\[
Q = \sum_{i=1}^{c} \left( e_{i,i} - a_i^2 \right) \tag{3.1}
\]
where the sum runs over all communities, $c$ is the number of communities into
which the network is divided, and $e_{i,j}$ is the fraction of edges in the network
that connect nodes of community $i$ to nodes of community $j$. $a_i = \sum_j e_{i,j}$
is the fraction of edges that connect to nodes in community $i$. If the edges of a
network fell between nodes without regard for the communities the nodes belong to,
we would have $e_{i,j} = a_i a_j$, so $a_i^2$ is the expected value of $e_{i,i}$.
If a detected community is close to the real one, $e_{i,i}$ is greater than its
expected value $a_i^2$; otherwise it is smaller. The closer a detected community is
to the real one, the greater its contribution to modularity, and the greater the
modularity of the whole network becomes. Although modularity was originally defined
as a stopping criterion for Girvan and Newman's algorithm, it is employed directly
or indirectly in many community detection methods. One major application is
modularity optimization. The motivation of this class of methods is that, since
high values of modularity indicate good partitions, there should be at least one
partition whose modularity is the maximum over all possible partitions. Under the
partition with the best modularity value, every community in the partition also
holds the best modularity. However, Brandes et al. [7] proved that modularity
optimization is an NP-complete problem, so most modularity optimization methods
try to find good approximations of the modularity maximum in a reasonable time.

Figure 3.1: The greater the modularity value, the better the partition. The parts
enclosed by dotted lines are communities.

For ease of understanding, we fix some symbols and use them throughout the thesis.
First, we use $|U \to W|$ to indicate the number of edges between node sets $U$
and $W$. In addition, we use $Q$ as the symbol for modularity and $\Delta Q$ for
∆modularity. We then use $|\{c_i\} \to \{c_j\}| = |E \cap (c_i \times c_j)|$ to
indicate the weight of the edge between communities $c_i$ and $c_j$. As a result,
modularity $Q$ can be represented as:
Table 3.1: Comparison of some community optimization methods
Method               Characteristic              Community formation   Scale
Newman+ (2004)       Modularity Definition       Merge                 10^4 nodes
Clauset+ (2004)      Efficient Data Structure    Merge                 10^6 nodes
Wakita+ (2007)       Consolidation Ratio         Merge                 10^7 nodes
Blondel+ (2008)      Partial Optimization        Move                  10^8 nodes
Shiokawa+ (2013)     Incremental Aggregation     Merge                 10^8 nodes
\[
Q = \sum_{i=0}^{|C|} \left( |\{c_i\} \to \{c_i\}| - \Bigl( \sum_{j} |\{c_i\} \to \{c_j\}| \Bigr)^2 \right) \tag{3.2}
\]
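The definition in Equation 3.1 can be tried on a small example. The following is a minimal sketch, assuming an undirected, unweighted edge list and a dictionary that maps each node to a community label; the function name `modularity` and the toy network are illustrative and do not come from the thesis. It computes $e_{i,i}$ as the fraction of edges inside community $i$ and $a_i$ as the fraction of edge ends attached to community $i$.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Q = sum_i (e_ii - a_i^2) for an undirected, unweighted edge list."""
    m = len(edges)
    inside = defaultdict(int)  # edges with both ends in community i
    ends = defaultdict(int)    # edge ends attached to community i
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        ends[cu] += 1
        ends[cv] += 1
        if cu == cv:
            inside[cu] += 1
    # e_ii = inside / m, a_i = ends / (2m)
    return sum(inside[c] / m - (ends[c] / (2 * m)) ** 2 for c in ends)


# Hypothetical toy network: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {v: "A" for v in range(6)}

print(modularity(edges, two_groups))  # one community per triangle
print(modularity(edges, one_group))   # everything in one community
```

Under these assumptions, the partition that follows the two triangles scores higher than the partition that puts every node into a single community, which is the behavior the text describes.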
The first modularity optimization method was proposed by Newman [44] as
a greedy method. In this method, every node in the network is first considered
to be a community on its own. Next, ∆modularity is computed for all pairs of
communities that are connected to each other. Here ∆modularity means the
increment in modularity under the assumption that the community pair is merged
into one community. Finally, the community pair with the largest ∆modularity is
selected and merged into one community. The compute and merge steps are repeated
until all communities are merged into one or the number of communities reaches a
given value. This method can handle networks of up to 10^4 nodes. In a later
work [10], Clauset et al. produced an accelerated version by using a special data
structure, the max-heap. This data structure eliminated many of the unnecessary
computations in Newman's method and extended the scale of networks that can be
handled to 10^6 nodes.
A tendency that often leads to poor values of the maximum modularity was soon
found: the greedy methods quickly form large communities at the expense of small
ones. Regarding this problem, Wakita and Tsurumi [60] noticed that Clauset's
method is often biased towards large communities. To resolve this, they proposed
the consolidation ratio, which balances ∆modularity against community size. As a
result, the number of analyzable nodes was extended to 10^7.
Blondel et al. [5] took a different approach (the Louvain method), which can
handle up to 10^8 nodes. This method forms communities by moving nodes
instead of merging communities. In addition, it does not select the best
community pair of the whole network; it selects only the best community pair
within a local part of the network. (Because our work is based on the Louvain
method, it is described in detail in Section 3.3.) Shiokawa et al. [55] proposed
a heuristic that combines the ideas of community merging and partial modularity
computation.
There are also other kinds of modularity-based methods, but we do not describe
them in this thesis.
3.3 Louvain method
In this section, we explain the Louvain method in detail because, as mentioned
above, our proposal is based on it.
Compared with other modularity optimization methods, the Louvain method differs
in two ways. First, it does not select the community pair with the globally
maximal ∆modularity (over all community pairs in the whole network) every time;
instead, it selects the node-community pair with the locally maximal ∆modularity
(over one node and its neighbor communities) to form the community structure.
Second, the selected node is moved to the selected community, whereas in Newman's
method the two selected communities are merged into one. With these two main
differences, the Louvain method achieves higher modularity than the others, as
well as shorter detection times.
The Louvain method is divided into two phases that are repeated iteratively.
First, as in Newman's method, every node in the network is initially considered
to be a community, so the number of communities in the initial partition equals
the number of nodes. Then, for each node $i$, the method computes ∆modularity
between node $i$ and its neighbor communities. Here, a neighbor community of $i$
is a community to which a node connected to $i$ belongs, and ∆modularity is the
increment in modularity that would result from moving node $i$ from its current
community to the neighbor community. Finally, node $i$ moves to the community
with the largest ∆modularity. In other words, every node in the network makes its
best move among its neighbor communities. If the value for the pair of the node
and its current community is the largest, the node stays in its current
community. Moreover, a node-community pair whose ∆modularity is less than 0 never
takes part in the comparison, because a negative ∆modularity means that
modularity would decrease if the node joined that community. A node therefore
stops moving when ∆modularity with all of its neighbor communities becomes
negative or smaller than a threshold (e.g., $10^{-5}$). The ∆modularity
computation stops when no node moves anywhere in the network. Meanwhile, after a
node moves to a neighbor community, the communities involved update their
information (member nodes, total weight, etc.), which affects the ∆modularity
values, so every node has to take part in the computation for several Passes.
Here a Pass means one round of computation over all nodes in the network. This
first phase is complete when no movement takes place.

Figure 3.2: Visualization of the Louvain method. Starting from the raw network on
the left, it computes until no node moves. Node color indicates the community
label. Here phase two is described as community aggregation. Figure reprinted
from Blondel et al.'s paper [5].
The second phase rebuilds the network so that its nodes are the communities
found during the first phase. The weight of the edge between two new nodes is the
sum of the weights of the edges between the corresponding first-phase
communities. Edges between nodes of the same community lead to a self-weight,
which is used in the ∆modularity computation. Once the second phase is completed,
the method repeats phase one and phase two on the new network. In this way, the
method alternates between the computation step and the reset step until no
further modularity improvement takes place.

Figure 3.3: Visualization of the ∆modularity computation for node-community
pairs. The nodes in one cloud form one community. ∆modularity is computed between
the center node and its neighbor communities.
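The rebuilding step of phase two can be illustrated with a short sketch. It assumes a weighted, undirected edge list of `(u, v, weight)` triples and a node-to-community dictionary; the function name `aggregate` is hypothetical. Whether the self-weight counts each internal edge once or twice is a convention that the text does not spell out, so this sketch simply counts each edge once.

```python
from collections import defaultdict


def aggregate(weighted_edges, community_of):
    """Collapse each community into one node and sum the edge weights."""
    new_edges = defaultdict(float)
    for u, v, w in weighted_edges:
        cu, cv = community_of[u], community_of[v]
        # Undirected graph: store each community pair in a canonical order.
        key = (min(cu, cv), max(cu, cv))
        new_edges[key] += w
    return dict(new_edges)


# Hypothetical toy network: two triangles joined by a single edge.
weighted_edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
communities = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

# Keys of the form (c, c) carry the self-weight of the new node c.
print(aggregate(weighted_edges, communities))
```

In the result, the pair `(0, 1)` holds the weight between the two new nodes, and the pairs `(0, 0)` and `(1, 1)` hold the self-weights that come from edges inside each community.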
In other words, in a friendship network, a person provisionally joins the best
community among those his friends belong to, and if another community containing
some of his friends becomes better than the current one, he moves to that
community. He continues moving until none of his friends' communities is better
than the current one. In the Louvain method each person can freely select and
move between neighbor communities, whereas in Newman's method a person joins a
community only if his best ∆modularity value is also the best in the whole
network. This is the whole process of the Louvain method.
Above, ∆modularity was referred to several times and described as the increment
in modularity under the assumption that a node moves from its original community
to a neighbor community. We now discuss ∆modularity in more detail. As mentioned
above, modularity is a quality function that evaluates a partition of a network,
which means that each partition of a network has its own modularity value.
Furthermore, we can say that each network has its own modularity value, since we
can regard every network as having been partitioned from another network. Because
the Newman and Louvain methods gradually merge or move nodes into neighbor
communities, the network structure also changes gradually. Therefore, we can
evaluate the impact of one merge or move on modularity by comparing the
modularity before and after it. That difference is ∆modularity. From the
modularity equation (Equation 3.1), we can define ∆modularity as follows:
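\[
\Delta Q = \left[ \frac{\Sigma_{in} + k_{i,in}}{2m} - \left( \frac{\Sigma_{tot} + k_i}{2m} \right)^2 \right]
 - \left[ \frac{\Sigma_{in}}{2m} - \left( \frac{\Sigma_{tot}}{2m} \right)^2 - \left( \frac{k_i}{2m} \right)^2 \right]
 = \frac{1}{2m} \left( k_{i,in} - \frac{k_i \Sigma_{tot}}{m} \right) \tag{3.3}
\]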
where $\Sigma_{in}$ is the sum of the weights of the edges inside the neighbor
community, $\Sigma_{tot}$ is the sum of the weights of the edges incident to nodes
in the neighbor community, $k_i$ is the self-weight of node $i$ (initially it
should be 0), $k_{i,in}$ is the weight of the edge between the node and its
neighbor community, and $m$ is the sum of the weights of all links in the network.
In addition, we define some symbols and rewrite the original ∆modularity for
convenience. We use $\Delta Q_{v_i c_i}$ to indicate the change in modularity when
node $v_i$ moves to community $c_i$, while
$|\{v_i\} \to \{c_i\}| = |E \cap (v_i \times c_i)|$ denotes the number of edges that
connect node $v_i$ and community $c_i$.
$|\{c_i\} \to V| = |E \cap (c_i \times V)|$ represents the total number of edges of
community $c_i$. Finally, we use $|\{v_i\} \to \{v_j\}|$ to indicate the weight of
the edge between nodes $v_i$ and $v_j$, so that
$|v_i \to v_i| = |E \cap (v_i \times v_i)|$ gives the weight of node $v_i$ itself.
With these symbols, we can rewrite ∆modularity in an easy-to-understand form as
follows:
\[
\Delta Q_{v_i c_i} = \frac{1}{2|E|} \left( |v_i \to c_i| - \frac{|v_i \to v_i| \times |c_i \to V|}{|E|} \right) \tag{3.4}
\]
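The simplification in Equation 3.3 can be checked numerically. The sketch below evaluates both the difference of the two bracketed terms and the short form for the same inputs; the function names and the sample values are arbitrary illustrations, not data from the thesis. The identity is purely algebraic, so it holds whatever values are supplied for the symbols.

```python
def delta_q_full(sigma_in, sigma_tot, k_i, k_i_in, m):
    """Difference of the two bracketed terms in Equation 3.3."""
    after = (sigma_in + k_i_in) / (2 * m) - ((sigma_tot + k_i) / (2 * m)) ** 2
    before = (
        sigma_in / (2 * m)
        - (sigma_tot / (2 * m)) ** 2
        - (k_i / (2 * m)) ** 2
    )
    return after - before


def delta_q_short(sigma_tot, k_i, k_i_in, m):
    """Simplified right-hand side of Equation 3.3."""
    return (k_i_in - k_i * sigma_tot / m) / (2 * m)


# Arbitrary sample values, chosen only to exercise the formula.
full = delta_q_full(sigma_in=6, sigma_tot=7, k_i=3, k_i_in=2, m=10)
short = delta_q_short(sigma_tot=7, k_i=3, k_i_in=2, m=10)
print(full, short)
```

The short form avoids the squared terms and does not need $\Sigma_{in}$ at all, which is why it is the form that is convenient to evaluate for every node-community pair.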
Chapter 4
Proposed method
In this chapter, we describe our proposal. It consists of four heuristics, each
proposed to overcome a particular weakness of the Louvain method. Finally, we
propose a hybrid heuristic that combines these proposals to make our community
detection more effective.
4.1 Basic idea
In Newman's method, all community-community pairs of the network (in fact
node-node pairs, because every community pair is merged into one node as soon as
it is selected) need to be computed for one merge. Assume a network with $n$
nodes and $m$ edges. One merge then needs $m$ computations. In the Louvain
method, however, only the pairs of a node and its neighbor communities need to be
computed for one move, which means about $m/n$ computations on average. Mainly
because of this difference in compute times, the Louvain method outperforms all
the other methods in both modularity and speed. Following this natural
progression, we wondered whether there is any idea that can decrease the compute
times even more.
On the other hand, from the description of related research (Chapters 2 and 3),
we find that all of these methods approach the problem from an engineering
viewpoint. They can detect meaningful and useful community structures. However,
since the objective is to detect community structures that exist in a network, we
wondered whether it would be more effective and realistic to utilize some
characteristics of the social network itself. These two points are the original
motivations of this research.

Figure 4.1: Assume a network including three communities.

Figure 4.2: This figure represents one intermediate state of the Louvain method.

Let us now look at the small world property again. First, the small world
property is the reason why community structures exist in networks, and it is also
the theoretical rationale of community detection methods. However, we can draw
another consequence from this property. If a node $i$ belongs to a real community
$K$ (not a community detected by a detection method; see Figure 4.1), then most of
node $i$'s neighbor nodes should belong to the same community $K$ with high
probability, because of the effect of homophily (see Section 2.2). Hence, at an
intermediate state of the Louvain method, among node $i$'s neighbor communities
(Communities A, B and C in Figure 4.2), those that finally belong to the real
community $K$ should outnumber those that do not. In addition, to guarantee that
node $i$ is finally affiliated with Community $K$, all we have to do is guarantee
that, always (or at least finally), node $i$ belongs to one of Community $K$'s
sub-communities, e.g. Community A, B or C, at the intermediate state. In other
words, there is no need to compute the pairs of node $i$ and all of its neighbor
communities to select the best move; computing only some of them and selecting
the best among those may also yield the correct community structure. This is
because, even if only some of the neighbor communities are selected, communities
that finally belong to the real community $K$ are selected and take part in the
computation with high probability. Moreover, this is expected to keep node $i$ in
a sub-community of the real Community $K$, although it may not be the best of all
its neighbor communities. This is our basic proposal.
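The basic proposal, evaluating only part of a node's neighbor communities, can be sketched as follows. This is an illustration under stated assumptions rather than the thesis implementation: the function name `sampled_best_community`, the `gain` callback (standing in for the ∆modularity computation) and the `sample_size` parameter are all hypothetical, and the subset is drawn uniformly at random.

```python
import random


def sampled_best_community(node, neighbors, community_of, gain, sample_size, rng):
    """Choose the best move among a random subset of neighbor communities."""
    current = community_of[node]
    candidates = sorted({community_of[n] for n in neighbors[node]} - {current})
    if len(candidates) > sample_size:
        # Only this subset is evaluated; the other neighbor communities are skipped.
        candidates = rng.sample(candidates, sample_size)
    best, best_gain = current, 0.0
    for c in candidates:
        g = gain(node, c)
        if g > best_gain:  # a move is made only for a positive gain
            best, best_gain = c, g
    return best


# Hypothetical toy data: node 0 has neighbors in three different communities.
neighbors = {0: [1, 2, 3]}
community_of = {0: "a", 1: "b", 2: "c", 3: "d"}
toy_gain = {"b": 0.1, "c": 0.3, "d": 0.2}


def gain(node, community):
    return toy_gain[community]


rng = random.Random(0)
print(sampled_best_community(0, neighbors, community_of, gain, 3, rng))  # all evaluated
print(sampled_best_community(0, neighbors, community_of, gain, 2, rng))  # two evaluated
```

With `sample_size` equal to the number of neighbor communities the function behaves like the ordinary Louvain choice; smaller values trade the certainty of picking the best neighbor community for fewer ∆modularity computations per node.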
We can picture this idea in a real network as follows. In the Louvain method, a
person finds his community by considering all the communities that his friends
belong to, and selects the best one. He then observes the changes in his neighbor
communities and moves from one community to another whenever one of them becomes
better than the current one, until no more changes take place in his neighbor
communities. Under our basic idea, however, a person observes the changes of only
part of his neighbor communities and selects the best community among them.
Although at any moment he belongs only to a partially best community, he can
finally become a member of the best community among all neighbor communities,
because of the small world property.
In summary, we propose the possibility of decreasing compute times by utilizing
the small world property. This is our basic idea, and we add further measures on
top of it to optimize it.
4.2 Modularity computation efficiency
In Section 4.1, we described our basic proposal. In this section, we describe
another proposal, Modularity Computation Efficiency (MCE).
Various measures can be used to evaluate a clustering method from different
viewpoints. Modularity measures the quality of the detected communities, while
running time represents the data processing capability of a community detection
algorithm. The Rand index can be used to obtain the similarity between two
partitions. To measure how much information is shared between a detected
partition and ground-truth data (data whose correct community structure is
known), we use normalized mutual information (NMI). The pair-counting F-measure
can also be used to compare community detection results with different numbers of
communities.
Using these measures, one can not only evaluate a single community detection
method but also compare the quality of several methods. Nevertheless, these
measures only assess the accuracy of the result, whereas one also needs tools
that represent the quality or efficiency of the community detection process.
Programmers and analysts in particular need to know the quality or efficiency of
the process for further optimization or more detailed analysis.
Here we consider a simple measure that represents the efficiency of the
community detection process. We call it Modularity Computation Efficiency (MCE).
With it, one obtains a community detection efficiency that is independent of the
programming language, the implementation technique, and the computer hardware
(e.g. memory, CPU or hard disk). The measure is as follows:
\[
\mathrm{MCE} = \frac{\text{DegreeOfModularityIncrease}}{\text{ComputeTimes}} \tag{4.1}
\]
where Degree of Modularity Increase means the modularity increment over a fixed
number of computations (e.g. 10000), and Compute Times means that fixed number of
computations.
In this research, we use this measure to compare and evaluate the community
detection efficiency of the Louvain method and the heuristics we propose.
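Equation 4.1 can be evaluated directly from a log of modularity values. The sketch below assumes that modularity is recorded once every `window` ∆modularity computations; the function name `mce_series` and the sample log are hypothetical and are not measurements from the thesis.

```python
def mce_series(modularity_log, window):
    """Return one MCE value per window of `window` computations."""
    return [
        (later - earlier) / window
        for earlier, later in zip(modularity_log, modularity_log[1:])
    ]


# Hypothetical log: modularity recorded after every 10000 computations.
log = [0.00, 0.30, 0.45, 0.50]
for value in mce_series(log, 10000):
    print(f"{value:.2e}")
```

Because the measure counts computations rather than seconds, two implementations can be compared on different machines as long as they count a computation in the same way.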
4.3 Max neighbor community selection
In Section 4.1 we proposed a new idea that utilizes the small world property
of social networks, and on top of this basic idea we proposed a new evaluation
metric. Using these two ideas, we carried out a preliminary experiment to verify
the correctness of the basic idea.
As a result (see Figure 4.3), we found that the heuristic based on our basic
idea reaches almost the same modularity with higher MCE. However, the better
performance appears only in the early part of the computation; after that, the
computation efficiency follows the same curve as the Louvain method. It is
therefore necessary to take some measures to improve the modularity computation
efficiency in the latter part.
For this purpose, we consider the reasons from two viewpoints: MCE and the
formula of ∆modularity itself.
First, in order to compare how modularity changes under the Louvain method and
under our heuristic, we plotted MCE against compute times. In Figure 4.4 we can
see two peaks in the modularity change; with some other data sets we observed
more than two, but they do not occur consecutively. Even the first peak is
sustained only for a short time, and the total compute time is longer than that
of the Louvain method. On the other hand, since the final modularity is a stable
value, the figure suggests that the heuristic can be optimized by taking measures
that give its peak a longer duration and a higher modularity change.

[Figure 4.3 plot: x-axis, Compute Times (*10000); y-axis, Modularity; series, Louvain method and Basic idea heuristics.]
Figure 4.3: Modularity change comparison between the Louvain method and the basic
idea heuristic. The x-axis shows compute times and the y-axis shows modularity.
The graph represents the change curve of modularity, and its slope is MCE.
Second, we consider the ∆modularity equation itself, which is as follows:
\[
\Delta Q_{v_i c_i} = \frac{1}{2|E|} \left( |\{v_i\} \to \{c_i\}| - \frac{|\{v_i\} \to \{v_i\}| \times |\{c_i\} \to V|}{|E|} \right) \tag{4.2}
\]
In this equation, the conditions
$|\{v_i\} \to \{v_i\}| \times |\{c_i\} \to V| \ll |E|$ and
$\frac{|\{v_i\} \to \{v_i\}| \times |\{c_i\} \to V|}{|E|} \ll |\{v_i\} \to \{c_i\}|$
should be satisfied during part of the whole computation, although one cannot
know when that part begins or how long it lasts. It is therefore conceivable to
select part of the neighbor communities in order of decreasing
$|\{v_i\} \to \{c_j\}|$, because while these conditions hold, the greater the value
of $|\{v_i\} \to \{c_j\}|$, the greater the value of ∆modularity. In addition,
selecting the neighbor communities with high $|\{v_i\} \to \{c_j\}|$ can be expected
to improve community detection efficiency, for the following reason. The basic
heuristic reduces compute times by referring to only part of the neighbor
communities, but the number of times each node takes part in the computation
increases (which appears as an increase in the number of Passes; for Passes, see
Section 3.3), because the selected neighbor community does not always hold the
greatest ∆modularity value, whereas the Louvain method always selects the best
one.

[Figure 4.4 plot: x-axis, Compute Times (*10000); y-axis, Delta Modularity; series, Louvain method and Basic idea heuristics.]
Figure 4.4: MCE comparison between the Louvain method and the basic idea heuristic.
From the above consideration, we propose the following countermeasure for
improving MCE in the latter part of the computation: in contrast to the basic
heuristic, we select part of the neighbor communities in order of decreasing
$|\{v_i\} \to \{c_j\}|$. This idea may be applied from Pass 2 to the last Pass,
because in Pass 1 the $|\{v_i\} \to \{c_j\}|$ values of most node-community pairs
equal 1, while in the last $n$ Passes, according to our observation, the community
that holds the greatest ∆modularity value also tends to have a large
$|\{v_i\} \to \{c_j\}|$ value.
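Selecting neighbor communities in decreasing order of $|\{v_i\} \to \{c_j\}|$ can be sketched with a partial sort. The code assumes an adjacency structure of the form `adjacency[node][neighbor] = weight`; the function name `heaviest_neighbor_communities` and the parameter `k` (how many communities to keep) are hypothetical choices for illustration.

```python
from collections import defaultdict
import heapq


def heaviest_neighbor_communities(node, adjacency, community_of, k):
    """Return up to k neighbor communities by decreasing link weight to `node`."""
    weight_to = defaultdict(float)
    for neighbor, w in adjacency[node].items():
        if neighbor != node:  # a self-loop is not a link to a neighbor
            weight_to[community_of[neighbor]] += w
    # nlargest returns the community labels sorted from heaviest to lightest.
    return heapq.nlargest(k, weight_to, key=weight_to.get)


# Hypothetical toy data: node 0 is linked to communities b, c and d.
adjacency = {0: {1: 1.0, 2: 2.0, 3: 2.5, 4: 0.5}}
community_of = {0: "a", 1: "b", 2: "b", 3: "c", 4: "d"}

print(heaviest_neighbor_communities(0, adjacency, community_of, 2))
```

In the toy data the two links into community b add up to 3.0, so b is ranked ahead of c (2.5) and d (0.5); only the returned communities would then be passed to the ∆modularity computation.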
4.4 Changed neighbor community selection
In Section 4.3, we proposed an idea to improve MCE in the latter part of the
computation. However, there is another problem that needs to be solved.
Figure 4.3 represents the change in modularity over the whole process of
community detection. From the figure we can see that modularity changes
drastically in the early part of the computation, while in the latter part it
changes slowly but steadily (hereafter we call these two parts the drastically
changing part and the slowly changing part). In addition, the number of
computations in the latter part is about five to six times that of the early
part, which is undesirable: node movements are clearly fewer in the slowly
changing part than in the early part, which means that the modularity
computation efficiency is far lower than in the drastically changing part.

Table 4.1: Changes in the number of moved nodes per Pass
Pass   Movement     Pass   Movement
1      43426        7      380
2      15926        8      128
3      8912         9      32
4      3765         10     25
5      1618         11     24
6      799          12     13
To solve this problem, we first observed how the number of moved nodes changes.
As a result (see Table 4.1, obtained from SNS data with 58228 nodes and 214078
edges from http://urx.nu/9NE5), we found that the number of moved nodes gradually
decreases, while the community detection method iterates over all nodes in both
the drastically changing part and the slowly changing part. In other words, if we
regard a computation as effective when the node moves after the computation for
the node and its neighbor communities, many ineffective computations are
performed in the slowly changing part. We therefore consider that MCE in the
slowly changing part can be improved simply by identifying and computing only the
effective nodes, that is, those likely to move from one community to another.
We then examined the moving nodes in detail, looking for characteristics that
would let us identify them directly. Before long, we found an interesting
phenomenon (although it may be a matter of course). Once the slowly changing part
sets in, the movement of a node causes the movement of its neighbor nodes, and
these neighbor nodes are the only movable nodes. Other nodes, which are not
connected to the moved node, never move. This phenomenon takes place throughout
the slowly changing part, and in fact it happens over the whole computation. The
difference is that the drastically changing part also includes movements of
unconnected nodes, while in the slowly changing part only connected nodes move.
We could therefore improve MCE in the latter part by computing, in the next
round, only the neighbor nodes connected to the nodes that moved in this round.
However, we proved only part of this phenomenon mathematically, because of the
complexity of the problem and the approaching deadline of the thesis. The proved
part consists of the two cases shown in Figure 4.5. We therefore leave this part
as future work and adopt another, less effective countermeasure, although merely
skipping the nodes that satisfy the above two cases during computation might
still improve compute efficiency.

[Figure 4.5 diagram: two panels; labels: node v, Community C1, Community C2, Community C3, Community C4.]
Figure 4.5: The figure shows two cases that represent the possible movement of
nodes not connected to node v when node v moves from one community to another.
In these two cases, the nodes that are not connected to node v never move in the
next round, which we have proved. The remaining cases, for instance the
possibility that nodes belonging to the lower community in the left figure move
to the upper two communities, are not yet proved.
The countermeasure we finally adopted is to mark, for each community, whether
it has changed. When a node is taken up, before computing its node-community
pairs, the method first checks whether at least one of the node's neighbor
communities has changed since the last computation; if none has changed, the node
is skipped without any computation. In other words, a node whose neighbor
communities have not changed since the last computation is ignored, because if no
neighbor community has changed, the node should not move. This is our third
proposal.
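The per-community change mark can be sketched with a set of changed community labels. This is a minimal illustration, assuming the marks are collected during one Pass and consulted in the next; the names `nodes_to_visit`, `move_node`, `changed` and `changed_next` are hypothetical and do not come from the thesis.

```python
def nodes_to_visit(nodes, neighbors, community_of, changed):
    """Keep only nodes with at least one neighbor community marked as changed."""
    return [
        v for v in nodes
        if any(community_of[n] in changed for n in neighbors[v])
    ]


def move_node(node, target, community_of, changed_next):
    """Move `node` and mark both its old and its new community as changed."""
    source = community_of[node]
    if source != target:
        community_of[node] = target
        changed_next.update((source, target))


# Hypothetical toy data: a path 0 - 1 - 2 - 3 split into three communities.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
community_of = {0: "a", 1: "a", 2: "b", 3: "c"}

changed = {"c"}  # marks left by the previous Pass
print(nodes_to_visit([0, 1, 2, 3], neighbors, community_of, changed))

changed_next = set()
move_node(2, "c", community_of, changed_next)
print(sorted(changed_next))
```

Only node 2 has a neighbor in the changed community, so it is the only node examined; after it moves, both its old and its new community are marked for the following Pass.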
4.5 Hybrid heuristic
In this section, we describe a hybrid heuristic that automatically combines and
switches among the countermeasures proposed so far, in order to detect
communities with the maximum modularity computation efficiency.
Let us review these countermeasures to decide what the heuristic should do.
First, we proposed an idea that reduces compute times by utilizing the small
world property. This is our basic proposal, and it should be applied throughout
the community detection, since it is effective all the time. Next, a
countermeasure that selects part of the neighbor communities in order of
decreasing $|\{v_i\} \to \{c_j\}|$ (see Equation 4.2) was given to improve
modularity computation efficiency in the latter part. Here the latter part is not
the slowly changing part mentioned in Section 4.4; it indicates the latter part
of the drastically changing part (see Section 4.4). Note also that this
countermeasure cannot be used in the early part of the computation, because the
weight of every edge is 1 at first, which means that every
$|\{v_i\} \to \{c_j\}|$ is 1. The heuristic therefore selects neighbors randomly in
the early part, and we need to switch automatically from random selection to this
countermeasure. Finally, an idea for improving MCE (see Section 4.2) in the
slowly changing part was considered. This countermeasure is also always
effective, so there is no need to switch to it during the computation.
According to the above review, we can design the whole process of the hybrid
heuristic as follows:
1. Detect communities by randomly selecting part of the neighbor communities.
2. Switch the selection method of neighbor communities.
3. Detect communities by selecting part of the neighbor communities in
decreasing order of |{vi} → {cj}|.
4. Compute only the nodes connected to previously moved nodes [ See section 4.4 ].
As for the time to switch between random selection and decreasing-order
selection, we determine it by comparing the slope of the modularity curve
[ Figure 4.3 ]. The slope can be obtained easily by recording the modularity
value every fixed number of computations ( e.g., every 10000 computations ).
Then
\[ m = \frac{Modularity_2 - Modularity_1}{t_2 - t_1} \]
is the slope of the curve. When m begins to decrease, the program switches to
decreasing-order selection mode.
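The switching rule can be illustrated with a minimal Python sketch. It is not
the thesis implementation: the function names, the list of recorded modularity
values and the sampling interval argument are assumptions made only for this
example.

```python
def slope(mod_prev, mod_now, t_prev, t_now):
    """Slope m of the modularity curve between two sampling points."""
    return (mod_now - mod_prev) / (t_now - t_prev)


def first_switch_index(modularities, interval=10000):
    """Index of the first sample at which the slope decreases, else None.

    modularities[i] is assumed to be the modularity recorded after
    i * interval computations (hypothetical recording scheme).
    """
    prev_m = None
    for i in range(1, len(modularities)):
        m = slope(modularities[i - 1], modularities[i],
                  (i - 1) * interval, i * interval)
        if prev_m is not None and m < prev_m:
            return i          # the curve has started to flatten
        prev_m = m
    return None


# Made-up modularity samples: the growth accelerates, then slows down.
samples = [0.00, 0.05, 0.15, 0.30, 0.38, 0.41]
switch_at = first_switch_index(samples)
```

With these made-up samples the slope first drops at the fifth sample, which is
where a program following this rule would change the selection mode.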
4.6 Implementation
In this section, we describe the details of our implementation, including
pseudo code for all heuristics. In order to observe and evaluate each heuristic
separately, we implement four heuristics and give each of them a name.
First of all, we introduce the whole process of the Louvain method and one
important function, so that the differences of our heuristics can be expressed
more clearly.
Algorithm 1 Louvain Method
Require: G = {V,E}
Ensure: community detection result of G
1: Read G;
2: improvement ⇐ false;
3: do
4: improvement ⇐ one level();
5: ResetGraph();
6: while improvement
7: Print result;
Algorithm 1 represents the Louvain method, where the functions one level() and
ResetGraph() correspond to Phase one and Phase two of Louvain [ See section
3.3 ]. In one level(), every node is visited, and ∆modularity is computed for
each pair of the node and one of its neighbor communities in order to select
the best neighbor community, i.e., the one with the greatest ∆modularity
value. After that, the whole network is reset to a new network whose nodes are
the communities found during the first phase. The variable improvement becomes
true when the modularity of the whole network is increased by running
one level().
The function one level() is the most important function in the Louvain method,
because it is where the ∆modularity computations are done. Our proposals are
also added here.
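As an illustration of Phase two, the following Python sketch collapses every
community into a single node. The name reset_graph, the adjacency-dictionary
representation and the example labels are assumptions for this sketch, not the
thesis code.

```python
from collections import defaultdict


def reset_graph(adj, comm):
    """Build the community-level graph.

    adj:  dict node -> dict neighbor -> weight (symmetric).
    comm: dict node -> community label.
    Weights between two communities are summed. Internal edges become a
    self-loop whose weight counts each internal edge twice, once per direction.
    """
    new_adj = defaultdict(lambda: defaultdict(float))
    for v, nbrs in adj.items():
        for u, w in nbrs.items():
            new_adj[comm[v]][comm[u]] += w
    return {c: dict(nbrs) for c, nbrs in new_adj.items()}


# Two triangles joined by one edge, already split into communities A and B.
adj = {
    0: {1: 1.0, 2: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {0: 1.0, 1: 1.0, 3: 1.0},
    3: {2: 1.0, 4: 1.0, 5: 1.0}, 4: {3: 1.0, 5: 1.0}, 5: {3: 1.0, 4: 1.0},
}
comm = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
coarse = reset_graph(adj, comm)
```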
Algorithm 2 one level()
Require: G = {V,E}
Ensure: improvement
1: improvement ⇐ false, moves ⇐ 0;
2: newMod ⇐ getModularity(G);
3: currentMod ⇐ newMod;
4: do
5: currentMod ⇐ newMod;
6: for ∀v ∈ V do
7: bestComm ⇐ currentComm;
8: ▷ currentComm means node v’s currentCommunity
9: remove(v,currentComm);
10: ∆Modmax ⇐getDeltaMod(v,currentComm);
11: C′ ⇐getNeighborCommunity(v);
12: for Ci ∈ C′ do
13: ∆Mod ⇐getDeltaMod(v, Ci);
14: if ∆Mod > ∆Modmax then
15: ∆Modmax ⇐ ∆Mod;
16: bestComm ⇐ Ci;
17: end if
18: end for
19: insert(v,bestComm);
20: if bestComm ≠ currentComm then
21: moves++;
22: end if
23: end for
24: if moves > 0 then
25: improvement ⇐ true;
26: end if
27: newMod ⇐ getModularity(G);
28: while moves > 0 and (newMod - currentMod) > minMod
29: return improvement;
From Algorithm 2, we can see the detailed process of the whole computation. It
repeats two steps, computing ∆modularity for each node-neighbor community
pair ( Line 12-18 ) and selecting the best neighbor community ( Line 7,9,14,19 ),
until there is no node movement or the change in the modularity of the whole
network is less than a threshold ( Line 2,27,28 ). The variable improvement,
which is set to true when there is node movement ( Line 24-25 ), is returned
to Algorithm 1 to decide whether to do the next iteration after resetting the
network ( Line 6 of Algorithm 1 ). The function getDeltaMod ( Line 10,13 ) is
the part that computes ∆modularity. The number of times this function is run
is counted when computing MCE [ See section 4.2 ].
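The loop of Algorithm 2 can be sketched in Python as follows. This is only an
illustration under stated assumptions: the graph is a symmetric adjacency
dictionary without self-loops, the gain used is the usual quantity proportional
to ∆modularity, and all names ( modularity, one_level, build_graph,
evaluations ) are hypothetical.

```python
from collections import defaultdict


def modularity(adj, comm):
    """Modularity of the partition comm on a symmetric weighted graph."""
    m2 = sum(w for nbrs in adj.values() for w in nbrs.values())  # 2 * total weight
    inside, tot = defaultdict(float), defaultdict(float)
    for v, nbrs in adj.items():
        for u, w in nbrs.items():
            tot[comm[v]] += w
            if comm[u] == comm[v]:
                inside[comm[v]] += w
    return sum(inside[c] / m2 - (tot[c] / m2) ** 2 for c in tot)


def one_level(adj, min_mod=1e-6):
    """Phase one. Returns (node -> community, number of gain evaluations)."""
    m2 = sum(w for nbrs in adj.values() for w in nbrs.values())
    degree = {v: sum(nbrs.values()) for v, nbrs in adj.items()}
    comm = {v: v for v in adj}          # each node starts in its own community
    tot = dict(degree)                  # total degree per community
    evaluations = 0
    new_mod = modularity(adj, comm)
    while True:
        current_mod = new_mod
        moves = 0
        for v in adj:
            old = comm[v]
            links = defaultdict(float)  # weight from v to each neighbor community
            for u, w in adj[v].items():
                links[comm[u]] += w
            tot[old] -= degree[v]       # remove(v, currentComm)
            best = old
            best_gain = links[old] - tot[old] * degree[v] / m2
            evaluations += 1
            for c, k_vc in list(links.items()):
                if c == old:
                    continue
                gain = k_vc - tot[c] * degree[v] / m2  # proportional to delta modularity
                evaluations += 1
                if gain > best_gain:
                    best, best_gain = c, gain
            tot[best] += degree[v]      # insert(v, bestComm)
            comm[v] = best
            if best != old:
                moves += 1
        new_mod = modularity(adj, comm)
        if moves == 0 or new_mod - current_mod <= min_mod:
            return comm, evaluations


def build_graph(edges):
    """Unweighted undirected graph as a symmetric adjacency dictionary."""
    adj = defaultdict(dict)
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    return dict(adj)


# Two triangles joined by a single edge.
graph = build_graph([(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
communities, evaluations = one_level(graph)
```

On this toy graph the sketch groups each triangle into one community. The
counter evaluations plays the role of the number of getDeltaMod calls.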
One point we want to explain in particular is the part between Line 12 and 18.
In this part, the Louvain method uses all of node v's neighbor communities for
computation and selection, while our basic heuristic uses only part of them,
relying on the small world property.
Algorithm 3 one level() in Basic Heuristic
Require: G = {V,E}
Ensure: improvement
1: ......
2: do
3: currentMod ⇐ newMod;
4: for ∀v ∈ V do
5: ......
6: C′ ⇐getNeighborCommunity(v);
7: if |C′| > 3 then
8: ∃C′′, C′′ ∈ {A : a set | A ⊆ C′, |A| = 3}
9: else
10: C′′ ⇐ C′;
11: end if
12: for Ci ∈ C′′ do
13: ∆Mod ⇐getDeltaMod(v, Ci);
14: if ∆Mod > ∆Modmax then
15: ∆Modmax ⇐ ∆Mod;
16: bestComm ⇐ Ci;
17: end if
18: end for
19: ......
20: end for
21: ......
22: while moves > 0 and (newMod - currentMod) > minMod
23: return improvement;
Algorithm 3 describes our basic heuristic. As we mentioned in section 4.1, it
randomly selects 3 neighbor communities each time to compute ( Line 7-8 ).
When there are 3 or fewer neighbor communities, it does the computation for
all of them ( Line 9-10 ).
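The random selection step can be sketched in Python as below. The function
name and the rng argument are assumptions for this example. The limit of 3
follows Algorithm 3.

```python
import random


def select_random_communities(neighbor_comms, limit=3, rng=random):
    """Return at most `limit` neighbor communities, chosen uniformly at random.

    When there are `limit` or fewer neighbor communities, all are returned.
    """
    comms = list(neighbor_comms)
    if len(comms) > limit:
        return rng.sample(comms, limit)
    return comms


# A fixed seed is used here only to make the illustration repeatable.
picked = select_random_communities(range(10), rng=random.Random(0))
few = select_random_communities(["a", "b"])
```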
The second heuristic, which selects neighbor communities in decreasing order
of their ∥vi → Ci∥ value, is as follows. We call it the max neighbor heuristic.
Algorithm 4 one level() in Max Neighbor Heuristic
Require: G = {V,E}
Ensure: improvement
1: ......
2: do
3: currentMod ⇐ newMod;
4: for ∀v ∈ V do
5: ......
6: C′ ⇐getNeighborCommunity(v);
7: if |C′| > 3 then
8: C′′ ⇐ getMaxNeighborCommunity(v, 3);
9: else
10: C′′ ⇐ C′;
11: end if
12: for Ci ∈ C′′ do
13: ∆Mod ⇐getDeltaMod(v, Ci);
14: if ∆Mod > ∆Modmax then
15: ∆Modmax ⇐ ∆Mod;
16: bestComm ⇐ Ci;
17: end if
18: end for
19: ......
20: end for
21: ......
22: while moves > 0 and (newMod - currentMod) > minMod
23: return improvement;
Different from the basic heuristic, Algorithm 4 selects the 3 neighbor
communities with the three largest ∥vi → Ci∥ values. The function
getMaxNeighborCommunity returns the set C′′ consisting of these 3 members
( Line 8 ), which can be expressed as follows.
\[ C'' \subseteq C',\quad |C''| = 3,\quad \forall C_i \in C'',\ \forall B_i \in C' \setminus C'' :\ \|v \to C_i\| \ge \|v \to B_i\| \]
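A possible Python sketch of getMaxNeighborCommunity is shown below. The
signature with explicit adj and comm arguments is an assumption for this
example, since the pseudo code only passes v and the limit. The connection
strength is taken as the summed edge weight from v to each community.

```python
import heapq
from collections import defaultdict


def get_max_neighbor_communities(v, adj, comm, limit=3):
    """Return up to `limit` neighbor communities of v with the largest
    total edge weight from v."""
    weight_to = defaultdict(float)
    for u, w in adj[v].items():
        if u != v:
            weight_to[comm[u]] += w
    return heapq.nlargest(limit, weight_to, key=weight_to.get)


# Node 0 has seven neighbors spread over four communities.
adj = {0: {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0, 6: 1.0, 7: 1.0}}
comm = {1: "a", 2: "a", 3: "b", 4: "b", 5: "b", 6: "c", 7: "d"}
top = get_max_neighbor_communities(0, adj, comm)
```

Here community b ( weight 3 ) and community a ( weight 2 ) are always
selected, and the third place goes to one of the two communities of weight 1.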
The third heuristic is the changed neighbor heuristic. In this heuristic, we
only compute nodes whose neighbor communities were changed during the previous
computation. In order to identify these changed communities, a variable is
kept for each community to remember whether it has been changed. Then the
pseudo code can be written as follows.
Algorithm 5 one level() in Changed Neighbor Heuristic
Require: G = {V,E}
Ensure: improvement
1: ......
2: Init Array isChanged ⇐ false;
3: isCompute = false;
4: do
5: currentMod ⇐ newMod;
6: for ∀v ∈ V do
7: ......
8: C′ ⇐getNeighborCommunity(v);
9: if |C′| > 3 then
10: ∃C′′, C′′ ∈ {A : a set | A ⊆ C′, |A| = 3}
11: else
12: C′′ ⇐ C′;
13: end if
14: if isChanged[∃Ci ∈ C′′]==true then
15: isCompute = true;
16: end if
17: for Ci ∈ C′′ and isCompute==true do
18: ∆Mod ⇐getDeltaMod(v, Ci);
19: if ∆Mod > ∆Modmax then
20: ∆Modmax ⇐ ∆Mod;
21: bestComm ⇐ Ci;
22: end if
23: end for
24: ......
25: if bestComm ≠ currentComm then
26: moves++;
27: isChanged[bestComm]=true;
28: isChanged[currentComm]=true;
29: end if
30: end for
31: ......
32: while moves > 0 and (newMod - currentMod) > minMod
33: return improvement;
Here, Line 2 marks all communities in the network as unchanged, while Line 27
and 28 mark the communities where a node movement ( remove or insert ) took
place as changed. Using the variable isChanged, Line 14-16 identify the nodes
whose neighbor communities have been changed. A node with at least one changed
neighbor community is selected for computation. Through this countermeasure,
one can reduce unnecessary computation, and it is more effective in the slowly
changing part [ See section 4.4 ], since the number of movable nodes becomes
smaller and smaller.
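The bookkeeping behind isChanged can be sketched in Python as follows. The
helper names are hypothetical, and keeping the flags in a dictionary is an
assumption of this sketch. Unlike the single isCompute variable in the pseudo
code, the check here is evaluated separately for every node.

```python
def should_compute(neighbor_comms, is_changed):
    """True if at least one neighbor community is marked as changed."""
    return any(is_changed.get(c, False) for c in neighbor_comms)


def record_move(is_changed, old_comm, new_comm):
    """Mark both communities touched by a node movement as changed."""
    if old_comm != new_comm:
        is_changed[old_comm] = True   # the node was removed from here
        is_changed[new_comm] = True   # the node was inserted here


is_changed = {"a": False, "b": False, "c": False}
skip_before = not should_compute(["a", "b"], is_changed)
record_move(is_changed, "b", "c")
compute_after = should_compute(["a", "b"], is_changed)
```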
Finally, we implement a hybrid heuristic to obtain a more effective algorithm
by merging all of the above ideas into one heuristic, which can absorb the
advantages of each of the above heuristics. As for Max neighbor selection,
because ∥v → Ci∥ has the same value for all v ∈ V and all Ci ∈ C ( 1 when the
network is unweighted ) at the initial time, the heuristic selects neighbor
communities randomly for a period of time and switches to Max neighbor
selection at an appropriate time.
Algorithm 6 gives the details of the hybrid heuristic. The variables m and m′
store the rate of change of the network's modularity [ See section 4.5 ],
which is computed every fixed number of computations ( here every 10000
computations, Line 31-35 ). By comparing these two rates of change, we can
decide whether it is time to switch to Max neighbor community selection mode.
Algorithm 6 one level() in Hybrid Heuristic
Require: G = {V,E}
Ensure: improvement
1: ......
2: computeTime ⇐ 0; ▷ store ∆modularity’s compute time
3: switch ⇐ false; ▷ control whether switch or not
4: m,m′ ⇐ 0; ▷ store network’s modularity changing rate
5: beforeMod ⇐ 0; ▷ store modularity 10000 computation times before
6: do
7: currentMod ⇐ newMod;
8: for ∀v ∈ V do
9: ......
10: if |C′| > 3 then
11: if switch==false then
12: ∃C′′, C′′ ∈ {A : a set | A ⊆ C′, |A| = 3}
13: else
14: C′′ ⇐ getMaxNeighborCommunity(v, 3);
15: end if
16: else
17: C′′ ⇐ C′;
18: end if
19: if isChanged[∃Ci ∈ C′′]==true then
20: isCompute = true;
21: end if
22: for Ci ∈ C′′ and isCompute==true do
23: ......
24: computeTime++;
25: end for
26: ......
27: if bestComm ≠ currentComm then
28: moves++;
29: isChanged[bestComm]=true; isChanged[currentComm]=true;
30: end if
31: if computeTime ≥ 10000 then
32: m′ ⇐ (getModularity() − beforeMod) / computeTime;
33: if m′ < m then switch ⇐ true; end if
34: m ⇐ m′; beforeMod ⇐ getModularity(); computeTime ⇐ 0;
35: end if
36: end for
37: ......
38: while moves > 0 and (newMod - currentMod) > minMod
39: return improvement;
Chapter 5
Evaluation
In this chapter, we conduct several experiments to compare and evaluate our
heuristics. The experiments are done in three ways to evaluate MCE, time,
modularity and community contents. For community contents, Normalized Mutual
Information ( NMI ) is used to measure how much information is shared between
the extracted communities and the true communities existing in the network.
Both synthetic and empirical networks are studied here. As synthetic networks,
we use the Watts-Strogatz model and the Barabasi-Albert model to obtain a
small world network and a scale-free network. Another generation model is also
used to make networks with known planted community structure, to which our
heuristics are applied. The empirical network datasets are described in detail
in each section before being used.
5.1 Modularity computation efficiency
In this section, we conduct evaluations to confirm the effectiveness of our
heuristics on Modularity Computation Efficiency ( MCE ). In the experiments,
we use a number of empirical networks that are commonly used for efficiency
comparison and two synthetic networks generated by the Watts-Strogatz model
and the Barabasi-Albert model.
All of the empirical datasets used here were downloaded from the Stanford
Large Network Dataset Collection ( https://snap.stanford.edu/data/ ). The
following is the description of each dataset. In table 5.1, 2|E|/|V| is the
average degree of the networks.
• Web-Google: Web-Google is a web network where nodes represent web pages
and edges represent hyperlinks between them. It consists of 875,713 nodes
and 5,105,039 edges. [37]
• DBLP: DBLP is a co-authorship network where two authors are connected
if they have published at least one paper together. It consists of 317,080
nodes and 1,049,866 edges. [64]
• YouTube: YouTube is a video-sharing web site where a user can create a
group that shares only videos of a particular theme, person or event with
users who have a similar hobby or interest. Therefore, the communities in it
are user-defined groups, and users are connected by friendship. It consists
of 1,134,890 nodes and 2,987,624 edges. [64]
• Pokec: This dataset is also from an on-line social network. Friendships
between users form the edges, and it consists of 1,632,803 nodes and
30,622,564 edges. [58]
Table 5.1: Empirical Datasets
            Web-Google   DBLP        YouTube     Pokec
|V|         875,713      317,080     1,134,890   1,632,803
|E|         5,105,039    1,049,866   2,987,624   30,622,564
2|E|/|V|    11           7           5           37.5
Meanwhile, in order to evaluate the performance on a pure small world network
and a scale-free network, we generate network data using the Watts-Strogatz
model and the Barabasi-Albert model. From experience, although it has not been
proved and there is no analytical result about it, scale-free networks made
from the Barabasi-Albert model also contain community structures, while their
average path length increases with the size of the network. Further, they
include some high-degree nodes ( called hubs ), which are common in the real
world and are not included in networks from the Watts-Strogatz model.
Therefore, we want to observe the performance of our heuristics on networks
with the pure small world property and on networks with hubs.
The Watts-Strogatz model is a network generation model that produces networks
with the so-called small world property. Therefore, the generated data should
contain community structures and have short average path lengths. It was
proposed by Duncan J. Watts and Steven Strogatz, and the generation process is
as follows.
• Set the number of nodes in the network N, the mean degree K and a
probability p.
• Make a ring lattice of N nodes, in which each node is connected to K
neighbors, K/2 on each side. So if we label the nodes n_0 ... n_{N−1}, two
nodes n_i, n_j are connected initially when their labels satisfy the
inequality
\[ 0 < |i - j| \bmod \left(N - 1 - \tfrac{K}{2}\right) \le \tfrac{K}{2}. \]
• For every node n_i, each of its edges (n_i, n_j) with i < j is rewired with
probability p.
Here, p is the probability that a node connects to nodes in communities
different from its own, while a node and its neighbor nodes form a community.
We generate networks with the parameters N = 1,000,000, K = 20, p = 0.1.
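The generation steps above can be sketched in plain Python. This is not the
igraph routine used for the experiments: the function name, the seed argument
and the choice of rewiring only the far end of each lattice edge to a
non-neighbor are assumptions of this sketch.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Return node -> set of neighbors for a small world graph sketch."""
    rng = random.Random(seed)
    half = k // 2
    nbrs = {i: set() for i in range(n)}
    # Ring lattice: each node is linked to half neighbors on each side.
    for i in range(n):
        for d in range(1, half + 1):
            j = (i + d) % n
            nbrs[i].add(j)
            nbrs[j].add(i)
    # Rewire each lattice edge (i, j) with probability p.
    for d in range(1, half + 1):
        for i in range(n):
            j = (i + d) % n
            if rng.random() < p:
                candidates = [t for t in range(n)
                              if t != i and t not in nbrs[i]]
                if candidates:
                    new = rng.choice(candidates)
                    nbrs[i].discard(j)
                    nbrs[j].discard(i)
                    nbrs[i].add(new)
                    nbrs[new].add(i)
    return nbrs


lattice = watts_strogatz(20, 4, 0.0, seed=1)   # p = 0 keeps the pure ring lattice
rewired = watts_strogatz(20, 4, 0.5, seed=1)
```

With p = 0 the result is the ring lattice described by the inequality above,
and rewiring keeps the number of edges at N·K/2.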
The Barabasi-Albert model is another generation model, which generates
scale-free networks. It was proposed by Albert-Laszlo Barabasi and Reka
Albert. The model incorporates two general concepts called growth and
preferential attachment, which are widely observed in real networks. Here,
growth means that the number of nodes in the network increases over time,
while preferential attachment means that, as the network grows, the more
connected a node is, the more likely it is to receive new links. As a result,
the model produces some nodes with significantly high degree, and with these
nodes the network has a power-law ( or scale-free ) degree distribution. The
algorithm of the model is as follows.
• Give a connected network with N nodes as an initial input.
• New nodes are added to the network one at a time.
• The probability that a new node connects to an existing node n_i is
\[ p_i = \frac{k_i}{\sum_j k_j}, \]
where k_i is the degree of node n_i, and the denominator is the sum of the
degrees of all nodes in the network.
Here, high-degree nodes have a high probability of being connected to a new
node, while low-degree nodes have less chance of being connected, so that new
nodes have a "preference" to attach themselves to the already heavily
connected nodes.
Table 5.2: Synthetic Datasets
            small world        scale free
|V|         1,000,000          10,000
|E|         40,000,000         99,970
2|E|/|V|    80                 19
Parameter   k = 20, p = 0.1    m = 10, power = 0.6
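The attachment rule can be illustrated with a short Python sketch. It uses the
linear probability p_i = k_i / Σ_j k_j given above, so it does not model the
power parameter of the generator, and the helper names and the degree
dictionary are assumptions of this example.

```python
import random


def attachment_probabilities(degrees):
    """p_i = k_i / sum_j k_j for every existing node."""
    total = sum(degrees.values())
    return {node: k / total for node, k in degrees.items()}


def add_node(degrees, new_node, m, rng):
    """Attach new_node to up to m distinct existing nodes chosen by degree.

    degrees is updated in place and the chosen targets are returned.
    """
    eligible = [n for n, k in degrees.items() if k > 0]
    probs = attachment_probabilities(degrees)
    weights = [probs[n] for n in eligible]
    targets = set()
    while len(targets) < min(m, len(eligible)):
        targets.add(rng.choices(eligible, weights=weights)[0])
    for t in targets:
        degrees[t] += 1
    degrees[new_node] = len(targets)
    return targets


probs = attachment_probabilities({"hub": 3, "leaf": 1})
degrees = {0: 1, 1: 1}                  # a single initial edge 0 - 1
targets = add_node(degrees, 2, 2, random.Random(0))
```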
Table 5.2 shows the datasets we generated. We used the igraph package for R, a
free software environment for statistical computing and graphics. The
parameters in the table are those used for generation: the former two, k and
p, have the same meanings as described for the Watts-Strogatz model above,
while the latter two, m and power, are the number of edges added in each step
and the power of the preferential attachment.
On the above six datasets, we do community detection with our four heuristics
and the Louvain method. In order to compare MCE, each program stores
modularity and ∆modularity every 10000 computations, and we plot these
records. The stored data cover only the first execution of phase one in the
whole detection process. The experiments were done on a TSUBAME interactive
node with 6GB RAM and an Intel Xeon X5670 CPU at 2.93GHz.
Figure 5.1 to Figure 5.12 show the results. On the whole, we can see that the
hybrid version has outstanding modularity computation efficiency. It arrives
at the final value in the shortest time with the minimal number of
computations, owing to the countermeasures of computing only part of the
neighbors and selecting neighbors in decreasing order of node-neighbor
community connection strength, and it also converges faster than the other
heuristics by reducing invalid calculation ( invalid calculation means that
there is no node movement even though the node-neighbor pairs were computed ).
[ Figures 5.1 to 5.12: line plots comparing the Louvain Method, Basic
Heuristic, Max Neighbor Heuristic, Change Neighbor Heuristic and Hybrid
Heuristic. Horizontal axis: Compute Times ( *10000 ). Vertical axis:
Modularity or Modularity Computation Efficiency. ]
Figure 5.1: Modularity of DBLP
Figure 5.2: MCE of DBLP
Figure 5.3: Modularity of Web-Google
Figure 5.4: MCE of Web-Google
Figure 5.5: Modularity of YouTube
Figure 5.6: MCE of YouTube
Figure 5.7: Modularity of Pokec
Figure 5.8: MCE of Pokec
Figure 5.9: Modularity of Small world
Figure 5.10: MCE of Small world
Figure 5.11: Modularity of Scale free
Figure 5.12: MCE of Scale free
The datasets Pokec and Small world show the greatest change in MCE from the
Louvain method, since these two datasets have a higher average degree than the
other datasets. This suggests that the higher the average degree of the
network, the more effective our heuristics are, under the condition that the
network has a clear community structure ( or a high modularity value ).
On the scale free dataset [ Figure 5.11 and Figure 5.12 ], where there is
almost no community structure, our heuristics also achieve high MCE. It means
our basic idea is still effective even when there is no clear community
structure in the network data. However, the modularity [ See Figure 5.11,
5.12 ] is lower than that of the Louvain method. This can be explained by the
characteristics of scale free networks. A scale free network contains some
nodes with exceptionally high degree [ See section 2.3 ], which are called
hubs in the network literature. These hubs account for a large part of the
degrees in the network, though they are few in number. So the probability that
a hub's neighbor communities finally belong to the same community as the hub
should be very low when the network data has no clear community structure. As
a result, when one computes a hub h1, the neighbor communities that should
finally belong to the same community as h1 ( like Community A, B and C in
figure 4.2 ) are selected with a low probability, which may lead the hub to an
incorrect community with a high probability. Furthermore, when a hub belongs
to an incorrect community, it affects the connection strength between the
incorrect community and its connected nodes.
The above is the reason why our heuristics finally reach a lower modularity
value than the Louvain method. To verify our argument, we additionally do
community detection on another scale free network, this time with a clear
community structure. From the result [ See figure 5.13 ], we can see the same
tendency in the change of modularity.
[ Figure 5.13: two line plots comparing the Louvain Method, Basic Heuristic,
Max Neighbor Heuristic, Change Neighbor Heuristic and Hybrid Heuristic.
Horizontal axis: Compute Times ( *10000 ). Vertical axes: Modularity and
Modularity Computation Efficiency. ]
Figure 5.13: MCE and modularity change of another scale free dataset, which
consists of 1,000,000 nodes and 1,999,998 edges. It contains a clear community
structure with a high modularity value of 0.98.
The dataset YouTube is another point to note, since its modularity changes
differently from the others, especially with the basic and change neighbor
heuristics [ Figure 5.5 and 5.6 ]. In figure 5.5, the modularity changes
slowly after the value of 0.3, though it finally reaches the same modularity
as with the other heuristics. We believe the reason is also related to the
characteristics of the dataset. As we mentioned in the description of the
datasets, communities in YouTube consist of users who have similar interests
or hobbies. The groups may share videos on a particular theme, person or
event. There are presumably some users ( for example, accounts of governments
or organizations ) who are connected to a very large number of other users
because of the popularity of the videos they share, but such users are few in
number. These users could become hubs and, moreover, the community structure
around them would be unclear even though the modularity of the whole network
is high, since these users are connected to many kinds of users but almost
never take the initiative to connect to other users. On the contrary, most
users would connect with only a few users, since most users do not join a
particular group and share various types of videos freely. Therefore, the
dataset YouTube shows a tendency similar to the scale free network, which has
a low modularity value, even though YouTube itself has a clear community
structure ( or a high modularity value ).
5.2 Time and modularity
For the comparison of time and modularity between the Louvain method and our
four heuristics, we continue to use the six datasets mentioned in section 5.1.
First, regarding modularity, all of our heuristics finally reach almost the
same modularity values as the Louvain method, which supports the correctness
of our proposal [ The modularity of each dataset is given in Table 5.3 ]. In
particular, on the scale free network with low modularity ( which means there
is almost no community structure in it ), they also reach the same value in
the end, even though in figure 5.11 they show a lower modularity value in the
first execution of phase one.
Table 5.3: Modularity of all datasets
Dataset Modularity Dataset Modularity
Web-Google 0.98 Pokec 0.72
DBLP 0.82 Small world 0.77
YouTube 0.72 Scale free 0.27
On the other hand, regarding the running time of the heuristics and the
Louvain method, there is a large gap between them [ See figure 5.14 ] when the
data size becomes large. Especially when the dataset reaches a million nodes,
there is a gap of 3 to 6 times between them. One interesting point is that, on
the dataset YouTube, our hybrid and max neighbor heuristics have a speed close
to that of the Louvain method. It may also be related to the hubs included in
YouTube, since on the dataset small world, which has no hubs, our heuristics
are about 6 times slower than the Louvain method.
The Louvain method is the fastest overall, even though it has the worst
modularity computation efficiency. It means that, although our heuristics can
compute and detect communities with high efficiency, the cost of each
node-neighbor community computation is far higher than in the Louvain method,
so decreasing the cost per computation is one of our future works.
[ Figure 5.14: bar chart of the running time in seconds for each dataset and
method, with the following bar labels. ]
Method                      Web-Google   DBLP    YouTube   Pokec    Small world
Louvain                     25.7         7.49    13.98     80.27    22.49
Basic Heuristic             31.63        19.45   48.26     222.87   90.93
Max Neighbor Heuristic      17.35        10.99   19.76     202.05   43.39
Change Neighbor Heuristic   23.31        13.44   51.31     223.98   86.64
Hybrid Heuristic            16.28        7.51    24.5      217.09   115.82
Figure 5.14: Time(s) of each Heuristics
Meanwhile, for datasets at the level of hundreds of thousands of nodes, our
hybrid heuristic completes community detection in almost the same time as the
Louvain method on DBLP, while on the dataset Web-Google it is faster than the
Louvain method, even though Web-Google is larger than DBLP. This may be
because the dataset Web-Google has a higher average degree and a clear
community structure.
As for the dataset scale free, because its runtime is too short to show in
figure 5.14, we give the times numerically as follows.
Table 5.4: Time of dataset scale free
Method Time ( s ) Method Time ( s )
Louvain 0.22 Basic Heuristic 0.44
Max Neighbor Heuristic 0.34 Change Neighbor Heuristic 0.36
Hybrid Heuristic 0.36
5.3 Community contents
In this section, we do some experiments to evaluate performance by
quantitatively comparing the community assignments found by the community
detection algorithms.
As a metric, we use Normalized Mutual Information ( NMI ), which is based on
information theory concepts. It is defined as follows [11]. Let C_d be a
detected community, C_r a true community existing in the network ( we call it
a real community ), and n_{d,r} = |C_d ∩ C_r| the number of nodes shared
between the detected community C_d and the real community C_r. Then define
\[ P(X = d, Y = r) = \frac{n_{d,r}}{|V|} \]
to be the joint probability that a randomly selected node is in both
community C_d and community C_r. Using this joint probability over the random
variables X and Y, one can compute the mutual information, which measures the
mutual dependence of the two random variables.
\[ I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x, y) \log\left(\frac{p(x, y)}{p(x)\,p(y)}\right) \tag{5.1} \]
Equation 5.1 represents the mutual information between the two variables.
Here, p(x) = |C_x| / |V| is the probability of one community ( a detected or a
real one ). With mutual information, one can measure the similarity between
the detected and the real community assignments, and in order to fit the value
between 0 and 1, it is normalized by Shannon entropy [54]. The value is 1 if
the two assignments are identical and 0 if they are uncorrelated. The
normalized mutual information is then as follows, where H(X) and H(Y) are the
Shannon entropies of the random variables X and Y.
\[ H(X) = -\sum_{x \in X} p(x) \log p(x) \tag{5.2} \]
\[ H(Y) = -\sum_{y \in Y} p(y) \log p(y) \tag{5.3} \]
\[ I_{norm} = \frac{2 I(X;Y)}{H(X) + H(Y)} \tag{5.4} \]
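The definitions above can be checked with a small Python sketch. The function
name and the representation of an assignment as a dictionary from node to
community label are assumptions for this example. The mutual information is
computed in its usual non-negative form, and returning 1.0 when both entropies
are zero is a convention chosen here.

```python
import math
from collections import Counter


def normalized_mutual_information(detected, real):
    """NMI between two community assignments over the same node set.

    detected, real: dict node -> community label.
    """
    n = len(detected)
    joint = Counter((detected[v], real[v]) for v in detected)
    px = Counter(detected.values())
    py = Counter(real[v] for v in detected)
    # Mutual information I(X;Y) from the joint and marginal probabilities.
    mi = sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
             for (x, y), c in joint.items())
    # Shannon entropies H(X) and H(Y).
    hx = -sum((c / n) * math.log(c / n) for c in px.values())
    hy = -sum((c / n) * math.log(c / n) for c in py.values())
    if hx + hy == 0:
        return 1.0   # both assignments put every node in one community
    return 2 * mi / (hx + hy)


detected = {0: "a", 1: "a", 2: "b", 3: "b"}
same = normalized_mutual_information(detected, {0: "x", 1: "x", 2: "y", 3: "y"})
unrelated = normalized_mutual_information(detected, {0: "x", 1: "y", 2: "x", 3: "y"})
```

Identical assignments ( up to renaming of the labels ) give a value of 1, and
the uncorrelated assignment in the example gives 0.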
On the other hand, to compare NMI, we need datasets whose real community
assignments are known in advance. Here, we generate synthetic networks with
planted community structures, since we want to know how NMI changes when only
one condition is changed ( e.g., when degree or modularity is the only
changing value ), and real data would never satisfy this need. As a generation
model, we use the model from the paper of Lancichinetti et al. [34]. Using
this model, one can generate network data that has both the scale-free and the
small world property. As parameters, one can specify the average degree k and
the mixing parameter µ, which determines the probabilities p1 and p2. Here, p1
is the probability of a connection between two nodes of the same community,
while p2 is the probability of a connection between two nodes of different
communities. In addition, one can specify the number of nodes ( N ), the range
of degree ( k ), the range of community size ( minc, maxc ), and the exponents
of the degree distribution ( t1 ) and of the community size distribution
( t2 ). For our experiments, we mainly use k, µ and C to generate network data
whose degree or modularity changes gradually. We generate three sets of data,
where one consists of three networks and the other two consist of five
networks each.
Table 5.5: Datasets1 ( paired values are Louvain / Hybrid )
             Data1                 Data2                 Data3
Parameters
  N          100,000               30,000                1,000
  k          10                    100                   10
  µ          0.1                   0.5                   0.01
  minc       3,000                 1,000                 10
  maxc       4,000                 4,000                 20
|V|          100,000               30,000                1,000
|E|          1,000,000             2,988,272             4,999
2|E|/|V|     10                    199.2                 19.8
p1/p2        0.26e-02 / 0.10e-06   0.27e-01 / 0.18e-02   0.68 / 0.10e-03
|C|          29                    14                    62
|C′|         11 / 27               14 / 11               62 / 62
Modularity   0.73 / 0.76           0.42 / 0.40           0.97 / 0.97
NMI          0.32 / 0.34           0.25 / 0.25           0.61 / 0.61
Table 5.6: Datasets2
             Data4      Data5      Data6      Data7       Data8
Parameters
  N          10,000     10,000     10,000     10,000      10,000
  k          50         100        150        200         400
  µ          0.08       0.08       0.08       0.08        0.08
  minc       100        100        100        100         100
  maxc       300        300        300        300         300
|V|          10,000     10,000     10,000     10,000      10,000
|E|          250,000    500,000    750,000    1,000,000   2,000,000
2|E|/|V|     50         100        150        200         400
p1           0.30       0.55       0.65       0.74        0.76
p2           0.41e-03   0.82e-03   0.12e-02   0.1e-02     0.34e-02
|C|          60         54         45         39          20
|C′|         60         54         45         39          20
Modularity   0.90       0.90       0.90       0.89        0.87
Table 5.7: Datasets3
             Data9      Data10     Data11     Data12     Data13
Parameters
  N          10,000     10,000     10,000     10,000     10,000
  k          30         30         30         30         30
  µ          0.01       0.1        0.2        0.5        0.7
  minc       100        100        100        100        100
  maxc       300        300        300        300        300
|V|          10,000     10,000     10,000     10,000     10,000
|E|          150,000    150,000    150,000    150,000    150,000
2|E|/|V|     30         30         30         30         30
p1           0.18       0.16       0.13       0.09       0.56e-01
p2           0.31e-06   0.31e-03   0.92e-03   0.15e-02   0.21e-02
|C|          56         54         55         56         56
|C′|         57         54         56         56         13
Modularity   0.96       0.88       0.68       0.48       0.25
Using Datasets1 [ table 5.5 ], we compared NMI between the Louvain method and
our hybrid heuristic, where |C| and |C′| are the numbers of real communities
and detected communities. From the data we can see that the modularity and NMI
of both are almost the same in every case, which supports the correctness of
our proposal, so from here on we only consider the hybrid heuristic.
Then, with Datasets2 and Datasets3, we do two experiments to observe the
change of NMI. With Datasets2, we observe the change of NMI when the average
degree increases while the other conditions stay the same, while in Datasets3
modularity is the only element that differs. The results are shown in Figure
5.15 and Figure 5.16.
[ Figure 5.15: plot of Normalized mutual information ( vertical axis, 0.3 to
0.46 ) against Average degree of network ( horizontal axis: 50, 100, 150, 200,
400 ). ]
Figure 5.15: The comparison of NMI when only average degree is different.
[ Figure 5.16: plot of Normalized mutual information ( vertical axis, 0.425 to
0.475 ) against Modularity ( horizontal axis: 0.96, 0.88, 0.68, 0.48, 0.25 ). ]
Figure 5.16: The comparison of NMI when only modularity is different.
The results reveal two interesting points. First, NMI is higher when a network
has a low average degree and a clear community structure (see Data2 and
Datasets2). In other words, the accuracy of the Louvain method (or our
heuristic) increases as the small world property of the network becomes
stronger, since low degree and clear community structure are its components.
This is to be expected, because community detection methods are designed to
detect communities that already exist.
Second, in some cases the method obtains a high NMI value simply by producing
fewer communities than actually exist (see Data4, Data1 and Data13). Although we
cannot draw any conclusion about this yet, we find it very interesting: it
means that, in some cases, the method cannot identify invalid node movements,
which may be caused by the network itself, by a drawback of the Louvain method
(or our heuristic), or even by hubs. This observation may also suggest ways to
improve the accuracy, which is one of our future works.
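To make the NMI comparisons in this chapter more concrete, the following sketch computes NMI between a ground-truth labeling and a detected labeling. It assumes the normalization NMI = 2 I(A;B) / (H(A) + H(B)) described by Danon et al. [11]; the exact variant used in our experiments may differ, and the function names and toy labelings are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(true_labels, found_labels):
    """NMI = 2 I(A;B) / (H(A) + H(B)); defined as 1.0 when both
    labelings consist of a single community."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, found_labels))
    count_a = Counter(true_labels)
    count_b = Counter(found_labels)
    mutual = sum(
        (nab / n) * log(nab * n / (count_a[a] * count_b[b]))
        for (a, b), nab in joint.items()
    )
    denom = entropy(true_labels) + entropy(found_labels)
    return 1.0 if denom == 0 else 2 * mutual / denom


# Toy labelings (hypothetical data): four real communities of two nodes each.
truth = [0, 0, 1, 1, 2, 2, 3, 3]
merged = [0, 0, 0, 0, 1, 1, 1, 1]  # real communities merged pairwise

nmi_same = nmi(truth, truth)
nmi_merged = nmi(truth, merged)
print(f"NMI(truth, truth) = {nmi_same:.3f}, NMI(truth, merged) = {nmi_merged:.3f}")
```

In this toy case, a detected partition with only half as many communities as the ground truth still keeps two thirds of the maximum NMI, which is in line with the second observation above. The example does not reproduce the values in Tables 5.5 to 5.7.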
Chapter 6
Summary
This chapter concludes our work and discusses possible future work.
6.1 Summary
In this thesis, we have proposed several heuristics for improving the processing
power of community detection methods, focusing mainly on efficiency. As the main
countermeasure, we used a property of the network itself, the small world
property, and successfully reduced the number of computations without affecting
accuracy or modularity. In addition, we proposed two ideas for selecting more
meaningful neighbor communities, which improve the efficiency of community
detection in the latter part of the computation and shorten the convergence
time. These ideas also improved the modularity computation efficiency (MCE), a
measure proposed in this thesis. In the modified neighbor selection (Section
4.4) in particular, we found an interesting phenomenon; although we could not
complete a mathematical proof, it may help to improve the MCE further. Finally,
we implemented a hybrid heuristic that combines the benefits of each idea.
In the second part of this research, we evaluated our heuristics in three ways:
MCE, time/modularity, and quality of the detected communities. Our heuristics
achieved a better MCE than the Louvain method (up to 3.5 times on the small
world datasets), while modularity and quality remained about the same. The
advantage in MCE was especially clear when the dataset had a large average
degree. Another interesting point is that the heuristic performed just as well
on scale-free networks. Meanwhile, we also found some problems; for instance,
the Louvain method (or our heuristic) creates far bigger communities than the
real ones in some cases. Finally, regarding time, the computation completed
quickly on datasets such as Web-Google, DBLP and YouTube, where community
structures are clear or hubs are included. In addition, although our heuristic
runs slower than the Louvain method, this slowness can be attributed to the cost
of each node-neighbor community computation, since our heuristic performs fewer
computations. In other words, our basic idea of utilizing the small world
property was effective enough.
6.2 Future works
This thesis leaves a number of directions open for future investigation. The
first concerns the phenomenon mentioned in Section 4.4. If the phenomenon occurs
every time, irrespective of network structure, and can be proved mathematically,
then one can directly identify the nodes that may move next without computing
every node-neighbor community pair in the slowly changing part. This should lead
to a further improvement of the MCE. The second concerns the run time. As
mentioned above, our heuristic is slower than the Louvain method despite having
a higher modularity computation efficiency, which means that each computation
costs more than in the Louvain method. We therefore need to improve the
implementation technically. Finally, we also hope to investigate why the Louvain
method (or our heuristic) creates far bigger communities than the real ones in
some cases.
Acknowledgement
Foremost, I would like to offer my sincerest gratitude to my supervisor, Professor
Ken Wakita, who has supported me since I was a research student three years ago.
Without his patience and advice on my research and study, it would have been
impossible for me to write this thesis and earn a master’s degree. Thank you for
reaching out to me when I failed the entrance examination and was wondering what
to do next. I attribute the level of my master’s degree to his encouragement and
effort.
In addition to my advisor, I would also like to express my gratitude to all
members of the Wakita Laboratory. This research proceeded smoothly thanks to
their insightful comments and valuable discussions.
Last but not least, I would like to thank my parents for their unconditional
support throughout my life, and my two uncles’ families for their financial and
emotional support throughout my degree.
Bibliography
[1] Albert-László Barabási and Réka Albert. Emergence of scaling in random
networks. Science, 286(5439):509–512, 1999.
[2] Albert-László Barabási, Réka Albert, and Hawoong Jeong. Mean-field theory
for scale-free random networks. Physica A: Statistical Mechanics and its
Applications, 272(1):173–187, 1999.
[3] A.-L. Barabási. Linked: The New Science of Networks. Perseus Pub., 2002.
[4] Richard Bellman. On a routing problem. Technical report, DTIC Document,
1956.
[5] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne
Lefebvre. Fast unfolding of communities in large networks. Journal of Statis-
tical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.
[6] Béla Bollobás. Modern graph theory, volume 184. Springer, 1998.
[7] U. Brandes. On Modularity - NP-completeness and Beyond. Interner Bericht.
Univ., Fak. für Informatik, Bibliothek, 2006.
[8] Dale Carnegie. How to Win Friends and Influence People. Simon and Schus-
ter, 2009.
[9] Eunjoon Cho, Seth A Myers, and Jure Leskovec. Friendship and mobility:
user movement in location-based social networks. In Proceedings of the 17th
ACM SIGKDD international conference on Knowledge discovery and data
mining, pages 1082–1090. ACM, 2011.
[10] Aaron Clauset, Mark EJ Newman, and Cristopher Moore. Finding community
structure in very large networks. Physical Review E, 70(6):066111, 2004.
[11] Leon Danon, Albert Díaz-Guilera, Jordi Duch, and Alex Arenas. Comparing
community structure identification. Journal of Statistical Mechanics: Theory
and Experiment, 2005(09):P09008, 2005.
[12] Jörn Davidsen, Holger Ebel, and Stefan Bornholdt. Emergence of a small
world from local interactions: Modeling acquaintance networks. Physical Review
Letters, 88(12):128701, 2002.
[13] Edsger W Dijkstra. A note on two problems in connexion with graphs.
Numerische Mathematik, 1(1):269–271, 1959.
[14] Peter Sheridan Dodds, Roby Muhamad, and Duncan J Watts. An experimental
study of search in global social networks. Science, 301(5634):827–829, 2003.
[15] Holger Ebel, Lutz-Ingo Mielsch, and Stefan Bornholdt. Scale-free topology of
e-mail networks. arXiv preprint cond-mat/0201476, 2002.
[16] Edmund Terence Gomez, François Bafoil, and Kee-Cheok Cheong.
Government-Linked Companies and Sustainable, Equitable Development (Routledge
Malaysian Studies Series). Routledge, 2014.
[17] Leonhard Euler. Solutio problematis ad geometriam situs pertinentis. Com-
mentarii Academiae Scientiarum Imperialis Petropolitanae, 8:128–140, 1736.
[18] Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. On power-law
relationships of the internet topology. In ACM SIGCOMM Computer Com-
munication Review, volume 29, pages 251–262. ACM, 1999.
[19] Maryam Fatemi and Laurissa Tokarchuk. An empirical study on imdb and
its communities based on the network of co-reviewers. In Proceedings of the
First Workshop on Measurement, Privacy, and Mobility, page 7. ACM, 2012.
[20] Charles M Fiduccia and Robert M Mattheyses. A linear-time heuristic for
improving network partitions. In Design Automation, 1982. 19th Conference
on, pages 175–181. IEEE, 1982.
[21] Santo Fortunato. Community detection in graphs. Physics Reports, 486(3):75–
174, 2010.
[22] Ullas Gargi, Wenjun Lu, Vahab S Mirrokni, and Sangho Yoon. Large-scale
community detection on youtube for topic discovery and exploration. In
ICWSM, 2011.
[23] Fanica Gavril. Algorithms for minimum coloring, maximum clique, minimum
covering by cliques, and maximum independent set of a chordal graph. SIAM
Journal on Computing, 1(2):180–187, 1972.
[24] Michelle Girvan and Mark EJ Newman. Community structure in social
and biological networks. Proceedings of the National Academy of Sciences,
99(12):7821–7826, 2002.
[25] Prem K Gopalan and David M Blei. Efficient discovery of overlapping commu-
nities in massive networks. Proceedings of the National Academy of Sciences,
110(36):14534–14539, 2013.
[26] Roger Guimerà and Luís A Nunes Amaral. Functional cartography of complex
metabolic networks. Nature, 433(7028):895–900, 2005.
[27] Roger Guimerà, Stefano Mossa, Adrian Turtschi, and Luís A Nunes Amaral.
The worldwide air transportation network: Anomalous centrality, community
structure, and cities’ global roles. Proceedings of the National Academy of
Sciences, 102(22):7794–7799, 2005.
[28] Magnús M Halldórsson. A still better performance guarantee for approximate
graph coloring. Information Processing Letters, 45(1):19–23, 1993.
[29] Paul W Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt. Stochastic
blockmodels: First steps. Social Networks, 5(2):109–137, 1983.
[30] Xiufang Jiang, Guiquan Liu, and Zhiting Lin. A fast algorithm for finding
community structure based on community closeness. In Proceedings of the
2010 Third International Joint Conference on Computational Science and
Optimization - Volume 01, CSO ’10, pages 436–439, Washington, DC, USA,
2010. IEEE Computer Society.
[31] Leonard Kaufman and Peter J Rousseeuw. Finding groups in data: an intro-
duction to cluster analysis, volume 344. John Wiley & Sons, 2009.
[32] Brian W Kernighan and Shen Lin. An efficient heuristic procedure for
partitioning graphs. Bell System Technical Journal, 49(2):291–307, 1970.
[33] Mirjam Kretzschmar. Sexual network structure and sexually transmitted
disease prevention: a modeling perspective. Sexually Transmitted Diseases,
27(10):627–635, 2000.
[34] Andrea Lancichinetti, Santo Fortunato, and Filippo Radicchi. Benchmark
graphs for testing community detection algorithms. Physical Review E,
78(4):046110, 2008.
[35] Daniel B Larremore, Aaron Clauset, and Abigail Z Jacobs. Efficiently inferring
community structure in bipartite networks. arXiv preprint arXiv:1403.2933,
2014.
[36] Chris Laszlo. The Sustainable Company: How to Create Lasting Value through
Social and Environmental Performance. Island Press, 2013.
[37] Jure Leskovec, Kevin J Lang, Anirban Dasgupta, and Michael W Mahoney.
Community structure in large networks: Natural cluster sizes and the absence
of large well-defined clusters. Internet Mathematics, 6(1):29–123, 2009.
[38] Stanley Milgram. The small world problem. Psychology Today, 67(1):61–67,
1967.
[39] Peter J Mucha and Mason A Porter. Communities in multislice voting net-
works. Chaos, 20(4):041108, 2010.
[40] Peter J Mucha, Thomas Richardson, Kevin Macon, Mason A Porter, and
Jukka-Pekka Onnela. Community structure in time-dependent, multiscale,
and multiplex networks. Science, 328(5980):876–878, 2010.
[41] Mark Newman. Networks: An Introduction. Oxford University Press, Inc.,
New York, NY, USA, 2010.
[42] Mark EJ Newman. The structure of scientific collaboration networks. Pro-
ceedings of the National Academy of Sciences, 98(2):404–409, 2001.
[43] Mark EJ Newman. Detecting community structure in networks. The European
Physical Journal B-Condensed Matter and Complex Systems, 38(2):321–330,
2004.
[44] Mark EJ Newman. Fast algorithm for detecting community structure in
networks. Physical Review E, 69(6):066133, 2004.
[45] Mark EJ Newman and Michelle Girvan. Finding and evaluating community
structure in networks. Physical Review E, 69(2):026113, 2004.
[46] Mikael Onsjö and Osamu Watanabe. A simple message passing algorithm for
graph partitioning problems. In Algorithms and Computation, pages 507–516.
Springer, 2006.
[47] Giuliano Andrea Pagani and Marco Aiello. The power grid as a complex
network: a survey. arXiv preprint arXiv:1105.3338, 2011.
[48] Gergely Palla, Imre Derényi, Illés Farkas, and Tamás Vicsek. Uncovering the
overlapping community structure of complex networks in nature and society.
Nature, 435(7043):814–818, 2005.
[49] Christopher R Palmer and Christos Faloutsos. Electricity based external simi-
larity of categorical attributes. In Advances in Knowledge Discovery and Data
Mining, pages 486–500. Springer, 2003.
[50] Tiago P Peixoto. Efficient Monte Carlo and greedy heuristic for the inference
of stochastic block models. Physical Review E, 89(1):012804, 2014.
[51] Arnau Prat-Pérez, David Dominguez-Sal, and Josep-Lluis Larriba-Pey. High
quality, scalable and parallel community detection for large real graphs. In
Proceedings of the 23rd International Conference on World Wide Web, pages
225–236. International World Wide Web Conferences Steering Committee,
2014.
[52] Sidney Redner. How popular is your paper? an empirical study of the cita-
tion distribution. The European Physical Journal B-Condensed Matter and
Complex Systems, 4(2):131–134, 1998.
[53] Stuart A Rice. The identification of blocs in small political bodies. American
Political Science Review, 21(03):619–627, 1927.
[54] Claude Elwood Shannon. A mathematical theory of communication. ACM
SIGMOBILE Mobile Computing and Communications Review, 5(1):3–55,
2001.
[55] Hiroaki Shiokawa, Yasuhiro Fujiwara, and Makoto Onizuka. Fast algorithm
for modularity-based graph clustering. In Twenty-Seventh AAAI Conference
on Artificial Intelligence, 2013.
[56] Kimmo Soramäki, Morten L Bech, Jeffrey Arnold, Robert J Glass, and Walter E
Beyeler. The topology of interbank payment flows. Physica A: Statistical
Mechanics and its Applications, 379(1):317–333, 2007.
[57] Kenta Suzuki and Ken Wakita. Extracting multi-facet community structure
from bipartite networks. In Computational Science and Engineering, 2009.
CSE’09. International Conference on, volume 4, pages 312–319. IEEE, 2009.
[58] Lubos Takac and Michal Zabovsky. Data analysis in public social networks.
In International Scientific Conference and International Workshop Present
Day Trends of Innovations, 2012.
[59] Stijn van Dongen. Graph Clustering by Flow Simulation. PhD thesis, Univer-
sity of Utrecht, Utrecht, May 2000.
[60] Ken Wakita and Toshiyuki Tsurumi. Finding community structure in mega-
scale social networks:[extended abstract]. In Proceedings of the 16th interna-
tional conference on World Wide Web, pages 1275–1276. ACM, 2007.
[61] Stanley Wasserman. Social network analysis: Methods and applications,
volume 8. Cambridge University Press, 1994.
[62] D.J. Watts. Six Degrees: The Science of a Connected Age. William Heine-
mann, London, 2003.
[63] Duncan J Watts and Steven H Strogatz. Collective dynamics of ‘small-world’
networks. Nature, 393(6684):440–442, 1998.
[64] Jaewon Yang and Jure Leskovec. Defining and evaluating network communi-
ties based on ground-truth. In Proceedings of the ACM SIGKDD Workshop
on Mining Data Semantics, page 3. ACM, 2012.
[65] Wayne Zachary. An information flow model for conflict and fission in small
groups. J. of Anthropological Research, 33:452–473, 1977.
[66] Peng Zhang, Jinliang Wang, Xiaojia Li, Menghui Li, Zengru Di, and Ying Fan.
Clustering coefficient and community structure of bipartite networks. Physica
A: Statistical Mechanics and its Applications, 387(27):6869–6875, 2008.
[67] Haijun Zhou. Distance, dissimilarity index, and network community structure.
Physical Review E, 67(6):061901, 2003.