

1

Towards Taxonomy-based Routing in P2P Networks

Alexander Löser

Advisor: Prof. 許子衝
Student: 羅英辰
Student ID: M97G0216

2

Introduction(1)

The development of smart, scalable approaches for the discovery and location of data sources in distributed heterogeneous information systems is an important problem in many scientific and commercial domains.

In the e-learning domain, a large number of digital e-learning repositories have been built in recent years.

3

Introduction(2)

4

Super-Peer based Architecture(1)

A super-peer is a node that acts as a centralized server to a subset of clients, e.g. information providers and information consumers.

Super-peers are also connected to each other, just as peers in a pure P2P system are (Figure 2). They route messages over this overlay network and submit and answer queries on behalf of their clients and themselves.

5

Super-Peer based Architecture(2)

6

Models and Queries(1)

Each peer is classified by paths in one or more taxonomies and publishes a model based on semi-structured XML data with the taxonomies and paths.

The Open Directory Project (ODP) is an open-content directory of websites, also known as DMOZ (from its original domain name, directory.mozilla.org).

7

Models and Queries(2)

8

Models and Queries(3)

To look up peer models, we use a subset of the XPath (XML Path) language.
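As a minimal sketch of such a lookup, the snippet below queries a peer model with the XPath subset supported by Python's `xml.etree.ElementTree`. The element and attribute names (`model`, `classification`, `taxonomy`, `path`) and the second taxonomy path are assumptions made for illustration; the slides do not show the actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical peer model: the element and attribute names are assumptions,
# not the schema used by the authors.
MODEL_XML = """
<model pid="E">
  <classification taxonomy="ODP">
    <path>/Computers/Programming/Languages/Java</path>
    <path>/Computers/Software/Databases</path>
  </classification>
</model>
"""


def taxonomy_paths(model_xml, taxonomy):
    """Return the classification paths a peer model publishes for one taxonomy."""
    root = ET.fromstring(model_xml)
    # ElementTree implements only a subset of XPath; attribute predicates
    # of this form are part of that subset.
    nodes = root.findall(f"./classification[@taxonomy='{taxonomy}']/path")
    return [node.text.strip() for node in nodes]


print(taxonomy_paths(MODEL_XML, "ODP"))
```

A taxonomy that the model does not use simply yields an empty list.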

9

Distributed Hash Tables (DHT)

10

Indexing Peer Models and Taxonomies in a DHT(1)

Models are indexed in a catalog based on a Distributed Hash Table.

The catalog is distributed among the SP-SP network.

Consider the model with PID=E.

11

Indexing Peer Models and Taxonomies in a DHT(2)

Uses SHA-1 (Secure Hash Algorithm 1)
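The hashing step can be sketched as follows. The 16-bit identifier length is an assumption chosen to match four-hex-digit keys such as the `$EA66` example on a later slide; the slides do not state how the 160-bit SHA-1 digest is shortened, so the key printed here is not expected to reproduce the value in the figure.

```python
import hashlib

ID_BITS = 16  # assumption: four hex digits, as in the "$EA66" example


def chord_key(text, bits=ID_BITS):
    """Map a taxonomy path to an identifier on a ring of size 2**bits."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()  # 160-bit digest
    # Keep only the most significant `bits` bits of the digest.
    return int.from_bytes(digest, "big") >> (160 - bits)


key = chord_key("/Computers/Programming/Languages/Java")
print(f"${key:04X}")
```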

12

Indexing Peer Models and Taxonomies in a DHT(3)

SUCC (successor)

13

Indexing Peer Models and Taxonomies in a DHT(4)

14

CHORD protocol

15

Indexing Peer Models and Taxonomies in a DHT(5)

Keys are stored clockwise at the closest node with the next higher hash value.
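The placement rule can be illustrated with a small sketch: a key goes to the first node whose identifier is equal to or greater than the key, wrapping around to the smallest identifier at the end of the ring. The node identifiers below are hypothetical values, not those of the super-peers in the figures.

```python
from bisect import bisect_left


def successor(node_ids, key):
    """Return the node responsible for `key` on a Chord-style ring."""
    ring = sorted(node_ids)
    # First position whose node ID is >= key; past the end means wrap around.
    index = bisect_left(ring, key)
    return ring[index % len(ring)]


nodes = [0x1000, 0x5A00, 0x9C40, 0xF000]  # hypothetical super-peer IDs
print(hex(successor(nodes, 0xEA66)))  # 0xf000
print(hex(successor(nodes, 0xF800)))  # 0x1000 (wraps around the ring)
```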

16

Lookup Models in a DHT(1)

Exact Lookups

BFS-based Lookups

Conjunctive Lookup

17

Lookup Models in a DHT(2)

Exact Lookups

Figure 4, q1: /Computers/Programming/Languages/Java.

The taxonomy path of the query is hashed to $EA66 and then a lookup on the Chord ring is executed.

The result of the lookup is a set of PIDs storing models with this classification path, e.g. the peers with the PIDs D, E, and F.
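The exact lookup described above can be sketched end to end with a toy, in-process catalog. Everything here is an illustrative assumption: the `Catalog` class, the 16-bit identifiers, the super-peer IDs, and the extra peer `A` are hypothetical, and the real system routes the lookup over the Chord ring instead of indexing a local table.

```python
import hashlib
from bisect import bisect_left
from collections import defaultdict


class Catalog:
    """Toy stand-in for the distributed catalog (no networking)."""

    def __init__(self, super_peer_ids, bits=16):
        self.ring = sorted(super_peer_ids)
        self.bits = bits
        # One table per super-peer: taxonomy path -> set of PIDs.
        self.tables = {sp: defaultdict(set) for sp in self.ring}

    def key_for(self, path):
        digest = hashlib.sha1(path.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") >> (160 - self.bits)

    def responsible(self, key):
        # Successor rule: first super-peer ID >= key, wrapping around.
        index = bisect_left(self.ring, key)
        return self.ring[index % len(self.ring)]

    def publish(self, pid, path):
        owner = self.responsible(self.key_for(path))
        self.tables[owner][path].add(pid)

    def exact_lookup(self, path):
        owner = self.responsible(self.key_for(path))
        return set(self.tables[owner].get(path, set()))


catalog = Catalog([0x2000, 0x8000, 0xE000])  # hypothetical super-peer IDs
for pid in ("D", "E", "F"):
    catalog.publish(pid, "/Computers/Programming/Languages/Java")
catalog.publish("A", "/Computers/Software/Databases")

print(sorted(catalog.exact_lookup("/Computers/Programming/Languages/Java")))
```

Storing the original path next to the hashed key keeps two paths that happen to collide on a short identifier from being mixed up.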

18

Lookup Models in a DHT(3)

BFS-based Lookups

19

Lookup Models in a DHT(4)

Conjunctive Lookup

Example: Figure 4, q3

20

Storage Load Balancing Strategies

21

Implementation and Evaluation(1)

Without load balancing (-VS-LBM)

Virtual server (+VS)

Partition-based load balancing (+LBM)

Combination of partition-based load balancing and virtual server (+LBM+VS)

50 super-peers

15,000 peers

Peers join and leave within 3600 s

22

Implementation and Evaluation(2)

Our load balancing approach performs better than both the virtual server approach (+VS) and the simulation without any load balancing (-VS-LBM).

These results are valid for a small super-peer network, such as the one simulated in our experiment.

In our approach we are only able to reduce the number of taxonomy paths a super-peer is responsible for.

23

Implementation and Evaluation(3)

Each peer issues an exact query for a taxonomy path every 240 s.

The average bandwidth each super-peer requires for serving queries and handling joining and leaving peers is 25 KByte/s.
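For a sense of scale, the query load implied by these simulation parameters can be worked out in a few lines. This is a back-of-the-envelope sketch that assumes peers are spread evenly over the super-peers and counts each query once, at the super-peer where it enters the network; it ignores routing hops on the ring and does not attempt to derive the 25 KByte/s figure.

```python
PEERS = 15_000          # from the simulation setup
SUPER_PEERS = 50        # from the simulation setup
QUERY_INTERVAL_S = 240  # each peer issues one exact query per interval

# Queries entering the whole network per second.
total_queries_per_s = PEERS / QUERY_INTERVAL_S

# Assumption: peers are evenly distributed over the super-peers.
peers_per_super_peer = PEERS / SUPER_PEERS
queries_per_super_peer_per_s = total_queries_per_s / SUPER_PEERS

print(total_queries_per_s)           # 62.5
print(peers_per_super_peer)          # 300.0
print(queries_per_super_peer_per_s)  # 1.25
```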

24

Implementation and Evaluation(4)

Figure 10 shows the costs of our storage load balancing approach, both for joining and leaving peer nodes only (J/L) and for issuing queries together with joining and leaving peer nodes (J/L + Query).

25

Implementation and Evaluation(5)

J/L: Join and Leave

26

Summary and Further Work

We presented a completely new approach for enabling efficient semantic query routing in P2P networks.

Much work remains, for example on dynamic storage load balancing strategies that allow super-peers to join and leave the catalog at a high frequency while the catalog remains robust.