Artur Andrzejak Zuse-Institute Berlin (ZIB)
Zuse-Institute Berlin (ZIB), Computer Science Research
Overview:
Challenges in P2P Systems
What is a Peer-To-Peer System?
Participants are autonomous (different owners)
Resources are distributed
Sites have equal functionality; they act as
clients, when accessing information
servers, when serving information to other peers
routers, when forwarding information
... and so are called "peers"
Large number of participants
P2P - a Bad Idea?
"Distribution is expensive, specialized functionality is good!" (Garcia-Molina)
If distribution is necessary (e.g. for reliability):
build a centralized directory and use backups
computational efficiency suffers in the P2P scenario!
So Why Do P2P Systems Exist at All?
User's view:
exploiting existing inexpensive resources
sharing costs among many
legal protection
autonomy
anonymity
Researcher's view:
Scalability
Self-organization and low management cost
High availability and fault-tolerance
Main Challenges
Search
Reliability and security
(Resource Management)
Search Mechanism Characteristics
Comprehensiveness and guarantees
Many of today's systems do not guarantee that existing items will be found at all, or they do not find all items
Query expressiveness
Today: only key/keyword searches; range queries, aggregates and SQL-like queries are desirable
Efficiency
A major problem: too many messages for searching; some systems even use flooding
Robustness
Autonomy
Search Mechanism Determines ...
Topology
From arbitrary (Gnutella) to rigid (Napster)
Rigid topology increases efficiency but decreases autonomy
Placement of Data/Metadata
Gnutella: only own data; Chord: data/metadata is carefully distributed in the whole network; superpeers: metadata for superpeers is centralized
Message Routing
Each query message is sent to a group of peers
From unstructured flooding (Gnutella) to sophisticated protocols (Chord, CAN etc.)
Gnutella - How it Works
[Figure: Gnutella message flow between peers; labels: query hit, download]
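The flooding search behind the figure can be illustrated with a minimal sketch. It is not the Gnutella wire protocol: the peer names, the `neighbors` and `files` dictionaries, and the function `flood_query` are hypothetical, and for simplicity a peer forwards the query to all of its neighbours (including the one it came from), which drop duplicates.

```python
from collections import deque

def flood_query(neighbors, files, origin, keyword, ttl):
    """Breadth-first flooding with a hop limit; returns (hits, messages_sent)."""
    seen = {origin}
    queue = deque([(origin, ttl)])
    hits, messages = [], 0
    while queue:
        peer, hops_left = queue.popleft()
        if keyword in files.get(peer, ()):
            hits.append(peer)           # this peer would answer with a query hit
        if hops_left == 0:
            continue                    # TTL exhausted, do not forward further
        for nb in neighbors.get(peer, ()):
            messages += 1               # every forwarded copy costs one message
            if nb not in seen:          # duplicates are received but dropped
                seen.add(nb)
                queue.append((nb, hops_left - 1))
    return hits, messages

# Hypothetical five-peer overlay; only D holds the requested file.
neighbors = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B", "E"],
             "D": ["B"], "E": ["C"]}
files = {"D": {"song.mp3"}, "E": {"paper.pdf"}}

print(flood_query(neighbors, files, "A", "song.mp3", ttl=2))  # (['D'], 8)
print(flood_query(neighbors, files, "A", "song.mp3", ttl=1))  # ([], 2)
```

With a hop limit of 1 the existing file is not found at all, and with a hop limit of 2 a single query already costs 8 messages in a five-peer network; both effects correspond to the comprehensiveness and efficiency problems named earlier.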
Gnutella - Characteristics

Characteristics      Gnutella
Comprehensiveness    ++
Expressiveness       ++++
Efficiency           +
Autonomy             ++++
Robustness           +++

Architectural Properties   Gnutella
Topology                   power law
Data Placement             arbitrary
Message Routing            flooding
Chord - How it Works
A key is stored at its successor: the node with the next higher ID
Circular 160-bit ID space
[Figure: Chord ring with nodes N32, N90, N105 and keys K5, K20, K80]
Finger i points to the successor of n + 2^i
[Figure: fingers of node N80 spanning 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128 of the ring; labels: 112, N120]
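The two rules above (a key lives at its successor; finger i of node n points to the successor of n + 2^i) can be sketched as follows. This is a simplified, centralized illustration under stated assumptions: a 7-bit ID space instead of 160 bits, a global list of node IDs instead of per-node state, and hypothetical helper names (`successor`, `finger_table`, `in_arc`, `lookup`). The node IDs are taken from the figure labels.

```python
from bisect import bisect_left

M = 7            # assumption: 7-bit IDs (0..127); Chord itself uses 160 bits
RING = 2 ** M

def successor(nodes, ident):
    """Node responsible for ident: first node ID >= ident, wrapping around."""
    ring = sorted(nodes)
    return ring[bisect_left(ring, ident % RING) % len(ring)]

def finger_table(nodes, n):
    """Finger i of node n points to successor(n + 2**i)."""
    return [successor(nodes, n + 2 ** i) for i in range(M)]

def in_arc(x, a, b, closed_right=False):
    """True if x lies on the clockwise arc (a, b), or (a, b] if closed_right."""
    if closed_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b    # arc wraps past zero (a == b: everything but a)

def lookup(nodes, start, key):
    """Route a query for key starting at node start; returns the visited node IDs."""
    n, path = start, [start]
    while True:
        succ = successor(nodes, n + 1)
        if in_arc(key, n, succ, closed_right=True):
            if succ != n:
                path.append(succ)
            return path
        # forward to the farthest finger that still precedes the key
        n = next(f for f in reversed(finger_table(nodes, n)) if in_arc(f, n, key))
        path.append(n)

print(successor([32, 90, 105], 80))   # 90: K80 is stored at N90
nodes = [32, 80, 90, 105, 120]
print(finger_table(nodes, 80))        # [90, 90, 90, 90, 105, 120, 32]
print(lookup(nodes, 80, 20))          # [80, 120, 32]
```

Because each hop uses the farthest finger that does not overshoot the key, the remaining distance shrinks quickly with every hop, which is what "directed" routing means in contrast to flooding.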
Chord - Characteristics

Characteristics      Gnutella   Chord
Comprehensiveness    ++         ++++
Expressiveness       ++++       +
Efficiency           +          ++++
Autonomy             ++++       ++
Robustness           +++        ++

Architectural Properties   Gnutella    Chord
Topology                   power law   ring
Data Placement             arbitrary   hashing
Message Routing            flooding    directed
Decouple Efficiency, Autonomy, Robustness
[Figure: diagram with the axes autonomy, robustness and efficiency, on which Gnutella and Chord are placed]
(From "Open Problems in Data Sharing Peer-To-Peer Systems" by Hector Garcia-Molina)
Novelty: Location-Independent Routing
Each unique document or endpoint has a globally unique identifier (GUID)
Locating data can be seen as a routing problem:
clients construct messages addressed with GUIDs and let peers pass these messages on until the object is located
Known as the Decentralized Object Location and Routing (DOLR) paradigm or Distributed Hash Table (DHT)
Advantages:
allows for routing messages to objects without knowing their location
data can be stored anywhere, amidst millions of peers (scalability)
provides locality: use of local resources instead of distant ones, if possible
Implemented in Chord, CAN, Pastry, Tapestry
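The put/get view of a DHT can be sketched in a few lines. This is a toy, single-process illustration and not the API of Chord, CAN, Pastry or Tapestry: the class `ToyDHT`, the function `guid` and the peer names are hypothetical, GUIDs are assumed to be SHA-1 hashes (160 bits), and the responsible peer is found by a direct table lookup rather than by passing messages between peers.

```python
import hashlib
from bisect import bisect_left

def guid(name: str) -> int:
    """160-bit identifier derived from a name (assumption: SHA-1 of the name)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")

class ToyDHT:
    def __init__(self, peer_names):
        # Peers and objects share one identifier space.
        self.ring = sorted((guid(p), p) for p in peer_names)
        self.ids = [i for i, _ in self.ring]
        self.store = {p: {} for p in peer_names}

    def responsible_peer(self, g: int) -> str:
        """Peer whose ID is the first one >= g, wrapping around the ring."""
        return self.ring[bisect_left(self.ids, g) % len(self.ring)][1]

    def put(self, name: str, value) -> str:
        g = guid(name)
        peer = self.responsible_peer(g)
        self.store[peer][g] = value
        return peer                      # caller never had to choose a location

    def get(self, name: str):
        g = guid(name)
        return self.store[self.responsible_peer(g)].get(g)

dht = ToyDHT(["peer-a", "peer-b", "peer-c"])
holder = dht.put("paper.pdf", b"contents")
print(holder, dht.get("paper.pdf"))
```

The caller addresses the object only by its name-derived GUID; which peer ends up holding it follows from the hash values alone.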
Essence: Untrusted/Unreliable Components
Centralized systems have components which are professionally maintained and trusted to behave well
Components of a P2P system may crash or fail at any time (unreliable components)
Also, the participants might be adversarial, attempting to damage the system (untrusted components)
Failure rate ~ system size: larger P2P systems are guaranteed to have malfunctioning components
P2P system builders must invoke new design principles to achieve guarantees
"only the aggregate behaviour of many peers can be trusted"
Techniques for untrusted components solve issues for unreliable ones (the converse is not true)
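A short worked calculation makes the "failure rate ~ system size" point concrete. It assumes that peers fail independently and with the same probability, which the slides do not state; the function name `p_any_failure` and the 1% figure are purely illustrative.

```python
def p_any_failure(n_peers: int, p_fail: float) -> float:
    """Probability that at least one of n_peers independent peers is faulty."""
    return 1.0 - (1.0 - p_fail) ** n_peers

# Hypothetical per-peer failure probability of 1%.
for n in (1, 10, 100, 1000):
    print(n, round(p_any_failure(n, 0.01), 5))
```

Under these assumptions the probability of having at least one faulty component approaches 1 as the number of peers grows, so a large system has to be designed to work with faulty members present.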
Achieving Reliability and Security
Replication
Cryptography
Byzantine Agreement
Exploiting differences
"Thermodynamic" Systems Design
Replication
Redundancy helps to achieve fault tolerance by providing online replacements for faulty resources
Advanced P2P systems (Intermemory, OceanStore, FreeHaven) use so-called erasure coding
Each chunk of data is transformed into many fragments
Very low Fraction of Blocks Lost Per Year (FBLPY)
Losses per year for a 6-month repair interval:
Std: 0.03 blocks
Erasure: 10^-35 blocks
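Why fragments beat whole copies at the same storage cost can be sketched with a binomial calculation. This does not reproduce the FBLPY figures above; it assumes independent fragment losses with a made-up probability, and the function names and parameters (2 copies versus 32 fragments of which any 16 suffice) are illustrative only.

```python
from math import comb

def loss_replication(p_fail: float, copies: int) -> float:
    """A replicated block is lost only if every copy is lost."""
    return p_fail ** copies

def loss_erasure(p_fail: float, n: int, m: int) -> float:
    """An erasure-coded block is lost if fewer than m of its n fragments survive."""
    p_ok = 1.0 - p_fail
    return sum(comb(n, k) * p_ok ** k * p_fail ** (n - k) for k in range(m))

p = 0.1  # hypothetical probability that one copy or fragment is lost before repair
print(loss_replication(p, copies=2))   # 2x storage overhead
print(loss_erasure(p, n=32, m=16))     # also 2x storage overhead
```

Plain replication is the special case m = 1; spreading the same redundancy over many fragments makes it far less likely that too few of them survive.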
Byzantine Agreement
Immutable (read-only) data can be easily signed ("sealed") by cryptographic means to detect and discard faulty information
Repairs are also possible with these techniques
However, some decisions are active: e.g. changing, replacing or deleting information
These decisions must be taken collectively to eliminate the influence of corrupted nodes
Here Byzantine Agreement can be used: a unified decision is taken only if a sufficient number of nodes agree
Works if fewer than 1/3 of the nodes are compromised
Applied in OceanStore and Farsite
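The one-third bound can be illustrated by the vote-counting threshold alone. This is not a Byzantine agreement protocol (there are no message rounds and no signatures), and it is not taken from OceanStore or Farsite; the function names are hypothetical. It only shows how many matching votes are needed so that any two deciding groups overlap in at least one correct node.

```python
from collections import Counter

def max_faulty(n: int) -> int:
    """Largest number f of compromised nodes tolerated among n (requires n >= 3f + 1)."""
    return (n - 1) // 3

def quorum(n: int) -> int:
    """Votes needed so that two quorums share at least f + 1 nodes."""
    return (n + max_faulty(n)) // 2 + 1

def decide(votes):
    """Return the value backed by a quorum of votes, or None if there is none."""
    if not votes:
        return None
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= quorum(len(votes)) else None

print(max_faulty(4), quorum(4))                         # 1 3
print(decide(["delete", "delete", "delete", "keep"]))   # delete
print(decide(["delete", "delete", "keep", "keep"]))     # None
```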
Exploiting Differences
Some peers are "more equal" than others:
Different CPUs, memory, storage capacity, network connectivity
Some are professionally managed, others not
Physically, some are locked in secure rooms, others are public
We can exploit these differences to tune performance, availability, reliability, security
Examples:
Computers with higher connectivity as supernodes
Actively managed nodes for Byzantine Agreement
Placing archival data on servers deep in mountains
"Thermodynamic" Systems Design
A new concept of John Kubiatowicz: "Stability through Statistics"
We can give guarantees on collective behaviour while individual nodes are not predictable
Over time, the latent order of a system is destroyed; this resembles the 2nd law of thermodynamics: "entropy of closed systems increases"
Therefore, self-organizing behaviour is necessary:
Servers must continuously collect, regenerate and redistribute fragments in a data storage system
They must adjust routing links in the DOLR to correct changes
They must recognize faults without global communication
Entropy reduction can also be achieved by introspection
System observes itself, applies analyses, then adapts accordingly
Research in the area of IBM's Autonomic Computing
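The "collect, regenerate and redistribute fragments" step can be pictured as a periodic repair pass. The sketch below is a made-up, centralized simplification (a real system would detect faults locally and without global knowledge, as stated above); the function `repair_sweep` and its parameters are hypothetical.

```python
def repair_sweep(holders, alive, all_peers, target, min_needed):
    """One maintenance pass for a single block.

    holders:    peers that were given a fragment earlier
    alive:      peers currently reachable
    target:     number of fragments the system wants to keep
    min_needed: fragments required to reconstruct the block
    Returns the new list of holders, or None if the block can no longer be rebuilt.
    """
    live = [p for p in holders if p in alive]
    if len(live) < min_needed:
        return None                       # too much order was lost before repair
    spare = [p for p in sorted(all_peers) if p in alive and p not in live]
    missing = max(target - len(live), 0)
    return live + spare[:missing]         # regenerate and place the missing fragments

peers = {"a", "b", "c", "d", "e", "f"}
print(repair_sweep(["a", "b", "c", "d"], {"a", "b", "e", "f"}, peers, 4, 2))
# ['a', 'b', 'e', 'f']
```

If such a pass runs often enough, the number of live fragments stays above `min_needed` even though individual peers keep disappearing.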
P2P-Research at ZIB: CSR-DMS
Management of large scientific data sets (up to 400 million files)
Should improve existing approaches in the area of GRID technologies
Also as a framework for research
Architecture is P2P-based
Should exhibit self-management abilities
Candidates for diploma theses (Diplomarbeiten) are very welcome!