Peer To Peer Network in Distributed Systems
Peer To Peer Network in Distributed Systems
Peer To Peer Network in Distributed Systems
Presented By:
Rahul Batra (MT13049)
Juhi Pruthi (MT13097)
Kritika Anand (MT13039)
Pooja Aggarwal (MT13072)
Pooja Gupta (MT13011)
Outline
P2P
Introduction
Incentives
Features
DHT
Structured overlay networks
Unstructured overlay networks
– Centralized Directory based P2P systems
– Pure P2P systems
– Hybrid P2P systems
Napster
Gnutella
Kazaa
Bit Torrent
Chord Protocol
CAN
Introduction
What is P2P?
A peer-to-peer network is a large network of nodes,
called peers, that agree to cooperate in order to achieve a
particular task.
• R:
•
One player is the client and one player is the server in each
game,
Client Always to cooperate
Server chooses whether to Cooperate or Defect based on
previous experience with the client
Client cannot trace defections
Population Dynamics
Entities take independent and continuous actions that
change the composition of the population.
• In each round of the game, each player plays one game as
client and one game as server
• At the end of each round, players can:
– Mutate
– Learn
– Turnover
– Stay the same
Population Dynamics
Let s : running average of the performance of a player’s
current strategy per round
age : number of rounds she has been using the strategy.
A strategy’s rating is
Vulnerable to collusion
Shared History Attack: Collusion
• Group of peers lie about transactions
• Positive (bad peer claims another bad peer cooperated)
• Negative (bad peer claims that good peer defected)
• Bad for objective (trust everyone) reputation
• Need subjective reputation mechanism
Shared History Attack: Collusion
Maxflow-based Subjective Reputation:
In the example in Figure 1, C can falsely claim that A served
him, thus deceiving B into providing service.
A maxflow-based algorithm promotes cooperation despite
collusion among 1/3 of the population.
Basic idea : B would only believe C if C had already
provided service to B.
The cost is its O(V ^3) running time, where V is the number
of nodes in the system.
Adaptive Stranger Policy
Zero-cost identities :
– Allows noncooperating peers to escape the consequences
of not cooperating
– Destroys cooperation in the system
In practice, peers serve all strangers with constant
probability .
Whitewashing can be nearly eliminated from the system
using adaptive stranger policy.
Adaptive Stranger Policy
Let ps and cs be the number of services that
strangers have provided and consumed,
respectively.
A player using “Stranger Adaptive” helps a stranger
with probability r=ps/cs
And update :
r = (ps+1)/cs, if the stranger provided service
r = ps/(cs+1), if the stranger consumed service
Traitors
players who acquire high reputation scores by
cooperating for a while, and then traitorously turn into
defectors before leaving the system
– users who turn deliberately to gain a higher score
– cooperators whose identities have been stolen and
exploited by defectors
Long-term history exacerbates this problem by allowing
peers with many previous transactions to exploit that
history for many new transactions.
Short-term history prevents traitors from disrupting
cooperation.
P2P Features
Efficient use of resources
Scalability
- Consumers of resources also donate resources
Reliability
- No single point of failure
-Redundant data source
Ease of deployment and administration
-Nodes are self-organized
-No need to deploy servers to satisfy demand
-Built-in fault tolerance, replication, and load balancing
What is DHT ?
Alice
Napster: Working
insert(X,
123.2.21.23)
...
Publish
123.2.0.18
search(A)
-->
Fetch 123.2.0.18
Query
Reply
Where is file A?
Napster: Pros & Cons
• Pros:
– Simple
– Search scope is O(1)
• Cons:
– Server maintains lot of state
– Performance Bottlenecks
– Single point of failure
Gnutella: Overview
• Query Flooding:
- Join: on startup, client contacts a few other nodes;
these become its “neighbors”
• Ping-Pong protocol
– Publish: no need
– Search: send queries to its neighbors, who ask their
neighbors, and so on...
– Fetch: get the file directly from peer
Features:
- No central server
- Constrained broadcast:
-Every peer sends packets it receives to all of its
peers
- Lifetime of packets limited by time -to-live
-Packets have unique ids to detect loops
Gnutella: Working
Reply
Query
Where is file A?
Protocol Message Types
Type Description Information
Ping Announces None
availability
Pong Response to a ping IP address and port
no. of responding
peer;
number and total KB
of files shared
Query Search request Search criteria
QueryHit Returned by peers IP address,port no.
that have and network
requested file bandwidth of
responding peer;
Number of results
and resultsets
Communication Model
Gnutella: Pros & Cons
• Pros:
– Fully de-centralized
– Search cost distributed
– Simple,Robust and Scalable
• Cons:
– Search scope is O(N)
– Search time is more
KaZaA: Overview
• “Smart” Query Flooding:
– Join: on startup, client contacts a group leader
– Publish: sends list of files to group leader
– Search: sends query to group leader,group leader
floods query amongst themselves.
– Fetch: gets the file directly from peer(s); can fetch
simultaneously from multiple peers
Features
- Each peer is either a group leader or assigned to a group
leader:
-TCP connection between peer and its group leader.
-TCP connections between some pairs of group
leaders.
-Group leader tracks the content in all its children.
KaZaA: Overview
ordinary peer
group-leader peer
neighoring relationships
in overlay network
KaZaA: Working
• Each file has a hash and a descriptor
• Client sends keyword query to its group leader
• Group leader responds with matches:
- Match Contains: metadata, hash, IP address
• If group leader forwards query to other group leaders,
they respond with matches
• Client then selects files for downloading:
- HTTP requests using hash as identifier is sent to peers
holding desired file
• Group Leader selection is time-based
– Node working safely for quite long is good option for
selection
KaZaA: Working
insert(X,
123.2.21.23)
...cc
Publish
I have X!
123.2.21.23
KaZaA: Working
search(A)
-->
123.2.22.50
search(A)
-->
123.2.22.50
123.2.0.18
Query Replies
Where is file A?
123.2.0.18
KaZaA: Pros & Cons
• Pros:
– Tries to take into account node heterogeneity:
• Bandwidth
• Host Computational Resources
• Host Availability
• Cons:
- No real guarantees on search scope or search time
- Limitations on simultaneous uploads
- Request queuing
- Incentive priorities
- Parallel downloading
What is BitTorrent?
NAME
Hashing
LENGTH .torrent
info
URL of
tracker
.torrent file
Tracker
Swarm
Seeder
Leechers
Sharing Pieces
Initial Seeder
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
Leecher
Seeder Leecher
Seeder 55
The Beauty of BitTorrent
More leechers = more replicas of pieces
56
BitTorrent: Piece Selection
As File is transferred in form of chunks.
Selecting pieces to download in a good order is very
important for good performance.
Poor piece selection algorithm can result in having:
---Little to share with each other
---Limiting the scalability of the system
• Problem: eventually nobody has rare chunks
– E.g., the chunks required are ones which are the end of
file
– Limiting the ability to complete a download
Piece Selection Policies
Strict Priority
– First Priority
• Rarest First
– General rules
• Random First Piece
– Special case, at the beginning
• Endgame Mode
– Special case
Download Phases
100%
59
BitTorrent: Rarest Chunk First
• Which chunks to request first?
– The chunk with the fewest available copies
– I.e., the rarest chunk first
• Benefits to the peer
– Avoid starvation when some peers depart
• Benefits to the system
– Avoid starvation across all peers wanting a file
– Balance load by equalizing # of copies of chunks
Upload and Download Control
How does each peer decide who to trade with?
Incentive mechanism
– Based on tit-for-tat, game theory
– “If you give a piece to me, I’ll give a piece to you”
– “If you screw me over, you get nothing”
– Two mechanisms: choking and optimistic unchoke
Choking Algorithm
• Choking is a temporary refusal to upload
• Each peer unchokes a fixed number of
peers(default 4)
• Reasons:
– To discover whether currently unused connections are
better than the ones being used
– To provide minimal service to new peers
Bit-Torrent: Preventing Free-Riding
N1 N2
N3
Client
Publisher N4 Lookup(H(audio data))
Key=H(audio data)
Value={artist,
album N6 N7 N8
title,
track title}
N9
65
Structure Of DHT
Fault tolerance
Scalability
One node connect with O(log n) other nodes
Different strategies
– Decentralization
– Availability
• successor(k)
first node clockwise from k
Identifier Circle
• Join()
• Create()
• Stabilize()
• Fix Fingers()
• Check_Predecessor()
Node Join
Example illustrating the join operation. Node 26 joins the system between
nodes 21 and 32. The arcs represent the successor relationship.
(a) Initial state:node 21 points to node 32;
(b) node 26 finds its successor (i.e., node 32) and points to it;
Node Join
• Interface:
insert(key, value)
retrieve(key)
y
State of the system at time t
Peer
Resource
Zone
x
In this 2 dimensional space a key is mapped to a point (x,y)
Design Of CAN
• Routing
• Construction
• Maintenance
Routing in CAN