Common Friends Problem

COMMON FRIENDS
PROBLEM
dasfdaffsfsfsfdfsdfsfsf By:
Aditi Jain(19MCMB07)
Manisha Golane(19MCMB03)
CONTENT:
 Introduction
 Possible Solutions
 Input & POJO Solution
 MapReduce Algorithm for Common Friends
 A Semantic-based Friend Recommendation System for
Social Network
 Longest Common Subsequence in MapReduce
 All Pair Shortest Path in MapReduce
INTRODUCTION:
 These days most social network sites such as Facebook, Instagram,
and LinkedIn offer you this service.
 When you visit someone’s profile, you see a list of friends that you
have in common.
 By definition, a friend is a person whom one knows, likes, and trusts.
 For example, Facebook has your list of friends, and friend
relationships are bidirectional on the site; if I’m your friend, you’re
mine too.
 Typically social networks precompute calculations when they can to
reduce the processing time of requests, and one common processing
POSSIBLE SOLUTIONS:
 Use a caching strategy and save the common friends in a
cache (such as Redis or memcached).
 Use MapReduce to calculate everyone’s common friends
once a day and store those results.
1)MapReduce/Hadoop using primitive data types
2)MapReduce/Hadoop using custom data types
3)Spark using RDDs
INPUT:
POJO SOLUTION:
 The Plain Old Java Object Solution.
 Let {A 1 , A 2 , ..., A m } be a set of friends for User 1 and
{B 1 , B 2 , ..., B n } be a set of friendsfor User 2 . Thus,
common friends of User 1 and User 2 can be defined as
the intersection (common elements) of these two sets.
CODE:
MapReduce Algorithm:
 Our MapReduce solution to find “common friends” has both map()
and reduce() functions.
 The mapper accepts a (key 1 , value 1 ) pair, where key 1 is a person
and value 1 is a list of associated friends of that person.
 The mapper emits a set of new (key 2 , value 2 ) pairs; key 2 is a
Tuple2(key 1 , friend i ) , where friend i ∊ value 1 , and value 2 is the
same as value 1 (a list of all friends for key 1 ).
 The reducer’s key is a pair of two users (User j , User k ) and value is
a list of sets of friends.
 The reduce() function will intersect all sets of friends to find common
Map() FUNCTION:
 The mapper’s output keys are sorted via the buildSortedKey() function to
prevent duplicate keys. Note that we assume that friendship is bidirectional:
if Alex is a friend of Bob, then Bob is a friend of Alex.
Reduce() Function:
The MapReduce Algorithm in Action:
 After applying the map() function on the given data, following

key-pairs are generated:
A SEMANTIC-BASED FRIEND
RECOMMENDATION SYSTEM FOR SOCIAL
NETWORKS:
 Semantic-based friend recommendation system, which recommends
friends to users based on their lifestyles. Similarity of lifestyles
between users is measured by a similarity metric and the system
recommends those friends to the users that share high similarity
among their lifestyles.
 The rules to group people together include:
Shortest Path Based Potential Common
Friend Recommendation in Social Networks:
 Recommending new friends is based on either topological structure of

OSN or user profiles to recommend new friends.
 Floyd Warshall algorithm to get the top-k shortest paths between two
undirectly connected users.
 Extend the longest common subsequence(LCS) algorithm to get the
potentially common friends.
Floyd Warshall :Top-k shortest
path algorithm
 Algorithm 1: Top-k-Path()
 Input: The start string start − user and the end string end − user
 Output: The Top-k Shortest Path
 let dist be a |V| × |V| array of minimum distances initialized to ∞ (infinity)
 2 for each edge (u,v)
 3 dist[u][v] ← w(u,v) // the weight of the edge (u,v)
 4 for each vertex v
 5 dist[v][v] ← 0
 6 for k from 1 to |V|
 7 for i from 1 to |V|
 Algorithm 2: DeleteEdge()
 Input: The labels label1 and label2 corresponding two users
 Output: The graph by deleting the edge between label 1 & 2
 tArr[0] = label1;
 tArr[1]= label2;
 num1= 0;
 num2 = 0;
 for (i=0;i<nVerts;i++) do
 if vertexList[i].label.equalsIgnoreCase(label1) then
 num1 = i;
 end
ALL PAIR SHORTEST PATH IN
MAPREDUCE:
 In case of hadoop framework, to make the algorithm efficient
for running it parallel on several machines the algorithm(Floyds)
has to be modified.
 The input to the program is a adjacency list of the form

1 3,43|
2 4,18|
3 2,31|1,32|4,14|
4 2,27|3,23|5,48|
Mapper:
 Mapper class takes the entire file an input and parses it line by line.
 Let the consider nodeId be "k".
 For a vertex "i" adjacent to "k" it emits a new node.
 For generating the new node, the algorithm iterates through the nodes
adjacent to "k"
 for each j adjacent to k it sums the distances dist(i,k) and
 dist(k,j) and sets dist(i,j) as the sum...
 dist(i,j)=dist(i,k)+dist(k,j)
st
 After 1 Iteration:
 3 1,43|
 4 2,18|
 2 3,31|1,63|4,45|
 1 3,32|2,63|4,46|
 4 3,14|2,45|1,46|
 2 4,27|3,50|5,75|
 3 4,23|2,50|5,71|
Reducer
 The output of the mapper will be the input to the reducer
class.
 A list of values for a particular are sent to the reducer.
 The value is the updated adjacency list using each node
adjacent to the keyId.
 For each vertex it maintains a boolean variable isConverged
to
 indicate whether that vertex has been used to update
distances to its adjacent vertices.
Reducer contd..
 The output emitted by the reducer after the first iteration is :
 1 2,63|3,32|4,46|5,23|
 2 1,63|3,31|4,18|5,75|
 3 1,32|2,31|4,14|5,71|
 4 1,46|2,18|3,14|5,48|
 5 1,23|2,75|3,71|4,48|
Final Output
 The final output after all the iterations have been completed
is :
 1 2,63|3,32|4,46|5,23|
 2 1,63|3,31|4,18|5,66|
 3 1,32|2,31|4,14|5,55|
 4 1,46|2,18|3,14|5,48|
 5 1,23|2,66|3,55|4,48|
The Extended Longest Common
Subsequence algorithm(LCS)
 To find the potentially common friends between two
undirectly connected users.
 The inputs of the algorithm are two strings s1 and s2 which

are used to store any two shortest path sequence obtained
from the top-k shortest algorithm1.
 Example:
 Case 1:
 String A: "ABCD", String B: "AEBD"
 LCS("ABCD", "AEBD") = 1 + LCS("ABC", "AEB")
 Case 2:

 String A: "ABCDE", String B: "AEBDF"

 LCS("ABCDE", "AEBDF") = Max(LCS("ABCDE",
 In implementing MapReduce strategy, the program is
divided into 4 steps, which are:
 (1) Preliminary Step
 (2) Mapper Step
 (3) Combiner Step and
 (4) Reducer Step.
 The output of the one step is fed to the consequent next step
as input and final step output is received as program output.
Preliminary Step:
 1. Preliminary Step Here, we have to calculate the LCS of strings
str1 and str2.
 First, we take partition size as 15 and divide each string as a 15
char substring and keep the string sequence order.
 This work is done in Preliminary Step.
 Input str1=ABCBDABATCGACGATCGGGGTTCT
TCACCACGGGGTTCTTCACCAGAGTTATCT
str2=BDCABACTCAGGCACCGCAGTGACA
 AAAGTCGCAGTGACAAAAGTCAGGACGGC
 Partition size: 15
Mapper Step:
 2. Mapper Step The preprocess string is fed to the
mapper. Mapper Process is responsible for finding LCS
of the small substring of str1 and str2.
 Along with LCS, it also calculates A, B, E, F and its
index which is helpful to calculate combine LCS in an
upcoming step.
 Input
 Mapper1: ’a’: ’ABCBDABATCGACGA’, ’index’: 0, ’b’:
’BDCABACTCAGGCAC’
Combiner Step:
 Combiner Step Combiner step combines the output of the 2 mappers
within the system to give intermediate output.
 Input
 Combiner1: Key: 0 Value: [[0, ’A’: ’ABC’, ’a’: 0, ’B’: u”, ’E’: u”, ’lcs’:
’BDABATCAGA’, ’F’: ’C’, ’f’: 15, ’b’: 15, ’e’: 0, ’s2’:
’BDCABACTCAGGCAC’, ’s1’: ’ABCBDABATCGACGA’], [1, ’A’: ’T’, ’a’:
0, ’B’: u”, ’E’: u”, ’lcs’: ’CGGTAC’, ’F’: ’AAAAGT’, ’s2’:
’CGCAGTGACAAAAGT’, ’f’: 15, ’b’: 15, ’e’: 0, ’s1’:
’TCGGGGTTCTTCACC’]]
 Combiner2: Key: 1 value: [[2, ’A’: ’A’, ’a’: 0, ’B’: u”, ’E’: u”, ’lcs’:
’CGGTAC’, ’F’: ’AAAAGT’, ’s2’: ’CGCAGTGACAAAAGT’, ’s1’:
’ACGGGGTTCTTCACC’, ’f’: 15, ’b’: 15, ’e’: 0], [3, ’A’: u”, ’a’: 0, ’B’: u”,
’E’: ’C’, ’lcs’: ’AGGAC’, ’F’: ’GGC’, ’s2’: ’CAGGACGGC’, ’s1’:
Reducer Step:
 Reducer Process Finally, reducer step reduces the
intermediate output from all combiner to one final lcs.
 Input:
 Key: lcs
 value: [[0, ’a’: 0, ’A’: u”, ’b’: 30, ’e’: 0, ’lcs’:
’BDABATCAGACCGGTAC’, ’f’: 30, ’s2’:
’BDCABACTCAGGCACCGCAGTGACAAAAGT’, ’s1’:
’ABCBDABATCGACGATCGGGGTTCTTCACC’, ’F’: u”,
’B’: u”, ’E’: u”], [1, ’a’: 0, ’A’: u”, ’b’: 25, ’e’: 0, ’lcs’:
’CGGTACAGGAC’, ’f’: 24, ’s2’:
THANKYOU

Common Friends Problem

Uploaded by

Copyright:

Available Formats

Common Friends Problem

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Common Friends Problem

Uploaded by

Copyright:

Available Formats

COMMON FRIENDS

 After applying the map() function on the given data, following

 Recommending new friends is based on either topological structure of

 The input to the program is a adjacency list of the form

 The inputs of the algorithm are two strings s1 and s2 which

 String A: "ABCDE", String B: "AEBDF"

You might also like