Common Friends Problem
Common Friends Problem
Common Friends Problem
PROBLEM
dasfdaffsfsfsfdfsdfsfsf By:
Aditi Jain(19MCMB07)
Manisha Golane(19MCMB03)
CONTENT:
Introduction
Possible Solutions
Input & POJO Solution
MapReduce Algorithm for Common Friends
A Semantic-based Friend Recommendation System for
Social Network
Longest Common Subsequence in MapReduce
All Pair Shortest Path in MapReduce
INTRODUCTION:
These days most social network sites such as Facebook, Instagram,
and LinkedIn offer you this service.
When you visit someone’s profile, you see a list of friends that you
have in common.
By definition, a friend is a person whom one knows, likes, and trusts.
For example, Facebook has your list of friends, and friend
relationships are bidirectional on the site; if I’m your friend, you’re
mine too.
Typically social networks precompute calculations when they can to
reduce the processing time of requests, and one common processing
POSSIBLE SOLUTIONS:
Use a caching strategy and save the common friends in a
cache (such as Redis or memcached).
Use MapReduce to calculate everyone’s common friends
once a day and store those results.
1)MapReduce/Hadoop using primitive data types
2)MapReduce/Hadoop using custom data types
3)Spark using RDDs
INPUT:
POJO SOLUTION:
The Plain Old Java Object Solution.
Let {A 1 , A 2 , ..., A m } be a set of friends for User 1 and
{B 1 , B 2 , ..., B n } be a set of friendsfor User 2 . Thus,
common friends of User 1 and User 2 can be defined as
the intersection (common elements) of these two sets.
CODE:
MapReduce Algorithm:
Our MapReduce solution to find “common friends” has both map()
and reduce() functions.
The mapper accepts a (key 1 , value 1 ) pair, where key 1 is a person
and value 1 is a list of associated friends of that person.
The mapper emits a set of new (key 2 , value 2 ) pairs; key 2 is a
Tuple2(key 1 , friend i ) , where friend i ∊ value 1 , and value 2 is the
same as value 1 (a list of all friends for key 1 ).
The reducer’s key is a pair of two users (User j , User k ) and value is
a list of sets of friends.
The reduce() function will intersect all sets of friends to find common
Map() FUNCTION:
The mapper’s output keys are sorted via the buildSortedKey() function to
prevent duplicate keys. Note that we assume that friendship is bidirectional:
if Alex is a friend of Bob, then Bob is a friend of Alex.
Reduce() Function:
The MapReduce Algorithm in Action:
3 1,43|
4 2,18|
2 3,31|1,63|4,45|
1 3,32|2,63|4,46|
4 3,14|2,45|1,46|
2 4,27|3,50|5,75|
3 4,23|2,50|5,71|
Reducer
The output of the mapper will be the input to the reducer
class.
A list of values for a particular are sent to the reducer.
The value is the updated adjacency list using each node
adjacent to the keyId.
For each vertex it maintains a boolean variable isConverged
to
indicate whether that vertex has been used to update
distances to its adjacent vertices.
Reducer contd..
The output emitted by the reducer after the first iteration is :
1 2,63|3,32|4,46|5,23|
2 1,63|3,31|4,18|5,75|
3 1,32|2,31|4,14|5,71|
4 1,46|2,18|3,14|5,48|
5 1,23|2,75|3,71|4,48|
Final Output
The final output after all the iterations have been completed
is :
1 2,63|3,32|4,46|5,23|
2 1,63|3,31|4,18|5,66|
3 1,32|2,31|4,14|5,55|
4 1,46|2,18|3,14|5,48|
5 1,23|2,66|3,55|4,48|
The Extended Longest Common
Subsequence algorithm(LCS)
To find the potentially common friends between two
undirectly connected users.
The output of the one step is fed to the consequent next step
as input and final step output is received as program output.
Preliminary Step:
1. Preliminary Step Here, we have to calculate the LCS of strings
str1 and str2.
First, we take partition size as 15 and divide each string as a 15
char substring and keep the string sequence order.
This work is done in Preliminary Step.
Input str1=ABCBDABATCGACGATCGGGGTTCT
TCACCACGGGGTTCTTCACCAGAGTTATCT
str2=BDCABACTCAGGCACCGCAGTGACA
AAAGTCGCAGTGACAAAAGTCAGGACGGC
Partition size: 15
Mapper Step:
2. Mapper Step The preprocess string is fed to the
mapper. Mapper Process is responsible for finding LCS
of the small substring of str1 and str2.
Along with LCS, it also calculates A, B, E, F and its
index which is helpful to calculate combine LCS in an
upcoming step.
Input
Mapper1: ’a’: ’ABCBDABATCGACGA’, ’index’: 0, ’b’:
’BDCABACTCAGGCAC’
Combiner Step:
Combiner Step Combiner step combines the output of the 2 mappers
within the system to give intermediate output.
Input
Combiner1: Key: 0 Value: [[0, ’A’: ’ABC’, ’a’: 0, ’B’: u”, ’E’: u”, ’lcs’:
’BDABATCAGA’, ’F’: ’C’, ’f’: 15, ’b’: 15, ’e’: 0, ’s2’:
’BDCABACTCAGGCAC’, ’s1’: ’ABCBDABATCGACGA’], [1, ’A’: ’T’, ’a’:
0, ’B’: u”, ’E’: u”, ’lcs’: ’CGGTAC’, ’F’: ’AAAAGT’, ’s2’:
’CGCAGTGACAAAAGT’, ’f’: 15, ’b’: 15, ’e’: 0, ’s1’:
’TCGGGGTTCTTCACC’]]
Combiner2: Key: 1 value: [[2, ’A’: ’A’, ’a’: 0, ’B’: u”, ’E’: u”, ’lcs’:
’CGGTAC’, ’F’: ’AAAAGT’, ’s2’: ’CGCAGTGACAAAAGT’, ’s1’:
’ACGGGGTTCTTCACC’, ’f’: 15, ’b’: 15, ’e’: 0], [3, ’A’: u”, ’a’: 0, ’B’: u”,
’E’: ’C’, ’lcs’: ’AGGAC’, ’F’: ’GGC’, ’s2’: ’CAGGACGGC’, ’s1’:
Reducer Step:
Reducer Process Finally, reducer step reduces the
intermediate output from all combiner to one final lcs.
Input:
Key: lcs
value: [[0, ’a’: 0, ’A’: u”, ’b’: 30, ’e’: 0, ’lcs’:
’BDABATCAGACCGGTAC’, ’f’: 30, ’s2’:
’BDCABACTCAGGCACCGCAGTGACAAAAGT’, ’s1’:
’ABCBDABATCGACGATCGGGGTTCTTCACC’, ’F’: u”,
’B’: u”, ’E’: u”], [1, ’a’: 0, ’A’: u”, ’b’: 25, ’e’: 0, ’lcs’:
’CGGTACAGGAC’, ’f’: 24, ’s2’:
THANKYOU