Academia.eduAcademia.edu

PageRank Algorithm using Eigenvector Centrality -- New Approach

2022

The purpose of the research is to find a centrality measure that can be used in place of PageRank and to find out the conditions where we can use it in place of PageRank. After analysis and comparison of graphs with a large number of nodes using Spearman's Rank Coefficient Correlation, the conclusion is evident that Eigenvector can be safely used in place of PageRank in directed networks to improve the performance in terms of the time complexity.

PageRank Algorithm using Eigenvector Centrality- New Approach Saumya Chandrashekhar Suvarna Computer Science Department Vellore Institute of Technology Mashrin Srivastava Prof. B Jaganathan Dr. Pankaj Shukla Computer Science Department Vellore Institute of Technology [email protected] Mathematics Department Vellore Institute of Technology [email protected] Mathematics Department Vellore Institute of Technology [email protected] [email protected] Abstract—The purpose of the research is to find a centrality measure that can be used in place of PageRank and to find out the conditions where we can use it in place of PageRank. After analysis and comparison of graphs with a large number of nodes using Spearman’s Rank Coefficient Correlation, the conclusion is evident that Eigenvector can be safely used in place of PageRank in directed networks to improve the performance in terms of the time complexity. Keywords— PageRank, Centrality, Eigenvector, Webgraph I. INTRODUCTION In today's era of computer technology with the vast usage of the World Wide Web, users want a fast and accurate answer. It is the need of the hour to explore every avenue for quicker results and better time complexity. Thus a constant reevaluation of existing algorithms is necessary. Centrality measures in any kind of network highlights the important parts of that network. In dense networks, there is a potential for more importance as compared to sparse networks. Ranking is essential in terms of finding the most important or influential nodes in a network. II. EXISTING ALGORITHMS[1] Degree centrality selects the most important node based on the principle that an important node is involved in a lot of interactions. However degree centrality doesn’t take into account the importance of the nodes it is connected to. Closeness centrality decides the importance of the nodes according to the concept that important nodes will be able to communicate quickly with the nodes. Betweenness centrality finds the most important node based on the theory that if a node is important it will lie on the path between two nodes. Betweenness and closeness centrality is mostly redundant when it comes to ranking of webpages as it attaches no proper importance to the relevant webpages. Degree centrality is relevant however it doesn’t take into account the importance of the node pointing to it hence it gives ample opportunity for malicious and inauthentic websites to be ranked first. Apart from the abovementioned algorithms, other algorithms to find the ranking of a webpage include A. The Existing Algorithms in Use Eigentrust Algorithm[2] : Peer-to-peer network consists of partitioning tasks between various peers such that each peer is equally privileged. This method is popular for sharing information but due to its open source nature, the spread of inauthentic files is easy. Eigentrust Algorithm assigns a rank to each peer based on the past uploads made by the peer in an attempt to reduce the rank of inauthentic or malicious peers trying to pose as peers distributing authentic files. SimRank: There are many areas where the similarity of objects or interests come into play. The most obvious one being the use to find similar documents on the World Wide Web. Another is to group people with similar interests together, for example the recommended friends list by social media sites. SimRank measures the similarity between two objects based on the basis of the objects that they are referenced by. TrustRank[3]: Web spam pages are created in order to mislead the search engines by using various techniques to get a higher rank. TrustRank involves manually identifying a set of authentic websites known as seed pages and sending a crawl to identify pages similar to the seed pages. In Anti-Trust Rank[4] unauthentic or malicious websites are found out and websites close to that are non-trustworthy websites. Trust rank is decreased as it moves father away from the seed site. B. Algorithms Used By GOOGLE Google is the most popular search engine used. The reason being that it is able to provide relevant search results quickly. The very first algorithm used by Google was PageRank algorithm by Larry page and Sergey Brin. However incidents like Google bombing, where a search for a topic can lead to the webpages of a seemingly unrelated topic prompted tweaks in algorithms used by Google. Google bombs can be a done for business, political or comical reasons. It works on the principle of spamdexing, which is the manipulation of the index used by the search engines and heavy linking of websites. In order to reduce incidents like Google bombing, Google uses many algorithms along with the original PageRank algorithm to ascertain the final ranking of a webpage. These algorithms includeGoogle Panda: Authority is one of the key measurements of ranking in search engines. The trust can be measured by the authority of the link to the article, where the more a link is authoritative, the more the article it points to can be trusted. Google Panda is essentially a filter of the content quality, depreciating low quality websites including websites that have unoriginal or redundant content, contains a huge number of advertisements, have content that is not written or authorized by an expert. If the panda rank score of a site is high, the website gets ‘Pandified’ which means that the pages in the site are imposed with a penalty. Google Hummingbird: Google Hummingbird attempts to judge the intent of the person making the query ie. to take into consideration the meaning of the sentence as a whole as compared to certain keywords. It is said that Google uses the vast database of information or the ‘Knowledge graph’ to determine the best results. Google Penguin: Google Penguin assigns a penalty to the webpages which are involved in the usage of black hat Search Engine Optimization techniques or violation of Google’s webmaster guidelines. The goal is to do away with keyword stuffing, link building, meaningless and irrelevant content and doorway pages or paying for links to the website. III. ALGORITHMS FOR COMPARISON A. Existing: PageRank Algorithm PageRank[5] centrality defines the importance of nodes based on the number and quality of nodes connected to it. The World Wide Web can be represented as a directed graph in which every webpage is represented by a node and the edges pointing to a node represents the links pointing to a webpage and the edges pointing away from the node represents the links that are pointing towards other websites. The most relevant page will be decided not only by the in-degree but also the importance of the node pointing towards the node in question. PageRank is basically based on the probability of a person surfing the web by randomly clicking on links to stop. The probability that the person will continue to click on the links is given by the damping factor and studies suggest that the ideal value is 0.85. The formula is given by PRa = (1 − d ) + d PRb ∑ Olb b∈S The above equation represents the eigenvector centrality in the matrix form, where A is the adjacency matrix and λ is the largest eigenvalue of the adjacency matrix. x is the vector containing the centrality score of the node. The PerronFrobenius theorem states that there is atleast one non-negative eigenvector corresponding to the largest eigenvalue of a square matrix. It can also be defined as the eigenvector corresponding to the largest eigenvalue λ of the matrix A. n a Where Sa is a set of all the webpages pointing to a page ‘a’ and Olb is the out-links from the page, d is the damping factor λ xk = l=i . A= For the Graph given in Fig.1, the value of PageRank is given by The time complexity of this algorithm using adjacency matrix as input is Ο(V 3), where V is the number of vertices. The initial value of Page Rank is assumed and the subsequent values are calculated iteratively till the values converge. igraph[6] has stated that the time complexity for the algorithm with input as adjacency list is Ο( E + V ) where V is the total number of vertices and E is the total number of edges in the graph. B. Proposed: Eigenvector Centrality Eigenvector centrality[7] defines the importance of a node in the graph by the importance of the nodes connected to it. It can be used for finding the most influencing person or for the identification of the spread of infection. One of the most interesting applications of eigenvector centrality is its potential to be used for analyzing the connectivity patterns in the data obtained from a functional magnetic resonance scale[8]. Let us assume a graph G that has E edges and V vertices. The graph can be given by G(E,V). If A is the adjacency matrix, for the graph G, the value of aij will be equal to the weight of the edges of the graph. If the graph is a directed graph, and the direction is from i to j ie. the head is at j node and the tail is at i node, then the value of aij will be equal to the edge weight. If the graph is undirected, both the values of aij and aji will be equal to the edge weight. If two vertices are not connected, the value of aij is 0. Ax = λx k = 1,2, 3…… . n akl xl The above equation represents the eigenvector centrality in a sum form where λ is the largest eigenvalue of the adjacency matrix and akl represents the entry in the kth row and lth column. For the graph given in Fig.1, the adjacency matrix will be Fig. 1. A 7 node graph used to illustrate value of the centrality measures. 1.5242 1.2125 1 PR = 0.8867 1.0283 1.0283 0.3200 ∑ The 0 1 0 1 1 0 0 1 0 1 0 1 1 0 1 0 0 1 1 1 1 eigenvalues 0 1 1 0 1 1 0 1 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 0 0 0 1 0 calculated are λ0 = 3.3911 , λ1 = − 1.1290 + 1.0560i, λ2 = − 1.1290–1.0560i, λ3 = − 0.0334 + 0.8151i, λ4 = − 0.0334 − 0.8151i, λ5 = 0.1522 , λ6 = − 1.2185 . The principal or the largest eigenvalue is λ 0 = 3.3911 and the corresponding eigenvector which also represents the eigenvector centrality of the graph given in Fig.1 is v0= 0.4924 0.3853 0.3564 0.3548 0.4686 0.3544 0.1051 The time complexity of this algorithm using adjacency matrix as an input is Ο(V 2), where V is the total number of vertices. The calculation of the time complexity hinges on the calculation of the eigenvector as the rest of the processes can be done in constant time. igraph has stated that the time complexity for the algorithm using adjacency list as input is Ο ( V ) . Time complexity analysis consists of the assumption that n → ∞ . If we consider each webpage to be a node in a graph this assumption becomes true due to the vastness of the websites that is available on the internet. In a directed graph, in the case that every node (V) is connected to every other node (V-1), the maximum number of edges can be given by V(V − 1). Hence in the calculation of the time complexity of the worst case, we can take the value of E as V 2 hence giving the time complexity as O(V 2 ), thus proving that the time complexity to calculate the values of the eigenvector centrality PageRank algorithm. O(V ) is better than the existing Fig. 4. A 21 node graph used for comparison of the centrality measures. TABLE 2 COMPARISON BETWEEN EIGENVECTOR CENTRALITY AND PAGERANK CENTRALITY FOR THE GRAPH SHOWN IN FIG. 4 IV. CALCULATION AND VERIFICATION In our endeavor to prove that our hypothesis is right, we have found the eigenvector and PageRank values for several graphs. We initially started with 21 node directed graphs with each node representing a webpage. However, practically it is irrelevant to have such few webpages in a network therefore we increased the number of nodes to 50 and then 100 to simulate the trends with the increase in the number of webpages. Vertex Page rank Rank Eigenvector Rank v13 1.3884 1 0.3011 1 v9 1.2609 2 0.2837 2 v17 1.2511 3 0.2622 4 v3 1.2085 4 0.2755 3 v19 1.1892 5 0.2574 5 Fig. 3. A 21 node graph used for comparison of the centrality measures. v8 1.1586 6 0.2452 7 TABLE 1 COMPARISON BETWEEN EIGENVECTOR CENTRALITY AND PAGERANK CENTRALITY FOR GRAPH SHOWN IN FIG.3 v6 1.1387 7 0.2467 6 v20 1.0904 8 0.2265 10 v11 1.0795 9 0.2343 8 v18 1.0750 10 0.2294 9 v12 0.9454 11 0.1951 11 v21 0.9312 12 0.1935 13 v7 0.9183 13 0.1912 14 Vertex Page rank Rank Eigenvector Rank v10 0.9053 14 0.194 12 v14 0.9037 15 0.1858 16 v2 0.9028 16 0.1893 15 v5 0.8496 17 0.1838 17 v4 0.7528 18 0.1558 18 v16 0.7365 19 0.1519 20 v15 0.7179 20 0.1531 19 v1 0.5964 21 0.1161 21 Vertex Page rank Centrality Rank Eigenvector Centrality Rank v19 1.4466 1 0.3168 1 v16 1.2509 2 0.273 3 v2 1.2364 3 0.2818 2 v15 1.1694 4 0.2517 5 v17 1.1688 5 0.254 4 v20 1.1065 6 0.2391 6 v12 1.0696 7 0.2262 7 v8 1.0356 8 0.2224 8 v3 1.0296 9 0.2139 10 v13 0.9872 10 0.2145 9 v10 0.9800 11 0.2112 11 v6 0.9744 12 0.2064 13 v7 0.9407 13 0.2085 12 v5 0.8944 14 0.1852 16 v4 0.8915 15 0.1884 14 v9 0.8843 16 0.1799 17 v18 0.8658 17 0.1882 15 Fig. 5. A 50 node graph used for comparison of the centrality measures. v21 0.8261 18 0.175 18 v14 0.8227 19 0.1674 19 TABLE 3 COMPARISON BETWEEN EIGENVECTOR CENTRALITY AND PAGERANK CENTRALITY FOR THE GRAPH SHOWN IN FIG. 5 v1 0.7954 20 0.1612 20 v11 0.6242 21 0.1239 21 . Vertex PageRan k Rank Eigenvector Rank v1 1.2113 1 0.178 1 v28 1.2057 2 0.1746 2 v11 1.1892 3 0.1733 3 v48 1.1889 4 0.1715 4 v12 0.8243 44 0.1144 44 v42 1.1646 5 0.1652 5 v46 0.8193 45 0.1102 45 v8 1.1379 6 0.162 6 v18 0.8165 46 0.1047 48 v5 1.1314 7 0.1609 7 v43 0.8086 47 0.1046 49 v4 1.1222 8 0.1596 9 v40 0.7931 48 0.1074 46 v21 1.1181 9 0.1606 8 v14 0.7762 49 0.1051 47 v2 1.1096 10 0.1561 11 v29 0.7466 50 0.0961 50 v50 1.1041 11 0.1587 10 v25 1.0955 12 0.1558 12 v34 1.0772 13 0.1517 14 v39 1.0678 14 0.1497 17 v44 1.065 15 0.1531 13 v27 1.064 16 0.1496 18 v7 1.0609 17 0.1434 24 v30 1.0606 18 0.15 15 v22 1.0486 19 0.1498 16 v35 1.0408 20 0.1457 20 v41 1.0335 21 0.1458 19 v9 1.0331 22 0.144 23 v38 1.0297 23 0.1446 22 v3 1.0248 24 0.1422 26 v36 1.0225 25 0.1452 21 v16 1.0168 26 0.1404 28 v6 1.015 27 0.1431 25 v24 1.0089 28 0.1399 30 v13 1.0043 29 0.1418 27 v17 0.9951 30 0.14 29 v32 0.9915 31 0.1399 30 v37 0.986 32 0.1393 32 v20 0.9787 33 0.1343 33 v47 0.9483 34 0.1295 36 v10 0.9298 35 0.1319 34 v23 0.929 36 0.129 37 v45 0.9248 37 0.131 35 v31 0.9098 38 0.1279 38 v26 0.8998 39 0.1249 39 v33 0.8924 40 0.1227 40 v15 0.8881 41 0.1214 41 v49 0.8583 42 0.1151 42 Page rank Rank 0.8319 43 Vertex v19 Eigenvector 0.1151 Fig. 6. A 50 node graph used for comparison of the centrality measures. TABLE 4 COMPARISON BETWEEN EIGENVECTOR CENTRALITY AND PAGERANK CENTRALITY FOR THE GRAPH SHOWN IN FIG. 6 Vertex PageRank Rank Eigenvector Rank v30 1.2687 1 0.1821 1 v12 1.2547 2 0.1811 3 v11 1.2358 3 0.1814 2 v5 1.1997 4 0.1739 4 v50 1.1722 5 0.1654 5 v2 1.1581 6 0.1641 7 v6 1.1392 7 0.1651 6 v32 1.1296 8 0.1623 8 v8 1.1268 9 0.1614 9 v27 1.1218 10 0.1585 10 v24 1.1084 11 0.1568 11 v43 1.0988 12 0.1523 16 v9 1.0899 13 0.1568 11 v4 1.0839 14 0.1548 14 v35 1.0799 15 0.1505 17 v23 1.0786 16 0.1526 15 v45 1.0778 17 0.1551 13 v22 1.0636 18 0.1486 19 v28 1.0577 19 0.1466 21 v20 1.0549 20 0.1501 18 v31 1.044 21 0.148 20 v44 1.031 22 0.1447 22 v41 1.0235 23 0.1417 23 v36 1.0091 24 0.1371 26 v15 1.0058 25 0.1414 24 Rank 42 v33 1.002 26 0.1384 25 v52 1.1044 12 0.1099 16 v21 0.9746 27 0.1367 27 v9 1.1004 13 0.1117 11 v26 0.966 28 0.1331 28 v85 1.0982 14 0.1098 17 v3 0.9568 29 0.131 31 v56 1.0928 15 0.1101 14 v39 0.9422 30 0.1325 30 v94 1.0899 16 0.1106 13 v10 0.9409 31 0.1305 32 v100 1.0845 17 0.1101 14 v42 0.9397 32 0.1327 29 v2 1.082 18 0.1088 19 v25 0.9256 33 0.1291 33 v83 1.0805 19 0.1096 18 v47 0.9227 34 0.127 35 v84 1.0691 20 0.1073 22 v34 0.9205 35 0.1275 34 v87 1.0672 21 0.1088 19 v13 0.9203 36 0.1265 36 v26 1.0669 22 0.1069 24 v48 0.8949 37 0.1215 37 v34 1.0667 23 0.1064 25 v1 0.8937 38 0.1179 42 v11 1.0658 24 0.1074 21 v29 0.8916 39 0.1196 40 v4 1.0642 25 0.1057 28 Vertex Page rank Rank Eigenvector Rank v95 1.0578 26 0.107 23 v37 0.8728 40 0.1203 39 v53 1.0558 27 0.1056 29 v16 0.8691 41 0.1166 46 v68 1.0544 28 0.1047 33 v18 0.8634 42 0.1173 43 v71 1.054 29 0.1063 26 v7 0.8631 43 0.1171 44 v75 1.0536 30 0.1051 30 v38 0.8534 44 0.1215 37 v81 1.0498 31 0.105 32 v14 0.8439 45 0.1171 44 v86 1.0479 32 0.104 38 v40 0.8405 46 0.1191 41 v35 1.0473 33 0.1045 35 v19 0.8368 47 0.1148 48 v12 1.0454 34 0.1059 27 v46 0.8245 48 0.1151 47 v27 1.0452 35 0.1047 33 v49 0.8202 49 0.1103 49 v54 1.0442 36 0.1051 30 v17 0.7071 50 0.0896 50 v21 1.041 37 0.1043 36 v89 1.0405 38 0.1021 42 v15 1.0393 39 0.104 38 v80 1.0309 40 0.1043 36 TABLE 5 COMPARISON BETWEEN EIGENVECTOR CENTRALITY AND PAGERANK CENTRALITY FOR A 100 NODE GRAPH Vertex PageRank Rank Eigenvector Rank v30 1.0294 41 0.102 43 v8 1.254 1 0.1294 1 v22 1.0274 42 0.1023 41 v24 1.1787 2 0.1196 2 Vertex PageRank Rank Eigenvector Rank v55 1.1681 3 0.1185 3 v57 1.0176 44 0.1006 45 v28 1.1599 4 0.1165 4 v82 1.0165 45 0.1024 40 v61 1.1446 5 0.1164 5 v44 1.0159 46 0.1006 45 v62 1.1398 6 0.1148 7 v25 1.0137 47 0.1004 47 v91 1.1364 7 0.1138 8 v77 1.0088 48 0.1004 47 v1 1.1207 8 0.1133 9 v41 1.0007 49 0.099 52 v6 1.117 9 0.1151 6 v32 0.9993 50 0.0983 57 v76 1.1141 10 0.113 10 v5 0.9989 51 0.1004 47 v48 1.112 11 0.1112 12 v78 0.9925 52 0.0986 55 v98 0.9912 53 0.0987 53 v46 0.855 94 0.085 89 v14 0.9908 54 0.0972 60 v38 0.8536 95 0.0838 93 v66 0.9906 55 0.0986 55 v69 0.8368 96 0.0796 97 v63 0.9905 56 0.098 58 v40 0.8301 97 0.0805 96 v39 0.9879 57 0.0994 50 v17 0.8224 98 0.0789 98 v20 0.9878 58 0.0991 51 v23 0.8188 99 0.0781 99 v74 0.9827 59 0.0973 59 v96 0.7894 100 0.0743 100 v92 0.9825 60 0.0971 62 v58 0.9816 61 0.0987 53 v33 0.9789 62 0.096 65 v59 0.9778 63 0.0967 63 v64 0.9744 64 0.0972 60 v73 0.9707 65 0.0967 63 v97 0.9676 66 0.0955 67 v36 0.9601 67 0.0958 66 v37 0.9562 68 0.0934 73 v88 0.9562 68 0.0947 69 v51 0.955 70 0.0941 72 v19 0.9548 71 0.0953 68 v60 0.9475 72 0.0942 71 v45 0.9474 73 0.0943 70 v43 0.9463 74 0.0928 74 v50 0.9455 75 0.0917 76 v42 0.9451 76 0.0918 75 v47 0.9403 77 0.0915 77 v18 0.9269 78 0.0901 83 v3 0.9234 79 0.0915 77 v72 0.9232 80 0.0909 79 v90 0.9186 81 0.0907 80 v13 0.9176 82 0.0907 80 v67 0.9171 83 0.0901 83 v29 0.9012 84 0.0881 86 v31 0.9012 84 0.0903 82 v99 0.8988 86 0.0873 87 v65 0.8973 87 0.0897 85 v79 0.8888 88 0.0872 88 v70 0.8824 89 0.0847 91 v7 0.8774 90 0.0849 90 v16 0.8729 91 0.084 92 v10 0.8585 92 0.0833 95 v49 0.856 93 0.0835 94 TABLE 6 COMPARISON BETWEEN EIGENVECTOR CENTRALITY AND PAGERANK CENTRALITY FOR A 100 NODE GRAPH Vertex PageRank Rank Eigenvector Rank v51 1.1992 1 0.1222 1 v77 1.1976 2 0.1215 2 v47 1.1793 3 0.1203 3 v50 1.1453 4 0.1167 5 v11 1.1409 5 0.1169 4 v42 1.1263 6 0.114 7 v82 1.1257 7 0.1131 9 v73 1.1236 8 0.1141 6 v66 1.1204 9 0.1132 8 v17 1.1189 10 0.1123 10 v100 1.1024 11 0.111 11 v21 1.0941 12 0.1099 14 v7 1.0884 13 0.1107 12 v31 1.0882 14 0.1099 14 v41 1.0827 15 0.1075 23 v84 1.0814 16 0.1071 24 v98 1.0806 17 0.1099 14 v86 1.0791 18 0.11 13 v90 1.0782 19 0.1086 18 v27 1.0774 20 0.1093 17 v70 1.0744 21 0.1077 21 v69 1.0682 22 0.1077 21 v32 1.0671 23 0.1081 19 v38 1.0651 24 0.1078 20 v97 1.0603 25 0.1043 34 v58 1.0594 26 0.1066 25 v62 1.0562 27 0.1056 30 v78 1.0538 28 0.106 28 v52 1.0522 29 0.1063 26 v80 1.0522 29 0.1062 27 v44 1.0517 31 0.104 37 v22 0.9555 72 0.0928 76 v28 1.0476 32 0.1059 29 v23 0.9541 73 0.0935 74 v36 1.0414 33 0.1047 31 v88 0.9504 74 0.0933 75 v9 1.0359 34 0.1039 39 v34 0.9502 75 0.0924 77 v63 1.0335 35 0.1038 40 v91 0.9471 76 0.0945 70 v89 1.0329 36 0.1043 34 v83 0.9451 77 0.0939 72 v92 1.0329 36 0.1044 32 v20 0.9413 78 0.0924 77 v30 1.0305 38 0.1044 32 v96 0.9389 79 0.0916 80 v19 1.029 39 0.104 37 v93 0.9309 80 0.0913 81 v43 1.0273 40 0.1024 43 v26 0.9248 81 0.092 79 v94 1.0265 41 0.1041 36 v10 0.9207 82 0.0892 83 v85 1.0258 42 0.1019 45 v99 0.9188 83 0.0885 84 v35 1.0252 43 0.1025 42 v76 0.9117 84 0.0893 82 v79 1.0247 44 0.1036 41 v45 0.9083 85 0.0884 85 v49 1.0233 45 0.1017 46 v37 0.9062 86 0.0872 89 v60 1.0195 46 0.1006 49 v56 0.9019 87 0.0876 87 v55 1.018 47 0.1014 47 v74 0.8942 88 0.0874 88 v6 1.0165 48 0.1 52 v72 0.8925 89 0.0877 86 v24 1.0139 49 0.1014 47 v12 0.8825 90 0.0856 91 v54 1.0078 50 0.1021 44 v8 0.8806 91 0.0863 90 v53 1.0024 51 0.1004 50 v33 0.8731 92 0.0845 92 v1 1.0016 52 0.1001 51 v87 0.8683 93 0.0835 93 v61 1.0013 53 0.0994 54 v64 0.8615 94 0.0832 95 v16 0.9985 54 0.1 52 v71 0.851 95 0.0816 96 v46 0.9953 55 0.0988 56 v81 0.8507 96 0.0833 94 v25 0.9916 56 0.0991 55 v65 0.8339 97 0.0801 97 Vertex PageRank Rank Eigenvector Rank v39 0.8208 98 0.0793 98 v5 0.9872 58 0.0973 59 v29 0.8084 99 0.0775 99 v40 0.9787 59 0.0987 57 v57 0.7512 100 0.0708 100 v3 0.9736 60 0.0959 60 v4 0.9728 61 0.0954 65 v18 0.9696 62 0.0956 63 v95 0.9666 63 0.0958 62 v2 0.966 64 0.0951 67 v59 0.9649 65 0.0959 60 v14 0.9619 66 0.0943 71 v13 0.9613 67 0.0946 68 v67 0.9611 68 0.0936 73 v75 0.9599 69 0.0955 64 v68 0.9597 70 0.0954 65 30 23 15 8 0 v1 v3 v5 v7 v9 Page rank Ranking v11 v13 v15 v17 v19 v21 Eigenvector Ranking Fig. 7. A graph showing comparison of the ranks for Fig. 3. 30 23 15 8 0 v1 v3 v5 v7 v9 Page rank Ranking v11 v13 v15 v17 v19 v21 Eigenvector Ranking Fig. 8. A graph showing comparison of the ranks for Fig. 4. v15 0.959 71 0.0946 68 50 38 25 13 0 v1 v5 v9 v13 v17 v21 v25 v29 v33 v37 v41 v45 v49 Page rank Ranking Eigenvector Ranking Fig. 9. A graph showing comparison of the ranks for Fig. 5. 50 38 25 13 0 v1 v5 v9 v13 v17 v21 v25 v29 v33 v37 v41 v45 v49 Page rank Ranking Eigenvector Ranking 100 75 50 25 v1 v8 v15 v22 v29 v36 v43 v50 v57 v64 v71 v78 v85 v92 v99 Pagerank Ranking Eigenvector Ranking Fig. 11. A graph showing comparison of the ranks for Table 5. 100 75 50 25 0 0.13 0.18 0.0975 0.135 0.065 0.09 0.0325 0.045 0 Fig. 10. A graph showing comparison of the ranks for Fig. 6 0 VI. PEARSON’S COEFFICIENT CORRELATION The Pearson’s rank correlation coefficient is much like Spearman’s coefficient correlation however it differs in the respect that it indicates the linear relation between the two variables. 0 0.325 0.65 0.975 0 1.3 0 0.325 0.65 0.975 1.3 Fig. 13. Graphs showing comparison of PageRank and Eigenvector As shown in the figures Fig. 13, the graph between PageRank and Eigenvector centrality is not only monotonic, but is also linear. Hence, it gives a value of almost 1 for both the Pearson’s coefficient correlation test and Spearman’s coefficient correlation test. Some of the values obtained by Spearman’s coefficient correlation to back the hypothesis that PageRank Centrality can be easily replaced with eigenvector centrality are given in Table 7. Critical value for 21 node graph is 0.681 for a 0.05% probability that the value occurred by chance. Similarly for 50 nodes graph, the critical value is 0.465 for a 0.05% probability that the value occurred by chance and for a 100 node graph, the critical value is 0.326 for a 0.05% probability that the value occurred by chance. TABLE 7 SPEARMAN’S RANK COEFFICIENT CORRELATION VALUES v1 v8 v15 v22 v29 v36 v43 v50 v57 v64 v71 v78 v85 v92 v99 Pagerank Ranking Eigenvector Ranking Fig. 12. A graph showing comparison of the ranks for Table 6 From Table 1-Table 6, showing the comparison of the ranks assigned by application of the eigenvector formula is comparable to the ranks obtained by PageRank. In a search engine that will be used for finding the most accurate results, the highest priority is given to the first few pages as those are the ones that will attract the attention of the user who is surfing the internet. For all the graphs taken, the first five vertices analogous to the webpages ranking are the same, thus strengthening our argument for the use of Eigenvector in place of PageRank. The figures (Fig. 7- Fig. 12) show that the variation of the ranks assigned by eigenvector centrality as compared to the ones assigned using the PageRank algorithm. As observed in the figures, the variation is insignificant and it goes on decreasing with the increase in the number of nodes in the graph. Therefore, in an actual web graph the difference would be reduced to a miniscule amount while keeping the top nodes or the top pages the same. Number Of nodes Spearman's Rank coefficient correlation Number Of nodes Spearman's Rank coefficient correlation 21 0.979220779 50 0.9833613 21 0.928571429 50 0.9822449 21 0.940259740 50 0.9923649 21 0.941558442 50 0.9884514 21 0.988311688 50 0.9909484 21 0.987012987 50 0.9833613 21 0.954545455 50 0.9884994 21 0.979220779 50 0.9826170 21 0.985714286 50 0.9791357 21 0.945454545 100 0.9951995 Number Of nodes Spearman's Rank coefficient correlation Number Of nodes Spearman's Rank coefficient correlation 50 0.989531813 100 0.9674767 50 0.992364946 100 0.9945875 50 0.990948379 100 0.9945875 1 V. SPEARMAN’S RANK COEFFICIENT CORRELATION The Spearman’s rank correlation coefficient is a nonparametric measure that determines the statistical dependence of one variable on the other. It uses a monotonic function to assess the relation between the two variables. It measures how strongly the two variables which are ranked are associated. 0.9875 0.975 Average value of Spearman's Rank 0.9625 0.95 0 25 50 Number of nodes 75 100 Fig. 15. A scatterplot representing the average value of Spearman’s Rank Coefficient Correlation for different number of nodes. As shown in Fig. 15, the average value of Spearman’s Rank Coefficient correlation increases with the increase in the number of nodes in the graph. This shows that the strength of association between the values obtained by eigenvector and PageRank continue to converge with the increase in the number of nodes, analogous to webpages, which is practically n → ∞. 2. 3. 4. 5. 6. VII. CONCLUSION In a fast paced world where each nanosecond has proved to be of crucial importance, it is essential to adopt every possible means to save time. With this purpose in mind, we have proved the dominance of eigenvector centrality over the existing PageRank algorithm in the respect of time complexity showing positive results in our favour by giving supportive and conclusive evidence. VIII. 1. REFERENCES Luca Donetti, Franco Neri and Miguel A Muoz “Optimal network topologies:expanders, cages, Ramanujan graphs, entangled networks and all that” Journal of Statistical Mathematics (2006) 7. 8. 9. 10. 11. Sepandar D. Kamvar. Mario T. Schlosser, Hector Garcia-Molina, “The EigenTrust Algorithm for Reputation Management in P2P Networks,” Stanford University Z.Gyöngyi, H. Garcia-Molina, J.Pedersen “Combating Web Spam with Trust Rank” Stanford University, thirtieth international conference on very large databases – volume 30 Krishnan, Vijay; Raj, Rashmi. “Web Spam Detection with Anti-Trust Rank” Stanford University. S.Brin and L.Page, “The PageRank Citation Ranking: Bringing Order to the Web” Stanford InfoLab, (Jan.,1998) Gabor Csardi, Tamas Nepusz, “The igraph software package for complex network research,” InterJournal Complex Systems 2006 Spizzirri, Leo, 2011, Justification and application of eigenvector centrality, working paper. Gabriele Lohmann , Daniel S. Margulies, Annette Horstmann, Burkhard Pleger, Joeran Lepsien, Dirk Goldhahn, Haiko Schloegl, Michael Stumvoll, Arno Villringer, Robert Turner “Eigenvector Centrality Mapping for Analyzing Connectivity Patterns in fMRI Data of the Human Brain” B.Jaganathan,Kalyani Desikan, “Category-Based Pagerank Algorithm,” International Journal of Pure and Applied Mathematics., vol.101,No.5, pp. 811-820, August 2015. B.Jaganathan,Kalyani Desikan, “Penalty-Based Pagerank Algorithm,” ARPN Journal of Engineering and Applied Sciences, vol.10,No.5 B.Jaganathan,Kalyani Desikan, “Weighted Pagerank Algorithm based on In-Out weight of webpages,” Indian Journal of Science and Technology, vol.8,No.34, pp. 1-6, December 2015