Graph Data Structure
A Graph is a non-linear data structure consisting of vertices and edges. The vertices are sometimes also referred to as nodes, and the edges are lines or arcs that connect any two nodes in the graph. More formally, a Graph is composed of a set of vertices (V) and a set of edges (E), and is denoted by G(V, E).
Imagine a game of football as a web of connections, where players are the nodes and their
interactions on the field are the edges. This web of connections is exactly what a graph data
structure represents, and it’s the key to unlocking insights into team performance and player
dynamics in sports.
Vertices: Vertices are the fundamental units of the graph. They are sometimes also known as nodes. Every node/vertex can be labelled or unlabelled.
Edges: Edges are drawn or used to connect two nodes of the graph. In a directed graph, an edge is an ordered pair of nodes. Edges can connect any two nodes in any possible way; there are no rules. Edges are sometimes also known as arcs. Every edge can be labelled or unlabelled.
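As a concrete illustration, the formal definition G(V, E) can be written down directly as two sets (the vertex labels and edges below are illustrative, not taken from any figure):

```python
# A graph G(V, E) as two sets; edges are unordered pairs (undirected graph).
V = {0, 1, 2, 3}
E = {(0, 1), (0, 2), (1, 2), (2, 3)}

# The degree of a vertex is the number of edges touching it.
def degree(v):
    return sum(1 for (a, b) in E if v in (a, b))

print(degree(2))  # 3: vertex 2 touches edges (0,2), (1,2), (2,3)
```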
1. Null Graph
A graph with no edges; it contains only isolated vertices, if any.
2. Trivial Graph
A graph having only a single vertex; it is also the smallest graph possible.
3. Undirected Graph
A graph in which edges do not have any direction. That is, the nodes are unordered pairs in the definition of every edge.
4. Directed Graph
A graph in which every edge has a direction. That is, the nodes are ordered pairs in the definition of every edge.
5. Connected Graph
A graph in which, starting from any one node, we can visit every other node is known as a connected graph.
6. Disconnected Graph
A graph in which at least one node is not reachable from some other node is known as a disconnected graph.
7. Regular Graph
A graph in which the degree of every vertex is equal to K is called a K-regular graph.
8. Complete Graph
A graph in which there is an edge from each node to every other node.
9. Cycle Graph
A graph that is itself a single cycle; the degree of each vertex is 2.
12. Bipartite Graph
A graph in which the vertices can be divided into two sets such that there is no edge between vertices within the same set.
13. Weighted Graph
A graph in which each edge is assigned a suitable weight is known as a weighted graph.
Weighted graphs can be further classified as directed weighted graphs and undirected
weighted graphs.
There are multiple ways to store a graph: The following are the most common representations.
Adjacency Matrix
Adjacency List
In this method, the graph is stored in the form of the 2D matrix where rows and columns denote
vertices. Each entry in the matrix represents the weight of the edge between those vertices.
Below is the implementation of Graph Data Structure represented using Adjacency Matrix:
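The original listing is not reproduced here; a minimal Python sketch of an adjacency-matrix graph might look like the following (the class and method names are illustrative):

```python
# Minimal adjacency-matrix graph for n vertices (undirected).
class GraphMatrix:
    def __init__(self, n):
        self.n = n
        self.adj = [[0] * n for _ in range(n)]  # 0 means "no edge"

    def add_edge(self, u, v, weight=1):
        # Undirected: mirror the entry so both directions are stored.
        self.adj[u][v] = weight
        self.adj[v][u] = weight

    def has_edge(self, u, v):
        return self.adj[u][v] != 0

g = GraphMatrix(4)
g.add_edge(0, 1)
g.add_edge(1, 2, weight=5)
print(g.has_edge(2, 1))  # True: entries are symmetric in an undirected graph
```

Note that the matrix always occupies O(n^2) space, regardless of how many edges the graph actually has.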
In this representation, the graph is stored as a collection of linked lists. There is an array of pointers, one per vertex, each pointing to the list of edges connected to that vertex.
Below is the implementation of Graph Data Structure represented using Adjacency List:
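Again, the original listing is omitted; a minimal sketch of an adjacency-list graph (names illustrative) could be:

```python
from collections import defaultdict

# Minimal adjacency-list graph (undirected, unweighted).
class GraphList:
    def __init__(self):
        self.adj = defaultdict(list)

    def add_edge(self, u, v):
        # Undirected: record the edge in both vertices' lists.
        self.adj[u].append(v)
        self.adj[v].append(u)

    def neighbors(self, u):
        return self.adj[u]

g = GraphList()
g.add_edge(0, 1)
g.add_edge(0, 2)
print(sorted(g.neighbors(0)))  # [1, 2]
```

Unlike the matrix, this uses space proportional to V + E, which is why it is preferred for sparse graphs.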
When the graph is dense (contains a large number of edges), it is good to store it as a matrix, because only a few entries in the matrix will be empty. For algorithms such as Prim's and Dijkstra's on dense graphs, the adjacency matrix representation is used to obtain lower complexity.
Removing an edge: O(1) in an adjacency matrix, O(N) in an adjacency list.
Traversal of Graph Data Structure- Traversing all the nodes in the graph.
Tree is a restricted type of Graph Data Structure, just with some more rules. Every tree will always be
a graph but not all graphs will be trees. Linked List, Trees, and Heaps all are special cases of graphs.
Graph Data Structure has numerous real-life applications across various fields. Some of them are
listed below:
If we recall all the previous data structures that we have studied, like arrays, linked lists, trees, etc., all of them had some restriction on structure (mostly linear or tree-hierarchical, which means no loops). A graph allows arbitrary connections between nodes, which is useful in many real-world problems that do not fit the restrictions of the previous data structures.
Used heavily in social networks. Everyone on the network is a vertex (or node) of the graph, and if two people are connected, then there is an edge. Now imagine all the features that you see, mutual friends, people that follow you, etc.; all of these can be seen as graph problems.
Used to represent the topology of computer networks, such as the connections between
routers and switches.
Neural Networks: Vertices represent neurons and edges represent the synapses between
them. Neural networks are used to understand how our brain works and how connections
change when we learn. The human brain has about 10^11 neurons and close to 10^15
synapses.
Compilers: Graph Data Structure is used extensively in compilers. They can be used for type
inference, for so-called data flow analysis, register allocation, and many other purposes. They
are also used in specialized compilers, such as query optimization in database languages.
Robot planning: Vertices represent states the robot can be in and the edges the possible
transitions between the states. Such graph plans are used, for example, in planning paths for
autonomous vehicles.
Dependencies in a software project (or any other type of project) can be seen as graph and
generating a sequence to solve all tasks before dependents is a standard graph topological
sorting algorithm.
For optimizing the cost of connecting all locations of a network. For example, minimizing
wire length in a wired network to make sure all devices are connected is a standard Graph
problem called Minimum Spanning Tree.
Graph Data Structure can be used to represent a wide range of relationships, as we do not have restrictions like in previous data structures (a tree cannot have loops and has to be hierarchical; arrays, linked lists, etc. are linear).
They can be used to model and solve a wide range of problems, including pathfinding, data
clustering, network analysis, and machine learning.
Any real-world problem where we have a set of items and relations between them can be easily modeled as a graph, and a lot of standard graph algorithms like BFS, DFS, Spanning Tree, Shortest Path, Topological Sorting and Strongly Connected Components can be applied directly.
Graph Data Structure can be used to represent complex data structures in a simple and
intuitive way, making them easier to understand and analyze.
Graph Data Structure can be complex and difficult to understand, especially for people who
are not familiar with graph theory or related algorithms.
Creating and manipulating graphs can be computationally expensive, especially for very large
or complex graphs.
Graph algorithms can be difficult to design and implement correctly, and can be prone to
bugs and errors.
Graph Data Structure can be difficult to visualize and analyze, especially for very large or
complex graphs, which can make it challenging to extract meaningful insights from the data.
Representations of Graph
Here are the two most common ways to represent a graph. For simplicity, we are going to consider only unweighted graphs in this post.
1. Adjacency Matrix
2. Adjacency List
An adjacency matrix is a way of representing a graph as a matrix of booleans (0s and 1s).
Let's assume there are n vertices in the graph. So, create a 2D matrix adjMat[n][n] having dimension n x n.
The below figure shows an undirected graph. Initially, the entire matrix is initialized to 0. If there is an edge from source to destination, we insert 1 in both entries (adjMat[source][destination] and adjMat[destination][source]) because we can go either way.
The below figure shows a directed graph. Initially, the entire matrix is initialized to 0. If there is an edge from source to destination, we insert 1 only for that particular adjMat[source][destination].
Directed Graph to Adjacency Matrix
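The difference between the two cases above can be sketched directly (the single edge 0-1 here is illustrative):

```python
n = 3
undirected = [[0] * n for _ in range(n)]
directed = [[0] * n for _ in range(n)]

src, dst = 0, 1

# Undirected edge 0-1: set both adjMat[src][dst] and adjMat[dst][src].
undirected[src][dst] = 1
undirected[dst][src] = 1

# Directed edge 0 -> 1: set only adjMat[src][dst].
directed[src][dst] = 1

print(undirected)  # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
print(directed)    # [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
```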
An array of Lists is used to store edges between two vertices. The size of array is equal to the number
of vertices (i.e, n). Each index in this array represents a specific vertex in the graph. The entry at the
index i of the array contains a linked list containing the vertices that are adjacent to vertex i.
Let's assume there are n vertices in the graph. So, create an array of lists of size n as adjList[n].
adjList[0] will have all the nodes connected (neighbours) to vertex 0.
adjList[1] will have all the nodes connected to vertex 1, and so on.
The below undirected graph has 3 vertices. So, an array of lists of size 3 will be created, where each index represents a vertex. Now, vertex 0 has two neighbours (i.e., 1 and 2), so insert vertices 1 and 2 at index 0 of the array. Similarly, vertex 1 has two neighbours (i.e., 2 and 0), so insert vertices 2 and 0 at index 1 of the array. Similarly, for vertex 2, insert its neighbours into the array of lists.
Undirected Graph to Adjacency list
The below directed graph has 3 vertices. So, an array of lists of size 3 will be created, where each index represents a vertex. Now, vertex 0 has no neighbours. Vertex 1 has two neighbours (i.e., 0 and 2), so insert vertices 0 and 2 at index 1 of the array. Similarly, for vertex 2, insert its neighbours into the array of lists.
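The undirected construction above (a triangle on vertices 0, 1, 2 with edges 0-1, 0-2, 1-2) can be sketched as:

```python
n = 3
adj_list = [[] for _ in range(n)]

# Undirected edges of the 3-vertex example: 0-1, 0-2, 1-2.
for u, v in [(0, 1), (0, 2), (1, 2)]:
    adj_list[u].append(v)
    adj_list[v].append(u)

print(adj_list)  # [[1, 2], [0, 2], [0, 1]]
```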
Based on their structure and properties, graphs can be classified into several types:
1. Undirected Graphs: A graph in which edges have no direction, i.e., the edges do not have
arrows indicating the direction of traversal. Example: A social network graph where
friendships are not directional.
2. Directed Graphs: A graph in which edges have a direction, i.e., the edges have arrows
indicating the direction of traversal. Example: A web page graph where links between pages
are directional.
3. Weighted Graphs: A graph in which edges have weights or costs associated with them.
Example: A road network graph where the weights can represent the distance between two
cities.
4. Unweighted Graphs: A graph in which edges have no weights or costs associated with them.
Example: A social network graph where the edges represent friendships.
5. Complete Graphs: A graph in which each vertex is connected to every other vertex. Example:
A tournament graph where every player plays against every other player.
6. Bipartite Graphs: A graph in which the vertices can be divided into two disjoint sets such
that every edge connects a vertex in one set to a vertex in the other set. Example: A job
applicant graph where the vertices can be divided into job applicants and job openings.
7. Trees: A connected graph with no cycles. Example: A family tree where each person is
connected to their parents.
8. Cycles: A graph with at least one cycle. Example: A bike-sharing graph where the cycles
represent the routes that the bikes take.
9. Sparse Graphs: A graph with relatively few edges compared to the number of vertices.
Example: A chemical reaction graph where each vertex represents a chemical compound and
each edge represents a reaction between two compounds.
10. Dense Graphs: A graph with many edges compared to the number of vertices. Example: A
social network graph where each vertex represents a person and each edge represents a
friendship.
Types of Graphs:
1. Finite Graphs
A graph is said to be finite if it has a finite number of vertices and a finite number of edges. In other words, both the number of vertices and the number of edges are limited and can be counted. Finite graphs are often used to model real-world situations, where there is a limited number of objects and relationships between them.
2. Infinite Graph:
A graph is said to be infinite if it has an infinite number of vertices as well as an infinite number of
edges.
3. Trivial Graph:
A graph is said to be trivial if it contains only one vertex and no edges. It is also known as a singleton graph or a single-vertex graph. A trivial graph is the simplest type of graph and is often used as a starting point for building more complex graphs. In graph theory, trivial graphs are considered a degenerate case and are not typically studied in detail.
4. Simple Graph:
A simple graph is a graph that does not contain more than one edge between any pair of vertices. A simple railway track connecting different cities is an example of a simple graph.
5. Multi Graph:
Any graph which contains some parallel edges but doesn’t contain any self-loop is called a
multigraph. For example a Road Map.
Parallel Edges: If two vertices are connected by more than one edge, then such edges are called parallel edges; they represent many routes between the same pair of endpoints.
Loop: An edge of a graph that starts from a vertex and ends at the same vertex is called a
loop or a self-loop.
6. Null Graph:
A graph of order n and size zero is a graph with only isolated vertices and no edges connecting any pair of them. A null graph can also be referred to as an edgeless graph, an isolated graph, or a discrete graph.
7. Complete Graph:
A simple graph with n vertices is called a complete graph if the degree of each vertex is n-1, that is, each vertex is attached by n-1 edges to the rest of the vertices in the graph. A complete graph is also called a full graph.
8. Pseudo Graph:
A graph G with a self-loop and some multiple edges is called a pseudo graph. A pseudograph is a type
of graph that allows for the existence of self-loops (edges that connect a vertex to itself) and multiple
edges (more than one edge connecting two vertices). In contrast, a simple graph is a graph that does
not allow for loops or multiple edges.
9. Regular Graph:
A simple graph is said to be regular if all vertices of graph G are of equal degree. All complete graphs are regular, but the converse is not necessarily true. In other words, a regular graph is an undirected graph in which every vertex has the same number of edges, i.e., the same degree.
10. Labeled Graph:
If the vertices and edges of a graph are labeled with a name, date, or weight, then it is called a labeled graph. It is also called a weighted graph.
11. Digraph:
A graph G = (V, E) with a mapping f such that every edge maps onto some ordered pair of vertices (Vi, Vj) is called a digraph. It is also called a directed graph. The ordered pair (Vi, Vj) means an edge between Vi and Vj with an arrow directed from Vi to Vj. Here in the figure: e1 = (V1, V2), e2 = (V2, V3), e4 = (V2, V4).
13. Subgraph:
A graph G1 = (V1, E1) is called a subgraph of a graph G(V, E) if V1(G) is a subset of V(G) and E1(G) is a
subset of E(G) such that each edge of G1 has same end vertices as in G.
14. Connected or Disconnected Graph:
Graph G is said to be connected if any pair of vertices (Vi, Vj) of G is reachable from one another; that is, there exists at least one path between each and every pair of vertices. Otherwise, it is disconnected. A null graph with n vertices is a disconnected graph consisting of n components; each component consists of one vertex and no edges.
15. Cyclic Graph:
A graph G consisting of n vertices (n >= 3), that is V1, V2, V3, ..., Vn, with edges (V1, V2), (V2, V3), (V3, V4), ..., (Vn, V1), is called a cyclic graph.
Vertex disjoint subgraph: Any two subgraphs G1 = (V1, E1) and G2 = (V2, E2) of a graph G = (V, E) are said to be vertex disjoint if V1(G1) intersection V2(G2) = null. In the figure, there is no common vertex between G1 and G2.
Edge disjoint subgraph: Two subgraphs are said to be edge disjoint if E1(G1) intersection E2(G2) = null. In the figure, there is no common edge between G1 and G2.
Note: Edge disjoint subgraph may have vertices in common but a vertex disjoint graph cannot have a
common edge, so the vertex disjoint subgraph will always be an edge-disjoint subgraph.
Consider the graph G(V,E) as shown below. A spanning subgraph is a subgraph that contains all the
vertices of the original graph G that is G'(V’,E’) is spanning if V’=V and E’ is a subset of E.
So one of the spanning subgraphs can be as shown below: G'(V', E'). It has all the vertices of the original graph G and some of the edges of G.
This is just one of the many spanning subgraphs of graph G. We can create various other spanning
subgraphs by different combinations of edges. Note that if we consider a graph G'(V’,E’) where V’=V
and E’=E, then graph G’ is a spanning subgraph of graph G(V,E).
Ever wondered how Google Maps finds the shortest and fastest route between two places? The answer is Dijkstra's Algorithm. Dijkstra's Algorithm is a graph algorithm that finds the shortest path from a source vertex to all other vertices in the graph (single-source shortest path). It is a type of greedy algorithm that only works on weighted graphs with positive weights. The time complexity of Dijkstra's Algorithm is O(V²) with the adjacency matrix representation of the graph. This can be reduced to O((V + E) log V) with an adjacency list representation, where V is the number of vertices and E is the number of edges in the graph.
Dijkstra's Algorithm was designed and published by Dr. Edsger W. Dijkstra, a Dutch Computer
Scientist, Software Engineer, Programmer, Science Essayist, and Systems Scientist.
During an Interview with Philip L. Frana for the Communications of the ACM journal in the year
2001, Dr. Edsger W. Dijkstra revealed:
"What is the shortest way to travel from Rotterdam to Groningen, in general: from given city to given
city? It is the algorithm for the shortest path, which I designed in about twenty minutes. One
morning I was shopping in Amsterdam with my young fiancée, and tired, we sat down on the café
terrace to drink a cup of coffee and I was just thinking about whether I could do this, and I then
designed the algorithm for the shortest path. As I said, it was a twenty-minute invention. In fact, it
was published in '59, three years later. The publication is still readable, it is, in fact, quite nice. One of
the reasons that it is so nice was that I designed it without pencil and paper. I learned later that one
of the advantages of designing without pencil and paper is that you are almost forced to avoid all
avoidable complexities. Eventually, that algorithm became to my great amazement, one of the
cornerstones of my fame."
Dijkstra thought about the shortest path problem while working as a programmer at the
Mathematical Centre in Amsterdam in 1956 to illustrate the capabilities of a new computer known as
ARMAC. His goal was to select both a problem and a solution (produced by the computer) that
people with no computer background could comprehend. He developed the shortest path algorithm
and later executed it for ARMAC for a vaguely shortened transportation map of 64 cities in the
Netherlands (64 cities, so 6 bits would be sufficient to encode the city number). A year later, he came
across another issue from hardware engineers operating the next computer of the institute:
Minimize the amount of wire required to connect the pins on the machine's back panel. As a
solution, he re-discovered the algorithm called Prim's minimal spanning tree algorithm and
published it in the year 1959.
1. Dijkstra's Algorithm starts at the node we select (the source node) and analyzes the graph to find the shortest path between that node and all the other nodes in the graph.
2. The Algorithm keeps records of the presently acknowledged shortest distance from each
node to the source node, and it updates these values if it finds any shorter path.
3. Once the Algorithm has retrieved the shortest path between the source and another node,
that node is marked as 'visited' and included in the path.
4. The procedure continues until all the nodes in the graph have been included in the path. In
this manner, we have a path connecting the source node to all other nodes, following the
shortest possible path to reach each node.
Dijkstra's Algorithm requires a graph and a source vertex. It is based on the greedy approach and thus makes the locally optimal choice (the local minimum in this case) at each step of the algorithm.
Each Vertex in this Algorithm will have two properties defined for it:
1. Visited Property
2. Path Property
Visited Property:
1. The 'visited' property signifies whether or not the node has been visited.
2. A node is marked visited only when the shortest path to it has been found.
Path Property:
1. The 'path' property stores the value of the current minimum path to the node.
2. The current minimum path implies the shortest way we have reached this node till now.
3. This property is significant because it will store the final answer for each node.
Initially, we mark all the vertices, or nodes, unvisited as they have yet to be visited. The path to all
the nodes is also set to infinity apart from the source node. Moreover, the path to the source node is
set to zero (0).
We then select the source node and mark it as visited. After that, we access all the neighboring
nodes of the source node and perform relaxation on every node. Relaxation is the process of
lowering the cost of reaching a node with the help of another node.
In the process of relaxation, the path of each node is revised to the minimum value amongst the
node's current path, the sum of the path to the previous node, and the path from the previous node
to the current node.
Let us suppose that p[n] is the value of the current path for node n, p[m] is the value of the path up to the previously visited node m, and w is the weight of the edge between the current node and the previously visited one (the edge weight between n and m). Relaxation then updates the path as p[n] = min(p[n], p[m] + w).
We then mark an unvisited node with the least path as visited in every subsequent step and update
its neighbor's paths.
We repeat this procedure until all the nodes in the graph are marked visited.
Whenever we add a node to the visited set, the path to all its neighboring nodes also changes
accordingly.
If any node is left unreachable (disconnected component), its path remains 'infinity'. In case the
source itself is a separate component, then the path to all other nodes remains 'infinity'.
The following are the steps that we will follow to implement Dijkstra's Algorithm:
Step 1: First, we will mark the source node with a current distance of 0 and set the rest of the nodes
to INFINITY.
Step 2: We will then set the unvisited node with the smallest current distance as the current node,
suppose X.
Step 3: For each neighbor N of the current node X: We will then add the current distance of X with
the weight of the edge joining X-N. If it is smaller than the current distance of N, set it as the new
current distance of N.
Step 4: After considering all the unvisited neighbors of X, we will mark X as visited.
Step 5: We will repeat the process from 'Step 2' if there is any unvisited node left in the graph.
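The steps above can be sketched in Python using a min-heap as the priority queue. The adjacency list below uses the edge weights from the worked example that follows; the B-C and D-F weights, which the example leaves undetermined, are assumed here:

```python
import heapq

# Dijkstra's algorithm; graph maps each vertex to a list of (neighbor, weight).
def dijkstra(graph, source):
    dist = {v: float('inf') for v in graph}   # Step 1: all INFINITY...
    dist[source] = 0                          # ...except the source
    visited = set()
    pq = [(0, source)]                        # (current distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)              # Step 2: smallest current distance
        if u in visited:
            continue
        visited.add(u)                        # Step 4: mark the node visited
        for v, w in graph[u]:                 # Step 3: relax each neighbor
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

graph = {
    'A': [('B', 4), ('C', 5)],
    'B': [('A', 4), ('C', 11), ('D', 9), ('E', 7)],   # B-C weight assumed
    'C': [('A', 5), ('B', 11), ('E', 3)],
    'D': [('B', 9), ('E', 13), ('F', 2)],             # D-F weight assumed
    'E': [('B', 7), ('C', 3), ('D', 13), ('F', 6)],
    'F': [('D', 2), ('E', 6)],
}
print(dijkstra(graph, 'A'))
# {'A': 0, 'B': 4, 'C': 5, 'D': 13, 'E': 8, 'F': 14}
```

An unreachable vertex would simply keep its 'infinity' distance, matching the note about disconnected components above.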
Let us now understand the implementation of the algorithm with the help of an example:
Figure 6: The Given Graph
1. We will use the above graph as the input, with node A as the source.
2. We will mark all the nodes unvisited.
3. We will set the path to 0 at node A and INFINITY for all the other nodes.
4. We will now mark source node A as visited and access its neighboring nodes.
Note: We have only accessed the neighboring nodes, not visited them.
5. We will now update the path to node B by 4 with the help of relaxation because the path to
node A is 0 and the path from node A to B is 4, and the minimum((0 + 4), INFINITY) is 4.
6. We will also update the path to node C by 5 with the help of relaxation because the path to
node A is 0 and the path from node A to C is 5, and the minimum((0 + 5), INFINITY) is 5.
Both the neighbors of node A are now relaxed; therefore, we can move ahead.
7. We will now select the next unvisited node with the least path and visit it. Hence, we will
visit node B and perform relaxation on its unvisited neighbors. After performing relaxation,
the path to node C will remain 5, whereas the path to node E will become 11, and the path
to node D will become 13.
8. We will now visit node E and perform relaxation on its neighboring nodes B, D, and F. Since
only node F is unvisited, it will be relaxed. Thus, the path to node B will remain as it is, i.e., 4,
the path to node D will also remain 13, and the path to node F will become 14 (8 + 6).
9. Now we will visit node D, and only node F will be relaxed. However, the path to node F will
remain unchanged, i.e., 14.
10. Since only node F is remaining, we will visit it but not perform any relaxation as all its
neighboring nodes are already visited.
11. Once all the nodes of the graphs are visited, the program will end.
1. A = 0
2. B = 4 (A -> B)
3. C = 5 (A -> C)
4. D = 4 + 9 = 13 (A -> B -> D)
5. E = 5 + 3 = 8 (A -> C -> E)
6. F = 8 + 6 = 14 (A -> C -> E -> F)
1. We have to maintain a record of the path distance of every node. Therefore, we can store the path distance of each node in an array of size n, where n is the total number of nodes.
2. Moreover, we want to retrieve the shortest path along with the length of that path. To do this, we will map each node to the node that last updated its path length.
3. Once the algorithm is complete, we can backtrack from the destination node to the source node to retrieve the path.
4. We can use a minimum Priority Queue to retrieve the node with the least path distance in an efficient way.
Pseudocode:

function Dijkstra(Graph, source):
    // iterating through the nodes in Graph and setting their distances to INFINITY
    for each node N in Graph:
        distance[N] = INFINITY
        previous[N] = NULL
        if N != source:
            add N to Priority Queue G
    distance[source] = 0

    while G is not empty:
        // selecting a node Q having the least distance and marking it as visited
        Q = node in G with the least distance[]
        mark Q visited

        // iterating through the unvisited neighboring nodes of the node Q and performing relaxation accordingly
        for each unvisited neighbor N of Q:
            temporary_distance = distance[Q] + distance_between(Q, N)

            // if the temporary distance is less than the given distance of the path to the Node, updating the resultant distance with the minimum value
            if temporary_distance < distance[N]:
                distance[N] := temporary_distance
                previous[N] := Q

    return distance[], previous[]
Explanation:
In the above pseudocode, we have defined a function that accepts multiple parameters - the Graph
consisting of the nodes and the source node. Inside this function, we have iterated through each
node in the Graph, set their initial distance to INFINITY, and set the previous node value to NULL. We
have also checked whether any selected node is not a source node and added the same into the
Priority Queue. Moreover, we have set the distance of the source node to 0. We then iterated
through the nodes in the priority queue, selected the node with the least distance, and marked it as
visited. We then iterated through the unvisited neighboring nodes of the selected node and
performed relaxation accordingly. At last, we have compared both the distances (original and
temporary distance) between the source node and the destination node, updated the resultant
distance with the minimum value and previous node information, and returned the final list of
distances with their previous node information.
A minimum spanning tree (MST) is defined as a spanning tree that has the minimum weight among
all the possible spanning trees
A spanning tree is defined as a tree-like subgraph of a connected, undirected graph that includes all
the vertices of the graph. Or, to say in Layman’s words, it is a subset of the edges of the graph that
forms a tree (acyclic) where every node of the graph is a part of the tree.
The minimum spanning tree has all the properties of a spanning tree with an added constraint of
having the minimum possible weights among all possible spanning trees. Like a spanning tree, there
can also be many possible MSTs for a graph.
The number of vertices (V) in the graph and the spanning tree is the same.
There is a fixed number of edges in the spanning tree which is equal to one less than the
total number of vertices ( E = V-1 ).
The spanning tree should not be disconnected; that is, there should be only a single connected component, not more than one.
The spanning tree should be acyclic, which means there would not be any cycle in the tree.
The total cost (or weight) of the spanning tree is defined as the sum of the edge weights of
all the edges of the spanning tree.
There are several algorithms to find the minimum spanning tree from a given graph, some of them
are listed below:
At each iteration, the algorithm adds the next lowest-weight edge, one by one, such that the edges picked so far do not form a cycle.
This algorithm can be implemented efficiently using a DSU ( Disjoint-Set ) data structure to keep track
of the connected components of the graph. This is used in a variety of practical applications such as
network design, clustering, and data analysis.
Follow the article on “Kruskal’s Minimum Spanning Tree Algorithm” for a better understanding and
implementation of the algorithm.
A minimum spanning tree (MST) or minimum weight spanning tree for a weighted, connected,
undirected graph is a spanning tree with a weight less than or equal to the weight of every other
spanning tree. To learn more about Minimum Spanning Tree, refer to this article.
Here we will discuss Kruskal’s algorithm to find the MST of a given weighted graph.
Kruskal's algorithm first sorts all edges of the given graph in increasing order. Then it keeps on adding new edges and nodes to the MST if the newly added edge does not form a cycle. It picks the minimum weighted edge first and the maximum weighted edge last. Thus we can say that it makes a locally optimal choice in each step in order to find the optimal solution. Hence this is a Greedy Algorithm.
Below are the steps for finding MST using Kruskal's algorithm:
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far. If the cycle is not formed, include this edge. Else, discard it.
3. Repeat step 2 until there are (V-1) edges in the spanning tree.
Step 2 uses the Union-Find algorithm to detect cycles.
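A minimal sketch of Kruskal's algorithm with a union-find (DSU) for cycle detection, run on the 9-vertex, 14-edge graph from the sorted edge table below (function names are illustrative):

```python
# Kruskal's MST: sort edges, then greedily add any edge that joins
# two different components (i.e., forms no cycle).
def kruskal(n, edges):
    # edges: list of (weight, u, v); vertices numbered 0..n-1
    parent = list(range(n))

    def find(x):                      # DSU find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):     # greedy: smallest weight first
        ru, rv = find(u), find(v)
        if ru != rv:                  # different components: no cycle
            parent[ru] = rv           # union the two components
            mst.append((u, v, w))
            total += w
    return mst, total

edges = [(1, 7, 6), (2, 8, 2), (2, 6, 5), (4, 0, 1), (4, 2, 5),
         (6, 8, 6), (7, 2, 3), (7, 7, 8), (8, 0, 7), (8, 1, 2),
         (9, 3, 4), (10, 5, 4), (11, 1, 7), (14, 3, 5)]
mst, total = kruskal(9, edges)
print(len(mst), total)  # 8 edges, total weight 37
```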
Kruskal’s algorithm to find the minimum cost spanning tree uses the greedy approach. The Greedy
Choice is to pick the smallest weight edge that does not cause a cycle in the MST constructed so far.
Let us understand it with an example:
Illustration:
Input Graph:
The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will have (9 - 1) = 8 edges.
After sorting:

Weight  Source  Destination
1       7       6
2       8       2
2       6       5
4       0       1
4       2       5
6       8       6
7       2       3
7       7       8
8       0       7
8       1       2
9       3       4
10      5       4
11      1       7
14      3       5
Now pick all edges one by one from the sorted list of edges:
Step 1: Pick edge 7-6. No cycle is formed, include it.
Step 2: Pick edge 8-2. No cycle is formed, include it.
Step 3: Pick edge 6-5. No cycle is formed, include it.
Step 4: Pick edge 0-1. No cycle is formed, include it.
Step 5: Pick edge 2-5. No cycle is formed, include it.
Step 6: Pick edge 8-6. Since including this edge results in the cycle, discard it. Pick edge 2-3: No cycle
is formed, include it.
Add edge 2-3 in the MST
Step 7: Pick edge 7-8. Since including this edge results in the cycle, discard it. Pick edge 0-7. No cycle
is formed, include it.
Add edge 0-7 in MST
Step 8: Pick edge 1-2. Since including this edge results in the cycle, discard it. Pick edge 3-4. No cycle
is formed, include it.
This is also a greedy algorithm, with the following workflow:
It starts with a single arbitrary vertex as the initial MST.
Then, it repeatedly checks for the minimum edge weight that connects one vertex of the MST to another vertex that is not yet in the MST.
This process is continued until all the vertices are included in the MST.
To efficiently select the minimum weight edge at each iteration, this algorithm uses a priority_queue to store the vertices sorted by their current minimum edge weight. It also simultaneously keeps track of the MST using an array or another suitable data structure.
This algorithm can be used in various scenarios, such as image segmentation based on color, texture, or other features, or in routing, as in finding the shortest path for a delivery truck to follow.
Follow the article on “Prim’s Minimum Spanning Tree Algorithm” for a better understanding and
implementation of this algorithm.
Prim’s Algorithm for Minimum Spanning Tree (MST)
We have discussed Kruskal’s algorithm for Minimum Spanning Tree. Like Kruskal’s algorithm, Prim’s
algorithm is also a Greedy algorithm. This algorithm always starts with a single node and moves
through several adjacent nodes, in order to explore all of the connected edges along the way.
The algorithm starts with an empty spanning tree. The idea is to maintain two sets of vertices. The
first set contains the vertices already included in the MST, and the other set contains the vertices not
yet included. At every step, it considers all the edges that connect the two sets and picks the
minimum weight edge from these edges. After picking the edge, it moves the other endpoint of the
edge to the set containing MST.
A group of edges that connects two sets of vertices in a graph is called a cut in graph theory. So, at every step of Prim's algorithm, find a cut, pick the minimum-weight edge from the cut, and include this vertex in the MST set (the set that contains the already included vertices).
The working of Prim’s algorithm can be described by using the following steps:
Note: For determining a cycle, we can divide the vertices into two sets: one set contains the vertices already included in the MST, and the other contains the fringe vertices.
Consider the following graph as an example for which we need to find the Minimum Spanning Tree
(MST).
Example of a graph
Step 1: Firstly, we select an arbitrary vertex that acts as the starting vertex of the Minimum Spanning
Tree. Here we have selected vertex 0 as the starting vertex.
0 is selected as starting vertex
Step 2: All the edges connecting the incomplete MST and other vertices are the edges {0, 1} and {0,
7}. Between these two the edge with minimum weight is {0, 1}. So include the edge and vertex 1 in
the MST.
Step 3: The edges connecting the incomplete MST to other vertices are {0, 7}, {1, 7} and {1, 2}.
Among these edges the minimum weight is 8 which is of the edges {0, 7} and {1, 2}. Let us here
include the edge {0, 7} and the vertex 7 in the MST. [We could have also included edge {1, 2} and
vertex 2 in the MST].
Step 4: The edges that connect the incomplete MST with the fringe vertices are {1, 2}, {7, 6} and {7,
8}. Add the edge {7, 6} and the vertex 6 in the MST as it has the least weight (i.e., 1).
Step 5: The connecting edges now are {7, 8}, {1, 2}, {6, 8} and {6, 5}. Include the edge {6, 5} and the vertex 5 in the MST as it has the least weight (i.e., 2).
Step 6: Among the current connecting edges, the edge {5, 2} has the minimum weight. So include
that edge and the vertex 2 in the MST.
Step 7: The connecting edges between the incomplete MST and the other vertices are {2, 8}, {2, 3}, {5, 3} and {5, 4}. The edge with the minimum weight is {2, 8} (i.e., 2). So include that edge and the vertex 8 in the MST.
Step 8: The edges {7, 8} and {2, 3} both have the same minimum weight (i.e., 7). But both endpoints of edge {7, 8} are already part of the MST, so adding it would form a cycle. So we consider the edge {2, 3} and include that edge and vertex 3 in the MST.
Include vertex 3 in MST
Step 9: Only the vertex 4 remains to be included. The minimum weighted edge from the incomplete
MST to 4 is {3, 4}.
The final structure of the MST is as follows and the weight of the edges of the MST is (4 + 8 + 1 + 2 + 4
+ 2 + 7 + 9) = 37.
Structure of the alternate MST if we had selected edge {1, 2} in the MST
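The step-by-step procedure above can be sketched with a binary heap as the priority queue; the adjacency-list format and function name are assumptions, and the edge list reproduces the walkthrough's example graph:

```python
import heapq

def prim_mst(adj, start=0):
    """Prim's algorithm on an adjacency list {u: [(v, weight), ...]}."""
    in_mst = {start}
    # Heap of candidate edges (weight, u, v) crossing the current cut.
    heap = [(w, start, v) for v, w in adj[start]]
    heapq.heapify(heap)
    total, mst_edges = 0, []
    while heap and len(in_mst) < len(adj):
        w, u, v = heapq.heappop(heap)   # cheapest edge out of the cut
        if v in in_mst:
            continue                    # both endpoints already included
        in_mst.add(v)
        mst_edges.append((u, v, w))
        total += w
        for nxt, nw in adj[v]:          # new crossing edges from v
            if nxt not in in_mst:
                heapq.heappush(heap, (nw, v, nxt))
    return mst_edges, total

# The same undirected 9-vertex example graph as in the walkthrough.
edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7),
         (2, 5, 4), (2, 8, 2), (3, 4, 9), (3, 5, 14), (4, 5, 10),
         (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]
adj = {v: [] for v in range(9)}
for u, v, w in edges:
    adj[u].append((v, w))
    adj[v].append((u, w))
print(prim_mst(adj)[1])  # 37
```

Stale heap entries (edges whose far endpoint has since joined the MST) are simply skipped when popped, which keeps the sketch short at the cost of a slightly larger heap.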
Introduction
It aims to figure out the shortest path from each vertex v to every other vertex u. Storing all the paths explicitly can be very memory-expensive, as we need one shortest-path tree for each vertex. This is often impractical in terms of memory consumption, so these are generally treated as all-pairs shortest distance problems, which aim to find just the distance from each node to every other node.
We usually want the output in tabular form: the entry in u's row and v's column should be the weight of the shortest path from u to v.
Algorithm Cost
Floyd-Warshall O(V^3)
Unlike the single-source algorithms, which assume an adjacency-list representation of the graph, most all-pairs algorithms use an adjacency-matrix representation. (Johnson's algorithm for sparse graphs uses adjacency lists.) The input is an n x n matrix W representing the edge weights of an n-vertex directed graph G = (V, E). That is, W = (wij), where wij = 0 if i = j; wij is the weight of edge (i, j) if i ≠ j and (i, j) ∈ E; and wij = ∞ if i ≠ j and (i, j) ∉ E.
Floyd-Warshall Algorithm
The Floyd-Warshall algorithm is a dynamic programming algorithm used to find the shortest paths in a weighted graph, including graphs with negative edge weights (though not negative weight cycles, which it can only detect). The algorithm works by computing the shortest path between every pair of vertices in the graph, using a matrix of intermediate vertices to keep track of the best-known path so far.
But before we get started, let us briefly understand what Dynamic Programming is.
The key idea behind dynamic programming is to keep the solutions to the subproblems in memory so they can be reused later while solving larger problems. This reduces the time complexity of the algorithm and lets it solve much larger and more complex problems than a brute-force approach could.
1. Memoization
2. Tabulation
Memoization involves storing the outcome of every subproblem in a cache so that it can be reused later. Tabulation involves building a table of answers to subproblems in a bottom-up manner, starting with the smallest subproblems and working up to the larger ones. Dynamic programming is used in a wide range of applications, including optimization problems, computational geometry, machine learning, and natural language processing.
Some well-known examples of problems that can be solved using dynamic programming include the Fibonacci sequence, the Knapsack problem, and the shortest path problem.
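As a minimal sketch of the two approaches on the Fibonacci sequence just mentioned (the function names are illustrative):

```python
from functools import lru_cache

# Memoization: top-down recursion, with each subproblem's result
# cached so that every fib(n) is computed only once.
@lru_cache(maxsize=None)
def fib_memo(n):
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

# Tabulation: a bottom-up table, smallest subproblems first.
def fib_tab(n):
    if n < 2:
        return n
    table = [0, 1]
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])
    return table[n]

print(fib_memo(30), fib_tab(30))  # 832040 832040
```

Both run in linear time, whereas the naive recursion is exponential; the difference is only in whether the table is filled on demand or up front.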
The Floyd-Warshall algorithm was developed independently by Robert Floyd and Stephen Warshall, who both published their formulations in 1962. The algorithm found early use in operations research, where it was applied to the all-pairs shortest path problem in directed graphs with positive or negative edge weights. The problem was of great interest in operations research, as it has many applications in transportation, communication, and logistics.
Floyd presented the algorithm in a technical report titled "Algorithm 97: Shortest Path" in 1962. Warshall independently discovered the closely related transitive-closure version and published it as "A Theorem on Boolean Matrices". The algorithm has since become a cornerstone of computer science and is widely used in many areas of research and industry. Its ability to efficiently find the shortest paths between all pairs of vertices in a graph, including those with negative edge weights, makes it a valuable tool for solving a wide range of optimization problems.
1. Initialize a distance matrix D where D[i][j] represents the shortest distance between vertex i and vertex j.
2. Set the diagonal entries of the matrix to 0, and all other entries to infinity.
3. For every edge (u, v) in the graph, update the distance matrix to reflect the weight of the edge: D[u][v] = weight(u, v).
4. For every vertex k in the graph, consider all pairs of vertices (i, j) and check if the path from i to j through k is shorter than the current best path. If it is, update the distance matrix: D[i][j] = min(D[i][j], D[i][k] + D[k][j]).
5. After all iterations, the matrix D will contain the shortest path distances between all pairs of vertices.
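The five steps above map directly onto a few nested loops; the adjacency-matrix input format and names below are assumptions:

```python
INF = float('inf')

def floyd_warshall(w):
    """w is an n x n matrix: w[i][j] is the edge weight, INF if absent."""
    n = len(w)
    # Steps 1-3: start from the edge-weight matrix, 0 on the diagonal.
    dist = [row[:] for row in w]
    for i in range(n):
        dist[i][i] = 0
    # Step 4: allow each vertex k in turn as an intermediate vertex.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    # Step 5: dist now holds all-pairs shortest distances.
    return dist

# Small 4-vertex example: 0->1 (5), 1->2 (3), 2->3 (1), 0->3 (10).
w = [[0, 5, INF, 10],
     [INF, 0, 3, INF],
     [INF, INF, 0, 1],
     [INF, INF, INF, 0]]
print(floyd_warshall(w)[0][3])  # 9  (via 0 -> 1 -> 2 -> 3)
```

The triple loop gives the O(V^3) cost quoted in the table above; the order of the k loop outermost is essential, since each pass commits to paths whose intermediate vertices all lie in {0, ..., k}.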
Example:
Floyd-Warshall is an algorithm used to find the shortest path between all pairs of vertices in a weighted graph. It works by maintaining a matrix of distances between each pair of vertices and updating this matrix iteratively until the shortest paths are found.
In this graph, the vertices are represented by letters (A, B, C, D), and the numbers on the edges
represent the weights of those edges.
To apply the Floyd-Warshall algorithm to this graph, we start by initializing a matrix of distances between every pair of vertices. If two vertices are directly connected by an edge, their distance is the weight of that edge. If there is no direct edge between two vertices, their distance is infinite.
In the first iteration of the algorithm, we consider the possibility of using vertex 1 (A) as an intermediate vertex in paths between all pairs of vertices. For each pair of vertices (i, j), if the distance from i to vertex 1 plus the distance from vertex 1 to j is less than the current distance from i to j, we update the matrix with this new distance. We do this for every possible pair of vertices.
In the second iteration, we consider the possibility of using vertex 2 (B) as an intermediate vertex in paths between all pairs of vertices. We update the matrix in the same manner as before.
In the third iteration, we consider the possibility of using vertex 3 (C) as an intermediate vertex in
paths between all pairs of vertices.
Finally, in the fourth and final iteration, we consider the possibility of using vertex 4 (D) as an
intermediate vertex in paths between all pairs of vertices.
After the fourth iteration, we have the shortest path between every pair of vertices in the graph. For example, the shortest path from vertex A to vertex D is 4, which is the value in the matrix at row A and column D.
Let us now discuss some advantages and disadvantages of using Floyd-Warshall Algorithm:
1. It can find the shortest path between all pairs of vertices in a weighted graph, including graphs with negative edge weights.
2. It has a time complexity of O(N^3), which is relatively efficient for most real-world applications.
3. It can be used to detect negative weight cycles in a graph.
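The negative-cycle detection just mentioned rests on one observation: after the algorithm runs, a negative diagonal entry D[i][i] means vertex i can reach itself at negative total cost, i.e. it lies on a negative cycle. A minimal, self-contained sketch (the function name and example graph are illustrative):

```python
INF = float('inf')

def has_negative_cycle(w):
    """Run Floyd-Warshall on the weight matrix w and report whether
    any vertex can reach itself at negative total cost."""
    n = len(w)
    dist = [row[:] for row in w]
    for i in range(n):
        dist[i][i] = min(dist[i][i], 0)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    # A negative diagonal entry certifies a negative cycle through i.
    return any(dist[i][i] < 0 for i in range(n))

# A 3-cycle with total weight -1: 0 -> 1 -> 2 -> 0.
w = [[0, 1, INF],
     [INF, 0, 1],
     [-3, INF, 0]]
print(has_negative_cycle(w))  # True
```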
1. It requires a matrix of size N^2 to store the intermediate results, which may be prohibitively large for extremely large graphs.
2. It is not the most efficient algorithm for solving the all-pairs shortest path problem on certain types of graphs, such as sparse graphs or graphs with non-negative edge weights.
3. It may not be suitable for real-time applications or applications with strict memory constraints, as it can take a long time to compute the shortest paths in very large graphs.
4. It can be less intuitive than other algorithms, such as Dijkstra's algorithm or the Bellman-Ford algorithm, making it harder for some developers to understand.
The Floyd-Warshall algorithm is a dynamic programming algorithm used for finding the shortest path between all pairs of vertices in a weighted graph. Here are some of the applications of the Floyd-Warshall algorithm:
1. Airline Networks: The Floyd-Warshall algorithm can be used in airline networks to find the shortest path between two cities with the lowest cost. It can help airlines plan their routes and minimize fuel costs.
2. Traffic Networks: The algorithm is used to find the shortest path between two points in a traffic network. It can help reduce congestion and improve the flow of traffic in urban areas.
3. Game Development: The algorithm can be used in game development to find the shortest path between two objects in a game world. It is useful in games where the player needs to navigate through a complex environment, such as a maze or a city.
The max flow problem is a classic optimization problem in graph theory that involves finding the
maximum amount of flow that can be sent through a network of pipes, channels, or other pathways,
subject to capacity constraints. The problem can be used to model a wide variety of real-world
situations, such as transportation systems, communication networks, and resource allocation.
In the max flow problem, we have a directed graph with a source node s and a sink node t, and each
edge has a capacity that represents the maximum amount of flow that can be sent through it. The
goal is to find the maximum amount of flow that can be sent from s to t, while respecting the
capacity constraints on the edges.
One common approach to solving the max flow problem is the Ford-Fulkerson algorithm, which is
based on the idea of augmenting paths. The algorithm starts with an initial flow of zero, and
iteratively finds a path from s to t that has available capacity, and then increases the flow along that
path by the maximum amount possible. This process continues until no more augmenting paths can
be found.
Another popular algorithm for solving the max flow problem is the Edmonds-Karp algorithm, which is
a variant of the Ford-Fulkerson algorithm that uses breadth-first search to find augmenting paths,
and thus can be more efficient in some cases.
Advantages:
1. The max flow problem is a flexible and powerful modeling tool that can be used to represent
a wide variety of real-world situations.
2. The Ford-Fulkerson and Edmonds-Karp algorithms are both guaranteed to find the maximum
flow in a graph, and can be implemented efficiently for most practical cases.
3. The max flow problem has many interesting theoretical properties and connections to other
areas of mathematics, such as linear programming and combinatorial optimization.
Disadvantages:
1. In some cases, the max flow problem can be difficult to solve efficiently, especially if the
graph is very large or has complex capacity constraints.
2. The max flow problem may not always provide a unique or globally optimal solution,
depending on the specific problem instance and algorithm used.
The Ford-Fulkerson algorithm is a widely used algorithm to solve the maximum flow problem in a
flow network. The maximum flow problem involves determining the maximum amount of flow that
can be sent from a source vertex to a sink vertex in a directed weighted graph, subject to capacity
constraints on the edges.
The algorithm works by iteratively finding an augmenting path, which is a path from the source to
the sink in the residual graph, i.e., the graph obtained by subtracting the current flow from the
capacity of each edge. The algorithm then increases the flow along this path by the maximum
possible amount, which is the minimum capacity of the edges along the path.
Problem:
Given a graph that represents a flow network, where every edge has a capacity, and two vertices, source 's' and sink 't', find the maximum possible flow from s to t with the following constraints:
Flow on an edge does not exceed the given capacity of the edge.
Incoming flow is equal to outgoing flow for every vertex except s and t.
For example, consider the following graph from the CLRS book.
Ford-Fulkerson Algorithm
The following is the simple idea of the Ford-Fulkerson algorithm:
1. Start with the initial flow as 0.
2. While there exists an augmenting path from the source to the sink:
Determine the amount of flow that can be sent along the augmenting path, which is the minimum residual capacity along the edges of the path.
Increase the flow along the augmenting path by the determined amount.
3. Return the total flow.
Time Complexity: The time complexity of the above algorithm is O(max_flow * E). We run a loop while there is an augmenting path, and in the worst case we may add only 1 unit of flow in every iteration. Therefore the time complexity becomes O(max_flow * E).
The residual graph of a flow network is a graph that indicates the additional possible flow. If there is a path from source to sink in the residual graph, then it is possible to add flow. Every edge of a residual graph has a value called residual capacity, which is equal to the original capacity of the edge minus the current flow. Residual capacity is basically the remaining capacity of the edge.
Let us now talk about implementation details. Residual capacity is 0 if there is no edge between two vertices of the residual graph. We can initialize the residual graph to the original graph, as there is no initial flow and the initial residual capacity equals the original capacity. To find an augmenting path, we can do either a BFS or a DFS of the residual graph; we use BFS in the implementation below. Using BFS, we can find out whether there is a path from source to sink. BFS also builds a parent[] array. Using the parent[] array, we traverse the found path and compute the possible flow through it as the minimum residual capacity along the path. We then add the found path flow to the overall flow.
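The BFS-based procedure just described (the Edmonds-Karp variant) can be sketched as follows; the capacity-matrix format is an assumption, and the example network is the one from the CLRS book referred to earlier:

```python
from collections import deque

def ford_fulkerson(capacity, s, t):
    """Max flow via BFS augmenting paths; capacity is an n x n matrix."""
    n = len(capacity)
    residual = [row[:] for row in capacity]  # residual capacities
    max_flow = 0
    while True:
        # BFS on the residual graph, recording each vertex's parent.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:              # no augmenting path remains
            break
        # Path flow = minimum residual capacity along the found path.
        path_flow, v = float('inf'), t
        while v != s:
            u = parent[v]
            path_flow = min(path_flow, residual[u][v])
            v = u
        # Update residual capacities (and reverse edges) along the path.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= path_flow
            residual[v][u] += path_flow
            v = u
        max_flow += path_flow
    return max_flow

# The 6-vertex example network from the CLRS book (s = 0, t = 5).
graph = [[0, 16, 13, 0, 0, 0],
         [0, 0, 10, 12, 0, 0],
         [0, 4, 0, 0, 14, 0],
         [0, 0, 9, 0, 0, 20],
         [0, 0, 0, 7, 0, 4],
         [0, 0, 0, 0, 0, 0]]
print(ford_fulkerson(graph, 0, 5))  # 23
```

Because BFS always finds a shortest augmenting path, this variant runs in O(V * E^2) regardless of the edge capacities, avoiding the O(max_flow * E) worst case noted above.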