
7

Algorithms

By: Kurt Angelo Ballarta


BSCS-2B

Design and Analysis of Algorithms


Prof. Jan Eilbert
Table of Contents

Comb Sort
Backjumping
Linear Multistep Methods
Brent's Algorithm
Proximity Algorithm - Largest Empty Sphere Algorithm
Graph Reduction
String Hashing
Comb Sort
Introduction
Comb sort is an improvement over the classic Bubble Sort algorithm. The primary idea behind
Comb Sort is to eliminate the "turtles," or small values near the end of the list, which slow down
the sorting process. It does this by using a larger gap between compared elements and then
progressively reducing the gap size until it becomes one, at which point Comb Sort effectively
becomes Bubble Sort.

Origin of Comb Sort


Comb Sort was developed by Wlodzimierz Dobosiewicz in 1980 and later rediscovered and
popularized by Stephen Lacey and Richard Box in 1991. The algorithm addresses the
inefficiencies of Bubble Sort by comparing and swapping elements that are farther apart,
reducing the total number of swaps required and improving performance on average.

Key Concepts and Steps of Comb Sort


1. Initial Gap Calculation: The gap is initially set to the length of the list divided by the
shrink factor, typically 1.3.
2. Gap Reduction: After each pass, the gap is reduced by the shrink factor, eventually
reaching 1.
3. Comparison and Swapping: Elements that are the current gap distance apart are compared
and swapped if they are out of order.
4. Final Bubble Sort: Once the gap is reduced to 1, the algorithm performs a final pass using
Bubble Sort to ensure the list is fully sorted.

Pseudocode of Comb Sort


function combSort(arr):
    gap = length(arr)
    shrink = 1.3
    sorted = false

    while not sorted:
        # Update the gap for the next comb
        gap = floor(gap / shrink)
        if gap > 1:
            sorted = false
        else:
            gap = 1
            sorted = true

        # Perform a single "comb" over the array
        for i from 0 to length(arr) - gap - 1:
            if arr[i] > arr[i + gap]:
                # Swap elements
                temp = arr[i]
                arr[i] = arr[i + gap]
                arr[i + gap] = temp
                sorted = false

# Example usage
arr = [34, 8, 64, 51, 32, 21]
combSort(arr)

Implementation in Python
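A minimal Python translation of the pseudocode above might look like the following (an illustrative sketch, not an optimized library routine):

def comb_sort(arr):
    gap = len(arr)
    shrink = 1.3
    is_sorted = False
    while not is_sorted:
        # Shrink the gap for this pass
        gap = int(gap / shrink)
        if gap > 1:
            is_sorted = False
        else:
            gap = 1
            is_sorted = True
        # One "comb" over the array with the current gap
        for i in range(len(arr) - gap):
            if arr[i] > arr[i + gap]:
                arr[i], arr[i + gap] = arr[i + gap], arr[i]
                is_sorted = False
    return arr

print(comb_sort([64, 34, 25, 12, 22, 11, 90]))   # [11, 12, 22, 25, 34, 64, 90]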
A Worked Example in Comb Sort
Let's walk through a simplified worked example of Comb Sort with an array:
Initial array: [64, 34, 25, 12, 22, 11, 90]
1. Initial gap calculation:
   o Gap = 7 (length of array) // 1.3 ≈ 5
2. First pass with gap 5:
   o Compare 64 and 11, swap (64 > 11)
   o Array after pass: [11, 34, 25, 12, 22, 64, 90]
3. Gap reduction:
   o Gap = 5 // 1.3 ≈ 3
4. Second pass with gap 3:
   o Compare 11 and 12, no swap
   o Compare 34 and 22, swap (34 > 22)
   o Array after pass: [11, 22, 25, 12, 34, 64, 90]
5. Gap reduction:
   o Gap = 3 // 1.3 ≈ 2
6. Third pass with gap 2:
   o Compare 11 and 25, no swap
   o Compare 22 and 12, swap (22 > 12)
   o Array after pass: [11, 12, 25, 22, 34, 64, 90]
7. Gap reduction:
   o Gap = 2 // 1.3 ≈ 1
8. Final pass with gap 1 (Bubble Sort):
   o Compare 11 and 12, no swap
   o Compare 25 and 22, swap (25 > 22)
   o Array after pass: [11, 12, 22, 25, 34, 64, 90]

Final sorted array: [11, 12, 22, 25, 34, 64, 90]

Time Complexity
 Best Case: O(N log N), when the list is already nearly sorted.
 Average Case: Ω(N²/2^p), where p is the number of increments (gap values) used.
 Worst Case: O(N²), when the list is in reverse order.
Auxiliary Space
Comb Sort uses O(1) auxiliary space as it is an in-place sorting algorithm.

Advantages of Comb Sort


 Improved Performance: Comb Sort generally performs better than Bubble Sort by
reducing the number of swaps.
 Simple Implementation: The algorithm is easy to understand and implement.
 Good for Small Lists: It works efficiently on small to moderately sized lists.

Disadvantages of Comb Sort


 Not Stable: Like most other non-stable sorting algorithms, Comb Sort does not maintain
the relative order of equal elements.
 Not the Fastest: While better than Bubble Sort, it is still outperformed by more advanced
algorithms like Quick Sort and Merge Sort on larger datasets.

Conclusion
Comb Sort offers an enhancement over Bubble Sort by effectively reducing the number of
unnecessary swaps and comparisons through the use of a shrinking gap. Although it may not be
the best choice for very large datasets, it provides a simple and efficient alternative for smaller
lists, with a straightforward implementation that can be easily understood and utilized.

References:
 https://www.geeksforgeeks.org/comb-sort/

 https://en.wikipedia.org/wiki/Comb_sort
Backjumping Algorithm
Introduction
Backjumping is a sophisticated technique used to optimize the backtracking process in constraint
satisfaction problems (CSPs). It improves the efficiency of backtracking algorithms by allowing the
algorithm to jump back more than one level when a dead end is encountered, skipping over certain levels
of the search tree that cannot possibly resolve the conflict. This technique helps reduce the number of
constraints checked and can significantly decrease the search space, making the solving process faster.

Origin of Backjumping
Backjumping was introduced as an enhancement to the traditional backtracking algorithms used in
solving CSPs. In CSPs, the goal is to assign values to variables under certain constraints such that no
constraints are violated. Backtracking is a common method for solving CSPs, where values are assigned
to variables sequentially, and upon encountering a conflict, the algorithm backtracks to the previous
variable to try a different value. Backjumping improves upon this by skipping over some variables,
directly jumping back to a variable that might resolve the conflict.

Advantages of Backjumping
 Efficiency: Reduces the number of variables and constraints checked, leading to faster problem-
solving.
 Reduced Search Space: By skipping over irrelevant parts of the search tree, the algorithm
focuses on more promising areas.
 Improved Performance: Especially beneficial in large and complex CSPs where traditional
backtracking would be too slow.

Disadvantages of Backjumping
 Complexity: Implementing backjumping can be more complex compared to simple backtracking.
 Overhead: Additional computation is needed to determine where to jump back, which might not
always be beneficial for small or simple problems.

Description of Backjumping
Backjumping works by maintaining information about the constraints that have caused conflicts. When a
conflict is encountered, instead of simply backtracking to the most recent variable, the algorithm analyzes
the cause of the conflict to determine the most appropriate variable to backtrack to. This allows it to skip
over certain levels of the search tree that are irrelevant to the current conflict.

Key Concepts and Steps of Backjumping


1. Conflict Set: Each variable maintains a conflict set, which records other variables that caused a
conflict when this variable was last assigned a value.
2. Jump Back: When a conflict is detected, the algorithm jumps back to the most recent variable in
the conflict set rather than the immediately preceding variable.
3. Update Conflict Sets: Each time a conflict occurs and a jump back is made, the conflict sets are
updated to reflect the new conflicts.

Pseudocode of Backjumping (N-Queens Illustration)

The outline below describes a standard backtracking solver for the N-Queens problem; a backjumping variant changes only where the search returns to after a dead end (see the sketch after the Source Code heading).
1. Check Safety:
o Define a function is_safe(board, row, col) to check if placing a queen at
position (row, col) on the board is safe.
o Check if there are any queens in the same column as the current position.
o Check if there are any queens in the upper left diagonal.
o Check if there are any queens in the upper right diagonal.
o If no queen threatens the current position, return True; otherwise, return False.
2. Place Queens:
o Define a function solve_n_queens(board, row) to recursively place queens on
the board.
o Base case: If all queens are placed (i.e., row equals the size of the board), return
True.
o Iterate through each column in the current row.
o If placing a queen at the current position is safe, mark that position with a queen
and recursively call solve_n_queens for the next row.
o If a solution is found in the recursive call, return True.
o If no solution is found, backtrack by removing the queen from the current position
and try the next column.
o If no queen can be placed in the current row, return False.
3. Solve N-Queens Problem:
o Define a function n_queens(n) to solve the N-Queens problem for a given board
size n.
o Create a 2D array (board) of size n x n and initialize it with zeros.
o Call solve_n_queens with the initial board and row index 0.
o If a solution is found, print "Solution exists! Here is one possible configuration:"
followed by the board configuration.
o If no solution is found, print "Solution does not exist for this board size."

Example usage:

 Call n_queens(4) to find a solution for the 4-Queens problem.


Source Code
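As an illustrative sketch (not the original listing for this section), the following Python code applies a Gaschnig-style backjump to the N-Queens problem: when a row has no safe column, the search jumps directly to the deepest earlier row involved in a conflict instead of merely stepping back one row, while internal dead ends fall back to ordinary chronological backtracking. All names are illustrative.

def solve(row, cols, n):
    """Return ('solved', placement) or ('jump', target_row)."""
    if row == n:
        return 'solved', cols[:]
    deepest_conflict = -1        # deepest earlier row that blocked some column
    found_consistent = False
    for c in range(n):
        # shallowest earlier row that conflicts with column c, or -1 if none
        conflict = next((r for r in range(row)
                         if cols[r] == c or abs(cols[r] - c) == row - r), -1)
        if conflict != -1:
            deepest_conflict = max(deepest_conflict, conflict)
            continue
        found_consistent = True
        cols[row] = c
        status, payload = solve(row + 1, cols, n)
        if status == 'solved':
            return status, payload
        if payload < row:        # the failure below did not depend on this row:
            cols[row] = -1       # undo and let the jump pass through
            return 'jump', payload
    cols[row] = -1
    if not found_consistent:
        return 'jump', deepest_conflict   # leaf dead end: jump over irrelevant rows
    return 'jump', row - 1                # internal dead end: ordinary backtrack

def n_queens_backjumping(n):
    status, payload = solve(0, [-1] * n, n)
    return payload if status == 'solved' else None

print(n_queens_backjumping(6))   # a valid placement such as [1, 3, 5, 0, 2, 4]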
Worked Example
Consider a simple CSP with three variables X, Y, Z, the constraints X ≠ Y and Y ≠ Z, and the domain {1, 2, 3} for each variable.

Step-by-Step Execution

1. Assign X = 1.
2. Assign Y = 2 (consistent with X ≠ Y).
3. Assign Z = 2 (conflict with Y ≠ Z).
4. Jump back to Y and try a different value (since the conflict was caused by Y).
5. Assign Y = 3.
6. Assign Z = 2 (now consistent).

Final assignment: X = 1, Y = 3, Z = 2.

Time Complexity
The time complexity of backjumping depends on the size of the search tree and on how much of it the backjumps allow the algorithm to skip; in the worst case it remains O(N!), the same as exhaustive backtracking.

Auxiliary Space Complexity


O(N * (N - D)), where D refers to the depth of the backjump.

Conclusion
Backjumping enhances the traditional backtracking approach by skipping irrelevant parts of the
search tree, which leads to improved efficiency in solving CSPs. Although it introduces
additional complexity in implementation, the performance benefits in large and complex
problems make it a valuable technique in the field of constraint satisfaction.

References
 Wikipedia, "Backjumping," https://en.wikipedia.org/wiki/Backjumping.
Linear Multistep Methods
Introduction
Linear multistep methods (LMM) are a family of numerical methods used to solve ordinary
differential equations (ODEs). These methods improve the accuracy of the numerical solution by
using information from multiple previous points to calculate the next point. They can be
categorized into explicit and implicit methods, making them particularly useful for both stiff and
non-stiff differential equations. The use of multiple steps allows for higher-order accuracy
compared to single-step methods like Euler's method.

Origin of Linear Multistep Methods


The development of linear multistep methods dates back to the early 20th century, aimed at
improving the accuracy and stability of numerical solutions for ODEs. These methods were
designed to use information from several previous steps to predict the future behavior of the
solution, thus providing better accuracy without significantly increasing computational cost. The
foundational work on these methods laid the groundwork for modern numerical analysis
techniques used in various scientific and engineering applications.

Advantages of Linear Multistep Methods


 Higher Accuracy: By utilizing multiple previous points, LMMs can achieve higher-
order accuracy.
 Efficiency: They reduce the number of function evaluations per step, making them
computationally efficient.
 Stability: Implicit LMMs, in particular, offer better stability properties for stiff equations.

Disadvantages of Linear Multistep Methods


 Complexity: Implementing LMMs can be more complex compared to single-step
methods.
 Initialization: Requires starting values, which might need other methods to compute.
 Implicit Methods: Implicit LMMs require solving nonlinear equations at each step,
which can be computationally expensive.
Description of Linear Multistep Methods
Linear multistep methods use a linear combination of previous points and their corresponding
function values to compute the next point in the solution. A general k-step LMM can be written as:

∑_{j=0}^{k} α_j · y_{n−j} = h · ∑_{j=0}^{k} β_j · f_{n−j}

where the α_j and β_j are coefficients, the y_{n−j} are the previous solution points, the f_{n−j} are the function values at those points, and h is the step size.

Key Concepts and Steps of Linear Multistep Methods


1. Coefficients Determination: The coefficients α_j and β_j are determined based on the desired order of accuracy.
2. Initialization: Starting values y_0, y_1, ..., y_{k−1} are needed to initiate the method.
3. Step Computation: Use the linear combination of previous values and function
evaluations to compute the next value.

Pseudocode of Linear Multistep Methods


function linear_multistep(f, y_start, t0, h, n, alpha, beta):
    # Explicit k-step method written in "update form":
    #   y_i = alpha[1]*y_{i-1} + ... + alpha[k]*y_{i-k}
    #         + h*(beta[1]*f_{i-1} + ... + beta[k]*f_{i-k})
    # y_start must contain the k required starting values y_0 .. y_{k-1}
    k = length(alpha)
    y = copy(y_start)
    for i = k to n:
        y_new = 0
        for j = 1 to k:
            t_prev = t0 + (i - j) * h
            y_new += alpha[j] * y[i - j]
            y_new += h * beta[j] * f(t_prev, y[i - j])
        y.append(y_new)
    return y
Source Code
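As a concrete, hedged illustration of such a scheme (not the original listing for this section), the sketch below implements the two-step Adams-Bashforth method, an explicit linear multistep method with update y_{n+1} = y_n + h(3/2·f_n − 1/2·f_{n−1}); the second starting value is bootstrapped with one Euler step, and the function names are illustrative.

def adams_bashforth_2(f, y0, t0, h, n):
    """Integrate y' = f(t, y) over n steps of size h, starting from y(t0) = y0."""
    ys = [y0]
    fs = [f(t0, y0)]
    # Bootstrap: one Euler step supplies the second starting value
    ys.append(y0 + h * fs[0])
    fs.append(f(t0 + h, ys[1]))
    for i in range(1, n):
        y_next = ys[i] + h * (1.5 * fs[i] - 0.5 * fs[i - 1])
        ys.append(y_next)
        fs.append(f(t0 + (i + 1) * h, y_next))
    return ys

# Example: y' = y, y(0) = 1, integrated to t = 1.0
values = adams_bashforth_2(lambda t, y: y, 1.0, 0.0, 0.1, 10)
print(values[-1])   # ~2.6956, versus the exact value e ≈ 2.71828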
Worked Example
Consider the test equation y′ = y with initial value y(0) = 1, solved with the two-step Adams-Bashforth method y_{n+1} = y_n + h(3/2·f_n − 1/2·f_{n−1}) and step size h = 0.1. The method needs two starting values, so y_1 is obtained with a single Euler step: y_1 = y_0 + h·f_0 = 1 + 0.1·1 = 1.1. The first multistep step then gives y_2 = y_1 + 0.1·(1.5·1.1 − 0.5·1) = 1.1 + 0.115 = 1.215, close to the exact value e^{0.2} ≈ 1.2214. Each subsequent step reuses the two most recent function values in the same way, advancing the solution with only one new function evaluation per step.

Time Complexity
The time complexity of linear multistep methods depends on the number of steps n and the order
of the method k. Typically, the time complexity is O(nk) because for each of the n steps, the
method performs a fixed number of operations proportional to k. Higher-order methods might
require more computational effort per step but generally offer better accuracy.

Auxiliary Space Complexity


The auxiliary space complexity is mainly determined by the storage of past function values and
solution points. For a k-step method, the space complexity is O(k) because it needs to store k
previous values at any step. Therefore, the total auxiliary space complexity is O(k), where k is
the order of the method.

Conclusion
Linear multistep methods enhance the numerical solving of ordinary differential equations by
utilizing multiple previous points for higher accuracy and efficiency. Despite their complexity
and need for initialization, they are powerful tools for solving stiff and non-stiff ODEs, providing
significant improvements over single-step methods.

References
 Wikipedia, "Linear multistep method," https://en.wikipedia.org/wiki/Linear_multistep_method.
 LibreTexts, "Linear Multistep Methods," https://math.libretexts.org/Bookshelves/Differential_Equations/Numerically_Solving_Ordinary_Differential_Equations_(Brorson)/01%3A_Chapters/1.06%3A_Linear_multistep_methods.
 Scholarpedia, "Linear Multistep Method," http://www.scholarpedia.org/article/Linear_multistep_method.
Brent's Algorithm
Introduction
Brent's Algorithm is a root-finding algorithm that combines elements of root bracketing,
bisection, and inverse quadratic interpolation to find the roots of a real-valued function. This
method is renowned for its robustness and efficiency, often outperforming simpler methods like
bisection and the secant method. Developed by Richard Brent in 1973, this algorithm guarantees
convergence under certain conditions and is particularly effective for functions where derivatives
may not be readily available.

Root-finding algorithms are essential in numerical analysis for solving equations of the form
f(x) = 0. Brent's method, also known as the Van Wijngaarden-Dekker-Brent method,
combines several techniques to enhance both speed and reliability. This paper outlines the
principles behind Brent's method, providing a comprehensive explanation, pseudocode, and a
detailed example.

Brent's Algorithm
Brent's method strategically combines three primary techniques:

1. Root Bracketing: Ensures that the root lies within a specified interval.
2. Bisection Method: Narrows down the interval by repeatedly bisecting it and selecting
subintervals where the root lies.
3. Inverse Quadratic Interpolation: Uses a quadratic polynomial to approximate the root,
which can speed up convergence.

Algorithm Description
The algorithm starts with three points x1, x2, and x3 that bracket the root. It then performs inverse quadratic interpolation to predict the next point. If the predicted point does not sufficiently reduce the interval or lies outside the interval, the algorithm falls back to bisection.

Steps:

1. Initialize three points x1, x2, x3 such that f(x1), f(x2), and f(x3) are known.
2. Check if f(x2) is close enough to zero; if so, x2 is the root.
3. Perform inverse quadratic interpolation using x = x2 + P/Q, where P and Q are derived from the function values at the three points.
4. If the new point is not an improvement, use bisection to find a better interval.
5. Repeat until the root is found within a desired tolerance.
Pseudocode (Brent's Cycle Detection Algorithm)

Brent is also known for a cycle-detection algorithm for linked sequences; the pseudocode below and the complexity figures later in this section describe that algorithm, applied to a linked list, rather than the root-finding method.
CLASS Node:
    FUNCTION __init__(self, data):
        self.data = data
        self.next = None

CLASS LinkedList:
    FUNCTION __init__(self):
        self.head = None

    FUNCTION push(self, new_data):
        new_node = Node(new_data)
        new_node.next = self.head
        self.head = new_node

    FUNCTION printList(self):
        temp = self.head
        WHILE temp:
            PRINT temp.data, end=" "
            temp = temp.next

    FUNCTION detectCycle(self):
        temp = self.head
        IF NOT temp:
            RETURN None
        first_p = temp
        second_p = temp.next
        power = 1
        length = 1
        # Phase 1: find the cycle length by doubling the search window
        WHILE second_p AND second_p != first_p:
            IF length == power:
                power *= 2
                length = 0
                first_p = second_p
            second_p = second_p.next
            length = length + 1
        IF NOT second_p:
            RETURN None            # no cycle found
        # Phase 2: locate the first node of the cycle
        first_p = second_p = self.head
        WHILE length > 0:
            second_p = second_p.next
            length = length - 1
        WHILE second_p != first_p:
            second_p = second_p.next
            first_p = first_p.next
        RETURN first_p

# Driver program for testing
llist = LinkedList()
llist.push(50)
llist.push(20)
llist.push(15)
llist.push(4)
llist.push(10)
# Create a loop for testing
llist.head.next.next.next.next.next = llist.head.next.next
res = llist.detectCycle()
IF res:
    PRINT "loop found at ", res.data
ELSE:
    PRINT "No Loop"
Source Code
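A complete Brent implementation involves several interpolation and bisection safeguards, so only a hedged illustration is given here: assuming SciPy is available, its brentq routine applies Brent's method directly, shown below on the same function used in the worked example.

from scipy.optimize import brentq

def f(x):
    return x**2 - 2          # root is sqrt(2), bracketed by [1, 2]

root = brentq(f, 1.0, 2.0)   # Brent's method on the bracketing interval
print(root)                  # ~1.4142135623730951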

Worked Example
Consider f(x) = x² − 2; the goal is to find the square root of 2.

1. Initial interval: [1, 2]
2. f(1) = −1, f(2) = 2
3. Apply the algorithm:
   o First step (bisection midpoint): s = 1.5
   o Evaluate f(1.5) = 0.25
   o Since f(1) < 0 and f(1.5) > 0, the new bracketing interval is [1, 1.5]
   o Repeat until convergence to the root √2 ≈ 1.4142.
Auxiliary Space and Time Complexity
 Time Complexity (cycle detection): O(m + n), where m is the smallest index of the sequence at which the cycle begins and n is the cycle's length.
 Auxiliary Space: O(1)

Conclusion
Brent's method is a powerful root-finding algorithm that guarantees convergence and combines
the reliability of bisection with the speed of inverse quadratic interpolation. Its hybrid approach
makes it suitable for a wide range of applications in numerical analysis.

References
1. Wolfram MathWorld, Brent's Method.
2. Wikipedia, Brent's method.
3. GeeksforGeeks, Brent's Cycle Detection Algorithm.
Proximity Algorithm - Largest Empty Sphere
Algorithm
Introduction
The largest empty sphere problem involves finding the largest possible hypersphere that can fit
within a given space without overlapping any specified obstacles. This problem is a key concept
in computational geometry, with applications in fields such as clustering, collision detection, and
spatial analysis.

In computational geometry, the largest empty sphere problem focuses on determining the largest
hypersphere that fits in a d-dimensional space without intersecting any given obstacles. This
problem can be applied to several real-world scenarios, including optimizing layouts and
maximizing space utilization.

Problem Definition
Given a set of obstacles in d-dimensional space, the goal is to find the largest hypersphere such
that its interior does not intersect any obstacle. The problem can be simplified to two dimensions,
where it involves finding the largest empty circle.

Algorithm Description
1. Initial Setup: Identify all obstacles and their coordinates in the given space.
2. Convex Hull Calculation: Compute the convex hull of the obstacles to define the
boundary within which the largest empty sphere will be located.
3. Voronoi Diagram Construction: Create a Voronoi diagram for the obstacle points. The
vertices of the Voronoi diagram are potential candidates for the sphere's center.
4. Sphere Calculation: For each candidate point, calculate the largest sphere that can be
centered at that point without intersecting any obstacles.
5. Optimization: Select the candidate point that allows for the largest sphere.

Pseudocode
function largestEmptySphere(obstacles, dimension):

convexHull = computeConvexHull(obstacles)
voronoiDiagram = constructVoronoiDiagram(obstacles)
largestSphere = null
maxRadius = 0

for vertex in voronoiDiagram.vertices:


if isWithinConvexHull(vertex, convexHull):
radius = computeMaxRadius(vertex, obstacles)
if radius > maxRadius:
maxRadius = radius
largestSphere = (vertex, radius)

return largestSphere
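A rough 2-D sketch of this pseudocode, using SciPy's computational-geometry helpers, is shown below. It is illustrative only; note that SciPy's Voronoi construction requires at least four input points in two dimensions, so the usage line employs a slightly larger obstacle set than the worked example that follows.

import numpy as np
from scipy.spatial import ConvexHull, Delaunay, Voronoi

def largest_empty_circle(points):
    points = np.asarray(points, dtype=float)
    vor = Voronoi(points)                     # Voronoi vertices are the candidate centres
    hull = Delaunay(points[ConvexHull(points).vertices])   # for point-in-hull tests
    best_center, best_radius = None, 0.0
    for v in vor.vertices:
        if hull.find_simplex(v) < 0:          # skip candidates outside the convex hull
            continue
        r = float(np.min(np.linalg.norm(points - v, axis=1)))  # clearance to nearest obstacle
        if r > best_radius:
            best_center, best_radius = v, r
    return best_center, best_radius

centre, radius = largest_empty_circle([(1, 1), (2, 2), (3, 1), (1, 3), (3, 3)])
print(centre, radius)                         # centre of the largest empty circle and its radius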
Worked Example
Consider a 2D example with obstacles at coordinates (1, 1), (2, 2), and (3, 1):

1. Compute the convex hull, which forms a triangle encompassing all points.
2. Construct the Voronoi diagram.
3. Identify Voronoi vertices as potential centers for the largest empty circle.
4. Calculate the largest circle for each vertex.
5. Select the vertex yielding the largest circle.

Auxiliary Space and Time Complexity

 Time Complexity: The time complexity of constructing the Voronoi diagram is O(n log n), where n is the number of obstacles.
 Space Complexity: The space complexity is O(n) for storing the Voronoi diagram and the convex hull.

Conclusion
The largest empty sphere algorithm is essential in computational geometry for optimizing space
utilization and solving proximity problems. By leveraging techniques such as convex hull
computation and Voronoi diagrams, this method provides an efficient solution to finding the
largest sphere that can fit within a given set of obstacles.

References
1. Wikipedia, Proximity problems.
2. Wikipedia, Largest empty sphere.
Graph Reduction
Introduction
Graph reduction is a technique in computer science used to implement non-strict (lazy)
evaluation. This method efficiently evaluates expressions by transforming them into graphs and
reducing these graphs. It is essential in functional programming languages, enabling more
effective computation and resource utilization.

Graph reduction allows expressions to be represented as graphs, where nodes correspond to


operations and edges represent dependencies. This approach is fundamental for implementing
lazy evaluation, where computations are deferred until their results are needed.

Graph Reduction Technique


The process involves converting a program into a graph form, where subexpressions can be
shared, thus avoiding redundant calculations. The main steps in graph reduction include:

1. Graph Representation: Expressions are represented as directed acyclic graphs (DAGs).


2. Reduction Operations: Nodes in the graph are reduced based on specific rules until a
final result is obtained.

Combinator Graph Reduction


Combinator graph reduction is an implementation technique for functional languages. It
transforms programs into combinator representations mapped to graphs. Execution involves
rewriting parts of these graphs to achieve results.

Key Steps:
1. Program Conversion: Convert the program into combinator form.
2. Graph Mapping: Map the combinator representation to a graph data structure.
3. Graph Reduction: Perform reduction on the graph to compute the result.

Pseudocode
function graphReduction(expression):
graph = convertToGraph(expression)
while not isReduced(graph):
for node in graph:
if isReducible(node):
reduce(node)
return graph.getResult()
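As a toy illustration of the idea (not a real lazy-evaluation runtime), the Python sketch below builds an expression graph in which literal nodes can be shared, and overwrites each node with its value once it has been reduced, so a shared subexpression is evaluated only once.

class Node:
    def __init__(self, op=None, left=None, right=None, value=None):
        self.op, self.left, self.right, self.value = op, left, right, value

def reduce_node(node):
    # A node that already holds a value is reused directly (sharing).
    if node.value is not None:
        return node.value
    left = reduce_node(node.left)
    right = reduce_node(node.right)
    node.value = left + right                 # only '+' is needed for this example
    node.op = node.left = node.right = None   # overwrite the node with its result
    return node.value

# (2 + 2) + (3 + 3), with the literals 2 and 3 shared between operands
two, three = Node(value=2), Node(value=3)
expr = Node('+', Node('+', two, two), Node('+', three, three))
print(reduce_node(expr))                      # 10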
Worked Example

Consider the expression (2 + 2) + (3 + 3):

1. Tree Representation:
   o Initial tree: ((2 + 2) + (3 + 3))
   o Reduced tree: (4 + 6)
   o Result: 10
2. Graph Representation:
   o Initial graph: nodes for each operation and edges for dependencies.
   o Reduced graph: shared subexpressions are evaluated once.

Auxiliary Space and Time Complexity


 Time Complexity: Depends on the specific reductions applied, but typically O(n) for
linear reductions.
 Auxiliary Space: O(n), where n is the number of nodes in the graph.

Conclusion
Graph reduction is a powerful technique for implementing lazy evaluation in functional
programming. By representing expressions as graphs and applying reductions, it optimizes the
computation process, ensuring efficiency and avoiding redundant calculations.

References
1. Wikipedia, Graph reduction.
String Hashing
Abstract
String hashing is a technique used to convert a string into a numerical value, or hash, that
uniquely represents the content of the string. This technique is fundamental in various
applications, such as string comparison, data retrieval, and cryptography. Efficient string hashing
ensures quick lookups and comparisons, making it an essential tool in competitive programming
and computer science.

Introduction
String hashing involves converting a string into a fixed-size integer, which uniquely identifies
the string's content. The primary goals of string hashing are to achieve fast computation and to
minimize collisions (different strings having the same hash value). This paper explains the
principles of string hashing, provides a pseudocode implementation, and discusses its time and
space complexities.

String Hashing Technique


The most common approach for string hashing is the polynomial rolling hash function. It treats a
string as a polynomial, where each character is a coefficient, and evaluates this polynomial at a
specific base value.

Mathematical Representation:
Given a string S of length n with characters S[0], S[1], ..., S[n−1], the hash value H is computed as:

H = (S[0]·p^0 + S[1]·p^1 + S[2]·p^2 + ... + S[n−1]·p^(n−1)) mod m

where p is a prime number (the base) and m is a large prime modulus used to avoid overflow and reduce collisions.

Algorithm Description
String hashing can be efficiently implemented using a rolling hash approach, which allows the
hash value of a substring to be computed quickly based on the hash value of the previous
substring.

Steps:

1. Initialization: Select appropriate values for the base p and modulus m.
2. Hash Computation: Compute the hash value of the string using the polynomial rolling
hash formula.
3. Rolling Hash: Update the hash value for substrings efficiently using the previous hash
value.
Pseudocode
function computeHash(S, p, m):
    n = length(S)
    hash_value = 0
    p_pow = 1              # p^0 initially
    for i from 0 to n-1:
        hash_value = (hash_value + (ord(S[i]) - ord('a') + 1) * p_pow) % m
        p_pow = (p_pow * p) % m
    return hash_value

function rollingHash(S, p, m):
    # Builds prefix hashes: hashes[i] covers S[0..i]
    n = length(S)
    hashes = array of size n
    hashes[0] = computeHash(S[0], p, m)
    for i from 1 to n-1:
        hashes[i] = (hashes[i-1] * p + (ord(S[i]) - ord('a') + 1)) % m
    return hashes
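The pseudocode translates almost directly into runnable Python; the snippet below uses the illustrative parameters p = 31 and m = 10^9 + 9 and reproduces the hash computed in the worked example that follows.

def compute_hash(s, p=31, m=10**9 + 9):
    hash_value, p_pow = 0, 1
    for ch in s:
        # map 'a' -> 1, 'b' -> 2, ..., then weight by the current power of p
        hash_value = (hash_value + (ord(ch) - ord('a') + 1) * p_pow) % m
        p_pow = (p_pow * p) % m
    return hash_value

print(compute_hash("abc"))   # 2946, matching the worked example below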

Worked Example

Consider the string S = "abc":

 Base p = 31
 Modulus m = 10^9 + 9

Using the polynomial rolling hash function (with 'a' → 1, 'b' → 2, 'c' → 3):

 H = (1·31^0 + 2·31^1 + 3·31^2) mod (10^9 + 9)
 H = (1 + 62 + 2883) mod (10^9 + 9)
 H = 2946

Time Complexity
 Hash Computation: O(n) for computing the hash of a string of length n.
 Rolling Hash: O(n) for computing the hashes of all prefixes (and, from them, any substring).
Auxiliary Space Complexity
 Space Complexity: O(n) for storing the hash values of substrings.

Conclusion
String hashing is an efficient technique for string comparison and retrieval, crucial in various
computational applications. By using polynomial rolling hash functions and rolling hash
methods, we can achieve fast and reliable string hashing, ensuring efficient data processing and
minimal collisions.

References
1. CP-Algorithms, String Hashing.
