Design and Analysis of Algorithms
CHAPTER ONE: INTRODUCTION

Part-1: Basics of DAA

Revision on DSA
 A program is a combination of data structures and algorithms.
 An algorithm is a set (or sequence) of instructions that builds up the program. It determines the set and
sequence of operations that are applied to the data residing in the data structures in order to
tackle a problem.
 Data structures are run-time storage that enables the program to organize data during program
execution. Therefore, one can simply say data structures are run-time memories.
 Note that algorithm, program and process are related but quite different concepts:
 An algorithm is a set (or sequence) of operations to solve a specific problem, expressed either in
pseudocode or as a flowchart on paper.
 A program is also a set of procedural steps (or operations) to solve a specific problem, but expressed
using a high-level or low-level programming language.
 A process is simply a program in execution.
What is an Algorithm?
 An algorithm is a finite set of instructions or logic, written in order, to accomplish a
certain predefined task.
 An algorithm is not the complete code or program; it is just the core logic (solution) of a
problem, which can be expressed using either pseudocode or a flowchart.
 An algorithm takes some value or a set of values as input and produces some value
or a set of values as output.
 Data structures model the static part of the world. They are unchanging while the
world is changing.
 In order to model the dynamic part of the world we need to work with algorithms.
Classifications of algorithms
 Algorithms can be categorized based on the implementation technique or design
paradigm they deploy:
I. Based on implementation technique:
1. Recursive vs. iterative algorithms
 Recursive algorithms are based on recursion, e.g. the Tower of Hanoi problem or Fibonacci
numbers,
 whereas iterative algorithms are based on loops or iterative statements.
 Every recursive algorithm has an equivalent iterative version, and vice versa, e.g. the
problem of factorials (a sketch follows below).
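As a minimal illustration of this equivalence (my own sketch, assuming inputs small enough that the result fits in a long long), here is the factorial computed both ways in C++:

#include <iostream>

// Recursive version: the function calls itself on a smaller instance.
long long factorialRecursive(int n) {
    if (n <= 1) return 1;                 // base case
    return n * factorialRecursive(n - 1); // recursive case
}

// Equivalent iterative version: a loop replaces the recursion.
long long factorialIterative(int n) {
    long long result = 1;
    for (int i = 2; i <= n; i++)
        result *= i;
    return result;
}

int main() {
    std::cout << factorialRecursive(5) << " "
              << factorialIterative(5) << std::endl; // prints: 120 120
    return 0;
}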
Cont…
2. Serial vs. parallel vs. distributed algorithms
 Serial algorithms are implemented on the assumption that computers execute one
instruction of an algorithm at a time. Those computers are sometimes called serial
computers.
 Parallel algorithms take advantage of computer architectures where several processors can
work on a problem at the same time,
 whereas distributed algorithms utilize multiple machines connected with a network.
 Parallel or distributed algorithms divide the problem into more symmetrical (or
proportional) or asymmetrical subproblems and collect the results back together.
 Sorting algorithms can be parallelized efficiently. Iterative algorithms are generally
parallelizable.
 Some problems have no parallel algorithms, and are called inherently serial problems.
Cont…
3. Deterministic vs. non-deterministic algorithms
 Deterministic algorithms solve the problem with an exact decision at every step of the
algorithm,
 whereas non-deterministic algorithms solve problems via guessing, although typical
guesses are made more accurate through the use of heuristics.
4. Exact vs. approximation algorithms
 While many algorithms reach an exact solution, approximation algorithms seek an
approximation that is close to the true solution.
 Approximation may use either a deterministic or a random strategy. Such algorithms have
practical value for many hard problems, e.g. Minimum Vertex Cover.
Cont…
II. Based on design strategy:
1. Divide and conquer strategy
 A divide and conquer algorithm repeatedly reduces an instance of a problem to one or more
smaller instances of the same problem (usually recursively), until the instances are small
enough to solve easily.
 One such example of divide and conquer is merge sort. The data is divided into segments,
each segment is sorted, and the sorting of the entire data is obtained in the conquer phase by
merging the sorted segments.
 Divide, conquer, then combine (a sketch follows below).
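Merge sort itself is sketched later in the recurrence-analysis part; as a smaller divide-and-conquer illustration (my own example, not from the slides), here is finding the maximum of an array by splitting it in half:

#include <iostream>

// Divide-and-conquer maximum: split [lo, hi] in half, conquer each half
// recursively, then combine the two partial answers.
int maxElement(const int a[], int lo, int hi) {
    if (lo == hi) return a[lo];                    // base case: one element
    int mid = (lo + hi) / 2;                       // divide
    int leftMax  = maxElement(a, lo, mid);         // conquer left half
    int rightMax = maxElement(a, mid + 1, hi);     // conquer right half
    return leftMax > rightMax ? leftMax : rightMax; // combine
}

int main() {
    int a[] = {3, 9, 2, 7, 5};
    std::cout << maxElement(a, 0, 4) << std::endl; // prints: 9
    return 0;
}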
Cont…
2. Dynamic programming strategy
 When a problem shows optimal substructure, meaning the optimal solution to a problem
can be constructed from optimal solutions to its subproblems, and overlapping subproblems,
meaning the same subproblems are used to solve many different problem instances, a
quicker approach called dynamic programming avoids recomputing solutions that have
already been computed.
 (“Programming” in this context refers to a tabular method, not to writing computer code.)
 For example, the shortest path to a goal from a vertex in a weighted graph can be found by
using the shortest path to the goal from all adjacent vertices.
 The main difference between dynamic programming and divide and conquer is that
subproblems are more or less independent in divide and conquer, whereas subproblems
overlap in dynamic programming (a sketch follows below).
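A standard small illustration (my example, not from the slides) is Fibonacci: the naive recursion recomputes the same overlapping subproblems exponentially often, while a dynamic-programming table solves each subproblem exactly once:

#include <iostream>
#include <vector>

// Naive recursion: fibNaive(n-1) and fibNaive(n-2) share subproblems,
// so the running time is exponential in n.
long long fibNaive(int n) {
    if (n < 2) return n;
    return fibNaive(n - 1) + fibNaive(n - 2);
}

// Dynamic programming (tabulation): each subproblem is computed once
// and stored, giving O(n) time.
long long fibDP(int n) {
    std::vector<long long> table(n + 1, 0);
    if (n >= 1) table[1] = 1;
    for (int i = 2; i <= n; i++)
        table[i] = table[i - 1] + table[i - 2];
    return table[n];
}

int main() {
    std::cout << fibNaive(10) << " " << fibDP(10) << std::endl; // prints: 55 55
    return 0;
}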
Cont…
3. Greedy strategy
 A greedy algorithm is similar to a dynamic programming algorithm, but the difference is
that solutions to the subproblems do not have to be known at each stage; instead, a
“greedy” choice can be made of what looks best for the moment.
 The greedy method extends the solution with the best possible decision (not all feasible
decisions) at each algorithmic stage, based on the current local optimum and the best decisions
(not all possible decisions) made in previous stages.
 It is not exhaustive, and does not give an accurate answer to many problems. But when it
works, it is the fastest method. The most popular greedy algorithm is finding the
minimum spanning tree, as given by Kruskal (a smaller sketch follows below).
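Kruskal's algorithm needs a union-find structure, so as a more compact greedy illustration (my own substitute example) here is coin change with a canonical coin system, where always taking the largest coin that fits happens to be optimal:

#include <iostream>

// Greedy coin change: at each stage take the largest coin that still
// fits. This local choice is optimal for canonical coin systems such
// as {25, 10, 5, 1}, though not for arbitrary coin sets.
int minCoins(int amount) {
    const int coins[] = {25, 10, 5, 1};
    int count = 0;
    for (int c : coins) {
        count  += amount / c;  // take as many of this coin as possible
        amount %= c;           // and continue with the remainder
    }
    return count;
}

int main() {
    std::cout << minCoins(63) << std::endl; // prints: 6 (25+25+10+1+1+1)
    return 0;
}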
Cont…
4. Backtracking strategy
 Try each possibility until you find the right one, e.g. depth-first search (a small sketch
follows after this slide).
 During the search, if an alternative taken doesn’t work, the search backtracks to the choice
point, the place which presented different alternatives, and tries the next alternative.
 When the alternatives at a choice point are exhausted, the search returns to the previous
choice point and tries the next alternative there.
5. Search and enumeration strategy
 Many problems (such as playing chess) can be modeled as problems on graphs. A graph
exploration algorithm specifies rules for moving around a graph and is useful for such
problems.
 This category also includes search algorithms, branch-and-bound enumeration and
backtracking.
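A compact backtracking sketch (my own example, assuming we want any subset of positive integers summing to a given target): try including each element, and if a choice leads to a dead end, undo it and try the next alternative:

#include <iostream>
#include <vector>

// Backtracking subset-sum: at index i, try the alternative "include
// a[i]"; if that fails, backtrack and try "exclude a[i]".
bool subsetSum(const std::vector<int>& a, size_t i, int target) {
    if (target == 0) return true;                  // found a valid subset
    if (i == a.size() || target < 0) return false; // dead end: backtrack
    return subsetSum(a, i + 1, target - a[i])      // alternative 1: include a[i]
        || subsetSum(a, i + 1, target);            // alternative 2: exclude a[i]
}

int main() {
    std::vector<int> a = {3, 34, 4, 12, 5, 2};
    std::cout << std::boolalpha << subsetSum(a, 0, 9) << std::endl; // true (4+5)
    return 0;
}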
Cont…
6. Heuristic and probabilistic strategy
 Probabilistic algorithms are those that make some choices randomly (or pseudo-randomly);
for some problems, it can in fact be proven that the fastest solutions must involve some
randomness.
 Genetic algorithms attempt to find solutions to problems by mimicking biological
evolutionary processes, with a cycle of random mutations yielding successive generations
of “solutions”. Thus, they emulate reproduction and “survival of the fittest”. In genetic
programming, this approach is extended to algorithms, by regarding the algorithm itself as a
“solution” to a problem.
 Heuristic algorithms aim not for an optimal solution but for an approximate one, in settings
where the time or resources to find a perfect solution are limited. Examples include local
search and simulated annealing.
Complexity Analysis
 In order to solve a specific (single) problem, there may be many possible
algorithms.
 One has to be able to choose the most efficient algorithm for the problem at hand using
some scientific method.
 An algorithm is said to be efficient and fast if it takes less time to execute and
consumes less memory space.
 On this basis, we measure the performance of an algorithm theoretically in terms
of:
1. Space complexity
2. Time complexity
Space complexity
 It is the amount of memory used by the algorithm (including the input values to the
algorithm) to execute and produce the result.
 Space complexity must be taken seriously for multi-user systems and in situations
where limited memory is available.
 An algorithm generally requires space for the following components:
1. Instruction space: the space required to store the executable version of the
program. This space is fixed, but varies depending upon the number of lines of code in
the program.
2. Data space: the space required to store the values of all constants and variables
(including temporary variables).
3. Environment space: the space required to store the environment information
needed to resume a suspended function.
Cont…
 Mostly, while calculating the space complexity of an algorithm, we consider only the data
space and neglect the instruction space and the environment stack.
 For calculating the space complexity, we need to know the amount of memory used by the
different datatypes of the variables:

Type                                                      Size
bool, char, unsigned char, signed char, __int8            1 byte
__int16, short, unsigned short, wchar_t, __wchar_t        2 bytes
float, __int32, int, unsigned int, long, unsigned long    4 bytes
double, __int64, long double, long long                   8 bytes

Examples (their space is worked out below):
1. {
       int z = a + b + c + d;
       return z;
   }
2. int sum(int a[], int n)
   {
       int x = 0;
       for (int i = 0; i < n; i++)
       {
           x = x + a[i];
       }
       return (x);
   }
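Working out the data space for these (my own calculation, using the 4-byte int from the table above): in example 1, the variables a, b, c, d and z occupy 5 × 4 = 20 bytes, a constant independent of the input, so the space complexity is O(1). In example 2, the array a occupies 4n bytes, and n, x and i occupy 4 bytes each, giving 4n + 12 bytes, which grows linearly with the input size, so the space complexity is O(n).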
Time complexity
 Time complexity is a way to represent the amount of time required by the program to run
till its completion.
 There are two approaches to time complexity analysis:
1. Empirical (a posteriori) approach: examines the algorithm by running it on different
instances on different platforms.
2. Theoretical (a priori) approach: determines the efficiency of the algorithm mathematically.
 Example: how much time will the following code snippets take?
1. return (n * n);
2. int result = 0;
   for (int i = 0; i < n; i++)
   {
       result = result + n;   // add n to itself n times
   }
   return result;
 The above code snippets are meant to solve the same problem, computing the square of a
number, but they end up taking different amounts of time.
Cont…
 It is difficult to use the empirical approach as a consistent measure of an algorithm’s
efficiency, since it is dependent on:
• Specific processor speed
• Current processor load
• Specific data for a particular run of the program
• Input size
• Input properties
• Operating environment
 The better way is the theoretical approach, where instead of examining and
comparing the algorithm using different instances (or inputs) on different platforms,
we mathematically calculate the operations executed or the storage space used.
Rules for (time complexity) Analysis
1. We assume an arbitrary time unit.
2. Execution of the following operations takes 1 time unit:
• Assignment Operation
• Single Input/Output Operation
• Single Boolean Operations
• Single Arithmetic Operations
• Function Return
3. Running time of a selection statement (if, switch) is the time for the condition evaluation +
the maximum of the running times for the individual clauses in the selection.
Cont…
4. Running time for a loop is equal to the running time for the statements inside the loop *
number of iterations. The total running time of a statement inside a group of nested loops is
the running time of the statements multiplied by the product of the sizes of all the loops.
For nested loops, analyze inside out.
• Always assume that the loop executes the maximum number of iterations possible.
5. Running time of a function call is 1 for setup + the time for any parameter calculations +
the time required for the execution of the function body.
Examples
Find the running time of each of the following (example 1 is analyzed below):
1. int total(int n)
   {
       int sum = 0;
       for (int i = 1; i <= n; i++)
           sum = sum + 1;
       return sum;
   }
2. int count()
   {
       int k = 0;
       int n, i;
       cout << "Enter an integer";
       cin >> n;
       for (i = 0; i < n; i++)
           k = k + 1;
       return 0;
   }
3. void func()
   {
       int x = 0;
       int i = 0;
       int j = 1;
       int n;
       cout << "Enter an Integer value";
       cin >> n;
       while (i < n) {
           x++;
           i++;
       }
       while (j < n) {
           j++;
       }
   }
4. int sum(int n)
   {
       int partial_sum = 0;
       for (int i = 1; i <= n; i++)
           partial_sum = partial_sum + (i * i * i);
       return partial_sum;
   }
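Applying the counting rules to example 1 (my own count; textbooks differ slightly in how they charge loop overhead): sum = 0 costs 1 unit; i = 1 costs 1; the test i <= n executes n + 1 times; the increment i++ costs 2 units (one arithmetic, one assignment) n times; the body sum = sum + 1 costs 2 units n times; and the return costs 1. So T(n) = 1 + 1 + (n + 1) + 2n + 2n + 1 = 5n + 4, which is O(n).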
Formal Approaches to Analysis
1. Loops: In general, a loop translates to a summation (see the example below). The index and
bounds of the summation are the same as the index and bounds of the for loop.
2. Conditions: Take the maximum of the running times of the individual branches, plus the
time to evaluate the condition.
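For instance (my illustration): the loop for (int i = 1; i <= n; i++) sum = sum + i; translates to Σ_{i=1}^{n} c = c·n, where c is the constant cost of the loop body, so the loop is O(n). Two nested loops over n translate to Σ_{i=1}^{n} Σ_{j=1}^{n} c = c·n², i.e. O(n²).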
Asymptotic Analysis
 An asymptote is a line that a curve tends to converge to, and which may or may not
eventually touch the curve. It is the line that stays within bounds.
 Asymptotic notation represents the “fastest possible” and “slowest possible” running times
for an algorithm using upper and lower bounds on speed.
 Asymptotic analysis distinguishes the following input cases:
1. Worst case / upper bound: the amount of time the algorithm takes on the worst possible set
of inputs.
 Big-Oh (O) and Little-oh (o): our interest is in the worst case. Why?
2. Average case: the amount of time the algorithm takes on an “average” set of inputs.
 Theta (θ)
3. Best case / lower bound: the amount of time the algorithm takes on the best possible set
of inputs.
 Big-Omega (Ω) and Little-omega (ω)
Big-Oh (O)
 It is an asymptotic upper bound of a function. Hence it denotes the worst-
case complexity of an algorithm.
 Formally:
f(n) = O(g(n)) if there exist c, n0 ∊ ℛ+ such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 ≥ 1.
This is the set of functions which, as n gets large, grow no faster than a constant times g(n).
 Graphically: c·g(n) lies on or above f(n) for every n ≥ n0.
Examples (example 1 is worked below):
1. f(n) = 10n + 5 and g(n) = n. Show that f(n) is O(g(n)).
2. f(n) = 3n² + 4n + 1. Show that f(n) = O(n²).
Home practice:
3. f(n) = 2ⁿ + n². Show that f(n) = O(2ⁿ).
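Worked solution for example 1 (my derivation): for all n ≥ 1 we have 10n + 5 ≤ 10n + 5n = 15n, so choosing c = 15 and n0 = 1 gives 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0, hence f(n) = O(n). The constants are not unique: c = 11 with n0 = 5 also works, since 10n + 5 ≤ 11n whenever n ≥ 5.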
Big-Omega (Ω)
 It is an asymptotic lower bound of a function. Hence it denotes the best-
case complexity of an algorithm.
 Formally:
f(n) = Ω(g(n)) if there exist c, n0 ∊ ℛ+ such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0 ≥ 1;
considering 0 is important. Why?
 Graphically: c·g(n) lies on or below f(n) for every n ≥ n0.
Examples:
1. f(n) = 10n + 5 and g(n) = n. Show that f(n) is Ω(g(n)).
2. f(n) = 3n² + 4n + 1. Show that f(n) = Ω(n²).
Theta (θ)
 Theta is an asymptotically tight bound; it denotes the average-case complexity of an
algorithm.
 Formally:
f(n) = θ(g(n)) if there exist c1, c2, n0 ∊ ℛ+ such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 ≥ 1.
 Graphically: f(n) is sandwiched between c1·g(n) and c2·g(n) for every n ≥ n0.
Examples (example 2 is worked below):
1. f(n) = 10n + 5 and g(n) = n. Show that f(n) is θ(g(n)).
2. f(n) = 3n² + 4n + 1. Show that f(n) = θ(n²).
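Worked solution for example 2 (my derivation): for all n ≥ 1 we have 3n² ≤ 3n² + 4n + 1 ≤ 3n² + 4n² + n² = 8n², so choosing c1 = 3, c2 = 8 and n0 = 1 gives c1·g(n) ≤ f(n) ≤ c2·g(n), hence f(n) = θ(n²).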
Little-oh (o) and Little-omega (ω)
 The little-oh (o) and little-omega (ω) notations are also used to represent upper and
lower bounds, respectively.
 The main difference from Big-Oh is that Big-Oh requires the bound to hold for some
constant c, while little-oh requires it for all constants c.
 Likewise, Ω requires the bound to hold for some constant c, while ω requires it for all
constants c.
 Hence o and ω are strict (non-tight) bounds. Why?
Cont…
 Formally:
1. Little-oh (o):
f(n) = o(g(n)) means for all c > 0 there exists some n0 ≥ 1 such that f(n) < c·g(n) for all n ≥ n0.
2. Little-omega (ω):
f(n) = ω(g(n)) means for all c > 0 there exists some n0 ≥ 1 such that f(n) > c·g(n) for all n ≥ n0.
Examples:
1. f(n) = 10n + 5 and g(n) = n. Show that f(n) is not o(g(n)).
2. f(n) = 3n² + 4n + 1. Show that f(n) is ω(n).
3. f(n) = 10n + 5 and g(n) = n². Show that f(n) is o(g(n)).
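A handy check (my addition; this limit test is standard): f(n) = o(g(n)) exactly when lim_{n→∞} f(n)/g(n) = 0, and f(n) = ω(g(n)) exactly when that limit is ∞. For example 1, lim (10n + 5)/n = 10, which is not 0, so f(n) is not o(n); for example 2, lim (3n² + 4n + 1)/n = ∞, so f(n) = ω(n); for example 3, lim (10n + 5)/n² = 0, so f(n) = o(n²).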
CHAPTER ONE: INTRODUCTION

Part-2: Amortized analysis
 Recurrence relations
 Recurrence analysis
Amortized Time Analysis
 Amortized analysis is a worst-case analysis of a sequence of operations, used to obtain a
tighter bound on the overall program as an average cost per operation, rather than analyzing
each operation in the sequence separately.
 In an amortized analysis, we average the time required to perform a sequence of data-
structure operations over all the operations performed.
 With amortized analysis, we can show that the average cost of an operation is small when
we average over a sequence of operations, even though a single operation within the
sequence might be expensive.
Cont…
 Amortized analysis can be done using three techniques:
1. Aggregate analysis: in aggregate analysis, we show that for all n, a sequence of n operations
takes worst-case time T(n) in total. In the worst case, the average cost, or amortized cost, per
operation is therefore T(n)/n.
 Note that this amortized cost applies to each operation, even when there are several types of operations
in the sequence.
2. The accounting method: in the accounting method of amortized analysis, we assign differing
charges to different operations, with some operations charged more or less than they actually
cost. We call the amount we charge an operation its amortized cost.
 When an operation’s amortized cost exceeds its actual cost, we assign the difference to specific objects
in the data structure as credit.
 Credit can help pay for later operations whose amortized cost is less than their actual cost.
Cont…
3. The potential method: instead of representing prepaid work as credit stored with specific
objects in the data structure, the potential method of amortized analysis represents the prepaid
work as “potential energy,” or just “potential,” which can be released to pay for future
operations.
 The potential method is conceptually similar to the accounting method, but in the potential method the
bank balance of a particular state is independent of the previous state.
 The potential method makes use of a potential function Ф(h), where h is a state, to:
1. derive the potential of any state independently, and
2. compute a potential difference, which shows the change in cost between two operations.

Example: use a dynamic array to demonstrate the concept (a sketch follows below).
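A minimal dynamic-array sketch (my own, in the spirit of aggregate analysis): pushing into a full array of capacity m copies m elements and doubles the capacity, so a single push can cost O(n); yet n pushes perform at most n writes plus 1 + 2 + 4 + … + n/2 < n growth copies, i.e. fewer than 3n element operations in total, which is O(1) amortized per push.

#include <iostream>

// Dynamic array with doubling. Most pushes cost 1 write; a push that
// triggers growth costs capacity extra copies, but growth happens
// rarely enough that n pushes do fewer than 3n element operations.
class DynArray {
    int* data;
    int  size;
    int  capacity;
public:
    DynArray() : data(new int[1]), size(0), capacity(1) {}
    ~DynArray() { delete[] data; }

    void push(int value) {
        if (size == capacity) {                // expensive case: grow
            int* bigger = new int[2 * capacity];
            for (int i = 0; i < size; i++)     // copy all existing elements
                bigger[i] = data[i];
            delete[] data;
            data = bigger;
            capacity *= 2;
        }
        data[size++] = value;                  // cheap case: one write
    }
};

int main() {
    DynArray a;
    for (int i = 0; i < 1000; i++)
        a.push(i);                             // O(1) amortized per push
    return 0;
}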


Homework for students: analyze the amortized time of stack operations using the three techniques!
Recurrence Analysis
 When an algorithm contains a recursive call to itself, its running time can often be
described by a recurrence.
 A recurrence is an equation or inequality that describes a function in terms of its value on
smaller inputs.
 Recurrence analysis can be carried out using any one of three methods:
1. Substitution method: here we guess a bound and then use mathematical induction to prove our guess is
correct.
2. Iterative (recursion tree) method: in this method, we convert the recurrence into a tree whose nodes
represent the costs incurred at various levels of the recursion.
3. Master theorem: this method provides bounds for recurrences of the form T(n) = aT(n/b) + f(n), where a
≥ 1, b > 1 and f(n) is a given function.
Substitution Method
 The substitution method is essentially guess and confirm: we first guess the solution and
then prove, using mathematical induction, that the solution works.
 This method is powerful, but it can obviously be applied only in cases where it is easy to
guess the form of the solution.
 Pillars of the substitution method:
1. Guess the solution: this can be achieved through experience and creativity, or by using some sort of
heuristics.
2. Use mathematical induction: we use this to find the constants and show that the solution works.
Cont…
 Example: merge sort, using the divide and conquer strategy:
MergeSort(arr, p, r)
1. if (p < r)
2.     q = (p + r)/2
3.     MergeSort(arr, p, q)
4.     MergeSort(arr, q+1, r)
5.     Merge(arr, p, q, r)
Its running time satisfies the recurrence T(n) = 2T(n/2) + n, with T(1) = 1.
 Now to solve this recurrence using the substitution method:
1. Make a heuristic guess of the solution: T(n) = n log n + n (does this guess work?).
2. Using induction, prove T(n) = n log n + n, assuming T(k) = k log k + k for all k < n.
Since k < n, we can take k = n/2, so the assumption gives T(n/2) = (n/2) log(n/2) + n/2. Replacing
T(n/2) in the original recurrence, we get
T(n) = 2((n/2) log(n/2) + n/2) + n = n(log n − 1) + n + n = n log n + n,
which matches the assumed form, so the induction goes through and T(n) = O(n log n).
Cont…
 But generally we use asymptotic notation when analyzing recurrences, showing the upper
(O) and lower (Ω) bounds separately.
Example:
show T(n) = 2T(n/2) + n is O(n log n) for the tight upper bound and Ω(n log n) for the maximum lower bound.
 Some useful techniques to consider while using the substitution method to analyze recurrences:
1. Similar recurrence: if a recurrence is similar to one that you have previously seen, like T(n) = 2T((n/2) +
5) + n, which is similar to T(n) = 2T(n/2) + n, then guess the solution as O(n log n).
2. Floor, ceil and lower-order terms: floors, ceilings and lower-order terms don’t affect the result of the
recurrence. For instance, if the recurrence is T(n) = 2T(⎡n/2⎤) + n, then the solution is still O(n log n).
3. Avoid wrong guesses: guess systematically!
4. Changing variables: at times we need to replace a complex term in the recurrence by a new variable, e.g.
T(n) = 2T(√n) + log n; solve by setting k = log n, which turns it into S(k) = 2S(k/2) + k.
Iteration Method
 In the iteration method, the basic idea is to expand the recurrence and try to express it as a
summation of terms dependent only on n and the initial conditions.
 Examples (worked expansions follow below):
1. T(n) = T(n−1) + 1
2. T(n) = 2T(n/2) + n
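Worked expansions (my derivations, assuming T(1) = 1):
1. T(n) = T(n−1) + 1 = T(n−2) + 2 = … = T(n−k) + k; stopping at k = n−1 gives T(n) = T(1) + (n−1) = n, so T(n) = O(n).
2. T(n) = 2T(n/2) + n = 4T(n/4) + 2n = … = 2^k T(n/2^k) + kn; stopping when n/2^k = 1, i.e. k = log n, gives T(n) = n·T(1) + n log n = O(n log n).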
Master Method
 If the given recurrence relation is of the form T(n) = aT(n/b) + f(n), where:
 a and b are constants such that a ≥ 1, b > 1,
 f(n) is a non-negative function, and
 n/b may be interpreted as ⎡n/b⎤ or ⎣n/b⎦,
then we can apply the Master theorem to find the running time using the three cases below.
 Simplified Master method (theorem):
Given: T(n) = aT(n/b) + O(n^k) for a ≥ 1, b > 1, k ≥ 0:
T(n) = O(n^k)           if a < b^k
T(n) = O(n^k log n)     if a = b^k
T(n) = O(n^(log_b a))   if a > b^k
(Worked examples follow below.)
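Worked applications (my own examples, under the simplified form above): for T(n) = 2T(n/2) + n, a = 2, b = 2, k = 1, so a = b^k and T(n) = O(n log n) — merge sort again. For T(n) = 4T(n/2) + n, a = 4 > b^k = 2, so T(n) = O(n^(log_2 4)) = O(n²). For T(n) = 2T(n/2) + n², a = 2 < b^k = 4, so T(n) = O(n²).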