Chapter 2
Algorithm analysis refers to the process of determining how much computing time and storage an algorithm will require. In other words, it is the process of predicting the resource requirements of an algorithm in a given environment.
A given problem can usually be solved by many possible algorithms, so one has to be able to choose the best algorithm for the problem at hand using some scientific method. To classify data structures and algorithms as good, we need precise ways of analyzing them in terms of their resource requirements. The main resources are:
• Running Time
• Memory Usage
• Communication Bandwidth
Running time is usually treated as the most important, since computational time is the most precious resource in most problem domains.
• What to measure and what to analyze
• The running time of a program is affected by:
– The compiler used
– The computer used
– The quality of the compiler
• Although these are important, we will not deal with them in this course
– The size of the input
• For most algorithms, the running time depends on the size of the input
– That is, the running time of most algorithms varies with the input and typically grows with the input size
• Generally, we seek an upper bound on the running time, because everybody likes a guarantee
Empirical Analysis
• Is an experimental study: it involves implementing the algorithm and testing it on various instances
– Run the program with data sets (inputs) of varying size and composition
– Use a method like clock() (or System.currentTimeMillis() in Java) to get an accurate measure of the running time
• The result comes from the measured running time of the program
• It is necessary to implement and test the algorithm in order to determine its running time
• On its own, it cannot be used to compare the efficiency of algorithms, because the result varies with
– the specific processor speed
– the input size
– the SW environment
– So, in order to compare two algorithms, the same hardware and software environment must be used
• Experiments can be done only on a limited set of inputs, so the results may not be indicative of the running time on other inputs not included in the experiment
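For illustration, here is a minimal sketch of such an experiment in C++ using clock() as mentioned above. The function doWork is a hypothetical stand-in for whatever algorithm is being measured; it is not part of these notes:

#include <ctime>
#include <iostream>
using namespace std;

// Hypothetical algorithm under test: any function of the input size n.
long doWork(long n) {
    long k = 0;
    for (long i = 0; i < n; i++)
        k = k + 1;               // some unit of work repeated n times
    return k;
}

int main() {
    // Run the program on inputs of varying size and time each run.
    for (long n = 100000; n <= 100000000; n *= 10) {
        clock_t start = clock();           // CPU time before the run
        long result = doWork(n);           // keep the result so the work is not optimized away
        clock_t stop = clock();            // CPU time after the run
        double seconds = double(stop - start) / CLOCKS_PER_SEC;
        cout << "n = " << n << ", result = " << result
             << ", time = " << seconds << " s" << endl;
    }
    return 0;
}

Note that the numbers printed depend on the specific processor, compiler, and software environment, which is precisely why such measurements can only compare algorithms run under identical conditions.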
Theoretical Analysis
• Uses a high-level description of the algorithm instead of testing one of its implementations
• That is, running time is expressed as T(n) for some function T on the input size n
• Rather than measuring time directly, it counts the number of operations (expressed in time units), because the number of operations does not vary with
– Processor speed
– SW environment
• The only factor that affects the measure of complexity is the input size n of the algorithm
• As n → ∞, what matters is the order of magnitude of T(n)
Primitive operations:
– Basic computations performed by an algorithm
– Identifiable in pseudocode
– Examples include:
• Evaluating an expression
• Arithmetic operations
• Assignment operations
• Loop iterations
• Comparisons
Order of Magnitude
• Refers to the rate at which the storage or time grows as a function of the problem size
• Input: T(n), the timing function
• Output: the order of T(n), written as O(T(n))
• The output helps to determine the category to which the algorithm belongs, relative to some known functions (functions whose complexity is known through research and experiment)
• Thus, the output is expressed in terms of its relationship to some known functions
Accordingly, we can analyze an algorithm in terms of the number of operations required, rather than in terms of an absolute amount of time. This shows how an algorithm's efficiency changes as the size of the input grows.
Complexity Analysis
Complexity analysis is the systematic study of the cost of computation, measured either in time units, in operations performed, or in the amount of storage space required.
The goal is to have a meaningful measure that permits comparison of algorithms independent of
operating platform.
There are two things to consider:
Time Complexity: Determine the approximate number of operations required to solve a
problem of size n.
Space Complexity: Determine the approximate memory required to solve a problem of
size n.
Complexity analysis involves two distinct phases:
Algorithm Analysis: analysis of the algorithm or data structure to produce a function T(n) that describes the algorithm in terms of the operations performed, in order to measure its complexity.
Order of Magnitude Analysis: analysis of the function T(n) to determine the general complexity category to which it belongs.
There is no generally accepted set of rules for algorithm analysis. However, an exact count of operations is commonly used, together with rules such as the following:
• The running time of a loop is the running time of the statements inside the loop multiplied by the number of iterations.
• The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.
• For nested loops, analyze inside out.
• Always assume that the loop executes the maximum number of iterations possible.
• The running time of a function call is 1 for setup + the time for any parameter calculations + the time required for the execution of the function body.
• Consecutive statements: add the time complexities of each statement.
• Switch statement: take the complexity of the most expensive case.
Examples:

1. int count() {
       int k = 0;                      // 1 for the assignment statement: int k=0
       cout << "Enter an integer";     // 1 for the output statement
       cin >> n;                       // 1 for the input statement
       for (i = 0; i < n; i++)         // 1 assignment, n+1 tests, and n increments
           k = k + 1;                  // n loops of 2 units each: an assignment and an addition
       return 0;                       // 1 for the return statement
   }
   T(n) = 1+1+1+(1+(n+1)+n)+2n+1 = 4n+6 = O(n)
2. void func()
   {
       int x = 0;                          // 1 for the first assignment statement: x=0
       int i = 0;                          // 1 for the second assignment statement: i=0
       int j = 1;                          // 1 for the third assignment statement: j=1
       cout << "Enter an Integer value";   // 1 for the output statement
       cin >> n;                           // 1 for the input statement
       while (i < n) {                     // n+1 tests
           x++;                            // n loops of 2 units for the two
           i++;                            // increment (addition) operations
       }
       while (j < n)                       // n tests
       {
           j++;                            // n-1 increments
       }
   }
   T(n) = 1+1+1+1+1+(n+1)+2n+n+(n-1) = 5n+5 = O(n)
In the above examples we have seen that such analysis can get complex. However, it can be simplified by using a formal approach, in which case we can ignore initializations, loop control, and bookkeeping.
• In general, a for loop translates to a summation. The index and bounds of the summation
are the same as the index and bounds of the for loop.
for (int i = 1; i <= N; i++) {
    sum = sum + i;
}

$\sum_{i=1}^{N} 1 = N$
• Suppose we count the number of additions that are done. There is 1 addition per iteration of
the loop, hence N additions in total.
• Nested for loops translate into multiple summations, one for each for loop.
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum + i + j;
    }
}

$\sum_{i=1}^{N} \sum_{j=1}^{M} 2 = \sum_{i=1}^{N} 2M = 2MN$
• Again, count the number of additions. The outer summation is for the outer for loop.
Conditionals: Formally
• If (test) s1 else s2: Compute the maximum of the running time for s1 and s2.
if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum + i;
    }
}
else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum + i + j;
        }
    }
}

$\max\left(\sum_{i=1}^{N} 1,\ \sum_{i=1}^{N}\sum_{j=1}^{N} 2\right) = \max(N,\ 2N^2) = 2N^2$
Example:
Suppose we have hardware capable of executing 10^6 instructions per second. How long would it take to execute an algorithm whose complexity function is T(n) = 2n^2 on an input of size n = 10^8?
The total number of operations to be performed would be T(10^8):
T(10^8) = 2*(10^8)^2 = 2*10^16
The required number of seconds would be given by T(10^8)/10^6, so:
2*10^16 / 10^6 = 2*10^10 seconds (well over 600 years!)
Best case analysis
We analyze the performance of the algorithm under the circumstances in which it works best. In that way, we can determine the lower bound of its running time. However, you should note that we may obtain these results only under very unusual or special circumstances, and it may be difficult to find the optimum input data for such an analysis.
The best case complexity of the algorithm is the function defined by the minimum number of steps taken on any instance of size n.
• The best case is defined by the input of size n that is cheapest among all inputs of size n
• The best case is obtained from the input data set that results in the best possible performance
Input
– Assumes the input data are found in the most advantageous order for the algorithm
– For sorting, the most advantageous order is that data are arranged in the required order
– For searching, the best case is that the required item is found at the first position
– Question
• Which order is the most advantageous for an algorithm that sorts 2 5 8 1 4
(ascending order)
Input size
Assumes that the input size is the minimum possible
• This case causes the fewest number of operations to be executed
• So, it computes the lower bound of T(n); you cannot do better
• Is the amount of time the algorithm takes on the smallest possible set of inputs
• Best case behavior is usually not informative and hence usually uninteresting (why?)
• The function is denoted by Tbest(n) – the best case running time of an algorithm for n inputs
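As an illustration (a sketch added for this discussion, not code from the notes), consider a linear search; the best case occurs when the required item sits at the first position, so Tbest(n) = 1 = O(1):

// Returns the index of target in list[0..n-1], or -1 if it is absent.
int linearSearch(int list[], int n, int target) {
    for (int i = 0; i < n; i++)
        if (list[i] == target)   // best case: target == list[0], so the loop
            return i;            // body runs once and only 1 comparison is made
    return -1;                   // worst case: target is last or missing
}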
Average case analysis
This gives an indication of how the algorithm performs with an average data set. It is possible to make this analysis by taking all possible combinations of data, experimenting with them, and finally averaging the results. However, such an analysis may not reflect the exact behavior you can expect of the algorithm on a real-life data set.
Nevertheless, this analysis gives you a better idea of how the algorithm will work for your problem.
The average-case complexity of the algorithm is the function defined by an average number of
steps taken on any instance of size n.
• Is the complexity of an algorithm averaged over all inputs of each size
• Varies between the best case and worst case
• Gives the average performance of an algorithm
• Computes the bound of T(n) that is observed most of the time
• The average case can be as bad as the worst case
Input (i.e., arrangement of data)
• Assumes that data are arranged in random order
• For searching, it assumes that the required item is found at any position or can be missing
• For sorting, it assumes that data are arranged in a random order
• Input size
• The input size can be any random number as small as the best case or as large as the
worst case
• Problems with performing an average case analysis
• It may not be apparent what constitutes an "average" input for a particular problem; there is often no clue as to what an average input looks like
• It is difficult to envision the data that would cause an algorithm (e.g., an algorithm for insertion sort) to exhibit its average behavior
• The average case is therefore difficult to determine
• The average operation count is often
• quite difficult to determine or define
• quite difficult to analyze
• Often we assume that all inputs of a given size are equally likely
• In practice, this assumption may be violated, but randomized algorithms can sometimes force it to hold
• As a result, in several of the algorithms, we limit our analysis to determining the best and worst counts
• We focus on the worst case running time
• Easier to analyze
• Crucial to applications such as games, finance and robotics
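To make this concrete, here is the average-case count for the linear search sketched earlier, worked here under the equal-likelihood assumption just mentioned. If the target is equally likely to be at any of the n positions, the expected number of comparisons is

\[
T_{avg}(n) = \frac{1}{n}\sum_{i=1}^{n} i = \frac{1}{n}\cdot\frac{n(n+1)}{2} = \frac{n+1}{2} = O(n),
\]

which indeed lies between the best case (1 comparison) and the worst case (n comparisons).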
Worst case analysis
In contrast to the best-case analysis, this gives you an indication of how badly the algorithm can perform, or in other words, an upper bound on its running time.
Sometimes, this could be useful in determining the applicability of an algorithm in a mission-critical application. However, this analysis may be too pessimistic for a general application, and it may be difficult to find a test data set that produces the worst case.
• Is the complexity of an algorithm based on worst input of each size
• Is the amount of time the algorithm takes on the worst possible set of inputs
• Is the longest running time for any input of size n
Input (i.e., arrangement of data)
• Assumes that the data are arranged in the most disadvantageous order
• For sorting, the worst case assumes that the data are arranged in the opposite of the required order
• For searching, the worst case assumes that the required item is found at the last position or is missing
Input size
• Assumes the input size to be a very large number; that is, it considers n → ∞
• This case causes the highest number of executions to be performed by the algorithm
• So, it computes the upper bound of T(n), the longest running time for any input of size n
• Commonly used for computing T(n)
• Is a function and is denoted by Tworst(n) – the worst case running time of the algorithm for n inputs
The worst case complexity of the algorithm is the function defined by the maximum number of
steps taken on any instance of size n.
We are interested in the worst-case time, since it provides a bound for all input – this is called the
“Big-Oh” estimate.
Typically, the worst case analysis is used to estimate the complexity of an algorithm (That is, we
compare algorithms at their worst case)
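For the linear search sketched earlier, for example, the worst case occurs when the required item is at the last position or missing, so all n elements are compared: Tworst(n) = n = O(n).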
Unless otherwise specified, running time is the worst case
Asymptotic Analysis
Asymptotic analysis is concerned with how the running time of an algorithm increases with the
size of the input in the limit, as the size of the input increases without bound.
There are five notations used to describe a running time function: Big-Oh notation (O), Big-Omega notation (Ω), Theta notation (Θ), little-o notation (o), and little-omega notation (ω).
Big-Oh Notation
Big-Oh notation is a way of comparing algorithms and is used for computing the complexity of algorithms, i.e., the amount of time that it takes for a computer program to run. It is only concerned with what happens for very large values of n. Therefore, only the largest term in the expression (function) is needed. For example, if the number of operations in an algorithm is n^2 – n, then n is insignificant compared to n^2 for large values of n, and hence the n term is ignored. Of course, for small values of n it may be important; however, Big-Oh is mainly concerned with large values of n.
Formal Definition: f(n) = O(g(n)) if there exist constants c, k ∈ ℝ⁺ such that for all n ≥ k, f(n) ≤ c·g(n).
Examples: The following points are facts that you can use for Big-Oh problems:
1. f(n) = 10n+5 and g(n) = n. Show that f(n) is O(g(n)).
To show that f(n) is O(g(n)) we must find constants c and k such that f(n) ≤ c·g(n) for all n ≥ k. Since 10n+5 ≤ 10n+5n = 15n for all n ≥ 1, we can take (c=15, k=1).
Typical Orders
Here is a table of some typical cases. This uses logarithms to base 2, but these are simply proportional to logarithms in other bases.

N       O(1)   O(log N)   O(N)     O(N log N)   O(N^2)       O(N^3)
1,024   1      10         1,024    10,240       1,048,576    1,073,741,824
Demonstrating that a function f(n) is big-O of a function g(n) requires that we find specific
constants c and k for which the inequality holds (and show that the inequality does in fact hold).
Big-O expresses an upper bound on the growth rate of a function, for sufficiently large values of
n.
An upper bound is the best algorithmic solution that has been found for a problem.
“ What is the best that we know we can do?”
Exercise:
f(n) = (3/2)n^2 + (5/2)n – 3
Show that f(n) = O(n^2)
In simple words, f(n) = O(g(n)) means that the growth rate of f(n) is less than or equal to that of g(n).
Big-O Theorems
For all the following theorems, assume that f(n) is a function of n and that k is an arbitrary
constant.
Theorem 1: k is O(1)
Theorem 2: A polynomial is O(the term containing the highest power of n).
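For instance (an illustrative example added here), Theorem 2 gives 3n^4 + 5n^2 + 7 = O(n^4), since n^4 is the term containing the highest power of n.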
log_b n is O(log_d n) for all b, d > 1 (all logarithms grow at the same rate, to within a constant factor)
Exponential functions grow faster than powers, i.e., n^k is O(b^n) for b > 1 and k ≥ 0
E.g., n^20 is O(1.05^n)
Big-Omega Notation
f(n) = Ω(g(n)) means that f(n) is greater than or equal to some constant multiple of g(n) for all values of n greater than or equal to some k.
Formal Definition: f(n) = Ω(g(n)) if there exist constants c, k ∈ ℝ⁺ such that f(n) ≥ c·g(n) for all n ≥ k.
In simple terms, f(n) = Ω(g(n)) means that the growth rate of f(n) is greater than or equal to that of g(n).
Theta Notation
A function f(n) belongs to the set Θ(g(n)) if there exist positive constants c1 and c2 such that it can be sandwiched between c1·g(n) and c2·g(n) for sufficiently large values of n.
Formal Definition: A function f(n) is Θ(g(n)) if it is both O(g(n)) and Ω(g(n)). In other words, there exist constants c1, c2, and k > 0 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ k.
In simple terms, f(n) = Θ(g(n)) means that f(n) and g(n) have the same rate of growth.
Example: let f(n) = 2n^2. Then:
f(n) = O(n^4)
f(n) = O(n^3)
f(n) = O(n^2)
All these are technically correct, but the last expression is the best and tightest one. Since 2n^2 and n^2 have the same growth rate, it can also be written as f(n) = Θ(n^2).
Little-o Notation
f(n) = o(g(n)) means that for all c > 0 there exists some k > 0 such that f(n) < c·g(n) for all n ≥ k.
Informally, f(n)=o(g(n)) means f(n) becomes insignificant relative to g(n) as n approaches
infinity.
Example: f(n) = 3n + 4 is o(n^2).
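A quick verification (worked here for illustration): given any c > 0, note that 3n + 4 ≤ 7n for all n ≥ 1, so it suffices to have 7n < c·n^2, i.e., n > 7/c. Taking k = max(1, 8/c) therefore gives

\[
3n + 4 \;\le\; 7n \;<\; c\,n^2 \quad \text{for all } n \ge k,
\]

which is exactly the definition of f(n) = o(n^2).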
Little-Omega (ω) Notation
Little-omega (ω) notation is to big-omega (Ω) notation as little-o notation is to Big-Oh notation. We use ω notation to denote a lower bound that is not asymptotically tight.
Formal Definition: f(n) = ω(g(n)) if for every constant c > 0 there exists a constant k > 0 such that c·g(n) < f(n) for all n ≥ k.
Relational properties of the asymptotic notations:
Transitivity: if f(n) = Θ(g(n)) and g(n) = Θ(h(n)), then f(n) = Θ(h(n)); the same holds for O, Ω, o, and ω.
Symmetry: f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
Transpose symmetry: f(n) = O(g(n)) if and only if g(n) = Ω(f(n)).
Reflexivity:
• f(n) = Θ(f(n))
• f(n) = O(f(n))
• f(n) = Ω(f(n))
Exercises
Determine the run time equation and complexity of each of the following code segments.
1. for (i=0;i<n;i++)
for (j=0;j<n; j++)
sum=sum+i+j;
What is the value of sum if n=100?
3. int k=0;
for (int i=0; i<n; i++)
for (int j=i; j<n; j++)
k++;
What is the value of k when n is equal to 20?
4. int k=0;
for (int i=1; i<n; i*=2)
for(int j=1; j<n; j++)
k++;
What is the value of k when n is equal to 20?
5. int x=0;
for(int i=1;i<n;i=i+5)
x++;
What is the value of x when n=25?
6. int x=0;
for(int k=n;k>=n/3;k=k-5)
x++;
What is the value of x when n=25?
7. int x=0;
for (int i=1; i<n;i=i+5)
for (int k=n;k>=n/3;k=k-5)
x++;
What is the value of x when n=25?
8. int x=0;
for(int i=1;i<n;i=i+5)
for(int j=0;j<i;j++)
for(int k=n;k>=n/2;k=k-3)
x++;
What is the correct big-Oh Notation for the above code segment?