Analysis of Algorithm


2 Algorithm :

2.1 Definition :
The word Algorithm comes from the name of the Persian author Abu Ja'far Muhammad
ibn Musa al-Khwarizmi (c. 825 A.D.), who wrote a textbook on mathematics (see
Fig. 1.5). This word has taken on special significance in computer science, where Algorithm has
come to refer to a method that can be used by a computer for the solution of a problem. This
is what distinguishes algorithm from words such as process, technique or method.

Fig. 1.5 : Abu Ja'far Muhammad ibn Musa al-Khwarizmi (c. 825 A.D.)

An Algorithm is a finite set of instructions that, if followed, accomplishes a
particular task in a finite amount of time. It is good software engineering practice to
design algorithms before we write a program. In addition, all algorithms must satisfy the
following criteria (Refer Fig. 1.6).

Fig. 1.6

• Input : All algorithms should have some input. The logic of the algorithm should
work on this input to give the desired result.
• Output : At least one output should be produced by the algorithm based on the input
given.
Example :
If we write a program to check whether a given number is prime or not, we should get an
output such as ‘number is prime’ or ‘number is not prime’.
• Definiteness : Every step of the algorithm should be clear and free of ambiguity.
• Finiteness : Every algorithm should have a proper end. The algorithm cannot run
indefinitely.
Example :
If we write an algorithm to find the number of primes between 10 and 100, the output
should give the number of primes and execution should end. The computation should not get
into an infinite loop.

• Effectiveness : Every step in the algorithm should be easy to understand and
implementable in any programming language.
2.2 How to Write an Algorithm :

An algorithm is described as a series of steps of basic operations. These steps must be
performed in sequence, and each step of the algorithm is labelled.
The algorithm is the backbone of any implementation in a desired programming
language, and one can implement an algorithm very effectively if a systematic method is
followed in writing it. To achieve this systematic approach, we use algorithmic notations.
Let us see one example of an algorithm, which finds the GCD of two positive integers.
Refer Fig. 1.7.

Fig. 1.7

Step 1 : Start.
Step 2 : Read two positive integers and store them in X and Y.
Step 3 : Divide X by Y. Let the remainder be R and the quotient be Q.
Step 4 : If R is zero then go to step 8.
Step 5 : Assign Y to X.
Step 6 : Assign R to Y.
Step 7 : Go to Step 3.
Step 8 : Print Y (the required GCD).
Step 9 : Stop.
The steps mentioned in the above algorithm are simple and unambiguous. Anybody
carrying out these steps will clearly know what to do in each one. Hence, the above algorithm
satisfies the definiteness property of an algorithm.
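
The steps above can also be turned into a runnable program. Below is a minimal C sketch of the same algorithm; the function name gcd and the use of the % operator to carry out Step 3 are our own choices, not part of the original algorithm.

#include <stdio.h>

/* Compute the GCD of two positive integers, following Steps 2-9 above. */
int gcd(int x, int y)
{
    int r = x % y;        /* Step 3 : remainder of X divided by Y */
    while (r != 0) {      /* Step 4 : stop when the remainder is zero */
        x = y;            /* Step 5 : assign Y to X */
        y = r;            /* Step 6 : assign R to Y */
        r = x % y;        /* Step 7 : go back to Step 3 */
    }
    return y;             /* Step 8 : Y holds the required GCD */
}

int main(void)
{
    int x, y;
    if (scanf("%d %d", &x, &y) == 2)   /* Step 2 : read two positive integers */
        printf("GCD = %d\n", gcd(x, y));
    return 0;             /* Step 9 : stop */
}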

2.3 Algorithmic Strategies :


An algorithmic strategy is a general approach by which many problems can be solved
algorithmically.
Various algorithmic strategies are :
1. Divide and conquer : In this strategy the problem is divided into smaller subproblems,
and these subproblems are solved to obtain the solution to the main problem (see the
sketch after this list).
2. Dynamic programming : The problem is divided into smaller instances, and the results
of smaller recurring instances are reused to solve the problem.
3. Greedy technique : From a set of obtained solutions the best possible solution is
chosen each time, to obtain the final solution.
4. Backtracking : In this method, a trial-and-error approach is followed in order to obtain
the solution.
Using any of these suitable methods, the problem can be solved.
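
As an illustration of strategy 1 (divide and conquer), here is a minimal C sketch of our own; the task of finding the maximum element of an array and the function name max_in are illustrative assumptions, not taken from the text.

/* Divide and conquer : find the maximum of arr[lo..hi].
   The range is divided into two halves, each half is solved
   recursively, and the two sub-results are combined. */
int max_in(const int arr[], int lo, int hi)
{
    if (lo == hi)                          /* smallest subproblem */
        return arr[lo];
    int mid = lo + (hi - lo) / 2;          /* divide */
    int left  = max_in(arr, lo, mid);      /* conquer left half */
    int right = max_in(arr, mid + 1, hi);  /* conquer right half */
    return (left > right) ? left : right;  /* combine */
}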
2.4 Purpose of Analysis of Algorithm :

When a programmer builds an algorithm during the design phase of the software development
life cycle, he/she might not be able to implement it immediately, because programming comes
in a later part of the life cycle. There is therefore a need to analyze the algorithm at that
stage. This helps in forecasting the execution time and the amount of primary memory the
algorithm might occupy when it is implemented.

2.5 What is Analysis of Algorithm ?


After designing an algorithm, it has to be checked and its correctness needs to be
predicted; this is done by analyzing the algorithm. The algorithm can be analyzed by tracing
all step-by-step instructions, reading the algorithm for logical correctness, and testing it on
some data using mathematical techniques to prove it correct. Another type of analysis is to
analyze the simplicity of the algorithm: that is, to design the algorithm in a simple way so
that it becomes easier to implement. However, the simplest and most straightforward way of
solving a problem may not always be the best one. Moreover, there may be more than one
algorithm to solve a problem. The choice of a particular algorithm depends on the following
performance analyses and measurements :

1. Space complexity
2. Time complexity

Analysis of an algorithm means developing a formula or prediction of how fast the algorithm
works, based on the problem size.

The problem size could be :
• The number of inputs/outputs in an algorithm.
Example :
For a sorting algorithm, the number of inputs is the total number of elements to be
arranged in a specific order. The number of outputs is the total number of sorted elements.
• The number of operations involved in the algorithm.
Example :
For a searching algorithm, the number of operations is equal to the total number of
comparisons made with the search element.
• If we are searching for an element in an array having ‘n’ elements, the problem size is
the number of elements in the array to be searched. Here the problem size and the
input size are the same and are equal to ‘n’.
• If we are sorting the elements of an array, there might be some copy operations (swaps)
involved. The problem size could be the number of elements in the array or the number
of copies performed during sorting.
• If two arrays of sizes n and m are merged, the problem size is the sum of the two array
sizes (= n + m).
• If the nth factorial is being computed, the problem size is n.
2.5.1 Space Complexity :
Analysis of algorithms is also defined as the process of determining a formula for
predicting the memory requirements (primary memory) based on the problem size. This is
called the Space Complexity of the algorithm. That is, space complexity is defined as the
amount of memory required for running an algorithm.
Some of the reasons for studying space complexity are :
1. If the program is to run on a multi-user system, it may be required to specify the
amount of memory to be allocated to the program.
2. We may be interested in knowing in advance whether sufficient memory is
available to run the program.
3. There may be several possible solutions with different space requirements.
4. Space complexity can be used to estimate the size of the largest problem that a
program can solve.

The space needed by a program consists of the following components.

• Instruction space : Space needed to store the executable version of the program;
this space is fixed.

• Data space : Space needed to store all constants and variable values. It has the
following components :
(a) Space needed by constants and simple variables. This space is fixed.
(b) Space needed by fixed-size structured variables, such as arrays and structures.
(c) Dynamically allocated space. This space usually varies.

• Environment stack space : This space is needed to store the information required to
resume the suspended (partially completed) functions. Each time a function is invoked,
the following data is saved on the environment stack :
(a) Return address : i.e., the point from where execution has to resume after
completion of the called function.
(b) Values of all local variables and the values of the formal parameters in the
function being invoked.
The amount of space needed by a recursive function is called the recursion stack space.
For each recursive function, this space depends on the space needed by the local variables
and the formal parameters. In addition, this space depends on the maximum depth of the
recursion, i.e., the maximum number of nested recursive calls.
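
For instance, consider a recursive factorial function (a standard example, used here as our own illustration rather than taken from the text). Each nested call saves a return address and its parameter on the environment stack, and the maximum recursion depth is n, so the recursion stack space grows linearly with n.

/* Recursive factorial : each invocation saves a return address and
   the parameter n on the environment stack. The maximum number of
   nested calls is n, so the recursion stack space is proportional to n. */
unsigned long factorial(unsigned int n)
{
    if (n <= 1)
        return 1;                  /* base case : recursion stops here */
    return n * factorial(n - 1);   /* one more stack frame per call */
}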

2.5.2 Time Complexity :


The time complexity of an algorithm or a program is the amount of time it needs to
run to completion. The exact time will depend on the implementation of the algorithm,
the programming language, the optimizing capabilities of the compiler used, the CPU speed,
other hardware characteristics/specifications and so on. To measure the time complexity
accurately, we would have to count all sorts of operations performed in an algorithm. If we
know the time for each of the primitive operations performed on a given computer, we can
easily compute the time taken by an algorithm to complete its execution. This time will vary
from machine to machine. By analyzing an algorithm, it is hard to come up with an exact
time required: to find the exact time complexity, we would need to know the exact
instructions executed by the hardware and the time required for each instruction. The time
complexity also depends on the amount of data input to an algorithm. But we can calculate
the order of magnitude of the time required.

That is, our intention is to estimate the execution time of an algorithm irrespective of
the computer on which it will be used. Here, the more sophisticated method is to
identify the key operations and count how many times they are performed until the program
completes its execution. A key operation in an algorithm is an operation that takes the
maximum time among all possible operations in the algorithm. Such an abstract, theoretical
approach is not only useful for discussing and comparing algorithms, but also useful for
improving solutions to practical problems. The time complexity can then be expressed as a
function of the number of key operations performed.
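
For example, in a searching algorithm the key operation is the comparison made with the search element, and in a sorting algorithm it is the comparison between two elements; the time complexity is then expressed in terms of how many such comparisons are performed.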

In summary, analysis of an algorithm is aimed at determining the time complexity and the
space complexity of the algorithm.

2.6 Order of Magnitude of an Algorithm :

The order of magnitude of an algorithm is the sum of the number of times each of its
statements is executed.
Consider the following code snippet to understand the order of magnitude of an algorithm :
Example 1 :

for (i = 0; i < n; i++)
{
    /* assume there are ‘c’ statements inside the loop */
    .......
    .......
}
In the algorithm containing the above piece of code it is assumed that :
• There are ‘c’ statements inside the for loop.
• Each of the ‘c’ statements takes one unit of time.
Note : These assumptions are made because information about the target machine is not
known when the algorithm is built, so it is not possible to say how much time each
statement in the algorithm takes to execute. The exact time depends on the machine on
which the algorithm is run.


Based on the above assumptions :
• Total time of execution for one iteration of the loop = c*1 = c.
• Since the loop is executed ‘n’ times,
• Total time of execution = n*c.
• Thus ‘n*c’ is defined as the order of magnitude of the algorithm. Since ‘c’ is a constant
(while analyzing the algorithm, the exact number of steps is not known before
programming; it is assumed that the number of steps is constant for each algorithm
and only the number of iterations changes), the order of magnitude is approximated
to be equal to ‘n’. This approximation holds because for higher values of ‘n’ the effect
of the constant ‘c’ is not significant, so the constant can be ignored.
Example 2 :
for (i = 0; i < n; i++)
{
    for (j = 0; j < m; j++)
    {
        /* assume there are ‘c’ statements inside the inner loop */
        ........
        ........
    }
}
In the above example, based on the earlier assumptions :
• The total time for one execution of the inner loop = c*1 = c.
• Since the inner loop is executed ‘m’ times,
• The total time of execution of the inner loop = m*c.
• As per the nested-loop concept, for every iteration of the outer loop the inner loop is
executed ‘m’ times.
• Since the outer loop is executed ‘n’ times, we have
• The total time of execution of the algorithm = n*(m*c).
• Since ‘c’ is a constant, the order of magnitude of the above algorithm is approximately
equal to ‘n*m’.
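
To make this counting concrete, here is a small C sketch of our own that instruments the nested loops with a counter and confirms that the loop body executes exactly n*m times.

#include <stdio.h>

int main(void)
{
    int n = 4, m = 3;     /* illustrative problem sizes */
    long count = 0;       /* counts executions of the loop body */

    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            count++;      /* stands in for the ‘c’ statements */

    printf("body executed %ld times (n*m = %d)\n", count, n * m);
    return 0;
}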
The calculation of the order of magnitude in the examples discussed above is an example of
a priori analysis of the algorithm.

2.7 Worst Case, Average Case and Best Case Running Time of an
Algorithm :

While analyzing an algorithm based on a priori analysis principles, three different cases are
identified. They are :
1. Best case
2. Average case
3. Worst case

It is important to note that this classification is purely based on the nature of the problem for
which we are developing the algorithm.
The best case is the amount of time a program might be expected to take on the best
possible input data.
The average case is the amount of time a program might be expected to take on typical
(or average) input data.
The worst case is the amount of time a program would take on the worst possible input
configuration.
Example : Assume there are 100 students in a class.
• The worst case of the problem is none of the students clearing the exams.
• The best case of the problem is all 100 students clearing the exams.
• The average case of the problem is around 50 students clearing the exams.

2.8 Mathematical Notation for Determination of the Running Time of an Algorithm :

For practical problems it is difficult to count the number of operations per instruction for the
worst-case/average-case/best-case running time. Hence there is a need for mathematical
notations to determine the worst-case, average-case and best-case running times.

Asymptotic Notations (O, Ω, Θ) :

In order to analyse which data structure or algorithm is best suited for the job, it is
necessary to understand how a data structure or algorithm behaves over time and space. An
important tool in this analysis is asymptotic notation.

2.8.1 Big-Oh notation- ‘O’ Notation :

It is defined as the rate at which the time of execution grows in terms of the problem size. Refer
Fig. 1.8.

Fig. 1.8

More formally, it can be defined as :

• For non-negative functions f(n) and g(n), the function f(n) = O(g(n)) if there are positive
constants c and n0 such that f(n) ≤ c * g(n) for all n ≥ n0.

• This notation is known as Big-Oh notation. If graphed, g(n) serves as an upper bound to the curve
you are analysing, f(n). It describes the worst that can happen for a given data size.
• f(n) is the time of execution of an algorithm for different problem sizes n.
• g(n) is the mathematical representation of the algorithm as a function of problem size n.
This should be read as : ‘f of n’ is ‘Big Oh’ of ‘g of n’.
While relating f(n) with g(n), the relation ‘≤’ is used. This relation signifies an upper bound.
The time of computation f(n) can be equal to or less than c * g(n); it cannot go beyond that
value of c * g(n).
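
As a worked example of our own (not from the text) : if f(n) = 3n + 2, then f(n) ≤ 4n for all n ≥ 2, so f(n) = O(n) with constants c = 4 and n0 = 2.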
The threshold problem size n0 is the minimum problem size beyond which we can predict the
behaviour of the algorithm with respect to performance.
The threshold problem size is considered while arriving at the worst-case complexity
equation (i.e., f(n)), because algorithms might have initializations etc., which are ignored in
a priori analysis. The time taken to execute assignments and data movements is also ignored,
since during a priori analysis the machine on which the program will run is not known.
Big-Oh is a characteristic scheme that measures properties of an algorithm's complexity:
its performance and/or memory requirements. The algorithm complexity can be determined
by eliminating constant factors in the analysis of the algorithm. Clearly, the complexity
function f(n) of an algorithm increases as ‘n’ increases. Big-Oh can be used to represent the
worst-case, average-case or best-case behaviour of an algorithm.
Let us find the algorithm complexity by analyzing the sequential search algorithm. In the
sequential search algorithm we simply try to match the target value against each value in the
array. This process continues until we find a match or finish scanning all the elements in the
array. If the array contains ‘n’ elements, the maximum possible number of comparisons with
the target value will be ‘n’, i.e., the worst case: the target value is found at the nth position
of the array.
f(n) = n
i.e., the worst case is when the algorithm requires the maximum number of iterations or
steps to find the target value in the array.
The best case is when the number of steps is as small as possible. If the target value is
found at the first position of the array (i.e., we need to compare the target value with only
one element), we have found the element by executing only one iteration (or the least
possible number of statements):
f(n) = 1

The average case falls between these two extremes (i.e., best and worst). If the target value
is found at the (n/2)th position, on average we need to compare the target value with only
half of the elements in the array, so
f(n) = n/2
The complexity function f(n) of an algorithm increases as ‘n’ increases. The function
f(n) = O(n) can be read as “f of n is Big-Oh of n” or as “f(n) is of the order of n”. The total
running time (or time complexity) includes the initializations and the several other iterative
statements through the loop.
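
The sequential search analysed above can be sketched in C as follows; the function name seq_search and its return convention are our own illustrative choices.

/* Sequential search : return the index of target in arr[0..n-1],
   or -1 if it is absent. The key operation is the comparison
   arr[i] == target : best case 1 comparison (target at the first
   position), worst case n comparisons (target at the last position
   or not present at all). */
int seq_search(const int arr[], int n, int target)
{
    for (int i = 0; i < n; i++) {
        if (arr[i] == target)   /* key operation : one comparison */
            return i;           /* best case : found immediately */
    }
    return -1;                  /* worst case : scanned all n elements */
}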

Based on the time complexity representation of the big Oh notation, the algorithm can
be categorized as :
1. Constant time O(1)
2. Logarithmic time O(log n)
3. Linear time O(n)
4. Polynomial time O(n^c), where c > 1
5. Exponential time O(c^n)
Limitations of Big-Oh Notation
Big-Oh notation has the following two basic limitations :
1. It makes no effort to improve the programming methodology. Big-Oh notation does not
discuss the ways and means to improve the efficiency of the program, but it helps to
analyze and calculate the efficiency (by finding the time complexity) of the program.
2. It does not exhibit the effect of the constants. For example, suppose one algorithm takes
1000n^2 time to execute and another takes n^3. The first algorithm is O(n^2), which
implies that it will take less time than the other algorithm, which is O(n^3). However,
in actual execution the second algorithm will be faster for n < 1000.

2.8.2 Omega notation- ‘Ω’ Notation :


• It represents the lower bound for the time of execution of an algorithm.
• It represents the most optimal algorithm for the given problem.
The best-case running time of an algorithm is determined when the algorithm has
the best possible conditions for execution. The best-case running time of an algorithm is
represented by a mathematical equation as follows :

For non-negative functions f(n) and g(n), the function f(n) = Ω(g(n)) if
there are positive constants c and n0 such that f(n) ≥ c * g(n) for all n ≥ n0,
where f(n) is the time of execution of an algorithm for different values of
problem size n.
This should be read as : ‘f of n’ is ‘Omega’ of ‘g of n’.
Here the function g(n) is the lower bound for f(n). This means that for any value of n (n ≥ n0),
the time of computation f(n) always lies on or above the graph of c * g(n), so g(n) serves as
the lower bound for f(n). It describes the best that can happen for a given data size.
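
As a worked example of our own : if f(n) = 3n + 2, then f(n) ≥ 3n for all n ≥ 1, so f(n) = Ω(n) with c = 3 and n0 = 1.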
2.8.3 Theta Notation - ‘Θ’ Notation :

• The average-case running time of an algorithm is determined when the algorithm has
average conditions for execution. Θ-notation is used to denote both upper and lower
bounds.

To understand the average condition, let us consider the search example. If the number
to be searched is in the middle position of the search array, we call this condition the average
condition.

The average-case notation is again represented by a mathematical equation as follows :

For non-negative functions f(n) and g(n), the function f(n) = Θ(g(n)) if there
exist positive constants c1, c2 and n0 such that
c1 * g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0, where
• f(n) is the time of execution of an algorithm for different problem
sizes n
• g(n) is the mathematical representation of an algorithm as a
function of problem size n.
This should be read as : ‘f of n’ is ‘Theta’ of ‘g of n’.
The Theta notation is more precise than both the Big-Oh and Omega notations. The
function f(n) = Θ(g(n)) iff g(n) is both an upper and a lower bound on f(n).
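
Continuing the worked example used for the Big-Oh and Omega notations (ours, not from the text) : if f(n) = 3n + 2, then 3n ≤ 3n + 2 ≤ 4n for all n ≥ 2, so f(n) = Θ(n) with c1 = 3, c2 = 4 and n0 = 2.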
Some Common Time Complexity Examples :
1. O(1) -- Constant computing time
2. O(n) -- Linear
3. O(n^2) -- Quadratic
4. O(n^3) -- Cubic
5. O(2^n) -- Exponential
• If an algorithm takes time O(log n), it is faster, for sufficiently large n, than if it had
taken O(n).
• Similarly, O(n log n) is better than O(n^2) but not as good as O(n). These seven
computing times -- O(1), O(log n), O(n), O(n log n), O(n^2), O(n^3) and O(2^n) -- are
the ones we see most often in this book.
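
For concreteness, here is a small comparison table (values computed by us, with log taken to base 2) showing how these functions grow :

n     log n   n log n   n^2     n^3      2^n
8     3       24        64      512      256
16    4       64        256     4096     65536
32    5       160       1024    32768    about 4.3 * 10^9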

2.9 Assumption while Finding the Time Complexity :


The leading constant of the highest power of ‘n’ and all lower powers of ‘n’ are ignored in g(n).
Example :
Consider the following equation for the time complexity of an algorithm :
f(n) = O(100n^3 + 29n^2 + 19n) …(1)
Here the leading constant of the highest power of ‘n’ is 100, and it can be ignored. Based on
this assumption, Equation (1) can be rewritten as :
f(n) = O(n^3 + 29n^2 + 19n) …(2)
All the lower powers of n (i.e., 29n^2 + 19n) can also be ignored completely, so the above
equation is rewritten as :
f(n) = O(n^3)
