Unit 1
Unit 1
Unit 1
1.1 Introduction:
Data structure is the branch of computer science that unleashes the knowledge of how the
data should be organized, how the flow of data should be controlled and how a data
structure should be designed and implemented to reduce the complexity and increase the
efficiency of the algorithm.
The theory of structures not only introduces you to the data structures, but also helps you
to understand and use the concept of abstraction, analyze problems step by step and
develop algorithms to solve real world problems. It enables you to design and implement
various data structures, for example, the stacks, queues, linked lists, trees and graphs.
Effective use of principles of data structures increases efficiency of algorithms to solve
problems like searching, sorting, populating and handling voluminous data.
1. Data structure helps you to understand the relationship of one data element withy
the other and organize it within memory
2. With increasing complexities in computer algorithms, the amount of data usage is
increasing, this can affect the performance of the application and can create some
areas of concern:
Processing speed: To handle very large data, high-speed processing is required,
but with growing data processor may fail to achieve required processing speed.
Data Search: Getting a particular record from database should be quick and with
optimum use of resources.
Multiple requests: To handle simultaneous requests from multiple users
Data Structure Advantages
Efficient Memory use: With efficient use of data structure memory usage can be
optimized, for e.g we can use linked list vs arrays when we are not sure about the
size of data. When there is no more use of memory, it can be released.
Reusability: Data structures can be reused, i.e. once we have implemented a
particular data structure, we can use it at any other place. Implementation of data
structures can be compiled into libraries which can be used by different clients.
Abstraction: Data structure serves as the basis of abstract data types, the data
structure defines the physical form of ADT(Abstract Data Type). ADT is theoretical
and Data structure gives physical form to them.
1.2 Data Representation
Various methods are used to represent data in computers. Hierarchical layers of data
structure are used to make the use of data structure easy and efficient. The basic unit of
data representation is a bit. The value of a bit asserts on of two values-0 or 1.
Eight bits together form one byte which represents a character and one or more than one
characters are used to form a string.
Atomic Type
An atomic type data is a data structure that contains only the data items and not pointers.
Thus, for a list of data items, several atomic type nodes may exist each with a single data
item corresponding to one of the legal data types. The list is maintained using a list of node
which contains pointers to these atomic nodes and a type indicator indicating the type of
atomic node to which it points. Whenever a test node is inserted in the list, its address is
stored in the next free element of the list of pointers.
Fig above shows a list of atomic nodes maintained using list of nodes. In each node, type
represents the type of data stored in the atomic node to which the list node points. 1 stands
for integers type, 2 for real numbers and 3 for character type or any different assumption
can be made at implementation level to indicate different data types.
1.7 Difference between Abstract Data Types, Data Types and data structures
An abstract data type is the specification of the data type which specifies the logical and
mathematical model of the data type.
A data type is implementation of an abstract data type.
Data structure refers to the collection of computer variables that are connected in some
specific manner.
Application level
This level settles all details required for particular application such as names for
variables or special requirements for the operations imposed by applications.
Principles of Programming and Analysis of Algorithms: Software Engineering,
Program Design, Algorithms, Different Approaches to Designing an Algorithm, Complexity,
Big ‘O’ Notation, Algorithm Analysis, Structured Approach to Programming, Recursion, Tips
and Techniques for Writing Programs in ‘C’.
As the design stage involves taking the specification and designing solutions to the
problems, the designer needs to adopt a design strategy. The strategy adopted while
designing should be according to the given specifications.
Another important point while developing a solution strategy is that it should work
correctly in all conditions.
Generally, the people who use the system are not aware of the program design you
have adopted. Thus, there is a system manual which is a detailed guide to how the
design was achieved. In addition a user manual serves as a reference for the users
who are not familiar with system or machines.
A large program should be divided into small modules and sub modules by following
one of the two decomposition approaches -top down approach or bottom up
approach.
Other important criteria by which a program can be judged are execution time and
storage requirements..
1.11 Algorithms:
The term ‘algorithm’ refers to the sequence of instructions that must be followed to
solve a problem. An algorithm has certain characteristics. These are as follows:
Each instruction should be unique and concise.
Each instruction should be relative in nature and should not be repeated infinitely.
Repetition of same task(s) should e avoided.
The result should be available to the user after the algorithm terminates.
Thus, an algorithm is a well defined computational procedure, along with a specified set of
allowable inputs, that produce some value or set of values as output.
After an algorithm has been designed, its efficiency must be analyzed in terms of memory
required by the algorithm and computational time.
The importance of an algorithm is in the correctness and program complexity.
Top-down approach
A top down design approach starts by identifying the major components of the system or
program. Decomposing them in to their lower level components and ierating until the
desired level of module complexity is achieved.
Bottom-up approach
A bottom-up design approach starts with designing the most basic or primitive components
and proceeds to higher-level components.
Top-down Approach
1.13 Complexity
Complexity in context of computers is a characterization of the time or space requirements
for solving a problem by a particular algorithm. These requirements are expressed in terms
of a single parameter that represents the size of the problem.
The complexity of a program can be considered in to types. They are
Time complexity
Space complexity.
Time complexity is the amount of computer time needed to run to completion.
Space complexity is the amount of memory needed to run to completion.
The space needed by the program is the sum of the following components:
Fixed space requirements: this includes the instruction space, for simple variables, fixed
size structured variables, and constants.
Variable space requirements: this consists of space needed by structured variables
whose size depends on particular instance of variables. It also includes the additional space
required when the function uses recursion.
The Big O notation is used to express the upper bound of the runtime of an algorithm
and thus measure the worst-case time complexity of an algorithm. It analyses and
calculates the time and amount of memory required for the execution of an algorithm for an
input value.
Mathematically,
For a function, f(n) and another function g(n), where both functions are defined on
some unbounded set of real (positive) numbers.
Where g(n) is strictly positive for all large values of n. It can be written as:
f(n) = O(g(n))
Here, f and g are the necessary functions from positive integer to non-negative real
numbers.
For example:
As we know that the runtime performance of an algorithm depends on the input size of
n. Let's see some mathematical examples for making the runtime analysis of an algorithm
for different size of n:
o n = 20
log (20) = 2.996;
20 = 20;
20 log (20) = 59.9;
202 = 400;
220 = 1084576;
20! = 2.432902 + 1818;
o n = 10
log (10) = 1;
10 = 10;
10 log (10) = 10;
102 = 100;
210 = 1024;
10! = 3628800;
Thus, similarly, we calculate the runtime performance of an algorithm. Let's see
some algorithmic examples and see the runtime analysis of those algorithms:
o For Linear Search, the runtime complexity is O(n).
o For binary search, the runtime complexity is O(log n).
o For Bubble Sort, Selection Sort, Insertion Sort, Bucket Sort, the runtime
complexity is O(n^c).
o For Exponential algorithms such as Tower of Hanoi, the runtime complexity is
O(c^n).
o For Heap Sort, Merge Sort, the runtime complexity is O(n log n).
It is essential to determine both runtime and space complexity for an algorithm. It's
because on analyzing the runtime performance of the algorithm, we get to know the
execution time the algorithm is taking, and on analyzing the space complexity of the
algorithm, we get to know the memory space the algorithm is occupying. Thus, for
measuring the space complexity of an algorithm, it is required to compare the worst-case
space complexities performance of the algorithm.
In order to determine the space complexity of an algorithm, the following two tasks
are necessary to be done:
Both these are two important tasks to be accomplished first then only we can calculate the
space complexity for an algorithm.
Examples of Algorithms
Below we have mentioned some algorithmic examples with their space complexities:
o For Linear Search, Bubble sort, selection sort, Heap sort, Insertion sort, and Binary
Search, the space complexity is O(1).
o For radix sort, the space complexity is O(n+k).
o For quick SortSort, the space complexity is O(n).
o For merge sort, the space complexity is O(log n).
Example of Big O Notation in C
Below we have implemented the selection sort algorithm in C and calculated the
worst-case complexity (Big O notation) of the algorithm:
However, the main concern of analysis of algorithms is the required time or performance.
Generally, we perform the following types of analysis −
Worst-case − The maximum number of steps taken on any instance of size a.
Best-case − The minimum number of steps taken on any instance of size a.
Average case − An average number of steps taken on any instance of size a.
The structured program consists of well structured and separated modules. But the entry and
exit in a Structured program is a single-time event. It means that the program uses single-entry
and single-exit elements. Therefore a structured program is well maintained, neat and clean
program. This is the reason why the Structured Programming Approach is well accepted in the
programming world.
Advantages of Structured Programming Approach:
1. Easier to read and understand
2. User Friendly
3. Easier to Maintain
4. Mainly problem based instead of being machine based
5. Development is easier as it requires less effort and time
6. Easier to Debug
7. Machine-Independent, mostly.
Disadvantages of Structured Programming Approach:
1. Since it is Machine-Independent, So it takes time to convert into machine code.
2. The converted machine code is not the same as for assembly language.
3. The program depends upon changeable factors like data-types. Therefore it needs to be
updated with the need on the go.
4. Usually the development in this approach takes longer time as it is language-dependent.
Whereas in the case of assembly language, the development takes lesser time as it is fixed
for the machine.
1.17 Recursion
Some computer programming languages allow a module or function to call itself. This
technique is known as recursion. In recursion, a function α either calls itself directly or calls a
function β that in turn calls the original function α. The function α is called recursive function.
Example − a function calling itself.
printf("%d ",value);
}
Example − a function that calls another function which in turn calls it again.
Properties
A recursive function can go infinite like a loop. To avoid infinite running of recursive function,
there are two properties that a recursive function must have −
Base criteria − There must be at least one base criteria or condition, such that, when
this condition is met the function stops calling itself recursively.
Progressive approach − The recursive calls should progress in such a way that each
time a recursive call is made it comes closer to the base criteria.