DS Unit - 1

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 23

CLASS: I B.

Sc (AI & DS)

PAPER NAME: DATA STRUCTURES

UNIT Contents

I Abstract Data Types (ADTs) – ADTs and classes – introduction


to OOP –classes in Python – inheritance – namespaces – shallow and
deep copying. Introduction to analysis of algorithms – asymptotic
notations – recursion – analyzing recursive algorithms.

II Linear Structures- List ADT – array-based implementations –


linked list implementations – singly linked lists – circularly linked lists
– doubly linked lists– applications of lists – Stack ADT – Queue ADT –
double ended queues

III Sorting and Searching-Bubble sort – selection sort – insertion sort –


merge sort– quick sort – linear search – binary search – hashing – hash
functions – collision handling – load factors, rehashing, and efficiency

IV Tree Structures - Tree ADT – Binary Tree ADT – tree traversals –


binary search trees – AVL trees – heaps – multi-way search trees.

V Graph Structures- Graph ADT – representations of graph – graph


traversals –DAG – topological ordering – shortest paths – minimum
spanning trees.

1
UNIT – I
INTRODUCTION
DATA + STRUCTURE
DATA: The data means a collection of facts, concepts or instructions in a formalized
manner suitable for communication or processing. Processed data is called as
Information.
STRUCTURE: The formation or arrangements of data in some order in the
computer’s memory.
Data structure is a storage that is used to store and organize data. It is a way of
arranging data on a computer so that it can be accessed and updated efficiently.

ABSTRACT DATA TYPE: (ADT)


 Data type is a collection of values and set of operations that can be performed on
those values.
 The collections and all such operations from a mathematical construct that can be
implemented using a particular data structure.
 ADT is a way of looking at the data arrangement focusing on what ADT does, and
ignoring how it does.
 An abstract data type simplifies program design by allowing you to focus on the
essentials of a data structure, without worrying about its implementation.
 One of the basic rules concerning programming is that no routine should ever
exceed a page. This is accomplished by breaking the problem program down into
Module.
 Each module is a logical unit and does a specific job. It is size kept small by
calling other module.
Advantage of Modularity:
1. It is easier to debug a modular small routine than large routine.
2. It is easier for several people to work on modular program simultaneously.
3. It is easier to change.

2
4. ADT is a set of operation. It is a mathematical abstraction, now where in a
ADT’s definition is there any mention of low the set of operation is
implemented.
5. Object such as list, sets and graphs along with their operation can be viewed as
ADT, just as integer, real & Booleans are data types.
6. Integers, real & Booleans ADT’s and set ADT’s is union, intersection, size and
complement.
7. Two operations are used in ADT of union & find.
ADTs and classes

A class is a new kind of data type. Each class includes data and also has the
ability to include constructors and methods. Once you have created the class
(constructor, instance variables and methods) you may create objects of this class. Use
the keyword "new" to give it a name.

ObjectClass objectName = new ObjectClass(arguments);

An abstract data type (ADT) is a mathematical abstraction of a real world thing


and can be implemented in a concrete data type in different languages.

An ADT defines operations for the given type and mathematically expresses
their behaviour. Concrete implementations of an ADT can differ from each other. In
that way classes are implementing the ADT and methods implement operations.

Classes have a slightly different terminology than ADTs and add other characteristics,
like:

 they for example may live in packages


 their members are called attributes and methods
 attributes and methods have a certain visibility constraint

Introduction to OOP
Object-oriented programming (OOP) is a method of structuring a program by
bundling related properties and behaviors into individual objects.

OOPs Concepts in Python


 Class
 Objects
 Polymorphism

3
 Encapsulation
 Inheritance
 Data Abstraction

Objects
The object is an entity that has a state and behavior associated with it. It may
be any real-world object like a mouse, keyboard, chair, table, pen, etc. Integers,
strings, floating-point numbers, even arrays, and dictionaries, are all objects.
Class
A class is a collection of objects. A class contains the blueprints or the
prototype from which the objects are being created. It is a logical entity that contains
some attributes and methods.
Inheritance
Inheritance is the capability of one class to derive or inherit the properties
from another class. The class that derives properties is called the derived class or
child class and the class from which the properties are being derived is called the
base class or parent class.
Polymorphism
Polymorphism simply means having many forms. For example, we need to
determine if the given species of birds fly or not, using polymorphism we can do this
using a single function.
Encapsulation
Encapsulation is one of the fundamental concepts in object-oriented
programming (OOP). It describes the idea of wrapping data and the methods that
work on data within one unit. This puts restrictions on accessing variables and
methods directly and can prevent the accidental modification of data.
Data Abstraction
It hides unnecessary code details from the user. Also, when we do not want to
give out sensitive parts of our code implementation and this is where data abstraction
came.

Classes in Python
A class is a collection of objects. A class contains the blueprints or the
prototype from which the objects are being created. It is a logical entity that contains
some attributes and methods.
To understand the need for creating a class let’s consider an example, let’s say
you wanted to track the number of dogs that may have different attributes like breed,
and age.

4
 Classes are created by keyword class.
 Attributes are the variables that belong to a class.
 Attributes are always public and can be accessed using the dot (.) operator.
Eg.: Myclass.Myattribute

Class Definition Syntax:


class ClassName:
# Statement-1
.
.
.
# Statement-N
Ex:
class Dog:
attr1 = "mammal"
def init (self, name):
self.name = name

def speak(self):
print("My name is {}".format(self.name))
Rodger = Dog("Rodger")
Tommy = Dog("Tommy")

# Accessing class methods


Rodger.speak()
Tommy.speak()

Output
My name is Rodger
My name is Tommy

Inheritance
Inheritance allows us to define a class that inherits all the methods and
properties from another class. Parent class is the class being inherited from, also called
base class. Child class is the class that inherits from another class, also called derived
class.

5
The benefits of inheritance are:
 It represents real-world relationships well.
 It provides the reusability of a code. We don’t have to write the same code
again and again. Also, it allows us to add more features to a class without
modifying it.
 It is transitive in nature, which means that if class B inherits from another class
A, then all the subclasses of B would automatically inherit from class A.
Types of Inheritance
1) Single Inheritance: Single-level inheritance enables a derived class to inherit
characteristics from a single-parent class.
2) Multilevel Inheritance: Multi-level inheritance enables a derived class to
inherit properties from an immediate parent class which in turn inherits
properties from his parent class.
3) Hierarchical Inheritance: Hierarchical-level inheritance enables more than
one derived class to inherit properties from a parent class.
4) Multiple Inheritance: Multiple-level inheritance enables one derived class to
inherit properties from more than one base class.
Syntax
class derived-class(base class):
<class-suite>
EX:
class Animal:
def speak(self):
print("Animal Speaking")
#child class Dog inherits the base class Animal
class Dog(Animal):
def bark(self):
print("dog barking")
d = Dog()
d.bark()
d.speak()

output:
dog barking
Animal Speaking

6
Namespaces
A namespace is an abstraction that manages all of the identifiers that are
defined in a particular scope, mapping each name to its associated value. In Python,
functions, classes, and modules are all first-class objects, and so the “value” associated
with an identifier in a namespace may in fact be a function, class, or module
A namespace is a system that has a unique name for each and every object in
Python. An object might be a variable or a method. Python itself maintains a
namespace in the form of a Python dictionary.
Types of namespaces :

1. Built-In
2. Global
3. Local

The built-in namespace contains the names of all of Python’s built-in objects. These
are available at all times when Python is running.

builtin_names = dir( builtins )

for name in builtin_names:

print(name)

The global namespace contains any names defined at the level of the main
program. Python creates the global namespace when the main program body starts,
and it remains in existence until the interpreter terminates.

That namespace is local to the function and remains in existence until the
function terminates.
EX:
myNum1 = 10
myNum2 = 10
def add(num1, num2):
temp = num1 + num2
return temp

Shallow and deep copying


In Shallow copy, a copy of the original object is stored and only the reference
address is finally copied. In Deep copy, the copy of the original object and the
repetitive copies both are stored. 2. Shallow copy is faster than Deep copy.

7
Shallow Copy
A shallow copy creates a new object which stores the reference of the original
elements.
Ex:

import copy

old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


new_list = copy.copy(old_list)

print("Old list:", old_list)


print("New list:", new_list)
output

Old list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


New list: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Deep Copy
A deep copy creates a new object and recursively adds the copies of nested objects presen
Ex:

import copy

old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]


new_list = copy.deepcopy(old_list)

print("Old list:", old_list)


print("New list:", new_list)

output

Old list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]


New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

8
Introduction to analysis of algorithms
ALGORITHM: Algorithms, on the other hand, are used to manipulate the data
contained in the data structures.
Algorithm analysis is an important part of computational complexity theory,
which provides theoretical estimation for the required resources of an algorithm to
solve a specific computational problem.
Most algorithms are designed to work with inputs of arbitrary length. Analysis
of algorithms is the determination of the amount of time and space resources required
to execute it.
The efficiency or running time of an algorithm is stated as a function relating
the input length to the number of steps, known as time complexity, or volume of
memory, known as space complexity.

Asymptotic Notations
Asymptotic notations are used to represent the complexities of algorithms for
asymptotic analysis. These notations are mathematical tools to represent the
complexities. There are three notations that are commonly used.
Big Oh Notation
Big-Oh (O) notation gives an upper bound for a function f(n) to within a
constant factor.
We write f(n) = O(g(n)), If there are positive constantsn0 and c such that, to the
right of n0 the f(n) always lies on or below c*g(n).
O(g(n)) = { f(n) : There exist positive constant c and n0 such that 0 ≤ f(n) ≤ c
g(n), for all n ≥ n0}
Big Omega Notation
Big-Omega (Ω) notation gives a lower bound for a function f(n) to within a
constant factor.
We write f(n) = Ω(g(n)), If there are positive constantsn0 and c such that, to the
right of n0 the f(n) always lies on or above c*g(n).
Ω(g(n)) = { f(n) : There exist positive constant c and n0 such that 0 ≤ c g(n) ≤
f(n), for all n ≥ n0}
Big Theta Notation
Big-Theta(Θ) notation gives bound for a function f(n) to within a constant
factor.

9
Big-Theta(Θ) Big-Omega (Ω) Big-Oh (O)

Recursion
Recursion can be defined as the process of defining something in terms of
itself. In simple words, it is a process in which a function calls itself directly or
indirectly.
Properties:
The algorithms must have a base case.
The algorithms should call themselves recursively.
The algorithm must change its state and move toward the base case.
Advantages
 A complicated function can be split down into smaller sub-problems utilizing
recursion.
 Sequence creation is simpler through recursion than utilizing any nested
iteration.
 Recursive functions render the code look simple and effective.
DisAdvantages
 A lot of memory and time is taken through recursive calls which makes it
expensive for use.
 Recursive functions are challenging to debug.
 The reasoning behind recursion can sometimes be tough to think through.

Syntax:
def func(): <--
|
| (recursive call)
|
func() ----

10
Ex:
def
function()
: x = 10

Analyzing Recursive Algorithm


A recursive algorithm simplifies a problem by breaking it down into sub-
problems of the same type. The output of one recursion becomes the input for another
recursion.

Procedure for Recursive Algorithm


1. Specify problem size
2. Identify basic operation
3. Worst, best, average case
4. Write recursive relation for the number of basic operation. Don't forget the initial
conditions (IC)
5. Solve recursive relation and order of growth
 Recursive computations can be implemented for:
1. Calculating for factorial value.
2. Tower of Hanoi problem

FACTORIAL CALCULATION
 The calculation of factorial value for an integer n is:
n! = n * (n-1) * (n-2) * …….* 3 * 2 * 1 [Or] n! = n * (n-1)
FACTORIAL (N)
Steps:
1. if (N =0) then fact = 1
2. else fact = N * FACTORIAL(N-1)
3. end if
4. return
5. stop
Steps:
1. val = N, top = 0, addr = step 10
2. PUSH (val, addr) //initialize the stack
3. val = val – 1, addr = step 7 //next value and return address
4. if (val= 0) then
1. fact = 1
2. Go to step 8
5. else

11
1. PUSH (val, addr)
2. Go to step 3
6. fact = val * fact
7. val = POP_PARAM(), addr = POP_ADDR()
8. go to addr
9. return (fact)
Tower Of Hanoi Problem
 Tower of Hanoi is an interesting problem invented by the French Mathematician
Edward Lucas in 1883.
 Consider that there are three pillars Source (A), Temp (B) and Target (C).
 There are N discs of decreasing size, so that no two discs are of the same size.
 Initially all the discs are stacked in pillar Source in their decreasing order of size.
 Other two pillars are empty.
 The problem is to move all the discs from one pillar to other using third pillar as
intermediate, so that
 Only one disc may be moved at a time.
 A disc may be moved from any pillar to another.
 At no time can a larger disc be placed on a smaller disc.

Algorithm:
 To solve the problem, the following steps can be followed :
1) Move the top N – 1 discs from Source to Temp (using Target as
intermediary pillar).
TOH ( N – 1 , Source , Target , Temp )
2) Move the bottom disc from Source to
Target. Move from Source to Target
3) Move N – 1 discs from Temp to Target (using Source as an
intermediary pillar).

Example: TOH ( N – 1 , Temp , Source , Target )


(a) Initial stage

Source Temp Target

(b) Move the top N – 1 discs from Source to Temp (using Target as intermediary
pillar)
12
(i)

(ii)
Source Temp Target

(iii) Source Temp Target

Source Temp Target

(c) Move the bottom disc from Source to


Target (iv)

Source Temp Target

(d)Move N-1 dics from temp to target

(i)

Source Temp Target


(ii)

(iii)
Source Temp Target

Source Temp Target

---------------------------------- UNIT I ----------------------------------------------

13
UNIT II
Linear Structures:

List ADT:

Lists are linear data structures in which data is stored in a non - continuous fashion.
List consists of data storage boxes called 'nodes'. These nodes are linked to each other
node consists of the address of some other block.

Ex: lists of integers, lists of characters, lists of payroll records, even lists of lists.

Arraybased implementations

An array is a linear data structure that collects elements of the same data type
and stores them in contiguous and adjacent memory locations.

Arrays work on an index system starting from 0 to (n-1), where n is the size of
the array.

Basic operations
o Traversal - This operation is used to print the elements of the array.
o Insertion - It is used to add an element at a particular index.
o Deletion - It is used to delete an element from a particular index.
o Search - It is used to search an element using the given index or by the value.
o Update - It updates an element at a particular index.

Insertion Operation
In the insertion operation, we are adding one or more elements to the array. Based on the requirement, a new
element can be added at the beginning, end, or any given index of array. This is done using input statements of
the programming languages.

Algorithm
Following is an algorithm to insert elements into a Linear Array until we reach the end of the array −
1. Start
2. Create an Array of a desired datatype and size.
3. Initialize a variable ‘i’ as 0.
4. Enter the element at ith index of the array.
5. Increment i by 1.
6. Repeat Steps 4 & 5 until the end of the array.
7. Stop

Deletion Operation
In this array operation, we delete an element from the particular index of an array. This deletion operation takes
place as we assign the value in the consequent index to the current index.
14
Algorithm
Consider LA is a linear array with N elements and K is a positive integer such that K<=N. Following is the
algorithm to delete an element available at the Kth position of LA.
1. Start
2. Set J = K
3. Repeat steps 4 and 5 while J < N
4. Set LA[J] = LA[J + 1]
5. Set J = J+1
6. Set N = N-1
7. Stop

Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
The array elements after deletion :
LA[0] = 1
LA[1] = 5

Search Operation
Searching an element in the array using a key; The key element sequentially compares every value in the array
to check if the key is present in the array or not.

Algorithm
Consider LA is a linear array with N elements and K is a positive integer such that K<=N. Following is the
algorithm to find an element with a value of ITEM using sequential search.
1. Start
2. Set J = 0
3. Repeat steps 4 and 5 while J < N
4. IF LA[J] is equal ITEM THEN GOTO STEP 6
5. Set J = J +1
6. PRINT J, ITEM
7. Stop

The original array elements are :


LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
Found element 5 at position 3

Traversal Operation
This operation traverses through all the elements of an array. We use loop statements to carry this out.

Algorithm
Following is the algorithm to traverse through all the elements present in a Linear Array −
1 Start
2. Initialize an Array of certain size and datatype.
3. Initialize another variable ‘i’ with 0.
4. Print the ith value in the array and increment i.
5. Repeat Step 4 until the end of the array is reached.
6. End
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8

Update Operation
15
Update operation refers to updating an existing element from the array at a given index.

Algorithm
Consider LA is a linear array with N elements and K is a positive integer such that K<=N. Following is the
algorithm to update an element available at the Kth position of LA.
1. Start
2. Set LA[K-1] = ITEM
3. Stop
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
The array elements after updation :
LA[0] = 1
LA[1] = 3
LA[2] = 10
Linked Lists:

Linked lists are one of the most commonly used data structures in any
programming language.Linked Lists, on the other hand, are different. Linked
lists, do not store data at contiguous memory locations. For each item in the
memory location, linked list stores value of the item and the reference or
pointer to the next item. One pair of the linked list item and the reference to
next item constitutes a node.

The following are different types of linked lists.

Single Linked List



Doubly Linked List

Circular Linked List

Singly Linked Lists
Singly linked lists contain two “buckets” in one node; one bucket holds the data and the other bucket holds the
address of the next node of the list. Traversals can be done in one direction only as there is only a single link
between two nodes of the same list.

Basic Operations in the Linked Lists


The basic operations in the linked lists are insertion, deletion, searching, display, and deleting an element at a
given key. These operations are performed on Singly Linked Lists as given below −
 Insertion − Adds an element in the list.
 Deletion − Deletes an element in the list.

Insertion

There are three possible positions where we can enter a new node in a single linked list –

 Insertion at beginning
16
 Insertion after nth position
 Insertion at end

Insertion at beginning
Imagine our linked list is not necessarily sorted and there is no reason to insert a new node in any special place
in the list. Then we have an easiest place to insert the node is at the beginning of the list. An algorithm that does
so follows.

Step 1: IF PTR = NULL

WriteOVERFLOW
GotoStep7
[END OF IF]

Step 2: SET NEW_NODE = PTR

Step 3: SET PTR = PTR → NEXT

Step 4: SET NEW_NODE → DATA = VAL

Step 5: SET NEW_NODE → NEXT = HEAD

Step 6: SET HEAD = NEW_NODE

Step 7: EXIT

Insertion at ending
To insert element in linked list last we would use the following steps to insert a new Node at the last of
the single linked list.

17
Step 1: IF PTR = NULL Write OVERFLOW
Go to Step 1
[END OF IF]

Step 2: SET NEW_NODE = PTR

Step 3: SET PTR = PTR - > NEXT

Step 4: SET NEW_NODE - > DATA = VAL

Step 5: SET NEW_NODE - > NEXT = NULL

Step 6: SET PTR = HEAD

Step 7: Repeat Step 8 while PTR - > NEXT != NULL

Step 8: SET PTR = PTR - > NEXT


[END OF LOOP]

Step 9: SET PTR - > NEXT = NEW_NODE

Step 10: EXIT

Insertion at After nth position node


First we will create a new node named by newnode and put the position where u want to insert the node.

Now give the address of the new node in previous node means link the new node with previous node.

After this, give the address of current node in new node.Means link your new node also with current node.

Algorithm
 Traverse the Linked List till the nth node
 Allocate the data and memory for the new node
 Assign the next pointer of the specific nth position node to this new node
 Assign the next pointer of this newNode to (n+1)th node (Nth node’s next)

18
Deletion

 Deletion at beginning
 Deletion at middle
 Deletion at last

Deletion at beginning
So in this diagram Node, A had been deleted, and after delete, the node will be shown in the diagram.

Algorithm

 Assign next node in a linked list as the new head


 free previous head

Deletion at middle

Algorithm

 Search the Node to be deleted and call it temp


 Store previous node call is as previous
 Assign previous->next to temp->next
 Free(temp)

19

Deletion at last

 Search the Node to be deleted and call it temp


 Store previous node call is as previous
 Assign previous->next to temp->next
 In this case, the temp->next will be Null so no difference from deletion at middle
 Free(temp)

Doubly Linked List

A doubly linked list is a linked data structure that consists of a set of


sequentially linked records called nodes.

Each node contains three fields: two link fields (references to the previous
and to the next node in the sequence of nodes) and one data field.

Advantages of Doubly Linked List

1. Traversal can be done on either side means both in forward as well as backward.

2. Deletion Operation is more efficient if the pointer to delete node is given.

Disadvantages of Linked List

1. Since it requires extra pointer that is the previous pointer to store previous node reference.

2. After each operation like insertion-deletion, it requires an extra pointer that is a


20
previous pointer which needs to be maintained.

So, a typical node in the doubly linked list consists of three fields:

o Data represents the data value stored in the node.


o Previous represents a pointer that points to the previous node.
o Next represents a pointer that points to the next node in the list.
Insertion

There are three possible positions where we can enter a new node in a single linked list –

 Insertion at beginning
 Insertion after nth position
 Insertion at end

Insertion at beginning

1. Firstly, allocate a new node (say new_node).


2. Now put the required data in the new node.
3. Make the next of new_node point to the current head of the doubly linked list.
4. Make the previous of the current head point to new_node.
5. Lastly, point head to new_node.

Insertion after nth position

1. Firstly create a new node (say new_node).


2. Now insert the data in the new node.
3. Point the next of new_node to the next of prev_node.
4. Point the next of prev_node to new_node.
5. Point the previous of new_node to prev_node.
6. Change the pointer of the new node’s previous pointer to new_node.

21
Insertion at end

1. Create a new node (say new_node).


2. Put the value in the new node.
3. Make the next pointer of new_node as null.
4. If the list is empty, make new_node as the head.
5. Otherwise, travel to the end of the linked list.
6. Now make the next pointer of last node point to new_node.
7. Change the previous pointer of new_node to the last node of the list.

Deletion

 Deletion at beginning
 Deletion at middle
 Deletion at last

Deletion at beginning

STEP 1:
IF HEAD = NULL
WRITE UNDERFLOW
GOTO STEP 6
STEP 2: SET PTR = HEAD
STEP 3: SET HEAD = HEAD → NEXT
STEP 4: SET HEAD → PREV = NULL
STEP 5: FREE PTR
STEP 6: EXIT

22
23

You might also like