Data Science With Python - Intermediate Level V2.0


Data Science with Python

Intermediate Level
INDEX
PYTHON OOP

PYTHON CLASSES AND OBJECTS
CONSTRUCTORS IN PYTHON
DESTRUCTORS IN PYTHON
INHERITANCE IN PYTHON
ENCAPSULATION IN PYTHON
POLYMORPHISM IN PYTHON
CLASS OR STATIC VARIABLES IN PYTHON
CLASS METHOD VS STATIC METHOD IN PYTHON

REGULAR EXPRESSION IN PYTHON

METACHARACTERS
SPECIAL SEQUENCES
REGEX MODULE IN PYTHON
MATCH OBJECT
SEARCH, MATCH AND FIND ALL
VERBOSE IN PYTHON REGEX
PASSWORD VALIDATION IN PYTHON

PYTHON COLLECTIONS

COUNTERS
ORDEREDDICT IN PYTHON
DEFAULTDICT IN PYTHON
CHAINMAP IN PYTHON
NAMEDTUPLE IN PYTHON
DEQUE IN PYTHON
HEAP QUEUE (OR HEAPQ) IN PYTHON
COLLECTIONS.USERDICT IN PYTHON
COLLECTIONS.USERLIST IN PYTHON
COLLECTIONS.USERSTRING IN PYTHON

INTERACTING WITH THE OS

HANDLING THE CURRENT WORKING DIRECTORY
CREATING A DIRECTORY
DELETING DIRECTORY OR FILES USING PYTHON
COMMONLY USED FUNCTIONS

ITERATORS IN PYTHON

ITERATOR FUNCTIONS IN PYTHON
PYTHON __ITER__() AND __NEXT__() | CONVERTING AN OBJECT INTO AN ITERATOR
PYTHON | DIFFERENCE BETWEEN ITERABLE AND ITERATOR

PYTHON DEBUGGER – PYTHON PDB

STARTING PYTHON DEBUGGER

AUTOMATED SOFTWARE TESTING WITH PYTHON

THE ‘UNITTEST’ MODULE
THE “NOSE2” MODULE
THE “PYTEST” MODULE

UNIT TESTING IN PYTHON – UNITTEST
Python OOP

Python Classes and Objects

A class is a user-defined blueprint or prototype from which objects are created. Classes
provide a means of bundling data and functionality together. Creating a new class creates a
new type of object, allowing new instances of that type to be made. Each class instance can
have attributes attached to it for maintaining its state. Class instances can also have methods
(defined by their class) for modifying their state.
To understand the need for classes, consider an example. Suppose you want to keep track of
a number of dogs, each with attributes such as breed and age. If a list is used, the first
element could be the dog’s breed and the second its age. With 100 different dogs, how would
you know which element is supposed to be which? What if you wanted to add other properties
to these dogs? This approach lacks organization, which is exactly the need that classes address.
A class creates a user-defined data structure that holds its own data members and member
functions, which can be accessed and used by creating an instance of that class. A class is like
a blueprint for an object.
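
To make the contrast concrete, the sketch below stores the same two dogs first as bare lists and then as instances of a small class. The sample breeds and ages are made up for illustration, and the __init__ method it uses is explained later in this chapter.

```python
# Dogs as plain lists: the meaning of each position is only a convention.
dogs_as_lists = [["pug", 7], ["bulldog", 3]]
first_breed = dogs_as_lists[0][0]   # index 0 is the breed, by convention only


class Dog:
    """A dog with named attributes instead of positional slots."""

    def __init__(self, breed, age):
        self.breed = breed
        self.age = age


dogs = [Dog("pug", 7), Dog("bulldog", 3)]

print(first_breed)      # pug
print(dogs[0].breed)    # pug
print(dogs[1].age)      # 3
```

With the class, adding another property later means adding one more named attribute rather than remembering one more list position.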

Some points on Python class:


• Classes are created with the keyword class.
• Attributes are the variables that belong to a class.
• Attributes are always public and can be accessed using the dot (.) operator, e.g.
Myclass.Myattribute
Class Definition Syntax:
class ClassName:
    # Statement-1
    .
    .
    .
    # Statement-N
Defining a class –

# Python3 program to
# demonstrate defining
# a class
class Dog:
    pass

In the above example, the class keyword indicates that you are creating a class followed by
the name of the class (Dog in this case).
Class Objects
An object is an instance of a class. A class is like a blueprint, while an instance is a copy of the
class with actual values. It’s not an idea anymore; it’s an actual dog, like a seven-year-old pug.
You can create many instances to represent many different dogs, but without the class as a
guide you would be lost, not knowing what information is required.
An object consists of:

• State: It is represented by the attributes of an object. It also reflects the properties of
an object.
• Behavior: It is represented by the methods of an object. It also reflects the response
of an object to other objects.
• Identity: It gives a unique name to an object and enables one object to interact with
other objects.

Declaring Objects (Also called instantiating a class)

When an object of a class is created, the class is said to be instantiated. All the instances share
the attributes and the behavior of the class. But the values of those attributes, i.e. the state
are unique for each object. A single class may have any number of instances.
Example:

Declaring an object –
# Python3 program to
# demonstrate instantiating
# a class
class Dog:

    # A simple class
    # attribute
    attr1 = "mammal"
    attr2 = "dog"

    # A sample method
    def fun(self):
        print("I'm a", self.attr1)
        print("I'm a", self.attr2)

# Driver code
# Object instantiation
Rodger = Dog()

# Accessing class attributes
# and method through objects
print(Rodger.attr1)
Rodger.fun()

Output:
mammal
I'm a mammal
I'm a dog

In the above example, an object is created: a dog named Rodger. The class has only two
class attributes, which tell us that Rodger is a dog and a mammal.
The self
• Methods defined in a class must have an extra first parameter in the method definition.
We do not give a value for this parameter when we call the method; Python provides it.
• Even a method that takes no arguments must still declare this one parameter,
conventionally named self.
• This is similar to the this pointer in C++ and the this reference in Java.
When we call a method of an object as myobject.method(arg1, arg2), Python automatically
converts the call into MyClass.method(myobject, arg1, arg2). This is all the special self is
about.
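
The equivalence between the two call forms can be checked directly. The following is a small sketch; the class Greeter and its greet method are hypothetical names chosen only for this illustration.

```python
class Greeter:
    def greet(self, greeting, name):
        # self is the instance the method was called on
        return f"{greeting}, {name}! (from {type(self).__name__})"


g = Greeter()

via_instance = g.greet("Hello", "Rodger")          # Python passes g as self
via_class = Greeter.greet(g, "Hello", "Rodger")    # self passed explicitly

print(via_instance)                # Hello, Rodger! (from Greeter)
print(via_instance == via_class)   # True
```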
__init__ method
The __init__ method is similar to constructors in C++ and Java. Constructors are used to
initialize the object’s state. Like methods, a constructor contains a collection of
statements (i.e. instructions) that are executed at the time of object creation. It runs as soon
as an object of a class is instantiated. The method is useful for any initialization you want
to do with your object.

# A Sample class with init method
class Person:

    # init method or constructor
    def __init__(self, name):
        self.name = name

    # Sample Method
    def say_hi(self):
        print('Hello, my name is', self.name)

p = Person('Nikhil')
p.say_hi()

Output:
Hello, my name is Nikhil

Class and Instance Variables


Instance variables hold data that is unique to each instance, while class variables hold
attributes shared by all instances of the class. Instance variables are variables whose value
is assigned inside a constructor or method through self, whereas class variables are variables
whose value is assigned directly in the class body.
Defining instance variables using a constructor.
# Python3 program to show that the variables with a value
# assigned in the class declaration, are class variables and
# variables inside methods and constructors are instance
# variables.

# Class for Dog
class Dog:

    # Class Variable
    animal = 'dog'

    # The init method or constructor
    def __init__(self, breed, color):

        # Instance Variable
        self.breed = breed
        self.color = color

# Objects of Dog class
Rodger = Dog("Pug", "brown")
Buzo = Dog("Bulldog", "black")

print('Rodger details:')
print('Rodger is a', Rodger.animal)
print('Breed: ', Rodger.breed)
print('Color: ', Rodger.color)

print('\nBuzo details:')
print('Buzo is a', Buzo.animal)
print('Breed: ', Buzo.breed)
print('Color: ', Buzo.color)

# Class variables can be accessed using class
# name also
print("\nAccessing class variable using class name")
print(Dog.animal)

Output:
Rodger details:
Rodger is a dog
Breed: Pug
Color: brown

Buzo details:
Buzo is a dog
Breed: Bulldog
Color: black

Accessing class variable using class name
dog

Defining instance variables using a normal method.

# Python3 program to show that we can create
# instance variables inside methods

# Class for Dog
class Dog:

    # Class Variable
    animal = 'dog'

    # The init method or constructor
    def __init__(self, breed):

        # Instance Variable
        self.breed = breed

    # Adds an instance variable
    def setColor(self, color):
        self.color = color

    # Retrieves instance variable
    def getColor(self):
        return self.color

# Driver Code
Rodger = Dog("pug")
Rodger.setColor("brown")
print(Rodger.getColor())

Constructors in Python

Constructors are used to set up an object when it is instantiated. The task of a constructor is to
initialize (assign values to) the data members of the class when an object of the class is
created. In Python the __init__() method is called the constructor and is always called when
an object is created.
Syntax of constructor declaration:
def __init__(self):
    # body of the constructor

Types of constructors:

• default constructor: The default constructor is a simple constructor which doesn’t
accept any arguments. Its definition has only one parameter, which is a reference to
the instance being constructed.
• parameterized constructor: A constructor with parameters is known as a parameterized
constructor. The parameterized constructor takes its first argument as a reference to
the instance being constructed, known as self, and the rest of the arguments are
provided by the programmer.
Example of default constructor:

class GeekforGeeks:

    # default constructor
    def __init__(self):
        self.geek = "GeekforGeeks"

    # a method for printing data members
    def print_Geek(self):
        print(self.geek)

# creating object of the class
obj = GeekforGeeks()

# calling the instance method using the object obj
obj.print_Geek()

Output:
GeekforGeeks

Example of the parameterized constructor:

class Addition:
    first = 0
    second = 0
    answer = 0

    # parameterized constructor
    def __init__(self, f, s):
        self.first = f
        self.second = s

    def display(self):
        print("First number = " + str(self.first))
        print("Second number = " + str(self.second))
        print("Addition of two numbers = " + str(self.answer))

    def calculate(self):
        self.answer = self.first + self.second

# creating object of the class
# this will invoke parameterized constructor
obj = Addition(1000, 2000)

# perform Addition
obj.calculate()

# display result
obj.display()

Output:
First number = 1000
Second number = 2000
Addition of two numbers = 3000

Destructors in Python

Destructors are called when an object gets destroyed. In Python, destructors are not needed
as much as in C++ because Python has a garbage collector that handles memory management
automatically.
The __del__() method is known as a destructor method in Python. It is called when all
references to the object have been deleted, i.e. when the object is garbage collected.
Syntax of destructor declaration:
def __del__(self):
    # body of destructor
Note: A reference to an object is also deleted when the name referring to it goes out of scope
or when the program ends.
Example 1: Here is a simple example of a destructor. Using the del keyword, we delete the
only reference to the object ‘obj’, so the destructor is invoked automatically.

# Python program to illustrate destructor

class Employee:

    # Initializing
    def __init__(self):
        print('Employee created.')

    # Deleting (Calling destructor)
    def __del__(self):
        print('Destructor called, Employee deleted.')

obj = Employee()
del obj

Output:
Employee created.
Destructor called, Employee deleted.
Note: The destructor is called when all the references to the object are deleted, i.e. when the
reference count becomes zero, or when the program ends. It is not called merely because a
name referring to the object went out of scope.
Example 2: This example illustrates the note above. Notice that the destructor is called
after ‘Program End...’ is printed, because the returned object is still referenced until then.

# Python program to illustrate destructor

class Employee:

    # Initializing
    def __init__(self):
        print('Employee created')

    # Calling destructor
    def __del__(self):
        print("Destructor called")

def Create_obj():
    print('Making Object...')
    obj = Employee()
    print('function end...')
    return obj

print('Calling Create_obj() function...')
obj = Create_obj()
print('Program End...')

Output:
Calling Create_obj() function...
Making Object...
Employee created
function end...
Program End...
Destructor called
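
The role of the reference count can also be seen with two names bound to one object. This is a minimal sketch that assumes CPython, where __del__() runs as soon as the last reference disappears; other Python implementations may run it later. The events list is a helper introduced here only to record when the destructor fires.

```python
events = []  # records when the destructor runs


class Employee:
    def __del__(self):
        events.append("destructor called")


first = Employee()
second = first          # a second reference to the same object

del first               # removes one name; the object is still referenced
after_first_del = list(events)

del second              # the last reference is gone
after_second_del = list(events)

print(after_first_del)   # [] on CPython
print(after_second_del)  # ['destructor called'] on CPython
```

The del statement removes a name, not the object itself; the destructor runs only once no references remain.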
Example 3: Now, consider the following example:

# Python program to illustrate destructor

class A:
    def __init__(self, bb):
        self.b = bb

class B:
    def __init__(self):
        self.a = A(self)

    def __del__(self):
        print("die")

def fun():
    b = B()

fun()

Output:
die
In this example, when the function fun() is called, it creates an instance of class B, which passes
itself to class A; A then stores a reference back to the B instance, resulting in a circular reference.
Reference counting alone can never free such a cycle, so Python relies on its cyclic garbage
collector to detect and remove it. In Python versions before 3.4, an object in a cycle that
defined a custom destructor was marked as “uncollectable”: the collector did not know the
order in which to destroy the objects, so it left them, and they lived in memory for as long as
the application ran. Since Python 3.4 the collector can finalize such cycles, which is why “die”
is printed above, although the exact moment at which __del__() runs is not guaranteed.
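
How such a cycle behaves depends on the interpreter version. The following sketch assumes CPython 3.4 or later, where the cyclic collector is able to finalize objects that define __del__(); it uses the standard gc module to trigger a collection explicitly. The events list is a helper added for this illustration.

```python
import gc

events = []


class A:
    def __init__(self, owner):
        self.owner = owner      # back-reference to the B instance


class B:
    def __init__(self):
        self.a = A(self)        # B -> A -> B forms a reference cycle

    def __del__(self):
        events.append("die")


def fun():
    B()                         # the cycle becomes unreachable on return


gc.disable()                    # keep automatic collection from running first
fun()
before = list(events)           # reference counting alone did not free the cycle
gc.collect()                    # ask the cyclic collector to run now
after = list(events)
gc.enable()

print(before)   # []
print(after)    # ['die']
```

If an object must release a resource at a predictable time, an explicit close method or a context manager is usually a safer choice than relying on __del__().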

Inheritance in Python

Inheritance is the capability of one class to derive or inherit properties from another class.
The benefits of inheritance are:
• It represents real-world relationships well.
• It provides reusability of code. We don’t have to write the same code again and
again. Also, it allows us to add more features to a class without modifying it.
• It is transitive in nature, which means that if class B inherits from another class A, then
all the subclasses of B automatically inherit from class A.
Below is a simple example of inheritance in Python.

# A Python program to demonstrate inheritance

# Base or Super class. Note object in bracket.
# (Generally, object is made ancestor of all classes)
# In Python 3.x "class Person" is
# equivalent to "class Person(object)"
class Person(object):

    # Constructor
    def __init__(self, name):
        self.name = name

    # To get name
    def getName(self):
        return self.name

    # To check if this person is an employee
    def isEmployee(self):
        return False

# Inherited or Subclass (Note Person in bracket)
class Employee(Person):

    # Here we return true
    def isEmployee(self):
        return True

# Driver code
emp = Person("Geek1")  # An Object of Person
print(emp.getName(), emp.isEmployee())

emp = Employee("Geek2")  # An Object of Employee
print(emp.getName(), emp.isEmployee())

Output:
Geek1 False
Geek2 True

What is object class?


Like the Object class in Java, in Python (from version 3.x) object is the root of all classes.
In Python 3.x, “class Test(object)” and “class Test” are the same.
In Python 2.x, “class Test(object)” creates a class with object as parent (called a new-style class)
and “class Test” creates an old-style class (without object as parent).
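
In Python 3 this can be verified by inspecting the classes themselves. A short sketch, using the hypothetical class names Test and Explicit:

```python
class Test:                 # no explicit parent
    pass


class Explicit(object):     # parent spelled out
    pass


print(Test.__bases__)                        # (<class 'object'>,)
print(Test.__bases__ == Explicit.__bases__)  # True
print(issubclass(Test, object))              # True
```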
Subclassing (Calling constructor of parent class)
A child class needs to identify which class is its parent class. This can be done by mentioning
the parent class name in the definition of the child class.

Example:
class subclass_name(superclass_name):
    ___
    ___

# Python code to demonstrate how parent constructors
# are called.

# parent class
class Person(object):

    # __init__ is known as the constructor
    def __init__(self, name, idnumber):
        self.name = name
        self.idnumber = idnumber

    def display(self):
        print(self.name)
        print(self.idnumber)

# child class
class Employee(Person):

    def __init__(self, name, idnumber, salary, post):
        self.salary = salary
        self.post = post

        # invoking the __init__ of the parent class
        Person.__init__(self, name, idnumber)

# creation of an object variable or an instance
a = Employee('Rahul', 886012, 200000, "Intern")

# calling a function of the class Person using its instance
a.display()

Output:
Rahul
886012
‘a’ is the instance created for the class Employee. Creating it invokes Employee’s __init__(),
which in turn invokes the __init__() of the parent class Person.
You can see ‘object’ written in the declaration of the class Person. In Python, every class
inherits from a built-in basic class called ‘object’. The constructor, i.e. the ‘__init__’ function
of a class, is invoked when we create an object variable or an instance of the class.
The variables defined within __init__() are called instance variables. Hence,
‘name’ and ‘idnumber’ are the instance variables of the class Person. Similarly, ‘salary’ and
‘post’ are the instance variables of the class Employee. Since the class Employee inherits from
class Person, ‘name’ and ‘idnumber’ are also instance variables of class Employee.
If you forget to invoke the __init__() of the parent class, then its instance variables will not
be available to the child class.
The following code produces an error for this reason.

# Python program to demonstrate error if we
# forget to invoke __init__() of the parent.

class A:
    def __init__(self, n='Rahul'):
        self.name = n

class B(A):
    def __init__(self, roll):
        self.roll = roll

# named obj rather than object, to avoid shadowing the built-in object class
obj = B(23)
print(obj.name)

Output:
Traceback (most recent call last):
File "/home/de4570cca20263ac2c4149f435dba22c.py", line 12, in <module>
print(obj.name)
AttributeError: 'B' object has no attribute 'name'
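
One way to avoid this error is to call the parent constructor from the child. The sketch below uses the built-in super(), a common alternative to naming the parent class explicitly; the names A and B follow the example above.

```python
class A:
    def __init__(self, n="Rahul"):
        self.name = n


class B(A):
    def __init__(self, roll):
        super().__init__()      # runs A.__init__, which sets self.name
        self.roll = roll


obj = B(23)
print(obj.name, obj.roll)   # Rahul 23
```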
Different forms of Inheritance:
Inheritance is defined as the capability of one class to derive or inherit the properties from
some other class and use it whenever needed. Inheritance provides the following properties:

• It represents real-world relationships well.


• It provides reusability of code. We don’t have to write the same code again and again.
Also, it allows us to add more features to a class without modifying it.
• It is transitive in nature, which means that if class B inherits from another class A, then
all the subclasses of B would automatically inherit from class A.

Example:

# A Python program to demonstrate
# inheritance

# Base class or Parent class
class Child:

    # Constructor
    def __init__(self, name):
        self.name = name

    # To get name
    def getName(self):
        return self.name

    # To check if this person is student
    def isStudent(self):
        return False

# Derived class or Child class
class Student(Child):

    # True is returned
    def isStudent(self):
        return True

# Driver code
# An Object of Child
std = Child("Ram")
print(std.getName(), std.isStudent())

# An Object of Student
std = Student("Shivam")
print(std.getName(), std.isStudent())

Output:
Ram False
Shivam True
Types of Inheritance in Python
The type of inheritance depends upon the number of child and parent classes involved. There
are five types of inheritance in Python:
Single Inheritance: Single inheritance enables a derived class to inherit properties from a
single parent class, thus enabling code reusability and the addition of new features to existing
code.

Example:

# Python program to demonstrate
# single inheritance

# Base class
class Parent:
    def func1(self):
        print("This function is in parent class.")

# Derived class
class Child(Parent):
    def func2(self):
        print("This function is in child class.")

# Driver's code
# (named obj to avoid shadowing the built-in object class)
obj = Child()
obj.func1()
obj.func2()

Output:
This function is in parent class.
This function is in child class.
Multiple Inheritance: When a class is derived from more than one base class, this type
of inheritance is called multiple inheritance. In multiple inheritance, all the features of the
base classes are inherited into the derived class.
Example:

# Python program to demonstrate
# multiple inheritance

# Base class1
class Mother:
    mothername = ""

    def mother(self):
        print(self.mothername)

# Base class2
class Father:
    fathername = ""

    def father(self):
        print(self.fathername)

# Derived class
class Son(Mother, Father):
    def parents(self):
        print("Father :", self.fathername)
        print("Mother :", self.mothername)

# Driver's code
s1 = Son()
s1.fathername = "RAM"
s1.mothername = "SITA"
s1.parents()

Output:
Father : RAM
Mother : SITA
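
When two base classes define a method with the same name, Python decides which one the derived class uses by following the method resolution order (MRO), which can be inspected through the __mro__ attribute. A small sketch; the introduce method is a hypothetical addition used only to show the lookup order.

```python
class Mother:
    def introduce(self):
        return "from Mother"


class Father:
    def introduce(self):
        return "from Father"


class Son(Mother, Father):      # Mother is listed first
    pass


mro_names = [cls.__name__ for cls in Son.__mro__]

print(mro_names)            # ['Son', 'Mother', 'Father', 'object']
print(Son().introduce())    # from Mother
```

Swapping the order of the bases to Son(Father, Mother) would make the lookup find Father's version first.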
Multilevel Inheritance: In multilevel inheritance, features of the base class and the derived
class are further inherited into the new derived class. This is similar to the relationship
between a child and a grandfather.

Example:

# Python program to demonstrate
# multilevel inheritance

# Base class
class Grandfather:
    def __init__(self, grandfathername):
        self.grandfathername = grandfathername

# Intermediate class
class Father(Grandfather):
    def __init__(self, fathername, grandfathername):
        self.fathername = fathername

        # invoking constructor of Grandfather class
        Grandfather.__init__(self, grandfathername)

# Derived class
class Son(Father):
    def __init__(self, sonname, fathername, grandfathername):
        self.sonname = sonname

        # invoking constructor of Father class
        Father.__init__(self, fathername, grandfathername)

    def print_name(self):
        print('Grandfather name :', self.grandfathername)
        print("Father name :", self.fathername)
        print("Son name :", self.sonname)

# Driver code
s1 = Son('Prince', 'Rampal', 'Lal mani')
print(s1.grandfathername)
s1.print_name()
Output:
Lal mani
Grandfather name : Lal mani
Father name : Rampal
Son name : Prince
Hierarchical Inheritance: When more than one derived class is created
from a single base class, this type of inheritance is called hierarchical inheritance. In
this program, we have a parent (base) class and two child (derived) classes.

Example:

# Python program to demonstrate
# Hierarchical inheritance

# Base class
class Parent:
    def func1(self):
        print("This function is in parent class.")

# Derived class1
class Child1(Parent):
    def func2(self):
        print("This function is in child 1.")

# Derived class2
class Child2(Parent):
    def func3(self):
        print("This function is in child 2.")

# Driver's code
object1 = Child1()
object2 = Child2()
object1.func1()
object1.func2()
object2.func1()
object2.func3()

Output:
This function is in parent class.
This function is in child 1.
This function is in parent class.
This function is in child 2.
Hybrid Inheritance: Inheritance consisting of multiple types of inheritance is called hybrid
inheritance.
Example:

# Python program to demonstrate
# hybrid inheritance

class School:
    def func1(self):
        print("This function is in school.")

class Student1(School):
    def func2(self):
        print("This function is in student 1. ")

class Student2(School):
    def func3(self):
        print("This function is in student 2.")

class Student3(Student1, School):
    def func4(self):
        print("This function is in student 3.")

# Driver's code
# (named obj to avoid shadowing the built-in object class)
obj = Student3()
obj.func1()
obj.func2()

Output:
This function is in school.
This function is in student 1.

Encapsulation in Python

Encapsulation is one of the fundamental concepts in object-oriented programming (OOP). It
describes the idea of wrapping data and the methods that work on that data within one unit.
This puts restrictions on accessing variables and methods directly and can prevent the
accidental modification of data. To prevent accidental change, an object’s variable can only be
changed by that object’s methods. Variables of this kind are known as private variables.
A class is an example of encapsulation, as it encapsulates all its data, that is, member functions,
variables, etc.

Consider a real-life example of encapsulation, in a company, there are different sections like
the accounts section, finance section, sales section etc. The finance section handles all the
financial transactions and keeps records of all the data related to finance. Similarly, the sales
section handles all the sales-related activities and keeps records of all the sales. Now there
may arise a situation when for some reason an official from the finance section needs all the
data about sales in a particular month. In this case, he is not allowed to directly access the
data of the sales section. He will first have to contact some other officer in the sales section
and then request him to give the particular data. This is what encapsulation is. Here the data
of the sales section and the employees that can manipulate them are wrapped under a single
name “sales section”. Using encapsulation also hides the data. In this example, the data of
the sections like sales, finance, or accounts are hidden from any other section.
Protected members
Protected members (in C++ and Java) are those members of a class that cannot be accessed
outside the class but can be accessed from within the class and its subclasses. Python does not
enforce this; the convention is to prefix the name of the member with a single
underscore “_”.
Although a protected variable can still be accessed outside the class as well as in a derived
class (and modified there too), it is customary (a convention, not a rule) not to access
protected members from outside the class body.
Note: The __init__ method is a constructor and runs as soon as an object of a class is
instantiated.

# Python program to
# demonstrate protected members

# Creating a base class
class Base:
    def __init__(self):

        # Protected member
        self._a = 2

# Creating a derived class
class Derived(Base):
    def __init__(self):

        # Calling constructor of
        # Base class
        Base.__init__(self)
        print("Calling protected member of base class: ",
              self._a)

        # Modify the protected variable:
        self._a = 3
        print("Calling modified protected member outside class: ",
              self._a)

obj1 = Derived()
obj2 = Base()

# Calling protected member
# Can be accessed but should not be done due to convention
print("Accessing protected member of obj1: ", obj1._a)

# Accessing the protected variable outside
print("Accessing protected member of obj2: ", obj2._a)

Output:
Calling protected member of base class: 2
Calling modified protected member outside class: 3
Accessing protected member of obj1: 3
Accessing protected member of obj2: 2
Private members
Private members are similar to protected members; the difference is that class members
declared private should be accessed neither outside the class nor by any derived class. In
Python, there are no truly private instance variables that cannot be accessed except inside a
class.
However, to define a private member, prefix the member name with a double underscore “__”.
Note: Python rewrites such a name to _ClassName__member (name mangling), so a private
member can still be reached outside the class through its mangled name.
# Python program to
# demonstrate private members

# Creating a Base class
class Base:
    def __init__(self):
        self.a = "GeeksforGeeks"
        self.__c = "GeeksforGeeks"

# Creating a derived class
class Derived(Base):
    def __init__(self):

        # Calling constructor of
        # Base class
        Base.__init__(self)
        print("Calling private member of base class: ")
        print(self.__c)

# Driver code
obj1 = Base()
print(obj1.a)

# Uncommenting print(obj1.c) will
# raise an AttributeError

# Uncommenting obj2 = Derived() will
# also raise an AttributeError as
# private member of base class
# is called inside derived class

Output:
GeeksforGeeks
Traceback (most recent call last):
File "/home/f4905b43bfcf29567e360c709d3c52bd.py", line 25, in <module>
print(obj1.c)
AttributeError: 'Base' object has no attribute 'c'

Traceback (most recent call last):
File "/home/4d97a4efe3ea68e55f48f1e7c7ed39cf.py", line 27, in <module>
obj2 = Derived()
File "/home/4d97a4efe3ea68e55f48f1e7c7ed39cf.py", line 20, in __init__
print(self.__c)
AttributeError: 'Derived' object has no attribute '_Derived__c'
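
The second traceback mentions '_Derived__c' because of name mangling: inside a class body, a name such as __c is rewritten to _ClassName__c. The following sketch shows where the attribute is really stored; it reuses the Base class from the example above in a trimmed form.

```python
class Base:
    def __init__(self):
        self.__c = "GeeksforGeeks"   # stored on the instance as _Base__c


obj = Base()

has_plain = hasattr(obj, "__c")      # no attribute with that exact name exists
mangled_value = obj._Base__c         # works, but defeats the intent of "private"
stored_names = sorted(vars(obj))

print(has_plain)        # False
print(mangled_value)    # GeeksforGeeks
print(stored_names)     # ['_Base__c']
```

Mangling is meant to prevent accidental name clashes in subclasses, not to provide real access control.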

Polymorphism in Python

The word polymorphism means having many forms. In programming, polymorphism means
the same function name (but different signatures) being used for different types.

Example of inbuilt polymorphic functions:

# Python program to demonstrate in-built
# polymorphic functions

# len() being used for a string
print(len("geeks"))

# len() being used for a list
print(len([10, 20, 30]))

Output:
5
3
Examples of user-defined polymorphic functions:

# A simple Python function to demonstrate
# Polymorphism

def add(x, y, z = 0):
    return x + y + z

# Driver code
print(add(2, 3))
print(add(2, 3, 4))

Output:
5
9
Polymorphism with class methods:
The below code shows how Python can use two different class types, in the same way. We
create a for loop that iterates through a tuple of objects. Then call the methods without being
concerned about which class type each object is. We assume that these methods actually
exist in each class.

class India():
    def capital(self):
        print("New Delhi is the capital of India.")

    def language(self):
        print("Hindi is the most widely spoken language of India.")

    def type(self):
        print("India is a developing country.")

class USA():
    def capital(self):
        print("Washington, D.C. is the capital of USA.")

    def language(self):
        print("English is the primary language of USA.")

    def type(self):
        print("USA is a developed country.")

obj_ind = India()
obj_usa = USA()
for country in (obj_ind, obj_usa):
    country.capital()
    country.language()
    country.type()

Output:
New Delhi is the capital of India.
Hindi is the most widely spoken language of India.
India is a developing country.
Washington, D.C. is the capital of USA.
English is the primary language of USA.
USA is a developed country.

Polymorphism with Inheritance:


In Python, Polymorphism lets us define methods in the child class that have the same name
as the methods in the parent class. In inheritance, the child class inherits the methods from
the parent class. However, it is possible to modify a method in a child class that it has inherited
from the parent class. This is particularly useful in cases where the method inherited from the
parent class doesn’t quite fit the child class. In such cases, we re-implement the method in
the child class. This process of re-implementing a method in the child class is known
as Method Overriding.

class Bird:
    def intro(self):
        print("There are many types of birds.")

    def flight(self):
        print("Most of the birds can fly but some cannot.")

class sparrow(Bird):
    def flight(self):
        print("Sparrows can fly.")

class ostrich(Bird):
    def flight(self):
        print("Ostriches cannot fly.")

obj_bird = Bird()
obj_spr = sparrow()
obj_ost = ostrich()

obj_bird.intro()
obj_bird.flight()

obj_spr.intro()
obj_spr.flight()

obj_ost.intro()
obj_ost.flight()

Output:
There are many types of birds.
Most of the birds can fly but some cannot.
There are many types of birds.
Sparrows can fly.
There are many types of birds.
Ostriches cannot fly.

Polymorphism with a Function and objects:


It is also possible to create a function that can take any object, allowing for polymorphism. In
this example, let’s create a function called “func()” that takes an object, which we will
name “obj”. Although we use the name ‘obj’, any instantiated object can be passed to this
function. Next, let’s give the function something to do with the ‘obj’ object we passed to it.
In this case, let’s call the three methods capital(), language() and type(),
each of which is defined in the two classes ‘India’ and ‘USA’. Finally, let’s create instances
of both the ‘India’ and ‘USA’ classes if we don’t have them already. With those, we can call
their methods using the same func() function:

def func(obj):
    obj.capital()
    obj.language()
    obj.type()

obj_ind = India()
obj_usa = USA()

func(obj_ind)
func(obj_usa)

Code: Implementing Polymorphism with a Function.

class India():
    def capital(self):
        print("New Delhi is the capital of India.")

    def language(self):
        print("Hindi is the most widely spoken language of India.")

    def type(self):
        print("India is a developing country.")

class USA():
    def capital(self):
        print("Washington, D.C. is the capital of USA.")

    def language(self):
        print("English is the primary language of USA.")

    def type(self):
        print("USA is a developed country.")

def func(obj):
    obj.capital()
    obj.language()
    obj.type()

obj_ind = India()
obj_usa = USA()

func(obj_ind)
func(obj_usa)

Output:
New Delhi is the capital of India.
Hindi is the most widely spoken language of India.
India is a developing country.
Washington, D.C. is the capital of USA.
English is the primary language of USA.
USA is a developed country.

Class or Static Variables in Python

Class or static variables are shared by all objects. Instance or non-static variables are different
for different objects (every object has its own copy). For example, let a Computer Science
student be represented by the class CSStudent. The class may have a static variable whose
value is “cse” for all objects, and it may also have non-static members like name and roll. In
C++ and Java, we can use the static keyword to make a variable a class variable; variables
without a preceding static keyword are instance variables.
The Python approach is simple; it doesn’t require a static keyword.
All variables that are assigned a value in the class declaration are class variables, and
variables that are assigned values inside methods (through self) are instance variables.

# Python program to show that the variables with a value
# assigned in class declaration, are class variables

# Class for Computer Science Student
class CSStudent:
    stream = 'cse'                  # Class Variable

    def __init__(self, name, roll):
        self.name = name            # Instance Variable
        self.roll = roll            # Instance Variable

# Objects of CSStudent class
a = CSStudent('Geek', 1)
b = CSStudent('Nerd', 2)

print(a.stream)  # prints "cse"
print(b.stream)  # prints "cse"
print(a.name)    # prints "Geek"
print(b.name)    # prints "Nerd"
print(a.roll)    # prints "1"
print(b.roll)    # prints "2"

# Class variables can be accessed using class
# name also
print(CSStudent.stream)  # prints "cse"

# Now if we change the stream for just a it won't be changed for b
a.stream = 'ece'
print(a.stream)  # prints 'ece'
print(b.stream)  # prints 'cse'

# To change the stream for all instances of the class we can change it
# directly from the class
CSStudent.stream = 'mech'
print(a.stream)  # prints 'ece' (a still has its own instance attribute)
print(b.stream)  # prints 'mech'

Output:
cse
cse
Geek
Nerd
1
2
cse
ece
cse
ece
mech
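
The reason a keeps 'ece' in the output above is that assigning through an instance creates a new instance attribute that shadows the class variable. The following is a minimal sketch of that behaviour; it reuses the CSStudent name with a shortened constructor, and the variables a_attrs and b_attrs are introduced here only for illustration.

```python
class CSStudent:
    stream = 'cse'              # class variable

    def __init__(self, name):
        self.name = name        # instance variable

a = CSStudent('Geek')
b = CSStudent('Nerd')

a.stream = 'ece'                # creates an instance attribute on a only

a_attrs = dict(vars(a))         # copy of a's own attributes
b_attrs = dict(vars(b))         # b has no 'stream' of its own
print(a_attrs)
print(b_attrs)
print(CSStudent.stream)         # the class variable is unchanged

del a.stream                    # remove the shadowing instance attribute
print(a.stream)                 # the lookup falls back to the class variable
```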

Class method vs Static method in Python

Class Method
The @classmethod decorator is a built-in function decorator. It is an expression that is
evaluated after the function is defined, and the result of that evaluation replaces (shadows)
the function definition.
A class method receives the class as an implicit first argument, just as an instance method
receives the instance.
Syntax:
class C(object):
    @classmethod
    def fun(cls, arg1, arg2, ...):
        ....
fun: function that needs to be converted into a class method
returns: a class method for function.

• A class method is a method that is bound to the class and not the object of the class.
• They have the access to the state of the class as it takes a class parameter that points
to the class and not the object instance.
• It can modify a class state that would apply across all the instances of the class. For
example, it can modify a class variable that will be applicable to all the instances.
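
As a minimal sketch of the last point, the class below keeps a counter in a class variable and updates it from a class method. The names VisitTracker, count and increment are hypothetical and chosen only for this illustration.

```python
class VisitTracker:
    count = 0                   # class variable shared by all instances

    @classmethod
    def increment(cls):
        # cls is the class itself, so this updates the shared state.
        cls.count += 1
        return cls.count

first = VisitTracker()
second = VisitTracker()

VisitTracker.increment()        # called on the class
first.increment()               # called on an instance; cls is still the class

print(first.count, second.count, VisitTracker.count)
```

Both instances see the same value because neither has an instance attribute named count.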

Static Method
A static method does not receive an implicit first argument.
Syntax:
class C(object):
    @staticmethod
    def fun(arg1, arg2, ...):
        ...
returns: a static method for function fun.

• A static method is also a method that is bound to the class and not the object of the
class.
• A static method can’t access or modify the class state.
• It is present in a class because it makes sense for the method to be present in class.
Class method vs Static Method
• A class method takes cls as the first parameter while a static method needs no specific
parameters.
• A class method can access or modify the class state while a static method can’t access
or modify it.
• In general, static methods know nothing about the class state. They are utility-type
methods that take some parameters and work upon those parameters. On the other
hand, class methods must have the class as a parameter.
• We use @classmethod decorator in python to create a class method and we use
@staticmethod decorator to create a static method in python.
When to use what?
• We generally use class method to create factory methods. Factory methods return
class objects ( similar to a constructor ) for different use cases.
• We generally use static methods to create utility functions.
How to define a class method and a static method?
To define a class method in python, we use @classmethod decorator, and to define a static
method we use @staticmethod decorator.
Let us look at an example to understand the difference between both of them. Let us say we
want to create a class Person. Now, python doesn’t support method overloading like C++ or
Java so we use class methods to create factory methods. In the below example we use a class
method to create a person object from birth year.
As explained above we use static methods to create utility functions. In the below example
we use a static method to check if a person is an adult or not.
Implementation

# Python program to demonstrate
# use of class method and static method.
from datetime import date

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    # a class method to create a Person object by birth year.
    @classmethod
    def fromBirthYear(cls, name, year):
        return cls(name, date.today().year - year)

    # a static method to check if a Person is adult or not.
    @staticmethod
    def isAdult(age):
        return age > 18

person1 = Person('mayank', 21)
person2 = Person.fromBirthYear('mayank', 1996)

print(person1.age)
print(person2.age)

# print the result
print(Person.isAdult(22))

Output (the second value depends on the year in which the program is run; 25 corresponds
to 2021):
21
25
True
Regular Expression in Python
A Regular Expression (RegEx) is a special sequence of characters that defines a search pattern
used to find a string or set of strings. It can detect the presence or absence of text by matching
it against a particular pattern, and can also split a pattern into one or more sub-patterns.
Python provides the re module, which supports the use of regex in Python. One of its primary
functions is re.search(), which takes a regular expression and a string and returns either the
first match or None.
Example:

import re

s = 'GeeksforGeeks: A computer science portal for geeks'

match = re.search(r'portal', s)

print('Start Index:', match.start())
print('End Index:', match.end())

Output
Start Index: 34
End Index: 40
The above code gives the starting index and the ending index of the string portal.
Note: Here r character (r’portal’) stands for raw, not regex. The raw string is slightly different
from a regular string, it won’t interpret the \ character as an escape character. This is because
the regular expression engine uses \ character for its own escaping purpose.
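
The difference between a raw and a regular string can be seen with the \b sequence, which Python itself interprets as a backspace character in a regular string. The snippet below is a small sketch; the variable names plain and raw are chosen here for illustration.

```python
import re

plain = '\bgeeks'    # '\b' becomes a single backspace character
raw = r'\bgeeks'     # backslash and 'b' are kept, so regex sees a word boundary

text = 'portal for geeks'

plain_match = re.search(plain, text)   # looks for a literal backspace
raw_match = re.search(raw, text)       # looks for 'geeks' at a word boundary

print(len(plain), len(raw))
print(plain_match)
print(raw_match.span())
```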
Before starting with the Python regex module let’s see how to actually write regex using
metacharacters or special sequences.

MetaCharacters

Metacharacters are characters that have a special meaning in a regular expression. They are
used throughout the functions of the re module. Below is the list of metacharacters.

MetaCharacters Description
\ Used to drop the special meaning of character following it
[] Represent a character class
^ Matches the beginning
$ Matches the end
. Matches any character except newline
| Means OR (matches either of the patterns separated by it)
? Matches zero or one occurrence
* Any number of occurrences (including 0 occurrences)
+ One or more occurrences
{} Indicate the number of occurrences of a preceding regex to match.
() Enclose a group of Regex
Let’s discuss each of these metacharacters in detail
\ – Backslash
The backslash (\) makes sure that the character following it is not treated in a special way.
This can be considered a way of escaping metacharacters. For example, if you want to search
for the dot (.) in a string, you will find that the dot is treated as a special character because
it is one of the metacharacters (as shown in the above table). In this case, we place a
backslash (\) just before the dot (.) so that it loses its special meaning. See the below example
for a better understanding.
Example:

import re

s = 'geeks.forgeeks'

# without using \
match = re.search(r'.', s)
print(match)

# using \
match = re.search(r'\.', s)
print(match)

Output
<re.Match object; span=(0, 1), match='g'>
<re.Match object; span=(5, 6), match='.'>
[] – Square Brackets
Square brackets ([]) represent a character class consisting of a set of characters that we wish
to match. For example, the character class [abc] will match any single a, b, or c.
We can also specify a range of characters using a hyphen (-) inside the square brackets. For
example,

• [0-3] is the same as [0123]
• [a-c] is the same as [abc]
We can also invert the character class using the caret (^) symbol. For example,

• [^0-3] means any character except 0, 1, 2, or 3
• [^a-c] means any character except a, b, or c
^ – Caret
Caret (^) symbol matches the beginning of the string i.e. checks whether the string starts with
the given character(s) or not. For example –

• ^g will check if the string starts with g such as geeks, globe, girl, g, etc.
• ^ge will check if the string starts with ge such as geeks, geeksforgeeks, etc.
$ – Dollar
Dollar ($) symbol matches the end of the string, i.e. checks whether the string ends with the
given character(s) or not. For example –

• s$ will check for the string that ends with s such as geeks, ends, s, etc.
• ks$ will check for the string that ends with ks such as geeks, geeksforgeeks, ks, etc.
. – Dot
Dot(.) symbol matches only a single character except for the newline character (\n). For
example –

• a.b will check for the string that contains any character at the place of the dot such as
acb, acbd, abbb, etc
• .. will check if the string contains at least 2 characters
| – Or
Or symbol works as the or operator meaning it checks whether the pattern before or after
the or symbol is present in the string or not. For example –

• a|b will match any string that contains a or b such as acd, bcd, abcd, etc.
? – Question Mark
Question mark (?) checks whether the regex immediately before it occurs exactly once or not
at all. For example –

• ab?c will be matched for the strings ac, acb, dabc but will not be matched for abbc
because there are two b’s. Similarly, it will not be matched for abdc because b is not
followed by c.
* – Star
Star (*) symbol matches zero or more occurrences of the regex preceding the * symbol. For
example –

• ab*c will be matched for the string ac, abc, abbbc, dabc, etc. but will not be matched
for abdc because b is not followed by c.
+ – Plus
Plus (+) symbol matches one or more occurrences of the regex preceding the + symbol. For
example –

• ab+c will be matched for the string abc, abbc, dabc, but will not be matched for ac,
abdc because there is no b in ac and b is not followed by c in abdc.
{m,n} – Braces
Braces match from m to n repetitions (both inclusive) of the preceding regex. No space is
allowed after the comma: Python treats a{2, 4} as literal text rather than as a repetition
count. For example –

• a{2,4} will be matched for the strings aaab, baaaac, gaad, but will not be matched for
strings like abc, bc because there is only one a or no a in both the cases.
(<regex>) – Group
Group symbol is used to group sub-patterns. For example –

• (a|b)cd will match for strings like acd, abcd, gacd, etc.
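
The examples listed for each metacharacter can be checked with re.search(). The snippet below is a small sketch that pairs each pattern with one string that should match and one that should not; the list name cases and the chosen test strings are assumptions made for this illustration.

```python
import re

# Each entry: (pattern, string expected to match, string expected not to match)
cases = [
    (r'^ge',     'geeks',  'age'),
    (r'ks$',     'geeks',  'geek'),
    (r'a.b',     'acb',    'ab'),
    (r'ab?c',    'ac',     'abbc'),
    (r'ab*c',    'abbbc',  'abdc'),
    (r'ab+c',    'abbc',   'ac'),
    (r'a{2,4}',  'baaaac', 'abc'),
    (r'(a|b)cd', 'gacd',   'xcd'),
]

# re.search() returns a match object (truthy) or None (falsy).
results = [
    (pattern, bool(re.search(pattern, good)), bool(re.search(pattern, bad)))
    for pattern, good, bad in cases
]

for row in results:
    print(row)
```

Each printed row should show True for the first string and False for the second.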

Special Sequences

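A special sequence is a backslash followed by a letter. Some special sequences, such as \A, \b,
\B and \Z, do not match an actual character; instead they specify the location in the search
string where the match must occur. Others, such as \d, \s and \w, stand for a whole class of
characters. They make commonly used patterns easier to write.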
List of special sequences
Special   Description                                           Example   Example
Sequence                                                        regex     strings
\A        Matches if the string begins with the given           \Afor     for geeks
          character(s).                                                   for the world
\b        Matches if the word begins or ends with the given     \bge      geeks
          character(s). \b(string) checks the beginning of                get
          the word and (string)\b checks the end of the word.
\B        The opposite of \b, i.e. the string should not        \Bge      together
          start or end with the given regex.                              forge
\d        Matches any decimal digit; equivalent to the set      \d        123
          class [0-9].                                                    gee1
\D        Matches any non-digit character; equivalent to the    \D        geeks
          set class [^0-9].                                               geek1
\s        Matches any whitespace character.                     \s        gee ks
                                                                          a bc a
\S        Matches any non-whitespace character.                 \S        a bd
                                                                          abcd
\w        Matches any alphanumeric character; equivalent to     \w        123
          the class [a-zA-Z0-9_].                                         geeKs4
\W        Matches any non-alphanumeric character.               \W        >$
                                                                          gee<>
\Z        Matches if the string ends with the given regex.      ab\Z      abcdab
                                                                          abababab
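
A few of these sequences can be tried on one sample string. The snippet below is a minimal sketch; the sample text and the variable names are assumptions made for this illustration.

```python
import re

text = 'for geeks: call 123 at 4pm'

starts_with_for = bool(re.search(r'\Afor', text))   # anchored at the string start
ends_with_pm = bool(re.search(r'pm\Z', text))       # anchored at the string end
digits = re.findall(r'\d+', text)                   # runs of decimal digits
g_words = re.findall(r'\bg\w*', text)               # words that begin with g
space_count = len(re.findall(r'\s', text))          # whitespace characters

print(starts_with_for, ends_with_pm)
print(digits)
print(g_words)
print(space_count)
```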

Regex Module in Python

Python has a module named re that is used for regular expressions in Python. We can import
this module by using the import statement.
Example: Importing re module in Python

import re

Let’s see various functions provided by this module to work with regex in Python.
re.findall()
Return all non-overlapping matches of pattern in string, as a list of strings. The string is
scanned left-to-right, and matches are returned in the order found.
Example: Finding all occurrences of a pattern

# A Python program to demonstrate working of
# findall()
import re

# A sample text string where regular expression
# is searched.
string = """Hello my Number is 123456789 and
my friend's number is 987654321"""

# A sample regular expression to find digits.
# (raw string, so that \d reaches the regex engine unchanged)
regex = r'\d+'

match = re.findall(regex, string)
print(match)

# This example is contributed by Ayush Saluja.

Output
['123456789', '987654321']
re.compile()
Regular expressions are compiled into pattern objects, which have methods for various
operations such as searching for pattern matches or performing string substitutions.
Example 1:

# The regular expression module is imported
# using the import statement.
import re

# compile() creates regular expression
# character class [a-e],
# which is equivalent to [abcde].
# class [abcde] will match with string with
# 'a', 'b', 'c', 'd', 'e'.
p = re.compile('[a-e]')

# findall() searches for the Regular Expression
# and returns a list of all matches
print(p.findall("Aye, said Mr. Gibenson Stark"))

Output:
['e', 'a', 'd', 'b', 'e', 'a']
Understanding the Output:

• First occurrence is ‘e’ in “Aye” and not ‘A’, as matching is case-sensitive.
• Next occurrence is ‘a’ in “said”, then ‘d’ in “said”, followed by ‘b’ and ‘e’ in “Gibenson”;
the last ‘a’ matches in “Stark”.
• Metacharacter backslash ‘\’ has a very important role as it signals various sequences.
If the backslash is to be used without its special meaning as a metacharacter, use ‘\\’.
Example 2: The set class [\s,.] will match any whitespace character, ‘,’, or ‘.’.
import re

# \d is equivalent to [0-9].
p = re.compile(r'\d')
print(p.findall("I went to him at 11 A.M. on 4th July 1886"))

# \d+ will match a group on [0-9], group
# of one or greater size
p = re.compile(r'\d+')
print(p.findall("I went to him at 11 A.M. on 4th July 1886"))

Output:
['1', '1', '4', '1', '8', '8', '6']
['11', '4', '1886']
Example 3:

import re

# \w is equivalent to [a-zA-Z0-9_].
p = re.compile(r'\w')
print(p.findall("He said * in some_lang."))

# \w+ matches a group of alphanumeric characters.
p = re.compile(r'\w+')
print(p.findall("I went to him at 11 A.M., he "
                "said *** in some_language."))

# \W matches non-alphanumeric characters.
p = re.compile(r'\W')
print(p.findall("he said *** in some_language."))

Output:
['H', 'e', 's', 'a', 'i', 'd', 'i', 'n', 's', 'o', 'm', 'e', '_',
'l', 'a', 'n', 'g']
['I', 'went', 'to', 'him', 'at', '11', 'A', 'M', 'he', 'said', 'in',
'some_language']
[' ', ' ', '*', '*', '*', ' ', ' ', '.']
Example 4:

import re

# '*' allows any number of occurrences
# (including zero) of the preceding character.
p = re.compile('ab*')
print(p.findall("ababbaabbb"))

Output:
['ab', 'abb', 'a', 'abbb']
Understanding the Output:

• Our RE is ab*, which means ‘a’ followed by any number of ‘b’s, starting from 0.
• Output ‘ab’, is valid because of single ‘a’ accompanied by single ‘b’.
• Output ‘abb’, is valid because of single ‘a’ accompanied by 2 ‘b’.
• Output ‘a’, is valid because of single ‘a’ accompanied by 0 ‘b’.
• Output ‘abbb’, is valid because of single ‘a’ accompanied by 3 ‘b’.
re.split()
Splits the string at each occurrence of a character or pattern and returns the pieces as a list;
the text after the last occurrence becomes the final element of the list.
Syntax:
re.split(pattern, string, maxsplit=0, flags=0)
The first parameter, pattern, denotes the regular expression; string is the given string in
which the pattern is searched for and in which the splitting occurs. maxsplit defaults to 0,
which means there is no limit on the number of splits; if a nonzero value is provided, at most
that many splits occur. If maxsplit = 1, the string is split only once, resulting in a list of
length 2. The flags parameter is optional but can help to shorten code; for example, with
flags = re.IGNORECASE the split ignores the difference between lowercase and uppercase.
Example 1:

from re import split

# '\W+' denotes a non-alphanumeric character
# or a group of such characters. Upon finding ','
# or whitespace ' ', split() splits the
# string at that point
print(split(r'\W+', 'Words, words , Words'))
print(split(r'\W+', "Word's words Words"))

# Here ':', ' ', ',' are not alphanumeric and are thus
# the points where splitting occurs
print(split(r'\W+', 'On 12th Jan 2016, at 11:02 AM'))

# '\d+' denotes a numeric character or a group of
# such characters. Splitting occurs at '12', '2016',
# '11', '02' only
print(split(r'\d+', 'On 12th Jan 2016, at 11:02 AM'))

Output:
['Words', 'words', 'Words']
['Word', 's', 'words', 'Words']
['On', '12th', 'Jan', '2016', 'at', '11', '02', 'AM']
['On ', 'th Jan ', ', at ', ':', ' AM']

Example 2:

import re

# Splitting will occur only once, at
# '12', returned list will have length 2
print(re.split(r'\d+', 'On 12th Jan 2016, at 11:02 AM', maxsplit=1))

# 'Boy' and 'boy' will be treated same when
# flags = re.IGNORECASE
print(re.split('[a-f]+', 'Aey, Boy oh boy, come here',
               flags=re.IGNORECASE))
print(re.split('[a-f]+', 'Aey, Boy oh boy, come here'))

Output:
['On ', 'th Jan 2016, at 11:02 AM']
['', 'y, ', 'oy oh ', 'oy, ', 'om', ' h', 'r', '']
['A', 'y, Boy oh ', 'oy, ', 'om', ' h', 'r', '']
re.sub()
The ‘sub’ in the function name stands for substitute. A regular expression pattern is searched
for in the given string (3rd parameter), and each matching substring is replaced by repl
(2nd parameter). count limits the number of replacements; the default of 0 replaces every
occurrence.
Syntax:
re.sub(pattern, repl, string, count=0, flags=0)
Example 1:

import re

# Regular Expression pattern 'ub' matches the
# string at "Subject" and "Uber". As the CASE
# has been ignored, using Flag, 'ub' should
# match twice with the string. Upon matching,
# 'ub' is replaced by '~*' in "Subject", and
# in "Uber", 'Ub' is replaced.
print(re.sub('ub', '~*', 'Subject has Uber booked already',
             flags=re.IGNORECASE))

# Consider the Case Sensitivity, 'Ub' in
# "Uber", will not be replaced.
print(re.sub('ub', '~*', 'Subject has Uber booked already'))

# As count has been given value 1, the maximum
# times replacement occurs is 1
print(re.sub('ub', '~*', 'Subject has Uber booked already',
             count=1, flags=re.IGNORECASE))

# 'r' before the pattern denotes a raw string; \s matches
# a whitespace character on each side of AND.
print(re.sub(r'\sAND\s', ' & ', 'Baked Beans And Spam',
             flags=re.IGNORECASE))

Output
S~*ject has ~*er booked already
S~*ject has Uber booked already
S~*ject has Uber booked already
Baked Beans & Spam
re.subn()
subn() is similar to sub() in all ways, except in the way it provides output. It returns a tuple
containing the new string followed by the total number of replacements made, rather than
just the string.
Syntax:
re.subn(pattern, repl, string, count=0, flags=0)
Example:

import re

print(re.subn('ub', '~*', 'Subject has Uber booked already'))

t = re.subn('ub', '~*', 'Subject has Uber booked already',
            flags=re.IGNORECASE)
print(t)
print("Length of Tuple is:", len(t))

# This will give same output as sub() would have
print(t[0])

Output
('S~*ject has Uber booked already', 1)
('S~*ject has ~*er booked already', 2)
Length of Tuple is: 2
S~*ject has ~*er booked already
re.escape()
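Returns the string with the characters that have a special meaning in regular expressions
escaped with a backslash (before Python 3.7, every non-alphanumeric character was
escaped). This is useful if you want to match an arbitrary literal string that may have regular
expression metacharacters in it.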
Syntax:
re.escape(string)
Example:

import re

# escape() returns a string with BackSlash '\',
# before every special Non-Alphanumeric Character
# In 1st case only ' ', is not alphanumeric
# In 2nd case, ' ', caret '^', '-', '[]', '\'
# are not alphanumeric
print(re.escape("This is Awesome even 1 AM"))
print(re.escape("I Asked what is this [a-9], he said \t ^WoW"))

Output (Python 3.7 and later; earlier versions also escape the comma)
This\ is\ Awesome\ even\ 1\ AM
I\ Asked\ what\ is\ this\ \[a\-9\],\ he\ said\ \ \ \^WoW
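
A typical use of re.escape() is to search for a piece of text exactly as written, even when it contains metacharacters. The snippet below is a minimal sketch; the sample strings and variable names are hypothetical.

```python
import re

literal = 'price (USD): $5.00'      # contains the metacharacters ( ) $ .
text = 'Listed price (USD): $5.00 today'

# Escaped: every special character is matched literally.
escaped_match = re.search(re.escape(literal), text)

# Unescaped: ( ) form a group and $ means end of string, so nothing matches.
unescaped_match = re.search(literal, text)

print(escaped_match.group())
print(unescaped_match)
```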
re.search()
This method either returns None (if the pattern doesn’t match), or a match object (re.Match)
that contains information about the matching part of the string. This method stops after the
first match, so it is better suited for testing a regular expression than for extracting data.
Example: Searching an occurrence of the pattern

# A Python program to demonstrate working of re.search().
import re

# Lets use a regular expression to match a date string
# in the form of Month name followed by day number
regex = r"([a-zA-Z]+) (\d+)"

match = re.search(regex, "I was born on June 24")

if match is not None:
    # We reach here when the expression "([a-zA-Z]+) (\d+)"
    # matches the date string.

    # This will print 14 and 21, since the match starts
    # at index 14 and ends at 21.
    print("Match at index %s, %s" % (match.start(), match.end()))

    # We use group() method to get all the matches and
    # captured groups. The groups contain the matched values.
    # In particular:
    # match.group(0) always returns the fully matched string
    # match.group(1), match.group(2), ... return the capture
    # groups in order from left to right in the input string
    # match.group() is equivalent to match.group(0)

    # So this will print "June 24"
    print("Full match: %s" % (match.group(0)))

    # So this will print "June"
    print("Month: %s" % (match.group(1)))

    # So this will print "24"
    print("Day: %s" % (match.group(2)))

else:
    print("The regex pattern does not match.")

Output
Match at index 14, 21
Full match: June 24
Month: June
Day: 24

Match Object

A Match object contains all the information about the search and the result and if there is no
match found then None will be returned. Let’s see some of the commonly used methods and
attributes of the match object.
Getting the string and the regex
match.re attribute returns the regular expression passed and match.string attribute returns
the string passed.
Example: Getting the string and the regex of the matched object

import re

s = "Welcome to GeeksForGeeks"

# here res is the match object
res = re.search(r"\bG", s)

print(res.re)
print(res.string)

Output
re.compile('\\bG')
Welcome to GeeksForGeeks
Getting index of matched object
• start() method returns the starting index of the matched substring
• end() method returns the ending index of the matched substring
• span() method returns a tuple containing the starting and the ending index of the
matched substring
Example: Getting index of matched object

import re

s = "Welcome to GeeksForGeeks"

# here res is the match object
res = re.search(r"\bGee", s)

print(res.start())
print(res.end())
print(res.span())

Output
11
14
(11, 14)
Getting matched substring
group() method returns the part of the string for which the patterns match. See the below
example for a better understanding.
Example: Getting matched substring

import re

s = "Welcome to Python"

# here res is the match object
res = re.search(r"\D{2} t", s)

print(res.group())

Output
me t
In the above example, the pattern looks for exactly two non-digit characters (\D{2}) followed
by a space, and that space is followed by a t.
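
The index methods and group() describe the same part of the string, and a failed search has to be checked for None before any of them is used. The snippet below is a short sketch; the variable names same_text and missing are introduced here for illustration.

```python
import re

s = "Welcome to GeeksForGeeks"
res = re.search(r"\bGee", s)

# span() gives the slice of s that group() returns.
start, end = res.span()
same_text = s[start:end] == res.group()
print(s[start:end], same_text)

# A pattern that is not present yields None instead of a match object.
missing = re.search(r"\bxyz", s)
if missing is None:
    print("no match")
```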

Search, Match and Find All

The module re provides support for regular expressions in Python. Below are main methods
in this module.
Searching an occurrence of pattern
re.search(): This method either returns None (if the pattern doesn’t match), or a match
object (re.Match) that contains information about the matching part of the string. This
method stops after the first match, so it is better suited for testing a regular expression than
for extracting data.

# A Python program to demonstrate working of re.search().
import re

# Lets use a regular expression to match a date string
# in the form of Month name followed by day number
regex = r"([a-zA-Z]+) (\d+)"

match = re.search(regex, "I was born on June 24")

if match is not None:
    # We reach here when the expression "([a-zA-Z]+) (\d+)"
    # matches the date string.

    # This will print 14 and 21, since the match starts
    # at index 14 and ends at 21.
    print("Match at index %s, %s" % (match.start(), match.end()))

    # We use group() method to get all the matches and
    # captured groups. The groups contain the matched values.
    # In particular:
    # match.group(0) always returns the fully matched string
    # match.group(1), match.group(2), ... return the capture
    # groups in order from left to right in the input string
    # match.group() is equivalent to match.group(0)

    # So this will print "June 24"
    print("Full match: %s" % (match.group(0)))

    # So this will print "June"
    print("Month: %s" % (match.group(1)))

    # So this will print "24"
    print("Day: %s" % (match.group(2)))

else:
    print("The regex pattern does not match.")

Output:
Match at index 14, 21
Full match: June 24
Month: June
Day: 24
Matching a Pattern with Text
re.match(): This function attempts to match the pattern at the beginning of the string only;
unlike re.search(), it does not scan the rest of the string. The re.match function returns a
match object on success, None on failure.
re.match(pattern, string, flags=0)

pattern: Regular expression to be matched.
string: String where pattern is searched.
flags: We can specify different flags using bitwise OR (|).

# A Python program to demonstrate working
# of re.match().
import re

# a sample function that uses regular expressions
# to find month and day of a date.
def findMonthAndDate(string):
    regex = r"([a-zA-Z]+) (\d+)"
    match = re.match(regex, string)

    if match is None:
        print("Not a valid date")
        return

    print("Given Data: %s" % (match.group()))
    print("Month: %s" % (match.group(1)))
    print("Day: %s" % (match.group(2)))

# Driver Code
findMonthAndDate("Jun 24")
print("")
findMonthAndDate("I was born on June 24")

Output:
Given Data: Jun 24
Month: Jun
Day: 24

Not a valid date
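
The second call fails because re.match() only looks at the start of the string. The snippet below is a minimal sketch that contrasts re.match() with re.search() and re.fullmatch() on the same pattern; the variable names and the string "June 24th" are assumptions made for this illustration.

```python
import re

regex = r"([a-zA-Z]+) (\d+)"
text = "I was born on June 24"

at_start = re.match(regex, text)            # None: text does not begin with a date
anywhere = re.search(regex, text)           # finds "June 24" later in the string
prefix = re.match(regex, "June 24th")       # succeeds: only the start must match
whole = re.fullmatch(regex, "June 24th")    # None: the trailing "th" is left over

print(at_start)
print(anywhere.group())
print(prefix.group())
print(whole)
```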


Finding all occurrences of a pattern
re.findall(): Return all non-overlapping matches of pattern in string, as a list of strings. The
string is scanned left-to-right, and matches are returned in the order found.

# A Python program to demonstrate working of
# findall()
import re

# A sample text string where regular expression
# is searched.
string = """Hello my Number is 123456789 and
my friend's number is 987654321"""

# A sample regular expression to find digits.
# (raw string, so that \d reaches the regex engine unchanged)
regex = r'\d+'

match = re.findall(regex, string)
print(match)

# This example is contributed by Ayush Saluja.


Output:
['123456789', '987654321']
Regular expressions are a vast topic, and the re module is a complete library in itself. Regular
expressions can match, search, replace, and extract a lot of data. For example, the small piece
of code below is powerful enough to extract email addresses from a text, so we can build our
own web crawlers and scrapers in Python with ease. Look at the regex below.
# extract all email addresses and add them into the resulting set
new_emails = set(re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", text, re.I))
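
The line above assumes that re has been imported and that a variable named text already holds the text to scan. A self-contained sketch is given below; the sample text and the addresses in it are hypothetical placeholders.

```python
import re

# Hypothetical sample text; the addresses are placeholders.
text = ("Contact alice@example.com or Bob.Smith@mail.example.org; "
        "alice@example.com again.")

# findall() returns every match; the set removes the duplicate address.
new_emails = set(re.findall(r"[a-z0-9.\-+_]+@[a-z0-9.\-+_]+\.[a-z]+", text, re.I))

print(sorted(new_emails))
```

This pattern is a simple illustration and is not a complete validator for every legal email address.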
re.search() VS re.findall()
A Regular expression (sometimes called a Rational expression) is a sequence of characters
that define a search pattern, mainly for use in pattern matching with strings, or string
matching, i.e. “find and replace”-like operations. Regular expressions are a generalized way
to match patterns with sequences of characters.
A regular expression (RE) specifies a set of strings (a pattern) that match it. Metacharacters
are characters with a special meaning in an RE, and they are used in the functions of the re
module.
There are a total of 14 metacharacters (counting each bracket character separately), and
they will be discussed as they appear in the functions:
\ Used to drop the special meaning of character following it (discussed below)
[] Represent a character class
^ Matches the beginning
$ Matches the end
. Matches any character except newline
? Matches zero or one occurrence.
| Means OR (matches either of the patterns separated by it)
* Any number of occurrences (including 0 occurrences)
+ One or more occurrences
{} Indicate number of occurrences of a preceding RE to match.
() Enclose a group of REs
re.search()
re.search() method either returns None (if the pattern doesn’t match), or a match object
(re.Match) that contains information about the matching part of the string. This method
stops after the first match, so it is better suited for testing a regular expression than for
extracting data.
Example:

# A Python program to demonstrate working of re.search().
import re

# Lets use a regular expression to match a date string
# in the form of Month name followed by day number
regex = r"([a-zA-Z]+) (\d+)"

match = re.search(regex, "I was born on June 24")

if match is not None:
    # We reach here when the expression "([a-zA-Z]+) (\d+)"
    # matches the date string.

    # This will print 14 and 21, since the match starts
    # at index 14 and ends at 21.
    print("Match at index %s, %s" % (match.start(), match.end()))

    # We use group() method to get all the matches and
    # captured groups. The groups contain the matched values.
    # In particular:
    # match.group(0) always returns the fully matched string
    # match.group(1), match.group(2), ... return the capture
    # groups in order from left to right in the input string
    # match.group() is equivalent to match.group(0)

    # So this will print "June 24"
    print("Full match: %s" % (match.group(0)))

    # So this will print "June"
    print("Month: %s" % (match.group(1)))

    # So this will print "24"
    print("Day: %s" % (match.group(2)))

else:
    print("The regex pattern does not match.")

Output:
Match at index 14, 21
Full match: June 24
Month: June
Day: 24
re.findall()
Return all non-overlapping matches of pattern in string, as a list of strings. The string is
scanned left-to-right, and matches are returned in the order found.
Example:

# A Python program to demonstrate working of
# findall()
import re

# A sample text string where regular expression
# is searched.
string = """Hello my Number is 123456789 and
my friend's number is 987654321"""

# A sample regular expression to find digits.
# (raw string, so that \d reaches the regex engine unchanged)
regex = r'\d+'

match = re.findall(regex, string)
print(match)

Output:
['123456789', '987654321']
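
The difference between the two functions is easiest to see when both are run with the same pattern on the same string. The snippet below is a minimal sketch; the variable names first and every are chosen here for illustration.

```python
import re

string = "Hello my Number is 123456789 and my friend's number is 987654321"

first = re.search(r'\d+', string)     # match object for the first hit only
every = re.findall(r'\d+', string)    # list with every matched string

print(first.group(), first.start())
print(every)
```

re.search() gives position information for a single match, while re.findall() gives only the matched text but for all occurrences.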

Verbose in Python Regex

re.VERBOSE : This flag allows you to write regular expressions that look nicer and are more
readable by allowing you to visually separate logical sections of the pattern and add
comments.
Whitespace within the pattern is ignored, except when in a character class, or when preceded
by an unescaped backslash, or within tokens like *?, (?: or (?P<...>. When a line contains a
# that is not in a character class and is not preceded by an unescaped backslash, all characters
from the leftmost such # through the end of the line are ignored.
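
The whitespace rule means that a space which must appear in the input has to be escaped in a verbose pattern. The snippet below is a small sketch under assumed names: the two-part number format and the variables pattern, compact and sample are hypothetical.

```python
import re

# Verbose form: layout whitespace and comments are ignored.
pattern = re.compile(r"""
    (\d{3})   # first group of digits
    \         # an escaped space: must appear literally in the input
    (\d{4})   # second group of digits
""", re.VERBOSE)

# Equivalent compact form without re.VERBOSE.
compact = re.compile(r"(\d{3}) (\d{4})")

sample = "call 555 0199 now"
verbose_groups = pattern.search(sample).groups()
compact_groups = compact.search(sample).groups()

print(verbose_groups)
print(compact_groups)
```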

# Without Using VERBOSE
regex_email = re.compile(r'^([a-z0-9_\.-]+)@([0-9a-z\.-]+)\.([a-z\.]{2,6})$',
                         re.IGNORECASE)

# Using VERBOSE
regex_email = re.compile(r"""
    ^([a-z0-9_\.-]+)      # local Part
    @                     # single @ sign
    ([0-9a-z\.-]+)        # Domain name
    \.                    # single Dot .
    ([a-z]{2,6})$         # Top level Domain
    """, re.VERBOSE | re.IGNORECASE)

It’s passed as an argument to re.compile(), i.e. re.compile(Regular Expression, re.VERBOSE).
re.compile() returns a compiled pattern object (re.Pattern), which is then matched against
the given string.
Let’s consider an example where the user is asked to enter their Email ID and we have to
validate it using RegEx. The format of an email is as follows:

• Personal details/local part like john123
• Single @
• Domain Name like gmail/yahoo etc
• Single Dot(.)
• Top Level Domain like .com/.org/.net
Examples:
Input: [email protected]
Output: Valid

Input: [email protected]@
Output: Invalid

This is invalid because there is @ after the top level domain name.
Below is the Python implementation –

# Python3 program to show the Implementation of VERBOSE in RegEX
import re

def validate_email(email):

    # re.compile(regular expression, flags) compiles a regular
    # expression pattern into a regular expression object
    regex_email = re.compile(r"""
        ^([a-z0-9_\.-]+)      # local Part
        @                     # single @ sign
        ([0-9a-z\.-]+)        # Domain name
        \.                    # single Dot .
        ([a-z]{2,6})$         # Top level Domain
        """, re.VERBOSE | re.IGNORECASE)

    # The compiled pattern is matched against the whole string
    # using fullmatch(). If a match is found, fullmatch()
    # returns a match object; otherwise it returns None
    res = regex_email.fullmatch(email)

    # If match is found, the string is valid
    if res:
        print("{} is Valid. Details are as follow:".format(email))

        # prints first part/personal detail of Email Id
        print("Local:{}".format(res.group(1)))

        # prints Domain Name of Email Id
        print("Domain:{}".format(res.group(2)))

        # prints Top Level Domain Name of Email Id
        print("Top Level domain:{}".format(res.group(3)))
        print()

    else:
        # If match is not found, string is invalid
        print("{} is Invalid".format(email))

# Driver Code
# (first address reconstructed from the Local/Domain/Top Level domain
# lines of the output shown below)
validate_email("expectopatronum@gmail.com")
validate_email("[email protected]@")
validate_email("[email protected]")

Output:
[email protected] is Valid. Details are as follow:
Local:expectopatronum
Domain:gmail
Top Level domain:com

[email protected]@ is Invalid
[email protected] is Invalid

Password validation in Python

Let’s take a password as a combination of alphanumeric characters along with special
characters, and check whether it is valid against a few conditions.
Conditions for a valid password are:
1. Should have at least one number.
2. Should have at least one uppercase and one lowercase character.
3. Should have at least one special symbol.
4. Should be between 6 and 20 characters long.
Input : Geek12#
Output : Password is valid.
Input : asd123
Output : Invalid Password !!
We can check if a given string is eligible to be a password or not using multiple ways.
Method #1: Naive Method (Without using Regex).

# Password validation in Python
# using naive method

# Function to validate the password
def password_check(passwd):
    SpecialSym = ['$', '@', '#', '%']
    val = True

    if len(passwd) < 6:
        print('length should be at least 6')
        val = False

    if len(passwd) > 20:
        print('length should not be greater than 20')
        val = False

    if not any(char.isdigit() for char in passwd):
        print('Password should have at least one numeral')
        val = False

    if not any(char.isupper() for char in passwd):
        print('Password should have at least one uppercase letter')
        val = False

    if not any(char.islower() for char in passwd):
        print('Password should have at least one lowercase letter')
        val = False

    if not any(char in SpecialSym for char in passwd):
        print('Password should have at least one of the symbols $@#%')
        val = False

    # True only if every check above passed
    return val

# Main method
def main():
    passwd = 'Geek12@'
    if password_check(passwd):
        print("Password is valid")
    else:
        print("Invalid Password !!")

# Driver Code
if __name__ == '__main__':
    main()

Output:
Password is valid
This code uses boolean string methods to check each condition in turn. The logic is simple,
but the code is fairly long.

Method #2: Using regex


The compile() function of the re module creates a pattern object, stored here in the variable
pat. We then check whether the input string passwd follows the pattern defined by pat. If it
does, search() returns a match object (which is truthy) and the password is accepted;
otherwise search() returns None.

# importing re library
import re

def main():
    passwd = 'Geek12@'
    reg = r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*#?&])[A-Za-z\d@$!#%*?&]{6,20}$"

    # compiling regex
    pat = re.compile(reg)

    # searching regex
    mat = re.search(pat, passwd)

    # validating conditions
    if mat:
        print("Password is valid.")
    else:
        print("Password invalid !!")

# Driver Code
if __name__ == '__main__':
    main()

Output:
Password is valid.
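The single pattern used in Method #2 only says whether a password passes or fails. As a hedged sketch, the same conditions can be split into one small pattern per rule so that the failing rules can be reported, as Method #1 does. The names REQUIRED, ALLOWED and failed_rules are hypothetical and chosen only for this illustration.

```python
import re

# Hypothetical rule table: message -> pattern that must occur somewhere
REQUIRED = {
    "at least one digit": re.compile(r"\d"),
    "at least one lowercase letter": re.compile(r"[a-z]"),
    "at least one uppercase letter": re.compile(r"[A-Z]"),
    "at least one special symbol": re.compile(r"[@$!%*#?&]"),
}

# Whole-string rule: 6 to 20 characters taken from the allowed set
ALLOWED = re.compile(r"[A-Za-z\d@$!%*#?&]{6,20}")

def failed_rules(passwd):
    """Return the list of rules that passwd breaks (empty if valid)."""
    failures = [msg for msg, pat in REQUIRED.items()
                if not pat.search(passwd)]
    if not ALLOWED.fullmatch(passwd):
        failures.append("6 to 20 characters from the allowed set")
    return failures

print(failed_rules("Geek12@"))
print(failed_rules("asd123"))
```

An empty list means the password is valid. Each lookahead of the combined pattern corresponds to one entry of the table, and the final character class with {6,20} corresponds to ALLOWED.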
Python Collections
The collections module in Python provides different types of containers. A container is an
object that stores other objects and provides a way to access the contained objects and
iterate over them. Some of the built-in containers are tuple, list, dict, etc. In this
section, we will discuss the different containers provided by the collections module, as below.

• Counters
• OrderedDict
• DefaultDict
• ChainMap
• NamedTuple
• DeQue
• UserDict
• UserList
• UserString

Counters

Counter is a container included in the collections module.
What is a container?
Containers are objects that hold objects. They provide a way to access the contained objects
and iterate over them. Examples of built-in containers are tuple, list, and dictionary. Others
are included in the collections module.
A Counter is a subclass of dict. It stores elements as dictionary keys and their respective
counts as dictionary values. This is equivalent to a bag or multiset in other languages.
Syntax: class collections.Counter([iterable-or-mapping])
Initialization:
The constructor of counter can be called in any one of the following ways:
• With sequence of items
• With dictionary containing keys and counts
• With keyword arguments mapping string names to counts
Example of each type of initialization:

# A Python program to show different ways to create

# Counter

from collections import Counter


# With sequence of items

print(Counter(['B','B','A','B','C','A','B','B','A','C']))

# with dictionary

print(Counter({'A':3, 'B':5, 'C':2}))

# with keyword arguments

print(Counter(A=3, B=5, C=2))

Output of all the three lines is same:


Counter({'B': 5, 'A': 3, 'C': 2})
Counter({'B': 5, 'A': 3, 'C': 2})
Counter({'B': 5, 'A': 3, 'C': 2})
Updating:
We can also create an empty counter in the following manner:
coun = collections.Counter()
It can then be updated via the update() method. The syntax is:
coun.update(Data)

# A Python program to demonstrate update()

from collections import Counter

coun = Counter()

coun.update([1, 2, 3, 1, 2, 1, 1, 2])

print(coun)

coun.update([1, 2, 4])

print(coun)

Output:
Counter({1: 4, 2: 3, 3: 1})
Counter({1: 5, 2: 4, 3: 1, 4: 1})
• Data can be provided in any of the three ways mentioned under initialization, and the
counts are added to the counter’s existing data rather than replacing it.
• Counts can also be zero or negative.

# Python program to demonstrate that counts in

# Counter can be 0 and negative

from collections import Counter

c1 = Counter(A=4, B=3, C=10)

c2 = Counter(A=10, B=3, C=4)

c1.subtract(c2)

print(c1)

Output :
Counter({'C': 6, 'B': 0, 'A': -6})

• We can use Counter to count distinct elements of a list or other collections.

# An example program where different list items are

# counted using counter

from collections import Counter

# Create a list

z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']

# Count distinct elements and print Counter object

print(Counter(z))

Output:
Counter({'blue': 3, 'red': 2, 'yellow': 1})
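A short sketch of a few further Counter operations, reusing the data from the examples above: most_common() ranks elements by count, a missing element reports a count of 0 instead of raising a KeyError, and the - operator keeps only positive counts, unlike subtract(). The variable names are illustrative.

```python
from collections import Counter

colours = Counter(['blue', 'red', 'blue', 'yellow', 'blue', 'red'])

# The two most frequent elements, most common first
top_two = colours.most_common(2)

# A missing element has a count of 0; the lookup does not add it
green_count = colours['green']

# The - operator drops zero and negative counts, unlike subtract()
c1 = Counter(A=4, B=3, C=10)
c2 = Counter(A=10, B=3, C=4)
difference = c1 - c2

print(top_two)
print(green_count)
print(difference)
```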
OrderedDict in Python

An OrderedDict is a dictionary subclass that remembers the order in which keys were first
inserted. Before Python 3.7, a regular dict did not track the insertion order, and iterating
over it gave the items in an arbitrary order. Since Python 3.7, a regular dict preserves the
insertion order too, so the main differences today are that OrderedDict offers order-related
methods such as move_to_end() and popitem(last=False), and that comparing two OrderedDicts
for equality also takes their order into account.

# A Python program to demonstrate working of OrderedDict

from collections import OrderedDict

print("This is a Dict:\n")

d = {}

d['a'] = 1

d['b'] = 2

d['c'] = 3

d['d'] = 4

for key, value in d.items():
    print(key, value)

print("\nThis is an Ordered Dict:\n")

od = OrderedDict()

od['a'] = 1

od['b'] = 2

od['c'] = 3

od['d'] = 4

for key, value in od.items():
    print(key, value)

Output (on Python 3.7 and later; older versions may print the plain dict in an arbitrary order):
This is a Dict:
a 1
b 2
c 3
d 4

This is an Ordered Dict:


a 1
b 2
c 3
d 4
Important Points:
• Key value Change: If the value of a certain key is changed, the position of the key
remains unchanged in OrderedDict.

# A Python program to demonstrate working of key

# value change in OrderedDict

from collections import OrderedDict

print("Before:\n")

od = OrderedDict()

od['a'] = 1

od['b'] = 2

od['c'] = 3

od['d'] = 4

for key, value in od.items():
    print(key, value)

print("\nAfter:\n")

od['c'] = 5

for key, value in od.items():
    print(key, value)
Output:
Before:

a 1
b 2
c 3
d 4

After:

a 1
b 2
c 5
d 4
• Deletion and Re-Inserting: Deleting and re-inserting the same key moves it to the
back, because OrderedDict maintains the order of insertion.

# A Python program to demonstrate working of deletion

# re-insertion in OrderedDict

from collections import OrderedDict

print("Before deleting:\n")

od = OrderedDict()

od['a'] = 1

od['b'] = 2

od['c'] = 3

od['d'] = 4

for key, value in od.items():
    print(key, value)
print("\nAfter deleting:\n")

od.pop('c')

for key, value in od.items():
    print(key, value)

print("\nAfter re-inserting:\n")

od['c'] = 3

for key, value in od.items():
    print(key, value)

Output:
Before deleting:

a 1
b 2
c 3
d 4

After deleting:

a 1
b 2
d 4

After re-inserting:

a 1
b 2
d 4
c 3
Other Considerations:
• OrderedDict consumes more memory than a normal dict. This is due to the underlying
doubly linked list used for keeping the order.
• Starting from Python 3.7, insertion order of regular Python dictionaries is guaranteed
as well. OrderedDict remains useful for its extra methods, such as move_to_end() and
popitem(last=False).
• OrderedDict can be used as a stack with the help of the popitem() function. Try
implementing an LRU cache with OrderedDict.
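One possible answer to the LRU cache exercise is sketched below. It assumes a tiny, hypothetical interface (a class named LRUCache with get() and put()); move_to_end() marks a key as most recently used, and popitem(last=False) evicts the least recently used one.

```python
from collections import OrderedDict

class LRUCache:
    """Hypothetical minimal LRU cache, for illustration only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the key as most recently used
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        # Evict the least recently used entry when over capacity
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)

cache = LRUCache(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')       # 'a' becomes the most recently used key
cache.put('c', 3)    # evicts 'b', the least recently used key
lru_keys = list(cache._data)
print(lru_keys)
```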

Defaultdict in Python

A dictionary in Python is a collection of key-value pairs, used to store data like a map.
Unlike other data types that hold only a single value as an element, a dictionary holds
key-value pairs. In a dictionary, the keys must be unique and hashable (in practice,
immutable). This means that a Python tuple can be a key whereas a Python list cannot. A
dictionary can be created by placing a comma-separated sequence of key: value pairs within
curly braces {}.
Example:

# Python program to demonstrate

# dictionary

Dict = {1: 'Geeks', 2: 'For', 3: 'Geeks'}

print("Dictionary:")

print(Dict)

print(Dict[1])

# Uncommenting the next line raises a KeyError,
# as the key 4 is not present in the dictionary
# print(Dict[4])

Output (with the last line uncommented):
Dictionary:
{1: 'Geeks', 2: 'For', 3: 'Geeks'}
Geeks
Traceback (most recent call last):
  File "/home/1ca83108cc81344dc7137900693ced08.py", line 11, in <module>
    print(Dict[4])
KeyError: 4
Sometimes a KeyError raised for a missing key becomes a problem. To overcome this, Python
provides another dictionary-like container known as defaultdict, which is present in the
collections module.
defaultdict is a subclass of the dict class. The functionality of dictionaries and defaultdict
is almost the same, except that looking up a missing key with d[key] does not raise a
KeyError as long as a default_factory is set. Instead, defaultdict creates a default value
for the key, stores it and returns it.
Syntax: defaultdict(default_factory)

Parameters:
• default_factory: A function returning the default value for the dictionary defined. If
this argument is absent then the dictionary raises a KeyError.
Example:

# Python program to demonstrate

# defaultdict

from collections import defaultdict

# Function to return a default

# values for keys that is not

# present

def def_value():
    return "Not Present"

# Defining the dict

d = defaultdict(def_value)

d["a"] = 1

d["b"] = 2

print(d["a"])
print(d["b"])

print(d["c"])

Output:
1
2
Not Present
Inner Working of defaultdict
Defaultdict adds one writable instance variable and one method in addition to the standard
dictionary operations. The instance variable is the default_factory parameter and the method
provided is __missing__.
• Default_factory: It is a function returning the default value for the dictionary defined.
If this argument is absent then the dictionary raises a KeyError.
Example:

# Python program to demonstrate

# default_factory argument of

# defaultdict

from collections import defaultdict

# Defining the dict and passing

# lambda as default_factory argument

d = defaultdict(lambda: "Not Present")

d["a"] = 1

d["b"] = 2

print(d["a"])

print(d["b"])

print(d["c"])

Output:
1
2
Not Present
• __missing__(): This method provides the default value for a missing key. It takes the
key as its argument. If default_factory is None, a KeyError is raised; otherwise
default_factory is called without arguments, and the result is stored under the key and
returned. This method is called by the __getitem__() method of the dict class when the
requested key is not found; __getitem__() then returns or raises whatever __missing__()
returns or raises. Note that calling __missing__() directly, as in the example below,
stores the default even for a key that already exists, so the value of 'a' is overwritten.
Example:

# Python program to demonstrate

# defaultdict

from collections import defaultdict

# Defining the dict

d = defaultdict(lambda: "Not Present")

d["a"] = 1

d["b"] = 2

# Provides the default value

# for the key

print(d.__missing__('a'))

print(d.__missing__('d'))

Output:
Not Present
Not Present
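The following sketch shows when the default is actually created. Indexing with d[key] goes through __missing__() and stores the default, while get() and the in operator never call default_factory. A defaultdict created without a default_factory behaves like a plain dict and raises a KeyError. The variable names are illustrative.

```python
from collections import defaultdict

d = defaultdict(lambda: "Not Present")
d["a"] = 1

via_get = d.get("x")            # get() does not use default_factory
has_x_after_get = "x" in d      # the key was not created

via_index = d["x"]              # indexing triggers __missing__()
has_x_after_index = "x" in d    # the key now exists

# Without a default_factory, a missing key raises KeyError
plain = defaultdict()
try:
    plain["x"]
    raised = False
except KeyError:
    raised = True

print(via_get, has_x_after_get, via_index, has_x_after_index, raised)
```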
Using List as default_factory
When the list class is passed as the default_factory argument, a defaultdict is created
whose default value is an empty list.
Example:

# Python program to demonstrate

# defaultdict

from collections import defaultdict

# Defining a dict

d = defaultdict(list)

for i in range(5):
    d[i].append(i)

print("Dictionary with values as list:")

print(d)

Output:
Dictionary with values as list:
defaultdict(<class 'list'>, {0: [0], 1: [1], 2: [2], 3: [3], 4: [4]})
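A common use of a list default_factory is grouping items under a shared key. The sketch below groups some made-up words by their first letter; no check for a missing key is needed before calling append().

```python
from collections import defaultdict

# Hypothetical sample data
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_letter = defaultdict(list)
for word in words:
    # A missing letter starts with an empty list automatically
    by_letter[word[0]].append(word)

grouped = dict(by_letter)
print(grouped)
```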
Using int as default_factory
When the int class is passed as the default_factory argument, then a defaultdict is created
with default value as zero.
Example:

# Python program to demonstrate

# defaultdict

from collections import defaultdict

# Defining the dict

d = defaultdict(int)
L = [1, 2, 3, 4, 2, 4, 1, 2]

# Iterate through the list

# for keeping the count

for i in L:
    # The default value is 0
    # so there is no need to
    # enter the key first
    d[i] += 1

print(d)

Output:
defaultdict(<class 'int'>, {1: 2, 2: 3, 3: 1, 4: 2})

ChainMap in Python

Python contains a container called “ChainMap” which encapsulates many dictionaries into
one unit. ChainMap is member of module “collections“.
Example:

# Python program to demonstrate

# ChainMap

from collections import ChainMap

d1 = {'a': 1, 'b': 2}

d2 = {'c': 3, 'd': 4}

d3 = {'e': 5, 'f': 6}

# Defining the chainmap


c = ChainMap(d1, d2, d3)

print(c)

Output:
ChainMap({'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6})
Let’s see various Operations on ChainMap
Access Operations
• keys() :- This method is used to display the keys of all the dictionaries in the
ChainMap. A key that occurs in several dictionaries is listed once.
• values() :- This method is used to display the values of all the dictionaries in the
ChainMap.
• maps :- This attribute (not a method) holds the list of the underlying dictionaries, with
their keys and corresponding values.

# Please select Python 3 for running this code in IDE

# Python code to demonstrate ChainMap and

# keys(), values() and maps

# importing collections for ChainMap operations

import collections

# initializing dictionaries

dic1 = { 'a' : 1, 'b' : 2 }

dic2 = { 'b' : 3, 'c' : 4 }

# initializing ChainMap

chain = collections.ChainMap(dic1, dic2)

# printing chainMap using maps

print ("All the ChainMap contents are : ")

print (chain.maps)

# printing keys using keys()


print ("All keys of ChainMap are : ")

print (list(chain.keys()))

# printing values using values()

print ("All values of ChainMap are : ")

print (list(chain.values()))

Output:
All the ChainMap contents are :
[{'b': 2, 'a': 1}, {'c': 4, 'b': 3}]
All keys of ChainMap are :
['a', 'c', 'b']
All values of ChainMap are :
[1, 4, 2]

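Note: The key named “b” exists in both dictionaries, but lookups search the dictionaries in
the order in which they were passed to ChainMap, so only the value from the first dictionary
(2) is used for “b”. The order in which keys() and values() list the entries can differ
between Python versions.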
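The lookup order can be put to use for layered settings. In this sketch, with hypothetical dictionary names and values, reads fall through from the first dictionary to the later ones, while writes only ever touch the first dictionary.

```python
from collections import ChainMap

# Hypothetical configuration layers
defaults = {'theme': 'light', 'lang': 'en'}
overrides = {'theme': 'dark'}

settings = ChainMap(overrides, defaults)

theme = settings['theme']   # found in overrides first
lang = settings['lang']     # falls through to defaults

# Writes go to the first mapping only; defaults stays unchanged
settings['lang'] = 'fr'

print(theme, lang, settings['lang'])
print(overrides)
print(defaults)
```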
Manipulating Operations
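• new_child() :- This method returns a new ChainMap with a new dictionary added at the
beginning, followed by all the dictionaries of the current ChainMap.
• reversed() :- This built-in function, applied to the maps attribute, reverses the relative
ordering of dictionaries in the ChainMap.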

# Please select Python 3 for running this code in IDE

# Python code to demonstrate ChainMap and

# reversed() and new_child()

# importing collections for ChainMap operations

import collections

# initializing dictionaries

dic1 = { 'a' : 1, 'b' : 2 }

dic2 = { 'b' : 3, 'c' : 4 }


dic3 = { 'f' : 5 }

# initializing ChainMap

chain = collections.ChainMap(dic1, dic2)

# printing chainMap using map

print ("All the ChainMap contents are : ")

print (chain.maps)

# using new_child() to add new dictionary

chain1 = chain.new_child(dic3)

# printing chainMap using map

print ("Displaying new ChainMap : ")

print (chain1.maps)

# displaying value associated with b before reversing

print ("Value associated with b before reversing is : ",end="")

print (chain1['b'])

# reversing the ChainMap

chain1.maps = list(reversed(chain1.maps))

# displaying value associated with b after reversing

print ("Value associated with b after reversing is : ",end="")

print (chain1['b'])

Output:
All the ChainMap contents are :
[{'b': 2, 'a': 1}, {'b': 3, 'c': 4}]
Displaying new ChainMap :
[{'f': 5}, {'b': 2, 'a': 1}, {'b': 3, 'c': 4}]
Value associated with b before reversing is : 2
Value associated with b after reversing is : 3

Namedtuple in Python

Python supports a type of container called “namedtuple()”, present in the module
“collections“. It is a factory function that creates tuple subclasses with named fields. Like
a dictionary, a namedtuple lets you look up a value by name. Unlike a dictionary, its values
are also ordered and can be accessed by index, and an instance is immutable like any tuple.
Example:

# Python code to demonstrate namedtuple()

from collections import namedtuple

# Declaring namedtuple()

Student = namedtuple('Student', ['name', 'age', 'DOB'])

# Adding values

S = Student('Nandini', '19', '2541997')

# Access using index

print("The Student age using index is : ", end="")

print(S[1])

# Access using name

print("The Student name using keyname is : ", end="")

print(S.name)

Output:
The Student age using index is : 19
The Student name using keyname is : Nandini
Let’s see various Operations on namedtuple()
Access Operations
• Access by index: The attribute values of namedtuple() are ordered and can be
accessed using the index number, unlike dictionaries, which are not accessible by index.
• Access by keyname: The values can also be accessed by field name, using attribute syntax
such as S.name.
• using getattr(): This is yet another way to access the value, by giving the namedtuple and
the field name as its arguments.

# Python code to demonstrate namedtuple() and

# Access by name, index and getattr()

# importing "collections" for namedtuple()

import collections

# Declaring namedtuple()

Student = collections.namedtuple('Student', ['name', 'age', 'DOB'])

# Adding values

S = Student('Nandini', '19', '2541997')

# Access using index

print("The Student age using index is : ", end="")

print(S[1])

# Access using name

print("The Student name using keyname is : ", end="")

print(S.name)

# Access using getattr()

print("The Student DOB using getattr() is : ", end="")

print(getattr(S, 'DOB'))

Output:
The Student age using index is : 19
The Student name using keyname is : Nandini
The Student DOB using getattr() is : 2541997
Conversion Operations
• _make() :- This class method returns a namedtuple() built from the iterable passed
as argument.
• _asdict() :- This method returns a dictionary that maps the field names to their values.
It is an OrderedDict() up to Python 3.7 and a regular dict from Python 3.8.
• using “**” (double star) operator :- This operator unpacks a dictionary into keyword
arguments, which converts the dictionary into the namedtuple().

# Python code to demonstrate namedtuple() and

# _make(), _asdict() and "**" operator

# importing "collections" for namedtuple()

import collections

# Declaring namedtuple()

Student = collections.namedtuple('Student',

['name', 'age', 'DOB'])

# Adding values

S = Student('Nandini', '19', '2541997')

# initializing iterable

li = ['Manjeet', '19', '411997']

# initializing dict

di = {'name': "Nikhil", 'age': 19, 'DOB': '1391997'}

# using _make() to return namedtuple()

print("The namedtuple instance using iterable is : ")


print(Student._make(li))

# using _asdict() to return an OrderedDict()

print("The OrderedDict instance using namedtuple is : ")

print(S._asdict())

# using ** operator to return namedtuple from dictionary

print("The namedtuple instance from dict is : ")

print(Student(**di))

Output (on Python 3.7 and earlier; from Python 3.8, _asdict() returns a regular dict):
The namedtuple instance using iterable is :
Student(name='Manjeet', age='19', DOB='411997')
The OrderedDict instance using namedtuple is :
OrderedDict([('name', 'Nandini'), ('age', '19'), ('DOB', '2541997')])
The namedtuple instance from dict is :
Student(name='Nikhil', age=19, DOB='1391997')
Additional Operation
• _fields: This attribute holds a tuple of all the field names of the namedtuple declared.
• _replace(): _replace() is like str.replace() but targets named fields. It returns a new
namedtuple and does not modify the original values.

# Python code to demonstrate namedtuple() and

# _fields and _replace()

# importing "collections" for namedtuple()

import collections

# Declaring namedtuple()

Student = collections.namedtuple('Student', ['name', 'age', 'DOB'])


# Adding values

S = Student('Nandini', '19', '2541997')

# using _fields to display all the keynames of namedtuple()

print("All the fields of students are : ")

print(S._fields)

# ._replace returns a new namedtuple, it does not modify the original

print("returns a new namedtuple : ")

print(S._replace(name='Manjeet'))

# original namedtuple

print(S)

Output:
All the fields of students are :
('name', 'age', 'DOB')
returns a new namedtuple :
Student(name='Manjeet', age='19', DOB='2541997')
Student(name='Nandini', age='19', DOB='2541997')
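Because a namedtuple is a tuple, it is immutable and can be unpacked. The sketch below reuses the Student type from the examples above: assigning to a field raises an AttributeError, and _replace() is the way to get a changed copy.

```python
from collections import namedtuple

Student = namedtuple('Student', ['name', 'age', 'DOB'])
S = Student('Nandini', '19', '2541997')

# Fields cannot be reassigned on an existing instance
try:
    S.age = '20'
    assignment_failed = False
except AttributeError:
    assignment_failed = True

# A namedtuple unpacks like any other tuple
name, age, dob = S

# _replace() builds a new instance; S itself is unchanged
older = S._replace(age='20')

print(assignment_failed, name, older.age, S.age)
```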

Deque in Python

Deque (double-ended queue) in Python is implemented in the module “collections“.
Deque is preferred over a list in the cases where we need quick append and pop operations
at both ends of the container: deque provides O(1) time complexity for append and pop at
either end, whereas a list is O(1) only at its right end and needs O(n) time to insert or
remove an element at its left end.
Example:

# Python code to demonstrate deque

from collections import deque

# Declaring deque

queue = deque(['name','age','DOB'])
print(queue)

Output:
deque(['name', 'age', 'DOB'])
Let’s see various Operations on deque:
• append():- This method inserts the value given as its argument at the right end of
the deque.
• appendleft():- This method inserts the value given as its argument at the left
end of the deque.
• pop():- This method removes and returns an element from the right end of the deque.
• popleft():- This method removes and returns an element from the left end of the
deque.

# Python code to demonstrate working of

# append(), appendleft(), pop(), and popleft()

# importing "collections" for deque operations

import collections

# initializing deque

de = collections.deque([1,2,3])

# using append() to insert element at right end

# inserts 4 at the end of deque

de.append(4)

# printing modified deque

print ("The deque after appending at right is : ")

print (de)

# using appendleft() to insert element at left end


# inserts 6 at the beginning of deque

de.appendleft(6)

# printing modified deque

print ("The deque after appending at left is : ")

print (de)

# using pop() to delete element from right end

# deletes 4 from the right end of deque

de.pop()

# printing modified deque

print ("The deque after deleting from right is : ")

print (de)

# using popleft() to delete element from left end

# deletes 6 from the left end of deque

de.popleft()

# printing modified deque

print ("The deque after deleting from left is : ")

print (de)

Output:
The deque after appending at right is :
deque([1, 2, 3, 4])
The deque after appending at left is :
deque([6, 1, 2, 3, 4])
The deque after deleting from right is :
deque([6, 1, 2, 3])
The deque after deleting from left is :
deque([1, 2, 3])
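Combining append() with popleft() gives a first-in, first-out queue in which both operations take O(1) time. A minimal sketch with made-up task names:

```python
from collections import deque

# Hypothetical task names
tasks = deque()
tasks.append('parse')
tasks.append('validate')
tasks.append('store')

served = []
while tasks:
    # The oldest element leaves first
    served.append(tasks.popleft())

print(served)
```

Using append() together with pop() instead would turn the same deque into a stack.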
• index(ele, beg, end):- This method returns the first index of the value ele, searching
from index beg up to, but not including, index end.
• insert(i, a) :- This method inserts the value a at index i.
• remove():- This method removes the first occurrence of the value given as its
argument.
• count():- This method counts the number of occurrences of the value given as its
argument.

# Python code to demonstrate working of

# insert(), index(), remove(), count()

# importing "collections" for deque operations

import collections

# initializing deque

de = collections.deque([1, 2, 3, 3, 4, 2, 4])

# using index() to print the first occurrence of 4

print ("The number 4 first occurs at a position : ")

print (de.index(4,2,5))

# using insert() to insert the value 3 at 5th position

de.insert(4,3)

# printing modified deque

print ("The deque after inserting 3 at 5th position is : ")

print (de)

# using count() to count the occurrences of 3

print ("The count of 3 in deque is : ")

print (de.count(3))
# using remove() to remove the first occurrence of 3

de.remove(3)

# printing modified deque

print ("The deque after deleting first occurrence of 3 is : ")

print (de)

Output:
The number 4 first occurs at a position :
4
The deque after inserting 3 at 5th position is :
deque([1, 2, 3, 3, 3, 4, 2, 4])
The count of 3 in deque is :
3
The deque after deleting first occurrence of 3 is :
deque([1, 2, 3, 3, 4, 2, 4])
• extend(iterable):- This function is used to add multiple values at the right end of the
deque. The argument passed is iterable.
• extendleft(iterable):- This function is used to add multiple values at the left end of
the deque. The argument passed is iterable. Order is reversed as a result of left
appends.
• reverse():- This function is used to reverse the order of deque elements.
• rotate():- This function rotates the deque by the number specified in arguments. If
the number specified is negative, rotation occurs to the left. Else rotation is to
right.

# Python code to demonstrate working of

# extend(), extendleft(), rotate(), reverse()

# importing "collections" for deque operations

import collections
# initializing deque

de = collections.deque([1, 2, 3,])

# using extend() to add numbers to right end

# adds 4,5,6 to right end

de.extend([4,5,6])

# printing modified deque

print ("The deque after extending deque at end is : ")

print (de)

# using extendleft() to add numbers to left end

# adds 7,8,9 to left end

de.extendleft([7,8,9])

# printing modified deque

print ("The deque after extending deque at beginning is : ")

print (de)

# using rotate() to rotate the deque

# rotates by 3 to left

de.rotate(-3)

# printing modified deque

print ("The deque after rotating deque is : ")

print (de)

# using reverse() to reverse the deque

de.reverse()
# printing modified deque

print ("The deque after reversing deque is : ")

print (de)

Output:
The deque after extending deque at end is :
deque([1, 2, 3, 4, 5, 6])
The deque after extending deque at beginning is :
deque([9, 8, 7, 1, 2, 3, 4, 5, 6])
The deque after rotating deque is :
deque([1, 2, 3, 4, 5, 6, 9, 8, 7])
The deque after reversing deque is :
deque([7, 8, 9, 6, 5, 4, 3, 2, 1])
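The direction rules for rotate() and extendleft() can be checked on a small deque. In this sketch, a positive argument rotates to the right, a negative one to the left, and extendleft() inserts its items one at a time, which reverses their order.

```python
from collections import deque

d = deque([1, 2, 3, 4, 5])

d.rotate(2)                 # to the right: the last two items move to the front
after_right = list(d)

d.rotate(-2)                # to the left: undoes the previous rotation
after_left = list(d)

e = deque([1])
e.extendleft([7, 8, 9])     # 7 is added first, so 9 ends up leftmost
after_extendleft = list(e)

print(after_right, after_left, after_extendleft)
```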

Heap queue (or heapq) in Python

The heap data structure is mainly used to represent a priority queue. In Python, it is
available through the “heapq” module. The property of this data structure in Python is that
each time the smallest heap element is popped (min heap). Whenever elements are pushed or
popped, the heap structure is maintained. The heap[0] element is always the smallest
element.
Let’s see various Operations on heap:
• heapify(iterable) :- This function converts a list into a heap data structure in place,
i.e. rearranges it into heap order.
• heappush(heap, ele) :- This function inserts the element ele into the heap. The order is
adjusted so that the heap structure is maintained.
• heappop(heap) :- This function removes and returns the smallest element from the heap.
The order is adjusted so that the heap structure is maintained.

# Python code to demonstrate working of

# heapify(), heappush() and heappop()

# importing "heapq" to implement heap queue

import heapq
# initializing list

li = [5, 7, 9, 1, 3]

# using heapify to convert list into heap

heapq.heapify(li)

# printing created heap

print ("The created heap is : ",end="")

print (list(li))

# using heappush() to push elements into heap

# pushes 4

heapq.heappush(li,4)

# printing modified heap

print ("The modified heap after push is : ",end="")

print (list(li))

# using heappop() to pop smallest element

print ("The popped and smallest element is : ",end="")

print (heapq.heappop(li))

Output:
The created heap is : [1, 3, 9, 7, 5]
The modified heap after push is : [1, 3, 4, 7, 5, 9]
The popped and smallest element is : 1
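Since tuples compare element by element, pushing (priority, item) pairs turns the heap into a simple priority queue. The priorities and task names in this sketch are made up for illustration.

```python
import heapq

tasks = []

# Lower numbers are served first because heapq is a min heap
heapq.heappush(tasks, (2, 'write report'))
heapq.heappush(tasks, (1, 'fix bug'))
heapq.heappush(tasks, (3, 'refactor'))

order = []
while tasks:
    priority, name = heapq.heappop(tasks)
    order.append(name)

print(order)
```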
• heappushpop(heap, ele) :- This function combines a push and a pop in one call, which is
more efficient than calling heappush() followed by heappop(). The element is pushed first
and then the smallest element is popped, so the pushed element itself is returned if it is
the smallest. Heap order is maintained after this operation.
• heapreplace(heap, ele) :- This function also pops and pushes in one call, but in the
opposite order: the smallest element is popped first, and then the new element is pushed.
It therefore returns the smallest value originally in the heap regardless of the pushed
element, so the returned value can be larger than the pushed value.
# Python code to demonstrate working of

# heappushpop() and heapreplace()

# importing "heapq" to implement heap queue

import heapq

# initializing list 1

li1 = [5, 7, 9, 4, 3]

# initializing list 2

li2 = [5, 7, 9, 4, 3]

# using heapify() to convert list into heap

heapq.heapify(li1)

heapq.heapify(li2)

# using heappushpop() to push and pop items simultaneously

# pops 2

print ("The popped item using heappushpop() is : ",end="")

print (heapq.heappushpop(li1, 2))

# using heapreplace() to push and pop items simultaneously

# pops 3

print ("The popped item using heapreplace() is : ",end="")

print (heapq.heapreplace(li2, 2))

Output:
The popped item using heappushpop() is : 2
The popped item using heapreplace() is : 3
• nlargest(k, iterable, key = fun) :- This function returns a list of the k largest elements
from the iterable, largest first. If key is given, elements are compared by key(element).
• nsmallest(k, iterable, key = fun) :- This function returns a list of the k smallest
elements from the iterable, smallest first. If key is given, elements are compared by
key(element).

# Python code to demonstrate working of

# nlargest() and nsmallest()

# importing "heapq" to implement heap queue

import heapq

# initializing list

li1 = [6, 7, 9, 4, 3, 5, 8, 10, 1]

# using heapify() to convert list into heap

heapq.heapify(li1)

# using nlargest to print 3 largest numbers

# prints 10, 9 and 8

print("The 3 largest numbers in list are : ",end="")

print(heapq.nlargest(3, li1))

# using nsmallest to print 3 smallest numbers

# prints 1, 3 and 4

print("The 3 smallest numbers in list are : ",end="")

print(heapq.nsmallest(3, li1))

Output:
The 3 largest numbers in list are : [10, 9, 8]
The 3 smallest numbers in list are : [1, 3, 4]
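The key argument is useful when the elements are records rather than plain numbers. This sketch ranks some made-up (name, score) pairs by score; the input does not need to be a heap.

```python
import heapq

# Hypothetical (name, score) records
scores = [('Asha', 82), ('Ravi', 95), ('Meera', 67), ('John', 74)]

# Compare the records by their score, the second tuple item
top_two = heapq.nlargest(2, scores, key=lambda item: item[1])
lowest = heapq.nsmallest(1, scores, key=lambda item: item[1])

print(top_two)
print(lowest)
```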

Collections.UserDict in Python

A dictionary in Python is a collection of key:value pairs, used to store data like a map.
Unlike other data types that hold only a single value as an element, a dictionary holds
key:value pairs, so that a value can be looked up efficiently by its key.
Collections.UserDict
Python supports a dictionary-like container called UserDict, present in the collections
module. This class acts as a wrapper class around dictionary objects. It is useful when one
wants to create a dictionary of their own with some modified or new functionality. It can be
considered a way of adding new behaviors to the dictionary. The class takes an optional
dictionary as an argument, copies its contents into a regular dictionary, and simulates a
dictionary on top of it. That underlying dictionary is accessible through the data attribute
of the class.
Syntax: collections.UserDict([initialdata])
Example 1:

# Python program to demonstrate

# userdict

from collections import UserDict

d = {'a':1,

'b': 2,

'c': 3}

# Creating an UserDict

userD = UserDict(d)

print(userD.data)

# Creating an empty UserDict

userD = UserDict()

print(userD.data)

Output:
{'a': 1, 'b': 2, 'c': 3}
{}
Example 2: Let’s create a class inheriting from UserDict to implement a customized
dictionary.

# Python program to demonstrate

# userdict

from collections import UserDict

# Creating a Dictionary where

# deletion is not allowed

class MyDict(UserDict):

    # Function to stop deletion (del d[key])
    # from dictionary
    def __delitem__(self, key):
        raise RuntimeError("Deletion not allowed")

    # Function to stop pop from
    # dictionary
    def pop(self, s = None):
        raise RuntimeError("Deletion not allowed")

    # Function to stop popitem
    # from Dictionary
    def popitem(self, s = None):
        raise RuntimeError("Deletion not allowed")

# Driver's code
d = MyDict({'a':1,
            'b': 2,
            'c': 3})

print("Original Dictionary")
print(d)

d.pop(1)

Output:
Original Dictionary
{'a': 1, 'b': 2, 'c': 3}
Traceback (most recent call last):
  ...
RuntimeError: Deletion not allowed
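One reason to subclass UserDict is that its constructor and update() store items through
__setitem__, so a single override covers every way of adding items. The following is a
minimal sketch; the class name LowerKeyDict and the key-lowering rule are hypothetical
and only serve as an illustration.

```python
from collections import UserDict

# Hypothetical example class: stores every string key in lower case
class LowerKeyDict(UserDict):

    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = key.lower()
        super().__setitem__(key, value)

d = LowerKeyDict({"A": 1})   # the constructor goes through __setitem__
d.update({"B": 2})           # update() goes through __setitem__
d["C"] = 3                   # item assignment goes through __setitem__

print(d.data)
```

All three ways of adding an item end up with a lower-case key in the underlying data
dictionary.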

Collections.UserList in Python

Python lists are array-like data structures, but unlike arrays they can be heterogeneous. A
single list may contain data types like integers, strings, as well as objects. Lists in Python
are ordered and have a definite count. The elements in a list are indexed according to a
definite sequence, with 0 being the first index.
Collections.UserList
Python provides a list-like container called UserList in the collections module. This class
acts as a wrapper around list objects. It is useful when one wants to create a list of their
own with some modified or new functionality, and can be considered a way of adding new
behaviors to the list. The class takes a list instance as an argument and simulates a list
whose contents are kept in a regular list. That list is accessible through the data attribute
of this class.
Syntax: collections.UserList([list])
Example 1:

# Python program to demonstrate

# userlist

from collections import UserList

L = [1, 2, 3, 4]

# Creating a userlist

userL = UserList(L)

print(userL.data)

# Creating empty userlist

userL = UserList()

print(userL.data)

Output:

[1, 2, 3, 4]
[]
Example 2:

# Python program to demonstrate

# userlist

from collections import UserList

# Creating a List where

# deletion is not allowed

class MyList(UserList):

    # Function to stop deletion
    # from List
    def remove(self, s = None):
        raise RuntimeError("Deletion not allowed")

    # Function to stop pop from
    # List
    def pop(self, s = None):
        raise RuntimeError("Deletion not allowed")

# Driver's code
L = MyList([1, 2, 3, 4])

print("Original List")

# Inserting to List
L.append(5)

print("After Insertion")
print(L)

# Deleting From List
L.remove()

Output:

Original List
After Insertion
[1, 2, 3, 4, 5]

Traceback (most recent call last):

File "/home/9399c9e865a7493dce58e88571472d23.py", line 33, in


L.remove()
File "/home/9399c9e865a7493dce58e88571472d23.py", line 15, in remove

raise RuntimeError("Deletion not allowed")

RuntimeError: Deletion not allowed
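Besides blocking operations, a UserList subclass can add new behavior on top of the regular
list kept in data. A minimal sketch, assuming a hypothetical NumberList class with an
average() method (neither is part of the collections module):

```python
from collections import UserList

# Hypothetical example class: a list with one extra method
class NumberList(UserList):

    def average(self):
        # self.data is the underlying regular list
        if not self.data:
            return 0.0
        return sum(self.data) / len(self.data)

nums = NumberList([2, 4, 6])
nums.append(8)          # inherited list behavior still works

print(nums.data)
print(nums.average())
```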

Collections.UserString in Python

In Python, strings are immutable sequences of Unicode characters. Python has no separate
character data type; a character is simply a string of length one.

Example:

# Python program to demonstrate

# string

# Creating a String

# with single Quotes

String1 = 'Welcome to the Geeks World'

print("String with the use of Single Quotes: ")

print(String1)

# Creating a String

# with double Quotes

String1 = "I'm a Geek"

print("\nString with the use of Double Quotes: ")

print(String1)

Output:
String with the use of Single Quotes:
Welcome to the Geeks World

String with the use of Double Quotes:


I'm a Geek

Collections.UserString
Python provides a string-like container called UserString in the collections module. This
class acts as a wrapper around string objects. It is useful when one wants to create a string
of their own with some modified or new functionality, and can be considered a way of adding
new behaviors to the string. The class takes any argument that can be converted to a string
and simulates a string whose content is kept in a regular string. That string is accessible
through the data attribute of this class.

Syntax: collections.UserString(seq)
Example 1:

# Python program to demonstrate

# userstring

from collections import UserString

d = 12344

# Creating a UserString
userS = UserString(d)

print(userS.data)

# Creating an empty UserString
userS = UserString("")

print(userS.data)

Output:
12344
Example 2:

# Python program to demonstrate

# userstring

from collections import UserString

# Creating a Mutable String


class Mystring(UserString):

    # Function to append to
    # string
    def append(self, s):
        self.data += s

    # Function to remove from
    # string
    def remove(self, s):
        self.data = self.data.replace(s, "")

# Driver's code
s1 = Mystring("Geeks")

print("Original String:", s1.data)

# Appending to string
s1.append("s")

print("String After Appending:", s1.data)

# Removing from string
s1.remove("e")

print("String after Removing:", s1.data)

Output:
Original String: Geeks
String After Appending: Geekss
String after Removing: Gkss
Interacting with the OS
The OS module in Python provides functions for interacting with the operating system. OS
comes under Python’s standard utility modules. This module provides a portable way of using
operating system-dependent functionality. The *os* and *os.path* modules include many
functions to interact with the file system.

Handling the Current Working Directory

Think of the Current Working Directory (CWD) as the folder in which Python is operating.
Whenever a file is referred to by its name only, Python looks for it in the CWD, which means
that a name-only reference will succeed only if the file is in Python’s CWD.
Note: The current working directory is the folder from which the Python script is run. It is
not necessarily the folder where the script file is located.
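The rule above can be checked directly: a name-only reference is resolved against the CWD.
This is a minimal sketch, and the file name data.txt is a hypothetical placeholder that does
not need to exist.

```python
import os

# Hypothetical file name, used only to show how the path is resolved
relative_name = "data.txt"

# abspath() turns a name-only reference into a full path based on the CWD
resolved = os.path.abspath(relative_name)

print("CWD          :", os.getcwd())
print("Resolved path:", resolved)
print("Inside CWD   :", resolved == os.path.join(os.getcwd(), relative_name))
```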
Getting the Current working directory
To get the location of the current working directory os.getcwd() is used.
Example:

# Python program to explain os.getcwd() method

# importing os module

import os

# Get the current working

# directory (CWD)

cwd = os.getcwd()

# Print the current working

# directory (CWD)

print("Current working directory:", cwd)

Output:
Current working directory: /home/nikhil/Desktop/gfg
Changing the Current working directory
To change the current working directory(CWD) os.chdir() method is used. This method
changes the CWD to a specified path. It only takes a single argument as a new directory path.
Note: The current working directory is the folder in which the Python script is operating.
Example:

# Python program to change the

# current working directory

import os

# Function to print the current
# working directory with a label
def current_path(label):
    print("Current working directory", label)
    print(os.getcwd())
    print()

# Driver's code
# Printing CWD before
current_path("before")

# Changing the CWD
os.chdir('../')

# Printing CWD after
current_path("after")

Output:
Current working directory before
C:\Users\Nikhil Aggarwal\Desktop\gfg

Current working directory after


C:\Users\Nikhil Aggarwal\Desktop
Creating a Directory

There are different methods available in the OS module for creating a directory. These are –
• os.mkdir()
• os.makedirs()
Using os.mkdir()
os.mkdir() method in Python is used to create a directory named path with the specified
numeric mode. This method raises FileExistsError if the directory to be created already exists.
Example:

# Python program to explain os.mkdir() method

# importing os module

import os

# Directory

directory = "GeeksforGeeks"

# Parent Directory path

parent_dir = "D:/Pycharm projects/"

# Path

path = os.path.join(parent_dir, directory)

# Create the directory
# 'GeeksforGeeks' in
# 'D:/Pycharm projects/'

os.mkdir(path)

print("Directory '% s' created" % directory)

# Directory

directory = "Geeks"
# Parent Directory path

parent_dir = "D:/Pycharm projects"

# mode

mode = 0o666

# Path

path = os.path.join(parent_dir, directory)

# Create the directory
# 'Geeks' in
# 'D:/Pycharm projects'
# with mode 0o666

os.mkdir(path, mode)

print("Directory '% s' created" % directory)

Output:
Directory 'GeeksforGeeks' created
Directory 'Geeks' created
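The FileExistsError mentioned above can be handled with try/except. The sketch below runs
inside a temporary directory, which is an assumption made so that it does not depend on the
D: drive used in the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as parent_dir:
    path = os.path.join(parent_dir, "GeeksforGeeks")

    # First call creates the directory
    os.mkdir(path)
    created = os.path.isdir(path)

    # Second call on the same path raises FileExistsError
    try:
        os.mkdir(path)
        second_attempt = "created"
    except FileExistsError:
        second_attempt = "already exists"

print(created, second_attempt)
```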
Using os.makedirs()
os.makedirs() method in Python is used to create a directory recursively. That means while
making leaf directory if any intermediate-level directory is missing, os.makedirs() method will
create them all.
Example:

# Python program to explain os.makedirs() method

# importing os module

import os
# Leaf directory

directory = "Nikhil"

# Parent Directories

parent_dir = "D:/Pycharm projects/GeeksForGeeks/Authors"

# Path

path = os.path.join(parent_dir, directory)

# Create the directory

# 'Nikhil'

os.makedirs(path)

print("Directory '% s' created" % directory)

# Directories 'GeeksForGeeks' and 'Authors' will
# be created too
# if they do not exist

# Leaf directory

directory = "c"

# Parent Directories

parent_dir = "D:/Pycharm projects/GeeksforGeeks/a/b"

# mode

mode = 0o666

path = os.path.join(parent_dir, directory)

# Create the directory 'c'


os.makedirs(path, mode)

print("Directory '% s' created" % directory)

# 'GeeksforGeeks', 'a', and 'b'
# will also be created if
# they do not exist

# If any intermediate-level
# directory is missing,
# os.makedirs() will
# create it

# os.makedirs() can be
# used to create a directory tree

Output:
Directory 'Nikhil' created
Directory 'c' created
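Like os.mkdir(), os.makedirs() raises FileExistsError if the leaf directory already exists,
unless exist_ok=True is passed. A minimal sketch, again using a temporary directory as an
assumed stand-in for the paths above:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "a", "b", "c")

    # Creates 'a', 'a/b' and 'a/b/c' in one call
    os.makedirs(path)

    # Calling it again is harmless when exist_ok=True
    os.makedirs(path, exist_ok=True)

    levels = [
        os.path.isdir(os.path.join(root, "a")),
        os.path.isdir(os.path.join(root, "a", "b")),
        os.path.isdir(path),
    ]

print(levels)
```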
Listing out Files and Directories with Python
os.listdir() method in Python is used to get the list of all files and directories in the specified
directory. If we don’t specify any directory, then the list of files and directories in the current
working directory will be returned.
Example:

# Python program to explain os.listdir() method

# importing os module

import os

# Get the list of all files and directories

# in the root directory

path = "/"

dir_list = os.listdir(path)
print("Files and directories in '", path, "' :")

# print the list

print(dir_list)

Output:
Files and directories in ' / ' :
['sys', 'run', 'tmp', 'boot', 'mnt', 'dev', 'proc', 'var', 'bin',
'lib64', 'usr',
'lib', 'srv', 'home', 'etc', 'opt', 'sbin', 'media']
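os.listdir() returns only names, without saying which entries are files and which are
directories. A small sketch that separates the two with os.path.isdir() and os.path.isfile();
the names docs and notes.txt are hypothetical and are created in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny directory tree to list
    os.mkdir(os.path.join(root, "docs"))
    with open(os.path.join(root, "notes.txt"), "w") as f:
        f.write("hello")

    names = sorted(os.listdir(root))

    # listdir() gives bare names, so join them with the parent path
    dirs = [n for n in names if os.path.isdir(os.path.join(root, n))]
    files = [n for n in names if os.path.isfile(os.path.join(root, n))]

print("Directories:", dirs)
print("Files      :", files)
```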

Deleting Directory or Files using Python

The OS module provides different methods for removing directories and files in Python. These are–
• Using os.remove()
• Using os.rmdir()
Using os.remove()
os.remove() method in Python is used to remove or delete a file path. This method can not
remove or delete a directory. If the specified path is a directory then OSError will be raised by
the method.
Example: Suppose the file contained in the folder are:

# Python program to explain os.remove() method

# importing os module

import os
# File name

file = 'file1.txt'

# File location

location = "D:/Pycharm projects/GeeksforGeeks/Authors/Nikhil/"

# Path

path = os.path.join(location, file)

# Remove the file
# 'file1.txt'
os.remove(path)

Output:

Using os.rmdir()
os.rmdir() method in Python is used to remove or delete an empty directory. OSError will be
raised if the specified path is not an empty directory.
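A minimal sketch of that behavior, run in a temporary directory (an assumption, so that
nothing outside it is touched): os.rmdir() fails while the directory still contains a file
and succeeds once the file has been removed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "Geeks")
    os.mkdir(target)

    # Put one file inside so the directory is not empty
    file_path = os.path.join(target, "file1.txt")
    with open(file_path, "w") as f:
        f.write("data")

    try:
        os.rmdir(target)
        first_attempt = "removed"
    except OSError:
        first_attempt = "not empty"

    # Empty the directory, then remove it
    os.remove(file_path)
    os.rmdir(target)
    still_there = os.path.exists(target)

print(first_attempt, still_there)
```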
Example: Suppose the directories are
# Python program to explain os.rmdir() method

# importing os module

import os

# Directory name

directory = "Geeks"

# Parent Directory

parent = "D:/Pycharm projects/"

# Path

path = os.path.join(parent, directory)

# Remove the Directory

# "Geeks"

os.rmdir(path)

Output:

Commonly Used Functions

os.name: This attribute gives the name of the operating system dependent module imported.
The following names are currently registered in Python 3: ‘posix’, ‘nt’ and ‘java’.

import os

print(os.name)
Output:
posix
Note: It may give different output on different interpreters, such as ‘posix’ when you run the
code here.
os.error: All functions in this module raise OSError in the case of invalid or inaccessible file
names and paths, or other arguments that have the correct type, but are not accepted by the
operating system. os.error is an alias for built-in OSError exception.

import os

try:
    # If the file does not exist,
    # open() raises an OSError
    # (IOError is an alias of OSError in Python 3)
    filename = 'GFG.txt'
    f = open(filename, 'r')
    text = f.read()
    f.close()

# Control jumps directly to here if
# any of the above lines raises OSError.
except OSError:
    # print(os.error) would print <class 'OSError'>
    print('Problem reading: ' + filename)

# In any case, the code then continues with
# the line after the try/except

Output:
Problem reading: GFG.txt

os.popen(): This method opens a pipe to or from a shell command. The return value is a
file-like object connected to the pipe, which can be read or written depending on whether the
mode is ‘r’ or ‘w’.
Syntax: os.popen(cmd, mode='r', buffering=-1)
The mode and buffering parameters are optional; if mode is not provided, the default ‘r’ is
used.

import os

fd = "GFG.txt"

# open() works on a file
file = open(fd, 'w')
file.write("Hello")
file.close()

file = open(fd, 'r')
text = file.read()
print(text)
file.close()

# popen() works on a command: it runs the command in a
# shell and provides a pipe to its input or output
pipe = os.popen("echo Hello from popen", 'r')
print(pipe.read().strip())
pipe.close()

Output:
Hello
Hello from popen
Note: With mode ‘r’ the output of the command is read through the pipe; with mode ‘w’ the
data written to the pipe is sent to the standard input of the command.

os.close(): Closes the file descriptor fd. os.close() works on low-level integer file
descriptors, such as those returned by os.open(). A file object returned by open() or
os.popen() must be closed with its own close() method. If we try to close a file object
opened with open() using os.close(), Python raises TypeError.

import os

fd = "GFG.txt"

file = open(fd, 'r')

text = file.read()

print(text)

os.close(file)
Output:
Traceback (most recent call last):
File "C:\Users\GFG\Desktop\GeeksForGeeksOSFile.py", line 6, in
os.close(file)
TypeError: an integer is required (got type _io.TextIOWrapper)
Note: The same error may not be thrown, due to the non-existent file or permission privilege.
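For contrast, the sketch below shows where os.close() does apply: it pairs with os.open(),
which returns an integer file descriptor instead of a file object. The temporary directory
is an assumption made to keep the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "GFG.txt")

    # os.open() returns an integer file descriptor
    fd = os.open(path, os.O_WRONLY | os.O_CREAT)
    os.write(fd, b"Hello")     # os.write() expects bytes
    os.close(fd)               # matching low-level close

    # A file object from open() is closed by its own close(),
    # here handled automatically by the with statement
    with open(path, "rb") as f:
        content = f.read()

print(type(fd).__name__, content)
```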

os.rename(): A file old.txt can be renamed to new.txt, using the function os.rename(). The
name of the file changes only if, the file exists and the user has sufficient privilege permission
to change the file.

import os

fd = "GFG.txt"

os.rename(fd,'New.txt')

os.rename(fd,'New.txt')

Output:
Traceback (most recent call last):
File "C:\Users\GFG\Desktop\ModuleOS\GeeksForGeeksOSFile.py", line
3, in
os.rename(fd,'New.txt')
FileNotFoundError: [WinError 2] The system cannot find the
file specified: 'GFG.txt' -> 'New.txt'
Understanding the Output: A file name “GFG.txt” exists, thus when os.rename() is used the
first time, the file gets renamed. Upon calling the function os.rename() second time, file
“New.txt” exists and not “GFG.txt” thus Python throws FileNotFoundError.
os.remove(): Using the Os module we can remove a file in our system using the remove()
method. To remove a file we need to pass the name of the file as a parameter.

import os #importing os module.

os.remove("file_name.txt") #removing the file.

The OS module provides a layer of abstraction between us and the operating system. Code
that uses the os module can run on any operating system, but paths differ between systems,
so specify the path carefully (an absolute path is safest) and adjust it for the system the code
runs on. If you try to remove a file that does not exist, you will get FileNotFoundError.
os.path.exists(): This method checks whether a file exists; the name of the file is passed
as a parameter. The OS module has a sub-module named os.path, which provides many more
functions for working with paths.

import os # importing os module

# giving the name of the file as a parameter
result = os.path.exists("file_name")

print(result)

Output
False
Since the file in the above code does not exist, the output is False. If the file exists, the
output is True.
os.path.getsize(): In this method, python will give us the size of the file in bytes. To use this
method we need to pass the name of the file as a parameter.

import os #importing os module

size = os.path.getsize("filename")

print("Size of the file is", size," bytes.")

Output:
Size of the file is 192 bytes.
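Both functions are often used together, because os.path.getsize() raises an error for a path
that does not exist. A minimal sketch; the file name and its contents are hypothetical and
are created in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "file_name.txt")

    before = os.path.exists(path)      # file not created yet

    with open(path, "w") as f:
        f.write("Hello")               # 5 ASCII characters

    after = os.path.exists(path)

    # Ask for the size only when the file is there
    size = os.path.getsize(path) if after else None

print(before, after, size)
```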
Iterators in Python
An iterator in Python is an object that is used to iterate over iterable objects like lists,
tuples, dicts, and sets. The iterator object is initialized using the iter() function. It uses
the next() function for iteration.

• __iter__(): called for the initialization of an iterator. This returns an iterator object.
• __next__(): returns the next value for the iterable. When we use a for loop to traverse
any iterable object, it internally calls iter() to get an iterator object and then calls
next() on it repeatedly. This method raises StopIteration to signal the end of the
iteration.
How an iterator really works in Python:

# Here is an example of a Python built-in iterator

# value can be anything which can be iterated
iterable_value = 'Python'
iterable_obj = iter(iterable_value)

while True:
    try:
        # Iterate by calling next
        item = next(iterable_obj)
        print(item)
    except StopIteration:
        # raised when the iteration is over
        break

Output :
P
y
t
h
o
n

Below is a simple Python custom iterator that creates iterator type that iterates from 10 to a
given limit. For example, if the limit is 15, then it prints 10 11 12 13 14 15. And if the limit is
5, then it prints nothing.
# A simple Python program to demonstrate
# working of iterators using an example type
# that iterates from 10 to given value

# An iterable user defined type
class Test:

    # Constructor
    def __init__(self, limit):
        self.limit = limit

    # Creates iterator object
    # Called when iteration is initialized
    def __iter__(self):
        self.x = 10
        return self

    # To move to next element. In Python 3,
    # this method is named __next__
    def __next__(self):

        # Store current value of x
        x = self.x

        # Stop iteration if limit is reached
        if x > self.limit:
            raise StopIteration

        # Else increment and return old value
        self.x = x + 1
        return x

# Prints numbers from 10 to 15
for i in Test(15):
    print(i)

# Prints nothing
for i in Test(5):
    print(i)

Output :
10
11
12
13
14
15
In the following iterations, the for loop is internally(we can’t see it) using iterator object to
traverse over the iterables.

# Sample built-in iterators

# Iterating over a list
print("List Iteration")
l = ["Datascience", "with", "Python"]
for i in l:
    print(i)

# Iterating over a tuple (immutable)
print("\nTuple Iteration")
t = ("Python", "in", "Datascience")
for i in t:
    print(i)

# Iterating over a String
print("\nString Iteration")
s = "Python"
for i in s :
    print(i)

# Iterating over dictionary
print("\nDictionary Iteration")
d = dict()
d['xyz'] = 123
d['abc'] = 345
for i in d :
    print("%s %d" %(i, d[i]))

Output :
List Iteration
Datascience
with
Python

Tuple Iteration
Python
in
Datascience

String Iteration
P
y
t
h
o
n

Dictionary Iteration
xyz 123
abc 345

Iterator Functions in Python

Python also provides some interesting and useful iterator functions for efficient looping and
faster code. Many built-in iterators are available in the module “itertools”.
This module implements a number of iterator building blocks.
Some useful iterators:
• accumulate(iter, func) :- This iterator takes two arguments: the iterable target and a
function that is applied cumulatively to the values in target. If no function is
passed, addition takes place by default. If the input iterable is empty, the output
iterable will also be empty.
• chain(iter1, iter2..) :- This function yields all the values of the iterables passed as its
arguments, one after another.

# Python code to demonstrate the working of


# accumulate() and chain()

# importing "itertools" for iterator operations


import itertools

# importing "operator" for operator operations


import operator

# initializing list 1
li1 = [1, 4, 5, 7]

# initializing list 2
li2 = [1, 6, 5, 9]
# initializing list 3
li3 = [8, 10, 5, 4]

# using accumulate()
# prints the successive summation of elements
print ("The sum after each iteration is : ",end="")
print (list(itertools.accumulate(li1)))

# using accumulate()
# prints the successive multiplication of elements
print ("The product after each iteration is : ",end="")
print (list(itertools.accumulate(li1,operator.mul)))

# using chain() to print all elements of lists


print ("All values in mentioned chain are : ",end="")
print (list(itertools.chain(li1,li2,li3)))

Output:
The sum after each iteration is : [1, 5, 10, 17]
The product after each iteration is : [1, 4, 20, 140]
All values in mentioned chain are : [1, 4, 5, 7, 1, 6, 5, 9, 8, 10,
5, 4]
• chain.from_iterable() :- This function works like chain(), but takes a single argument:
a list of lists or any other iterable of iterables.
• compress(iter, selector) :- This iterator selectively picks values from the passed
container according to the boolean list passed as the other argument. The values
corresponding to boolean true are yielded; all others are skipped.

# Python code to demonstrate the working of


# chain.from_iterable() and compress()

# importing "itertools" for iterator operations


import itertools

# initializing list 1
li1 = [1, 4, 5, 7]

# initializing list 2
li2 = [1, 6, 5, 9]

# initializing list 3
li3 = [8, 10, 5, 4]
# initializing list of list
li4 = [li1, li2, li3]

# using chain.from_iterable() to print all elements of lists


print ("All values in mentioned chain are : ",end="")
print (list(itertools.chain.from_iterable(li4)))

# using compress() selectively print data values


print ("The compressed values in string are : ",end="")
print (list(itertools.compress('GEEKSFORGEEKS',[1,0,0,0,0,1,0,0,1,0,0,0,0])))

Output:
All values in mentioned chain are : [1, 4, 5, 7, 1, 6, 5, 9, 8, 10,
5, 4]
The compressed values in string are : ['G', 'F', 'G']
• dropwhile(func, seq) :- This iterator starts yielding values only after func returns
false for the first time; from then on, every remaining value is yielded.
• filterfalse(func, seq) :- As the name suggests, this iterator yields only the values for
which the passed function returns false.

# Python code to demonstrate the working of


# dropwhile() and filterfalse()

# importing "itertools" for iterator operations


import itertools

# initializing list
li = [2, 4, 5, 7, 8]

# using dropwhile() to start displaying after condition is false


print ("The values after condition returns false : ",end="")
print (list(itertools.dropwhile(lambda x : x%2==0,li)))

# using filterfalse() to print false values


print ("The values that return false to function are : ",end="")
print (list(itertools.filterfalse(lambda x : x%2==0,li)))

Output:
The values after condition returns false : [5, 7, 8]
The values that return false to function are : [5, 7]
Python __iter__() and __next__() | Converting an object into an iterator

In many instances, we need to access an object like an iterator. One way is to write a
generator loop, but that adds work and time for the programmer. Python eases this task by
providing the built-in function iter(), which calls the object's __iter__() method.

The iter() function returns an iterator for the given object (list, set, tuple, etc. or
custom objects). It creates an object that can be accessed one element at a time
using the __next__() method, which generally comes in handy when dealing with loops.
Syntax:
iter(object)
iter(callable, sentinel)
• Object: The object whose iterator has to be created. It can be a collection object like
list or tuple or a user-defined object (using OOPS).
• Callable, Sentinel: Callable represents a callable object, and sentinel is the value at
which the iteration is needed to be terminated, sentinel value represents the end of
sequence being iterated.
Exception:
If we call next() on the iterator after all the elements have been iterated, then StopIteration
is raised.
The iter() function returns an iterator object that goes through each element of the given
object. The next element can be accessed through the __next__() method. In the case of a
callable object and sentinel value, the callable is called repeatedly until it returns the
sentinel value. In any case, the original object is not modified.
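The two-argument form iter(callable, sentinel) is not covered by the code samples in this
section, so here is a minimal sketch of it. The in-memory stream and the chunk size of 4 are
assumptions chosen for illustration.

```python
import io

# An in-memory text stream stands in for a file
stream = io.StringIO("abcdefghij")

# The lambda is called again and again; iteration stops
# as soon as it returns the sentinel "" (end of stream)
chunks = list(iter(lambda: stream.read(4), ""))

print(chunks)   # ['abcd', 'efgh', 'ij']
```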
Code #1:

# Python code demonstrating

# basic use of iter()

listA = ['a','e','i','o','u']

iter_listA = iter(listA)

try:
    print( next(iter_listA))
    print( next(iter_listA))
    print( next(iter_listA))
    print( next(iter_listA))
    print( next(iter_listA))
    print( next(iter_listA)) # raises StopIteration
except StopIteration:
    pass

Output:
a
e
i
o
u
Code #2:

# Python code demonstrating

# basic use of iter()

lst = [11, 22, 33, 44, 55]

iter_lst = iter(lst)

while True:
    try:
        print(iter_lst.__next__())
    except StopIteration:
        break

Output:
11
22
33
44
55
Code #3:

# Python code demonstrating

# basic use of iter()

listB = ['Cat', 'Bat', 'Sat', 'Mat']

iter_listB = listB.__iter__()

try:
    print(iter_listB.__next__())
    print(iter_listB.__next__())
    print(iter_listB.__next__())
    print(iter_listB.__next__())
    print(iter_listB.__next__()) # raises StopIteration
except StopIteration:
    print(" \nThrowing 'StopIterationError'",
          "I cannot count more.")

Output:
Cat
Bat
Sat
Mat
Throwing 'StopIterationError' I cannot count more.
Code #4: User-defined objects (using OOPS)

# Python code showing use of iter() using OOPs

class Counter:

    def __init__(self, start, end):
        self.num = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.num > self.end:
            raise StopIteration
        else:
            self.num += 1
            return self.num - 1

# Driver code
if __name__ == '__main__' :

    a, b = 2, 5

    c1 = Counter(a, b)
    c2 = Counter(a, b)

    # Way 1-to print the range without iter()
    print ("Print the range without iter()")

    for i in c1:
        print ("Eating more Pizzas, counting ", i, end ="\n")

    print ("\nPrint the range using iter()\n")

    # Way 2- using iter()
    obj = iter(c2)

    try:
        while True: # Print till error raised
            print ("Eating more Pizzas, counting ", next(obj))
    except StopIteration:
        # when StopIteration raised, Print custom message
        print ("\nDead on overfood, GAME OVER")

Output:
Print the range without iter()
Eating more Pizzas, counting 2
Eating more Pizzas, counting 3
Eating more Pizzas, counting 4
Eating more Pizzas, counting 5

Print the range using iter()

Eating more Pizzas, counting 2


Eating more Pizzas, counting 3
Eating more Pizzas, counting 4
Eating more Pizzas, counting 5

Dead on overfood, GAME OVER

Python | Difference between iterable and iterator

Iterable is an object, that one can iterate over. It generates an Iterator when passed to iter()
method. An iterator is an object, which is used to iterate over an iterable object using the
__next__() method. Iterators have the __next__() method, which returns the next item of the
object. Note that every iterator is also an iterable, but not every iterable is an iterator. For
example, a list is iterable but a list is not an iterator. An iterator can be created from an
iterable by using the function iter(). To make this possible, the class of an object needs either
a method __iter__, which returns an iterator, or a __getitem__ method with sequential
indexes starting with 0.
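These distinctions can be checked in code. The sketch below is only an illustration: the
helper is_iterable() and the class Squares are hypothetical names introduced here.

```python
from collections.abc import Iterable, Iterator

# Hypothetical helper: True if iter() accepts the object
def is_iterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:
        return False

# Hypothetical class with only __getitem__ and indexes starting at 0
class Squares:
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index

numbers = [1, 2, 3]

# A list is iterable but not an iterator
print(isinstance(numbers, Iterable), isinstance(numbers, Iterator))

# iter() turns it into an iterator, which is itself iterable
it = iter(numbers)
print(isinstance(it, Iterator), isinstance(it, Iterable))

# __getitem__ alone is enough for iter() to work
print(is_iterable(Squares()), list(Squares()))

# An integer is not iterable
print(is_iterable(42))
```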
Code #1

# code
next("APTECH")

Output:
Traceback (most recent call last):
File "/home/1c9622166e9c268c0d67cd9ba2177142.py", line 2, in <module>
next("APTECH")
TypeError: 'str' object is not an iterator
We know that str is iterable, but it is not an iterator. Running a for loop over the string
works, however, because the for loop first converts the string into an iterator and then
executes the code.

# code
s="APTECH"
s=iter(s)
next(s)

Here iter() converts s, which is a string (iterable), into an iterator, and the first call
to next() returns A. We can call next() multiple times to iterate over the string.
When a for loop is executed, for statement calls iter() on the object, which it is supposed to
loop over. If this call is successful, the iter call will return an iterator object that defines the
method __next__(), which accesses elements of the object one at a time. The __next__()
method will raise a StopIteration exception if there are no further elements available. The for
loop will terminate as soon as it catches a StopIteration exception. Let’s call the __next__()
method using the next() built-in function.
Code #2: Creating an iterator from a list with iter() and stepping through it with next().

# list of cities
cities = ["Bengaluru", "New Delhi", "Mumbai"]

# initialize the object


iterator_obj = iter(cities)

print(next(iterator_obj))
print(next(iterator_obj))
print(next(iterator_obj))

Output:
Bengaluru
New Delhi
Mumbai
Note: If ‘next(iterator_obj)’ is called one more time, it would raise ‘StopIteration’.
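A minimal sketch of that note, reusing the cities list from Code #2. It also shows the
optional second argument of next(), which supplies a default value instead of raising.

```python
cities = ["Bengaluru", "New Delhi", "Mumbai"]
iterator_obj = iter(cities)

# Consume all three elements
for _ in range(3):
    next(iterator_obj)

# A fourth call raises StopIteration
try:
    next(iterator_obj)
    outcome = "got a value"
except StopIteration:
    outcome = "StopIteration raised"

# With a default, next() returns it instead of raising
fallback = next(iterator_obj, None)

print(outcome, fallback)
```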
Python Debugger – Python pdb
Debugging in Python is facilitated by pdb module(python debugger) which comes built-in to
the Python standard library. It is actually defined as the class Pdb which internally makes use
of bdb(basic debugger functions) and cmd(support for line-oriented command interpreters)
modules. The major advantage of pdb is it runs purely in the command line thereby making it
great for debugging code on remote servers when we don’t have the privilege of a GUI-based
debugger.
pdb supports-

• Setting breakpoints
• Stepping through code
• Source code listing
• Viewing stack traces

Starting Python Debugger

There are several ways to invoke a debugger

• To start debugging within the program, just insert the import pdb and
pdb.set_trace() statements. Run your script normally and execution will stop where
we have introduced a breakpoint. So basically we are hard-coding a breakpoint on the
line below the call to set_trace(). With Python 3.7 and later versions, there is a
built-in function called breakpoint() which works in the same manner. Refer to the
following example on how to insert the set_trace() function.
Example 1: Addition of two numbers
Intentional error: As input() returns a string, the program concatenates those strings instead
of adding the input numbers

import pdb

def addition(a, b):
    answer = a + b
    return answer

pdb.set_trace()

x = input("Enter first number : ")
y = input("Enter second number : ")

sum = addition(x, y)
print(sum)

Output:

set_trace

In the output on the first line after the angle bracket, we have the directory path of our
file, line number where our breakpoint is located, and <module>. It’s basically saying that we
have a breakpoint in exppdb.py on line number 10 at the module level. If you introduce the
breakpoint inside the function then its name will appear inside <>. The next line is showing
the code line where our execution is stopped. That line is not executed yet. Then we have
the pdb prompt. Now to navigate the code we can use the following commands:

Command   Function
help      To display all commands
where     Display the stack trace and line number of the current line
next      Execute the current line and move to the next line ignoring function calls
step      Step into functions called at the current line
Now, to check the type of a variable, just write whatis followed by the variable name. For
the program above, the type of x is reported as <class 'str'>. Thus converting the strings
to int in our program will resolve the error.
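The fix can be sketched as follows. To keep the sketch non-interactive, the strings "2" and
"3" are assumed stand-ins for what input() would return.

```python
def addition(a, b):
    answer = a + b
    return answer

# Stand-ins for the strings returned by input()
x = "2"
y = "3"

# Bug: strings are concatenated
buggy = addition(x, y)

# Fix: convert to int before adding
fixed = addition(int(x), int(y))

print(repr(buggy), fixed)   # '23' 5
```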
Example 2:
• From the Command Line: It is the easiest way of using a debugger. You just have to
run the following command in terminal
python -m pdb exppdb.py (put your file name instead of exppdb.py)
This statement loads your source code and stops execution on the first line of code.
Example 3:

def addition(a, b):
    answer = a + b
    return answer

x = input("Enter first number : ")
y = input("Enter second number : ")

sum = addition(x, y)
print(sum)

Output:

command_line

• Post-mortem debugging means entering debug mode after the program has already
failed. pdb supports post-mortem debugging through the pm() and post_mortem()
functions. These functions look for the active traceback and start the debugger at the
line in the call stack where the exception occurred. In the output of the given example
you can see the pdb prompt appear when the exception is encountered in the program.
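post_mortem() can also be called explicitly from an exception handler. The sketch below is one possible arrangement, not the method used in the example that follows: the run() wrapper and its debug flag are hypothetical names introduced only for illustration.

```python
import pdb
import sys

def multiply(a, b):
    answer = a * b
    return answer

def run(a, b, debug=False):
    """Hypothetical wrapper: call multiply() and optionally debug a failure."""
    try:
        return multiply(a, b)
    except TypeError:
        if debug:
            # open the debugger at the frame where the exception was raised
            pdb.post_mortem(sys.exc_info()[2])
        return None

# two strings cannot be multiplied, so multiply() raises TypeError
outcome = run("3", "4")  # debug=False: no prompt, the failure is just reported as None
print(outcome)
```

Calling run("3", "4", debug=True) instead would drop into the pdb prompt inside multiply(), where a and b can be inspected.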
Example 4:

def multiply(a, b):
    answer = a * b
    return answer

x = input("Enter first number : ")
y = input("Enter second number : ")
result = multiply(x, y)
print(result)

Output:
Checking variables on the Stack
All the variables, including those local to the function being executed as well as the
globals, are maintained on the stack. We can use args (or a) to print all the
arguments of the function which is currently active. The p command evaluates an expression given as
an argument and prints the result.
Here, Example 4 of this article is executed in debugging mode to show you how to check
variables:

[Figure: checking_variable_values output]
Python pdb Breakpoint
While working with large programs we often want to add a number of breakpoints where we
know errors might occur. To do this you just have to use the break command. When you
insert a breakpoint, the debugger assigns a number to it, starting from 1. Use break without
arguments to display all the breakpoints in the program.

Syntax: break filename:lineno, condition


Given below is the implementation to add breakpoints in a program used for example 4.

[Figure: Adding_breakpoints output]

Managing Breakpoints
After adding breakpoints, we can manage them through the numbers assigned to them, using
the enable, disable and clear commands. disable tells the debugger not to stop when that
breakpoint is reached, enable turns a disabled breakpoint back on, and clear removes the
breakpoint altogether.
Given below is the implementation to manage breakpoints using Example 4.

[Figure: Manage_breakpoints output]

Automated software testing with Python


Software testing is the process in which a developer ensures that the actual output of the
software matches the desired output by providing some test inputs to the software.
Software testing is an important step because, if performed properly, it can help the developer
find bugs in the software in very little time.
Software testing can be divided into two classes, manual testing and automated testing.
Automated testing is the execution of your tests using a script instead of a human. In this
article, we’ll discuss some of the methods of automated software testing with Python.
Let’s write a simple application over which we will perform all the tests.
class Square:

    def __init__(self, side):
        """ creates a square having the given side """
        self.side = side

    def area(self):
        """ returns area of the square """
        return self.side**2

    def perimeter(self):
        """ returns perimeter of the square """
        return 4 * self.side

    def __repr__(self):
        """ declares how a Square object should be printed """
        s = 'Square with side = ' + str(self.side) + '\n' + \
            'Area = ' + str(self.area()) + '\n' + \
            'Perimeter = ' + str(self.perimeter())
        return s

if __name__ == '__main__':
    # read input from the user
    side = int(input('enter the side length to create a Square: '))
    # create a square with the provided side
    square = Square(side)
    # print the created square
    print(square)

Now that we have our software ready, let’s have a look at the directory structure of our
project folder and after that, we’ll start testing our software.
---Software_Testing
   |--- __init__.py (to initialize the directory as python package)
   |--- app.py (our software)
   |--- tests (folder to keep all test files)
        |--- __init__.py

The ‘unittest’ module

One of the major problems with manual testing is that it requires time and effort. In manual
testing, we test the application over some input, if it fails, either we note it down or we debug
the application for that particular test input, and then we repeat the process.
With unittest, all the test inputs can be provided at once and then you can test your
application. In the end, you get a detailed report with all the failed test cases clearly specified,
if any.
The unittest module has both a built-in testing framework and a test runner. A testing
framework is a set of rules which must be followed while writing test cases, while a test runner
is a tool which executes these tests with a bunch of settings, and collects the results.
Installation: unittest is part of the Python standard library, so it is available with every
Python installation and does not need to be installed separately.
Use: We write the tests in a Python module (.py). To run our tests, we simply execute the test
module using any IDE or terminal.
Now, let’s write some tests for our small software discussed above using
the unittest module.

• Create a file named tests.py in the folder named “tests”.


• In tests.py import unittest.
• Create a class named TestClass which inherits from the class unittest.TestCase.
Rule 1: All the tests are written as the methods of a class, which must inherit from the
class unittest.TestCase.
• Create a test method as shown below.
Rule 2: Name of each and every test method should start with “test” otherwise it’ll be
skipped by the test runner.
def test_area(self):
    # testing the method Square.area().
    sq = Square(2)  # creates a Square of side 2 units.
    # test if the area of the above square is 4 units,
    # display an error message if it's not.
    self.assertEqual(sq.area(), 4,
                     f'Area is shown {sq.area()} for side = {sq.side} units')

• Rule 3: We use special assertEqual() statements instead of the built-in assert
statements available in Python.
• The first argument of assertEqual() is the actual output, the second argument is the
desired output and the third argument is the error message which would be displayed
in case the two values differ from each other (test fails).
• To run the tests we just defined, we need to call the method unittest.main(), add the
following lines in the “tests.py” module.

if __name__ == '__main__':
    unittest.main()

• Because of these lines, as soon as you run the script “tests.py”, the
function unittest.main() is called and all the tests are executed.
Finally the “tests.py” module should resemble the code given below.

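import unittest

# Relative import: this needs the file to be run as part of the package,
# for example (an assumption about the layout shown above) from the parent
# directory of Software_Testing: python -m Software_Testing.tests.tests
from .. import app

class TestSum(unittest.TestCase):

    def test_area(self):
        sq = app.Square(2)
        self.assertEqual(sq.area(), 4,
                         f'Area is shown {sq.area()} rather than 4')

if __name__ == '__main__':
    unittest.main()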

Having written our test cases let us now test our application for any bugs. To test your
application you simply need to execute the test file “tests.py” using the command prompt or
any IDE of your choice. The output should be something like this.
.
-------------------------------------------------------------------
Ran 1 test in 0.000s
OK
In the first line, a .(dot) represents a successful test while an ‘F’ would represent a failed test
case. The OK message, in the end, tells us that all the tests were passed successfully.
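Rule 2 can be checked without running anything, by asking the loader which method names it would collect. A minimal sketch, assuming a trimmed stand-in for Square and a hypothetical class name TestNaming:

```python
import unittest

class Square:
    """Trimmed stand-in for app.Square so the snippet is self-contained."""
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

class TestNaming(unittest.TestCase):  # hypothetical class name
    def test_area(self):
        self.assertEqual(Square(2).area(), 4)

    def check_perimeter(self):
        # never collected: the name does not start with "test"
        self.fail("this method is not run")

# names of the methods the default loader treats as tests
collected = list(unittest.TestLoader().getTestCaseNames(TestNaming))
print(collected)
```

Only test_area is listed, so check_perimeter() would be silently skipped by the test runner.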
Let’s add a few more tests in “tests.py” and retest our application.

import unittest

# Relative import: run this file as part of the package (see the layout above)
from .. import app

class TestSum(unittest.TestCase):

    def test_area(self):
        sq = app.Square(2)
        self.assertEqual(sq.area(), 4,
                         f'Area is shown {sq.area()} rather than 4')

    def test_area_negative(self):
        sq = app.Square(-3)
        self.assertEqual(sq.area(), -1,
                         f'Area is shown {sq.area()} rather than -1')

    def test_perimeter(self):
        sq = app.Square(5)
        self.assertEqual(sq.perimeter(), 20,
                         f'Perimeter is {sq.perimeter()} rather than 20')

    def test_perimeter_negative(self):
        sq = app.Square(-6)
        self.assertEqual(sq.perimeter(), -1,
                         f'Perimeter is {sq.perimeter()} rather than -1')

if __name__ == '__main__':
    unittest.main()

.F.F
======================================================================
FAIL: test_area_negative (__main__.TestSum)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests_unittest.py", line 11, in test_area_negative
    self.assertEqual(sq.area(), -1, f'Area is shown {sq.area()} rather than -1 for negative side length')
AssertionError: 9 != -1 : Area is shown 9 rather than -1 for negative side length
======================================================================
FAIL: test_perimeter_negative (__main__.TestSum)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests_unittest.py", line 19, in test_perimeter_negative
    self.assertEqual(sq.perimeter(), -1, f'Perimeter is {sq.perimeter()} rather than -1 for negative side length')
AssertionError: -24 != -1 : Perimeter is -24 rather than -1 for negative side length
----------------------------------------------------------------------
Ran 4 tests in 0.001s
FAILED (failures=2)
A few things to note in the above test report are –

• The first line shows that test 1 and test 3 passed while test 2 and test 4 failed
(unittest runs the test methods in alphabetical order of their names).
• Each failed test case is described in the report: the first line of the description contains
the name of the failed test case and the last line contains the error message we
defined for that test case.
• At the end of the report you can see the number of failed tests; if no test fails, the
report ends with OK.

The “nose2” module

The purpose of nose2 is to extend unittest to make testing easier. nose2 is compatible with
tests written using the unittest testing framework and can be used as a replacement of
the unittest test runner.
Installation: nose2 can be installed from PyPI using the command,
pip install nose2
Use: nose2 does not have any testing framework of its own and is merely a test runner which is
compatible with the unittest testing framework. Therefore we’ll run the same tests we wrote
above (for unittest) using nose2. To run the tests we use the following command in the
project source directory (“Software_Testing” in our case),
nose2
In nose2 terminology all the python modules (.py) whose names start with “test” (e.g.
test_file.py, test_1.py) are considered test files. On execution, nose2 will look for test
files in all the sub-directories which fall under one or more of the following categories,

• which are python packages (contain “__init__.py”).
• whose name starts with “test” after being lowercased, e.g. TestFiles, tests.
• which are named either “src” or “lib”.
nose2 first loads all the test files present in the project and then the tests are executed. Thus,
with nose2 we get the freedom to split our tests among various test files in different folders
and execute them at once, which is very useful when dealing with large number of tests.
Let’s now learn about different customisation options provided by nose2 which can help us
during the testing process.

1. Changing the search directory –
If we want to change the directory in which nose2 searches for test
files, we can do that using the command line argument -s or
--start-dir as,
nose2 -s DIR_ADD DIR_NAME
here, DIR_NAME is the directory in which we want to search for the test
files and DIR_ADD is the address of the parent directory
of DIR_NAME relative to the project source directory (i.e. use “./” if the test
directory is in the project source directory itself).
This is extremely useful when you want to test only one feature of
your application at a time.
2. Running specific test cases –
Using nose2 we can also run one specific test at a time by using the
command line argument -s or --start-dir as,
nose2 -s DIR_ADD DIR_NAME.TEST_FILE.TEST_CLASS.TEST_NAME
• TEST_NAME: name of the test method.
• TEST_CLASS: class in which the test method is defined.
• TEST_FILE: name of the test file in which the test case is
defined i.e. test.py.
• DIR_NAME: directory in which the test file exists.
• DIR_ADD: address of the parent directory of DIR_NAME
relative to the project source.
Using this feature we can test our software on specific inputs.
3. Running tests in a single module –
nose2 can also be used like unittest by calling the
function nose2.main() just like we called unittest.main() in
previous examples.
Apart from the above basic customisations, nose2 provides advanced
features such as loading plugins and config files or creating your
own test runner.

The “pytest” module

pytest is one of the most popular testing frameworks for Python. Using pytest you can test
anything from basic Python scripts to databases, APIs and UIs. Though pytest is often used
for API testing, in this article we’ll cover only the basics of pytest.
Installation: You can install pytest from PyPI using the command,
pip install pytest
Use: The pytest test runner is called using the following command in the project source
directory (py.test is the older spelling of the command; current versions are normally
invoked as pytest),
py.test
Unlike nose2, pytest looks for test files in all the locations inside the project directory. Any
Python file whose name starts with “test_” or ends with “_test” (i.e. test_*.py or *_test.py) is
considered a test file in the pytest terminology. Let’s create a file “test_file1.py” in the
folder “tests” as our test file.
Creating test methods:
pytest supports the test methods written in the unittest framework, but
the pytest framework provides easier syntax to write tests. See the code below to
understand the test method syntax of the pytest framework.

from .. import app

def test_file1_area():
    sq = app.Square(2)
    # the backslash continues the assert statement onto the message line
    assert sq.area() == 4, \
        f"area for side {sq.side} units is {sq.area()}"

def test_file1_perimeter():
    sq = app.Square(-1)
    assert sq.perimeter() == -1, \
        f'perimeter is shown {sq.perimeter()} rather than -1'

Note: similar to unittest, pytest requires all test names to start with “test”.
Unlike unittest, pytest uses Python's built-in assert statement, which makes it
even easier to use.
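Because a pytest-style test is just a function built around assert, its behaviour can be seen even without the test runner. The sketch below is a self-contained illustration: Square is a trimmed stand-in for app.Square, and calling the function by hand is done here only for demonstration (pytest normally collects and calls it).

```python
class Square:
    """Trimmed stand-in for app.Square so the snippet is self-contained."""
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

def test_area():
    sq = Square(2)
    assert sq.area() == 4, f"area for side {sq.side} units is {sq.area()}"

test_area()  # a passing test returns silently

# a failing assert raises AssertionError carrying the message after the comma
message = None
sq = Square(-3)
try:
    assert sq.area() == -1, f"area is shown {sq.area()} rather than -1"
except AssertionError as exc:
    message = str(exc)
print(message)
```

pytest reports such an AssertionError as a failed test and shows the message in its FAILURES section.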
Note that, now the “tests” folder contains two files namely, “tests.py” (written
in unittest framework) and “test_file1.py” (written in pytest framework). Now let’s run
the pytest test runner.
py.test
You’ll get a similar report as obtained by using unittest.

============================= test session starts =============================
platform linux -- Python 3.6.7, pytest-4.4.1, py-1.8.0, pluggy-0.9.0
rootdir: /home/manthan/articles/Software_testing_in_Python
collected 6 items

tests/test_file1.py .F                                                  [ 33%]
tests/test_file2.py .F.F                                                [100%]

================================== FAILURES ===================================
The percentages on the right side of the report show the percentage of tests that have been
completed at that moment, i.e. 2 out of the 6 test cases were completed at the end of the
“test_file1.py”.
Here are a few more basic customisations that come with pytest.

• Running specific test files: To run only a specific test file, use the command,
py.test <filename>
• Substring matching: Suppose we want to test only the area() method of
our Square class, we can do this using substring matching as follows,
py.test -k "area"
With this command pytest will execute only those tests which have the string “area”
in their names, i.e. “test_file1_area()”, “test_area()” etc.
• Marking: As a substitute to substring matching, marking is another method using
which we can run a specific set of tests. In this method we put a mark on the tests we
want to run. Observe the code example given below,

import pytest

# @pytest.mark.<tag_name>
@pytest.mark.area
def test_file1_area():
    sq = app.Square(2)
    assert sq.area() == 4, \
        f"area for side {sq.side} units is {sq.area()}"

In the above code example test_file1_area() is marked with the tag “area”. All the
test methods which have been marked with some tag can be executed by using the
command,
py.test -m <tag_name>
• Parallel Processing: If you have a large number of tests then pytest can be
customised to run these test methods in parallel. For that you need to install
pytest-xdist, which can be installed using the command,
pip install pytest-xdist
Now you can use the following command to execute your tests faster using
multiprocessing,
py.test -n 4
With this command pytest assigns 4 workers to perform the tests in parallel; you
can change this number as per your needs.
• If your tests are thread-safe, you can also use multithreading to speed up the testing
process. For that you need to install pytest-parallel (using pip). To run your tests in
multithreading use the command,
pytest --workers 4
Unit Testing in Python – Unittest
What is Unit Testing?
Unit Testing is the first level of software testing where the smallest testable parts of a
software are tested. This is used to validate that each unit of the software performs as
designed. The unittest test framework is python’s xUnit style framework.
Method:
White Box Testing method is used for Unit testing.
OOP concepts supported by unittest framework:

• test fixture:
A test fixture is used as a baseline for running tests to ensure that there is a fixed
environment in which tests are run so that results are repeatable.
Examples:
• creating temporary databases.
• starting a server process.
• test case:
A test case is a set of conditions which is used to determine whether a system under
test works correctly.
• test suite:
A test suite is a collection of test cases that are used to test a software program to
show that it has some specified set of behaviours by executing the aggregated tests
together.
• test runner:
A test runner is a component which sets up the execution of tests and provides the
outcome to the user.
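The four concepts above can be seen together in one short script. This is a sketch under assumptions: the class name TempDirTest and the temporary-directory fixture are chosen only for illustration, and the runner writes its report to a StringIO stream so nothing is printed by the runner itself.

```python
import io
import os
import shutil
import tempfile
import unittest

class TempDirTest(unittest.TestCase):  # test case (hypothetical name)
    def setUp(self):
        # fixture: every test starts with a fresh, empty directory
        self.workdir = tempfile.mkdtemp()

    def tearDown(self):
        # fixture cleanup: runs after each test, whether it passed or failed
        shutil.rmtree(self.workdir)

    def test_directory_starts_empty(self):
        self.assertEqual(os.listdir(self.workdir), [])

    def test_can_create_file(self):
        path = os.path.join(self.workdir, "data.txt")
        with open(path, "w") as handle:
            handle.write("hello")
        self.assertTrue(os.path.exists(path))

# test suite: an explicit collection of test cases
suite = unittest.TestSuite()
suite.addTest(TempDirTest("test_directory_starts_empty"))
suite.addTest(TempDirTest("test_can_create_file"))

# test runner: executes the suite and collects the outcome
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun, result.wasSuccessful())
```

setUp() and tearDown() run around every test method, which is what keeps the environment fixed and the results repeatable.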
Basic Test Structure: unittest defines tests in the following two parts:

• Manage test “fixtures” using code.
• test itself.

import unittest

class SimpleTest(unittest.TestCase):

    # Passes because the asserted value is True.
    def test(self):
        self.assertTrue(True)

if __name__ == '__main__':
    unittest.main()

This is the basic test code using the unittest framework, with a single test. The test()
method will fail if True is ever False.

Running Tests
if __name__ == '__main__':
    unittest.main()

The last block helps to run the test by running the file through the command line.
.
-------------------------------------------------------------------
Ran 1 test in 0.000s
OK
Here, in the output the “.” on the first line of output means that a test passed.
“-v” option is added in the command line while running the tests to obtain more detailed test
results.
test (__main__.SimpleTest) ... ok

-------------------------------------------------------------------
Ran 1 test in 0.000s
OK
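The verbose report can also be requested from code rather than from the command line. A minimal sketch, assuming the report is captured in a StringIO stream so it can be inspected:

```python
import io
import unittest

class SimpleTest(unittest.TestCase):
    def test(self):
        self.assertTrue(True)

suite = unittest.TestLoader().loadTestsFromTestCase(SimpleTest)
stream = io.StringIO()
# verbosity=2 corresponds to the -v command-line option
result = unittest.TextTestRunner(stream=stream, verbosity=2).run(suite)
report = stream.getvalue()
print(report)
```

Inside a test file, calling unittest.main(verbosity=2) has the same effect as passing -v.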
Outcomes Possible:
There are three types of possible test outcomes:

• OK – This means that all the tests are passed.


• FAIL – This means that the test did not pass and an AssertionError exception is raised.
• ERROR – This means that the test raises an exception other than AssertionError.
Let’s walk through an example to understand the implementation of unittest framework.
Implementation:

# Python code to demonstrate working of unittest
import unittest

class TestStringMethods(unittest.TestCase):

    def setUp(self):
        pass

    # Passes if 'a' repeated 4 times gives 'aaaa'.
    def test_strings_a(self):
        self.assertEqual('a'*4, 'aaaa')

    # Passes if upper() converts the string to upper case.
    def test_upper(self):
        self.assertEqual('foo'.upper(), 'FOO')

    # Passes if isupper() is True for an upper-case string
    # and False otherwise.
    def test_isupper(self):
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

    # Passes if the stripped string matches the given output.
    def test_strip(self):
        s = 'geeksforgeeks'
        self.assertEqual(s.strip('geek'), 'sforgeeks')

    # Passes if the string splits into the given output
    # and split() rejects a separator that is not a string.
    def test_split(self):
        s = 'hello world'
        self.assertEqual(s.split(), ['hello', 'world'])
        with self.assertRaises(TypeError):
            s.split(2)

if __name__ == '__main__':
    unittest.main()

The above code is a short script to test 5 string methods. unittest.TestCase is used to
create test cases by subclassing it. The last block of the code at the bottom allows us to run
all the tests just by running the file.
Basic terms used in the code:

• assertEqual() – This statement is used to check if the result obtained is equal to the
expected result.
• assertTrue() / assertFalse() – This statement is used to verify if a given statement is
true or false.
• assertRaises() – This statement is used to check that a specific exception is raised; the
test fails if the exception does not occur.
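assertRaises() can be written in two ways, as a context manager or as a direct call. The sketch below shows both, plus a third test demonstrating that the assertion itself fails when nothing is raised; the class name RaisesDemo is hypothetical.

```python
import io
import unittest

class RaisesDemo(unittest.TestCase):  # hypothetical class name
    def test_context_manager_form(self):
        # passes because split() rejects an int separator
        with self.assertRaises(TypeError):
            'hello world'.split(2)

    def test_callable_form(self):
        # same check written as: exception type, callable, arguments
        self.assertRaises(ValueError, int, 'abc')

    def test_fails_when_nothing_is_raised(self):
        # the inner check fails (AssertionError) because split() succeeds
        with self.assertRaises(AssertionError):
            with self.assertRaises(TypeError):
                'hello world'.split()

suite = unittest.TestLoader().loadTestsFromTestCase(RaisesDemo)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun, result.wasSuccessful())
```

The context-manager form is usually easier to read when the code under test spans more than one expression.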
Description of tests:

• test_strings_a
This test checks the property of strings by which a character, say ‘a’, multiplied by
a number, say ‘x’, gives ‘a’ repeated x times. The assertEqual() statement passes
in this case if the result matches the given output.
• test_upper
This test checks whether the given string is converted to uppercase. The
assertEqual() statement passes if the string returned is in uppercase.
• test_isupper
This test checks the string method which returns True if the string is in
uppercase and False otherwise. The assertTrue() / assertFalse() statements are used for this
verification.
• test_strip
This test checks that the characters passed to the function have been stripped from
both ends of the string. The assertEqual() statement passes if the stripped string
matches the given output.
• test_split
This test checks the split function of the string, which splits the string on
the argument passed to the function and returns the result as a list. The assertEqual()
statement passes in this case if the result matches the given output.
unittest.main() provides a command-line interface to the test script. On running the above
script from the command line, the following output is produced:
.....
-------------------------------------------------------------------
Ran 5 tests in 0.000s
OK
