Session22 To 24 PYTHON COLAB
To use a GPU in Colab: Edit > Notebook settings, or Runtime > Change runtime type, and select GPU as the Hardware accelerator
Open python notebook in different ways
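Bar Graph
# BAR GRAPH
import matplotlib.pyplot as plt
# x and y are not defined on the slide; the values below are placeholder sample data
x = ["A", "B", "C"]
y = [3, 7, 5]
plt.bar(x, y)
plt.show()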
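Prime Numbers from n1 to n2
n1 = int(input("Enter limit for Prime numbers From :"))
n2 = int(input("Enter limit for Prime numbers upto :"))
print("Prime Numbers between", n1, "and", n2, "are :")
for n in range(n1, n2 + 1):        # n2 + 1 so that the upper limit is included
    s = 0                          # number of divisors of n found so far
    for i in range(1, n + 1):
        if n % i == 0:
            s = s + 1
    if s == 2:                     # a prime has exactly two divisors: 1 and itself
        print(n)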
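Cricket Data Analysis
# Reading an excel file using Python
import xlrd
# Give the location of the file
loc = ("Cricket_data.xls")
wb = xlrd.open_workbook(loc)
# The slide does not show how sheet is obtained; sheet_by_index(0) is an assumption
sheet = wb.sheet_by_index(0)
# a, m1 and m2 are used below but their initialisation is not shown on the slide
a=[]
b=[]
c=[]
m1=[]
m2=[]
# Data reading from Excel to temporary memory
for i in range(1,sheet.nrows,+1):
    k0=(sheet.cell_value(i,0))
    m1.append(k0)
for i in range(1,sheet.ncols,+1):
    k1=(sheet.cell_value(0,i))
    m2.append(k1)
# Question 1 - Highest Run and Players Name
for i in range(1,sheet.nrows,+1):
    k2=int(sheet.cell_value(i,3))
    a.append(k2)
largest2=max(a)
# (the search over a and the print for Question 1 are not shown on the slide)
# Maximum wicket taker (the loop header below is inferred from the loop above)
for i in range(1,sheet.nrows,+1):
    k3=int(sheet.cell_value(i,4))
    b.append(k3)
largest2=max(b)
for j in range (0,len(b),+1):
    if (largest2==b[j]):
        t2=j
print("Maximum Wicket Taker is ", m1[t2], "and wicket taken is", largest2 )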
ABA Review
PYTHON
Session - 23
Output:
4
3.1415926
Hello, world
Variable types - Examples
Example types.py:
pi = 3.1415926
message = "Hello, world"
i = 2+2
print(type(pi))
print(type(message))
print(type(i))
Output:
<class 'float'>
<class 'str'>
<class 'int'>
Operators
+ addition
- subtraction
* multiplication
/ division
// floor (integer) division
** exponentiation
% modulus (remainder after division)
Comparison operators
Operators
Example operators.py
print(2*2)
print(2**3)
print(10%3)
print(1.0/2.0)
print(1//2)
Output:
4
8
1
0.5
0
Note the difference between floating point division (/) and
integer division (//) in the last two lines. In Python 3, 1/2
gives 0.5; use // when an integer result is wanted
Type conversion
int(), float(), str(), and bool() convert to integer,
floating point, string, and boolean (True or False) types,
respectively
Example typeconv.py:
print(1.0/2.0)
print(1//2)
print(float(1)/float(2))
print(int(3.1415926))
print(str(3.1415926))
print(bool(1))
print(bool(0))
Output:
0.5
0
0.5
3
3.1415926
True
False
Chapter 2: Conditionals
True and False booleans
Comparison and Logical Operators
if, elif, and else statements
Comparison operators
== : is equal to?
!= : not equal to
> : greater than
< : less than
>= : greater than or equal to
<= : less than or equal to
is : do two references refer to the same object?
(See Chapter 6)
Logical operators
and, or, not
if (1+1==2):
    print("1+1==2")
    print("I always thought so!")
else:
    print("My understanding of math must be faulty!")
Simple one-line if:
if (1+1==2): print("I can add!")
elif statement
Equivalent of “else if” in C
Example elif.py:
x=3
if (x == 1):
    print("one")
elif (x == 2):
    print("two")
else:
    print("many")
Chapter 3: Functions
Defining functions
Return values
Local variables
Built-in functions
Functions of functions
Passing lists, dictionaries, and keywords to functions
Functions
Define them in the file above the point they're used
Body of the function should be indented consistently
(4 spaces is typical in Python)
Example: square.py
def square(n):
    return n*n
print("The square of 3 is", square(3))
Output:
The square of 3 is 9
Function variables are local
Variables declared in a function do not exist outside that
function
Example square2.py
def square(n):
    m = n*n
    return m
print(m)   # m exists only inside square()
Output:
File "./square2.py", line 9, in <module>
    print(m)
NameError: name 'm' is not defined
Scope
Variables assigned within a function are local to that
function call
Variables assigned at the top of a module are global to
that module; “global” means module-wide, and no scope
is shared across modules
Within a function, Python first tries to match a variable
name to one assigned locally within the function; if that
fails, it looks in any enclosing function-defining (def)
statements; if that fails, it tries to resolve the name in
the global scope (but the variable must be declared global
for the function to be able to assign to it). If none of
these match, Python looks through the list of built-in names
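The lookup order described above (local, enclosing def, global, built-in) can be sketched as follows. All names here (counter, read_only, shadow, increment, outer) are made up for illustration and do not come from the slides.

```python
counter = 0          # global to this module

def read_only():
    # No local 'counter' exists, so the lookup falls through to the global scope.
    return counter

def shadow():
    counter = 99     # assignment creates a new local name; the global is untouched
    return counter

def increment():
    global counter   # required before a function can rebind the module-level name
    counter += 1
    return counter

def outer():
    label = "enclosing"
    def inner():
        # 'label' is found in the enclosing def; 'len' is found among the built-ins
        return label, len(label)
    return inner()

print(read_only())            # 0
print(shadow(), counter)      # 99 0
print(increment(), counter)   # 1 1
print(outer())                # ('enclosing', 9)
```

Removing the global statement from increment() would raise UnboundLocalError, because the assignment would make counter local to the function.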
Multiple return values
Can return multiple values by packaging them into a
tuple
def onetwothree(x):
    return x*1, x*2, x*3
print(onetwothree(3))
Output:
(3, 6, 9)
Built-in Functions
Several useful built-in functions. Example math.py
print(pow(2,3))
print(abs(-14))
print(max(1,-5,3,0))
Output:
8
14
3
Chapter 4: Iteration
while loops
for loops
range function
Flow control within loops: break, continue, pass, and
the “loop else”
while
Example while.py
i=1
while i < 4:
    print(i)
    i += 1
Output:
1
2
3
for
Example for.py
for i in range(3):
    print(i, end=" ")
Output:
0 1 2
range(n) yields the integers from 0 to n-1.
range(0,10,2) yields 0, 2, 4, 6, 8 (wrap it in list() to get a list)
Flow control within loops
General structure of a loop:
while <test> (or for <item> in <object>):
    <statements within loop>
    if <test1>: break      # exit loop now
    if <test2>: continue   # go to top of loop now
    if <test3>: pass       # does nothing!
else:
    <other statements>     # runs only if the loop ended without
                           # hitting a break
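A small sketch of break, continue and the loop else working together. The function name first_even and the sample lists are illustrative assumptions, not part of the slides.

```python
def first_even(numbers):
    """Return the first non-negative even number, or None if there is none."""
    for n in numbers:
        if n < 0:
            continue        # skip negatives and go to the next item
        if n % 2 == 0:
            found = n
            break           # stop at the first match; the else part is skipped
    else:
        found = None        # reached only when the loop ended without a break
    return found

print(first_even([3, -4, 8, 10]))   # 8
print(first_even([1, 3, 5]))        # None
```

The else branch also runs when the list is empty, since the loop then finishes without ever hitting break.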
Parallel traversals
If we want to go through 2 lists (more later) in
parallel, can use zip:
A = [1, 2, 3]
B = [4, 5, 6]
for (a,b) in zip(A,B):
    print(a, "*", b, "=", a*b)
Output:
1 * 4 = 4
2 * 5 = 10
3 * 6 = 18
Chapter 5: Strings
String basics
Escape sequences
Slices
Block quotes
Formatting
String methods
String basics
Strings can be delimited by single or double quotes
Python uses Unicode, so strings are not limited to ASCII
characters
An empty string is denoted by having nothing between string
delimiters (e.g., '')
Can access elements of strings with [], with indexing starting
from zero:
>>> "snakes"[3]
'k'
Note: can't go the other way --- can't set "snakes"[3] = 'p' to
change a string; strings are immutable
a[-1] gets the last element of string a (negative indices work
through the string backwards from the end)
Strings like a = r'c:\home\temp.dat' (starting with an r
character before delimiters) are “raw” strings (backslashes are
taken literally, not as escape sequences)
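A short sketch of indexing, slicing, immutability and raw strings. The variable names are chosen only for this example; the slice syntax s[1:4] is standard Python even though the slide lists slices without showing one.

```python
s = "snakes"
print(s[3], s[-1], s[1:4])      # k s nak

try:
    s[3] = "p"                  # strings are immutable, so this raises TypeError
except TypeError as err:
    print("cannot assign:", err)

# Build a new string instead of changing the old one in place
t = s[:3] + "p" + s[4:]
print(t)                        # snapes

raw = r"c:\home\temp.dat"       # backslashes are kept literally in a raw string
print(raw, len(raw))            # c:\home\temp.dat 16
```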
More string basics
Type conversion:
>>> int("42")
42
>>> str(20.4)
'20.4'
Compare strings with the is-equal operator, == (as with
std::string in C++; in C you would need strcmp):
>>> a = "hello"
>>> b = "hello"
>>> a == b
True
Concatenate strings with +:
>>> location = "Chattanooga " + "Tennessee"
>>> location
'Chattanooga Tennessee'
String methods
Strings are objects with many built-in methods.
Methods that create new strings need their result
assigned (since strings are immutable, they cannot
be changed in place).
S.capitalize()
S.center(width)
S.count(substring [, start-idx [, end-idx]])
S.find(substring [, start [, end]])
S.isalpha(), S.isdigit(), S.islower(), S.isspace(),
S.isupper()
S.join(sequence)
And many more!
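A few of the methods listed above in action. The sample strings are arbitrary; each call returns a new value and leaves the original string unchanged.

```python
s = "hello world"

title = s.capitalize()                 # 'Hello world'
centered = "hi".center(6)              # '  hi  '
count_o = s.count("o")                 # 2
pos = s.find("world")                  # 6
missing = s.find("xyz")                # -1 when the substring is absent
joined = "-".join(["a", "b", "c"])     # 'a-b-c'

print(title, repr(centered), count_o, pos, missing, joined)
print("abc".isalpha(), "42".isdigit(), "abc".islower(), "  ".isspace(), "ABC".isupper())
print(s)                               # hello world (unchanged)
```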
replace method
Doesn't really replace (strings are immutable) but
makes a new string with the replacement performed:
>>> a = "abcdefg"
>>> b = a.replace('c', 'C')
>>> b
'abCdefg'
>>> a
'abcdefg'
Regular Expressions
• Regular expressions are a way to do pattern matching.
The basic concept (and most of the syntax of the actual
regular expression) is the same as in Java or Perl. In
Python they are provided by the re module
Regular Expression Syntax
• Common regular expression syntax:
. Matches any char but newline (by default)
^ Matches the start of a string
$ Matches the end of a string
* Any number of what comes before this
+ One or more of what comes before this
| Or
\w Any alphanumeric character (or underscore)
\d Any digit
\s Any whitespace character
(Note: \W matches NON-alphanumeric, \D NON digits, etc)
[aeiou] matches any of a, e, i, o, u
junk Matches the string 'junk'
Match Object Functions
• search() and match() return a Match object (or None
if nothing matches). This object has some useful methods:
group(): return the matched string
start(): starting position of the match
end(): ending position of the match
span(): tuple containing the (start,end) positions of
the match
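A minimal sketch of search(), match() and the Match object methods using the standard re module. The sample text is invented for this example.

```python
import re

text = "Order 66 shipped on day 12"

m = re.search(r"\d+", text)            # first run of one or more digits, anywhere
print(m.group(), m.start(), m.end(), m.span())   # 66 6 8 (6, 8)

# match() only succeeds if the pattern matches at the start of the string
print(re.match(r"\w+", text).group())  # Order
print(re.match(r"\d+", text))          # None

# '|' means "or"; [aeiou] matches any one of the listed characters
print(re.search(r"junk|day", text).group())   # day
print(re.findall(r"[aeiou]", "junk"))         # ['u']
```

Because search() and match() return None on failure, check the result before calling group().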
Chapter 6: Collection Data Types
Tuples
Lists
Dictionaries
Tuples
Tuples are a collection of data items. They may be of
different types. Tuples are immutable like strings.
Lists are like tuples but are mutable.
>>> "Tony", "Pat", "Stewart"
('Tony', 'Pat', 'Stewart')
Python displays tuples in (); we can also type the ()
ourselves, but if we have only one item, we need a
comma to indicate it's a tuple: ("Tony",).
An empty tuple is denoted by ()
Need to enclose tuple in () if we want to pass it all
together as one argument to a function
Lists
Like tuples, but mutable, and designated by square
brackets instead of parentheses:
>>> [1, 3, 5, 7, 11]
[1, 3, 5, 7, 11]
>>> [0, 1, 'boom']
[0, 1, 'boom']
An empty list is []
Append an item:
>>> x = [1, 2, 3]
>>> x.append("done")
>>> print(x)
[1, 2, 3, 'done']
Lists and Tuples Contain Object References
Lists and tuples contain object references. Since lists
and tuples are also objects, they can be nested
>>> a=[0,1,2]
>>> b=[a,3,4]
>>> print(b)
[[0, 1, 2], 3, 4]
>>> print(b[0][1])
1
>>> print(b[1][0])
... TypeError: 'int' object is not subscriptable
Dictionaries
Collections where items are accessed by a key, not by
position (since Python 3.7 a dictionary keeps insertion
order, but it still cannot be indexed by position)
Like a hash in Perl
Collection of arbitrary objects; holds object references,
as lists do
Nestable
Can grow and shrink in place like lists
Concatenation, slicing, and other operations that
depend on the order of elements do not work on
dictionaries
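The properties above can be seen in a short sketch. The keys and values are hypothetical and used only for illustration.

```python
# Hypothetical scores keyed by name
scores = {"asha": 82, "ravi": 45}

scores["meena"] = 60            # grow in place: add a new key
scores["ravi"] += 5             # update the value stored under an existing key
del scores["asha"]              # shrink in place

print(scores["ravi"])           # 50
print("asha" in scores)         # False
print(scores.get("asha", 0))    # 0 (default returned instead of raising KeyError)

# Dictionaries nest, and can hold lists or other dictionaries as values
nested = {"team": {"captain": "meena", "players": ["ravi", "meena"]}}
print(nested["team"]["players"][0])   # ravi

print(sorted(scores))           # ['meena', 'ravi']
```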
The “is” operator
Python “variables” are really object references. The
“is” operator checks whether two references refer
to the same object (note: two objects can be equal
without being the same object...)
References to small integer constants are usually
identical (a CPython implementation detail; do not rely
on it). References to equal strings may or may not refer
to the same object. Two equal mutable objects are not
necessarily the same object
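A minimal sketch of the difference between equality (==) and identity (is), using lists because their behaviour is guaranteed by the language. The names a, b and c are arbitrary.

```python
a = [1, 2, 3]
b = [1, 2, 3]      # equal contents, but a separate object
c = a              # a second reference to the same object as a

print(a == b, a is b)   # True False
print(a == c, a is c)   # True True

c.append(4)             # a change made through c is visible through a
print(a)                # [1, 2, 3, 4]
print(b)                # [1, 2, 3]
```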
“in” operator
For collection data types, the “in” operator
determines whether something is a member of the
collection (and “not in” tests if not a member):
>>> team = ("David", "Robert", "Paul")
>>> "Howell" in team
False
>>> "Stewart" not in team
True
Chapter 7: Advanced Functions
Passing lists and keyword dictionaries to functions
Lambda functions
apply() (Python 2 only; in Python 3 call f(*args, **kwargs) directly)
map()
filter()
reduce() (functools.reduce in Python 3)
List comprehensions
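The slides only list these topics, so the following is a brief Python 3 sketch of each rather than the course's own example. The function total and the sample data are assumptions made for illustration.

```python
from functools import reduce

def total(*args, **kwargs):
    # Positional arguments arrive as a tuple, keyword arguments as a dict
    return sum(args) * kwargs.get("scale", 1)

nums = [1, 2, 3, 4]
opts = {"scale": 10}

# Unpack a list and a keyword dictionary into a call
print(total(*nums, **opts))                         # 100

squares = list(map(lambda n: n * n, nums))          # [1, 4, 9, 16]
evens = list(filter(lambda n: n % 2 == 0, nums))    # [2, 4]
product = reduce(lambda x, y: x * y, nums)          # 24

# A list comprehension combines the map and filter steps in one expression
big_squares = [n * n for n in nums if n > 1]        # [4, 9, 16]

print(squares, evens, product, big_squares)
```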
Chapter 8: Exception Handling
Basics of exception handling
Chapter 9: Python Modules
Basics of modules
Import and from … import statements
Changing data in modules
Reloading modules
Module packages
__name__ and __main__
Import as statement
Module basics
Each file in Python is considered a module. Everything within
the file is encapsulated within a namespace (which is the
name of the file)
To access code in another module (file), import that file, and
then access the functions or data of that module by prefixing
with the name of the module, followed by a period
To import a module:
import sys
(note: no file suffix)
Can import user-defined modules or some “standard” modules
like sys and random
Any python program needs one “top level” file which imports
any other needed modules
Python standard library
There are over 200 modules in the Standard Library
Consult the Python Library Reference Manual,
included in the Python installation and/or available
at http://www.python.org
What import does
An import statement does three things:
- Finds the file for the given module
- Compiles it to bytecode
- Runs the module's code to build any objects (top-level
code, e.g., variable initialization)
The module name is only a simple name; Python uses a
module search path to find it. It will search: (a) the
directory of the top-level file, (b) directories in the
environment variable PYTHONPATH, (c) standard
directories, and (d) directories listed in any .pth files (one
directory per line in a plain text file); the path can be listed
by printing sys.path
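A quick way to inspect the search path described above. The second part uses importlib.util.find_spec from the standard library to ask where a module would be loaded from; the choice of the random module is arbitrary.

```python
import sys
import importlib.util

# Directories Python searches, in order, when resolving an import
for position, directory in enumerate(sys.path):
    print(position, directory)

# Ask the import system where it would load the standard 'random' module from
spec = importlib.util.find_spec("random")
print(spec.name, spec.origin)
```

The exact directories printed depend on the installation, so no output is shown here.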
The sys module
Printing the command-line arguments,
print-argv.pl
import sys
cmd_options = sys.argv
i=0
for cmd in cmd_options:
    print("Argument", i, "=", cmd)
    i += 1
output:
localhost(Chapter8)% ./print-argv.pl test1 test2
Argument 0 = ./print-argv.pl
Argument 1 = test1
Argument 2 = test2
The random module
import random
guess = random.randint(1,100)
print(guess)
dinner = random.choice(["meatloaf", "pizza",
                        "chicken pot pie"])
print(dinner)
Chapter 10: Files
Basic file operations
Opening a file
open(filename, mode)
where filename is a Python string, and mode is a
Python string, 'r' for reading, 'w' for writing, or 'a' for
append
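A minimal sketch of the three modes. The file name notes.txt and the temporary directory are assumptions made so the example can run anywhere without touching existing files.

```python
import os
import tempfile

# Hypothetical file name inside a throwaway temporary directory
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, "w") as f:      # 'w' creates the file or truncates an existing one
    f.write("first line\n")

with open(path, "a") as f:      # 'a' appends to the end of the file
    f.write("second line\n")

with open(path, "r") as f:      # 'r' opens the file for reading
    lines = f.read().splitlines()

print(lines)                    # ['first line', 'second line']
```

The with statement closes the file automatically, even if an error occurs inside the block.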
Basic file operations
Basic operations:
import sys
sys.stdout = open('output.txt', 'w')
print(message) # will show up in output.txt
https://www.onlinegdb.com/online_python_compiler
Exercise 1 - Bike Sharing (1) - Startup
Example
• Bike sharing systems are a new generation of traditional bike rentals in which the
whole process of membership, rental and return has become automatic.
Through these systems, a user can easily rent a bike at one location and
return it at another. Currently, there are over 500 bike-sharing programs
around the world, comprising more than 500 thousand bicycles. Today, there is
great interest in these systems because of their important role in traffic,
environmental and health issues.
• Apart from interesting real-world applications of bike sharing systems, the
characteristics of the data generated by these systems make them attractive for
research. As opposed to other transport services such as bus or subway, the
duration of travel and the departure and arrival positions are explicitly recorded
in these systems. This feature turns the bike sharing system into a virtual sensor
network that can be used for sensing mobility in the city. Hence, it is expected that
most of the important events in the city could be detected by monitoring these data.
Example 1 – Bike Sharing (2)
• Overview & Python Coding
• https://medium.com/@wilamelima/analysing-bike-sharing-trends-with-python-a9f574c596b9
Cricket Data Analysis
# Question 1 - Highest Run and Players Name
wb = xlrd.open_workbook(loc)
if (largest2==a[j]):
m2.append(k1)
Applied Business Analytics
Review
Example 1 – Soft Drink Preferences
• A group of 25 people was surveyed to find their soft drink
preferences (1 – Pepsi, 2 – Bovonto, 3 – Coke, 4 – Limca)
• R Coding – Input from Keyboard
• # Sample data - 3 4 1 1 3 4 3 3 1 3 2 1 2 1 2 3 2 3 1 1 1 1 4 3 1
soft<-scan()
barplot(table(soft),xlab="Soft Drink",ylab="Frequency of Preferences", main="Soft Drink Preferences",col="white")
barplot(table(soft)/length(soft),xlab="Soft Drink",ylab="Freq. of Preferences",main=" Soft Drink Preferences ",col="gray70")
09/05/2024 ABA 78
Example 2 – Students enrolled in a University 2015-2019
• R Coding
nos <- c(2810,890,540,3542,1363,471,4301,1663,652,5362,2071,895,6593,2752,1113)
Example 3 – Correlation of Asian Paints closing price with other variables
• DATA SET : Asian paints.csv
cp<-read.csv(file.choose())
summary(cp)
cor(cp$close, cp$OPEN)
cor(cp$close, cp$HIGH)
cor(cp$close, cp$LOW)
cor(cp$close, cp$vwap)
par(mfrow=c(2,2))
plot(cp$close, cp$OPEN)
plot(cp$close, cp$HIGH)
plot(cp$close, cp$LOW)
plot(cp$close, cp$vwap)
plot(cp)
• Which variables have the highest positive correlation? Why?
• Any negative correlation between the variables? Why?
• Which variables have the lowest positive correlation? Why?
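For readers following the Python sessions, the quantity that R's cor() reports by default (the Pearson correlation coefficient) can be sketched in plain Python. The function name pearson_r and the sample price lists are hypothetical; they are not taken from the Asian Paints data set.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up closing and opening prices, for illustration only
close_prices = [100.0, 102.0, 101.0, 105.0, 107.0]
open_prices = [99.0, 101.5, 101.0, 104.0, 106.5]

print(round(pearson_r(close_prices, open_prices), 3))
```

A value near +1 means the two series rise and fall together, a value near -1 means they move in opposite directions, and a value near 0 means there is little linear relationship.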
Example 4 : Correlation of POC between ITES (INFY & TCS)
• CSV files : INFY_2122_FY_POC.csv & TCS_2122_FY_POC
INFY=read.csv(file.choose())
TCS=read.csv(file.choose())
cor(INFY$closePOC, TCS$closePOC)
par(mfrow=c(2,2))
boxplot(INFY$closePOC, main="INFOSYS")
boxplot(TCS$closePOC, main="TCS")
hist(INFY$closePOC)
hist(TCS$closePOC)
Example 6 - Linear Modeling
df1<-read.csv(file.choose())
#Visualization
boxplot(df1[3:9])
boxplot(df1[11:13])
boxplot(df1$POC1)
#Simple Linear Regression, DV: Closing Price, IV: Opening Price
reg1=lm(df1$close~df1$OPEN)
summary(reg1)
reg2=lm(df1$close~df1$LOW)
summary(reg2)
reg3=lm(df1$close~df1$HIGH)
summary(reg3)
reg4=lm(df1$close~df1$ltp)
summary(reg4)
reg5=lm(df1$close~df1$vwap)
summary(reg5)
Example 7 Regression using R
CSV - tcsnifty50POC_1year
data1=read.csv(file.choose())
reg1=lm(data1$tcsPOC~data1$NIFTYPOC)
summary(reg1)
Example 8 - Sales and Advertisement
• We’ll use the marketing data set [datarium package], which contains the impact of the
amount of money spent on three advertising media (youtube, facebook and
newspaper) on sales.
install.packages("datarium")
library(datarium)
General Statistics (1)
• Null Hypothesis (Ho)
• Nothing Happened, The mean was unchanged, The treatment has no effect,
The model did not improve
• Alternate Hypothesis (Ha)
• Something Happened, The mean rose, The treatment improved the patient’s
health, The model fit better
• Assume Ho is TRUE
• T-statistics
• P-Value
• Small (P < α) – Strong evidence against Ho, i.e. reject Ho
• Not small (P >= α) – Retain Ho (fail to reject Ho)
• Example : P < 0.05 – Reject Ho
• α = 0.05 corresponds to a 95% confidence level: (100-95)/100 = 5/100 = 0.05
• High Risk Applications
α = 0.01 corresponds to 99%: (100-99)/100 = 1/100 = 0.01
α = 0.001 corresponds to 99.9%: (100-99.9)/100 = 0.1/100 = 0.001
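The decision rule and the link between confidence level and α can be written as two small helper functions. Both function names are illustrative assumptions; the p-value 0.02712 is the one quoted in the Wilcoxon example of these slides.

```python
def alpha_from_confidence(confidence_percent):
    """Significance level for a given confidence level, e.g. 95 -> 0.05."""
    return (100 - confidence_percent) / 100

def decide(p_value, alpha=0.05):
    """Reject H0 when the p-value is smaller than alpha, otherwise retain it."""
    return "reject H0" if p_value < alpha else "retain H0"

print(alpha_from_confidence(95))        # 0.05
print(decide(0.02712))                  # reject H0
print(decide(0.02712, alpha=0.01))      # retain H0
```

The same p-value leads to different decisions at α = 0.05 and α = 0.01, which is why the significance level should be fixed before looking at the result.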
General Statistics (2)
Testing mean of the sample – (t-test – small sample, n<30)
• You have sample from a population, given this sample you
want to know if the mean of the population could reasonably
be “m”
• t.test is making inferences about a population mean from the
sample.
• wilcox.test : tells us whether the central locations of the two populations are
significantly different, or equivalently whether their relative frequencies are
different
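As a sketch of what a one-sample t-test computes, the t statistic is the distance between the sample mean and the hypothesised mean m, measured in standard errors. The function name one_sample_t and the hypothesised mean of 60 are assumptions; the sample is the women_weight vector used in the Wilcoxon example of these slides.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, m):
    """t = (sample mean - m) / (sample standard deviation / sqrt(n))."""
    n = len(sample)
    return (mean(sample) - m) / (stdev(sample) / sqrt(n))

women_weight = [38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5]

# Hypothesised population mean of 60 is an arbitrary value for illustration
t_stat = one_sample_t(women_weight, 60)
print("t =", round(t_stat, 3), "with", len(women_weight) - 1, "degrees of freedom")
```

Turning t into a p-value needs the t distribution, which the Python standard library does not provide; in R, t.test(x, mu = m) reports both.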
Example 9 – Wilcox Test
# Data in two numeric vectors
women_weight <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5)
men_weight <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4)
# Create a data frame
my_data <- data.frame(group = rep(c("Woman", "Man"), each = 9), weight = c(women_weight, men_weight))
#Question : Is there any significant difference between women and men weights?
# Compute two-samples Wilcoxon test
res <- wilcox.test(weight ~ group, data = my_data, exact = FALSE)
print(res)
# INFERENCE The p-value of the test is 0.02712, which is less than the significance level (0.05).
# We can conclude that men’s median weight is significantly different from women’s median weight
# if you want to test whether the median men’s weight is less than the median women’s weight,
wilcox.test(weight ~ group, data = my_data, exact = FALSE, alternative = "less")
#Or, if you want to test whether the median men’s weight is greater than the median women’s weight,
wilcox.test(weight ~ group, data = my_data, exact = FALSE, alternative = "greater")
boxplot(men_weight,women_weight, xlab = "Gender", ylab="Weight", names=c("Men","Women"))
Regression -Base Model 1 & 2
[Figure: two sample sketches, each plotting curves for b<0, b=0, 0<b<1, b=1 and b>1 over x from 0.50 to 3.00]
Statistical Inferences
• Statistical Inferences
• Intercept
• Beta
• P-value
*** - Statistically significant at the 1 percent level
** - Statistically significant at the 5 percent level
* - Statistically significant at the 10 percent level
• R² Value
The higher the R², the better the model fits: 0.70 and above is good,
0.50 to 0.70 is acceptable (application specific)
• Residual Error
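To make the items above concrete, here is a plain-Python sketch of simple linear regression by ordinary least squares, returning the intercept, the slope (beta), R² and the residuals. The function name fit_line and the data points are assumptions for illustration; R's lm() and summary() report the same quantities along with p-values.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + beta * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    residuals = [y - (intercept + beta * x) for x, y in zip(xs, ys)]
    ss_res = sum(r * r for r in residuals)          # unexplained variation
    ss_tot = sum((y - mean_y) ** 2 for y in ys)     # total variation
    r_squared = 1 - ss_res / ss_tot
    return intercept, beta, r_squared, residuals

# Made-up data lying close to the line y = 1 + 2x
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
intercept, beta, r_squared, residuals = fit_line(xs, ys)
print(round(intercept, 3), round(beta, 3), round(r_squared, 4))
```

R² is the share of the variation in y that the fitted line explains, which is why values closer to 1 indicate a better fit.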
Try it Exercise – Assignment 1
• Download Your Company & Sector
• Regression using POC (Minimum 3 SLR required)
• Plot all POCs
• Assignment content
• 1. Data set (% of change, cleaned)
• 2. LM Summary output
• 3. Linear Regression Equation(s)
• 4. Plot & Boxplot output
• 5. Inferences
Example 10 - MLR- HR
R Coding
Employee_Supervisor_Performance.r
Example 11 – All Tyre brands are equal
• Imagine that you are interested in understanding whether knowing the brand of car
tyre can help you predict whether you will get more or less mileage before you
need to replace them.
• We’ll draw what is hopefully a random sample of 60 tyres from four different
manufacturers and use the mean mileage by brand to help inform our thinking.
• While we expect variation across our sample, we’re interested in whether the
differences between the tyre brands (the groups) are significantly larger than what
we would expect from random variation within the groups.
H0: μ1 = μ2 = μ3 = μ4
• Our research or testable hypothesis is
μApollo=μBridgestone=μCEAT=μFalken
Our null hypothesis is basically “Tyre brand doesn’t matter in predicting the
mileage, i.e. all tyre brands are the same”.
The alternate hypothesis is that at least one of the tyre brand populations is different
from the other three.
• Step 6 : Inference
http://rpubs.com/ibecav/308410
Assumption
NULL : Sales does not depend on Discount and Location.
• Date
• Close
• No. of Trade
• ClosePOC
• NOTPOC
• Month
• CGPA
• GENDER
• COURSE
• CTC
• We see that there is a 70% chance that this student will be admitted.
SEM Basics(5)
• SEM = Reliability + Regression
• Structural equation models combine measurement models (e.g.,
reliability) with structural models (e.g., regression). The sem package,
developed by John Fox, allows for some basic structural equation
models. To use it, add the sem package by using the package
manager.
End-term marks (40): A written exam will be conducted at the end of the
trimester. This will carry 40% of total marks