Introduction to NumPy Basics
NumPy Basics
NumPy
NumPy stands for Numerical Python. It is a library used for working with arrays.
NumPy was created in 2005. It is an open source library, so you can use it for free.
Why NumPy?
In Python, we can use a list as an array, but the processing speed of a list is very slow.
Arrays are frequently used in data science, where speed and accuracy are very
important. NumPy provides speed and accuracy for operations on arrays.
Array
An array is a collection of data stored in contiguous memory locations.
It is a data structure used to store elements of the same data type in an organized and faster way.
● One-dimensional array: it stores data sequentially. Data can be accessed by its index position in the array.
Multi-Dimensional Array
● Multi-dimensional array:
stored.
Creating a NumPy array - One-dimensional
NumPy is used to work with arrays. An array object in NumPy is known as ndarray.
This is a one-dimensional array.
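As a minimal sketch of creating a one-dimensional array, the values below are made up for illustration; any list of numbers of the same type would work.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration
arr = np.array([1, 2, 3, 4, 5])

print(arr)        # [1 2 3 4 5]
print(type(arr))  # <class 'numpy.ndarray'>
print(arr.ndim)   # 1 (number of dimensions)
```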
Creating a NumPy array - 2-dimensional
We can pass a list or a tuple to the array() function to create an ndarray. Here we have passed a tuple to the array() function.
A 2D array stores data in the form of rows and columns. To create one, we pass arrays inside an array according to our needs.
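Both ideas can be sketched as follows. The values and the variable names (from_tuple, arr_2d) are assumptions made for this example.

```python
import numpy as np

# A tuple is accepted by array() in the same way as a list
from_tuple = np.array((1, 2, 3, 4))

# A 2D array: each inner list becomes one row
arr_2d = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8]])

print(from_tuple)   # [1 2 3 4]
print(arr_2d.ndim)  # 2
print(arr_2d)
```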
NumPy array Shape
The shape is the number of elements in each dimension of the array, e.g. the number of rows and columns.
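A small sketch of reading the shape attribute, assuming a made-up array with 2 rows and 4 columns:

```python
import numpy as np

# Hypothetical array with 2 rows and 4 columns
arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8]])

# shape is a tuple: (number of rows, number of columns)
print(arr.shape)
```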
output
(2, 4)
NumPy Array - Reshaping
Reshaping means changing the shape of an array. By reshaping we can add or remove
dimensions and change the number of elements in each dimension.
From 1D to 2D
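One possible way to do this is sketched below, assuming a 1D array holding the numbers 1 to 12 and a target of 4 rows and 3 columns. The total number of elements must stay the same (12 = 4 x 3), otherwise reshape() raises an error.

```python
import numpy as np

arr = np.arange(1, 13)       # 1D array holding 1..12 (assumed input)
new_arr = arr.reshape(4, 3)  # 4 rows, 3 columns

print(new_arr)
```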
output
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11
NumPy Array - Joining
Joining arrays means putting the contents of two or more arrays together in one array. In NumPy we join arrays
along axes. The concatenate() function joins a sequence of arrays along an axis; if the axis is not explicitly
passed to the function, it is taken as zero.
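A short sketch of joining with concatenate(); the two input arrays are assumed values picked for illustration.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# The arrays are passed together as one tuple; axis defaults to 0
joined = np.concatenate((arr1, arr2))

print(joined)
```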
Output
[1,2,3,4,5,6]
NumPy Array - Splitting
Splitting is the opposite of joining: joining merges two arrays into one, while splitting breaks one array into two or
more. The array_split() function is used to split an array.
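A minimal sketch of splitting, assuming a six-element array divided into three parts:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# Split into 3 parts; the result is a list of arrays
parts = np.array_split(arr, 3)

print(parts)
```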
Output
[array([1,2]),array([3,4]),array([5,6])]
NumPy Array - Sorting
Sorting means arranging elements in an ordered list. The order can be numeric or alphabetic, ascending
or descending.
sort() is a method of the NumPy ndarray object that sorts the array in place, in ascending order by default. The np.sort() function returns a sorted copy instead.
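Sorting can be sketched as follows, assuming a small unsorted array. np.sort() returns a sorted copy, while the sort() method of the array changes the array itself.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])  # assumed unsorted input

sorted_copy = np.sort(arr)    # sorted copy; arr is left unchanged
print(sorted_copy)

arr.sort()                    # sorts arr in place
print(arr)
```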
Output
[0 1 2 3]
Random number in NumPy
A random number is one that cannot be predicted logically.
NumPy offers the random module to work with random numbers. The randint() function generates a random integer.
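A sketch of randint(), assuming an upper limit of 100. The result changes on every run, so no fixed output is shown here.

```python
from numpy import random

# Random integer from 0 up to (but not including) 100
x = random.randint(100)
print(x)

# The size argument returns an array of random integers instead
five = random.randint(100, size=5)
print(five)
```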
Output:
20
NumPy array operations
Adding two NumPy arrays
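Addition works element by element. In this sketch the two input arrays are assumptions, picked so that their sum is [3 5 7].

```python
import numpy as np

# Assumed inputs; arrays must have compatible shapes to be added
a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

total = a + b  # element-wise addition, same result as np.add(a, b)
print(total)
```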
Output: [3 5 7]
Series:
A Series is a one-dimensional array capable of storing data of any type (int, float, string, etc.).
Its size is immutable, but the values it holds are mutable. It stores
homogeneous data. For example:
56 34 24 75 47 10 23
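The example values above can be placed in a pandas Series as sketched below. The variable name marks is an assumption made for this example.

```python
import pandas as pd

# Values taken from the example above
marks = pd.Series([56, 34, 24, 75, 47, 10, 23])
print(marks)        # index on the left, values on the right

# Values are mutable: an element can be replaced by its index label
marks[0] = 60
print(marks[0])     # 60
print(marks.size)   # 7
```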
1. import pandas as pd
2. Variable_name = pd.read_csv("File path")     # for csv file
   Variable_name = pd.read_excel("File path")   # for excel file
   Variable_name = pd.read_json("File path")    # for json file
Selecting in Pandas: Selecting means picking all the rows and some of the columns, some of the rows
and all of the columns, or some of the rows and some of the columns.
print(df['Column_name'])   # here df is the DataFrame name and Column_name is the column you want to print
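The three kinds of selection can be sketched as follows. The DataFrame, its column names, and its values are all made up for illustration.

```python
import pandas as pd

# Hypothetical DataFrame used only for this example
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Age": [21, 25, 30],
    "City": ["Delhi", "Pune", "Agra"],
})

some_cols = df[["Name", "Age"]]         # all rows, some columns
some_rows = df.iloc[0:2]                # first two rows, all columns
subset = df.loc[0:1, ["Name", "City"]]  # some rows and some columns
                                        # (loc includes the end label)
print(some_cols)
print(some_rows)
print(subset)
```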
Missing Data - Handling & Filtering
Data from the real world is very messy and often contains many NaN and null values. To deal with missing values, pandas
has a lot of built-in functions.
Ticket False
Fare
Embarked False
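A few of the built-in functions for missing values are sketched below. The column names are borrowed from the listing above, but the values are made up, so the printed result will differ from that listing.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column
df = pd.DataFrame({
    "Fare": [7.25, np.nan, 8.05],
    "Embarked": ["S", "C", None],
})

# True for every column that contains at least one missing value
print(df.isnull().any())

# Filtering: drop every row that has a missing value
dropped = df.dropna()

# Handling: fill missing values, here with the column mean and a label
filled = df.fillna({"Fare": df["Fare"].mean(), "Embarked": "Unknown"})
print(filled)
```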
Adding new column to existing DataFrame in Pandas
Using DataFrame.insert() method.
table records.
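A minimal sketch of DataFrame.insert(), assuming a made-up DataFrame. The method takes the position, the new column name, and the values, and it modifies the DataFrame in place.

```python
import pandas as pd

# Hypothetical DataFrame used only for this example
df = pd.DataFrame({"Name": ["A", "B", "C"], "Age": [21, 25, 30]})

# insert(position, column name, values): adds "City" as the second column
df.insert(1, "City", ["Delhi", "Pune", "Agra"])

print(df.columns.tolist())  # ['Name', 'City', 'Age']
```

Plain assignment such as df["Country"] = "India" is another option when the new column can simply go at the end.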
Introduction to Matplotlib
What is Data visualization?
Data visualization is the graphical representation of information and data.
The human mind grasps a visual representation of data more easily than raw (text/numerical) data.
It is better to represent data through graphs and other visual aids, where we can analyze it more
efficiently and make specific decisions based on the analysis.
It also helps in finding hidden patterns in the data and showing how one variable is related to other variables.
from matplotlib import pyplot as plt
x, y = [1,2,5], [2,4,6]   #sample points
plt.plot(x,y)
plt.show()   #display the graph
Add labels in graph
from matplotlib import pyplot as plt
x=[1,2,5]
y=[2,4,6]
plt.plot(x,y)
plt.title("X-Y plot")    #adding title
plt.xlabel('X-axis')     #adding X label
plt.ylabel('Y-axis')     #adding y label
plt.show()               #display the graph
Subplot Function
The subplot() function is used to draw two or more plots in one figure.
We can use it to separate graphs that would otherwise be plotted on the same axes.
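from matplotlib import pyplot as plt
#hypothetical sample data; the original values are not shown
vehical = ['Car','Bike','Bus']
no_of_vehical = [12,30,5]
plt.figure(figsize=(9,3))
plt.subplot(131)   #1 row, 3 columns, 1st plot
plt.bar(vehical,no_of_vehical)
plt.subplot(132)   #2nd plot
plt.scatter(vehical,no_of_vehical)
plt.subplot(133)   #3rd plot
plt.plot(vehical,no_of_vehical)
plt.suptitle('Vehicle record')
plt.show()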
Types of Graph: Bar Graph
Bar graph : Bar graphs are one of the most common types of graphs and are used to show categorical data.
Matplotlib provides a bar() function to make bar graphs, which accepts arguments such as the categorical variables,
their values, and a color.
Ex:
from matplotlib import pyplot as plt
Team = ['KLP','RCB','RR','Mi']
match_wins = [11,8,15,7]
plt.bar(Team,match_wins,color = 'orange')
plt.title('Score Card')
plt.xlabel('Team')
plt.ylabel('match_wins')
plt.show()
Type of graph: Line Graph
Line graph : A line graph is a type of chart used to show information that changes over time. We plot line
graphs using several points connected by straight lines.
Ex:
#Basic example of plotting line graph
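A basic line graph could look like the sketch below. The seasons and win counts are hypothetical values chosen to show a quantity changing over time.

```python
from matplotlib import pyplot as plt

# Hypothetical data: matches won per season
season = [2018, 2019, 2020, 2021]
match_wins = [7, 9, 6, 11]

# plot() connects the points with straight lines; marker='o' shows each point
line, = plt.plot(season, match_wins, marker='o')
plt.xticks(season)            # one tick per season
plt.title('Wins per season')
plt.xlabel('Season')
plt.ylabel('match_wins')
plt.show()
```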
Type of graph: Pie Chart
Ex:
from matplotlib import pyplot as plt
Team = ['KLP','RCB','RR','Mi']
match_winning_percent = [23,45,10,22]
plt.pie(match_winning_percent, labels=Team, autopct='%1.1f%%',
        shadow=True, startangle=90)
plt.title('Score Card')
plt.show()
Type of graph: Histogram
Histogram : A histogram is a graphical representation that organizes a group of data points into user-defined ranges.
Although similar in appearance to a bar graph, the histogram condenses a data series into an easily interpreted visual by
taking many data points and grouping them into logical ranges, or bins.
Ex:
from matplotlib import pyplot as plt
height =[132,122,145,165,145,135,137,160,122,111,190,189,178,155]
bins=[100,120,140,160,180,200]
plt.hist(height,bins,histtype='bar', rwidth=0.8,color='red')
plt.title('Height of students')
plt.show()
Type of graph: Scatter Plot
Scatter plot : A scatter plot is a diagram where each value in the data set is represented by a dot. The Matplotlib
module has a method for drawing scatter plots.
Ex:
from matplotlib import pyplot as plt
from matplotlib import style
style.use('ggplot')
x1 = [1,7,13]
y1 = [12,11,6]
x2 = [4,9,11,5]
y2 = [7,14,17,6]
plt.scatter(x1, y1)
plt.scatter(x2, y2, color='y')
plt.title('Scatter plot')
plt.ylabel('Y-axis')
plt.xlabel('X-axis')
plt.show()
Type of graph: 3D Graph
3D graph : A 3-D graph is composed of three axes: X-axis, Y-axis, and Z-axis. Three-dimensional plots can be
created by importing the mplot3d toolkit and passing the 3D projection argument.
Ex:
from mpl_toolkits import mplot3d
import numpy as np
import matplotlib.pyplot as plt
height = np.array([100,110,87,85,65,80,96,75,42,59,54,63,95,71,86])
weight = np.array([105,123,84,85,78,95,69,42,87,91,63,83,75,41,80])
fig = plt.figure()
ax = plt.axes(projection='3d')
# This is used to plot 3D scatter; no z values are passed, so every point is drawn at z=0
ax.scatter3D(height,weight)
plt.title("3D Scatter Plot")
plt.xlabel("Height")
plt.ylabel("Weight")
plt.show()
Thank You...