Python Amit

PROBLEM :- 01
(A) Take a dataset which contains 20 rows and 7 columns.
Write the syntax for the following scenarios:
1. Find missing values in the dataset and replace them with the previous or next value.
2. Drop 1 column and 1 row from the dataset.
3. Access multiple rows.
4. Access multiple columns.
5. Use of loc and iloc.
6. Create a label or index by taking any example.
7. Print the first 5 rows of the DataFrame: head()
8. Print the last 5 rows of the DataFrame: tail()
9. Sort the data by axis = 1.
10. Perform data alignment.
11. Print information about the data.
12. Visualization of any column.
(B) Use the matplotlib library to plot data points in various styles.
Solution (A) :-
import pandas as pd
import numpy as np

# Create a dictionary with sample data
data = {
    'EmployeeID': range(1, 21),
    'Name': [
        'Amit Kumar Singh', 'Aashu Kumar', 'Abhishek Raj', 'Gulshan Kumar', 'Anmol Srivastava',
        'Ujwal Singh', 'Tej Pratap Singh', 'Chirag Goyal', 'Siddharth Pandey', 'Sudhanshu Yadav',
        'Sourav Keshri', 'Bibhuti Singh', 'Tanishq Tiwari', 'Mukesh Kumar', 'Vikhyat Singh',
        'Aditya Singh', 'Sarvesh Kumar', 'Ravi Prakash', 'Sachin Singh', 'Sanchit Mishra'
    ],
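    'Age': [
        29, 34, 22, 37, 28, 45, 31, 39, 23, 50,
        33, 40, 27, 44, 32,
        # NOTE: the last five ages are missing from the source; every column
        # needs 20 values before pd.DataFrame(data) will run
    ],
    # NOTE: the 'Department' column used in task 5 is missing from the source
    'Position': [
        'Developer', 'Manager', 'Analyst', 'Developer', 'Executive',
        'Specialist', 'Accountant', 'Manager', 'Director', 'Supervisor',
        'Developer', 'Analyst', 'Assistant', 'Coordinator', 'Executive',
        'Developer', 'Accountant', 'Supervisor', 'Manager', 'Developer'
    ],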
    'Salary': [
        60000, 75000, 80000, 62000, 50000, 68000, 57000, 90000, 95000, 85000,
        63000, 82000, 45000, 78000, 54000, 61000, 56000, 83000, 76000, 60000
    ],
    'DateOfJoining': [
        '2019-01-15', '2018-03-22', '2016-07-19', '2020-11-03', '2021-05-10',
        '2015-12-29', '2019-08-17', '2017-06-01', '2016-02-11', '2013-09-23',
        '2018-10-14', '2014-05-18', '2021-12-01', '2015-04-07', '2020-03-15',
        '2017-08-21', '2019-11-27', '2014-01-30', '2016-12-08', '2019-04-15'
    ]
}
# Create a pandas DataFrame from the dictionary
df = pd.DataFrame(data)

# Adjust display options to show all columns
pd.set_option('display.max_columns', None)

# Display the DataFrame
print(df)
Output:-
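1. Find missing values in the dataset and replace them with the previous or next value.

print("Missing values per column:\n", df.isnull().sum())  # Find missing values
df.ffill(inplace=True)  # Forward fill: replace a gap with the previous value
df.bfill(inplace=True)  # Backward fill: replace a remaining gap with the next value
print("After filling missing values with previous/next values:\n", df)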
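The sample dataset has no gaps in the columns that survive, so the fill calls above change nothing visible. As a minimal sketch, assuming a small hypothetical frame named demo (not the employee data), the difference between counting, forward filling and backward filling looks like this:

```python
import numpy as np
import pandas as pd

# Hypothetical four-row frame with gaps; not the employee dataset.
demo = pd.DataFrame({
    "Age": [29.0, np.nan, 22.0, np.nan],
    "Salary": [60000.0, 75000.0, np.nan, 62000.0],
})

# Count the missing values in each column.
missing_per_column = demo.isna().sum()

# Forward fill copies the previous valid value down into each gap.
forward_filled = demo.ffill()

# Backward fill copies the next valid value up into each gap.
# A gap in the last row has no next value, so it stays NaN.
backward_filled = demo.bfill()

print(missing_per_column)
print(forward_filled)
print(backward_filled)
```

Chaining the two fills, as the solution does, covers both edge cases: a gap in the first row (no previous value) and a gap in the last row (no next value).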
2. Drop 1 column and 1 row from the dataset.

df_dropped_col = df.drop(columns=['Position']) # Drop the 'Position' column
df_dropped_row = df_dropped_col.drop(index=[0]) # Drop the first row
print("After dropping a column and a row:\n", df_dropped_row)
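Because drop() is called without inplace=True, it returns a new DataFrame and leaves the original untouched. A minimal sketch, using a hypothetical three-row frame named demo, makes that visible:

```python
import pandas as pd

# Hypothetical three-row frame used only for illustration.
demo = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Position": ["Developer", "Manager", "Analyst"],
    "Salary": [60000, 75000, 80000],
})

# Drop one column, then the row labelled 0; each call returns a new frame.
trimmed = demo.drop(columns=["Position"]).drop(index=[0])

print(trimmed.shape)  # shape of the trimmed copy
print(demo.shape)     # the original keeps every row and column
```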
3. Access multiple rows

multiple_rows = df.iloc[5:11]  # Access rows at positions 5 to 10
print("Accessing multiple rows (5 to 10):\n", multiple_rows)
4. Access multiple columns

multiple_columns = df[['Name', 'Salary']]  # Access the 'Name' and 'Salary' columns
print("Accessing multiple columns ('Name' and 'Salary'):\n", multiple_columns)
5. Use of loc and iloc

# Using loc to access rows and columns by label (the end label 10 is included)
loc_access = df.loc[5:10, ['Name', 'Department', 'Salary']]
# Using iloc to access rows and columns by integer position (the end position 10 is excluded)
iloc_access = df.iloc[5:10, [1, 3, 5]]
print("Using loc to access data:\n", loc_access)
print("Using iloc to access data:\n", iloc_access)
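The same slice 5:10 selects a different number of rows with loc and with iloc. A minimal sketch, assuming a hypothetical twelve-row frame named demo with the default integer index:

```python
import pandas as pd

# Hypothetical frame with the default index 0..11.
demo = pd.DataFrame({"Name": list("ABCDEFGHIJKL"), "Salary": range(12)})

# loc slices by label and includes both end points: labels 5, 6, 7, 8, 9, 10.
by_label = demo.loc[5:10, ["Name", "Salary"]]

# iloc slices by position and excludes the end point: positions 5, 6, 7, 8, 9.
by_position = demo.iloc[5:10, [0, 1]]

print(len(by_label), len(by_position))
```

With the default index, labels and positions happen to coincide, which is why the two calls look interchangeable until the row counts are compared.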
6. Create a label or index by taking any example

df.set_index('EmployeeID', inplace=True) # Set 'EmployeeID' as the index
# Display the DataFrame to verify the index has been set
print("After setting 'EmployeeID' as the index:\n", df)
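An index does not have to come from an existing column; labels can also be supplied directly. The following sketch assumes hypothetical string labels (emp_a, emp_b, emp_c) and a hypothetical index name EmployeeCode:

```python
import pandas as pd

# Hypothetical labels passed straight to the constructor.
demo = pd.DataFrame(
    {"Salary": [60000, 75000, 80000]},
    index=["emp_a", "emp_b", "emp_c"],
)
demo.index.name = "EmployeeCode"

# Look up a single value by its row label and column name.
one_salary = demo.loc["emp_b", "Salary"]

# reset_index() turns the labels back into an ordinary column.
restored = demo.reset_index()

print(one_salary)
print(restored)
```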
7. Print the first 5 rows of the DataFrame: head()

print("First 5 rows of the DataFrame:\n", df.head())
8. Print the last 5 rows of the DataFrame: tail()

print("Last 5 rows of the DataFrame:\n", df.tail())
9. Sort the data by axis = 1

sorted_df = df.sort_index(axis=1)  # Sort the columns by their names
print("DataFrame sorted by columns:\n", sorted_df)
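Sorting with axis=1 reorders the column labels and leaves the row order alone, while axis=0 does the opposite. A minimal sketch, assuming a hypothetical two-row frame named demo whose rows and columns both start out unsorted:

```python
import pandas as pd

# Hypothetical frame: columns and row labels are deliberately out of order.
demo = pd.DataFrame(
    {"Salary": [60000, 75000], "Age": [29, 34], "Name": ["A", "B"]},
    index=[1, 0],
)

by_columns = demo.sort_index(axis=1)  # columns become Age, Name, Salary
by_rows = demo.sort_index(axis=0)     # row labels become 0, 1

print(by_columns)
print(by_rows)
```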
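10. Perform data alignment

# Create another DataFrame with a similar index
df2 = pd.DataFrame({
    'EmployeeID': range(1, 21),
    'Bonus': np.random.randint(1000, 5000, size=20)
}).set_index('EmployeeID')
# Align the rows of both frames on the shared index. axis=0 keeps each frame's
# own columns; without it the inner join is also applied to the column names,
# and the two frames have none in common.
aligned_df, aligned_df2 = df.align(df2, join='inner', axis=0)
print("Aligned DataFrame 1:\n", aligned_df)
print("Aligned DataFrame 2:\n", aligned_df2)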
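Alignment is easier to see when the two indexes only partly overlap. The sketch below assumes two hypothetical frames, left and right, and compares an inner join with an outer join on the rows, plus an inner join on both axes:

```python
import pandas as pd

# Hypothetical frames whose indexes share only the labels 2 and 3.
left = pd.DataFrame({"Salary": [60000, 75000, 80000]}, index=[1, 2, 3])
right = pd.DataFrame({"Bonus": [1500, 2500, 3500]}, index=[2, 3, 4])

# Inner join on the rows keeps only the shared labels.
inner_left, inner_right = left.align(right, join="inner", axis=0)

# Outer join on the rows keeps every label and fills the gaps with NaN.
outer_left, outer_right = left.align(right, join="outer", axis=0)

# With no axis given, the join applies to the columns as well; the frames
# share no column names, so an inner join leaves no columns at all.
both_left, both_right = left.align(right, join="inner")

print(inner_left, inner_right, sep="\n")
print(outer_left, outer_right, sep="\n")
print(both_left.shape)
```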
11. Print information about the data

print("Information about the DataFrame:")
df.info()  # info() prints its report itself and returns None
12. Visualization of any column

import matplotlib.pyplot as plt
plt.figure(figsize=(10, 5))
plt.plot(df.index, df['Salary'], marker='o')  # Plot the 'Salary' column
plt.title('Salary of Employees')
plt.xlabel('EmployeeID')
plt.ylabel('Salary')
plt.grid(True)
plt.show()
(B) Use the matplotlib library to plot data points in various styles.
Solution (B):-
# Scatter plot
plt.figure(figsize=(10, 5))
plt.scatter(df.index, df['Salary'], color='red')
plt.title('Scatter Plot of Salary')
plt.grid(True)
plt.show()
# Bar plot
plt.bar(df.index, df['Salary'], color='blue')
plt.title('Bar Plot of Salary')
plt.show()
# Histogram
plt.hist(df['Salary'], bins=10, color='green')
plt.title('Histogram of Salary')
plt.xlabel('Salary')
plt.ylabel('Frequency')
plt.show()
# Line plot
plt.plot(df.index, df['Salary'], color='purple', marker='o', linestyle='-')  # 'EmployeeID' is the index after task 6
plt.title('Line Plot of Salary')
plt.grid(True)
plt.show()
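The plots above vary the chart type. Another reading of "various styles" is to vary the marker and line style of the same data points. The following is a sketch under that assumption, using the first five salaries from the dataset and the non-interactive Agg backend so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in an interactive session
import matplotlib.pyplot as plt

employee_ids = [1, 2, 3, 4, 5]
salaries = [60000, 75000, 80000, 62000, 50000]  # first five salaries of the dataset

# Four marker / line-style combinations chosen for illustration.
styles = [
    {"marker": "o", "linestyle": "-", "color": "purple"},
    {"marker": "s", "linestyle": "--", "color": "green"},
    {"marker": "^", "linestyle": ":", "color": "red"},
    {"marker": "x", "linestyle": "None", "color": "blue"},  # markers only
]

# One subplot per style so the differences are easy to compare.
fig, axes = plt.subplots(2, 2, figsize=(10, 6), sharex=True, sharey=True)
for ax, style in zip(axes.flat, styles):
    ax.plot(employee_ids, salaries, **style)
    ax.set_title(f"marker={style['marker']!r}, linestyle={style['linestyle']!r}")
    ax.grid(True)

fig.tight_layout()
```

In an interactive session, drop the matplotlib.use("Agg") line and finish with plt.show() to display the figure.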
PROBLEM:- 2
Show the output of the following syntax:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['P', 'Q', 'R', 'S'])
df
Output:
   P  Q   R   S
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11
df.drop(['Q', 'R'], axis=1)
Output:
   P   S
0  0   3
1  4   7
2  8  11
df.drop([0, 1])
Output:
   P  Q   R   S
2  8  9  10  11
PROBLEM:-03
import pandas as pd
import numpy as np

# dictionary of lists
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, 45, 56, np.nan],
          'Third Score': [np.nan, 40, 80, 98]}

# creating a dataframe from the dictionary
df = pd.DataFrame(scores)

# using isnull(), notnull() and the fill functions
print(df.isnull())
print(df.notnull())
print(df.fillna(0))
print(df.ffill())  # same result as the older fillna(method='pad')
print(df.bfill())  # same result as the older fillna(method='bfill')
Output:- 1.
2.
3.
4.
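The printed outputs did not survive in this copy. As a hedged sketch, the results of the five calls can be checked programmatically; the variable names below (scores, null_mask and so on) are illustrative, not from the original:

```python
import numpy as np
import pandas as pd

scores = pd.DataFrame({
    "First Score": [100, 90, np.nan, 95],
    "Second Score": [30, 45, 56, np.nan],
    "Third Score": [np.nan, 40, 80, 98],
})

null_mask = scores.isnull()        # True where a value is missing
not_null_mask = scores.notnull()   # the exact opposite of null_mask
zero_filled = scores.fillna(0)     # every gap becomes 0
padded = scores.ffill()            # gap takes the value from the row above
back_filled = scores.bfill()       # gap takes the value from the row below

# A gap in the first row survives ffill; a gap in the last row survives bfill.
print(padded)
print(back_filled)
```

Here 'Third Score' keeps its NaN in row 0 after the forward fill, and 'Second Score' keeps its NaN in row 3 after the backward fill, because neither gap has a neighbour in the fill direction.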
