Answer Key
PART-A
1. 1. Business understanding
   2. Data understanding
   3. Data preparation
   4. Modelling
   5. Evaluation
   6. Deployment
2. IBM Watson IoT Platform, Microsoft Azure IoT Suite, Google Cloud IoT, Amazon AWS IoT
4.
5. Latency
   Energy efficiency
   Privacy
   Scalability
6. # hdfs is assumed to be an already-connected HDFS client object
   with hdfs.open('/tmp/file1.txt', 'wb') as f:
       f.write(b'You are Awesome!')
   with hdfs.open('/tmp/file1.txt') as f:
       print(f.read())
8. model = LinearRegressor(d)
   loss = model.fit(X_train, Y_train, 20000)  # epochs = 20000
   model.predict()
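The answer above does not define LinearRegressor. A minimal sketch of what such a class might look like is given below, assuming d is the number of input features and that fit() trains by gradient descent on the mean squared error. The class body, the lr parameter, the per-epoch loss list returned by fit(), and the toy data are all assumptions made for illustration, not the original implementation.

```python
# Hypothetical LinearRegressor: the class name and call shapes follow the
# answer above, but the implementation itself is an assumption.
class LinearRegressor:
    def __init__(self, d):
        self.w = [0.0] * d   # one weight per input feature
        self.b = 0.0         # bias term

    def predict(self, X):
        # Linear model: y = w . x + b for every row x in X
        return [sum(wj * xj for wj, xj in zip(self.w, x)) + self.b for x in X]

    def fit(self, X, Y, epochs, lr=0.01):
        n = len(X)
        losses = []
        for _ in range(epochs):
            preds = self.predict(X)
            errors = [p - y for p, y in zip(preds, Y)]
            losses.append(sum(e * e for e in errors) / n)  # mean squared error
            # Gradient of the MSE with respect to each weight and the bias
            for j in range(len(self.w)):
                grad_w = 2.0 / n * sum(e * x[j] for e, x in zip(errors, X))
                self.w[j] -= lr * grad_w
            self.b -= lr * 2.0 / n * sum(errors)
        return losses


# Toy data generated from y = 2x + 1 (assumed for illustration)
X_train = [[0.0], [1.0], [2.0], [3.0], [4.0]]
Y_train = [1.0, 3.0, 5.0, 7.0, 9.0]

model = LinearRegressor(1)
loss = model.fit(X_train, Y_train, 20000)  # epochs = 20000
prediction = model.predict([[5.0]])
```

In this sketch predict() takes the rows to score as an argument, and the recorded loss should fall as training proceeds.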
9. Supervised vs. Unsupervised
   Supervised                              Unsupervised
   Uses known, labeled data as input       Uses unknown (unlabeled) data as input
   The number of classes is known          The number of classes is not known
   Ex: Optical Character Recognition       Ex: Face Recognition
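The contrast in the comparison above can be illustrated with a small sketch. The data points, the label names, and the helper functions nearest_neighbor_label and two_means are hypothetical and chosen only for illustration. The supervised half uses the known labels to classify a new point, while the unsupervised half groups the same values without any labels.

```python
# Supervised: labels are known, so a new point takes the label of its
# nearest labeled neighbor.
labeled = [(1.0, "low"), (1.5, "low"), (8.0, "high"), (9.0, "high")]

def nearest_neighbor_label(x, labeled_points):
    return min(labeled_points, key=lambda p: abs(p[0] - x))[1]

# Unsupervised: no labels are given; 2-means clustering groups the points
# purely by how close they are to each other.
def two_means(points, iterations=10):
    c0, c1 = min(points), max(points)  # simple initial centroids
    for _ in range(iterations):
        g0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        g1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0 = sum(g0) / len(g0)
        c1 = sum(g1) / len(g1)
    return g0, g1

predicted = nearest_neighbor_label(2.0, labeled)
clusters = two_means([1.0, 1.5, 8.0, 9.0])
```

The clustering finds two groups but cannot name them. Naming the groups needs the labels that only the supervised setting provides.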
10. Linear regression is a supervised learning task. It helps us to find the relationship
between the dependent variable y and the independent variable(s) x.
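For a single independent variable, the relationship can be found in closed form with ordinary least squares. The sketch below assumes one feature and a small made-up data set. The function name fit_line and the sample values are illustrative and not part of the original answer.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of co-deviations of x and y / sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Sample points that lie exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

Here x is the independent variable and y the dependent one. The fitted slope and intercept describe how y changes with x.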
PART-B
11. In HDF5 files, data is organized into groups and datasets. A group is a collection of
groups or datasets. A dataset is a multidimensional homogeneous array.
PyTables:
1. Get the numeric data:
import numpy as np
arr = np.loadtxt('temp.csv', skiprows=1, usecols=(2, 3), delimiter=',')
2. Open the HDF5 file:
import tables
h5filename = 'pytable_demo.hdf5'
h5file = tables.open_file(h5filename, mode='w')
3. Get the root node:
root = h5file.root
4. Create a group with create_group() or a dataset with create_array(), and
repeat this until all the data is stored:
h5file.create_array(root,'global_power',arr)
5. Close the file:
h5file.close()
Pandas:
import numpy as np
import pandas as pd
arr = np.loadtxt('temp.csv', skiprows=1, usecols=(2, 3), delimiter=',')
store=pd.HDFStore('hdfstore_demo.hdf5')
print(store)
store['global_power']=pd.DataFrame(arr)
store.close()
h5py:
import h5py
hdf5file = h5py.File('pytable_demo.hdf5', 'r')  # open read-only
ds = hdf5file['/global_power']
print(ds)
# Print each row of the stored dataset
for i in range(len(ds)):
    print(ds[i])
hdf5file.close()