Lab Guide-AI-powered Smart Site Selection V1
Lab Guide
1 Lab Environment
1.1 Introduction
1.1.1 About This Test
In this test, HUAWEI CLOUD ModelArts is used for model development and training, and the
model training result is sent to the application service device (development board).
Click Console in the upper right corner of the page. The console page is displayed.
In the service list on the left, choose Storage > Object Storage Service.
In the upper right corner of the page, click Create Bucket to create a bucket for storing the
data and configuration files required in this test.
Set the bucket name, for example, aiot2. Retain the default values for other parameters.
After the setting is complete, click Create Now.
Return to the Object Storage Service page. If the bucket name created in the previous step
exists in the bucket name list, the bucket is successfully created.
In the bucket list, click the name of the created bucket. On the aiot2 page, click the Objects
tab. ModelArts cannot be directly associated with the root directory of OBS. Therefore, you
need to create a data folder to store the files required in this test.
If the uploaded files exist in the object list, the upload is successful.
The instance name can be customized, for example, notebook-AIoT. Select Python3 as the
work environment and OBS as the storage, and set the data storage path.
If the created instance name exists in the notebook instance list, the creation is successful.
In this test, methods in the sklearn library are used. Therefore, select TensorFlow-1.13.1
during the creation.
The Sync OBS function is used to synchronize the objects selected in the list of notebook
instance files from the OBS bucket to the current container directory ~/work. Select the data
files to be synchronized and click Sync OBS. In the dialog box that is displayed, select YES.
----End
2.1 Introduction
2.1.1 About This Test
Vending machines need to be restocked regularly. Where should the distribution warehouses
be located so that restocking is economical and efficient? This section applies k-means
clustering to the location (longitude and latitude) of each device to determine the optimal
locations of the distribution warehouses.
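The principle behind k-means can be sketched in a few lines of NumPy. The function below is an illustrative implementation of the standard assign-then-update loop (Lloyd's algorithm), not the code used by sklearn. The function name, the iteration limit, and the sample coordinates are assumptions made for this example only.

```python
import numpy as np

def kmeans_sketch(points, k, n_iter=100, seed=0):
    """Minimal k-means (Lloyd's algorithm); returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct points drawn at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: find the nearest center for every point.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its assigned points.
        # An empty cluster keeps its previous center.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # The centers stopped moving, so the loop has converged.
        centers = new_centers
    return centers, labels

# Hypothetical device coordinates (longitude, latitude) in two obvious groups.
demo_points = np.array([
    [116.30, 39.90], [116.32, 39.92], [116.31, 39.91],
    [121.47, 31.23], [121.48, 31.22], [121.46, 31.24],
])
demo_centers, demo_labels = kmeans_sketch(demo_points, k=2)
print(demo_labels)
print(demo_centers)
```

Each returned center is the mean position of the devices assigned to it, which is why a cluster center is a natural candidate for a warehouse location.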
2.1.2 Objectives
On completion of this test, you will have mastered:
Important parameters of clustering models
Principles of the k-means clustering method
K-means clustering modeling method
2.2 Procedure
2.2.1 Importing Data
Import data file details.csv.
import pandas as pd
# Adjust the path based on site requirements.
data = pd.read_csv('./details.csv')
print(data.shape)  # Check whether the number of samples is correct.
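Beyond the sample count, it can help to check column types and missing values before clustering, because k-means cannot process NaN values. The following sketch uses a small in-memory table in place of details.csv. The column names (device_id, longitude, latitude) are assumptions and may differ from the real file.

```python
import io
import pandas as pd

# Hypothetical stand-in for details.csv; the real column names may differ.
sample_csv = io.StringIO(
    "device_id,longitude,latitude\n"
    "d001,116.30,39.90\n"
    "d002,121.47,31.23\n"
    "d003,,22.54\n"
)
sample = pd.read_csv(sample_csv)

print(sample.shape)         # (number of rows, number of columns)
print(sample.dtypes)        # The coordinate columns should be numeric.
print(sample.isna().sum())  # Number of missing values per column.

# Drop rows whose coordinates are missing before clustering.
clean = sample.dropna(subset=["longitude", "latitude"])
print(clean.shape)
```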
The estimators in sklearn.cluster operate on array data. To avoid format-related errors,
convert the DataFrame that holds the device addresses (addr) to a NumPy array before
clustering.
# addr is the DataFrame that holds the device longitudes and latitudes.
addr_1 = addr.values
addr_1  # View the data.
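The conversion can be tried on a small table first. In the sketch below, the DataFrame contents and the column names are assumptions for illustration. Only the coordinate columns are selected, so the resulting array is purely numeric.

```python
import pandas as pd

# Hypothetical device table; the column names are assumptions.
frame = pd.DataFrame({
    "device_id": ["d001", "d002", "d003"],
    "longitude": [116.30, 121.47, 113.26],
    "latitude": [39.90, 31.23, 23.13],
})

# Keep only the numeric coordinate columns, then convert to a NumPy array.
coords = frame[["longitude", "latitude"]]
coords_array = coords.to_numpy()  # Equivalent to coords.values

print(type(coords_array))
print(coords_array.shape)  # One row per device, one column per coordinate.
```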
Call KMeans from the sklearn.cluster library to cluster the device addresses. The
n_clusters parameter specifies the number of clusters and is set manually based on site
requirements. random_state seeds the random number generator used to select the initial
cluster centers, which makes the clustering result reproducible.
from sklearn.cluster import KMeans
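As an illustration of the two parameters, the following sketch fits KMeans on a handful of made-up coordinates. The data, the variable names, and the choice of three clusters are assumptions for this example and are not taken from the lab data. n_init is set explicitly because its default value differs between sklearn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical (longitude, latitude) pairs forming three separate groups.
demo_addr = np.array([
    [116.30, 39.90], [116.32, 39.92], [116.31, 39.91],
    [121.47, 31.23], [121.48, 31.22], [121.46, 31.24],
    [113.26, 23.13], [113.27, 23.12], [113.25, 23.14],
])

# n_clusters: number of clusters; random_state: seed for center initialization.
demo_kmeans = KMeans(n_clusters=3, n_init=10, random_state=12)
demo_kmeans.fit(demo_addr)
print(demo_kmeans.labels_)  # Cluster index assigned to each device.

# Fitting again with the same random_state yields the same labels.
repeat_kmeans = KMeans(n_clusters=3, n_init=10, random_state=12).fit(demo_addr)
print((repeat_kmeans.labels_ == demo_kmeans.labels_).all())
```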
Use the silhouette_score() function in the sklearn.metrics package to evaluate the clustering
quality. The score ranges from -1 to 1, and a higher score indicates better-separated clusters.
from sklearn.metrics import silhouette_score
silhouette_score(addr_1,sites_kmeans.labels_,sample_size=1000)
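## Four clusters
sites_kmeans_4 = KMeans(n_clusters=4, random_state=12)
sites_kmeans_4.fit(addr_1)
print(silhouette_score(addr_1, sites_kmeans_4.labels_, sample_size=1000))
## Five clusters
sites_kmeans_5 = KMeans(n_clusters=5, random_state=12)
sites_kmeans_5.fit(addr_1)
print(silhouette_score(addr_1, sites_kmeans_5.labels_, sample_size=1000))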
The results show that the model evaluation result is best when the number of clusters is 4.
If the construction budget allows, four distribution warehouses are recommended.
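The comparison across cluster counts can also be written as a loop. The sketch below runs on synthetic data generated around four hypothetical centers, so the data, the candidate range of k, and the variable names are assumptions rather than the lab's actual values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic devices scattered tightly around four hypothetical locations.
rng = np.random.default_rng(12)
true_centers = np.array([
    [116.4, 39.9], [121.5, 31.2], [113.3, 23.1], [104.1, 30.7],
])
demo_addr = np.vstack([
    center + rng.normal(scale=0.05, size=(50, 2)) for center in true_centers
])

# Fit one model per candidate cluster count and record its silhouette score.
scores = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=12).fit(demo_addr)
    scores[k] = silhouette_score(demo_addr, model.labels_)

for k, score in scores.items():
    print(k, round(score, 3))

# The cluster count with the highest silhouette score.
best_k = max(scores, key=scores.get)
print('Best number of clusters:', best_k)
```

The silhouette score only measures cluster separation. Cost constraints, such as how many warehouses can be built, still have to be weighed separately.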
----End
inertia_: sum of the squared distances from each point to the centroid of its cluster.
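The definition of inertia_ can be verified by recomputing it by hand. The sketch below uses made-up coordinates. The data and the variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical (longitude, latitude) pairs in two groups.
demo_addr = np.array([
    [116.30, 39.90], [116.35, 39.95], [116.28, 39.88],
    [121.47, 31.23], [121.50, 31.20], [121.44, 31.27],
])
model = KMeans(n_clusters=2, n_init=10, random_state=12).fit(demo_addr)

# Look up the center of the cluster that each point belongs to.
assigned_centers = model.cluster_centers_[model.labels_]

# Sum of squared Euclidean distances from each point to its own center.
manual_inertia = ((demo_addr - assigned_centers) ** 2).sum()

print(manual_inertia)
print(model.inertia_)  # Should agree up to floating-point rounding.
```

For the same data, inertia_ decreases as the number of clusters grows, so it is more useful for comparing models with the same n_clusters than for choosing n_clusters on its own.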
print('Labels:',sites_kmeans_4.labels_[:5])
print('Pred:',sites_kmeans_4.predict(addr_1)[:5])
print('Centers:\n',sites_kmeans_4.cluster_centers_)
# Save the longitude and latitude information, which can be used in DLV display.
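One possible way to save the cluster centers is to write them to a CSV file with pandas, as sketched below. The output path, the file name, the column names, and the sample data are all assumptions. The format that DLV expects is not specified in this guide and should be checked before use.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical (longitude, latitude) pairs in two groups.
demo_addr = np.array([
    [116.30, 39.90], [116.32, 39.92], [116.31, 39.91],
    [121.47, 31.23], [121.48, 31.22], [121.46, 31.24],
])
model = KMeans(n_clusters=2, n_init=10, random_state=12).fit(demo_addr)

# One row per cluster center; the column names are assumptions.
centers_df = pd.DataFrame(model.cluster_centers_, columns=["longitude", "latitude"])
centers_df.index.name = "cluster"

# Hypothetical output location; replace it with the path you need.
out_path = os.path.join(tempfile.gettempdir(), "cluster_centers.csv")
centers_df.to_csv(out_path)

# Read the file back to confirm what was written.
saved = pd.read_csv(out_path)
print(saved)
```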
The devices are divided into four clusters based on their longitudes and latitudes. Plot the
clustering result of each device.
import matplotlib.pyplot as plt
plt.figure(figsize=[12,9])
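A complete plot might look like the following sketch, which colors each device by its cluster label and marks the cluster centers. It runs on synthetic data. The data, the colormap, and the marker styles are assumptions, not the lab's original plotting code.

```python
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend; omit this line in a notebook.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Synthetic devices scattered around four hypothetical locations.
rng = np.random.default_rng(12)
true_centers = np.array([
    [116.4, 39.9], [121.5, 31.2], [113.3, 23.1], [104.1, 30.7],
])
demo_addr = np.vstack([
    center + rng.normal(scale=0.05, size=(50, 2)) for center in true_centers
])
model = KMeans(n_clusters=4, n_init=10, random_state=12).fit(demo_addr)

fig, ax = plt.subplots(figsize=[12, 9])
# Devices, colored by the cluster they were assigned to.
ax.scatter(demo_addr[:, 0], demo_addr[:, 1], c=model.labels_, cmap="tab10", s=20)
# Cluster centers, that is, the candidate warehouse locations.
ax.scatter(model.cluster_centers_[:, 0], model.cluster_centers_[:, 1],
           c="black", marker="x", s=200, label="Cluster centers")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.legend()
```

In a notebook, the figure is displayed after the cell runs. In a script, call plt.show() or fig.savefig() to view or store it.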
----End