Lab Guide-AI-powered Smart Site Selection V1


Change History

Course Code Applicable To Product Version Issue

Developed By Time Reviewed By New/Update What's New

Zhu Yingying/WX522971 2019.12.11 New


AI-powered Smart Site Selection

Lab Guide

Huawei Technologies Co., Ltd.



1 Lab Environment

1.1 Introduction
1.1.1 About This Test
In this test, HUAWEI CLOUD ModelArts is used for model development and training, and the
model training result is sent to the application service device (development board).

1.1.2 Test Environment


ModelArts is a one-stop AI development platform that covers data processing, model training,
model management, and deployment. It also provides AI Market for sharing models, APIs, and
datasets.

Figure 1-1 ModelArts architecture

1.1.3 Environment Preparation


Step 1 Log in to the HUAWEI CLOUD website at https://www.huaweicloud.com/en-us/.

Before logging in to HUAWEI CLOUD, register an account at https://www.huaweicloud.com/en-us/.
Click Log In and enter the username and password as prompted.

Step 2 Log in to the management console.

Click Console in the upper right corner of the page. The console page is displayed.

Step 3 Create an Object Storage Service (OBS) bucket.

In the service list on the left, choose Storage > Object Storage Service.

In the upper right corner of the page, click Create Bucket to create a bucket for storing the
data and configuration files required in this test.

Set the bucket name, for example, aiot2. Retain the default values for other parameters.
After the setting is complete, click Create Now.

Return to the Object Storage Service page. If the bucket name created in the previous step
exists in the bucket name list, the bucket is successfully created.

Step 4 Upload files.

In the bucket list, click the name of the created bucket. On the aiot2 page, click the Objects
tab. ModelArts cannot be directly associated with the root directory of OBS. Therefore, you
need to create a data folder to store the files required in this test.

Go to the data folder and click Upload Object.

Select the data files and configuration files to be uploaded.

If the uploaded files exist in the object list, the upload is successful.

Step 5 Create a ModelArts notebook instance.

Choose Service List > EI Enterprise Intelligence > ModelArts.



Choose DevEnviron > Notebooks and click Create.



Step 6 Configure a ModelArts instance.

The instance name can be customized, for example, notebook-AIoT. Select Python3 as the
work environment and OBS as the storage, and set the data storage path.

If the created instance name exists in the notebook instance list, the creation is successful.

Step 7 Create a notebook.

In this test, methods in the sklearn library are used. Therefore, select TensorFlow-1.13.1
during the creation.

Step 8 Synchronize data using the Sync OBS function.

The Sync OBS function is used to synchronize the objects selected in the list of notebook
instance files from the OBS bucket to the current container directory ~/work. Select the data
files to be synchronized and click Sync OBS. In the dialog box that is displayed, select YES.

If information similar to the following is displayed, the data synchronization is successful.



----End

2 Distribution Warehouse Location Selection

2.1 Introduction
2.1.1 About This Test
Vending machines must be restocked regularly. Where should the distribution warehouse be
located so that goods are delivered economically and efficiently? This section applies
k-means clustering to the location (longitude and latitude) of each device and uses the
resulting cluster centers as candidate locations for the distribution warehouse.

2.1.2 Objectives
On completion of this test, you will understand:
- Important parameters of clustering models
- Principles of the k-means clustering method
- How to build a k-means clustering model
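
The principle behind k-means can be sketched in a few lines of NumPy. The function below is a minimal illustration of the standard assign-and-update loop (often called Lloyd's algorithm), not the implementation used by sklearn. The function name lloyd_kmeans and the sample coordinates are hypothetical.

```python
import numpy as np

def lloyd_kmeans(points, k, n_iter=100, seed=0):
    """Minimal k-means (Lloyd's algorithm); illustrative only."""
    rng = np.random.default_rng(seed)
    # Start from k distinct points chosen at random as initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each point with its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its assigned points.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Hypothetical coordinates: two well-separated groups of devices.
demo_points = np.array([[120.0, 30.0], [120.1, 30.1], [120.2, 30.0],
                        [121.5, 28.0], [121.6, 28.1], [121.4, 28.0]])
demo_centers, demo_labels = lloyd_kmeans(demo_points, k=2)
print(demo_centers)
print(demo_labels)
```

Each pass assigns every point to its nearest center and then moves each center to the mean of its points. The loop stops once the centers no longer move.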

2.2 Procedure
2.2.1 Importing Data
Import the data file details.csv.
import pandas as pd

#Adjust the path based on site requirements.
data = pd.read_csv('./details.csv')
print(data.shape) #Check whether the number of samples is correct.
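
Besides the shape, it may help to confirm that the columns needed later are present. The sketch below uses a small in-memory stand-in for details.csv. The device_id column and the sample rows are assumptions, because the real file layout is not shown in this guide.

```python
import io
import pandas as pd

# Hypothetical stand-in for details.csv; the real file may contain other columns.
sample_csv = io.StringIO(
    "device_id,longitude,latitude\n"
    "d1,120.82,29.58\n"
    "d2,120.57,30.63\n"
    "d3,120.27,30.17\n"
)
sample = pd.read_csv(sample_csv)

# Fail early if the columns needed for clustering are missing.
required = {"longitude", "latitude"}
missing = required - set(sample.columns)
if missing:
    raise ValueError(f"Missing columns: {sorted(missing)}")

print(sample.shape)   # (rows, columns)
print(sample.dtypes)  # longitude and latitude should be numeric
```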

2.2.2 Processing Data


Extract the longitude and latitude information about the device from the raw data for
clustering.
addr = data[['longitude','latitude']]
addr.head()

Part of the data:
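
KMeans rejects input that contains missing values, so it may be worth checking the coordinates before clustering. The sketch below runs on made-up rows. Whether the real details.csv contains missing or out-of-range values is not stated in this guide.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing latitude and one out-of-range longitude.
demo = pd.DataFrame({
    "longitude": [120.82, 120.57, 520.27, 120.28],
    "latitude": [29.58, np.nan, 30.17, 28.15],
})

coords = demo[["longitude", "latitude"]]
print(coords.isna().sum())  # number of missing values per column

# Keep only complete rows whose coordinates fall inside the valid ranges.
valid = coords.dropna()
valid = valid[valid["longitude"].between(-180, 180)
              & valid["latitude"].between(-90, 90)]
print(valid)
```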



2.2.3 Clustering Data


Step 1 Convert the data format.

The estimators in sklearn.cluster operate on array-like input, and later steps in this test
select columns by position (for example, addr_1[:,0]), which a DataFrame does not support.
Therefore, convert the DataFrame addr to a NumPy array.
addr_1=addr.values
addr_1 #View the data.

Part of the data:


array([[120.82,  29.58],
       [120.57,  30.63],
       [120.27,  30.17],
       ...,
       [120.28,  28.15],
       [120.3 ,  30.42],
       [120.2 ,  30.27]])
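
As a small illustration of the conversion, the sketch below compares the .values attribute with the to_numpy() method available in pandas 0.24 and later. The two-row DataFrame is made up for the example.

```python
import numpy as np
import pandas as pd

demo_addr = pd.DataFrame({"longitude": [120.82, 120.57], "latitude": [29.58, 30.63]})

arr_values = demo_addr.values        # attribute used in this guide
arr_numpy = demo_addr.to_numpy()     # equivalent method in pandas 0.24 and later

print(type(arr_values), arr_values.shape, arr_values.dtype)
print(np.array_equal(arr_values, arr_numpy))
print(arr_values[:, 0])              # positional column access: longitudes
```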

Step 2 Create a clustering model.

Invoke KMeans in the sklearn.cluster library to cluster the device addresses. The n_clusters
parameter specifies the number of clusters and is set manually based on site requirements.
The random_state parameter fixes the seed of the random number generator used to select the
initial cluster centers, so that the result is reproducible.
from sklearn.cluster import KMeans

sites_kmeans = KMeans(n_clusters=3,random_state = 12) #Set three clusters.


sites_kmeans.fit(addr_1)

After fitting, the model and its parameters are displayed:


KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
       n_clusters=3, n_init=10, n_jobs=None, precompute_distances='auto',
       random_state=12, tol=0.0001, verbose=0)
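
The effect of random_state can be checked with a short experiment: fitting twice with the same seed on the same data should give the same centers. The sketch below uses made-up coordinates around three hypothetical centers. It passes n_init explicitly because the default value differs between sklearn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical coordinates drawn around three made-up centers.
rng = np.random.default_rng(0)
true_centers = np.array([[120.2, 30.3], [121.6, 29.9], [119.3, 28.9]])
demo_X = np.vstack([c + rng.normal(scale=0.05, size=(50, 2)) for c in true_centers])

# n_init is set explicitly because its default differs across sklearn versions.
km_a = KMeans(n_clusters=3, random_state=12, n_init=10).fit(demo_X)
km_b = KMeans(n_clusters=3, random_state=12, n_init=10).fit(demo_X)

# Same data and same seed: the fitted centers should coincide.
print(np.allclose(km_a.cluster_centers_, km_b.cluster_centers_))
print(km_a.n_iter_, km_a.inertia_)
```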

2.2.4 Evaluating the Clustering Effect


Step 1 Evaluate the clustering effect.

Use the silhouette_score() method in the sklearn.metrics package to evaluate the clustering
effect.
from sklearn.metrics import silhouette_score

silhouette_score(addr_1,sites_kmeans.labels_,sample_size=1000)

The output is as follows:


0.46196462466163146
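
Because sample_size=1000 makes silhouette_score evaluate a random subsample, the score can vary slightly between runs when the dataset has more than 1000 points. One possible way to make it repeatable is to pass random_state as well. The sketch below assumes made-up data with two groups.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(1)
# Hypothetical data: two made-up groups of 600 points each.
demo_X = np.vstack([
    rng.normal(loc=[120.0, 30.0], scale=0.1, size=(600, 2)),
    rng.normal(loc=[121.5, 28.5], scale=0.1, size=(600, 2)),
])
demo_labels = KMeans(n_clusters=2, random_state=12, n_init=10).fit_predict(demo_X)

# Fixing random_state makes the 1000-point subsample, and hence the score, repeatable.
s1 = silhouette_score(demo_X, demo_labels, sample_size=1000, random_state=0)
s2 = silhouette_score(demo_X, demo_labels, sample_size=1000, random_state=0)
print(s1 == s2, round(s1, 3))
```

The silhouette score ranges from -1 to 1. Values closer to 1 indicate compact, well-separated clusters.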

Step 2 Compare the clustering effects of different k values.

Set the number of clusters to 2, 4, and 5, respectively.


##Two clusters
sites_kmeans_2 = KMeans(n_clusters=2,random_state = 12)
sites_kmeans_2.fit(addr_1)

##Four clusters
sites_kmeans_4 = KMeans(n_clusters=4,random_state = 12)
sites_kmeans_4.fit(addr_1)

##Five clusters
sites_kmeans_5 = KMeans(n_clusters=5,random_state = 12)
sites_kmeans_5.fit(addr_1)

View the clustering effect.


print("When the number of clusters is 2",silhouette_score(addr_1,sites_kmeans_2.labels_, sample_size=1000))
print("When the number of clusters is 4",silhouette_score(addr_1,sites_kmeans_4.labels_, sample_size=1000))
print("When the number of clusters is 5",silhouette_score(addr_1,sites_kmeans_5.labels_, sample_size=1000))

The output is as follows:


When the number of clusters is 2 0.4488362943453525
When the number of clusters is 4 0.5168478728215721
When the number of clusters is 5 0.5120938006547828

The result shows that, among the k values 2, 3, 4, and 5, the silhouette score is highest
(about 0.517) when the number of clusters is 4. If the construction budget allows, four
distribution warehouses are recommended.
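
The comparison can also be written as a loop rather than one model per k. The following is a sketch under assumptions: the helper name score_k_values and the four-group sample data are hypothetical, and the scores it prints are unrelated to the values reported in this guide.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def score_k_values(X, k_values, seed=12):
    """Return {k: silhouette score} for each candidate number of clusters."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return scores

# Hypothetical data with four made-up, well-separated groups.
rng = np.random.default_rng(2)
centers = [[120.2, 30.3], [120.7, 28.1], [121.6, 29.9], [119.3, 28.9]]
demo_X = np.vstack([rng.normal(loc=c, scale=0.05, size=(40, 2)) for c in centers])

scores = score_k_values(demo_X, range(2, 7))
best_k = max(scores, key=scores.get)
print(scores)
print("Best k:", best_k)
```

With the data from this test, the same helper could be called as score_k_values(addr_1, range(2, 6)).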

----End

2.2.5 Viewing Results


Step 1 Print the k-means output.

The output includes:

cluster_centers_: coordinates of each cluster center

labels_: cluster label assigned to each point

inertia_: sum of the squared distances from each point to its nearest cluster center
print('Labels:',sites_kmeans_4.labels_[:5])
print('Prec:',sites_kmeans_4.predict(addr_1)[:5])

print('Centers:\n',sites_kmeans_4.cluster_centers_)

The output is as follows:


Labels: [2 0 0 1 2]
Prec: [2 0 0 1 2]
Centers:
[[120.24059235 30.34716132]
[120.70759845 28.07876567]
[121.64831582 29.86180276]
[119.31063156 28.92008776]]
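
To make the meaning of inertia_ concrete, the sketch below recomputes it by hand on four made-up points and compares the result with the fitted attribute. It also confirms that predict() on the training data returns the same labels as labels_, as in the output above.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical coordinates; any 2-D array works the same way.
demo_X = np.array([[120.0, 30.0], [120.2, 30.0], [121.4, 28.0], [121.6, 28.0]])
km = KMeans(n_clusters=2, random_state=12, n_init=10).fit(demo_X)

# Squared distance from each point to the center of its own cluster.
assigned_centers = km.cluster_centers_[km.labels_]
manual_inertia = ((demo_X - assigned_centers) ** 2).sum()

print(manual_inertia, km.inertia_)
# On the training data, predict() reproduces labels_.
print(np.array_equal(km.predict(demo_X), km.labels_))
```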

# Save the longitude and latitude information, which can be used in DLV display.
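
One possible way to save the centers is to write them to a CSV file with pandas. The sketch below is only an illustration: the file name warehouse_centers.csv, the temporary output directory, and the column names are assumptions, and the format expected by the display tool is not specified in this guide. In the lab notebook, sites_kmeans_4.cluster_centers_ would take the place of the hard-coded array.

```python
import os
import tempfile
import numpy as np
import pandas as pd

# Cluster centers copied from the output above (longitude, latitude).
centers = np.array([
    [120.24059235, 30.34716132],
    [120.70759845, 28.07876567],
    [121.64831582, 29.86180276],
    [119.31063156, 28.92008776],
])

centers_df = pd.DataFrame(centers, columns=["longitude", "latitude"])
centers_df.index.name = "cluster"

# Hypothetical output location; replace it with the path your display tool reads from.
out_path = os.path.join(tempfile.gettempdir(), "warehouse_centers.csv")
centers_df.to_csv(out_path)
print(pd.read_csv(out_path))
```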

Step 2 Visualize the results.

Devices are divided into four clusters based on their longitudes and latitudes to display the
clustering result of each device.
import matplotlib.pyplot as plt

plt.figure(figsize=[12,9])

#Marker styles: ^ draws triangles, o draws circles, + draws plus signs, and * draws stars.
for cluster,marker in zip(range(4),['^','o','+','*']):
    #Select the longitude (column 0) and latitude (column 1) of the points
    #whose label equals the current cluster.
    x_axis = addr_1[:,0][sites_kmeans_4.labels_ == cluster]
    y_axis = addr_1[:,1][sites_kmeans_4.labels_ == cluster]
    plt.scatter(x_axis,y_axis,marker = marker)
plt.show()

The output is as follows:

----End
