Received May 9, 2019, accepted May 27, 2019, date of publication June 5, 2019, date of current version June 27, 2019.

Digital Object Identifier 10.1109/ACCESS.2019.2920838

A Wireless Sensor Network for Monitoring Environmental Quality in the Manufacturing Industry
Harbin Engineering University, Harbin 150001, China
Georgia State University, Atlanta, GA 30303, USA

Corresponding author: Haitao Zhang

This work was supported in part by the China Numerical Tank Innovation Project, in part by the National Natural Science Foundation of
China under Grant 61370084 and Grant 61872105, in part by the Harbin Science and Technology Talents Research Special Foundation
under Grant G063316006, and in part by the Excellent Academic Leaders Project (The Key Techniques of Home-based Care Service
System) 2016RAXXJ013.

ABSTRACT Urban industrial plants are highly concentrated, air pollution is increasingly serious, and the number of outdoor air quality monitoring sites is insufficient. To address these problems, related studies propose networking relatively cheap equipment to collect pollution data and accurately analyze local monitoring information. In this paper, a new type of outdoor air quality monitoring system is studied and preliminarily deployed, and it is shown to be feasible and applicable. The main contributions of this paper are as follows. First, we improve the network layout by employing a Zigbee network adapted to factory characteristics, and collect data on carbon monoxide, nitrogen dioxide, sulfur dioxide, ozone, particulate matter, temperature, and humidity. Second, to establish the dilution coefficient and diffusion coefficient of pollution diffusion, we adopt air movement as the energy model and, by utilizing a pollution traceability method, achieve complete-coverage pollution monitoring of the whole city from local monitoring sites. Finally, we propose an improved long short-term memory (LSTM) method to predict the pollution period of urban air quality. The experimental results show that the improved LSTM prediction model has strong applicability and high accuracy in predicting the period of pollution weather. Meanwhile, by analyzing a specific case in detail, we show that air pollution in the city is mainly caused by the manufacturing industry. We conclude that using weather quality prediction to dynamically adjust production would contribute greatly to the protection of the urban atmospheric environment.

INDEX TERMS Environmental quality, wireless sensor network, Zigbee, pollution monitoring analysis,

The associate editor coordinating the review of this manuscript and approving it for publication was Tiago Cruz.

I. INTRODUCTION
In the past quarter century, industrial technology and the transportation industry have grown exponentially, energy and fuel demand has increased, and urban air pollution has become more and more serious, with an important impact on the atmospheric environment, human health, ecosystems, and climate change [1]. There are various pollutants in the atmosphere, such as automobile exhaust, nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3), and particulate matter (PM), which may come from fixed locations, such as factory communities, or from mobile emission points. These pollutants may spread over a wide area in a short time and cause huge and irreparable damage [2], [3]. Past research has established the negative health effects originating from exposure to these pollutants. Consequently, there is an urgent need to monitor and control the concentrations of these gaseous pollutants in factories and for large-scale deployment of sensor technologies to enable monitoring of factory environmental quality (FEQ) with high temporal and spatial resolution [4]. However, currently available solutions are scarce and too costly for large installations [5]. In particular, to date, all wireless gas sensor networks have been facing a trade-off between sensor node cost and data quality because there is currently no suitable, low-cost technology

Q. Han et al.: Wireless Sensor Network for Monitoring Environmental Quality in the Manufacturing Industry

available for specific, quantitative chemical analysis [6]–[9]. This holds especially for the multi-source analysis of the interaction between the various pollution sources in a city.
Since 2016, many sensor manufacturers have released new gas sensors, such as the SGPC10 from Sensirion in Switzerland and the four-in-one gas sensor BME680 from Bosch in Germany, which offer smaller size, lower power consumption, and more sensed quantities, such as VOC, humidity, and pressure [10]. Because the composition of pollutant gas is very complex, the pollution phenomenon varies with many environmental factors, and the monitoring methods for pollutant gases also differ from each other. Pollutants are mainly divided into two categories: PM and chemical pollutants. Monitoring methods for PM are mainly divided into four types: the filter membrane weighing method, the TEOM method, the β-ray absorption method, and the light scattering method [11]. (1) Filter membrane weighing method: the mass concentration of fine particulate is measured by separating the fine particulate onto a filter membrane. (2) TEOM method: a known volume of air is separated through a sampling head with a given cutpoint diameter and deposited on a microbalance; the mass concentration is obtained by dividing the weight by the volume. (3) β-ray absorption method: following the β-ray attenuation principle, when β-rays irradiate a filter membrane carrying deposited particles, the energy of the rays decreases, and the amount and degree of attenuation can be used to calculate the PM concentration. (4) Light scattering method: this is the reverse application of Mie scattering theory; the mass concentration of fine particulate is obtained by measuring the magnitude of the scattered light signal, based on the attenuation of light passing through the particles. The above detection methods and instruments are basically reliable and accurate, but two shortcomings remain. (1) Because instruments and equipment are expensive, large-scale deployment requires a large investment. As a result, it is difficult for monitoring stations to cover residential communities, factories, schools, and other areas, and the number of detection devices is far from sufficient. (2) Because of the complexity of air quality and of the local geographical environment in outdoor areas, air quality depends on weather changes, traffic conditions, and residents' activities. Current detectors are sensitive only to the environment around the instrument and to the portion of air sampled by the instrument, rather than to the overall environmental situation of the region, so the assessment of the whole region is easily affected by a sudden increase in a small number of particles and by unrepresentative sample air in the vicinity of the instrument. For chemical pollutants, for example, the CO concentration was often inferred from the reading of a metal-oxide-based total volatile organic compound (TVOC) sensor [12]–[16], but the correlation between TVOC and CO has been shown to be weak. While Raman-based approaches may be used to detect many gases simultaneously, techniques based on absorption spectroscopy are the most promising candidates for reliable CO detection [17]–[20]. It has been found that a high degree of sensitivity and specific, quantitative detection can be achieved by tunable diode laser spectroscopy, albeit at high associated costs in terms of optical and computational infrastructure and maintenance [21]–[24]. Currently, non-dispersive infrared (NDIR) absorption spectroscopy is the most popular tool for CO monitoring in applications that do not require analytical-grade concentration readings [25].
In recent years, with the development of computer vision and image processing technology, many domestic and foreign scholars have proposed air quality evaluation theories based on image processing. Wong et al. [26] proposed an algorithm that uses photographs taken with digital single-lens reflex cameras, optical theory, and visible light wave bands as atmospheric characterization data. The relationship between the measured reflectivity and the reflection from the material surface and the atmosphere is analyzed, and the concentration of PM10 in the atmosphere is estimated by regression analysis. A model relating camera brightness to the physical environment characteristics of the scene is used in [27]. In this model, the brightness recorded by the camera is determined by the scattering of light in the atmosphere, and the scattering itself is associated with the physical properties of the medium, such as floating particles in the atmosphere, so the model can explain why the contrast of camera images in fog is lower than on sunny days. This also suggests that image quality can be used to evaluate differences in air quality. By combining remote sensing image data with video surveillance image data, the air quality level in a certain area is monitored and analyzed, and a corresponding air quality classification algorithm is given [28]. The greatest benefit of using surveillance image data to evaluate air quality is its low cost. However, until now, there has seldom been a calculation method that evaluates air quality entirely from video surveillance image data. The main reason is that the image itself is noticeably distorted during shooting, so it is difficult to evaluate the quality of the shooting environment from a single image or from a comparison of multiple images. An air quality estimation method based on color image processing is proposed in [29]. By analyzing the relationship between PM2.5 and image quality degradation, the dark channel prior model and black point data in the image were used to establish a correlation model between the light extinction coefficient and PM2.5 concentration, so as to estimate the PM2.5 concentration. Hsin-Hung Hsieh proposed a method for estimating the concentration of particles using a consumer camera. Using support vector machine (SVM) technology, the related data that may affect PM2.5, including the brightness value of the grayscale image, the humidity value, the pixel information in the sky gray histogram, the contrast of the density distribution in the sky gray image, and the histogram


information of the HSV mode image, are used to set up a PM concentration model. It is concluded that brightness, humidity, and PM2.5 concentration are correlated [30].
In this study, we first present a design built upon Zigbee, that is, using a Zigbee network to collect the pollution emissions of an urban manufacturing industry. We integrated the collection sensors for correlated pollution elements and laid them out as a network in order to collect pollution data and analyze pollution phenomena as a whole, instead of collecting each pollution index separately. We integrated this basic setup into a wireless sensor network to enable the deployment of sensor nodes capable of measuring the concentration of NO2, CO, SO2, and solid particulate matter, as well as temperature and humidity. Using our wireless sensor network architecture, we demonstrated the long-term stable operation of pollution data collection with the networking method proposed in this study. The method opens up the future possibility of producing energy-saving, easy-to-manufacture pollution gas sensors that do not feature cross-sensitivities towards humidity. Because of the small size of the sensor nodes, their wireless internet connectivity, and their low cost, the system architecture can easily be adapted to different scenarios and employed on a large scale. At the same time, the occurrence of pollution is very complicated and is related to geography, meteorology, multiple polluting factors, manufacturing production, and urban residents' lives. Although the existing methods are relatively accurate, their sampling categories and the pollution phenomena they study are relatively narrow. The formation, transmission, and diffusion of pollution within the city and the production and living activities within the city should be treated as one whole system. Therefore, we have studied the overall relationship between the production behavior of the urban manufacturing industry and urban pollution. Rather than analyzing a single pollution phenomenon or a single pollution source separately, we prefer to conduct multi-source analysis through the interaction between individual sources to solve complex pollution problems.

II. MATERIALS AND METHODS
Because of its importance in determining factory air quality, we took a gas sensor approach for CO by employing a concept that can easily be adapted for other gases, including NO2 and SO2. There is evidence to suggest that correlations exist between the energy consumption of production and the energy expended on pollution control, with the quantity of flue gas also allowing for a production limit grade [31]. This made exhaust gas concentration the most important parameter in assessing FEQ. In order to enable large-scale deployment of the system at a low cost and at a small overall size, we made use of the reduced-function device (RFD) role, based on the original design, which made it possible to build a large network that could cover all enterprises in the whole jurisdiction while improving battery life and reducing system cost and power consumption through the combination of multi-hop transmission, full-function devices (FFD), and RFDs. In wireless sensor networks, the energy of the nodes is mainly consumed through communication, so choosing a communication chip with low power consumption and high performance was crucial to prolonging the life of the nodes [32]–[36]. In this design, the CC2420 was selected as the main control chip of the wireless transceiver module. The radio frequency transceiver, moreover, had to have very low power consumption and stable performance to ensure the effectiveness and reliability of short-distance communication. We chose the ATmega128 chip as the processor. It is a high-performance, low-power 8-bit AVR microcontroller developed by Atmel. In addition, it has six power-saving modes that can be selected through software. Selecting these two chips greatly helped us reduce the power consumption of the nodes. Next, we had to extend the life of the node. To do this, an A/D converter was used to convert the current signal into a digital signal. Then, the integrated digital information was processed by the processor. Using an apparatus to simulate real-world conditions in the laboratory, we calibrated the pollution gas sensor reading in dry synthetic air at 1 bar pressure. We also installed the calibrated sensor in the real enterprise production environment. Data were transmitted via the repeater module, and sensing data on temperature, humidity, and pollution gas concentration were sent every 10 min.
The sensor node design is depicted in Figure 3 and includes wireless connectivity to make it possible to monitor the pollutant exposure of humans on a micro-scale, which is needed for next-generation studies on the health effects of pollutants on urban residents. The data from the whole network can also be fused to offer a comprehensive picture on a factory-wide level. Each node was equipped with a repeater to enable the transfer of the data to an internet application by means of an internet gateway. In undisturbed environments, the physical range of the transmission was about 100 m. We created a tool to help determine the energy consumption of production, the energy expended on pollution control, and the production rate of gaseous pollutants in factories. Moreover, data on the ratio of the above energy consumption metrics can be integrated into production systems to minimize energy expenditure during factory production. Using this system, which was deployed in a city in Northern China, we investigated the results of various gas detection methods in the city and made inferences on the development of classes of pollution in factories. The location of all sensor network components at one of these factories, a large-scale iron and steel enterprise, is depicted in Figure 1. Each sensor node was installed at a height of about 1 m above ground. To ensure reliable data transfer, signal repeaters were installed. The factory was founded in 1990 and now has an annual production capacity of 2 million tons of iron and 2 million tons of steel, with an annual output value of 10 billion yuan. The factory had a high production capacity and some ability to control pollution but no active production control. As such, its pollution discharge was affected by the enterprise's production capacity and by


pollution control energy consumption. To investigate the influence of the ratio of production energy consumption to pollution control energy consumption (PE-PCR) on FEQ, 19 point locations were monitored, and each was equipped with a number of sensor nodes. Because of the materials used in the factory building, the signal range was reduced and limited by the RFD information transmission distance. Therefore, an appropriate number of relay modules were installed according to the workshop area to ensure reliable signal transmission.
The algorithm selected the optimal communication path and sent the data to the coordinator over multiple hops. After the coordinator received the data, it verified the data. If the data were correct, it returned a confirmation message to the sensor node along the original path to complete the handshake. If a checksum error occurred, a resend was required. If the sensor node did not receive a confirmation message, it would keep sending data until a confirmation message was received from the coordinator. When the coordinator received the correct data, it also performed protocol conversion and then transferred the data to the monitoring platform through the network for data processing and analysis.
The concept of the individual sensor nodes was based on the use of micromachined sensor technology and internet connectivity in order to establish a wireless network for online, in-situ factory environmental monitoring. The CO concentration was determined acoustically via the photoacoustic effect. Both temperature and humidity were determined using state-of-the-art microtechnology.
Each sensor module sent the current pollution gas concentration, speed, and pressure to the cloud-based "EnControl" platform every 10 min. Outdoor conditions were monitored via weather stations in close proximity to the factory, whose data were available online through the Weather Underground web portal (i.e., the subordinate platform of the meteorological department). The sensor node was a miniaturized instrument that monitored multiple parameters, such as PM, SO2, NO2, CO, temperature, and humidity in the atmosphere. The equipment supported flexible power sourcing and could run on either the municipal power supply or a solar power supply. The instrument used various high-precision sensors, such as electrochemical and optical sensors, with a low detection limit, accurate output values, and high time resolution. In addition, the sensor node was small and inexpensive, and was suitable for gridded, dense deployment. The equipment was based on wireless communication technology, could communicate confidentially and securely with the server, and could integrate a large quantity of environmental data into the "cloud platform".

TABLE 1. Sensor node parameter.

FIGURE 1. Overview of the system installation at the factory.

FIGURE 2. System structure diagram.

FIGURE 3. Sensor node topology.

III. THE IMPROVED POLLUTION ANALYSIS
In this section, we propose a pollution traceability method and use LSTM to predict the period of each pollution element.

A. THE IMPROVED TRACEABILITY METHOD
In order to assess the air quality in the factory, we used the comprehensive reduced concentration of smoke as the main gas calculation parameter for exhaust emission. It is costly


and not easy to achieve city-wide coverage by sensors. Therefore, only 120 companies in the test city have installed our sensors. To achieve pollution traceability for the whole city, we predict the near-surface pollution diffusion from these sensor-equipped factories and use the knowledge obtained to analyze pollution traceability and pollution causes for the whole city. Through box-plot analysis, we find the main distribution range of factory discharge chimney concentration and a suitable height; the calculated height interval is 100-120 m. The height is set to 150 m in the process of pollution traceability. Because this height is near the surface, the effect of atmospheric phenomena such as the inversion layer on pollution diffusion can be neglected. In the modeling process, we ignore the effects of molecular motion, temperature, the inversion layer, etc., and assume that the energy of pollution diffusion is provided only by the air mass. We combine geographic information, in which the whole urban area is divided into 22 thermal grids, with emissions and precipitation data to study pollutant diffusion and determine the air dilution factor and air diffusivity factor of the pollutants. The thermal grid from which the pollution at a monitoring alarm position originates is derived in reverse from wind direction, wind power, and rainfall conditions. We then determine which companies are most likely to be the pollution sources according to the type of factory and its corresponding pollution list in the grid, and provide them to the local environmental protection department for a second, detailed investigation. The air dilution factor and air diffusivity factor are given in formulas (1) and (2). Along with this value, we used speed and time to calculate the tailpipe emission and the background levels at each point location. Among the diverse methods used to calculate the air change rate (ACR), we chose the decay method according to VDI 4300 (2001), which has been proven a feasible and effective way to determine the air change rates in scenarios like the ones explored in this work. The exhaust gas concentration upon venting using outdoor air had to converge to the global background concentration, which we assumed to be 400 ppm. However, meteorology can contribute to the dilution and diffusion of polluted gases. As such, we used the equation derived for the temporal evolution of the exhaust emission concentration to determine the effective background concentration, Ca, the air dilution factor, ω (considering only rainfall), and the air diffusivity factor, λ, using a fit function of the following form:

$$C(t) = (C_0 - C_a) \cdot \left( \alpha e^{-\lambda t} + \beta e^{-\omega t} \right) + C_a \qquad (1)$$

where C0 is the initial concentration at the start of the pollutant discharge at t = 0, and α and β are the weight coefficients. A nonlinear regression was applied to the measured raw data in order to determine the background concentration. For dry weather, we evaluated λ first, as follows:

$$C(t) = (C_0 - C_a) \cdot e^{-\lambda t} + C_a \qquad (2)$$

With a known air dilution factor and air diffusivity factor, the value of exhaust concentration could be used to determine the emission capacity of the factory. Apart from the flue gas concentration, the abilities to produce and to control pollution were important factors influencing the FEQ. Oftentimes, there was an organized discharge of pollution during production. However, a mismatch between the ability to control pollution and the production capacity would still lead to pollution events. Further large-scale measurements in various factories would be necessary to establish a reliable and more precise pollutant discharge model for factories. The PE:PCE ratio of production energy consumption to pollution control energy consumption represents the basic energy consumption of factory pollutant discharge, with the presumption that the influence factor between energy consumption and pollutant discharge is θ. The influence factor is easily adaptable to different cities, industries, and seasons of the year, and the relationship is represented as follows:

$$D_c = \theta \cdot \frac{\sum E_{pr}}{\sum E_{pl}} \qquad (3)$$

where Dc represents the amount of pollutants discharged in a given time period, and Epr and Epl represent the production energy consumption and the total power consumption for pollution control within the period, respectively. The value of θ had to be determined to calculate the relationship between energy consumption and pollution emission.

B. THE IMPROVED LSTM PERIOD PREDICTION METHOD
We adopted a more elaborate data processing procedure for pollution period prediction, which differs from the handling of outliers and missing values in the pollution traceability method above. This is because the traceability process is based on the scope of the thermal grid, where we aim to perform a relatively coarse analysis with a larger amount of real data. Although refined data processing can improve the data quality, it also reduces the recall rate of data. In pollution prediction, higher accuracy is required. Therefore, we first delete the noise data caused by sensor error, pollution fluctuation, and protocol parsing error using the box-plot. As shown in Figure 4, the production factor data collected in a factory are used as an example of outlier analysis. Secondly, we treat each sensor's daily data as an individual. For each individual, we take the data from the first peak to the last peak, excluding the last value. To align the data, we use the cubic spline method to interpolate each individual to an equal length of 144 sample points. Experiments verify that this improves the final prediction result by nearly two percentage points. Finally, to meet signal processing requirements, some waveform data, such as current, are transformed from the frequency domain to the time domain.
In deep learning models, the recurrent neural network (RNN) introduces the concept of time sequence into the network structure design, which makes it more adaptable in


time series data analysis. Among numerous RNN variants, the long short-term memory (LSTM) model compensates for vanishing gradients, exploding gradients, and the lack of long-term memory ability [37]. In recent years, LSTM has achieved good prediction results in the fields of fault series prediction [38], air pollution time-space forecasting [39], short-term traffic flow prediction [40], and prediction of silicon content in hot metal [41]. Compared with the traditional RNN, LSTM can learn time series with a long time span, automatically determine the optimal time lag for prediction, and utilize long-distance timing information more effectively. Recurrent neural networks such as LSTM can almost seamlessly model problems with multiple input variables. This is a big advantage in time series prediction, where traditional linear methods are difficult to adapt to multivariate or multiple-input prediction problems. We hope to obtain a prediction that captures the correlations between the attributes rather than one that treats the attributes in isolation. Therefore, according to the literature [42]–[45], we select eight indicators for period prediction: PM2.5, PM10, SO2, CO, O3, NO2, wind direction, and wind speed. The data of the eight indicators are decomposed periodically to qualitatively validate our empirical knowledge. Therefore, this paper selects the LSTM method for pollution period prediction.

FIGURE 4. Box-plot analysis of factory electricity data.

FIGURE 5. RNN model and cell structure in hidden layer.

FIGURE 6. LSTM cell structure in hidden layer.

The LSTM model replaces the hidden-layer RNN cells with LSTM cells to give them long-term memory. After continuous evolution, the most widely used LSTM cell structure [46]–[48] is shown in Figure 6, where z is the input module, and its forward calculation can be expressed as:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \qquad (4)$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f) \qquad (5)$$
$$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \qquad (6)$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) \qquad (7)$$
$$h_t = o_t \tanh(c_t) \qquad (8)$$

where i is the input gate, f is the forget gate, c is the cell state, and o is the output gate; W and b are the corresponding weight coefficient matrices and bias terms, respectively; and σ and tanh are the sigmoid and hyperbolic tangent activation functions, respectively [49]–[51]. The LSTM model is trained with the backpropagation through time (BPTT) algorithm, which is similar to the classical back propagation (BP) algorithm [52] and can be roughly divided into four steps: (1) calculate the output value of the LSTM cells according to the forward calculation method; (2) calculate the error term of each LSTM cell backwards, propagating in two directions, through time and through network layers; (3) calculate the gradient of each weight according to the corresponding error term; (4) update the weights using a gradient-based optimization algorithm. Meanwhile, through the supervised learning framework, we provide the pollution collection values and the weather conditions of the previous time node for the neural network to predict the weather conditions in the next 48 hours.
Seasonal differences are strongly correlated with the spread of pollution, and other sources such as coal-fired heating in winter cause more particle emissions. We added seasonal difference markers with integer coding. If we understand the period prediction process as a classification process, the addition of a large amount of identical seasonal information will lead to excessively high purity of each subset in the seasonal features. Since we make predictions through the correlation between features, this will affect our final prediction results, biasing them toward the seasonal features. In order to eliminate this purity effect in the process of adding


seasonal features, and knowing from the data and the climate information of Wu'an that the short-term time series of spring and autumn are very similar, we divide sample D by the Gini coefficient into three sets: spring and autumn, winter, and summer. We use the seasonal feature to divide the sample set, and the possible divisions are as follows:
(1) divide point: "Spring and Autumn", divided subsets: {Spring and Autumn}, {Winter, Summer}
(2) divide point: "Winter", divided subsets: {Winter}, {Spring and Autumn, Summer}
(3) divide point: "Summer", divided subsets: {Summer}, {Winter, Spring and Autumn}
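As a rough illustration of how the three candidate divisions above can be compared, the following sketch computes the size-weighted Gini impurity of a two-way split, i.e. the quantity written as Gini(D, A) in (9), and picks the divide point with the smallest value. The toy samples, the class labels ("moderate", "high", "low"), and the function names `gini` and `gini_split` are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(samples, divide_point):
    """Size-weighted Gini impurity of splitting samples into
    {season == divide_point} and {season != divide_point}."""
    d1 = [label for season, label in samples if season == divide_point]
    d2 = [label for season, label in samples if season != divide_point]
    n = len(samples)
    return len(d1) / n * gini(d1) + len(d2) / n * gini(d2)


# Hypothetical toy data: (season group, pollution class). Illustration only.
samples = [
    ("spring_autumn", "moderate"), ("spring_autumn", "moderate"),
    ("spring_autumn", "moderate"), ("spring_autumn", "moderate"),
    ("winter", "high"), ("winter", "high"),
    ("summer", "low"), ("summer", "low"),
]

candidates = ["spring_autumn", "winter", "summer"]
scores = {point: gini_split(samples, point) for point in candidates}
best = min(scores, key=scores.get)  # divide point with the smallest Gini index
```

With this toy data the "spring_autumn" divide point gives the smallest weighted impurity; real sensor data may of course rank the candidates differently.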
For each partition mentioned above, the purity of the sample set D divided into two subsets D1 and D2 can be calculated.

$$\mathrm{Gini}(D, A) = \frac{|D_1|}{|D|}\,\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{Gini}(D_2) \qquad (9)$$

Therefore, for a feature with multiple values, each value must be tried as the divide point for sample D; the purity after division is Gini(D, Ai), where Ai is a possible value of feature A. Then, the division with the smallest Gini index is found among all possible divisions. The divide point of this division is the best division point for dividing the sample set D by feature A. Through calculation, we found that the first division is more suitable for our requirements. This is because although summer and winter are in one set, their series features are different, whereas placing spring and autumn in one set has little impact because their internal time series features are similar.

FIGURE 8. Time-domain chart.

correlation is very high. In general, it is easy to understand that PM2.5 rises and falls together with PM10, but the fact that it also rises and falls together with CO and NO2 requires analysis. These two air pollutants are mainly related to automobile exhaust emissions and industrial emissions, and the city already has strict restrictions in place. Therefore, industrial pollution is the most likely explanation; combined with the above figure, this shows that urban pollution is related to industrial pollution.
The concept of the individual sensor nodes was based on the use of micromachined sensor technology and internet connectivity in order to establish a wireless network for online, in-situ factory environmental monitoring. The CO concentration was determined acoustically via the photoacoustic effect. Both temperature and humidity were determined using state-of-the-art microtechnology.
Due to the high complexity of the environmental information monitored, the data curve showed both sharp rises and sharp declines, with many burrs. Because the results exhibited almost no regularity compared with the original data, we used a Butterworth filter to process the existing data.
The fast Fourier transform of the sample data was carried
out with a band-pass Butterworth filter to generate the time-
domain chart of the sample data.
By filtering the CO, SO2 , NO2 , PM, temperature, and
humidity data of the frequency domain chart and generating
a time-domain chart, we eliminated the noise in the data and
found better environmental monitoring information data in
FIGURE 7. Environmental information data line chart. the November extremum position. The results showed that
the pollutants’ extremum position was concentrated roughly
IV. RESULTS on the same date, whereas the temperature and humidity
We analyzed data collected in November 2018. The raw data of the extremum position and location were almost at the
of the temperature calibration, humidity calibration, and gas- opposite extreme of the pollutant concentration. This rise
sensitive characterization are shown in Figure 7. We applied a of temperature and humidity from pollutant diffusion and
band-pass Butterworth filter on the sampled data to eliminate dilution played a positive role, a phenomenon that a curve
out-band interference. The lower and upper cutoff frequency similarity calculation also verified. Statistics for the extreme
was set as 1Hz and 2Hz, respectively (the filter order was 4). values are shown in Tables 2 and 3.
After filtering, the extreme data were deter-mined by finding The number of extreme points in each index was counted,
the peaks, as shown in Figure 8. and the number of days with increased pollution was found
In Figure 7, it can be found that PM2.5 and CO and NO2 to be approximately the same by counting the number of
are ‘‘same high and low’’, which further indicates that the maximum points.
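The filtering and peak-finding step described in this section can be sketched with SciPy as follows. The filter order (4) and the cutoff frequencies (1 Hz and 2 Hz) come from the text. The sampling frequency `fs` is not given in this section and must be supplied by the caller. The zero-phase filtering via `sosfiltfilt` and the function name `bandpass_peaks` are assumptions made for illustration, not the authors' code.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks


def bandpass_peaks(x, fs, low_hz=1.0, high_hz=2.0, order=4):
    """Band-pass filter `x` and return (filtered signal, indices of local maxima).

    fs      -- sampling frequency in Hz (assumption: must exceed 2 * high_hz)
    low_hz  -- lower cutoff frequency in Hz (1 Hz in the text)
    high_hz -- upper cutoff frequency in Hz (2 Hz in the text)
    order   -- Butterworth filter order (4 in the text)
    """
    # Second-order sections are numerically more robust than (b, a) coefficients.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # Forward-backward filtering gives zero phase shift (an assumption here).
    filtered = sosfiltfilt(sos, np.asarray(x, dtype=float))
    # Local maxima of the filtered curve are taken as the extreme points.
    peaks, _ = find_peaks(filtered)
    return filtered, peaks
```

A call such as `filtered, peaks = bandpass_peaks(co_series, fs=20.0)` would return the smoothed curve and the sample indices of its maxima. Here `co_series` and the value of `fs` are placeholders.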


TABLE 2. Extreme point count.

FIGURE 10. Line chart on monitoring data on production energy


TABLE 3. Extreme point statistics.

FIGURE 11. Temperature data curves after screening and interpolation. The moving average method was used to smooth the curves.
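The preprocessing named in this caption (box-plot screening, interpolation, and moving-average smoothing) can be sketched in plain Python as below. This is an illustrative sketch under assumptions: the 1.5 × IQR whisker rule, linear interpolation over removed points, and the window length of 3 are common defaults and are not stated in the paper. The function names are hypothetical.

```python
import statistics


def boxplot_bounds(values, k=1.5):
    """Lower and upper box-plot fences, Q1 - k*IQR and Q3 + k*IQR (k is assumed)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def screen_and_interpolate(values, k=1.5):
    """Drop points outside the box-plot fences and fill them by linear interpolation."""
    low, high = boxplot_bounds(values, k)
    kept = [v if low <= v <= high else None for v in values]
    good = [i for i, v in enumerate(kept) if v is not None]
    out = list(kept)
    for i, v in enumerate(kept):
        if v is not None:
            continue
        left = max((j for j in good if j < i), default=None)
        right = min((j for j in good if j > i), default=None)
        if left is None:          # outlier at the start: copy the next valid value
            out[i] = kept[right]
        elif right is None:       # outlier at the end: copy the previous valid value
            out[i] = kept[left]
        else:                     # interior outlier: interpolate between neighbours
            weight = (i - left) / (right - left)
            out[i] = kept[left] + weight * (kept[right] - kept[left])
    return out


def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the two ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        segment = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed
```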

FIGURE 12. Periodic decomposition of temperature curves.
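A decomposition of a series into trend, cycle, and residual parts, as in Figure 12, can be illustrated with a classical additive decomposition. The paper does not state which decomposition procedure was used, so this is only a sketch under that assumption. The `period` argument (for example 24 for hourly data with a daily cycle) is also an assumption, and the series is assumed to span at least two full periods.

```python
def centered_moving_average(x, period):
    """Trend estimate by a centered moving average; None where the window does not fit."""
    n = len(x)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = x[i - half: i + half + 1]
        if period % 2:
            # Odd period: plain average over `period` points.
            trend[i] = sum(window) / period
        else:
            # Even period: period + 1 points with half weight on both end points.
            trend[i] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return trend


def decompose_additive(x, period):
    """Split x into trend, seasonal (cycle), and residual parts: x = T + S + R."""
    trend = centered_moving_average(x, period)
    sums = [0.0] * period
    counts = [0] * period
    for i, (value, t) in enumerate(zip(x, trend)):
        if t is not None:
            sums[i % period] += value - t
            counts[i % period] += 1
    # Mean detrended value at each phase of the cycle, centered to sum to zero.
    means = [s / c for s, c in zip(sums, counts)]
    offset = sum(means) / period
    seasonal = [means[i % period] - offset for i in range(len(x))]
    residual = [
        None if t is None else value - t - s
        for value, t, s in zip(x, trend, seasonal)
    ]
    return trend, seasonal, residual
```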

FIGURE 9. Extreme point statistics chart.

According to the specific locations and statistics of the extreme points of the environmental information, it can be seen from the tables that the periods of rising pollution mainly occurred in four frequency-domain intervals, namely [10-30], [40-50], [70-90] and [115-150]. According to the statistical data in Tables 2 and 3, the pollution time points were statistically analyzed through the histogram in Figure 9, which reflects the time areas where the concentration of pollution increased in the urban monitoring jurisdiction. By comparing the statistical chart in the figure with the statistical chart of factory production electricity in Figure 10, we found that the time area of increased pollution was consistent with the time area of increased plant production, and that the shapes of the two peaks were very similar. Therefore, through the correlation between the two, we could confirm the effectiveness of the sensor node network and the potential correlation between factory production and urban pollution, and we found that factory pollution had a great impact on the increase in urban pollution.

FIGURE 13. The test results from a certain day's data in the test set, where red is the prediction curve and blue is the original curve, and the cosine similarity of the two curves is 95.8%.

FIGURE 14. The root mean square error of the iterative process of eight environmental indicators, output is made every ten steps.
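The two evaluation quantities quoted for Figures 13 and 14, the cosine similarity between the predicted and original curves and the root mean square error, can be computed as in the sketch below. The text describes an alarm rule in which an observed value falling outside the confidence band around the prediction is flagged as abnormal pollution; `alarm_indices` assumes that band is the prediction plus or minus a given RMSE value. The function names are hypothetical, and this is not the authors' code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long curves."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def alarm_indices(predicted, observed, band):
    """Indices where the observed value leaves the band predicted +/- band.

    `band` is assumed to be an RMSE estimated beforehand, e.g. on held-out data.
    """
    return [
        i for i, (p, o) in enumerate(zip(predicted, observed))
        if abs(o - p) > band
    ]
```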

The histogram of the time-point positions shows the extreme points of CO, NO2, PM, and SO2. In the figure, the main concentrated areas reflecting increases in pollution are divided into four time periods. After the inverse transformation from the frequency domain to the time domain, the middle points of these four time intervals were roughly the 9th, 18th, 24th, and 30th days.

FIGURE 15. Curves of CO, NO2, PM, SO2, and humidity cycle.

By monitoring the electricity consumption during six production processes of typical sewage discharge factories in the jurisdiction, it was found that the major production periods of large sewage discharge enterprises in November 2018 were concentrated in the following four sections: [4-12], [15-23], [24-25] and [27-30]. These periods were consistent with the time sections in which the concentration of pollution factors increased in our urban environmental information monitoring analysis. Therefore, our proposed wireless sensor network was found to accurately reflect the changes in environmental pollution in the city and to reveal the potential laws linking factory production and urban pollution.
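The consistency claimed above can be checked directly from the numbers given in the text: each pollution mid-point (days 9, 18, 24, and 30) should fall inside one of the reported production periods. The short sketch below does only that. The helper name `containing_interval` is hypothetical, and treating the intervals as inclusive day ranges is an assumption.

```python
def containing_interval(day, intervals):
    """Return the first (start, end) interval that contains `day`, or None."""
    for start, end in intervals:
        if start <= day <= end:
            return (start, end)
    return None


# Values taken from the text: mid-points of the four pollution periods (days of
# November 2018) and the four major production periods of the monitored factories.
pollution_midpoints = [9, 18, 24, 30]
production_periods = [(4, 12), (15, 23), (24, 25), (27, 30)]

matches = {
    day: containing_interval(day, production_periods)
    for day in pollution_midpoints
}
all_covered = all(period is not None for period in matches.values())
```

With the values reported in the text, every mid-point lies within a distinct production period, which is the overlap the authors describe.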
FIGURE 16. Hierarchical clustering results of pollution cases.

Through environmental data modeling and the study of historical data, we studied periodic trends in environmental factors. To forecast the future trends in pollution in the short term, we used the temperature condition. For example, we first observed the temperature history curve in Figure 11. We used the box-plot method to remove abnormal data from the corresponding attribute. We observed that the cyclical temperature curve was rough, with sudden increases and sharp decreases. We also observed some abnormal cycles. In the general cycle, the temperature was high during the day, low in the morning and evening, and lowest at night. We divided the data in a 7:3 ratio for training and testing.

In Figure 12, the first part is the original data curve, the second part is the trend curve of the data decomposition, the third part is the data cycle curve, and the fourth part is the residual part of the data.

As can be seen from the second part of Figure 12, from November 2018 onward the temperature gradually decreased as the date increased, which was consistent with the temperate continental monsoon climate characteristics of this northern Chinese city. As the third part depicts, the periodicity of the temperature curve conformed to our previous conjecture and also conformed to the characteristics of urban daytime temperature.

In Figure 13, the red curve is the prediction curve, the blue curve is the original curve, the two gray curves bound the predicted threshold region, and the confidence interval is the root mean square error. If the true curve exceeded the confidence interval, the abrupt position would be judged as abnormal pollution, and an alarm would be given. The periodic curves of the other elements are shown in Figure 15.

We obtained the pollution cases in the dataset through hierarchical clustering, and we conducted a detailed analysis on one of them, as shown in Figure 16.

We interpret the pollution trend of Wu'an through a thermal map visualization. The hourly air quality change


thermal map from November 2018 to January 2019 is a time series running from left to right, with the city arranged from north to south by latitude, and the figure shows that the pollution trend is approximately an oblique diagonal diffusion. The change in dark color in the figure is roughly equivalent to the pollution diffusion route. As shown in Figure 17, during the two major over-standard pollution incidents the southern urban area remained severely polluted, and the farther south, the higher the pollution; in the second incident in particular, there was serious pollution in the south. That is to say, the pollution has an obvious geographical diffusion characteristic and is roughly an "oblique diagonal" diffusion. The impact of pollution transmitted from outside the city is small.

FIGURE 17. Thermal map of pollution diffusion.

There is a great correlation between air pollution and meteorology. The meteorological data considered here cover November 2018 to January 2019, the main winter months in the city. It can be seen from the wind direction and wind speed statistical chart that during this period the wind was mainly a north wind of 0-2.1 m/s, with no obvious strong wind. This is consistent with the temperate continental monsoon climate of Wu'an and the geographical information of northerly and northwesterly winds in winter. However, as shown in Figure 18, the pollutants in the two over-standard pollution incidents were discharged from the southern region. Therefore, the north wind that often rises in winter hinders the diffusion of pollutants to some extent, so that the polluted air mainly concentrates in the southern region.

FIGURE 18. Statistical chart of wind direction in winter.

Figure 19 is the correlation analysis chart of the various pollutants. The possible pollution sources are inferred by analyzing the correlation between the main pollutants in the air. Because PM2.5 and PM10 are both particle pollution and differ only in radius, their correlation is very high, but this has no great practical significance in this study. According to the pollution sources given by the local environmental protection department, SO2 is mainly derived from the desulfurization link of the steel plant and automobile exhaust gas. CO mainly comes from the coal combustion of factories, the heating department, and automobile exhaust. Because Wu'an enforces vehicle restrictions, automobile exhaust has little impact, so the pollution in this case is attributed to factory emissions.

FIGURE 19. Correlation analysis chart of environmental indicators.

V. CONCLUSIONS AND FUTURE WORKS
Accurate monitoring and prediction of pollution and electricity consumption by enterprises can greatly relieve the pressure of urban pollution. Our study showed that in factories without active pollution monitoring, prediction, and an enterprise production monitoring system, it was very difficult to restrict the production of enterprises in the city and to forecast and raise alarms on pollution so as to improve the quality of the urban environment. Based on our results, we concluded that in the real world, these pollutants would not only accumulate and spread in a city, but if a linkage were to form between several key polluted cities, the air quality in the corresponding region would become worse and worse, which would have a huge impact on the health of citizens living in the region. Cities, therefore, must take the initiative to monitor air quality and maintain a good quality of urban air environment. We also concluded that different degrees of restrictions on the production capacity of manufacturing


