Arabian Journal for Science and Engineering


AI Crop Predictor and Weed Detector Using Wireless Technologies: A Smart Application for Farmers
Ishita Dasgupta1 · Jayit Saha1 · Pattabiraman Venkatasubbu1 · Parvathi Ramasubramanian1

Received: 8 May 2020 / Accepted: 29 August 2020

Agriculture is undoubtedly one of the biggest and most important professions in the world. Optimization of agriculture and
aiming gradually and extensively toward smart agriculture are the need of the hour. IOT (Internet of Things) technology has
already been successful in easing people’s lives with its wide range of applications in almost all arenas. In this paper, our
work takes the help of IOT devices, wireless sensor network (WSN) and AI techniques and combines them for faster and
effective recommendation of suitable crops to farmers based on a list of factors such as temperature, annual precipitation, total
available land size, past crop grown history and other resources. Additionally, detection of unwanted plants on crops, namely
weed detection, is implemented with frame-capturing drone and deep learning methods. Naïve Bayes algorithm for crop
recommendation based on several factors detected by WSN sensor nodes has been used, resulting in an accuracy of 89.29%,
which has proved to be better than several other discussed algorithms in the paper, like regression or support vector machine.
Deep learning using neural network successfully identifies weeds present in a specific area of crop growth extending an
additional protective measure to farmers. The comprehensive application developed for farmers not only reduces the physical
hardship and time spent on different agricultural activities, but also increases the overall land yield, reduces possibility of
losses due to failure of crops in a particular soil and lessens the chances of damage caused to crops by weeds.

Keywords Internet of Things (IOT) · Crop recommender systems · Deep learning · Weed detection · Wireless sensor network
(WSN) · Precision agriculture

1 Introduction

Farmers all around the world are required to follow an intricate practice of planting crops in rotation. This practice, which is extremely important to the well-being of the soil, is referred to as crop rotation and precision agriculture [1, 3]. In this method, different crops are planted in alternating seasons so that the soil is not constantly deprived of a particular mineral or nutrient. If the same crop is planted, season after season, then it does not give the soil a chance to regenerate the resources, hence rendering the soil barren. With multiple crops, the nutrients are regenerated at a uniform pace. However, with the concept of multiple crops comes the concern to recommend the best crops to be planted to maximize the yield. Farmers generally are not aware of the crops that should be planted on their particular field. Hence, they end up sowing the wrong seeds, which causes severe harm to the soil and their yield and thus minimizes the profits. If a crop is not suitable on a field, and still if it is planted, it yet again
society. It also aims for quality, uninfected and unadulterated crops, thus improving the health of the society as a whole. The main challenges involved are the collection of the dataset and the farmers knowing the characteristics of their soil, on which the computation is to be performed. This has been achieved using WSN architecture to extract the soil features and using machine learning algorithms on the input data for crop recommendation suited for the particular soil type. Also, an additional feature of this paper focuses on the crops getting damaged due to weeds. This feature has been accomplished using image processing and deep neural network techniques to detect the weed [3] present in the agricultural field. Weeds are malicious plants, which grow together with the useful and highly fertile crops. They survive by taking the majority of the nutrients from the soil. This is a bad indicator toward farming, as the healthy and productive crops tend to wither due to lack of proper nutrients, hence decreasing the productivity and thus degrading the soil quality. Thus, for the betterment of the farming industry, we have also developed a flexible system to detect weeds and send the exact location to the farmers using the mobile application that we have developed. Soil moisture sensor and humidity sensors [4] connected in the WSN have been used to get the threshold readings to identify the possibility of growth of weeds. The readings, if more than the threshold, would prompt the farmer to launch the drone [5], which would capture the land images and send them to the controller for image processing. Thus, if a weed is detected, the coordinates will be sent and the application plots a map from the farmers’ current location to that of the weed location. Then, the weed can be manually extracted and this saves the crops from dying off. This is highly beneficial to the farmers and the end customers. The input data for the analysis are initially fed into the system through multiple WSN sensor nodes, which compute and give a full analysis of the soil [7], and are then processed by several machine learning algorithms [1, 6]. With the analysis report, we can then efficiently and with precision predict the crops to be planted. The algorithm also provides an order of the crops to be planted, based on the nutrients available in the soil, thus making the task easier. The agricultural sector contributes to about 14% in India. On average, a farmer receives 10–23% of what the consumer pays at retail. Thus, with increased yield, we intend to eradicate the concept of intermediaries in the system of agriculture. Thus, with an outlook to benefit the society from the grassroot level to the zenith, our paper ranges to serve all purposes efficiently and with complete ease.

2 Literature Survey

Over the years, several researchers have studied, developed and made a significant contribution toward smart agriculture. Crop recommendation systems have evolved more and more, and increasing research and IOT devices have added to the accuracy of crop predictions. Mythili et al. in [8] used several special sensors like humidity and PIR sensors. Arduino microcontroller is used for interfacing the hardware components, and its IDE is used by the GSM (Global System for Mobile) module to display messages on the screen for the farmers. It has the advantage of not compulsorily requiring a smartphone. Farmers can get updates via the GSM network in their normal cell phones. All the data collected from the various sensors placed across the field are passed through the Arduino controller. The GSM allows messaging service to the farmers’ phone to give them the updates in regular intervals. The microcontroller consists of an additional Bluetooth module that serves as the message provider to the users within the small range of the system. It provides the information to the users or farmers about the data extracted from the sensors such as temperature, water quantity in soil and smoke on GSM network or with the help of Bluetooth; 98.50% accuracy is obtained from this method. The PIR sensor is an impressive feature in this research study, which detects the presence of any animal close to the crops and notifies the farmers immediately through an LED or a buzzer. The low system cost and efficient working without Internet are a major advantage of the proposed system. The future extension of the paper is to add a water pump for facilitating irrigation when soil moisture content falls below the threshold level and take the model to the next level, reducing the manual efforts of the farmers to a greater extent. Manoj Athreya et al. [9] gave a clear comparison on the different types of recommendation systems coupled with IoT technology in their research work. Mokarrama et al. in [10] work with the location of the user and the data collected from agroecological and agroclimatic areas in the Upazila regions. Modules to detect the location for data storage, similarity between location detection and recommendation have been used together to develop this system. The primary need of this system is the user location. Using the most recent Google location API, the address is collected, from which the Upazila is also identified. The database is divided into period of crop growth, thermal zone, rate of crop production, etc. Google API for location detection and analysis of the geographical statistics of the locations is followed by similarity index calculation using Pearson correlation algorithm. Finally, the best suited crops are recommended to the users. IoT devices can be amalgamated with the existing technology to improve the overall accuracy and productivity. Raja et al. [11] proposed a system for crop prediction by studying the past records. Using previous records of soil quality, crop cultivated and kinds of crops grown on the land, the crop yields and the price of crops are predicted using the sliding window nonlinear regression algorithm. Classification of dataset is done abiding the crop price in the market to find the crop demand. The total crop consumption and the quantity of crop cultivated are taken

into account to suggest alternatives of crops to be grown. The proposed workflow begins with the data collection followed by conversion of data and data splitting. After data conversion, similarity index is calculated, and the split data along with the similarity are fed into the recommender system for training and prediction. Although the method is comprehensive, some additional soil and temperature factors may be considered for boosting the median performance. Pudumalar et al. [1] used precision agriculture, which is rapidly increasing in popularity, for prediction of the suitable crops to the farmers in their research work. Several AI algorithms and data mining techniques, namely CHAID, KNN and Naïve Bayes, have been used in this paper for better accuracy yield. The overall working follows dataset collection and training of data by efficient feature extraction and then applies it to the ensemble model comprising the four ML algorithms. Finally, the recommender system predicts the crop based on the model rules such as pH, soil depth, soil water content. Ensemble technique with the majority-voting model is implemented for making the precision agriculture system deliver a better result. The accuracy is fair, around 88%. However, the addition of camera and other technologies can increase the rate. Shirshivkar et al. [12] also make use of sensors such as moisture sensor, temperature and humidity sensors connected to Arduino microcontroller to develop a recommender system for crop prediction by analyzing the sensor data received via Wi-Fi module. Fertilization recommendation implemented by Hao Zhang et al. [13] analyzed factors such as soil yield and crop targets using precision agriculture. Precision fertilization coupled with a simple, efficient ArcGIS server and fertilization decision-making model made the proposed system successful in promoting technical functionalities for scientific fertilization. Weeds cause damage to the crops produced on the land, and hence there have been several techniques to detect weed; IoT robots are used for their removal. Similar study has been done in [14] to build an automated robot for detecting weed using image classification technique and spraying pesticides on the weeds. Data collection is followed by data augmentation to populate the dataset. The comparison among different algorithms of image classification proved CNN to have a higher accuracy as compared to others. Convolutional neural network model is implemented for weed detection. However, a robot comes with a lot of disadvantages too as highlighted in the paper itself, such as battery consumption, level of pesticides that can be sprayed and long-time interval between capturing and spraying of herbicides. Tang et al. [15] aim at accurate identification of weeds and properly spraying herbicides to reduce the loss of chemical fertilizers and also the environmental degradation caused by them. The paper aims at a combination of vertical projection and linear scanning techniques for finding out the central line of the row of crops. The weeds infestation rate (WIR) is analyzed and modified for improvement resulting in dynamic decision making based on the Bayesian minimum error ratio. Dataset consists of manually taken images in different conditions of light, since color is primarily an important feature of image classification. Suitable color feature and a color space are necessary for proper image classification, accompanied by Cg component to identify the predominant green in crops. The steps in this method of classification start with the original image conversion of RGB to YCrCb. It is followed by image processing, obtaining the line of center of the rows of crops and division of cells. The MWIR is then calculated within the cells, followed by real-time decision making whether to spray crops or not. Crop vertical projection identified the black portion as the soil, white pixels as the crops. The MWIR database is used to analyze the data, and then the decision is arrived at by calculating the Bayes error ratio. The accuracy of the proposed algorithm is also quite impressive, being 92.5%. In our work, we propose a unique combination of accurate crop prediction using AI and IOT coupled with weed detection using drones fitted with camera for dynamic frame capturing. Louargant et al. [16] widely focus on the spectral processing of images to distinguish between monocotyledonous and dicotyledonous plants. It uses an automated drone system to track these images and classifies them via an unsupervised learning algorithm. A positive classification of the crop field was continued with an accuracy of 80%–100%. It also successfully detected weed from the field, by running a comparison over multispectral weed images. Two types of testing were conducted, one in laboratory conditions on monocotyledonous and the other in dicotyledonous plants. They were tested with the reflectance spectra, and the field experiments resulted in multispectral images. The model in practice converts the reflectance spectra to pixelated values based on different parameters such as luminance, objects in scene, sensor characteristics which change the differentiating capability of the learning model as they serve as the learning parameters for the classification.

3 Proposed Work

The paper aims at a simple, user-friendly application for farmers, which would assist them in selecting the suitable crops for their land and also help them in detection and removal of weeds in their cropland. Wireless sensor network has been the breakthrough technology, especially in precision agriculture. Our model makes use of WSN, ML and AI working together to deliver a model with an accuracy of about 90%. The environmental factors detected using the sensors in the WSN model are fed into our system, which extracts essential features from the factors and uses Naïve Bayes algorithm to predict the suitable crop for the farmland. As an additional protection feature, our model includes a camera-fitted drone

Fig. 1 Workflow for our proposed model

which will capture real-time video of the crops from a suitable height above the ground. The video fed into our system uses CNN algorithm for detection of weed growth around the crops. Weed removal has not been included in our model. But the detected frames of weed growth can be used by future add-ons to our model for weed removal. In Fig. 1, the overall workflow for our proposed model is explained. Information is collected from all the different sensors distributed across the farmland. All the sensors are connected to one cluster head, which is in turn connected to the base station. In this way, the sensor data and the data from the camera-fitted drone are collected via the Internet into the database system. After preprocessing the data, it is fed into the crop recommendation and the weed classification models. After prediction, the information is updated to the farmer’s application interface.

4 Methodology

4.1 Dataset Preprocessing

The first objective was to preprocess the dataset, as the first column included soil types, which had to be encoded using a label encoder. The next task was to ensure that these encoded soil types do not indicate any priority. Thus, one-hot encoding technique was applied on the dataset. The rest of the features were of floating-point type. This ended the first phase of the data preprocessing. The next was to split the dataset into training and testing and to drop the output column. The output column was not to be encoded and kept as a string literal, as one-hot encoding would have rendered into inefficiency of the algorithms due to wide possibilities and would increase the output features by a manifold. With the data split into training and testing sets, the training part of the input features was fed into the respective classifiers. It is an important note that while fitting the training data into the classifiers, the data should always be parsed as integers. Thus, the dataset was trained with their respective training outputs and the classifier was ready to predict. In Fig. 2, the weed classification model is shown at its functional state. Higher percentage shown over an area of lands implies greater amount of weed in that area. Since we have proposed classification of weed with the help of drones, we wanted our setup to be able to properly classify based on a photograph taken from a height for which we have tested our model with some drone photographs of croplands with weeds, taken from a particular height. The crop dataset consists of the impact factors to be preprocessed for prediction of crops, namely soil moisture content, soil type, soil pH, soil infiltration level, humidity, temperature and finally the crops suitable for given values of these factors. The various factors taken into account in our dataset have been visualized using histogram to explore the variations of our dataset, which can be seen clearly in Fig. 3. After preprocessing, training and testing are carried out through our Naïve Bayes algorithm, and the input data from the farmers are fed into our model via WSN architecture, finally predicting the suitable

Fig. 2 Percentage of weed detected in cropland

Fig. 3 Visualization of the fields of our crop dataset

Fig. 4 Correlation matrix showing dataset features

crops for the land. Similarly, the weed dataset [17] consisting of weed training images has been trained by our CNN model. The video of the crop land detected by our camera-fitted drone will be fed into our model to detect weeds.

4.2 Techniques

As discussed above broadly, crop rotation and precision agriculture are the need of the agricultural sector but the inability of the farmers to predict the crops to be planted on their fields and the correct order in which the crops are required to be grown to maximize the yield and also the deficiency in the technological implementation have motivated the authors to contribute toward this social cause and make a holistic approach by a novel technique. IoT without smart processing will not be effective in processing gathered data and infer meaningful results. Hence, IoT and AI combined provide the best, innovative and effective results for our proposed application. Wireless communication technologies have become a backbone for any data retrieval, processing and communication between other devices or nodes. Zigbee is one such technology gaining popularity majorly in precision agriculture (PA) which has been used as the IoT system in our research. It has high scalability and also offers easier maintenance of the connected sensor nodes which can intercommunicate with each other. Thus, the information is collected from the various sensor nodes which communicate and transfer data via Zigbee technology to the base station. The base station receives all data, monitors each connected sensor and serves as the Zigbee Coordinator (ZC). The collected information in ZC is stored via Internet in a cloud database. The database communicates with the server to provide the information where the algorithms and predictor techniques are applied. The overall model revolves around by working on an agricultural based dataset, which has several soil types and soil conditions along with the different environment conditions. The correlation matrix between the different features such as temperature, precipitation, humidity is plotted to get a visual representation as shown in Fig. 4. The output of this dataset is the different crops that are eligible to be planted in those particular environmental situations. The main idea is to now use this dataset and predict the crops whenever some soil and environment analysis is given. These challenges were efficiently solved with several machine learning algorithms. The different algorithms, which were tested on for this dataset, were Naïve Bayes, SVM, KNN, multiple regression and K-means. Then, this classifier was used on the testing input features data. Now

NLTK Python Library was deployed to classify and was stored for further accuracy calculations. Then, the predicted output is compared with the test output feature. A confusion matrix is plotted, and the accuracy is found out. By changing hyperparameters, the algorithms were tested and verified for production. Now these algorithms were implemented in a Django framework, which would serve as an interface for the users. The users would pass in the data as a form and run their choice of predictions. The output would efficiently predict the crops to be planted. However, the major concern was the way users can gather the information about their soil. The main idea of setting up an IOT device was to resolve this issue. We used a WSN, which requires a centralized unit which could be controlled by user interface embedded in the Django app. Whenever the user clicks the button, it would activate the sensors and would upload the feed from the cloud network. The Web application would automatically fill in the required details in the Django form, and the analysis would be performed according to the algorithm selected. IGMS can be incorporated in a Web application. IGMS sensors incorporated are primarily humidity, pH scale, salinity, temperature. Data collected will be stored remotely in a server, and then a copy is sent to the cloud. The cloud is again requested with a GET request, by the application. The cloud inputs the required values in the form, and the result is found by using the desired algorithm. Thus, an efficient user-friendly two-click automated system is developed, which has a vision to be a boon to the agricultural sector. Firstly, we have implemented a large number of crop varieties in the dataset so that all the types of crop pools were filled with a high number of instances in every crop pool. A crop pool as used in our algorithms is the possibility of certain types of crops belonging to a same class as predicted and guided by the machine learning algorithms which runs on the massive dataset and hence setting up various rules for each crop, and justifying any crop which is provided by the user while testing to allocate the crop to a particular crop pool class. This classification is not based on the properties of the crops; rather, it feeds on the environment conditions and other dataset attributes. Now, the biggest challenge was to allocate a crop, which was never encountered in the dataset before. Therefore, to overcome that challenge, machine learning algorithms were applied to generate a set of rules. The rule set would take all the types of crops available, including crops which weren’t ever encountered in our training dataset. Based on the intrinsic properties of these crops and also several other environmental factors pertaining to the particular crop type, correlation checks of the identified crop were performed against the crops available in our dataset to find the closest correlated crop type. If the correlation value passes the barrier limit, the unknown crop is readily placed in the crop pool of that of the highest correlation crop. This also helps in future adding and updating of the crop dataset and hence is a robust and innovative technique in assigning to even unknown crops. Now this unknown crop belongs to a crop pool class, and thus, it is highly user-friendly if the user wants to add other crops into their consideration, which does not belong to the dataset. Hence, this alternative is provided which makes the application highly scalable and flexible for any type of crop, whether known as well as the unknown crops based on the users discretion and priorities, thus accurately recommending the crops which are optimum for getting sown in the area based on the users environmental conditions and other dependent factors, thus establishing an efficient and highly productive system. Another important add-on feature of this is the drone-based weed detection, which we have tried to implement as a secondary feature. It incorporates a 360-degree camera fitted underneath it. It uses a complex CNN technique, which takes a video frame as input and processes it in several deep neural layers. It also has implemented inception network techniques. It marks the regions where it detects weed plants and using the drones’ gyroscope plots a map between the user’s current location and the different location of weed found in the crop fields. There is a threshold value for the distinction between classification of weed and nonweed, by tuning several hyperparameters and learning this threshold on repeated training in batches. Thus, in the end, the weed plants can be plucked out. This system is hence maximizing the crop yield and has a high impact factor.

Fig. 5 Crop prediction by our model
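The crop-pool allocation rule described in this section (place an unseen crop in the pool of its most correlated known crop, provided the correlation passes a barrier limit) can be illustrated with a minimal sketch. It assumes that each crop is summarized by a numeric profile of environmental attributes and that Pearson correlation is the similarity measure; the profiles, crop and pool names, the helper names `pearson` and `assign_pool`, and the barrier value of 0.9 are all hypothetical and are not taken from the dataset used in this work.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # undefined for a constant profile; treated here as "no match"
    return cov / sqrt(var_a * var_b)


def assign_pool(unknown_profile, known_profiles, pool_of, barrier=0.9):
    """Return the pool of the most correlated known crop, or None if below the barrier."""
    best_crop, best_corr = None, -1.0
    for crop, profile in known_profiles.items():
        corr = pearson(unknown_profile, profile)
        if corr > best_corr:
            best_crop, best_corr = crop, corr
    if best_crop is not None and best_corr >= barrier:
        return pool_of[best_crop]
    return None


# Hypothetical profiles: (soil moisture %, soil pH, humidity %, temperature in deg C)
known_profiles = {
    "rice": (80.0, 6.0, 85.0, 28.0),
    "wheat": (40.0, 6.5, 50.0, 18.0),
}
pool_of = {"rice": "pool_A", "wheat": "pool_B"}

print(assign_pool((78.0, 6.1, 82.0, 27.0), known_profiles, pool_of))
```

Because the raw attributes live on very different scales, correlations between any two profiles tend to be high in a sketch like this; standardizing each attribute before comparison is one way to make the barrier limit more discriminating.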

4.3 Model Implementation

This whole analysis model was deployed on a Python-based framework. Django has been used to serve as the user interface. The user interface contains a form which can either be filled in by the user or populated with a button click: the IOT sensors perform a full analysis and upload their analysis reports to a remote cloud database, and the respective form inputs, i.e., pH, temperature, humidity, etc., are populated from it. These inputs are passed into the machine learning models, which first train on the dataset and then immediately produce the predicted output. They also take into account the previous history of the crops and optimize the output. Naïve Bayes, SVM and KNN algorithms have been implemented, and the user is free to choose any of them to compare the accuracy and to avoid bias. Figure 5 shows an illustrative snapshot of the Django-based interface rendered through a Jinja HTML template. The output is of the Naïve Bayes algorithm running and predicting the possible crops to be planted in the particular soil. sklearn.svm.SVC, sklearn.naive_bayes, sklearn.cluster.KMeans and various other Python libraries were used to make this implementation a success.
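To make the prediction step concrete, the following is a minimal, dependency-free sketch of a Gaussian Naïve Bayes classifier applied to form inputs. It is not the sklearn.naive_bayes implementation used in our application: the Gaussian variant, the function names `fit_gaussian_nb` and `predict`, and the six sample rows are assumptions made purely for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for every class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # small constant avoids division by zero
        model[label] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one feature row."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            # log of the Gaussian density, summed under the independence assumption
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical form inputs: (pH, temperature in deg C, relative humidity in %)
rows = [
    (6.0, 28.0, 85.0), (6.2, 30.0, 80.0), (6.1, 27.0, 88.0),
    (6.8, 18.0, 50.0), (7.0, 16.0, 45.0), (6.6, 20.0, 55.0),
]
labels = ["rice", "rice", "rice", "wheat", "wheat", "wheat"]

model = fit_gaussian_nb(rows, labels)
print(predict(model, (6.1, 29.0, 83.0)))
```

Working in log space keeps the product of many small likelihoods numerically stable, which matters once more than a handful of features are involved.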
Genetic mathematical modeling of our research:

4.4 Neural Network in Weed Detection

The weed detection mechanism is largely dependent on the neural network feed. The process is based mainly on taking images of several parts of the field, which serve as the input for the neural network. Before passing them through the network, the images have to be resized to a resolution of 608 × 608 pixels, which was chosen to fit the neural architecture with the highest accuracy. The resized image must be of very good quality, as every layer extracts different properties from the RGB layers, which are cumulatively assessed for the prediction and detection.
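The resizing step can be sketched as follows. The interpolation method is not specified in this section, so nearest-neighbour sampling is assumed here for simplicity; the function name `resize_nearest` and the tiny sample frame are hypothetical, and a production pipeline would more likely rely on an image library.

```python
def resize_nearest(image, out_h=608, out_w=608):
    """Resize an image stored as rows of pixels using nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Hypothetical 2 x 3 RGB frame standing in for a drone image.
frame = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(10, 10, 10), (20, 20, 20), (30, 30, 30)],
]

resized = resize_nearest(frame)
print(len(resized), len(resized[0]))  # prints: 608 608
```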
The architecture in Fig. 6 comprises five convolution and max-pooling blocks. In each of the layers, the pixels of the image provide different pieces of information pertaining to the features associated with classification. Max pooling is an important operation in each block, as it reduces the number of features, and hence the pixels, passed down to the next layer. Thus, according to the convergence theorem, with each progression in the neural network, we slowly converge toward a particular weight matrix by repeated forward and back-propagation over a number of epochs and batches. The neural network starts with 16 filters, and the number of filters is doubled in each successive block. Each max-pooling layer halves the spatial resolution, so the five max-pooling layers together downsample the input by a factor of 32. A feature map of 19 × 19 × 256 is thus created at the end of the five convolutional and pooling layers. After this step, we

Fig. 6 Neural network architecture layers

pass the resultant image into a layer of Inception V3 network layer. Again, it is passed into a wide range of convolutional operations, finally resulting in a 19 × 19 × 21 resolution. This is a methodology used to convert the two-dimensional tensor to three-dimensional tensor, so that bounding boxes can be established. We then use the YOLO v3 algorithm to create the bounding boxes, which were encoded to surround weed-laden areas via other algorithms and annotation processes. It is also to be noted that each bounding box carries the accuracy with which it predicted the possibility of a weed infestation. With this methodology, the camera can be fitted into a drone and a top-view image taken, and the weed detection process is thus implemented. This neural network does not use the default anchor boxes; rather, as mentioned above, they are calculated based on our training weed data. Various machine learning algorithms were tested to predict the bounding boxes, and it was observed that K-means algorithm gave the highest accuracy.

\[
L_1 = \alpha_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \omega_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] + \alpha_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \omega_{ij}^{obj} \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right]
\]

\[
L_2 = \sum_{i=0}^{S^2} \sum_{j=0}^{B} \omega_{ij}^{obj} (C_i - \hat{C}_i)^2 + \alpha_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \omega_{ij}^{noobj} (C_i - \hat{C}_i)^2
\]

\[
L_3 = \sum_{i=0}^{S^2} \omega_{i}^{obj} \left( p_i(c) - \hat{p}_i(c) \right)^2
\]

\[
L_{loss} = L_1 + L_2 + L_3 \tag{1}
\]

The mathematical calculation of the bounding boxes as prescribed by YOLO algorithm is mainly classified into \(L_1\), which indicates the error in the bounding box. \(L_2\) indicates the error in confidence of the system. \(L_3\) indicates the classification loss of the system. In the calculation shown in Eq. 1, \(\alpha_{coord}\) and \(\alpha_{noobj}\) are taken as 6 and 0.5. The number of grid cells is indicated by \(S\), and that of bounding boxes is indicated by \(B\):

\(\omega_{ij}\) denotes the \(j\)th bounding box in the grid cell \(i\)

\(\omega_{ij}^{noobj} = 1 - \omega_{ij}^{obj}\)

\(\alpha_{coord} = 6\)

\(\alpha_{noobj} = 0.5\)

\(\alpha_{coord} = 10 \ast \alpha_{noobj}\)

\(w_i, h_i, x_i, y_i\) are the width, height and centroid coordinates of the corresponding anchor box. The final loss function is calculated by summing up \(L_1\), \(L_2\) and \(L_3\). \(C_i\) is the calculated confidence score of the object, and \(p_i(c)\) pertains to the classification loss. The parameters with hats are the corresponding estimated values. \(c\) here denotes the classes. \(\omega_{ij}^{obj}\) is 1 if there is an object in the cell and 0 otherwise.

4.5 Machine Learning Algorithm in Crop Recommendation

Several machine learning algorithms have been developed to predict which crop is suitable based on the environmental conditions. The Naïve Bayes algorithm outputted the best accuracy. Naïve Bayes is a probability-based algorithm which is based on Bayes algorithm. It functions in developing the classifier models and is responsible for assigning the class labels. In this case, a set of crops which are in the data pool are selected and assigned accuracy based on prior training on a dataset. In this algorithm, a particular set is equally compared with all the attributes in the dataset, without discrimination or bias. The algorithm assures that only one crop set will be left in the pool. After the accuracy calculation, the crop pool with the highest probability is outputted by the pipeline. Support vector machine (SVM) is an algorithm which isolates a hyperplane to a value of unity, hence

regarding itself as a discriminative classifier. It is a collection of supervised learning procedures for classification, regression and outlier detection. Each instance is plotted in N-dimensional space, with the value of each attribute serving as the coordinate along the corresponding axis. KNN is another algorithm implemented. With this algorithm, predictions are made directly on the dataset. For a new instance x, the entire training dataset is traversed, the K most similar instances are grouped, and the prediction is made by summarizing the output variable of those K instances. This summary is generally based on the centroid, median or mode of the classes, which shifts constantly and indicates which class the crop sets belong to. To determine this, we find the Euclidean distance between all the instances. Euclidean distance is calculated as the square root of the sum of the squared differences between a new point (x) and an existing point (x_i) across all input attributes j:

$$
ED(x, x_i) = \sqrt{\sum_{j} \left( x_j - x_{ij} \right)^2} \tag{2}
$$

4.6 WSN Architecture Implementation

Wireless sensor network (WSN) architecture is currently the most widely used remote sensing technology in precision agriculture (PA). The advantages of using WSN are manifold, because under current agricultural requirements the optimum use of fertilizers, pesticides or irrigation water is still not clearly known to farmers. There is still a prevalent shortage of soil-testing laboratories in many areas of the world, because of which proper crops for a particular soil type are often not chosen correctly [18]. What is required to improve soil health is still ambiguous to farmers in several areas. In addition, less than normal or more than normal use of pesticides and fertilizers for specific crops can adversely affect the nutrition of soil and the health of farmers, respectively [19]. Hence, rapid data collection, prediction of correct crops for a specific soil type and environmental conditions, and monitoring of the data at a low cost but a higher efficiency are the need of the hour. WSN is one such important technology, which helps in accurate spatial data retrieval via different sensors placed at strategic positions spread throughout the farmland, followed by decision algorithms that derive useful results.

The WSN architecture targeted in our paper uses Zigbee technology as it offers all the basic and important requirements, namely low power consumption and a longer lifecycle, low data rate and moderate to high coverage area. Placement of WSN sensor nodes is a challenge; hence, this paper aims at terrestrial WSN using different network topologies, like mesh and tree, for sensor placement, based on farmer requirements. In addition, alternative solar energy power as a secondary battery source offers longevity to our WSN. The different sensors used in our model, namely relative humidity, temperature, soil water capacity and soil porosity sensors, are placed at strategic positions throughout the agricultural land to retrieve the signals efficiently and provide accurate input data to our crop recommendation engine, which analyzes the data and predicts the crops suitable for the land. The WSN sensor node architecture is shown in Fig. 7. Sensors convert the specified quantity to be measured into an analog signal, which is amplified and fed to the processor via an ADC. The transceiver section uses Zigbee for network establishment and communication in the WSN.

Fig. 7 Sensor node architecture

Temperature sensor: Different crops have different levels of tolerance toward both soil and environmental temperatures. For example, broccoli seeds germinate in cooler climates whereas marigolds germinate in relatively warmer ones. Hence, before selecting a crop for a farmland, the temperature is a crucial thing to monitor. The temperature sensor is used to determine the temperature in the farmland where it is implanted and to report any abnormal conditions. The sensor is primarily a diode, and the value is measured from the change in voltage between the terminals of the diode, which is used to calculate the temperature:

$$
T = \frac{Q \times V_f}{n k \times \ln I_f} \tag{3}
$$

where
T is the measured (absolute) temperature,
V_f is the forward voltage between the terminals of the diode,
I_f is the forward current,
k is the Boltzmann constant,
Q is the magnitude of electronic charge,
n is a value between 1 and 2.

pH sensor: Another major regulator of crop nutrients and an important factor to be considered for crop selection is the soil pH level. It is also related to the climatic conditions of the area; hence, the use of both a pH sensor and a temperature sensor would allow better decision making for the ML model to identify the relevant crop accurately. Plants like asparagus prefer a slightly alkaline soil to grow on, while blueberries grow well on more acidic soil. A pH sensor works based on

the principle of hydrogen-ion concentration measurement in the soil:

$$
pH(soil) = pH(solution) + \frac{E(solution) - E(soil)}{(R \times T / F) \ln 10} \tag{4}
$$

where
pH(soil) is the pH value to be calculated,
pH(solution) is the known pH value of the standard solution,
R is the gas constant (joules per kelvin per mole),
T is the temperature in kelvin,
F is the Faraday constant (coulombs per mole),
E(X) is the electrode potential.

Environmental factors affecting sensor readings: Extreme temperatures stress the sensor electronics and reduce their longevity. In places where the temperature varies frequently, the data value from the temperature sensor used as a WSN sensor node should be periodically fed into the pH sensor to obtain more accurate readings. Extreme pressure changes also affect the material of the sensor and produce faulty readings. The glass membrane used for pH measurement can be affected by frequent exposure to dusty air and would develop a coating which has to be chemically cleaned at specified time intervals.

VWC (volumetric water content) soil moisture sensor: This is an important measurement to be considered in crop farming as it is very closely related to the amount of irrigation required for the crop. If the soil moisture content can be measured correctly, water can potentially be saved, as many farmers are unaware of the exact water content their type of soil needs:

$$
\Theta = \frac{\varepsilon_b^{a} + (1 - \varepsilon_m)(\rho_b / \rho_s) - 1}{\varepsilon_w - 1} \tag{5}
$$

where
Θ is the water content in the VWC measurement,
ε_b is the bulk soil permittivity,
ε_m is the mineral permittivity,
ρ_s is the particle density,
ρ_b is the soil bulk density,
ε_w is the water permittivity.

Environmental factors affecting sensor readings: The permittivity of water is inversely dependent on temperature, and the conductivity of soil is directly correlated with temperature. The bulk density of soils can result in a ±2% error in the water content sensor readings.

Humidity sensor: Along with exposure to sunlight and temperature, the humidity of a region is one of the main factors affecting crop growth. Increased humidity allows crops to grow longer and wider parts. Vegetables ideally grow in room humidity, while flowering plants need less humid conditions. Such a sensor mainly works based on the change in the capacitive or resistive value of the dielectric element used to measure relative humidity values. The capacitive-type humidity sensor is proposed for the WSN sensor node connection, as it is known to provide better results than the resistive and thermal ones when tested across a varied range of temperatures:

$$
Rh = 100 \times \frac{\exp\left( \dfrac{17.625 \times T_d}{243.04 + T_d} \right)}{\exp\left( \dfrac{17.625 \times T}{243.04 + T} \right)} \tag{6}
$$

where
T is the temperature,
T_d is the dewpoint.

Environmental factors affecting sensor readings: Humidity sensor readings are affected by the altitude of the region due to the change in vapor pressures. They are also affected by the temperature. As temperature increases, the relative humidity decreases. Hence, calculating the dewpoint instead sometimes gives a better air moisture content reading, as it is independent of temperature.

Wireless sensor networks can be extended to other applications such as:

1) Drone automation for weed detection: Wireless sensor networks have a huge impact in precision agriculture, as seen in the literature survey of our paper. The prediction and recommendation of crops have benefitted greatly from WSN techniques combined with ML and DL. As a future prospect of our research, we aim for drone automation using WSN. Weed growth depends on several factors such as the climatic influences of a particular area, followed by soil fertility, soil moisture capacity, etc. In our research for crop recommendation, all the sensors placed throughout the field are already monitoring several climatic and soil content features. Those features recorded by the sensors in the WSN can be exploited to get a boundary limit condition for discharging or activating the drones for the weed detection mechanism. This would avoid unnecessary battery wastage, as the drones would be deployed only at the right and specific time, more toward the early growth of the crops, and would in turn lead to a fully automated drone monitoring system.
2) Prevention and controlling of forest fires: Forest fires are becoming very common in today's world due to increasingly drastic climatic changes. They remain to date a largely uncontrollable disaster and result in a huge loss of forest resources, degradation of the environment, and loss of human habitat. Forests are an integral part of our survival, and protecting and conserving our forests is a global concern. WSN sensors tracking the smoke and the heat can send off alarms to the nearby controlling unit and help in regulation and faster controlling of erupted fires.
3) Smart home applications: WSN can also be widely applied to our homes and daily lives in modern cities. The very notion of saving electricity can be efficiently executed by WSN and IOT devices continuously tracking user movement using vibration and heat transfer energies. The IOT and WSN can together communicate via infrared frequencies and switch off electronic appliances whenever they are not in use. Applied across all devices, this can play a key role in turning a modern city into a smart city.
4) Prevention and detection of wastewater contamination: Due to aggressive use of pesticides, an immense amount of pesticides, herbicides and insecticides gets heavily dissolved. This later results in a phenomenon called algae bloom, which reduces the oxygen content of the nearby water bodies. Most agricultural water is disposed of in some river or seeps into the groundwater, which is later consumed by the human population. Thus, it becomes a necessity to track the purification status, as agricultural wastewater is currently not tracked at all. With the help of WSN and IOT devices and sensors placed at every checkpoint, the water is checked for the percentages of this contamination. If these amounts are found to be excessively high, then the wireless network sends a signal to channel this water into containment shells for further purification.

Fig. 8 Naive Bayes code snippet

Fig. 9 Crop prediction accuracy

Fig. 10 Classification accuracy
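The boundary-limit idea described under application 1 (activating the drones only when the WSN readings suggest conditions favourable to weed growth) can be illustrated with a minimal Python sketch. It computes relative humidity as in Eq. 6, assuming temperatures in degrees Celsius, and then applies a simple threshold rule. The function names, the choice of features and all threshold values are hypothetical placeholders for illustration only; they are not values measured or proposed in this study.

```python
import math


def relative_humidity(temp_c: float, dewpoint_c: float) -> float:
    """Relative humidity (%) from air temperature and dew point, as in Eq. 6.

    Assumption: both inputs are in degrees Celsius.
    """
    numerator = math.exp((17.625 * dewpoint_c) / (243.04 + dewpoint_c))
    denominator = math.exp((17.625 * temp_c) / (243.04 + temp_c))
    return 100.0 * numerator / denominator


def should_deploy_drone(
    temp_c: float,
    dewpoint_c: float,
    soil_moisture: float,
    rh_min: float = 60.0,          # hypothetical humidity limit (%)
    moisture_min: float = 0.25,    # hypothetical volumetric water content limit
    temp_range: tuple = (15.0, 35.0),  # hypothetical temperature window (deg C)
) -> bool:
    """Hypothetical boundary-limit rule: deploy only if every reading is in range."""
    rh = relative_humidity(temp_c, dewpoint_c)
    temp_ok = temp_range[0] <= temp_c <= temp_range[1]
    return temp_ok and rh >= rh_min and soil_moisture >= moisture_min


if __name__ == "__main__":
    # Example readings from one sensor node (illustrative values only).
    reading = {"temp_c": 25.0, "dewpoint_c": 20.0, "soil_moisture": 0.30}
    print("Relative humidity (%):",
          round(relative_humidity(reading["temp_c"], reading["dewpoint_c"]), 1))
    print("Deploy drone:", should_deploy_drone(**reading))
```

In practice the limits would have to be derived from field data for the crop and region in question; a rule of this kind only decides when a drone flight is worthwhile, while the weed detection itself remains the task of the deep learning model.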

5 Results Analysis and Discussion

The model showed an accuracy of approximately 90% when trained on our dataset (training/testing: 3:1) using Naïve Bayes. The Bayesian code implemented in our research for prediction of crops is shown in Fig. 8. The run_naive_bayes_algorithm function runs the Naive Bayes classifier model on the training dataset and returns the resulting accuracy, which is printed in the main function. The main concept revolves around the information gathered from sensors via WSN and a proper ML model fitted to our dataset, coupled with proper CNN detection of weed images captured from the drone. In this era of revolutionizing agriculture, our proposed model is intended to ease the lives of farmers and help them tackle the huge losses incurred due to lack of knowledge of the appropriate crops to be grown on a land and also due to the weeds affecting their crops. Figure 9 illustrates the accuracy obtained on the dataset using the Naïve Bayes algorithm. The dataset is randomly split into a training set and a testing set. The classifier is then trained on the training set. This classifier is then applied to the testing dataset, and a confusion matrix is extracted. This confusion matrix is then computed, and the score and accuracy are predicted. As visible in Fig. 9, the accuracy is predicted to be 89.29%. There is an added validation set which is kept separately. The image also shows a crop prediction made for a particular tuple in the test data. In Fig. 10, a graph between the training and validation sets is plotted based on the progress of training and testing of the data. The accuracy at every point is noted and plotted. It can be seen that both curves are at par with each other; hence, the algorithm is highly capable and successful in predicting the accuracies. Moreover, the hyperparameters are correctly set to avoid overfitting and underfitting. This trade-off is the key factor in determining the accuracy and legitimacy of an algorithm.

In Table 1, a comparison between different ML techniques based on their prediction accuracy is depicted. Although all techniques have a fairly good accuracy score, we chose to proceed with Naïve Bayes, as it has proven to give higher accuracy and closer prediction results as compared to the others. The ensemble technique used in this paper is random forest. It matched our study requirements. Random

forest is basically a highly nonparametric algorithm, except for the hyperparameters, namely feature subspace ratio, number of trees and a few others. It was an optimally robust classifier for our dataset, which matches the requirements of the algorithm, but ensemble learning is not immune to overfitting, which significantly penalized the accuracy during the testing phase, thus resulting in an accuracy score of 88.7% (Table 1). The Naïve Bayes algorithm is less prone to overfitting and hence was declared superior in accuracy and other metrics to the other algorithms, making it the best algorithm for our use-case scenario and dataset.

Table 1 Comparison between different techniques for crop prediction

Techniques      Accuracy (%)
Naïve Bayes     89.29
Ensemble        88.7
KNN             88
SVM             87
Regression      84.8

A minute difference between the Naïve Bayes and KNN algorithms is that the former is a generative classifier, while the latter is a discriminative classifier. KNN is actually a lazy classifier, which operates on supervised learning. The main reason KNN did not give the desired results is that, being a lazy classifier, it finds it difficult to predict in the real-time scenario considered here. While implementing the algorithms, variable clustering sizes were used; the accuracy increased as the clustering size was raised up to 5 and then progressively degraded. Hence, the maximum accuracy KNN provided was 88%. Naïve Bayes, on the other hand, is regarded as an eager learning classifier, and being much faster than KNN makes it easier to process the huge crop dataset with a number of attributes. In this case, we took a probabilistic estimation methodology, generating probabilities for each crop pool class. The algorithm learns over time, and moreover it automatically takes care of the high dimensionality pertaining to the different attributes in the dataset. In KNN, a further dimensionality reduction technique had to be applied to make the accuracy viable for our test scenario, taking significantly more time in prediction and also reducing the accuracy, as the dataset attributes need not be highly correlated with each other. This makes the KNN algorithm error-prone and leads it to predict wrong results. Thus, Naïve Bayes gave an accuracy score of 89.29%, whereas KNN gave a score of 88% (Table 1).

SVM can equally be considered a good fit for this use-case as it automatically takes into account the high-dimensionality problems and computes each hyperplane, thus giving the accuracy metrics. Moreover, it is a perfectly balanced dataset, which gives SVM an edge over the KNN algorithm. However, it was probably due to the fact that our dataset was heavily preprocessed, and also due to the unconditional change in environment conditions, that Naïve Bayes performed better than SVM. KNN can actually quickly learn the classes as it uses median, centroid and mode calculations, thus giving a slightly better accuracy than SVM. At the end, SVM gave an accuracy score of 87% (Table 1).

The regression algorithm too is designed to output well-calibrated class probabilities. It has a smooth unconstrained loss function which also supports Bayesian cases, but it is generally focused on linearity. The dataset is not very linear and has many attributes, and hence may give rise to multi-dimensional planes which regression-based methods fail to handle with ease. Thus, it resulted in a lower accuracy score of 84.8% (Table 1).

Future scope of improvement includes better management of network longevity for WSN, addition of more possible sensors for more accurate crop prediction and a drone with robotic arms which enables autoplucking of weeds based on real-time weed detection.

6 Conclusion

Precision agriculture has helped in revolutionizing the agricultural industry to a great extent. Our proposed model of using wireless sensor networks and AI models for crop prediction and weed detection has also contributed to increasing agricultural efficiency. In contrast to traditional agricultural methods, which consume more time and hard work and sometimes lead to improper outputs and losses, modern agricultural methods involving AI and IoT can help farmers worldwide in taking better decisions and in increasing the overall crop yield and efficiency. The proposed model opens up the possibility of precise management of the farm sector.

Acknowledgement We express our heartfelt gratitude to Vellore Institute of Technology, Chennai, for supporting us in our research work. Additionally, we would like to thank all the volunteers who helped us in our study.

Authors' Contributions Ishita Dasgupta was involved in conceptualization, resources, methodology, software, investigation, writing (original draft), writing (review and editing) and visualization. Jayit Saha supported conceptualization, resources, methodology, software, formal analysis, writing (original draft), writing (review and editing) and visualization. Pattabiraman. V was involved in validation, data curation, writing (review and editing), supervision, project administration and funding acquisition. Parvathi. R contributed to validation, data curation, writing (review and editing), supervision, project administration and funding acquisition.

Compliance with Ethical Standards

Conflicts of interest The authors declare no conflicts of interest.

References

1. Pudumalar, S.; Ramanujam, E.; Rajashree, R.H.; Kavya, C.; Kiruthika, T.; Nisha, J.: Crop recommendation system for precision agriculture. In: 2016 Eighth International Conference on Advanced Computing (ICoAC), pp. 32–36. IEEE (2017)
2. Perez, A.J.; Lopez, F.; Benlloch, J.V.; Christensen, S.: Colour and shape analysis techniques for weed detection in cereal fields. Comput. Electr. Agric. 25(3), 197–212 (2000)
3. Burgos-Artizzu, X.P.; Ribeiro, A.; Guijarro, M.; Pajares, G.: Real-time image processing for crop/weed discrimination in maize fields. Comput. Electr. Agric. 75(2), 337–346 (2011)
4. Banavlikar, T.; Mahir, A.; Budukh, M.; Dhodapkar, S.: Crop recommendation system using Neural Networks. In: International Research Journal of Engineering and Technology (IRJET) (2018)
5. Bah, M.D.; Hafiane, A.; Canals, R.: Deep learning with unsupervised data labeling for weed detection in line crops in UAV images. Remote Sens. 10(11), 1690 (2018)
6. Rajak, R.K.; Pawar, A.; Pendke, M.; Shinde, P.; Rathod, S.; Devare, A.: Crop recommendation system to maximize crop yield using machine learning technique. Int. Res. J. Eng. Technol. 4(12), 950–953 (2017)
7. Na, A.; Isaac, W.; Varshney, S.; Khan, E.: An IoT based system for remote monitoring of soil characteristics. In: 2016 International Conference on Information Technology (InCITe)-The Next Generation IT Summit on the Theme-Internet of Things: Connect your Worlds, pp. 316–320. IEEE (2016)
8. Mythili, R.; Meenakshi, K.; Apoorv, T.; Neha, P.: IoT Based Smart Farm Monitoring System. International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, July (2019)
9. Manoj Athreya, A.; Hrithik Gowda, S.; Madhu, S.; Ravikumar, V.: Agriculture Based Recommender System using IoT-A Research. IJRTE, ISSN: 2277-3878 (2019)
10. Mokarrama, M.J.; Arefin, M.S.: RSF: A recommendation system for farmers. In: 2017 IEEE Region 10 Humanitarian Technology Conference (R10-HTC), pp. 843–850. IEEE (2017)
11. Raja, S.K.S.; Rishi, R.; Sundaresan, E.; Srijit, V.: Demand based crop recommender system for farmers. In: 2017 IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR), pp. 194–199. IEEE (2017)
12. Shirshivkar, P.; Solaskar, M.; Borkar, R.; Sawant, S.; Patil, S.: Recommender System For Smart Agriculture Using IoT. In: International Conference on Innovative and Advanced Technologies in Engineering (ICIATE-2019), ISSN (e): 2250-3021, ISSN (p): 2278-8719, pp. 42–44 (2019)
13. Zhang, H.; Zhang, L.; Ren, Y.; Zhang, J.; Xu, X.; Ma, X.; Lu, Z.: Design and Implementation of Crop Recommendation Fertilization Decision System Based on WEBGIS at Village Scale. In: 4th Conference on Computer and Computing Technologies in Agriculture (CCTA), Oct 2010, pp. 357–364. https://
14. Ngo, H.C.; Hashim, U.R.; Sek, Y.W.; Kumar, Y.J.; Ke, W.S.: Weeds Detection in Agricultural Fields using Convolutional Neural Network. International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075 (2019)
15. Tang, J.-L.; Chen, X.-Q.; Miao, R.-H.; Wang, D.: Weed detection using image processing under different illumination for site-specific areas spraying. Comput. Electr. Agric. 122, 103–111 (2016)
16. Louargant, M.; Villette, S.; Jones, G.; Vigneau, N.; Paoli, J.N.; Gée, C.: Weed detection by UAV: simulation of the impact of spectral mixing in multispectral images. Precis. Agric. 18(6), 932–951 (2017)
17. dos Santos Ferreira, A.; Pistori, H.; Matte Freitas, D.; Gonçalves da Silva, G.: Data for: Weed Detection in Soybean Crops Using ConvNets. Mendeley Data, v2 (2017) 3fmjm7ncc6.2
18. Dimkpa, C.; Bindraban, P.S.; Mclean, J.E.; Gatere, L.: Methods for Rapid Testing of Plant and Soil Nutrients. https:// (2017)
19. Ahmed, M.; Rauf, M.; Mukhtar, Z.; Saeed, N.A.: Excessive use of nitrogenous fertilizers: an unawareness causing serious threats to environment and human health. Environ. Sci. Pollut. Res., 14 (2017)


