Plant Disease Detection Using Machine Learning Algorithm
ISSN No:-2456-2165
Fig 1:- 1. Apple scab 2. Grape Esca 3. Corn leaf spot 4. Potato Early Blight 5. Tomato Bacterial Spot
Applications
Biological research
Plant leaf disease detection is also useful in agricultural institutes
Automatic plant leaf disease detection techniques are beneficial for the large-scale work of monitoring crop disease on farms

Objectives
Farmers or experts keep a close eye on the plants to spot and recognise disease.

However, this procedure is frequently time-consuming, expensive, and unreliable. Automatic detection using image processing methods gives quick and precise results. This study uses deep convolutional networks as an alternative method for developing a disease recognition model based on leaf image classification.

The goal of this study is to use a CNN to identify potato, corn, grape, and apple leaf diseases. The leaves of both healthy and diseased plants are examined using the CNN.

Motivation
Identifying and recognising leaf disease helps to limit crop losses on large farms and to protect productivity and profit. It is also beneficial in agricultural institutes and biological research.

II. EXISTING WORK AND IMPLEMENTATION WORK

Overview of Existing Work
Existing work related to leaf disease detection using CNN shows how to detect and classify leaf disease using image processing techniques that follow the steps below.

Fig 2:- General Block Diagram of Feature Based Approach

A. Image Acquisition: Image acquisition is the process of capturing an image with a digital camera and storing it on a digital medium for use in subsequent MATLAB processing.

B. Image pre-processing: The main goal of image pre-processing is to strengthen certain image features or to improve image data that contains undesired distortions, in preparation for further processing. Pre-processing techniques include changing image size and shape, noise filtering, image conversion, image enhancement, and morphological operations, among others.

C. Image Segmentation: The K-means clustering algorithm is used to divide the image into clusters. At least one of the clusters must contain the majority of the diseased area. K-means partitions the objects into K groups according to their features.

D. Texture feature extraction: Once the clusters have formed, texture features are extracted using the GLCM (Gray-Level Co-occurrence Matrix).

E. Classification: Classification is used to test for leaf disease. The Random Forest classifier is employed for classification.

Implementation work
Machine learning Model: There are a total of 24 different labels. The apple leaf labels are apple scab, black rot, apple rust, and healthy. The corn labels are Corn Blight, Corn Rust, Corn Healthy, and Corn Cercospora Gray spot. The grape labels are black rot, Esca, healthy, and leaf blight. The three potato labels are early blight, healthy, and late blight. The tomato labels are bacterial spot, early blight, healthy, late blight, leaf mould, septoria leaf spot, spider mite, target spot, mosaic virus, and yellow leaf curl virus.
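As an illustration of the texture feature extraction step (D) of the feature-based approach, the following is a minimal sketch of a gray-level co-occurrence matrix and a few texture statistics derived from it. The helper names (`glcm`, `texture_features`), the horizontal one-pixel offset, the symmetric counting, and the 4 × 4 sample patch are assumptions made for the example and are not taken from the original work.

```python
def glcm(image, levels, dx=1, dy=0):
    """Return a normalised, symmetric gray-level co-occurrence matrix.

    image  : 2D list of integer gray levels in range(levels)
    dx, dy : pixel offset between the two pixels of each pair (assumed: right neighbour)
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image is too small for the chosen offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Compute contrast, angular second moment (ASM) and homogeneity from a GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "asm": sum(p[i][j] ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells),
    }


# Hypothetical 4-level patch standing in for one segmented leaf cluster.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(patch, levels=4)
features = texture_features(matrix)
```

In a pipeline of the kind described above, a feature vector like `features` would be computed per cluster and passed to the classifier.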
Convolutional neural networks (CNN) can be used to create a computer model that takes unstructured image inputs and transforms them into the corresponding classification labels. A CNN is a type of multi-layer neural network that can be trained to learn the features needed for classification. Less pre-processing is required than with conventional methods, and feature extraction is performed automatically for better performance. For identifying leaf disease, a variant of the LeNet architecture produced the best results. LeNet is a simple CNN model built from four kinds of layers: convolutional, activation, max-pooling, and fully connected. The model used here to classify leaf diseases is based on this architecture, with an additional block of convolution, activation, and pooling layers compared to the original LeNet architecture. The model used in this study is shown in Fig. 2. Each block contains a convolution layer, an activation layer, and a max pooling layer. The architecture uses three such blocks, followed by fully connected layers and soft-max activation. Convolution and pooling layers are used for feature extraction, and fully connected layers are used for classification. Activation layers give the network non-linearity. [...] maintain the size of the image. The max pooling layer is used to reduce the size of the feature maps, speed up training, and make the model more robust to small changes in the input. The kernel size used in max pooling is 2 × 2. ReLU activation layers are used in each of the blocks to introduce non-linearity. To avoid over-fitting the training set, Dropout regularisation is also used with a keep probability of 0.5. Dropout randomly removes neurons from the network during training iterations, which reduces the variance of the model and simplifies the network, and hence helps to prevent overfitting. The classification block is composed of two fully connected layers of 500 and 10 neurons respectively. After the second dense layer, a soft-max activation function is used to compute the probability scores for the ten classes.
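The classification block described above (ReLU, Dropout with a keep probability of 0.5, and soft-max over ten classes) can be sketched in a few lines of plain Python. This is only an illustration: the activations are random stand-in values, the function names are hypothetical, and the rescaling of surviving units ("inverted dropout") is an assumption about how dropout is applied rather than something stated in the text.

```python
import math
import random


def relu(values):
    """Rectified linear unit: negative activations are clipped to zero."""
    return [max(0.0, v) for v in values]


def dropout(values, keep_prob=0.5, rng=random):
    """Inverted dropout (assumed variant): keep each unit with probability
    keep_prob and rescale survivors so the expected activation is unchanged."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Stand-in for the output of the 500-neuron dense layer (made-up values).
hidden = relu([rng.uniform(-1.0, 1.0) for _ in range(500)])
dropped = dropout(hidden, keep_prob=0.5, rng=rng)  # applied during training only

# Stand-in for the output of the 10-neuron dense layer (made-up values).
logits = [rng.uniform(-2.0, 2.0) for _ in range(10)]
probs = softmax(logits)
predicted_class = probs.index(max(probs))
```

At test time dropout is switched off, so only `relu` and `softmax` would be applied.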
Fig 4:- Experimental result (a) input image (b) convolution layer-1 (c) convolution layer-2 (d) convolution layer-3 (e) flatten layer.
In addition, each experiment computes the overall accuracy during training and testing (for each epoch). The overall accuracy score is used to evaluate performance. Transfer learning is a method of knowledge transfer that uses fixed-size 224 × 224 images and requires little training data. Transfer learning is useful for transferring knowledge from one
It employs the VGG16 convolutional neural network. The input to the convolution layers must be a fixed-size 224 × 224 RGB image. The image is passed through a stack of convolutional layers whose filters have the smallest possible receptive field, 3 × 3, which is the smallest size that captures the notions of left, right, up, down, and centre. Some configurations also use 1 × 1 convolution filters, which can be seen as a linear transformation of the input channels followed by a non-linearity. The convolution stride is fixed at one pixel. The spatial padding of the input to the convolution layer is chosen so that the spatial resolution is preserved after convolution when 3 × 3 convolution layers are used. Spatial pooling is carried out by five max-pooling layers, which follow some of the convolution layers (not all the conv. layers are followed by max-pooling). Max-pooling is performed over a 2 × 2 pixel window.

A stack of convolutional layers is followed by three fully connected layers. The first two have 4096 channels each, while the third performs 1000-way ILSVRC classification and therefore has 1000 channels. The soft-max layer is the final one. The fully connected layer arrangement helps to identify the leaf disease.

All hidden layers use the rectified linear unit (ReLU) non-linearity. It should also be noted that none of the networks use Local Response Normalization (LRN), because it does not improve performance on the dataset.

A CNN designed for large-scale image recognition was employed. Identifying plant diseases involves two tasks. The first is object localization, which is the detection of objects in an image that come from various classes. The second is image classification, which involves labelling each image with one of several categories. There are seven distinct layers in the CNN model, and each layer handles specific information. They include the input, convolutional, pooling, fully connected, soft-max, and output layers.

Input layer: The input layer holds the image data. Its parameters are the image dimensions: height, width, and depth (the RGB colour channels). The input is a fixed-size 224 × 224 RGB image.

Convolutional layer: This layer is also called the feature extraction layer. It extracts the salient features from the input images by taking dot products between the filters and patches of the image.

Pooling Layer: By reducing the dimensions of the feature matrix produced by the dot products, the pooling layer lowers the processing power required to process the data.
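To make the effect of the 3 × 3 convolutions and the five 2 × 2 max-pooling layers concrete, the sketch below traces the spatial size of a 224 × 224 input through a VGG16-style stack. The per-block layer counts and channel widths in `VGG16_BLOCKS`, the padding of one pixel, and the pooling stride of 2 are assumptions taken from the commonly published VGG16 configuration, not from the text above. The function names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, window=2, stride=2):
    """Spatial size after max pooling along one dimension."""
    return (size - window) // stride + 1


# Assumed VGG16 layout: (number of 3x3 conv layers, output channels) per block.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_vgg16(input_size=224):
    """Return the (height, width, channels) of the feature map after each block."""
    size = input_size
    shapes = []
    for n_convs, channels in VGG16_BLOCKS:
        for _ in range(n_convs):
            # 3x3 kernel, stride 1, padding 1: the resolution is preserved
            size = conv_output_size(size, kernel=3, stride=1, padding=1)
        size = pool_output_size(size)  # 2x2 max pooling halves the resolution
        shapes.append((size, size, channels))
    return shapes


shapes = trace_vgg16()
final_h, final_w, final_c = shapes[-1]
flattened_length = final_h * final_w * final_c  # input to the first 4096-channel layer
```

Under these assumptions the resolution goes 224, 112, 56, 28, 14, 7, so the first fully connected layer receives a vector of 7 × 7 × 512 values.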
Therefore, it is proposed to build an image processing system that automates the detection and classification of batches of leaves into particular diseases, so that the cause of a symptom can be determined with an automated tool. The system comprises three basic blocks, as depicted in the above diagram: Image Analyser, Feature Database, and Classifier [9]. The processing performed by these blocks has two phases, as follows: Offline phase: An image analyser processes a large number of diseased images to extract abnormal features.

CNN Model Steps:
Conv2D: A 2D convolution layer. This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs.
keras.layers.Conv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

Max-pooling: Max pooling is a pooling method that selects the maximum element from the region of the feature map covered by the filter. The output of the max pooling layer is therefore a feature map containing the most prominent elements of the previous feature map.

Flatten: A “Flatten” layer sits between the convolutional layers and the fully connected layer. A fully connected neural network classifier is supplied with a vector created by flattening the two-dimensional feature matrix.

Image Data Generator: ImageDataGenerator makes it possible to quickly set up Python generators that automatically turn image files on disk into batches of pre-processed tensors.

Training Process: Effective training starts before a trainer conducts a training session and continues after the session is over. Training consists of five connected processes or activities: assessment, motivation, planning, delivery, and evaluation.

Epochs: In machine learning, an epoch is one complete pass of the learning algorithm over the entire training dataset, and the number of epochs is the number of such passes. Datasets are typically split into batches (especially when the amount of data is very large).

Validation Process: Validation is the process of evaluating a trained model against a testing data set. The testing data set may be a separate portion of the same data set from which the training set is drawn. The main goal of using the testing data set is to evaluate the trained model’s ability to generalise.

Training and Testing Model:
The dataset is pre-processed, including image resizing, reshaping, and conversion to array form. The same processing is applied to the test image. Any image from a dataset of approximately 38 distinct plant leaf disease classes can be used as a test image for the software.
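The Conv2D, Max-pooling, and Flatten steps listed under CNN Model Steps can be illustrated on a tiny single-channel example. This is a plain-Python sketch rather than the Keras implementation: the function names, the 5 × 5 input, and the 2 × 2 kernel are made up for the example, the convolution uses 'valid' padding with stride 1, and the pooling stride is assumed to equal the window size.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and take dot products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


def flatten(fmap):
    """Turn a 2D feature map into the 1D vector fed to the dense layers."""
    return [value for row in fmap for value in row]


# Hypothetical 5x5 single-channel input and 2x2 kernel.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 3, 1, 0, 1],
]
kernel = [[1, 0], [0, 1]]

feature_map = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool2d(feature_map)           # 2x2 after pooling
vector = flatten(pooled)                   # length-4 vector
```

A real Conv2D layer applies many such kernels across several input channels and learns the kernel values during training.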
The training dataset is used to train the model (CNN), enabling it to recognise the test image and, consequently, the disease it shows. The layers of the CNN include Dense, Dropout, Activation, Flatten, Convolution2D, and MaxPooling2D. If the plant species is included in the dataset and the model has been successfully trained, the programme can detect the disease. After training and pre-processing, the test image is passed to the trained model in order to identify the disease.
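The relationship between epochs, batches, and validation accuracy described above can be shown with a short sketch. The sample counts, batch size, label names, and helper functions here are hypothetical, and the loop only counts update steps instead of training a real model.

```python
import math


def iter_batches(samples, batch_size):
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


train_samples = list(range(10))  # stand-ins for ten training images
batch_size, epochs = 4, 3
steps_per_epoch = math.ceil(len(train_samples) / batch_size)

steps_run = 0
for epoch in range(epochs):  # one epoch = one full pass over the training set
    for batch in iter_batches(train_samples, batch_size):
        steps_run += 1  # a real loop would update the model weights here

# Validation: compare predictions on held-out images with their true labels.
val_labels = ["scab", "healthy", "rust", "healthy"]
val_predictions = ["scab", "healthy", "healthy", "healthy"]
val_accuracy = accuracy(val_predictions, val_labels)
```

With ten samples and a batch size of four, each epoch takes three steps, the last of which holds the two remaining samples.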
Dataset: The Plant Village dataset was obtained from Kaggle, and the code was run in a Kaggle online kernel for efficient computation and for analysis of the training and validation loss.

Image Pre-processing and Labelling: Pre-processing often involves removing low-frequency background noise, normalising the intensity of the individual particle images, removing reflections, and masking portions of images. Image pre-processing is a technique for enhancing the data. In addition, the pre-processing procedure involved manually cropping the images, making a square around the leaves to highlight the region of interest (plant leaves). During the collection phase, images with lower resolution and dimensions below 500 pixels were not considered valid images for the dataset. The dataset was also limited to images in which the region of interest was shown at higher resolution. In this way, it was ensured that the images contained all the information needed for feature learning. Many resources can be found on the Internet, but their relevance is often unreliable. To confirm the accuracy of the classes in the dataset, which was initially grouped by a keyword search, agricultural experts examined the leaf images and labelled all of them with the appropriate disease abbreviations. Using correctly classified images is important for the training and validation dataset. Only in this way is it possible to develop an accurate and reliable detection model. At this stage, duplicate images that remained after the initial round of gathering and grouping images into classes were removed from the dataset.

Result and Conclusion:
Convolutional networks are known to learn features well when trained on larger datasets, so the results of training with only the original images are not examined. After the network's parameters were tuned, an overall accuracy of 88% was attained. The trained model was also tested on each class separately. Every image from the validation set was tested. As good practice recommends, the results achieved should be compared with other results. However, apart from those dealing with plant species identification from photographs of the leaf, there are presently no commercial solutions available. In this research, a technique for automatically classifying and identifying plant diseases from leaf images was investigated. Every step was described in detail, from gathering the images used for training and validation, through image pre-processing and augmentation, to training the deep CNN and fine-tuning. Various tests were run to evaluate the performance of the newly developed model. No comparison was made with results obtained using a similar method because, as far as we know, the proposed method has not been used in the field of disease recognition. The test image provided in this case shows a leaf spot.

REFERENCES