See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/370189362

Automated Freshwater Fish Species Classification using Deep CNN

Article in Journal of The Institution of Engineers (India) Series B · April 2023


DOI: 10.1007/s40031-023-00883-2


3 authors, including:

Jayashree Deka
Assam Don Bosco University

All content following this page was uploaded by Jayashree Deka on 27 April 2023.



J. Inst. Eng. India Ser. B
https://doi.org/10.1007/s40031-023-00883-2

ORIGINAL CONTRIBUTION

Automated Freshwater Fish Species Classification using Deep CNN

Jayashree Deka1 · Shakuntala Laskar1 · Bikramaditya Baklial2

Received: 9 February 2022 / Accepted: 8 April 2023


© The Institution of Engineers (India) 2023

Abstract Freshwater fish is considered a poor man’s protein supplement as they are easily available in lakes, rivers, natural ponds, paddy fields, beels, and fisheries. There are various freshwater fish species that resemble each other, making it difficult to classify them by their external appearance. Manual fish species identification always requires expertise and is therefore error-prone. Recently, computer vision along with deep learning has played a significant role in underwater species classification research, where the number of species under investigation is always limited to a maximum of eight (8). In this article, we choose two deep-learning architectures, AlexNet and ResNet-50, to classify 20 indigenous freshwater fish species from the North-Eastern parts of India. The two models are fine-tuned for training and validation of the collected fish data. The performance of these networks is evaluated based on overall accuracy, precision, and recall rate. This paper reports the best overall classification accuracy, precision, and recall rate of 100% at a learning rate of 0.001 by the ResNet-50 model on our own dataset and the benchmark Fish-Pak dataset. Comprehensive empirical analysis has shown that with an increasing Weight and Bias learning rate, the validation loss incurred by the classifier also increases.

Keywords Fish classification · Fish species recognition · ResNet-50 · Deep-learning

* Jayashree Deka
[email protected]
1 Electrical and Electronics Engineering, Assam Don Bosco University, Guwahati, Assam 781017, India
2 Department of Zoology, Bahona College, Jorhat, Assam 785101, India

Introduction

The aquaculture of South Asian countries like India, Pakistan, Bangladesh, Sri Lanka, and Nepal contributes, by volume, 7.9% of Asian and 7.1% of world fish production [1]. The predominant fish groups in these countries are the carps and the catfishes [2]. Among freshwater fishes, the Indian major carps are most cultured [3], followed by Exotic Carps, Minor Carps, Catfish, and Trout. Apart from these, there are plenty of small indigenous fish species (e.g., mola, climbing perch, barbs, bata, etc.) that grow in rivers, lakes, ponds, beels, streams, wetlands, lowland areas, and paddy fields. These indigenous fish are rich in nutrients, often containing high levels of zinc, iron, and vitamin A [4]. As per FishBase, India reports a total of 1035 freshwater fish species, while it is 241 from Pakistan and 256 from Bangladesh. Out of 1035 freshwater species, 450 are Small Indigenous Fish Species (SIFS, body length maximum 26 cm). Almost 216 SIFS are reported from Assam, in the North-Eastern part of India [5]. These indigenous fish species in the polyculture system have the potential of enhancing the nutritional security of the poor, which in turn can provide greater employment opportunities in these underdeveloped Asian countries. As the production and marketing of these indigenous fish species are at the local level, they are mostly invisible in government statistics. Also, manual identification of fish species needs information about the number of fins, fin location, scales, lateral line, head shape, color, and texture of the body. However, a lack of appropriate taxonomic knowledge or user expertise can potentially cause unsought repercussions for fishery management. Manual fish identification based on morphology is often considered an error-prone, inaccurate, and inefficient task. This shortcoming can be eliminated by using computer vision techniques, as they have shown outstanding performance in various image-based classification tasks.
Diversification of fish species in aquaculture has become a popular research topic in India and Bangladesh. In poly-culture fisheries, it has been observed that the presence of certain fish species (including SIFS) may result in an increase or decrease in the growth of some other species (mainly carp) [6–10]. So, it is important to identify/classify rival species for early segregation to promote better growth. The study of fish diversification is necessary to evaluate selectivity, dietary overlap, and food competition among the cultured species. Therefore, it is crucial to correctly classify all the fish species reared in one environment to attain the targeted sustainable growth and productivity. The correct classification will also help fish farmers with biomass estimation, disease identification, feeding planning, and cost estimation.

Based on computer vision algorithms, much research has been conducted for fish species recognition using fish images and underwater videos. Until the evolution of deep learning, most of the recognition algorithms were based on hand-crafted feature extraction techniques (color, texture, and shape) from images [11–22]. As deep learning gains popularity in animal and plant species recognition, it has become a common choice for fish classification research too. Li et al. 2022 have studied how recent machine vision techniques have changed the outlook of feature-based fish species classification experiments [23]. As underwater fish images do not possess much color and texture information due to poor contrast and varied illumination, only shape information plays a significant role in underwater fish image classification models. It has been observed that the research on fish identification mostly deals with marine-water fish resources and only a handful of studies have been carried out on fresh-water fish species. Meanwhile, a few researchers utilized pre-trained CNN models for fresh-water fish image classification where a constant background was maintained for all images captured in a laboratory environment [24–27]. Although these works demonstrate satisfactory performance, further research into the effect of varied backgrounds on the performance of these deep-learning models is still required. Moreover, there is still scope for improvement in the accuracy of the small-indigenous fish classification problems [24, 28]. Indian water bodies carry varieties of fresh-water fish species; still, we are unable to track down any research article focusing on these species. Here, in this work, we perform a deep-learning-based experiment using MATLAB to classify images of 20 varieties of fresh-water fish including small indigenous species that are native to Assam, India. We deploy a deep residual 50-layer network, ResNet-50, and an AlexNet architecture to build a robust model to work on different background images. A total of 587 images of these species are captured under different background conditions under natural lighting. Figure 1 shows the fish sample images that we have collected for the experiment.

Fig. 1  Fish sample images used

The proposed method will help in the accurate classification of carps and small indigenous fish species reared in a poly-culture environment. The proposed work will also help different stakeholders in fisheries, i.e., government officials, policymakers, scientists, and farmers, in developing effective conservation measures for these fresh-water fish species for stock management.

The main contributions of this paper are:

I. The paper reports, for the first time, the application of deep learning networks for automated classification of 20 freshwater fish species including carps and SIFS native to North-eastern parts of India. Images captured from collected samples are labeled accurately with the help of our third author, who has experience with fish anatomy study.
II. The proposed method utilizes a cost-effective image acquisition device by using a smartphone, and the same is verified by the concerned expert.
III. The paper investigates the appropriate learning rate, weight, and bias learning rate factors to achieve the best classifier performance.
IV. As local fish farmers are exploring mixed culture farming, early detection and accurate classification of fish species will help those farmers to segregate the rival

species in order to gain maximum growth and better yield.

Literature Search

Research on fish classification and recognition tasks started back in the year 1993 [11]. Fish recognition was first investigated on dead demersal and pelagic fishes where color and shape features were used for recognition [11]. They reported 100% sorting for demersal and 98% for pelagic fish, respectively.

Cadieux et al. 2000 developed a system for monitoring fish using a Silhouette Sensor. Fish silhouettes were used for extracting the shape features using Fourier Descriptor and other shape parameters, i.e., area, perimeter, length, height, compactness, convex hull area, concavity rate, image area, and the height of the image [12]. A multi-classifier approach was used for classification by incorporating a Bayes maximum likelihood classifier, a Learning Vector Quantization classifier, and a One-Class-One-Network (OCON) neural network classifier. The combined classifier resulted in an accuracy of 77.88%.

The fish classification from videos in the constrained environment was explored based on the texture pattern and shape of the Striped Trumpeter and the Western Butterfish from 320 images [13]. The SVM classifier with a linear kernel outperformed the polynomial kernel with an accuracy of 90%.

Spampinato et al. performed a classification of 10 fish species using a dataset of 360 images. They extracted texture features using Gabor filtering, and boundary features by using Fourier descriptors and curvature analysis [14]. The system showed an accuracy of 92%. Alsmadi et al. 2011 developed a backpropagation classifier-based fish image recognition. They extracted fish color texture features from 20 different fish families using GLCM and achieved an overall testing accuracy of 84% on 110 testing fish images [15]. Images from Birch Aquarium were studied using a Haar detector and classifier for classifying Scythe Butterflyfish images [16]. Hu et al. utilized color, texture, and shape features of six freshwater fish images that were collected from ponds in China [17]. Color features were extracted from skin images by transforming RGB images into HSV color space, whereas statistical texture features and wavelet texture features were obtained from a total of 540 fish images [17]. The accuracy achieved was 97.96% with some misclassification due to similar texture features among the species.

In a separate experiment, features were extracted by SIFT and SURF detectors for Tilapia fish identification [18]. It was observed that the performance with SIFT (69.57%) features was lower than that with SURF (94.4%) features. Another fish recognition model was introduced for 30 fish species from Thailand based on shape and texture features. The system achieved a precision rate of 99% with the ANN classifier [19]. Rodrigues et al. used SIFT feature extraction and Principal Component Analysis (PCA) for the classification of 6 fish species from a dataset of 162 images and achieved an accuracy of 92% [20].

Fish recognition from live fish images of the South Taiwan sea was performed using 66 different attributes based on color, texture, and shape features [21]. As the recognition task depended on the detection of the fish in the video, the system showed a poor recognition rate. Different feature descriptors for different parts of fish images were employed in [22]. SIFT and color histogram methods extracted partial features from images and a partial hierarchical classifier predicted the images with a score of 92.1% on the FISH4KNOWLEDGE dataset [22]. Mattew et al. 2017 used landmark selection and geometrical features for the identification of four Indian marine fish using ANN and they achieved an accuracy of 83.33% [29].

The popularity of Convolutional Neural Networks (CNNs) has been proved in fish detection and species classification tasks too. A deep CNN based on the AlexNet model for fish species classification on the LifeClef-15 dataset was used and it gained an F-score of 73.5% on underwater videos [30]. Video clips of the 16 species collected from the Western Australia sea were studied using a pre-trained deep CNN where cross-layer pooling was used as the feature extractor and classification was performed using an SVM classifier, thereby achieving an accuracy of 94.3% [31].

Different fish images were collected from the internet by the authors in [32]. They used a deep residual 50-layer network for residual feature extraction and an SVM classifier on those images. They examined the effect of the frozen layer on the overall system performance. The work reported an accuracy of 97.19%. Salman et al. 2019 used CNN-SVM for best accuracy on the same dataset as in [31] where the maximum accuracy they claimed was 90% [33]. Khalifa et al. [34] proposed a 4-layer CNN on QUT and LifeClef-15 dataset image classification and achieved a testing accuracy of 85.59%.

Islam et al. performed eight (8) different types of small indigenous fish classification experiments in Bangladesh [35]. They developed an HLBP (hybrid census transform) feature descriptor for feature extraction and used an SVM classifier to classify the fish images [35]. They achieved an accuracy of 94.97% with a linear SVM kernel. A 32-layer CNN network was employed on 6 freshwater fish species from the Fish-Pak [24] dataset [25]. They investigated the effect of different learning rates on the system performance and reported the best classification accuracy of 87.78% at a learning rate of 0.001 and epoch = 350 [25]. Another six SIFS image classifications from Bangladesh were studied using color, texture, and geometrical features [26]. A total

of 14 features were extracted, which were further reduced by using PCA and then fed to an SVM classifier, resulting in a classification accuracy of 94.2% [26]. For the first time, an experiment for finding fish species, order, and family of 68 Pantanal fish species was executed using Inception layers [28]. These layers extracted the features from the fish images and then three branches were implemented with the help of fully connected layers to predict the class, family, and species of the fish images. The technique showed maximum accuracy for order classification.

In a separate experiment conducted on the Lifeclef19 and QUT datasets, an AlexNet with a reduced number of convolutional layers and an extra soft attention layer added in between the last fully connected layer and the output layer was used. This gave an accuracy of 93.35% on the testing dataset [27]. A deep YOLO network was used on the LifeClef-15 and UWA datasets for fish detection and classification [36]. The classification accuracy observed was 91.64% and 79.8% on the two datasets, respectively. Deep DenseNet121 and SPP-Net were used for the classification of six different kinds of freshwater fish images downloaded from the internet in [37]. They used an ‘sgd’ optimizer for training the models and a true classification rate (recall) of 99.2% was achieved by SPP-Net [37]. Later, a pre-trained VGG-16 model was utilized on four major carp images and was able to achieve an accuracy of 100% with fivefold validation [38]. A modified AlexNet was studied using the QUT and LifeClef datasets [39]. They adopted a dropout layer before the SoftMax layer that helped in achieving a higher accuracy than the traditional AlexNet model and reported 90.48% accuracy on the validation dataset. To eliminate the problem of class imbalances, the focal loss function together with the SeResNet-152 model was used for training and achieved an accuracy of 98.8% [40]. It has been observed that, with the class balanced focal loss function, the performance increased for the species with fewer images too.

Dey et al. 2021 used a FishNet app based on a 5-layer CNN for the recognition of eight varieties of indigenous freshwater fish species from Bangladesh [41]. An ‘adam’ optimizer was chosen for simulation and three different drop-out rates were applied to the last three fully-connected layers of the CNN. This method attained a validation accuracy of 99% with a 3.25% validation loss [41]. Abinaya et al. 2021 used the Fish-Pak dataset and segmented fish heads, scales, and bodies from those images [42]. They used a Naïve Bayesian fusion layer in each training segment to increase the classification accuracy to 98.64% with the Fish-Pak dataset, while with the BYU dataset the accuracy achieved was 98.94% [42]. Jose et al. [43] used deep octonion networks as feature extractors and a soft-max classifier for tuna fish classification. This method reported an accuracy of 98.01% on the self-collected tuna fish dataset and 98.17% accuracy on the Fish-Pak dataset.

It is very clear that CNN-based classification models provide better classification accuracy compared to traditional machine learning models even with the typical constraints of translation, rotation, and overlapping [44]. Also, fish classification using deep learning requires a lot of training data, which poses a challenge for researchers working alone [23]. This detailed background search on fish classification reveals that while marine fish classification is extensively studied by most researchers, only a few have investigated freshwater fish [17, 24–26, 35, 41, 42]. It has been observed that a majority of the studies on automated freshwater fish classification are restricted only to the carp families. These studies are missing out on a large variety of small indigenous freshwater fish that play an important role in the daily diet and livelihood of the common people of India. No research has been done to date that aims to classify mixed-cultured species. Moreover, we could not locate any work that specifically provides the identification/classification of any of the Indian freshwater fish species through the use of computer vision or image processing. Since SIFS are tiny compared to carp species, it is difficult to identify them just by looking at the captured images. From Fig. 1 it is seen that, visually, Systomus Sarana (q) looks similar to Catla or Rohu. So, we collected images of twenty (20) varieties of indigenous fish species including SIFS and carps available in our locality for designing the automatic fish classification model. The proposed classification model shows excellent performance even though small indigenous fish species are tiny compared to six of the carp species.

Materials and Methods

Overview of the Proposed System

This work proposes the classification of freshwater fish using a deep Convolutional Neural Network. Figure 2 presents the layout of the proposed classification model.

Data Acquisition and Data preparation

Twenty different varieties of freshwater fish species are considered for the fish classification purpose. For this purpose, fifty-seven (57) fish samples of each of these 20 varieties are captured from different local ponds in Assam, India using fishing nets. The images from these species are acquired in natural lighting conditions using a Samsung Galaxy M30s mobile phone that has a triple camera. We use the wide view mode of the main rear camera with a specification of 48 megapixels, an f/2.0 aperture lens, and a focal length of 26 mm. The sensor size is 1/2.0" and the pixel size is 0.8 µm. The third author of the paper, who is actively involved with the identification and study of fish

fauna across Assam, has supported the labeling of the collected images. Table 1 shows the fish image distribution per species used in the experiment. We use the Fish-Pak [24] dataset for simulating the system in its initial phases. Images of the major carps are downloaded from the Fish-Pak dataset as these are popular carps native to South Asian countries. A total of 587 images of these species are accumulated for the desired fish classification task and labeled with the help of the third author.

Table 1  Data collection per species

Species                   Fish-Pak   Ours   Total
Labeo gonius              –          20     20
Labeo bata                –          28     28
Channa punctata           –          13     13
Amblypharyngodon_mola     –          15     15
Mystus Cavasius           –          16     16
Guduchia Chapra           –          18     18
N. Notopterus             –          14     14
R. daniconus              –          18     18
Trichogaster fasciata     –          20     20
Systomus Sarana           –          16     16
Puntius Ticto             –          9      9
Heteropneustes_fossils    –          20     20
Anabas_testudineus        –          23     23
Puntius Sophore           –          17     17
Cirrinhus Mrigal          70         22     92
Labeo rohita(Rou)         73         16     89
Cyprinus Carpio           50         10     60
Catla                     20         12     32
CIdella(grass-carp)       11         4      15
Silver carp               47         5      52
Total                     271        316    587

As these images are not enough for training the classifier, the size of the training set is increased, without acquiring new images, with the help of various image augmentation techniques. The process of image augmentation generates duplicate images from the original images without losing the key features of the original images (Fig. 2).

Fig. 2  Proposed methodology for Fish Classification

Image Pre‑Processing

Since the images we have collected are of different sizes and in different formats (jpeg and png), we first convert all images into ‘png’ format in the ‘Paint’ app. Again, as the number of images for some species is small, image augmentation techniques are applied to increase the size of the dataset. As the dataset is very small, we have considered an offline image augmentation technique, which involves storing the images on the computer disk after each augmentation operation.

Let $I(x, y)$ be the image. The following augmentation techniques are applied to each of these original images.

Cropping

Cropping is the basic image reframing operation where a desired portion of the image is retrieved. Let $(x_1, y_1)$ be the origin of the image and $(x_e, y_e)$ be the end point of the image. The new image co-ordinate $(x_p, y_p)$ after the cropping operation will be

$$I(x_p, y_p) = I(x_e - x_1,\; y_e - y_1) \tag{1}$$
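One way to read the cropping step is that it keeps the window running from the origin $(x_1, y_1)$ to the end point $(x_e, y_e)$, so the cropped image is $x_e - x_1$ pixels wide and $y_e - y_1$ pixels high. The sketch below illustrates that reading in plain Python on a toy image stored as a list of rows. It is not the authors' implementation (the paper works in MATLAB); the function name `crop_region` and the convention that the end point is exclusive are assumptions made for this example.

```python
def crop_region(image, x1, y1, xe, ye):
    """Return the sub-image from (x1, y1) inclusive to (xe, ye) exclusive.

    `image` is a list of rows; x indexes columns and y indexes rows.
    The name and the exclusive end point are illustrative assumptions.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if not (0 <= x1 < xe <= width and 0 <= y1 < ye <= height):
        raise ValueError("crop window must lie inside the image")
    # Keep rows y1..ye-1 and, within each kept row, columns x1..xe-1.
    return [row[x1:xe] for row in image[y1:ye]]


# Toy 4x5 "image" whose pixel value encodes its (row, column) position.
image = [[10 * r + c for c in range(5)] for r in range(4)]
cropped = crop_region(image, x1=1, y1=1, xe=4, ye=3)

# The crop is (xe - x1) pixels wide and (ye - y1) pixels high.
print(len(cropped[0]), len(cropped))
```

With a real photograph the same slicing applies per colour channel; the window would typically be chosen so that the fish body stays fully inside the crop.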

Flipping

All the original images in the dataset were flipped upside down. Then the images were also flipped horizontally, as horizontal flipping is more common than vertical flipping. Flipping is one of the easiest augmentations to implement and has proven useful on datasets such as CIFAR-10 and ImageNet. In an upside-down image, the pixel at $(x, y)$ comes from I(x, height − y − 1), whereas in a horizontally flipped image it comes from I(width − x − 1, y).

Rotation

The image rotation routine involves an input image, a rotation angle θ, and a point about which rotation is done. If a point $(x_1, y_1)$ in the original image is rotated around $(x_0, y_0)$ by an angle θ, then the new image co-ordinate $(x_2, y_2)$ can be presented by the following equations.

$$x_2 = (x_1 - x_0)\cos\theta + (y_1 - y_0)\sin\theta + x_0 \tag{2}$$

$$y_2 = -(x_1 - x_0)\sin\theta + (y_1 - y_0)\cos\theta + y_0 \tag{3}$$

If $x_0 = y_0 = 0$ and $\theta = 90^\circ$ clockwise, then $(x_2, y_2)$ becomes

$$x_2 = y_1, \quad y_2 = -x_1 \tag{4}$$

All the original images and upside-down images are rotated 90° clockwise, 90° anticlockwise, and 180° clockwise. The image count after applying the augmentation techniques goes up to 2781.

Convolutional Neural Network

A convolutional neural network (CNN) is a deep structured algorithm commonly applied to visualize and extract the hidden texture features of image datasets. Since the last decade, CNN features and deep networks have become the primary choice to handle any computer vision task. Different CNN architecture models utilize different layers, i.e., convolution layers, batch-normalization layers, max-pooling layers, fully connected layers, and a soft-max layer. The major advantages of the deep CNN architecture over the manual supervised method are its self-learning and self-organizing characteristics [38].

The first layer in the CNN architecture is the image input layer that takes 2-D and 3-D images as input. The size of the image must be initialized in this layer [45]. Let I be an input image of size m × n, where m and n represent the rows and columns respectively. Equation (5) represents the image input layer by a function:

$$p = I_{np}(I, D) \tag{5}$$

where p is a deep learning architecture model which can be a series or parallel network. The layer $I_{np}$ performs some pre-processing, if required, before passing the input to the next layer, called the convolution layer. The convolution layer uses filters that perform convolution operations and scan the input I with respect to its dimensions. The input image becomes less noticeable at the output of a convolution layer. However, particulars such as the edges, orientation, and patterns become more prominent, and these are the key features from which a machine actually learns. The output obtained from the convolution operation on the input I is called the feature map, $f_m$:

$$f_m = \mathrm{Conv}(I, k, d, r_l) \tag{6}$$

where k is the filter window, d is the stride, and $r_l$ is the ReLU activation function. Next, the batch normalization layer evaluates the mean and variance for each of these feature maps. The batch normalization process makes it possible to apply much higher learning rates without caring about initialization.

The pooling layer performs the down-sampling operation on the feature maps obtained from the preceding layer and produces a new set of feature maps with compressed resolution. The main objective of a pooling layer is to gather only the crucial information by discarding unnecessary details. The two main pooling operations are average pooling and max pooling. In average pooling, down-sampling is performed by partitioning the input into small rectangular pooling regions and then computing the average value of each region.

In the max-pooling operation, the spatial size of the previous feature map is reduced. It helps to bring down the noise by eliminating noisy activations and is hence considered superior to average pooling. The pooling operator carries forward the maximum value within a group of T activations. The n-th max-pooled band then consists of K related filters

$$p_{k,m} = \max\left(h_{j,(m-1)U+t}\right) \tag{7}$$

where $U \in \{1, 2, 3, \ldots, T\}$ is the pooling shift that permits overlap between pooling regions if $U < T$. This results in shrinking the dimensionality of the K convolutional bands to N pooled bands, where $N = \frac{K - T}{U} + 1$, and the final layer becomes $p = [p_1, p_2, \ldots, p_N] \in \mathbb{R}^{N \cdot J}$.

At the end of every CNN, there lie one or more fully-connected (FC) layers. The output of the pooling layer is flattened and then fed into the fully-connected layer. The neurons in the FC layer perform a linear transformation on their input, after which a nonlinear transformation takes place as follows

$$Y = f\left(\sum_{i=1}^{n_H} W_{jk} X + B\right) \tag{8}$$

where f is the activation function, X is the input, W is the weight matrix, j is the number of neurons in the previous layer, k is the neuron in the current layer, and B is the bias weight. It is important to have the same number of output nodes in the final FC layer as the number of classes.

Finally, the soft-max function computes the probability of each of the classes output by the final FC layer. A soft-max function is represented by

$$\mathrm{softmax}(y_i) = \frac{\exp(y_i)}{\sum_{j} \exp(y_j)} \tag{9}$$

From the background search on fish classification and recognition, it is observed that CNNs have shown outstanding results in fish classification compared to the traditional method of manual feature-based image classification. Some of the species in our dataset had fewer images, which resulted in class imbalance. So, to overcome this problem, we have employed the transfer learning [46] method, where a pretrained CNN model (ResNet-50 and AlexNet) is chosen to solve the task.

AlexNet

AlexNet is considered one of the most popular CNNs used for pattern recognition and classification/identification applications. It incorporates 60 million parameters and 650,000 neurons [47]. The architecture consists of 5 convolution layers, max-pooling layers, and 3 consecutive fully connected layers. The input layer takes images of size 227 × 227 × 3 and then filters the images by using a total of 96 kernels of size 11 × 11 with a stride of four. The output from this layer is passed through the second convolutional layer of 256 feature maps, where the size of each kernel is 5 × 5 with a stride of one. The output from this layer passes through a pooling and normalization layer, resulting in a 13 × 13 × 256 output. The third and fourth convolution layers use a kernel of 3 × 3 with a stride of one. The output of the last convolutional layer is then flattened into a fully connected layer of 9216 features, which is in turn connected to a fully connected layer with 4096 units. The last layer is the output, which uses the SoftMax layer with six units (classes) for the Fish-Pak dataset and 20 units (classes) for our dataset.

ResNet‑50

The ResNet-50 architecture comprises 5 stages, each with a convolution and identity block, one average pool layer, and one fully connected layer with 1000 neurons [48]. Each convolution block has 3 convolution layers and each identity block also has 3 convolution layers. The input to ResNet-50 is of size 224 × 224 × 3. Each convolutional layer is followed by a batch normalization layer and a nonlinear ReLU function. Batch normalization layers standardize the activations of a given input volume before passing it into the succeeding layer. The layer calculates the mean and standard deviation for each convolutional filter response across each mini-batch at each iteration to normalize the present layer activation. ResNet-50 has over 23 million trainable parameters. The original residual unit is modified by the bottleneck design as shown in Fig. 3. For each residual function F, a stack of 3 layers is used instead of the original 2 layers.

Fig. 3  Basic residual building unit of ResNet-50

Experimental Settings and Evaluation Criteria

The experimental simulation is performed in the MATLAB 2018a environment on a system with an Intel Core i5 7300HQ processor of 2.50 GHz (turbo up to 3.50 GHz) and an NVIDIA GeForce GTX 1050 Ti with a compute capability of 6.1.

We use fine-tuned ResNet-50 and AlexNet from MATLAB’s Deep-learning Toolbox. Due to variations in the image sizes, we use the ‘augmentedImageDatastore’ function from MATLAB to resize all the manually augmented images.

Our simulation-based experiments are performed first on the Fish-Pak dataset. A total of 1,191 augmented images from this dataset are segregated randomly into 70% training (834) and 30% validation (357) images. As our dataset is small, we use a fine-tuning approach in which the top 3 layers of the ResNet-50 and AlexNet models are removed and the models are rebuilt by adding three new layers with six output neurons for the output layer. The weight and

13
J. Inst. Eng. India Ser. B

bias learning rates of the fully connected layer are varied for best performance of the classifier while the bottom layer weights are frozen. Freezing the bottom layer weights helps to increase network speed during training. For training the classifier, an "sgdm" optimizer with a "momentum" of 0.9 is used. Training data is divided into "mini-batches" of size 20 and we choose a maximum epoch = 30. The learning capability of both CNNs, i.e., Alex-Net and Resnet-50, is evaluated in terms of classification accuracy, loss function (training and validation loss) and confusion matrix (CM). The CM is an ideal performance measure to assess the accuracy of a multi-class balanced classification problem like fish classification. It is a two-dimensional matrix with target and predicted class at the rows and columns, respectively. It provides insight into class-specific as well as overall classification accuracies. The overall classification accuracy is evaluated based on the number of images that have been classified correctly.

As both CNN models show satisfactory performance with the Fish-Pak dataset, we next apply them to our own dataset consisting of a total of 2781 augmented images. Training is carried out on 1946 images (70% of 2781) and 834 images are chosen for the validation of the proposed CNN classifier. A mini-batch size of 32 is chosen for both training and testing purposes. We use the stochastic gradient descent with momentum optimizer ('sgdm') with a momentum of 0.9 to train on our dataset for the two architectures under consideration.

The learning rate (Lr) in every CNN architecture has a prominent impact on the overall accuracy of the classifier. A low learning rate slows down the execution of the entire model and thus increases the training time. On the other hand, with too large a learning rate, there is a possibility that the model may get stuck at suboptimal results. So, we have experimented with different values of the learning rate, i.e., 0.01, 0.001, and 0.0001. Tables 2 and 3 present the performance analysis of the two network models at different learning rates with variations in the 'WeightLearnRateFactor' and 'BiasLearnRateFactor' of the fully connected layers.

The performance of the classifier is determined by four evaluation parameters which are important for multi-class classification problems: overall classification accuracy, mean precision, recall, and mean F-score. Recall is an important parameter in a multi-class classification task as it refers to the number of positive instances that are labeled correctly. Precision gives information about the actual positive cases out of the total positive predictions. All these parameters are calculated as per Eqs. (10)–(12).
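
The training options described above (a global learning rate Lr, a per-layer learn-rate factor, and the 'sgdm' solver with momentum 0.9) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code. It assumes the MATLAB-style convention in which a layer's effective learning rate is the global learning rate multiplied by its learn-rate factor, and it uses a hypothetical one-parameter loss (w − 3)² purely for illustration. The function names are our own.

```python
def effective_learning_rate(global_lr, learn_rate_factor):
    """Per-layer rate = global rate x factor (assumed MATLAB-style convention)."""
    return global_lr * learn_rate_factor


def sgdm_step(weights, gradients, velocity, learning_rate, momentum=0.9):
    """One SGD-with-momentum update over flat lists of floats."""
    # The velocity keeps a decaying memory of previous steps.
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, gradients)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def minimise_toy_loss(global_lr=0.001, factor=10, steps=2000):
    """Minimise the hypothetical loss (w - 3)^2, whose gradient is 2 * (w - 3)."""
    lr = effective_learning_rate(global_lr, factor)
    weights, velocity = [0.0], [0.0]
    for _ in range(steps):
        gradients = [2.0 * (weights[0] - 3.0)]
        weights, velocity = sgdm_step(weights, gradients, velocity, lr)
    return weights[0]


if __name__ == "__main__":
    print(effective_learning_rate(0.001, 10))  # rate applied to the new layers
    print(minimise_toy_loss())                 # approaches the minimiser w = 3
```

Under this convention, a factor of 10, 20 or 30 lets the newly added fully connected layers learn faster than the global rate, while frozen layers correspond to a factor of zero.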

Table 2 Performance analysis of Resnet-50 and Alex-Net classifiers on the Fish-Pak dataset with max epoch = 30, validation frequency = 3, momentum = 0.9

Parameters | Resnet-50, Lr = .01 | Resnet-50, Lr = .001 | Resnet-50, Lr = .0001 | AlexNet, Lr = .001 | AlexNet, Lr = .0001 | AlexNet, Lr = 0.00001
'WeightLearnRateFactor' = 10, 'BiasLearnRateFactor' = 10
Accuracy | 100 | 100 | 98.04 | 98.60 | 99.16 | 94.69
Precision | 100 | 100 | 97.78 | 98.43 | 99.24 | 94.21
Recall | 100 | 100 | 98.21 | 98.43 | 99.22 | 95.61
F-score | 100 | 100 | 98.00 | 98.43 | 99.23 | 94.91
Training loss | .0003 | .0001 | .0147 | .0014 | .0086 | .4798
Validation loss | .0238 | .0382 | .1265 | .0315 | .0232 | .1766
'WeightLearnRateFactor' = 20, 'BiasLearnRateFactor' = 20
Accuracy | 99.44 | 99.16 | 98.88 | 97.76 | 98.04 | 91.34
Precision | 99.24 | 98.77 | 98.74 | 98.07 | 97.34 | 89.93
Recall | 99.45 | 99.43 | 99.72 | 97.85 | 98.09 | 93.92
F-score | 99.35 | 99.09 | 98.99 | 97.96 | 97.71 | 91.88
Training loss | .0145 | .0013 | .0162 | .0043 | .0020 | .2763
Validation loss | .1009 | .0526 | .0790 | .0771 | .0460 | .2322
'WeightLearnRateFactor' = 30, 'BiasLearnRateFactor' = 30
Accuracy | 97.21 | 99.72 | 98.88 | 98.04 | 98.32 | 96.09
Precision | 96.93 | 99.80 | 98.96 | 98.44 | 98.79 | 95.99
Recall | 96.87 | 99.76 | 99.05 | 98.73 | 98.17 | 96.11
F-score | 96.90 | 99.72 | 99.00 | 98.14 | 98.48 | 96.09
Training loss | .0620 | .0002 | .0128 | .0546 | .0057 | .0725
Validation loss | .1288 | .0165 | .1177 | .0985 | .0823 | .1435
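
The accuracy values in Table 2 can be related to raw misclassification counts. The short Python sketch below assumes the 357 Fish-Pak validation images stated in Table 4 and defines accuracy as the fraction of correctly classified images. The function name is our own and the code is illustrative rather than the authors' implementation.

```python
def accuracy_percent(total_images, misclassified):
    """Overall accuracy (%) = correctly classified images / all images."""
    if total_images <= 0:
        raise ValueError("total_images must be positive")
    return 100.0 * (total_images - misclassified) / total_images


if __name__ == "__main__":
    fish_pak_validation_images = 357  # validation set size taken from Table 4
    for wrong in (0, 1, 2, 3, 6, 7):
        acc = accuracy_percent(fish_pak_validation_images, wrong)
        print(f"{wrong} misclassified -> {acc:.2f}%")
```

With 2, 1 and 3 misclassified images this gives 99.44%, 99.72% and 99.16%, which agrees with the accuracies listed in Table 2 for the corresponding settings.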


Table 3 Performance analysis of Resnet-50 and Alex-Net classifiers on own dataset with max epoch = 30, validation frequency = 3, momentum = 0.9

Parameters | Resnet-50, Lr = .01 | Resnet-50, Lr = .001 | Resnet-50, Lr = .0001 | AlexNet, Lr = .001 | AlexNet, Lr = .0001 | AlexNet, Lr = 0.00001
'WeightLearnRateFactor' = 10, 'BiasLearnRateFactor' = 10
Accuracy | 99.52 | 99.76 | 98.68 | 97.60 | 97.00 | 92.08
Precision | 99.52 | 99.71 | 98.52 | 97.17 | 96.87 | 90.13
Recall | 99.43 | 99.91 | 98.87 | 97.48 | 96.75 | 92.11
F-score | 99.47 | 99.81 | 98.70 | 97.32 | 96.81 | 91.11
Training loss | .0003 | .0020 | .0328 | 0.0597 | .0145 | .3374
Validation loss | .0352 | .0184 | .1010 | 0.0949 | .0796 | .2774
'WeightLearnRateFactor' = 20, 'BiasLearnRateFactor' = 20
Accuracy | 99.52 | 100 | 98.80 | 96.76 | 97.12 | 93.28
Precision | 99.40 | 100 | 98.32 | 96.13 | 97.04 | 92.00
Recall | 99.31 | 100 | 98.83 | 96.12 | 97.12 | 93.64
F-score | 99.36 | 100 | 98.57 | 96.12 | 97.08 | 92.81
Training loss | .0005 | .0004 | .0235 | .0173 | .0457 | .2150
Validation loss | .0429 | .0206 | .0728 | .0903 | .0840 | .2105
'WeightLearnRateFactor' = 30, 'BiasLearnRateFactor' = 30
Accuracy | 99.16 | 99.76 | 98.44 | 95.92 | 97.84 | 93.88
Precision | 98.97 | 99.62 | 98.18 | 95.68 | 97.62 | 92.50
Recall | 99.20 | 99.78 | 98.35 | 96.47 | 98.38 | 94.38
F-score | 99.16 | 99.70 | 98.26 | 96.07 | 98.00 | 93.43
Training loss | .0028 | .0002 | .0103 | .0201 | .0014 | .2114
Validation loss | .0720 | .0269 | .0707 | .1495 | .0556 | .1938
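
The precision, recall and F-score values reported in Tables 2 and 3 follow Eqs. (10)–(12) and can be derived from a confusion matrix. The Python sketch below is a minimal illustration under stated assumptions: rows index the predicted class and columns the true class, as in the CM description in the Results section. The mean values are taken as unweighted (macro) averages over classes, which is our assumption about how the means were formed. The 3-class matrix is hypothetical data, not taken from the paper.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of images predicted as class i whose true class is j."""
    n = len(cm)
    metrics = []
    for k in range(n):
        true_positive = cm[k][k]
        predicted_k = sum(cm[k])                    # TP + FP (row sum)
        actual_k = sum(cm[i][k] for i in range(n))  # TP + FN (column sum)
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f_score))
    return metrics


def overall_accuracy(cm):
    """Fraction of all images that lie on the diagonal of the matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def macro_average(metrics):
    """Unweighted mean of (precision, recall, F-score) over classes."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics) / n for i in range(3))


if __name__ == "__main__":
    # Hypothetical 3-class confusion matrix (rows: predicted, columns: true).
    example_cm = [[5, 1, 0],
                  [0, 4, 0],
                  [0, 0, 6]]
    per_class = per_class_metrics(example_cm)
    print(per_class)
    print(macro_average(per_class))
    print(overall_accuracy(example_cm))
```

In this example one image of true class 1 is predicted as class 0, so the precision of class 0 and the recall of class 1 both drop below 1 while class 2 is unaffected.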

$$\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \tag{10}$$

$$\text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{11}$$

$$\text{F-score} = 2 \times \frac{\text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}} \tag{12}$$

Results and discussions

The CMs shown in Figs. 4, 5 and 6 show how the fish species are classified into their corresponding classes by Resnet-50 on the Fish-Pak dataset. The rows in the CM correspond to the Output Class (predicted class) and the columns correspond to the true class (target class). The diagonal values represent the number of correctly classified species and the percentage value represented in the diagonal cells is the recall value, i.e., out of all predicted classes how many are actually correctly classified (positive). All confusion matrix plots are obtained by using the MATLAB function "plotconfusion". Figure 4 represents the CM of the Resnet-50 architecture that gives the best classification accuracy of 100% at a learning rate of 0.01 and weight = bias = 10 on the Fish-Pak dataset. Here, all the fish test images are predicted successfully into their corresponding classes.

Fig. 4 CM of Resnet-50 on Fish-Pak dataset at Lr = .01, Weight = 10 = Bias

Figure 5 is the best CM achieved with bias and weight factor 20 at a learning rate of 0.01 for ResNet-50; in this case the model fails to classify two (2) of the Catla fishes and mistakes them for Grass-carp (G.carp). With Lr = 0.001 and Weight = Bias = 30, the maximum accuracy achieved with Resnet-50 is 99.72%, shown in Fig. 6, and in this case only one mrigal fish is misclassified as "Rou" while all others are 100% correctly classified. As maximum validation accuracy is achieved at
learning rate, Lr = 0.01 and weight = bias rate = 10, the variation of accuracy and loss for this case is shown in Fig. 10. With Alex-Net, the maximum classification accuracy of 99.16% is obtained at Lr = 0.0001 and Weight = Bias factor = 10. The CM corresponding to this is shown in Fig. 7. From the CM, we see that all the predicted fish of 'C. carp', 'G. carp', 'Silver-carp' and 'mrigaal' are 100% correctly classified. On the other hand, one (1) 'Rou' is misclassified as 'C. carp' and another one (1) is mistaken as 'Catla', thus bringing down the recall rate of 'Rou' to 97.6%. Here, one of the Catla is also mis-classified as 'Rou', so the recall value of 'Catla' is reduced to 97.8%. The CM in Fig. 8 shows that AlexNet correctly predicted 100% of the 'C. carp', 'G. carp' and 'Silver-Carp' images, but it fails in predicting 7 other fish images. Two of the 'Catla' fish are mis-classified as 'G.carp' while one is predicted wrong as Rou. Additionally, two of the Rou fish are predicted wrong as 'G. carp', one as 'Catla' and only one 'mrigaal' is wrongly predicted as 'Rou'. With Lr = 0.0001 and Weight = Bias = 30, a total of 6 fish images are classified inaccurately and the overall accuracy achieved is 98.32%. From Fig. 9 we see that one (1) 'Catla' and one (1) 'C.carp' fish are wrongly classified as 'Rou' (Fig. 10). Whereas, one (1) Rou fish is misclassified as 'mrigaal' and one as 'Catla'. Figure 11 shows the best performance of AlexNet from training at learning rate, Lr = 0.0001 and weight = bias learning rate = 10.

Fig. 5 CM of Resnet-50 on Fish Pak dataset at Lr = .01, Weight = 20 = Bias

Fig. 6 CM of Resnet-50 on Fish Pak dataset at Lr = .001, Weight = 30 = Bias

Fig. 7 CM of Alex-Net on Fish Pak dataset at Lr = .0001 and Weight = 10 = Bias

Fig. 8 CM of Alex-Net on Fish Pak dataset at Lr = .0001 and Weight = 20 = Bias

Figures 12, 13 and 14 are the best confusion matrices derived from ResNet-50 at different learning rates, Lr, and "Weight and Bias Learning Rate" on our own dataset. In Fig. 12, all the fish species are correctly classified to their
corresponding classes. Though the accuracies in Figs. 13 and 14 are the same, their mis-classification cases are different. For the same accuracy, in Fig. 13, two of the Rou fish are predicted wrongly as G. Carp, whereas in Fig. 14, one of the 'Catla' fish is misclassified as 'S. Sarana' and one 'Rou' as 'Catla'. With AlexNet on our dataset, the mis-classification rate is quite high, which is clearly visible from the CMs shown in Figs. 15, 16 and 17. These are the best CMs of Alex-Net when we vary the Weight and Bias Learn Rate factor from 10 to 30 on three different learning rates. In Fig. 15, the total mis-classification is 20, in Fig. 16 it is 24 and in Fig. 17 it is 18. The training progress of the best performance by Resnet-50 and Alex-Net on our dataset is shown in Figs. 18 and 19, respectively.

It is clearly visible from the confusion matrices (CMs) that ResNet-50 is the best at correctly classifying all the fish species into their corresponding classes with few mis-classifications.

Fig. 9 CM of Alex-Net on Fish Pak dataset at Lr = .0001 and Weight = 30 = Bias

The performance parameters of both CNN models, i.e., Resnet-50 and Alex-Net, are shown in Tables 2 and 3 on the Fish-Pak dataset and our own dataset, respectively. For the Fish-Pak dataset, Resnet-50 offers 100% accuracy, precision and recall for both Lr = 0.01 and 0.001 at a weight and bias factor = 10. However, the validation loss is more with Lr = 0.001 than Lr = 0.01. Therefore, we can say the best performance is achieved at Lr = 0.01. On the other hand, the maximum achievable accuracy with AlexNet on the Fish_Pak dataset is 99.16% at a learning rate of 0.0001 with
Weight and Bias learning factor = 10. During the experiment, we observed that the performance measures at Lr = 0.01 for Alex-Net are below 40% for all weight and bias factors under the same training options that we feed to Resnet-50. Therefore, we totally discard the results associated with this learning rate. From Table 3, with our own dataset, the best classification (100%) performance measures by ResNet-50 are obtained at Lr = 0.001 when we choose the Weight and Bias factor as 20. It also experiences the least validation and training loss with these training options. Alex-Net's best performance is only 97.84% validation accuracy at Lr = 0.0001 with weight and bias learning factor = 30. Table 4 gives the total misclassification numbers by both the models on the two datasets. So, from the above empirical analysis, we observe that the best classification parameters among the two classifiers are offered by ResNet-50 for both datasets at a learning rate of 0.001.

Fig. 10 Accuracy and Loss curve of ResNet-50 at Lr = 0.01 and weight = 10 for Fish-Pak dataset

Fig. 11 Accuracy and Loss curve of Alex-Net at Lr = 0.0001 and Weight = 10 for Fish-Pak dataset

Fig. 12 CM of Resnet-50 at Lr = 0.001 and Weight = 20 = Bias factor on own dataset

Fig. 13 CM of Resnet-50 at Lr = 0.001 and Weight = 10 = Bias factor on own dataset

Fig. 14 CM of Resnet-50 at Lr = 0.001 and Weight = 30 = Bias factor on own dataset

Fig. 15 CM of Alex-Net at Lr = .001 and Weight = 10 = Bias factor on own dataset

Fig. 16 CM of Alex-Net at Lr = .0001 and Weight = 20 = Bias factor on own dataset

Fig. 17 CM of Alex-Net at Lr = .0001 and Weight = 30 = Bias factor on own dataset

Fig. 18 Accuracy and loss curve (performance) of Resnet-50 at Lr = .001 and weight = 20 = bias factor on our dataset

Fig. 19 Accuracy and loss curve (performance) of Alex-Net at Lr = .0001 and weight = 30 = bias factor on our dataset

Table 4 Mis-classification analysis with Resnet-50 and Alex-Net

Parameters | Weight and bias learning rate | Mis-classification on Fish_Pak dataset (validation image = 357) | Mis-classification on own dataset (validation image = 834)
Resnet-50 | 10 | 0 | 2
Resnet-50 | 20 | 2 | 0
Resnet-50 | 30 | 1 | 2
Alexnet | 10 | 3 | 20
Alexnet | 20 | 7 | 24
Alexnet | 30 | 6 | 18

For both the datasets, it has been observed that the training loss and validation loss in the case of ResNet-50 first decrease when we move from a learning rate of Lr = 0.01 to Lr = 0.001, and increase again for the learning rate Lr = 0.0001. In the case of Alex-Net, a decrease in learning rate increases the training and validation loss. Also, an increase in Weight and Bias learning factors increases the training and validation loss of the classifiers. From Tables 2 and 3, we can draw an inference that with a gradual increase in Weight and Bias learning factor, the loss function increases for both the datasets.

The performance of the proposed classification method is compared with the most recent works that dealt with freshwater fish species for classification and a detailed analysis is shown in Table 5. Those works use similar kinds of fish species that are common in South Asian countries.

We have considered multiple parameters for showing a detailed comparison of the proposed work with previous works. The highest classification accuracy on the Fish-Pak dataset to date is 98.8% [40] followed by 98.64% [42], whereas with our fine-tuned Resnet-50 network, we are able to reach 100% classification accuracy, precision, and recall on the Fish-Pak dataset. We also achieve a considerable improvement in terms of validation (2.38%) and training
loss (0.03%). Our analysis shows that this is the best result on the Fish_Pak dataset to date.

Table 5 Performance comparison of the proposed model with state-of-the-art methods

Papers | Algorithm | Accuracy (%) | Precision (%) | Recall (%) | Validation Loss (%) | Training Loss (%) | Dataset | No. of species
Islam et al. [35] | HLBP feature with SVM | 94.78 | Not mentioned | Not mentioned | Not mentioned | Not mentioned | BDIndigenousFish2019 | 8
Rauf et al. [25] | 32-layer VGGNet, (Resnet-50) | 98.5%, (93) | 94.83, (89.88) | 95.67, (90.17) | Not mentioned | Not mentioned | Fish_Pak | 6
Sharmin et al. [26] | Handcrafted features + SVM-classifier | 94.2% | 93% | 94.59 | NA | NA | self-collected SIFS | 6
Wang et al. [37] | SPP-densenet | 97% | 97.62% | 99.2% | 0.1 (from graph) | Not mentioned | Google Images | 6
Banan et al. [38] | VGG-16 | 100% | Not mentioned | Not mentioned | 0.0014 | .0154 | Own images of major carps | 4
Iqbal et al. [39] | AlexNet | 90.48% | Not mentioned | Not mentioned | Not mentioned | Not mentioned | LifeClef'15 | 6
Xu et al. [40] | Combined SE-Resnet-152 | 98.8% | Only per species is mentioned | Only per species is mentioned | NA | NA | Fish_Pak | 6
Dey et al. [41] | 5-layer CNN | 99 | 99.01 | 99.01 | .0325 | .0167 | BDIndigenousFish2019 | 8
Abinaya et al. [42] | AlexNet with fuse Naive Bayesian layer | 98.64 | 99.80 | 98.99 | Not mentioned | Not mentioned | Fish_Pak | 6
Jose et al. [43] | Octonion Network + softmax classifier | 98.17% (98.01%) | 98.11% (98.05%) | 98.13% (98.07%) | NA | NA | Fish_Pak (Self-collected Tuna fish data) | 6 (3)
Proposed Work (Ours) | Fine-tuned Resnet-50 | 100 | 100 | 100 | .0238 | 0.0003 | Fish_Pak | 6
Proposed Work (Ours) | Fine-tuned Resnet-50 | 100 | 100 | 100 | .0206 | .0004 | Own dataset | 20

Although a 100% classification accuracy is achieved by using VGG-16 [38], their target species is only 4 major carps. Another important thing we want to highlight in this paper is the learning curve where the validation loss is lesser than the training loss, which is considered an underfit problem. This is a case of an unrepresentative validation dataset [49–51], which means the validation dataset is easier for the model to predict than the training data. Another possibility could be that the validation dataset considered in the simulation experiment is scarce but widely represented by the training dataset; therefore the model behaves pretty well with those few validation data. Generally, good training means the validation loss should be slightly greater than the training loss. Also, it is advised to keep training the classifier if validation loss < training loss. With our own dataset of 20 varieties of indigenous fish images, we achieve a 100% validation accuracy, precision and recall. In our simulation results, we accomplish even lower training and validation loss than on the Fish-Pak dataset. So, we can claim that our work is the first to report successful automatic classification of 20 varieties of fresh-water fish from India with better classification scores and lower loss than any of the state-of-the-art methods.

Conclusions

The research article investigates different learning rates, weight and bias factors for achieving maximum classifier accuracy by ResNet-50 and Alex-Net on the fish classification problem. From the performance analysis of these two deep-learning models,
we have observed that a 100% classification accuracy is achievable by the Resnet-50 architecture at a learning rate of 0.001 on both the datasets under investigation. The proposed ResNet-50 model not only improves the target classification accuracy but reduces network under-fitting too. It also reveals that the mis-classification rate by ResNet-50 is minimum compared to Alex-Net. The paper reports the best classification accuracy on the Fish-Pak dataset till date. The paper also reports a maximum classification accuracy of 100% for the first time for a 20-class classification model. The proposed method will help in accurate classification of carps and small indigenous fish species reared in one environment. With the help of this technique, fish farmers would be able to manage early segregation of the fish species according to their food preference and growth rate for better yield. Also, accurate classification of fish species can create awareness among fish loving locals of North-East India in limiting over exploitation of some of the threatened fish species. In future, we would like to incorporate a live monitoring-based classification system to develop a robust and realistic classification model to help local and poor fish farmers in the proper management of their fisheries, as their livelihood depends on this.

Acknowledgements The authors are grateful to Mr. Balaram Mahalder (Technical Specialist, WorldFishCenter), Mr. Hamid Badar Osmany (Assistant Biologist, Marine Fisheries Department, Karachi, Pakistan), Mostafa A. R. Hossain (Professor, Aquatic Biodiversity & Climate Change, Department of Fish. Biology & Genetics, Bangladesh Agricultural University), Dr. Shamim Rahman (Asstt. Professor in Zoology, Devicharan Barua Girls' College, Jorhat, India), Dr. Dibakar Bhakta (Scientist (SS), Riverine and Estuaries Fisheries Division, ICAR-Central Inland Fisheries Research Institute, Barrackpore, India), Dr. Mosaddequr Rahman (Kadai · Graduate School of Agriculture Forestry and Fisheries, Kagoshima University) and Bhenila Bailung (Research Scholar, Dibrugarh University) for allowing us to use their fish images from FishBase for initial simulation of the deep-learning networks. Due to the COVID-19 led pandemic and lockdown, our own fish image collection was hampered initially, but these individuals saved us by permitting us to use their collection of images, which helped in the initial planning of the classification model.

Author contribution statement JD: Conceptualization, Methodology, Software, Investigation, Writing – original draft, Writing – review & editing. Dr. SL: Supervision, Conceptualization. Dr. BB: Data collection and validation.

Funding The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Declarations

Conflict of interest The authors have no relevant financial or non-financial interests to disclose.

References

1. FAO, Aquaculture development trends in Asia (2000). http://www.fao.org/3/ab980e/ab980e03.htm#TopOfPage
2. T.M. Berra, An Atlas of Distribution of the Fresh Water Fish Families of the World (University of Nebraska Press, Lincoln, 1981)
3. D. Kumar, Fish culture in undrainable ponds. A manual for extension, FAO Fisheries Technical Paper No. 325. (Rome, FAO, 1992). 239 p
4. M. Karim, H. Ullah, S. Castine et al., Carp–mola productivity and fish consumption in small-scale homestead aquaculture in Bangladesh. Aquac. Int. 25, 867–879 (2017). https://doi.org/10.1007/s10499-016-0078-x
5. B.K. Bhattacharjya, M. Choudhury, V.V. Sugunan, Ichthyofaunistic resources of Assam with a note on their sustainable utilization, in Participatory approach for Fish Biodiversity Conservation in North East India, ed. by P.C. Mahanta, L.K. Tyagi (Workshop Proc. NBFGR, Lucknow, 2003), pp. 1–14
6. S. Dewan, M.A. Wahab, M.C.M. Beveridge, M.H. Rahman, B.K. Sarker, Food selection, electivity and dietary overlap among planktivorous Chinese and Indian major carp fry and fingerlings grown in extensively managed, rain-fed ponds in Bangladesh. Aquac. Res. 22, 277–294 (1991)
7. M.M. Rahman, M.C.J. Verdegem, M.A. Wahab, M.Y. Hossain, Q. Jo, Effects of day and night on swimming, grazing and social behaviours of rohu Labeo rohita (Hamilton) and common carp Cyprinus carpio (L.) in simulated ponds. Aquac. Res. 39, 1383–1392 (2008)
8. M.A. Wahab, M.M. Rahman, A. Milstein, The effect of common carp Cyprinus carpio (L.) and mrigal Cirrhinus mrigala (Hamilton) as bottom feeders in major Indian carp polycultures. Aquac. Res. 33, 547–557 (2002)
9. D.M. Alam, M. Hasan, Md. Wahab, M. Khaleque, M. Alam, Md. Samad, Carp polyculture in ponds with three small indigenous fish species – Amblypharyngodon mola, Chela cachius and Puntius sophore. Progress. Agric. 13, 117–126 (2018)
10. A.S.M. Kibria, M.M. Haque, Potentials of integrated multi-trophic aquaculture (IMTA) in freshwater ponds in Bangladesh. Aquac. Rep. 11, 8–16 (2018). https://doi.org/10.1016/j.aqrep.2018.05.004
11. N.J.C. Strachan, Recognition of fish species by colour and shape. Image Vis. Comput. 11, 2–10 (1993)
12. S. Cadieux, F. Michaud, F. Lalonde, Intelligent system for automated fish sorting and counting, in Proceedings, 2000 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2000) (Cat. No.00CH37113), vol. 2 (Takamatsu, Japan, 2000), pp. 1279–1284
13. A. Rova, G. Mori, L. M. Dill, One fish, two fish, butterfish, trumpeter: recognizing fish in underwater video, in APR Conference on Machine Vision Applications (2007), pp. 404–407
14. C. Spampinato, D. Giordano, R. Di Salvo, J. Chen-Burger, R. Fisher, G. Nadarajan, Automatic fish classification for underwater species behavior understanding. Anal. Retr. Tracked Events Motion Imagery Streams (2010). https://doi.org/10.1145/1877868.1877881
15. M.K. Alsmadi, K.B. Omar, S.A. Noah et al., Fish classification based on robust features extraction from color signature using back-propagation classifier. J. Comput. Sci. 7, 52 (2011)
16. B. Benson, J. Cho, D. Goshorn, R. Kastner, Field programmable gate array (FPGA) based fish detection using Haar classifiers. Am. Acad. Underwater Sci. (2009)
17. J. Hu, D. Li, Q. Duan, Y. Han, G. Chen, X. Si, Fish species classification by color, texture and multi-class support vector machine using computer vision. Comput. Electron. Agric. 88, 133–140 (2012)
18. M. M. M. Fouad, H. M. Zawbaa, N. El-Bendary, A. E. Hassanien, Automatic Nile Tilapia fish classification approach using machine learning techniques, in 13th International Conference on Hybrid Intelligent Systems (HIS 2013) (Gammarth, 2013), pp. 173–178. https://doi.org/10.1109/HIS.2013.6920477
19. C. Pornpanomchai, B. Lurstwut, P. Leerasakultham, W. Kitiyanan, Shape- and texture-based fish image recognition system. Kasetsart J. Nat. Sci. 47, 624–634 (2013)
20. M. Rodrigues, M. Freitas, F. Pádua, R. Gomes, E. Carrano, Evaluating cluster detection algorithms and feature extraction techniques in automatic classification of fish species. Pattern Anal. Appl. (2014). https://doi.org/10.1007/s10044-013-0362-6
21. P.X. Huang, B.J. Boom, R.B. Fisher, Hierarchical classification with reject option for live fish recognition. Mach. Vis. Appl. 26, 89–102 (2015)
22. M.-C. Chuang, J.-N. Hwang, K. Williams, A feature learning and object recognition framework for underwater fish images. IEEE Trans. Image Process. (2016). https://doi.org/10.1109/tip.2016.2535342
23. D. Li, Qi. Wang, X. Li, M. Niu, He. Wang, C. Liu, Recent advances of machine vision technology in fish classification. ICES J. Mar. Sci. 79(2), 263–284 (2022). https://doi.org/10.1093/icesjms/fsab264
24. S.Z.H. Shah, H.T. Rauf, I.U. Lali, S.A.C. Bukhari, M.S. Khalid, M. Farooq, M. Fatima, Fish-Pak: fish species dataset from Pakistan for visual features based classification. Mendeley Data (2019). https://doi.org/10.17632/n3ydw29sbz.3
25. H.T. Rauf, M.I. Lali, S. Zahoor, S.Z. Shah, A. Rehman, S.A.C. Bukhari, Visual features based automated identification of fish species using deep convolutional neural networks. Comput. Electron. Agric. (2019). https://doi.org/10.1016/j.compag.2019
26. I. Sharmin, N.F. Islam, I. Jahan et al., Machine vision based local fish recognition. SN Appl. Sci. 1, 1529 (2019). https://doi.org/10.1007/s42452-019-1568-z
27. Z. Ju, Y. Xue, Fish species recognition using an improved AlexNet model. Optik 223, 165499 (2020). https://doi.org/10.1016/j.ijleo.2020.165499
28. A.A. dos Santos, W.N. Gonçalves, Improving Pantanal fish species recognition through taxonomic ranks in convolutional neural networks. Ecol. Inform. 53, 100977 (2019). https://doi.org/10.1016/j.ecoinf.2019.100977
29. P. Mathew, S. Elizabeth, Fish identification based on geometric robust feature extraction from anchor/landmark points, in National Conference on Image Processing and Machine Vision (NCIPMV) 2017 (At University of Kerala, Trivandrum, 2017)
30. J. Jäger, E. Rodner, J. Denzler, V. Wolff, K. Fricke-Neuderth, Seaclef 2016: Object proposal classification for fish detection in underwater videos, in CLEF (Working Notes) (2016), pp. 481–489
31. S.A. Siddiqui, A. Salman, M.I. Malik, F. Shafait, A. Mian, M.R. Shortis, E.S. Harvey, Automatic fish species classification in underwater videos: exploiting pretrained deep neural network models to compensate for limited labelled data. ICES J. Mar. Sci. 75, 374–389 (2017)
32. Y. Ma, P. Zhang, Y. Tang, Research on fish image classification based on transfer learning and convolutional neural network model, in 2018 14th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) (2018), pp. 850–855. https://doi.org/10.1109/FSKD.2018.8686892
33. A. Salman, S. Maqbool, A.H. Khan, A. Jalal, F. Shafait, Real-time fish detection in complex backgrounds using probabilistic background modelling. Ecol. Inform. 51, 44–51 (2019)
34. N.E. Khalifa, M. Taha, A.E. Hassanien, Aquarium family fish species identification system using deep neural networks, in Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2018, (2019), pp. 347–356. https://doi.org/10.1007/978-3-319-99010-1_32
35. M. A. Islam, M. R. Howlader, U. Habiba, R. H. Faisal, M. M. Rahman, Indigenous fish classification of Bangladesh using hybrid features with SVM classifier, in 2019 International Conference on Computer, Communication, Chemical, Materials and Electronic Engineering (IC4ME2) (2019). https://doi.org/10.1109/ic4me247184.2019.9036679
36. A. Jalal, A. Mian, M. Shortis, F. Shafait, Fish detection and species classification in underwater environments using deep learning with temporal information. Ecol. Inform. 57, 101088 (2020). https://doi.org/10.1016/j.ecoinf.2020.101088
37. H. Wang, Y. Shi, Y. Yue, H. Zhao, Study on freshwater fish image recognition integrating SPP and DenseNet network, in 2020 IEEE International Conference on Mechatronics and Automation (ICMA) (2020). https://doi.org/10.1109/icma49215.2020.923
38. A. Banan, A. Nasiri, A. Taheri-Garavand, Deep learning-based appearance features extraction for automated carp species identification. Aquac. Eng. 89, 102053 (2020)
39. M.A. Iqbal, Z. Wang, Z.A. Ali et al., Automatic fish species classification using deep convolutional neural networks. Wireless Pers. Commun. 116, 1043–1053 (2021). https://doi.org/10.1007/s11277-019-06634-1
40. X. Xu, W. Li, Q. Duan, Transfer learning and SE-ResNet152 networks-based for small-scale unbalanced fish species identification. Comput. Electron. Agric. 180, 105878 (2021). https://doi.org/10.1016/j.compag.2020.105878
41. K. Dey, M.M. Hassan, M.M. Rana, M.H. Hena, Bangladeshi indigenous fish classification using convolutional neural networks. Int. Conf. Inf. Technol. (ICIT) 2021, 899–904 (2021). https://doi.org/10.1109/ICIT52682.2021.9491681
42. N.S. Abinaya, D. Susan, R. Sidharthan, Naive Bayesian fusion based deep learning networks for multisegmented classification of fishes in aquaculture industries. Ecol. Inform. 61, 101248 (2021). https://doi.org/10.1016/j.ecoinf.2021.101248
43. J. Jose, C.S. Dr. Kumar, S. Sureshkumar, Region-based split Octonion networks with channel attention module for tuna classification. Int. J. Pattern Recognit. Artif. Intell. (2022). https://doi.org/10.1142/S0218001422500306
44. M.K. Alsmadi, I. Almarashdeh, A survey on fish classification techniques. J. King Saud Univ. Comput. Inf. Sci. 34(5), 1625–1638 (2022). https://doi.org/10.1016/j.jksuci.2020.07.005
45. M. Riesenhuber, T. Poggio, Hierarchical models of object recognition in cortex. Nat. Neurosci. 2, 1019–1025 (1999). https://doi.org/10.1038/14819
46. S.J. Pan, Q. Yang, A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010)
47. A. Krizhevsky, I. Sutskever, G. Hinton, ImageNet classification with deep convolutional neural networks. Neural Inf. Process. Syst. (2012). https://doi.org/10.1145/3065386
48. K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Las Vegas, NV, USA, 27–30 June 2016), pp. 770–778
49. T. Mitchell, Machine Learning (McGraw-Hill Science/Engineering/Math, Berlin, 1997)
50. K. Horak, Introduction to Learning Curves. http://vision.uamt.feec.vutbr.cz/STU/lectures/MachineLearning_LearningCurves. Accessed 08 February (2022)
51. https://www.baeldung.com/cs/learning-curve-ml. Accessed 08 Feb 2022

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.