Visvesvaraya Technological University: Chest X-Ray of Pneumonia Disease Diagnosis Using CNN
A Project Report
On
MD NOORUZZAMAN 4MT17CS056
KARTHIK R 4MT17CS046
GANESH NAIK 4MT17CS040
FATHIMA ANISHA 4MT16CS37
Project Guide
Dr. SUKHWINDER SHARMA
Associate Professor
Department of CS & E
MITE, Moodabidri
February – May 2020
CERTIFICATE
This is to certify that the project work entitled “CHEST X-RAY OF PNEUMONIA DISEASE DIAGNOSIS USING CNN”
Visvesvaraya Technological University, Belagavi during the year 2020 – 21. It is certified
that all corrections and suggestions indicated for Internal Assessment have been incorporated
in the report deposited in the departmental library. The project has been approved as it
satisfies the academic requirements in respect of project work prescribed for the Bachelor of
Engineering degree.
1.
2.
ABSTRACT
As the most common examination tool in medical practice, chest radiography has important clinical
value in the diagnosis of disease. The automatic detection of chest diseases such as pneumonia,
tuberculosis and other lung diseases from chest X-ray images has become an active topic in medical
research. We propose automatic diagnosis from chest radiography images using deep learning. The
main aim is to develop a system that can detect disease from chest X-ray image datasets using a CNN
(convolutional neural network). We use the VGG16 CNN architecture and begin by importing the key
packages used throughout the project.
For engineering students looking to graduate in Computer Science and Engineering, working with
these professional minds was definitely a great experience. This report was prepared under many
limitations and obstacles. We thank the many people who helped us through this process. We hope that
the authorities are pleased with our efforts and the work done during this short period of time.
ACKNOWLEDGEMENTS
The satisfaction of successfully completing this project would be incomplete without
the mention of the people who made it possible, whose constant guidance and encouragement
crowned our efforts with success.
This project was carried out under the guidance of Dr. Sukhwinder Sharma, Associate Professor
in the Department of Computer Science and Engineering. We would like to express our sincere
gratitude to our guide for all the help and guidance in this project.
We would like to thank our project coordinators Dr. Sukhwinder Sharma, Associate
Professor in the Department of Computer Science and Engineering, for their cordial support,
valuable information and guidance, which helped us in completing this project through the
various stages.
We would like to express appreciation to Dr. Venkatramana Bhat P., Professor and Head of
the department, Computer Science and Engineering, for his support and guidance.
We would like to thank our Principal Dr. G.L. Easwara Prasad, for encouraging us and
giving us an opportunity to accomplish the project.
We also thank our management who helped us directly and indirectly in the completion of
this project.
Our special thanks to faculty members and others for their constant help and support.
Above all, we extend our sincere gratitude to our parents and friends for their constant
encouragement with moral support.
MD NOORUZZAMAN 4MT17CS056
KARTHIK R 4MT17CS046
GANESH NAIK 4MT17CS040
FATHIMA ANISHA 4MT17CS037
TABLE OF CONTENTS
Contents
ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF FIGURES
Chapter no TITLE
1 INTRODUCTION
1.1 Introduction
2 LITERATURE SURVEY
4 SYSTEM DESIGN
5 IMPLEMENTATION
5.1 Backend
5.1.2 Uploading Datasets from Google Drive
5.1.8 Plotting
5.2 Frontend
5.2.1 HTML5
5.2.2 CSS3
5.2.3 Flask
5.2.5 html2pdf.js
6.1 Result
6.2 Snapshots
REFERENCES
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
This chapter gives an introduction to CNN (Convolutional Neural Networks).
1.1 Introduction
A convolutional neural network (CNN) is a neural network that has one or more
convolutional layers and is used mainly for image processing, classification, segmentation
and other autocorrelated data. CNNs are regularized versions of multilayer
perceptrons. Multilayer perceptrons are usually fully connected networks, that is, each
neuron in one layer is connected to all neurons in the next layer. The "full connectivity" of
these networks makes them prone to overfitting. Convolutional networks were inspired
by biological processes in that the connectivity pattern between neurons resembles the
organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in a
restricted region of the visual field known as the receptive field. The receptive fields of
different neurons partially overlap such that they cover the entire visual field.
The risk of disease is immense for many, especially in developing nations, where billions of
people are exposed to household air pollution. The WHO estimates that over 4 million premature
deaths occur annually from household air pollution-related diseases, including pneumonia and
tuberculosis. Over 150 million people are infected with pneumonia annually, especially children
under 5 years old. In such regions, the problem can be further aggravated by the lack of medical
resources. For example, in Africa’s 57 nations, a gap of 2.3 million doctors and nurses exists. For
these populations, accurate and fast diagnosis means everything. It can guarantee timely access to
treatment and save much needed time and money for those already experiencing poverty.
Despite the advantages of X-ray imaging for diagnosing several diseases, in some cases it is still
not possible to identify the correct region of interest in a radiographic image. This motivates
automated methods for detecting observations, whose diagnostic accuracy is reaching the human level.
1.3 Objectives
According to UNICEF, a child dies of pneumonia every 39 seconds. Pneumonia kills more
children than any other infectious disease, claiming the lives of over 800,000 children under
five every year, or around 2,200 every day. This includes over 153,000 newborns. Almost all
of these deaths are preventable. Globally, there are over 1,400 cases of pneumonia per
100,000 children, or 1 case per 71 children every year, with the greatest incidence occurring
in South Asia (2,500 cases per 100,000 children). Source:
https://data.unicef.org/topic/child-health/pneumonia/.
The scope of this project is to train a VGG16 model that classifies chest X-ray images as
normal or abnormal (pneumonia). Deep Learning (DL) has broad future scope, chiefly because it
does not require manual feature engineering. Deep learning extracts the features from the data
itself instead of relying on hand-extracted features supplied to the model, which removes the main
burden of feature engineering. Also, since the features are learned by the model itself, it is
more likely to produce a model that generalizes better than feature-engineered models.
CHAPTER 2
LITERATURE SURVEY
A literature survey or a literature review in a project report is that section which shows the
various analysis and research made in the field of interest and the results already published,
taking into account the various parameters and the extent of the project.
In this study [1], the authors compared the performance of two CNN networks on the diagnosis of
pneumonia. For training they used transfer learning and fine-tuning. After the training phase,
they compared the test results of the two networks. The results showed that the Vgg16 network
outperforms the Xception network in accuracy (0.87), pneumonia precision (0.91) and pneumonia
F1 score (0.90), whereas the Xception network outperforms the Vgg16 network in sensitivity (0.85),
normal precision (0.86) and pneumonia recall (0.94). The Xception network is more successful at
detecting pneumonia cases than the Vgg16 network, while the Vgg16 network is more
successful at detecting normal cases.
In this paper [2], the authors attempt to find a simpler approach to pneumonia detection from chest
X-rays by comparing the performance of 15 different CNN architectures trained on the same
dataset. Based on their findings, they select the model that is easy to train and has
one of the best sets of performance metrics. The metrics of the selected architecture, compared with
some of the state-of-the-art architectures trained on chest X-rays, indicate that striving
for simpler CNN architectures is crucial for intelligibility and need not compromise
accuracy and quality of performance.
The authors developed an algorithm that detects pneumonia from frontal-view chest X-ray images at
a level exceeding that of practicing radiologists. They also show that a simple extension of the
algorithm to detect multiple diseases outperforms the previous state of the art on ChestX-ray14, the
largest publicly available chest X-ray dataset. They state that, with automation at the level of
experts, this technology can improve healthcare delivery and increase access to
medical imaging expertise in parts of the world where access to skilled radiologists is limited.
Viral Pneumonia Screening on Chest X-rays Using Confidence Aware Anomaly Detection.
By Jianpeng Zhang, Yutong Xie, Guansong Pang, Zhibin Liao, Johan Verjans, Wenxing Li,
Zongji Sun, Jian He, Yi Li, Chunhua Shen, and Yong Xia.
In this paper [4], the authors propose the CAAD model for viral pneumonia screening. Their results
on two chest X-ray datasets indicate that (1) anomaly detection works well for viral
pneumonia screening on chest X-ray images and is superior to binary classification methods,
(2) learning model confidence is useful for predicting failures, greatly reducing false
negatives, and (3) the model achieves an AUC of 83.61% and a sensitivity of 71.70% on the unseen
dataset, which is comparable to the performance of medical professionals.
This paper [5] primarily aims to improve medical adeptness in areas where the availability of
radiotherapists is still limited. The authors evaluated the performance of various pretrained
CNN models combined with distinct classifiers and then, on the basis of statistical results, selected
DenseNet-169 for the feature extraction stage and SVM for the classification stage. They report
that hyperparameter optimization in the classification stage improved
the model performance.
This paper uses a CNN with the ResNet34 architecture. After training the model for
500 epochs, test-time augmentation was applied and the model was evaluated on the
test data. The model achieved an accuracy of 92.9%, considerably higher than the
baseline accuracy of 63%. The precision and recall were approximately 0.9088
and 0.9927 respectively.
This paper reports that the validation accuracy, recall and F1 score of CNN
classifier model 3, with three convolutional layers, are 92.31%, 98% and 94%, respectively,
which are quite high compared to the other models that were trained. CNN classifier model 4,
with four convolutional layers, comes very close in performance, with 91.67% validation
accuracy, 98% recall and 94% F1 score. Both of these models have the same recall and F1
scores.
CHAPTER 3
CHAPTER 4
SYSTEM DESIGN
The system overview provides a top-level view of the entire software product. It highlights the
major components without going into the inner details of the implementation. It describes the
functionality of the product and the context and design of the software. The application is
developed so that the user can interact with the system easily, and it simplifies the user's tasks by
providing a smooth user interface and user experience with an easily readable and understandable
view.
The input to VGG16 is a 224×224×3 image. It is followed by two convolutional layers, each with
a 224×224×64 output, and then a pooling layer that reduces the height and width, giving
112×112×64. Both convolutional layers use 64 filters, same padding and the ReLU non-linear
activation function. Next come two conv128 layers, each with a 112×112×128 output,
followed by a pooling layer that again reduces the height and width, giving
56×56×128. These layers use 128 filters with same padding and the ReLU
activation function. Then there are three conv256 layers, each with a 56×56×256 output,
followed by a pooling layer that reduces the size to 28×28×256. Then there are three
conv512 layers, each with a 28×28×512 output, followed by a pooling layer that reduces the size
to 14×14×512. Then there are three more conv512 layers, each with a 14×14×512 output,
followed by a pooling layer that gives 7×7×512. Finally, there are two dense (fully connected)
layers with 4096 nodes each and a final dense output layer with 1000
nodes, which classifies among 1000 classes. This model processes the input image and
outputs a vector of 1000 values. This vector represents the classification probabilities for the
corresponding classes.
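The shape progression described above can be checked with a short sketch. It is only an illustration of the arithmetic: it assumes "same" padding for every 3×3 convolution and 2×2 pooling with stride 2, and the names VGG16_STAGES and trace_vgg16_shapes are introduced here for illustration and do not come from the project code.

```python
# Each stage: (number of 3x3 "same" convolutions, number of filters),
# followed by one 2x2 max-pooling layer with stride 2.
VGG16_STAGES = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_vgg16_shapes(height=224, width=224):
    """Return a list of (conv_output_shape, pooled_shape), one pair per stage."""
    shapes = []
    for _num_convs, filters in VGG16_STAGES:
        conv_shape = (height, width, filters)      # "same" padding keeps H and W
        height, width = height // 2, width // 2    # pooling halves H and W
        shapes.append((conv_shape, (height, width, filters)))
    return shapes


shapes = trace_vgg16_shapes()
for conv_shape, pooled_shape in shapes:
    print(conv_shape, "->", pooled_shape)

# Number of values passed to the first dense layer after flattening.
last_h, last_w, last_c = shapes[-1][1]
flattened = last_h * last_w * last_c
print("flattened features:", flattened)
```

The last pooled shape is 7×7×512, so the first fully connected layer receives 25088 values per image.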
CHAPTER 5
IMPLEMENTATION
5.1 BACKEND
We implemented the project in the Google Colaboratory editor. Google Colaboratory (also
known as Colab) is a free Jupyter notebook environment that runs in the cloud and stores its
notebooks on Google Drive. Colab was originally an internal Google project; an attempt was
made to open source all the code and work more directly upstream, leading to the
development of the "Open in Colab" Google Chrome extension. We used the Python programming
language and its libraries. The libraries we used are as follows.
● Keras: Keras is an open-source software library that provides a Python interface for
artificial neural networks. Keras acts as an interface for the TensorFlow library.
● NumPy: NumPy is a library for the Python programming language, adding support for
large, multi-dimensional arrays and matrices, along with a large collection of high-level
mathematical functions to operate on these arrays.
● Matplotlib: Matplotlib is a plotting library for the Python programming language and
its numerical mathematics extension NumPy. It provides an object-oriented API for
embedding plots into applications.
There were several steps we used to implement and train the model. The steps are as follows.
1. Importing libraries
2. Uploading datasets from Google Drive
3. Loading datasets
4. Data augmentation
5. Training using the VGG16 model
6. Early Stopping
7. Model compilation
8. Plotting
9. Model evaluation
10. Saving the trained model
The first step of implementation is to import the libraries. We used libraries such as Keras, NumPy
and Matplotlib. An import in Python is similar to #include of a header file in C/C++. A Python module
gets access to code from another module by importing the file or function using import.
We uploaded the datasets from Google Drive because we were using Colab and our own machine was
not capable of training the model. We followed these steps.
1. To connect Google Drive (GDrive) with Colab, we executed the following two lines of code in
Colab.
2. After signing in with a Google account we received an authorization code, pasted it
into Colab, and were then granted access to Google Drive.
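The original listing for the mounting step is not reproduced in this text. A minimal sketch of how Drive is usually mounted in Colab is given below; the guarded import, the helper name mount_google_drive and the mount point /content/drive are assumptions added so that the snippet can also be loaded outside Colab, and they are not taken from the project code.

```python
# google.colab exists only inside a Colab runtime, so the import is guarded
# to keep this sketch loadable on an ordinary Python installation.
try:
    from google.colab import drive  # provided by the Colab runtime
except ImportError:
    drive = None

MOUNT_POINT = "/content/drive"  # conventional Colab mount point (assumption)


def mount_google_drive(mount_point=MOUNT_POINT):
    """Mount Google Drive in Colab; return False when not running in Colab."""
    if drive is None:
        return False
    drive.mount(mount_point)  # starts the sign-in and authorization prompt
    return True


mounted = mount_google_drive()
print("Drive mounted:", mounted)
```

Inside Colab the two essential lines are the import and the call to drive.mount; everything else is scaffolding for running the sketch elsewhere.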
We used an X-ray image dataset that is publicly available on Kaggle at
https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia. It has a total of 5856
images (1.15 GB), split into 5216 training images, 624 test images and 16 validation images.
Overall, there are 4273 pneumonia images (73% of the dataset) and 1583 normal images
(27% of the dataset). We accessed the dataset from the mounted Google Drive.
Data samples
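The image counts quoted for the dataset can be cross-checked with a few lines of arithmetic. This is only a consistency check on the numbers stated in the text; the variable names are illustrative.

```python
# Image counts quoted in the text for the Kaggle chest X-ray dataset.
split_counts = {"train": 5216, "test": 624, "validation": 16}
class_counts = {"pneumonia": 4273, "normal": 1583}

total = sum(split_counts.values())

# Both breakdowns must describe the same set of images.
assert total == sum(class_counts.values())

# Share of each class, rounded to whole percent.
class_share = {name: round(100 * count / total) for name, count in class_counts.items()}
print(total, class_share)
```

The roughly 73/27 split shows that the classes are imbalanced, which is worth keeping in mind when reading accuracy figures.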
For the VGG16 model we used the Keras Sequential model. A Sequential model is appropriate for a
plain stack of layers. We built the Sequential model incrementally by adding layers one at a
time with model.add().
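The project's own model listing is not reproduced in this text, so the following is only a sketch of how a VGG16-style stack could be assembled with model.add(). Several details are assumptions rather than facts from the report: the 224×224×3 input shape, the single sigmoid output unit, and ReLU in the two hidden dense layers (the report states only that sigmoid was used in the fully connected part). TensorFlow is imported inside the function so that the block can be loaded without it.

```python
# (filters, number of 3x3 convolutions) for the five VGG16 blocks.
VGG16_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]


def build_vgg16_binary(input_shape=(224, 224, 3)):
    """Build a VGG16-style Sequential model with one sigmoid output unit."""
    from tensorflow import keras            # imported lazily (needs TensorFlow)
    from tensorflow.keras import layers

    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    for filters, num_convs in VGG16_BLOCKS:
        for _ in range(num_convs):
            model.add(layers.Conv2D(filters, kernel_size=(3, 3),
                                    padding="same", activation="relu"))
        model.add(layers.MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(4096, activation="relu"))   # hidden activation: assumption
    model.add(layers.Dense(4096, activation="relu"))   # hidden activation: assumption
    model.add(layers.Dense(1, activation="sigmoid"))   # pneumonia probability
    model.compile(optimizer="rmsprop", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# 13 convolutional layers plus 3 dense layers give the 16 weight layers of VGG16.
num_conv_layers = sum(num_convs for _, num_convs in VGG16_BLOCKS)
print("convolutional layers:", num_conv_layers)
```

Calling build_vgg16_binary().summary() in an environment with TensorFlow installed lists the layers and their output shapes.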
Conv2D is a 2D convolutional layer. This layer creates a convolution kernel that is convolved
with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created
and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well.
When using this layer as the first layer in a model, provide the keyword argument input_shape
(tuple of integers or None, does not include the sample axis), e.g., input_shape=(128, 128, 3)
for 128x128 RGB pictures in data_format="channels_last". In the Conv2D arguments we also set
filters, padding (e.g., padding="same"), the activation function (e.g., ReLU) and kernel_size
(e.g., kernel_size=(3, 3)).
MaxPool2D is a max pooling operation for 2D data. It downsamples the input along its spatial
dimensions (height and width) by taking the maximum value over an input window (of size defined by
pool_size) for each channel of the input. The window is shifted by strides along each dimension. The
resulting output, when using the "valid" padding option, has a spatial shape (number of rows or
columns) of output_shape = math.floor((input_shape - pool_size) / strides) + 1 (when
input_shape >= pool_size). The resulting output shape when using the "same" padding option is
output_shape = math.floor((input_shape - 1) / strides) + 1. We used pool_size=(2, 2) and
strides=(2, 2) in the MaxPool2D arguments.
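The two output-shape formulas can be written as a small helper. This is a sketch of the formulas quoted above for a single spatial dimension; the function name pool_output_size is illustrative.

```python
import math


def pool_output_size(input_size, pool_size, stride, padding="valid"):
    """Output length along one spatial dimension of a pooling layer."""
    if padding == "valid":
        if input_size < pool_size:
            raise ValueError("valid padding needs input_size >= pool_size")
        return math.floor((input_size - pool_size) / stride) + 1
    if padding == "same":
        return math.floor((input_size - 1) / stride) + 1
    raise ValueError("padding must be 'valid' or 'same'")


# Even input: both options halve the size with pool_size 2 and stride 2.
even_valid = pool_output_size(224, pool_size=2, stride=2, padding="valid")
even_same = pool_output_size(224, pool_size=2, stride=2, padding="same")

# Odd input: "valid" drops the last row or column, "same" pads and keeps it.
odd_valid = pool_output_size(7, pool_size=2, stride=2, padding="valid")
odd_same = pool_output_size(7, pool_size=2, stride=2, padding="same")

print(even_valid, even_same, odd_valid, odd_same)
```

The two options differ only when the input size is not a multiple of the stride, as with the odd size 7 here.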
We used a Flatten layer to flatten the input into a one-dimensional feature vector. If inputs are
shaped (batch,) without a feature axis, then flattening adds an extra channel dimension and the
output shape is (batch, 1).
We used the Dense layer in the VGG16 model for the fully connected layers. It is a regular
densely connected neural network layer. We used two fully connected layers and one output
layer. Dense implements the operation output = activation(dot(input, kernel) + bias), where
activation (e.g., sigmoid) is the element-wise activation function passed as the activation
argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by
the layer.
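The Dense operation can be illustrated on plain Python lists. This sketch uses a tiny hand-picked kernel and bias, and ReLU as the example activation; none of these values come from the trained model.

```python
def dense_forward(inputs, kernel, bias, activation):
    """Compute activation(dot(inputs, kernel) + bias) for one input vector.

    kernel has one row per input feature and one column per output unit.
    """
    outputs = []
    for j in range(len(bias)):
        z = sum(x * row[j] for x, row in zip(inputs, kernel)) + bias[j]
        outputs.append(activation(z))
    return outputs


def relu(z):
    """Rectified linear unit: negative values are clipped to zero."""
    return max(0.0, z)


x = [1.0, 2.0]                      # two input features
kernel = [[0.5, -1.0, 0.0],         # weights for the first input feature
          [0.25, 0.5, -1.0]]        # weights for the second input feature
bias = [0.0, 0.5, 1.0]              # one bias per output unit

dense_output = dense_forward(x, kernel, bias, relu)
print(dense_output)
```

The third unit has a pre-activation of -1.0, which ReLU clips to 0.0; passing a different function as activation changes only that last step.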
We used two activation functions, one for the convolutional layers (ReLU) and the other for the fully
connected layers (sigmoid).
Sigmoid: Sigmoid is also a non-linear activation function. It maps any input to a value between 0 and
1. Its mathematical expression is $\sigma(x) = \frac{1}{1 + e^{-x}}$.
Because sigmoid is non-linear, when multiple neurons use it as
their activation function, the output is non-linear as well.
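A small numerical sketch of the sigmoid function follows. The two-branch form is a common way to avoid overflow for large negative inputs and is a choice made here for illustration, not something stated in the report.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    exp_x = math.exp(x)             # safe: x is negative, so exp(x) <= 1
    return exp_x / (1.0 + exp_x)


for value in (-5.0, 0.0, 5.0):
    print(value, round(sigmoid(value), 4))
```

The output always lies between 0 and 1 and equals 0.5 at x = 0, which is why 0.5 is a natural decision threshold for a binary classifier.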
For model compilation we used three arguments: the optimizer (RMSprop), the loss function and the
metrics (accuracy).
RMSprop: We used the RMSprop optimization algorithm. The RMSprop optimizer is similar
to the gradient-descent algorithm with momentum. It restricts the oscillation in the vertical
direction. Therefore, we can increase our learning rate and our algorithm could take larger steps
in the horizontal direction and converge faster.
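RMSprop Algorithm:
Input: A constant vector $\xi \mathbf{1}_d \in \mathbb{R}^d$ with $\xi \mathbf{1}_d \ge 0$, parameter $\beta \in [0, 1]$, step size $\alpha$, initial
starting point $x_1 \in \mathbb{R}^d$ and $f : \mathbb{R}^d \to \mathbb{R}$
Function RMSprop
Initialize: $v_0 = 0$
for $t = 1, 2, 3, \dots$ do
$g_t = \nabla f(x_t)$
$v_t = \beta v_{t-1} + (1 - \beta)(g_t^2 + \xi \mathbf{1}_d)$
$V_t = \mathrm{diag}(v_t)$
$x_{t+1} = x_t - \alpha V_t^{-1/2} g_t$
end for
end function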
RMSprop adapts the effective learning rate of each parameter instead of applying one fixed rate to
all of them, so the effective step changes over time. The decay (momentum) parameter and the step
size are denoted by $\beta$ and $\alpha$ respectively. Sometimes the running average $v_t$ is
approximately 0, so a small parameter $\xi$ (epsilon) is included to prevent division by a
near-zero value and keep the update from exploding.
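The update rule can be sketched in plain Python on a toy objective. This is an illustration of the per-coordinate RMSprop step, not the Keras implementation; the objective f(x) = x_0^2 + 10 x_1^2, the hyperparameter values and the function names are all chosen here for demonstration.

```python
def rmsprop_minimize(grad_fn, x1, alpha=0.01, beta=0.9, xi=1e-8, steps=1000):
    """Run RMSprop updates on a list-valued parameter vector and return it."""
    x = list(x1)
    v = [0.0] * len(x)                                   # v_0 = 0
    for _ in range(steps):
        g = grad_fn(x)                                   # g_t = grad f(x_t)
        # v_t = beta * v_{t-1} + (1 - beta) * (g_t^2 + xi), element-wise
        v = [beta * v_i + (1 - beta) * (g_i * g_i + xi) for v_i, g_i in zip(v, g)]
        # x_{t+1} = x_t - alpha * g_t / sqrt(v_t), element-wise
        x = [x_i - alpha * g_i / (v_i ** 0.5) for x_i, g_i, v_i in zip(x, g, v)]
    return x


def grad_f(x):
    """Gradient of the toy objective f(x) = x_0^2 + 10 * x_1^2."""
    return [2.0 * x[0], 20.0 * x[1]]


result = rmsprop_minimize(grad_f, x1=[1.0, -1.0])
print(result)
```

Because each gradient component is divided by the square root of its own running average, both coordinates move with similar step sizes even though the second direction is ten times steeper. With a constant step size the iterates end up hovering close to the minimum rather than converging to it exactly.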
Loss function: Our model determines whether or not the patient is suffering from pneumonia. That
means we are predicting “Yes” or “No” (a binary classification problem). So, we used a binary
classification loss function, namely Binary Cross-Entropy.
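Binary cross-entropy averages -[y log(p) + (1 - y) log(1 - p)] over the samples, where y is the true label (0 or 1) and p is the predicted probability. A minimal sketch follows; the clipping constant eps and the example values are assumptions for illustration.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)        # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]                  # 1 = pneumonia, 0 = normal
confident = [0.9, 0.1, 0.8, 0.2]       # mostly correct, confident predictions
uncertain = [0.5, 0.5, 0.5, 0.5]       # model cannot tell the classes apart

loss_confident = binary_cross_entropy(labels, confident)
loss_uncertain = binary_cross_entropy(labels, uncertain)
print(loss_confident, loss_uncertain)
```

Predicting 0.5 for every image gives a loss of ln 2 (about 0.693), so any useful classifier should score below that.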
5.1.8 Plotting
We used Matplotlib to plot two graphs, one for loss and one for
accuracy. The code and graph below show the training and validation loss.
Keras provides the Model.evaluate() function to evaluate the model and check whether it is a
good fit for the given problem and the corresponding data.
After evaluation we obtained loss = 0.1583 and accuracy = 0.9375 on the validation set, and
loss = 0.12 and accuracy = 0.9528 on the training set.
We saved the model using the HDF5 library. To use this, we installed the h5py package. The
h5py package is a Pythonic interface to the HDF5 binary data format.
5.2 FRONTEND
We developed a web application where a radiologist or clinical expert can easily upload a chest
X-ray image and obtain a prediction of whether it is normal or abnormal (pneumonia).
After the prediction, a report is generated in PDF format, which the
radiologist can easily download. The VS Code editor was used to
create the web application. We used five technologies in the frontend.
5.2.1 HTML5
We used HTML5 for structuring the web application. HTML5 is a markup language used for
structuring and presenting content on the World Wide Web. It is the fifth and last major HTML
version that is a World Wide Web Consortium recommendation. The current specification is
known as the HTML Living Standard.
5.2.2 CSS3
For styling we used CSS3 in the web application. Cascading Style Sheets (CSS) is a style sheet
language used for describing the presentation of a document written in a markup language such
as HTML. CSS is a cornerstone technology of the World Wide Web, alongside HTML and
JavaScript.
5.2.3 Flask
We also used Flask to make the site more interactive where we used our saved model in HDF5
format. Flask is a micro web framework written in Python. It is classified as a microframework
because it does not require particular tools or libraries. It has no database abstraction layer, form
validation, or any other components where pre-existing third-party libraries provide common
functions.
We imported the flask and pymongo libraries. Flask connects the web application to the machine
learning model file, which is loaded using load_model(). Our trained
model is saved in my_model.h5.
In the code, a third route (‘/upload’) is used to upload the image
and save it to a file path. We used model.predict() to predict whether the given image shows
pneumonia or is normal (the prediction value lies between 0 and 1) and stored the result in the
classes variable. We then applied a conditional if statement: if the prediction value is greater
than 0.5, the given image is classified as pneumonia, otherwise as normal.
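The upload route and the threshold rule could look roughly like the sketch below. The project's own route code is not reproduced in this text, so the names UPLOAD_DIR, label_from_score, create_app and predict_fn, the form field name "file" and the JSON response are all hypothetical. Flask is imported inside create_app so that the decision rule can be run without it.

```python
import os

UPLOAD_DIR = "uploads"    # hypothetical folder for uploaded X-ray images
THRESHOLD = 0.5           # decision threshold described in the text


def label_from_score(score, threshold=THRESHOLD):
    """Map the model's sigmoid output to a class label."""
    return "pneumonia" if score > threshold else "normal"


def create_app(predict_fn):
    """Build a Flask app; predict_fn(path) must return a score in [0, 1]."""
    from flask import Flask, jsonify, request       # imported lazily
    from werkzeug.utils import secure_filename

    app = Flask(__name__)
    os.makedirs(UPLOAD_DIR, exist_ok=True)

    @app.route("/upload", methods=["POST"])
    def upload():
        uploaded = request.files["file"]             # field name is an assumption
        file_path = os.path.join(UPLOAD_DIR, secure_filename(uploaded.filename))
        uploaded.save(file_path)
        score = float(predict_fn(file_path))         # e.g. wraps model.predict()
        return jsonify(label=label_from_score(score), score=score)

    return app


print(label_from_score(0.93), label_from_score(0.12))
```

In this sketch predict_fn would wrap image loading, preprocessing and the call to model.predict(); keeping the threshold rule in its own function makes it easy to check separately from the web layer.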
We store patient information in a MongoDB database. MongoDB is a source-available,
cross-platform, document-oriented database program. Classified as a NoSQL database
program, MongoDB uses JSON-like documents with optional schemas. We use the pymongo
library, a Python distribution containing tools for working with MongoDB and the
recommended way to work with MongoDB from Python. In the code, we used
pymongo.MongoClient to make a connection with the installed MongoDB database.
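A connection and insert with pymongo could be sketched as follows. The connection string, database name, collection name and document fields are hypothetical, since the project's schema is not shown in this text, and save_patient_record needs both the pymongo package and a running MongoDB server.

```python
from datetime import datetime, timezone

MONGO_URI = "mongodb://localhost:27017/"   # default local address (assumption)


def build_patient_record(name, age, prediction_label, score):
    """Assemble the JSON-like document to store; field names are hypothetical."""
    return {
        "name": name,
        "age": age,
        "prediction": prediction_label,
        "score": score,
        "created_at": datetime.now(timezone.utc),
    }


def save_patient_record(record, uri=MONGO_URI):
    """Insert one record and return its generated id."""
    from pymongo import MongoClient                    # imported lazily
    client = MongoClient(uri)
    collection = client["pneumonia_app"]["patients"]   # hypothetical names
    return collection.insert_one(record).inserted_id


record = build_patient_record("Test Patient", 42, "normal", 0.12)
print(sorted(record))
```

Building the document in a separate function keeps the stored fields easy to inspect before anything is written to the database.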
5.2.5 html2pdf.js
We provided a feature with which radiologists can easily generate a PDF of the patient report and
download it. We achieved this using a JavaScript library called
html2pdf.js. html2pdf.js converts any webpage or element into a printable PDF entirely
client-side using html2canvas and jsPDF.
We loaded the library through the src link of its bundle (i.e., html2pdf.bundle.js). In pdf() we
used the DOM to generate and save the PDF.
CHAPTER 6
RESULTS AND SNAPSHOTS
6.1 RESULT
6.2 SNAPSHOTS
CHAPTER 7
Pneumonia constitutes a significant cause of morbidity and mortality. The WHO estimates
that over 4 million premature deaths occur annually from household air pollution-related
diseases, including pneumonia and tuberculosis. Over 150 million people are infected with
pneumonia annually, especially children under 5 years old. In developing regions, the problem can
be further aggravated by the lack of medical resources. For example, in Africa’s 57 nations,
a gap of 2.3 million doctors and nurses exists. According to UNICEF, a child dies of
pneumonia every 39 seconds. Pneumonia kills more children than any other infectious
disease, claiming the lives of over 800,000 children under five every year, or around 2,200
every day. This is therefore a major problem where technology can help to improve medical
facilities.
We obtained 93% validation accuracy and 95% training accuracy, with a validation loss of 0.1583
and a training loss of 0.12. In the future we will improve our model to predict better results.
With automation at the level of experts, this technology can improve healthcare delivery and
increase access to medical imaging expertise in parts of the world where access to skilled
radiologists is limited.
[1] Diagnosis of Pneumonia from Chest X-Ray Images using Deep Learning. By Enes Ayan and
Halil Murat Ünver. 978-1-7281-1013-4/19/$31.00 ©2019 IEEE.
[2] Pneumonia Detection using CNN through Chest X-ray. By Harshvardhan GM, Mahendra
Kumar Gourisaria, Sidharth Swarup Rautaray and Manjusha Pandey. Vol. 16, No. 1
(2021) 861 - 876.
[4] Viral Pneumonia Screening on Chest X-rays Using Confidence Aware Anomaly
Detection. By Jianpeng Zhang, Yutong Xie, Guansong Pang, Zhibin Liao, Johan Verjans,
Wenxing Li, Zongji Sun, Jian He, Yi Li, Chunhua Shen, and Yong Xia.
arXiv:2003.12338v4 [eess.IV] 2 Dec 2020.
[5] Pneumonia Detection Using CNN based Feature Extraction. By Dimpy Varshni, Kartik
Thakral, Lucky Agarwal, Rahul Nijhawan and Ankush Mittal. IEEE.
[6] Transfer Learning with Deep Convolutional Neural Network (CNN) for Pneumonia
Detection Using Chest X-ray. By Tawsifur Rahman, Muhammad E. H. Chowdhury,
Amith Khandakar, Khandaker R. Islam, Khandaker F. Islam, Zaid B. Mahbub,
Muhammad A. Kadir and Saad Kashem. Appl. Sci. 2020, 10, 3233;
doi:10.3390/app10093233.
[7] Deep Learning Approach For Prediction Of Pneumonia. By Kalyani Kadam, Dr. Swati
Ahirrao, Harbir Kaur, Dr. Shraddha Phansalkar and Dr. Ambika Pawar. Issue 10,
October 2019, ISSN 2277-8616.
[8] Efficient Pneumonia Detection in Chest Xray Images Using Deep Transfer Learning. By
Mohammad Farukh Hashmi, Satyarth Katiyar, Avinash G Keskar, Neeraj
Dhanraj Bokde and Zong Woo Geem. Diagnostics 2020, 10, 417;
doi:10.3390/diagnostics10060417.
[10] Convergence guarantees for RMSProp and ADAM in non-convex optimization and an
empirical comparison to Nesterov acceleration. By Soham De, Anirbit Mukherjee
and Enayat Ullah. arXiv:1807.06766v3 [cs.LG] 20 Nov 2018.
[11] Training of Deep Neural Networks based on Distance Measures using RMSProp. By
Thomas Kurbiel and Shahrzad Khaleghian. arXiv:1708.01911v1 [cs.LG] 6 Aug 2017.
[12] Variants of RMSProp and Adagrad with Logarithmic Regret Bounds. By Mahesh
Chandra Mukkamala and Matthias Hein. Proceedings of the 34th International
Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017.