Brain Tumour Analysis Using Image Processing
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE ENGINEERING
By
CERTIFICATE
This is to certify that the project report titled "Brain Tumour Analysis Using Image Processing" is a bonafide work of the following IV/II B.Tech students in the Department of Computer Science Engineering, Gayatri Vidya Parishad College of Engineering for Women, affiliated to JNT University, Kakinada, during the academic year 2016-2020, in fulfillment of the requirement for the award of the degree of Bachelor of Technology of this university.
We take this opportunity to thank one and all who have helped in making the project possible. We are thankful to Gayatri Vidya Parishad College of Engineering for Women for giving us the opportunity to work on a project as part of the curriculum.
Our sincere thanks to our guide Dr. M. Bhanu Sridhar, Associate Professor in the Department of Computer Science Engineering, for his stimulating guidance and assistance from the beginning of the project.
We are very much thankful to Prof. P. V. S. L. Jagadamba, Head of the Department of Computer Science Engineering, for her help and encouragement and also for providing the lab facility to complete the project work.
Our sincere thanks to our beloved Vice-Principal Prof. G. Sudheer for providing
the best faculty and lab facility throughout these academic years.
Our sincere thanks to our beloved principal Prof. K. Subba Rao for providing the
best faculty and lab facility throughout these academic years.
Our sincere thanks to our beloved Director Prof. Dr. E. V. Prasad for providing
the best faculty and lab facility throughout these academic years.
Finally, we are thankful to our entire faculty and our lab technicians for their good
wishes and constructive criticism, which led to the successful completion of the project.
TABLE OF CONTENTS
TOPICS
ABSTRACT
1. INTRODUCTION
1.1 Motivation of the Project
1.2 Problem Definition
1.3 Objective of project
1.4 Limitations of project
1.5 Organization of documentation
2. LITERATURE SURVEY
3. ANALYSIS
3.1 Introduction
3.2 Software Requirement Specification
3.2.1 User requirements
3.2.2 Software requirements
3.2.3 Hardware requirements
3.3 Content diagram of project
3.4 Flowchart
3.5 Conclusion
4. DESIGN
4.1 Introduction
4.2 Basic Building blocks of UML
4.2.1 UML diagrams
4.2.1.1 Class diagram
4.2.1.2 Use case diagram
4.2.1.3 State chart diagram
4.2.1.4 Sequence diagram
4.2.1.5 Activity diagram
4.3 Module design and organization
4.3.1 Main module
4.3.2 Admin module
4.3.3 Donator module
4.3.4 Agent module
4.4 Conclusion
5. IMPLEMENTATION AND RESULT
6. FEATURE EXTRACTION
7. TESTING AND VALIDATION
8. CONCLUSION
9. REFERENCES
ABSTRACT
Image processing in medical science is an emerging field which has produced many advanced techniques for the detection and analysis of particular diseases. The treatment of brain tumors has become more and more challenging in recent years due to the complex structure, shape and texture of the tumor. Therefore, this project implements different methodologies to segment a tumor from an MRI image and performs normalized cross-correlation in order to determine whether the extracted region is accurately identified as the tumor. Using the best segmentation technique, classification is then performed to determine whether the tumor is benign or malignant.
1. INTRODUCTION:
This report is organized as follows:
Chapter 1: Introduction describes the motivation and the objective of the developed project.
Chapter 2: Literature survey describes the existing segmentation techniques (thresholding, K-means clustering and fuzzy C-means), their drawbacks, and the proposed system.
Chapter 3: Analysis deals with the detailed analysis of the project: the Software Requirement Specification, which contains the user requirements, software requirements and hardware requirements. It also includes algorithms and flowcharts.
Chapter 4: Design includes UML diagrams along with an explanation of the system design and its organization.
Chapter 5: Implementation and Result describes the segmentation methods (Otsu's thresholding, watershed segmentation and MSER) and presents their results.
Chapter 6: Feature extraction describes the SURF algorithm used to extract features from the segmented image.
Chapter 7: Testing and validation gives the testing and validation details with the design of test cases and scenarios, along with validation screenshots.
2. LITERATURE SURVEY
2.1 INTRODUCTION
2.2 EXISTING SYSTEM
2.3 DISADVANTAGES OF EXISTING SYSTEMS
2.4 PROPOSED SYSTEM
2.5 CONCLUSION
2.1 Introduction:
A brain tumor is caused by the uncontrolled growth of cells, and these cells may affect normal brain activities. The two main groups of brain tumors are primary and metastatic. Primary or benign tumors are non-cancerous and originate from the tissues of the brain. Metastatic or malignant tumors are cancerous; they begin in other parts of the body (such as the breast or lungs) and spread to the brain. The GLOBOCAN 2019 report, issued by the International Association of Cancer Registries (IACR) associated with the WHO, revealed that 28,142 brain tumor cases are reported annually in India, with as many as 24,003 deaths.
2.2 Existing System:
The existing systems use the following segmentation techniques:
• Threshold
• K-means clustering
• Fuzzy c mean segmentation
THRESHOLD:
Drawbacks:
K-MEANS CLUSTERING:
Drawbacks:
FUZZY C-MEANS SEGMENTATION:
The fuzzy C-means algorithm is implemented, where C is the number of clusters to be formed.
Drawbacks:
2.5 Conclusion:
3. ANALYSIS
3.1 INTRODUCTION:
IMAGE PROCESSING:
In computer science, digital image processing is the use of a digital computer to process
digital images through an algorithm. As a subcategory or field of digital signal processing,
digital image processing has many advantages over analog image processing. It allows a
much wider range of algorithms to be applied to the input data and can avoid problems such
as the build-up of noise and distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modelled in the form of multidimensional systems. The generation and development of digital image processing have been shaped mainly by three factors: first, the development of computers; second, the development of mathematics (especially the creation and improvement of discrete mathematics theory); and third, the increasing demand for a wide range of applications in environment, agriculture, military, industry and medical science.
ADVANTAGES OF IMAGE PROCESSING:
1. It can be made available in any desired format (X-rays, photo negatives, improved images, etc.).
2. Digital imaging gives the operator the ability to post-process the image, i.e., to manipulate the pixel shades to correct image density and contrast.
3. Images can be stored in the computer memory and easily retrieved on the same computer screen.
DISADVANTAGES OF IMAGE PROCESSING:
1. If the computer crashes, pictures that have not been printed and filed into albums are lost.
2. Digital cameras, which are used for digital image processing, have some disadvantages like:
FUNCTIONAL REQUIREMENTS
NON-FUNCTIONAL REQUIREMENTS
Software requirements deal with defining the software resource requirements and prerequisites that need to be installed on a computer to provide optimal functioning of an application. These requirements or prerequisites are generally not included in the software installation package and need to be installed separately before the software is installed.
RAM : 4GB
It is the process of separating the different tumor tissues (solid or active tumor, edema and necrosis) from the normal brain tissues: gray matter, white matter and cerebrospinal fluid.
3.5 CONCLUSION
This chapter starts with an introduction to analysis in general, followed by the Software Requirement Specification, which contains the software and hardware requirements, i.e., the software required to implement and run the system and the hardware required for establishing and running the project.
4. DESIGN
4.1 INTRODUCTION
Software design is the process of planning a new or modified system. The design step produces the understanding and procedural details necessary for implementing the system recommended in the feasibility study. Analysis specifies what a new or modified system does; design specifies how to accomplish the same. Design is essentially a bridge between the requirement specification and a final solution satisfying the requirements. It is a blueprint or a solution for the system. The design step produces a data design, an architectural design and a procedural design.
The design process for a software system has two levels. At the first level, the focus is on deciding which modules are needed for the system, the specifications of these modules, and how the modules should be interconnected. This is what is called system design or top-level design. At the second level, the internal design of the modules, i.e., how the specifications of each module can be satisfied, is decided.
It is the first level of design which produces the system design, defining the components needed for the system and how these components interact with each other. Its focus is on deciding which modules are needed for the system, the specifications of these modules, and how the modules should be interconnected.
Logic Design
The logic design of this application shows the major features and how they are related to one another. The detailed specifications are drawn up on the basis of the user requirements. The outputs, inputs and relationships between the variables are designed in this phase.
Input Design
The input design is the bridge between the users and the information system. It specifies the manner in which data enters the system for processing. It can ensure the reliability of the system and produce reports from accurate data, or it may result in the output of erroneous information. While designing the input, the following points have to be taken into consideration:
Output Design
Each and every output presented in the system is result-oriented.
this application for users is the output. Efficient output design improves the usability and
acceptability of the system and also helps in decision making. Thus the following points are
considered during the output design:
How should the information be arranged in an acceptable manner?
How should the status be maintained at each and every point of time?
How should the output be distributed to the different recipients?
The system, being user-friendly in nature, serves to fulfill the requirements of the users.
Data Design
Data design is the first of the three design activities that are conducted during software engineering. The impact of data structure on program structure and procedural complexity causes data design to have a profound influence on software quality.
Architectural Design
The architectural design defines the relationships among the major structural components of the software, which the procedural design then refines into a procedural description of the software.
The use cases are the functions that are to be performed in the module.
An actor could be end-user of the system or an external system.
System Design
Grady Booch, James Rumbaugh and Ivar Jacobson collaborated to combine the best features of their individual object-oriented analysis and design methods into a unified method: the Unified Modeling Language. Version 1.0 of the Unified Modeling Language was released in January 1997. The main parts of UML are based on the Booch, OMT and OOSE methods.
One of the primary goals of UML is to create a modeling language usable by both humans and machines.
Use Case Diagrams: it shows a set of use cases, and how actors can use them.
Class Diagrams: describe the structure of the system, divided into classes with different connections and relationships.
Sequence Diagrams: it shows the interaction between a set of objects, through the
messages that may be dispatched between them.
State Chart Diagram: state machines, consisting of states, transitions, events and
activities.
Activity Diagrams: shows the flow through a program from a defined start point to an
end point.
Object Diagrams: show a set of objects and their relationships; this is a snapshot of instances of the things found in the class diagram.
Component Diagrams: show organizations and dependencies among a set of components. These diagrams address the static implementation view of the system.
Deployment Diagrams: show the configuration of run-time processing nodes and
components that live on them.
Some of the diagrams that help in the diagrammatic approach to object-oriented software engineering are:
Class Diagrams
Sequence Diagrams
Activity Diagrams
Using these diagrams, we can show the entire system: the working of the system, the flow of control and sequence of flow, the state of the system, and the activities involved in the system.
In the diagram, classes are represented with boxes which contain three parts:
1. The top part contains the name of the class. It is printed in bold and centered, and the first letter is capitalized.
2. The middle part contains the attributes of the class. They are left-aligned and the first
letter is lowercase.
3. The bottom part contains the methods the class can execute. They are also left-aligned
and the first letter is lower case. In the design of a system, a number of classes are
identified and grouped together in a class diagram which helps to determine the static
relations between those objects. With detailed modeling, the classes of the conceptual
design are often split into a number of subclasses.
Relationships: A relationship is a general term covering the specific types of logical connections found on class and object diagrams. UML shows the following relationships:
5. Generalization: The Generalization relationship ("is a") indicates that one of the two related classes (the subclass) is considered to be a specialized form of the other (the super type), and the super class is considered a 'Generalization' of the subclass. The UML graphical representation of a Generalization is a hollow triangle shape on the super class end of the line (or tree of lines) that connects it to one or more subtypes. The generalization relationship is also known as the inheritance or "is a" relationship. Generalization can only be shown on class diagrams and on use case diagrams.
7. Dependency: Dependency is a weaker form of bond which indicates that one class
depends on another because it uses it at some point in time. One class depends on another if
the independent class is a parameter variable or local variable of a method of the dependent
class.
Use cases
A use case describes a sequence of actions that provide something of measurable
value to an actor and is drawn as a horizontal ellipse.
Actors
An actor is a person, organization, or external system that plays a role in one or more interactions with your system. Actors are drawn as stick figures.
Associations
Associations between actors and use cases are indicated in use case diagrams by solid lines. An association exists whenever an actor is involved in an interaction described by a use case. Associations are modeled as lines connecting use cases and actors to one another, with an optional arrowhead on one end of the line. The arrowhead is often used to indicate the direction of the initial invocation of the relationship or to indicate the primary actor within the use case.
The relationships between use cases are:
1. Include
2. Extend
3. Generalization
1. Include:
In one form of interaction, a given use case may include another. Include is a directed
relationship between two use cases, implying that the behavior of the included use case is
inserted into the behavior of the including use case.
2. Extend:
In another form of interaction, a given use case (the extension) may extend another. This
relationship indicates that the behavior of the extension use case may be inserted in the
extended use case under some conditions. The notation is a dashed arrow from the extension
to the extended use case, with the label <<extend>>.
3. Generalization:
4.2.1.3 STATE CHART DIAGRAM
At any given time, an object is in a particular state. One way to characterize change in a system is to say that its objects change state in response to events and to time. The UML state chart diagram captures these kinds of changes: it presents the states an object can be in along with the transitions between the states, and shows the starting point and end point of a sequence of state changes.
A rounded rectangle represents a state, and a solid line with an arrowhead represents a transition. The arrowhead points to the state being transitioned into. A solid circle symbolizes the starting point, and a bull's eye symbolizes the end point.
In this system, for example, the user supplies an MRI image as input; the system then passes through the segmentation, feature extraction and classification states before reporting whether the tumor is benign or malignant.
An activity diagram illustrates the dynamic nature of a system by modeling the flow of control from activity to activity. An activity represents an operation on some class in the system that results in a change in the state of the system. Typically, activity diagrams are used to model workflows, business processes and internal operations. An activity diagram is a special kind of state chart diagram that shows the flow from activity to activity within the system. Activity diagrams address the dynamic view of the system. The easiest way to visualize an activity diagram is to think of a flowchart of code.
The flowchart is used to depict the business logic flow and the events that cause decisions and actions in the code to take place. An activity diagram represents the business and operational workflow of a system. Its main elements are:
• Initial node
• Activity
• Flow/edge
• Fork
• Join
• Condition
• Decision
• Merge
• Partition
• Flow final
Executable, atomic computations are called action states. An action state is represented using a lozenge shape, inside which you may write any expression. Action states cannot be decomposed. Furthermore, action states are atomic, meaning that events may occur but the work of the action state is not interrupted. Activity states, in contrast, are not atomic, meaning that they may be interrupted, and they take some duration to complete. However, there is no notational distinction between an action state and an activity state.
Graphical Representation
Transitions
When the action or activity of a state completes, flow of control passes immediately to the
next action or activity state. A transition is represented as a simple directed line.
Graphical Representation
Object
An object has state, behavior and identity. The structure and behavior of similar objects are
defined in their common class. Each object icon that is not named is referred to as a class
instance. The object icon is similar to a class icon except that the name is underlined.
UML Diagrams:
a. Use Case Diagram:
b. Class Diagram:
c. State-chart Diagram:
5. IMPLEMENTATION AND RESULT
OTSU'S THRESHOLDING:
In computer vision and image processing, Otsu's method, named after Nobuyuki Otsu, is used to perform automatic image thresholding. In its simplest form, the algorithm returns a single intensity threshold that separates pixels into two classes: foreground and background. This threshold is determined by minimizing intra-class intensity variance or, equivalently, by maximizing inter-class variance. Otsu's method is a one-dimensional discrete analog of Fisher's discriminant analysis, is related to the Jenks optimization method, and is equivalent to a globally optimal k-means performed on the intensity histogram.
The optimal threshold is calculated as follows. The first step is to consider the histogram of the image; the histogram represents the intensities of all the pixels. We then need to choose a threshold value whose intra-class variance is minimum or, equivalently, whose inter-class variance is maximum. Intra-class variance is the variance between the pixels of the same class, whereas inter-class variance is the variance between the classes. So, we need to minimize the intra-class variance and maximize the inter-class variance.
We compute the within-class variance for all threshold values. A threshold of 2 implies that the first two histogram bars are put in the background class and the rest are placed in the foreground class, so all pixels with these two intensities are made black and the rest are made white. Consider, for example, a threshold value of 3. In order to calculate the within-class variance, we need to compute the weight, mean and variance of the foreground class and the background class. The weight is calculated as the sum of the frequencies of the class intensities divided by the total number of pixels.
That is how these quantities are computed, and the within-class variance is calculated as the weighted sum of the background and foreground variances.
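For illustration, the within-class variance for a candidate threshold t can be written as

\sigma_w^2(t) = w_b(t)\,\sigma_b^2(t) + w_f(t)\,\sigma_f^2(t)

where w_b and w_f are the background and foreground weights and \sigma_b^2 and \sigma_f^2 the corresponding class variances. A minimal Python sketch of the exhaustive search over thresholds follows; the function and variable names are our own, and an 8-bit grayscale image is assumed.

import numpy as np

def otsu_threshold(image):
    # Histogram of an 8-bit image, normalized to a probability distribution.
    hist, _ = np.histogram(image.ravel(), bins=256, range=(0, 256))
    prob = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, np.inf
    for t in range(1, 256):
        w_b, w_f = prob[:t].sum(), prob[t:].sum()    # class weights
        if w_b == 0 or w_f == 0:
            continue
        mu_b = (levels[:t] * prob[:t]).sum() / w_b   # background mean
        mu_f = (levels[t:] * prob[t:]).sum() / w_f   # foreground mean
        var_b = ((levels[:t] - mu_b) ** 2 * prob[:t]).sum() / w_b
        var_f = ((levels[t:] - mu_f) ** 2 * prob[t:]).sum() / w_f
        within = w_b * var_b + w_f * var_f           # weighted within-class variance
        if within < best_var:
            best_var, best_t = within, t
    return best_t

Pixels below the returned threshold go to the background class and the rest to the foreground class, exactly as described above.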
Watershed segmentation:
Maximally Stable Extremal Region:
The concept can be explained more simply by thresholding. All pixels below a given threshold are 'black' and all those above or equal to it are 'white'. Given a source image, if a sequence of thresholded result images is generated, where each image corresponds to an increasing threshold t, first a white image would be seen; then 'black' spots corresponding to local intensity minima would appear and grow larger. These 'black' spots eventually merge, until the whole image is black. The set of all connected components in the sequence is the set of all extremal regions. In that sense, the concept of MSER is linked to that of the component tree of the image. The component tree indeed provides an easy way of implementing MSER.
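A minimal sketch of region detection with OpenCV's MSER implementation is given below. This is an illustration only, not the project's exact code: the file name is a placeholder and the detector parameters are left at their defaults.

import cv2
import numpy as np

img = cv2.imread("mri_slice.png", cv2.IMREAD_GRAYSCALE)   # placeholder input
mser = cv2.MSER_create()                  # default delta and area limits
regions, _ = mser.detectRegions(img)      # each region is an Nx2 array of (x, y) points
mask = np.zeros_like(img)
for pts in regions:
    mask[pts[:, 1], pts[:, 0]] = 255      # paint the maximally stable regions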
Connected components:
Connected components, in a 2D image, are clusters of pixels with the same value that are connected to each other through either 4-pixel or 8-pixel connectivity.
8-Connectivity:
8-connected pixels are neighbours of every pixel that touches one of their edges or corners. These pixels are connected horizontally, vertically and diagonally. In addition to the 4-connected pixels, each pixel with coordinates (x ± 1, y ± 1) is connected to the pixel at (x, y).
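As a sketch of how this is used in practice, OpenCV can label the 8-connected components of a binary mask; keeping only the largest component is one common way to isolate a segmented region (the variable mask is assumed to be a binary image from a previous step, and the approach shown is an assumption rather than the project's exact procedure).

import cv2
import numpy as np

num_labels, labels = cv2.connectedComponents(mask, connectivity=8)
# Label 0 is the background; labels 1..num_labels-1 index the components.
sizes = [(labels == i).sum() for i in range(1, num_labels)]
largest = 1 + int(np.argmax(sizes))
tumor_mask = np.uint8(labels == largest) * 255   # keep only the biggest blob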
Extremal regions:
Extremal regions in this context have two important properties: the set is closed under,
• Continuous transformation of image coordinates, meaning the regions are detected regardless of warping or skewing of the image.
• Monotonic transformation of image intensities. The approach is of course sensitive to natural lighting effects such as a change of daylight or moving shadows.
Advantages of MSER:
Disadvantages of MSER:
Applications:
Code:
6. FEATURE EXTRACTION
What is feature extraction?
In machine learning, pattern recognition and in image processing, feature extraction starts
from an initial set of measured data and builds derived values (features) intended to be
informative and non-redundant, facilitating the subsequent learning and generalization steps,
and in some cases leading to better human interpretations. Feature extraction is related to
dimensionality reduction.
When the input data to an algorithm is too large to be processed and is suspected to be redundant, it can be transformed into a reduced set of features (also called a feature vector). Determining a subset of the initial features is called feature selection. The selected features are expected to contain the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the complete initial data.
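As an illustration of this idea only (the report does not prescribe this particular method), principal component analysis in scikit-learn reduces a redundant feature matrix to a smaller, more informative one:

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(20, 64)           # placeholder data: 20 samples, 64 features
pca = PCA(n_components=10)           # keep the 10 most informative directions
X_reduced = pca.fit_transform(X)     # now 20 samples with 10 features each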
Algorithm:
The SURF algorithm is based on the same principles and steps as SIFT, but the details in each step are different. The algorithm has three main parts: interest point detection, local neighborhood description, and matching.
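A typical usage sketch is shown below, assuming an OpenCV build in which the patented xfeatures2d contrib module is enabled; the file name and Hessian threshold are placeholders.

import cv2

img = cv2.imread("mri_slice.png", cv2.IMREAD_GRAYSCALE)
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
keypoints, descriptors = surf.detectAndCompute(img, None)
# descriptors holds one 64-dimensional vector per detected interest point.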
1. Detection:
SURF uses square-shaped filters as an approximation of Gaussian smoothing (the SIFT approach uses cascaded filters to detect scale-invariant characteristic points, where the difference of Gaussians (DoG) is calculated on progressively rescaled images). Filtering the image with a square is much faster if the integral image is used:
S(x, y) = \sum_{i=0}^{x} \sum_{j=0}^{y} I(i, j)
The sum of the original image within a rectangle can then be evaluated quickly using the integral image, requiring only an evaluation at the rectangle's four corners.
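Concretely, if A, B, C and D denote the top-left, top-right, bottom-left and bottom-right corners of the rectangle (the corner labels are ours), the sum of the pixels inside it is

\sum_{\text{rect}} I = S(A) + S(D) - S(B) - S(C)

so only four lookups into the integral image are needed, regardless of the rectangle's size.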
SURF uses a blob detector based on the Hessian matrix to find points of interest. The determinant of the Hessian matrix is used as a measure of local change around a point, and points are chosen where this determinant is maximal. In contrast to the Hessian-Laplacian detector by Mikolajczyk and Schmid, SURF also uses the determinant of the Hessian for selecting the scale, as is also done by Lindeberg. Given a point p = (x, y) in an image I, the Hessian matrix H(p, σ) at point p and scale σ is

H(p, \sigma) = \begin{pmatrix} L_{xx}(p, \sigma) & L_{xy}(p, \sigma) \\ L_{xy}(p, \sigma) & L_{yy}(p, \sigma) \end{pmatrix}

where L_{xx}(p, σ) is the convolution of the second-order Gaussian derivative with the image I at the point p, and similarly for L_{xy}(p, σ) and L_{yy}(p, σ).
The scale space is divided into a number of octaves, where an octave refers to a series of response maps covering a doubling of scale. In SURF, the lowest level of the scale space is obtained from the output of 9×9 filters.
Hence, unlike previous methods, scale spaces in SURF are implemented by applying box filters of different sizes. Accordingly, the scale space is analyzed by up-scaling the filter size rather than iteratively reducing the image size. The output of the above 9×9 filter is considered as the initial scale layer at scale s = 1.2 (corresponding to Gaussian derivatives with σ = 1.2). The following layers are obtained by filtering the image with gradually bigger masks, taking into account the discrete nature of integral images and the specific filter structure. This results in filters of sizes 9×9, 15×15, 21×21, 27×27, and so on. Non-maximum suppression in a 3×3×3 neighbourhood is applied to localize interest points in the image and over scales. The maxima of the determinant of the Hessian matrix are then interpolated in scale and image space with the method proposed by Brown et al. Scale space interpolation is especially important in this case, as the difference in scale between the first layers of every octave is relatively large.
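For reference, the scale s represented by an N×N box filter follows from the 9×9 base filter at s = 1.2:

s = 1.2 \times \frac{N}{9}

so the 15×15, 21×21 and 27×27 filters correspond to scales of 2.0, 2.8 and 3.6 respectively.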
3. Descriptor
The goal of a descriptor is to provide a unique and robust description of an image feature,
e.g., by describing the intensity distribution of the pixels within the neighbourhood of the
point of interest. Most descriptors are thus computed in a local manner, hence a description is
obtained for every point of interest identified previously.
The dimensionality of the descriptor has direct impact on both its computational complexity
and point-matching robustness/accuracy. A short descriptor may be more robust against
appearance variations, but may not offer sufficient discrimination and thus give too many
false positives.
The first step consists of fixing a reproducible orientation based on information from a
circular region around the interest point. Then we construct a square region aligned to the
selected orientation, and extract the SURF descriptor from it.
4. Orientation assignment
In order to achieve rotational invariance, the orientation of the point of interest needs to be
found. The Haar wavelet responses in both x- and y-directions within a circular
neighbourhood of radius 6s around the point of interest are computed, where s is the scale at
which the point of interest was detected. The obtained responses are weighted by a Gaussian
function centered at the point of interest, then plotted as points in a two-dimensional space,
with the horizontal response in the abscissa and the vertical response in the ordinate. The
dominant orientation is estimated by calculating the sum of all responses within a sliding
orientation window of size π/3. The horizontal and vertical responses within the window are
summed. The two summed responses then yield a local orientation vector. The longest such
vector overall defines the orientation of the point of interest. The size of the sliding window
is a parameter that has to be chosen carefully to achieve a desired balance between robustness
and angular resolution.
6. Matching
By comparing the descriptors obtained from different images, matching pairs can be found.
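A brute-force matching sketch with OpenCV is shown below; desc1 and desc2 are assumed to be SURF descriptor matrices obtained from two images with detectAndCompute.

import cv2

matcher = cv2.BFMatcher(cv2.NORM_L2)                  # Euclidean distance on descriptors
matches = matcher.match(desc1, desc2)
matches = sorted(matches, key=lambda m: m.distance)   # best (closest) matches first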
7. TESTING AND VALIDATION
8. CONCLUSION
From the results shown above, the given input produced accurate results for all three methods. The main motive of the three segmentation methods, i.e., Otsu's thresholding, watershed segmentation and MSER, is to segment the image and display the tumor region. The foreground region depicted in the resultant image is the tumor portion that we are trying to extract. Then, the normalized cross-correlation is computed between the resultant segmented image and the original MRI image. If the value is close to 1, then we can conclude that the tumor has been extracted efficiently.
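This check can be written as a short function; the following is a minimal sketch of zero-mean normalized cross-correlation between two equal-sized images, not the project's exact code.

import numpy as np

def normalized_cross_correlation(a, b):
    # Subtract the means so the score measures shape rather than brightness.
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    return (a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum())

A value close to 1 indicates that the segmented image matches the corresponding region of the original MRI image well.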
After evaluating the normalized cross-correlation values for all three methods, MSER proved to be the best segmentation method, with a value of 0.91. Next, feature extraction is performed to extract the relevant features from the image. The method used is Speeded Up Robust Features (SURF), which extracts the useful information from the image and returns it in the form of descriptors; these descriptors are used as features. Feature extraction is followed by classification, where the feature set is given as input to a classifier to predict whether the extracted tumor is benign or malignant. The feature set is passed to a Random Forest classifier, which constructs a model consisting of 100 decision trees by selecting random samples with replacement. If the classifier predicts the class as -1, the tumor is benign; if the class is +1, the tumor is malignant. The accuracy obtained using Random Forest classification is 95 percent, which corresponds to one false prediction out of 20 testing samples.
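A minimal sketch of this classification step with scikit-learn is given below, assuming each image has already been reduced to a fixed-length feature vector (the rows of X_train and X_test) with labels in {-1, +1}; the variable names are our own.

from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=100, bootstrap=True)  # 100 trees on bootstrap samples
clf.fit(X_train, y_train)
pred = clf.predict(X_test)               # -1 = benign, +1 = malignant
accuracy = (pred == y_test).mean()       # fraction of correct predictions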