1.1 Overview: Brain Tissues Classification
INTRODUCTION
1.1 OVERVIEW
In medical image processing, the detection of new Multiple Sclerosis (MS) lesions, an inflammatory disease, on Magnetic Resonance Imaging (MRI) is important for assessing disease activity and for surgical purposes. Brain MRI is widely used for surgical and diagnostic purposes, with image processing used to produce the result.
Brain tissues classification
Brain tissue is classified into three types, that is,
1. White Matter (WM).
2. Gray Matter (GM).
3. Cerebrospinal Fluid (CSF).
Multiple Sclerosis(MS)
Multiple sclerosis (MS) is a chronic autoimmune disorder affecting movement, sensation, and bodily functions. It is caused by destruction of the myelin insulation covering nerve fibers (neurons) in the central nervous system (brain and spinal cord).
MS is a nerve disorder caused by destruction of the insulating layer surrounding
neurons in the brain and spinal cord. This insulation, called myelin, helps electrical signals pass
quickly and smoothly between the brain and the rest of the body. When the myelin is destroyed, nerve messages are sent more slowly and less efficiently. Patches of scar tissue, called plaques,
form over the affected areas, further disrupting nerve communication. The symptoms of MS
occur when the brain and spinal cord nerves no longer communicate properly with other parts of
the body. MS causes a wide variety of symptoms and can affect vision, balance, strength,
sensation, coordination, and bodily functions.
Multiple sclerosis affects more than a quarter of a million people in the United States.
Most people have their first symptoms between the ages of 20 and 40; symptoms rarely begin
before 15 or after 60. Women are almost twice as likely to get MS as men, especially in their
early years. People of northern European heritage are more likely to be affected than people of
other racial backgrounds, and MS rates are higher in the United States, Canada, and Northern
Europe than in other parts of the world. MS is very rare among Asians, North and South
American Indians, and Eskimos.
Causes and symptoms
Multiple sclerosis is an autoimmune disease, meaning its cause is an attack by the
body's own immune system. For unknown reasons, immune cells attack and destroy the myelin
sheath that insulates neurons in the brain.
The symptoms of multiple sclerosis may occur in one of three patterns:
The most common pattern is the "relapsing-remitting" pattern, in which there are
clearly defined symptomatic attacks lasting 24 hours or more, followed by complete or
almost complete improvement.
The period between attacks may be a year or more at the beginning of the disease, but
may shrink to several months later on. This pattern is especially common in younger
people who develop MS.
In the "primary progressive" pattern, the disease progresses without remission or with
occasional plateaus or slight improvements. This pattern is more common in older
people.
In the "secondary progressive" pattern, the person with MS begins with relapses and
remissions, followed by more steady progression of symptoms.
Between 10-20% of people have a benign type of MS, meaning their symptoms progress very
little over the course of their lives.
Because plaques may form in any part of the central nervous system, the symptoms of MS vary
widely from person to person and from stage to stage of the disease. Initial symptoms often include:
Muscle weakness, causing difficulty walking
Numbness, "pins and needles," or other abnormal sensations
Visual disturbances, including blurred or double vision.
A Bayesian classifier combines prior knowledge and observed data: the prior probability of a hypothesis is multiplied by the likelihood of the training data given that hypothesis. It is a probabilistic hypothesis: it outputs not only a classification, but a probability distribution over all classes.
Applications
Medical Diagnosis
Text classification
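The "prior multiplied by likelihood" idea above can be sketched in a few lines. This is an illustrative Python toy with made-up data (the project's own implementation is in MATLAB), not a production classifier:

```python
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (feature_value, class_label) pairs."""
    prior = Counter(lbl for _, lbl in samples)   # P(class), as counts
    likelihood = defaultdict(Counter)            # counts of x per class
    for x, lbl in samples:
        likelihood[lbl][x] += 1
    total = len(samples)

    def posterior(x):
        # P(class | x) is proportional to P(class) * P(x | class)
        scores = {}
        for lbl, n in prior.items():
            scores[lbl] = (n / total) * (likelihood[lbl][x] / n)
        z = sum(scores.values()) or 1.0
        return {lbl: s / z for lbl, s in scores.items()}  # full distribution

    return posterior

# Hypothetical voxel feature (0/1) labeled with tissue classes.
posterior = train([(0, "WM"), (0, "WM"), (1, "GM"), (1, "GM"), (0, "GM")])
probs = posterior(0)  # a probability distribution over classes, not just a label
```

Here `probs["WM"]` works out to 2/3: the WM prior (2/5) times its empirical likelihood of observing 0 (1.0), renormalized against GM's score.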
Figure 1.1: Multiple sclerosis in Brain MRI
corticosteroids. Physical therapy may help postpone or prevent specific disabilities.
The patient is encouraged to live as normal and active a life as possible.
Muhamad Abdel-Mottaleb and Baseem A. Scandura contributed a topology-preserving approach to the segmentation of brain images with multiple sclerosis lesions using T2-weighted techniques, but it has high computational complexity [6].
Guido Gerig, Daniel Welti, and Alan Colchester give brief information about exploring the discrimination power of the time domain for the segmentation and characterization of lesions in MRI data using spatial analysis. It provides high sensitivity for detecting fluctuating structures [12].
Phase 1
Chapter 1 gives the introduction to the project report and states the objective of the project, along with the literature survey and the organization of the report. The chapter also deals with the design methodology and its implementation using MATLAB.
CHAPTER 2
SOFTWARE DESCRIPTION
2. MATLAB
2.1 INTRODUCTION
2.2 SYNTAX
Commands are entered at the command prompt, >>, in the Command Window, one of the elements of the MATLAB Desktop. In this way, MATLAB can be used as an interactive mathematical shell.
Variables
Vectors / matrices
MATLAB is the "Matrix Laboratory", and so provides many convenient
ways for creating matrices of various dimensions. In the MATLAB vernacular, a
vector refers to a one dimensional (1×N or N×1) matrix, commonly referred to as
an array in other programming languages. A matrix generally refers to a multi-
dimensional matrix, that is, a matrix with more than one dimension, for instance,
an N×M, an N×M×L, etc., where N, M, and L are greater than 1. In other
languages, such a matrix might be referred to as an array of arrays, or array of
arrays of arrays, etc. Matrices can be defined by separating the elements of a row
with blank space or comma and using a semicolon to terminate each row. The list
of elements should be surrounded by square brackets [].
Many of the operations in Photoshop (or any other image editing software) are actually algorithms. With MATLAB, the user can create the complex algorithms that are applied in Photoshop.
Image matrices
Color images
The three matrices are stacked next to each other creating a 3 dimensional
m by n by 3 matrix. For an image with a height of 5 pixels and a width of 10 pixels, the resulting matrix in MATLAB would be 5 by 10 by 3 for a true color image.
Therefore, to represent this one range, only one color channel is needed. Thus we only need a 2-dimensional matrix, m by n by 1. MATLAB terms this type of matrix an Intensity Matrix, because the values of such a matrix represent intensities of one color.
Color maps
Pixel values
A double image has values in [0, 1], while a uint8 image has values in [0, 255], and therefore multiplying each double pixel value by 255 will yield a uint8 pixel value. Similarly, conversion from uint8 to double is done by dividing the uint8 value by 255.
MATLAB does have casting functions uint8() and double(). But these only change the data type and do not scale the values; scaling must be done manually. The reason MATLAB has two formats is that uint8 values take less storage. But in many older versions of MATLAB (version 6.0), direct arithmetic operations on uint8 values are not possible because of accuracy issues. Therefore, to perform arithmetic operations, the pixel values must be converted to double first. In version R2006a this is not an issue, as MATLAB simply changes uint8 to double first, does the operations, and then changes the values back to uint8.
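The scaling described above (as opposed to a bare cast) can be sketched as follows; this is an illustrative Python analogue of the behavior, not MATLAB toolbox code:

```python
def double_to_uint8(img):
    """Scale values in [0.0, 1.0] to integers in [0, 255] (with clamping)."""
    return [[min(255, max(0, round(v * 255))) for v in row] for row in img]

def uint8_to_double(img):
    """Scale integers in [0, 255] back to floats in [0.0, 1.0]."""
    return [[v / 255 for v in row] for row in img]

u8 = double_to_uint8([[0.0, 0.5, 1.0]])   # a one-row image
```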
lines of the built-in functions that MATLAB and its Image Processing Toolbox
provide.
Basic tasks
2. Another common way is to combine elements of the red, green and blue channels with weights, using weightR * RedChannel + weightG * GreenChannel + weightB * BlueChannel, where the weights satisfy weightR + weightG + weightB = 1. This method allows us to mix each channel selectively to get a grayscale image.
3. More recent versions of MATLAB have a function called rgb2gray(), which converts an RGB matrix to a grayscale matrix. This function is an implementation of the weighted combination listed above.
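The weighted combination of option 2 can be sketched as below. The luminance weights used here (0.299, 0.587, 0.114) are a common choice and an assumption on my part; any weights summing to 1 work. Python is used for illustration:

```python
def rgb_to_gray(r, g, b, wr=0.299, wg=0.587, wb=0.114):
    """Mix the three channel matrices pixel by pixel; wr + wg + wb should be 1."""
    return [[wr * r[i][j] + wg * g[i][j] + wb * b[i][j]
             for j in range(len(r[0]))]
            for i in range(len(r))]

# A single white pixel (255 in every channel) stays white in grayscale.
gray = rgb_to_gray([[255]], [[255]], [[255]])
```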
Before performing an operation such as a wavelet transform on the image, we must convert it into a suitable format. This section explains four common formats.
Intensity image
This is the equivalent to a “gray scale image” and this is the image we will
mostly work with in this course. It represents an image as a matrix where every
element has a value corresponding to how bright or dark the pixel at the
corresponding position should be colored. There are two ways to represent the
number that represents the brightness of the pixel.The double class.This assigns a
floating number between 0 and 1 to each pixel. The value 0 corresponds to black
and the value 1 corresponds to white. The other class is called uint8 which assigns
an integer between 0 and 255 to represent the brightness of a pixel.
The value 0 corresponds to black and 255 to white. The class uint8 only
requires roughly 1/8 of the storage compared to the class double. On the other
hand, many mathematical functions can only be applied to the double class. We
will see later how to convert between double and uint8.
Binary image
This image format also stores an image as a matrix but can only color a pixel black or white. It assigns a 0 for black and a 1 for white.
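Producing a binary image from an intensity image is usually done by thresholding, which can be sketched as below; the threshold of 0.5 is an arbitrary illustrative choice:

```python
def to_binary(img, threshold=0.5):
    """1 (white) where the pixel exceeds the threshold, else 0 (black)."""
    return [[1 if v > threshold else 0 for v in row] for row in img]

bw = to_binary([[0.2, 0.7],
                [0.5, 0.9]])
```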
Indexed image
This is a practical way of representing color images. In this course we
will mostly work with gray scale images but once you have learned how to work
with a gray scale image you will also know the principle how to work with color
images. An indexed image stores an image as two matrices. The first matrix has
the same size as the image and one number for each pixel.
The second matrix is called the color map and its size may be different
from the image. Each number in the first matrix is an index into the color map matrix, telling which color to use for that pixel.
RGB image
This is another format for color images. It represents an image with three
matrices of sizes matching the image format. Each matrix corresponds to one of
the colors red, green or blue and gives an instruction of how much of each of these
colors a certain pixel should use.
Multiframe image
Fundamentals
You can build matrices and arrays of floating-point and integer data,
characters and strings, and logical true and false states. Function handles connect
your code with any MATLAB function regardless of the current scope. Structures and cell arrays provide a way to store dissimilar types of data in the same array. There are 15 fundamental classes in MATLAB. Each of these classes is in the form of a matrix or array. With the exception of function handles, this matrix or array is a minimum of 0-by-0 in size and can grow to an n-dimensional array of any size. A function handle is always scalar (1-by-1). Here are a couple of basic MATLAB commands which do not require any toolbox for displaying an image.
Sometimes your image may not be displayed in gray scale even though
you might have converted it into a gray scale image. You can then use the
command colormap to “force” MATLAB to use a gray scale when displaying an
image.
CHAPTER 3
Vision allows humans to perceive and understand the world surrounding us.
Computer vision aims to duplicate the effect of human vision by electronically perceiving and
understanding an image. Giving computers the ability to see is not an easy task - we live in a
three dimensional (3D) world, and when computers try to analyze objects in 3D space, available
visual sensors (e.g., TV cameras) usually give two dimensional (2D) images, and this projection
to a lower number of dimensions incurs an enormous loss of information. In order to simplify the
task of computer vision understanding, two levels are usually distinguished; low level image
processing and high level image understanding. Low level methods usually use very little
knowledge about the content of images. High level processing is based on knowledge, goals, and
plans of how to achieve those goals. Artificial intelligence (AI) methods are used in many cases.
High level computer vision tries to imitate human cognition and the ability to make decisions
according to the information contained in the image.
GRAY
RGB
BINARY
2D
3D
Image Compression
Image Enhancement and Restoration
Measurement Extraction.
Image compression
Image compression is the application of data compression on digital
images. In effect, the objective is to reduce redundancy of the image data in order
to be able to store or transmit data in an efficient form. Image compression can be
lossy or lossless. Lossless compression is sometimes preferred for artificial images
such as technical drawings, icons or comics and also for medical images.
Reducing the color space to the most common colors in the image. The
selected colors are specified in the color palette in the header of the
compressed image. Each pixel just references the index of a color in the
color palette. This method can be combined with dithering to avoid
posterization.
Chroma subsampling. This takes advantage of the fact that the eye perceives
brightness more sharply than color, by dropping half or more of the
chrominance information in the image.
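As a rough sketch of what dropping chrominance information means, the code below keeps one chroma sample per 2x2 block by averaging (assuming even dimensions); real codecs differ in detail:

```python
def subsample_chroma(chroma):
    """Average each 2x2 block of a chroma plane into a single value."""
    h, w = len(chroma), len(chroma[0])
    return [[(chroma[i][j] + chroma[i][j + 1] +
              chroma[i + 1][j] + chroma[i + 1][j + 1]) / 4
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

# A 2x4 chroma plane shrinks to 1x2: only a quarter of the samples remain.
cb = subsample_chroma([[4, 4, 8, 8],
                       [4, 4, 8, 8]])
```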
Region of interest coding: Certain parts of the image are encoded with higher
quality than others. This can be combined with scalability (encode these parts first,
others later).
Meta information: Compressed data can contain information about the image
which can be used to categorize, search or browse images. Such information can
include color and texture statistics, small preview images and author/copyright
information.
For example, global contrast enhancement would affect the entire image,
whereas local contrast enhancement would improve the contrast of small details,
such as a face or a license plate on a vehicle. Some algorithms can remove
background noise without disrupting key components of the image. Following
image enhancement, measurement extraction is used to gather useful information
from an enhanced image. Image defects which could be caused by the digitization
process or by faults in the imaging set-up (for example, bad lighting) can be
corrected using Image Enhancement techniques.
Measurement extraction
Application areas
Car barrier detection
The future
Biological or human vision is still by far the most powerful and versatile
perceptual mechanism. Scientist Mead notes that “the visual system of a single
human being does more image processing than the entire world’s supply of
supercomputers.” However, some tasks such as image compression, enhancement
and data extraction from paper via technologies such as OMR, OCR and ICR, etc,
can now be performed on desktop computers available today. It should be noted that present computer methods, by contrast, can provide answers only to precisely stated, well-defined questions.
But a ray of hope surely comes from the distributed and parallel
computing paradigms that are expected to boost real-time response for many image
processing solutions. Image processing technology is waiting to address many
unanswered questions. There is every reason to believe that once this technology
achieves a level of competence that even modestly approaches that of human
vision, and at a competitive cost, the imaging applications will virtually explode.
CHAPTER 4
PROJECT DESCRIPTION
i) Histogram equalization.
In this first module, one of the training images is taken from a particular database. The image is a gray-level image, which has few levels compared to an RGB image.
Histogram equalization
The Bayesian classification represents a supervised learning method as well as a statistical method for classification. It assumes an underlying probabilistic model, which allows us to capture uncertainty about the model in a principled way by determining probabilities of the outcomes. It can solve diagnostic and predictive problems. Bayesian classification provides a useful perspective for understanding and evaluating many learning algorithms.
Figure 4.1: Schematic diagram of Bayesian classifier
Applications
Medical Diagnosis.
Text classification.
In this module, one of the test images is taken from a particular database.
Median filtering
Median filtering is a nonlinear operation often used in image processing
to reduce "salt and pepper" noise. A median filter is more effective than
convolution when the goal is to simultaneously reduce noise and preserve edges.
Median filters are particularly effective in the presence of impulse noise, also called 'salt-and-pepper' noise because of its appearance as white and black dots superimposed on an image. For every pixel, a 3x3 neighborhood with the pixel as center is considered, and the value of the pixel is replaced by the median of the pixel values in that neighborhood.
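The 3x3 median filtering just described can be sketched as follows (Python for illustration; border pixels are left unchanged here, which is one common convention but not the only one):

```python
from statistics import median

def median_filter(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]   # borders keep their original values
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # Replace the pixel by the median of its 3x3 neighborhood.
            neighborhood = [img[i + di][j + dj]
                            for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            out[i][j] = median(neighborhood)
    return out

noisy = [[10, 10, 10],
         [10, 255, 10],   # a single "salt" pixel
         [10, 10, 10]]
clean = median_filter(noisy)
```

The isolated bright pixel is removed because it is an outlier within its neighborhood, while a genuine edge (many similar neighbors) would survive.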
The Sobel operator finds the approximate gradient at each point in an input grayscale image. It uses a pair of 3x3
convolution masks, one estimating the gradient in the x-direction (columns) and
the other estimating the gradient in the y-direction (rows). A convolution mask is
usually much smaller than the actual image. As a result, the mask is slid over the image, manipulating a square of pixels at a time. The Sobel operator is based on convolving the image with a small, separable, integer-valued filter in the horizontal and vertical directions and is therefore relatively inexpensive in terms of computation.
It uses only integer values for the coefficients which weight the image intensities to produce the gradient approximation.
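The pair of integer-valued 3x3 Sobel masks and the gradient magnitude computation can be sketched as:

```python
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # gradient in x (columns)
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # gradient in y (rows)

def sobel(img):
    """Gradient magnitude at interior pixels; borders are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            gx = sum(GX[di + 1][dj + 1] * img[i + di][j + dj]
                     for di in (-1, 0, 1) for dj in (-1, 0, 1))
            gy = sum(GY[di + 1][dj + 1] * img[i + di][j + dj]
                     for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical step edge: the x mask responds, the y mask does not.
edge = sobel([[0, 0, 1, 1]] * 3)
```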
In this module, the test image is rotated geometrically to match the position of the training image. The test and training images are then compared to detect differences between them.
• New Pixel Value Computed from Neighboring Pixel Values
• Convolution of an N x N Matrix (Kernel) with the Image
Introduction
In the original paper on random forests, it was shown that the forest error rate
depends on two things:
The correlation between any two trees in the forest. Increasing the
correlation increases the forest error rate.
The strength of each individual tree in the forest. A tree with a low error rate
is a strong classifier. Increasing the strength of the individual trees decreases
the forest error rate.
Reducing m reduces both the correlation and the strength. Increasing it increases
both. Somewhere in between is an "optimal" range of m - usually quite wide.
Using the oob error rate (see below) a value of m in the range can quickly be
found.
It gives estimates of what variables are important in the classification.
It generates an internal unbiased estimate of the generalization error as the
forest building progresses.
It has an effective method for estimating missing data and maintains
accuracy when a large proportion of the data are missing.
It has methods for balancing error in class population unbalanced data sets.
Generated forests can be saved for future use on other data.
Prototypes are computed that give information about the relation between
the variables and the classification.
It computes proximities between pairs of cases that can be used in clustering,
locating outliers, or (by scaling) give interesting views of the data.
The capabilities of the above can be extended to unlabeled data, leading to
unsupervised clustering, data views and outlier detection.
Remarks
Random forests does not overfit. You can run as many trees as you want.
It is fast. Running on a data set with 50,000 cases and 100 variables, it produced 100 trees in 11 minutes on an 800 MHz machine. For large data sets the major
memory requirement is the storage of the data itself, and three integer arrays with
the same dimensions as the data. If proximities are calculated, storage requirements
grow as the number of cases times the number of trees.
To understand and use the various options, further information about how
they are computed is useful. Most of the options depend on two data objects
generated by random forests. When the training set for the current tree is drawn by
sampling with replacement, about one-third of the cases are left out of the sample.
This data is used to get a running unbiased estimate of the classification error as
trees are added to the forest. It is also used to get estimates of variable importance.
After each tree is built, all of the data are run down the tree,
and proximities are computed for each pair of cases. If two cases occupy the same
terminal node, their proximity is increased by one. At the end of the run, the
proximities are normalized by dividing by the number of trees. Proximities are
used in replacing missing data, locating outliers, and producing illuminating low-
dimensional views of the data.
The proportion of times a case is misclassified by the trees for which it is out-of-bag, averaged over all cases, is the oob error estimate. This has proven to be unbiased in many tests.
Variable importance
In every tree grown in the forest, put down the oob cases and count the
number of votes cast for the correct class. Now randomly permute the values of
variable m in the oob cases and put these cases down the tree. Subtract the number
of votes for the correct class in the variable-m-permuted oob data from the number
of votes for the correct class in the untouched oob data. The average of this number
over all trees in the forest is the raw importance score for variable m. If the values of this score from tree to tree are independent, then the standard error can be computed by a standard computation. The correlations of these scores between trees have been computed for a number of data sets and proved to be quite low. Therefore we compute standard errors in the classical way: divide the raw score by its standard error to get a z-score, and assign a significance level to the z-score assuming normality. If the number of variables is very large, forests can be run once with all the variables, then run again using only the most important variables from the first run. For each case, consider all the trees for which it is oob. Subtract
the percentage of votes for the correct class in the variable-m-permuted oob data
from the percentage of votes for the correct class in the untouched oob data. This is
the local importance score for variable m for this case, and is used in the graphics
program.
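The permutation importance procedure above can be sketched on a toy example. Here a single stand-in model plays the role of one tree, and both the model and the data are hypothetical:

```python
import random

def raw_importance(model, oob_x, oob_y, m, seed=0):
    """Votes for the correct class lost after permuting variable m."""
    correct = sum(model(x) == y for x, y in zip(oob_x, oob_y))
    permuted = [list(x) for x in oob_x]
    col = [x[m] for x in permuted]
    random.Random(seed).shuffle(col)        # destroy variable m's information
    for x, v in zip(permuted, col):
        x[m] = v
    correct_perm = sum(model(x) == y for x, y in zip(permuted, oob_y))
    return correct - correct_perm

model = lambda x: x[0] > 0                  # "tree" that uses only variable 0
xs = [(1, 5), (-1, 5), (1, 5), (-1, 5)]
ys = [True, False, True, False]
drop0 = raw_importance(model, xs, ys, 0)    # may lose votes
drop1 = raw_importance(model, xs, ys, 1)    # unused variable: never loses votes
```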
Gini importance
Every time a split of a node is made on variable m, the gini impurity criterion for the two descendent nodes is less than that of the parent node. Adding up the
gini decreases for each individual variable over all trees in the forest gives a fast
variable importance that is often very consistent with the permutation importance
measure.
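The quantity being summed, the gini impurity decrease at a single split, can be sketched as:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

# A perfect split drives both child impurities to zero.
drop = gini_decrease(["a", "a", "b", "b"], ["a", "a"], ["b", "b"])
```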
Interactions
This number is also computed under the hypothesis that the two variables
are independent of each other and the latter subtracted from the former. A large
positive number implies that a split on one variable inhibits a split on the other and
conversely. This is an experimental procedure whose conclusions need to be
regarded with caution. It has been tested on only a few data sets.
Proximities
These are one of the most useful tools in random forests. The
proximities originally formed a NxN matrix. After a tree is grown, put all of the
data, both training and oob, down the tree. If cases k and n are in the same terminal
node increase their proximity by one. At the end, normalize the proximities by
dividing by the number of trees.
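The proximity bookkeeping can be sketched as below; `leaf_ids` (which terminal node each case reaches in each tree) is hypothetical toy data:

```python
def proximities(leaf_ids, n_cases):
    """leaf_ids[t][k]: terminal node of case k in tree t."""
    prox = [[0.0] * n_cases for _ in range(n_cases)]
    for leaves in leaf_ids:
        for k in range(n_cases):
            for n in range(n_cases):
                if leaves[k] == leaves[n]:
                    prox[k][n] += 1          # same terminal node: +1
    n_trees = len(leaf_ids)
    return [[v / n_trees for v in row] for row in prox]  # normalize by trees

# Two trees, three cases: cases 0 and 1 share a leaf in both trees.
prox = proximities([[0, 0, 1], [2, 2, 3]], 3)
```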
Users noted that with large data sets, they could not fit an NxN matrix
into fast memory. A modification reduced the required memory size to NxT where
T is the number of trees in the forest. To speed up the computation-intensive
scaling and iterative missing value replacement, the user is given the option of
retaining only the nrnn largest proximities to each case. When a test set is present,
the proximities of each case in the test set with each case in the training set can
also be computed. The amount of additional computing is moderate.
The first way of replacing missing values is fast. If the mth variable is not categorical, the missing value is replaced by the median of all non-missing values of the mth variable in class j. If the mth variable is categorical, the replacement is the most frequent non-missing value in class j. These replacement values are called fills.
The second way of replacing missing values is computationally more
expensive but has given better performance than the first, even with large amounts
of missing data. It replaces missing values only in the training set. It begins by
doing a rough and inaccurate filling in of the missing values. Then it does a forest
run and computes proximities. If x(m,n) is a missing continuous value, estimate its
fill as an average over the non-missing values of the mth variables weighted by the
proximities between the nth case and the non-missing value case. If it is a missing
categorical variable, replace it by the most frequent non-missing value where
frequency is weighted by proximity. Now iterate: construct a forest again using these newly filled-in values, find new fills, and iterate again. Our experience is that 4-6 iterations are enough. When there is a test set, there are two different methods of replacement depending on whether labels exist for the test set. If they do, then the fills derived from the training set are used as replacements. If labels do not exist, then each case in the test set is replicated nclass times (nclass = number of classes).
The first replicate of a case is assumed to be class 1, and the class 1 fills are used to replace its missing values. The second replicate is assumed to be class 2, and the class 2 fills are used on it. This augmented test set is run down the tree; in each set of replicates, the one receiving the most votes determines the class of the original case.
[Block diagram: a training image or data set is preprocessed and passed to a voxel-wise Bayesian classifier for brain tissues; a test image is preprocessed and passed to a lesion-level random forest classifier, which outputs the new-lesion image.]
Color Transformation
The input design ensures that the input is acceptable to and understood by the staff. The goal of designing input data is to make data entry easy, logical, and as free from errors as possible. Data entry operators need to know the allocated space for each field and the field sequence, which must match that in the source document. The processor analyzes the input required; it is then accepted or rejected.
The system's outputs take the form of screens and reports. The output from the system is required to communicate the results of processing to the users; it is also used as a permanent copy for later verification.
CHAPTER 5
PROJECT IMPLEMENTATION
A gray-level brain MRI image is taken because a gray-scale image has few levels (0-255) compared to an RGB image. Histogram equalization is then applied to the gray-level image; it distributes (equalizes) the pixel values to provide a contrast-enhanced image. There are three types of brain tissue, namely white matter, gray matter, and cerebrospinal fluid (CSF), and the Bayesian classifier is used to classify them. Then a lesion-level random forest classifier is applied to detect the difference between the reference and follow-up images, detecting inflammatory disease, that is, injuries and unwanted cells.
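The histogram equalization step can be sketched as follows: an illustrative Python version of the standard algorithm (mapping each gray level through the normalized cumulative histogram), not the report's MATLAB code:

```python
from collections import Counter

def equalize(img, levels=256):
    """Spread pixel values over the full 0..levels-1 range via the CDF."""
    pixels = [v for row in img for v in row]
    hist = Counter(pixels)
    total = len(pixels)
    cdf, running = {}, 0
    for level in range(levels):
        running += hist.get(level, 0)
        cdf[level] = running / total          # normalized cumulative histogram
    return [[round(cdf[v] * (levels - 1)) for v in row] for row in img]

# A low-contrast image (values 100-103) is stretched toward 0-255.
out = equalize([[100, 101],
                [102, 103]])
```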
We apply a two-stage classifier so that it gives high performance and accuracy. Here, the Bayesian classifier is used to classify the brain tissues, and the random forest classifier is used to detect the new lesions or inflammations.
47
Figure 5.1: Training image
In this module, the brain MRI training image is a gray-level image because a gray-level image has few levels (0-255) compared to an RGB image.
HISTOGRAM EQUALIZATION
Figure 5.2: Visualize the histogram equalization
Figure 5.3: After the histogram equalization
Figure 5.4: Voxel-wise Bayesian classification
Figure 5.5: Test image
In this module, one of the test images is taken from the data set.
MEDIAN FILTERING
Figure 5.6: After the median filtering the test image
In this module, the median filter is applied to the test image, replacing pixel values to reduce noise and improve image quality.
Figure 5.7: Edge detecting of test image
TRAINING IMAGE
Figure 5.8: Training image
Figure 5.9: Rotated test image
Figure 5.10: Less level representation
Figure 5.11: Corner detection
Figure 5.12: Detection of inflammations in the brain image
5.12 OUTPUT IMAGE
Figure 5.13: Inflammation detection
CHAPTER 6
REFERENCES
Sclerosis Lesions in Brain MRI,” IEEE Trans. Med. Imag., vol. 32, no. 8, pp. 1345–1357, 2013.
and interobserver reproducibility,” Neuroradiology, vol. 41, no. 12, pp. 882–888,
1999.
[8] E. Ashton, C. Takahashi, M. Berg, A. Goodman, S. Totterman, and S. Ekholm,
“Accuracy and reproducibility of manual and semiautomated quantification of MS
lesions by MRI,” J. Magn. Reson. Imag., vol. 17, no. 3, pp. 300–308, 2003.
[9] B. Moraal et al., “Long-interval T2-weighted subtraction magnetic resonance imaging: A powerful new outcome measure in multiple sclerosis trials,” Ann. Neurol., vol. 67, no. 5, pp. 667–675, 2010.
[10] M. P. Sormani, B. Stubinski, P. Cornelisse, S. Rocak, D. Li, and N. D.
Stefano, “Magnetic resonance active lesions as individual-level surrogate for
relapses in multiple sclerosis,” Multiple Sclerosis J., vol. 17, no. 5, pp. 541–549,
2011.
[11] K. Van Leemput, F. Maes, D. Vandermeulen, A. Colchester, and P. Suetens,
“Automated segmentation of multiple sclerosis lesions by model outlier detection,”
IEEE Trans. Med. Imag., vol. 20, no. 8, pp. 677–688, Aug. 2001.
[12] X. Wei, S. K. Warfield, K. H. Zou, Y. Wu, X. Li, A. Guimond, J. P. Mugler,
R. R. Benson, L.Wolfson, H. L.Weiner, and C. R. Guttmann, “Quantitative
analysis of MRI signal abnormalities of brain white matter with high
reproducibility and accuracy,” J. Magn. Reson. Imag., vol. 15, no. 2, pp. 203–209,
2002.
[13] M. Bosc, F. Heitz, J.-P. Armspach, I. Namer, D. Gounot, and L. Rumbach,
“Automatic change detection in multimodal serial MRI: Application to multiple
sclerosis lesion evolution,” NeuroImage, vol. 20, no. 2, pp. 643–656, 2003.
[14] A. Zijdenbos, R. Forghani, and A. Evans, “Automatic ‘pipeline’ analysis of 3-D MRI data for clinical trials: Application to multiple sclerosis,” IEEE Trans. Med. Imag., vol. 21, no. 10, pp. 1280–1291, Oct. 2002.