Wira Satyawan, M Octaviano Pratama, Rini Jannati, Gibran Muhammad, Bagus Fajar,
Haris Hamzah, Rusnandi Fikri, Kevin Kristian
Email: [email protected]
Abstract. Since its emergence in 2011, the Indonesian electronic ID card has been widely used for authentication and citizen identification. Several issues remain, such as the difficulty of detecting the ID card fields and of recognizing the character data on the card. In this research, we propose a technique to detect the electronic ID card using a combination of image processing and Optical Character Recognition (OCR). As a result, we obtain 98% accuracy in ID card detection using our image processing techniques and OCR. This work was embedded in a website interface used by an automotive company.
1. Introduction
Information Technology has developed quite rapidly, both in theory and in application. Much research in technology has been used to facilitate and accelerate human work; it has been implemented on computers and used to accomplish human work optimally. One example of the development of information technology in business is the purchase of goods. Currently, we do not have to visit a store to purchase goods; purchases can also be made online. In various businesses, companies need customer data that must be entered into a database for online or offline purchases. Customers who buy items online are usually asked for their data when registering an account, while customers who buy items offline are usually asked for their identity. Customer identity data can be obtained from their ID card; the ID card used in this case is the citizen ID card. Previously, customer data were entered manually. This is an inefficient process because entering the data one by one takes a lot of time. Therefore, we need a system that processes the data automatically.
Based on this problem, image processing can be used as an alternative to the manual input process. The process starts by extracting information from the ID card image. The image is then pre-processed to obtain the necessary parts. Furthermore, Optical Character Recognition (OCR) is performed in order to recognize the text in the images. OCR can recognize handwritten and printed text characters automatically through an optical mechanism. OCR is designed to process images consisting of text with little interference from non-text data, and its performance depends on the quality of the input document [1].
Building on the research above, this study compares the character recognition results for the name and NIK (identity number) on the ID card using two different Tesseract models. The first model uses training data created manually from five ID cards as a data set, trained on Tesseract 3.05 with the support of the QT-Box software, version 1.08. The second model uses the training data already included in Tesseract 4.0, which contains Indonesian text in different fonts, and uses Tesseract version 4.0 for OCR, a version that implements a neural-network model, namely LSTM.
The contributions of this research are: (1) we propose an image processing method for the detection of citizen ID cards, particularly Indonesian citizen ID cards; (2) we try various recognition models implemented in the Tesseract framework; (3) we propose a website interface that hosts this system for citizen ID card detection and recognition, which is useful for presenting scanning results. The final result of this research is used by an automotive company in Indonesia and runs on a website interface platform.
2. Related Works
The recognition of characters in images has grown considerably from year to year. In 2005, Wang et al used Gabor filters for character recognition on low-quality images and for Chinese characters [2]. In 2011, Dongre et al developed document segmentation using histogram analysis [3]. Sreedhar et al in 2012 developed image processing using morphological transformations and Weber's law, which enhance the contrast of an image [4]. Ryan et al [5] in 2015 conducted research on character recognition on the ID cards of Indonesian people using the Zhang-Suen algorithm, divided into two variants: a 3x3 algorithm and a pixel-by-pixel algorithm. Valiente et al [6] used Optical Character Recognition to detect ID cards, combined with cloud technology. Most previous research used image processing combined with machine learning to detect citizen ID cards. The appropriate selection of image processing and machine learning techniques can improve prediction accuracy.
3. Methods
In this research, we start from the data collection of citizen ID cards, then divide the data into training data and testing data. After collecting the appropriate data, pre-processing is performed to prepare the images used in the subsequent tasks. Then, text area extraction and segmentation are performed to determine the areas that should be taken automatically. In the last step, Optical Character Recognition (OCR) is used to predict the characters on the citizen ID card.
3.1. Pre-processing
The ID cards used in this research have a uniform size of 1654 x 2340 pixels per image. The pre-processing of the image is generally divided into three parts: grayscale conversion, thresholding, and morphological transformation.
3.1.1 Grayscale
Grayscale conversion is the process of converting an image that previously consisted of 3 RGB layers into a gray image that has a single layer. Converting the image to grayscale is used to obtain optimal binary image results.
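As a rough sketch of this step (the paper does not publish its code), the conversion can be done in one call with OpenCV in Python; the input file name below is a placeholder:

```python
# Minimal sketch of the grayscale step, assuming OpenCV; "id_card.jpg"
# is a placeholder file name, not from the paper.
import cv2

image = cv2.imread("id_card.jpg")               # 3-layer BGR image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # single-layer gray image
```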
3.1.2 Thresholding
The thresholding formula [7] is defined as:

$$g(x, y) = \begin{cases} 1, & \text{if } f(x, y) > T \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

Thresholding converts the image into a binary image by selecting a threshold. In this research, we set the threshold value to 100: pixels with intensity above the threshold become black, while the remaining pixels become white. Therefore, the characters on the ID card, which are originally black, change to white, while the other colours change to black.
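A minimal sketch of this inverted thresholding, assuming OpenCV; the call below is an assumption that reproduces the behaviour described above, not the authors' code:

```python
# Sketch of the thresholding step with T = 100; THRESH_BINARY_INV maps
# pixels above T to black (0) and the rest to white (255), so the dark
# characters come out white, as described in the text.
import cv2

gray = cv2.imread("id_card.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file
T = 100
_, binary = cv2.threshold(gray, T, 255, cv2.THRESH_BINARY_INV)
```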
3.1.3 Sobel
Most edge detection methods work on the assumption that an edge occurs where there is a discontinuity in the intensity function or a very steep intensity gradient in the image. Under this assumption, if one takes the derivative of the intensity values across the image and finds the points where the derivative is maximal, the edge can be located. The gradient is a vector whose components measure how rapidly the pixel values change with distance in the x and y directions. Thus, the components of the gradient may be found using the following approximations [8]:

$$\frac{\partial f(x, y)}{\partial x} = \Delta x = \frac{f(x + dx, y) - f(x, y)}{dx} \qquad (2)$$

$$\frac{\partial f(x, y)}{\partial y} = \Delta y = \frac{f(x, y + dy) - f(x, y)}{dy} \qquad (3)$$

where dx and dy measure distance along the x and y directions respectively.
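As an illustrative sketch (not from the paper), the gradient components of Eqs. (2) and (3) can be approximated with OpenCV's Sobel operator:

```python
# Sketch of Sobel edge detection; cv2.Sobel approximates df/dx and df/dy
# with 3x3 convolution kernels rather than the raw differences above.
import cv2
import numpy as np

gray = cv2.imread("id_card.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # horizontal gradient
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # vertical gradient
magnitude = np.sqrt(gx ** 2 + gy ** 2)           # gradient magnitude
edges = np.uint8(np.clip(magnitude, 0, 255))     # back to an 8-bit image
```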
3.1.5 Otsu
An image can be represented by a 2D gray-level intensity function f(x, y). The value of f(x, y) is the gray-level, ranging from 0 to L - 1, where L is the number of distinct gray-levels. Let the number of pixels with gray-level i be n_i and N be the total number of pixels in a given image; the probability of occurrence of gray-level i is then defined as [10]:

$$p(i) = \frac{n_i}{N} \qquad (10)$$

The average gray-level of the entire image is computed as [10]:

$$\mu_T = \sum_{i=0}^{L-1} i \, p(i) \qquad (11)$$
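A minimal sketch of Otsu's method, assuming OpenCV; OpenCV computes the gray-level probabilities p(i) of Eq. (10) internally and selects the threshold automatically:

```python
# Sketch of Otsu thresholding via OpenCV; the fixed value 0 passed below
# is ignored when THRESH_OTSU is set, and the chosen threshold is returned.
import cv2

gray = cv2.imread("id_card.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file
T, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu threshold:", T)
```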
3.3. Segmentation
Image segmentation is used to determine the parts of the text to be retrieved. In this research, the parts to be taken are the NIK and name characters on the ID card. We set the width and height of the kernel box together with its pixel coordinates, which then yields the desired cropped regions of the ID card.
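A sketch of this fixed kernel-box cropping; the box coordinates below are hypothetical placeholders, since the paper does not publish the exact values:

```python
# Sketch of segmentation by fixed kernel boxes; coordinates are
# illustrative only, not the values used in the paper.
import cv2

card = cv2.imread("id_card.jpg")  # card image at the uniform 1654x2340 size
# (x, y, width, height) for each field -- hypothetical values.
BOXES = {"nik": (330, 160, 1200, 110), "name": (330, 300, 1200, 110)}

crops = {field: card[y:y + h, x:x + w] for field, (x, y, w, h) in BOXES.items()}
for field, crop in crops.items():
    cv2.imwrite(f"{field}.png", crop)  # save each cropped field for OCR
```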
4. Results
In this research, we built two different models to recognize the name and NIK character sections on the ID card. The first model used manually created training data and was trained with Tesseract 3.05 and QT-Box software version 1.08. The second model used the training data included in Tesseract 4.0, which consists of Indonesian text in different fonts.
4.1. Model with Manually Created Training Data and OCR Tesseract 3.05
After the kernel was selected and the NIK and name fields on the ID card were segmented, the segmented NIK and name images were fed into the model trained manually on Tesseract version 3.05 with the aid of QT-Box version 1.08, and we obtained 100% prediction accuracy on the training data. Next, the model was tested on new test data; the NIK and name character recognition results on the test data are shown in Table 1.
Table 1. Character recognition results and accuracy for NIK and name on the test data, using training data created manually with QT-Box version 1.08.
There are some errors in the character recognition results of this model. This is due to the small amount of training data: several letters that appear in the test data are not present in the training data used.
4.2. Model with Tesseract 4.0 Training Data and OCR Tesseract 4.0
In the second experiment, we used the same pre-processing and segmentation steps as in the first experiment. The difference lies in the OCR engine: we used Tesseract version 4.0, which implements a neural-network model, namely LSTM.
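A minimal sketch of running the Tesseract 4.0 LSTM engine on a segmented field; the pytesseract wrapper is an assumption (the paper does not name its bindings), and "ind" is Tesseract's Indonesian language model:

```python
# Sketch of OCR on a segmented field with Tesseract 4.0 via pytesseract.
import cv2
import pytesseract

name_crop = cv2.imread("name.png")  # placeholder: a segmented name field
# --oem 1 selects the LSTM engine; --psm 7 treats the crop as one text line.
text = pytesseract.image_to_string(name_crop, lang="ind",
                                   config="--oem 1 --psm 7")
print(text.strip())
```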
The training data used are the default data obtained from Tesseract, containing Indonesian text in different fonts. To read the NIK number, we retrained the model on text data using the font of the NIK digits. Based on the model obtained from this training data, we tested on the same training data as in the first experiment and obtained 100% prediction accuracy for NIK but 98.6% prediction accuracy for name. After that, we tested using the same test data as in the first experiment and obtained the results shown in Table 2.
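As a hypothetical illustration of how such percentages could be computed (the paper does not define its metric), a simple character-level accuracy might look like this:

```python
# Hypothetical sketch of a character-level accuracy metric; this exact
# definition is an assumption, not taken from the paper.
def char_accuracy(predicted: str, truth: str) -> float:
    """Fraction of ground-truth characters matched position by position."""
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / len(truth) if truth else 1.0

# One misread character out of twelve gives roughly 91.7% accuracy.
print(char_accuracy("BUD1 SANTOSO", "BUDI SANTOSO"))
```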
Table 2. Character recognition results and accuracy for NIK and name on the test data, using the training data from Tesseract 4.0.
The NIK character recognition accuracy of both methods is 100%. The name recognition of both methods, however, is not fully accurate because some letter characters are misrecognized. This is because the Tesseract training data are not extensive enough for Tesseract to recognize every letter character.
5. Conclusion
The citizen ID card can be detected by using the proposed image processing techniques combined with OCR. The image processing techniques in this research consist of pre-processing, text area extraction, and segmentation, while OCR is proposed for character recognition. This research combines grayscale pre-processing with binary image processing techniques such as Sobel, morphological transformation, and Otsu. Text area extraction uses a kernel that identifies the text areas of the NIK and name on the citizen ID card. The experiments with training data built on Tesseract 4.0 show that the detection accuracy reaches between 90% and 100% using our proposed technique. We also created another model, with training data built using QT-Box, as a benchmark.
6. References
[1] Mithe R, Indalkar S and Divekar N 2013 Optical Character Recognition Int. J. Recent Technol.
Eng. 2 72–5
[2] Wang X, Ding X and Liu C 2005 Gabor filters-based feature extraction for character recognition
Pattern Recognit. 38 369–79
[3] Dongre V J and Mankar V H 2011 Devnagari document segmentation using histogram approach Int. J. Comput. Sci. Eng. Inf. Technol. 1
[4] Sreedhar K and Panlal B 2012 Enhancement of images using morphological transformations Int. J. Comput. Sci. Inf. Technol. 4
[5] Ryan M and Hanafiah N 2015 An Examination of Character Recognition on ID card using
Template Matching Approach Procedia Computer Science vol 59 pp 520–9
[6] Valiente R, Sadaike M T, Gutiérrez J C, Soriano D F and Bressan G 2016 A process for text recognition of generic identification documents over cloud computing Int. Conf. Image Process. Comput. Vision, Pattern Recognit. (IPCV) 4
[7] Sitthi A, Nagai M, Dailey M and Ninsawat S 2016 Exploring Land Use and Land Cover of
Geotagged Social-Sensing Images Using Naive Bayes Classifier Sustainability 8 921
[8] Vincent O R and Folorunso O 2009 A Descriptive Algorithm for Sobel Image Edge Detection
[9] Zeng M, Li J and Peng Z 2006 The design of Top-Hat morphological filter and application to
infrared target detection Infrared Phys. Technol. 48 67–76
[10] Ng H-F, Kheng C-W and Lin J-M A Weighting Scheme for Improving Otsu Method for
Threshold Selection