A Robust and Fast Text Extraction in Images and Video Frames
1 Introduction
Text detection in images and video has attracted researchers' attention for many years,
and hundreds of thousands of hours of archival video are now being stored and
shared. Three types of text occur in images: indoor or outdoor scene text, which
naturally occurs in the field of view of the camera; caption, graphics, or artificial
text, which is superimposed on the video at editing time; and animation text, which
appears in Internet content such as CAPTCHAs. Text extraction from images is a
difficult task because of complicated backgrounds, ambiguous text character colors,
and varying stroke characteristics.
Two common approaches are used to capture the spatial connection of text pixels: one
is based on edge features and the other on connected-component features of text.
Edge-based methods [2] focus on areas with a high contrast between text and
background; the edges of the letters are identified and then merged. Connected-component
methods [3] use a bottom-up approach that iteratively merges sets of connected pixels
using a homogeneity criterion, leading to the creation of flat zones or connected
components.
Our proposed image text extraction system (shown in Fig. 1) extracts text regions
from an image and can be broadly divided into three basic stages:
S. Unnikrishnan, S. Surve, and D. Bhoir (Eds.): ICAC3 2011, CCIS 125, pp. 342–348, 2011.
© Springer-Verlag Berlin Heidelberg 2011
(1) detection of the text region in the image, (2) localization of the region, and
(3) extraction of the output character image. Detection identifies any possible text
in the image, and localization further refines the text regions by eliminating
non-text regions. Finally, the text extraction step generates an output image with
white text against a black background.
In this paper we focus on text extraction from four types of images with the help of
a line edge detector. The paper is organized as follows: Section 2 presents the
proposed algorithm, Section 3 reports the experimental results, and Section 4 gives
the conclusions.
2 Proposed Algorithm
In this section, the processing steps of the proposed text extraction method are
presented. Our aim is to build a fast and robust text detection system that is able to
handle still images with complex backgrounds. As Figure 1 shows, the proposed
algorithm consists of three main stages, which are described below.
In our proposed method, images are next convolved with directional filters, using line
detection masks at different orientations, to detect line edges in the horizontal (0° or
180°) and vertical (90° or 270°) directions [1]. Text regions can be expected to have
higher edge strength in these directions. The line detection masks used, shown in
Figure 2, enhance the text edges; a threshold is then calculated. If the threshold on
the detected edges is set to an appropriate value, the remaining weak edges can be
filtered out.
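The directional filtering and thresholding described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: Figure 2 is not reproduced here, so the widely used 3x3 horizontal and vertical line detection masks are assumed, and the threshold rule (a fixed fraction of the strongest response, `threshold_ratio`) is hypothetical. Plain Python lists are used to keep the sketch dependency-free.

```python
# Standard 3x3 line detection masks (assumed; the paper's Figure 2 is not shown).
HORIZONTAL_MASK = [[-1, -1, -1],
                   [2, 2, 2],
                   [-1, -1, -1]]
VERTICAL_MASK = [[-1, 2, -1],
                 [-1, 2, -1],
                 [-1, 2, -1]]


def apply_mask(gray, mask):
    """Return the absolute 3x3 mask response at each interior pixel (border = 0)."""
    rows, cols = len(gray), len(gray[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += mask[dr + 1][dc + 1] * gray[r + dr][c + dc]
            out[r][c] = abs(acc)
    return out


def line_edge_map(gray, threshold_ratio=0.5):
    """Combine both directional responses and keep only the strong edges."""
    horiz = apply_mask(gray, HORIZONTAL_MASK)
    vert = apply_mask(gray, VERTICAL_MASK)
    strength = [[max(h, v) for h, v in zip(hr, vr)]
                for hr, vr in zip(horiz, vert)]
    # Hypothetical threshold rule: a fixed fraction of the strongest response.
    threshold = threshold_ratio * max(max(row) for row in strength)
    return [[1 if s > threshold else 0 for s in row] for row in strength]


# Toy 5x5 grayscale image with a bright horizontal line on row 2.
image = [[0] * 5 for _ in range(5)]
image[2] = [255] * 5
edges = line_edge_map(image)
```

In this toy case only the interior pixels of the bright row survive the threshold; the weaker responses on the rows above and below it are filtered out, which is the behavior the text attributes to an appropriately chosen threshold.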
(a) (b)
Fig. 5. (a) Horizontal Projection profile (b) Vertical Projection profile for image in Figure 4
Step-10. Text segmentation is the next step. It starts with extraction of the
text image from the gray image. The segmentation process then concludes with a
procedure that enhances the text-to-background contrast in the text image.
Step-11. Commonly available OCR systems require an input image in which the
characters can be easily recognized. This step therefore produces an output image
with white text against a black background. The final text image is shown in Figure 8.
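As a rough illustration of Steps 10 and 11, the sketch below stretches the contrast of a grayscale text region and binarizes it so that text is white on a black background. The function name `to_white_on_black`, the midpoint threshold, and the assumption that text covers fewer pixels than the background are all hypothetical choices made for this example; the paper does not specify them.

```python
def to_white_on_black(gray_region):
    """Binarize a grayscale text region so text pixels are 255 and background is 0."""
    pixels = [p for row in gray_region for p in row]
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # Flat region: no contrast, so nothing can be treated as text.
        return [[0] * len(row) for row in gray_region]
    # Contrast stretch to the full 0..255 range, then split at the midpoint.
    stretched = [[(p - lo) * 255 // (hi - lo) for p in row] for row in gray_region]
    binary = [[255 if p >= 128 else 0 for p in row] for row in stretched]
    # Assumption: text covers fewer pixels than background; invert if needed.
    white = sum(p == 255 for row in binary for p in row)
    if white * 2 > len(pixels):
        binary = [[255 - p for p in row] for row in binary]
    return binary


# Dark text (value 40) on a light background (value 200).
region = [[200, 200, 200, 200],
          [200, 40, 40, 200],
          [200, 200, 200, 200]]
result = to_white_on_black(region)
```

Here the two dark pixels become white and the light background becomes black, which is the polarity an OCR stage would expect according to Step-11.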
3 Experimental Results
In order to evaluate the performance of the proposed method, 28 distinct test
images are used, with different font sizes, perspectives, and alignments under
different conditions. The results shown in Figures 9 to 13 indicate that our proposed
method can detect text with different font sizes, perspectives, and alignments, and
can detect text string characters under different conditions. Testing the algorithms
under changes of scale, lighting, and orientation serves to assess the robustness of
each technique under these conditions, and to find where each technique succeeds and
where it fails.
Figures 9 to 13 show that our proposed method performs well on a wide variety of
images, which indicates that it is a robust approach to finding text in images. The
performance of each method is evaluated in terms of the recall rate obtained and the
average time.
Recall Rate = [Correctly Detected Words / (Correctly Detected Words + False Negatives)] × 100 (1)
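Equation (1) can be computed directly, as in the minimal sketch below. The function name `recall_rate` and the word counts in the usage line are hypothetical and are not taken from the paper's experiments.

```python
def recall_rate(correctly_detected, false_negatives):
    """Recall rate in percent, following Equation (1)."""
    total = correctly_detected + false_negatives
    if total == 0:
        raise ValueError("no ground-truth words to evaluate")
    return 100.0 * correctly_detected / total


# Hypothetical counts: 90 words detected correctly, 10 words missed.
example = recall_rate(90, 10)
```

With these illustrative counts, 90 of 100 ground-truth words are found, so the recall rate is 90 percent.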
(a) (b)
Fig. 10. Indoor Image (a) Original image (b) Extracted image
(a) (b)
Fig. 11. Outdoor Image (a) Original image (b) Extracted image
The test set for this evaluation consists of 28 single images selected randomly
from the Internet (Google search engine). The experiment is carried out on the
Matlab 7.0 software platform, on a laptop PC equipped with an Intel P4 2.4 GHz
processor and 2 GB of memory. The total processing time for all 28 images, including
read-in and write-out, is less than 4 seconds.
(a) (b)
Fig. 12. News video frame (a) Original image (b) Extracted image
Table 1 shows the performance comparison of our proposed method with several
existing methods; our proposed method performs better in both average time and
recall rate. The method is fast because the line-based edge approach it uses has a
low computational cost.
4 Conclusions
In this paper, a fast and robust approach to text extraction in images is proposed.
The line-detection edge-based method in two directions is able to better represent the
intrinsic characteristics of text. Experimental results show that our method obtains
a 95.3% recall rate with an average text detection time of 3.64 seconds, which is
superior to the existing text detection methods without much increase in
computational cost.
References
1. Al-Eidan, R.B., Al-Braheem, L., El-Zaart, A.: Line Detection Based on the Basic Masks
and Image Rotation. In: 2nd International Conference on Computer Engineering and
Technology, pp. 465–469. IEEE, Chengdu (2010)
2. Liu, X., Samarabandu, J.: Multiscale Edge-Based Text Extraction from Complex Images. In:
International Conference on Multimedia and Expo, ICME 2006, pp. 1721–1724. IEEE, Los
Alamitos (2006)
3. Gllavata, J., Ewerth, R., Freisleben, B.: A Robust Algorithm for Text Detection in Images. In:
Proceedings of the 3rd International Symposium on Image and Signal Processing and
Analysis, ISPA, vol. 2, pp. 611–616. IEEE, Los Alamitos (September 2003)
4. Liu, X., Samarabandu, J.: An Edge-Based Text Region Extraction Algorithm for Indoor Mobile
Robot Navigation. In: International Conference on Mechatronics and Automation,
pp. 701–706. IEEE, Los Alamitos (2005)
5. Li, X., Wang, W., Jiang, S., Huang, Q., Gao, W.: Fast and Effective Text Detection. In: 15th
IEEE International Conference on Image Processing, pp. 969–972 (October 2008)
6. Ye, Q., Huang, Q., Gao, W., Zhao, D.: Fast and Robust Text Detection in Images and Video
Frames. Image and Vision Computing 23, 565–576 (2005)
7. Liu, Q., Jung, C., Kim, S., Moon, Y., Kim, J.: Stroke Filter for Text Localization in Video
Images. In: Proc. Int. Conf. Image Process., Atlanta, GA, USA, pp. 1473–1476 (October 2006)