[Image: input image]

I need the following:

[Image: output scheme, explaining the desired output]

The following steps need to be done:

  1. Process the input image.
  2. Split the image into multiple images based on the gaps between the horizontal lines of text.
  3. Store those images with consecutive numbering.
  4. Process each line image again and split it into more images based on the spaces between words.
  5. Store each word as its own image.

If possible, I need individual character images from the input image. I don't need text extraction, since the Tamil OCR tools are not deciphering the text properly.

  • Not a job for a plain image editor. An OCR program should be able to do that, or some intelligent image processing package such as opencv.
    – xenoid
    Commented Jul 4, 2017 at 15:52
  • You should state the language you are using, and you should ask a specific question for a particular problem. Since Stack Overflow hides the Close reason from you: "Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question."
    – jww
    Commented Jul 4, 2017 at 20:43
  • Thank you @xenoid, but please give some example commands. Since I am a novice with those programs, it would give me a proper head start. Commented Jul 6, 2017 at 12:35

1 Answer

Here is an approach - it is probably not perfect but you can maybe tweak it if you have more images available to test with.

The basic idea is to fatten up the individual letters so they touch each other but hopefully without bridging across to adjacent words. Then do a "Connected Components Analysis" to find the individual words of your original text as blobs.

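Here is the first step, fattening the letters with ImageMagick. Because the text is black on a white background, eroding the white regions makes the black letters thicker: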

convert text.png -threshold 50% -morphology erode diamond:4 step1.png

I am using morphology techniques above, but you could equally try blurring and thresholding techniques instead.

Now find the "blobs":

convert step1.png \
    -define connected-components:verbose=true        \
    -define connected-components:area-threshold=100  \
    -connected-components 8 -auto-level output.png

Sample Output

Objects (id: bounding-box centroid area mean-color):
  0: 1086x188+0+0 556.4,83.0 155156 gray(255)
  7: 364x65+128+118 281.8,142.6 9206 gray(0)
  6: 212x34+817+115 919.3,131.1 4691 gray(0)
  4: 231x33+73+76 184.4,92.3 4645 gray(0)
  2: 181x42+494+8 578.4,27.7 4399 gray(0)
  9: 209x31+608+118 713.1,132.9 3892 gray(0)
  17: 148x34+826+148 903.0,165.1 2932 gray(0)
  22: 132x34+20+153 84.1,169.1 2453 gray(0)
  20: 126x27+384+151 443.4,165.9 2404 gray(0)
  1: 91x42+396+8 440.3,29.0 2390 gray(0)
  18: 117x34+708+149 764.2,165.5 2350 gray(0)
  21: 104x33+509+151 560.3,167.6 2245 gray(0)
  23: 112x27+271+158 325.8,169.3 2159 gray(0)
  8: 100x33+507+118 558.2,134.6 1982 gray(0)
  19: 91x33+615+150 659.4,166.2 1888 gray(0)
  10: 55x25+73+121 100.0,134.4 920 gray(0)
  3: 28x29+361+12 373.8,27.5 456 gray(0)

Each line above corresponds to one blob, or hopefully one word of your original text. The first line is a header that tells you what the fields are. The second field on each subsequent line is the geometry, WIDTHxHEIGHT+X+Y: the blob's width and height followed by the X and Y offset of its top-left corner from the top-left corner of the image. After that come the centroid, the area in pixels and, last, the mean colour. The first blob listed (id 0, gray(255)) is the white background, which spans the whole image.

You don't need this next step, as I guess the lines of text describing each word are what you need, but by way of illustration, I can draw in the boxes on the original image:

convert "text.png" -stroke red -fill none -strokewidth 1 \
   -draw "rectangle 128,118 492,183"  \
   -draw "rectangle 817,115 1029,149" \
   -draw "rectangle 73,76 304,109"    \
   -draw "rectangle 494,8 675,50"     \
   -draw "rectangle 608,118 817,149"  \
   -draw "rectangle 826,148 974,182"  \
   -draw "rectangle 20,153 152,187"   \
   -draw "rectangle 384,151 510,178"  \
   -draw "rectangle 396,8 487,50"     \
   -draw "rectangle 708,149 825,183"  \
   -draw "rectangle 509,151 613,184"  \
   -draw "rectangle 271,158 383,185"  \
   -draw "rectangle 507,118 607,151"  \
   -draw "rectangle 615,150 706,183"  \
   -draw "rectangle 73,121 128,146"   \
   -draw "rectangle 361,12 389,41" result.png
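
Those rectangle arguments do not have to be typed by hand. As a rough sketch, each geometry WxH+X+Y maps to the corners X,Y and X+W,Y+H, so awk can generate the arguments. The function name geom_to_rect is hypothetical, and the input is assumed to be one geometry per line:

```shell
# Hypothetical helper: read "WxH+X+Y" geometries on stdin (one per line)
# and print the matching ImageMagick -draw "rectangle x0,y0 x1,y1" argument.
geom_to_rect() {
    awk '{
        split($1, a, /[x+]/)    # a[1]=W  a[2]=H  a[3]=X  a[4]=Y
        printf "-draw \"rectangle %d,%d %d,%d\"\n", a[3], a[4], a[3] + a[1], a[4] + a[2]
    }'
}

# Two geometries from the sample output above; they reproduce the
# first two rectangles of the command.
printf '%s\n' 364x65+128+118 212x34+817+115 | geom_to_rect
```

The printed lines would still need to be pasted, or assembled by a script, into the convert command.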

You don't actually need to create the intermediate images like I did above; I only did that to explain my technique. You can do it all in one go as follows:

convert text.png -threshold 50% -morphology erode diamond:4 \
    -define connected-components:verbose=true               \
    -define connected-components:area-threshold=100         \
    -connected-components 8 -auto-level output.png

The output image (output.png) is actually "labelled", by which I mean that all the pixels of each identified blob are coloured in a successively lighter shade of grey.


Note that there are other structuring element shapes and sizes that may give better results with your images, e.g.:

convert text.png -threshold 50% -morphology erode disk:3 result.png

See Anthony Thyssen's excellent introduction to morphology here.


Regarding your further question of splitting the image into its constituent parts, each containing a word, you could do the following...

Pipe the output of the previous convert command into awk to find all the lines that contain the word gray and print out the geometry of the box in field 2.

convert ... as above ... | awk '/gray/{print $2}'

Sample Output

1086x188+0+0
364x65+128+118
212x34+817+115
231x33+73+76
181x42+494+8
209x31+608+118
148x34+826+148
132x34+20+153
126x27+384+151
91x42+396+8
117x34+708+149
104x33+509+151
112x27+271+158
100x33+507+118
91x33+615+150
55x25+73+121
28x29+361+12
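
The first geometry in that list, 1086x188+0+0, is the white background blob (gray(255) in the verbose listing) rather than a word. As a sketch, assuming the text blobs are always reported as gray(0) as they are in the sample output, a stricter awk pattern keeps only those. The function name words_only is hypothetical, and other ImageMagick versions may print the colour differently:

```shell
# Hypothetical helper: read the verbose connected-components listing on stdin
# and print the geometry (field 2) of the black blobs only.
words_only() {
    awk '/gray\(0\)/ {print $2}'
}

# Two lines taken from the sample output above: the background line is dropped.
printf '%s\n' \
    '  0: 1086x188+0+0 556.4,83.0 155156 gray(255)' \
    '  7: 364x65+128+118 281.8,142.6 9206 gray(0)' | words_only
```

It would be used in place of the awk stage, i.e. convert ... as above ... | words_only.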

Now split that on the plus sign to separate the X and Y:

convert ... | awk '/gray/{split($2,a,"+");print a[1],a[2],a[3]}'

Sample Output

1086x188 0 0
364x65 128 118
212x34 817 115
231x33 73 76
181x42 494 8
209x31 608 118
148x34 826 148
132x34 20 153
126x27 384 151
91x42 396 8
117x34 708 149
104x33 509 151
112x27 271 158
100x33 507 118
91x33 615 150
55x25 73 121
28x29 361 12

Now sort by Y then X so that words come out in line order (Y counts down from the top) then word order (X counts across from the left):

convert ... | awk '/gray/{split($2,a,"+");print a[1],a[2],a[3]}' |
    sort -n -k3,3 -k2,2

Sample Output

1086x188 0 0
91x42 396 8
181x42 494 8
28x29 361 12
231x33 73 76
212x34 817 115
364x65 128 118
100x33 507 118
209x31 608 118
55x25 73 121
148x34 826 148
117x34 708 149
91x33 615 150
126x27 384 151
104x33 509 151
132x34 20 153
112x27 271 158

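Now pass the geometries into a loop, read them into bash variables, crop the original image, and name each word image with a simple index (i). Note that the first crop, word-0.png, will be the background blob (1086x188+0+0), i.e. the whole image.

convert ... | awk '/gray/{ split($2,a,"+");print a[1],a[2],a[3] }' |
   sort -n -k3,3 -k2,2 |
 { i=0
   while read -r g x y; do
          # Crop one blob from the original; +repage discards the saved canvas offset
          convert text.png -crop "${g}+${x}+${y}" +repage "word-${i}.png"
          ((i+=1))
   done }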

Note that if your text is even slightly rotated, say 1 degree anticlockwise, the words at the right end of any given line will have a smaller Y coordinate than those on the left, so they may come out before them when sorted. You may therefore need to round the Y coordinate to the nearest 10, 20 or 40 in awk so that, for example, a word 168 pixels from the top and another 169 pixels from the top are both sorted as though they were 170 pixels from the top and come out on the same line.
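
One way the rounding might look is sketched below. It assumes the space-separated "WxH X Y" lines produced by the awk step; the function name sort_by_line and the default tolerance of 20 pixels are hypothetical choices, not part of ImageMagick:

```shell
# Hypothetical helper: sort "WxH X Y" lines into reading order, treating
# Y values that round to the same multiple of $1 (default 20) as one text line.
sort_by_line() {
    awk -v tol="${1:-20}" '{
        band = int(($3 + tol / 2) / tol) * tol    # round Y to the nearest multiple of tol
        print band, $2, $1, $2, $3                # prefix the sort keys: band, then X
    }' | sort -k1,1n -k2,2n | cut -d' ' -f3-      # sort, then strip the two key columns
}

# Three words from the bottom line of the sample output; with a tolerance
# of 40 they share one band and come out left to right (X = 20, 271, 826).
printf '%s\n' '148x34 826 148' '132x34 20 153' '112x27 271 158' | sort_by_line 40
```

Rounding can still split a line whose Y values straddle a band boundary, so the tolerance probably needs tuning to roughly the line spacing of your own images.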

  • Mark!!! Impressive. I am halfway to what I am expecting. Is it possible to extract the red-boxed images? I can understand the WIDTHxHEIGHT in the verbose output. Do I need to look up the X Y OFFSET manually? I have edited the question again and added one image, so please go through it and give me a better solution. Thank you!! Commented Jul 6, 2017 at 13:04
  • I have added some more at the end of my original answer to help you chop the image into individual words. Commented Jul 6, 2017 at 16:26
