I am using MediaPipe to find the breed of a dog, if there is one, in a camera image. I first run ObjectDetector, and if it finds a dog, I send the contents of the bounding box to ImageClassifier, which uses a .tflite model trained on dog breeds. The code for the cropping is below.

Questions:

  1. I believe I get more accurate results as I only send the cropped part, not the whole camera image, to ImageClassifier. Right?
  2. Can the cropping be made faster / smarter / easier? (Format is RGBA_8888)
  3. If I add some parts of the image above/below the bounding box to make the crop square, would that improve the accuracy?
  4. If I resize the square-cropped image to 224 x 224 pixels = the input shape of my model, would that improve the accuracy?

I have really tried to answer these questions myself through googling and experimenting, but have not succeeded. Any advice from more experienced developers would be much appreciated.

det = detectionResult.detections()[saveIndex]
// Crop the image: Create a new mpImage with what is inside the bounding box
val l = det.boundingBox().left.toInt()
val t = det.boundingBox().top.toInt()
val w = det.boundingBox().width().toInt()
val h = det.boundingBox().height().toInt()
val size = w * h * 4
val smallBuffer = ByteBuffer.allocateDirect(size)
// Crop mpImage
val wtot = imageProxy.width
smallBuffer.rewind()
byteBuffer.rewind()
// Copy the box pixel by pixel; assumes tightly packed rows (row stride = wtot * 4 bytes)
val pixel = ByteArray(4)
for (rowNumber in 0 until h) {
    for (pixelNumber in 0 until w) {
        // Byte offset of this pixel in the full RGBA_8888 image
        val offset = ((rowNumber + t) * wtot + (l + pixelNumber)) * 4
        pixel[0] = byteBuffer[offset]
        pixel[1] = byteBuffer[offset + 1]
        pixel[2] = byteBuffer[offset + 2]
        pixel[3] = byteBuffer[offset + 3]
        smallBuffer.put(pixel)
    }
}
// Convert smallBuffer to mpImage
smallBuffer.rewind()
val bitmapBuffer2 = Bitmap.createBitmap(
    w, h, Bitmap.Config.ARGB_8888
)
bitmapBuffer2.copyPixelsFromBuffer(smallBuffer)
val mpImage2 = BitmapImageBuilder(bitmapBuffer2).build()

// Run Image Classifier with cropped image as input

val classifierResult: ImageClassifierResult? =
    imageClassifier.classify(mpImage2)
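
On question 2, one variant that might be worth trying is copying a whole row per iteration instead of one pixel at a time. The following is only a minimal sketch in plain Kotlin on `java.nio.ByteBuffer`, not a measured improvement: `cropRgba` is a hypothetical helper name, and `rowStride` is assumed to be the number of bytes per source row (`width * 4` when the rows are tightly packed).

```kotlin
import java.nio.ByteBuffer

// Hypothetical helper: copies a w x h rectangle out of an RGBA_8888 buffer, one row at a time.
// Assumes the rectangle lies fully inside the image and rowStride is the byte length of a source row.
fun cropRgba(src: ByteBuffer, rowStride: Int, l: Int, t: Int, w: Int, h: Int): ByteBuffer {
    val dst = ByteBuffer.allocateDirect(w * h * 4)
    val row = ByteArray(w * 4)
    // duplicate() shares the pixel data but has its own position, so src is left untouched
    val view = src.duplicate()
    for (y in 0 until h) {
        view.position((t + y) * rowStride + l * 4) // start of this row inside the box
        view.get(row)                              // bulk read of one cropped row
        dst.put(row)                               // bulk write into the output buffer
    }
    dst.rewind()
    return dst
}
```

With the variables from the code above, the call would look like `cropRgba(byteBuffer, wtot * 4, l, t, w, h)`. If the camera buffer has padding at the end of each row, the stride reported by the image plane would have to be passed instead of `wtot * 4`.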

1 Answer

To answer your first question: yes, your model will produce better results on cropped images. I have built many high-quality datasets and found that models reach much higher accuracy when the images are cropped so that the region of interest (ROI) makes up at least 50% of the pixels.

For some of your other questions, here are the rules I follow to produce a high-quality dataset. First I run a Google search and a Bing search and download the images, for example images of bald eagles. Next, a Python kernel sorts the downloaded images by area, largest first. You want the largest images you can get so that, after cropping to the ROI, the result is still fairly large. I crop each image so that at least 50% of its pixels come from the ROI. Large cropped images can then be resized while retaining the information that is useful to the model. The resized images do not need to be square.

On question 4: models require all input images to be the same size, so setting the size to 224 x 224 is fine. That is why you select the largest images first and then crop to the ROI: the crops can be resized to 224 x 224 and still contain enough information for the model to work with.

On question 2: I do not know what method you are using to crop the images (manually, or with something like YOLO?), so I have no advice on that.
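
To make the 50% rule concrete for question 3, here is a minimal sketch of how padding a bounding box out to a square changes the share of pixels that come from the ROI. It is plain Kotlin with no MediaPipe calls; `Crop`, `squareCrop` and `roiFraction` are hypothetical names introduced only for this illustration, and the image is assumed to have positive width and height.

```kotlin
import kotlin.math.max
import kotlin.math.min

// Hypothetical value type for a crop rectangle in pixel coordinates.
data class Crop(val left: Int, val top: Int, val width: Int, val height: Int)

// Grows a bounding box to a square around its center, then shifts it to stay inside the image.
fun squareCrop(l: Int, t: Int, w: Int, h: Int, imgW: Int, imgH: Int): Crop {
    val side = min(max(w, h), min(imgW, imgH)) // the square cannot be larger than the image
    val left = (l + w / 2 - side / 2).coerceIn(0, imgW - side)
    val top = (t + h / 2 - side / 2).coerceIn(0, imgH - side)
    return Crop(left, top, side, side)
}

// Fraction of the crop's pixels that come from the original bounding box.
fun roiFraction(l: Int, t: Int, w: Int, h: Int, crop: Crop): Double {
    val overlapW = max(0, min(l + w, crop.left + crop.width) - max(l, crop.left))
    val overlapH = max(0, min(t + h, crop.top + crop.height) - max(t, crop.top))
    return overlapW.toDouble() * overlapH / (crop.width.toDouble() * crop.height)
}
```

For a 200 x 100 box, the square crop is 200 x 200 and the ROI fraction drops to one half, so a box with a 2:1 aspect ratio sits right at the 50% limit. This only quantifies the trade-off; it does not tell you which crop your particular model classifies more accurately.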

  • Thanks, this is great advice for model training. But my questions are more about using the trained model. MediaPipe Studio at mediapipe-studio.webapps.google.com/studio/demo/… makes it easy to verify that a cropped image is better. But I still wonder about questions 2-4. Commented Jan 24 at 16:01
