I am using MediaPipe to find the breed of a dog, if there is one, in a camera image. I first run ObjectDetector, and if it finds a dog, I send the contents of the bounding box to ImageClassifier, which runs a .tflite model trained on dog breeds. The cropping code is below.
Questions:
- I believe I get more accurate results by sending only the cropped region, rather than the whole camera image, to ImageClassifier. Is that correct?
- Can the cropping be made faster / smarter / simpler? (The format is RGBA_8888.)
- If I extend the crop above/below the bounding box to make it square, would that improve accuracy?
- If I then resize the square crop to 224 x 224 pixels (the input shape of my model), would that improve accuracy?
I have really tried to answer these questions myself through Googling and experimenting, but have not succeeded. Any advice from more experienced developers would be much appreciated.
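For the square-crop question, here is one way I imagine computing the crop rectangle: take the larger side of the detection box, center the square on the box, and clamp it so it stays inside the frame (the names `SquareRect` and `squareCrop` are my own, not MediaPipe API, and clamping to the frame is my assumption about the desired behavior):

```kotlin
// Hypothetical helper: expand a detection box to a square, clamped to the frame.
data class SquareRect(val left: Int, val top: Int, val size: Int)

fun squareCrop(l: Int, t: Int, w: Int, h: Int, frameW: Int, frameH: Int): SquareRect {
    // Side length: the larger box dimension, capped by the frame itself
    val side = minOf(maxOf(w, h), frameW, frameH)
    // Center the square on the center of the detection box
    val cx = l + w / 2
    val cy = t + h / 2
    // Clamp so the square never reaches outside the frame
    val left = (cx - side / 2).coerceIn(0, frameW - side)
    val top = (cy - side / 2).coerceIn(0, frameH - side)
    return SquareRect(left, top, side)
}
```

Whether the extra context pixels actually help depends on how the classifier was trained, so this only shows the geometry, not an accuracy claim.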
val det = detectionResult.detections()[saveIndex]
// Crop the image: copy the pixels inside the bounding box into a new MPImage
val l = det.boundingBox().left.toInt()
val t = det.boundingBox().top.toInt()
val w = det.boundingBox().width().toInt()
val h = det.boundingBox().height().toInt()
val smallBuffer = ByteBuffer.allocateDirect(w * h * 4)  // 4 bytes per RGBA pixel
val wtot = imageProxy.width  // width of the full source image in pixels
smallBuffer.rewind()
byteBuffer.rewind()
val pixel = ByteArray(4)
for (rowNumber in 0 until h) {
    for (pixelNumber in 0 until w) {
        // Source offset: full rows above the box, plus pixels to its left
        val offset = ((rowNumber + t) * wtot + (l + pixelNumber)) * 4
        pixel[0] = byteBuffer[offset]
        pixel[1] = byteBuffer[offset + 1]
        pixel[2] = byteBuffer[offset + 2]
        pixel[3] = byteBuffer[offset + 3]
        smallBuffer.put(pixel)
    }
}
// Convert smallBuffer to an MPImage via a Bitmap
smallBuffer.rewind()
val croppedBitmap = Bitmap.createBitmap(w, h, Bitmap.Config.ARGB_8888)
croppedBitmap.copyPixelsFromBuffer(smallBuffer)
val mpImage2 = BitmapImageBuilder(croppedBitmap).build()
// Run the image classifier with the cropped image as input
val classifierResult: ImageClassifierResult? = imageClassifier.classify(mpImage2)
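On the "can the cropping be made faster" question, one idea I have tried to reason about is copying a whole row at a time instead of pixel by pixel, since each cropped row is contiguous in the source buffer. A minimal sketch (the function name `cropRows` is mine; it assumes a tightly packed RGBA_8888 buffer with no row padding, i.e. rowStride == width * 4, which an ImageProxy does not always guarantee):

```kotlin
import java.nio.ByteBuffer

// Sketch: crop by bulk-copying one full row (w pixels * 4 bytes) per iteration.
// Assumes a packed 4-bytes-per-pixel source with no padding between rows.
fun cropRows(src: ByteBuffer, srcWidth: Int, l: Int, t: Int, w: Int, h: Int): ByteBuffer {
    val dst = ByteBuffer.allocateDirect(w * h * 4)
    val row = ByteArray(w * 4)  // reusable row-sized scratch array
    for (y in 0 until h) {
        // Jump to the start of the cropped part of source row (t + y)
        src.position(((t + y) * srcWidth + l) * 4)
        src.get(row)   // bulk-read w pixels
        dst.put(row)   // bulk-write them into the crop
    }
    dst.rewind()
    return dst
}
```

This does the same work as the nested loop above but with one `get`/`put` pair per row instead of per pixel, which avoids most of the per-element bounds checks.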