I want to use a pretrained Faster R-CNN to extract RoI features for the objects detected in images.

I couldn't find an easy way of doing that, so I started reading the source code. The solution I came up with is the following:

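import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

# Placeholder batch so the snippet runs; replace with real images scaled to [0, 1].
batch_size, channels, height, width = 2, 3, 480, 640
images = torch.rand(batch_size, channels, height, width)  # (batch_size, channels, height, width)

with torch.no_grad():
    # Feature maps from the backbone, computed directly on the raw batch
    # (the model's own resize/normalize transform is not applied here).
    features = model.backbone(images)

    # Full detection pass; the returned boxes are in original image coordinates.
    output = model(images)

    list_of_boxes = [out["boxes"] for out in output]
    list_of_image_sizes = [(height, width) for _ in range(batch_size)]

    # Pool the feature maps over the predicted boxes, then apply the box head.
    features_of_predicted_boxes = model.roi_heads.box_roi_pool(features, list_of_boxes, list_of_image_sizes)
    features_of_predicted_boxes = model.roi_heads.box_head(features_of_predicted_boxes)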

So my questions to anyone who understands Faster R-CNN better than I do are:

  • Does this solution actually return the RoI features of the detected boxes?
  • Is there an easier way to do it?
