
I am new to PyTorch. I want to use ImageNet images to understand how much each pixel contributes to the model's prediction, via gradients. For this, I am trying to construct attention maps for my images. However, while doing so, I am encountering the following error:

<ipython-input-64-08560ac86bab>:2: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  images_tensor = torch.tensor(images, requires_grad=True)
<ipython-input-64-08560ac86bab>:3: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  labels_tensor = torch.tensor(labels)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-65-49bfbb2b28f0> in <cell line: 20>()
     18     plt.show()
     19 
---> 20 show_attention_maps(X, y)

9 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   2480         _verify_batch_size(input.size())
   2481 
-> 2482     return torch.batch_norm(
   2483         input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
   2484     )

RuntimeError: running_mean should contain 1 elements not 64

I have tried changing the image size in preprocessing and changing the model to resnet152 instead of resnet18. From the research I have done, my understanding is that the first batch norm layer's running statistics have 64 elements, while the input it now receives from my modified conv1 has only 1 channel. I am not sure how that can be changed.

My code is here:

import torch
import torch.nn as nn

model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
new_conv1 = nn.Conv2d(15, 1, kernel_size=1, stride=1, padding=112)
nn.init.constant_(new_conv1.weight, 1)
model.conv1 = new_conv1
model.eval()

for param in model.parameters():
    param.requires_grad = False

def show_attention_maps(X, y):
    X_tensor = torch.cat([preprocess(Image.fromarray(x)) for x in X], dim=0)
    y_tensor = torch.LongTensor(y)
    attention = compute_attention_maps(X_tensor, y_tensor, model)
    attention = attention.numpy()

    N = X.shape[0]
    for i in range(N):
        plt.subplot(2, N, i + 1)
        plt.imshow(X[i])
        plt.axis('off')
        plt.title(class_names[y[i]])
        plt.subplot(2, N, N + i + 1)
        plt.imshow(attention[i], cmap=plt.cm.gray)
        plt.axis('off')
        plt.gcf().set_size_inches(12, 5)
    plt.suptitle('Attention maps')
    plt.show()

show_attention_maps(X, y)

def compute_attention_maps(images, labels, model):
    images_tensor = torch.tensor(images, requires_grad=True)
    labels_tensor = torch.tensor(labels)
    predictions = model(images_tensor.unsqueeze(0))
    criterion = torch.nn.CrossEntropyLoss()
    loss = criterion(predictions, labels_tensor)
    model.zero_grad()
    loss.backward()
    gradients = images_tensor.grad
    attention_maps = torch.mean(gradients.abs(), dim=1)
    return attention_maps

Thank you very much in advance.

Edit: I changed my question because I was able to solve my previous problem by changing the resnet's conv1 (the new_conv1 = nn.Conv2d(...) line in my code), and I am still trying to compute attention maps.

Comment (Ivan, Apr 7 at 13:38): Please provide your code here, not as a link. Try to give a minimal reproducible code that raises the problem. Provide the full error backtrace.

1 Answer


You defined your conv layer to output a single channel, while in the original implementation conv1 outputs 64 channels (see the torchvision ResNet source). That's where the error comes from: the subsequent batch normalization layer, model.bn1, expects 64 channels, not 1.
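A minimal sketch of one way around this, assuming you only want to replace the first convolution and keep the rest of the pretrained network: leave out_channels at 64 so that model.bn1 (a BatchNorm2d(64)) still matches.

import torch
import torch.nn as nn

model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)

# Keep 64 output channels so the following BatchNorm2d(64) layer (model.bn1)
# still matches. The other arguments here mirror the original conv1; adjust
# them to your input, but leave out_channels at 64.
model.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

# Only if you still want constant weights, as in your snippet:
nn.init.constant_(model.conv1.weight, 1)

model.eval()

With a 64-channel conv1 in place, the batch_norm call no longer complains about running_mean, and compute_attention_maps can run past that point.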
