How to load model with load_state_dict ? #18097

Closed
1 task done
aymuos15 opened this issue Dec 7, 2024 · 8 comments
Labels
question Further information is requested

Comments


aymuos15 commented Dec 7, 2024

Search before asking

  • I have searched the Ultralytics YOLO issues and discussions and found no similar questions.

Question

weights = f"{HOME}/runs/detect/train70/weights/best.pt"
model = YOLO(weights)

This works fine! But I am in a situation where I cannot use it.

The alternative I found: ultralytics/yolov5#1441 (comment)

But this does not work; a few layers throw a size-mismatch error:


model = YOLO('yolo11n.yaml')
model.nc = 8 
model.model.load_state_dict(torch.load(weights)['model'].state_dict())

How do I go about this?

Additional

No response

@UltralyticsAssistant
Member

👋 Hello @aymuos15, thank you for your interest in Ultralytics 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a ❓ Question, please provide as much information as possible, including a detailed explanation of your use case and full error logs, as well as verifying compatibility with the latest ultralytics version. See the suggested upgrade command below.

For this specific inquiry regarding model loading using load_state_dict and its associated errors, it is important to ensure that:

  • The model architecture (yolo11n.yaml) and the weights (best.pt) are compatible.
  • Any custom modifications, such as altering model.nc, are correctly integrated and do not cause layer mismatches.

Feel free to share exact error messages or provide details about how these layers mismatch to help us gain further insight into the issue. An Ultralytics engineer will also assist you soon to provide more guidance 🙂.

Join the Ultralytics community where it suits you best. For real-time chat, head to Discord 🎧. Prefer in-depth discussions? Check out Discourse. Or dive into threads on our Subreddit to share knowledge with the community.

Upgrade

Upgrade to the latest ultralytics package, including all requirements, in a Python>=3.8 environment with PyTorch>=1.8 to ensure your issue is not already resolved in the latest version:

pip install -U ultralytics


@aymuos15
Author

aymuos15 commented Dec 7, 2024

model.model.load_state_dict(torch.load(weights)['model'].state_dict(), strict=False)

Even this throws the same error.
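
For context on why strict=False does not help here: in PyTorch it only relaxes the check for missing or unexpected keys. A tensor whose name matches but whose shape differs is still reported as an error. The following is a minimal sketch with two toy layers, which are stand-ins chosen for illustration and not the YOLO model:

```python
import torch.nn as nn

# Two stand-in layers with identical parameter names ("weight", "bias") but
# different output sizes, mimicking an 8-class checkpoint and an 80-class model.
trained = nn.Linear(16, 8)
fresh = nn.Linear(16, 80)

try:
    # strict=False tolerates missing/unexpected keys, not shape mismatches.
    fresh.load_state_dict(trained.state_dict(), strict=False)
    raised = False
except RuntimeError as err:
    raised = True
    print(type(err).__name__, "-", str(err).splitlines()[0])
```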

@glenn-jocher
Member

@aymuos15 it seems the weights file might not completely match the model architecture. Instead, consider using the load method provided by the Ultralytics YOLO class: model = YOLO('yolo11n.yaml').load(weights). It transfers the parameters that match the model and skips those that do not, instead of raising an error. For further details, refer to the Ultralytics model documentation.
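
To illustrate the general idea of a tolerant load in plain PyTorch, the sketch below keeps only the entries whose name and shape both match the target model, then loads them. It is an assumption-laden illustration of the technique, not the Ultralytics implementation; load_matching and the toy networks are hypothetical names introduced here.

```python
import torch.nn as nn


def load_matching(model: nn.Module, source_state: dict) -> tuple[int, int]:
    """Copy only entries whose name and shape both match; return (transferred, total)."""
    target_state = model.state_dict()
    matching = {
        name: tensor
        for name, tensor in source_state.items()
        if name in target_state and tensor.shape == target_state[name].shape
    }
    # strict=False is needed because the filtered dict no longer covers every key.
    model.load_state_dict(matching, strict=False)
    return len(matching), len(target_state)


# Hypothetical toy networks: same first layer, differently sized final layer.
trained = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 8))
fresh = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 80))

transferred, total = load_matching(fresh, trained.state_dict())
print(f"Transferred {transferred}/{total} items")
```

Entries that are skipped keep their freshly initialised values, so a count below the total means part of the model has not received the trained weights.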

Collaborator

Simply changing model.nc wouldn't work since the model is already created with the nc in the yaml. You will have to change nc in the yaml.
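
A small PyTorch analogy may make this concrete. TinyDetector below is a hypothetical stand-in, not an Ultralytics class: its head is sized from nc at construction, so rebinding the attribute afterwards leaves the layer untouched, while rebuilding with the right nc gives the expected shape.

```python
import torch.nn as nn


class TinyDetector(nn.Module):
    """Hypothetical stand-in for a model whose head size is derived from nc."""

    def __init__(self, nc: int = 80):
        super().__init__()
        self.nc = nc
        self.head = nn.Linear(16, nc)  # layer shape is fixed here, at construction


model = TinyDetector(nc=80)
model.nc = 8  # only rebinds the attribute; model.head is not rebuilt
shape_after_attr_change = tuple(model.head.weight.shape)

rebuilt = TinyDetector(nc=8)  # analogous to building from a config with the right nc
shape_after_rebuild = tuple(rebuilt.head.weight.shape)

print(shape_after_attr_change, shape_after_rebuild)
```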

@aymuos15
Author

aymuos15 commented Dec 9, 2024

Thanks a lot for the suggestions.

Could you please answer the following question:

When I run model = YOLO('yolo11n.yaml').load(weights), I get: Transferred 448/499 items from pretrained weights

But I do not get this message when I run model = YOLO(weights).

It is the exact same weights file in both cases.

Collaborator

Does the nc in your yaml match the nc the weights were trained with?
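
One way to check this kind of mismatch is to compare parameter shapes by name and look at which entries differ. The helper below is a rough sketch that works on plain name-to-shape mappings so it runs without PyTorch; shape_mismatches and the parameter names and shapes are made up for illustration and are not taken from a real YOLO checkpoint.

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for shared names whose shapes differ."""
    return {
        name: (shape, model_shapes[name])
        for name, shape in checkpoint_shapes.items()
        if name in model_shapes and model_shapes[name] != shape
    }


# Hypothetical parameter names and shapes, for illustration only.
checkpoint = {"backbone.conv.weight": (16, 3, 3, 3), "head.cls.weight": (8, 64, 1, 1)}
model = {"backbone.conv.weight": (16, 3, 3, 3), "head.cls.weight": (80, 64, 1, 1)}

diff = shape_mismatches(checkpoint, model)
for name, (ckpt_shape, model_shape) in diff.items():
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

With real PyTorch state dicts, the two mappings could be built as {k: tuple(v.shape) for k, v in state_dict.items()}. If the differing entries all sit in the classification part of the head, a mismatched nc is the likely cause.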

@aymuos15
Author

aymuos15 commented Dec 9, 2024

This fixed my error. Thanks a lot!

"Simply changing model.nc wouldn't work since the model is already created with the nc in the yaml. You will have to change nc in the yaml."

I double-checked now!

aymuos15 closed this as completed Dec 9, 2024
@glenn-jocher
Member

You're welcome! Glad it worked for you. Let us know if you have any further questions or run into any issues. Happy experimenting with YOLO!
