Change number of epochs when resuming training. #18154

Open
1 task done
mazatov opened this issue Dec 10, 2024 · 4 comments
Labels
detect Object Detection issues, PR's question Further information is requested

Comments

mazatov commented Dec 10, 2024

Search before asking

  • I have searched the Ultralytics YOLO issues and discussions and found no similar questions.

Question

I initially started training with 200 epochs, but it is taking too long and the model is already performing well, so I would like to stop earlier. I cancelled the training and am resuming it with:

yolo task=detect mode=train resume model="$LAST_PT_PATH" data="$TRAIN_DATA" epochs = 130

but it still tries to train for 200 epochs.

I also tried editing args.yaml in the train folder, but it still stuck to 200 epochs. Where is the 200-epoch setting stored, and how can I edit it?

Additional

No response

UltralyticsAssistant (Member) commented

👋 Hello @mazatov, thank you for your interest in Ultralytics 🚀! We recommend taking a look at the Docs for detailed guidance and examples. You may find the sections on Python and CLI usage particularly useful as they explain training configurations in detail.

If this is a 🐛 Bug Report, please provide a minimum reproducible example that we can use to debug the issue.

From your description, it seems like you’re attempting to adjust the number of epochs while resuming training. Please ensure:

  1. You use the correct syntax for overriding the parameters when running the training command.
  2. Your modifications persist in the correct locations—e.g., args.yaml or explicitly passed via the command line.

If this is a ❓ Question, please share the following to help us assist you better:

  • The exact command you're using
  • Any relevant training logs or outputs
  • The content of your args.yaml file

To explore similar topics or get direct help from the community, consider joining the Ultralytics community.

Upgrade

First, ensure you’re using the latest version of the ultralytics package. Issues are often resolved in newer updates, so upgrading might solve this problem:

pip install -U ultralytics

Make sure you're running this in a Python>=3.8 environment with PyTorch>=1.8 installed.

Environments

For running YOLO, verified environments include:

Status

Ultralytics CI

When this badge is green, all Ultralytics CI tests are passing. CI runs validate all YOLO Modes and Tasks across macOS, Windows, and Ubuntu systems on a 24-hour cycle and after every commit.

🤖 This is an automated response, but an Ultralytics engineer will follow up with you shortly to provide further assistance!

Collaborator commented

You can't change epochs while resuming.
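
On where the 200 comes from: as far as I know, Ultralytics checkpoints such as last.pt carry the training arguments alongside the weights, and a resumed run reads them from there rather than from the command line or args.yaml. The sketch below is one way to inspect that value. The `train_args` key and the helper names `load_checkpoint` and `stored_epochs` are assumptions made for illustration, not a documented API, so check them against your installed version.

```python
def load_checkpoint(path):
    """Load a checkpoint dict on CPU.

    Only open files you trust: this unpickles arbitrary Python objects.
    """
    # Lazy import so stored_epochs() below works without PyTorch installed.
    import torch

    # weights_only=False is needed to load full pickled checkpoints on recent
    # PyTorch; older PyTorch releases may not accept this argument.
    return torch.load(path, map_location="cpu", weights_only=False)


def stored_epochs(ckpt, key="train_args"):
    """Return the epochs value saved in a checkpoint dict, or None if absent.

    ASSUMPTION: the training arguments are stored as a dict under
    ckpt["train_args"]. Verify this for your ultralytics version.
    """
    args = ckpt.get(key) or {}
    return args.get("epochs")
```

With torch and ultralytics installed, `print(stored_epochs(load_checkpoint("path/to/last.pt")))` should show the value the resumed run will use. The ultralytics package has to be importable because the checkpoint pickles the model class.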

mazatov commented Dec 10, 2024

Is there anything I can do? The final models are much smaller, so I'd like to get them, if possible, without training much longer.

Collaborator commented

You can run this:

from ultralytics.utils.torch_utils import strip_optimizer

strip_optimizer("path/to/best.pt")

It will reduce the size of the models.
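
A minimal sketch of the same call that also reports the size change. It assumes `strip_optimizer` rewrites the given file in place, so it works on a copy and leaves the original checkpoint available for resuming. The helper names `file_size_mb`, `stripped_path`, and `strip_copy`, and the `_stripped` suffix, are hypothetical and not part of ultralytics.

```python
import shutil
from pathlib import Path


def file_size_mb(path):
    """Return the size of a file in mebibytes."""
    return Path(path).stat().st_size / (1024 * 1024)


def stripped_path(weights, suffix="_stripped"):
    """Build the copy's path, e.g. best.pt -> best_stripped.pt."""
    src = Path(weights)
    return src.with_name(f"{src.stem}{suffix}{src.suffix}")


def strip_copy(weights):
    """Copy a checkpoint, strip the copy, and return (path, MB before, MB after)."""
    # Lazy import so the helpers above work without ultralytics installed.
    from ultralytics.utils.torch_utils import strip_optimizer

    dst = stripped_path(weights)
    shutil.copy2(weights, dst)  # keep the original checkpoint untouched
    before = file_size_mb(dst)
    strip_optimizer(str(dst))  # same call as in the comment above
    after = file_size_mb(dst)
    return dst, before, after
```

For example, `strip_copy("path/to/best.pt")` would leave best.pt as it was and write best_stripped.pt next to it. A stripped file is meant for inference or export; I would not expect it to be resumable, which is why the sketch works on a copy.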
