PT: add uniform likelihood bucket batching #1661
Conversation
When I mentioned that I would try to optimize this, I was more thinking about writing a dedicated script just to do that. But this here could also be interesting. I'm not sure though that I would put this into RETURNN yet, while you are still experimenting with it. You can rather put this somewhere into your …
With "limit", you mean the max num seqs of a bucket? I still don't really understand why this is something you want to optimize for. Why does it matter that the segments are evenly distributed across the buckets? Is this to get better speed, better model performance, or both? I don't see how this affects speed, and I also don't see or understand how this affects model performance. I thought that optimizing the max seq lens of each bucket, i.e. the boundaries, w.r.t. minimizing the amount of padding would be a more reasonable thing to optimize. That would improve at least the speed.
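For illustration of the padding point above, here is a hypothetical sketch (not code from this PR; `padding_fraction` and all names are made up) of how the amount of padding depends only on the chosen bucket boundaries, treating each sequence as padded up to the upper boundary of its bucket:

```python
# Hypothetical illustration, not code from this PR. Each sequence is
# assumed to be padded up to the upper boundary of the bucket it falls
# into, which is an upper bound on the real per-batch padding.
import bisect

def padding_fraction(seq_lens, boundaries):
    """Fraction of frames that are padding, given bucket boundaries.

    Assumes the last boundary is >= the longest sequence length.
    """
    boundaries = sorted(boundaries)
    total_frames = padded_frames = 0
    for n in seq_lens:
        # Smallest boundary >= n, i.e. the bucket this sequence lands in.
        b = boundaries[bisect.bisect_left(boundaries, n)]
        total_frames += b
        padded_frames += b - n
    return padded_frames / total_frames

# A second, well-placed boundary drastically cuts the padding:
print(padding_fraction([10, 12, 95, 100], boundaries=[100]))      # ~0.46
print(padding_fraction([10, 12, 95, 100], boundaries=[12, 100]))  # ~0.03
```

Minimizing this quantity over the boundaries is the optimization the comment suggests, and it directly translates into compute saved per batch.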
It seems this PR includes all the epoch_continuous changes? Why? Are those in any way related here? I don't see how. If they are not related, can you remove them from this PR?
I don't think this PR is going to be merged in its current form.
This is a first attempt at automatically optimizing the bucket limits during training. After every subepoch, the limits are adjusted so that every bucket catches a roughly equal number of segments.
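A minimal sketch of what "adjust the limits after every subepoch" could look like (an assumption for illustration, not the PR's actual implementation; `equalize_bucket_boundaries` and all names are hypothetical): recompute the boundaries as empirical quantiles of the sequence lengths observed in the last subepoch, so each bucket catches about the same number of segments.

```python
# Hypothetical sketch, not the PR's implementation: recompute bucket
# boundaries as empirical quantiles of the observed sequence lengths,
# so each bucket catches a roughly equal number of segments.
import numpy as np

def equalize_bucket_boundaries(seq_lens, num_buckets):
    """New max-seq-len boundaries splitting the observed lengths
    into ``num_buckets`` buckets of roughly equal segment count."""
    lens = np.asarray(seq_lens)
    # Upper quantiles 1/k, 2/k, ..., 1 become the new bucket limits.
    qs = np.linspace(0.0, 1.0, num_buckets + 1)[1:]
    boundaries = np.quantile(lens, qs, method="higher")
    return np.unique(boundaries).tolist()

# E.g. after a subepoch with these observed lengths:
observed = [12, 15, 20, 33, 34, 35, 90, 120, 121, 130, 400, 410]
print(equalize_bucket_boundaries(observed, num_buckets=3))
# -> [34, 121, 410]: the three buckets then hold 5, 4 and 3 of the
#    12 segments, i.e. roughly equal counts.
```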
I am debating two more things: