Speed up the Tier 2 interpreter #112287

Closed
gvanrossum opened this issue Nov 20, 2023 · 2 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@gvanrossum
Member

gvanrossum commented Nov 20, 2023

The Tier 2 interpreter hasn't really been optimized carefully. While the "optimizer" pass is intended to make the Tier 2 micro-code faster through things like guard elimination or constantification, we should also look into just making the Tier 2 interpreter itself faster -- possibly by changing the representation of executable traces held in the executor (the current format is identical to the IR, which is rather verbose, using 16 bytes per uop!), and possibly by just carefully tuning the interpreter. (For example, if the space of micro-opcode ordinals could overlap the space of Tier 1 bytecode ordinals, we could fit the Tier 2 opcode in one byte.)
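
To make the size argument concrete, here is a minimal sketch in C. It assumes a verbose uop record made of a 16-bit opcode, a 16-bit oparg, a 32-bit target, and a 64-bit operand, which adds up to the 16 bytes mentioned above. The struct and function names are hypothetical, and the compact layout is only one illustrative possibility, not a proposal from this issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed verbose, IR-style record: every uop carries all four fields
 * whether it needs them or not. Names are hypothetical. */
typedef struct {
    uint16_t opcode;
    uint16_t oparg;
    uint32_t target;
    uint64_t operand;
} VerboseUop;

/* Hypothetical compact record with a one-byte opcode. A real design would
 * still need somewhere to put wide opargs and 64-bit operands, for example
 * in a side table or in trailing cache-like slots. */
typedef struct {
    uint8_t opcode;
    uint8_t oparg;
    uint16_t target;
} CompactUop;

/* Bytes needed to store a trace of n_uops instructions in each layout. */
static size_t verbose_trace_bytes(size_t n_uops)
{
    return n_uops * sizeof(VerboseUop);
}

static size_t compact_trace_bytes(size_t n_uops)
{
    return n_uops * sizeof(CompactUop);
}
```

Under these assumptions a 512-uop trace shrinks from 8192 bytes to 2048 bytes, at the cost of extra decoding work for uops that need wide arguments.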

Linked PRs

gvanrossum added a commit that referenced this issue Nov 20, 2023
This makes the Tier 2 interpreter a little faster.
By my calculation the gain is about 3%,
though I hesitate to claim an exact number.

This starts by doubling the trace size limit (to 512),
making it more likely that loops fit in a trace.

The rest of the approach is to only load
`oparg` and `operand` in cases that use them.
The code generator knows when these are used.

For `oparg`, it will conditionally emit
```
oparg = CURRENT_OPARG();
```
at the top of the case block.
(The `oparg` variable may be referenced multiple times
by the instruction's code block, so it must be in a variable.)

For `operand`, it will use `CURRENT_OPERAND()` directly
instead of referencing the `operand` variable,
which no longer exists.
(There is only one place where this will be used.)
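
The approach described in this commit message can be sketched with a toy trace interpreter in C. Everything here is an assumption made for illustration: the `Uop` layout, the opcode names, and the macro definitions are hypothetical and are not CPython's generated code. The point is only that a case reads `oparg` into a variable when it uses it, a case that needs `operand` reads it directly through `CURRENT_OPERAND()`, and other cases load neither.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical uop record for this sketch. */
typedef struct {
    uint16_t opcode;
    uint16_t oparg;
    uint32_t target;
    uint64_t operand;
} Uop;

/* Hypothetical opcodes for a one-register accumulator machine. */
enum { OP_LOAD_OPERAND, OP_SCALE_ADD, OP_NEG, OP_EXIT };

/* Assumed definitions: both read from the uop that was just fetched. */
#define CURRENT_OPARG()   (next_uop[-1].oparg)
#define CURRENT_OPERAND() (next_uop[-1].operand)

static int64_t run_trace(const Uop *trace)
{
    const Uop *next_uop = trace;
    int64_t acc = 0;
    int oparg;  /* only assigned in cases that use it */

    for (;;) {
        uint16_t opcode = next_uop->opcode;
        next_uop++;
        switch (opcode) {
        case OP_LOAD_OPERAND:
            /* Single use, so read the operand directly: no variable. */
            acc = (int64_t)CURRENT_OPERAND();
            break;
        case OP_SCALE_ADD:
            /* Emitted only for cases that reference oparg. */
            oparg = CURRENT_OPARG();
            acc = acc * oparg + oparg;  /* referenced twice */
            break;
        case OP_NEG:
            /* Uses neither oparg nor operand, so nothing is loaded. */
            acc = -acc;
            break;
        case OP_EXIT:
            return acc;
        default:
            return -1;  /* unknown opcode in this toy machine */
        }
    }
}
```

In this sketch the saving per uop is one or two loads that the case would otherwise never use, which is consistent with a small overall gain rather than a dramatic one.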
iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 27, 2023
aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
@hugovk
Member

hugovk commented Mar 15, 2024

Closing because the PR has been merged. Please re-open if there's more needed here.

hugovk closed this as completed Mar 15, 2024
@gvanrossum
Member Author

Thanks for the ping! Arguably the issue was wider, but we've decided to focus on JIT performance, and the Tier 2 interpreter's speed is no longer of great concern (we keep it because it's easier to debug the rest of the Tier 2 machinery this way). So let's keep it closed but mark as "not planned", which is closer to the truth.

gvanrossum closed this as not planned Mar 15, 2024
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024