-
Notifications
You must be signed in to change notification settings - Fork 678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nix lexer improvements #2551
Merged
Merged
Nix lexer improvements #2551
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
These examples are from the Nix Reference Manual https://nixos.org/manual/nix/unstable/language/index.html This lexer needs some work but before changing anything here is a base line for how it is currently working.
This wasn't working at all previously and matching pairs of single quotes would bread subsequent highlighting.
The previous regex matched the spaces and equals sign. This is now done using a lookahead. I'm not actually sure these cases should be Literal.String.Symbol, I certainly wouldn't want these highlighted as symbols, but I'm leaving this as-is for now.
This is checked before matching operators, I feel the it should be a bit more aware of the context, but it didn't raise any issues in the tests.
This matching maybe a bit loose but didn't cause any regression in the test samples.
These cases were found to emit error tokens when lexing real-world code. The lexing code is not pretty, but handles the cases we know of.
These cases found by lexing real-world code and looking for error tokens.
Also fixes some issues with the literal paths.
jeanas
reviewed
Nov 5, 2023
jeanas
approved these changes
Nov 5, 2023
Hi @jeanas, thanks for reviewing the PR. I've implemented your suggestions and would be happy to hear any more feedback. |
@tarnacious I'm going to merge it, just waiting for the CI to complete. |
Thank you! |
Awesome work! 🥳 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Improvements/fixes to the Nix lexer.
Adds test cases (snippets) to get a base-line for how the lexer was previously working. Subsequent commits add fixes and update/add test cases.
Fixes #1800 and some other issues I've noticed with the lexer.
I noticed issues when I posted some nix code blocks on my blog, the highlighting looked so broken I added a modified lexer to fix it.
I've been testing this against all the nix code in the nixpkgs repository which has around 30k nix files containing around 25 million lines of code and 100k lines of comments. I can't actually check the lexing is correct, but I can check its completing and it's not returning error tokens (before this pull request it was returning 4316 error tokens in 1081 files and I suspect it would be worse if the multi-line string literal were closing correctly).