JS: Add ECMAScript 2024 v Flag Operators for Regex Parsing
#18899
PR Overview
This pull request introduces support for ECMAScript 2024 regex constructs under the new "v" flag. Key changes include:
- New AST node classes for character class operations (Subtraction, QuotedString, Intersection, Union)
- Enhancements to RegExpParser to conditionally enable nested character classes, new operators, and quoted string parsing with a fallback mechanism when errors are encountered
- New test inputs covering quoted strings, unions, intersections, subtractions, and nested character classes
Reviewed Changes
| File | Description |
|---|---|
| javascript/extractor/src/com/semmle/js/ast/regexp/CharacterClassSubtraction.java | New AST node for subtraction operator in character classes |
| javascript/extractor/src/com/semmle/js/ast/regexp/CharacterClassQuotedString.java | New AST node for handling quoted string escapes |
| javascript/extractor/src/com/semmle/js/ast/regexp/CharacterClassIntersection.java | New AST node for intersection operator in character classes |
| javascript/extractor/src/com/semmle/js/ast/regexp/CharacterClassUnion.java | New AST node for union operator in character classes |
| javascript/extractor/src/com/semmle/js/parser/RegExpParser.java | Extended parser functionality to support the new "v" flag and corresponding regex operations |
| javascript/extractor/src/com/semmle/js/extractor/ASTExtractor.java and RegExpExtractor.java | Updated extraction logic to accommodate new AST node types and conditional flag handling |
Copilot reviewed 31 out of 31 changed files in this pull request and generated 2 comments.
asgerf
left a comment
Excellent work! I have a couple of comments to keep you busy during the week 😄
Co-authored-by: Asgerf <asgerf@github.com>
erik-krogh
left a comment
Nice work 👍
I didn't look through it thoroughly, I assume Asger did that.
Did you run database creation on the latest main of https://github.com/babel/babel and https://github.com/tc39/test262?
Those projects contain all kinds of valid and invalid syntax, so it's a nice test of whether something is horribly wrong.
No, that is not expected. However, that seems to be unrelated to this PR.
Co-authored-by: Erik Krogh Kristensen <erik-krogh@github.com>
asgerf
left a comment
🎉
Co-authored-by: Asger F <asgerf@github.com>
This pull request adds support for parsing ECMAScript 2024 `v` flag operators, including:
- Union: Example: `/[[abc][cz]]/v`
- Intersection (`&&`): Matches characters common to both sets. Example: `/[[abc]&&[cz]]/v`
- Subtraction (`--`): Removes characters from a set. Example: `/[[abc]--[cz]]/v`
  Mixing operations at the same level is not allowed:
  - Not allowed: `/[[abc]&&[cz]--[zz]]/v`
  - Allowed (operands grouped by nesting): `/[[abc]&&[[cz]--[zz]]]/v`
- Nested character classes: Example: `/[[abc][cz]]/v`
- Quoted strings (`\q{}`): Allows matching exact sequences. Example: `/[\q{ab|cb|db}]/v`

Commit by commit review encouraged.

With correct parsing, this no longer produces a false positive. Closes #18854.
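
For reviewers who want to check the runtime semantics of the operators described above, here is a minimal sketch in plain JavaScript. It is not part of the extractor change. The helper names (`compileV`, `matching`, `results`) are made up for illustration, and it assumes an engine with `v` flag support (assumed to be Node.js 20+ or a recent browser). Patterns are compiled at runtime so that an older engine reports "unsupported" instead of failing to parse the file.

```javascript
// Sketch only: illustrates ES2024 `v` flag semantics, not the CodeQL parser.

// Hypothetical helper: compile with the `v` flag, returning null when the
// engine rejects the flag or the pattern is a syntax error.
function compileV(source) {
  try {
    return new RegExp(source, "v");
  } catch (e) {
    return null;
  }
}

// Feature detection: a trivial pattern compiles only if `v` is supported.
const vFlagSupported = compileV("[a]") !== null;

// Hypothetical helper: which candidates does the pattern match in full?
// Returns null if the pattern could not be compiled.
function matching(source, candidates) {
  const re = compileV(`^(?:${source})$`);
  return re === null ? null : candidates.filter((s) => re.test(s));
}

const letters = ["a", "b", "c", "z"];

const results = {
  // Union of nested classes: a, b, c, z
  union: matching("[[abc][cz]]", letters),
  // Intersection: only characters present in both sets
  intersection: matching("[[abc]&&[cz]]", letters),
  // Subtraction: characters of the first set that are not in the second
  subtraction: matching("[[abc]--[cz]]", letters),
  // Mixing operators is fine once the inner operation is nested
  nested: matching("[[abc]&&[[cz]--[zz]]]", letters),
  // Mixing `&&` and `--` at the same level is a syntax error
  mixed: matching("[[abc]&&[cz]--[zz]]", letters),
  // Quoted strings match whole sequences
  quoted: matching(String.raw`[\q{ab|cb|db}]`, ["ab", "cb", "db", "a", "bb"]),
};

console.log(results);
```

On an engine that supports the flag, `mixed` is `null` because the pattern is rejected, while `nested` matches only `c`. On an engine without support, every entry is `null`.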