[SwiftParser] Improve statement level recovery#3156
Conversation
c706d4c to
2ca55fe
Compare
| _ = subparser.consumeAttributeList() | ||
| hasAttribute = true | ||
| } else if subparser.at(.poundIf) && subparser.consumeIfConfigOfAttributes() { | ||
| subparser.skipSingle() |
There was a problem hiding this comment.
subparser.consumeIfConfigOfAttributes() already skipped #if .... #endif
ef1bf91 to
dc4e479
Compare
| base: .syntax, | ||
| nameForDiagnostics: "declaration", | ||
| parserFunction: "parseDeclaration" | ||
| parserFunction: "parseDeclarationOrIfConfig" |
There was a problem hiding this comment.
parseDeclaration() don't parse #if ... #endif. Newly introduced parseDeclarationOrIfConfig() just for DeclSyntax.parse(from:)
| mutating func atStartOfDeclaration( | ||
| isAtTopLevel: Bool = false, | ||
| allowInitDecl: Bool = true, | ||
| allowRecovery: Bool = false |
There was a problem hiding this comment.
Removed allowRecovery from atStartOfDeclaration()/atStartOfStatement()/atStartOfSwitchCase() because all statement-level recoveries are now done by parseUnexpectedCodeDecl()
| mutating func parseDeclaration(in context: DeclarationParseContext = .topLevelOrCodeBlock) -> RawDeclSyntax { | ||
| // If we are at a `#if` of attributes, the `#if` directive should be | ||
| // parsed when we're parsing the attributes. | ||
| if self.at(.poundIf) && !self.withLookahead({ $0.consumeIfConfigOfAttributes() }) { |
There was a problem hiding this comment.
This block is basically hoisted to parseMemberBlockItem()
- To align with
parseCodeBlockItem()scheme - Don't want to attach
;toIfConfigDeclSyntax
| let isProbablyVarDecl = self.at(.identifier, .wildcard) && self.peek(isAt: .colon, .equal, .comma) | ||
| let isProbablyTupleDecl = self.at(.leftParen) && self.peek(isAt: .identifier, .wildcard) | ||
|
|
||
| if isProbablyVarDecl || isProbablyTupleDecl { | ||
| return RawDeclSyntax(self.parseBindingDeclaration(attrs, .missing(.keyword(.var)), in: context)) | ||
| } |
There was a problem hiding this comment.
Factored out to atBindingDeclarationWithoutVarKeyword() and move the handling to recoveryResult above, aligning with atFunctionDeclarationWithoutFuncKeyword().
|
|
||
| if atFunctionDeclarationWithoutFuncKeyword() { | ||
| return RawDeclSyntax(self.parseFuncDeclaration(attrs, .missing(.keyword(.func)))) | ||
| } |
There was a problem hiding this comment.
This is already handled in recoveryResult above.
| if self.atStartOfStatement(preferExpr: true) || self.atStartOfDeclaration() { | ||
| return false | ||
| } | ||
| if self.atStartOfLine && self.withLookahead({ $0.atStartOfSwitchCase() }) { |
There was a problem hiding this comment.
Previously case was a start of statement, but not anymore. This prevents the second case below is parsed as the returning expression.
case .foo:
return
case bar:
...|
|
||
| let item: RawCodeBlockItemSyntax.Item | ||
| let attachSemi: Bool | ||
| if self.at(.poundIf) && !self.withLookahead({ $0.consumeIfConfigOfAttributes() }) { |
There was a problem hiding this comment.
Inlined parseItem() because we want to choose whether to attach ; (attachSemi) depending on what we parse.
| } else { | ||
| // Otherwise, eat the unexpected tokens into an "decl". | ||
| item = .decl( | ||
| RawDeclSyntax( | ||
| self.parseUnexpectedCodeDeclaration(allowInitDecl: allowInitDecl, requiresDecl: false, until: stopCondition) | ||
| ) | ||
| ) | ||
| attachSemi = true |
There was a problem hiding this comment.
Previously in parseItem(), we tried atStartOfStatement(allowRecovery: true) and atStartOfDeclaration(allowRecovery: true) for recoveries.
Now unexpected tokens at this position are always parsed as UnexpectedCodeDeclSyntax until we find a decl, statement, or expression.
| 1 | func o() { | ||
| 2 | }👨👩👧👦} | ||
| | |`- error: extraneous braces at top level | ||
| | |`- error: unexpected braces in source file |
There was a problem hiding this comment.
This message change is because previously the UnexpectedNodesSyntax was stored as SourceFileSyntax.unexpectedBetweenStatementsAndEndOfFileToken, but now it's UnexpectedCodeDeclSyntax in SourceFileSyntax.statements.
| DiagnosticSpec( | ||
| locationMarker: "2️⃣", | ||
| message: "unexpected code '))' before macro" | ||
| message: "unexpected code '))' in class" |
There was a problem hiding this comment.
This and similar changes are because the unexpected code are now separate UnexpectedCodeDeclSyntax in the statements.
There was a problem hiding this comment.
So IIUC this parses as an attribute on a class with a macro member? The change from unexpected to a node seems fine to me diagnostic wise, there's not much difference in the "before" vs "in"
5ceb6e8 to
f9a97cf
Compare
| DiagnosticSpec( | ||
| locationMarker: "2️⃣", | ||
| message: "unexpected code '))' before macro" | ||
| message: "unexpected code '))' in class" |
There was a problem hiding this comment.
So IIUC this parses as an attribute on a class with a macro member? The change from unexpected to a node seems fine to me diagnostic wise, there's not much difference in the "before" vs "in"
| ), | ||
| DiagnosticSpec(locationMarker: "2️⃣", message: "unexpected code in source file"), | ||
| ], | ||
| fixedSource: """ |
There was a problem hiding this comment.
Why does the fixed source still have the #endif and extra }? Shouldn't both be unexpected and thus removed in the fixed source? Is the new UnexpectedDeclNode not being treated the same as unexpected?
I'm also unsure about closing the struct at the endif. It makes sense if there's a matching #if, but when there isn't... 🤔
There was a problem hiding this comment.
Diagnostics are conservative about removing user code. UnexpectedNodesSyntax in general might contain important user code and we don't want to delete them in "fix-it all"
There was a problem hiding this comment.
I'm also unsure about closing the struct at the endif.
We can't know if there's } after #endif until we parse it.
IMO #endif et al should always be stronger than }, because, unlike { or } which editor often insert, the user probably intentionally wrote #endif for some reason.
This specific test case might be controversial, but for example, for a source file like this:
struct MyStruct {
func foo() {}
// ...
#endif
struct Other {
....
}
This is probably missing } before #endif
| diagnostics: [ | ||
| DiagnosticSpec( | ||
| locationMarker: "1️⃣", | ||
| // FIXME: "expected attribute name after '@'". |
There was a problem hiding this comment.
Could you file an issue for this rather than have the FIXME? Or add a specific test for the message, with the FIXME + a link to the issue? Same for the one below
| ) | ||
| } | ||
|
|
||
| func testDeclSyntaxParseIfConfigAttr() throws { |
| message: "expected declaration and ';' after attribute", | ||
| fixIts: ["insert declaration and ';'"] |
There was a problem hiding this comment.
Why do we default to ; instead of a newline for this case?
There was a problem hiding this comment.
This is modeled as
CodeBlockItem(
item: .decl(MissingDecl(...)),
semicolon: .semiColonToken(precence: .missing)
)
And ParseDiagnosticsGenerator generates single diagnostic for consecutive missing nodes.
"consecutive statements ... insert newline or ';'" is a special diagnostics that is generated when the previous node was not missing.
Tests/SwiftParserTest/translated/ForwardSlashRegexSkippingInvalidTests.swift
Show resolved
Hide resolved
|
swiftlang/swift#84553 |
|
swiftlang/swift#84553 |
1 similar comment
|
swiftlang/swift#84553 |
Consider the following code:
Previously, this was parsed as a
FunctionDeclSyntaxwith the unexpected code} @attrattached asunexpectedBeforeFuncKeyword. This was problematic because@attrwas effectively ignored. TheTokenPrecedence-based recovery strategy couldn't handle this well, since@is not a declaration keyword and doesn't by itself signal the start of a declaration.One option would be to check
atStartOfDeclaration()after eachskipSingle()and recognize@as the start of a declaration. However, then the preceding}would need to be attached to the first attribute in the attribute list, which would require threading recovery logic intoparseAttributeList()and only applying it to the firstparseAttribute(). This would make the code more complex and harder to maintain.This PR introduces a new syntax node
UnexpectedCodeDeclSyntaxparseUnexpectedCodeDeclaration().This simplifies recovery handling significantly.
Also:
#ifrelated recovery:#endifet al at top-level is nowUnexpectedCodeDeclinstead of making everything after it "unexpected"#endifet al closes code/member block with missing}.parsePoundDirective(). It now receives only one closure instead of 3.#sourceLocation(file: "<file>", line: 100) ;;is now diagnosed as;-only statement, or unexpected;separator depending on the position.}or statements. E.g.