Skip to content

Parser improvements#207

Merged
aibaars merged 15 commits intotree-sitter:masterfrom
aibaars:parser-improvements
Apr 20, 2022
Merged

Parser improvements#207
aibaars merged 15 commits intotree-sitter:masterfrom
aibaars:parser-improvements

Conversation

@aibaars
Copy link
Copy Markdown
Contributor

@aibaars aibaars commented Apr 14, 2022

This PR includes various improvements to the parser. It is best reviewed on a commit-by-commit basis.

Checklist:

  • All tests pass in CI.
  • There are sufficient tests for the new fix/feature.
  • Grammar rules have not been renamed unless absolutely necessary.
  • The conflicts section hasn't grown too much.
  • The parser size hasn't grown too much (check the value of STATE_COUNT in src/parser.c).

@nickrolfe

Copy link
Copy Markdown
Contributor

@maxbrunsfeld maxbrunsfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good, and it looks like you've been super thorough about unit tests for the new fixes. I just had one question about the grammar changes.

grammar.js Outdated
rest_assignment: $ => prec(-1, seq('*', optional($._lhs))),

_fid: $ => choice(alias($.identifier_suffix, $.identifier), alias($.constant_suffix, $.constant)),
_fid_call: $ => prec.right(field('method', $._fid)),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain this part of the change? I can't tell what the behavioral change is from introducing the _fid rule.

I also wonder if there's some less abbreviated name that we could use instead of _fid; could it be written out as multiple words, similarly to how we use the full word version of most other names? What does the f stand for?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes fid is a terrible name ;-) I got it from the Ruby's parse.y . I guess the F stands for function. These are identifiers ending with ? or ! which cannot be used as normal variable or constant names.

For example Foo::Bar? is a call to a method named Bar? with the module/class Foo as receiver, and not a scope resolution of the constant Bar? on the module/class Foo.

I'm currently working on a Ripper based AST printer that parses a Ruby file and prints it in the format expected by tree-sitter tests. That allows me to compare ASTs generated by the tree-sitter grammar to ones produced by the real Ruby grammar and find cases where tree-sitter and Ruby disagree.

@aibaars
Copy link
Copy Markdown
Contributor Author

aibaars commented Apr 20, 2022

@maxbrunsfeld Thanks for the quick review. I renamed _fid to _function_identifier which is more descriptive.

self: $ => 'self',
true: $ => token(choice('true', 'TRUE')),
false: $ => token(choice('false', 'FALSE')),
nil: $ => token(choice('nil', 'NIL')),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to say the upper-case variants were still valid as of 2.7 (but not in 3.0), but it looks like they were only ever defined as constants, and never parser keywords, so 👍 to this change.

-2**10
-x**10

1..-1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very happy to see this common pattern is now parsed correctly 🎉

@maxbrunsfeld
Copy link
Copy Markdown
Contributor

maxbrunsfeld commented Apr 20, 2022

The windows CI is clearly messed up; I think we disregard it for now. This looks great @aibaars, nice work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants