MinML tree-sitter parser #10

Open
phamelink wants to merge 16 commits into main from minml-lsp

@phamelink
Contributor

@phamelink phamelink commented Apr 10, 2026

This PR introduces a Tree-sitter grammar for MinML, enabling high-performance incremental parsing and laying the foundation for advanced editor support such as syntax highlighting, code folding, and symbol navigation.

Key Changes

  • Grammar Implementation: Added dev/tree-sitter/grammar.js which defines the MinML syntax, including support for elements, attributes, character references, quoted strings, raw blocks, and matcher escapes.
  • Comprehensive Test Suite: Created a test corpus in dev/tree-sitter/test/corpus/minml.txt with 30 test cases:
    • 12 success cases covering all standard MinML features and complex nesting.
    • 18 error cases demonstrating robust error recovery and AST generation for invalid inputs.
  • Developer Documentation: Added dev/tree-sitter/README.md with detailed instructions on the project structure, how to regenerate the parser, and how to run tests.
  • Build System Integration: Updated the root Makefile to include tree-sitter-gen and tree-sitter-test targets for a streamlined developer workflow.
  • Project Documentation: Updated the root README.md and dev/README.md to include the Tree-sitter tool in the project overview.
  • Reference Examples: Added dev/tree-sitter/examples/13-error-cases.m as a reference for known invalid input patterns.
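
For illustration, the two Makefile targets named above could look roughly like the fragment below. This is a minimal sketch, not the PR's actual Makefile: the target names and the dev/tree-sitter path come from this description, while the TREE_SITTER_DIR variable, the recipes, and the dependency between the targets are assumptions. It also assumes the tree-sitter CLI is installed and on PATH.

```make
# Hypothetical sketch; the recipes in the PR's root Makefile may differ.
# TREE_SITTER_DIR is an illustrative variable name, not taken from the PR.
TREE_SITTER_DIR := dev/tree-sitter

.PHONY: tree-sitter-gen tree-sitter-test

# Regenerate the parser sources from grammar.js.
tree-sitter-gen:
	cd $(TREE_SITTER_DIR) && tree-sitter generate

# Run the corpus tests under test/corpus/ (assumed here to depend on
# regeneration so the tests exercise the current grammar).
tree-sitter-test: tree-sitter-gen
	cd $(TREE_SITTER_DIR) && tree-sitter test
```

Making tree-sitter-test depend on tree-sitter-gen is a design choice of this sketch: it keeps the tested parser in sync with grammar.js at the cost of regenerating on every test run.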

Verification

The implementation was verified by running the Tree-sitter test suite (make tree-sitter-test) and by manually checking the output of tree-sitter parse examples/XX-XX.m for the example files.

Contributor

As this is the most important file in the PR, comments explaining how you structured the grammar are important.

@AntoineBastide47
Contributor

Is this just the setup, or is it implemented in the extension?

@AntoineBastide47
Contributor

Also, it might be better to remove the generated files and have consumers install Tree-sitter locally to regenerate them.
