Integrate the toxic2 tolerant parser across providers #1267
Open
lukaszsamson wants to merge 15 commits into
… toxic2 ranges
- Point elixir_sense to a local path dep (carries the toxic2-backed parser)
- Reimplement SelectionRanges AST node ranges using toxic2's range: metadata and comments from Toxic2.string_to_quoted_with_comments, dropping the bespoke AstUtils.node_range computation
- Delete now-unused AstUtils module and its tests
- Adjust document_symbols test for toxic2-recovered AST
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Derive structural and comment folds from the error-tolerant toxic2 parser (node source ranges + Toxic2.string_to_quoted_with_comments) instead of the Elixir-tokenizer-backed token-pair / special-token passes. The line-based indentation pass is kept (it supplies assignment/clause folds that have no single closing token) and AST folds override it at shared start lines, as the token-pair pass used to. Comments inside strings/heredocs are no longer mistaken for fold-able comment blocks. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Parse with toxic2 (range: true, no literal_encoder) and read each node's range: meta for symbol range and selection range, replacing the token-metadata end-position heuristics (kept only as a fallback for range-less nodes).
Also:
- Preserve nil args (bare identifiers like var / __MODULE__) in neutralize_errors across document_symbols, selection_ranges and folding_range. The previous `not is_list(args)` clause turned them into zero-arity calls.
- Ignore error-recovery placeholders when computing function/type arity, so an incomplete `def foo(` reports foo/0 rather than an inflated arity.
Adjust the records test for the toxic2 call-node range (starts at `Record`).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
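The nil-args point can be illustrated with the standard library alone. This is a minimal sketch, not the actual neutralize_errors code: the NilArgsSketch module and its lossy/1 and preserving/1 functions are hypothetical stand-ins for the old and new clauses; only Code.string_to_quoted/1 and Macro.to_string/1 are real Elixir APIs.

```elixir
# A bare identifier carries nil as its third element; a zero-arity call carries [].
{:ok, var_ast} = Code.string_to_quoted("var")
{:ok, call_ast} = Code.string_to_quoted("var()")

{:var, _, var_args} = var_ast
{:var, _, call_args} = call_ast
IO.inspect({var_args, call_args}, label: "args of var / var()")

defmodule NilArgsSketch do
  # Hypothetical stand-in for the old clause: any non-list args become [],
  # which silently turns a bare identifier into a zero-arity call.
  def lossy({name, meta, args}) when not is_list(args), do: {name, meta, []}
  def lossy(node), do: node

  # Hypothetical fixed clause: atom args (nil or a context atom) are kept,
  # so a bare identifier stays a bare identifier.
  def preserving({name, meta, args}) when is_atom(args), do: {name, meta, args}
  def preserving(node), do: node
end

IO.puts(Macro.to_string(NilArgsSketch.lossy(var_ast)))
IO.puts(Macro.to_string(NilArgsSketch.preserving(var_ast)))
```

The lossy variant renders the identifier as var(), which is how a bare var or __MODULE__ could end up reported as a call.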
Every node extract_* takes a location from carries a toxic2 range:, so the old
token-metadata heuristic was unreachable - and it harbored a latent bug
(elixir_position_to_lsp({nil, nil}) returns end-of-file because nil sorts above
integers). Replace location_to_range with a range:-only version that degrades a
range-less node to a zero-width range at its line/column, and drop the now-dead
symbol argument.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
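The "nil sorts above integers" remark follows from Elixir's term ordering, where every number compares lower than every atom and nil is an atom. The sketch below shows how a clamp written with min/2 would then land on the last line. ClampSketch and its functions are hypothetical and are not the real elixir_position_to_lsp implementation, whose body is not shown in this PR text.

```elixir
defmodule ClampSketch do
  # Hypothetical clamp: keeps a line number within the document.
  # With line = nil, min/2 picks line_count because numbers sort below atoms.
  def naive(line, line_count), do: min(line, line_count)

  # Hypothetical guarded variant: a missing position degrades to line 0
  # instead of silently becoming end-of-file.
  def guarded(line, line_count) when is_integer(line), do: min(line, line_count)
  def guarded(_line, _line_count), do: 0

  # Comparison wrapped in a function so the operands are ordinary runtime values.
  def above?(a, b), do: a > b
end

IO.inspect(ClampSketch.above?(nil, 1_000_000), label: "nil > 1_000_000")
IO.inspect(Enum.sort([3, nil, 1]), label: "sorted")
IO.inspect(ClampSketch.naive(nil, 120), label: "naive clamp of nil")
IO.inspect(ClampSketch.guarded(nil, 120), label: "guarded clamp of nil")
```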
The LS Parser now produces a range-bearing toxic2 AST for Context.ast (and
the metadata built from it) on every .ex/.exs parse, replacing the
Code.string_to_quoted! AST and the ElixirSense fault-tolerant fallback.
This is the foundation that lets range-aware providers read node ranges
straight off Context.ast.
- parse_file/3 stays the sole diagnostics source (Code.with_diagnostics)
and the EEx/HEEx parser; it now returns a tagged {:ok, ast, diagnostics}
/ {:error, diagnostics} so a falsey-but-valid AST (literal nil/false) is
no longer mistaken for a parse failure.
- do_parse/2 decides the flag from the Code success tag: clean -> :exact,
else toxic recovered something usable -> :fixed, else :not_parsable.
- parse_elixir_toxic/3 builds the AST/metadata via
ElixirSense.Core.Parser.parse_to_neutralized_ast(range: true,
keep_range: true), keeping the catch/telemetry safety net.
- fault_tolerant_parse/2 removed (toxic always recovers; cursor env is
derived separately in Metadata.get_cursor_env).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
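The reason for the tagged return can be seen with a source file whose entire content is the literal nil. The sketch below is a simplification under stated assumptions: TaggedParseSketch is a hypothetical module, and it uses plain Code.string_to_quoted/1 with an empty or single-entry diagnostics list rather than the Code.with_diagnostics-based parse_file/3 described above.

```elixir
defmodule TaggedParseSketch do
  # Hypothetical untagged shape: returns the AST, or nil on failure.
  # A file containing only `nil` is then indistinguishable from a parse error.
  def untagged(source) do
    case Code.string_to_quoted(source) do
      {:ok, ast} -> ast
      {:error, _reason} -> nil
    end
  end

  # Tagged shape in the spirit of {:ok, ast, diagnostics} / {:error, diagnostics}.
  # The diagnostics here are placeholders, not real compiler diagnostics.
  def tagged(source) do
    case Code.string_to_quoted(source) do
      {:ok, ast} -> {:ok, ast, []}
      {:error, reason} -> {:error, [reason]}
    end
  end
end

IO.inspect(TaggedParseSketch.untagged("nil"), label: "untagged, valid source")
IO.inspect(TaggedParseSketch.untagged("("), label: "untagged, broken source")
IO.inspect(TaggedParseSketch.tagged("nil"), label: "tagged, valid source")
IO.inspect(match?({:error, _}, TaggedParseSketch.tagged("(")), label: "tagged, broken source is error")
```

A caller that branches on `if ast` would treat both untagged results as failures, while matching on the tag keeps the valid nil AST distinct.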
…r helper
DocumentSymbols, FoldingRange and SelectionRanges each carried a byte-identical private neutralize_errors/1. Replace all three with the shared ElixirSense.Core.Parser.neutralize_errors/3 (keep_range: true so the range: meta survives). document_symbols/folding_range pass their parse diagnostics (so they also get the call-arg sentinel cleaning); selection_ranges' self-neutralizing ast_node_ranges/4 passes [] (range-only, and __error__ nodes carry no range, so the diagnostics-driven cleaning is a no-op there).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
definition/references/implementation/declaration/call_hierarchy/hover locators and the llm_environment command now classify the symbol under the cursor via ElixirSense.Core.SurroundContext.Toxic.surround_context/2 (stage 0 delegates to Code.Fragment). Completion (cursor_context/container_cursor_to_quoted) untouched. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Replaces the Elixir-tokenizer passes (token_pair_ranges via FoldingRange.Token/
TokenPair, special_token_group_ranges via FoldingRange.SpecialToken, and the
stop-token machinery) with a toxic2-AST pass, delimiter_pair_ranges/4. It derives
outer/inner ranges for ()/[]/{}/%{}/<<>>, calls, bracket access x[y], and
do/else/rescue/after/catch/end blocks from the toxic2 closing:/do:/end:/section-key
range: metadata, plus a stab pattern .. -> range. String/heredoc/sigil ranges now
come from ast_node_ranges (the toxic nodes carry range:), so the special-token pass
is gone. Both selection and folding providers no longer use :elixir_tokenizer for
their output (FoldingRange was already migrated).
Adversarial review found a crash: a cursor exactly on a block section keyword
(else/rescue/...) made two sibling, non-nested ranges and tripped the
"increasingly narrowing" merge invariant. Fixed by selecting the cursor's section
with half-open containment (end exclusive). A fuzz over real files dropped
selection-range crashes from 238 (old tokenizer code) to 6 pre-existing
"no intersection" cases in the shared merge. Bracket-access ranges (lost in the
first cut) restored via the from_brackets meta. Regression tests added.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ules
These FoldingRange submodules were lib-dead after both providers moved off the Elixir tokenizer (FoldingRange.provide and SelectionRanges no longer use them). Remove the modules and their tests. convert_text_to_input and @type input drop the :tokens field (now lines-only); Indentation and CommentBlock provide_ranges only ever read :lines, so their doctests and the folding_range_test passes keep working. The only remaining ElixirSense.Core.Normalized.Tokenizer user is now elixir_sense's Source.which_func.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
elixir-ls already extracts join bindings (backported in 906c8c8) but had no unit test for it - only commented-out integration TODOs. Port elixir_sense's focused, self-contained QueryTest (mock Post/Comment schemas) to lock in the behavior. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…cent bracket crash)
container_ranges decided cursor containment with the inclusive in?/2 on end-exclusive ranges, so adjacent bracket accesses (foo[bar][baz]) both claimed the shared boundary column and emitted two non-nested sibling ranges, raising "ranges_1 is not increasingly narrowing" in the merge. Use the half-open check (end exclusive) like do_block_ranges already does. Found by gpt-5.5 adversarial review.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
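The boundary problem can be reproduced on plain column pairs. This is a minimal sketch assuming 0-based columns and {start, end} tuples with an exclusive end. ContainmentSketch and its function names are hypothetical and simplify the real container_ranges, which works on full LSP ranges.

```elixir
defmodule ContainmentSketch do
  # Inclusive check: a column equal to the (exclusive) end still counts as inside.
  def inclusive?({start_col, end_col}, col), do: col >= start_col and col <= end_col

  # Half-open check: the end column belongs to whatever starts there, not to this range.
  def half_open?({start_col, end_col}, col), do: col >= start_col and col < end_col
end

# In "foo[bar][baz]" the text "[bar]" spans columns 3..7 and "[baz]" spans 8..12,
# so with exclusive ends the two sibling ranges are {3, 8} and {8, 13}.
siblings = [{3, 8}, {8, 13}]
cursor = 8

IO.inspect(Enum.filter(siblings, &ContainmentSketch.inclusive?(&1, cursor)), label: "inclusive")
IO.inspect(Enum.filter(siblings, &ContainmentSketch.half_open?(&1, cursor)), label: "half-open")
```

With the inclusive check both siblings claim column 8, which is the non-nested pair that trips the merge invariant; the half-open check leaves exactly one owner for every column.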
…Toxic
selection_ranges: the symbol-under-cursor pass called Code.Fragment.surround_context directly. Route it through ElixirSense.Core.SurroundContext.Toxic instead - the same toxic2-backed entry point the navigation providers already use. Navigable shapes now get their span from the AST range: metadata; only purely lexical units (a bare do/end, exotic operators) reach Toxic's internal Code.Fragment fallback. AST ranges alone don't cover these symbol-level spans (e.g. the do/end keyword units, the dot-path callee), so the pass is kept rather than removed.
document_symbols: drop the stale 'extract module name location from Code.Fragment.surround_context?' TODO - module name locations come from the toxic2 AST range metadata now.
This removes the last direct Code.Fragment.surround_context call in the providers; the only remaining uses are Toxic's own internal fallback.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Pass the AST already parsed at the top of selection_ranges/3 into SurroundContext.Toxic.surround_context/3 instead of having it re-parse the source on every cursor position. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Move elixir_sense from a local path dep back to the git dep model and pin it to the pushed toxic2-parser SHA (b928399b) via dep_versions.exs + mix.lock. This transitively pulls toxic2 (lukaszsamson/toxic2). elixir_sense/toxic2 require Elixir ~> 1.19, so drop the 1.16/1.17/1.18 jobs from the CI matrix (smoke tests + test matrix, Linux + Windows). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ixes) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Summary
Integrates the toxic2 error-tolerant parser (via the matching elixir_sense branch) across the ElixirLS providers, replacing the tokenizer-driven and Code.Fragment-driven implementations with passes derived from toxic2's ranged AST.
Companion PR: elixir-lsp/elixir_sense#336 (this branch pins that elixir_sense SHA).
What changed
Providers reimplemented on toxic2 ranges
- Folding and selection ranges: derived from the toxic2 parse; the old FoldingRange.Token/TokenPair/SpecialToken tokenizer is deleted. Half-open containment fixes an adjacent-bracket (foo[bar][baz]) crash.
- Document symbols: ranges read from the toxic2 range: metadata (token-metadata heuristics removed).
- Navigation locators (definition/references/implementation/declaration/call_hierarchy/hover) and the llm_environment command: routed through ElixirSense.Core.SurroundContext.Toxic for symbol-under-cursor classification. selection_ranges reuses the AST it already parsed instead of re-parsing per cursor position.
- LS Parser: Context.ast/metadata for .ex/.exs built from the toxic2 ranged AST; neutralize_errors deduped onto the shared ElixirSense.Core.Parser helper.
No more direct Code.Fragment.surround_context in the providers; the only remaining use is the internal fallback inside the toxic2 classifier.
Build / CI
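- elixir_sense moved back to a git dep pinned to the pushed toxic2-parser SHA (dep_versions.exs + mix.lock); this transitively pulls toxic2.
- elixir_sense/toxic2 require Elixir ~> 1.19, so drop the 1.16/1.17/1.18 jobs from the CI matrix (smoke tests + test matrix, Linux + Windows).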
Testing
- apps/language_server provider suite green (1170 passing) against the git-pinned deps.
- mix compile --warnings-as-errors, mix format --check-formatted, and mix dialyzer all clean.
🤖 Generated with Claude Code