Fix invalid macro tag generation for keywords which can be followed by values #148655
The PR adds two new test files. The first (13 lines, `@@ -0,0 +1,13 @@`):

```rust
// This code crashed because a `if` followed by a `!` was considered a macro,
// creating an invalid class stack.
// Regression test for <https://github.com/rust-lang/rust/issues/148617>.

//@ compile-flags: -Zunstable-options --generate-macro-expansion

enum Enum {
    Variant,
}

pub fn repro() {
    if !matches!(Enum::Variant, Enum::Variant) {}
}
```
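The faulty heuristic behind this crash is easy to state: the highlighter treated any identifier immediately followed by `!` as a macro invocation, so `if !matches!(…)` opened a macro highlighting class for `if` that was never closed. A minimal standalone sketch of the corrected check (hypothetical names and list, not rustdoc's actual code) that exempts keywords which can legally precede a `!`-prefixed expression:

```rust
// Hypothetical sketch, not rustdoc's real implementation.
// Keywords that may be followed by an expression, and therefore by `!`:
// treating `if !cond` or `break !x` as a macro call built an invalid
// highlighting class stack.
const EXPR_KEYWORDS: &[&str] = &["if", "match", "while", "return", "break"];

/// Returns true if `ident` directly followed by `!` should be
/// highlighted as a macro invocation (e.g. `matches!`).
fn is_macro_invocation(ident: &str, followed_by_bang: bool) -> bool {
    followed_by_bang && !EXPR_KEYWORDS.contains(&ident)
}

fn main() {
    // `if !true {}`: `if` is a keyword, not a macro name.
    assert!(!is_macro_invocation("if", true));
    // `matches!(..)`: a genuine macro invocation.
    assert!(is_macro_invocation("matches", true));
    // A plain identifier with no `!` is never a macro call.
    assert!(!is_macro_invocation("matches", false));
}
```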
The second test file (30 lines, `@@ -0,0 +1,30 @@`):

```rust
// This test ensures that keywords which can be followed by values (and therefore `!`)
// are not considered as macros.
// This is a regression test for <https://github.com/rust-lang/rust/issues/148617>.

#![crate_name = "foo"]
#![feature(negative_impls)]

//@ has 'src/foo/keyword-macros.rs.html'

//@ has - '//*[@class="rust"]//*[@class="number"]' '2'
//@ has - '//*[@class="rust"]//*[@class="number"]' '0'
//@ has - '//*[@class="rust"]//*[@class="number"]' '1'
const ARR: [u8; 2] = [!0,! 1];

trait X {}

//@ has - '//*[@class="rust"]//*[@class="kw"]' 'impl'
impl !X for i32 {}

fn a() {
    //@ has - '//*[@class="rust"]//*[@class="kw"]' 'if'
    if! true{}
    //@ has - '//*[@class="rust"]//*[@class="kw"]' 'match'
    match !true { _ => {} }
    //@ has - '//*[@class="rust"]//*[@class="kw"]' 'while'
    let _ = while !true {
        //@ has - '//*[@class="rust"]//*[@class="kw"]' 'break'
        break !true;
    };
}
```

An inline review thread was attached to `fn a()`:

> **Contributor:** Maybe add at least one test case where the […]. Also, do we have tests for […]?
>
> **Author:** The same issue cannot happen with something other than idents, but I added it just in case.
Actually, why don't we exclude all keywords except for `self`, `super`, `crate` (`.is_path_segment_keyword()`)[^1]? I.e., turn the list around, since you can't invoke arbitrary keywords as macros (unless `r#`'ed, of course, which we already handle).

For example, the list keeps growing: there's not only `impl !Trait for ()` from above (so `impl`) but also `#[cfg(false)] impl const !Trait for () {}` (so `const`).

[^1]: Since `#[cfg(false)] self!(…)` and so on is valid even though you can't define a macro called `self`/`r#self`.
It doesn't work because `try` is not a path segment keyword, and yet you can name your macro `try`. So we're blocked on the constant. :')
Ah yeah, true. Well, only until we start factoring in the requested/ambient edition (see #148221); then that will no longer pose a problem.

Still, we could invert this check by rejecting all keywords except `self`, `super`, `crate` (`.is_path_segment_keyword()`), as mentioned, plus edition-sensitive keywords like `async`, `try`, `gen` (HACK: `.is_reserved(Rust2015)`, since we've never unreserved a keyword so far).

However, I guess it's fine to keep your current approach, given that a fix for #148221 would be the ultimate and proper fix.
Well, I've had it in the back of my mind for quite some time to use the actual rustc parser for this to get an AST (and completely forgot until @yotamofek just asked me today why we don't do it 🤣).
cc @yotamofek

Well, it's an AST, not a CST (concrete syntax tree), so you can't really faithfully / losslessly reconstruct the source code unless you meticulously call `span_to_snippet` basically everywhere (and I'm talking everywhere), and even then it's basically impossible.

Of course, we don't necessarily need to reconstruct the source à la `rustfmt`; we could just use the AST to splice the source string in a few selected places. But that wouldn't allow us to highlight comments, as they obviously aren't represented in the AST, and most likely some keywords either (again, we can do some `span_to_snippet` trickery, but we will get this wrong in the same way `rustfmt` swallows comments here and there, except we would only fail to highlight things).

I mean, it's worth a try; maybe I'm missing some third approach that's miles better.

Okay, so we could follow a hybrid solution: lex the source and go through the token stream like we do now to highlight comments only, then parse the token stream and use it to splice and highlight the source. Might be perf-heavy. It still won't catch everything, but alright, trade-offs are everywhere and it might be better than the current approach.
I might be saying silly things due to lack of knowledge, but can't we just use the lexer, without invoking the parser, to generate a stream of `Token`s? Then we "just" have to map a `TokenKind` to a CSS class?

I do wonder why rustfmt doesn't do that, though. Is it because they need AST-level information? But isn't rustfmt context-unaware?

I'll try to read up on it; maybe we can talk about it at tomorrow's meeting. Seems like it would simplify rustdoc quite a bit and be a much more robust solution, assuming it's actually feasible. But again, no idea what I'm talking about here 😁
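The lexer-only pipeline described above boils down to mapping each lexed token kind to a CSS class and wrapping the token text in a `<span>`. A toy, self-contained illustration follows; the hand-rolled `TokenKind` enum is a stand-in for `rustc_lexer::TokenKind`, which is far richer (raw strings, lifetimes, shebangs, and so on):

```rust
// Toy TokenKind -> CSS class mapping; stands in for rustc_lexer's real enum.
#[derive(Debug, PartialEq)]
enum TokenKind {
    Ident,
    Keyword,
    Number,
    LineComment,
    Punct,
    Whitespace,
}

// Classes mirror the ones the tests above assert on: "kw", "number", etc.
fn css_class(kind: &TokenKind) -> Option<&'static str> {
    match kind {
        TokenKind::Keyword => Some("kw"),
        TokenKind::Number => Some("number"),
        TokenKind::LineComment => Some("comment"),
        TokenKind::Ident | TokenKind::Punct | TokenKind::Whitespace => None,
    }
}

// Wrap classified tokens in spans; pass unclassified text through verbatim.
fn highlight(tokens: &[(TokenKind, &str)]) -> String {
    tokens
        .iter()
        .map(|(kind, text)| match css_class(kind) {
            Some(class) => format!("<span class=\"{class}\">{text}</span>"),
            None => text.to_string(),
        })
        .collect()
}

fn main() {
    let tokens = [
        (TokenKind::Keyword, "if"),
        (TokenKind::Whitespace, " "),
        (TokenKind::Punct, "!"),
        (TokenKind::Ident, "done"),
        (TokenKind::Whitespace, " "),
        (TokenKind::Punct, "{}"),
    ];
    assert_eq!(highlight(&tokens), "<span class=\"kw\">if</span> !done {}");
}
```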
Yeah, I was gonna say: the parser is probably much slower than the lexer alone, since it also allows for recovery and suggestions and can't assume the code is actually syntactically valid (which we can, I think?).
Just for the sake of completeness, I'll mention it: while I'm pretty sure all the ASTs created at the start of the rustdoc process have already been dropped, the HIR should still be around. We could in theory visit it and splice the source according to all the spans we find in the HIR.

Now, the biggest drawback of that will probably be syntactic sugar like `for` loops and `async` bodies (the latter have been turned into state machines at this point), which we might not be able to highlight easily or at all, but I could be wrong.
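The span-splicing idea, whatever tree the spans come from (HIR, AST, …), reduces to: collect sorted, non-overlapping (byte range, class) pairs, then rebuild the source so everything between highlighted spans (comments included) passes through verbatim. A minimal, tree-agnostic sketch with hypothetical names:

```rust
use std::ops::Range;

// Minimal span-splicing sketch: `spans` holds sorted, non-overlapping
// (byte range, CSS class) pairs collected from some tree walk. Text not
// covered by any span (e.g. comments, which the tree does not represent)
// is emitted unchanged.
fn splice(source: &str, spans: &[(Range<usize>, &str)]) -> String {
    let mut out = String::new();
    let mut cursor = 0;
    for (range, class) in spans {
        out.push_str(&source[cursor..range.start]); // untouched gap
        out.push_str(&format!(
            "<span class=\"{class}\">{}</span>",
            &source[range.clone()]
        ));
        cursor = range.end;
    }
    out.push_str(&source[cursor..]); // trailing text after the last span
    out
}

fn main() {
    let src = "if !x {} // note";
    // Pretend the tree walk reported only the `if` keyword at bytes 0..2;
    // the trailing comment survives even though no span covers it.
    let html = splice(src, &[(0..2, "kw")]);
    assert_eq!(html, "<span class=\"kw\">if</span> !x {} // note");
}
```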
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's essentially what we're doing right now. We're currently only lexing the source using `rustc_lexer` and iterating through its `Token`s (cc https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/enum.TokenKind.html).

`rustc_parse`'s lexer, which you've mentioned, only transforms the token stream provided by `rustc_lexer` into a different representation that's "slightly easier" to parse. For all intents and purposes, however, they're the same thing in a different color; rustdoc's approach wouldn't change on a macro scale by switching over to it.
Oh right, got it. Thanks for the explanation!