@GuillaumeGomez
Member

@GuillaumeGomez GuillaumeGomez commented Nov 7, 2025

Fixes #148617.

The problem didn't come from the generate-macro-expansion feature but was actually uncovered thanks to it.

Keywords like if or return, when followed by a !, were considered macros, which was wrong and led to an invalid class stack and to the panic.
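For illustration, here is a minimal sketch of ordinary Rust in which a keyword is directly followed by `!` without any macro being involved. The function names are made up for this example and are not taken from the PR's test file.

```rust
fn is_odd(n: u32) -> bool {
    // `if` followed by `!`: the `!` negates the condition.
    if !(n % 2 == 0) {
        return !false; // `return` followed by `!`: returns `true`.
    }
    false
}

fn leading_unset(flags: &[bool]) -> usize {
    let mut i = 0;
    // `while` followed by `!`.
    while !(i == flags.len() || flags[i]) {
        i += 1;
    }
    i
}

fn describe(flag: bool) -> &'static str {
    // `match` followed by `!`: the scrutinee is the negated flag.
    match !flag {
        true => "flag was false",
        false => "flag was true",
    }
}

fn negate_first(flags: &[bool]) -> bool {
    loop {
        // `break` followed by `!`: the loop evaluates to the negated value.
        break !flags[0];
    }
}

fn main() {
    assert!(is_odd(3));
    assert!(!is_odd(4));
    assert_eq!(leading_unset(&[false, false, true]), 2);
    assert_eq!(describe(false), "flag was false");
    assert!(negate_first(&[false]));
}
```

In each case the `!` is the unary "not" operator applied to the value that follows, so a highlighter that treats "identifier-like token, then `!`" as a macro call would misclassify the keyword.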

While working on it, I realized that _ was considered as a keyword, so I fixed that as well in the second commit. (reverted, see #148655 (comment), #148655 (comment))

r? @yotamofek

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output. labels Nov 7, 2025

@GuillaumeGomez
Member Author

npm failing to install. New flaky I guess. Restarting CI. ^^'

@GuillaumeGomez
Member Author

CI passed \o/

@bors
Collaborator

bors commented Nov 9, 2025

☔ The latest upstream changes (presumably #148692) made this pull request unmergeable. Please resolve the merge conflicts.

Comment on lines 1276 to 1277
if !KEYWORDS_FOLLOWABLE_BY_VALUE.contains(&text)
&& self.peek_non_whitespace() == Some(TokenKind::Bang)
Contributor

nit:

I think it would be more consistent (with previous match arms) to put this condition in a match guard. (and then have another Some(c) => c arm, like before)

Member Author

Sure, changing it. :)

let span = new_span(before, text, file_span);
sink(DUMMY_SP, Highlight::EnterSpan { class: Class::Macro(span) });
sink(span, Highlight::Token { text, class: None });
TokenKind::RawIdent if self.peek_non_whitespace() == Some(TokenKind::Bang) => {
Contributor

might be missing something and maybe too lazy to dig in any deeper, but why did you remove TokenKind::Ident from this arm?

Member Author

Because a RawIdent can never be a keyword, meaning get_real_ident_class is unneeded, allowing this check to be simpler.

@yotamofek
Contributor

This looks absolutely fine, the fixture is more correct, and it adds more tests without breaking any previous ones, so I'd approve this...
but I've already approved two related PRs that introduced/uncovered edge cases, so I think I'd be inclined to wait for another r+ this time 😅 (maybe @lolbinarycat if you have the time?)


//@ has 'src/foo/keyword-macros.rs.html'

fn a() {
Contributor

Maybe add at least one test case where the ! is not separated by whitespace from the keyword?
e.g. if! true{}
To make sure it works, and doesn't regress in the future.

Also, do we have tests for !s following punctuation? e.g. something like const ARR: [u8; 2] = [!0,! 0];?
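As a rough sketch of what those two cases mean at the language level (this is plain Rust, not the rustdoc test file itself, and `negated_branch` is a name invented for the example):

```rust
// `!` directly after punctuation, with and without a space before the operand.
// For `u8`, `!0` flips every bit of zero.
const ARR: [u8; 2] = [!0,! 0];

fn negated_branch() -> &'static str {
    // No whitespace between `if` and `!`: this still parses as `if (!true) { .. }`,
    // not as an invocation of a macro named `if`.
    if! true {
        "then"
    } else {
        "else"
    }
}

fn main() {
    assert_eq!(ARR, [255, 255]);
    assert_eq!(negated_branch(), "else");
}
```

Both spellings are accepted by the compiler, which is why the highlighter has to cope with them regardless of how the whitespace falls.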

Member Author

The same issue cannot happen with something other than idents but added it just in case.

@GuillaumeGomez GuillaumeGomez force-pushed the keyword-as-macros branch 2 times, most recently from 3f877d5 to e108cb6 Compare November 9, 2025 11:47
@rustbot
Collaborator

rustbot commented Nov 9, 2025

This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@fmease
Member

fmease commented Nov 9, 2025

Technically speaking, _ is a keyword. You can't name items that (except for const items which are special-cased) or generic parameters. Moreover "naming" local bindings incl. function parameters that way doesn't bind _ / introduce a binding called _, it just discards the value and thus exhibits different drop behavior compared to regular identifiers like x or _x.
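The drop-behavior difference mentioned above can be seen in a small sketch. The `Noisy` type and `drop_order` function are invented for this illustration; they simply record the order in which values are dropped.

```rust
use std::cell::RefCell;

/// Records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // `_` does not bind: the value is dropped at the end of this statement.
        let _ = Noisy { name: "underscore", log: &log };
        // `_x` is a real binding: the value lives until the end of the block.
        let _x = Noisy { name: "named", log: &log };
        log.borrow_mut().push("end of block");
    }
    log.into_inner()
}

fn main() {
    assert_eq!(drop_order(), ["underscore", "end of block", "named"]);
}
```

The value matched against `_` is gone before the block's last statement runs, while the one bound to `_x` is only dropped when the block ends.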

@GuillaumeGomez
Member Author

Ok ok, removing this change.

}

/// Used to know if a keyword followed by a `!` should never be treated as a macro.
const KEYWORDS_FOLLOWABLE_BY_VALUE: &[&str] = &["if", "while", "match", "break", "return"];
Member

Could you include impl as well? To fix negative impls: https://doc.rust-lang.org/nightly/src/core/marker.rs.html#1028

[Screenshot: marker.rs source page]

Of course, then the naming …_BY_VALUE won't make sense anymore because the thing following the impl ! is a trait/type.

Member Author

@GuillaumeGomez GuillaumeGomez Nov 9, 2025

Gonna rename the const. Syntax ambiguities are a nightmare on their own. :')
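A minimal sketch of the deny-list idea with `impl` included, assuming a made-up constant name (`NON_MACRO_KEYWORDS`) and a simplified helper (`looks_like_macro`); neither is the name or signature used in rustdoc:

```rust
/// Hypothetical name: keywords that are never a macro name, even when followed by `!`.
const NON_MACRO_KEYWORDS: &[&str] = &["if", "while", "match", "break", "return", "impl"];

/// Returns `true` if an identifier-like token should be highlighted as a macro call.
/// `next_is_bang` stands in for "the next non-whitespace token is `!`".
fn looks_like_macro(ident: &str, next_is_bang: bool) -> bool {
    next_is_bang && !NON_MACRO_KEYWORDS.contains(&ident)
}

fn main() {
    assert!(looks_like_macro("println", true));
    assert!(!looks_like_macro("println", false));
    assert!(!looks_like_macro("if", true)); // `if !cond { .. }`
    assert!(!looks_like_macro("impl", true)); // `impl !Trait for T {}`
}
```

With `impl` in the list, the constant no longer describes "keywords followable by a value", which is the naming problem raised above.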

Comment on lines 1258 to 1259
// So if it's not a keyword which can be followed by a value (like `if` or
// `return`) and the next non-whitespace token is a `!`, then we consider
// it's a macro.
if !KEYWORDS_FOLLOWABLE_BY_VALUE.contains(&text)
&& matches!(self.peek_non_trivia(), Some((TokenKind::Bang, _)))
Member

@fmease fmease Nov 9, 2025

Actually, why don't we exclude all keywords except for self, super, crate (.is_path_segment_keyword())[1]? I.e., turning the list around since you can't invoke arbitrary keywords as macros (unless r#'ed ofc which we already handle).

For example, the list keeps growing, there's not only impl !Trait for () from above (so impl) but also #[cfg(false)] impl const !Trait for () {} (so const).

Footnotes

  1. Since #[cfg(false)] self!(…) and so on is valid even though you can't define a macro called self / r#self.

Member Author

It doesn't work because try is not a path segment keyword, and yet you can name your macro try. So we're blocked on the constant. :')

Member

@fmease fmease Nov 9, 2025

Ah yeah, true, well until we start factoring in the requested/ambient edition (see #148221), then that will no longer pose a problem.

Still, we could invert this check by rejecting all keywords except self, super, crate (.is_path_segment_keyword()) as mentioned and edition-sensitive keywords like async, try, gen (HACK: .is_reserved(Rust2015) (since we've never unreserved a keyword so far)).

However, I guess it's fine to keep your current approach given a fix of #148221 would be the ultimate & proper fix.
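The inverted check described in this comment could be sketched roughly as follows. The keyword lists are written out by hand for illustration and are an assumption, not rustc's symbol tables; `can_be_macro_name` is a hypothetical helper standing in for the combination of `.is_reserved(Rust2015)` and `.is_path_segment_keyword()`.

```rust
/// Hand-written approximation of the keywords reserved in the 2015 edition.
/// Edition-sensitive keywords such as `async`, `try` and `gen` are absent on purpose.
const RESERVED_IN_2015: &[&str] = &[
    "as", "break", "const", "continue", "crate", "else", "enum", "extern", "false", "fn",
    "for", "if", "impl", "in", "let", "loop", "match", "mod", "move", "mut", "pub", "ref",
    "return", "self", "Self", "static", "struct", "super", "trait", "true", "type", "unsafe",
    "use", "where", "while", "abstract", "become", "box", "do", "final", "macro", "override",
    "priv", "typeof", "unsized", "virtual", "yield",
];

/// The path-segment keywords named in the comment above.
const PATH_SEGMENT_KEYWORDS: &[&str] = &["self", "super", "crate"];

/// Returns `true` if `ident` followed by `!` may be treated as a macro invocation.
fn can_be_macro_name(ident: &str) -> bool {
    PATH_SEGMENT_KEYWORDS.contains(&ident) || !RESERVED_IN_2015.contains(&ident)
}

fn main() {
    assert!(can_be_macro_name("println"));
    assert!(can_be_macro_name("try")); // not reserved in 2015
    assert!(can_be_macro_name("self")); // path-segment keyword
    assert!(!can_be_macro_name("impl"));
    assert!(!can_be_macro_name("const"));
}
```

Compared with a deny-list, this variant would not need a new entry each time another keyword-then-`!` construct such as `impl const !Trait` turns up.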

Member Author

Well, I had in the back of my mind to use the actual rustc parser for this to get an AST for quite some time (and completely forgot until @yotamofek just asked me today why we didn't do it 🤣 ).

Member

@fmease fmease Nov 9, 2025

cc @yotamofek

Well, it's an AST, not a CST (concrete syntax tree), so you can't really faithfully / losslessly reconstruct the source code unless you meticulously call span_to_snippet basically everywhere and I'm talking everywhere and even then it's basically impossible.

Of course, we don't necessarily need to reconstruct the source à la rustfmt and could just use it to splice the source string in a few selected places but that wouldn't allow us to highlight comments as they aren't represented in the AST obviously and some keywords most likely (again, we can do some span_to_snippet trickery but we will get this wrong similar to rustfmt which just swallows comments here & there (we would only fail to highlight things but still)).

I mean it's worth a try, maybe I'm missing some third approach that's miles better.

Okay, so we could follow a hybrid solution by lexing the source, going through the token stream like we do now to highlight comments only, then parse the token stream & use it to splice+highlight the source. Might be perf heavy. Still won't catch everything but alright, trade-offs are everywhere and it might be better than the current approach.

Contributor

@yotamofek yotamofek Nov 9, 2025

I might be saying silly things due to lack of knowledge,
but can't we just use the lexer, without invoking the parser, to generate a stream of Tokens? Then we "just" have to map a TokenKind to a CSS class?

I do wonder why rustfmt doesn't do that, though. Is it because they need AST-level information? But isn't rustfmt context-unaware?

I'll try to read up on it, maybe we can talk about it at tomorrow's meeting. Seems like it would simplify rustdoc quite a bit and be a much more robust solution, assuming it's actually feasible. But again - no idea what I'm talking about here 😁

Contributor

Yeah, was gonna say the lexer is probably much slower since it also allows for recovery, suggestions, and can't assume the code is actually syntactically valid (which we can, I think?)

Member

@fmease fmease Nov 9, 2025

Just for the sake of completeness, I'll mention it: While I'm pretty sure all the ASTs created at the start of the rustdoc process were dropped already, the HIR should still be around. We could in theory visit it and splice the source according to all the spans we find in the HIR.

Now, the biggest drawback of that will probably be syntactic sugar like for loops and async bodies (the latter have been turned into state machines at this point) which we might not be able to highlight easily or at all but I could be wrong.

Member

@fmease fmease Nov 9, 2025

> I might be saying silly things due to lack of knowledge,
> but can't we just use the lexer, without invoking the parser, to generate a stream of Tokens? Then we "just" have to map a TokenKind to a CSS class?

That's essentially what we're doing right now. We're currently only lexing the source using rustc_lexer and iterate through its Tokens (cc https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/enum.TokenKind.html).

rustc_parse's lexer, which you've mentioned, only transforms the token stream provided by rustc_lexer into a different representation that's "slightly easier" to parse. For all intents and purposes, however, they're the same thing in a different color; rustdoc's approach wouldn't change on a macro scale by switching over to it.
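To make the "map a token to a CSS class" idea concrete, here is a toy sketch. The `Tok` enum, the class names and both functions are invented for this example and do not mirror rustc_lexer's `TokenKind` or rustdoc's highlighter; the point is only that a purely per-token mapping cannot tell a macro name from a plain identifier without looking ahead.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Tok<'a> {
    Ident(&'a str),
    Bang,
    Whitespace,
    Other,
}

/// Context-free mapping: one token in, one class out.
fn class_of(tok: Tok) -> Option<&'static str> {
    match tok {
        Tok::Ident("if") | Tok::Ident("return") => Some("kw"),
        Tok::Ident(_) => Some("ident"),
        _ => None,
    }
}

/// Returns the next token that is not whitespace, if any.
fn peek_non_whitespace<'a>(rest: &[Tok<'a>]) -> Option<Tok<'a>> {
    rest.iter().copied().find(|t| *t != Tok::Whitespace)
}

/// Same mapping, but a non-keyword identifier followed by `!` becomes a macro.
fn class_with_lookahead(tok: Tok, rest: &[Tok]) -> Option<&'static str> {
    match tok {
        Tok::Ident(_)
            if class_of(tok) == Some("ident")
                && peek_non_whitespace(rest) == Some(Tok::Bang) =>
        {
            Some("macro")
        }
        _ => class_of(tok),
    }
}

fn main() {
    let after_name = [Tok::Whitespace, Tok::Bang, Tok::Other];
    // Without context, a macro name is just an identifier.
    assert_eq!(class_of(Tok::Ident("println")), Some("ident"));
    // With one token of lookahead it can be told apart from a keyword.
    assert_eq!(class_with_lookahead(Tok::Ident("println"), &after_name), Some("macro"));
    assert_eq!(class_with_lookahead(Tok::Ident("if"), &after_name), Some("kw"));
}
```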

Contributor

Oh right, got it. Thanks for the explanation!

Comment on lines 1236 to 1239
self.in_macro = true;
let span = new_span(before, text, file_span);
sink(DUMMY_SP, Highlight::EnterSpan { class: Class::Macro(span) });
sink(span, Highlight::Token { text, class: None });
Member

Suggested change
- self.in_macro = true;
- let span = new_span(before, text, file_span);
- sink(DUMMY_SP, Highlight::EnterSpan { class: Class::Macro(span) });
- sink(span, Highlight::Token { text, class: None });
+ self.new_macro_span(text, sink, before, file_span);

@fmease
Member

fmease commented Nov 9, 2025

r=fmease,yotamofek with comments addressed

@fmease fmease added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 9, 2025
@GuillaumeGomez
Member Author

@bors r=yotamofek,fmease rollup

@bors
Collaborator

bors commented Nov 9, 2025

📌 Commit 2c4a593 has been approved by yotamofek,fmease

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 9, 2025
bors added a commit that referenced this pull request Nov 9, 2025
Rollup of 4 pull requests

Successful merges:

 - #148248 (Constify `ControlFlow` methods without unstable bounds)
 - #148285 (Constify `ControlFlow` methods with unstable bounds)
 - #148510 (compiletest: Do the known-directives check only once, and improve its error message)
 - #148655 (Fix invalid macro tag generation for keywords which can be followed by values)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 5430082 into rust-lang:master Nov 10, 2025
11 checks passed
rust-timer added a commit that referenced this pull request Nov 10, 2025
Rollup merge of #148655 - GuillaumeGomez:keyword-as-macros, r=yotamofek,fmease

Fix invalid macro tag generation for keywords which can be followed by values
@rustbot rustbot added this to the 1.93.0 milestone Nov 10, 2025
@GuillaumeGomez GuillaumeGomez deleted the keyword-as-macros branch November 10, 2025 10:46
Labels

S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output.

Development

Successfully merging this pull request may close these issues.

ICE: Didn't find 'Class::Original' to close

6 participants