Skip to content

compiletest: Improve diagnostics for line annotation mismatches 2 #144747

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 3, 2025

Conversation

petrochenkov
Copy link
Contributor

@petrochenkov petrochenkov commented Jul 31, 2025

Follow up to #140622 based on feedback from #144590.

@rustbot
Copy link
Collaborator

rustbot commented Jul 31, 2025

r? @Kobzol

rustbot has assigned @Kobzol.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-compiletest Area: The compiletest test runner A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Jul 31, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jul 31, 2025

Some changes occurred in src/tools/compiletest

cc @jieyouxu

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Jul 31, 2025

Before:
New Bitmap Image

After:
New Bitmap Image

@Kobzol
Copy link
Member

Kobzol commented Jul 31, 2025

Looks slightly better to me, especially the separated headers.

@RalfJung What do you think?

@RalfJung
Copy link
Member

RalfJung commented Aug 1, 2025

That helps a lot, thanks!

I am slightly concerned that from the output it looks like it will still sometimes print multiple "reported with different X", which seems like it could drown the output -- these are just guesses at things that might be matching but they could be entirely unrelated, so we should limit the output to reduce the noise caused by wrong guesses.

@petrochenkov
Copy link
Contributor Author

I am slightly concerned that from the output it looks like it will still sometimes print multiple "reported with different X", which seems like it could drown the output -- these are just guesses at things that might be matching but they could be entirely unrelated, so we should limit the output to reduce the noise caused by wrong guesses.

That's what I disagree with, #140622 was based on my experience porting error-pattern and kind-less annotations to line annotations with kind. There the extra suggestions were quite helpful, that's why I preferred to keep more of them.
This PR is a trade-off, more suggestions are filtered if better suggestions exist, but multiple suggestions are still reported if they have the same rank.

@RalfJung
Copy link
Member

RalfJung commented Aug 1, 2025

#140622 was based on my experience porting error-pattern and kind-less annotations to line annotations with kind.

That's a one-time migration effort, so I don't see why we'd optimize for that. The common cases in my experience are

  • a diagnostic message changed: same kind, same line, the hint is useful but it is shown twice (once as "expected different message" and once as "reported with different message") -- so its already quite verbose for this case (which matters a lot as it means one has to scroll up much more when fixing 20 tests at the same time)
  • something changes from warn to err: same message, same line, the hint is useful but I expect it'll also be shown twice like above
  • a diagnostic moved to a different line -- seems like the tool can even detect that, very cool, but again it is printed twice
  • everything massively changed and you have to take a closer look -- in this case the hints are ~never going to be helpful

multiple suggestions are still reported if they have the same rank.

What is an example of multiple suggestions with the same rank? In the screenshot, the two suggestions at the very bottom are both just confusing to me, I don't see how they could be helpful. Message and kind are completely different, why would we even assume that this is talking about the same error?

@petrochenkov
Copy link
Contributor Author

In the screenshot, the two suggestions at the very bottom are both just confusing to me, I don't see how they could be helpful. Message and kind are completely different, why would we even assume that this is talking about the same error?

Because in this case if you write //~ stub the suggestions will give you all the diagnostics on this line so you can choose what to cover and copypaste some of them.

@RalfJung
Copy link
Member

RalfJung commented Aug 1, 2025

Okay I can see how if there's an expected diagnostic without a level, and multiple similar diagnostics on that line, there could be multiple matches.

But why would it suggest diagnostics that don't even contain stub? Both error and kind are different then, at that point the chances of the suggestion being right are fairly low in my experience, and the chances of this being entirely unrelated start to get significant.

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Aug 1, 2025

but again it is printed twice

The duplication is a real problem, yes.
I'd like to resolve it but didn't want to include the fix into #140622, which was already a large change in diagnostics.
Need to figure out

  • what exactly is duplication, to not filter away something useful
  • what version is better to report when there are duplicates - the "reported but not expected" or "expected but not reported", or maybe choose either one or another based on other factors.

@petrochenkov
Copy link
Contributor Author

But why would it suggest diagnostics that don't even contain stub?

Because none of the diagnostics contain stub.
Personally, I often literally write some nonsense into the annotations first to see the right annotations in the diagnostics later.

@RalfJung
Copy link
Member

RalfJung commented Aug 1, 2025

Personally, I often literally write some nonsense into the annotations first to see the right annotations in the diagnostics later.

That's a cute trick, but most of the .rs files don't have nonsense in the annotations, so it seems like an odd point to optimize for. (Also, is it documented? Please make the tool usable for everyone, not just yourself. :) At least in my work, I have to fix existing ui tests much more often than write new ones, and having so many incorrect and duplicate suggestions makes that harder than it was a year ago. The signal-to-noise ratio of the output has gone up significantly. (Ofc I also knew exactly how the old output looked. The new one is a lot easier to interpret when first seeing this, especially after this PR.)

I don't think compiletest should take "the message is entirely different" as a signal for "probably the user wrote deliberate nonsense, let's show all the messages". In general that's just not a good assumption. If you want a specific mode where compiletest shows you all the errors on a line so you can copy-paste the text, that should be explicitly requested, not implicitly assumed. Do we allow empty annotations? If not, they could be used as that signal. But I guess we allow them and it means "just any message", so this may need a keyword/sigil of sorts.

@RalfJung
Copy link
Member

RalfJung commented Aug 1, 2025

What are the conditions under which is still shows multiple suggestions? Did I understand correctly that in the case of #144590, it would now show only one suggestion? Does it ever show multiple suggestions for //~ERROR: stub or only for //~ stub? If it's just the latter, that's so rare I think I could live with it.

@petrochenkov
Copy link
Contributor Author

What are the conditions under which is still shows multiple suggestions?

If there are multiple suggestions with the same rank.
There's a comment in code

            // - message and line / message and kind - great, suggested
            // - only message - good, suggested
            // - known line and kind - ok, suggested
            // - only known line - meh, but suggested

and the ranks are assigned according to that.
It's probably better to look at the examples I'll post below.

Did I understand correctly that in the case of #144590, it would now show only one suggestion?

Yes, in that example it will show one suggestion in all cases.

@Kobzol
Copy link
Member

Kobzol commented Aug 1, 2025

(I'm gonna:

r? RalfJung

since I wasn't involved in the previous discussions and don't work with the UI suite that much to have strong opinions here).

@rustbot rustbot assigned RalfJung and unassigned Kobzol Aug 1, 2025
@petrochenkov
Copy link
Contributor Author

Does it ever show multiple suggestions for //~ERROR: stub

Yes, if there are multiple diagnostics containing "stub" reported on different lines (which is unlikely if it's literally "stub").
Or if there are multiple other ERRORs not containing "stub" are reported on the same line.
Or if there are multiple other WARNs and NOTEs (but not ERRORs) not containing "stub" are reported on the same line.

or only for //~ stub? If it's just the latter, that's so rare I think I could live with it.

This is equivalent to //~ UNKNOWN stub where UNKNOWN simply doesn't match any kind that can be produced by the compiler, otherwise the same logic as above applies.

@RalfJung
Copy link
Member

RalfJung commented Aug 1, 2025

Yes, if there are multiple diagnostics containing "stub" reported on different lines (which is unlikely if it's literally "stub").

So it's listing all lines that have a diagnostic that matches the text? That could be extremely verbose if there's many similar / identical diagnostics. At that point it makes much more sense to look at the stderr file and figure out what is going on IMO.

Or is it only diagnostics that haven't been matched already? That would be a lot better, but still risks printing a lot of irrelevant output.

This is equivalent to //~ UNKNOWN stub where UNKNOWN simply doesn't match any kind that can be produced by the compiler, otherwise the same logic as above applies.

Oh interesting, I would have expected this to match all kinds produced by the compiler.

@RalfJung
Copy link
Member

RalfJung commented Aug 1, 2025

since I wasn't involved in the previous discussions and don't work with the UI suite that much to have strong opinions here).

I wasn't involved in the previous discussions here either, but fundamentally we just don't seem to have consensus on whether the tool should err on the side of being more verbose vs more concise.

I would suggest we land this PR (with the nit about colons resolved, if we have consensus on that), since I consider it a clear improvement -- but it shouldn't close the issue as output still seems very verbose in some cases.

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Aug 1, 2025

Yeah, I also suggest to land this change, then address the duplication, then maybe reconsider showing the remaining suggestions.

@petrochenkov
Copy link
Contributor Author

Or is it only diagnostics that haven't been matched already?

Yes, only unmatched diagnostics and annotations are shown in this output.

Oh interesting, I would have expected this to match all kinds produced by the compiler.

It was like that before #139720, but now not specifying the kind is prohibited, so UNKNOWN doesn't match anything.

@petrochenkov
Copy link
Contributor Author

Updated with the added colons (#144747 (comment)).

@RalfJung
Copy link
Member

RalfJung commented Aug 1, 2025

Thanks!

@bors r+

@bors
Copy link
Collaborator

bors commented Aug 1, 2025

📌 Commit d5ffb06 has been approved by RalfJung

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 1, 2025
@petrochenkov
Copy link
Contributor Author

@bors rollup

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Aug 1, 2025
compiletest: Improve diagnostics for line annotation mismatches 2

Follow up to rust-lang#140622 based on feedback from rust-lang#144590.
bors added a commit that referenced this pull request Aug 1, 2025
Rollup of 11 pull requests

Successful merges:

 - #132748 (get rid of some false negatives in rustdoc::broken_intra_doc_links)
 - #135771 ([rustdoc] Add support for associated items in "jump to def" feature)
 - #143360 (loop match: error on `#[const_continue]` outside `#[loop_match]`)
 - #143662 ([rustdoc] Display unsafe attrs with edition 2024 `unsafe()` wrappers.)
 - #143900 ([rustdoc] Correctly handle `should_panic` doctest attribute and fix `--no-run` test flag on the 2024 edition)
 - #144478 (Improve formatting of doc code blocks)
 - #144703 ([test][AIX] ignore extern_weak linkage test)
 - #144747 (compiletest: Improve diagnostics for line annotation mismatches 2)
 - #144756 (detect infinite recursion with tail calls in ctfe)
 - #144766 (Add human readable name "Cygwin")
 - #144782 (Properly pass path to staged `rustc` to `compiletest` self-tests)

r? `@ghost`
`@rustbot` modify labels: rollup
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Aug 1, 2025
compiletest: Improve diagnostics for line annotation mismatches 2

Follow up to rust-lang#140622 based on feedback from rust-lang#144590.
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Aug 1, 2025
compiletest: Improve diagnostics for line annotation mismatches 2

Follow up to rust-lang#140622 based on feedback from rust-lang#144590.
bors added a commit that referenced this pull request Aug 2, 2025
Rollup of 12 pull requests

Successful merges:

 - #132748 (get rid of some false negatives in rustdoc::broken_intra_doc_links)
 - #135771 ([rustdoc] Add support for associated items in "jump to def" feature)
 - #143360 (loop match: error on `#[const_continue]` outside `#[loop_match]`)
 - #143662 ([rustdoc] Display unsafe attrs with edition 2024 `unsafe()` wrappers.)
 - #144478 (Improve formatting of doc code blocks)
 - #144703 ([test][AIX] ignore extern_weak linkage test)
 - #144747 (compiletest: Improve diagnostics for line annotation mismatches 2)
 - #144766 (Add human readable name "Cygwin")
 - #144782 (Properly pass path to staged `rustc` to `compiletest` self-tests)
 - #144786 (Cleanup the definition of `group_type`)
 - #144796 (Add my previous commit name to .mailmap)
 - #144797 (Update safety comment for new_unchecked in niche_types)

r? `@ghost`
`@rustbot` modify labels: rollup
samueltardieu added a commit to samueltardieu/rust that referenced this pull request Aug 2, 2025
compiletest: Improve diagnostics for line annotation mismatches 2

Follow up to rust-lang#140622 based on feedback from rust-lang#144590.
bors added a commit that referenced this pull request Aug 2, 2025
Rollup of 18 pull requests

Successful merges:

 - #132748 (get rid of some false negatives in rustdoc::broken_intra_doc_links)
 - #135771 ([rustdoc] Add support for associated items in "jump to def" feature)
 - #143360 (loop match: error on `#[const_continue]` outside `#[loop_match]`)
 - #143662 ([rustdoc] Display unsafe attrs with edition 2024 `unsafe()` wrappers.)
 - #143771 (Constify some more `Result` functions)
 - #143900 ([rustdoc] Correctly handle `should_panic` doctest attribute and fix `--no-run` test flag on the 2024 edition)
 - #144185 (Document guarantees of poisoning)
 - #144395 (update fortanix tests)
 - #144478 (Improve formatting of doc code blocks)
 - #144614 (Fortify RemoveUnneededDrops test.)
 - #144703 ([test][AIX] ignore extern_weak linkage test)
 - #144747 (compiletest: Improve diagnostics for line annotation mismatches 2)
 - #144756 (detect infinite recursion with tail calls in ctfe)
 - #144766 (Add human readable name "Cygwin")
 - #144782 (Properly pass path to staged `rustc` to `compiletest` self-tests)
 - #144786 (Cleanup the definition of `group_type`)
 - #144796 (Add my previous commit name to .mailmap)
 - #144797 (Update safety comment for new_unchecked in niche_types)

Failed merges:

 - #144805 (compiletest: Preliminary cleanup of `ProcRes` printing/unwinding)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit that referenced this pull request Aug 2, 2025
Rollup of 17 pull requests

Successful merges:

 - #132748 (get rid of some false negatives in rustdoc::broken_intra_doc_links)
 - #143360 (loop match: error on `#[const_continue]` outside `#[loop_match]`)
 - #143662 ([rustdoc] Display unsafe attrs with edition 2024 `unsafe()` wrappers.)
 - #143771 (Constify some more `Result` functions)
 - #144185 (Document guarantees of poisoning)
 - #144395 (update fortanix tests)
 - #144478 (Improve formatting of doc code blocks)
 - #144614 (Fortify RemoveUnneededDrops test.)
 - #144703 ([test][AIX] ignore extern_weak linkage test)
 - #144747 (compiletest: Improve diagnostics for line annotation mismatches 2)
 - #144756 (detect infinite recursion with tail calls in ctfe)
 - #144766 (Add human readable name "Cygwin")
 - #144782 (Properly pass path to staged `rustc` to `compiletest` self-tests)
 - #144786 (Cleanup the definition of `group_type`)
 - #144796 (Add my previous commit name to .mailmap)
 - #144797 (Update safety comment for new_unchecked in niche_types)
 - #144803 (rustc-dev-guide subtree update)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 7e8ef3a into rust-lang:master Aug 3, 2025
10 checks passed
@rustbot rustbot added this to the 1.90.0 milestone Aug 3, 2025
rust-timer added a commit that referenced this pull request Aug 3, 2025
Rollup merge of #144747 - petrochenkov:annusexp2, r=RalfJung

compiletest: Improve diagnostics for line annotation mismatches 2

Follow up to #140622 based on feedback from #144590.
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Aug 4, 2025
Rollup of 17 pull requests

Successful merges:

 - rust-lang/rust#132748 (get rid of some false negatives in rustdoc::broken_intra_doc_links)
 - rust-lang/rust#143360 (loop match: error on `#[const_continue]` outside `#[loop_match]`)
 - rust-lang/rust#143662 ([rustdoc] Display unsafe attrs with edition 2024 `unsafe()` wrappers.)
 - rust-lang/rust#143771 (Constify some more `Result` functions)
 - rust-lang/rust#144185 (Document guarantees of poisoning)
 - rust-lang/rust#144395 (update fortanix tests)
 - rust-lang/rust#144478 (Improve formatting of doc code blocks)
 - rust-lang/rust#144614 (Fortify RemoveUnneededDrops test.)
 - rust-lang/rust#144703 ([test][AIX] ignore extern_weak linkage test)
 - rust-lang/rust#144747 (compiletest: Improve diagnostics for line annotation mismatches 2)
 - rust-lang/rust#144756 (detect infinite recursion with tail calls in ctfe)
 - rust-lang/rust#144766 (Add human readable name "Cygwin")
 - rust-lang/rust#144782 (Properly pass path to staged `rustc` to `compiletest` self-tests)
 - rust-lang/rust#144786 (Cleanup the definition of `group_type`)
 - rust-lang/rust#144796 (Add my previous commit name to .mailmap)
 - rust-lang/rust#144797 (Update safety comment for new_unchecked in niche_types)
 - rust-lang/rust#144803 (rustc-dev-guide subtree update)

r? `@ghost`
`@rustbot` modify labels: rollup
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-compiletest Area: The compiletest test runner A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants