Skip to content

str: Mark unstable round_char_boundary feature functions as const #144472

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 28, 2025

Conversation

okaneco
Copy link
Contributor

@okaneco okaneco commented Jul 25, 2025

Mark floor_char_boundary, ceil_char_boundary const
Simplify the implementations, reducing the number of arithmetic operations

It seems unnecessary to do the lower/upper bounds calculations and extra slicing when we can jump straight to inspecting the bytes, assuming the underlying data is valid UTF-8.

Tracking issue #93743

@rustbot
Copy link
Collaborator

rustbot commented Jul 25, 2025

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 25, 2025
@clarfonthey
Copy link
Contributor

I wouldn't necessarily say this "simplifies" the implementation since a lot of this is relative to what the optimised output is, and the new implementation may not be as clear to the optimizer that at most four iterations happen. However, I think that it's a valid implementation and making it const is nice.

}

// The character boundary will be within four bytes of the index
debug_assert!(i <= index + 3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think that simply adding this as a condition to the loop itself might be helpful (and similarly above) to ensure that the optimiser knows this precondition as well. Not sure how it'd affect codegen, but it feels like the safest approach compared to, for example, assert_unchecked.

@okaneco
Copy link
Contributor Author

okaneco commented Jul 26, 2025

Compiler explorer links
floor_char - https://rust.godbolt.org/z/PKr1GP1rs
ceil_char - https://rust.godbolt.org/z/nMqPT4rYY

Some surface notes:
The while loops don't have panic paths.
For the iterators, the floor version unrolls but the ceil remains a loop.

I filed an issue because of noticing something in the original implementation #139759.
The saturating_sub from the index still required a bounds check, but that's been fixed and should be in the bump to LLVM 21.

Mark `floor_char_boundary`, `ceil_char_boundary` const
Simplify the implementations, reducing the number of arithmetic operations
@Mark-Simulacrum
Copy link
Member

@bors r+

Seems reasonable enough and ~equally clear to me.

@bors
Copy link
Collaborator

bors commented Jul 27, 2025

📌 Commit 7f7d343 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 27, 2025
bors added a commit that referenced this pull request Jul 28, 2025
Rollup of 7 pull requests

Successful merges:

 - #144072 (update `Atomic*::from_ptr` and `Atomic*::as_ptr` docs)
 - #144151 (`tests/ui/issues/`: The Issues Strike Back [1/N])
 - #144300 (Clippy fixes for miropt-test-tools)
 - #144399 (Add a ratchet for moving all standard library tests to separate packages)
 - #144472 (str: Mark unstable `round_char_boundary` feature functions as const)
 - #144503 (Various refactors to the codegen coordinator code (part 3))
 - #144530 (coverage: Infer `instances_used` from `pgo_func_name_var_map`)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit e36b844 into rust-lang:master Jul 28, 2025
10 checks passed
@rustbot rustbot added this to the 1.90.0 milestone Jul 28, 2025
rust-timer added a commit that referenced this pull request Jul 28, 2025
Rollup merge of #144472 - okaneco:char_bound, r=Mark-Simulacrum

str: Mark unstable `round_char_boundary` feature functions as const

Mark `floor_char_boundary`, `ceil_char_boundary` const
Simplify the implementations, reducing the number of arithmetic operations

It seems unnecessary to do the lower/upper bounds calculations and extra slicing when we can jump straight to inspecting the bytes, assuming the underlying data is valid UTF-8.

Tracking issue #93743
@okaneco okaneco deleted the char_bound branch July 28, 2025 11:55
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this pull request Jul 30, 2025
str: Mark unstable `round_char_boundary` feature functions as const

Mark `floor_char_boundary`, `ceil_char_boundary` const
Simplify the implementations, reducing the number of arithmetic operations

It seems unnecessary to do the lower/upper bounds calculations and extra slicing when we can jump straight to inspecting the bytes, assuming the underlying data is valid UTF-8.

Tracking issue rust-lang#93743
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants