Rewrite `is_ascii` using `slice::as_chunks` #144837

Kmeakin · 2025-08-02T17:52:03Z

Generalize the x86-64+sse2 version of `is_ascii` to be architecture-neutral, and rewrite it using `slice::as_chunks`. The new version is both shorter (in terms of Rust source code) and smaller (in terms of produced assembly).

Compare the assembly generated before and after:
https://godbolt.org/z/MWKdnaYoK

rustbot · 2025-08-02T17:52:07Z

r? @tgross35

rustbot has assigned @tgross35.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

hanna-kruppe · 2025-08-02T18:43:48Z

The simpler source code and shorter assembly seem to boil down to two changes:

1. Using unaligned loads for every full chunk, instead of trying to align all loads except possibly the first and the last one.
2. Always using the simple byte-by-byte loop for the last `bytes.len() % CHUNK_SIZE` bytes, instead of trying to handle it with an unaligned load that overlaps with the preceding chunk.

The first one seems quite reasonable in many cases. It probably causes a huge performance regression for targets that don't have efficient unaligned loads, but to be fair, those are becoming less common and less important over time.

The second change may be quite problematic for some common input sizes, though. Try benchmarking before vs. after on an input that's `2 * CHUNK_SIZE - 1` bytes long, or with random short input lengths that make the branches and iteration counts less predictable.
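To make the two tail strategies concrete, here is a rough sketch of both. The function names and the `CHUNK_SIZE` value are hypothetical, and neither function is taken from the old or new implementation; they only show the difference in how the leftover bytes are handled.

```rust
/// Hypothetical chunk width, chosen only for illustration.
const CHUNK_SIZE: usize = 16;

/// OR-fold a block of bytes; the result stays below 0x80 only if every
/// byte in the block is ASCII.
fn block_is_ascii(block: &[u8]) -> bool {
    block.iter().fold(0u8, |acc, &b| acc | b) < 0x80
}

/// Tail strategy A: check the leftover bytes one at a time.
fn is_ascii_bytewise_tail(bytes: &[u8]) -> bool {
    let mut chunks = bytes.chunks_exact(CHUNK_SIZE);
    for chunk in &mut chunks {
        if !block_is_ascii(chunk) {
            return false;
        }
    }
    // Up to CHUNK_SIZE - 1 iterations of a scalar loop.
    chunks.remainder().iter().all(|b| b.is_ascii())
}

/// Tail strategy B: cover the leftover bytes with one CHUNK_SIZE-wide block
/// that ends at the end of the slice and overlaps the preceding chunk.
fn is_ascii_overlapping_tail(bytes: &[u8]) -> bool {
    if bytes.len() < CHUNK_SIZE {
        // Too short for even one full block; fall back to single bytes.
        return bytes.iter().all(|b| b.is_ascii());
    }
    for chunk in bytes.chunks_exact(CHUNK_SIZE) {
        if !block_is_ascii(chunk) {
            return false;
        }
    }
    // Re-checking a few already-verified bytes is harmless for a pure
    // predicate, so the final block may overlap the previous one.
    block_is_ascii(&bytes[bytes.len() - CHUNK_SIZE..])
}
```

With this `CHUNK_SIZE`, an input of `2 * CHUNK_SIZE - 1 = 31` bytes has one full chunk and 15 leftover bytes: strategy A checks those 15 bytes individually, while strategy B covers them with a single additional block.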

okaneco · 2025-08-02T19:45:21Z

There are some benchmarks in `library/core/benches/ascii/is_ascii.rs`; I added more, along with a codegen test, in #130733.

When I originally made that PR, new uses of `const_eval_select` seemed to be discouraged when making a function const, and then the situation was a little different by the time it was reviewed and merged.

However, the usize-aligned path is probably still needed for targets without SIMD like `i586-unknown-linux-gnu`, since it can do SWAR ASCII checks instead of checking a byte at a time.
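As a rough illustration of the SWAR (SIMD within a register) idea, the sketch below tests one `usize` worth of bytes per iteration by masking the high bit of every byte. The names `HIGH_BITS` and `is_ascii_swar` are made up for this example, and unlike the aligned path in `core` it copies each word out of the slice rather than aligning the pointer first.

```rust
/// A word with the high bit of every byte set,
/// e.g. 0x8080_8080_8080_8080 on a 64-bit target.
const HIGH_BITS: usize = usize::MAX / 0xFF * 0x80;

/// Illustrative word-at-a-time ASCII check (not the code in `core`).
fn is_ascii_swar(bytes: &[u8]) -> bool {
    const WORD: usize = std::mem::size_of::<usize>();

    let mut words = bytes.chunks_exact(WORD);
    for w in &mut words {
        // Copy the bytes into a word; byte order does not matter here
        // because the mask treats every byte position the same way.
        let mut buf = [0u8; WORD];
        buf.copy_from_slice(w);
        let word = usize::from_ne_bytes(buf);

        // Any non-ASCII byte leaves its high bit set after masking.
        if word & HIGH_BITS != 0 {
            return false;
        }
    }

    // Fewer than WORD bytes remain; check them individually.
    words.remainder().iter().all(|b| b.is_ascii())
}
```

On a 64-bit target this tests eight bytes per iteration using only ordinary integer instructions, which is the property that matters on targets without SIMD.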

rustbot assigned tgross35 Aug 2, 2025

rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties) and T-libs (Relevant to the library team, which will review and decide on the PR/issue) labels Aug 2, 2025

Kmeakin force-pushed the km/optimize-is-ascii branch from c43111d to 7cbbbc6 Compare August 2, 2025 18:08