Optimize fmod with a method using integer multiplication #1002
This is kind of a retry at #898. One of the problems there was that it would have added overhead and regressed performance for typical inputs.
Unlike that PR, this doesn't aim for sub-linear scaling; the cost of evaluating `fmod(x, y)` is still roughly proportional to `log2(|x/y|)`. However, the constant factor is much better. Running the `random` benchmarks locally, I got walltime reductions of

New utilities in `libm::support`:

- `trait NarrowingDiv` for dividing `u2N / uN` when the quotient fits in `uN`
- `u256 / u128`
- `fn linear_mul_reduction<U>(x: U, mut e: u32, y: U) -> U` computes `(x << e) % y` with the new method
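
To make the contracts of these utilities concrete, here is a minimal reference sketch for `u64`. It is not the code from this PR: `narrowing_div_rem` and `shl_rem_reference` are hypothetical names, and the reduction uses the built-in `u128` division and remainder operators rather than the multiplication-based method. It only illustrates what a narrowing division means (a `u2N / uN` division whose quotient fits in `uN`) and what `(x << e) % y` is expected to return, assuming `x < y`.

```rust
/// Hypothetical stand-in for a narrowing division: divides a 128-bit
/// numerator by a 64-bit divisor, assuming the quotient fits in `u64`.
/// Returns `(quotient, remainder)`.
fn narrowing_div_rem(n: u128, d: u64) -> (u64, u64) {
    debug_assert!(d != 0);
    // The quotient fits in 64 bits exactly when the high half of `n` is below `d`.
    debug_assert!(((n >> 64) as u64) < d);
    let q = n / (d as u128);
    let r = n % (d as u128);
    (q as u64, r as u64)
}

/// Reference computation of `(x << e) % y` for `x < y`.
/// Each iteration consumes up to 64 bits of the shift, so the number of
/// iterations grows linearly with `e`.
fn shl_rem_reference(mut x: u64, mut e: u32, y: u64) -> u64 {
    assert!(y != 0 && x < y);
    while e > 0 {
        let k = e.min(64);
        // Since x < y, the high half of (x << k) is below y for k <= 64,
        // so the narrowing division's precondition holds.
        let (_, r) = narrowing_div_rem((x as u128) << k, y);
        x = r; // the remainder is again below y, preserving the invariant
        e -= k;
    }
    x
}
```

For example, `shl_rem_reference(1, 10, 1000)` evaluates `1024 % 1000` and returns `24`. The loop runs about `e / 64` times, which matches the description above of a cost that stays linear in the exponent difference while improving the constant factor.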