The lower bound will be \begin{equation} \min _{x \in [a, b] \wedge m\backslash x, y \in [c, d] \wedge n\backslash y} x | y \end{equation} And the upper bound will be \begin{equation} \max _{x \in [a, b] \wedge m\backslash x, y \in [c, d] \wedge n\backslash y} x | y \end{equation} Where

`m\x`

means "`x`

is divisible by `m`

".So how can we calculate them faster than direct evaluation? I don't know, and to my knowledge, no one else does either. But if sound (ie only

*over*approximating) but non-tight bounds are OK, then there is a way. Part of the trick is constraining

`m`

and `n`

to be powers of two. It's safe to use `m = m & -m`

. That should look familiar - it's extracting the rightmost bit of `m`

. An other explanation of "the rightmost bit of `m`

" is "the highest power of two that divides `m`

". That doesn't rule out any values of `x`

that were valid before, so it's a sound approximation.Strangely, for

`minOR`

, if the bounds are pre-rounded to their corresponding powers of two, there is absolutely no difference in the code whatsoever. It is possible to set a bit that is known to be zero in that bound, but that can only happen if that bit is one in the other bound anyway, so it doesn't affect the result. The other case, setting a bit that is not known to be zero, is the same as it would be with only the range information.`maxOR`

is a problem though. In `maxOR`

, bits at the right are set which may be known to be zero. Some of those bits may have to be reset. But how many? To avoid resetting too many bits, we have to round the result down to a multiple of `min(m, n)`

. That's clearly sound - if a bit can't be one in both `x`

and `n`

, obviously it can't be one in the result. But it turns out not to be tight - for example for `[8, 9] 1\x`

and `[0, 8] 4\y`

, it computes 0b1111, even though the last two bits can only be 0b00 or 0b01 (`y`

does not contribute to these bits, and the range of `x`

is so small that the bits only have those values) so the tight upper bound is 0b1101. If that's acceptable, the code would be
static uint maxOR(uint a, uint b, uint c, uint d, uint m, uint n) { uint resettableb = (a ^ b) == 0 ? 0 : 0xFFFFFFFF >> nlz(a ^ b); uint resettabled = (c ^ d) == 0 ? 0 : 0xFFFFFFFF >> nlz(c ^ d); uint resettable = b & d & (resettableb | resettabled); uint target = resettable == 0 ? 0 : 1u << bsr(resettable); uint targetb = target & resettableb; uint targetd = target & resettabled & ~resettableb; uint newb = b | (targetb == 0 ? 0 : targetb - 1); uint newd = d | (targetd == 0 ? 0 : targetd - 1); uint mask = (m | n) & (0 - (m | n)); return (newb | newd) & (0 - mask); }Which also uses a sneaky way of getting

`min(m, n)`

- by ORing them and then taking the rightmost bit. Because why not.I haven't (yet?) found a nice way to calculate the tight upper bound. Even if I do, that still leaves things non-tight when the old

`m`

or `n`

were not powers of two.Next post, xor, which has some unique difficulties.

## No comments:

## Post a Comment