Tuesday, 4 June 2024

Sharpening a lower bound with KnownBits information

I have written about this before, but that was a long time ago and I've had a lot more practice with similar things since then. The topic came up on Mastodon, inspiring me to give it another try. Actually, the title is a bit of a lie: I will be using Miné's bitfield domain, in which we have a bitvector z indicating the bits that can be zero and a bitvector o indicating the bits that can be one (as opposed to bitvectors that indicate the bits known to be zero and the bits known to be one respectively, or the bitvectors 0b and 1b used in the paper linked below, where 1b has a bit set for every bit that is known to be set and 0b has a bit unset for every bit that is known to be unset). The exact representation doesn't really matter.
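For concreteness, a known-bits pair converts to the z/o pair by complementing each mask: a bit can be zero unless it is known to be one, and it can be one unless it is known to be zero. A minimal sketch (the struct and function names here are my own):

#include <cstdint>

struct ZO { uint64_t z, o; };

// Masks of bits known to be zero / known to be one, to z/o.
ZO from_known_bits(uint64_t known_zero, uint64_t known_one)
{
    return { ~known_one, ~known_zero };
}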

The problem, to be clear, is this: suppose we have a lower bound on some variable, along with some knowledge about its bits (some bits have a known, fixed value, while others do not). For example, we may know that a variable is even (its least significant bit is known to be zero) and at least 5. "Sharpening" the lower bound means increasing it, if possible, so that the lower bound "fits" the knowledge we have about the bits: if a value is even and at least 5, it is also at least 6, so we can increase the lower bound to 6.
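In the z/o representation, that example looks like this (a small sketch; the variable names are mine, and the sharpening developed below should raise low from 5 to 6):

#include <cstdint>

// "even and at least 5" in the z/o encoding:
const uint64_t z   = ~UINT64_C(0); // every bit may be zero
const uint64_t o   = ~UINT64_C(1); // every bit except bit 0 may be one
const uint64_t low = 5;            // to be sharpened to 6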

As a more recent reference for an algorithm that is better than my old one, you can read Sharpening Constraint Programming approaches for Bit-Vector Theory.

As that paper notes, we need to find the highest bit in the current lower bound that doesn't "fit" the KnownBits information (or the z, o pair from the bitfield domain), and then either:

  • If that bit was not set but should be, we need to set it, and reset any lower bits that are not required to be set (lower bound must go up, but only as little as possible).
  • If that bit was set but shouldn't be, we need to reset it, and in order to do that we need to set a higher bit that wasn't set yet, and also reset any lower bits that are not required to be set. (Both cases are worked through in the small example below.)
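To make the two cases concrete with hypothetical 8-bit values: if low = 0b0101 and bit 1 must be one, the highest wrong bit is bit 1, so we set it and clear the optional bit 0, giving 0b0110. If instead low = 0b0110 and bit 1 must be zero, the highest wrong bit is again bit 1, but now it has to be cleared, which forces us to set the lowest settable bit above it, bit 3 (bit 2 is already set), and clear the optional bits below, giving 0b1000.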

So far so good. What that paper doesn't tell you is that these are essentially the same case, and we can do:

  • Starting from the highest "wrong" bit in the lower bound, find the lowest bit at or above it that is unset but could be set, set it, and clear any lower bits that are not required to be set.

That mostly sounds like the second case; what allows the original two cases to be unified is that in the first case the highest wrong bit is itself unset but allowed (indeed required) to be set, so the search finds exactly the bit that needs to be set in the first case too.

As a reminder, x & -x is a common technique to extract or isolate the lowest set bit, also known as blsi. It can also be written as x & (~x + 1), and if we change the 1 into some other single-bit constant 2^k, the same technique finds the lowest set bit at or above position k rather than the overall lowest set bit. So if we search ~low & o (the bits that are unset in low but allowed to be set) starting from the highest wrong bit, we find the bit we're looking for.
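As a quick sanity check of that generalization, take a hypothetical value x = 0b01100100, whose set bits sit at positions 2, 5 and 6:

// x & (~x + 1) isolates the lowest set bit; adding a single bit 2^k
// instead of 1 isolates the lowest set bit at position k or above.
static_assert((0b01100100u & (~0b01100100u + 1u)) == 0b00000100u,
              "lowest set bit overall: position 2");
static_assert((0b01100100u & (~0b01100100u + 0b1000u)) == 0b00100000u,
              "lowest set bit at or above position 3: position 5");

The annoying part is actually highestSetBit. Putting the rest together, we may get an implementation like this: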

#include <bit>
#include <cstdint>

// highestSetBit isolates the highest set bit of a non-zero value.
static uint64_t highestSetBit(uint64_t x) { return std::bit_floor(x); }

uint64_t sharpen_low(uint64_t low, uint64_t z, uint64_t o)
{
    // m: the bits where low disagrees with the z/o information, i.e.
    // bits that are unset but must be one, or set but must be zero.
    uint64_t m = (~low & ~z) | (low & ~o);
    if (m) {
        // The lowest settable bit at or above the highest wrong bit.
        uint64_t target = ~low & o;
        target &= ~target + highestSetBit(m);
        // Set it, clear the bits below it, then restore the bits
        // that are required to be one.
        low = (low & -target) | target;
        low |= ~z;
    }
    return low;
}

The branch on m is a bit annoying, but on the plus side it means that the input of highestSetBit is always non-zero; zero is otherwise a bit of an annoying case to handle in highestSetBit. In modern C++, you can use std::bit_floor for highestSetBit, as the code above does.
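Here is a quick test of the running example, plus one must-be-one case (a minimal, hypothetical check; it assumes the sharpen_low above is in the same file):

#include <cassert>
#include <cstdint>

int main()
{
    // "even and at least 5": every bit may be zero, every bit
    // except bit 0 may be one; the sharpened lower bound is 6.
    assert(sharpen_low(5, ~UINT64_C(0), ~UINT64_C(1)) == 6);
    // Bit 3 must be one: the smallest value >= 5 that fits is 8.
    assert(sharpen_low(5, ~UINT64_C(8), ~UINT64_C(0)) == 8);
}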

Sharpening the upper bound is symmetric: it can be implemented as ~sharpen_low(~high, o, z), or you could push the bitwise flips "inside" the algorithm and do some algebra to cancel them out.
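For example, with the same "even" knowledge, the largest even value no greater than 5 is 4. A one-line check that could be added to the test above (note that z and o trade places in the call):

assert(~sharpen_low(~UINT64_C(5), ~UINT64_C(1), ~UINT64_C(0)) == 4); // largest even value <= 5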