Saturday 26 September 2020

The range a sum cannot fall within

Throughout this post, the variables A, B, and R are used, with R defined as R = A + B, and A ≤ B. Arithmetic in this post is unsigned and modulo 2^k (2 to the power k, for k-bit values). Note that A ≤ B is not a restriction on the input; it is a choice to label the smaller input as A and the larger input as B. Addition is commutative, so this choice can be made without loss of generality.

R < A || R ≥ B

The sum is less than A iff the addition wraps (1), otherwise it has to be at least B (2).

  1. B cannot be so high that the addition can wrap all the way up to or past A. To make A + B add up to A, B would have had to be 2^k (two to the power k), which is one beyond the maximum value it can be. R = A is possible only if B is zero, in which case R ≥ B holds instead.
  2. Since A is at least zero, in the absence of wrapping there is no way to reduce the value below the inputs.

Perhaps that all looks obvious, but this has a useful application: if the carry-out of the addition is not available, it can be computed via carry = (x + y) < x, which is a relatively well-known trick. It does not matter which of x or y is the smaller or larger input: the sum cannot fall within the "forbidden zone" between them. The occasionally seen carry = (x + y) < max(x, y) adds an unnecessary complication.
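As a minimal sketch of that trick in C, assuming 32-bit unsigned values: the names add_with_carry and add64_via_halves are made up for illustration, and the second function only exists to show the recovered carry being used for something.

```c
#include <stdint.h>

/* Adds x and y modulo 2^32 and reports the carry-out through *carry. */
static uint32_t add_with_carry(uint32_t x, uint32_t y, unsigned *carry)
{
    uint32_t sum = x + y;   /* unsigned addition wraps, which is well-defined */
    *carry = sum < x;       /* the sum drops below an input iff it wrapped */
    return sum;
}

/* 64-bit addition built from two 32-bit halves, propagating the carry. */
static uint64_t add64_via_halves(uint64_t a, uint64_t b)
{
    unsigned carry;
    uint32_t lo = add_with_carry((uint32_t)a, (uint32_t)b, &carry);
    uint32_t hi = (uint32_t)(a >> 32) + (uint32_t)(b >> 32) + carry;
    return ((uint64_t)hi << 32) | lo;
}
```

Comparing against y instead of x would give the same carry, which is the point of the "forbidden zone" argument.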

R < (A & B) || R ≥ (A | B)

This is a stronger statement, because A & B is usually smaller than A and A | B is usually greater than B.

If no wrapping occurs, then R ≥ (A | B). This can be seen for example by splitting the addition into a XOR and adding the carries separately, (A + B) = (A ^ B) + (A & B) * 2, while bitwise OR can be decomposed similarly into (A | B) = (A ^ B) + (A & B) (see below). Since there is no wrapping (by assumption), (A & B) * 2 ≥ (A & B) and therefore (A + B) ≥ (A | B). Or, with less algebra: addition sometimes produces a zero where the bitwise OR produces a one, but then addition compensates doubly for it by carrying into the next position.

For the case in which wrapping occurs I will take a bit-by-bit view. In order to wrap, the carry out of bit k-1 must be 1. In order for the sum to be greater than or equal to A & B, bit k-1 of the sum must be greater than or equal to bit k-1 of A & B. That combination means that the carry into bit k-1 of the sum must have been 1 as well. Furthermore, bit k-1 of the sum can't be greater than bit k-1 of A & B, at most it can be equal, which means bit k-2 must be examined as well. The same argument applies to bit k-2 and so on, until finally for the least-significant bit it becomes impossible for it to be carried into, so the whole thing falls down: by contradiction, A + B must be less than A & B when the sum wraps.
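For small k the claim can also be checked by brute force. The following is a sketch for k = 8 (the function name is hypothetical) that counts the pairs whose wrapped sum lands in the range from A & B up to, but not including, A | B. If the argument above holds, it should find none.

```c
#include <stdint.h>

/* Counts 8-bit pairs (a, b) whose sum modulo 2^8 satisfies
   (a & b) <= sum < (a | b), i.e. falls inside the forbidden range. */
static unsigned count_forbidden_sums(void)
{
    unsigned violations = 0;
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            uint8_t r  = (uint8_t)(a + b);   /* sum modulo 2^8 */
            uint8_t lo = (uint8_t)(a & b);
            uint8_t hi = (uint8_t)(a | b);
            if (!(r < lo || r >= hi))
                violations++;
        }
    }
    return violations;
}
```

This is only a sanity check for one bit width, not a substitute for the proof.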

What about (A | B) = (A ^ B) + (A & B) though?

The more obvious version is (A | B) = (A ^ B) | (A & B), compensating for the bits reset by the XOR by ORing exactly those bits back in. Adding them back in also works, because the set bits in A ^ B and A & B are disjoint: a bit being set in the XOR means that exactly one of the input bits was set, which makes their AND zero.
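The decompositions used above can be spot-checked in code. This is a small sketch assuming 32-bit unsigned arithmetic; identities_hold is a made-up name.

```c
#include <stdint.h>

/* Returns 1 if the XOR/AND decompositions of OR and of addition hold
   for the given inputs (addition is taken modulo 2^32). */
static int identities_hold(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;   /* positions where exactly one input bit is set */
    uint32_t c = a & b;   /* positions where both input bits are set */
    return (x & c) == 0             /* the two parts are disjoint */
        && (a | b) == (x | c)       /* ORing the cleared bits back in */
        && (a | b) == x + c         /* adding works too: no carries occur */
        && a + b == x + 2u * c;     /* addition counts the shared bits twice */
}
```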

Monday 3 August 2020

Why does AND distribute over XOR

AND distributes over XOR, unsurprisingly both from the left and right, that is:

x & y ^ z & y == (x ^ z) & y
x & y ^ x & z == x & (y ^ z)
a & c ^ a & d ^ b & c ^ b & d == (a ^ b) & (c ^ d)

A somewhat popular explanation for why is:

"Conjunction and exclusive or form the multiplication and addition operations of a field GF(2), and as in any field they obey the distributive law."

Which is true and a useful way to think about it, but it is also the type of backwards explanation that relies on a concept that is more advanced than the thing which is being explained.

Diagrams with crossing lines

Let's represent an expression such as a & c ^ a & d ^ b & c ^ b & d by putting the variables on the left of every AND along the top of a grid, and the variables on the right of every AND along the side. Then for example the grid cell on the intersection between the column of a and the row of c corresponds to the term a & c. Further, let's draw a line for each variable that is True (in this example, all variables are True):

The overall expression a & c ^ a & d ^ b & c ^ b & d counts the number of crossings, modulo 2. Rather than counting the crossings one by one, the number of crossings could be computed by counting how many variables along the top are True, how many along the side are True, and taking the product, again modulo 2. A sum modulo 2 is XOR and a product modulo 2 is AND, so this gives the equivalent expression (a ^ b) & (c ^ d).

The simpler cases x & y ^ z & y and x & y ^ x & z correspond to 1x2 and 2x1 diagrams.
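The counting argument can be replayed in code for the 2x2 case. This is a sketch with a hypothetical function name, treating each variable as a single bit and trying all 16 assignments.

```c
/* Returns 1 if, for every assignment of a, b, c, d in {0, 1}, the parity of
   the number of crossings equals (a ^ b) & (c ^ d). */
static int crossings_match(void)
{
    for (unsigned bits = 0; bits < 16; bits++) {
        unsigned a = bits & 1,        b = (bits >> 1) & 1;
        unsigned c = (bits >> 2) & 1, d = (bits >> 3) & 1;

        unsigned crossings = a * c + a * d + b * c + b * d; /* counted one by one */
        unsigned product   = (a + b) * (c + d);             /* lines on top times lines on the side */
        unsigned xor_of_ands = (a & c) ^ (a & d) ^ (b & c) ^ (b & d);

        if (crossings != product) return 0;
        if ((crossings & 1) != xor_of_ands) return 0;
        if (xor_of_ands != ((a ^ b) & (c ^ d))) return 0;
    }
    return 1;
}
```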

Diagrams with bites taken out of them

Such a diagram with a section of it missing can be dealt with by completing the grid and subtracting the difference. For example the unwieldy a & e ^ a & f ^ a & g ^ a & h ^ b & e ^ b & f ^ b & g ^ b & h ^ c & e ^ c & f ^ d & e ^ d & f (shown in the diagram below) is "incomplete": it is missing the 2x2 square that corresponds to (c ^ d) & (g ^ h). Completing the grid and subtracting the difference (modulo 2, subtracting is also XOR) gives ((a ^ b ^ c ^ d) & (e ^ f ^ g ^ h)) ^ ((c ^ d) & (g ^ h)), which is correct.
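That this particular "bitten" expression matches its completed form can be confirmed exhaustively. The sketch below (hypothetical function name) tries all 256 assignments of the eight single-bit variables.

```c
/* Returns 1 if the 12-term expression equals the completed grid with the
   missing 2x2 square XORed back out, for every assignment of a..h in {0, 1}. */
static int bitten_grid_matches(void)
{
    for (unsigned v = 0; v < 256; v++) {
        unsigned a = v & 1,        b = (v >> 1) & 1;
        unsigned c = (v >> 2) & 1, d = (v >> 3) & 1;
        unsigned e = (v >> 4) & 1, f = (v >> 5) & 1;
        unsigned g = (v >> 6) & 1, h = (v >> 7) & 1;

        unsigned incomplete =
            (a & e) ^ (a & f) ^ (a & g) ^ (a & h) ^
            (b & e) ^ (b & f) ^ (b & g) ^ (b & h) ^
            (c & e) ^ (c & f) ^
            (d & e) ^ (d & f);
        unsigned completed =
            ((a ^ b ^ c ^ d) & (e ^ f ^ g ^ h)) ^ ((c ^ d) & (g ^ h));

        if (incomplete != completed) return 0;
    }
    return 1;
}
```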

This all has a clear connection to the FOIL method and its generalizations, after all conjunction and exclusive or form the multiplication and addition operations of a field GF(2).

The same diagrams also show why AND distributes over OR (the normal, inclusive, OR), which could alternatively be explained in terms of the Boolean semiring.

Sunday 3 May 2020

Information on incrementation

Defining increment

Just to avoid any confusion, the operation that this post is about is adding 1 (one) to a value: $$\text{increment}(x) = x + 1$$ Specifically, performing that operation in the domain of bit-vectors.

Incrementing is very closely related to negating. After all, -x = ~x + 1 and therefore x + 1 = -~x, though putting it that way feels oddly reversed to me.
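Both directions of that relation are easy to write down in C. This is a sketch assuming 32-bit unsigned values, with made-up function names.

```c
#include <stdint.h>

/* x + 1 written as the negation of the complement. */
static uint32_t increment_via_negation(uint32_t x)
{
    /* Same value as -~x; spelled as a subtraction from zero because some
       compilers warn about applying unary minus to an unsigned value. */
    return 0u - ~x;
}

/* -x written as complement-then-increment. */
static uint32_t negate_via_increment(uint32_t x)
{
    return ~x + 1u;
}
```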

Bit-string notation

In bit-string notation (useful for analysing compositions of operations at the bit level), increment can be represented as: $$a01^k + 1 = a10^k$$

An "English" interpretation of that form is that an increment carries through the trailing set bits, turning them to zero, and then carries into the right-most unset bit, setting it.

That "do something special with the right-most unset bit" aspect of increment is the basis for various right-most bit manipulations, some of which were implemented in AMD Trailing Bit Manipulation (TBM) (which has been discontinued).

For example, the right-most unset bit in x can be set using x | (x + 1), which has a nice symmetry with the more widely known trick for unsetting the right-most set bit, x & (x - 1).
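Side by side, the two tricks look like this. It is a sketch for 32-bit unsigned values; the function names are mine, not TBM mnemonics.

```c
#include <stdint.h>

/* Sets the right-most unset bit: a01^k becomes a11^k. */
static uint32_t set_rightmost_unset(uint32_t x)
{
    return x | (x + 1u);
}

/* Clears the right-most set bit: a10^k becomes a00^k. */
static uint32_t clear_rightmost_set(uint32_t x)
{
    return x & (x - 1u);
}
```

When there is no such bit (all ones for the first, zero for the second), the wrapped x + 1 or x - 1 makes each function return its input unchanged.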

Increment by XOR

As was the case with negation, there is a way to define increment in terms of XOR. The bits that flip during an increment are all the trailing set bits and the right-most unset bit, the TBM instruction for which is BLCMSK. While that probably does not seem very useful yet, the fact that x ^ (x + 1) takes the form of some number of leading zeroes followed by some number of trailing ones turns out to be useful.
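To make the shape of that mask concrete, here is a sketch for 32-bit unsigned values (function names are hypothetical). The second function tests the "zeroes followed by ones" form.

```c
#include <stdint.h>

/* The bits that flip when x is incremented: x ^ (x + 1). */
static uint32_t increment_flip_mask(uint32_t x)
{
    return x ^ (x + 1u);
}

/* Returns nonzero if m is some number of zeroes followed by some number of
   ones, i.e. m == 2^n - 1 for some n. */
static int is_low_ones_mask(uint32_t m)
{
    return (m & (m + 1u)) == 0;
}
```

For example, 0111 in binary gives the mask 1111 (three trailing ones plus the zero above them), while 0100 gives 0001.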

Suppose one wants to increment a bit-reversed integer. A possible (and commonly seen) approach is looping over the bits from top to bottom and implementing the "carry through the ones, into the first zero" logic by hand. However, if the non-reversed value is also available (let's call it i), the bit-reversed increment can be implemented by calculating the number of ones in the mask as tzcnt(i + 1) + 1 (or popcnt(i ^ (i + 1))) and forming a mask with that number of ones located at the desired place within an integer:

// i   = normal counter
// rev = bit-reversed counter
// N   = 1 << number_of_bits
int maskLen = tzcnt(i + 1) + 1;
rev ^= N - (N >> maskLen);

That may still not seem useful, but this enables an implementation of the bit-reversal permutation (not a bit-reversal itself, but the permutation that results from bit-reversing the indices). The bit-reversal permutation is sometimes used to re-order the result of a non-auto-sorting Fast Fourier Transform algorithm into the "natural" order. For example,

// X = array of data
// N = length of X, power of two
for (uint32_t i = 0, rev = 0; i < N; ++i)
{
    if (i < rev)
        swap(X[i], X[rev]);
    int maskLen = tzcnt(i + 1) + 1;
    rev ^= N - (N >> maskLen);
}

This makes no special effort to be cache-efficient.
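For experimenting without a tzcnt instruction, the loop can be made self-contained. This is a sketch in plain C, not a tuned implementation: tzcnt32 is a portable stand-in that I made up, reverse_bits_naive is the "by hand" reference, and the array is of int purely for illustration. I assume N is a power of two no larger than 2^30, so that the final shift stays below the width of the type.

```c
#include <stdint.h>

/* Portable stand-in for tzcnt; only ever called with nonzero x here. */
static unsigned tzcnt32(uint32_t x)
{
    unsigned n = 0;
    while ((x & 1u) == 0) { x >>= 1; n++; }
    return n;
}

/* Reference: reverse the low `bits` bits of i, one bit at a time. */
static uint32_t reverse_bits_naive(uint32_t i, unsigned bits)
{
    uint32_t r = 0;
    for (unsigned k = 0; k < bits; k++) {
        r = (r << 1) | (i & 1u);
        i >>= 1;
    }
    return r;
}

/* Returns 1 if the incrementally updated rev equals the naive bit-reversal
   of i for every i below 2^bits (bits <= 30 assumed). */
static int rev_counter_matches(unsigned bits)
{
    uint32_t n = (uint32_t)1 << bits;
    for (uint32_t i = 0, rev = 0; i < n; ++i) {
        if (rev != reverse_bits_naive(i, bits)) return 0;
        rev ^= n - (n >> (tzcnt32(i + 1) + 1));
    }
    return 1;
}

/* Applies the bit-reversal permutation to x in place.
   n must be a power of two, at most 2^30. */
static void bit_reversal_permute(int *x, uint32_t n)
{
    for (uint32_t i = 0, rev = 0; i < n; ++i) {
        if (i < rev) {              /* swap each pair only once */
            int t = x[i];
            x[i] = x[rev];
            x[rev] = t;
        }
        unsigned maskLen = tzcnt32(i + 1) + 1;
        rev ^= n - (n >> maskLen);
    }
}
```

With N = 8 and the input 0, 1, 2, 3, 4, 5, 6, 7, the permuted order is 0, 4, 2, 6, 1, 5, 3, 7.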