Tuesday, 4 February 2014

Addition in the bitfield domain, alternative algorithm

This is obviously a continuation of this post, I could have published this one first but it didn't seem that interesting to me at the time. But it's simpler to understand, it's (as far as I know) "new" in the sense that it hasn't been written down anywhere public, and the thing I really wanted to post about didn't really work out as well as I had hoped.. so here it is.

The idea behind this algorithm is to take the software version of a Kogge-Stone adder, and replace all the variables by z/o pairs and all the operators by their "lifted" equivalents. So just take this:

uint p = x ^ y;
uint g = x & y;

g |= p & (g << 1);
p &= p << 1;

g |= p & (g << 2);
p &= p << 2;

g |= p & (g << 4);
p &= p << 4;

g |= p & (g << 8);
p &= p << 8;

g |= p & (g << 16);

uint result = x ^ y ^ (g << 1);
Do the replacement:
// p = x ^ y
uint pz = xz & yz | xo & yo;
uint po = xz & yo | xo & yz;
// g = x & y
uint gz = xz | yz;
uint go = xo & yo;
// g |= p & (g << 1)
gz = (gz << 1 | 1 | pz) & gz;
go = (go << 1) & po | go;
// p &= p << 1
pz = pz << 1 | 1 | pz;
po = po << 1 & po;
// g |= p & (g << 2)
gz = (~(~gz << 2) | pz) & gz;
go = (go << 2) & po | go;
// p &= p << 2
pz = ~(~pz << 2) | pz;
po = po << 2 & po;
// g |= p & (g << 4)
gz = (~(~gz << 4) | pz) & gz;
go = (go << 4) & po | go;
// p &= p << 4
pz = ~(~pz << 4) | pz;
po = po << 4 & po;
// g |= p & (g << 8)
gz = (~(~gz << 8) | pz) & gz;
go = (go << 8) & po | go;
// p &= p << 8
pz = ~(~pz << 8) | pz;
po = po << 8 & po;
// g |= p & (g << 16)
gz = (~(~gz << 16) | pz) & gz;
go = (go << 16) & po | go;
// g <<= 1
gz = ~(~gz << 1);
go = go << 1;
// z = x ^ y
uint zz = xz & yz | xo & yo;
uint zo = xz & yo | xo & yz;
// result = z ^ g
uint resz = zz & gz | zo & go;
uint reso = zz & go | zo & gz;

It got longer, but not actually more complicated - it's still the same algorithm, and you could even write it exactly the same way if you used operator overloading.

This algorithm doesn't need a special case for "empty" inputs, because the two lifted XORs at the end preserve emptiness (though the lifted left shifts in the calculation of g do not).