`grev` (generalized bit-reverse) is an operation that implements bit-permutations corresponding to XOR-ing the indices by some value. It has been proposed to be part of the Zbp extension of RISC-V, with this reference implementation (source: release v0.93)

uint32_t grev32(uint32_t rs1, uint32_t rs2) { uint32_t x = rs1; int shamt = rs2 & 31; if (shamt & 1) x = ((x & 0x55555555) << 1) | ((x & 0xAAAAAAAA) >> 1); if (shamt & 2) x = ((x & 0x33333333) << 2) | ((x & 0xCCCCCCCC) >> 2); if (shamt & 4) x = ((x & 0x0F0F0F0F) << 4) | ((x & 0xF0F0F0F0) >> 4); if (shamt & 8) x = ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8); if (shamt & 16) x = ((x & 0x0000FFFF) << 16) | ((x & 0xFFFF0000) >> 16); return x; }

`grev` looks in some ways similar to bit-shifts and rotates: the left and right operands have distinct roles with the right operand being a mask of `k` bits if the left operand has 2^{k} bits^{[1]}.

Carry-less multiplication normally has a left-shift in it, `grevmul` is what you get when that left-shift is replaced with `grev`.

uint32_t grevmul32(uint32_t x, uint32_t y) { uint32_t r = 0; for (int k = 0; k < 32; k++) { if (y & (1 << k)) r ^= grev32(x, k); } return x; }

`grevmul` is, at its core, very similar to `clmul`: take single-bit products (logical AND) of every bit of the left operand with every bit of the right operand, then do some XOR-reduction. The difference is in which partial products are grouped together. For `clmul`, the partial products that contribute to bit `k` of the result are pairs with indices `i,j` such that `i + j = k`. For `grevmul`, it's the pairs with indices such that `i ^ j = k`. This goes back to `grev` permuting the bits by XOR-ing their indices by some value, and that value is `k` here.

Now that `grevmul` has been defined, let's look at some of its properties, comparing it to `clmul` and plain old `imul`.

grevmul | clmul | imul | |
---|---|---|---|

zero^{[2]} |
0 | 0 | 0 |

identity | 1 | 1 | 1 |

commutative | yes | yes | yes |

associative | yes | yes | yes |

distributes over | xor | xor | addition |

op(x, 1 << k) is | grev(x, k) | x << k | x << k |

x has inverse if |
popcnt(x) & 1 | x & 1 | x & 1 |

op(x, x) is | popcnt(x) & 1 | pdep(x, 0x55555555) |
x² |

## What is the "grevmul inverse" of `x`?

Time for some algebra. Looking just at the table above, and forgetting the actual definition of `grevmul`, can we say something about the solutions of `grevmul(x, y) == 1`? Surprisingly, yes.

Assuming we have some `x` with odd hamming weight (numbers with even hamming weight do not have inverses, so let's ignore them for now), we know that `grevmul(x, x) == 1`. The inverse in a monoid is unique so `x` is not just some inverse of `x`, it is *the* (unique) inverse of `x`.

Since the "addition operator" is XOR (for which negation is the identity function), this is a non-trivial example of a ring in which `x = -x = x ^{-1}`, when

`x`exists. Strange, isn't it?

^{-1}We also have that `f(x) = grevmul(x, c)` (for appropriate choices of `c`) is a (non-trivial) involution, so it may be a contenter for the "middle operation" of an involutary bit finalizer, but probably useless without an efficient implementation.

~~I was going to write about implementing ~~ E: see Implementing grevmul with GF2P8AFFINEQB where I went ahead and implemented the whole thing, not only the "multiply by 8-bit constant" case.`grevmul` by an 8-bit constant with two `GF2P8AFFINEQB`s but I've had enough for now, maybe later.

[1] The right operand of a shift is often called the shift *count*, but it can also be interpreted as a mask indicating some subset of shift-by-2^{i} operations to perform. That interpretation is useful for example when implementing a shift-by-variable operation on a machine that only has a shift-by-constant instruction, following the same pattern as the reference implementation of `grev32`.

[2] This looks like a joke, but I mean that the numeric value 0 acts as the zero element of the corresponding semigroup.