Proposal: Expose Bit Manipulation functions 

Bit manipulation routines are common enough that we should expose a subset as platform primitives. 
While some of them are simple to write, achieving the required performance is much harder, especially since they tend to be used within tight loops.

The aim of this proposal is to scope and design a minimal set of functions, implemented with a bias towards performance. Per @tannergooding:
> The point of these APIs is to provide a general-purpose API that works on all platforms (which means providing a software fallback) and is generally-usable. Hardware Intrinsics are for performance oriented scenarios where you require hardware acceleration and need more direct control of the code that is emitted.

Note that even though some of the formulas may be simple, the relevant callsites are more self-documenting when they use the intrinsics (should the dev _choose_ to use them).

### Scope
- [x] Consolidate existing callsites in `CoreCLR` (https://github.com/dotnet/coreclr/pull/22584)
- [x] Unit tests pass in `CoreFX` (https://github.com/dotnet/corefx/pull/35193)
- [x] Expose `System.Numerics.BitOperations` as a `public` ref assembly in `CoreFX` (https://github.com/dotnet/corefx/issues/35419)
- [ ] Consolidate existing callsites in `CoreFX` (https://github.com/dotnet/corefx/pull/34917)
- [ ] Implement proposed methods (this issue)

### Rationale and Usage
The proposed functions are already implemented throughout the stack, often with different algorithms, performance characteristics and test coverage.
Existing callsites are listed here: https://github.com/dotnet/corefx/issues/32269#issuecomment-457689128
(There are likely more; the initial search was timeboxed to ~1 hour.)

Some of the implementations have [suboptimal performance](https://github.com/dotnet/coreclr/blob/499e97e17ab07938f229f286427537a8e464c0a4/src/System.Private.CoreLib/shared/System/Buffers/Text/FormattingHelpers.CountDigits.cs#L13-L66) or [bugs](https://github.com/dotnet/coreclr/issues/22326). Something like `ExtractBit` is trivial to implement, but `PopCount` is more complex and thus prone to logic and performance issues. Hiding these complex formulas behind friendly signatures makes them more approachable.
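
To illustrate why `PopCount` is easy to get wrong, here is a minimal sketch of a well-known branchless software fallback (the parallel bit-summing approach). The class name `PopCountSketch` is hypothetical, and this is not claimed to be the implementation used in CoreCLR or CoreFX. A real version would presumably prefer a hardware intrinsic where one is available.

```csharp
public static class PopCountSketch
{
    // Hypothetical software fallback: counts the set bits in a 32-bit value.
    public static int PopCount(uint value)
    {
        unchecked
        {
            // Sum each pair of adjacent bits into a 2-bit field.
            value -= (value >> 1) & 0x5555_5555u;
            // Sum adjacent 2-bit fields into 4-bit fields.
            value = (value & 0x3333_3333u) + ((value >> 2) & 0x3333_3333u);
            // Sum adjacent 4-bit fields into one count per byte.
            value = (value + (value >> 4)) & 0x0F0F_0F0Fu;
            // The multiply accumulates the four byte counts into the top byte.
            return (int)((value * 0x0101_0101u) >> 24);
        }
    }
}
```

Each magic constant has to match the width of the field being summed, which is exactly the kind of detail that is hard to verify by eye at a callsite.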

Here's an example of a function (`BTC`) whose signature is simple but whose algebra is easy to get wrong.
```csharp
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool ComplementBit(ref uint value, int bitOffset)
{
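    uint mask = 1u << bitOffset;
    bool btc = (value & mask) != 0; // Whether the bit was set before the flip

    // ~(~mask ^ value) is equivalent to (value ^ mask): it toggles only the target bit
    value = ~(~mask ^ value);

    return btc;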
}
```
However, calling it meets our goals of abstraction and performance:
```csharp
uint value = 123;
bool previouslyTrue = BitOperations.ComplementBit(ref value, 6);
```

### Proposed API
The proposed API is purposefully kept lean. We can add more methods in later design iterations. We should view this as an opportunity to get simple, base functionality out the door and not stray into the dangerous territory of adding every bit twiddling hack that exists.

Assume all methods are decorated with `[MethodImpl(MethodImplOptions.AggressiveInlining)]`

```csharp
public static class BitOperations
{
    // BT
    bool ExtractBit(byte value, int bitOffset); // Could name this BitTest or TestBit
    bool ExtractBit(uint value, int bitOffset);
    bool ExtractBit(int value, int bitOffset);

    // BTS (scalar)
    byte InsertBit(byte value, int bitOffset); // BitSet or SetBit
    uint InsertBit(uint value, int bitOffset);
    int InsertBit(int value, int bitOffset);

    // True BTS (returns whether the bit was originally set)
    bool InsertBit(ref byte value, int bitOffset);
    bool InsertBit(ref uint value, int bitOffset);
    bool InsertBit(ref int value, int bitOffset);

    // BTR
    byte ClearBit(byte value, int bitOffset); // BitReset or ResetBit
    uint ClearBit(uint value, int bitOffset);
    int ClearBit(int value, int bitOffset);

    bool ClearBit(ref byte value, int bitOffset);
    bool ClearBit(ref uint value, int bitOffset);
    bool ClearBit(ref int value, int bitOffset);

    // BTC
    byte ComplementBit(byte value, int bitOffset);
    uint ComplementBit(uint value, int bitOffset);
    int ComplementBit(int value, int bitOffset);

    bool ComplementBit(ref byte value, int bitOffset);
    bool ComplementBit(ref uint value, int bitOffset);
    bool ComplementBit(ref int value, int bitOffset);

    // on ? BTS : BTR
    byte WriteBit(byte value, int bitOffset, bool on);
    uint WriteBit(uint value, int bitOffset, bool on);
    int WriteBit(int value, int bitOffset, bool on);

    bool WriteBit(ref byte value, int bitOffset, bool on);
    bool WriteBit(ref uint value, int bitOffset, bool on);
    bool WriteBit(ref int value, int bitOffset, bool on);
}
```
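
As a rough illustration of the semantics behind these signatures, the following sketch shows possible software fallbacks for the `uint` overloads. The class name `BitOperationsSketch` is hypothetical and the bodies are assumptions based on the `BT`/`BTS`/`BTR`/`BTC` comments above, not the final implementation.

```csharp
using System.Runtime.CompilerServices;

public static class BitOperationsSketch
{
    // BT: reads whether the specified bit is set.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool ExtractBit(uint value, int bitOffset)
        => (value & (1u << bitOffset)) != 0;

    // BTS (scalar): returns the value with the specified bit set.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static uint InsertBit(uint value, int bitOffset)
        => value | (1u << bitOffset);

    // BTR (scalar): returns the value with the specified bit cleared.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static uint ClearBit(uint value, int bitOffset)
        => value & ~(1u << bitOffset);

    // BTC (scalar): returns the value with the specified bit toggled.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static uint ComplementBit(uint value, int bitOffset)
        => value ^ (1u << bitOffset);

    // True BTS: sets the bit in place and reports whether it was set before.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool InsertBit(ref uint value, int bitOffset)
    {
        uint mask = 1u << bitOffset;
        bool wasSet = (value & mask) != 0;
        value |= mask;
        return wasSet;
    }
}
```

The `ref` overloads of `ClearBit` and `ComplementBit` would follow the same pattern as the `ref` `InsertBit` shown here, swapping only the update expression.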

### Details
* The focus will be on **performance**. 
* Ultimately, most of the code should be **branchless** and leverage **intrinsics** where possible.
* `InsertBit`/`ClearBit` and `WriteBit` overlap somewhat in functionality: the latter conditionally performs the equivalent of one of the former, using bit twiddling to do so without branching. That twiddling is involved enough that it's probably worth keeping both variants.
* Since these functions are used in performance-sensitive applications, we avoid input checking and exceptions. The current design tries to dispense with the requirement through sharper input types, contractual assumptions (e.g. an out-of-bounds `offset` wraps `mod n` in some functions or results in a no-op in others) and specific design choices.
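
A branchless `WriteBit` along the lines described above could be sketched as follows. The class name `WriteBitSketch` is hypothetical, and the `bool`-to-integer conversion via `Unsafe.As` assumes that a `bool` is stored as 0 or 1. This is one possible approach, not necessarily the one the final implementation uses.

```csharp
using System.Runtime.CompilerServices;

public static class WriteBitSketch
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static uint WriteBit(uint value, int bitOffset, bool on)
    {
        // For uint, C# uses only the low 5 bits of the shift count,
        // so an out-of-range offset wraps mod 32 without any input check.
        uint mask = 1u << bitOffset;

        // Assumption: a bool is stored as 0 or 1, so reinterpreting it avoids a branch.
        uint bit = Unsafe.As<bool, byte>(ref on);

        // 0 - 1 wraps to all ones; 0 - 0 stays zero.
        uint select = unchecked(0u - bit);

        // Clear the target bit, then set it again only when 'on' is true.
        return (value & ~mask) | (select & mask);
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool WriteBit(ref uint value, int bitOffset, bool on)
    {
        // Report whether the bit was set before the write.
        bool wasSet = (value & (1u << bitOffset)) != 0;
        value = WriteBit(value, bitOffset, on);
        return wasSet;
    }
}
```

With this shape, an offset of 33 on a `uint` behaves like an offset of 1, which is the `mod n` contract mentioned above rather than an exception.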

### Questions
* What additional sizes & signs of integers do we support, and in which methods? It looks like `int`, `uint` and `byte` are commonly used. Anything else?

### Decisions
* What `namespace` this should be in. **Decision**: `namespace System.Numerics`.
* The class name of `BitOps` is self-documenting, terse (it might be specified frequently in a callsite, if not aliased with `using`) and both terms are well-known names or abbreviations. **Decision**: `BitOperations`
* **Decision**: We favor well-known names over the alternatives. For example, `PopCount` could be called `CountSetBits` but that's not the common lingo used by twiddlers. Furthermore, intrinsics already expose the well-known names, eg `Popcnt.PopCount`.
* Likewise, method names should also be as concise as possible, for example, `TrailingZeroCount` may be more concisely described as `TrailingZeros`. **Decision**: `TrailingZeroCount` already chosen by a previous [PR](https://github.com/dotnet/coreclr/pull/22118).
* **Decision**: Count-oriented methods such as `PopCount` return (idiomatic) `int`, not `uint`
* Should `offset` or `position` parameters be `int` or `uint`? The latter is preferred since negatives are not permitted regardless. **Decision**: `int` is an idiomatic input/output type in C#.
* `Log(0)` is mathematically undefined. Should it return `0` or `-1`? **Decision**: Returns 0
* Do we care about endianness (i.e. do we need BE and LE variants of relevant methods)? The current proposal is only LE, and is shaped in such a way that we are not future-proof with respect to supporting BE. Discussion proposes an alternative. For example, `LeadingZeros` _might_ need a different algorithm on BE/LE platforms. **Decision**: Endianness only matters if we are reinterpreting integers.
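
To make the `Log(0)` decision above concrete, here is a deliberately simple sketch of an integer base-2 logarithm that returns `0` for an input of `0`. The class name `Log2Sketch` is hypothetical, and the loop is chosen for readability only. A performance-oriented implementation would more likely use a lookup table or a leading-zero-count intrinsic.

```csharp
public static class Log2Sketch
{
    // Returns floor(log2(value)), with the undefined case Log2(0) defined as 0.
    public static int Log2(uint value)
    {
        int log = 0;

        // Each shift that leaves a non-zero value contributes one to the result.
        // For value == 0 the loop body never runs, so the result is 0.
        while ((value >>= 1) != 0)
        {
            log++;
        }

        return log;
    }
}
```

Note that with this contract `Log2(0)` and `Log2(1)` both return `0`, so callers that need to distinguish the two must check for zero themselves.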

### Sample call sites
The following samples are taken from the linked unit tests, from the method `BitOps_Samples`.
The code chooses values that are easy to eyeball for correctness. The real unit tests cover many more boundaries and conditions.
```csharp
// ExtractBit: Reads whether the specified bit in a mask is set.
Assert.True(BitOps.ExtractBit((byte)0b0001_0000, 4));
Assert.False(BitOps.ExtractBit((byte)0b0001_0000, 7));

// InsertBit: Sets the specified bit in a mask and returns the new value.
byte dest = 0b0000_1001;
Assert.Equal(0b0010_1001, BitOps.InsertBit(dest, 5));

// InsertBit(ref): Sets the specified bit in a mask and returns whether it was originally set.
Assert.False(BitOps.InsertBit(ref dest, 5));
Assert.Equal(0b0010_1001, dest);

// ClearBit: Clears the specified bit in a mask and returns the new value.
dest = 0b0000_1001;
Assert.Equal(0b0000_0001, BitOps.ClearBit(dest, 3));
// ClearBit(ref): Clears the specified bit in a mask and returns whether it was originally set.
Assert.True(BitOps.ClearBit(ref dest, 3)); 
Assert.Equal(0b0000_0001, dest);

// ComplementBit: Complements the specified bit in a mask and returns the new value.
dest = 0b0000_1001;
Assert.Equal(0b0000_0001, BitOps.ComplementBit(dest, 3));
// ComplementBit(ref): Complements the specified bit in a mask and returns whether it was originally set.
Assert.True(BitOps.ComplementBit(ref dest, 3));
Assert.Equal(0b0000_0001, dest);

// WriteBit: Writes the specified bit in a mask and returns the new value. Does not branch.
dest = 0b0000_1001;
Assert.Equal(0b0000_0001, BitOps.WriteBit(dest, 3, on: false));
// WriteBit(ref): Writes the specified bit in a mask and returns whether it was originally set. Does not branch.
Assert.True(BitOps.WriteBit(ref dest, 3, on: false));
Assert.Equal(0b0000_0001, dest);
```

### Updates
* Original issue authored by @mburbea here: [Proposal: Add a BitManipulation class](https://github.com/dotnet/corefx/issues/12425)
* Initial proposal submitted
* Methods such as `InsertBit` accepted a bool that determined whether it set or cleared the bit in question. Such methods have now been refactored in two, `InsertBit` and `ClearBit`.
* Some name changes based on suggestions (eg `FlipBit` became `ComplementBit`).
* TestAndSet methods have been refactored into simple scalar functions, for performance.
* The offset parameter is `int` instead of `byte`. All POC units pass.
* Added `WriteBit(value, offset, bool on)` overloads that conditionally set/clear the specified bit.
* Added `ExtractByte` and friends
* Added `Evaluate(bool)` (used internally, so may as well expose it)
* Removed all `Span<T>` overloads per @tannergooding advice. Will maybe submit in separate proposal
* Moved `TrailingOnes` and `LeadingOnes` into [to do later](https://github.com/dotnet/corefx/issues/32269#issuecomment-425534929) proposal in comments
* Added `Log2`
* Removed redundant overloads
* Removed doc-comments so spec is easier to read
* Added `byte` overloads to all CRUD methods based on code analysis of existing callsites, e.g. `bool ExtractBit(byte value, int bitOffset)`
* Added **Scope** section at top of spec
* Separated calls into **existing** and **proposed** sections
* Added task list, worded spec in a more concise manner
* Moved `RotateByte`, etc into [to do later](https://github.com/dotnet/corefx/issues/32269#issuecomment-425534929) proposal in comments
