Proposal: Expose Bit Manipulation functions  #27382

@grant-d

Bit manipulation routines are common enough that we should expose a subset as platform primitives.
While some of them may be simple to write, it's much harder to achieve the requisite performance desired, especially since they tend to be used within tight loops.

The aim of this proposal is to scope & design a minimal set of functions, to be implemented with a bias towards performance. Per @tannergooding:

The point of these APIs is to provide a general-purpose API that works on all platforms (which means providing a software fallback) and is generally-usable. Hardware Intrinsics are for performance oriented scenarios where you require hardware acceleration and need more direct control of the code that is emitted.

Note that even though some of the formulas may be simple, relevant callsites are more self-documenting when using the intrinsics (should the dev choose to use them).

Scope

Rationale and Usage

The proposed functions are already implemented throughout the stack, often with different algorithms, performance characteristics and test coverage.
Existing callsites below: https://github.com/dotnet/corefx/issues/32269#issuecomment-457689128
(There are likely more; the initial search was timeboxed to ~1 hour)

Some of the implementations have suboptimal performance or bugs. Something like ExtractBit is trivial to implement, but PopCount is more complex and thus prone to logic and performance issues. Hiding these complex formulas behind friendly signatures makes them more approachable.

Here's an example of a function (BTC) whose signature is simple but the algebra is easy to get wrong.

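[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool ComplementBit(ref uint value, int bitOffset)
{
    // Single-bit mask; C# takes the shift count mod 32 for a uint operand.
    uint mask = 1u << bitOffset;
    // Remember whether the bit was set before flipping it.
    bool btc = (value & mask) != 0;

    // Flip the bit. Algebraically this equals value ^ mask.
    value = ~(~mask ^ value);

    return btc;
}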

However, calling it meets our goals of abstraction and performance:

uint value = 123;
bool previouslyTrue = BitOperations.ComplementBit(ref value, 6);
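
As a minimal sketch of why the algebra matters: the double negation in the body above reduces to a single XOR with the mask. The class name `BitOpsSketch` below is a hypothetical stand-in for illustration, not the proposed or shipped type.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical stand-in for the proposed class; for illustration only.
public static class BitOpsSketch
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool ComplementBit(ref uint value, int bitOffset)
    {
        // For a 32-bit left operand, C# uses only the low 5 bits of the
        // shift count, so an out-of-range offset wraps (mod 32).
        uint mask = 1u << bitOffset;
        bool wasSet = (value & mask) != 0;

        // ~(~mask ^ value) simplifies to mask ^ value: flip exactly one bit.
        value ^= mask;

        return wasSet;
    }
}
```

With `value = 123` (0b0111_1011) and `bitOffset = 6`, bit 6 is set, so the call returns true and leaves `value` at 59 (0b0011_1011).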

Proposed API

The proposed API is purposefully kept lean. We can add more methods in later design iterations. We should view this as an opportunity to get simple, base functionality out the door and not stray into the dangerous territory of adding every bit twiddling hack that exists.

Assume all methods are decorated with [MethodImpl(MethodImplOptions.AggressiveInlining)]

public static class BitOperations
{
    // BT
    bool ExtractBit(byte value, int bitOffset); // Could name this BitTest or TestBit
    bool ExtractBit(uint value, int bitOffset);
    bool ExtractBit(int value, int bitOffset);

    // BTS (scalar)
    byte InsertBit(byte value, int bitOffset); // BitSet or SetBit
    uint InsertBit(uint value, int bitOffset);
    int InsertBit(int value, int bitOffset);

    // True BTS (returns whether the bit was originally set)
    bool InsertBit(ref byte value, int bitOffset);
    bool InsertBit(ref uint value, int bitOffset);
    bool InsertBit(ref int value, int bitOffset);

    // BTR
    byte ClearBit(byte value, int bitOffset); // BitReset or ResetBit
    uint ClearBit(uint value, int bitOffset);
    int ClearBit(int value, int bitOffset);

    bool ClearBit(ref byte value, int bitOffset);
    bool ClearBit(ref uint value, int bitOffset);
    bool ClearBit(ref int value, int bitOffset);

    // BTC
    byte ComplementBit(byte value, int bitOffset);
    uint ComplementBit(uint value, int bitOffset);
    int ComplementBit(int value, int bitOffset);

    bool ComplementBit(ref byte value, int bitOffset);
    bool ComplementBit(ref uint value, int bitOffset);
    bool ComplementBit(ref int value, int bitOffset);

    // on ? BTS : BTR
    byte WriteBit(byte value, int bitOffset, bool on);
    uint WriteBit(uint value, int bitOffset, bool on);
    int WriteBit(int value, int bitOffset, bool on);

    bool WriteBit(ref byte value, int bitOffset, bool on);
    bool WriteBit(ref uint value, int bitOffset, bool on);
    bool WriteBit(ref int value, int bitOffset, bool on);
}
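
For orientation, the uint overloads could have software fallbacks along the following lines. This is a sketch under assumptions: the class name `ScalarBitOpsSketch` is hypothetical, only a few overloads are shown, and the proposal's actual implementations may differ (for example by using intrinsics).

```csharp
// Hypothetical software fallbacks for a few of the proposed uint overloads.
public static class ScalarBitOpsSketch
{
    // BT: is the bit at bitOffset set?
    public static bool ExtractBit(uint value, int bitOffset)
        => (value & (1u << bitOffset)) != 0;

    // BTS (scalar): return a copy of value with the bit set.
    public static uint InsertBit(uint value, int bitOffset)
        => value | (1u << bitOffset);

    // BTR (scalar): return a copy of value with the bit cleared.
    public static uint ClearBit(uint value, int bitOffset)
        => value & ~(1u << bitOffset);

    // BTC (scalar): return a copy of value with the bit flipped.
    public static uint ComplementBit(uint value, int bitOffset)
        => value ^ (1u << bitOffset);

    // True BTS: set the bit in place and report whether it was already set.
    public static bool InsertBit(ref uint value, int bitOffset)
    {
        uint mask = 1u << bitOffset;
        bool wasSet = (value & mask) != 0;
        value |= mask;
        return wasSet;
    }
}
```

The scalar and ref overloads can share a name because C# treats a `ref` parameter as a distinct signature.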

Details

  • The focus will be on performance.
  • Ultimately, most of the code should be branchless and leverage intrinsics where possible.
  • There is some overlap between InsertBit/ClearBit and WriteBit: the latter conditionally performs the equivalent of one of the former, using bit twiddling to avoid branching. That twiddling is involved enough that it is probably worth keeping both variants.
  • Since these functions are used in performance-sensitive applications, we avoid input checking and exceptions. The current design tries to dispense with that requirement through sharper input types, contractual assumptions (eg an out-of-bounds offset will use mod n in some functions or result in a no-op in others) and specific design choices.
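
The WriteBit idea above can be sketched as a mask-and-merge: clear the target bit, then OR in the flag shifted into position. The names here are hypothetical, and the bool-to-integer conversion is written as a ternary for portability; whether the JIT compiles it without a branch is an assumption, and a real implementation might reinterpret the bool instead.

```csharp
// Hypothetical sketch of WriteBit without an if/else on the 'on' flag.
public static class WriteBitSketch
{
    public static uint WriteBit(uint value, int bitOffset, bool on)
    {
        // 0 or 1; assumed (not guaranteed) to compile without a branch.
        uint flag = on ? 1u : 0u;

        // The shift count is taken mod 32 for a uint operand, so an
        // out-of-range offset wraps instead of throwing.
        uint mask = 1u << bitOffset;

        // Clear the target bit, then merge in the requested state.
        return (value & ~mask) | (flag << bitOffset);
    }

    public static bool WriteBit(ref uint value, int bitOffset, bool on)
    {
        // Capture the original state before overwriting it.
        bool wasSet = ((value >> bitOffset) & 1u) != 0;
        value = WriteBit(value, bitOffset, on);
        return wasSet;
    }
}
```

Because of the shift-count wrap, `WriteBit(0u, 35, true)` in this sketch behaves like offset 3 and yields 8, which is one way the "mod n" contract can fall out of the language semantics without any input checking.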

Questions

  • What additional sizes & signs of integers do we support, and in which methods? It looks like int, uint and byte are commonly used. Anything else?

Decisions

  • What namespace this should be in. Decision: namespace System.Numerics.
  • The class name of BitOps is self-documenting, terse (it might be specified frequently in a callsite, if not aliased with using) and both terms are well-known names or abbreviations. Decision: BitOperations
  • Decision: We favor well-known names over the alternatives. For example, PopCount could be called CountSetBits but that's not the common lingo used by twiddlers. Furthermore, intrinsics already expose the well-known names, eg Popcnt.PopCount.
  • Likewise, method names should also be as concise as possible, for example, TrailingZeroCount may be more concisely described as TrailingZeros. Decision: TrailingZeroCount already chosen by a previous PR.
  • Decision: Count-oriented methods such as PopCount return (idiomatic) int, not uint
  • Should offset or position parameters be int or uint? The latter is preferred, since negatives are not permitted regardless. Decision: int is an idiomatic input/output type in C#.
  • Log(0) is mathematically undefined. Should it return 0 or -1? Decision: Returns 0
  • Do we care about endianness (ie do we need BE and LE variants of relevant methods)? The current proposal is LE-only and is shaped in such a way that it is not future-proof with respect to supporting BE. Discussion proposes an alternative. For example, LeadingZeros might need a different algorithm on BE/LE platforms. Decision: Endianness only matters if we are reinterpreting integers.
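
To make two of the decisions above concrete (count-oriented methods return int, and Log(0) returns 0), here is a minimal software-fallback sketch. The class name `CountSketch` is hypothetical, and the loop-based Log2 is chosen for clarity rather than speed; it only illustrates the contract.

```csharp
// Hypothetical fallbacks illustrating the return-type and Log2(0) decisions.
public static class CountSketch
{
    // Classic SWAR population count; returns int, not uint.
    public static int PopCount(uint value)
    {
        unchecked
        {
            value -= (value >> 1) & 0x55555555u;                           // 2-bit sums
            value = (value & 0x33333333u) + ((value >> 2) & 0x33333333u);  // 4-bit sums
            value = (value + (value >> 4)) & 0x0F0F0F0Fu;                  // 8-bit sums
            return (int)((value * 0x01010101u) >> 24);                     // total in top byte
        }
    }

    // Floor of log base 2. By the decision above, an input of 0 returns 0.
    public static int Log2(uint value)
    {
        int result = 0;
        while ((value >>= 1) != 0)
        {
            result++;
        }
        return result;
    }
}
```

Note that under this contract `Log2(0)` and `Log2(1)` both return 0, so callers that need to distinguish them have to test for zero themselves.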

Sample call sites

The following samples are taken from the linked units, from the method BitOps_Samples.
The code chooses values that are easy to eyeball for correctness. The real units cover many more boundaries & conditions.

// ExtractBit: Reads whether the specified bit in a mask is set.
Assert.True(BitOps.ExtractBit((byte)0b0001_0000, 4));
Assert.False(BitOps.ExtractBit((byte)0b0001_0000, 7));

// InsertBit: Sets the specified bit in a mask and returns the new value.
byte dest = 0b0000_1001;
Assert.Equal(0b0010_1001, BitOps.InsertBit(dest, 5));

// InsertBit(ref): Sets the specified bit in a mask and returns whether it was originally set.
Assert.False(BitOps.InsertBit(ref dest, 5));
Assert.Equal(0b0010_1001, dest);

// ClearBit: Clears the specified bit in a mask and returns the new value.
dest = 0b0000_1001;
Assert.Equal(0b0000_0001, BitOps.ClearBit(dest, 3));
// ClearBit(ref): Clears the specified bit in a mask and returns whether it was originally set.
Assert.True(BitOps.ClearBit(ref dest, 3)); 
Assert.Equal(0b0000_0001, dest);

// ComplementBit: Complements the specified bit in a mask and returns the new value.
dest = 0b0000_1001;
Assert.Equal(0b0000_0001, BitOps.ComplementBit(dest, 3));
// ComplementBit(ref): Complements the specified bit in a mask and returns whether it was originally set.
Assert.True(BitOps.ComplementBit(ref dest, 3));
Assert.Equal(0b0000_0001, dest);

// WriteBit: Writes the specified bit in a mask and returns the new value. Does not branch.
dest = 0b0000_1001;
Assert.Equal(0b0000_0001, BitOps.WriteBit(dest, 3, on: false));
// WriteBit(ref): Writes the specified bit in a mask and returns whether it was originally set. Does not branch.
Assert.True(BitOps.WriteBit(ref dest, 3, on: false));
Assert.Equal(0b0000_0001, dest);
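
The byte overloads used in these samples need a little extra care in C#, because byte operands are promoted to int before shifting and masking. The following is a sketch under assumptions: `ByteBitOpsSketch` is a hypothetical name, and the handling of offsets of 8 or more (a no-op after truncation) is one possible reading of the contract, not something the POC is confirmed to do.

```csharp
// Hypothetical byte overloads; arithmetic happens in int, then truncates.
public static class ByteBitOpsSketch
{
    public static bool ExtractBit(byte value, int bitOffset)
        => (value & (1 << bitOffset)) != 0;

    public static byte InsertBit(byte value, int bitOffset)
        // unchecked: a mask above bit 7 is simply dropped by the cast.
        => unchecked((byte)(value | (1 << bitOffset)));

    public static byte ClearBit(byte value, int bitOffset)
        => unchecked((byte)(value & ~(1 << bitOffset)));

    // Clears the bit in place and reports whether it was originally set.
    public static bool ClearBit(ref byte value, int bitOffset)
    {
        bool wasSet = ExtractBit(value, bitOffset);
        value = ClearBit(value, bitOffset);
        return wasSet;
    }
}
```

Since the shift is performed on an int, the count wraps mod 32 rather than mod 8, so in this sketch `InsertBit((byte)1, 8)` returns 1 unchanged.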

Updates

  • Original issue authored by @mburbea here: Proposal: Add a BitManipulation class
  • Initial proposal submitted
  • Methods such as InsertBit accepted a bool that determined whether it set or cleared the bit in question. Such methods have now been refactored in two, InsertBit and ClearBit.
  • Some name changes based on suggestions (eg FlipBit became ComplementBit).
  • TestAndSet methods have been refactored into simple scalar functions, for performance.
  • The offset parameter is int instead of byte. All POC units pass.
  • Added WriteBit(value, offset, bool on) overloads that conditionally set/clear the specified bit.
  • Added ExtractByte and friends
  • Added Evaluate(bool) (used internally, so may as well expose it)
  • Removed all Span<T> overloads per @tannergooding advice. Will maybe submit in separate proposal
  • Moved TrailingOnes and LeadingOnes into a "to do later" proposal in the comments
  • Added Log2
  • Removed redundant overloads
  • Removed doc-comments so spec is easier to read
  • Added byte overloads to all crud methods based on code analysis of existing callsites, eg bool ExtractBit(byte value, int bitOffset)
  • Added Scope section at top of spec
  • Separated calls into existing and proposed sections
  • Added task list, worded spec in a more concise manner
  • Moved RotateByte, etc into a "to do later" proposal in the comments

Metadata

Labels: api-needs-work (API needs work before it is approved, it is NOT ready for implementation), area-System.Numerics