Description
Bit manipulation routines are common enough that we should expose a subset as platform primitives.
While some of them may be simple to write, it's much harder to achieve the required performance, especially since they tend to be used within tight loops.
The aim of this proposal is to scope & design a minimal set of functions, to be implemented with a bias towards performance. Per @tannergooding:
The point of these APIs is to provide a general-purpose API that works on all platforms (which means providing a software fallback) and is generally-usable. Hardware Intrinsics are for performance oriented scenarios where you require hardware acceleration and need more direct control of the code that is emitted.
Note that even though some of the formulae may be simple, relevant callsites are more self-documenting when using the intrinsics (should the dev choose to use them).
Scope
- Consolidate existing callsites in CoreCLR (Consolidate implementation of Rotate and PopCount coreclr#22584)
- Units pass in CoreFX (Units for BitOps.TrailingZeroCount corefx#35193)
- Expose System.Numerics.BitOperations as a public ref assembly in CoreFX (https://github.com/dotnet/corefx/issues/35419)
- Consolidate existing callsites in CoreFX ([NO MERGE] BitOps analysis CoreFX (WIP) corefx#34917)
- Implement proposed methods (this issue)
Rationale and Usage
The proposed functions are already implemented throughout the stack, often with different algorithms, performance characteristics and test coverage.
Existing callsites below: https://github.com/dotnet/corefx/issues/32269#issuecomment-457689128
(There is likely to be more; the initial search was timeboxed to ~1 hour)
Some of the implementations have suboptimal performance or bugs. Something like ExtractBit is trivial to implement, but PopCount is more complex and thus prone to logic and performance issues. Hiding these complex formulae behind friendly signatures makes using them more approachable.
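To illustrate why PopCount is harder to get right than ExtractBit, here is a minimal sketch of one well-known software fallback (the parallel, or SWAR, bit count). The class name PopCountSketch is hypothetical, and this is not necessarily the algorithm used by any existing callsite or by the eventual implementation.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical container; the proposal itself targets System.Numerics.BitOperations.
public static class PopCountSketch
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int PopCount(uint value)
    {
        unchecked
        {
            // Count set bits within each 2-bit field.
            value -= (value >> 1) & 0x5555_5555u;
            // Sum adjacent 2-bit counts into 4-bit fields.
            value = (value & 0x3333_3333u) + ((value >> 2) & 0x3333_3333u);
            // Sum adjacent 4-bit counts into bytes.
            value = (value + (value >> 4)) & 0x0F0F_0F0Fu;
            // The multiply accumulates the four byte counts into the top byte.
            return (int)((value * 0x0101_0101u) >> 24);
        }
    }
}
```

Each constant and shift has to line up exactly, which is the kind of detail that is easy to get subtly wrong when the routine is re-implemented per callsite.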
Here's an example of a function (BTC) whose signature is simple but the algebra is easy to get wrong.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool ComplementBit(ref uint value, int bitOffset)
{
    uint mask = 1u << bitOffset;
    bool btc = (value & mask) != 0; // Was the bit set before flipping?
    value = ~(~mask ^ value);       // Flip the bit
    return btc;
}
However, making a call to it meets our goal of abstraction and performance:
uint value = 123;
bool previouslyTrue = BitOperations.ComplementBit(ref value, 6);
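As a check on the algebra, the double-complement expression above reduces to a plain XOR, because ~mask ^ value equals ~(mask ^ value). The sketch below places both forms side by side; the class and method names are hypothetical and only exist for the comparison.

```csharp
// Hypothetical comparison harness, not part of the proposed API.
public static class ComplementBitSketch
{
    // The form used in the example above.
    public static bool ComplementBitOriginal(ref uint value, int bitOffset)
    {
        uint mask = 1u << bitOffset;
        bool wasSet = (value & mask) != 0;
        value = ~(~mask ^ value);
        return wasSet;
    }

    // The same operation written as a single XOR.
    public static bool ComplementBitXor(ref uint value, int bitOffset)
    {
        uint mask = 1u << bitOffset; // C# uses only the low 5 bits of the shift count for uint
        bool wasSet = (value & mask) != 0;
        value ^= mask;
        return wasSet;
    }
}
```

For the call shown above, 123 is 0b0111_1011, so bit 6 is set: both forms return true and leave value at 59.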
Proposed API
The proposed API is purposefully kept lean. We can add more methods in later design iterations. We should view this as an opportunity to get simple, base functionality out the door and not stray into the dangerous territory of adding every bit twiddling hack that exists.
Assume all methods are decorated with [MethodImpl(MethodImplOptions.AggressiveInlining)]
public static class BitOperations
{
// BT
bool ExtractBit(byte value, int bitOffset); // Could name this BitTest or TestBit
bool ExtractBit(uint value, int bitOffset);
bool ExtractBit(int value, int bitOffset);
// BTS (scalar)
byte InsertBit(byte value, int bitOffset); // BitSet or SetBit
uint InsertBit(uint value, int bitOffset);
int InsertBit(int value, int bitOffset);
// True BTS (sets the bit in place, returns whether it was originally set)
bool InsertBit(ref byte value, int bitOffset);
bool InsertBit(ref uint value, int bitOffset);
bool InsertBit(ref int value, int bitOffset);
// BTR
byte ClearBit(byte value, int bitOffset); // BitReset or ResetBit
uint ClearBit(uint value, int bitOffset);
int ClearBit(int value, int bitOffset);
bool ClearBit(ref byte value, int bitOffset);
bool ClearBit(ref uint value, int bitOffset);
bool ClearBit(ref int value, int bitOffset);
// BTC
byte ComplementBit(byte value, int bitOffset);
uint ComplementBit(uint value, int bitOffset);
int ComplementBit(int value, int bitOffset);
bool ComplementBit(ref byte value, int bitOffset);
bool ComplementBit(ref uint value, int bitOffset);
bool ComplementBit(ref int value, int bitOffset);
// on ? BTS : BTR
byte WriteBit(byte value, int bitOffset, bool on);
uint WriteBit(uint value, int bitOffset, bool on);
int WriteBit(int value, int bitOffset, bool on);
bool WriteBit(ref byte value, int bitOffset, bool on);
bool WriteBit(ref uint value, int bitOffset, bool on);
bool WriteBit(ref int value, int bitOffset, bool on);
}
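To make the two shapes of each operation concrete, here is a minimal sketch of the uint overloads for ExtractBit, InsertBit and ClearBit. The class name BitOperationsSketch is hypothetical and the bodies are assumptions about the obvious mask-based formulation, not the final implementation, which may use intrinsics instead.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical stand-in for the proposed BitOperations class (uint overloads only).
public static class BitOperationsSketch
{
    // BT: reads whether the bit is set.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool ExtractBit(uint value, int bitOffset)
        => (value & (1u << bitOffset)) != 0;

    // BTS (scalar): returns the new value with the bit set.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static uint InsertBit(uint value, int bitOffset)
        => value | (1u << bitOffset);

    // BTS (ref): sets the bit in place and returns whether it was originally set.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool InsertBit(ref uint value, int bitOffset)
    {
        uint mask = 1u << bitOffset;
        bool wasSet = (value & mask) != 0;
        value |= mask;
        return wasSet;
    }

    // BTR (scalar): returns the new value with the bit cleared.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static uint ClearBit(uint value, int bitOffset)
        => value & ~(1u << bitOffset);
}
```

The scalar overloads are pure functions, while the ref overloads mirror the hardware test-and-modify instructions by mutating the operand and reporting its prior state.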
Details
- The focus will be on performance.
- Ultimately, most of the code should be branchless and leverage intrinsics where possible.
- There is some overlap in functionality between InsertBit/ClearBit and WriteBit, where the latter conditionally executes the equivalent of either of the former (using twiddling to do so without branching). But there's enough twiddling in WriteBit to avoid any branching that it's maybe worth keeping both variants.
- Since these functions are used in performance-sensitive applications, we avoid input checking and exceptions. The current design tries to dispense with the requirement by sharpening input types, contractual assumptions (eg out-of-bounds offset will use mod n in some functions or result in a no-op in others) and specific design choices.
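The branchless WriteBit and the mod n contract can be sketched as follows. This is one possible formulation, not the proposal's implementation: the class name WriteBitSketch is hypothetical, and reinterpreting the bool as a byte via Unsafe.As assumes the bool holds exactly 0 or 1, which is true for values produced by ordinary C# code.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical sketch of a branch-free WriteBit for uint.
public static class WriteBitSketch
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static uint WriteBit(uint value, int bitOffset, bool on)
    {
        // Reinterpret the bool as 0 or 1 without a conditional (assumption: bool is 0/1).
        uint bit = Unsafe.As<bool, byte>(ref on);

        // For uint, C# uses only the low 5 bits of the shift count,
        // so an out-of-bounds offset behaves as bitOffset mod 32 with no check.
        uint mask = 1u << bitOffset;

        // Clear the target bit, then OR in the requested state.
        return (value & ~mask) | (bit << bitOffset);
    }
}
```

With this shape, WriteBit(value, 32, on) touches bit 0 rather than throwing, which is the kind of contractual assumption the last bullet describes.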
Questions
- What additional sizes & signs of integers do we support, and in which methods? It looks like int, uint and byte are commonly used. Anything else?
Decisions
- What namespace this should be in. Decision: namespace System.Numerics.
- The class name of BitOps is self-documenting, terse (it might be specified frequently in a callsite, if not aliased with using) and both terms are well-known names or abbreviations. Decision: BitOperations
- Decision: We favor well-known names over the alternatives. For example, PopCount could be called CountSetBits but that's not the common lingo used by twiddlers. Furthermore, intrinsics already expose the well-known names, eg Popcnt.PopCount.
- Likewise, method names should also be as concise as possible, for example, TrailingZeroCount may be more concisely described as TrailingZeros. Decision: TrailingZeroCount already chosen by a previous PR.
- Decision: Count-oriented methods such as PopCount return (idiomatic) int, not uint
- Should offset or position parameters be int or uint. Latter is preferred since negatives not permitted regardless. Decision: int is an idiomatic input/output type in C#.
- Log(0) is mathematically undefined. Should it return 0 or -1? Decision: Returns 0
- Do we care about endianness (ie do we need BE and LE variants of relevant methods). The current proposal is only LE, and is shaped in such a way that we are not future-proof wrt supporting BE. Discussion proposes an alternative. For example, LeadingZeros might need a different algorithm on BE/LE platforms. Decision: Endianness only matters if we are reinterpreting integers.
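The Log(0) decision can be illustrated with a deliberately simple software sketch. The class name Log2Sketch is hypothetical, and the loop is chosen for readability rather than speed; a real implementation would more likely use a leading-zero-count intrinsic or a table lookup.

```csharp
// Hypothetical sketch of the "Log2(0) returns 0" contract; not performance-tuned.
public static class Log2Sketch
{
    // Returns floor(log2(value)), and 0 when value is 0.
    public static int Log2(uint value)
    {
        int log = 0;
        // Each shift that leaves a non-zero remainder means one more power of two fits.
        while ((value >>= 1) != 0)
        {
            log++;
        }
        return log;
    }
}
```

One consequence of this decision is that Log2(0) and Log2(1) both return 0, so a caller that needs to tell them apart has to test for zero separately.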
Sample call sites
The following samples are taken from the linked units, from the method BitOps_Samples.
The code chooses values that are easy to eyeball for correctness. The real units cover many more boundaries & conditions.
// ExtractBit: Reads whether the specified bit in a mask is set.
Assert.True(BitOps.ExtractBit((byte)0b0001_0000, 4));
Assert.False(BitOps.ExtractBit((byte)0b0001_0000, 7));
// InsertBit: Sets the specified bit in a mask and returns the new value.
byte dest = 0b0000_1001;
Assert.Equal(0b0010_1001, BitOps.InsertBit(dest, 5));
// InsertBit(ref): Sets the specified bit in a mask and returns whether it was originally set.
Assert.False(BitOps.InsertBit(ref dest, 5));
Assert.Equal(0b0010_1001, dest);
// ClearBit: Clears the specified bit in a mask and returns the new value.
dest = 0b0000_1001;
Assert.Equal(0b0000_0001, BitOps.ClearBit(dest, 3));
// ClearBit(ref): Clears the specified bit in a mask and returns whether it was originally set.
Assert.True(BitOps.ClearBit(ref dest, 3));
Assert.Equal(0b0000_0001, dest);
// ComplementBit: Complements the specified bit in a mask and returns the new value.
dest = 0b0000_1001;
Assert.Equal(0b0000_0001, BitOps.ComplementBit(dest, 3));
// ComplementBit(ref): Complements the specified bit in a mask and returns whether it was originally set.
Assert.True(BitOps.ComplementBit(ref dest, 3));
Assert.Equal(0b0000_0001, dest);
// WriteBit: Writes the specified bit in a mask and returns the new value. Does not branch.
dest = 0b0000_1001;
Assert.Equal(0b0000_0001, BitOps.WriteBit(dest, 3, on: false));
// WriteBit(ref): Writes the specified bit in a mask and returns whether it was originally set. Does not branch.
Assert.True(BitOps.WriteBit(ref dest, 3, on: false));
Assert.Equal(0b0000_0001, dest);
Updates
- Original issue authored by @mburbea here: Proposal: Add a BitManipulation class
- Initial proposal submitted
- Methods such as InsertBit accepted a bool that determined whether it set or cleared the bit in question. Such methods have now been refactored in two, InsertBit and ClearBit.
- Some name changes based on suggestions (eg FlipBit became ComplementBit).
- TestAndSet methods have been refactored into simple scalar functions, for performance.
- The offset parameter is int instead of byte. All POC units pass.
- Added WriteBit(value, offset, bool on) overloads that conditionally set/clear the specified bit.
- Added ExtractByte and friends
- Added Evaluate(bool) (used internally, so may as well expose it)
- Removed all Span<T> overloads per @tannergooding advice. Will maybe submit in separate proposal
- Moved TrailingOnes and LeadingOnes into a to-do-later proposal in the comments
- Added Log2
- Removed redundant overloads
- Removed doc-comments so spec is easier to read
- Added byte overloads to all crud methods based on code analysis of existing callsites, eg bool ExtractBit(byte value, int bitOffset)
- Added Scope section at top of spec
- Separated calls into existing and proposed sections
- Added task list, worded spec in a more concise manner
- Moved RotateByte, etc into a to-do-later proposal in the comments