Proposal: Add a BitManipulation class #18876

@mburbea

I propose adding a new class to CoreFX that contains many common bit manipulation techniques.
Part of the value of a class like this is that the JIT can target its methods for intrinsics, similar to what was done for Vector<T>.
There is still some open discussion about what the shape of this API should be, and even about the class name.

Class Name

Several class names are being thrown around, and none has won everyone over.

  • Bits
  • BitOperations or BitOps
  • BitUtility
  • BitTwiddler
  • BitView (see the next section)

Two Classes?

@benaadams correctly notes that there seem to be two different APIs here.
One is a low-level view that lets you treat a numeric type like a bit vector and extract or inject a bit, byte, short, or int value.
These methods are the equivalent of the following, but safer (and possibly faster):

public static unsafe int ExtractInt(ulong u, int pos) => ((int*)&u)[pos];
public static unsafe ulong InjectInt(ulong u, int pos, int toSet) {
    ((int*)&u)[pos] = toSet;
    return u;
}

The other API is a set of utility methods that treat an integer register type as a unit of data and let you manipulate or inspect it.

Does it make sense to keep these APIs in one class?
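
To make the "safer" claim concrete, here is a minimal sketch of the same two view operations written with shifts and masks instead of pointers. The class name BitViewSketch and the range check are assumptions made for illustration; they are not part of the proposal or the PoC. The lane order matches what the pointer version produces on a little-endian machine.

```csharp
using System;

// Hypothetical class name; illustrates the "view" half of the API without unsafe code.
public static class BitViewSketch
{
    // Returns the 32-bit lane at index pos (0 = low half, 1 = high half).
    public static int ExtractInt(ulong value, int pos)
    {
        if ((uint)pos > 1) throw new ArgumentOutOfRangeException(nameof(pos));
        return unchecked((int)(value >> (pos * 32)));
    }

    // Returns a copy of value with the 32-bit lane at index pos replaced by toSet.
    public static ulong InjectInt(ulong value, int pos, int toSet)
    {
        if ((uint)pos > 1) throw new ArgumentOutOfRangeException(nameof(pos));
        int shift = pos * 32;
        ulong mask = 0xFFFFFFFFUL << shift;
        // Go through uint first so a negative toSet is not sign-extended into the other lane.
        return (value & ~mask) | ((ulong)unchecked((uint)toSet) << shift);
    }
}
```

Unlike the pointer version, an out-of-range pos throws here instead of reading or writing adjacent stack memory.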

Method Names

Another point of contention: what should we call these methods? For the "view" members, there is some dislike of the Get/Set naming convention, even though there is prior art such as BitVector32 and SqlDataRecord. Everyone seems to like Read/Write more, but while Read is fine, Write isn't necessarily the operation being done. I'm still looking for a verb that conveys that the method takes a value, modifies it a bit (no pun intended), and spits out a new one.

PopCount / HammingWeight / CountSetBits: we can't decide on this name. I personally like PopCount, as it is a well-known name for the algorithm. However, to someone who does not know CPU intrinsics or bit tricks, the name might mean nothing. The .NET API is split here: some common or well-known algorithms are simply called by name (e.g. Rijndael), while other names are descriptive for the benefit of new users. I think this class is fairly low-level in general, so even a novice can be expected to do a quick Google search in the subject area.

There is also the question of whether to name the methods as actions (e.g. CountTrailingZeroes) or as properties (e.g. TrailingZeroesCount).
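
For readers who have not met the operation behind the PopCount / HammingWeight / CountSetBits name, the following is a minimal software sketch that counts the set bits in a value. It uses the well-known parallel (SWAR) bit-summing technique; the class name PopCountSketch is hypothetical, and this is not necessarily how the PoC implements it.

```csharp
using System;

// Hypothetical class name; a software fallback for counting set bits.
public static class PopCountSketch
{
    public static int PopCount(ulong value)
    {
        unchecked
        {
            // Sum adjacent bits into 2-bit fields, then 4-bit fields, then bytes.
            value -= (value >> 1) & 0x5555555555555555UL;
            value = (value & 0x3333333333333333UL) + ((value >> 2) & 0x3333333333333333UL);
            value = (value + (value >> 4)) & 0x0F0F0F0F0F0F0F0FUL;
            // The multiply accumulates all byte sums into the top byte.
            return (int)((value * 0x0101010101010101UL) >> 56);
        }
    }

    public static int PopCount(uint value) => PopCount((ulong)value);
}
```

Whatever name wins, the semantics are the same: PopCount(0xF0) is 4 because four bits are set.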

Hardware Acceleration / Detection

@benaadams suggested adding a simple flag to determine whether the class has hardware acceleration. I personally suggest going a step further and adding an enum that describes which methods are accelerated (not in the PoC).
Unfortunately, the enum-based approach raises the question of whether the JIT could treat AcceleratedOperations as a runtime constant and eliminate the dead branch in a comparison like the following. (unknown)

if (Bits.AcceleratedOperations == BitOps.PopCount)
{
    // use popcount
}
else
{
    // work around it?
}

I admittedly question what you would do differently if an operation isn't accelerated. This isn't like Vector<T>, where you might switch to a different approach and use a ulong as a smaller vector. The methods should be pretty good solutions to these problems, and outside of switching to native code I don't see us doing better.
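
One possible shape for the enum-based detection, sketched under assumptions: the BitOps members, the BitsSketch class, and the IsAccelerated helper are all hypothetical, and the hard-coded value stands in for whatever the runtime would report. If more than one operation can be accelerated at once, the enum needs to be a flags enum, and the check needs a mask test, because a plain == comparison only matches when exactly one flag is set.

```csharp
using System;

// Hypothetical flags enum listing the operations that may be hardware accelerated.
[Flags]
public enum BitOps
{
    None = 0,
    PopCount = 1 << 0,
    CountLeadingZeroes = 1 << 1,
    CountTrailingZeroes = 1 << 2,
}

// Hypothetical stand-in for the proposed Bits class.
public static class BitsSketch
{
    // Assumption: a real implementation would have the runtime supply this value.
    // It is hard-coded here purely for illustration.
    public static readonly BitOps AcceleratedOperations =
        BitOps.PopCount | BitOps.CountTrailingZeroes;

    // True when every flag in op is reported as accelerated.
    public static bool IsAccelerated(BitOps op) => (AcceleratedOperations & op) == op;
}
```

Whether the JIT would fold a check like IsAccelerated(BitOps.PopCount) into a constant and drop the dead branch is the open question raised above; this sketch does not answer it.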
Methods that could be accelerated (on AMD64 at least):
PopCount => POPCNT
RotateRight => ROR (already accelerated!)
RotateLeft => ROL (already accelerated!)
CountLeadingZeroes => LZCNT (on ABM-compliant hardware) or BSR followed by XOR 63
CountTrailingZeroes => TZCNT / BSF
ReadBit => BT
WriteBit => BTS/BTC (maybe?)
ReadByte/ReadInt16/ReadInt32 => BEXTR (possibly)
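
For comparison with the instructions listed above, here is a rough sketch of what portable software fallbacks for the rotate and zero-count methods might look like. The class name BitFallbackSketch is hypothetical and the algorithms are common textbook approaches, not necessarily those used in the PoC.

```csharp
using System;

// Hypothetical class name; portable fallbacks for a few of the listed methods.
public static class BitFallbackSketch
{
    // C# masks a 32-bit shift count to its low 5 bits, so offset 0 and
    // offsets outside 0..31 still behave as a rotation.
    public static uint RotateLeft(uint value, int offset)
        => (value << offset) | (value >> (32 - offset));

    public static uint RotateRight(uint value, int offset)
        => (value >> offset) | (value << (32 - offset));

    // Binary search for the highest set bit; returns 64 for zero.
    public static int CountLeadingZeroes(ulong value)
    {
        if (value == 0) return 64;
        int count = 0;
        if ((value & 0xFFFFFFFF00000000UL) == 0) { count += 32; value <<= 32; }
        if ((value & 0xFFFF000000000000UL) == 0) { count += 16; value <<= 16; }
        if ((value & 0xFF00000000000000UL) == 0) { count += 8; value <<= 8; }
        if ((value & 0xF000000000000000UL) == 0) { count += 4; value <<= 4; }
        if ((value & 0xC000000000000000UL) == 0) { count += 2; value <<= 2; }
        if ((value & 0x8000000000000000UL) == 0) { count += 1; }
        return count;
    }

    // Isolates the lowest set bit, then derives its index; returns 64 for zero.
    public static int CountTrailingZeroes(ulong value)
    {
        if (value == 0) return 64;
        ulong lowest = unchecked(value & (~value + 1));
        return 63 - CountLeadingZeroes(lowest);
    }
}
```

Returning 64 for a zero input is a design choice made in this sketch; the proposal does not specify the behavior for zero.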

Updated spec:

public static class Bits{
       public static int PopCount(ulong value);
       public static int PopCount(long value);
       public static int PopCount(uint value);
       public static int PopCount(int value);

       public static ulong RotateRight(ulong value, int offset);
       public static uint  RotateRight(uint value, int offset);

       public static ulong RotateLeft(ulong value, int offset);
       public static uint  RotateLeft(uint value, int offset);

       public static int CountLeadingZeroes(ulong value);
       public static int CountLeadingZeroes(long value);
       public static int CountLeadingZeroes(uint value);
       public static int CountLeadingZeroes(int value);

       public static int CountTrailingZeroes(ulong value);
       public static int CountTrailingZeroes(long value);
       public static int CountTrailingZeroes(uint value);
       public static int CountTrailingZeroes(int value);

       public static bool ReadBit(ulong value, int offset);
       public static bool ReadBit(long value, int offset);
       public static bool ReadBit(uint value, int offset);
       public static bool ReadBit(int value, int offset);

       public static ulong WriteBit(ulong value, int offset, bool toSet);
       public static long WriteBit(long value, int offset, bool toSet);
       public static uint WriteBit(uint value, int offset, bool toSet);
       public static int WriteBit(int value, int offset, bool toSet);

       public static byte ReadByte(ulong value, int offset);
       public static byte ReadByte(long value, int offset);
       public static byte ReadByte(uint value, int offset);
       public static byte ReadByte(int value, int offset);

       public static short ReadInt16(ulong value, int offset);
       public static short ReadInt16(long value, int offset);
       public static short ReadInt16(uint value, int offset);
       public static short ReadInt16(int value, int offset);

       public static int ReadInt32(ulong value, int offset);
       public static int ReadInt32(long value, int offset);

       public static ulong WriteByte(ulong value, int offset, byte toSet);
       public static long WriteByte(long value, int offset, byte toSet);
       public static uint WriteByte(uint value, int offset, byte toSet);
       public static int WriteByte(int value, int offset, byte toSet);

       public static ulong WriteInt16(ulong value, int offset, short toSet);
       public static long WriteInt16(long value, int offset, short toSet);
       public static uint WriteInt16(uint value, int offset, short toSet);
       public static int WriteInt16(int value, int offset, short toSet);

       public static ulong WriteInt32(ulong value, int position, int toSet);
       public static long WriteInt32(long value, int position, int toSet);
}
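
As an illustration of how the Read/Write members in the spec might behave, here is a minimal sketch of the ulong overloads for bits and bytes. It assumes that offset is a bit index for ReadBit/WriteBit and a byte index (0 = least significant byte) for ReadByte/WriteByte; the spec does not state this explicitly. The class name BitsReadWriteSketch is hypothetical.

```csharp
using System;

// Hypothetical class name; sketches four of the proposed ulong overloads.
public static class BitsReadWriteSketch
{
    // Assumption: offset is a bit index, 0 = least significant bit.
    // C# masks 64-bit shift counts to 6 bits, so out-of-range offsets wrap around.
    public static bool ReadBit(ulong value, int offset)
        => ((value >> offset) & 1UL) != 0;

    // Returns a new value; the input is not modified in place.
    public static ulong WriteBit(ulong value, int offset, bool toSet)
    {
        ulong mask = 1UL << offset;
        return toSet ? value | mask : value & ~mask;
    }

    // Assumption: offset is a byte index, 0 = least significant byte.
    public static byte ReadByte(ulong value, int offset)
        => unchecked((byte)(value >> (offset * 8)));

    public static ulong WriteByte(ulong value, int offset, byte toSet)
    {
        int shift = offset * 8;
        ulong mask = 0xFFUL << shift;
        return (value & ~mask) | ((ulong)toSet << shift);
    }
}
```

For example, WriteByte(0x1122334455667788, 1, 0xAB) yields 0x112233445566AB88, leaving the original value untouched, which is why "Write" is a slightly misleading verb for these methods.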

Edit: POC class
https://gist.github.com/mburbea/c9a71ac1b1a25762c38c9fee7de0ddc2

More updates! Removed signed rotate operators.

Labels: api-needs-work (API needs work before it is approved, it is NOT ready for implementation), area-System.Numerics