Description
Background and motivation
The complexity of enumerating a HashSet<T> or a Dictionary<TKey, TValue> is determined not by its Count, but by its internal capacity. There are scenarios where both of these structures are enumerated frequently, and during their lifetime they might grow very large and then shrink to a small size. In these scenarios it makes sense to call TrimExcess at a point when the current Count has become too small compared to the current capacity, in order to speed up the enumeration of the collection.
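A rough way to observe this effect is sketched below. It is not a rigorous benchmark: it grows a dictionary, removes all but a handful of entries, and times one full enumeration before and after the existing parameterless TrimExcess(). The class and method names (EnumerationCostDemo, TimeEnumeration, Run) and the sizes are arbitrary choices for illustration, and the timings will vary by machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class EnumerationCostDemo
{
    // Hypothetical helper: enumerates the dictionary once and returns
    // the elapsed time together with the number of entries visited.
    public static (TimeSpan Elapsed, int Visited) TimeEnumeration(Dictionary<int, int> dict)
    {
        var stopwatch = Stopwatch.StartNew();
        int visited = 0;
        foreach (var pair in dict) visited++;
        stopwatch.Stop();
        return (stopwatch.Elapsed, visited);
    }

    public static void Run()
    {
        var dict = new Dictionary<int, int>();
        for (int i = 0; i < 1_000_000; i++) dict[i] = i;      // grow large
        for (int i = 10; i < 1_000_000; i++) dict.Remove(i);  // shrink to 10 entries

        var before = TimeEnumeration(dict);
        dict.TrimExcess(); // existing overload: trims based on the current Count
        var after = TimeEnumeration(dict);

        Console.WriteLine($"Before trim: {before.Visited} entries in {before.Elapsed}");
        Console.WriteLine($"After trim:  {after.Visited} entries in {after.Elapsed}");
    }
}
```

Both enumerations visit the same 10 entries; only the size of the backing storage being walked differs.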
If a Capacity property were exposed, it could be used like this, for example:
dict.Remove(key);
if (dict.Count < dict.Capacity >> 2) dict.TrimExcess(dict.Count << 1);
In the above example, the capacity of the dictionary is reduced to double its Count once the Count falls below a quarter of the capacity. So a dictionary with capacity 1000 will have its capacity reduced to about 500 when the Count drops below 250.
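The threshold arithmetic in the example above can be sketched as two small helpers that work on plain integers, so they run today without the proposed Capacity property. The names TrimPolicy, ShouldTrim, and TargetCapacity are hypothetical and exist only for this illustration.

```csharp
public static class TrimPolicy
{
    // True when count is below a quarter of capacity (capacity >> 2).
    public static bool ShouldTrim(int count, int capacity) => count < (capacity >> 2);

    // Capacity to request when trimming: twice the current count (count << 1).
    public static int TargetCapacity(int count) => count << 1;
}
```

With a capacity of 1000, ShouldTrim is false at a Count of 250 and true at 249, where TargetCapacity returns 498. The collection may round the requested value up when it picks its actual internal size.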
Currently the only ways to know the internal capacity of these collections are to keep track of their Count after every operation that affects their size, which is quite cumbersome, or to use reflection, which is neither efficient nor safe.
Also, it is currently not possible to trim a HashSet<T> to a specific capacity, as is possible with a Dictionary<TKey, TValue>. The HashSet<T>.TrimExcess method has no overload with a capacity parameter.
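Until such an overload exists, one possible workaround is to rebuild the set with a chosen initial capacity. The sketch below assumes a hypothetical helper named RebuildWithCapacity, which is not part of the BCL. It relies only on the existing HashSet<T>(int, IEqualityComparer<T>) constructor and UnionWith.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetCapacityWorkaround
{
    // Hypothetical helper: returns a NEW set with the same elements and
    // comparer as the source, created with the requested initial capacity.
    public static HashSet<T> RebuildWithCapacity<T>(HashSet<T> source, int capacity)
    {
        if (capacity < source.Count)
            throw new ArgumentOutOfRangeException(nameof(capacity));

        var rebuilt = new HashSet<T>(capacity, source.Comparer);
        rebuilt.UnionWith(source); // copy every element into the new set
        return rebuilt;
    }
}
```

Unlike an in-place TrimExcess(int capacity), this allocates a second set and requires every holder of the old reference to switch to the new one.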
API Proposal
namespace System.Collections.Generic
{
    public class HashSet<T>
    {
        /// <summary>
        /// Gets the total number of elements the internal data structure can hold
        /// without resizing.
        /// </summary>
        public int Capacity { get; }

        /// <summary>
        /// Sets the capacity of this set to hold up to a specified number of elements
        /// without any further expansion of its backing storage.
        /// </summary>
        public void TrimExcess(int capacity);
    }

    public class Dictionary<TKey, TValue>
    {
        /// <summary>
        /// Gets the total number of elements the internal data structure can hold
        /// without resizing.
        /// </summary>
        public int Capacity { get; }
    }
}
Alternative Designs
A TrimExcess overload with a loadFactor parameter would also do the job:
public void TrimExcess(double loadFactor);
...but its usage would not be completely obvious, and it could collide with the existing overload that has an int capacity parameter.
Related issue: #23744
Risks
None that I am aware of.