Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Dict implementation design #983

Open
@certik

Description

@certik

Here are the freedoms that we have in implementation dictionaries:

  • Hash function
  • load factor / initial size of an array
  • array resizing strategy (rehashing or not)
  • key collision strategy
  • memory allocation
  • ordered vs unordered
  • hash table vs red-black tree

Here is how we can do this:

  • We start with some defaults (just one choice for all of the above)
  • We implement more (eventually all) choices, and allow the user to select using a type annotation, such as x: Dict[hash3, loadfactor(5), resizing("x"), collision_strategy("list")].
  • We can then benchmark various combinations and choose some good solid defaults
  • We can later add a user defined hash function that gets compiled and inlined in LLVM, so optimal performance.
  • Ideally the hash function should be defined in the frontend language too

Note: #975 implements unordered dict. The ordered dict implementation can be implemented on top by adding some data structure to maintain the order of key insertion nodes (such as linked list), and another hash table of key -> node, so that you can do insertion, lookup and deletion all in O(1); it seems lookup is as fast as for unordered dict, and so is unordered iteration; but insertion and deletion is slower due to updating of the two extra datastructures; and there is a new feature "ordered iteration" (that did not exist in unordered dict, but it is slower than "unordered iteration"); more details at https://github.com/Aqcurate/OrderedDict#ordered-dictionary-implementation.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions