Description
HistoryTreeNode has an epochs field that records the epochs in which the node was updated.
As it stands, the field can grow without bound. This becomes a problem once the corresponding DB record hits the soft or hard limits imposed by the DB schema or architecture.
ESTIMATED LIMITS
Epochs are u64 = 8 bytes.
- For MySQL, with the current VARBINARY(2000) definition, 2000B / 8B = 250 values (we could always raise the maximum in the definition; this is just for illustration)
- For some alternate key-value db (KVDB) with say 100KB recommended soft limit for record size, ignoring other overhead, 100KB / 8B = ~ 12500 values
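The estimates above are just the record size limit divided by the size of one epoch. A minimal sketch of that arithmetic, assuming epochs are stored as raw 8-byte values with no encoding or per-record overhead (the function name is hypothetical, not part of the codebase):

```rust
/// Size of one stored epoch in bytes (epochs are `u64`).
pub const EPOCH_SIZE_BYTES: usize = std::mem::size_of::<u64>();

/// Upper bound on how many epochs fit in a record of `record_limit_bytes`,
/// ignoring any encoding or per-record overhead (an assumption; real
/// serialization will lower this number).
pub fn max_epochs_per_record(record_limit_bytes: usize) -> usize {
    record_limit_bytes / EPOCH_SIZE_BYTES
}
```

With these assumptions, `max_epochs_per_record(2000)` covers the VARBINARY(2000) case and `max_epochs_per_record(100_000)` covers the 100KB soft limit (taking 1KB as 1000 bytes).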
SOLUTION?
The associated epochs for a HistoryTreeNode can be obtained directly from the HistoryNodeState table given the node label, since each update corresponds to a HistoryNodeState.
- For MySQL, this can be done in one call with a join, or with a separate, concurrent call to the DB.
- For the alternate KVDB, one call is also sufficient. Instead of a "get" operation to fetch a single HistoryTreeNode record, we can use a "scan" operation to fetch both the HistoryTreeNode record AND the associated HistoryNodeState records in one go, provided we have a good sharding setup.
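To illustrate the scan idea, here is a rough sketch that uses an in-memory `BTreeMap` as a stand-in for an ordered KVDB. Everything in it is an assumption for illustration: the key layout, the `u64` label type, and the names `RecordKey`, `Store`, `NodeWithEpochs` and `scan_node` are hypothetical and do not reflect the actual storage layer. The point is only that if the node record and its state records share a key prefix (the label), one range scan returns the node plus every epoch, so the epochs list never needs to be stored on the node itself.

```rust
use std::collections::BTreeMap;

/// Hypothetical record discriminator. The derived ordering places `Node`
/// before every `State(epoch)`, and states sort by ascending epoch.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum RecordKey {
    Node,
    State(u64),
}

/// Stand-in for the real node label type.
pub type Label = u64;

/// Ordered store keyed by (label, record kind); values are opaque bytes.
pub type Store = BTreeMap<(Label, RecordKey), Vec<u8>>;

/// A node record together with the epochs recovered from its state records.
#[derive(Debug, PartialEq)]
pub struct NodeWithEpochs {
    pub node: Vec<u8>,
    pub epochs: Vec<u64>,
}

/// Fetch the node and all of its epochs with a single range scan over the
/// keys that share `label` as their prefix.
pub fn scan_node(store: &Store, label: Label) -> Option<NodeWithEpochs> {
    let start = (label, RecordKey::Node);
    let end = (label, RecordKey::State(u64::MAX));

    let mut node = None;
    let mut epochs = Vec::new();
    for ((_, kind), value) in store.range(start..=end) {
        match kind {
            RecordKey::Node => node = Some(value.clone()),
            // Each state record corresponds to one update, so its key
            // alone tells us the epoch.
            RecordKey::State(epoch) => epochs.push(*epoch),
        }
    }
    // No node record under this label means there is nothing to return.
    node.map(|node| NodeWithEpochs { node, epochs })
}
```

In a real KVDB the same effect would depend on the label being the partition or prefix key, which is where the sharding setup matters.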
Accumulating the data from separate tables will incur some computation time on the database server and we will need to profile and assess the impact. However, I think this is necessary to avoid the unbounded record size issue. Are there any other potential solutions I've missed?