Aggregated metrics #4

@KodyKantor

Currently, node-artedi serializes and exports metrics in the same form that the user specifies. Take this example:

// Increment this counter with three labels.
counter.increment({
    one: 'value',
    two: 'value',
    three: 'value'
});

Every time we call collector.collect(), the three labels ('one', 'two', 'three') will be serialized for this counter. This is great for simple applications. For more complex applications, and to reduce the possibility of accidentally creating one-off metrics, we've thought of another solution. The user could specify the labels that they would like collected at the time of collector creation (static labels). Any label passed to an operation that was not statically defined (a dynamic label) would be ignored. For example, the user could define the labels 'one' and 'two' as labels to keep. Label 'three' would be aggregated away. This helps to resolve a couple of problems: metric cardinality and composition.
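To make "aggregated away" concrete, here is a minimal sketch of the idea. It is not node-artedi code: `staticLabels`, `counts`, `seriesKey()` and the standalone `increment()` are hypothetical names used only for illustration, and the real syntax still has to be worked out.

```javascript
// Hypothetical sketch of label aggregation; not node-artedi's implementation.
var staticLabels = [ 'one', 'two' ];
var counts = {}; // series key -> counter value

// Build a series key from the static labels only, in a stable order, so
// that dynamic labels can never create a new series.
function seriesKey(labels) {
    return staticLabels.filter(function (name) {
        return Object.prototype.hasOwnProperty.call(labels, name);
    }).map(function (name) {
        return name + '="' + labels[name] + '"';
    }).join(',');
}

function increment(labels) {
    var key = seriesKey(labels);
    counts[key] = (counts[key] || 0) + 1;
}

// The two calls differ only in the dynamic label 'three'.
increment({ one: 'a', two: 'b', three: 'x' });
increment({ one: 'a', two: 'b', three: 'y' });

// Both calls land in the same series: one="a",two="b" with a value of 2.
console.log(counts);
```

Without the filter, the two calls would have produced two separate series that differ only in the value of 'three'.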

Metric cardinality is the hallmark issue in metric collection systems of our size. The more labels we use, the more data we maintain in memory, and serialize. The more data we serialize, the slower our queries are (and we store more data on disk at the server).
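As a rough illustration of why labels matter so much, the upper bound on the number of series for one metric is the product of the number of distinct values each label can take. The sketch below uses made-up counts (10 methods, 20 status codes, 10000 remote IPs), and `maxSeries()` is a hypothetical helper, not part of node-artedi.

```javascript
// Upper bound on the number of series for a single metric: the product of
// the number of distinct values per label. All counts are illustrative.
function maxSeries(distinctValuesPerLabel) {
    return Object.keys(distinctValuesPerLabel).reduce(
        function (product, name) {
            return product * distinctValuesPerLabel[name];
        }, 1);
}

var withRemoteIP = maxSeries({ method: 10, statusCode: 20, remoteIP: 10000 });
var withoutRemoteIP = maxSeries({ method: 10, statusCode: 20 });

console.log(withRemoteIP);    // 2000000
console.log(withoutRemoteIP); // 200
```

Under these assumed numbers, a single high-cardinality label accounts for a factor of 10000 in the worst case.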

Composition is a problem that arises when we pass instances of an object between libraries. There are multiple problems that come up with composition, but for this topic we are focusing on the metrics that libraries collect on behalf of the greater application. Let's say that we instrument the node-cueball library. The node-cueball library may create labels for all sorts of internal data that is only important to developers. For example, maybe it creates a label for 'remoteIP'. remoteIP may be a reasonable thing to collect in Muskie, but not in CloudAPI due to the number of possible values for remoteIP. By declaring static labels in CloudAPI, we can ensure that 'remoteIP' is not tracked, which will help us avoid the metric cardinality problem.

There are a couple of options for what we do if only a subset of the static labels is assigned values.

Option 1) Drop the labels. Static labels that are not given a value would simply be omitted from the serialized metric.

Option 2) Assign default values. The user could define default values up front, which will be filled in if they are not given specific values later.
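The difference between the two options could look like the sketch below. Both helpers (`dropUnassigned()` and `fillDefaults()`) and the default value 'unknown' are hypothetical and exist only to compare the resulting label sets.

```javascript
// Option 1 (hypothetical helper): static labels without a value are omitted.
function dropUnassigned(labels, staticLabels) {
    var out = {};
    staticLabels.forEach(function (name) {
        if (Object.prototype.hasOwnProperty.call(labels, name)) {
            out[name] = labels[name];
        }
    });
    return out;
}

// Option 2 (hypothetical helper): the keys of 'defaults' are the static
// labels, and each default is used when the caller supplies no value.
function fillDefaults(labels, defaults) {
    var out = {};
    Object.keys(defaults).forEach(function (name) {
        out[name] = Object.prototype.hasOwnProperty.call(labels, name) ?
            labels[name] : defaults[name];
    });
    return out;
}

var observed = { method: 'getobject', user: 'kkantor' };

var option1 = dropUnassigned(observed, [ 'method', 'statusCode' ]);
// option1 is { method: 'getobject' }

var option2 = fillDefaults(observed,
    { method: 'unknown', statusCode: 'unknown' });
// option2 is { method: 'getobject', statusCode: 'unknown' }
```

Option 2 keeps the label set identical across every series of a metric, at the cost of inventing a placeholder value. Option 1 serializes only what was actually observed.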

We also need to decide where we define static labels. I can think of three options for this.

  1. Assign static labels when the parent Collector object is created. This is the only option that allows us to reliably control which metrics are serialized by libraries that we are instrumenting. By defining which labels to track in the parent, all children are required to comply. The difficulty with this is that if we want to get a new set of labels out of a library, the wrapping library/application also requires a code change (to allow the new label to be serialized). This makes upgrading difficult and slow. On the other hand, we avoid serializing (and maintaining in memory) possibly useless labels.

  2. Assign static labels when the child collectors (gauge, counter, histogram) are created. This option gives us more flexibility, and avoids the 'upgrade' problem from option 1). In this option, control of labels is given to each library. That is what we're trying to avoid: the application still cannot restrict which labels a library serializes, so the composition problem remains.

  3. Assign static labels when the child collectors have their observation functions called (increment(), observe(), etc). This would be identical to what is currently done, and would be the same as what we call 'dynamic labels'. This doesn't solve the problem, so we will not take this option.

After considering this, I propose that we assign static labels when the parent Collector object is created, and drop static labels that are not assigned values. That is option 1) from both sets of options. The exact syntax for this has to be worked out.

After this is implemented, our result will look something like this (syntax will change):

var collector = mod_artedi.createCollector({
  staticLabels: [ 'method', 'statusCode' ]
});

var counter = collector.counter({
  name: 'some_counter',
  help: 'some help'
});

counter.increment({
  method: 'putobject',
  statusCode: 203,
  user: 'kkantor',
  remoteIP: '1.2.3.4'
});

counter.increment({
  method: 'getobject'
});

collector.collect(function (err, str) {
  console.log(str);
  // Prints:
  // # HELP some_counter some help
  // # TYPE some_counter counter
  // some_counter{method="putobject",statusCode="203"} 1
  // some_counter{method="getobject"} 1
});

As suggested by Dave Pacheco, we can introduce a DTrace probe to node-artedi to observe raw metric actions as they occur, before we aggregate labels away. This will soften the blow of trading granularity (dropping labels) for performance/scalability (metric cardinality).
