Jochen Ott edited this page Feb 20, 2015 · 3 revisions

A new feature of UHH2 is support for metadata: simple key-value string data that applies to a whole sample rather than to a single event. Typical use cases include:

  • store processing information -- e.g. whether jet resolution smearing was already applied -- to prevent certain bugs
  • store sample-wide information -- such as the sample cross section -- at the first analysis step and use it at subsequent processing steps

Metadata is stored in the output ROOT file in an additional output tree named uhh2_meta. This tree has a single branch of type string that holds all metadata key/value pairs in the format

key1===value1
key2===value2

etc.
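To make the format concrete, here is a minimal sketch of how such a string could be split back into key/value pairs. The helper parse_metadata is hypothetical and not part of UHH2, and the sketch assumes the pairs are separated by newlines inside one string; how the framework actually reads the branch may differ.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of UHH2: parse "key===value" lines into a map.
// Assumes one pair per line, separated by '\n'.
std::map<std::string, std::string> parse_metadata(const std::string & text) {
    const std::string sep = "===";
    std::map<std::string, std::string> result;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // tolerate blank lines
        const auto pos = line.find(sep);
        if (pos == std::string::npos) {
            throw std::runtime_error("malformed metadata line: " + line);
        }
        // Split at the first separator; the value keeps any later "===".
        result[line.substr(0, pos)] = line.substr(pos + sep.size());
    }
    return result;
}
```

Splitting at the first "===" means keys must not contain the separator, while values may.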

Setting and Getting Metadata

New metadata is created in the setup phase, during the construction of the AnalysisModule class(es), via the Context object. For example, to create a metadata entry named "jer_smearing_applied", use this in the AnalysisModule constructor:

ctx.set_metadata("jer_smearing_applied", "true");

Note that keys and values are always strings. Metadata is available via Context::get like any other setting, but with the special prefix "meta_". To access the above metadata, use

ctx.get("meta_jer_smearing_applied");

As an example, refer to the JetResolutionSmearer in common which contains code to detect an attempt to double-smear jets.
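The double-smearing guard can be sketched roughly as follows. This is not the actual JetResolutionSmearer code: MockContext is a stand-in for the UHH2 Context so that the example is self-contained, SmearerSketch is a hypothetical module, and the two-argument get with a default value is an assumption of this sketch.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for the UHH2 Context, for illustration only; the real class has a
// richer interface.
class MockContext {
public:
    // Store metadata so that it is visible under the "meta_" prefix.
    void set_metadata(const std::string & name, const std::string & value) {
        settings_["meta_" + name] = value;
    }
    // Return the setting for key, or def if the key is unknown (assumed API).
    std::string get(const std::string & key, const std::string & def) const {
        const auto it = settings_.find(key);
        return it == settings_.end() ? def : it->second;
    }
private:
    std::map<std::string, std::string> settings_;
};

// Hypothetical module whose constructor refuses to smear a sample twice.
class SmearerSketch {
public:
    explicit SmearerSketch(MockContext & ctx) {
        if (ctx.get("meta_jer_smearing_applied", "false") == "true") {
            throw std::runtime_error("JER smearing was already applied to this sample");
        }
        // Record that smearing happens in this processing step.
        ctx.set_metadata("jer_smearing_applied", "true");
    }
};
```

Because the check and the set_metadata call both happen in the constructor, the guard runs once per sample during setup rather than once per event.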

Internals and Best Practices

Internally, metadata is handled by the framework at different stages:

  • when reading in the first input file of a sample, and before constructing the first AnalysisModule, the metadata tree ("uhh2_meta") is read out, if present, and all settings found are made available via the Context object just like the settings from the xml file, but with the "meta_" prefix
  • when reading in additional input files, a consistency check is done: as metadata is expected to be consistent for the whole sample, it is an error if metadata is different for two input files for the same sample.
  • when writing the first (selected) event to the output, the current metadata values are written to the metadata tree.

Note that the above algorithm makes metadata round-trip, i.e. all metadata available in the input will be written to the output without any code referring to metadata explicitly. This behavior is intentional, to keep as much information as possible, but it can make it hard to get rid of metadata, which might be a desirable workaround in some cases; see the last item of the best practices below.
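The consistency check between input files could look roughly like the following sketch. The names Metadata, first_mismatch and check_consistent are hypothetical, and treating a key that is missing in one file as an inconsistency is an assumption; the framework's own check may be stricter or more lenient.

```cpp
#include <map>
#include <stdexcept>
#include <string>

using Metadata = std::map<std::string, std::string>;

// Return a key whose presence or value differs between a and b,
// or an empty string if both maps agree.
std::string first_mismatch(const Metadata & a, const Metadata & b) {
    for (const auto & kv : a) {
        const auto it = b.find(kv.first);
        if (it == b.end() || it->second != kv.second) return kv.first;
    }
    for (const auto & kv : b) {
        if (a.find(kv.first) == a.end()) return kv.first;
    }
    return "";
}

// Throw if a later input file disagrees with the first file of the sample.
void check_consistent(const Metadata & first_file, const Metadata & other_file) {
    const std::string key = first_mismatch(first_file, other_file);
    if (!key.empty()) {
        throw std::runtime_error("inconsistent metadata for key '" + key + "'");
    }
}
```

Naming the offending key in the error message makes it easier to find which input file was produced with different settings.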

Some best practices:

  • use a globally unique name to prevent name collisions, e.g. by including the name of the AnalysisModule class which created the metadata.
  • always create metadata as early as possible. This ensures that other AnalysisModules depending on that value read the right value and that the framework writes the correct metadata to the output.
  • avoid setting a metadata value more than once. The current implementation will throw an error when trying to set metadata to a different value (setting the same value again is OK), unless a special "force" flag of Context::set_metadata is set. The only purpose of this flag is to allow correcting wrong metadata in the input file. If used, it should be set as early as possible in the processing (i.e. at the top of the constructor of the top-level AnalysisModule), such that other AnalysisModules and the framework only see the new value.
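The rule in the last item can be illustrated with a small stand-alone sketch. MetadataStoreSketch is a hypothetical class, not the UHH2 Context, and the position and default value of the force parameter are assumptions made for this example.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical store that mimics the "set once unless forced" rule.
class MetadataStoreSketch {
public:
    void set_metadata(const std::string & name, const std::string & value,
                      bool force = false) {
        const auto it = values_.find(name);
        // Re-setting the same value is allowed; changing it requires force.
        if (it != values_.end() && it->second != value && !force) {
            throw std::runtime_error("metadata '" + name +
                                     "' is already set to a different value");
        }
        values_[name] = value;
    }
    // Return the stored value; throws std::out_of_range for unknown names.
    const std::string & get(const std::string & name) const {
        return values_.at(name);
    }
private:
    std::map<std::string, std::string> values_;
};
```

With this store, setting "k" to "a" twice succeeds, setting it to "b" afterwards throws, and passing force = true overwrites the old value.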
