Metadata
A new feature of UHH2 is the ability to deal with metadata, simple key-value string data which is valid for a whole sample and not a single event. Typical use cases include:
- store processing information -- e.g. whether jet resolution smearing was already applied -- to prevent certain bugs
- store sample-wide information -- such as the sample cross section -- at the first analysis step and use it at subsequent processing steps
Metadata is stored in the output ROOT file in an additional output tree named uhh2_meta. This tree contains one branch of type string which holds all metadata key/value pairs in the format
key1===value1
key2===value2
etc.
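To illustrate the format, here is a minimal sketch of how such a string could be split back into key/value pairs. The function parse_metadata is a hypothetical helper and not part of UHH2, and the sketch assumes that the pairs are separated by newlines, as in the listing above.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of UHH2): parse lines of the form
// "key===value" into a map. Assumes one pair per line.
std::map<std::string, std::string> parse_metadata(const std::string& text) {
    const std::string sep = "===";
    std::map<std::string, std::string> result;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // tolerate blank lines
        const auto pos = line.find(sep);
        if (pos == std::string::npos) {
            throw std::runtime_error("malformed metadata line: " + line);
        }
        // Everything before the first "===" is the key, the rest is the value.
        result[line.substr(0, pos)] = line.substr(pos + sep.size());
    }
    return result;
}
```

Splitting at the first occurrence of the separator means a value may itself contain "===", while a key may not.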
Creating new metadata is done in the setup phase during the construction of the AnalysisModule class(es) via the Context object. For example, to create a metadata entry named "jer_smearing_applied", use this in the AnalysisModule constructor:
ctx.set_metadata("jer_smearing_applied", "true");
Note that keys and values are always strings. Metadata is available via Context::get like any other setting, but with the special prefix "meta_". To access the above metadata, use
ctx.get("meta_jer_smearing_applied");
As an example, refer to the JetResolutionSmearer in common which contains code to detect an attempt to double-smear jets.
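The idea behind such a guard can be sketched in a few lines. This is not the actual JetResolutionSmearer code: MockContext is a stand-in for the framework's Context that only mimics set_metadata and get as described above, its has method is an assumption, and GuardedSmearer is a hypothetical module.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for the framework's Context, reduced to what this sketch needs.
// Assumption: metadata becomes visible under "meta_<name>" right after it is set.
class MockContext {
public:
    void set_metadata(const std::string& name, const std::string& value) {
        settings_["meta_" + name] = value;
    }
    // Assumed helper to test whether a setting exists.
    bool has(const std::string& key) const { return settings_.count(key) > 0; }
    // Throws std::out_of_range if the key is unknown.
    std::string get(const std::string& key) const { return settings_.at(key); }

private:
    std::map<std::string, std::string> settings_;
};

// Hypothetical module: refuses to be constructed if the input is already smeared.
class GuardedSmearer {
public:
    explicit GuardedSmearer(MockContext& ctx) {
        const std::string key = "meta_jer_smearing_applied";
        if (ctx.has(key) && ctx.get(key) == "true") {
            throw std::runtime_error("jets are already smeared; refusing to smear twice");
        }
        // Record the processing step so that later steps can detect it.
        ctx.set_metadata("jer_smearing_applied", "true");
    }
};
```

Constructing a second GuardedSmearer on the same context throws, which is the behavior one would want when a sample that was already smeared is processed again.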
Internally, metadata is handled by the framework at different stages:
- when reading in the first input file of a sample -- and before constructing the first AnalysisModule --, the metadata tree ("uhh2_meta") is read out, if present, and all settings found are made available via the Context object just like the settings from the xml file, but with the "meta_"-prefix
- when reading in additional input files, a consistency check is done: as metadata is expected to be consistent for the whole sample, it is an error if metadata is different for two input files for the same sample.
- when writing the first (selected) event to the output, the current metadata values are written to the metadata tree.
Note that the above algorithm makes metadata round-trip, i.e. all metadata available in the input will be written to the output, without any code referring to metadata explicitly. This behavior is on purpose to keep as much information as possible, but it can make it hard to get rid of metadata, which might be a desirable workaround in some cases; see the last item of the best practices below.
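The consistency check performed when additional input files are read can be sketched as follows. The names Metadata and read_sample_metadata are hypothetical, and the sketch assumes that "different" means any difference in the set of keys or in their values; the framework's actual comparison may be more lenient or more detailed.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Metadata = std::map<std::string, std::string>;

// Hypothetical helper: the first input file defines the metadata of the
// sample; every further file must carry exactly the same key/value pairs.
Metadata read_sample_metadata(const std::vector<Metadata>& per_file) {
    Metadata sample;
    bool first = true;
    for (const auto& file_meta : per_file) {
        if (first) {
            sample = file_meta;
            first = false;
        } else if (file_meta != sample) {
            throw std::runtime_error("inconsistent metadata between input files of one sample");
        }
    }
    return sample;
}
```

The returned map corresponds to what would be exposed with the "meta_" prefix and later written to the output tree.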
Some best practices:
- use a globally-unique name to prevent name collisions, e.g. including the AnalysisModule class name which created the metadata.
- always create metadata very early, as early as possible. This ensures that other AnalysisModules depending on that value read the right value and that the framework writes the correct metadata to the output.
- avoid setting a metadata value more than once. The current implementation will throw an error if trying to set metadata with a different value (setting the same value is OK), unless the special "force" flag of Context::set_metadata is set. The only purpose of this flag is to allow correcting wrong metadata in the input file. If used, it should be done as early as possible in the processing (i.e. at the top of the constructor of the top-level AnalysisModule), such that other AnalysisModules and the framework only see the new value.
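The set-once rule from the last item can be sketched with a small stand-in class. MockContext is not the framework's Context, and passing the force flag as a third boolean argument with default false is an assumption made for this illustration.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for the framework's Context, modelling only the set-once rule.
class MockContext {
public:
    // Assumption: 'force' is a trailing boolean argument defaulting to false.
    void set_metadata(const std::string& name, const std::string& value, bool force = false) {
        const auto it = metadata_.find(name);
        const bool conflict = (it != metadata_.end() && it->second != value);
        if (conflict && !force) {
            throw std::runtime_error("metadata '" + name + "' is already set to a different value");
        }
        metadata_[name] = value;  // same value, new key, or forced overwrite
    }
    // Access by the prefixed name, e.g. "meta_jer_smearing_applied".
    std::string get(const std::string& key) const {
        const std::string prefix = "meta_";
        if (key.compare(0, prefix.size(), prefix) != 0) {
            throw std::runtime_error("this sketch only stores metadata: " + key);
        }
        return metadata_.at(key.substr(prefix.size()));
    }

private:
    std::map<std::string, std::string> metadata_;
};
```

In this sketch, setting "jer_smearing_applied" to "true" twice is accepted, setting it to "false" afterwards throws, and only a call with force set to true replaces the stored value.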