Performance

fastjet & fastjet-contrib dependence

Note that CMSSW 8_0_X and 9_4_X are based on fastjet 3.1.0 and fastjet-contrib 1.026.

HOTVR works with fastjet 3.1.0, but runs significantly faster with fastjet 3.2.1: for ttbar MC, 1.21 s/evt (v3.1.0) vs 0.30 s/evt (v3.2.1). However, due to an API change in fastjet::JetDefinition, CMSSW packages that create an instance of that class must be checked out and recompiled (RecoBTag/SecondaryVertex, RecoJets/JetAlgorithms, RecoJets/JetProducers, PhysicsTools/JetMCAlgos).

The ECF modules run significantly faster with fastjet-contrib 1.032: 0.159 s/evt (v1.026) vs 0.016 s/evt (v1.032).

For comparison, the total event time for MC is then ~2.5s with the above changes (was ~4s before). For data, the event time is ~1.8s with the changes (was ~3s before).
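The speedup factors implied by the numbers above can be worked out with a few lines of Python. This is only an arithmetic sketch: the timings are copied from the text, while the dictionary labels and the function names `speedup` and `summarize` are made up for illustration.

```python
# Per-event timings quoted above, in seconds per event: (before, after).
# The total event times are approximate ("~") in the text.
timings = {
    "HOTVR": (1.21, 0.30),        # fastjet 3.1.0 -> 3.2.1, ttbar MC
    "ECF": (0.159, 0.016),        # fastjet-contrib 1.026 -> 1.032
    "total MC": (4.0, 2.5),
    "total data": (3.0, 1.8),
}


def speedup(before, after):
    """Return how many times faster the 'after' timing is."""
    return before / after


def summarize(timings):
    """Map each label to its speedup factor, rounded to one decimal."""
    return {name: round(speedup(b, a), 1) for name, (b, a) in timings.items()}


if __name__ == "__main__":
    for name, factor in summarize(timings).items():
        print(f"{name}: about {factor}x faster")
```

The two library upgrades give roughly a factor 4 (HOTVR) and a factor 10 (ECF) on the affected modules, while the total event time improves by a factor of about 1.6 to 1.7.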

Event sizes

plain miniAOD: 40-60kB/event, depending on sample and pileup scenario
old SFrame ntuple (with all jet constituents): around 10% less than plain miniAOD
miniAOD+topjets: around 80kB/event for high-pileup --> adds around 30% to plain miniAOD (main culprit: patJetCA8CHS, so could be trimmed down a little)
small version of SFrame ntuple (=without jet constituents): around 5kB/event

Runtime

running ntuplewriter only (on miniAOD+topjets sample), writing the small ntuple: 300 events/sec. This is the "baseline performance" to be expected when running AnalysisModules within CMSSW.
running ntuplewriter only, but writing large ntuple: 6 events/sec
running on miniAOD and writing both the miniAOD+topjets file and the small ntuple: 2 events/sec (with the large ntuple instead: 1.5 events/sec)

The time requirement for the last item is mainly due to the additional jet collections produced in CMSSW modules, such as heptoptag, cmstoptag, ca15, ca8 jets. In the current configuration, this takes most of the time, almost 0.5s per event.
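To compare these throughput figures with the per-event times quoted elsewhere on this page, it helps to convert events/sec into seconds per event. The sketch below does that conversion and estimates the wall time for a sample. The rates are taken from the list above; the sample size of one million events and the names `seconds_per_event` and `wall_hours` are purely illustrative assumptions.

```python
# Throughput figures from the list above, in events per second.
rates = {
    "ntuplewriter only, small ntuple": 300.0,
    "ntuplewriter only, large ntuple": 6.0,
    "miniAOD -> miniAOD+topjets + small ntuple": 2.0,
    "miniAOD -> miniAOD+topjets + large ntuple": 1.5,
}


def seconds_per_event(rate):
    """Convert a rate in events/sec into seconds per event."""
    return 1.0 / rate


def wall_hours(n_events, rate):
    """Wall time in hours for n_events on a single job at the given rate."""
    return n_events / rate / 3600.0


if __name__ == "__main__":
    n_events = 1_000_000  # hypothetical sample size, not from the measurements
    for name, rate in rates.items():
        print(f"{name}: {seconds_per_event(rate):.4f} s/evt, "
              f"{wall_hours(n_events, rate):.1f} h for {n_events} events")
```

At 2 events/sec the job spends 0.5s per event, which matches the "almost 0.5s per event" attributed to the additional jet collections.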

Runtime of `next-ntuple-format`

This section summarizes the CMSSW performance of ntuplewriter.py in the next-ntuple-format branch with CMSSW74X, tested with a Tprime-pair sample.

The measured time per event is around 2.4s. This is much more than the previous version, which needed only 0.5s per event. Of those 2.4s, more than 1.7s is spent on Qjets volatility (half for ca8, half for ca15), and another ~0.1s on each of the following three categories:

ca0.8 jets + gen + pruned + pruned-gen
ca1.5 jets + gen + filtered + filtered-gen
cmstoptag + gen + heptoptag + gen

Some optimization could be achieved by:

optimizing Qjets volatility
raising the pt cut so that Qjets is calculated for fewer jets (currently: minimum pt of 100GeV)
dropping the gen collections for some of the above jet algorithms, which could save about half of the ~0.3s currently spent on them
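The time budget described in this section can be tallied as follows. This is a back-of-the-envelope sketch using only the numbers quoted above; the variable and function names are illustrative, and since the Qjets figure is a lower bound ("more than 1.7s"), the remainder is an upper bound.

```python
# Numbers quoted in this section, in seconds per event.
TOTAL = 2.4          # measured time per event
QJETS = 1.7          # Qjets volatility (lower bound), split evenly ca8/ca15
PER_CATEGORY = 0.1   # approximate cost of each jet category
CATEGORIES = [
    "ca0.8 jets + gen + pruned + pruned-gen",
    "ca1.5 jets + gen + filtered + filtered-gen",
    "cmstoptag + gen + heptoptag + gen",
]


def budget():
    """Break the per-event time down into the pieces named in the text."""
    categories_total = PER_CATEGORY * len(CATEGORIES)
    remainder = TOTAL - QJETS - categories_total   # everything else (at most)
    gen_saving = 0.5 * categories_total            # "about half of the ~0.3s"
    return {
        "qjets_fraction": round(QJETS / TOTAL, 2),
        "categories_total": round(categories_total, 2),
        "remainder": round(remainder, 2),
        "gen_saving": round(gen_saving, 2),
        "total_without_gen": round(TOTAL - gen_saving, 2),
    }


if __name__ == "__main__":
    for key, value in budget().items():
        print(f"{key}: {value}")
```

Qjets volatility accounts for roughly 70% of the per-event time, so dropping the gen collections alone (saving about 0.15s) has a much smaller effect than any improvement on the Qjets side.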

Note that the measurement was done on a Tprime sample; the time on other samples, esp. SM backgrounds, is probably much lower as there are fewer high-pt jets to calculate Qjets volatility for.
