Things to be checked/taken care of, especially with respect to 2016/7/8 differences. I've grouped them by object, since often they are common across years.
Please comment/edit with any missing items!
Some things are only important post-ntupleisation, which have been marked with [analysis]. Everything else is assumed to be important for making ntuples.
Instructions to add a new class: https://github.com/UHH2/UHH2/wiki/Adding-a-new-object-class-to-ntuples
Changes needed on the analysis level are collected here: Complete RunII data analysis with UHH2 branch RunII_102X_v1
GENERAL
- Move to 10_2_10 for the L1 prefiring, double C tagger updates. https://github.com/cms-sw/cmssw/releases/CMSSW_10_2_10
- Store year in ntuple, for later use in analysis modules (user could override, but good default)
- Add in 2017v2 year & ntuple cfg
- Sort out eras
- Check GTs
- Separate year & config for ReReco 2018 RunA-C?
JETS (not substructure)
- Ensure name of collections sensible (says CHS or PUPPI, AK*)
- Ensure all have PUPPI multiplicities (& PUPPI energy fractions?)
- Ensure daughter accessing works properly - the mechanism changed for 2017
- Store pileupID discriminant value https://twiki.cern.ch/twiki/bin/viewauth/CMS/PileupJetID#Information_for_13_TeV_data_anal
- Store pileupID working point flags
- Mechanism to store constituents? e.g. raggleton@6ee0ad2
- Store leading parton/hadron FlavourParticle from jet flavour info?
- Fix bug storing incorrect L1 factor in Ntuplewriter
- Simplify double reclustering with multiple min pT (e.g. ak8PuppiJets) by using PtCandSelector
  Doesn't work - means you lose all clustering info e.g. rParam. Wouldn't save much time anyway.
- Seems like lepton keys in PUPPI jets are not stored properly. Issue in CMSSW?
  Jets produced with NtupleWriterJets are fine
- Don't apply any JECs?
  Hard to do with deepboostedjet stuff, and would require a very low pT cut, as the JEC factor is sometimes large. Better to just lower the initial cut in FastjetJetProducer
- Set better default values for (b-) tagging discriminants: probably -2, but definitely not 0!
- Check we have all the recommended b tag things for each year/dataset: https://twiki.cern.ch/twiki/bin/viewauth/CMS/BtagRecommendation
  All recommended resolved (AK4) taggers in place and filled. Exceptions: 2016v2 DeepCSV (CHS and PUPPI) and DeepJet (CHS). Whether or not filled values make sense should be checked in test analysis (e.g. ttbar with parton matching)
- Make deepFlavour btag vars default in bTagDiscriminators so applied to ALL jets
  deepFlavour btag vars stored, with more complicated scheme than just adding it to bTagDiscriminators list
- add min jet pt requirement to store PF constituents
- btaginfo() is not filled at the moment. Is it needed?
- rekey daughters of jets to avoid duplicate particles (mainly for PUPPI)
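On the default-value item in the JETS list above: a minimal sketch in plain Python (not the actual NtupleWriter code) of why a sentinel such as -2 is safer than 0 for a tagger that was not run. The function names, the dict layout and the sentinel constant are hypothetical and only for illustration.

```python
# Hypothetical sentinel meaning "tagger not available"; the list above suggests -2.
MISSING_DISCRIMINANT = -2.0


def get_discriminant(tag_values, name, default=MISSING_DISCRIMINANT):
    """Return the stored discriminant, or the sentinel if the tagger was not filled."""
    return tag_values.get(name, default)


def passes_tag(tag_values, name, threshold):
    """A jet with a missing tagger never passes, whatever the threshold is."""
    value = get_discriminant(tag_values, name)
    return value != MISSING_DISCRIMINANT and value > threshold


# A genuine tagger output of 0 stays distinguishable from "not filled".
jet = {"pfDeepCSVJetTags:probb": 0.0}
print(get_discriminant(jet, "pfDeepCSVJetTags:probb"))
print(get_discriminant(jet, "pfDeepFlavourJetTags:probb"))
```

With a default of 0, the two lookups above would be indistinguishable, which is the problem the item describes.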
TOP JETS
(may be helpful: https://github.com/UHH2/UHH2/wiki/(Top)-Jet-collections-in-Ntuples)
- Ensure name of collections sensible (says CHS or PUPPI, AK*), especially since e.g. "slimmedJetsAK8" was CHS in 2016 but PUPPI in 2017+!
- re-cluster AK8 jets with low pt threshold (for JERC), make copy with high pt threshold (for substructure, DeepFlavor, etc)
- Ensure daughter & subjet accessing works properly - the mechanism changed for 2017
- Some unified way of storing extra variables - userFloats? Manual matching? We currently employ both...
- Ensure all have PUPPI multiplicities (& PUPPI energy fractions?)
- Ensure subjets are filled properly - all have area, daughters, energy fractions, etc
- PUPPI multiplicities for sub-jets?
  Not doing, very complicated to ensure module name matches correctly to get map in NtupleWriterJets.
- Ensure we store recommended b-taggers (https://twiki.cern.ch/twiki/bin/viewauth/CMS/BtagRecommendation in AK8 "_SoftDropX" collections): DeepCSV, DeepJet (2018, 2016v3) and DeepFlavour (2017), CSVv2 (2016v2, 2017)
  CSV and DeepCSV are stored for both AK8CHS and AK8PUPPI, DeepFlavour is stored only for AK8PUPPI
- Store DeepFlavour/DeepJet or/and DeepAKx for AK8CHS_SoftDropCHS
- Ensure we have double b tagger & store DeepJet variables (https://twiki.cern.ch/twiki/bin/view/CMS/DeepJet, https://twiki.cern.ch/twiki/bin/view/CMS/DeepFlavour https://indico.cern.ch/event/777545/contributions/3234584/attachments/1766808/2869123/BTV_CMS_Week.pdf)
- Why does DeepFlavourJetTagsProducer produce "The NN encountered 2 nan input TagInfo values and produced 5 nan output values" errors for pfDeepCSVJetTagsAk8PuppiJetsFat?
- Tidy up b taggers - why "Did not find all b-taggers! Available btaggers:..." message?
  Due to missing DeepFlavor tags, as it's expected by default for any jet collection, but it does not exist for AK4 jets
- Use postfix = WithPuppiDaughters: https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/PatAlgos/python/tools/jetTools.py#L643 OMG why is this not documented anywhere?!
- Why
- Ensure we have energy correlation functions (ECFs) across all years
- Keep HepTopTagger? Retired
- Store pileupID discriminant value https://twiki.cern.ch/twiki/bin/viewauth/CMS/PileupJetID#Information_for_13_TeV_data_anal
- Get rid of pruned mass
- Seems like lepton keys for PUPPI jets (updatedPatJetsSlimmedJetsAK8_SoftDropPuppi) are not stored properly -> issue for 94X_v1, not confirmed with 10_2_X_v1 branch, to be double-checked with HZ MC (Andrea's ToDo as part of debug campaign)
- Lower ECF cut? Currently at pt > 250 for Njets = 3 as per miniaod
  No objections, so no change
- Don't apply any JECs?
  Hard to do with deepboostedjet stuff, and would require a very low pT cut, as the JEC factor is sometimes large. Better to just lower the initial cut in FastjetJetProducer
- Lower fatjet_ptmin to ensure not biased jet selection in ntuple
- add more DeepJet booleans? See [102X] Recluster AK8PUPPI for TopJets ourselves #1117 (comment)
- make sure AK8CHS and AK8PUPPI for JERC studies have low thresholds (e.g. 20 GeV)
- add flags of PF storage for each collection
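The two-threshold idea in the reclustering items above (one low-pt collection for JERC, one high-pt copy for substructure and tagging) can be sketched in plain Python as below. This is only an illustration of the selection logic, not the CMSSW configuration; the function name and dict layout are hypothetical, the 20 GeV value is the example quoted in the list, and the 150 GeV value is an arbitrary placeholder.

```python
def split_by_pt(jets, low_ptmin=20.0, high_ptmin=150.0):
    """Return (low-threshold collection, high-threshold copy).

    The high-pt collection is built from the low-pt one, so the jets are
    clustered once and only the pt selection differs between the two.
    """
    low = [jet for jet in jets if jet["pt"] > low_ptmin]
    high = [jet for jet in low if jet["pt"] > high_ptmin]
    return low, high


jets = [{"pt": 15.0}, {"pt": 45.0}, {"pt": 320.0}]
jerc_jets, substructure_jets = split_by_pt(jets)
print(len(jerc_jets), len(substructure_jets))
```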
GEN JETS
- Make a proper GenJet class?
- Combine with GenJetWithParts to store genparticle indices? raggleton@dffb54e
- Inherit from FlavourParticle and store flavour from slimmedGenJetsFlavourInfos collection
- Store energy fractions (emEnergy etc), copy from GenTopJets: e081421, https://github.com/cms-sw/cmssw/blob/CMSSW_8_0_26/RecoJets/JetProducers/src/JetSpecific.cc#L377
- Proper linking for mother and daughters in NtupleWriter::add_genpart
  Would require storing all gen particles: clustered in jets + mother and daughters. Use case is not clear
- Add in lepton+photon genjets to meet requirements here for Top XS measurement: https://twiki.cern.ch/twiki/bin/view/LHCPhysics/ParticleLevelTopDefinitions . Basically cluster antikt R=0.1, only using final state gen electrons, gen muons, and gen photons that have a parent of an electron or muon (to avoid e.g. pion contamination). Dennis TODO
- Add multiplicities to complement energy fractions, needed for Dennis' lepton jets
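For the lepton+photon genjet item above, the input selection could look roughly like the sketch below. It is plain Python over dicts, not the actual producer. It assumes the usual convention that status 1 marks final-state particles, and it reads the requirement as "the parent must be an electron or muon" applying to photons only; that reading is an assumption, as the item can also be read as applying to all three species. The dict keys and function name are hypothetical.

```python
# PDG Monte Carlo particle IDs
ELECTRON, MUON, PHOTON = 11, 13, 22


def is_lepton_jet_input(particle):
    """Decide whether a gen particle enters the R=0.1 lepton+photon clustering."""
    if particle["status"] != 1:  # keep final-state particles only
        return False
    pdg = abs(particle["pdgId"])
    if pdg in (ELECTRON, MUON):
        return True
    if pdg == PHOTON:
        # Photons from e.g. pi0 decays are rejected via the parent requirement.
        return abs(particle["motherPdgId"]) in (ELECTRON, MUON)
    return False


particles = [
    {"pdgId": -13, "status": 1, "motherPdgId": 24},   # muon from a W
    {"pdgId": 22, "status": 1, "motherPdgId": -13},   # photon radiated off a muon
    {"pdgId": 22, "status": 1, "motherPdgId": 111},   # photon from a pi0
    {"pdgId": 11, "status": 23, "motherPdgId": 23},   # not final state
]
selected = [p for p in particles if is_lepton_jet_input(p)]
print(len(selected))
```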
GEN TOP JETS
- Store new GenJets as subjets, not Particles
- Add method to calculate (or setter/getter?) softdrop mass - check v4.M() == sum(constituents).M() == sum(subjets).M()
  Have checked all give the same answer, so users can just do GenTopJet::v4.M()
- Other variables to store? ECFs? tau_i?
- Add muon energy fraction
- Make sure GEN energy fractions are filled. Was working before switch to GenJet!
- add separate flag to store PF constituents for slim/fat gen jets
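The softdrop-mass cross-check mentioned in the list above (jet mass equals the mass of the summed constituents and of the summed subjets) can be reproduced with a few lines of plain Python. This is a sketch with hand-written four-vectors, not UHH2 code; the helper names are hypothetical, and returning a negative mass for negative m^2 is a convention assumed here.

```python
import math


def add_p4(vectors):
    """Component-wise sum of (E, px, py, pz) tuples."""
    return tuple(sum(components) for components in zip(*vectors))


def mass(p4):
    """Invariant mass; a negative m^2 is returned as a negative mass."""
    energy, px, py, pz = p4
    m2 = energy * energy - (px * px + py * py + pz * pz)
    return math.copysign(math.sqrt(abs(m2)), m2)


constituents = [
    (1.0, 0.0, 0.0, 1.0),
    (1.0, 0.0, 0.0, -1.0),
    (2.0, 0.0, 2.0, 0.0),
]
# Two "subjets": the first holds two constituents, the second holds one.
subjets = [add_p4(constituents[:2]), constituents[2]]

jet_p4 = add_p4(constituents)
print(mass(jet_p4), mass(add_p4(subjets)))
```

The agreement is exact by construction as long as every constituent belongs to exactly one subjet; constituents dropped by grooming would show up as a difference between the ungroomed jet and the subjet sum.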
MET
- Store all the slimmedMETsEGClean etc for the 03Feb rereco (2016v2)?
- Store slimmedMETsNoHF for 2016v3 (and others?)?
- Add in EE2017 recipe (only for year=2017 though) & store. But keep slimmedMETs, slimmedMETsPUPPI as usual
- Store sumET explicitly (& "proper" met significance from covariance matrix, instead of current MET/sqrt(sumET)?)
- CHS MET for 2016? the rawCHS uncertainty isn't stored for 2016v2, so it's currently storing junk
- Store GenMet
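To make the sumET item above concrete, the sketch below contrasts the current simple estimate MET/sqrt(sumET) with a covariance-based significance, taken here to be the quadratic form S = p^T V^-1 p with p = (MET_x, MET_y) and V the 2x2 MET covariance matrix. That definition is stated as an assumption about what "proper" means in the item; the function names are hypothetical and nothing here uses CMSSW.

```python
import math


def met_significance_simple(met_pt, sum_et):
    """Current approach quoted in the list: MET / sqrt(sumET)."""
    return met_pt / math.sqrt(sum_et)


def met_significance_cov(mex, mey, cxx, cxy, cyy):
    """Quadratic form p^T V^-1 p for the symmetric 2x2 covariance V."""
    det = cxx * cyy - cxy * cxy
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    return (mex * mex * cyy - 2.0 * mex * mey * cxy + mey * mey * cxx) / det


# Toy numbers: MET = (3, 4), unit diagonal covariance, sumET = 25.
print(met_significance_simple(5.0, 25.0))
print(met_significance_cov(3.0, 4.0, 1.0, 0.0, 1.0))
```

The two quantities have different units and scales, so they are not interchangeable in existing cuts; storing sumET and the covariance elements lets users compute either.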
MUONS
- Add in 2016 IDs & tags
ELECTRONS
- Add in 2016 IDs & tags
- For 2016v2: store the bool particleFlowEGammaGSFixed:dupECALClusters and whether ecalMultiAndGSGlobalRecHitEB:hitsNotReplaced is empty or not in any analysis job?
- Store HEEP variables?
  Variables have been stored for a while as a copy from miniAOD. TrackIso in miniAOD is broken and might differ from what is used by VID behind HEEP 7.0. For more details see https://twiki.cern.ch/twiki/bin/view/CMS/HEEPElectronIdentificationRun2
- Fix fBrem having -1E-30
- Do we want the new calibrations etc for electrons? https://twiki.cern.ch/twiki/bin/view/CMS/EgammaMiniAODV2 https://twiki.cern.ch/twiki/bin/view/CMS/EgammaPostRecoRecipes
Comparison between miniAOD 2018 vs smeared 2018
https://sharper.web.cern.ch/sharper/cms/egamma/2019/Feb21st_SSValid/
TRIGGER
- Choose correct trigger collection name, differs for 2016 vs 2017+ (one is selectedPatTrigger, another is slimmedPatTrigger)
- Ensure MET flags stored correctly (metfilter_bits=cms.InputTag("TriggerResults", "", metfilterpath),), was different in 2016 vs 2017
- Prefiring things (across all years? https://twiki.cern.ch/twiki/bin/viewauth/CMS/L1ECALPrefiringWeightRecipe) (should become part of 10_2_x at some point, see PR 25380)
- Update for L1 prefire backport, PR 25645
- Update L1 prefiring recipe once it becomes more integrated. We will run the official recipe when making ntuples, and also later add a user module to re-calculate the weights from jets & photons (see PHOTONS section below)
- Trigger objects for selected analyses (dijet, Z') see https://github.com/UHH2/TriggerPaths
- Add updated list of EE crystals to filter out https://twiki.cern.ch/twiki/bin/viewauth/CMS/MissingETOptionalFiltersRun2#How_to_run_ecal_BadCalibReducedM
- Bad Muon & Bad Charged Hadron filters for 2016v2 https://twiki.cern.ch/twiki/bin/viewauth/CMS/MissingETOptionalFiltersRun2#How_to_run_the_Bad_Charged_Hadro
- store L1min and L1max prescale
- add possibility to store L1seeds, which might be useful for prefiring studies with 2017 data
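The year-dependent collection name in the first TRIGGER item could be picked with a small helper like the one sketched below. The list only says that one of the two names applies to 2016 and the other to 2017+; the mapping used here (selectedPatTrigger for 2016v2, slimmedPatTrigger for everything else, including the 2016v3 re-miniAOD) is an assumption that should be checked against the actual MiniAOD content. The function name and the year labels are hypothetical.

```python
# Assumed year labels, following the naming used elsewhere in this list.
KNOWN_YEARS = ("2016v2", "2016v3", "2017v1", "2017v2", "2018")


def trigger_object_collection(year):
    """Return the trigger-object collection name assumed for a given year label."""
    if year not in KNOWN_YEARS:
        raise ValueError("unknown year label: %s" % year)
    # Assumption: only the original 2016 MiniAOD (2016v2) uses the old name.
    if year == "2016v2":
        return "selectedPatTrigger"
    return "slimmedPatTrigger"


for label in KNOWN_YEARS:
    print(label, trigger_object_collection(label))
```

Failing loudly on an unknown label avoids silently reading an empty collection when a new year is added.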
GEN PARTICLES
- Check ME particles are there for different generators
  - Pythia8 (QCD) ✅ for QCD see gd -> gd (since LO, there should only be 2 outgoing partons?)
  - MG5_aMC@NLO + pythia (lots) ✅ for G+Jets see gc -> Gggc
  - Powheg+Pythia (ttbar) ✅ see t -> b and t -> w -> w decay products
  - aMC@NLO + ? (ttbar) ✅ see t -> b and t -> w -> w decay products
  - Herwig++ (QCD) ❌ can't do due to useless status codes
  - Herwig7 (QCD, top) ❌ can't do due to useless status codes
- Tidy up stablegenparticles, that pythia8 flag etc
HOTVR/XCONE (GEN) JETS
- Something missing? Energy fractions? To do it properly, would be a little more complicated? Need to output reco::Jet, then pat-ify?
- Make sure jet area is stored (RECO only)
- GEN Jets: store "gen" energy fractions and gen particles, also for sub-jets (no info is actually stored at the moment due to numberOfDaughters() = 0 by construction)
- Make sure vertex info is used in the clustering, e.g. in GenXConeProducer::produce
  Not needed, because particles used for clustering are defined with respect to the PV
- lepton keys for XCONE and HOTVR?
  A lot of work for no clear gain. Note for future development: the most straightforward way would be integration of the self-clustered jet collection in NtupleWriterJets.
- Remove XCone23
TAUS
- Any IDs to store? Is anyone using it? No
- Drop taus
PF CANDIDATES
- Change default to store only those in (top)jets?
By default PF candidates are not stored, to save space. One can enable them with the option "doPFJetConstituents=cms.uint32(Njets)", where Njets is the number of leading jets for which PF candidates are stored in all jet collections
NB: for fat jets, PF candidates of sub-jets are stored. However, this feature does not work for "_Softdrop" collections, because their constituents are sub-jets and not PF candidates
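The effect of the doPFJetConstituents option described above can be pictured with the short sketch below: only the N leading jets of a collection have their constituents kept. This is plain Python, not the NtupleWriter implementation; it assumes "leading" means highest pt, and the function name and dict layout are hypothetical.

```python
def jets_with_stored_constituents(jets, n_leading):
    """Return the jets whose PF candidates would be stored.

    Assumption: "leading" means ordered by decreasing pt; n_leading = 0
    corresponds to the default of not storing any candidates.
    """
    ordered = sorted(jets, key=lambda jet: jet["pt"], reverse=True)
    return ordered[:n_leading]


jets = [{"pt": 10.0}, {"pt": 50.0}, {"pt": 30.0}]
kept = jets_with_stored_constituents(jets, 2)
print([jet["pt"] for jet in kept])
```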
GEN/EVENT INFO
- Make sure all necessary weights stored, e.g. Parton shower weights in Autumn18 MC
  All weights are stored in GenInfo->weights(). Most of them are related to model uncertainties of the Pythia8 parton shower: the first entry is the central value, then there are variations, in total 46.
- binningValues are not filled, because they are empty in most miniAOD samples. According to the comment here: https://github.com/cms-sw/cmssw/blob/CMSSW_10_2_X/SimDataFormats/GeneratorProducts/interface/GenEventInfoProduct.h#L91 it is not a critical variable anyway.
- xPDF not available for some samples, e.g. https://uhh2-integration.web.cern.ch/UHH2integration/test1129/mc_2017_TTSemiLeptonic.html#genInfo.pdf_xPDF1() Possible replacement variable?
  OK for 2016v2, v3, but not in 2017 or 2018. 2016 used powheg-pythia, 2018 is also powheg-pythia and 2017 is powheg-pythia and amcatnloFXFX-pythia8, so this seems weird. xPDF data is not even in MiniAOD, so not our fault. If the user needs the PDF value, it should be recoverable via x, PDGID and q scale
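Given the layout of GenInfo->weights() described above (central value first, variations after it), an analysis-side helper might turn the variations into ratios to the central weight, as sketched below. This is plain Python over a list of floats; the function names are hypothetical, and it only relies on the ordering stated in the note, not on which variation sits at which index.

```python
def split_weights(weights):
    """Split a weight list into (central value, list of variations)."""
    if not weights:
        raise ValueError("empty weight list")
    return weights[0], list(weights[1:])


def relative_variations(weights):
    """Ratio of each variation to the central weight."""
    central, variations = split_weights(weights)
    if central == 0.0:
        raise ValueError("central weight is zero")
    return [weight / central for weight in variations]


# Toy list with a central weight and two variations.
print(relative_variations([2.0, 1.0, 4.0]))
```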
PHOTONS
- Add Photon object class?
- Store slimmedPhotons, so that we can recalculate L1Prefire weights if necessary
- Update ExampleModuleElectronID to handle both Electrons & photons
- Why are photon Puppi iso values changing each time? Only the very extreme ones?
  Turns out it didn't exist in 2016, so it was just filling in junk random values. Have left it at the default value of -1 for 2016v2 samples
References:
https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookMiniAOD
https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookMiniAOD2016
https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookMiniAOD2017
https://twiki.cern.ch/twiki/bin/view/CMSPublic/ReMiniAOD03Feb2017Notes
https://twiki.cern.ch/twiki/bin/view/CMS/PdmV2016Analysis
https://twiki.cern.ch/twiki/bin/view/CMS/PdmV2017Analysis
https://twiki.cern.ch/twiki/bin/view/CMS/PdmV2018Analysis