
Conversation

@francescobrivio
Contributor

@francescobrivio francescobrivio commented Oct 5, 2022

Config Change Request

Not really a replay request, more just a config change.

Proposed Changes
In #4642 a lifetime of 3 months was added to the express alcarecos. When this lifetime expires, these datasets (which, unlike the prompt ones, are not custodial) are deleted rather than transferred to tape. We therefore propose to increase the lifetime to 12 months, so that they remain available, for example, for the derivation of the re-reco conditions.
In this PR I increased the lifetime for:

  • "Express"
  • "ExpressCosmics"
  • "ExpressAlignment"
  • "ALCALumiPixelsCountsExpress"
  • "ALCAPPSExpress"
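The conversation does not show the diff itself; purely as a hedged sketch (the names below are illustrative, not the actual dmwm/T0 configuration API), the change amounts to bumping a per-stream lifetime from 3 to 12 months:

```python
# Hypothetical sketch only: the variable and key names are illustrative,
# not the real dmwm/T0 configuration API. Lifetimes of this kind are
# often expressed in seconds, approximating a month as 30 days.
MONTH = 30 * 24 * 3600  # 2,592,000 seconds

EXPRESS_STREAMS = [
    "Express",
    "ExpressCosmics",
    "ExpressAlignment",
    "ALCALumiPixelsCountsExpress",
    "ALCAPPSExpress",
]

# Before this PR: 3 * MONTH; after: 12 * MONTH.
lifetimes = {stream: 12 * MONTH for stream in EXPRESS_STREAMS}
print(lifetimes["Express"])  # 31104000
```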

@drkovalskyi given your comment in #4642 (comment) any feedback on this proposal is welcome (including possibly setting different diskNodes for storing these alcarecos).

T0 Operations cmsTalk thread
https://cms-talk.web.cern.ch/t/config-change-for-keeping-express-alcareco-on-disk-for-1-year/15927

@drkovalskyi

We need an estimate of how much data we are talking about. Based on that we can decide if it's Ok to make a year long subscription.

In general, please keep in mind that Tier0 subscriptions are meant to be used as initial subscriptions, i.e. if a user's needs go beyond what the T0 rules provide, new Rucio rules should be created by the user.
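For context, a user-level rule of the kind described above can be created with the Rucio CLI; the RSE and 12-month lifetime below are illustrative, and the dataset name is the one mentioned later in this thread:

```shell
# Illustrative only: pin one replica of an express alcareco dataset on a
# disk RSE for roughly one year (lifetime is given in seconds).
rucio add-rule --lifetime 31536000 \
    cms:/StreamExpressAlignment/Run2022B-TkAlMinBias-Express-v1/ALCARECO \
    1 T2_CH_CERN
```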

@francescobrivio
Contributor Author

Hi Dima,

> We need an estimate of how much data we are talking about. Based on that we can decide if it's Ok to make a year long subscription.

Ok, we'll try to get an estimate ASAP!
Indeed this was also my concern when I mentioned the possibility of updating the diskNodes for these alcarecos.

> if user's needs go beyond what T0 rules are providing new Rucio rules should be created by the user.

This is not always true for the express case: since there is no default tape subscription, once the 3 months are over the dataset is deleted for good. E.g. this just happened with:
/StreamExpressAlignment/Run2022B-TkAlMinBias-Express-v1/ALCARECO

@drkovalskyi

Can you clarify why 3 months is not enough to realize that you need the data for longer and you need to make a new subscription? Is it because the data is used only much later? Maybe we should reconsider the idea of having a tape copy? Basically we should not put all data on disk for a year if we just need a small fraction once in a while.

@francescobrivio
Contributor Author

Hi @drkovalskyi
I made this table based on 2022C (so far the largest era of 2022, with ~5/fb collected) to give you an example of the sizes.

| Dataset | ALCARECO | Size | Comment |
| --- | --- | --- | --- |
| Express | SiStripPCLHistos | 278.8 GB | Better on T2_CH_CERN |
| | SiStripCalZeroBias | 6.6 TB | |
| | SiStripCalMinBias | 10.3 TB | |
| | SiStripCalMinBiasAAG | 4.3 TB | |
| | TkAlMinBias | 3.8 TB | |
| | SiPixelCalZeroBias | 79.5 GB | |
| | SiPixelCalSingleMuon | 628.3 GB | |
| | SiPixelCalSingleMuonTight | 157.0 GB | |
| ExpressCosmics | SiStripPCLHistos | 7.7 GB | Better on T2_CH_CERN |
| | SiStripCalZeroBias | 4.6 GB | |
| | SiPixelCalZeroBias | 3.5 GB | |
| | TkAlCosmics0T | 10.1 GB | |
| StreamExpressAlignment | TkAlMinBias | 7.0 TB | Can be moved to other Disks |
| ALCALumiPixelsCountsExpress | AlCaPCCRandom | 130.5 GB | Can be moved to other Disks |
| ALCAPPSExpress | PPSCalMaxTracks | 748.7 GB | Can be moved to other Disks |
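As a quick sanity check, adding up the 2022C sizes in the table gives about 34 TB, consistent with the ~35 TB quoted later in the thread:

```python
# Re-adding the 2022C sizes from the table above, everything in GB
# (decimal units: 1 TB = 1000 GB, matching the table's notation).
sizes_gb = [
    278.8, 6600.0, 10300.0, 4300.0, 3800.0, 79.5, 628.3, 157.0,  # Express
    7.7, 4.6, 3.5, 10.1,                                         # ExpressCosmics
    7000.0,                                                      # StreamExpressAlignment
    130.5,                                                       # ALCALumiPixelsCountsExpress
    748.7,                                                       # ALCAPPSExpress
]
total_tb = sum(sizes_gb) / 1000
print(f"total for 2022C: {total_tb:.1f} TB")  # total for 2022C: 34.0 TB
```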

> Can you clarify why 3 months is not enough to realize that you need the data for longer and you need to make a new subscription? Is it because the data is used only much later?

For example, the StreamExpressAlignment alcareco I mentioned in my previous message is used to derive the beamspot conditions for the re-reco, so maybe 1 year is an overshoot, but at the same time 3 months is too short...
But I understood (from private conversations) that in Run 2 the policy was to store them for 12 months, so that's what I'm implementing in this PR as well.

> Maybe we should reconsider the idea of having a tape copy?

Yes, I guess this falls into the same general discussion about alcareco handling (see also https://its.cern.ch/jira/browse/CMSTZ-1005)

> Basically we should not put all data on disk for a year if we just need a small fraction once in a while.

Are you now referring to T2_CH_CERN or Disk sites in general?

@drkovalskyi

Ok, so we are talking about ~35 TB for ~20-25% of the data, i.e. 100-200 TB per year. I think that's small enough to just keep it all for a year.
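Making that extrapolation explicit: 2022C holds roughly 34-35 TB and is taken here as ~20-25% of a year's data, so scaling up lands inside the quoted 100-200 TB range:

```python
# Scaling the 2022C total (from the table in the previous comment) up to
# a full year, assuming 2022C represents 20-25% of the year's data.
total_2022c_tb = 34.0
low = total_2022c_tb / 0.25   # if 2022C is 25% of the year
high = total_2022c_tb / 0.20  # if 2022C is 20% of the year
print(f"{low:.0f}-{high:.0f} TB per year")  # 136-170 TB per year
```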

@francescobrivio
Contributor Author

Thanks Dima!

@germanfgv @jhonatanamado I guess this PR can be merged then.

@francescobrivio
Contributor Author

test syntax please

@germanfgv germanfgv merged commit c4cc364 into dmwm:master Oct 5, 2022
@francescobrivio francescobrivio deleted the alca_expressAlcarecos branch November 8, 2022 09:24