Changing functions which are still using NNPDF pseudodata #1424
Also, some of these are not used anywhere that I can see (just tried removing |
|
computed_psedorreplicas_chi2 was used in the alpha s determination, and it would be good to have some version of it (especially an efficient one). The code in mc_gen.py was used for various studies on systematics, which might be picked up again some day.
Could you give me some snippet of code that uses it? Otherwise testing it will be a nightmare. |
|
Uh, there were a bunch of typo "fixes" that appear to have broken the alpha_s runcards. |
|
I also see these were not updated to use groups... Anyhow, here is a runcard that "works" after I make this change:
diff --git a/validphys2/src/validphys/paramfits/dataops.py b/validphys2/src/validphys/paramfits/dataops.py
index fed61f8b2..773898e48 100644
--- a/validphys2/src/validphys/paramfits/dataops.py
+++ b/validphys2/src/validphys/paramfits/dataops.py
@@ -71,12 +71,12 @@ def get_parabola(asvals, chi2vals):
 #TODO: Export the total here. Not having it is causing huge pain elsewhere.
 @table
 @check_fits_different
-def fits_matched_pseudoreplicas_chi2_table(fits, fits_computed_pseudoreplicas_chi2):
+def fits_matched_pseudoreplicas_chi2_table(fits, fits_computed_psedorreplicas_chi2):
     """Collect the chi^2 of the pseudoreplicas in the fits a single table,
     groped by nnfit_id.
     The columns come in two levels, fit name and (total chi², n).
     The indexes also come in two levels: nnfit_id and experiment name."""
-    return pd.concat(fits_computed_pseudoreplicas_chi2, axis=1, keys=map(str,fits))
+    return pd.concat(fits_computed_psedorreplicas_chi2, axis=1, keys=map(str,fits))

fits:
  - NNPDF31_nnlo_as_0117_uncorr_s2
meta:
  author: Zahari Kassabov
  title: Pseudorreplica raw data for the second batch of proton only fits at NLO
  keywords: [as]
use_t0: False
use_cuts: True
experiments:
  from_: fit
theoryid: 53
fitting:
  from_: fit
dataseed:
  from_: fitting
datacuts:
  from_: fit
pdf:
  from_: fit
template_text: |
  {@fits_matched_pseudorreplicas_chi2_table@}
actions_:
  - - report:
        main: True
|
I'll fix the typos as I go. Thanks for the runcard, it seems to work. I've changed the runcard to
experiments:
  - experiment: NMC
    datasets:
      - {dataset: NMC}
  - experiment: SLAC
    datasets:
      - {dataset: SLACP}
so it doesn't take forever.
|
Since I don't need to care about pre-3.1 compatibility (#1405) and we have a group mechanism that deprecates half of the function, I've decided to redo
However, some terrible considerations
In order to generate the pseudodata I'm doing:
import numpy as np
from validphys.n3fit_data import replica_mcseed
from validphys.pseudodata import make_replica

all_data_replicas = []
for replica in fitted_replica_indexes:
    # Per-replica seed derived from the global mcseed
    value_of_mcseed = replica_mcseed(replica, mcseed, True)
    all_data_replicas.append(make_replica(dataset_inputs_loaded_cd_with_cuts, value_of_mcseed))
# Transpose so that each column is one replica
r_data = np.array(all_data_replicas).T
where the input is
def computed_pseudoreplicas_chi2(
    mcseed,
    dataset_inputs_loaded_cd_with_cuts,
    fitted_replica_indexes,
    ...
It looks like I should be able to have
So, rather than reviewers, from this PR it would be nice if someone else would deal with massaging the new/ported actions to be
Also, a working runcard (a modification of yours) that can be used:
fits:
  - 210629-n3fit-001
meta:
  author: juacrumar
  title: Pseudorreplica raw data chi2 for NNPDF4.0
  keywords: [as]
use_t0: False
use_cuts: True
dataset_inputs:
  - {dataset: NMCPD_dw_ite}
  - {dataset: D0WMASY, cfac: [QCD]}
theoryid: 200
genrep:
  from_: fit
fitting:
  from_: fit
mcseed:
  from_: fit
datacuts:
  from_: fit
pdf:
  from_: fit
template_text: |
  {@fits_matched_pseudoreplicas_chi2_table@}
actions_:
  - report(main=True)
I would actually drop
Probably studies that were done with it (looking at the example runcards) were done with
Anyway, my version should work just the same with
|
@scarlehoff agreed. Probably it is better to do these things from scratch anyway. |
I am not sure I understand the issue here: What is there to do other than perhaps refactoring the loop into its own provider? |
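As a rough illustration of what refactoring the loop into its own function could look like, here is a minimal, self-contained sketch. It does not use validphys: `toy_replica_seed` and `toy_make_replica` are hypothetical stand-ins for `replica_mcseed` and `make_replica`, and the independent Gaussian fluctuations are an assumption made for brevity, not the actual pseudodata procedure.

```python
import random


def toy_replica_seed(replica, mcseed):
    """Hypothetical stand-in for replica_mcseed: derive one seed per replica."""
    # Seeding with a string is deterministic across runs
    return random.Random(f"{mcseed}-{replica}").getrandbits(32)


def toy_make_replica(central_values, sigmas, seed):
    """Hypothetical stand-in for make_replica: Gaussian noise around the data."""
    rng = random.Random(seed)
    return [rng.gauss(c, s) for c, s in zip(central_values, sigmas)]


def pseudodata_table(central_values, sigmas, replica_indexes, mcseed):
    """The loop as one function: one row per data point, one column per replica."""
    replicas = [
        toy_make_replica(central_values, sigmas, toy_replica_seed(r, mcseed))
        for r in replica_indexes
    ]
    # Transpose (replica, point) -> (point, replica), like np.array(...).T
    return [list(row) for row in zip(*replicas)]


table = pseudodata_table([1.0, 2.0, 3.0], [0.1, 0.2, 0.3], [1, 2, 5, 8], mcseed=42)
```

Because each column depends only on the pair (mcseed, replica index), requesting a subset of replicas reproduces exactly the same columns, which is the property that matters when comparing against the pseudodata used in the fit.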
|
But the loop exists (it's
What I don't know how to do is how to collect
In particular I've tried doing:
fitted_make_replicas = collect('make_replica', ('fitted_replica_indexes',))
But I get:
(even if instead of a list I make it output an NSList with
I would like to be able to do the same thing I've done for
make_replicas = collect('make_replica', ('replicas',))
I guess I could make
|
You can only collect over something that is known at "compile time", but not at "run time". Actions (i.e. functions in provider modules), such as
To do things at compile time we have production rules (i.e. things defined in some
The distinction between the two is fairly arbitrary, other than it is nice to have things that are slow and can be checked as actions so they can fail quickly or execute successfully. In this case however I don't think we would gain much: the case where we want to work with single individual replicas is fairly niche, so we can as well have the corresponding for loop in the code.
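The compile-time versus run-time distinction can be illustrated with a toy resolver. This is a deliberately simplified sketch and not reportengine's implementation: `plan_collect`, `COMPILE_TIME` and the task tuples are hypothetical names. The only point is that the task list is built before any action has run, so it can only iterate over values that already exist at that stage.

```python
# Values available while the task graph is being built (for example from the
# runcard or from production rules). Hypothetical toy data.
COMPILE_TIME = {"replicas": [1, 2, 3]}


def fitted_replica_indexes():
    """Toy action: its result only exists once the graph is executed."""
    return [1, 3]


def plan_collect(action_name, over):
    """Expand one action into one task per item, at graph-building time."""
    if over not in COMPILE_TIME:
        raise KeyError(f"'{over}' is not known when the graph is built")
    return [(action_name, item) for item in COMPILE_TIME[over]]


# Works: 'replicas' is known before anything runs
tasks = plan_collect("make_replica", "replicas")

# Fails: 'fitted_replica_indexes' is the output of an action
try:
    plan_collect("make_replica", "fitted_replica_indexes")
    failed = False
except KeyError:
    failed = True
```

In this toy picture, an explicit for loop inside the action is the run-time alternative: it iterates over the result of `fitted_replica_indexes()` after that result has been computed.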
|
Well. There is a way, implementing an
But if you are happy with the current form I am happy to leave it like this. It just looked "unvalidphy-sy" to me (so I thought it would look horrendous to everybody else).
I'll take out the silly comments, deal with
|
Uh, there's no
As a compromise I've moved the import inside the only function that will ever use
I needed this functionality a few PRs ago, can you try:
which leverages the following production rule
nnpdf/validphys2/src/validphys/config.py Line 241 in 4c27de4
With regards to the |
|
Ah! Thank you! I can use indeed |
|
@siranipour @Zaharid to the best of my knowledge all pseudodata is now python-generated everywhere for all vp. Let me know if I missed something. |
Co-authored-by: siranipour <[email protected]>
|
Overall this looks fine to me. Especially the part where it removes a lot of code. |
|
Please approve and merge if this one does indeed look good. |
siranipour
left a comment
Nice to see a lot of the code I found confusing is now gone
siranipour
left a comment
Happy to merge once the test passes
I'm currently removing the libNNPDF dependencies and decided to start with RandomGenerator, which seems the lowest hanging fruit as we now have make_replica, which the fit now uses.
However, some actions are still using NNPDF pseudodata (which is very bad because that means the pseudodata could perfectly be 100% different from the python version), either calling pseudodata or MakeReplica. These are:
- n3fit_data_utils.py
- chi2grids.py::computed_pseudorreplicas_chi2* @Zaharid
- mc_gen.py::one_art_data_residuals
- mc_gen.py::art_rep_generation
- results.py::closure_pseudodata_replicas @Zaharid
- [ ] filter.py

Sadly these functions have been mainly developed/touched by people who have already left the collaboration, so please @Zaharid @siranipour if you could have a close look or give pointers (maybe some functions can be totally removed?) it would be much appreciated.
Also, some of these don't seem to be used anywhere (computed_pseudorreplicas_chi2 for instance, I just tried removing it with no consequences for the test...).
*That function clearly states #TODO: Everythning about this function is horrible. We need to rewrite, and I would agree, but I think making it use the python pseudodata is more pressing.
Edit: The RandomGenerator cannot be completely taken out from vp until there is a python-only closure test. I've instead moved the import inside the appropriate function.
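Moving an import inside the only function that needs it can be sketched as follows. The module name `hypothetical_legacy_rng` is a placeholder that is assumed not to be installed; it is not the real libNNPDF binding. The sketch only shows that the rest of the module stays importable and usable without the optional dependency.

```python
import random


def python_pseudodata(seed):
    """Pure-Python path: needs no optional dependency."""
    return random.Random(seed).random()


def legacy_closure_pseudodata(seed):
    """Only this function touches the legacy generator."""
    # Deferred import: evaluated when the function is called, not at module import
    import hypothetical_legacy_rng  # placeholder name, assumed not installed
    return hypothetical_legacy_rng.generate(seed)


# The pure-Python path works regardless of the legacy dependency
value = python_pseudodata(0)

# The legacy path fails only when it is actually called
try:
    legacy_closure_pseudodata(0)
    legacy_available = True
except ImportError:
    legacy_available = False
```

The trade-off is that a missing dependency is reported at call time rather than at import time, which seems acceptable when a single, rarely used function is affected.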