Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

@siranipour
Copy link
Contributor

@siranipour siranipour commented Jul 14, 2021

Now that python is used to construct the pseudodata, I've updated the pseudodata.py function that regenerates the pseudodata of a given fit. It's now much easier to do this in parallel because we don't use swig objects anymore.

Closes #1323

The tests will, however, fail for now until I regenerate a PDF that saves its python pseudodata for the regression test. @wilsonmr, I may be misremembering but did you at some point have a PR which saved the pseudodata of a fit?

EDIT: This has turned into a bit of a general improvements PR.

  1. Corrected the recreation of a fit pseudodata. I've swapped the C++ make_replica with the python one.
  2. Improved the reading of pseudodata from a fit which has fitting::savepseudodata: True
  3. There was a function that created the tr_mask for all replicas:
    def training_mask_table(replicas_exps_tr_masks, replicas, experiments_index):

    I've now changed this so it works on an individual replica and then collected it over replicas (I did this because it was useful for the recreation of a training validation mask.
  4. Fixed the tests

@siranipour siranipour marked this pull request as draft July 14, 2021 12:30
@siranipour siranipour force-pushed the replicate_pseudodata branch from d2e5304 to 9d9ae85 Compare July 14, 2021 12:37
@wilsonmr
Copy link
Contributor

now that the replica generation is fast, can we not just call the action on a set of replicas? Is there really that much speed up from MP now? In fact since the action is in validphys, can we not just call --parallel in the usual way to get the MP?

@wilsonmr
Copy link
Contributor

@wilsonmr, I may be misremembering but did you at some point have a PR which saved the pseudodata of a fit?

This got added in #1081 I think with the savepseudodata flag:

if file_content["fitting"].get("savepseudodata"):

@wilsonmr
Copy link
Contributor

in particular, calling the API inside of a validphys action seems really horrible, I don't think this should be an internal action, or even exist at all

@siranipour
Copy link
Contributor Author

Oh true actually, let me see if the --parallel flag works

@siranipour siranipour force-pushed the replicate_pseudodata branch from 70ae840 to ccde1b2 Compare July 14, 2021 14:31
@wilsonmr
Copy link
Contributor

just to say, the function which I mentioned, pseudodata_table, was actually not updated to use make_replica in #1268 which is why the function is so painfully slow, it also means that function is currently useless because it's not even dumping the right pseudodata during the fit haha

@wilsonmr
Copy link
Contributor

how long does this take now with the collect approach @siranipour ?

@siranipour
Copy link
Contributor Author

just to say, the function which I mentioned, pseudodata_table, was actually not updated to use make_replica in #1268 which is why the function is so painfully slow, it also means that function is currently useless because it's not even dumping the right pseudodata during the fit haha

Ah ok haha, I'll try and get another PR for that then because the tests will fail until we regenerate a new regression fit.

how long does this take now with the collect approach @siranipour ?

Unfortunately I was a bit pushed for time, so I didn't get to fully benchmark it, but it still took a non negligible O(few minutes). I'll take a proper look and also play with the parallel flag and see what happens.

@siranipour siranipour force-pushed the replicate_pseudodata branch from ba389c9 to 62c4521 Compare July 21, 2021 09:47
@siranipour siranipour changed the title Recreating python psueododata Recreating a fit's python psueododata Jul 21, 2021
@siranipour siranipour requested review from Zaharid and scarlehoff July 21, 2021 16:07
@siranipour siranipour changed the title Recreating a fit's python psueododata Improving reading and recreating fit pseudodata Jul 21, 2021
@siranipour siranipour marked this pull request as ready for review July 21, 2021 16:12
@siranipour siranipour added the run-fit-bot Starts fit bot from a PR. label Jul 21, 2021
@github-actions
Copy link

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@siranipour
Copy link
Contributor Author

Any ideas why the regression tests fail? They pass locally for me...

@siranipour siranipour force-pushed the replicate_pseudodata branch 4 times, most recently from 0df61c3 to 2c49d34 Compare July 27, 2021 09:14
@siranipour
Copy link
Contributor Author

Ahh I generated the new baseline with theory 200 which is why the tests are failing. Will fix

@siranipour siranipour force-pushed the replicate_pseudodata branch from 2c49d34 to 30280ea Compare July 27, 2021 13:08
@siranipour siranipour force-pushed the replicate_pseudodata branch from 30280ea to 747f7a5 Compare August 31, 2021 13:47
@scarlehoff scarlehoff removed the run-fit-bot Starts fit bot from a PR. label Sep 1, 2021
@siranipour
Copy link
Contributor Author

@scarlehoff this PR flies in tandem with #1333, in that it reproduces the pseudodata a posteriori which is in principle the ones that are saved by #1333. I would appreciate if you could take a look at this soon too.

The basic idea is that I've added actions that work on a per replica basis. I.e replicate the pseudodata of replica_i. We then collect over either NSList(range(nrep), nskey='replica') to replicate all replicas, or produce_pdfreplicas, which is basically the same thing but accounting for the postfit reshuffling

siranipour and others added 2 commits September 14, 2021 17:19
@siranipour siranipour merged commit b7e1e14 into master Sep 15, 2021
@siranipour siranipour deleted the replicate_pseudodata branch September 15, 2021 15:18
@Zaharid Zaharid added the enhancement New feature or request label Oct 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

should pseudata.py should use fitting_data_dict?

5 participants