Tracking PCA/NMF coefficients in archetype mode #342
base: main
Conversation
This reverts commit d2b0b24.
Thanks for this update, and your attention to preserving backwards compatibility by default. If I understand correctly, in archetype mode the COEFF column now contains the same info as ARCHCOEFF plus LEGCOEFF. Is that correct? What would break if we made COEFF only the archetype coefficients (i.e. like ARCHCOEFF), dropped ARCHCOEFF, and kept LEGCOEFF? That would keep COEFF as the required column containing the coefficients for whatever mode was run, LEGCOEFF would optionally exist if Legendre polynomials were also used, and no data would be replicated in different columns of the same table. That also has the additional benefit of possibly including the Legendre coefficients in PCA or NMF mode too (future work, different PR). I think that would break Prospect plotting in archetype mode, but I'm also not sure Prospect still works with archetype mode and might need updates anyway. Anything else?
Yes, you are right: in archetype mode the COEFF column now contains the same info as ARCHCOEFF plus LEGCOEFF. Regarding dropping ARCHCOEFF and keeping only LEGCOEFF, I think that should work in principle. I just had a look at the codebase, and it will need multiple updates in the redrock main and test scripts to make sure the code understands what COEFF means in both cases.
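To make the two column layouts under discussion concrete, here is a minimal sketch. Plain numpy arrays stand in for the actual redrock output table; the column names come from the thread, but the values and shapes are invented for illustration:

```python
import numpy as np

# Hypothetical per-target coefficients, for illustration only.
arch_coeff = np.array([0.8])               # archetype amplitude
leg_coeff = np.array([0.1, -0.05, 0.02])   # Legendre polynomial coefficients

# Layout in this PR: COEFF concatenates both sets, while
# ARCHCOEFF and LEGCOEFF duplicate them in separate columns.
current = {
    "COEFF": np.concatenate([arch_coeff, leg_coeff]),
    "ARCHCOEFF": arch_coeff,
    "LEGCOEFF": leg_coeff,
}

# Proposed layout: COEFF holds only the archetype coefficients,
# LEGCOEFF optionally exists, and nothing is duplicated.
proposed = {
    "COEFF": arch_coeff,
    "LEGCOEFF": leg_coeff,
}

# Same information is recoverable from the de-duplicated layout.
assert np.allclose(
    current["COEFF"],
    np.concatenate([proposed["COEFF"], proposed["LEGCOEFF"]]))
```

The sketch only shows why the duplication argument holds; the real table also carries spectype/subtype and other columns.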
I'm still testing, but here is an initial review of code details. Overall looks good but I think there are some bugs with header propagation. Other requests are for clarity / maintainability.
py/redrock/external/desi.py
    self.secondary_headers["FIBERMAP"] = hdus["FIBERMAP"].header.copy()
    exp_fmap = encode_table(Table(hdus["EXP_FIBERMAP"].data,
                            copy=True).as_array())
    self.secondary_headers["FIBERMAP"] = hdus["EXP_FIBERMAP"].header.copy()
Please double check these. It looks like here and L327 you are copying the wrong header and/or overwriting a previous header due to a key-name mismatch.
Done, thanks!
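For the record, the keying fix boils down to copying each extension's header under its own EXTNAME, so one copy no longer overwrites the other. A minimal sketch, with plain dicts standing in for the open FITS file and for astropy.io.fits.Header objects (names are illustrative, not the actual merged code):

```python
# Hypothetical stand-in for the open FITS file in desi.py.
hdus = {
    "FIBERMAP": {"header": {"EXTNAME": "FIBERMAP", "BUNIT": "counts"}},
    "EXP_FIBERMAP": {"header": {"EXTNAME": "EXP_FIBERMAP"}},
}

secondary_headers = {}
for extname in ("FIBERMAP", "EXP_FIBERMAP"):
    # Copy each header under the extension's own name; in the snippet
    # under review, both copies went into the "FIBERMAP" key, so the
    # EXP_FIBERMAP header silently replaced the FIBERMAP one.
    secondary_headers[extname] = dict(hdus[extname]["header"])
```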
py/redrock/external/desi.py
    Options:
        archetypes : list or dict of Archetype objects
        spec_header (dict-like): header of HDU 0 of input spectra
        secondary_headers (dict-like): header of other HDUs than primary
Please clarify in docstring the structure of this. It looks like spec_header is a dict-like structure with the HDU 0 header itself, while secondary_headers is a dict keyed by EXTNAME, with values that must be astropy.io.fits.Header objects, not just any dict-like objects.
Done!
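A sketch of what the clarified docstring might look like. The function name and signature here are illustrative stand-ins, not the actual merged code:

```python
def write_zbest(spec_header, secondary_headers):
    """Hypothetical signature, to illustrate the docstring structure.

    Options:
        spec_header (dict-like): the HDU 0 header of the input
            spectra itself.
        secondary_headers (dict): keyed by EXTNAME; each value must
            be an astropy.io.fits.Header for the corresponding
            non-primary HDU, not just any dict-like object.
    """
    return spec_header, secondary_headers
```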
py/redrock/archetypes.py
    Options:
        R : array[nwave,nwave] resolution matrix to convolve with model
        legcoeff : array of additional legendre coefficients
        deg_legendre : legendre polynomial degree
Does the degree of the legendre polynomial have to be specified separately from the length of the legcoeffs because it is ambiguous about whether the coefficients are multi-camera or not? Could this be handled with the dimensionality of legcoeff itself like you do with the output, i.e.
- If legcoeff is 1D[n], it means they are n polynomials fitted over the entire spectrum (like the templates);
- If legcoeff is 2D[m,n], it means they are for m cameras and n polynomials.
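The shape-based convention proposed above could be sketched as a small helper, inferring the camera count and polynomial count from legcoeff's dimensionality alone (illustrative only, not redrock API):

```python
import numpy as np

def legendre_shape(legcoeff):
    """Infer (ncamera, npoly) from the dimensionality of legcoeff.

    1D[n]   -> n polynomials fitted over the entire spectrum: (1, n)
    2D[m,n] -> n polynomials for each of m cameras: (m, n)
    """
    legcoeff = np.asarray(legcoeff)
    if legcoeff.ndim == 1:
        return 1, legcoeff.shape[0]
    if legcoeff.ndim == 2:
        return legcoeff.shape[0], legcoeff.shape[1]
    raise ValueError("legcoeff must be 1D or 2D")
```

With this convention, a separate deg_legendre argument would be redundant, mirroring how the output already encodes the same distinction.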
Done!
py/redrock/archetypes.py
    zzchi2 = np.zeros(self._narch, dtype=np.float64)
    zzcoeff = np.zeros((self._narch, 1+ncam*(nleg)), dtype=np.float64)
    #zzchi2 = np.zeros(self._narch, dtype=np.float64)
    #zzcoeff = np.zeros((self._narch, 1+ncam*(nleg)), dtype=np.float64)
Let's just remove these instead of commenting them out. I agree they are no longer needed due to getting set by various calls to per_camera_coeff_with_least_square_batch or calc_zchi2_batch later.
Done!
py/redrock/fitz.py
    prior=None
    chi2min, coeff, fulltype = archetype.get_best_archetype(target,weights,flux,wflux,dwave,zbest, per_camera, n_nearest, trans=trans, use_gpu=use_gpu, prior=prior)
    del trans
    subtype = fulltype.split(':::')[1]
Let's use redrock.templates.parse_fulltype(fulltype)[1] instead.
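For context, a stand-in sketch of what such a helper does, assuming the ':::' separator visible in the diff above; the real implementation lives in redrock.templates and should be used instead of this sketch or of an ad-hoc split:

```python
def parse_fulltype(fulltype):
    """Illustrative stand-in mirroring the ':::' convention from the
    diff above; prefer redrock.templates.parse_fulltype in real code."""
    spectype, _, subtype = fulltype.partition(':::')
    return spectype, subtype

# Equivalent to fulltype.split(':::')[1], but named and centralized.
subtype = parse_fulltype('GALAXY:::BGS')[1]
```

Centralizing the parsing keeps the ':::' convention in one place instead of scattering string splits across fitz.py and the archetype code.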
Done!
This is a PR related to issue #274. In the current version of redrock, if it is run in archetype mode, the PCA results are lost completely in the final best-redshift file. This could be an issue if someone wants to track the PCA results. In this PR, I add the functionality to also save the PCA coefficients, spectype, and subtype in archetype mode.
New features:
- If legcoeff is 1D[n], it means they are n polynomials fitted over the entire spectrum (like the templates);
- If legcoeff is 2D[m,n], it means they are for m cameras and n polynomials.
Backward compatibility
Example Runs
On main branch:
Without archetypes
With archetypes
On this branch:
Without archetypes
With archetypes
Sanity Checks
Also Resolves #331
This PR also fixes #331: non-primary HDUs (e.g., FIBERMAP) were losing header keywords when copied from coadds. I've updated desi.py so that units and headers for non-primary HDUs are now preserved correctly in the redrock output file.