@abhi0395 (Member) commented Sep 24, 2025

This PR addresses issue #274. Currently, when redrock runs in archetype mode, the PCA results are dropped entirely from the final best-redshift file, which is a problem for anyone who wants to track them. This PR additionally saves the PCA coefficients, spectype, and subtype in archetype mode.

New features:

  1. In archetype mode, the final redshift table now has several new columns:
  • COEFF: Leading archetype coefficients (always positive); the array size depends on the maximum number of coefficients.
  • NCOEFF: Number of archetype coefficients.
  • LEGCOEFF: Legendre polynomial coefficients.
    - If this is 1D[n], there are n polynomials fitted over the entire spectrum (as for the templates);
    - If this is 2D[m,n], there are n polynomials for each of m cameras.
  • PCA_COEFF: Best-fit PCA coefficients for the best redshift.
  • PCA_SPECTYPE: Best-fit PCA spectype for the best redshift.
  • PCA_SUBTYPE: Best-fit PCA subtype for the best redshift.
  2. In non-archetype mode:
  • COEFF: unchanged; still holds the PCA coefficients.
  3. Tests have been updated accordingly.
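
As a rough illustration of how the padded COEFF column and NCOEFF could be used together, the sketch below trims each row to its first NCOEFF entries. It assumes that unused slots sit at the end of each row; that layout is an assumption made for illustration and is not confirmed in this PR. `valid_coeffs` is a hypothetical helper, not part of redrock.

```python
import numpy as np

def valid_coeffs(coeff, ncoeff):
    """Return, for each row, the first ncoeff[i] entries of COEFF.

    Assumption (not confirmed by the PR): rows are padded at the end,
    so the meaningful coefficients occupy the leading slots.
    """
    coeff = np.atleast_2d(np.asarray(coeff, dtype=float))
    ncoeff = np.atleast_1d(np.asarray(ncoeff, dtype=int))
    if coeff.shape[0] != ncoeff.shape[0]:
        raise ValueError("COEFF and NCOEFF must have the same number of rows")
    return [row[:n] for row, n in zip(coeff, ncoeff)]

# Toy stand-in for two rows of a redshift table (values are made up)
coeff = np.array([[0.7, 0.2, 0.0],
                  [1.3, 0.0, 0.0]])
ncoeff = np.array([2, 1])
trimmed = valid_coeffs(coeff, ncoeff)
```

With a real output file, the two arrays would come from the COEFF and NCOEFF columns of the redshift table.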

Backward compatibility

  • Only adds columns in archetype mode; existing columns and meanings are unchanged.
  • I/O: Minor additional columns written; no change in core fitting loops.

Example Runs

On main branch:

Without archetype

export petal=8
export RR_TEMPLATE_DIR=/global/cfs/cdirs/desicollab/users/rongpu/data/redrock/laelbg_templates
export outdir="/pscratch/sd/a/abhijeet/test_redrock/redrock_pr342/main"

srun -n 4 -c 4 --gpu-bind=map_gpu:3,2,1,0 rrdesi_mpi \
  --gpu --max-gpuprocs 4 \
  -i /global/cfs/cdirs/desicollab/users/rongpu/data/desi2/tertiary47/ignore_z_camera/coadds/coadd-${petal}-83577-thru20250430.fits \
  -o ${outdir}/redrock-${petal}-83577-thru20250430.fits \
  --model ${outdir}/rrmodel-${petal}-83577-thru20250430.fits \
  --nminima 100 \
  --zscan-galaxy=-0.005,3.4,1e-4 \
  -n 10 \
  -d ${outdir}/redrock-${petal}-83577-thru20250430.h5

With archetypes

export petal=8
export RR_TEMPLATE_DIR=/global/cfs/cdirs/desicollab/users/rongpu/data/redrock/laelbg_templates
export outdir="/pscratch/sd/a/abhijeet/test_redrock/redrock_pr342/main"

srun -n 4 -c 4 --gpu-bind=map_gpu:3,2,1,0 rrdesi_mpi \
  --gpu --max-gpuprocs 4 \
  -i /global/cfs/cdirs/desicollab/users/rongpu/data/desi2/tertiary47/ignore_z_camera/coadds/coadd-${petal}-83577-thru20250430.fits \
  -o ${outdir}/archetype-redrock-${petal}-83577-thru20250430.fits \
  --model ${outdir}/archetype-rrmodel-${petal}-83577-thru20250430.fits \
  --nminima 100 \
  --zscan-galaxy=-0.005,3.4,1e-4 \
  -n 10 \
  -d ${outdir}/archetype-redrock-${petal}-83577-thru20250430.h5

On this branch:

Without archetypes

export petal=8
export RR_TEMPLATE_DIR=/global/cfs/cdirs/desicollab/users/rongpu/data/redrock/laelbg_templates
export outdir="/pscratch/sd/a/abhijeet/test_redrock/redrock_pr342/coeff_split"

srun -n 4 -c 4 --gpu-bind=map_gpu:3,2,1,0 rrdesi_mpi \
  --gpu --max-gpuprocs 4 \
  -i /global/cfs/cdirs/desicollab/users/rongpu/data/desi2/tertiary47/ignore_z_camera/coadds/coadd-${petal}-83577-thru20250430.fits \
  -o ${outdir}/redrock-${petal}-83577-thru20250430.fits \
  --model ${outdir}/rrmodel-${petal}-83577-thru20250430.fits \
  --nminima 100 \
  --zscan-galaxy=-0.005,3.4,1e-4 \
  -n 10 \
  -d ${outdir}/redrock-${petal}-83577-thru20250430.h5

With archetypes

export petal=8
export RR_TEMPLATE_DIR=/global/cfs/cdirs/desicollab/users/rongpu/data/redrock/laelbg_templates
export outdir="/pscratch/sd/a/abhijeet/test_redrock/redrock_pr342/coeff_split"

srun -n 4 -c 4 --gpu-bind=map_gpu:3,2,1,0 rrdesi_mpi \
  --gpu --max-gpuprocs 4 \
  -i /global/cfs/cdirs/desicollab/users/rongpu/data/desi2/tertiary47/ignore_z_camera/coadds/coadd-${petal}-83577-thru20250430.fits \
  -o ${outdir}/archetype-redrock-${petal}-83577-thru20250430.fits \
  --model ${outdir}/archetype-rrmodel-${petal}-83577-thru20250430.fits \
  --nminima 100 \
  --zscan-galaxy=-0.005,3.4,1e-4 \
  -n 10 \
  -d ${outdir}/archetype-redrock-${petal}-83577-thru20250430.h5

Sanity Checks

import numpy as np
from astropy.table import Table

outdir = '/pscratch/sd/a/abhijeet/test_redrock/redrock_pr342'
main = {}
coeff_split = {}
main['redrock'] = Table.read(f'{outdir}/main/redrock-8-83577-thru20250430.fits', hdu=1)
main['archetype'] = Table.read(f'{outdir}/main/archetype-redrock-8-83577-thru20250430.fits', hdu=1)

coeff_split['archetype'] = Table.read(f'{outdir}/coeff_split/archetype-redrock-8-83577-thru20250430.fits', hdu=1)
coeff_split['redrock'] = Table.read(f'{outdir}/coeff_split/redrock-8-83577-thru20250430.fits', hdu=1)

# PCA-only runs: every column on the branch is compared against main
for k in coeff_split['redrock'].colnames:
    t1 = coeff_split['redrock'][k]
    t2 = main['redrock'][k]
    print(f'PCA-ONLY RESULTS: main and branch: {k}: Arrays equal: {np.array_equal(t1,t2)}')

# Archetype runs: skip the columns that this PR adds or changes
changed = ['COEFF', 'NCOEFF', 'LEGCOEFF', 'PCA_COEFF', 'PCA_SPECTYPE', 'PCA_SUBTYPE']
for k in main['archetype'].colnames:
    if k in changed:
        continue
    t1 = coeff_split['archetype'][k]
    t2 = main['archetype'][k]
    print(f'ARCHETYPE RESULTS: main and branch: {k}: Arrays equal: {np.array_equal(t1,t2)}')
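
One caveat with checks of this kind is that `np.array_equal` treats NaN as unequal to NaN by default, so a column containing NaNs reports a mismatch even when both tables agree. A minimal sketch of a NaN-tolerant comparison follows; `compare_columns` is a hypothetical helper, plain dicts of arrays stand in for the tables, and the `equal_nan` argument requires NumPy 1.19 or later.

```python
import numpy as np

def compare_columns(ref, new, skip=()):
    """Compare the columns shared by two mappings of name -> array.

    Returns {name: bool}. NaNs in floating-point columns count as equal.
    """
    result = {}
    for name in ref:
        if name in skip or name not in new:
            continue
        a, b = np.asarray(ref[name]), np.asarray(new[name])
        if a.dtype.kind in "fc" and b.dtype.kind in "fc":
            # Float or complex data: let NaN compare equal to NaN
            result[name] = bool(np.array_equal(a, b, equal_nan=True))
        else:
            # Strings, integers, booleans: plain element-wise equality
            result[name] = bool(np.array_equal(a, b))
    return result

# Toy data (made up) standing in for the main and branch tables
main_tab = {"Z": np.array([0.5, np.nan]),
            "SPECTYPE": np.array(["GALAXY", "QSO"]),
            "COEFF": np.array([1.0, 2.0])}
branch_tab = {"Z": np.array([0.5, np.nan]),
              "SPECTYPE": np.array(["GALAXY", "QSO"]),
              "COEFF": np.array([1.0, 3.0]),
              "PCA_SPECTYPE": np.array(["GALAXY", "QSO"])}
report = compare_columns(main_tab, branch_tab, skip=("COEFF",))
```

For astropy tables, something like `{k: tab[k] for k in tab.colnames}` would produce the mapping this helper expects.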

Also Resolves #331

This PR also fixes #331: non-primary HDUs (e.g., FIBERMAP) were losing header keywords when copied from coadds. I’ve updated desi.py so units and headers for non-primary HDUs are now preserved correctly in the redrock output file.

@abhi0395 requested a review from @sbailey on September 24, 2025, 01:07
@coveralls commented Sep 24, 2025

Coverage Status: 42.447% (-0.3%) from 42.716% when pulling fcadb3d on coeff_split into 0715ffb on main.

@sbailey (Collaborator) commented Sep 26, 2025

Thanks for this update, and your attention to preserving backwards compatibility by default. If I understand correctly, in archetype mode now the COEFF column contains the same info as ARCHCOEFF plus LEGCOEFF. Is that correct? What would break if we made COEFF only the archetype coefficients (i.e. like ARCHCOEFF), dropped ARCHCOEFF, and kept LEGCOEFF?

That would keep COEFF as the required column containing the coefficients for whatever mode was run; LEGCOEFF would exist only if Legendre polynomials were also used, and no data would be duplicated across columns of the same table. It would also have the benefit of possibly including the Legendre coefficients in PCA or NMF mode too (future work, different PR).

I think that would break Prospect plotting in archetype mode, but I'm also not sure Prospect still works with archetype mode and might need updates anyway. Anything else?

@abhi0395 (Member, Author)

Yes, that's right: in archetype mode the COEFF column now contains the same information as ARCHCOEFF plus LEGCOEFF.

Dropping ARCHCOEFF and keeping only LEGCOEFF should work in principle. From a quick look at the codebase, it would require several updates to the main redrock code and the test scripts so that COEFF is interpreted correctly in both modes.
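
To make the proposed split concrete, here is a minimal sketch that separates a combined coefficient vector into an archetype part and a Legendre part. The ordering (archetype coefficients first, then Legendre terms camera by camera) is an assumption for illustration only; it is loosely suggested by the `1+ncam*(nleg)` array width quoted later in the review but is not confirmed here. `split_coeff` is a hypothetical helper.

```python
import numpy as np

def split_coeff(coeff, n_arch, ncam=1):
    """Split a combined vector into (archetype coeffs, Legendre coeffs).

    Assumed layout (hypothetical): [arch_0..arch_{n_arch-1}, leg terms...],
    with Legendre terms stored camera by camera when ncam > 1.
    """
    coeff = np.asarray(coeff, dtype=float)
    arch = coeff[:n_arch]
    leg = coeff[n_arch:]
    if ncam > 1:
        if leg.size % ncam != 0:
            raise ValueError("Legendre block is not divisible by ncam")
        leg = leg.reshape(ncam, -1)   # shape (ncam, nleg)
    return arch, leg

# Toy vector: 1 archetype coefficient + 3 cameras x 2 Legendre terms
combined = [2.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
arch, leg = split_coeff(combined, n_arch=1, ncam=3)
```

Under this assumed layout, `arch` would correspond to what COEFF holds after the proposed change and `leg` to LEGCOEFF in its 2D[m,n] form.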

@sbailey (Collaborator) left a comment

I'm still testing, but here is an initial review of code details. Overall looks good but I think there are some bugs with header propagation. Other requests are for clarity / maintainability.

self.secondary_headers["FIBERMAP"] = hdus["FIBERMAP"].header.copy()
exp_fmap = encode_table(Table(hdus["EXP_FIBERMAP"].data,
copy=True).as_array())
self.secondary_headers["FIBERMAP"] = hdus["EXP_FIBERMAP"].header.copy()
Collaborator:

Please double check these. It looks like here and L327 you are copying the wrong header and/or overwriting a previous header due to a key-name mismatch.

Member, Author:

Done, thanks!
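
The key-name mismatch flagged in this thread can be illustrated with a small sketch. Plain dicts stand in for `astropy.io.fits.Header` objects, the header contents are made up, and `collect_secondary_headers` is a hypothetical helper rather than the code in desi.py.

```python
def collect_secondary_headers(hdus):
    """Copy each header under its own EXTNAME so none overwrites another.

    hdus: mapping of EXTNAME -> header-like mapping (dicts in this sketch).
    """
    return {extname: dict(header) for extname, header in hdus.items()}

# Toy headers (contents are invented for the example)
hdus = {
    "FIBERMAP": {"EXTNAME": "FIBERMAP", "KEY_A": 1},
    "EXP_FIBERMAP": {"EXTNAME": "EXP_FIBERMAP", "KEY_B": 2},
}

# Problem pattern: the second assignment reuses the "FIBERMAP" key,
# so the FIBERMAP header is silently replaced by the EXP_FIBERMAP one.
buggy = {}
buggy["FIBERMAP"] = dict(hdus["FIBERMAP"])
buggy["FIBERMAP"] = dict(hdus["EXP_FIBERMAP"])

# Keying by the actual EXTNAME keeps both headers.
fixed = collect_secondary_headers(hdus)
```

Deriving the dict key from the same name used to look up the HDU removes the chance of the two drifting apart.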

Options:
archetypes : list or dict of Archetype objects
spec_header (dict-like): header of HDU 0 of input spectra
secondary_headers (dict-like): header of other HDUs than primary
Collaborator:

Please clarify in docstring the structure of this. It looks like spec_header is a dict-like structure with the HDU 0 header itself, while secondary_headers is a dict keyed by EXTNAME, with values that must be astropy.io.fits.Header objects, not just any dict-like objects.

Member, Author:

Done!

Options:
R : array[nwave,nwave] resolution matrix to convolve with model
legcoeff : array of additional legendre coefficients
deg_legendre : legendre polynomial degree
Collaborator:

Does the degree of the legendre polynomial have to be specified separately from the length of the legcoeffs because it is ambiguous about whether the coefficients are multi-camera or not? Could this be handled with the dimensionality of legcoeff itself like you do with the output, i.e.

  • If legcoeff is 1D[n], it means they are n polynomials fitted over the entire spectra (like the templates);
  • If legcoeff is 2D[m,n], it means they are for m cameras and n polynomials.

Member, Author:

Done!
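
A minimal sketch of the dimensionality-based convention suggested above: the layout is read from `legcoeff.ndim` instead of a separate degree argument. The function names are hypothetical, the wavelength grids are toy values, and the linear mapping of each wavelength range onto [-1, 1] is an assumption; redrock's actual normalization may differ.

```python
import numpy as np
from numpy.polynomial import legendre

def eval_legendre(wave, coeffs):
    """Evaluate a Legendre series on wavelengths rescaled to [-1, 1] (assumed)."""
    wave = np.asarray(wave, dtype=float)
    x = 2.0 * (wave - wave.min()) / (wave.max() - wave.min()) - 1.0
    return legendre.legval(x, coeffs)

def legendre_model(waves, legcoeff):
    """waves: list of per-camera wavelength arrays.

    1D[n] legcoeff: one series over the concatenated spectrum.
    2D[m,n] legcoeff: one series per camera, m == len(waves).
    """
    legcoeff = np.asarray(legcoeff, dtype=float)
    if legcoeff.ndim == 1:
        return eval_legendre(np.concatenate(waves), legcoeff)
    if legcoeff.ndim == 2:
        if legcoeff.shape[0] != len(waves):
            raise ValueError("need one row of legcoeff per camera")
        return np.concatenate(
            [eval_legendre(w, c) for w, c in zip(waves, legcoeff)])
    raise ValueError("legcoeff must be 1D[n] or 2D[m,n]")

# Toy wavelength grids for two cameras
waves = [np.linspace(3600.0, 5800.0, 5), np.linspace(5760.0, 7620.0, 4)]
model_1d = legendre_model(waves, [1.0, 0.5])                 # whole spectrum
model_2d = legendre_model(waves, [[1.0, 0.0], [2.0, 0.0]])   # per camera
```

In both cases the number of polynomial terms is simply the length of the last axis, which is why no separate degree parameter is needed in this sketch.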

zzchi2 = np.zeros(self._narch, dtype=np.float64)
zzcoeff = np.zeros((self._narch, 1+ncam*(nleg)), dtype=np.float64)
#zzchi2 = np.zeros(self._narch, dtype=np.float64)
#zzcoeff = np.zeros((self._narch, 1+ncam*(nleg)), dtype=np.float64)
Collaborator:

Let's just remove these instead of commenting them out. I agree they are no longer needed due to getting set by various calls to per_camera_coeff_with_least_square_batch or calc_zchi2_batch later.

Member, Author:

Done!

prior=None
chi2min, coeff, fulltype = archetype.get_best_archetype(target,weights,flux,wflux,dwave,zbest, per_camera, n_nearest, trans=trans, use_gpu=use_gpu, prior=prior)
del trans
subtype = fulltype.split(':::')[1]
Collaborator:

Let's use redrock.templates.parse_fulltype(fulltype)[1] instead.

Member, Author:

Done!
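
One reason to prefer a parsing helper over `fulltype.split(':::')[1]` is that the indexed split raises `IndexError` when no subtype is present. The sketch below shows the idea with a hypothetical stand-in, `split_fulltype`; it is not the implementation of `redrock.templates.parse_fulltype`, whose exact behavior is not reproduced here, and the example type strings are illustrative.

```python
def split_fulltype(fulltype, sep=":::"):
    """Split 'SPECTYPE:::SUBTYPE' into (spectype, subtype).

    Hypothetical stand-in: returns an empty subtype when the separator
    is absent instead of raising IndexError.
    """
    spectype, _, subtype = fulltype.partition(sep)
    return spectype.strip(), subtype.strip()

with_subtype = split_fulltype("GALAXY:::ELG")
without_subtype = split_fulltype("QSO")
```

Centralizing the parsing in one function also means the separator convention only has to change in one place.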

Development

Successfully merging this pull request may close these issues.

Redrock output missing FIBERMAP header keywords