Conversation

hamogu
Member

@hamogu hamogu commented Aug 8, 2025

Description

Part of the CDS specs is that data files can be split to keep each ASCII file below 10 MB. There are few such cases in CDS (possibly only the Tycho 2 catalog), but we want to be complete.
The implementation is in the CDS reader and handles one very specific case: a ReadMe file is given, and the catalog file is not found under the expected name. In that case, and only that case, the code checks whether split data files exist.
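
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the helper name `find_data_files` is hypothetical, and it assumes that split parts are named by appending a two- or three-digit index (optionally followed by `.gz`) to the expected file name, as the regular expression discussed in the review suggests.

```python
import re
from pathlib import Path

# Assumed suffix of a split part: ".NN" or ".NNN", optionally gzipped.
_PART_SUFFIX = re.compile(r"\.(\d{2,3})(\.gz)?")


def find_data_files(table):
    """Return the file(s) that make up a catalog table (hypothetical helper).

    If ``table`` exists, it is returned on its own. Only when it is
    missing does the function look for numbered parts next to it.
    """
    path = Path(table)
    if path.exists():
        return [path]

    numbered = []
    for candidate in path.parent.glob(path.name + ".*"):
        # Only the text after the expected name decides whether this is a part.
        suffix = candidate.name[len(path.name):]
        match = _PART_SUFFIX.fullmatch(suffix)
        if match:
            numbered.append((int(match.group(1)), candidate))

    # Order by the numeric index so the parts can be read in sequence.
    return [part for _, part in sorted(numbered, key=lambda item: item[0])]
```

An empty list means that neither the expected file nor any split part was found, so a caller could raise the usual "file not found" error in that case.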

Fixes #4121

  • By checking this box, the PR author has requested that maintainers do NOT use the "Squash and Merge" button. Maintainers should respect this when possible; however, the final decision is at the discretion of the maintainer that merges the PR.

Contributor

github-actions bot commented Aug 8, 2025

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see instructions for rebase and squash.
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

@pllim pllim added this to the v7.2.0 milestone Aug 8, 2025
@dhomeier
Contributor

dhomeier commented Aug 8, 2025

I think the iers test is somewhat nonstandard in that it reuses an iers.IERS_B instance that has already opened a valid file. Might perhaps be considered abuse, but it’s probably still better to guard against such cases.

@hamogu
Member Author

hamogu commented Aug 8, 2025

The last commit should take care of the issue.

@hamogu
Member Author

hamogu commented Aug 8, 2025

pre-commit.ci autofix

@hamogu
Member Author

hamogu commented Aug 8, 2025

pre-commit.ci autofix

@dhomeier
Contributor

dhomeier commented Aug 8, 2025

CI is passing, so that part looks good, thanks! Don't have time for a full review now, but I'll try to get to it by next week.

Contributor

@mhvk mhvk left a comment

This looks very nice!

Member

@taldcroft taldcroft left a comment

Looks good to me.

hamogu added 3 commits August 10, 2025 15:24
Part of the CDS specs is that data files can be split to keep each ASCII file below 10 MB. There are few (possibly only the Tycho 2 catalog) such cases in CDS, but we want to be complete.
The implementation is in the CDS reader, looking for the very specific case where a ReadMe file is given, and the catalog file is not found with the expected name. In that and only that case, the code checks if split data files exist.

closes astropy#4121
tighten up regular expression

reorder imports
Contributor

@mhvk mhvk left a comment

Good that @taldcroft found a possible problem. Reapproving (my inline comment is really only in case something else needs to be changed too).

# deal with table where the ReadMe is present, but the data is split over several data files
if self.header.readme is not None:
    path = Path(table)
    pattern = re.compile(r"\.(\d{2,3})(\.gz)?$")
Contributor

Not worth re-running CI for, but if there are other changes needed, I would change this to,

if f_list := sorted(Path(table).parent.glob(path.name + "*")):
    pattern = re.compile(r"\.(\d{2,3})(\.gz)?$")
    numbers = ...
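
For reference, a small sketch of which file names the regular expression above accepts. The helper `part_number` and the file names are made up for illustration; they are not part of the PR.

```python
import re

# Same pattern as in the snippet under review.
pattern = re.compile(r"\.(\d{2,3})(\.gz)?$")


def part_number(filename):
    """Return the split index encoded at the end of ``filename``, or None."""
    match = pattern.search(filename)
    return int(match.group(1)) if match else None


for name in ["tyc2.dat.00", "tyc2.dat.19.gz", "tyc2.dat.100",
             "tyc2.dat", "tyc2.dat.1", "tyc2.dat.gz", "tyc2.dat.1234"]:
    print(f"{name}: {part_number(name)}")
```

Only the first three names yield an index: the pattern needs a dot followed by exactly two or three digits at the end of the name, with `.gz` as the only allowed trailing extension.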

Member

@taldcroft taldcroft left a comment

LGTM, thanks!

@hamogu
Member Author

hamogu commented Aug 17, 2025

@dhomeier Do you want to re-review this or can it be merged? (The only test failure is an allowed failure.)

@hamogu
Member Author

hamogu commented Aug 21, 2025

I should add: the only test failure is an allowed failure that has nothing to do with this PR (a timeout of some HTTPS connection).

Development

Successfully merging this pull request may close these issues.

io.ascii (.cds) should accept more than one file
5 participants