@alcholiclg alcholiclg commented Aug 28, 2025

This PR adds features to HubApi:

  1. Add repo_info support for retrieving repository-level metadata (datasets, models) and latest commit info.
  2. Add list_repo_commits for fetching full repository commit history.
  3. Make ModelInfo and DatasetInfo accept parameters they do not explicitly define.

Usage example:

from modelscope.hub.api import HubApi

api = HubApi()
# repo_id = 'Qwen/Qwen3-Coder-480B-A35B-Instruct'
# repo_type = 'model'
repo_id = 'swift/Chinese-Qwen3-235B-2507-Distill-data-110k-SFT'
repo_type = 'dataset'

repo_info = api.repo_info(repo_id=repo_id, repo_type=repo_type)

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @alcholiclg, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the HubApi by introducing new functionalities to retrieve comprehensive repository information. It adds a unified repo_info method to fetch detailed metadata for both models and datasets, including their latest commit details. Furthermore, it provides a list_repo_commits method to access the full commit history of a repository. The changes also improve the robustness of ModelInfo and DatasetInfo classes by making them compatible with potentially undefined parameters and integrating the new commit information.

Highlights

  • New repo_info API: A unified method HubApi.repo_info is introduced to fetch detailed metadata for both models and datasets, including their latest commit information.
  • Commit History Retrieval: A new HubApi.list_repo_commits method allows fetching the complete commit history for any given model or dataset repository.
  • Enhanced Model and Dataset Information: The ModelInfo and DatasetInfo classes are updated to include fields for the latest commit SHA (sha), the full latest commit object (last_commit), and the last_modified timestamp, providing richer repository metadata.
  • Improved Data Class Robustness: ModelInfo and DatasetInfo constructors are made more resilient to handle missing or undefined parameters, ensuring backward compatibility.
  • Timestamp Conversion Utility: A new convert_timestamp utility function is added to modelscope.hub.utils.utils to reliably convert various timestamp formats (UNIX, ISO strings) into timezone-aware datetime objects.
  • New Data Structures: OrganizationInfo, DetailedCommitInfo, and CommitHistoryResponse dataclasses are introduced to better structure and manage the retrieved repository and commit data.
  • Comprehensive Test Coverage: A new test file tests/hub/test_hub_repo_info.py is added, providing extensive unit tests for the new API methods and the enhanced data classes, including tests with real repository data.
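
The timestamp conversion described in the highlights can be sketched as follows. This is a minimal illustration, not the PR's actual convert_timestamp: the function name convert_timestamp_sketch, the millisecond heuristic, and the choice to treat naive ISO strings as UTC are all assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import Optional, Union


def convert_timestamp_sketch(value: Union[int, float, str, None]) -> Optional[datetime]:
    """Hypothetical stand-in for convert_timestamp; not the PR's implementation."""
    if value is None or value == '':
        return None
    if isinstance(value, (int, float)):
        seconds = float(value)
        # Assumption: values this large are milliseconds rather than seconds.
        if seconds > 1e11:
            seconds /= 1000.0
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    if isinstance(value, str):
        text = value.strip()
        if text.isdigit():
            # Numeric strings are handled as UNIX timestamps.
            return convert_timestamp_sketch(int(text))
        # datetime.fromisoformat rejects a trailing 'Z' before Python 3.11.
        if text.endswith('Z'):
            text = text[:-1] + '+00:00'
        parsed = datetime.fromisoformat(text)
        # Assumption: naive ISO strings are treated as UTC.
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed
    raise TypeError(f'Unsupported timestamp type: {type(value).__name__}')
```

Returning timezone-aware values in every branch keeps later comparisons between commit dates and last_modified from mixing naive and aware datetimes.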
Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces significant enhancements to the HubApi by adding support for retrieving repository information (repo_info) and commit history (list_repo_commits). The implementation includes new data classes for models, datasets, and organizations, which is a good approach. However, I've identified a few critical issues, such as type mismatches in dataclass initializers that could cause runtime errors, and some flawed error handling logic. Additionally, there are opportunities to improve code quality and robustness. The inclusion of comprehensive tests is a great practice.

self.author = kwargs.pop('author', '')

# backward compatibility
self.__dict__.update(kwargs)

medium

Using self.__dict__.update(kwargs) for backward compatibility can be risky. It makes the object's attributes dynamic and dependent on the API response, which can hide issues like typos in keys from the API. For improved robustness, it's better to explicitly handle all expected keys and perhaps log a warning for any unexpected keys remaining in kwargs.
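
One way to follow this suggestion is sketched below. The class ModelInfoSketch, the id field, and the extra attribute are hypothetical names chosen for illustration; only author and sha come from the PR discussion, and the real ModelInfo has many more fields.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)


class ModelInfoSketch:
    """Hypothetical stand-in, not the real ModelInfo."""

    def __init__(self, **kwargs: Any) -> None:
        # Expected keys are popped explicitly, each with a default.
        self.id: Optional[str] = kwargs.pop('id', None)
        self.author: str = kwargs.pop('author', '')
        self.sha: Optional[str] = kwargs.pop('sha', None)
        # Unknown keys stay out of the attribute namespace but are not lost.
        self.extra: Dict[str, Any] = dict(kwargs)
        if kwargs:
            logger.warning('Unexpected ModelInfo fields: %s', sorted(kwargs))
```

Keeping the leftovers in a separate dictionary preserves forward compatibility with new API fields while making a misspelled or renamed key visible in the logs.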

self.last_modified = None

# backward compatibility
self.__dict__.update(kwargs)

medium

Using self.__dict__.update(kwargs) for backward compatibility can be risky. It makes the object's attributes dynamic and dependent on the API response, which can hide issues like typos in keys from the API. For improved robustness, it's better to explicitly handle all expected keys and perhaps log a warning for any unexpected keys remaining in kwargs.

Comment on lines +1495 to +1498
if is_relative_path(repo_id) and repo_id.count('/') == 1:
    _owner, _dataset_name = repo_id.split('/')
else:
    raise ValueError(f'Invalid repo_id: {repo_id} !')

medium

The variables _owner and _dataset_name are assigned but never used. They should be removed to improve code clarity. The validation logic can be simplified.

Suggested change
- if is_relative_path(repo_id) and repo_id.count('/') == 1:
-     _owner, _dataset_name = repo_id.split('/')
- else:
-     raise ValueError(f'Invalid repo_id: {repo_id} !')
+ if not (is_relative_path(repo_id) and repo_id.count('/') == 1):
+     raise ValueError(f'Invalid repo_id: {repo_id} !')
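
As a self-contained illustration of the same guard-clause shape, the sketch below validates an 'owner/name' identifier without binding unused variables. The helper check_repo_id is hypothetical, it does not call is_relative_path, and it is slightly stricter than the PR's check because it also rejects an empty owner or name.

```python
def check_repo_id(repo_id: str) -> None:
    """Raise ValueError unless repo_id looks like 'owner/name'."""
    parts = repo_id.split('/')
    # Exactly one slash, with non-empty text on both sides.
    if len(parts) != 2 or not all(parts):
        raise ValueError(f'Invalid repo_id: {repo_id} !')


def is_valid_repo_id(repo_id: str) -> bool:
    """Convenience wrapper that reports validity instead of raising."""
    try:
        check_repo_id(repo_id)
    except ValueError:
        return False
    return True
```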

Comment on lines +1521 to +1522
except requests.exceptions.RequestException as e:
    raise Exception(f'Failed to get repository commits for {repo_id}: {str(e)}')

medium

Raising a generic Exception can obscure the original error and make debugging more difficult. It's better to raise a more specific exception, like RequestError, and chain the original RequestException using from e to preserve the stack trace.

Suggested change
- except requests.exceptions.RequestException as e:
-     raise Exception(f'Failed to get repository commits for {repo_id}: {str(e)}')
+ except requests.exceptions.RequestException as e:
+     raise RequestError(f'Failed to get repository commits for {repo_id}: {str(e)}') from e
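
The effect of chaining with from e can be seen in a small standalone sketch. Everything here is a stand-in so the example runs without network access or third-party packages: RequestError is defined locally rather than imported, the built-in ConnectionError plays the role of requests.exceptions.RequestException, and fetch_commits and list_commits_sketch are hypothetical functions.

```python
class RequestError(Exception):
    """Local stand-in for the RequestError named in the review."""


def fetch_commits(repo_id: str):
    # Simulates a transport failure in place of a real HTTP call.
    raise ConnectionError('connection reset')


def list_commits_sketch(repo_id: str):
    try:
        return fetch_commits(repo_id)
    except ConnectionError as e:
        # 'from e' stores the original error on __cause__, so the
        # traceback shows both the low-level and the high-level failure.
        raise RequestError(
            f'Failed to get repository commits for {repo_id}: {e}') from e
```

Callers can then catch the specific RequestError and still inspect err.__cause__ to find out what went wrong at the transport level.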

alcholiclg and others added 4 commits August 29, 2025 14:50
…cope into feat/support_repo_info

Merge branch 'feat/support_repo_info' of github.com:alcholiclg/modelscope into feat/support_repo_info
@wangxingjun778 wangxingjun778 merged commit 2fc6d8e into modelscope:master Aug 29, 2025