Categorical Balanced Chunk iterator #5894

Merged
sw005320 merged 14 commits into espnet:master from ftshijt:category_chunk on Sep 23, 2024

ftshijt (Collaborator) commented Sep 4, 2024

This PR implements a categorical balanced chunk iterator, currently designed for codec training across multiple domains.

Differences from existing iterators:

  • chunk_iterator: supports categories, but only one category per batch, with no randomness within the batch
  • category_iterator: does not support chunks

The new category_chunk_iterator extends category_iterator with chunk support. Its current use case is codec training.
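
To make the idea concrete, here is a minimal sketch of category-balanced chunk sampling. It is not the code from this PR and does not use the ESPnet API: the function `category_balanced_chunks`, its parameters, and the in-memory `utterances` / `utt2category` dictionaries are all hypothetical names chosen for illustration. It assumes each batch slot draws a category uniformly at random, then an utterance from that category, then a random fixed-length chunk.

```python
import random
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def category_balanced_chunks(
    utterances: Dict[str, Sequence[float]],
    utt2category: Dict[str, str],
    chunk_length: int,
    batch_size: int,
    num_batches: int,
    seed: int = 0,
) -> Iterator[List[Tuple[str, str, Sequence[float]]]]:
    """Yield batches of fixed-length chunks with categories drawn uniformly.

    Illustrative only: all names here are assumptions, not ESPnet's API.
    """
    rng = random.Random(seed)

    # Group utterance ids by category, skipping utterances shorter than a chunk.
    by_category = defaultdict(list)
    for utt_id, samples in utterances.items():
        if len(samples) >= chunk_length:
            by_category[utt2category[utt_id]].append(utt_id)
    categories = sorted(by_category)
    if not categories:
        raise ValueError("no utterance is long enough for chunk_length")

    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            # Draw the category first, so a small domain is sampled as often
            # as a large one and categories are mixed within a batch.
            category = rng.choice(categories)
            utt_id = rng.choice(by_category[category])
            samples = utterances[utt_id]
            # Pick a random start so the chunk fits inside the utterance.
            start = rng.randint(0, len(samples) - chunk_length)
            chunk = samples[start:start + chunk_length]
            batch.append((utt_id, category, chunk))
        yield batch
```

Drawing the category before the utterance is what separates this sketch from the two behaviours listed above: a batch can mix categories, and every item is a fixed-length chunk rather than a whole utterance.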

mergify bot added the ESPnet2 label Sep 4, 2024
sw005320 added the Enhancement label Sep 9, 2024
sw005320 added this to the v.202405 milestone Sep 9, 2024
sw005320 (Contributor) commented Sep 9, 2024

Please fix the CI error

sw005320 (Contributor) commented Sep 9, 2024

Is it possible to add some tests for this?
This is a very useful extension and it would be a base for many other implementations.

ftshijt (Collaborator, Author) commented Sep 9, 2024

> Is it possible to add some tests for this? This is a very useful extension and it would be a base for many other implementations.

Sure thing! I'm almost done with the performance check and will add the tests soon.

ftshijt added the Codec label Sep 12, 2024

ftshijt (Collaborator, Author) commented Sep 23, 2024

@sw005320 I think this PR is ready to merge. Please let me know if you have other suggestions. I've checked some intermediate results and will push to Hugging Face once the training has converged.

sw005320 (Contributor)

LGTM!
Thanks, @ftshijt!

sw005320 merged commit ab9d386 into espnet:master Sep 23, 2024
Shikhar-S pushed a commit to Shikhar-S/espnet that referenced this pull request Mar 13, 2025
Categorical Balnced Chunk iterator
ftshijt deleted the category_chunk branch May 19, 2025 07:53