
Masao-Someki · 2025-10-17T19:17:15Z

What did you change?

Merged master branch into espnet3 branch

Why did you make this change?

To fix the CI issue.
See #6178 for details

Is your PR small enough?

no, but this is just a merge PR

Additional Context

[espnet3-6] Add evaluation scripts #6178


Masao-Someki · 2025-10-17T19:18:06Z

@sw005320
I just created a PR that only merges master into the espnet3 branch.
I think we can merge this PR once CI has passed.


gemini-code-assist

Code Review

This pull request merges a significant number of changes from the master branch, introducing new features like Language Identification (LID) and SpeechLM tasks, along with forced alignment capabilities and new data samplers. The updates also include robustness improvements, such as better dependency handling and support for Apple's MPS devices. While the majority of the changes appear solid and well-integrated with new tests, I've identified a critical issue in the new force_align.py script that could lead to incorrect results by silently truncating input audio. My review focuses on this critical point to ensure the script's correctness.

gemini-code-assist · 2025-10-17T19:19:59Z

egs2/ipapack_plus/s2t1/force_align.py

+def prepare_speech(speech, model, device):
+    """
+    Prepare speech tensor for model input.
+
+    Args:
+        speech: Audio waveform (numpy array or torch tensor)
+        model: Speech2Text model instance
+        device: Device to place tensor on
+
+    Returns:
+        Tuple of (speech_tensor, speech_lengths)
+    """
+    if isinstance(speech, np.ndarray):
+        speech = torch.tensor(speech)
+
+    if speech.dim() > 1:
+        assert (
+            speech.dim() == 2 and speech.size(1) == 1
+        ), f"Speech of size {speech.size()} is not supported"
+        speech = speech.squeeze(1)
+
+    speech_length = int(
+        model.preprocessor_conf["fs"] * model.preprocessor_conf["speech_length"]
+    )
+    original_length = speech.size(-1)
+
+    if original_length >= speech_length:
+        speech = speech[:speech_length]
+    else:
+        speech = F.pad(speech, (0, speech_length - original_length))
+    speech = speech.unsqueeze(0).to(getattr(torch, model.dtype))
+    speech_lengths = speech.new_full([1], dtype=torch.long, fill_value=speech.shape[1])
+    return speech, speech_lengths


The current implementation of prepare_speech truncates or pads the input audio to a fixed length derived from the model's training configuration. This will cause any audio file longer than the configured speech_length to be silently truncated, leading to incomplete and incorrect forced alignment results. An alignment utility should process the entire audio file to be useful in a general context.

The suggested change removes this fixed-length processing, ensuring that the entire audio waveform is passed to the model for alignment.

def prepare_speech(speech, model, device):
    """
    Prepare speech tensor for model input.

    Args:
        speech: Audio waveform (numpy array or torch tensor)
        model: Speech2Text model instance
        device: Device to place tensor on

    Returns:
        Tuple of (speech_tensor, speech_lengths)
    """
    if isinstance(speech, np.ndarray):
        speech = torch.tensor(speech)

    if speech.dim() > 1:
        assert (
            speech.dim() == 2 and speech.size(1) == 1
        ), f"Speech of size {speech.size()} is not supported"
        speech = speech.squeeze(1)

    speech = speech.unsqueeze(0).to(getattr(torch, model.dtype))
    speech_lengths = speech.new_full([1], dtype=torch.long, fill_value=speech.shape[1])
    return speech, speech_lengths
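To make the truncation concern concrete, here is a minimal, torch-free sketch of the pad-or-truncate step that also reports how many samples were dropped. The helper name `pad_or_truncate`, the 16 kHz sampling rate, the 20-second window, and the 35-second recording are all illustrative assumptions, not values taken from this PR or from the model's configuration.

```python
def pad_or_truncate(samples, target_len):
    """Zero-pad or cut `samples` to `target_len`; also return the number of dropped samples."""
    dropped = max(0, len(samples) - target_len)
    if dropped:
        # Input is longer than the fixed window: everything past target_len is lost.
        out = samples[:target_len]
    else:
        # Input is shorter (or equal): pad the tail with zeros.
        out = samples + [0.0] * (target_len - len(samples))
    return out, dropped


fs = 16000        # assumed sampling rate in Hz (hypothetical)
window_sec = 20   # assumed stand-in for preprocessor_conf["speech_length"] (hypothetical)
target_len = fs * window_sec

audio = [0.0] * (fs * 35)  # hypothetical 35-second recording
fixed, dropped = pad_or_truncate(audio, target_len)
print(len(fixed) / fs, dropped / fs)  # 20.0 15.0
```

Under these assumed numbers, the last 15 seconds of audio would never reach the model, so no alignment could be produced for them. Returning the dropped count is one possible way to surface the loss instead of discarding it silently.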


codecov · 2025-10-17T19:55:31Z

Codecov Report

❌ Patch coverage is 70.74236% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.14%. Comparing base (4531bcc) to head (c5aea57).
⚠️ Report is 139 commits behind head on espnet3.

Files with missing lines	Patch %	Lines
espnet2/bin/lid_inference.py	70.22%	53 Missing ⚠️
espnet2/iterators/category_iter_factory.py	30.00%	7 Missing ⚠️
espnet2/tasks/abs_task.py	57.14%	3 Missing ⚠️
espnet2/bin/asr_align.py	50.00%	2 Missing ⚠️
espnet2/bin/s2t_ctc_align.py	50.00%	2 Missing ⚠️
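As a sanity check on the table, the per-file missing counts can be summed and compared with the headline figures. The sketch below assumes patch coverage is defined as covered changed lines divided by total changed lines. The total of 229 changed lines is inferred from the reported percentage and does not appear anywhere in the report itself.

```python
# Missing-line counts copied from the table above.
missing = {
    "espnet2/bin/lid_inference.py": 53,
    "espnet2/iterators/category_iter_factory.py": 7,
    "espnet2/tasks/abs_task.py": 3,
    "espnet2/bin/asr_align.py": 2,
    "espnet2/bin/s2t_ctc_align.py": 2,
}
total_missing = sum(missing.values())

reported_patch_pct = 70.74236  # headline patch coverage from the report

# Inferred, not reported: total changed lines = missing / (1 - coverage).
changed_lines = round(total_missing / (1 - reported_patch_pct / 100))
covered_lines = changed_lines - total_missing

print(total_missing, changed_lines, covered_lines)
print(round(100 * covered_lines / changed_lines, 5))
```

The sum matches the 67 missing lines in the headline, and the inferred totals reproduce the reported patch percentage to five decimal places.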

Additional details and impacted files

@@             Coverage Diff              @@
##           espnet3    #6263       +/-   ##
============================================
+ Coverage         0   70.14%   +70.14%     
============================================
  Files            0      751      +751     
  Lines            0    69057    +69057     
============================================
+ Hits             0    48441    +48441     
- Misses           0    20616    +20616
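The head-side figures in the diff above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes the displayed percentage is hits divided by lines and truncated (not rounded) to two decimals, which is what the displayed 70.14% appears to be consistent with.

```python
import math

# Head-side values copied from the coverage diff above.
hits = 48441
misses = 20616
lines = 69057

# Hits and misses should account for every tracked line.
assert hits + misses == lines

coverage_pct = 100 * hits / lines
# Assumption: the report truncates to two decimals instead of rounding.
truncated = math.floor(coverage_pct * 100) / 100
print(truncated)  # 70.14
```

Rounding to the nearest hundredth would give 70.15 instead, which is why truncation is assumed here.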

Flag	Coverage Δ
test_integration_espnet2	`47.88% <43.43%> (?)`
test_python_espnet2	`62.76% <58.51%> (?)`
test_python_espnet3	`15.98% <1.74%> (?)`
test_utils	`62.76% <58.51%> (?)`


Qingzheng-Wang and others added 30 commits August 16, 2025 21:11

Add test for lid task.

4621a53

Add lid1 CI.

52978bf

Add lid inference test.

2888cd7

Add lid train test.

e7e45c1

Add lid espnet model test.

a75103f

Add catpow samplers test.

b318798

Add tristage lr scheduler test.

8d3c48f

Add minian4 lid recipe for CI.

3896b28

Fix symlinks to directly to utils in asr.

4e1bdee

Add readme.

f2359df

[pre-commit.ci] auto fixes from pre-commit.com hooks

89d1742

for more information, see https://pre-commit.ci

Fix unused variable.

dbec698

[pre-commit.ci] auto fixes from pre-commit.com hooks

2f376a7

for more information, see https://pre-commit.ci

Fix local files to keep with early prs.

bcd523e

Fix local files to sym links

7d80668

Merge branch 'master' into lid_release8

074b738

Add random seed.

461bae8

Merge branch 'master' into lid_release8

89396f2

Remove unused comments.

882b557

Fix num iters and comments.

8dfaaf1

Fix variable quotes.

e9beb63

Merge branch 'master' into lid_release8

d71b1c0

Merge branch 'master' into lid_release8

5dda091

Merge branch 'master' into lid_release8

4f642fa

Rename test espnet model to avoid repeat with other task.

70a1750

Replace frontend from s3prl to default.

e2da90f

Add dataset scaling factor attribute.

fb71860

Rename save_every to checkpoint_interval.

9049b4f

Update mock dependencies.

7b193b5

[pre-commit.ci] auto fixes from pre-commit.com hooks

40eec9d

for more information, see https://pre-commit.ci

jctian98 and others added 10 commits October 6, 2025 11:16

Merge branch 'infra' of https://github.com/jctian98/espnet into infra

8bc55c3

merge remote

fix ci

c298c04

Merge branch 'master' into infra

643578e

Merge pull request espnet#6248 from Shikhar-S/powsm

1778902

Get forced alignments from CTC model

Merge pull request espnet#6257 from jctian98/infra

849b210

SpeechLM Data Infra: dataset management

Fix HF tests by switching them to upstream testing models

d4a73ca

[pre-commit.ci] auto fixes from pre-commit.com hooks

e6358fb

for more information, see https://pre-commit.ci

Fix too long lines

cc1f815

Merge pull request espnet#6261 from akreal/fix-hf-testing

81477a2

Fix HF tests by switching them to upstream testing models

Merge branch 'master' of https://github.com/espnet/espnet into merge_…

a4c7e58

…master

dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. CI Travis, Circle CI, etc labels Oct 17, 2025

mergify bot added ESPnet1 ESPnet2 README Installation labels Oct 17, 2025

[pre-commit.ci] auto fixes from pre-commit.com hooks

1056eb2

for more information, see https://pre-commit.ci

gemini-code-assist bot reviewed Oct 17, 2025

View reviewed changes

Masao-Someki added 3 commits October 17, 2025 14:29

Fixed merge issue

c11ad2f

- Replace espnet -> espnet2.legacy

Merge branch 'merge_master' of github.com:Masao-Someki/espnet into me…

8136137

…rge_master

Format

0d497ad

Updated timeout to 50sec for test_parallel_for_propagates_task_exception

c5aea57

Masao-Someki changed the title ~~Merge master~~ [ESPnet-3] Merge master into espnet3 branch Oct 18, 2025

Masao-Someki merged commit 8b3fea3 into espnet:espnet3 Oct 18, 2025
28 checks passed

Fhrozen added this to the v.202512 milestone Oct 26, 2025

Fhrozen modified the milestones: v.202512, v.202511 Nov 14, 2025

Masao-Someki deleted the merge_master branch November 26, 2025 18:25

[ESPnet-3] Merge master into espnet3 branch #6263
Masao-Someki merged 138 commits into espnet:espnet3 from Masao-Someki:merge_master

8 participants
