Modify format_wav_scp.py to support PCM of uint8, int32, float32, float64, etc.#4997

Merged
kamo-naoyuki merged 1 commit into espnet:master from kamo-naoyuki:format
Mar 13, 2023

kamo-naoyuki (Collaborator) commented Mar 11, 2023

Issues with the current script

  • np.int16 is hard-coded in the call to soundfile.read: soundfile.read(..., dtype=np.int16). This causes problems when the PCM data is uint8 or a float type.
  • kaldiio is used for segmentation, but kaldiio requires the input file to be a wav file.

Changes in this PR

  • I changed the script to call soundfile.read() without specifying the dtype. Note that, without a dtype argument, soundfile.read() always returns a float64 array regardless of the dtype of the input PCM, and soundfile.write() writes a float64 array as int16 PCM.
  • I implemented a segmentation module that is independent of kaldiio. This makes it possible to use flac files as input.

For example, the following wav.scp now also works together with a segments file:

ID1 cat a.flac |
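
To illustrate the idea of cutting segments from an already decoded waveform instead of relying on kaldiio, here is a minimal sketch. It is not the module added in this PR: the helper names `parse_segment_line` and `cut_segment` are hypothetical, and it only assumes the usual Kaldi segments layout of `<segment-id> <recording-id> <start-sec> <end-sec>`.

```python
def parse_segment_line(line):
    """Split one segments line: <segment-id> <recording-id> <start> <end>."""
    seg_id, rec_id, start, end = line.split()
    return seg_id, rec_id, float(start), float(end)


def cut_segment(wave, rate, start, end):
    """Return the samples between start and end (in seconds).

    `wave` can be any sliceable sequence of samples, e.g. the array
    returned by soundfile.read() for a wav or flac input.
    """
    first = int(round(start * rate))
    last = int(round(end * rate))
    return wave[first:last]


# Hypothetical usage: 2 s of dummy samples at 8 kHz for recording "ID1".
rate = 8000
wave = list(range(2 * rate))
seg_id, rec_id, start, end = parse_segment_line("ID1-seg1 ID1 0.5 1.25")
segment = cut_segment(wave, rate, start, end)
```

Since the slicing works on decoded samples, it does not depend on the container format of the input, which is why a piped flac entry like the one above can be segmented.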

kamo-naoyuki changed the title from "Modify format_wav.scp to support PCM of uint8, int32, float32, floar64, etc." to "Modify format_wav_scp.py to support PCM of uint8, int32, float32, floar64, etc." on Mar 11, 2023
mergify bot added the ESPnet2 label Mar 11, 2023
codecov bot commented Mar 11, 2023

Codecov Report

Merging #4997 (aad60ed) into master (b5b2b11) will increase coverage by 0.09%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4997      +/-   ##
==========================================
+ Coverage   76.91%   77.00%   +0.09%     
==========================================
  Files         606      606              
  Lines       53770    53748      -22     
==========================================
+ Hits        41355    41388      +33     
+ Misses      12415    12360      -55     
Flag Coverage Δ
test_integration_espnet1 66.29% <ø> (-0.05%) ⬇️
test_integration_espnet2 47.76% <ø> (+0.77%) ⬆️
test_python 66.84% <ø> (-0.01%) ⬇️
test_utils 23.28% <ø> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown.

see 30 files with indirect coverage changes

kamo-naoyuki changed the title from "Modify format_wav_scp.py to support PCM of uint8, int32, float32, floar64, etc." to "Modify format_wav_scp.py to support PCM of uint8, int32, float32, float64, etc." on Mar 11, 2023

kamo-naoyuki (Collaborator, Author) commented

I'll merge this PR if there are no comments.

kamo-naoyuki merged commit 418418c into espnet:master Mar 13, 2023
kamo-naoyuki deleted the format branch March 13, 2023 16:46