Commit f68484c
Unified inference of streaming ASR (NVIDIA-NeMo#14817)
* init inference folders
Signed-off-by: naymaraq <[email protected]>
* added base asr inference
Signed-off-by: naymaraq <[email protected]>
* add ctc and rnnt inference classes
Signed-off-by: naymaraq <[email protected]>
* small changes for ctc/rnnt inference
Signed-off-by: naymaraq <[email protected]>
* add cache aware ctc/rnnt inference classes
Signed-off-by: naymaraq <[email protected]>
* finalize asr inference part
Signed-off-by: naymaraq <[email protected]>
* add word class
Signed-off-by: naymaraq <[email protected]>
* add enums file
Signed-off-by: naymaraq <[email protected]>
* add alignment preserving itn
Signed-off-by: naymaraq <[email protected]>
* add punctuation/capitalization model
Signed-off-by: naymaraq <[email protected]>
* add audio_io and progressbar files
Signed-off-by: naymaraq <[email protected]>
* add framing and buffering files
Signed-off-by: naymaraq <[email protected]>
* mv common/inference/utils into asr/inference/utils
Signed-off-by: naymaraq <[email protected]>
* add StreamingState objects
Signed-off-by: naymaraq <[email protected]>
* temporary rm enhancement stuff
Signed-off-by: naymaraq <[email protected]>
* rm common/inference
Signed-off-by: naymaraq <[email protected]>
* add greedy decoders for CTC/RNNT
Signed-off-by: naymaraq <[email protected]>
* add endpointing files
Signed-off-by: naymaraq <[email protected]>
* add text processing
Signed-off-by: naymaraq <[email protected]>
* mv itn_utils into utils
Signed-off-by: naymaraq <[email protected]>
* add bpe_decoder, context_manager for cache aware, recognizer_utils
Signed-off-by: naymaraq <[email protected]>
* add base_recognizer and recognizer interface files
Signed-off-by: naymaraq <[email protected]>
* add recognizers
Signed-off-by: naymaraq <[email protected]>
* add factory
Signed-off-by: naymaraq <[email protected]>
* add inference example and asr_client.py
Signed-off-by: naymaraq <[email protected]>
* minor fix
Signed-off-by: naymaraq <[email protected]>
* minor fixes
Signed-off-by: naymaraq <[email protected]>
* add example usage
Signed-off-by: naymaraq <[email protected]>
* add jsonl support
Signed-off-by: naymaraq <[email protected]>
* rm niva prefix
Signed-off-by: naymaraq <[email protected]>
* fix docstrings
Signed-off-by: naymaraq <[email protected]>
* mv RequestType into enums.py
Signed-off-by: naymaraq <[email protected]>
* rm redundant setters
Signed-off-by: naymaraq <[email protected]>
* add a log_level to config.yaml
Signed-off-by: naymaraq <[email protected]>
* setup log_level in RecognizerBuilder
Signed-off-by: naymaraq <[email protected]>
* add comments in multi stream and fix docstrings in buffering
Signed-off-by: naymaraq <[email protected]>
* conditional import for diskcache
Signed-off-by: naymaraq <[email protected]>
* set log level to INFO
Signed-off-by: naymaraq <[email protected]>
* add MPS device support
Signed-off-by: naymaraq <[email protected]>
* add tests
Signed-off-by: naymaraq <[email protected]>
* move inference into examples/asr/asr_chunked_inference/ctc
Signed-off-by: naymaraq <[email protected]>
* rm duplicated create_partial_transcript method
Signed-off-by: naymaraq <[email protected]>
* Apply isort and black reformatting
Signed-off-by: naymaraq <[email protected]>
* resolve flake8 errors
Signed-off-by: naymaraq <[email protected]>
* resolve return type
Signed-off-by: naymaraq <[email protected]>
* fix imports in tests
Signed-off-by: naymaraq <[email protected]>
* optimize bpe_decoder
Signed-off-by: naymaraq <[email protected]>
* optimize log prob normalization
Signed-off-by: naymaraq <[email protected]>
* optimize split_text function
Signed-off-by: naymaraq <[email protected]>
* fix partial batching, improved GPU utilization
Signed-off-by: naymaraq <[email protected]>
* simplify ctc greedy decoder
Signed-off-by: naymaraq <[email protected]>
* add a method to perform ITN on a list of texts
Signed-off-by: naymaraq <[email protected]>
* remove duplicated code in enums
Signed-off-by: naymaraq <[email protected]>
* remove unnecessary pad_to logging
Signed-off-by: naymaraq <[email protected]>
* modified update_punctuation_and_language_tokens_timestamps function to ensure correct global timestamps for eou calculation
Signed-off-by: naymaraq <[email protected]>
* Apply isort and black reformatting
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] conditional import for pynini and nemo_text_processing
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] fix configs, added asr_output_granularity
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] write segment/word level output into json instead of ctm
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] add output granularity to request options
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] add segment related fields to state
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] add remove repeated punctuation function
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] add TextSegment class
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] update bpe decoder to support text segment
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] update recognizers
Signed-off-by: naymaraq <[email protected]>
* [refactor: segment-level output] update text processing to support segment-level output
Signed-off-by: naymaraq <[email protected]>
* rm unused and duplicated code
Signed-off-by: naymaraq <[email protected]>
* Apply isort and black reformatting
Signed-off-by: naymaraq <[email protected]>
* code cleanup
Signed-off-by: naymaraq <[email protected]>
* Apply isort and black reformatting
Signed-off-by: naymaraq <[email protected]>
* rm unused code and code cleanup
Signed-off-by: naymaraq <[email protected]>
* Set num_slots to 1024 and add a num_slots parameter to the config files
Signed-off-by: naymaraq <[email protected]>
* removed hyp.alignment processing codes
Signed-off-by: naymaraq <[email protected]>
* disable amp
Signed-off-by: naymaraq <[email protected]>
* mv diskcache req into requirements_asr.txt
Signed-off-by: naymaraq <[email protected]>
* set use_amp to true and make typing consistent
Signed-off-by: naymaraq <[email protected]>
* use match/case for readability
Signed-off-by: naymaraq <[email protected]>
* rm lambdas from punctuation_capitalization_config.py
Signed-off-by: naymaraq <[email protected]>
* rm detect_eou method from RNNTGreedyEndpointing
Signed-off-by: naymaraq <[email protected]>
* reuse read_manifest from manifest_utils
Signed-off-by: naymaraq <[email protected]>
* use librosa instead of soundfile
Signed-off-by: naymaraq <[email protected]>
* unfreeze ASRRequestOptions dataclass
Signed-off-by: naymaraq <[email protected]>
* set use_amp to false for buffered CTC/RNNT recognizers, improved throughput
Signed-off-by: naymaraq <[email protected]>
* change matmul precision to high for cache aware models
Signed-off-by: naymaraq <[email protected]>
* optimized audio buffer shifting
Signed-off-by: naymaraq <[email protected]>
* Move running scripts and YAML files out of the ctc folder
Signed-off-by: naymaraq <[email protected]>
* reorganize file structure
Signed-off-by: naymaraq <[email protected]>
* Apply isort and black reformatting
Signed-off-by: naymaraq <[email protected]>
* Minor code simplifications
Signed-off-by: naymaraq <[email protected]>
* rm duplicated initializations from recognizers
Signed-off-by: naymaraq <[email protected]>
* remove package version for diskcache
Signed-off-by: naymaraq <[email protected]>
* move tqdm import to the top
Signed-off-by: naymaraq <[email protected]>
* simplify millisecond_to_frames function
Signed-off-by: naymaraq <[email protected]>
* raise a ValueError in case of stream_id > n_audio_files
Signed-off-by: naymaraq <[email protected]>
* fix return types
Signed-off-by: naymaraq <[email protected]>
* use list/dict/... instead of List/Dict/...
Signed-off-by: naymaraq <[email protected]>
* use keyword argument passing to create CacheFeatureBufferer
Signed-off-by: naymaraq <[email protected]>
* clean up state resetting logic
Signed-off-by: naymaraq <[email protected]>
* reuse normalize_batch
Signed-off-by: naymaraq <[email protected]>
* rename verbatim_transcripts and automatic_punctuation
Signed-off-by: naymaraq <[email protected]>
* rename recognizers to pipelines
Signed-off-by: naymaraq <[email protected]>
* rename asr/*_inference -> model_wrappers/*_inference_wrapper
Signed-off-by: naymaraq <[email protected]>
* Apply isort and black reformatting
Signed-off-by: naymaraq <[email protected]>
* reorganize pnc, itn, text_processing params
Signed-off-by: naymaraq <[email protected]>
* improved code readability in pipeline initializations
Signed-off-by: naymaraq <[email protected]>
* Apply isort and black reformatting
Signed-off-by: naymaraq <[email protected]>
* add CI script for testing
Signed-off-by: naymaraq <[email protected]>
* add output_dir in CI test
Signed-off-by: naymaraq <[email protected]>
* move python running script into new folder
Signed-off-by: naymaraq <[email protected]>
* renamed asr_streaming_infer -> asr_streaming_inference
Signed-off-by: naymaraq <[email protected]>
* correct path in CI test
Signed-off-by: naymaraq <[email protected]>
* fix: variable may be used before it is initialized
Signed-off-by: naymaraq <[email protected]>
* fix docstring in itn/ folder
Signed-off-by: naymaraq <[email protected]>
* fix docstring in model_wrappers/ folder
Signed-off-by: naymaraq <[email protected]>
* fix docstring in utils/ folder
Signed-off-by: naymaraq <[email protected]>
* fix docstring in pipelines/ folder
Signed-off-by: naymaraq <[email protected]>
* fix docstring in streaming/ folder
Signed-off-by: naymaraq <[email protected]>
* remove PnC codes since nlp models are no longer supported
Signed-off-by: naymaraq <[email protected]>
* minor changes
Signed-off-by: naymaraq <[email protected]>
* return step output from transcribe_step method
Signed-off-by: naymaraq <[email protected]>
* Apply isort and black reformatting
Signed-off-by: naymaraq <[email protected]>
* fix functional_test
Signed-off-by: naymaraq <[email protected]>
* increase timeout for L0_Unit_Tests_CPU_ASR
Signed-off-by: naymaraq <[email protected]>
* rm cache aware inference from functional test
Signed-off-by: naymaraq <[email protected]>
---------
Signed-off-by: naymaraq <[email protected]>
Co-authored-by: naymaraq <[email protected]>
1 parent 155b255 commit f68484c
91 files changed
Lines changed: 11418 additions & 2 deletions
File tree
- .github/workflows
- examples/asr
  - asr_chunked_inference
  - asr_streaming_inference
  - conf/asr_streaming_inference
- nemo/collections/asr
  - inference
    - factory
    - itn
    - model_wrappers
    - pipelines
    - streaming
      - buffering
      - decoders
        - greedy
      - endpointing
        - greedy
      - framing
      - state
      - text
    - utils
  - parts/preprocessing
- requirements
- tests
  - collections/asr/inference
  - functional_tests
Lines changed: 96 additions & 0 deletions
Lines changed: 80 additions & 0 deletions
Lines changed: 83 additions & 0 deletions
Lines changed: 80 additions & 0 deletions