A PyTorch implementation of "TasNet: Time-domain Audio Separation Network for Real-time, Single-channel Speech Separation" by Yi Luo and Nima Mesgarani, published at ICASSP 2018.
| Method | Causal | SDRi (dB) | SI-SNRi (dB) | Config |
|---|---|---|---|---|
| TasNet-BLSTM (Paper) | No | 11.1 | 10.8 | |
| TasNet-BLSTM (Here) | No | 11.84 | 11.54 | L=40, N=500, hidden size 500, 4 layers, lr 1e-3, 100 epochs, batch size 10 |
| TasNet-BLSTM (Here) | No | 11.77 | 11.46 | above + L2 regularization 1e-4 |
| TasNet-BLSTM (Here) | No | 13.07 | 12.78 | above + L2 regularization 1e-5 |
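For reference, SI-SNR (and its improvement SI-SNRi over the unprocessed mixture) is the scale-invariant SNR from the paper. Below is a minimal NumPy sketch of the metric, not the evaluation code used in this repo:

```python
import numpy as np

def si_snr(estimate, source, eps=1e-8):
    """Scale-invariant SNR (dB) between an estimated and a reference source.
    Minimal sketch of the metric from the TasNet paper; the repo's own
    evaluation script may differ in details (e.g. PIT over speakers)."""
    estimate = estimate - np.mean(estimate)   # zero-mean, as in the paper
    source = source - np.mean(source)
    # Project the estimate onto the reference: s_target = <e, s> * s / ||s||^2
    s_target = np.dot(estimate, source) * source / (np.dot(source, source) + eps)
    e_noise = estimate - s_target
    return 10 * np.log10(np.dot(s_target, s_target) / (np.dot(e_noise, e_noise) + eps))

# SI-SNRi = si_snr(separated, source) - si_snr(mixture, source)
```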
- PyTorch 0.4.1+
- Python 3 (Anaconda recommended)
```bash
pip install -r requirements.txt
```
If you need to convert wsj0 to wav format and generate mixture files:
```bash
cd tools; make
```
If you already have the mixture wsj0 data:
1. `$ cd egs/wsj0`, then modify the wsj0 data path `data` at the beginning of `run.sh` to your own path.
2. `$ bash run.sh`, that's all!

If you only have the original wsj0 data (sphere format):
1. `$ cd egs/wsj0`, then modify the three wsj0 data paths at the beginning of `run.sh` to your own paths.
2. Convert the sphere-format wsj0 to wav format and generate the mixture; the `Stage 0` part of `run.sh` provides an example.
3. `$ bash run.sh`, that's all!
You can change hyper-parameters with `$ bash run.sh --parameter_name parameter_value`, e.g., `$ bash run.sh --stage 3`. See the parameter names in `egs/wsj0/run.sh` before the line `. utils/parse_options.sh`.
Workflow of `egs/wsj0/run.sh`:
- Stage 0: Convert sphere format to wav format and generate the mixture (optional)
- Stage 1: Generate json files including wav path and duration
- Stage 2: Train the network
- Stage 3: Evaluate separation performance
- Stage 4: Separate speech using TasNet
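Stage 1 essentially scans the wav directories and records each file's path and length. The sketch below illustrates that idea; the exact json layout expected by the training code is defined by the repo's own preprocessing script and may differ:

```python
import json
import os

import soundfile as sf  # assumed dependency for reading wav files

def build_wav_json(wav_dir, out_json):
    """Collect (path, number of samples) pairs for every wav file in wav_dir.
    Illustrative only: check the repo's preprocessing script for the real format."""
    infos = []
    for name in sorted(os.listdir(wav_dir)):
        if not name.endswith(".wav"):
            continue
        path = os.path.join(wav_dir, name)
        samples, _sample_rate = sf.read(path)
        infos.append((path, len(samples)))
    with open(out_json, "w") as f:
        json.dump(infos, f, indent=4)

# build_wav_json("data/tr/mix", "data/tr/mix.json")  # hypothetical paths
```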
```bash
# Set PATH and PYTHONPATH
$ cd egs/wsj0/; . ./path.sh
# Train:
$ train.py -h
# Evaluate performance:
$ evaluate.py -h
# Separate mixture audio:
$ separate.py -h
```
If you want to visualize your loss, you can use visdom to do that:
- Open a new terminal on your remote server (tmux recommended) and run `$ visdom`.
- Open a new terminal and run `$ bash run.sh --visdom 1 --visdom_id "<any-string>"` or `$ train.py ... --visdom 1 --visdom_id "<any-string>"`.
- Open your browser and go to `<your-remote-server-ip>:8097`, e.g., `127.0.0.1:8097`.
- On the visdom page, choose `<any-string>` in `Environment` to see your loss.
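Under the hood, visdom logging is just a line plot appended to a named environment. A minimal sketch, independent of this repo's `train.py`:

```python
import numpy as np
from visdom import Visdom

# "<any-string>" is the environment name you pass via --visdom_id
vis = Visdom(env="<any-string>")  # assumes a visdom server on localhost:8097
win = None
for epoch, loss in enumerate([1.2, 0.9, 0.7], start=1):  # dummy loss values
    x, y = np.array([epoch]), np.array([loss])
    if win is None:
        win = vis.line(X=x, Y=y, opts=dict(title="training loss"))
    else:
        vis.line(X=x, Y=y, win=win, update="append")
```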
If you want to continue training from a saved model:
```bash
$ bash run.sh --continue_from <model-path>
```
- Layer normalization described in the paper
- LSTM skip connection
- Curriculum learning
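The layer normalization mentioned above normalizes the encoder output of each segment over the feature (channel) dimension, with a learnable gain and bias. A minimal PyTorch sketch, not necessarily how this repo implements it:

```python
import torch
import torch.nn as nn

class ChannelLayerNorm(nn.Module):
    """Layer normalization over the feature dimension, as described in the
    TasNet paper: normalize each frame, then apply learnable gain and bias.
    Sketch only; the repo's implementation may differ."""
    def __init__(self, channels, eps=1e-8):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(1, 1, channels))
        self.bias = nn.Parameter(torch.zeros(1, 1, channels))
        self.eps = eps

    def forward(self, x):  # x: [batch, frames, channels]
        mean = x.mean(dim=-1, keepdim=True)
        var = x.var(dim=-1, keepdim=True, unbiased=False)
        return self.gain * (x - mean) / torch.sqrt(var + self.eps) + self.bias

# y = ChannelLayerNorm(500)(torch.randn(10, 100, 500))  # e.g. N=500 basis signals
```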