[RecurrentNN × Regression × Regularized]-based Mouth Opening Estimation via SSL (Semi-Supervised Learning).


KakaruHayate/R3MOE


W I P 🔧⚠️🚧🔨

R^3 - M · O · E

[RecurrentNN × Regression × Regularized] based Mouth Opening Estimation via SSL

Chinese documentation

Installation

  1. Install PyTorch following the official instructions: https://pytorch.org/get-started/locally/
  2. Install dependencies:
pip install -r requirements.txt

Preprocessing

1. Mouth Opening Data

  1. Collect data using LipsSync. Directory structure:

    2025-02-04_22-01-52/
        audio.wav
        mouth_data.csv
    2025-02-04_22-43-56/
        audio.wav
        mouth_data.csv
    valid.txt
    
    • Prepare a seen validation set (in-distribution speakers) and an unseen validation set (out-of-distribution speakers)
    • Add the validation audio paths to valid.txt
    • For SSL: prepare unlabeled vocal-only audio whose spectrum is intact below 16 kHz
  2. Run preprocessing:

    # Labeled data
    python recipes/mouth_opening/preprocess.py <SOURCE_DIR> <TARGET_DIR>
    
    # Unlabeled data (SSL)
    python recipes/mouth_opening/preprocess_unlabel.py <SOURCE_DIR> <TARGET_DIR>
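
Before running the preprocessing scripts, it can help to confirm that the source directory matches the layout described above. The following is a minimal sketch using only the Python standard library; it is not part of this repository. The function names `find_incomplete_sessions` and `sample_rate_can_hold_16khz` are hypothetical. The 32 kHz threshold is an assumption derived from the Nyquist limit (content up to 16 kHz needs a sample rate of at least twice that), not a value stated by the project.

```python
import wave
from pathlib import Path

# File names taken from the directory structure shown above.
REQUIRED_FILES = ("audio.wav", "mouth_data.csv")
# Assumption: content up to 16 kHz needs a sample rate of at least 2 * 16 kHz.
MIN_SAMPLE_RATE = 32_000


def find_incomplete_sessions(source_dir):
    """Return {session_name: [missing files]} for session folders lacking a required file."""
    problems = {}
    for session in sorted(Path(source_dir).iterdir()):
        if not session.is_dir():
            continue  # skips valid.txt and any other loose files
        missing = [name for name in REQUIRED_FILES if not (session / name).is_file()]
        if missing:
            problems[session.name] = missing
    return problems


def sample_rate_can_hold_16khz(wav_path):
    """Return True if the WAV header reports a sample rate of at least 32 kHz.

    This is a necessary condition only: an upsampled or lossy-encoded file can
    still have an empty upper band, which only a spectrum inspection would show.
    """
    with wave.open(str(wav_path), "rb") as wav:
        return wav.getframerate() >= MIN_SAMPLE_RATE
```

Calling `find_incomplete_sessions("<SOURCE_DIR>")` returns an empty dictionary when every session folder is complete. The standard `wave` module reads PCM WAV files only, so other encodings would need a different reader.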

Base Training

Run training:

python train.py --exp_name <EXP_NAME> --dataset <DATA_PATH> --gpu <GPU_ID>

View all options with python train.py --help. Variants:

  • train_r_drop.py (R-Drop regularization)
  • train_mse.py (MSE loss)

SSL Training

Command:

python train_ssl.py --exp_name <EXP_NAME> --dataset <DATA_PATH> --unlabel_dataset <UNLABEL_PATH> --gpu <GPU_ID>

Prerequisites:

  • Create valid2.txt listing the audio paths of the unseen validation set
  • --conv_dropout must be non-zero
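
The two prerequisites above can be checked before a long SSL run is launched. The sketch below is a hypothetical helper, not part of this repository. It assumes that `--conv_dropout` takes a dropout probability as its value, and that the caller knows where valid2.txt lives; this README does not state its location. Only flags that appear in this README are used.

```python
from pathlib import Path


def build_ssl_command(exp_name, dataset, unlabel_dataset, gpu, conv_dropout, valid2_path):
    """Return the train_ssl.py argument list, or raise if a prerequisite is unmet."""
    # Assumption: --conv_dropout is a probability, so it must lie strictly between 0 and 1.
    if not 0.0 < conv_dropout < 1.0:
        raise ValueError("--conv_dropout must be non-zero (and below 1) for SSL training")
    # valid2_path is supplied by the caller; its expected location is not documented here.
    if not Path(valid2_path).is_file():
        raise FileNotFoundError(f"valid2.txt not found at {valid2_path}")
    return [
        "python", "train_ssl.py",
        "--exp_name", str(exp_name),
        "--dataset", str(dataset),
        "--unlabel_dataset", str(unlabel_dataset),
        "--gpu", str(gpu),
        "--conv_dropout", str(conv_dropout),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Building it first means a missing valid2.txt or a zero dropout value fails immediately rather than partway through training.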

Recommendations

Inference

python eval.py --model <model_path> --wav <wav_path>

Acknowledgements
