This project provides an automatic mixing and speech enhancement system designed to reduce auditory masking in multi-speaker scenarios. The system intelligently processes multiple tracks of speech audio, enhancing clarity and separation using perceptually motivated optimization techniques.
When multiple people speak simultaneously, auditory masking and signal interference often make it difficult to distinguish individual voices. This system addresses that problem by:
- Evaluating perceptual masking with the PEAQ model (ITU-R BS.1387)
- Using Ideal Mask Ratio (IMR) metrics for precise optimization
- Applying audio effects, including:
  - 🎚️ Level balancing
  - 🎛️ Equalization
  - 📉 Dynamic range compression
  - 🧭 Spatialization
- Using a hybrid optimization approach:
  - 🎵 Harmony Search (metaheuristic)
  - 🔢 Integer optimization
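
To make the Harmony Search step more concrete, here is a minimal sketch of the general technique, applied to per-track gains (level balancing). It is not this project's implementation: the track levels, the `toyCost` function, the parameter defaults, and all function names are assumptions made for illustration. The toy cost stands in for a perceptual masking metric and is not PEAQ. The integer optimization stage of the hybrid approach is not covered.

```javascript
// Small seeded PRNG (mulberry32-style) so that runs are repeatable.
function mulberry32(seed) {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical per-track loudness in dB; NOT measured from real audio.
const TRACK_LEVELS_DB = [-12, -20, -16];

// Toy cost (assumption): variance of the gain-adjusted levels plus a small
// penalty on large gains. A real system would use a masking metric here.
function toyCost(gainsDb) {
  const adjusted = gainsDb.map((g, i) => TRACK_LEVELS_DB[i] + g);
  const mean = adjusted.reduce((s, v) => s + v, 0) / adjusted.length;
  const variance =
    adjusted.reduce((s, v) => s + (v - mean) ** 2, 0) / adjusted.length;
  const penalty = 0.01 * gainsDb.reduce((s, g) => s + g * g, 0);
  return variance + penalty;
}

function harmonySearch(cost, dims, opts = {}) {
  const {
    lower = -12,      // lower search bound (dB)
    upper = 12,       // upper search bound (dB)
    memorySize = 10,  // harmony memory size (HMS)
    hmcr = 0.9,       // harmony memory considering rate
    par = 0.3,        // pitch adjusting rate
    bandwidth = 1.0,  // maximum pitch adjustment (dB)
    iterations = 2000,
    seed = 1,
  } = opts;
  const rand = mulberry32(seed);
  const clamp = (v) => Math.min(upper, Math.max(lower, v));
  const randomValue = () => lower + rand() * (upper - lower);

  // Fill the harmony memory with random candidate solutions.
  const memory = Array.from({ length: memorySize }, () => {
    const x = Array.from({ length: dims }, randomValue);
    return { x, cost: cost(x) };
  });

  for (let it = 0; it < iterations; it++) {
    // Improvise a new harmony one dimension at a time.
    const x = [];
    for (let d = 0; d < dims; d++) {
      if (rand() < hmcr) {
        // Reuse a value from memory, optionally nudging it (pitch adjustment).
        let v = memory[Math.floor(rand() * memorySize)].x[d];
        if (rand() < par) v = clamp(v + (rand() * 2 - 1) * bandwidth);
        x.push(v);
      } else {
        // Otherwise draw a fresh random value within the bounds.
        x.push(randomValue());
      }
    }
    const candidate = { x, cost: cost(x) };

    // Replace the worst stored harmony if the new one is better.
    let worst = 0;
    for (let i = 1; i < memorySize; i++) {
      if (memory[i].cost > memory[worst].cost) worst = i;
    }
    if (candidate.cost < memory[worst].cost) memory[worst] = candidate;
  }

  // Return the best harmony found.
  return memory.reduce((a, b) => (b.cost < a.cost ? b : a));
}

const best = harmonySearch(toyCost, TRACK_LEVELS_DB.length);
console.log("gains (dB):", best.x.map((g) => g.toFixed(2)));
console.log("cost:", best.cost.toFixed(3));
```

Because the worst harmony is only ever replaced by a better one, the best cost in memory never gets worse from one iteration to the next. In a setup like the one described above, the cost function would presumably be swapped for the perceptual masking evaluation, and the search vector extended to cover equalization, compression, and spatialization parameters.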
- Open a terminal and navigate to the project directory.
- Run the following command to install dependencies: `npm install`
  This will install all packages listed in the `package.json` file.
- Node.js must be installed on your system.
  👉 Download Node.js
- If `http-server` is not already installed, you can install it globally with: `npm install -g http-server`
- Open a terminal and navigate to the project directory.
- Start the local server: `http-server`
- In your browser, go to: http://localhost:8080
Click `Play-Automix` to start the system.
This step is essential: it initializes audio parameter training and optimization. Without it, no enhancement processing will occur.
On the UI page:
- 🔹 Click `Play-Automix`: This initializes the system. It begins analyzing audio parameters and performing training and optimization to reduce auditory masking.
- 🔹 Click `Play-Unmix`: Plays the unprocessed (raw) version of the same audio for comparison.
- 🔹 Click `Stop`: Ends playback and records the enhanced output.
This code is provided for research purposes only.
Non-commercial use is permitted.
Commercial use, redistribution, or modification of this code requires prior written permission.
Please cite the associated paper if you use this code in your work.