A high-performance digital human video generation application with GPU acceleration support for NVIDIA RTX series graphics cards.
This project transforms a CPU-only digital human application into a dual-environment system supporting both CPU and GPU processing, cutting total processing time by nearly 40% on NVIDIA RTX hardware.
- GPU Acceleration: CUDA 11.8 + PyTorch 2.0.1 optimization
- Dual Environment: CPU (stable) + GPU (performance) options
- Web Interface: Gradio-based UI with real-time processing
- CLI Support: Command-line tools for batch processing
- Error Recovery: Graceful fallbacks and comprehensive error handling
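The GPU path is optional at runtime: if CUDA initialization fails, processing can drop back to the CPU. A minimal sketch of that fallback pattern (illustrative only; `pick_device` is a hypothetical helper, not code from this repository):

```python
# Illustrative GPU -> CPU fallback (hypothetical helper, not project source).
import torch

def pick_device(prefer_gpu: bool = True) -> torch.device:
    """Return a CUDA device when it is usable, otherwise fall back to CPU."""
    if prefer_gpu:
        try:
            if torch.cuda.is_available():
                torch.zeros(1, device="cuda")  # probe the driver/runtime
                return torch.device("cuda")
        except RuntimeError as err:  # e.g. cudaGetDeviceCount() errors
            print(f"GPU unavailable ({err}); falling back to CPU")
    return torch.device("cpu")

device = pick_device()
print(f"Processing on: {device}")
```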
| Metric | CPU Version | GPU Version | Improvement |
|---|---|---|---|
| Audio Processing | 8.75s | 1.51s | 82% faster |
| Total Processing | 18.99s | 11.71s | 38% faster |
| Model Loading | 15s | 5s | 67% faster |
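The "Improvement" column is the relative reduction in wall-clock time, e.g. (18.99 - 11.71) / 18.99 ≈ 38%. A quick sanity check over the table's numbers:

```python
# Recompute the "Improvement" column from the measured times above.
timings = {
    "Audio Processing": (8.75, 1.51),
    "Total Processing": (18.99, 11.71),
    "Model Loading": (15.0, 5.0),
}
for metric, (cpu_s, gpu_s) in timings.items():
    faster = (cpu_s - gpu_s) / cpu_s * 100
    print(f"{metric}: {faster:.0f}% faster")
```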
Minimum (CPU environment):
- OS: Linux Ubuntu 18.04+
- Python: 3.8+
- RAM: 8GB+
- Storage: 10GB free space

For GPU acceleration (optional):
- GPU: NVIDIA RTX 20/30/40 series
- VRAM: 6GB+ recommended
- CUDA: 11.8+ compatible drivers
- Driver: NVIDIA 450.80.02+
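A quick check that the machine meets the GPU requirements, sketched in Python under the assumption that a CUDA-enabled PyTorch build is already installed:

```python
# Hardware check against the GPU requirements above (illustrative).
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"GPU: {props.name}")
    print(f"VRAM: {vram_gb:.1f} GB (6 GB+ recommended)")
    print(f"CUDA runtime bundled with PyTorch: {torch.version.cuda}")
else:
    print("No usable CUDA device found - use the CPU environment instead.")
```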
```bash
git clone https://github.com/agilealpha1/AiVideo.git
cd AiVideo

# Create CPU environment
python -m venv venv
source venv/bin/activate

# Install CPU dependencies
pip install -r requirements_updated.txt

# Run CPU web interface
python app.py
# Access at: http://localhost:7860
```

```bash
# Create GPU environment
python -m venv venv_gpu
source venv_gpu/bin/activate
# Install GPU-optimized PyTorch
pip install torch==2.0.1+cu118 torchaudio==2.0.2+cu118 torchvision==0.15.2+cu118 --index-url https://download.pytorch.org/whl/cu118
# Install other dependencies
pip install -r requirements_gpu_fixed.txt
# Additional required packages
pip install einops typeguard==2.13.3
# Run GPU web interface
python app_gpu.py
# Access at: http://localhost:7861
```

- Upload Files: Audio (.wav) + Video (.mp4)
- Select Code: Processing identifier (default: 1004)
- Choose Environment:
  - CPU: http://localhost:7860 (stable)
  - GPU: http://localhost:7861 (faster)
- Process: Click "Process" button
- Download: Get generated video/audio files
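The same web endpoints can also be driven programmatically through the Gradio client. This is only a sketch under assumptions about the app's API surface; the real endpoint name and argument order must be read from `client.view_api()`:

```python
# Hypothetical programmatic access to the running Gradio UI (sketch only -
# the endpoint name and input order below are assumptions, check view_api()).
from gradio_client import Client, handle_file

client = Client("http://localhost:7860")   # or :7861 for the GPU UI
client.view_api()                          # prints the exposed endpoints

result = client.predict(
    handle_file("example/audio.wav"),      # assumed first input: audio
    handle_file("example/video.mp4"),      # assumed second input: video
    api_name="/predict",                   # assumed default endpoint name
)
print(result)
```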
```bash
# CPU processing
source venv/bin/activate
python run.py --audio_path audio.wav --video_path video.mp4

# GPU processing
source venv_gpu/bin/activate
python run_gpu.py --audio_path audio.wav --video_path video.mp4 --gpu

# Force CPU mode in GPU environment
python run_gpu.py --audio_path audio.wav --video_path video.mp4 --cpu
```
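For batch jobs, the CLI can be wrapped in a small driver script. A minimal sketch, assuming paired `.wav`/`.mp4` files share a filename stem in an `inputs/` folder (that layout is an assumption, not part of the project):

```python
# Hypothetical batch driver around run_gpu.py (illustrative only; the
# inputs/ folder and file-pairing convention are assumptions).
import subprocess
from pathlib import Path

input_dir = Path("inputs")
for audio in sorted(input_dir.glob("*.wav")):
    video = audio.with_suffix(".mp4")
    if not video.exists():
        print(f"Skipping {audio.name}: no matching video")
        continue
    cmd = [
        "python", "run_gpu.py",
        "--audio_path", str(audio),
        "--video_path", str(video),
        "--gpu",
    ]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)  # stop if a job fails
```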
```
HeyGem-Linux-Python-Hack/
├── CPU Environment
│   ├── app.py                      # CPU web interface
│   ├── run.py                      # CPU command line
│   └── requirements_updated.txt    # CPU dependencies
├── GPU Environment
│   ├── app_gpu.py                  # GPU web interface
│   ├── run_gpu.py                  # GPU command line
│   └── requirements_gpu_fixed.txt  # GPU dependencies
├── Core Modules
│   ├── service/                    # Core processing logic
│   ├── face_lib/                   # Face detection/processing
│   ├── landmark2face_wy/           # Neural network models
│   └── y_utils/                    # Utility functions
├── Configuration
│   ├── config/                     # Application settings
│   └── example/                    # Sample input files
└── Documentation
    ├── README.md                   # This file
    └── .gitignore                  # Git exclusions
```
`RuntimeError: Unexpected error from cudaGetDeviceCount()`
Solution: Use the CPU environment or update the NVIDIA drivers.

`OSError: Cannot find empty port in range: 7860-7860`
Solution: Check for running processes:

```bash
ps aux | grep python
kill <process_id>  # If needed
```

`ModuleNotFoundError: No module named 'einops'`
Solution: Install the missing packages:

```bash
pip install einops typeguard==2.13.3
```

`_queue.Empty: timeout`
Status: Non-critical - processing continues, files are still generated.
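The `_queue.Empty` message typically comes from a worker that polls a work queue with a timeout: when nothing arrives in time, the exception is caught and the loop simply polls again, which is why output files are still produced. A generic illustration of that pattern (not the project's actual worker code):

```python
# Generic polling-worker pattern that produces harmless queue.Empty timeouts
# (illustration only - not taken from this repository).
import queue
import threading

tasks: "queue.Queue[str]" = queue.Queue()

def worker(stop: threading.Event) -> None:
    while not stop.is_set():
        try:
            item = tasks.get(timeout=1.0)   # may raise queue.Empty
        except queue.Empty:
            continue                        # nothing to do yet - keep polling
        print(f"Processing {item}")
        tasks.task_done()

stop = threading.Event()
threading.Thread(target=worker, args=(stop,), daemon=True).start()
tasks.put("example job")
tasks.join()
stop.set()
```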
```bash
# Clear GPU cache
python -c "import torch; torch.cuda.empty_cache()"

# Check GPU status
nvidia-smi

# Test CUDA availability
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"

# Test web interface
curl http://localhost:7860  # CPU version
curl http://localhost:7861  # GPU version

# Compare processing times
time python run.py --audio_path example/audio.wav --video_path example/video.mp4
time python run_gpu.py --audio_path example/audio.wav --video_path example/video.mp4 --gpu
```

CPU dependencies:
- `torch>=2.0.1` - Deep learning framework
- `gradio>=4.44.1` - Web interface
- `opencv-python>=4.7.0` - Computer vision
- `numpy>=1.21.6,<1.23.0` - Numerical computing
- `scipy>=1.7.1,<1.8.0` - Scientific computing
GPU dependencies:
- `torch==2.0.1+cu118` - CUDA-enabled PyTorch
- `onnxruntime-gpu==1.19.2` - GPU inference runtime
- `einops==0.8.1` - Tensor operations
- `typeguard==2.13.3` - Type checking
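To confirm the pinned versions actually landed in the active environment, a generic standard-library check (not a project script; the package list simply mirrors the dependency lists above):

```python
# Print installed versions of the key packages in the active virtualenv.
from importlib.metadata import PackageNotFoundError, version

packages = [
    "torch", "torchaudio", "torchvision",
    "gradio", "opencv-python", "numpy", "scipy",
    "onnxruntime-gpu", "einops", "typeguard",
]
for name in packages:
    try:
        print(f"{name}=={version(name)}")
    except PackageNotFoundError:
        print(f"{name}: not installed in this environment")
```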
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Original HeyGem Digital Human team for the base application
- @Holasyb918/HeyGem-Linux-Python-Hack - Docker-free Linux Python implementation
- @duixcom/Duix.Heygem - Enhanced HeyGem implementation
- NVIDIA for CUDA toolkit and GPU optimization guides
- PyTorch team for GPU acceleration framework
- Gradio team for the excellent web interface framework
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: See the `/docs` folder for detailed guides
⭐ Star this repository if it helped you optimize your digital human processing!