Generate professional images, videos, speech, and morphing effects using state-of-the-art AI models.
Portraits is a comprehensive Python suite for AI-powered content generation, featuring:
- 🖼️ Image Generation - Professional headshots and artistic images
- 🎬 Video Generation - High-quality text-to-video synthesis
- 🎙️ Voice Synthesis - Natural speech with emotion support
- 🔄 Video Morphing - Smooth facial landmark-based transitions
- 🌐 Web Interface - User-friendly Gradio UI
**Image Generation**
- SDXL Turbo model for fast, high-quality output
- LoRA support for custom styles and fine-tuning
- Professional headshots with consistent lighting and posing
- Batch processing for multiple generations
**Video Generation**
- SkyReels-V2 (14B parameters) for cinematic quality
- 540P resolution (960x544) output
- Configurable duration up to 6 seconds
- GPU acceleration with CPU fallback support
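The duration cap follows from the frame count: at 24 fps (the rate noted in the video examples later in this README), 145 frames is about six seconds. A quick sketch of the arithmetic:

```python
def clip_seconds(num_frames: int, fps: int = 24) -> float:
    """Clip duration implied by a frame count at a given frame rate."""
    return num_frames / fps

# 97 frames is roughly 4 s and 145 frames roughly 6 s at 24 fps,
# matching the num_frames values used in the examples below.
print(round(clip_seconds(97), 1), round(clip_seconds(145), 1))  # → 4.0 6.0
```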
**Voice Synthesis**
- Maya1 (3B parameters) for natural speech
- Emotion tags (`<laugh>`, `<cry>`, `<whisper>`, etc.)
- Voice descriptions for age, pitch, and style control
- 24kHz mono audio output
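The 24 kHz mono format maps clip length directly to sample count. A stdlib sketch that writes and checks one second of silence at that format (the 16-bit sample width and the file name are illustrative assumptions, not documented properties of Maya1's output):

```python
import wave

SAMPLE_RATE = 24_000  # 24 kHz mono, as produced by the voice generator

# Write one second of 16-bit silence (sample width and file name are
# assumptions for illustration).
with wave.open("silence_24khz.wav", "wb") as f:
    f.setnchannels(1)            # mono
    f.setsampwidth(2)            # 16-bit samples
    f.setframerate(SAMPLE_RATE)
    f.writeframes(b"\x00\x00" * SAMPLE_RATE)

# Read it back: duration in seconds = frame count / sample rate.
with wave.open("silence_24khz.wav", "rb") as f:
    seconds = f.getnframes() / f.getframerate()
print(seconds)  # → 1.0
```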
**Video Morphing**
- MediaPipe facial landmark detection
- Smooth mesh-based transitions
- Customizable speed and frame interpolation
- Automatic sorting and validation
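Each transition blends consecutive images over a fixed number of intermediate frames, typically driven by a linear ramp of blend weights. A hypothetical sketch (the `morph_frames` name mirrors the parameter shown in the API examples below; whether Portraits uses exactly this linear schedule is an assumption):

```python
def morph_weights(morph_frames: int) -> list[float]:
    """Linear blend weights for one transition: 0.0 is entirely the
    first image, 1.0 entirely the second."""
    return [i / (morph_frames - 1) for i in range(morph_frames)]

# With morph_frames=5, the second image fades in over five frames.
print(morph_weights(5))  # → [0.0, 0.25, 0.5, 0.75, 1.0]
```

In a mesh-based morph, the same weight would move the detected landmarks toward the target positions before the cross-dissolve, which is what keeps facial features aligned mid-transition.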
**Requirements**
- Python 3.11+
- 8GB+ VRAM recommended (GPU acceleration)
- Supports: CUDA, Apple Silicon (MPS), CPU
```bash
# Clone the repository
git clone https://github.com/andrewmarconi/portraits.git
cd portraits

# Install with uv (recommended)
uv sync

# Or with pip
pip install -r requirements.txt
```

```bash
# Launch web interface (recommended)
python main.py

# Command examples
python main.py image --prompt "professional headshot of a woman"
python main.py video --prompt "cinematic drone shot over mountains"
python main.py voice --text "Hello world!" --voice "warm, friendly, medium pitch"
python main.py morph --input images/ --output morphed.mp4
```

```python
from portraits.generators import image, video, voice, morph

# Generate professional headshot
image_path = image.generate_headshot(
    prompt="professional headshot of a woman in business attire",
    num_images=1
)

# Generate video from text
video_path = video.generate_video(
    prompt="cinematic drone shot over mountains at sunset",
    num_frames=97
)

# Generate speech with emotion
audio_path = voice.generate_voice(
    text="Welcome to Portraits! <excited>",
    description="warm, friendly, medium pitch"
)

# Create morphing video
morph_path = morph.create_mesh_morphing_video(
    input_dir="portraits/",
    output_file="morphed_video.mp4"
)
```

```python
from portraits.generators.image import generate_headshot

# Basic generation
image_path = generate_headshot(
    prompt="professional headshot of a doctor"
)

# Advanced options
image_path = generate_headshot(
    prompt="professional headshot in office lighting",
    num_images=5,
    seed=42,
    lora_path="./custom_style.safetensors",
    lora_scale=0.8
)
```

```python
from portraits.generators.video import generate_video

# Standard generation
video_path = generate_video(
    prompt="a serene lake surrounded by mountains"
)

# High-quality settings
video_path = generate_video(
    prompt="underwater coral reef with tropical fish",
    num_frames=145,  # ~6 seconds at 24fps
    guidance_scale=7.0,
    num_inference_steps=50
)
```

```python
from portraits.generators.voice import generate_voice

# Natural speech
audio_path = generate_voice(
    text="Hello, this is a test of the voice synthesis system."
)

# Character voice with emotions
audio_path = generate_voice(
    text="I'm so excited to meet you! <laugh> This is amazing!",
    description="young female, energetic, high pitch",
    temperature=0.7,
    top_p=0.9
)
```

```python
from portraits.generators.morph import create_mesh_morphing_video

# Basic morphing
morph_path = create_mesh_morphing_video(
    input_dir="headshots/",
    output_file="transition.mp4"
)

# Custom settings
morph_path = create_mesh_morphing_video(
    input_dir="portraits/",
    output_file="smooth_morph.mp4",
    fps=30,
    morph_frames=12,
    padding=0.2
)
```

Launch the comprehensive web UI:
```bash
python main.py
```

Features:
- Tabbed interface for each generation type
- Real-time preview of results
- Parameter controls with sliders and inputs
- Batch processing support
- Download management for generated content
The system automatically detects and optimizes for your hardware:
- CUDA GPUs: Full acceleration, recommended for best performance
- Apple Silicon: MPS support with CPU fallback for stability
- CPU: Functional but significantly slower
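The fallback order above can be sketched as a simple priority chain. This is an illustrative stand-in for the package's hardware detection in `core/device.py`, not its actual code; in practice the availability flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, but booleans keep the sketch dependency-free:

```python
def select_device(cuda_available: bool, mps_available: bool) -> str:
    """Pick the best available backend: CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        # MPS runs with CPU fallback for unsupported ops
        # (see PYTORCH_ENABLE_MPS_FALLBACK below).
        return "mps"
    return "cpu"

print(select_device(False, True))  # → mps
```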
| Feature | Model | VRAM Required | Disk Space |
|---|---|---|---|
| Image Generation | SDXL Turbo | 8GB+ | 12GB |
| Video Generation | SkyReels-V2 | 24GB+ | 28GB |
| Voice Synthesis | Maya1 | 16GB+ | 6GB |
| Video Morphing | MediaPipe | 4GB+ | 500MB |
```bash
# PyTorch optimizations
export PYTORCH_ENABLE_MPS_FALLBACK=1  # Apple Silicon
export CUDA_VISIBLE_DEVICES=0         # Select GPU

# Model paths (optional)
export PORTRAITS_MODELS_DIR="./models"
export PORTRAITS_OUTPUT_DIR="./output"
```

```
portraits/
├── core/              # Shared utilities
│   ├── config.py      # Configuration management
│   ├── device.py      # Hardware detection
│   ├── utils.py       # Common utilities
│   └── exceptions.py  # Custom exceptions
├── generators/        # AI generation modules
│   ├── image.py       # Image generation
│   ├── video.py       # Video generation
│   ├── voice.py       # Voice synthesis
│   ├── morph.py       # Video morphing
│   └── *_helpers.py   # Modular helper functions
├── ui/                # Web interface
│   └── app.py         # Gradio application
└── tests/             # Test suite
```
- ✅ Optimized Complexity: All functions refactored for maintainability
- ✅ Type Hints: Full type annotation coverage
- ✅ Error Handling: Comprehensive exception management
- ✅ Documentation: Detailed docstrings and examples
- ✅ Testing: Unit tests for core functionality
```bash
# Clone repository
git clone https://github.com/andrewmarconi/portraits.git
cd portraits

# Install development dependencies
uv sync --extra dev

# Run tests
pytest tests/

# Check code quality
ruff check portraits/
mypy portraits/
```

- Fork the repository
- Create a feature branch: `git checkout -b feature-name`
- Make your changes with tests
- Run quality checks: `ruff check && mypy && pytest`
- Submit a pull request
```python
# Corporate headshot
generate_headshot(
    prompt="professional corporate headshot, neutral expression, business attire"
)

# Creative professional
generate_headshot(
    prompt="professional headshot, warm lighting, confident smile"
)
```

```python
# Cinematic scene
generate_video(
    prompt="cinematic shot of a futuristic city at night with flying cars"
)

# Nature documentary
generate_video(
    prompt="close-up of a hummingbird drinking nectar in slow motion"
)
```

```python
# News anchor
generate_voice(
    text="Breaking news: AI technology continues to advance rapidly.",
    description="middle-aged male, authoritative, clear diction"
)

# Friendly assistant
generate_voice(
    text="How can I help you today? <smile>",
    description="young female, cheerful, medium pitch"
)
```

Out of Memory Errors:
- Reduce `num_frames` for video generation
- Enable CPU offloading with `enable_offload=True`
- Use smaller batch sizes
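One way to apply the `num_frames` mitigation programmatically is a retry wrapper that halves the frame count whenever generation runs out of memory. This is a hypothetical helper, not part of the Portraits API; the `generate` callable and the use of `MemoryError` are stand-ins for illustration:

```python
def generate_with_oom_fallback(generate, num_frames: int, min_frames: int = 25):
    """Call generate(num_frames=...), halving the frame count on each
    out-of-memory failure until min_frames is reached."""
    frames = num_frames
    while frames >= min_frames:
        try:
            return generate(num_frames=frames)
        except MemoryError:
            frames //= 2  # retry with a shorter clip
    raise MemoryError(f"still out of memory at {min_frames} frames")
```

For example, wrapping a generator that only fits 48 frames in memory would step 145 → 72 → 36 and succeed on the third attempt.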
Model Download Issues:
- Check internet connection
- Verify Hugging Face authentication for private models
- Ensure sufficient disk space
Performance Issues:
- Update GPU drivers
- Use CUDA instead of CPU when available
- Close other GPU-intensive applications
- Documentation: Check inline docstrings and examples
- Issues: GitHub Issues
- Discussions: GitHub Discussions
This project is licensed under the MIT License - see the LICENSE file for details.
- Stability AI - SDXL Turbo model
- Skywork AI - SkyReels-V2 model
- Maya Research - Maya1 model
- Google - MediaPipe framework
- Gradio - Web interface framework
| Hardware | Image (1x) | Video (4s) | Voice (10s) | Morph (10 imgs) |
|---|---|---|---|---|
| RTX 4090 | ~2s | ~45s | ~15s | ~8s |
| RTX 3080 | ~4s | ~90s | ~25s | ~15s |
| M2 Ultra | ~6s | ~120s | ~30s | ~20s |
| CPU (16GB) | ~45s | ~600s | ~180s | ~90s |
Benchmarks are approximate and depend on model settings and content complexity.
Portraits - Professional AI content generation made accessible. 🎨✨