An advanced multi-agent AI platform for automated content creation, from research and scriptwriting to visual asset generation and production management. This system orchestrates multiple AI agents to transform trending tech news into fully produced social media content, using LangGraph for workflow management and various AI services for content generation.
Watch our automated content creation pipeline in action - Click to play on YouTube
Samsung PRISM - Kirmada Presentation - View our detailed project presentation covering the system architecture, workflow design, and implementation details.
- Features
- System Architecture
- Directory Structure
- Installation & Setup
- Running the Application
- AI Agent Components
- Workflow Orchestration
- Storage & Integration
- Technical Implementation
- Testing
- Cost Management
- Extending the System
- Troubleshooting
## Features

- Multi-Agent Architecture: Specialized AI agents for research, content creation, and asset generation
- LangGraph Orchestration: Advanced workflow management with error handling and parallel processing
- End-to-End Automation: Complete pipeline from research to final video assets
- Multi-Source Research: Fetches tech news from Tavily, NewsAPI, and direct web crawling
- Intelligent Script Generation: AI-powered script creation optimized for social media format
- Visual Asset Production: Image generation with FLUX models, b-roll search from Pexels
- Voice Generation: AI voiceover synthesis for video narrations
- Shot-by-Shot Analysis: AI-powered shot breakdown for video production
- Google Drive Integration: Automated asset organization with topic-based subfolders
- Notion Workspace Management: Project tracking and monitoring integration
- Parallel Processing: Simultaneous image, voice, and b-roll generation
- Cost Management: Built-in cost tracking and limits to prevent excessive usage
- Cross-Platform Publishing: Ready for multi-platform content distribution
## System Architecture

```mermaid
graph TB
subgraph "User Interface"
UI[Web Dashboard<br/>or API Interface]
end
subgraph "Workflow Orchestration"
LG[LangGraph<br/>State Management]
WF[Production Workflow<br/>State Graph]
end
subgraph "AI Agent Layer"
SA[Search Agent<br/>Intelligent Tech News Search]
CA[Crawl Agent<br/>Article Content Extraction]
SG[Scripting Agent<br/>Video Script Generation]
PG[Prompt Generation<br/>Visual Asset Prompts]
IG[Image Generation<br/>FLUX Image Creation]
VG[Voice Generation<br/>AI Narration Synthesis]
BG[B-roll Search<br/>Pexels Asset Search]
AA[Asset Gathering<br/>Organize Assets in Google Drive]
NO[Notion Agent<br/>Project Tracking]
VT[Visual Table<br/>Production Schedule]
end
subgraph "API Integration Layer"
TV[Tavily Search API]
FC[Firecrawl API]
OA[OpenAI API]
TO[Together AI - FLUX]
CB[Chatterbox API]
PX[Pexels API]
SB[Supabase API]
NT[Notion API]
GD[Google Drive API]
end
subgraph "Storage Layer"
ST[Supabase<br/>Article & Script Storage]
GDS[Google Drive<br/>Asset Organization]
NTS[Notion<br/>Project Management]
end
UI --> LG
LG --> WF
WF --> SA
WF --> CA
WF --> SG
WF --> PG
WF --> IG
WF --> VG
WF --> BG
WF --> AA
WF --> NO
WF --> VT
SA --> TV
SA --> FC
CA --> FC
SG --> OA
PG --> OA
IG --> TO
VG --> CB
BG --> PX
AA --> GD
NO --> NT
ST <--> SB
GDS <--> AA
NTS <--> NO
```
## Directory Structure

```
D:\Kirmada\workflow\
│
├── .env                           # Environment variables and API keys
├── .gitignore                     # Git ignore patterns
├── credentials.json               # Google Drive OAuth credentials
├── token.json                     # Google Drive access tokens
│
├── agents/                        # AI agent implementations
│   ├── asset_gathering_agent.py   # Organize generated assets in Google Drive
│   ├── broll_search_agent.py      # Find b-roll footage from Pexels
│   ├── crawl_agent.py             # Web content crawling
│   ├── image_generation_agent.py  # AI image generation with FLUX
│   ├── notion_agent.py            # Notion workspace integration
│   ├── prompt_generation_agent.py # Generate visual asset prompts
│   ├── scripting_agent.py         # Script generation from articles
│   ├── search_agent.py            # Intelligent tech news search
│   ├── supabase_agent.py          # Database storage operations
│   ├── visual_table_agent.py      # Production table generation
│   └── voice_generation_agent.py  # AI voiceover generation (using Chatterbox)
│
├── core/                          # Core workflow orchestration
│   └── production_workflow.py     # Main LangGraph workflow implementation
│
├── storage/                       # Storage and integration services
│   ├── gdrive_storage.py          # Google Drive integration
│   └── upload_to_gdrive.py        # Asset upload utilities
│
├── assets/                        # Generated assets
│   └── generated_images/          # AI-generated images
│
├── temp_broll/                    # Temporary b-roll storage
│
└── __pycache__/                   # Python cache files (auto-generated)
```
## Installation & Setup

Prerequisites:

- Python 3.8 or higher
- pip package manager
- Git for version control
- Google Drive account with API access
```bash
git clone <repository_url>
cd Kirmada/workflow

# Create virtual environment
python -m venv .venv

# Activate virtual environment
# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate

# Install dependencies
pip install -r ../requirements.txt
```

Create a `.env` file in the workflow directory based on the example in `workflow/.env`:
```env
# LangGraph Configuration
OPENAI_API_KEY=your_openai_api_key
CLAUDE_API_KEY=your_claude_api_key
TAVILY_API_KEY=your_tavily_api_key
FIRECRAWL_API_KEY=your_firecrawl_api_key
# Image Generation APIs
TogetherAI_API_KEY=your_togetherai_api_key
PEXELS_API_KEY=your_pexels_api_key
# Voice Generation
CHATTERBOX_API_KEY=your_chatterbox_api_key
# Database Storage
SUPABASE_URL=your_supabase_url
SUPABASE_ANON_KEY=your_supabase_anon_key
# Google Drive Integration
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
# Notion Integration
NOTION_API_KEY=your_notion_api_key
NOTION_DATABASE_ID=your_notion_database_id
# Cost Limits
COST_LIMIT_PER_REEL=0.50
COST_LIMIT_DAILY=10.00
```

To set up Google Drive API access:

- Go to Google Cloud Console
- Create a new project or select an existing one
- Enable the Google Drive API
- Create credentials (OAuth 2.0 Client ID)
- Download the credentials JSON file
- Rename it to `credentials.json` and place it in the workflow directory
- Run the application once to complete the OAuth flow and create `token.json`
The application will automatically create necessary directories and authenticate services on first run.
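Under the hood, the first-run authentication follows the standard google-auth-oauthlib pattern. The sketch below is illustrative only; the scopes the project actually requests may differ.

```python
# Minimal sketch of the first-run OAuth flow using the standard
# google-auth-oauthlib pattern; the scope below is an assumption.
import os

from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ["https://www.googleapis.com/auth/drive.file"]  # assumed scope

creds = None
if os.path.exists("token.json"):
    creds = Credentials.from_authorized_user_file("token.json", SCOPES)
if creds is None or not creds.valid:
    # Opens a browser window for consent, then stores the token locally
    flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
    creds = flow.run_local_server(port=0)
    with open("token.json", "w") as token_file:
        token_file.write(creds.to_json())
```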
## Running the Application

```bash
# From the workflow directory
python -c "import asyncio; from core.production_workflow import run_production_workflow; asyncio.run(run_production_workflow('latest tech breakthrough'))"
```

The workflow can also be integrated into the main backend application through the LangGraph service.
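For programmatic use, the entry point can be awaited directly. This is a minimal sketch, assuming `run_production_workflow` is an async function that takes a topic string, as in the command above:

```python
import asyncio

from core.production_workflow import run_production_workflow


async def main() -> None:
    # Kick off the full pipeline for a single topic
    result = await run_production_workflow("latest tech breakthrough")
    print(result)


if __name__ == "__main__":
    asyncio.run(main())
```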
The workflow runs as a background service that:
- Automatically discovers trending tech news
- Generates AI scripts for social media
- Creates visual assets (images, voiceovers)
- Organizes assets in Google Drive
- Updates Notion workspace with project status
## AI Agent Components

### Search Agent

- Purpose: Finds high-quality, standalone tech news articles
- Techniques: AI-powered search queries, Firecrawl validation, result ranking
- Features:
- Intelligent query optimization
- Quality filtering to avoid aggregated content
- Validation with Firecrawl API
- Domain-based prioritization for official sources
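For illustration, the search step might call Tavily roughly as follows; the query, result count, and domain list are assumptions, not the agent's actual settings.

```python
# Illustrative Tavily call with simple domain-based prioritization;
# the query and preferred domains are assumptions.
import os

from tavily import TavilyClient

client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
response = client.search("latest AI chip announcement", max_results=10)

PREFERRED_DOMAINS = ("techcrunch.com", "theverge.com", "arstechnica.com")
ranked = sorted(
    response["results"],
    key=lambda r: any(domain in r["url"] for domain in PREFERRED_DOMAINS),
    reverse=True,  # preferred domains first
)
for result in ranked[:5]:
    print(result["score"], result["url"])
```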
### Crawl Agent

- Purpose: Extracts clean content from web articles
- Techniques: Web scraping with Firecrawl, content cleaning
- Features:
- Handles JavaScript-rendered content
- Extracts article metadata
- Cleans HTML to plain text
- Handles various website formats
### Scripting Agent

- Purpose: Generates engaging YouTube-style scripts from articles
- Techniques: LLM-powered text generation with structure
- Features:
- 60-second video script format
- Hook and section structure (INTRODUCTION, MAIN, CONCLUSION)
- Visual suggestions for each section
- Metadata extraction for production planning
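A minimal sketch of the scripting step, assuming a LangChain `ChatOpenAI` call; the model name and prompt wording are illustrative, not the project's actual ones.

```python
# Illustrative 60-second script generation; the model and prompt are assumptions.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

prompt = (
    "Write a 60-second social media video script about the article below. "
    "Structure it as HOOK, INTRODUCTION, MAIN, CONCLUSION, and add a visual "
    "suggestion after each section.\n\nARTICLE:\n{article}"
)

article_text = "..."  # cleaned article content from the Crawl Agent
script = llm.invoke(prompt.format(article=article_text))
print(script.content)
```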
### Image Generation Agent

- Purpose: Creates visual assets using AI models
- Techniques: FLUX Schnell model via Together AI
- Features:
- 9:16 aspect ratio for social media
- Prompt enhancement for quality images
- Batch processing capabilities
- Google Drive storage integration
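A Together AI call for a 9:16 FLUX Schnell image looks roughly like the sketch below; the dimensions, step count, and output path are assumptions.

```python
# Illustrative FLUX Schnell generation via Together AI; width/height/steps
# are assumed values chosen for a 9:16 social media frame.
import base64
import os

from together import Together

client = Together(api_key=os.environ["TogetherAI_API_KEY"])
response = client.images.generate(
    model="black-forest-labs/FLUX.1-schnell",
    prompt="A futuristic smartphone on a neon-lit desk, cinematic lighting",
    width=576,
    height=1024,  # 9:16 aspect ratio
    steps=4,      # Schnell is tuned for very few inference steps
    n=1,
    response_format="b64_json",
)
with open("assets/generated_images/shot_01.png", "wb") as f:
    f.write(base64.b64decode(response.data[0].b64_json))
```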
### Prompt Generation Agent

- Purpose: Generates detailed visual prompts from scripts
- Techniques: LLM-based prompt engineering
- Features:
- Shot-specific prompt generation
- Style and mood optimization
- Visual timing alignment
- Quality scoring for prompts
### Voice Generation Agent

- Purpose: Generates AI voiceovers for video scripts
- Techniques: API integration with Chatterbox
- Features:
- Natural-sounding narration
- Emotion and tone control
- Audio file format conversion
- Google Drive storage
### B-roll Search Agent

- Purpose: Finds relevant b-roll footage from Pexels
- Techniques: API integration with Pexels
- Features:
- Keyword-based search
- Quality filtering
- Automatic download and organization
- Metadata tracking
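The Pexels video search is a plain REST call; here is a minimal sketch, with the query and filters as assumptions:

```python
# Illustrative Pexels video search; the query and filters are assumptions.
import os

import requests

response = requests.get(
    "https://api.pexels.com/videos/search",
    headers={"Authorization": os.environ["PEXELS_API_KEY"]},
    params={"query": "server room", "orientation": "portrait", "per_page": 5},
    timeout=30,
)
response.raise_for_status()
for video in response.json()["videos"]:
    # Each video offers several files at different resolutions
    best = max(video["video_files"], key=lambda f: f["width"] or 0)
    print(video["id"], best["link"])
```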
### Notion Agent

- Purpose: Integrates with Notion workspaces for project tracking
- Techniques: Notion API integration
- Features:
- Project database entry creation
- Status tracking and updates
- Link generation to assets
- Progress monitoring
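Creating a tracking entry uses the official notion-client; in this sketch the property names ("Name", "Status") are assumptions and must match the target database schema.

```python
# Illustrative Notion entry creation; property names are assumptions
# that must match your database schema.
import os

from notion_client import Client

notion = Client(auth=os.environ["NOTION_API_KEY"])
notion.pages.create(
    parent={"database_id": os.environ["NOTION_DATABASE_ID"]},
    properties={
        "Name": {"title": [{"text": {"content": "AI chip reel"}}]},
        "Status": {"select": {"name": "In Production"}},
    },
)
```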
## Workflow Orchestration

The main LangGraph workflow orchestrates all agents in a multi-step process:
```mermaid
sequenceDiagram
participant User
participant Workflow as Production Workflow
participant SA as Search Agent
participant CA as Crawl Agent
participant SB as Supabase Storage
participant SG as Scripting Agent
participant PG as Prompt Generation
participant IG as Image Generation
participant VG as Voice Generation
participant BG as B-roll Search
participant AA as Asset Gathering
participant NO as Notion Integration
User->>Workflow: Start production for topic
Workflow->>SA: Search for trending tech news
SA->>Workflow: Return high-quality articles
Workflow->>CA: Crawl selected article
CA->>Workflow: Return clean article content
Workflow->>SB: Store article in Supabase
SB->>Workflow: Article stored with ID
Workflow->>SG: Generate script from content
SG->>Workflow: Return generated script
Workflow->>SB: Store script in Supabase
SB->>Workflow: Script stored with ID
Workflow->>SG: Analyze script for shots
SG->>Workflow: Return shot breakdown
par Parallel Generation
Workflow->>PG: Generate image prompts
PG->>Workflow: Return prompts
Workflow->>IG: Generate images from prompts
IG->>Workflow: Return images
Workflow->>VG: Generate voiceover
VG->>Workflow: Return voice file
Workflow->>BG: Search b-roll footage
BG->>Workflow: Return b-roll assets
end
Workflow->>AA: Organize assets in Google Drive
AA->>Workflow: Assets organized
Workflow->>NO: Update Notion project
NO->>Workflow: Project updated
Workflow->>User: Production complete
```
The workflow maintains comprehensive state across all steps:
```python
from dataclasses import dataclass, field
from operator import add
from typing import Annotated, Any, Dict, List

from langchain_core.messages import BaseMessage


@dataclass
class WorkflowState:
    # Input parameters
    user_query: str = ""
    topic: str = ""

    # Search phase
    search_results: str = ""
    search_urls: List[str] = field(default_factory=list)

    # Crawl phase
    article_data: Dict[str, Any] = field(default_factory=dict)
    crawled_content: str = ""

    # Storage phase
    article_id: str = ""
    storage_result: str = ""

    # Script generation phase
    script_content: str = ""
    script_hook: str = ""
    visual_suggestions: List[str] = field(default_factory=list)
    script_id: str = ""

    # Shot analysis phase
    shot_breakdown: List[Dict[str, Any]] = field(default_factory=list)
    shot_timing: List[Dict[str, Any]] = field(default_factory=list)
    shot_types: List[str] = field(default_factory=list)

    # Parallel generation phase (Annotated reducers merge parallel updates)
    prompts_generated: Annotated[List[Dict], add] = field(default_factory=list)
    images_generated: Annotated[List[str], add] = field(default_factory=list)
    voice_files: Annotated[List[str], add] = field(default_factory=list)
    broll_assets: Dict[str, Any] = field(default_factory=dict)

    # Asset gathering phase
    project_folder_path: str = ""
    asset_organization_result: str = ""

    # Notion integration phase
    notion_project_id: str = ""
    notion_status: str = ""

    # Status tracking
    current_step: str = "search"
    errors: Annotated[List[str], add] = field(default_factory=list)
    messages: Annotated[List[BaseMessage], add] = field(default_factory=list)
```

- Each node implements comprehensive error handling (see the retry sketch after this list)
- Failed steps preserve partial results for recovery
- Workflow state is preserved across execution
- Automatic retry mechanisms for transient failures
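The retry behavior for transient failures could be implemented with a small helper like this hypothetical sketch; it is not the project's actual code.

```python
# Hypothetical retry helper with exponential backoff for transient
# failures; not the project's actual implementation.
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    fn: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    for attempt in range(1, attempts + 1):
        try:
            return await fn()
        except Exception:
            if attempt == attempts:
                raise
            # Exponential backoff between attempts
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```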
## Storage & Integration

### Google Drive Storage

- Automated folder structure creation
- Topic-based subfolder organization
- Thread-safe file upload with duplicate prevention
- Cache management for performance optimization
- OAuth2 authentication with token refresh
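An upload into a topic subfolder follows the standard Drive v3 client pattern. This is a minimal sketch; the folder ID, file path, and scope are placeholders.

```python
# Minimal Drive v3 upload sketch; folder ID, path, and scope are placeholders.
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

creds = Credentials.from_authorized_user_file(
    "token.json", ["https://www.googleapis.com/auth/drive.file"]
)
service = build("drive", "v3", credentials=creds)

media = MediaFileUpload("assets/generated_images/shot_01.png", resumable=True)
uploaded = service.files().create(
    body={"name": "shot_01.png", "parents": ["<topic_folder_id>"]},
    media_body=media,
    fields="id",
).execute()
print("Uploaded file id:", uploaded["id"])
```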
### Supabase Storage

- Article content storage with metadata
- Script storage with associated metadata
- Relational data management
- Query and retrieval capabilities
- Error handling and retry logic
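Storing an article uses the supabase-py client; the table and column names in this sketch are assumptions.

```python
# Illustrative Supabase insert; the table and column names are assumptions.
import os

from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_ANON_KEY"])
result = supabase.table("articles").insert({
    "title": "AI chip announcement",
    "url": "https://example.com/article",
    "content": "...cleaned article text...",
}).execute()
print(result.data[0]["id"])
```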
### Notion Integration

- Project tracking database integration
- Status updates and monitoring
- Asset linking and organization
- Progress visualization
- Team collaboration features
## Technical Implementation

Each agent follows the LangChain tool pattern:
```python
from langchain_core.tools import tool


@tool
async def agent_function_name(parameters: str) -> str:
    """
    Comprehensive docstring explaining the agent's function.

    Args:
        parameters: Detailed parameter descriptions

    Returns:
        Formatted response with structured information
    """
    # Implementation with error handling
    try:
        # Agent logic (perform_action, format_result, and format_error
        # are placeholders for the agent's real helpers)
        result = perform_action()
        return format_result(result)
    except Exception as e:
        return format_error(str(e))
```

The workflow implements intelligent parallel processing (see the sketch after the following list):
- Prompt generation, image creation, voice synthesis, and b-roll search run in parallel
- Synchronization point to ensure all assets are ready before organization
- Resource management to prevent API rate limit issues
- Error propagation handling between parallel branches
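A minimal, self-contained sketch of the fan-out/fan-in pattern, assuming LangGraph's `StateGraph` API; the node names and state fields are illustrative, not the project's actual ones.

```python
# Illustrative LangGraph fan-out/fan-in; node names and state are assumptions.
import operator
from dataclasses import dataclass, field
from typing import Annotated, List

from langgraph.graph import END, START, StateGraph


@dataclass
class DemoState:
    # The reducer (operator.add) lets parallel branches append results
    # without overwriting each other
    assets: Annotated[List[str], operator.add] = field(default_factory=list)


def make_branch(name: str):
    def node(state: DemoState) -> dict:
        return {"assets": [f"{name}-asset"]}
    return node


def gather(state: DemoState) -> dict:
    print("All branches finished:", state.assets)
    return {}


graph = StateGraph(DemoState)
for branch in ("images", "voice", "broll"):
    graph.add_node(branch, make_branch(branch))
graph.add_node("gather", gather)

for branch in ("images", "voice", "broll"):
    graph.add_edge(START, branch)  # fan out: branches run in the same superstep
graph.add_edge(["images", "voice", "broll"], "gather")  # fan in: join point
graph.add_edge("gather", END)

app = graph.compile()
app.invoke({"assets": []})
```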
### Cost Optimization

- Built-in cost tracking per reel and daily limits
- API usage monitoring
- Automatic stopping when cost thresholds are met
- Detailed cost breakdown in final reports
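A tracker enforcing the limits above might look like the following sketch; the class and method names are hypothetical, not the project's actual implementation.

```python
# Hypothetical cost tracker driven by the COST_LIMIT_* env variables;
# not the project's actual implementation.
import os


class CostTracker:
    def __init__(self) -> None:
        self.per_reel_limit = float(os.getenv("COST_LIMIT_PER_REEL", "0.50"))
        self.daily_limit = float(os.getenv("COST_LIMIT_DAILY", "10.00"))
        self.reel_spend = 0.0
        self.daily_spend = 0.0

    def record(self, service: str, cost: float) -> None:
        """Add a cost entry and stop the run if a threshold is crossed."""
        self.reel_spend += cost
        self.daily_spend += cost
        if self.reel_spend > self.per_reel_limit:
            raise RuntimeError(f"Per-reel cost limit exceeded after {service}")
        if self.daily_spend > self.daily_limit:
            raise RuntimeError(f"Daily cost limit exceeded after {service}")
```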
### Quality Assurance

- Intelligent content filtering to avoid aggregated content
- Quality validation of generated assets
- Multiple quality checks at each step
- Automatic retry mechanisms for failed generations
## Testing

```bash
# Test individual agents
python -m pytest tests/test_agents.py

# Test workflow orchestration
python -m pytest tests/test_workflow.py

# Test storage integration
python -m pytest tests/test_storage.py

# Run all tests with coverage
python -m pytest --cov=workflow tests/
```

Test coverage includes:

- Agent Functionality: Individual agent behavior and error handling
- Workflow Execution: End-to-end workflow completion
- Storage Operations: File upload and organization
- API Integration: Third-party service integration
- Error Handling: Graceful failure and recovery
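A unit test for one of these areas might look like this sketch; it assumes pytest-asyncio is installed, and the imported name and assertion are hypothetical.

```python
# Illustrative agent test; assumes pytest-asyncio, and the export
# `search_agent_tools` is a hypothetical name, not a confirmed one.
import pytest


@pytest.mark.asyncio
async def test_search_agent_returns_text():
    from agents.search_agent import search_agent_tools  # hypothetical export

    search_tool = search_agent_tools[0]
    result = await search_tool.ainvoke({"query": "latest AI news"})
    assert isinstance(result, str) and result
```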
## Cost Management

The system includes several cost management features:
- Per-reel limits: $0.50 per reel production
- Daily limits: $10.00 daily spending cap
- API key rotation: Automatic key switching if limits are reached
- Generation quality: Optimal balance between quality and cost
Approximate per-service costs:

- OpenAI API: Script generation (~$0.02-0.05 per reel)
- Together AI: Image generation with FLUX (~$0.05-0.10 per reel)
- Pexels: B-roll search (free tier available)
- Chatterbox: Voice generation (~$0.05-0.08 per reel)
- Google Drive: Storage and API calls (free tier available)
## Extending the System

To add a new agent:

- Create a new agent file:
```python
# agents/new_agent.py
from langchain_core.tools import tool


@tool
async def new_agent_function(input_data: str) -> str:
    """
    New agent to perform a specific function.
    """
    # Implementation goes here; return a structured string result
    return f"Processed: {input_data}"


new_agent_tools = [
    new_agent_function,
]
```

- Register the agent in the workflow:
```python
# In production_workflow.py
from agents.new_agent import new_agent_tools

# Add a field to the state
class WorkflowState:
    # ... other fields
    new_agent_result: str = ""

# Add a node and edge to the workflow
self.workflow.add_node("new_agent_step", self.new_agent_node)
self.workflow.add_edge("previous_step", "new_agent_step")
```

Customization options:

- Adjust image generation settings (dimensions, style, quality)
- Modify script length and structure parameters
- Customize voice model and emotion settings
- Configure Google Drive folder structure
To support additional platforms:

- Add new publishing APIs (YouTube, TikTok, Instagram)
- Create platform-specific optimization agents
- Extend Notion templates with new platform fields
- Add platform-specific cost tracking
Symptoms: "Invalid API key", "Authentication failed" Solutions:
# Verify API keys are correctly set
env | grep -i api_key
# Re-authenticate Google Drive
rm token.json
# Run the script again to complete OAuth flowSymptoms: "Rate limit exceeded", "Too many requests" Solutions:
- Add multiple API keys for rotation
- Implement delays between requests
- Check daily usage limits on all services
- Use lower-cost alternatives when available
**Symptoms:** No images generated, "Model not available"

**Solutions:**

```bash
# Install required dependencies
pip install together

# Check model availability
python -c "from agents.image_generation_agent import check_image_generation_status; import asyncio; asyncio.run(check_image_generation_status())"
```

**Symptoms:** File upload failures, Google Drive access issues

**Solutions:**
- Verify Google Cloud project setup
- Check OAuth2 permissions
- Ensure sufficient storage space
- Validate credentials file location
**Symptoms:** Workflow doesn't proceed to the next step

**Solutions:**
- Check error logs in console output
- Verify state transitions in workflow
- Restart the workflow if needed
- Validate all required API keys are available
To improve generation speed:

- Use efficient models for faster generation
- Implement caching for repeated requests
- Optimize parallel processing
- Pre-load frequently used models
To reduce cost:

- Use free-tier models when possible
- Optimize generation parameters
- Implement intelligent retries to prevent waste
- Monitor usage and set appropriate limits
To enable debug logging:

```python
# In production_workflow.py
import logging

logging.basicConfig(level=logging.DEBUG)
```

To check recent API usage, review the cost reports in each service console:

- OpenAI Dashboard
- Together AI Dashboard
- Pexels Dashboard
- Google Cloud Console

## License

This project is licensed under the MIT License. See the LICENSE file for details.
## Contributing

- Fork the repository
- Create a feature branch: `git checkout -b feature-name`
- Make your changes and test thoroughly
- Commit your changes: `git commit -am 'Add some feature'`
- Push to the branch: `git push origin feature-name`
- Create a Pull Request
## Support

For issues, questions, or contributions:
- Check the troubleshooting section above
- Review existing issues in the repository
- Create a new issue with detailed information
- Contribute improvements via pull requests
Automate your content creation! 🚀