Version: 2.0
Status: Week 1 - Database & Infrastructure Complete
Architecture: MCP Server (Python) + Supabase + Multi-LLM Integration
Engine is an MCP server that enables AI agents to automate documentary-style video editing workflows. It extracts quotes (speech clips) and assets (visual footage) from source videos, builds scripts with narrator voiceover, and exports to professional video editing formats.
- 🎬 Multi-platform video search (YouTube, TikTok, Facebook, etc.)
- 🗣️ Quote extraction - AI-powered speech clip analysis with Gemini 2.5 Flash
- 🎨 Asset management - Visual footage with semantic search
- 📝 Script building - Narrator TTS with word-level timing
- 🎥 Multi-layer composition - Picture-in-picture, rapid-fire cutaways
- 💰 Real-time cost tracking - Granular API usage monitoring
- 📦 Professional export - Premiere Pro XML format
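As a rough illustration of how word-level narrator timing could drive rapid-fire cutaways, the sketch below maps trigger words to short visual cuts. It is not taken from the Engine codebase: `WordTiming`, `Cutaway`, `plan_cutaways`, and the 0.5-second default duration are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordTiming:
    """One spoken word with its start/end time in seconds (hypothetical shape)."""
    word: str
    start_s: float
    end_s: float


@dataclass(frozen=True)
class Cutaway:
    """A short visual cut placed on the timeline (hypothetical shape)."""
    asset_id: str
    start_s: float
    end_s: float


def plan_cutaways(
    words: list[WordTiming],
    triggers: dict[str, str],
    duration_s: float = 0.5,  # assumed default, not a documented Engine setting
) -> list[Cutaway]:
    """Emit one cutaway for every word that matches a trigger.

    `triggers` maps a lowercase trigger word to the asset to show.
    """
    cuts = []
    for w in words:
        # Normalize so "Rocket," still matches the trigger "rocket".
        key = w.word.lower().strip(".,!?;:\"'")
        asset_id = triggers.get(key)
        if asset_id is not None:
            cuts.append(Cutaway(asset_id, w.start_s, w.start_s + duration_s))
    return cuts
```

A caller would pass the word timings from the TTS step together with a trigger map such as `{"rocket": "asset-1"}`; each match starts a cut at the moment the word is spoken.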
engine/
├── mcp/ # MCP Server (Python)
│ ├── src/engine_mcp/ # Source code
│ ├── tests/ # Test suite
│ └── README.md # MCP server docs
│
├── website/ # Next.js Web Interface (Coming Soon)
│ └── README.md
│
├── supabase/ # Database & Migrations
│ ├── migrations/ # SQL migrations
│ └── config.toml # Supabase config
│
├── scripts/ # Development scripts
│
├── docs/ # Documentation
│ └── OVERVIEW.md # Implementation guide
│
├── CLAUDE.md # Project instructions
├── IMPLEMENTATION_STATUS.md # Feature tracking
├── .env.example # Environment template
└── README.md # This file
- Python 3.11+
- Node.js 18+ (for website)
- Supabase CLI
- uv package manager
# Clone repository
git clone <your-repo-url>
cd engine
# Copy environment template
cp .env.example .env
# Edit .env with your API keys
nano .env
# Install MCP dependencies
cd mcp
uv sync
# Start local Supabase
cd ..
supabase start
# Run migrations
supabase db reset
# Run tests
cd mcp
uv run pytest tests/ -v
Get API keys from:
- Gemini: https://ai.google.dev/
- OpenAI: https://platform.openai.com/
- ElevenLabs: https://elevenlabs.io/
- Pexels: https://www.pexels.com/api/ (free)
This project uses a placeholder-first implementation strategy:
- ✅ Database schema with pgvector
- ✅ Async connection pooling
- ✅ Pydantic models
- ✅ Placeholder integrations
- ✅ Test infrastructure
- Week 2: Video processing (YouTube, Gemini, OpenAI)
- Week 3: Assets & TTS (ElevenLabs, Pexels)
- Week 4: Script building & export
- Week 5: Polish & production readiness
See IMPLEMENTATION_STATUS.md for detailed tracking.
Important: We use specific terms for clarity:
- Quotes - Speech clips WITH audio (dialogue that will be heard)
- Assets - Visual footage with NO audio (source clips muted, stock footage, user uploads)
- Segments - Script building blocks (narrator text or quote clip)
- Visual Layers - Multi-layer composition (base video + overlays)
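The terminology above can be pictured as a small data model. The following is only a sketch using standard-library dataclasses; the class and field names are assumptions made for illustration and do not reflect the project's actual Pydantic models or database columns.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class Quote:
    """Speech clip WITH audio, cut from a source video."""
    source_video_id: str
    start_s: float
    end_s: float
    transcript: str


@dataclass
class Asset:
    """Visual footage with NO audio (always played muted)."""
    uri: str
    origin: Literal["source_clip", "stock", "upload"]


@dataclass
class Segment:
    """One script building block: narrator text or a quote clip."""
    kind: Literal["narrator", "quote"]
    narrator_text: Optional[str] = None
    quote: Optional[Quote] = None

    def __post_init__(self) -> None:
        # Enforce that each kind carries its matching payload.
        if self.kind == "narrator" and not self.narrator_text:
            raise ValueError("narrator segment needs narrator_text")
        if self.kind == "quote" and self.quote is None:
            raise ValueError("quote segment needs a quote")


@dataclass
class VisualLayer:
    """One layer of the composition shown over a segment."""
    asset: Asset
    z_index: int = 0  # assumed convention: 0 = base video, higher = overlay
```

The key distinction the model keeps is that audio only ever comes from a `Quote` or from narrator text, while an `Asset` contributes picture only.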
- Python 3.11+ with async/await
- uv - Fast Python package manager
- MCP SDK - Model Context Protocol
- Pydantic v2 - Data validation
- asyncpg - Async PostgreSQL driver
- Supabase - PostgreSQL with real-time features
- pgvector - Vector similarity search (1536-dim embeddings)
- Gemini 2.5 Flash - Video/image analysis
- OpenAI - Text embeddings (text-embedding-3-small)
- ElevenLabs - TTS with word timestamps
- yt-dlp - YouTube video search
- Pexels API - Free stock footage
- Next.js 14 - React framework
- TypeScript - Type safety
- Tailwind CSS - Styling
- shadcn/ui - UI components
- projects - Top-level user projects (UUID)
- source_videos - Multi-platform video metadata
- quotes - Speech clips with rich AI descriptions
- assets - Visual footage with embeddings
- scripts - Script versions
- script_segments - Ordered script segments
- visual_layers - Multi-layer composition
- rapid_fire_cutaways - Word-triggered quick cuts
- processing_jobs - Async task tracking
- api_costs - Granular cost tracking
- exports - Export packages (XML + ZIP)
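To make the embedding-based asset search concrete, here is a pure-Python sketch of ranking by cosine distance, the metric pgvector exposes through its `<=>` operator. It uses toy 3-dimensional vectors instead of 1536-dimensional embeddings, and `cosine_distance` and `rank_assets` are hypothetical helpers; in the real system the ranking would presumably happen inside PostgreSQL rather than in Python.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity, so 0.0 means 'same direction'."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def rank_assets(
    query: list[float],
    assets: dict[str, list[float]],
    top_k: int = 3,
) -> list[str]:
    """Return the names of the top_k assets closest to the query embedding."""
    scored = sorted(
        (cosine_distance(query, emb), name) for name, emb in assets.items()
    )
    return [name for _, name in scored[:top_k]]
```

Smaller distance means a closer semantic match, so the nearest assets come first in the result.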
cd mcp
# Run fast tests only
uv run pytest tests/ -m "not slow"
# Run all tests
uv run pytest tests/ -v
# Run specific test file
uv run pytest tests/integration/test_database.py -v
# Run with coverage
uv run pytest tests/ --cov=engine_mcp --cov-report=html
- docs/OVERVIEW.md - Comprehensive implementation guide
- mcp/README.md - MCP server documentation
- IMPLEMENTATION_STATUS.md - Feature tracking
- CLAUDE.md - Project instructions for Claude Code
All API calls are tracked with granular details:
- Gemini: Input tokens + images + output tokens + thinking tokens
- OpenAI: Embedding tokens
- ElevenLabs: Characters
- Pexels: Free but tracked for transparency
Each project's cost accumulates in real time in projects.total_cost_usd.
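A minimal sketch of how per-call usage records might roll up into a project total is shown below. `ApiCost` and `total_cost_usd` are hypothetical names, and the caller supplies the unit price because this README does not state any provider pricing. `Decimal` is used to avoid floating-point drift when summing many small amounts.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable


@dataclass(frozen=True)
class ApiCost:
    """One tracked API call (hypothetical shape, not the api_costs schema)."""
    provider: str          # e.g. "gemini", "openai", "elevenlabs", "pexels"
    unit: str              # e.g. "input_tokens", "characters", "requests"
    quantity: int
    usd_per_unit: Decimal  # supplied by the caller; no real pricing assumed

    @property
    def cost_usd(self) -> Decimal:
        return self.quantity * self.usd_per_unit


def total_cost_usd(records: Iterable[ApiCost]) -> Decimal:
    """Sum the cost of all records, as a running project total would."""
    return sum((r.cost_usd for r in records), Decimal("0"))
```

Free services such as Pexels fit the same shape with a unit price of zero, so their calls still appear in the record without changing the total.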
This is currently a personal project following the implementation plan in docs/OVERVIEW.md. See IMPLEMENTATION_STATUS.md for what's being worked on.
MIT
Status: Week 1 Complete - Database & Infrastructure Ready ✅
Next: Week 2 - Implement real video processing integrations