Decycle/engine
Engine - AI-Powered Video Editing Automation

Version: 2.0
Status: Week 1 - Database & Infrastructure Complete
Architecture: MCP Server (Python) + Supabase + Multi-LLM Integration


Overview

Engine is an MCP server that enables AI agents to automate documentary-style video editing workflows. It extracts quotes (speech clips) and assets (visual footage) from source videos, builds scripts with narrator voiceover, and exports to professional video editing formats.

Key Features

  • 🎬 Multi-platform video search (YouTube, TikTok, Facebook, etc.)
  • 🗣️ Quote extraction - AI-powered speech clip analysis with Gemini 2.5 Flash
  • 🎨 Asset management - Visual footage with semantic search
  • 📝 Script building - Narrator TTS with word-level timing
  • 🎥 Multi-layer composition - Picture-in-picture, rapid-fire cutaways
  • 💰 Real-time cost tracking - Granular API usage monitoring
  • 📦 Professional export - Premiere Pro XML format

Project Structure

engine/
├── mcp/                    # MCP Server (Python)
│   ├── src/engine_mcp/     # Source code
│   ├── tests/              # Test suite
│   └── README.md           # MCP server docs
│
├── website/                # Next.js Web Interface (Coming Soon)
│   └── README.md
│
├── supabase/               # Database & Migrations
│   ├── migrations/         # SQL migrations
│   └── config.toml         # Supabase config
│
├── scripts/                # Development scripts
│
├── docs/                   # Documentation
│   └── OVERVIEW.md         # Implementation guide
│
├── CLAUDE.md               # Project instructions
├── IMPLEMENTATION_STATUS.md # Feature tracking
├── .env.example            # Environment template
└── README.md               # This file

Quick Start

Prerequisites

  • Python 3.11+
  • Node.js 18+ (for website)
  • Supabase CLI
  • uv package manager

Installation

# Clone repository
git clone <your-repo-url>
cd engine

# Copy environment template
cp .env.example .env

# Edit .env with your API keys
nano .env

# Install MCP dependencies
cd mcp
uv sync

# Start local Supabase
cd ..
supabase start

# Run migrations
supabase db reset

# Run tests
cd mcp
uv run pytest tests/ -v

Environment Setup

Get API keys for the services listed under AI/ML Services below (Gemini, OpenAI, ElevenLabs, Pexels) and add them to .env.


Development Approach

This project uses a placeholder-first implementation strategy:

Week 1: Foundation ✅ COMPLETE

  • ✅ Database schema with pgvector
  • ✅ Async connection pooling
  • ✅ Pydantic models
  • ✅ Placeholder integrations
  • ✅ Test infrastructure

Weeks 2-5: Real Implementations (In Progress)

  • Week 2: Video processing (YouTube, Gemini, OpenAI)
  • Week 3: Assets & TTS (ElevenLabs, Pexels)
  • Week 4: Script building & export
  • Week 5: Polish & production readiness

See IMPLEMENTATION_STATUS.md for detailed tracking.


Terminology

Important: We use specific terms for clarity:

  • Quotes - Speech clips WITH audio (dialogue that will be heard)
  • Assets - Visual footage with NO audio (source clips muted, stock footage, user uploads)
  • Segments - Script building blocks (narrator text or quote clip)
  • Visual Layers - Multi-layer composition (base video + overlays)

Tech Stack

Backend (MCP Server)

  • Python 3.11+ with async/await
  • uv - Fast Python package manager
  • MCP SDK - Model Context Protocol
  • Pydantic v2 - Data validation
  • asyncpg - Async PostgreSQL driver

Database

  • Supabase - PostgreSQL with real-time features
  • pgvector - Vector similarity search (1536-dim embeddings)

AI/ML Services

  • Gemini 2.5 Flash - Video/image analysis
  • OpenAI - Text embeddings (text-embedding-3-small)
  • ElevenLabs - TTS with word timestamps
  • yt-dlp - YouTube video search
  • Pexels API - Free stock footage

Frontend (Coming Soon)

  • Next.js 14 - React framework
  • TypeScript - Type safety
  • Tailwind CSS - Styling
  • shadcn/ui - UI components

Database Schema

Core Tables

  • projects - Top-level user projects (UUID)
  • source_videos - Multi-platform video metadata
  • quotes - Speech clips with rich AI descriptions
  • assets - Visual footage with embeddings
  • scripts - Script versions
  • script_segments - Ordered script segments
  • visual_layers - Multi-layer composition
  • rapid_fire_cutaways - Word-triggered quick cuts
  • processing_jobs - Async task tracking
  • api_costs - Granular cost tracking
  • exports - Export packages (XML + ZIP)
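Semantic search over the assets table combines the embeddings column with pgvector's distance operator. A hypothetical query sketch (table and column names are assumed, not verified against the actual migrations); the vector-literal helper is a plain function so it runs without a database:

```python
def to_pgvector_literal(embedding: list[float]) -> str:
    """Format a Python float list as a pgvector input literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(f"{x:g}" for x in embedding) + "]"

# Hypothetical asyncpg usage -- requires a live Supabase/PostgreSQL instance.
# '<=>' is pgvector's cosine-distance operator; smaller means more similar.
SEARCH_SQL = """
    SELECT id, description, embedding <=> $1::vector AS distance
    FROM assets
    ORDER BY embedding <=> $1::vector
    LIMIT $2
"""

async def search_assets(pool, query_embedding: list[float], limit: int = 10):
    """Return the closest assets to a 1536-dim query embedding."""
    async with pool.acquire() as conn:
        return await conn.fetch(SEARCH_SQL, to_pgvector_literal(query_embedding), limit)
```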

Testing

cd mcp

# Run fast tests only
uv run pytest tests/ -m "not slow"

# Run all tests
uv run pytest tests/ -v

# Run specific test file
uv run pytest tests/integration/test_database.py -v

# Run with coverage
uv run pytest tests/ --cov=engine_mcp --cov-report=html

Documentation

  • docs/OVERVIEW.md - Comprehensive implementation guide
  • mcp/README.md - MCP server documentation
  • IMPLEMENTATION_STATUS.md - Feature tracking
  • CLAUDE.md - Project instructions for Claude Code

Cost Tracking

All API calls are tracked with granular details:

  • Gemini: Input tokens + images + output tokens + thinking tokens
  • OpenAI: Embedding tokens
  • ElevenLabs: Characters
  • Pexels: Free but tracked for transparency

Project costs accumulate in real time in projects.total_cost_usd.
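The per-provider breakdown above suggests a simple accumulation model: each API call writes a granular usage row, and the project total is the sum of row costs. A sketch with illustrative per-unit prices (NOT current vendor pricing) and an in-memory stand-in for the api_costs table:

```python
from dataclasses import dataclass, field

# Illustrative per-unit prices in USD -- assumptions, not current vendor rates.
PRICES = {
    ("gemini", "input_token"): 0.30 / 1_000_000,
    ("gemini", "output_token"): 2.50 / 1_000_000,
    ("openai", "embedding_token"): 0.02 / 1_000_000,
    ("elevenlabs", "character"): 0.00011,
    ("pexels", "request"): 0.0,   # free, but tracked for transparency
}

@dataclass
class CostTracker:
    """Accumulates granular API usage, mirroring the
    api_costs -> projects.total_cost_usd flow."""
    rows: list = field(default_factory=list)

    def record(self, provider: str, unit: str, quantity: int) -> float:
        """Record one usage event and return its cost in USD."""
        cost = PRICES[(provider, unit)] * quantity
        self.rows.append({"provider": provider, "unit": unit,
                          "quantity": quantity, "cost_usd": cost})
        return cost

    @property
    def total_cost_usd(self) -> float:
        return sum(r["cost_usd"] for r in self.rows)
```

Tracking free providers like Pexels as zero-cost rows keeps the usage log complete without affecting the total.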


Contributing

This is currently a personal project following the implementation plan in docs/OVERVIEW.md. See IMPLEMENTATION_STATUS.md for what's being worked on.


License

MIT


Status: Week 1 Complete - Database & Infrastructure Ready ✅ Next: Week 2 - Implement real video processing integrations
