Programmer-5090/Hocus-Pocus

Hocus Pocus - Terminal-Based Audio Identification System

Inspired by Shazam, Hocus Pocus brings real-time audio identification to your command line

Features

Core Capabilities

  • Real-time Audio Recognition - Identify songs playing around you in seconds via terminal interface
  • Multi-format Support - Process MP3, WAV, FLAC, M4A, AAC, OGG, and WMA files
  • Advanced Fingerprinting - Constellation mapping algorithm for robust audio signatures
  • Database Management - Efficient SQLite storage with automatic optimization
  • Batch Processing - Upload entire music libraries with recursive folder scanning
  • Interactive Terminal Interface - Professional command-line interface with progress tracking

Technical Features

  • Spectrogram Analysis - High-quality FFT-based audio processing
  • Peak Detection - Intelligent identification of spectral landmarks
  • Constellation Mapping - Robust fingerprint generation from audio peaks
  • Optimized Matching - Fast database queries with indexed fingerprint lookup
  • Audio Visualization - Generate spectrograms and constellation maps
  • Background Processing - Non-blocking audio analysis and database operations
  • Modular Architecture - Clean separation of concerns with dedicated CLI, core, and audio modules
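As an illustration of the peak detection step, here is a minimal pure-Python sketch that treats a spectrogram cell as a spectral landmark when it is strictly louder than every neighbour and above a noise floor. The function name `find_peaks`, the `neighborhood` parameter, and the nested-list layout are assumptions made for this example; the project's own detector may work differently.

```python
def find_peaks(spectrogram, neighborhood=1, min_db=-80.0):
    """Return (time_frame, freq_bin) pairs that are strict local maxima.

    `spectrogram` is a list of rows, one row per frequency bin, each row
    holding one dB value per time frame (hypothetical layout).
    """
    n_freq = len(spectrogram)
    n_time = len(spectrogram[0]) if n_freq else 0
    peaks = []
    for f in range(n_freq):
        for t in range(n_time):
            value = spectrogram[f][t]
            if value <= min_db:
                continue  # at or below the noise floor, never a landmark
            neighbours = [
                spectrogram[ff][tt]
                for ff in range(max(0, f - neighborhood), min(n_freq, f + neighborhood + 1))
                for tt in range(max(0, t - neighborhood), min(n_time, t + neighborhood + 1))
                if (ff, tt) != (f, t)
            ]
            if all(value > other for other in neighbours):
                peaks.append((t, f))
    return peaks


# Tiny 3-bin x 4-frame spectrogram with two obvious landmarks.
grid = [
    [-80, -60, -80, -80],
    [-60, -10, -60, -80],
    [-80, -60, -80, -20],
]
peaks = find_peaks(grid)
```

A production detector would normally run a vectorized maximum filter over the spectrogram instead of nested loops; the loops are kept here only to make the rule explicit.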

Quick Start

Prerequisites

  1. Python 3.8+ with pip
  2. FFmpeg for audio format conversion
    • Windows: Download from ffmpeg.org
    • macOS: brew install ffmpeg
    • Linux: sudo apt install ffmpeg

Installation

# Clone the repository
git clone https://github.com/Programmer-5090/hocus-pocus.git
cd hocus-pocus

# Install dependencies
pip install -r requirements.txt
# or
pip install -e .

# Run the application
python main.py

First Run

  1. Add Music to Database

    python main.py
    # Choose 'upload' to add songs from a folder
  2. Identify Audio

    # Play a song on your speakers/phone
    python main.py
    # Choose 'yes' when prompted to identify

Usage Guide

Interactive Mode

The main interface provides several options:

  • yes - Record audio and identify the playing song
  • upload - Add songs from a folder to the database
  • no - Skip the current session
  • quit - Exit the application

Database Management

Hocus Pocus automatically optimizes your database for performance and reports its current status:

Database Statistics:
Total songs: 1,928
Total fingerprints: 85,748,924
Database size: 4.27 GB
Status: Optimized
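The README does not say what the optimization step does internally. One plausible reading, sketched below with Python's built-in `sqlite3` module, is to make sure the fingerprint hash column is indexed and then run SQLite's `ANALYZE` and `VACUUM`. The table names, column names, and the functions `database_stats` and `optimize` are assumptions for illustration only.

```python
import sqlite3


def database_stats(conn):
    """Collect the kind of numbers shown in the statistics block."""
    songs = conn.execute("SELECT COUNT(*) FROM songs").fetchone()[0]
    fingerprints = conn.execute("SELECT COUNT(*) FROM fingerprints").fetchone()[0]
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return {
        "songs": songs,
        "fingerprints": fingerprints,
        "size_bytes": page_count * page_size,
    }


def optimize(conn):
    """Index the hash column, refresh planner statistics, compact the file."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_fingerprints_hash ON fingerprints(hash)"
    )
    conn.execute("ANALYZE")
    conn.commit()            # VACUUM cannot run inside an open transaction
    conn.execute("VACUUM")


# Hypothetical schema, populated with a few rows for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE fingerprints (hash INTEGER NOT NULL, "
    "song_id INTEGER NOT NULL, time_offset INTEGER NOT NULL)"
)
conn.execute("INSERT INTO songs (id, title) VALUES (1, 'Example Song')")
conn.executemany(
    "INSERT INTO fingerprints VALUES (?, ?, ?)",
    [(101, 1, 0), (202, 1, 5), (303, 1, 9)],
)
optimize(conn)
stats = database_stats(conn)
index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
```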

Batch Upload Features

  • Recursive Scanning - Process nested folder structures
  • Progress Tracking - Real-time upload status with success rates
  • Error Handling - Detailed reporting of failed imports
  • Metadata Extraction - Automatic artist/title detection from filenames
  • Format Validation - Skip unsupported file types automatically
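The scanning, format validation, and filename parsing described above can be sketched with the standard library alone. The helper names `find_audio_files` and `parse_metadata`, and the assumption that filenames follow an "Artist - Title" pattern, are illustrative; the project's actual parsing rules are not documented here.

```python
import tempfile
from pathlib import Path

# Extensions taken from the supported formats listed in this README.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".aac", ".ogg", ".wma"}


def find_audio_files(root):
    """Recursively collect supported audio files, skipping everything else."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def parse_metadata(path):
    """Split 'Artist - Title.ext' into (artist, title); artist may be None."""
    stem = Path(path).stem
    artist, separator, title = stem.partition(" - ")
    if not separator:
        return None, stem.strip()
    return artist.strip(), title.strip()


# Demonstration on a throwaway folder tree.
with tempfile.TemporaryDirectory() as tmp:
    album = Path(tmp) / "album"
    album.mkdir()
    (album / "Some Artist - Some Title.mp3").touch()
    (Path(tmp) / "untitled.FLAC").touch()
    (Path(tmp) / "cover.jpg").touch()   # unsupported, should be skipped
    found_names = [path.name for path in find_audio_files(tmp)]
    metadata = [parse_metadata(name) for name in found_names]
```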

Architecture

Core Components

src/
├── audio/           # Audio processing pipeline
│   ├── audio_loader.py          # Multi-format audio loading
│   ├── audio_recorder.py        # Microphone recording
│   ├── spectrogram_processor.py # FFT analysis
│   └── audio_visualizer.py      # Plotting and visualization
├── cli/             # Terminal interface components
│   ├── database_optimizer.py    # Database optimization
│   ├── folder_upload.py         # Batch upload functionality
│   ├── identification.py        # Audio identification
│   └── interface.py             # User interface and display
├── core/            # Core identification engine
│   ├── engine.py                # Main orchestration
│   └── fingerprint_generator.py # Constellation mapping
└── database/        # Data persistence
    └── database_manager.py      # SQLite operations

Processing Pipeline

  1. Audio Loading - Multi-format support via FFmpeg
  2. Preprocessing - Normalization and resampling
  3. Spectrogram Generation - FFT-based frequency analysis
  4. Peak Detection - Spectral landmark identification
  5. Fingerprint Creation - Constellation map generation
  6. Database Storage - Indexed fingerprint storage
  7. Matching Algorithm - Efficient similarity search
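The matching step is not spelled out in this README. A common approach for constellation fingerprints, assumed in the sketch below, is to look up each query hash and vote on the difference between the stored offset and the query offset: a true match produces many hashes that agree on one offset difference. The function `best_match` and the in-memory `index` dictionary are hypothetical stand-ins for the database lookup.

```python
from collections import Counter


def best_match(query_hashes, index):
    """Vote on (song_id, offset difference) and return the winner.

    query_hashes: list of (hash, query_offset) pairs from the recording.
    index: dict mapping hash -> list of (song_id, song_offset) pairs.
    Returns (song_id, offset_difference, votes) or None if nothing matched.
    """
    votes = Counter()
    for hash_value, query_offset in query_hashes:
        for song_id, song_offset in index.get(hash_value, ()):
            votes[(song_id, song_offset - query_offset)] += 1
    if not votes:
        return None
    (song_id, delta), count = votes.most_common(1)[0]
    return song_id, delta, count


# Song "A" contains the query starting 10 frames in; song "B" shares one hash by chance.
index = {
    1: [("A", 10)],
    2: [("A", 12), ("B", 3)],
    3: [("A", 15)],
}
query = [(1, 0), (2, 2), (3, 5)]
result = best_match(query, index)
```

Counting aligned offsets rather than raw hash hits is what makes this style of matching tolerant of chance collisions: song "B" gets one vote, but it cannot accumulate votes at a single consistent offset.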

Configuration

Audio Processing Settings

AUDIO_CONFIG = {
    'default_sample_rate': 22050,  # Hz
    'fft_size': 2048,              # FFT window size
    'hop_length': 512,             # Samples between successive windows
    'db_floor': -80                # Noise floor (dB)
}
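To make these settings concrete, the short calculation below derives the time and frequency resolution they imply. The frame count assumes no padding at the clip edges, which is an assumption of this example; libraries that center-pad each frame report a slightly higher count.

```python
AUDIO_CONFIG = {
    'default_sample_rate': 22050,  # Hz
    'fft_size': 2048,              # FFT window size
    'hop_length': 512,             # Samples between successive windows
    'db_floor': -80                # Noise floor (dB)
}

sample_rate = AUDIO_CONFIG['default_sample_rate']
fft_size = AUDIO_CONFIG['fft_size']
hop_length = AUDIO_CONFIG['hop_length']

freq_resolution_hz = sample_rate / fft_size      # width of one frequency bin
n_freq_bins = fft_size // 2 + 1                  # bins from 0 Hz to Nyquist
hop_seconds = hop_length / sample_rate           # time step between frames
window_seconds = fft_size / sample_rate          # duration covered by one frame
overlap_samples = fft_size - hop_length          # samples shared by adjacent frames

# Frames in a 10-second clip, assuming no edge padding.
clip_samples = 10 * sample_rate
n_frames = 1 + (clip_samples - fft_size) // hop_length

print(f"{freq_resolution_hz:.2f} Hz per bin, {n_freq_bins} bins")
print(f"{hop_seconds * 1000:.1f} ms per hop, {window_seconds * 1000:.1f} ms per window")
print(f"{overlap_samples} overlapping samples, {n_frames} frames in 10 s")
```

With these values, adjacent windows overlap by 75% (1536 of 2048 samples).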

Fingerprinting Parameters

FINGERPRINT_CONFIG = {
    'fan_value': 5,                # Peaks per anchor point
    'target_zone': (1, 20)         # Time delta range
}
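A minimal sketch of how these two parameters could drive fingerprint creation: each peak acts as an anchor and is paired with up to `fan_value` later peaks whose time difference falls inside `target_zone`. This assumes that `target_zone` is measured in spectrogram frames and that a fingerprint is the triple (anchor frequency, target frequency, time delta); neither detail is confirmed by this README, and `generate_fingerprints` is a hypothetical name.

```python
def generate_fingerprints(peaks, fan_value=5, target_zone=(1, 20)):
    """Pair each anchor peak with nearby later peaks.

    peaks: iterable of (time_frame, freq_bin).
    Returns a list of ((anchor_freq, target_freq, time_delta), anchor_time).
    """
    ordered = sorted(peaks)
    min_delta, max_delta = target_zone
    fingerprints = []
    for i, (anchor_time, anchor_freq) in enumerate(ordered):
        paired = 0
        for target_time, target_freq in ordered[i + 1:]:
            delta = target_time - anchor_time
            if delta > max_delta:
                break              # peaks are time-sorted, nothing later fits
            if delta < min_delta:
                continue           # too close to the anchor
            fingerprints.append(((anchor_freq, target_freq, delta), anchor_time))
            paired += 1
            if paired == fan_value:
                break              # this anchor has enough targets
    return fingerprints


example_peaks = [(0, 100), (0, 200), (3, 150), (30, 120)]
example_fingerprints = generate_fingerprints(example_peaks)
```

In this example the peak at frame 30 is never paired, because it lies more than 20 frames after every possible anchor. Raising `fan_value` produces more fingerprints per song, which generally trades database size for robustness.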

Performance Tuning

PERFORMANCE_CONFIG = {
    'max_query_duration': 30.0,    # Max recording time (seconds)
    'batch_size_fingerprints': 1000 # Database batch size
}
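The `batch_size_fingerprints` setting suggests that fingerprints are written in chunks rather than one row at a time. A sketch of that pattern with `sqlite3.Connection.executemany` follows; the table layout and the helpers `chunked` and `store_fingerprints` are assumptions for illustration.

```python
import sqlite3
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def store_fingerprints(conn, rows, batch_size=1000):
    """Insert rows in batches, one transaction per batch; return batch count."""
    batches = 0
    for batch in chunked(rows, batch_size):
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO fingerprints (hash, song_id, time_offset) "
                "VALUES (?, ?, ?)",
                batch,
            )
        batches += 1
    return batches


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fingerprints (hash INTEGER NOT NULL, "
    "song_id INTEGER NOT NULL, time_offset INTEGER NOT NULL)"
)
rows = ((n * 31, 1, n) for n in range(2500))   # synthetic fingerprints
batch_count = store_fingerprints(conn, rows, batch_size=1000)
stored = conn.execute("SELECT COUNT(*) FROM fingerprints").fetchone()[0]
```

Batching keeps memory bounded for songs with tens of thousands of fingerprints while avoiding the overhead of a commit per row.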

Performance

Benchmark Results

  • Identification Time: < 3 seconds for 10-second clips
  • Database Capacity: Currently optimized with 1,928 songs and 85.7M+ fingerprints
  • Memory Usage: ~50MB for typical database operations
  • Storage Efficiency: ~2.3MB per song (including fingerprints and metadata)
  • Database Performance: 4.27GB optimized database with indexed fingerprint lookup

Accuracy Metrics

  • Clean Audio: 95%+ identification rate
  • Noisy Environment: 80%+ with background noise
  • Partial Clips: 85%+ with 10+ second samples

Supported Formats

Format  Extension  FFmpeg  Native
WAV     .wav       Yes     Yes
MP3     .mp3       Yes     No
FLAC    .flac      Yes     No
M4A     .m4a       Yes     No
AAC     .aac       Yes     No
OGG     .ogg       Yes     No
WMA     .wma       Yes     No
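Reading the "Native" column as "loadable without FFmpeg" (an interpretation, not something this README states), format handling could look like the sketch below: WAV files pass through untouched and everything else is converted to mono WAV by the `ffmpeg` command-line tool. The function names are hypothetical; the flags `-y`, `-i`, `-ac`, and `-ar` are standard FFmpeg options.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=22050):
    """Build an ffmpeg call that converts `src` to mono WAV at `sample_rate`."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),           # input file
        "-ac", "1",               # downmix to one channel
        "-ar", str(sample_rate),  # resample
        str(dst),
    ]


def to_wav(src, dst, sample_rate=22050):
    """Return a WAV path for `src`, converting with ffmpeg only when needed."""
    src = Path(src)
    if src.suffix.lower() == ".wav":
        return src  # already loadable without conversion
    subprocess.run(
        build_ffmpeg_command(src, dst, sample_rate),
        check=True,           # raise CalledProcessError if ffmpeg fails
        capture_output=True,  # keep ffmpeg's log out of the terminal
    )
    return Path(dst)


command = build_ffmpeg_command("song.mp3", "song.wav")
```

Passing the arguments as a list rather than a shell string avoids quoting problems with filenames that contain spaces.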

Development

Project Structure

├── src/                # Source code
│   ├── audio/          # Audio processing components
│   ├── cli/            # Terminal interface modules
│   ├── core/           # Core identification engine
│   └── database/       # Data persistence layer
├── data/               # Database storage
├── output/             # Generated visualizations
├── tools/              # Utility scripts
├── tests/              # Test suite
├── config.py           # Configuration settings
├── main.py             # Entry point
└── pyproject.toml      # Project metadata

Running Tests

# Run test suite
python -m pytest tests/

# Debug specific components
python tests/debug_matching.py
python tests/debug_types.py

Database Tools

# Analyze database performance
python tools/analyze_database.py

# Manual optimization
python tools/optimize_database.py

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

# Install development dependencies
pip install -e ".[dev]"

# Run linting
flake8 src/
black src/

# Type checking
mypy src/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Inspired by the original Shazam algorithm
  • Built with passion for audio processing and music technology
  • Thanks to the open-source community for amazing libraries


Made with care for music lovers everywhere

Hocus Pocus - Where audio meets magic
