DMARC Aggregate Report Processor

Enterprise-Grade DMARC Analytics Platform

A production-ready platform that ingests, processes, and analyzes DMARC aggregate reports, with ML-powered threat detection, distributed task processing, and comprehensive security features.


🚀 Features

Core Functionality

  • 📧 Automated DMARC Report Ingestion - IMAP inbox monitoring with Celery task queue
  • 📤 Bulk File Upload - Drag-and-drop upload of 50-200 reports at once
  • 🔄 Idempotent Processing - SHA256-based duplicate prevention
  • 💾 PostgreSQL Storage - Production-grade relational database
  • 🔐 JWT Authentication - Role-based access control (Admin/Analyst/Viewer)
  • 🚀 RESTful API - FastAPI with auto-generated documentation
  • ✅ Comprehensive Testing - 70%+ code coverage enforced
  • 🐳 Docker Deployment - Single-command orchestration
  • 🔔 Multi-Channel Alerting - Email, Slack, Discord, Microsoft Teams
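The idempotent-processing guarantee above can be sketched in a few lines. The in-memory `seen_hashes` set stands in for the real PostgreSQL-backed table of ingested reports, and the function name is illustrative rather than the project's actual API:

```python
import hashlib

# Stand-in for the PostgreSQL table of already-ingested report digests.
seen_hashes: set[str] = set()

def ingest_report(raw_report: bytes) -> bool:
    """Store a report unless its SHA256 digest has been seen before.

    Returns True if the report was newly ingested, False for a duplicate.
    """
    digest = hashlib.sha256(raw_report).hexdigest()
    if digest in seen_hashes:
        return False  # duplicate: skip without reprocessing
    seen_hashes.add(digest)
    # ... parse the XML and persist rows here ...
    return True
```

Because the digest is computed over the raw bytes, re-fetching the same report from IMAP or re-uploading the same file is a no-op.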

🎯 Enterprise Features (NEW)

Phase 1: Distributed Task Processing

  • ⚡ Celery + Redis Queue - Asynchronous background job processing
  • 📅 Celery Beat Scheduler - Automated periodic tasks
    • Email ingestion every 15 minutes
    • Report processing every 5 minutes
    • Alert checks hourly
    • ML model training weekly
  • 🌸 Flower Dashboard - Real-time task monitoring at :5555
  • 🔄 Retry Logic - Exponential backoff with 3 attempts
  • 📊 Task Tracking - PostgreSQL result backend
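The retry behaviour above (three attempts with exponential backoff) is presumably wired up through Celery's own retry options; as a plain-Python illustration of the delay schedule, it amounts to:

```python
import time

def retry_with_backoff(func, max_attempts: int = 3, base_delay: float = 1.0):
    """Call func(), retrying on failure with delays of 1s, 2s, 4s, ...

    Illustrative only: the real tasks would rely on Celery's retry machinery
    rather than blocking the worker with time.sleep().
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: propagate the error
            time.sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ...
```
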

Phase 2: Authentication & Authorization

  • 🔑 JWT Authentication - Access tokens (15 min) + refresh tokens (7 days)
  • 👥 Role-Based Access Control - Admin, Analyst, Viewer roles
  • 🔐 API Key Management - Per-user API keys with SHA256 hashing
  • 🛡️ Password Security - bcrypt hashing (12 rounds)
  • 📝 User Management - Admin-only user creation (no self-registration)
  • 🔄 Token Refresh - Seamless token renewal
  • 📋 Audit Trail - User action tracking
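Per-user API keys with SHA256 hashing, as listed above, typically mean the plaintext key is shown to the user exactly once and only the digest is stored. A minimal stdlib sketch (function names are illustrative):

```python
import hashlib
import secrets

def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, stored_hash). Only the hash is persisted."""
    plaintext = secrets.token_urlsafe(32)
    stored_hash = hashlib.sha256(plaintext.encode()).hexdigest()
    return plaintext, stored_hash

def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Check a presented key against the stored SHA256 digest."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    # Constant-time comparison avoids leaking digest prefixes via timing.
    return secrets.compare_digest(candidate, stored_hash)
```
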

Phase 3: Enhanced Alerting

  • 🎯 Alert Lifecycle - Created → Acknowledged → Resolved
  • 🔕 Deduplication - SHA256 fingerprinting with cooldown periods
  • ⏰ Alert Suppressions - Time-based muting for maintenance windows
  • 📊 Alert History - Persistent storage with full lifecycle tracking
  • 📝 Configurable Rules - UI-based threshold management
  • 🔔 Teams Priority - Microsoft Teams notifications sent first
  • 📈 Alert Analytics - Trends, resolution times, acknowledgment rates
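The deduplication scheme above (SHA256 fingerprinting plus a cooldown) can be sketched as follows; the in-memory dict stands in for the persistent alert-history table, and the fingerprint inputs are a guess at what the real rules key on:

```python
import hashlib
import time

# fingerprint -> monotonic timestamp of the last time this alert fired
_last_fired: dict[str, float] = {}

def should_fire(rule_id: str, domain: str, cooldown_s: float = 3600.0) -> bool:
    """Suppress repeat alerts for the same rule/domain inside the cooldown."""
    fingerprint = hashlib.sha256(f"{rule_id}:{domain}".encode()).hexdigest()
    now = time.monotonic()
    last = _last_fired.get(fingerprint)
    if last is not None and now - last < cooldown_s:
        return False  # still within the cooldown window: deduplicate
    _last_fired[fingerprint] = now
    return True
```
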

Phase 4: ML Analytics & Geolocation

  • 🤖 Anomaly Detection - Isolation Forest ML model for suspicious IPs
  • 🌍 IP Geolocation - MaxMind GeoLite2 offline mapping
  • 🗺️ Country Heatmaps - Geographic visualization of email sources
  • 📊 Model Management - Training, versioning, deployment
  • 🔄 Automated Training - Weekly ML model updates (Sunday 2 AM)
  • 🎯 Daily Detection - Automatic anomaly scanning (3 AM)
  • 💾 90-Day Caching - Efficient geolocation data caching
  • 📈 Prediction History - ML prediction tracking and analytics
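An Isolation Forest flags points that are easy to isolate from the training distribution. The feature set below (messages per day, DMARC failure rate per source IP) is an assumption for illustration; the project's actual features are not documented here:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-source-IP features: [messages_per_day, dmarc_failure_rate]
rng = np.random.default_rng(42)
normal = np.column_stack([
    rng.normal(100, 10, 200),    # typical daily volume
    rng.normal(0.02, 0.01, 200), # typical failure rate
])
suspicious = np.array([[5000.0, 0.95]])  # a burst of failing mail

model = IsolationForest(contamination=0.01, random_state=42)
model.fit(normal)

# predict() returns 1 for inliers and -1 for anomalies
print(model.predict(suspicious))
```
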

Performance & Caching

  • ⚡ Redis Caching - 90%+ hit rate, sub-200ms response times
  • 🔧 Query Optimization - N+1 query elimination, indexed lookups
  • 📈 Auto-Invalidation - Cache clearing on new data
  • 🔄 Connection Pooling - Optimized database and cache connections
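The caching layer above follows the standard cache-aside pattern with invalidation on write. In this sketch a dict stands in for Redis (a real client would also set a TTL), and the key names are illustrative:

```python
import json
from typing import Any, Callable

cache: dict[str, str] = {}  # stand-in for Redis

def cached(key: str, compute: Callable[[], Any]) -> Any:
    """Return the cached value for key, computing and storing it on a miss."""
    if key in cache:
        return json.loads(cache[key])  # hit: skip the database entirely
    value = compute()
    cache[key] = json.dumps(value)     # real Redis: SET with an expiry
    return value

def invalidate(prefix: str) -> None:
    """Drop all cached entries under a prefix, e.g. after new reports arrive."""
    for key in [k for k in cache if k.startswith(prefix)]:
        del cache[key]
```
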

Visualizations

  • 📊 8 Interactive Charts:
    • DMARC results timeline (line chart)
    • Results by domain (bar chart)
    • Top source IPs (bar chart)
    • Disposition breakdown (pie chart)
    • SPF/DKIM alignment breakdown (stacked bar)
    • Policy compliance (doughnut chart)
    • Failure rate trend with moving average (line chart)
    • Top sending organizations (horizontal bar)

Advanced Filtering

  • 🔍 Source IP - Exact match or CIDR ranges
  • 🔍 Authentication - DKIM/SPF pass/fail
  • 📋 Disposition - None/Quarantine/Reject
  • 🏢 Organization - Sending organization filter
  • 📅 Date Range - Custom or preset ranges
  • 🌐 Domain - Multi-domain filtering
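Source-IP filtering with either an exact address or a CIDR range is a natural fit for the stdlib `ipaddress` module; a sketch of the matching logic (the function name is illustrative):

```python
import ipaddress

def matches_source_filter(ip: str, pattern: str) -> bool:
    """Match a source IP against an exact address or a CIDR range.

    "192.0.2.0/24" matches any host in that network; "192.0.2.17"
    matches only that address.
    """
    if "/" in pattern:
        # strict=False tolerates patterns like 192.0.2.17/24 (host bits set)
        return ipaddress.ip_address(ip) in ipaddress.ip_network(pattern, strict=False)
    return ipaddress.ip_address(ip) == ipaddress.ip_address(pattern)
```
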

Export Capabilities

  • 📄 CSV Exports - Reports, records, sources
  • 📑 PDF Reports - Professional summary with charts
  • 🔒 Rate Limiting - 10/min CSV, 5/min PDF
  • 🛡️ Security - CSV formula injection prevention

πŸ› οΈ Tech Stack

Backend

  • Framework: Python 3.11 + FastAPI
  • Task Queue: Celery + Redis
  • ML/Analytics: scikit-learn, NumPy, pandas
  • Geolocation: MaxMind GeoLite2 + geoip2
  • Auth: JWT (PyJWT), bcrypt
  • Database: PostgreSQL 15 + SQLAlchemy 2.0
  • Cache: Redis 7 (Alpine)
  • PDF: ReportLab

Frontend

  • Stack: Vanilla HTML/CSS/JS
  • Charts: Chart.js v4.4.0 for all visualizations
  • Web Server: Nginx (reverse proxy)

Infrastructure

  • Orchestration: Docker Compose
  • Services: Backend, Celery Worker, Celery Beat, PostgreSQL, Redis, Nginx, Flower
  • Monitoring: Flower dashboard for Celery tasks

📋 Prerequisites

Required

  • Docker & Docker Compose
  • MaxMind GeoLite2 database (free account)

Optional

  • Email account with IMAP access (for automated ingestion)
  • Microsoft Teams/Slack webhooks (for alerts)

🚀 Quick Start

1. Clone Repository

git clone <repo-url>
cd dmarc

2. Download MaxMind Database

  1. Sign up at: https://dev.maxmind.com/geoip/geolite2-free-geolocation-data
  2. Download GeoLite2-City.mmdb
  3. Place at: backend/data/GeoLite2-City.mmdb
mkdir -p backend/data
# Copy GeoLite2-City.mmdb to backend/data/

3. Configure Environment

cp .env.sample .env
# Edit .env with your settings

Required Settings:

# JWT Secret (generate with: python -c "import secrets; print(secrets.token_urlsafe(64))")
JWT_SECRET_KEY=your-secret-key-here

# Celery + Redis
USE_CELERY=true
CELERY_BROKER_URL=redis://redis:6379/0

# Database
DATABASE_URL=postgresql://dmarc:dmarc@db:5432/dmarc

# Email (optional - for automated ingestion)
EMAIL_HOST=imap.gmail.com
EMAIL_PORT=993
EMAIL_USER=your-email@example.com
EMAIL_PASSWORD=your-app-password

# Alerts (optional)
TEAMS_WEBHOOK_URL=https://your-teams-webhook

4. Start Services

docker compose up -d --build

Services:

  • backend - FastAPI application (port 8000)
  • celery-worker - Background task processor
  • celery-beat - Scheduled task scheduler
  • flower - Celery monitoring UI (port 5555)
  • db - PostgreSQL database
  • redis - Cache & message broker
  • nginx - Web server (port 80)

5. Run Database Migrations

docker compose exec backend alembic upgrade head

Migrations Applied:

  • 001 - Ingested reports table
  • 002 - DMARC reports & records tables
  • 003 - Performance indexes
  • 004 - Celery task tracking
  • 005 - User authentication
  • 006 - Enhanced alerting
  • 007 - ML analytics & geolocation

6. Create Admin User

docker compose exec backend python scripts/create_admin_user.py

Follow the prompts to create your first admin user.

7. Access the Platform

  • Dashboard: http://localhost (Nginx, port 80)
  • API docs: http://localhost:8000/docs (FastAPI, auto-generated)
  • Flower (Celery monitoring): http://localhost:5555

8. Login

Use the admin credentials you created to log in via the dashboard or API.


πŸ” Authentication

Login (Get JWT Token)

curl -X POST http://localhost:8000/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "email": "admin@example.com",
    "password": "your-password"
  }'

Response:

{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGc...",
  "refresh_token": "eyJ0eXAiOiJKV1QiLCJhbGc...",
  "token_type": "bearer"
}

Use Token in Requests

curl -H "Authorization: Bearer <access_token>" http://localhost:8000/api/reports
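The same flow in Python, for scripted access. The transport is injected so the sketch stays self-contained; in real use it would be `requests.post` against the endpoints shown above, and the class name is illustrative:

```python
from typing import Callable

class DmarcClient:
    """Minimal API client: logs in once, then attaches the bearer token."""

    def __init__(self, base_url: str, post: Callable):
        self.base_url = base_url
        self._post = post  # e.g. lambda url, json: requests.post(url, json=json).json()
        self.access_token: str | None = None

    def login(self, email: str, password: str) -> None:
        """POST /auth/login and keep the returned access token."""
        tokens = self._post(f"{self.base_url}/auth/login",
                            json={"email": email, "password": password})
        self.access_token = tokens["access_token"]

    def auth_headers(self) -> dict:
        """Headers for subsequent authenticated requests."""
        return {"Authorization": f"Bearer {self.access_token}"}
```
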

📑 API Endpoints

Authentication (/auth)

  • POST /auth/login - Login with email/password
  • POST /auth/refresh - Refresh access token
  • POST /auth/logout - Logout (invalidate tokens)

Users (/users)

  • GET /users/me - Get current user profile
  • GET /users - List all users (admin)
  • POST /users - Create user (admin)
  • PATCH /users/{id} - Update user (admin)
  • DELETE /users/{id} - Delete user (admin)
  • POST /users/api-keys - Generate API key

Core DMARC (/api)

  • GET /api/domains - List domains
  • GET /api/reports - List reports (paginated)
  • GET /api/reports/{id} - Get report details
  • POST /api/upload - Bulk file upload

Analytics & Rollup (/api/rollup)

  • GET /api/rollup/summary - Aggregate statistics
  • GET /api/rollup/sources - Top source IPs
  • GET /api/rollup/alignment - DKIM/SPF alignment
  • GET /api/rollup/timeline - Time-series data
  • GET /api/rollup/failure-trend - Failure rate trends

Exports (/api/export)

  • GET /api/export/reports/csv - Export reports CSV
  • GET /api/export/records/csv - Export records CSV
  • GET /api/export/sources/csv - Export sources CSV
  • GET /api/export/report/pdf - Generate PDF summary

Alerts (/alerts)

  • GET /alerts/history - Alert history
  • GET /alerts/rules - Alert rules
  • POST /alerts/rules - Create rule (admin)
  • PATCH /alerts/{id}/acknowledge - Acknowledge alert
  • PATCH /alerts/{id}/resolve - Resolve alert
  • POST /alerts/suppressions - Create suppression

ML Analytics (/analytics)

  • GET /analytics/geolocation/map - Country heatmap
  • GET /analytics/geolocation/lookup/{ip} - IP geolocation
  • GET /analytics/ml/models - List ML models
  • POST /analytics/ml/train - Train model (admin)
  • POST /analytics/ml/deploy - Deploy model (admin)
  • POST /analytics/anomalies/detect - Detect anomalies
  • GET /analytics/anomalies/recent - Recent predictions

Tasks (/tasks)

  • POST /tasks/trigger/email-ingestion - Trigger email fetch
  • POST /tasks/trigger/process-reports - Process pending reports
  • GET /tasks/status/{task_id} - Get task status

🎯 Role-Based Access

Role      Permissions
Admin     Full access: users, models, rules, all data
Analyst   Read/write: reports, alerts, analytics
Viewer    Read-only: dashboards, reports, analytics
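Since the three roles form a strict privilege ladder, access checks reduce to an ordered comparison. A sketch of how an endpoint guard might work (the real FastAPI dependency is presumably more involved):

```python
from enum import IntEnum

class Role(IntEnum):
    """Roles ordered by privilege, so a comparison enforces access."""
    VIEWER = 1
    ANALYST = 2
    ADMIN = 3

def require_role(user_role: Role, minimum: Role) -> None:
    """Raise if the user's role is below the endpoint's minimum role."""
    if user_role < minimum:
        raise PermissionError(f"requires {minimum.name}, got {user_role.name}")
```
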

📊 Monitoring

Flower Dashboard (Celery Tasks)

Access at http://localhost:5555

Monitors:

  • Active tasks
  • Task history
  • Worker status
  • Task schedules (Beat)

Scheduled Tasks

# View all schedules
docker compose exec celery-beat celery -A app.celery_app inspect scheduled

# Force run a task
docker compose exec celery-worker celery -A app.celery_app call \
  app.tasks.ml_tasks.train_anomaly_model_task

🧪 Testing

# Run all tests with coverage
docker compose exec backend pytest -v --cov=app

# Run specific test suite
docker compose exec backend pytest tests/unit/ -v
docker compose exec backend pytest tests/integration/ -v

# Generate HTML coverage report
docker compose exec backend pytest --cov=app --cov-report=html

Coverage: 70%+ enforced in CI/CD


📚 Documentation

Interactive API documentation is auto-generated by FastAPI at /docs; deployment notes live in backend/DEPLOYMENT.md (see Production Deployment below).

πŸ—οΈ Architecture

┌─────────────┐     ┌──────────────┐     ┌────────────┐
│   Nginx     │────▶│   Backend    │────▶│ PostgreSQL │
│   (Port 80) │     │  (FastAPI)   │     │    (DB)    │
└─────────────┘     └──────┬───────┘     └────────────┘
                           │
                           ▼
                    ┌─────────────┐     ┌──────────────┐
                    │    Redis    │◀───▶│Celery Worker │
                    │   (Broker)  │     │   + Beat     │
                    └─────────────┘     └──────────────┘
                           │
                           ▼
                    ┌─────────────┐
                    │   Flower    │
                    │  (Monitor)  │
                    └─────────────┘

🔧 Development

# View logs
docker compose logs -f backend
docker compose logs -f celery-worker

# Rebuild after code changes
docker compose up --build -d backend

# Create new migration
docker compose exec backend alembic revision --autogenerate -m "description"

# Reset database (WARNING: deletes all data)
docker compose down -v
docker compose up -d
docker compose exec backend alembic upgrade head
docker compose exec backend python scripts/create_admin_user.py

🚢 Production Deployment

See backend/DEPLOYMENT.md for:

  • SSL/TLS with Let's Encrypt
  • Database backups
  • Security hardening
  • Performance tuning
  • Monitoring setup

📈 System Requirements

Minimum:

  • CPU: 2 cores
  • RAM: 4GB
  • Storage: 10GB

Recommended:

  • CPU: 4+ cores
  • RAM: 8GB
  • Storage: 50GB+ (depends on volume)

📄 License

MIT


Version: 2.0.0 (Enterprise Edition)
Last Updated: January 2026
Status: ✅ Production Ready
