GASBench

A benchmark evaluation package for discriminative models on Bittensor Subnet 34 (GAS - Generative Adversarial Subnet).

GASBench evaluates AI-generated content detection models across image, video, and audio modalities.

Overview

This package provides a self-contained benchmark evaluation system for testing models on diverse datasets:

Image, Video & Audio Benchmarks
Test discriminative models on curated datasets for AI-generated content detection
Data Processing
Dataset download, caching, preprocessing, and augmentation with aspect ratio preservation for image/video, and standardized resampling/windowing for audio
Comprehensive Metrics
Accuracy, MCC, cross-entropy, inference times, and per-dataset breakdowns

For detailed ONNX model requirements for all three modalities, see the
👉 ONNX Model Specification

Installation

GPU support is installed by default:

cd gasbench
pip install -e .

For CPU-only (no CUDA):

pip install -e .[cpu]

Usage

Command Line Interface

# Run image benchmark
gasbench model.onnx image

# Run video benchmark
gasbench model.onnx video

# Run audio benchmark
gasbench model.onnx audio

# Debug mode (faster, smaller datasets)
gasbench model.onnx image --debug
gasbench model.onnx video --debug
gasbench model.onnx audio --debug

# Custom cache directory
gasbench model.onnx video --cache-dir /tmp/my_cache

# Save JSON results to a specific directory
gasbench model.onnx image --output-dir ./results
gasbench model.onnx video --output-dir ./results
gasbench model.onnx audio --output-dir ./results

# For SN34 miners: Use only gasstation datasets
gasbench model.onnx image --gasstation-only
gasbench model.onnx video --gasstation-only
gasbench model.onnx audio --gasstation-only

Results are automatically saved to a timestamped JSON file.

Python API

import asyncio
from gasbench import run_benchmark, print_benchmark_summary, save_results_to_json

async def evaluate_model():
    results = await run_benchmark(
        model_path="path/to/model.onnx",
        modality="image",  # "image" | "video" | "audio"
        debug_mode=False,
        gasstation_only=False,
    )

    print_benchmark_summary(results)

    output_path = save_results_to_json(results, output_dir="./results")
    return results

results = asyncio.run(evaluate_model())

Model Requirements (High-Level)

Models must be exported as ONNX classifiers (image, video, or audio).
GASBench expects batched inputs and logits outputs.
Image/video preprocessing (resize/crop/augment) and audio preprocessing (mono/resample/crop) are handled by GASBench.
Both binary (real vs synthetic) and 3-class (real, semisynthetic, synthetic) models are supported.

For exact ONNX requirements, see:
👉 ONNX Model Specification

Metrics

benchmark_score
sn34_score
binary_mcc
binary_cross_entropy
avg_inference_time_ms
p95_inference_time_ms

Cache Directory

By default, datasets are cached at /tmp/benchmark_data/. Specify a custom location with cache_dir parameter or --cache-dir flag.

JSON Output Structure

A structured JSON summary is automatically generated after each run:

{
  "metadata": {
    "run_id": "64a8c5eb-560a-4822-9ae5-b51d27737831",
    "timestamp": 1765074390.323,
    "datetime": "2025-12-07T02:26:30.323068",
    "model_path": "model.onnx",
    "modality": "audio",
    "mode": "full",
    "gasstation_only": false,
    "benchmark_completed": true,
    "duration_seconds": 1054.31
  },
  "overall_score": 0.8523,
  "validation": {
    "model_path": "model.onnx",
    "input_shape": "['batch_size', 96000]",
    "input_type": "tensor(float)",
    "output_shape": "['batch_size', 2]"
  },
  "errors": [],
  "results": {
    "total_samples": 5000,
    "correct_predictions": 4261,
    "accuracy": 0.8522,
    "avg_inference_time_ms": 12.3,
    "p95_inference_time_ms": 45.6,
    "binary_mcc": 0.7045,
    "binary_cross_entropy": 0.2345,
    "sn34_score": 0.7891
  },
  "accuracy_by_media_type": {
    "real": {
      "samples": 2000,
      "correct": 1780,
      "accuracy": 0.89
    },
    "synthetic": {
      "samples": 2500,
      "correct": 2200,
      "accuracy": 0.88
    }
  },
  "dataset_info": {
    "datasets_used": ["dataset-a", "dataset-b", "..."],
    "evaluation_type": "synthetic_detection",
    "dataset_media_types": {
      "dataset-a": "real",
      "dataset-b": "synthetic"
    },
    "samples_per_dataset": {
      "dataset-a": 100,
      "dataset-b": 100
    }
  }
}

Name		Name	Last commit message	Last commit date
Latest commit History 223 Commits
docs		docs
src/gasbench		src/gasbench
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

GASBench

Overview

Installation

Usage

Command Line Interface

Python API

Model Requirements (High-Level)

Metrics

Cache Directory

JSON Output Structure

About

Uh oh!

Releases 16

Packages

Contributors 4

Uh oh!

Languages

License

BitMind-AI/gasbench

Folders and files

Latest commit

History

Repository files navigation

GASBench

Overview

Installation

Usage

Command Line Interface

Python API

Model Requirements (High-Level)

Metrics

Cache Directory

JSON Output Structure

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 16

Packages 0

Contributors 4

Uh oh!

Languages

Packages