Closed
3 changes: 3 additions & 0 deletions .gitignore
@@ -103,3 +103,6 @@ venv.bak/
.mypy_cache/
package/bin/sftp-config.json
package/default/sftp-config.json

# total_replay output
total_replay/output/
86 changes: 86 additions & 0 deletions total_replay/CLAUDE.md
@@ -0,0 +1,86 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

TOTAL-REPLAY is a Python CLI tool from the Splunk Threat Research Team for replaying attack data and test logs from the Splunk Security Content and Splunk Attack Data projects. It automates detection testing by selecting and replaying the attack data that matches the detection metadata you supply (names, GUIDs, MITRE ATT&CK IDs, analytic stories).

## Development Setup

```bash
poetry shell
poetry install
```

**Requirements:** Python 3.13+

**Environment Variables (required):**
- `SPLUNK_HOST` - Splunk server IP/hostname
- `SPLUNK_HEC_TOKEN` - HTTP Event Collector authentication token

## Running the Tool

```bash
# By detection name (also searches .yml filenames)
python3 total_replay.py -n '7zip CommandLine To SMB Share Path, CMLUA Or CMSTPLUA UAC Bypass'

# By MITRE ATT&CK technique ID
python3 total_replay.py -tid 'T1021, T1020, T1537'

# By detection GUID
python3 total_replay.py -g '01d29b48-ff6f-11eb-b81e-acde48001123'

# By analytic story
python3 total_replay.py -as 'AgentTesla, Remcos'

# From file with mixed metadata (greedy mode)
python3 total_replay.py -fgr './test/test_names.txt'

# Replay from local cache (skip re-downloading)
python3 total_replay.py -ld './output/2025-12-12/guid/replayed_yaml_cache'

# Specify custom index (default: "test")
python3 total_replay.py -i main -tid 'T1071'
```

File-based inputs are also available: `-fn` (names), `-ftid` (technique IDs), `-fg` (GUIDs), `-fas` (analytic stories).

## Architecture

**Entry Point:** `total_replay.py` - Typer CLI that parses input and delegates to UtilityHelper

**Core Logic:** `utility/utility_helper.py` - UtilityHelper class handles:
- `search_security_content()` - Walks security_content/detections to find matching YAML files
- `download_via_attack_data()` - Downloads attack data via `git lfs pull --include=<path>`
- `send_data_to_splunk()` - POSTs events to Splunk HEC (port 8088, HTTPS)
- `normalized_file_args()` - Regex categorization of file inputs into metadata types

**Data Flow:**
1. Parse CLI input and categorize by type (detection names, GUIDs, technique IDs, analytic stories)
2. Walk security_content detections folder, match YAML files by field
3. Extract `attack_data` URLs from matched detection YAML
4. Download data via Git LFS from attack_data repo
5. Generate YAML cache with metadata in `output/<date>/<marker_uid>/replayed_yaml_cache/`
6. Send events to Splunk HEC
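
The last step of the data flow can be sketched as follows. This is a minimal illustration, not the tool's actual `send_data_to_splunk()` implementation: the function name `build_hec_request`, the placeholder sourcetype, and the one-event-per-request shape are assumptions. The `/services/collector/event` endpoint and the `Authorization: Splunk <token>` header are the standard HEC conventions; port 8088 over HTTPS and the default index `test` come from the text above.

```python
import json
import urllib.request


def build_hec_request(host: str, token: str, event: str, index: str = "test",
                      sourcetype: str = "hypothetical:sourcetype") -> urllib.request.Request:
    """Build (but do not send) a POST request for the HEC event endpoint.

    The function name and the default sourcetype are illustrative assumptions.
    """
    url = f"https://{host}:8088/services/collector/event"
    payload = {"event": event, "index": index, "sourcetype": sourcetype}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example: build a request from placeholder values and inspect it.
request = build_hec_request("splunk.example", "my-token", "sample log line")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen()` would send it. A lab instance with a self-signed certificate would likely also need a suitable `ssl` context.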

## Configuration

Edit `configuration/config.yml`:
```yaml
settings:
  security_content_detection_path: ~/security_content/detections
  attack_data_dir_path: ~/attack_data
  debug_print: False  # Toggle verbose output
```

## Input File Format

File inputs support mixed metadata. The tool uses regex to auto-categorize:
- YAML filenames: `^[a-z0-9_]+(?:\.yml)?$`
- GUIDs: UUID format
- Technique IDs: `T\d{4}(?:\.\d{3})?`
- Detection names/analytic stories: Remaining alphanumeric entries
- Lines starting with `#` are skipped

See `test/test_names.txt` for examples.
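
The categorization described above could look roughly like this. It is a sketch under stated assumptions rather than the code in `normalized_file_args()`: the GUID pattern, the anchoring of the technique ID pattern, the order of the checks, and the bucket names are all guesses made for illustration. Only the YAML filename and technique ID expressions are taken from the list above.

```python
import re

# Assumed pattern for "UUID format"; the original expression is not shown.
GUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)
# From the list above, with ^ and $ added so the whole entry must match.
TECHNIQUE_RE = re.compile(r"^T\d{4}(?:\.\d{3})?$")
YAML_NAME_RE = re.compile(r"^[a-z0-9_]+(?:\.yml)?$")


def categorize(lines):
    """Sort input lines into metadata buckets (bucket names are hypothetical)."""
    buckets = {"guid": [], "technique_id": [], "yaml_file": [], "name_or_story": []}
    for raw in lines:
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue  # blank lines and comments are skipped
        if GUID_RE.match(entry):
            buckets["guid"].append(entry)
        elif TECHNIQUE_RE.match(entry):
            buckets["technique_id"].append(entry)
        elif YAML_NAME_RE.match(entry):
            buckets["yaml_file"].append(entry)
        else:
            buckets["name_or_story"].append(entry)
    return buckets


sample = [
    "# comment line",
    "01d29b48-ff6f-11eb-b81e-acde48001123",
    "T1021",
    "cmlua_or_cmstplua_uac_bypass.yml",
    "AgentTesla",
]
print(categorize(sample))
```

One consequence of the filename pattern in this sketch is that an all-lowercase, single-word entry is classified as a YAML filename rather than as a detection name or analytic story.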
15 changes: 12 additions & 3 deletions total_replay/configuration/config.yml
@@ -1,7 +1,16 @@
 settings:
-  security_content_detection_path: ~/path/to/your/security_content/detections
-  attack_data_dir_path: ~/path/to/your/attack_data
+  security_content_detection_path: ~/security_content/detections
+  attack_data_dir_path: ~/attack_data
   output_dir_name : output
   cache_replay_yaml_name : cache_replay_data.yml
   replayed_yaml_cache_dir_name: replayed_yaml_cache
-  debug_print: False
+  debug_print: True
+
+# Splunk connection settings
+# Environment variables (SPLUNK_HOST, SPLUNK_USERNAME, SPLUNK_PASSWORD, SPLUNK_HEC_TOKEN)
+# will override these values if set
+splunk:
+  host: attack-data
+  username: admin
+  password: seamlesslabs
+  hec_token: f9d2e13d-63ca-4bf2-8dcb-aa3a9d7dafff
76 changes: 76 additions & 0 deletions total_replay/detection_results.jsonl

Large diffs are not rendered by default.

67 changes: 67 additions & 0 deletions total_replay/readme.md
@@ -118,6 +118,73 @@ From there, you can choose whether to replay only detection GUIDs, only analytic

C. TOTAL-REPLAY downloads the required Attack Data each time you execute or replay data during detection testing or development. To help reduce disk space usage, the tool generates a cached .yml file for every downloaded dataset. You can then use the `local_data_path` parameter to replay the cached data, allowing you to avoid downloading the same Attack Data again.

---

## Run Detections

In addition to replaying attack data, TOTAL-REPLAY includes a detection runner tool (`run_detections.py`) that executes SPL queries from Security Content detection YAML files directly against your Splunk instance and outputs results to a JSONL file.

### Environment Variables

The detection runner requires the following environment variables (or config file settings):

| Environment Variable | Description |
|------------------------|--------------------------------------|
| **SPLUNK_HOST** | Splunk server IP/hostname |
| **SPLUNK_USERNAME** | Splunk username for REST API auth |
| **SPLUNK_PASSWORD** | Splunk password for REST API auth |

```bash
export SPLUNK_HOST=<IP_ADDRESS>
export SPLUNK_USERNAME=<USERNAME>
export SPLUNK_PASSWORD=<PASSWORD>
```

Alternatively, configure these in `configuration/config.yml`:
```yaml
splunk:
  host: "your-splunk-server"
  username: "admin"
  password: "your-password"
```
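
How the two sources might be combined is sketched below, assuming that environment variables take precedence over the config file, as the comment in `configuration/config.yml` states. The function name `resolve_splunk_settings` and the error handling are hypothetical and not taken from `run_detections.py`. The config is passed in as an already-parsed dictionary, so no YAML library is needed for the example.

```python
import os

# Maps each environment variable to its key under `splunk:` in config.yml.
ENV_TO_KEY = {
    "SPLUNK_HOST": "host",
    "SPLUNK_USERNAME": "username",
    "SPLUNK_PASSWORD": "password",
}


def resolve_splunk_settings(config, environ=None):
    """Return connection settings, preferring environment variables.

    `config` is the parsed config file as a dict; `environ` defaults to
    os.environ and is a parameter only to make the sketch easy to test.
    """
    environ = os.environ if environ is None else environ
    file_values = config.get("splunk") or {}
    resolved = {}
    for env_name, key in ENV_TO_KEY.items():
        value = environ.get(env_name) or file_values.get(key)
        if not value:
            raise ValueError(f"Set {env_name} or splunk.{key} in config.yml")
        resolved[key] = value
    return resolved


config = {"splunk": {"host": "your-splunk-server", "username": "admin",
                     "password": "your-password"}}
print(resolve_splunk_settings(config, environ={"SPLUNK_HOST": "10.0.0.5"}))
```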

### Usage Examples

```bash
# Run all detections
python3 run_detections.py --all

# Filter by detection name
python3 run_detections.py -n 'Windows Remote Services, CMLUA Or CMSTPLUA UAC Bypass'

# Filter by MITRE ATT&CK technique ID
python3 run_detections.py -tid 'T1021, T1059'

# Filter by detection GUID
python3 run_detections.py -g '01d29b48-ff6f-11eb-b81e-acde48001123'

# Filter by analytic story
python3 run_detections.py -as 'AgentTesla, Remcos'

# Custom output file and time range
python3 run_detections.py -as 'AgentTesla' --output results.jsonl --earliest -24h --latest now
```

### Options

| Option | Description |
|---------------------------|--------------------------------------------------|
| `-n, --name` | Comma-separated detection names or filenames |
| `-tid, --technique_id` | Comma-separated MITRE ATT&CK technique IDs |
| `-g, --guid` | Comma-separated detection GUIDs |
| `-as, --analytic_story` | Comma-separated analytic stories |
| `-a, --all` | Run all detection YAML files |
| `-o, --output` | Output JSONL file path (default: detection_results.jsonl) |
| `-e, --earliest` | Earliest time for search (default: 0 = all time) |
| `-l, --latest` | Latest time for search (default: now) |
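
To illustrate how the time-range options could reach Splunk, here is a sketch that builds the form fields for a one-shot search against the Splunk REST endpoint `/services/search/jobs`. It is not the code in `run_detections.py`: the helper name `build_oneshot_search` and the rule for prefixing `search` are assumptions. The field names (`search`, `earliest_time`, `latest_time`, `exec_mode`, `output_mode`) are standard parameters of that endpoint, and the defaults mirror the table above.

```python
def build_oneshot_search(spl, earliest="0", latest="now"):
    """Return form fields for a one-shot search job (hypothetical helper).

    The REST API expects the query to start with a generating command,
    so plain filters get an explicit `search` prefix.
    """
    query = spl.strip()
    if not query.startswith(("|", "search ")):
        query = f"search {query}"
    return {
        "search": query,
        "earliest_time": earliest,  # "0" means all time
        "latest_time": latest,
        "exec_mode": "oneshot",     # run synchronously and return results
        "output_mode": "json",
    }


fields = build_oneshot_search("index=test EventCode=1", earliest="-24h")
print(fields)
```

The fields would then be form-encoded (for example with `urllib.parse.urlencode`) and sent as an authenticated POST to the Splunk management port, which is 8089 by default.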

---

### Other

For replaying captured datasets or event logs during detection development or testing outside of the Splunk Security Content or Splunk Attack Data GitHub repositories, we recommend using the built-in replay.py feature provided by either Splunk Attack Range or Attack Data.