dense-analysis/dank

DANK - Dense Analysis Network Knowledge

DANK is a Dense Analysis project focused on collecting and analyzing live data from the public Internet. It uses API access, web scraping, RSS feeds, and semantic indexing tools to ingest external content in real time. It applies sentiment analysis, semantic clustering, and AI models to build structured insights about the world, including trends, public perception, and evolving narratives. The goal is to automate contextual understanding and surface relevant knowledge as it emerges.

Requirements

  • Python 3.13
  • uv
  • ClickHouse (local server)

ClickHouse setup

  1. Install ClickHouse: https://clickhouse.com/docs/en/install
  2. Start the ClickHouse server (via systemd, or by running clickhouse server directly).
  3. Create the schema:
~/clickhouse/clickhouse client --multiquery < schema.sql

The schema uses the dank database by default. Adjust config.toml if you need a different database name.

Configuration

Configuration lives in config.toml and should not be committed. Example:

sources = [
  { domain = "x.com", accounts = ["example"] },
  "blog.codinghorror.com",
]

[clickhouse]
host = "localhost"
port = 8123
database = "dank"
username = "default"
password = ""
secure = false
use_http = true

[x]
username = "your-x-username"
password = "your-x-password"
max_posts = 200
max_scrolls = 20
scroll_pause_seconds = 1.5

[storage]
data_dir = "data"
max_asset_bytes = 10485760

[browser]
# Optional: full path or command name for a Chromium-based browser.
executable_path = "thorium-browser"
# Optional: extra time to wait for the browser to start.
connection_timeout = 1.0
# Optional: connection retry count for slow browser startups.
connection_max_tries = 30

[email]
# Optional: IMAP settings for OTP codes.
host = "imap.example.com"
username = "you@example.com"
password = "your-imap-password"
port = 993

[logging]
# Optional: file path for scrape/process logs.
file = "dank.log"
# Optional: logging level (DEBUG, INFO, WARNING, ERROR).
level = "INFO"

sources controls which domains to scrape and process. Each entry can provide accounts for account-based sources like x.com.

If a domain has no specific configuration, DANK scrapes the domain root to discover RSS feeds to read from.
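DANK's actual discovery logic is internal; a minimal sketch of standard-library feed discovery might look like the following (FeedLinkParser and discover_feeds are illustrative names, not DANK's API):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedLinkParser(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate"> tags."""

    FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.feeds: list[str] = []

    def handle_starttag(self, tag: str, attrs) -> None:
        if tag != "link":
            return
        attr = dict(attrs)
        if attr.get("rel") == "alternate" and attr.get("type") in self.FEED_TYPES:
            href = attr.get("href")
            if href:
                # Resolve relative hrefs like "/rss.xml" against the page URL.
                self.feeds.append(urljoin(self.base_url, href))


def discover_feeds(page_html: str, base_url: str) -> list[str]:
    """Return absolute feed URLs found in a page fetched from base_url."""
    parser = FeedLinkParser(base_url)
    parser.feed(page_html)
    return parser.feeds
```

Discovery via the rel="alternate" convention covers most blogs, including sources like blog.codinghorror.com from the example config.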

browser.executable_path sets the browser binary to launch. If unset, DANK will try common Chromium locations.

storage.max_asset_bytes caps asset downloads (bytes). Larger assets are skipped but still recorded.
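The cap acts as a filter on downloading, not on recording. A minimal sketch of that behavior (AssetRecord and record_asset are hypothetical names for illustration):

```python
from dataclasses import dataclass

# Mirrors storage.max_asset_bytes from the example config (10 MiB).
MAX_ASSET_BYTES = 10_485_760


@dataclass
class AssetRecord:
    url: str
    size: int
    downloaded: bool  # False when the asset exceeded the cap


def record_asset(url: str, size: int, cap: int = MAX_ASSET_BYTES) -> AssetRecord:
    """Record every asset, but only mark those within the cap as downloaded."""
    return AssetRecord(url=url, size=size, downloaded=size <= cap)
```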

When X prompts for a one-time code, DANK will poll the IMAP inbox for messages from x.com that arrived after the login attempt and extract the confirmation code.

If the browser takes a long time to start, increase browser.connection_timeout or browser.connection_max_tries.

logging.file controls where scrape/process logs are written. Relative paths are resolved from the current working directory.

Usage

DANK offers the following commands.

  • uv run scrape -- Scrape the web for data
    • Pass --domains to scrape only matching domains from sources, for example --domains '^x\.com$'.
  • uv run process -- Process previously scraped data
    • The --age argument accepts a duration that limits which scraped data is processed, for example 6hours or 2days.
  • uv run clickhouse-query -- Run queries on the database
    • You can only run SELECT, SHOW, or EXPLAIN queries through this tool
    • Query results are formatted for readability and truncated unless you pass --full
  • uv run embed-text "your text" -- Print an embedding vector
    • Output is a JSON list[float] for easy copy/paste into other tools.
  • uv run download-embedding-model -- Download and cache embeddings model
    • Pass --model to choose another Hugging Face model id.
  • uv run web -- Start a simple web server to view content.
    • Pass --no-reload to disable hot code reloading.
    • Supports search filters for domain/account and a days-back slider.
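Duration strings like 6hours and 2days can be modeled roughly as below; this is a sketch of the assumed grammar, and the unit set DANK actually accepts is not documented here:

```python
import re
from datetime import timedelta

# Assumed grammar: "<count><unit>" with an optional trailing "s".
DURATION_RE = re.compile(r"^(\d+)\s*(minute|hour|day|week)s?$", re.IGNORECASE)

UNIT_TO_KWARG = {"minute": "minutes", "hour": "hours", "day": "days", "week": "weeks"}


def parse_age(value: str) -> timedelta:
    """Parse an --age style duration such as '6hours' into a timedelta."""
    match = DURATION_RE.match(value.strip())
    if not match:
        raise ValueError(f"unrecognized duration: {value!r}")
    amount, unit = int(match.group(1)), match.group(2).lower()
    return timedelta(**{UNIT_TO_KWARG[unit]: amount})
```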

Testing

  • uv run pytest -- Run default test suite.
  • uv run pytest -m embeddings -s -- Run real-model embedding checks.
    • These tests are skipped by default and require the model cache.
    • Includes per-case similarity and margin output for each model.
