264 changes: 264 additions & 0 deletions docs/docs-network/setup/blob_storage.md
@@ -0,0 +1,264 @@
---
id: blob_storage
sidebar_position: 4
title: Blob storage and retrieval
description: Learn how Aztec nodes store and retrieve blob data for L1 transactions.
---

## Overview

Aztec uses EIP-4844 blobs to publish transaction data to Ethereum Layer 1. Since blob data is only available on L1 for a limited period (~18 days / 4,096 epochs), nodes need reliable ways to store and retrieve blob data for synchronization and historical access.

Aztec nodes can be configured to retrieve blobs from L1 consensus (beacon nodes), file stores (S3, GCS, R2), and archive services.

:::tip Automatic Configuration
When using `--network [NETWORK_NAME]`, blob sources are automatically configured for you. Most users don't need to manually configure blob storage.
:::
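
An illustrative invocation is shown below. The exact subcommand and flags depend on how you run your node, so treat this as a sketch of where the `--network` flag fits rather than a canonical command.

```bash
# Illustrative sketch: a node started with --network inherits that network's
# default blob sources; substitute your actual network name and any other flags you use.
aztec start --node --network <NETWORK_NAME>
```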

## Understanding blob sources

The blob client can retrieve blobs from multiple sources, tried in order:

1. **File Store**: Fast retrieval from configured storage (S3, GCS, R2, local files, HTTPS)
2. **L1 Consensus**: Beacon node API for recent blobs (within ~18 days)
   > **Reviewer suggestion (Contributor):** change this item to "**L1 Consensus**: Beacon node API to a (semi-)supernode for recent blobs (within ~18 days)".

3. **Archive API**: Services like Blobscan for historical blob data

For near-tip synchronization, the client will retry file stores with backoff to handle eventual consistency when blobs are still being uploaded by other validators.

### L1 consensus and blob availability

If your beacon node has access to [supernodes or semi-supernodes](https://ethereum.org/roadmap/fusaka/peerdas/), L1 consensus alone may be sufficient for retrieving blobs within the ~18 day retention period. With the Fusaka upgrade and [PeerDAS (Peer Data Availability Sampling)](https://eips.ethereum.org/EIPS/eip-7594), Ethereum uses erasure coding to split blobs into 128 columns, enabling robust data availability:

- **Supernodes** (validators with ≥4,096 ETH staked): Custody all 128 columns and all blob data for the full ~18 day retention period. These nodes form the backbone of the network and continuously heal data gaps.
- **Semi-supernodes** (validators with ≥1,824 ETH / 57 validators): Handle at least 64 columns, enabling reconstruction of complete blob data.
- **Regular nodes**: Only download 1/8th of the data (8 of 128 columns) to verify availability. This is **not sufficient** to serve complete blob data.

If L1 consensus is your only blob source, your beacon node must be a supernode or semi-supernode (or connected to one) to retrieve complete blobs. A regular node cannot reconstruct full blob data from its partial columns alone.
> **Reviewer suggestion (Contributor):** wrap the paragraph above in a `:::warning Supernodes` admonition.


This means that for recent blobs, setting `L1_CONSENSUS_HOST_URLS` to point at a well-connected supernode or semi-supernode may be all you need (a minimal sketch follows the list below). However, file stores and archive APIs are still recommended for:
- Faster retrieval (file stores are typically faster than L1 consensus queries)
- Historical access (blobs older than ~18 days are pruned from L1)
- Redundancy (multiple sources improve reliability)
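
For that minimal setup, the relevant variables might look like the following sketch (hostnames are illustrative; the archive fallback is optional but useful for blobs older than ~18 days):

```bash
# Recent blobs served by a (semi-)supernode beacon endpoint
L1_CONSENSUS_HOST_URLS=https://beacon.example.com
# Optional: archive fallback for historical blobs
BLOB_ARCHIVE_API_URL=https://api.blobscan.com
```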

## Prerequisites

Before configuring blob storage, you should:

- Have the Aztec node software installed
- Understand basic node operation
  > **Reviewer comment (Contributor)** on the two items above: I think we can remove this.

- For uploading blobs: Have access to cloud storage (Google Cloud Storage, Amazon S3, or Cloudflare R2) with appropriate permissions
  > **Reviewer comment (Contributor):** Let's not mix up instructions for blob retrieval and blob upload. Move everything related to blob upload to a separate file or at least a separate section, since only a very small subset of users will do uploads.


## Configuring blob sources

### Environment variables

Configure blob sources using environment variables in your `.env` file:

| Variable | Description | Example |
|----------|-------------|---------|
| `BLOB_FILE_STORE_URLS` | Comma-separated URLs to read blobs from | `gs://bucket/,s3://bucket/` |
| `BLOB_FILE_STORE_UPLOAD_URL` | URL for uploading blobs | `s3://my-bucket/blobs/` |
| `L1_CONSENSUS_HOST_URLS` | Beacon node URLs (comma-separated) | `https://beacon.example.com` |
| `L1_CONSENSUS_HOST_API_KEYS` | API keys for beacon nodes | `key1,key2` |
| `L1_CONSENSUS_HOST_API_KEY_HEADERS` | Header names for API keys | `Authorization` |
| `BLOB_ARCHIVE_API_URL` | Archive API URL (e.g., Blobscan) | `https://api.blobscan.com` |
| `BLOB_ALLOW_EMPTY_SOURCES` | Allow no blob sources (default: false) | `false` |

:::note Upload is Optional
Configuring `BLOB_FILE_STORE_UPLOAD_URL` is optional. You can still download blobs from file stores without uploading them yourself — other network participants (such as sequencers and validators) upload blobs to shared storage, making them available for all nodes to retrieve.
:::

### Supported storage backends

The blob client supports the same storage backends as snapshots:

- **Google Cloud Storage** - `gs://bucket-name/path/`
- **Amazon S3** - `s3://bucket-name/path/`
- **Cloudflare R2** - `s3://bucket-name/path/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com`
- **HTTP/HTTPS** (read-only) - `https://host/path`
- **Local filesystem** - `file:///absolute/path`

### Storage path format

Blobs are stored using the following path structure:

```
{base_url}/aztec-{l1ChainId}-{rollupVersion}-{rollupAddress}/blobs/{versionedBlobHash}.data
```

For example:
```
gs://my-bucket/aztec-1-1-0x1234abcd.../blobs/0x01abc123...def.data
```

## Configuration examples

### Basic file store configuration

Add to your `.env` file:

```bash
# Read blobs from GCS
BLOB_FILE_STORE_URLS=gs://my-snapshots/

# Upload blobs to GCS
BLOB_FILE_STORE_UPLOAD_URL=gs://my-snapshots/
```

### Multiple read sources with L1 fallback

```bash
# Try multiple sources in order
BLOB_FILE_STORE_URLS=gs://primary-bucket/,s3://backup-bucket/

# Upload to primary
BLOB_FILE_STORE_UPLOAD_URL=gs://primary-bucket/

# L1 consensus fallback
L1_CONSENSUS_HOST_URLS=https://beacon1.example.com,https://beacon2.example.com

# Archive fallback for historical blobs
BLOB_ARCHIVE_API_URL=https://api.blobscan.com
```

### Cloudflare R2 configuration

```bash
BLOB_FILE_STORE_URLS=s3://my-bucket/blobs/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com
BLOB_FILE_STORE_UPLOAD_URL=s3://my-bucket/blobs/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com
```

Replace `[ACCOUNT_ID]` with your Cloudflare account ID.

### Local filesystem (for testing)

```bash
BLOB_FILE_STORE_URLS=file:///data/blobs
BLOB_FILE_STORE_UPLOAD_URL=file:///data/blobs
```
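
If the directory does not already exist, you may need to create it and make sure the node process can write to it — a quick sketch:

```bash
# Make sure the path exists and is writable by the user running the node
mkdir -p /data/blobs
```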

### Docker Compose integration

Add the environment variables to your `docker-compose.yml`:

```yaml
environment:
  # ... other environment variables
  BLOB_FILE_STORE_URLS: ${BLOB_FILE_STORE_URLS}
  BLOB_FILE_STORE_UPLOAD_URL: ${BLOB_FILE_STORE_UPLOAD_URL}
  L1_CONSENSUS_HOST_URLS: ${L1_CONSENSUS_HOST_URLS}
  BLOB_ARCHIVE_API_URL: ${BLOB_ARCHIVE_API_URL}
```

## Authentication

### Google Cloud Storage

Set up [Application Default Credentials](https://cloud.google.com/docs/authentication/application-default-credentials):

```bash
gcloud auth application-default login
```

Or use a service account key:

```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
```

### Amazon S3 / Cloudflare R2

Set AWS credentials as environment variables:

```bash
export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
```

For R2, these credentials come from your Cloudflare R2 API tokens.

## How blob retrieval works

When a node needs blobs for a block, the blob client follows this retrieval order:

### During historical sync
1. **File Store** - Quick lookup in configured file stores
2. **L1 Consensus** - Query beacon nodes using slot number
3. **Archive API** - Fall back to Blobscan or similar service

### During near-tip sync
1. **File Store** - Quick lookup (no retries)
2. **L1 Consensus** - Query beacon nodes
3. **File Store with retries** - Retry with backoff for eventual consistency
4. **Archive API** - Final fallback

The client automatically uploads fetched blobs to the configured upload file store, ensuring blobs are preserved for future retrieval.

## Verification

To verify your blob configuration is working:

1. **Check startup logs**: Look for messages about blob source connectivity
2. **Monitor blob retrieval**: Watch for successful blob fetches during sync
3. **Verify storage**: Check your storage bucket to confirm blob files exist (a quick listing check is sketched after this list)
4. **Test retrieval**: Restart the node and verify it can retrieve previously stored blobs
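
For step 3, a quick listing check might look like the following sketch (bucket names and prefixes are illustrative; use whichever CLI matches your storage backend):

```bash
# Google Cloud Storage: list uploaded blob files
gsutil ls -r gs://my-snapshots/ | head

# Amazon S3 / Cloudflare R2: list uploaded blob files
# (for R2, add --endpoint-url https://<ACCOUNT_ID>.r2.cloudflarestorage.com)
aws s3 ls s3://my-bucket/blobs/ --recursive | head
```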
> **Reviewer comment (Contributor)** on the verification steps above: Let's either have proper instructions for verification, or just delete them.


## Troubleshooting
> **Reviewer comment (Contributor):** I understand Claude generated most of this. Let's clean them up and remove the slop.


### No blob sources configured

**Issue**: Node starts with warning about no blob sources.

**Solutions**:
- Configure at least one of: `BLOB_FILE_STORE_URLS`, `L1_CONSENSUS_HOST_URLS`, or `BLOB_ARCHIVE_API_URL`
- Set `BLOB_ALLOW_EMPTY_SOURCES=true` only if you understand the implications (node may fail to sync)

### Blob retrieval fails

**Issue**: Node cannot retrieve blobs for a block.

**Solutions**:
- Verify your file store URLs are accessible
- Check L1 consensus host connectivity
- Ensure authentication credentials are configured
- Review node logs for specific error messages
- Try using multiple file store URLs for redundancy

### Upload fails

**Issue**: Blobs are not being uploaded to file store.

**Solutions**:
- Verify `BLOB_FILE_STORE_UPLOAD_URL` is set
- Check write permissions on the storage bucket (a quick write test is sketched below)
- Ensure credentials are configured (AWS/GCP)
- Note: HTTPS URLs are read-only and cannot be used for uploads
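
A quick write test, sketched here for GCS (the bucket name is illustrative; use the equivalent `aws s3 cp` for S3/R2):

```bash
# Write and then remove a small test object to confirm upload permissions
echo "write test" | gsutil cp - gs://my-snapshots/blob-write-test.txt
gsutil rm gs://my-snapshots/blob-write-test.txt
```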

### L1 consensus host errors

**Issue**: Cannot connect to beacon nodes.

**Solutions**:
- Verify beacon node URLs are correct and accessible (a quick connectivity check is sketched below)
- Check if API keys are required and correctly configured
- Ensure the beacon node is synced
- Try multiple beacon node URLs for redundancy
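
A quick connectivity check, assuming a standard beacon API endpoint (substitute your beacon URL and add any required API-key header):

```bash
# Should report the node's sync status
curl -s https://beacon.example.com/eth/v1/node/syncing

# Should return blob sidecars for the head block (empty if the block carried no blobs)
curl -s https://beacon.example.com/eth/v1/beacon/blob_sidecars/head | head -c 300
```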

## Best practices

- **Use file stores for production**: File stores provide faster, more reliable blob retrieval than L1 consensus
- **Configure multiple sources**: Use multiple file store URLs and L1 consensus hosts for redundancy
- **Enable blob uploads**: Configure `BLOB_FILE_STORE_UPLOAD_URL` to contribute to blob availability
  > **Reviewer comment (Contributor):** This is not a best practice for all users. Uploading to a store does not help anyone unless they advertise it, like snapshot providers do.

- **Choose appropriate storage**:
- Google Cloud Storage for GCP infrastructure
- Amazon S3 for AWS infrastructure
- Cloudflare R2 for cost-effective public distribution (free egress)
- **Monitor storage costs**: Blobs can accumulate over time; consider retention policies (a lifecycle-rule sketch follows this list)
- **Use archive API for historical access**: Configure `BLOB_ARCHIVE_API_URL` for accessing blobs older than ~18 days. Even with PeerDAS supernodes providing robust data availability, blob data is pruned from L1 after 4,096 epochs. Archive services like [Blobscan](https://blobscan.com/) store historical blob data indefinitely
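
As one way to manage retention on GCS, a lifecycle rule can delete old blob objects automatically. A minimal sketch, assuming a bucket named `my-bucket` and a 30-day window — only do this if you do not rely on that store for historical blobs:

```bash
# Delete objects older than 30 days (adjust the age to your needs)
cat > lifecycle.json <<'EOF'
{
  "rule": [
    { "action": { "type": "Delete" }, "condition": { "age": 30 } }
  ]
}
EOF
gsutil lifecycle set lifecycle.json gs://my-bucket
```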

## Next Steps

- Learn about [using snapshots](./syncing_best_practices.md) for faster node synchronization
- Set up [monitoring](../operation/monitoring.md) to track your node's blob retrieval
- Check the [CLI reference](../reference/cli_reference.md) for additional blob-related options
- Join the [Aztec Discord](https://discord.gg/aztec) for support
2 changes: 1 addition & 1 deletion yarn-project/blob-client/README.md
```diff
@@ -31,7 +31,7 @@ URL for uploading blobs to a file store.
 **L1 Consensus Host URLs** (`L1_CONSENSUS_HOST_URLS`):
 Beacon node URLs for fetching recent blobs directly from L1.

-**Archive API URL** (`BLOB_SINK_ARCHIVE_API_URL`):
+**Archive API URL** (`BLOB_ARCHIVE_API_URL`):
 Blobscan or similar archive API for historical blob data.

 ### Example Usage
```
2 changes: 1 addition & 1 deletion yarn-project/blob-client/src/archive/config.ts
```diff
@@ -7,7 +7,7 @@ export type BlobArchiveApiConfig = {

 export const blobArchiveApiConfigMappings: ConfigMappingsType<BlobArchiveApiConfig> = {
   archiveApiUrl: {
-    env: 'BLOB_SINK_ARCHIVE_API_URL',
+    env: 'BLOB_ARCHIVE_API_URL',
     description: 'The URL of the archive API',
   },
   ...pickConfigMappings(l1ReaderConfigMappings, ['l1ChainId']),
```
4 changes: 1 addition & 3 deletions yarn-project/foundation/src/config/env_var.ts
```diff
@@ -21,9 +21,7 @@ export type EnvVar =
   | 'BB_NUM_IVC_VERIFIERS'
   | 'BB_IVC_CONCURRENCY'
   | 'BOOTSTRAP_NODES'
-  | 'BLOB_SINK_ARCHIVE_API_URL'
-  | 'BLOB_SINK_PORT'
-  | 'BLOB_SINK_URL'
+  | 'BLOB_ARCHIVE_API_URL'
   | 'BLOB_FILE_STORE_URLS'
   | 'BLOB_FILE_STORE_UPLOAD_URL'
   | 'BOT_DA_GAS_LIMIT'
```