264 changes: 264 additions & 0 deletions docs/docs-network/setup/blob_storage.md
@@ -0,0 +1,264 @@
---
id: blob_storage
sidebar_position: 4
title: Blob storage and retrieval
description: Learn how Aztec nodes store and retrieve blob data for L1 transactions.
---

## Overview

Aztec uses EIP-4844 blobs to publish transaction data to Ethereum Layer 1. Since blob data is only available on L1 for a limited period (~18 days / 4,096 epochs), nodes need reliable ways to store and retrieve blob data for synchronization and historical access.

Aztec nodes can be configured to retrieve blobs from L1 consensus (beacon nodes), file stores (S3, GCS, R2), and archive services.

:::tip Automatic Configuration
When using `--network [NETWORK_NAME]`, blob sources are automatically configured for you. Most users don't need to manually configure blob storage.
:::
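
An illustrative invocation is shown below. The exact subcommand and flags depend on how you run your node, so treat this as a sketch of where the `--network` flag fits rather than a canonical command.

```bash
# Illustrative sketch: a node started with --network inherits that network's
# default blob sources; substitute your actual network name and any other flags you use.
aztec start --node --network <NETWORK_NAME>
```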

## Understanding blob sources

The blob client can retrieve blobs from multiple sources, tried in order:

1. **File Store**: Fast retrieval from configured storage (S3, GCS, R2, local files, HTTPS)
2. **L1 Consensus**: Beacon node API for recent blobs (within ~18 days)
   > **Reviewer suggestion (Contributor):** change this item to "**L1 Consensus**: Beacon node API to a (semi-)supernode for recent blobs (within ~18 days)".

3. **Archive API**: Services like Blobscan for historical blob data

For near-tip synchronization, the client will retry file stores with backoff to handle eventual consistency when blobs are still being uploaded by other validators.

### L1 consensus and blob availability

If your beacon node has access to [supernodes or semi-supernodes](https://ethereum.org/roadmap/fusaka/peerdas/), L1 consensus alone may be sufficient for retrieving blobs within the ~18 day retention period. With the Fusaka upgrade and [PeerDAS (Peer Data Availability Sampling)](https://eips.ethereum.org/EIPS/eip-7594), Ethereum uses erasure coding to split blobs into 128 columns, enabling robust data availability:

- **Supernodes** (validators with ≥4,096 ETH staked): Custody all 128 columns and all blob data for the full ~18 day retention period. These nodes form the backbone of the network and continuously heal data gaps.
- **Semi-supernodes** (validators with ≥1,824 ETH / 57 validators): Handle at least 64 columns, enabling reconstruction of complete blob data.
- **Regular nodes**: Only download 1/8th of the data (8 of 128 columns) to verify availability. This is **not sufficient** to serve complete blob data.

If L1 consensus is your only blob source, your beacon node must be a supernode or semi-supernode (or connected to one) to retrieve complete blobs. A regular node cannot reconstruct full blob data from its partial columns alone.
> **Reviewer suggestion (Contributor):** wrap the paragraph above in a `:::warning Supernodes` admonition.


This means that for recent blobs, setting `L1_CONSENSUS_HOST_URLS` to point at a well-connected supernode or semi-supernode may be all you need (a minimal sketch follows the list below). However, file stores and archive APIs are still recommended for:
- Faster retrieval (file stores are typically faster than L1 consensus queries)
- Historical access (blobs older than ~18 days are pruned from L1)
- Redundancy (multiple sources improve reliability)
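
For that minimal setup, the relevant variables might look like the following sketch (hostnames are illustrative; the archive fallback is optional but useful for blobs older than ~18 days):

```bash
# Recent blobs served by a (semi-)supernode beacon endpoint
L1_CONSENSUS_HOST_URLS=https://beacon.example.com
# Optional: archive fallback for historical blobs
BLOB_ARCHIVE_API_URL=https://api.blobscan.com
```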

## Prerequisites

Before configuring blob storage, you should:

- Have the Aztec node software installed
- Understand basic node operation
  > **Reviewer comment (Contributor)** on the two items above: I think we can remove this.

- For uploading blobs: Have access to cloud storage (Google Cloud Storage, Amazon S3, or Cloudflare R2) with appropriate permissions
  > **Reviewer comment (Contributor):** Let's not mix up instructions for blob retrieval and blob upload. Move everything related to blob upload to a separate file or at least a separate section, since only a very small subset of users will do uploads.


## Configuring blob sources

### Environment variables

Configure blob sources using environment variables in your `.env` file:

| Variable | Description | Example |
|----------|-------------|---------|
| `BLOB_FILE_STORE_URLS` | Comma-separated URLs to read blobs from | `gs://bucket/,s3://bucket/` |
| `BLOB_FILE_STORE_UPLOAD_URL` | URL for uploading blobs | `s3://my-bucket/blobs/` |
| `L1_CONSENSUS_HOST_URLS` | Beacon node URLs (comma-separated) | `https://beacon.example.com` |
| `L1_CONSENSUS_HOST_API_KEYS` | API keys for beacon nodes | `key1,key2` |
| `L1_CONSENSUS_HOST_API_KEY_HEADERS` | Header names for API keys | `Authorization` |
| `BLOB_ARCHIVE_API_URL` | Archive API URL (e.g., Blobscan) | `https://api.blobscan.com` |
| `BLOB_ALLOW_EMPTY_SOURCES` | Allow no blob sources (default: false) | `false` |

:::note Upload is Optional
Configuring `BLOB_FILE_STORE_UPLOAD_URL` is optional. You can still download blobs from file stores without uploading them yourself — other network participants (such as sequencers and validators) upload blobs to shared storage, making them available for all nodes to retrieve.
:::

### Supported storage backends

The blob client supports the same storage backends as snapshots:

- **Google Cloud Storage** - `gs://bucket-name/path/`
- **Amazon S3** - `s3://bucket-name/path/`
- **Cloudflare R2** - `s3://bucket-name/path/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com`
- **HTTP/HTTPS** (read-only) - `https://host/path`
- **Local filesystem** - `file:///absolute/path`

### Storage path format

Blobs are stored using the following path structure:

```
{base_url}/aztec-{l1ChainId}-{rollupVersion}-{rollupAddress}/blobs/{versionedBlobHash}.data
```

For example:
```
gs://my-bucket/aztec-1-1-0x1234abcd.../blobs/0x01abc123...def.data
```

## Configuration examples

### Basic file store configuration

Add to your `.env` file:

```bash
# Read blobs from GCS
BLOB_FILE_STORE_URLS=gs://my-snapshots/

# Upload blobs to GCS
BLOB_FILE_STORE_UPLOAD_URL=gs://my-snapshots/
```

### Multiple read sources with L1 fallback

```bash
# Try multiple sources in order
BLOB_FILE_STORE_URLS=gs://primary-bucket/,s3://backup-bucket/

# Upload to primary
BLOB_FILE_STORE_UPLOAD_URL=gs://primary-bucket/

# L1 consensus fallback
L1_CONSENSUS_HOST_URLS=https://beacon1.example.com,https://beacon2.example.com

# Archive fallback for historical blobs
BLOB_ARCHIVE_API_URL=https://api.blobscan.com
```

### Cloudflare R2 configuration

```bash
BLOB_FILE_STORE_URLS=s3://my-bucket/blobs/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com
BLOB_FILE_STORE_UPLOAD_URL=s3://my-bucket/blobs/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com
```

Replace `[ACCOUNT_ID]` with your Cloudflare account ID.

### Local filesystem (for testing)

```bash
BLOB_FILE_STORE_URLS=file:///data/blobs
BLOB_FILE_STORE_UPLOAD_URL=file:///data/blobs
```
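
If the directory does not already exist, you may need to create it and make sure the node process can write to it — a quick sketch:

```bash
# Make sure the path exists and is writable by the user running the node
mkdir -p /data/blobs
```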

### Docker Compose integration

Add the environment variables to your `docker-compose.yml`:

```yaml
environment:
  # ... other environment variables
  BLOB_FILE_STORE_URLS: ${BLOB_FILE_STORE_URLS}
  BLOB_FILE_STORE_UPLOAD_URL: ${BLOB_FILE_STORE_UPLOAD_URL}
  L1_CONSENSUS_HOST_URLS: ${L1_CONSENSUS_HOST_URLS}
  BLOB_ARCHIVE_API_URL: ${BLOB_ARCHIVE_API_URL}
```

## Authentication

### Google Cloud Storage

Set up [Application Default Credentials](https://cloud.google.com/docs/authentication/application-default-credentials):

```bash
gcloud auth application-default login
```

Or use a service account key:

```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
```

### Amazon S3 / Cloudflare R2

Set AWS credentials as environment variables:

```bash
export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
```

For R2, these credentials come from your Cloudflare R2 API tokens.

## How blob retrieval works

When a node needs blobs for a block, the blob client follows this retrieval order:

### During historical sync
1. **File Store** - Quick lookup in configured file stores
2. **L1 Consensus** - Query beacon nodes using slot number
3. **Archive API** - Fall back to Blobscan or similar service

### During near-tip sync
1. **File Store** - Quick lookup (no retries)
2. **L1 Consensus** - Query beacon nodes
3. **File Store with retries** - Retry with backoff for eventual consistency
4. **Archive API** - Final fallback

The client automatically uploads fetched blobs to the configured upload file store, ensuring blobs are preserved for future retrieval.

## Verification

To verify your blob configuration is working:

1. **Check startup logs**: Look for messages about blob source connectivity
2. **Monitor blob retrieval**: Watch for successful blob fetches during sync
3. **Verify storage**: Check your storage bucket to confirm blob files exist (a quick listing check is sketched after this list)
4. **Test retrieval**: Restart the node and verify it can retrieve previously stored blobs
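
For step 3, a quick listing check might look like the following sketch (bucket names and prefixes are illustrative; use whichever CLI matches your storage backend):

```bash
# Google Cloud Storage: list uploaded blob files
gsutil ls -r gs://my-snapshots/ | head

# Amazon S3 / Cloudflare R2: list uploaded blob files
# (for R2, add --endpoint-url https://<ACCOUNT_ID>.r2.cloudflarestorage.com)
aws s3 ls s3://my-bucket/blobs/ --recursive | head
```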
> **Reviewer comment (Contributor)** on the verification steps above: Let's either have proper instructions for verification, or just delete them.


## Troubleshooting
> **Reviewer comment (Contributor):** I understand Claude generated most of this. Let's clean them up and remove the slop.


### No blob sources configured

**Issue**: Node starts with warning about no blob sources.

**Solutions**:
- Configure at least one of: `BLOB_FILE_STORE_URLS`, `L1_CONSENSUS_HOST_URLS`, or `BLOB_ARCHIVE_API_URL`
- Set `BLOB_ALLOW_EMPTY_SOURCES=true` only if you understand the implications (node may fail to sync)

### Blob retrieval fails

**Issue**: Node cannot retrieve blobs for a block.

**Solutions**:
- Verify your file store URLs are accessible
- Check L1 consensus host connectivity
- Ensure authentication credentials are configured
- Review node logs for specific error messages
- Try using multiple file store URLs for redundancy

### Upload fails

**Issue**: Blobs are not being uploaded to file store.

**Solutions**:
- Verify `BLOB_FILE_STORE_UPLOAD_URL` is set
- Check write permissions on the storage bucket (a quick write test is sketched below)
- Ensure credentials are configured (AWS/GCP)
- Note: HTTPS URLs are read-only and cannot be used for uploads
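
A quick write test, sketched here for GCS (the bucket name is illustrative; use the equivalent `aws s3 cp` for S3/R2):

```bash
# Write and then remove a small test object to confirm upload permissions
echo "write test" | gsutil cp - gs://my-snapshots/blob-write-test.txt
gsutil rm gs://my-snapshots/blob-write-test.txt
```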

### L1 consensus host errors

**Issue**: Cannot connect to beacon nodes.

**Solutions**:
- Verify beacon node URLs are correct and accessible (a quick connectivity check is sketched below)
- Check if API keys are required and correctly configured
- Ensure the beacon node is synced
- Try multiple beacon node URLs for redundancy
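
A quick connectivity check, assuming a standard beacon API endpoint (substitute your beacon URL and add any required API-key header):

```bash
# Should report the node's sync status
curl -s https://beacon.example.com/eth/v1/node/syncing

# Should return blob sidecars for the head block (empty if the block carried no blobs)
curl -s https://beacon.example.com/eth/v1/beacon/blob_sidecars/head | head -c 300
```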

## Best practices

- **Use file stores for production**: File stores provide faster, more reliable blob retrieval than L1 consensus
- **Configure multiple sources**: Use multiple file store URLs and L1 consensus hosts for redundancy
- **Enable blob uploads**: Configure `BLOB_FILE_STORE_UPLOAD_URL` to contribute to blob availability
  > **Reviewer comment (Contributor):** This is not a best practice for all users. Uploading to a store does not help anyone unless they advertise it, like snapshot providers do.

- **Choose appropriate storage**:
- Google Cloud Storage for GCP infrastructure
- Amazon S3 for AWS infrastructure
- Cloudflare R2 for cost-effective public distribution (free egress)
- **Monitor storage costs**: Blobs can accumulate over time; consider retention policies (a lifecycle-rule sketch follows this list)
- **Use archive API for historical access**: Configure `BLOB_ARCHIVE_API_URL` for accessing blobs older than ~18 days. Even with PeerDAS supernodes providing robust data availability, blob data is pruned from L1 after 4,096 epochs. Archive services like [Blobscan](https://blobscan.com/) store historical blob data indefinitely
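
As one way to manage retention on GCS, a lifecycle rule can delete old blob objects automatically. A minimal sketch, assuming a bucket named `my-bucket` and a 30-day window — only do this if you do not rely on that store for historical blobs:

```bash
# Delete objects older than 30 days (adjust the age to your needs)
cat > lifecycle.json <<'EOF'
{
  "rule": [
    { "action": { "type": "Delete" }, "condition": { "age": 30 } }
  ]
}
EOF
gsutil lifecycle set lifecycle.json gs://my-bucket
```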

## Next Steps

- Learn about [using snapshots](./syncing_best_practices.md) for faster node synchronization
- Set up [monitoring](../operation/monitoring.md) to track your node's blob retrieval
- Check the [CLI reference](../reference/cli_reference.md) for additional blob-related options
- Join the [Aztec Discord](https://discord.gg/aztec) for support
2 changes: 1 addition & 1 deletion yarn-project/blob-client/README.md
```diff
@@ -31,7 +31,7 @@ URL for uploading blobs to a file store.
 **L1 Consensus Host URLs** (`L1_CONSENSUS_HOST_URLS`):
 Beacon node URLs for fetching recent blobs directly from L1.

-**Archive API URL** (`BLOB_SINK_ARCHIVE_API_URL`):
+**Archive API URL** (`BLOB_ARCHIVE_API_URL`):
 Blobscan or similar archive API for historical blob data.

 ### Example Usage
```
2 changes: 1 addition & 1 deletion yarn-project/blob-client/src/archive/config.ts
```diff
@@ -7,7 +7,7 @@ export type BlobArchiveApiConfig = {

 export const blobArchiveApiConfigMappings: ConfigMappingsType<BlobArchiveApiConfig> = {
   archiveApiUrl: {
-    env: 'BLOB_SINK_ARCHIVE_API_URL',
+    env: 'BLOB_ARCHIVE_API_URL',
     description: 'The URL of the archive API',
   },
   ...pickConfigMappings(l1ReaderConfigMappings, ['l1ChainId']),
```
4 changes: 1 addition & 3 deletions yarn-project/foundation/src/config/env_var.ts
```diff
@@ -21,9 +21,7 @@ export type EnvVar =
   | 'BB_NUM_IVC_VERIFIERS'
   | 'BB_IVC_CONCURRENCY'
   | 'BOOTSTRAP_NODES'
-  | 'BLOB_SINK_ARCHIVE_API_URL'
-  | 'BLOB_SINK_PORT'
-  | 'BLOB_SINK_URL'
+  | 'BLOB_ARCHIVE_API_URL'
   | 'BLOB_FILE_STORE_URLS'
   | 'BLOB_FILE_STORE_UPLOAD_URL'
   | 'BOT_DA_GAS_LIMIT'
```