---
id: blob_storage
sidebar_position: 4
title: Blob storage and retrieval
description: Learn how Aztec nodes store and retrieve blob data for L1 transactions.
---

## Overview

Aztec uses EIP-4844 blobs to publish transaction data to Ethereum Layer 1. Since blob data is only available on L1 for a limited period (~18 days / 4,096 epochs), nodes need reliable ways to store and retrieve blob data for synchronization and historical access.

Aztec nodes can be configured to retrieve blobs from L1 consensus (beacon nodes), file stores (S3, GCS, R2), and archive services.

:::tip Automatic Configuration
When using `--network [NETWORK_NAME]`, blob sources are automatically configured for you. Most users don't need to manually configure blob storage.
:::
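
For example, a node started with the `--network` flag picks up the network's default blob sources without any of the variables described below. This is a minimal sketch; the exact subcommand flags and network name are illustrative, so check the CLI reference for your version:

```bash
# Blob sources are resolved from the named network's defaults
aztec start --node --archiver --network testnet
```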

## Understanding blob sources

The blob client can retrieve blobs from multiple sources, tried in order:

1. **File Store**: Fast retrieval from configured storage (S3, GCS, R2, local files, HTTPS)
2. **L1 Consensus**: Beacon node API for recent blobs (within ~18 days)
3. **Archive API**: Services like Blobscan for historical blob data

For near-tip synchronization, the client will retry file stores with backoff to handle eventual consistency when blobs are still being uploaded by other validators.

### L1 consensus and blob availability

If your beacon node has access to [supernodes or semi-supernodes](https://ethereum.org/roadmap/fusaka/peerdas/), L1 consensus alone may be sufficient for retrieving blobs within the ~18 day retention period. With the Fusaka upgrade and [PeerDAS (Peer Data Availability Sampling)](https://eips.ethereum.org/EIPS/eip-7594), Ethereum uses erasure coding to split blobs into 128 columns, enabling robust data availability:

- **Supernodes** (validators with ≥4,096 ETH staked): Custody all 128 columns and all blob data for the full ~18 day retention period. These nodes form the backbone of the network and continuously heal data gaps.
- **Semi-supernodes** (validators with ≥1,824 ETH / 57 validators): Handle at least 64 columns, enabling reconstruction of complete blob data.
- **Regular nodes**: Only download 1/8th of the data (8 of 128 columns) to verify availability. This is **not sufficient** to serve complete blob data.

If L1 consensus is your only blob source, your beacon node must be a supernode or semi-supernode (or connected to one) to retrieve complete blobs. A regular node cannot reconstruct full blob data from its partial columns alone.

This means that for recent blobs, configuring `L1_CONSENSUS_HOST_URLS` to point to a well-connected supernode or semi-supernode may be all you need. However, file stores and archive APIs are still recommended for:

- Faster retrieval (file stores are typically faster than L1 consensus queries)
- Historical access (blobs older than ~18 days are pruned from L1)
- Redundancy (multiple sources improve reliability)

## Prerequisites

Before configuring blob storage, you should:

- Have the Aztec node software installed
- Understand basic node operation
- For uploading blobs: Have access to cloud storage (Google Cloud Storage, Amazon S3, or Cloudflare R2) with appropriate permissions

## Configuring blob sources

### Environment variables

Configure blob sources using environment variables in your `.env` file:

| Variable | Description | Example |
|----------|-------------|---------|
| `BLOB_FILE_STORE_URLS` | Comma-separated URLs to read blobs from | `gs://bucket/,s3://bucket/` |
| `BLOB_FILE_STORE_UPLOAD_URL` | URL for uploading blobs | `s3://my-bucket/blobs/` |
| `L1_CONSENSUS_HOST_URLS` | Beacon node URLs (comma-separated) | `https://beacon.example.com` |
| `L1_CONSENSUS_HOST_API_KEYS` | API keys for beacon nodes | `key1,key2` |
| `L1_CONSENSUS_HOST_API_KEY_HEADERS` | Header names for API keys | `Authorization` |
| `BLOB_ARCHIVE_API_URL` | Archive API URL (e.g., Blobscan) | `https://api.blobscan.com` |
| `BLOB_ALLOW_EMPTY_SOURCES` | Allow no blob sources (default: `false`) | `false` |

:::note Upload is Optional
Configuring `BLOB_FILE_STORE_UPLOAD_URL` is optional. You can still download blobs from file stores without uploading them yourself; other network participants (such as sequencers and validators) upload blobs to shared storage, making them available for all nodes to retrieve.
:::
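
As a concrete starting point, a download-only configuration can rely entirely on remote sources. This is a minimal sketch using the variables from the table above; the URLs are illustrative:

```bash
# .env — read-only blob configuration, no uploads
L1_CONSENSUS_HOST_URLS=https://beacon.example.com
BLOB_ARCHIVE_API_URL=https://api.blobscan.com
```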

### Supported storage backends

The blob client supports the same storage backends as snapshots:

- **Google Cloud Storage** - `gs://bucket-name/path/`
- **Amazon S3** - `s3://bucket-name/path/`
- **Cloudflare R2** - `s3://bucket-name/path/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com`
- **HTTP/HTTPS** (read-only) - `https://host/path`
- **Local filesystem** - `file:///absolute/path`

### Storage path format

Blobs are stored using the following path structure:

```
{base_url}/aztec-{l1ChainId}-{rollupVersion}-{rollupAddress}/blobs/{versionedBlobHash}.data
```

For example:

```
gs://my-bucket/aztec-1-1-0x1234abcd.../blobs/0x01abc123...def.data
```

## Configuration examples

### Basic file store configuration

Add to your `.env` file:

```bash
# Read blobs from GCS
BLOB_FILE_STORE_URLS=gs://my-snapshots/

# Upload blobs to GCS
BLOB_FILE_STORE_UPLOAD_URL=gs://my-snapshots/
```

### Multiple read sources with L1 fallback

```bash
# Try multiple sources in order
BLOB_FILE_STORE_URLS=gs://primary-bucket/,s3://backup-bucket/

# Upload to primary
BLOB_FILE_STORE_UPLOAD_URL=gs://primary-bucket/

# L1 consensus fallback
L1_CONSENSUS_HOST_URLS=https://beacon1.example.com,https://beacon2.example.com

# Archive fallback for historical blobs
BLOB_ARCHIVE_API_URL=https://api.blobscan.com
```

### Cloudflare R2 configuration

```bash
BLOB_FILE_STORE_URLS=s3://my-bucket/blobs/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com
BLOB_FILE_STORE_UPLOAD_URL=s3://my-bucket/blobs/?endpoint=https://[ACCOUNT_ID].r2.cloudflarestorage.com
```

Replace `[ACCOUNT_ID]` with your Cloudflare account ID.

### Local filesystem (for testing)

```bash
BLOB_FILE_STORE_URLS=file:///data/blobs
BLOB_FILE_STORE_UPLOAD_URL=file:///data/blobs
```

### Docker Compose integration

Add the environment variables to your `docker-compose.yml`:

```yaml
environment:
  # ... other environment variables
  BLOB_FILE_STORE_URLS: ${BLOB_FILE_STORE_URLS}
  BLOB_FILE_STORE_UPLOAD_URL: ${BLOB_FILE_STORE_UPLOAD_URL}
  L1_CONSENSUS_HOST_URLS: ${L1_CONSENSUS_HOST_URLS}
  BLOB_ARCHIVE_API_URL: ${BLOB_ARCHIVE_API_URL}
```

## Authentication

### Google Cloud Storage

Set up [Application Default Credentials](https://cloud.google.com/docs/authentication/application-default-credentials):

```bash
gcloud auth application-default login
```

Or use a service account key:

```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
```

### Amazon S3 / Cloudflare R2

Set AWS credentials as environment variables:

```bash
export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
```

For R2, these credentials come from your Cloudflare R2 API tokens.
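
You can sanity-check these credentials with the AWS CLI (assumed installed; the bucket name and `[ACCOUNT_ID]` are illustrative):

```bash
# S3: confirm the credentials resolve to a valid identity
aws sts get-caller-identity

# R2: list the bucket through your account's R2 endpoint
aws s3 ls s3://my-bucket/ --endpoint-url https://[ACCOUNT_ID].r2.cloudflarestorage.com
```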

## How blob retrieval works

When a node needs blobs for a block, the blob client follows this retrieval order:

### During historical sync

1. **File Store** - Quick lookup in configured file stores
2. **L1 Consensus** - Query beacon nodes using slot number
3. **Archive API** - Fall back to Blobscan or similar service

### During near-tip sync

1. **File Store** - Quick lookup (no retries)
2. **L1 Consensus** - Query beacon nodes
3. **File Store with retries** - Retry with backoff for eventual consistency
4. **Archive API** - Final fallback

The client automatically uploads fetched blobs to the configured upload file store, ensuring blobs are preserved for future retrieval.

## Verification

To verify that your blob configuration is working (steps 1 and 3 are scripted after this list):

1. **Check startup logs**: Look for messages about blob source connectivity
2. **Monitor blob retrieval**: Watch for successful blob fetches during sync
3. **Verify storage**: Check your storage bucket to confirm blob files exist
4. **Test retrieval**: Restart the node and verify it can retrieve previously stored blobs
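
A sketch of those checks; the service name, bucket, and grep pattern are assumptions, since exact log messages vary by version:

```bash
# Step 1: watch node logs for blob-related activity (pattern is illustrative)
docker compose logs -f aztec-node 2>&1 | grep -i blob

# Step 3: confirm blob files are accumulating in the configured store (GCS example)
gsutil ls "gs://my-snapshots/aztec-*/blobs/" | head
```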

## Troubleshooting

### No blob sources configured

**Issue**: Node starts with a warning about no blob sources.

**Solutions**:

- Configure at least one of `BLOB_FILE_STORE_URLS`, `L1_CONSENSUS_HOST_URLS`, or `BLOB_ARCHIVE_API_URL` (see the one-line fix below)
- Set `BLOB_ALLOW_EMPTY_SOURCES=true` only if you understand the implications (the node may fail to sync)
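
The quickest fix is a single line in `.env` (the beacon URL is illustrative):

```bash
# Any one of the three source variables clears the warning
L1_CONSENSUS_HOST_URLS=https://beacon.example.com
```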

### Blob retrieval fails

**Issue**: Node cannot retrieve blobs for a block.

**Solutions**:

- Verify your file store URLs are accessible (see the check below)
- Check L1 consensus host connectivity
- Ensure authentication credentials are configured
- Review node logs for specific error messages
- Try using multiple file store URLs for redundancy
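
A quick accessibility check, assuming the gsutil and AWS CLIs and the bucket names from the examples above:

```bash
# Each configured file store URL should list without an auth error
gsutil ls gs://primary-bucket/
aws s3 ls s3://backup-bucket/   # add --endpoint-url for Cloudflare R2
```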

### Upload fails

**Issue**: Blobs are not being uploaded to the file store.

**Solutions**:

- Verify `BLOB_FILE_STORE_UPLOAD_URL` is set
- Check write permissions on the storage bucket (see the write test below)
- Ensure credentials are configured (AWS/GCP)
- Note: HTTPS URLs are read-only and cannot be used for uploads
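
A minimal write test for a GCS upload target (bucket and object name are illustrative; the same idea works with `aws s3 cp` for S3/R2):

```bash
# Round-trip a small object to confirm write and delete permissions
echo "write test" | gsutil cp - gs://my-snapshots/blob-write-test.txt
gsutil rm gs://my-snapshots/blob-write-test.txt
```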

### L1 consensus host errors

**Issue**: Cannot connect to beacon nodes.

**Solutions**:

- Verify beacon node URLs are correct and accessible (see the probe below)
- Check if API keys are required and correctly configured
- Ensure the beacon node is synced
- Try multiple beacon node URLs for redundancy
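
Both reachability and sync status can be probed with the standard Beacon API (the host URL is illustrative):

```bash
# "is_syncing": false in the response indicates the beacon node is fully synced
curl -s https://beacon1.example.com/eth/v1/node/syncing
```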

## Best practices

- **Use file stores for production**: File stores provide faster, more reliable blob retrieval than L1 consensus
- **Configure multiple sources**: Use multiple file store URLs and L1 consensus hosts for redundancy
- **Enable blob uploads where they help**: Configure `BLOB_FILE_STORE_UPLOAD_URL` if you run shared infrastructure or advertise your store for others to read from; uploads to a private bucket mainly benefit your own nodes
- **Choose appropriate storage**:
  - Google Cloud Storage for GCP infrastructure
  - Amazon S3 for AWS infrastructure
  - Cloudflare R2 for cost-effective public distribution (free egress)
- **Monitor storage costs**: Blobs can accumulate over time; consider retention policies
- **Use archive API for historical access**: Configure `BLOB_ARCHIVE_API_URL` for accessing blobs older than ~18 days. Even with PeerDAS supernodes providing robust data availability, blob data is pruned from L1 after 4,096 epochs. Archive services like [Blobscan](https://blobscan.com/) store historical blob data indefinitely

## Next steps

- Learn about [using snapshots](./syncing_best_practices.md) for faster node synchronization
- Set up [monitoring](../operation/monitoring.md) to track your node's blob retrieval
- Check the [CLI reference](../reference/cli_reference.md) for additional blob-related options
- Join the [Aztec Discord](https://discord.gg/aztec) for support