
Conversation

@mrzeszutko
Contributor

Summary

  • Add comprehensive blob storage documentation for node operators
  • Rename BLOB_SINK_ARCHIVE_API_URL to BLOB_ARCHIVE_API_URL (cleanup after BlobSink removal)
  • Remove dead environment variables BLOB_SINK_PORT and BLOB_SINK_URL

Description

Following the removal of the BlobSink HTTP server (#19143), this PR:

  1. Adds new documentation (blob_storage.md) explaining how Aztec nodes store and retrieve blob data, including:

    • Overview of blob sources (FileStore, L1 Consensus, Archive API)
    • PeerDAS and supernode requirements for L1 consensus
    • Configuration examples for GCS, S3, and Cloudflare R2
    • Authentication setup
    • Troubleshooting guide
  2. Cleans up legacy naming by renaming BLOB_SINK_ARCHIVE_API_URL to BLOB_ARCHIVE_API_URL; the "sink" terminology is no longer accurate since the HTTP server was removed

  3. Removes dead code - BLOB_SINK_PORT and BLOB_SINK_URL env vars that were left behind after BlobSink removal
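For node operators updating an existing deployment, the rename and removals above amount to a small configuration migration. This is a hedged sketch (the URL value is a placeholder; only the variable names come from this PR):

```shell
# Migration sketch for the env-var changes in this PR.
# Before (legacy BlobSink naming):
#   BLOB_SINK_ARCHIVE_API_URL=...   # renamed below
#   BLOB_SINK_PORT=...              # dead, removed
#   BLOB_SINK_URL=...               # dead, removed

# After: the archive API variable drops the "SINK" prefix
export BLOB_ARCHIVE_API_URL="https://blobscan.example/api"  # placeholder value

# The dead variables can simply be unset / deleted from your env files
unset BLOB_SINK_ARCHIVE_API_URL BLOB_SINK_PORT BLOB_SINK_URL
```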

@mrzeszutko mrzeszutko changed the title from "Blob storage documentation" to "docs: blob storage documentation" on Dec 22, 2025
@mrzeszutko mrzeszutko marked this pull request as ready for review December 22, 2025 21:07

@spalladino spalladino left a comment


Let's split the instructions related to retrieval and storage, since they are meant for different users. Also let's please delete the generic or redundant instructions inserted by Claude, like "search logs for troubleshooting".

The blob client can retrieve blobs from multiple sources, tried in order:

1. **File Store**: Fast retrieval from configured storage (S3, GCS, R2, local files, HTTPS)
2. **L1 Consensus**: Beacon node API for recent blobs (within ~18 days)

Suggested change
2. **L1 Consensus**: Beacon node API for recent blobs (within ~18 days)
2. **L1 Consensus**: Beacon node API to a (semi-)supernode for recent blobs (within ~18 days)

- **Semi-supernodes** (validators with ≥1,824 ETH / 57 validators): Handle at least 64 columns, enabling reconstruction of complete blob data.
- **Regular nodes**: Only download 1/8th of the data (8 of 128 columns) to verify availability. This is **not sufficient** to serve complete blob data.

If L1 consensus is your only blob source, your beacon node must be a supernode or semi-supernode (or connected to one) to retrieve complete blobs. A regular node cannot reconstruct full blob data from its partial columns alone.

Suggested change
If L1 consensus is your only blob source, your beacon node must be a supernode or semi-supernode (or connected to one) to retrieve complete blobs. A regular node cannot reconstruct full blob data from its partial columns alone.
:::warning Supernodes
If L1 consensus is your only blob source, your beacon node must be a supernode or semi-supernode (or connected to one) to retrieve complete blobs. A regular node cannot reconstruct full blob data from its partial columns alone.
:::
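For readers of the thread, the retrieval setup under discussion might be configured roughly as follows. This is a hedged sketch: `BLOB_ARCHIVE_API_URL` comes from this PR, but `BLOB_FILE_STORE_URL` and `L1_CONSENSUS_HOST_URLS` are hypothetical variable names used purely for illustration, and all values are placeholders.

```shell
# Sketch of the blob sources tried in order (hypothetical variable names
# except BLOB_ARCHIVE_API_URL, which this PR introduces).

# 1. File store: fastest, from configured storage (S3, GCS, R2, local, HTTPS)
export BLOB_FILE_STORE_URL="gs://my-bucket/blobs"           # hypothetical

# 2. L1 consensus: beacon node API for recent blobs (~18 days);
#    must point at a supernode or semi-supernode to get complete blobs
export L1_CONSENSUS_HOST_URLS="http://beacon:5052"          # hypothetical

# 3. Archive API: fallback for older blobs
export BLOB_ARCHIVE_API_URL="https://blobscan.example/api"  # placeholder value
```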

Comment on lines +47 to +48
- Have the Aztec node software installed
- Understand basic node operation

I think we can remove this


- Have the Aztec node software installed
- Understand basic node operation
- For uploading blobs: Have access to cloud storage (Google Cloud Storage, Amazon S3, or Cloudflare R2) with appropriate permissions

Let's not mix up instructions for blob retrieval and blob upload. Move everything related to blob upload to a separate file or at least a separate section, since only a very small subset of users will do uploads.

Comment on lines +201 to +204
1. **Check startup logs**: Look for messages about blob source connectivity
2. **Monitor blob retrieval**: Watch for successful blob fetches during sync
3. **Verify storage**: Check your storage bucket to confirm blob files exist
4. **Test retrieval**: Restart the node and verify it can retrieve previously stored blobs

Let's either have proper instructions for verification, or just delete them


- **Use file stores for production**: File stores provide faster, more reliable blob retrieval than L1 consensus
- **Configure multiple sources**: Use multiple file store URLs and L1 consensus hosts for redundancy
- **Enable blob uploads**: Configure `BLOB_FILE_STORE_UPLOAD_URL` to contribute to blob availability
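The redundancy advice above might translate into configuration like this. Hedged sketch: only `BLOB_FILE_STORE_UPLOAD_URL` is named in the docs under review; the other variable names are hypothetical placeholders, and (per the review comment below) uploading only helps others if you advertise the store, as snapshot providers do.

```shell
# Sketch of redundant blob sources (hypothetical names except
# BLOB_FILE_STORE_UPLOAD_URL, which appears in the docs this PR adds).

# Multiple file store URLs for redundancy
export BLOB_FILE_STORE_URLS="gs://primary/blobs,https://mirror.example/blobs"   # hypothetical

# Multiple L1 consensus hosts as fallback
export L1_CONSENSUS_HOST_URLS="http://beacon-1:5052,http://beacon-2:5052"       # hypothetical

# Optional: upload blobs to a store you operate
export BLOB_FILE_STORE_UPLOAD_URL="gs://primary/blobs"
```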

This is not a best practice for all users. Uploading to a store does not help anyone unless they advertise it, like snapshot providers do.

3. **Verify storage**: Check your storage bucket to confirm blob files exist
4. **Test retrieval**: Restart the node and verify it can retrieve previously stored blobs

## Troubleshooting

I understand Claude generated most of this. Let's clean it up and remove the slop.
