docs: blob storage documentation #19194
base: next
Conversation
spalladino left a comment:
Let's split the instructions related to retrieval and storage, since they are meant for different users. Also let's please delete the generic or redundant instructions inserted by Claude, like "search logs for troubleshooting".
> The blob client can retrieve blobs from multiple sources, tried in order:
>
> 1. **File Store**: Fast retrieval from configured storage (S3, GCS, R2, local files, HTTPS)
> 2. **L1 Consensus**: Beacon node API for recent blobs (within ~18 days)
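The ordered fallback quoted above could be sketched roughly like this (the names are illustrative, not the actual client API):

```typescript
// Hypothetical sketch of ordered blob-source fallback.
// A source returns the blob bytes, or undefined if it does not have them.
type BlobSource = (blobHash: string) => Promise<Uint8Array | undefined>;

async function retrieveBlob(
  blobHash: string,
  sources: BlobSource[], // e.g. [fileStore, l1Consensus], tried in order
): Promise<Uint8Array> {
  for (const source of sources) {
    const blob = await source(blobHash);
    if (blob !== undefined) return blob; // first source with the blob wins
  }
  throw new Error(`blob ${blobHash} not found in any configured source`);
}
```

The ordering is the point: the file store answers the fast path, and L1 consensus is only consulted when the file store misses.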
Suggested change:
-2. **L1 Consensus**: Beacon node API for recent blobs (within ~18 days)
+2. **L1 Consensus**: Beacon node API to a (semi-)supernode for recent blobs (within ~18 days)
> - **Semi-supernodes** (validators with ≥1,824 ETH / 57 validators): Handle at least 64 columns, enabling reconstruction of complete blob data.
> - **Regular nodes**: Only download 1/8th of the data (8 of 128 columns) to verify availability. This is **not sufficient** to serve complete blob data.
>
> If L1 consensus is your only blob source, your beacon node must be a supernode or semi-supernode (or connected to one) to retrieve complete blobs. A regular node cannot reconstruct full blob data from its partial columns alone.
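For intuition, the custody numbers quoted above reduce to a simple threshold: with 128 erasure-coded columns, any 64 suffice to reconstruct the original data. A sketch (not consensus-client code):

```typescript
// Illustrative PeerDAS-style custody check. With 128 total columns and
// 2x erasure coding, any 64 columns reconstruct complete blob data.
const TOTAL_COLUMNS = 128;
const RECONSTRUCTION_THRESHOLD = TOTAL_COLUMNS / 2; // 64

function canServeCompleteBlobs(custodyColumns: number): boolean {
  return custodyColumns >= RECONSTRUCTION_THRESHOLD;
}

// A regular node custodies 8 columns: canServeCompleteBlobs(8) === false.
// A semi-supernode custodies at least 64: canServeCompleteBlobs(64) === true.
```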
Suggested change:
-If L1 consensus is your only blob source, your beacon node must be a supernode or semi-supernode (or connected to one) to retrieve complete blobs. A regular node cannot reconstruct full blob data from its partial columns alone.
+:::warning Supernodes
+If L1 consensus is your only blob source, your beacon node must be a supernode or semi-supernode (or connected to one) to retrieve complete blobs. A regular node cannot reconstruct full blob data from its partial columns alone.
+:::
> - Have the Aztec node software installed
> - Understand basic node operation
I think we can remove this
> - Have the Aztec node software installed
> - Understand basic node operation
> - For uploading blobs: Have access to cloud storage (Google Cloud Storage, Amazon S3, or Cloudflare R2) with appropriate permissions
Let's not mix up instructions for blob retrieval and blob upload. Move everything related to blob upload to a separate file or at least a separate section, since only a very small subset of users will do uploads.
> 1. **Check startup logs**: Look for messages about blob source connectivity
> 2. **Monitor blob retrieval**: Watch for successful blob fetches during sync
> 3. **Verify storage**: Check your storage bucket to confirm blob files exist
> 4. **Test retrieval**: Restart the node and verify it can retrieve previously stored blobs
Let's either have proper instructions for verification, or just delete them
> - **Use file stores for production**: File stores provide faster, more reliable blob retrieval than L1 consensus
> - **Configure multiple sources**: Use multiple file store URLs and L1 consensus hosts for redundancy
> - **Enable blob uploads**: Configure `BLOB_FILE_STORE_UPLOAD_URL` to contribute to blob availability
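As an illustration of the redundancy point, a node might assemble its blob-store settings along these lines. `BLOB_FILE_STORE_UPLOAD_URL` appears in the quoted docs; the comma-separated `BLOB_FILE_STORE_URLS` variable is an assumption made for this sketch, not a confirmed setting:

```typescript
// Illustrative config loading. BLOB_FILE_STORE_URLS is a hypothetical
// variable name for this sketch; BLOB_FILE_STORE_UPLOAD_URL is from the docs.
interface BlobStoreConfig {
  downloadUrls: string[]; // several sources, for redundancy
  uploadUrl?: string; // set only if you intend to contribute blobs
}

function loadBlobStoreConfig(env: Record<string, string | undefined>): BlobStoreConfig {
  return {
    downloadUrls: (env.BLOB_FILE_STORE_URLS ?? '').split(',').filter(u => u.length > 0),
    uploadUrl: env.BLOB_FILE_STORE_UPLOAD_URL || undefined,
  };
}
```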
This is not a best practice for all users. Uploading to a store does not help anyone unless they advertise it, like snapshot providers do.
> 3. **Verify storage**: Check your storage bucket to confirm blob files exist
> 4. **Test retrieval**: Restart the node and verify it can retrieve previously stored blobs
>
> ## Troubleshooting
I understand Claude generated most of this. Let's clean them up and remove the slop.
Summary

- Renames `BLOB_SINK_ARCHIVE_API_URL` → `BLOB_ARCHIVE_API_URL` (cleanup after BlobSink removal)
- Removes the leftover `BLOB_SINK_PORT` and `BLOB_SINK_URL` env vars

Description

Following the removal of the BlobSink HTTP server (#19143), this PR:

- Adds new documentation (`blob_storage.md`) explaining how Aztec nodes store and retrieve blob data
- Cleans up legacy naming by renaming `BLOB_SINK_ARCHIVE_API_URL` to `BLOB_ARCHIVE_API_URL`; the "sink" terminology is no longer accurate since the HTTP server was removed
- Removes dead code: the `BLOB_SINK_PORT` and `BLOB_SINK_URL` env vars left behind after the BlobSink removal
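For deployments that still set the old variable, a backward-compatible lookup could prefer the new name and fall back to the legacy one. A sketch, not the PR's actual code:

```typescript
// Prefer the renamed BLOB_ARCHIVE_API_URL; fall back to the legacy
// BLOB_SINK_ARCHIVE_API_URL for deployments that have not migrated yet.
function getArchiveApiUrl(env: Record<string, string | undefined>): string | undefined {
  return env.BLOB_ARCHIVE_API_URL ?? env.BLOB_SINK_ARCHIVE_API_URL;
}
```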