diff --git a/images/blobs_fig_1_blob.svg b/images/blobs_fig_1_blob.svg
new file mode 100644
index 0000000..650cdf4
--- /dev/null
+++ b/images/blobs_fig_1_blob.svg
@@ -0,0 +1 @@
+[SVG figure text labels: fixed-size chunk hashes; root hash; cid; Blob; <opaque data> ... <opaque data continued> <padding>; binary data; BLOB]
\ No newline at end of file
diff --git a/images/blobs_fig_1_blob_dark.svg b/images/blobs_fig_1_blob_dark.svg
new file mode 100644
index 0000000..0f9e475
--- /dev/null
+++ b/images/blobs_fig_1_blob_dark.svg
@@ -0,0 +1 @@
+[SVG figure text labels: fixed-size chunk hashes; root hash; cid; Blob; <opaque data> ... <opaque data continued> <padding>; binary data; BLOB]
\ No newline at end of file
diff --git a/images/blobs_fig_2_collection.svg b/images/blobs_fig_2_collection.svg
new file mode 100644
index 0000000..2f28288
--- /dev/null
+++ b/images/blobs_fig_2_collection.svg
@@ -0,0 +1 @@
+[SVG figure text labels: list of example hashes (77sfc23pszkazsxg5w4q2hlnpaqjgjsq4sg3z4py3sa6gzryfpuq, ...); tree hashes; collection hash sequence; conventionally, metadata blob is first hash; root hash; cid; Collection]
\ No newline at end of file
diff --git a/images/blobs_fig_2_collection_dark.svg b/images/blobs_fig_2_collection_dark.svg
new file mode 100644
index 0000000..15fb717
--- /dev/null
+++ b/images/blobs_fig_2_collection_dark.svg
@@ -0,0 +1 @@
+[SVG figure text labels: tree hashes; root hash; cid; Collection; list of example hashes (77sfc23pszkazsxg5w4q2hlnpaqjgjsq4sg3z4py3sa6gzryfpuq, ...); collection hash sequence; conventionally, metadata blob is first hash]
\ No newline at end of file
diff --git a/protocols/blobs.mdx b/protocols/blobs.mdx
index 1103178..c00836b 100644
--- a/protocols/blobs.mdx
+++ b/protocols/blobs.mdx
@@ -2,9 +2,34 @@
 title: "Blobs"
 ---
 
-Content-addressed blob storage and transfer. `iroh-blobs` implements
-request/response and streaming transfers of arbitrary-sized byte blobs, using
-BLAKE3-verified streams and content-addressed links.
+The blobs protocol works with sequences of opaque data ("blobs"), which are often the bytes of a file.
+It's a protocol for data storage and transfer. `iroh-blobs` is _content-addressed_, which means data is always referenced by its cryptographic hash. Blobs implements request/response and
+streaming transfers of blobs or ranges of blobs, using hashes to verify that streams remain intact.
+
+Every blob within iroh is referred to by the BLAKE3 hash of its content. BLAKE3 is a _tree hashing_ algorithm that splits its input into uniform chunks and arranges them as the leaves of a binary tree, producing intermediate _chunk hashes_ that accumulate up to a _root hash._ Iroh uses the 32-byte root hash (or just “hash”) as an immutable blob identifier.
+
+
+  Blobs are sequences of opaque data, which are referred to by their hash
+  Blobs are sequences of opaque data, which are referred to by their hash
+
+
+The blobs protocol leverages the tree hash structure of BLAKE3 as a basis for _incremental verification_. When data is sent over the network, the integrity of each chunk is checked by both the sender and the receiver, as described in Section 6.4 “Verified Streaming” of the [BLAKE3 paper](https://github.com/BLAKE3-team/BLAKE3-specs). Iroh caches all chunk hashes as external metadata, leaving the unaltered input blob as the canonical source of bytes.
Verified streaming also facilitates _range requests:_ fetching a verifiable contiguous subsequence of a blob by streaming only the portions of the BLAKE3 binary tree required to verify the designated subsequence.
+
+Chunk hashes are distinct from root hashes and are only used during data transfer. The chunk size of BLAKE3 is a tunable constant that defaults to 1KiB, which results in a 6% overhead on the size of the input blob. Increasing the chunk size reduces overhead, at the cost of requiring more data transfer before an incremental verification checkpoint is reached. The chunk size constant can be modified, and the cached chunk hashes recalculated, *without affecting the root hash.* This opens the door to experimenting with different chunk size constants, even at runtime. We intend to investigate chunk size optimization in future work.
+
+## Collections
+
+A *Collection* is an ordered sequence of hashes that refer to other blobs. A collection can reference anywhere from zero to billions of blobs. Collections are the _only_ means of relating blobs, and form the basis of synchronization and querying. Collections cannot be queried as nested graphs.
+
+The serialized form of a collection is each 32-byte hash in the collection, concatenated without any separator. A collection is only valid if its byte length is a multiple of 32, which also makes indexing into a collection with a known byte length trivial: multiply the desired index by 32 to get the byte offset of that hash.
+
+By convention, the initial element in a collection is the _metadata blob:_ it stores information about all other blobs in the collection. The contents of a metadata blob can be defined by applications.
+
+
+  collections are an ordered sequence of
+  collections are an ordered sequence of
+
+
 
 ## Concepts
 
@@ -296,4 +321,4 @@ If you're hungry for more, check out
 - the [iroh rust documentation](https://docs.rs/iroh),
 - Video streaming with [iroh-roq](/protocols/streaming),
 - Writing your [own protocols](/protocols/writing-a-protocol.mdx),
-- [other examples](/examples), or
\ No newline at end of file
+- [other examples](/examples), or
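
Review note on the Collections section added in `protocols/blobs.mdx`: the serialization rule (32-byte hashes concatenated with no separator, valid only when the byte length is a multiple of 32, indexed by multiplying by 32) may be easier to follow with a small example. The following is a minimal sketch under those stated rules only. The names `HASH_LEN`, `serialize_collection`, `is_valid_collection`, and `hash_at` are hypothetical and are not taken from the `iroh-blobs` API.

```rust
/// Length in bytes of a root hash, as stated in the Collections text.
/// (Hypothetical constant name, not an iroh-blobs identifier.)
const HASH_LEN: usize = 32;

/// Serialize a collection by concatenating its hashes with no separator.
fn serialize_collection(hashes: &[[u8; HASH_LEN]]) -> Vec<u8> {
    hashes.concat()
}

/// A serialized collection is valid only if its length is a multiple of 32.
/// An empty byte string is valid and represents a collection of zero hashes.
fn is_valid_collection(bytes: &[u8]) -> bool {
    bytes.len() % HASH_LEN == 0
}

/// Return the hash at `index`, or `None` if the bytes are not a valid
/// collection or the index is out of range.
fn hash_at(bytes: &[u8], index: usize) -> Option<[u8; HASH_LEN]> {
    if !is_valid_collection(bytes) {
        return None;
    }
    // The byte offset of an entry is its index multiplied by 32.
    let start = index.checked_mul(HASH_LEN)?;
    let end = start.checked_add(HASH_LEN)?;
    let slice = bytes.get(start..end)?;
    let mut out = [0u8; HASH_LEN];
    out.copy_from_slice(slice);
    Some(out)
}
```

With this layout, index 0 would be the metadata blob by the convention described in the diff, and looking up any entry is a constant-time slice operation with no parsing step.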