CIDv0
uses multihash to support multiple hashing functions. This means that we can successfully generate a hash for specific content using different hashing algorithms and later on be able to identify content using this hash.
But when we're trying to read the data itself, how do we know the encoding method used?
It could be encoded with CBOR, Protobuf, plain JSON, etc. To solve this issue, CIDv1
introduces another prefix that uniquely identifies the encoding method used.
The multicodec prefix indicates which encoding was used on the data.
Multicodec supports many different types of encoding, and each has its own short codec identifier, as shown in the complete table.
In the example above we see how data encoded with the codec dag-pb
would be represented in our CID.
dag-pb
is one of many different types of IPLD (InterPlanetary Linked Data) codecs. Because IPFS always uses one of these IPLD formats for its data, the multicodec prefix in an IPFS CID will always be an IPLD codec.
However, it's important to note that multicodec isn't only used by IPFS and IPLD. Along with multihash and a few other self-describing protocols, it's part of the Multiformats project, which spun off from IPFS and now supports a wide variety of other projects and protocols, including the CID specification we're learning about here.