So now our CIDv1 in binary (0s and 1s) gives us this information:
<cid-version><multicodec><multihash>
Since binary CIDs are not very human-friendly, we can represent these binary CIDs in a string form (binary data represented as text). Example:
bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi
Converting data between binary and string formats requires base encoding, so when working with string CIDs it's important that we know the type of base encoding that was applied to the binary data. But how can we identify this?
In CIDv0
, hashes are always encoded with base58btc
. Always. This means that we can safely interpret CIDv0
hashes assuming they are using base58btc
. However, due to environment constraints (e.g. DNS names), we need the ability to support other base encodings as well. For that, you guessed it, we can add another prefix!
Multibase prefixes, which represent the base encoding used when converting CIDs between string and binary formats, are only used in the string form of the CID:
Let's examine two examples of CIDs in their string form:
We know the first one is a CIDv0
because it starts with Qm...
. All hashes that start with Qm
can safely be interpreted as base58btc
as a CID of Version 0.
The second example starts with b
, the base encoding prefix identifier for base32
, which is used by default by most implementations of IPFS.
For the complete list of multibase
identifiers, see this table.