Cryptography

Ambush Attacks on 160-bit

Object IDs and Addresses

Aug 16, 2023

Mysten Labs

2 min read

Mysten Labs routinely performs audits of major blockchains to analyze the various methods utilized for ensuring an efficient and secure blockchain – and then works to implement the best of those solutions on Sui. As part of this process, we recently performed internal audits and analysis regarding hash execution cost on Sui. The results of these efforts show that the cost of 160-bit hash collision attacks can be accomplished under certain assumptions and access to hardware for as little as just a few million dollars ($), making it a feasible attack vector even for individuals or less sophisticated actors.

Proposal: Use 256-bit object IDs (and inherently addresses too). To be on the safe side, Sui has already adopted the 256-bit address and object ID model to proactively prevent attacks discussed in this post.

Impact and attacks

Before we go into detail about why this attack is feasible, let’s assume someone can create two object IDs that collide. This type of attack is an offline attack, and there is no need to interact with the blockchain until one finds a collision. This is possible because object ID creation is a deterministic function based on the transaction data or, depending on the object type, the parent object ID. In any case, there is no unpredictable randomness involved – one could compute potential object IDs offline. Note also that object IDs are plain byte inputs – there is no elliptic curve crypto involved – and thus the computation is literally just that of a BLAKE2b hash function invocation.

Scenario 1 – Ambush

Objective 1: Compute (offline) two assets whose object IDs collide, which allows for ambush attacks (see our presentation). The following describes how a similar attack was possible on Ethereum (before geth client updates and related EIP-3607):

Objective 2: Find 2 contracts with the same address

  1. Write 2 contracts: (a) the one you want to deploy and attack and (b) a "smart" dummy one.

  2. Execute a collision attack until both addresses match (i.e., pick random salts).

  3. Publish the 1st contract (i.e., a DeFi, ERC20, L2 Rollup etc).

  4. Wait until the contract receives enough locked funds (you have to make it popular).

  5. When you are ready for the attack, publish the dummy contract.It now overrides the previous address.

  6. Drain the balance(s) in the 1st smart contract & transfer them to your controlled account.

We can extrapolate learnings from the preceding Ethereum scenario and apply them to Sui. Note that on Sui, where everything is an object, such ambush scenarios can be applied at the asset level, not only at the contract-logic layer. Ethereum uses 256-bits for memory and object access, and it's using only 160-bits for addresses (including EOA - contract addresses), so the attack is more contained there.

Scenario 2 - Live versus burnt objects

Bridges and light clients typically work with object IDs and txn IDs. It is technically possible to perform the attack described in the preceding section in such a way where the original object gets burnt. The adversary then recreates the same ID, but for a different object. There are many business scenarios where this can be problematic, and in theory it can also happen to fungible objects, such as when someone mints a $1B USDC object, then burns and creates another object with the same ID.

Scenario 3 - BFT Safety

Finding two transactions that result in the same object ID can break the safety of consensus protocols. Especially in the Fastpay protocol, these transactions can happen in parallel (single owner transactions). Note that we mostly focus on object collision as opposed to account address collision. This is because objects have states and any ID collision with another object is a BFT threat (for both ownership and asset feature reasons).

Mitigations

Sui uses 256-bit addresses and object IDs, and thus it’s immune against truncated-hash collisions attacks, but before taking this decision, another approach was also explored:

Track All Historical Object IDs

  • Compute the N possible object IDs that can be generated from txn (transaction).

  • If any of them is already in the validator's memory, abort the txn.

  • It’s very important to keep track of deleted object IDs as well.

A variation of the above is what other blockchains, which use 160-bit for addresses, typically do; unfortunately that wouldn’t work in Sui, mainly for two reasons:

  • Single-owner txns take the Fastpay (consensus-less) path, and they are executed in parallel, so in theory it would be possible to have two txns that happen to produce colliding object IDs.

  • Sui is a high-frequency transaction solution, and the amount of objects created and deleted (burnt) can be prohibitive for proactively verifying if an object exists before creating it. The most challenging part would be burnt objects, as then validators would have to maintain a huge database of all historical object IDs, since its existence at genesis block.

Summary: 256-bit objects (and addresses) are a straightforward solution to preventing 160-big hash collision attacks, and easier to implement than any other partial approach.

Educational content regarding security of hash functions

There are four levels of security for a cryptographic hash function 𝐻H.

The four security properties of hash functions include:

  1. Preimage resistance: 𝑥, find a message 𝑚 so that 𝐻(𝑚) = 𝑥.

    • Security of modern secure hash functions against this attack is typically \(2^n\) where n = hash_digest_size.

    • For 160-bit hashes, the effort is \(2^{160}\)

    • For 256-bit hashes, the effort is \(2^{256}\)

  2. 2nd-preimage resistance: given 𝑚1, find another 𝑚2≠𝑚1 so that 𝐻(𝑚1) = 𝐻(𝑚2).

    • Security of modern secure hash functions against this attack is typically \(2^n\) where n = hash_digest_size.

    • For 160-bit hashes, the effort is \(2^{160}\)

    • For 256-bit hashes, the effort is \(2^{256}\)

  3. Multi-target 2nd-preimage resistance: given a set of messages 𝑚X = {𝑚1, 𝑚2 .. 𝑚K} find another 𝑚Z so that 𝐻(𝑚Z) = 𝐻(any of 𝑚X).

    • The difference between this and 2nd-preimage is that now we target any element from a predefined set of messages, not just one 𝑚1.

    • Security of modern secure hash functions against this attack is typically \(2^{n-k}\) where n = hash_digest_size and k = \(log2(K)\) (where K = size of the set).

    • As you see, it’s easier to find a multi-target collision attack vs preimage attacks, because now you try to attack any existing element in a list, not just one.

    • For 160-bit hashes and 2^32 = 4 billion elements, the effort is \(2^{160-32}\) = \(2^{128}\)

    • For 256-bit hashes and 2^32 = 4 billion elements, the effort is \(2^{256-32}\) = \(2^{224}\)

  4. Collision resistance: find any pair of 𝑚1 and 𝑚2 so that 𝐻(𝑚1)=𝐻(𝑚2).

    • Also known as the birthday paradox: If you survey a random group of just 23 people, there is actually about a 50–50 chance that two of them will have the same birthday. As you see it’s not 1-out-of-365, because now we compare against many combinations of people-pairs, and we want at least one pair to match.

    • The difference between this and 2nd-preimage is that now we don’t specifically target an m1, but we try to find any pair of elements (that we can even create offline) whose hashes collide.

    • Security of modern secure hash functions against this attack is typically \(2^{n/2}\) where n = hash_digest_size.

    • As you see, it’s by far easier to find a collision attack vs preimage attacks, because now you try to find a random pair, not target a particular account/object etc.

    • For 160-bit hashes, the effort is \(2^{80}\)

    • For 256-bit hashes, the effort is \(2^{128}\)

    • Today, Bitcoin is operating at over \(2^{84}\) hashes per day, ~\(2^{93}\) annually. Historically, this generally doubles every 1-2 years. The following calculation is converting difficulty target to hashes per day, as we see ~\(2^{84.5}\)

Cost of the \( 2^{80} \) attack

The cost for the attack is estimated to be in the range of $1-10M, assuming hardware and storage are already in place. The reasoning behind this estimate is:

\( log(2,44000000000000 * (2^{32} / 600) * 60 * 60 * 24) \)

  • The preceding estimate implies that the Bitcoin network could, in theory, perform a \( 2^{80}\) sha256 hash invocations in about an hour!

  • Based on the mining profit per day (which is a few million dollars), and taking into account that there is usually a profit margin of 5-50%, the cost per hash can go down to about \( 1/2^{60}\) per hash invocation. Thus, for \(2^{80}\), we would need \(2^{20}\) = $1M USD.

  • Typically, mining is easier than collision attack generation because you need a slightly different setup for agents to communicate, but it’s a straightforward infrastructure and we’re aware of algorithms that search for collisions without requiring large storage requirements using techniques applied in rainbow tables that trade computations for storage. It would cost a bit more, but not necessarily significantly more. Also note that modern hash functions such as BLAKE2 and BLAKE3 families are already faster compared to Bitcoin's sha256.

  • Note that back in 2017, it took researchers \(2^{65}\) bits of effort and about $40k to break sha160. Since then, ASICS, GPUs, and better access to cloud services would allow for cheaper attacks, reduced by a significant factor. Thus, it's not a surprise that today, a \(2^{80}\) attack would cost a few million dollars. A similar analysis from Facebook research showed that \(2^{92}\) collision attacks on AES cost about $4B, thus for \(2^{80}\), it would be about $1M; the results are consistent.

Notes

  • We've discussed our claims with various external research teams and security experts in the industry, including the former CTO of CISO and researchers at a16z crypto. The feedback regarding recommended security levels for blockchain / contract / object address space aligns with our claims.

  • Also, based on discussions with security leads in the space, the general perception is that anything below $1B should be considered a threat, just to have room for cryptographic or CPU advancements. There was also a comment around the fact that even the “thought” of this being possible with a medium-scale adversary might disincentivize some stablecoin, bridges, or asset providers from launching in the first place.

  • This analysis does not only apply to BFT applications. 160-bit commitment schemes inside smart contracts should also switch to 256-bit. For example, the recent audit report to an over-collateralized sealed bid Ethereum auction contract here shows that originally we thought it’s a billion dollar cost attack, but this analysis proves it’s in the few  millions range, especially because we don’t deal with elliptic curve cryptography and it is just hash collisions on auction bidding prices.

  • Note that adversaries can indirectly benefit from such attacks by purposefully causing reputation damage to the network and subsequently profiting from the resulting price movements.

  • Many months prior to publishing this blog, we ethically informed other products and firms (including outside Sui and the blockchain industry at large) that we believed might be affected by such attacks in the future, should they stick with 160-bit commitments.

Mysten Labs routinely performs audits of major blockchains to analyze the various methods utilized for ensuring an efficient and secure blockchain – and then works to implement the best of those solutions on Sui. As part of this process, we recently performed internal audits and analysis regarding hash execution cost on Sui. The results of these efforts show that the cost of 160-bit hash collision attacks can be accomplished under certain assumptions and access to hardware for as little as just a few million dollars ($), making it a feasible attack vector even for individuals or less sophisticated actors.

Proposal: Use 256-bit object IDs (and inherently addresses too). To be on the safe side, Sui has already adopted the 256-bit address and object ID model to proactively prevent attacks discussed in this post.

Impact and attacks

Before we go into detail about why this attack is feasible, let’s assume someone can create two object IDs that collide. This type of attack is an offline attack, and there is no need to interact with the blockchain until one finds a collision. This is possible because object ID creation is a deterministic function based on the transaction data or, depending on the object type, the parent object ID. In any case, there is no unpredictable randomness involved – one could compute potential object IDs offline. Note also that object IDs are plain byte inputs – there is no elliptic curve crypto involved – and thus the computation is literally just that of a BLAKE2b hash function invocation.

Scenario 1 – Ambush

Objective 1: Compute (offline) two assets whose object IDs collide, which allows for ambush attacks (see our presentation). The following describes how a similar attack was possible on Ethereum (before geth client updates and related EIP-3607):

Objective 2: Find 2 contracts with the same address

  1. Write 2 contracts: (a) the one you want to deploy and attack and (b) a "smart" dummy one.

  2. Execute a collision attack until both addresses match (i.e., pick random salts).

  3. Publish the 1st contract (i.e., a DeFi, ERC20, L2 Rollup etc).

  4. Wait until the contract receives enough locked funds (you have to make it popular).

  5. When you are ready for the attack, publish the dummy contract.It now overrides the previous address.

  6. Drain the balance(s) in the 1st smart contract & transfer them to your controlled account.

We can extrapolate learnings from the preceding Ethereum scenario and apply them to Sui. Note that on Sui, where everything is an object, such ambush scenarios can be applied at the asset level, not only at the contract-logic layer. Ethereum uses 256-bits for memory and object access, and it's using only 160-bits for addresses (including EOA - contract addresses), so the attack is more contained there.

Scenario 2 - Live versus burnt objects

Bridges and light clients typically work with object IDs and txn IDs. It is technically possible to perform the attack described in the preceding section in such a way where the original object gets burnt. The adversary then recreates the same ID, but for a different object. There are many business scenarios where this can be problematic, and in theory it can also happen to fungible objects, such as when someone mints a $1B USDC object, then burns and creates another object with the same ID.

Scenario 3 - BFT Safety

Finding two transactions that result in the same object ID can break the safety of consensus protocols. Especially in the Fastpay protocol, these transactions can happen in parallel (single owner transactions). Note that we mostly focus on object collision as opposed to account address collision. This is because objects have states and any ID collision with another object is a BFT threat (for both ownership and asset feature reasons).

Mitigations

Sui uses 256-bit addresses and object IDs, and thus it’s immune against truncated-hash collisions attacks, but before taking this decision, another approach was also explored:

Track All Historical Object IDs

  • Compute the N possible object IDs that can be generated from txn (transaction).

  • If any of them is already in the validator's memory, abort the txn.

  • It’s very important to keep track of deleted object IDs as well.

A variation of the above is what other blockchains, which use 160-bit for addresses, typically do; unfortunately that wouldn’t work in Sui, mainly for two reasons:

  • Single-owner txns take the Fastpay (consensus-less) path, and they are executed in parallel, so in theory it would be possible to have two txns that happen to produce colliding object IDs.

  • Sui is a high-frequency transaction solution, and the amount of objects created and deleted (burnt) can be prohibitive for proactively verifying if an object exists before creating it. The most challenging part would be burnt objects, as then validators would have to maintain a huge database of all historical object IDs, since its existence at genesis block.

Summary: 256-bit objects (and addresses) are a straightforward solution to preventing 160-big hash collision attacks, and easier to implement than any other partial approach.

Educational content regarding security of hash functions

There are four levels of security for a cryptographic hash function 𝐻H.

The four security properties of hash functions include:

  1. Preimage resistance: 𝑥, find a message 𝑚 so that 𝐻(𝑚) = 𝑥.

    • Security of modern secure hash functions against this attack is typically \(2^n\) where n = hash_digest_size.

    • For 160-bit hashes, the effort is \(2^{160}\)

    • For 256-bit hashes, the effort is \(2^{256}\)

  2. 2nd-preimage resistance: given 𝑚1, find another 𝑚2≠𝑚1 so that 𝐻(𝑚1) = 𝐻(𝑚2).

    • Security of modern secure hash functions against this attack is typically \(2^n\) where n = hash_digest_size.

    • For 160-bit hashes, the effort is \(2^{160}\)

    • For 256-bit hashes, the effort is \(2^{256}\)

  3. Multi-target 2nd-preimage resistance: given a set of messages 𝑚X = {𝑚1, 𝑚2 .. 𝑚K} find another 𝑚Z so that 𝐻(𝑚Z) = 𝐻(any of 𝑚X).

    • The difference between this and 2nd-preimage is that now we target any element from a predefined set of messages, not just one 𝑚1.

    • Security of modern secure hash functions against this attack is typically \(2^{n-k}\) where n = hash_digest_size and k = \(log2(K)\) (where K = size of the set).

    • As you see, it’s easier to find a multi-target collision attack vs preimage attacks, because now you try to attack any existing element in a list, not just one.

    • For 160-bit hashes and 2^32 = 4 billion elements, the effort is \(2^{160-32}\) = \(2^{128}\)

    • For 256-bit hashes and 2^32 = 4 billion elements, the effort is \(2^{256-32}\) = \(2^{224}\)

  4. Collision resistance: find any pair of 𝑚1 and 𝑚2 so that 𝐻(𝑚1)=𝐻(𝑚2).

    • Also known as the birthday paradox: If you survey a random group of just 23 people, there is actually about a 50–50 chance that two of them will have the same birthday. As you see it’s not 1-out-of-365, because now we compare against many combinations of people-pairs, and we want at least one pair to match.

    • The difference between this and 2nd-preimage is that now we don’t specifically target an m1, but we try to find any pair of elements (that we can even create offline) whose hashes collide.

    • Security of modern secure hash functions against this attack is typically \(2^{n/2}\) where n = hash_digest_size.

    • As you see, it’s by far easier to find a collision attack vs preimage attacks, because now you try to find a random pair, not target a particular account/object etc.

    • For 160-bit hashes, the effort is \(2^{80}\)

    • For 256-bit hashes, the effort is \(2^{128}\)

    • Today, Bitcoin is operating at over \(2^{84}\) hashes per day, ~\(2^{93}\) annually. Historically, this generally doubles every 1-2 years. The following calculation is converting difficulty target to hashes per day, as we see ~\(2^{84.5}\)

Cost of the \( 2^{80} \) attack

The cost for the attack is estimated to be in the range of $1-10M, assuming hardware and storage are already in place. The reasoning behind this estimate is:

\( log(2,44000000000000 * (2^{32} / 600) * 60 * 60 * 24) \)

  • The preceding estimate implies that the Bitcoin network could, in theory, perform a \( 2^{80}\) sha256 hash invocations in about an hour!

  • Based on the mining profit per day (which is a few million dollars), and taking into account that there is usually a profit margin of 5-50%, the cost per hash can go down to about \( 1/2^{60}\) per hash invocation. Thus, for \(2^{80}\), we would need \(2^{20}\) = $1M USD.

  • Typically, mining is easier than collision attack generation because you need a slightly different setup for agents to communicate, but it’s a straightforward infrastructure and we’re aware of algorithms that search for collisions without requiring large storage requirements using techniques applied in rainbow tables that trade computations for storage. It would cost a bit more, but not necessarily significantly more. Also note that modern hash functions such as BLAKE2 and BLAKE3 families are already faster compared to Bitcoin's sha256.

  • Note that back in 2017, it took researchers \(2^{65}\) bits of effort and about $40k to break sha160. Since then, ASICS, GPUs, and better access to cloud services would allow for cheaper attacks, reduced by a significant factor. Thus, it's not a surprise that today, a \(2^{80}\) attack would cost a few million dollars. A similar analysis from Facebook research showed that \(2^{92}\) collision attacks on AES cost about $4B, thus for \(2^{80}\), it would be about $1M; the results are consistent.

Notes

  • We've discussed our claims with various external research teams and security experts in the industry, including the former CTO of CISO and researchers at a16z crypto. The feedback regarding recommended security levels for blockchain / contract / object address space aligns with our claims.

  • Also, based on discussions with security leads in the space, the general perception is that anything below $1B should be considered a threat, just to have room for cryptographic or CPU advancements. There was also a comment around the fact that even the “thought” of this being possible with a medium-scale adversary might disincentivize some stablecoin, bridges, or asset providers from launching in the first place.

  • This analysis does not only apply to BFT applications. 160-bit commitment schemes inside smart contracts should also switch to 256-bit. For example, the recent audit report to an over-collateralized sealed bid Ethereum auction contract here shows that originally we thought it’s a billion dollar cost attack, but this analysis proves it’s in the few  millions range, especially because we don’t deal with elliptic curve cryptography and it is just hash collisions on auction bidding prices.

  • Note that adversaries can indirectly benefit from such attacks by purposefully causing reputation damage to the network and subsequently profiting from the resulting price movements.

  • Many months prior to publishing this blog, we ethically informed other products and firms (including outside Sui and the blockchain industry at large) that we believed might be affected by such attacks in the future, should they stick with 160-bit commitments.

Read Next