From a system’s point of view, a blockchain is roughly composed of four modules, arranged hierarchically:
- Consensus engine
- Scripting and virtual execution environment
- Transaction, block, and chain logic
- Peer-to-peer network
Most research on related topics has focused on the consensus engine (to improve transaction throughput) and the scripting language (to make blockchains more useful), leaving the properties of the peer-to-peer network largely unexplored.
The security properties of a blockchain actually hinge on the peer-to-peer network. Research on eclipse attacks against blockchain networks was started by Ethan Heilman, Alison Kendler, Aviv Zohar, and Sharon Goldberg in 2015. Their paper illustrated the first attack against Bitcoin’s peer-to-peer network, which they model as an unstructured random graph, by controlling hundreds of nodes. Another interesting paper, titled “Low-Resource Eclipse Attacks on Ethereum’s Peer-to-Peer Network”, demonstrated the feasibility of launching eclipse attacks against Ethereum’s P2P network layer (which is usually modeled as a structured graph like a Kademlia DHT) using only two machines. Eclipse attacks demonstrate the need for cross-layer design when one builds a complicated P2P system.
A P2P system, such as LimeWire, is a distributed application architecture that partitions tasks or workloads among peers without the need for a central coordinating server or stable hosts. Peers communicate through gossip protocols:
- Peer A comes online and connects to peer B, which is pre-configured;
- Peer A queries peer B, learns of the existence of peers C and D, and connects to these new peers;
- Peer A broadcasts a new message to B, C, and D. Once B, C, and D receive the broadcast, they forward it to E, F, G, H, and so on.
Peer A’s view of the entire network depends solely on B, C, and D. If an attacker controls all of these peers, peer A is essentially isolated from the rest of the network and its view can be manipulated by the attacker. This is what we call an eclipse attack, a simple way to attack P2P systems, including blockchains.
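The gossip steps above, and what changes under an eclipse, can be sketched in a few lines of Python. This is a toy model rather than any real client's protocol: the `gossip` function, the peer names, and the attacker peers `X1` to `X3` are all made up for illustration, and an attacker peer is modeled simply as one that never relays.

```python
from collections import deque

def gossip(neighbors, origin):
    """Flood a message from `origin`; return the set of peers that receive it."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        peer = queue.popleft()
        # Each peer forwards the message to every peer it is connected to.
        for nxt in neighbors.get(peer, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Honest case: A talks to B, C, D, which relay to E, F, G, H.
honest = {
    "A": ["B", "C", "D"],
    "B": ["E", "F"],
    "C": ["G"],
    "D": ["H"],
}

# Eclipsed case: all of A's connections belong to the attacker, who
# drops the message instead of relaying it (modeled as no neighbors).
eclipsed = {
    "A": ["X1", "X2", "X3"],
    "X1": [], "X2": [], "X3": [],
    "B": ["E", "F"],
    "C": ["G"],
    "D": ["H"],
}

reached_honest = gossip(honest, "A")
reached_eclipsed = gossip(eclipsed, "A")
```

In the honest case the message reaches all eight peers; in the eclipsed case it never leaves the attacker's nodes, even though the honest peers are still online.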
The cost of launching an eclipse attack is high when each peer is always listening and talking to every other peer (an ideal state), because the hacker would need to control the entire network in order to hack the P2P system. In reality, due to practical considerations, each peer only exchanges information with a small group of peers, so the cost of hacking the system is not as high. The rationale behind Ethereum’s choice of 13 outgoing connections, instead of 8 as in Bitcoin, is to make the overall security of the Ethereum network more robust.
Nevertheless, there are still pitfalls in the design of the Ethereum P2P network that allow a hacker to launch an eclipse attack.
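As a rough illustration of why more outgoing connections help, suppose that each outgoing connection independently lands on an attacker-controlled peer with probability p. This is a simplifying assumption made here for the sake of the example, not a model taken from either paper. Under it, the chance that all n outgoing connections are malicious is p to the power n; the function name below is hypothetical.

```python
def prob_all_malicious(p, n):
    """Probability that all n independent outgoing connections are malicious,
    given each one is malicious with probability p (toy model)."""
    return p ** n

# Compare 8 outgoing connections with 13 for the same attacker share p.
for n in (8, 13):
    print(n, prob_all_malicious(0.9, n))
```

Under this toy model, going from 8 to 13 connections always lowers the attacker's chance of owning every outgoing slot, although the gain shrinks as p approaches 1.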
Design flaw #1: Peer’s Identity
Peers in the Ethereum network are identified by their node IDs, which are 64-byte ECDSA public keys that anyone can generate at will. Multiple Ethereum nodes, each with a different node ID, can be run on a single machine that has a single IP address. Due to this weak definition of a “peer”, the hacker can easily generate a large set of Ethereum peers, host them on a single machine with a single IP address, and coordinate them strategically to block any incoming and outgoing communication to and from the victim node. This is the main vector exploited in the paper by Yuval Marcus et al. To fix this design flaw, the identity of a peer should be bound to some physical resource, e.g., an IP address, so that peers cannot be provisioned without limit.
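One way to picture the suggested fix is a peer table that refuses to track more than a handful of node IDs per IP address. The sketch below is only an illustration: `PeerTable`, `new_node_id`, and the cap of 2 are hypothetical, and 64 random bytes stand in for a real ECDSA public key.

```python
import secrets
from collections import Counter

MAX_IDS_PER_IP = 2  # hypothetical cap, chosen only for this example

def new_node_id():
    # Stand-in: 64 random bytes instead of a real ECDSA public key.
    return secrets.token_bytes(64)

class PeerTable:
    """Tracks node IDs but binds them to IP addresses."""

    def __init__(self, max_ids_per_ip=MAX_IDS_PER_IP):
        self.max_ids_per_ip = max_ids_per_ip
        self.peers = {}          # node_id -> ip
        self.per_ip = Counter()  # ip -> number of node IDs accepted

    def add(self, node_id, ip):
        """Return True if the peer is (or already was) in the table."""
        if node_id in self.peers:
            return True
        if self.per_ip[ip] >= self.max_ids_per_ip:
            return False  # too many identities behind this address
        self.peers[node_id] = ip
        self.per_ip[ip] += 1
        return True

# An attacker minting 1000 node IDs behind one address gets only 2 slots.
table = PeerTable()
accepted = sum(table.add(new_node_id(), "203.0.113.7") for _ in range(1000))
```

Generating identities stays free for the attacker, but each extra slot in the victim's table now costs an extra IP address.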
Design flaw #2: Peer Selection Strategy
Instead of using a random peer selection strategy, Ethereum uses Kademlia for selecting new peers. In a Kademlia network, each item of content is associated with a key and is stored only at those peers whose node ID is “close” to its associated key. “Closeness” is defined by the XOR metric: the bitwise XOR of the key and the node ID, interpreted as an integer. Each Kademlia node has b distinct buckets, where bucket i stores information about up to k peers at distance i. To look up the content associated with key t, a Kademlia node looks in its buckets to find the node IDs that are “closest” to t, and asks them to either (a) return the content associated with t or (b) return some node IDs that are even “closer” to t. This process is repeated until the key is found. In Ethereum’s case, one node:
- Chooses a random string t
- Looks in its buckets to find k = 16 node IDs closest to the string t
- Asks each of these nodes to return k node IDs from its buckets that help the original node get “closer” to the string t
The result of this process is that up to k × k newly discovered node IDs are collected.
Image Credit: Kurtis Jolly
From these k × k newly discovered node IDs, the k nodes closest to the string t are then asked to return k nodes that are even closer to t. This process continues iteratively until no new nodes are found. Because a node ID alone determines which of the victim’s buckets it falls into, the attacker can craft a set of node IDs designed to fill all buckets in the victim node’s datastore of peers. This vulnerability needs to be fixed with a good peer selection strategy. Ideally, the selected peers should be spread out across the network to minimize this risk.
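To make the bucket-filling idea concrete, here is a minimal sketch of XOR closeness, bucket assignment, and brute-force crafting of node IDs. It makes several assumptions that are not from the source: SHA-256 stands in for whatever hash a real client applies to node IDs, the bucket index is taken as the position of the highest differing bit, and all function names are hypothetical.

```python
import hashlib
import secrets

def hashed(node_id: bytes) -> int:
    # Assumption: SHA-256 as a stand-in hash, giving 256-bit values.
    return int.from_bytes(hashlib.sha256(node_id).digest(), "big")

def xor_distance(a: bytes, b: bytes) -> int:
    """XOR of the two hashed IDs, interpreted as an integer."""
    return hashed(a) ^ hashed(b)

def bucket_index(a: bytes, b: bytes) -> int:
    """Position of the highest differing bit (1..256); 0 if identical."""
    return xor_distance(a, b).bit_length()

def closest(candidates, target, k=16):
    """The k candidates with the smallest XOR distance to `target`."""
    return sorted(candidates, key=lambda n: xor_distance(n, target))[:k]

def craft_ids_for_bucket(victim, index, count):
    """Generate random node IDs until `count` of them land in bucket `index`."""
    found = []
    while len(found) < count:
        candidate = secrets.token_bytes(64)
        if bucket_index(victim, candidate) == index:
            found.append(candidate)
    return found

victim = secrets.token_bytes(64)
crafted = craft_ids_for_bucket(victim, index=254, count=3)
```

In this model, about half of all random IDs fall into bucket 256, a quarter into bucket 255, and so on, so the far buckets are cheap to fill by trial and error, and nothing ties the crafted IDs to distinct machines.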
Design flaw #3: Inbound vs. Outbound Connections
From a peer’s point of view, there are two types of connections: inbound (initiated by another peer) and outbound (initiated by this peer). Ethereum puts a limit, known as maxpeers (25 by default), on the total number of inbound and outbound connections but does not set a separate limit for each type. A victim node could therefore hit the maxpeers limit with 100% inbound connections coming from the hacker. Placing a cap on the number of incoming connections, which forces peers to keep a mix of incoming and outgoing connections, would be one way to fix this problem.
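A minimal sketch of the proposed cap, assuming a hypothetical `ConnectionManager` class: the total limit of 25 comes from the text, while the inbound cap of 12 is an arbitrary value picked for illustration.

```python
class ConnectionManager:
    """Enforces a total peer limit plus a separate cap on inbound peers."""

    def __init__(self, max_peers=25, max_inbound=12):
        self.max_peers = max_peers
        self.max_inbound = max_inbound  # hypothetical cap, not a real default
        self.inbound = set()
        self.outbound = set()

    def try_accept(self, peer, inbound):
        """Register a connection; return False if a limit would be exceeded."""
        if len(self.inbound) + len(self.outbound) >= self.max_peers:
            return False
        if inbound:
            if len(self.inbound) >= self.max_inbound:
                return False
            self.inbound.add(peer)
        else:
            self.outbound.add(peer)
        return True

manager = ConnectionManager()

# The attacker floods the victim with inbound connections...
for i in range(100):
    manager.try_accept(f"attacker-{i}", inbound=True)

# ...but slots remain for connections the victim initiates itself.
honest_ok = manager.try_accept("honest-peer", inbound=False)
```

With the cap in place, the flood occupies at most 12 of the 25 slots, so the victim can still dial out to peers of its own choosing.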
Design flaw #4: Reboot and Erase
In Ethereum, a peer stores information about other nodes in two data structures:
- Long-term data storage, which lives on disk and persists across reboots
- Short-term data storage, which contains Kademlia-like buckets that are always empty when the peer reboots
A smart hacker can leverage design flaws #3 and #4 to fill up the short-term data storage immediately after a peer reboots, thus quickly replacing all of the victim’s connections with malicious inbound connections. This problem can be fixed by sampling incoming connections to fill the short-term storage, and by keeping a few peers in the short-term data storage across a reboot rather than wiping it clean.
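The second half of that fix, keeping a few short-term peers across a reboot, could look roughly like the following. The function names, the dictionary layout of the buckets, and the choice of two peers per bucket are all assumptions made for this sketch.

```python
import random

def snapshot_for_reboot(buckets, keep_per_bucket=2, rng=random):
    """Pick a few peers from each non-empty bucket to survive the reboot."""
    kept = {}
    for index, peers in buckets.items():
        if peers:
            # Sample at random so an attacker cannot predict which survive.
            kept[index] = rng.sample(peers, min(keep_per_bucket, len(peers)))
    return kept

def restore_after_reboot(kept):
    """Start with the retained peers instead of empty buckets."""
    return {index: list(peers) for index, peers in kept.items()}

buckets = {254: ["a", "b", "c"], 255: ["d"], 256: []}
kept = snapshot_for_reboot(buckets)
restored = restore_after_reboot(kept)
```

After the reboot the node already knows a few peers it chose earlier, so an attacker racing to fill the empty buckets no longer starts from a clean slate.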
Eclipse attacks can be used to break the consensus of the network and lead to:
- Double spending;
- Attacks against second-layer protocols such as the Lightning Network, e.g., an attacker can obtain products without paying by tricking the victim into thinking a payment channel is still open while the non-eclipsed part of the network sees that the channel is closed;
- Attacks against smart contracts, which may be exploitable if users see inconsistent views of the blockchain.
In addition, seeding more aggressively helps mitigate eclipse attacks by increasing the probability that the victim establishes outgoing connections with legitimate peers. Here are some ways to optimize seeding:
- Frequency: seeding could be executed immediately after a peer reboots, or periodically, e.g., every 30 minutes.
- Coverage: seeding could be executed against more nodes (not limited to the six bootstrap nodes in the Ethereum network). Previously used peers can also be reused after a reboot.
- Selectivity: peers could be selected more rigorously during seeding. For example, one may prefer peers with distinct network characteristics such as latency, traceroute path, IP range, and ISP.
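As an example of the selectivity idea, the sketch below picks seed peers so that no two share the same IP range. The `diverse_seeds` function and the choice of a /16 prefix as the "range" are assumptions for illustration; it handles IPv4 addresses only.

```python
import ipaddress

def diverse_seeds(candidates, want, prefix_len=16):
    """Pick up to `want` IPv4 peers, at most one per /prefix_len network."""
    chosen, used = [], set()
    for ip in candidates:
        # strict=False lets us derive the enclosing network from a host address.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        if net not in used:
            used.add(net)
            chosen.append(ip)
        if len(chosen) == want:
            break
    return chosen

candidates = [
    "203.0.113.7",
    "203.0.113.8",   # same /16 as the first entry, skipped
    "198.51.100.1",
    "198.51.7.9",    # same /16 as the previous entry, skipped
    "192.0.2.5",
]
seeds = diverse_seeds(candidates, want=3)
```

The same pattern extends to other characteristics listed above, for instance by grouping candidates by ISP or latency band instead of by prefix.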
How does this impact IoTeX?
In designing the IoTeX blockchains, we are especially cognizant of peer identity, peer selection strategy, connection limits, and seeding strategies. In addition, we are actively exploring the feasibility of diversifying the selected peers by leveraging network statistics.