Vitalik: What is the real difference between Ethereum L2 and execution sharding
Original title: How do layer 2s really differ from execution sharding?
Original author: Vitalik Buterin, founder of Ethereum
Original translation: 0xjs, Golden Finance
Two and a half years ago, in my post "Endgame", I pointed out that, from a technical perspective, the different paths for the future development of blockchains look very similar. In both cases, there is a very large number of transactions on chain, and processing them requires (i) a lot of computation and (ii) a lot of data bandwidth. Regular Ethereum nodes, such as the 2 TB reth archive node running on the laptop I am using to write this article, are not powerful enough to verify such a huge amount of data and computation directly, even with heroic software engineering work and Verkle trees.
In both the "L1 sharding" and rollup-centric worlds, ZK-SNARKs are used to verify computation, and DAS (data availability sampling) is used to verify data availability. The DAS is the same in both cases. The ZK-SNARK technology is also the same, except that in one case it is smart contract code and in the other it is an enshrined feature of the protocol. From a technical perspective, Ethereum is doing sharding, and rollups are the shards.
This leads to a natural question: what is the difference between these two worlds? One answer is that the consequences of code errors are different: in the rollup world, tokens are lost, while in the sharded-chain world, there is a consensus failure. But I expect that as protocols solidify and formal verification techniques improve, the importance of such bugs will decrease. So what long-term differences between these two visions can we expect?
Diversity of Execution Environments
An idea that we briefly tried on Ethereum in 2019 was execution environments. Essentially, Ethereum would have different "zones" that could have different rules for how accounts work (including completely different approaches like UTXOs), how virtual machines work, and other features. This would enable approaches that would be difficult to achieve if Ethereum did all the work on its own.
Ultimately, we abandoned some of the more ambitious plans and kept only the EVM. However, Ethereum's L2s (including rollups, validiums, and Plasmas) arguably ended up serving the role of execution environments. Today, we often focus on EVM-equivalent L2s, but this ignores the diversity of many alternative approaches:
· Arbitrum Stylus, which adds a second VM based on WASM in addition to the EVM.
· Fuel, which uses a UTXO architecture similar to Bitcoin's (but more feature-complete).
· Aztec, which introduces a new language and programming paradigm designed around privacy-preserving smart contracts with ZK-SNARKs.
Fuel’s UTXO Architecture
We could try to turn the EVM into a super VM that covers all possible paradigms, but this would result in a much less effective implementation of each concept than if platforms like these focused on their own areas.
Security Tradeoffs: Scale and Speed
Ethereum L1 provides very strong security guarantees. If some data is in a block that is confirmed on L1, the entire consensus (including, in extreme cases, social consensus) works to ensure that the data cannot be edited in a way that violates the rules of the application, that any execution triggered by the data cannot be reverted, and that the data will remain accessible. To achieve these guarantees, Ethereum L1 is willing to accept high costs. At the time of writing, transaction fees are relatively low: layer 2 networks charge less than a penny per transaction, and even a basic ETH transfer on L1 costs less than $1. If technology advances quickly enough that the growth in available block space keeps up with demand, these costs may remain low, but they may not. And even $0.01 per transaction is too high for many non-financial applications, such as social media or games.
But social media and games do not require the same security model as L1. It is acceptable if someone can pay a million dollars to revert the record of a chess game they lost, or to make one of your tweets look like it was posted three days after it actually was. These applications should therefore not have to pay the same security costs. An L2-centric approach makes this possible by supporting a spectrum of data availability methods, from rollups to Plasma to validiums.
Different use cases, different L2 types
Another security tradeoff arises around passing assets from one L2 to another. I expect that in 5-10 years, all rollups will be ZK rollups, and super-efficient proof systems like Binius and Circle STARKs with lookups, plus proof aggregation layers, will enable L2s to provide finalized state roots in every slot. Currently, however, we have a complicated mix of optimistic rollups and ZK rollups, with various proof time windows. If we had implemented execution sharding in 2021, the security model for keeping the shards honest would have been optimistic rollups, not ZK, so the L1 would have had to manage the system's complex fraud-proof logic and impose a one-week waiting period for moving assets from shard to shard. But I think this problem is ultimately temporary.
A third, and again more lasting, dimension of the security tradeoff is transaction speed. Ethereum produces a block every 12 seconds and is reluctant to go faster because that would over-centralize the network. However, many L2s are exploring block times of a few hundred milliseconds. 12 seconds isn’t too bad: on average, a user submitting a transaction has to wait about 6-7 seconds for it to be included in a block (not just 6 seconds, because there’s a chance that the next block won’t include it). That’s about as long as I have to wait when I pay with my credit card. But many applications need more speed, and L2s provide that.
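As a rough check of the 6-7 second figure above, here is a small sketch in Python. It assumes a simple model that is not spelled out in the text: the transaction arrives at a uniformly random moment within a 12-second slot, and each subsequent block independently includes it with a fixed probability. The function name and the sample probabilities are illustrative assumptions, not measurements.

```python
SLOT_SECONDS = 12.0  # Ethereum L1 block interval, as stated in the text


def expected_wait(inclusion_prob: float, slot_seconds: float = SLOT_SECONDS) -> float:
    """Expected seconds until inclusion under the simplified model.

    Assumptions (illustrative only): the transaction arrives uniformly at
    random within a slot, and every block includes it independently with
    probability `inclusion_prob`.
    """
    if not 0.0 < inclusion_prob <= 1.0:
        raise ValueError("inclusion_prob must be in (0, 1]")
    # Mean time until the next block when arriving at a random moment.
    wait_for_next_block = slot_seconds / 2
    # Mean number of blocks that skip the transaction before one includes it
    # (mean of a geometric distribution counting failures).
    expected_missed_blocks = (1.0 - inclusion_prob) / inclusion_prob
    return wait_for_next_block + slot_seconds * expected_missed_blocks


if __name__ == "__main__":
    for p in (1.0, 0.95, 0.9):
        print(f"inclusion probability {p:.2f}: about {expected_wait(p):.1f} s")
```

With certain inclusion the model gives exactly half a slot; any chance of being skipped pushes the average above 6 seconds, which is the point of the parenthetical remark.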
To provide that speed, L2s rely on pre-confirmation mechanisms: the L2’s own validators digitally sign a promise to include a transaction by a certain time, and they can be penalized if the transaction is not included. A mechanism called StakeSure generalizes this further.
L2 Pre-Confirmations
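The pre-confirmation idea described above (a signed promise to include a transaction by a certain time, with a penalty if the promise is broken) can be sketched roughly as follows. This is a toy model, not any L2's actual protocol: all names are hypothetical, and an HMAC from the Python standard library stands in for a real digital signature, which would use a public-key scheme so that anyone can verify the promise.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PreConfirmation:
    """A validator's promise to include `tx_hash` no later than `deadline_block`."""
    tx_hash: str
    deadline_block: int
    tag: bytes  # stand-in for the validator's signature (assumption: HMAC)


def _message(tx_hash: str, deadline_block: int) -> bytes:
    return f"{tx_hash}:{deadline_block}".encode()


def sign_promise(secret: bytes, tx_hash: str, deadline_block: int) -> PreConfirmation:
    tag = hmac.new(secret, _message(tx_hash, deadline_block), hashlib.sha256).digest()
    return PreConfirmation(tx_hash, deadline_block, tag)


def is_authentic(secret: bytes, promise: PreConfirmation) -> bool:
    expected = hmac.new(
        secret, _message(promise.tx_hash, promise.deadline_block), hashlib.sha256
    ).digest()
    return hmac.compare_digest(expected, promise.tag)


def is_slashable(
    secret: bytes,
    promise: PreConfirmation,
    included_at: Mapping[str, int],
    current_block: int,
) -> bool:
    """True if the promise is genuine and was broken.

    `included_at` maps transaction hashes to the block that included them,
    covering all blocks up to and including `current_block`.
    """
    if not is_authentic(secret, promise):
        return False  # forged promises cannot be used to penalize anyone
    if current_block < promise.deadline_block:
        return False  # the validator still has time to include the transaction
    block = included_at.get(promise.tx_hash)
    return block is None or block > promise.deadline_block
```

The user can treat the signed promise as a fast confirmation, because the validator loses money if the transaction is missing once the deadline block has passed.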
We could try to do all of this on L1. L1 could combine “fast pre-confirmation” and “slow finalization” systems. It could combine shards with different security levels. However, this would add a lot of complexity to the protocol. Additionally, doing it all on L1 runs the risk of overloading consensus, because many approaches to higher scale or faster throughput have higher centralization risks or require stronger forms of “governance”, and if done on L1, the effects of these stronger requirements will ripple through to other parts of the protocol. Ethereum can largely avoid these risks by providing these tradeoffs through L2.
Organizational and Cultural Advantages of L2
Imagine a country that is split in half, with one half becoming capitalist and the other becoming a highly government-dominated society (unlike when this happens in reality, assume that in this thought experiment it is not the result of any kind of traumatic war; a border just magically appeared one day and that’s it). In the capitalist part, restaurants are run by various combinations of decentralized ownership, chains, and franchises. In the government-dominated part, they are all branches of government, like the police department. On day one, not much will change. People largely follow existing habits, and what works and what doesn’t work depends on technical realities, like labor skills and infrastructure. A year later, however, you would expect to see big changes, as different incentive and control structures lead to big changes in behavior, affecting who comes, who stays, who leaves, what is built, what is maintained, and what is abandoned.
Industrial organization theory covers many of these distinctions: it talks not only about the difference between a government-run economy and a capitalist economy, but also about the difference between an economy dominated by large franchises and one where, for example, each supermarket is run by an independent entrepreneur. I think the difference between an L1-centric ecosystem and an L2-centric ecosystem is similar.
The "core people run everything" architecture is very problematic
The key benefit of Ethereum as a layer 2-centric ecosystem can be stated as follows:
Because Ethereum is an L2-centric ecosystem, you are free to independently build a sub-ecosystem that is yours, with its own unique characteristics, while still being part of the greater Ethereum.
If you are just building an Ethereum client, you are part of the greater Ethereum, and while you have some room for creativity, it is much less than what an L2 has. If you are building a completely independent chain, you have the most room for creativity, but you lose the benefits of shared security and shared network effects. L2s form a happy middle ground.
L2s do not only create a technical opportunity to experiment with new execution environments and security tradeoffs for scale, flexibility, and speed: they also create incentives for developers to build and maintain them, and for communities to form around them and support them.
The fact that each L2 is isolated means that deploying new approaches is permissionless: there is no need to convince all the core developers that your new approach is "safe" for the rest of the chain. If your L2 fails, it's on you. Anyone can work on completely weird ideas (such as Intmax's approach to Plasma), and even if they are completely ignored by Ethereum core developers, they can continue to build and eventually deploy. This is not the case with L1 features and precompiles, and even in Ethereum, what succeeds and what fails in L1 development often depends on politics to a greater degree than we would like. Regardless of what can be built in theory, the different incentives created by L1-centric and L2-centric ecosystems will ultimately greatly affect what is actually built, at what quality, and in what order.
Challenges facing Ethereum’s L2-centric ecosystem
1+2 layer architectures can also have problems
A key challenge with this L2-centric approach is coordination, which L1-centric ecosystems face to a much smaller degree. In other words, even as Ethereum branches out, the challenge is to keep it all feeling like “Ethereum”, with the network effects of being Ethereum rather than N independent chains. The situation today is not ideal in many ways:
· Moving tokens from one L2 to another often requires a centralized bridge platform and is complicated for the average user. If you have tokens on Optimism, you can’t just paste someone else’s Arbitrum address into your wallet and send funds.
· Cross-chain smart contract wallet support is poor, both for personal smart contract wallets and organizational wallets (including DAOs). If you change your keys on one L2, you also need to change your keys on every other L2.
· Decentralized validation infrastructure is often lacking. Ethereum is finally starting to have excellent light clients like Helios. However, that is of little use if all the activity is happening on L2s that each require their own centralized RPC. In principle, it’s not hard to make light clients for L2s once you have the Ethereum header chain; in practice, too little emphasis is placed on it.
There are efforts to improve all three aspects. For cross-chain token swaps, the ERC-7683 standard is an emerging option that, unlike existing “centralized bridges,” does not have any fixed central operator, token, or governance. For cross-chain accounts, the approach taken by most wallets is to use cross-chain replayable messages to update keys in the short term and keystore rollups in the long term. Light clients for L2s are starting to appear, such as Beerus for Starknet. In addition, recent improvements in user experience through next-generation wallets have solved many of the more basic problems, such as removing the need for users to manually switch to the right network to access dapps.
Rabby displays a comprehensive view of asset balances across multiple chains. In the dark ages not so long ago, wallets didn’t do this!
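To illustrate why a shared keystore helps with the cross-chain account problem described above, here is a deliberately simplified Python sketch. It is an assumption-laden toy, not a description of any real keystore rollup: the class names are hypothetical, keys are plain strings, and a shared in-memory object stands in for whatever verifiable mechanism each L2 would actually use to read the keystore's state.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Keystore:
    """Hypothetical single location holding each account's current key."""
    keys: Dict[str, str] = field(default_factory=dict)

    def rotate(self, account: str, new_key: str) -> None:
        # One update here is meant to take effect for every chain at once.
        self.keys[account] = new_key


@dataclass
class L2Wallet:
    """A wallet on one L2 that reads its key from the keystore
    instead of storing its own copy."""
    chain: str
    account: str
    keystore: Keystore

    def authorizes(self, signing_key: str) -> bool:
        return self.keystore.keys.get(self.account) == signing_key


if __name__ == "__main__":
    keystore = Keystore()
    keystore.rotate("alice", "key-v1")
    wallets = [L2Wallet(c, "alice", keystore) for c in ("optimism", "arbitrum")]

    keystore.rotate("alice", "key-v2")  # a single key change
    for w in wallets:
        print(w.chain, w.authorizes("key-v2"), w.authorizes("key-v1"))
```

The design point is that the key lives in one place and each wallet only holds a reference to it, so a key change does not have to be repeated on every L2.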
But it’s important to recognize that an L2-centric ecosystem does swim upstream somewhat when it comes to coordination. Individual L2s have no natural economic incentive to build coordination infrastructure: small ones don’t, because they would see only a small share of the benefit of their contribution, and large ones don’t, because they would benefit as much or more from strengthening their own local network effects. If each L2 optimizes its own part in isolation, and no one considers how each part fits into the greater whole, we get an urban dystopia like the one in the picture a few paragraphs above.
I don’t claim to have a magical perfect solution to this problem. The best I can suggest is that the ecosystem needs to more fully recognize that cross-L2 infrastructure is a form of Ethereum infrastructure that should be valued and funded just like L1 clients, development tools, and programming languages. We have the Protocol Guild; maybe we need an Infrastructure Guild as well.
Conclusion
“L2” and “sharding” are often described as two opposing blockchain scaling strategies. But when you look at the underlying technology, there is a puzzle: the actual underlying scaling methods are exactly the same. You have some kind of data sharding. You have fraud provers or ZK-SNARK provers. You have solutions for cross-{rollup, shard} communication. The main difference is: who is responsible for building and updating these parts, and how much autonomy do they have?
From a technical perspective, an ecosystem centered around L2 is sharding, and you can create your own shards with your own rules. This kind of sharding is powerful and can inspire creativity and autonomous innovation. But it also faces key challenges, particularly around coordination. For an L2-centric ecosystem like Ethereum to succeed, it needs to understand these challenges and confront them head-on to get as many benefits of an L1-centric ecosystem as possible and get as close as possible to having the best of both worlds.