The following article originally appeared in Consensus Magazine, distributed exclusively to attendees of CoinDesk’s Consensus 2019 event.
Four autonomous vehicles arrive at an intersection. Who gets to go first?
Yeah, this sounds like the beginning of a bad joke, but the problem is very real, and surprisingly difficult. The solution lies in decentralized computing, an emerging field that will likely involve blockchains, along with a host of other technologies. To understand the problems it tries to solve, let’s dig deeper into this suburban impasse…
If we assume there’s no static infrastructure – for example, a traffic beacon – to arbitrate the intersection, the vehicles will have to negotiate a solution using only their in-vehicle computation capacity. What would the computer’s instructions be? Well, there are some general societal rules to go by: no one wants an accident; they all want to get through the intersection as quickly as possible; and there is some notion of “fairness” (“really, I got here first so I get to go first!”).
All this sounds more or less doable except that the vehicles might be augmented with a “little red button” that cheats the negotiation in order to get through first. (Seriously, if you were late for work, you’d push the button, right?).
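To make the problem concrete, here is a deliberately naive sketch (all names and numbers are illustrative, not any real vehicle-to-vehicle protocol): each vehicle broadcasts its claimed arrival time, and every participant sorts the same list with a deterministic tie-break so that all honest vehicles compute the identical ordering. The "little red button" is exactly a lie in that broadcast.

```python
# Naive leaderless intersection negotiation: everyone sorts the same
# claims with a deterministic tie-break, so honest participants agree.
# This sketch has no defense against a dishonest claim.

def negotiate_order(claims):
    """claims: list of (vehicle_id, claimed_arrival_time) tuples."""
    # Sort by claimed arrival time, breaking ties by vehicle id so
    # every honest participant computes the identical ordering.
    return [vid for vid, _ in sorted(claims, key=lambda c: (c[1], c[0]))]

claims = [("north", 3.1), ("east", 2.4), ("south", 3.1), ("west", 5.0)]
print(negotiate_order(claims))   # ['east', 'north', 'south', 'west']

# The "little red button": under-report your arrival time, jump the queue.
claims.append(("cheater", 0.0))
print(negotiate_order(claims)[0])  # 'cheater'
```

The sketch shows why shared rules alone aren't enough: the protocol is only as honest as its inputs.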
From a systems architecture perspective, though, there are big problems with this scenario. First, there’s no central authority deciding which car goes in which order. Second, the only infrastructure available for computation resides in the cars; that is, resources must be dynamically allocated for the computation. Third, every driver is motivated by objectives that will drive their vehicle’s computation, and while some objectives – such as getting through the intersection without an accident – are shared by all, some goals will be unique to the individual.
(I’m late, so let me through first!).
It is this last characteristic that makes decentralized computing so challenging.
Applications and Challenges
Cryptocurrencies are the best-established applications of decentralized computing. But there are many others. In most cases, blockchains, which function as a decentralized, consensus-based alternative to trusting a centralized authority, will likely play a key role. Yet blockchains are useless on their own. For decentralized computing to work, blockchains must intersect with other solutions.
One much-discussed decentralized computing application is provenance in supply chains. Walmart recently announced that all its food suppliers must upload their data to a blockchain-based system so that users can monitor the supply chain for contaminated food. Similar ideas are being applied to tracking conflict-free minerals.
In these provenance cases, the blockchain is a critical component, but by no means the only one. As I discussed in last year’s Consensus Magazine, while a blockchain can provide persistent and transparent transactional management, storage and updates of data, the ability to track provenance also requires efficient, high-integrity data entry. The quality of the blockchain’s monitoring is only as good as the data collected. Without proper oversight, data entry (for example, through sensor and telemetry data) may be manipulated by a malicious participant to misrepresent provenance.
Supply chain applications also demonstrate the importance of confidentiality and privacy of data because, at the core, they are about cross-organizational access to shared data. Queries on the data like “where did this lettuce come from?” are relatively uncontroversial and, in most circumstances, consistent with the shared objectives of the participants. Others are more conflicted, however, and it’s those that expose the difficulties in managing confidentiality in decentralized systems.
Can a supplier prove, for example, that it can meet delivery requirements without exposing otherwise confidential details of its internal operations? Herein lies a core problem for decentralized computing: how to perform network-wide computation on confidential data without exposing the details of that confidential information to the group.
Consider the challenges with genomic data. With researchers searching for cures to diseases, there’s tremendous societal and potentially business value in performing computations across the broadest possible set of genomic data sources, sources that are often created, managed or owned by different organizations. However, each database contains data that’s both highly valuable as intellectual property and restricted by regulations protecting the privacy of the individual contributors of that genomic data.
Or we could just return to our autonomous vehicles, which are probably still sitting at the intersection. (“You go first.” “No, you go first.” “No, YOU go first!”). A recent requirement for operating an AV is that it must have a “black box” that records telemetry data that can be used to analyze past behavior – for example, to determine the cause of an accident. This is basically the same role that a black box plays in an airplane – with one key difference: an airplane is largely by itself in the sky, whereas an autonomous vehicle is continuously interacting with other (potentially autonomous) vehicles. The black box in one vehicle provides a single historical perspective.
It does not, however, provide insight into the actions or decisions of the other autonomous vehicles on the road. All of this is complicated by adversarial machine learning, which could create a new attack vector for autonomous vehicles. How can a computer relying on a simple, local record of telemetry data differentiate between an internal error made by the autonomous vehicle, an external attack on the vehicle’s telemetry, or the actions of a malicious participant in the coordination protocol?
Ideally, to provide an attack-resistant history of the vehicle’s behavior, the black box would confirm the vehicle’s telemetry data with that of nearby vehicles as well as information about interactions with those vehicles – a full, system-wide snapshot, in other words. And that just returns us to the problem of doing computation with confidential information from untrusted sources.
Treating the Blockchain as the Trust Anchor
The Internet of Things will demand decentralized applications. But building non-trivial versions of them is hard. Even a relatively straightforward problem like deterministic fair exchange between two parties is known to be impossible without a trusted third party to arbitrate the interaction. Here, a blockchain provides great value because it, in effect, becomes a technology-based, trusted third party that can arbitrate multi-party protocols. Still, we must address many other challenges before we can realize general purpose decentralized computing.
In part, getting there requires a transition from “blockchain is the application” to “blockchain is the trust anchor.”
It’s a transition that we’re already seeing in bitcoin. For example, the Lightning Network moves the management of bitcoin transactions into an off-chain channel created by a pair of participants, who settle their final balances to the blockchain only when the channel closes or an off-chain dispute arises.
Thus the blockchain functions as the trust anchor while the Lightning Network is the decentralized application.
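The pattern can be sketched in a few lines of Python. This is an illustrative toy, not the Lightning protocol itself: the `Chain` and `Channel` classes and all amounts are invented for the example, and real channels involve signed commitment transactions and penalty mechanisms this sketch omits.

```python
# Toy illustration of an off-chain channel with an on-chain trust anchor:
# arbitrarily many payments happen off-chain, but only the open and the
# final settlement ever touch the chain.

class Chain:
    """Stand-in for the blockchain: records only opens and settlements."""
    def __init__(self):
        self.log = []

    def record(self, event):
        self.log.append(event)


class Channel:
    def __init__(self, chain, alice_deposit, bob_deposit):
        self.chain = chain
        self.balances = {"alice": alice_deposit, "bob": bob_deposit}
        self.version = 0  # in a dispute, the highest version would win
        chain.record(("open", dict(self.balances)))

    def pay(self, sender, receiver, amount):
        """Off-chain balance update: nothing touches the chain."""
        if self.balances[sender] < amount:
            raise ValueError("insufficient channel balance")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.version += 1

    def close(self):
        """Only the final state is anchored on-chain."""
        self.chain.record(("settle", self.version, dict(self.balances)))
        return self.balances


chain = Chain()
ch = Channel(chain, alice_deposit=10, bob_deposit=5)
ch.pay("alice", "bob", 3)
ch.pay("bob", "alice", 1)
print(ch.close())      # {'alice': 8, 'bob': 7}
print(len(chain.log))  # 2 on-chain events, however many payments occurred
```

The design point is the ratio: the chain sees two events no matter how many payments flow through the channel, which is where the scalability comes from.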
Meanwhile, Thunderella, a consensus algorithm developed at Cornell University, achieves substantial performance improvement by combining an optimistic, high-performance “off-chain” consensus protocol with an asynchronous slow path that uses a traditional blockchain consensus protocol as a fallback trust anchor when optimistic assumptions fail.
In this case, the underlying blockchain’s role is to publish evidence that the optimistic assumptions no longer hold and to reset inconsistent views.
Our own work on Private Data Objects, a Hyperledger Labs project to explore decentralized computing models, splits contract execution into an off-chain component that performs the actual computation, and an on-chain component that simply ensures an ordering of updates that respects dependencies between contract objects. Thus, intuitively, the blockchain serves as a decentralized commit/coordination log for database updates.
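A minimal sketch of that split might look like the following. To be clear, this is not the Private Data Objects code; the `CommitLog` class, the hashing scheme and the lettuce-themed objects are all invented here to illustrate the idea of a chain that orders updates by checking dependencies rather than executing contracts.

```python
# Illustrative sketch: contract execution happens off-chain; the chain
# holds only a commit log of state hashes. An update is accepted only if
# every state it depends on has already been committed, which enforces a
# dependency-respecting ordering without the chain seeing the data itself.

import hashlib


class CommitLog:
    def __init__(self):
        self.latest = {}  # object_id -> committed state hash

    def commit(self, object_id, new_state_hash, dependencies):
        """dependencies: {other_object_id: expected_state_hash}"""
        for dep_id, dep_hash in dependencies.items():
            if self.latest.get(dep_id) != dep_hash:
                return False  # dependency not committed: reject the update
        self.latest[object_id] = new_state_hash
        return True


def state_hash(state):
    # The chain sees only this digest, never the confidential state.
    return hashlib.sha256(repr(sorted(state.items())).encode()).hexdigest()


log = CommitLog()
inventory = {"lettuce": 100}
log.commit("inventory", state_hash(inventory), {})

# An order contract computed off-chain depends on the inventory it read:
order = {"lettuce": 10}
ok = log.commit("order-1", state_hash(order),
                {"inventory": state_hash(inventory)})
print(ok)  # True: the dependency matches the committed state
```

Note that the chain never learns what the inventory or the order contains, only that the updates were applied in a consistent order.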
Confronting the Confidentiality Challenge
How to scale this and protect confidentiality?
Well, one approach requires us to recognize that balancing the tensions between shared and individual objectives is simplified if we broaden our notions of successful computation. Under the principle of differential privacy, we can dial down, or “fuzz,” the required accuracy of a database to preserve confidentiality. For example, we might convert a precise result like “the delivery truck is at 4th and Wilshire” into something less definitive like “the delivery truck will arrive in about 10 minutes.”
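One standard way to do this "fuzzing" is the Laplace mechanism from differential privacy: add noise scaled to the query's sensitivity divided by a privacy budget epsilon. The sketch below is a simplified illustration, with the function names, sensitivity and epsilon values chosen for the example rather than taken from any deployed system.

```python
# Minimal sketch of the Laplace mechanism: report a noisy ETA instead of
# the truck's precise position, trading accuracy for confidentiality.

import random


def laplace_noise(scale):
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)


def private_eta_minutes(true_eta, sensitivity=1.0, epsilon=0.5):
    """Noise scale = sensitivity / epsilon; smaller epsilon, more privacy."""
    return true_eta + laplace_noise(sensitivity / epsilon)


# A noisy value near 10 -- i.e. "the truck will arrive in about 10 minutes."
print(round(private_eta_minutes(10.0), 1))
```

Individual answers wobble, but averages over many queries stay close to the truth, which is exactly the trade the supply-chain participants are making.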
Consider how this concept – where some objectives must be met for success while others are “fuzzed” to complete the computation – might apply to our autonomous vehicles. It might not be necessary that the first vehicle to arrive at the intersection be the first one through the intersection so long as there aren’t accidents and that it can continue to its destination in good time. Fairness and first-come-first-through remain the objectives, but may not be requirements for success.
Other advances in computer science might also help. Privacy-preserving cryptocurrencies using zero-knowledge proofs (ZKP) such as Zcash and Monero demonstrate the power of cryptography to enable computation on privacy-protected sets of data. Still, as of now, developers have struggled to take this computationally complex technology to the kind of scale that’s needed for general purpose decentralized computing.
Here, hardware-based trusted execution environments (TEE) offer a potential alternative. Many modern processors come with technology to perform computation that guarantees the integrity and confidentiality of computation under certain circumstances.
Examples of shipping products include TrustZone from ARM, Software Guard Extensions (SGX) from Intel, and AMD’s Secure Encrypted Virtualization (SEV). For those more inclined to open hardware specifications, the Keystone Project from researchers at UC Berkeley and MIT seeks to develop an open-source TEE for the RISC-V processor.
A hardware-based TEE provides a general-purpose compute environment that addresses performance and flexibility requirements that limit the applicability of ZKP technologies. However, hardware-based trust should not be viewed as a panacea. When it is situated appropriately in the larger security design context, it could be an effective way to execute confidential computation in an optimistic fashion.
In other words, decentralized computing requires a combination of solutions. I hate to tell them, but those cars stuck at the intersection are going to have to multitask.