Privacy for Blockchains: An Introduction

We can only realize the full potential of blockchains if they can use private or sensitive data. In this introduction, we examine some possible solutions.

From the CMC editorial desk: Blockchains are known for their ability to facilitate transparency and trust among peers. However, what happens when you need the trust aspect, but have to keep the data private? We called on our friends over at Enigma to explain how to think about the intersection between blockchain and privacy.

Blockchains are an incredible innovation with the power to reshape our world. The hope is that blockchain technology will allow us to decentralize systems and applications and restore power to individuals. The next few years will be filled with opportunities to innovate and build. However, in order for us to realize the full potential of blockchains, it is not only important to understand what they can enable – we must also understand where they fall short.

In this article, we’ll examine one of the largest weaknesses of blockchains, one created by design: privacy. At Enigma, we believe solving privacy will finally allow decentralized technologies to be used by millions of people across the world. We will look at some of the main approaches that are being taken to address privacy, followed by some of the largest opportunities that exist if we succeed.

Blockchains: Public By Design

At their inception, blockchains were designed to be public. A public blockchain is a distributed ledger where entries are auditable and permissionless. Anyone can attempt to record an entry, and anyone can read the entries themselves. Blockchain enabled the inception of Bitcoin, a new type of money where the total supply was known and you could trace the flow of value. Bitcoin is pseudonymous: addresses are publicly linked to value and to their full transaction history, but not directly to real-world identities. Once an address is tied to a person, however, that history is exposed, so Bitcoin isn’t really anonymous or private – nor was it meant to be.

A few years ago, after the successful application of Bitcoin to decentralizing money, an old and important idea resurfaced. That idea, today more commonly known as “smart contracts,” posited that if we can solve the problem of trust in the banking system through code, we could do the same for any other type of application. After all, why not? And since this would be a more general solution – one that worked for all computations, not just transactions – wouldn’t it be far more valuable to create?

There was a catch, however. Blockchain’s innovation was solving for consensus, trust, and verifiability – but this came at a cost. For blockchains, the major tradeoffs were privacy and scalability. As every transaction is public, you would not want to keep any kind of private or sensitive data on the blockchain – anything from votes to personally identifiable information to trade secrets. And blockchains themselves are difficult to scale, again by design: there is a limited number of transactions or computations you would want to track on a blockchain before it becomes too expensive and slow.

From the perspective of privacy, blockchains are worse than anything that came before them. Instead of trusting your data with a single organization (such as Facebook, Google, or your bank), you now have to trust everyone. For all intents and purposes, data on the blockchain becomes public.

Again, this is not an unintended weakness. Blockchains are bad at privacy by design so that they can be good at what they’re designed for. But this means that blockchains by themselves can only solve so many real-world problems.

While some people (such as Bitcoin maximalists) believe that blockchain can only be good for one thing, many others are working hard to build supplementary technologies that can work alongside blockchains. These technologies will be of critical importance if blockchains are to achieve their promise of decentralizing the world beyond just auditable payments. And if they succeed, we may one day remember these privacy technologies as the most powerful and valuable innovations of the 21st century – not blockchain.

The Necessity of Privacy

Because blockchain’s biggest weakness is privacy, some of the more valuable technologies that work with blockchains will solve for privacy. It is important to clarify what this means, especially since there has been a lot of talk about “privacy coins.” For example, projects like Monero and Zcash are focusing on privacy for transactions, protecting the identity of transaction participants as well as the content of the transactions.

When we talk about the potential of blockchain to create a more decentralized future, however, we don’t just mean providing anonymity. We are talking specifically about data privacy and allowing blockchains to work with private and sensitive data. Decentralized applications built on blockchains don’t really require user anonymity, but they very often require private data.

For example, imagine you want to create a decentralized social network (maybe because you do not trust Facebook). The network would need to be able to use many different kinds of sensitive data, including identity data, location data, maybe even payment data. But by putting all this data on a blockchain, you’ve already exposed it. This is a worse situation than if you hadn’t used a blockchain, since you wouldn’t need to wait for the network to be hacked – the data would already be public.

If you first encrypt the data and then put it on a blockchain, you haven’t solved the problem. Now you can’t perform any computations on that data, and even if you could, you still run into the problem of scalability. But remember that we still want things to be decentralized – so using a centralized database as a solution means you just have all the existing risks of centralized applications with none of the benefits. We need to find a scalable solution for computing over private data that acts in a decentralized manner.

Creating Decentralized Privacy Solutions

Fortunately, these types of privacy technologies have been an area of research for decades – long before blockchains were proposed and popularized by Satoshi Nakamoto. In cryptography, this field of computing over encrypted data is known as secure computation. With secure computation, nodes in a decentralized network (as well as the public) are able to execute and validate computations while never seeing the data they are working with. When combined with blockchain, secure computation networks can form a foundational platform for decentralized applications. While blockchains alone have limited utility, secure computation is a critical field in its own right, enabling many different kinds of functions such as secure data sharing.

Below we will describe a few different types of privacy solutions that are active areas of research and development. At Enigma, we primarily use Secure Multi-party Computation (MPC) and Trusted Execution Environments (TEEs), but we’re actively looking into all of the technologies listed here. These descriptions will get a bit technical, but they provide a very good starting point if you want to learn more about what privacy experts are building and thinking about!

Secure Multi-Party Computation

Known as MPC for short, secure multi-party computation starts by asking a philosophical question. Is there any trusted third party, a supercomputer of sorts, that we can send our data to and trust it to perform computations on our behalf, without potentially leaking our private information? This is the equivalent of envisioning a server that can never be hacked or compromised (internally or externally) – an idealized scenario. Since this is not possible in reality (or else we wouldn’t need security at all), we instead aspire to simulate such an omnipotent and trustworthy machine.

MPC proposes to emulate such a trusted third party by combining untrusted parties. In other words, we can design a decentralized network of computers that will ensure that no data leaks during computation. Each computer in the network only sees encrypted pieces of the data, never anything meaningful. The only way to recover the plaintext data is for enough of the players in the network to collude and pool their pieces (as opposed to gaining control of a single secret key). The number of parties needed to reconstruct the data is a tunable parameter that can range from some portion of the network up to all of them.
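To make the idea concrete, here is a minimal Python sketch of additive secret sharing, one common building block of MPC. It is an illustration under stated assumptions, not Enigma's protocol: the modulus PRIME and the function names share, reconstruct, and secure_sum are hypothetical, and the scheme shown requires every party to collude before anything leaks (a threshold variant such as Shamir's scheme is not shown).

```python
import secrets

# Illustrative field modulus (an assumption for this sketch): a Mersenne prime.
PRIME = 2**61 - 1


def share(secret, n_parties):
    """Split `secret` into n additive shares that sum to it modulo PRIME.

    Any n-1 of the shares are uniformly random and reveal nothing on their own.
    """
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recover the secret; this needs every share (all parties must cooperate)."""
    return sum(shares) % PRIME


def secure_sum(private_inputs):
    """Compute the sum of all inputs without any party seeing another's input."""
    n = len(private_inputs)
    # Each input owner splits its value and sends one share to every party.
    distributed = [share(x, n) for x in private_inputs]
    # Party j adds up the shares it received; it only ever sees random-looking numbers.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the combined partial sums reveal the result, and nothing but the result.
    return reconstruct(partial_sums)
```

For instance, three parties could call secure_sum([52000, 61000, 58000]) to learn their combined total while each individual figure stays hidden from the others.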

Fully Homomorphic Encryption

Fully homomorphic encryption (FHE) is a purely software-based solution to privacy. Recall that encryption is a scheme for hiding data so that it appears meaningless to anyone except those who have access to the secret decryption key. One shortcoming of encryption is that, in general, computing over the ciphertexts does not produce the same result as computing over the plaintext data. When it does, that is, when an operation on ciphertexts decrypts to the result of the corresponding operation on the plaintexts, we call the scheme homomorphic. A scheme is fully homomorphic when this holds for arbitrary computations rather than for a single operation.

For example, imagine we have two values a and b, and using a homomorphic encryption algorithm we obtain the encrypted values ea and eb. If we compute ec = ea + eb on the ciphertexts, then ec is an encryption of the sum a + b. In other words, when ec is decrypted, it yields the sum of the two values. Note that for now FHE remains largely a theoretical advancement, and it is very challenging to make these types of schemes practical for real-world use.
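The a, b, ea, eb example can be sketched in Python with the Paillier cryptosystem. Two caveats apply: Paillier is only additively homomorphic, not fully homomorphic, and in it the "+" on ciphertexts is realized as multiplication modulo n squared. The function names and the tiny primes below are assumptions chosen for illustration and are far too small to be secure.

```python
import math
import secrets


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair built from two small Mersenne primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n (Python 3.8+)
    return n, (lam, mu)  # public key n, private key (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n using the generator g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # fresh randomness for every ciphertext
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c with the private key."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the hidden plaintexts."""
    return (c1 * c2) % (n * n)


n, priv = keygen()
a, b = 17, 25
ea, eb = encrypt(n, a), encrypt(n, b)
ec = add_encrypted(n, ea, eb)  # computed without ever decrypting ea or eb
assert decrypt(n, priv, ec) == a + b
```

Whoever computes ec never needs the private key; only the key holder can decrypt the result.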

Zero-Knowledge Proofs

Zero-knowledge proofs (ZKPs) are a specific kind of secure computation – less general than the above techniques. ZKPs focus on proving/disproving statements. The goal is to have the prover convince the verifier that some statement is true, without revealing any other information.

The simplest type of ZKP is a proof of knowledge. In this version, the prover must show that they possess knowledge of some secret information, without revealing it. If the rules of the game were different and we didn’t care about revealing the secret information, then the solution would be trivial: simply show the secret to the verifier. Instead, we must find another way. One significant real-life example where ZKPs are useful is authentication. One could prove their identity by showing they have knowledge of some secret passphrase or key, without actually providing that secret directly. However, this property creates a shortcoming, as one can only prove statements about secrets one has access to. In other words, in multi-actor systems such as auctions, we would still need to trust an auctioneer who sees every bid to compare them and reveal the winner.
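As a rough illustration of a proof of knowledge, the sketch below follows the classic Schnorr identification protocol in Python: the prover shows knowledge of a secret exponent x behind a public value y = G^x mod P without sending x. The group parameters P, Q, G are toy values chosen only so the arithmetic is easy to follow, and the function names are hypothetical; real deployments use far larger groups.

```python
import secrets

# Toy group (an assumption for this sketch): P = 2*Q + 1 with P and Q prime,
# and G = 4 generating the subgroup of order Q. Far too small for real use.
P, Q, G = 2039, 1019, 4


def keygen():
    """Prover's secret x and the matching public value y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def commit():
    """Step 1 (prover): pick a one-time nonce k and send t = G^k mod P."""
    k = secrets.randbelow(Q - 1) + 1
    return k, pow(G, k, P)


def challenge():
    """Step 2 (verifier): reply with a random challenge c."""
    return secrets.randbelow(Q)


def respond(x, k, c):
    """Step 3 (prover): answer s = k + c*x mod Q; the nonce k masks x."""
    return (k + c * x) % Q


def verify(y, t, c, s):
    """Step 4 (verifier): accept if G^s == t * y^c mod P; x is never seen."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


x, y = keygen()
k, t = commit()
c = challenge()
s = respond(x, k, c)
assert verify(y, t, c, s)
```

The verifier only ever sees y, t, c, and s. Because k is fresh and random each time, s by itself gives no usable information about x.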

The recent interest in ZKP stems from the introduction of zk-SNARKs (zero-knowledge Succinct Non-interactive ARguments of Knowledge). zk-SNARKs are a special form of ZKP that is both non-interactive (the prover and verifier aren’t required to be online at the same time) and succinct (the proofs are small in size, which makes verification fast). The two major shortcomings of the technology are that generating the proof is still incredibly slow (proving even relatively simple statements can take minutes), and that the cryptographic assumptions used are fairly new and not well established in academia or industry.
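The non-interactive property can be illustrated without a full zk-SNARK. The Python sketch below is not a zk-SNARK; it applies the Fiat-Shamir heuristic to a Schnorr-style proof of knowledge of a secret exponent, replacing the verifier's random challenge with a hash so that the proof can be produced once and checked later by anyone. The toy parameters P, Q, G and the names prove and verify are assumptions made for this example.

```python
import hashlib
import secrets

# Toy group (an assumption for this sketch): P = 2*Q + 1 with P and Q prime,
# and G = 4 generating the subgroup of order Q. Far too small for real use.
P, Q, G = 2039, 1019, 4


def _challenge(y, t):
    """Derive the challenge by hashing the public values (Fiat-Shamir heuristic)."""
    data = f"{G}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x):
    """Produce a standalone proof of knowing x such that y = G^x mod P."""
    y = pow(G, x, P)
    k = secrets.randbelow(Q - 1) + 1  # one-time nonce
    t = pow(G, k, P)
    c = _challenge(y, t)  # no verifier needs to be online to supply this
    s = (k + c * x) % Q
    return y, (t, s)


def verify(y, proof):
    """Check the proof at any later time, with no interaction with the prover."""
    t, s = proof
    c = _challenge(y, t)
    return pow(G, s, P) == (t * pow(y, c, P)) % P


y, proof = prove(123)
assert verify(y, proof)
```

A zk-SNARK goes much further, proving general statements with short proofs, but the prove-once, verify-later flow is the same.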

Trusted Execution Environments

As opposed to the above techniques, Trusted Execution Environments (TEEs) are a hardware-based privacy solution. In a TEE-based network, secure hardware is used to protect the data that is being used from leaking outside the hardware itself. Using techniques like remote attestation, which lets a remote party check that genuine secure hardware is running the expected code, users of the network can be confident that the encrypted data they submit remains private. The primary tradeoff of TEEs is that you must trust that the hardware has not been compromised. However, performance can be significantly faster than with purely software-based methods of secure computation.

Conclusion

We could go into extensive detail about all of the above solutions, but in the interest of time, we instead will summarize the points from the above article.

First, privacy solutions are critical to a decentralized future. As shown earlier, blockchains alone can’t become a foundation for the next Internet. At Enigma, we believe that these types of privacy technologies are just as groundbreaking and valuable as blockchain itself. Making these technologies useful at scale has been a dream for decades – and if solved, it will be regarded as one of the greatest achievements of this century, enabling thousands of new kinds of applications.

Second, there are many kinds of privacy solutions. We covered just a few of these solutions above and only at a very high level. By combining and optimizing different types of privacy solutions, you can create different kinds of networks with different strengths. While we are mostly focused on Secure Multi-party Computation and Trusted Execution Environments, we are actively conducting research in other areas as well.

If you are interested in learning more about any of the above privacy technologies, please do your own research! I hope this article has awoken your interest in privacy – something that is often invisible, but affects all of us deeply. It is exciting to know there are so many unsolved questions and amazing opportunities that remain in the decentralization space. So if you’re inspired, please come build with us!

Written by Tor Bair, Head of Growth at Enigma.

About Enigma

Enigma is building a privacy protocol – a blockchain-based network for secure computations. Enigma uses cutting-edge cryptographic methods and secure hardware solutions to turn ordinary smart contracts into “secret contracts”. With our technology, sensitive data used by smart contracts is never exposed, creating a platform where truly decentralized applications can run securely at scale. We believe Enigma is the missing piece to building and unlocking the value of a decentralized future. You can join the conversation with Enigma on Twitter and Telegram.