Can Ethereum Overcome Its Immense Technical Problems? [Podcast]
Princeton Computer Science Professor Michael Freedman on why scaling this blockchain-based computing platform will be so difficult.
Ethereum is an "unstoppable computational machine" that puts "a lot more power and sovereignty in the individual." That's how ethereum co-founder Joseph Lubin described this decentralized global computing network in an interview with Reason last year. Centralized institutions, Lubin says, will gradually be replaced by scattered computing nodes, leaving regulators no "corporation that [they] can step on, take information from, or cause to act in a certain way." Ethereum will become as well-known as the internet itself, "impact[ing] all aspects of our existence."
It's a seductive pitch. "Ether"—the digital currency that powers the ethereum network—has increased in value about 3,000 percent so far in 2017 and it's starting to attract the same mainstream attention as bitcoin. Last month, a company built on the ethereum platform raised the equivalent of $153 million in three hours.
Now that ethereum has a market cap of more than $18 billion, the network is facing immense technical challenges. Can it scale to serve millions of users? To discuss, we spoke with Michael Freedman, a professor of computer science at Princeton who, starting in the late '90s, worked on an earlier wave of peer-to-peer computing technologies. In 2011, Freedman co-authored a paper about a technique called "sharding" that ethereum developers are currently working to integrate into their network. Today, he's the co-founder and chief technology officer of the database company TimeScale. He also serves as an advisor to the decentralized computing platform Blockstack, which we discuss in the interview and which was the subject of a recent Reason video.
https://soundcloud.com/reasonmag/can-ethereum-overcome-its-immense-technological-problems
This is a rush transcript—check all quotes against the audio for accuracy.
Michael Freedman: The problem, I think, is that the timeline for these things is very uncertain. This comes at a time when we've seen this rapid increase in the adoption of ethereum, with hundreds of millions or billions of dollars now being put into it. The question is: are we going to be able to develop the technology at the same speed at which demand is jumping in?
Why this is a real problem, I think, is that the main technique that allowed us to achieve the security we have in bitcoin or in ethereum, which is that effectively everybody can validate every transaction, everything that happens on the network, is also what inhibits scale.
The fact that there's no secret information or the fact that there is basically one kind of global historical log means that we kind of can't split this up into little things going off to the side, which is normally how we scale these things. Everybody does little work off to the side and then all together we make forward progress. That's fundamentally hard in this new model of global consensus.
Jim Epstein: This is the blockchain that you're referring to? This idea that everyone can see every bit of information.
Freedman: Correct. Everybody can see every bit of information. Or, even putting aside the global visibility, there's really just one historical log of basically every event that happens in the whole world.
Epstein: Why is it hard to scale a system like that?
Freedman: Well, because it kind of requires that as somebody is trying to make some progress you basically keep in your head everything else that happened in the world. Imagine that if you're trying to learn something where in your mind you have to think about all other worldly information that's happening at that same time. Effectively that's currently how blockchain works. Everybody knows everything in the world.
While on the flip side the traditional way we've known to scale for now decades is you kind of split up the work. You only focused on your little bit of the information you care about. Other people could focus on theirs. Ne'er the twain shall meet in that you don't care about their work and therefore you don't need to track it.
Epstein: Bitcoin is a currency system where you see all transactions in the ledger. Ethereum takes us to another level because it's attempting to be a global computer, where all computation, or essentially all computation, happens in this shared database that everybody sees. This element where everyone sees all of these computations, that's what you're not sure about. Is it that it becomes too much data and the ledger, this shared database, becomes too long or too large in terms of gigabytes? Or is it something else? Is it about the way the information flows?
Freedman: These two things are very much related because as you currently have this global ledger the more that you're trying to at any one time ... If we're trying to do 100 things at once versus 100,000 things at once effectively the ledger is going to grow at a different rate. The fact that as more people try to use it we go from trying to do one or two transactions a second to seven transactions a second to 100 to a million transactions a second.
Those are orders-of-magnitude changes, which mean that, one, everybody participating has to download more data just to keep up. Two, the growth of this ledger continues to get larger and larger over time. It's really the fact that at any one time you need to do so much more just to keep up; that's the scalability problem.
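To put rough numbers on that growth, here is a minimal back-of-envelope sketch in Python. The per-transaction size is an illustrative assumption, not a figure from the interview:

    # Rough ledger growth at different throughputs, assuming ~250 bytes
    # per transaction (an illustrative assumption, not a quoted figure).
    BYTES_PER_TX = 250
    SECONDS_PER_DAY = 86_400

    for tps in (7, 100, 1_000_000):
        gb_per_day = tps * BYTES_PER_TX * SECONDS_PER_DAY / 1e9
        print(f"{tps:>9} tx/s -> ~{gb_per_day:,.1f} GB of new ledger data per day")

Under those assumptions, seven transactions a second adds a fraction of a gigabyte per day, while a million adds tens of terabytes per day, all of which every fully validating node would have to download just to keep up.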
If you look at the shift in computing, there was a big change in the '90s, when we went from what they sometimes refer to as "scale up," where you have mainframes and you build bigger and bigger individual computers, to "scale out," where you have normal ... The things that Google is running and the thing that you're running on your laptop are very similar. In fact, it's just that Google or Facebook run a million of them, or 10 million of these computers, at once, as opposed to one on your desktop.
It's not that they've built a giant computer. They just have a lot of computers that do things in parallel. That's kind of the scaling problem here. So far, the way that we have scaled blockchain technologies ... All of the current proposals about making the block sizes bigger and doing some other things are still about scaling up. How do you build effectively bigger and faster mainframes? If we really want to scale, when we want to hit those next orders of magnitude, then we need to go into scaling out. How do we do things a lot in parallel? That's something that the technologies just don't yet have.
Epstein: The big idea that you hear in the ethereum space is called sharding. In a nutshell, and you can correct me, this is how the ethereum network is supposed to scale. It's that not every computer, every node in the system, needs to be updated with all of the current information, but rather individual nodes, kind of what you're saying, spread out, can do it locally, and then maybe at some later point they'll all come into consensus. This will solve the bottlenecking problem. What do you think about this idea of sharding?
Freedman: Everything I've talked about so far about scaling out in some sense uses what some might term sharding. This is not a new technology. This is how systems have been scaling out since the '90s.
The real question is: what are the technical challenges between trying to do this scaling out and yet still providing the type of strong security guarantees one wants in these blockchains? It is not at all surprising that one turns to sharding here. It's the general technique that we have been using for a long time now. The question is how you satisfy those needs given that it's harder to make this work in a world where you want to solve these global consensus problems.
Probably compared to a lot of other computer systems now the security risks are so much greater because now the penalty for getting it wrong could be hundreds of millions of dollars. When that's not normally the cost of an individual bug, however minor, in a normal type of internet infrastructure.
Epstein: Could sharding work?
Freedman: These are kind of formal mathematical things. Until you have impossibility results, and I know of none here, it's very unwise to say that something could never happen. Taking the lay answer: it's difficult but possible. We'll see how long this will take. We'll see how this will be aligned with what is also desired.
The reason I say that is because right now most blockchain technologies give a ... I don't want to go into all the technical details but they give this strong notion of consensus. Although, a slightly weird one because you might not reach that. You might not be guaranteed that consensus until an hour later after enough work has happened in the blockchain.
Epstein: Define the term "consensus." What does that mean?
Freedman: Right, so there's consistency notions in distributed systems. This is effectively like when you go into the bank and you see that you have a certain amount of money in the bank. You don't expect it all of a sudden to change randomly.
On the flip side, if you've ever tried to add something to your Amazon shopping cart you might have noticed that there might be some periods of time where you might see something in your shopping cart twice before it resolves itself. This is not really a bug. This is the fact that Amazon has decided to design their systems in a certain way to basically stress availability. That is they always want their service to be online and allow you to add stuff to your shopping cart. Over the fact of always basically being strongly consistent.
Strong consistency in the blockchain is what they sometimes refer to as consensus. Consensus is that we have many parties which are reaching agreement. What they're reaching agreement on is the state of the world. In particular, they're reaching agreement on who owns this coin. You don't want two people to own the coin at one time.
Now the way this works in the blockchain is that people could cheat for short periods of time. Or there's actually a lot of divergence over the very recent history. What happens is that over time, in particular after several of these blocks get released and turned into a chain, it will take more and more work, which means burning computational cycles and money, to try and rewrite history.
What people basically do is say, "Well, if this has been around for a while ..." Often it's common to wait for an hour or so. Then I'm comfortable enough with the risk, given how much effort it would take somebody else to try and rewrite history.
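A minimal sketch of that chaining idea, with the proof-of-work and networking left out entirely: each block commits to the hash of the previous one, so rewriting an old entry invalidates everything built on top of it.

    import hashlib

    def block_hash(prev_hash: str, payload: str) -> str:
        # Each block's identity depends on the previous block's hash, so
        # changing an old block breaks every block that came after it.
        return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

    chain = []
    prev = "0" * 64  # genesis
    for payload in ["alice pays bob 1", "bob pays carol 1", "carol pays dan 1"]:
        prev = block_hash(prev, payload)
        chain.append((payload, prev))

    # Rewriting the first payment changes its hash, which no longer matches
    # what the later blocks committed to; the whole suffix must be redone,
    # and in a real chain that means redoing all of its proof-of-work.
    tampered = block_hash("0" * 64, "alice pays mallory 1")
    print(tampered == chain[0][1])  # False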
This basically says that after you have that historical record, after a certain amount of time, basically the entire world agreed on exactly one record of history. There are alternate consistency models. For example, say you and I are running computer programs. You and I are selling goods, selling coins. I buy a coffee from the local bodega in New York, and you're in London and buy something at the nearest pub. Why does the world need to know that you bought your beer before I bought my coffee? That yours happened before mine? They really have nothing to do with one another.
Epstein: That's because you have a credit card processor? You have a third party running that financial transaction, which is what you don't have in the blockchain world. Is that right?
Freedman: Yes, but I'm actually asking the question more from a fundamental angle, which is: in a blockchain, which has one global log of all historical records, what the blockchain will actually have a record of saying is that you bought your beer three hours before I bought my coffee. The different question is: given that you're halfway around the world, why does the coin that you used to buy your beer have anything to do with the money I used to buy my coffee? Why do we need to establish this strict ordering between them?
There are alternate models in distributed systems that allow us to, say, not decide which one comes first. Effectively they say, "Well, they don't have any relationship to one another, so why do we need to actually ever make that decision?" This is what we might call weakening the consistency model, and it allows more flexibility and cheaper protocols.
One way some of these systems might scale to greater sharding is that they could actually change their model of what the system provides. Again, the possibility of this depends ultimately on what our final model is and what we want from it. Because everything right now is a little bit up in the air and not well-defined or formally considered, it's hard to tell what is actually going on and what properties it will finally have.
Epstein: To kind of continue with your analogy, though: in the real world where you're buying coffee or beer, the timing doesn't matter, but in the blockchain world, whether it be bitcoin or ethereum, the timing is essential.
Freedman: In the blockchain world it expresses that timing. My point is then you could say, "We have difficulty in achieving this scale out in a world that captures that timing." You might ask the question, "Well, is that timing actually fundamental?" If it's not fundamental can we weaken our need to capture that timing and therefore allow us to build blockchain-based stuff that could scale better based on certain types of interactions because we don't need to ever capture their timing with respect to one another.
Epstein: What you're weakening is the essence of what the blockchain brings to the table?
Freedman: I'm pausing in my answer because that is weakening a current essence of the blockchain. I totally agree with that. I think the thing people have always asked about in these types of distributed systems is that there are always trade-offs. In fact, in this area that I started talking about, there are known results, known impossibility results, about how much you can scale or how available you can make systems that have such strong consistency properties like the blockchain currently has.
What I'm saying is this might open up the door in creating a more expansive view about how do we develop different types of blockchains or add different properties if we're trying to optimize for different things? Right now we optimized for effectively strong global agreement. This is global consistency of all operations. It might be that we gave up certain things of scale to do so. Maybe there's other systems where we could achieve slightly different properties in terms of security and consistency for much better abilities to scale.
Epstein: In terms of that analogy, though, this idea that's called sharding, which is the big idea in the ethereum community for how we're going to scale this system, how does that work and how does it weaken this system of consensus?
Freedman: Yeah. Unfortunately ... I'm not necessarily always caught up on what came out of ethereum this week. I did spend some time trying to understand, at a more technical level, all the details of sharding. To be fair, right now, it seems a little less defined to me. It's kind of hard for me to answer that, given that it didn't seem sufficiently well-specified or formally specified for me to be able to form a deep technical understanding of-
Epstein: Well, sharding is this old technique. You wrote a paper about sharding in 2003.
Freedman: Sure.
Epstein: Sharding is a technique that's constantly used in making the internet run. What is sharding? Let's start there.
Freedman: Sure. Sharding basically says that if you have a big data set, or a database usually, you split it into lots of parts and let those parts operate independently. Think about the disc on your computer. Let's say you store one terabyte of data on your disc. You can't currently easily buy a 10 terabyte disc. Let's run 10 discs instead, and they each have one-tenth of the amount of data. That basically is what sharding says.
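A minimal sketch of that idea, assuming a simple hash-based assignment of keys to ten "discs" (shards); the point is just that each lookup or write only ever touches one of them.

    NUM_SHARDS = 10
    shards = [dict() for _ in range(NUM_SHARDS)]

    def shard_for(key: str) -> int:
        # Each key is assigned to exactly one shard; that shard is the only
        # one that ever stores or serves it.
        return hash(key) % NUM_SHARDS

    def put(key: str, value) -> None:
        shards[shard_for(key)][key] = value

    def get(key: str):
        return shards[shard_for(key)].get(key)

    put("alice", {"balance": 5})
    put("bob", {"balance": 3})
    print(get("alice"))  # only alice's shard did any work for this request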
Epstein: Then why is it difficult to apply this concept to the ethereum space?
Freedman: Well, ethereum and bitcoin and all of these blockchain technologies are similar in that each computer, in this example each disc, is supposed to see all of the data to make sure that you didn't, let's say, do something conflicting. Going back to my analogy of 10 discs, let's say there are 10 separate computers, 10 separate people. I didn't try to spend the same coin with person three as with person seven, and I didn't try to execute a program differently at party three and party seven and cause the world's state to diverge.
In bitcoin, I take that coin, I spend it, and all 10 nodes see that I spent it in one way. In a world where each node only sees a small portion of the data, a node responsible for one portion might not see that I took this coin and spent it with some other party, because it isn't storing that party's data.
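A toy illustration of that point, with made-up names rather than anything from bitcoin's or ethereum's actual data structures: a node that sees every spend catches the conflict immediately, while a shard that never saw the first spend has nothing to object to.

    # Full-view node: it has seen every spend, so the second use of the
    # same coin is rejected on the spot.
    spent = set()

    def accept_spend(coin_id: str) -> bool:
        if coin_id in spent:
            return False      # double spend detected
        spent.add(coin_id)
        return True

    print(accept_spend("coin-42"))  # True
    print(accept_spend("coin-42"))  # False

    # Sharded node: it only tracks the coins assigned to it. If the first
    # spend of "coin-42" was handled by a different shard, this node's
    # 'spent' set never recorded it, and the conflicting spend would pass
    # unless the shards coordinate with each other.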
Epstein: The trick then with sharding would be to let a certain amount of the information through, so everyone can see something but maybe not all of it. That way everyone can be up to date with the essential information. I assume that's what they're talking about when they talk about applying sharding to ethereum. Is that right?
Freedman: Well, yeah. Like I said, this is when I think some of the low-level details matter a lot in that effectively what you're going to need to do is to make sure that the fact that ... In my example I tried to split what was one party into 10. I still need to make sure that everybody sees just enough data in such a way that I can't slip through a transaction. It might turn out that the way to design it could, let's say, scale ...
In the example analogy I had of this coffee in New York and this beer in London, I was able to say, "Well, this didn't matter because they didn't have anything to do with one another." But what happens if that bartender in London immediately took the coin you just spent and tried to spend it in New York City? It might turn out in that case they do matter. They are related.
The fact of your ability to scale might also in fact be a function of the workload of the system. How closely related, how intertwined are all of these operations to one another? What does that mean in terms of my need to keep all of the different parties up to date with how each other is behaving?
There's a bunch of these different unknowns based on how well does the system allow me to update minimal information and how much does my workload even impact my ability to scale these things? These are all details that need to be explored and worked out.
Epstein: They need to be worked out. Again, the idea is, you're saying, that they can be worked out? Give me a sense of how big a problem this is in terms of figuring out the technical details to make this pretty exciting technology called ethereum grow so that it can serve millions of people.
Freedman: I don't like trying to predict the rate of technical change. These are probably easier than putting somebody on Mars, harder than building your latest website.
Epstein: That's a big range.
Freedman: There's a lot of unknowns here. It's always really difficult to predict the speed of how-
Epstein: Does it require some sort of a technical breakthrough? Is it uncharted territory technically? Do they have to figure something out that people who have been working in the field of distributed systems have not figured out up to this point?
Freedman: I think there are definitely aspects of this that require some technical breakthroughs. The reason is that computer science is somewhat different from a lot of fields, in that we sometimes create the world in which we operate. If you're building a rocket ship there are fundamentals of gravity that you need to deal with. If you're doing chemistry there are fundamentals of the world that you need to deal with.
In computer science, what's interesting is that we could tweak our worldview in order to make systems easier to build. An example I gave before I said, "Well, it might be particularly difficult if we want this global consensus to scale." In fact, this has been an outstanding problem ... Probably the weaker form of this has been an outstanding problem for at least 25 years.
The person who basically developed what we now think of as general consensus protocols actually won the Turing Award, which is computer science's highest prize, kind of like our Nobel Prize. That's Leslie Lamport, who developed some of these early consensus algorithms in the first place. They have always required that basically everybody participate in the interactions.
Now the question is: how can we change that? Do we think that we're not going to change any of our assumptions and just make progress, while adding on top of that the additional complexity of all the security aspects of blockchain? Maybe, but it's difficult, because when people have been working on this for so long, the question is: what do we know now? What has changed that people haven't had before?
One of the common ways you do this is you change your model. You say, "Well, I'm not going to work on the same problem. I'm going to work on a slightly easier problem and being able to make progress on that."
From what I do understand, it sounds like the ethereum community is probably taking the smarter approach of actually trying to solve an easier problem. The uncertainty really is: how easy are they ultimately going to make this? What do they give up in order to achieve those new properties, those new scaling properties?
Epstein: What they don't want to give up, though, is the purpose of this system, which is decentralization. It's that we can run online applications. We can engage in forms of trade that don't require a third-party, whether it be the government or a bank, etc. I guess the question is it sounds like what you're saying is if you bring a third-party into it it's much easier to solve these problems in terms of scalability. You get something that looks a little bit more like what we have in today's internet. The question is-
Freedman: Well, I wasn't only saying ... A trusted party is one direction, but I wasn't saying that. From 10,000 feet, or from a layperson's perspective, you could say bitcoin or blockchain was doing decentralization. As you know from my own background, decentralization has always been a cyclical aspect of the internet.
The whole start of the internet in the 1960s and '70s was the very notion that we didn't want a centralized telecommunication network. It's called the internet because it's the network of many networks. We wanted to be able to in a decentralized fashion connect together a lot of networks and make them work.
If you fast forward to the late '90s, and really starting around 2000, we had this rise of peer-to-peer systems. That again was very much decentralized. In some sense it was even more decentralized than blockchain is today, because there were no well-defined miners that are now creating this centralization effect of basically running big mining rigs. But they didn't have the same type of strong consistency properties that blockchain has.
Again, we certainly know how to build decentralized systems that are scalable. What blockchain was able to do was saying how do we actually get some decentralization while also getting these strong security or consistency properties? Now the question is what does the middle ground look like and ultimately how will that fall out?
Epstein: A difference with blockchain, and this goes for bitcoin and ethereum, is that there's a lot of money at stake. There's money running through this system, and that wasn't true of all these earlier decentralized systems, right?
Freedman: That's good and bad. The good thing is maybe it enables it not to just be the side project that you can hack on the weekend. You could spend real money hiring top people working on this thing. The bad thing is it raised the bar and it means that little bugs are really bad.
The DAO had what, $25 million or $30 million lost? Just last week or something we had a $30 million-plus heist of the software used to store wallets, multisig wallets. It creates a really high bar to get things right, because the incentives for trying to break these things are so strong.
Epstein: People in this space have drawn these analogies between the challenges of scaling the early internet and these challenges of scaling networks like ethereum. You worked on these early internet scaling problems. Is there an analogy?
Freedman: I should say that was 30 or 40 years before my time.
Epstein: You started in 1999. You worked on CoralCDN, right? Which addressed the Slashdot effect. Slashdot was an aggregator website that would send a ton of traffic to a site and blow it off the network. You were looking for solutions to that. When you started on these internet scaling issues, the internet had many fewer users and it wasn't clear that it was going to be able to scale to billions. Maybe it was. I don't know.
Freedman: Yeah, just to set the bar: a lot of the earlier work that I'm talking about, making the underlying internet, the communication, work, was going on in the early '70s. By the time all these peer-to-peer systems were being looked at, this was after the whole e-commerce growth, and we already had an internet with at least hundreds of millions, if not billions, of users.
Now I do think that the parallel isn't great. On one hand these systems are trying to figure out how to scale. What we were doing before was trying to allow anybody to communicate with any other party. Now I think it's almost a return to the opposite. We went from a world where we're going to allow millions or billions of people to just keep separate data to a world with the blockchain where we need to say there's a global ledger. In fact, we're re-centralizing all of the knowledge.
Now by centralize, it's not that it's run by one party but that all the knowledge is known by each party. That's almost counter to these other things where we try to split up the knowledge and let everybody just deal with their local information.
Now the question, what all this debate is about, is how do we deal with these two properties, which on their face look fundamentally at odds? How do we split the knowledge across the whole world so we can scale to really high numbers? And on the flip side, how do we do so in a way that keeps enough people looking at what each party is doing, so we can catch things like what is sometimes known as equivocation, which is telling different parties different types of information.
This equivocation is fundamentally what attacking the blockchain is about; it's what double spending is in bitcoin. It's saying: I am going to try to take one coin and spend it with two different parties. Equivocation is taking one fact and giving different answers to different people. When we have split up all of our knowledge into different parts, that's easy to get away with. When we've brought all of the knowledge together, and every party needs to know about all of the knowledge in the world, that's how the blockchain makes this problem go away.
Epstein: Talking about the early internet again, and before your time obviously, you've talked about how the genius of the internet was that it was built in layers. At its core are these simple computer programs called protocols. Talk about that, and then also talk about ethereum in that context. Does ethereum violate that ingenious structure?
Freedman: If you remember why I said it's called the internet, a network of networks, what was happening is that everybody had ... There were all these different computer networks running around the world. There wasn't agreement. There were old telecom networks. IBM had their own standards of what these networks should be. Ethernet started coming about at the time.
The internet was originally a project funded by the Defense Department. It was called the ARPANET, from DARPA, the Defense Advanced Research Projects Agency. It was attempting to basically define a protocol by which you could connect these disparate, separate networks and allow computers across them to communicate.
It was meant to be this global language by which you could actually have computers on different networks reach one another, connect them together. That is what is now known as the internet protocol, or IP. On top of that, people started asking the question: what should the intermediate layers look like?
One of the fundamental ideas in computer science is also that of abstraction. Abstraction says that we can define interfaces so that they guarantee some property and yet hide the actual implementation details. If you think about webpages, for example, whenever you access a webpage it speaks a standardized protocol. That's HTTP. That's why when you type a URL into your browser you see those four characters in front of it: HTTP, a colon, and then something like www.cnn.com. That's saying speak the HTTP protocol.
Epstein: Protocol meaning a computer program? A small program.
Freedman: Well, a protocol is actually really a description of what is the syntax and semantics of messages that could be sent between two parties. Obviously everything is generally implemented in software. This doesn't actually mean it's a fixed program. If you look at HTTP for example there are hundreds or thousands of different programs that implement the standard.
A protocol is like a human language, like English. There are words, there's pronunciation, and with those words and pronunciations people can communicate with one another. That's generally what a network protocol means.
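To make "syntax and semantics of messages" concrete, here is roughly the text a client sends to ask for a plain HTTP page, pushed over an ordinary socket, and the first line of the reply; the hostname here is just an example.

    import socket

    # A protocol is an agreed-on message format. This is the literal text a
    # client sends to request the front page of a site over plain HTTP, and
    # any server that speaks HTTP knows how to answer it.
    request = (
        "GET / HTTP/1.1\r\n"
        "Host: www.example.com\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    with socket.create_connection(("www.example.com", 80)) as sock:
        sock.sendall(request.encode())
        reply = sock.recv(256).decode(errors="replace")
    print(reply.splitlines()[0])  # e.g. "HTTP/1.1 200 OK"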
Where we're going with this message of abstraction is that if I want to connect two computers together do I need to call up my buddy and say, "hey, I want to speak to you in a certain way. Go learn my language." Or is there a ... I used English but is there a lingua franca of the way we can communicate? Can we say that if we each implement this particular way of communicating then we know we can communicate over the network?
That's effectively what layering did. It said that if we all implement this one standard, this way that we want to take messages and split them up into little packets, and the way we address those packets so that different computers on the internet know where to ultimately send them then we could basically build this model where we can always connect more and more computers to the network and they'll still all know how to speak with one another.
The magic of that layering, and I said connect more and more computers, is that the basic information sent in each packet back in the '70s and today is very similar. These protocols have actually changed very little. The reason that was important was that it gave us what people sometimes refer to as future-proofing the network, or future compatibility.
Often in software engineering you talk about backward compatible. How do I know that when I do a software update it still works on Windows XP? What a nice layering model will look like is when I build a piece of software today how do I know that in 10 years from now it's still going to work. When you're talking about building protocols that can be deployed all across the network it's very hard to update them. Once you release them, the genie is out of the bottle, and you better hope that what you did stands the test of time.
Epstein: By analogy is it like I can build the foundation of my house and then when I want to redo my kitchen I don't have to redo the foundation? It's like the complexity is at the surface layer?
Freedman: I'm happy with that analogy. Or to put it differently: this notion of building on top is a good one, because when you think about layering, what this was really designed for is to think of these things logically, going all the way down to what people call the link layer, which is how does your phone communicate with a cellphone tower? Or how does your laptop speak WiFi? That's what's called the link layer.
The network layer is how do those messages being sent off your laptop go all across the internet to speak to a server over there owned by Netflix to send you back your favorite movie? Then the application layer, which is where most of the action happens, which is how do we build new services on top of this network?
If you look at the reason that this thing has been future proofed is it said we actually don't need that much from the network. In fact, that was one of the key fundamental ideas of the internet is that you had mostly a dumb network and you moved all of the smarts and all of the choice to the endpoints, to the computers.
This was very different from what came before. Before, in the telecom network, all of the intelligence was in the core of the network. Ma Bell built these big computers, these big machines, and made all of the decisions about what services are available, while your endpoint is that really stupid analog device that you pick up and just dial numbers on.
You can look at it and say ... I remember as a kid watching commercials on television from AT&T saying we're going to get video conferencing any day now. I don't think that vision was ever realized by my telco provider. It certainly became realized over the internet, where we didn't have to wait for the ISPs to deploy video conferencing solutions. Skype came along, earlier things came along, and we could just immediately do this, running as software on the end hosts.
Epstein: Okay. Let's bring this to ethereum. Ethereum is not promising to lay new cables that will be a new internet, but it is promoted as a new worldwide web. A new operating system for the web. A rethink of the protocols and the software architecture of how we communicate online. Does ethereum mesh with this? Does it violate this vision of layering and a dumb network that you've laid out?
Freedman: None of it really violates this layering model at least that I describe because all of these things ... Perhaps one of the limitations of the initial layering model is that almost all of the interesting stuff was thrown up into the application layer. Everything we talk about in the world today from a network perspective is just part of the application layer.
What the network layering model really set out to do was focus on how we build the network infrastructure, and instead move most of the really interesting stuff into the application layer, into the end hosts.
Epstein: The application layer would be Twitter, Facebook?
Freedman: The computer. Twitter, Facebook, but also your web server that's sitting in your home. In fact, a blockchain miner and an ethereum node are just speaking an application-level protocol, ethereum's networking protocol, over the internet. From the internet's perspective it doesn't matter whether that's ethereum or bitcoin or email or Google's web search. They're all the same.
Then the question is, well, isn't this all part of the ... It doesn't violate anything. It's just an application layer. One of the things that happened in the internet architecture is all of the interesting stuff moved up into the application layer. Then we can look back and say, "Well, did it actually solve some of the fundamental problems?"
What started happening is that in order to make the internet trusted we started building out services that the whole internet relied on. Some of those include DNS. If you type in a URL you rely on some infrastructure. This is-
Epstein: DNS is how you find, say, Amazon.com. It's a directory.
Freedman: Correct. It's a directory that maps these human-readable names into IP addresses. IP addresses are those things that the underlying network uses. It's like your post office address. It's those things the underlying network use to figure out how to send your data to its destination. Like the servers that Amazon dot com is running on.
In order to use the applications it turned out that we needed to rely on things like DNS or relying on ... When you log into Amazon dot com the reason that you trust that you can buy stuff there and you're not speaking to some malicious hacker is that your browser comes pre-configured with cryptographic information from some specially trusted parties that are then going to have a relationship with Amazon. They're going to tell you, "Oh, by the way, this is actually Amazon who you're speaking with." Those are often called certificate authorities.
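A minimal sketch of those two dependencies using Python's standard library; it assumes network access and uses Amazon's public hostname purely as an example.

    import socket
    import ssl

    host = "www.amazon.com"

    # DNS: the directory lookup that turns a human-readable name into an
    # IP address the underlying network can actually route to.
    print(host, "->", socket.gethostbyname(host))

    # TLS: the browser's bundled certificate authorities are what let it
    # believe the server behind that address really is who it claims to be.
    context = ssl.create_default_context()
    with socket.create_connection((host, 443)) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            cert = tls.getpeercert()
            print(cert["subject"])  # identity vouched for by a certificate authority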
What happened is that while we are still running on top of the network, there were some fundamental things that everybody started relying on to have a secure internet. I think that's some of the best criticism coming from the various services that are looking to decentralize the internet: should that trust be held in a small set of hands? Is that trust well-placed?
There are a number of situations where the people entrusted with that information didn't really keep up their end of the bargain, whether because of security bugs, perhaps security incompetence, or sometimes their own commercial reasons.
Epstein: The structure of ethereum, where applications are running on this blockchain that everyone shares, is that in harmony with this dumb network model that's made the internet scale so well?
Freedman: It's kind of an apples-to-oranges question. From a strict network layering perspective it doesn't violate any aspect of it. But if you aren't so dogmatic about the question and look at the way that ethereum and bitcoin and these blockchains are scaling, the basic way they're scaling today is that effectively everybody knows all bits of information.
That's pretty counter to all the ways that we've scaled services before, whether sharding or the very traditional approach of using hierarchies. The way DNS and many services on the internet, and even systems in nature, scale is that you have a hierarchy of who knows what. The closer you are to the top, the more coarse-grained your knowledge is.
If you ask, in ethereum's world or bitcoin's world, who needs to know about every single transaction? Well, only a small set of involved parties does. Though there might be some parties out there that have a broader view, say, that this transaction happened in North America, and they might know some people in North America. That's kind of how you use hierarchies to scale things.
Epstein: Let me wrap up by getting to Blockstack, which is a project that you're an advisor to, and it came ... Muneeb Ali, who is one of the co-founders, got his PhD in your computer science department at Princeton. Tell me about this project in relation to what we've been talking about and why you think it might be a better model to bring some of these decentralized aspects to the web than, say, ethereum?
Freedman: Well, I think it's trying to solve a different problem. A lot of our conversation is about what is Ethereum doing or some other projects doing? It particularly has this worldview or it seems to have this worldview that basically we want everything to be running on this globally audited system.
A lot of the reason we're talking about scalability: a common number that's given is that bitcoin today does on average seven transactions a second, or close to that. At its peak, I think the Visa network does 200,000 transactions a second. If you look at the number of requests coming into Facebook or Google, they're going to be probably way higher than that.
Now you could ask the question of does this mean that in fact what we need from ethereum or the blockchain, do we need the network to do a million transactions a second? In the network, in a way that's globally auditable that everybody can see. Maybe the answer is no. Maybe the answer is instead you kind of figure out precisely what are the actual security properties you need that have this globally auditable view? What are the things that you could then take, some people in this community call it, off chain?
What does this network bootstrap? What does it allow you to do elsewhere that was hard, letting us now use other technologies that we already know how to use? Basic crypto, let's say. Once we use something like the blockchain to figure out what each party's cryptographic keys are, do we need to actually do their transactions on the chain?
Or could they just sign something, communicate it like they normally do, using those keys that they bootstrapped via the blockchain, and use public key crypto as we've known how to do it for 30 years? If anybody is later seen to be lying, then you could use the fact that they've cryptographically signed and committed to a message, and the fact that the blockchain tells you that these are their valid keys, to point out that they've done something wrong.
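A minimal sketch of that off-chain pattern, assuming the third-party `cryptography` package is available and assuming the blockchain has already published which public key belongs to the seller (that bootstrapping step is not shown):

    # pip install cryptography
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # The chain's only role in this sketch is to have established whose
    # public key is whose; the transfer itself never touches the ledger.
    seller_key = Ed25519PrivateKey.generate()
    seller_pub = seller_key.public_key()   # assume this binding came from the chain

    message = b"I transfer coin-42 to the buyer"
    signature = seller_key.sign(message)

    # Anyone holding the chain-published key can later prove the seller
    # committed to this message; lying about it becomes detectable.
    try:
        seller_pub.verify(signature, message)
        print("signature valid: the seller is committed to this transfer")
    except InvalidSignature:
        print("signature invalid")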
Really, the question in some sense is what is the level of global transparency you actually need or in fact want? One of the interesting aspects of this is do we want all of our computation to be happening in a global way? Or can we ask do we only want certain properties? Or are we trying to solve more concrete problems?
Coming back more concretely to Blockstack: one of the things I talked about earlier is DNS and certificate authorities. One of the problems with today's internet is that everybody relies on what are effectively 13 or so parties who run the so-called root DNS servers. If some of them were to go down, that creates availability problems. If any of them were to basically lie, they could do that at will.
Epstein: Which would be sort of like, we type in Google. We're relying on one of these services to route us to Google and they could send us elsewhere?
Freedman: Correct.
Epstein: We're relying on this. Okay.
Freedman: Correct. The same with certificate authorities. When you see that little lock box, that little lock in your browser bar, that says, "This is actually Google that you're speaking to" you rely on the certificate authority to make that secure.
In fact, different certificate authorities have been found to be giving incorrect information, probably purposefully, in order to satisfy the requirements or the asks of the local governments they're operating under.
Ideally, what you want is an internet where, in order to talk to Google, you could, A, figure out where Google's servers are located and somehow establish communication with the real Google. Then, B, once you speak with them, have a way to be confident that you're actually talking to Google and not somebody else who is trying to listen in on the information, or in fact even change the results of your search requests.
Now does that mean that Google doesn't provide a service? Do we need to get rid of Google and in fact make everybody's search queries public to the entire internet? I would think that most people would not want that and would consider it a massive step backward in terms of privacy.
Decentralization does not necessarily mean better privacy. It is really much more a function of what problem we're trying to solve. Is this a good problem for the solution? What I like about Blockstack is that it has figured out some of the services that are missing today, where we can leverage the blockchain to disintermediate trusted parties like the certificate authorities and the DNS root servers that are perhaps playing a larger role in the internet than what some of the early architects of the internet would have wanted them to.
The early internet was meant to be decentralized. We now have certain entities providing naming and security that are much more centralized than we thought we'd probably want. What Blockstack allows you to do is actually say, "These services we don't need to make centralized, we don't need to make trusted."
Now what you want to build on top of this that's up to you. I think that's actually much more conceptually in line with this notion of layering that said it wanted to enable choice at the application layer. If you want to build a decentralized search engine you could do that. If you want to build Google you could do that too.
It really enables choice while allowing you to bootstrap trust. As a user I have choice in who I use, and I don't have to trust other parts of the internet that I don't really have a relationship with and probably don't have good reason to trust, in order to decide that I want to use Google or DuckDuckGo or the latest, greatest decentralized search engine out there.
What ethereum is trying to do is figure out how to make as much computation as possible part of the global blockchain. Now it very well could be that when people build applications on top of ethereum, they might start deciding, "Well, let's do some of my application in the ethereum blockchain itself" and some of it in what they would call off-chain.
You could get orders of magnitude difference in what you could then do off-chain or on-chain. Fundamentally, I think the whole view of ethereum sharding and ethereum scalability is trying to ask the question of how much more support can we put in the chain to allow more and more computation on the chain?
What Blockstack, I think, is instead trying to say is ... Again, as much as we'd like one, there's not always a crisp demarcation between these two worldviews. What Blockstack has basically been trying to ask is: what are the minimal things that we need to do on the chain in order to give us the type of security that we'd want to build a lot of very interesting applications off-chain?
Blockstack's, in some sense, focus is how do we enable secure applications off-chain given what we can do on the blockchain. Ethereum is trying to say how do we put more and more type of computation in our form of smart contracts on the chain itself?
Epstein: Okay. All right. Thank you, Mike. I appreciate your time. I think we'll leave it there. Thanks for joining us.
Freedman: Thanks for having me.