Cryptocurrency Code Review: Chainlink


Disclaimer: These reviews are done as is, based on what is on display in the master branch of the repos made available. This review is not a comment on the overall project, its scope, or its success. It was done as an educational review by me, and any comments in the article are simply my opinion. It should not be used as comment or advice on the project as a whole.

Review Date: 25/03/2018

@Theroryshow (Rory) from ChainLink reached out and asked that I do a third-party review of ChainLink. I never like it when a member of the team actually reaches out, since it almost always guarantees the code is good. But I’ve been meaning to do ChainLink for a while now, so let’s have a look.

Active repo, a good number of contributors, and a good branch structure (features/chores).

A good spread of contributors across commits, and good commit messages.

So at this point, let’s quickly talk about ChainLink. Why is off-chain data important? Let’s say you want to write a smart contract that does token swaps between 2 ERC20 tokens. At the base level, you can do this 1:1 or you can define a hardcoded ratio such as 1:100. But as we know, price is volatile, so to implement such a mechanism you need real-time price data. This isn’t possible in a smart contract, since it can’t make an outside request.

So how would you solve it? The easiest way is to have the contract emit an Ethereum log event and register your own external server to receive these events. When the server receives the event, it makes the HTTP (web) request and gets the current trading price. Once it has this data, it pushes it back into the smart contract via a transaction call. Now the smart contract has the data, so it can execute the transfer.
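The round trip above can be sketched in a few lines. This is a toy simulation only, not Ethereum or ChainLink code: `SwapContract`, `make_relay` and `fake_price_api` are names I made up for illustration, and the price returned is a placeholder rather than real market data.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SwapContract:
    """Toy stand-in for a smart contract (hypothetical, not real Ethereum code)."""
    rate: Optional[float] = None
    listeners: List[Callable[[str], None]] = field(default_factory=list)

    def request_price(self, pair: str) -> None:
        # The contract cannot make an HTTP call; it can only emit an event.
        for listener in self.listeners:
            listener(pair)

    def fulfill(self, rate: float) -> None:
        # Called by the off-chain server through a transaction.
        self.rate = rate

    def swap(self, amount_in: float) -> float:
        if self.rate is None:
            raise RuntimeError("price has not been delivered yet")
        return amount_in * self.rate


def make_relay(contract: SwapContract, fetch_price: Callable[[str], float]):
    """Build the off-chain handler: on event, fetch the price and push it back."""
    def on_event(pair: str) -> None:
        contract.fulfill(fetch_price(pair))
    return on_event


def fake_price_api(pair: str) -> float:
    # Placeholder for the HTTP request to an exchange; the value is made up.
    return 100.0


contract = SwapContract()
contract.listeners.append(make_relay(contract, fake_price_api))
contract.request_price("TOKENA/TOKENB")   # 1. contract emits the event
tokens_out = contract.swap(2.0)           # 2. price is now on "chain"
```

The point of the sketch is the shape of the flow: the contract only emits and receives, and everything that touches the outside world lives in the relay.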

From the above, you can see that this isn’t really for everyone: you need your own servers and event subscribers. So what was the option before ChainLink? The option was Oraclize (http://www.oraclize.it/), a centralized company that did this work for you. You prepay their smart contract with ETH, and then you can make requests, get the result and have it pushed back into your smart contract. But for this you have to trust them. What happens if they no longer exist? Your whole solution is dead.

So a decentralized solution is required, and this is where ChainLink steps in. At its core it does the same thing, but in a decentralized fashion. This comes with a few issues. First, what if it’s in the interest of the node giving me the trading price (in the example above) to report a much lower (or higher) value to benefit itself? So this data has to be validated, and this is where shared consensus comes in.

This comes with a lot of complications, though. In Ethereum, events are deterministic, which means if you replay 1 + 1 you will always get 2. HTTP systems are often not deterministic: if I make the request for the trading price 2 seconds apart, that price can be very different, so which one of the two nodes is correct then? The other problem is that essentially all nodes have to do the HTTP request, but only 1 gets rewarded. This is very different from standard deterministic blockchain events, since a) no matter what the time difference, I will always get the same result and b) while a block is expensive to make, it’s super fast to validate (unlike in the oracle, where each node has to pay for the data/bandwidth cost). Let’s use the highly unlikely example that I want to download a 1 TB file. To validate that a node isn’t cheating, each node has to download that 1 TB file, but only 1 is going to get the reward.

ChainLink has a solution for problem 1: they aggregate the responses. I’m not a huge fan of it. The problem is that if I want exact, millisecond-sensitive data, I don’t want a median returned to me, so for time-sensitive data I don’t think this is an ideal solution. For less volatile data, perhaps user details or weather details, it is however a good way to get off-chain data.
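To make the trade-off concrete, here is a minimal sketch of median aggregation. It assumes the median is the aggregation function; the `aggregate` helper and the price figures are made up for illustration and are not taken from the ChainLink repo.

```python
from statistics import median
from typing import Sequence


def aggregate(reports: Sequence[float]) -> float:
    """Combine independent node reports into a single answer (hypothetical helper)."""
    if not reports:
        raise ValueError("no oracle reports to aggregate")
    return median(reports)


# Four honest nodes that fetched the price at slightly different moments,
# plus one node reporting a value far too low to benefit itself.
honest_reports = [101.2, 100.9, 101.0, 101.1]
dishonest_report = 10.0

answer = aggregate(honest_reports + [dishonest_report])
```

The dishonest value barely moves the result, which is the upside. The downside is the one described above: the answer is a blend of several slightly different moments in time, not the exact price at any one instant.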

For issue 2, I haven’t really seen a solution.

Going over their wiki, I saw something I really liked though (which wasn’t covered in the whitepaper). There are 5 ways to trigger a ChainLink job: RunLog, Web, Cron, EthLog and RunAt.

Cron is the important one. Cron is a Unix-based scheduling mechanism that lets you set something to run at a specific time or interval. This is something currently lacking in Ethereum smart contracts (timers were omitted for a very good reason, since they destroy the deterministic nature of the blockchain, so they can’t be a core part of it), but it is important for real-world cases.
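For readers who haven’t used cron, here is a rough sketch of how a schedule expression is matched against a point in time. It assumes the classic five-field Unix syntax (minute, hour, day of month, month, day of week) and supports only `*`, `*/n` and plain numbers. It also requires every field to match, whereas real cron has extra rules when both day fields are restricted. ChainLink’s own schedule format may differ, and the function names are mine.

```python
from datetime import datetime


def field_matches(spec: str, value: int) -> bool:
    """Match one cron field; supports '*', '*/n' and a plain number only."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return value % int(spec[2:]) == 0
    return value == int(spec)


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if the five-field expression fires at the given minute."""
    minute, hour, day_of_month, month, day_of_week = expr.split()
    # cron counts Sunday as 0; isoweekday() counts Sunday as 7.
    cron_weekday = when.isoweekday() % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day_of_month, when.day)
        and field_matches(month, when.month)
        and field_matches(day_of_week, cron_weekday)
    )


every_15_minutes = "*/15 * * * *"
sundays_at_nine = "0 9 * * 0"

fires_at_half_past = cron_matches(every_15_minutes, datetime(2018, 3, 25, 10, 30))
fires_one_minute_later = cron_matches(every_15_minutes, datetime(2018, 3, 25, 10, 31))
fires_sunday_morning = cron_matches(sundays_at_nine, datetime(2018, 3, 25, 9, 0))
```

A scheduler would evaluate something like `cron_matches` once a minute and start the job whenever it returns true.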

An example people often like to use with blockchain and smart contracts is the flight insurance contract. You buy flight insurance, the blockchain sees the flight was delayed, and you are paid out 24 hours later. This isn’t actually currently possible on the blockchain, since a) it can’t natively receive data that the flight was delayed and b) it can’t trigger an event to happen at a specified time. All smart contract events must be triggered by an initiating transaction.

So a time-trigger and event-trigger solution is a must-have to actually build real-world examples. For this reason, I think solutions like ChainLink are critical.

However, if we look at something like Oraclize, they pay for their server usage via ETH fees. There is no reason you can’t have a 100% ETH-based decentralized solution, where you pay the decentralized nodes via ETH fees for their data transfer (a simple Ethereum-based side-chain). So in its current iteration, ChainLink as a token doesn’t have a great value proposition for me.

As they expand to more blockchains and become more agnostic, this view changes. But right now, with just Ethereum support, I actually find it very annoying that you have to own both ETH (to trigger the log event) and ChainLink tokens (to pay for the data request) just to make a simple HTTP request.

This model also greatly limits the kind of data you might want. Let’s look back at the flight insurance example: how do you know the flight was delayed? Well, you have to ask the flight system (a third party) every few seconds/minutes/hours whether the flight was delayed. But think how expensive this will become. Essentially, I don’t want to do any more than 1 request, so I will do 1 request after the flight was supposed to have landed. Not a very user-friendly system, is it?
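The arithmetic behind that is simple. The fee and the time window below are made-up numbers, purely to show how the cost scales with polling frequency. They are not actual ChainLink pricing.

```python
def polling_requests(window_minutes: int, interval_minutes: int) -> int:
    """Number of paid requests needed to poll for a whole window."""
    if interval_minutes <= 0:
        raise ValueError("interval must be positive")
    return window_minutes // interval_minutes


FEE_PER_REQUEST = 1        # hypothetical fee, in arbitrary token units
WINDOW_MINUTES = 6 * 60    # hypothetical: watch the flight for 6 hours

cost_every_5_minutes = polling_requests(WINDOW_MINUTES, 5) * FEE_PER_REQUEST
cost_every_hour = polling_requests(WINDOW_MINUTES, 60) * FEE_PER_REQUEST
cost_single_check = 1 * FEE_PER_REQUEST
```

Under these assumptions, polling every 5 minutes costs 72 times as much as a single check after the scheduled arrival, which is why you end up settling for the single check.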

So let’s dive into the code. HTTP is actually very finicky and everyone has a different implementation: one developer will use basic auth, another JWT, another sessions, another a combination of the above. What this means is that to support HTTP, you need to support payloads and headers.

So right off the bat, what I don’t like is that there aren’t any headers being specified there, only the URL and then the basic auth parameters. This means they are currently coding for a limited set of cases. But I’m sure they will still expand on it, so let’s move on.
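As an illustration of what fuller HTTP support needs to carry, here is a minimal sketch that builds (but does not send) a request with a payload, custom headers and a bearer token. This is my own example using Python’s standard library, not ChainLink’s adapter code. The URL, token and `build_request` helper are hypothetical.

```python
import json
from typing import Dict, Optional
from urllib.request import Request


def build_request(
    url: str,
    payload: Optional[dict] = None,
    headers: Optional[Dict[str, str]] = None,
    bearer_token: Optional[str] = None,
) -> Request:
    """Assemble an HTTP request object; nothing is sent over the network."""
    all_headers = dict(headers or {})
    body = None
    if payload is not None:
        body = json.dumps(payload).encode("utf-8")
        all_headers.setdefault("Content-Type", "application/json")
    if bearer_token is not None:
        # JWT-style APIs expect the token in the Authorization header.
        all_headers["Authorization"] = f"Bearer {bearer_token}"
    method = "POST" if body is not None else "GET"
    return Request(url, data=body, headers=all_headers, method=method)


request = build_request(
    "https://api.example.com/price",   # hypothetical endpoint
    payload={"pair": "ETH/USD"},
    headers={"Accept": "application/json"},
    bearer_token="example-token",      # hypothetical credential
)
```

A job spec that only carries a URL and basic auth parameters can’t express a request like this one, which is the limitation I’m pointing at.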

Their code is incredibly intertwined with Ethereum, so it wasn’t designed with multiple blockchains in mind. Right now they are building Ethereum oracles. This concerns me a little bit from an architecture perspective, since it means they are rushing for their primary use case. Ideally they should have abstracted as much as possible away from the blockchain and followed an ESB (Enterprise Service Bus) design approach instead. This will hurt their development in the future.

Their cron implementation isn’t timezone agnostic. What this means is that it doesn’t take nodes in multiple timezones into consideration, so some nodes will fire at different times or hours because of localized timezone adjustment. This tells me they don’t have much production experience in this field.
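To show why this matters, here is a small sketch of two nodes that both interpret "09:00" in their own local time. It illustrates the general problem only and is not taken from the ChainLink code. The offsets and the helper name are assumptions.

```python
from datetime import datetime, timedelta, timezone


def local_fire_time_utc(
    year: int, month: int, day: int, hour: int, utc_offset_hours: int
) -> datetime:
    """Convert a node's local wall-clock fire time to UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local_time = datetime(year, month, day, hour, tzinfo=local_zone)
    return local_time.astimezone(timezone.utc)


# Both nodes are configured to run the job "at 09:00" on the same day.
node_a_utc = local_fire_time_utc(2018, 3, 25, 9, utc_offset_hours=2)   # UTC+2 node
node_b_utc = local_fire_time_utc(2018, 3, 25, 9, utc_offset_hours=-5)  # UTC-5 node

gap = node_b_utc - node_a_utc
```

Here the two nodes fire 7 hours apart for the same schedule. The usual fix is to evaluate every schedule in UTC, regardless of where the node runs.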

They have implemented all of their job specs, and they have implemented all of their initiators. There are some good fallbacks in terms of job spec validators and even syntax checkers.

What I don’t see anywhere, though, is the consensus and validation code. Right now this looks like a standalone client that simply stores the jobs and can execute them. I don’t see any p2p replication, I don’t see any transaction system, and I don’t see their own consensus. And I hate to say it, but that’s the difficult part; they have done the easy part. This means they are very far away from reaching their whitepaper goals.

Conclusion: It’s really good code, but it is missing some production experience, and they seem far from achieving their whitepaper goals. For now they are a standalone application that can listen to Ethereum-only events and do basic HTTP requests.