bZx Is Not The Problem
Post by: Taylor Monahan, Founder & CEO of MyCrypto
As you likely know by now, the #DeFi space underwent a great shookening this week. This post is about that, but not really.
If you came here expecting technical specifics of the situation, you can read the analysis of the first incident, or bZx’s post-mortem of that incident, or the analysis of the second incident. I will not be rehashing here.
If you came here hoping for answers to the questions that have been raging, perhaps try discussing here: Are flash loans the problem? Was it a hack or taking advantage of an arbitrage opportunity? Was what the attacker did legal? Is DeFi decentralized? What is protecting these upgradable smart contracts? What role will insurance play in DeFi’s future? Was it an oracle problem? Was it a liquidity problem? Could other platforms be affected? What are flash loans good for?
Instead, I’m going to talk about the one thing everyone thinks they are talking about, but no one is actually talking about. How do we prevent this from happening again?
For this, we need to pull our heads out of the sand and look at those who came before us. As much as we love fancying ourselves as inspired rebels who are transcending the traditional systems, there is very little that is truly original about crypto. There is even less that is original about the hurdles we are encountering. And there is absolutely nothing original about the mistakes we are making.
One of the most frustrating aspects of the bZx situation is the defenders of bZx and their attempts to justify an unacceptable outcome to make it acceptable. Making excuses for your own fuckups is one thing and, while bZx shouldn’t back down from their responsibility, it’s an expected first response given the situation.
But making excuses for those who have shown no signs of being diligent about (or even aware of) the impact of their choices is quite another. Those who are responsible for millions of dollars of users’ deposits should not be able to avoid that responsibility when they lose their users’ deposits. Calling it ‘experimentation’ or ‘the early days’ negates our ability to learn and grow as the builders of our future.
Most importantly, as we debate whether bZx acted wrongly, or whether it was wrong of me to tell them they acted wrongly, or if flash loans are a problem, we only succeed in, maybe, slightly mitigating individual diversifiable risks. This distracts from the real systemic risk that is creeping in:
Just kidding! They waltzed in ages ago to great fanfare. Also if you hold MKR, vote to put a time delay so it doesn’t get flashf*cked too, thanks.
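As an aside, the time delay mentioned above can be sketched in a few lines. This is an illustrative Python model of a governance timelock, not Maker's actual Governance Security Module; the action name and delay value are hypothetical:

```python
class Timelock:
    """Minimal governance timelock sketch: queued actions can only be
    executed after a mandatory delay, giving token holders time to react
    to a malicious (e.g. flash-loan-powered) governance vote."""

    def __init__(self, delay_seconds):
        self.delay = delay_seconds
        self.queue = {}  # action id -> earliest allowed execution time

    def propose(self, action_id, now):
        # Record when the action first becomes executable.
        self.queue[action_id] = now + self.delay

    def execute(self, action_id, now):
        eta = self.queue.get(action_id)
        if eta is None:
            raise ValueError("unknown action")
        if now < eta:
            raise ValueError("timelock delay has not elapsed")
        del self.queue[action_id]
        return f"executed {action_id}"

tl = Timelock(delay_seconds=172800)  # two-day delay, illustrative
tl.propose("raise-debt-ceiling", now=0)
# Calling tl.execute("raise-debt-ceiling", now=60) here would raise,
# because the delay has not elapsed yet.
```

The point of the delay is that an attacker who briefly controls a majority of votes (say, via a flash loan) cannot both pass and execute a malicious action within a single transaction.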
Risk and Complex Systems
There are a few theories (by people far smarter than myself) on the techniques that can successfully mitigate risk in complex systems. Most interestingly, these have changed rapidly over the past century as the systems we built became more complex and their failures were no longer primarily or solely due to external forces.
Most of these risk management strategies revolve around identifying the cause of a failure and putting controls in place to prevent, detect, and address that cause. This typically assumes there are two states of operation:
Compliant, which causes the majority of outcomes to be acceptable, which results in success.
Non-compliant, which causes the majority of outcomes to be unacceptable, which results in failure.
The theory goes that if you can detect or prevent the transition from compliant to non-compliant, you can reduce unacceptable outcomes, and therefore you can reduce the likelihood of failure. You could do so by ensuring you use robust building materials, creating systems to monitor the other systems, or reducing your reliance on humans as human performance is variable and can cause unanticipated changes to the state.
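As a toy illustration of this model, a "compliant state" check is just an invariant you can monitor. The collateral ratio and function here are my own invention for illustration, not any real protocol's logic:

```python
def check_collateralization(collateral_value, loan_value, min_ratio=1.5):
    """Illustrative compliant-state monitor: flag the transition to
    non-compliance when collateral no longer covers outstanding loans
    at the required ratio. The 1.5x ratio is a hypothetical example."""
    if loan_value == 0:
        return "compliant"  # nothing borrowed, nothing at risk
    ratio = collateral_value / loan_value
    return "compliant" if ratio >= min_ratio else "non-compliant"
```

Under the classic model, detecting the moment this function flips to "non-compliant" (and halting or correcting before it does) is what risk management means; the rest of this piece is about why that model alone is insufficient.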
As the systems evolved from simple bridges to complex nuclear reactors, the existing mental models became less effective in preventing catastrophic failures. The most-cited incidents that highlighted the need for better models were the Three Mile Island accident, the Space Shuttle Challenger disaster, and the Chernobyl Nuclear Power Plant disaster. These disasters all occurred in the last 40 years and, in their own unique ways, showed that the technological systems are not the primary or sole cause of disaster. Instead, humans, the organization of humans, and the culture of the organization of humans can be the cause.
Interestingly, I ran a super scientific Twitter poll on this subject and the respondents (which are presumably primarily the crypto builders and users and spectators who follow me) were overwhelmingly confident that humans are more likely to cause failure than technology, and (less overwhelmingly) that the culture of the group or organization of said humans is more likely to cause failure than the humans themselves.
They also primarily see humans as a resource, not a liability.
And, therefore, they expect a greater number of acceptable outcomes (aka a better chance of success and a reduced chance of failure) if more resources are dedicated to adjusting human performance than to adjusting the technical system.
This doesn’t explain why we have been rampantly debating aspects of the technological system itself (flash loans, smart contract security, admin keys) instead of the human performance, but I digress.
Actually, I don’t. That fundamental contradiction is what this entire piece is about. Yay!
The Human Element
I will preface this next bit by saying that we cannot ignore the technological system entirely. There are a stunning number of examples in this space where a robot simply following a checklist would reduce a number of unacceptable outcomes. However, having the checklist will only address some outcomes, and not actually ensure the success of this highly complex system. We are not building a simple bridge over a shallow river, but we still shouldn’t use quicksand as a building material.
For the purposes of this piece, I will be defining the environment vs systems vs organization thusly:
Environment: the world that the Ethereum network and the smart contracts on it affect. This also includes some people or things that do not use Ethereum, but are still affected by its existence.
System: the Ethereum network in its entirety.
Subsystems: individual smart contracts or groups of smart contracts that exist with a cohesive goal. These may operate semi-autonomously within the system, may rely on other subsystems, or may create larger subsystems by interacting with other subsystems. ‘DeFi’ can be seen as a subsystem made up of subsystems that are interwoven technically and with the common goal of giving people access to financial markets.
Organization: the collection of all the people who influence the Ethereum network and/or the environment, including but not limited to: smart contract developers, users, investors, commentators, Bitcoin maximalists, companies, foundations, competitors, researchers, etc.
Sub-organizations: the collection of people who influence one sub-system, including but not limited to the people who build it, maintain it, audit it, give feedback on it, use it, are otherwise affected by it, etc.
Builders: the people who just build and/or maintain one sub-system.
People: any individual human in the organization.
However, be mindful when discussing this with others, as a lot of people consider the environment to be limited to the world of a single smart contract and an organization to be only the builders of said smart contract.
This line of thinking can be observed via comments like, “if you don’t like X DeFi platform, don’t use it!” or when people fail to understand why “Bitcoin Maximalists” comment on the state of Ethereum. This is highly problematic as it removes the users or spectators of said subsystems from the conversation and instills a belief that the subsystems have limited effects on real people.
Unlike nuclear reactors, there is no clear distinction between those who build on Ethereum and those who affect the system. We are all operating together and what we build, along with our usage and discussion of the system, is what determines the nature of the system moving forward. This is why it makes sense to use such a broad definition of “organization.”
Okay! So, how can we realistically manage risk?
1. Prioritize creating an organizational culture that leads to the outcomes we collectively desire.
Luckily, based on my Twitter poll above, we seem to be mostly in agreement about these points, though we may be fuzzy on what we actually desire. But, let’s be explicit about the things we appear to agree on.
First, we should agree that the organization should have a culture that optimizes for acceptable outcomes:
It is not just the technology that causes failures.
It is not just builders who cause failure.
It is not just sub-organizations that cause failure.
It is the culture of the collection of all the people who influence the Ethereum network and/or the environment, including but not limited to: smart contract developers, users, investors, commentators, Bitcoin maximalists, companies, foundations, competitors, researchers, etc. that is most likely to cause the greatest failures.
An organization’s culture is shaped by the culture of its individual parts. Further, an organization’s culture changes due to changes in behavior of its individual parts. If we change how we perform and what we do, we will first shape our individual culture, and then the organization’s culture. We should be aware of the impact our actions have so we can do more of the things that encourage the outcomes we want and do less of the things that discourage the outcomes we want.
Interestingly, this applies not just to the culture itself (e.g. agreeing a loss of user funds is unacceptable) but also to the change in the culture (e.g. the process by which we agree that a loss of user funds is unacceptable.) We all must simultaneously:
do things that prevent people from losing users’ crypto.
do things that encourage people to safely keep their users’ crypto.
do things that prevent people from believing losing users’ crypto is an acceptable outcome.
do things that encourage people to believe that keeping users’ crypto safe is a priority.
do things that prevent people from losing their own crypto.
Etc. etc. etc.
Second, we should agree to spend more resources on adjusting the culture of the organization.
Our most abundant resource is the time and energy of the people inside the organization. Therefore, each individual has a responsibility to use their unique skills, experience, and expertise to promote a culture that results in more acceptable outcomes while also discouraging a culture that results in less desirable outcomes. Again, it is not just the builders who are responsible—it is everyone in the organization.
Some people are good at educating. Some people are good at coding. Some people are good at visualizing. Some people are good at organizing. Some people are good at hacking. Some people are good at yelling on Twitter. Some people are good at auditing. Some people are good at communicating. Some people are good at researching. Aka ‘you do you.’ 🙂
2. Understand that this is no easy task.
We should fully grok that this shit is really, really, really complex and really, really, really hard.
Our system has high interactive complexity (meaning there is a high rate of interaction between two or more components) and is tightly coupled (meaning a change in one component results in strong + sudden change in others). Historically, these are the systems that are at the greatest risk of being impacted by unexpected and undesirable outcomes of catastrophic magnitudes. Chernobyl was both highly interactive and tightly coupled. Yay! Therefore, it is highly likely that DeFi will suffer catastrophic incidents and these incidents, no matter what we do, cannot be prevented.
However, this does not negate the need to mitigate risk. Just because you cannot eliminate an outcome entirely does not mean the efforts are worthless. We should look at the acceptable outcomes on both a grand scale and small scale. Attempting to prevent unacceptable outcomes is worth taking on. Attempting to reduce unexpected outcomes is worth taking on. Attempting to dampen the scope of either is worth taking on.
For example: It is worth promoting a culture that prioritizes people not losing money. It is worth holding those accountable when their behavior and sub-culture does not prioritize people not losing money. Preventing some people from losing money is better than not preventing some people from losing money, even if you cannot prevent all people from losing money.
3. Strive to understand not just what people did, but why they believed it would be okay to do so.
If we look at the bZx situation, they consistently addressed the ‘what’ without addressing the ‘why’. The attack vector that was exploited on February 18, 2020 was almost the same as the one disclosed by Sam Sun in September 2019: the system relied on another system for price data, and that other system could be manipulated by an attacker to trick the first system into thinking the attacker had 1.5x collateral when they really didn’t. The only difference is that the attacker used a flash loan to increase their profits (Sam didn’t) and the attacker exploited the sUSD/ETH pair (Sam exploited the WAX/ETH, DAI/ETH, DAI/USDC, and REP/ETH pairs.)
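To make the ‘what’ concrete, here is a toy Python simulation of the kind of oracle manipulation described above. The pool sizes, pair, and numbers are illustrative only; this is not bZx’s or any DEX’s actual code:

```python
class ConstantProductPool:
    """Toy x*y=k AMM whose spot price is (naively) used as an oracle.
    All reserves and amounts are made up for illustration."""

    def __init__(self, eth_reserve, susd_reserve):
        self.eth = eth_reserve
        self.susd = susd_reserve

    def price_susd_in_eth(self):
        # Naive spot price: directly readable, and directly manipulable.
        return self.eth / self.susd

    def swap_eth_for_susd(self, eth_in):
        # Constant-product swap: k stays fixed, reserves shift.
        k = self.eth * self.susd
        self.eth += eth_in
        susd_out = self.susd - k / self.eth
        self.susd -= susd_out
        return susd_out

pool = ConstantProductPool(eth_reserve=1000, susd_reserve=250000)
fair_price = pool.price_susd_in_eth()    # 0.004 ETH per sUSD

# The attacker dumps a (flash-loaned) pile of ETH into the pool,
# skewing the reserves and therefore the reported spot price.
pool.swap_eth_for_susd(4000)
skewed_price = pool.price_susd_in_eth()  # 0.1 ETH per sUSD: 25x inflated

# A lender reading this pool as its sole price feed now massively
# overvalues the attacker's sUSD collateral and lends far too much.
assert skewed_price > fair_price
```

Nothing in this sketch requires a flash loan; the loan only removes the need for the attacker to already own the capital, which is why the manipulable price feed, not the flash loan, is the underlying ‘why’.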
When we focus too heavily on the ‘what’, we risk repeating history as we are too zoomed in to learn anything meaningful from our failures. It only helps ensure the exact same failure will not happen again and ignores situations where failure presents in a slightly different way. This over-specificity resulted in the bZx incidents, as well as this scariness.
However, if we understand and address the ‘why,’ we have a better chance at catching a wider range of potential future failures. For example, understanding why you shouldn’t allow manipulable tokens may lead you to ensure that all tokens you add in the future are not manipulable.
The same approach applies to the people and culture of an organization. If a system has a bug in the code (the ‘what’), the solution is to fix the bug in the code. However, we all know that this isn’t sufficient.
Why was there a bug in the code? Why did the people responsible for writing the code deploy the code? Based on the answers to these questions, you can better address the underlying issues.
For example, a common occurrence in an organization with an extreme, top-down management philosophy is the productivity/profits vs safety trade-off:
Why was there a bug? Because the developers wrote the code quickly, on very little sleep.
Why did those responsible for writing the code deploy the code? Because they were pressured by those around them to increase profits; they determined the risk of not deploying the code and being fired was greater than the risk of deploying the code and having the system exploited.
What is the solution? Adjust the culture so that safety is a higher priority than profits. This could include: empowering people to speak up; giving people a means to report actions that result in unsafe conditions; encouraging the good-faith questioning of the choices being made by upper management.
In this space, it’s far more common that issues are a result of naivety or gross negligence:
Why was there a bug? Because the engineers wrote the financial subsystems with the same diligence as they wrote a fun marketing application no one actually used.
Why did those responsible for writing the code deploy the code? Because they didn’t realize the impact their actions could have on people; because they’ve only written marketing apps before; because the bugs in those marketing applications never resulted in massive financial loss by users.
What is the solution? Adjust the culture so that safety is a higher priority than experimentation. This could include: encouraging a deep appreciation for the risks of the systems; education and training; sharing insights about how past systems were attacked; discouraging inexperienced people from deploying code to a live system; not using systems that haven’t been audited or are brand new; not using systems that have repeatedly been exploited; etc.
By adjusting the ideas and behaviors across the organization, we are not only able to prevent failures that have happened before, but we can also reduce future potential failures of a similar, and even a different, nature. If we only use systems created by people who are transparent, honest, and diligent about security, we encourage people with those attributes and discourage people without those attributes.
People who don’t have those attributes can choose to not build subsystems, or become more transparent, honest, and diligent about security.
4. People are our greatest resource and greatest strength.
Given the highly complex nature of both the system (the Ethereum network) and the subsystems (all the contracts the system contains), it is unfeasible to perfectly predict every cause of every unacceptable outcome. You may be able to do so within one subsystem, but the system contains thousands.
Further, some of these subsystems operate together…some of the time. When multiple subsystems work together or one subsystem relies on another, this forms a new subsystem with new risks and new acceptable and unacceptable states that don’t exist when either subsystem is operating on its own.
The sheer number of subsystems, multiplied by all the possible combinations of subsystems, multiplied by variances in human performance at every level of the organization, multiplied by the variances in the outputs of the subsystems, makes it impossible to know what each possible subsystem is, let alone how each can fail, let alone how you can detect and prevent their failure.
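To get a feel for the scale, even counting only the pairwise and three-way compositions of n subsystems, and ignoring human variance entirely, the numbers explode. A quick illustrative calculation:

```python
from math import comb

# Illustrative only: with n independently deployed subsystems, the number
# of possible two- and three-subsystem compositions (each a potential new
# subsystem with its own failure modes) grows combinatorially.
for n in (10, 100, 1000):
    print(f"{n} subsystems: {comb(n, 2)} pairs, {comb(n, 3)} triples")
```

At 1,000 subsystems there are already 499,500 possible pairs and over 166 million possible triples, which is why exhaustively enumerating compliant states is a non-starter.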
Since there is no way to perfectly define a ‘compliant’ state or even know whether we are in a compliant or non-compliant state, we have to operate under the assumption that we are always somewhere between perfectly compliant and perfectly non-compliant. In fact, our state is a state of ever-changing unknowingness.
Once we grok this reality, we no longer focus on preventing a transition from a ‘compliant’ to a ‘non-compliant’ state. Instead, we must constantly adjust our behavior in response to both positive and negative outcomes (or potential outcomes) on a micro-level. We also constantly extend our knowledge to better understand the acceptable and unacceptable outcomes, and to better understand how our behavior is encouraging or discouraging acceptable or unacceptable outcomes.
We should encourage a culture of being as adaptable and resilient as possible, given these conditions.
We should encourage and teach people to be resilient, persevere, react quickly, and adapt quickly.
We should encourage the questioning of our own assumptions and each other’s assumptions. We should question why something or someone is more authoritative than something or someone else. We should encourage skepticism. We should learn from both our successes and our failures. We should be aware of how our actions influence those successes and failures. We should be highly critical of the actions we take and the actions others take. We should not shy away from the unknown, and instead grapple with it to make some of the unknown, known.
With constant adjustments and a re-balancing of everything within the environment, we have a chance of being successful in our endeavours. We don’t know with any degree of certainty what we are building or what effects it will have. We don’t know with any degree of certainty what will succeed and what will fail. All we can do is live, learn, build, and adjust with a mindfulness and awareness that hopefully allows us to adapt quickly enough to prevent failure, and be resilient enough to recover when we do fail.
bZx is not the problem. Flash loans are not the problem. Solidity is not the problem. The Ethereum network is not the problem. A lack of diligence is not the problem. The Ethereum Foundation is not the problem. Udi is not the problem. You are not the problem. I am not the problem. We are all the problem. And, we are also all the solution.
If we accept a culture that doesn’t adapt to changing circumstances, especially when that culture results in the same unacceptable outcomes again and again, we will certainly suffer catastrophic failure. We must change, we must encourage others to change, and we must disallow those who don’t change from continuing to operate that way.
The real systemic risk is an organization that is fragile and chooses passivity in the face of its failures.
(Editor’s Note: Taylor provided related reading and sources, which you can find in Part 2 of this newsletter.)
Moving onto our network updates. This week our contributors cover DeFi:
📌 Set Protocol
Contributor: Anthony Sassano, Product Marketing Manager at Set Protocol
Since the launch of Set Social Trading, we’ve seen an uptick in visits from non-English speaking countries. One country that particularly stands out is Turkey, where we’ve seen an incredible inflow of users after onboarding social trader Fidelitas Lex, a native Turkish speaker with a following of 37k people on Twitter.
Our 4 most popular robo Sets on TokenSets account for ~$2.7 million of the ~$5.6 million of capital locked in Set Protocol’s Vault. It’s interesting to note that the non-yield ETH 20 Day MA Crossover Set and ETH 26 Day EMA Crossover Set still have more capital locked in them than their yield-bearing counterparts. This shows us that users either don’t want exposure to the increased risk of a Compound Protocol asset (cUSDC) or that they simply bought these Sets as “set and forget” purchases. It may also point to users not wanting to trigger a taxable event by selling their non-yield Set to buy the yield version.
The spread of the number of Set holders on Set Social Trading has been following an even distribution with only Aaron Kruger (and his Moonshot Sets) being a major outlier. It’s also interesting to note that some traders such as DeFi Fund have more capital in their Sets than traders such as ADL, who have more individual Set holders.
The last 30 days have seen the total USD locked in Set Protocol grow from $3 million to an all-time high of $6 million, and it has since settled in at around $5.6 million. We believe this recent growth has come as a result of out-performance from both the Social Trading Sets and Robo Sets listed on TokenSets.