Detailed Trail of Bits Audit Response - October 28th, 2019

We recently completed a first audit of our smart contracts with the Trail of Bits team. This audit was interesting for a number of reasons. We were the first Vyper contract protocol that the Trail of Bits team had audited, so we both learned a number of interesting lessons in the process.

The Trail of Bits team (henceforth referred to as ToB) produced a report which raised 22 issues found in the smart contract codebase with varying levels of severity. In this post, we’ll do a detailed analysis of each issue raised in the audit along with our response. Here’s a summary table that details the fixes made in this audit

Issue # Description Action Taken
1 Candidate proposal denial of service by front-running transactions Accept Risk
2 Re-parameterization may be abused to exploit markets Fixed
3 increaseApproval and decreaseApproval do not followERC20 standard Fixed
4 Missing check for zero address in setPrivileged Accept Risk
5 Staked tokens can be destroyed through a failed challenge Fixed
6 Staked tokens can be destroyed through a challenge re-creation Fixed
7 A successful challenge may force the Maker to lose all tokens Fixed
8 Bookkeeping inconsistency in Datatrust in case of price change Fixed
9 EtherToken/MarketToken owners can drain ether from users Fixed
10 Reporting excess bytes delivered will prevent ongoing purchases Documented
11 Delivering more bytes than purchased can trigger unexpected behavior for third parties Documented
12 Request delivery denial of service by front-running transactions Documented
13 Attackers can prevent new challenges/listings/backends, parameter changes, and stake retrievals Documented
14 Malicious Backend candidate can exploit submitted url for phishing or denial of service. Fixed
15 Quick buy and sell allows vote manipulation Accept Risk
16 EtherTokens can be used to increase the price arbitrarily Accept Risk
17 Arithmetic rounding might lead to trapped tokens Accept Risk
18 Race condition on Reserve buy and sell allows one to steal ethers Documented
19 requestDelivery is prone to a race condition when computing the price Documented
20 Lack of timeout to resolve candidates Accept Risk
21 No quorum in voting allows attack to spam the election with candidates Fixed
22 Lack of timeout to claim listing fees allows price manipulation Accept Risk

1. Candidate proposal denial of service by front-running transactions

Severity: Medium
Difficulty: Medium
Type: Denial of Service
Target: Listing, Datatrust

Explanation: An attacker tracking a data market can watch for candidate proposal transactions. If the attacker spots such a candidate proposal, it can “front run” the transaction by submitting its own candidate with the same hash. If the maker has already submitted their listing to the datatrust, it may be possible for the attacker to take control of the listing.

Suggested Mitigation: At present, the only charge for executing this attack is gas cost. ToB suggests in the short term adding a fee to the candidate proposal process to discourage front running attacks. Longer term, ToB suggests architecting the contracts to avoid front-running, perhaps by adding a separate hash-request step where makers request and receive a unique hash before submitting a candidate.

Our Action: After consideration, we decided to not add the suggested fee. Doing so would make it more challenging for new Makers to enter the ecosystem. Instead, we’ve documented this attack as part of our docs. If front-running attacks prove severe, we will likely add a hash-request step to the contracts in a future version.

2. Re-parameterization may be abused to exploit markets

Severity: High
Difficulty: High
Type: Data Validation
Target: Parametrizer

Explanation: It’s possible to reparameterize the market to have nonsensical parameters. For example, if the stake is above the MarketToken.totalSupply(), it wouldn’t be possible for anyone to vote in the market after this attack.

Suggested Mitigation: ToB suggests adding guardrails reparameterization to prevent nonsensical quantities in the short run. Longer term, they suggest incorporating static analysis tools into our pipeline to catch these errors

Our Action: We agree that this is a severe bug since an invalid parameterization could crash the entire market. We’ve constructed a fix that adds guardrails around reparameterization.

3. increaseApproval and decreaseApproval do not followERC20 standard

Severity: Low
Difficulty: Medium
Type: Undefined Behavior
Target: EtherToken, MarketToken

Explanation: Our initial implementation didn’t use standard names for the ERC-20 functionality The functions usually called increaseAllowance and decreaseAllowance were named increaseApproval and decreaseApproval. In addition, these functions didn’t return boolean values for success

Suggested Mitigation: Renaming the functions to match ERC-20 and returning boolean success values.

Our Action: We renamed the functions and added boolean return in this PR

4. Missing check for zero address in setPrivileged

Severity: Low
Difficulty: Medium
Type: Input Validation
Target: MarketToken, Voting

Explanation: The contracts have a setPrivileged function which sets addresses which have extra privilege in the smart contracts. This can only be called once, and is intended to be called at initialization. At present, the contracts don’t check that the privileged addresses can’t be the 0x0 address, which is an invalid address

Suggested Mitigation: Add a check for 0x0 to ensure it can’t be set as a privileged address.

Our Action: After some thought, we decided against adding this check. The reason for this is that there’s a whole class of errors in which the data market deployer can set an incorrect address. These incorrect addresses might be valid Ethereum addresses, but not the valid address of the contracts that should be privileged. There’s no real way to check for these types of errors. Adding a check for the 0x0 case when many other similar cases can get through didn’t seem like it was worth it. (See our discussion.) We will however create good deployment docs and have sample scripts that demonstrate correct privilege setting to make it harder for newcomers to make these mistakes.

5. Staked tokens can be destroyed through a failed challenge

Severity: High
Difficulty: Medium
Type: Data Validation
Target: Voting

Explanation: The method Voting.transferStake overwrote the stake in its destination address. If a maker has successfully fended off two challenges to their listing, this method could override their winnings from the first failed challenge and incorrectly destroy tokens.

Suggested Mitigation: ToB recommended incrementing stake instead of overwriting it. Longer term, incorporating static analysis tools such as Echidna or Manticore could ensure invariants such as the fact that stake should not be destroyed by Voting.transferStake.

Our Action: We agreed this was an error and raised an issue documenting it. We created a fix in this PR. We’re currently looking into adding this static analysis tooling

6. Staked tokens can be destroyed through a challenge re-creation

Severity: High
Difficulty: Medium
Type: Data Validation
Target: Voting

Explanation: Voting.addCandidate overwrote the owner’s existing stake. If a market stakeholder submitted two challenges with the same hash, it could be possible to overwrite the stake from the previous challenge with the new challenge.

Suggested Mitigation: ToB suggests incrementing the stake here instead of overwriting. Longer term, static analysis tools should be used to ensure an invariant that stake should not be destroyed by Voting.addCandidate.

Our Action: We agreed this was an error and raised an issue documenting it. We created a fix in this PR. We’re currently looking into adding the static analysis invariant.

7. A successful challenge may force the Maker to lose all tokens

Severity: Medium
Difficulty: Medium
Type: Data Validation
Target: Listing

Explanation: The function Listing.removeListing burned all the stake associated with a listing when it was removed. This function was called when a listing owner chose to exit their listing from the market or if a successful challenge forced removal of the listing. In case of voluntary exit, the listing owner could withdraw their stake before exiting, but in case of challenge it’s possible that the listing owner could be offline and not be able to withdraw their stake in time. This would destroy their tokens.

Suggested Mitigation: ToB suggests modifying Listing.removeListing to transfer stake to the listing owner upon removal and add a static invariant that Listing.removeListing can’t destroy stake.

Our Action: We thought about this one for a bit. Making a no-fault removal for listing owners seemed like it could make fraud easier. But we realized that the current flow already enables for no-fault removal but just punishes makers who aren’t online enough which seems unfair. We decided to go for a no-fault removal as recommended. We documented the issue and created a fix in this PR

8. Bookkeeping inconsistency in Datatrust in case of price change

Severity: Low
Difficulty: Medium
Type: Data Validation
Target: Datatrust

Explanation: When deliveries are paid for, the current set of market parameters are not recorded. This means that if the price changes after the delivery was paid for, it’s possible that the datatrust or maker won’t get paid if there are insufficient funds.

Suggested Mitigation: ToB suggests saving the current set of market parameters for each delivery when it’s requested in the Datatrust.Delivery struct so that payments can be computed correctly.

Our Action: We agree that this inconsistency creates problems and created an issue. We’ve created a fix in this PR

9. EtherToken/MarketToken owners can drain ether from users

Severity: High
Difficulty: High
Type: Access Control
Target: EtherToken, MarketToken

Explanation: The deployers of EtherToken and MarketToken could create “creator blocks” for themselves of some initial allocation of EtherToken or MarketToken they owned. This would allow them to call EtherToken.withdraw or MarketToken.withdraw to withdraw funds.

Suggested Mitigation: The ToB suggested removing these creator blocks from both contracts.

Our Action: For MarketToken, the creator blocks were functioning as intended. The founder of a data market (like the founder of a non-profit or startup) has the right to dictate initial conditions. How the market evolves afterwards is up to the stakeholders of course. However, for the case of EtherToken, this “creator block” was a bug. The EtherToken is intended to be 1-1 with ETH. It doesn’t make sense for the creator to have any special powers. We created an issue and added a fix in this PR

10. Reporting excess bytes delivered will prevent ongoing purchases

Severity: Medium
Difficulty: Medium
Type: Data Validation
Target: Datatrust

Explanation: Purchasing a delivery awards a user a number of bytes. Let’s say a user makes a purchase for 100 bytes, then a second purchase for 100 bytes. If the user requests a listing that requires 150 bytes from the datatrust, then it isn’t possible to “complete” the second purchase.

Suggested Mitigation: ToB suggests creating a hard ceiling for purchases. So it wouldn’t be possible to request 150 bytes if you’ve paid only for 2 purchases of 100 bytes as above.

Our Action: This behavior is as expected for the delivery process, where multiple deliveries are allowed to pool purchased bytes together. However, this audit point by the ToB team points out a weakness in our current docs. We need to do a better job of explaining how the data purchasing flow works. We’ve created an issue to track the needed work.

11. Delivering more bytes than purchased can trigger unexpected behavior for third parties

Severity: Low
Difficulty: Medium
Type: Data Validation
Target: Datatrust

Explanation: If the datatrust delivers more bytes than has been paid for, the Datatrust.listingAccessed method will fail and the maker and the Datatrust won’t receive their payments.

Suggested Mitigation: ToB suggests the same change as for the previous issue, pegging the report to each delivery so an excess number of bytes can’t be permitted.

Our Action: As for the previous issue, this behavior is as expected for the system, but this audit point highlights that our docs here need further improvement. We’ll use the same issue from the previous point to track the needed work

12. Request delivery denial of service by front-running transactions

Severity: Low
Difficulty: High
Type: Denial of Service
Target: Datatrust

Explanation: It’s possible to call Datatrust.requestDelivery with 0 bytes requested. This makes it relatively cheap to front-run other users’ delivery requests by submitting delivery requests for 0 bytes with the same hash. This could prove to be an annoyance for data purchasers.

Suggested Mitigation: ToB suggests adding a minimum bytes requested for Datatrust.requestDelivery so that it isn’t possible to purchase 0 bytes. Or alternatively, using some on-chain hash assignation that makes it harder for the user to select a hash maliciously.

Our Action: For now, we’ve decided to document the issue to make it clear to users that these front-running attacks are possible. If these attacks prove serious, we may alter this behavior in a future version of the protocol

13. Attackers can prevent new challenges/listings/backends, parameter changes, and stake retrievals

Severity: Medium
Difficulty: High
Type: Denial of Service
Target: Voting

Explanation: This is similar to the front-running attacks documented in audit points #1 and #12 above. Front-running attackers can block a number of core governance operations in a data market by front-running with the same hash.

Suggested Mitigation: ToB suggests documenting this attack in the short run, and longer run considering changing the hash generation to prevent malicious hash submission.

Our Action: We’ve documented the front-running docs. In a future version, we will rework the hash submission to avoid allowing users to select hashes maliciously.

14. Malicious Backend candidate can exploit submitted url for phishing or denial of service.

Severity: High
Difficulty: Low
Type: Data Validation
Target: Datatrust

Explanation: The method Datatrust.register which prospective datatrust operators call to register themselves as a datatrust operator candidates overwrites Datatrust.backend_url. This means that any candidate could maliciously overwrite the backend_url for the current datatrust, allowing them to reroute incoming data.

Suggested Mitigation: ToB suggests updating the backend_url in resolveRegistration rather than register in the short run. Longer term, they suggest using static analysis to prevent changes to vote-governed fields from happening without a vote.

Our Action: We agree this finding is a bug and created an issue to track the fix. We’ve constructed a fix in this PR. We chose to not set the backend_url in either method. Rather, the approved datatrust operator has to call Datatrust.setBackendUrl manually after approval. We are investigating the static analysis based approaches.

15. Quick buy and sell allows vote manipulation

Severity: High
Difficulty: High
Type: Timing
Target: Voting

Explanation: The data market is governed solely by on-chain votes. A vote manipulator who can purchase and sell large amounts of MarketToken can manipulate votes at will.

Suggested Mitigation: ToB points out that on-chain voting is known to be challenging and that there’s isn’t any magical bullet. They suggest that weighting earlier votes more heavily than later votes may prevent vote manipulation.

Our Action: For now, we’re simply planning to keep an eye on the markets we deploy to see if vote manipulation issues crop up. The spread parameter prevents rapid algorithmic buy/sell orders since there’s an extra charge on the buy which makes the attack unprofitable. However, longer term, as secondary markets develop, vote manipulation attacks may become easier to execute. We will consider adding a time-weighting mechanism in a future version.

16. EtherTokens can be used to increase the price arbitrarily

Severity: Low
Difficulty: Low
Type: Data Validation
Target: Reserve

Explanation: Anyone can send EtherToken to the Reserve for a data market. This permits the sender to raise the price of MarketToken without increasing the supply of available MarketToken.

Suggested Mitigation: ToB suggests monitoring the blockchain to see if such transfers are happening.

Our Action: We agree that this type of price boosting is a possibility. We’ve used this behavior in some of our internal tests to achieve a desired price for example. We believe this behavior isn’t very likely since it isn’t possible for the sender to directly recover their EtherToken (unless they own 100% of MarketToken). The Reserve.withdraw function will dilute them down according to their fractional ownership of the data market. So this type of send operation has a price tag which makes it unprofitable for most cases. However, we are planning to monitor for these events to see if they occur in practice.

17. Arithmetic rounding might lead to trapped tokens

Severity: Low
Difficulty: Medium
Type: Data Validation
Target: Datatrust

Explanation: The delivery payment mechanism uses integer arithmetic to compute the split between reserve, datatrust, and makers. Due to integer rounding errors, some portion of funds may be lost to the reserve contract.

Suggested Mitigation: ToB suggests computing the total fees, accounting for the difference and paying it out to a chosen party.

Our Action: Computing the total feels in this way is quite challenging. The main reason is that the split-up of fees isn’t known up front. The listing utilization reward distributes fees to the owners of the listings that the purchaser actually queried. In addition, the actual rounding errors will be quite small, with a cap of a 100 wei rounding error per delivery (at today’s ETH price, this is roughly one trillionth of a cent). For these reasons, we’ve decided to leave the rounding as is, but keep an eye on it in case there are issues that crop up in the future.

18. Race condition on Reserve buy and sell allows one to steal ethers

Severity: Medium
Difficulty: Medium
Type: Data Validation
Target: Reserve

Explanation: The algorithmic price curve allows for algorithmic purchases to happen. If an attacker is watching the blockchain and sees that a large Reserve.support order is in-bound, they can front-run this transaction, place their own Reserve.support transaction before, then immediately call Reserve.withdraw to pull out a portion of the new funds.

Suggested Mitigation: ToB suggests documenting these issues so users know that these attacks are possible and that they raise their gas accordingly or split large transactions into multiple smaller transactions.

Our Action: The spread parameter prevents some partial protection against this attack, since it charges buyers the spread as an extra fee, but large price jumps which swamp the spread can make this attack profitable. As a result, we agree that this condition needs to be well-documented. We have created an issue to track this work.

19. requestDelivery is prone to a race condition when computing the price

Severity: Low
Difficulty: High
Type: Timing
Target: Reserve

Explanation: There is some delay between the user submitting a transaction and the transaction actually being executed on the blockchain. If a reparameterization happens in this span, it’s possible that the data purchaser could end up paying an unexpected price.

Suggested Mitigation: ToB suggests adding a maximum price parameter to Datatrust.requestDelivery that will fail the transaction if the price surges unexpectedly. Or alternatively, ToB suggests documenting this behavior.

Our Action: We agree that this race condition is an issue. We’ve chosen to document it for the time being. If these race conditions prove severe, we will create a protocol-level fix in a future version. We’ve created an issue to track our fix.

20. Lack of timeout to resolve candidates

Severity: High
Difficulty: High
Type: Timing
Target: Listing, Parametrizer

Explanation: For all votes, there’s no time-out on when a winning candidate can be approved. A malicious user could store a winning candidate and wait for an opportune moment to resolve the registration.

Suggested Mitigation: ToB suggests adding a time-out to candidate resolution and modifying off-chain UIs to clearly show pending candidates.

Our Action: Adding a timeout adds an extra layer of complexity to the voting process. Genuine candidate proposers who were offline could lose their accepted candidates. In addition, the resolve methods can be called by any actor in a data market, and not necessarily just the candidate proposer. For these reasons, we’ve chosen instead to have the off-chain UIs display candidates. In addition, we’re working to build command line tools that will permit datatrust operators to resolve candidates. This will ensure that an honest datatrust can keep the market clean of dangling candidates.

21. No quorum in voting allows attack to spam the election with candidates

Severity: High
Difficulty: Low
Type: Timing
Target: Voting.vy

Explanation: Votes at present don’t have a quorum. This means that a determined attacker can spawn a thousand candidates with bad options, then wait till near the close of the vote_by and vote for one of the bad candidates.

Suggested Mitigation: ToB suggests adding a fixed fee to create a candidate and a quorum on the vote in addition to monitoring the blockchain for suspicious activities.

Our Action: This attack is serious. We’ve taken a few steps to mitigate. First, by addressing the issues in #2, we’ve closed off the chance of shutting down a market by sneaking in a bad reparameterization. Second, it’s possible to create an implicit quorum by raising stake (if stake is 10% of the MarketToken supply for example, it creates an implicit quorum of 10%). It’s likely that we’ll require larger stake for our first market to make it harder to vote in malicious changes.

As a second change, we decided to change the protocol to require that the creation of new reparameterization and datatrust registration require the proposer to stake funds. This stake will mean that it will become harder for a malicious attacker to spawn many fraudulent reparameterization/registration candidates since doing so would lock up large amounts of their funds. However, we did not make listing candidates require stake. The reason is that the datatrust operator needs to acknowledge these listings before the vote can pass, which creates a natural defense point against spam attacks while keeping the barrier now for new makers. We documented this discussion in this issue and merged in a fix.

22. Lack of timeout to claim listing fees allows price manipulation

Severity: High
Difficulty: High
Type: Timing
Target: Listing

Explanation: There’s no timing to claim the listing utilization reward via Listing.claimBytesAccessed. A maker can choose a time when the support price is low to gain maximum MarketToken reward for their listing utilization

Suggested Mitigations: ToB suggests allowing any user could be allowed to call Listing.claimBytesAccessed. Alternatively, the contract owner or datatrust could make the call. Or Listing.claimBytesAccessed could be automatically called after the datatrust reports bytes accesed via Datatrust.listingAccessed.

Our Action: Part of the market’s economics is that the access reward for an individual listing will likely be small relative to the market’s total size. An average data market may have hundreds or thousands of listings. The access fee for any given listing will be by natural considerations, limited in size compared to the overall market. Consequently, the price manipulation possible with one listing would likely be small. This could change somewhat if a cartel of makers forms which choose to take collective action. This could allow for more significant shifts. This latter attack sounds more complex though, and we’re not sure how feasible this attack is in practice yet.

For these reasons, we’re going to hold off on implementing any of these fixes. That said, we’ll keep a watch for price manipulations with this mechanic and may alter this payout structure in a future version to make the system robust against these attacks.