We propose an auction-based mechanism for rewarding winners of the solver competition.
Currently, the winner of the solver competition is rewarded with an amount of COW tokens based on the orders included in their solution. These per-order rewards do not take into account general market conditions, the current value of COW tokens, or how difficult it was to find a solution. The scheme therefore requires constant readjustment, and it does not incentivize research into tackling challenging batches, e.g., utilizing the value of CoWs.
The proposed auction mechanism allows for dynamically finding a reward based on how much value a solver provides to users by settling a batch. This results in a distribution of rewards better aligned with goals of the protocol, solvers, and users.
The current reward scheme for the solver competition is based on risk-adjusted rewards computed on a per-order basis, as described in CIP-14.
There are several issues with the current model:
- It is parameterized by an ad-hoc minimum reward of 37 COW per order.
- It is oblivious to how easy it is to detect the best execution (thin vs thick competition). Auctions with multiple identical submitted solutions are rewarded the same as auctions with only one valid solution.
- It does not reward innovation, taking advantage of CoWs, or structurally better solutions as much as it should (e.g., detecting CoWs vs. executions that touch the same pool multiple times).
- It is oblivious to the volume of an order and the particular pairs traded.
Ultimately, the current model is quite static, does not directly take protocol viability into consideration, and is not adaptive enough to the ever-changing competition landscape.
The proposed reward mechanism improves on all of these issues. It also requires some changes in the behavior of solvers. We will discuss those after the specification of the mechanism.
Per-auction rewards are computed using a second-price auction mechanism for the quality of solutions:
- Each solver commits to a solution, which is simulated, and a score.
- The winner is the solver with the highest score whose solution successfully simulated.
- The winner gets to settle the auction on chain, and is rewarded with
Reward = observedQuality - referenceScore,
where we cap the reward both from below and above.
Here, the observed quality is
observedQuality = observedSurplus + observedFee, which is only non-zero if the settlement is successfully submitted to the blockchain. The reference score is the second highest submitted score. We note that the (protocol-submitted) empty solution, having a zero score, will always be considered in the ranking. This implies that an auction with only one submitted solution (from a solver) can still get rewarded (the referenceScore in this case is zero), and moreover, that solutions with negative scores can never win.
The cap from above is chosen as
0.01 ETH + gas costs, while the cap from below is
-0.01 ETH. The choice of those numbers is explained in the experiments section.
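The selection and capping rules above can be sketched in Python. This is a minimal illustration under stated assumptions, not the protocol's implementation: the names `Bid`, `pick_winner`, and `capped_reward` are hypothetical, all amounts are assumed to be denominated in ETH, only solutions that passed simulation are assumed to count towards the reference score, and a tie with the empty solution is assumed to go to the empty solution.

```python
from dataclasses import dataclass

CAP = 0.01  # ETH; magnitude of both caps as stated in the text


@dataclass
class Bid:
    solver: str
    score: float        # score the solver commits to
    simulated_ok: bool  # whether the solution passed simulation


def pick_winner(bids):
    """Return (winning bid or None, reference score).

    The protocol-submitted empty solution has score 0 and always takes
    part in the ranking, so the reference score is never below 0 and a
    bid needs a positive score to win (tie-breaking is an assumption).
    """
    valid = sorted(
        (b for b in bids if b.simulated_ok),
        key=lambda b: b.score,
        reverse=True,
    )
    if not valid or valid[0].score <= 0:
        return None, 0.0
    runner_up = valid[1].score if len(valid) > 1 else 0.0
    return valid[0], max(runner_up, 0.0)


def capped_reward(observed_quality, reference_score, gas_cost):
    """Reward = observedQuality - referenceScore, clamped to the caps.

    observed_quality must be 0 if the settlement reverted on chain.
    """
    raw = observed_quality - reference_score
    return max(-CAP, min(raw, CAP + gas_cost))
```

With these helpers, a reverted settlement (observed quality of 0) against any reference score of at least 0.01 ETH yields a payment of exactly 0.01 ETH from the solver to the protocol, while a large quality gap is capped at 0.01 ETH plus the gas cost.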
Suppose we have 2 solutions:
- solution 1 has quality $100, solver reports score $70
- solution 2 has quality $110, solver reports score $65
Solution 1 wins based on the higher score. If they successfully submit the solution on chain they get a reward that is equal to
$100 - $65 = $35. If the transaction fails they get a reward of
$0 - $65 = -$65, i.e., they pay the protocol $65.
Suppose we have a third solution:
- solution 3 has quality $90, solver reports score $69
Solution 1 still wins but the reward is
$31 in case of a successful submission on chain and
-$69 in case of a revert.
The protocol commits to spend some amount of tokens on rewards. If at the end of each month the total amount of rewards paid to solvers is smaller than the amount committed to by the protocol, the remaining sum is distributed to solver teams which consistently provide good solutions. Those rewards will be distributed pro rata, based on the number of auctions in which each solver submitted a valid solution that passed simulation over the accounting period.
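The pro rata distribution of the remaining budget can be sketched as follows. The function name `distribute_remainder` and its parameters are hypothetical, and the sketch assumes that "consistently provide good solutions" is measured only by the count of auctions with a valid, successfully simulated solution, as described above.

```python
def distribute_remainder(budget, performance_rewards_paid, valid_solution_counts):
    """Split the unspent budget pro rata by number of valid solutions.

    valid_solution_counts maps each solver to the number of auctions in
    the accounting period where it submitted a solution that passed
    simulation.
    """
    remainder = budget - performance_rewards_paid
    total = sum(valid_solution_counts.values())
    if remainder <= 0 or total == 0:
        # Nothing left to distribute, or nobody qualifies.
        return {solver: 0.0 for solver in valid_solution_counts}
    return {
        solver: remainder * count / total
        for solver, count in valid_solution_counts.items()
    }


shares = distribute_remainder(1000.0, 400.0, {"solver_a": 30, "solver_b": 10})
print(shares)  # {'solver_a': 450.0, 'solver_b': 150.0}
```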
The caps on performance-based rewards are chosen such that solver rewards are comparable to those under the previous scheme. The caps will be adjusted periodically to make sure that a large part of the rewards is paid based on performance. The total budget for rewards will be chosen by the protocol.
We will monitor additional constraints for the submitted scores and add them to the social consensus rules, such as
score < observedQuality.
With the new reward mechanism, solver teams would have to submit not only their solution but also a score. The score can be interpreted as expected quality - expected costs. A profit-maximizing solver would maximize their score, which in turn would mean that they maximize expected quality - expected costs. The second-price mechanism gives solvers an incentive to report this expectation truthfully.
Suppose a solver found a solution which generates a quality of $20 (e.g., by a surplus of $11 and fees of $9) for costs of $8 (e.g., by transaction costs of $9 and estimated positive slippage of $1 in case of a successful submission). They estimate the revert probability to be 5%. Then they should optimally submit a score of
(0.95 * $20 + 0.05 * $0) - (0.95 * $8 + 0.05 * $9) = $10.95.
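The score calculation above can be reproduced with a short Python sketch. The function name `expected_score` and its parameter names are hypothetical; the sketch assumes, as in the example, that quality is zero on a revert and that the solver still bears a cost in that case.

```python
def expected_score(quality, cost_success, cost_revert, p_revert):
    """Expected quality minus expected costs for a single solution."""
    p_success = 1.0 - p_revert
    # Quality is only realized if the settlement lands on chain.
    expected_quality = p_success * quality + p_revert * 0.0
    # Costs differ between a successful and a reverted submission.
    expected_cost = p_success * cost_success + p_revert * cost_revert
    return expected_quality - expected_cost


score = expected_score(quality=20.0, cost_success=8.0, cost_revert=9.0, p_revert=0.05)
print(round(score, 2))  # 10.95
```

Raising the estimated revert probability lowers the score a solver should report, which is how revert risk enters the ranking.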
The change of the reward mechanism will thus require solvers to adapt their strategy a bit: Instead of just maximizing quality - costs they now explicitly have to take into account revert risks. This moves complexity to solvers, as proper risk management is going to be required to be profitable.
The new reward scheme does not by itself enforce maximizing surplus of users. This is because, in the score computation, a smaller surplus of a solution can be compensated by a smaller cost, e.g., due to positive slippage. We want to avoid the situation of solvers decreasing surplus and will therefore monitor and enforce EBBO (Ethereum’s Best Bid and Offer) rules more strictly; see the social consensus rules in CIP-11.
The new reward scheme has been tested on historical auction data from October to December. Using the proposed scheme, the capping of rewards was chosen to roughly correspond to rewards with the old scheme. For December:
This results in the following profits of current solvers (blue: old scheme, yellow: new scheme).
The script and data for the experiments can be found in the Appendix. The script also contains additional information on the distribution of rewards, profits, and payments per auction and per solver.
First-price auction. In a first-price auction version of the above mechanism, the
referenceScore is simply equal to the highest score, i.e., the score of the winning solver. Switching to a first-price auction makes the mechanism simpler to describe, and solvers have, in a way, a more direct way of expressing how much they want to get paid for the solution they provide.
Note that although a first-price auction is easier to explain and looks more straightforward, it is actually more complicated on the solvers’ side, as solvers will need to fine-tune their reporting, speculate about how other solvers might behave, and in the repeated setting of the Solver Competition, “learn” how the competition behaves in order to develop the best possible score-reporting strategies. In particular, an optimal strategy for solvers is to report a score that is a tiny bit larger than the score they expect the second best solver to report; this breaks truthfulness, and might lead to more unpredictable score reporting strategies from solvers.
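The difference between the two auction rules can be illustrated with the two-solution example from the specification (quality $100, competing score $65). This is a sketch only: the function `reward` is hypothetical, caps are ignored, and the settlement is assumed to succeed on chain.

```python
def reward(observed_quality, scores, winner_index, first_price=False):
    """Uncapped reward for the winner under either auction rule."""
    if first_price:
        # First-price: the reference score is the winner's own score.
        reference = scores[winner_index]
    else:
        # Second-price: the best competing score, with the empty
        # solution (score 0) always in the ranking.
        others = [s for i, s in enumerate(scores) if i != winner_index]
        reference = max(others + [0.0])
    return observed_quality - reference


truthful = [70.0, 65.0]  # winner reports its true expectation
shaded = [66.0, 65.0]    # winner reports just above the runner-up

print(reward(100.0, truthful, 0))                   # 35.0
print(reward(100.0, shaded, 0))                     # 35.0
print(reward(100.0, truthful, 0, first_price=True)) # 30.0
print(reward(100.0, shaded, 0, first_price=True))   # 34.0
```

Under the second-price rule the winner's reward does not depend on its own reported score, so there is nothing to gain from shading. Under the first-price rule, lowering the reported score towards the runner-up's score raises the reward, which is the strategic behavior described above.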
Protocol computes score. Solvers would only submit their solution and the protocol would determine the score. If the score is computed via
simulatedQuality - simulatedCosts, the resulting ranking is the same as with the current objective. Only the amount of rewards would change compared to the reward scheme currently in use.
This change would require little adaptation on the side of solvers. It already encourages tackling complicated batches where competition is thin. It does not, however, allow for the incorporation of costs beyond transaction costs.
- Should we let solvers report scores or should the protocol determine the score based on the solution? The former allows for more flexibility from the side of solvers, the latter is more robust with respect to gaming the auction.
- How should we distribute the remaining budget after paying performance-based rewards? [A pro rata distribution based on valid solutions per auction is included in the current draft.]
UPDATE January 30: We included experiments and specified the capping.
The script and data for testing the new scheme on historical data can be found here:
A lot of parties contributed to the ideas and formulation of this CIP-Draft. Some of the sources are:
- A report on the proposed auction model with a discussion of alternatives and some rigorous analysis of optimal behavior of solvers:
- The slides of the workshop on solver rewards restructuring:
- A forum post on a related auction model by Thomas Bosman:
- A report on a related auction model by Max Holloway (Xenophon Labs):