```
CIP: 20
title: Auction model for solver rewards
author: @felixhenneke, @harisang
status: active
created: 2023-01-16
replaces: CIP 14
```

## Simple Summary:

We propose an auction-based mechanism for rewarding winners of the solver competition.

Currently, the winner of the solver competition receives an amount of COW tokens based on the orders that are included in their solution. This per-order rewards model does not take into account general market conditions or how difficult it was to find a solution. It therefore requires constant readjustment, and it does not incentivize research into tackling challenging batches, e.g., utilizing the value of CoWs.

The proposed auction mechanism allows for dynamically finding a reward based on how much value a solver provides to users by settling a batch. This results in a distribution of rewards better aligned with goals of the protocol, solvers, and users.

## Motivation

The current reward scheme for the solver competition is based on risk-adjusted rewards on a per-order basis, as described in CIP-14.

There are several issues with the current model:

- It is parameterized by an ad-hoc minimum reward of 37 COW per order.
- It is oblivious to how easy it is to detect the best execution (thin vs thick competition). Auctions with multiple identical submitted solutions are rewarded the same as auctions with only one valid solution.
- It does not sufficiently reward innovation, such as taking advantage of CoWs and finding structurally better solutions (e.g., detecting CoWs instead of executing by touching the same pool multiple times).
- It is oblivious to the volume of an order and the particular pairs traded.

Ultimately, the current model is quite static, does not directly take protocol viability into consideration, and is not adaptive enough to the ever-changing competition landscape.

The proposed reward mechanism improves on all of these issues. It also requires some changes in the behavior of solvers, which we discuss after the specification of the mechanism.

## Specification

Per-auction rewards are computed using a second-price auction mechanism for the quality of solutions:

- Each solver commits to a score and a solution.
- The winner is the solver with the highest score.
- The winner gets to settle the auction on chain, and is paid `payment = cap(observedQuality - referenceScore)`.

Here, the observed quality is `observedQuality = observedSurplus + observedFee`, as executed on chain. The reference score is the second highest submitted score. The uncapped payment of `observedQuality - referenceScore` is capped from above and below using the function `cap(x) = max(-c, min(c + observedCost, x))` with a cap `c` of `0.01 ETH`.

Submitted scores have to be positive. We note that the (protocol-submitted) empty solution, having a zero score, will always be considered in the ranking. This implies that an auction with only one submitted solution (from a solver) can still get rewarded (the referenceScore in this case is zero).

The observed quality is zero if the settlement is not successfully submitted to the blockchain. This can be, e.g., due to a revert of the transaction or because the transaction was not successfully settled in time (current limit is 2 minutes).
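
The ranking and capped payment described above can be sketched in Python. This is a minimal illustration, not the protocol's implementation: the `Bid` type, the function names, and the use of plain floats denominated in ETH are assumptions made for the example. Only the formulas and the cap of 0.01 ETH come from the text.

```python
from dataclasses import dataclass

CAP_ETH = 0.01  # cap c from the specification


@dataclass
class Bid:
    solver: str   # hypothetical solver identifier
    score: float  # score the solver committed to, in ETH


def rank(bids):
    """Return (winning bid, reference score) for one auction."""
    if not bids:
        raise ValueError("no solver submitted a solution")
    # The protocol-submitted empty solution (score 0) always takes part,
    # so a lone solver bid has a reference score of zero.
    ranked = sorted(bids + [Bid("empty", 0.0)], key=lambda b: b.score, reverse=True)
    return ranked[0], ranked[1].score


def cap(x, observed_cost, c=CAP_ETH):
    """cap(x) = max(-c, min(c + observedCost, x))"""
    return max(-c, min(c + observed_cost, x))


def payment(observed_surplus, observed_fee, observed_cost, reference_score, settled):
    """Capped payment to the winner, in ETH."""
    # Observed quality is zero if the settlement reverts or misses the deadline.
    quality = observed_surplus + observed_fee if settled else 0.0
    return cap(quality - reference_score, observed_cost)
```

With a single solver bid, `rank` returns a reference score of zero, which is why such an auction can still be rewarded.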

The payment described above can be decomposed into a reward and a gas reimbursement. In the case of a successful settlement, the gas reimbursement is equal to `observedCost` of the settlement, while the reward is `payment - observedCost`, which is equal to `min(observedQuality - observedCost - referenceScore, c)`. For a failed transaction, the gas reimbursement is zero and the reward is negative and equal to `max(-c, -referenceScore)`, in which case the winning solver has to pay this amount to the protocol. As before, gas reimbursements are paid in ETH while rewards are paid in COW, using an up-to-date COW price.
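
The decomposition into reward and gas reimbursement can be written out as a small function. This is a sketch under assumptions: the function name `split_payment` and the float amounts in ETH are hypothetical, and the conversion of the reward into COW is left out. Note that the closed form `min(observedQuality - observedCost - referenceScore, c)` matches `payment - observedCost` whenever the lower cap does not bind.

```python
def split_payment(observed_quality, observed_cost, reference_score, settled, c=0.01):
    """Split the capped payment into (reward, gas_reimbursement), both in ETH."""
    if settled:
        # cap(x) = max(-c, min(c + observedCost, x)) applied to the uncapped payment
        payment = max(-c, min(c + observed_cost, observed_quality - reference_score))
        return payment - observed_cost, observed_cost
    # Failed settlement: observed quality is zero, nothing is reimbursed,
    # and the (negative) reward is owed to the protocol.
    return max(-c, -reference_score), 0.0
```

For example, a settlement with quality 0.03 ETH, cost 0.002 ETH, and reference score 0.003 ETH hits the upper cap: the reimbursement is 0.002 ETH and the reward is 0.01 ETH.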

## Example

Suppose we have 2 solutions:

- solution 1 has quality $20, solver reports score $10
- solution 2 has quality $25, solver reports score $5

Solution 1 wins based on the higher score. If its solver successfully submits the solution on chain, they are paid `$20 - $5 = $15`. If the transaction fails, they are paid `$0 - $5 = -$5`, i.e., they pay the protocol $5.

Suppose we have a third solution:

- solution 3 has quality $15, solver reports score $9

Solution 1 still wins, but the reference score is now $9, so the payment is `$20 - $9 = $11` in case of a successful submission on chain and `$0 - $9 = -$9` in case of a revert.
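
The numbers in this example can be reproduced with a few lines of Python. The sketch ignores capping, as the example does, and the dictionary layout and the function name `outcome` are assumptions made for illustration.

```python
# Qualities and reported scores in dollars, as in the example.
solutions = {
    1: {"quality": 20, "score": 10},
    2: {"quality": 25, "score": 5},
}


def outcome(solutions):
    """Return (winner, payment on success, payment on failure), uncapped."""
    ranked = sorted(solutions, key=lambda k: solutions[k]["score"], reverse=True)
    winner = ranked[0]
    # Second highest score; zero (the empty solution) if there is no other bid.
    reference = solutions[ranked[1]]["score"] if len(ranked) > 1 else 0
    on_success = solutions[winner]["quality"] - reference
    on_failure = 0 - reference  # observed quality is zero on a revert
    return winner, on_success, on_failure


print(outcome(solutions))  # (1, 15, -5)

solutions[3] = {"quality": 15, "score": 9}
print(outcome(solutions))  # (1, 11, -9)
```

Solution 2 has the highest quality but loses because ranking uses the reported score, not the quality.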

The protocol commits to spend 20_000_000 COW tokens per year, i.e., around 383307 COW per week, on rewards. If at the end of an accounting period (currently: a week) the total amount of performance-based rewards to be paid to solvers is smaller than the amount committed to by the protocol, the remaining sum is distributed to solver teams which consistently provide solutions. Those rewards will be distributed pro rata to solvers based on the number of auctions for which they submitted a valid solution over the accounting period.
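
One way to read the pro rata rule is sketched below. The function name, the dictionary inputs, and the handling of an overspent budget (no consistency rewards) are assumptions; the text only fixes the weekly budget and the pro rata principle.

```python
WEEKLY_BUDGET_COW = 383_307  # weekly figure quoted in the text


def consistency_rewards(performance_rewards, valid_solution_counts,
                        budget=WEEKLY_BUDGET_COW):
    """Distribute the unspent weekly budget pro rata to valid-solution counts.

    performance_rewards: COW paid per solver as performance-based rewards
    valid_solution_counts: number of auctions with a valid solution per solver
    """
    remaining = budget - sum(performance_rewards.values())
    total = sum(valid_solution_counts.values())
    if remaining <= 0 or total == 0:
        # Assumption: nothing is distributed if the budget is already used up.
        return {solver: 0.0 for solver in valid_solution_counts}
    return {solver: remaining * n / total
            for solver, n in valid_solution_counts.items()}
```

For instance, if performance rewards add up to 283_307 COW and three solvers submitted valid solutions in 300, 100, and 100 auctions, the remaining 100_000 COW is split 60_000 / 20_000 / 20_000.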

The capping of performance-based rewards is chosen such that the total rewards for all solvers amount to around 15_000_000 COW tokens per year (see section on experiments). This is comparable to the rewards paid to solvers in Nov-Jan. The remaining COW tokens of the total allocation of 20_000_000 COW are rewarded based on consistency. The cap of 0.01 ETH will be adjusted one time after 4 weeks of using the new reward scheme. If the amount of rewards paid in that period extrapolates to significantly more than 15_000_000 COW per year, the cap will be decreased. If the amount of rewards paid in that period extrapolates to significantly less than 15_000_000 COW per year, the cap will be increased.
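
The extrapolation behind the one-time cap adjustment could look as follows. The text does not quantify "significantly", so the `tolerance` parameter is a purely hypothetical placeholder, as are the function name and the linear scaling of 28 days to 365 days.

```python
TARGET_COW_PER_YEAR = 15_000_000


def cap_adjustment(rewards_cow_4_weeks, tolerance=0.1):
    """Suggest a direction for the cap after the 4-week observation period.

    tolerance is a hypothetical stand-in for "significantly"; the proposal
    does not specify a threshold.
    """
    annualized = rewards_cow_4_weeks * 365 / 28
    if annualized > TARGET_COW_PER_YEAR * (1 + tolerance):
        return "decrease"
    if annualized < TARGET_COW_PER_YEAR * (1 - tolerance):
        return "increase"
    return "keep"
```

The function only signals a direction; by how much the cap changes is not determined by the text.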

We will monitor the additional constraint `score < observedQuality` for submitted scores and add this rule to the social consensus rules.

## Implications for solvers

With the new reward mechanism, solver teams would have to submit not only their solution but also a score. The score can be interpreted as expected quality - expected costs. A profit-maximizing solver would maximize their score, which in turn would mean that they maximize expected quality - expected costs. The second-price mechanism gives solvers an incentive to truthfully report this expectation.

## Example

Suppose a solver found a solution which in the case of a successful settlement generates a quality of $20 (e.g., by a surplus of $11 and fees of $9) for costs of $8 (e.g., by transaction costs of $9 and estimated positive slippage of $1). They estimate the revert probability to be 5%, where a revert yields zero quality and still incurs the transaction costs of $9. Then they should optimally submit a score of `(0.95 * $20 + 0.05 * $0) - (0.95 * $8 + 0.05 * $9) = $10.95`.
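
The score in this example is an expectation over the two outcomes (success and revert). A minimal sketch, with a hypothetical function name and parameter names:

```python
def expected_score(quality, cost_success, cost_revert, p_revert):
    """Expected quality minus expected costs, in dollars."""
    p_success = 1 - p_revert
    # A revert yields zero quality but still incurs costs.
    expected_quality = p_success * quality + p_revert * 0
    expected_cost = p_success * cost_success + p_revert * cost_revert
    return expected_quality - expected_cost


print(round(expected_score(20, 8, 9, 0.05), 2))  # 10.95
```

A higher estimated revert probability lowers the score a solver should report, which is how revert risk enters the ranking.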

The change of the reward mechanism will thus require solvers to adapt their strategy a bit: Instead of just maximizing quality - costs they now explicitly have to take into account revert risks. This moves complexity to solvers, as proper risk management is going to be required to be profitable.

The new reward scheme does not by itself enforce maximizing surplus of users. This is because in the score computation a smaller surplus of a solution can be compensated by a smaller cost, e.g., due to positive slippage. We want to avoid the situation of solvers decreasing surplus and will therefore monitor and enforce EBBO (Ethereum's Best Bid and Offer) rules more strictly, see the social consensus rules in CIP-11.

## Transitioning period

Solvers have to learn to submit scores, e.g., by having a model for revert risk and costs. Therefore we propose a transitioning period for the new reward mechanism along the following lines:

- Two weeks for a dry run of the new scheme where solvers can optionally submit either a `score` or a `scoreDiscount` (at most one of the two values). All solutions are simulated and only successfully simulated solutions are considered. If a `scoreDiscount` is submitted, the driver computes a score as `score = simulatedSurplus + simulatedFees - simulatedCost - scoreDiscount`. Solutions are still ranked and rewarded by the **old scheme**. Solvers can see how they *would have been* ranked and rewarded using the new scheme.

  This period will start on 28.2.2023, with weekly payments computed on 7.3.2023 and 14.3.2023.
- Four weeks of using the new scheme in production where submitting either a `score` or a `scoreDiscount` (exactly one of the two values) is mandatory. The ranking of solutions and rewards are computed using the **new scheme**. This period is used to observe the competition and the effect of capping. If rewards are significantly smaller than expected the cap is increased. If rewards are significantly larger than expected the cap is reduced.

  This period starts on 14.3.2023 with the first payment to solvers taking place on 21.3.2023.
- After that period, the submission of a `score` becomes mandatory. This period starts on 11.04.2023 with the first payment being on 18.04.2023.
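
How a driver might turn a submission into a score during the second phase (exactly one of `score` and `scoreDiscount`) is sketched below. The function name, the keyword arguments, and the error handling are assumptions; only the formula for the `scoreDiscount` case comes from the text.

```python
def resolve_score(simulated_surplus, simulated_fees, simulated_cost,
                  score=None, score_discount=None):
    """Return the score used for ranking when exactly one value is submitted."""
    if (score is None) == (score_discount is None):
        raise ValueError("submit exactly one of score or scoreDiscount")
    if score is not None:
        return score
    # score = simulatedSurplus + simulatedFees - simulatedCost - scoreDiscount
    return simulated_surplus + simulated_fees - simulated_cost - score_discount
```

During the dry run, where a submission is optional, the same check would have to accept the case that neither value is given.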

## Experiments on historical data

The new reward scheme has been tested on historical auction data from October to January. Using the proposed scheme, the capping of rewards was chosen such that rewards are close to 41068 COW per day, corresponding to 15_000_000 COW per year, with a fixed exchange rate of 0.00005 ETH/COW. For October-January:

In the time period November-January the proposed scheme results in the following rewards for the individual solvers (blue: old scheme, yellow: new scheme).

The script and data for the experiments can be found in the Appendix. The script also contains additional information on the distribution of rewards, profits, and payments per auction and per solver.

**UPDATE January 30** We included experiments and specified the capping.

**UPDATE February 7**

- Information on the total budget for rewards was added and how they impact the choice of the capping parameter and consistency based payments.
- A timeline was added with information on a transitioning period to the new reward scheme.
- To avoid confusion, the naming of rewards changed slightly. The proposal now distinguishes between *payments* and *rewards*. The naming is now in line with the current reward scheme.

**UPDATE February 13** Slight change in the wording to make it compatible with future changes, e.g., driver/solver co-location.

**UPDATE February 16** CIP is moved to voting phase. Link to snapshot.

## Appendix

The script and data for testing the new scheme on historical data can be found here:

A lot of parties contributed to the ideas and formulation of this CIP-Draft. Some of the sources are:

- A report on the proposed auction model with a discussion of alternatives and some rigorous analysis of optimal behavior of solvers:

- The slides of the workshop on solver rewards restructuring:

- A forum post on a related auction model by Thomas Bosman:

- A report on a related auction model by Max Holloway (Xenophon Labs):