I have a 20 year career in developing automated trading systems. I have been involved in Ethereum since before the pre-sale, hoping it would address the failings I saw in traditional finance. My motivation is to engage with the community to achieve this end.
I identified frontrunning as a problem for Ethereum in 2014 and began estimating real-world harms with my EthInclude project (a prototype of Zeromev), which supplied data for my MEV WTF Devcon talk around that time.
I’ve been published on Coindesk and I’m a frequent poster on ethresear.ch, where I warned that PBS would lead to builder centralization issues a year before MEV-Boost was released. I founded and developed http://zeromev.org on an Ethereum Foundation grant in 2021.
Also unrestricted direct database access to this data for CoW Swap / Zeromev and their partners
Minimum provisioning / maintenance period of 12 months
Grant Goals and Impact:
Access to granular MEV data on Ethereum is still relatively limited.
It is in the best interest of both zeromev.org and CoW Swap to make granular MEV data available to the public through a free API. This will help to increase transparency and awareness around the problematic nature of MEV and promote sustainable solutions.
Some potential ideas for use cases to build on top of the proposed API:
A chart that compares the value extracted from users split by the DEX protocol they were using
A chart that compares the amount of MEV incidents for every MEV type split by DEX protocol
An educational snippet on CoW Swap revealing more information about the MEV incidents users had when interacting with various DeFi protocols
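As a rough illustration of the first chart idea, a query against the proposed API might look something like this (the endpoint URL, query parameters and JSON field names are placeholders, since the API is not yet specified):

```python
# Rough sketch only: the endpoint URL, query parameters and JSON field names
# are placeholders; the real API is not yet specified.
from collections import defaultdict
import requests

API_URL = "https://api.zeromev.example/v1/mevTransactions"  # hypothetical endpoint

def user_loss_by_protocol(start_block: int, count: int = 100) -> dict:
    """Sum user losses (USD) per DEX protocol for a range of blocks."""
    resp = requests.get(API_URL, params={"block_number": start_block, "count": count})
    resp.raise_for_status()
    totals = defaultdict(float)
    for tx in resp.json():
        loss = tx.get("user_loss_usd")
        if loss is not None:
            totals[tx.get("protocol") or "unknown"] += loss
    return dict(totals)

if __name__ == "__main__":
    print(user_loss_by_protocol(15_000_000))
```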
Milestones:
The following MEV data will be summarized for each relevant transaction:
Field - Description
BlockNumber - Ethereum block number
TxIndex - Index of transaction in block
MEVType - Frontrun, Backrun, Sandwiched, Swaps, Arb, etc
Protocol - Uniswap, Bancor, Opensea, etc
UserLossUsd - Loss to user from the MEV
ExtractorProfitUsd - Profit to the extractor from the MEV
VolumeUsd - Swap volume (where applicable)
Imbalance - Sandwiched imbalance percentage
AddressFrom - Transaction sender
AddressTo - Transaction receiver
ArrivalTimeUS - Time the transaction was first seen by our US node
ArrivalTimeEU - Time the transaction was first seen by our European node
ArrivalTimeAS - Time the transaction was first seen by our Asian node
Deliverables:
Creation of a new Postgres database to hold the MEV transaction summary data above
Data to be persisted on RAID drives across two servers / database instances (a write instance and a replicating read instance)
Database to be populated with all available historical MEV transaction data
MEV transaction data updated in real time as each new block is classified by Zeromev core
Authenticated access to the database restricted by IP for use by CoW Swap / Zeromev / Dune / etc
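As an illustration of the first deliverable, a minimal sketch of how the summary table above might be declared in Postgres (the table name and column types are assumptions; only the fields themselves come from the list above):

```python
# Sketch only: table name and column types are assumptions based on the field list above.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS mev_transaction_summary (
    block_number         BIGINT  NOT NULL,
    tx_index             INTEGER NOT NULL,
    mev_type             TEXT,          -- Frontrun, Backrun, Sandwiched, Swaps, Arb, ...
    protocol             TEXT,          -- Uniswap, Bancor, Opensea, ...
    user_loss_usd        NUMERIC,
    extractor_profit_usd NUMERIC,
    volume_usd           NUMERIC,
    imbalance            NUMERIC,       -- sandwich imbalance percentage
    address_from         TEXT,
    address_to           TEXT,
    arrival_time_us      TIMESTAMPTZ,   -- first seen by the US node
    arrival_time_eu      TIMESTAMPTZ,   -- first seen by the European node
    arrival_time_as      TIMESTAMPTZ,   -- first seen by the Asian node
    PRIMARY KEY (block_number, tx_index)
);
"""

with psycopg2.connect("dbname=zeromev") as conn, conn.cursor() as cur:
    cur.execute(DDL)
```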
The first payment of the grant has been executed: link. @Pmcgoohan, when you have updates about the progress, please post them here for everyone to follow.
Hi. I’m sorry to report that Zeromev is having infrastructure problems.
We run two archive nodes for redundancy. Sadly, both have failed, so MEV data is no longer updating (although transaction timing data is still being collected without problems).
Resolving this situation and making the clusters more robust is my immediate priority.
I’ll be progressing the API project once this has been done.
Thanks.
That’s unfortunate!
Hope you find a path to a quick recovery for Zeromev!
Please keep us posted when you have a timeline for getting back to work on the MEV API
Hello,
We successfully resolved the infrastructure issues above and increased capacity, and Zeromev started processing MEV data again.
Unfortunately, the system then halted on 04-Jan with a new issue related to the use of Flashbots mev-inspect-py.
I have contracted k8s/Python experts to help me resolve this.
I’m afraid that because all resources are currently focused on restoring the site, the launch date for the API project has been pushed back to late Feb/early Mar.
I hope to have Zeromev back up by the end of the week. Investigations are ongoing and I’ll have a clearer idea soon. I will post further information here as I have it.
Please accept my apologies for this, and thank you for your patience.
I’m pleased to say that the remaining issues with the site have been resolved, and it is now back up and running.
I discovered that mev-inspect-py processes certain blocks very slowly (many minutes rather than a few seconds as usual) and have now ensured that the site can handle these outliers.
So it’s full speed ahead on the API project. I’m sorry this has caused a delay; thank you for your patience. I’m looking forward to getting stuck in on Monday.
The API servers have been provisioned and configured. Replicating database instances have been set up with automatic failover provided by a third watcher server.
The database and API source table have been created as specified. The code to populate this from the existing Zeromev MEV and arrival time databases is nearing completion.
It’s going well and we’re on course to deliver.
I am also extending the Zeromev dataset by backfilling another year of data. This will give the Zeromev site and API the same time range as MEV-Explore (from Dec 2019). I expect this extended dataset to be made available as part of the API launch, if not before.
We’re making good progress this week. The API table is now being populated with data for testing and debugging. A few things have come up that I wanted to highlight:
Data Structure / API Improvements
I’ve added swap_count columns alongside swap_volume_usd for better reporting
I also aim to add extractor_swap_count and extractor_swap_volume columns so extractor volume can be differentiated from user or ‘true’ volume (and perhaps calculated user_swap_count and user_swap_volume columns through the API), as sketched below
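A rough sketch of the derived user columns I have in mind (column names are provisional, not the final API shape):

```python
# Provisional column names; a sketch of the derived values the API could return.
def user_swap_stats(row: dict) -> dict:
    """Split 'true' user activity out from extractor activity for one row."""
    return {
        "user_swap_count": (row.get("swap_count") or 0) - (row.get("extractor_swap_count") or 0),
        "user_swap_volume": (row.get("swap_volume_usd") or 0.0) - (row.get("extractor_swap_volume") or 0.0),
    }
```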
Data Structure Limitations
Note that while it will be possible to aggregate arb and swap volumes, it will not be possible to differentiate them accurately by protocol, because where there are multiple swaps per transaction the protocol field will be set to “multiple”
A later development could address this with a dedicated swaps table containing a row for each swap rather than one for each Ethereum transaction
Sandwich volume is not affected by this, however, and can still be aggregated by protocol, as in the sketch below
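To illustrate the limitation, a sketch of an aggregation that respects it (the mev_type and protocol value names here are assumptions):

```python
# Sketch: per-protocol volume is only reliable for sandwich rows; arb/swap
# rows with several swaps carry protocol = "multiple" and cannot be split.
# The mev_type / protocol value names are assumptions.
from collections import defaultdict

def aggregate_volume(rows):
    sandwich_by_protocol = defaultdict(float)
    other_volume = 0.0
    for row in rows:
        volume = row.get("swap_volume_usd") or 0.0
        if row.get("mev_type") == "sandwiched":
            sandwich_by_protocol[row.get("protocol") or "unknown"] += volume
        else:
            other_volume += volume  # reportable only as a total, not per protocol
    return dict(sandwich_by_protocol), other_volume
```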
Classification Improvements
I will need to reclassify the entire MEV dataset to populate the API table, and this represents an opportunity to make improvements
Currently, if any token in a potential sandwich is unknown (i.e. not a known token in the Ethplorer API), the Zeromev classification does not calculate it
Because the dataset will be used for aggregate reporting (e.g. total MEV per day), I am looking into calculating MEV even in some instances where tokens are unknown
This should be possible as long as the input & output tokens are known (see screenshot below)
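As a rough sketch of the principle (illustrative only; this is not the actual Zeromev classification code):

```python
# Illustrative only: not the actual Zeromev sandwich calculation. The point is
# that pricing a victim's loss needs USD rates for the swap's input and output
# tokens, not for every token the attacker's transactions touch.
def can_price_sandwich(victim_swap: dict, known_usd_rates: dict) -> bool:
    return (victim_swap["token_in"] in known_usd_rates
            and victim_swap["token_out"] in known_usd_rates)
```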
I think the steps above will greatly improve the power of the dataset, while keeping it simple to understand and report against, as was the original vision.
I do not expect this to push back delivery date beyond the end of this month/early next month.
I’d be keen to hear what you think here, and I’m very happy to discuss it and give further clarification.
The improvements to the MEV classification / calculations above have been coded successfully.
I aim to release these early with an announcement this week.
This will prepare us for the new website data format (some changes were needed there), so that when the time comes we can export it along with the API data without any downtime and without breaking clients (the export will take around 48 hours).
Both the API data export and the REST API are now up and running in development. Testing is ongoing.
Awesome, thanks for the update!
Do you have an ETA for when it will be ready for public testing?
Would be interested to find people that want to build cool visualizations using the API
Testing has raised a few small issues related to low liquidity DEX pools which have now been fixed.
This project has been a useful exercise in auditing the dataset, and the MEV data is looking strong.
Unfortunately, I have just come across an issue with the address fields.
I intend the address fields to relate to the from/to fields returned by the eth_getBlockByNumber RPC call, as I think these are most useful (please let me know if you disagree!)
I’ve been using the mev-inspect database address from/to fields until now, but on closer inspection it seems they relate to specific traces, which are not always what is needed (e.g. Uniswap v3 contract addresses rather than end-user addresses).
So I think it’s best to ignore those fields and import the addresses directly via RPC, as a batch job initially and incrementally after that.
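A minimal sketch of the kind of batch import I have in mind, using web3.py (the RPC endpoint and block number are placeholders):

```python
# Sketch of the backfill: pull from/to for every transaction in a block via
# eth_getBlockByNumber (full transactions) rather than from mev-inspect traces.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))  # placeholder RPC endpoint

def addresses_for_block(block_number: int):
    block = w3.eth.get_block(block_number, full_transactions=True)
    for tx in block.transactions:
        # 'to' is None for contract creations
        yield block_number, tx["transactionIndex"], tx["from"], tx["to"]

# batch/incremental usage, e.g. working back through the historical dataset
for row in addresses_for_block(16_000_000):
    print(row)
```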
I was going to set the launch date for next week, but given that this means further coding and testing involving a high volume of data (the entire chain since 2019), I think it’s now sensible to aim for the week after.
How do we feel about March 15th as a potential launch date?
So I (rather publicly) discovered some issues with the backrunning data in the API last week, or at least with my interpretation of it. Explanation here…
https://twitter.com/pmcgoohanCrypto/status/1635283236158046215
Although it was hasty of me to generalize about the nature of these backruns, sandwich imbalances are interesting.
They show where my sandwich calculation is having to work to produce a balanced result based on the mev-inspect-py data. They will often point to more complex MEV types that are not yet fully quantified.
However, I want to avoid this data being reported as first-class MEV as it stands. As such, while backruns will still be visible through the API, I will null the user loss and extractor profit columns for them. I’ll keep the imbalance column as an indicator that the sandwich has been balanced.
The data in these nulled columns won’t be lost; it will still be stored in the database, just not published.
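For clarity, one way the published data could be derived from the stored data (names follow the earlier schema sketch; this is a sketch, not necessarily how it will be implemented):

```python
# Sketch: publish backruns with profit/loss nulled while keeping the raw
# values in the underlying table. Table, column and mev_type names follow
# the earlier schema sketch and are assumptions.
import psycopg2

VIEW_SQL = """
CREATE OR REPLACE VIEW mev_transaction_summary_public AS
SELECT block_number,
       tx_index,
       mev_type,
       protocol,
       CASE WHEN mev_type = 'backrun' THEN NULL ELSE user_loss_usd END        AS user_loss_usd,
       CASE WHEN mev_type = 'backrun' THEN NULL ELSE extractor_profit_usd END AS extractor_profit_usd,
       volume_usd,
       imbalance,   -- kept as an indicator that the sandwich has been balanced
       address_from,
       address_to,
       arrival_time_us,
       arrival_time_eu,
       arrival_time_as
FROM mev_transaction_summary;
"""

with psycopg2.connect("dbname=zeromev") as conn, conn.cursor() as cur:
    cur.execute(VIEW_SQL)
```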
Despite all this, I’m on course for the API to be available at some point on Wed.
However, it is a little close to the wire as I will still have data importing right up to then. Because of this, and in the light of my recent backrunning U-turn, I’m keen to have more eyes on it before going public.
As such, I think the idea of an internal soft launch in the first instance is a good one. Perhaps we can discuss doing that this week and hold off on a general launch and any announcements until after that?