Author Information
Team: @bleu @yvesfracari @ribeirojose @mendesfabio @lgahdl
About Us:
bleu collaborates with companies and DAOs as a web3 technology and user experience partner. We’re passionate about bridging the experience gap we see in blockchain and web3.
Our work for CoW so far:
[CoW] Framework Agnostic SDK: Restructured the SDK architecture to be more composable, with framework-agnostic base packages and EVM adapters.
[CoW] Hook dApps: Built hook dApps integrated into the CoW Swap frontend and developed the cow-shed module of @cowprotocol/cow-sdk to simplify permissioned hooks.
[CoW] Offline Development Mode (proposal): Self-contained offline development environment enabling developers to work without external dependencies while testing solver strategies with realistic DEX liquidity.
Simple Summary
The CoW Protocol Playground currently offers no way to run systematic performance tests, so the impact of performance work can only be measured after deploying to production.
We propose a performance testing suite for the CoW Playground that:
- Generates configurable synthetic load (orders and patterns).
- Measures performance end-to-end.
- Integrates with the existing Prometheus/Grafana stack.
- Works with fork mode (primary) and offline mode (stretch goal).
Goal
This proposal addresses the need for performance testing capabilities in the CoW Protocol development workflow. Current limitations include:
- Performance improvements require deployment to production environments
- No way to generate a synthetic load for testing
- Difficult to measure the impact of optimizations before deployment
- Cannot simulate edge cases or stress conditions
- No standardized approach to performance testing
Benefits for the CoW Ecosystem:
- Risk Reduction: Identify performance issues before production deployment
- Faster Development: Measure optimization impact immediately in fork mode
- Better Insights: Understand system behavior under various load patterns
- Data-Driven Decisions: Make optimization choices based on concrete metrics
- Reproducible Testing: Standard test scenarios for consistent benchmarking
- Fork Mode Testing: Test performance with realistic mainnet state using Anvil fork mode
Fork Mode Integration (Primary Requirement)
- Works with Anvil fork mode (`anvil --fork-url $MAINNET_RPC`).
- Uses the CoW archive node made available for development.
- Leverages Anvil’s state caching for faster subsequent test runs after initial setup.
- Integrates with existing Prometheus/Grafana metrics.
- Provides realistic mainnet state for authentic performance testing.
Benefits of fork mode:
- Test against actual mainnet state and liquidity
- Realistic DEX interactions and pricing
- Authentic solver behavior and settlement scenarios
- First run caches state, subsequent runs are much faster
Offline Mode Compatibility (Stretch Goal)
- Validate compatibility with offline mode environment
- Test with pre-deployed contracts and local DEX liquidity
- Confirm no external dependencies needed for offline testing
- Document any differences in performance characteristics
Milestones
| Milestone | Duration | Payment |
|---|---|---|
| M1 — Load Generation Framework | 2 weeks | 6,000 xDAI |
| M2 — Performance Benchmarking | 2 weeks | 6,000 xDAI |
| M3 — Metrics & Visualization | 2 weeks | 6,000 xDAI |
| M4 — Test Scenarios | 1 week | 3,000 xDAI |
| M5 — Integration, Documentation & Offline Mode Exploration | 2 weeks | 6,000 xDAI |
| Maintenance | 1 year | 27,000 COW |
Total Duration: 9 weeks
Total Funding: 27,000 xDAI
Maintenance Vesting: 27,000 COW over 1 year
Deliverables
1. Load Generation Framework
Realistic order flow simulation:
- Generates market/limit orders with configurable rates, sizes, token pairs, and patterns.
- Simulates multiple traders and signing behavior.
2. Performance Benchmarking Tools
- Measure order lifecycle, settlement latency, solver rounds, API latency, and resource usage.
- Store baselines and compare runs to detect regressions.
3. Metrics Collection & Visualization
- Prometheus exporters for testing metrics.
- Ready-to-use Grafana dashboards (throughput, latency distributions, solver metrics, resource usage, error rates).
- Optional alerts for performance degradation.
4. Test Scenarios & Configurations (RFP Requirement)
Reusable test configurations:
- Predefined scenarios: light, medium, heavy, spike, sustained, edge cases.
- Configuration-driven via YAML/JSON + a simple scenario builder.
5. Load Testing CLI Tool
- Simple commands like `performance-test run --scenario heavy-load`.
- Reads config files, shows real-time progress, exports reports.
- CI-friendly interface.
6. Performance Regression Detection
- Compare current vs baseline runs.
- Highlight regressions via deltas and percentiles.
- Optionally surface alerts through existing stack.
7. Playground Integration
- Integrated into the playground docker-compose setup.
- Validated against Anvil fork mode with mainnet state (primary).
- Stretch: explore offline mode compatibility with pre-deployed contracts and local liquidity.
8. Documentation
- Quick start for running tests.
- Scenario configuration reference.
- Metrics/graphs interpretation guide.
- Architecture overview and extension points.
Specification
M1: Load Generation Framework (2 weeks)
Build the core load generation capabilities:
- Order generation engine
  - Market and limit order creation using CoW SDK order schemas
  - Realistic token pair distribution
  - Configurable order parameters
  - Treat playground as HTTP API for standard load testing
- User simulation module
  - Multiple concurrent traders
  - Signature generation and submission
  - Order status tracking
- CLI tool interface
  - Command-line argument parsing
  - Configuration file support
  - Real-time progress display
- Order submission strategies
  - Constant rate submission
  - Burst patterns
  - Gradual ramp-up
Goal: A working load generator that creates realistic order flow for the playground environment.
Technical Note: Will evaluate k6 (strong candidate due to Grafana integration) vs Python-based tools for framework selection.
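As a concrete (non-binding) illustration of what M1 could look like if k6 is selected, the sketch below submits synthetic orders to the playground orderbook at a constant rate. The orderbook URL, endpoint path, payload fields, and placeholder addresses/signatures are assumptions for illustration only; real order construction and signing would reuse @cowprotocol/cow-sdk as described above.

```javascript
// Sketch: constant-rate order submission against the playground orderbook.
// Assumes k6 is the selected framework; URL, endpoint, and payload are illustrative.
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  scenarios: {
    constant_order_flow: {
      executor: 'constant-arrival-rate',
      rate: 10,            // orders per second
      timeUnit: '1s',
      duration: '5m',
      preAllocatedVUs: 20,
    },
  },
};

// Playground orderbook endpoint (assumed default, overridable per environment).
const ORDERBOOK_URL = __ENV.ORDERBOOK_URL || 'http://localhost:8080';

export default function () {
  // Illustrative order payload loosely following the orderbook order schema;
  // real runs would build and sign orders via @cowprotocol/cow-sdk helpers.
  const order = {
    sellToken: '0x...',                       // drawn from a configured token-pair list
    buyToken: '0x...',
    sellAmount: '1000000000000000000',
    buyAmount: '990000000000000000',
    validTo: Math.floor(Date.now() / 1000) + 1800,
    kind: 'sell',
    partiallyFillable: false,
    signature: '0x...',                       // produced by the signing engine for a test account
  };

  const res = http.post(`${ORDERBOOK_URL}/api/v1/orders`, JSON.stringify(order), {
    headers: { 'Content-Type': 'application/json' },
  });
  check(res, { 'order accepted': (r) => r.status === 201 });
}
```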
M2: Performance Benchmarking (2 weeks)
Implement performance measurement and comparison:
- Metrics collection framework
  - Order lifecycle timing (complex: track from submission through settlement)
  - Settlement latency tracking
  - API response times (req/s from load testing framework - k6 provides this natively)
  - Resource utilization monitoring (extract from Docker container stats)
- Baseline snapshot system
  - Performance baseline capture
  - Metadata and configuration storage
  - Version control integration
- Comparison engine
  - Statistical analysis
  - Regression detection algorithms (non-trivial: define thresholds, statistical significance)
  - Performance difference reporting
- Automated reporting
  - Summary statistics
  - Detailed performance breakdowns
  - Visualization-ready data export
Goal: Benchmarking system with regression detection.
Technical Note: Regression algorithm and accurate order lifecycle timing are the most complex parts requiring careful design.
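To make the regression-detection idea concrete, here is a minimal sketch of a percentile-based comparison between a baseline snapshot and the current run. The file format, metric names, and the 10% threshold are assumptions for illustration; the production version would add the statistical-significance handling noted above.

```javascript
// Sketch: percentile-based regression check between a baseline and a current run.
// Input files are assumed to contain per-metric percentile summaries, e.g.
// { "settlement_latency_ms": { "p50": 1200, "p95": 3400 }, ... }
const fs = require('fs');

function detectRegressions(baselinePath, currentPath, maxIncreasePct = 10) {
  const baseline = JSON.parse(fs.readFileSync(baselinePath, 'utf8'));
  const current = JSON.parse(fs.readFileSync(currentPath, 'utf8'));
  const regressions = [];

  for (const [metric, basePcts] of Object.entries(baseline)) {
    const curPcts = current[metric];
    if (!curPcts) continue; // metric not collected in this run
    for (const [pct, baseValue] of Object.entries(basePcts)) {
      const delta = ((curPcts[pct] - baseValue) / baseValue) * 100;
      if (delta > maxIncreasePct) {
        regressions.push({ metric, percentile: pct, baseline: baseValue, current: curPcts[pct], deltaPct: delta });
      }
    }
  }
  return regressions;
}

// Usage: node detect-regressions.js baseline.json current.json
if (process.argv.length < 4) {
  console.error('usage: node detect-regressions.js <baseline.json> <current.json>');
  process.exit(2);
}
const findings = detectRegressions(process.argv[2], process.argv[3]);
findings.forEach((r) =>
  console.log(`${r.metric} ${r.percentile}: ${r.baseline} -> ${r.current} (+${r.deltaPct.toFixed(1)}%)`)
);
process.exit(findings.length > 0 ? 1 : 0); // non-zero exit enables CI gating
```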
M3: Metrics & Visualization (2 weeks)
Integrate with Prometheus/Grafana infrastructure:
- Prometheus exporters
  - Custom metrics for load testing
  - Performance benchmark metrics
  - Test scenario metadata
- Grafana dashboards
  - Order throughput visualization
  - Latency distribution histograms
  - Resource utilization panels
  - Comparison views
- Alerting rules
  - Performance degradation alerts
  - Error rate thresholds
  - Resource exhaustion warnings
Goal: Rich visualization and monitoring capabilities.
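As an illustration of how custom testing metrics could be exposed to the existing Prometheus/Grafana stack, the sketch below uses the prom-client and express npm packages to publish a latency histogram and an error counter. Metric names, labels, and the port are illustrative assumptions, not final choices.

```javascript
// Sketch: a minimal Prometheus exporter for load-test metrics (prom-client + express).
const express = require('express');
const client = require('prom-client');

const register = new client.Registry();

// Histogram capturing end-to-end order lifecycle time, labeled by scenario.
const orderLifecycle = new client.Histogram({
  name: 'playground_order_lifecycle_seconds',
  help: 'Time from order submission to observed settlement',
  labelNames: ['scenario'],
  buckets: [1, 5, 15, 30, 60, 120, 300],
  registers: [register],
});

// Counter for submission errors, labeled by HTTP status code.
const submissionErrors = new client.Counter({
  name: 'playground_order_submission_errors_total',
  help: 'Orders rejected by the orderbook API',
  labelNames: ['status'],
  registers: [register],
});

// The load generator would record observations as orders progress, e.g.:
// orderLifecycle.labels('heavy-load').observe(42.7);
// submissionErrors.labels('400').inc();

const app = express();
app.get('/metrics', async (_req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});
app.listen(9464); // scraped by the playground's existing Prometheus instance
```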
M4: Test Scenarios (1 week)
Build a test scenario library:
- Implement predefined scenarios
  - Light, medium, heavy load scenarios
  - Spike and sustained load patterns
  - Edge case scenarios
- Create scenario configuration system
  - Framework-native configuration (e.g., k6 JavaScript scenarios or YAML/JSON)
  - Validation and error handling
- Template library
  - Example scenario collection with documentation
Goal: Rich library of reusable test scenarios.
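For example, if k6 is the selected framework, the predefined scenarios could be expressed directly as k6 executor configurations, as sketched below. The rates and durations are illustrative defaults, and a thin CLI wrapper or environment variables (both assumptions at this stage) would select which scenario a given run enables.

```javascript
// Sketch: predefined load scenarios expressed as k6 executor configurations.
export const scenarios = {
  light_load: {
    executor: 'constant-arrival-rate',
    rate: 1, timeUnit: '1s', duration: '10m', preAllocatedVUs: 5,
  },
  heavy_load: {
    executor: 'constant-arrival-rate',
    rate: 50, timeUnit: '1s', duration: '10m', preAllocatedVUs: 100,
  },
  spike: {
    executor: 'ramping-arrival-rate',
    startRate: 1, timeUnit: '1s', preAllocatedVUs: 200,
    stages: [
      { target: 1, duration: '2m' },    // steady baseline
      { target: 100, duration: '30s' }, // sudden spike
      { target: 1, duration: '2m' },    // recovery
    ],
  },
  sustained: {
    executor: 'constant-arrival-rate',
    rate: 20, timeUnit: '1s', duration: '2h', preAllocatedVUs: 50,
  },
};

// In a test script, a run would opt into one of these, e.g.:
// export const options = { scenarios: { heavy_load: scenarios.heavy_load } };
```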
M5: Integration, Documentation & Offline Mode Exploration (2 weeks)
Final integration, documentation, and offline mode exploration:
- End-to-end integration testing with fork mode (PRIMARY)
  - Configure Anvil fork mode: `anvil --fork-url $MAINNET_RPC` using the CoW Protocol archive node
  - Validate all test scenarios with forked mainnet state
  - Verify Anvil’s state caching works correctly for subsequent runs
  - Test realistic DEX interactions and settlement scenarios
  - Metrics collection verification
  - Dashboard validation
  - Performance overhead measurement
  - Discover and address missing metrics (iterative refinement)
  - Document Anvil fork mode behavior and limitations
- Offline mode exploration (STRETCH GOAL)
  - Validate compatibility with the offline mode environment if time permits
  - Test with pre-deployed contracts and local liquidity
  - Verify no external dependencies are required
  - Compare performance between fork and offline modes
  - Document differences in performance characteristics
- Comprehensive documentation
  - Quick start guide for fork mode setup
  - Configuration reference for archive node integration
  - Anvil fork mode configuration guide (block time, caching behavior)
  - Metrics interpretation guide
  - Architecture documentation
  - Example workflows and tutorials
  - Troubleshooting guide
  - Offline mode setup guide (if the stretch goal is achieved)
Goal: Production-ready testing suite fully validated with fork mode, complete documentation, and offline mode exploration results as a stretch goal.
Maintenance Vesting
- Bug fixes and security updates
- Documentation updates as protocols evolve
- Support for fork mode updates and archive node changes
- Address issues discovered during usage
Architecture Diagram
Architecture Components Description
1. Performance Testing CLI (Blue)
The command-line interface that developers interact with to run performance tests.
- Configuration Loader: Reads and parses test scenario configurations from YAML/JSON files
- Scenario Engine: Orchestrates the execution of test scenarios, managing timing and coordination
- Order Generator: Creates synthetic orders based on scenario specifications
- Report Generator: Produces performance reports comparing results against baselines
This is the entry point for developers running performance tests with simple commands like `performance-test run --scenario heavy-load`.
2. Load Generation (Green)
The core load generation system that simulates realistic user activity.
- User Simulator: Models multiple concurrent users with realistic behavior patterns
- Signing Engine: Generates valid signatures for orders using test accounts
- Order Submitter: Sends orders to the Orderbook API at configured rates and patterns
This block handles the actual generation and submission of synthetic trading activity that mimics real users.
3. Metrics Collection (Orange)
The performance measurement and analysis system.
- Metrics Collector: Monitors the CoW Protocol services and captures performance data
- Prometheus Exporter: Exposes collected metrics in Prometheus format
- Baseline Manager: Stores and manages performance baseline snapshots
- Performance Comparator: Analyzes current performance against baselines and detects regressions
This block provides the intelligence to measure, track, and compare performance over time.
4. Monitoring Stack (Purple)
The visualization and alerting infrastructure.
- Prometheus: Time-series database storing all performance metrics
- Grafana Dashboards: Visual interfaces showing performance trends, latency distributions, and comparisons
- Alert Manager: Sends notifications when performance degrades or thresholds are exceeded
This provides real-time visibility and historical tracking of system performance.
Method
Technical Approach
We propose a performance testing framework designed for the playground environment:
Fork Mode Integration (Primary Requirement):
- Anvil fork mode: Uses `anvil --fork-url $MAINNET_RPC` with the CoW Protocol archive node
- Realistic mainnet state: Tests against actual DEX liquidity and contract state
- Optimized performance: Anvil caches state after first run for faster subsequent tests
- 12s block time: Configured for Ethereum-like timing behavior
- Docker integration: Runs alongside existing playground services
- Authentic testing: Real solver behavior and settlement scenarios
Framework Selection:
We will evaluate and select the best-fit framework during M1 based on Grafana integration and CoW Protocol requirements:
- k6 (leading candidate): Excellent Grafana integration, native Prometheus metrics export, JavaScript scenarios, proven performance testing capabilities
- Python-based alternatives: (Locust, aiohttp) if specific CoW SDK integration needs outweigh k6’s advantages
- CoW SDK integration: Reuse order schemas and types from `@cowprotocol/cow-sdk` for realistic order generation
- API-first approach: Treat playground services as HTTP APIs for standard load testing patterns
Architecture:
- Concurrent/asynchronous execution: Handle high-volume order generation efficiently
- CLI-first design: Easy integration with CI/CD pipelines
- Configuration-driven: Framework-native configuration for flexible scenario definition
- Docker-native: Seamless integration with existing playground docker-compose setup
Load Generation Strategy:
- Realistic order simulation: Model actual user behavior and order patterns
- Configurable load patterns: Support various testing scenarios
- Minimal system impact: Efficient resource usage when not actively testing
- Extensible architecture: Easy to add new order types and patterns
Metrics Collection:
- Non-intrusive monitoring: Leverage existing logging and metrics
- Standard protocols: Prometheus for storage, Grafana for visualization
- Rich metrics: Capture latency distributions, not just averages
- Historical tracking: Enable long-term performance trend analysis
Implementation Strategy
- Modular Design: Separate concerns (generation, collection, visualization)
- Configuration-Driven: All test scenarios defined in configuration files
- Docker Integration: Run as part of playground docker-compose setup
- CI/CD Ready: Command-line interface for automated testing
- Extensible: Plugin architecture for custom metrics and scenarios
Open Source Commitment
All code will be open-source from day 0. We’re open to feedback during PRs and will maintain the codebase according to CoW Protocol standards.
Long-term Sustainability
Maintenance Plan:
- 1-year maintenance through COW token vesting (27,000 COW)
- Bug fixes and feature enhancements
- Documentation updates as protocols evolve
- Community support and issue triage
- Updates for new playground features
Community Ownership:
- All code contributed to CoW Protocol repositories
- Documentation enables community contributions
- Plugin architecture for community extensions
- Training and knowledge transfer
Evaluation Criteria
Per the RFP, our proposal addresses all evaluation criteria:
1. Approach to Load Generation and Testing
- Realistic simulation: Model actual user behavior and order patterns using CoW SDK order schemas
- Flexible scenarios: Pre-built scenarios plus custom configuration
- Scalable architecture: Handle light to heavy load testing
- Concurrent/asynchronous design: Efficient resource usage and high throughput
- Industry-standard tools: Evaluate k6 (preferred for Grafana integration) vs Python-based solutions
- API-first approach: Treat playground services as HTTP APIs for standard load testing patterns
2. Quality of Metrics and Insights
- Metrics: Latency, throughput, resource usage, error rates
- Statistical analysis: Distributions, percentiles, regression detection
- Actionable insights: Clear identification of bottlenecks
- Comparative analysis: Before/after performance comparison
- Historical tracking: Long-term performance trend visibility
3. Ease of Use for Developers
- Simple CLI: Single command to run tests
- Pre-built scenarios: Common test cases ready out-of-the-box
- Clear documentation: Quick start to advanced usage
- Intuitive configuration: YAML/JSON for easy customization
- Automated reporting: No manual data analysis required
4. Integration with Existing Tools
- Playground Fork Mode: Full compatibility with Anvil fork mode using CoW archive node
- Anvil fork mode: Configured with 12s block time and state caching for optimal performance
- Archive node integration: Uses CoW’s archive node for mainnet state forking
- Prometheus/Grafana: Native integration with existing monitoring stack
- Docker Compose: Seamless playground integration
- Offline mode (stretch): Potential compatibility exploration
- CI/CD ready: Command-line interface for automation
5. Maintainability and Documentation
- Clean architecture: Modular, well-structured codebase
- Tests: Unit and integration test coverage
- Clear documentation: Architecture, usage, and extension guides
- Example scenarios: Real-world usage examples
6. Cost and Timeline
- Total cost: 27,000 xDAI (development) + 27,000 COW with 1-year vesting (maintenance)
- Timeline: 9 weeks
- Rate: $3,000/week
- Buffer included: Extra time in M5 for discovering missing metrics and handling unexpected Anvil fork mode limitations
Funding Request
Development Grant: $27,000 (USDC)
Maintenance Vesting: 27,000 COW (1-year vesting from delivery date)
Timeline Breakdown:
- M1 (Load Generation Framework): 2 weeks
- M2 (Performance Benchmarking): 2 weeks
- M3 (Metrics & Visualization): 2 weeks
- M4 (Test Scenarios): 1 week
- M5 (Integration, Documentation & Offline Mode Exploration): 2 weeks
- Total: 9 weeks
Rate: $3,000/week
Budget Breakdown
Development Grant ($27,000 USDC):
- Developer hours during execution
- Project management on an as-needed basis
- Testing and validation infrastructure
Maintenance Vesting (27,000 COW over 1 year):
- Bug fixes and feature enhancements
- Documentation updates as protocols evolve
- Community support and issue triage
- Updates for new playground features
Payment Information
Gnosis Chain Address: 0x554866e3654E8485928334e7F91B5AfC37D18e04
Additional Information
The 1-year maintenance vesting ensures ongoing support and improvements as the CoW Protocol evolves. We’re committed to maintaining high code quality and responsiveness to community feedback throughout the maintenance period.
Our recent experience with the Offline Development Mode project gives us deep familiarity with the playground architecture, positioning us well to build effective performance testing tooling.
Terms and Conditions
By submitting this grant application, we acknowledge and agree to be bound by the CoW DAO Participation Agreement and the CoW Grant Terms and Conditions.
