How to Create a Public Optimization Benchmark

Benchmarks are standardized test suites that enable fair comparison of optimization algorithms. This guide shows you how to create high-quality benchmarks that the community can use to evaluate and compare their algorithms.

What is an Optimization Benchmark?

A benchmark is a collection of:

  • Problem Instances: Standardized test problems
  • Evaluation Metrics: Consistent measurement criteria
  • Reference Solutions: Known optimal or best-known solutions
  • Evaluation Protocol: Rules for fair comparison

Prerequisites

Before creating a benchmark:

  1. Domain Expertise: Deep understanding of the optimization problem
  2. Problem Collection: Diverse set of problem instances
  3. Reference Solutions: Known optimal or high-quality solutions
  4. GitHub Login: A platform account, signed in with GitHub
  5. Uploaded Repositories: Problem definitions uploaded as repositories on the platform
  6. Platform Access: Permission to use the benchmark creation features

Planning Your Benchmark

1. Define Benchmark Scope

benchmark_specification = {
    "name": "TSP Benchmark Suite 2024",
    "domain": "Traveling Salesman Problem",
    "problem_types": ["symmetric", "asymmetric", "euclidean", "geographic"],
    "size_range": {"min": 10, "max": 1000},
    "instance_count": 50,
    "difficulty_levels": ["easy", "medium", "hard", "extreme"],
    "evaluation_metrics": ["solution_quality", "runtime", "convergence_rate"]
}

2. Collect Problem Instances

Instance Sources

  • Literature: Classic problems from research papers
  • Real-world: Practical applications and datasets
  • Generated: Systematically created test cases
  • Community: Contributed by other researchers

Instance Diversity

instance_categories = {
    "size_distribution": {
        "small": {"range": "10-50 nodes", "count": 15},
        "medium": {"range": "51-200 nodes", "count": 20}, 
        "large": {"range": "201-500 nodes", "count": 10},
        "xlarge": {"range": "501-1000 nodes", "count": 5}
    },
    "structure_types": {
        "random": "Randomly generated coordinates",
        "clustered": "Nodes grouped in clusters", 
        "grid": "Regular grid layout",
        "real_world": "Actual city coordinates"
    }
}
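
The structure types above can be generated programmatically. Below is a minimal sketch of one way to do it with NumPy; the function name, the 1000x1000 coordinate extent, and the cluster parameters are illustrative assumptions rather than part of the benchmark specification.

import numpy as np

def generate_instance(n, structure, seed=0, extent=1000.0):
    """Generate (x, y) coordinates for a TSP instance with the given structure.

    'structure' follows structure_types above: random, clustered, or grid.
    """
    rng = np.random.default_rng(seed)
    if structure == "random":
        coords = rng.uniform(0, extent, size=(n, 2))
    elif structure == "clustered":
        n_clusters = max(2, n // 20)  # roughly one cluster per 20 nodes
        centers = rng.uniform(0, extent, size=(n_clusters, 2))
        assignment = rng.integers(0, n_clusters, size=n)
        coords = centers[assignment] + rng.normal(0, extent * 0.02, size=(n, 2))
    elif structure == "grid":
        side = int(np.ceil(np.sqrt(n)))
        xs, ys = np.meshgrid(np.linspace(0, extent, side), np.linspace(0, extent, side))
        coords = np.column_stack([xs.ravel(), ys.ravel()])[:n]
    else:
        raise ValueError(f"unknown structure: {structure}")
    return [{"id": i + 1, "x": float(x), "y": float(y)} for i, (x, y) in enumerate(coords)]

For example, generate_instance(100, "clustered", seed=1) would produce coordinates for one instance in the medium size category.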

Creating Benchmark Components

1. Problem Repository Structure

tsp_benchmark_2024/
├── problems/
│   ├── small/
│   │   ├── tsp_10_random_01.json
│   │   ├── tsp_20_clustered_01.json
│   │   └── ...
│   ├── medium/
│   │   ├── tsp_100_grid_01.json
│   │   └── ...
│   └── large/
├── solutions/
│   ├── optimal/
│   │   ├── tsp_10_random_01_optimal.json
│   │   └── ...
│   └── best_known/
├── config.json
├── benchmark_spec.json
└── README.md

2. Problem Definition Format

{
  "name": "tsp_berlin52",
  "type": "tsp_symmetric",
  "description": "52 locations in Berlin (Groetschel)",
  "size": 52,
  "optimal_value": 7542,
  "source": "TSPLIB",
  "coordinates": [
    {"id": 1, "x": 565.0, "y": 575.0},
    {"id": 2, "x": 25.0, "y": 185.0},
    ...
  ],
  "metadata": {
    "difficulty": "medium",
    "structure": "real_world",
    "tags": ["classic", "tsplib", "geographic"]
  }
}
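
A short sketch of how a consumer might load and sanity-check a problem file in this format; the required-field set and the helper name are assumptions based on the example above, and the file path is illustrative.

import json

REQUIRED_FIELDS = {"name", "type", "size", "coordinates", "metadata"}

def load_problem(path):
    """Load a problem definition and check the fields shown in the format above."""
    with open(path) as f:
        problem = json.load(f)
    missing = REQUIRED_FIELDS - problem.keys()
    if missing:
        raise ValueError(f"{path}: missing fields {sorted(missing)}")
    if len(problem["coordinates"]) != problem["size"]:
        raise ValueError(f"{path}: 'size' does not match the number of coordinates")
    return problem

# Illustrative usage
# problem = load_problem("problems/medium/tsp_100_grid_01.json")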

3. Solution Format

{
  "problem_name": "tsp_berlin52",
  "solution_type": "optimal",
  "objective_value": 7542,
  "tour": [1, 49, 32, 45, 19, 41, 8, 9, 10, 43, 33, 51, 11, 52, 14, 13, 47, 26, 27, 28, 12, 25, 4, 6, 15, 5, 24, 48, 38, 37, 40, 39, 36, 35, 34, 44, 46, 16, 29, 50, 20, 23, 30, 2, 7, 42, 21, 17, 3, 18, 31, 22],
  "verification": {
    "verified": true,
    "method": "exact_solver",
    "solver": "Concorde",
    "computation_time": 1.23
  }
}
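
The stored objective value can be recomputed from the problem's coordinates. The sketch below assumes Euclidean distances rounded to the nearest integer (the TSPLIB EUC_2D convention, which is how berlin52's optimal value of 7542 is obtained); other problem types would need a different distance function.

import math

def tour_length(problem, tour):
    """Recompute a tour's length using rounded Euclidean (TSPLIB EUC_2D) distances."""
    coords = {c["id"]: (c["x"], c["y"]) for c in problem["coordinates"]}
    total = 0
    for a, b in zip(tour, tour[1:] + tour[:1]):  # wrap around to close the cycle
        (x1, y1), (x2, y2) = coords[a], coords[b]
        total += int(round(math.hypot(x1 - x2, y1 - y2)))
    return total

def matches_reference(problem, solution):
    """Check the recomputed length against the stored objective_value."""
    return tour_length(problem, solution["tour"]) == solution["objective_value"]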

Current Benchmark Access

1. Benchmark Page

Visit rastion.com/benchmark to access the features described below.

Protected Access

  • Authentication Required: Must be logged in to access benchmarks
  • User Permissions: Benchmark creation may require specific permissions
  • Community Features: View and interact with existing benchmarks

Benchmark Interface

  • Browse Benchmarks: View available benchmark suites
  • Performance Metrics: See algorithm performance on standard problems
  • Comparison Tools: Compare different algorithms side-by-side
  • Download Options: Access benchmark data and results

2. Integration with Leaderboards

Benchmark-Leaderboard Connection

Benchmarks often integrate with the leaderboard system:

  • Standardized Problems: Benchmarks provide problems for leaderboard competitions
  • Fair Evaluation: Consistent evaluation criteria across submissions
  • Performance Tracking: Long-term performance monitoring
  • Community Engagement: Encourage participation in benchmark challenges

Creating Leaderboard Problems

If you have admin access, you can create leaderboard problems:

  1. Access Admin Interface: Use admin-only problem creation modal
  2. Repository Selection: Choose from your uploaded repositories
  3. Problem Configuration: Set name and optimization direction (min/max)
  4. Simplified Setup: Minimal fields for quick problem creation
  5. Automatic Integration: Problems automatically appear in leaderboard

3. Repository-Based Benchmarks

Upload Benchmark Repositories

Create benchmarks by uploading comprehensive problem repositories:

import qubots.rastion as rastion

# Authenticate
rastion.authenticate("your_rastion_token")

# Upload benchmark repository
benchmark_url = rastion.upload_model_from_path(
    path="./tsp_benchmark_suite",
    repository_name="tsp_benchmark_2024",
    description="Comprehensive TSP benchmark with diverse instances",
    tags=["benchmark", "tsp", "evaluation"],
    private=False
)

Repository Structure for Benchmarks

tsp_benchmark_suite/
├── problems/
│   ├── small_instances/
│   ├── medium_instances/
│   └── large_instances/
├── solutions/
│   ├── optimal_solutions/
│   └── best_known_solutions/
├── config.json
├── benchmark_spec.json
└── README.md
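
Once the directory is assembled, the specification dictionary from the planning step can be written to benchmark_spec.json so the repository is self-describing. A minimal sketch; the schema simply mirrors benchmark_specification above and is an assumption, not a fixed platform format.

import json

# Abbreviated version of the specification defined in the planning step
benchmark_specification = {
    "name": "TSP Benchmark Suite 2024",
    "domain": "Traveling Salesman Problem",
    "instance_count": 50,
    "evaluation_metrics": ["solution_quality", "runtime", "convergence_rate"]
}

with open("tsp_benchmark_suite/benchmark_spec.json", "w") as f:
    json.dump(benchmark_specification, f, indent=2)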

Evaluation Protocol Design

1. Performance Metrics

evaluation_metrics = {
    "solution_quality": {
        "gap_to_optimal": "(solution_value - optimal_value) / optimal_value * 100",
        "success_rate": "percentage of runs finding optimal solution",
        "best_value": "best objective value found",
        "average_value": "average objective value across runs"
    },
    "efficiency": {
        "runtime": "total execution time",
        "iterations": "number of algorithm iterations",
        "evaluations": "number of solution evaluations"
    },
    "robustness": {
        "standard_deviation": "consistency across multiple runs",
        "convergence_rate": "speed of convergence to good solutions"
    }
}
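
For concreteness, the solution-quality metrics above could be computed from per-run objective values like this (a sketch for a minimization problem; the run values in the usage line are made up for illustration):

def quality_metrics(run_values, optimal_value):
    """Compute the solution_quality metrics defined above from a list of per-run values."""
    best = min(run_values)
    return {
        "best_value": best,
        "average_value": sum(run_values) / len(run_values),
        "gap_to_optimal": (best - optimal_value) / optimal_value * 100,
        "success_rate": 100 * sum(v == optimal_value for v in run_values) / len(run_values),
    }

# Example: five runs on berlin52 (optimal value 7542)
print(quality_metrics([7542, 7598, 7542, 7610, 7542], optimal_value=7542))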

2. Evaluation Rules

evaluation_protocol = {
    "execution_environment": {
        "cpu_limit": "4 cores",
        "memory_limit": "8GB",
        "time_limit": 300,  # seconds
        "platform": "Rastion Playground"
    },
    "statistical_requirements": {
        "min_runs": 10,
        "confidence_level": 0.95,
        "significance_test": "Mann-Whitney U"
    },
    "submission_rules": {
        "algorithm_description": "required",
        "parameter_settings": "must be documented",
        "reproducibility": "code must be available"
    }
}
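
The significance test named in the protocol can be applied directly to per-run objective values from two algorithms. A minimal sketch using scipy.stats; the run values are made up for illustration.

from scipy.stats import mannwhitneyu

def compare_algorithms(values_a, values_b, alpha=0.05):
    """Two-sided Mann-Whitney U test on per-run objective values."""
    statistic, p_value = mannwhitneyu(values_a, values_b, alternative="two-sided")
    return {"statistic": statistic, "p_value": p_value, "significant": p_value < alpha}

runs_a = [7542, 7598, 7542, 7610, 7542, 7555, 7542, 7542, 7580, 7542]
runs_b = [7700, 7655, 7712, 7689, 7642, 7702, 7668, 7731, 7690, 7675]
print(compare_algorithms(runs_a, runs_b))

The alpha of 0.05 corresponds to the 0.95 confidence level required by the protocol.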

Quality Assurance

1. Instance Validation

def validate_benchmark_instance(instance):
    checks = {
        "format_valid": validate_json_format(instance),
        "solution_exists": check_reference_solution(instance),
        "solution_verified": verify_optimal_solution(instance),
        "metadata_complete": check_required_metadata(instance),
        "difficulty_appropriate": assess_difficulty_level(instance)
    }
    return all(checks.values()), checks

# Validate all instances
validation_results = []
for instance in benchmark_instances:
    is_valid, details = validate_benchmark_instance(instance)
    validation_results.append({
        "instance": instance["name"],
        "valid": is_valid,
        "details": details
    })

2. Reference Solution Verification

def verify_reference_solutions():
    for problem, solution in problem_solution_pairs:
        # Verify solution feasibility
        assert is_feasible_solution(problem, solution["tour"])
        
        # Verify objective value calculation
        calculated_value = calculate_objective(problem, solution["tour"])
        assert abs(calculated_value - solution["objective_value"]) < 1e-6
        
        # Verify optimality claim (if applicable)
        if solution["solution_type"] == "optimal":
            assert verify_optimality(problem, solution)

Benchmark Management

1. Benchmark Maintenance

Version Control

  • Repository Updates: Update benchmark repositories with new instances
  • Version Tracking: Maintain compatibility across benchmark versions
  • Documentation: Keep comprehensive documentation updated
  • Community Feedback: Incorporate community suggestions and improvements

Quality Assurance

Re-run the instance validation routine from the Quality Assurance section above whenever instances are added or reference solutions change, so that every released version passes the same format, solution, and metadata checks.

2. Community Integration

Sharing and Collaboration

  • Public Repositories: Make benchmark repositories publicly accessible
  • Community Contributions: Accept community-contributed problem instances
  • Collaborative Improvement: Work with community to enhance benchmarks
  • Knowledge Sharing: Document best practices and lessons learned

Integration with Platform Features

  • Leaderboard Problems: Use benchmark instances for leaderboard competitions
  • Playground Testing: Enable easy testing of algorithms on benchmark problems
  • Experiment Templates: Provide experiment templates using benchmark problems
  • Performance Tracking: Monitor algorithm performance across benchmark instances

Maintenance and Updates

1. Version Control

# Update benchmark with new instances
benchmark.add_version("1.1.0", changes=[
    "Added 10 new large-scale instances",
    "Fixed coordinate precision in 3 instances", 
    "Updated reference solutions for 2 instances"
])

# Maintain backward compatibility
benchmark.set_compatibility({
    "1.0.0": "fully_compatible",
    "1.1.0": "current_version"
})

2. Community Feedback

  • Issue Tracking: Monitor reported problems
  • Solution Updates: Incorporate better reference solutions
  • Instance Additions: Add community-contributed instances
  • Protocol Refinements: Improve evaluation procedures

Best Practices

1. Benchmark Design

  • Diversity: Include varied problem characteristics
  • Scalability: Cover different problem sizes
  • Difficulty: Range from easy to extremely challenging
  • Relevance: Include real-world inspired instances

2. Documentation

# Benchmark Documentation Template

## Overview
- Purpose and scope
- Target algorithms
- Evaluation objectives

## Instance Description
- Problem types and characteristics
- Size distribution
- Difficulty levels
- Source and generation methods

## Evaluation Protocol
- Performance metrics
- Execution environment
- Statistical requirements
- Submission guidelines

## Reference Solutions
- Optimality verification
- Solution quality assessment
- Computational methods used

## Usage Instructions
- How to download instances
- Submission process
- Result interpretation

Getting Started with Benchmarks

1. Current Access

Explore Existing Benchmarks

  • Visit Benchmark Page: Go to rastion.com/benchmark
  • Study Examples: Learn from existing benchmark implementations
  • Understand Structure: See how successful benchmarks are organized
  • Performance Analysis: Review algorithm performance on standard benchmarks

Contribute to Leaderboards

  • Admin Access: Contact platform administrators for benchmark creation permissions
  • Problem Contribution: Contribute high-quality problem instances
  • Community Engagement: Participate in benchmark discussions and improvements

2. Future Development

Platform Evolution

The benchmark system continues to evolve with:

  • Enhanced Creation Tools: Improved interfaces for benchmark creation
  • Automated Validation: Better tools for verifying benchmark quality
  • Community Features: Enhanced collaboration and sharing capabilities
  • Integration Improvements: Better integration with leaderboards and experiments

Next Steps

Benchmarks are essential for advancing optimization research. By contributing high-quality problem instances and participating in benchmark development, you help create valuable resources for the entire optimization community!