Round-Robin Provider System

Overview

The Metamock AI image generation system uses round-robin load balancing to distribute image generation requests across multiple AI providers. Spreading requests this way keeps generation working when one provider is unavailable and shares load and cost between providers.

Architecture

Core Components

  1. AI Image Generator (ai_image_generator.py) - Central orchestrator
  2. Provider Implementations - Individual provider classes
  3. Base Provider Interface (base_provider.py) - Common provider contract
  4. Round-Robin Logic - Automatic provider rotation

File Structure

server/backend/services/ai_image_generator/
├── ai_image_generator.py          # Main orchestrator with round-robin
├── base_provider.py               # Abstract provider interface
├── replicate_provider.py          # Replicate API provider
├── google_provider.py             # Google Nano Banana provider
└── __init__.py

Currently Available Providers

1. Replicate Provider

  • Name: replicate
  • Priority: 1 (highest - used first in rotation)
  • Model: bytedance/seedream-4
  • Rate Limit: 50 requests/minute
  • Max Concurrent: 5
  • Status: ✅ Enabled
  • Required Env: REPLICATE_API_TOKEN

2. Google Nano Banana Provider

  • Name: google_nano_banana
  • Priority: 2 (used second in rotation)
  • Model: Gemini 3 Pro Image
  • Rate Limit: 1000 requests/minute
  • Max Concurrent: 20
  • Status: ✅ Enabled
  • Required Env: GOOGLE_API_KEY

Smart Routing & Round-Robin Behavior

⚠️ IMPORTANT: The system uses dimension-aware smart routing that determines provider selection based on input type. Round-robin is NOT guaranteed for all requests.

Input Type Detection (ai_image_generator.py:50-60)

The system analyzes input parameters to determine routing strategy:

def _detect_input_type(self, input_params: Dict[str, Any]) -> str:
    if "width" in input_params and "height" in input_params:
        return "exact_dimensions"    # → Force Replicate only
    elif "aspect_ratio" in input_params:
        return "aspect_ratio"        # → Round-robin compatible providers
    else:
        return "default"             # → Round-robin all providers

Provider Selection Logic

1. Exact Dimensions (width + height specified)

Input: {width: 1920, height: 1080}
Route: FORCE Replicate only - NO round-robin
Reason: Only Replicate supports precise pixel dimensions

Behavior:

Request 1 → replicate (forced)
Request 2 → replicate (forced)
Request 3 → replicate (forced)
ALL requests → replicate

2. Aspect Ratio (aspect_ratio specified)

Input: {aspect_ratio: "16:9"}
Route: Round-robin between compatible providers
Compatible: Replicate + Google Nano Banana

Behavior:

Request 1 → replicate (index 0)
Request 2 → google_nano_banana (index 1)  
Request 3 → replicate (index 0)
Request 4 → google_nano_banana (index 1)

3. Default (no dimensions specified)

Input: {prompt: "a cat"}
Route: Round-robin between all available providers

Behavior:

Request 1 → provider[0] (based on priority)
Request 2 → provider[1] 
Request 3 → provider[0]
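
The three routes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in ai_image_generator.py: the Router and Provider classes, the supports_* flags, and select() are hypothetical names. It also assumes that forced exact-dimension requests do not advance the rotation counter and that aspect-ratio and default requests share one counter. Only the input-type rules, priority ordering, and the use of threading.Lock come from this document.

```python
import threading
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Provider:
    """Hypothetical stand-in for a loaded provider."""
    name: str
    priority: int
    supports_exact_dimensions: bool = False
    supports_aspect_ratio: bool = True


class Router:
    """Illustrative dimension-aware round-robin selector (assumed design)."""

    def __init__(self, providers: List[Provider]) -> None:
        # Lower priority number comes first in the rotation.
        self.providers = sorted(providers, key=lambda p: p.priority)
        self._index = 0
        self._lock = threading.Lock()

    @staticmethod
    def detect_input_type(input_params: Dict[str, Any]) -> str:
        if "width" in input_params and "height" in input_params:
            return "exact_dimensions"
        if "aspect_ratio" in input_params:
            return "aspect_ratio"
        return "default"

    def select(self, input_params: Dict[str, Any]) -> Provider:
        input_type = self.detect_input_type(input_params)
        if input_type == "exact_dimensions":
            candidates = [p for p in self.providers if p.supports_exact_dimensions]
        elif input_type == "aspect_ratio":
            candidates = [p for p in self.providers if p.supports_aspect_ratio]
        else:
            candidates = list(self.providers)
        if not candidates:
            raise RuntimeError(f"no provider can serve a '{input_type}' request")
        if input_type == "exact_dimensions":
            # Forced route: assumed not to advance the rotation counter.
            return candidates[0]
        with self._lock:
            # The lock keeps read-and-increment atomic across threads.
            provider = candidates[self._index % len(candidates)]
            self._index += 1
        return provider


router = Router([
    Provider("google_nano_banana", priority=2),
    Provider("replicate", priority=1, supports_exact_dimensions=True),
])
for _ in range(4):
    print(router.select({"aspect_ratio": "16:9"}).name)
```

Because the counter lives on the router instance rather than on a job, a rotation started in one bulk job carries on into the next, which matches the Job A / Job B behavior described in this document.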

Bulk Generation Real-World Behavior

Bulk generation behavior depends on preset configuration:

Case A: Preset with exact dimensions

{
  "config": {
    "dimensions": {
      "width": 1920,
      "height": 1080
    }
  }
}

Result: ALL images use Replicate only - NO round-robin

Case B: Preset with aspect ratio

{
  "config": {
    "dimensions": {
      "aspect_ratio": "16:9"
    }
  }
}

Result: Images alternate between providers

Job A:
├── Image 1 → replicate
├── Image 2 → google_nano_banana
├── Image 3 → replicate
└── Image 4 → google_nano_banana

Job B (continues rotation):
├── Image 1 → replicate 
└── Image 2 → google_nano_banana

Provider Configuration

Provider Config Structure

@dataclass
class ProviderConfig:
    name: str                    # Unique identifier
    rate_limit_per_minute: int   # API rate limits
    max_concurrent: int          # Concurrent request limit
    enabled: bool = True         # Enable/disable provider
    priority: int = 1            # Loading order (lower = higher priority)

Environment Variables Required

Development (.env.dev)

# Replicate
REPLICATE_API_TOKEN=r8_your_replicate_token

# Google 
GOOGLE_API_KEY=your_google_ai_api_key

Production (.env.prod)

# Replicate
REPLICATE_API_TOKEN=r8_your_production_replicate_token

# Google
GOOGLE_API_KEY=your_production_google_ai_api_key
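
A quick way to confirm that both variables are visible to the process is a small check like the one below. It is a sketch, not part of the codebase: the REQUIRED_ENV mapping and the missing_provider_env() helper are hypothetical, and only the provider names and variable names are taken from this document.

```python
import os
from typing import Dict, List, Mapping

# Provider name -> required environment variable (values from this document).
REQUIRED_ENV: Dict[str, str] = {
    "replicate": "REPLICATE_API_TOKEN",
    "google_nano_banana": "GOOGLE_API_KEY",
}


def missing_provider_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the providers whose required variable is unset or empty."""
    return [name for name, var in REQUIRED_ENV.items() if not env.get(var)]


if __name__ == "__main__":
    missing = missing_provider_env()
    if missing:
        print("Providers that will not load:", ", ".join(missing))
    else:
        print("All provider credentials are set")
```

Running this inside the backend or Celery worker container, rather than on the host, checks the environment the providers actually see.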

Provider Loading Process

1. Automatic Discovery

The system automatically discovers providers by:

  • Scanning all files in the ai_image_generator/ directory
  • Finding classes that inherit from AIImageProvider
  • Calling create_from_env() on each provider class

2. Validation

Each provider must:

  • Have its required environment variables set
  • Pass the is_available() check
  • Be marked as enabled=True

3. Priority Sorting

Providers are sorted by priority (ascending):

providers.sort(key=lambda p: getattr(p.config, 'priority', float('inf')))
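
The validation and sorting steps can be put together as in the sketch below. StubProvider and filter_and_sort() are hypothetical names used only for illustration. ProviderConfig mirrors the dataclass shown in this document, and the sort key is the one shown above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProviderConfig:
    name: str
    rate_limit_per_minute: int
    max_concurrent: int
    enabled: bool = True
    priority: int = 1


class StubProvider:
    """Hypothetical provider used only to exercise the filter."""

    def __init__(self, config: ProviderConfig, api_key: Optional[str]) -> None:
        self.config = config
        self.api_key = api_key

    def is_available(self) -> bool:
        return self.api_key is not None


def filter_and_sort(candidates: List[Optional[StubProvider]]) -> List[StubProvider]:
    # create_from_env() may return None when the API key is missing.
    usable = [
        p for p in candidates
        if p is not None and p.config.enabled and p.is_available()
    ]
    # Ascending sort: priority 1 comes before priority 2.
    usable.sort(key=lambda p: getattr(p.config, "priority", float("inf")))
    return usable


loaded = filter_and_sort([
    StubProvider(ProviderConfig("google_nano_banana", 1000, 20, priority=2), "key"),
    None,
    StubProvider(ProviderConfig("replicate", 50, 5, priority=1), "key"),
    StubProvider(ProviderConfig("disabled", 10, 1, enabled=False, priority=0), "key"),
])
print([p.config.name for p in loaded])
```

Note that a disabled provider is dropped even when its priority would place it first, so priority only orders the providers that pass validation.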

Adding New Providers

1. Create Provider Class

# new_provider.py
import os
from typing import Any, Dict, Optional

from .base_provider import AIImageProvider, ProviderConfig

class NewProvider(AIImageProvider):
    def __init__(self, api_key: str, config: Optional[ProviderConfig] = None):
        if config is None:
            config = ProviderConfig(
                name="new_provider",
                rate_limit_per_minute=100,
                max_concurrent=10,
                enabled=True,
                priority=3  # Set appropriate priority
            )
        super().__init__(config)
        self.api_key = api_key

    @classmethod
    def create_from_env(cls) -> Optional["NewProvider"]:
        # Return None when the key is missing so discovery skips this provider
        api_key = os.getenv('NEW_PROVIDER_API_KEY')
        if not api_key:
            return None
        return cls(api_key)

    def is_available(self) -> bool:
        # An empty string counts as "no key"
        return bool(self.api_key)

    def generate_image(self, model_id: str, input_params: Dict[str, Any]) -> str:
        # Implementation here
        raise NotImplementedError

2. Add Environment Variable

# .env.dev and .env.prod
NEW_PROVIDER_API_KEY=your_api_key_here

3. Restart Services

docker-compose restart celery-worker
docker-compose restart metamock-backend

Troubleshooting

Issue: Only One Provider Loading

Symptoms: All images use the same provider

Cause: Other providers are disabled or missing API keys

Solution:

  1. Check that the environment variables are set
  2. Verify enabled=True in the provider config
  3. Restart services after changes

Issue: Round-Robin Not Working in Bulk Generation

Symptoms: All images in a bulk job use the same provider

Possible Causes:

  1. Smart Routing: Preset uses exact dimensions (width/height) → Forces Replicate only
  2. Provider Availability: Only one provider is available or enabled
  3. Configuration: Aspect ratio not properly set in preset config

Solution:

  1. Check preset configuration - Use aspect_ratio instead of width/height for round-robin
  2. Verify multiple providers are loaded and available
  3. Check each provider's is_available() status
  4. Review bulk generation logs for routing decisions
  5. Look for log messages: 🎯 EXACT DIMENSIONS: Routing to REPLICATE vs 🎯 ASPECT RATIO: Round-robin

Issue: Provider Not Available

Symptoms: Provider shows Available: False

Common Causes:

  • Missing or invalid API key
  • enabled=False in configuration
  • Provider's is_available() method failing

Monitoring

Check Loaded Providers

from backend.services.ai_image_generator.ai_image_generator import get_ai_image_generator

generator = get_ai_image_generator()
for provider in generator.providers:
    print(f"{provider.config.name}: {provider.is_available()}")

Bulk Generation Logs

Look for these log patterns to understand routing behavior:

Smart Routing Decision Logs:

🎯 EXACT DIMENSIONS: Routing to REPLICATE (precise dimensions required)
🎯 ASPECT RATIO: Round-robin → REPLICATE
🎯 ASPECT RATIO: Round-robin → GOOGLE_NANO_BANANA
🎯 DEFAULT: Round-robin → PROVIDER_NAME

Generation Result Logs:

🏷️ Generation X: Used provider 'REPLICATE' (priority: 1)
🏷️ Generation X: Used provider 'GOOGLE_NANO_BANANA' (priority: 2)
🔍 BULK DEBUG: generation_result provider_name=replicate
🔍 STEP4 DEBUG: First generation provider_name=google_nano_banana, provider_priority=2

Config Processing Logs:

🎯 Generation X: Extracted aspect ratio 16:9 from config dimensions
🎯 Generation X: Found aspect ratio at ROOT level: 1:1
❌ Generation X: NO aspect_ratio found anywhere in config!

Analytics Tracking

Provider usage is automatically tracked in:

  • S3 analytics files (analytics.json)
  • Database records with provider information
  • Celery logs with provider details

Performance Considerations

Provider Priority Strategy

  • Priority 1: Fast, reliable providers (e.g., Replicate)
  • Priority 2: High-capacity providers (e.g., Google)
  • Priority 3+: Backup/experimental providers

Rate Limiting

Each provider enforces its own rate limits:

  • Providers track request timestamps
  • Automatic backoff when limits are exceeded
  • Graceful fallback to the next available provider
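
This document does not show how the timestamp tracking is implemented. One common approach is a sliding 60-second window, sketched below. SlidingWindowLimiter and try_acquire() are hypothetical names, and the injectable clock is there only to make the example testable. The sketch is not thread-safe on its own, so a real provider would need to guard it with a lock.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `limit_per_minute` requests in any 60-second window."""

    def __init__(
        self,
        limit_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit_per_minute
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        now = self._clock()
        # Drop timestamps that have left the 60-second window.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False  # caller can back off or try the next provider
        self._stamps.append(now)
        return True


limiter = SlidingWindowLimiter(limit_per_minute=50)
if limiter.try_acquire():
    print("request allowed")
```

A False return is the point where a caller would either wait or fall back to the next available provider.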

Concurrent Request Management

  • Each provider has max_concurrent limit
  • Prevents overwhelming individual APIs
  • Ensures stable performance across all providers
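
A max_concurrent limit is typically enforced with a counting semaphore. The sketch below shows that idea. ConcurrencyGate and slot() are hypothetical names, and the actual providers may enforce the limit differently.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class ConcurrencyGate:
    """Cap the number of in-flight requests for one provider."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def slot(self) -> Iterator[bool]:
        # Non-blocking acquire: yields False when the provider is saturated.
        acquired = self._slots.acquire(blocking=False)
        try:
            yield acquired
        finally:
            if acquired:
                self._slots.release()


gate = ConcurrencyGate(max_concurrent=5)
with gate.slot() as ok:
    if ok:
        print("slot acquired, safe to call the provider API")
    else:
        print("provider busy, try another provider")
```

The non-blocking acquire lets the caller move on to another provider instead of queueing behind a saturated one. A blocking acquire with a timeout would be the alternative when waiting is acceptable.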

Best Practices

  1. Always have multiple providers enabled for redundancy
  2. Use aspect ratios for round-robin - Set presets with aspect_ratio instead of width/height for load balancing
  3. Use exact dimensions sparingly - Only when precise pixel control is required (forces Replicate only)
  4. Set appropriate rate limits based on API provider specifications
  5. Monitor provider performance through logs and analytics
  6. Test new providers thoroughly before enabling in production
  7. Keep API keys secure and rotate regularly
  8. Balance priorities based on cost, speed, and quality requirements
  9. Check routing logs to understand which providers are being selected

Technical Implementation Details

Smart Routing Logic (ai_image_generator.py)

  • Feature Flag: dimension_routing_enabled = True (hardcoded - always enabled)
  • Thread Safety: Uses threading.Lock() for concurrent request handling
  • Provider Discovery: Auto-scans *_provider.py files and calls create_from_env()
  • Capability Matching: Routes based on provider capabilities vs request requirements

Key Files

  • Main Orchestrator: server/backend/services/ai_image_generator/ai_image_generator.py
  • Bulk Integration: server/backend/services/bulk_gen/bulk_image_service.py:137
  • Enhanced Config: bulk_image_service.py:115 - Normalizes aspect ratio to root level
  • Step 4 Handler: server/backend/celery/tasks/bulk_gen/step_4_generate_images.py

Enhanced Config Processing

The bulk service processes preset configurations to extract aspect ratios:

# Extracts aspect_ratio from nested dimensions and moves to root level
if 'dimensions' in config and 'aspect_ratio' in config['dimensions']:
    enhanced_config['aspect_ratio'] = config['dimensions']['aspect_ratio']

This ensures compatibility with smart routing input type detection.
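
As a self-contained illustration of that normalization, the sketch below wraps the same check in a function. normalize_aspect_ratio() is a hypothetical name, and copying the config instead of mutating it is an assumption. Only the nested-to-root move comes from the snippet above.

```python
import copy
from typing import Any, Dict


def normalize_aspect_ratio(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with dimensions.aspect_ratio lifted to the root."""
    enhanced_config = copy.deepcopy(config)
    dimensions = enhanced_config.get("dimensions")
    if isinstance(dimensions, dict) and "aspect_ratio" in dimensions:
        enhanced_config["aspect_ratio"] = dimensions["aspect_ratio"]
    return enhanced_config


preset_config = {"dimensions": {"aspect_ratio": "16:9"}}
enhanced = normalize_aspect_ratio(preset_config)
print(enhanced["aspect_ratio"])
```

With aspect_ratio at the root, input type detection sees the key directly and routes the request through the aspect-ratio round-robin path. A config that only carries width and height is returned unchanged.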

Future Enhancements

  • Configurable dimension routing via environment variables
  • Weighted round-robin based on provider performance
  • Dynamic provider enabling/disabling based on health checks
  • Cost-based routing for budget optimization
  • Quality-based provider selection per image type
  • Real-time provider monitoring dashboard
  • Fallback dimension conversion when exact dimensions requested but Replicate unavailable