Round-Robin Provider System

Overview

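The Metamock AI image generation system uses round-robin load balancing to distribute image generation requests across multiple AI providers. Spreading requests this way reduces dependence on any single provider, helps keep traffic within each provider's rate limits, and distributes cost. Rotation applies only to requests that more than one provider can serve; see Smart Routing & Round-Robin Behavior.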

Architecture

Core Components

AI Image Generator (ai_image_generator.py) - Central orchestrator
Provider Implementations - Individual provider classes
Base Provider Interface (base_provider.py) - Common provider contract
Round-Robin Logic - Automatic provider rotation

File Structure

server/backend/services/ai_image_generator/
├── ai_image_generator.py          # Main orchestrator with round-robin
├── base_provider.py               # Abstract provider interface
├── replicate_provider.py          # Replicate API provider
├── google_provider.py             # Google Nano Banana provider
└── __init__.py

Currently Available Providers

1. Replicate Provider

Name: replicate
Priority: 1 (highest - used first in rotation)
Model: bytedance/seedream-4
Rate Limit: 50 requests/minute
Max Concurrent: 5
Status: ✅ Enabled
Required Env: REPLICATE_API_TOKEN

2. Google Nano Banana Provider

Name: google_nano_banana
Priority: 2 (used second in rotation)
Model: Gemini 3 Pro Image
Rate Limit: 1000 requests/minute
Max Concurrent: 20
Status: ✅ Enabled
Required Env: GOOGLE_API_KEY

Smart Routing & Round-Robin Behavior

⚠️ IMPORTANT: The system uses dimension-aware smart routing that determines provider selection based on input type. Round-robin is NOT guaranteed for all requests.

Input Type Detection (`ai_image_generator.py:50-60`)

The system analyzes input parameters to determine routing strategy:

def _detect_input_type(self, input_params: Dict[str, Any]) -> str:
    if "width" in input_params and "height" in input_params:
        return "exact_dimensions"    # → Force Replicate only
    elif "aspect_ratio" in input_params:
        return "aspect_ratio"        # → Round-robin compatible providers
    else:
        return "default"             # → Round-robin all providers
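The detection rules can be tried outside the class with a small standalone sketch. The free function name `detect_input_type` and the sample inputs are illustrative assumptions; only the three return values and the key checks come from the method above.

```python
from typing import Any, Dict


def detect_input_type(input_params: Dict[str, Any]) -> str:
    """Classify a request using the same key checks as the method above."""
    # Both width and height must be present to count as exact dimensions.
    if "width" in input_params and "height" in input_params:
        return "exact_dimensions"
    if "aspect_ratio" in input_params:
        return "aspect_ratio"
    return "default"


print(detect_input_type({"width": 1920, "height": 1080}))
print(detect_input_type({"aspect_ratio": "16:9"}))
print(detect_input_type({"prompt": "a cat"}))
# width without height does not count as exact dimensions
print(detect_input_type({"width": 1920, "aspect_ratio": "16:9"}))
```

Note that a request carrying width, height, and aspect_ratio together is classified as exact_dimensions, because that check runs first.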

Provider Selection Logic

1. Exact Dimensions (width + height specified)

Input: {width: 1920, height: 1080}
Route: FORCE Replicate only - NO round-robin
Reason: Only Replicate supports precise pixel dimensions

Behavior:

Request 1 → replicate (forced)
Request 2 → replicate (forced)
Request 3 → replicate (forced)
ALL requests → replicate

2. Aspect Ratio (aspect_ratio specified)

Input: {aspect_ratio: "16:9"}
Route: Round-robin between compatible providers
Compatible: Replicate + Google Nano Banana

Behavior:

Request 1 → replicate (index 0)
Request 2 → google_nano_banana (index 1)  
Request 3 → replicate (index 0)
Request 4 → google_nano_banana (index 1)

3. Default (no dimensions specified)

Input: {prompt: "a cat"}
Route: Round-robin between all available providers

Behavior:

Request 1 → provider[0] (based on priority)
Request 2 → provider[1] 
Request 3 → provider[0]
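The three cases above can be sketched as a small thread-safe selector. This is a minimal illustration, not the code in ai_image_generator.py: the class name `RoundRobinRouter`, its constructor arguments, and the use of a single rotation counter shared by the aspect-ratio and default cases are all assumptions.

```python
import threading
from typing import Any, Dict, List


class RoundRobinRouter:
    """Hypothetical selector: forced provider for exact dimensions, rotation otherwise."""

    def __init__(self, providers: List[str], exact_dimension_provider: str = "replicate"):
        if not providers:
            raise ValueError("at least one provider is required")
        self._providers = list(providers)   # assumed to be sorted by priority already
        self._exact = exact_dimension_provider
        self._index = 0                      # shared rotation counter (assumption)
        self._lock = threading.Lock()

    def select(self, input_params: Dict[str, Any]) -> str:
        if "width" in input_params and "height" in input_params:
            return self._exact               # forced route; the counter is not advanced
        with self._lock:                     # keep read-and-increment atomic across threads
            provider = self._providers[self._index % len(self._providers)]
            self._index += 1
        return provider


router = RoundRobinRouter(["replicate", "google_nano_banana"])
picks = [router.select({"aspect_ratio": "16:9"}) for _ in range(4)]
print(picks)
print(router.select({"width": 1920, "height": 1080}))
```

Because the counter lives on the router rather than on a single job, a second batch of requests picks up the rotation where the first one stopped.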

Bulk Generation Real-World Behavior

Bulk generation behavior depends on preset configuration:

Case A: Preset with exact dimensions

{
  "config": {
    "dimensions": {
      "width": 1920,
      "height": 1080
    }
  }
}

Result: ALL images use Replicate only - NO round-robin

Case B: Preset with aspect ratio

{
  "config": {
    "dimensions": {
      "aspect_ratio": "16:9"
    }
  }
}

Result: Images alternate between providers

Job A:
├── Image 1 → replicate
├── Image 2 → google_nano_banana
├── Image 3 → replicate
└── Image 4 → google_nano_banana

Job B (continues rotation):
├── Image 1 → replicate 
└── Image 2 → google_nano_banana

Provider Configuration

Provider Config Structure

from dataclasses import dataclass

@dataclass
class ProviderConfig:
    name: str                    # Unique identifier
    rate_limit_per_minute: int   # Provider API rate limit (requests per minute)
    max_concurrent: int          # Concurrent request limit
    enabled: bool = True         # Enable/disable provider
    priority: int = 1            # Sort order (lower value = earlier in rotation)

Environment Variables Required

Development (.env.dev)

# Replicate
REPLICATE_API_TOKEN=r8_your_replicate_token

# Google 
GOOGLE_API_KEY=your_google_ai_api_key

Production (.env.prod)

# Replicate
REPLICATE_API_TOKEN=r8_your_production_replicate_token

# Google
GOOGLE_API_KEY=your_production_google_ai_api_key
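Before starting the services, it can help to confirm that each provider's variable is actually set. The helper below is a hedged sketch: `check_provider_env` and the `REQUIRED_ENV` mapping are hypothetical names, and only the two variable names are taken from this document.

```python
import os
from typing import Dict, Mapping

# Provider name -> required environment variable (names taken from the sections above).
REQUIRED_ENV = {
    "replicate": "REPLICATE_API_TOKEN",
    "google_nano_banana": "GOOGLE_API_KEY",
}


def check_provider_env(environ: Mapping[str, str] = os.environ) -> Dict[str, bool]:
    """Return, per provider, whether its variable is set to a non-blank value."""
    return {
        name: bool(environ.get(var, "").strip())
        for name, var in REQUIRED_ENV.items()
    }


for name, is_set in check_provider_env().items():
    print(f"{name}: {'set' if is_set else 'MISSING'}")
```

This only checks presence; it does not verify that a key is valid with the provider.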

Provider Loading Process

1. Automatic Discovery

The system automatically discovers providers by:
- Scanning all files in the ai_image_generator/ directory
- Finding classes that inherit from AIImageProvider
- Calling create_from_env() on each provider class

2. Validation

Each provider must:
- Have required environment variables set
- Pass is_available() check
- Be marked as enabled=True

3. Priority Sorting

Providers are sorted by priority (ascending):

providers.sort(key=lambda p: getattr(p.config, 'priority', float('inf')))
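The validation and sorting steps can be sketched together as follows. `StubProvider`, the trimmed `ProviderConfig`, and `load_providers` are stand-ins invented for this example; the real classes live in base_provider.py and may differ. The sort key is the one shown above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProviderConfig:
    # Trimmed stand-in for the real config; only the fields used here.
    name: str
    enabled: bool = True
    priority: int = 1


class StubProvider:
    """Stand-in for a real provider class; not part of the codebase."""

    def __init__(self, config: ProviderConfig, available: bool = True):
        self.config = config
        self._available = available

    def is_available(self) -> bool:
        return self._available


def load_providers(candidates: List[Optional[StubProvider]]) -> List[StubProvider]:
    # Drop entries where create_from_env() returned None, then apply both checks.
    usable = [
        p for p in candidates
        if p is not None and p.config.enabled and p.is_available()
    ]
    # Ascending priority; a config without the attribute sorts last.
    usable.sort(key=lambda p: getattr(p.config, "priority", float("inf")))
    return usable


candidates = [
    StubProvider(ProviderConfig("google_nano_banana", priority=2)),
    None,  # e.g. create_from_env() found no API key
    StubProvider(ProviderConfig("disabled_provider", enabled=False, priority=0)),
    StubProvider(ProviderConfig("replicate", priority=1)),
]
loaded = [p.config.name for p in load_providers(candidates)]
print(loaded)
```

The disabled provider is dropped even though its priority value is the lowest, so it never enters the rotation.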

Adding New Providers

1. Create Provider Class

# new_provider.py
import os
from typing import Any, Dict, Optional

from .base_provider import AIImageProvider, ProviderConfig


class NewProvider(AIImageProvider):
    def __init__(self, api_key: str, config: Optional[ProviderConfig] = None):
        if config is None:
            config = ProviderConfig(
                name="new_provider",
                rate_limit_per_minute=100,
                max_concurrent=10,
                enabled=True,
                priority=3,  # Set appropriate priority (lower values come first)
            )
        super().__init__(config)
        self.api_key = api_key

    @classmethod
    def create_from_env(cls) -> Optional["NewProvider"]:
        # Return None when the key is missing so discovery skips this provider
        api_key = os.getenv("NEW_PROVIDER_API_KEY")
        if not api_key:
            return None
        return cls(api_key)

    def is_available(self) -> bool:
        # An empty key counts as unavailable
        return bool(self.api_key)

    def generate_image(self, model_id: str, input_params: Dict[str, Any]) -> str:
        # Call the provider's API here and return the result as a string
        raise NotImplementedError

2. Add Environment Variable

# .env.dev and .env.prod
NEW_PROVIDER_API_KEY=your_api_key_here

3. Restart Services

docker-compose restart celery-worker
docker-compose restart metamock-backend

Troubleshooting

Issue: Only One Provider Loading

Symptoms: All images use the same provider

Cause: Other providers disabled or missing API keys

Solution:
1. Check environment variables are set
2. Verify enabled=True in provider config
3. Restart services after changes

Issue: Round-Robin Not Working in Bulk Generation

Symptoms: All images in a bulk job use the same provider

Possible Causes:
1. Smart Routing: Preset uses exact dimensions (width/height) → Forces Replicate only
2. Provider Availability: Only one provider available or enabled
3. Configuration: Aspect ratio not properly set in preset config

Solution:
1. Check preset configuration - Use aspect_ratio instead of width/height for round-robin
2. Verify multiple providers are loaded and available
3. Check provider is_available() status
4. Review bulk generation logs for routing decisions
5. Look for log messages: 🎯 EXACT DIMENSIONS: Routing to REPLICATE vs 🎯 ASPECT RATIO: Round-robin

Issue: Provider Not Available

Symptoms: Provider shows Available: False

Common Causes:
- Missing or invalid API key
- enabled=False in configuration
- Provider's is_available() method failing

Monitoring

Check Loaded Providers

from backend.services.ai_image_generator.ai_image_generator import get_ai_image_generator

generator = get_ai_image_generator()
for provider in generator.providers:
    print(f"{provider.config.name}: {provider.is_available()}")

Bulk Generation Logs

Look for these log patterns to understand routing behavior:

Smart Routing Decision Logs:

🎯 EXACT DIMENSIONS: Routing to REPLICATE (precise dimensions required)
🎯 ASPECT RATIO: Round-robin → REPLICATE
🎯 ASPECT RATIO: Round-robin → GOOGLE_NANO_BANANA
🎯 DEFAULT: Round-robin → PROVIDER_NAME

Generation Result Logs:

🏷️ Generation X: Used provider 'REPLICATE' (priority: 1)
🏷️ Generation X: Used provider 'GOOGLE_NANO_BANANA' (priority: 2)
🔍 BULK DEBUG: generation_result provider_name=replicate
🔍 STEP4 DEBUG: First generation provider_name=google_nano_banana, provider_priority=2

Config Processing Logs:

🎯 Generation X: Extracted aspect ratio 16:9 from config dimensions
🎯 Generation X: Found aspect ratio at ROOT level: 1:1
❌ Generation X: NO aspect_ratio found anywhere in config!
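To see how a bulk job was actually split between providers, the "Used provider" lines from the Generation Result logs can be tallied. This is a rough sketch: the function name `count_provider_usage` is hypothetical, the regular expression assumes the quoted-name format shown above, and the sample lines omit the emoji prefix for brevity.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the quoted provider name in lines like: Used provider 'REPLICATE' (priority: 1)
PROVIDER_RE = re.compile(r"Used provider '([^']+)'")


def count_provider_usage(lines: Iterable[str]) -> Counter:
    """Count generations per provider, lower-casing names for consistency."""
    counts: Counter = Counter()
    for line in lines:
        match = PROVIDER_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


sample_lines = [
    "Generation 1: Used provider 'REPLICATE' (priority: 1)",
    "Generation 2: Used provider 'GOOGLE_NANO_BANANA' (priority: 2)",
    "Generation 3: Used provider 'REPLICATE' (priority: 1)",
    "Generation 3: Extracted aspect ratio 16:9 from config dimensions",
]
usage = count_provider_usage(sample_lines)
print(dict(usage))
```

A tally where one provider has every generation usually points to exact-dimension routing or to a second provider that failed to load.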

Analytics Tracking

Provider usage is automatically tracked in:
- S3 analytics files (analytics.json)
- Database records with provider information
- Celery logs with provider details

Performance Considerations

Provider Priority Strategy

Priority 1: Fast, reliable providers (e.g., Replicate)
Priority 2: High-capacity providers (e.g., Google)
Priority 3+: Backup/experimental providers

Rate Limiting

Each provider enforces its own rate limits:
- Providers track request timestamps
- Automatic backoff when limits exceeded
- Graceful fallback to next available provider
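One common way to enforce a per-minute limit from request timestamps is a sliding window. The sketch below is an assumption about how such tracking could look, not the providers' actual code: `SlidingWindowLimiter` and `try_acquire` are hypothetical names, and the injectable clock exists only so the example can run without waiting.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `limit_per_minute` requests in any 60-second window."""

    def __init__(self, limit_per_minute: int, clock: Callable[[], float] = time.monotonic):
        self._limit = limit_per_minute
        self._clock = clock
        self._stamps: Deque[float] = deque()  # timestamps of accepted requests

    def try_acquire(self) -> bool:
        now = self._clock()
        # Forget requests that have left the 60-second window.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        if len(self._stamps) >= self._limit:
            return False                      # caller can back off or try another provider
        self._stamps.append(now)
        return True


fake_now = [0.0]                              # mutable cell acting as a fake clock
limiter = SlidingWindowLimiter(limit_per_minute=2, clock=lambda: fake_now[0])
results = [limiter.try_acquire() for _ in range(3)]
fake_now[0] = 61.0                            # move past the window
results.append(limiter.try_acquire())
print(results)
```

A `False` result is the point where the caller would fall back to the next available provider. This sketch is not thread-safe on its own; concurrent callers would need a lock around `try_acquire`.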

Concurrent Request Management

Each provider has max_concurrent limit
Prevents overwhelming individual APIs
Ensures stable performance across all providers
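A max_concurrent limit is typically enforced with a semaphore. The following is a minimal sketch under that assumption; `ConcurrencyGate` and `slot` are hypothetical names, and the document does not state how the providers implement the limit.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class ConcurrencyGate:
    """Cap the number of in-flight requests for one provider."""

    def __init__(self, max_concurrent: int):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def slot(self) -> Iterator[bool]:
        # Non-blocking: yields False immediately when every slot is taken.
        acquired = self._slots.acquire(blocking=False)
        try:
            yield acquired
        finally:
            if acquired:
                self._slots.release()


gate = ConcurrencyGate(max_concurrent=1)
with gate.slot() as first:
    with gate.slot() as second:               # the only slot is held by `first`
        outcome = (first, second)
with gate.slot() as third:                    # slot was released on exit above
    outcome += (third,)
print(outcome)
```

A caller that receives `False` should skip the API call, for example by waiting or by routing the request to another provider.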

Best Practices

Always have multiple providers enabled for redundancy
Use aspect ratios for round-robin - Set presets with aspect_ratio instead of width/height for load balancing
Use exact dimensions sparingly - Only when precise pixel control is required (forces Replicate only)
Set appropriate rate limits based on API provider specifications
Monitor provider performance through logs and analytics
Test new providers thoroughly before enabling in production
Keep API keys secure and rotate regularly
Balance priorities based on cost, speed, and quality requirements
Check routing logs to understand which providers are being selected

Technical Implementation Details

Smart Routing Logic (`ai_image_generator.py`)

Feature Flag: dimension_routing_enabled = True (hardcoded - always enabled)
Thread Safety: Uses threading.Lock() for concurrent request handling
Provider Discovery: Auto-scans *_provider.py files and calls create_from_env()
Capability Matching: Routes based on provider capabilities vs request requirements

Key Files

Main Orchestrator: server/backend/services/ai_image_generator/ai_image_generator.py
Bulk Integration: server/backend/services/bulk_gen/bulk_image_service.py:137
Enhanced Config: bulk_image_service.py:115 - Normalizes aspect ratio to root level
Step 4 Handler: server/backend/celery/tasks/bulk_gen/step_4_generate_images.py

Enhanced Config Processing

The bulk service processes preset configurations to extract aspect ratios:

# Extracts aspect_ratio from nested dimensions and moves to root level
if 'dimensions' in config and 'aspect_ratio' in config['dimensions']:
    enhanced_config['aspect_ratio'] = config['dimensions']['aspect_ratio']

This ensures compatibility with smart routing input type detection.
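A self-contained version of that normalization might look like the sketch below. The function name `enhance_config` and the choice to deep-copy the preset are assumptions; only the nested-to-root copy of aspect_ratio comes from the snippet above.

```python
import copy
from typing import Any, Dict


def enhance_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the preset config with aspect_ratio lifted to the root."""
    enhanced_config = copy.deepcopy(config)   # leave the caller's preset untouched
    dimensions = config.get("dimensions")
    if isinstance(dimensions, dict) and "aspect_ratio" in dimensions:
        enhanced_config["aspect_ratio"] = dimensions["aspect_ratio"]
    return enhanced_config


preset = {"dimensions": {"aspect_ratio": "16:9"}}
enhanced = enhance_config(preset)
print(enhanced)
print("aspect_ratio" in enhance_config({"dimensions": {"width": 1920, "height": 1080}}))
```

With aspect_ratio at the root, input type detection sees the key directly and classifies the request as aspect_ratio rather than default.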

Future Enhancements

Configurable dimension routing via environment variables
Weighted round-robin based on provider performance
Dynamic provider enabling/disabling based on health checks
Cost-based routing for budget optimization
Quality-based provider selection per image type
Real-time provider monitoring dashboard
Fallback dimension conversion when exact dimensions requested but Replicate unavailable
