Round-Robin Provider System
Overview
The Metamock AI image generation system implements a round-robin load balancing mechanism to distribute image generation requests across multiple AI providers. Spreading requests this way improves reliability and throughput and distributes cost across providers.
Architecture
Core Components
- AI Image Generator (ai_image_generator.py) - Central orchestrator
- Provider Implementations - Individual provider classes
- Base Provider Interface (base_provider.py) - Common provider contract
- Round-Robin Logic - Automatic provider rotation
File Structure
server/backend/services/ai_image_generator/
├── ai_image_generator.py # Main orchestrator with round-robin
├── base_provider.py # Abstract provider interface
├── replicate_provider.py # Replicate API provider
├── google_provider.py # Google Nano Banana provider
└── __init__.py
Currently Available Providers
1. Replicate Provider
- Name: replicate
- Priority: 1 (highest - used first in rotation)
- Model: bytedance/seedream-4
- Rate Limit: 50 requests/minute
- Max Concurrent: 5
- Status: ✅ Enabled
- Required Env: REPLICATE_API_TOKEN
2. Google Nano Banana Provider
- Name: google_nano_banana
- Priority: 2 (used second in rotation)
- Model: Gemini 3 Pro Image
- Rate Limit: 1000 requests/minute
- Max Concurrent: 20
- Status: ✅ Enabled
- Required Env: GOOGLE_API_KEY
Smart Routing & Round-Robin Behavior
⚠️ IMPORTANT: The system uses dimension-aware smart routing that determines provider selection based on input type. Round-robin is NOT guaranteed for all requests.
Input Type Detection (ai_image_generator.py:50-60)
The system analyzes input parameters to determine routing strategy:
    def _detect_input_type(self, input_params: Dict[str, Any]) -> str:
        if "width" in input_params and "height" in input_params:
            return "exact_dimensions"  # → Force Replicate only
        elif "aspect_ratio" in input_params:
            return "aspect_ratio"  # → Round-robin compatible providers
        else:
            return "default"  # → Round-robin all providers
Provider Selection Logic
1. Exact Dimensions (width + height specified)
Input: {width: 1920, height: 1080}
Route: FORCE Replicate only - NO round-robin
Reason: Only Replicate supports precise pixel dimensions
Behavior:
Request 1 → replicate (forced)
Request 2 → replicate (forced)
Request 3 → replicate (forced)
ALL requests → replicate
2. Aspect Ratio (aspect_ratio specified)
Input: {aspect_ratio: "16:9"}
Route: Round-robin between compatible providers
Compatible: Replicate + Google Nano Banana
Behavior:
Request 1 → replicate (index 0)
Request 2 → google_nano_banana (index 1)
Request 3 → replicate (index 0)
Request 4 → google_nano_banana (index 1)
3. Default (no dimensions specified)
Input: {prompt: "a cat"}
Route: Round-robin between all available providers
Behavior:
Request 1 → provider[0] (based on priority)
Request 2 → provider[1]
Request 3 → provider[0]
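The three routing cases above can be sketched in a few lines. This is a minimal illustration, not the code in ai_image_generator.py: the RoundRobinRouter class and its select method are hypothetical names, and the sketch assumes that the aspect-ratio and default cases share one rotation counter and that forced exact-dimension requests do not advance it. The source does not confirm either detail.

```python
import threading
from typing import Any, Dict, List


def detect_input_type(input_params: Dict[str, Any]) -> str:
    """Apply the detection rules described above."""
    if "width" in input_params and "height" in input_params:
        return "exact_dimensions"
    if "aspect_ratio" in input_params:
        return "aspect_ratio"
    return "default"


class RoundRobinRouter:
    """Hypothetical router used only for illustration."""

    def __init__(self, provider_names: List[str]) -> None:
        # Assumed to be pre-sorted by priority (lowest number first).
        self._names = list(provider_names)
        self._index = 0
        self._lock = threading.Lock()

    def select(self, input_params: Dict[str, Any]) -> str:
        if detect_input_type(input_params) == "exact_dimensions":
            # Exact pixel dimensions bypass the rotation entirely.
            if "replicate" not in self._names:
                raise RuntimeError("exact dimensions require the replicate provider")
            return "replicate"
        # The lock keeps the counter consistent under concurrent requests.
        with self._lock:
            name = self._names[self._index % len(self._names)]
            self._index += 1
        return name


router = RoundRobinRouter(["replicate", "google_nano_banana"])
picks = [router.select({"aspect_ratio": "16:9"}) for _ in range(4)]
forced = router.select({"width": 1920, "height": 1080})
print(picks, forced)
```

Because the counter lives on the router rather than on a job, the rotation continues across consecutive jobs in this sketch.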
Bulk Generation Real-World Behavior
Bulk generation behavior depends on preset configuration:
Case A: Preset with exact dimensions
{
  "config": {
    "dimensions": {
      "width": 1920,
      "height": 1080
    }
  }
}
Result: ALL images use Replicate only - NO round-robin
Case B: Preset with aspect ratio
{
  "config": {
    "dimensions": {
      "aspect_ratio": "16:9"
    }
  }
}
Result: Images alternate between providers
Job A:
├── Image 1 → replicate
├── Image 2 → google_nano_banana
├── Image 3 → replicate
└── Image 4 → google_nano_banana
Job B (continues rotation):
├── Image 1 → replicate
└── Image 2 → google_nano_banana
Provider Configuration
Provider Config Structure
    @dataclass
    class ProviderConfig:
        name: str                    # Unique identifier
        rate_limit_per_minute: int   # API rate limits
        max_concurrent: int          # Concurrent request limit
        enabled: bool = True         # Enable/disable provider
        priority: int = 1            # Loading order (lower = higher priority)
Environment Variables Required
Development (.env.dev)
# Replicate
REPLICATE_API_TOKEN=r8_your_replicate_token
# Google
GOOGLE_API_KEY=your_google_ai_api_key
Production (.env.prod)
# Replicate
REPLICATE_API_TOKEN=r8_your_production_replicate_token
# Google
GOOGLE_API_KEY=your_production_google_ai_api_key
Provider Loading Process
1. Automatic Discovery
The system automatically discovers providers by:
- Scanning the *_provider.py files in the ai_image_generator/ directory
- Finding classes that inherit from AIImageProvider
- Calling create_from_env() on each provider class
2. Validation
Each provider must:
- Have required environment variables set
- Pass is_available() check
- Be marked as enabled=True
3. Priority Sorting
Providers are sorted by priority (ascending):
providers.sort(key=lambda p: getattr(p.config, 'priority', float('inf')))
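The validation and sorting steps can be exercised without any real API keys. The sketch below is an illustration under assumptions: StubProvider and select_usable are hypothetical names standing in for real AIImageProvider subclasses and the loader, and the provider entries "disabled_one" and "no_key" are made up for the demo. Only the ProviderConfig fields and the sort key come from this document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProviderConfig:
    name: str
    rate_limit_per_minute: int
    max_concurrent: int
    enabled: bool = True
    priority: int = 1


class StubProvider:
    """Hypothetical stand-in for an AIImageProvider subclass."""

    def __init__(self, config: ProviderConfig, api_key: Optional[str]) -> None:
        self.config = config
        self.api_key = api_key

    def is_available(self) -> bool:
        return self.api_key is not None


def select_usable(candidates: List[StubProvider]) -> List[StubProvider]:
    # Step 2: keep only providers that are enabled and report themselves available.
    usable = [p for p in candidates if p.config.enabled and p.is_available()]
    # Step 3: ascending priority, so priority 1 is first in the rotation.
    usable.sort(key=lambda p: getattr(p.config, "priority", float("inf")))
    return usable


candidates = [
    StubProvider(ProviderConfig("google_nano_banana", 1000, 20, priority=2), "key-b"),
    StubProvider(ProviderConfig("disabled_one", 10, 1, enabled=False, priority=0), "key-c"),
    StubProvider(ProviderConfig("no_key", 10, 1, priority=0), None),
    StubProvider(ProviderConfig("replicate", 50, 5, priority=1), "key-a"),
]
order = [p.config.name for p in select_usable(candidates)]
print(order)
```

The disabled and keyless entries are dropped before sorting, so a low priority number alone does not put a provider into the rotation.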
Adding New Providers
1. Create Provider Class
    # new_provider.py
    import os
    from typing import Any, Dict, Optional

    from .base_provider import AIImageProvider, ProviderConfig


    class NewProvider(AIImageProvider):
        def __init__(self, api_key: str, config: Optional[ProviderConfig] = None):
            if config is None:
                config = ProviderConfig(
                    name="new_provider",
                    rate_limit_per_minute=100,
                    max_concurrent=10,
                    enabled=True,
                    priority=3,  # Set appropriate priority
                )
            super().__init__(config)
            self.api_key = api_key

        @classmethod
        def create_from_env(cls) -> Optional["NewProvider"]:
            # Return None when the key is missing so the provider is not loaded
            api_key = os.getenv("NEW_PROVIDER_API_KEY")
            if not api_key:
                return None
            return cls(api_key)

        def is_available(self) -> bool:
            return self.api_key is not None

        def generate_image(self, model_id: str, input_params: Dict[str, Any]) -> str:
            # Implementation here: call the provider API and return its string result
            raise NotImplementedError
2. Add Environment Variable
# .env.dev and .env.prod
NEW_PROVIDER_API_KEY=your_api_key_here
3. Restart Services
docker-compose restart celery-worker
docker-compose restart metamock-backend
Troubleshooting
Issue: Only One Provider Loading
Symptoms: All images use the same provider
Cause: Other providers disabled or missing API keys
Solution:
1. Check environment variables are set
2. Verify enabled=True in provider config
3. Restart services after changes
Issue: Round-Robin Not Working in Bulk Generation
Symptoms: All images in bulk job use same provider
Possible Causes:
1. Smart Routing: Preset uses exact dimensions (width/height) → Forces Replicate only
2. Provider Availability: Only one provider available or enabled
3. Configuration: Aspect ratio not properly set in preset config
Solution:
1. Check preset configuration - Use aspect_ratio instead of width/height for round-robin
2. Verify multiple providers are loaded and available
3. Check provider is_available() status
4. Review bulk generation logs for routing decisions
5. Look for log messages: 🎯 EXACT DIMENSIONS: Routing to REPLICATE vs 🎯 ASPECT RATIO: Round-robin
Issue: Provider Not Available
Symptoms: Provider shows Available: False
Common Causes:
- Missing or invalid API key
- enabled=False in configuration
- Provider's is_available() method failing
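A missing API key is the quickest cause to rule out. The helper below is a hypothetical diagnostic, not part of the codebase: missing_keys and the REQUIRED_ENV mapping are illustrative names, and the mapping simply restates the required variables listed for each provider earlier in this document.

```python
import os
from typing import Dict, List, Mapping

# Provider name -> environment variable it needs (from the provider list above).
REQUIRED_ENV: Dict[str, str] = {
    "replicate": "REPLICATE_API_TOKEN",
    "google_nano_banana": "GOOGLE_API_KEY",
}


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the providers whose required variable is unset or empty."""
    return [name for name, var in REQUIRED_ENV.items() if not env.get(var)]


# Passing a plain dict makes the check easy to try without touching the real environment.
print(missing_keys({"REPLICATE_API_TOKEN": "r8_example"}))
```

Calling missing_keys() with no argument inspects the current process environment, which is useful inside the backend or Celery worker container.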
Monitoring
Check Loaded Providers
    from backend.services.ai_image_generator.ai_image_generator import get_ai_image_generator

    generator = get_ai_image_generator()
    for provider in generator.providers:
        print(f"{provider.config.name}: {provider.is_available()}")
Bulk Generation Logs
Look for these log patterns to understand routing behavior:
Smart Routing Decision Logs:
🎯 EXACT DIMENSIONS: Routing to REPLICATE (precise dimensions required)
🎯 ASPECT RATIO: Round-robin → REPLICATE
🎯 ASPECT RATIO: Round-robin → GOOGLE_NANO_BANANA
🎯 DEFAULT: Round-robin → PROVIDER_NAME
Generation Result Logs:
🏷️ Generation X: Used provider 'REPLICATE' (priority: 1)
🏷️ Generation X: Used provider 'GOOGLE_NANO_BANANA' (priority: 2)
🔍 BULK DEBUG: generation_result provider_name=replicate
🔍 STEP4 DEBUG: First generation provider_name=google_nano_banana, provider_priority=2
Config Processing Logs:
🎯 Generation X: Extracted aspect ratio 16:9 from config dimensions
🎯 Generation X: Found aspect ratio at ROOT level: 1:1
❌ Generation X: NO aspect_ratio found anywhere in config!
Analytics Tracking
Provider usage is automatically tracked in:
- S3 analytics files (analytics.json)
- Database records with provider information
- Celery logs with provider details
Performance Considerations
Provider Priority Strategy
- Priority 1: Fast, reliable providers (e.g., Replicate)
- Priority 2: High-capacity providers (e.g., Google)
- Priority 3+: Backup/experimental providers
Rate Limiting
Each provider enforces its own rate limits:
- Providers track request timestamps
- Automatic backoff when limits exceeded
- Graceful fallback to next available provider
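Tracking request timestamps is commonly done with a sliding window. The sketch below shows that technique under assumptions: SlidingWindowLimiter and try_acquire are hypothetical names, the real providers may track timestamps differently, and this version is not thread-safe. A caller that receives False could back off or fall through to the next available provider.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Hypothetical per-provider limiter based on request timestamps."""

    def __init__(
        self,
        limit_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit_per_minute
        self._clock = clock  # injectable so the window can be tested without sleeping
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        now = self._clock()
        # Drop timestamps that have left the 60-second window.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        if len(self._stamps) >= self._limit:
            return False  # limit reached: caller should back off or fall back
        self._stamps.append(now)
        return True


# Demo with a fake clock: a limit of 2 requests per minute.
fake_now = [0.0]
limiter = SlidingWindowLimiter(2, clock=lambda: fake_now[0])
first_three = [limiter.try_acquire() for _ in range(3)]
fake_now[0] = 60.0  # one minute later the window has emptied
after_wait = limiter.try_acquire()
print(first_three, after_wait)
```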
Concurrent Request Management
- Each provider has max_concurrent limit
- Prevents overwhelming individual APIs
- Ensures stable performance across all providers
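One common way to enforce a max_concurrent limit is a bounded semaphore. The following is a sketch of that approach, not the providers' actual mechanism, which the source does not describe. ConcurrencyGate and slot are hypothetical names, and the non-blocking behavior (report failure instead of waiting) is an assumption made for the example.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class ConcurrencyGate:
    """Hypothetical gate that caps in-flight requests for one provider."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def slot(self) -> Iterator[bool]:
        # Non-blocking: yields False immediately when every slot is in use.
        acquired = self._slots.acquire(blocking=False)
        try:
            yield acquired
        finally:
            if acquired:
                self._slots.release()


gate = ConcurrencyGate(max_concurrent=1)
with gate.slot() as first:
    with gate.slot() as second:
        nested = (first, second)  # the second request finds the gate full
with gate.slot() as third:
    reused = third  # the slot is free again after release
print(nested, reused)
```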
Best Practices
- Always have multiple providers enabled for redundancy
- Use aspect ratios for round-robin - Set presets with aspect_ratio instead of width/height for load balancing
- Use exact dimensions sparingly - Only when precise pixel control is required (forces Replicate only)
- Set appropriate rate limits based on API provider specifications
- Monitor provider performance through logs and analytics
- Test new providers thoroughly before enabling in production
- Keep API keys secure and rotate regularly
- Balance priorities based on cost, speed, and quality requirements
- Check routing logs to understand which providers are being selected
Technical Implementation Details
Smart Routing Logic (ai_image_generator.py)
- Feature Flag: dimension_routing_enabled = True (hardcoded - always enabled)
- Thread Safety: Uses threading.Lock() for concurrent request handling
- Provider Discovery: Auto-scans *_provider.py files and calls create_from_env()
- Capability Matching: Routes based on provider capabilities vs request requirements
Key Files
- Main Orchestrator: server/backend/services/ai_image_generator/ai_image_generator.py
- Bulk Integration: server/backend/services/bulk_gen/bulk_image_service.py:137
- Enhanced Config: bulk_image_service.py:115 - Normalizes aspect ratio to root level
- Step 4 Handler: server/backend/celery/tasks/bulk_gen/step_4_generate_images.py
Enhanced Config Processing
The bulk service processes preset configurations to extract aspect ratios:
# Extracts aspect_ratio from nested dimensions and moves to root level
    if 'dimensions' in config and 'aspect_ratio' in config['dimensions']:
        enhanced_config['aspect_ratio'] = config['dimensions']['aspect_ratio']
This ensures compatibility with smart routing input type detection.
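As a self-contained illustration of this normalization, the sketch below wraps the same check in a function. The name normalize_config is hypothetical, and copying the config before modifying it is an assumption. The real code in bulk_image_service.py may build enhanced_config differently.

```python
from typing import Any, Dict


def normalize_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Copy a nested aspect_ratio to the root level, leaving the input untouched."""
    enhanced_config = dict(config)  # shallow copy so the preset is not mutated
    dimensions = config.get("dimensions")
    if isinstance(dimensions, dict) and "aspect_ratio" in dimensions:
        enhanced_config["aspect_ratio"] = dimensions["aspect_ratio"]
    return enhanced_config


preset = {"dimensions": {"aspect_ratio": "16:9"}}
enhanced = normalize_config(preset)
print(enhanced["aspect_ratio"])
```

With aspect_ratio at the root, the input type detection shown earlier classifies the request as "aspect_ratio" rather than "default".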
Future Enhancements
- Configurable dimension routing via environment variables
- Weighted round-robin based on provider performance
- Dynamic provider enabling/disabling based on health checks
- Cost-based routing for budget optimization
- Quality-based provider selection per image type
- Real-time provider monitoring dashboard
- Fallback dimension conversion when exact dimensions requested but Replicate unavailable