Bulk Generation System Documentation

Last Updated: November 30, 2024
Status: Testing workflow automation v4

Overview

The Bulk Generation System is a comprehensive feature that allows users to generate multiple AI images based on a single preset configuration. Instead of generating one image at a time, users can create 5-50 unique variations in a single bulk operation, packaged as a downloadable ZIP file.

Core Concept

Single Preset → Multiple Unique Images

Users create a preset configuration (e.g., "Professional bags") with base attributes like:

- Category: bags
- Size: large
- Material: leather
- Use model: true
- Color palette: ['#FF0000', '#000000']
- Generation count: 5

The system then generates 5 completely unique bag images by randomly resolving attributes for each generation:

- Generation 1: Large leather tote bag, female model, casual setting
- Generation 2: Large leather messenger bag, male model, office setting
- Generation 3: Large leather backpack, female model, outdoor setting
- etc.

Architecture Overview

Frontend (React) → API (FastAPI) → Celery Worker → S3 Storage
     ↓                ↓               ↓              ↓
Preset Page     bulk_generation.py   8-Step Process  ZIP Downloads

Core Components

1. Frontend Integration (frontend/src/app/bulk-runs/presets/page.tsx)

Trigger Flow:

- User clicks the play button on a preset
- Confirmation modal shows preset details and credit requirements
- User confirms → API call to /api/bulk-generation/start
- Success flash message (not a popup) with task ID and progress info

Key Features:

- Material-UI modal with preset preview
- Real-time flash notifications that the user dismisses manually
- Credit validation before submission
- Task ID provided for monitoring

2. API Layer (server/api/bulk_generation.py)

Endpoints:

- POST /start - Trigger bulk generation with preset_id
- GET /jobs/{job_uuid}/status - Monitor job progress
- GET /download/{user_id}/{run_id} - Download ZIP via S3 presigned URL

Validation Pipeline:

- Preset exists and is accessible to the user
- User has sufficient credits (1 credit per image)
- Generation count within limits (max 50)
- Bulk generator exists for the category
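The validation order above can be sketched as a single guard function. This is a minimal illustration, not the actual API: names like validate_bulk_request, the preset dict shape, and the constants are assumptions.

```python
MAX_GENERATION_COUNT = 50  # documented limit
CREDITS_PER_IMAGE = 1      # 1 credit per image

def validate_bulk_request(preset, user_credits, registered_generators):
    """Run the documented checks in order; raise ValueError on the first failure."""
    if preset is None:
        raise ValueError("Preset not found or not accessible to this user")
    count = preset["generation_count"]
    if not 1 <= count <= MAX_GENERATION_COUNT:
        raise ValueError(f"Generation count must be 1-{MAX_GENERATION_COUNT}")
    if user_credits < count * CREDITS_PER_IMAGE:
        raise ValueError("Insufficient credits")
    if preset["category"] not in registered_generators:
        raise ValueError(f"No bulk generator for category {preset['category']}")
    return True
```

The checks are ordered cheapest-first so a bad request fails before any credit math or generator lookup.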

3. Celery Task Processing (server/backend/celery/tasks/generation_tasks.py)

8-Step Process:

1. Preset Validation - Load config, create job records
2. Prompt Generation - Generate unique prompts using bulk generators
3. Directory Setup - Create temporary output directory
4. Image Generation - Uses round-robin AI provider system (see ROUNDROBIN.md)
5. ZIP Creation - Package results with metadata
6. Database Updates - Deduct credits, save history
7. S3 Upload - Upload ZIP and delete local files
8. Status Finalization - Mark job complete with download URL
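Steps 3-5 can be sketched in miniature: a temporary directory, per-image generation where individual failures don't abort the batch, and ZIP packaging with per-image metadata. All names here (run_bulk_generation, generate_image) are illustrative stand-ins for the real task code.

```python
import json
import os
import tempfile
import zipfile

def run_bulk_generation(prompts, generate_image):
    """Directory setup, fault-tolerant generation, and ZIP packaging (sketch)."""
    results, failures = [], 0
    with tempfile.TemporaryDirectory() as out_dir:          # step 3
        for i, prompt in enumerate(prompts, start=1):       # step 4
            try:
                image_bytes = generate_image(prompt)
            except Exception:
                failures += 1  # logged in the real task; the batch continues
                continue
            path = os.path.join(out_dir, f"generation_{i}.png")
            with open(path, "wb") as f:
                f.write(image_bytes)
            results.append((path, {"generation": i, "prompt": prompt}))
        zip_path = os.path.join(out_dir, "bulk_generation.zip")  # step 5
        with zipfile.ZipFile(zip_path, "w") as zf:
            for path, meta in results:
                name = os.path.basename(path)
                zf.write(path, name)
                zf.writestr(name.replace(".png", "_metadata.json"), json.dumps(meta))
    return {"completed": len(results), "failed": failures}
```

The returned counts feed directly into the credit deduction and status finalization steps: only `completed` images cost credits.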

AI Provider System:

- Smart Routing - Automatically selects the optimal provider based on input type
- Round-robin Load Balancing - Distributes load between the Replicate and Google providers
- Dimension-aware Selection - Exact dimensions route to Replicate; aspect ratios get round-robin
- Automatic Failover - Graceful provider switching on errors
- Thread-safe Operation - Handles concurrent bulk generations safely

For detailed provider selection logic, load balancing behavior, and troubleshooting, see ROUNDROBIN.md

Error Handling:

- Individual generation failures don't stop the batch
- Credits are only deducted for successful generations
- Comprehensive error logging with context
- Automatic cleanup on failure

4. Modular Service Architecture (server/backend/services/bulk_gen/)

Service Files:

- preset_service.py - Preset loading and credit validation
- bulk_generator_service.py - Dynamic generator loading and prompt generation with attribute resolution tracking
- bulk_image_service.py - Replicate API integration and file management
- bulk_database_service.py - Atomic database operations
- zip_service.py - ZIP creation and S3 upload
- bulk_file_utils.py - Directory management utilities
- bulk_error_handler.py - Error handling and recovery
- bulk_generation_config.py - Configuration constants

Enhanced Attribute Resolution Tracking (NEW)

The bulk generator service now provides detailed metadata about which attributes were randomly selected vs. user-specified in each generation.

Enhanced Return Structure:

{
  "generation": 1,
  "prompt": "A young adult white female wearing a...",
  "config": { /* full resolved configuration */ },
  "attribute_resolution": {
    "randomly_selected": {
      "size": "Large",           // Was __RANDOM__, system picked this
      "type": "Messenger bag",   // Was __RANDOM__, system picked this
      "model_sex": "Female",     // Was __RANDOM__, system picked this
      // ... more random selections
    },
    "user_specified": {
      "use_model": true,         // User explicitly set this
      "color_palette": ["#808080", "#000000", "#FFFFFF"], // User set
      "generation_count": 2      // User set
    },
    "total_random_count": 8,
    "total_user_specified_count": 4
  },
  "preset_metadata": {
    "preset_id": "af8dedd7-c7b8-446a-afb2-25cf4ffda869",
    "preset_name": "Professional Bags",
    "category": "bags",
    "original_config": { /* original preset config for reference */ }
  }
}

Benefits:

- Universal Implementation - Works automatically for ALL categories (bags, cushions, etc.)
- Clear Attribution - Shows exactly which attributes were random vs. user-specified
- Detailed Metadata - Includes preset information and resolution statistics
- Backward Compatible - Preserves the original prompt structure while adding enhancements
- Debugging Support - Helps track how presets resolve into final prompts
- Analytics Integration - Attribute resolution data is automatically saved to the S3 analytics.json

Dynamic Bulk Generators

Concept

Each category (bags, cushions, etc.) has a dedicated bulk generator file that creates multiple unique prompts from a single preset.

Location: server/config/prompts/{category}/bulk.py

Example: bags/bulk.py

Attribute Mappers System

Three Core Mappers:

1. AttributeMapper (attribute_mapper.py) - Resolves category-specific attributes
2. ModelMapper (model_mapper.py) - Handles human model variations
3. BackgroundMapper (background_mapper.py) - Manages scene/environment settings

How It Works:

# Original preset config
{
  "size": "__RANDOM__",           # Will be resolved to actual values
  "material": "leather",          # Static value
  "model_sex": "__RANDOM__",      # Will be resolved 
  "background": "__RANDOM__"      # Will be resolved
}

# After mapper resolution (5 generations)
[
  {"size": "large", "material": "leather", "model_sex": "female", "background": "office"},
  {"size": "medium", "material": "leather", "model_sex": "male", "background": "casual"},
  {"size": "small", "material": "leather", "model_sex": "female", "background": "outdoor"},
  # ... 2 more unique combinations
]

Schema Loading: Mappers load their value options from S3 schema files:

- metamock-ai/category_metadata_schema.json - Category-specific attributes
- Contains valid values for each attribute (sizes, materials, types, etc.)

Universal Mapper Architecture (Enhanced)

The mapper system has been significantly enhanced to provide intelligent attribute resolution that respects user preferences while providing meaningful randomization.

Core Mappers

1. AttributeResolver (attribute_mapper.py)

- Handles category-specific attributes (type, material, size, etc.)
- Loads valid options from S3 category schemas
- Supports uniqueness tracking to avoid duplicates across generations
- Respects user-specified values vs __RANDOM__ placeholders

# Example usage
resolver = AttributeResolver("bags")
attributes = {
    "type": "__RANDOM__",      # Will be randomly selected
    "material": "leather",     # User-specified, never changed
    "size": "__RANDOM__"       # Will be randomly selected
}
resolved = resolver.resolve_attributes(attributes, generation_count=5)

2. ModelMapper (model_mapper.py)

- Manages human model attribute combinations (sex, age, ethnicity)
- Provides uniqueness across generations when using random values
- Supports both enabled and disabled model scenarios
- Includes flattening for prompt compatibility

# Model configuration with mixed user/random values
model_usage = {
    "use_model": True,
    "model_attributes": {
        "model_sex": "Female",        # User-specified
        "model_age": "__RANDOM__",    # Will resolve to: Teenager, Young adult, Middle aged, Old
        "model_ethnicity": "__RANDOM__" # Will resolve to: White, Black, Oriental, etc.
    }
}

3. BackgroundMapper (background_mapper.py)

- Manages scene/environment settings based on model usage
- Separate background pools for human vs product-only scenarios
- Context-aware background selection (office, casual, outdoor, etc.)
- Supports user-specified backgrounds with fallback to random

4. ColorMapper (color_mapper.py) - NEW

- Handles proper color distribution across generations
- Converts color arrays into individual colors per generation
- Multiple distribution strategies: cycling, random, weighted

# User specifies multiple colors
color_palette = ["#FF0000", "#00FF00", "#0000FF", "#FFFF00", "#FF00FF"]

# ColorMapper distributes one color per generation
distributed = color_mapper.distribute_colors(color_palette, generation_count=5)
# Result: ["#FF0000", "#00FF00", "#0000FF", "#FFFF00", "#FF00FF"]
# Each generation gets exactly one color instead of the entire array

User Preference vs Random Resolution

Critical Feature: All mappers now distinguish between user-specified values and __RANDOM__ placeholders.

Before (Problematic):

# ALL values treated as random, ignoring user preferences
config = {
    "material": "leather",    # User wanted leather specifically
    "size": "__RANDOM__"      # User wanted random size
}
# Result: Both material and size were randomized (WRONG!)

After (Correct):

# Only __RANDOM__ values are resolved, user values preserved
config = {
    "material": "leather",    # PRESERVED: User-specified value
    "size": "__RANDOM__"      # RESOLVED: System picks random size
}
# Result: material stays "leather", only size is randomized (CORRECT!)
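The preserve-vs-resolve rule is small enough to sketch directly. This is a simplified stand-in for the real resolvers: random.choice replaces whatever selection and uniqueness logic the mappers actually use, and the schema dict stands in for the S3 schema file.

```python
import random

RANDOM_PLACEHOLDER = "__RANDOM__"

def resolve_config(config, schema, rng=random):
    """Replace only __RANDOM__ placeholders with schema values;
    user-specified values pass through untouched."""
    resolved, randomly_selected, user_specified = {}, {}, {}
    for key, value in config.items():
        if value == RANDOM_PLACEHOLDER:
            picked = rng.choice(schema[key])
            resolved[key] = picked
            randomly_selected[key] = picked
        else:
            resolved[key] = value
            user_specified[key] = value
    return resolved, {"randomly_selected": randomly_selected,
                      "user_specified": user_specified}
```

As a side effect, splitting the two dictionaries during resolution yields exactly the user-vs-random attribution that the attribute_resolution metadata records.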

Color Distribution System

Problem Solved: Previously, entire color arrays were passed to each prompt generation.

Issue: If user specified ["red", "blue", "green"], each of 5 generations would get all 3 colors.

Solution: ColorMapper distributes colors individually:

class ColorMapper:
    def __init__(self, strategy: ColorDistributionStrategy):
        # CYCLING: red, blue, green, red, blue
        # RANDOM: green, red, blue, red, green
        # WEIGHTED: more frequent colors get higher probability
        self.strategy = strategy

Benefits:

- Each generation gets exactly one color for focused results
- Better variety across the bulk generation set
- User color preferences are respected while providing distribution
- Prevents overwhelming prompts with multiple color instructions
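The cycling and random strategies can be sketched as a standalone function (an illustration with assumed names; the real ColorMapper in color_mapper.py also implements a weighted strategy not shown here):

```python
import random
from itertools import cycle, islice

def distribute_colors(palette, generation_count, strategy="cycling", rng=random):
    """Give each generation exactly one color from the user's palette."""
    if strategy == "cycling":
        # Walk the palette in order, wrapping around as needed
        return list(islice(cycle(palette), generation_count))
    if strategy == "random":
        return [rng.choice(palette) for _ in range(generation_count)]
    raise ValueError(f"Unknown strategy: {strategy}")
```

With a 3-color palette and 5 generations, cycling yields one color per generation in palette order, wrapping after the third.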

Background Preference Handling

Enhanced Logic: Background selection now respects user preferences with intelligent fallbacks.

# User explicitly sets background
config = {
    "background": "casual street carrying",  # User preference
    "use_model": True
}
# Result: ALL generations use "casual street carrying"

# User wants random backgrounds  
config = {
    "background": "__RANDOM__",  # or empty/null
    "use_model": True
}
# Result: Each generation gets different random background

Implementation:

if user_background and user_background.strip() and user_background != "__RANDOM__":
    # Use user-specified background for all generations
    background_values = [user_background] * generation_count
else:
    # Generate random backgrounds using BackgroundMapper
    background_values = self.background_mapper.generate_background_combinations(
        use_model, generation_count
    )

Model Attribute Grammar Fixes

Issue Resolved: Frontend was sending grammatically incorrect model attributes.

Before:

- "Young adults" → generated "Young adults Female White person" (WRONG GRAMMAR)
- "Teens" → generated "Teens Male Black person" (WRONG GRAMMAR)

After:

- "Young adult" → generates "Young adult Female White person" (CORRECT)
- "Teenager" → generates "Teenager Male Black person" (CORRECT)

Fix Locations:

- /frontend/src/components/ModelSelector.tsx - Corrected dropdown values
- Model attributes now use proper singular forms for grammatical correctness

Analytics Integration

Enhanced Metadata: Every mapper resolution is tracked for analytics transparency.

Resolution Tracking:

{
  "attribute_resolution": {
    "randomly_selected": {
      "type": "tote bag",           // Was __RANDOM__
      "model_sex": "Female",        // Was __RANDOM__
      "background_human": "office", // Was __RANDOM__
      "color_palette": "#FF0000"    // Distributed from user array
    },
    "user_specified": {
      "material": "leather",        // User explicitly set
      "use_model": true,           // User explicitly set  
      "size": "large"              // User explicitly set
    },
    "total_random_count": 4,
    "total_user_specified_count": 3
  }
}

Benefits:

- Complete transparency in attribute resolution
- User behavior analytics (what gets randomized vs specified)
- Quality assurance tracking
- Debugging support for prompt generation issues
- Data-driven insights for improving randomization algorithms

Mapper Testing Infrastructure

Quality Assurance: Each category should include comprehensive prompt testing.

Example Implementation: bags_test_prompt.py

- Tests 25+ attribute combinations
- Validates grammar using language-tool-python
- Checks edge cases and invalid inputs
- Provides detailed reporting and auto-fixes
- Ensures prompt quality before deployment

Testing Categories:

- Product-only scenarios (10 test cases)
- Human model scenarios (15 test cases)
- Edge cases with invalid attributes (2+ test cases)
- Grammar validation and correction
- Comprehensive reporting with success rates

Bulk Generator Implementation

Template Structure:

# In config/prompts/{category}/bulk.py
from .main import build_prompt  # Import from modular main.py

class CategoryBulkGenerator:
    def __init__(self):
        self.attribute_resolver = AttributeResolver(category_name)
        self.model_mapper = ModelMapper()
        self.background_mapper = BackgroundMapper(category_name)

    def generate_bulk_prompts(self, preset_id: str) -> List[Dict]:
        # 1. Load preset configuration
        # 2. Resolve __RANDOM__ values for N generations
        # 3. Generate a unique prompt for each combination using build_prompt()
        # 4. Return a list of prompt data with metadata
        ...

Key Methods:

- generate_bulk_prompts() - Main entry point
- resolve_bulk_random_values() - Handle __RANDOM__ resolution
- generate_bag_prompt() - Create individual prompts
- build_bag_prompt_from_config() - Construct the final prompt text

Database Models

1. backend_jobs Table

Purpose: Generic job-queue tracking for frontend polling

Key Fields:

- uuid (Primary Key): Shared with bulk_generation_header.id
- job_type: JobType.bulk_generation enum value
- status: processing → completed/failed (what the frontend polls)
- result_url: Download API endpoint (/api/bulk-generation/download/{user_id}/{run_id})
- download_link: Same as result_url, for frontend convenience
- created_at, completed_at: Job lifecycle timestamps
- error_message: Failure details, if applicable

Usage: Frontend polls this table to know when bulk jobs are done and get download links

2. bulk_generation_header Table

Purpose: Detailed bulk generation run tracking and metrics

Key Fields:

- id (Primary Key): UUID shared with backend_jobs.uuid
- user_id: Owner of the bulk run
- preset_id: Configuration used for generation
- category: Image category (bags, cushions, etc.)
- total_images: Expected generation count
- completed_images: Actual successful generations
- failed_images: Number of failed generations
- s3_folder_path: S3 location pattern {user_id}/bulk_runs/{category}/{run_id}
- zip_file_s3_key: S3 key of the ZIP file
- zip_file_url: Presigned download URL (with expiry)
- zip_file_expires_at: When the presigned URL expires
- status: Detailed status (pending, processing, completed, failed, cancelled)
- credits_consumed: Actual credits deducted (= completed_images)
- estimated_credits: Initial credit estimate
- error_message: Detailed error information
- retry_count: Number of retry attempts

Usage: Stores comprehensive bulk generation workflow data and progress metrics

3. img_history Table (Updated for Bulk)

Purpose: Track individual image generations for user history

Key Fields for Bulk:

- aws_folder_id: Set to the bulk run UUID (NOT individual image paths)
- user_id: Image owner
- ai_model: Model used for generation (sdxl, etc.)
- operation_type: OperationType.bulk_generation
- created_at: Generation timestamp

Usage: Creates one record per successful image in the bulk batch

Table Relationship

backend_jobs.uuid = bulk_generation_header.id (shared UUID)
img_history.aws_folder_id = bulk_generation_header.id (for bulk runs)

Frontend Flow:

1. Frontend polls backend_jobs for job completion status
2. Once complete, uses bulk_generation_header for detailed progress/stats
3. The download link from backend_jobs.download_link leads to the ZIP file
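The polling side can be sketched with a generic status fetcher. The fetch_status callable here is a hypothetical stand-in; in practice the frontend hits GET /jobs/{job_uuid}/status and reads the backend_jobs row.

```python
import time

def poll_job(fetch_status, interval_s=2.0, timeout_s=600, sleep=time.sleep):
    """Poll a backend_jobs-style status dict until it is completed or failed."""
    waited = 0.0
    while waited <= timeout_s:
        job = fetch_status()
        if job["status"] in ("completed", "failed"):
            return job  # on success, job["download_link"] points at the ZIP endpoint
        sleep(interval_s)
        waited += interval_s
    raise TimeoutError("bulk generation job did not finish in time")
```

Injecting sleep as a parameter keeps the loop testable without real delays; the production frontend does the equivalent with setInterval.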

Enhanced Analytics Integration

S3 Analytics Files (NEW)

Each bulk generation now automatically creates detailed analytics files with attribute resolution data:

Location: s3://bucket/prompt_history/{preset_id}/{job_id}/analytics.json

Enhanced Analytics Structure (Version 2.0):

{
  "user_id": 1,
  "job_id": "bulk-job-uuid-123",
  "category": "bags",
  "preset_id": "af8dedd7-c7b8-446a-afb2-25cf4ffda869",
  "analytics_version": "2.0",
  "enhanced_data_available": true,
  "generation_results": [
    {
      "generation_index": 1,
      "prompt_text": "A young adult white female wearing...",
      "resolved_attributes": {
        "type": "tote bag",
        "material": "cotton",
        "size": "medium",
        "model_sex": "female",
        // ... all final resolved values
      },
      "attribute_resolution": {
        "randomly_selected": {
          "type": "tote bag",      // Was __RANDOM__
          "model_sex": "female"    // Was __RANDOM__
        },
        "user_specified": {
          "material": "cotton",    // User explicitly set
          "size": "medium"         // User explicitly set
        },
        "total_random_count": 2,
        "total_user_specified_count": 2
      },
      "preset_metadata": {
        "preset_name": "Professional Bags",
        "category": "bags"
      },
      "s3_zip_path": "1/bulk_runs/bags/preset-id/preset-id_bulk_generation.zip",
      "image_filename_in_zip": "generation_1.png"
    }
    // ... more generations
  ]
}

Analytics Benefits:

- Complete Transparency - See exactly which attributes were random vs user-specified
- Data Analysis - Analyze user preferences and random selection patterns
- Quality Assurance - Track prompt generation consistency and attribute distribution
- User Insights - Understand how users configure presets and what gets randomized
- Debugging - Full audit trail from preset to final prompt for each generation

S3 Storage Architecture

Directory Structure

s3://metamock-client-data/
└── {user_id}/
    └── bulk_runs/
        └── {category}/
            └── {preset_id}/
                ├── {preset_id}_bulk_generation.zip
                └── metadata/
                    ├── generation_1.json
                    ├── generation_2.json
                    └── summary.json

ZIP File Contents

{preset_id}_bulk_generation.zip:
├── generation_1.png
├── generation_1_metadata.json
├── generation_2.png
├── generation_2_metadata.json
├── ...
└── bulk_generation_summary.json

S3 Upload Flow

  1. Create ZIP locally in /tmp/bulk_runs/
  2. Upload to S3 with proper metadata and content headers
  3. Delete local ZIP file immediately
  4. Store S3 URL in database (s3://bucket/path/to/file.zip)
  5. Generate presigned download URLs (1 hour expiry)
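Steps 4-5 can be sketched with boto3's standard presigning call. The key layout follows the directory structure above (note the source uses {preset_id} in the S3 tree and {run_id} in the download endpoint; the helper below assumes they coincide, which may not hold in the real service):

```python
def zip_key(user_id, category, run_id):
    """Build the S3 key for a bulk-run ZIP, following the layout above."""
    return f"{user_id}/bulk_runs/{category}/{run_id}/{run_id}_bulk_generation.zip"

def make_download_url(bucket, user_id, category, run_id, expires_s=3600):
    """Presign a 1-hour GET URL that forces download with a clean filename."""
    import boto3  # assumed available in the worker image
    s3 = boto3.client("s3")
    return s3.generate_presigned_url(
        "get_object",
        Params={
            "Bucket": bucket,
            "Key": zip_key(user_id, category, run_id),
            # Content-Disposition override forces download with a clean name
            "ResponseContentDisposition": f'attachment; filename="{run_id}.zip"',
        },
        ExpiresIn=expires_s,
    )
```

The ResponseContentDisposition override is what lets the Security section below promise clean filenames even though the S3 key is long.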

Download System

Frontend Access

Users access downloads through the bulk runs interface, NOT the regular downloads page.

Important: Bulk runs have different IDs than single-image generations.

API Flow

  1. User clicks download → /api/bulk-generation/download/{user_id}/{run_id}
  2. API validates user access and run completion
  3. Generates S3 presigned URL (1 hour expiry)
  4. Returns RedirectResponse to presigned URL
  5. Browser downloads directly from S3

Security

  • Users can only download their own bulk runs
  • Presigned URLs expire automatically
  • S3 bucket permissions prevent direct access
  • Content-Disposition headers force download with clean filenames

Credit System Integration

Credit Deduction Logic

# Credits deducted ONLY for successful generations
total_attempted = 5
successful_count = 4  # 1 failed
failed_count = 1

credits_deducted = successful_count  # 4 credits, not 5

Database Recording

  • img_history records created for each successful generation (one record per image)
  • bulk_generation_header stores total credits consumed
  • users table updated with deducted credits
  • Failed generations don't consume credits
  • NO prompt_history records created for bulk generation

Known Issues & Gotchas

1. Import Path Issues (RESOLVED)

Problem: The Celery worker context has different Python paths.

Solution: Added /app path resolution in multiple locations:

- generation_tasks.py (top-level)
- api/image_generation.py (for utils.dimension_utils)
- bulk_image_service.py (module-level)

2. Database Enum Values (RESOLVED)

Issue: New enum values need a database migration.

Examples:

- JobType.bulk_generation - Added to the PostgreSQL enum
- OperationType.bulk_generation - Added for img_history

3. Async/Sync Function Mixing

Issue: Celery tasks are sync, but some functions were async.

Solution: Used an asyncio.run() wrapper in the Celery task for async image generation.
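The sync/async bridge looks like this in miniature. generate_images_async is a stand-in name for the actual async generation entry point, and the @celery_app.task decorator is omitted to keep the sketch self-contained:

```python
import asyncio

async def generate_images_async(prompts):
    # Placeholder for the real async image-generation pipeline
    return [f"image-for:{p}" for p in prompts]

def bulk_generation_task(prompts):
    """Celery tasks run synchronously, so the async pipeline is driven
    with asyncio.run(), which creates and tears down an event loop."""
    return asyncio.run(generate_images_async(prompts))
```

asyncio.run() is safe here because the Celery worker thread has no running event loop of its own; calling it from within an existing loop would raise RuntimeError.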

4. Flower Dashboard Caching

Issue: The Flower UI shows stale error messages with old timestamps.

Solution: Use docker logs for real-time monitoring; ignore Flower for debugging.

5. File Path Extraction

Gotcha: S3 key generation relies on directory path parsing

# Directory: /tmp/bulk_runs/1/bags/preset-id/
# Parsed: user_id=1, category=bags, preset_id=preset-id
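A minimal sketch of that path parsing (parse_bulk_run_dir is an illustrative name; the real service's parsing code may differ):

```python
from pathlib import PurePosixPath

def parse_bulk_run_dir(directory):
    """Recover (user_id, category, preset_id) from /tmp/bulk_runs/{user}/{cat}/{preset}/."""
    parts = PurePosixPath(directory.rstrip("/")).parts
    # parts example: ('/', 'tmp', 'bulk_runs', '1', 'bags', 'preset-id')
    user_id, category, preset_id = parts[-3], parts[-2], parts[-1]
    return int(user_id), category, preset_id
```

Because the S3 key is reconstructed from these path segments, renaming or nesting the temp directory differently silently breaks uploads, which is exactly why this is listed as a gotcha.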

6. Frontend Error Handling

Issue: The original implementation used popup alerts.

Fixed: Replaced with Material-UI dismissible flash messages.

Development Workflow

Category and Attribute Management

🆕 Adding a New Category

Required Steps:

  1. Create Category Prompt File

     # Create the main prompt generator
     touch server/config/prompts/{category}.py

     - Implement a build_prompt(attributes) function
     - Follow existing patterns (see cushions.py, t-shirts.py)
     - Include proper attribute handling and background descriptions

  2. Create Bulk Generator in Modular Structure

     # Create the bulk generator in the category folder
     mkdir -p server/config/prompts/{category}
     touch server/config/prompts/{category}/bulk.py

     - Place the bulk generator in the same folder as main.py
     - Import build_prompt from .main instead of the legacy approach
     - Use the modular structure consistently

  3. Update Service Class Mappings (if needed)

     # In bulk_generator_service.py
     class_name_mappings = {
         "your_category": "YourCategoryBulkGenerator",  # Add if complex naming
     }

  4. Update S3 Category Schema

     // In s3://metamock-ai/category_metadata_schema.json
     {
       "your_category": {
         "attribute_name": ["option1", "option2", "option3"],
         "another_attribute": ["valueA", "valueB"]
       },
       "background_mapping": {
         "your_category": {
           "human_backgrounds": ["setting1", "setting2"],
           "no_human_backgrounds": ["studio", "flat lay"]
         }
       }
     }

  5. Test and Validate

     # Test the bulk generator
     docker exec metamock-backend python -c "
     from backend.services.bulk_gen.bulk_generator_service import test_bulk_generator
     result = test_bulk_generator('your_category')
     print(result)
     "

  6. Create Grammar Testing Script (recommended)

     # Copy and modify an existing test script
     cp server/config/prompts/bags_test_prompt.py server/config/prompts/{category}_test_prompt.py
     # Update category-specific test cases

🔧 Adding a New Attribute to an Existing Category

Required Steps:

  1. Update Category Prompt File

     # In config/prompts/{category}.py
     def build_prompt(attributes):
         new_attribute = attributes.get("new_attribute", "default_value")
         # Add handling for the new attribute in prompt generation

  2. Update S3 Category Schema

     // Add to the existing category in category_metadata_schema.json
     {
       "existing_category": {
         "existing_attribute": ["existing", "values"],
         "new_attribute": ["option1", "option2", "option3"]  // ← Add this
       }
     }

  3. Update Frontend (if user-configurable)

     // Update preset configuration forms to include the new attribute
     // Add validation rules for new attribute values

  4. Test Attribute Resolution

     # Verify AttributeResolver loads the new schema correctly
     docker exec metamock-backend python -c "
     from config.prompts.mappers.attribute_mapper import AttributeResolver
     resolver = AttributeResolver('your_category')
     print(resolver.get_possible_values('new_attribute'))
     "

  5. Update Test Scripts

     # Add test cases for the new attribute in {category}_test_prompt.py
     test_cases.append({
         "new_attribute": "test_value",
         # ... other attributes
     })

📝 Adding a New Option Value for an Existing Attribute

Simplest Case - Only S3 Schema Update Required:

  1. Update S3 Schema

     // In category_metadata_schema.json
     {
       "category": {
         "existing_attribute": [
           "existing_option1",
           "existing_option2",
           "new_option3"  // ← Add the new value
         ]
       }
     }

  2. Update Category Prompt Handling (if special logic is needed)

     # In config/prompts/{category}.py - only if the new option needs special handling
     def build_attribute_description(attribute_value):
         descriptions = {
             "existing_option1": "existing description",
             "existing_option2": "existing description",
             "new_option3": "new description for the new option"  # ← Add if needed
         }

  3. Test New Option

     # Test that the new option resolves correctly
     docker exec metamock-backend python config/prompts/{category}_test_prompt.py

🔄 Deployment Checklist

After any schema changes:

  1. Upload Updated Schema to S3

     aws s3 cp category_metadata_schema.json s3://metamock-ai/category_metadata_schema.json

  2. Restart Celery Workers (the schema is cached)

     docker restart metamock-celery-worker

  3. Clear Frontend Cache (if applicable)

     # Clear any cached category options in the frontend

  4. Test End-to-End

     - Create a test preset with the new category/attribute/option
     - Run a bulk generation
     - Verify prompts contain the new values correctly
     - Check analytics resolution tracking

🚨 Common Gotchas

Category Names:

- Avoid hyphens in file names (use underscores)
- For existing categories with hyphens (t-shirts), use importlib
- Update class name mappings for complex names

S3 Schema:

- Schema changes are cached - restart workers after updates
- Ensure background mappings exist for new categories
- Test both human_backgrounds and no_human_backgrounds

Attribute Resolution:

- New attributes default to the first schema value if not specified
- Test both user-specified and __RANDOM__ resolution
- Verify analytics correctly track user vs system selection

Testing:

- Always test with real presets, not just unit tests
- Verify color distribution works with new categories
- Check grammar with multiple attribute combinations

Testing Bulk Generation

  1. Create test preset via frontend
  2. Monitor logs: docker logs -f metamock-celery-worker
  3. Check S3 bucket for file upload
  4. Test download via bulk runs page
  5. Verify credit deduction

Testing Prompt Quality

Each category should have a dedicated prompt testing script to validate grammar and content quality:

Location: server/config/prompts/{category}_test_prompt.py

Example: bags_test_prompt.py (already implemented)

Testing Script Requirements:

- Generate 25+ test cases with various attribute combinations
- Test both product-only and human model scenarios
- Include edge cases with invalid/unusual attributes
- Check grammatical correctness (optionally with language-tool-python)
- Validate that prompt content is suitable for AI image generation
- Provide detailed reporting with success rates

Usage:

# Run from backend container using modular structure
docker exec metamock-backend python -c "
from config.prompts.{category}.main import build_prompt
# Test various attribute combinations
print(build_prompt({test_attributes}))
"

# Or create dedicated test script in category folder
docker exec metamock-backend python config/prompts/{category}/test_prompts.py

# Example output
🎒 Category Prompt Grammar Checker
==================================================
🧪 Testing 27 prompt generation cases...
✅ Grammar check passed (100% success rate)
📄 Detailed results saved to {category}_grammar_test_results.json

Benefits:

- Catch grammar issues before deployment
- Ensure prompts generate appropriate AI images
- Validate edge case handling
- Document expected prompt formats
- Facilitate debugging and improvements

Debugging Issues

# Real-time Celery logs
docker logs -f metamock-celery-worker

# Check S3 uploads
aws s3 ls s3://metamock-client-data/1/bulk_runs/ --recursive

# Database inspection
docker exec metamock-backend python -c "from db.session import SessionLocal; ..."

Future Enhancements

Monitoring & Progress

  • Real-time progress updates via WebSocket
  • Estimated completion times
  • Cancel operation functionality

Advanced Features

  • Custom generation counts per preset
  • Multiple aspect ratios in single bulk run
  • Batch prompt editing before generation
  • Template-based prompt variations

Performance Optimizations

  • Parallel Replicate API calls
  • Intelligent retry logic
  • Background S3 upload during generation
  • ZIP streaming for large batches

Troubleshooting

"No images were generated successfully"

  1. Check Replicate API token configuration
  2. Verify attribute mapper schemas in S3
  3. Confirm preset configuration validity
  4. Check Celery worker import paths

S3 Upload Failures

  1. Verify AWS credentials and bucket permissions
  2. Check bucket name configuration
  3. Confirm S3 client initialization
  4. Monitor worker memory usage for large ZIPs

Download Issues

  1. Verify S3 URL format in database
  2. Check presigned URL generation
  3. Confirm user permissions
  4. Test S3 bucket accessibility

Integration Points

External Dependencies

  • Replicate API - Image generation service
  • AWS S3 - File storage and serving
  • PostgreSQL - Job and result tracking
  • Redis - Celery message broker
  • NextJS - Frontend preset management

Internal Dependencies

  • Preset System - Configuration source
  • Credit System - Payment processing
  • User Authentication - Access control
  • S3 Client - File management utilities

This system represents a complete end-to-end bulk image generation pipeline with enterprise-grade error handling, monitoring, and scalability features.