Prompt Testing System Documentation

Last Updated: December 4, 2024
Status: Active development with color distribution fixes

Overview

The Prompt Testing System allows developers to test prompt generation logic without running full bulk generations. It simulates the exact same attribute resolution and color distribution that bulk generation uses, providing rapid feedback for prompt development and debugging.

Architecture

Frontend/External App → Frontend Proxy → FastAPI Backend → Prompt Testing Service
     ↓                     ↓                ↓                        ↓
localhost:3001         localhost:3003   localhost:8000    PromptTester + ColorMapper

Core Components

1. API Endpoints (server/api/prompt_testing.py)

Base URL: /api/v1/prompt-testing

Endpoints:

  1. POST /categories/{category}/test
  2. Tests prompt generation for specific category
  3. Supports 1-10 test generations
  4. Uses ColorMapper for proper color distribution
  5. Returns structured results with prompts and metadata

  6. GET /categories

  7. Lists available categories from S3 schema
  8. Uses test mode schema for development

  9. DELETE /cache/clear

  10. Clears test metadata cache
  11. Forces reload from S3 on next request

2. Request Format

{
  "count": 5,
  "attribute_overrides": {
    "style": "__RANDOM__",
    "material": "leather", 
    "color_palette": "#FF0000,#00FF00,#0000FF",
    "model_age": "__RANDOM__",
    "background_human": "casual street"
  }
}

3. Response Format

{
  "category": "bags",
  "total_tests": 5,
  "human_model_prompt": [...],
  "product_only_prompt": [...],
  "tests": [
    {
      "test_number": 1,
      "random_attributes": {
        "style": "tote bag",
        "material": "leather",
        "color_palette": "#FF0000",
        "model_age": "Young adult",
        "use_model": true
      },
      "template_variables": {
        "color_desc": "red color",
        "style_desc": "tote bag",
        "demographics": "young adult female"
      },
      "prompt": "A young adult female wearing a red color tote bag..."
    }
  ]
}

Color Distribution System

How It Works

The system uses the same ColorMapper as bulk generation to ensure consistent behavior:

  1. Input: "#FF0000,#00FF00,#0000FF"
  2. ColorMapper parses: ['#FF0000', '#00FF00', '#0000FF']
  3. Distribution: Each test gets one color
  4. Output: Test 1 gets "red color", Test 2 gets "green color", etc.

Key Classes

ColorMapper (config/prompts/mappers/color_mapper.py)

  • Parses comma-separated hex codes
  • Distributes colors across multiple generations
  • Supports cycling, random, and weighted strategies
  • CRITICAL: Must handle comma-separated input properly

PromptTester (backend/services/prompt_testing/prompt_tester.py)

  • Loads category schemas from S3
  • Handles __RANDOM__ attribute resolution
  • Integrates with category-specific prompt builders

Environment Setup

Required Environment Variables

Development (.env.dev):

# API routing
API_BACKEND_URL=http://metamock-backend:8000
NEXT_PUBLIC_API_URL=http://localhost:8000

# S3 schema files  
CATEGORY_META_DATA_FILE=category_metadata_schema.json
CATEGORY_META_DATA_TEST_FILE=category_metadata_schema_test.json
CLIENT_DATA_BUCKET=metamock-ai

Production (.env.prod):

API_BACKEND_URL=https://app.metamock.io
NEXT_PUBLIC_API_URL=https://app.metamock.io

CORS Configuration

Backend CORS (server/main.py):

allow_origins=[
    "http://localhost:3000",
    "http://localhost:3001", 
    "http://localhost:3003",
    "https://app.metamock.io",
    "https://admin.automock.app"
]

Frontend Proxy CORS (frontend/src/app/api/[...path]/route.ts):

const corsHeaders = {
  'Access-Control-Allow-Origin': 'http://localhost:3001, https://admin.automock.app',
}

Testing Methods

1. Direct Backend Testing

curl -X POST "http://localhost:8000/api/v1/prompt-testing/categories/hats/test" \
  -H "Content-Type: application/json" \
  -d '{
    "count": 3,
    "attribute_overrides": {
      "color_palette": "#FF0000,#00FF00,#0000FF"
    }
  }'

2. Through Frontend Proxy

curl -X POST "http://localhost:3003/api/v1/prompt-testing/categories/hats/test" \
  -H "Content-Type: application/json" \
  -d '{
    "count": 3,
    "attribute_overrides": {
      "color_palette": "#FF0000,#00FF00,#0000FF"  
    }
  }'

3. Browser Access

  • Main frontend: http://localhost:3003/settings/prompt_tester?category=hats
  • Admin interface: http://localhost:3001/settings/prompt_tester?category=hats

4. ColorMapper Testing

docker exec metamock-backend python -c "
from config.prompts.mappers.color_mapper import ColorMapper
mapper = ColorMapper()
result = mapper.distribute_colors('#FF0000,#00FF00,#0000FF', 3)
print('Distributed:', result)
"

Common Issues & Gotchas

1. CORS "Failed to fetch" Errors

Problem: External apps can't reach the API Symptoms: TypeError: Failed to fetch from localhost:3001 Solution: - Add origin to frontend proxy CORS headers - Ensure proper environment file is loaded - Check if containers are running

2. "Neutral color" Instead of Actual Colors

Problem: ColorMapper not parsing comma-separated hex codes Symptoms: All tests return "neutral color" in color_desc Root Cause: ColorMapper treating "#FF0000,#00FF00" as single color Solution: Fixed in ColorMapper _normalize_color_palette() method

Debug with logs:

docker logs metamock-backend | grep "🎨"

3. "undefined/api/v1/prompt-testing" in Modal

Problem: Frontend doesn't have API base URL Symptoms: API calls show undefined instead of domain Solution: Check API_BACKEND_URL environment variable

4. Category Not Found Errors

Problem: S3 schema not loaded or category doesn't exist Symptoms: 404: Category 'xyz' not found Solutions: - Clear cache: DELETE /api/v1/prompt-testing/cache/clear - Check S3 schema file exists - Restart containers to reload cache

5. Import Errors for Category Prompts

Problem: Category prompt builder not found Symptoms: ImportError: Category 'xyz' prompt builder not found Solutions: - Ensure category has migrated architecture (main.py file) - Check import path in bulk generator service - Verify Python path includes /app

Debugging Workflow

1. Check Container Status

docker ps | grep metamock

2. Monitor Backend Logs

docker logs -f metamock-backend

3. Test ColorMapper Directly

docker exec metamock-backend python -c "
from config.prompts.mappers.color_mapper import ColorMapper
from config.utils.color_utils import build_color_description
mapper = ColorMapper()
colors = mapper.distribute_colors('your_color_palette', count)
for i, color in enumerate(colors):
    desc = build_color_description(color)
    print(f'Test {i+1}: {color} -> {desc}')
"

4. Test Category Prompt Builder

docker exec metamock-backend python -c "
from config.prompts.{category}.main import build_prompt
result = build_prompt({'color_palette': '#FF0000'})
print(result)
"

5. Check S3 Schema

docker exec metamock-backend python -c "
from backend.services.cache import get_category_metadata_schema
schema = get_category_metadata_schema(test_mode=True)
print(schema.get('your_category', 'NOT_FOUND'))
"

Development Workflow

Adding New Categories for Testing

  1. Ensure category has new architecture: server/config/prompts/{category}/ ├── main.py # Main prompt builder ├── descriptions.py # Attribute descriptions └── (optional) bulk.py # Bulk generator

  2. Update S3 schema if needed: json { "category_name": { "attribute": ["option1", "option2"], "color_support": true } }

  3. Test the new category: bash curl -X GET "http://localhost:8000/api/v1/prompt-testing/categories"

Testing Color Distribution

Always test with comma-separated hex codes to verify ColorMapper:

{
  "count": 5,
  "attribute_overrides": {
    "color_palette": "#FF0000,#00FF00,#0000FF,#FFFF00,#FF00FF"
  }
}

Expected: Each test gets a different color description.

Integration with Bulk Generation

The prompt testing system is designed to exactly match bulk generation behavior:

  1. Same ColorMapper - Identical color distribution logic
  2. Same PromptTester - Identical attribute resolution
  3. Same Prompt Builders - Identical prompt generation
  4. Same S3 Schema - Identical attribute validation

Key Principle: If prompt testing works, bulk generation should work identically.

Future Enhancements

Planned Features

  • Real-time prompt preview in frontend
  • Batch testing across multiple categories
  • Grammar validation integration
  • Performance benchmarking
  • A/B testing support for prompt variations

Technical Debt

  • Consolidate environment variable management
  • Improve error messaging for failed tests
  • Add unit tests for prompt testing endpoints
  • Optimize S3 schema caching strategy

Quick Reference

Restart Services

docker restart metamock-backend metamock-frontend

Clear All Caches

curl -X DELETE "http://localhost:8000/api/v1/prompt-testing/cache/clear"

Check Available Categories

curl -X GET "http://localhost:8000/api/v1/prompt-testing/categories"

Test Color Distribution

curl -X POST "http://localhost:8000/api/v1/prompt-testing/categories/bags/test" \
  -H "Content-Type: application/json" \
  -d '{"count": 3, "attribute_overrides": {"color_palette": "#FF0000,#00FF00,#0000FF"}}'

This system provides a crucial development tool for rapid prompt iteration and ensures consistency between testing and production bulk generation.