Phoenix Prompt Management provides a centralized system for versioning, organizing, and deploying prompts across your LLM applications. Treat prompts as code artifacts with proper version control, testing, and deployment workflows.

Why Prompt Management?

As LLM applications mature, prompts become critical infrastructure that requires:
  • Version Control: Track changes over time and roll back when needed
  • Collaboration: Enable teams to share and review prompts
  • Testing: Validate prompt changes before production deployment
  • Governance: Understand which prompts are in use and their performance
  • Reproducibility: Ensure consistent behavior across environments
Phoenix Prompt Management solves these challenges with a Git-like versioning system for prompts.

Core Concepts

Prompts

A prompt is a named template with:
  • Name: Unique identifier (e.g., “customer_support_greeting”)
  • Template: The actual prompt text with optional variables
  • Metadata: Model settings, tags, and custom properties
  • Versions: Complete history of changes

Versions

Each time you update a prompt, a new version is created with:
  • Version ID: Unique identifier for this specific version
  • Sequence Number: Auto-incrementing version number (v1, v2, v3…)
  • Template: The prompt text for this version
  • Created At: Timestamp of creation
  • Commit Message: Description of changes (optional)

Tags

Tags are human-readable labels attached to specific versions:
  • production - Currently deployed in production
  • staging - Being tested in staging environment
  • v1.0 - Semantic version markers
  • experiment-baseline - Reference for A/B tests
Tags can be moved between versions, similar to Git tags.

Creating and Managing Prompts

Create a Prompt

Create prompts programmatically or through the UI:
import phoenix as px

client = px.Client()

# Create a new prompt
prompt = client.create_prompt(
    name="customer_support_greeting",
    template="""You are a customer support agent for {{company_name}}.
    
    Your role is to help users with:
    - Account issues
    - Billing questions
    - Product information
    
    Be professional, friendly, and concise.
    
    Customer Query: {{query}}
    
    Response:""",
    metadata={
        "model": "gpt-4",
        "temperature": 0.7,
        "max_tokens": 500,
        "owner": "support-team"
    }
)

print(f"Created prompt: {prompt.name}")
print(f"Version: {prompt.version_id}")

Update a Prompt

Updating a prompt creates a new version:
# Update the template (creates v2)
prompt.update(
    template="""You are a customer support agent for {{company_name}}.
    
    Guidelines:
    - Be empathetic and patient
    - Provide clear, actionable solutions
    - Escalate to human agents when necessary
    
    Customer Query: {{query}}
    
    Response:""",
    commit_message="Added escalation guidelines"
)

print(f"New version: {prompt.version_id}")

List Versions

View version history:
# Get all versions
versions = client.list_prompt_versions(
    prompt_name="customer_support_greeting"
)

for version in versions:
    print(f"v{version.sequence_number}: {version.created_at}")
    if version.commit_message:
        print(f"  Message: {version.commit_message}")

Retrieve Specific Version

Load a specific prompt version:
# Get by version number
prompt_v1 = client.get_prompt(
    name="customer_support_greeting",
    version=1
)

# Get by version ID
prompt_specific = client.get_prompt(
    name="customer_support_greeting",
    version_id="version-abc-123"
)

# Get latest version
prompt_latest = client.get_prompt(
    name="customer_support_greeting"
)

Working with Tags

Create Tags

Tag specific versions for easy reference:
# Tag current version as production
client.tag_prompt_version(
    prompt_name="customer_support_greeting",
    version_id=prompt.version_id,
    tag="production"
)

# Tag for staging
client.tag_prompt_version(
    prompt_name="customer_support_greeting",
    version_id=prompt.version_id,
    tag="staging"
)

# Semantic versioning
client.tag_prompt_version(
    prompt_name="customer_support_greeting",
    version_id=prompt.version_id,
    tag="v1.0.0"
)

Retrieve by Tag

Load prompts using tags:
# Get production version
prod_prompt = client.get_prompt(
    name="customer_support_greeting",
    tag="production"
)

# Get staging version
staging_prompt = client.get_prompt(
    name="customer_support_greeting",
    tag="staging"
)

Move Tags

Update tags to point to different versions:
# After testing staging, promote to production
client.tag_prompt_version(
    prompt_name="customer_support_greeting",
    version_id=staging_version_id,
    tag="production"  # Moves production tag to new version
)

List Tags

View all tags for a prompt:
tags = client.list_prompt_tags(
    prompt_name="customer_support_greeting"
)

for tag in tags:
    print(f"{tag.name} → v{tag.version.sequence_number}")

Using Prompts in Applications

Basic Usage

Load and render prompts in your application:
import phoenix as px
from openai import OpenAI

px_client = px.Client()
openai_client = OpenAI()

# Load production prompt
prompt = px_client.get_prompt(
    name="customer_support_greeting",
    tag="production"
)

# Render with variables
rendered = prompt.render(
    company_name="Acme Inc",
    query="How do I reset my password?"
)

# Use with LLM
response = openai_client.chat.completions.create(
    model=prompt.metadata.get("model", "gpt-4"),
    temperature=prompt.metadata.get("temperature", 0.7),
    messages=[{"role": "user", "content": rendered}]
)

Template Variables

Phoenix supports template variable substitution:
template = """You are {{role}} with expertise in {{domain}}.

User: {{user_input}}

Assistant:"""

prompt = client.create_prompt(
    name="expert_assistant",
    template=template
)

rendered = prompt.render(
    role="a senior software engineer",
    domain="distributed systems",
    user_input="How does consensus work in Raft?"
)

Fallback Handling

Handle missing prompts gracefully:
try:
    prompt = client.get_prompt(
        name="custom_prompt",
        tag="production"
    )
except PromptNotFoundError:
    # Fallback to default prompt
    prompt = client.get_prompt(
        name="default_fallback",
        tag="production"
    )

Testing Prompts

Playground Integration

Test prompts interactively in the Playground (see Playground):
1. Load prompt: Open the Playground and click “Load Prompt”, then select your prompt and version/tag.
2. Test with real data: Use production trace replay or manual inputs to test the prompt.
3. Iterate and save: Modify the prompt in the Playground, then save as a new version.

Systematic Testing with Experiments

Test prompt changes systematically using experiments (see Experiments):
from phoenix.experiments import run_experiment
from openai import OpenAI
import phoenix as px

client = px.Client()
openai_client = OpenAI()

# Load dataset
dataset = client.get_dataset(name="support_queries")

# Define task using prompt
def task_with_prompt(input, version_tag):
    prompt = client.get_prompt(
        name="customer_support_greeting",
        tag=version_tag
    )
    
    rendered = prompt.render(
        company_name="Acme Inc",
        query=input['query']
    )
    
    response = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": rendered}]
    )
    
    return {"answer": response.choices[0].message.content}

# Test current production version
exp_prod = run_experiment(
    dataset=dataset,
    task=lambda input: task_with_prompt(input, "production"),
    experiment_name="prompt-production"
)

# Test new staging version
exp_staging = run_experiment(
    dataset=dataset,
    task=lambda input: task_with_prompt(input, "staging"),
    experiment_name="prompt-staging"
)

# Compare results in Phoenix UI

Deployment Workflow

Recommended workflow for managing prompts in production:
1. Create and iterate locally

# Create new prompt version
prompt = client.create_prompt(name="my_prompt", template="...")

2. Tag as development

client.tag_prompt_version(
    prompt_name="my_prompt",
    version_id=prompt.version_id,
    tag="dev"
)

3. Test in staging

# After validation, promote to staging
client.tag_prompt_version(
    prompt_name="my_prompt",
    version_id=prompt.version_id,
    tag="staging"
)

# Run experiments on staging
result = run_experiment(
    dataset=test_dataset,
    task=staging_task,
    experiment_name="staging-validation"
)

4. Deploy to production

# After successful testing, promote to production
client.tag_prompt_version(
    prompt_name="my_prompt",
    version_id=prompt.version_id,
    tag="production"
)

5. Monitor performance

Use Phoenix tracing to monitor production performance. If issues arise, roll back:

# Roll back to the previous version
client.tag_prompt_version(
    prompt_name="my_prompt",
    version_id=previous_version_id,
    tag="production"
)

Best Practices

Use Descriptive Names: Name prompts based on their purpose (e.g., support_greeting, summarization_technical_docs).
Commit Messages: Always include meaningful commit messages when updating prompts.
Tagging Strategy: Maintain consistent tag names across prompts:
  • production - Live in production
  • staging - Being tested
  • canary - Gradual rollout
  • rollback - Previous stable version
Version Metadata: Store model settings and generation parameters in prompt metadata.
Test Before Production: Always validate prompt changes on representative datasets before deploying.
Monitor Performance: Track evaluation metrics for production prompts using tracing and experiments.
Document Changes: Use commit messages and metadata to explain why changes were made.

Phoenix UI Features

The Phoenix UI provides rich prompt management capabilities:

Prompt Library

  • Browse all prompts in your organization
  • Search by name, tag, or metadata
  • View version history
  • Compare versions side-by-side

Version Comparison

  • Diff view between any two versions
  • Highlight template changes
  • Compare metadata changes
  • View performance metrics per version

Deployment Status

  • See which versions are tagged for production/staging
  • View last deployment timestamp
  • Track rollback history

Next Steps

Playground

Test prompts interactively before deployment

Experiments

Systematically validate prompt changes

Tracing

Monitor prompt performance in production

Prompts API

Complete API reference for prompt management