Centralized prompt management with version control, tagging, and systematic testing for production LLM applications
Phoenix Prompt Management provides a centralized system for versioning, organizing, and deploying prompts across your LLM applications. It lets you treat prompts as code artifacts, with proper version control, testing, and deployment workflows.
Create prompts programmatically or through the UI:
```python
import phoenix as px

client = px.Client()

# Create a new prompt
prompt = client.create_prompt(
    name="customer_support_greeting",
    template="""You are a customer support agent for {{company_name}}.

Your role is to help users with:
- Account issues
- Billing questions
- Product information

Be professional, friendly, and concise.

Customer Query: {{query}}
Response:""",
    metadata={
        "model": "gpt-4",
        "temperature": 0.7,
        "max_tokens": 500,
        "owner": "support-team",
    },
)

print(f"Created prompt: {prompt.name}")
print(f"Version: {prompt.version_id}")
```
```python
# Update the template (creates v2)
prompt.update(
    template="""You are a customer support agent for {{company_name}}.

Guidelines:
- Be empathetic and patient
- Provide clear, actionable solutions
- Escalate to human agents when necessary

Customer Query: {{query}}
Response:""",
    commit_message="Added escalation guidelines",
)

print(f"New version: {prompt.version_id}")
```
```python
# Get all versions
versions = client.list_prompt_versions(
    prompt_name="customer_support_greeting"
)

for version in versions:
    print(f"v{version.sequence_number}: {version.created_at}")
    if version.commit_message:
        print(f"  Message: {version.commit_message}")
```
```python
# Get by version number
prompt_v1 = client.get_prompt(
    name="customer_support_greeting",
    version=1,
)

# Get by version ID
prompt_specific = client.get_prompt(
    name="customer_support_greeting",
    version_id="version-abc-123",
)

# Get latest version
prompt_latest = client.get_prompt(
    name="customer_support_greeting"
)
```
```python
# Tag current version as production
client.tag_prompt_version(
    prompt_name="customer_support_greeting",
    version_id=prompt.version_id,
    tag="production",
)

# Tag for staging
client.tag_prompt_version(
    prompt_name="customer_support_greeting",
    version_id=prompt.version_id,
    tag="staging",
)

# Semantic versioning
client.tag_prompt_version(
    prompt_name="customer_support_greeting",
    version_id=prompt.version_id,
    tag="v1.0.0",
)
```
```python
# Get production version
prod_prompt = client.get_prompt(
    name="customer_support_greeting",
    tag="production",
)

# Get staging version
staging_prompt = client.get_prompt(
    name="customer_support_greeting",
    tag="staging",
)
```
```python
# After testing staging, promote to production
client.tag_prompt_version(
    prompt_name="customer_support_greeting",
    version_id=staging_version_id,
    tag="production",  # Moves the production tag to the new version
)
```
```python
template = """You are {{role}} with expertise in {{domain}}.

User: {{user_input}}
Assistant:"""

prompt = client.create_prompt(
    name="expert_assistant",
    template=template,
)

rendered = prompt.render(
    role="a senior software engineer",
    domain="distributed systems",
    user_input="How does consensus work in Raft?",
)
```
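Conceptually, rendering is plain `{{variable}}` substitution. The following stand-alone sketch makes the mechanics concrete (`render_template` is a hypothetical helper for illustration, not part of the Phoenix API):

```python
import re

def render_template(template: str, **variables) -> str:
    """Replace each {{name}} placeholder with the matching keyword argument."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"Missing template variable: {name}")
        return str(variables[name])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

template = "You are {{role}} with expertise in {{domain}}."
print(render_template(
    template,
    role="a senior software engineer",
    domain="distributed systems",
))
# → You are a senior software engineer with expertise in distributed systems.
```

Raising on a missing variable (rather than leaving the placeholder in place) catches template/call-site mismatches early.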
```python
# After validation, promote to staging
client.tag_prompt_version(
    prompt_name="my_prompt",
    version_id=prompt.version_id,
    tag="staging",
)

# Run experiments on staging
result = run_experiment(
    dataset=test_dataset,
    task=staging_task,
    experiment_name="staging-validation",
)
```
4. Deploy to production
```python
# After successful testing, promote to production
client.tag_prompt_version(
    prompt_name="my_prompt",
    version_id=prompt.version_id,
    tag="production",
)
```
5. Monitor performance
Use Phoenix tracing to monitor production performance. If issues arise, roll back:
```python
# Roll back to the previous version
client.tag_prompt_version(
    prompt_name="my_prompt",
    version_id=previous_version_id,
    tag="production",
)
```
Use Descriptive Names: Name prompts based on their purpose (e.g., support_greeting, summarization_technical_docs).
Commit Messages: Always include meaningful commit messages when updating prompts.
Tagging Strategy: Maintain consistent tag names across prompts:
production - Live in production
staging - Being tested
canary - Gradual rollout
rollback - Previous stable version
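One way to keep that convention enforceable is a small tag validator run before any `tag_prompt_version` call, e.g. in CI. This is a hypothetical helper (`is_valid_tag` is not part of Phoenix) that accepts the four environment tags above plus semantic-version tags like `v1.0.0`:

```python
import re

# Tag convention assumed from the list above
ALLOWED_ENV_TAGS = {"production", "staging", "canary", "rollback"}
SEMVER_TAG = re.compile(r"^v\d+\.\d+\.\d+$")

def is_valid_tag(tag: str) -> bool:
    """Accept shared environment tags or semantic-version tags."""
    return tag in ALLOWED_ENV_TAGS or bool(SEMVER_TAG.match(tag))

assert is_valid_tag("production")
assert is_valid_tag("v1.0.0")
assert not is_valid_tag("prod")  # typo'd tags are rejected
```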
Version Metadata: Store model settings and generation parameters in prompt metadata.
Test Before Production: Always validate prompt changes on representative datasets before deploying.
Monitor Performance: Track evaluation metrics for prompts in production using tracing and experiments.
Document Changes: Use commit messages and metadata to explain why changes were made.
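For example, generation parameters stored in metadata can be separated from bookkeeping fields at call time, so the model settings always travel with the prompt version. `generation_params` is a hypothetical helper, and the metadata shape is assumed from the `create_prompt` example earlier:

```python
GENERATION_KEYS = {"model", "temperature", "max_tokens"}

def generation_params(metadata: dict) -> dict:
    """Keep only the generation settings; drop bookkeeping fields like owner."""
    return {k: v for k, v in metadata.items() if k in GENERATION_KEYS}

metadata = {
    "model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 500,
    "owner": "support-team",
}
print(generation_params(metadata))
# → {'model': 'gpt-4', 'temperature': 0.7, 'max_tokens': 500}
```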