# Getting Started

This guide will help you get PromptCache up and running in minutes.
## Prerequisites

- Go 1.24 or higher
- Docker (optional, for containerized deployment)
- API keys for your chosen provider(s)
## Installation Methods

### Method 1: Docker (Recommended)

The fastest way to get started:

```bash
# Clone the repository
git clone https://github.com/messkan/prompt-cache.git
cd prompt-cache

# Set your environment variables
export EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=your-openai-api-key

# Run with Docker Compose
docker-compose up -d
```
### Method 2: From Source

Build and run directly:

```bash
# Clone the repository
git clone https://github.com/messkan/prompt-cache.git
cd prompt-cache

# Set environment variables
export EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=your-openai-api-key

# Run using the provided script
./scripts/run.sh

# Or use Make
make run
```
### Method 3: Manual Build

For more control:

```bash
# Build the binary
go build -o prompt-cache cmd/api/main.go

# Run the server
./prompt-cache
```
## Verify Installation

Check that the server is running:

```bash
# Health check
curl http://localhost:8080/health
# Expected: {"status":"healthy","time":"..."}

# Readiness check (verifies storage is working)
curl http://localhost:8080/health/ready
# Expected: {"status":"ready"}

# Get the current provider
curl http://localhost:8080/v1/config/provider
# Expected: {"provider":"openai","available_providers":["openai","mistral","claude"]}

# View metrics
curl http://localhost:8080/metrics
```
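If you want to script this check (for example in CI), here is a minimal Python sketch that polls the two health endpoints until they respond; it assumes the default port 8080 and uses only the standard library:

```python
import json
import time
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8080"  # default port; adjust if you set PORT

def wait_until_ready(timeout: float = 30.0) -> None:
    """Poll /health and /health/ready until both succeed or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            for path in ("/health", "/health/ready"):
                with urllib.request.urlopen(BASE_URL + path, timeout=5) as resp:
                    print(path, "->", json.loads(resp.read()))
            return  # both endpoints answered
        except (urllib.error.URLError, json.JSONDecodeError):
            time.sleep(1)  # server not up yet; retry
    raise TimeoutError("PromptCache did not become ready in time")

if __name__ == "__main__":
    wait_until_ready()
```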
## First Request

Use PromptCache with the OpenAI SDK:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-openai-api-key"
)

# First request - goes to OpenAI
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is quantum computing?"}]
)
print(response.choices[0].message.content)

# Second request - similar prompt, served from cache
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)
print(response.choices[0].message.content)
```
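To see the cache effect directly, you can time both calls. A rough sketch (actual latencies vary by provider and model, and the second call is only a hit if the prompts clear the configured similarity threshold):

```python
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-openai-api-key")

def timed_completion(prompt: str) -> float:
    """Send a chat completion through PromptCache and return elapsed seconds."""
    start = time.perf_counter()
    client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return time.perf_counter() - start

miss = timed_completion("What is quantum computing?")  # forwarded to OpenAI
hit = timed_completion("Explain quantum computing")    # should be a cache hit
print(f"miss: {miss:.2f}s, hit: {hit:.2f}s")
```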
## Monitoring

### View Statistics

```bash
# JSON stats
curl http://localhost:8080/v1/stats

# Prometheus metrics
curl http://localhost:8080/metrics
```
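The stats endpoint can also be polled programmatically. Since the exact fields in the response depend on your build, this sketch just pretty-prints whatever JSON comes back:

```python
import json
import urllib.request

# Fetch cache statistics and pretty-print the JSON response
with urllib.request.urlopen("http://localhost:8080/v1/stats", timeout=5) as resp:
    stats = json.loads(resp.read())
print(json.dumps(stats, indent=2))
```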
### Watch Logs

```bash
# Docker
docker-compose logs -f

# Direct run
# Logs appear in the terminal (JSON format)
```

Look for log entries with:

- `"cache_hit":true` - request served from cache
- `"cache_hit":false` - request forwarded to the provider
## Cache Management

```bash
# View cache stats
curl http://localhost:8080/v1/cache/stats

# Clear the entire cache
curl -X DELETE http://localhost:8080/v1/cache
```
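The same two operations from Python, mirroring the curl commands above and using only the standard library:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"

# View cache stats
with urllib.request.urlopen(BASE_URL + "/v1/cache/stats", timeout=5) as resp:
    print(json.loads(resp.read()))

# Clear the entire cache (equivalent to curl -X DELETE above)
req = urllib.request.Request(BASE_URL + "/v1/cache", method="DELETE")
with urllib.request.urlopen(req, timeout=5) as resp:
    print("cache cleared, status:", resp.status)
```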
## Next Steps

From here, you can tune similarity thresholds, the gray zone verifier, and provider settings to fit your workload.
## Troubleshooting

### Server won't start

Check that:

- The required API key is set for your provider
- Port 8080 is not already in use (or change it with the `PORT` env var)
- The BadgerDB data directory is writable
### Cache not working

Verify that:

- Prompts are semantically similar
- Similarity thresholds are properly configured
- The gray zone verifier is enabled (if needed)
- `/v1/stats` shows the hit rates you expect (see the diagnostic sketch after this list)
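As a quick diagnostic, you can send two near-identical prompts and compare `/v1/stats` snapshots before and after; if the second request was served from cache, a hit counter in the stats should increase. The exact field names depend on your build, so this sketch simply prints both snapshots for comparison:

```python
import json
import urllib.request
from openai import OpenAI

def get_stats() -> dict:
    """Snapshot /v1/stats; field names depend on your build."""
    with urllib.request.urlopen("http://localhost:8080/v1/stats", timeout=5) as resp:
        return json.loads(resp.read())

client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-openai-api-key")

before = get_stats()
for prompt in ("What is quantum computing?", "Explain quantum computing"):
    client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
after = get_stats()

# If the second prompt was a cache hit, a hit counter should differ
print("before:", json.dumps(before))
print("after: ", json.dumps(after))
```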
### Provider errors

Ensure that:

- API keys are valid and have sufficient credits
- You have network connectivity to the provider APIs
- The provider name is correct (`openai`, `mistral`, or `claude`)