# Getting Started

This guide will help you get PromptCache up and running in minutes.
## Prerequisites

- Go 1.24 or higher
- Docker (optional, for containerized deployment)
- API keys for your chosen provider(s)
## Installation Methods

### Method 1: Docker (Recommended)

The fastest way to get started:

```bash
# Clone the repository
git clone https://github.com/messkan/prompt-cache.git
cd prompt-cache

# Set your environment variables
export EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=your-openai-api-key

# Run with Docker Compose
docker-compose up -d
```
### Method 2: From Source

Build and run directly:

```bash
# Clone the repository
git clone https://github.com/messkan/prompt-cache.git
cd prompt-cache

# Set environment variables
export EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=your-openai-api-key

# Run using the provided script
./scripts/run.sh

# Or use Make
make run
```
### Method 3: Manual Build

For more control:

```bash
# Build the binary
go build -o prompt-cache cmd/api/main.go

# Run the server
./prompt-cache
```
## Verify Installation

Check that the server is running:

```bash
# Health check
curl http://localhost:8080/health
# Expected: {"status":"healthy","time":"..."}

# Readiness check (verifies storage is working)
curl http://localhost:8080/health/ready
# Expected: {"status":"ready"}

# Get the current provider
curl http://localhost:8080/v1/config/provider
# Expected: {"provider":"openai","available_providers":["openai","mistral","claude"]}

# View metrics
curl http://localhost:8080/metrics
```
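If you want to script this check (for example in CI), here is a minimal Python sketch that polls the two health endpoints until they respond; it assumes the default port 8080 and uses only the standard library:

```python
import json
import time
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8080"  # default port; adjust if you set PORT

def wait_until_ready(timeout: float = 30.0) -> None:
    """Poll /health and /health/ready until both succeed or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            for path in ("/health", "/health/ready"):
                with urllib.request.urlopen(BASE_URL + path, timeout=5) as resp:
                    print(path, "->", json.loads(resp.read()))
            return  # both endpoints answered
        except (urllib.error.URLError, json.JSONDecodeError):
            time.sleep(1)  # server not up yet; retry
    raise TimeoutError("PromptCache did not become ready in time")

if __name__ == "__main__":
    wait_until_ready()
```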
## First Request

Use PromptCache with the OpenAI SDK:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-openai-api-key"
)

# First request - goes to OpenAI
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is quantum computing?"}]
)
print(response.choices[0].message.content)

# Second request - similar prompt, served from cache
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)
print(response.choices[0].message.content)
```
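To see the cache effect directly, you can time both calls. A rough sketch (actual latencies vary by provider and model, and the second call is only a hit if the prompts clear the configured similarity threshold):

```python
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-openai-api-key")

def timed_completion(prompt: str) -> float:
    """Send a chat completion through PromptCache and return elapsed seconds."""
    start = time.perf_counter()
    client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return time.perf_counter() - start

miss = timed_completion("What is quantum computing?")  # forwarded to OpenAI
hit = timed_completion("Explain quantum computing")    # should be a cache hit
print(f"miss: {miss:.2f}s, hit: {hit:.2f}s")
```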
## Monitoring

### View Statistics

```bash
# JSON stats
curl http://localhost:8080/v1/stats

# Prometheus metrics
curl http://localhost:8080/metrics
```
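The stats endpoint can also be polled programmatically. Since the exact fields in the response depend on your build, this sketch just pretty-prints whatever JSON comes back:

```python
import json
import urllib.request

# Fetch cache statistics and pretty-print the JSON response
with urllib.request.urlopen("http://localhost:8080/v1/stats", timeout=5) as resp:
    stats = json.loads(resp.read())
print(json.dumps(stats, indent=2))
```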
### Watch Logs

```bash
# Docker
docker-compose logs -f

# Direct run
# Logs appear in the terminal (JSON format)
```

Look for log entries with:

- `"cache_hit":true` - request served from cache
- `"cache_hit":false` - request forwarded to the provider
## Cache Management

```bash
# View cache stats
curl http://localhost:8080/v1/cache/stats

# Clear the entire cache
curl -X DELETE http://localhost:8080/v1/cache
```
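The same two operations from Python, mirroring the curl commands above and using only the standard library:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"

# View cache stats
with urllib.request.urlopen(BASE_URL + "/v1/cache/stats", timeout=5) as resp:
    print(json.loads(resp.read()))

# Clear the entire cache (equivalent to curl -X DELETE above)
req = urllib.request.Request(BASE_URL + "/v1/cache", method="DELETE")
with urllib.request.urlopen(req, timeout=5) as resp:
    print("cache cleared, status:", resp.status)
```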
## Next Steps

From here, you can tune similarity thresholds, the gray zone verifier, and provider settings to fit your workload.
## Troubleshooting

### Server won't start

Check that:

- The required API key is set for your provider
- Port 8080 is not already in use (or change it with the `PORT` env var)
- The BadgerDB data directory is writable
### Cache not working

Verify that:

- Prompts are semantically similar
- Similarity thresholds are properly configured
- The gray zone verifier is enabled (if needed)
- `/v1/stats` shows the hit rates you expect (see the diagnostic sketch after this list)
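As a quick diagnostic, you can send two near-identical prompts and compare `/v1/stats` snapshots before and after; if the second request was served from cache, a hit counter in the stats should increase. The exact field names depend on your build, so this sketch simply prints both snapshots for comparison:

```python
import json
import urllib.request
from openai import OpenAI

def get_stats() -> dict:
    """Snapshot /v1/stats; field names depend on your build."""
    with urllib.request.urlopen("http://localhost:8080/v1/stats", timeout=5) as resp:
        return json.loads(resp.read())

client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-openai-api-key")

before = get_stats()
for prompt in ("What is quantum computing?", "Explain quantum computing"):
    client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
after = get_stats()

# If the second prompt was a cache hit, a hit counter should differ
print("before:", json.dumps(before))
print("after: ", json.dumps(after))
```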
### Provider errors

Ensure that:

- API keys are valid and have sufficient credits
- You have network connectivity to the provider APIs
- The provider name is correct (`openai`, `mistral`, or `claude`)