Testing Troubleshooting Guide¶
This guide helps you diagnose and resolve common issues when running tests in SaasLiteLLM.
Quick Diagnostics Checklist¶
Before diving into specific issues, run through this checklist:
# 1. Check Docker services
docker compose ps
# 2. Check service health
curl http://localhost:8002/health # LiteLLM
curl http://localhost:8003/health # SaaS API
# 3. Check database connection
docker exec -it litellm-postgres pg_isready -U litellm_user -d litellm
# 4. Check Redis connection
docker exec -it litellm-redis redis-cli ping
# 5. Review recent logs
docker compose logs --tail=50 postgres
docker compose logs --tail=50 redis
If any of these fail, see the relevant section below.
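The two health checks in the checklist can also be scripted. The following is a minimal sketch using only the Python standard library; the ports come from the checklist above, while the `SERVICES` dict and the `check_health` helper are illustrative names, not part of the project.

```python
import urllib.error
import urllib.request

# Ports taken from the checklist above; the dict itself is illustrative.
SERVICES = {
    "LiteLLM": "http://localhost:8002/health",
    "SaaS API": "http://localhost:8003/health",
}


def check_health(url: str, timeout: float = 3.0) -> tuple[bool, str]:
    """Return (ok, detail) for a single health endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Nothing is listening, name resolution failed, or the request timed out.
        return False, f"unreachable ({exc})"


if __name__ == "__main__":
    for name, url in SERVICES.items():
        ok, detail = check_health(url)
        print(f"{'OK  ' if ok else 'FAIL'} {name}: {detail}")
```

An "unreachable" result usually corresponds to the connection-refused case covered in the first section below.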
Common Test Failures¶
1. Connection Refused Errors¶
Symptom¶
Connection Error: Could not connect to http://localhost:8003
requests.exceptions.ConnectionError: Connection refused
Cause¶
The SaaS API is not running or not accessible on port 8003.
Solution¶
Step 1: Check if the process is running
# Check for Python processes on port 8003
lsof -i :8003
# Check for Python processes on port 8002 (LiteLLM)
lsof -i :8002
Step 2: Start the required services
# Terminal 1: Start LiteLLM backend
source .venv/bin/activate
python scripts/start_local.py
# Terminal 2: Start SaaS API
source .venv/bin/activate
python scripts/start_saas_api.py
Step 3: Verify services are accessible
# Should return {"status": "healthy"}
curl http://localhost:8003/health
curl http://localhost:8002/health
Prevention¶
Create a startup script that checks prerequisites:
#!/bin/bash
# scripts/start_all_services.sh
echo "Starting all services for testing..."
# Check Docker services
if ! docker compose ps postgres | grep -q "Up"; then
    echo "Starting Docker services..."
    ./scripts/docker_setup.sh
fi
# Start LiteLLM in background
echo "Starting LiteLLM..."
python scripts/start_local.py &
LITELLM_PID=$!
# Wait for LiteLLM
sleep 10
if ! curl -s http://localhost:8002/health > /dev/null; then
    echo "Failed to start LiteLLM"
    kill "$LITELLM_PID"
    exit 1
fi
# Start SaaS API in background
echo "Starting SaaS API..."
python scripts/start_saas_api.py &
SAAS_PID=$!
# Wait for SaaS API
sleep 5
if ! curl -s http://localhost:8003/health > /dev/null; then
    echo "Failed to start SaaS API"
    kill "$LITELLM_PID" "$SAAS_PID"
    exit 1
fi
echo "All services running!"
echo "LiteLLM PID: $LITELLM_PID"
echo "SaaS API PID: $SAAS_PID"
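The fixed `sleep 10` and `sleep 5` in the script above can be too short on a slow machine and needlessly long on a fast one. One alternative is to poll the health endpoint until it answers. The sketch below shows that idea in Python; `wait_until` and `http_ok` are hypothetical helper names, and the timeout values are arbitrary.

```python
import time
import urllib.request
from typing import Callable


def wait_until(ready: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll ready() until it returns True or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def http_ok(url: str) -> bool:
    """True if the URL answers with HTTP 200; connection failures count as not ready."""
    try:
        with urllib.request.urlopen(url, timeout=2.0) as resp:
            return resp.status == 200
    except OSError:  # URLError, HTTPError and socket timeouts are OSError subclasses
        return False
```

For example, `wait_until(lambda: http_ok("http://localhost:8002/health"), timeout=30)` returns as soon as LiteLLM responds and gives up after 30 seconds otherwise.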
2. Database Connection Issues¶
Symptom A: Database Not Running¶
Failed to connect to database
psycopg2.OperationalError: could not connect to server
connection refused
Solution for Symptom A¶
Step 1: Check if PostgreSQL container is running
docker compose ps postgres
Step 2: If not running, start Docker services
./scripts/docker_setup.sh
Step 3: Verify PostgreSQL is accepting connections
# Should output "accepting connections"
docker exec litellm-postgres pg_isready -U litellm_user -d litellm
# Test connection with psql
docker exec -it litellm-postgres psql -U litellm_user -d litellm -c "SELECT version();"
Step 4: Check PostgreSQL logs for errors
docker compose logs --tail=50 postgres
Symptom B: Wrong Database Credentials¶
Solution for Symptom B¶
Step 1: Verify environment variables
Step 2: Ensure credentials match docker-compose.yml
Step 3: Recreate database with correct credentials
# Stop and remove volumes
docker compose down -v
# Restart with fresh database
./scripts/docker_setup.sh
Symptom C: Database Missing Tables¶
Solution for Symptom C¶
Step 1: Run database migrations
Step 2: Verify tables exist
Expected tables:
- organizations
- model_groups
- model_group_models
- teams
- team_model_groups
- team_credits
- team_credit_transactions
- jobs
- llm_calls
Step 3: If migrations fail, check migration files
Step 4: Manually run migrations if needed
for file in scripts/migrations/*.sql; do
echo "Running $file..."
docker exec -i litellm-postgres psql -U litellm_user -d litellm < "$file"
done
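The table check in Step 2 can be automated by comparing a list of table names against the expected set. This is a minimal sketch: the table names come from the list above, while the `missing_tables` helper and the assumption that the tables live in the `public` schema are mine.

```python
# Table names taken from the "Expected tables" list in this guide.
EXPECTED_TABLES = frozenset({
    "organizations", "model_groups", "model_group_models", "teams",
    "team_model_groups", "team_credits", "team_credit_transactions",
    "jobs", "llm_calls",
})


def missing_tables(psql_output: str) -> list:
    """Return expected tables absent from one-name-per-line psql output.

    The output is assumed to come from a command such as (public schema assumed):
      docker exec litellm-postgres psql -U litellm_user -d litellm -At \
        -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public'"
    """
    found = {line.strip() for line in psql_output.splitlines() if line.strip()}
    return sorted(EXPECTED_TABLES - found)
```

An empty list means every expected table exists. Any names returned point to migrations that did not run.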
3. LiteLLM Integration Failures¶
Symptom¶
ERROR: LiteLLM integration failed!
Response: 500 Internal Server Error
Possible causes:
- LiteLLM proxy not running
- LiteLLM database not accessible
- Master key incorrect
Cause¶
Communication between SaaS API and LiteLLM proxy is broken.
Solution¶
Step 1: Verify LiteLLM is running
Step 2: Check LiteLLM can access database
Step 3: Verify master key configuration
# Check .env file
grep LITELLM_MASTER_KEY .env
# Ensure it's set in environment
echo "$LITELLM_MASTER_KEY"
Step 4: Test LiteLLM API directly
# Should return information about the key endpoint
curl -X GET http://localhost:8002/key/info \
-H "Authorization: Bearer sk-local-dev-master-key-change-me" \
-H "Content-Type: application/json"
Step 5: Check LiteLLM configuration
# Verify config file exists
cat src/config/litellm_config.yaml
# Check for syntax errors
python -c "import yaml; yaml.safe_load(open('src/config/litellm_config.yaml'))"
Step 6: Restart LiteLLM with verbose logging
# Stop current instance
pkill -f "litellm"
# Start with debug mode
source .venv/bin/activate
litellm --config src/config/litellm_config.yaml --port 8002 --detailed_debug
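A common cause of master key problems is that the value in `.env` and the value exported in the current shell have drifted apart. The sketch below compares the two. `LITELLM_MASTER_KEY` is the variable name used in Step 3; the helper names and the simplified `.env` parsing (no inline comments, first assignment wins) are assumptions.

```python
from typing import Mapping, Optional

VAR_NAME = "LITELLM_MASTER_KEY"


def read_env_value(env_text: str, name: str) -> Optional[str]:
    """Return the first value assigned to name in .env-style text, or None."""
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        if key == name:
            # Drop surrounding whitespace and one layer of matching quotes.
            value = value.strip()
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
                value = value[1:-1]
            return value
    return None


def describe_key_state(env_text: str, environ: Mapping[str, str]) -> str:
    """Summarize whether the .env value and the live environment agree."""
    file_value = read_env_value(env_text, VAR_NAME)
    live_value = environ.get(VAR_NAME)
    if file_value is None:
        return "missing from .env"
    if live_value is None:
        return "set in .env but not exported in this shell"
    if file_value != live_value:
        return "value in .env differs from the shell environment"
    return "consistent"
```

A typical call is `describe_key_state(open(".env").read(), os.environ)` after `import os`.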
4. "Already Exists" Errors¶
Symptom¶
Cause¶
Test data from previous runs still exists in the database.
Is This a Problem?¶
Usually No: The test scripts are designed to handle existing data:
if response.status_code == 400 and "already exists" in response.text:
    print("Organization already exists (OK)")
When It's a Problem¶
If you're testing creation logic specifically, or tests fail due to stale data:
Solution¶
Option 1: Clean up test data manually
docker exec -i litellm-postgres psql -U litellm_user -d litellm << EOF
DELETE FROM team_credit_transactions WHERE team_id LIKE 'team_test%';
DELETE FROM team_credits WHERE team_id LIKE 'team_test%';
DELETE FROM team_model_groups WHERE team_id LIKE 'team_test%';
DELETE FROM teams WHERE team_id LIKE 'team_test%';
DELETE FROM model_group_models WHERE group_id IN (SELECT id FROM model_groups WHERE group_name LIKE '%Test%');
DELETE FROM model_groups WHERE group_name LIKE '%Test%';
DELETE FROM organizations WHERE organization_id LIKE 'org_test%';
EOF
Option 2: Reset entire database (nuclear option)
# Stop containers and remove volumes
docker compose down -v
# Restart fresh
./scripts/docker_setup.sh
# Recreate schema
./scripts/run_migrations.sh
Option 3: Create cleanup script
#!/usr/bin/env python3
# scripts/cleanup_test_data.py
import psycopg2
from config.settings import settings


def cleanup():
    """Remove all test data"""
    conn = psycopg2.connect(settings.database_url)
    cur = conn.cursor()
    print("Cleaning up test data...")
    # Delete child tables before parents so foreign keys are not violated
    tables_and_conditions = [
        ("team_credit_transactions", "team_id LIKE 'team_test%' OR team_id LIKE 'team_demo%'"),
        ("team_credits", "team_id LIKE 'team_test%' OR team_id LIKE 'team_demo%'"),
        ("team_model_groups", "team_id LIKE 'team_test%' OR team_id LIKE 'team_demo%'"),
        ("teams", "team_id LIKE 'team_test%' OR team_id LIKE 'team_demo%'"),
        ("organizations", "organization_id LIKE 'org_test%' OR organization_id LIKE 'org_demo%'"),
    ]
    for table, condition in tables_and_conditions:
        # Table names and conditions are fixed constants, not user input
        cur.execute(f"DELETE FROM {table} WHERE {condition}")
        print(f" Deleted {cur.rowcount} rows from {table}")
    conn.commit()
    conn.close()
    print("Cleanup complete!")


if __name__ == "__main__":
    cleanup()
Run before tests:
python scripts/cleanup_test_data.py
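The deletion order matters because child rows reference their parents. The sketch below demonstrates the ordering on a reduced three-table schema, using the standard library's `sqlite3` as a stand-in for PostgreSQL so it runs without any services. The schema and sample rows are invented for illustration and are not the project's real schema.

```python
import sqlite3

# Child tables first, then parents, mirroring the order used in the guide.
DELETE_ORDER = [
    ("team_credits", "team_id"),
    ("teams", "team_id"),
    ("organizations", "organization_id"),
]
PATTERNS = {"team_id": "team_test%", "organization_id": "org_test%"}


def cleanup(conn: sqlite3.Connection) -> dict:
    """Delete test rows child-first and return the row count per table."""
    deleted = {}
    for table, column in DELETE_ORDER:
        # Identifiers are fixed constants; only the pattern is a bound parameter.
        cur = conn.execute(f"DELETE FROM {table} WHERE {column} LIKE ?", (PATTERNS[column],))
        deleted[table] = cur.rowcount
    conn.commit()
    return deleted


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with one test row and one real row per table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript("""
        CREATE TABLE organizations (organization_id TEXT PRIMARY KEY);
        CREATE TABLE teams (
            team_id TEXT PRIMARY KEY,
            organization_id TEXT REFERENCES organizations(organization_id)
        );
        CREATE TABLE team_credits (
            team_id TEXT REFERENCES teams(team_id),
            credits INTEGER
        );
        INSERT INTO organizations VALUES ('org_test_acme'), ('org_real');
        INSERT INTO teams VALUES ('team_test_hr', 'org_test_acme'), ('team_real', 'org_real');
        INSERT INTO team_credits VALUES ('team_test_hr', 100), ('team_real', 50);
    """)
    return conn
```

Deleting from `teams` before `team_credits` in this demo raises `sqlite3.IntegrityError`, which is the same class of failure a wrong order would trigger in PostgreSQL when the foreign keys do not cascade.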
5. Import Errors¶
Symptom¶
Cause¶
Dependencies not installed or virtual environment not activated.
Solution¶
Step 1: Activate virtual environment
Step 2: Verify Python version
Step 3: Install dependencies
# Install core dependencies (quote the extras so zsh does not treat the brackets as a glob)
uv pip install "litellm[proxy]" fastapi "uvicorn[standard]" psycopg2-binary sqlalchemy
# Install test dependencies
uv pip install pytest pytest-asyncio
# Or install all from pyproject.toml
uv pip install -e ".[dev]"
Step 4: Verify installation
python -c "import litellm; print(litellm.__version__)"
python -c "import fastapi; print(fastapi.__version__)"
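The activation and installation checks can be combined into one script. This sketch uses only the standard library. The module list mirrors the core dependencies installed in Step 3 (note that the `psycopg2-binary` package is imported as `psycopg2`), and the helper names are illustrative.

```python
import importlib.util
import sys

# Import names of the core dependencies from Step 3.
REQUIRED = ["litellm", "fastapi", "uvicorn", "psycopg2", "sqlalchemy"]


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


def missing_modules(names) -> list:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("missing modules:", missing_modules(REQUIRED) or "none")
```

`find_spec` only locates a module without importing it, so the check stays fast and has no side effects from the packages themselves.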
6. Port Already in Use¶
Symptom¶
Cause¶
Another process is using port 8002 or 8003.
Solution¶
Step 1: Find the process using the port
lsof -i :8002
lsof -i :8003
Step 2: Kill the process
# Kill by PID
kill -9 <PID>
# Or stop the service scripts by name
pkill -f "start_local.py"
pkill -f "start_saas_api.py"
Step 3: Verify ports are free
Step 4: Restart services
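Where `lsof` is unavailable, a port check can be done from Python. The sketch below tries a TCP connection to the port. `port_in_use` is a hypothetical helper, and it only probes IPv4 loopback, so a service bound exclusively to another interface would not be detected.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (8002, 8003):
        print(port, "in use" if port_in_use(port) else "free")
```

Unlike `lsof`, this does not report which process holds the port, only whether it is taken.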
7. Redis Connection Failures¶
Symptom¶
Cause¶
Redis container not running or not accessible.
Solution¶
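Step 1: Check Redis container
docker compose ps redis
Step 2: If not running, start it
docker compose up -d redis
Step 3: Test Redis connection
# Should output PONG
docker exec -it litellm-redis redis-cli ping
Step 4: Check Redis logs
docker compose logs --tail=50 redis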
Note: Redis is optional for basic functionality. If you don't need caching, you can disable it in configuration.
8. Test Timeouts¶
Symptom¶
Cause¶
- Services taking too long to start
- Database query hanging
- Network issues
Solution¶
Step 1: Increase wait times in test scripts
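One way to make wait times adjustable without editing every script is to read them from an environment variable with a default. The following is a minimal sketch. The variable name `TEST_HTTP_TIMEOUT` and the helper `timeout_from_env` are hypothetical and do not exist in the project's test scripts.

```python
import os


def timeout_from_env(name: str, default: float) -> float:
    """Read a positive timeout in seconds from the environment, else use default."""
    raw = os.environ.get(name)
    if raw is None or not raw.strip():
        return default
    try:
        value = float(raw)
    except ValueError:
        # Non-numeric input falls back to the default instead of failing the run.
        return default
    return value if value > 0 else default


# Hypothetical usage in a test script:
HTTP_TIMEOUT = timeout_from_env("TEST_HTTP_TIMEOUT", 30.0)
```

With this in place, a slow machine can run the tests with `TEST_HTTP_TIMEOUT=120` while everyone else keeps the default.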
Step 2: Check service resource usage
Step 3: Restart Docker services
Step 4: Check for blocking queries
# View active PostgreSQL connections
docker exec -it litellm-postgres psql -U litellm_user -d litellm -c "
SELECT pid, usename, application_name, state, query
FROM pg_stat_activity
WHERE state != 'idle';
"
9. Authentication Failures¶
Symptom¶
Cause¶
Incorrect or missing master key configuration.
Solution¶
Step 1: Verify master key in .env
Step 2: Ensure key format is correct
Step 3: Update .env if needed
Step 4: Restart services to pick up new key
Step 5: Test authentication
curl -X GET http://localhost:8002/health \
-H "Authorization: Bearer sk-local-dev-master-key-change-me"
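Format problems in the key, such as stray whitespace or quotes copied from `.env`, can be caught before any request is sent. The sketch below assumes that valid keys start with `sk-`, as the example key in this guide does. `auth_header` is an illustrative helper, not a project function.

```python
def auth_header(master_key: str) -> dict:
    """Validate the key's basic shape and build the Authorization header."""
    if master_key != master_key.strip():
        raise ValueError("key has leading or trailing whitespace")
    if master_key[:1] in ("'", '"'):
        raise ValueError("key still contains quote characters")
    # Assumption: keys are expected to use the "sk-" prefix shown in this guide.
    if not master_key.startswith("sk-"):
        raise ValueError("key does not start with 'sk-'")
    return {"Authorization": f"Bearer {master_key}"}
```

The returned dict can be passed as the `headers` argument of an HTTP client call.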
Debugging Failed Tests¶
Enable Detailed Logging¶
For Test Scripts¶
# Run with output redirection to capture all logs
python scripts/test_full_integration.py 2>&1 | tee test_output.log
For pytest¶
# Verbose output with print statements
pytest tests/ -vv -s
# Show local variables on failure
pytest tests/ -l
# Stop on first failure
pytest tests/ -x
For LiteLLM¶
# Start with detailed debug mode
litellm --config src/config/litellm_config.yaml \
--port 8002 \
--detailed_debug
For SaaS API¶
# Run with debug logging
uvicorn src.saas_api:app \
--host 0.0.0.0 \
--port 8003 \
--log-level debug
Inspect Database State¶
# Connect to database
docker exec -it litellm-postgres psql -U litellm_user -d litellm
-- Check organizations
SELECT * FROM organizations;
-- Check teams
SELECT * FROM teams;
-- Check credits
SELECT * FROM team_credits;
-- Check model groups
SELECT * FROM model_groups;
-- Check team-model assignments
SELECT t.team_id, mg.group_name
FROM teams t
JOIN team_model_groups tmg ON t.team_id = tmg.team_id
JOIN model_groups mg ON tmg.group_id = mg.id;
Check API Endpoints Manually¶
# Health check
curl http://localhost:8003/health
# List organizations
curl http://localhost:8003/api/organizations
# Get specific team
curl http://localhost:8003/api/teams/team_test_hr
# Check team credits
curl http://localhost:8003/api/credits/teams/team_test_hr/balance
# View API documentation (macOS; use xdg-open on Linux)
open http://localhost:8003/docs
Monitor Service Logs in Real-Time¶
# Terminal 1: PostgreSQL logs
docker compose logs -f postgres
# Terminal 2: Redis logs
docker compose logs -f redis
# Terminal 3: All logs
docker compose logs -f
Environment-Specific Issues¶
macOS Issues¶
Docker Desktop Not Running¶
Port Conflicts on macOS¶
# Check what's using the port
lsof -i :8002 -i :8003 -i :5432 -i :6379
# Kill conflicting processes
sudo lsof -ti:8002 | xargs kill -9
Linux Issues¶
Permission Denied on Docker¶
# Add user to docker group
sudo usermod -aG docker $USER
# Log out and log back in, or run:
newgrp docker
PostgreSQL Port Conflict¶
# If system PostgreSQL is running on 5432
sudo systemctl stop postgresql
# Or change port in docker-compose.yml
Windows Issues¶
WSL2 Docker Integration¶
# Ensure WSL2 integration is enabled in Docker Desktop
# Settings > Resources > WSL Integration
# Restart Docker Desktop
wsl --shutdown
# Start Docker Desktop again
Path Issues in WSL¶
# Use WSL paths, not Windows paths
cd /mnt/c/Users/YourName/repos/SaasLiteLLM # Instead of C:\Users\...
Performance Issues¶
Slow Test Execution¶
Cause¶
- Database performance
- Network latency
- Resource constraints
Solution¶
Step 1: Optimize database
# Analyze and vacuum database
docker exec litellm-postgres psql -U litellm_user -d litellm -c "VACUUM ANALYZE;"
Step 2: Check Docker resource allocation
Step 3: Use connection pooling
Edit src/models/database.py to add connection pooling.
Step 4: Profile slow queries
-- Enable query logging in PostgreSQL
ALTER DATABASE litellm SET log_statement = 'all';
ALTER DATABASE litellm SET log_duration = on;
Getting Help¶
Collecting Debug Information¶
When reporting issues, include:
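- System Information: OS, Python version, Docker version
- Service Status: output of docker compose ps
- Recent Logs: output of docker compose logs --tail=50
- Environment Configuration: relevant .env settings, with keys and passwords removed
- Test Output: the failing command and its full output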
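The system information part of a report can be gathered with a short script. This is a sketch using the standard library only. The helper names are illustrative, and the Docker version is obtained by running `docker --version` when the binary is on the PATH.

```python
import platform
import shutil
import subprocess


def tool_version(cmd: list) -> str:
    """Run a version command and return its first output, or a short failure note."""
    if shutil.which(cmd[0]) is None:
        return "not found"
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"error: {exc}"
    return (out.stdout or out.stderr).strip() or f"exit code {out.returncode}"


def system_info() -> dict:
    """Collect OS, Python, and Docker versions for a bug report."""
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "docker": tool_version(["docker", "--version"]),
    }


if __name__ == "__main__":
    for key, value in system_info().items():
        print(f"{key}: {value}")
```

The printed lines can be pasted directly into the Environment part of an issue.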
Resources¶
- Project Documentation: Check other docs in /docs
- LiteLLM Documentation: https://docs.litellm.ai
- FastAPI Documentation: https://fastapi.tiangolo.com
- PostgreSQL Documentation: https://www.postgresql.org/docs
Creating an Issue¶
When creating an issue on GitHub:
- Use a descriptive title: "Test fails: Connection refused on port 8003"
- Describe the problem: What were you trying to do?
- Steps to reproduce: What commands did you run?
- Expected behavior: What should have happened?
- Actual behavior: What actually happened?
- Environment: OS, Python version, Docker version
- Logs: Include relevant error messages and logs
Preventive Measures¶
Pre-Test Checklist¶
Create a checklist to run before testing:
#!/bin/bash
# scripts/pre_test_check.sh
echo "Running pre-test checks..."
# Check Docker
if ! docker info > /dev/null 2>&1; then
    echo "❌ Docker is not running"
    exit 1
fi
echo "✅ Docker is running"
# Check containers
if ! docker compose ps postgres | grep -q "Up"; then
    echo "❌ PostgreSQL is not running"
    exit 1
fi
echo "✅ PostgreSQL is running"
if ! docker compose ps redis | grep -q "Up"; then
    echo "⚠️ Redis is not running (optional)"
fi
# Check database connection
if ! docker exec litellm-postgres pg_isready -U litellm_user -d litellm > /dev/null 2>&1; then
    echo "❌ Database connection failed"
    exit 1
fi
echo "✅ Database connection successful"
# Check virtual environment
if [ -z "$VIRTUAL_ENV" ]; then
    echo "⚠️ Virtual environment not activated"
    echo " Run: source .venv/bin/activate"
fi
# Check services
if ! curl -s http://localhost:8002/health > /dev/null 2>&1; then
    echo "⚠️ LiteLLM is not running"
    echo " Run: python scripts/start_local.py"
fi
if ! curl -s http://localhost:8003/health > /dev/null 2>&1; then
    echo "⚠️ SaaS API is not running"
    echo " Run: python scripts/start_saas_api.py"
fi
echo ""
echo "✅ Pre-test checks complete"
Regular Maintenance¶
# Clean up unused containers, images, and volumes (monthly)
# Warning: this affects every Docker project on the machine, not only this one
docker system prune -af --volumes
# Update dependencies (weekly)
uv pip list --outdated
# Vacuum database (weekly)
docker exec litellm-postgres psql -U litellm_user -d litellm -c "VACUUM ANALYZE;"
# Clear test data (after testing)
python scripts/cleanup_test_data.py
Related Documentation¶
- Testing Overview - Testing strategy and philosophy
- Integration Tests - Running integration tests
- Getting Started - Initial setup instructions
- Deployment - Production deployment guide