API Reference Overview¶
Complete reference for the SaaS LiteLLM API endpoints.
Overview¶
The SaaS LiteLLM API provides REST endpoints for managing jobs, making LLM calls, managing teams, and tracking usage.
Base URL (Local): http://localhost:8003/api
Base URL (Production): https://your-domain.com/api
Authentication: Bearer token (virtual key) in Authorization header
Interactive API Documentation¶
For complete, interactive API documentation with "Try it out" functionality:
- ReDoc: Beautiful, responsive API documentation
- Swagger UI: Interactive API testing interface
API Categories¶
Jobs API¶
Manage job lifecycle for cost tracking:
| Endpoint | Method | Description |
|---|---|---|
| /api/jobs/create | POST | Create a new job |
| /api/jobs/{job_id} | GET | Get job details |
| /api/jobs/{job_id}/complete | POST | Complete a job |
| /api/jobs/{job_id}/costs | GET | Get job cost breakdown |
LLM Calls API¶
Make LLM calls within jobs:
| Endpoint | Method | Description |
|---|---|---|
| /api/jobs/{job_id}/llm-call | POST | Non-streaming LLM call |
| /api/jobs/{job_id}/llm-call-stream | POST | Streaming LLM call (SSE) |
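The streaming endpoint delivers its output as Server-Sent Events (SSE). The sketch below shows one way a client might pull the payloads out of such a stream. It assumes each event carries a JSON object on a single `data:` line and that the stream ends with a `[DONE]` sentinel; both are assumptions borrowed from common LLM streaming conventions, not confirmed by this reference. The helper name `iter_sse_data` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str], done_marker: str = "[DONE]") -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line in an SSE stream.

    `done_marker` is an assumed end-of-stream sentinel; adjust or drop it
    if the server signals completion differently.
    """
    for raw in lines:
        line = raw.strip()
        # Skip blank separators, comments (":" prefix), and non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == done_marker:
            break
        yield json.loads(payload)
```

With the `requests` library, the lines could come from `response.iter_lines(decode_unicode=True)` on a request made with `stream=True`. This simplified parser treats every `data:` line as a complete payload and does not join multi-line events.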
Teams API¶
Manage teams and access:
| Endpoint | Method | Description |
|---|---|---|
| /api/teams/create | POST | Create a new team |
| /api/teams/{team_id} | GET | Get team details |
| /api/teams/{team_id} | PUT | Update team |
| /api/teams/{team_id}/suspend | POST | Suspend team |
| /api/teams/{team_id}/resume | POST | Resume team |
| /api/teams/{team_id}/usage | GET | Get team usage stats |
Organizations API¶
Manage organizations:
| Endpoint | Method | Description |
|---|---|---|
| /api/organizations/create | POST | Create organization |
| /api/organizations/{org_id} | GET | Get organization details |
| /api/organizations/{org_id}/teams | GET | List organization teams |
Credits API¶
Manage team credits:
| Endpoint | Method | Description |
|---|---|---|
| /api/credits/balance | GET | Get credit balance |
| /api/credits/add | POST | Add credits to team |
| /api/credits/transactions | GET | Get credit transaction history |
Model Access Groups API¶
Control model access per team:
| Endpoint | Method | Description |
|---|---|---|
| /api/model-access-groups/create | POST | Create access group |
| /api/model-access-groups/{group_name} | GET | Get access group |
| /api/model-access-groups/{group_name} | PUT | Update access group |
Model Aliases API¶
Configure model aliases:
| Endpoint | Method | Description |
|---|---|---|
| /api/model-aliases/create | POST | Create model alias |
| /api/model-aliases/{alias_name} | GET | Get model alias |
Health API¶
Check system health:
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check |
Authentication¶
All endpoints (except /health) require authentication with a virtual key:
curl -X POST http://localhost:8003/api/jobs/create \
-H "Authorization: Bearer sk-your-virtual-key" \
-H "Content-Type: application/json" \
-d '{"team_id": "acme-corp", "job_type": "test"}'
Common Request Patterns¶
Create Job → Make Call → Complete¶
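import requests

API_URL = "http://localhost:8003/api"
HEADERS = {"Authorization": "Bearer sk-your-virtual-key"}

# 1. Create job
job = requests.post(
    f"{API_URL}/jobs/create",
    headers=HEADERS,
    json={"team_id": "acme-corp", "job_type": "analysis"},
    timeout=30,
).json()
job_id = job["job_id"]

# 2. Make LLM call
response = requests.post(
    f"{API_URL}/jobs/{job_id}/llm-call",
    headers=HEADERS,
    json={"messages": [{"role": "user", "content": "..."}]},
    timeout=60,
).json()

# 3. Complete job
result = requests.post(
    f"{API_URL}/jobs/{job_id}/complete",
    headers=HEADERS,
    json={"status": "completed"},
    timeout=30,
).json()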
Check Credits Before Call¶
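import requests

API_URL = "http://localhost:8003/api"
HEADERS = {"Authorization": "Bearer sk-your-virtual-key"}

# Check balance
balance = requests.get(
    f"{API_URL}/credits/balance",
    headers=HEADERS,
    params={"team_id": "acme-corp"},
    timeout=30,
).json()

if balance["credits_remaining"] > 0:
    pass  # Make call
else:
    # Add credits first
    requests.post(
        f"{API_URL}/credits/add",
        headers=HEADERS,
        json={"team_id": "acme-corp", "amount": 100},
        timeout=30,
    )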
Response Formats¶
Success Response (200 OK)¶
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "created_at": "2024-10-14T12:00:00Z"
}
Error Response (4xx/5xx)¶
Validation Error (422)¶
{
  "detail": [
    {
      "loc": ["body", "messages"],
      "msg": "field required",
      "type": "value_error.missing"
    }
  ]
}
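A validation error body like the one above can be flattened into readable messages before it is logged or shown to a user. The following is a minimal sketch; the helper name `summarize_validation_errors` is hypothetical, and it assumes only the `detail`, `loc`, and `msg` fields shown in the sample.

```python
from typing import Any, Dict, List


def summarize_validation_errors(body: Dict[str, Any]) -> List[str]:
    """Turn a 422 response body into one readable line per failed field."""
    detail = body.get("detail", [])
    # Some errors may carry a plain string instead of a list of field errors.
    if not isinstance(detail, list):
        return [str(detail)]
    messages = []
    for err in detail:
        # `loc` is a path such as ["body", "messages"]; join it with dots.
        location = ".".join(str(part) for part in err.get("loc", []))
        messages.append(f"{location}: {err.get('msg', 'unknown error')}")
    return messages
```

Applied to the sample body, this yields a single entry naming `body.messages` as the missing field.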
HTTP Status Codes¶
| Code | Meaning | Description |
|---|---|---|
| 200 | OK | Request successful |
| 201 | Created | Resource created |
| 400 | Bad Request | Invalid request format |
| 401 | Unauthorized | Invalid or missing virtual key |
| 403 | Forbidden | Insufficient credits or access denied |
| 404 | Not Found | Resource not found |
| 422 | Unprocessable Entity | Validation error |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
| 503 | Service Unavailable | Service temporarily unavailable |
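One way to act on these codes in client code is to map each one to a coarse action. The sketch below is illustrative only: the action names are hypothetical, and treating 429 and 503 as the only retryable codes is an assumption rather than documented server behavior.

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code from the table to a coarse client action."""
    if code in (200, 201):
        return "success"
    if code in (429, 503):
        return "retry_later"  # transient: back off and try again
    if code == 401:
        return "check_virtual_key"
    if code == 403:
        return "check_credits_or_access"
    if code in (400, 404, 422):
        return "fix_request"  # retrying the same request will not help
    return "report_error"  # 500 and anything not listed in the table
```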
Rate Limits¶
Rate limits are enforced per team:
- Requests per minute (RPM): Configurable per team
- Tokens per minute (TPM): Configurable per team
When rate limited, you'll receive a 429 Too Many Requests response. Implement exponential backoff for retries.
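The retry advice can be sketched as a small wrapper. This is one possible shape, not the official client's behavior: `send` is a hypothetical callable that performs the request and returns `(status_code, body)`, and the delay parameters and the choice to also retry on 503 are assumptions.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE = {429, 503}  # assumption: only these statuses are worth retrying


def call_with_backoff(
    send: Callable[[], Tuple[int, object]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, object]:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        # Double the delay on each attempt, cap it, and add random jitter.
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

The jitter spreads out retries from clients that were rate limited at the same moment. The `sleep` parameter exists so the wait can be replaced in tests.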
Pagination¶
Endpoints that return lists support pagination:
Query Parameters:
- limit (int, default: 100): Number of items per page
- offset (int, default: 0): Number of items to skip
Example:
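A client can walk a paginated list by advancing `offset` in steps of `limit`. The sketch below assumes that a list endpoint returns a JSON array and that a page shorter than `limit` marks the end; neither is confirmed here. `fetch_page` is a hypothetical callable that wraps the actual GET request with `limit` and `offset` query parameters.

```python
from typing import Callable, Iterator, List


def iter_pages(
    fetch_page: Callable[[int, int], List[dict]],
    limit: int = 100,
) -> Iterator[dict]:
    """Yield every item by advancing `offset` until a short page comes back."""
    offset = 0
    while True:
        items = fetch_page(limit, offset)
        yield from items
        # A page shorter than `limit` is taken to mean the end of the list.
        if len(items) < limit:
            break
        offset += limit
```

Passing the fetch function in keeps the loop independent of any HTTP library and easy to exercise with canned data.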
Filtering and Sorting¶
Time-based Filtering¶
Sorting¶
Versioning¶
The API uses URL-based versioning:
- Current version: v1 (default, no prefix required)
- Future versions: /api/v2/...
OpenAPI Specification¶
Download the OpenAPI 3.0 specification:
Use the spec to:
- Generate client libraries
- Import into API testing tools (Postman, Insomnia)
- Build custom tooling
SDKs and Clients¶
Python Client¶
Type-safe async Python client:
import asyncio
from examples.typed_client import SaaSLLMClient

async def main():
    async with SaaSLLMClient(
        base_url="http://localhost:8003",
        team_id="acme-corp",
        virtual_key="sk-your-key"
    ) as client:
        job_id = await client.create_job("test")
        # ...

asyncio.run(main())
Other Languages¶
Currently, we provide an official Python client. For other languages:
- Use the OpenAPI spec to generate clients
- Use standard HTTP libraries
- See example code in the integration guides
Webhooks¶
Register webhooks to receive notifications:
POST /api/webhooks/register
{
  "team_id": "acme-corp",
  "webhook_url": "https://your-app.com/webhooks/job-complete",
  "events": ["job.completed", "job.failed"]
}
Webhook Payload:
{
  "event": "job.completed",
  "job_id": "job_789abc",
  "team_id": "acme-corp",
  "timestamp": "2024-10-14T12:00:00Z",
  "data": {
    "total_calls": 5,
    "duration_seconds": 45
  }
}
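On the receiving side, a handler typically decodes the payload and branches on the `event` field. The sketch below covers only that dispatch step, using the two event names shown above. The handler names and return strings are hypothetical, and the shape of a `job.failed` payload is assumed to match the `job.completed` sample.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


def on_completed(payload: dict) -> str:
    data = payload.get("data", {})
    return f"job {payload['job_id']} completed after {data.get('total_calls', 0)} calls"


def on_failed(payload: dict) -> str:
    return f"job {payload['job_id']} failed"


HANDLERS: Dict[str, Handler] = {
    "job.completed": on_completed,
    "job.failed": on_failed,
}


def dispatch_webhook(payload: dict) -> str:
    """Route a decoded webhook payload to the handler for its event type."""
    handler = HANDLERS.get(payload.get("event", ""))
    if handler is None:
        return "ignored"  # unknown events are acknowledged but not processed
    return handler(payload)
```

In a real receiver this function would be called from the web framework's route for the registered `webhook_url`, after the request body has been parsed as JSON.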
Idempotency¶
POST requests support idempotency keys to prevent duplicate operations:
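curl -X POST http://localhost:8003/api/jobs/create \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: unique-key-123" \
  -d '{"team_id": "acme-corp", "job_type": "test"}'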
If you retry with the same idempotency key within 24 hours, you'll receive the same response.
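In client code, the important detail is to generate the key once per logical operation and reuse it on every retry. The sketch below builds the request with the standard library without sending it. The helper name `build_create_job_request` is hypothetical, and using a UUID as the key is one reasonable choice rather than a documented requirement.

```python
import json
import urllib.request
import uuid


def build_create_job_request(
    base_url: str, virtual_key: str, team_id: str, job_type: str, idempotency_key: str
) -> urllib.request.Request:
    """Build (but do not send) a job-creation request carrying an idempotency key."""
    body = json.dumps({"team_id": team_id, "job_type": job_type}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/jobs/create",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {virtual_key}",
            "Content-Type": "application/json",
            "Idempotency-Key": idempotency_key,
        },
    )


# Generate the key once per logical operation, then reuse it for every retry.
key = str(uuid.uuid4())
first = build_create_job_request("http://localhost:8003/api", "sk-your-key", "acme-corp", "test", key)
retry = build_create_job_request("http://localhost:8003/api", "sk-your-key", "acme-corp", "test", key)
```

Either request can then be sent with `urllib.request.urlopen(first, timeout=30)`. Because both carry the same key, a retry sent within the 24-hour window should return the original response instead of creating a second job.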
CORS¶
CORS is enabled for web applications. Allowed origins can be configured in the server settings.
Testing¶
Test Endpoints in Browser¶
Use Swagger UI for interactive testing. On a local instance, the interactive docs are served at http://localhost:8003/docs.
Test with cURL¶
# Create job
curl -X POST http://localhost:8003/api/jobs/create \
-H "Authorization: Bearer sk-your-key" \
-H "Content-Type: application/json" \
-d '{"team_id": "acme-corp", "job_type": "test"}'
# Make LLM call
curl -X POST http://localhost:8003/api/jobs/{job_id}/llm-call \
-H "Authorization: Bearer sk-your-key" \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Hello"}]}'
Test with Python¶
import requests

API_URL = "http://localhost:8003/api"
VIRTUAL_KEY = "sk-your-key"

headers = {
    "Authorization": f"Bearer {VIRTUAL_KEY}",
    "Content-Type": "application/json"
}

response = requests.post(
    f"{API_URL}/jobs/create",
    headers=headers,
    json={"team_id": "acme-corp", "job_type": "test"}
)
print(response.json())
Best Practices¶
- Always use HTTPS in production
- Implement exponential backoff for retries
- Set reasonable timeouts (30-60 seconds)
- Handle all error codes appropriately
- Monitor rate limits and credit usage
- Use idempotency keys for critical operations
- Validate request data before sending
- Log requests for debugging
Detailed API Documentation¶
For detailed documentation on specific API categories:
- Jobs API: Create and manage jobs for cost tracking
- LLM Calls API: Make streaming and non-streaming LLM calls
- Teams API: Manage teams and access controls
- Organizations API: Manage organizations and hierarchies
Getting Help¶
If you encounter issues with the API:
- Check the interactive docs - http://localhost:8003/docs
- Review error handling guide - Error Handling
- See examples - Basic Usage
- Check troubleshooting - Troubleshooting Guide
Next Steps¶
- Try the Interactive Docs - Test endpoints in your browser
- Read Integration Guide - Learn integration patterns
- See Examples - Working code examples
- Review Jobs API - Detailed job endpoint documentation