Credits¶
Learn how the credit system works and how to allocate credits to teams.
How Credits Work¶
SaaS LiteLLM uses a credit-based billing system built on top of LiteLLM to simplify cost tracking:
- 1 credit = 1 completed job (regardless of how many LLM calls the job makes)
- Credits are allocated per team
- Credits are only deducted when a job completes successfully
- Failed jobs don't consume credits
Built on LiteLLM
While LiteLLM tracks token-based costs from providers, SaaS LiteLLM abstracts this into simple credit-based billing. This allows you to set predictable pricing for your clients while tracking actual provider costs internally.
Why Credits Instead of Tokens?¶
Traditional Token Billing:
- Document analysis: 2,345 tokens = $0.0234
- Chat session: 5,123 tokens = $0.0512
- Data extraction: 1,234 tokens = $0.0123
Credit Billing:
- ✅ Predictable costs
- ✅ Easy to understand
- ✅ Simple budgeting
Credit Allocation¶
Initial Allocation¶
When creating a team, specify initial credits:
curl -X POST http://localhost:8003/api/teams/create \
-H "Content-Type: application/json" \
-d '{
"organization_id": "org_client",
"team_id": "client-prod",
"team_alias": "Production",
"access_groups": ["gpt-models"],
"credits_allocated": 1000
}'
Response:
Adding Credits¶
Add credits to an existing team:
Via Dashboard:
1. Navigate to Teams → Select team
2. Click "Add Credits"
3. Enter amount
4. Click "Add"
Via API:
curl -X POST http://localhost:8003/api/credits/add \
-H "Content-Type: application/json" \
-d '{
"team_id": "client-prod",
"amount": 500,
"description": "Monthly credit top-up - November 2024"
}'
Response:
{
"team_id": "client-prod",
"credits_added": 500,
"credits_remaining": 1250,
"transaction_id": "txn_abc123"
}
Checking Credit Balance¶
Via Dashboard¶
Navigate to Teams → Select team → See credits displayed
Via API¶
Response:
{
"team_id": "client-prod",
"credits_allocated": 1500,
"credits_remaining": 750,
"credits_used": 750,
"percentage_used": 50.0,
"status": "active"
}
Credit Deduction Logic¶
When Credits Are Deducted¶
Credits are deducted only when a job completes successfully:
# Create job - no credits deducted yet
job_id = create_job("document_analysis")
# Make LLM calls - no credits deducted yet
llm_call(job_id, messages)
llm_call(job_id, messages)
llm_call(job_id, messages)
# Complete job - NOW 1 credit is deducted
complete_job(job_id, "completed")
Failed Jobs Don't Cost Credits¶
# Create job
job_id = create_job("analysis")
try:
    # Make an LLM call that fails
    llm_call(job_id, messages)
except Exception:
    # Mark the job as failed
    complete_job(job_id, "failed")
    # No credits deducted! ✅
Multiple Calls = One Credit¶
The beauty of job-based billing:
# One job with 5 LLM calls = 1 credit
job_id = create_job("complex_analysis")
extract_text(job_id)        # Call 1
classify(job_id)            # Call 2
summarize(job_id)           # Call 3
generate_insights(job_id)   # Call 4
quality_check(job_id)       # Call 5
complete_job(job_id, "completed")
# Total cost: 1 credit (not 5!)
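The deduction rules above can be illustrated with a small in-memory model. This is only a sketch: `TeamLedger` and its methods are hypothetical names invented for the example, not part of the SaaS LiteLLM API, and a real deployment would persist balances in a database.

```python
from dataclasses import dataclass, field


@dataclass
class TeamLedger:
    """In-memory stand-in for a team's credit balance (hypothetical, for illustration)."""
    credits_remaining: int
    jobs: dict = field(default_factory=dict)  # job_id -> {"calls": int, "status": str}

    def create_job(self, job_id: str) -> None:
        # Creating a job never touches the balance.
        self.jobs[job_id] = {"calls": 0, "status": "pending"}

    def record_llm_call(self, job_id: str) -> None:
        # LLM calls are counted but not billed individually.
        self.jobs[job_id]["calls"] += 1

    def complete_job(self, job_id: str, status: str) -> int:
        """Close the job and return the number of credits deducted (0 or 1)."""
        job = self.jobs[job_id]
        if job["status"] != "pending":
            return 0  # already closed, so never bill the same job twice
        job["status"] = status
        if status == "completed":
            self.credits_remaining -= 1
            return 1
        return 0


ledger = TeamLedger(credits_remaining=10)

ledger.create_job("job_a")
for _ in range(5):
    ledger.record_llm_call("job_a")
ledger.complete_job("job_a", "completed")  # 5 calls, 1 credit

ledger.create_job("job_b")
ledger.record_llm_call("job_b")
ledger.complete_job("job_b", "failed")     # failed job, 0 credits

print(ledger.credits_remaining)
```

Guarding on the `pending` status is a design choice of this sketch: it makes a repeated completion call for the same job a no-op rather than a second charge.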
Credit Transactions¶
View Transaction History¶
Track all credit additions and deductions:
Response:
{
"team_id": "client-prod",
"transactions": [
{
"transaction_id": "txn_abc123",
"type": "addition",
"amount": 500,
"description": "Monthly credit top-up",
"timestamp": "2024-10-14T10:00:00Z",
"balance_after": 1250
},
{
"transaction_id": "txn_abc122",
"type": "deduction",
"amount": 1,
"description": "Job: job_xyz123 completed",
"job_id": "job_xyz123",
"timestamp": "2024-10-14T09:30:00Z",
"balance_after": 750
}
]
}
Low Credit Alerts¶
Setting Up Alerts¶
Monitor teams approaching zero credits:
# Check if team needs alert
def check_credit_alerts(team_id):
    balance = get_credit_balance(team_id)
    allocated = balance['credits_allocated']
    remaining = balance['credits_remaining']
    # Guard against division by zero for teams with no allocation
    if allocated <= 0:
        return
    percentage = (remaining / allocated) * 100
    # Critical alert at 10% or less remaining
    if percentage <= 10:
        send_alert("critical_credits", team_id, remaining)
    # Low-credit alert between 10% and 20% remaining
    elif percentage <= 20:
        send_alert("low_credits", team_id, remaining)
Automated Top-ups¶
Set up automatic credit additions:
# Auto top-up when credits fall below a threshold
def auto_topup_check(team_id):
    balance = get_credit_balance(team_id)
    if balance['credits_remaining'] < 100:
        # Add predefined amount
        add_credits(team_id, 1000, "Automatic monthly top-up")
        # The new balance is the previous remainder plus 1,000, so report the amount added
        notify_client(team_id, "1,000 credits added to your balance")
Pricing Strategies¶
Strategy 1: Credit Packages¶
Sell credits in packages:
Starter: 1,000 credits = $99/month
Professional: 5,000 credits = $399/month
Enterprise: 20,000 credits = $1,299/month
Implementation:
# Starter plan
curl -X POST http://localhost:8003/api/credits/add \
-H "Content-Type: application/json" \
-d '{"team_id": "client-prod", "amount": 1000}'
# Professional plan
curl -X POST http://localhost:8003/api/credits/add \
-H "Content-Type: application/json" \
-d '{"team_id": "client-prod", "amount": 5000}'
Strategy 2: Pay-As-You-Go¶
Charge per credit used:
Track Actual Costs:
# Get usage for billing
curl "http://localhost:8003/api/teams/client-prod/usage?period=2024-10"
# Response includes:
# - credits_used: 534
# - Your charge: 534 × $0.10 = $53.40
# - Actual LiteLLM cost: $45.67 (internal tracking)
# - Your profit: $7.73
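The arithmetic in the comments above can be reproduced with a short helper. This is a minimal sketch: `pay_as_you_go_invoice` is a hypothetical function, and the $0.10 price per credit is taken from the example figures rather than from any built-in setting. `Decimal` is used so that currency amounts do not pick up floating-point rounding error.

```python
from decimal import Decimal


def pay_as_you_go_invoice(credits_used: int, price_per_credit: str, provider_cost: str) -> dict:
    """Compute the client charge and the margin over actual provider cost."""
    charge = Decimal(credits_used) * Decimal(price_per_credit)
    cost = Decimal(provider_cost)
    return {
        "charge_usd": charge,
        "provider_cost_usd": cost,
        "margin_usd": charge - cost,
    }


# Figures from the usage example: 534 credits at $0.10, $45.67 provider cost
invoice = pay_as_you_go_invoice(534, "0.10", "45.67")
print(invoice["charge_usd"], invoice["margin_usd"])
```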
Strategy 3: Subscription + Overage¶
Base subscription with overage charges:
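As a rough illustration of how such an invoice could be computed, the sketch below bills a flat base price plus a per-credit rate for usage beyond the included amount. The function name and the $0.12 overage rate are hypothetical; the $99 base with 1,000 included credits mirrors the Starter package above.

```python
from decimal import Decimal


def subscription_invoice(credits_used: int, included_credits: int,
                         base_price: str, overage_price: str) -> dict:
    """Flat base price plus a per-credit charge for usage above the included amount."""
    overage_credits = max(0, credits_used - included_credits)
    overage_charge = Decimal(overage_credits) * Decimal(overage_price)
    return {
        "overage_credits": overage_credits,
        "overage_usd": overage_charge,
        "total_usd": Decimal(base_price) + overage_charge,
    }


# Hypothetical plan: $99 base with 1,000 credits included, $0.12 per extra credit
bill = subscription_invoice(1250, 1000, "99.00", "0.12")
print(bill["overage_credits"], bill["total_usd"])
```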
Strategy 4: Tiered Pricing¶
Volume discounts:
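One common way to implement volume discounts is marginal tiers, where each block of credits is billed at its own rate. The sketch below assumes that model; the tier sizes and prices in `TIERS` are invented for illustration and are not defaults of the platform.

```python
from decimal import Decimal

# Hypothetical tiers: (credits in tier, price per credit); None means "all remaining credits"
TIERS = [
    (1000, Decimal("0.10")),
    (4000, Decimal("0.08")),
    (None, Decimal("0.06")),
]


def tiered_charge(credits_used: int, tiers=TIERS) -> Decimal:
    """Bill each block of credits at its tier's rate, lowest tier first."""
    remaining = credits_used
    total = Decimal("0")
    for size, price in tiers:
        if remaining <= 0:
            break
        billable = remaining if size is None else min(remaining, size)
        total += billable * price
        remaining -= billable
    return total


# 1,000 at $0.10 + 4,000 at $0.08 + 1,000 at $0.06
print(tiered_charge(6000))
```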
Budget Modes¶
Mode 1: Hard Limit (Default)¶
Team cannot exceed allocated credits:
Behavior:
- Job creation succeeds
- LLM calls succeed
- Job completion fails if credits are exhausted
- API returns 403 "Insufficient credits"
Mode 2: Soft Limit with Alerts¶
Allow overage with notifications:
{
"team_id": "client-prod",
"budget_mode": "soft_limit",
"credits_allocated": 1000,
"alert_at_percentage": 80
}
Behavior:
- Can exceed allocated credits
- Alerts sent at 80%, 100%, 120%
- Track overage separately for billing
Mode 3: Unlimited (Enterprise)¶
No credit limits:
Behavior:
- No credit checks
- Track usage for billing
- Typically for enterprise contracts
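The three modes can be summarized as a single decision made at job completion. The sketch below is one possible reading of the behavior described above, not the service's actual code: the mode strings `hard_limit` and `unlimited` are assumed by analogy with the documented `soft_limit`, and `check_completion` and `alerts_crossed` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: bool
    status_code: int
    detail: str


def check_completion(budget_mode: str, credits_allocated: int, credits_used: int) -> Decision:
    """Decide whether one more job completion is allowed under the team's budget mode."""
    if budget_mode == "unlimited":
        return Decision(True, 200, "usage tracked, no limit enforced")
    if budget_mode == "soft_limit":
        return Decision(True, 200, "overage allowed and tracked")
    if budget_mode == "hard_limit":
        if credits_used >= credits_allocated:
            return Decision(False, 403, "Insufficient credits")
        return Decision(True, 200, "ok")
    raise ValueError(f"unknown budget_mode: {budget_mode!r}")


def alerts_crossed(allocated: int, used_before: int, used_after: int,
                   thresholds=(80, 100, 120)) -> list:
    """Return the percentage thresholds crossed when usage moves from used_before to used_after."""
    return [t for t in thresholds
            if used_before * 100 < t * allocated <= used_after * 100]


print(check_completion("hard_limit", 1000, 1000))
print(alerts_crossed(1000, 799, 800))
```

Comparing `used * 100` against `threshold * allocated` keeps the alert check in integer arithmetic, so a threshold fires exactly once, on the completion that crosses it.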
Credit Replenishment¶
Add credits to teams from payments (subscriptions or one-time purchases).
Replenishing from Payment¶
When a client pays for credits, use the replenish endpoint:
API Endpoint: POST /api/credits/teams/{team_id}/replenish
Authentication: Admin only (JWT Bearer token or X-Admin-Key header)
curl -X POST http://localhost:8003/api/credits/teams/acme-prod/replenish \
-H "Authorization: Bearer your-admin-jwt" \
-H "Content-Type: application/json" \
-d '{
"credits": 5000,
"payment_type": "subscription",
"payment_amount_usd": 499.00,
"reason": "November 2024 subscription payment"
}'
Response:
{
"team_id": "acme-prod",
"credits_added": 5000,
"credits_before": 1250,
"credits_after": 6250,
"payment_type": "subscription",
"payment_amount_usd": 499.00,
"transaction": {
"transaction_id": "txn_abc123",
"transaction_type": "subscription_payment",
"credits_amount": 5000,
"credits_before": 1250,
"credits_after": 6250,
"reason": "November 2024 subscription payment ($499.00 USD)",
"created_at": "2024-11-01T00:00:00Z"
}
}
Payment Types¶
Subscription Payment:
- Recurring monthly/annual payments
- Updates last_refill_at timestamp
- Creates subscription_payment transaction
One-Time Payment:
- Ad-hoc credit purchases
- Top-ups or overages
- Creates one_time_payment transaction
Auto-Refill Configuration¶
Set up automatic credit refills tied to subscription billing:
API Endpoint: POST /api/credits/teams/{team_id}/configure-auto-refill
curl -X POST http://localhost:8003/api/credits/teams/acme-prod/configure-auto-refill \
-H "Authorization: Bearer your-admin-jwt" \
-H "Content-Type: application/json" \
-d '{
"enabled": true,
"refill_amount": 5000,
"refill_period": "monthly"
}'
Response:
{
"team_id": "acme-prod",
"auto_refill_enabled": true,
"refill_amount": 5000,
"refill_period": "monthly",
"last_refill_at": "2024-11-01T00:00:00Z",
"message": "Auto-refill enabled successfully"
}
Refill Periods:
- monthly - Refill once per month
- weekly - Refill once per week
- daily - Refill once per day
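To make the periods concrete, the sketch below computes when the next refill would fall due from `last_refill_at`. How the service actually schedules refills is not documented here; this example assumes calendar-month arithmetic for `monthly` (clamping to the last day of shorter months), and `next_refill_at` and `refill_due` are hypothetical helpers.

```python
import calendar
from datetime import datetime, timedelta, timezone


def next_refill_at(last_refill_at: datetime, refill_period: str) -> datetime:
    """Return the earliest time the next refill is due for the given period."""
    if refill_period == "daily":
        return last_refill_at + timedelta(days=1)
    if refill_period == "weekly":
        return last_refill_at + timedelta(weeks=1)
    if refill_period == "monthly":
        # Move to the same day of the next month, clamped to that month's length
        year = last_refill_at.year + (last_refill_at.month // 12)
        month = last_refill_at.month % 12 + 1
        day = min(last_refill_at.day, calendar.monthrange(year, month)[1])
        return last_refill_at.replace(year=year, month=month, day=day)
    raise ValueError(f"unknown refill_period: {refill_period!r}")


def refill_due(last_refill_at: datetime, refill_period: str, now: datetime) -> bool:
    return now >= next_refill_at(last_refill_at, refill_period)


last = datetime(2024, 11, 1, tzinfo=timezone.utc)
print(next_refill_at(last, "monthly"))
print(refill_due(last, "monthly", datetime(2024, 11, 30, tzinfo=timezone.utc)))
```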
Disabling Auto-Refill¶
curl -X POST http://localhost:8003/api/credits/teams/acme-prod/configure-auto-refill \
-H "Authorization: Bearer your-admin-jwt" \
-H "Content-Type: application/json" \
-d '{
"enabled": false
}'
Integration with Payment Processors¶
Example: Stripe Webhook Handler
import os
import requests
# Assumption: the admin JWT is supplied via an environment variable
ADMIN_JWT = os.environ["ADMIN_JWT"]
def handle_stripe_subscription_payment(event):
    """Handle successful subscription payment from Stripe"""
    subscription = event['data']['object']
    customer_id = subscription['customer']
    # Assumes latest_invoice has been expanded into a full invoice object;
    # if it is only an invoice ID, fetch the invoice before reading amount_paid
    amount_paid = subscription['latest_invoice']['amount_paid'] / 100  # cents to dollars
    # Map Stripe customer to team_id
    team_id = get_team_id_from_stripe_customer(customer_id)
    # Calculate credits based on plan
    credits_to_add = calculate_credits_from_amount(amount_paid)
    # Replenish credits
    response = requests.post(
        f"http://localhost:8003/api/credits/teams/{team_id}/replenish",
        headers={"Authorization": f"Bearer {ADMIN_JWT}"},
        json={
            "credits": credits_to_add,
            "payment_type": "subscription",
            "payment_amount_usd": amount_paid,
            "reason": f"Stripe subscription payment - {subscription['id']}"
        },
        timeout=10,
    )
    # Raise on 4xx/5xx so a failed replenishment is not silently ignored
    response.raise_for_status()
    return response.json()
Transaction Tracking¶
All replenishments create audit trail transactions:
View Replenishment History:
curl "http://localhost:8003/api/credits/teams/acme-prod/transactions?limit=50" \
-H "Authorization: Bearer sk-virtual-key"
Response showing replenishment transactions:
{
"team_id": "acme-prod",
"total": 3,
"transactions": [
{
"transaction_id": "txn_abc123",
"transaction_type": "subscription_payment",
"credits_amount": 5000,
"credits_before": 1250,
"credits_after": 6250,
"reason": "November 2024 subscription payment ($499.00 USD)",
"created_at": "2024-11-01T00:00:00Z"
},
{
"transaction_id": "txn_abc122",
"transaction_type": "one_time_payment",
"credits_amount": 1000,
"credits_before": 250,
"credits_after": 1250,
"reason": "Additional credits purchase ($99.00 USD)",
"created_at": "2024-10-15T10:30:00Z"
}
]
}
Replenishment Best Practices¶
- Track Payment References
  - Include payment processor ID in reason field
  - Enables reconciliation between payments and credits
- Validate Before Replenishing
  - Verify payment completed successfully
  - Check for duplicate webhook events
  - Implement idempotency
- Monitor Refill Timing
  - Check last_refill_at timestamp
  - Prevent duplicate monthly refills
  - Handle edge cases (failed payments, cancellations)
- Automate Subscription Refills
  - Enable auto-refill for subscription customers
  - Configure appropriate refill_period
  - Decouple from manual payment processing
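The validation and idempotency points above can be sketched as follows. The event shape (`id`, `status`, `team_id`, `credits`) is a simplified, hypothetical stand-in for a payment processor's webhook payload, and the in-memory set and dict stand in for durable storage; a production handler would typically rely on a database uniqueness constraint instead.

```python
def apply_payment_event(processed_ids: set, balances: dict, event: dict) -> bool:
    """Credit a team for a payment event at most once. Returns True if credits were added."""
    # Only credit payments that actually completed
    if event["status"] != "paid":
        return False
    # Skip events that were already handled (webhooks may be delivered more than once)
    if event["id"] in processed_ids:
        return False
    team_id = event["team_id"]
    balances[team_id] = balances.get(team_id, 0) + event["credits"]
    # Record the event only after the credits were applied
    processed_ids.add(event["id"])
    return True


processed = set()
balances = {"acme-prod": 1250}
event = {"id": "evt_001", "status": "paid", "team_id": "acme-prod", "credits": 5000}

first = apply_payment_event(processed, balances, event)
second = apply_payment_event(processed, balances, event)  # duplicate delivery
print(first, second, balances["acme-prod"])
```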
Example Workflow: Monthly Subscription¶
import requests
from datetime import datetime
# ADMIN_JWT is assumed to be defined elsewhere (e.g., loaded from configuration)
# 1. Customer subscribes to $99/month plan (1000 credits)
team_id = "customer-prod"
monthly_credits = 1000
# 2. Configure auto-refill
requests.post(
    f"http://localhost:8003/api/credits/teams/{team_id}/configure-auto-refill",
    headers={"Authorization": f"Bearer {ADMIN_JWT}"},
    json={
        "enabled": True,
        "refill_amount": monthly_credits,
        "refill_period": "monthly"
    }
)
# 3. On payment success (e.g., Stripe webhook)
def on_payment_success(amount_usd, team_id):
    requests.post(
        f"http://localhost:8003/api/credits/teams/{team_id}/replenish",
        headers={"Authorization": f"Bearer {ADMIN_JWT}"},
        json={
            "credits": monthly_credits,
            "payment_type": "subscription",
            "payment_amount_usd": amount_usd,
            "reason": f"Monthly subscription - {datetime.now().strftime('%B %Y')}"
        }
    )
# 4. Customer usage tracked normally
# 5. Next month, repeat step 3 on next payment
Client Communication¶
Credit Allocation Email¶
Subject: Your Credits Have Been Allocated
Hi [Client],
We've allocated 1,000 credits to your account!
WHAT ARE CREDITS?
- 1 credit = 1 completed job
- Jobs can contain multiple LLM calls
- Only successful jobs consume credits
YOUR BALANCE:
- Allocated: 1,000 credits
- Remaining: 1,000 credits
ESTIMATED USAGE:
- Document analysis: ~1 credit per document
- Chat sessions: ~1 credit per conversation
- Your 1,000 credits = approximately 1,000 operations
MONITOR USAGE:
View real-time usage at: https://dashboard.yourcompany.com
Need more credits? Reply to this email or visit your dashboard.
Questions? support@yourcompany.com
Low Credit Warning¶
Subject: Low Credit Alert - 20% Remaining
Hi [Client],
Your credit balance is running low:
CURRENT BALANCE: 200 credits (20% remaining)
ESTIMATED DEPLETION: 2-3 days at current usage
ACTION REQUIRED:
1. Purchase additional credits: https://dashboard.yourcompany.com/credits
2. Or contact us: support@yourcompany.com
Don't let your integration stop! Top up now.
Best regards,
Your Company Team
Out of Credits Notice¶
Subject: URGENT: Credits Exhausted
Hi [Client],
Your account has run out of credits.
CURRENT BALANCE: 0 credits
STATUS: API calls suspended
TO RESTORE SERVICE:
1. Purchase credits immediately: https://dashboard.yourcompany.com/credits
2. Or contact support: support@yourcompany.com
Service will resume automatically once credits are added.
Questions? We're here to help!
Monitoring & Analytics¶
Team Credit Utilization¶
Response:
{
"total_teams": 25,
"summary": {
"total_allocated": 50000,
"total_remaining": 32000,
"total_used": 18000,
"avg_utilization": 36.0
},
"by_status": {
"healthy": 20, // >50% remaining
"warning": 3, // 20-50% remaining
"critical": 2 // <20% remaining
},
"teams_needing_attention": [
{
"team_id": "client-a-prod",
"credits_remaining": 50,
"percentage": 5.0,
"status": "critical"
}
]
}
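A report with this shape could be assembled from per-team balances roughly as follows. This is a sketch, not the endpoint's implementation: the status bands follow the comments in the response above, `avg_utilization` is assumed to be total used divided by total allocated (which matches the sample figures, 18000 / 50000 = 36.0), and both function names are hypothetical.

```python
def utilization_status(credits_allocated: int, credits_remaining: int) -> str:
    """Classify a team by percentage of credits remaining."""
    if credits_allocated <= 0:
        return "critical"
    percentage = credits_remaining * 100 / credits_allocated
    if percentage > 50:
        return "healthy"
    if percentage >= 20:
        return "warning"
    return "critical"


def utilization_report(teams: list) -> dict:
    total_allocated = sum(t["credits_allocated"] for t in teams)
    total_remaining = sum(t["credits_remaining"] for t in teams)
    total_used = total_allocated - total_remaining
    by_status = {"healthy": 0, "warning": 0, "critical": 0}
    needing_attention = []
    for t in teams:
        allocated, remaining = t["credits_allocated"], t["credits_remaining"]
        status = utilization_status(allocated, remaining)
        by_status[status] += 1
        if status == "critical":
            needing_attention.append({
                "team_id": t["team_id"],
                "credits_remaining": remaining,
                "percentage": round(remaining * 100 / allocated, 1) if allocated > 0 else 0.0,
                "status": status,
            })
    return {
        "total_teams": len(teams),
        "summary": {
            "total_allocated": total_allocated,
            "total_remaining": total_remaining,
            "total_used": total_used,
            "avg_utilization": round(total_used * 100 / total_allocated, 1) if total_allocated else 0.0,
        },
        "by_status": by_status,
        "teams_needing_attention": needing_attention,
    }


report = utilization_report([
    {"team_id": "team-a", "credits_allocated": 1000, "credits_remaining": 900},
    {"team_id": "team-b", "credits_allocated": 1000, "credits_remaining": 300},
    {"team_id": "client-a-prod", "credits_allocated": 1000, "credits_remaining": 50},
])
print(report["by_status"])
```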
Usage Trends¶
Track credit consumption over time:
Use For:
- Predicting when team will run out
- Recommending plan upgrades
- Identifying usage spikes
- Forecasting revenue
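For the first use case, a simple estimate divides the remaining balance by recent average daily consumption. The helper below is a hypothetical sketch that assumes daily usage counts are already available as a list; it is a naive linear projection, not a forecast model.

```python
from typing import Optional


def days_until_depletion(credits_remaining: int, daily_usage: list) -> Optional[float]:
    """Estimate days of credit left from recent daily usage (credits per day)."""
    if not daily_usage:
        return None  # no history to project from
    average = sum(daily_usage) / len(daily_usage)
    if average <= 0:
        return None  # no consumption, so no depletion date
    return credits_remaining / average


# 200 credits left, averaging 80 credits/day over the last three days
estimate = days_until_depletion(200, [70, 80, 90])
print(estimate)
```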
Best Practices¶
For Admins¶
- Start Conservative
  - Begin with modest allocation (1,000 credits)
  - Monitor usage first week
  - Adjust based on actual patterns
- Set Up Alerts
  - Alert at 50%, 20%, 10%, 0%
  - Proactive outreach before depletion
  - Automated top-up for trusted clients
- Review Monthly
  - Check all team credit levels
  - Identify heavy users (upsell opportunity)
  - Identify low users (check satisfaction)
- Track Profit Margins
  - Monitor actual LiteLLM costs vs. credits charged
  - Adjust pricing if margins too thin
  - Offer volume discounts for retention
For Clients¶
Share these tips with clients:
- Monitor Your Balance
  - Check dashboard regularly
  - Set up low-balance alerts
  - Don't let credits hit zero
- Estimate Usage
  - 1 credit ≈ 1 business operation
  - Track your typical job types
  - Budget accordingly
- Complete Jobs Properly
  - Always call complete_job()
  - Failed jobs don't cost credits
  - Incomplete jobs won't be billed
- Optimize Usage
  - Group related calls into single jobs
  - Don't create unnecessary jobs
  - Cache responses when possible
Common Scenarios¶
Scenario 1: Client Runs Out Mid-Month¶
Problem: Client exhausted their 1,000 credits in 2 weeks
Actions:
1. Add immediate emergency credits (100-200)
2. Analyze usage patterns
3. Recommend plan upgrade
4. Set up automatic top-ups going forward
Scenario 2: Client Barely Uses Credits¶
Problem: Client using <10% of allocation
Actions:
1. Check if they're having integration issues
2. Offer to help with implementation
3. Consider downgrading to save them money (builds trust)
4. Check if they need different features
Scenario 3: Unexpected Usage Spike¶
Problem: Team uses 3x normal credits in one day
Actions:
1. Check for runaway processes
2. Contact client to verify legitimate usage
3. Temporarily increase limits if needed
4. Investigate potential security issues
Troubleshooting¶
Credits Not Deducted¶
Problem: Jobs completing but credits unchanged
Solutions:
1. Check job actually reached "completed" status
2. Verify credit deduction logic in code
3. Check database transaction logs
4. Ensure job_id is valid
Can't Add Credits¶
Problem: API returns error when adding credits
Solutions:
1. Verify team exists and is active
2. Check team not suspended
3. Validate credit amount is positive integer
4. Check for database connection issues
Balance Shows Negative¶
Problem: credits_remaining is negative
Solutions:
1. This can happen with race conditions
2. Investigate concurrent job completions
3. Add credits to bring back to positive
4. Implement better locking in credit deduction
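The locking idea can be demonstrated in-process. In the sketch below, `CreditAccount` is a hypothetical class in which the balance check and the decrement happen under one lock, so concurrent completions cannot both pass the check and push the balance below zero.

```python
import threading


class CreditAccount:
    """Balance whose check-and-deduct step is atomic (illustrative only)."""

    def __init__(self, credits_remaining: int):
        self._lock = threading.Lock()
        self.credits_remaining = credits_remaining

    def deduct_one(self) -> bool:
        # Check and decrement under the same lock so no other thread can interleave
        with self._lock:
            if self.credits_remaining <= 0:
                return False
            self.credits_remaining -= 1
            return True


account = CreditAccount(50)
results = []


def worker():
    for _ in range(10):
        results.append(account.deduct_one())


# 10 threads x 10 attempts = 100 deduction attempts against 50 credits
threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(account.credits_remaining, sum(results))
```

A database-backed service would need the equivalent guarantee at the storage layer, for example a row lock or a conditional update that only decrements while the balance is positive. Which mechanism applies depends on the deployment.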
Next Steps¶
Now that you understand credits:
- Create Teams - Allocate credits when creating teams
- Monitor Usage - Track credit consumption
- Set Up Model Access - Control which models teams can use
- Review Best Practices - Optimize credit management