Credits

Learn how the credit system works and how to allocate credits to teams.

How Credits Work

SaaS LiteLLM uses a credit-based billing system built on top of LiteLLM to simplify cost tracking:

  • 1 credit = 1 completed job (regardless of how many LLM calls the job makes)
  • Credits are allocated per team
  • Credits are only deducted when a job completes successfully
  • Failed jobs don't consume credits

Built on LiteLLM

While LiteLLM tracks token-based costs from providers, SaaS LiteLLM abstracts this into simple credit-based billing. This allows you to set predictable pricing for your clients while tracking actual provider costs internally.

Why Credits Instead of Tokens?

Traditional Token Billing:

- Document analysis: 2,345 tokens = $0.0234
- Chat session: 5,123 tokens = $0.0512
- Data extraction: 1,234 tokens = $0.0123
❌ Complex for clients to predict costs
❌ Varies based on input/output length
❌ Hard to budget

Credit Billing:

- Document analysis: 1 credit
- Chat session: 1 credit
- Data extraction: 1 credit
✅ Predictable costs
✅ Easy to understand
✅ Simple budgeting

Credit Allocation

Initial Allocation

When creating a team, specify initial credits:

curl -X POST http://localhost:8003/api/teams/create \
  -H "Content-Type: application/json" \
  -d '{
    "organization_id": "org_client",
    "team_id": "client-prod",
    "team_alias": "Production",
    "access_groups": ["gpt-models"],
    "credits_allocated": 1000
  }'

Response:

{
  "team_id": "client-prod",
  "credits_allocated": 1000,
  "credits_remaining": 1000
}

Adding Credits

Add credits to an existing team:

Via Dashboard:

  1. Navigate to Teams → Select team
  2. Click "Add Credits"
  3. Enter amount
  4. Click "Add"

Via API:

curl -X POST http://localhost:8003/api/credits/add \
  -H "Content-Type: application/json" \
  -d '{
    "team_id": "client-prod",
    "amount": 500,
    "description": "Monthly credit top-up - November 2024"
  }'

Response:

{
  "team_id": "client-prod",
  "credits_added": 500,
  "credits_remaining": 1250,
  "transaction_id": "txn_abc123"
}

Checking Credit Balance

Via Dashboard

Navigate to Teams → Select team → See credits displayed

Via API

curl "http://localhost:8003/api/credits/balance?team_id=client-prod"

Response:

{
  "team_id": "client-prod",
  "credits_allocated": 1500,
  "credits_remaining": 750,
  "credits_used": 750,
  "percentage_used": 50.0,
  "status": "active"
}
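
The derived fields in this response follow from the two stored values. The sketch below shows the relationship, assuming `credits_used` is simply allocated minus remaining (which matches the sample above). The function name `summarize_balance` is hypothetical and not part of the API.

```python
def summarize_balance(credits_allocated: int, credits_remaining: int) -> dict:
    """Derive usage fields from the allocated and remaining credit counts."""
    credits_used = credits_allocated - credits_remaining
    if credits_allocated > 0:
        percentage_used = round(credits_used / credits_allocated * 100, 1)
    else:
        # Nothing allocated: avoid dividing by zero.
        percentage_used = 0.0
    return {
        "credits_allocated": credits_allocated,
        "credits_remaining": credits_remaining,
        "credits_used": credits_used,
        "percentage_used": percentage_used,
    }


summary = summarize_balance(credits_allocated=1500, credits_remaining=750)
print(summary["credits_used"], summary["percentage_used"])
```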

Credit Deduction Logic

When Credits Are Deducted

Credits are deducted only when a job completes successfully:

# Create job - no credits deducted yet
# (assumes create_job returns the new job's ID)
job_id = create_job("document_analysis")

# Make LLM calls - no credits deducted yet
llm_call(job_id, messages)
llm_call(job_id, messages)
llm_call(job_id, messages)

# Complete job - NOW 1 credit is deducted
complete_job(job_id, "completed")

Failed Jobs Don't Cost Credits

# Create job (assumes create_job returns the new job's ID)
job_id = create_job("analysis")

try:
    # Make LLM call that fails
    llm_call(job_id, messages)
except Exception:
    # Complete as failed
    complete_job(job_id, "failed")
    # No credits deducted! ✅

Multiple Calls = One Credit

The beauty of job-based billing:

# One job with 5 LLM calls = 1 credit
job_id = create_job("complex_analysis")  # assumes create_job returns the job ID

extract_text(job_id)      # Call 1
classify(job_id)          # Call 2
summarize(job_id)         # Call 3
generate_insights(job_id) # Call 4
quality_check(job_id)     # Call 5

complete_job(job_id, "completed")
# Total cost: 1 credit (not 5!)
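
The deduction rules in this section can be modeled with a small in-memory ledger. This is only an illustrative sketch: the `CreditLedger` class and its method names are hypothetical and are not the SaaS LiteLLM implementation.

```python
from dataclasses import dataclass, field


@dataclass
class CreditLedger:
    """Hypothetical in-memory model of job-based credit billing."""
    credits_remaining: int
    llm_calls: dict = field(default_factory=dict)  # job_id -> number of LLM calls

    def create_job(self, job_id: str) -> None:
        # Creating a job deducts nothing.
        self.llm_calls[job_id] = 0

    def record_llm_call(self, job_id: str) -> None:
        # Calls are counted for internal tracking, not billed individually.
        self.llm_calls[job_id] += 1

    def complete_job(self, job_id: str, status: str) -> int:
        """Close the job and return the number of credits deducted."""
        self.llm_calls.pop(job_id)
        if status != "completed":
            return 0  # failed jobs are free
        self.credits_remaining -= 1
        return 1


ledger = CreditLedger(credits_remaining=1000)

# A successful job with five LLM calls costs one credit.
ledger.create_job("job_a")
for _ in range(5):
    ledger.record_llm_call("job_a")
charged_ok = ledger.complete_job("job_a", "completed")

# A failed job costs nothing.
ledger.create_job("job_b")
ledger.record_llm_call("job_b")
charged_failed = ledger.complete_job("job_b", "failed")

print(charged_ok, charged_failed, ledger.credits_remaining)
```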

Credit Transactions

View Transaction History

Track all credit additions and deductions:

curl "http://localhost:8003/api/credits/transactions?team_id=client-prod&limit=10"

Response:

{
  "team_id": "client-prod",
  "transactions": [
    {
      "transaction_id": "txn_abc123",
      "type": "addition",
      "amount": 500,
      "description": "Monthly credit top-up",
      "timestamp": "2024-10-14T10:00:00Z",
      "balance_after": 1250
    },
    {
      "transaction_id": "txn_abc122",
      "type": "deduction",
      "amount": 1,
      "description": "Job: job_xyz123 completed",
      "job_id": "job_xyz123",
      "timestamp": "2024-10-14T09:30:00Z",
      "balance_after": 750
    }
  ]
}
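
One way to use this history is to verify that each `balance_after` follows from the previous entry. The sketch below assumes the field names shown in the sample response and that timestamps share the same ISO 8601 UTC format (so they sort correctly as strings). The helper `check_running_balance` is hypothetical.

```python
def check_running_balance(transactions: list[dict]) -> bool:
    """Return True if every balance_after is consistent with the entry before it."""
    ordered = sorted(transactions, key=lambda t: t["timestamp"])
    for prev, curr in zip(ordered, ordered[1:]):
        # Additions raise the balance, deductions lower it.
        delta = curr["amount"] if curr["type"] == "addition" else -curr["amount"]
        if prev["balance_after"] + delta != curr["balance_after"]:
            return False
    return True


history = [
    {"type": "addition", "amount": 500,
     "timestamp": "2024-10-14T10:00:00Z", "balance_after": 1250},
    {"type": "deduction", "amount": 1,
     "timestamp": "2024-10-14T09:30:00Z", "balance_after": 750},
]

print(check_running_balance(history))
```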

Low Credit Alerts

Setting Up Alerts

Monitor teams approaching zero credits:

# Check if team needs alert
def check_credit_alerts(team_id):
    balance = get_credit_balance(team_id)

    allocated = balance['credits_allocated']
    remaining = balance['credits_remaining']
    if allocated <= 0:
        return  # nothing allocated, so there is no percentage to compute
    percentage = (remaining / allocated) * 100

    # Critical alert at 10% or less remaining
    if percentage <= 10:
        send_alert("critical_credits", team_id, remaining)

    # Low-credit alert between 10% and 20% remaining
    elif percentage <= 20:
        send_alert("low_credits", team_id, remaining)

Automated Top-ups

Set up automatic credit additions:

# Auto top-up when credits drop below threshold
def auto_topup_check(team_id):
    balance = get_credit_balance(team_id)

    if balance['credits_remaining'] < 100:
        # Add predefined amount
        add_credits(team_id, 1000, "Automatic top-up")
        notify_client(team_id, "1,000 credits added to your balance")

Pricing Strategies

Strategy 1: Credit Packages

Sell credits in packages:

Starter: 1,000 credits = $99/month
Professional: 5,000 credits = $399/month
Enterprise: 20,000 credits = $1,299/month

Implementation:

# Starter plan
curl -X POST http://localhost:8003/api/credits/add \
  -H "Content-Type: application/json" \
  -d '{"team_id": "client-prod", "amount": 1000}'

# Professional plan
curl -X POST http://localhost:8003/api/credits/add \
  -H "Content-Type: application/json" \
  -d '{"team_id": "client-prod", "amount": 5000}'

Strategy 2: Pay-As-You-Go

Charge per credit used:

$0.10 per credit
Minimum purchase: 100 credits ($10)

Track Actual Costs:

# Get usage for billing
curl "http://localhost:8003/api/teams/client-prod/usage?period=2024-10"

# Response includes:
# - credits_used: 534
# - Your charge: 534 × $0.10 = $53.40
# - Actual LiteLLM cost: $45.67 (internal tracking)
# - Your profit: $7.73
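
The arithmetic above can be reproduced with a short helper. This is a sketch only: `billing_summary` is a hypothetical name, and the provider cost is passed in by hand rather than read from the usage endpoint, whose exact response shape is not shown here. `Decimal` is used to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

PRICE_PER_CREDIT = Decimal("0.10")  # pay-as-you-go rate from the example above


def billing_summary(credits_used: int, provider_cost_usd: str) -> dict:
    """Compute the client charge and the margin over actual provider cost."""
    charge = credits_used * PRICE_PER_CREDIT
    profit = charge - Decimal(provider_cost_usd)
    return {"charge_usd": charge, "profit_usd": profit}


summary = billing_summary(credits_used=534, provider_cost_usd="45.67")
print(summary["charge_usd"], summary["profit_usd"])
```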

Strategy 3: Subscription + Overage

Base subscription with overage charges:

Plan: $99/month includes 1,000 credits
Overage: $0.08 per additional credit
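
A minimal sketch of how an invoice under this plan might be computed. The function name and parameters are hypothetical; the defaults come from the plan above.

```python
from decimal import Decimal


def subscription_invoice(
    credits_used: int,
    included_credits: int = 1000,
    base_price: Decimal = Decimal("99.00"),
    overage_rate: Decimal = Decimal("0.08"),
) -> Decimal:
    """Base subscription price plus a per-credit charge for usage beyond the allowance."""
    overage_credits = max(0, credits_used - included_credits)
    return base_price + overage_credits * overage_rate


# 250 credits over the allowance: $99 + 250 × $0.08
print(subscription_invoice(1250))
# Under the allowance: base price only
print(subscription_invoice(800))
```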

Strategy 4: Tiered Pricing

Volume discounts:

First 1,000 credits: $0.10 each
Next 4,000 credits: $0.08 each
Over 5,000 credits: $0.06 each
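
The sketch below computes a tiered charge, assuming graduated pricing: each rate applies only to the credits that fall inside its tier, not to the whole volume. That reading is an assumption, and `tiered_cost` is a hypothetical helper.

```python
from decimal import Decimal

# (tier size in credits, price per credit); None means "all remaining credits"
TIERS = [
    (1000, Decimal("0.10")),
    (4000, Decimal("0.08")),
    (None, Decimal("0.06")),
]


def tiered_cost(credits: int) -> Decimal:
    """Charge each slice of usage at the rate of the tier it falls into."""
    total = Decimal("0")
    remaining = credits
    for size, rate in TIERS:
        if remaining <= 0:
            break
        in_tier = remaining if size is None else min(remaining, size)
        total += in_tier * rate
        remaining -= in_tier
    return total


# 1,000 × $0.10 + 4,000 × $0.08 + 1,000 × $0.06
print(tiered_cost(6000))
```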

Budget Modes

Mode 1: Hard Limit (Default)

Team cannot exceed allocated credits:

{
  "team_id": "client-prod",
  "budget_mode": "hard_limit",
  "credits_allocated": 1000
}

Behavior:

  • Job creation succeeds
  • LLM calls succeed
  • Job completion fails if credits exhausted
  • API returns 403 "Insufficient credits"

Mode 2: Soft Limit with Alerts

Allow overage with notifications:

{
  "team_id": "client-prod",
  "budget_mode": "soft_limit",
  "credits_allocated": 1000,
  "alert_at_percentage": 80
}

Behavior:

  • Can exceed allocated credits
  • Alerts sent at 80%, 100%, 120%
  • Track overage separately for billing

Mode 3: Unlimited (Enterprise)

No credit limits:

{
  "team_id": "enterprise-prod",
  "budget_mode": "unlimited"
}

Behavior:

  • No credit checks
  • Track usage for billing
  • Typically for enterprise contracts
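
The three modes can be summarized as a decision made at job completion. The sketch below is an illustration under stated assumptions, not the server's logic: both function names are hypothetical, and the alert thresholds are the 80%, 100% and 120% levels listed for soft limits.

```python
def can_complete_job(budget_mode: str, credits_remaining: int) -> bool:
    """Decide whether a job may complete (and be billed) under a budget mode."""
    if budget_mode == "hard_limit":
        # A completed job costs one credit, so at least one must remain.
        return credits_remaining >= 1
    if budget_mode in ("soft_limit", "unlimited"):
        return True
    raise ValueError(f"unknown budget_mode: {budget_mode!r}")


def soft_limit_alerts(credits_allocated: int, credits_used: int,
                      thresholds=(80, 100, 120)) -> list:
    """Return the alert thresholds (percent of allocation) reached so far."""
    if credits_allocated <= 0:
        return []
    percentage_used = credits_used / credits_allocated * 100
    return [t for t in thresholds if percentage_used >= t]


print(can_complete_job("hard_limit", 0))
print(soft_limit_alerts(1000, 1050))
```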

Credit Replenishment

Add credits to teams from payments (subscriptions or one-time purchases).

Replenishing from Payment

When a client pays for credits, use the replenish endpoint:

API Endpoint: POST /api/credits/teams/{team_id}/replenish

Authentication: Admin only (JWT Bearer token or X-Admin-Key header)

curl -X POST http://localhost:8003/api/credits/teams/acme-prod/replenish \
  -H "Authorization: Bearer your-admin-jwt" \
  -H "Content-Type: application/json" \
  -d '{
    "credits": 5000,
    "payment_type": "subscription",
    "payment_amount_usd": 499.00,
    "reason": "November 2024 subscription payment"
  }'

Response:

{
  "team_id": "acme-prod",
  "credits_added": 5000,
  "credits_before": 1250,
  "credits_after": 6250,
  "payment_type": "subscription",
  "payment_amount_usd": 499.00,
  "transaction": {
    "transaction_id": "txn_abc123",
    "transaction_type": "subscription_payment",
    "credits_amount": 5000,
    "credits_before": 1250,
    "credits_after": 6250,
    "reason": "November 2024 subscription payment ($499.00 USD)",
    "created_at": "2024-11-01T00:00:00Z"
  }
}

Payment Types

Subscription Payment:

{
  "payment_type": "subscription",
  "credits": 5000,
  "payment_amount_usd": 499.00
}

  • Recurring monthly/annual payments
  • Updates last_refill_at timestamp
  • Creates subscription_payment transaction

One-Time Payment:

{
  "payment_type": "one_time",
  "credits": 1000,
  "payment_amount_usd": 99.00
}

  • Ad-hoc credit purchases
  • Top-ups or overages
  • Creates one_time_payment transaction

Auto-Refill Configuration

Set up automatic credit refills tied to subscription billing:

API Endpoint: POST /api/credits/teams/{team_id}/configure-auto-refill

curl -X POST http://localhost:8003/api/credits/teams/acme-prod/configure-auto-refill \
  -H "Authorization: Bearer your-admin-jwt" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "refill_amount": 5000,
    "refill_period": "monthly"
  }'

Response:

{
  "team_id": "acme-prod",
  "auto_refill_enabled": true,
  "refill_amount": 5000,
  "refill_period": "monthly",
  "last_refill_at": "2024-11-01T00:00:00Z",
  "message": "Auto-refill enabled successfully"
}

Refill Periods:

  • monthly - Refill once per month
  • weekly - Refill once per week
  • daily - Refill once per day
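
To reason about when the next refill falls, a client-side helper along these lines may be useful. It is a sketch under assumptions: how the server actually interprets "monthly" is not documented here, so the code assumes a calendar month (same day next month, clamped to the month's last day). The function names are hypothetical.

```python
import calendar
from datetime import datetime, timedelta, timezone


def next_refill_at(last_refill_at: datetime, refill_period: str) -> datetime:
    """Return the earliest time the next refill would be due."""
    if refill_period == "daily":
        return last_refill_at + timedelta(days=1)
    if refill_period == "weekly":
        return last_refill_at + timedelta(weeks=1)
    if refill_period == "monthly":
        # Roll over to January of the next year after December.
        year = last_refill_at.year + last_refill_at.month // 12
        month = last_refill_at.month % 12 + 1
        # Clamp the day, e.g. Jan 31 -> Feb 28/29.
        day = min(last_refill_at.day, calendar.monthrange(year, month)[1])
        return last_refill_at.replace(year=year, month=month, day=day)
    raise ValueError(f"unknown refill_period: {refill_period!r}")


def refill_due(last_refill_at: datetime, refill_period: str, now: datetime) -> bool:
    return now >= next_refill_at(last_refill_at, refill_period)


last = datetime(2024, 11, 1, tzinfo=timezone.utc)
print(next_refill_at(last, "monthly").isoformat())
print(refill_due(last, "monthly", datetime(2024, 11, 15, tzinfo=timezone.utc)))
```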

Disabling Auto-Refill

curl -X POST http://localhost:8003/api/credits/teams/acme-prod/configure-auto-refill \
  -H "Authorization: Bearer your-admin-jwt" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": false
  }'

Integration with Payment Processors

Example: Stripe Webhook Handler

import requests
from datetime import datetime

# ADMIN_JWT, get_team_id_from_stripe_customer() and
# calculate_credits_from_amount() are assumed to be defined elsewhere.

def handle_stripe_subscription_payment(event):
    """Handle successful subscription payment from Stripe"""
    subscription = event['data']['object']
    customer_id = subscription['customer']
    # Assumes latest_invoice is the expanded invoice object; Stripe returns only
    # the invoice ID string here unless the field is expanded.
    amount_paid = subscription['latest_invoice']['amount_paid'] / 100  # cents to dollars

    # Map Stripe customer to team_id
    team_id = get_team_id_from_stripe_customer(customer_id)

    # Calculate credits based on plan
    credits_to_add = calculate_credits_from_amount(amount_paid)

    # Replenish credits
    response = requests.post(
        f"http://localhost:8003/api/credits/teams/{team_id}/replenish",
        headers={"Authorization": f"Bearer {ADMIN_JWT}"},
        json={
            "credits": credits_to_add,
            "payment_type": "subscription",
            "payment_amount_usd": amount_paid,
            "reason": f"Stripe subscription payment - {subscription['id']}"
        },
        timeout=10,
    )
    # Surface a failed replenishment instead of silently returning an error body
    response.raise_for_status()

    return response.json()

Transaction Tracking

All replenishments create audit trail transactions:

View Replenishment History:

curl "http://localhost:8003/api/credits/teams/acme-prod/transactions?limit=50" \
  -H "Authorization: Bearer sk-virtual-key"

Response showing replenishment transactions:

{
  "team_id": "acme-prod",
  "total": 2,
  "transactions": [
    {
      "transaction_id": "txn_abc123",
      "transaction_type": "subscription_payment",
      "credits_amount": 5000,
      "credits_before": 1250,
      "credits_after": 6250,
      "reason": "November 2024 subscription payment ($499.00 USD)",
      "created_at": "2024-11-01T00:00:00Z"
    },
    {
      "transaction_id": "txn_abc122",
      "transaction_type": "one_time_payment",
      "credits_amount": 1000,
      "credits_before": 250,
      "credits_after": 1250,
      "reason": "Additional credits purchase ($99.00 USD)",
      "created_at": "2024-10-15T10:30:00Z"
    }
  ]
}

Replenishment Best Practices

  1. Track Payment References
     • Include payment processor ID in reason field
     • Enables reconciliation between payments and credits

  2. Validate Before Replenishing
     • Verify payment completed successfully
     • Check for duplicate webhook events
     • Implement idempotency

  3. Monitor Refill Timing
     • Check last_refill_at timestamp
     • Prevent duplicate monthly refills
     • Handle edge cases (failed payments, cancellations)

  4. Automate Subscription Refills
     • Enable auto-refill for subscription customers
     • Configure appropriate refill_period
     • Decouple from manual payment processing
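
The idempotency and payment-reference practices above can be sketched as follows. The helper `build_replenish_request` is hypothetical, and the in-memory set stands in for what would need to be a durable store (for example a database table keyed by event ID) in a real deployment. The payload fields are the ones the replenish endpoint accepts.

```python
from typing import Optional


def build_replenish_request(event_id: str, credits: int, amount_usd: float,
                            seen_event_ids: set) -> Optional[dict]:
    """Return a replenish payload, or None if this webhook event was already handled."""
    if event_id in seen_event_ids:
        return None  # duplicate delivery: do not credit twice
    # Assumption: recorded up front for simplicity; a production handler would
    # record the event only once the replenish call has succeeded.
    seen_event_ids.add(event_id)
    return {
        "credits": credits,
        "payment_type": "subscription",
        "payment_amount_usd": amount_usd,
        # Include the processor's event ID so payments can be reconciled later.
        "reason": f"Stripe event {event_id}",
    }


seen = set()
first = build_replenish_request("evt_123", 5000, 499.00, seen)
second = build_replenish_request("evt_123", 5000, 499.00, seen)
print(first is not None, second is None)
```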

Example Workflow: Monthly Subscription

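import requests
from datetime import datetime

# ADMIN_JWT is assumed to be defined elsewhere (admin Bearer token)

# 1. Customer subscribes to $99/month plan (1000 credits)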
team_id = "customer-prod"
monthly_credits = 1000

# 2. Configure auto-refill
requests.post(
    f"http://localhost:8003/api/credits/teams/{team_id}/configure-auto-refill",
    headers={"Authorization": f"Bearer {ADMIN_JWT}"},
    json={
        "enabled": True,
        "refill_amount": monthly_credits,
        "refill_period": "monthly"
    }
)

# 3. On payment success (e.g., Stripe webhook)
def on_payment_success(amount_usd, team_id):
    requests.post(
        f"http://localhost:8003/api/credits/teams/{team_id}/replenish",
        headers={"Authorization": f"Bearer {ADMIN_JWT}"},
        json={
            "credits": monthly_credits,
            "payment_type": "subscription",
            "payment_amount_usd": amount_usd,
            "reason": f"Monthly subscription - {datetime.now().strftime('%B %Y')}"
        }
    )

# 4. Customer usage tracked normally
# 5. Next month, repeat step 3 on next payment

Client Communication

Credit Allocation Email

Subject: Your Credits Have Been Allocated

Hi [Client],

We've allocated 1,000 credits to your account!

WHAT ARE CREDITS?
- 1 credit = 1 completed job
- Jobs can contain multiple LLM calls
- Only successful jobs consume credits

YOUR BALANCE:
- Allocated: 1,000 credits
- Remaining: 1,000 credits

ESTIMATED USAGE:
- Document analysis: ~1 credit per document
- Chat sessions: ~1 credit per conversation
- Your 1,000 credits = approximately 1,000 operations

MONITOR USAGE:
View real-time usage at: https://dashboard.yourcompany.com

Need more credits? Reply to this email or visit your dashboard.

Questions? support@yourcompany.com

Low Credit Warning

Subject: Low Credit Alert - 20% Remaining

Hi [Client],

Your credit balance is running low:

CURRENT BALANCE: 200 credits (20% remaining)
ESTIMATED DEPLETION: 2-3 days at current usage

ACTION REQUIRED:
1. Purchase additional credits: https://dashboard.yourcompany.com/credits
2. Or contact us: support@yourcompany.com

Don't let your integration stop! Top up now.

Best regards,
Your Company Team

Out of Credits Notice

Subject: URGENT: Credits Exhausted

Hi [Client],

Your account has run out of credits.

CURRENT BALANCE: 0 credits
STATUS: API calls suspended

TO RESTORE SERVICE:
1. Purchase credits immediately: https://dashboard.yourcompany.com/credits
2. Or contact support: support@yourcompany.com

Service will resume automatically once credits are added.

Questions? We're here to help!

Monitoring & Analytics

Team Credit Utilization

# Get utilization across all teams
curl "http://localhost:8003/api/credits/utilization"

Response:

{
  "total_teams": 25,
  "summary": {
    "total_allocated": 50000,
    "total_remaining": 32000,
    "total_used": 18000,
    "avg_utilization": 36.0
  },
  "by_status": {
    "healthy": 20,      // >50% remaining
    "warning": 3,       // 20-50% remaining
    "critical": 2       // <20% remaining
  },
  "teams_needing_attention": [
    {
      "team_id": "client-a-prod",
      "credits_remaining": 50,
      "percentage": 5.0,
      "status": "critical"
    }
  ]
}
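
The status buckets annotated in the response can be expressed as a small classifier. This is a sketch of the thresholds as annotated above (more than 50% remaining is healthy, 20-50% is warning, under 20% is critical). `utilization_status` is a hypothetical name, and treating a team with no allocation as critical is an assumption.

```python
def utilization_status(credits_allocated: int, credits_remaining: int) -> str:
    """Classify a team by the percentage of its allocation still remaining."""
    if credits_allocated <= 0:
        return "critical"  # assumption: no allocation counts as needing attention
    percentage_remaining = credits_remaining / credits_allocated * 100
    if percentage_remaining > 50:
        return "healthy"
    if percentage_remaining >= 20:
        return "warning"
    return "critical"


# 50 of 1,000 credits left is 5% remaining
print(utilization_status(1000, 50))
```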

Track credit consumption over time:

curl "http://localhost:8003/api/credits/trends?team_id=client-prod&days=30"

Use For:

  • Predicting when team will run out
  • Recommending plan upgrades
  • Identifying usage spikes
  • Forecasting revenue
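
As an example of the first use, the sketch below estimates days until depletion from recent daily usage. The shape of the trends response is not shown here, so the input is assumed to be a plain list of credits used per day. `days_until_depletion` is a hypothetical helper that uses a simple average, not a forecasting model.

```python
from typing import Optional


def days_until_depletion(credits_remaining: int,
                         daily_usage: list) -> Optional[float]:
    """Estimate days left at the average recent daily usage, or None if unknown."""
    if not daily_usage:
        return None
    average_per_day = sum(daily_usage) / len(daily_usage)
    if average_per_day <= 0:
        return None  # no recent usage, so no depletion estimate
    return credits_remaining / average_per_day


# 200 credits left, averaging 80 credits per day
print(days_until_depletion(200, [80, 70, 90]))
```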

Best Practices

For Admins

  1. Start Conservative
     • Begin with modest allocation (1,000 credits)
     • Monitor usage first week
     • Adjust based on actual patterns

  2. Set Up Alerts
     • Alert at 50%, 20%, 10%, 0%
     • Proactive outreach before depletion
     • Automated top-up for trusted clients

  3. Review Monthly
     • Check all team credit levels
     • Identify heavy users (upsell opportunity)
     • Identify low users (check satisfaction)

  4. Track Profit Margins
     • Monitor actual LiteLLM costs vs. credits charged
     • Adjust pricing if margins too thin
     • Offer volume discounts for retention

For Clients

Share these tips with clients:

  1. Monitor Your Balance
     • Check dashboard regularly
     • Set up low-balance alerts
     • Don't let credits hit zero

  2. Estimate Usage
     • 1 credit ≈ 1 business operation
     • Track your typical job types
     • Budget accordingly

  3. Complete Jobs Properly
     • Always call complete_job()
     • Failed jobs don't cost credits
     • Incomplete jobs won't be billed

  4. Optimize Usage
     • Group related calls into single jobs
     • Don't create unnecessary jobs
     • Cache responses when possible

Common Scenarios

Scenario 1: Client Runs Out Mid-Month

Problem: Client exhausted their 1,000 credits in 2 weeks

Actions:

  1. Add immediate emergency credits (100-200)
  2. Analyze usage patterns
  3. Recommend plan upgrade
  4. Set up automatic top-ups going forward

Scenario 2: Client Barely Uses Credits

Problem: Client using <10% of allocation

Actions:

  1. Check if they're having integration issues
  2. Offer to help with implementation
  3. Consider downgrading to save them money (builds trust)
  4. Check if they need different features

Scenario 3: Unexpected Usage Spike

Problem: Team uses 3x normal credits in one day

Actions:

  1. Check for runaway processes
  2. Contact client to verify legitimate usage
  3. Temporarily increase limits if needed
  4. Investigate potential security issues

Troubleshooting

Credits Not Deducted

Problem: Jobs completing but credits unchanged

Solutions:

  1. Check job actually reached "completed" status
  2. Verify credit deduction logic in code
  3. Check database transaction logs
  4. Ensure job_id is valid

Can't Add Credits

Problem: API returns error when adding credits

Solutions:

  1. Verify team exists and is active
  2. Check team not suspended
  3. Validate credit amount is positive integer
  4. Check for database connection issues

Balance Shows Negative

Problem: credits_remaining is negative

Solutions:

  1. This can happen when concurrent job completions race on the same balance
  2. Investigate concurrent job completions
  3. Add credits to bring the balance back to positive
  4. Implement better locking in credit deduction
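
The race behind a negative balance is a check and a decrement that are not atomic. The sketch below shows the idea with an in-process lock. It is an illustration only: `TeamCredits` is a hypothetical class, and a database-backed deployment would use a row lock or a conditional update rather than `threading.Lock`.

```python
import threading


class TeamCredits:
    """Hypothetical balance holder whose check-and-deduct step is atomic."""

    def __init__(self, credits_remaining: int) -> None:
        self.credits_remaining = credits_remaining
        self._lock = threading.Lock()

    def deduct_one(self) -> bool:
        # Checking and decrementing under one lock means two concurrent
        # completions cannot both spend the last credit.
        with self._lock:
            if self.credits_remaining < 1:
                return False
            self.credits_remaining -= 1
            return True


team = TeamCredits(credits_remaining=50)
results = []


def complete_job() -> None:
    results.append(team.deduct_one())


# 80 concurrent completions compete for 50 credits.
threads = [threading.Thread(target=complete_job) for _ in range(80)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results.count(True), team.credits_remaining)
```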

Next Steps

Now that you understand credits:

  1. Create Teams - Allocate credits when creating teams
  2. Monitor Usage - Track credit consumption
  3. Set Up Model Access - Control which models teams can use
  4. Review Best Practices - Optimize credit management