Docs
⚙️ Configuration
Token Usage

Token Usage

Intro

As of v6.0.0, LibreChat accurately tracks token usage for the OpenAI/Plugins endpoints. This can be viewed in your database’s “Transactions” collection.

In the future, you will be able to toggle viewing how much a conversation has cost you.

Currently, you can limit user token usage by enabling user balances. To enables token credit limiting for the OpenAI/Plugins endpoints, set the following .env variable:

KeyTypeDescriptionExample
CHECK_BALANCEbooleanEnable token credit balances for the OpenAI/Plugins endpoints.CHECK_BALANCE=false

And You can also set the number of tokens a user will receive upon registration:

KeyTypeDescriptionExample
START_BALANCEintegerIf the value is set, tokens will be credited to the user's balance after registration.START_BALANCE=20000

You’re right. Let me combine both styles, maintaining the clarity of the original content while adding the deployment context:

Managing Token Balances

You manually add user balance, or you will need to build out a balance-accruing system for users. This may come as a feature to the app whenever an admin dashboard is introduced.

These commands vary depending on your deployment method:

  • Local Development: Run from the project root directory when running the application locally without Docker
  • Docker (default setup): Run from the directory containing your docker-compose.yml when using the default Docker setup (typically started with docker compose up)
  • Docker (deployment setup): Run from the directory containing your deploy-compose.yml if you followed the Ubuntu Docker Guide

Adding Balances

To manually add balances, run the following command:
# Local Development
npm run add-balance
 
# Docker (default setup)
docker-compose exec api npm run add-balance
 
# Docker (deployment setup)
docker exec -it LibreChat-API /bin/sh -c "cd .. && npm run add-balance"
You can also specify the email and token credit amount to add, e.g.:
# Local Development
npm run add-balance [email protected] 1000
 
# Docker (default setup)
docker-compose exec api npm run add-balance [email protected] 1000
 
# Docker (deployment setup)
docker exec -it LibreChat-API /bin/sh -c "cd .. && npm run add-balance [email protected] 1000"

Setting Balances

Additionally, you can set a balance for a user. An existing balance will be overwritten by the new balance.

To manually set balances, run the following command:
# Local Development
npm run set-balance
 
# Docker (default setup)
docker-compose exec api npm run set-balance
 
# Docker (deployment setup)
docker exec -it LibreChat-API /bin/sh -c "cd .. && npm run set-balance"
You can also specify the email and token credit amount to set, e.g.:
# Local Development
npm run set-balance [email protected] 1000
 
# Docker (default setup)
docker-compose exec api npm run set-balance [email protected] 1000
 
# Docker (deployment setup)
docker exec -it LibreChat-API /bin/sh -c "cd .. && npm run set-balance [email protected] 1000"

Listing of balances

To see the balances of your users, you can run:
# Local Development
npm run list-balances
 
# Docker (default setup)
docker-compose exec api npm run list-balances
 
# Docker (deployment setup)
docker exec -it LibreChat-API /bin/sh -c "cd .. && npm run list-balances"

This works well to track your own usage for personal use; 1000 credits = $0.001 (1 mill USD)

Notes

  • With summarization enabled, you will be blocked from making an API request if the cost of the content that you need to summarize + your messages payload exceeds the current balance
  • Counting Prompt tokens is really accurate for OpenAI calls, but not 100% for plugins (due to function calling). It is really close and conservative, meaning its count may be higher by 2-5 tokens.
  • The system allows deficits incurred by the completion tokens. It only checks if you have enough for the prompt Tokens, and is pretty lenient with the completion. The graph below details the logic
  • The above said, plugins are checked at each generation step, since the process works with multiple API calls. Anything the LLM has generated since the initial user prompt is shared to the user in the error message as seen below.
  • There is a 150 token buffer for titling since this is a 2 step process, that averages around 200 total tokens. In the case of insufficient funds, the titling is cancelled before any spend happens and no error is thrown.

image

More details

source: LibreChat/discussions/1640

”rawAmount”: -000, // what’s this?

Raw amount of tokens as counted per the tokenizer algorithm.

”tokenValue”: -00000, // what’s this?

Token credits value. 1000 credits = $0.001 (1 mill USD)

“rate”: 00, // what’s this?

The rate at which tokens are charged as credits.

For example, gpt-3.5-turbo-1106 has a rate of 1 for user prompt (input) and 2 for completion (output)

ModelInputOutput
gpt-3.5-turbo-1106$0.0010 / 1K tokens$0.0020 / 1K tokens

Given the provided example:

    "rawAmount": -137
    "tokenValue": -205.5
    "rate": 1.5
\[\text{Token Value} = (\text{Raw Amount of Tokens}) \times (\text{Rate}) \] \[137 \times 1.5 = 205.5 \]

And to get the real amount of USD spend based on Token Value:

\[\frac{\text{Token Value}}{1,000,000} = \left(\frac{\text{Raw Amount of Tokens} \times \text{Rate}}{1,000,000}\right) \] \[\frac{205.5}{1,000,000} = \$0.0002055 \text{ USD} \]

The relevant file for editing rates is found in api/models/tx.js

There will be more customization for this soon from the librechat.yaml file.

Preview

image

image