Usage Accounting

Track AI Model Token Usage

The Infron AI API provides built-in Usage Accounting that allows you to track AI model usage without making additional API calls. This feature provides detailed information about token counts, costs, and caching status directly in your API responses.

Usage Information

When enabled, the API will return detailed usage information including:

  1. Prompt and completion token counts using the model's native tokenizer

  2. Cost in credits

  3. Reasoning token counts (if applicable)

  4. Cached token counts (if available)

This information is included in the last SSE message for streaming responses, or in the complete response for non-streaming requests.

Enabling Usage Accounting

You can enable usage accounting in your requests by including the usage parameter:

import requests
import json

response = requests.post(
  url="https://llm.onerouter.pro/v1/chat/completions",
  headers={
    "Authorization": "Bearer <<API Keys>>",
    "Content-Type": "application/json"
  },
  data=json.dumps({
    "model": "google-ai-studio/gemini-2.5-flash-preview-09-2025", 
    "messages": [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ],
    "usage": {
      "include": True
    }
  })
)
print(response.json())

Response Format

When usage accounting is enabled, the response will include a usage object with detailed token counts, plus a cost field and a cost_details object with the cost breakdown:

cost is the total amount charged to your account.

cost_details is the breakdown of the total cost.

Enabling usage accounting adds a few hundred milliseconds to the final response while the API calculates token counts and costs. This affects only the final message and does not impact overall streaming performance.
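As a concrete sketch, the usage object can be read straight out of response.json(). The key names below follow the OpenAI-style usage schema plus the cost and cost_details fields described above; the exact keys and all values are illustrative, not guaranteed by this API.

```python
# Illustrative only: keys follow the OpenAI-style usage schema plus the
# cost / cost_details fields described above; all values are made up.
sample_response = {
    "usage": {
        "prompt_tokens": 14,
        "completion_tokens": 128,
        "total_tokens": 142,
        "cost": 0.00021,
        "cost_details": {"upstream_inference_cost": 0.00021},
    }
}

usage = sample_response.get("usage", {})
print(usage["prompt_tokens"], usage["completion_tokens"])  # token counts
print(usage["cost"], usage["cost_details"])                # total cost and breakdown
```

In a real application you would read these fields from the response of the request shown above rather than from a hard-coded dict.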

Benefits

  1. Efficiency: Get usage information without making separate API calls

  2. Accuracy: Token counts are calculated using the model's native tokenizer

  3. Transparency: Track costs and cached token usage in real-time

  4. Detailed Breakdown: Separate counts for prompt, completion, reasoning, and cached tokens

Best Practices

  1. Enable usage tracking when you need to monitor token consumption or costs

  2. Account for the slight delay in the final response when usage accounting is enabled

  3. Consider implementing usage tracking in development to optimize token usage before production

  4. Use the cached token information to optimize your application's performance

Examples

Basic Usage with Token Tracking

Streaming with Token Tracking

According to the OpenAI specification, to request token usage information in a streaming response, you would include the following parameters in your request:
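A sketch of that request payload, assuming the OpenAI-compatible stream_options field (the model and prompt match the earlier example):

```python
import json

# Streaming request payload with usage accounting enabled via the
# OpenAI-style stream_options.include_usage parameter.
payload = {
    "model": "google-ai-studio/gemini-2.5-flash-preview-09-2025",
    "messages": [{"role": "user", "content": "What is the meaning of life?"}],
    "stream": True,
    "stream_options": {"include_usage": True},
}
print(json.dumps(payload, indent=2))
```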

This configuration tells the API to:

  1. Use the google-ai-studio/gemini-2.5-flash-preview-09-2025 model

  2. Stream the response incrementally

  3. Include token usage statistics in the stream response

The stream_options.include_usage parameter specifically requests that token usage information be returned as part of the streaming response.

In the streaming response, the cost and usage information is included in the last chat.completion.chunk.
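A sketch of pulling that final usage object out of the SSE stream by scanning each data: line for a chunk that carries usage. The helper name and all chunk values below are illustrative; real chunks carry more fields.

```python
import json

def extract_usage(sse_lines):
    """Return the usage object from the last chunk that carries one."""
    usage = None
    for line in sse_lines:
        if not line.startswith("data: ") or line.strip() == "data: [DONE]":
            continue
        chunk = json.loads(line[len("data: "):])
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage

# Illustrative stream (hypothetical values):
lines = [
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "42"}}]}',
    'data: {"object": "chat.completion.chunk", "choices": [], '
    '"usage": {"prompt_tokens": 14, "completion_tokens": 128, "cost": 0.00021}}',
    "data: [DONE]",
]
print(extract_usage(lines))
```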
