Overview

Track AI Model Token Usage and Cost Breakdowns

Infron provides built‑in Usage Accounting that lets you monitor AI model usage and cost breakdowns directly from your API responses. Each response reports token consumption, the associated cost, and caching behavior.

Benefits

  • Efficiency: Retrieve usage information without additional API calls

  • Accuracy: Token counts are computed using each model’s native tokenizer

  • Transparency: Track real-time cost and cached token utilization

  • Detailed Breakdown: Separate reporting for prompt, completion, reasoning, and cached tokens

Usage Information

When enabled, the API returns comprehensive usage metrics, including:

  • Prompt and completion token counts calculated with the model’s native tokenizer

  • Total cost in credits

  • Reasoning token counts (when supported by the model)

  • Cached token counts (when applicable)
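As a rough illustration of how a client might consume these metrics, the sketch below normalizes a usage object into a flat summary. The field names (`prompt_tokens`, `completion_tokens`, `reasoning_tokens`, `cached_tokens`, `cost`) are assumptions made for this example and are not confirmed by this page; check them against an actual response before relying on them.

```python
from typing import Any, Mapping


def _as_int(value: Any) -> int:
    """Treat a missing or null counter as zero."""
    return int(value) if value is not None else 0


def summarize_usage(usage: Mapping[str, Any]) -> dict:
    """Flatten a usage object into a summary dict.

    All keys read from `usage` are hypothetical field names.
    Reasoning and cached counts are optional, since they are only
    reported when the model supports them or when they apply.
    """
    cost = usage.get("cost")
    return {
        "prompt_tokens": _as_int(usage.get("prompt_tokens")),
        "completion_tokens": _as_int(usage.get("completion_tokens")),
        "reasoning_tokens": _as_int(usage.get("reasoning_tokens")),
        "cached_tokens": _as_int(usage.get("cached_tokens")),
        "cost_credits": float(cost) if cost is not None else 0.0,
    }


# Illustrative payload only; the values are made up for the example.
sample = {"prompt_tokens": 120, "completion_tokens": 45, "cost": 0.0031}
summary = summarize_usage(sample)
```

Defaulting the optional counters to zero keeps downstream aggregation simple, at the price of not distinguishing "not reported" from "zero".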

For streaming responses, this usage information appears in the final SSE message; for non‑streaming requests, it is included in the response body.
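A minimal sketch of reading the usage data in both modes follows. It assumes the metrics live under a top-level `usage` key, that SSE events use standard `data:` lines carrying JSON, and that the stream may end with a `[DONE]` sentinel; all three are assumptions for illustration rather than documented behavior.

```python
import json
from typing import Iterable, Optional


def usage_from_sse(lines: Iterable[str]) -> Optional[dict]:
    """Return the last usage object seen in a stream of SSE lines.

    Assumes each event is a `data: <json>` line and that usage sits
    under a hypothetical top-level "usage" key.
    """
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank lines, other SSE fields
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue  # assumed end-of-stream sentinel
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore payloads that are not JSON
        if isinstance(event, dict) and isinstance(event.get("usage"), dict):
            usage = event["usage"]
    return usage


def usage_from_body(body: str) -> Optional[dict]:
    """Return the usage object from a non-streaming JSON response body."""
    data = json.loads(body)
    found = data.get("usage") if isinstance(data, dict) else None
    return found if isinstance(found, dict) else None


# Illustrative stream; the chunk contents are invented for the example.
stream = [
    'data: {"choices": [{"delta": {"content": "Hi"}}]}',
    "",
    'data: {"choices": [], "usage": {"prompt_tokens": 10, "completion_tokens": 2}}',
    "data: [DONE]",
]
streamed_usage = usage_from_sse(stream)
```

Keeping the last usage object seen, rather than stopping at the first, matches the statement that the metrics arrive in the final message.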
