Overview
Track AI Model Token Usage and Cost Breakdowns
Infron provides built‑in Usage Accounting that allows you to monitor AI model usage and cost breakdowns directly from your API responses. This feature includes detailed insights into token consumption, associated costs, and caching behavior.
Benefits
Efficiency: Retrieve usage information without additional API calls
Accuracy: Token counts are computed using each model’s native tokenizer
Transparency: Track real-time cost and cached token utilization
Detailed Breakdown: Separate reporting for prompt, completion, reasoning, and cached tokens
Usage Information
When enabled, the API returns comprehensive usage metrics, including:
Prompt and completion token counts calculated with the model’s native tokenizer
Total cost in credits
Reasoning token counts (when supported by the model)
Cached token counts (when applicable)
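As a rough illustration of how these metrics might be consumed, the sketch below flattens a usage object into a single summary record. The source does not specify the response schema, so every field name here (`prompt_tokens`, `completion_tokens`, `cost`, `prompt_tokens_details.cached_tokens`, `completion_tokens_details.reasoning_tokens`) is an assumption; check them against an actual Infron response before relying on them.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class UsageSummary:
    prompt_tokens: int
    completion_tokens: int
    reasoning_tokens: int
    cached_tokens: int
    cost_credits: float


def summarize_usage(usage: Mapping[str, Any]) -> UsageSummary:
    """Flatten a usage object; absent or null fields default to zero."""
    # ASSUMPTION: all key names below are hypothetical, not confirmed by the docs.
    prompt_details = usage.get("prompt_tokens_details") or {}
    completion_details = usage.get("completion_tokens_details") or {}
    return UsageSummary(
        prompt_tokens=int(usage.get("prompt_tokens") or 0),
        completion_tokens=int(usage.get("completion_tokens") or 0),
        # Reasoning tokens are reported only when the model supports them.
        reasoning_tokens=int(completion_details.get("reasoning_tokens") or 0),
        # Cached tokens are reported only when caching applies.
        cached_tokens=int(prompt_details.get("cached_tokens") or 0),
        cost_credits=float(usage.get("cost") or 0.0),
    )


if __name__ == "__main__":
    sample = {
        "prompt_tokens": 120,
        "completion_tokens": 40,
        "cost": 0.0015,
        "prompt_tokens_details": {"cached_tokens": 100},
        "completion_tokens_details": {"reasoning_tokens": 12},
    }
    print(summarize_usage(sample))
```

Defaulting optional fields to zero keeps downstream aggregation simple when a model does not report reasoning or cached tokens.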
This usage information appears in the final SSE message for streaming responses, or in the full response body for non‑streaming requests.
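For streaming responses, a client has to read the stream to its end before usage data is available. The following is a minimal sketch of that step, assuming the stream uses `data: <json>` lines, ends with a `data: [DONE]` sentinel, and carries the metrics under a top-level `usage` key. All three conventions are assumptions, not confirmed by this page.

```python
import json
from typing import Any, Dict, Iterable, Optional


def usage_from_sse(lines: Iterable[str]) -> Optional[Dict[str, Any]]:
    """Return the last usage object seen in an SSE stream, or None."""
    usage: Optional[Dict[str, Any]] = None
    for raw in lines:
        line = raw.strip()
        # Skip blank separators, comments, and non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # ASSUMPTION: "[DONE]" marks the end of the stream.
        if payload == "[DONE]":
            break
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore payloads that are not JSON
        # ASSUMPTION: usage metrics sit under a top-level "usage" key.
        if isinstance(chunk, dict) and chunk.get("usage"):
            usage = chunk["usage"]
    return usage


if __name__ == "__main__":
    stream = [
        'data: {"choices": [{"delta": {"content": "Hi"}}]}',
        "",
        'data: {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 1}}',
        "",
        "data: [DONE]",
    ]
    print(usage_from_sse(stream))
```

Keeping the last usage object seen, rather than stopping at the first, tolerates streams that emit intermediate chunks without usage data.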