Overview
Every chat session exposes a single read-only chat.usage property. It returns an AggregatedUsage view derived from the centralized usage registry on each access, scoped to this chat’s chat_usage_id.
The session’s own agent calls, sub-agents, memory summarization, reliability passes, culture, and policy checks all inherit the chat scope and roll into chat.usage automatically.
For wall-clock session length, use chat.duration (or chat.end_time - chat.start_time for an explicit window; note that chat.end_time is None while the session is still active).
Accessing Chat Metrics
Read chat.usage on any Chat instance. It always returns an AggregatedUsage — zero-valued before the first message, populated thereafter.
Token Metrics
| Property | Type | Description |
|---|---|---|
| input_tokens | int | Prompt tokens across all model calls in the session |
| output_tokens | int | Completion tokens across all model calls |
| total_tokens | int | Sum of input_tokens + output_tokens |
| cache_read_tokens | int | Tokens read from prompt cache |
| cache_write_tokens | int | Tokens written to prompt cache |
| reasoning_tokens | int | Reasoning (chain-of-thought) tokens |

| Property | Type | Description |
|---|---|---|
| requests | int | Number of LLM API requests across the session |
| tool_calls | int | Number of tool calls across the session |
Timing Metrics
| Property | Type | Description |
|---|---|---|
| duration | float | Sum of per-call durations recorded on entries (seconds) |
| model_execution_time | float | Total time spent inside LLM API calls (seconds) |
| tool_execution_time | float | Total time spent executing tools (seconds) |
| upsonic_execution_time | float | Framework overhead = duration − model − tool (seconds) |
| time_to_first_token | float \| None | Earliest TTFT across contributing entries |
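
The framework-overhead row is a plain subtraction of the other timing fields. A minimal sketch, using a hypothetical stand-in class (`TimingSample` is not part of Upsonic; its field names simply mirror the table above and the numbers are made up):

```python
from dataclasses import dataclass


@dataclass
class TimingSample:
    """Hypothetical stand-in mirroring the timing fields (not an Upsonic class)."""

    duration: float
    model_execution_time: float
    tool_execution_time: float

    @property
    def upsonic_execution_time(self) -> float:
        # Framework overhead = duration - model - tool, as stated in the table.
        return self.duration - self.model_execution_time - self.tool_execution_time


# Illustrative values only: 4.0s total, 3.0s in the model, 0.5s in tools.
sample = TimingSample(duration=4.0, model_execution_time=3.0, tool_execution_time=0.5)
print(f"Framework time: {sample.upsonic_execution_time:.2f}s")
```

Whatever remains after the model and tool time is attributed to the framework itself.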
Cost Metrics
| Property | Type | Description |
|---|---|---|
| cost | float \| None | Sum of cost_usd across contributing entries. None if no entry was priced; 0.0 means priced and free. |

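
The difference between None and 0.0 is easiest to see in code. The sketch below is a hypothetical aggregation helper, not Upsonic's implementation. It assumes that unpriced entries are skipped when at least one entry has a price, which the table does not spell out:

```python
from typing import Iterable, Optional


def aggregate_cost(costs: Iterable[Optional[float]]) -> Optional[float]:
    """Sum per-entry costs; a None item stands for an entry that was never priced."""
    priced = [c for c in costs if c is not None]
    if not priced:
        # No entry carried a price, so the total is unknown rather than zero.
        return None
    # Assumption: unpriced entries are ignored once at least one price exists.
    return sum(priced)


unpriced_total = aggregate_cost([None, None])
free_total = aggregate_cost([0.0, None])
paid_total = aggregate_cost([0.25, 0.5, None])
print(unpriced_total, free_total, paid_total)
```

This is why the main example guards with `if u.cost is not None` rather than a truthiness check: a plain `if u.cost:` would also skip a legitimately free session.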
| Property | Type | Description |
|---|---|---|
| entry_count | int | Number of underlying UsageEntry rows contributing to this view |
| models | list[str] | Distinct model identifiers that contributed (first-seen order) |
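
"Distinct, first-seen order" means duplicates are dropped without sorting. One common way to get that behaviour in Python is shown below. The identifiers are made-up sample data, and this is an illustration of the ordering rule rather than the library's own code:

```python
# Model identifiers in the order the (hypothetical) entries were recorded.
seen_per_entry = [
    "anthropic/claude-sonnet-4-5",
    "example/other-model",
    "anthropic/claude-sonnet-4-5",
]

# dict preserves insertion order, so fromkeys() de-duplicates while
# keeping the first occurrence of each identifier in place.
models = list(dict.fromkeys(seen_per_entry))
entry_count = len(seen_per_entry)

print(models)
print(entry_count)
```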
Session Wall-Clock & Messages
| Property | Type | Description |
|---|---|---|
| chat.duration | float | Wall-clock session length in seconds |
| chat.start_time | float | Unix timestamp when the session opened |
| chat.end_time | float \| None | Unix timestamp when the session closed (None if active) |
| len(chat.all_messages) | int | Total messages in the session |
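
Because chat.end_time can be None, an explicit end_time - start_time calculation needs a guard. A minimal sketch follows. `wall_clock_seconds` is a hypothetical helper, and measuring an active session up to the current time is an assumption, not documented behaviour of chat.duration:

```python
import time
from typing import Optional


def wall_clock_seconds(
    start_time: float,
    end_time: Optional[float],
    now: Optional[float] = None,
) -> float:
    """Elapsed seconds between two Unix timestamps, tolerating an open session."""
    if end_time is None:
        # Assumption: an active session is measured up to "now".
        end_time = time.time() if now is None else now
    return end_time - start_time


closed = wall_clock_seconds(100.0, 160.5)            # session already closed
active = wall_clock_seconds(100.0, None, now=130.0)  # still open, clock pinned for the demo
print(closed, active)
```

With a real session you would pass chat.start_time and chat.end_time in place of the literal timestamps.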
Example
import asyncio

from upsonic import Agent, Chat


async def main():
    agent = Agent("anthropic/claude-sonnet-4-5")
    chat = Chat(session_id="session1", user_id="user1", agent=agent)

    await chat.invoke("Hello")
    await chat.invoke("How are you?")

    u = chat.usage
    print(f"Requests: {u.requests}")
    print(f"Input tokens: {u.input_tokens}")
    print(f"Output tokens: {u.output_tokens}")
    print(f"Tool calls: {u.tool_calls}")
    # cost is None when no contributing entry was priced
    if u.cost is not None:
        print(f"Cost: ${u.cost:.4f}")
    print(f"Model time: {u.model_execution_time:.2f}s")
    print(f"Tool time: {u.tool_execution_time:.2f}s")
    print(f"Framework time: {u.upsonic_execution_time:.2f}s")
    print(f"Models used: {u.models}")

    # Wall-clock session length
    print(f"Wall-clock: {chat.duration:.1f}s")
    print(f"Messages: {len(chat.all_messages)}")


if __name__ == "__main__":
    asyncio.run(main())
JSON-Friendly Output
import json

# `chat` is the Chat instance from the example above.
snapshot = {
    **chat.usage.to_dict(),
    "duration_wallclock": chat.duration,
    "messages": len(chat.all_messages),
}
print(json.dumps(snapshot, indent=2))
Scope & Propagation
- Across invocations: Every chat.invoke / chat.stream call adds entries under the chat’s scope. Reading chat.usage re-aggregates them, so the figures always reflect the latest state.
- Sub-pipeline rollup: Memory summarization, reliability validator/editor, culture, policy, and sub-agent calls inherit the chat scope via context variables and roll into chat.usage automatically.
- Retry idempotency: The registry is keyed by entry_id. Retried requests replace their prior entry instead of double-counting.
- Persistence: When you configure storage on Chat, recorded entries persist alongside the conversation. Re-opening the same session_id re-hydrates the registry, so chat.usage continues from where it left off across processes and restarts.
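
The scope inheritance and retry behaviour described above can be illustrated with the standard-library contextvars module and a dict keyed by entry_id. This is a simplified sketch under stated assumptions, not Upsonic's registry: `current_scope`, `Entry`, `record`, `sub_pipeline`, and `total_input_tokens` are all hypothetical names, and the token counts are made up.

```python
import contextvars
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical context variable holding the active chat scope.
current_scope = contextvars.ContextVar("current_scope", default=None)


@dataclass
class Entry:
    entry_id: str
    scope_id: Optional[str]
    input_tokens: int


# Hypothetical in-memory registry, keyed by entry_id.
registry: Dict[str, Entry] = {}


def record(entry_id: str, input_tokens: int) -> None:
    # Writing under the same entry_id overwrites the earlier row,
    # so a retried request is not counted twice.
    registry[entry_id] = Entry(entry_id, current_scope.get(), input_tokens)


def sub_pipeline() -> None:
    # Never receives the scope explicitly; it is picked up from the context.
    record("summary-1", 40)


def total_input_tokens(scope_id: str) -> int:
    # Re-aggregate on every read, so the result reflects the latest entries.
    return sum(e.input_tokens for e in registry.values() if e.scope_id == scope_id)


token = current_scope.set("chat-abc")
try:
    record("call-1", 100)
    record("call-1", 120)  # retry: replaces the 100-token entry
    sub_pipeline()         # inherits "chat-abc" through the context variable
finally:
    current_scope.reset(token)

record("other-1", 999)  # recorded outside the chat scope, so it is excluded

chat_total = total_input_tokens("chat-abc")
print(chat_total)
```

The total is 160 (120 from the retried call plus 40 from the sub-pipeline): the first attempt was replaced and the out-of-scope entry never enters the sum.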
Legacy Migration
Legacy chat-level properties and helpers have been removed in favour of chat.usage:

| Legacy | Replacement |
|---|---|
| chat.input_tokens | chat.usage.input_tokens |
| chat.output_tokens | chat.usage.output_tokens |
| chat.total_tokens | chat.usage.total_tokens |
| chat.total_cost | chat.usage.cost |
| chat.total_requests | chat.usage.requests |
| chat.total_tool_calls | chat.usage.tool_calls |
| chat.run_duration, chat.time_to_first_token | chat.usage.duration, chat.usage.time_to_first_token |
| chat.get_usage() | chat.usage |
| chat.get_session_metrics(), chat.get_session_summary(), SessionMetrics | chat.usage + chat.duration + len(chat.all_messages) |
| chat.get_cost_history() | (registry query — filter UsageEntry rows by chat_usage_id) |
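
If existing code relied on a single summary object, the three replacements can be combined in a small helper of your own. The sketch below is hypothetical (`session_summary` is not an Upsonic function, and the dictionary keys are arbitrary). It only reads attributes named on this page, and it is exercised here with a stand-in object so that it runs without a live session:

```python
from types import SimpleNamespace


def session_summary(chat) -> dict:
    """Hypothetical helper: chat.usage + chat.duration + len(chat.all_messages)."""
    usage = chat.usage
    return {
        "input_tokens": usage.input_tokens,
        "output_tokens": usage.output_tokens,
        "total_tokens": usage.total_tokens,
        "requests": usage.requests,
        "tool_calls": usage.tool_calls,
        "cost": usage.cost,  # may be None when nothing was priced
        "duration_wallclock": chat.duration,
        "messages": len(chat.all_messages),
    }


# Stand-in for a real Chat instance, with made-up numbers.
fake_chat = SimpleNamespace(
    usage=SimpleNamespace(
        input_tokens=120,
        output_tokens=80,
        total_tokens=200,
        requests=2,
        tool_calls=0,
        cost=None,
    ),
    duration=12.5,
    all_messages=["Hello", "Hi!", "How are you?", "Doing well."],
)

summary = session_summary(fake_chat)
print(summary)
```

Passing a real Chat instance instead of `fake_chat` should work the same way, provided the attributes behave as described in the tables above.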