Chat Metrics - Upsonic AI

Overview

Every chat session exposes a single read-only chat.usage property. Each access returns an AggregatedUsage view recomputed from the centralized usage registry and scoped to this chat’s chat_usage_id. The session’s own agent calls, sub-agents, memory summarization, reliability passes, culture, and policy checks all inherit the chat scope and roll into chat.usage automatically. Note that chat.usage.duration sums per-call durations; for wall-clock session length, use chat.duration (or chat.end_time - chat.start_time for an explicit window once the session has closed, since chat.end_time is None while the session is active).

Accessing Chat Metrics

Read chat.usage on any Chat instance. It always returns an AggregatedUsage — zero-valued before the first message, populated thereafter.

Token Metrics

Property	Type	Description
`input_tokens`	`int`	Prompt tokens across all model calls in the session
`output_tokens`	`int`	Completion tokens across all model calls
`total_tokens`	`int`	Sum of `input_tokens + output_tokens`
`cache_read_tokens`	`int`	Tokens read from prompt cache
`cache_write_tokens`	`int`	Tokens written to prompt cache
`reasoning_tokens`	`int`	Reasoning (chain-of-thought) tokens

Request & Tool Metrics

Property	Type	Description
`requests`	`int`	Number of LLM API requests across the session
`tool_calls`	`int`	Number of tool calls across the session

Timing Metrics

Property	Type	Description
`duration`	`float`	Sum of per-call durations recorded on entries (seconds)
`model_execution_time`	`float`	Total time spent inside LLM API calls (seconds)
`tool_execution_time`	`float`	Total time spent executing tools (seconds)
`upsonic_execution_time`	`float`	Framework overhead = `duration − model_execution_time − tool_execution_time` (seconds)
`time_to_first_token`	`float | None`	Earliest time to first token (TTFT) across contributing entries
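The overhead row in the table above can be checked by hand. The following is a minimal sketch, assuming a hypothetical `TimingView` dataclass that stands in for the timing fields of `AggregatedUsage`; it is not the library's class, only an illustration of the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingView:
    """Hypothetical stand-in for the timing fields of AggregatedUsage."""

    duration: float              # sum of per-call durations, in seconds
    model_execution_time: float  # time spent inside LLM API calls, in seconds
    tool_execution_time: float   # time spent executing tools, in seconds

    @property
    def upsonic_execution_time(self) -> float:
        # Framework overhead as defined in the table: duration - model - tool.
        return self.duration - self.model_execution_time - self.tool_execution_time


view = TimingView(duration=4.0, model_execution_time=3.0, tool_execution_time=0.5)
print(f"Framework time: {view.upsonic_execution_time:.2f}s")
```

With these sample figures, 4.0 s of recorded duration minus 3.0 s of model time and 0.5 s of tool time leaves 0.5 s attributed to the framework.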

Cost Metrics

Property	Type	Description
`cost`	`float | None`	Sum of `cost_usd` across contributing entries. `None` if no entry was priced; `0.0` means priced and free.
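The difference between `None` and `0.0` is easy to lose with a truthiness check. The sketch below illustrates the semantics described in the table; `aggregate_cost` and `format_cost` are hypothetical helpers written for this example, and how the library computes the sum internally is an assumption.

```python
from typing import Iterable, Optional


def aggregate_cost(costs: Iterable[Optional[float]]) -> Optional[float]:
    """Sum the priced entries; return None when no entry carried a price."""
    priced = [c for c in costs if c is not None]
    if not priced:
        return None
    return sum(priced)


def format_cost(cost: Optional[float]) -> str:
    # Compare against None explicitly: `if cost:` would treat 0.0 as unpriced.
    return "unpriced" if cost is None else f"${cost:.4f}"


print(format_cost(aggregate_cost([None, None])))  # no entry was priced
print(format_cost(aggregate_cost([0.0, None])))   # priced and free
print(format_cost(aggregate_cost([0.25, 0.5])))   # priced and non-zero
```

This is why the example on this page guards with `if u.cost is not None` rather than `if u.cost`.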

Aggregation Metadata

Property	Type	Description
`entry_count`	`int`	Number of underlying `UsageEntry` rows contributing to this view
`models`	`list[str]`	Distinct model identifiers that contributed (first-seen order)
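To make "first-seen order" concrete, here is a small sketch of one way to deduplicate model identifiers while keeping the order in which they first appeared. The helper name and the second model identifier are illustrative assumptions, not part of the library.

```python
from typing import Iterable, List


def distinct_in_first_seen_order(models: Iterable[str]) -> List[str]:
    # dict preserves insertion order (Python 3.7+), so the keys keep
    # the order in which each identifier was first encountered.
    return list(dict.fromkeys(models))


# One model identifier per contributing entry; the second name is made up.
per_entry_models = [
    "anthropic/claude-sonnet-4-5",
    "example/other-model",
    "anthropic/claude-sonnet-4-5",
]

entry_count = len(per_entry_models)
models = distinct_in_first_seen_order(per_entry_models)
print(entry_count, models)
```

Here three entries contribute, but only two distinct models are listed.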

Session Wall-Clock & Messages

Property	Type	Description
`chat.duration`	`float`	Wall-clock session length in seconds
`chat.start_time`	`float`	Unix timestamp when the session opened
`chat.end_time`	`float | None`	Unix timestamp when the session closed (`None` if active)
`len(chat.all_messages)`	`int`	Total messages in the session
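Because `chat.end_time` is `None` while a session is active, computing `chat.end_time - chat.start_time` directly raises a `TypeError` until the session closes. The sketch below shows one way to guard that subtraction; `session_window` is a hypothetical helper, and falling back to the current time for an active session is an assumption made for illustration.

```python
import time
from typing import Optional


def session_window(
    start_time: float,
    end_time: Optional[float],
    now: Optional[float] = None,
) -> float:
    """Seconds between session open and close.

    While the session is still active (end_time is None), measure up to
    `now`, which defaults to the current Unix time.
    """
    if end_time is None:
        end_time = time.time() if now is None else now
    return end_time - start_time


print(session_window(100.0, 160.0))           # closed session
print(session_window(100.0, None, now=130.0))  # active session, fixed "now"
```

In practice you would pass `chat.start_time` and `chat.end_time`; when you only need the total session length, `chat.duration` is simpler.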

Example

import asyncio
from upsonic import Agent, Chat


async def main():
    agent = Agent("anthropic/claude-sonnet-4-5")
    chat = Chat(session_id="session1", user_id="user1", agent=agent)

    await chat.invoke("Hello")
    await chat.invoke("How are you?")

    u = chat.usage
    print(f"Requests:        {u.requests}")
    print(f"Input tokens:    {u.input_tokens}")
    print(f"Output tokens:   {u.output_tokens}")
    print(f"Tool calls:      {u.tool_calls}")
    if u.cost is not None:
        print(f"Cost:            ${u.cost:.4f}")
    print(f"Model time:      {u.model_execution_time:.2f}s")
    print(f"Tool time:       {u.tool_execution_time:.2f}s")
    print(f"Framework time:  {u.upsonic_execution_time:.2f}s")
    print(f"Models used:     {u.models}")

    # Wall-clock session length
    print(f"Wall-clock:      {chat.duration:.1f}s")
    print(f"Messages:        {len(chat.all_messages)}")


if __name__ == "__main__":
    asyncio.run(main())

JSON-Friendly Output

import json

snapshot = {
    **chat.usage.to_dict(),
    "duration_wallclock": chat.duration,
    "messages": len(chat.all_messages),
}
print(json.dumps(snapshot, indent=2))

Scope & Propagation

Across invocations — Every chat.invoke / chat.stream adds entries under the chat’s scope. Reading chat.usage re-aggregates them, so the figures always reflect the latest state.
Sub-pipeline rollup — Memory summarization, reliability validator/editor, culture, policy, and sub-agent calls inherit the chat scope via context variables and roll into chat.usage automatically.
Retry idempotency — The registry is keyed by entry_id. Retried requests replace their prior entry instead of double-counting.
Persistence — When you configure storage on Chat, recorded entries persist alongside the conversation. Re-opening the same session_id re-hydrates the registry, so chat.usage continues from where it left off across processes and restarts.

Legacy Migration

Legacy chat-level properties and helpers have been removed in favour of chat.usage:

Legacy	Replacement
`chat.input_tokens`	`chat.usage.input_tokens`
`chat.output_tokens`	`chat.usage.output_tokens`
`chat.total_tokens`	`chat.usage.total_tokens`
`chat.total_cost`	`chat.usage.cost`
`chat.total_requests`	`chat.usage.requests`
`chat.total_tool_calls`	`chat.usage.tool_calls`
`chat.run_duration`, `chat.time_to_first_token`	`chat.usage.duration`, `chat.usage.time_to_first_token`
`chat.get_usage()`	`chat.usage`
`chat.get_session_metrics()`, `chat.get_session_summary()`, `SessionMetrics`	`chat.usage` + `chat.duration` + `len(chat.all_messages)`
`chat.get_cost_history()`	(registry query — filter `UsageEntry` rows by `chat_usage_id`)
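For the last row, the replacement is a filter over registry rows rather than a single property. The sketch below shows the shape of that filter using a hypothetical `UsageEntryRow` stand-in with only the two fields named on this page; how rows are obtained from the real registry is not covered here and is left as an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UsageEntryRow:
    """Hypothetical stand-in for a UsageEntry row."""

    chat_usage_id: str
    cost_usd: Optional[float]  # None when the entry was not priced


def cost_history(rows: List[UsageEntryRow], chat_usage_id: str) -> List[Optional[float]]:
    # Keep only the rows recorded under the given chat scope, in order.
    return [r.cost_usd for r in rows if r.chat_usage_id == chat_usage_id]


rows = [
    UsageEntryRow("chat-A", 0.25),
    UsageEntryRow("chat-B", 0.10),
    UsageEntryRow("chat-A", None),
]
print(cost_history(rows, "chat-A"))
```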

Related Documentation

Usage Registry: architecture, scope tags, persistence
Agent Metrics: accumulated agent-level view
Task Metrics: per-task scoped view
Team Metrics: per-team scoped view
Storage Backends: persisting usage across processes
