
Agent Cost Tracking

AgentCostAggregator is an OpenTelemetry (OTel) SpanProcessor that automatically rolls up LLM costs from GenAI spans into their parent agent invocation and session spans, giving you cost visibility per invocation and per orchestration.

How It Works

%%{init: {'theme': 'base', 'themeVariables': {'lineColor': '#6366f1'}}}%%
graph TB
    subgraph GenAI["GenAI Spans (agenttel-genai)"]
        G1["gen_ai chat<br/><small>cost_usd=0.003</small>"]
        G2["gen_ai chat<br/><small>cost_usd=0.005</small>"]
        G3["gen_ai chat<br/><small>cost_usd=0.002</small>"]
    end

    AGG["AgentCostAggregator<br/><small>SpanProcessor.onEnd()</small>"]

    subgraph Targets["Parent Spans"]
        INV["invoke_agent<br/><small>cost.total_usd=0.010<br/>cost.llm_calls=3</small>"]
        SES["agenttel.agentic.session<br/><small>cost.total_usd=0.010</small>"]
    end

    G1 --> AGG
    G2 --> AGG
    G3 --> AGG
    AGG --> INV
    AGG --> SES

    style G1 fill:#818cf8,stroke:#6366f1,color:#1e1b4b
    style G2 fill:#818cf8,stroke:#6366f1,color:#1e1b4b
    style G3 fill:#818cf8,stroke:#6366f1,color:#1e1b4b
    style AGG fill:#a78bfa,stroke:#7c3aed,color:#1e1b4b
    style INV fill:#a78bfa,stroke:#7c3aed,color:#1e1b4b
    style SES fill:#a78bfa,stroke:#7c3aed,color:#1e1b4b

The rollup happens in two steps:

  1. When a GenAI span ends with agenttel.genai.cost_usd > 0, the aggregator accumulates its cost and token counts, keyed by trace ID.
  2. When an invoke_agent or agenttel.agentic.session span ends, the aggregator applies the accumulated totals to that span as attributes.
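
The two steps can be sketched in plain Java, without the OTel SDK. This is an illustrative stand-in, not the library's implementation: the class CostRollupSketch, its method names, and the Totals helper are hypothetical, and the sketch tracks only two of the cost attributes. It also ignores how the real processor separates invocations within a trace and when it releases per-trace state.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative stand-in for the rollup logic; all names here are hypothetical. */
class CostRollupSketch {

    /** Running totals for one trace. */
    static final class Totals {
        double costUsd;
        long llmCalls;
    }

    private final Map<String, Totals> totalsByTrace = new HashMap<>();

    /** Step 1: a GenAI span ended; accumulate only when a positive cost was recorded. */
    void onGenAiSpanEnd(String traceId, double costUsd) {
        if (costUsd <= 0) {
            return;
        }
        Totals t = totalsByTrace.computeIfAbsent(traceId, id -> new Totals());
        t.costUsd += costUsd;
        t.llmCalls++;
    }

    /** Step 2: a parent span ended; return the attributes it would receive. */
    Map<String, Object> onParentSpanEnd(String traceId) {
        Map<String, Object> attributes = new LinkedHashMap<>();
        Totals t = totalsByTrace.get(traceId);
        if (t != null) {
            attributes.put("agenttel.agentic.cost.total_usd", t.costUsd);
            attributes.put("agenttel.agentic.cost.llm_calls", t.llmCalls);
        }
        return attributes;
    }

    public static void main(String[] args) {
        CostRollupSketch sketch = new CostRollupSketch();
        // The three GenAI spans from the diagram above.
        sketch.onGenAiSpanEnd("trace-1", 0.003);
        sketch.onGenAiSpanEnd("trace-1", 0.005);
        sketch.onGenAiSpanEnd("trace-1", 0.002);
        System.out.println(sketch.onParentSpanEnd("trace-1"));
    }
}
```

Feeding in the three spans from the diagram yields a total of about 0.010 USD across 3 LLM calls; a trace with no recorded cost yields no attributes.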

Cost Attributes

| Attribute | Type | Description |
|---|---|---|
| agenttel.agentic.cost.total_usd | Double | Total LLM cost in USD |
| agenttel.agentic.cost.input_tokens | Long | Total input tokens consumed |
| agenttel.agentic.cost.output_tokens | Long | Total output tokens generated |
| agenttel.agentic.cost.llm_calls | Long | Number of LLM API calls |
| agenttel.agentic.cost.reasoning_tokens | Long | Reasoning/thinking tokens (e.g., Claude extended thinking) |
| agenttel.agentic.cost.cached_read_tokens | Long | Tokens served from cache reads |
| agenttel.agentic.cost.cached_write_tokens | Long | Tokens written to cache |

Spring Boot Setup

When both agenttel-agentic and agenttel-genai are on the classpath, cost tracking is automatic. AgentTelAgenticAutoConfiguration creates the AgentCostAggregator bean and registers it as an OTel SpanProcessor.

<dependencies>
    <dependency>
        <groupId>dev.agenttel</groupId>
        <artifactId>agenttel-spring-boot-starter</artifactId>
        <version>0.2.0-alpha</version>
    </dependency>
    <dependency>
        <groupId>dev.agenttel</groupId>
        <artifactId>agenttel-agentic</artifactId>
        <version>0.2.0-alpha</version>
    </dependency>
    <dependency>
        <groupId>dev.agenttel</groupId>
        <artifactId>agenttel-genai</artifactId>
        <version>0.2.0-alpha</version>
    </dependency>
</dependencies>

The same dependencies in Gradle:

dependencies {
    implementation("dev.agenttel:agenttel-spring-boot-starter:0.2.0-alpha")
    implementation("dev.agenttel:agenttel-agentic:0.2.0-alpha")
    implementation("dev.agenttel:agenttel-genai:0.2.0-alpha")
}

No additional configuration is needed. The auto-configuration registers the cost aggregator via AutoConfigurationCustomizerProvider:

// This happens automatically — shown for reference
customizer.addTracerProviderCustomizer(
    (builder, config) -> builder.addSpanProcessor(costAggregator));

Programmatic Registration

For non-Spring applications, register the aggregator manually:

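import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor;
// Also import AgentCostAggregator from the agenttel-agentic module.

AgentCostAggregator costAggregator = new AgentCostAggregator();

// Order matters: add the aggregator before the exporting processor.
// `exporter` is your already-configured SpanExporter.
SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
    .addSpanProcessor(costAggregator)
    .addSpanProcessor(BatchSpanProcessor.builder(exporter).build())
    .build();

OpenTelemetry otel = OpenTelemetrySdk.builder()
    .setTracerProvider(tracerProvider)
    .build();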

In Python:

from agenttel.agentic.cost import AgentCostAggregator

aggregator = AgentCostAggregator()
aggregator.record(input_tokens=1000, output_tokens=500, cost_usd=0.003, model="gpt-4o")

summary = aggregator.summary()
# summary.total_cost_usd, summary.total_input_tokens, etc.

Warning

Register AgentCostAggregator before the BatchSpanProcessor (exporter) so that cost attributes are set on parent spans before they are exported.


Monitoring Costs Across Orchestrations

For multi-agent orchestrations, costs aggregate at the session level:

try (SequentialOrchestration seq = tracer.orchestrate(
        OrchestrationPattern.SEQUENTIAL, 3)) {

    try (AgentInvocation s1 = seq.stage("researcher", 1)) {
        // GenAI calls here → costs tracked per-invocation
        chatModel.generate(messages);  // cost_usd = 0.003
        s1.complete(true);
    }

    try (AgentInvocation s2 = seq.stage("writer", 2)) {
        chatModel.generate(messages);  // cost_usd = 0.008
        chatModel.generate(messages);  // cost_usd = 0.005
        s2.complete(true);
    }

    try (AgentInvocation s3 = seq.stage("reviewer", 3)) {
        chatModel.generate(messages);  // cost_usd = 0.004
        s3.complete(true);
    }

    seq.complete();
}
// Session span gets: cost.total_usd = 0.020, cost.llm_calls = 4

In Python:

from agenttel.agentic.orchestration import SequentialOrchestration

orch = SequentialOrchestration(tracer, stages=["research", "write", "review"])
# GenAI costs are automatically tracked per-stage
# Session span gets aggregated totals

The aggregator uses ConcurrentHashMap with DoubleAdder and LongAdder for thread-safe accumulation across parallel branches.
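
To illustrate why that combination suits parallel branches, here is a minimal sketch of the same pattern using only java.util.concurrent. The class ConcurrentCostSketch, the TraceTotals holder, and the method names are hypothetical and are not the aggregator's actual internals; the cost of 0.25 per call is chosen only because it sums exactly in floating point.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.DoubleAdder;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of lock-free, per-trace accumulation. */
class ConcurrentCostSketch {

    /** One set of adders per trace; adders tolerate concurrent writers. */
    static final class TraceTotals {
        final DoubleAdder costUsd = new DoubleAdder();
        final LongAdder inputTokens = new LongAdder();
        final LongAdder outputTokens = new LongAdder();
        final LongAdder llmCalls = new LongAdder();
    }

    private final ConcurrentHashMap<String, TraceTotals> totals = new ConcurrentHashMap<>();

    void record(String traceId, double costUsd, long inputTokens, long outputTokens) {
        // computeIfAbsent creates the holder atomically, so branches never race on creation.
        TraceTotals t = totals.computeIfAbsent(traceId, id -> new TraceTotals());
        t.costUsd.add(costUsd);
        t.inputTokens.add(inputTokens);
        t.outputTokens.add(outputTokens);
        t.llmCalls.increment();
    }

    TraceTotals get(String traceId) {
        return totals.get(traceId);
    }

    /** Simulates parallel branches that all record into the same trace. */
    static TraceTotals runParallelBranches(int branches, int callsPerBranch) {
        ConcurrentCostSketch sketch = new ConcurrentCostSketch();
        Thread[] workers = new Thread[branches];
        for (int i = 0; i < branches; i++) {
            workers[i] = new Thread(() -> {
                for (int c = 0; c < callsPerBranch; c++) {
                    sketch.record("trace-1", 0.25, 100, 50);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for branches", e);
        }
        return sketch.get("trace-1");
    }

    public static void main(String[] args) {
        TraceTotals t = runParallelBranches(4, 1000);
        System.out.println(t.llmCalls.sum() + " calls, $" + t.costUsd.sum());
    }
}
```

The adders trade a slightly more expensive read (sum()) for cheap contended writes, which fits a workload where many spans write and totals are read once when the parent span ends.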


Budget Guardrails

Combine cost tracking with guardrail activation to enforce budget limits:

try (AgentInvocation inv = tracer.invoke("Process batch")) {
    inv.maxSteps(100);
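    double budgetLimit = 5.00;
    double currentCost = 0.0;  // local running total, maintained by your code

    for (var item : items) {
        var response = chatModel.generate(buildPrompt(item));
        // estimateCostUsd is a hypothetical helper: derive the call's cost
        // from the response's token usage and your model pricing
        currentCost += estimateCostUsd(response);

        // Check the locally accumulated cost against the budget
        if (currentCost > budgetLimit) {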
            inv.guardrail("budget-limit", GuardrailAction.BLOCK,
                String.format("Cost $%.2f exceeded budget $%.2f", currentCost, budgetLimit));
            inv.complete(InvocationStatus.ESCALATED);
            return;
        }
    }

    inv.complete(true);
}

Tip

The AgentCostAggregator sets cost attributes when spans end. For real-time budget enforcement during an invocation, maintain a local cost counter alongside the aggregator.
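
One way to keep such a local counter is a small helper like the one below. It is a sketch under stated assumptions, not part of AgentTel: LocalBudgetTracker and its methods are hypothetical names, and you would need to supply the per-call cost yourself (for example from token usage and model pricing).

```java
import java.util.concurrent.atomic.DoubleAdder;

/** Hypothetical local budget counter, kept alongside the aggregator. */
class LocalBudgetTracker {
    private final double limitUsd;
    private final DoubleAdder spentUsd = new DoubleAdder();

    LocalBudgetTracker(double limitUsd) {
        if (limitUsd < 0) {
            throw new IllegalArgumentException("limitUsd must be non-negative");
        }
        this.limitUsd = limitUsd;
    }

    /** Adds the cost of one call and reports whether spending now exceeds the limit. */
    boolean addAndCheckExceeded(double costUsd) {
        spentUsd.add(costUsd);
        return spentUsd.sum() > limitUsd;
    }

    double spentUsd() {
        return spentUsd.sum();
    }

    public static void main(String[] args) {
        LocalBudgetTracker budget = new LocalBudgetTracker(1.00);
        double[] callCosts = {0.50, 0.50, 0.25};
        for (double cost : callCosts) {
            if (budget.addAndCheckExceeded(cost)) {
                // This is the point where the guardrail call would go.
                System.out.println("Budget exceeded after spending $" + budget.spentUsd());
                break;
            }
        }
    }
}
```

The check uses a strict greater-than comparison, matching the guardrail example above, so spending exactly the limit does not trigger the guardrail.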


Further Reading