Skip to content

Python GenAI Instrumentation

AgentTel instruments GenAI SDK calls to capture model, token usage, cost, and performance as OTel spans.

Supported Providers

Provider Install Extra Wrapper
OpenAI agenttel[openai] instrument_openai(client)
Anthropic agenttel[anthropic] instrument_anthropic(client)
LangChain agenttel[langchain] instrument_langchain()
AWS Bedrock agenttel[bedrock] instrument_bedrock(client)

OpenAI

from openai import OpenAI
from agenttel.genai import instrument_openai

client = instrument_openai(OpenAI())

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain SLOs"}],
    temperature=0.7,
)

Every chat.completions.create() call creates a span with:

Attribute Example Value
gen_ai.operation.name "chat"
gen_ai.system "openai"
gen_ai.request.model "gpt-4o"
gen_ai.request.temperature 0.7
gen_ai.usage.input_tokens 42
gen_ai.usage.output_tokens 156
gen_ai.response.finish_reasons ["stop"]
agenttel.genai.cost_usd 0.000265

Streaming

Streaming responses are also instrumented. Token usage is captured from the final chunk:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Anthropic

from anthropic import Anthropic
from agenttel.genai import instrument_anthropic

client = instrument_anthropic(Anthropic())

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain error budgets"}],
)

Captures Anthropic-specific fields: input_tokens, output_tokens, cache_read_input_tokens, stop_reason.

LangChain

from agenttel.genai import instrument_langchain

instrument_langchain()  # Patches globally

Instruments via LangChain callbacks:

  • ChatModel calls → gen_ai chat spans
  • Retriever calls → gen_ai retrieval spans with document count

AWS Bedrock

import boto3
from agenttel.genai import instrument_bedrock

client = instrument_bedrock(boto3.client("bedrock-runtime"))

# Both invoke_model() and converse() are traced
response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)

Cost Calculation

AgentTel includes a ModelCostCalculator with built-in pricing for popular models:

from agenttel.genai.cost import ModelCostCalculator, ModelPricing

calc = ModelCostCalculator()

# Built-in pricing
cost = calc.calculate("gpt-4o", input_tokens=1000, output_tokens=500)

# Custom pricing
calc.register_pricing("my-model", ModelPricing(
    input_per_1m=2.0,   # $2.00 per 1M input tokens
    output_per_1m=8.0,  # $8.00 per 1M output tokens
))

Built-in Model Pricing

Model Input (per 1M) Output (per 1M)
gpt-4o $2.50 $10.00
gpt-4o-mini $0.15 $0.60
gpt-4-turbo $10.00 $30.00
claude-sonnet-4 $3.00 $15.00
claude-opus-4 $15.00 $75.00
claude-3.5-sonnet $3.00 $15.00
claude-3-haiku $0.25 $1.25

Custom Span Builder

For direct span creation without wrapping a client:

from agenttel.genai.span_builder import GenAiSpanBuilder

builder = GenAiSpanBuilder()

span = builder.start_chat_span(model="gpt-4o", system="openai")
# ... make your API call ...
builder.end_span_with_response(
    span=span,
    model="gpt-4o",
    input_tokens=100,
    output_tokens=50,
    finish_reason="stop",
)