GenAI Instrumentation¶
The agenttel-genai module provides observability for AI/ML workloads on the JVM. It instruments LLM frameworks and provider SDKs with OpenTelemetry spans following the emerging gen_ai.* semantic conventions, enriched with AgentTel extensions for cost tracking, framework identification, and RAG observability.
Overview¶
| Framework | Instrumentation Approach | What You Get |
|---|---|---|
| Spring AI | SpanProcessor enrichment of existing Micrometer spans | Framework tag, cost calculation |
| LangChain4j | Decorator-based full instrumentation | Chat, embeddings, RAG retrieval spans |
| Anthropic Java SDK | Client wrapper | Messages API with token/cost tracking |
| OpenAI Java SDK | Client wrapper | Chat completions with token/cost tracking |
| AWS Bedrock SDK | Client wrapper | Converse API with token/cost tracking |
All GenAI library dependencies are compileOnly — they activate only when the corresponding library is on the classpath. Users provide their own runtime versions.
Python SDK¶
The Python SDK provides equivalent GenAI instrumentation. See the Python GenAI Guide for complete documentation. Quick overview:
| Framework | Installation | What You Get |
|---|---|---|
| OpenAI | `pip install agenttel[openai]` | Chat completions with token/cost tracking |
| Anthropic | `pip install agenttel[anthropic]` | Messages API with token/cost tracking |
| LangChain | `pip install agenttel[langchain]` | ChatModel and Retriever callback handler |
| AWS Bedrock | `pip install agenttel[bedrock]` | invoke_model and converse wrapping |
```python
from agenttel.genai import instrument_openai, instrument_anthropic
from openai import OpenAI
from anthropic import Anthropic

# One-line instrumentation
client = instrument_openai(OpenAI())
response = client.chat.completions.create(model="gpt-4o", messages=[...])
# Span created automatically with gen_ai.* attributes and cost tracking

anthropic_client = instrument_anthropic(Anthropic())
response = anthropic_client.messages.create(model="claude-sonnet-4-20250514", ...)
```
Dependency Setup¶
Maven:
```xml
<dependencies>
  <dependency>
    <groupId>dev.agenttel</groupId>
    <artifactId>agenttel-genai</artifactId>
    <version>0.2.0-alpha</version>
  </dependency>

  <!-- Include whichever GenAI libraries you use: -->
  <dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-core</artifactId>
    <version>1.0.0</version>
  </dependency>
  <!-- or -->
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-core</artifactId>
    <version>1.0.0</version>
  </dependency>
  <!-- or -->
  <dependency>
    <groupId>com.anthropic</groupId>
    <artifactId>anthropic-java</artifactId>
    <version>2.0.0</version>
  </dependency>
  <dependency>
    <groupId>com.openai</groupId>
    <artifactId>openai-java</artifactId>
    <version>4.0.0</version>
  </dependency>
  <dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>bedrockruntime</artifactId>
    <version>2.30.0</version>
  </dependency>
</dependencies>
```
Gradle:
```kotlin
// build.gradle.kts
dependencies {
    implementation("dev.agenttel:agenttel-genai:0.2.0-alpha")

    // Include whichever GenAI libraries you use:
    implementation("dev.langchain4j:langchain4j-core:1.0.0")
    // or
    implementation("org.springframework.ai:spring-ai-core:1.0.0")
    // or
    implementation("com.anthropic:anthropic-java:2.0.0")
    implementation("com.openai:openai-java:4.0.0")
    implementation("software.amazon.awssdk:bedrockruntime:2.30.0")
}
```
LangChain4j Instrumentation¶
LangChain4j has no built-in OTel tracing. AgentTel provides full instrumentation via the decorator pattern.
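The decorator itself is plain Java. A minimal sketch of the pattern with simplified stand-in types (the `ChatModel` interface and the event list below are placeholders for the real LangChain4j interfaces and OTel spans, not the actual API):

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of decorator-based instrumentation, using stand-in types. */
public class TracedChatModel {
    /** Stand-in for LangChain4j's ChatLanguageModel. */
    public interface ChatModel {
        String chat(String prompt);
    }

    private final ChatModel delegate;
    public final List<String> events = new ArrayList<>(); // stand-in for the span lifecycle

    public TracedChatModel(ChatModel delegate) {
        this.delegate = delegate;
    }

    public String chat(String prompt) {
        events.add("span start: chat");     // span opens before delegating
        try {
            String response = delegate.chat(prompt);
            events.add("span end: ok");     // span closes on success...
            return response;
        } catch (RuntimeException e) {
            events.add("span end: error");  // ...or records the failure, then rethrows
            throw e;
        }
    }
}
```

Because the wrapper implements the same interface it decorates, calling code is unchanged; only the construction site differs.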
Chat Model¶
```java
import io.agenttel.genai.langchain4j.LangChain4jInstrumentation;

ChatLanguageModel model = OpenAiChatModel.builder()
        .apiKey("...")
        .modelName("gpt-4o")
        .build();

// Wrap with tracing: every call creates a span
ChatLanguageModel traced = LangChain4jInstrumentation.instrument(
        model, openTelemetry, "gpt-4o", "openai"
);

// Use as normal
ChatResponse response = traced.chat(ChatRequest.builder()
        .messages(List.of(UserMessage.from("Explain observability")))
        .build());
```
Span output:
```
Span: "chat gpt-4o"
  gen_ai.operation.name = "chat"
  gen_ai.system = "openai"
  gen_ai.request.model = "gpt-4o"
  gen_ai.usage.input_tokens = 150
  gen_ai.usage.output_tokens = 42
  gen_ai.response.finish_reasons = ["stop"]
  agenttel.genai.framework = "langchain4j"
  agenttel.genai.cost_usd = 0.00138
```
Embedding Model¶
```java
EmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
        .apiKey("...")
        .modelName("text-embedding-3-small")
        .build();

EmbeddingModel traced = LangChain4jInstrumentation.instrumentEmbedding(
        embeddingModel, openTelemetry, "text-embedding-3-small", "openai"
);

// Span: "embeddings text-embedding-3-small"
Response<List<Embedding>> embeddings = traced.embedAll(
        List.of(TextSegment.from("Agent-ready telemetry"))
);
```
Streaming Chat Model¶
```java
StreamingChatLanguageModel streaming = OpenAiStreamingChatModel.builder()
        .apiKey("...")
        .modelName("gpt-4o")
        .build();

StreamingChatLanguageModel traced = LangChain4jInstrumentation.instrumentStreaming(
        streaming, openTelemetry, "gpt-4o", "openai"
);

// Span starts on call, ends when streaming completes or errors
traced.chat(ChatRequest.builder()
        .messages(messages)
        .build(), handler);
```
RAG Content Retrieval¶
```java
ContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(store)
        .embeddingModel(embeddingModel)
        .build();

ContentRetriever traced = LangChain4jInstrumentation.instrumentRetriever(
        retriever, openTelemetry
);

// Span includes RAG-specific attributes
List<Content> results = traced.retrieve(Query.from("What is AgentTel?"));
```
RAG span attributes:
```
Span: "retrieve"
  gen_ai.operation.name = "retrieve"
  agenttel.genai.framework = "langchain4j"
  agenttel.genai.rag_source_count = 5
  agenttel.genai.rag_relevance_score_avg = 0.87
```
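The aggregate attributes are simple derivations over the scored results. A hypothetical sketch (the `Scored` record stands in for LangChain4j's scored `Content`; its field names are assumptions, not the library's API):

```java
import java.util.List;

/** Sketch: deriving the RAG aggregate attributes from scored retrieval results. */
public class RagAttributes {
    /** Stand-in for a retrieved content item carrying a relevance score. */
    public record Scored(String text, double score) {}

    /** Value for agenttel.genai.rag_source_count. */
    public static int sourceCount(List<Scored> results) {
        return results.size();
    }

    /** Value for agenttel.genai.rag_relevance_score_avg (0.0 if nothing was retrieved). */
    public static double relevanceScoreAvg(List<Scored> results) {
        return results.stream().mapToDouble(Scored::score).average().orElse(0.0);
    }
}
```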
Spring AI Enrichment¶
Spring AI already emits gen_ai.* spans via Micrometer. AgentTel enriches these existing spans rather than replacing them.
SpanProcessor Enrichment¶
SpringAiSpanEnricher is a SpanProcessor that detects Spring AI spans and adds AgentTel attributes:
```java
// Auto-configured via Spring Boot; no code needed.
// Adds to every Spring AI span:
//   agenttel.genai.framework = "spring_ai"
```
Cost Calculation¶
Since token counts are only available after the model responds, cost is computed at export time using a delegating SpanExporter:
```java
// Wraps your existing exporter
SpanExporter costAware = new CostEnrichingSpanExporter(yourOtlpExporter);

// Adds agenttel.genai.cost_usd to spans that have:
//   gen_ai.request.model
//   gen_ai.usage.input_tokens
//   gen_ai.usage.output_tokens
```
Why a SpanExporter? In OpenTelemetry's SpanProcessor.onEnd(), the span is ReadableSpan (immutable). Token usage attributes are set by Spring AI during span execution. The CostEnrichingSpanExporter wraps SpanData with a delegate that injects the cost attribute at export time — the only point where the data can be modified.
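The delegating idea can be modeled in a few lines of plain Java. A sketch under simplified assumptions: a span is represented as a mutable attribute map rather than real OTel `SpanData`, and the rate lookup covers only gpt-4o, using this document's pricing table:

```java
import java.util.List;
import java.util.Map;

/** Sketch of export-time cost enrichment, using stand-in types. */
public class CostEnrichingExporter {
    /** Stand-in for the OTel SpanExporter interface. */
    public interface Exporter {
        void export(List<Map<String, Object>> spans);
    }

    private final Exporter delegate;

    public CostEnrichingExporter(Exporter delegate) {
        this.delegate = delegate;
    }

    public void export(List<Map<String, Object>> spans) {
        for (Map<String, Object> span : spans) {
            Object model = span.get("gen_ai.request.model");
            Object in = span.get("gen_ai.usage.input_tokens");
            Object out = span.get("gen_ai.usage.output_tokens");
            // Only spans carrying a model and both token counts get a cost attribute.
            if (model != null && in != null && out != null) {
                span.put("agenttel.genai.cost_usd",
                        cost((String) model, (Long) in, (Long) out));
            }
        }
        delegate.export(spans); // hand the enriched batch to the wrapped exporter
    }

    // Illustrative gpt-4o rates ($5.00 / $15.00 per 1M tokens); unknown models cost 0.
    static double cost(String model, long inputTokens, long outputTokens) {
        if (!model.equals("gpt-4o")) return 0.0;
        return inputTokens * 5.00 / 1_000_000 + outputTokens * 15.00 / 1_000_000;
    }
}
```

The wrapped exporter sees fully enriched spans, so no change is needed anywhere else in the pipeline.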
Provider SDK Instrumentation¶
Anthropic Java SDK¶
```java
import io.agenttel.genai.anthropic.TracingAnthropicClient;

AnthropicClient client = AnthropicOkHttpClient.builder()
        .apiKey("...")
        .build();

// Wrap with tracing
AnthropicClient traced = new TracingAnthropicClient(client, openTelemetry);

// Spans created for every messages.create() call
MessageCreateParams params = MessageCreateParams.builder()
        .model("claude-sonnet-4-20250514")
        .maxTokens(1024)
        .messages(List.of(...))
        .build();

Message response = traced.messages().create(params);
```
Span attributes:
```
Span: "chat claude-sonnet-4-20250514"
  gen_ai.system = "anthropic"
  gen_ai.request.model = "claude-sonnet-4-20250514"
  gen_ai.usage.input_tokens = 200
  gen_ai.usage.output_tokens = 150
  agenttel.genai.cost_usd = 0.00285
```
OpenAI Java SDK¶
```java
import io.agenttel.genai.openai.TracingOpenAIClient;

OpenAIClient client = OpenAIOkHttpClient.builder()
        .apiKey("...")
        .build();

OpenAIClient traced = new TracingOpenAIClient(client, openTelemetry);
ChatCompletion completion = traced.chat().completions().create(params);
```
AWS Bedrock SDK¶
```java
import io.agenttel.genai.bedrock.TracingBedrockRuntimeClient;

BedrockRuntimeClient client = BedrockRuntimeClient.builder()
        .region(Region.US_EAST_1)
        .build();

BedrockRuntimeClient traced = new TracingBedrockRuntimeClient(client, openTelemetry);

ConverseResponse response = traced.converse(ConverseRequest.builder()
        .modelId("anthropic.claude-3-sonnet-20240229-v1:0")
        .messages(...)
        .build());
```
Cost Calculation¶
ModelCostCalculator computes estimated costs based on model and token counts.
Supported Models¶
| Provider | Models | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|---|
| Anthropic | Claude Opus 4 | $15.00 | $75.00 |
| Anthropic | Claude Sonnet 4 | $3.00 | $15.00 |
| Anthropic | Claude Haiku 3.5 | $0.80 | $4.00 |
| OpenAI | GPT-4o | $5.00 | $15.00 |
| OpenAI | GPT-4o mini | $0.15 | $0.60 |
| OpenAI | GPT-4 Turbo | $10.00 | $30.00 |
| OpenAI | text-embedding-3-small | $0.02 | — |
| AWS Bedrock | Claude models (via Bedrock) | Same as Anthropic | Same as Anthropic |
Programmatic Usage¶
```java
double cost = ModelCostCalculator.calculateCost("gpt-4o", 1000, 500);
// cost = 0.0125 (USD)

// Returns 0.0 for unknown models (graceful fallback)
double unknown = ModelCostCalculator.calculateCost("custom-model", 1000, 500);
// unknown = 0.0
```
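Internally this is a rate-table lookup. A sketch, assuming a map keyed by model name holding the per-1M-token rates from the table above (the real calculator's model keys and matching rules may differ):

```java
import java.util.Map;

/** Sketch of the cost lookup, with illustrative model keys and rates. */
public class CostTable {
    // model -> { input USD per 1M tokens, output USD per 1M tokens }
    private static final Map<String, double[]> RATES = Map.of(
            "gpt-4o", new double[]{5.00, 15.00},
            "gpt-4o-mini", new double[]{0.15, 0.60},
            "claude-sonnet-4", new double[]{3.00, 15.00});

    public static double calculateCost(String model, long inputTokens, long outputTokens) {
        double[] rates = RATES.get(model);
        if (rates == null) return 0.0; // graceful fallback for unknown models
        return inputTokens * rates[0] / 1_000_000 + outputTokens * rates[1] / 1_000_000;
    }
}
```

With gpt-4o at $5.00/$15.00 per 1M tokens, 1000 input and 500 output tokens yield $0.005 + $0.0075 = $0.0125, matching the example above.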
Auto-Configuration¶
When using Spring Boot, GenAI instrumentation is auto-configured based on classpath detection:
| Condition | Configuration Class | What It Does |
|---|---|---|
| `ChatLanguageModel` on classpath | `LangChain4jGenAiAutoConfiguration` | Wraps LangChain4j model beans with tracing decorators |
| `ChatModel` on classpath | `SpringAiGenAiAutoConfiguration` | Registers SpringAiSpanEnricher as a SpanProcessor |
| `AnthropicClient` on classpath | `AnthropicGenAiAutoConfiguration` | Wraps Anthropic client beans |
| `OpenAIClient` on classpath | `OpenAiGenAiAutoConfiguration` | Wraps OpenAI client beans |
| `BedrockRuntimeClient` on classpath | `BedrockGenAiAutoConfiguration` | Wraps Bedrock client beans |
Auto-configuration classes are registered via META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports.
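Such an imports file lists one fully qualified auto-configuration class per line. A hypothetical example using the class names from the table above (the package prefix `io.agenttel.genai.spring` is an assumption, not necessarily the module's actual layout):

```
io.agenttel.genai.spring.LangChain4jGenAiAutoConfiguration
io.agenttel.genai.spring.SpringAiGenAiAutoConfiguration
io.agenttel.genai.spring.AnthropicGenAiAutoConfiguration
io.agenttel.genai.spring.OpenAiGenAiAutoConfiguration
io.agenttel.genai.spring.BedrockGenAiAutoConfiguration
```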
Span Naming Convention¶
All GenAI spans follow the pattern: "{operation} {model}".
| Operation | Span Name Example |
|---|---|
| Chat completion | "chat gpt-4o" |
| Text completion | "text_completion gpt-3.5-turbo" |
| Embedding | "embeddings text-embedding-3-small" |
| RAG retrieval | "retrieve" |
This follows the emerging OTel GenAI semantic conventions for span naming.
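The rule is trivial to express in code. A sketch, assuming model-less operations such as retrieval simply use the bare operation name:

```java
/** Sketch of the "{operation} {model}" span-naming rule. */
public class SpanNames {
    public static String spanName(String operation, String model) {
        // Operations without a model (e.g. "retrieve") use the bare operation name.
        return (model == null || model.isBlank()) ? operation : operation + " " + model;
    }
}
```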