GenAI Instrumentation¶
The agenttel-genai module provides observability for AI/ML workloads on the JVM. It instruments LLM frameworks and provider SDKs with OpenTelemetry spans following the emerging gen_ai.* semantic conventions, enriched with AgentTel extensions for cost tracking, framework identification, and RAG observability.
Overview¶
| Framework | Instrumentation Approach | What You Get |
|---|---|---|
| Spring AI | SpanProcessor enrichment of existing Micrometer spans | Framework tag, cost calculation |
| LangChain4j | Decorator-based full instrumentation | Chat, embeddings, RAG retrieval spans |
| Anthropic Java SDK | Client wrapper | Messages API with token/cost tracking |
| OpenAI Java SDK | Client wrapper | Chat completions with token/cost tracking |
| AWS Bedrock SDK | Client wrapper | Converse API with token/cost tracking |
All GenAI library dependencies are compileOnly — they activate only when the corresponding library is on the classpath. Users provide their own runtime versions.
Python SDK¶
The Python SDK provides equivalent GenAI instrumentation. See the Python GenAI Guide for complete documentation. Quick overview:
| Framework | Installation | What You Get |
|---|---|---|
| OpenAI | `pip install agenttel[openai]` | Chat completions with token/cost tracking |
| Anthropic | `pip install agenttel[anthropic]` | Messages API with token/cost tracking |
| LangChain | `pip install agenttel[langchain]` | ChatModel and Retriever callback handler |
| AWS Bedrock | `pip install agenttel[bedrock]` | invoke_model and converse wrapping |
```python
from agenttel.genai import instrument_openai, instrument_anthropic
from openai import OpenAI
from anthropic import Anthropic

# One-line instrumentation
client = instrument_openai(OpenAI())
response = client.chat.completions.create(model="gpt-4o", messages=[...])
# Span created automatically with gen_ai.* attributes and cost tracking

anthropic_client = instrument_anthropic(Anthropic())
response = anthropic_client.messages.create(model="claude-sonnet-4-20250514", ...)
```
Dependency Setup¶
Maven:
```xml
<dependencies>
  <dependency>
    <groupId>dev.agenttel</groupId>
    <artifactId>agenttel-genai</artifactId>
    <version>0.2.0-alpha</version>
  </dependency>

  <!-- Include whichever GenAI libraries you use: -->
  <dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-core</artifactId>
    <version>1.0.0</version>
  </dependency>
  <!-- or -->
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-core</artifactId>
    <version>1.0.0</version>
  </dependency>
  <!-- or -->
  <dependency>
    <groupId>com.anthropic</groupId>
    <artifactId>anthropic-java</artifactId>
    <version>2.0.0</version>
  </dependency>
  <dependency>
    <groupId>com.openai</groupId>
    <artifactId>openai-java</artifactId>
    <version>4.0.0</version>
  </dependency>
  <dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>bedrockruntime</artifactId>
    <version>2.30.0</version>
  </dependency>
</dependencies>
```
Gradle:
```kotlin
// build.gradle.kts
dependencies {
    implementation("dev.agenttel:agenttel-genai:0.2.0-alpha")

    // Include whichever GenAI libraries you use:
    implementation("dev.langchain4j:langchain4j-core:1.0.0")
    // or
    implementation("org.springframework.ai:spring-ai-core:1.0.0")
    // or
    implementation("com.anthropic:anthropic-java:2.0.0")
    implementation("com.openai:openai-java:4.0.0")
    implementation("software.amazon.awssdk:bedrockruntime:2.30.0")
}
```
LangChain4j Instrumentation¶
LangChain4j has no built-in OTel tracing. AgentTel provides full instrumentation via the decorator pattern.
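The decorator itself is plain Java. A minimal sketch of the pattern with simplified stand-in types (the `ChatModel` interface and the event list below are placeholders for the real LangChain4j interfaces and OTel spans, not the actual API):

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of decorator-based instrumentation, using stand-in types. */
public class TracedChatModel {
    /** Stand-in for LangChain4j's ChatLanguageModel. */
    public interface ChatModel {
        String chat(String prompt);
    }

    private final ChatModel delegate;
    public final List<String> events = new ArrayList<>(); // stand-in for the span lifecycle

    public TracedChatModel(ChatModel delegate) {
        this.delegate = delegate;
    }

    public String chat(String prompt) {
        events.add("span start: chat");     // span opens before delegating
        try {
            String response = delegate.chat(prompt);
            events.add("span end: ok");     // span closes on success...
            return response;
        } catch (RuntimeException e) {
            events.add("span end: error");  // ...or records the failure, then rethrows
            throw e;
        }
    }
}
```

Because the wrapper implements the same interface it decorates, calling code is unchanged; only the construction site differs.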
Chat Model¶
```java
import io.agenttel.genai.langchain4j.LangChain4jInstrumentation;

ChatLanguageModel model = OpenAiChatModel.builder()
        .apiKey("...")
        .modelName("gpt-4o")
        .build();

// Wrap with tracing: every call creates a span
ChatLanguageModel traced = LangChain4jInstrumentation.instrument(
        model, openTelemetry, "gpt-4o", "openai"
);

// Use as normal
ChatResponse response = traced.chat(ChatRequest.builder()
        .messages(List.of(UserMessage.from("Explain observability")))
        .build());
```
Span output:
```
Span: "chat gpt-4o"
  gen_ai.operation.name = "chat"
  gen_ai.system = "openai"
  gen_ai.request.model = "gpt-4o"
  gen_ai.usage.input_tokens = 150
  gen_ai.usage.output_tokens = 42
  gen_ai.response.finish_reasons = ["stop"]
  agenttel.genai.framework = "langchain4j"
  agenttel.genai.cost_usd = 0.00138
```
Embedding Model¶
```java
EmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
        .apiKey("...")
        .modelName("text-embedding-3-small")
        .build();

EmbeddingModel traced = LangChain4jInstrumentation.instrumentEmbedding(
        embeddingModel, openTelemetry, "text-embedding-3-small", "openai"
);

// Span: "embeddings text-embedding-3-small"
Response<List<Embedding>> embeddings = traced.embedAll(
        List.of(TextSegment.from("Agent-ready telemetry"))
);
```
Streaming Chat Model¶
```java
StreamingChatLanguageModel streaming = OpenAiStreamingChatModel.builder()
        .apiKey("...")
        .modelName("gpt-4o")
        .build();

StreamingChatLanguageModel traced = LangChain4jInstrumentation.instrumentStreaming(
        streaming, openTelemetry, "gpt-4o", "openai"
);

// Span starts on call, ends when streaming completes or errors
traced.chat(ChatRequest.builder()
        .messages(messages)
        .build(), handler);
```
RAG Content Retrieval¶
```java
ContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(store)
        .embeddingModel(embeddingModel)
        .build();

ContentRetriever traced = LangChain4jInstrumentation.instrumentRetriever(
        retriever, openTelemetry
);

// Span includes RAG-specific attributes
List<Content> results = traced.retrieve(Query.from("What is AgentTel?"));
```
RAG span attributes:
```
Span: "retrieve"
  gen_ai.operation.name = "retrieve"
  agenttel.genai.framework = "langchain4j"
  agenttel.genai.rag_source_count = 5
  agenttel.genai.rag_relevance_score_avg = 0.87
```
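The aggregate attributes are simple derivations over the scored results. A hypothetical sketch (the `Scored` record stands in for LangChain4j's scored `Content`; its field names are assumptions, not the library's API):

```java
import java.util.List;

/** Sketch: deriving the RAG aggregate attributes from scored retrieval results. */
public class RagAttributes {
    /** Stand-in for a retrieved content item carrying a relevance score. */
    public record Scored(String text, double score) {}

    /** Value for agenttel.genai.rag_source_count. */
    public static int sourceCount(List<Scored> results) {
        return results.size();
    }

    /** Value for agenttel.genai.rag_relevance_score_avg (0.0 if nothing was retrieved). */
    public static double relevanceScoreAvg(List<Scored> results) {
        return results.stream().mapToDouble(Scored::score).average().orElse(0.0);
    }
}
```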
Spring AI Enrichment¶
Spring AI already emits gen_ai.* spans via Micrometer. AgentTel enriches these existing spans rather than replacing them.
SpanProcessor Enrichment¶
SpringAiSpanEnricher is a SpanProcessor that detects Spring AI spans and adds AgentTel attributes:
```java
// Auto-configured via Spring Boot; no code needed.
// Adds to every Spring AI span:
//   agenttel.genai.framework = "spring_ai"
```
Cost Calculation¶
Since token counts are only available after the model responds, cost is computed at export time using a delegating SpanExporter:
```java
// Wraps your existing exporter
SpanExporter costAware = new CostEnrichingSpanExporter(yourOtlpExporter);

// Adds agenttel.genai.cost_usd to spans that have:
//   gen_ai.request.model
//   gen_ai.usage.input_tokens
//   gen_ai.usage.output_tokens
```
Why a SpanExporter? In OpenTelemetry's SpanProcessor.onEnd(), the span is ReadableSpan (immutable). Token usage attributes are set by Spring AI during span execution. The CostEnrichingSpanExporter wraps SpanData with a delegate that injects the cost attribute at export time — the only point where the data can be modified.
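The delegating idea can be modeled in a few lines of plain Java. A sketch under simplified assumptions: a span is represented as a mutable attribute map rather than real OTel `SpanData`, and the rate lookup covers only gpt-4o, using this document's pricing table:

```java
import java.util.List;
import java.util.Map;

/** Sketch of export-time cost enrichment, using stand-in types. */
public class CostEnrichingExporter {
    /** Stand-in for the OTel SpanExporter interface. */
    public interface Exporter {
        void export(List<Map<String, Object>> spans);
    }

    private final Exporter delegate;

    public CostEnrichingExporter(Exporter delegate) {
        this.delegate = delegate;
    }

    public void export(List<Map<String, Object>> spans) {
        for (Map<String, Object> span : spans) {
            Object model = span.get("gen_ai.request.model");
            Object in = span.get("gen_ai.usage.input_tokens");
            Object out = span.get("gen_ai.usage.output_tokens");
            // Only spans carrying a model and both token counts get a cost attribute.
            if (model != null && in != null && out != null) {
                span.put("agenttel.genai.cost_usd",
                        cost((String) model, (Long) in, (Long) out));
            }
        }
        delegate.export(spans); // hand the enriched batch to the wrapped exporter
    }

    // Illustrative gpt-4o rates ($5.00 / $15.00 per 1M tokens); unknown models cost 0.
    static double cost(String model, long inputTokens, long outputTokens) {
        if (!model.equals("gpt-4o")) return 0.0;
        return inputTokens * 5.00 / 1_000_000 + outputTokens * 15.00 / 1_000_000;
    }
}
```

The wrapped exporter sees fully enriched spans, so no change is needed anywhere else in the pipeline.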
Provider SDK Instrumentation¶
Anthropic Java SDK¶
```java
import io.agenttel.genai.anthropic.TracingAnthropicClient;

AnthropicClient client = AnthropicOkHttpClient.builder()
        .apiKey("...")
        .build();

// Wrap with tracing
AnthropicClient traced = new TracingAnthropicClient(client, openTelemetry);

// Spans created for every messages.create() call
MessageCreateParams params = MessageCreateParams.builder()
        .model("claude-sonnet-4-20250514")
        .maxTokens(1024)
        .messages(List.of(...))
        .build();

Message response = traced.messages().create(params);
```
Span attributes:
```
Span: "chat claude-sonnet-4-20250514"
  gen_ai.system = "anthropic"
  gen_ai.request.model = "claude-sonnet-4-20250514"
  gen_ai.usage.input_tokens = 200
  gen_ai.usage.output_tokens = 150
  agenttel.genai.cost_usd = 0.00285
```
OpenAI Java SDK¶
```java
import io.agenttel.genai.openai.TracingOpenAIClient;

OpenAIClient client = OpenAIOkHttpClient.builder()
        .apiKey("...")
        .build();

OpenAIClient traced = new TracingOpenAIClient(client, openTelemetry);
ChatCompletion completion = traced.chat().completions().create(params);
```
AWS Bedrock SDK¶
```java
import io.agenttel.genai.bedrock.TracingBedrockRuntimeClient;

BedrockRuntimeClient client = BedrockRuntimeClient.builder()
        .region(Region.US_EAST_1)
        .build();

BedrockRuntimeClient traced = new TracingBedrockRuntimeClient(client, openTelemetry);

ConverseResponse response = traced.converse(ConverseRequest.builder()
        .modelId("anthropic.claude-3-sonnet-20240229-v1:0")
        .messages(...)
        .build());
```
Cost Calculation¶
ModelCostCalculator computes estimated costs based on model and token counts.
Supported Models¶
| Provider | Models | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|---|
| Anthropic | Claude Opus 4 | $15.00 | $75.00 |
| Anthropic | Claude Sonnet 4 | $3.00 | $15.00 |
| Anthropic | Claude Haiku 3.5 | $0.80 | $4.00 |
| OpenAI | GPT-4o | $5.00 | $15.00 |
| OpenAI | GPT-4o mini | $0.15 | $0.60 |
| OpenAI | GPT-4 Turbo | $10.00 | $30.00 |
| OpenAI | text-embedding-3-small | $0.02 | — |
| AWS Bedrock | Claude models (via Bedrock) | Same as Anthropic | Same as Anthropic |
Programmatic Usage¶
```java
double cost = ModelCostCalculator.calculateCost("gpt-4o", 1000, 500);
// cost = 0.0125 (USD)

// Returns 0.0 for unknown models (graceful fallback)
double unknown = ModelCostCalculator.calculateCost("custom-model", 1000, 500);
// unknown = 0.0
```
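Internally this is a rate-table lookup. A sketch, assuming a map keyed by model name holding the per-1M-token rates from the table above (the real calculator's model keys and matching rules may differ):

```java
import java.util.Map;

/** Sketch of the cost lookup, with illustrative model keys and rates. */
public class CostTable {
    // model -> { input USD per 1M tokens, output USD per 1M tokens }
    private static final Map<String, double[]> RATES = Map.of(
            "gpt-4o", new double[]{5.00, 15.00},
            "gpt-4o-mini", new double[]{0.15, 0.60},
            "claude-sonnet-4", new double[]{3.00, 15.00});

    public static double calculateCost(String model, long inputTokens, long outputTokens) {
        double[] rates = RATES.get(model);
        if (rates == null) return 0.0; // graceful fallback for unknown models
        return inputTokens * rates[0] / 1_000_000 + outputTokens * rates[1] / 1_000_000;
    }
}
```

With gpt-4o at $5.00/$15.00 per 1M tokens, 1000 input and 500 output tokens yield $0.005 + $0.0075 = $0.0125, matching the example above.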
Auto-Configuration¶
When using Spring Boot, GenAI instrumentation is auto-configured based on classpath detection:
| Condition | Configuration Class | What It Does |
|---|---|---|
| `ChatLanguageModel` on classpath | `LangChain4jGenAiAutoConfiguration` | Wraps LangChain4j model beans with tracing decorators |
| `ChatModel` on classpath | `SpringAiGenAiAutoConfiguration` | Registers SpringAiSpanEnricher as a SpanProcessor |
| `AnthropicClient` on classpath | `AnthropicGenAiAutoConfiguration` | Wraps Anthropic client beans |
| `OpenAIClient` on classpath | `OpenAiGenAiAutoConfiguration` | Wraps OpenAI client beans |
| `BedrockRuntimeClient` on classpath | `BedrockGenAiAutoConfiguration` | Wraps Bedrock client beans |
Auto-configuration classes are registered via META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports.
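Such an imports file lists one fully qualified auto-configuration class per line. A hypothetical example using the class names from the table above (the package prefix `io.agenttel.genai.spring` is an assumption, not necessarily the module's actual layout):

```
io.agenttel.genai.spring.LangChain4jGenAiAutoConfiguration
io.agenttel.genai.spring.SpringAiGenAiAutoConfiguration
io.agenttel.genai.spring.AnthropicGenAiAutoConfiguration
io.agenttel.genai.spring.OpenAiGenAiAutoConfiguration
io.agenttel.genai.spring.BedrockGenAiAutoConfiguration
```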
Span Naming Convention¶
All GenAI spans follow the pattern: "{operation} {model}".
| Operation | Span Name Example |
|---|---|
| Chat completion | "chat gpt-4o" |
| Text completion | "text_completion gpt-3.5-turbo" |
| Embedding | "embeddings text-embedding-3-small" |
| RAG retrieval | "retrieve" |
This follows the emerging OTel GenAI semantic conventions for span naming.
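The rule is trivial to express in code. A sketch, assuming model-less operations such as retrieval simply use the bare operation name:

```java
/** Sketch of the "{operation} {model}" span-naming rule. */
public class SpanNames {
    public static String spanName(String operation, String model) {
        // Operations without a model (e.g. "retrieve") use the bare operation name.
        return (model == null || model.isBlank()) ? operation : operation + " " + model;
    }
}
```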