GenAI Instrumentation¶
The agenttel-genai module provides observability for AI/ML workloads on the JVM. It instruments LLM frameworks and provider SDKs with OpenTelemetry spans following the emerging gen_ai.* semantic conventions, enriched with AgentTel extensions for cost tracking, framework identification, and RAG observability.
Overview¶
| Framework | Instrumentation Approach | What You Get |
|---|---|---|
| Spring AI | SpanProcessor enrichment of existing Micrometer spans | Framework tag, cost calculation |
| LangChain4j | Decorator-based full instrumentation | Chat, embeddings, RAG retrieval spans |
| Anthropic Java SDK | Client wrapper | Messages API with token/cost tracking |
| OpenAI Java SDK | Client wrapper | Chat completions with token/cost tracking |
| AWS Bedrock SDK | Client wrapper | Converse API with token/cost tracking |
All GenAI library dependencies are compileOnly: each instrumentation activates only when the corresponding library is on the classpath, and users provide their own runtime versions.
Dependency Setup¶
Maven:
<dependencies>
    <dependency>
        <groupId>io.agenttel</groupId>
        <artifactId>agenttel-genai</artifactId>
        <version>0.1.0-alpha</version>
    </dependency>

    <!-- Include whichever GenAI libraries you use: -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-core</artifactId>
        <version>1.0.0</version>
    </dependency>
    <!-- or -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-core</artifactId>
        <version>1.0.0</version>
    </dependency>
    <!-- or -->
    <dependency>
        <groupId>com.anthropic</groupId>
        <artifactId>anthropic-java</artifactId>
        <version>2.0.0</version>
    </dependency>
    <dependency>
        <groupId>com.openai</groupId>
        <artifactId>openai-java</artifactId>
        <version>4.0.0</version>
    </dependency>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>bedrockruntime</artifactId>
        <version>2.30.0</version>
    </dependency>
</dependencies>
Gradle:
// build.gradle.kts
dependencies {
    implementation("io.agenttel:agenttel-genai:0.1.0-alpha")

    // Include whichever GenAI libraries you use:
    implementation("dev.langchain4j:langchain4j-core:1.0.0")
    // or
    implementation("org.springframework.ai:spring-ai-core:1.0.0")
    // or
    implementation("com.anthropic:anthropic-java:2.0.0")
    implementation("com.openai:openai-java:4.0.0")
    implementation("software.amazon.awssdk:bedrockruntime:2.30.0")
}
LangChain4j Instrumentation¶
LangChain4j has no built-in OTel tracing. AgentTel provides full instrumentation via the decorator pattern.
Chat Model¶
import io.agenttel.genai.langchain4j.LangChain4jInstrumentation;
ChatLanguageModel model = OpenAiChatModel.builder()
        .apiKey("...")
        .modelName("gpt-4o")
        .build();

// Wrap with tracing — every call creates a span
ChatLanguageModel traced = LangChain4jInstrumentation.instrument(
        model, openTelemetry, "gpt-4o", "openai"
);

// Use as normal
ChatResponse response = traced.chat(ChatRequest.builder()
        .messages(List.of(UserMessage.from("Explain observability")))
        .build());
Span output:
Span: "chat gpt-4o"
gen_ai.operation.name = "chat"
gen_ai.system = "openai"
gen_ai.request.model = "gpt-4o"
gen_ai.usage.input_tokens = 150
gen_ai.usage.output_tokens = 42
gen_ai.response.finish_reasons = ["stop"]
agenttel.genai.framework = "langchain4j"
agenttel.genai.cost_usd = 0.000795
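Under the hood, the decorator opens a span around each delegated call and records token usage from the response. Below is a minimal sketch of that lifecycle, assuming LangChain4j's ChatRequest/ChatResponse API and the attribute names shown above; it is illustrative, not the module's actual source.

// Hypothetical helper showing the span lifecycle the decorator manages
// around a single chat call (error handling and streaming omitted).
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanKind;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

static ChatResponse tracedChat(ChatLanguageModel delegate, Tracer tracer,
                               String model, String system, ChatRequest request) {
    Span span = tracer.spanBuilder("chat " + model)   // "{operation} {model}" naming
            .setSpanKind(SpanKind.CLIENT)
            .setAttribute("gen_ai.operation.name", "chat")
            .setAttribute("gen_ai.system", system)
            .setAttribute("gen_ai.request.model", model)
            .startSpan();
    try (Scope ignored = span.makeCurrent()) {
        ChatResponse response = delegate.chat(request);
        if (response.tokenUsage() != null) {
            span.setAttribute("gen_ai.usage.input_tokens", response.tokenUsage().inputTokenCount());
            span.setAttribute("gen_ai.usage.output_tokens", response.tokenUsage().outputTokenCount());
        }
        return response;
    } catch (RuntimeException e) {
        span.recordException(e);
        throw e;
    } finally {
        span.end();
    }
}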
Embedding Model¶
EmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
        .apiKey("...")
        .modelName("text-embedding-3-small")
        .build();

EmbeddingModel traced = LangChain4jInstrumentation.instrumentEmbedding(
        embeddingModel, openTelemetry, "text-embedding-3-small", "openai"
);

// Span: "embeddings text-embedding-3-small"
Response<List<Embedding>> embeddings = traced.embedAll(
        List.of(TextSegment.from("Agent-ready telemetry"))
);
Streaming Chat Model¶
StreamingChatLanguageModel streaming = OpenAiStreamingChatModel.builder()
        .apiKey("...")
        .modelName("gpt-4o")
        .build();

StreamingChatLanguageModel traced = LangChain4jInstrumentation.instrumentStreaming(
        streaming, openTelemetry, "gpt-4o", "openai"
);

// Span starts on call, ends when streaming completes or errors
traced.chat(ChatRequest.builder()
        .messages(messages)
        .build(), handler);
RAG Content Retrieval¶
ContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(store)
        .embeddingModel(embeddingModel)
        .build();

ContentRetriever traced = LangChain4jInstrumentation.instrumentRetriever(
        retriever, openTelemetry
);

// Span includes RAG-specific attributes
List<Content> results = traced.retrieve(Query.from("What is AgentTel?"));
RAG span attributes:
Span: "retrieve"
gen_ai.operation.name = "retrieve"
agenttel.genai.framework = "langchain4j"
agenttel.genai.rag_source_count = 5
agenttel.genai.rag_relevance_score_avg = 0.87
Spring AI Enrichment¶
Spring AI already emits gen_ai.* spans via Micrometer. AgentTel enriches these existing spans rather than replacing them.
SpanProcessor Enrichment¶
SpringAiSpanEnricher is a SpanProcessor that detects Spring AI spans and adds AgentTel attributes:
// Auto-configured via Spring Boot — no code needed
// Adds to every Spring AI span:
// agenttel.genai.framework = "spring_ai"
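For reference, an enricher of this shape is a standard OpenTelemetry SpanProcessor. The sketch below is a hypothetical illustration, not the module's actual source; in particular, detecting Spring AI spans by instrumentation scope name is an assumption made for brevity.

// Hypothetical span processor that tags matching spans at start time,
// while the span is still writable.
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.context.Context;
import io.opentelemetry.sdk.trace.ReadWriteSpan;
import io.opentelemetry.sdk.trace.ReadableSpan;
import io.opentelemetry.sdk.trace.SpanProcessor;

public final class FrameworkTaggingSpanProcessor implements SpanProcessor {

    private static final AttributeKey<String> FRAMEWORK =
            AttributeKey.stringKey("agenttel.genai.framework");

    @Override
    public void onStart(Context parentContext, ReadWriteSpan span) {
        // Spring AI spans arrive through Micrometer's OTel bridge; the scope
        // name check is a stand-in detection heuristic for this sketch.
        String scope = span.getInstrumentationScopeInfo().getName();
        if (scope.contains("spring")) {
            span.setAttribute(FRAMEWORK, "spring_ai");
        }
    }

    @Override
    public boolean isStartRequired() {
        return true;
    }

    @Override
    public void onEnd(ReadableSpan span) {
        // Nothing to do: spans are read-only here (see Cost Calculation below).
    }

    @Override
    public boolean isEndRequired() {
        return false;
    }
}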
Cost Calculation¶
Since token counts are only available after the model responds, cost is computed at export time using a delegating SpanExporter:
// Wraps your existing exporter
SpanExporter costAware = new CostEnrichingSpanExporter(yourOtlpExporter);
// Adds agenttel.genai.cost_usd to spans that have:
// gen_ai.request.model
// gen_ai.usage.input_tokens
// gen_ai.usage.output_tokens
Why a SpanExporter? In OpenTelemetry's SpanProcessor.onEnd(), the span is ReadableSpan (immutable). Token usage attributes are set by Spring AI during span execution. The CostEnrichingSpanExporter wraps SpanData with a delegate that injects the cost attribute at export time — the only point where the data can be modified.
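A simplified sketch of this delegating-exporter pattern follows. It assumes the SDK's DelegatingSpanData helper and the ModelCostCalculator signature shown under Programmatic Usage below (its import is omitted since its package is not documented here); the real CostEnrichingSpanExporter may differ in detail.

// Hypothetical exporter that adds a cost attribute to GenAI spans at export time.
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.sdk.common.CompletableResultCode;
import io.opentelemetry.sdk.trace.data.DelegatingSpanData;
import io.opentelemetry.sdk.trace.data.SpanData;
import io.opentelemetry.sdk.trace.export.SpanExporter;
import java.util.Collection;
import java.util.stream.Collectors;

public final class CostInjectingExporter implements SpanExporter {

    private static final AttributeKey<String> MODEL = AttributeKey.stringKey("gen_ai.request.model");
    private static final AttributeKey<Long> INPUT = AttributeKey.longKey("gen_ai.usage.input_tokens");
    private static final AttributeKey<Long> OUTPUT = AttributeKey.longKey("gen_ai.usage.output_tokens");
    private static final AttributeKey<Double> COST = AttributeKey.doubleKey("agenttel.genai.cost_usd");

    private final SpanExporter delegate;

    public CostInjectingExporter(SpanExporter delegate) {
        this.delegate = delegate;
    }

    @Override
    public CompletableResultCode export(Collection<SpanData> spans) {
        return delegate.export(spans.stream().map(this::withCost).collect(Collectors.toList()));
    }

    private SpanData withCost(SpanData span) {
        String model = span.getAttributes().get(MODEL);
        Long input = span.getAttributes().get(INPUT);
        Long output = span.getAttributes().get(OUTPUT);
        if (model == null || input == null || output == null) {
            return span; // not a GenAI span with token usage; pass through unchanged
        }
        // Calculator signature as shown under Programmatic Usage below.
        double cost = ModelCostCalculator.calculateCost(model, input.intValue(), output.intValue());
        Attributes enriched = span.getAttributes().toBuilder().put(COST, cost).build();
        return new DelegatingSpanData(span) {
            @Override
            public Attributes getAttributes() {
                return enriched;
            }
        };
    }

    @Override
    public CompletableResultCode flush() {
        return delegate.flush();
    }

    @Override
    public CompletableResultCode shutdown() {
        return delegate.shutdown();
    }
}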
Provider SDK Instrumentation¶
Anthropic Java SDK¶
import io.agenttel.genai.anthropic.TracingAnthropicClient;
AnthropicClient client = AnthropicOkHttpClient.builder()
        .apiKey("...")
        .build();

// Wrap with tracing
AnthropicClient traced = new TracingAnthropicClient(client, openTelemetry);

// Spans created for every messages.create() call
MessageCreateParams params = MessageCreateParams.builder()
        .model("claude-sonnet-4-20250514")
        .maxTokens(1024)
        .messages(List.of(...))
        .build();

Message response = traced.messages().create(params);
Span attributes:
Span: "chat claude-sonnet-4-20250514"
gen_ai.system = "anthropic"
gen_ai.request.model = "claude-sonnet-4-20250514"
gen_ai.usage.input_tokens = 200
gen_ai.usage.output_tokens = 150
agenttel.genai.cost_usd = 0.00165
OpenAI Java SDK¶
import io.agenttel.genai.openai.TracingOpenAIClient;
OpenAIClient client = OpenAIOkHttpClient.builder()
        .apiKey("...")
        .build();

OpenAIClient traced = new TracingOpenAIClient(client, openTelemetry);

ChatCompletion completion = traced.chat().completions().create(params);
AWS Bedrock SDK¶
import io.agenttel.genai.bedrock.TracingBedrockRuntimeClient;
BedrockRuntimeClient client = BedrockRuntimeClient.builder()
        .region(Region.US_EAST_1)
        .build();

BedrockRuntimeClient traced = new TracingBedrockRuntimeClient(client, openTelemetry);

ConverseResponse response = traced.converse(ConverseRequest.builder()
        .modelId("anthropic.claude-3-sonnet-20240229-v1:0")
        .messages(...)
        .build());
Cost Calculation¶
ModelCostCalculator computes estimated costs based on model and token counts.
Supported Models¶
| Provider | Models | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|---|
| Anthropic | Claude Opus 4 | $15.00 | $75.00 |
| Anthropic | Claude Sonnet 4 | $3.00 | $15.00 |
| Anthropic | Claude Haiku 3.5 | $0.80 | $4.00 |
| OpenAI | GPT-4o | $5.00 | $15.00 |
| OpenAI | GPT-4o mini | $0.15 | $0.60 |
| OpenAI | GPT-4 Turbo | $10.00 | $30.00 |
| OpenAI | text-embedding-3-small | $0.02 | — |
| AWS Bedrock | Claude models (via Bedrock) | Same as Anthropic | Same as Anthropic |
Programmatic Usage¶
double cost = ModelCostCalculator.calculateCost("gpt-4o", 1000, 500);
// cost = 0.0125 (USD)
// Returns 0.0 for unknown models (graceful fallback)
double unknown = ModelCostCalculator.calculateCost("custom-model", 1000, 500);
// unknown = 0.0
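The underlying arithmetic is a per-1M-token rate applied to each direction. A minimal sketch using the pricing values from the table above (the helper name and signature here are illustrative, not the library's API):

// Hypothetical helper showing the cost formula the calculator applies.
static double estimateCostUsd(long inputTokens, long outputTokens,
                              double inputUsdPerMillion, double outputUsdPerMillion) {
    return (inputTokens / 1_000_000.0) * inputUsdPerMillion
            + (outputTokens / 1_000_000.0) * outputUsdPerMillion;
}

// GPT-4o at $5.00 / $15.00 per 1M tokens:
// estimateCostUsd(1000, 500, 5.00, 15.00) = 0.005 + 0.0075 = 0.0125 USD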
Auto-Configuration¶
When using Spring Boot, GenAI instrumentation is auto-configured based on classpath detection:
| Condition | Configuration Class | What It Does |
|---|---|---|
| `ChatLanguageModel` on classpath | `LangChain4jGenAiAutoConfiguration` | Wraps LangChain4j model beans with tracing decorators |
| `ChatModel` on classpath | `SpringAiGenAiAutoConfiguration` | Registers `SpringAiSpanEnricher` as a SpanProcessor |
| `AnthropicClient` on classpath | `AnthropicGenAiAutoConfiguration` | Wraps Anthropic client beans |
| `OpenAIClient` on classpath | `OpenAiGenAiAutoConfiguration` | Wraps OpenAI client beans |
| `BedrockRuntimeClient` on classpath | `BedrockGenAiAutoConfiguration` | Wraps Bedrock client beans |
Auto-configuration classes are registered via META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports.
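Each of these follows the standard Spring Boot conditional pattern. A hypothetical sketch of the LangChain4j case is shown below; the class name, bean wiring, and placeholder model/provider values are illustrative assumptions, not the module's actual source.

// Hypothetical auto-configuration that activates only when LangChain4j is on
// the classpath and wraps ChatLanguageModel beans with the tracing decorator.
import dev.langchain4j.model.chat.ChatLanguageModel;
import io.agenttel.genai.langchain4j.LangChain4jInstrumentation;
import io.opentelemetry.api.OpenTelemetry;
import org.springframework.beans.factory.ObjectProvider;
import org.springframework.beans.factory.config.BeanPostProcessor;
import org.springframework.boot.autoconfigure.AutoConfiguration;
import org.springframework.boot.autoconfigure.condition.ConditionalOnClass;
import org.springframework.context.annotation.Bean;

@AutoConfiguration
@ConditionalOnClass(ChatLanguageModel.class)
public class ExampleLangChain4jTracingAutoConfiguration {

    @Bean
    static BeanPostProcessor chatModelTracingPostProcessor(ObjectProvider<OpenTelemetry> otel) {
        return new BeanPostProcessor() {
            @Override
            public Object postProcessAfterInitialization(Object bean, String beanName) {
                OpenTelemetry openTelemetry = otel.getIfAvailable();
                if (openTelemetry != null && bean instanceof ChatLanguageModel model) {
                    // Model name and provider would be resolved from configuration in practice.
                    return LangChain4jInstrumentation.instrument(model, openTelemetry, "unknown", "unknown");
                }
                return bean;
            }
        };
    }
}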
Span Naming Convention¶
All GenAI spans follow the pattern: "{operation} {model}".
| Operation | Span Name Example |
|---|---|
| Chat completion | "chat gpt-4o" |
| Text completion | "text_completion gpt-3.5-turbo" |
| Embedding | "embeddings text-embedding-3-small" |
| RAG retrieval | "retrieve" |
This follows the emerging OTel GenAI semantic conventions for span naming.