Anthropic
Maven Dependency
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-anthropic</artifactId>
    <version>1.8.0</version>
</dependency>
AnthropicChatModel
AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .build();
String answer = model.chat("Say 'Hello World'");
System.out.println(answer);
Customizing AnthropicChatModel
AnthropicChatModel model = AnthropicChatModel.builder()
        .httpClientBuilder(...)
        .baseUrl(...)
        .apiKey(...)
        .version(...)
        .beta(...)
        .modelName(...)
        .temperature(...)
        .topP(...)
        .topK(...)
        .maxTokens(...)
        .stopSequences(...)
        .toolSpecifications(...)
        .toolChoice(...)
        .toolChoiceName(...)
        .disableParallelToolUse(...)
        .cacheSystemMessages(...)
        .cacheTools(...)
        .thinkingType(...)
        .thinkingBudgetTokens(...)
        .returnThinking(...)
        .sendThinking(...)
        .timeout(...)
        .maxRetries(...)
        .logRequests(...)
        .logResponses(...)
        .listeners(...)
        .defaultRequestParameters(...)
        .userId(...)
        .customParameters(...)
        .build();
See the description of some of the parameters above here.
AnthropicStreamingChatModel
AnthropicStreamingChatModel model = AnthropicStreamingChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .build();

model.chat("Say 'Hello World'", new StreamingChatResponseHandler() {

    @Override
    public void onPartialResponse(String partialResponse) {
        // this method is called when a new partial response is available. It can consist of one or more tokens.
    }

    @Override
    public void onCompleteResponse(ChatResponse completeResponse) {
        // this method is called when the model has completed responding
    }

    @Override
    public void onError(Throwable error) {
        // this method is called when an error occurs
    }
});
Customizing AnthropicStreamingChatModel
Identical to the AnthropicChatModel, see above.
Tools
Anthropic supports tools in both streaming and non-streaming mode.
Anthropic documentation on tools can be found here.
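For example, tools are typically wired in through the high-level AiServices API. Here is a minimal sketch, assuming a hypothetical Calculator tool and Assistant interface (neither is part of the library):

class Calculator {

    @Tool("Adds two numbers")
    double add(double a, double b) {
        return a + b;
    }
}

interface Assistant {

    String chat(String userMessage);
}

AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .build();

Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(model)
        .tools(new Calculator())
        .build();

String answer = assistant.chat("What is 1.5 + 2.5?"); // the model calls add() and uses the result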
Tool Choice
Anthropic's tool choice
feature is available for both streaming and non-streaming interactions
by setting toolChoice(ToolChoice) or toolChoiceName(String).
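For example, a sketch that forces the model to call one specific tool (the tool name get_weather is hypothetical and must match a tool you actually provide):

AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .toolChoiceName("get_weather") // hypothetical tool name; forces this specific tool
        .build();

Alternatively, toolChoice(ToolChoice.REQUIRED) forces the model to use one of the provided tools without naming which.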
Parallel Tool Use
By default, Anthropic Claude may use multiple tools to answer a user query,
but you can disable parallel tool use by setting disableParallelToolUse(true).
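A minimal sketch:

AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .disableParallelToolUse(true) // the model will invoke at most one tool per response
        .build();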
Caching
AnthropicChatModel and AnthropicStreamingChatModel support caching of system messages and tools.
Caching is disabled by default.
It can be enabled by setting the cacheSystemMessages and cacheTools parameters, respectively.
When enabled, cache_control blocks will be added to the last system message and tool, respectively.
To use caching, please set beta("prompt-caching-2024-07-31").
AnthropicChatModel and AnthropicStreamingChatModel return AnthropicTokenUsage in the response, which
contains cacheCreationInputTokens and cacheReadInputTokens.
More info on caching can be found here.
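Putting this together, here is a sketch that enables caching of system messages and reads the cache token counts back, assuming AnthropicTokenUsage exposes cacheCreationInputTokens() and cacheReadInputTokens() accessors as described above (the system prompt is illustrative):

AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .beta("prompt-caching-2024-07-31")
        .cacheSystemMessages(true)
        .build();

ChatResponse response = model.chat(
        SystemMessage.from("a long, frequently reused system prompt ..."),
        UserMessage.from("Hi"));

AnthropicTokenUsage tokenUsage = (AnthropicTokenUsage) response.tokenUsage();
System.out.println(tokenUsage.cacheCreationInputTokens()); // tokens written to the cache
System.out.println(tokenUsage.cacheReadInputTokens());     // tokens served from the cache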
Thinking
Both AnthropicChatModel and AnthropicStreamingChatModel support the thinking feature.
It is controlled by the following parameters:
- thinkingType and thinkingBudgetTokens: enable thinking, see more details here.
- returnThinking: controls whether to return thinking (if available) inside AiMessage.thinking() and whether to invoke the StreamingChatResponseHandler.onPartialThinking() and TokenStream.onPartialThinking() callbacks when using AnthropicStreamingChatModel. Disabled by default. If enabled, thinking signatures will also be stored and returned inside AiMessage.attributes().
- sendThinking: controls whether to send the thinking and signatures stored in AiMessage to the LLM in follow-up requests. Enabled by default.
Here is an example of how to configure thinking:
ChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_7_SONNET_20250219)
        .thinkingType("enabled")
        .thinkingBudgetTokens(1024)
        .maxTokens(1024 + 100)
        .returnThinking(true)
        .sendThinking(true)
        .build();
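When streaming, partial thinking can be observed as it arrives. A sketch, assuming the onPartialThinking() callback described above:

AnthropicStreamingChatModel streamingModel = AnthropicStreamingChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_7_SONNET_20250219)
        .thinkingType("enabled")
        .thinkingBudgetTokens(1024)
        .maxTokens(1024 + 100)
        .returnThinking(true)
        .build();

streamingModel.chat("What is 17 * 24?", new StreamingChatResponseHandler() {

    @Override
    public void onPartialThinking(PartialThinking partialThinking) {
        // called for each partial chunk of the thinking text
    }

    @Override
    public void onPartialResponse(String partialResponse) {
        // called for each partial chunk of the final answer
    }

    @Override
    public void onCompleteResponse(ChatResponse completeResponse) {
        // completeResponse.aiMessage().thinking() holds the full thinking text
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});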
Setting custom chat request parameters
When building AnthropicChatModel and AnthropicStreamingChatModel,
you can configure custom parameters for the chat request within the HTTP request's JSON body.
Here is an example of how to enable context editing:
record Edit(String type) {}

record ContextManagement(List<Edit> edits) {}

Map<String, Object> customParameters = Map.of(
        "context_management",
        new ContextManagement(List.of(new Edit("clear_tool_uses_20250919"))));

ChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_SONNET_4_5_20250929)
        .beta("context-management-2025-06-27")
        .customParameters(customParameters)
        .logRequests(true)
        .logResponses(true)
        .build();
String answer = model.chat("Hi");
This will produce an HTTP request with the following body:
{
  "model" : "claude-sonnet-4-5-20250929",
  "messages" : [ {
    "role" : "user",
    "content" : [ {
      "type" : "text",
      "text" : "Hi"
    } ]
  } ],
  "context_management" : {
    "edits" : [ {
      "type" : "clear_tool_uses_20250919"
    } ]
  }
}
Alternatively, custom parameters can also be specified as a structure of nested maps:
Map<String, Object> customParameters = Map.of(
        "context_management",
        Map.of("edits", List.of(Map.of("type", "clear_tool_uses_20250919")))
);
AnthropicTokenCountEstimator
TokenCountEstimator tokenCountEstimator = AnthropicTokenCountEstimator.builder()
        .modelName(CLAUDE_3_OPUS_20240229)
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .logRequests(true)
        .logResponses(true)
        .build();

List<ChatMessage> messages = List.of(...);

int tokenCount = tokenCountEstimator.estimateTokenCountInMessages(messages);
Quarkus
See more details here.
Spring Boot
Import Spring Boot starter for Anthropic:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-anthropic-spring-boot-starter</artifactId>
    <version>1.8.0-beta15</version>
</dependency>
Configure AnthropicChatModel bean:
langchain4j.anthropic.chat-model.api-key = ${ANTHROPIC_API_KEY}
Configure AnthropicStreamingChatModel bean:
langchain4j.anthropic.streaming-chat-model.api-key = ${ANTHROPIC_API_KEY}
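The auto-configured bean can then be injected like any other Spring bean. A minimal sketch (the AssistantController class is illustrative):

@RestController
class AssistantController {

    private final ChatModel chatModel;

    AssistantController(ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    @GetMapping("/chat")
    String chat(@RequestParam String message) {
        return chatModel.chat(message);
    }
}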