Anthropic
Maven Dependency
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-anthropic</artifactId>
    <version>1.7.1</version>
</dependency>
AnthropicChatModel
AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .build();

String answer = model.chat("Say 'Hello World'");

System.out.println(answer);
Customizing AnthropicChatModel
AnthropicChatModel model = AnthropicChatModel.builder()
        .httpClientBuilder(...)
        .baseUrl(...)
        .apiKey(...)
        .version(...)
        .beta(...)
        .modelName(...)
        .temperature(...)
        .topP(...)
        .topK(...)
        .maxTokens(...)
        .stopSequences(...)
        .toolSpecifications(...)
        .toolChoice(...)
        .toolChoiceName(...)
        .disableParallelToolUse(...)
        .cacheSystemMessages(...)
        .cacheTools(...)
        .thinkingType(...)
        .thinkingBudgetTokens(...)
        .returnThinking(...)
        .sendThinking(...)
        .timeout(...)
        .maxRetries(...)
        .logRequests(...)
        .logResponses(...)
        .listeners(...)
        .defaultRequestParameters(...)
        .userId(...)
        .customParameters(...)
        .build();
Descriptions of some of the parameters above can be found here.
AnthropicStreamingChatModel
AnthropicStreamingChatModel model = AnthropicStreamingChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .build();

model.chat("Say 'Hello World'", new StreamingChatResponseHandler() {

    @Override
    public void onPartialResponse(String partialResponse) {
        // this method is called when a new partial response is available. It can consist of one or more tokens.
    }

    @Override
    public void onCompleteResponse(ChatResponse completeResponse) {
        // this method is called when the model has completed responding
    }

    @Override
    public void onError(Throwable error) {
        // this method is called when an error occurs
    }
});
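If you only need some of the callbacks, a more compact sketch is possible with LangChain4j's LambdaStreamingResponseHandler helper (assuming a recent LangChain4j version where this helper is available):

import static dev.langchain4j.model.chat.response.LambdaStreamingResponseHandler.onPartialResponse;

// prints each partial response to stdout as it arrives
model.chat("Say 'Hello World'", onPartialResponse(System.out::print));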
Customizing AnthropicStreamingChatModel
Identical to the AnthropicChatModel, see above.
Tools
Anthropic supports tools in both streaming and non-streaming mode.
Anthropic documentation on tools can be found here.
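As a minimal sketch of wiring a tool through LangChain4j's AiServices (the Calculator tool and Assistant interface below are illustrative, not part of the Anthropic module):

// illustrative tool; any class with @Tool-annotated methods works
class Calculator {

    @Tool("Adds two numbers")
    double add(double a, double b) {
        return a + b;
    }
}

// illustrative AI Service interface
interface Assistant {
    String chat(String userMessage);
}

AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .build();

Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(model)
        .tools(new Calculator())
        .build();

String answer = assistant.chat("What is 2.3 + 4.7?");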
Tool Choice
Anthropic's tool choice feature is available for both streaming and non-streaming interactions by setting toolChoice(ToolChoice) or toolChoiceName(String).
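For example, a sketch forcing the model to call a specific tool (the tool name "add" is illustrative and must match a tool you have specified):

AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .toolChoiceName("add") // forces the model to call the tool named "add"
        .build();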
Parallel Tool Use
By default, Anthropic Claude may use multiple tools to answer a user query, but you can disable parallel tool use by setting disableParallelToolUse(true).
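A minimal sketch:

AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .disableParallelToolUse(true) // the model will use at most one tool per response
        .build();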
Caching
AnthropicChatModel and AnthropicStreamingChatModel support caching of system messages and tools.
Caching is disabled by default.
It can be enabled by setting the cacheSystemMessages and cacheTools parameters, respectively.
When enabled, cache_control blocks will be added to the last system message and tool, respectively.
To use caching, please set beta("prompt-caching-2024-07-31").
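A minimal sketch of enabling both caches (the system message and tools themselves are supplied via the usual chat request):

AnthropicChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_5_SONNET_20240620)
        .beta("prompt-caching-2024-07-31")
        .cacheSystemMessages(true)
        .cacheTools(true)
        .build();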
AnthropicChatModel and AnthropicStreamingChatModel return AnthropicTokenUsage in the response, which contains cacheCreationInputTokens and cacheReadInputTokens.
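For example, a sketch of reading the cache-related token counts (the cast assumes the token usage returned by this provider is an AnthropicTokenUsage, per the above):

ChatResponse response = model.chat(UserMessage.from("Hi"));

// assumption: for this provider the token usage is an AnthropicTokenUsage
AnthropicTokenUsage tokenUsage = (AnthropicTokenUsage) response.tokenUsage();
System.out.println(tokenUsage.cacheCreationInputTokens());
System.out.println(tokenUsage.cacheReadInputTokens());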
More info on caching can be found here.
Thinking
Both AnthropicChatModel and AnthropicStreamingChatModel support the thinking feature.
It is controlled by the following parameters:
- thinkingType and thinkingBudgetTokens: enable thinking, see more details here.
- returnThinking: controls whether to return thinking (if available) inside AiMessage.thinking() and whether to invoke the StreamingChatResponseHandler.onPartialThinking() and TokenStream.onPartialThinking() callbacks when using AnthropicStreamingChatModel. Disabled by default. If enabled, thinking signatures will also be stored and returned inside AiMessage.attributes().
- sendThinking: controls whether to send thinking and signatures stored in AiMessage to the LLM in follow-up requests. Enabled by default.
Here is an example of how to configure thinking:
ChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_3_7_SONNET_20250219)
        .thinkingType("enabled")
        .thinkingBudgetTokens(1024)
        .maxTokens(1024 + 100)
        .returnThinking(true)
        .sendThinking(true)
        .build();
Setting custom chat request parameters
When building AnthropicChatModel and AnthropicStreamingChatModel, you can configure custom parameters for the chat request within the HTTP request's JSON body.
Here is an example of how to enable context editing:
record Edit(String type) {}
record ContextManagement(List<Edit> edits) {}

Map<String, Object> customParameters =
        Map.of("context_management", new ContextManagement(List.of(new Edit("clear_tool_uses_20250919"))));

ChatModel model = AnthropicChatModel.builder()
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .modelName(CLAUDE_SONNET_4_5_20250929)
        .beta("context-management-2025-06-27")
        .customParameters(customParameters)
        .logRequests(true)
        .logResponses(true)
        .build();

String answer = model.chat("Hi");
String answer = model.chat("Hi");
This will produce an HTTP request with the following body:
{
  "model" : "claude-sonnet-4-5-20250929",
  "messages" : [ {
    "role" : "user",
    "content" : [ {
      "type" : "text",
      "text" : "Hi"
    } ]
  } ],
  "context_management" : {
    "edits" : [ {
      "type" : "clear_tool_uses_20250919"
    } ]
  }
}
Alternatively, custom parameters can also be specified as a structure of nested maps:
Map<String, Object> customParameters = Map.of(
        "context_management",
        Map.of("edits", List.of(Map.of("type", "clear_tool_uses_20250919")))
);
AnthropicTokenCountEstimator
TokenCountEstimator tokenCountEstimator = AnthropicTokenCountEstimator.builder()
        .modelName(CLAUDE_3_OPUS_20240229)
        .apiKey(System.getenv("ANTHROPIC_API_KEY"))
        .logRequests(true)
        .logResponses(true)
        .build();

List<ChatMessage> messages = List.of(...);

int tokenCount = tokenCountEstimator.estimateTokenCountInMessages(messages);
Quarkus
See more details here.
Spring Boot
Import Spring Boot starter for Anthropic:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-anthropic-spring-boot-starter</artifactId>
    <version>1.7.1-beta14</version>
</dependency>
Configure AnthropicChatModel bean:
langchain4j.anthropic.chat-model.api-key = ${ANTHROPIC_API_KEY}
Configure AnthropicStreamingChatModel bean:
langchain4j.anthropic.streaming-chat-model.api-key = ${ANTHROPIC_API_KEY}
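Once configured, the auto-configured model can be injected like any other Spring bean. A sketch (the AssistantService class is illustrative, and assumes a single ChatModel bean is configured):

@Service
class AssistantService {

    private final ChatModel chatModel;

    AssistantService(ChatModel chatModel) { // injected by Spring
        this.chatModel = chatModel;
    }

    String answer(String question) {
        return chatModel.chat(question);
    }
}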