
OpenAI

note

This is the documentation for the OpenAI integration, which uses a custom Java implementation of the OpenAI REST API. It works best with Quarkus (as it uses the Quarkus REST client) and Spring (as it uses Spring's RestClient).

If you are using Quarkus, please refer to the Quarkus LangChain4j documentation.

LangChain4j provides 4 different integrations with OpenAI for using chat models, and this is #1:

  • OpenAI uses a custom Java implementation of the OpenAI REST API, which works best with Quarkus (as it uses the Quarkus REST client) and Spring (as it uses Spring's RestClient).
  • OpenAI Official SDK uses the official OpenAI Java SDK.
  • Azure OpenAI uses the Azure SDK from Microsoft, and works best if you are using the Microsoft Java stack, including advanced Azure authentication mechanisms.
  • GitHub Models uses the Azure AI Inference API to access GitHub Models.

OpenAI Documentation

Maven Dependency

Plain Java

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>1.2.0</version>
</dependency>

Spring Boot

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
    <version>1.2.0-beta8</version>
</dependency>

API Key

To use OpenAI models, you will need an API key. You can create one here.

What if I don't have an API key?

If you don't have your own OpenAI API key, don't worry. You can temporarily use the demo key, which we provide for free for demonstration purposes. Be aware that when using the demo key, all requests to the OpenAI API need to go through our proxy, which injects the real key before forwarding your request to the OpenAI API. We do not collect or use your data in any way. The demo key has a quota, is restricted to the gpt-4o-mini model, and should only be used for demonstration purposes.

OpenAiChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://langchain4j.dev/demo/openai/v1")
        .apiKey("demo")
        .modelName("gpt-4o-mini")
        .build();

Creating OpenAiChatModel

Plain Java

ChatModel model = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("gpt-4o-mini")
        .build();

// You can also specify default chat request parameters using ChatRequestParameters or OpenAiChatRequestParameters
ChatModel model = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .defaultRequestParameters(OpenAiChatRequestParameters.builder()
                .modelName("gpt-4o-mini")
                .build())
        .build();

This will create an instance of OpenAiChatModel with the specified default parameters.

Spring Boot

Add to the application.properties:

# Mandatory properties:
langchain4j.open-ai.chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.chat-model.model-name=gpt-4o-mini

# Optional properties:
langchain4j.open-ai.chat-model.base-url=...
langchain4j.open-ai.chat-model.custom-headers=...
langchain4j.open-ai.chat-model.frequency-penalty=...
langchain4j.open-ai.chat-model.log-requests=...
langchain4j.open-ai.chat-model.log-responses=...
langchain4j.open-ai.chat-model.logit-bias=...
langchain4j.open-ai.chat-model.max-retries=...
langchain4j.open-ai.chat-model.max-completion-tokens=...
langchain4j.open-ai.chat-model.max-tokens=...
langchain4j.open-ai.chat-model.metadata=...
langchain4j.open-ai.chat-model.organization-id=...
langchain4j.open-ai.chat-model.parallel-tool-calls=...
langchain4j.open-ai.chat-model.presence-penalty=...
langchain4j.open-ai.chat-model.project-id=...
langchain4j.open-ai.chat-model.reasoning-effort=...
langchain4j.open-ai.chat-model.response-format=...
langchain4j.open-ai.chat-model.return-thinking=...
langchain4j.open-ai.chat-model.seed=...
langchain4j.open-ai.chat-model.service-tier=...
langchain4j.open-ai.chat-model.stop=...
langchain4j.open-ai.chat-model.store=...
langchain4j.open-ai.chat-model.strict-schema=...
langchain4j.open-ai.chat-model.strict-tools=...
langchain4j.open-ai.chat-model.supported-capabilities=...
langchain4j.open-ai.chat-model.temperature=...
langchain4j.open-ai.chat-model.timeout=...
langchain4j.open-ai.chat-model.top-p=...
langchain4j.open-ai.chat-model.user=...

See the description of most of the parameters above here.

This configuration will create an OpenAiChatModel bean, which can either be used by an AI Service or autowired where needed, for example:

@RestController
class ChatModelController {

    ChatModel chatModel;

    ChatModelController(ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    @GetMapping("/model")
    public String model(@RequestParam(value = "message", defaultValue = "Hello") String message) {
        return chatModel.chat(message);
    }
}

Structured Outputs

The Structured Outputs feature is supported for both tools and response format.

See more info on Structured Outputs here.

Structured Outputs for Tools

To enable the Structured Outputs feature for tools, set .strictTools(true) when building the model:

OpenAiChatModel.builder()
        ...
        .strictTools(true)
        .build();

Please note that this will automatically make all tool parameters mandatory (required in the JSON schema) and set additionalProperties=false for each object in the JSON schema. This is due to current OpenAI limitations.
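
For illustration, here is a minimal sketch of what this means in practice (the WeatherTools class, its method, and its parameters are made up for this example):

import dev.langchain4j.agent.tool.P;
import dev.langchain4j.agent.tool.Tool;

class WeatherTools {

    // With .strictTools(true), both "city" and "unit" are marked as required in the
    // generated JSON schema, even if you would prefer "unit" to be optional,
    // and the parameters object gets "additionalProperties": false.
    @Tool("Returns the current weather for a city")
    String getWeather(@P("The city name") String city,
                      @P("The temperature unit, e.g. CELSIUS") String unit) {
        return "Sunny, 21 degrees " + unit + " in " + city; // dummy implementation
    }
}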

Structured Outputs for Response Format

To enable the Structured Outputs feature for response formatting when using AI Services, set .supportedCapabilities(RESPONSE_FORMAT_JSON_SCHEMA) and .strictJsonSchema(true) when building the model:

OpenAiChatModel.builder()
        ...
        .supportedCapabilities(RESPONSE_FORMAT_JSON_SCHEMA)
        .strictJsonSchema(true)
        .build();

In this case, the AI Service will automatically generate a JSON schema from the given POJO and pass it to the LLM.
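
For example, a minimal sketch of an AI Service returning a POJO, reusing the model built above (the Person record and PersonExtractor interface are illustrative names, not part of the library):

import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.UserMessage;

// The AI Service generates a JSON schema from this record and instructs the LLM
// to respond in exactly that structure.
record Person(String name, int age) {}

interface PersonExtractor {

    @UserMessage("Extract information about a person from: {{it}}")
    Person extractFrom(String text);
}

PersonExtractor extractor = AiServices.create(PersonExtractor.class, model);
Person person = extractor.extractFrom("John is 42 years old and lives in Berlin.");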

Thinking / Reasoning

This setting is intended for DeepSeek.

When the returnThinking parameter is enabled while building OpenAiChatModel or OpenAiStreamingChatModel, the reasoning_content field of the DeepSeek API response will be parsed and returned inside AiMessage.thinking().

When the returnThinking parameter is enabled for OpenAiStreamingChatModel, the StreamingChatResponseHandler.onPartialThinking() and TokenStream.onPartialThinking() callbacks will be invoked when the DeepSeek API streams reasoning_content.

Here is an example of how to configure thinking:

ChatModel model = OpenAiChatModel.builder()
        .baseUrl("https://api.deepseek.com/v1")
        .apiKey(System.getenv("DEEPSEEK_API_KEY"))
        .modelName("deepseek-reasoner")
        .returnThinking(true)
        .build();
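
With this configuration, the thinking content can then be read from the response; a minimal sketch (the prompt is illustrative):

ChatResponse response = model.chat(UserMessage.from("How many r's are in 'strawberry'?"));

System.out.println(response.aiMessage().thinking()); // the parsed reasoning_content
System.out.println(response.aiMessage().text());     // the final answer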

Creating OpenAiStreamingChatModel

Plain Java

StreamingChatModel model = OpenAiStreamingChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("gpt-4o-mini")
        .build();

// You can also specify default chat request parameters using ChatRequestParameters or OpenAiChatRequestParameters
StreamingChatModel model = OpenAiStreamingChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .defaultRequestParameters(OpenAiChatRequestParameters.builder()
                .modelName("gpt-4o-mini")
                .build())
        .build();
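
To consume the stream, pass a StreamingChatResponseHandler when calling the model; a minimal sketch (the prompt is illustrative):

model.chat("Tell me a joke", new StreamingChatResponseHandler() {

    @Override
    public void onPartialResponse(String partialResponse) {
        System.out.print(partialResponse); // called for each streamed chunk of text
    }

    @Override
    public void onCompleteResponse(ChatResponse completeResponse) {
        System.out.println("\nFinished: " + completeResponse.finishReason());
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});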

Spring Boot

Add to the application.properties:

# Mandatory properties:
langchain4j.open-ai.streaming-chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.streaming-chat-model.model-name=gpt-4o-mini

# Optional properties:
langchain4j.open-ai.streaming-chat-model.base-url=...
langchain4j.open-ai.streaming-chat-model.custom-headers=...
langchain4j.open-ai.streaming-chat-model.frequency-penalty=...
langchain4j.open-ai.streaming-chat-model.log-requests=...
langchain4j.open-ai.streaming-chat-model.log-responses=...
langchain4j.open-ai.streaming-chat-model.logit-bias=...
langchain4j.open-ai.streaming-chat-model.max-retries=...
langchain4j.open-ai.streaming-chat-model.max-completion-tokens=...
langchain4j.open-ai.streaming-chat-model.max-tokens=...
langchain4j.open-ai.streaming-chat-model.metadata=...
langchain4j.open-ai.streaming-chat-model.organization-id=...
langchain4j.open-ai.streaming-chat-model.parallel-tool-calls=...
langchain4j.open-ai.streaming-chat-model.presence-penalty=...
langchain4j.open-ai.streaming-chat-model.project-id=...
langchain4j.open-ai.streaming-chat-model.reasoning-effort=...
langchain4j.open-ai.streaming-chat-model.response-format=...
langchain4j.open-ai.streaming-chat-model.return-thinking=...
langchain4j.open-ai.streaming-chat-model.seed=...
langchain4j.open-ai.streaming-chat-model.service-tier=...
langchain4j.open-ai.streaming-chat-model.stop=...
langchain4j.open-ai.streaming-chat-model.store=...
langchain4j.open-ai.streaming-chat-model.strict-schema=...
langchain4j.open-ai.streaming-chat-model.strict-tools=...
langchain4j.open-ai.streaming-chat-model.temperature=...
langchain4j.open-ai.streaming-chat-model.timeout=...
langchain4j.open-ai.streaming-chat-model.top-p=...
langchain4j.open-ai.streaming-chat-model.user=...

Creating OpenAiModerationModel

Plain Java

ModerationModel model = OpenAiModerationModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("text-moderation-stable")
        .build();
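
The model can then be used to check whether a text is flagged by OpenAI's moderation endpoint; a minimal sketch (the input text is illustrative):

Moderation moderation = model.moderate("I want to hug everyone!").content();
System.out.println(moderation.flagged());     // false for harmless input
System.out.println(moderation.flaggedText()); // the offending text, if any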

Spring Boot

Add to the application.properties:

# Mandatory properties:
langchain4j.open-ai.moderation-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.moderation-model.model-name=text-moderation-stable

# Optional properties:
langchain4j.open-ai.moderation-model.base-url=...
langchain4j.open-ai.moderation-model.custom-headers=...
langchain4j.open-ai.moderation-model.log-requests=...
langchain4j.open-ai.moderation-model.log-responses=...
langchain4j.open-ai.moderation-model.max-retries=...
langchain4j.open-ai.moderation-model.organization-id=...
langchain4j.open-ai.moderation-model.project-id=...
langchain4j.open-ai.moderation-model.timeout=...

Creating OpenAiTokenCountEstimator

TokenCountEstimator tokenCountEstimator = new OpenAiTokenCountEstimator("gpt-4o-mini");
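
The estimator can then be used to approximate token counts before sending a request; a minimal sketch (the messages are illustrative):

int tokensInText = tokenCountEstimator.estimateTokenCountInText("Hello, how are you?");

int tokensInMessages = tokenCountEstimator.estimateTokenCountInMessages(List.of(
        SystemMessage.from("You are a helpful assistant"),
        UserMessage.from("Hello, how are you?")
));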

Setting custom chat request parameters

When using OpenAiChatModel and OpenAiStreamingChatModel, you can specify custom parameters that will be included in the JSON body of the HTTP request. Here is an example of how to enable web search:

record ApproximateLocation(String city) {}
record UserLocation(String type, ApproximateLocation approximate) {}
record WebSearchOptions(UserLocation user_location) {}
WebSearchOptions webSearchOptions = new WebSearchOptions(new UserLocation("approximate", new ApproximateLocation("London")));
Map<String, Object> customParameters = Map.of("web_search_options", webSearchOptions);

ChatRequest chatRequest = ChatRequest.builder()
        .messages(UserMessage.from("Where can I buy good coffee?"))
        .parameters(OpenAiChatRequestParameters.builder()
                .modelName("gpt-4o-mini-search-preview")
                .customParameters(customParameters)
                .build())
        .build();

ChatModel model = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .logRequests(true)
        .build();

ChatResponse chatResponse = model.chat(chatRequest);

This will produce an HTTP request with the following body:

{
    "model" : "gpt-4o-mini-search-preview",
    "messages" : [ {
        "role" : "user",
        "content" : "Where can I buy good coffee?"
    } ],
    "web_search_options" : {
        "user_location" : {
            "type" : "approximate",
            "approximate" : {
                "city" : "London"
            }
        }
    }
}

Alternatively, custom parameters can also be specified as a structure of nested maps:

Map<String, Object> customParameters = Map.of(
        "web_search_options", Map.of(
                "user_location", Map.of(
                        "type", "approximate",
                        "approximate", Map.of("city", "London")
                )
        )
);

Accessing raw HTTP responses and Server-Sent Events (SSE)

When using OpenAiChatModel, you can access the raw HTTP response:

SuccessfulHttpResponse rawHttpResponse = ((OpenAiChatResponseMetadata) chatResponse.metadata()).rawHttpResponse();
System.out.println(rawHttpResponse.body());
System.out.println(rawHttpResponse.headers());
System.out.println(rawHttpResponse.statusCode());

When using OpenAiStreamingChatModel, you can access the raw Server-Sent Events, for example from the ChatResponse received in StreamingChatResponseHandler.onCompleteResponse():

List<ServerSentEvent> rawServerSentEvents = ((OpenAiChatResponseMetadata) chatResponse.metadata()).rawServerSentEvents();
System.out.println(rawServerSentEvents.get(0).data());
System.out.println(rawServerSentEvents.get(0).event());

HTTP Client

Plain Java

When using the langchain4j-open-ai module, the JDK's java.net.http.HttpClient is used as the default HTTP client.

You can customize it or use any other HTTP client of your choice. More information can be found here.

Spring Boot

When using the langchain4j-open-ai-spring-boot-starter Spring Boot starter, Spring's RestClient is used as the default HTTP client.

You can customize it or use any other HTTP client of your choice. More information can be found here.

Examples