Google Vertex AI

Get started

To get started, follow the steps in the Get started section of the Vertex AI Gemini integration tutorial to create a Google Cloud Platform account and set up a new project with access to the Vertex AI API.

Add dependencies

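Add the following dependency to your project's pom.xml:

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-vertex-ai</artifactId>
    <version>1.0.0-beta1</version>
</dependency>

or to your project's build.gradle: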

implementation 'dev.langchain4j:langchain4j-vertex-ai:1.0.0-beta1'

Try out the example code:

An Example of using Vertex AI Embedding Model

The PROJECT_ID constant holds the project ID you chose when creating your Google Cloud project.

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.output.Response;
import dev.langchain4j.model.vertexai.VertexAiEmbeddingModel;

public class VertexAiEmbeddingModelExample {

    private static final String PROJECT_ID = "YOUR-PROJECT-ID";
    private static final String MODEL_NAME = "textembedding-gecko@latest";

    public static void main(String[] args) {

        // The region and endpoint below are example values; use the ones for your project.
        EmbeddingModel embeddingModel = VertexAiEmbeddingModel.builder()
                .project(PROJECT_ID)
                .location("us-central1")
                .endpoint("us-central1-aiplatform.googleapis.com:443")
                .publisher("google")
                .modelName(MODEL_NAME)
                .build();

        // Embed a single piece of text
        Response<Embedding> response = embeddingModel.embed("Hello, how are you?");

        Embedding embedding = response.content();

        int dimension = embedding.dimension(); // 768
        float[] vector = embedding.vector(); // [-0.06050122, -0.046411075, ...
    }
}


Available Embedding models

English models | Multilingual models | Multimodal embeddings

List of supported languages for multilingual models

Model names suffixed with @latest reference the most recent version of the model.

By default, most embedding models output 768-dimensional vectors (the exception being "Matryoshka" models, which accept a configurable lower dimension).

The API accepts at most 2,048 input tokens per segment, and you can send up to 250 text segments per request. When you ask for more than 250 segments to be embedded at once, the VertexAiEmbeddingModel class transparently splits the request into batches.

The embedding API is also limited to a total of 20,000 tokens per call (across all segments). When a request would exceed that limit, VertexAiEmbeddingModel batches it in the same way.
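To make the two limits concrete, here is a minimal sketch of how segments could be grouped so that no batch exceeds 250 segments or 20,000 tokens. It is an illustration only, not the code VertexAiEmbeddingModel actually runs: the BatchingSketch class and its batch method are hypothetical, and the per-segment token counts are assumed to be supplied by the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of LangChain4j: illustrates the batching limits only.
final class BatchingSketch {

    /**
     * Greedily groups segment indices into batches so that each batch holds
     * at most maxSegmentsPerBatch segments and at most maxTokensPerBatch tokens.
     * tokenCounts[i] is the (assumed, caller-supplied) token count of segment i.
     * A single segment larger than the token limit still gets a batch of its own.
     */
    static List<List<Integer>> batch(int[] tokenCounts, int maxSegmentsPerBatch, int maxTokensPerBatch) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentTokens = 0;

        for (int i = 0; i < tokenCounts.length; i++) {
            boolean wouldOverflow = current.size() == maxSegmentsPerBatch
                    || currentTokens + tokenCounts[i] > maxTokensPerBatch;
            // Close the current batch before adding a segment that would break a limit
            if (wouldOverflow && !current.isEmpty()) {
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(i);
            currentTokens += tokenCounts[i];
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // 600 segments of 100 tokens each: the 20,000-token limit is hit first
        int[] tokenCounts = new int[600];
        Arrays.fill(tokenCounts, 100);

        List<List<Integer>> batches = batch(tokenCounts, 250, 20_000);
        // prints "3 batches, first holds 200 segments"
        System.out.println(batches.size() + " batches, first holds " + batches.get(0).size() + " segments");
    }
}
```

With shorter segments (say 10 tokens each), the 250-segment limit is reached first, so the same 600 segments would be grouped as 250, 250 and 100.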

Configuring the embedding model

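EmbeddingModel embeddingModel = VertexAiEmbeddingModel.builder()
        .project(PROJECT_ID)
        .location("us-central1")       // example region
        .publisher("google")
        .modelName(MODEL_NAME)
        .endpoint("us-central1-aiplatform.googleapis.com:443") // optional; example regional endpoint
        .maxRetries(3)                 // 3 by default
        .maxSegmentsPerBatch(250)      // up to 250 segments per batch
        .maxTokensPerBatch(2048)       // up to 2,048 tokens per segment
        .taskType(VertexAiEmbeddingModel.TaskType.RETRIEVAL_DOCUMENT) // example value; see below for the task types
        .titleMetadataKey("title")     // for the RETRIEVAL_DOCUMENT task, the metadata key that holds a title
                                       // for the text segment, identifying its document origin ("title" is an example key)
        .autoTruncate(false)           // false by default; when true, segments longer than 2,048 input tokens are truncated
        .outputDimensionality(512)     // for models that support different output vector dimensions
        .build();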

Embedding task types

Embedding models serve different use cases. To get better embeddings for yours, specify a task type, such as the RETRIEVAL_DOCUMENT task mentioned above.


See the list of supported models.


Google Codelab on Vertex AI Embedding Model

Available stable Embedding Models

Latest Embedding Models version