Class EmbeddingStoreIngestor
java.lang.Object
dev.langchain4j.store.embedding.EmbeddingStoreIngestor
EmbeddingStoreIngestor represents an ingestion pipeline and is responsible for ingesting Documents into an EmbeddingStore.
In the simplest configuration, EmbeddingStoreIngestor embeds the provided documents using a provided EmbeddingModel and stores them, along with their Embeddings, in an EmbeddingStore.
Optionally, the EmbeddingStoreIngestor can transform documents using a provided DocumentTransformer. This can be useful if you want to clean, enrich, or format documents before embedding them.
Optionally, the EmbeddingStoreIngestor can split documents into TextSegments using a provided DocumentSplitter. This can be useful if documents are large and you want to split them into smaller segments to improve the quality of similarity searches and reduce the size and cost of the prompt sent to the LLM.
Optionally, the EmbeddingStoreIngestor can transform TextSegments using a TextSegmentTransformer. This can be useful if you want to clean, enrich, or format TextSegments before embedding them.
Including a document title or a short summary in each TextSegment is a common technique to improve the quality of similarity searches.
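The order of the stages described above can be sketched in plain Java. This is only an illustration under stated assumptions: none of the types below are langchain4j classes. Strings stand in for Document and TextSegment, a float[] stands in for an Embedding, a List stands in for the EmbeddingStore, and the class name IngestionPipelineSketch is hypothetical.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Stand-in sketch of the ingestion order; not the langchain4j implementation.
public class IngestionPipelineSketch {

    // One stored entry: a segment together with its (stand-in) embedding.
    public static final class Entry {
        public final String segment;
        public final float[] embedding;

        Entry(String segment, float[] embedding) {
            this.segment = segment;
            this.embedding = embedding;
        }
    }

    private final UnaryOperator<String> documentTransformer;       // optional, may be null
    private final Function<String, List<String>> documentSplitter; // optional, may be null
    private final UnaryOperator<String> segmentTransformer;        // optional, may be null
    private final Function<String, float[]> embeddingModel;        // required
    private final List<Entry> store;                               // required

    public IngestionPipelineSketch(UnaryOperator<String> documentTransformer,
                                   Function<String, List<String>> documentSplitter,
                                   UnaryOperator<String> segmentTransformer,
                                   Function<String, float[]> embeddingModel,
                                   List<Entry> store) {
        this.documentTransformer = documentTransformer;
        this.documentSplitter = documentSplitter;
        this.segmentTransformer = segmentTransformer;
        this.embeddingModel = Objects.requireNonNull(embeddingModel, "embeddingModel");
        this.store = Objects.requireNonNull(store, "store");
    }

    // Stages, in order: transform document, split, transform segment, embed, store.
    // Optional stages are skipped when they were not provided.
    // Returns the number of segments stored.
    public int ingest(List<String> documents) {
        int stored = 0;
        for (String document : documents) {
            String doc = documentTransformer == null
                    ? document
                    : documentTransformer.apply(document);
            List<String> segments = documentSplitter == null
                    ? List.of(doc)
                    : documentSplitter.apply(doc);
            for (String segment : segments) {
                String seg = segmentTransformer == null
                        ? segment
                        : segmentTransformer.apply(segment);
                store.add(new Entry(seg, embeddingModel.apply(seg)));
                stored++;
            }
        }
        return stored;
    }
}
```

In this sketch, passing a segment transformer such as s -> "Title: " + s corresponds to the technique of including a document title in each segment before it is embedded.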
Nested Class Summary
Nested Classes
Modifier and Type: static class
Description: EmbeddingStoreIngestor builder.
Constructor Summary
Constructors
Constructor: EmbeddingStoreIngestor(DocumentTransformer documentTransformer, DocumentSplitter documentSplitter, TextSegmentTransformer textSegmentTransformer, EmbeddingModel embeddingModel, EmbeddingStore<TextSegment> embeddingStore)
Description: Creates an instance of an EmbeddingStoreIngestor.
Method Summary
Modifier and Type, Method: Description
builder(): Creates a new EmbeddingStoreIngestor builder.
ingest: Ingests a specified document into an EmbeddingStore that was specified during the creation of this EmbeddingStoreIngestor.
ingest: Ingests specified documents into an EmbeddingStore that was specified during the creation of this EmbeddingStoreIngestor.
static IngestionResult ingest(Document document, EmbeddingStore<TextSegment> embeddingStore): Ingests a specified Document into a specified EmbeddingStore.
ingest: Ingests specified documents into an EmbeddingStore that was specified during the creation of this EmbeddingStoreIngestor.
static IngestionResult ingest(List<Document> documents, EmbeddingStore<TextSegment> embeddingStore): Ingests specified Documents into a specified EmbeddingStore.
Constructor Details
EmbeddingStoreIngestor
public EmbeddingStoreIngestor(DocumentTransformer documentTransformer, DocumentSplitter documentSplitter, TextSegmentTransformer textSegmentTransformer, EmbeddingModel embeddingModel, EmbeddingStore<TextSegment> embeddingStore)
Creates an instance of an EmbeddingStoreIngestor.
Parameters:
documentTransformer - The DocumentTransformer to use. Optional.
documentSplitter - The DocumentSplitter to use. Optional. If none is specified, it tries to load one through SPI (see DocumentSplitterFactory).
textSegmentTransformer - The TextSegmentTransformer to use. Optional.
embeddingModel - The EmbeddingModel to use. Mandatory. If none is specified, it tries to load one through SPI (see EmbeddingModelFactory).
embeddingStore - The EmbeddingStore to use. Mandatory.
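The "use the supplied component, otherwise try SPI" rule for documentSplitter and embeddingModel can be illustrated with the JDK's java.util.ServiceLoader. This is a minimal sketch that assumes the standard ServiceLoader mechanism; SplitterFactory and SpiLookupSketch are hypothetical stand-ins, not the langchain4j DocumentSplitterFactory or its actual lookup code, and a String stands in for the component a real factory would create.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.ServiceLoader;

public class SpiLookupSketch {

    // Hypothetical factory interface, standing in for a factory such as
    // DocumentSplitterFactory. A real factory would return a component object.
    public interface SplitterFactory {
        String create();
    }

    // Returns the explicitly supplied value when present. Otherwise asks the
    // first factory that ServiceLoader discovers (registered through a
    // provider-configuration file under META-INF/services). Returns an empty
    // Optional when neither is available.
    public static Optional<String> resolve(String explicit) {
        if (explicit != null) {
            return Optional.of(explicit);
        }
        Iterator<SplitterFactory> factories =
                ServiceLoader.load(SplitterFactory.class).iterator();
        if (factories.hasNext()) {
            return Optional.of(factories.next().create());
        }
        return Optional.empty();
    }
}
```

With no provider registered on the classpath, resolve(null) yields an empty Optional. In this sketch the caller decides what that means: an optional component can simply be skipped, while a mandatory one would be reported as missing.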
Method Details
ingest
public static IngestionResult ingest(Document document, EmbeddingStore<TextSegment> embeddingStore)
Ingests a specified Document into a specified EmbeddingStore.
Uses the DocumentSplitter and EmbeddingModel found through SPIs (see DocumentSplitterFactory and EmbeddingModelFactory).
For "Easy RAG", import the langchain4j-easy-rag module, which contains DocumentSplitterFactory and EmbeddingModelFactory implementations.
Returns:
result including information related to the ingestion process.
ingest
public static IngestionResult ingest(List<Document> documents, EmbeddingStore<TextSegment> embeddingStore)
Ingests specified Documents into a specified EmbeddingStore.
Uses the DocumentSplitter and EmbeddingModel found through SPIs (see DocumentSplitterFactory and EmbeddingModelFactory).
For "Easy RAG", import the langchain4j-easy-rag module, which contains DocumentSplitterFactory and EmbeddingModelFactory implementations.
Returns:
result including information related to the ingestion process.
ingest
Ingests a specified document into an EmbeddingStore that was specified during the creation of this EmbeddingStoreIngestor.
Parameters:
document - the document to ingest.
Returns:
result including information related to the ingestion process.
ingest
Ingests specified documents into an EmbeddingStore that was specified during the creation of this EmbeddingStoreIngestor.
Parameters:
documents - the documents to ingest.
Returns:
result including information related to the ingestion process.
ingest
Ingests specified documents into an EmbeddingStore that was specified during the creation of this EmbeddingStoreIngestor.
Parameters:
documents - the documents to ingest.
Returns:
result including information related to the ingestion process.
builder
Creates a new EmbeddingStoreIngestor builder.
Returns:
the builder.