Uses of Interface
dev.langchain4j.data.document.DocumentSplitter
Uses of DocumentSplitter in dev.langchain4j.data.document.splitter
Modifier and Type | Class | Description

class DocumentByCharacterSplitter
    Splits the provided Document into characters and attempts to fit as many characters as possible into a single TextSegment, adhering to the limit set by maxSegmentSize.
class DocumentByLineSplitter
    Splits the provided Document into lines and attempts to fit as many lines as possible into a single TextSegment, adhering to the limit set by maxSegmentSize.
class DocumentByParagraphSplitter
    Splits the provided Document into paragraphs and attempts to fit as many paragraphs as possible into a single TextSegment, adhering to the limit set by maxSegmentSize.
class DocumentByRegexSplitter
    Splits the provided Document into parts using the provided regex and attempts to fit as many parts as possible into a single TextSegment, adhering to the limit set by maxSegmentSize.
class DocumentBySentenceSplitter
    Splits the provided Document into sentences and attempts to fit as many sentences as possible into a single TextSegment, adhering to the limit set by maxSegmentSize.
class DocumentByWordSplitter
    Splits the provided Document into words and attempts to fit as many words as possible into a single TextSegment, adhering to the limit set by maxSegmentSize.
class HierarchicalDocumentSplitter
    Base class for hierarchical document splitters.

Modifier and Type | Field | Description

protected final DocumentSplitter HierarchicalDocumentSplitter.subSplitter
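
The class descriptions above share one idea: cut the text into parts, then fit as many consecutive parts as possible into each segment without exceeding maxSegmentSize. The following is a minimal sketch of that packing step in plain Java, assuming sizes are measured in characters and that every part already fits on its own. GreedyPackingSketch and pack are hypothetical names used for illustration; this is not the library's implementation, and it ignores overlap and token-based sizing.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in only; not langchain4j code. */
final class GreedyPackingSketch {

    /**
     * Packs parts, in order, into segments of at most maxSegmentSize characters.
     * Parts that share a segment are joined with joinDelimiter.
     * Assumption: each individual part is no longer than maxSegmentSize.
     */
    public static List<String> pack(List<String> parts, String joinDelimiter, int maxSegmentSize) {
        List<String> segments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String part : parts) {
            if (part.length() > maxSegmentSize) {
                throw new IllegalArgumentException("Part exceeds maxSegmentSize: " + part);
            }
            // Size the current segment would have if this part were appended to it.
            int sizeIfAdded = current.length() == 0
                    ? part.length()
                    : current.length() + joinDelimiter.length() + part.length();
            if (sizeIfAdded > maxSegmentSize) {
                // The part does not fit: close the current segment and start a new one.
                segments.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(joinDelimiter);
            }
            current.append(part);
        }
        if (current.length() > 0) {
            segments.add(current.toString());
        }
        return segments;
    }

    public static void main(String[] args) {
        List<String> words = List.of("one two three four five".split(" "));
        // Prints [one two, three four, five]
        System.out.println(pack(words, " ", 10));
    }
}
```

With a limit of 10 characters, "one two" (7) cannot absorb "three" (it would reach 13), so a new segment starts there; "three four" lands exactly on the limit and is kept whole.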
Modifier and Type | Method | Description

protected DocumentSplitter DocumentByCharacterSplitter.defaultSubSplitter()
protected DocumentSplitter DocumentByLineSplitter.defaultSubSplitter()
protected DocumentSplitter DocumentByParagraphSplitter.defaultSubSplitter()
protected DocumentSplitter DocumentByRegexSplitter.defaultSubSplitter()
protected DocumentSplitter DocumentBySentenceSplitter.defaultSubSplitter()
protected DocumentSplitter DocumentByWordSplitter.defaultSubSplitter()
protected abstract DocumentSplitter HierarchicalDocumentSplitter.defaultSubSplitter()
    The default sub-splitter to use when a single segment is too long.
static DocumentSplitter DocumentSplitters.recursive(int maxSegmentSizeInChars, int maxOverlapSizeInChars)
    This is a recommended DocumentSplitter for generic text.
static DocumentSplitter DocumentSplitters.recursive(int maxSegmentSizeInTokens, int maxOverlapSizeInTokens, Tokenizer tokenizer)
    This is a recommended DocumentSplitter for generic text.

Modifier | Constructor | Description

DocumentByCharacterSplitter(int maxSegmentSizeInChars, int maxOverlapSizeInChars, DocumentSplitter subSplitter)
DocumentByCharacterSplitter(int maxSegmentSizeInTokens, int maxOverlapSizeInTokens, Tokenizer tokenizer, DocumentSplitter subSplitter)
DocumentByLineSplitter(int maxSegmentSizeInChars, int maxOverlapSizeInChars, DocumentSplitter subSplitter)
DocumentByLineSplitter(int maxSegmentSizeInTokens, int maxOverlapSizeInTokens, Tokenizer tokenizer, DocumentSplitter subSplitter)
DocumentByParagraphSplitter(int maxSegmentSizeInChars, int maxOverlapSizeInChars, DocumentSplitter subSplitter)
DocumentByParagraphSplitter(int maxSegmentSizeInTokens, int maxOverlapSizeInTokens, Tokenizer tokenizer, DocumentSplitter subSplitter)
DocumentByRegexSplitter(String regex, String joinDelimiter, int maxSegmentSizeInChars, int maxOverlapSizeInChars, DocumentSplitter subSplitter)
DocumentByRegexSplitter(String regex, String joinDelimiter, int maxSegmentSizeInTokens, int maxOverlapSizeInTokens, Tokenizer tokenizer, DocumentSplitter subSplitter)
DocumentBySentenceSplitter(int maxSegmentSizeInChars, int maxOverlapSizeInChars, DocumentSplitter subSplitter)
DocumentBySentenceSplitter(int maxSegmentSizeInTokens, int maxOverlapSizeInTokens, Tokenizer tokenizer, DocumentSplitter subSplitter)
DocumentByWordSplitter(int maxSegmentSizeInChars, int maxOverlapSizeInChars, DocumentSplitter subSplitter)
DocumentByWordSplitter(int maxSegmentSizeInTokens, int maxOverlapSizeInTokens, Tokenizer tokenizer, DocumentSplitter subSplitter)
protected HierarchicalDocumentSplitter(int maxSegmentSizeInTokens, int maxOverlapSizeInTokens, Tokenizer tokenizer, DocumentSplitter subSplitter)
    Creates a new instance of HierarchicalDocumentSplitter.
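
The subSplitter field and the defaultSubSplitter() methods listed above describe a fallback: when a single part is too long for one segment, a finer-grained splitter takes over for that part. The sketch below shows this delegation in plain Java with a hypothetical paragraph, then word, then character chain. TextSplitter, bySeparator, byCharacter, and exampleChain are stand-in names, not library types; the code measures size in characters, ignores overlap, and does not re-pack the sub-splitter's output, so it should be read as a simplified illustration under those assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in only; not langchain4j code. */
final class SubSplitterFallbackSketch {

    /** Hypothetical minimal splitter contract: text in, ordered segments out. */
    interface TextSplitter {
        List<String> split(String text);
    }

    /** Terminal splitter: fixed-size character chunks, so it never needs a fallback. */
    public static TextSplitter byCharacter(int maxSegmentSize) {
        return text -> {
            List<String> segments = new ArrayList<>();
            for (int start = 0; start < text.length(); start += maxSegmentSize) {
                int end = Math.min(text.length(), start + maxSegmentSize);
                segments.add(text.substring(start, end));
            }
            return segments;
        };
    }

    /** Splits on regex, packs parts greedily, and hands oversized parts to subSplitter. */
    public static TextSplitter bySeparator(String regex, String joinDelimiter,
                                           int maxSegmentSize, TextSplitter subSplitter) {
        return text -> {
            List<String> segments = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            for (String part : text.split(regex)) {
                if (part.isEmpty()) {
                    continue;
                }
                if (part.length() > maxSegmentSize) {
                    // A single part is too long: close the open segment, then delegate.
                    flush(current, segments);
                    segments.addAll(subSplitter.split(part));
                    continue;
                }
                int sizeIfAdded = current.length() == 0
                        ? part.length()
                        : current.length() + joinDelimiter.length() + part.length();
                if (sizeIfAdded > maxSegmentSize) {
                    flush(current, segments);
                }
                if (current.length() > 0) {
                    current.append(joinDelimiter);
                }
                current.append(part);
            }
            flush(current, segments);
            return segments;
        };
    }

    private static void flush(StringBuilder current, List<String> segments) {
        if (current.length() > 0) {
            segments.add(current.toString());
            current.setLength(0);
        }
    }

    /** Hypothetical chain: paragraphs first, then words, then characters. */
    public static TextSplitter exampleChain(int maxSegmentSize) {
        TextSplitter chars = byCharacter(maxSegmentSize);
        TextSplitter words = bySeparator(" ", " ", maxSegmentSize, chars);
        return bySeparator("\\n\\n", "\n\n", maxSegmentSize, words);
    }

    public static void main(String[] args) {
        String text = "alpha beta\n\ngamma delta epsilon zeta\n\nsupercalifragilistic";
        // Prints [alpha beta, gamma delta, epsilon zeta, supercalifra, gilistic]
        System.out.println(exampleChain(12).split(text));
    }
}
```

The first paragraph fits and is kept whole, the second is too long and falls back to word-level packing, and the third is a single 20-character word that only the character-level splitter can cut.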
Uses of DocumentSplitter in dev.langchain4j.data.document.splitter.recursive
Uses of DocumentSplitter in dev.langchain4j.spi.data.document.splitter
Uses of DocumentSplitter in dev.langchain4j.store.embedding
Modifier and Type | Method | Description

EmbeddingStoreIngestor.Builder.documentSplitter(DocumentSplitter documentSplitter)
    Sets the document splitter.

Modifier | Constructor | Description

EmbeddingStoreIngestor(DocumentTransformer documentTransformer, DocumentSplitter documentSplitter, TextSegmentTransformer textSegmentTransformer, EmbeddingModel embeddingModel, EmbeddingStore<TextSegment> embeddingStore)
    Creates an instance of an EmbeddingStoreIngestor.