Class OracleDocumentSplitter

java.lang.Object
dev.langchain4j.data.document.splitter.oracle.OracleDocumentSplitter
All Implemented Interfaces:
DocumentSplitter

public class OracleDocumentSplitter extends Object implements DocumentSplitter
Splits documents using dbms_vector_chain.utl_to_chunks. The preference specifies how to split the content: by words, by characters, or by vocabulary to match a tokenizer.
Some example preferences:
To split by words: {"by": "words", "max": 100}
To split by characters: {"by": "characters", "max": 100}
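The preference is passed to the constructor as a plain JSON string, so it can help to build it in one place instead of hand-writing the quoting each time. The sketch below is only an illustration: ChunkPreferences and its methods are hypothetical helpers, not part of langchain4j, and they cover only the two example preferences shown above.

```java
/**
 * Hypothetical helper (not part of langchain4j) that builds the two
 * example preference strings shown in the class description.
 */
final class ChunkPreferences {
    private ChunkPreferences() {}

    /** Returns a preference that splits by words, e.g. {"by": "words", "max": 100}. */
    static String byWords(int max) {
        return build("words", max);
    }

    /** Returns a preference that splits by characters, e.g. {"by": "characters", "max": 100}. */
    static String byCharacters(int max) {
        return build("characters", max);
    }

    // Formats the JSON by hand; "by" is always one of two fixed literals,
    // so no escaping is needed here.
    private static String build(String by, int max) {
        if (max <= 0) {
            throw new IllegalArgumentException("max must be positive: " + max);
        }
        return "{\"by\": \"" + by + "\", \"max\": " + max + "}";
    }
}
```

The resulting string would then be passed as the pref argument of the constructor, for example new OracleDocumentSplitter(conn, ChunkPreferences.byWords(100)).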
  • Constructor Details

    • OracleDocumentSplitter

      public OracleDocumentSplitter(Connection conn, String pref)
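      Creates a document splitter.
      Parameters:
      conn - The database Connection used to call dbms_vector_chain.utl_to_chunks.
      pref - The JSON preference string that controls how content is split.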
  • Method Details

    • split

      public List<TextSegment> split(Document document)
      Split a single document
      Specified by:
      split in interface DocumentSplitter
      Parameters:
      document - The Document to be split.
      Returns:
      A list of TextSegment objects derived from the input Document.
    • splitAll

      public List<TextSegment> splitAll(List<Document> list)
      Split a list of documents
      Specified by:
      splitAll in interface DocumentSplitter
      Parameters:
      list - The list of Documents to be split.
      Returns:
      A list of TextSegment objects derived from the input Documents.
    • split

      public String[] split(String content) throws SQLException, com.fasterxml.jackson.core.JsonProcessingException
      Split the provided text into parts
      Parameters:
      content - The text to be split.
      Returns:
      An array of strings, one per part.
      Throws:
      SQLException
      com.fasterxml.jackson.core.JsonProcessingException
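
Unlike the two DocumentSplitter methods, split(String) declares checked exceptions, so callers have to catch or propagate them. The following is a minimal sketch of one way to wrap them in unchecked exceptions. CheckedTextSplitter and SafeSplit are hypothetical names introduced for this illustration, and the interface declares IOException on the assumption that JsonProcessingException is a subclass of it (as in Jackson 2.x).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.sql.SQLException;

/**
 * Hypothetical stand-in for a method shaped like split(String):
 * it takes text and may throw SQLException or an IOException subclass.
 */
@FunctionalInterface
interface CheckedTextSplitter {
    String[] split(String content) throws SQLException, IOException;
}

/** Hypothetical helper that converts the checked exceptions to unchecked ones. */
final class SafeSplit {
    private SafeSplit() {}

    static String[] splitUnchecked(CheckedTextSplitter splitter, String content) {
        try {
            return splitter.split(content);
        } catch (SQLException e) {
            // Database-side failure; keep the original exception as the cause.
            throw new IllegalStateException("Database error while splitting text", e);
        } catch (IOException e) {
            // Covers JSON processing failures under the assumption stated above.
            throw new UncheckedIOException("Could not process split result", e);
        }
    }
}
```

With an OracleDocumentSplitter instance named splitter, the call would look like SafeSplit.splitUnchecked(splitter::split, text). Whether wrapping is preferable to propagating the exceptions depends on how the calling code reports errors.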