Class ClassPathDocumentLoader
java.lang.Object
dev.langchain4j.data.document.loader.ClassPathDocumentLoader
DocumentLoader implementation for loading documents using a ClassPathSource.
- Author:
- Eric Deandrea
-
Method Summary
static Document loadDocument(String pathOnClasspath)
    Loads a Document from the specified file path.
static Document loadDocument(String pathOnClasspath, DocumentParser documentParser)
    Loads a Document from the specified file path.
static List<Document> loadDocuments(String directoryOnClasspath)
    Loads Documents from the specified directory.
static List<Document> loadDocuments(String directoryOnClasspath, DocumentParser documentParser)
    Loads Documents from the specified directory.
static List<Document> loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher)
    Loads matching Documents from the specified directory.
static List<Document> loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher, DocumentParser documentParser)
    Loads matching Documents from the specified directory.
static List<Document> loadDocumentsRecursively(String directoryOnClasspath)
    Recursively loads Documents from the specified directory and its subdirectories.
static List<Document> loadDocumentsRecursively(String directoryOnClasspath, DocumentParser documentParser)
    Recursively loads Documents from the specified directory and its subdirectories.
static List<Document> loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher)
    Recursively loads matching Documents from the specified directory and its subdirectories.
static List<Document> loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher, DocumentParser documentParser)
    Recursively loads matching Documents from the specified directory and its subdirectories.
-
Method Details
-
loadDocument
public static Document loadDocument(String pathOnClasspath)
Loads a Document from the specified file path.
The file is parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
The returned Document contains all the textual information from the file.
- Parameters:
pathOnClasspath - The path on the classpath to the file.
- Returns:
document
- Throws:
IllegalArgumentException - If the specified path is not a file.
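The default-parser selection described above (look up a factory through SPI, otherwise fall back to a plain-text parser) can be illustrated with the JDK's ServiceLoader. This is a minimal sketch of the general pattern, not the library's actual code: SketchParser, SketchParserFactory and PLAIN_TEXT are hypothetical stand-ins introduced here for illustration.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

final class DefaultParserSketch {

    // Hypothetical stand-in for a document parser (not a langchain4j type).
    interface SketchParser {
        String parse(String raw);
    }

    // Hypothetical stand-in for a parser factory discovered through SPI.
    interface SketchParserFactory {
        SketchParser create();
    }

    // Fallback used when no factory is registered: returns the text unchanged.
    static final SketchParser PLAIN_TEXT = raw -> raw;

    // Returns the parser from the first factory found by ServiceLoader,
    // or the plain-text fallback when no provider is on the classpath.
    static SketchParser defaultParser() {
        Iterator<SketchParserFactory> factories =
                ServiceLoader.load(SketchParserFactory.class).iterator();
        return factories.hasNext() ? factories.next().create() : PLAIN_TEXT;
    }

    public static void main(String[] args) {
        // No META-INF/services entry exists for SketchParserFactory here,
        // so the fallback parser is selected.
        System.out.println(defaultParser() == PLAIN_TEXT); // prints true
    }
}
```

Registering a provider in a META-INF/services file named after the factory interface would make defaultParser() return that provider's parser instead.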
-
loadDocument
public static Document loadDocument(String pathOnClasspath, DocumentParser documentParser)
Loads a Document from the specified file path.
The file is parsed using the specified DocumentParser.
The returned Document contains all the textual information from the file.
- Parameters:
pathOnClasspath - The path on the classpath to the file.
documentParser - The parser to be used for parsing text from the file.
- Returns:
document
- Throws:
IllegalArgumentException - If the specified path is not a file.
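To make the "path on the classpath" and the IllegalArgumentException contract concrete, the following sketch resolves a classpath location to a regular file using only JDK calls. It is an assumption about how such a check could look, not the loader's implementation: the class name and resolveFile are hypothetical, and only "file:" URLs are handled (resources packaged inside a JAR would need a different approach).

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ClassPathFileCheckSketch {

    // Resolves 'pathOnClasspath' to a regular file on disk,
    // or throws IllegalArgumentException when it is missing or not a file.
    static Path resolveFile(String pathOnClasspath) {
        URL url = ClassPathFileCheckSketch.class.getClassLoader().getResource(pathOnClasspath);
        if (url == null || !"file".equals(url.getProtocol())) {
            throw new IllegalArgumentException(
                    "'" + pathOnClasspath + "' was not found as a file on the classpath");
        }
        try {
            Path path = Paths.get(url.toURI());
            if (!Files.isRegularFile(path)) {
                throw new IllegalArgumentException("'" + pathOnClasspath + "' is not a file");
            }
            return path;
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```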
-
loadDocuments
public static List<Document> loadDocuments(String directoryOnClasspath)
Loads Documents from the specified directory. Does not use recursion.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
- Returns:
list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
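The behavior described above (no recursion, failing files skipped rather than aborting the load) can be sketched with plain JDK file APIs. This is an illustration under stated assumptions, not the library's code: it works on a file-system directory rather than the classpath, and loadTopLevel, rejectBlank and demoDirectory are hypothetical names. The parser that rejects blank files exists only to produce a deterministic failure.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SkipOnFailureSketch {

    // Parses every regular file directly inside 'directory' (no recursion).
    // A file whose parsing throws is skipped instead of aborting the whole load.
    static List<String> loadTopLevel(Path directory, Function<Path, String> parser) {
        if (!Files.isDirectory(directory)) {
            throw new IllegalArgumentException(directory + " is not a directory");
        }
        List<Path> files;
        try (Stream<Path> entries = Files.list(directory)) {
            files = entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> texts = new ArrayList<>();
        for (Path file : files) {
            try {
                texts.add(parser.apply(file));
            } catch (RuntimeException e) {
                // Skipped: this file failed to load.
            }
        }
        return texts;
    }

    // Hypothetical parser: reads UTF-8 text and rejects blank content.
    static String rejectBlank(Path file) {
        try {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            if (text.trim().isEmpty()) {
                throw new IllegalStateException("blank file: " + file);
            }
            return text;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a throwaway directory: one readable file, one blank file, one nested file.
    static Path demoDirectory() {
        try {
            Path dir = Files.createTempDirectory("loader-sketch");
            Files.write(dir.resolve("a.txt"), "alpha".getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("blank.txt"), new byte[0]);
            Path sub = Files.createDirectory(dir.resolve("sub"));
            Files.write(sub.resolve("nested.txt"), "nested".getBytes(StandardCharsets.UTF_8));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // blank.txt fails to parse and sub/nested.txt is not at the top level.
        System.out.println(loadTopLevel(demoDirectory(), SkipOnFailureSketch::rejectBlank)); // prints [alpha]
    }
}
```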
-
loadDocuments
public static List<Document> loadDocuments(String directoryOnClasspath, DocumentParser documentParser)
Loads Documents from the specified directory. Does not use recursion.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
documentParser - The parser to be used for parsing text from each file.
- Returns:
list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocuments
public static List<Document> loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher)
Loads matching Documents from the specified directory. Does not use recursion.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryOnClasspath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns.
- Returns:
list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
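The note about relative patterns can be checked with the JDK alone. The sketch below relativizes a file path against its directory before applying a glob:*.txt matcher, which is the conversion the parameter description mentions. RelativeGlobSketch, matchesRelative and the "docs" directory are hypothetical names used only for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

final class RelativeGlobSketch {

    // Matches 'file' against 'matcher' after making it relative to 'directory'.
    static boolean matchesRelative(Path directory, Path file, PathMatcher matcher) {
        return matcher.matches(directory.relativize(file));
    }

    public static void main(String[] args) {
        PathMatcher txt = FileSystems.getDefault().getPathMatcher("glob:*.txt");
        Path directory = Paths.get("docs").toAbsolutePath();
        Path file = directory.resolve("notes.txt");

        // The relative path "notes.txt" matches the pattern.
        System.out.println(matchesRelative(directory, file, txt)); // prints true
        // The absolute path does not, because * never crosses a directory separator.
        System.out.println(txt.matches(file)); // prints false
    }
}
```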
-
loadDocuments
public static List<Document> loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher, DocumentParser documentParser)
Loads matching Documents from the specified directory. Does not use recursion.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryOnClasspath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns.
documentParser - The parser to be used for parsing text from each file.
- Returns:
list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocumentsRecursively
public static List<Document> loadDocumentsRecursively(String directoryOnClasspath)
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
- Returns:
list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
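The difference between the non-recursive and recursive variants comes down to which files are visited. The sketch below contrasts a top-level listing with a full tree walk using the JDK's Files.list and Files.walk. It is an illustration on a temporary file-system directory, not the loader's implementation, and relativeFiles and demoTree are hypothetical helpers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class RecursiveListingSketch {

    // Lists regular files under 'directory' as sorted, '/'-separated relative paths.
    // With recursive == false only the top level is inspected.
    static List<String> relativeFiles(Path directory, boolean recursive) {
        try (Stream<Path> paths = recursive ? Files.walk(directory) : Files.list(directory)) {
            return paths.filter(Files::isRegularFile)
                    .map(file -> directory.relativize(file).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a throwaway tree: top.txt, sub/nested.txt and sub/deeper/leaf.txt.
    static Path demoTree() {
        try {
            Path root = Files.createTempDirectory("recursive-sketch");
            Path deeper = Files.createDirectories(root.resolve("sub").resolve("deeper"));
            Files.write(root.resolve("top.txt"), "top".getBytes(StandardCharsets.UTF_8));
            Files.write(root.resolve("sub").resolve("nested.txt"), "nested".getBytes(StandardCharsets.UTF_8));
            Files.write(deeper.resolve("leaf.txt"), "leaf".getBytes(StandardCharsets.UTF_8));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = demoTree();
        System.out.println(relativeFiles(root, false)); // prints [top.txt]
        System.out.println(relativeFiles(root, true));  // prints [sub/deeper/leaf.txt, sub/nested.txt, top.txt]
    }
}
```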
-
loadDocumentsRecursively
public static List<Document> loadDocumentsRecursively(String directoryOnClasspath, DocumentParser documentParser)
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
documentParser - The parser to be used for parsing text from each file.
- Returns:
list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocumentsRecursively
public static List<Document> loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher)
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files from directoryOnClasspath and its subdirectories with a txt extension. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns. Be aware that the *.txt pattern (with a single asterisk) matches files only in directoryOnClasspath itself, not in its subdirectories.
- Returns:
list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
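The single-asterisk versus double-asterisk distinction described above is a property of the JDK glob syntax and can be verified directly. The following sketch applies each pattern to relative paths at different depths. RecursiveGlobSketch and matches are hypothetical names, and the file names are made up for the example.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

final class RecursiveGlobSketch {

    // Returns whether the relative path matches the given glob pattern.
    static boolean matches(String glob, String relativePath) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(relativePath));
    }

    public static void main(String[] args) {
        // A single asterisk stays within one directory level.
        System.out.println(matches("*.txt", "a.txt"));      // prints true
        System.out.println(matches("*.txt", "sub/a.txt"));  // prints false

        // A double asterisk also crosses directory boundaries.
        System.out.println(matches("**.txt", "a.txt"));            // prints true
        System.out.println(matches("**.txt", "sub/deeper/a.txt")); // prints true

        // "**/*.txt" requires at least one directory separator,
        // so it misses files at the top level.
        System.out.println(matches("**/*.txt", "a.txt"));   // prints false
    }
}
```

For a recursive load that should include top-level files as well as nested ones, **.txt is therefore the safer pattern of the three.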
-
loadDocumentsRecursively
public static List<Document> loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher, DocumentParser documentParser)
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files from directoryOnClasspath and its subdirectories with a txt extension. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns. Be aware that the *.txt pattern (with a single asterisk) matches files only in directoryOnClasspath itself, not in its subdirectories.
documentParser - The parser to be used for parsing text from each file.
- Returns:
list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
-