Class FileSystemDocumentLoader
java.lang.Object
dev.langchain4j.data.document.loader.FileSystemDocumentLoader
-
Method Summary
Modifier and Type | Method | Description
static Document | loadDocument(String filePath) | Loads a Document from the specified file path.
static Document | loadDocument(String filePath, DocumentParser documentParser) | Loads a Document from the specified file path.
static Document | loadDocument(Path filePath) |
static Document | loadDocument(Path filePath, DocumentParser documentParser) |
static List<Document> | loadDocuments(String directoryPath) | Loads Documents from the specified directory.
static List<Document> | loadDocuments(String directoryPath, DocumentParser documentParser) | Loads Documents from the specified directory.
static List<Document> | loadDocuments(String directoryPath, PathMatcher pathMatcher) | Loads matching Documents from the specified directory.
static List<Document> | loadDocuments(String directoryPath, PathMatcher pathMatcher, DocumentParser documentParser) | Loads matching Documents from the specified directory.
static List<Document> | loadDocuments(Path directoryPath) | Loads Documents from the specified directory.
static List<Document> | loadDocuments(Path directoryPath, DocumentParser documentParser) | Loads Documents from the specified directory.
static List<Document> | loadDocuments(Path directoryPath, PathMatcher pathMatcher) | Loads matching Documents from the specified directory.
static List<Document> | loadDocuments(Path directoryPath, PathMatcher pathMatcher, DocumentParser documentParser) | Loads matching Documents from the specified directory.
static List<Document> | loadDocumentsRecursively(String directoryPath) | Recursively loads Documents from the specified directory and its subdirectories.
static List<Document> | loadDocumentsRecursively(String directoryPath, DocumentParser documentParser) | Recursively loads Documents from the specified directory and its subdirectories.
static List<Document> | loadDocumentsRecursively(String directoryPath, PathMatcher pathMatcher) | Recursively loads matching Documents from the specified directory and its subdirectories.
static List<Document> | loadDocumentsRecursively(String directoryPath, PathMatcher pathMatcher, DocumentParser documentParser) | Recursively loads matching Documents from the specified directory and its subdirectories.
static List<Document> | loadDocumentsRecursively(Path directoryPath) | Recursively loads Documents from the specified directory and its subdirectories.
static List<Document> | loadDocumentsRecursively(Path directoryPath, DocumentParser documentParser) | Recursively loads Documents from the specified directory and its subdirectories.
static List<Document> | loadDocumentsRecursively(Path directoryPath, PathMatcher pathMatcher) | Recursively loads matching Documents from the specified directory and its subdirectories.
static List<Document> | loadDocumentsRecursively(Path directoryPath, PathMatcher pathMatcher, DocumentParser documentParser) | Recursively loads matching Documents from the specified directory and its subdirectories.
-
Method Details
-
loadDocument
Loads a Document from the specified filePath.
The file is parsed using the specified DocumentParser.
Returned Document contains all the textual information from the file.
- Parameters:
  filePath - The path to the file.
  documentParser - The parser to be used for parsing text from the file.
- Returns:
  document
- Throws:
  IllegalArgumentException - If specified path is not a file.
-
loadDocument
Loads a Document from the specified filePath.
The file is parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Returned Document contains all the textual information from the file.
- Parameters:
  filePath - The path to the file.
- Returns:
  document
- Throws:
  IllegalArgumentException - If specified path is not a file.
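The default-parser lookup described above (use a factory found through SPI, otherwise fall back to a plain-text parser) can be illustrated with the JDK's `ServiceLoader`. This is only a sketch of the general pattern, not the library's actual code: `Parser`, `ParserFactory`, and `PlainTextParser` are hypothetical stand-ins for `DocumentParser`, `DocumentParserFactory`, and `TextDocumentParser`.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

class DefaultParserSketch {

    // Hypothetical stand-in for a document parser.
    interface Parser {
        String parse(String raw);
    }

    // Hypothetical stand-in for a factory that providers register through SPI.
    interface ParserFactory {
        Parser create();
    }

    // Hypothetical fallback that returns the text unchanged.
    static final class PlainTextParser implements Parser {
        @Override
        public String parse(String raw) {
            return raw;
        }
    }

    // Uses the first factory registered on the classpath, or the fallback if none is found.
    static Parser defaultParser() {
        Iterator<ParserFactory> factories = ServiceLoader.load(ParserFactory.class).iterator();
        return factories.hasNext() ? factories.next().create() : new PlainTextParser();
    }

    public static void main(String[] args) {
        // No provider is registered in this sketch, so the fallback is chosen.
        System.out.println(defaultParser().getClass().getSimpleName()); // PlainTextParser
    }
}
```

With no provider file under `META-INF/services`, the loader's iterator is empty and the fallback is returned, which corresponds to the "no factory available in the classpath" case.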
-
loadDocument
Loads a Document from the specified file path.
The file is parsed using the specified DocumentParser.
Returned Document contains all the textual information from the file.
- Parameters:
  filePath - The path to the file.
  documentParser - The parser to be used for parsing text from the file.
- Returns:
  document
- Throws:
  IllegalArgumentException - If specified path is not a file.
-
loadDocument
Loads a Document from the specified file path.
The file is parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Returned Document contains all the textual information from the file.
- Parameters:
  filePath - The path to the file.
- Returns:
  document
- Throws:
  IllegalArgumentException - If specified path is not a file.
-
loadDocuments
Loads Documents from the specified directory. Does not use recursion.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  documentParser - The parser to be used for parsing text from each file.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
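The documented behaviour (reject a non-directory, look only at files directly inside the directory, and skip files that fail to load) might look roughly like the following stdlib-only sketch. It is an illustration under assumptions, not the library's implementation: `loadAll` and `readNonBlank` are hypothetical names, and "fails to load" is modelled here as a blank file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;

class SkipOnFailureSketch {

    // Hypothetical parser stand-in: returns the file's text, or throws for a blank file.
    static String readNonBlank(Path file) {
        try {
            String text = Files.readString(file);
            if (text.isBlank()) {
                throw new IllegalStateException("blank file: " + file);
            }
            return text;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads regular files directly inside 'directory' (no recursion), skipping failures.
    static List<String> loadAll(Path directory, Function<Path, String> parser) {
        if (!Files.isDirectory(directory)) {
            throw new IllegalArgumentException(directory + " is not a directory");
        }
        List<String> loaded = new ArrayList<>();
        try (Stream<Path> entries = Files.list(directory)) {
            entries.filter(Files::isRegularFile).sorted().forEach(file -> {
                try {
                    loaded.add(parser.apply(file));
                } catch (RuntimeException e) {
                    // A file that cannot be parsed is skipped; the remaining files still load.
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return loaded;
    }

    // Builds a small temporary directory: one good file, one blank file, one nested file.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("docs");
            Files.writeString(dir.resolve("a.txt"), "hello");
            Files.writeString(dir.resolve("b.txt"), "");
            Path sub = Files.createDirectory(dir.resolve("sub"));
            Files.writeString(sub.resolve("c.txt"), "nested");
            return loadAll(dir, SkipOnFailureSketch::readNonBlank);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [hello]
    }
}
```

Only `a.txt` survives: the blank file is skipped, and the nested file is never visited because the listing does not descend into subdirectories.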
-
loadDocuments
Loads Documents from the specified directory. Does not use recursion.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocuments
Loads Documents from the specified directory. Does not use recursion.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  documentParser - The parser to be used for parsing text from each file.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocuments
Loads Documents from the specified directory. Does not use recursion.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocuments
public static List<Document> loadDocuments(Path directoryPath, PathMatcher pathMatcher, DocumentParser documentParser)
Loads matching Documents from the specified directory. Does not use recursion.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryPath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryPath) before being matched by a pathMatcher. Thus, pathMatcher should use relative patterns.
  documentParser - The parser to be used for parsing text from each file.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
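The note about relative patterns can be checked with the JDK alone. The sketch below relativizes a file path against its directory before matching, as the parameter description says the loader does; `matchesRelative` and the `data/docs` directory are hypothetical names used only for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class RelativeGlobSketch {

    // Relativizes the file against the directory first, then applies the matcher.
    static boolean matchesRelative(Path directory, Path file, PathMatcher matcher) {
        return matcher.matches(directory.relativize(file));
    }

    public static void main(String[] args) {
        PathMatcher txtOnly = FileSystems.getDefault().getPathMatcher("glob:*.txt");
        Path directory = Paths.get("data", "docs"); // hypothetical directory
        Path file = directory.resolve("notes.txt");

        // The relative path "notes.txt" matches the pattern.
        System.out.println(matchesRelative(directory, file, txtOnly)); // true

        // The full path does not: a single '*' never crosses a directory separator.
        System.out.println(txtOnly.matches(file)); // false

        // A file with another extension is filtered out.
        System.out.println(matchesRelative(directory, directory.resolve("image.png"), txtOnly)); // false
    }
}
```

A matcher built this way is the kind of value the pathMatcher parameter expects; a pattern written against the full path would not match the relativized paths.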
-
loadDocuments
Loads matching Documents from the specified directory. Does not use recursion.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryPath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryPath) before being matched by a pathMatcher. Thus, pathMatcher should use relative patterns.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocuments
public static List<Document> loadDocuments(String directoryPath, PathMatcher pathMatcher, DocumentParser documentParser)
Loads matching Documents from the specified directory. Does not use recursion.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryPath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryPath) before being matched by a pathMatcher. Thus, pathMatcher should use relative patterns.
  documentParser - The parser to be used for parsing text from each file.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocuments
Loads matching Documents from the specified directory. Does not use recursion.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryPath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryPath) before being matched by a pathMatcher. Thus, pathMatcher should use relative patterns.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocumentsRecursively
public static List<Document> loadDocumentsRecursively(Path directoryPath, DocumentParser documentParser)
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  documentParser - The parser to be used for parsing text from each file.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
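The difference between the non-recursive and recursive variants comes down to which files are visited. A stdlib-only sketch of that difference, using `Files.list` for a flat listing and `Files.walk` for the whole tree, is shown below. The class and method names are hypothetical, and how the library traverses directories internally is an assumption, not something this page states.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TraversalSketch {

    // Regular files directly inside 'directory' only.
    static List<String> listFlat(Path directory) {
        try (Stream<Path> entries = Files.list(directory)) {
            return relativeNames(directory, entries);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Regular files in 'directory' and in all of its subdirectories.
    static List<String> listRecursive(Path directory) {
        try (Stream<Path> entries = Files.walk(directory)) {
            return relativeNames(directory, entries);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Keeps regular files and reports them relative to 'directory', using '/' as separator.
    private static List<String> relativeNames(Path directory, Stream<Path> entries) {
        return entries.filter(Files::isRegularFile)
                .map(file -> directory.relativize(file).toString().replace('\\', '/'))
                .sorted()
                .collect(Collectors.toList());
    }

    // Builds a temporary tree with one top-level file and one nested file.
    static Path sampleTree() {
        try {
            Path dir = Files.createTempDirectory("docs");
            Files.writeString(dir.resolve("a.txt"), "top level");
            Path sub = Files.createDirectory(dir.resolve("sub"));
            Files.writeString(sub.resolve("b.txt"), "nested");
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = sampleTree();
        System.out.println(listFlat(dir));      // [a.txt]
        System.out.println(listRecursive(dir)); // [a.txt, sub/b.txt]
    }
}
```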
-
loadDocumentsRecursively
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocumentsRecursively
public static List<Document> loadDocumentsRecursively(String directoryPath, DocumentParser documentParser)
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  documentParser - The parser to be used for parsing text from each file.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocumentsRecursively
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocumentsRecursively
public static List<Document> loadDocumentsRecursively(Path directoryPath, PathMatcher pathMatcher, DocumentParser documentParser)
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files from directoryPath and its subdirectories with a txt extension. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryPath) before being matched by a pathMatcher. Thus, pathMatcher should use relative patterns. Please be aware that *.txt pattern (with a single asterisk) will match files only in the directoryPath, but it will not match files from the subdirectories of directoryPath.
  documentParser - The parser to be used for parsing text from each file.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
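The single-asterisk versus double-asterisk distinction described for pathMatcher can be verified directly against the JDK's glob matcher. In this small sketch the `matches` helper and the sample relative paths are hypothetical; the paths stand in for what a file path looks like after being made relative to directoryPath.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class RecursiveGlobSketch {

    // Applies a glob pattern to a path that is already relative to the directory.
    static boolean matches(String glob, String relativePath) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(relativePath));
    }

    public static void main(String[] args) {
        // A single asterisk stays within one directory level.
        System.out.println(matches("*.txt", "a.txt"));      // true
        System.out.println(matches("*.txt", "sub/b.txt"));  // false

        // A double asterisk also crosses directory boundaries.
        System.out.println(matches("**.txt", "a.txt"));          // true
        System.out.println(matches("**.txt", "sub/b.txt"));      // true
        System.out.println(matches("**.txt", "sub/deep/c.txt")); // true
    }
}
```

For a recursive load, `glob:**.txt` is therefore the pattern that reaches text files at every depth, while `glob:*.txt` would silently restrict the result to the top-level directory.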
-
loadDocumentsRecursively
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files from directoryPath and its subdirectories with a txt extension. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryPath) before being matched by a pathMatcher. Thus, pathMatcher should use relative patterns. Please be aware that *.txt pattern (with a single asterisk) will match files only in the directoryPath, but it will not match files from the subdirectories of directoryPath.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocumentsRecursively
public static List<Document> loadDocumentsRecursively(String directoryPath, PathMatcher pathMatcher, DocumentParser documentParser)
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files from directoryPath and its subdirectories with a txt extension. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryPath) before being matched by a pathMatcher. Thus, pathMatcher should use relative patterns. Please be aware that *.txt pattern (with a single asterisk) will match files only in the directoryPath, but it will not match files from the subdirectories of directoryPath.
  documentParser - The parser to be used for parsing text from each file.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-
loadDocumentsRecursively
public static List<Document> loadDocumentsRecursively(String directoryPath, PathMatcher pathMatcher)
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available in the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
  directoryPath - The path to the directory with files.
  pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files from directoryPath and its subdirectories with a txt extension. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryPath) before being matched by a pathMatcher. Thus, pathMatcher should use relative patterns. Please be aware that *.txt pattern (with a single asterisk) will match files only in the directoryPath, but it will not match files from the subdirectories of directoryPath.
- Returns:
  list of documents
- Throws:
  IllegalArgumentException - If specified path is not a directory.
-