Class ApachePoiDocumentParser
java.lang.Object
dev.langchain4j.data.document.parser.apache.poi.ApachePoiDocumentParser
All Implemented Interfaces:
DocumentParser
Parses a Microsoft Office file into a Document using the Apache POI library.
This parser supports various file formats, including doc, docx, ppt, pptx, xls, and xlsx.
For detailed information on supported formats, please refer to the official Apache POI website.
Constructor Summary
Constructors
ApachePoiDocumentParser()
Method Summary
Modifier and Type: Document
Method: parse(InputStream inputStream)
Description: Parses a given InputStream into a Document.
Constructor Details
ApachePoiDocumentParser
public ApachePoiDocumentParser()
Method Details
parse
Description copied from interface: DocumentParser
Parses a given InputStream into a Document. The specific implementation of this method will depend on the type of the document being parsed.
Note: This method does not close the provided InputStream - it is the caller's responsibility to manage the lifecycle of the stream.
Specified by:
parse in interface DocumentParser
Parameters:
inputStream - The InputStream that contains the content of the Document.
Returns:
The parsed Document.
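Because parse does not close the stream it is given, the caller typically wraps the call in try-with-resources. The following is a minimal sketch of that pattern using only the JDK. The class StreamLifecycleSketch and the helper parseAndClose are hypothetical names introduced for illustration and are not part of langchain4j; the lambda in main is a stand-in for a real parser.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

// Hypothetical helper class; not part of langchain4j.
final class StreamLifecycleSketch {

    /**
     * Opens the file, hands the stream to the given parser function,
     * and closes the stream once parsing has finished or failed.
     */
    public static <T> T parseAndClose(Path file, Function<InputStream, T> parser) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return parser.apply(in);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sketch", ".txt");
        try {
            Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));

            // Stand-in parser (assumption): reads the whole stream as UTF-8 text.
            String text = parseAndClose(tmp, in -> {
                try {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            System.out.println(text);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Assuming the langchain4j Apache POI module is on the classpath, a method reference such as new ApachePoiDocumentParser()::parse should fit the parser argument, since parse accepts an InputStream and returns a Document.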