Class ApachePoiDocumentParser
java.lang.Object
dev.langchain4j.data.document.parser.apache.poi.ApachePoiDocumentParser
- All Implemented Interfaces:
DocumentParser
Parses a Microsoft Office file into a Document using the Apache POI library.
This parser supports various file formats, including doc, docx, ppt, pptx, xls, and xlsx.
For detailed information on supported formats, please refer to the official Apache POI website.
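Because the parser targets a fixed set of Office formats, a caller may want to filter files by extension before opening a stream. The following is a minimal sketch of such a pre-check. The class `OfficeFormats` and its method `isSupported` are hypothetical helpers, not part of LangChain4j or Apache POI, and the extension list is simply the one given in the description above, which may be incomplete.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of LangChain4j or Apache POI.
final class OfficeFormats {

    // Extensions named in the class description above (assumed, possibly incomplete).
    private static final Set<String> SUPPORTED =
            Set.of("doc", "docx", "ppt", "pptx", "xls", "xlsx");

    private OfficeFormats() {}

    /** Returns true if the file name ends with one of the listed extensions. */
    static boolean isSupported(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension present
        }
        // Compare case-insensitively so "REPORT.DOCX" is treated like "report.docx".
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(extension);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("report.DOCX")); // prints "true"
        System.out.println(isSupported("notes.txt"));   // prints "false"
    }
}
```

An extension check is only a cheap first filter: it says nothing about whether the file content is actually a valid Office document.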
Constructor Summary
ApachePoiDocumentParser()
Method Summary
Modifier and Type: Document
Method: parse(InputStream inputStream)
Description: Parses a given InputStream into a Document.
Constructor Details
ApachePoiDocumentParser
public ApachePoiDocumentParser()
Method Details
parse
Description copied from interface: DocumentParser
Parses a given InputStream into a Document. The specific implementation of this method will depend on the type of the document being parsed.
Note: This method does not close the provided InputStream; it is the caller's responsibility to manage the lifecycle of the stream.
- Specified by:
parse in interface DocumentParser
- Parameters:
inputStream - The InputStream that contains the content of the Document.
- Returns:
The parsed Document.
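Since parse leaves the stream open, the usual way for a caller to manage its lifecycle is try-with-resources. The sketch below illustrates that pattern using only the JDK. `StreamParser`, `TrackingInputStream`, and `parseAndClose` are hypothetical stand-ins introduced for illustration so the example runs without the library; with the real class, the body of the try block would instead call parse on an ApachePoiDocumentParser instance.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class StreamLifecycleSketch {

    /** Hypothetical stand-in for a parser whose parse method leaves the stream open. */
    interface StreamParser {
        String parse(InputStream inputStream);
    }

    /** Records whether close() was called, to make visible who closes the stream. */
    static final class TrackingInputStream extends ByteArrayInputStream {
        private boolean closed;

        TrackingInputStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Parses inside try-with-resources, so the caller closes the stream even on failure. */
    static String parseAndClose(StreamParser parser, InputStream source) {
        try (InputStream in = source) {
            return parser.parse(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Toy parser: reads all bytes as UTF-8 and never closes the stream itself.
        StreamParser toyParser = in -> {
            try {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
        TrackingInputStream stream =
                new TrackingInputStream("hello".getBytes(StandardCharsets.UTF_8));
        String text = parseAndClose(toyParser, stream);
        System.out.println(text + " closed=" + stream.isClosed()); // prints "hello closed=true"
    }
}
```

Keeping the close in the caller means the same stream-handling code works regardless of which parser implementation is plugged in.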