Class ApachePdfBoxDocumentParser
java.lang.Object
dev.langchain4j.data.document.parser.apache.pdfbox.ApachePdfBoxDocumentParser
All Implemented Interfaces:
DocumentParser
Parses a PDF file into a Document using the Apache PDFBox library.
Constructor Summary
Constructors
ApachePdfBoxDocumentParser()
ApachePdfBoxDocumentParser(boolean includeMetadata)
Method Summary
Modifier and Type    Method                            Description
Document             parse(InputStream inputStream)    Parses a given InputStream into a Document.
Constructor Details
ApachePdfBoxDocumentParser
public ApachePdfBoxDocumentParser()
ApachePdfBoxDocumentParser
public ApachePdfBoxDocumentParser(boolean includeMetadata)
Method Details
parse
public Document parse(InputStream inputStream)
Description copied from interface: DocumentParser
Parses a given InputStream into a Document. The specific implementation of this method will depend on the type of the document being parsed.
Note: This method does not close the provided InputStream; it is the caller's responsibility to manage the lifecycle of the stream.
Specified by:
parse in interface DocumentParser
Parameters:
inputStream - The InputStream that contains the content of the Document.
Returns:
The parsed Document.
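Because parse does not close the stream, the caller has to. The following is a minimal sketch of that caller-side pattern using try-with-resources. To stay self-contained it does not depend on LangChain4j or PDFBox: TextParser, UTF8_PARSER, CloseTrackingStream and parseAndClose are hypothetical stand-ins invented for this illustration, not part of the library. With the real class, the body of the try block would presumably call new ApachePdfBoxDocumentParser().parse(stream) instead.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamLifecycleSketch {

    /** Hypothetical stand-in for DocumentParser: reads the stream, never closes it. */
    interface TextParser {
        String parse(InputStream inputStream);
    }

    /** Helper that records whether close() was called on the wrapped stream. */
    static class CloseTrackingStream extends FilterInputStream {
        private boolean closed = false;

        CloseTrackingStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Stand-in parser: decodes all bytes as UTF-8 and leaves the stream open. */
    static final TextParser UTF8_PARSER = in -> {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    };

    /** Caller-side pattern: try-with-resources closes the stream once parsing is done. */
    static String parseAndClose(TextParser parser, InputStream inputStream) {
        try (InputStream managed = inputStream) {
            // With the real library, this is where parser.parse(managed) would return a Document.
            return parser.parse(managed);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CloseTrackingStream stream = new CloseTrackingStream(
                new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)));
        String text = parseAndClose(UTF8_PARSER, stream);
        System.out.println(text + " closed=" + stream.isClosed());
    }
}
```

The try-with-resources form keeps the open and close of the stream in the same scope, so the stream is released even if parsing throws.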