Interface Document
- All Known Implementing Classes:
DefaultDocument
public interface Document
Represents an unstructured piece of text that usually corresponds to the content of a single file.
This text could originate from various sources such as a text file, PDF, DOCX, or a web page (HTML).
Each document may have associated Metadata, including its source, owner, creation date, etc.
-
Field Summary
Fields
Modifier and Type | Field | Description
static final String | ABSOLUTE_DIRECTORY_PATH | Common metadata key for the absolute path of the directory from which the document was loaded.
static final String | FILE_NAME | Common metadata key for the name of the file from which the document was loaded.
static final String | URL | Common metadata key for the URL from which the document was loaded.
-
Method Summary
Modifier and Type | Method | Description
static Document | document | Creates a new Document from the given text.
static Document | document | Creates a new Document from the given text.
static Document | from | Creates a new Document from the given text.
static Document | from | Creates a new Document from the given text.
Metadata | metadata() | Returns the metadata associated with this document.
String | text() | Returns the text of this document.
default TextSegment | toTextSegment | Builds a TextSegment from this document.
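The method summary pairs static factory methods with a default conversion method. The sketch below models that shape in plain Java so that it compiles without any dependency. `SketchDocument`, `SketchSegment`, and the `Map<String, String>` metadata are hypothetical stand-ins, not the library's `Document`, `TextSegment`, or `Metadata` types, and the parameter lists are assumptions, since this page does not show them.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a text segment; not the library's TextSegment.
final class SketchSegment {
    final String text;
    final Map<String, String> metadata;

    SketchSegment(String text, Map<String, String> metadata) {
        this.text = text;
        this.metadata = metadata;
    }
}

// Hypothetical stand-in that mirrors the shape of the interface described here.
interface SketchDocument {

    String text();

    Map<String, String> metadata();

    // Default method: every implementation inherits the conversion.
    default SketchSegment toTextSegment() {
        return new SketchSegment(text(), new HashMap<>(metadata()));
    }

    // Static factory: text only, with empty metadata (assumed parameter list).
    static SketchDocument from(String text) {
        return from(text, Collections.<String, String>emptyMap());
    }

    // Static factory: text plus metadata (assumed parameter list).
    static SketchDocument from(String text, Map<String, String> metadata) {
        // Defensive, read-only copy so later changes to the caller's map do not leak in.
        final Map<String, String> copy = Collections.unmodifiableMap(new HashMap<>(metadata));
        return new SketchDocument() {
            @Override public String text() { return text; }
            @Override public Map<String, String> metadata() { return copy; }
        };
    }
}

class SketchDocumentDemo {
    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put("owner", "docs-team"); // illustrative key and value
        SketchDocument doc = SketchDocument.from("Hello, world.", meta);
        SketchSegment segment = doc.toTextSegment();
        System.out.println(segment.text + " " + segment.metadata);
    }
}
```

Putting the conversion in a default method means an implementing class only has to supply `text()` and `metadata()`. The real interface may copy or share metadata differently from this sketch.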
-
Field Details
-
FILE_NAME
Common metadata key for the name of the file from which the document was loaded.
-
ABSOLUTE_DIRECTORY_PATH
Common metadata key for the absolute path of the directory from which the document was loaded.
-
URL
Common metadata key for the URL from which the document was loaded.
-
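As a rough illustration of how a loader might fill in the file-related keys, the sketch below derives a file name and an absolute directory path from a `java.nio.file.Path`. The key strings and the `fileMetadata` helper are hypothetical. This page does not show the constants' actual values, and real code would reference `Document.FILE_NAME` and `Document.ABSOLUTE_DIRECTORY_PATH` rather than the placeholders used here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

class FileMetadataSketch {

    // Placeholder key strings (assumptions); the real constant values are not shown on this page.
    static final String FILE_NAME_KEY = "sketch_file_name";
    static final String DIRECTORY_KEY = "sketch_absolute_directory_path";

    // Hypothetical helper: builds file-related metadata entries for a path.
    // Assumes the path names a file, so getFileName() and getParent() are non-null.
    static Map<String, String> fileMetadata(Path file) {
        Path absolute = file.toAbsolutePath().normalize();
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put(FILE_NAME_KEY, absolute.getFileName().toString());
        metadata.put(DIRECTORY_KEY, absolute.getParent().toString());
        return metadata;
    }

    public static void main(String[] args) {
        // The file does not need to exist; only the path is inspected.
        System.out.println(fileMetadata(Paths.get("docs", "guide.txt")));
    }
}
```

Storing the directory and the file name under separate keys lets callers filter documents by folder without parsing a full path.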
-
Method Details
-
text
-
metadata
-
toTextSegment
-
from
-
from
-
document
-
document
-