Google Cloud Storage
A Google Cloud Storage (GCS) document loader that allows you to load documents from storage buckets.
Maven Dependency
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-document-loader-google-cloud-storage</artifactId>
<version>0.36.2</version>
</dependency>
APIs
GoogleCloudStorageDocumentLoader
Authentication
Authentication should be handled transparently in the following cases:
- If your application is running on Google Cloud Platform (Cloud Run, App Engine, Compute Engine, etc.)
- When running locally on your machine, if you are already authenticated via Google's gcloud SDK
In either case, create a loader specifying only your project ID:
GoogleCloudStorageDocumentLoader gcsLoader = GoogleCloudStorageDocumentLoader.builder()
    .project(System.getenv("GCP_PROJECT_ID"))
    .build();
Otherwise, you can specify Credentials explicitly, for example if you have downloaded a service account key and exported an environment variable pointing to it:
GoogleCloudStorageDocumentLoader gcsLoader = GoogleCloudStorageDocumentLoader.builder()
    .project(System.getenv("GCP_PROJECT_ID"))
    .credentials(GoogleCredentials.fromStream(
        new FileInputStream(System.getenv("GOOGLE_APPLICATION_CREDENTIALS"))))
    .build();
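Before handing a key file to GoogleCredentials.fromStream, it can help to confirm that the environment variable is set and points to a readable file. The following is a minimal sketch using only the JDK. The names KeyFileCheck and resolveKeyFile are hypothetical and are not part of LangChain4j or the Google client libraries.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;

class KeyFileCheck {

    // Hypothetical helper: returns the key file path only if
    // GOOGLE_APPLICATION_CREDENTIALS is set and refers to a readable regular file.
    public static Optional<Path> resolveKeyFile(Map<String, String> env) {
        String value = env.get("GOOGLE_APPLICATION_CREDENTIALS");
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        Path path = Paths.get(value);
        if (Files.isRegularFile(path) && Files.isReadable(path)) {
            return Optional.of(path);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> keyFile = resolveKeyFile(System.getenv());
        // Decide which of the two builder variants shown above to use.
        System.out.println(keyFile.isPresent()
                ? "Using explicit credentials from " + keyFile.get()
                : "No usable key file; relying on default authentication");
    }
}
```

If the helper returns an empty result, the loader can be built with the project ID alone, as in the first snippet. Otherwise the returned path can be opened and passed to the credentials call.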
Learn more about credentials.
When accessing a public bucket, you shouldn't need to authenticate.
Examples
Load a single file from a GCS bucket
GoogleCloudStorageDocumentLoader gcsLoader = GoogleCloudStorageDocumentLoader.builder()
    .project(System.getenv("GCP_PROJECT_ID"))
    .build();
Document document = gcsLoader.loadDocument("BUCKET_NAME", "FILE_NAME.txt", new TextDocumentParser());
Load all files from a GCS bucket
GoogleCloudStorageDocumentLoader gcsLoader = GoogleCloudStorageDocumentLoader.builder()
    .project(System.getenv("GCP_PROJECT_ID"))
    .build();
List<Document> documents = gcsLoader.loadDocuments("BUCKET_NAME", new TextDocumentParser());
Load all files from a GCS bucket with a glob pattern
GoogleCloudStorageDocumentLoader gcsLoader = GoogleCloudStorageDocumentLoader.builder()
    .project(System.getenv("GCP_PROJECT_ID"))
    .build();
List<Document> documents = gcsLoader.loadDocuments("BUCKET_NAME", "*.txt", new TextDocumentParser());
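To get a feel for which object names a pattern such as "*.txt" selects, you can try the pattern locally first. The sketch below uses the JDK's own glob matcher as an approximation. It does not call the loader, and the matching rules the loader applies to your bucket may differ in detail, so treat it as a preview only. GlobPreview and filter are hypothetical names.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GlobPreview {

    // Hypothetical helper: keeps the names that match the glob according to the
    // JDK's glob syntax (an approximation, not the loader's own matching).
    public static List<String> filter(String glob, List<String> names) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> matched = new ArrayList<>();
        for (String name : names) {
            if (matcher.matches(Paths.get(name))) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("notes.txt", "image.png", "reports/q1.txt");
        // In the JDK's syntax, "*" does not cross a "/" boundary, while "**" does.
        System.out.println(filter("*.txt", names));
        System.out.println(filter("**.txt", names));
    }
}
```

Under the JDK's rules, "*.txt" selects only names without a directory-like prefix, while "**.txt" also selects names such as "reports/q1.txt". It is worth verifying the result against a real bucket before relying on a pattern.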
For more code samples, please have a look at the integration test class: