dagster_gcp.bq_create_dataset(*args, **kwargs)
BigQuery Create Dataset.
This solid encapsulates creating a BigQuery dataset.
Expects a BQ client to be provisioned in resources as context.resources.bigquery.
dagster_gcp.bq_delete_dataset(*args, **kwargs)
BigQuery Delete Dataset.
This solid encapsulates deleting a BigQuery dataset.
Expects a BQ client to be provisioned in resources as context.resources.bigquery.
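A minimal sketch of wiring these solids into a pipeline follows. It assumes the BigQuery client is provided by the bigquery_resource shipped in the same package, and that the create solid is configured with dataset and exists_ok keys; confirm these against the config schema of your installed version.

from dagster import ModeDefinition, execute_pipeline, pipeline
from dagster_gcp import bigquery_resource, bq_create_dataset

# Both solids look up the client at context.resources.bigquery, so the
# resource must be bound under the 'bigquery' key.
@pipeline(mode_defs=[ModeDefinition(resource_defs={'bigquery': bigquery_resource})])
def create_dataset_pipeline():
    bq_create_dataset()

result = execute_pipeline(
    create_dataset_pipeline,
    run_config={
        'solids': {
            # 'dataset' and 'exists_ok' are assumed config keys; verify
            # against your version's config schema.
            'bq_create_dataset': {'config': {'dataset': 'my_dataset', 'exists_ok': True}}
        }
    },
)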
dagster_gcp.GCSFileHandle(gcs_bucket: str, gcs_key: str)
A reference to a file on GCS.
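A short illustrative sketch, assuming the handle exposes gcs_bucket, gcs_key, and a gcs_path property holding the gs:// URI; verify the property names against your installed version.

from dagster_gcp import GCSFileHandle

handle = GCSFileHandle(gcs_bucket='my-cool-bucket', gcs_key='path/to/file.csv')
handle.gcs_bucket  # 'my-cool-bucket'
handle.gcs_key     # 'path/to/file.csv'
handle.gcs_path    # assumed to be 'gs://my-cool-bucket/path/to/file.csv'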
dagster_gcp.gcs_file_manager ResourceDefinition
FileManager that provides abstract access to GCS.
Implements the FileManager API.
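A minimal sketch of attaching the file manager to a mode and writing data through the FileManager API. The gcs_bucket and gcs_prefix config keys are assumptions (they mirror the gcs_pickle_io_manager config shown below), and the sketch assumes the resource constructs its own GCS client; confirm both points against your installed version.

from dagster import ModeDefinition, pipeline, solid
from dagster_gcp import gcs_file_manager

@solid(required_resource_keys={'file_manager'})
def write_report(context):
    # FileManager API: write raw bytes and get back a file handle
    # (here, a GCSFileHandle pointing at the uploaded object).
    return context.resources.file_manager.write_data(b'some bytes')

@pipeline(
    mode_defs=[
        ModeDefinition(
            resource_defs={
                'file_manager': gcs_file_manager.configured(
                    # assumed config keys; see note above
                    {'gcs_bucket': 'my-cool-bucket', 'gcs_prefix': 'reports'}
                )
            }
        )
    ]
)
def report_pipeline():
    write_report()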
dagster_gcp.gcs.gcs_pickle_io_manager IOManagerDefinition
Persistent IO manager using GCS for storage.
Serializes objects via pickling. Suitable for object storage for distributed executors, so long as each execution node has network connectivity and credentials for GCS and the backing bucket.
Attach this resource definition to a ModeDefinition in order to make it available to your pipeline:
pipeline_def = PipelineDefinition(
    mode_defs=[
        ModeDefinition(
            resource_defs={'io_manager': gcs_pickle_io_manager, 'gcs': gcs_resource, ...},
        ), ...
    ], ...
)
You may configure this storage as follows:
resources:
  io_manager:
    config:
      gcs_bucket: my-cool-bucket
      gcs_prefix: good/prefix-for-files-