This library provides a Dagster integration with dbt (data build tool), created by Fishtown Analytics.
dagster_dbt.
dbt_cli_compile
(*args, **kwargs)[source]¶This solid executes dbt compile
via the dbt CLI.
dagster_dbt.
dbt_cli_run_operation
(*args, **kwargs)[source]¶This solid executes dbt run-operation
via the dbt CLI.
dagster_dbt.
dbt_cli_snapshot
(*args, **kwargs)[source]¶This solid executes dbt snapshot
via the dbt CLI.
dagster_dbt.
dbt_cli_snapshot_freshness
(*args, **kwargs)[source]¶This solid executes dbt source snapshot-freshness
via the dbt CLI.
dagster_dbt.
DbtCliOutput
[source]¶The results of executing a dbt command, along with additional metadata about the dbt CLI process that was run.
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
If the executed dbt command is either run
or test
, then the .num_*
attributes will
contain non-None
integer values. Otherwise, they will be None
.
from_dict
(d: Dict[str, Any]) → dagster_dbt.cli.types.DbtCliOutput[source]¶Constructs an instance of DbtCliOutput
from a
dictionary.
d (Dict[str, Any]) – A dictionary with key-values to construct a DbtCliOutput
.
An instance of DbtCliOutput
.
dagster_dbt.
create_dbt_rpc_run_sql_solid
(name: str, output_def: Optional[dagster.core.definitions.output.OutputDefinition] = None, **kwargs) → Callable[source]¶This function is a factory which constructs a solid that will copy the results of a SQL query
run within the context of a dbt project to a pandas DataFrame
.
Any kwargs passed to this function will be passed along to the underlying @solid
decorator. However, note that overriding config_schema
, input_defs
, and
required_resource_keys
is not allowed and will throw a DagsterInvalidDefinitionError
.
If you would like to configure this solid with different config fields, you could consider using
@composite_solid
to wrap this solid.
name (str) – The name of this solid.
output_def (OutputDefinition, optional) – The OutputDefinition
for the solid. This value should always be a representation
of a pandas DataFrame
. If not specified, the solid will default to an
OutputDefinition
named “df” with a DataFrame
dagster type.
Returns the constructed solid definition.
dagster_dbt.
dbt_rpc_compile_sql
(*args, **kwargs)[source]¶This solid sends the dbt compile
command to a dbt RPC server and returns the request
token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
dagster_dbt.
dbt_rpc_run
(*args, **kwargs)[source]¶This solid sends the dbt run
command to a dbt RPC server and returns the request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
dagster_dbt.
dbt_rpc_run_and_wait
(*args, **kwargs)[source]¶This solid sends the dbt run
command to a dbt RPC server and returns the result of the
executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
dagster_dbt.
dbt_rpc_run_operation
(*args, **kwargs)[source]¶This solid sends the dbt run-operation
command to a dbt RPC server and returns the
request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
dagster_dbt.
dbt_rpc_run_operation_and_wait
(*args, **kwargs)[source]¶This solid sends the dbt run-operation
command to a dbt RPC server and returns the
result of the executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
dagster_dbt.
dbt_rpc_snapshot
(*args, **kwargs)[source]¶This solid sends the dbt snapshot
command to a dbt RPC server and returns the
request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
dagster_dbt.
dbt_rpc_snapshot_and_wait
(*args, **kwargs)[source]¶This solid sends the dbt snapshot
command to a dbt RPC server and returns the result of
the executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
dagster_dbt.
dbt_rpc_snapshot_freshness
(*args, **kwargs)[source]¶This solid sends the dbt source snapshot-freshness
command to a dbt RPC server and
returns the request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
dagster_dbt.
dbt_rpc_snapshot_freshness_and_wait
(*args, **kwargs)[source]¶This solid sends the dbt source snapshot
command to a dbt RPC server and returns the
result of the executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
dagster_dbt.
dbt_rpc_test
(*args, **kwargs)[source]¶This solid sends the dbt test
command to a dbt RPC server and returns the request token.
This dbt RPC solid is asynchronous. The request token can be used in subsequent RPC requests to poll the progress of the running dbt process.
dagster_dbt.
dbt_rpc_test_and_wait
(*args, **kwargs)[source]¶This solid sends the dbt test
command to a dbt RPC server and returns the result of the
executed dbt process.
This dbt RPC solid is synchronous, and will periodically poll the dbt RPC server until the dbt process is completed.
dagster_dbt.
dbt_rpc_resource
ResourceDefinition[source]¶This resource defines a dbt RPC client.
To configure this resource, we recommend using the configured method.
Examples:
custom_dbt_rpc_resource = dbt_rpc_resource.configured({"host": "80.80.80.80","port": 8080,})
@pipeline(mode_defs=[ModeDefinition(resource_defs={"dbt_rpc": custom_dbt_rpc_resource})])
def dbt_rpc_pipeline():
# Run solids with `required_resource_keys={"dbt_rpc", ...}`.
dagster_dbt.
local_dbt_rpc_resource
ResourceDefinition¶This resource defines a dbt RPC client for an RPC server running on 0.0.0.0:8580.
dagster_dbt.
DbtRpcClient
(host: str = '0.0.0.0', port: int = 8580, jsonrpc_version: str = '2.0', logger: Optional[Any] = None, **_)[source]¶A client for a dbt RPC server.
If you are need a dbt RPC server as a Dagster resource, we recommend that you use
dbt_rpc_resource
.
cli
(*, cli: str, **kwargs) → requests.models.Response[source]¶Sends a request with CLI syntax to the dbt RPC server, and returns the response. For more details, see the dbt docs for running CLI commands via RPC.
cli (str) – a dbt command in CLI syntax.
the HTTP response from the dbt RPC server.
Response
compile
(*, models: List[str] = None, exclude: List[str] = None, **kwargs) → requests.models.Response[source]¶Sends a request with the method compile
to the dbt RPC server, and returns the
response. For more details, see the dbt docs for compiling projects via RPC.
compile_sql
(*, sql: str, name: str) → requests.models.Response[source]¶Sends a request with the method compile_sql
to the dbt RPC server, and returns the
response. For more details, see the dbt docs for compiling SQL via RPC.
generate_docs
(*, models: List[str] = None, exclude: List[str] = None, compile: bool = False, **kwargs) → requests.models.Response[source]¶Sends a request with the method docs.generate
to the dbt RPC server, and returns the
response. For more details, see the dbt docs for the RPC method docs.generate.
the HTTP response from the dbt RPC server.
Response
kill
(*, task_id: str) → requests.models.Response[source]¶Sends a request with the method kill
to the dbt RPC server, and returns the response.
For more details, see the dbt docs for the RPC method kill.
task_id (str) – the ID of the task to terminate.
the HTTP response from the dbt RPC server.
Response
logger
¶A property for injecting a logger dependency.
Optional[Any]
poll
(*, request_token: str, logs: bool = False, logs_start: int = 0) → requests.models.Response[source]¶Sends a request with the method poll
to the dbt RPC server, and returns the response.
For more details, see the dbt docs for the RPC method poll.
ps
(*, completed: bool = False) → requests.models.Response[source]¶Sends a request with the method ps
to the dbt RPC server, and returns the response.
For more details, see the dbt docs for the RPC method ps.
compelted (bool) – If True
, then also return completed tasks. Defaults to False
.
the HTTP response from the dbt RPC server.
Response
run
(*, models: List[str] = None, exclude: List[str] = None, **kwargs) → requests.models.Response[source]¶Sends a request with the method run
to the dbt RPC server, and returns the response.
For more details, see the dbt docs for the RPC method run.
run_operation
(*, macro: str, args: Optional[Dict[str, Any]] = None, **kwargs) → requests.models.Response[source]¶Sends a request with the method run-operation
to the dbt RPC server, and returns the
response. For more details, see the dbt docs for the command run-operation.
run_sql
(*, sql: str, name: str) → requests.models.Response[source]¶Sends a request with the method run_sql
to the dbt RPC server, and returns the
response. For more details, see the dbt docs for running SQL via RPC.
seed
(*, show: bool = False, **kwargs) → requests.models.Response[source]¶Sends a request with the method seed
to the dbt RPC server, and returns the response.
For more details, see the dbt docs for the RPC method seed.
show (bool, optional) – If True
, then show a sample of the seeded data in the
response. Defaults to False
.
the HTTP response from the dbt RPC server.
Response
snapshot
(*, select: List[str] = None, exclude: List[str] = None, **kwargs) → requests.models.Response[source]¶Sends a request with the method snapshot
to the dbt RPC server, and returns the
response. For more details, see the dbt docs for the command snapshot.
snapshot_freshness
(*, select: Optional[List[str]] = None, **kwargs) → requests.models.Response[source]¶Sends a request with the method snapshot-freshness
to the dbt RPC server, and returns
the response. For more details, see the dbt docs for the command source snapshot-freshness.
select (List[str], optional) – the models to include in calculating snapshot freshness.
the HTTP response from the dbt RPC server.
Response
status
()[source]¶Sends a request with the method status
to the dbt RPC server, and returns the
response. For more details, see the dbt docs for the RPC method status.
the HTTP response from the dbt RPC server.
Response
test
(*, models: List[str] = None, exclude: List[str] = None, data: bool = True, schema: bool = True, **kwargs) → requests.models.Response[source]¶Sends a request with the method test
to the dbt RPC server, and returns the response.
For more details, see the dbt docs for the RPC method test.
the HTTP response from the dbt RPC server.
Response
dagster_dbt.
DbtRpcOutput
[source]¶The output from executing a dbt command via the dbt RPC server.
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
from_dict
(d: Dict[str, Any]) → dagster_dbt.rpc.types.DbtRpcOutput[source]¶Constructs an instance of DbtRpcOutput
from a
dictionary.
d (Dict[str, Any]) – A dictionary with key-values to construct a DbtRpcOutput
.
An instance of DbtRpcOutput
.
dagster_dbt.
DbtResult
[source]¶The results of executing a dbt command.
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
results
¶Details about each executed dbt node (model) in the run.
List[NodeResult]]
dagster_dbt.
NodeResult
[source]¶The result of executing a dbt node (model).
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
fail
¶The fail
field from the results of the executed dbt node.
Optional[Any]
warn
¶The warn
field from the results of the executed dbt node.
Optional[Any]
skip
¶The skip
field from the results of the executed dbt node.
Optional[Any]
step_timings
¶The timings for each step in the executed dbt node (model).
List[StepTiming]
table
¶Details about the table/view that is created from executing a run_sql command on an dbt RPC server.
Optional[Dict]
from_dict
(d: Dict[str, Any]) → dagster_dbt.types.NodeResult[source]¶Constructs an instance of NodeResult
from a dictionary.
d (Dict[str, Any]) – A dictionary with key-values to construct a NodeResult
.
An instance of NodeResult
.
dagster_dbt.
StepTiming
[source]¶The timing information of an executed step for a dbt node (model).
Note that users should not construct instances of this class directly. This class is intended to be constructed from the JSON output of dbt commands.
started_at
¶An ISO string timestamp of when the step started executing.
completed_at
¶An ISO string timestamp of when the step completed execution.
duration
¶The execution duration of the step.
dagster_dbt.
DagsterDbtError
(description=None, metadata_entries=None)[source]¶The base exception of the dagster-dbt
library.
dagster_dbt.
DagsterDbtCliRuntimeError
(description: str, logs: List[Dict[str, Any]], raw_output: str)[source]¶Represents an error while executing a dbt CLI command.
dagster_dbt.
DagsterDbtCliFatalRuntimeError
(logs: List[Dict[str, Any]], raw_output: str)[source]¶Represents a fatal error in the dbt CLI (return code 2).
dagster_dbt.
DagsterDbtCliHandledRuntimeError
(logs: List[Dict[str, Any]], raw_output: str)[source]¶Represents a model error reported by the dbt CLI at runtime (return code 1).
dagster_dbt.
DagsterDbtCliOutputsNotFoundError
(path: str)[source]¶Represents a problem in finding the target/run_results.json
artifact when executing a dbt
CLI command.
For more details on target/run_results.json
, see
https://docs.getdbt.com/reference/dbt-artifacts#run_resultsjson.