---
sidebar_position: 1
title: dbt with DuckDB and MotherDuck
description: DuckDB and MotherDuck both support using dbt to manage data loading and transformation
sidebar_label: dbt core
---

[Data Build Tool](https://www.getdbt.com/) (dbt) is an open-source command-line tool that enables data analysts and engineers to transform data in their warehouses by defining SQL in model files. It bring the composability of programming languages to SQL while automating the mechanics of updating tables.

[dbt-duckdb](https://github.com/jwills/dbt-duckdb) is the adapter which allows dbt to use DuckDB and MotherDuck. The adapter also supports [DuckDB extensions](https://duckdb.org/docs/extensions/overview) and any of the additional [DuckDB configuration options](https://duckdb.org/docs/sql/configuration).

## Installation

Since dbt is a Python library, it can be installed through pip:

```pip3 install dbt-duckdb```
will install both `dbt` and `duckdb`.

## Configuration for Local DuckDB


This configuration allows you to connect to S3 and perform read/write operations on Parquet files using an AWS access key and secret.

`profiles.yml`

```yaml
default:
  outputs:
    dev:
      type: duckdb
      path: /tmp/dbt.duckdb
      threads: 4
      extensions:
        - httpfs
        - parquet
      settings:
        s3_region: my-aws-region
        s3_access_key_id: "{{ env_var('S3_ACCESS_KEY_ID') }}"
        s3_secret_access_key: "{{ env_var('S3_SECRET_ACCESS_KEY') }}"
  target: dev
```

:::tip
The `path` attribute specifies where your DuckDB database file will be created. By default, this path is relative to your `profiles.yml` file location. If the database doesn't exist at the specified path, DuckDB will automatically create it.
:::

You can find more information about these connections profiles in the [dbt documentation](https://docs.getdbt.com/docs/core/connect-data-platform/connection-profiles).

## Configuration for MotherDuck

The only change needed for motherduck is the `path:` setting.

```yaml
default:
  outputs:
    dev:
      type: duckdb
      path:  "md:my_db?motherduck_token={{env_var('MOTHERDUCK_TOKEN')}}"
      threads: 4
      extensions:
        - httpfs
        - parquet
      settings:
        s3_region: my-aws-region
        s3_access_key_id: "{{ env_var('S3_ACCESS_KEY_ID') }}"
        s3_secret_access_key: "{{ env_var('S3_SECRET_ACCESS_KEY') }}"
  target: dev
```

This assumes that you have setup `MOTHERDUCK_TOKEN` as an environment variable. To know more about how to persist your authentication credentials, read [Authenticating to MotherDuck using an access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck#authentication-using-an-access-token). If you don't set the `motherduck_token` in your path, you will be prompted to authenticate to MotherDuck when running your `dbt run` command.

![auth_md](../img/auth_dbt.png)

Follow the instructions and it will export the service account variable for the current `dbt run` process.

DuckDB will parallelize a single write query as much as possible, so the gains from running more than one query at a time are minimal on the database side. That being said, our testing indicates that setting `threads: 4` typically leads to the best performance.

## Attaching Additional Databases

dbt-duckdb supports attaching additional databases to your main DuckDB connection, allowing you to work with multiple databases simultaneously. This is particularly useful when you need to reference data from different sources or when working with separate databases for different purposes.

### Configuration

To attach additional databases, add an `attach` section to your profile configuration:

```yaml
default:
  outputs:
    dev:
      type: duckdb
      path: "md:my_db?motherduck_token={{env_var('MOTHERDUCK_TOKEN')}}"
      threads: 4
      extensions:
        - httpfs
        - parquet
      attach:
        - path: "md:other_db?motherduck_token={{env_var('MOTHERDUCK_TOKEN')}}"
          alias: other_db
        - path: "md:third_db?motherduck_token={{env_var('MOTHERDUCK_TOKEN')}}"
          alias: third_db
      settings:
        s3_region: my-aws-region
        s3_access_key_id: "{{ env_var('S3_ACCESS_KEY_ID') }}"
        s3_secret_access_key: "{{ env_var('S3_SECRET_ACCESS_KEY') }}"
  target: dev
```

:::tip
The `alias` parameter is optional. If not specified, dbt-duckdb will use the filename (without extension) as the alias for the attached database.
:::

### Usage Example

Once you have attached databases, you can use the `database` config parameter in your dbt models to specify which database to write to:

```sql
-- models/my_model.sql
{{ config(database='other_db') }}

SELECT 
    id,
    name,
    created_at
FROM {{ ref('source_table') }}
WHERE created_at >= '2024-01-01'
```

You can also specify the database for source tables in your `sources.yml` file:

```yaml
# models/sources.yml
version: 2

sources:
  - name: external_data
    database: other_db
    tables:
      - name: customers
        description: Customer data from external database
      - name: orders
        description: Order data from external database
```

Then reference these sources in your models, from the correct database:

```sql
-- models/combined_data.sql
SELECT 
    c.customer_id,
    c.customer_name,
    o.order_id,
    o.order_date
FROM {{ source('external_data', 'customers') }} c
JOIN {{ source('external_data', 'orders') }} o ON c.customer_id = o.customer_id
```

## Extra resources

Take a look at our video guide on DuckDB and dbt provided below, along with the corresponding [demo tutorial on GitHub](https://github.com/mehd-io/dbt-duckdb-tutorial).

<iframe width="560" height="315" src="https://www.youtube.com/embed/asxGh2TrNyI" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
