# MotherDuck Documentation - Integrations > Scoped full Markdown content for Integrations. For other areas, start from https://motherduck.com/docs/llms.txt instead of loading unrelated documentation. ## Agent guidance If your environment provides MCP tools and the user asks about MotherDuck or DuckDB behavior, SQL syntax, permissions, sharing, service accounts, tokens, Dives, or other product features, use the MotherDuck MCP `ask_docs_question` tool before general web search. It answers from official DuckDB and MotherDuck documentation. For broad context, prefer the most specific scoped `llms-full.txt` file listed in https://motherduck.com/docs/llms.txt before loading the root `llms-full.txt`. The root file contains the complete public documentation corpus and is intended for bulk indexing or large-context workflows. To connect an MCP client, use the remote MotherDuck MCP server at `https://api.motherduck.com/mcp`. Setup instructions: https://motherduck.com/docs/key-tasks/ai-and-motherduck/mcp-setup. Tool reference: https://motherduck.com/docs/sql-reference/mcp/ask-docs-question. --- Source: https://motherduck.com/docs/integrations/cloud-storage/amazon-s3 # Amazon S3 > Amazon S3 is a Data Sources/Sinks service for storing and retrieving data. ## Configure S3 credentials You can safely store your Amazon S3 credentials in MotherDuck for convenience by creating a `SECRET` object using the [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret.md) command. Secrets are scoped to your user account and are not shared with other users in your organization. ### Create a SECRET object ### SQL ```sql -- to configure a secret manually: CREATE SECRET IN MOTHERDUCK ( TYPE S3, KEY_ID 'access_key', SECRET 'secret_key', REGION 'us-east-1', SCOPE 'my-bucket-path' ); ``` :::note When creating a secret using the `CONFIG` (default) provider, be aware that the credential might be temporary. If so, a `SESSION_TOKEN` field also needs to be set for the secret to work correctly. ::: ```sql -- to store a secret using your local AWS credentials (from `aws configure` or SSO): -- if you use AWS SSO, run `aws sso login --profile ` first CREATE SECRET aws_secret IN MOTHERDUCK ( TYPE S3, PROVIDER credential_chain, -- optional: add CHAIN and PROFILE for SSO credentials CHAIN 'sso', PROFILE '' ); ``` :::note Secret validation Starting with DuckDB v1.4.0, credentials are validated at secret creation time. If your credentials are not resolvable locally (for example, expired SSO tokens or missing `~/.aws/credentials`), the `CREATE SECRET` command will fail with a `Secret Validation Failure` error. The recommended fix is to use the correct `CHAIN` and `PROFILE` for your credential type (see the SSO example above). If you need to bypass local validation, you can add `VALIDATION 'none'`, but keep in mind that this skips the local check that confirms your credentials are valid before storing them in MotherDuck. ::: ```sql -- test the s3 credentials SELECT count(*) FROM 's3:///'; -- browse objects in a bucket or prefix FROM md_list_files('s3:///'); ``` ### Python ```python import duckdb con = duckdb.connect('md:') con.sql("CREATE SECRET IN MOTHERDUCK (TYPE S3, KEY_ID 'access_key', SECRET 'secret_key', REGION 'your_bucket_region')"); # testing that our s3 credentials work con.sql("SELECT count(*) FROM 's3:///'").show() # 42 ``` ### UI Click on your profile to access the `Settings` panel and click on `Secrets` menu. ![menu_1](./img/settings_access.png) ![menu_2](./img/settings_panel.png) Then click on `Add secret` in the secrets section. ![menu_3](./img/settings_secrets_panel.png) You will then be prompted to enter your Amazon S3 credentials. ![menu_3](./img/settings_secrets_pop_up.png) You can update your secret by executing [CREATE OR REPLACE SECRET](/sql-reference/motherduck-sql-reference/create-secret.md) command to overwrite your secret. ### Delete a SECRET object ### SQL You can use the same method above, using the [DROP SECRET](/sql-reference/motherduck-sql-reference/delete-secret.md) command. ```sql DROP SECRET ; ``` ### UI Click on your profile and access the `Settings` menu. Click on the bin icon to delete your current secrets. ![menu_4](./img/secrets_delete_4.png) ### Amazon S3 credentials as **temporary** secrets MotherDuck supports DuckDB syntax for providing S3 credentials. ```sql CREATE SECRET ( TYPE S3, KEY_ID 's3_access_key', SECRET 's3_secret_key', REGION 'us-east-1' ); ``` :::note Local/In-memory secrets are not persisted across sessions. ::: :::info Even temporary, in-memory secrets are available to MotherDuck's cloud execution engine when you connect your local DuckDB instance to MotherDuck. When you query {props.provider}, the query runs on MotherDuck's servers, not your local machine, and MotherDuck uses the best-matching secret to authenticate, whether it is stored locally or in MotherDuck. For more details, see [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret/#querying-with-secrets). ::: ## Troubleshooting For detailed troubleshooting steps, see our [AWS S3 Secrets Troubleshooting](/documentation/troubleshooting/aws-s3-secrets.md) guide. ## Browse buckets and files To inspect storage from SQL before querying specific files: ```sql FROM md_list_buckets_for_secret('__default_s3'); FROM md_list_files('s3:///'); FROM md_list_files('s3:////'); ``` See [`MD_LIST_BUCKETS_FOR_SECRET()`](/sql-reference/motherduck-sql-reference/md-list-buckets-for-secret) and [`MD_LIST_FILES()`](/sql-reference/motherduck-sql-reference/md-list-files) for details. --- Source: https://motherduck.com/docs/integrations/file-formats/apache-iceberg # Apache Iceberg > MotherDuck supports the Apache Iceberg format through the DuckDB Iceberg extension. The extension is loaded automatically when Iceberg functions or catalogs are used in your current MotherDuck session. ## Iceberg REST catalogs You can attach an [Iceberg REST catalog](https://duckdb.org/docs/stable/core_extensions/iceberg/iceberg_rest_catalogs) in your current MotherDuck session and query it with standard SQL. The attached catalog is local to the current client session: it does not become a persisted MotherDuck database or workspace attachment. ### Authentication Create a secret with your catalog credentials: ```sql -- OAuth2 CREATE SECRET my_iceberg_secret ( TYPE iceberg, CLIENT_ID 'my_client_id', CLIENT_SECRET 'my_client_secret', OAUTH2_SERVER_URI 'https://my-catalog.example.com/v1/oauth/tokens' ); -- Bearer token CREATE SECRET my_iceberg_secret ( TYPE iceberg, TOKEN 'my_bearer_token' ); ``` ### Attaching a catalog ```sql ATTACH 'my_warehouse' AS my_iceberg ( TYPE iceberg, SECRET my_iceberg_secret, ENDPOINT 'https://my-catalog.example.com' ); ``` :::note This `ATTACH` adds the Iceberg catalog to your current client session only. Re-attach it in each new session. Use `DETACH my_iceberg;` to remove it from the current session. ::: Once attached, browse and query tables using standard SQL: ```sql -- List schemas SHOW SCHEMAS IN my_iceberg; -- Query a table SELECT * FROM my_iceberg.my_schema.my_table; ``` ### Session-scoped write operations Within the attached session, DuckDB's Iceberg REST catalog support includes operations such as creating schemas and tables and inserting data: ```sql CREATE SCHEMA my_iceberg.analytics; CREATE TABLE my_iceberg.analytics.events ( event_id INT, event_type VARCHAR, created_at TIMESTAMP ); INSERT INTO my_iceberg.analytics.events VALUES (1, 'page_view', '2025-01-15 10:30:00'); ``` ### Additional DuckDB Iceberg catalog features DuckDB documents additional Iceberg REST catalog capabilities such as time travel for attached catalogs. Refer to the upstream documentation for the current support matrix and syntax details. ```sql SELECT * FROM my_iceberg.my_schema.my_table AT (VERSION => 1234567890); SELECT * FROM my_iceberg.my_schema.my_table AT (TIMESTAMP => TIMESTAMP '2025-01-15 10:30:00'); ``` ### Limitations - Attached Iceberg REST catalogs are local to the current client session and are not persisted as MotherDuck workspace attachments - `UPDATE` and `DELETE` only work on unpartitioned, unsorted tables - Only merge-on-read semantics (no copy-on-write) - `MERGE INTO` and `ALTER TABLE` are not supported - Reading from REST catalogs is limited to S3, S3 Tables, and GCS storage backends For more details, see the [DuckDB Iceberg REST catalog documentation](https://duckdb.org/docs/stable/core_extensions/iceberg/iceberg_rest_catalogs). ## Scanning individual Iceberg tables Use `iceberg_scan` to query individual Iceberg tables directly by path, without attaching a catalog: ```sql SELECT count(*) FROM iceberg_scan('s3://my-bucket/my-iceberg-table', allow_moved_paths = true); ``` :::note To query data in a secure Amazon S3 bucket, you will need to configure your [Amazon S3 credentials](../../cloud-storage/amazon-s3). ::: ### `iceberg_scan` parameters | Parameter | Type | Default | Description | | :--- | :--- | :--- | :--- | | `allow_moved_paths` | `BOOLEAN` | `false` | Allow scanning Iceberg tables that have been moved or relocated | | `metadata_compression_codec` | `VARCHAR` | `''` | Set to `'gzip'` to read gzip-compressed metadata files | | `snapshot_from_id` | `UBIGINT` | `NULL` | Query a specific snapshot by ID | | `snapshot_from_timestamp` | `TIMESTAMP` | `NULL` | Query the latest snapshot as of a given timestamp | | `version` | `VARCHAR` | `'?'` | Explicit version string, hint file path, or `'?'` for auto-detection | | `version_name_format` | `VARCHAR` | `'v%s%s.metadata.json,%s%s.metadata.json'` | Custom metadata filename pattern | ### Time travel with `iceberg_scan` ```sql -- Query a specific snapshot SELECT * FROM iceberg_scan('s3://my-bucket/my-iceberg-table', allow_moved_paths = true, snapshot_from_id = 1234567890); -- Query as of a timestamp SELECT * FROM iceberg_scan('s3://my-bucket/my-iceberg-table', allow_moved_paths = true, snapshot_from_timestamp = TIMESTAMP '2025-01-15 10:30:00'); ``` ### Metadata and snapshot functions Use `iceberg_metadata` to inspect manifest entries (file paths, formats, record counts): ```sql SELECT * FROM iceberg_metadata('s3://my-bucket/my-iceberg-table', allow_moved_paths = true); ``` Use `iceberg_snapshots` to list available snapshots: ```sql SELECT * FROM iceberg_snapshots('s3://my-bucket/my-iceberg-table'); ``` ### Example with sample dataset ```sql SELECT count(*) FROM iceberg_scan('s3://us-prd-motherduck-open-datasets/iceberg/lineitem_iceberg', allow_moved_paths = true); ``` --- Source: https://motherduck.com/docs/integrations/cloud-storage/azure-blob-storage # Azure Blob Storage > Azure Blob is a Data Sources/Sinks service for storing and retrieving data. ## Configure Azure Blob Storage credentials You can safely store your Azure Blob Storage credentials in MotherDuck for convenience by creating a `SECRET` object using the [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret.md) command. :::note See [Azure docs](https://learn.microsoft.com/en-gb/azure/storage/common/storage-configure-connection-string#configure-a-connection-string-for-an-azure-storage-account) to find the correct connection string format. ::: ### Create a SECRET object ### SQL ```sql -- to configure a secret manually: CREATE SECRET IN MOTHERDUCK ( TYPE AZURE, CONNECTION_STRING '[your_connection_string]' ); ``` ```sql -- to store a secret configured through `az configure`: CREATE SECRET az_secret IN MOTHERDUCK ( TYPE AZURE, PROVIDER credential_chain, ACCOUNT_NAME 'some-account' ); ``` ```sql -- test the azure credentials SELECT count(*) FROM 'azure://[container]/[file]' SELECT * FROM 'azure://[container]/*.csv'; -- browse objects in a container FROM md_list_files('azure://[container]/', limit := 50); ``` ### Python ```python import duckdb con = duckdb.connect('md:') con.sql("CREATE SECRET IN MOTHERDUCK (TYPE AZURE, CONNECTION_STRING '[your_connection_string]')"); # testing that our Azure credentials work con.sql("SELECT count(*) FROM 'azure://[container]/[file]'").show() con.sql("SELECT * FROM 'azure://[container]/*.csv'").show() ``` ### UI Click on your profile to access the `Settings` panel and click on `Secrets` menu. ![menu_1](./img/settings_access.png) ![menu_2](./img/settings_panel.png) Then click on `Add secret` in the secrets section. ![menu_3](./img/settings_secrets_panel.png) You will then be prompted to enter your Amazon S3 credentials. ![menu_3](./img/secrets_add_azure.png) ### Delete a SECRET object ### SQL You can use the same method above, using the [DROP SECRET](/sql-reference/motherduck-sql-reference/delete-secret.md) command. ```sql DROP SECRET ; ``` ### UI Click on your profile and access the `Settings` menu. Click on the bin icon to delete the secret. ![menu_4](./img/secrets_delete_azure.png) ### Azure credentials as **temporary** secrets MotherDuck supports DuckDB syntax for providing Azure credentials. ```sql CREATE SECRET ( TYPE AZURE, CONNECTION_STRING '[your_connection_string]' ); ``` or if you use the `az configure` command to store your credentials in the `az` CLI. ```sql CREATE SECRET az_secret ( TYPE AZURE, PROVIDER credential_chain, ACCOUNT_NAME 'some-account' ); ``` :::note Local/In-memory secrets are not persisted across sessions. ::: :::info Even temporary, in-memory secrets are available to MotherDuck's cloud execution engine when you connect your local DuckDB instance to MotherDuck. When you query {props.provider}, the query runs on MotherDuck's servers, not your local machine, and MotherDuck uses the best-matching secret to authenticate, whether it is stored locally or in MotherDuck. For more details, see [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret/#querying-with-secrets). ::: ## Browse files in Azure Blob Storage To inspect a container before querying individual files, use [`MD_LIST_FILES()`](/sql-reference/motherduck-sql-reference/md-list-files): ```sql FROM md_list_files('azure://[container]/'); FROM md_list_files('az://[container]/path/'); ``` --- Source: https://motherduck.com/docs/integrations/databases/bigquery # BigQuery > Load data from Google BigQuery into MotherDuck using the duckdb-bigquery community extension. BigQuery is Google Cloud's fully-managed, serverless data warehouse that lets you run SQL queries on the processing power of Google's infrastructure. To load data from BigQuery into MotherDuck, use the [`duckdb-bigquery` community extension](https://github.com/hafenkran/duckdb-bigquery). It reads through the BigQuery Storage Read API with parallel streams, filter pushdown, and Arrow compression — and loads results straight into DuckDB or MotherDuck without any glue code. ## Prerequisites - DuckDB installed (using the CLI or Python). - Access to a GCP project with BigQuery enabled. - Valid Google Cloud credentials, provided through one of: - the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, or - `gcloud auth application-default login`. Minimum required IAM roles: - `BigQuery Data Editor` - `BigQuery Job User` ## Loading data from BigQuery into MotherDuck The following examples use the [DuckDB CLI](/getting-started/interfaces/connect-query-from-duckdb-cli.mdx), but you can use any [DuckDB or MotherDuck client](/getting-started/interfaces/interfaces.mdx). ### Install and load the extension ```sql INSTALL bigquery FROM community; LOAD bigquery; ``` ### Attach a BigQuery project To read data from your project, attach it like you would attach a DuckDB database: ```sql ATTACH 'project=my-gcp-project' AS bq (TYPE bigquery, READ_ONLY); ``` To read from a public dataset, use the following syntax: ```sql ATTACH 'project=bigquery-public-data dataset=pypi billing_project=my-gcp-project' AS bq_public (TYPE bigquery, READ_ONLY); ``` ### Query a table Once attached, you can query BigQuery tables directly using standard SQL syntax: ```sql SELECT * FROM bq.dataset_name.table_name LIMIT 10; ``` Behind the scenes, this uses `bigquery_scan`. The extension also exposes two functions you can call directly: **`bigquery_scan`** — for direct reads from a single table: ```sql SELECT * FROM bigquery_scan('my_gcp_project.my_dataset.my_table'); ``` **`bigquery_query`** — for custom [GoogleSQL](https://cloud.google.com/bigquery/docs/introduction-sql), including views and external tables that the Storage Read API can't access on its own: ```sql SELECT * FROM bigquery_query( 'my_gcp_project', 'SELECT * FROM `my_gcp_project.my_dataset.my_table` WHERE column = "value"' ); ``` Both functions share the same Arrow scan engine. For very large reads, you can enable parallel read streams by relaxing DuckDB's default ordering guarantee: ```sql SET preserve_insertion_order = FALSE; ``` ### Load data into MotherDuck Verify the `motherduck_token` environment variable is set, then attach MotherDuck: ```sql ATTACH 'md:'; ``` Use `CREATE TABLE ... AS` to create a new table, or `INSERT INTO ... SELECT` to append data to an existing one: ```sql CREATE DATABASE IF NOT EXISTS pypi_playground; USE pypi_playground; CREATE TABLE IF NOT EXISTS duckdb_sample AS SELECT * FROM bq_public.pypi.file_downloads WHERE project = 'duckdb' AND timestamp = TIMESTAMP '2025-05-26 00:00:00' LIMIT 100; ``` --- Source: https://motherduck.com/docs/integrations/cloud-storage/cloudflare-r2 # Cloudflare R2 > Cloudflare R2 is a Data Sources/Sinks service for storing and retrieving data. ## Configure Cloudflare R2 credentials You can safely store your Cloudflare R2 credentials in MotherDuck for convenience by creating a `SECRET` object using the [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret.md) command. :::note See [Cloudflare docs](https://developers.cloudflare.com/r2/api/s3/tokens/) to create a Cloudflare access token. ::: ### Create a SECRET object ### SQL ```sql CREATE SECRET IN MOTHERDUCK ( TYPE R2, KEY_ID 'your_key_id', SECRET 'your_secret_key', ACCOUNT_ID 'your_account_id' ); ``` :::note The `ACCOUNT_ID` can be found when generating the API token on the endpoint URL `https://.r2.cloudflarestorage.com`. ::: :::note R2 buckets are regionless, so you do not need to specify a `REGION` parameter. If provided, it defaults to `auto`. ::: ```sql -- test the R2 credentials SELECT count(*) FROM 'r2://[bucket]/[file]' ``` ### Python ```python import duckdb con = duckdb.connect('md:') con.sql("CREATE SECRET IN MOTHERDUCK ( TYPE R2, KEY_ID 'your_key_id', SECRET 'your_secret_key', ACCOUNT_ID 'your_account_id' )"); # testing that our R2 credentials work con.sql("SELECT count(*) FROM 'r2://[bucket]/[file]'").show() ``` ### UI Click on your profile to access the `Settings` panel and click on `Secrets` menu. ![menu_1](./img/settings_access.png) ![menu_2](./img/settings_panel.png) Then click on `Add secret` in the secrets section. ![menu_3](./img/settings_secrets_panel.png) Select the Secret Type `R2` and fill in the required fields. ### Delete a SECRET object ### SQL You can use the same method above, using the [DROP SECRET](/sql-reference/motherduck-sql-reference/delete-secret.md) command. ```sql DROP SECRET ; ``` ### UI Click on your profile and access the `Settings` menu. Click on the bin icon to delete the secret. ![menu_4](./img/secrets_delete_azure.png) ### R2 credentials as **temporary** secrets MotherDuck supports DuckDB syntax for providing R2 credentials. ```sql CREATE SECRET ( TYPE R2, KEY_ID 'your_key_id', SECRET 'your_secret_key', ACCOUNT_ID 'your_account_id' ); ``` :::note Local/In-memory secrets are not persisted across sessions. ::: :::info Even temporary, in-memory secrets are available to MotherDuck's cloud execution engine when you connect your local DuckDB instance to MotherDuck. When you query {props.provider}, the query runs on MotherDuck's servers, not your local machine, and MotherDuck uses the best-matching secret to authenticate, whether it is stored locally or in MotherDuck. For more details, see [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret/#querying-with-secrets). ::: --- Source: https://motherduck.com/docs/integrations/serverless-compute/cloudflare-workers # Cloudflare Workers > Cloudflare Workers is an edge compute platform for running serverless functions globally. Workers can connect to MotherDuck through the Postgres endpoint using the pg npm package. ## Connection Workers connect to MotherDuck using a standard Postgres connection string: ```typescript import { Client } from "pg"; const connectionString = `postgresql://user:${MOTHERDUCK_TOKEN}@pg.us-east-1-aws.motherduck.com:5432/${DATABASE}?sslmode=require`; const client = new Client({ connectionString }); await client.connect(); ``` Key requirements: - The `nodejs_compat` compatibility flag must be enabled in `wrangler.toml` — it provides the `node:net` module that `pg` needs for TCP connections. - Use `?sslmode=require`. MotherDuck's Postgres endpoint only accepts encrypted connections, and Cloudflare Workers delegates certificate verification to the runtime's TLS stack. - Store your MotherDuck token as a [Wrangler secret](https://developers.cloudflare.com/workers/configuration/secrets/) — never commit tokens to source code. ## Connection pooling with Hyperdrive For production workloads, [Cloudflare Hyperdrive](https://developers.cloudflare.com/hyperdrive/) provides built-in connection pooling. This reduces latency by reusing connections across Worker invocations instead of opening a new connection per request. ```toml # wrangler.toml [[hyperdrive]] binding = "MD_HYPERDRIVE" id = "" ``` ```typescript const client = new Client({ connectionString: env.MD_HYPERDRIVE.connectionString, }); ``` For local development with `wrangler dev`, add a `localConnectionString` to the Hyperdrive binding or export `CLOUDFLARE_HYPERDRIVE_LOCAL_CONNECTION_STRING_MD_HYPERDRIVE`. That lets you test the Worker locally against MotherDuck before deploying the Hyperdrive-backed version. ## Tutorial For a step-by-step guide to building and deploying a Cloudflare Worker that queries MotherDuck, see [Connect from Cloudflare Workers](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/cloudflare-workers). --- Source: https://motherduck.com/docs/integrations/transformation/dbt # dbt with DuckDB and MotherDuck > Data Build Tool (dbt) is an open-source command-line tool that enables data analysts and engineers to transform data in their warehouses by defining SQL in model files. It bring the composability of programming languages to SQL while automating the mechanics of updating tables. [dbt-duckdb](https://github.com/jwills/dbt-duckdb) is the adapter which allows dbt to use DuckDB and MotherDuck. The adapter also supports [DuckDB extensions](https://duckdb.org/docs/extensions/overview) and any of the additional [DuckDB configuration options](https://duckdb.org/docs/sql/configuration). ## Installation Since dbt is a Python library, it can be installed through pip: ```pip3 install dbt-duckdb``` will install both `dbt` and `duckdb`. ## Configuration for Local DuckDB This configuration allows you to connect to S3 and perform read/write operations on Parquet files using an AWS access key and secret. `profiles.yml` ```yaml default: outputs: dev: type: duckdb path: /tmp/dbt.duckdb threads: 4 extensions: - httpfs - parquet settings: s3_region: my-aws-region s3_access_key_id: "{{ env_var('S3_ACCESS_KEY_ID') }}" s3_secret_access_key: "{{ env_var('S3_SECRET_ACCESS_KEY') }}" target: dev ``` :::tip The `path` attribute specifies where your DuckDB database file will be created. By default, this path is relative to your `profiles.yml` file location. If the database doesn't exist at the specified path, DuckDB will automatically create it. ::: You can find more information about these connections profiles in the [dbt documentation](https://docs.getdbt.com/docs/core/connect-data-platform/connection-profiles). ## Configuration for MotherDuck The only change needed for motherduck is the `path:` setting. ```yaml default: outputs: dev: type: duckdb path: "md:my_db?motherduck_token={{env_var('MOTHERDUCK_TOKEN')}}" threads: 4 extensions: - httpfs - parquet settings: s3_region: my-aws-region s3_access_key_id: "{{ env_var('S3_ACCESS_KEY_ID') }}" s3_secret_access_key: "{{ env_var('S3_SECRET_ACCESS_KEY') }}" target: dev ``` This assumes that you have setup `MOTHERDUCK_TOKEN` as an environment variable. To know more about how to persist your authentication credentials, read [Authenticating to MotherDuck using an access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck#authentication-using-an-access-token). If you don't set the `motherduck_token` in your path, you will be prompted to authenticate to MotherDuck when running your `dbt run` command. ![auth_md](../img/auth_dbt.png) Follow the instructions and it will export the service account variable for the current `dbt run` process. DuckDB will parallelize a single write query as much as possible, so the gains from running more than one query at a time are minimal on the database side. That being said, our testing indicates that setting `threads: 4` typically leads to the best performance. ## Attaching Additional Databases dbt-duckdb supports attaching additional databases to your main DuckDB connection, allowing you to work with multiple databases simultaneously. This is particularly useful when you need to reference data from different sources or when working with separate databases for different purposes. ### Configuration To attach additional databases, add an `attach` section to your profile configuration: ```yaml default: outputs: dev: type: duckdb path: "md:my_db?motherduck_token={{env_var('MOTHERDUCK_TOKEN')}}" threads: 4 extensions: - httpfs - parquet attach: - path: "md:other_db?motherduck_token={{env_var('MOTHERDUCK_TOKEN')}}" alias: other_db - path: "md:third_db?motherduck_token={{env_var('MOTHERDUCK_TOKEN')}}" alias: third_db settings: s3_region: my-aws-region s3_access_key_id: "{{ env_var('S3_ACCESS_KEY_ID') }}" s3_secret_access_key: "{{ env_var('S3_SECRET_ACCESS_KEY') }}" target: dev ``` :::tip The `alias` parameter is optional. If not specified, dbt-duckdb will use the filename (without extension) as the alias for the attached database. ::: ### Usage Example Once you have attached databases, you can use the `database` config parameter in your dbt models to specify which database to write to: ```sql -- models/my_model.sql {{ config(database='other_db') }} SELECT id, name, created_at FROM {{ ref('source_table') }} WHERE created_at >= '2024-01-01' ``` You can also specify the database for source tables in your `sources.yml` file: ```yaml # models/sources.yml version: 2 sources: - name: external_data database: other_db tables: - name: customers description: Customer data from external database - name: orders description: Order data from external database ``` Then reference these sources in your models, from the correct database: ```sql -- models/combined_data.sql SELECT c.customer_id, c.customer_name, o.order_id, o.order_date FROM {{ source('external_data', 'customers') }} c JOIN {{ source('external_data', 'orders') }} o ON c.customer_id = o.customer_id ``` ## Extra resources Take a look at our video guide on DuckDB and dbt provided below, along with the corresponding [demo tutorial on GitHub](https://github.com/mehd-io/dbt-duckdb-tutorial). --- Source: https://motherduck.com/docs/integrations/file-formats/delta-lake # Delta Lake > MotherDuck supports querying data in the Delta Lake format. The Delta DuckDB extension is loaded automatically when any of the supported Delta Lake functions are called. ## Delta function | Function Name | Description | Supported parameters | :--- | :--- | :--- | | `delta_scan` | Query Delta Lake data | All the parquet_scan parameters plus delta_file_number. :::note The available functions are only for reading Delta Lake data. Creating or updating data in Delta format is not yet supported. ::: ## Examples ```sql -- query data SELECT COUNT(*) FROM delta_scan('path-to-delta-folder'); -- query data with parameters FROM delta_scan('path-to-delta-folder', delta_file_number=1, file_row_number=1); ``` ### Query Delta data stored in S3 :::warning At the moment, querying Delta tables stored in Amazon S3 from **public** buckets is not supported. ::: [Create a S3 secret](/sql-reference/motherduck-sql-reference/create-secret.md) in MotherDuck using the secret manager: ```sql CREATE SECRET IN MOTHERDUCK ( TYPE S3, KEY_ID 's3_access_key', SECRET 's3_secret_key', REGION 's3-region' ); ``` Query Delta data stored in S3: ```sql SELECT count(*) FROM delta_scan('s3:///'); ``` :::note To query data in an Amazon S3 bucket, you will need to configure your [Amazon S3 credentials](../../cloud-storage/amazon-s3). ::: Example using MotherDuck Delta sample dataset. ```sql SELECT COUNT(*) FROM delta_scan('s3://us-prd-motherduck-open-datasets/file_format_demo/delta_lake/dat/out/reader_tests/generated/basic_append/delta'); ``` --- Source: https://motherduck.com/docs/integrations/file-formats/ducklake # DuckLake > DuckLake is an integrated data lake and catalog format for large scale data analytics. ::::note MotherDuck supports DuckDB 1.5.3. In **US East (N. Virginia) -** `us-east-1`, MotherDuck is compatible with client versions 1.4.0 through 1.5.3. In **US West (Oregon) -** `us-west-2`, MotherDuck supports client versions 1.4.1 through 1.5.3. In **Europe (Frankfurt) -** `eu-central-1`, MotherDuck supports client version 1.4.1 through 1.5.3. :::: [DuckLake](https://ducklake.select) is an integrated data lake and catalog format. DuckLake delivers advanced data lake features without traditional lakehouse complexity by using Parquet files and a SQL database. MotherDuck provides two main options for creating and integrating with DuckLake databases: - **[Fully managed](#creating-a-fully-managed-ducklake-database)**: Create a DuckLake database where MotherDuck manages both data storage and metadata - **[Bring your own bucket (BYOB)](#bring-your-own-bucket)**: Connect your own S3 or R2 bucket for data storage with: - **[MotherDuck compute + MotherDuck catalog](#using-motherduck-compute)**: Use MotherDuck for both compute and catalog services - **[Own compute + MotherDuck catalog](#using-own-compute)**: Use your own DuckDB client for compute while MotherDuck provides catalog services ## Creating a fully managed DuckLake database Create a fully managed DuckLake with the following command: ```sql CREATE DATABASE my_ducklake (TYPE DUCKLAKE); ``` MotherDuck stores both data and metadata in MotherDuck-managed storage (not externally accessible at the moment), providing a streamlined way to evaluate DuckLake functionality. The `my_ducklake` database can be accessed like any other MotherDuck database — including over the [Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/) for clients that don't use the DuckDB SDK. To inspect the metadata catalog backing the DuckLake, see [Performing metadata operations on a DuckLake](#performing-metadata-operations-on-a-ducklake). You can attach the DuckLake metadata with: ```sql ATTACH 'md:__ducklake_metadata_' AS ; ``` ::::note The metadata database can only be attached by the database owner. :::: ## Data inlining Data inlining is an optimization feature that stores small data changes directly in the metadata catalog rather than creating individual Parquet files for every insert operation. This eliminates the overhead of creating small Parquet files while maintaining full query and update capabilities. ### Creating a DuckLake database with custom inlining To create a (fully managed) DuckLake database with a custom inlining threshold: ```sql CREATE DATABASE my_ducklake ( TYPE DUCKLAKE, DATA_INLINING_ROW_LIMIT 100 ); ``` This configuration will inline all inserts with fewer than 100 rows directly into the metadata catalog. ### How data inlining works Data inlining is **enabled by default** with a threshold of 10 rows. Any insert writing fewer than 10 rows is automatically stored inline in the metadata catalog rather than creating a Parquet file. You can customize the threshold with the `DATA_INLINING_ROW_LIMIT` parameter. For example, if you set it to 100, inserts with fewer than 100 rows are stored inline, while inserts with 100 or more rows create Parquet files. Set it to 0 to disable inlining. The inlining threshold applies **per insert operation**. For example, if the limit is set to 100, four separate inserts of 50 rows each will all be stored inline (200 total rows), because each individual insert is below the threshold. When an insert exceeds the threshold, that insert writes directly to a Parquet file, but any previously inlined data remains in the metadata catalog. Larger inserts do not automatically flush existing inlined data. ### Flushing inlined data Because inlined data can accumulate, it is good practice to periodically flush it to parquet storage using the `ducklake_flush_inlined_data` function: ```sql -- Flush inlined data for a specific table SELECT ducklake_flush_inlined_data('my_ducklake.my_schema.my_table'); -- Flush all inlined data in a schema SELECT ducklake_flush_inlined_data('my_ducklake.my_schema'); -- Flush all inlined data in the database SELECT ducklake_flush_inlined_data('my_ducklake'); ``` For workloads with frequent small inserts, schedule regular flushes to prevent excessive inlined data accumulation. > Automatic background flush operations are in active development. ### Configuring inlining You can override the database-level inlining threshold for individual tables: ```sql -- Disable inlining for a specific table CALL my_ducklake.set_option('data_inlining_row_limit', 0, table_name => 'my_table'); -- Set a custom threshold for a specific table CALL my_ducklake.set_option('data_inlining_row_limit', 50, table_name => 'my_table'); ``` You can also set a session-level default that applies to new tables: ```sql SET ducklake_default_data_inlining_row_limit = 0; ``` ## DuckLake configuration DuckLake provides configuration options that you can set at the database or table level using the `set_option` function. For example, you can adjust the `parquet_row_group_size` to control how data is organized in Parquet files: ```sql -- Set row group size for the entire database CALL my_ducklake.set_option('parquet_row_group_size', 50000); -- Set row group size for a specific table CALL my_ducklake.set_option('parquet_row_group_size', 50000, table_name => 'my_table'); ``` Note that calls the `set_option` take precedence over configuration passed when creating the database. ```sql CREATE DATABASE my_ducklake ( TYPE DUCKLAKE, DATA_INLINING_ROW_LIMIT 100 -- sets database level inlining row limit to 100 ); -- overrides the prior value and sets database level row limit to 250 CALL my_ducklake.set_option('data_inlining_row_limit', 250); -- overrides prior value _ONLY_ for `my_table`. CALL my_ducklake.set_option('data_inlining_row_limit', 0, table_name => 'my_table'); ``` For the full list of available configuration options, see the [DuckLake configuration reference](https://ducklake.select/docs/stable/duckdb/usage/configuration#setting-config-values-1). ## Bring your own bucket (BYOB) {#bring-your-own-bucket} You can use MotherDuck as a compute engine and managed DuckLake catalog while connecting your own [AWS S3](/integrations/cloud-storage/amazon-s3/) or [Cloudflare R2](/integrations/cloud-storage/cloudflare-r2/) object store for data storage. Additionally, you can bring your own compute (BYOC) using your DuckDB client to query and write data directly to your DuckLake. ### Setup Configure a custom data path when creating your DuckLake to use your own bucket. ### AWS S3 :::note Your S3 bucket must be in the same region as your MotherDuck organization: - **US organizations**: `us-east-1` (US East - N. Virginia) - **EU organizations**: `eu-central-1` (Europe - Frankfurt) Other AWS regions are not supported for BYOB. This does not affect reading files directly from S3-compatible object stores (for example, CSV or Parquet): you can still query data from buckets in any region. ::: ```sql CREATE DATABASE my_ducklake ( TYPE DUCKLAKE, DATA_PATH 's3://mybucket/my_optional_path/' ); ``` ### Cloudflare R2 :::tip Cloudflare R2 buckets are not bound to a specific region, so you can use them with any MotherDuck organization regardless of region. When creating your R2 bucket, set a [location hint](https://developers.cloudflare.com/r2/reference/data-location/) close to your MotherDuck region to minimize latency (for example, `enam` for US organizations, `weur` for EU organizations). ::: ```sql CREATE DATABASE my_ducklake ( TYPE DUCKLAKE, DATA_PATH 'r2://mybucket/my_optional_path/' ); ``` Create a corresponding secret in MotherDuck to allow MotherDuck compute to access your bucket. See [Cloud Storage integrations](/integrations/cloud-storage/) for instructions on creating secrets for your provider. You can then create DuckLake tables as you would with a standard DuckDB database using either MotherDuck or local compute as shown in the examples below. #### Required permissions for DuckLake ### AWS S3 The minimum required IAM permissions are: ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:ListBucket" ], "Resource": "${s3_bucket_arn}" }, { "Effect": "Allow", "Action": [ "s3:PutObject", "s3:GetObject", "s3:DeleteObject" ], "Resource": "${s3_bucket_arn}/*" } ] } ``` ### Cloudflare R2 Your R2 API token needs the following permissions on the bucket: - **Object Read** - read data files - **Object Write** - write and delete data files - **Bucket List** - list objects in the bucket See the [Cloudflare R2 API tokens documentation](https://developers.cloudflare.com/r2/api/s3/tokens/) for instructions on creating an API token. ### Using MotherDuck compute Connect to MotherDuck: ```sql ./duckdb md: ``` Create your first DuckLake table from an hosted Parquet file: ```sql CREATE TABLE my_ducklake.air_quality AS SELECT * FROM 'https://us.data.motherduck.com/who_ambient_air_quality/parquet/who_ambient_air_quality_database_version_2024.parquet'; ``` Query using MotherDuck: ```sql SELECT year, AVG(pm25_concentration::double) AS avg_pm25, AVG(pm10_concentration::double) AS avg_pm10, AVG(no2_concentration::double) AS avg_no2 FROM my_ducklake.air_quality WHERE city = 'Berlin/DEU' GROUP BY year ORDER BY year DESC; ``` ### Using own compute To use your own compute (for example, your DuckDB client), you must: 1. Ensure you have appropriate credentials in your compute environment to read/write to your defined `DATA_PATH` (specified at database creation) 2. Attach the DuckLake using the `ducklake:` prefix so compute runs locally against the MotherDuck-managed metadata catalog Create a secret in your compute environment: ### AWS S3 If you have authenticated using `aws sso login`: ```sql CREATE OR REPLACE SECRET my_secret IN MOTHERDUCK ( TYPE S3, PROVIDER credential_chain, CHAIN 'sso', PROFILE '' ); ``` :::note Run `aws sso login --profile ` before creating the secret to refresh your SSO token. You may need to restart your DuckDB CLI session after logging in for the credentials to be picked up. Starting with DuckDB v1.4.0, credentials are validated at creation time: if validation fails, confirm your SSO session is active and that you are using the correct `CHAIN` and `PROFILE`. ::: Alternatively, provide static AWS keys: ```sql CREATE SECRET my_secret IN MOTHERDUCK ( TYPE S3, KEY_ID 'my_s3_access_key', SECRET 'my_s3_secret_key', REGION 'my-bucket-region', SCOPE 'my-bucket-path' ); ``` ### Cloudflare R2 ```sql CREATE SECRET my_secret IN MOTHERDUCK ( TYPE R2, KEY_ID 'your_r2_access_key', SECRET 'your_r2_secret_key', ACCOUNT_ID 'your_account_id' ); ``` Attach the DuckLake to your DuckDB session, pointing at the MotherDuck-managed metadata catalog and your data bucket: ```sql ATTACH 'ducklake:md:__ducklake_metadata_' AS (DATA_PATH ''); ``` This tells DuckLake to: - Use `ducklake:md:__ducklake_metadata_` as the metadata catalog (through MotherDuck) - Use `` for reading and writing data files - Run all compute locally on your DuckDB client rather than on MotherDuck The `ducklake:` prefix is what enables local compute. Attaching with `ATTACH 'md:__ducklake_metadata_'` (without the prefix) gives you the metadata catalog for inspection only -- see [Performing metadata operations on a DuckLake](#performing-metadata-operations-on-a-ducklake). Create a table using your own compute: ```sql CREATE TABLE .air_quality AS SELECT * FROM 'https://us.data.motherduck.com/who_ambient_air_quality/parquet/who_ambient_air_quality_database_version_2024.parquet'; ``` With this configuration, your own compute can directly access or write data to your DuckLake (assuming appropriate credentials are configured). Data uploaded using your own compute will appear in the MotherDuck catalog and be queryable as a standard MotherDuck database. ## What's new in DuckLake 1.0 DuckLake 1.0 is the first production-ready release, with a stable specification and backward-compatibility guarantees. Highlights include: - **Stable specification and multi-engine support**: The DuckLake 1.0 spec is stable with backward-compatibility guarantees going forward, and is designed to be used across multiple query engines. - **Full inlining for inserts, updates, and deletes**: Updates now join inserts and deletes in being inlined into the metadata catalog when under the row threshold (10 by default). Customize with `DATA_INLINING_ROW_LIMIT`. - **Clustering with `SET SORTED BY`**: Declare sort keys on columns or arbitrary SQL expressions. DuckLake applies the sort during compaction and inline flush (and optionally on insert), improving row-group and file pruning for filtered queries. - **Bucket partitioning**: Iceberg-compatible `bucket(N, column)` transforms for high-cardinality columns, giving a middle ground between traditional partitioning and avoiding the small-files problem. - **GEOMETRY enhancements**: Per-file bounding-box statistics enable file pruning on spatial filters, and `GEOMETRY` can now be nested inside `STRUCT`, `LIST`, and `MAP`. - **VARIANT type with shredded statistics**: `VARIANT` sub-fields receive file-level statistics, enabling filter pushdown and faster selective queries over semi-structured data. - **Deletion vectors (experimental)**: Iceberg v3-compatible deletion vectors, stored as Puffin files as an alternative to delete files. See the [DuckLake 1.0 release post](https://ducklake.select/2026/04/13/ducklake-10/) and the [MotherDuck announcement](https://motherduck.com/blog/announcing-ducklake-1-0-on-motherduck/) for more detail. ## Additional DuckLake features DuckLake on MotherDuck also supports: - **Stats-only `COUNT(*)`**: Simple `COUNT(*)` queries are answered directly from metadata statistics without scanning data files. - **TopN file pruning**: `LIMIT` queries with an `ORDER BY` skip data files that fall outside the requested range, making paginated and top-N queries faster. - **Expressions as default values**: Column defaults can use expressions like `now()`, not only literal values. - **Macros**: DuckLake catalogs can store [macros](https://duckdb.org/docs/sql/statements/create_macro.html). ## Performing metadata operations on a DuckLake DuckLake databases provide additional metadata operations for introspection and maintenance. These operations can be performed from both MotherDuck and your own compute environments. For example, you can [list the snapshots](https://ducklake.select/docs/stable/duckdb/usage/snapshots) backing your DuckLake. Every DuckLake database in MotherDuck has a corresponding **metadata database** that stores internal state, including schema definitions, snapshots, file mappings, and more. To inspect this metadata catalog directly from any DuckDB session -- this works for both fully managed and BYOB databases: ```sql ATTACH 'md:__ducklake_metadata_' AS ; ``` ::::note The metadata database can only be attached by the database owner. This form attaches the metadata catalog for inspection only. To run DuckLake compute locally against your data, use the `ducklake:` ATTACH form shown in [Using own compute](#using-own-compute). :::: ## Current limitations - **Limited sharing options**: Read-only sharing is supported through the [existing share functionality](/key-tasks/sharing-data/), restricted to auto-update shares only - **Single-account write access**: Write permissions are limited to one account per database. This account can perform multiple concurrent writes, as long as they are append-only. If multiple queries attempt to update or delete from the same table concurrently, only the first to commit will succeed. Concurrent DDL operations are also not allowed. Support for *multi-account* write access is planned for a future release. - **Limited BYOB storage providers**: Bring Your Own Bucket is supported for [AWS S3](/integrations/cloud-storage/amazon-s3/) and [Cloudflare R2](/integrations/cloud-storage/cloudflare-r2/) storage. Other clouds are under consideration for future support. :::info For multiple concurrent readers to a MotherDuck DuckLake database, you can create a [read scaling token](/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/). ::: --- Source: https://motherduck.com/docs/integrations/language-apis-and-drivers/go-driver # Go driver > Official Go driver for MotherDuck, enabling seamless integration with Go applications. The go-duckdb driver supports MotherDuck out of the box! To connect, you need a dependency on the driver in your `go.mod` file: ```go github.com/duckdb/duckdb-go/v2 v2.5.1 ``` Your code can then open a connection using the standard [database/sql](https://pkg.go.dev/database/sql) package, or any other mechanisms supported by [go-duckdb](https://github.com/duckdb/duckdb-go/blob/master/README.md): ```go db, err := sql.Open("duckdb", "md:my_db?motherduck_token=") ``` ## Go gotchas ### Use "motherduck_" prefixed configuration in the connection string Because `duckdb-go` parses all arguments out into a configuration dictionary, the shorthand properties such as `attach_mode` will not work. Use the fully qualified properties such as `motherduck_attach_mode` for the MotherDuck-specific properties: ```go db, err := sql.Open("duckdb", "md:my_db?motherduck_attach_mode=single") ``` ### Connecting to multiple accounts from the same process Because `duckdb-go` parses all arguments out into a configuration dictionary, trying to connect with multiple MotherDuck accounts (different `motherduck_token` values) from the same Go process will fail with [Can't open a connection to same database file with a different configuration](/documentation/troubleshooting/error_messages.md#disallowed-connections-with-a-different-configuration). If connecting to different accounts is a requirement, work around this by connecting to an in-memory DuckDB database first: ```go c, err := duckdb.NewConnector(":memory:?custom_user_agent=INTEGRATION_NAME/v1.2.3", func(execer driver.ExecerContext) error { bootQueries := []string{ `INSTALL motherduck`, `LOAD motherduck`, fmt.Sprintf("SET motherduck_token='%s'", token), `SET motherduck_session_name='user123'`, `ATTACH 'md:my_db'`, } for _, query := range bootQueries { _, err := execer.ExecContext(context.Background(), query, nil) if err != nil { return err } } return nil }) if err != nil { // handle the error } defer c.Close() db := sql.OpenDB(c) defer db.Close() ``` --- Source: https://motherduck.com/docs/integrations/cloud-storage/google-cloud-storage # Google Cloud Storage > With MotherDuck, you can access files in a private Google Cloud Storage (GCS) bucket. This leverages the GCS S3 compatible connection. ## Google Cloud Storage connection process 1. Create an [HMAC key](https://docs.cloud.google.com/storage/docs/authentication/hmackeys) for the service account: Cloud Storage → Settings → Interoperability → Create a key for a service account 2. Save the Access ID and Secret (shown once) 3. Create the DuckDB secret using the HMAC credentials as described below ## Configure Google Cloud Storage credentials You can safely store your Google Cloud Storage credentials in MotherDuck for convenience by creating a `SECRET` object using the [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret.md) command. ### Create a SECRET object You can safely store your Google Cloud Storage credentials in MotherDuck for convenience by creating a `SECRET` object using the [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret.md) command. ### SQL ```sql CREATE SECRET IN MOTHERDUCK ( TYPE GCS, KEY_ID 'HMAC_ACCESS_ID', SECRET 'HMAC_SECRET' ); -- test GCS credentials SELECT count(*) FROM 'gcs:///'; ``` ### Python ```python import duckdb con = duckdb.connect('md:') con.sql("CREATE SECRET IN MOTHERDUCK (TYPE GCS, KEY_ID 'access_key', SECRET 'secret_key')"); # test GCS con.sql("SELECT count(*) FROM 'gcs:///'").show() # 42 ``` ### UI Click on your profile to access the `Settings` panel and click on `Secrets` menu. ![menu_1](./img/settings_access.png) ![menu_2](./img/settings_panel.png) Then click on `Add secret` in the secrets section. ![menu_3](./img/settings_secrets_panel.png) You will then be prompted to enter your Amazon S3 credentials. ![menu_3](./img/settings_secrets_pop_up.png) You can update your secret by executing [CREATE OR REPLACE SECRET](/sql-reference/motherduck-sql-reference/create-secret.md) command to overwrite your secret. ### Delete a SECRET object ### SQL You can use the same method above, using the [DROP SECRET](/sql-reference/motherduck-sql-reference/delete-secret.md) command. ```sql DROP SECRET ; ``` ### UI Click on your profile and access the `Settings` menu. Click on the bin icon to delete your current secrets. ![menu_4](./img/secrets_delete_4.png) ### Google Cloud Storage credentials as **temporary** secrets MotherDuck supports DuckDB syntax for providing GCS credentials. ```sql CREATE SECRET ( TYPE GCS, KEY_ID 's3_access_key', SECRET 's3_secret_key' ); ``` :::note Local/In-memory secrets are not persisted across sessions. ::: :::info Even temporary, in-memory secrets are available to MotherDuck's cloud execution engine when you connect your local DuckDB instance to MotherDuck. When you query {props.provider}, the query runs on MotherDuck's servers, not your local machine, and MotherDuck uses the best-matching secret to authenticate, whether it is stored locally or in MotherDuck. For more details, see [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret/#querying-with-secrets). ::: ## Additional resources - [Using the S3 compatible connection in GCS](https://docs.cloud.google.com/storage/docs/aws-simple-migration) - [HMAC Keys in Google Cloud](https://docs.cloud.google.com/storage/docs/authentication/hmackeys) --- Source: https://motherduck.com/docs/integrations/bi-tools/hex # Hex > Connect Hex notebooks to MotherDuck using SQL data connections or Python cells for interactive analytics. [Hex](https://hex.tech/) is a software platform for collaborative data science and analytics using Python, SQL and no-code. You have two ways to connect to MotherDuck using Hex: - **Using SQL cells with a data connection**: MotherDuck is a supported [data connection in Hex](https://learn.hex.tech/docs/connect-to-data/data-connections/data-connections-introduction#supported-data-sources). - **Using Python cells**: You can use Python cells to connect to MotherDuck and query data using DuckDB. ## Using SQL cells with a data connection :::tip When many human users query through the same MotherDuck data connection, consider using a [read scaling token](/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/). Hex will then route the queries to a dedicated Duckling per Hex kernel, up to the maximum pool size determined by your organization admin. What this means in practice: * Each workbook will get a stable backend for each unique data connection. Multiple users collaborating on the same workbook will share the Duckling to query faster on warm data caches. * In a published app, each user will get a stable backend for each data connection to power their own unique exploration. ::: To add a new data connection, head over the Data browser in a new notebook and click on `Add data connection`. ![hex_data_browser](../img/hex_data_browser.png) Select `MotherDuck` as the data source and fill in the required fields. The most important is the MotherDuck token, which you can find in the [MotherDuck UI](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token). ![hex_configuration](../img/hex_configuration.png) Once done, you can use the data browser to explore the tables and columns and directly specify your data connection in your SQL cell. ![hex_data_browser](../img/hex_data_browser_2.png) ![hex_sql_cell](../img/hex_sql_cell.png) ### Query some data Add another cell and run the same query we ran in a Python cell : ```sql SELECT dayname(tpep_pickup_datetime) AS day_of_week, strftime('%H', tpep_pickup_datetime) AS hour_of_day, COUNT(*) AS trip_count FROM sample_data.nyc.taxi GROUP BY day_of_week, hour_of_day ORDER BY day_of_week, hour_of_day; ``` This produces both a table and a Dataframe, which you can utilize in the same manner as we previously demonstrated with Python to generate data visualizations. ![hex_sql_result](../img/hex_sql_result.png) ## Using Python cells :::tip Use Python 3.12 or later When using Python cells in your environment to connect to MotherDuck, set your Hex project's Python version to 3.12 or later to ensure you have a compatible version of DuckDB pre-installed in your Hex environment. To change your Python version, go to **Settings** --> **Environment** and select **Python 3.12** or **Latest**. ::: If you prefer programming in Python, you can use Python cells to connect to MotherDuck and start query data. You can jump directly on the [Hex notebook](https://app.hex.tech/c0083b53-a04f-47b1-bff7-a9ff12590a9f/hex/5c85b3e2-3df7-4011-87a0-1fff63787d03/draft/logic) for a quickstart. The notebook highlight how you can query data using Python or SQL cells and display charts! ### Storing your MotherDuck token The first step is to safely store your MotherDuck token. You can do this by [creating a new secret in Hex.](https://learn.hex.tech/docs/environment-configuration/environment-views#secrets) ![Hex secrets](../img/hex_secrets.png) Let's add your [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/authenticating-to-motherduck.md#authentication-using-an-access-token) under the name `motherduck_token`. ![Hex secrets2](../img/hex_secrets_2.png) Once done, add the next Python cell to export as environment variable your `motherduck_token`. This will be detected by SQL/Python processes when authenticating to MotherDuck. ```python # Passing the secrets as environment variable for Python/SQL cell auth # Fill in your token as a Hex project secret https://learn.hex.tech/docs/environment-configuration/environment-views#secret import os os.environ["motherduck_token"] = motherduck_token ``` ### Connecting to MotherDuck DuckDB is already pre-installed in the Hex environment, so you can connect to MotherDuck directly. Add a Python cell and run the following code: ![Hex add cell](../img/hex_add_cell.png) ```python import duckdb # Connect to MotherDuck using Python conn = duckdb.connect(f'md:') ``` ### Query some data and display a chart You can query data from the [sample_data database](/getting-started/sample-data-queries/datasets.mdx). The following example runs a query and returns the result as a pandas dataframe to display as a chart. This database is auto-attached to any MotherDuck user, so you can query it directly. Add another Python cell and run the following code: ```python # Query sample_data database and convert it to a pandas dataframe for dataviz peak_hours = conn.sql(""" SELECT dayname(tpep_pickup_datetime) AS day_of_week, strftime('%H', tpep_pickup_datetime) AS hour_of_day, COUNT(*) AS trip_count FROM sample_data.nyc.taxi GROUP BY day_of_week, hour_of_day ORDER BY day_of_week, hour_of_day;""").to_df() ``` Now we can display the chart using the Visualization cell. Add a new Visualization cell, type `Chart` and select the dataframe we just created `peak_hours`. ![Hex chart](../img/hex_chart_df.png) Finally, play with the parameters to obtain the following chart which gives you a weekly view of the peak hours in New York City for the yellow cabs. ![Hex chart peak hours](../img/hex_chart_peak_hours.png) --- Source: https://motherduck.com/docs/integrations/language-apis-and-drivers/jdbc-driver # JDBC driver > Java Database Connectivity (JDBC) driver for connecting Java applications to MotherDuck. The official DuckDB JDBC driver supports MotherDuck out of the box! To connect, you need a dependency on the driver. For example, in your Maven pom.xml file: ```xml {` org.duckdb duckdb_jdbc ${appVersions.language_clients.duckdb_jdbc} `} ``` Your code can then create a `Connection` by using `jdbc:duckdb:md:databaseName` connection string format: ```xml Connection conn = DriverManager.getConnection("jdbc:duckdb:md:my_db"); ``` This `Connection` can then be [used directly](https://docs.oracle.com/en/java/javase/17/docs/api/java.sql/java/sql/Connection.html) or through any framework built on `java.sql` JDBC abstractions. There are two main ways to programmatically authenticate with a valid MotherDuck token: 1) Passing it in through the connection configuration ```java Properties config = new Properties(); config.setProperty("motherduck_token", token); Connection mdConn = DriverManager.getConnection("jdbc:duckdb:md:mdw", config); ``` 2) Passing the token as a connection string parameter: ```java Connection conn = DriverManager.getConnection("jdbc:duckdb:md:my_db?motherduck_token="+token); ``` See [Authenticating to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/authenticating-to-motherduck.md) for more details. --- Source: https://motherduck.com/docs/integrations/databases/postgres # PostgreSQL > Advanced open-source relational database with powerful features and extensibility. :::tip[Looking for a Postgres-compatible connection to MotherDuck?] Use the **[Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/)** to connect any Postgres-wire-compatible client — BI tools, ORMs, serverless runtimes, or languages without a DuckDB SDK — directly to MotherDuck. No extension required. ::: [PostgreSQL](https://www.postgresql.org) is an object-relational database management system (ORDBMS) based on POSTGRES, Version 4.2, developed at the University of California at Berkeley Computer Science Department. POSTGRES pioneered many concepts that only became available in some commercial database systems much later. As explained by DuckDB Lab's Hannes Mühleisen in the [explainer blog post](https://duckdb.org/2022/09/30/postgres-scanner.html): > PostgreSQL is designed for traditional transactional use cases, "OLTP", where rows in tables are created, updated and removed concurrently, and it excels at this. But this design decision makes PostgreSQL far less suitable for analytical use cases, "OLAP", where large chunks of tables are read to create summaries of the stored data. Yet there are many use cases where both transactional and analytical use cases are important, for example when trying to gain the latest business intelligence insights into transactional data. Choose the PostgreSQL workflow based on where your query needs to run. ## Query MotherDuck from PostgreSQL-compatible clients Use the [Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint) when an application, BI tool, or serverless runtime needs to connect to MotherDuck through the PostgreSQL wire protocol. This is the preferred path for PostgreSQL-compatible clients because it does not require installing or operating a PostgreSQL extension. ## Load PostgreSQL data into MotherDuck Use [DuckDB's PostgreSQL extension](/key-tasks/loading-data-into-motherduck/loading-data-from-postgres) when a DuckDB client needs to read from PostgreSQL and copy data into MotherDuck. This workflow is best for one-time loads, backfills, and controlled client-side movement between PostgreSQL, DuckDB, and MotherDuck. ## Run DuckDB from inside PostgreSQL Use [pg_duckdb](/concepts/pgduckdb) when queries need to run inside a PostgreSQL server with DuckDB or MotherDuck access. This is useful when PostgreSQL-local tables need to be joined with DuckDB or MotherDuck data from the PostgreSQL environment itself. --- Source: https://motherduck.com/docs/integrations/bi-tools/powerbi/powerbi-desktop # Power BI Desktop with MotherDuck > Connect Power BI Desktop to MotherDuck using the Postgres endpoint for dashboards and reports. :::info[Preview] The Postgres endpoint is in [preview](/about-motherduck/feature-stages/). Features and behavior may change. ::: :::warning[Looking for the custom connector?] The DuckDB custom connector is a legacy approach. If you still need it, see the [legacy custom connector guide](./powerbi-custom-connector.md). ::: ## Before you start You'll need: - [Power BI Desktop](https://www.microsoft.com/en-us/power-platform/products/power-bi/desktop) installed on Windows - A [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - Your Postgres host, which you can find at [MotherDuck Postgres settings](https://app.motherduck.com/settings/postgres) (for example, `pg.us-east-1-aws.motherduck.com`) ## Connect to MotherDuck 1. In Power BI Desktop, click **Get data**. ![Power BI Get Data dialog showing PostgreSQL database connector](/img/integrations/powerbi-pg-get-data.png) 2. Search for **PostgreSQL database** in the connector list and select it. 3. Fill in the connection details: - **Server**: Your Postgres host (for example, `pg.us-east-1-aws.motherduck.com`). You can find this at [MotherDuck Postgres settings](https://app.motherduck.com/settings/postgres). - **Database**: Your database or share name in MotherDuck (for example, `sample_data`). 4. Select a data connectivity mode: - **DirectQuery**: Queries run against MotherDuck in real time. Best for dashboards that need up-to-date data. - **Import**: Loads a snapshot of the data into Power BI's in-memory model. Best when you want fast local interactions and can refresh on a schedule. ![Power BI PostgreSQL connection dialog with server and database fields](/img/integrations/powerbi-pg-connection.png) 5. Click **OK**. 6. When prompted for credentials, select **Database** on the left and enter: - **User name**: `postgres` - **Password**: Your [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) ![Power BI credentials dialog with user name and password fields](/img/integrations/powerbi-pg-credentials.png) 7. Click **Connect**. In the Navigator, select the tables you want to use and click **Load**. ![Power BI Navigator showing tables from MotherDuck](/img/integrations/powerbi-pg-navigator.png) 8. You can build visualizations with your MotherDuck data. ![Power BI dashboard with visualizations from MotherDuck data](/img/integrations/powerbi-pg-query-result.png) ## Connection parameters | Parameter | Value | |-----------|-------| | **Server** | `pg.-aws.motherduck.com` (find yours at [Postgres settings](https://app.motherduck.com/settings/postgres)) | | **Database** | Your database name or share name | | **User name** | `postgres` | | **Password** | Your [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) | ## Additional information - [Postgres endpoint reference](/sql-reference/postgres-endpoint) for connection parameters, SSL options, and limitations - [Connect through the Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint) for a general how-to guide - [Power BI documentation](https://learn.microsoft.com/en-us/power-bi/connect-data/) --- Source: https://motherduck.com/docs/integrations/language-apis-and-drivers/r # R > R is a language for statistical analysis. To connect to MotherDuck from an R program, you need to first install DuckDB: ```r install.packages("duckdb") ``` You'll then need to load the `motherduck` extension and `ATTACH 'md:'` to connect to all of your databases. To connect to only one database, use `ATTACH 'md:my_db'` syntax. ```r library("DBI") con <- dbConnect(duckdb::duckdb()) dbExecute(con, "INSTALL 'motherduck'") dbExecute(con, "LOAD 'motherduck'") dbExecute(con, "ATTACH 'md:'") dbExecute(con, "USE my_db") res <- dbGetQuery(con, "SHOW DATABASES") print(res) ``` Once connected, any R syntax described in the [DuckDB's documentation](https://duckdb.org/docs/api/r.html) should work. :::note Extension autoloading is turned off in R duckdb distributions, so `dbdir = "md:"` style connections do not connect to MotherDuck. ::: ## Considerations and limitations ### Windows integration MotherDuck extension is not available on Windows. As a workaround, you can use [WSL](https://learn.microsoft.com/en-us/windows/wsl/about) (Windows Subsystem for Linux) --- Source: https://motherduck.com/docs/integrations/dev-tools/retool # Retool > Low-code platform for building internal tools and custom business applications with drag-and-drop UI components. There are two ways to connect Retool to MotherDuck, depending on whether you use Retool Cloud or self-hosted Retool. ## Retool Cloud (native connector) Retool Cloud has a native MotherDuck resource type. To connect: 1. Go to **Resources** and select **Create new** > **Resource**. 2. Search for **MotherDuck** and select it. 3. Give the resource a descriptive name (for example, "MotherDuck analytics"). 4. Under **Resource credentials**, enter your [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token). 5. Optionally enter a **Database name**. Leave it empty to use workspace mode, which lets you query across multiple databases. 6. Click **Test connection**, then **Create resource**. You can use this resource in your Retool apps to run SQL queries against your MotherDuck databases. The resource supports both SQL mode for reading data and GUI mode for write operations (insert, update, delete, upsert). ### Connection options You can pass optional key-value pairs under **Connection options** to customize behavior: | Option | Values | Description | |--------|--------|-------------| | `access_mode` | `READ_WRITE`, `READ_ONLY` | Controls whether the connection can write data | | `attach_mode` | `single`, `workspace` | Sets the [attach mode](/key-tasks/authenticating-and-connecting-to-motherduck/attach-modes/). `single` scopes the connection to one database (useful when querying a specific tenant or to avoid catalog clutter); `workspace` (default) attaches every database in your saved workspace. | | `TimeZone` | For example, `UTC`, `America/New_York` | Sets the session time zone | | `default_null_order` | `NULLS_FIRST`, `NULLS_LAST` | Default null ordering for queries | | `default_order` | `ASC`, `DESC` | Default sort order for queries | For more details, see the [Retool MotherDuck documentation](https://docs.retool.com/data-sources/guides/connect/motherduck). ### Known limitations - `BLOB` and `ARRAY` column types are not supported by the native connector. Queries that return these types will fail. Cast these columns to a supported type (for example, using `CAST` or `list_string_agg`) or exclude them from your result set. ## Self-hosted (JDBC) If you run a self-hosted Retool instance, you can connect to MotherDuck through the [DuckDB JDBC driver](/integrations/language-apis-and-drivers/jdbc-driver/). Your instance needs network access to `motherduck.com` over HTTPS (port 443). 1. In your Retool instance, go to **Resources** and select **Create new**. 2. Choose **JDBC** as the resource type. 3. Use the following JDBC connection string: ```text jdbc:duckdb:md:?motherduck_token= ``` Replace `` with your MotherDuck database and `` with your [access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token). 4. Test the connection and save. For more details on the JDBC driver, see [JDBC driver](/integrations/language-apis-and-drivers/jdbc-driver/). --- Source: https://motherduck.com/docs/integrations/bi-tools/tableau/tableau-cloud # Tableau Cloud with MotherDuck > Connect Tableau Cloud to MotherDuck using the Postgres endpoint for dashboards and reports. :::info[Preview] The Postgres endpoint is in [preview](/about-motherduck/feature-stages/). Features and behavior may change. ::: :::warning[Looking for the Tableau Bridge setup?] Connecting through Tableau Bridge is a legacy approach. If you still need it, refer to the [legacy Tableau Bridge guide](./tableau-bridge.md). ::: ## Before you start You'll need: - A [Tableau Cloud](https://www.tableau.com/) account - A [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - Your Postgres host and port, which you can find at [MotherDuck Postgres settings](https://app.motherduck.com/settings/postgres) (for example, `pg.us-east-1-aws.motherduck.com`) ## Connect to MotherDuck 1. In a Tableau Cloud workbook, click **Connect to Data**. 2. Under the **Connectors** tab, select **PostgreSQL**. ![Tableau Cloud Connect to Data dialog showing the Connectors tab with PostgreSQL available](/img/integrations/tableau-pg-connectors.png) 3. Fill in the connection details: - **Server**: Your Postgres host (for example, `pg.us-east-1-aws.motherduck.com`). Find this at [MotherDuck Postgres settings](https://app.motherduck.com/settings/postgres). - **Port**: The port from your Postgres settings (for example, `5432`). - **Database**: Your database name in MotherDuck (for example, `sample_data`). - **Username**: `postgres` - **Password**: Your [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - Check **Require SSL**. ![Tableau Cloud PostgreSQL connection dialog with server, port, database, username, and password fields](/img/integrations/tableau-pg-connection.png) 4. Click **Sign In**. Tableau connects to MotherDuck and shows your tables. ![Tableau Cloud data source tab showing tables from the MotherDuck database](/img/integrations/tableau-pg-tables.png) 5. Select your tables and build visualizations with your MotherDuck data. ![Tableau Cloud worksheet with a line chart built from MotherDuck data](/img/integrations/tableau-pg-visualization.png) ## Connection parameters | Parameter | Value | |-----------|-------| | **Server** | `pg.-aws.motherduck.com` (find yours at [Postgres settings](https://app.motherduck.com/settings/postgres)) | | **Port** | `5432` (find yours at [Postgres settings](https://app.motherduck.com/settings/postgres)) | | **Database** | Your database name | | **Username** | `postgres` | | **Password** | Your [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) | | **Require SSL** | Checked | ## Additional information - [Postgres endpoint reference](/sql-reference/postgres-endpoint) for connection parameters, SSL options, and limitations - [Connect through the Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint) for a general how-to guide - [Tableau documentation](https://help.tableau.com/current/online/en-us/to_connect_live_sql.htm) --- Source: https://motherduck.com/docs/integrations/web-development/vercel # Vercel > Vercel is a web development platform for building and deploying web applications. [Vercel](https://vercel.com/) is a cloud platform for static sites and serverless functions. It supports two ways to connect your Next.js application to MotherDuck: - **[Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint)** — connects from server-side API routes using the standard [`pg`](https://www.npmjs.com/package/pg) npm package. This lets you connect to MotherDuck databases like a regular Postgres database, but use MotherDuck as the fast analytics query backend. - **[Wasm SDK](/sql-reference/wasm-client)** — runs DuckDB directly in the browser using WebAssembly. Best for building analytics dashboards that are highly interactive, allowing queries to execute on the user's device. Both approaches work with Vercel's Native Integration for automatic token management. Vercel typically provides two ways to integrate with 3rd party services: - Native integration: create a new account on the 3rd party service and connect it to Vercel. Billing and setup is managed by Vercel. - Non-native integration (connectable accounts): connect existing 3rd party accounts to Vercel. :::info Vercel supports Native Integration with MotherDuck. Support for non-native integration (connectable accounts) is not yet available. But to use your existing MotherDuck account, populate the Vercel app environment variables with a MotherDuck token for your existing account. ::: ## Native integration To kickstart the integration, you can either start from: - [Vercel's marketplace](https://vercel.com/marketplace/motherduck) and install the integration from there on an existing Vercel project. - Deploy a new project from [MotherDuck's Vercel template](https://vercel.com/motherduck-marketing/~/integrations/motherduck) which includes snippets to get started with MotherDuck and your Next.js project. ### How to install 1. To install the MotherDuck Native Integration from the Vercel Marketplace: 2 Navigate to the Vercel Marketplace or to the Integrations Console on your Vercel Dashboard. 3. Locate the MotherDuck integration. 4. Click Install. 5. On the Install MotherDuck modal, you are presented with two plans options. ![modal1](./img/vercel1.png) 6. On the next modal, you would be prompt to give your database a name. Note that a new installation will create a new account and database within a new MotherDuck organization. ![modal2](./img/vercel2.png) 7. You are all set! You have now a new account and database within a new organization. Plus, tokens ([access token](/docs/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token), and [read scaling token](/docs/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/#understanding-read-scaling-tokens)) are automatically generated and stored in Vercel's environment variables. ![model3](./img/vercel3.png) You can head to `Getting Started` section on the integration page to have more information on how to use the integration. ![model4](./img/vercel4.png) --- ## Connect using the Postgres endpoint Next.js API routes can connect to MotherDuck through the [Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint) using the [`pg`](https://www.npmjs.com/package/pg) npm package. This gives you a thin-client path to query MotherDuck from serverless functions without any DuckDB dependencies. This guide walks through building a Next.js app that queries NYC taxi data from MotherDuck's built-in `sample_data` database. ### Prerequisites - [Node.js](https://nodejs.org/) v18+ - A [Vercel account](https://vercel.com/signup) - A [MotherDuck account](https://motherduck.com/) and [access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) ### Project setup Create a new Next.js project and install the Postgres client: ```bash npx create-next-app@latest motherduck-nextjs --typescript --app cd motherduck-nextjs npm install pg @vercel/functions npm install --save-dev @types/pg ``` ### Store your token For local development, create a `.env.local` file (this is gitignored by default in Next.js): ```text MOTHERDUCK_TOKEN="your_token_here" MOTHERDUCK_HOST="pg.us-east-1-aws.motherduck.com" MOTHERDUCK_DB="sample_data" ``` For production, add the environment variable through the Vercel dashboard or CLI: ```bash vercel env add MOTHERDUCK_TOKEN ``` If you installed the [Native Integration](#native-integration), your access token is already available as an environment variable. ### Create the connection pool Create `src/lib/motherduck.ts`. The pool is initialized at module scope so it persists across requests within the same function instance — this is the recommended pattern for [connection pooling on Vercel](https://vercel.com/kb/guide/connection-pooling-with-functions). ```typescript import { Pool, PoolClient } from "pg"; import { attachDatabasePool } from "@vercel/functions"; const token = process.env.MOTHERDUCK_TOKEN; const host = process.env.MOTHERDUCK_HOST ?? "pg.us-east-1-aws.motherduck.com"; const db = process.env.MOTHERDUCK_DB ?? "sample_data"; if (!token) { throw new Error("MOTHERDUCK_TOKEN environment variable is required"); } const pool = new Pool({ connectionString: `postgresql://user:${token}@${host}:5432/${db}`, ssl: { rejectUnauthorized: true }, max: 10, idleTimeoutMillis: 5000, }); attachDatabasePool(pool); export async function withClient( fn: (client: PoolClient) => Promise ): Promise { const client = await pool.connect(); try { return await fn(client); } finally { client.release(); } } ``` A few things to note: - **`attachDatabasePool(pool)`** from `@vercel/functions` ensures idle connections are cleaned up before a function instance is suspended, preventing connection leaks. - **`idleTimeoutMillis: 5000`** closes unused connections after 5 seconds, balancing reuse during traffic bursts with prompt cleanup during quiet periods. - **`ssl: { rejectUnauthorized: true }`** enables full certificate verification (`verify-full`). Node.js verifies the server certificate against the system CA bundle and checks that the hostname matches the certificate. MotherDuck's Postgres endpoint uses a publicly trusted certificate, so no custom CA configuration is needed. ### Write the API routes Create two route handlers. The first returns a sample of recent taxi trips. **`src/app/api/trips/route.ts`** ```typescript import { NextResponse } from "next/server"; import { withClient } from "@/lib/motherduck"; export async function GET() { try { const rows = await withClient(async (client) => { const result = await client.query( `SELECT tpep_pickup_datetime AS pickup, tpep_dropoff_datetime AS dropoff, passenger_count, trip_distance, fare_amount, tip_amount, total_amount FROM nyc.taxi ORDER BY tpep_pickup_datetime DESC LIMIT 20` ); return result.rows; }); return NextResponse.json(rows); } catch (error) { console.error("Failed to fetch trips:", error); return NextResponse.json( { error: "Failed to fetch trips" }, { status: 500 } ); } } ``` The second accepts date range parameters and returns aggregated fare data. It validates inputs before querying and uses parameterized queries (`$1`, `$2`) to prevent SQL injection — never interpolate user input directly into SQL strings. **`src/app/api/stats/route.ts`** ```typescript import { NextRequest, NextResponse } from "next/server"; import { withClient } from "@/lib/motherduck"; export async function GET(request: NextRequest) { const startDate = request.nextUrl.searchParams.get("start"); const endDate = request.nextUrl.searchParams.get("end"); if (!startDate || !endDate) { return NextResponse.json( { error: "Both 'start' and 'end' query parameters are required. Use YYYY-MM-DD format.", }, { status: 400 } ); } const datePattern = /^\d{4}-\d{2}-\d{2}$/; if (!datePattern.test(startDate) || !datePattern.test(endDate)) { return NextResponse.json( { error: "Invalid date format. Use YYYY-MM-DD." }, { status: 400 } ); } try { const data = await withClient(async (client) => { const result = await client.query( `SELECT sum(passenger_count)::INTEGER AS total_passengers, round(sum(fare_amount), 2) AS total_fare FROM nyc.taxi WHERE tpep_pickup_datetime >= $1 AND tpep_pickup_datetime < $2`, [`${startDate} 00:00:00`, `${endDate} 00:00:00`] ); return result.rows[0]; }); return NextResponse.json({ start: startDate, end: endDate, ...data, }); } catch (error) { console.error("Failed to fetch stats:", error); return NextResponse.json( { error: "Failed to fetch stats" }, { status: 500 } ); } } ``` ### Test locally ```bash npm run dev ``` Then open `http://localhost:3000/api/trips` or try the stats endpoint with a date range: ```text http://localhost:3000/api/stats?start=2022-11-01&end=2022-12-01 ``` ### Deploy ```bash vercel deploy ``` Or push to a connected Git repository — Vercel deploys automatically on every push. --- ## Connect using the Wasm SDK The Wasm SDK runs DuckDB in the browser, making it ideal for highly interactive analytics dashboards. ### Project templates Learn more about how to setup your projects by using the following templates: - [MotherDuck's Vercel template](https://github.com/MotherDuck-Open-Source/nextjs-motherduck-wasm-analytics-quickstart) : A fully-fledged template that includes a Next.js project and a MotherDuck WASM setup with sample data integration and an interactive data visualization example. - [MotherDuck's Vercel template minimal](https://github.com/MotherDuck-Open-Source/nextjs-motherduck-wasm-analytics-quickstart-minimal) : a minimal template which includes a Next.js project and MotherDuck Wasm setup with some sample data integration. --- Source: https://motherduck.com/docs/integrations/dev-tools/obsidian # Obsidian > Use the DuckDB & MotherDuck Obsidian plugin to query external data from your notes and freeze the results as markdown tables. The [DuckDB & MotherDuck plugin](https://community.obsidian.md/plugins/duckdb-motherduck) lets you run DuckDB SQL from inside an Obsidian note and freeze the results as a markdown table directly below the query. Local queries run in WASM with no account required. Add a MotherDuck token to query cloud databases or push heavier compute off your laptop. Both backends can coexist in the same note — each code block picks its connection through the fence type. ![MotherDuck SQL block in Obsidian with the rendered result table and the frozen markdown table written below the query](./img/obsidian-motherduck-demo.png) ## Install 1. In Obsidian, open **Settings → Community plugins → Browse**. 2. Search for **DuckDB & MotherDuck** and select **Install**, then **Enable**. To use the cloud backend, add a [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token) under **Settings → DuckDB & MotherDuck → MotherDuck token**. For shared vaults or scoped access, prefer a [service account token](/key-tasks/service-accounts-guide/create-and-configure-service-accounts/). :::warning The MotherDuck token is stored in plaintext in `/.obsidian/plugins/duckdb-motherduck/data.json`. Don't commit or publicly sync a vault that contains it. ::: ## Running queries Each fenced code block picks its backend from the fence language: | Fence | Backend | Reaches cloud | |-------|---------|---------------| | ` ```duckdb ` | Local DuckDB WASM | No | | ` ```motherduck ` | MotherDuck WASM client | Yes | ### Local DuckDB Use a `duckdb` block to query any file format DuckDB reads — Parquet, CSV, JSON, Excel, Iceberg, Delta, or geospatial — from a local path or URL: ````markdown ```duckdb SELECT o_orderpriority AS priority, count(*) AS orders, round(sum(o_totalprice), 2) AS revenue FROM read_parquet('https://shell.duckdb.org/data/tpch/0_01/parquet/orders.parquet') GROUP BY 1 ORDER BY revenue DESC ``` ```` In reading mode the block becomes a panel with **Run**, **Freeze**, and **Clear freeze** buttons. The **Path to local DuckDB file** setting has three modes: - `:memory:` (default) — ephemeral, reset each time Obsidian restarts. - A bare filename like `notes.duckdb` — persistent storage in the browser's Origin Private File System. Survives restart, lives outside your vault. - An absolute path like `/Users/you/data.duckdb` — read an existing `.duckdb` file from disk. Read-only: writes succeed in the worker but don't persist back to the file. ### MotherDuck Use a `motherduck` block to query your cloud databases: ````markdown ```motherduck SELECT type, count(*) AS items, round(avg(score), 1) AS avg_score, round(avg(descendants), 1) AS avg_comments FROM sample_data.hn.hacker_news WHERE type IS NOT NULL GROUP BY 1 ORDER BY items DESC ``` ```` Any DuckDB SQL that runs in MotherDuck works here — joins across databases, AI functions, shared datasets, and so on. ## Freezing results Selecting **Freeze** inserts the query result as a markdown table directly under the SQL block, wrapped in sentinel comments so the next refresh knows what to replace: ````markdown ```motherduck SELECT brand, sum(revenue) FROM sales GROUP BY 1 ORDER BY 2 DESC LIMIT 10 ``` | brand | sum(revenue) | | ----- | ------------ | | acme | 42000 | ```` Frozen tables are regular markdown — they diff cleanly in git, render in any editor, and stay readable to agents that scan the vault. ## Scheduled refresh Pick a cadence in the **Refresh** dropdown above any SQL block to opt that note into auto-refresh. The plugin adds a frontmatter property: ```yaml --- duckdb-motherduck-refresh: daily --- ``` While Obsidian is running, the plugin sweeps once an hour and re-materializes the frozen tables for any note whose cadence has elapsed. The active editor is skipped to avoid stomping in-progress edits. Scheduled refresh runs only while Obsidian is open. To refresh while it's closed, trigger the plugin's API from the [Obsidian CLI](https://obsidian.md/help/cli): ```bash obsidian eval code="app.plugins.getPlugin('duckdb-motherduck').api.refreshFile('path/to/note.md')" ``` Drop that into a cron job, a Claude Code skill, or any agent with shell access. ## Commands From the command palette: - **Refresh all queries in this note** — re-runs every block in the current note. - **Refresh query at cursor** — re-runs and re-freezes only the block at the cursor. Bind a hotkey under **Settings → Hotkeys** for fast iteration. - **Clear freeze at cursor** — removes the frozen result below the SQL block. - **Reset DuckDB / MotherDuck connections** — drops both connections. Use after changing the path or token. ## Settings - **Row cap** — maximum rows rendered inline or written into a frozen table. The runtime stops scanning at `rowCap + 1` rows so heavy queries don't materialize unnecessary data in WASM heap. - **Cell character cap** — maximum characters per cell in rendered and frozen tables. Default `80`. Longer values are truncated with an ellipsis; hover a truncated cell in the live result to see the full value. - **Auto-refresh scheduled notes** — global toggle for the hourly sweep. - **Reset connections after each scheduled refresh** — terminates the WASM workers after each sweep to free memory. Default on. ## Known limitations - Pointing at an on-disk `.duckdb` file is read-only — writes don't persist back to the file. - Scheduled refresh runs only while Obsidian is open; use the plugin API plus the Obsidian CLI for external scheduling. - The MotherDuck token is stored in plaintext in `data.json`. There's no keychain integration. - Absolute-path mode requires Node integration that isn't available on mobile. ## Source The plugin is open source under the MIT license at [motherduckdb/obsidian-duckdb-motherduck](https://github.com/motherduckdb/obsidian-duckdb-motherduck). --- Source: https://motherduck.com/docs/integrations/databases/planetscale # PlanetScale > PlanetScale offers hosted PostgreSQL and MySQL Vitess Databases. MotherDuck supports PlanetScale Postgres via the pg_duckdb extension, as well as the Postgres Connector. In our internal benchmarking, pg_duckdb offers 100x or greater query acceleration for analytical queries when compared to vanilla Postgres. ## Prerequisites Before connecting PlanetScale to MotherDuck, ensure you have: - A PlanetScale account with a Postgres database created - The `pg_duckdb` extension enabled in your PlanetScale database (see [PlanetScale extension documentation](https://planetscale.com/docs/postgres/extensions/pg_duckdb)) - A MotherDuck account and authentication token (get your token from the [MotherDuck dashboard](https://app.motherduck.com)) - Database connection credentials from your PlanetScale dashboard (host, port, username, password, database name) ## Connecting pg_duckdb to MotherDuck To run pg_duckdb, ensure you add it to your [extensions in PlanetScale](https://planetscale.com/docs/postgres/extensions/pg_duckdb). :::tip Review the configuration parameters before deploying the extension. Once deployed, you can connect to MotherDuck with the following SQL statements. ::: ```sql -- Grant necessary permissions to the PlanetScale superuser GRANT CREATE ON SCHEMA public to pscale_superuser; -- Create the pg_duckdb extension in your Postgres database CREATE EXTENSION pg_duckdb; -- Enable a MotherDuck connection with your authentication token CALL duckdb.enable_motherduck(); ``` To swap tokens, you can drop the MotherDuck connection and then re-add with: ```sql -- Remove the existing MotherDuck server connection DROP SERVER motherduck CASCADE; -- Re-enable MotherDuck with a new authentication token CALL duckdb.enable_motherduck(); ``` ### Using read replicas with PlanetScale :::info Pg_duckdb will automatically round-robin between your replicas when you use a read-only token. When switching between a read-write and a read-only token, you will want to snapshot your database and then force sync as part of the hand-off. ::: Switching from read-write to read-only is done with the following SQL statement in Postgres: ```sql -- Create a snapshot of your MotherDuck database to ensure consistency SELECT * FROM duckdb.raw_query('CREATE SNAPSHOT OF '); -- Drop the existing MotherDuck connection DROP SERVER motherduck CASCADE; -- Re-enable MotherDuck with your read-only token CALL duckdb.enable_motherduck(); -- Refresh the database to sync with the snapshot SELECT * FROM duckdb.raw_query('REFRESH DATABASE '); ``` ### Reading from MotherDuck :::info By default, data in [MotherDuck is mapped to Postgres in two different ways](https://github.com/duckdb/pg_duckdb/blob/main/docs/motherduck.md#schema-mapping). This is because MotherDuck is designed to hold many databases in its global catalog, while Postgres traditionally has a single database in its catalog. - For data in `my_db.main`, it is mapped directly to the `public` schema in the Postgres database. - For data in any other database & schema, it is mapped to `ddb$database$schema` in the Postgres database. ::: Once the catalog is in sync between MotherDuck and Postgres, the data can be queried directly from Postgres. If it is out of sync for any reason, it can be re-sync'd with the following SQL command: ```sql -- Terminate the pg_duckdb sync worker to force a re-sync SELECT * FROM pg_terminate_backend(( SELECT pid FROM pg_stat_activity WHERE backend_type = 'pg_duckdb sync worker' )); ``` #### Sample MotherDuck queries Once the catalog is synchronized to Postgres, we can query the data as if it was normal data in Postgres. ```sql -- Query data from a MotherDuck database and schema -- Note: Non-main schemas use the ddb$database$schema naming convention SELECT * FROM "ddb$sample_data$nyc".taxi ORDER BY tpep_dropoff_datetime DESC LIMIT 10; ``` You can also join with data in Postgres. ```sql -- Join MotherDuck data with local Postgres tables SELECT a.col1, b.col2 -- MotherDuck table from a non-main schema FROM "ddb$my_database$my_schema".my_table AS a -- Local Postgres table in the public schema LEFT JOIN public.another_table AS b on a.key = b.key ``` The DuckDB `iceberg_scan` function also works as well: ```sql -- Use DuckDB's iceberg_scan function to query Iceberg tables SELECT COUNT(*) FROM iceberg_scan('https://motherduck-demo.s3.amazonaws.com/iceberg/lineitem_iceberg', allow_moved_paths := true) ``` :::info Two special helper functions exist to run queries directly with DuckDB: - **`duckdb.query`**: Returns tabular data, use for SELECT queries - **`duckdb.raw_query`**: Returns void, use for DDL queries such as Snapshot Creation and Database Refresh. This function keeps the database in-sync when handing off between read and write nodes. ::: ```sql -- Use duckdb.query for SELECT queries that return tabular data -- This example lists all databases in MotherDuck SELECT * FROM duckdb.query('FROM md_databases()') ``` ```sql -- Use duckdb.raw_query for DDL queries that return void -- This example drops a table in MotherDuck SELECT * FROM duckdb.raw_query('DROP TABLE my_database.my_schema.some_table') ``` ### Replicating data to MotherDuck :::tip For smaller tables, data can be replicated using simple SQL statements. ::: ```sql -- Create a table in MotherDuck and populate it with data from Postgres -- Replace my_database and my_schema with your target database and schema names CREATE TABLE "ddb$my_database$my_schema".my_table USING duckdb AS SELECT * FROM public.my_table ``` :::tip For larger tables, state management, and tighter SLAs & requirements, MotherDuck offers [integrations to various other ingestion partners](/integrations/ingestion/). ::: ### Further reading The [pg_duckdb github repo](https://github.com/duckdb/pg_duckdb) contains [further documentation](https://github.com/duckdb/pg_duckdb/blob/main/docs/README.md) of all available functions. For ease of finding the documentation, a table of the documentation sections is below: | Topic | Description | |-------|-------------| | [**Functions**](https://github.com/duckdb/pg_duckdb/blob/main/docs/functions.md) | Complete reference for all available functions | | [**Syntax Guide & Gotchas**](https://github.com/duckdb/pg_duckdb/blob/main/docs/gotchas_and_syntax.md) | Quick reference for common SQL patterns and things to know | | [**Types**](https://github.com/duckdb/pg_duckdb/blob/main/docs/types.md) | Supported data types and type mappings | | [**Extensions**](https://github.com/duckdb/pg_duckdb/blob/main/docs/extensions.md) | DuckDB extension installation and usage | | [**Settings**](https://github.com/duckdb/pg_duckdb/blob/main/docs/settings.md) | Configuration options and parameters | | [**Transactions**](https://github.com/duckdb/pg_duckdb/blob/main/docs/transactions.md) | Transaction behavior and limitations | ## Connecting with the Postgres extension You can also connect to PlanetScale Postgres with the DuckDB Postgres extension. This approach lets you query PlanetScale data directly from DuckDB or MotherDuck. ### Install and load the extension ```sql -- Install the Postgres extension from DuckDB's extension registry INSTALL postgres; -- Load the extension to enable Postgres connectivity LOAD postgres; -- Attach your PlanetScale database using a connection string ATTACH '' AS postgres_db (TYPE postgres); ``` ### Connection string format The connection string format follows PostgreSQL's standard connection parameters. Here's an example with explanations: ```sql ATTACH 'host= port= user= password= dbname= sslmode=require' AS planetscale (TYPE postgres); ``` **Connection Parameters:** - `host`: Your PlanetScale database hostname (found in your PlanetScale dashboard) - `port`: The database port (typically 3306 for MySQL or 5432 for Postgres) - `user`: Your PlanetScale database username - `password`: Your PlanetScale database password - `dbname`: The name of your database in PlanetScale - `sslmode=require`: Ensures SSL encryption is used (required for PlanetScale) :::info The above connection string works with DuckDB. PlanetScale suggests also using the `sslnegotiation` and `sslrootcert` keys when connecting to Postgres, but these keys are not supported by the `libpq` version that is included in DuckDB. The `sslmode=require` parameter is sufficient for secure connections. ::: --- Source: https://motherduck.com/docs/integrations/bi-tools/powerbi/powerbi-service # Power BI Service with MotherDuck > Publish Power BI reports to the cloud using the On-Premises Data Gateway and MotherDuck's Postgres endpoint. :::info[Preview] The Postgres endpoint is in [preview](/about-motherduck/feature-stages/). Features and behavior may change. ::: Power BI Service is the cloud-based version of Power BI that lets you publish, share, and schedule refreshes for reports and dashboards. To connect Power BI Service to MotherDuck, you need a Microsoft On-Premises Data Gateway that bridges the cloud service to MotherDuck's Postgres endpoint. Both **Import** and **DirectQuery** modes work through the gateway. ## Before you start You'll need: - A published `.pbix` report connected to MotherDuck through the [Power BI Desktop setup](./powerbi-desktop.mdx) - A [Power BI Pro or Premium Per User](https://www.microsoft.com/en-us/power-platform/products/power-bi/pricing) license (required for sharing reports and using the standard gateway) - A [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - A Windows machine to host the gateway (see [Microsoft's gateway requirements](https://learn.microsoft.com/en-us/data-integration/gateway/service-gateway-install#requirements)) ## Install the gateway 1. Download the standard gateway installer from [Microsoft's gateway download page](https://aka.ms/on-premises-data-gateway-installer). Download the **standard (enterprise) gateway**, not the personal mode gateway. 2. Run the installer and accept the default installation path. 3. After installation, the configuration wizard opens. Sign in with your **Microsoft work or school account** (the one associated with your Power BI tenant). 4. Select **Register a new gateway on this computer**. 5. Enter a gateway name (for example, `MD-PG-Gateway`) and a recovery key. Store the recovery key securely. 6. Click **Configure** and wait for registration to complete. **Verify:** The configuration wizard shows "The gateway is online and ready to be used." The Windows service `On-premises data gateway service` should be running in `services.msc`. ![On-premises data gateway configuration wizard showing the gateway is online and ready](/img/integrations/powerbi-service-gateway-ready.png) ## Add a MotherDuck data source 1. In [Power BI Service](https://app.powerbi.com), click the **Settings gear** and select **Manage connections and gateways**. 2. Verify your gateway shows **Online**. 3. Click **+ New** and select **On-premises**. 4. Fill in the connection details: | Field | Value | |-------|-------| | **Gateway cluster name** | Select your gateway | | **Connection name** | A descriptive name (for example, `MotherDuck-PG-sample_data`) | | **Data Source Type** | **PostgreSQL** | | **Server** | Your Postgres host (for example, `pg.us-east-1-aws.motherduck.com`) | | **Database** | Your MotherDuck database name | | **Authentication method** | **Basic** | | **Username** | `postgres` | | **Password** | Your [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) | | **Encrypted Connection** | Checked | | **Privacy Level** | Organizational | ![Power BI Service new connection dialog with gateway, server, and database fields](/img/integrations/powerbi-service-new-connection.png) ![Power BI Service connection dialog showing authentication, encryption, and privacy settings](/img/integrations/powerbi-service-connection-auth.png) 5. Click **Create**. :::warning The **Server** and **Database** values must match your `.pbix` file character-for-character. If they differ, the published dataset won't find the gateway data source. ::: ## Publish and connect a report 1. In Power BI Desktop, publish your report: **File > Publish > Publish to Power BI** and select a workspace. 2. In Power BI Service, go to the workspace and find the semantic model (dataset). 3. Open **Settings** for the semantic model and expand **Gateway and cloud connections**. 4. Map the connection to your gateway data source. ![Power BI Service gateway and cloud connections settings showing the gateway mapped to a MotherDuck data source](/img/integrations/powerbi-service-gateway-mapping.png) 5. Under **Data source credentials**, click **Edit credentials** and enter: - Authentication method: **Basic** - User name: `postgres` - Password: Your MotherDuck access token - Encrypted connection: Checked 6. Click **Sign in**. ## Set up scheduled refresh For reports using **Import** mode, you can configure automatic data refreshes. 1. In the semantic model settings, expand **Refresh**. 2. Toggle **Keep your data up to date** to **On**. 3. Set your refresh frequency and time zone. 4. Click **Apply**. To verify, trigger a manual refresh: open the semantic model's three-dot menu and select **Refresh now**. All steps should complete with green check marks. ![Power BI Service refreshing data dialog showing all steps completed successfully](/img/integrations/powerbi-service-refresh-complete.png) ## DirectQuery through the gateway For reports using **DirectQuery** mode, queries run against MotherDuck in real time through the gateway. No scheduled refresh is needed since data is always live. After publishing and mapping the gateway data source (steps above), DirectQuery reports work automatically in Power BI Service. ![Power BI Service showing a report with visualizations from MotherDuck data](/img/integrations/powerbi-service-report.png) ## Connection parameters | Parameter | Value | |-----------|-------| | **Server** | `pg.-aws.motherduck.com` (find yours at [Postgres settings](https://app.motherduck.com/settings/postgres)) | | **Database** | Your database name | | **Username** | `postgres` | | **Password** | Your [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) | | **Encrypted Connection** | Checked | ## Troubleshooting ### Gateway shows offline Check the gateway machine is on, connected to the network, and the `On-premises data gateway service` Windows service is running. Restart the service if needed. ### Firewall blocking port 5432 If `Test-NetConnection -ComputerName pg.us-east-1-aws.motherduck.com -Port 5432` returns `TcpTestSucceeded: False`, add an outbound firewall rule allowing TCP 5432 to the MotherDuck Postgres host. ### SSL/TLS handshake failure MotherDuck uses certificates from a publicly trusted CA, so the gateway should trust them by default. If you see "The remote certificate is invalid," run Windows Update to refresh the root CA store, or manually import the ISRG Root X1 certificate into the machine-level Trusted Root Certification Authorities store. After importing, restart the gateway service. ### Credential errors - The username must be `postgres`. - The password is your **MotherDuck access token** (starting with `md_`), not your web UI password. - Check for trailing whitespace in the token. ### Published dataset doesn't see the gateway The **Server** and **Database** values in the gateway data source must match the `.pbix` file exactly, including case. Recreate the data source with the correct values if they differ. ## Additional information - [Postgres endpoint reference](/sql-reference/postgres-endpoint) for connection parameters, SSL options, and limitations - [Connect through the Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint) for a general how-to guide - [Microsoft gateway documentation](https://learn.microsoft.com/en-us/power-bi/connect-data/service-gateway-onprem) - [Power BI Service documentation](https://learn.microsoft.com/en-us/power-bi/fundamentals/power-bi-service-overview) --- Source: https://motherduck.com/docs/integrations/bi-tools/tableau/tableau-desktop # Tableau Desktop and Server with MotherDuck > Connect Tableau Desktop or Server to MotherDuck using the DuckDB JDBC driver and Tableau connector. ## Tableau Desktop setup for DuckDB and MotherDuck 1. Download a [recent version of the DuckDB JDBC driver](https://repo1.maven.org/maven2/org/duckdb/duckdb_jdbc/) and copy it into the Tableau Drivers directory: * MacOS: `~/Library/Tableau/Drivers/` * Windows: `C:\Program Files\Tableau\Drivers` * Linux: `/opt/tableau/tableau_driver/jdbc` 2. Download the signed tableau connector (aka "Taco file") file from the [latest available release](https://github.com/MotherDuck-Open-Source/duckdb-tableau-connector/releases) and copy it into the Connectors directory: * Desktop Windows: `C:\Users\[YourUser]\Documents\My Tableau Repository\Connectors` * Desktop MacOS: `/Users/[YourUser]/Documents/My Tableau Repository/Connectors` * Server Windows: `C:\ProgramData\Tableau\Tableau Server\data\tabsvc\vizqlserver\Connectors` * Server Linux: `[Your Tableau Server Install Directory]/data/tabsvc/vizqlserver/Connectors` ## Connecting Once the Taco is installed, and you have launched Tableau, you can create a new connection by choosing "DuckDB by MotherDuck": ![Tableau connector list](../../img/tableau-connector-list.png) ### Local DuckDB database If you wish to connect to a local DuckDB database, select "Local file" as DuckDB Server option, and use the file picker: ![DuckDB Server dropdown](../../img/tableau-connect-options-local-file.png) ![Connection Dialogue](../../img/tableau-connect-local-file.png) ### In-memory database The driver can be used with an in-memory database by selecting the `In-memory database` DuckDB Server option. ![DuckDB Server dropdown](../../img/tableau-connect-options-in-memory.png) The data will then need to be provided by an Initial SQL string, for example: ```sql CREATE VIEW my_parquet AS SELECT * FROM read_parquet('/path/to/file/my_file.parquet'); ``` You can then access it by using the Tableau Data Source editing controls. ### MotherDuck To connect to MotherDuck, you have two authentication options: * Token -- provide the value that you [get from MotherDuck UI](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token). * No Authentication -- unless `motherduck_token` environment variable is available to Tableau at startup, you will then be prompted to authenticate when at connection time. To work with a MotherDuck database in Tableau, you have to provide the database to use when issuing queries. In `MotherDuck Database` field, provide the name of your database. You don't have to prefix it with `md:`: ![DuckDB Server dropdown](../../img/tableau-connect-options-md.png) ![Connection Dialogue](../../img/tableau-connect-motherduck.png) ## Additional information * [Tableau Documentation](https://help.tableau.com/current/pro/desktop/en-us/gettingstarted_overview.htm) * [Tableau Exchange Connector DuckDB/MotherDuck](https://exchange.tableau.com/en-gb/products/1021) * [DuckDB Tableau Connector](https://github.com/MotherDuck-Open-Source/duckdb-tableau-connector/) --- Source: https://motherduck.com/docs/integrations/bi-tools/evidence # Evidence > Evidence is an open source, code-based alternative to drag-and-drop BI tools. Build polished data products with just SQL and markdown. ## Getting started Head over to [their installation page](https://docs.evidence.dev/getting-started/install-evidence) and start with their template to get you started. ## Authenticate to MotherDuck When using development, you can go manually through the UI, pick "settings". If you are running Evidence locally, typically at [http://localhost:3000/settings](http://localhost:3000/settings). ![img](../img/evidence_settings.png) Then select 'DuckDB' as a connection type, and as the filename, use `'md:?motherduck_token=xxxx'` where `xxx` is your [access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck#authentication-using-an-access-token). Finally as extension, select "No extension". Click on `Save`. ![img](../img/evidence_duckdb.png) In production, you can set [some global environments](https://docs.evidence.dev/deployment/environments#prod-environment), you would have to set two environments variables: - `EVIDENCE_DUCKDB_FILENAME='md:?motherduck_token=xxxx'` - `EVIDENCE_DATABASE=duckdb` ## Displaying some data through SQL and markdown Once done, you can add a new page in the `pages` folder and add the following code blocks to `stackoverflow.md` file: First, add some Markdown headers. ```md --- title: Evidence & MotherDuck --- # Stories with most score ``` Then, we query our data from the [HackerNews sample_data database](/getting-started/sample-data-queries/hacker-news.md) in MotherDuck. The query is fetching the top stories (posts) from HackerNews. SELECT id, title, score, "by", strftime('%Y-%m-%d', to_timestamp(time)) AS date FROM sample_data.hn.hacker_news WHERE type = 'story' ORDER BY score DESC LIMIT 20; Finally, we use the reference of that query result `new_items` to create a list that would be generated in Markdown. The list contains the title (with the url of the story), the date, the score and the author of the story. ```md {#each new_items as item} * [{item.title}](https://news.ycombinator.com/item?id={item.id}) {item.date} ⬆ {item.score} by [{item.by}](https://news.ycombinator.com/user?id={item.by}) {/each} ``` Head over then to this page you created and you should see the final result that looks like this: ![img](../img/evidence_hackernews.png) --- Source: https://motherduck.com/docs/integrations/bi-tools/powerbi/powerbi-custom-connector # Power BI custom connector (legacy) > Connect Power BI to MotherDuck using the DuckDB ODBC driver and Power Query custom connector. :::warning[Legacy] The custom connector is a legacy approach. Use the [Postgres endpoint setup](./powerbi-desktop.mdx) instead for a simpler connection that doesn't require installing drivers or custom extensions. ::: The open-source [DuckDB Power Query Connector](https://github.com/motherduckdb/duckdb-power-query-connector/) lets you connect Power BI to DuckDB and MotherDuck using the DuckDB ODBC driver. ## Installing 1. Download the latest [DuckDB ODBC driver for Windows (x86_64/AMD64)](https://github.com/duckdb/duckdb-odbc/releases/download/v1.4.4.0/duckdb_odbc-windows-amd64.zip). See [the releases page](https://github.com/duckdb/duckdb-odbc/releases) for other versions and architectures. For more information about the Windows ODBC Driver, see the [DuckDB Docs page on DuckDB ODBC API on Windows](https://duckdb.org/docs/stable/clients/odbc/windows). 2. Extract the `.zip` archive. Run `odbc_install.exe`. If Windows displays a security warning, click "More information" then "Run Anyway". 3. Optionally, verify the installation in the Registry Editor: - Open Registry Editor by running `regedit` - Navigate to `HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBCINST.INI\DuckDB` - Confirm the Driver field shows your installed version - If incorrect, delete the `DuckDB` registry key and reinstall 4. Configure Power BI security settings to allow loading of custom extensions: - Go to File -> Options and settings -> Options -> Security -> Data Extensions - Enable "Allow any extensions to load without validation or warning" - ![Dialog window showing Power BI Options to Security to Data Extensions](/img/integrations/PowerBI-data-extensions.png) 5. Download the latest version of the DuckDB Power Query extension: - [duckdb-power-query-connector.mez](https://github.com/MotherDuck-Open-Source/duckdb-power-query-connector/releases/latest/download/duckdb-power-query-connector.mez) 6. Create the Custom Connectors directory if it does not yet exist: - Navigate to `[Documents]\Power BI Desktop\Custom Connectors` - Create this folder, if it doesn't exist - Note: If this location does not work you may need to place this in your OneDrive Documents folder instead 7. Copy the `duckdb-power-query-connector.mez` file into the Custom Connectors folder 8. Restart Power BI Desktop ## How to use with Power BI 1. In Power BI Desktop, click "Get Data" -> "More..." ![Power BI Get Data dialog](/img/integrations/PowerBI-get-data.png) 2. Search for "DuckDB" in the connector search box and select the DuckDB connector ![Power BI connector search showing the DuckDB connector](/img/integrations/PowerBI-Select-source-duckdb.png) 3. For MotherDuck connections, you'll need to provide: - Database Location: Use the `md:` prefix followed by your database name (for example, `md:my_database`). This can also be a local file path (for example, `~\my_database.db`) or an in-memory database (`:memory:`). - MotherDuck Token: Get your token from [MotherDuck's token page](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token). *For local DuckDB connections:* Enter "localtoken" to enable the connection. ![Power BI DuckDB connection dialog](/img/integrations/PowerBI-duckdb-database.png) - Read Only (Optional): Set to `true` if you only need read access. - Saas_mode (Optional): Set to `true` to disable [DuckDB extensions](../../../concepts/duckdb-extensions.md). - Attach_mode (Optional): Set to `single` to scope the connection to one database (recommended for BI-tool catalog browsers, which can be confused by multiple attached databases). Leave blank to use the default workspace mode and see all databases in your workspace. See [Attach modes](/key-tasks/authenticating-and-connecting-to-motherduck/attach-modes/). 4. Click "OK". 5. Click "Connect". ![Power BI connection dialog for MotherDuck through the DuckDB connector](/img/integrations/PowerBI-DuckDB-connect.png) 6. Select the table(s) you want to import. Click "Load". ![Power BI Navigator dialog previewing and selecting tables](/img/integrations/PowerBI-data-selection.png) 7. You can query your data and create visualizations. ![Power BI report built from MotherDuck data](/img/integrations/PowerBI-example-graph.png) 8. After connecting, you can: - Browse and select tables from your MotherDuck or DuckDB database - Use "Transform Data" to modify your queries before loading - Write custom SQL queries using the "Advanced Editor" - Import multiple tables in one go 9. Power BI maintains the connection to your MotherDuck or DuckDB database, letting you: - Refresh data automatically or on-demand - Create relationships between tables - Build visualizations and dashboards - Share reports with other users (requires proper gateway setup) ## Use custom data connectors with an on-premises data gateway You can use custom data connectors with an on-premises data gateway to connect to data sources that are not supported by default. To do this, you need to install the on-premises data gateway and configure it to use the custom data connector. For more information, see [Use custom data connectors with an on-premises data gateway in Power BI](https://learn.microsoft.com/en-us/power-bi/connect-data/service-gateway-custom-connectors). There are some limitations with using a custom connector with an on-premises data gateway: - The folder you create must be accessible to the background gateway service. Folders under user Windows folders or system folders typically aren't accessible. The on-premises data gateway app shows a message if the folder isn't accessible. This limitation doesn't apply to the on-premises data gateway (personal mode). - If your custom connector is on a network drive, include the fully qualified path in the on-premises data gateway app. - You can only use one custom connector data source when working in DirectQuery mode. Multiple custom connector data sources don't work with DirectQuery. ## Additional information - [Power BI documentation](https://learn.microsoft.com/en-us/power-bi/connect-data/) - [DuckDB Power Query Connector](https://github.com/motherduckdb/duckdb-power-query-connector/) ## Troubleshooting ### Missing VCRUNTIME140.dll If you receive an error about missing `VCRUNTIME140.dll`, you need to install the Microsoft Visual C++ Redistributable. You can download it from [Microsoft's download page](https://www.microsoft.com/en-us/download/details.aspx?id=52685). ### Visual C++ and ODBC issues :::note These steps are particularly relevant for Windows Server environments, especially for Windows Server 2019, but may also help resolve issues on other Windows versions. ::: If you encounter issues with ODBC connectivity or receive errors related to Visual C++ libraries, try these troubleshooting steps: 1. Reinstall the Microsoft Visual C++ Redistributable: - Download the latest version from [Microsoft's official website](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170) for your architecture - Run the installer with administrator privileges - Restart your computer after installation - Try connecting to MotherDuck again 2. If you're still experiencing issues, you can use the ODBC Test tool to diagnose the connection: - Open the ODBC Test tool (typically available in Windows SDK) - Look for a dropdown menu labeled "hstmt 1: ..." - Select this option to run test queries - If queries work in the ODBC Test tool but not in Power BI, this indicates a Power BI-specific configuration issue If you continue to experience problems after trying these steps: - Verify that your MotherDuck token is valid and hasn't expired - Check that your network allows connections to MotherDuck's services - Confirm you have the latest version of the DuckDB Power Query Connector installed If you're still experiencing issues, reach out to us at [support@motherduck.com](mailto:support@motherduck.com) and we'll be happy to help you troubleshoot the issue. --- Source: https://motherduck.com/docs/integrations/language-apis-and-drivers/python/sqlalchemy # SQLAlchemy with DuckDB and MotherDuck > Connect to MotherDuck from SQLAlchemy using either the PostgreSQL connector through MotherDuck's Postgres endpoint or the DuckDB SQLAlchemy driver. [SQLAlchemy](https://www.sqlalchemy.org/) is a Python SQL toolkit and Object-Relational Mapping (ORM) system that supports a wide range of database dialects. Many business intelligence tools support SQLAlchemy out of the box. You can connect SQLAlchemy to MotherDuck through two paths: - **Recommended:** [MotherDuck's Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/) with SQLAlchemy's built-in PostgreSQL dialect and the `psycopg` driver. This path uses the standard PostgreSQL wire protocol and doesn't require DuckDB in your application environment. - **DuckDB SQLAlchemy driver:** the [`duckdb-engine`](https://github.com/Mause/duckdb_engine) dialect, which connects through a DuckDB connection string. Use this path when you need DuckDB-specific SQLAlchemy behavior or local DuckDB features such as local-file access, local attachments, hybrid execution, or DuckDB extension management. ## Why use the Postgres endpoint The DuckDB SQLAlchemy driver can connect to MotherDuck through a DuckDB connection string, but most SQLAlchemy applications should use the Postgres endpoint instead: - **Standard connector support**: Use SQLAlchemy's built-in PostgreSQL dialect with the `psycopg` driver. - **No DuckDB dependency**: Connect from serverless runtimes, containers, and application servers without bundling DuckDB. - **Production connection management**: Use SQLAlchemy pooling with long-lived Postgres-compatible connections. - **Consistent integration path**: Share the same connection parameters used by other Postgres-compatible tools. Use the DuckDB SQLAlchemy driver only when your application needs local DuckDB features such as local-file access, local attachments, hybrid execution, or DuckDB extension management. ## Before you start You need a [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/). Store it in an environment variable: ```bash export MOTHERDUCK_TOKEN="your_token_here" ``` You also need your MotherDuck Postgres host. You can find it in [MotherDuck Postgres settings](https://app.motherduck.com/settings/postgres). The examples below use `pg.us-east-1-aws.motherduck.com`; use the host shown for your account. ## Install SQLAlchemy and psycopg Install SQLAlchemy and the PostgreSQL connector: ```bash pip install --upgrade sqlalchemy psycopg ``` ## Connect with SQLAlchemy Create a SQLAlchemy engine with the PostgreSQL dialect and the `psycopg` driver: ```python import os from sqlalchemy import create_engine, text engine = create_engine( "postgresql+psycopg://postgres@pg.us-east-1-aws.motherduck.com:5432/md:", connect_args={ "password": os.environ["MOTHERDUCK_TOKEN"], "sslmode": "verify-full", "sslrootcert": "system", }, pool_pre_ping=True, ) with engine.connect() as conn: result = conn.execute(text("SHOW DATABASES")) for row in result: print(row) ``` Using `md:` as the database name connects to your default database and uses workspace attach mode, which makes the databases in your MotherDuck workspace available from the session. To connect to a specific database, replace `md:` with the database name: ```python engine = create_engine( "postgresql+psycopg://postgres@pg.us-east-1-aws.motherduck.com:5432/sample_data", connect_args={ "password": os.environ["MOTHERDUCK_TOKEN"], "sslmode": "verify-full", "sslrootcert": "system", }, ) ``` ## Query MotherDuck Execute SQL with SQLAlchemy's `text()` construct: ```python from sqlalchemy import text with engine.connect() as conn: result = conn.execute( text( """ SELECT title, score FROM sample_data.hn.hacker_news WHERE type = 'story' ORDER BY score DESC LIMIT 5 """ ) ) for row in result: print(row.title, row.score) ``` The Postgres endpoint is a PostgreSQL-wire interface to MotherDuck. You write **DuckDB SQL**, not PostgreSQL SQL. ## Loading data For loading data through SQLAlchemy and the Postgres endpoint, prefer server-side reads from remote storage: - Use `CREATE TABLE AS SELECT` or `INSERT INTO ... SELECT`. - Point `read_parquet`, `read_csv`, or `read_json` at S3, GCS, R2, Azure, or HTTPS. - Set `MD_RUN = REMOTE` on remote file reads. ```python from sqlalchemy import text with engine.begin() as conn: conn.execute( text( """ CREATE OR REPLACE TABLE my_db.main.weather_events AS SELECT * FROM read_csv( 'https://raw.githubusercontent.com/duckdb/duckdb-web/main/data/weather.csv', HEADER = true, AUTO_DETECT = true, MD_RUN = REMOTE ) """ ) ) ``` Use a DuckDB client path instead for local-file ingestion, `COPY FROM STDIN`, `MD_RUN = LOCAL`, or high-volume inserts from application memory. ## Operational notes - **Use SSL**: The Postgres endpoint requires encrypted connections. `sslmode=verify-full` with `sslrootcert=system` verifies the server certificate when supported by your client. - **Keep tokens out of code**: Pass your MotherDuck access token through an environment variable or secret manager. - **Prefer long-lived connections**: Configure SQLAlchemy pooling for application workloads instead of opening a new connection per query. - **Avoid unsupported Postgres features**: PostgreSQL-specific functions, indexes, sequences, stored procedures, and temporary tables are not supported. ## Connecting with the DuckDB SQLAlchemy driver If your application needs DuckDB-specific SQLAlchemy behavior or local DuckDB features, use the [DuckDB SQLAlchemy driver](https://github.com/Mause/duckdb_engine) and the DuckDB SQLAlchemy URI style instead. ### Install the DuckDB SQLAlchemy driver ```bash pip install --upgrade duckdb-engine ``` ### Connect to a local DuckDB database Access a local DuckDB database with the SQLAlchemy URI: ```bash duckdb:///path/to/file.db ``` ### Connect to MotherDuck The general pattern for the SQLAlchemy URI to access a MotherDuck database is: ```bash duckdb:///md:?motherduck_token= ``` The database name `` in the connection string is optional. Omitting it lets you query multiple databases with one connection to MotherDuck. You can authenticate in several ways: **1. Web login** If no token is available, the process directs you to a web login for authentication, which lets you obtain a token. ```python from sqlalchemy import create_engine, text eng = create_engine("duckdb:///md:my_db") with eng.connect() as conn: result = conn.execute(text("SHOW DATABASES")) for row in result: print(row) ``` When you run the above, you'll see something like this to authenticate: ![motherduck login](../img/sqlalchemy_auth.png) **2. `MOTHERDUCK_TOKEN` environment variable** ```python from sqlalchemy import create_engine, text eng = create_engine("duckdb:///md:my_db") with eng.connect() as conn: result = conn.execute(text("SHOW DATABASES")) for row in result: print(row) ``` **3. Configuration dictionary** ```python from sqlalchemy import create_engine, text config = {} token = 'asdfwerasdf' # Fill in your token config["motherduck_token"] = token eng = create_engine( "duckdb:///md:my_db", connect_args={'config': config} ) with eng.connect() as conn: result = conn.execute(text("SHOW DATABASES")) for row in result: print(row) ``` **4. Token as a connection string parameter** ```python from sqlalchemy import create_engine, text token = 'asdfwerasdf' # Fill in your token eng = create_engine(f"duckdb:///md:my_db?motherduck_token={token}") with eng.connect() as conn: result = conn.execute(text("SHOW DATABASES")) for row in result: print(row) ``` :::info The DuckDB Python API has a `.sql()` method on the connection API, but SQLAlchemy doesn't. Both share the `.execute()` function and concept. For more, see the [SQLAlchemy connection documentation](https://docs.sqlalchemy.org/en/20/core/connections.html#sqlalchemy.engine.Connection). ::: ## Related content - **Connect through the Postgres endpoint**: [Postgres endpoint guide](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/) - **Review connection parameters**: [Postgres Endpoint reference](/sql-reference/postgres-endpoint/) - **Connect from Python**: [Python through the Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/python/) - **Choose an interface**: [Client APIs](/getting-started/interfaces/client-apis/) --- Source: https://motherduck.com/docs/integrations/bi-tools/tableau/tableau-bridge # Tableau Bridge (legacy) > Connect Tableau Cloud to MotherDuck using Tableau Bridge and the DuckDB JDBC connector. :::warning[Deprecated] Connecting through Tableau Bridge is a legacy approach. Use the [Postgres endpoint setup](./tableau-cloud.mdx) instead for a simpler connection that doesn't require Bridge infrastructure. ::: ## How to use Tableau Cloud with MotherDuck through Tableau Bridge ### Setup This guide assumes you have: - a [Tableau account](https://www.tableau.com/) - a Tableau Cloud Site - a Tableau Desktop installation (with the same version as the Tableau Cloud Server Version) set up with the DuckDB JDBC Driver and Tableau Connector. If you don't, sign up or ask your organization to purchase a plan, or sign up for a free trial. ### Obtain a PAT token Follow [Tableau's instructions on creating a PAT token.](https://help.tableau.com/current/server/en-us/security_personal_access_tokens.htm) This token must belong to a site admin. ### Set up Bridge client Use the [Tableau Bridge client setup instructions](https://help.tableau.com/current/online/en-us/to_bridge_client.htm) to install and set up Bridge client. 1. Make sure the machine where the Bridge client is installed has access to the Database used in the above steps. Important notes: > Network access - Because Bridge facilitates connections between your private network data and Tableau Cloud, it requires the ability to make outbound connections through the internet. After the initial outbound connection, communication is bidirectional. > Required ports - Tableau Bridge uses port 443 to make outbound internet requests to Tableau Cloud and port 80 for certificate validation. 2. Install Bridge client and make sure the Bridge client is signed in to the Tableau Cloud site. You can download the installer from the [Tableau Bridge releases page](https://www.tableau.com/support/releases/bridge). 3. Install the driver and taco files as outlined in the [Tableau connector setup guide](https://help.tableau.com/current/online/en-us/to_sync_local_data.htm#connectors-and-data-types). - [Windows Server] The driver also needs to be installed here: `C:\Program Files\Tableau\Tableau Bridge\Drivers` - [Windows Server] The connector also needs to be installed here: `C:\Program Files\Tableau\Connectors` > Note: Tableau Bridge can be deployed on both Windows or Linux. ### Running Bridge on Linux using Docker (advanced) If you want to run Bridge centrally on a Linux host, the official guidance recommends running it inside a Docker container, as described in Tableau's documentation on [installing Bridge for Linux in containers](https://help.tableau.com/current/online/en-us/to_bridge_linux_install.htm). Below is an **example Dockerfile** you can use as a starting point—this includes where to add JDBC drivers and the **DuckDB/MotherDuck** `.taco` file. It's provided for inspiration and may require updates to match your environment or newer versions of the software.
Example Dockerfile ```dockerfile FROM registry.access.redhat.com/ubi8/ubi:latest RUN yum update -y RUN yum install -y glibc-langpack-en # This is the latest version of Tableau Bridge that is known working with the MotherDuck connector RUN curl -o /tmp/TableauBridge.rpm -L \ https://downloads.tableau.com/tssoftware/TableauBridge-20243.25.0114.1153.x86_64.rpm && \ ACCEPT_EULA=y yum install -y /tmp/TableauBridge.rpm && \ rm /tmp/TableauBridge.rpm # Drivers RUN mkdir -p /opt/tableau/tableau_driver/jdbc # Connectors (tacos) RUN mkdir -p /root/Documents/My_Tableau_Bridge_Repository/Connectors # Download DuckDB JDBC driver and signed taco RUN curl -o /opt/tableau/tableau_driver/jdbc/duckdb_jdbc-1.3.0.0.jar \ -L https://repo1.maven.org/maven2/org/duckdb/duckdb_jdbc/1.3.0.0/duckdb_jdbc-1.3.0.0.jar && \ curl -o /root/Documents/My_Tableau_Bridge_Repository/Connectors/duckdb_jdbc-v1.1.1-signed.taco \ -L https://github.com/motherduckdb/duckdb-tableau-connector/releases/download/v1.1.1/duckdb_jdbc-v1.1.1-signed.taco ENV TZ=Europe/Berlin ENV LC_ALL=en_US.UTF-8 # ----- user specific settings ----- ENV USER_EMAIL="" ENV PAT_ID=BridgeToken ENV CLIENT_NAME="" ENV SITE_NAME="" ENV POOL_ID="" # ----------------------------------- CMD /opt/tableau/tableau_bridge/bin/run-bridge.sh -e \ --patTokenId=$PAT_ID \ --userEmail=$USER_EMAIL \ --client=$CLIENT_NAME \ --site=$SITE_NAME \ --patTokenFile="/home/documents/token.txt" \ --poolId=$POOL_ID ```
Key points: * Build an image that **installs the Bridge RPM** and then copies the DuckDB JDBC driver to `/opt/tableau/tableau_bridge/Drivers` and the connector to `/root/Documents/My_Tableau_Bridge_Repository/Connectors`. * Start the bridge by calling `run-bridge.sh` and pass the following flags: * `--patTokenFile /run/secrets/pat.json` * `--patTokenId ` * `--site ` * `--poolId ` (optional – see note on pools below) * **PAT naming rule** – the *name* you give the Personal-Access-Token in Tableau **must** be a valid JSON key and must be used **verbatim** 1. as the key in `pat.json` → `{"": ""}` 2. in the `--patTokenId` flag. A mismatch will result in a silent authentication failure. * The latest Bridge **2025.1** builds contain a regression that prevents the MotherDuck connector (and several others) from loading. Until Tableau fixes this, pin the image to the **20243.25.0114.1153** release (see discussion in [GitHub issue #22](https://github.com/MotherDuck-Open-Source/duckdb-tableau-connector/issues/22)). * Bridge listens only on outbound **443/tcp**, so you do **not** need to publish any container ports. If you run a host firewall (for example, `ufw`) remember that Docker bypasses it [[Docker docs](https://docs.docker.com/engine/network/packet-filtering-firewalls/#docker-and-ufw)]. Restrict egress traffic to Tableau Cloud CIDR blocks if your security policy requires it. * Logs written to `stdout` are useful, but the *detailed* logs live in `/root/Documents/My_Tableau_Bridge_Repository/Log`. Mount this path as a volume or use a side-car to ship the logs to your observability stack. ### Tableau Cloud Bridge pool setup By default, Tableau places the Bridge in the default pool. 1. In Settings → Bridge page, make sure the Bridge client is connected in the connection Status. 2. In the "Private Network Allowlist" add the domain of the database and select the pool. ![Tableau Bridge Pooling](useBaseUrl('/img/integrations/tableau-bridge-pooling.png')) > **Pool Gotcha**: Some users report that a Linux containerised Bridge never shows up under a custom site pool. If that happens, leave `POOL_ID` blank when starting the client – it will join the legacy **Default** pool and still work with live connections. ### Create embedded data source (live) and workbook 1. Open Tableau desktop and sign in to a Tableau Cloud site. > Note: Make sure the Tableau Desktop and [Tableau Cloud version](https://help.tableau.com/current/server/en-us/version_server_view.htm) match. 2. Create new Workbook and select the database connector. 3. Connect to the database. ![Tableau Cloud DuckDB connector dialog](useBaseUrl('/img/integrations/tableau-cloud-connect-motherduck.png')) 4. Set up Datasource to use live connectivity. 5. Create a worksheet with the data. ![Tableau worksheet with MotherDuck data](useBaseUrl('/img/integrations/tableau-create-worksheet.png')) ### Publish the workbook to Tableau Cloud 1. Click on "Server > Publish Workbook". ![Tableau publish workbook menu](useBaseUrl('/img/integrations/tableau-publish-workbook.png')) 2. Select "Publish Separately" under Publish Type and "Embedded password" under Authentication. Select "Maintain connection to a live data source". ![Tableau publish separately dialog](useBaseUrl('/img/integrations/tableau-publish-separately.png')) ![Tableau publish workbook and data source dialog](useBaseUrl('/img/integrations/tableau-publish-workbook-and-data-source.png')) 3. Click "Publish Workbook & 1 Data Source". ![Tableau publishing complete confirmation](useBaseUrl('/img/integrations/tableau-publishing-complete.png')) ### (Important step!) update Tableau Bridge client in data source 1. Navigate to the newly published data source in Tableau Cloud (in your browser) and click on the "i" icon to open Data Source Details. ![Tableau data source info icon](useBaseUrl('/img/integrations/tableau-data-source-i.png')) 2. Click on "Change Bridge Client..." ![Tableau data source details dialog](useBaseUrl('/img/integrations/tableau-data-source-details.png')) 3. Change the bridge client from "Site client pool" to your bridge client (the one you set up in the previous section). Click "Save" and close the dialog. ![Tableau change bridge client dialog](useBaseUrl('/img/integrations/tableau-change-bridge-client.png')) 4. Check that the data source shows up in your Tableau Bridge status dialog. This dialog is located in the Windows Start bar (in the Icon panel). ![Tableau Bridge connected status](useBaseUrl('/img/integrations/tableau-bridge-connected.png')) 5. You can access your Published Workbook on your Tableau Cloud Site, or you can create a new Tableau Workbook using the Published Data Source. ![Tableau workbook using published data source](useBaseUrl('/img/integrations/tableau-workbook-using-this-data-source.png')) ## Additional information - [Tableau Documentation](https://help.tableau.com/current/pro/desktop/en-us/gettingstarted_overview.htm) - [Tableau Exchange Connector DuckDB/MotherDuck](https://exchange.tableau.com/en-gb/products/1021) - [DuckDB Tableau Connector](https://github.com/MotherDuck-Open-Source/duckdb-tableau-connector/) --- Source: https://motherduck.com/docs/integrations/databases/sql-server # SQL Server > Use the SQL Server replication guide when you need to read tables or queries from SQL Server and write the results to MotherDuck. The guide covers Python, pyodbc, SQL Server authentication, and loading dataframe results into MotherDuck. ## How it works with MotherDuck 1. Connect to SQL Server with the Microsoft ODBC driver and `pyodbc`. 2. Read a SQL Server table or query result into a dataframe. 3. Connect to MotherDuck from Python and persist the dataframe as a MotherDuck table. ## Related content - [Replicating SQL Server tables to MotherDuck](/key-tasks/data-warehousing/replication/sql-server) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/bi-tools/superset-preset # Superset & Preset > Apache Superset is a powerful, open-source data exploration and visualization platform designed to be intuitive and interactive. It allows data professionals to quickly integrate and analyze data from various sources, creating insightful dashboards and charts for better decision making. [Preset](https://preset.io/) is a cloud-native, user-friendly platform built on Apache Superset. It offers enhanced capabilities and managed services to leverage the power of Superset without needing to handle installation and maintenance. In this guide, we'll cover how you can use MotherDuck with either Superset or Preset. ## Self-hosted Superset ### Setup The easy way to get started locally with Superset is to use their [docker-compose configurations.](https://superset.apache.org/docs/installation/installing-superset-using-docker-compose/) ### Adding a database connection to MotherDuck To make it work with DuckDB & MotherDuck, you will have to install two extra Python packages in your local Superset environment: - DuckDB SQLAlchemy driver [duckdb-engine](https://github.com/Mause/duckdb_engine) - DuckDB [duckdb](https://github.com/duckdb/duckdb) 1. Clone the [Superset repository](https://github.com/apache/superset): ```bash git clone https://github.com/apache/superset.git ``` 2. Create a new file in `superset/docker/requirements-local.txt` and add the following packages: ```text duckdb-engine duckdb ``` 3. Build or run the docker container, depending whether this is the first time you run it or not, with the following command: ```bash # First time running it docker-compose up --build # Subsequent runs docker-compose up ``` 4. Once the container is running, you can access the Superset UI at [http://localhost:8088](http://localhost:8088) or at the address you specified in the `docker-compose.yml` file. 5. Once you are logged in, head over to "Settings" and click on "Database Connections", then click on "+ Database". ![Superset Settings menu showing Database Connections option](./img/superset-database-connections-menu.png) ![Superset Add Database button](./img/superset-add-database.png) 6. In the Dropdown, pick "MotherDuck", then enter the database name that you want to connect to and the MotherDuck token of the user or service account. :::note If MotherDuck isn't listed, there's probably an error in the installation of the `duckdb-engine`. Review the installation steps under (2) to install this extra python package. ::: :::info `Database name` is **optional**. Instead of specifying a database name, you can leave it empty to connect to all databases. ::: ![Superset dropdown showing MotherDuck option](./img/superset-select-motherduck.png) ![Superset MotherDuck connection form with database name and token fields](./img/superset-motherduck-connection.png) 7. Finally, you can test your token/connection is valid by clicking "Test connection" and click "Connect". Now your MotherDuck database is available in Superset and you can start querying data and making some dashboards! ## Preset ### Setup You can register a Preset account for [free](https://preset.io/pricing/) (up to 5 users). Upon your account creation, you will need to create a workspace and be prompted to connect to your data source. ### Adding your first database connection to MotherDuck When you first setup Preset, you will be offered to create a connection to a database. Preset has a direct integration with MotherDuck, making the connection process simpler. 1. In the Database Connection Dropdown in "Connect your first database", select "MotherDuck" and enter your MotherDuck credentials and database information. :::note The Database Name needs to be prefixed with `md:` to connect to MotherDuck. The Access Token is the token you created in the [MotherDuck dashboard](https://app.motherduck.com). ::: ![Preset database connection dropdown with MotherDuck option](./img/preset-select-motherduck.png) ![Preset MotherDuck credentials form](./img/preset-motherduck-credentials.png) 2. Click "Connect" to verify your connection is valid. Now your MotherDuck database is available in Preset and you can start creating dashboards immediately! :::info You can connect to multiple databases using a single MotherDuck connection. ::: ### Adding additional database connections When adding more database connections to Preset, you can choose the option of "Get MotherDuck token". This generates a new token from the MotherDuck account you are logged into. 1. Add a database connection by going to "Settings", then "Database Connections". In the Database Connections page, click on "+ Database" in the top right corner. ![Preset Settings showing Database Connections page](./img/preset-database-connections-menu.png) ![Preset Add Database button](./img/preset-add-database.png) 2. In the dropdown, select "MotherDuck" (see above). 3. Enter your MotherDuck credentials and database information. Here you have the option to generate a new token using the `Get MotherDuck token` button or use a token you previously created. ![Preset MotherDuck credentials form with Get MotherDuck token option](./img/preset-motherduck-token.png) :::caution Given that usually BI tools such as Preset and Superset are connected to service accounts, we do not recommend the "Get MotherDuck token" option for production systems but only for testing. For production systems the recommended approach is to generate an access token for the dedicated service account using the MotherDuck REST API and connect this account to Preset instead. ::: ## Related content - [SQLAlchemy with DuckDB and MotherDuck](/docs/integrations/language-apis-and-drivers/python/sqlalchemy/) - [Authenticating to MotherDuck](/docs/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/) - [Managing Service Accounts](/docs/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/) --- Source: https://motherduck.com/docs/integrations/sql-ides/datagrip # DataGrip > JetBrains DataGrip is a cross-platform IDE for working with SQL and noSQL databases. It includes a DuckDB integration, which makes connecting to MotherDuck easy. ## Connecting to MotherDuck in DataGrip Create a new data source and choose the **DuckDB** driver. DataGrip opens the **Data Sources and Drivers** window where you configure the connection. ### Token authentication To retrieve a MotherDuck token, follow the steps in [Authenticating to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/authenticating-to-motherduck.md). 1. In **Data Sources and Drivers > General**, set **Authentication** to **No auth**. 2. Populate the **URL** field with the MotherDuck connection string, replacing `my_db` with your database name or omitting it to connect to the default catalog: ```sh jdbc:duckdb:md:[my_db] ``` ![config](../img/datagrip_config.png) 3. Open the **Advanced** tab and add a new parameter named `motherduck_token`, setting its value to the token you generated earlier. ![config](../img/datagrip_token.png) Click "OK" to begin querying MotherDuck! :::note The default schema filtering configuration of DataGrip may hide some of the schemas that exist in your MotherDuck account. Reconfigure to display all schemas following [DataGrip documentation](https://www.jetbrains.com/help/datagrip/schemas.html). ::: ## Update the DuckDB driver version DataGrip bundles a DuckDB JDBC driver, but you can replace it with another version if needed. 1. Visit the [DuckDB JDBC maven repository](https://mvnrepository.com/artifact/org.duckdb/duckdb_jdbc). 2. Select the DuckDB release you want to use and download the `.jar` file listed under **Files**. 3. In the **Data Sources and Drivers** window, switch to the **Drivers** pane and select **DuckDB**. 4. On the **General** tab, find **Driver files**, click the **+** icon, and choose the `.jar` file you downloaded. 5. You need to remove the existing DuckDB driver from the **Drivers** pane for the new driver to take effect (needs to be first in the list). 6. [optional] To restore the default driver, click on the **+** icon and select **DuckDB** among the available drivers. DataGrip now uses the updated DuckDB driver for MotherDuck connections. --- Source: https://motherduck.com/docs/integrations/sql-ides/dbeaver # DBeaver > DBeaver Community is a free cross-platform database integrated development environment (IDE). It includes a DuckDB integration, so it is a great choice for querying MotherDuck. ## DBeaver DuckDB setup DBeaver uses the official [DuckDB JDBC driver](https://duckdb.org/docs/api/java.html), which supports MotherDuck out of the box! To install DBeaver and the DuckDB driver, first follow the [DuckDB DBeaver guide](https://duckdb.org/docs/guides/sql_editors/dbeaver). That guide will create a local DuckDB in memory connection. After completing those steps, follow the steps below to add a MotherDuck connection in addition! ## Connecting DBeaver to MotherDuck ### Browser authentication Create a new DuckDB connection in DBeaver. When entering the connection string in DBeaver, instead of using `:memory:` for an in memory DuckDB, use `md:my_db`. Replace `my_db` with the name of the target MotherDuck database as needed. Clicking either "Test Connection" or "Finish" will open the default browser and display an authorization prompt. Click "Confirm", then return to DBeaver to begin querying MotherDuck! ### Token authentication To avoid the authentication prompt when opening DBeaver, a MotherDuck access token can be included as a connection string parameter. To retrieve a token, follow the steps in [Authenticating to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/authenticating-to-motherduck.md). Then, create a new DuckDB connection in DBeaver. Include the token as a query string parameter in the connection string following this format, replacing `` with the access token from the prior step, and `my_db` with the target MotherDuck database: ```sh md:my_db?motherduck_token= ``` Click "Finish" to begin querying MotherDuck! --- Source: https://motherduck.com/docs/integrations/cloud-storage/hetzner-object-storage # Hetzner Object Storage > Hetzner Object Storage is a S3-compatible object storage service. ## Configure Hetzner Object Storage credentials You can safely store your Hetzner Object Storage credentials in MotherDuck for convenience by creating a `SECRET` object using the [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret.md) command. :::note See [Hetzner docs](https://docs.hetzner.com/storage/object-storage/getting-started/generating-s3-keys/) to create S3 access keys. Save your secret key immediately as it cannot be viewed again after creation. ::: ### Create a SECRET object ### SQL ```sql CREATE SECRET IN MOTHERDUCK ( TYPE S3, KEY_ID 'your_access_key', # provided by Hetzner SECRET 'your_secret_key', # provided by Hetzner ENDPOINT 'fsn1.your-objectstorage.com', # provided by Hetzner SCOPE 'your_bucket_scope' # Example: s3://test-bucket ); ``` :::note The endpoint must include the location (e.g., fsn1, nbg1, or hel1). Available endpoints: - `fsn1.your-objectstorage.com` (Falkenstein) - `nbg1.your-objectstorage.com` (Nuremberg) - `hel1.your-objectstorage.com` (Helsinki) ::: ```sql -- test the Hetzner Object Storage credentials SELECT count(*) FROM 's3://[bucket]/[file]' ``` ### Python ```python import duckdb con = duckdb.connect('md:') con.sql("CREATE SECRET IN MOTHERDUCK ( TYPE S3, KEY_ID 'your_access_key', SECRET 'your_secret_key', ENDPOINT 'fsn1.your-objectstorage.com', SCOPE 'your_bucket_scope' )"); # testing that our Hetzner credentials work con.sql("SELECT count(*) FROM 's3://[bucket]/[file]'").show() ``` ### UI Click on your profile to access the `Settings` panel and click on `Secrets` menu. ![menu_1](./img/settings_access.png) ![menu_2](./img/settings_panel.png) Then click on `Add secret` in the secrets section. ![menu_3](./img/settings_secrets_panel.png) Select the Secret Type `S3` and fill in the required fields. Ensure you add the endpoint URL (e.g., `fsn1.your-objectstorage.com`) in the endpoint field. ### Delete a SECRET object ### SQL You can use the same method above, using the [DROP SECRET](/sql-reference/motherduck-sql-reference/delete-secret.md) command. ```sql DROP SECRET ; ``` ### UI Click on your profile and access the `Settings` menu. Click on the bin icon to delete the secret. ![menu_4](./img/secrets_delete_azure.png) ### Hetzner Object Storage credentials as temporary secrets MotherDuck supports DuckDB syntax for providing Hetzner Object Storage credentials. ```sql CREATE SECRET ( TYPE S3, KEY_ID 'your_access_key', SECRET 'your_secret_key', ENDPOINT 'fsn1.your-objectstorage.com', SCOPE 'your_bucket_scope' ); ``` :::note Local/In-memory secrets are not persisted across sessions. ::: :::info Even temporary, in-memory secrets are available to MotherDuck's cloud execution engine when you connect your local DuckDB instance to MotherDuck. When you query {props.provider}, the query runs on MotherDuck's servers, not your local machine, and MotherDuck uses the best-matching secret to authenticate, whether it is stored locally or in MotherDuck. For more details, see [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret/#querying-with-secrets). ::: ### Multiple locations configuration If you have buckets in different Hetzner locations, you should be creating scoped secrets: ```sql -- Secret for Falkenstein location CREATE SECRET hetzner_fsn1 IN MOTHERDUCK ( TYPE S3, KEY_ID 'access_key_1', SECRET 'secret_key_1', ENDPOINT 'fsn1.your-objectstorage.com', SCOPE 's3://my-bucket-fsn1' ); -- Secret for Nuremberg location CREATE SECRET hetzner_nbg1 IN MOTHERDUCK ( TYPE S3, KEY_ID 'access_key_2', SECRET 'secret_key_2', ENDPOINT 'nbg1.your-objectstorage.com', SCOPE 's3://my-bucket-nbg1' ); ``` :::tip By default, each key pair is automatically valid for every bucket within the same Hetzner project. Use bucket policies to restrict access if needed. ::: --- Source: https://motherduck.com/docs/integrations/bi-tools/metabase # Metabase > Metabase is an open source analytics/BI platform that provides intuitive data visualization and exploration capabilities. This guide details how to connect Metabase to both local DuckDB databases and MotherDuck. ## Prerequisites - Metabase installed (self-hosted) - Admin access to your Metabase instance - For MotherDuck connections: valid MotherDuck token ## Metabase Cloud Metabase Cloud does not support installing custom drivers. Support for the DuckDB/MotherDuck driver on Metabase Cloud is under development. Until Cloud support is available, use Self-hosted Metabase to connect to DuckDB or MotherDuck. ## Self-hosted Metabase ### Install the DuckDB driver ### Dockerfile (bundled plugin) 1. Create a `Dockerfile` that includes the latest Metabase plus the DuckDB driver: ```dockerfile FROM eclipse-temurin:21-jre ENV MB_PLUGINS_DIR=/plugins RUN mkdir -p ${MB_PLUGINS_DIR} /app # Latest Metabase ADD https://downloads.metabase.com/latest/metabase.jar /app/metabase.jar # Latest DuckDB driver ADD https://github.com/MotherDuck-Open-Source/metabase_duckdb_driver/releases/latest/download/duckdb.metabase-driver.jar ${MB_PLUGINS_DIR}/ EXPOSE 3000 CMD ["java", "-jar", "/app/metabase.jar"] ``` 2. Build and run: ```bash docker build -t metabase-duckdb:latest . docker run -d --name metaduck -p 3000:3000 -e MB_PLUGINS_DIR=/plugins metabase-duckdb:latest ``` Tip: For reproducible builds, pin versions instead of `latest`: ```dockerfile # Example of pinning versions (replace X.Y.Z) ADD https://downloads.metabase.com/vX.Y.Z/metabase.jar /app/metabase.jar ADD https://github.com/MotherDuck-Open-Source/metabase_duckdb_driver/releases/download/1.X.Y/duckdb.metabase-driver.jar ${MB_PLUGINS_DIR}/ ``` Note: Use a Debian/Ubuntu-based JRE image (not Alpine) to avoid glibc issues with the DuckDB driver. ### Manual 1. Download the latest DuckDB driver `.jar`: ```bash curl -L -o duckdb.metabase-driver.jar \ https://github.com/MotherDuck-Open-Source/metabase_duckdb_driver/releases/latest/download/duckdb.metabase-driver.jar ``` 1. Copy it to the Metabase plugins directory: - Standard installation (example): If your `metabase.jar` is at `~/app/metabase.jar`, place the driver in `~/app/plugins/` ```bash mkdir -p ~/app/plugins mv duckdb.metabase-driver.jar ~/app/plugins/ ``` - On Mac: The plugins directory is `~/Library/Application Support/Metabase/Plugins/` (if you are using a Mac) ```bash mkdir -p "${HOME}/Library/Application Support/Metabase/Plugins/" mv duckdb.metabase-driver.jar "${HOME}/Library/Application Support/Metabase/Plugins/" ``` - Custom location or Docker: set `MB_PLUGINS_DIR` to point Metabase at your plugins directory and place the `.jar` there (if you are using a custom location or Docker). 1. Restart Metabase so it picks up the new plugin. ### Remote (SSH) 1. SSH to the host and download to the plugins directory. Replace user/host and adjust `MB_PLUGINS_DIR` as needed. ```bash ssh user@your-host "bash -lc ' set -euo pipefail MB_PLUGINS_DIR=${MB_PLUGINS_DIR:-/app/plugins} mkdir -p "$MB_PLUGINS_DIR" if command -v wget >/dev/null; then wget -qO "$MB_PLUGINS_DIR/duckdb.metabase-driver.jar" \ https://github.com/MotherDuck-Open-Source/metabase_duckdb_driver/releases/latest/download/duckdb.metabase-driver.jar else curl -L -o "$MB_PLUGINS_DIR/duckdb.metabase-driver.jar" \ https://github.com/MotherDuck-Open-Source/metabase_duckdb_driver/releases/latest/download/duckdb.metabase-driver.jar fi '" ``` 2. Restart Metabase on the remote host: - systemd: `ssh user@your-host 'sudo systemctl restart metabase'` - Docker: `ssh user@your-host 'docker restart '` :::important Restart required: Metabase must be restarted after adding or upgrading plugins. Hot-reload of drivers is not supported. ::: :::tip Compatibility and upgrades: New DuckDB driver releases are designed to be backward compatible with recent Metabase versions. Upgrading to the latest driver is recommended for bug fixes and stability. If you run a significantly older Metabase version, validate in staging first. ::: ### Add your database connection After installing the driver, you can add MotherDuck as a data source in Metabase. 1. Log in to Metabase with admin credentials 2. Navigate to **Admin Settings** > **Databases** > **Add Database** 3. Select **DuckDB** as the database type :::note Since DuckDB does not do implicit casting by default, the `old_implicit_casting` config is necessary for datetime filtering in Metabase to function. It's recommended to keep it set. ::: #### Connecting to MotherDuck To connect to MotherDuck: 1. **Database name**: In the Database file field, enter `md:[database_name]` where `[database_name]` is your MotherDuck database name 2. **MotherDuck token**: Paste your MotherDuck token (retrieve from the [MotherDuck UI](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/authenticating-to-motherduck.md)) 3. **Configuration**: Enable `old_implicit_casting` (recommended) for proper datetime handling ![Example](../img/metabase_motherduck.png) ### DuckLake on Metabase DuckLake is supported with the DuckDB driver in Metabase. Use the latest DuckDB driver release and a DuckDB version that supports DuckLake (DuckDB v1.3.2 or newer is recommended). #### MotherDuck-managed DuckLake If your DuckLake database is managed by MotherDuck, you can connect the same way you connect to any MotherDuck database: 1. Select DuckDB as the database type 2. Database file: `md:[ducklake_database_name]` 3. MotherDuck token: paste your token 4. Keep `old_implicit_casting` enabled (recommended) No extra Init SQL is required. Query your tables normally in Metabase. #### Own compute + DuckLake catalog (attach in init SQL) If you want Metabase’s embedded DuckDB to query a DuckLake stored externally, attach the DuckLake catalog in the connection’s Init SQL. This works for both MotherDuck-managed catalogs and self-managed catalogs. - Init SQL for a MotherDuck-managed DuckLake catalog: ```sql -- Attaches the DuckLake metadata catalog hosted in MotherDuck ATTACH 'ducklake:md:__ducklake_metadata_[database_name]' AS dl1; ``` - Init SQL for a self-managed DuckLake catalog (local metadata DB) with S3 data path: ```sql -- Replace the path to your DuckLake metadata DB and bucket prefix ATTACH 'ducklake:/duckdb/my_ducklake_metadata.ducklake' AS dl1 ( DATA_PATH 's3://my_bucket/lake/' ); ``` Once attached, reference tables with the alias, for example: `FROM dl1.my_table`. ### Connecting to a local DuckDB database To connect to a local DuckDB database: 1. Database file: enter the full path to your DuckDB file (e.g., `/path/to/database.db`) 2. Configuration: enable `old_implicit_casting` (recommended) to ensure proper datetime filtering :::note DuckDB's concurrency model supports either one process with read/write permissions, or multiple processes with read permissions, but not both at the same time. This means you will not be able to open a local DuckDB in read-only mode, then the same DuckDB in read-write mode in a different process. ::: ![Example](../img/metabase_local_duckdb.png) ## Configuration best practices - **Connection pooling**: For production instances, set an appropriate connection pool size based on expected concurrent users - **Query timeouts**: Configure timeouts in Metabase settings to prevent long-running queries from affecting system performance - **Data access**: Use database-level permissions in Metabase to control who can access which data sources ## Troubleshooting | Issue | Solution | |-------|----------| | Driver not detected | Ensure driver is in the correct plugins directory and Metabase has been restarted | | Connection failures | Verify database path (local) or database name and token (MotherDuck) | | Permission errors | Check file permissions for local databases | | Datetime filtering issues | Enable `old_implicit_casting` in the connection settings | | Add MotherDuck token in the connection string | Specify a correct MotherDuck token or MotherDuck database name after the `md:` prefix | ### Updating the MotherDuck token Metabase keeps long-lived database connections alive. When you update only the MotherDuck token while an existing connection is still cached, Metabase raises `Connection error: Can't open a connection to same database file with a different configuration than existing connections`. Use one of the following approaches to refresh the token successfully: 1. **Add a cache buster while editing the database.** Edit the connection under **Admin Settings** > **Databases**, then update both the **Database file** field and the **MotherDuck Token** field with a small cache-busting change (for example, append `?refresh=20250917`). Updating both values at the same time forces Metabase to treat the configuration as new. Save the connection, then optionally revert the fields to their clean values once the change is persisted. 2. **Restart Metabase before updating the token.** Restart the Metabase service and, immediately after it starts, go straight to `/admin/databases` to update the token field. Do not open the Metabase home screen before editing the database connection, or the previous connection (with the old token) will be re-established. ### Connecting to a local DuckDB database To connect to a local DuckDB database: 1. **Database file**: Enter the full path to your DuckDB file (e.g., `/path/to/database.db`) 2. **Configuration**: Enable `old_implicit_casting` (recommended) to ensure proper datetime filtering 3. **Additional settings**: - **Read only**: Toggle as appropriate for your use case - **Naming strategy**: Choose your preferred table/field naming strategy :::note DuckDB's concurrency model supports either one process with read/write permissions, or multiple processes with read permissions, but not both at the same time. This means you will not be able to open a local DuckDB in read-only mode, then the same DuckDB in read-write mode in a different process. ::: ![Example](../img/metabase_local_duckdb.png) --- Source: https://motherduck.com/docs/integrations/cloud-storage/tigris # Tigris > With MotherDuck, you can access files in a private Tigris bucket. Tigris is a globally distributed S3-compatible object storage service that provides low latency anywhere in the world. ## Tigris requirements To get started using Tigris with MotherDuck, you need to: 1. Create a new bucket at [storage.new](https://storage.new) if you don't have one 2. Create an access keypair for that bucket at [storage.new/accesskey](https://storage.new/accesskey) 3. Configure MotherDuck to use Tigris 4. Query files in Tigris When creating a bucket, you can select from different storage tiers: - Standard (default) - Best for general use cases - Infrequent Access - Cheaper than Standard, but charges per gigabyte of retrieval - Instant Retrieval Archive - For long-term storage with urgent access needs - Archive - For long-term storage where retrieval time is not critical ## Configure Tigris credentials ### Create a SECRET object :::note If you are using multiple secrets, the `SCOPE` parameter will make sure MotherDuck knows which one to use. You can validate which secret to use with [`which_secret`](https://duckdb.org/docs/stable/configuration/secrets_manager). As an example, see below: ```sql FROM which_secret('s3://my-other-bucket/file.parquet', 's3'); ``` ::: ### SQL ```sql CREATE OR REPLACE PERSISTENT SECRET tigris ( TYPE s3, PROVIDER config, KEY_ID 'tid_access_key_id', SECRET 'tsec_secret_access_key', REGION 'auto', ENDPOINT 't3.storage.dev', URL_STYLE 'vhost', SCOPE 's3://my_bucket' ); -- test Tigris credentials SELECT count(*) FROM 's3:///'; ``` ### Python ```python import duckdb con = duckdb.connect('md:') con.sql(""" CREATE OR REPLACE PERSISTENT SECRET tigris ( TYPE s3, PROVIDER config, KEY_ID 'tid_access_key_id', SECRET 'tsec_secret_access_key', REGION 'auto', ENDPOINT 't3.storage.dev', URL_STYLE 'vhost', SCOPE 's3://my_bucket' ) """) # test Tigris con.sql("SELECT count(*) FROM 's3:///'").show() ``` ### UI Adding Tigris secrets through the UI is not supported. Please add them using SQL statements. ### Delete a SECRET object ### SQL ```sql DROP SECRET tigris; ``` ### Tigris credentials as **temporary** secrets You can also create temporary secrets that are not persisted across sessions: ```sql CREATE OR REPLACE SECRET ( TYPE s3, PROVIDER config, KEY_ID 'tid_access_key_id', SECRET 'tsec_secret_access_key', REGION 'auto', ENDPOINT 't3.storage.dev', URL_STYLE 'vhost' ); ``` :::note Local/In-memory secrets are not persisted across sessions. ::: :::info Even temporary, in-memory secrets are available to MotherDuck's cloud execution engine when you connect your local DuckDB instance to MotherDuck. When you query {props.provider}, the query runs on MotherDuck's servers, not your local machine, and MotherDuck uses the best-matching secret to authenticate, whether it is stored locally or in MotherDuck. For more details, see [CREATE SECRET](/sql-reference/motherduck-sql-reference/create-secret/#querying-with-secrets). ::: --- Source: https://motherduck.com/docs/integrations/file-formats/google-sheets # Google Sheets > Query Google Sheets from MotherDuck with CSV export URLs or the DuckDB Google Sheets community extension. Google Sheets can be queried from MotherDuck in two ways: - Use `read_csv()` with the Google Sheets `/export?format=csv` URL. This works well for server-side reads in MotherDuck and for views that should reflect the current sheet contents. - Use the community [`duckdb-gsheets`](https://duckdb-gsheets.com/) extension when you need its Google Sheets-specific features. ## Query a sheet with read_csv() For a public Google Sheet, use the CSV export URL: ```sql SELECT * FROM read_csv( 'https://docs.google.com/spreadsheets/d//export?format=csv&gid=', MD_RUN = REMOTE ); ``` The `sheet_id` is the value between `/d/` and `/edit` in the Google Sheet URL. The `gid` identifies the worksheet tab. When connected to MotherDuck, `read_csv()` can read the HTTPS URL server side. `MD_RUN = REMOTE` makes the execution location explicit, although non-local HTTPS reads are remote by default. ## Create a view or table Create a view when you want queries to reflect the current Google Sheet contents: ```sql CREATE OR REPLACE VIEW my_database.main.google_sheet AS SELECT * FROM read_csv( 'https://docs.google.com/spreadsheets/d//export?format=csv&gid=', MD_RUN = REMOTE ); ``` Create a table when you want to snapshot the sheet into MotherDuck: ```sql CREATE OR REPLACE TABLE my_database.main.google_sheet_snapshot AS SELECT * FROM read_csv( 'https://docs.google.com/spreadsheets/d//export?format=csv&gid=', MD_RUN = REMOTE ); ``` ## Authenticate to a private sheet For private sheets, create an `HTTP` secret with an OAuth bearer token that has access to the sheet. Store it in MotherDuck if the query needs to run server side from future sessions or scheduled jobs: ```sql CREATE SECRET google_sheets_http IN MOTHERDUCK ( TYPE HTTP, SCOPE 'https://docs.google.com', EXTRA_HTTP_HEADERS MAP { 'Authorization': 'Bearer ' } ); ``` The bearer token must come from a Google identity or service account that can read the spreadsheet. See the [DuckDB HTTP authentication documentation](https://duckdb.org/docs/current/core_extensions/httpfs/https#authenticating) for additional `httpfs` authentication options. ## Use the Google Sheets extension The community Google Sheets extension can read sheets with `read_gsheet()`: ```sql INSTALL gsheets FROM community; LOAD gsheets; CREATE SECRET (TYPE gsheet); SELECT * FROM read_gsheet('https://docs.google.com/spreadsheets/d//edit'); ``` This workflow may require browser interactivity unless you configure an API access token. See [Using Excel and Google Sheets Data in MotherDuck](/key-tasks/data-warehousing/replication/spreadsheets/) for a longer walkthrough. ## Related content - [Swimming in Google Sheets with MotherDuck](https://motherduck.com/blog/google-sheets-motherduck/) - [CSV integration](/integrations/file-formats/csv/) - [MD_RUN parameter](/sql-reference/motherduck-sql-reference/md-run-parameter/) --- Source: https://motherduck.com/docs/integrations/bi-tools/looker # Looker with MotherDuck > Connect Looker (Google Cloud core) to MotherDuck using the Postgres endpoint, including the required compatibility-mode parameter and recommended pooling and token settings. :::info[Preview] The Postgres endpoint is in [preview](/about-motherduck/feature-stages/). Features and behavior may change. ::: [Looker (Google Cloud core)](https://cloud.google.com/looker) connects to MotherDuck through the [Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/) using its standard PostgreSQL dialect. A Looker-specific compatibility mode is required so that symmetric aggregates and Persistent Derived Tables (PDTs) work correctly. ## Before you start You'll need: - A [Looker](https://cloud.google.com/looker) instance and admin access to create database connections - A [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) (see [Choose the right token](#choose-the-right-token) below) - Your Postgres host, which you can find at [MotherDuck Postgres settings](https://app.motherduck.com/settings/postgres) (for example, `pg.us-east-1-aws.motherduck.com`) ## Connect to MotherDuck In Looker, go to **Admin → Connections → Add Connection** and configure: | Parameter | Value | |---|---| | **Dialect** | PostgreSQL 9.5+ | | **Host** | Your MotherDuck Postgres host (for example, `pg.us-east-1-aws.motherduck.com`) | | **Port** | `5432` | | **Database** | Your MotherDuck database name | | **Username** | `postgres` | | **Password** | Your [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) (see [Choose the right token](#choose-the-right-token)) | | **SSL** | Enabled | | **Verify SSL** | Disabled, unless your Looker instance is configured for certificate verification | | **Additional JDBC Parameters** | `options=--compatibility-mode=looker` | ![Looker database and authentication settings for a MotherDuck Postgres endpoint connection](/img/integrations/looker-database-authentication-settings.png) In **Optional Settings**, enable **SSL** and **Database Connection Pooling**. Leave **Verify SSL** disabled unless your Looker instance is configured to verify the server certificate. ![Looker optional settings with SSL and database connection pooling enabled](/img/integrations/looker-optional-settings.png) After configuring the connection, click **Test these settings**. The test runs against MotherDuck and produces a known cancellation warning that is safe to ignore (see [Connection test behavior](#connection-test-behavior)). Click **Add Connection** to save. ## Required: compatibility mode parameter You **must** set the following in the **Additional JDBC Parameters** field: ```text options=--compatibility-mode=looker ``` This parameter does two important things: - Enables symmetric aggregates - Enables Persistent Derived Table (PDT) support Without it, symmetric aggregate queries return incorrect results and PDT builds fail. ## Enable connection pooling Enable **database connection pooling** in Looker's connection settings. DuckDB is optimized for large analytical queries rather than high volumes of short concurrent connections, so pooling reduces connection overhead and improves overall stability. This is the recommended configuration for MotherDuck. ## Choose the right token MotherDuck supports two token types. Choose based on how your Looker deployment will use the connection: | Token type | Use when | Notes | |---|---|---| | **Read scaling token** | Reporting / BI usage with many concurrent users (reads only) | Recommended for the main Looker connection when PDT writes are not needed on this connection. | | **Read/write token** | PDT builds, or any connection that needs to write tables | Looker supports configuring a separate PDT connection — you can use a read/write token there while keeping a read scaling token on the main connection. | ## Connection test behavior When you run Looker's built-in connection test, you may see a warning that query cancellation does not work. This is expected and can be safely ignored. The warning is produced because the test cancellation query itself fails due to memory consumption — not because the cancellation mechanism is broken. Production query cancellation is unaffected. ## Troubleshooting | Symptom | Resolution | |---|---| | Symmetric aggregate queries fail or return incorrect results | Ensure `options=--compatibility-mode=looker` is set in Additional JDBC Parameters. | | PDT build fails or Explore intermittently errors | Check **Admin → PDT → PDT Details** for build status and last SQL. Confirm the table exists in your MotherDuck scratch schema. | | Connection test shows cancellation warning | Expected behavior. The warning appears only during the test query and does not affect production query cancellation. | ## Additional information - [Postgres endpoint reference](/sql-reference/postgres-endpoint) for connection parameters, SSL options, and limitations - [Connect through the Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/) for a general how-to guide - [Looker documentation: Connecting Looker to your database](https://cloud.google.com/looker/docs/db-config-postgresql) --- Source: https://motherduck.com/docs/integrations/bi-tools/excel # Connect MotherDuck to Excel > Use Excel's 'Get Data' flow with the DuckDB ODBC driver to load MotherDuck data into Excel. This setup works well for recurring reporting, analysis, ad hoc SQL exploration, finance models, and operational dashboards without relying on exported CSVs. ### Windows ## Before you start To get started you'll need the following. - Windows + Excel (ODBC is Windows-only for this flow) - A MotherDuck access token (create one in the [MotherDuck token page](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token)) - Admin rights on your computer to install the ODBC driver ## Installation steps ### 1. Install the DuckDB ODBC driver 1. Download the latest DuckDB ODBC driver for Windows (amd64): - [duckdb_odbc-windows-amd64.zip](https://github.com/duckdb/duckdb-odbc/releases/latest/download/duckdb_odbc-windows-amd64.zip) 2. Extract the `.zip` file and run `odbc_install.exe` as Administrator (right click -> Run as administrator). ### 2. Configure the DuckDB system DSN 1. Open the ODBC Data Source Administrator: - 64-bit Excel: Start menu -> ODBC Data Sources (64-bit) - 32-bit Excel: Start menu -> ODBC Data Sources (32-bit) ![ODBC Data Sources in Windows](./img/ODBC-data-sources-windows.png) 2. Go to System DSN, select DuckDB, and click Configure. ![Select the DuckDB system DSN](./img/ODBC-data-source-duckdb.png) 3. Set Database to one of the following: - Recommended (scoped): `md:your_database_name` - Open scope: `md:` (allows access to any database) 4. Click OK to save. ![DuckDB ODBC configuration for MotherDuck](./img/ODBC-data-source-configuration.png) ### 3. Connect from the data menu 1. In Excel, go to Data -> Get Data -> From Other Sources -> From ODBC. ![Excel Get Data menu](./img/getdata-excel.png) 2. Choose DuckDB from the DSN dropdown and click OK. ![From ODBC dialog in Excel](./img/from-ODBC-driver-excel.png) 3. On the credentials screen, choose Default or Custom and add this to the Connection string properties field: ```text motherduck_token= ``` ![DuckDB ODBC driver installer](./img/ODBC-driver-excel.png) 4. Click Connect. ### 4. Load or transform data Use the Navigator window to select tables and choose Load to bring data into Excel, or Transform Data to shape it in Power Query before loading. ### macOS ## Excel ODBC on macOS Direct ODBC connectivity between Excel and MotherDuck is **not supported on macOS** due to a driver incompatibility. ### Why it doesn't work Excel on macOS uses the **iODBC** driver manager, but the DuckDB ODBC driver is built for **unixODBC**. These drivers are incompatible at the binary level. This is a [known issue](https://github.com/duckdb/duckdb-odbc/issues/40) being tracked by the DuckDB team. If necessary, you can build this driver yourself. ### Alternatives for macOS users #### Option 1: Export directly with DuckDB (CLI and drivers) DuckDB has an [Excel extension](https://duckdb.org/docs/stable/core_extensions/excel) that can write `.xlsx` files directly. This works with DuckDB CLI or any DuckDB driver, but cannot be used in the MotherDuck UI because the UI cannot export `.xlsx` files to your local file system. ```sql -- Connect to MotherDuck and export to Excel ATTACH 'md:'; COPY (SELECT * FROM my_database.my_table) TO 'output.xlsx' WITH (FORMAT xlsx, HEADER true); ``` Or from the command line: ```bash duckdb -c "ATTACH 'md:'; COPY (SELECT * FROM my_database.my_table) TO 'output.xlsx' WITH (FORMAT xlsx, HEADER true);" ``` #### Option 2: Use the MotherDuck Web UI Query your data in the [MotherDuck Web UI](https://app.motherduck.com) and export results: 1. Run your query in the MotherDuck UI 2. Click the download button to export as CSV 3. Open the CSV in Excel #### Option 3: Export to CSV via DuckDB CLI Use the DuckDB CLI to export query results to CSV: ```bash duckdb -c "ATTACH 'md:'; COPY (SELECT * FROM my_database.my_table) TO 'output.csv' (HEADER, DELIMITER ',');" ``` ### Linux ## Excel workflows on Linux Direct ODBC connectivity between desktop Excel and MotherDuck is Windows-only for this flow. On Linux, use DuckDB CLI or a DuckDB client to export query results, then open the exported file in Excel, Excel for the web, or another spreadsheet tool. ### Option 1: Export directly with DuckDB (CLI and drivers) DuckDB has an [Excel extension](https://duckdb.org/docs/stable/core_extensions/excel) that can write `.xlsx` files directly. This works with DuckDB CLI or any DuckDB driver, but cannot be used in the MotherDuck UI because the UI cannot export `.xlsx` files to your local file system. ```sql -- Connect to MotherDuck and export to Excel ATTACH 'md:'; COPY (SELECT * FROM my_database.my_table) TO 'output.xlsx' WITH (FORMAT xlsx, HEADER true); ``` Or from the command line: ```bash duckdb -c "ATTACH 'md:'; COPY (SELECT * FROM my_database.my_table) TO 'output.xlsx' WITH (FORMAT xlsx, HEADER true);" ``` ### Option 2: Export to CSV via DuckDB CLI Use the DuckDB CLI to export query results to CSV: ```bash duckdb -c "ATTACH 'md:'; COPY (SELECT * FROM my_database.my_table) TO 'output.csv' (HEADER, DELIMITER ',');" ``` ## Tips - If you change your MotherDuck token, update the connection string properties in Excel. - If you use multiple databases, create separate DSNs (e.g., `DuckDB - analytics`, `DuckDB - finance`) with different `md:database` values. ## Troubleshooting ### How do I delete an existing MotherDuck connection? 1. In Excel, go to Data -> Queries & Connections. 2. Find the connection you want to remove, right click it, and choose Delete. ### How do I modify an existing MotherDuck connection? 1. In Excel, go to Data -> Queries & Connections. 2. Right click the connection and choose Properties. 3. Open the Definition tab and update the connection string (for example, update `motherduck_token=...`) and save. If you don't see the Definition tab, use Data -> Get Data -> Data Source Settings, select your DuckDB connection, then choose Change Source or Edit Permissions as needed. --- Source: https://motherduck.com/docs/integrations/data-science-ai/marimo # Marimo > marimo is a reactive notebook for Python and SQL that models notebooks as dataflow graphs. When you run a cell or interact with a UI element, marimo automatically runs affected cells (or marks them as stale), keeping code and outputs consistent and preventing bugs before they happen. Every marimo notebook is stored as pure Python, executable as a script, and deployable as an app. ## Getting started ### Installation First, install marimo with SQL support: ### pip ```bash pip install "marimo[sql]" ``` ### uv ```bash uv pip install "marimo[sql]" ``` ### conda ```bash conda install -c conda-forge marimo duckdb polars ``` ### Authentication There are two ways to authenticate: 1. **Interactive Authentication**: When you first connect to MotherDuck (e.g. `ATTACH 'md:my_db'`), marimo will open a browser window for authentication. 2. **Token-based Authentication**: Set your MotherDuck token as an environment variable: ```bash export motherduck_token="your_token" ``` You can find your token in the MotherDuck UI under Account Settings. ## Using MotherDuck First, open your first notebook: ```bash marimo edit my_notebook.py ``` ### 1. Connecting and database discovery ### SQL ```sql ATTACH IF NOT EXISTS 'md:my_db' ``` ### Python ```python import duckdb # Connect to MotherDuck duckdb.sql("ATTACH IF NOT EXISTS 'md:my_db'") ``` You will be prompted to authenticate with MotherDuck when you run the above cell. This will open a browser window where you can log in and authorize your marimo notebook to access your MotherDuck database. To avoid being prompted each time you open a notebook, you can set the `motherduck_token` environment variable: ```bash export motherduck_token="your_token" marimo edit my_notebook.py ``` Once connected, your MotherDuck tables are automatically discovered in the Datasources Panel: ![Browse your MotherDuck databases](../img/marimo_motherduck_db_discovery.png) _Browse your MotherDuck databases_ ### 2. Writing SQL queries You can query your MotherDuck db using SQL cells in marimo. Here's an example of how to query a table and display the results using marimo: ![Query a MotherDuck table](../img/marimo_motherduck_sql.png) _Query a MotherDuck table_ marimo's reactive execution model extends into SQL queries, so changes to your SQL will automatically trigger downstream computations for dependent cells (or optionally mark cells as stale for expensive computations). ![img](../img/marimo_motherduck_reactivity-ezgif.com-speed.gif) ### 3. Mixing SQL and Python marimo lets you combine SQL queries with Python code: ![Mixing SQL and Python](../img/marimo_motherduck_python_and_sql.png) _Mixing SQL and Python_ ## Example notebook For a full example of using MotherDuck with marimo, check out this [example notebook](https://github.com/marimo-team/marimo/blob/main/examples/sql/connect_to_motherduck.py). --- Source: https://motherduck.com/docs/integrations/transformation/dbt-cloud # dbt cloud with MotherDuck via pg_duckdb > dbt cloud is a managed service for dbt core. MotherDuck is used with dbt cloud by deploying a Postgres proxy with pg_duckdb installed. :::note If you only need to connect to MotherDuck from a PostgreSQL-compatible client, use the [Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint) instead. Use this `pg_duckdb` pattern when you specifically need to operate a PostgreSQL server or proxy for dbt Cloud. ::: ## Getting started You will need the following items to get started: 1. A Postgres instance with pg_duckdb installed. 2. A [MotherDuck token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#authentication-using-an-access-token). 3. A dbt cloud account. ## Configuring pg_duckdb The full documentation for pg_duckdb can be found on [GitHub](https://github.com/duckdb/pg_duckdb/blob/main/docs/README.md), but a simple way to set it up is using Docker on EC2. In our testing, we have used m7g.xlarge, which is a 4-core, 16GB instance. Since Postgres exists as a proxy for MotherDuck, it only needs to have enough working space to stream results back to dbt. Even smaller instances could suffice as well, i.e. a1.large, although it has not been tested thoroughly. The memory limits set below assumes a 16GB limit. Once you have added your MotherDuck Token and Postgres password to your environment, you can execute the `docker run` statement below: ```yml docker run -d \ --name pgduckdb \ -p 5432:5432 \ -e POSTGRES_PASSWORD="$POSTGRES_PASSWORD" \ -e MOTHERDUCK_TOKEN="$MOTHERDUCK_TOKEN" \ -v ~/pgduckdb_data_v17:/var/lib/postgresql/data \ --restart unless-stopped \ --memory=12288m \ pgduckdb/pgduckdb:17-main ``` :::note The default configuration of Postgres is sub-optimal for m7g.xlarge. Consider making the following changes to the `postgresql.conf` file. ```ini # Memory configuration optimized for AWS m7g.xlarge with more conservative settings work_mem = '32MB' # Per-operation memory for sorts, joins, etc. maintenance_work_mem = '512MB' # Memory for maintenance operations shared_buffers = '2GB' # ~12.5% of RAM for shared buffer cache effective_cache_size = '6GB' # Conservative estimate of OS cache max_connections = 100 # Reduced maximum concurrent connections ``` ::: ### Upgrading to newer builds of pg_duckdb New containers are built for pg_duckdb on every release. Since we are using docker to run the container, the pg_duckdb server can be stopped, pruned, and then rebuilt with the above docker run command. It is recommended to use a script to rebuild docker image on some cadence. Terraform or similar is recommended to handle this maintenance process. An example shell script can be seen below:
Shell script ```sh #!/bin/bash # Error handling function handle_error() { local line_no=$1 local exit_code=$2 echo "ERROR: An error occurred at line ${line_no}, exit code ${exit_code}" exit ${exit_code} } # Set up error trap trap 'handle_error ${LINENO} $?' ERR # Script to install Docker and run PGDuckDB with MotherDuck on AWS EC2 # Usage: POSTGRES_PASSWORD=your_secure_password MOTHERDUCK_TOKEN=your_md_token ./setup_pgduckdb.sh # Detect OS if grep -q 'Amazon Linux release 2023' /etc/os-release; then OS_VERSION="Amazon Linux 2023" elif grep -q 'Amazon Linux release 2' /etc/os-release; then OS_VERSION="Amazon Linux 2" elif grep -q 'Ubuntu' /etc/os-release; then OS_VERSION="Ubuntu" else OS_VERSION="Linux" fi echo "Starting setup for PGDuckDB with MotherDuck on $OS_VERSION..." # Check if required environment variables are set if [ -z "$POSTGRES_PASSWORD" ]; then echo "ERROR: POSTGRES_PASSWORD environment variable is not set." echo "Usage: POSTGRES_PASSWORD=your_secure_password MOTHERDUCK_TOKEN=your_md_token ./setup_pgduckdb.sh" exit 1 fi if [ -z "$MOTHERDUCK_TOKEN" ]; then echo "ERROR: MOTHERDUCK_TOKEN environment variable is not set." echo "Usage: POSTGRES_PASSWORD=your_secure_password MOTHERDUCK_TOKEN=your_md_token ./setup_pgduckdb.sh" exit 1 fi # Update package lists - continue even if there are errors with some repositories echo "Updating package lists..." if [[ "$OS_VERSION" == "Ubuntu" ]]; then sudo apt-get update -y || true elif [[ "$OS_VERSION" == "Amazon Linux 2023" ]]; then sudo dnf update -y || true else sudo yum update -y || true fi # Check if Docker is already installed if command -v docker &>/dev/null; then echo "Docker is already installed, skipping installation." else # Install prerequisites based on OS echo "Installing prerequisites..." if [[ "$OS_VERSION" == "Ubuntu" ]]; then sudo apt-get install -y \ apt-transport-https \ ca-certificates \ curl \ gnupg \ lsb-release elif [[ "$OS_VERSION" == "Amazon Linux 2023" ]]; then # Use --allowerasing to handle curl package conflicts sudo dnf install -y --allowerasing \ device-mapper-persistent-data \ lvm2 \ ca-certificates else sudo yum install -y \ device-mapper-persistent-data \ lvm2 \ ca-certificates fi # Install Docker based on OS echo "Installing Docker..." if [[ "$OS_VERSION" == "Ubuntu" ]]; then # Add Docker's official GPG key curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg # Set up the repository echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null # Update and install sudo apt-get update -y sudo apt-get install -y docker-ce docker-ce-cli containerd.io elif [[ "$OS_VERSION" == "Amazon Linux 2023" ]]; then # Amazon Linux 2023 - use the standard package sudo dnf install -y docker elif [[ "$OS_VERSION" == "Amazon Linux 2" ]]; then # Amazon Linux 2 - use extras sudo amazon-linux-extras install -y docker else # Fallback sudo yum install -y docker fi # Verify Docker was installed if ! command -v docker &>/dev/null; then echo "ERROR: Docker installation failed." exit 1 fi fi # Start Docker service echo "Starting Docker service..." sudo systemctl start docker || sudo service docker start sudo systemctl enable docker || sudo chkconfig docker on # Add current user to docker group to avoid using sudo with docker commands echo "Adding current user to docker group..." sudo usermod -aG docker "$USER" # Create a new data directory for PostgreSQL 17 echo "Creating new data directory for PostgreSQL 17..." mkdir -p ~/pgduckdb_data_v17 # Fix permissions on the data directory echo "Setting correct permissions on data directory..." sudo chown -R 999:999 ~/pgduckdb_data_v17 # 999 is the standard UID for postgres user in Docker sudo chmod 700 ~/pgduckdb_data_v17 # Check architecture ARCH=$(uname -m) echo "Detected architecture: $ARCH" if [[ "$ARCH" == "aarch64" || "$ARCH" == "arm64" ]]; then echo "Using ARM64 architecture (Graviton3)..." else echo "Using x86_64 architecture..." fi # Check if container already exists and remove it if necessary if sudo docker ps -a | grep -q pgduckdb; then echo "Found existing pgduckdb container. Removing it..." sudo docker stop pgduckdb || true sudo docker rm pgduckdb || true fi # Pull the Docker image echo "Pulling Docker image..." sudo docker pull pgduckdb/pgduckdb:17-main # Check available system memory echo "Checking system memory..." TOTAL_MEM_KB=$(grep MemTotal /proc/meminfo | awk '{print $2}') TOTAL_MEM_MB=$((TOTAL_MEM_KB / 1024)) echo "Total system memory: ${TOTAL_MEM_MB}MB" # Calculate 75% of system memory for Docker container limit DOCKER_MEM_LIMIT=$((TOTAL_MEM_MB * 75 / 100)) echo "Setting Docker container memory limit to: ${DOCKER_MEM_LIMIT}MB" # Run the Docker container with memory limit echo "Starting PostgreSQL container..." sudo docker run -d \ --name pgduckdb \ -p 5432:5432 \ -e POSTGRES_PASSWORD="$POSTGRES_PASSWORD" \ -e MOTHERDUCK_TOKEN="$MOTHERDUCK_TOKEN" \ -v ~/pgduckdb_data_v17:/var/lib/postgresql/data \ --restart unless-stopped \ --memory=${DOCKER_MEM_LIMIT}m \ pgduckdb/pgduckdb:17-main # Wait for PostgreSQL to start echo "Waiting for PostgreSQL to start..." sleep 10 # Configure PostgreSQL echo "Configuring PostgreSQL and DuckDB..." # Append settings to the main PostgreSQL configuration file echo "Appending settings to PostgreSQL configuration file..." sudo docker exec -i pgduckdb bash -c "cat >> /var/lib/postgresql/data/postgresql.conf << 'EOT' # DuckDB integration settings duckdb.motherduck_enabled = true # Memory configuration optimized for AWS m7g.xlarge with more conservative settings work_mem = '32MB' # Per-operation memory for sorts, joins, etc. maintenance_work_mem = '512MB' # Memory for maintenance operations shared_buffers = '2GB' # ~12.5% of RAM for shared buffer cache effective_cache_size = '6GB' # Conservative estimate of OS cache max_connections = 100 # Reduced maximum concurrent connections # Detailed query logging log_min_duration_statement = 0 # Log all queries log_statement = 'all' # Log all SQL statements log_duration = on # Log duration of each SQL statement log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u ' # Prefix format EOT" # Restart PostgreSQL to apply all configuration settings echo "Restarting PostgreSQL container to apply all configuration settings..." sudo docker restart pgduckdb # Wait for PostgreSQL to restart echo "Waiting for PostgreSQL container to restart..." sleep 10 # Verify PostgreSQL is running with new settings echo "Verifying PostgreSQL configuration..." sudo docker exec -i pgduckdb psql -U postgres << EOF -- Check if PostgreSQL is running SELECT version(); EOF # Create monitoring script echo "Creating monitoring script..." cat > ~/monitor_pg.sh << 'EOF' #!/bin/bash echo "=== PostgreSQL Container Status ===" docker ps -a -f name=pgduckdb echo -e "\n=== Resource Usage ===" docker stats --no-stream pgduckdb echo -e "\n=== Recent Logs ===" docker logs --tail 10 pgduckdb echo -e "\n=== Connection Test ===" docker exec -it pgduckdb pg_isready -U postgres if [ $? -eq 0 ]; then echo "PostgreSQL is accepting connections." else echo "PostgreSQL is not accepting connections." fi EOF chmod +x ~/monitor_pg.sh # Create startup script echo "Creating startup script..." cat > ~/start_pg.sh << 'EOF' #!/bin/bash echo "Starting PostgreSQL container..." docker start pgduckdb echo "Container status:" docker ps -a -f name=pgduckdb EOF chmod +x ~/start_pg.sh # Check if container is running or restarting echo "Checking container status..." CONTAINER_STATUS=$(sudo docker inspect -f '{{.State.Status}}' pgduckdb 2>/dev/null || echo "not_found") if [[ "$CONTAINER_STATUS" == "restarting" ]]; then echo "WARNING: Container is restarting. Checking logs for errors..." sudo docker logs pgduckdb echo " Try reducing the memory settings in the PostgreSQL configuration if the container keeps restarting." echo "You can manually adjust settings by connecting to the container once it's stable." elif [[ "$CONTAINER_STATUS" != "running" && "$CONTAINER_STATUS" != "not_found" ]]; then echo "WARNING: Container is not running (status: $CONTAINER_STATUS). Checking logs for errors..." sudo docker logs pgduckdb fi # Final status check echo "=== Setup Complete ===" echo "PostgreSQL with DuckDB is now running." echo "Container status:" sudo docker ps -a -f name=pgduckdb echo -e "\n=== Connection Information ===" echo "Host: localhost" echo "Port: 5432" echo "User: postgres" echo "Password: [The password you provided]" echo "Database: postgres" echo -e "\n=== Useful Commands ===" echo "Monitor status: ./monitor_pg.sh" echo "Start after reboot: ./start_pg.sh" echo "Connect to PostgreSQL: docker exec -it pgduckdb psql -U postgres" echo "View logs: docker logs pgduckdb" echo -e "\n=== Note ===" echo "You may need to log out and log back in for the docker group changes to take effect." echo "After that, you can run docker commands without sudo." ```
## dbt cloud configuration dbt cloud is configured as standard Postgres, with a couple of key details. 1. You will need to create a schema in MotherDuck for each user as well as production, as using pg_duckdb to create new schemas in MotherDuck is not supported. 2. You will need to set an environmental variable for `DBT_SCHEMA` that uses the pg_duckdb schema format, which is `ddb$[database]$[schema]` since Postgres only supports a single database per instance. This will need to be set for each user as well as production with `{{ env_var('DBT_SCHEMA')}}`. 3. The recommended thread count follow our dbt-core recommendation, which is 4 threads. If dbt is configured incorrectly, data may write to Postgres, which is much slower than MotherDuck. In that case, the easiest fix is to rebuild the docker container per above, to assure that no data accidentally ends up in Postgres. ## Usage notes There are a few things to know about using dbt cloud with pg_duckdb that are unusual. 1. You write Postgres dialect SQL that is executed against DuckDB. As such, there are some idiosyncrasies that are neither Postgres nor DuckDB, but a secret, third thing (pg_duckdb SQL). The details of this are described in the [pg_duckdb documentation](https://github.com/duckdb/pg_duckdb/blob/main/docs/README.md). 2. Views are only stored in Postgres without any artifacts in MotherDuck. As such, they can be used for interim data but not final datasets to be consumed by end-users. As such, changing materialization type from view to table, or table to view, is a hybrid MotherDuck & Postgres transaction, and unsupported. 3. Running on multiple threads can occasionally cause deadlocks with the pg_duckdb catalog maintenance service. This can be resolved with `dbt retry` in your production pipeline runs. 4. DuckDB types are more specific than Postgres, so model builds using numeric types will throw errors that can be resolved with specific typing. 5. From time-to-time the Postgres catalog can get out of sync, and will show tables that do not exist in MotherDuck. To resolve this, create the missing object in MotherDuck, i.e. `CREATE TABLE my_schema.model_name AS SELECT 1;`, which will unblock your dbt model. --- Source: https://motherduck.com/docs/integrations/ingestion/airbyte # Airbyte > Airbyte is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck Airbyte can load data into MotherDuck with the MotherDuck destination connector. ## Prerequisites - An Airbyte workspace with permission to create destinations. - A MotherDuck access token. - An existing MotherDuck database to use as the destination. ## Setup 1. In MotherDuck, create an access token for Airbyte. 2. In Airbyte, create a new destination and select **MotherDuck**. 3. Set **Destination DB** to an `md:` database path, for example `md:analytics`. 4. Paste the token into **MotherDuck Access Token**. 5. Optional: set **Schema Name**. Airbyte namespaces map to MotherDuck schemas. 6. Save the destination and use it in a connection. ## Authentication and configuration - Use Airbyte's **MotherDuck Access Token** field instead of putting the token in the `md:` URI. - Use `destination_path` for the database path. - Use the Airbyte schema field to control the default schema for loaded streams. ## Important notes - Airbyte warns against putting the token in the connection string because it can be printed in execution logs. - The destination supports full refresh and incremental sync modes. - Airbyte's connector reference includes local DuckDB file options. For MotherDuck, use the `md:` destination path. ## Use cases - Replicate SaaS, API, file, or database sources into MotherDuck. - Land Airbyte streams into a dedicated MotherDuck schema. - Use Airbyte Destinations V2 final tables as downstream analytics sources in MotherDuck. ## Related content - [View the full Airbyte MotherDuck setup guide](https://docs.airbyte.com/integrations/destinations/motherduck) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/orchestration/airflow # Airflow > Open-source data orchestration and scheduling platform for building ETL pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. ## How it works with MotherDuck 1. Create a task, DAG, flow, or job in Airflow that runs a DuckDB or MotherDuck step. 2. Store the MotherDuck token in the orchestrator's secret manager. 3. Use a small validation query before scheduling larger ingestion, transformation, or export jobs. ## Related content - [View the full process in the Airflow documentation](https://airflow.apache.org/docs/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/artie # Artie > Artie is a fully managed CDC streaming platform that allows you to replicate data from your source database to your destination in real-time. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck Artie can write CDC and streaming pipeline output into MotherDuck. ## Prerequisites - An Artie pipeline. - A MotherDuck Read/Write token. - A target MotherDuck database name. - Optional: a dedicated MotherDuck service account for pipeline writes. ## Setup 1. Create a Read/Write token in MotherDuck. You can create it from a regular user account or from a service account. 2. In Artie, configure **MotherDuck** as the destination. 3. Enter the MotherDuck token and database name. 4. Start the pipeline and verify that the database appears in MotherDuck. 5. If you used a service account, impersonate that service account to inspect objects it created. ## Authentication and configuration - Artie requires a Read/Write token because the pipeline writes data. - A dedicated service account is recommended for production pipeline writes. - If team members need access to tables written by the service account, create an organization share from the service account-owned database. ## Important notes - Data written through a service account is visible to that service account by default. Share it explicitly with the organization if analysts need access. - Copy MotherDuck tokens when they are created because they are only shown once. ## Use cases - Replicate CDC streams into MotherDuck. - Keep operational sources synchronized with MotherDuck analytics tables. - Use Artie pipelines to land data into a database owned by a dedicated service account. ## Related content - [View the full Artie MotherDuck setup guide](https://www.artie.com/docs/destinations/motherduck) - [MotherDuck service accounts](/key-tasks/service-accounts-guide/) - [MotherDuck sharing overview](/key-tasks/sharing-data/sharing-overview/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/ascend-io # Ascend.io > Ascend.io is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck 1. Create a pipeline in Ascend.io with MotherDuck as the destination or analytical store. 2. Create a MotherDuck access token and add it to the tool's secrets or destination settings. 3. Choose the target database and schema, then run a small initial sync before scheduling production loads. ## Related content - [Read the Ascend.io blog on MotherDuck](https://www.ascend.io/blog/ascending-with-motherduck/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/orchestration/astronomer # Astronomer > Astronomer is a platform for authoring, scheduling, and monitoring workflows. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. ## How it works with MotherDuck 1. Create a task, DAG, flow, or job in Astronomer that runs a DuckDB or MotherDuck step. 2. Store the MotherDuck token in the orchestrator's secret manager. 3. Use a small validation query before scheduling larger ingestion, transformation, or export jobs. ## Related content - [Read the Astronomer blog on using Airflow with MotherDuck and DuckDB](https://www.astronomer.io/blog/three-ways-to-use-airflow-with-motherduck-and-duckdb/) - [View the full process in the Astronomer documentation](https://docs.astronomer.io/learn/airflow-duckdb) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/aws-glue # AWS Glue > AWS Glue is a serverless data integration service for preparing and moving data with Spark jobs, crawlers, and the AWS Glue Data Catalog. AWS Glue jobs can connect to MotherDuck through the MotherDuck Postgres endpoint using Glue's PostgreSQL JDBC support. ## How it works with MotherDuck 1. Create a MotherDuck access token. 2. Configure the AWS Glue job with a PostgreSQL JDBC connection to the MotherDuck Postgres endpoint. 3. Use `postgres` as the user, the MotherDuck token as the password, and `md:` or a specific MotherDuck database as the database name. 4. Use Glue's JDBC `dbtable` option for a table or view that the job should read. 5. Make sure the Glue job's network configuration can reach the public MotherDuck endpoint. ```python connection_options = { "url": "jdbc:postgresql://pg.us-east-1-aws.motherduck.com:5432/md:?sslmode=require", "dbtable": "main.my_table", "user": "postgres", "password": "", } dyf = glueContext.create_dynamic_frame.from_options( connection_type="postgresql", connection_options=connection_options, ) ``` Use this route when a Glue job needs to read MotherDuck data as part of an AWS ETL workflow. For high-volume loading into MotherDuck, it is often simpler to write files to S3 from Glue and load those files from MotherDuck. ## Related content - [View the full process in the AWS Glue JDBC documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect-jdbc-home.html) - [MotherDuck Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/) - [Loading data from S3 into MotherDuck](/key-tasks/loading-data-into-motherduck/loading-data-from-cloud-or-https) - [Troubleshooting AWS S3 secrets](/troubleshooting/aws-s3-secrets/) --- Source: https://motherduck.com/docs/integrations/orchestration/bacalhau # Bacalhau > Bacalhau is a platform for running and managing data pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. ## How it works with MotherDuck 1. Create a task, DAG, flow, or job in Bacalhau that runs a DuckDB or MotherDuck step. 2. Store the MotherDuck token in the orchestrator's secret manager. 3. Use a small validation query before scheduling larger ingestion, transformation, or export jobs. ## Related content - [View the full process in the Bacalhau documentation](https://bacalhau.org/integrations/duckdb) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/bi-tools/index # Business Intelligence Tools > Use MotherDuck as a data source in tools for interactive data analysis and presentation MotherDuck integrates with popular business intelligence tools to help you analyze and visualize your data. ## Included pages - [Hex](https://motherduck.com/docs/integrations/bi-tools/hex): Connect Hex notebooks to MotherDuck using SQL data connections or Python cells for interactive analytics. - [Evidence](https://motherduck.com/docs/integrations/bi-tools/evidence): Evidence is an open source, code-based alternative to drag-and-drop BI tools. Build polished data products with just SQL and markdown. - [Superset & Preset](https://motherduck.com/docs/integrations/bi-tools/superset-preset): Apache Superset is a powerful, open-source data exploration and visualization platform designed to be intuitive and interactive. It allows data professionals to quickly integrate and analyze data from various sources, creating insightful dashboards and charts for better decision making. - [Metabase](https://motherduck.com/docs/integrations/bi-tools/metabase): Metabase is an open source analytics/BI platform that provides intuitive data visualization and exploration capabilities. This guide details how to connect Metabase to both local DuckDB databases and MotherDuck. - [Tableau](https://motherduck.com/docs/integrations/bi-tools/tableau): Tableau is a widely-used business intelligence and data visualization platform that enables data analysts to build interactive dashboards and reports. You can connect Tableau Cloud to MotherDuck through the built-in PostgreSQL connector using MotherDuck's Postgres endpoint. For Tableau Desktop and Server, use the DuckDB JDBC connector. - [Looker with MotherDuck](https://motherduck.com/docs/integrations/bi-tools/looker): Connect Looker (Google Cloud core) to MotherDuck using the Postgres endpoint, including the required compatibility-mode parameter and recommended pooling and token settings. - [Connect MotherDuck to Excel](https://motherduck.com/docs/integrations/bi-tools/excel): Use Excel's 'Get Data' flow with the DuckDB ODBC driver to load MotherDuck data into Excel. This setup works well for recurring reporting, analysis, ad hoc SQL exploration, finance models, and operational dashboards without relying on exported CSVs. - [Microsoft Power BI](https://motherduck.com/docs/integrations/bi-tools/powerbi): Power BI is an interactive data visualization product developed by Microsoft. You can connect Power BI to MotherDuck through the built-in PostgreSQL database connector using MotherDuck's Postgres endpoint. - [Cube](https://motherduck.com/docs/integrations/bi-tools/cube): Cube is a semantic layer for building and visualizing data. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. - [Data Wisp](https://motherduck.com/docs/integrations/bi-tools/data-wisp): Data Wisp is an AI data assistant for answering business questions in natural language. It integrates with MotherDuck for conversational analytics workflows. - [Dot](https://motherduck.com/docs/integrations/bi-tools/dot): AI data analyst that answers questions and provides insights through conversational analytics. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. - [Explo](https://motherduck.com/docs/integrations/bi-tools/explo): Explo is a platform for embedded analytics, AI analytics, and data sharing in customer-facing products. It integrates with MotherDuck as a data source. - [Gooddata](https://motherduck.com/docs/integrations/bi-tools/gooddata): Enterprise analytics platform for building data products and embedded analytics. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. - [Grafana](https://motherduck.com/docs/integrations/bi-tools/grafana): Grafana is an observability and dashboarding platform for building dashboards, alerts, and exploratory views. The MotherDuck-maintained DuckDB data source plugin lets Grafana query local DuckDB files and MotherDuck databases. - [Holistics](https://motherduck.com/docs/integrations/bi-tools/holistics): Holistics helps data teams set up self-service BIs that are reliable and easy to maintain. Everyone can now self-serve data with confidence by applying software's best practices. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. - [Omni](https://motherduck.com/docs/integrations/bi-tools/omni): Modern business intelligence platform for creating interactive dashboards and data visualizations. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. - [Rill Data](https://motherduck.com/docs/integrations/bi-tools/rill-data): Rill Data is a data platform for building and visualizing data. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. - [Zenlytic](https://motherduck.com/docs/integrations/bi-tools/zenlytic): Zenlytic is a data visualization platform for building and visualizing data. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. --- Source: https://motherduck.com/docs/integrations/ingestion/bytewax # Bytewax > Bytewax is a stream processing platform for building and managing data pipelines. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck 1. Create a pipeline in Bytewax with MotherDuck as the destination or analytical store. 2. Create a MotherDuck access token and add it to the tool's secrets or destination settings. 3. Choose the target database and schema, then run a small initial sync before scheduling production loads. ## Related content - [Read the Bytewax blog on the DuckDB and MotherDuck sink operator](https://bytewax.io/blog/bytewax-duckdb-motherduck-integration) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/cloud-storage/index # Cloud Storage > Use MotherDuck with your favorite cloud storage services MotherDuck integrates with popular cloud storage services to help you manage and store your data. ## Included pages - [Amazon S3](https://motherduck.com/docs/integrations/cloud-storage/amazon-s3): Amazon S3 is a Data Sources/Sinks service for storing and retrieving data. - [Azure Blob Storage](https://motherduck.com/docs/integrations/cloud-storage/azure-blob-storage): Azure Blob is a Data Sources/Sinks service for storing and retrieving data. - [Cloudflare R2](https://motherduck.com/docs/integrations/cloud-storage/cloudflare-r2): Cloudflare R2 is a Data Sources/Sinks service for storing and retrieving data. - [Google Cloud Storage](https://motherduck.com/docs/integrations/cloud-storage/google-cloud-storage): With MotherDuck, you can access files in a private Google Cloud Storage (GCS) bucket. This leverages the GCS S3 compatible connection. - [Hetzner Object Storage](https://motherduck.com/docs/integrations/cloud-storage/hetzner-object-storage): Hetzner Object Storage is a S3-compatible object storage service. - [Tigris](https://motherduck.com/docs/integrations/cloud-storage/tigris): With MotherDuck, you can access files in a private Tigris bucket. Tigris is a globally distributed S3-compatible object storage service that provides low latency anywhere in the world. --- Source: https://motherduck.com/docs/integrations/ingestion/cloudquery # CloudQuery > CloudQuery is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck 1. Create a pipeline in CloudQuery with MotherDuck as the destination or analytical store. 2. Create a MotherDuck access token and add it to the tool's secrets or destination settings. 3. Choose the target database and schema, then run a small initial sync before scheduling production loads. ## Related content - [Read the CloudQuery guide to moving PostgreSQL data to MotherDuck](https://www.cloudquery.io/how-to-guides/moving-data-from-postgres-to-motherduck) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/how-to-integrate # Creating a new integration > Guidelines for integrating your application with MotherDuck, including connection strings and custom user agent configuration. Integrating with MotherDuck follows the same pattern as integrating with DuckDB, so you can use the same client APIs and frameworks. There are three differences: 1. Use `md:` or `md:analytics` as the connection string instead of a local filesystem path. 2. Pass `motherduck_token` through a config dictionary, connection string parameter, or environment variable. 3. Pass `custom_user_agent` so MotherDuck can identify the integration in query history. ### Choose a `custom_user_agent` format {#custom-user-agent-format} Use the format `integration/version(metadata1,metadata2)`. The version and metadata sections are optional. - Avoid spaces in the integration and version sections. - Separate multiple metadata values with commas. - If you plan to group by one workload label later, put it first in the metadata list. Examples: - `catalogsync` - `catalogsync/5.1.5.1` - `catalogsync/5.1.5.1(batchload)` - `catalogsync/5.1.5.1(batchload,useast1)` ## Language and framework examples {#custom-user-agent-examples} ### Python ```python con = duckdb.connect("md:analytics", config={ "motherduck_token": token, "custom_user_agent": "catalogsync/5.1.5.1(batchload,useast1)" }) ``` ### SQLAlchemy ```python eng = create_engine("duckdb:///md:analytics", connect_args={ "config": { "motherduck_token": token, "custom_user_agent": "catalogsync/5.1.5.1(batchload,useast1)" } }) ``` ### Java / JDBC ```java Properties config = new Properties(); config.setProperty("motherduck_token", token); config.setProperty("custom_user_agent", "catalogsync/5.1.5.1(batchload,useast1)"); Connection mdConn = DriverManager.getConnection("jdbc:duckdb:md:analytics", config); ``` ### Node.js ```javascript import { DuckDBInstance } from '@duckdb/node-api' const instance = await DuckDBInstance.create("md:analytics", { motherduck_token: token, custom_user_agent: "catalogsync/5.1.5.1(batchload,useast1)" }) const conn = await instance.connect() ``` ### Go ```go dsn := fmt.Sprintf( "md:analytics?motherduck_token=%s&custom_user_agent=%s", url.QueryEscape(token), url.QueryEscape("catalogsync/5.1.5.1(batchload,useast1)"), ) db, err := sql.Open("duckdb", dsn) ``` ## Implementation best practices If you use DuckDB or MotherDuck in a shared environment where one process serves multiple users, the connection string must be unique per user. You can disambiguate the connection string with a user-specific parameter such as `md:analytics?session_user=`. If you pass `motherduck_token` in the connection string, ensure your application does not log it in plaintext. --- Source: https://motherduck.com/docs/integrations/file-formats/csv # CSV > CSV is a simple text format for tabular data. DuckDB can read CSV files from local paths, HTTPS URLs, and supported cloud storage locations, then load the results into MotherDuck tables. ## How it works with MotherDuck 1. Connect to MotherDuck from the DuckDB CLI, Python, or another DuckDB client. 2. Use DuckDB's CSV reader to inspect local files, HTTPS URLs, or cloud storage paths. 3. Create a MotherDuck table from the file when you want durable storage, sharing, or repeated queries. ## Example ```sql CREATE TABLE my_table AS SELECT * FROM read_csv('data.csv'); ``` ## Remote CSV files CSV files available over HTTPS or cloud storage can be queried server side in MotherDuck: ```sql CREATE OR REPLACE TABLE my_database.main.remote_csv AS SELECT * FROM read_csv( 'https://example.com/path/to/file.csv', MD_RUN = REMOTE ); ``` For non-local `https://`, `s3://`, `gcs://`, `r2://`, and Azure URLs, MotherDuck uses remote execution by default. `MD_RUN = REMOTE` makes that explicit. See the [MD_RUN parameter](/sql-reference/motherduck-sql-reference/md-run-parameter/) for details. ## Google Sheets CSV exports Public Google Sheets can be queried as CSV by using the `/export?format=csv` URL: ```sql SELECT * FROM read_csv( 'https://docs.google.com/spreadsheets/d//export?format=csv&gid=', MD_RUN = REMOTE ); ``` For private sheets, configure HTTP authentication with a DuckDB `HTTP` secret. See the [Google Sheets integration](/integrations/file-formats/google-sheets/) for the full workflow. ## Related content - [DuckDB CSV documentation](https://duckdb.org/docs/current/data/csv/overview.html) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck cloud storage integrations](/integrations/cloud-storage/) - [Google Sheets integration](/integrations/file-formats/google-sheets/) --- Source: https://motherduck.com/docs/integrations/bi-tools/cube # Cube > Cube is a semantic layer for building and visualizing data. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. ## How it works with MotherDuck Cube connects to MotherDuck through Cube's DuckDB data source. Use this setup when you want Cube's semantic layer, APIs, dashboards, or embedded analytics to query data that lives in MotherDuck. ## Prerequisites - A Cube project, either self-hosted or in Cube Cloud. - A MotherDuck service token or access token. For production deployments, use a dedicated service account token. - The MotherDuck database and schema Cube should use for its models. ## Setup ### Manual setup In a self-hosted Cube project, configure the DuckDB data source and pass the MotherDuck token to Cube: ```bash CUBEJS_DB_TYPE=duckdb CUBEJS_DB_DUCKDB_MOTHERDUCK_TOKEN= ``` Keep the token in your deployment secret manager rather than committing it to `.env`. ### Cube Cloud setup In Cube Cloud, choose **DuckDB** when creating the database connection, then paste your MotherDuck token into the **MotherDuck Token** field. ![Cube Cloud DuckDB connection form with a MotherDuck Token field](./img/cube-cloud-motherduck-connection.webp) Leave the MotherDuck token blank only when you are connecting Cube to a local DuckDB database instead of MotherDuck. ## Authentication and configuration - Use a read/write token if Cube needs to create or refresh objects in MotherDuck. Use a read token for read-only dashboard workloads. - Configure Cube's DuckDB schema setting if your Cube models should default to a specific MotherDuck schema. - If your Cube deployment reads private files from object storage through DuckDB, configure those storage credentials in Cube separately from the MotherDuck token. ## Important notes - Cube's DuckDB documentation includes S3, extension, and pre-aggregation settings. Those settings are Cube/DuckDB deployment details, not required for a basic MotherDuck connection. - For production, keep the MotherDuck token out of connection strings and application logs. - If you use Cube Cloud, allowlist the Cube Cloud IPs shown in the connection screen if your network policy requires it. ## Use cases - Build a governed semantic layer on top of MotherDuck tables. - Serve embedded analytics from Cube APIs while querying MotherDuck. - Prototype dashboard models locally and move the same Cube project to Cube Cloud. ## Related content - [Read the Cube blog on DuckDB and MotherDuck integrations](https://cube.dev/blog/introducing-duckdb-and-motherduck-integrations) - [View the full Cube DuckDB and MotherDuck setup guide](https://cube.dev/docs/product/configuration/data-sources/duckdb) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/orchestration/dagster # Dagster > Dagster is a platform for authoring, scheduling, and monitoring workflows. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. ## How it works with MotherDuck 1. Create a task, DAG, flow, or job in Dagster that runs a DuckDB or MotherDuck step. 2. Store the MotherDuck token in the orchestrator's secret manager. 3. Use a small validation query before scheduling larger ingestion, transformation, or export jobs. ## Related content - [View the full process in the Dagster documentation](https://docs.dagster.io/integrations/libraries/duckdb/using-duckdb-with-dagster) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/data-quality/index # Data Quality Tools > Monitor and maintain data quality in MotherDuck Ensure data quality and reliability in MotherDuck using these integrated tools. ## Included pages - [Great Expectations](https://motherduck.com/docs/integrations/data-quality/great-expectations): Great Expectations is a data quality management platform combining data quality and data governance. It integrates with MotherDuck for table monitoring as part of data quality and observability workflows. - [Monte Carlo](https://motherduck.com/docs/integrations/data-quality/monte-carlo): End-to-end data observability platform for monitoring data quality and reliability. It integrates with MotherDuck for table monitoring as part of data quality and observability workflows. - [Secoda](https://motherduck.com/docs/integrations/data-quality/secoda): Data discovery and documentation platform for managing data quality and governance. It integrates with MotherDuck for table monitoring as part of data quality and observability workflows. - [Soda](https://motherduck.com/docs/integrations/data-quality/soda): Data quality platform for monitoring and managing data quality in your pipelines. It integrates with MotherDuck for table monitoring as part of data quality and observability workflows. --- Source: https://motherduck.com/docs/integrations/data-science-ai/index # Data Science & AI Tools > Use MotherDuck with your favorite data science and AI tools MotherDuck integrates with popular data science and AI tools to help you build powerful machine learning and AI applications. ## Included pages - [Marimo](https://motherduck.com/docs/integrations/data-science-ai/marimo): marimo is a reactive notebook for Python and SQL that models notebooks as dataflow graphs. When you run a cell or interact with a UI element, marimo automatically runs affected cells (or marks them as stale), keeping code and outputs consistent and preventing bugs before they happen. Every marimo notebook is stored as pure Python, executable as a script, and deployable as an app. - [Datalab](https://motherduck.com/docs/integrations/data-science-ai/datalab): Interactive data science platform for exploring and analyzing data with MotherDuck. It integrates with MotherDuck for notebooks, assistants, and AI-powered analysis workflows. - [Fabi.ai](https://motherduck.com/docs/integrations/data-science-ai/fabi-ai): Fabi.ai is an AI-native notebook and BI workspace for Python, SQL, dashboards, and workflows. It integrates with MotherDuck for exploration and analysis. - [Google Colab](https://motherduck.com/docs/integrations/data-science-ai/google-colab): Google Colab notebooks can query MotherDuck by installing DuckDB and opening an md: connection from Python. - [Jupyter](https://motherduck.com/docs/integrations/data-science-ai/jupyter): Jupyter notebooks can query MotherDuck through the DuckDB Python package and an md: connection string. - [LangChain](https://motherduck.com/docs/integrations/data-science-ai/langchain): LangChain is a framework for building and deploying language models. It integrates with MotherDuck for notebooks, assistants, and AI-powered analysis workflows. - [LlamaIndex](https://motherduck.com/docs/integrations/data-science-ai/llamaindex): LlamaIndex is a framework for building and deploying language models. It integrates with MotherDuck for notebooks, assistants, and AI-powered analysis workflows. - [Wobby](https://motherduck.com/docs/integrations/data-science-ai/wobby): Wobby provides AI analysts for delivering business-ready insights in Slack or Teams. It integrates with MotherDuck for connecting those analysis workflows to your data. --- Source: https://motherduck.com/docs/integrations/transformation/index # Data Transformation > Transform your data inside MotherDuck Use MotherDuck to transform your data. ## Included pages - [dbt with DuckDB and MotherDuck](https://motherduck.com/docs/integrations/transformation/dbt): Data Build Tool (dbt) is an open-source command-line tool that enables data analysts and engineers to transform data in their warehouses by defining SQL in model files. It bring the composability of programming languages to SQL while automating the mechanics of updating tables. - [dbt cloud with MotherDuck via pg_duckdb](https://motherduck.com/docs/integrations/transformation/dbt-cloud): dbt cloud is a managed service for dbt core. MotherDuck is used with dbt cloud by deploying a Postgres proxy with pg_duckdb installed. - [Paradime](https://motherduck.com/docs/integrations/transformation/paradime): Modern data transformation platform for building and managing data pipelines. It integrates with MotherDuck for running data transformation projects against MotherDuck. - [SQLMesh](https://motherduck.com/docs/integrations/transformation/sqlmesh): SQLMesh is a data transformation tool for building and managing data pipelines. It integrates with MotherDuck for running data transformation projects against MotherDuck. --- Source: https://motherduck.com/docs/integrations/bi-tools/data-wisp # Data Wisp > Data Wisp is an AI data assistant for answering business questions in natural language. It integrates with MotherDuck for conversational analytics workflows. ## How it works with MotherDuck Data Wisp connects to a MotherDuck database as a read-only data source for conversational analytics and lightweight data apps. ## Prerequisites - A MotherDuck database to query from Data Wisp. - A MotherDuck Read Scaling Token or another read-only token for the Data Wisp connection. - Access to create or import data sources in Data Wisp. ## Setup 1. In MotherDuck, create a token for Data Wisp. Choose a read-only token type because Data Wisp does not write to the database. 2. In Data Wisp, open **Data Sources** and select **Import**. 3. Choose **Database**, then set **Connection type** to **MotherDuck**. 4. Enter a title, paste the MotherDuck token, and enter the database name. 5. Select **Connect** and confirm that Data Wisp reports a successful connection. ![Data Wisp success message after connecting a MotherDuck data source](./img/data-wisp-motherduck-connection.png) ## Authentication and configuration - Store the MotherDuck token in Data Wisp's connection settings. - Enter the database name without the `md:` prefix unless Data Wisp's UI specifically asks for a full connection string. - If your environment uses network restrictions, allowlist the Data Wisp IP address shown in the connection form. ## Important notes - Prefer a dedicated token for Data Wisp so you can rotate or revoke access without affecting other tools. - Use a read-only token unless you have a clear reason for Data Wisp to use broader permissions. ## Use cases - Let business users ask natural-language questions over curated MotherDuck datasets. - Build lightweight internal dashboards and data apps from existing MotherDuck tables. - Connect a specific MotherDuck database to a Data Wisp workspace. ## Related content - [View the full Data Wisp MotherDuck setup guide](https://docs.datawisp.io/adding-your-data/connect-to-a-database/motherduck) - [MotherDuck Read Scaling tokens](/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/databases/index # Databases > Use MotherDuck with your favorite databases MotherDuck integrates directly with popular databases to help you build data pipelines and applications. ## Included pages - [BigQuery](https://motherduck.com/docs/integrations/databases/bigquery): Load data from Google BigQuery into MotherDuck using the duckdb-bigquery community extension. - [PostgreSQL](https://motherduck.com/docs/integrations/databases/postgres): Advanced open-source relational database with powerful features and extensibility. - [PlanetScale](https://motherduck.com/docs/integrations/databases/planetscale): PlanetScale offers hosted PostgreSQL and MySQL Vitess Databases. MotherDuck supports PlanetScale Postgres via the pg_duckdb extension, as well as the Postgres Connector. In our internal benchmarking, pg_duckdb offers 100x or greater query acceleration for analytical queries when compared to vanilla Postgres. - [SQL Server](https://motherduck.com/docs/integrations/databases/sql-server): Use the SQL Server replication guide when you need to read tables or queries from SQL Server and write the results to MotherDuck. The guide covers Python, pyodbc, SQL Server authentication, and loading dataframe results into MotherDuck. - [MySQL](https://motherduck.com/docs/integrations/databases/mysql): MySQL is a relational database commonly used for application data. DuckDB's MySQL extension can read from MySQL-compatible databases, which lets you copy selected data into MotherDuck from a DuckDB client. - [Supabase](https://motherduck.com/docs/integrations/databases/supabase): Supabase is a Postgres platform for building applications with a managed database, APIs, authentication, storage, and realtime features. Supabase's documented DuckDB Wrapper can query MotherDuck from a Supabase Postgres database through a foreign data wrapper. --- Source: https://motherduck.com/docs/integrations/data-science-ai/datalab # Datalab > Interactive data science platform for exploring and analyzing data with MotherDuck. It integrates with MotherDuck for notebooks, assistants, and AI-powered analysis workflows. ## How it works with MotherDuck DataLab connects to MotherDuck from a workbook so SQL cells can query MotherDuck data directly. ## Prerequisites - A DataLab workbook. - A MotherDuck service token. - Optional: a default MotherDuck database name for the connection. ## Setup 1. In MotherDuck, create or copy a service token. 2. In DataLab, open a workbook. 3. Select **View** > **Databases**. 4. Select the **+** icon and choose **MotherDuck**. 5. Enter a connection name, paste the service token, and optionally enter a database name. 6. Connect the data source. ![DataLab MotherDuck connection dialog with service token and optional database name fields](../img/datalab-motherduck-connection.png) ## Authentication and configuration - The service token is required. - The database name is optional. When set, DataLab connects to that database by default, but other accessible databases can still be queried. - If your environment requires network allowlisting, use the DataCamp IP addresses shown in the DataLab connection dialog. ## Important notes - Store the token only in the DataLab connection configuration. - Use SQL cells to query the connected MotherDuck source after setup. ## Use cases - Explore MotherDuck tables in notebook-style analysis. - Combine SQL query results with Python or chart cells in DataLab. - Share a workbook that uses a managed MotherDuck data connection. ## Related content - [View the full DataLab MotherDuck setup guide](https://datalab-docs.datacamp.com/connect-to-data/motherduck) - [MotherDuck Python overview](/integrations/language-apis-and-drivers/python/python-overview) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/dev-tools/index # Development Tools > Developer tools and utilities that work with MotherDuck Use MotherDuck with various development tools and utilities to enhance your workflow. ## Included pages - [Retool](https://motherduck.com/docs/integrations/dev-tools/retool): Low-code platform for building internal tools and custom business applications with drag-and-drop UI components. - [Obsidian](https://motherduck.com/docs/integrations/dev-tools/obsidian): Use the DuckDB & MotherDuck Obsidian plugin to query external data from your notes and freeze the results as markdown tables. - [Puppygraph](https://motherduck.com/docs/integrations/dev-tools/puppygraph): Graph visualization tool for exploring and analyzing data relationships in DuckDB. It integrates with MotherDuck for development workflows that read from or write to MotherDuck. - [ShadowTraffic](https://motherduck.com/docs/integrations/dev-tools/shadowtraffic): ShadowTraffic is a synthetic data generation tool for simulating production traffic to your backend. It integrates with MotherDuck for development workflows that read from or write to MotherDuck. - [Vanna](https://motherduck.com/docs/integrations/dev-tools/vanna): Vanna is a data science and AI framework for building and sharing data applications. It integrates with MotherDuck for development workflows that read from or write to MotherDuck. --- Source: https://motherduck.com/docs/integrations/ingestion/dlt # dlt (data load tool) > dlt is an open-source Python library that loads data from various, often messy data sources into well-structured, live datasets. It offers a lightweight interface for extracting data from REST APIs, SQL databases, cloud storage, Python data structures, and many more. dlt is designed to be easy to use, flexible, and scalable: * dlt infers schemas and data types, normalizes the data, and handles nested data structures. * dlt supports a variety of popular destinations and has an interface to add custom destinations to create reverse ETL pipelines. * dlt can be deployed anywhere Python runs, be it on Airflow, serverless functions, or any other cloud deployment of your choice. * dlt automates pipeline maintenance with schema evolution and schema and data contracts. Dlt integrates well with DuckDB (they also used it as a local [cache](https://dlthub.com/blog/dltplus-project-cache-in-early-access)) and therefore with MotherDuck. You can check more about MotherDuck integration in the [official documentation](https://dlthub.com/docs/dlt-ecosystem/destinations/motherduck). ## Authentication To authenticate with MotherDuck, you have two options: 1. **Environment variable:** export your `motherduck_token` as an environment variable: ```bash export motherduck_token="your_motherduck_token" ``` 2. For Local development: add the token to `.dlt/secrets.toml`: ```toml [destination.motherduck.credentials] password = "my_motherduck_token" ``` ## Minimal example Below is a minimal example of using dlt to load data from a REST API (with fake data) into a DuckDB (MotherDuck) database: ```python import dlt from typing import Dict, Iterator, List, Sequence import random from datetime import datetime from dlt.sources import DltResource @dlt.source(name="dummy_github") def dummy_source(repos: List[str] = None) -> Sequence[DltResource]: """ A minimal DLT source that generates dummy GitHub-like data. Args: repos (List[str]): A list of dummy repository names. Returns: Sequence[DltResource]: A sequence of resources with dummy data. """ if repos is None: repos = ["dummy/repo1", "dummy/repo2"] return ( dummy_repo_info(repos), dummy_languages(repos), ) @dlt.resource(write_disposition="replace") def dummy_repo_info(repos: List[str]) -> Iterator[Dict]: """ Generates dummy repository information. Args: repos (List[str]): List of repository names. Yields: Iterator[Dict]: An iterator over dummy repository data. """ for repo in repos: owner, name = repo.split("/") yield { "id": random.randint(10000, 99999), "name": name, "full_name": repo, "owner": {"login": owner}, "description": f"This is a dummy repository for {repo}", "created_at": datetime.now().isoformat(), "updated_at": datetime.now().isoformat(), "stargazers_count": random.randint(0, 1000), "forks_count": random.randint(0, 500), } @dlt.resource(write_disposition="replace") def dummy_languages(repos: List[str]) -> Iterator[Dict]: """ Generates dummy language data for repositories in an unpivoted format. Args: repos (List[str]): List of repository names. Yields: Iterator[Dict]: An iterator over dummy language data. """ languages = ["Python", "JavaScript", "TypeScript", "C++", "Rust", "Go"] for repo in repos: # Generate 2-4 random languages for each repo num_languages = random.randint(2, 4) selected_languages = random.sample(languages, num_languages) for language in selected_languages: yield { "repo": repo, "language": language, "bytes": random.randint(1000, 100000), "check_time": datetime.now().isoformat(), } def run_minimal_example(): """ Runs a minimal example pipeline that loads dummy GitHub data to MotherDuck. """ # Define some dummy repositories repos = ["example/repo1", "example/repo2", "example/repo3"] # Configure the pipeline pipeline = dlt.pipeline( pipeline_name="minimal_github_pipeline", destination='motherduck', dataset_name="minimal_example", ) # Create the data source data = dummy_source(repos) # Run the pipeline with all resources info = pipeline.run(data) print(info) # Show what was loaded print("\nLoaded data:") print(f"- {len(repos)} repositories") print(f"- Languages for {len(repos)} repositories") if __name__ == "__main__": run_minimal_example() ``` dlt revolves around three core concepts: * Sources: Define where the data comes from. * Resources: Represent structured units of data within a source. * Pipelines: Manage the data loading process. In the example above: * dummy_source defines a source that simulates GitHub-like data. * dummy_repo_info and dummy_languages are resources producing repository and language data. * A pipeline loads this data into MotherDuck. The core integration with MotherDuck is defined in the pipeline configuration: ```python pipeline = dlt.pipeline( pipeline_name="minimal_github_pipeline", destination="motherduck", dataset_name="minimal_example", ) ``` Setting destination="motherduck" tells dlt to load the data into MotherDuck. --- Source: https://motherduck.com/docs/integrations/bi-tools/dot # Dot > AI data analyst that answers questions and provides insights through conversational analytics. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. ## How it works with MotherDuck Dot connects to MotherDuck with an `md:` connection string and uses that connection for AI-assisted analysis, including Slack workflows. ## Prerequisites - A MotherDuck access token. - The name of the MotherDuck database Dot should query. - DuckDB 0.10.2 or later for the MotherDuck connector used by Dot. ## Setup 1. In MotherDuck, create or copy an access token. 2. Build a connection string for the target database: ```text md:?motherduck_token= ``` 3. In Dot, add a MotherDuck database integration and paste the connection string. 4. Save the connection, then validate it with a small question or query. ![Dot MotherDuck token screen](./img/dot-motherduck-token.png) ## Authentication and configuration - Use a dedicated token for the Dot workspace. - Scope the token to the database access Dot needs for analysis. - Treat the connection string as a secret because it includes the MotherDuck token. ## Important notes - Dot's setup guide includes local DuckDB options. For MotherDuck, use the `md:` connection string. - Store the token in Dot's secret handling rather than sharing the connection string in chat. ## Use cases - Ask questions about MotherDuck data from Slack. - Give a team an AI analyst experience over curated MotherDuck schemas. - Connect a specific MotherDuck database to a Dot workspace. ## Related content - [View the full Dot MotherDuck setup guide](https://docs.getdot.ai/integrations/databases/motherduck-and-duckdb) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/sql-ides/duckdb-ui # DuckDB UI > DuckDB UI is a local SQL interface for DuckDB. You can use it with MotherDuck when you want a lightweight SQL workspace backed by the DuckDB CLI and an md: connection. ## How it works with MotherDuck 1. Install and start DuckDB UI from a DuckDB environment that can authenticate to MotherDuck. 2. Store your MotherDuck token in the `motherduck_token` environment variable or use another supported MotherDuck authentication method. 3. Connect to MotherDuck with an `md:` database path and validate the session with a small query. ## Example ```sql ATTACH 'md:' AS motherduck; SELECT current_database(); ``` ## Related content - [DuckDB UI documentation](https://duckdb.org/docs/current/core_extensions/ui.html) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/estuary # Estuary > Real-time data integration platform for streaming data between systems. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck Estuary materializes collections into MotherDuck tables. The connector uses object storage as a temporary staging area while writing to MotherDuck. ## Prerequisites - An Estuary Flow collection to materialize. - A MotherDuck service token. - A target MotherDuck database and schema. - A staging bucket in S3, S3-compatible storage, Google Cloud Storage, Azure Blob Storage, or Cloudflare R2. ## Setup 1. In MotherDuck, create a service token for Estuary. 2. Prepare a staging bucket and credentials with read/write access. 3. In Estuary, create a MotherDuck materialization. 4. Enter the MotherDuck service token, database, and schema. 5. Configure the staging bucket. 6. Add bindings from Estuary collections to MotherDuck table names. 7. Start the materialization. ## Authentication and configuration - Use the MotherDuck service token for the `/token` connector field. - Set `/database` and `/schema` for the target database and default schema. - Configure per-binding table names and optional schema overrides for specific collections. - Choose the staging bucket type and credentials that match your object storage provider. ## Important notes - The staging bucket is temporary working storage for the materialization, not the permanent analytical data store. - Estuary recommends S3 in `us-east-1` for best performance and cost when using S3 staging. - Delta updates can improve latency and cost for large datasets when your events have suitable keys, but they are not the default. ## Use cases - Stream source collections into MotherDuck tables. - Materialize operational and SaaS data into a MotherDuck analytics database. - Use Estuary-managed sync schedules for repeatable MotherDuck loads. ## Related content - [Read the MotherDuck blog on streaming data to MotherDuck](https://motherduck.com/blog/streaming-data-to-motherduck/) - [View the full Estuary MotherDuck setup guide](https://docs.estuary.dev/reference/Connectors/materialization-connectors/motherduck/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/file-formats/excel # Excel > Excel workbooks can be loaded through DuckDB's Excel extension and stored in MotherDuck for repeatable SQL analysis. ## How it works with MotherDuck 1. Connect to MotherDuck from a DuckDB client. 2. Install and load the DuckDB Excel extension in the client session. 3. Use `read_xlsx` to read a workbook and create a MotherDuck table from the result. ## Example ```sql INSTALL excel; LOAD excel; CREATE TABLE my_table AS SELECT * FROM read_xlsx('workbook.xlsx'); ``` To read a specific worksheet, pass the `sheet` parameter: ```sql CREATE OR REPLACE TABLE my_database.main.excel_data AS SELECT * FROM read_xlsx('workbook.xlsx', sheet = 'Sheet1'); ``` ## Related content - [DuckDB Excel import documentation](https://duckdb.org/docs/current/guides/file_formats/excel_import.html) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck cloud storage integrations](/integrations/cloud-storage/) - [Using Excel and Google Sheets data in MotherDuck](/key-tasks/data-warehousing/replication/spreadsheets/) - [Connect MotherDuck to Excel](/integrations/bi-tools/excel/) --- Source: https://motherduck.com/docs/integrations/ingestion/expanso # Expanso > Data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck 1. Create a pipeline in Expanso with MotherDuck as the destination or analytical store. 2. Create a MotherDuck access token and add it to the tool's secrets or destination settings. 3. Choose the target database and schema, then run a small initial sync before scheduling production loads. ## Related content - [Read the Expanso announcement for the MotherDuck integration](https://expanso.io/newsroom/expanso-and-motherduck-join-forces-to-deliver-distributed-data-analytics/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/bi-tools/explo # Explo > Explo is a platform for embedded analytics, AI analytics, and data sharing in customer-facing products. It integrates with MotherDuck as a data source. ## How it works with MotherDuck Explo connects to MotherDuck as a data source for embedded analytics and customer-facing dashboards. ## Prerequisites - A MotherDuck database for Explo to query. - A MotherDuck access token provisioned for the Explo workspace. - The database name and any schema names you plan to expose in Explo. ## Setup 1. In MotherDuck, create an access token for Explo. 2. In Explo, create a new data source and select **MotherDuck**. 3. Enter the MotherDuck database name. 4. Choose the authentication option that uses an access token. 5. Paste the token and save the data source. ## Authentication and configuration - Use a dedicated token for each Explo environment or workspace. - Prefer read-only access for embedded analytics workloads. - Configure schema access in Explo so customer-facing dashboards only expose the intended data model. ## Important notes - Explo's MotherDuck documentation lists the required credentials but does not require a platform-specific environment variable list. - Keep the token in Explo's credential store and rotate it like any other production credential. ## Use cases - Power embedded dashboards from MotherDuck tables. - Build customer-facing analytics over per-customer or shared schemas. - Let Explo query curated datasets without moving data into another warehouse. ## Related content - [View the full Explo MotherDuck setup guide](https://docs.explo.co/data-sources/connecting-to-data-sources/data-source-types/motherduck#motherduck) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/data-science-ai/fabi-ai # Fabi.ai > Fabi.ai is an AI-native notebook and BI workspace for Python, SQL, dashboards, and workflows. It integrates with MotherDuck for exploration and analysis. ## How it works with MotherDuck Fabi.ai connects to MotherDuck as a database source for notebooks, BI workflows, dashboards, and AI-assisted analysis. ## Prerequisites - A Fabi.ai workspace. - A MotherDuck service token. - Network allowlisting if your security policy restricts inbound connections. ## Setup 1. In MotherDuck, create a service token for Fabi.ai. 2. In Fabi.ai, start the data source connection flow and choose **MotherDuck**. 3. Paste the service token into the MotherDuck connection. 4. Save the data source and validate it from a notebook or workflow. ## Authentication and configuration - Use a dedicated service token for the Fabi.ai workspace. - Fabi.ai documents the IP addresses to allowlist for MotherDuck connections. Add them if your environment enforces firewall rules. - Keep the token in Fabi.ai's connection settings or secret manager. ## Important notes - Fabi.ai's MotherDuck setup is token-based; no local DuckDB file path is needed. - If a connection fails, verify both the token and any IP allowlist configuration. ## Use cases - Analyze MotherDuck data in Fabi.ai notebooks. - Build dashboards and data apps backed by MotherDuck. - Use Fabi.ai AI workflows against curated MotherDuck datasets. ## Related content - [View the full Fabi.ai MotherDuck setup guide](https://docs.fabi.ai/integrations_and_connectors/motherduck) - [MotherDuck Python overview](/integrations/language-apis-and-drivers/python/python-overview) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/file-formats/index # File Formats > Load data into MotherDuck using various file formats Load data into MotherDuck using various file formats. ## Included pages - [Apache Iceberg](https://motherduck.com/docs/integrations/file-formats/apache-iceberg): MotherDuck supports the Apache Iceberg format through the DuckDB Iceberg extension. The extension is loaded automatically when Iceberg functions or catalogs are used in your current MotherDuck session. - [Delta Lake](https://motherduck.com/docs/integrations/file-formats/delta-lake): MotherDuck supports querying data in the Delta Lake format. The Delta DuckDB extension is loaded automatically when any of the supported Delta Lake functions are called. - [DuckLake](https://motherduck.com/docs/integrations/file-formats/ducklake): DuckLake is an integrated data lake and catalog format for large scale data analytics. - [Google Sheets](https://motherduck.com/docs/integrations/file-formats/google-sheets): Query Google Sheets from MotherDuck with CSV export URLs or the DuckDB Google Sheets community extension. - [CSV](https://motherduck.com/docs/integrations/file-formats/csv): CSV is a simple text format for tabular data. DuckDB can read CSV files from local paths, HTTPS URLs, and supported cloud storage locations, then load the results into MotherDuck tables. - [Excel](https://motherduck.com/docs/integrations/file-formats/excel): Excel workbooks can be loaded through DuckDB's Excel extension and stored in MotherDuck for repeatable SQL analysis. - [JSON](https://motherduck.com/docs/integrations/file-formats/json): JSON is a common format for semi-structured data. DuckDB can read JSON files and load the results into MotherDuck for SQL analytics. - [Parquet](https://motherduck.com/docs/integrations/file-formats/parquet): Parquet is a columnar file format designed for analytics. DuckDB can query Parquet files directly and persist the result as a MotherDuck table. --- Source: https://motherduck.com/docs/integrations/ingestion/fivetran # Fivetran > Automated data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. This page covers using MotherDuck as a Fivetran destination. To sync modeled MotherDuck data out to business tools, use [Fivetran Activations (formerly Census)](/integrations/reverse-etl/census). ## How it works with MotherDuck Fivetran can use MotherDuck as a destination for connector syncs. ## Prerequisites - A MotherDuck account and an existing database for Fivetran to load into. - A MotherDuck authentication token. - A Fivetran user account with permission to create or manage destinations. ## Setup 1. In MotherDuck, create an authentication token for Fivetran. 2. In Fivetran, open **Destinations** and select **Add destination**. 3. Enter a destination name and add the destination. 4. Select **MotherDuck** as the destination type. 5. Enter the MotherDuck authentication token. 6. Enter the existing MotherDuck database name. 7. Select **Save and Test**. When the test succeeds, Fivetran can sync connector data into the configured MotherDuck database. ## Authentication and configuration - Use a token dedicated to the Fivetran destination. - The database must already exist in MotherDuck before you save and test the destination. - Review Fivetran's automatically created platform connector if you want destination logs and account metadata synced into MotherDuck. ## Important notes - The Fivetran MotherDuck destination is partner-built. Questions about the destination can go to MotherDuck Support. - This page covers MotherDuck as a Fivetran destination. For syncing modeled MotherDuck data out to business tools, use Fivetran Activations. ## Use cases - Load SaaS, database, and file connector data into MotherDuck. - Centralize Fivetran-managed data in a MotherDuck analytics database. - Keep connector logs and metadata alongside the destination data if you enable the platform connector. ## Related content - [View the full Fivetran MotherDuck setup guide](https://fivetran.com/docs/destinations/motherduck/setup-guide) - [Fivetran Activations with MotherDuck](/integrations/reverse-etl/census) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/reverse-etl/census # Fivetran Activations (formerly Census) > Fivetran Activations, formerly Census, syncs modeled data from a warehouse or database into operational and customer engagement tools. MotherDuck is available as a source in Fivetran Activations, which means Activations can read from MotherDuck and sync selected data to supported activation destinations. This is different from the [Fivetran MotherDuck destination](/integrations/ingestion/fivetran), which loads data from other sources into MotherDuck. ## How it works with MotherDuck Fivetran Activations can use MotherDuck as a source for reverse ETL syncs into operational tools. ## Prerequisites - A Fivetran Activations workspace or trial. - A MotherDuck service token. - Modeled tables or views in MotherDuck that contain the records you want to activate. ## Setup 1. In MotherDuck, create or copy a service token. 2. In Fivetran Activations, open **Sources**. 3. Select **New Source** and choose **MotherDuck**. 4. Enter the MotherDuck service token and connect the source. 5. Open **Syncs** and create a sync from the MotherDuck source to your target destination. ## Authentication and configuration - Use a dedicated service token for Activations. - Expose stable tables or views for activation syncs so field mappings remain consistent. - Start with a limited audience or test destination object before enabling production syncs. ## Important notes - Fivetran Activations uses MotherDuck as a source, not as a destination. - Fivetran's docs note MotherDuck support through the Basic Sync Engine. Check the linked docs for current engine support before designing large activation workflows. ## Use cases - Sync product-qualified leads or account scores from MotherDuck to CRM tools. - Push modeled user segments from MotherDuck to marketing platforms. - Activate customer event or enrichment data produced in MotherDuck. ## Related content - [View the full Fivetran Activations MotherDuck setup guide](https://fivetran.com/docs/activations/sources/available-sources/motherduck) - [Fivetran MotherDuck destination](/integrations/ingestion/fivetran) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/bi-tools/gooddata # Gooddata > Enterprise analytics platform for building data products and embedded analytics. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. ## How it works with MotherDuck GoodData Cloud connects to MotherDuck as a data source for semantic models, dashboards, and embedded analytics. ## Prerequisites - A GoodData Cloud workspace with permission to create data sources. - A MotherDuck service token. - The MotherDuck database name and schema GoodData should use. ## Setup 1. In GoodData Cloud, open **Data sources** and select **Connect data**. 2. Select **MotherDuck**. 3. Enter a data source display name. 4. Paste the MotherDuck service token. 5. Enter the database name and schema. 6. Select **Connect**. ![GoodData MotherDuck data source form with service token, database, and schema fields](./img/gooddata-motherduck-data-source.webp) GoodData also supports creating the data source through its API. When using the API, encode the MotherDuck service token as required by the GoodData request body and use a JDBC URL such as `jdbc:duckdb:md:`. ## Authentication and configuration - Use a MotherDuck service token dedicated to the GoodData data source. - Enter a schema so GoodData can build its logical data model from the intended tables. - Keep the service token in GoodData's credential handling or your deployment secret store if you create the data source through the API. ## Important notes - GoodData's guide includes both UI and API setup. Start with the UI unless you need repeatable provisioning. - GoodData's API examples include GoodData API authentication details; those are separate from the MotherDuck service token. ## Use cases - Build governed BI workspaces on top of MotherDuck. - Create embedded analytics backed by MotherDuck tables. - Provision MotherDuck data sources with GoodData's API for repeatable environments. ## Related content - [View the full GoodData MotherDuck setup guide](https://www.gooddata.com/docs/cloud/connect-data/create-data-sources/motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/data-science-ai/google-colab # Google Colab > Google Colab notebooks can query MotherDuck by installing DuckDB and opening an md: connection from Python. ## How it works with MotherDuck 1. Install DuckDB in the Colab notebook. 2. Store your MotherDuck token in Colab secrets or another notebook-safe secret store. 3. Connect with `duckdb.connect("md:...")` and run SQL from notebook cells. ## Example ```python %pip install duckdb import duckdb con = duckdb.connect('md:my_db') con.sql('SELECT current_database()').show() ``` ## Related content - [Google Colab](https://colab.research.google.com/) - [MotherDuck Python overview](/integrations/language-apis-and-drivers/python/python-overview) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/bi-tools/grafana # Grafana > Grafana is an observability and dashboarding platform for building dashboards, alerts, and exploratory views. The MotherDuck-maintained DuckDB data source plugin lets Grafana query local DuckDB files and MotherDuck databases. ## How it works with MotherDuck 1. Install Grafana 10.4.0 or later on a glibc-based Linux environment. If you use Docker, use an Ubuntu-based Grafana image instead of the default Alpine-based image. 2. Download the DuckDB data source plugin from the GitHub releases page. 3. Because the plugin is unsigned, allow `motherduck-duckdb-datasource` in Grafana's unsigned plugin configuration. 4. Add a DuckDB data source in Grafana and provide a MotherDuck token. 5. If `md:` does not work as the database path in a Docker deployment, leave the path blank and add `ATTACH IF NOT EXISTS 'md:';` in the initialization SQL. ## Related content - [View the full process in the Grafana DuckDB data source plugin documentation](https://github.com/motherduckdb/grafana-duckdb-datasource) - [Grafana data source documentation](https://grafana.com/docs/grafana/latest/features/datasources/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/data-quality/great-expectations # Great Expectations > Great Expectations is a data quality management platform combining data quality and data governance. It integrates with MotherDuck for table monitoring as part of data quality and observability workflows. ## How it works with MotherDuck 1. Create a connection or data source in Great Expectations for the MotherDuck database you want to monitor. 2. Provide a MotherDuck access token or supported connection string in the tool's secret manager. 3. Start with a narrow set of schemas or tables, then expand checks and monitoring after the connection is validated. ## Related content - [View the full process in the Great Expectations documentation](https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/database/duckdb) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/hevo # Hevo > Hevo is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck 1. Create a pipeline in Hevo with MotherDuck as the destination or analytical store. 2. Create a MotherDuck access token and add it to the tool's secrets or destination settings. 3. Choose the target database and schema, then run a small initial sync before scheduling production loads. ## Related content - [View the full process in the Hevo documentation](https://hevodata.com/learn/ingest-data-into-motherduck-via-s3/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/bi-tools/holistics # Holistics > Holistics helps data teams set up self-service BIs that are reliable and easy to maintain. Everyone can now self-serve data with confidence by applying software's best practices. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. ## How it works with MotherDuck Holistics connects to MotherDuck as a data source for semantic modeling, self-service BI, dashboards, and analytics-as-code workflows. ## Prerequisites - A Holistics workspace with permission to create data sources. - A MotherDuck access token. - The MotherDuck database and schemas Holistics should query. ## Setup 1. In MotherDuck, create an access token for Holistics. 2. In Holistics, open **Organization Settings** > **Data Sources**. 3. Select **New Data Source** and choose **MotherDuck**. 4. Enter a display name. 5. Paste the MotherDuck token. 6. Test and save the data source. ![Holistics MotherDuck data source form](./img/holistics-motherduck-data-source.png) ## Authentication and configuration - Use a dedicated MotherDuck token for Holistics. - Select the token type based on the work Holistics needs to run. Read-only access is enough for dashboard queries. - Share the Holistics data source only with the analysts or teams that should model and query the connected data. ## Important notes - Holistics queries MotherDuck directly; data remains in MotherDuck. - If you model data from multiple MotherDuck schemas, confirm the token can access all of them before saving the data source. ## Use cases - Model MotherDuck data in Holistics' semantic layer. - Build governed self-service dashboards. - Manage BI content through Holistics analytics-as-code workflows. ## Related content - [View the full Holistics MotherDuck setup guide](https://docs.holistics.io/docs/connect/databases/motherduck) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/infinyon # InfinyOn > Real-time data integration platform for streaming data between systems. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck 1. Create a pipeline in InfinyOn with MotherDuck as the destination or analytical store. 2. Create a MotherDuck access token and add it to the tool's secrets or destination settings. 3. Choose the target database and schema, then run a small initial sync before scheduling production loads. ## Related content - [Read the InfinyOn blog on the MotherDuck connector](https://www.infinyon.com/blog/2023/07/infinyon-motherduck/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/index # Ingestion Tools > Configure MotherDuck as the destination for your data in the following data ingestion tools Configure MotherDuck as the destination for your data in the following data ingestion tools. ## Included pages - [Airbyte](https://motherduck.com/docs/integrations/ingestion/airbyte): Airbyte is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [Artie](https://motherduck.com/docs/integrations/ingestion/artie): Artie is a fully managed CDC streaming platform that allows you to replicate data from your source database to your destination in real-time. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [Ascend.io](https://motherduck.com/docs/integrations/ingestion/ascend-io): Ascend.io is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [AWS Glue](https://motherduck.com/docs/integrations/ingestion/aws-glue): AWS Glue is a serverless data integration service for preparing and moving data with Spark jobs, crawlers, and the AWS Glue Data Catalog. AWS Glue jobs can connect to MotherDuck through the MotherDuck Postgres endpoint using Glue's PostgreSQL JDBC support. - [Bytewax](https://motherduck.com/docs/integrations/ingestion/bytewax): Bytewax is a stream processing platform for building and managing data pipelines. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [CloudQuery](https://motherduck.com/docs/integrations/ingestion/cloudquery): CloudQuery is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [dlt (data load tool)](https://motherduck.com/docs/integrations/ingestion/dlt): dlt is an open-source Python library that loads data from various, often messy data sources into well-structured, live datasets. It offers a lightweight interface for extracting data from REST APIs, SQL databases, cloud storage, Python data structures, and many more. - [Estuary](https://motherduck.com/docs/integrations/ingestion/estuary): Real-time data integration platform for streaming data between systems. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [Expanso](https://motherduck.com/docs/integrations/ingestion/expanso): Data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [Fivetran](https://motherduck.com/docs/integrations/ingestion/fivetran): Automated data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [Hevo](https://motherduck.com/docs/integrations/ingestion/hevo): Hevo is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [InfinyOn](https://motherduck.com/docs/integrations/ingestion/infinyon): Real-time data integration platform for streaming data between systems. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [Mage](https://motherduck.com/docs/integrations/ingestion/mage): Mage is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [Polytomic](https://motherduck.com/docs/integrations/ingestion/polytomic): Use Polytomic to sync data to and from MotherDuck for ETL and reverse ETL workflows. - [Salesforce](https://motherduck.com/docs/integrations/ingestion/salesforce): Salesforce is a CRM platform for sales, marketing, service, and customer data. To analyze Salesforce data in MotherDuck, use an ingestion tool that supports Salesforce as a source and MotherDuck as a destination. - [Sling](https://motherduck.com/docs/integrations/ingestion/sling): Data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [Stacksync](https://motherduck.com/docs/integrations/ingestion/stacksync): Stacksync helps your teams access and manipulate CRM and ERP data through your existing databases. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. - [Streamkap](https://motherduck.com/docs/integrations/ingestion/streamkap): Streamkap is a stream processing platform built for Change Data Capture (CDC) and event sources. It makes it easy to move operational data into analytics systems like MotherDuck with low latency and high reliability. Streamkap offers various sources, including PostgreSQL, MySQL, SQL Server, a range of SQL and NoSQL databases, Kafka, and other storage systems. - [Unstructured.io](https://motherduck.com/docs/integrations/ingestion/unstructured-io): Unstructured.io is an ingestion platform for processing unstructured data. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. --- Source: https://motherduck.com/docs/integrations/integrations # Integrations > Integrations that work with MotherDuck from the modern data stack MotherDuck integrates with a lot of common tools from the modern data stack. If you would like to create a new integration, see [this guide](how-to-integrate). Below, you will find a comprehensive list of integrations that work with MotherDuck. Each integration includes links to either our own detailed tutorials, the integrator's documentation, or insightful articles and blogs that can help you get started. :::info When working with integrations, it may be useful to be aware of the [different connection string parameters](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck#using-connection-string-parameters) you can use to connect to MotherDuck. ::: ## Supported Integrations Use the search box to find specific integrations or click on category tags to filter the table. See the integration pages listed in this section for supported tools. :::note See [DuckDB documentation](https://duckdb.org/docs/api/overview.html) for the full list of supported client APIs and drivers. ::: ## Diagram: Modern Duck Stack ![img_duck_stack](../img/md-diagram.svg) --- Source: https://motherduck.com/docs/integrations/file-formats/json # JSON > JSON is a common format for semi-structured data. DuckDB can read JSON files and load the results into MotherDuck for SQL analytics. ## How it works with MotherDuck 1. Connect to MotherDuck from a DuckDB client. 2. Use `read_json` for JSON files, newline-delimited JSON, or JSON arrays. 3. Create a MotherDuck table once you have the schema and options you want. ## Example ```sql CREATE TABLE my_table AS SELECT * FROM read_json('events.json'); ``` ## Related content - [DuckDB JSON documentation](https://duckdb.org/docs/current/data/json/overview.html) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck cloud storage integrations](/integrations/cloud-storage/) --- Source: https://motherduck.com/docs/integrations/data-science-ai/jupyter # Jupyter > Jupyter notebooks can query MotherDuck through the DuckDB Python package and an md: connection string. ## How it works with MotherDuck 1. Install DuckDB in the notebook environment. 2. Provide a MotherDuck access token with an environment variable or connection parameter. 3. Use DuckDB SQL from Python cells to explore or transform MotherDuck data. ## Example ```python import duckdb con = duckdb.connect('md:my_db') con.sql('SELECT current_database()').show() ``` ## Related content - [DuckDB Jupyter documentation](https://duckdb.org/docs/current/guides/python/jupyter.html) - [MotherDuck Python overview](/integrations/language-apis-and-drivers/python/python-overview) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/orchestration/kestra # Kestra > Open-source data orchestration and scheduling platform for building ETL pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. ## How it works with MotherDuck 1. Create a task, DAG, flow, or job in Kestra that runs a DuckDB or MotherDuck step. 2. Store the MotherDuck token in the orchestrator's secret manager. 3. Use a small validation query before scheduling larger ingestion, transformation, or export jobs. ## Related content - [Read the MotherDuck blog on DuckDB, MotherDuck, and Kestra ETL pipelines](https://motherduck.com/blog/motherduck-kestra-etl-pipelines/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/data-science-ai/langchain # LangChain > LangChain is a framework for building and deploying language models. It integrates with MotherDuck for notebooks, assistants, and AI-powered analysis workflows. ## How it works with MotherDuck 1. Create a workspace, notebook, or assistant configuration in LangChain. 2. Use the MotherDuck token, service token, or connection string required by the integration. 3. Run a small query such as `SELECT current_database()` before adding larger analytical workflows. ## Related content - [View the full process in the LangChain documentation](https://python.langchain.com/docs/integrations/providers/duckdb) - [MotherDuck Python overview](/integrations/language-apis-and-drivers/python/python-overview) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/language-apis-and-drivers/index # Language APIs & Drivers > Connect to MotherDuck using your preferred programming language Connect to MotherDuck using official drivers and APIs for various programming languages. ## Included pages - [Go driver](https://motherduck.com/docs/integrations/language-apis-and-drivers/go-driver): Official Go driver for MotherDuck, enabling seamless integration with Go applications. The go-duckdb driver supports MotherDuck out of the box! - [JDBC driver](https://motherduck.com/docs/integrations/language-apis-and-drivers/jdbc-driver): Java Database Connectivity (JDBC) driver for connecting Java applications to MotherDuck. The official DuckDB JDBC driver supports MotherDuck out of the box! - [Python](https://motherduck.com/docs/integrations/language-apis-and-drivers/python/python-overview): Python is a programming language for building and deploying web applications. - [R](https://motherduck.com/docs/integrations/language-apis-and-drivers/r): R is a language for statistical analysis. - [Node.js](https://motherduck.com/docs/integrations/language-apis-and-drivers/node-js): The DuckDB Node.js client can connect to MotherDuck with an md: connection string, so JavaScript and TypeScript applications can query MotherDuck directly. --- Source: https://motherduck.com/docs/integrations/data-science-ai/llamaindex # LlamaIndex > LlamaIndex is a framework for building and deploying language models. It integrates with MotherDuck for notebooks, assistants, and AI-powered analysis workflows. ## How it works with MotherDuck 1. Create a workspace, notebook, or assistant configuration in LlamaIndex. 2. Use the MotherDuck token, service token, or connection string required by the integration. 3. Run a small query such as `SELECT current_database()` before adding larger analytical workflows. ## Related content - [View the full process in the LlamaIndex documentation](https://docs.llamaindex.ai/en/stable/api_reference/storage/vector_store/duckdb/) - [MotherDuck Python overview](/integrations/language-apis-and-drivers/python/python-overview) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/mage # Mage > Mage is a data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck 1. Create a pipeline in Mage with MotherDuck as the destination or analytical store. 2. Create a MotherDuck access token and add it to the tool's secrets or destination settings. 3. Choose the target database and schema, then run a small initial sync before scheduling production loads. ## Related content - [Read the Mage blog on MotherDuck](https://www.mage.ai/blog/making-magic-motherduck-with-mage) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/bi-tools/powerbi/index # Microsoft Power BI > Power BI is an interactive data visualization product developed by Microsoft. You can connect Power BI to MotherDuck through the built-in PostgreSQL database connector using MotherDuck's Postgres endpoint. ## Included pages - [Power BI Desktop with MotherDuck](https://motherduck.com/docs/integrations/bi-tools/powerbi/powerbi-desktop): Connect Power BI Desktop to MotherDuck using the Postgres endpoint for dashboards and reports. - [Power BI Service with MotherDuck](https://motherduck.com/docs/integrations/bi-tools/powerbi/powerbi-service): Publish Power BI reports to the cloud using the On-Premises Data Gateway and MotherDuck's Postgres endpoint. - [Power BI custom connector (legacy)](https://motherduck.com/docs/integrations/bi-tools/powerbi/powerbi-custom-connector): Connect Power BI to MotherDuck using the DuckDB ODBC driver and Power Query custom connector. --- Source: https://motherduck.com/docs/integrations/data-quality/monte-carlo # Monte Carlo > End-to-end data observability platform for monitoring data quality and reliability. It integrates with MotherDuck for table monitoring as part of data quality and observability workflows. ## How it works with MotherDuck Monte Carlo connects to MotherDuck for data observability workflows, including custom SQL monitors over MotherDuck tables. ## Prerequisites - A Monte Carlo account with access to the MotherDuck integration. - A MotherDuck account and database access for the objects you want to monitor. - A MotherDuck service token that can run the monitor queries. ## Setup 1. In MotherDuck, create a service token for Monte Carlo. 2. In Monte Carlo, add MotherDuck as a data source. 3. Enter the MotherDuck connection details requested by Monte Carlo. 4. Validate the connection. 5. Create custom SQL monitors for the tables, freshness checks, or metrics you need to observe. ## Authentication and configuration - Use a dedicated token for Monte Carlo monitoring. - Grant access to the databases and schemas where monitor queries run. - Keep monitor queries scoped to the smallest useful result set. ## Important notes - Monte Carlo lists the MotherDuck integration as public preview in its documentation. Confirm current availability and support requirements with Monte Carlo before relying on it for production alerting. - Query complexity and result size affect monitor performance. ## Use cases - Monitor freshness or row-count expectations for MotherDuck tables. - Run custom SQL checks against curated analytics models. - Route MotherDuck data quality incidents into existing Monte Carlo notification workflows. ## Related content - [View the full Monte Carlo MotherDuck setup guide](https://docs.getmontecarlo.com/docs/motherduck) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck) --- Source: https://motherduck.com/docs/integrations/databases/mysql # MySQL > MySQL is a relational database commonly used for application data. DuckDB's MySQL extension can read from MySQL-compatible databases, which lets you copy selected data into MotherDuck from a DuckDB client. ## How it works with MotherDuck 1. Connect to MotherDuck from the DuckDB CLI, Python, or another DuckDB client. 2. Install and load DuckDB's MySQL extension in that session. 3. Attach the MySQL database, then create MotherDuck tables from selected MySQL tables or queries. ## Example ```sql INSTALL mysql; LOAD mysql; ATTACH 'host=localhost port=3306 user=my_user password=my_password database=my_database' AS mysql_db (TYPE mysql); CREATE TABLE my_table AS SELECT * FROM mysql_db.my_schema.my_table; ``` ## Related content - [DuckDB MySQL extension documentation](https://duckdb.org/docs/current/core_extensions/mysql.html) - [Loading data from PostgreSQL-compatible sources](/key-tasks/loading-data-into-motherduck/loading-data-from-postgres) - [Running hybrid queries](/key-tasks/running-hybrid-queries) --- Source: https://motherduck.com/docs/integrations/language-apis-and-drivers/node-js # Node.js > The DuckDB Node.js client can connect to MotherDuck with an md: connection string, so JavaScript and TypeScript applications can query MotherDuck directly. ## How it works with MotherDuck 1. Install the DuckDB Node.js client in your application. 2. Create a MotherDuck access token and provide it through a connection string parameter or environment variable. 3. Open an `md:` connection and run SQL from your application code. ## Example ```javascript import duckdb from '@duckdb/node-api'; const token = process.env.motherduck_token; const instance = await duckdb.DuckDBInstance.create(`md:my_db?motherduck_token=${token}`); const connection = await instance.connect(); const result = await connection.run('SELECT current_database()'); ``` ## Related content - [DuckDB Node.js client documentation](https://duckdb.org/docs/current/clients/node_neo/overview.html) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck) --- Source: https://motherduck.com/docs/integrations/bi-tools/omni # Omni > Modern business intelligence platform for creating interactive dashboards and data visualizations. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. ## How it works with MotherDuck Omni connects to MotherDuck as a database connection for modeling, dashboards, AI-assisted exploration, and embedded analytics. ## Prerequisites - Organization Admin permissions in Omni. - A MotherDuck database available on MotherDuck v0.10.2 or later. - A MotherDuck Read/Write token for the Omni connection. ## Setup 1. In MotherDuck, create a Read/Write token for Omni and copy it before closing the dialog. 2. Optional: create a dedicated schema for Omni table uploads if users need to upload CSVs and join them to modeled data. 3. In Omni, open **Settings** > **Connections**. 4. Select **MotherDuck**. 5. Paste the MotherDuck token and complete the connection form. 6. Create the connection. ## Authentication and configuration - Use a dedicated token for the Omni connection. - Configure schema filters to limit what Omni imports into its model. - Use a separate upload schema if Omni users will upload files. Do not reuse that schema for modeled tables. - Review timezone settings during setup so dashboard results match your reporting conventions. ## Important notes - Omni recommends a Read/Write token because Omni may need to support table uploads and modeling workflows. - Table uploads are optional. If you do not use them, you can keep the connection focused on querying existing MotherDuck tables. ## Use cases - Generate an Omni model from MotherDuck schemas. - Build BI dashboards and topics on top of MotherDuck. - Combine user-uploaded files with governed MotherDuck data in Omni. ## Related content - [Read the Omni announcement for MotherDuck support](https://omni.co/blog/announcing-support-for-motherduck) - [View the full Omni MotherDuck setup guide](https://docs.omni.co/connect-data/setup/motherduck#connecting-motherduck-to-omni) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/reverse-etl/onesignal # OneSignal > Customer engagement platform for delivering personalized, real-time messages across multiple channels based on user behavior. It integrates with MotherDuck for syncing modeled MotherDuck data into operational and customer engagement tools. ## How it works with MotherDuck OneSignal can sync custom event data from MotherDuck into OneSignal so journeys and campaigns can use modeled behavior from your analytics database. ## Prerequisites - A OneSignal plan that includes the required data integration and custom event features. - A MotherDuck service token. - A MotherDuck database with event tables or views. ## Setup 1. In MotherDuck, create a service token for OneSignal. 2. Prepare a table or view with event fields such as event name, user identifier, event timestamp, and event properties. 3. In OneSignal, go to **Data** > **Integrations** and select **Add Integration**. 4. Choose **MotherDuck**. 5. Enter the service token, database name, and connection string, for example `md:my_database`. 6. Select tables or write SQL queries that define which events to sync. 7. Map the MotherDuck result columns to OneSignal custom event fields. ## Authentication and configuration - Use a dedicated MotherDuck service token for OneSignal. - Keep event queries selective by filtering to the time range and columns needed for the sync. - Create separate integrations if you need to sync from multiple MotherDuck databases. ## Important notes - Query complexity and large result sets can affect sync performance. - JSON event properties should use proper JSON typing in the MotherDuck query result. - OneSignal supports table mode and SQL query mode. Use SQL query mode when you need to transform event data before syncing. ## Use cases - Trigger OneSignal Journeys from behavior modeled in MotherDuck. - Sync customer segments, lifecycle events, or product usage events into OneSignal. - Personalize messaging campaigns from MotherDuck tables or views. ## Related content - [View the full OneSignal MotherDuck setup guide](https://documentation.onesignal.com/docs/en/motherduck) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/orchestration/orchestra # Orchestra > Orchestra is a platform for building and managing data pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. ## How it works with MotherDuck Orchestra connects to MotherDuck with the DuckDB Python client and can run scheduled SQL tasks against a configured MotherDuck database. ## Prerequisites - Access to the Orchestra MotherDuck integration. Orchestra marks the integration as beta in its docs. - A MotherDuck database. - A MotherDuck access token with the permissions required for the SQL Orchestra will run. ## Setup 1. In MotherDuck, create an access token. 2. In Orchestra, create a MotherDuck integration connection. 3. Enter the MotherDuck database name. 4. Paste the access token. 5. Use the MotherDuck **Run SQL** action in a pipeline. ![Orchestra MotherDuck access token instructions](../img/orchestra-motherduck-access-token.png) ## Authentication and configuration - The required authentication parameters are the database name and access token. - Use a dedicated token for Orchestra pipelines. - Choose token permissions based on whether the SQL task reads only or writes data. ## Important notes - The Orchestra MotherDuck integration is documented as beta, so availability may vary by account. - Store the token in Orchestra's connection configuration rather than embedding it in SQL tasks. ## Use cases - Schedule SQL transformations in MotherDuck. - Add MotherDuck query tasks to larger Orchestra data pipelines. - Run operational checks or data maintenance SQL on a schedule. ## Related content - [View the full Orchestra MotherDuck setup guide](https://docs.getorchestra.io/docs/integrations/motherduck) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/orchestration/index # Orchestration Tools > Orchestrate data pipelines with MotherDuck Build and manage data pipelines with MotherDuck using these orchestration tools. ## Included pages - [Airflow](https://motherduck.com/docs/integrations/orchestration/airflow): Open-source data orchestration and scheduling platform for building ETL pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. - [Astronomer](https://motherduck.com/docs/integrations/orchestration/astronomer): Astronomer is a platform for authoring, scheduling, and monitoring workflows. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. - [Bacalhau](https://motherduck.com/docs/integrations/orchestration/bacalhau): Bacalhau is a platform for running and managing data pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. - [Dagster](https://motherduck.com/docs/integrations/orchestration/dagster): Dagster is a platform for authoring, scheduling, and monitoring workflows. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. - [Kestra](https://motherduck.com/docs/integrations/orchestration/kestra): Open-source data orchestration and scheduling platform for building ETL pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. - [Orchestra](https://motherduck.com/docs/integrations/orchestration/orchestra): Orchestra is a platform for building and managing data pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. - [Prefect](https://motherduck.com/docs/integrations/orchestration/prefect): Prefect is a platform for authoring, scheduling, and monitoring workflows. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. - [Union.ai](https://motherduck.com/docs/integrations/orchestration/union-ai): Union.ai is a platform for building and managing data pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. - [Windmill](https://motherduck.com/docs/integrations/orchestration/windmill): Windmill is a platform for building and managing data pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. --- Source: https://motherduck.com/docs/integrations/transformation/paradime # Paradime > Modern data transformation platform for building and managing data pipelines. It integrates with MotherDuck for running data transformation projects against MotherDuck. ## How it works with MotherDuck Paradime connects dbt development environments to MotherDuck so dbt models can run against a MotherDuck database. ## Prerequisites - A Paradime workspace and dbt project. - A MotherDuck service token. - The MotherDuck database path and default schema for dbt models. ## Setup 1. In Paradime, open **Settings** > **Connections**. 2. Select **Add New** under the Code IDE connection section. 3. Choose **MotherDuck**. 4. Enter the dbt profile name and target. 5. Configure the profile with the MotherDuck database path, for example `md:jaffle_shop_dev`. 6. Paste the MotherDuck service token. 7. Enter the default schema and thread count. 8. Save the connection and validate it from the Paradime terminal or by running a small dbt model. ![Paradime MotherDuck connection form](../img/paradime-motherduck-connection.png) ## Authentication and configuration - Paradime stores the MotherDuck token as a user-level environment variable named `motherduck_token`. - If your Paradime environment uses IP restrictions, allow traffic from the Paradime IP range for your selected data location. - Configure extensions and DuckDB settings only when your dbt project needs them, such as reading or writing external files. ## Important notes - The Paradime docs include an example with S3 and Parquet settings. Those are optional project settings, not required for a basic MotherDuck connection. - Keep the MotherDuck token at user level so different developers can use their own credentials when needed. ## Use cases - Develop dbt models in Paradime against MotherDuck. - Run validation queries from the Paradime terminal. - Schedule dbt transformations that target MotherDuck. ## Related content - [View the full Paradime MotherDuck setup guide](https://docs.paradime.io/app-help/documentation/settings/connections/development-environment/motherduck) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/file-formats/parquet # Parquet > Parquet is a columnar file format designed for analytics. DuckDB can query Parquet files directly and persist the result as a MotherDuck table. ## How it works with MotherDuck 1. Connect to MotherDuck from a DuckDB client. 2. Point `read_parquet` at a local file, HTTPS URL, S3 path, or another supported storage location. 3. Load the result into a MotherDuck table if you need managed storage, access control, or sharing. ## Example ```sql CREATE TABLE my_table AS SELECT * FROM read_parquet('s3://my-bucket/path/*.parquet'); ``` ## Related content - [DuckDB Parquet documentation](https://duckdb.org/docs/current/data/parquet/overview.html) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck cloud storage integrations](/integrations/cloud-storage/) --- Source: https://motherduck.com/docs/integrations/ingestion/polytomic # Polytomic > Use Polytomic to sync data to and from MotherDuck for ETL and reverse ETL workflows. [Polytomic](https://www.polytomic.com/) is a data integration platform for syncing data between SaaS applications, databases, cloud storage, spreadsheets, and data warehouses. You can use Polytomic with MotherDuck as either a source or destination. Polytomic supports these MotherDuck workflows: - Load data into MotherDuck from SaaS platforms, databases, data warehouses, and cloud storage. - Stream high-scale change data capture (CDC) data into MotherDuck from systems such as PostgreSQL, MySQL, PlanetScale, MongoDB, Amazon DynamoDB, and Amazon S3. - Sync data from MotherDuck into SaaS platforms, databases, spreadsheets, webhooks, and cloud storage. ## Prerequisites - A [MotherDuck account](https://app.motherduck.com/) - A [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token) - A Polytomic workspace - Optional: An S3 bucket with an access key ID, secret access key, bucket name, and region if Polytomic writes data to MotherDuck ## Connect to MotherDuck Polytomic connects to MotherDuck using a MotherDuck access token. 1. In MotherDuck, create or copy an access token. 2. In Polytomic, go to **Connections**. 3. Click **Add Connection**. 4. Select **MotherDuck**. 5. Enter a connection name. 6. Optional: Enter the MotherDuck database name. 7. Paste your MotherDuck access token. 8. If Polytomic will write data to MotherDuck, enter the S3 staging bucket credentials. 9. Click **Test connection**. 10. Click **Save**. ![Polytomic MotherDuck connection form with access token and staging bucket fields](../img/polytomic-motherduck-connection.png) ## S3 staging bucket for writes Polytomic requires S3 credentials when it writes data to MotherDuck. The S3 bucket is a temporary staging area for files that Polytomic loads into MotherDuck; it is not used as permanent data lake storage. If you want Polytomic to write permanent files to S3, configure an S3 destination in Polytomic instead of using the MotherDuck connection's staging bucket. ## Sync data to MotherDuck Use a Polytomic bulk sync when you want to load whole source objects or tables into MotherDuck from SaaS applications, databases, data warehouses, or cloud storage buckets. Use a Polytomic model sync when you want to load the result of a custom SQL model into MotherDuck, such as a custom query from PostgreSQL. ## Sync data from MotherDuck Use a Polytomic model sync to send query results from MotherDuck to downstream tools, including Salesforce, Google Sheets, Airtable, webhooks, databases, and cloud storage. ## Related content - [Polytomic MotherDuck documentation](https://docs.polytomic.com/docs/motherduck) - [Authenticating to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [Service accounts](/key-tasks/service-accounts-guide/) --- Source: https://motherduck.com/docs/integrations/orchestration/prefect # Prefect > Prefect is a platform for authoring, scheduling, and monitoring workflows. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. ## How it works with MotherDuck 1. Create a task, DAG, flow, or job in Prefect that runs a DuckDB or MotherDuck step. 2. Store the MotherDuck token in the orchestrator's secret manager. 3. Use a small validation query before scheduling larger ingestion, transformation, or export jobs. ## Related content - [View the full process in the Prefect documentation](https://docs.prefect.io/v3/tutorials/s3-motherduck) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/dev-tools/puppygraph # Puppygraph > Graph visualization tool for exploring and analyzing data relationships in DuckDB. It integrates with MotherDuck for development workflows that read from or write to MotherDuck. ## How it works with MotherDuck 1. Create a connection in Puppygraph that targets MotherDuck or DuckDB. 2. Store the MotherDuck token as a secret rather than hard-coding it in project files. 3. Validate the connection with a small query, then build the tool-specific workflow on top of that connection. ## Related content - [View the full process in the Puppygraph documentation](https://docs.puppygraph.com/getting-started/querying-duckdb-data-as-a-graph) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck) --- Source: https://motherduck.com/docs/integrations/language-apis-and-drivers/python/python-overview # Python > Python is a programming language for building and deploying web applications. Check out our [Python tutorial](/getting-started/interfaces/client-apis/python/installation-authentication). --- Source: https://motherduck.com/docs/integrations/reverse-etl/index # Development Tools > Reverse ETL tools and utilities that work with MotherDuck Use MotherDuck with various development tools and utilities to enhance your workflow. ## Included pages - [Fivetran Activations (formerly Census)](https://motherduck.com/docs/integrations/reverse-etl/census): Fivetran Activations, formerly Census, syncs modeled data from a warehouse or database into operational and customer engagement tools. MotherDuck is available as a source in Fivetran Activations, which means Activations can read from MotherDuck and sync selected data to supported activation destinations. - [OneSignal](https://motherduck.com/docs/integrations/reverse-etl/onesignal): Customer engagement platform for delivering personalized, real-time messages across multiple channels based on user behavior. It integrates with MotherDuck for syncing modeled MotherDuck data into operational and customer engagement tools. --- Source: https://motherduck.com/docs/integrations/bi-tools/rill-data # Rill Data > Rill Data is a data platform for building and visualizing data. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. ## How it works with MotherDuck Rill can use MotherDuck as the OLAP engine that powers Rill dashboards. This is useful when your dashboard data already lives in MotherDuck and you do not want to ingest it into a separate Rill-managed engine. ## Prerequisites - Rill Developer or Rill Cloud. - A MotherDuck access token. - The MotherDuck database path and schema Rill should use. ## Setup 1. In MotherDuck, create an access token for Rill. 2. In Rill Developer, add MotherDuck as an OLAP connection through **Add Data**. 3. Rill creates a connector file such as `motherduck.yaml` and stores `MOTHERDUCK_TOKEN` in `.env`. 4. Configure the connector with an `md:` path and schema: ```yaml type: connector driver: duckdb token: "{{ .env.MOTHERDUCK_TOKEN }}" path: "md:my_database" schema_name: "my_schema" ``` 5. Set the project's `olap_connector` to the MotherDuck connector. ## Authentication and configuration - Keep `MOTHERDUCK_TOKEN` in `.env` or your Rill Cloud environment variables. - Use `rill env push` when deploying a project that already has the token in the local project environment. - Use the Rill connector YAML reference for optional connector parameters. ## Important notes - Creating a MotherDuck OLAP connection changes the project's default OLAP engine to MotherDuck. - Metrics view SQL should use DuckDB-compatible syntax because Rill sends dashboard queries to MotherDuck. ## Use cases - Build fast dashboards on existing MotherDuck tables. - Use MotherDuck as a bring-your-own OLAP engine for Rill. - Deploy the same Rill project locally and in Rill Cloud with environment-managed credentials. ## Related content - [View the full Rill MotherDuck setup guide](https://docs.rilldata.com/developers/build/connectors/olap/motherduck) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/salesforce # Salesforce > Salesforce is a CRM platform for sales, marketing, service, and customer data. To analyze Salesforce data in MotherDuck, use an ingestion tool that supports Salesforce as a source and MotherDuck as a destination. ## How it works with MotherDuck The most direct documented path is `dlt`, which has a Salesforce source and a MotherDuck destination. 1. Install `dlt` with MotherDuck support. 2. Initialize a Salesforce-to-MotherDuck pipeline. 3. Configure Salesforce credentials and MotherDuck credentials in the generated `.dlt/secrets.toml`. 4. Run the generated pipeline script to load Salesforce resources into a MotherDuck dataset. ```bash pip install "dlt[motherduck]" mkdir salesforce_pipeline cd salesforce_pipeline dlt init salesforce motherduck pip install -r requirements.txt python salesforce_pipeline.py ``` Managed ingestion platforms can also move Salesforce data into MotherDuck. Fivetran supports Salesforce as a source and MotherDuck as a destination. Airbyte supports Salesforce as a source and has a MotherDuck destination. ## Related content - [View the full process in the dlt Salesforce-to-MotherDuck documentation](https://dlthub.com/docs/pipelines/salesforce/load-data-with-python-from-salesforce-to-motherduck) - [dlt MotherDuck destination documentation](https://dlthub.com/docs/dlt-ecosystem/destinations/motherduck) - [Fivetran Salesforce connector documentation](https://fivetran.com/docs/connectors/applications/salesforce) - [Fivetran MotherDuck destination documentation](https://fivetran.com/docs/destinations/motherduck/setup-guide) - [Airbyte Salesforce connector overview](https://airbyte.com/connectors/salesforce) - [Airbyte MotherDuck destination documentation](https://docs.airbyte.com/integrations/destinations/motherduck) --- Source: https://motherduck.com/docs/integrations/data-quality/secoda # Secoda > Data discovery and documentation platform for managing data quality and governance. It integrates with MotherDuck for table monitoring as part of data quality and observability workflows. ## How it works with MotherDuck Secoda connects to MotherDuck as a data warehouse integration for metadata extraction, catalog search, documentation, lineage, and governance workflows. ## Prerequisites - A Secoda workspace with permission to add integrations. - A MotherDuck service token. - Access to the MotherDuck databases and schemas Secoda should catalog. ## Setup 1. In MotherDuck, create or copy a service token. 2. In Secoda, open the **Integrations** tab. 3. Select **Add Integration**. 4. Search for and select **MotherDuck**. 5. Paste the MotherDuck service token. 6. Connect the integration and let Secoda extract metadata. ## Authentication and configuration - Use a dedicated service token for Secoda. - Limit the token to the data assets Secoda should discover and document. - Configure ownership, documentation, and governance rules in Secoda after the metadata sync completes. ## Important notes - Secoda's setup requires only the MotherDuck token from the MotherDuck side. - If assets do not appear after connecting, first verify the token and database access for the account that created it. ## Use cases - Catalog MotherDuck tables, views, schemas, and columns. - Generate and maintain table documentation in Secoda. - Add governance context, lineage, and quality monitoring around MotherDuck assets. ## Related content - [View the full Secoda MotherDuck setup guide](https://docs.secoda.co/integrations/data-warehouses/motherduck) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck) --- Source: https://motherduck.com/docs/integrations/serverless-compute/index # Serverless Compute > Connect to MotherDuck from serverless and edge compute platforms Query MotherDuck from serverless functions and edge runtimes using the [Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint). Because these environments can't run native DuckDB bindings, the Postgres wire protocol provides a thin-client path to MotherDuck with no DuckDB dependencies. ## Included pages - [Cloudflare Workers](https://motherduck.com/docs/integrations/serverless-compute/cloudflare-workers): Cloudflare Workers is an edge compute platform for running serverless functions globally. Workers can connect to MotherDuck through the Postgres endpoint using the pg npm package. --- Source: https://motherduck.com/docs/integrations/dev-tools/shadowtraffic # ShadowTraffic > ShadowTraffic is a synthetic data generation tool for simulating production traffic to your backend. It integrates with MotherDuck for development workflows that read from or write to MotherDuck. ## How it works with MotherDuck ShadowTraffic can generate read and write traffic against a MotherDuck database for development, testing, and load-shaping workflows. ## Prerequisites - ShadowTraffic 1.10.0 or later. - A MotherDuck token. - A target MotherDuck database. ## Setup 1. Create a MotherDuck token and store it in an environment variable such as `MOTHERDUCK_TOKEN`. 2. Add a ShadowTraffic connection with `kind: motherduck`: ```json { "connections": { "md": { "kind": "motherduck", "connectionConfigs": { "token": { "_gen": "env", "var": "MOTHERDUCK_TOKEN" }, "db": "mydb" } } } } ``` 3. Add generators that write to tables or run read queries through that connection. ## Authentication and configuration - Use `token` and `db` for the standard MotherDuck connection. - Use `jdbcUrl` only when you need to control the full JDBC connection string. - Use `queryParams` for MotherDuck connection parameters such as `attach_mode`. - Use `batchConfigs` to tune write batch timing and row count. ## Important notes - ShadowTraffic writes asynchronously. By default it commits when 1000 ms pass or 10000 rows accumulate, whichever happens first. - Automatic table creation is enabled by default. Set `tablePolicy: manual` if you want to manage tables yourself. - For `UPDATE` and `DELETE` operations, generators need a `where` map so ShadowTraffic can identify rows. ## Use cases - Generate synthetic write traffic into MotherDuck tables. - Simulate reads and writes while testing downstream systems. - Use automatic table creation for quick generator iteration, then switch to manual table control for production-like tests. ## Related content - [View the full ShadowTraffic MotherDuck setup guide](https://docs.shadowtraffic.io/connections/motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/sling # Sling > Data integration platform for connecting data sources to warehouses. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck Sling connects to MotherDuck as a database connection that can be used in replication and pipeline workflows. ## Prerequisites - Sling CLI or Sling Platform. - A MotherDuck service token. - The target MotherDuck database name. ## Setup Configure the MotherDuck connection with the required `type`, `database`, and `motherduck_token` values: ```bash sling conns set MOTHERDUCK type=motherduck database=my_db motherduck_token= ``` You can also use a connection URL: ```bash sling conns set MOTHERDUCK url="motherduck://my_db?motherduck_token=" ``` For checked-in configuration, define the connection in Sling's environment file and load the token from your secret manager before running Sling. ## Authentication and configuration - `database` and `motherduck_token` are required. - `schema` sets the default schema. - `read_only` can be used for workflows that should not write to MotherDuck. - `motherduck_attach_mode` can be set to `workspace` or `single` when you need explicit attach behavior. ## Important notes - Keep the MotherDuck token out of committed Sling configuration. - Sling's MotherDuck docs list additional copy and DuckDB CLI options. Most MotherDuck workflows only need the database, token, and optional schema. - A `.duckdbrc` file can interfere with Sling because Sling invokes DuckDB under the hood. ## Use cases - Replicate data from files, APIs, and databases into MotherDuck. - Use Sling CLI in scheduled jobs or CI workflows. - Move data from MotherDuck to another supported destination when needed. ## Related content - [View the full Sling MotherDuck setup guide](https://docs.slingdata.io/connections/database-connections/motherduck) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/data-quality/soda # Soda > Data quality platform for monitoring and managing data quality in your pipelines. It integrates with MotherDuck for table monitoring as part of data quality and observability workflows. ## How it works with MotherDuck Soda connects to MotherDuck through the `soda-duckdb` package and runs quality scans against a MotherDuck `md:` database connection. ## Prerequisites - Soda installed in the environment that will run scans. - The `soda-duckdb` package. - A MotherDuck access token and database path. ## Setup 1. Install the Soda DuckDB package: ```bash pip install soda-duckdb ``` 2. Add a MotherDuck data source to your Soda configuration: ```yaml data_source motherduck: type: duckdb database: "md:sample_data?motherduck_token=" read_only: true ``` 3. Test the connection: ```bash soda test-connection -d motherduck -c configuration.yml -V ``` ## Authentication and configuration - The MotherDuck token can be passed in the `md:` connection string shown in Soda's reference configuration. - Store the token through your deployment secret manager or CI secret store before rendering the Soda configuration. - Set `read_only: true` for scan-only workflows. ## Important notes - Some Soda users report using `path` instead of `database` successfully. If `database` does not work in your environment, test `path` with the same `md:` value. - Keep Soda checks focused on the tables and columns you need to monitor so scans remain predictable. ## Use cases - Run SodaCL data quality checks against MotherDuck tables. - Validate pipeline outputs after loading data into MotherDuck. - Add MotherDuck quality scans to CI or scheduled data checks. ## Related content - [View the full Soda MotherDuck setup guide](https://docs.soda.io/data-source-reference/connect-motherduck) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck) --- Source: https://motherduck.com/docs/integrations/sql-ides/index # SQL IDEs > Use MotherDuck with your favorite SQL development environments Connect to MotherDuck using popular SQL development environments and query editors. ## Included pages - [DataGrip](https://motherduck.com/docs/integrations/sql-ides/datagrip): JetBrains DataGrip is a cross-platform IDE for working with SQL and noSQL databases. It includes a DuckDB integration, which makes connecting to MotherDuck easy. - [DBeaver](https://motherduck.com/docs/integrations/sql-ides/dbeaver): DBeaver Community is a free cross-platform database integrated development environment (IDE). It includes a DuckDB integration, so it is a great choice for querying MotherDuck. - [DuckDB UI](https://motherduck.com/docs/integrations/sql-ides/duckdb-ui): DuckDB UI is a local SQL interface for DuckDB. You can use it with MotherDuck when you want a lightweight SQL workspace backed by the DuckDB CLI and an md: connection. --- Source: https://motherduck.com/docs/integrations/transformation/sqlmesh # SQLMesh > SQLMesh is a data transformation tool for building and managing data pipelines. It integrates with MotherDuck for running data transformation projects against MotherDuck. ## How it works with MotherDuck SQLMesh can use MotherDuck as its execution engine for transformation projects. ## Prerequisites - SQLMesh installed with DuckDB support, for example `sqlmesh[duckdb]`. - A SQLMesh project. - A MotherDuck access token. For shared projects, use a service account that owns SQLMesh-managed objects. ## Setup 1. Install SQLMesh with DuckDB support: ```bash pip install "sqlmesh[duckdb]" ``` 2. Create a MotherDuck token and store it in `MOTHERDUCK_TOKEN`. 3. Add a MotherDuck gateway to `config.yaml`: ```yaml gateways: motherduck: connection: type: motherduck token: {{ env_var('MOTHERDUCK_TOKEN') }} default_gateway: motherduck ``` 4. Validate the connection: ```bash sqlmesh info ``` 5. Run a plan when the connection succeeds: ```bash sqlmesh plan ``` ## Authentication and configuration - Load the token from an environment variable instead of committing it in `config.yaml`. - SQLMesh supports persistent and ephemeral catalogs for MotherDuck projects. - The built-in scheduler uses the `motherduck` engine adapter type. ## Important notes - SQLMesh needs permission to create and access the databases, schemas, and objects it manages. - Use a service account for shared or automated SQLMesh projects so ownership is stable. - The SQLMesh documentation also covers advanced connection options such as extensions, connector config, and external-source secrets. ## Use cases - Run SQLMesh transformation plans against MotherDuck. - Manage model state and environments with MotherDuck as the execution engine. - Use service-account credentials for scheduled SQLMesh runs. ## Related content - [View the full SQLMesh MotherDuck setup guide](https://sqlmesh.readthedocs.io/en/stable/integrations/engines/motherduck/) - [MotherDuck service accounts](/key-tasks/service-accounts-guide/) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/stacksync # Stacksync > Stacksync helps your teams access and manipulate CRM and ERP data through your existing databases. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck Stacksync connects to MotherDuck for two-way sync workflows between MotherDuck and business systems. ## Prerequisites - A Stacksync workspace. - A MotherDuck access token. - Optional: the database name and schema if you do not want to use Stacksync's defaults. ## Setup 1. In MotherDuck, open **Settings** > **General** and create an access token. 2. Copy the generated token. 3. In Stacksync, open **Connections** and select **Create new connection**. 4. Search for and select **MotherDuck**. 5. Paste the token. 6. Update the database name or schema if required, then save the connection. ![Stacksync MotherDuck connection form with token, database, and schema fields](../img/stacksync-motherduck-connection.png) ## Authentication and configuration - Use a dedicated MotherDuck token for Stacksync. - Configure the database and schema fields when your sync should not use the defaults. - Revoke or rotate the token from MotherDuck when the Stacksync connection should no longer have access. ## Important notes - Stacksync's guide shows MotherDuck token creation from the web UI. If you use service accounts, create the token under the account that should own the sync access. - Test with a small sync before enabling a production two-way sync. ## Use cases - Sync operational app data into MotherDuck. - Use MotherDuck as a source for downstream business applications. - Keep CRM or ERP data in sync with a MotherDuck-backed analytics workflow. ## Related content - [View the full Stacksync MotherDuck setup guide](https://docs.stacksync.com/two-way-sync/connectors/motherduck) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/streamkap # Streamkap > Streamkap is a stream processing platform built for Change Data Capture (CDC) and event sources. It makes it easy to move operational data into analytics systems like MotherDuck with low latency and high reliability. Streamkap offers various sources, including PostgreSQL, MySQL, SQL Server, a range of SQL and NoSQL databases, Kafka, and other storage systems. Streamkap is designed to get you streaming in minutes without a heavy setup. You focus on your business, and Streamkap handles the hard parts: * Lightweight in-stream transformations let you preprocess, clean, and enrich data with minimal latency and cost. * Automatically adapts to schema changes—added or removed fields, renamed columns, evolving data types, and nested structures. * Built-in observability and automated recovery reduce operational overhead. * Fully managed through API or Terraform, integrates with CI/CD workflows, and automates environment provisioning. * Deploy multiple service versions to isolate workloads—logically (per microservice or environment) or physically (across regions or infrastructure). * Choose from Streamkap Cloud or BYOC (Bring Your Own Cloud) for maximum flexibility and security. You can explore Streamkap’s MotherDuck integration and examples in the [official documentation.](https://docs.streamkap.com/motherduck) ## Overview This guide explains how to stream data from Streamkap into the MotherDuck database using Amazon S3 as an intermediary. We'll utilise the S3 connector to first stream data into an S3 bucket. Then, you can configure MotherDuck to read from the S3 bucket to ingest the data into your database. * Streamkap to S3: Streamkap is Kafka-based, so Kafka messages are streamed into an Amazon S3 bucket through an existing dedicated S3 connector. Please refer to the Streamkap’s [Kafka to S3 Streaming Guide](https://docs.streamkap.com/s3) for detailed instructions. * S3 to MotherDuck: MotherDuck is configured to read the data from the S3 bucket and load it into the database. ## Prerequisites * Amazon S3 Bucket: A bucket in Amazon S3 where data from Streamkap will be streamed. * MotherDuck Account: A valid MotherDuck account and database setup where the data will be loaded. * Streamkap’s Kafka S3 Connector: Your Kafka to S3 connector configured and running. ## MotherDuck setup Once data is available in the S3 bucket, you can configure MotherDuck to read from the S3 bucket and load it into your database. Follow these steps: ## Configure the S3 source in MotherDuck To read data from the S3 bucket into MotherDuck, you need to configure a data source that points to the S3 bucket. This involves creating a connection between MotherDuck and your S3 bucket using AWS credentials. 1. Log in to MotherDuck and navigate to your workspace or database. 2. Go to the Secrets. 3. Add new secret and choose Amazon S3 as the secret type. 4. Provide the necessary details to access the S3 bucket: * Secret Name: The name of your source connection details. * Region: The region of your S3 bucket (e.g., us-west-2). * Access Key ID: Your AWS Access Key ID. * Secret Access Key: Your AWS Secret Access Key. ### SQL command for secret configuration Alternatively, you can configure the secret using SQL. Below is an example configuration for setting up the secret: ```sql CREATE SECRET IN MOTHERDUCK ( TYPE S3, KEY_ID 'access_key', SECRET 'secret_key', REGION 'us-east-1' ); ``` ### Verify existing secrets To check your existing secrets, you can run the following SQL command: ```sql FROM duckdb_secrets()` ``` ![Streamkap S3 secret configuration in MotherDuck](../img/streamkap_image1.png) ## Query data from the S3 bucket Once the connection between MotherDuck and your S3 bucket is established, you can define a schema and table in MotherDuck or query the data directly from the S3 bucket. Since your Kafka stream might be writing multiple files to the S3 bucket, we recommend using a wildcard `*` to read all files in a folder. This will enable MotherDuck to automatically pick up new files as they are written to the S3 bucket. Here is an example SQL query to read data from your S3 bucket (using a wildcard for streaming): ```sql SELECT key.id, value.name, value.note FROM read read_parquet('s3://streamkap-s3-test-bucket/parquet_test/*') ``` ![Query results from S3 bucket in MotherDuck](../img/streamkap_image2.png) --- Source: https://motherduck.com/docs/integrations/web-development/streamlit # Streamlit > Streamlit is a web development platform for building and visualizing data. It integrates with MotherDuck for applications that query MotherDuck through a supported server-side connection. ## How it works with MotherDuck 1. Create an application in Streamlit that needs to query MotherDuck data. 2. Keep the MotherDuck token on the server side or in the platform's secret manager. 3. Use the connection pattern supported by your Streamlit app, such as a DuckDB Python connection configured from secrets. ## Related content - [View the full process in the Streamlit documentation](https://docs.streamlit.io/develop/concepts/connections/connecting-to-data) - [MotherDuck Python overview](/integrations/language-apis-and-drivers/python/python-overview) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/databases/supabase # Supabase > Supabase is a Postgres platform for building applications with a managed database, APIs, authentication, storage, and realtime features. Supabase's documented DuckDB Wrapper can query MotherDuck from a Supabase Postgres database through a foreign data wrapper. ## How it works with MotherDuck 1. Enable the Supabase Wrappers extension. 2. Create the `duckdb_wrapper` foreign data wrapper. 3. Store a MotherDuck token in Supabase Vault, then create a foreign server with `type 'md'`, the MotherDuck database name, and the Vault-backed token option. 4. Create a schema for the foreign tables. 5. Import a MotherDuck schema, such as `main`, into Supabase and query the imported foreign tables from Postgres. ```sql create extension if not exists wrappers with schema extensions; create foreign data wrapper duckdb_wrapper handler duckdb_fdw_handler validator duckdb_fdw_validator; create server duckdb_server_md foreign data wrapper duckdb_wrapper options ( type 'md', database 'my_db', vault_motherduck_token '' ); create schema if not exists duckdb; import foreign schema "main" from server duckdb_server_md into duckdb; select * from duckdb.my_table limit 10; ``` The Supabase DuckDB Wrapper is a read path into MotherDuck: it supports querying foreign tables, including `where`, `order by`, and `limit` pushdown, but does not support inserts, updates, deletes, or truncates through the foreign tables. ## Related content - [View the full process in the Supabase DuckDB Wrapper documentation](https://supabase.com/docs/guides/database/extensions/wrappers/duckdb) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [PostgreSQL and MotherDuck](/integrations/databases/postgres) --- Source: https://motherduck.com/docs/integrations/bi-tools/tableau/index # Tableau > Tableau is a widely-used business intelligence and data visualization platform that enables data analysts to build interactive dashboards and reports. You can connect Tableau Cloud to MotherDuck through the built-in PostgreSQL connector using MotherDuck's Postgres endpoint. For Tableau Desktop and Server, use the DuckDB JDBC connector. ## Included pages - [Tableau Cloud with MotherDuck](https://motherduck.com/docs/integrations/bi-tools/tableau/tableau-cloud): Connect Tableau Cloud to MotherDuck using the Postgres endpoint for dashboards and reports. - [Tableau Desktop and Server with MotherDuck](https://motherduck.com/docs/integrations/bi-tools/tableau/tableau-desktop): Connect Tableau Desktop or Server to MotherDuck using the DuckDB JDBC driver and Tableau connector. - [Tableau Bridge (legacy)](https://motherduck.com/docs/integrations/bi-tools/tableau/tableau-bridge): Connect Tableau Cloud to MotherDuck using Tableau Bridge and the DuckDB JDBC connector. --- Source: https://motherduck.com/docs/integrations/orchestration/union-ai # Union.ai > Union.ai is a platform for building and managing data pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. ## How it works with MotherDuck 1. Create a task, DAG, flow, or job in Union.ai that runs a DuckDB or MotherDuck step. 2. Store the MotherDuck token in the orchestrator's secret manager. 3. Use a small validation query before scheduling larger ingestion, transformation, or export jobs. ## Related content - [View the full process in the Union.ai documentation](https://docs-legacy.flyte.org/en/latest/flytesnacks/examples/duckdb_plugin/index.html) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/ingestion/unstructured-io # Unstructured.io > Unstructured.io is an ingestion platform for processing unstructured data. It integrates with MotherDuck for loading data from operational systems, APIs, files, or event streams. ## How it works with MotherDuck 1. Create a pipeline in Unstructured.io with MotherDuck as the destination or analytical store. 2. Create a MotherDuck access token and add it to the tool's secrets or destination settings. 3. Choose the target database and schema, then run a small initial sync before scheduling production loads. ## Related content - [Read the Unstructured blog on the MotherDuck integration](https://unstructured.io/blog/unstructured-s-new-motherduck-integration) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/dev-tools/vanna # Vanna > Vanna is a data science and AI framework for building and sharing data applications. It integrates with MotherDuck for development workflows that read from or write to MotherDuck. ## How it works with MotherDuck 1. Create a connection in Vanna that targets MotherDuck or DuckDB. 2. Store the MotherDuck token as a secret rather than hard-coding it in project files. 3. Validate the connection with a small query, then build the tool-specific workflow on top of that connection. ## Related content - [View the full process in the Vanna documentation](https://vanna.ai/docs/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) - [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck) --- Source: https://motherduck.com/docs/integrations/web-development/index # Web Development > Build web applications with MotherDuck Use MotherDuck to power your web applications and services. ## Included pages - [Vercel](https://motherduck.com/docs/integrations/web-development/vercel): Vercel is a web development platform for building and deploying web applications. - [Streamlit](https://motherduck.com/docs/integrations/web-development/streamlit): Streamlit is a web development platform for building and visualizing data. It integrates with MotherDuck for applications that query MotherDuck through a supported server-side connection. --- Source: https://motherduck.com/docs/integrations/orchestration/windmill # Windmill > Windmill is a platform for building and managing data pipelines. It integrates with MotherDuck for scheduling DuckDB and MotherDuck tasks in repeatable data workflows. ## How it works with MotherDuck 1. Create a task, DAG, flow, or job in Windmill that runs a DuckDB or MotherDuck step. 2. Store the MotherDuck token in the orchestrator's secret manager. 3. Use a small validation query before scheduling larger ingestion, transformation, or export jobs. ## Related content - [View the full process in the Windmill documentation](https://www.windmill.dev/docs/core_concepts/data_pipelines#canonical-data-pipeline-in-windmill-w-polars-and-duckdb) - [Loading data into MotherDuck](/key-tasks/loading-data-into-motherduck/) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/data-science-ai/wobby # Wobby > Wobby provides AI analysts for delivering business-ready insights in Slack or Teams. It integrates with MotherDuck for connecting those analysis workflows to your data. ## How it works with MotherDuck Wobby connects to MotherDuck as a data source for AI analyst workflows. ## Prerequisites - A Wobby workspace. - A valid MotherDuck access token. - The MotherDuck database name and schemas Wobby should query. ## Setup 1. In MotherDuck, create an access token and copy it. 2. In Wobby, open **Connections**. 3. Select the plus button and choose **MotherDuck**. 4. Enter a display name. 5. Enter the database name. If your database path is `md:my_database`, enter `my_database`. 6. Paste the access token and select the schemas to connect. 7. Test and save the connection. ## Authentication and configuration - Use a dedicated token for Wobby. - Select only the schemas Wobby should use for AI analysis. - Treat the access token like a password and rotate it if access changes. ## Important notes - Wobby expects the database name, not the full `md:` connection string. - If the test fails, check the token, database name, and selected schemas first. ## Use cases - Let Wobby agents answer questions over MotherDuck data. - Connect specific schemas to a business-facing AI analyst workflow. - Use MotherDuck as the analytical source for Slack or Teams insights. ## Related content - [View the full Wobby MotherDuck setup guide](https://docs.wobby.ai/connections/connect-a-data-source/motherduck) - [MotherDuck Python overview](/integrations/language-apis-and-drivers/python/python-overview) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck) --- Source: https://motherduck.com/docs/integrations/bi-tools/zenlytic # Zenlytic > Zenlytic is a data visualization platform for building and visualizing data. It integrates with MotherDuck for dashboards, semantic models, and embedded analytics workflows. ## How it works with MotherDuck Zenlytic connects to MotherDuck as a data source for governed metrics, dashboards, and AI-assisted analysis. ## Prerequisites - A Zenlytic workspace with permission to add data sources. - A MotherDuck service token with access to the databases Zenlytic should use. - The database name, if you want Zenlytic to connect to one database by default. ## Setup 1. In MotherDuck, create a service token for Zenlytic and copy it. 2. In Zenlytic, open **Settings** > **Data Sources**. 3. Select **Add Data Source** and choose **MotherDuck**. 4. Paste the service token. 5. Optionally enter a database name. 6. Test the connection, then save it. ## Authentication and configuration - Use a token with read access to the data Zenlytic should model. - If you omit the database name, configure the target database later in Zenlytic. - Rotate the token from MotherDuck if a Zenlytic workspace or project no longer needs access. ## Important notes - Zenlytic's setup guide calls out token permissions as the first troubleshooting check. If the connection fails, verify the token and database name first. - Keep the token scoped to analytics workloads rather than reusing a broad personal token. ## Use cases - Build a metrics layer over MotherDuck data. - Let teams ask governed analytics questions in Zenlytic. - Connect a specific MotherDuck database to a Zenlytic workspace. ## Related content - [View the full Zenlytic MotherDuck setup guide](https://docs.zenlytic.com/data-sources/motherduck_setup) - [MotherDuck authentication](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck)