Connecting to MotherDuck

A single DuckDB connection executes one query at a time and aims to maximize the performance of that query, which makes reusing a single connection both simple and performant. We recommend starting with the simplest way of connecting to MotherDuck and running queries. If that does not meet your requirements, explore the advanced use cases described in the subsequent sections.

Create a connection

The code snippet below shows how to create a connection to a MotherDuck database from Python. Connections can also be created from the CLI and from the JDBC and NodeJS language APIs.

To connect to your MotherDuck database, use duckdb.connect("md:my_database_name") which will return a DuckDBPyConnection object that you can use to interact with your database.

import duckdb

# Create a connection to the my_db database
conn = duckdb.connect("md:my_db")

# Run query
conn.sql("CREATE TABLE items (item VARCHAR, value DECIMAL(10, 2), count INTEGER)")
conn.sql("INSERT INTO items VALUES ('jeans', 20.0, 1), ('hammer', 42.2, 2)")
# Fetch the rows before the connection is closed
res = conn.sql("SELECT * FROM items").fetchall()

# Close the connection
conn.close()

Multiple Connections and the Database Instance cache

DuckDB clients in Python, R, JDBC, and ODBC prevent redundant reinitialization by keeping instances of database-global context cached by the database path. Other language APIs are likely to get similar functionality over time.

When connecting to MotherDuck, the instance is cached for an additional 15 minutes after the last connection is closed (see Setting Custom Database Instance Cache TTL for how to override this value). For an application that creates and closes connections frequently, this could provide a significant speedup for connection creation, as the same catalog data can be reused across connections.

This means that only the first of multiple connections to the same database will take the time to load the MotherDuck extension, verify its signature, and fetch the catalog metadata.

con1 = duckdb.connect("md:my_db")  # MotherDuck catalog fetched
con2 = duckdb.connect("md:my_db")  # MotherDuck catalog reused
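The caching behavior described above can be pictured with a small toy model. This is only a sketch for building intuition, not DuckDB's implementation: the `InstanceCache` class, its fields, and the injected clock are all hypothetical, and the expensive setup (loading the extension, fetching the catalog) is represented by a simple counter.

```python
import time
from typing import Callable, Dict, List


class InstanceCache:
    """Toy model of a path-keyed instance cache with an inactivity TTL.

    Illustration only: this is NOT how DuckDB implements its cache.
    """

    def __init__(self, ttl_seconds: float = 15 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.loads = 0  # how many times the expensive setup ran
        # path -> [instance, open connection count, time of last close]
        self._entries: Dict[str, List] = {}

    def connect(self, path: str) -> object:
        entry = self._entries.get(path)
        if entry is not None:
            _, open_count, closed_at = entry
            # An instance can only expire once no connection is open.
            if open_count == 0 and self.clock() - closed_at > self.ttl_seconds:
                entry = None
        if entry is None:
            self.loads += 1  # stands in for extension load + catalog fetch
            entry = [object(), 0, 0.0]
            self._entries[path] = entry
        entry[1] += 1
        return entry[0]

    def close(self, path: str) -> None:
        entry = self._entries[path]
        entry[1] -= 1
        if entry[1] == 0:
            entry[2] = self.clock()  # TTL countdown starts at the last close
```

In this model, two `connect` calls with the same path share one instance and trigger a single load, and a reconnect within the TTL after the last close still reuses it.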

For language APIs that do not yet have a database instance cache, reusing the same database instance will prevent redundant reinitialization:

import { DuckDBInstance } from '@duckdb/node-api';

// Create the database instance once and reuse it for every connection
const db = await DuckDBInstance.create('md:my_db');

const con1 = await db.connect();
const con2 = await db.connect();

Setting Custom Database Instance Cache Time (TTL)

By default, DuckDB APIs that support database instance caching keep reusing the same database instance for MotherDuck connections until 15 minutes after the last connection is closed. In some cases, you may want to make that period longer (to avoid redundant reinitialization) or shorter (to connect to the same database with a different configuration).

The database TTL value can be set either at the initial connection time, or by using the SET command at any point. Any valid DuckDB Instant part specifiers can be used for the TTL value, for example '5s', '3m', or '1h'.

con = duckdb.connect("md:my_db?dbinstance_inactivity_ttl=1h")
con.close()

# different database connection string (without `?dbinstance_inactivity_ttl=1h`), no instance cached; TTL is 15 minutes (default)
con2 = duckdb.connect("md:my_db")

# allow the database instance to expire immediately
con2.execute("SET motherduck_dbinstance_inactivity_ttl='0s'")

# the database instance can only expire after the last connection is closed
con2.close()

# new database instance with a new TTL (the 15 minute default)
con3 = duckdb.connect("md:my_db")
con3.close()

# the last TTL for this database was 15 minutes; the cached database instance will be reused
con4 = duckdb.connect("md:my_db")
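Because the cache is keyed by the full connection string, it can help to build the path and the SET statement in one place so that the TTL is applied consistently. The following is a minimal sketch: the helper names `md_path` and `set_ttl_statement` are hypothetical, while the `dbinstance_inactivity_ttl` parameter and the `motherduck_dbinstance_inactivity_ttl` setting are the ones used in the example above.

```python
from typing import Optional
from urllib.parse import urlencode


def md_path(database: str, ttl: Optional[str] = None) -> str:
    """Build an 'md:' connection string, optionally with an instance cache TTL."""
    path = f"md:{database}"
    if ttl is not None:
        path += "?" + urlencode({"dbinstance_inactivity_ttl": ttl})
    return path


def set_ttl_statement(ttl: str) -> str:
    """Build the SET statement that changes the TTL on an open connection."""
    if "'" in ttl:
        raise ValueError("TTL must not contain quotes")
    return f"SET motherduck_dbinstance_inactivity_ttl='{ttl}'"
```

With these helpers, `duckdb.connect(md_path("my_db", ttl="1h"))` and `con.execute(set_ttl_statement("0s"))` would correspond to the two ways of setting the TTL shown above. Note that `md_path("my_db")` and `md_path("my_db", ttl="1h")` produce different strings and therefore refer to different cached instances.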

Connect to multiple databases

If you need to run one or more queries in succession against the same MotherDuck account, a single database connection is enough. To work with several databases in that account, you can either reuse the same DuckDBPyConnection instance directly, or create copies of the connection using the .cursor() method.

Note: FROM <table name> is a shorthand version of SELECT * FROM <table name>.

Example 1: Reuse the same DuckDB Connection

To connect to different databases in the same MotherDuck account, you can use the same connection object and simply fully qualify the names of the tables in your query.

conn = duckdb.connect("md:my_db")

res1 = conn.sql("FROM my_db1.main.tbl")
res2 = conn.sql("FROM my_db2.main.tbl")
res3 = conn.sql("FROM my_db3.main.tbl")

conn.close()
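When the database names are only known at runtime, the fully qualified names can be assembled programmatically. The sketch below is one possible approach, assuming the default `main` schema as in the example above; the helper names `quote_ident`, `qualified`, and `select_all` are hypothetical.

```python
def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified(database: str, table: str, schema: str = "main") -> str:
    """Return a fully qualified, quoted name: database.schema.table."""
    return ".".join(quote_ident(part) for part in (database, schema, table))


def select_all(database: str, table: str, schema: str = "main") -> str:
    """Build a SELECT * query against a fully qualified table name."""
    return f"SELECT * FROM {qualified(database, table, schema)}"


# One query string per database, all runnable on the same connection
queries = [select_all(db, "tbl") for db in ("my_db1", "my_db2", "my_db3")]
```

Each string in `queries` could then be passed to `conn.sql(...)` on the single shared connection.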

Example 2: Create copies of the initial DuckDB Connection

conn.cursor() returns a copy of the DuckDB connection, with a reference to the existing DuckDB database instance. Closing the original connection also closes all associated cursors.

conn = duckdb.connect("md:my_db")

cur1 = conn.cursor()
cur2 = conn.cursor()
cur3 = conn.cursor()

cur1.sql("USE my_db1")
cur2.sql("USE my_db2")
cur3.sql("USE my_db3")

res = []
for cur in [cur1, cur2, cur3]:
    res.append(cur.sql("SELECT * FROM tbl"))

# This closes the original DuckDB connection and all cursors
conn.close()

Note: duckdb.connect(path) creates and caches a DuckDB instance. Subsequent calls with the same path reuse this instance. New connections to the same instance are independent, similar to conn.cursor(), but closing one doesn't affect others. To create a new instance instead of using the cached one, make the path unique (e.g., md:my_db?user=<unique ID>).
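One way to make the path unique is to append a random identifier. This is a minimal sketch: the helper name `unique_md_path` is hypothetical, and the `user` parameter simply follows the example in the note above.

```python
import uuid


def unique_md_path(database: str) -> str:
    """Return an 'md:' path that differs on every call, bypassing the instance cache."""
    return f"md:{database}?user={uuid.uuid4().hex}"
```

Passing `unique_md_path("my_db")` to `duckdb.connect` would then yield a path that has not been seen before, so no cached instance can match it. Each such connection pays the full initialization cost again, so this is best reserved for cases that really need an isolated instance.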

Example 3: Create multiple connections

You can also create multiple connections to the same MotherDuck account using different DuckDB instances. However, keep in mind that each connection takes time to establish, and if connection times are an important factor for your application, it might be beneficial to consider Example 1 or Example 2.

Note: If you need to run queries on separate connections in quick succession, instead of opening and closing a connection for every query, we recommend using a Connection Pool (Python, JDBC or NodeJS).

conn1 = duckdb.connect("md:my_db1")
conn2 = duckdb.connect("md:my_db2")
conn3 = duckdb.connect("md:my_db3")

res1 = conn1.sql("SELECT * FROM tbl")
res2 = conn2.sql("SELECT * FROM tbl")
res3 = conn3.sql("SELECT * FROM tbl")

conn1.close()
conn2.close()
conn3.close()
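The connection pool idea mentioned above can be sketched with the standard library alone. This is an illustrative outline, not the pool from the MotherDuck documentation: the `ConnectionPool` class is hypothetical, and it accepts any zero-argument factory that returns an object with a `close()` method.

```python
import queue
from contextlib import contextmanager
from typing import Any, Callable, Iterator, Optional


class ConnectionPool:
    """Fixed-size pool that hands out pre-created connections."""

    def __init__(self, connect: Callable[[], Any], size: int) -> None:
        self._idle: "queue.Queue[Any]" = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # pay the connection cost once, up front

    @contextmanager
    def connection(self, timeout: Optional[float] = None) -> Iterator[Any]:
        conn = self._idle.get(timeout=timeout)  # blocks until one is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing

    def close(self) -> None:
        """Close every idle connection; call once all work is finished."""
        while True:
            try:
                self._idle.get_nowait().close()
            except queue.Empty:
                break
```

With DuckDB, the factory might be something like `lambda: duckdb.connect("md:my_db")`, and a query would run inside `with pool.connection() as conn:`. Since each DuckDB connection executes one query at a time, the pool size effectively caps how many queries the application runs concurrently.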