From a PostgreSQL or MySQL Database
Using PostgresSQL or MySQL DuckDB Extensions
DuckDB's PostgreSQL extension and MySQL extension makes it extremely easy to connect to and access data stored in your OLTP databases. Once connected, you can just as easily export the data to MotherDuck to offload analytical queries while benefiting from data centralization, persistence, and data sharing capabilities. In this guide we will demonstrate this workflow with the PostgreSQL extension. Consult the DuckDB MySQL extension documentation to make adjustments to the steps to work with MySQL databases.
MotherDuck does not yet support the PostgreSQL and MySQL extensions, so you need to perform the following steps on your own computer or cloud computing resource. We are working on supporting the PostgreSQL extension on the server side so that this can happen within the MotherDuck app in the future with improved performance.
Prerequisites
- PostgreSQL Database Credentials: Ensure you have access details to the PostgreSQL database, including host address, port, and user credentials. You can put the user credentials in the PostgreSQL Password File, store them in environment variables, or pass them inline in the script below.
- Network Connectivity: Your machine must be able to connect to the target PostgreSQL database.
- MotherDuck Credentials: MotherDuck credentials should be set up. If not, follow the steps in Authenticating to MotherDuck.
- DuckDB: Either the DuckDB command-line interface or Python + the DuckDB package should be installed and operational. See the Getting Started tutorials for instructions to install DuckDB.
Steps
The following SQL script reads from a table in the PostgreSQL database and write it to the table named my_db.pg_data_schema.first_pg_table
in MotherDuck.
Fill in the placeholders <dbname>
, <host_address>
, <username>
, <password>
, <schema>
, <table>
, and <number_of_rows>
with the appropriate values and save the script to a file, e.g., ingest_data_from_postgres.sql
.
-- Connect to a MotherDuck database.
ATTACH 'md:';
USE 'my_db';
-- Optionally create a schema, by default MotherDuck uses the main schema;
CREATE SCHEMA IF NOT EXISTS pg_data_schema;
-- Ingest data from PostgreSQL to a MotherDuck table
CREATE OR REPLACE TABLE pg_data_schema.first_pg_table AS
SELECT * FROM
postgres_scan('dbname=<dbname> host=<host_address> user=<username> password=<password> connect_timeout=10', '<schema>', '<table>')
-- optionally limit the number of rows ingested
LIMIT <number_of_rows>;
-- Optional: Verify the number of rows in the MotherDuck table
SELECT count(1) FROM pg_data_schema.first_pg_table;
Run with DuckDB CLI
After filling out the placeholders, you can either execute the statements line by line in the DuckDB CLI, or save the commands in a file, e.g., ingest_data_from_postgres.sql
, and run:
> duckdb < ingest_data_from_postgres.sql
Run with Python
You can also execute it using Python with the DuckDB package.
import duckdb
with open("ingest_data_from_postgres.sql", 'r') as f:
s = f.read()
duckdb.sql(s)
After completing these steps, you should see the new table show up in the MotherDuck Web UI.
Using a MotherDuck integration Partners
MotherDuck collaborates with various integration partners to facilitate data transfer in diverse ways—including change data capture (CDC)—from your PostgreSQL or MySQL database to MotherDuck. For example, you can refer to our Estuary guide that demonstrates how to stream data from Neon, a PostgreSQL-based database, to MotherDuck. To explore the full range of solutions tailored to your needs, visit our MotherDuck ecosystem partners page.