---
sidebar_position: 1
title: PostgreSQL
description: Replicate PostgreSQL tables to MotherDuck using DuckDB and the PostgreSQL extension.
---

This page shows SQL patterns for connecting DuckDB to PostgreSQL, connecting to MotherDuck, and writing data from PostgreSQL into MotherDuck. For more complex replication scenarios, use one of our [ingestion partners](https://motherduck.com/ecosystem/?category=Ingestion).

If you are looking for the [pg_duckdb extension](https://github.com/duckdb/pg_duckdb), see the [pg_duckdb explainer page](/concepts/pgduckdb).

To skip the documentation and look at the entire script, expand the element below:

<details>
    <summary>SQL script</summary>
    ```sql
    -- install the PostgreSQL extension in DuckDB
    INSTALL postgres;
    LOAD postgres;

    -- tune the local DuckDB client for a larger initial load
    SET threads = 4;
    SET memory_limit = '4GB';
    SET pg_connection_limit = 4;
    SET pg_pages_per_task = 250;

    -- attach PostgreSQL as pg_db
    ATTACH 'dbname=postgres user=postgres host=127.0.0.1' AS pg_db (TYPE POSTGRES, READ_ONLY);

    -- connect to MotherDuck
    ATTACH 'md:';
    USE my_db;

    -- copy a PostgreSQL table into MotherDuck
    CREATE OR REPLACE TABLE main.postgres_table AS
    SELECT * FROM pg_db.public.some_table;
    ```
</details>

## Loading the PostgreSQL extension and authenticating

:::info
MotherDuck does not yet support the PostgreSQL and MySQL extensions, so you need to perform the following steps on your own computer or cloud computing resource. We are working on server-side support for the PostgreSQL extension so that, in the future, this workflow can run within the MotherDuck app with improved performance.
:::

The first step is to install and load the PostgreSQL extension using the [DuckDB CLI](/getting-started/interfaces/connect-query-from-duckdb-cli):

```sql
INSTALL postgres;
LOAD postgres;
```

Once this is completed, you can connect to PostgreSQL by attaching it to your DuckDB session:

```sql
ATTACH 'dbname=postgres user=postgres host=127.0.0.1' AS pg_db (TYPE POSTGRES, READ_ONLY);
```

More detailed information can be found in the [DuckDB documentation](https://duckdb.org/docs/extensions/postgres.html#connecting).

For larger initial loads, tune the DuckDB client explicitly instead of relying on defaults:

```sql
SET threads = 8;
SET memory_limit = '8GB';
SET pg_connection_limit = 8;
SET pg_pages_per_task = 250;
```

`pg_connection_limit` controls how many PostgreSQL connections DuckDB may open for the scan, while `pg_pages_per_task` controls how much table work is grouped into each scan task.
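
Before tuning, you can inspect what the client is currently using with DuckDB's `current_setting` function (the extension settings are only available after `LOAD postgres`):

```sql
-- check the current client-side settings before changing them
SELECT current_setting('threads') AS threads,
       current_setting('memory_limit') AS memory_limit,
       current_setting('pg_connection_limit') AS pg_connections,
       current_setting('pg_pages_per_task') AS pages_per_task;
```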

## Connecting to MotherDuck and inserting the table

Once you are connected to your PostgreSQL database, you need to connect to MotherDuck. To learn more, see [Connecting to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/connecting-to-motherduck).

```sql
ATTACH 'md:';
USE my_db;
```

Once you have authenticated, you can use `CREATE TABLE AS SELECT` to replicate data from PostgreSQL into MotherDuck.

```sql
CREATE OR REPLACE TABLE main.postgres_table AS
SELECT * FROM pg_db.public.some_table;
```

Congratulations! You have now replicated data from PostgreSQL into MotherDuck.
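
As a quick sanity check after the copy, you can compare row counts between the source and the replicated table (table names as used above):

```sql
-- both counts should match after a complete copy
SELECT
    (SELECT count(*) FROM pg_db.public.some_table) AS source_rows,
    (SELECT count(*) FROM main.postgres_table)     AS replicated_rows;
```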

## Choosing the right PostgreSQL workflow

### Use DuckDB's PostgreSQL extension for client-side movement

Use DuckDB's PostgreSQL extension when you want to copy a PostgreSQL table into MotherDuck for analytics, backfill a MotherDuck table from PostgreSQL, or export a DuckDB or MotherDuck result set back into PostgreSQL from a controlled DuckDB client.

Keep the client close to both systems, use `READ_ONLY` for PostgreSQL sources, and chunk large writes when the destination is PostgreSQL so you do not overload an OLTP database.
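
As a sketch of chunking a write whose destination is PostgreSQL, attach the target without `READ_ONLY` and insert one range at a time. The table names and `id` ranges here are hypothetical:

```sql
-- attach a writable PostgreSQL target (no READ_ONLY flag)
ATTACH 'dbname=postgres user=postgres host=127.0.0.1' AS pg_target (TYPE POSTGRES);

-- write one id range at a time instead of one large INSERT,
-- so the OLTP database absorbs the load in smaller transactions
INSERT INTO pg_target.public.exported_results
SELECT * FROM my_db.main.results
WHERE id >= 0 AND id < 1000000;

INSERT INTO pg_target.public.exported_results
SELECT * FROM my_db.main.results
WHERE id >= 1000000 AND id < 2000000;
```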

### Use the Postgres endpoint for PostgreSQL-compatible clients

Use the [Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint) when an application, BI tool, or serverless runtime needs to connect to MotherDuck through the PostgreSQL wire protocol. It is the preferred path for PostgreSQL-compatible clients because it does not require installing or operating a PostgreSQL extension.

### Use pg_duckdb when the query must run inside PostgreSQL

Use `pg_duckdb` only when you specifically need PostgreSQL itself to host the integration. This is useful when queries must run inside an existing PostgreSQL database, when PostgreSQL-local tables need to be joined with DuckDB or MotherDuck data from that PostgreSQL environment, or when a tool must connect to a PostgreSQL server that you control.

For ongoing production replication from PostgreSQL into MotherDuck, prefer an ingestion or CDC partner. Those tools handle scheduling, retries, incremental state, schema changes, and operational monitoring better than a one-off SQL script.

## Best practices

Here are a few tips to keep large PostgreSQL replication jobs predictable.

### Run DuckDB close to both systems

The DuckDB client is the data mover in this workflow. Run it on a machine with a good network path to both PostgreSQL and MotherDuck, and avoid running large backfills on the same host as a production PostgreSQL instance when possible.

### Tune scan parallelism explicitly

Start with `threads` set to the available CPU count on the client and `memory_limit` set below total system memory. For larger tables, start with `pg_connection_limit` in the `4-8` range and `pg_pages_per_task` in the `250-1000` range, then tune after observing the source database.

:::warning[Watch Out]
Increasing `pg_connection_limit` can increase pressure on the source PostgreSQL instance. If PostgreSQL memory or connection pressure climbs, reduce `pg_connection_limit` before reducing DuckDB `threads`.
:::

### Keep PostgreSQL sources read-only

Use `READ_ONLY` when attaching PostgreSQL for an initial replication job. For long-lived scripts, use PostgreSQL environment variables, the PostgreSQL password file, or DuckDB secrets instead of embedding credentials directly in the connection string.
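
With DuckDB secrets, the credentials live in a `CREATE SECRET` statement instead of the `ATTACH` string. A minimal sketch, where the host, user, and password values are placeholders:

```sql
-- store the PostgreSQL credentials in a named DuckDB secret
CREATE SECRET pg_secret (
    TYPE POSTGRES,
    HOST '127.0.0.1',
    PORT 5432,
    DATABASE 'postgres',
    USER 'postgres',
    PASSWORD 'replace-me'
);

-- attach using the secret rather than an inline connection string
ATTACH '' AS pg_db (TYPE POSTGRES, SECRET pg_secret, READ_ONLY);
```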

### Reduce each statement's working set

The DuckDB side of this workflow is usually streaming, so out-of-memory risk is often driven by the source PostgreSQL instance and total host headroom rather than DuckDB buffering the full table. Project only the columns you need when source rows are wide, and replicate very large tables in smaller primary key or time ranges.
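
For wide source tables, projecting columns at replication time can look like the following sketch; the column names and date filter are illustrative:

```sql
-- copy only the columns the analytics workload needs,
-- restricted to a recent time range
CREATE OR REPLACE TABLE main.postgres_table AS
SELECT id, customer_id, amount, updated_at
FROM pg_db.public.some_table
WHERE updated_at >= TIMESTAMP '2026-01-01';
```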

### Load in chunks

For a very large initial backfill, create the target table once and then insert one range at a time.

```sql
INSTALL postgres;
LOAD postgres;

SET threads = 4;
SET memory_limit = '4GB';
SET pg_connection_limit = 4;
SET pg_pages_per_task = 250;

ATTACH 'dbname=postgres user=postgres host=127.0.0.1' AS pg_db (TYPE POSTGRES, READ_ONLY);
ATTACH 'md:';
USE my_db;

CREATE TABLE IF NOT EXISTS main.postgres_table AS
SELECT *
FROM pg_db.public.some_table
WHERE 1 = 0;

INSERT INTO main.postgres_table
SELECT *
FROM pg_db.public.some_table
WHERE updated_at >= TIMESTAMP '2026-01-01'
  AND updated_at < TIMESTAMP '2026-02-01';
```

Repeat the `INSERT` statement for each chunk until the backfill is complete.

## Handling more complex workflows

Production use cases tend to be much more complex, with requirements such as incremental builds and state management. For those scenarios, please take a look at our [ingestion partners](https://motherduck.com/ecosystem/?category=Ingestion), which include many options, some with native Python support. An overview of the MotherDuck ecosystem is shown below.

![Diagram](../../../img/md-diagram.svg)


