Loading data via the Postgres endpoint
MotherDuck's Postgres endpoint is a good thin-client loading path when your application, BI tool, or serverless runtime already speaks PostgreSQL and you want to run SQL in MotherDuck without installing a DuckDB client.
It is best suited to server-side loading from remote data sources.
If your files already live in object storage or are available over HTTPS, use the Postgres endpoint to run `CREATE TABLE AS SELECT` or `INSERT INTO ... SELECT` and let MotherDuck read the files remotely.
If your data is on your laptop, application server disk, or in a local DuckDB file, a DuckDB client path is usually a better fit. In that case, either:
- Upload the files to object storage first, then load them remotely through the Postgres endpoint.
- Use a DuckDB client path instead, such as the `duckdb` CLI, the DuckDB Python client, or another DuckDB client connected to `md:`.
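As a minimal sketch of the DuckDB client path, the following loads a local Parquet file from a local DuckDB session; the file path and table name are placeholders, and it assumes a motherduck_token is available in your environment:

-- Run in a local DuckDB client, not over the Postgres endpoint.
ATTACH 'md:';  -- connect to MotherDuck; assumes motherduck_token is set in the environment

-- Placeholder local path and target table.
CREATE OR REPLACE TABLE my_db.main.orders_local AS
SELECT * FROM read_parquet('/path/to/local/orders.parquet');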
Recommended patterns
Load directly from cloud storage or HTTPS
This is the preferred pattern for the Postgres endpoint.
The examples below use public sample files so you can run them directly.
CREATE OR REPLACE TABLE my_db.main.orders_raw AS
SELECT *
FROM read_parquet(
'https://shell.duckdb.org/data/tpch/0_01/parquet/orders.parquet',
MD_RUN = REMOTE
);
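A quick sanity check over the same Postgres connection confirms the load landed as expected:

SELECT
    count(*)        AS row_count,
    min(o_orderdate) AS first_order,
    max(o_orderdate) AS last_order
FROM my_db.main.orders_raw;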
You can use the same approach with CSV or JSON:
CREATE OR REPLACE TABLE my_db.main.weather_events AS
SELECT *
FROM read_csv(
'https://raw.githubusercontent.com/duckdb/duckdb-web/main/data/weather.csv',
HEADER = true,
AUTO_DETECT = true,
MD_RUN = REMOTE
);
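A JSON load follows the same shape. The URL below is a placeholder; substitute a JSON source you can reach:

CREATE OR REPLACE TABLE my_db.main.events_raw AS
SELECT *
FROM read_json(
    'https://example.com/events.json',  -- placeholder URL
    AUTO_DETECT = true,
    MD_RUN = REMOTE
);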
This keeps the work inside MotherDuck and avoids sending rows one statement at a time over the Postgres wire.
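The same pattern works for incremental appends. As a sketch, reusing the orders_raw table from above and treating o_orderkey as a high-water mark:

INSERT INTO my_db.main.orders_raw
SELECT *
FROM read_parquet(
    'https://shell.duckdb.org/data/tpch/0_01/parquet/orders.parquet',
    MD_RUN = REMOTE
)
WHERE o_orderkey > (SELECT coalesce(max(o_orderkey), 0) FROM my_db.main.orders_raw);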
Load into a staging table, then transform
For repeatable pipelines, stage the raw data first and then publish into the final table.
CREATE SCHEMA IF NOT EXISTS my_db.ingest;
CREATE OR REPLACE TABLE my_db.ingest.orders_stage AS
SELECT *
FROM read_parquet(
'https://shell.duckdb.org/data/tpch/0_01/parquet/orders.parquet',
MD_RUN = REMOTE
);
CREATE OR REPLACE TABLE my_db.main.orders_curated AS
SELECT
o_orderkey AS order_id,
o_custkey AS customer_id,
o_orderdate::TIMESTAMP AS order_ts,
o_totalprice::DOUBLE AS total_amount
FROM my_db.ingest.orders_stage;
This keeps ingestion and transformation separate, which makes validation, retries, and backfills easier.
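For example, a lightweight validation step between staging and publishing might compare row counts and check key columns; this is a sketch to extend with whatever checks your pipeline needs:

SELECT
    (SELECT count(*) FROM my_db.ingest.orders_stage)  AS staged_rows,
    (SELECT count(*) FROM my_db.main.orders_curated)  AS curated_rows,
    (SELECT count(*) FROM my_db.main.orders_curated
     WHERE order_id IS NULL)                          AS null_keys;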
Batch rows if the data exists only in application memory
If your source data exists only in application memory, use multi-row INSERT statements instead of row-by-row inserts.
Recommended:
CREATE OR REPLACE TABLE my_db.main.orders_batch (
id INTEGER,
note VARCHAR,
amount DOUBLE
);
INSERT INTO my_db.main.orders_batch VALUES
(1, 'a', 10.0),
(2, 'b', 20.0),
(3, 'c', 30.0);
Less efficient:
INSERT INTO my_db.main.orders_batch VALUES (1, 'a', 10.0);
INSERT INTO my_db.main.orders_batch VALUES (2, 'b', 20.0);
INSERT INTO my_db.main.orders_batch VALUES (3, 'c', 30.0);
Single-row inserts create unnecessary round trips and are much slower for loading.
When loading rows from an application, prefer:
- fewer, larger batches
- append-only staging tables
- transactions that stay comfortably below a minute, as sketched below
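A minimal sketch of that shape, continuing the orders_batch example above (the batch sizes here are illustrative; real loads would use far larger batches per statement):

BEGIN TRANSACTION;
INSERT INTO my_db.main.orders_batch VALUES
    (4, 'd', 40.0),
    (5, 'e', 50.0);
INSERT INTO my_db.main.orders_batch VALUES
    (6, 'f', 60.0),
    (7, 'g', 70.0);
COMMIT;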
Use a DuckDB client path instead when
The Postgres endpoint is not currently intended for workflows that depend on local DuckDB-client capabilities. Use a DuckDB client path instead when you need:
- local-file `COPY`
- `EXPORT DATABASE` and `IMPORT DATABASE`
- `ATTACH ':memory:'` and `ATTACH '/path/to/file.duckdb'`
- `CREATE DATABASE ... FROM '/path/to/file.duckdb'`
- `MD_RUN = LOCAL`
- `INSTALL` and `LOAD`
In practice, that means the Postgres endpoint is not the primary interface for:
- loading directly from local files
- attaching local or in-memory DuckDB databases
- extension-based workflows
- local execution paths such as `MD_RUN = LOCAL`
Protected cloud storage
If you are loading from protected S3, GCS, R2, or Azure storage, make sure the required MotherDuck secret already exists.
Cloud-storage secret creation requires DuckDB extension support and is not currently supported through the Postgres endpoint. The recommended workflow is:
- Create the secret using a DuckDB client path or another supported MotherDuck workflow.
- Then use the Postgres endpoint to run the load query.
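As a sketch of that two-step workflow, assuming an S3 source with placeholder credentials and bucket names, and assuming DuckDB's standard CREATE SECRET syntax for the secret-creation step:

-- Step 1: from a DuckDB client connected to md: (not the Postgres endpoint).
CREATE SECRET my_s3_secret (
    TYPE S3,
    KEY_ID 'AKIA...',   -- placeholder credentials
    SECRET '...',
    REGION 'us-east-1'
);

-- Step 2: over the Postgres endpoint, run the load.
CREATE OR REPLACE TABLE my_db.main.private_orders AS
SELECT *
FROM read_parquet('s3://my-private-bucket/orders/*.parquet', MD_RUN = REMOTE);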
Decision guide
| Situation | Best approach |
|---|---|
| Files already in S3, GCS, R2, Azure, or public HTTPS | Use `read_parquet`, `read_csv`, or `read_json` with `MD_RUN = REMOTE` over the Postgres endpoint |
| Data is local on your machine | Prefer a DuckDB client path, or upload the files to object storage first |
| Data exists only in app memory and volume is modest | Use explicit large multi-row INSERT batches over the Postgres endpoint |
| Very large local bulk load | Use a DuckDB client path instead |
Summary
For the best mix of throughput and simplicity:
- Write source files as Parquet when you can.
- Put them in object storage close to your MotherDuck region.
- Use the Postgres endpoint to run `CREATE TABLE AS SELECT` or `INSERT INTO ... SELECT` with `MD_RUN = REMOTE`.
- Stage first, validate row counts and schemas, then publish into the final table.