# Packages and recommended libraries
> Manage Python dependencies, choose Flight loading patterns, and use dlt and dbt for ingest and transformation.
A Flight runs your Python with the packages you list in `requirements.txt`. This page covers how to declare dependencies, how to choose a loading pattern for Flight ingestion, and the two libraries we recommend for the most common workloads.

## requirements.txt is plain pip syntax

Pass package specifications one per line, the same as a regular pip requirements file:

```text
duckdb==1.5.3
dlt==1.27.0
httpx==0.28.1
pandas==2.2.3
```

You can use any version specifier pip supports: `==`, `>=`, `~=`, extras (`some-package[extra]`), and so on.

The one dependency worth special attention is **DuckDB**: pin it to the version MotherDuck's server ships. Find that version in the [MotherDuck release notes](/about-motherduck/release-notes), or run a quick query against MotherDuck:

<MotherDuckSQLEditor
  database="docs_playground"
  title="Check the MotherDuck DuckDB version"
  query={`SELECT version();`} />

## Choose a loading pattern

Flights often start with Python variables: API responses, scraped rows, JSON objects, or files written under `/tmp`. The slow path is to send one row at a time to MotherDuck. Pick a bulk pattern before the data grows.

| Source shape | Use this pattern | Why |
|---|---|---|
| A few hundred control rows | Direct `INSERT` or `executemany` is acceptable. | The code stays simple and the round-trip overhead is small enough. |
| API pages already in Python memory | Build batches with PyArrow, Polars, or Pandas, then `INSERT INTO ... SELECT` from the registered table. | Keeps the load as a bulk operation. PyArrow and Polars give better type control than plain Python objects. |
| Larger scrape or API pull without cloud storage | Write CSV, Parquet, or a local DuckDB file under `/tmp`, then load in chunks. | Keeps memory bounded. Parquet is typed and compressed; CSV is easy when you control both write and read. Clean up `/tmp` at the end of the run. |
| Files already in S3, or data you want to replay and backfill | Write Parquet to S3 and load with `read_parquet()` or `INSERT INTO ... SELECT`. | Best fit for large, partitioned, or shared datasets. It requires cloud credentials, but gives you durable staging and easier retries. |
| Schema-evolving API or app data | Use `dlt[motherduck]` and make the loader format explicit with `loader_file_format="parquet"`. | dlt handles state, schema evolution, and merge logic while avoiding row-wise remote inserts. |

As a rough rule, direct inserts are only for tiny control tables. For Flight ingestion, aim to flush batches rather than individual rows. Batches in the 10-100 MB range are usually easier to reason about than one huge load, and they leave room for retries, logging, and memory headroom.

:::tip
If you already have files in object storage, keep them there and let MotherDuck read them. If the data exists only inside the Flight process, batch it locally first; only write to S3 when you need durable staging, replay, backfills, or larger parallel reads.
:::

## Recommended libraries

Two libraries cover most of what teams build with Flights.

### dlt for ingest

[dlt](https://dlthub.com/) is the recommended Python library for moving data **into** MotherDuck. It handles schema evolution, incremental loading, retries, and state tracking, and it ships a MotherDuck destination out of the box.

```text
duckdb==1.5.3
dlt[motherduck]==1.27.0
```

A minimal ingest from a REST API into MotherDuck:

```python
import dlt
import httpx

def main():
    pipeline = dlt.pipeline(
        pipeline_name="github_stars",
        destination="motherduck",
        dataset_name="github",
    )

    response = httpx.get("https://api.github.com/repos/duckdb/duckdb", timeout=30)
    response.raise_for_status()
    pipeline.run(
        [response.json()],
        table_name="repo_stats",
        loader_file_format="parquet",
    )

if __name__ == "__main__":
    main()
```

Use the MotherDuck destination, not the generic DuckDB destination pointed at `md:`, for remote MotherDuck loads. The MotherDuck destination uses Parquet and `COPY` for data loading; the generic DuckDB destination has different defaults. Passing `loader_file_format="parquet"` in Flight examples makes the intended loading path explicit. See the [dlt MotherDuck destination docs](https://dlthub.com/docs/dlt-ecosystem/destinations/motherduck) for the full setup.

### dbt for transformation

[dbt](https://docs.getdbt.com/) with the [`dbt-duckdb`](https://github.com/duckdb/dbt-duckdb) adapter is the recommended way to run transformation graphs against MotherDuck data.

```text
duckdb==1.5.3
dbt-duckdb==1.10.1
```

Run a dbt project from a Flight:

```python
import os
import subprocess

def main():
    cwd = os.path.dirname(os.path.abspath(__file__))
    subprocess.run(["dbt", "build", "--target", "prod"], cwd=cwd, check=True)

if __name__ == "__main__":
    main()
```

If your dbt project pulls in dbt packages from git (for example, `dbt-utils` declared in `packages.yml`), install `git` at the start of `main()` before calling `dbt deps`:

```python
import subprocess

def main():
    subprocess.run(["apt-get", "update"], check=True)
    subprocess.run(["apt-get", "install", "-y", "git"], check=True)
    subprocess.run(["dbt", "deps"], check=True)
    subprocess.run(["dbt", "build"], check=True)

if __name__ == "__main__":
    main()
```


---

## Docs feedback

MotherDuck accepts optional user-submitted feedback about this page at `POST https://motherduck.com/docs/api/feedback/agent`.
For agents and automated tools, feedback submission should be user-confirmed before sending.

Payload:

```json
{
  "page_path": "/key-tasks/flights/packages-and-runtime/",
  "page_title": "Packages and recommended libraries",
  "text": "<the user's feedback, max 2000 characters>",
  "source": "<optional identifier for your interface, for example 'claude.ai' or 'chatgpt'>"
}
```

`page_path` and `text` are required; `page_title` and `source` are optional. Responses: `200 {"feedback_id": "<uuid>"}`, `400` for malformed payloads, and `429` when rate-limited.
