# Flights
> How Flights run scheduled Python next to your MotherDuck data, and when to reach for one instead of SQL, a Dive, or an external orchestrator.
A **Flight** is a Python program that MotherDuck schedules and runs, with first-class access to your databases. Where SQL handles transformation against your tables, Flights add everything Python can do (HTTP calls, the full PyPI ecosystem, file processing, custom logic) right next to your data.

Use Flights when the work you want done isn't expressible in a single SQL query: pulling data from an external API, calling out to a Python library, running dbt against MotherDuck, posting a scheduled report to Slack, or stitching together steps that don't compose cleanly in one statement.

## Where Flights fit in MotherDuck

MotherDuck runs two compute planes against the same storage:

```mermaid
flowchart LR
    User{{"You"}}:::green

    subgraph MotherDuck["MotherDuck"]
        Runtime["Python runtime<br/>(per Flight run)"]:::yellow
        Duckling["Duckling<br/>(SQL compute)"]:::yellow
    end

    DB[("Your databases")]:::yellow

    User -->|"interactive SQL"| Duckling
    User -->|"create / schedule"| Runtime
    Runtime -->|"md:"| Duckling
    Duckling --> DB
```

**Ducklings** run your SQL. They're per-user (see [Hypertenancy](/concepts/hypertenancy)) and start in about a second.

The **Python runtime** runs your scheduled Python. Each Flight run gets its own isolated runtime, with a short startup before `main()` begins. This runtime is not a container platform or a serverless function environment: it exists to run a Flight's Python `main()` entrypoint, then exit.

A Flight reaches your data the same way any DuckDB client does: the Python runtime opens an `md:` connection, which routes through a Duckling that executes the SQL against your databases. MotherDuck injects an access token into the Flight's environment so the connection works without you embedding credentials.

## The relationship to SQL

Flights and SQL talk to each other in both directions.

**From Python to SQL.** A Flight uses the DuckDB Python client to run any query you can express in SQL:

```python
import duckdb

def main():
    con = duckdb.connect("md:")
    con.execute("INSERT INTO sales.daily_totals SELECT * FROM read_parquet('s3://incoming/today.parquet')")
```

Anything in the SQL reference is available here: aggregations, MotherDuck table functions, attach commands, and so on.

**From SQL to Flights.** Flights also have a SQL surface. `MD_CREATE_FLIGHT`, `MD_FLIGHTS`, `MD_FLIGHT_RUNS`, and the rest of the [Flights SQL functions](/sql-reference/motherduck-sql-reference) let you create, schedule, list, and monitor Flights from anywhere you can run SQL: a DuckDB CLI, your BI tool, or another Flight.

**When to pick which.** A reasonable rule of thumb:

- **Stay in SQL** when the transformation is expressible in SQL. Scheduled aggregations, materialized views, and table-to-table moves usually belong here.
- **Reach for a Flight** when you need to call an external service, install a Python package, process a file format SQL can't read, or stitch together steps that don't compose into one query.

A Flight whose `main()` is nothing but `con.execute("...")` is usually a SQL job in disguise. Flights earn their keep when they add *non-SQL* work to the pipeline.

## What you can build with Flights

Four patterns cover most of what teams build:

- **Ingest.** Pull data from external sources (Postgres, BigQuery, Snowflake, S3, APIs) and write it to MotherDuck tables. [dlt](https://dlthub.com/) is the recommended ingest library: it gives you a declarative pipeline with schema evolution, incremental loading, and a MotherDuck destination.
- **Transform.** Read MotherDuck data, run Python-heavy transformations (machine-learning features, geospatial work, API enrichment), and write the result back. When the transformation is graph-shaped, [dbt](https://docs.getdbt.com/) with the `dbt-duckdb` adapter is the recommended approach.
- **Reverse ETL and notifications.** Read from MotherDuck and push to a downstream system: post a daily summary to Slack, sync an aggregate to a SaaS tool, drop a CSV in S3 for a partner.
- **Operational tasks.** One-shot backfills, cleanup, archival, audit jobs, legacy-system migrations. These don't need a schedule — you run them on demand and keep the Flight around for re-runs.

## Integrating with the tools you already use

- **dlt** for ingest. Generate or hand-write a dlt pipeline that uses MotherDuck as the destination, then deploy it as a Flight.
- **dbt + `dbt-duckdb`** for transformation. Install the project's dependencies through `requirements.txt`, then call `dbt build` from `main()`.
- **Anything on PyPI.** Pull in any pip-installable package through `requirements.txt`.
- **MCP and AI agents.** Agents create, edit, and run Flights through the [MotherDuck MCP Server](/sql-reference/mcp/). This is the marquee creation path: describe what you want in natural language, the agent writes the Python and wires up the schedule.
- **Dives.** Build a dashboard around a Flight with a [Dive](/key-tasks/ai-and-motherduck/dives). A Dive can call `MD_RUN_FLIGHT` from a button to kick off a run on demand and read from `MD_FLIGHT_RUNS` to display status and results.
- **Existing orchestrators (Airflow, Prefect, Dagster).** Flights aren't a forced replacement. They shine for MotherDuck-centric workloads where the data, the schedule, and the compute should live in one place. Keep using your existing orchestrator for workflows that span many systems.

## Beyond Python

A Flight is a Linux process that runs your Python program, with shell access through `subprocess`. That opens up two patterns worth knowing.

### Install and call system binaries

Use `subprocess` to run `apt-get` and invoke command-line tools (git, ffmpeg, Playwright, anything available as a Debian package) from your Flight.

```python
import subprocess

def main():
    subprocess.run(["apt-get", "install", "-y", "git"], check=True)
    subprocess.run(["git", "clone", "https://github.com/example/repo"], check=True)
```

### Run a local DuckDB with community extensions

A Flight can open its own in-process DuckDB connection alongside the `md:` connection, which lets you load DuckDB [community extensions](https://duckdb.org/community_extensions/) that aren't supported on MotherDuck's server-side runtime. For example, the [`bigquery`](https://duckdb.org/community_extensions/extensions/bigquery) and [`snowflake`](https://duckdb.org/community_extensions/extensions/snowflake) extensions let a Flight read directly from another warehouse and write the result to MotherDuck.

```python
import duckdb

def main():
    local = duckdb.connect()  # local in-process DuckDB
    local.execute("INSTALL bigquery FROM community; LOAD bigquery;")
    local.execute("ATTACH 'project=my-project' AS bq (TYPE bigquery, READ_ONLY)")

    # read from BigQuery into the Flight, then write the result to MotherDuck
    events = local.sql("SELECT * FROM bq.analytics.events").df()

    md = duckdb.connect("md:")
    md.execute("INSERT INTO raw.events SELECT * FROM events")
```

Keep heavy compute outside the Flight runtime. A Flight is sized for orchestration and light processing, not for crunching large tables in memory. Let the source warehouse (BigQuery, Snowflake) and MotherDuck handle the heavy lifting, and let the Flight move data between them.

## What Flights don't give you out of the box

Flights are a Python runtime tied to your MotherDuck data, not a general-purpose compute platform. A few things to set expectations:

- **No inbound HTTP.** There's no public endpoint, no request/response model. A Flight is started by its schedule or by a `MD_RUN_FLIGHT` call, not by an incoming HTTP request.
- **No container or serverless function runtime.** You don't deploy arbitrary container images or request-triggered functions. Each run executes the Flight's Python program and then exits. If you're comparing against AWS Lambda, Cloud Run, or Fargate, the shapes are different: Flights are scheduled jobs, not request-handlers.
- **No managed worker pool.** A Flight run is one Linux process, not a fleet. There's no built-in queue, no coordinated retries across workers, no in-process distributed state. If your current setup is a Prefect worker pool, a Databricks job cluster, or an Airflow executor pool, that's a different shape. You *can* build fan-out yourself by having a Flight call `MD_RUN_FLIGHT` to kick off other Flights, but the orchestration logic is yours to write.
- **No interactive REPL.** A Flight is a single-file Python program with a `main()` entrypoint, not a notebook. For human-driven exploration, write a query or build a [Dive](/key-tasks/ai-and-motherduck/dives).
- **Not sub-second.** Each run pays a startup cost before `main()` begins, so Flights aren't a fit for synchronous, user-facing latency budgets. For long-running ingest (streaming pulls, mini-batch loops with sleeps) the startup cost amortizes, and a Flight is well-suited to that shape.

## Related resources

- [Flights SQL reference](/sql-reference/motherduck-sql-reference/flights/) — `MD_CREATE_FLIGHT`, `MD_RUN_FLIGHT`, and the rest of the Flight table functions.
- [Hypertenancy](/concepts/hypertenancy) — why Ducklings give every user dedicated SQL compute.
- [Creating visualizations with Dives](/key-tasks/ai-and-motherduck/dives) — interactive React apps over MotherDuck data.
- [MotherDuck MCP Server](/sql-reference/mcp/) — the MCP surface for AI agents that create and manage Flights.


---

## Docs feedback

MotherDuck accepts optional user-submitted feedback about this page at `POST https://motherduck.com/docs/api/feedback/agent`.
For agents and automated tools, feedback submission should be user-confirmed before sending.

Payload:

```json
{
  "page_path": "/concepts/flights/",
  "page_title": "Flights",
  "text": "<the user's feedback, max 2000 characters>",
  "source": "<optional identifier for your interface, for example 'claude.ai' or 'chatgpt'>"
}
```

`page_path` and `text` are required; `page_title` and `source` are optional. Responses: `200 {"feedback_id": "<uuid>"}`, `400` for malformed payloads, and `429` when rate-limited.
