From Source to Dashboard: Real Pipelines with Flights

2026/06/23

TL;DR: Jacob Matson opens Claude Code, gives an agent one prompt, and it builds a working data pipeline in MotherDuck Flights, runs it until the data lands clean, then turns that data into a live Dive dashboard he can keep using.

What you'll see

Jacob points the agent at two public datasets: New York City 311 complaints and daily NYC weather from Open-Meteo. The goal is to load both into MotherDuck and figure out which complaint types track with the weather. The agent writes two Flights pipelines, runs them, reads the logs, catches its own mistake when it writes a table to the wrong database, fixes it, and builds a Dive that joins the two sources.
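To make "track with the weather" concrete, here is a minimal sketch of the kind of calculation the Dive relies on: join daily complaint counts to daily temperature on the date, then compute a Pearson correlation per complaint type. The function names and the sample rows are made up for illustration; the session does not show the agent's actual query.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant series; report 0
    return cov / sqrt(var_x * var_y)


def correlation_by_complaint(complaints, weather):
    """Join complaint counts to temperature on date, then correlate per type.

    complaints: iterable of (date, complaint_type, count)
    weather:    dict mapping date -> average temperature
    """
    series = defaultdict(lambda: ([], []))
    for day, complaint_type, count in complaints:
        if day in weather:  # inner join on date
            temps, counts = series[complaint_type]
            temps.append(weather[day])
            counts.append(count)
    return {
        kind: pearson(temps, counts)
        for kind, (temps, counts) in series.items()
        if len(temps) >= 2
    }


# Made-up sample rows, only to show the shape of the inputs.
weather = {"2025-01-01": 0.0, "2025-01-02": 5.0, "2025-01-03": 10.0}
complaints = [
    ("2025-01-01", "HEAT/HOT WATER", 30),
    ("2025-01-02", "HEAT/HOT WATER", 20),
    ("2025-01-03", "HEAT/HOT WATER", 10),
    ("2025-01-01", "Vendor Enforcement", 1),
    ("2025-01-02", "Vendor Enforcement", 2),
    ("2025-01-03", "Vendor Enforcement", 3),
]
result = correlation_by_complaint(complaints, weather)
```

A negative coefficient means the complaint type rises as temperature falls, and a positive one means it rises with the temperature. With a full year of real data the values would sit well inside the range of -1 to 1.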

How Flights works

A Flight is Python that runs on the MotherDuck platform. Each one has four parts: the code, the logs, the config, and the requirements. It runs on a full Linux runtime, not just a Python sandbox, so you can clone a repo or shell out to bash if you need to. You can run a Flight manually, on a cron schedule, or from SQL with MD RUN. Secrets live in the MotherDuck secret manager, not in your config.
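As a rough idea of what the code part of a weather Flight might contain, the sketch below builds a request URL for Open-Meteo's historical archive and flattens the response into one row per day. The endpoint, the daily variable names, and the coordinates are assumptions based on Open-Meteo's public API, not taken from the session. The step that loads rows into MotherDuck is left out, because this sketch does not assume any Flights-specific interface.

```python
from urllib.parse import urlencode

# Assumed endpoint and variable names for Open-Meteo's historical archive API.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"
DAILY_VARS = [
    "temperature_2m_mean",
    "temperature_2m_max",
    "temperature_2m_min",
    "precipitation_sum",
]


def build_weather_url(start_date, end_date, latitude=40.7128, longitude=-74.0060):
    """Build the request URL for daily weather (defaults: approximate NYC coordinates)."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start_date,
        "end_date": end_date,
        "daily": ",".join(DAILY_VARS),
        "timezone": "America/New_York",
    }
    return f"{ARCHIVE_URL}?{urlencode(params)}"


def flatten_daily(payload):
    """Turn the column-oriented 'daily' block into one dict per day."""
    daily = payload["daily"]
    columns = list(daily)
    return [
        dict(zip(columns, values))
        for values in zip(*(daily[column] for column in columns))
    ]


url = build_weather_url("2025-01-01", "2025-12-31")

# Hand-written payload in the assumed response shape, not real weather data.
sample = {
    "daily": {
        "time": ["2025-01-01", "2025-01-02"],
        "temperature_2m_mean": [1.5, 3.0],
        "precipitation_sum": [0.0, 2.4],
    }
}
rows = flatten_daily(sample)
```

Keeping URL construction and flattening as pure functions makes them easy to check locally before the pipeline ever runs on the platform.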

Why an agent can pull this off

The MotherDuck MCP server ships a Flights guide as a tool. The agent reads what a Flight is and how to edit one before it writes any code. That context is what lets a single prompt produce a pipeline that actually runs end to end. You can drive the same workflow from Claude Desktop, Claude web, or any MCP-capable AI tool.
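For readers who have not looked at MCP on the wire, a tool call is a JSON-RPC 2.0 request with the method tools/call. The sketch below builds the two requests an agent might send first: read the guide, then list existing Flights. The tool names are written the way they are spoken in the session; the exact identifiers and argument names on the real server are assumptions.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, increasing from 1


def tools_call(name, arguments=None):
    """Build an MCP 'tools/call' JSON-RPC 2.0 request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Assumed tool names; check the server's tool listing for the real identifiers.
plan = [
    tools_call("GetFlightGuide"),
    tools_call("ListFlights"),
]
wire = [json.dumps(message) for message in plan]
```

In practice the MCP client inside Claude Code or another agent sends these messages for you. The point is only that the guide arrives as an ordinary tool result, which the agent reads before it writes any pipeline code.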

Who this is for

You don't need a data engineering team. The session closes with a marketing use case: pulling webinar registration data, joining it with CRM records, and building an internal dashboard, all without filing a ticket. If you're new to the platform, start with the getting started guide.

0:00 Hello, everyone. Welcome to our webinar today, From Source to Dashboard: Real Pipelines with Flights. This is part two of our webinars on Flights, a feature we launched a little under two weeks ago, which allows you to run Python and get data into MotherDuck and do other things with it. We're recording this and we'll send out a link to anyone who registered, as well as put it on our website afterwards. If you have any questions, feel free to put them in the chat now or at the end. I'm Jerel, on the marketing team here at MotherDuck.

Hi everybody, I'm Jacob, on the devrel side here at MotherDuck. I'm super pumped to be here and to show off how all this works.

Before the demo, I'll do a quick introduction to MotherDuck. We had the big data era, with Hadoop and Spark, which allowed you to query massive amounts of data, but you needed to split a query across multiple workers and nodes and shuffle data across the network, lots of round trips. That leads to high costs, high latency, and complex infrastructure to manage. And now we have AI. Agents do not query like humans do. You can ask a question from your agent, and it can spin off and run dozens or hundreds of queries on its own.

So that brings us to DuckDB, an open source OLAP database that MotherDuck is built on top of, built by DuckDB Labs out of Amsterdam, the same place where Python was invented. DuckDB is lightweight and embeddable. It runs locally, on servers, and in the browser, with fast zero latency queries and powerful vertical scaling.

MotherDuck is building a cloud data warehouse on top of DuckDB. We have a serverless compute platform, so zero infrastructure to manage. We enable dual execution, splitting work between client and the cloud. We have a hypertenant architecture, so you can scale compute independently by user, service account, or even by agent.
And we have agent-native tools: our MCP server, which lets you query data with natural language, Dives, our BI tool, and now Flights, where you can use your favorite AI agent like Claude or Codex to build a pipeline to bring data into MotherDuck and kick off transformations or send alerts. Let me hand it over to Jacob.

Thank you, Jerel. To talk about Flights, first we need to talk about our MCP. MCP is basically just an API, but designed for your agent to work with. Our MCP in MotherDuck is very easy to add: type in MotherDuck, click add, and authorize through OAuth. There's a handful of tools: GetFlight, GetFlightGuide, GetFlightRunLogs, ListFlights, ListFlightVersions, and more.

I'm going to do this on Claude Code. We're doing all this live, so pray to the demo gods. Claude was down this morning. A Flight lets us build pipelines in Python and run them in MotherDuck. I'm giving it this prompt and telling it to use subagents to do this in parallel. There's a couple of good open datasets: New York City 311 complaints, and daily New York City weather from Open-Meteo. Let's run both, read the logs, make sure there's no errors, and then build a Dive that shows which complaint types correlate to weather.

A Flight has four parts: logs, the Python code, config (variables for the run, not secrets, which live in the secret manager), and requirements. We can run this now. It might fail because this isn't pinned, since there was a new DuckDB release. It failed, that's what I expected. So let's fix this: go to requirements, pin it, save, and run. For those who've done automation with GitHub Actions, this should feel familiar. It printed hello world.

This is actually a full Linux box behind the scenes. Here's an NBA dataset that runs every night on a cron. In main.py we clone some code and use the Python subprocess module to run bash commands, so this is not just a Python runtime. We can see Claude is doing stuff as we go, and we can see changes in versions.
We got our weather data, it took about nine seconds. Now we're getting the 311 data into flightswebinar.main.nyc_daily. My favorite function is summarize: complaint type, complaint count, and date. Back in Claude: the weather Flight succeeded but landed in the wrong database, so it's fixing itself. That's pretty funny. Now we build the Dive, which takes the two datasets and combines them. Weather daily has average temp, max, min, precipitation, and one year of data.

So far I've given one prompt and told it to use subagents. We build a MotherDuck Flights pipeline and a Dive end-to-end, get the NYC 311 complaints and daily weather, run until no errors, then build a Dive that joins them and answers which complaints associate to the weather. One thing that's really cool: inside our MCP there's a tool called GetFlightGuide, which is basically a skill. So the agent reads what a Flight is, the interface, and the MCP tools for editing them before writing code.

Dive saved successfully. We can see that different complaint types change with cold. Filter to heat/hot water and these complaints correlate nicely with colder weather. I built a nicer view: click on the top complaints and you can see plumbing has some correlation, heat/hot water a lot, and vendor enforcement goes up as temperature goes up, since more people are outside.

The main takeaway: we can give it one prompt and it'll build something for us. Flights let us do these explorations, and also give us the power to do this on a recurring basis. We're starting to run this for some production pipelines internally. On the marketing side we've got places like StreamYard with an API for data, and with Flights we can integrate that more readily and get it through quality gates.

So that's the quick demo. Let's take questions. Jacob, do you want to pull up our cookbook to show examples of how our DevRel team is thinking about using Flights?
This is our cookbook, with recipes: a prompt that tells you how to do something with a Flight. It's easy to copy a prompt and get the thing working.

First question: how does version control work for Flights? Is it done via GitHub? Today, version control is just files in S3. You can use GitHub as your main code repo and have your Flight be simple and just fetch from GitHub. You can see versions and changes over time. There's not a diff viewer yet, which is where you're leading. We're thinking about whether it should work more like a GitHub Action with explicit source control, or like a Lambda function where you ship a package of code. Flights is technically in preview, so our product engineering team is still adding features.

Question from Frank: do Flights only work with MotherDuck, not DuckDB? Flights are orchestration running on the MotherDuck platform. If you wanted to write to DuckDB files, you could figure that out, but MotherDuck is the best in class way to integrate. We're giving you a generic Python runtime.

Question from Mark: if a Flight kicks off multiple sub-flights, do they all run as threads in a single process? No. Each Flight is separate, distinct compute, like Airflow where each job can run on its own pod. Each opens its own connection to MotherDuck, and depending on setup they can connect to the same or different databases.

A follow up: it would be good not to pay the runtime overhead for a complex flow, a full DAG. The billing is per second with no cooldown, so you just pay for what's running while it's running. A dbt job, which is its own DAG, runs on a single Flight. The overhead should be fairly minimal outside of build time for packages. Happy to take more offline on community Slack.

One thing I'll call out: you can use Claude or Codex to write all the code, or write it raw within the MotherDuck UI. At the end of the day it's all just code, and it's all visible.
It's not a complete black box where the AI writes something and you never see what's going on.

I'll add that, like with Dives, Flights has SQL functions behind it. Every view, the logs, code, config, requirements, is retrievable with SQL. So for a deterministic application where you don't want MCP, you can query that with SQL. You can even run jobs from SQL with the MD RUN flights command: give it the flight ID and it runs the Flight. It's starting to feel like stored procedures, although it's a separate runtime, not in the database.

Bertrand says he tried both Flights and Dives and sees great potential for people with limited technical skills to iterate quickly. I'll speak to that as someone with limited technical skills on the marketing team. It's great where I don't have to bug our data engineering team to get insights into how our LinkedIn profile is performing, or to join data from different sources and create a dashboard. I own our webinars, and I want to pull all our registration data and combine it with data in our CRM to show how our webinars contribute to our business. That's all easy for me to do as a non-technical duck, so I can ask lots of questions and get answers without being bottlenecked.

Awesome. If that's all, we'll end here. If you haven't tried Dives or Flights, go try them, they're easy to get going. We had a hackathon that closed up yesterday for who can build the coolest Dive, and we have another livestream tomorrow morning where we'll announce the winners and talk about what makes a good data visualization in the day of AI. We'll send out a link to the recording afterwards, and you can join us on community Slack with any questions about Flights or MotherDuck. Thank you.

Thanks, everybody.

FAQs

A Flight is a Python pipeline that runs on MotherDuck. Each one has four parts: code, logs, config, and requirements. It runs on a full Linux environment, so you can shell out to bash or clone a repo if you need to. You can run it manually, on a cron schedule, or from SQL with the MD RUN command.
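Shelling out from Python usually goes through the standard subprocess module, which is also what the session mentions. A minimal sketch, assuming nothing about the Flights runtime beyond a POSIX shell being on the PATH; the helper name run_shell is made up:

```python
import shutil
import subprocess


def run_shell(command, timeout=60):
    """Run a shell command and return its stdout, raising on a non-zero exit."""
    # Prefer bash, fall back to sh if bash is not installed.
    shell = shutil.which("bash") or shutil.which("sh")
    if shell is None:
        raise RuntimeError("no POSIX shell found on PATH")
    completed = subprocess.run(
        [shell, "-c", command],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on failure
        timeout=timeout,
    )
    return completed.stdout.strip()


output = run_shell("echo hello world")
```

Using check=True means a failing command raises an exception, so the run fails visibly instead of continuing with missing data.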

Yes, an agent can build Flights for you. The MotherDuck MCP server exposes a Flights guide as a tool, so an agent like Claude can read how Flights work before writing code. In the demo, one prompt produced two working pipelines and a dashboard. The agent also caught a table it had written to the wrong database and fixed it on its own.

Today each Flight version is saved as files, and you can browse previous versions in the UI. There's no built-in diff viewer yet. A common pattern is to keep your actual code in GitHub and have the Flight fetch and run it, so you get full source control.
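The fetch-and-run pattern can be sketched as a thin wrapper that clones one pinned ref and executes a script from it. The repository URL, tag, script path, and working directory below are hypothetical, and the session does not show how the team's own Flights do this.

```python
import subprocess
import sys
from pathlib import Path


def fetch_and_run_commands(repo_url, ref, entrypoint, workdir="/tmp/flight_src"):
    """Return the argv lists to shallow-clone one ref, then run a script from it."""
    checkout = Path(workdir)
    return [
        ["git", "clone", "--depth", "1", "--branch", ref, repo_url, str(checkout)],
        [sys.executable, str(checkout / entrypoint)],
    ]


def fetch_and_run(repo_url, ref, entrypoint, workdir="/tmp/flight_src"):
    """Execute the commands in order, stopping at the first failure."""
    for argv in fetch_and_run_commands(repo_url, ref, entrypoint, workdir):
        subprocess.run(argv, check=True)


# Hypothetical repository, tag, and script path, for illustration only.
commands = fetch_and_run_commands(
    "https://github.com/example-org/pipelines.git", "v1.2.0", "jobs/weather.py"
)
```

Pinning a tag or branch in the wrapper keeps the Flight itself almost static, while reviews, diffs, and history stay in the Git host.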

Flights run on the MotherDuck platform and are designed to write to MotherDuck. Each Flight is a generic Python runtime, so you could technically write to DuckDB files, but MotherDuck is the best-integrated destination.

Flights bill per second of runtime with no cooldown, so you only pay while a Flight is actually running. Each sub-flight runs on its own separate compute, not as a thread in a shared process, and opens its own connection to MotherDuck.
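A small arithmetic sketch shows why "no cooldown" matters for short jobs. It compares billed seconds for three runs of about nine seconds each, the duration of the weather Flight in the demo, with and without a hypothetical 60-second cooldown. The rate is a made-up number, not a MotherDuck price.

```python
def billed_seconds(runtime_seconds, cooldown_seconds=0):
    """Seconds charged for one run: its runtime plus any cooldown billed after it."""
    return runtime_seconds + cooldown_seconds


RATE_PER_SECOND = 0.001  # hypothetical price, NOT a MotherDuck rate
runs = [9, 9, 9]         # three sub-flights of about nine seconds each

no_cooldown_total = sum(billed_seconds(s) for s in runs)
with_cooldown_total = sum(billed_seconds(s, cooldown_seconds=60) for s in runs)

no_cooldown_cost = no_cooldown_total * RATE_PER_SECOND
with_cooldown_cost = with_cooldown_total * RATE_PER_SECOND
```

Under these assumptions the three runs are billed for 27 seconds in total, against 207 seconds if each run carried a 60-second cooldown, so splitting work into several short sub-flights adds little billing overhead.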

Related Videos

54:22

2026-06-18

DuckDB's agent moment (Jordan Tigani)

Jordan Tigani helped build BigQuery, then left to bet that most data isn't actually big. Three years later, the agent era is making his case for him. In this conversation on The Analytics Engineering Podcast, he and host Tristan Handy talk about how MotherDuck and DuckDB fit together, why a single-node database makes more sense than you'd think for agents, and what it would actually look like to have a swarm of agents managing your data.

YouTube

AI, ML and LLMs

Interview