Hello, everyone. Welcome to our webinar today, "From Source to Dashboard: Real Pipelines with Flights." This is part two of our webinars on flights, a feature we launched a little under two weeks ago, which allows you to run Python and get data into MotherDuck and do other things with it. We're recording this and we'll send out a link to anyone who registered, as well as put it on our website afterwards. If you have any questions, feel free to put them in the chat now or at the end. I'm Jerel, on the marketing team here at MotherDuck.
Hi everybody, I'm Jacob, on the DevRel side here at MotherDuck. I'm super pumped to be here and to show off how all this works.
Before the demo, I'll do a quick introduction to MotherDuck. We had the big data era, with Hadoop and Spark, which allowed you to query massive amounts of data, but you needed to split a query across multiple workers and nodes and shuffle data across the network, with lots of round trips. That leads to high costs, high latency, and complex infrastructure to manage.
And now we have AI. Agents do not query like humans do. You can ask your agent a question, and it can spin off and run dozens or hundreds of queries on its own.
So that brings us to DuckDB, an open source OLAP database that MotherDuck is built on top of, built by DuckDB Labs out of Amsterdam, the same place where Python was invented. DuckDB is lightweight and embeddable. It runs locally, on servers, and in the browser, with fast, zero-latency queries and powerful vertical scaling.
MotherDuck is building a cloud data warehouse on top of DuckDB. We have a serverless compute platform, so zero infrastructure to manage. We enable dual execution, splitting work between the client and the cloud. We have a hypertenant architecture, so you can scale compute independently by user, service account, or even by agent.
And we have agent-native tools: our MCP server, which lets you query data with natural language, Dives, our BI tool, and now flights, where you can use your favorite AI agent like Claude or Codex to build a pipeline to bring data into MotherDuck and kick off transformations or send alerts. Let me hand it over to Jacob.
Thank you, Jerel. To talk about flights, first we need to talk about our MCP. MCP is basically just an API, but designed for your agent to work with. Our MCP in MotherDuck is very easy to add: type in MotherDuck, click add, and authorize through OAuth. There are a handful of tools: GetFlight, GetFlightGuide, GetFlightRunLogs, ListFlights, ListFlightVersions, and more.
I'm going to do this in Claude Code. We're doing all this live, so pray to the demo gods. Claude was down this morning. A flight lets us build pipelines in Python and run them in MotherDuck. I'm giving it this prompt and telling it to use subagents to do this in parallel. There are a couple of good open datasets: New York City 311 complaints, and daily New York City weather from Open-Meteo. Let's run both, read the logs, make sure there are no errors, and then build a dive that shows which complaint types correlate with weather.
A flight has four parts: logs, the Python code, config (variables for the run, not secrets, which live in the secret manager), and requirements. We can run this now. It might fail because the requirements aren't pinned, and there was a new DuckDB release. It failed, which is what I expected. So let's fix this: go to requirements, pin it, save, and run. For those who've done automation with GitHub Actions, this should feel familiar. It printed hello world.
This is actually a full Linux box behind the scenes. Here's an NBA dataset that runs every night on a cron. In main.py we clone some code and use Python's subprocess module to run bash commands, so this is not just a Python runtime. We can see Claude is doing stuff as we go, and we can see changes in versions.
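
As a rough illustration of the point that a flight is more than a Python runtime, here is a minimal sketch of calling out to the operating system from Python with the standard `subprocess` module. The helper name `run_command` is hypothetical, and the demo's actual `main.py` is not shown in this recording, so treat this as an assumption about the general pattern rather than the code that ran.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments and return its stdout.

    check=True raises CalledProcessError on a non-zero exit code, so a
    failing step stops the script instead of being silently ignored.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Portable stand-in for a shell step: ask the current Python interpreter
    # to print a line. A real pipeline might instead pass something like
    # ["git", "clone", <repo-url>] (not executed here).
    print(run_command([sys.executable, "-c", "print('hello world')"]))
```

Passing the command as a list rather than a single string avoids shell quoting problems, and `capture_output=True` keeps the command's output available to log or inspect.
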
We got our weather data; it took about nine seconds. Now we're getting the 311 data into flightswebinar.main.nyc_daily. My favorite function is summarize: complaint type, complaint count, and date. Back in Claude: the weather flight succeeded but landed in the wrong database, so it's fixing itself. That's pretty funny.
Now we build the dive, which takes the two datasets and combines them. Weather daily has average temp, max, min, precipitation, and one year of data. So far I've given one prompt and told it to use subagents: build a MotherDuck flights pipeline and a dive end-to-end, get the NYC 311 complaints and daily weather, run until there are no errors, then build a dive that joins them and answers which complaints are associated with the weather.
One thing that's really cool: inside our MCP there's a tool called GetFlightsGuide, which is basically a skill. So the agent reads what a flight is, the interface, and the MCP tools for editing them before writing code.
Dive saved successfully. We can see that different complaint types change with the cold. Filter to heat/hot water and these complaints correlate nicely with colder weather. I built a nicer view: click on the top complaints and you can see plumbing has some correlation, heat/hot water a lot, and vendor enforcement goes up as temperature goes up, since more people are outside.
The main takeaway: we can give it one prompt and it'll build something for us. Flights let us do these explorations, and also give us the power to do this on a recurring basis. We're starting to run this for some production pipelines internally.
On the marketing side we've got places like StreamYard with an API for data, and with flights we can integrate that more readily and get it through quality gates. So that's the quick demo. Let's take questions. Jacob, do you want to pull up our cookbook to show examples of how our DevRel team is thinking about using flights?
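
To make the complaint-versus-temperature comparison from the demo concrete, here is a small sketch of the underlying idea: align two daily series on date, then compute a Pearson correlation. The function names and the numbers are made up for illustration (they are not the demo's data), and the dive itself may compute this differently.

```python
from math import sqrt


def join_by_date(complaints, weather):
    """Inner-join two {date: value} mappings on date.

    Returns two lists aligned by ascending date, keeping only dates that
    appear in both inputs.
    """
    dates = sorted(set(complaints) & set(weather))
    return [complaints[d] for d in dates], [weather[d] for d in dates]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Synthetic toy values, NOT real 311 or Open-Meteo data.
    heat_complaints = {"2024-01-01": 900, "2024-01-02": 700,
                       "2024-01-03": 500, "2024-01-04": 300}
    avg_temp_c = {"2024-01-01": -5.0, "2024-01-02": 0.0,
                  "2024-01-03": 5.0, "2024-01-04": 10.0,
                  "2024-01-05": 12.0}  # extra day is dropped by the join
    counts, temps = join_by_date(heat_complaints, avg_temp_c)
    # A value near -1 means complaints rise as temperature falls.
    print(pearson(counts, temps))
```

A correlation like this only describes how the two series move together; it does not show that weather causes the complaints.
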
This is our cookbook, with recipes: each one is a prompt that tells you how to do something with a flight. It's easy to copy a prompt and get the thing working.
First question: how does version control work for flights? Is it done via GitHub? Today, version control is just files in S3. You can use GitHub as your main code repo and have your flight be simple and just fetch from GitHub. You can see versions and changes over time. There's not a diff viewer yet, which I think is where your question is heading. We're thinking about whether it should work more like a GitHub Action with explicit source control, or like a Lambda function where you ship a package of code. Flights is technically in preview, so our product engineering team is still adding features.
Question from Frank: do flights only work with MotherDuck, not DuckDB? Flights are orchestration running on the MotherDuck platform. If you wanted to write to DuckDB files, you could figure that out, but MotherDuck is the best-in-class way to integrate. We're giving you a generic Python runtime.
Question from Mark: if a flight kicks off multiple sub-flights, do they all run as threads in a single process? No. Each flight is separate, distinct compute, like Airflow where each job can run on its own pod. Each opens its own connection to MotherDuck, and depending on setup they can connect to the same or different databases.
A follow-up: it would be good not to pay the runtime overhead for a complex flow, a full DAG. The billing is per second with no cooldown, so you just pay for what's running while it's running. A dbt job, which is its own DAG, runs on a single flight. The overhead should be fairly minimal outside of build time for packages. Happy to take more offline on community Slack.
One thing I'll call out: you can use Claude or Codex to write all the code, or write it raw within the MotherDuck UI. At the end of the day it's all just code, and it's all visible.
It's not a complete black box where the AI writes something and you never see what's going on.
I'll add that, like with dives, flights has SQL functions behind it. Every view (the logs, code, config, requirements) is retrievable with SQL. So for a deterministic application where you don't want MCP, you can query that with SQL. You can even run jobs from SQL with the MD RUN flights command: give it the flight ID and it runs the flight. It's starting to feel like stored procedures, although it's a separate runtime, not in the database.
Bertrand says he tried both flights and dives and sees great potential for people with limited technical skills to iterate quickly. I'll speak to that as someone with limited technical skills on the marketing team. It's great that I don't have to bug our data engineering team to get insights into how our LinkedIn profile is performing, or to join data from different sources and create a dashboard. I own our webinars, and I want to pull all our registration data and combine it with data in our CRM to show how our webinars contribute to our business. That's all easy for me to do as a non-technical duck, so I can ask lots of questions and get answers without being bottlenecked.
Awesome. If that's all, we'll end here. If you haven't tried dives or flights, go try them; they're easy to get going. We had a hackathon that closed yesterday to see who could build the coolest dive, and we have another livestream tomorrow morning where we'll announce the winners and talk about what makes a good data visualization in the age of AI. We'll send out a link to the recording afterwards, and you can join us on community Slack with any questions about flights or MotherDuck. Thank you.
Thanks, everybody.