TL;DR
Hugo Lu (Orchestra) and Jacob Matson (MotherDuck) demo an AI agent that builds a complete data pipeline from a single natural language prompt — ingesting data from Linear, landing it in a MotherDuck staging database, promoting it to production via snapshots, and visualizing results in a Dive — all orchestrated through Orchestra and wired together with MCP.
Why AI changes how we build pipelines
AI models are good at writing SQL and structuring messy data, but they query differently from humans. Agents fire off many small, bursty queries to explore a database before they do anything useful. According to the presenters, MotherDuck's DuckDB-based architecture suits this pattern: it is fast, uses a vectorized engine that runs queries in parallel, and costs roughly 10x less than Snowflake or Redshift for comparable workloads.
The demo: one skill, one agent, one pipeline
Hugo built a single Claude Code skill that tells the agent how to scaffold an end-to-end pipeline. When triggered, the agent:
- Generates a Python ingestion script using dlt (with a Linear connector it created on the fly)
- Writes an Orchestra pipeline YAML file to orchestrate the run
- Pushes everything to a feature branch
- Triggers the pipeline in Orchestra
The whole setup needs three sets of credentials (Linear, MotherDuck, Orchestra) and the MotherDuck MCP server for database operations. The warehouse layer needs no custom skills.
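Because an unattended run fails late if any of the three credentials is missing, a pre-flight check can help. The sketch below is one way to do that, not part of the demo; the environment variable names are hypothetical, since the recap only names the three services.

```python
import os

# Hypothetical environment variable names; the recap only says that
# credentials for Linear, MotherDuck, and Orchestra are required.
REQUIRED_CREDENTIALS = {
    "Linear": "LINEAR_API_KEY",
    "MotherDuck": "MOTHERDUCK_TOKEN",
    "Orchestra": "ORCHESTRA_API_KEY",
}


def missing_credentials(env=None):
    """Return the services whose credential is unset or empty."""
    env = os.environ if env is None else env
    return [
        service
        for service, variable in REQUIRED_CREDENTIALS.items()
        if not env.get(variable)
    ]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing credentials for: " + ", ".join(missing))
```

Passing a plain dict instead of reading `os.environ` keeps the check easy to test without touching real secrets.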
Staging, snapshots, and safe promotion
Rather than writing directly to production, the agent lands data in a staging database. MotherDuck snapshots create an immutable, zero-copy checkpoint of that database. If everything looks right, the snapshot gets promoted to production. If something breaks, you roll back. This keeps unattended agent workflows reversible and safe.
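The land, snapshot, promote, and roll back flow can be illustrated with a toy in-memory model. This is a conceptual sketch only: it does not call MotherDuck, and the `Warehouse` class, its method names, and the sample row are all hypothetical. Production is modeled as a reference to a read-only snapshot, so promotion swaps a pointer instead of rewriting data.

```python
from types import MappingProxyType


class Warehouse:
    """Toy in-memory model of staging, snapshot, promote, and rollback."""

    def __init__(self):
        self.staging = {}                        # table name -> tuple of rows
        self.production = MappingProxyType({})   # read-only view
        self._previous = None                    # state before last promotion

    def land(self, table, rows):
        """Write ingested rows to staging only; production is untouched."""
        self.staging[table] = tuple(rows)

    def snapshot(self):
        """Freeze the current staging state as a read-only mapping.

        The dict is shallow-copied; the row tuples are shared, not duplicated.
        """
        return MappingProxyType(dict(self.staging))

    def promote(self, snap, check):
        """Point production at the snapshot only if the check passes."""
        if not check(snap):
            return False
        self._previous = self.production
        self.production = snap
        return True

    def rollback(self):
        """Restore the production state that preceded the last promotion."""
        if self._previous is not None:
            self.production = self._previous
            self._previous = None


# Hypothetical usage: promote only when the issues table is non-empty.
wh = Warehouse()
wh.land("issues", [("LIN-1", "open")])
snap = wh.snapshot()
promoted = wh.promote(snap, check=lambda s: len(s["issues"]) > 0)
```

The validation check is the gate: a failed check leaves production exactly as it was, which is the property that makes unattended runs safe to retry.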
Dives: BI as code from your pipeline
At the end of the pipeline, the agent also updates a MotherDuck Dive — a React-based visualization that lives in source control. The Dive shows when data was last refreshed and links back to the Orchestra run for full lineage. No dashboard tool to learn; the AI writes the TSX.
Q&A highlights
- Storage cost of snapshots: MotherDuck keeps 7 days of snapshots by default. If you build incrementally, the overhead is minimal. Transient databases reduce retention further.
- Token efficiency: The demo used roughly 10,000 tokens. Both MotherDuck and Orchestra keep MCP tool descriptions concise to avoid blowing up context windows.
- Semantic/modeling layers: You can extend the Orchestra pipeline YAML to include dbt modeling steps between ingestion and snapshot promotion.
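The last point, placing modeling steps between ingestion and snapshot promotion, can be sketched as a simple list operation. This is plain Python, not Orchestra's YAML schema, and every step name below is hypothetical.

```python
def insert_between(steps, new_steps, after, before):
    """Return a new step list with new_steps placed between two adjacent steps."""
    i = steps.index(after)
    if i + 1 >= len(steps) or steps[i + 1] != before:
        raise ValueError(f"{before!r} does not directly follow {after!r}")
    return steps[: i + 1] + list(new_steps) + steps[i + 1 :]


# Hypothetical step names, loosely following the stages of the demo.
base = ["ingest_linear", "promote_snapshot", "update_dive"]
extended = insert_between(
    base, ["dbt_run", "dbt_test"], after="ingest_linear", before="promote_snapshot"
)
```

Returning a new list leaves the original pipeline definition unchanged, so the extended variant can be reviewed on a feature branch before it replaces the base one.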



