Free Guide · Open Table Formats

Stop wrestling with thousands of metadata files

The Essential Guide to DuckLake

This 39-page guide covers DuckLake architecture, a hands-on tutorial, complete SQL reference, production operations, migration paths from Iceberg and Delta Lake, security implementation, and cost analysis — everything you need to evaluate and adopt DuckLake.

Fill out the form to download the complete guide.

Why data teams are switching to DuckLake

Traditional open table formats store metadata as thousands of small files scattered across object storage. Every catalog operation — listing tables, checking history, resolving conflicts — requires file-system round trips. Compaction jobs run for hours. Schema changes cascade into rewrites.

DuckLake takes a fundamentally different approach: store metadata in a SQL database, not in files.

SQL-powered metadata

Stored in Postgres, MySQL, SQLite, or DuckDB — not scattered files.

ACID-compliant

Multi-table transactions and time travel built in from day one.

No vendor lock-in

Open format, built to work with any engine.

10–100x faster

Metadata operations that leave file-based catalogs in the dust.

Simple to set up

Create a DuckLake from scratch in under a minute.
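To give a flavor of that setup, here is a minimal sketch using the DuckLake extension for DuckDB. The catalog file name and table are illustrative placeholders; the guide's tutorial walks through each step in detail.

```sql
-- Load the DuckLake extension inside DuckDB
INSTALL ducklake;
LOAD ducklake;

-- Create (or attach) a DuckLake whose metadata lives in a local DuckDB catalog file
ATTACH 'ducklake:my_catalog.ducklake' AS my_lake;
USE my_lake;

-- Tables behave like ordinary SQL tables; data lands in Parquet files
CREATE TABLE events (id INTEGER, ts TIMESTAMP, payload VARCHAR);
INSERT INTO events VALUES (1, now(), 'hello ducklake');
```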

Inside the guide

39 pages of technical depth for data engineers and platform teams evaluating open table formats.

Architecture

How DuckLake Works

Why SQL-backed metadata changes everything. Deep dive into the architecture, plus an honest feature comparison against Iceberg and Delta Lake.

Tutorial

Build Your First DuckLake

Step-by-step hands-on walkthrough — create a DuckLake, ingest data, query it, and use snapshots and time travel.
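The snapshot and time-travel steps look roughly like this. The `ducklake_snapshots` function and `AT` clause follow the DuckLake documentation at the time of writing; treat this as a sketch, since exact names can vary by version.

```sql
-- List the snapshots recorded in the catalog
SELECT * FROM ducklake_snapshots('my_lake');

-- Query a table as of an earlier snapshot version
SELECT * FROM my_lake.events AT (VERSION => 1);

-- ...or as of a point in time
SELECT * FROM my_lake.events AT (TIMESTAMP => now() - INTERVAL 1 HOUR);
```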

Reference

SQL Reference & API

Complete command reference, metadata functions, time travel syntax, and language bindings for Python, Node.js, and more.

Operations

Performance & Production

Partitioning strategies, compaction, concurrency patterns, health monitoring, backup/recovery, and catalog maintenance.

Pipelines

Integrations & Migration

ETL/ELT patterns with dbt, Airflow, and Dagster. Migration paths from Iceberg, Delta Lake, RDBMS, and raw Parquet files.

Security

Security & Compliance

Zero-trust Parquet encryption, catalog-level access control, GDPR and HIPAA considerations, and full audit trail configuration.

DuckLake vs Iceberg vs Delta Lake

A preview of the feature comparison from the guide:

| Feature | DuckLake | Apache Iceberg | Delta Lake |
| --- | --- | --- | --- |
| Metadata storage | SQL database (Postgres, MySQL, DuckDB) | Files in object storage (JSON, Avro) | Files in object storage (JSON, Parquet) |
| Transaction scope | Multi-table, database-level ACID | Single-table ACID | Single-table ACID |
| Small file handling | Data inlining + SQL-based compaction | Periodic compaction jobs | Periodic compaction jobs (OPTIMIZE) |
| Schema evolution | Transactional DDL via SQL ALTER TABLE | Atomic metadata pointer updates | Atomic commits to transaction log |
| Query planning | Single SQL query to catalog | Multi-hop file reads (metadata → manifest list → manifests) | Read transaction log for file list |

The full guide includes additional comparison dimensions: concurrency models, primary dependencies, and more.
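Because the catalog is a SQL database, a single transaction can span several tables and commit atomically. A hedged sketch of what that looks like (the `orders` and `inventory` tables are hypothetical):

```sql
-- One atomic commit across two DuckLake tables
BEGIN TRANSACTION;
INSERT INTO my_lake.orders VALUES (1001, 'widget', 3);
UPDATE my_lake.inventory SET stock = stock - 3 WHERE sku = 'widget';
COMMIT;  -- both changes become visible together, or neither does
```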

Is this guide for you?

✓ You're evaluating open table formats for a new or existing data lakehouse

✓ You're running Iceberg or Delta Lake and hitting metadata or compaction pain

✓ You want a hands-on tutorial, not just a whitepaper

✓ You need to make a case to your team with real comparison data

✓ You're a data engineer or platform lead building a modern analytics stack

✓ You want production guidance — security, monitoring, and migration paths

Frequently asked questions

What is DuckLake?

DuckLake is an open table format that moves all metadata — for both the catalog and individual tables — into a standard SQL database (Postgres, MySQL, SQLite, or DuckDB), while data stays in Parquet files on object storage. Every operation becomes a SQL transaction against the catalog database, leveraging its native ACID guarantees for true multi-table transactions.
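In practice, that split between SQL metadata and Parquet data looks like this when attaching from DuckDB. The connection string and bucket are placeholders; the `DATA_PATH` option follows the DuckLake documentation.

```sql
-- Metadata lives in Postgres; data files land as Parquet on S3
ATTACH 'ducklake:postgres:dbname=lake_catalog host=localhost' AS lake
    (DATA_PATH 's3://my-bucket/lake/');
USE lake;
```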

How does DuckLake compare to Apache Iceberg and Delta Lake?

Iceberg and Delta Lake store metadata as files in object storage (JSON/Avro for Iceberg, JSON/Parquet for Delta), requiring periodic compaction and file-system round trips for catalog operations. DuckLake stores metadata in a SQL database, enabling index-based partition discovery, transactional schema evolution via SQL ALTER TABLE, and database-level concurrency control. The guide includes a detailed feature comparison table covering transaction scope, concurrency models, small file handling, schema evolution, and more.

Is DuckLake open source?

Yes. DuckLake is released under the MIT license. The format specification, DuckDB extension, and all tooling are fully open source. Data is stored as standard Parquet files readable by a wide variety of query engines and tools.

Can I use DuckLake with my existing data stack?

Yes. DuckLake works with any SQL-compatible metadata database and any major object storage (S3, GCS, Azure Blob, Cloudflare R2). The guide covers integration patterns for dbt, Airflow, Dagster, Tableau, Power BI, and streaming pipelines via Kafka. It also covers migration paths from Iceberg, Delta Lake, traditional RDBMS, and raw Parquet files.

Who is this guide for?

Data engineers, platform teams, and technical leaders evaluating open table formats. It ranges from foundational architecture concepts and a hands-on tutorial through to production operations, security implementation, and cost analysis versus Snowflake, BigQuery, and Databricks.

What is MotherDuck?

MotherDuck is a serverless cloud analytics platform built on DuckDB. It provides managed DuckLake hosting with integrated authentication, automatic credential brokering for object storage, and AWS PrivateLink for enterprise network security. You can try it free at motherduck.com.

Does MotherDuck offer managed DuckLakes?

Yes. MotherDuck manages the DuckLake catalog database and metadata operations for you, so you get 10–100x faster metadata lookups and sub-second query performance at petabyte scale without running your own catalog infrastructure. You can bring your own S3-compatible storage for data files, or let MotherDuck manage that too. Start with MotherDuck’s standard storage for typical workloads, then seamlessly scale to DuckLake-backed databases as your data grows — same SQL interface either way. Managed DuckLakes are currently available in preview.

Download the guide

Fill out the form to get instant access to The Essential Guide to DuckLake.

Download the Guide

Thanks for requesting the guide - you'll be taken there shortly!