The Definitive Guide to DuckLake
Table formats shouldn't require a PhD in file management
A comprehensive O'Reilly guide to the open table format that replaces file-based metadata with a SQL database for a faster, easier lakehouse. Free early access for data engineers and platform teams.
Enter your email. We'll notify you the moment the guide is available.

Why DuckLake
Traditional open table formats store metadata as thousands of small files scattered across object storage. Every catalog operation requires file-system round trips. Compaction jobs run for hours.
SQL-powered metadata
Stored in Postgres, SQLite, or DuckDB — not scattered files.
10–100x faster
Catalog queries in milliseconds, not hundreds of milliseconds.
ACID-compliant
Multi-table transactions and time travel built in from day one.
Engine-agnostic spec
MIT licensed with DuckDB, Spark, and DataFusion implementations.
Iceberg-compatible
Read Iceberg tables directly. Migrate metadata without moving data.
Simple to set up
Three SQL commands to create your first DuckLake.
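To make the SQL-backed metadata idea concrete, here is a toy sketch in Python using the standard-library sqlite3 module. It is an illustration only: the schema (the `snapshots` and `data_files` tables) and the helpers `commit_files` and `files_at` are hypothetical and far simpler than the actual DuckLake specification. The aim is to show how one SQL transaction can record changes to several tables at once, and how a time-travel lookup becomes an ordinary query.

```python
import sqlite3

# Hypothetical, simplified catalog schema (NOT DuckLake's actual specification).
SCHEMA = """
CREATE TABLE snapshots (
    snapshot_id INTEGER PRIMARY KEY,
    note        TEXT NOT NULL
);
CREATE TABLE data_files (
    path           TEXT NOT NULL,
    table_name     TEXT NOT NULL,
    begin_snapshot INTEGER NOT NULL REFERENCES snapshots(snapshot_id),
    end_snapshot   INTEGER REFERENCES snapshots(snapshot_id)  -- NULL while the file is live
);
"""


def open_catalog():
    """Create an in-memory catalog database with the toy schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def commit_files(conn, note, added=(), removed=()):
    """Record one snapshot that adds and/or retires data files, atomically."""
    with conn:  # one transaction: commits on success, rolls back on error
        cur = conn.execute("INSERT INTO snapshots (note) VALUES (?)", (note,))
        snapshot_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO data_files (path, table_name, begin_snapshot) VALUES (?, ?, ?)",
            [(path, table, snapshot_id) for table, path in added],
        )
        conn.executemany(
            "UPDATE data_files SET end_snapshot = ? WHERE path = ? AND end_snapshot IS NULL",
            [(snapshot_id, path) for path in removed],
        )
    return snapshot_id


def files_at(conn, table, snapshot_id):
    """Time travel: list the files visible to `table` as of `snapshot_id`."""
    rows = conn.execute(
        """
        SELECT path FROM data_files
        WHERE table_name = ?
          AND begin_snapshot <= ?
          AND (end_snapshot IS NULL OR end_snapshot > ?)
        ORDER BY path
        """,
        (table, snapshot_id, snapshot_id),
    )
    return [path for (path,) in rows]


catalog = open_catalog()

# One snapshot touching two tables in a single transaction.
s1 = commit_files(
    catalog,
    "initial load",
    added=[("orders", "orders/a.parquet"), ("customers", "customers/a.parquet")],
)

# A later snapshot that swaps one file for another.
s2 = commit_files(
    catalog,
    "compact orders",
    added=[("orders", "orders/b.parquet")],
    removed=["orders/a.parquet"],
)

print(files_at(catalog, "orders", s1))  # files as of the first snapshot
print(files_at(catalog, "orders", s2))  # files as of the second snapshot
```

Because each snapshot is written inside a single transaction, a failed commit leaves no half-written state behind, and reading an older version of a table is just a filter on snapshot IDs rather than a walk through metadata files.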
What's inside the guide
Everything you need to evaluate, adopt, and operate DuckLake.
Part 01
Architecture & Design
Why SQL-backed metadata changes everything. Deep dive into DuckLake's architecture, how catalog operations hit 10–100x speed improvements, and the design decisions that make it possible.
Part 02
Compared: Iceberg & Delta Lake
An honest, technical comparison. Where DuckLake excels, where incumbent formats still have strengths, and how to think about the trade-offs for your data platform.
Part 03
Migration & Getting Started
Practical guidance for adopting DuckLake. Iceberg interop patterns, migration strategies, getting started with DuckDB, and integrating with your existing data stack.
Not another vendor whitepaper
O'Reilly doesn't put its name on marketing material. This guide is held to the same editorial standard as its definitive guides to Kafka, Spark, and Kubernetes: peer-reviewed technical content built for practitioners, not prospects.
The guide covers DuckLake's strengths and its limitations. It compares it honestly against Iceberg and Delta Lake. It provides real migration paths, not handwaving. If you're evaluating table formats for your data platform, this is the unbiased technical resource you need.
What makes this different
- ✓ Deep technical content from DuckLake contributors
- ✓ The quality content you expect from O'Reilly
- ✓ Covers competing formats honestly
- ✓ Practical migration paths & interop patterns
Frequently asked questions
When will the guide be available?
The guide is currently in pre-release with O'Reilly. Sign up to be notified the moment early access opens. We expect the first couple of chapters in the next few weeks. We'll notify you as additional chapters become available, and again when the first edition is fully published.
Is it free?
Yes. The early-access edition is completely free. No paywall, no credit card, no trial signup required.
What topics does the guide cover?
The guide covers DuckLake's architecture and design philosophy, performance characteristics, honest comparisons with Iceberg/Delta Lake, migration strategies, Iceberg interop patterns, and practical getting-started guidance with DuckDB.
Do I need to use MotherDuck?
No. DuckLake is an open-source project (MIT licensed) that works with DuckDB. The guide covers DuckLake as a technology. MotherDuck is one way to use it, but the content applies regardless of your deployment choice. With MotherDuck you can create a DuckLake with just two SQL commands in under five minutes.
Get the guide before anyone else
Enter your email and we'll send you early access the moment it's ready.
Sign up for early access


