DuckLake - The Definitive Guide
Table formats shouldn't require a PhD in file management
A comprehensive O'Reilly guide to the open table format that replaces file-based metadata with SQL databases for a faster, easier lakehouse. Free early access for data engineers and platform teams.
Chapter 1 Now Available. We'll send you additional chapters of the early release as they become available.

Why DuckLake
Traditional open table formats store metadata as thousands of small files scattered across object storage. Every catalog operation requires file-system round trips. Compaction jobs run for hours.
SQL-powered metadata
Stored in Postgres, SQLite, or DuckDB — not scattered files.
10–100x faster
Catalog queries in milliseconds, not hundreds of milliseconds.
ACID-compliant
Multi-table transactions and time travel built in from day one.
Engine-agnostic spec
MIT licensed with DuckDB, Spark, and DataFusion implementations.
Iceberg-compatible
Read Iceberg tables directly. Migrate metadata without moving data.
Simple to set up
Three SQL commands to create your first DuckLake.
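To make the SQL-backed metadata idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (snapshot, data_file, begin_snapshot, end_snapshot) and the helper functions are hypothetical simplifications chosen for illustration. They are not DuckLake's actual schema or API. The sketch shows how a catalog kept in a SQL database can commit changes to several tables in one transaction and answer "which files were live at snapshot N" with a single query.

```python
import sqlite3

# Hypothetical, simplified catalog schema for illustration only;
# this is an assumption, not DuckLake's real metadata schema.
SCHEMA = """
CREATE TABLE snapshot (snapshot_id INTEGER PRIMARY KEY);
CREATE TABLE data_file (
    path TEXT NOT NULL,
    table_name TEXT NOT NULL,
    begin_snapshot INTEGER NOT NULL,  -- snapshot that added the file
    end_snapshot INTEGER              -- snapshot that removed it, NULL while live
);
"""


def open_catalog():
    """Create an in-memory catalog database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def commit(conn, added=(), removed=()):
    """Record one snapshot: add and retire files in a single transaction.

    `added` holds (table_name, path) pairs; `removed` holds file paths.
    """
    with conn:  # commits on success, rolls back on any exception
        snap = conn.execute("INSERT INTO snapshot DEFAULT VALUES").lastrowid
        conn.executemany(
            "INSERT INTO data_file (path, table_name, begin_snapshot) "
            "VALUES (?, ?, ?)",
            [(path, table, snap) for table, path in added],
        )
        conn.executemany(
            "UPDATE data_file SET end_snapshot = ? "
            "WHERE path = ? AND end_snapshot IS NULL",
            [(snap, path) for path in removed],
        )
    return snap


def files_at(conn, table, snap):
    """Time travel: list the files visible for `table` as of snapshot `snap`."""
    rows = conn.execute(
        "SELECT path FROM data_file "
        "WHERE table_name = ? AND begin_snapshot <= ? "
        "AND (end_snapshot IS NULL OR end_snapshot > ?) "
        "ORDER BY path",
        (table, snap, snap),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_catalog()
    # One transaction touches two tables.
    s1 = commit(conn, added=[("orders", "orders/a.parquet"),
                             ("customers", "customers/a.parquet")])
    # A later snapshot replaces one file.
    s2 = commit(conn, added=[("orders", "orders/b.parquet")],
                removed=["orders/a.parquet"])
    print(files_at(conn, "orders", s1))
    print(files_at(conn, "orders", s2))
```

The point of the sketch is the design choice, not the details: a catalog lookup is one indexed SQL query rather than a walk over metadata files, and atomicity comes from the database's own transactions.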
What's inside the guide
Everything you need to evaluate, adopt, and operate DuckLake.
Part 01
Architecture & Design
Why SQL-backed metadata changes everything. Deep dive into DuckLake's architecture, how catalog operations hit 10–100x speed improvements, and the design decisions that make it possible.
Part 02
Compared: Iceberg & Delta Lake
An honest, technical comparison. Where DuckLake excels, where incumbent formats still have strengths, and how to think about the trade-offs for your data platform.
Part 03
Migration & Getting Started
Practical guidance for adopting DuckLake. Iceberg interop patterns, migration strategies, getting started with DuckDB, and integrating with your existing data stack.
Not another vendor whitepaper
O'Reilly doesn't put their name on marketing material. This is the same editorial standard behind their definitive guides to Kafka, Spark, and Kubernetes — peer-reviewed technical content built for practitioners, not prospects.
The guide covers DuckLake's strengths and its limitations. It compares DuckLake honestly against Iceberg and Delta Lake. It provides real migration paths, not handwaving. If you're evaluating table formats for your data platform, this is the unbiased technical resource you need.
What makes this different
- ✓ Deep technical content from DuckLake contributors
- ✓ The quality content you expect from O'Reilly
- ✓ Covers competing formats honestly
- ✓ Practical migration paths & interop patterns
Get the guide before anyone else
Enter your email and we'll send you chapters from Early Access as they are ready.
Download the Early Release
