HEY, FRIEND 👋
I hope you're doing well. I'm Simon, and I am happy to share another monthly newsletter with highlights and the latest updates about DuckDB, delivered straight to your inbox.
In this April issue, I gathered 11 updates and news highlights from DuckDB's ecosystem. Please enjoy this month's update, including the big one: DuckLake 1.0 going production-ready. Also inside: new vector search options with the Lance integration, Rust-based extension development, creative community projects like a SQL puzzle game and a Neovim-themed website, performance benchmarks on the new MacBook Neo, and AI-powered eBPF tracing with DuckDB.
DuckDB + MotherDuck meetups keep rolling: Round 2 in San Francisco on April 30th with talks on DuckLake 1.0 and distributed DuckDB — register here. And if you're in Seattle the same day, there's a PyData x MotherDuck event on Python + DuckDB workflows — register here.
If you have feedback, news, or any insights, they are always welcome. 👉🏻 duckdbnews@motherduck.com.
Featured Community Member

Prof. Dr. Torsten Grust
Head of the research group
Torsten is a professor of Computer Science at the University of Tübingen (Germany), researching the design, compilation, optimization, and evaluation of database query languages. His group develops techniques that turn relational database systems into scalable processors for non-relational query and programming languages — walking the fine line between database and PL technology.
He recently started DiDi (Design and Implementation of DuckDB Internals), an open course that teaches database system design through hands-on exploration of DuckDB — covering memory management, sorting, indexing, vectorized execution, and query optimization across eight chapters with ~50 runnable code examples. It's great to see DuckDB getting adoption for teaching database fundamentals at university level.
Connect with Torsten on LinkedIn and check out his database research group.
Top DuckDB Links this Month
DuckLake 1.0: The Lakehouse Format Goes Production-Ready
TL;DR: DuckLake 1.0, the metadata-in-a-database lakehouse format, is now production-ready with sorted tables, bucket partitioning, data inlining, geometry support, and Iceberg-compatible deletion vectors.
Unlike Delta Lake and Iceberg, DuckLake stores all metadata in a database catalog (PostgreSQL, SQLite, or DuckDB itself) rather than scattered files. The 1.0 release merges 108 PRs since late 2025 — 68 focused on reliability and correctness alone. Data inlining solves the small-file problem by storing tiny operations (≤10 rows by default) directly in the catalog, with CHECKPOINT to flush to object storage. Sorted tables enable automatic compaction and file pruning for high-cardinality columns, and the new Variant type brings semi-structured data with shredding to primitive types for better query performance.
Performance highlights include 8×–258× speedups for COUNT(*) via metadata-only queries and ~70× faster duckdb_views() lookups. Already ranked among DuckDB's top-10 extensions by downloads, with clients for Apache DataFusion, Spark, Trino, and Pandas. An O'Reilly book — "DuckLake: The Definitive Guide" — is in development. Available in DuckDB v1.5.2.
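Data inlining and CHECKPOINT are easiest to see in a short session. A minimal sketch, assuming a local DuckDB-backed catalog; the attach string and paths are illustrative, and exact flush behavior may vary by version:

```sql
-- Attach a DuckLake catalog; metadata lives in metadata.ducklake,
-- data files land under data/ (names here are illustrative)
INSTALL ducklake;
ATTACH 'ducklake:metadata.ducklake' AS lake (DATA_PATH 'data/');

CREATE TABLE lake.events (id INTEGER, payload VARCHAR);

-- A tiny insert (≤10 rows by default) is inlined into the catalog
-- instead of producing a new small Parquet file
INSERT INTO lake.events VALUES (1, 'hello'), (2, 'world');

-- Flush inlined rows out to object storage when you're ready
CHECKPOINT lake;
```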
dux: Distributed DataFrames for Elixir powered by DuckDB
TL;DR: Dux is a distributed, lazy-by-default Elixir dataframe library backed by DuckDB, offering better performance and simpler maintenance than prior Polars-backed approaches.
Pipelines compile to SQL CTEs for end-to-end optimization by DuckDB, with lazy operations accumulating as an AST in the %Dux{} struct. It has built-in distributed execution across BEAM nodes, where data can be transferred, SQL compiled locally, and executed against each node's DuckDB instance without heavy RPC.
Early benchmarks (10M rows, Apple M4 Max) show Dux outperforming Explorer (Polars) by up to 2.5× for lazy filters (24 ms vs 59 ms) and 1.6× for group+summarise (40 ms vs 63 ms).
connections.duckdb: Play the New York Times Connections puzzle with DuckDB!
TL;DR: Tom Jakubowski built the New York Times Connections puzzle entirely in DuckDB using SQL macros and views.
The goal is to sort a grid of 16 words into 4 groups that share a hidden category. Point duckdb at https://www.tjak.dev/connections.duckdb, run select * from todays_puzzle;, and guess your groups with FROM guess_category_today(['CONTEST', 'GAME', 'BATTLE', 'CLASH']);. All game state and validation run in-memory through SQL macros and views.
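The whole loop fits in a short session; the commands below come straight from the project's instructions:

```sql
-- Launch the CLI against the remote database:
--   duckdb https://www.tjak.dev/connections.duckdb
SELECT * FROM todays_puzzle;   -- show today's 16-word grid
FROM guess_category_today(['CONTEST', 'GAME', 'BATTLE', 'CLASH']);  -- submit one group
```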
Lance Extension
TL;DR: The Lance extension enables read/write of Lance datasets in DuckDB with vector, full-text, and hybrid search via dedicated SQL functions.
Lance is a columnar, open-table format optimized for ML/AI workloads and vector search. Hao Ding did the heavy lifting in adding support for reading and writing Lance tables. You can query via replacement scans and write with COPY (...) TO 'path/dataset.lance' (FORMAT lance, MODE 'overwrite'|'append');. Search functions include lance_vector_search(...), lance_fts(...), and lance_hybrid_search(...).
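A sketch of the round trip, assuming the community-extension install path and that the search functions take a dataset path, a query vector, and a result limit (the exact signatures may differ):

```sql
INSTALL lance FROM community;
LOAD lance;

-- Write a query result out as a Lance dataset
COPY (SELECT * FROM my_table)
  TO 'path/dataset.lance' (FORMAT lance, MODE 'overwrite');

-- Read it back via a replacement scan
SELECT count(*) FROM 'path/dataset.lance';

-- Vector search: nearest 10 rows to a query vector (argument order is an assumption)
SELECT * FROM lance_vector_search('path/dataset.lance', [0.1, 0.2, 0.3]::FLOAT[3], 10);
```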
neovim-web: A website framework with Vim keybindings, Telescope fuzzy finder, and DuckDB SQL console to query site content
TL;DR: A zero-build Neovim-themed website framework with a built-in DuckDB SQL console for querying site content.
Volker integrates a DuckDB SQL console directly in the browser (:sql command) using DuckDB Wasm for client-side execution, no server-side processing needed. It's a fun way to learn more about in-browser SQL. Check Volker's website and type FROM pages; to try it, or clone the repo to build your own.
MotherDuck Now Speaks Postgres
TL;DR: MotherDuck now provides a PostgreSQL wire-protocol endpoint so you can run DuckDB SQL from any Postgres-compatible client without installing DuckDB libraries.
Point your existing Postgres client at pg.us-east-1-aws.motherduck.com:5432, authenticate with a MotherDuck token, and offload analytics while keeping OLTP Postgres lean. SQL remains DuckDB's dialect (largely PostgreSQL-compatible).
Existing drivers, poolers, and query patterns work unchanged. Supported clients include JDBC, rust-postgres, and node-postgres. Data movement from Postgres can be done with ETL tools or the pg_duckdb extension.
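Connecting looks like any other Postgres session; the user and database names below are placeholders:

```sql
-- From psql (or any Postgres-compatible client):
--   psql "host=pg.us-east-1-aws.motherduck.com port=5432 user=<you> password=<motherduck_token> dbname=<db>"
-- Once connected, the dialect is DuckDB's, so DuckDB-specific SQL works:
SELECT version();
FROM duckdb_settings() LIMIT 5;  -- DuckDB's FROM-first syntax over the Postgres wire
```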
Big Data on the Cheapest MacBook
TL;DR: The entry-level MacBook Neo (Apple A18 Pro) handles heavy DuckDB workloads, such as ClickBench and TPC-DS, surprisingly well.
Gábor from DuckDB benchmarked the MacBook Neo with ClickBench (100M rows, 5 GB memory limit), yielding sub-second cold run medians, and TPC-DS at SF100 with a 1.63-second query median. Even the demanding SF300 completed in 79 minutes, though with significant disk spills.
quack-rs: A Rust SDK for building DuckDB loadable extensions
TL;DR: quack-rs is a pure-Rust SDK wrapping DuckDB's C Extension API (v1.1+) to eliminate all C/C++ glue code and FFI pitfalls when building loadable extensions.
Previously, writing Rust-based DuckDB extensions required C++ glue and CMake tooling. The SDK wraps the C Extension API with safe, idiomatic abstractions and eliminates 16 documented FFI pitfalls, including silent NULL corruption and double-free in aggregate callbacks. The generate_scaffold function produces all 11 files needed for a community extension submission.
This means community extensions can now be built in Rust with its performance and safety guarantees, without needing to know DuckDB internals.
Announcing systing 1.0: Integration of DuckDB and AI accelerates the debugging workflow
TL;DR: Josef Bacik's systing eBPF (extended Berkeley Packet Filter) tracing tool now outputs directly to DuckDB databases, leveraging its speed for real-time AI-driven analysis of complex Linux performance issues.
eBPF is a Linux kernel technology that lets you run small, sandboxed programs directly in the kernel. Systing 1.0 marks a significant shift from generating Perfetto traces to creating DuckDB databases for system-wide eBPF tracing data, addressing earlier issues with overwhelming data volume and slow SQLite conversions. Josef also implemented a Claude Code MCP server that analyzes these DuckDB traces, replacing static analysis scripts with dynamic, AI-powered insights. It's a great showcase of DuckDB's query speed powering an AI-driven debugging loop.
Building a Full-Featured DuckDB Kernel for Jupyter — With a Database Explorer You'll Actually Use
TL;DR: Vladimir said "SQL notebooks deserve better tooling" and delivered a native Go DuckDB Jupyter kernel that streams Arrow IPC to a WASM Perspective viewer, with a database explorer for JupyterLab and VS Code.
The kernel runs DuckDB directly (no Python wrapper) and exposes a localhost HTTP API for Arrow IPC streaming and explorer metadata. Perspective renders interactive tables/charts; a 5M-row (237 MB) result was queried in 238 ms and rendered in under 5s. Table detail panels include a Summarize tab computing approx_unique, avg, min, max, count without writing queries. Install via VS Code "Install / Update DuckDB Kernel", or JupyterLab with pip install hugr-perspective-viewer. The kernel is part of Hugr (an open source Data Mesh platform).
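The Summarize tab's per-column stats mirror what DuckDB's built-in SUMMARIZE statement computes, which you can try in any DuckDB session:

```sql
CREATE TABLE t AS SELECT range AS x, range % 7 AS grp FROM range(1000);
-- One row per column with count, approx_unique, avg, min, max, and more
SUMMARIZE t;
```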
Introducing Embedded Dives
TL;DR: MotherDuck now lets you embed Dives (React+SQL components) with dual execution (cloud + DuckDB-Wasm) yielding 5–20 ms interaction latency.
Developers can integrate AI-created data apps into their applications and websites via <iframe>. The cloud engine handles the initial query and streams results into a local DuckDB-Wasm instance, so subsequent interactions like filtering and aggregations run entirely client-side with zero network roundtrips. Browse examples at the Dive Gallery.
Upcoming Events
MotherDuck Now Speaks Postgres: Fast Analytics Without Changing Your Stack
2026-04-21, 16:00 · Online
A livestream on MotherDuck's new Postgres wire protocol endpoint — any Postgres-compatible client, driver, or BI tool can query your data warehouse directly. No DuckDB libraries required.
A Practical Guide to Context Management for Data Agents
2026-04-23, 16:30 · Online
A livestream covering data agents, context management, and business logic implementation with Virgil Data.
DuckDB + MotherDuck Meetup — San Francisco
2026-04-30, 18:00 · San Francisco, CA, USA
Round 2 of the SF DuckDB + MotherDuck meetup! Talks on Building OpenDuck (distributed DuckDB) and DuckLake 1.0.
High-Performance Data Workflows with Python and DuckDB — PyData x MotherDuck
2026-04-30, 17:30 · Seattle, WA, USA
Local-first analytics workflows combining Python and DuckDB, scaling to cloud with MotherDuck. A PyData x MotherDuck collaboration.
AI Council
2026-05-12, 08:00 · San Francisco, CA, USA
Three-day conference at the SF Marriott Marquis with 50+ speakers covering AI engineering, agents, and databases.
PREVIOUS POSTS

2026/04/15 - Jordan Tigani
Water Town: The Agent Swarm Data Stack
In a fully agentic world, will we still need analytics at all? A particularly unhinged example might offer some clues.

2026/04/16 - Alex Monahan
Announcing DuckLake 1.0 on MotherDuck
MotherDuck now supports DuckLake 1.0, the open table lakehouse format designed for simplicity and low latency. Learn what's new in the 1.0 release, including data inlining, clustering, bucket partitioning, geometry and variant types, plus multi-engine support. Learn how DuckLake compares with Apache Iceberg and Delta Lake.


