DuckDB Ecosystem Newsletter : April 2026

2026/04/17 - 8 min read

HEY, FRIEND 👋

I hope you're doing well. I'm Simon, and I am happy to share another monthly newsletter with highlights and the latest updates about DuckDB, delivered straight to your inbox.

In this April issue, I gathered 11 updates and news highlights from DuckDB's ecosystem. The big one is DuckLake 1.0 going production-ready. You'll also find vector search with the new Lance extension, a pure-Rust SDK for building extensions, creative community projects like a SQL puzzle game and a Neovim-themed website framework, performance benchmarks on the new MacBook Neo, and AI-powered eBPF tracing with DuckDB.

DuckDB + MotherDuck meetups keep rolling: Round 2 in San Francisco on April 30th with talks on DuckLake 1.0 and distributed DuckDB — register here. And if you're in Seattle the same day, there's a PyData x MotherDuck event on Python + DuckDB workflows — register here.

If you have feedback, news, or any insights, they are always welcome. 👉🏻 duckdbnews@motherduck.com.


Prof. Dr. Torsten Grust

Head of the research group

Torsten is a professor of Computer Science at the University of Tübingen (Germany), researching the design, compilation, optimization, and evaluation of database query languages. His group develops techniques that turn relational database systems into scalable processors for non-relational query and programming languages — walking the fine line between database and PL technology.

He recently started DiDi (Design and Implementation of DuckDB Internals), an open course that teaches database system design through hands-on exploration of DuckDB — covering memory management, sorting, indexing, vectorized execution, and query optimization across eight chapters with ~50 runnable code examples. It's great to see DuckDB getting adoption for teaching database fundamentals at university level.

Connect with Torsten on LinkedIn and check out his database research group.

DuckLake 1.0: The Lakehouse Format Goes Production-Ready

TL;DR: DuckLake 1.0, the metadata-in-a-database lakehouse format, is now production-ready with sorted tables, bucket partitioning, data inlining, geometry support, and Iceberg-compatible deletion vectors.

Unlike Delta Lake and Iceberg, DuckLake stores all metadata in a database catalog (PostgreSQL, SQLite, or DuckDB itself) rather than scattered files. The 1.0 release merges 108 PRs since late 2025 — 68 focused on reliability and correctness alone. Data inlining solves the small-file problem by storing tiny operations (≤10 rows by default) directly in the catalog, with CHECKPOINT to flush to object storage. Sorted tables enable automatic compaction and file pruning for high-cardinality columns, and the new Variant type brings semi-structured data with shredding to primitive types for better query performance.
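To make the inlining workflow concrete, here is a minimal sketch, assuming the ducklake extension is installed; the catalog path, data path, and table name are illustrative, and the exact ATTACH options may differ by version:

```sql
-- Attach a DuckLake catalog (here backed by a local DuckDB file; illustrative paths)
ATTACH 'ducklake:metadata.ducklake' AS lake (DATA_PATH 'data/');

CREATE TABLE lake.events (id INTEGER, payload VARCHAR);

-- Small writes (<= 10 rows by default) are inlined into the catalog
-- instead of producing tiny Parquet files on object storage
INSERT INTO lake.events VALUES (1, 'a'), (2, 'b');

-- Per the 1.0 release notes, CHECKPOINT flushes inlined rows
-- out to Parquet files on object storage
CHECKPOINT;
```

The appeal is that high-frequency small writes hit the catalog database (a fast transactional store) rather than object storage, and compaction to Parquet happens on your schedule.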

Performance highlights include 8×–258× speedups for COUNT(*) via metadata-only queries and ~70× faster duckdb_views() lookups. Already ranked among DuckDB's top-10 extensions by downloads, with clients for Apache DataFusion, Spark, Trino, and Pandas. An O'Reilly book — "DuckLake: The Definitive Guide" — is in development. Available in DuckDB v1.5.2.

dux: Distributed DataFrames for Elixir powered by DuckDB

TL;DR: Dux is a distributed, lazy-by-default Elixir dataframe library backed by DuckDB, offering better performance and simpler maintenance than prior Polars-backed approaches.

Pipelines compile to SQL CTEs so DuckDB can optimize them end to end, with lazy operations accumulating as an AST in the %Dux{} struct. Distributed execution across BEAM nodes is built in: data is transferred between nodes, and SQL is compiled and executed locally against each node's DuckDB instance, avoiding heavy RPC.
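To make the compile-to-SQL idea concrete: a lazy filter followed by a group-and-summarise might lower to a single DuckDB query along these lines (a hypothetical sketch — the actual CTE shapes are Dux internals, and the sales table is made up):

```sql
-- Hypothetical lowering of an Elixir pipeline:
--   sales |> filter(amount > 100) |> group_by(region) |> summarise(total: sum(amount))
WITH
  source AS (SELECT * FROM sales),
  filtered AS (SELECT * FROM source WHERE amount > 100)
SELECT region, sum(amount) AS total
FROM filtered
GROUP BY region;
```

Because the whole pipeline arrives as one query, DuckDB's optimizer can push the filter down and prune columns across stages, rather than materializing each step.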

Early benchmarks (10M rows, Apple M4 Max) show Dux outperforming Explorer (Polars) by up to 2.5× for lazy filters (24 ms vs 59 ms) and 1.6× for group+summarise (40 ms vs 63 ms).

connections.duckdb: Play the New York Times Connections puzzle with DuckDB!

TL;DR: Tom Jakubowski built the New York Times Connections puzzle entirely in DuckDB using SQL macros and views.

The goal is to sort a grid of 16 words into 4 groups that each share a hidden category. Open the puzzle with the DuckDB CLI (duckdb https://www.tjak.dev/connections.duckdb), run select * from todays_puzzle;, and submit guesses with FROM guess_category_today(['CONTEST', 'GAME', 'BATTLE', 'CLASH']);. All game state and validation live inside the database itself, implemented with SQL macros and views.

Lance Extension

TL;DR: The Lance extension enables read/write of Lance datasets in DuckDB with vector, full-text, and hybrid search via dedicated SQL functions.

Lance is a columnar, open-table format optimized for ML/AI workloads and vector search. Hao Ding did the heavy lifting in adding support for reading and writing Lance tables. You can query via replacement scans and write with COPY (...) TO 'path/dataset.lance' (FORMAT lance, MODE 'overwrite'|'append');. Search functions include lance_vector_search(...), lance_fts(...), and lance_hybrid_search(...).
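A quick sketch of the round trip, based on the function names above; the dataset path, table, and the argument shapes for the search call are illustrative guesses rather than the extension's exact signature:

```sql
-- Install and load the community extension (illustrative)
INSTALL lance FROM community;
LOAD lance;

-- Write a DuckDB table out as a Lance dataset
COPY (SELECT * FROM docs) TO 'data/docs.lance' (FORMAT lance, MODE 'overwrite');

-- Read it back via a replacement scan, just like a Parquet path
SELECT count(*) FROM 'data/docs.lance';

-- Vector search over an embedding column (argument list is illustrative)
SELECT * FROM lance_vector_search('data/docs.lance', 'embedding',
                                  [0.1, 0.2, 0.3]::FLOAT[3], 10);
```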

neovim-web: A website framework with Vim keybindings, Telescope fuzzy finder, and DuckDB SQL console to query site content

TL;DR: A zero-build Neovim-themed website framework with a built-in DuckDB SQL console for querying site content.

Volker integrates a DuckDB SQL console directly into the browser (via the :sql command), using DuckDB-Wasm for client-side execution with no server-side processing. It's a fun way to explore in-browser SQL. Check Volker's website and type FROM pages; to try it, or clone the repo to build your own.

MotherDuck Now Speaks Postgres

TL;DR: MotherDuck now provides a PostgreSQL wire-protocol endpoint so you can run DuckDB SQL from any Postgres-compatible client without installing DuckDB libraries.

Point your existing Postgres client at pg.us-east-1-aws.motherduck.com:5432, authenticate with a MotherDuck token, and offload analytics while keeping OLTP Postgres lean. SQL remains DuckDB's dialect (largely PostgreSQL-compatible).

Existing drivers, poolers, and query patterns work unchanged. Supported clients include JDBC, rust-postgres, and node-postgres. Data movement from Postgres can be done with ETL tools or the pg_duckdb extension.
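Once a Postgres client is connected to the endpoint, the queries themselves are plain DuckDB SQL. For instance, a psql or JDBC session could run something like this (the bucket path and column names below are placeholders):

```sql
-- DuckDB-dialect SQL sent over the Postgres wire protocol:
-- read Parquet straight from object storage, no staging tables
SELECT station, avg(temp) AS avg_temp
FROM read_parquet('s3://my-bucket/weather/*.parquet')  -- placeholder path
GROUP BY station
ORDER BY avg_temp DESC
LIMIT 5;
```

The point is that DuckDB-only constructs like read_parquet travel over the same wire protocol your existing tooling already speaks.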

Big Data on the Cheapest MacBook

TL;DR: The entry-level MacBook Neo (Apple A18 Pro) handles heavy DuckDB workloads, such as ClickBench and TPC-DS, surprisingly well.

Gábor from DuckDB benchmarked the MacBook Neo with ClickBench (100M rows, 5 GB memory limit), yielding sub-second cold run medians, and TPC-DS at SF100 with a 1.63-second query median. Even the demanding SF300 completed in 79 minutes, though with significant disk spills.

quack-rs: A Rust SDK for building DuckDB loadable extensions

TL;DR: quack-rs is a pure-Rust SDK wrapping DuckDB's C Extension API (v1.1+) to eliminate all C/C++ glue code and FFI pitfalls when building loadable extensions.

Previously, writing Rust-based DuckDB extensions required C++ glue and CMake tooling. The SDK wraps the C Extension API with safe, idiomatic abstractions and eliminates 16 documented FFI pitfalls, including silent NULL corruption and double-free in aggregate callbacks. The generate_scaffold function produces all 11 files needed for a community extension submission.

This means community extensions can now be built in Rust with its performance and safety guarantees, without needing to know DuckDB internals.

Announcing systing 1.0: Integration of DuckDB and AI accelerates the debugging workflow

TL;DR: Josef Bacik's systing eBPF (extended Berkeley Packet Filter) tracing tool now outputs directly to DuckDB databases, leveraging its speed for real-time AI-driven analysis of complex Linux performance issues.

eBPF is a Linux kernel technology that lets you run small, sandboxed programs directly in the kernel. Systing 1.0 marks a significant shift from generating Perfetto traces to writing system-wide eBPF tracing data into DuckDB databases, addressing earlier problems with overwhelming data volumes and slow SQLite conversions. Josef also built a Claude Code MCP server that analyzes these DuckDB traces, effectively replacing static analysis scripts with dynamic, AI-powered insights. It's a great example of DuckDB's speed unlocking a faster debugging loop.
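Once the trace lands in DuckDB, analysis is just SQL. As a purely hypothetical sketch (systing's actual table names and columns are not documented here — suppose it wrote scheduler events to a table like sched_events(ts, pid, comm, state, duration_ns)), finding the processes that spent the most time blocked becomes one query:

```sql
-- Hypothetical schema: sched_events(ts, pid, comm, state, duration_ns)
SELECT comm,
       sum(duration_ns) / 1e6 AS blocked_ms
FROM sched_events
WHERE state = 'D'            -- uninterruptible sleep
GROUP BY comm
ORDER BY blocked_ms DESC
LIMIT 10;
```

This is the kind of ad-hoc question that an MCP-driven assistant can generate and iterate on far faster than hand-written static analysis scripts.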

DuckDB Jupyter Kernel: Native Go kernel with Arrow IPC streaming and a Perspective viewer

TL;DR: Vladimir said "SQL notebooks deserve better tooling" and delivered a native Go DuckDB Jupyter kernel that streams Arrow IPC to a WASM Perspective viewer, complete with a database explorer for JupyterLab and VS Code.

The kernel runs DuckDB directly (no Python wrapper) and exposes a localhost HTTP API for Arrow IPC streaming and explorer metadata. Perspective renders interactive tables/charts; a 5M-row (237 MB) result was queried in 238 ms and rendered in under 5s. Table detail panels include a Summarize tab computing approx_unique, avg, min, max, count without writing queries. Install via VS Code "Install / Update DuckDB Kernel", or JupyterLab with pip install hugr-perspective-viewer. The kernel is part of Hugr (an open source Data Mesh platform).
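The statistics in that Summarize tab line up with what DuckDB itself computes via its built-in SUMMARIZE command, so you can reproduce them by hand; my_table and amount below are placeholders:

```sql
-- One-shot profile of a whole table: types, approx_unique, min, max, avg, null %, ...
SUMMARIZE my_table;

-- Or the per-column aggregates explicitly, as the Summarize tab shows them
SELECT approx_count_distinct(amount) AS approx_unique,
       avg(amount) AS avg,
       min(amount) AS min,
       max(amount) AS max,
       count(amount) AS count
FROM my_table;
```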

Introducing Embedded Dives

TL;DR: MotherDuck now lets you embed Dives (React+SQL components) with dual execution (cloud + DuckDB-Wasm) yielding 5–20 ms interaction latency.

Developers can integrate AI-created data apps into their applications and websites via <iframe>. The cloud engine handles the initial query and streams results into a local DuckDB-Wasm instance, so subsequent interactions like filtering and aggregations run entirely client-side with zero network roundtrips. Browse examples at the Dive Gallery.

Upcoming Events

MotherDuck Now Speaks Postgres: Fast Analytics Without Changing Your Stack

2026-04-21 · 16:00 · Online

A livestream on MotherDuck's new Postgres wire protocol endpoint — any Postgres-compatible client, driver, or BI tool can query your data warehouse directly. No DuckDB libraries required.

A Practical Guide to Context Management for Data Agents

2026-04-23 · 16:30 · Online

A livestream covering data agents, context management, and business logic implementation with Virgil Data.

DuckDB + MotherDuck Meetup — San Francisco

2026-04-30 · 18:00 · San Francisco, CA, USA

Round 2 of the SF DuckDB + MotherDuck meetup! Talks on Building OpenDuck (distributed DuckDB) and DuckLake 1.0.

High-Performance Data Workflows with Python and DuckDB — PyData x MotherDuck

2026-04-30 · 17:30 · Seattle, WA, USA

Local-first analytics workflows combining Python and DuckDB, scaling to cloud with MotherDuck. A PyData x MotherDuck collaboration.

AI Council

2026-05-12 · 08:00 · San Francisco, CA, USA

Three-day conference at the SF Marriott Marquis with 50+ speakers covering AI engineering, agents, and databases.


PREVIOUS POSTS

Water Town: The Agent Swarm Data Stack

2026/04/15 - Jordan Tigani

In a fully agentic world, will we still need analytics at all? A particularly unhinged example might offer some clues.

Announcing DuckLake 1.0 on MotherDuck

2026/04/16 - Alex Monahan

MotherDuck now supports DuckLake 1.0, the open table lakehouse format designed for simplicity and low latency. Learn what's new in the 1.0 release, including data inlining, clustering, bucket partitioning, geometry and variant types, plus multi-engine support. Learn how DuckLake compares with Apache Iceberg and Delta Lake.