
This Month in the DuckDB Ecosystem: June 2023




Hey, friend 👋

It’s Marcos again, aka “DuckDB News Reporter”, with another issue of “This Month in the DuckDB Ecosystem” for June 2023.

This month shows once again the rising popularity of DuckDB as a developer tool: from analyzing music data to running SQL on 50k+ datasets in the Hugging Face Hub, and from generating dummy data to analyzing your own Fitbit data.

As always, this is a two-way conversation: if you have any feedback on this newsletter, feel free to send us an email.



Max Gabrielsson

Max Gabrielsson is a junior software engineer at DuckDB Labs, but he has already made some impressive waves! He’s the creator of the official DuckDB spatial extension. While it’s still a work in progress, it’s already a welcome addition for geospatial data processing. You can read more about this one here.

Learn more about Max here

DuckDB: run SQL queries on 50,000+ datasets on the Hugging Face Hub


The Hugging Face team just announced an integration with DuckDB, which means you can now run SQL queries on 50k+ datasets on its Hub.

Correlated Subqueries in SQL

Correlated subqueries in DuckDB let you write complex queries that are more readable and easier to maintain.
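If the term is new to you: a correlated subquery is an inner query that refers to a column of the outer query. Here is a minimal sketch of the idea, not taken from the linked post. The employees table and its rows are made up for illustration, and Python's built-in sqlite3 module is used only so the snippet runs without installing anything. The query is standard SQL and should behave the same way in DuckDB.

```python
import sqlite3

# Hypothetical sample data, invented for this illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
con.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "eng", 120), ("Bob", "eng", 100), ("Cid", "ops", 80), ("Dee", "ops", 90)],
)

# The inner query references e.dept from the outer query, which is what
# makes it "correlated": conceptually, it is evaluated once per outer row.
query = """
    SELECT name, dept, salary
    FROM employees AS e
    WHERE salary > (
        SELECT AVG(salary)
        FROM employees
        WHERE dept = e.dept
    )
    ORDER BY name
"""

# Employees who earn more than the average of their own department.
rows = con.execute(query).fetchall()
print(rows)
```

Without the subquery, the same result would need a separate aggregate per department joined back to the table, which is usually harder to read.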

Shredding deeply nested JSON, one vector at a time by Laurens Kuiper - DuckDB Labs

In this video, Laurens shows how to work with deeply nested JSON data in DuckDB.

What's the hype behind DuckDB?

Matt Palmer shares a very interesting perspective in this post on why DuckDB is so popular these days.

DuckDB + Dagster

The Dagster team just released a tutorial showing how to combine the DuckDB I/O Manager with Dagster’s Software-Defined Assets. If you use Dagster in production today, you will benefit a lot from this seamless integration.

Cross-filtering 10 Million Entries with FalconVis + DuckDB

Researchers from the CMU Data Interaction Group just shared this notebook on Observable, where they combined the power of FalconVis and DuckDB to cross-filter 10 million rows.

My (very) personal data warehouse — Fitbit activity analysis with DuckDB


In this post, Simon Aubury analyzes his own Fitbit activity with the help of DuckDB and Seaborn.

clickhouse-local vs DuckDB on Two Billion Rows of Costs


The Vantage team shared an insightful comparison between clickhouse-local and DuckDB. The post is worth a read because it highlights an important reason why people are selecting DuckDB for more and more projects: developer productivity with DuckDB is just awesome.

DuckDB: Generate dummy data with user-defined functions (UDFs)


Mark Needham (a regular in this newsletter) wrote about how to use UDFs in DuckDB to generate dummy data. If you are a visual person, you can watch the video Mark made explaining the same thing here.

Graph components with DuckDB

Max Halford shows a simple way to work with graphs using Python and DuckDB.

Music Stats with DuckDB

Arthur Dryomov wrote about how to analyze music data with DuckDB.

Using DuckDB to query beneficial ownership data in Parquet files

In this post, Stephen Abbott Pugh explains in great detail how DuckDB could be the perfect tool to work with the Beneficial Ownership Data Standard (BODS).

Upcoming events

DuckCon in San Francisco - 29th June

“DuckCon,” the DuckDB user group, will be held for the first time outside of Europe, at the San Francisco Museum of Modern Art (SFMOMA), in the Phyllis Wattis Theater. In this edition, there will be talks from DuckDB creators Hannes Mühleisen and Mark Raasveldt about the current state of DuckDB and future plans. It will also feature talks from data industry notables Lloyd Tabb (of Looker and Malloy fame) and Josh Wills (creator of dbt-duckdb). The full agenda is available here.

Grab your ticket here, as there is limited space!

MotherDuck Party in San Francisco - 29th June

Following DuckCon, MotherDuck will host a party celebrating ducks at 111 Minna (located very close to SFMOMA). DuckCon attendees are cordially invited to attend to eat, drink, listen to music and play games (skeeball!). MotherDuck’s Chief Duck Herder will also demo the latest work bringing DuckDB to the cloud.

Register now before they run out of space!

Data + AI Summit - 28th and 29th June

DuckDB co-creator Hannes will be giving a keynote at this 10-track data conference hosted by Databricks. Additionally, Ryan Boyd (co-founder at MotherDuck) will be delivering a technical session: “If A Duck Quacks In The Forest And Everyone Hears, Should You Care?”
