This Month in the DuckDB Ecosystem: June 2023
2023/06/16 - 5 min read
Hey, friend 👋
It’s Marcos again, aka “DuckDB News Reporter”, with another issue of “This Month in the DuckDB Ecosystem” for June 2023.
This month again shows DuckDB’s rising popularity as a developer tool: people are using it to analyze music data, query 50k+ datasets on the Hugging Face Hub, generate dummy data, and analyze their own Fitbit data.
As always, this is a two-way conversation: if you have any feedback on this newsletter, feel free to email us at duckdbnews@motherduck.com.
-Marcos
Featured Community Member
Max Gabrielsson
Max Gabrielsson is a junior software engineer at DuckDB Labs, and he has already made some impressive waves! He’s the creator of the official DuckDB spatial extension. While it’s still a work in progress, it’s a very welcome addition for any geodata processing. You can read more about it here.
Top DuckDB Links this Month
DuckDB: run SQL queries on 50,000+ datasets on the Hugging Face Hub
The Hugging Face team just announced the integration with DuckDB, which means that now you can use the simplicity of SQL on 50k+ datasets on its Hub.
Correlated Subqueries in SQL
This new feature from DuckDB will allow building more readable and easier-to-maintain complex queries.
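For readers who have not met the term, here is a minimal sketch of what a correlated subquery looks like: the inner query refers to a column of the outer query's current row. The table, columns, and sample rows are made up for illustration. The snippet runs on Python's built-in sqlite3 module purely so it needs no installation; the query itself is standard SQL and should run unchanged in DuckDB.

```python
import sqlite3

# Hypothetical sample data: (name, dept, salary)
SAMPLE = [
    ("Ada", "eng", 120),
    ("Bob", "eng", 100),
    ("Cy", "ops", 80),
    ("Di", "ops", 90),
    ("Eve", "ops", 70),
]


def above_department_average(rows):
    """Return employees who earn more than the average of their own department."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
    con.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    # The inner SELECT is "correlated": it references e.dept from the outer row,
    # so it is conceptually re-evaluated for each employee.
    query = """
        SELECT name, dept, salary
        FROM employees AS e
        WHERE salary > (
            SELECT AVG(salary)
            FROM employees AS peers
            WHERE peers.dept = e.dept
        )
        ORDER BY name
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


print(above_department_average(SAMPLE))
```

Without the correlated form, the same question usually needs a separate aggregation joined back to the table, which is the kind of query this style makes easier to read.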
Shredding deeply nested JSON, one vector at a time by Laurens Kuiper - DuckDB Labs
In this video, Laurens shows how to work with deeply nested JSON data in DuckDB.
What's the hype behind DuckDB?
Matt Palmer shares a very interesting perspective in this post on why DuckDB is so popular these days.
DuckDB + Dagster
The Dagster team just released a tutorial showing how to combine the DuckDB I/O Manager with Dagster’s Software-Defined Assets. If you use Dagster in production today, you will benefit a lot from this seamless integration.
Cross-filtering 10 Million Entries with FalconVis + DuckDB
Researchers from the CMU Data Interaction Group just shared this notebook on Observable, where they combine FalconVis and DuckDB to cross-filter 10 million rows.
My (very) personal data warehouse — Fitbit activity analysis with DuckDB
In this post, Simon Aubury analyzes his own Fitbit activity with the help of DuckDB and Seaborn.
clickhouse-local vs DuckDB on Two Billion Rows of Costs
The Vantage team shared an insightful comparison between clickhouse-local and DuckDB. The post is worth a read because it highlights an important reason people are selecting DuckDB for more and more projects: developer productivity with DuckDB is just awesome.
DuckDB: Generate dummy data with user-defined functions (UDFs)
Mark Needham (a regular in this newsletter) wrote about how to use UDFs in DuckDB to generate dummy data. If you prefer video, you can watch Mark’s walkthrough of the same material here.
Graph components with DuckDB
Max Halford shows a simple way to work with graphs using Python and DuckDB.
Music Stats with DuckDB
Arthur Dryomov wrote about how to analyze music data with DuckDB.
Using DuckDB to query beneficial ownership data in Parquet files
In this post, Stephen Abbott Pugh explains in great detail how DuckDB could be the perfect tool to work with the Beneficial Ownership Data Standard (BODS).
Upcoming events
DuckCon in San Francisco - 29th June
“DuckCon,” the DuckDB user group, will be held outside of Europe for the first time, at the San Francisco Museum of Modern Art (SFMOMA), in the Phyllis Wattis Theater. This edition will include talks from DuckDB creators Hannes Mühleisen and Mark Raasveldt about the current state of DuckDB and future plans. It will also feature talks from data industry notables Lloyd Tabb (of Looker and Malloy fame) and Josh Wills (creator of dbt-duckdb). The full agenda is available here.
Grab your ticket here, as there is limited space!
MotherDuck Party in San Francisco - 29th June
Following DuckCon, MotherDuck will host a party celebrating ducks at 111 Minna (located very close to SFMOMA). DuckCon attendees are cordially invited to attend to eat, drink, listen to music and play games (skeeball!). MotherDuck’s Chief Duck Herder will also demo the latest work bringing DuckDB to the cloud.
Register now before they run out of space!
Data + AI Summit - 28th and 29th June
DuckDB co-creator Hannes will be giving a keynote at this 10-track data conference hosted by Databricks. Additionally, Ryan Boyd (co-founder at MotherDuck) will be delivering a technical session: If A Duck Quacks In The Forest And Everyone Hears, Should You Care?