This Month in the DuckDB Ecosystem: June 2024




Hey, friend 👋

This week marked a pivotal moment for DuckDB: the release of DuckDB 1.0.0. With this “Snow Duck” release, DuckDB is now production-ready, with guaranteed backwards compatibility and improved performance and stability. Our team at MotherDuck congratulates our friends at the DuckDB Foundation and DuckDB Labs on this huge milestone.

I don’t think any of us could have imagined the success DuckDB has had in redefining the conversation around efficient data analytics and the viability of embedded database engines for data big and small. The movement is underway and accelerating, in no small part thanks to the amazing community around DuckDB, which has contributed to the core library, built the extension ecosystem, helped other data people use DuckDB, and shared their experiences.

Here’s to you, the amazing DuckDB community: 🍻

Our Featured Community Member this month is a repeat: the very first person featured in this newsletter returns because of how pivotal he’s been in the march towards 1.0. He is DuckDB co-creator Mark Raasveldt.


co-founder @ MotherDuck


Mark Raasveldt

Mark Raasveldt (mytherin) is the co-creator and one of the driving forces behind DuckDB. He was pivotal in getting DuckDB to 1.0.0.

During his studies at CWI in the Netherlands, he recognized the need for a database tailored to analytical workloads, leading him to co-found DuckDB with Hannes Mühleisen. As the CTO of DuckDB Labs, Mark's expertise in database internals and performance optimization has been instrumental in shaping DuckDB's architecture, from efficient data ingestion to advanced query processing techniques.

With a vision for further enhancements and expanded capabilities, Mark continues to lead DuckDB's development, fostering a vibrant community of contributors.


How We Fused DuckDB into Postgres with Crunchy Bridge for Analytics

The team at Crunchy Data has integrated DuckDB into Postgres as part of Crunchy Bridge for Analytics. This fusion brings DuckDB's speed and efficiency to in-database analytics, without the need to move data into a separate system.

Accessing 150k Hugging Face Datasets with DuckDB, query using GPT-4o

Explore how DuckDB can be used to access and analyze the more than 150,000 datasets available on Hugging Face, and how GPT-4o can help with writing the queries.

Enhancing DuckDB UNIX Pipe Integration with shellfs

Discover how Rusty Conover's shellfs extension improves DuckDB's integration with UNIX pipes, making it easier to handle data streams efficiently and streamlining data processing tasks in UNIX environments.

DuckDB In-Process Python Analytics for Not-Quite-Big Data

Learn how DuckDB enables in-process analytics in Python, offering an efficient solution for medium-sized data. This tutorial covers the practical implementation and benefits of using DuckDB for Python-based data analysis.

Working with Cron Expressions in DuckDB

Rusty Conover is featured twice this month! In this article, Rusty provides a comprehensive guide to using cron expressions within DuckDB, covering their syntax and their use cases for scheduling and automating repetitive tasks.

My First Billion Rows in DuckDB

A detailed tutorial on handling large datasets efficiently with DuckDB, showcasing its performance and scalability. The article highlights practical tips and techniques for working with billion-row datasets in DuckDB.

A Way to Production-Ready AI Analytics with RAG

GoodData Developers discuss leveraging DuckDB for robust AI analytics in production environments. This article explores the practical applications and benefits of using DuckDB in AI-driven analytics workflows. 

Quack Quack Ka-Ching: Cut Costs by Querying Snowflake from DuckDB

Learn how querying Snowflake from DuckDB can help you reduce costs significantly. This article provides insights into cost-saving strategies and performance optimization techniques for data querying. 

Search Using DuckDB - Part 2

MotherDuck continues its series on search with DuckDB. Part 2 delves deeper into advanced search techniques and practical implementations.


Upcoming Events

Data & AI Summit

10-13 June, San Francisco, USA

There are a number of DuckDB-related happenings this week at Databricks’ Data & AI Summit. Hannes will be featured in the keynote session on Thursday. There is also a breakout session by the Databricks team on Delta Lake and DuckDB. Lastly, MotherDuck is hosting a party for the DuckDB community on Tuesday evening.

DuckCon #5 in Seattle

15 August, Seattle, WA, USA

DuckDB Labs is excited to hold the next “DuckCon” DuckDB user group meeting in Seattle, WA, sponsored by MotherDuck. The meeting will take place on Thursday, August 15, 2024, at the SIFF Cinema Egyptian.

As is traditional at DuckCons, it will start with a talk from DuckDB’s creators Hannes Mühleisen and Mark Raasveldt about the state of DuckDB. This will be followed by presentations by DuckDB users. In addition, there will be several lightning talks from the DuckDB community.
