This Month in the DuckDB Ecosystem: May 2023

2023/05/24

Hey, friend 👋

It’s Marcos again, aka “DuckDB News Reporter”, with another issue of “This Month in the DuckDB Ecosystem” for May 2023.

This has been an exciting month for DuckDB and the whole ecosystem: DuckDB 0.8.0 is out, the project reached 10k stars on GitHub (10,200 at the time of writing), and DuckDB now has a Spatial extension, a native Swift API (this is huge), and more.

This shows that DuckDB is more alive than ever, and its adoption curve keeps looking like a hockey stick.

As we always say here, this is a two-way conversation: if you have any feedback on this newsletter, feel free to send us an email at duckdbnews@motherduck.com

-Marcos

Fenjin Wang

Fenjin Wang is an experienced software engineer working as a tech lead at TikTok. He’s the creator and maintainer of duckdb-rs, an ergonomic set of Rust bindings for DuckDB. With an interface similar to rusqlite, it aims to provide a seamless experience for Rust developers working with DuckDB.

Kudos to him and all the contributors for making DuckDB quack in Rust!

DuckDB 0.8.0 is out, codenamed “Fulvigula”

This new release is pretty exciting because it contains a lot of cool and useful new features, such as Pivot/Unpivot, improvements to parallel data import/export, time series joins, user-defined functions for Python, the new Swift API, and much more.
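Of the features listed above, “time series joins” is probably the least self-explanatory. Assuming it refers to an as-of join (each row is matched with the most recent row from another table at or before its timestamp), here is a minimal plain-Python sketch of those semantics. It is only an illustration of the idea, not DuckDB's API or implementation; the function name `asof_join` and the sample data are hypothetical.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Attach to each left row the latest right value with timestamp <= the left timestamp.

    Both inputs are lists of (timestamp, value) tuples.
    `right` must already be sorted by timestamp.
    """
    right_ts = [ts for ts, _ in right]
    joined = []
    for ts, value in left:
        # bisect_right returns how many right rows have a timestamp <= ts
        i = bisect_right(right_ts, ts)
        match = right[i - 1][1] if i > 0 else None  # None when nothing precedes ts
        joined.append((ts, value, match))
    return joined


# Hypothetical data: price quotes and trades keyed by integer timestamps
prices = [(1, 100.0), (5, 101.5), (9, 99.0)]
trades = [(0, "t0"), (1, "t1"), (6, "t2"), (12, "t3")]

result = asof_join(trades, prices)
for row in result:
    print(row)
```

A trade that happens before the first quote gets no match, while every later trade picks up the most recent quote, which is the behavior that makes this kind of join useful for time series data.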

DuckDB Internals (CMU Advanced Databases / Spring 2023)

Mark Raasveldt (CTO at DuckDB Labs) gave an eye-opening lecture about DuckDB. Topics? Why DuckDB uses Vectors, how the Query Execution works inside DuckDB, Table storage, WASM, pluggable catalog, pausable pipelines, etc. Definitely, a video you should check out.

Throwing 107 GB and 5 billion fake rows of order data at DuckDB and Athena

Simon Pantzare dives deep in a technical post comparing how DuckDB and Amazon Athena handle 107 GB and 5 billion rows of fake order data.

Commercializing Open-source Projects by MotherDuck’s Jordan Tigani and DuckDB’s Hannes Mühleisen

An insightful conversation among Jon Turow (Partner at Madrona), Jordan Tigani (CEO at MotherDuck), and Hannes Mühleisen (one of the co-creators of DuckDB).

Scalable Data Science with Ponder on DuckDB

Bala Atur from Ponder shares how to make data science scalable with the help of Ponder and DuckDB. Ponder now transparently uses DuckDB as a backend for both pandas and NumPy operations, making them significantly faster.

DuckDB vs Polars for Data Engineering

Daniel Beach shares his perspective on using DuckDB and Polars for data engineering in this post.

Why We Built Rill with DuckDB

Michael Driscoll (CEO of Rill Data) explains, in his own words, why they rely on DuckDB to build Rill’s product and why it is a perfect fit for their use case.

Efficient DuckDB

Hussain Sultan shares a data-backed deep dive into DuckDB using TPC-H benchmarks. He uses version 0.7.1 of DuckDB for these tests.

How we evolved our query architecture with DuckDB

Jason Cole from Count explains why they selected DuckDB for their browser-first query model.

Discovering Chess Openings in Grandmasters’ Games using Python and DuckDB

Octavian Zarzu takes an interesting approach to analyzing the top 10 openings in chess using Python and DuckDB.

Building a Streamlit app on a Lakehouse using Apache Iceberg & DuckDB

Dipankar Mazumdar writes about how to use the combination of Streamlit, DuckDB, and Apache Iceberg to build a Lakehouse.

Ibis 5.1: Faster file reading with DuckDB, Arrow-Native Workflows for Snowflake, and more

Kae Suarez and Anja Boskovic from Voltron Data discuss the great things coming to the 5.1 release of the Ibis Project, including faster file reading with DuckDB.

Automate Data Analysis With Kestra and DuckDB

Martin-Pierre Roset explains how to use DuckDB and Kestra (a declarative data orchestration platform) to simplify the automation of data analysis.

Delta-RS and DuckDB — Read and Write Delta Without Spark

Alexander Volok shares why you should consider a new approach to faster analytics using Delta-RS and DuckDB.

Upcoming events

DuckCon in San Francisco - 29th June

“DuckCon,” the DuckDB user group meeting, will be held outside of Europe for the first time, at the San Francisco Museum of Modern Art (SFMOMA) in the Phyllis Wattis Theater. This edition will feature talks from DuckDB creators Hannes Mühleisen and Mark Raasveldt about the current state of DuckDB and future plans, as well as talks from data industry notables Lloyd Tabb (of Looker and Malloy fame) and Josh Wills (creator of dbt-duckdb). The full agenda is available here.

Register before the 26th of May, as space is limited!

MotherDuck Party in San Francisco - 29th June

Following DuckCon, MotherDuck will host a party celebrating ducks at 111 Minna (located very close to SFMOMA). DuckCon attendees are cordially invited to attend to eat, drink, listen to music and play games (skeeball!). MotherDuck’s Chief Duck Herder will also demo the latest work bringing DuckDB to the cloud.

Register now before they run out of space!

Data + AI Summit - 28th and 29th June

DuckDB co-creator Hannes will be giving a keynote at this 10-track data conference hosted by Databricks. Additionally, Ryan Boyd (co-founder at MotherDuck) will be delivering a technical session: If A Duck Quacks In The Forest And Everyone Hears, Should You Care?
