
2023/05/04 - Mehdi Ouazza
Data Engineer's Highlights from PyCon DE 2023
It’s Marcos again, aka “DuckDB News Reporter”, with another issue of “This Month in the DuckDB Ecosystem” for May 2023.
This has been an exciting month for DuckDB and the whole ecosystem: DuckDB 0.8.0 is out, the project has reached 10k stars on GitHub (well, 10,200 stars at the time of writing), and DuckDB now has a Spatial extension, a native Swift API (this is huge) and more.
This simply proves that DuckDB is more alive than ever before, and its stratospheric adoption growth curve keeps looking like a hockey stick.
As we always say here, this is a two-way conversation: if you have any feedback on this newsletter, feel free to email us at duckdbnews@motherduck.com
-Marcos

Fenjin Wang is an experienced software engineer working as a tech lead at TikTok. He’s the creator and maintainer of duckdb-rs, which provides ergonomic bindings to DuckDB for Rust. With an interface similar to rusqlite, it aims to provide a seamless experience for Rust developers working with DuckDB.
Kudos to him and all the contributors for making DuckDB quack in Rust!
The new DuckDB 0.8.0 release is pretty exciting because it contains a lot of cool and useful new features, such as Pivot/Unpivot, improvements to parallel data import/export, time series joins, user-defined functions for Python, the new Swift API, and much more.

Mark Raasveldt (CTO at DuckDB Labs) gave an eye-opening lecture about DuckDB. Topics? Why DuckDB uses Vectors, how the Query Execution works inside DuckDB, Table storage, WASM, pluggable catalog, pausable pipelines, etc. Definitely, a video you should check out.
Simon Pantzare dives deep in a technical post comparing data ordering in DuckDB and Amazon Athena.
An insightful conversation among Jon Turow (Partner at Madrona), Jordan Tigani (CEO at MotherDuck), and Hannes Mühleisen (one of the co-creators of DuckDB).
Bala Atur from Ponder shares how to make Data Science scalable with the help of Ponder and DuckDB. Ponder now transparently uses DuckDB as a backend for both pandas and NumPy operations, making them significantly faster.
Daniel Beach shares an exciting perspective about using DuckDB and Polars for Data Engineering in this post.
Michael Driscoll (CEO of Rill Data) explains why they rely on DuckDB to build Rill’s product in his own words, and why it is perfect for its use case.
Hussain Sultan writes about a data-backed deep dive into DuckDB using TPC-H Benchmarks. He uses the 0.7.1 version of DuckDB for these tests.

Jason Cole from Count explains why they selected DuckDB for its browser-first query model.
Octavian Zarzu takes an interesting approach to analyze the top 10 openings in chess using Python and DuckDB.
Dipankar Mazumdar writes about how to use the combination of Streamlit, DuckDB, and Apache Iceberg to build a Lakehouse.
Kae Suarez and Anja Boskovic from Voltron Data discuss the great things coming to the 5.1 release of the Ibis Project, including faster file reading with DuckDB.
Martin-Pierre Roset explains how to use the power of DuckDB and Kestra (a declarative data orchestration platform) to make the automation of Data Analysis simpler.

Alexander Volok shares why you should consider a new approach to faster analytics using Delta-RS and DuckDB.
“DuckCon,” the DuckDB user group, will be held for the first time outside of Europe, at the San Francisco Museum of Modern Art (SFMOMA), in the Phyllis Wattis Theater. In this edition, there will be talks from DuckDB creators Hannes Mühleisen and Mark Raasveldt about the current state of DuckDB and future plans. It will also feature talks from data industry notables Lloyd Tabb (of Looker and Malloy fame) and Josh Wills (creator of dbt-duckdb). The full agenda is available here.
Following DuckCon, MotherDuck will host a party celebrating ducks at 111 Minna (located very close to SFMOMA). DuckCon attendees are cordially invited to attend to eat, drink, listen to music and play games (skeeball!). MotherDuck’s Chief Duck Herder will also demo the latest work bringing DuckDB to the cloud.
Register now before they run out of space!
DuckDB co-creator Hannes will be giving a keynote at this 10-track data conference hosted by Databricks. Additionally, Ryan Boyd (co-founder at MotherDuck) will be delivering a technical session: If A Duck Quacks In The Forest And Everyone Hears, Should You Care?


2023/05/11 - Jordan Tigani
Explores why scale-out became so dominant, whether those rationales still hold, and some joyful advantages of scale-up architecture.