This Month in the DuckDB Ecosystem: September 2024

2024/09/03 - 5 min read

Hey, friend 👋

This is your usual data cap dude again, aka Mehdio. I hope you are all taking a break from the online world this summer to recharge. I was out too and had to catch up on a LOT of content published over the past few weeks. A major announcement was pg_duckdb, the official Postgres extension for DuckDB, but we also had a couple of pieces of content around geospatial (including one made by yours truly).

DuckDB 1.1 is right around the corner, so expect next month to be interesting too! By the way, DuckDB publishes its release calendar, in case you didn't know.

Simon Aubury & Ned Letcher

Simon Aubury and Ned Letcher, authors of Getting Started with DuckDB, bring a wealth of experience in data engineering and software development. Simon, with a background in creating robust data systems for various industries since 2000, and Ned, a data science and software engineering consultant since completing his PhD, combine their expertise to guide readers through enhancing data workflows with DuckDB. It's great to see more books about DuckDB, so if you are starting your journey with DuckDB, it's definitely worth a read!

Practical Applications for DuckDB (with Simon Aubury & Ned Letcher)

Thanks to Kris Jenkins and his awesome YouTube channel Developer Voices, we got an insightful conversation with the book's authors! They dive into how DuckDB simplifies data wrangling, enhances edge processing, and integrates smoothly with programming languages like R and Python.

Ibis Dropping Pandas Support - DuckDB is the Default

Ibis is a portable Python dataframe library that lets you write your pipeline once and run it on different engines like DuckDB, Polars, DataFusion, or PySpark. They used to support Pandas as a backend but are dropping it as of version 10.0. As they mentioned: "There is no feature gap between the `pandas` backend and our default DuckDB backend, and DuckDB is _much_ more performant."

PostgreSQL in Line for DuckDB-Shaped Boost in Analytics Arena

A big release this month was the official open source Postgres extension for DuckDB, where multiple companies (including MotherDuck) are partnering to provide the best analytical experience directly in Postgres! While it's still in an experimental state, this is a big project that will get significant resources and attention, so stay tuned! You can also read our blog about this release.

Modern GIS with DuckDB

Geospatial analysis always seemed like a niche in data that was hard to access, because the toolkit and knowledge it required were significantly different from what you use in everyday data work. Thanks to DuckDB, that's not the case anymore. I put together a getting-started video on how to create your first heatmap using open EV charging spot data.

Letsql, a Multi-Engine Supporting DuckDB

Letsql is another multi-engine framework, like Ibis but much younger. The blog linked above discusses their caching feature for upstream source data, which lets you cache the results of a SQL query in a dataframe for rapid iteration. It's great to see multiple tools adopting this strategy to avoid cloud dependency during development and significantly improve the overall developer experience.

DuckDB Tricks

Gabor from DuckDB Labs shows us some kung fu SQL, or rather, some underrated functions through this pragmatic blog. For instance, did you know that in the CLI the `.schema` command will show all of the SQL statements used to define the schema of the database?!

Ibis + DuckDB Geospatial: A Match Made on Earth

The annual SciPy Conference is a gathering where participants from various sectors showcase projects, learn from experts, and collaborate on Scientific Python development. In this talk, Naty Clementi from Voltron Data explains how you can leverage Ibis and DuckDB for geospatial work.

How to Bootstrap a Data Warehouse with DuckDB

A couple of MotherDuckers were at SciPy to run a SQL workshop and to present! In this talk, Guen from our ecosystem team demonstrates how you can bootstrap a data warehouse with DuckDB and MotherDuck. No sales fluff, just a straightforward project to get you started.

Why Do People Like DuckDB

The r/dataengineering subreddit is a popular and insightful place to learn about others' experience with data tools (if you ignore the troll comment here and there ;-). This thread shows how people are currently using DuckDB. Many commenters compare their experience with SQLite, Pandas, and others.

How DuckDB Function Chaining Works

In this video, Mark Needham shows how to make a long SQL script more readable using DuckDB function chaining with the `.` operator. Once again, it's something I haven't seen many people use, but it's really useful for making your code cleaner!

Upcoming Events

dbt Data Modeling Challenge

9 September - online

Paradime, Hex, and MotherDuck have joined forces to bring data professionals worldwide a one-of-a-kind competition with some sweet prizes. Showcase your dbt, analytics, and SQL prowess on a global stage for a panel of esteemed judges!

Data Engineering for AI/ML

12 September - online

Organized by the MLOps community. Hannes (co-creator of DuckDB) and Mehdi (data engineer & developer relations at MotherDuck) will each give a talk about data, and of course, ducks.

Small Data SF

24 September, San Francisco, CA, USA

Small data and AI are more powerful than you think. Data and AI that were once "Big" can now be handled by a single machine. Join MotherDuck, Ollama, Turso, and Cloudflare in San Francisco.

Location: San Francisco, CA 🌁 - 8:00 AM America/Los_Angeles

Type: In Person
