DuckDB Ecosystem Newsletter: 0.7.0 Released and More
2023/02/22 - 5 min read
Hey, friend 👋
Hi, I'm Marcos! I'm a data engineer by day at Riot Games (via X-Team). By night, I create newsletters for a few topics I'm passionate about: helping folks find data gigs and AWS Graviton. After getting involved in the DuckDB community, I saw a great opportunity to partner with the MotherDuck team to share all the amazing things happening in the DuckDB ecosystem.
In this issue, we wanted to share the incredible talks from DuckCon 2023, along with many articles that came out in the second half of January and the first days of February. As each month goes by, more great content is being published in the DuckDB ecosystem, so we've had to make some difficult choices for the featured community member and top links.
We hope you enjoy!
-Marcos
Feedback: duckdbnews@motherduck.com
Tweet great links to us with #DuckDBMonthly
Featured Community Member
Pedro Holanda
Pedro is a Post-Doc based in Amsterdam and a member of the Database Architecture group at CWI. He is currently working as Chief of Operations at DuckDB Labs.
You can find him on Twitter @holanda_pe
New DuckDB Release: 0.7.0
The DuckDB team recently announced DuckDB 0.7.0! This new release introduces JSON ingestion through read_json, partitioned Parquet and CSV export, attaching multiple DuckDB databases in the same instance, SQLite storage backend, UPSERTs, LATERAL and POSITIONAL joins, improved Python APIs, better compression and more. The DuckDB community has clearly been heads down coding!
Top 10 DuckDB Links this Month
DuckCon Brussels 2023: Talks by DuckDB Creators, MotherDuck, LakeFS, Hopsworks, Fluvio
DuckCon this year had an exciting mix of talks from the core DuckDB team and the community. Catch them all on the playlist above.
Want to learn how to build DuckDB Extensions? In their talk, Pedro and Sam teased the audience about the power of DuckDB Extensions and what you can achieve with them easily by cloning their example project.
DuckDB: Bringing Analytical SQL directly to your Python shell
In this talk at FOSDEM 2023, Pedro explains how DuckDB fits perfectly inside the Python ecosystem and closes with a cool demo using DuckDB, Pandas, and PySpark.
PyIceberg 0.2.1: Iceberg ❤️ PyArrow & DuckDB
In this video, Tabular’s team demonstrated the new features of PyIceberg 0.2.1. If you prefer the article, here is the complete write-up on Medium.
Solving Advent Of Code With DuckDB And dbt
A very interesting article from Graham Wetzler about how he used DuckDB and Python to solve some of the Advent of Code challenges.
Querying 1 Billion Rows of AWS Cost Data 100X Faster with DuckDB
According to the Vantage team, DuckDB was between 4X and 200X faster than Postgres for this use case, across everything from simple reads to complex writes and data ingestion.
Command Line Data Visualization with DuckDB and YouPlot
In this video, Mark Needham teaches us how to create data visualizations on the command line using YouPlot, DuckDB, and a bit of Pandas.
Streaming Data Pipelines with Striim + DuckDB
In this interesting article, Pedram Navid explains how to set up a streaming data pipeline with the help of Striim (an enterprise-grade CDC platform) and DuckDB.
Python Faker for DuckDB Fake Data Generation
In this article, Ryan shows a simple way to generate fake data with Python and load it into DuckDB.
Pragmatism About Data Stacks with Pedram Navid of West Marin Data
On The Data Stack Show, Eric and Kostas chat with Pedram Navid, owner of West Marin Data and frequent contributor to Substack. During the episode, Pedram discusses the modern data stack and its complexities, modern tooling, early-stage startups, and more.
DuckDB now supports ON CONFLICT clause on upserts
Now available in the latest 0.7.0 release. Thanks to Alex Monahan for the tip on Twitter.
Upcoming Events
Data Council Austin at the end of March will feature three days of technical talks on analytics, data engineering, data science and AI. Nicholas Ursa, co-founder and software engineer at MotherDuck, will present "Data Warehouses are Gilded Cages. What Comes Next?"
QCon London, also at the end of March, is a software development conference featuring some of the brightest minds across software. Hannes Mühleisen, co-creator of DuckDB, will present "In-Process Analytical Data Management with DuckDB."
Modern Data Stack Conference (MDS Con) by Fivetran at the beginning of April in San Francisco will feature leaders in the industry such as DJ Patil, George Fraser, Tristan Handy, Ali Ghodsi, renowned analyst Sanjeev Mohan and Data Council founder Pete Soderling. Ryan Boyd, co-founder at MotherDuck, will be on a panel with Gabi Steele (CEO, Preql) and Chetan Sharma (CEO, Eppo).
Subscribe to the Newsletter
You can subscribe to the blog using RSS, or elect to join our mailing list for either the DuckDB Ecosystem Newsletter, MotherDuck News or both!