
2025/01/22 - Simon Späti
The Data Engineering Toolkit: Essential Tools for Your Machine
Master the essential data engineering toolkit—Linux commands, Docker, Python, SQL, and developer tools. A practical guide to the tools every DE needs.
- 4 min read
MotherDuck support for DuckDB 1.2 has arrived, and with it comes a wave of improvements that make analytics in your data warehouse faster and more intuitive. We’re always excited to see how DuckDB pushes the boundaries of performance and usability, and the 1.2 release delivers on both fronts.
Whether you’re crunching CSVs, writing SQL, or optimizing complex queries, DuckDB 1.2 brings major enhancements to help you work more efficiently, and we’re proud to support it from the outset. Our early support for DuckDB 1.2 was possible thanks to close collaboration with the DuckDB community as we tested and verified the upcoming release.
This blog highlights key improvements in performance, the SQL experience, CSV handling, and scalability.
Performance has always been a strength of DuckDB, and 1.2 takes it to new heights. Several core enhancements boost query speed, particularly for common real-world use cases.
Sorting and retrieving the top N records in a dataset is a frequent operation in analytics. DuckDB 1.2 now leverages a heap-based approach to make Top N queries faster. That means dashboards, ranking reports, and percentile calculations all see noticeable performance gains.
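To see why a heap helps with Top N, here is a minimal Python sketch of the general technique. It is not DuckDB's implementation; the `top_n` function and the `sales` rows are hypothetical names made up for illustration. The idea is to keep only the N best rows seen so far in a bounded min-heap, so the full dataset never has to be sorted.

```python
import heapq


def top_n(rows, n, key):
    """Return the n largest rows by `key` using a bounded min-heap.

    Illustrative only: a sketch of the general heap-based Top N idea,
    not DuckDB's actual operator.
    """
    if n <= 0:
        return []
    heap = []  # holds at most n entries; the smallest kept key sits at heap[0]
    for index, row in enumerate(rows):
        # The index breaks ties, so the rows themselves are never compared.
        entry = (key(row), index, row)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:
            # The new row beats the weakest kept row: swap it in.
            heapq.heapreplace(heap, entry)
    # Only the n survivors are sorted, not the whole input.
    return [row for _, _, row in sorted(heap, reverse=True)]


# Hypothetical sample data.
sales = [
    {"product": "apples", "revenue": 120},
    {"product": "pears", "revenue": 45},
    {"product": "plums", "revenue": 300},
    {"product": "figs", "revenue": 80},
]
top_two = top_n(sales, 2, key=lambda r: r["revenue"])
print([r["product"] for r in top_two])
```

Each incoming row costs at most one heap operation on N entries, which is roughly O(rows × log N) work instead of the O(rows × log rows) of a full sort.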
If you work with datasets containing long string values, DuckDB 1.2 introduces ZSTD-based string compression, resulting in better compression and faster write speeds. For MotherDuck users, this translates to faster reads and more efficient storage.
Grouping and summarizing large datasets is now faster thanks to partition-aware aggregation and other hash table optimizations. For example, aggregations on Hive-partitioned datasets now benefit from better data locality, leading to major efficiency improvements.
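The intuition behind partition-aware aggregation can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not DuckDB's code: the `partitions` dictionary stands in for a Hive layout such as `year=2023/` and `year=2024/`, and both function names are hypothetical. When the grouping key is the partition key, no group spans two partitions, so each partition can be aggregated on its own.

```python
from collections import defaultdict

# Hypothetical data: every partition holds rows for exactly one year,
# as a Hive-style directory layout would guarantee.
partitions = {
    "year=2023": [("2023", 10.0), ("2023", 5.0)],
    "year=2024": [("2024", 7.5), ("2024", 2.5), ("2024", 1.0)],
}


def aggregate_global(partitions):
    """Baseline: one hash table shared by the rows of every partition."""
    totals = defaultdict(float)
    for rows in partitions.values():
        for year, amount in rows:
            totals[year] += amount
    return dict(totals)


def aggregate_per_partition(partitions):
    """Partition-aware: a small, local hash table per partition.

    Because no group appears in more than one partition, the local results
    can simply be collected; no cross-partition merge is needed.
    """
    result = {}
    for rows in partitions.values():
        local = defaultdict(float)
        for year, amount in rows:
            local[year] += amount
        result.update(local)
    return result


print(aggregate_per_partition(partitions))
```

Both functions return the same totals; the second one only ever touches a hash table sized for a single partition, which is where the better data locality comes from.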
DuckDB 1.2 isn’t just about efficiency gains: it also introduces features that make SQL more intuitive and expressive.
New shorthand syntax makes it easier to select and rename columns on the fly:
- SELECT * LIKE '%name%' selects only the columns whose names match a pattern
- SELECT * RENAME renames multiple columns inline
- SELECT new_col: x + 1, another: x + 2 puts each alias in front of its expression (prefix aliases)
Previously, summing a Boolean condition required wrapping it in a CASE WHEN expression. Now, you can sum it directly with SUM(price > 50), making queries both cleaner and faster.
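The equivalence between the CASE WHEN form and the direct Boolean sum can be checked with a small script. As a loudly stated assumption, this sketch runs against SQLite (through Python's built-in `sqlite3` module) purely as a stand-in, because it ships with Python and treats comparisons as 0/1 integers; it is not DuckDB, and the `items` table and its rows are invented for illustration.

```python
import sqlite3

# Stand-in engine: SQLite, NOT DuckDB. Used only because it needs no install.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("lamp", 80.0), ("mug", 12.5), ("desk", 240.0), ("pen", 3.0)],
)

# Verbose form: turn the condition into 1 or 0 by hand, then sum.
verbose = conn.execute(
    "SELECT SUM(CASE WHEN price > 50 THEN 1 ELSE 0 END) FROM items"
).fetchone()[0]

# Direct form: sum the Boolean condition itself.
direct = conn.execute("SELECT SUM(price > 50) FROM items").fetchone()[0]

print(verbose, direct)
conn.close()
```

Both queries count the rows priced above 50; the direct form just drops the manual conversion.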
Writing SQL is easier than ever with a more intelligent autocomplete engine that provides context-aware suggestions. Plus, the DuckDB CLI gets a fresh upgrade with syntax highlighting and thousands-separator support for better readability.
Reading CSV files remains one of the most common tasks in data analysis, and DuckDB 1.2 makes it even faster and more memory-efficient. Compression and filter pushdown optimizations speed up ingestion, while improved error handling makes dealing with messy data smoother than before.
Many enterprises still rely heavily on Excel files, and handling them in DuckDB has traditionally been done through the spatial extension. Although not technically part of DuckDB 1.2, we want to highlight the newly improved Excel extension, which now supports both reading and writing Excel files. It works well with MotherDuck's Dual Execution query engine: Excel files are read on your local DuckDB client and can be referenced in your SQL queries, so you can upload local data to MotherDuck or JOIN it with MotherDuck tables in the cloud.
Reliability matters, and DuckDB 1.2 includes several robustness improvements that directly benefit MotherDuck users:
DuckDB 1.2 brings meaningful improvements across the board, making it faster, friendlier, and more scalable. At MotherDuck, we’re thrilled to see these optimizations in action, delivering even better performance for our users. Whether you're handling CSVs, running analytical queries, or writing SQL with ease, DuckDB 1.2 makes the experience smoother and more powerful.
