How it started
During the second edition of DuckCon in Brussels, I had the pleasure of interviewing DuckDB co-creator Hannes Mühleisen. Hannes is a researcher at CWI, the Dutch research institute for computer science and mathematics. For ten years he has worked in a group called Database Architectures, which researches how data systems should be built.
In his work, he discovered that some data practitioners, particularly in the R community, were not using databases at all. Instead, they used hand-rolled dataframe engines and dataframes in memory. However, these dataframes were slow and limited because of how the engines were structured.
That observation was the first spark that led to the creation of DuckDB.
Databases are cumbersome for local development
Data practitioners were not excited about traditional databases because they’re difficult to install and configure. Running a database locally is not a smooth experience. Plus, database client protocols such as JDBC, built in the 90s, haven’t seen significant upgrades. Hannes wanted to research how he could build a database for these people while removing the hassle of managing one.
SQLite was a big inspiration for DuckDB. SQLite has no server, and it runs in-process as a simple library. However, SQLite was designed for transactional workloads (with row-based storage). This limited its performance for analytical use cases and presented an opportunity. In-process analytics databases are a brand new class of databases, which was exciting for Hannes as a researcher.
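To make the "in-process, no server" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not DuckDB code and is not taken from the interview; the table name, columns, and sample values are made up for illustration. The point is only that the database lives inside the application process, with nothing to install, start, or connect to over a network.

```python
import sqlite3

# Open an in-memory database: no server process, no configuration, no port.
# (Table name and sample rows below are hypothetical, for illustration only.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (city TEXT, distance_km REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("Brussels", 3.2), ("Amsterdam", 5.1), ("Brussels", 1.8)],
)

# An analytical-style query: aggregate a column across all rows, per group.
rows = conn.execute(
    "SELECT city, COUNT(*), SUM(distance_km) "
    "FROM trips GROUP BY city ORDER BY city"
).fetchall()
print(rows)

conn.close()
```

Aggregations like this one read a few columns across many rows, which is the access pattern that row-based storage handles less efficiently than a columnar layout.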
This was just the beginning of the story and not even close to what we know today as “DuckDB”. And Hannes isn’t done with the DuckDB project. To quote him:
“My definition of success as a researcher is not to write papers but to have an impact. In the area of data systems, it is required to make something that will see widespread use in order to achieve impact.”
Check out the full interview above or directly on YouTube.