The Data Warehouse powered by DuckDB SQL

2024/11/01 - 4 min read

Introducktion

There are many reasons to use a data warehouse, but ultimately its value comes from solving business problems. Of course, this is non-trivial, because great analytical results are downstream of ingestion, transformation, analytical capabilities, and flexibility.

Thankfully, DuckDB offers a powerful language to solve business problems: good ole SQL. DuckDB by itself, being in-process, is not enough to bring this power to the Enterprise, so MotherDuck offers a cloud service to turn the local, in-process power of DuckDB into a Cloud Data Warehouse.

Ingestion

There are myriad tools available for replicating data from sources to targets. But each additional tool adds one more thing to manage, another set of primitives to learn. MotherDuck offers a rich set of ingestion capabilities, all in SQL.

It can natively ingest from CSV, Parquet, JSON, Iceberg, & Delta file formats. It can manage authentication to S3, GCS, Azure Blob Storage, and Cloudflare R2. And that's just the tip of the "Iceberg".
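As a minimal sketch of what SQL-only ingestion can look like, the statements below register S3 credentials and load Parquet and CSV files into tables. The secret name, bucket, paths, key values, and table names are all hypothetical placeholders, and the exact secret options you need may differ by provider and setup.

```sql
-- Register S3 credentials once; every value here is a placeholder (assumption).
CREATE SECRET example_s3_secret (
    TYPE S3,
    KEY_ID 'my_key_id',
    SECRET 'my_secret_key',
    REGION 'us-east-1'
);

-- Load all Parquet files matched by a glob pattern into a table.
-- The bucket and path are hypothetical.
CREATE OR REPLACE TABLE orders AS
SELECT * FROM read_parquet('s3://example-bucket/orders/*.parquet');

-- Load a CSV file; read_csv detects the delimiter, header, and column types.
CREATE OR REPLACE TABLE customers AS
SELECT * FROM read_csv('s3://example-bucket/customers.csv');
```

Because the loads are plain `CREATE TABLE ... AS SELECT` statements, they can be scheduled or version-controlled like any other SQL.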

Of course, for sources that cannot be read directly from MotherDuck, we offer a diverse set of connectors for both Data Warehousing and Data Lake style ingestion.

Transformation

Once data has been loaded into MotherDuck, DuckDB SQL proves to be both incredibly performant and easy to use. It is easy to build fast data transformations with the supported dbt and SQLMesh integrations. For scenarios where SQL is not enough, DuckDB offers native Python DataFrame APIs to allow even the most complex transformations to take place.
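To make this concrete, here is a small, self-contained sketch of a SQL transformation: it builds a stand-in raw table and aggregates it into a reporting table. The table names, columns, and values are invented for illustration; in a dbt or SQLMesh project the second statement would typically live in a model file.

```sql
-- Hypothetical raw table standing in for freshly ingested data.
CREATE OR REPLACE TABLE raw_orders AS
SELECT * FROM (VALUES
    (1, DATE '2024-10-01', 'EU', 120.00),
    (2, DATE '2024-10-01', 'US',  80.50),
    (3, DATE '2024-10-02', 'EU',  42.25)
) AS t(order_id, order_date, region, amount);

-- Transformation: one row per day and region.
-- GROUP BY ALL groups by every non-aggregated column in the SELECT list.
CREATE OR REPLACE TABLE daily_revenue AS
SELECT
    order_date,
    region,
    count(*)    AS order_count,
    sum(amount) AS revenue
FROM raw_orders
GROUP BY ALL
ORDER BY order_date, region;
```

`GROUP BY ALL` is a DuckDB convenience that keeps the grouping columns in sync with the `SELECT` list as the model evolves.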

To learn more about transformation in the Duck Stack, watch the video of our talk at dbt Coalesce 2024 or take a look at a more in-depth example in our blog.

Analysis

From an analytics perspective, MotherDuck offers a rich set of SQL functions that handle everything from simple aggregations to classical machine learning algorithms, like linear regression or k-means. The MotherDuck AI team continues to extend this into the LLM space with prompting, embedding, and similarity functions, again all in SQL, to make deploying AI in your data warehouse simple, fast, and easy to maintain.
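As one illustration of the classical end of that range, the sketch below fits a simple linear regression using DuckDB's regression aggregates. The data is synthetic and generated inline (an assumption made purely for the example), following y = 2x + 1, so the fitted slope and intercept should come out close to 2 and 1.

```sql
-- Synthetic points on the line y = 2x + 1 (illustrative data only).
WITH points AS (
    SELECT x, 2 * x + 1 AS y
    FROM range(10) AS t(x)
)
SELECT
    regr_slope(y, x)     AS slope,      -- fitted slope of y on x
    regr_intercept(y, x) AS intercept,  -- fitted intercept
    regr_r2(y, x)        AS r_squared   -- goodness of fit
FROM points;
```

Note the argument order: these aggregates take the dependent variable first and the independent variable second.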

An example dashboard built with MotherDuck is shown here:

Post Image

For further reading (with examples) around the advanced analytical capabilities of MotherDuck, check out the following posts:

Flexibility

Many data teams are compartmentalized into three sets of roles: Business Users, Data Analysts & Scientists, and Data Engineers. Tools are generally built with these personas in mind. However, most complex business problems require working across multiple roles and thus multiple tools. Furthermore, the most valuable problems often require support from Software Engineers to close the gap. Thankfully, DuckDB SQL offers a toolkit that can be shared across these roles, and is loved by software engineers too! This type of flexibility means that collaboration is easier, and value can be delivered faster.

In addition to powerful SQL, MotherDuck’s built-in AI features, like fix-up, mean that business users can shift their work upstream and look a little bit more like analysts when writing SQL. We have also found that Data Scientists, who are more familiar with R or Python, find our AI-assisted SQL helpful in translating their ideas into SQL. And its developer-focused tooling, like DuckDB-NSQL-7B, means that internal app developers can extend the power of LLMs to their users.

Lastly, when you really need fast analytics for users, MotherDuck offers a WASM library that includes DuckDB in the browser to build customer experiences that are not possible anywhere else.

Summary

MotherDuck offers a unique take on Data Warehousing, powered by DuckDB. In addition to excellent integrations offered by its ecosystem partners, MotherDuck contains native functionality for integration, transformation, and analysis that makes it incredibly flexible for solving complex business problems. Create your account and jump into the getting started guide today!
