Architecture and capabilities
MotherDuck is a serverless cloud analytics service with a unique architecture that combines the power and scale of the cloud with the efficiency and convenience of DuckDB.
MotherDuck's key components are:
- The MotherDuck cloud service
- MotherDuck's DuckDB SDK
- Hybrid execution
- The MotherDuck web UI
The MotherDuck cloud service
The MotherDuck cloud service enables you to store structured data, query that data with SQL, and share it with others. A key MotherDuck product principle is ease of use.
Serverless execution model: You don't need to configure or spin up instances, clusters, or warehouses. You simply write and submit SQL, and MotherDuck takes care of the rest. Under the hood, MotherDuck runs DuckDB and speaks DuckDB's SQL dialect.
Managed storage: You can load data into MotherDuck storage to be queried or shared. MotherDuck storage is durable, secure, and automatically optimized for performance. It is surfaced to you through the catalog and the usual logical primitives: databases, schemas, tables, views, and so on. In addition, MotherDuck can query data outside of MotherDuck storage, such as data on Amazon S3, behind https endpoints, or on your laptop.
The service layer: MotherDuck provides key capabilities such as secure identity, authorization, administration, and monitoring. Currently, billing is not enabled for MotherDuck, and the service is free to use.
Currently, MotherDuck runs in the AWS us-east-1 region. We are working on expanding to other regions and cloud providers.
MotherDuck's DuckDB SDK
If you're using DuckDB in Python or the CLI, you can connect to MotherDuck with a single line of code (.open md: in the CLI, or duckdb.connect("md:") in Python). After you connect, your DuckDB instance becomes supercharged by MotherDuck. MotherDuck's hybrid execution is enabled, and your DuckDB instance gets additional capabilities like sharing, secrets storage, better interoperability with S3, and cloud persistence.
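For Python scripts, a minimal sketch of the connection step might look like the following. It assumes the service token is supplied through the motherduck_token environment variable; the helper names and the database name my_db are hypothetical, and the up-front token check is a convenience for non-interactive scripts rather than something the service requires.

```python
import os


def motherduck_path(database: str = "") -> str:
    """Build a DuckDB connection path for MotherDuck.

    "md:" connects without naming a database; "md:<name>" names one.
    """
    return f"md:{database}"


def connect_motherduck(database: str = ""):
    """Open a DuckDB connection backed by MotherDuck.

    Needs network access and a MotherDuck account, so it is only
    defined here, not called.
    """
    import duckdb  # imported lazily so motherduck_path() works without the package

    # Assumption: fail early in scripts instead of waiting for an interactive login.
    if "motherduck_token" not in os.environ:
        raise RuntimeError("set the motherduck_token environment variable first")
    return duckdb.connect(motherduck_path(database))


# Usage (not executed here):
# con = connect_motherduck("my_db")  # "my_db" is a hypothetical database name
# print(con.sql("SHOW DATABASES"))
```

Keeping the path construction separate from the connection call makes the string easy to check without touching the network.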
Hybrid execution
When connected, DuckDB and MotherDuck form a different type of distributed system. The two nodes work in concert so you can query data wherever it lives, in the most efficient way possible. This query execution model, called hybrid execution, automatically routes each stage of query execution to the most suitable location across a wide range of scenarios:
- If a SQL query queries data on your laptop, MotherDuck routes the query to your local DuckDB instance
- If a SQL query queries data in MotherDuck or S3, MotherDuck routes that query to MotherDuck
- If a SQL query executes a join between data on your laptop and data in MotherDuck, MotherDuck finds the best way to efficiently join the two
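The three cases above can be summarized as a small decision function. This is only a toy model for intuition, not MotherDuck's planner: the real system routes individual stages of a query, while this sketch classifies a whole query by where its inputs live. The Location enum and the route function are hypothetical names.

```python
from enum import Enum


class Location(Enum):
    LOCAL = "local"       # files or tables on your machine
    CLOUD = "motherduck"  # MotherDuck storage or S3


def route(sources: set[Location]) -> str:
    """Pick where a query runs, given where its inputs live."""
    if not sources:
        raise ValueError("a query must read from at least one source")
    if sources == {Location.LOCAL}:
        return "local"
    if sources == {Location.CLOUD}:
        return "motherduck"
    # Inputs on both sides: the work is split across the two nodes.
    return "hybrid"


print(route({Location.LOCAL}))                  # local
print(route({Location.CLOUD}))                  # motherduck
print(route({Location.LOCAL, Location.CLOUD}))  # hybrid
```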
The MotherDuck web UI
You can use MotherDuck's web UI to analyze and share data and to perform administrative tasks. Currently, the UI consists of a lightweight notebook, a SQL IDE, and a data catalog. Uniquely, MotherDuck caches query results in a highly interactive results panel, enabling you to sort, filter, and even pivot data quickly.
Summary of capabilities
Currently with MotherDuck you can:
- Use serverless DuckDB in the cloud to store data and execute DuckDB SQL
- Load data into MotherDuck from your personal computer, https, or S3
- Join datasets on your computer with datasets in MotherDuck or in S3
- Copy DuckDB databases between local and MotherDuck locations
- Materialize query results into local or MotherDuck locations, or S3
- Work with data in MotherDuck’s notebook UI, standard DuckDB CLI, or standard DuckDB Python package
- Share databases with your teammates
- Securely save S3 credentials in MotherDuck
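As a rough illustration of the loading and materializing capabilities listed above, the sketch below builds the corresponding DuckDB SQL statements in Python. The helper names, table names, and paths are hypothetical. The statements use standard DuckDB syntax (CREATE TABLE ... AS SELECT and COPY ... TO), and identifiers are interpolated without escaping, so this is suitable only for trusted input.

```python
def load_table_sql(table: str, source: str) -> str:
    """Build a statement that loads a file into a table.

    `source` may be a local path, an https URL, or an S3 URI.
    """
    return f"CREATE OR REPLACE TABLE {table} AS SELECT * FROM '{source}'"


def export_sql(query: str, target: str) -> str:
    """Build a statement that materializes a query result as a Parquet file.

    `target` may be a local path or an S3 URI.
    """
    return f"COPY ({query}) TO '{target}' (FORMAT PARQUET)"


print(load_table_sql("trips", "s3://my-bucket/trips.parquet"))
print(export_sql("SELECT * FROM trips", "trips_copy.parquet"))

# Usage with an open DuckDB connection `con` (not executed here):
# con.sql(load_table_sql("trips", "s3://my-bucket/trips.parquet"))
# con.sql(export_sql("SELECT * FROM trips", "trips_copy.parquet"))
```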
Additionally, MotherDuck supports connectivity to third-party tools via:
Considerations and limitations
At this time, MotherDuck supports only one version of DuckDB: v0.9.2.
MotherDuck does not yet support the full range of DuckDB SQL. We are continuously working on improving coverage of DuckDB in MotherDuck. If you need specific features enabled, please let us know.
Below is the list of DuckDB features that MotherDuck does not yet support:
- User Defined Functions (UDFs)
- Custom extensions. Only the https/httpfs, CSV/Parquet reader, fts, json, icu, tpcds, and tpch extensions are preloaded into MotherDuck.
- Stored Procedures
- Secondary indexes (CREATE INDEX)
- VACUUM statement
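Because only one DuckDB version is supported, a client script may want to check its local DuckDB version before connecting. The sketch below is one way to do that; the constant and the function name are hypothetical, and the version value is taken from the text above and would need updating as the service evolves.

```python
SUPPORTED_DUCKDB_VERSION = "0.9.2"  # from the limitation stated above


def is_supported(version: str) -> bool:
    """Return True if a DuckDB version string matches the supported release.

    Accepts forms like "0.9.2" or "v0.9.2".
    """
    return version.strip().lstrip("v") == SUPPORTED_DUCKDB_VERSION


print(is_supported("v0.9.2"))  # True
print(is_supported("0.10.0"))  # False

# Usage with the installed package (not executed here):
# import duckdb
# if not is_supported(duckdb.__version__):
#     raise RuntimeError(f"DuckDB {duckdb.__version__} is not supported")
```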