Glossary
MotherDuck architecture
| Term | Definition |
|---|---|
| duckling | A dedicated DuckDB compute instance provisioned for each user or service account. Each duckling has its own CPU, memory, and fast SSD spill space. |
| hypertenancy | MotherDuck's tenancy model where every user gets their own dedicated DuckDB compute instance (duckling), providing full compute isolation and eliminating noisy-neighbor problems. |
| noisy neighbor | A problem in shared-resource systems where one user's heavy workload degrades performance for others. MotherDuck's hypertenancy model eliminates this by giving each user a dedicated duckling. |
| service account | A non-human user account for powering applications, pipelines, or services. Service accounts have their own tokens and duckling sizes, separate from interactive users. |
| SSD | Solid-state drive: fast local storage attached to each duckling, used as spill space when a query exceeds available memory. |
Query execution
| Term | Definition |
|---|---|
| dual execution | A query execution model that automatically routes stages of a query to the most efficient location, whether local (your machine) or cloud (MotherDuck), based on where the data lives. |
| hybrid query | A query that accesses both local DuckDB databases and MotherDuck cloud databases in a single SQL statement. |
| vectorized execution | A query processing technique that operates on batches of values at once rather than row-by-row, improving CPU cache utilization and query speed. |
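
To make the vectorized execution entry concrete, here is a minimal pure-Python sketch that contrasts row-at-a-time evaluation with batch-at-a-time evaluation of a filter followed by a sum. The function names and the batch size of 1024 are illustrative assumptions, not DuckDB internals. A real engine runs these inner loops in compiled code over typed vectors, which is where the cache and speed benefits come from.

```python
from typing import Iterator, List

BATCH_SIZE = 1024  # illustrative batch size, not a DuckDB setting


def sum_row_at_a_time(prices: List[float], threshold: float) -> float:
    """Evaluate the filter and the aggregate one value at a time."""
    total = 0.0
    for price in prices:
        if price > threshold:
            total += price
    return total


def batches(values: List[float], size: int) -> Iterator[List[float]]:
    """Yield consecutive slices of at most `size` values."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def sum_vectorized(prices: List[float], threshold: float) -> float:
    """Run each operator over a whole batch before handing off to the next."""
    total = 0.0
    for batch in batches(prices, BATCH_SIZE):
        selected = [p for p in batch if p > threshold]  # filter operator on the batch
        total += sum(selected)                          # aggregate operator on the batch
    return total


prices = [float(i % 100) for i in range(10_000)]
assert sum_row_at_a_time(prices, 50.0) == sum_vectorized(prices, 50.0)
```

Both functions return the same result; the difference is only in how the work is organized, with the batch version keeping each operator's loop tight over contiguous values.
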
Data management
| Term | Definition |
|---|---|
| checkpoint | An operation where the current database state is written to persistent storage, creating a recoverable point and making changes visible to read-scaling replicas and shares. |
| fully qualified name | A reference to a database object that includes all parent namespaces, written as database.schema.table (or database.schema.view). Useful for disambiguating objects across attached databases and required when views or queries need to resolve tables across databases. |
| MotherDuck share | A zero-copy, read-only database object that lets you share data with other MotherDuck users or across your organization without duplicating storage. |
| read scaling | A feature that spins up additional read-only duckling replicas to handle read-heavy workloads. Queries are distributed across replicas with eventual consistency (syncing within minutes). |
| snapshot | An immutable point-in-time copy of a database. Automatic snapshots enable time travel and data recovery; named snapshots persist until explicitly dropped. |
| time travel | The ability to query a database as it existed at a previous point in time using snapshots, useful for data recovery and auditing. |
| transient database | A MotherDuck database type with shorter snapshot retention and no failsafe period, suited for temporary or ephemeral data. |
| zero-copy clone | A metadata-only operation that creates a new database reference sharing the same underlying data, consuming no additional storage. |
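
The zero-copy clone and snapshot entries both rest on the same idea: data blocks are immutable, and a database is a list of references to them. The toy model below sketches that idea in plain Python. The `BlockStore` and `Database` classes are hypothetical names invented for this illustration and say nothing about how MotherDuck actually implements storage.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class BlockStore:
    """Shared storage: immutable data blocks addressed by id."""
    blocks: Dict[int, Tuple[int, ...]] = field(default_factory=dict)
    next_id: int = 0

    def write(self, rows: Tuple[int, ...]) -> int:
        block_id = self.next_id
        self.blocks[block_id] = rows
        self.next_id += 1
        return block_id


@dataclass
class Database:
    """A database here is only metadata: an ordered tuple of block ids."""
    store: BlockStore
    block_ids: Tuple[int, ...] = ()

    def append(self, rows: Tuple[int, ...]) -> None:
        # New data goes into a new block; existing blocks are never modified.
        self.block_ids = self.block_ids + (self.store.write(rows),)

    def clone(self) -> "Database":
        # Metadata-only: copy the list of block ids, not the blocks themselves.
        return Database(self.store, self.block_ids)

    def rows(self) -> Tuple[int, ...]:
        return tuple(r for b in self.block_ids for r in self.store.blocks[b])


store = BlockStore()
prod = Database(store)
prod.append((1, 2, 3))
dev = prod.clone()                      # no new blocks are written
blocks_after_clone = len(store.blocks)
dev.append((4,))                        # only the clone references the new block
```

Because the clone copies references only, storage grows when new data is written, not when the clone is created. A snapshot in this model would be the same operation: a frozen copy of `block_ids` taken at a point in time.
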
Identity and access
| Term | Definition |
|---|---|
| deprovisioning | Disabling a user's access while retaining their account record and data. In MotherDuck, deprovisioned users cannot sign in and their access tokens are revoked, but they can be reprovisioned later. |
| IdP | Identity provider: a service that authenticates users and sends identity or provisioning information to applications, such as Okta, Microsoft Entra ID, Google Workspace, or Keycloak. |
| JIT provisioning | Just-in-time provisioning: automatic account creation when a user signs in successfully for the first time through SSO, instead of creating the account ahead of time. |
| OIDC | OpenID Connect: an identity layer built on top of OAuth 2.0, used for browser-based authentication and single sign-on. |
| SAML | Security Assertion Markup Language: an XML-based protocol used by many enterprise identity providers for browser-based single sign-on. |
| SCIM | System for Cross-domain Identity Management: an open standard used to automate user provisioning, updates, deprovisioning, and deletion between an identity provider and an application. |
| SSO | Single sign-on: an authentication setup where users sign in through a central identity provider instead of separate application-specific credentials. |
Connection modes
| Term | Definition |
|---|---|
| SaaS mode | A connection setting that sandboxes a MotherDuck session by blocking local file access, local DuckDB attachments, extension installation and loading, and most DuckDB configuration changes. Used automatically by the Postgres endpoint and recommended for third-party tools that host DuckDB. |
| single mode | A connection mode that creates a temporary, non-persistent session. Attachment changes are discarded when you disconnect. Useful for BI tools and ephemeral workloads. |
| workspace mode | The default connection mode where database attachment changes persist across sessions. All databases from your last session are automatically restored. |
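
Connection modes are typically selected through the connection string. The sketch below builds such a string without opening a connection. It assumes the `md:` prefix and the `attach_mode` and `saas_mode` query parameters; these names are not stated in this glossary, so check them against the MotherDuck connection documentation before relying on them. The helper `motherduck_dsn` and the database name `my_db` are hypothetical.

```python
from urllib.parse import urlencode


def motherduck_dsn(database: str = "", *, single: bool = False, saas: bool = False) -> str:
    """Build a connection string; the parameter names are assumptions to verify."""
    params = {}
    if single:
        params["attach_mode"] = "single"  # assumed parameter name for single mode
    if saas:
        params["saas_mode"] = "true"      # assumed parameter name for SaaS mode
    query = urlencode(params)
    return f"md:{database}" + (f"?{query}" if query else "")


print(motherduck_dsn("my_db"))               # workspace mode, the default
print(motherduck_dsn("my_db", single=True))  # single mode
print(motherduck_dsn(saas=True))             # SaaS mode
```

Leaving both flags off yields the default workspace mode, which matches the table above: no parameter is needed for attachment changes to persist across sessions.
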
Storage and formats
| Term | Definition |
|---|---|
| columnar storage | A data layout that stores each column separately rather than row-by-row, enabling better compression and faster analytical queries that read only the columns they need. |
| DuckLake | An open table format that stores metadata in database tables rather than files, enabling faster metadata lookups and multi-table ACID transactions on data in object storage. |
| object storage | Scalable cloud storage (like S3, GCS, or Azure Blob Storage) that stores data as objects, commonly used for data lakes. |
| OLAP | Online Analytical Processing: a category of database systems optimized for complex queries across large datasets, as opposed to OLTP systems designed for frequent small transactions. |
| Parquet | An open columnar file format optimized for analytical workloads, providing efficient compression and fast reads for specific columns. |
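
The columnar storage and Parquet entries claim two benefits: queries read only the columns they need, and values within a column compress well. The small sketch below illustrates both with plain Python lists. The sample rows and the helper names `to_columns` and `run_length_encode` are made up for this example, and run-length encoding stands in for the more elaborate encodings that real formats use.

```python
from typing import Dict, List

rows: List[Dict[str, object]] = [
    {"id": 1, "country": "NL", "amount": 10.0},
    {"id": 2, "country": "NL", "amount": 20.0},
    {"id": 3, "country": "US", "amount": 5.0},
]


def to_columns(records: List[Dict[str, object]]) -> Dict[str, list]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, list] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values: list) -> List[list]:
    """Compress consecutive repeats into [value, count] pairs."""
    encoded: List[list] = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)
total = sum(columns["amount"])                       # touches only the "amount" column
country_rle = run_length_encode(columns["country"])  # repeated values collapse
```

In the row layout, summing `amount` means visiting every record, including its `id` and `country` fields. In the column layout, the same aggregate reads a single contiguous list, and the repetitive `country` column shrinks to two pairs.
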