
Storage Lifecycle and Management

Understanding MotherDuck's storage lifecycle is crucial for optimizing costs and managing data effectively. Unlike traditional databases where deleted data is immediately freed, MotherDuck implements a sophisticated multi-stage storage system that ensures data safety while providing cost transparency. This system is particularly important for organizations that share data, use zero-copy cloning, or need to understand their storage footprint for billing purposes.

Storage Lifecycle Overview

The storage lifecycle has four distinct stages:

  • Active bytes: Actively referenced bytes of the database. These bytes are accessible by directly querying the database.
  • Historical bytes: Non-active bytes that are still referenced by a share of this database.
  • Kept for cloned bytes: Bytes referenced by other databases (via zero-copy clone) that are no longer referenced by this database as active or historical bytes.
  • Failsafe bytes: Bytes that are no longer referenced by any database or share and are retained for a period of time as system backups.

MotherDuck runs a periodic job that reclassifies data into the appropriate storage lifecycle stage.

Data moves through the storage lifecycle in one direction only, in the order listed above (active, historical, kept for cloned, failsafe). It never returns to an earlier stage.

The following conditions cause data to leave each stage and be reclassified to a later one:

  • Active bytes: when the data is deleted from the database
  • Historical bytes: when all shares referencing the data are dropped or updated
  • Kept for cloned bytes: when the data is deleted from all zero-copy-cloned databases
  • Failsafe bytes: after the failsafe retention period (7 days)
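The stage definitions and transitions above can be read as a simple priority order. The following is a minimal Python sketch of that reading, not MotherDuck's actual implementation: the `classify` function, its parameters, and the `RECLAIMED` label are hypothetical names introduced here for illustration.

```python
from enum import Enum


class Stage(Enum):
    ACTIVE = "active"
    HISTORICAL = "historical"
    KEPT_FOR_CLONED = "kept for cloned"
    FAILSAFE = "failsafe"
    RECLAIMED = "reclaimed"  # hypothetical label: past failsafe, no longer stored


FAILSAFE_DAYS = 7  # failsafe retention period stated in the text


def classify(in_database: bool, in_share: bool, in_clone: bool,
             days_unreferenced: float = 0.0) -> Stage:
    """Return the lifecycle stage for a block of bytes (illustrative model only)."""
    if in_database:
        # Directly queryable in this database.
        return Stage.ACTIVE
    if in_share:
        # Deleted from the database, but a share still references it.
        return Stage.HISTORICAL
    if in_clone:
        # Only a zero-copy clone in another database still references it.
        return Stage.KEPT_FOR_CLONED
    if days_unreferenced < FAILSAFE_DAYS:
        # Unreferenced everywhere, still inside the failsafe window.
        return Stage.FAILSAFE
    return Stage.RECLAIMED


print(classify(in_database=False, in_share=True, in_clone=True))
print(classify(in_database=False, in_share=False, in_clone=False, days_unreferenced=3))
```

Because each check only runs once the earlier ones fail, the sketch mirrors the one-directional flow: bytes can skip a stage (for example, active straight to failsafe when nothing shares or clones them) but never move backward.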

An organization is billed for the sum of active, historical, kept for cloned, and failsafe bytes across all of their databases.

How This Affects Your Data Strategy

Understanding the storage lifecycle helps you make informed decisions about:

  • Data deletion strategies: When you delete data, it doesn't immediately reduce your bill due to the retention stages
  • Sharing considerations: Shared data remains in historical bytes until shares are updated or dropped
  • Cloning decisions: Zero-copy clones can keep data in kept for cloned bytes even after deletion from the source
  • Cost optimization: Different lifecycle stages have different cost implications and management strategies
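To make the deletion point concrete, here is a small Python sketch with made-up numbers (think of them as GB). It assumes the deleted data is not referenced by any share or clone, so it moves straight from active to failsafe. The function names and the footprint dictionary are illustrative and do not correspond to any MotherDuck API.

```python
STAGES = ("active", "historical", "kept_for_cloned", "failsafe")


def billed_bytes(footprint: dict[str, int]) -> int:
    """Billed storage is the sum over all four lifecycle stages."""
    return sum(footprint[stage] for stage in STAGES)


def delete_data(footprint: dict[str, int], nbytes: int) -> dict[str, int]:
    """Model deleting unshared, uncloned data: bytes move from active to failsafe."""
    if nbytes > footprint["active"]:
        raise ValueError("cannot delete more than the active bytes")
    updated = dict(footprint)
    updated["active"] -= nbytes
    updated["failsafe"] += nbytes
    return updated


def expire_failsafe(footprint: dict[str, int]) -> dict[str, int]:
    """Model the end of the failsafe retention period: failsafe bytes are released."""
    updated = dict(footprint)
    updated["failsafe"] = 0
    return updated


before = {"active": 100, "historical": 20, "kept_for_cloned": 5, "failsafe": 0}
after_delete = delete_data(before, 40)
after_expiry = expire_failsafe(after_delete)

print(billed_bytes(before))        # 125
print(billed_bytes(after_delete))  # 125: the delete alone does not change the bill
print(billed_bytes(after_expiry))  # 85: the bill drops once failsafe retention ends
```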

For more information on data sharing, see Sharing Data. For details on zero-copy cloning, refer to MotherDuck Architectural Concepts.

Storage Management

MotherDuck supports two ways to configure storage retention for native storage-backed databases.

Standard Databases:

Plan       Failsafe Period (Standard Databases)    Minimum (Default) Historical Retention
Business   7 days                                  1 day
Lite       7 days                                  1 day
Free       7 days                                  0 days

Transient Databases:

For use cases that don't require the default failsafe retention period (7 days), a MotherDuck database can be marked TRANSIENT when it is created, which reduces the failsafe period to its 1-day minimum. This setting can only be specified at database creation and cannot be changed afterward.

Plan       Failsafe Period (Transient Databases)   Minimum (Default) Historical Retention
Business   1 day                                   1 day
Lite       1 day                                   1 day
Free       1 day                                   0 days
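The two tables can be summarized as a lookup keyed on plan and database type. This Python sketch only encodes the values shown above. The `retention` helper and its dictionary keys are hypothetical names, not part of any MotherDuck client.

```python
# Values copied from the tables above.
FAILSAFE_DAYS = {"standard": 7, "transient": 1}
MIN_HISTORICAL_DAYS = {"Business": 1, "Lite": 1, "Free": 0}


def retention(plan: str, transient: bool = False) -> dict[str, int]:
    """Return the failsafe period and minimum historical retention, in days."""
    if plan not in MIN_HISTORICAL_DAYS:
        raise ValueError(f"unknown plan: {plan!r}")
    kind = "transient" if transient else "standard"
    return {
        "failsafe_days": FAILSAFE_DAYS[kind],
        "min_historical_days": MIN_HISTORICAL_DAYS[plan],
    }


print(retention("Business"))
print(retention("Free", transient=True))
```

Note that only the failsafe period depends on the database type. The minimum historical retention depends on the plan alone.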

Transient databases can be helpful for the following datasets:

  • Datasets that are the intermediate output of a job (write once, read once)
  • Datasets that can be easily reconstructed from an external data source

Breaking Down Storage Usage

To better understand your organization's storage bill, start with the STORAGE_INFO view in the MD_INFORMATION_SCHEMA. This view returns an overview of the storage footprint by lifecycle stage for the databases in your organization.
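As a rough sketch, the view could be read from Python with the `duckdb` package. This assumes the package is installed, that a MotherDuck token is available in the environment (for example in the `motherduck_token` variable), and that the view is addressable as `MD_INFORMATION_SCHEMA.STORAGE_INFO`. The column names are not listed here, so the example selects everything and reports whatever columns come back.

```python
STORAGE_INFO_QUERY = "SELECT * FROM MD_INFORMATION_SCHEMA.STORAGE_INFO"


def fetch_storage_info():
    """Return (column_names, rows) from the STORAGE_INFO view."""
    import duckdb  # third-party package: pip install duckdb

    # "md:" asks DuckDB to connect to MotherDuck; authentication is assumed
    # to come from a token already present in the environment.
    con = duckdb.connect("md:")
    try:
        cursor = con.execute(STORAGE_INFO_QUERY)
        columns = [col[0] for col in cursor.description]
        return columns, cursor.fetchall()
    finally:
        con.close()
```

Calling `fetch_storage_info()` from an authenticated session returns the column names and one tuple per row, which can then be compared against the guidance for each lifecycle stage.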

If Active bytes are higher than expected, consider whether you need all of the data stored in that database. Some common ways to decrease active bytes are to delete the data or optimize sorting and data types.

If Historical bytes are higher than expected, consider whether there are outstanding manually updated shares in the organization that reference this database. This footprint will decrease as the shares are updated (UPDATE SHARE) or dropped. You can find all shares that reference a given database by using the OWNED_SHARES view in the MD_INFORMATION_SCHEMA.

If Kept for cloned bytes are higher than expected, consider whether there are other databases that were zero-copy cloned from this database that are still referencing deleted data. This footprint will decrease as you delete the cloned data from these other databases.

Failsafe bytes result from deleting data. After a one-time deletion, this footprint should drop once the failsafe retention period has passed. If failsafe bytes remain consistently high, it is likely that you are overwriting or updating data too frequently. Common workloads that delete a lot of data (via overwrites or updates) are: create or replace table, truncate and insert, updates, and deletes. Avoiding these patterns can reduce your failsafe footprint. Transient databases retain failsafe bytes for only 1 day instead of 7, so their failsafe footprint is correspondingly smaller.
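The per-stage guidance above can be condensed into a small checklist helper. The Python sketch below is illustrative only: the `review_footprint` function, the stage keys, and the idea of passing in your own "expected" byte counts are assumptions made for this example, not a MotherDuck feature.

```python
# One-line summaries of the guidance above, keyed by lifecycle stage.
GUIDANCE = {
    "active": "Delete data you no longer need, or optimize sorting and data types.",
    "historical": "Update (UPDATE SHARE) or drop shares that reference the database.",
    "kept_for_cloned": "Delete the cloned data from databases zero-copy cloned from this one.",
    "failsafe": "Reduce overwrite-heavy patterns: create or replace, truncate and insert, updates, deletes.",
}


def review_footprint(actual: dict[str, int], expected: dict[str, int]) -> list[tuple[str, str]]:
    """Return (stage, suggestion) pairs for stages whose bytes exceed expectations."""
    return [
        (stage, advice)
        for stage, advice in GUIDANCE.items()
        if actual.get(stage, 0) > expected.get(stage, 0)
    ]


actual = {"active": 500, "historical": 300, "kept_for_cloned": 0, "failsafe": 900}
expected = {"active": 600, "historical": 50, "kept_for_cloned": 0, "failsafe": 100}

for stage, advice in review_footprint(actual, expected):
    print(f"{stage}: {advice}")
```

With these sample numbers, only the historical and failsafe stages are flagged, since active and kept for cloned bytes are within the expected values.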

If you have the Admin role, you can view your organization's storage breakdown on the databases page.

If you need help understanding or reducing your storage bill, please reach out to MotherDuck support.