Skip to main content

Workload scaling patterns

MotherDuck gives you several levers to scale your workloads. The right combination depends on your concurrency needs, query characteristics, and whether your workload is read-heavy or write-heavy. This page helps you match your workload to the right scaling pattern.

How MotherDuck scales per workload

MotherDuck scales workloads through compute units called Ducklings. Each user or service account gets a dedicated Duckling and read scaling flock, and you can adjust three levers to match your workload:

Scaling leverWhat it doesWhen to use it
Vertical scalingResize your Duckling (Pulse through Giga)Queries need more CPU or memory
Horizontal scalingAdd read-only Ducklings through read scalingMany concurrent users running read queries
Workload isolationCreate separate service accountsTeams or pipelines that should not share compute, for example to prevent a large data ingestion impacting the queries of analysts.

These levers are complementary. For example, you might use a Jumbo Duckling (vertical) for data loading and use a flock of pulse Ducklings with read scaling (horizontal) for your dashboard users. You can connect through any supported interface, including the native DuckDB SDK, the Postgres endpoint, or DuckDB WASM.

Not sure what you need?

Follow the decision flowchart at the bottom of this page to find the right scaling pattern for your workload.

Understanding the scaling levers

Vertical scaling: Duckling sizes

Per user or service account
Each workload gets its own Duckling size
Analyst
Boss
Ingestion
Pulse
Standard
Jumbo
Shared service account
1. Two users share a Standard Duckling
User 1
User 2
User 3
User 4
User 5
App
Standard

When a single query needs more resources, move to a larger Duckling. Larger Ducklings have more CPU, memory, and extra SSD space to be used whenever queries use up too much memory. This helps with:

  • Complex joins and aggregations
  • Large data loading jobs
  • Queries that process more data than fits in memory

Duckling sizes range from Pulse (lightweight, on-demand billing) to Giga (maximum resources for the heaviest batch jobs). See Duckling sizes for the full comparison.

When to size up: If queries are slow and you see high values for BYTES_SPILLED_TO_DISK or WAIT_TIME in your query history, your Duckling may need more memory.

Horizontal scaling: Read scaling

Read scaling
1. Two users, each routed to their own Duckling
User 1
User 2
User 3
User 4
User 5
User 6
MotherDuck
Duckling 1
Duckling 2
Duckling 3
Duckling 4

When you need to serve many concurrent read queries, read scaling adds read-only Duckling replicas behind your account. Key concepts:

  • Pool size: The default pool is 4 Ducklings, configurable up to 16 (soft limit). Contact support for higher limits.
  • Eventual consistency: Read replicas lag a few minutes behind the primary. Use CREATE SNAPSHOT and REFRESH DATABASES if you need tighter synchronization.
  • One Duckling per user: For the best performance, aim for one Duckling per concurrent user. This takes full advantage of DuckDB's single-node architecture.

Session affinity and routing

Without session_name
The BI tool / App hides the user so queries land on random Ducklings, no cache reuse
User 1
User 2
User 3
BI tool / App
MotherDuck
Duckling 1
Duckling 2
Duckling 3
Duckling 4
With session_name
When an identifier is added, each user routes to the same Duckling, warm cache
User 1
User 2
User 3
BI tool / App
MotherDuck
Duckling 1
Duckling 2
Duckling 3
Duckling 4

By default, read scaling distributes connections across the pool in round-robin fashion. When the number of connections exceeds your pool size, new connections share existing Ducklings.

For workloads where users run unique queries, use session_name to route a user's connections to the same Duckling. This improves performance because:

  • Cache locality: DuckDB caches data per instance. Routing the same user to the same Duckling means their subsequent queries benefit from a warm cache.
  • Consistent view: Queries within a session see a consistent snapshot of the data.
  • Better isolation: Concurrent users do not compete for the same Duckling's resources.

Set session_name to a user ID, session ID, or any stable identifier to benefit from caching where possible. You can set the Duckling cooldown period to match your use case and keep the Duckling alive with its cache.

When to use session_name: Use it when users run unique, personalized queries, for example in customer-facing analytics or multi-tenant dashboards. Skip it when all users run the same queries (such as a shared reporting dashboard), since a shared connection pool already routes efficiently.

Workload isolation: Service accounts

Service accounts give you full compute and data isolation between workloads. Each service account gets its own Duckling and read scaling pool, which makes it easier to track usage and billing.

Use separate service accounts when:

  • Different teams should never share compute (for example, production vs. development)
  • A data loading pipeline should not compete with queries from analysts
  • You need your customers to be able to write back to a database in a customer-facing analytics setup, or have a separate read scaling flock for each customer.

If you need a visual interface to manage these service accounts, the UI lets you impersonate a service account and adjust settings and run queries as the service account.

Quick reference: workload patterns

Use this matrix to find the recommended pattern for common workloads. Each row represents a typical use case, with inputs describing the workload and outputs recommending a configuration.

Use caseUsersConcurrencyR/WOverlapWeightDuckling sizeScaling approach
Ad-hoc analyst👤SequentialR/W⚡/🏋️Pulse / Standard+Default (single Duckling)
dbt or ELT pipeline👤ConcurrentW🏋️Jumbo / MegaDefault (single Duckling)
Scheduled ingestion job👤SequentialW🏋️Jumbo+Default + dedicated service account
BI dashboard (Omni, Hex, Metabase)👥👥ConcurrentRHighPulse / StandardRead scaling (shared pool)
Embedded analytics👥👥ConcurrentRLow⚡/🏋️Pulse / Standard+Read scaling + session_name
Customer-facing app (3-tier)👥👥ConcurrentRLowStandardRead scaling + session_name
Serverless function (Lambda, Workers)👥👥ConcurrentRVariesStandardRead scaling
Multi-team production👥👥ConcurrentR/WLow⚡/🏋️Per teamSeparate service accounts

Reading the matrix

Input columns describe your workload:

  • Users: How many people or clients connect: 👤 single, 👥👥 many
  • Concurrency: Whether queries run one at a time (sequential) or in parallel (concurrent)
  • R/W: Whether the workload reads (R), writes (W), or both (R/W)
  • Overlap: Whether different users tend to run the same queries (high) or unique queries (low)
  • Weight: Whether queries are light (⚡ sub-second) or heavy (🏋️ seconds to minutes)

Output columns recommend a configuration:

  • Duckling size: Which Duckling size to use
  • Scaling approach: Which horizontal scaling method to apply

Choosing an interface

Your choice of interface does not change the scaling levers available to you, but it does affect session management and connection behavior.

InterfaceBest forSession management
Native SDK (Python, Node.js, Java)Client applications, scripts, dbtInstance cache, session_name
Postgres endpointServerless functions, BI tools, environments without DuckDBPer-connection
DuckDB WASMBrowser-based applicationsClient-side compute

Cost considerations

Scaling decisions affect your compute costs:

  • Vertical scaling increases the per-second cost of your Duckling. Larger Ducklings cost more but finish heavy queries faster.
  • Horizontal scaling adds Ducklings proportional to active sessions, not total users. Idle Ducklings shut down after the configured cooldown period.
  • Pulse Ducklings use per-query billing (minimum 1 compute-unit second), making them cost-effective for sporadic, lightweight workloads.
  • Cooldown tuning balances cost against cache warmth. A longer cooldown keeps the cache warm for returning users but costs more during idle periods.

See Duckling sizes and pricing for the full cost breakdown.

Decision flowchart

If you are not sure where to start, follow this flowchart: