---
sidebar_position: 4
title: Workload scaling patterns
description: Choose the right compute size, scaling approach, and connection model for your MotherDuck workload using a decision flowchart and workload-to-pattern matrix.
---
import ScalingPatternsDiagram from '@site/src/components/ScalingPatternsDiagram';

MotherDuck gives you several levers to scale your workloads. The right combination depends on your concurrency needs, query characteristics, and whether your workload is read-heavy or write-heavy. This page helps you match your workload to the right scaling pattern.

## How MotherDuck scales per workload

MotherDuck scales workloads through compute units called [Ducklings](/concepts/hypertenancy/). Each user or service account gets a dedicated Duckling and read scaling flock, and you can adjust three levers to match your workload:

| Scaling lever | What it does | When to use it |
|---|---|---|
| **Vertical scaling** | Resize your Duckling ([Pulse through Giga](/about-motherduck/billing/duckling-sizes/)) | Queries need more CPU or memory |
| **Horizontal scaling** | Add read-only Ducklings through [read scaling](/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/) | Many concurrent users running read queries |
| **Workload isolation** | Create separate [service accounts](/key-tasks/service-accounts-guide/create-and-configure-service-accounts/) | Teams or pipelines that should not share compute, for example to prevent a large data ingestion job from impacting analyst queries |

These levers are complementary. For example, you might use a Jumbo Duckling (vertical) for data loading and a flock of Pulse Ducklings with read scaling (horizontal) for your dashboard users. You can connect through any [supported interface](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/), including the native DuckDB SDK, the Postgres endpoint, or DuckDB WASM.

:::tip[Not sure what you need?]
Follow the [decision flowchart](#decision-flowchart) at the bottom of this page to find the right scaling pattern for your workload.
:::

## Understanding the scaling levers

### Vertical scaling: Duckling sizes

<ScalingPatternsDiagram pattern="vertical" />

When a single query needs more resources, move to a larger Duckling. Larger Ducklings have more CPU, more memory, and more SSD space for spilling when a query exceeds available memory. This helps with:

- Complex joins and aggregations
- Large data loading jobs
- Queries that process more data than fits in memory

Duckling sizes range from **Pulse** (lightweight, on-demand billing) to **Giga** (maximum resources for the heaviest batch jobs). See [Duckling sizes](/about-motherduck/billing/duckling-sizes/) for the full comparison.

**When to size up**: If queries are slow and you see high values for `BYTES_SPILLED_TO_DISK` or `WAIT_TIME` in your [query history](/sql-reference/motherduck-sql-reference/md_information_schema/query_history/), your Duckling may need more memory.
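To spot these queries, you can run a query along the following lines against your query history. The table name and the `BYTES_SPILLED_TO_DISK` and `WAIT_TIME` columns come from the query history reference linked above; the other column names here are illustrative, so check that reference for the exact schema:

```sql
-- Recent queries that spilled to disk: a signal that the
-- Duckling may need more memory. QUERY_ID and START_TIME
-- are illustrative column names; verify against the
-- query_history reference.
SELECT QUERY_ID, BYTES_SPILLED_TO_DISK, WAIT_TIME
FROM md_information_schema.query_history
WHERE BYTES_SPILLED_TO_DISK > 0
ORDER BY START_TIME DESC
LIMIT 20;
```

If the spilled bytes are consistently high, size up; if `WAIT_TIME` dominates instead, queries are queuing and horizontal scaling may help more than a bigger Duckling.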

### Horizontal scaling: Read scaling

<ScalingPatternsDiagram pattern="horizontal" />

When you need to serve many concurrent read queries, [read scaling](/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/) adds read-only Duckling replicas behind your account. Key concepts:

- **Pool size**: The default pool is 4 Ducklings, configurable up to 16 (soft limit). [Contact support](https://motherduck.com/contact-us/support/) for higher limits.
- **Eventual consistency**: Read replicas lag a few minutes behind the primary. Use [`CREATE SNAPSHOT`](/sql-reference/motherduck-sql-reference/create-snapshot/) and [`REFRESH DATABASES`](/sql-reference/motherduck-sql-reference/refresh-database/) if you need tighter synchronization.
- **One Duckling per user**: For the best performance, aim for one Duckling per concurrent user. This takes full advantage of DuckDB's single-node architecture.
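When a dashboard must reflect a write that just finished, a read-scaling session can pull the latest snapshot on demand instead of waiting out the replication lag. A minimal sketch, using the statement from the linked reference:

```sql
-- Run in a read-scaling session to pick up the newest
-- database snapshot immediately instead of waiting for
-- the periodic refresh.
REFRESH DATABASES;
```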

### Session affinity and routing

<ScalingPatternsDiagram pattern="session-affinity" />

By default, read scaling distributes connections across the pool in round-robin fashion. When the number of connections exceeds your pool size, new connections share existing Ducklings.

For workloads where users run unique queries, use [`session_name`](/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/#session-affinity-with-session-name) to route a user's connections to the same Duckling. This improves performance because:

- **Cache locality**: DuckDB caches data per instance. Routing the same user to the same Duckling means their subsequent queries benefit from a warm cache.
- **Consistent view**: Queries within a session see a consistent snapshot of the data.
- **Better isolation**: Concurrent users do not compete for the same Duckling's resources.

Set `session_name` to a user ID, session ID, or any other stable identifier to benefit from caching where possible. You can [set the Duckling cooldown period](/about-motherduck/billing/duckling-sizes/#configuring-the-cooldown-period) to match your use case and keep the Duckling alive with its warm cache.

**When to use `session_name`**: Use it when users run unique, personalized queries, for example in customer-facing analytics or multi-tenant dashboards. Skip it when all users run the same queries (such as a shared reporting dashboard), since a shared connection pool already routes efficiently.
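With the DuckDB SDK, `session_name` is typically passed as a connection-string parameter. The form below is a sketch, with `my_db` and the identifier as placeholders; see the linked read scaling docs for the exact syntax:

```
md:my_db?session_name=<stable-user-or-session-id>
```

As long as two connections supply the same identifier, they land on the same Duckling and share its cache.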

### Workload isolation: Service accounts

[Service accounts](/key-tasks/service-accounts-guide/create-and-configure-service-accounts/) give you full compute and data isolation between workloads. Each service account gets its own Duckling and read scaling pool, which makes it easier to track usage and billing.

Use separate service accounts when:

- Different teams should never share compute (for example, production vs. development)
- A data loading pipeline should not compete with queries from analysts
- You need customers to write back to a database, or want a separate read scaling flock per customer, in a [customer-facing analytics](/key-tasks/customer-facing-analytics/3-tier-cfa-guide/) setup

If you prefer a visual interface for managing service accounts, the MotherDuck UI lets you impersonate a service account to adjust its settings and run queries on its behalf.

## Quick reference: workload patterns

Use this matrix to find the recommended pattern for common workloads. Each row represents a typical use case, with inputs describing the workload and outputs recommending a configuration.

| Use case | Users | Concurrency | R/W | Overlap | Weight | Duckling size | Scaling approach |
|---|---|---|---|---|---|---|---|
| *Ad-hoc analyst* | 👤 | Sequential | R/W | — | ⚡/🏋️ | Pulse / Standard+ | Default (single Duckling) |
| *dbt or ELT pipeline* | 👤 | Concurrent | W | — | 🏋️ | Jumbo / Mega | Default (single Duckling) |
| *Scheduled ingestion job* | 👤 | Sequential | W | — | 🏋️ | Jumbo+ | Default + dedicated service account |
| *BI dashboard (Omni, Hex, Metabase)* | 👥👥 | Concurrent | R | High | ⚡ | Pulse / Standard | Read scaling (shared pool) |
| *Embedded analytics* | 👥👥 | Concurrent | R | Low | ⚡/🏋️ | Pulse / Standard+ | Read scaling + `session_name` |
| *Customer-facing app (3-tier)* | 👥👥 | Concurrent | R | Low | ⚡ | Standard | Read scaling + `session_name` |
| *Serverless function (Lambda, Workers)* | 👥👥 | Concurrent | R | Varies | ⚡ | Standard | Read scaling |
| *Multi-team production* | 👥👥 | Concurrent | R/W | Low | ⚡/🏋️ | Per team | Separate service accounts |

### Reading the matrix

**Input columns** describe your workload:

- **Users**: How many people or clients connect: 👤 single, 👥👥 many
- **Concurrency**: Whether queries run one at a time (sequential) or in parallel (concurrent)
- **R/W**: Whether the workload reads (R), writes (W), or both (R/W)
- **Overlap**: Whether different users tend to run the same queries (high) or unique queries (low)
- **Weight**: Whether queries are light (⚡ sub-second) or heavy (🏋️ seconds to minutes)

**Output columns** recommend a configuration:

- **Duckling size**: Which [Duckling size](/about-motherduck/billing/duckling-sizes/) to use
- **Scaling approach**: Which horizontal scaling method to apply

## Choosing an interface

Your choice of interface does not change the scaling levers available to you, but it does affect session management and connection behavior.

| Interface | Best for | Session management |
|---|---|---|
| Native SDK (Python, Node.js, Java) | Client applications, scripts, dbt | Instance cache, `session_name` |
| [Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/) | Serverless functions, BI tools, environments without DuckDB | Per-connection |
| DuckDB WASM | Browser-based applications | Client-side compute |

## Cost considerations

Scaling decisions affect your compute costs:

- **Vertical scaling** increases the per-second cost of your Duckling. Larger Ducklings cost more but finish heavy queries faster.
- **Horizontal scaling** adds Ducklings proportional to active sessions, not total users. Idle Ducklings shut down after the configured [cooldown period](/about-motherduck/billing/duckling-sizes/).
- **Pulse Ducklings** use per-query billing (minimum 1 compute-unit second), making them cost-effective for sporadic, lightweight workloads.
- **Cooldown tuning** balances cost against cache warmth. A longer cooldown keeps the cache warm for returning users but costs more during idle periods.

See [Duckling sizes](/about-motherduck/billing/duckling-sizes/) and [pricing](/about-motherduck/billing/pricing/) for the full cost breakdown.

## Decision flowchart

If you are not sure where to start, follow this flowchart:

```mermaid
flowchart TD
    WriteQ{"Write-heavy?<br/>(data loading, ETL)"}:::yellow

    WriteQ -->|Yes| WeightQ{"Heavy queries?<br/>(complex joins, large loads)"}:::yellow
    WriteQ -->|No| ConcQ{"How many concurrent<br/>read users?"}:::yellow

    WeightQ -->|Yes| SizeUp["Size up your Duckling<br/>(Jumbo / Mega / Giga)"]:::green
    WeightQ -->|No| StdDuckling["Standard Duckling"]:::green

    SizeUp --> Isolation{"Need compute isolation<br/>between workloads?"}:::yellow
    StdDuckling --> Isolation
    Isolation -->|Yes| SvcAcct["Separate service accounts"]:::green
    Isolation -->|No| SingleAcct["Single service account"]:::green

    ConcQ -->|"1-5"| Default["Default Duckling,<br/>size for your heaviest query"]:::green
    ConcQ -->|"5-50"| OverlapQ{"Do users read the<br/>same data?"}:::yellow
    ConcQ -->|"50+"| OverlapQ2{"Do users read the<br/>same data?"}:::yellow

    OverlapQ -->|"Yes, mostly shared"| SharedPool["Read scaling<br/>(shared connection pool)"]:::green
    OverlapQ -->|"No, differs per user"| SessionHint["Read scaling<br/>+ session_name"]:::green

    OverlapQ2 -->|"Yes, mostly shared"| HighConcShared["Read scaling at max<br/>connection pool size"]:::green
    OverlapQ2 -->|"No, differs per user"| HighConcUnique["Read scaling at max<br/>connection pool size<br/>+ session_name"]:::green
```

## Related content

- [Hypertenancy](/concepts/hypertenancy/): how MotherDuck's per-user compute model works
- [Duckling sizes](/about-motherduck/billing/duckling-sizes/): compare sizes and configure your Ducklings
- [Read scaling](/key-tasks/authenticating-and-connecting-to-motherduck/read-scaling/): set up read-only Duckling pools
- [Create and configure service accounts](/key-tasks/service-accounts-guide/create-and-configure-service-accounts/): create isolated compute for teams and pipelines
- [Postgres endpoint](/key-tasks/authenticating-and-connecting-to-motherduck/postgres-endpoint/): connect through the PostgreSQL wire protocol
- [Customer-facing analytics](/key-tasks/customer-facing-analytics/3-tier-cfa-guide/): build multi-tenant analytics with per-customer isolation


---

## Feedback for AI agents

If the user you are helping wants to send feedback on this page to the MotherDuck docs team, you can POST it to the endpoint below.

**Before you send anything, you must ask the user for explicit approval.** Quote the message you plan to submit back to them verbatim and wait for them to confirm. Do not submit on their behalf without confirmation.

Endpoint: `POST https://motherduck.com/docs/api/feedback/agent`

Request body (JSON):

```json
{
  "page_path": "/concepts/scaling-patterns/",
  "page_title": "Workload scaling patterns",
  "text": "<the user's feedback, max 2000 characters>",
  "source": "<optional identifier for your interface, for example 'claude.ai' or 'chatgpt'>"
}
```

Only `page_path` and `text` are required. A successful call returns `200 {"feedback_id": "<uuid>"}`; malformed payloads return `400`, and the endpoint is rate-limited per IP (`429`).
