
Customer-Facing Analytics Guide (3-tier Architecture)

To build a Customer-Facing Analytics (CFA) application on MotherDuck, follow this step-by-step guide. It focuses on patterns for a traditional 3-tier architecture, but you can also run 1.5-tier apps using Wasm.

You'll know you're done when:

  • Your application (B2B Tool) can run analytics queries for a customer (Goose Inc) against MotherDuck from a backend service.
  • Data from a transactional database is synced into a per-customer MotherDuck database on a schedule using your orchestrator.
  • You understand when to add more service accounts, databases, and read-scaling capacity as your product grows.

Use this guide when you want to:

  • Build a 3-tier web app (browser → app server → MotherDuck) with embedded analytics.
  • Use per-customer service accounts and databases to isolate data and compute.
  • Keep analytics data in MotherDuck in sync with your transactional database.

Before starting, ensure you have:

  • A MotherDuck account and an organization you can use for development.
  • Basic familiarity with Python and SQL.
  • Access to a PostgreSQL database (or a test instance) with an orders-style schema.
  • Python installed locally (the duckdb package supports current Python 3 releases).

This guide assumes you've read the conceptual overview Customer-Facing Analytics Getting Started.

1. Understand the 3-Tier CFA Architecture

In this guide, you are building B2B Tool, a SaaS product that serves analytics to employees at many customer companies. Each customer company gets:

  • Its own service account in MotherDuck.
  • Its own database(s) for analytics tables.
  • Its own compute (Ducklings) for queries and data loading.

Your high-level architecture is built on hyper-tenancy: each company (Goose Inc, Duck Co) owns its own MotherDuck database(s), which store only that company's analytics data; compute is isolated, with each company running on its own Ducklings; and heavy workloads for one customer cannot slow down others.

You will:

  1. Set up a dev organization and add other developers on the team.
  2. Create a service account for your first customer company (Goose Inc).
  3. Sync data from your transactional DB, such as Postgres, into Goose Inc’s MotherDuck analytical database using your chosen replication method.
  4. Connect your backend service to MotherDuck with a read token to serve analytics queries.
  5. Plan how to scale to many customer companies and higher concurrency.

Alternative to per-customer service accounts

The per-customer service account pattern is the strongest isolation model. Some teams, especially those building B2C or lighter multi-tenant apps, opt for a simpler setup:

  • Keep a single writer service account that owns all customer databases.
  • Create a read scaling token for that account and configure the flock size to target one Duckling per concurrent end user (default max 16, adjustable via support). For cost control, users can share a Duckling, but that increases contention.
  • Have each end user connect in single attach mode to the one database they should see (md:<db>?attach_mode=single), which avoids carrying other attachments from the workspace.
  • Use session_hint in the connection string to keep an end user pinned to the same read-scaling Duckling for cache reuse and steadier latency.

This model trades away service-account isolation in favor of operational simplicity. Ensure your security and compliance needs allow a shared service account before choosing it. Read-scaling replicas are eventually consistent. If you need fresher reads on demand, combine CREATE SNAPSHOT on the writer with REFRESH DATABASE on the read-scaling connections.

Example connection string for an end user:

md:customer_db?attach_mode=single&session_hint=<user_id_or_hash>
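
A minimal Python sketch of that connection; the database name, user ID, and token variable here are illustrative, not prescribed values:

import hashlib
import duckdb

# Read scaling token for the shared writer service account
# (fetch from your secret store in practice; placeholder here).
read_scaling_token = "<read_scaling_token>"

# Hash the user ID so the session hint is URL-safe and opaque.
session_hint = hashlib.sha256(b"kate@gooseinc.com").hexdigest()[:16]

# Attach only customer_db and pin this user to one read-scaling
# Duckling so repeated queries reuse its cache.
conn = duckdb.connect(
    "md:customer_db"
    f"?attach_mode=single&session_hint={session_hint}"
    f"&motherduck_token={read_scaling_token}"
)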

2. Set Up Your Dev Environment and Organization

Prepare your dev environment:

  1. Create your dev organization and account

    1. Go to https://motherduck.com and sign up or log in with your work email (for example, manager@b2btool.com).
    2. Create or select an organization you’ll use for development (for example, B2B Tool Co).
    3. In the MotherDuck UI, open the default database (my_db) and confirm you can run a simple query such as:
      SELECT 1;
      You should see a single row with the value 1.
  2. Upload a small CSV to confirm data ownership and access

    1. In the MotherDuck web UI, upload a small example CSV (for example, orders_sample.csv) into my_db. If this step is unclear, check out the MotherDuck tutorial on loading data.
    2. Run a query like:
      SELECT COUNT(*) AS row_count FROM orders_sample;
      You should see the number of rows you uploaded.
  3. Invite a second developer and share data

    1. Invite devlead@b2btool.com to your B2B Tool Co organization.
    2. Create a new database in your personal account (for example, b2btool_dev) and copy or create a simple table.
    3. Share that database with your colleague following the Sharing Data guide.
    4. Ask your colleague to query the shared database from their account.
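
If you'd rather verify connectivity from Python than the UI, a minimal check looks like this (it assumes you've created an access token in the UI and exported it as MOTHERDUCK_TOKEN):

import duckdb

# duckdb reads MOTHERDUCK_TOKEN from the environment to authenticate
conn = duckdb.connect("md:my_db")
print(conn.execute("SELECT 1").fetchall())  # expect [(1,)]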

At this point:

  • You have a dev org with two human users.
  • You’ve seen how database ownership and read-only sharing works.


3. Create a Service Account for a Customer Company

For customer-facing analytics, your customers usually do not log into MotherDuck directly. Instead:

  • Your application mediates access.
  • Each customer company gets a service account in your MotherDuck organization.
  • Your backend uses that service account’s tokens to load and query data.

In this guide, you’ll create a service account for your first customer company: Goose Inc.

3.1 Create a service account in the UI

  1. In the MotherDuck UI, go to the Service Accounts section for your organization.
  2. Click Create Service Account.
  3. Name it something like goose-inc-service-account.
  4. Save the generated access token in your secret manager or a secure store.

For more detail, see the Service Accounts Guide.

3.2 (Optional) Create service accounts via REST API

Later, you will likely automate service account creation. To create a service account programmatically:

Your provisioning workflow should: (1) detect a new customer signup, (2) call users-create-service-account for that company, (3) call users-create-token, and (4) store the token metadata (or an alias) in your transactional database so your backend can look it up later.
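
A sketch of that flow using Python's requests library; the endpoint paths and payload fields below are illustrative placeholders rather than the real API shapes, so map them to the users-create-service-account and users-create-token operations in the REST API reference:

import requests

API_BASE = "https://api.motherduck.com"  # placeholder base URL
ADMIN_TOKEN = "<org_admin_token>"        # org-level token from your secret store


def provision_customer(company_slug: str) -> dict:
    headers = {"Authorization": f"Bearer {ADMIN_TOKEN}"}

    # (2) Create a service account for the new company
    # (placeholder path for users-create-service-account).
    sa_resp = requests.post(
        f"{API_BASE}/v1/users",
        headers=headers,
        json={"username": f"{company_slug}-service-account"},
        timeout=30,
    )
    sa_resp.raise_for_status()

    # (3) Create an access token for that service account
    # (placeholder path for users-create-token).
    token_resp = requests.post(
        f"{API_BASE}/v1/users/{company_slug}-service-account/tokens",
        headers=headers,
        json={"name": f"{company_slug}-default"},
        timeout=30,
    )
    token_resp.raise_for_status()

    # (4) Hand back what the caller should persist in your
    # transactional database / secret manager.
    return {"service_account": sa_resp.json(), "token": token_resp.json()}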

4. Model and Load Customer Data in MotherDuck

Next, populate data for Goose Inc into its own MotherDuck database.

Assume:

  • Your transactional system (B2B Tool) uses PostgreSQL.
  • Each customer company is an ecommerce store with:
    • orders table: order-level facts.
    • fulfillments table: shipment or delivery events.

Example schema:

CREATE TABLE orders (
    order_id BIGINT PRIMARY KEY,
    company_id BIGINT,
    order_date TIMESTAMP,
    customer_email TEXT,
    total_amount NUMERIC(18, 2),
    status TEXT
);

CREATE TABLE fulfillments (
    fulfillment_id BIGINT PRIMARY KEY,
    order_id BIGINT REFERENCES orders(order_id),
    fulfilled_at TIMESTAMP,
    carrier TEXT,
    status TEXT
);

Example data:

INSERT INTO orders
SELECT
    row_number() OVER () AS order_id,
    (random() * 9 + 1)::BIGINT AS company_id,
    current_timestamp - INTERVAL (random() * 365) DAY AS order_date,
    'customer' || (random() * 999 + 1)::INT || '@example.com' AS customer_email,
    (random() * 9999 + 1)::NUMERIC(18, 2) AS total_amount,
    (['pending', 'processing', 'shipped', 'delivered', 'cancelled'])[(random() * 4)::INT + 1] AS status
FROM range(1000);

INSERT INTO fulfillments
SELECT
    row_number() OVER () AS fulfillment_id,
    (random() * 999 + 1)::BIGINT AS order_id,
    current_timestamp - INTERVAL (random() * 300) DAY AS fulfilled_at,
    (['UPS', 'FedEx', 'USPS', 'DHL', 'Amazon Logistics'])[(random() * 4)::INT + 1] AS carrier,
    (['pending', 'in_transit', 'out_for_delivery', 'delivered', 'failed'])[(random() * 4)::INT + 1] AS status
FROM range(1000);

Info: Use your orchestrator and ingestion tool to keep this data in sync for each customer company.

4.1 Create a MotherDuck database for Goose Inc

Use the Goose Inc service account’s token to create a database for that customer:

CREATE DATABASE goose_inc;

Run this in the UI after impersonating the Goose Inc service account, or connect as that service account from Python and issue the CREATE DATABASE statement, as sketched below.
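
For the Python route, a minimal sketch (GOOSE_INC_ADMIN_TOKEN is an illustrative variable name for the service account's read/write token):

import os
import duckdb

# Connect as the Goose Inc service account; the token determines
# which account will own the new database.
token = os.environ["GOOSE_INC_ADMIN_TOKEN"]
conn = duckdb.connect(f"md:?motherduck_token={token}")
conn.execute("CREATE DATABASE goose_inc")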

Note: To move forward, replicate your data into goose_inc. A simple pattern for replicating a Postgres database to MotherDuck with dlt is sketched below.
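
This is a minimal dlt sketch; the connection strings are placeholders, and you should check the dlt docs for the exact sql_database and motherduck configuration options:

import dlt
from dlt.sources.sql_database import sql_database

# Read the orders and fulfillments tables from your transactional
# Postgres instance (connection string is a placeholder).
source = sql_database(
    credentials="postgresql://app_user:password@db.internal:5432/b2btool",
    table_names=["orders", "fulfillments"],
)

# Load into the goose_inc database, authenticating as the
# Goose Inc service account.
pipeline = dlt.pipeline(
    pipeline_name="goose_inc_sync",
    destination=dlt.destinations.motherduck(
        credentials="md:goose_inc?motherduck_token=<goose_inc_token>"
    ),
    dataset_name="main",
)

# In production you'd also filter source rows to Goose Inc's
# company_id before loading.
pipeline.run(source)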

5. Run Analytics Queries from Your Backend

With data in Goose Inc’s MotherDuck database, your backend can run analytics queries.

At a high level:

  1. Your user (Kate at Goose Inc) logs into B2B Tool.
  2. Your backend authenticates Kate and determines she belongs to the Goose Inc customer company.
  3. Your backend looks up Goose Inc’s read token for its service account from your transactional database or secret store.
  4. Your backend uses that read token to run analytics queries against the goose_inc database in MotherDuck.

5.1 Create a read token for Goose Inc

For production, you’ll usually create a token dedicated to reading analytics data:

  1. In the MotherDuck UI, impersonate the Goose Inc service account.
  2. Create a new access token intended only for read workloads.
  3. Store this token securely and associate it with Goose Inc in your transactional database.

You can also create tokens via the REST API using the users-create-token endpoint.

5.2 Connect from Python using DuckDB

Your backend service connects to MotherDuck using the DuckDB client and the md: connection string. Typically, you:

  • Provide the Goose Inc read token, either via the MOTHERDUCK_TOKEN (or motherduck_token) environment variable or as a motherduck_token parameter in the connection string.
  • Connect to the goose_inc database using DuckDB.

Example helper in your backend (for example, analytics_client.py):

import os
import duckdb


def get_customer_connection(customer_id: str):
    """
    Get a DuckDB connection to a customer's MotherDuck database.

    Args:
        customer_id: Identifier for the customer (e.g., 'goose_inc', 'duck_co')

    Returns:
        DuckDB connection to the customer's database
    """
    # Look up the customer's read token from your secret store or environment.
    # In production, you'd fetch this from your transactional DB or secret manager.
    token_env_var = f"{customer_id.upper().replace('-', '_')}_READ_TOKEN"
    read_token = os.environ.get(token_env_var)

    if not read_token:
        raise ValueError(f"Read token not found for customer: {customer_id}")

    # Pass the token in the connection string instead of mutating the global
    # MOTHERDUCK_TOKEN environment variable, which would race between
    # concurrent requests for different customers.
    # The database name typically matches the customer_id.
    conn = duckdb.connect(f"md:{customer_id}?motherduck_token={read_token}")
    return conn

Then, a simple analytics function in your API service:

def get_customer_kpis(customer_id: str):
    conn = get_customer_connection(customer_id)

    query = """
        SELECT
            date_trunc('day', order_date) AS day,
            COUNT(*) AS orders_count,
            SUM(total_amount) AS gross_revenue
        FROM orders
        WHERE order_date >= current_date - INTERVAL 30 DAY
        GROUP BY 1
        ORDER BY 1
    """

    result = conn.execute(query).fetch_df()
    # Convert to a JSON-serializable structure for your frontend
    return result.to_dict(orient="records")

Expose this from a REST endpoint such as /api/customers/{customer_id}/kpis and render the results in your frontend dashboards. The same code works for any customer by passing their identifier.
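
As a sketch, assuming a FastAPI backend (the framework and route shape are illustrative choices, not requirements):

from fastapi import FastAPI, HTTPException

app = FastAPI()


@app.get("/api/customers/{customer_id}/kpis")
def customer_kpis(customer_id: str):
    # In a real app, first verify that the authenticated user
    # belongs to this customer company before querying its database.
    try:
        return get_customer_kpis(customer_id)
    except ValueError:
        raise HTTPException(status_code=404, detail="Unknown customer")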


6. Scaling to Many Customer Companies

As your product grows, add more customer companies. For each new company:

  1. Create a service account (via UI or REST API).
  2. Create one or more databases for that company’s analytics data.
  3. Configure your orchestrator to run a dlt pipeline (or equivalent) for that company.
  4. Create a read token for the company and store it in your transactional database.

Your architecture naturally scales horizontally:

Each service account and database pair has its own compute, minimizing noisy neighbors and making performance a per-customer concern.

7. Scaling a Single Customer to High Concurrency

When a customer (for example, Goose Inc) grows to hundreds or thousands of simultaneous users, use these levers:

  1. Increase the instance size for the service account’s default Duckling to handle heavier transformation jobs (vertical scaling).
  2. Use read scaling for high-concurrency read workloads:
    • Refer to read scaling to create read-scaling Ducklings for Goose Inc's read token.
    • Point your backend’s analytics queries at the read-scaling token instead of the main read/write token.
  3. Optimize queries and models:
    • Pre-aggregate frequently-used metrics.
    • Use summary tables to avoid scanning the full orders table on every request.

For most applications, you start with a single Duckling per customer and introduce read scaling only when your monitoring shows sustained high concurrency or latency issues.
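
For the pre-aggregation lever, here is a minimal sketch of a summary table your load job could maintain after each sync (table and column names are illustrative):

import duckdb

# Run on the writer connection as part of each load job; dashboards
# then read daily_order_summary instead of scanning all of orders.
conn = duckdb.connect("md:goose_inc")  # uses MOTHERDUCK_TOKEN from the environment
conn.execute("""
    CREATE OR REPLACE TABLE daily_order_summary AS
    SELECT
        date_trunc('day', order_date) AS day,
        COUNT(*) AS orders_count,
        SUM(total_amount) AS gross_revenue
    FROM orders
    GROUP BY 1
""")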

8. Troubleshooting and When to Add More Service Accounts

As you operate your CFA deployment, you may run into several common situations.

8.1 Queries are slow or time out for one customer

If you see slow queries or timeouts for a specific customer:

  • Check query patterns:
    • Are you scanning too much data on every request?
    • Can you pre-aggregate or cache common metrics?
  • Scale compute for that customer:
    • Increase the instance size for the service account’s Duckling.
    • Add read-scaling Ducklings, or increase the instance size behind the read token that customer uses.

You rarely need to change the number of service accounts in this case; focus on scaling and optimizing the existing one.

8.2 Data loads interfere with reads

If your hourly (or more frequent) data load jobs are locking tables and causing read queries to queue:

  • Consider:
    • Scheduling heavy load jobs during off-peak times.
    • Using zero-copy cloning (CREATE SNAPSHOT and REFRESH DATABASE) patterns so that readers query a snapshot database while writers update the primary.
  • Ensure you are using a dedicated read token and read-scaling configuration for user-facing queries.
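
Here is a sketch of that snapshot-and-refresh pattern; the statement shapes are illustrative, so confirm the exact CREATE SNAPSHOT and REFRESH DATABASE syntax in the MotherDuck docs:

import duckdb

# On the writer, publish a consistent snapshot after a load job
# finishes (syntax illustrative).
writer = duckdb.connect("md:goose_inc?motherduck_token=<write_token>")
writer.execute("CREATE SNAPSHOT OF goose_inc")

# On a read-scaling connection that needs fresher data on demand
# (read replicas are otherwise eventually consistent).
reader = duckdb.connect("md:goose_inc?motherduck_token=<read_scaling_token>")
reader.execute("REFRESH DATABASE goose_inc")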

8.3 When to add more service accounts

In most B2B scenarios:

  • You create one service account per customer company.
  • All users at that company share the same analytics data and compute via your application.

You should consider adding additional service accounts when:

  • You need hard isolation between different environments (for example, separate service accounts for Prod, Staging, and Sandbox within the same customer).
  • A customer has sub-tenants of their own and you want to isolate compute and data at that sub-tenant level (for example, separate service accounts per region or per major business unit).

When you add new service accounts:

  1. Create the service account (UI or REST API).
  2. Create dedicated databases for the new scope.
  3. Create tokens and wire them into your application’s configuration.

8.4 Common token and permission issues

If you see authentication or permission errors:

  • Token expired or revoked:
    • Rotate the token in MotherDuck and update your secret store.
  • Permission denied on database or table:
    • Confirm that the service account owns the database or has the necessary privileges.
    • Re-check sharing settings if you are using shared data.

9. Next Steps

Once you have a basic 3-tier CFA deployment working:

  • Automate provisioning:
    • Automate service account and token creation using the REST APIs.
    • Automate database and schema creation for new customer companies.
  • Automate data loading:
    • Move your dlt jobs fully into your orchestrator so that new companies are onboarded with little manual work.
    • Monitor load durations and adjust scheduling as your data grows.
  • Enhance your frontend:
    • Add charts and drill-downs powered by MotherDuck.
    • Explore additional guides under Customer-Facing Analytics for advanced topics.

For a high-level conceptual overview and architecture comparison, see the Customer-Facing Analytics Getting Started page.