Materialized views vs on-the-fly queries for embedded BI

18 min read
Materialized views vs on-the-fly queries for embedded BI

TL;DR

  • Materialized views win when your tenant count is low and your data changes slowly. You pre-compute once, serve fast reads, and tolerate some staleness.
  • On-the-fly queries win when freshness matters more than predictable latency, and when your metric slices vary too much to pre-aggregate sensibly.
  • Multi-tenancy breaks MVs at scale. Each tenant × each metric slice needs its own view, its own refresh schedule, and its own storage. Add tenants and the count explodes.
  • For embedded BI, the answer is Hypertenancy plus dual execution. MotherDuck gives each tenant an isolated DuckDB instance, so server-side queries never contend, and DuckDB-Wasm runs interactive filtering client-side at 5–20ms with no materialized views.

The core trade-off: materialized views vs on-the-fly queries

Materialized views pre-compute results and store them, so reads return in milliseconds because the database already did the work. The cost lands in two places. Your data goes stale between refreshes, and you inherit the operational burden of keeping those refreshes correct and on schedule. On-the-fly query acceleration flips the trade. Every query runs against live data, so the numbers are always current, but you pay for that freshness in variable latency and concurrency pressure when many users hit the warehouse at once.

Consider an MV-backed fintech dashboard. You materialize a daily revenue rollup per merchant and refresh it every fifteen minutes. A merchant signs in at 10:07 to reconcile a chargeback that posted at 10:03, and the number is wrong. For a finance product, a stale revenue figure is not a cosmetic bug. It erodes trust, and in regulated contexts it can put you in front of an auditor explaining why the dashboard disagreed with the ledger. Shorten the refresh interval to fix it, and your compute bill climbs while you still have a window where the data lies.

On-the-fly queries avoid that staleness because they read the source directly, but they expose a different failure mode under load. Picture a shared warehouse like Snowflake, BigQuery, or Redshift serving an embedded dashboard with fifty tenants. Each tenant fires an unaggregated query that scans, joins, and groups across their slice. On a shared compute pool, those fifty queries fight for the same slots. P50 latency might look fine in a demo with three users, but P99 spikes the moment real concurrency arrives, and one tenant running an expensive filter degrades the experience for everyone else.

Neither side wins outright, and the right choice depends on your tenant count, freshness requirement, and latency SLA. MVs trade freshness and ops overhead for fast, predictable reads. On-the-fly trades latency stability for always-current data. The rest of this guide works through where each lands for embedded BI, and why multi-tenancy pushes the MV side past its breaking point.

Why multi-tenant embedded BI breaks the MV model

A single-tenant dashboard might run fine on a handful of materialized views. Multi-tenant embedded BI multiplies that count by every customer you serve, and the math turns hostile fast. You don't build one MV per metric. You build one MV per tenant per metric slice, because each customer can only see their own rows. Ten metrics across 500 tenants is 5,000 materialized views, and that number grows every time you sign a new account or ship a new chart.

Storage is the first problem, and it compounds linearly with tenant count. Every MV holds a pre-aggregated copy of that tenant's data, so you're paying to store the same logical metric thousands of times over. A revenue-by-day rollup that occupies a few megabytes per tenant becomes gigabytes once you cross a few hundred customers, and you're storing it on top of the raw tables that already hold the same facts.

Refresh orchestration is the second problem, and it scales worse than storage because it competes for the same compute. Each MV needs a refresh schedule, and 5,000 of them means 5,000 refresh jobs fighting for warehouse capacity. On Snowflake or BigQuery, those refreshes burn credits whether or not a tenant ever opens their dashboard. Stagger them and you trade freshness for cost. Run them concurrently and you spike your warehouse bill while starving live queries of resources.

DDL complexity is the third problem, and it's the one that quietly breaks your deploy pipeline. Every new tenant requires creating their slice of MVs, and every schema change to a metric requires altering all of them at once. Adding a column to a single rollup definition means running thousands of CREATE OR REPLACE statements, each of which can fail independently and leave your tenants in inconsistent states. Tools like Materialize and ClickHouse make individual MVs cheaper to maintain, but neither erases the combinatorial count. You still own N tenants times M metrics worth of definitions, schedules, and migrations.

The failure isn't gradual. A system that works at 20 tenants becomes unmanageable at 2,000, because all three costs grow with the same multiplier. By the time you notice the refresh bill or the failed migration, you've already built your product around an approach that punishes the growth you wanted. The harder you sell, the heavier the operational debt gets.

Hypertenancy: per-tenant compute isolation as the server-side fix

Hypertenancy gives each application user a dedicated DuckDB instance on the server, called a Duckling, so one tenant's heavy query never steals CPU from another's. The multi-tenant MV explosion exists because a shared warehouse forces you to pre-aggregate per tenant to keep response times predictable under load. When every tenant runs on isolated compute, you remove the contention that made pre-aggregation necessary in the first place. A scan that touches a few million rows for tenant A runs on tenant A's Duckling and leaves tenant B's dashboard untouched.

The isolation matters because the noisy-neighbor problem is the real reason on-the-fly queries get a bad reputation in multi-tenant BI. On a shared Snowflake warehouse or Redshift cluster, 50 tenants hitting the same compute at once produces queue waits and P99 latency spikes that no amount of query tuning fixes. Hypertenancy sidesteps that by scaling DuckDB out, one engine per user, rather than packing every tenant onto one large engine and fighting for slots.

Pulse instances make this economical because MotherDuck meters them per query rather than by wall-clock time. You don't pay to keep a warehouse warm for a tenant who checks their dashboard twice a day. A Duckling spins up, answers the query, and you're billed for that work, not for idle minutes between requests. That billing model is what makes per-tenant isolation affordable at high tenant counts. Giving 10,000 tenants their own always-on compute would be ruinous under a wall-clock model, but per-query metering charges only for actual queries, so consistent latency at scale stops being a cost problem.

Removing contention also removes the main argument for materialized views. The reason teams pre-aggregate is to guarantee a fast read path when the live query path is unreliable under concurrency. Isolated compute makes the live path reliable, so you can query raw tables on demand and still hit single-digit-to-low-hundreds millisecond responses without maintaining a refresh pipeline. You keep fresh data and drop the operational debt of N tenants times M metric slices worth of MVs.

Layers and PriceMedic both run customer-facing analytics on this model, and their workloads show the isolation holds when many tenants query concurrently. PriceMedic serves dashboards to healthcare buyers where each customer sees only their own data, and Layers ships embedded analytics inside its product. Neither maintains the per-tenant MV sprawl a shared warehouse would have forced on them. Both rely on per-tenant Ducklings to keep one customer's load from degrading another's experience, which is the exact failure mode that pushes shared-warehouse teams toward pre-aggregation in the first place.

Dual execution: eliminating MV latency for interactive workloads

The interactivity gap between materialized views and on-the-fly queries closes entirely when you run a full database engine on both the server and the client. MotherDuck does exactly that. It places a complete DuckDB engine in the cloud and a second one in the browser via WebAssembly. When an embedded Dive loads, the cloud engine runs the initial heavy query and streams the result set into the browser's local DuckDB instance. From that point on, every filter, cross-filter, and drill-down runs client-side with no network roundtrip.

That architecture produces interaction latency of 5 to 20 milliseconds. As Alex Monahan put it in a MotherDuck webinar, "It's like five milliseconds. It's like ten milliseconds... it's so buttery smooth, it's like a movie. That's faster than sixty frames per second." A live demo scrubbed through 4 million NYC 311 records on an interactive map, recalculating aggregations in real time as the cursor moved. No materialized view delivers that, because an MV still has to be queried over the wire, and no on-the-fly server query can beat a calculation that never leaves the laptop.

The mechanism matters here. DuckDB compiled to Wasm is C++, not JavaScript, so it can hold and aggregate millions of rows directly in the browser. Client-side compute is also effectively free, since it runs on a CPU that would otherwise sit idle rendering browser tabs. You cannot bolt this onto Snowflake, BigQuery, or Redshift, because those engines live only on a remote server. The interactive speed depends on having a real database engine on the client end.

Embedding Dives is a Business-plan feature, included at no extra cost. MotherDuck offers two query modes for embedded Dives. Dual mode is the default: each query runs in browser DuckDB-Wasm or server-side through MotherDuck depending on what it needs. Browser features like data exports and the full DuckDB type system, including structs and lists, work only in dual mode. Reach for it whenever users scrub, zoom, and cross-filter live.

Server mode is opt-in, for embeds that only need server-side SQL. You force it by adding ?queryMode=server to the iframe URL. It runs through the Postgres wire protocol, which keeps the deploy simple but doesn't support nested DuckDB types like structs and lists, and it routes each interaction back to the cloud. Use server mode for simpler dashboards where users mostly view rather than manipulate the data.

For interactive customer-facing analytics, dual execution settles the materialized-view-versus-on-the-fly debate. The first query runs server-side on isolated tenant compute, and every interaction after that runs locally at frame-rate speed. You get fresh data and instant interactivity without pre-aggregating anything.

Decision framework

Pick your approach by matching your actual constraints, not by defaulting to whatever your warehouse documentation recommends. The five variables below decide the answer for most embedded BI builds. Tenant count and freshness requirements carry the most weight, because both directly determine whether materialized views stay manageable or collapse under their own refresh load.

Your situationMVsOn-the-flyHybrid
Freshness: sub-second numbers, regulatory or trust risk if staleAvoidBest fitUse for slow-changing rollups only
Freshness: daily or weekly aggregates are fineBest fitWorks, but wastefulStrong fit
Tenant count: under ~10 tenantsManageableWorksOptional
Tenant count: dozens to thousandsMV explosion, avoidBest fit with HypertenancyBest fit
Query pattern: interactive filtering and drill-downsNo help for interactivityAdds latency per clickDual execution wins
Query pattern: scheduled batch reportsBest fitAcceptableStrong fit
Team capacity: no one to own refresh pipelinesAvoidBest fitPick the on-the-fly side
Team capacity: dedicated data engineeringViableViableViable
Latency SLA: strict, consistent ceiling at scaleHard at high tenant countsHypertenancy holds the SLAHolds with isolation

Read the table down your hardest constraint first. If you have many tenants and no team to babysit refresh schedules, the MV column is full of "avoid" for a reason, and on-the-fly with per-tenant compute isolation removes the operational debt entirely. If your data barely moves and you serve a handful of tenants, MVs stay manageable and cheap, so there's no reason to overbuild. Most production embedded BI lands in the hybrid column, where you materialize the slow-moving aggregates and run everything tenant-specific and interactive on the fly.

Hybrid patterns in practice: DuckDB and MotherDuck SQL examples

Most production embedded BI doesn't pick one strategy. You materialize the parts that change slowly and query the rest on the fly, and the trick is matching each query to the cost it actually incurs. Below are three patterns you can build today with DuckDB and MotherDuck.

Pattern 1: Materialize slow-changing aggregates

Daily revenue rollups, monthly cohort counts, and lifetime totals don't need to recompute on every page load. Materialize them on a refresh cadence that matches how often the underlying data changes.

Copy code

CREATE OR REPLACE TABLE daily_revenue_rollup AS SELECT tenant_id, date_trunc('day', created_at) AS day, sum(amount) AS revenue, count(*) AS txn_count FROM transactions GROUP BY tenant_id, day;

One table holds every tenant's rollup, keyed by tenant_id. You refresh it on a schedule, not per request, and you avoid one materialized view per tenant because the tenant lives in a column, not in a separate object.

Pattern 2: Query row-level detail on the fly

When a user drills into a single transaction or filters to last hour's activity, freshness matters more than precomputation. Run that query live against the tenant's own data.

Copy code

SELECT created_at, amount, status, customer_ref FROM transactions WHERE tenant_id = $tenant_id AND created_at >= now() - INTERVAL 1 HOUR ORDER BY created_at DESC;

Under Hypertenancy, each tenant's live query runs on its own Duckling, so a heavy ad-hoc scan from one customer never slows another's dashboard. You get fresh detail without the concurrency risk of a shared warehouse.

Pattern 3: One Dive definition, N tenants via required_resources

The hardest part of multi-tenant embedded BI is data isolation, and this is where per-tenant materialized views usually explode into N copies. MotherDuck solves it at the session layer instead. You define a single Dive that queries an alias like tenant_data, then route each session to the right database when you create the embed session.

Copy code

import requests requests.post( f"https://api.motherduck.com/v1/dives/{DIVE_ID}/embed-session", headers={"Authorization": f"Bearer {ADMIN_TOKEN}"}, json={ "username": "embed_service_account", "required_resources": [ {"url": "md:_share/tenant_a_data/<share_uuid>", "alias": "tenant_data"} ], "initial_state": {"date_range": "last_30_days"} }, )

The required_resources field overrides the Dive's declared source databases for that one session, so the same Dive renders against tenant A's share, then tenant B's, with no change to the Dive itself. You ship one Dive definition and serve every customer from it. For stricter isolation, scope a separate service account per tenant or region, and the session token stays read-only and never appears in your server logs.

You can also seed the view with initial_state, which maps to the useDiveState keys inside the Dive. That lets you render the same Dive against each customer's selected date range or region without forking the code.

Put the three patterns together and the division of labor is clear. Materialized rollups serve the precomputed top-line numbers, on-the-fly queries handle fresh row-level detail on isolated compute, and required_resources keeps one Dive serving all your tenants. You materialize what's cheap to precompute and stale-tolerant, and you query live what needs to be current, without ever multiplying views by tenant count.

Domain scenarios: fintech, SaaS product analytics, and HIPAA healthcare

Three domains push the materialized view tradeoff to its breaking point in different ways. Each one forces a clear recommendation once you name its real constraint.

Fintech: transaction dashboards where stale numbers cost trust

Real-time transaction dashboards in fintech cannot tolerate a materialized view refresh lag. A customer who sees a balance or a settlement total that lags the ledger by thirty minutes assumes your product is broken, and in regulated contexts a stale figure shown as current invites a compliance question you do not want to answer. Run these queries on-the-fly against the live ledger so every read reflects committed state. With per-tenant compute isolation, a spike in one customer's transaction volume never slows the dashboard another customer is watching, which is the failure mode shared warehouses hit during market hours. Reserve materialized views here for genuinely slow-moving rollups like prior-month closed statements, where the source data no longer changes.

SaaS product analytics: high tenant counts and bursty usage

SaaS product analytics breaks the materialized view model on cost, not freshness. When you serve thousands of accounts, pre-aggregating every tenant's metric slice means thousands of materialized views, each on its own refresh schedule, and most of them refresh for tenants who never open the dashboard that day. Usage arrives in bursts. A Monday-morning sign-in wave hits at once, then traffic drops for hours. On-the-fly queries with Hypertenancy match that shape, because Pulse instances meter per query rather than by wall-clock time, so an idle tenant costs you nothing while an active one gets dedicated compute. You stop paying to keep aggregates warm for accounts that are asleep.

HIPAA healthcare: isolation that maps to how you route queries

HIPAA-bound healthcare analytics needs data isolation enforced at the routing layer, not patched on after a query runs. Per-session database routing handles this directly. When you create the embed session, pass required_resources so each tenant's Dive resolves only to that tenant's database, and a query for one provider physically cannot reach another provider's records. Scope a separate service account per tenant, optionally per region, so the credential itself limits what data a session can touch. That structure gives you an audit story auditors accept, because isolation lives in the access path rather than in application logic you have to prove correct. Materialized views work against this goal, since a shared MV mixing tenant rows creates exactly the cross-tenant exposure HIPAA forbids. Keep reads on-the-fly and keep each tenant's data in its own routed database. If your deployment needs a signed Business Associate Agreement (BAA), that's available on MotherDuck's Enterprise plan.

Across all three, the pattern holds. Stale aggregates are acceptable only where the underlying data has genuinely stopped changing. Everywhere freshness, tenant count, or isolation matters, on-the-fly queries on isolated per-tenant compute carry less risk and less operational debt than maintaining a fleet of materialized views.

Real customer evidence

A MotherDuck customer who sells data with a dashboarding interface closed a $30,000 deal less than a week after embedded Dives launched. They saw the release announcement on a Thursday, built a customer-facing data app in 20 minutes, demoed it the following Monday, and signed the contract, with production going live the next day. Jordan Tigani described the timeline on LinkedIn.

The engineering detail matters more than the dollar figure. That same custom report had taken more than 10 hours to build in their legacy embedded BI tool, which made the deal uneconomical at the price point. The 20-minute Dive build replaced an entire BI embed workflow with React and SQL, generated through natural language. The customer skipped the deploy pipeline entirely and published directly.

What this proves about the architecture is concrete. The dual execution model meant the initial query ran server-side on cloud DuckDB, then every filter and drill-down ran client-side in DuckDB-Wasm at single-digit millisecond latency. No materialized views were built or refreshed for the interactive layer. Hypertenancy handled the multi-tenant isolation so the customer's own end users got dedicated compute without contending for shared warehouse resources.

The speed-to-build outcome was not a one-off. Hamilton Ulmer, who co-created Dives, reported that adoption spreads through customer organizations once one team ships a Dive. In one case, a colleague's 10- and 12-year-old children built working Dives in a single day, which says something about how much the build complexity dropped compared to configuring a traditional embed.

The lesson for the MV-versus-on-the-fly decision is direct. When the architecture removes both the pre-aggregation step and the resource contention, you stop paying the operational tax that pushes teams toward materialized views in the first place.

Conclusion

The heuristic comes down to two numbers. If you serve a handful of tenants and your data changes once a day, materialized views work fine, and the refresh cost stays manageable. The math turns against you the moment either number grows. Tenant counts in the dozens or fast-moving data push MV storage and refresh orchestration into operational debt you pay down every week.

For embedded BI at scale, on-the-fly queries with Hypertenancy and dual execution carry less of that debt. Each tenant gets isolated compute, so queries stay fast without pre-aggregation, and interactive filtering runs client-side in the browser at single-digit milliseconds.

To build this, start with the MotherDuck docs and the Embedded Dives announcement, which walk through per-session database routing and the two query modes in detail.

Start using MotherDuck now!