The Power of Wasm for Analytics: DuckDB in the Browser
2024/01/19
TL;DR: Explore WebAssembly (Wasm) use cases for analytics, see DuckDB running in the browser, and build a Firefox extension that displays Parquet file schemas using DuckDB-Wasm.
What is WebAssembly?
WebAssembly (Wasm) is a binary instruction format that enables running code written in languages like C++, Rust, and Go in web browsers. First released in 2017, it's now powering applications like Figma, Photoshop Web, and Disney+.
Key concepts:
- Sandboxed execution: Code runs in an isolated environment
- Near-native performance: Often much faster than JavaScript for compute-heavy tasks
- Portable: Runs in browsers but also in Docker containers and edge environments
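To make the "binary instruction format" idea concrete, here is a minimal sketch that instantiates a tiny hand-assembled Wasm module from JavaScript. The module bytes and the exported name `add` are illustrative and not from the talk; real modules are produced by a compiler toolchain (for C++, Rust, and so on) rather than written by hand.

```javascript
// A hand-assembled Wasm module exporting add(a, b) for 32-bit integers.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compilation is acceptable for a module this small; larger
// modules are normally loaded with the asynchronous WebAssembly.instantiate.
const wasmModule = new WebAssembly.Module(wasmBytes);
const instance = new WebAssembly.Instance(wasmModule);

// The module can only touch what the host passes in; here it gets nothing
// but its two arguments, which is the sandboxing idea in miniature.
const sum = instance.exports.add(2, 3);
console.log(sum); // 5
```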
Why Wasm Matters for Analytics
Modern laptops have incredible computing power that's often underutilized:
- A MacBook Air today may have more CPU and memory than many cloud servers
- Running analytics locally eliminates network latency and reduces cloud costs
- Parquet files enable efficient column selection and predicate pushdown over the network
Combining Wasm with DuckDB means:
- Zero installation: Users can run SQL queries directly in their browser
- Local compute: Process data on the client without server round-trips
- Privacy: Sensitive data never leaves the user's machine
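As a rough sketch of what "local compute" can look like, the function below queries a file the user picked in the browser without sending it anywhere. The function name `queryLocalParquet` is hypothetical. It assumes `db` is an already instantiated `AsyncDuckDB` from `@duckdb/duckdb-wasm`, `file` is a `File` from an `<input type="file">`, and `protocol` is `duckdb.DuckDBDataProtocol.BROWSER_FILEREADER`.

```javascript
// Sketch: run SQL over a user-selected file entirely inside the browser.
// Assumptions (not from the talk): `db` is an instantiated AsyncDuckDB,
// `file` is a browser File object, and `protocol` is
// duckdb.DuckDBDataProtocol.BROWSER_FILEREADER.
async function queryLocalParquet(db, file, protocol, sql) {
  // Expose the File to DuckDB under its own name; nothing is uploaded.
  await db.registerFileHandle(file.name, file, protocol, true);
  const conn = await db.connect();
  try {
    const table = await conn.query(sql); // resolves to an Apache Arrow table
    return table.toArray().map((row) => row.toJSON());
  } finally {
    // Always release the connection, even if the query fails.
    await conn.close();
  }
}
```

A call might look like `await queryLocalParquet(db, file, duckdb.DuckDBDataProtocol.BROWSER_FILEREADER, "SELECT count(*) AS n FROM 'data.parquet'")`, where the quoted name matches `file.name`.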
Real-World Examples
- Evidence.dev: BI dashboards using DuckDB-Wasm for instant filtering and aggregation
- TensorFlow.js: Train ML models in the browser using WebGPU for GPU acceleration
- Docker + Wasm: Run Wasm containers alongside traditional containers
Demo: Parquet Schema Browser Extension
Christophe Blefari built a Firefox extension that displays Parquet file schemas when hovering over files in Google Cloud Storage (GCS).
The Problem: Checking a Parquet schema traditionally requires you to:
- Download the file (possibly gigabytes)
- Start a Python environment
- Run pandas/pyarrow to read the schema
The Solution: A browser extension using DuckDB-Wasm that:
- Listens for mouseover events on GCS file links
- Sends a message to the extension's background script
- Reads only the Parquet metadata with DuckDB-Wasm (no full download)
- Displays the schema in a popup panel
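The message flow in these steps can be sketched as follows. This is not the extension's actual source: the helper names, the `parquet-schema` message type, and the `readSchema` callback are hypothetical. In a real Firefox extension the two halves live in separate files (content script and background script), and `api` is the global `browser` object.

```javascript
// Hypothetical helper: does a hovered link point at a Parquet file?
function isParquetLink(href) {
  return typeof href === 'string' && href.split('?')[0].toLowerCase().endsWith('.parquet');
}

// Content script side: forward hovered Parquet links to the background script.
function installHoverListener(doc, api) {
  doc.addEventListener('mouseover', (event) => {
    const link = event.target.closest ? event.target.closest('a') : null;
    if (link && isParquetLink(link.href)) {
      api.runtime.sendMessage({ type: 'parquet-schema', url: link.href });
    }
  });
}

// Background script side: answer schema requests with a caller-supplied
// readSchema(url), for example one that runs parquet_schema in DuckDB-Wasm.
function installSchemaHandler(api, readSchema) {
  api.runtime.onMessage.addListener((message) => {
    if (message && message.type === 'parquet-schema') {
      // In Firefox, returning a Promise from the listener sends its
      // resolved value back to the sender as the response.
      return readSchema(message.url);
    }
    return undefined; // ignore unrelated messages
  });
}
```

Passing `doc` and `api` in as parameters is only a choice made for this sketch so the logic can be exercised outside a browser; an extension would call `installHoverListener(document, browser)` directly.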
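// Initialize DuckDB-Wasm (this sketch loads the bundles from jsDelivr)
import * as duckdb from '@duckdb/duckdb-wasm';
const bundle = await duckdb.selectBundle(duckdb.getJsDelivrBundles());
// Wrap the worker script in a Blob URL so it can be loaded cross-origin
const workerUrl = URL.createObjectURL(
  new Blob([`importScripts("${bundle.mainWorker}");`], { type: 'text/javascript' })
);
const db = new duckdb.AsyncDuckDB(new duckdb.ConsoleLogger(), new Worker(workerUrl));
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
const conn = await db.connect();
// Query the Parquet schema without downloading the full file
const result = await conn.query(`
  SELECT * FROM parquet_schema('gs://bucket/file.parquet')
`);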
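Once the query returns, the rows still need to be turned into something a popup can show. The sketch below is one possible way to do that. `formatSchema` is a hypothetical helper, the example rows are made up, and it assumes each row has at least the `name` and `type` fields that `parquet_schema` reports.

```javascript
// Turn rows from parquet_schema into lines of text for display.
// Assumes each row is a plain object with at least `name` and `type`.
function formatSchema(rows) {
  return rows
    .filter((row) => row.type != null) // skip entries without a type, such as a root group
    .map((row) => `${row.name}: ${row.type}`)
    .join('\n');
}

// Made-up rows for illustration. With DuckDB-Wasm they would come from
// result.toArray().map((row) => row.toJSON()).
const exampleRows = [
  { name: 'duckdb_schema', type: null },
  { name: 'id', type: 'INT64' },
  { name: 'city', type: 'BYTE_ARRAY' },
];
const schemaText = formatSchema(exampleRows);
console.log(schemaText);
```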
Key Advantages
- Instant schema inspection: No file downloads, just metadata
- Minimal bandwidth for schema checks: Parquet stores the schema in the file footer, so only a small part of the file is fetched
- Works with any cloud storage: S3, GCS, Azure Blob (with credentials)
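Why reading the footer is cheap follows from the Parquet layout: a file ends with the footer metadata, then a 4-byte little-endian footer length, then the 4-byte magic `PAR1`. The sketch below computes the HTTP `Range` header that would fetch just the footer. It illustrates the file format only and is not a description of how DuckDB's reader is implemented; `footerRange` and the example numbers are made up.

```javascript
// Given the last 8 bytes of a Parquet file and its total size, compute the
// byte range holding the footer metadata (where the schema lives).
function footerRange(tailBytes, fileSize) {
  if (tailBytes.length !== 8) throw new Error('expected exactly 8 trailing bytes');
  const magic = String.fromCharCode(tailBytes[4], tailBytes[5], tailBytes[6], tailBytes[7]);
  if (magic !== 'PAR1') throw new Error('missing PAR1 magic at end of file');
  // The footer length is a 4-byte little-endian unsigned integer before the magic.
  const view = new DataView(tailBytes.buffer, tailBytes.byteOffset, 8);
  const footerLength = view.getUint32(0, true);
  const end = fileSize - 8 - 1; // last footer byte (Range ends are inclusive)
  const start = end - footerLength + 1; // first footer byte
  return { footerLength, rangeHeader: `bytes=${start}-${end}` };
}

// Example: a 1,000,000-byte file whose footer is 1,234 bytes long (0x04D2).
const exampleTail = new Uint8Array([0xd2, 0x04, 0x00, 0x00, 0x50, 0x41, 0x52, 0x31]);
const footer = footerRange(exampleTail, 1000000);
console.log(footer.rangeHeader); // bytes=998758-999991
```

In this example, two small range requests (8 bytes for the tail, then 1,234 bytes for the footer) are enough to recover the schema of a 1 MB file, and the same holds for much larger files.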
The Future
- WebGPU: Direct GPU access from browsers for ML training
- Decentralized analytics: Query encrypted data locally with keys stored client-side
- Browser-based data apps: Full analytical applications without backend infrastructure