The Power of Wasm for Analytics: DuckDB in the Browser

2024/01/19

TL;DR: Explore WebAssembly (Wasm) use cases for analytics, see DuckDB running in the browser, and build a Firefox extension that displays Parquet file schemas using DuckDB-Wasm.

What is WebAssembly?

WebAssembly (Wasm) is a binary instruction format that enables running code written in languages like C++, Rust, and Go in web browsers. First released in 2017, it's now powering applications like Figma, Photoshop Web, and Disney+.

Key concepts:

  • Sandboxed execution: Code runs in an isolated environment
  • Near-native performance: Often substantially faster than JavaScript for compute-heavy tasks
  • Portable: Runs in browsers but also in Docker containers and edge environments
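To make the "binary instruction format" idea concrete, here is a minimal sketch that loads a tiny hand-assembled Wasm module from raw bytes and calls its one exported function. The module and the `add` export are made up for illustration; real modules are produced by a compiler toolchain rather than typed by hand.

```javascript
// Hand-assembled Wasm module that exports add(a, b) on 32-bit integers.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add
]);

// The import object is empty, so this module has no way to reach the DOM,
// the network, or the file system: it can only compute on its arguments.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});
const add = wasmInstance.exports.add;

console.log(add(2, 3)); // 5
```

The same bytes run unchanged in a browser or in a server-side JavaScript runtime, which is the portability point above. Synchronous compilation is used here only because the module is tiny; larger modules are normally loaded with the asynchronous `WebAssembly.instantiate`.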

Why Wasm Matters for Analytics

Modern laptops have incredible computing power that's often underutilized:

  • A MacBook Air today may have more CPU and memory than many cloud servers
  • Running analytics locally eliminates network latency and reduces cloud costs
  • Parquet files enable efficient column selection and predicate pushdown over the network
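As a rough illustration of the last point, the sketch below models how a reader might plan its network requests from Parquet-style metadata. The row-group layout, offsets, statistics, and the `planByteRanges` and `toRangeHeader` helpers are all invented for this example; a real reader such as DuckDB takes this information from the file's footer and handles many more cases.

```javascript
// Toy metadata: each row group lists its column chunks with a byte offset,
// a byte length, and (for "amount") min/max statistics. Values are made up.
const rowGroups = [
  { columns: { amount: { offset: 4, length: 400, min: 1, max: 50 },
               customer: { offset: 404, length: 900 } } },
  { columns: { amount: { offset: 1304, length: 400, min: 40, max: 120 },
               customer: { offset: 1704, length: 900 } } },
  { columns: { amount: { offset: 2604, length: 400, min: 200, max: 900 },
               customer: { offset: 3004, length: 900 } } },
];

// Plan byte ranges for a query like: SELECT <selected> WHERE <column> > <value>
function planByteRanges(groups, selected, filter) {
  const ranges = [];
  for (const group of groups) {
    const stats = group.columns[filter.column];
    // Predicate pushdown: skip row groups whose max cannot satisfy "> value".
    if (stats.max <= filter.greaterThan) continue;
    // Column selection: request only the chunks of the selected columns.
    for (const name of selected) {
      const chunk = group.columns[name];
      ranges.push({ start: chunk.offset, end: chunk.offset + chunk.length - 1 });
    }
  }
  return ranges;
}

// HTTP Range headers use inclusive byte positions.
function toRangeHeader(range) {
  return `bytes=${range.start}-${range.end}`;
}

const plan = planByteRanges(rowGroups, ["amount"], { column: "amount", greaterThan: 100 });
console.log(plan.map(toRangeHeader)); // [ 'bytes=1304-1703', 'bytes=2604-3003' ]
```

In this toy file, the query touches 800 of roughly 3,900 bytes: the first row group is skipped by its statistics, and the `customer` chunks are never requested.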

Combining Wasm with DuckDB means:

  • Zero installation: Users can run SQL queries directly in their browser
  • Local compute: Process data on the client without server round-trips
  • Privacy: Sensitive data never leaves the user's machine

Real-World Examples

  • Evidence.dev: BI dashboards using DuckDB-Wasm for instant filtering and aggregation
  • TensorFlow.js: Train ML models in the browser using WebGPU for GPU acceleration
  • Docker + Wasm: Run Wasm containers alongside traditional containers

Demo: Parquet Schema Browser Extension

Christophe Blefari built a Firefox extension that displays a Parquet file's schema when the user hovers over the file in Google Cloud Storage.

The Problem: Checking a Parquet schema traditionally takes three steps:

  1. Download the file (possibly gigabytes)
  2. Start a Python environment
  3. Run pandas/pyarrow to read the schema

The Solution: A browser extension using DuckDB-Wasm that:

  1. Listens for mouseover events on GCS file links
  2. Sends a message to the extension's background script
  3. Uses DuckDB-Wasm to read only the Parquet metadata (no full download)
  4. Displays the schema in a popup panel
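The message flow in these steps could be sketched roughly as follows. This is not the extension's actual code: the message type names, `isParquetLink`, `handleSchemaRequest`, and the injected `describeSchema` function are hypothetical, and the sketch assumes that hovered links have a URL path ending in `.parquet`.

```javascript
// Hypothetical message types; the original extension's names are not known.
const SCHEMA_REQUEST = "parquet-schema-request";
const SCHEMA_RESPONSE = "parquet-schema-response";

// Content-script side: decide whether a hovered link points at a Parquet file.
function isParquetLink(href) {
  try {
    return new URL(href).pathname.toLowerCase().endsWith(".parquet");
  } catch {
    return false; // not a parseable absolute URL
  }
}

// Background side: answer a schema request. `describeSchema` is injected so the
// DuckDB-Wasm query (not shown here) stays separate from the messaging logic.
// It could be registered with something like:
//   browser.runtime.onMessage.addListener((m) => handleSchemaRequest(m, describeSchema));
async function handleSchemaRequest(message, describeSchema) {
  if (!message || message.type !== SCHEMA_REQUEST) return undefined;
  const columns = await describeSchema(message.url);
  return { type: SCHEMA_RESPONSE, url: message.url, columns };
}

// Content-script wiring; skipped when no WebExtension page context is present.
if (typeof document !== "undefined" && typeof browser !== "undefined") {
  document.addEventListener("mouseover", async (event) => {
    const link = event.target.closest ? event.target.closest("a[href]") : null;
    if (!link || !isParquetLink(link.href)) return;
    const reply = await browser.runtime.sendMessage({ type: SCHEMA_REQUEST, url: link.href });
    if (reply) console.table(reply.columns); // a real extension would render a panel
  });
}
```

Keeping the handler independent of the `browser` API makes it easy to exercise with a stub, for example `handleSchemaRequest({ type: SCHEMA_REQUEST, url }, async () => [{ name: "id", type: "INT64" }])`.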

    // Initialize DuckDB-Wasm (simplified; a real setup also creates a worker and loads a Wasm bundle)
    const db = await duckdb.AsyncDuckDB.instantiate();
    const conn = await db.connect();

    // Query the Parquet schema without downloading the full file
    const result = await conn.query(`
      SELECT * FROM parquet_schema('gs://bucket/file.parquet')
    `);

Key Advantages

  • Instant schema inspection: No file downloads, just metadata
  • Minimal bandwidth for schema checks: Parquet stores the schema in the file footer, so only those bytes need to be fetched
  • Works with any cloud storage: S3, GCS, Azure Blob (with credentials)
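To see why the footer makes this cheap, recall that a Parquet file ends with a 4-byte little-endian metadata length followed by the magic bytes `PAR1`. A reader can fetch those last 8 bytes, then request exactly the metadata that precedes them. The sketch below shows only the offset arithmetic; `footerByteRange` is a made-up helper, the example file size is invented, and decoding the metadata itself is left to a real Parquet reader.

```javascript
// Given the file size and the file's last 8 bytes, compute the inclusive byte
// range that holds the footer metadata (where the schema is stored).
function footerByteRange(fileSize, tail) {
  if (tail.length !== 8) throw new Error("expected exactly the last 8 bytes");
  const magic = String.fromCharCode(...tail.subarray(4, 8));
  if (magic !== "PAR1") throw new Error("missing PAR1 magic; not a plain Parquet file");
  // The first 4 bytes of the tail are the metadata length, little-endian.
  const footerLength = new DataView(tail.buffer, tail.byteOffset, 8).getUint32(0, true);
  const start = fileSize - 8 - footerLength;
  // The file also begins with a 4-byte magic, so metadata cannot start before byte 4.
  if (start < 4) throw new Error("footer length exceeds file size");
  return { start, end: fileSize - 9, footerLength };
}

// Example: a 1,000,000-byte file whose footer metadata is 1,234 bytes long.
// In practice the tail could come from an HTTP request with "Range: bytes=-8".
const exampleTail = new Uint8Array([0xd2, 0x04, 0x00, 0x00, 0x50, 0x41, 0x52, 0x31]);
const footer = footerByteRange(1000000, exampleTail);
console.log(`bytes=${footer.start}-${footer.end}`); // bytes=998758-999991
```

For this example file, a schema check transfers about 1.2 KB instead of the full megabyte, and the saving grows with file size.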

The Future

  • WebGPU: Direct GPU access from browsers for ML training
  • Decentralized analytics: Query encrypted data locally with keys stored client-side
  • Browser-based data apps: Full analytical applications without backend infrastructure
