Example Datasets

We have prepared a series of datasets to help you dive into MotherDuck!

sample_data

The sample_data database is automatically attached to every MotherDuck account regardless of your region. You can start querying the following tables right away:

| schema.table | Description |
| --- | --- |
| who.ambient_air_quality | Historical air quality data from the World Health Organization |
| nyc.taxi | Taxi ride data from November 2020 |
| nyc.rideshare | Ride share trips (Lyft, Uber, etc.) in NYC |
| nyc.service_requests | Requests to NYC's 311 complaint hotline through phone and web |
| hn.hacker_news | Sample of comments from Hacker News |
| kaggle.movies | Movie titles and overviews with pre-computed embeddings from Kaggle |
| stackoverflow_survey.survey_results | Survey results from 2017 to 2024 |
| stackoverflow_survey.survey_schemas | Survey schemas (questions from the survey) from 2017 to 2024 |
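
As a quick way to get oriented, the statements below sketch a first look at a couple of the tables listed above. They rely only on table names from that list and on standard DuckDB SQL. No column names are assumed, since the table schemas are not described on this page.

```sql
-- Inspect the columns of a table before writing queries against it.
DESCRIBE sample_data.nyc.taxi;

-- Count the rows in the taxi ride data.
SELECT count(*) AS trip_count
FROM sample_data.nyc.taxi;

-- Preview a few rows of the Hacker News sample.
SELECT *
FROM sample_data.hn.hacker_news
LIMIT 5;
```

Running DESCRIBE first shows the actual column names and types, which you can then use in more specific queries.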

Additional datasets

The following datasets are available as separate shared databases. See each dataset's page for instructions on how to attach them.

aws-us-east-1 region only

These additional databases are only available for accounts in the aws-us-east-1 region.

| Dataset | Description |
| --- | --- |
| StackOverflow | Full StackOverflow data dump up to May 2023 |
| PyPi / DuckDB Stats | Python package download data for the duckdb package, refreshed weekly |
| Hacker News (full) | Full Hacker News dataset from 2016 to 2025 |
| Foursquare | Global dataset of over 100 million points of interest (POIs) with location and business information |

FAQ

How do I re-attach the sample_data database?

The sample_data database is attached automatically, but if you have accidentally removed it, you can re-attach it with:

ATTACH 'md:_share/sample_data/23b0d623-1361-421d-ae77-62d701d471e6' AS sample_data;
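
To confirm that the re-attach worked, a check along the following lines may help. It is a minimal sketch that assumes the alias sample_data from the statement above and uses one table name from the list on this page.

```sql
-- sample_data should appear among the attached databases.
SHOW DATABASES;

-- A small query against one of its tables confirms the share is readable.
SELECT count(*) AS row_count
FROM sample_data.who.ambient_air_quality;
```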