From Cloud Storage or over HTTPS

From Public Cloud Storage

MotherDuck supports several cloud storage providers, including Azure, Google Cloud and Cloudflare R2.

note

MotherDuck is currently hosted in the AWS region us-east-1. We strongly encourage you to locate your data in this region when working with MotherDuck.

The following example features Amazon S3.

If you haven't already connected to MotherDuck, attach your database first, then load and query the data:

-- assuming the db my_db exists
ATTACH 'md:my_db';
-- CTAS a table from a publicly available demo dataset stored in s3
CREATE OR REPLACE TABLE pypi_small AS
SELECT * FROM 's3://motherduck-demo/pypi.small.parquet';

-- JOIN the demo dataset against a larger table to find the most common duplicate URLs
-- Note that you can refer to the S3 URL directly as a table!
SELECT pypi_small.url, COUNT(*)
FROM pypi_small
JOIN 's3://motherduck-demo/pypi_downloads.parquet' AS s3_pypi
ON pypi_small.url = s3_pypi.url
GROUP BY pypi_small.url
ORDER BY COUNT(*) DESC
LIMIT 10;
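
Before loading a remote file, it can help to inspect it in place. The following is a minimal sketch, assuming the same public demo file as above and standard DuckDB statements (DESCRIBE and read_parquet); it is not the only way to do this.

```sql
-- Show the column names and types of the remote Parquet file without loading it
DESCRIBE SELECT * FROM 's3://motherduck-demo/pypi.small.parquet';

-- Count the rows directly from S3; read_parquet makes the file format explicit
SELECT COUNT(*) AS row_count
FROM read_parquet('s3://motherduck-demo/pypi.small.parquet');
```

Neither statement creates a table, so this is a cheap way to check the schema before running the CTAS.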

From a Secure Cloud Storage Provider

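MotherDuck supports several cloud storage providers, including Azure, Google Cloud and Cloudflare R2. To read from a private bucket, first store your credentials as a secret in MotherDuck. The following example uses Amazon S3.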

CREATE SECRET IN MOTHERDUCK (
TYPE S3,
KEY_ID 'access_key',
SECRET 'secret_key',
REGION 'us-east-1'
);

-- Now you can query from a secure S3 bucket
CREATE OR REPLACE TABLE mytable AS SELECT * FROM 's3://...';
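
The same pattern should carry over to the other providers mentioned in this section. As a hedged sketch, the following assumes DuckDB's R2 secret type with its ACCOUNT_ID parameter and the duckdb_secrets() table function; the key, secret, and account ID are hypothetical placeholders, not real values.

```sql
-- Hypothetical placeholders: replace the key, secret, and account ID with your own
CREATE SECRET IN MOTHERDUCK (
    TYPE R2,
    KEY_ID 'access_key',
    SECRET 'secret_key',
    ACCOUNT_ID 'my_account_id'
);

-- List the secrets visible to the current session to confirm it was stored
SELECT name, type, scope
FROM duckdb_secrets();
```

Listing secrets after creating one is a quick check that the credentials were registered under the expected type before you query the bucket.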

Over HTTPS

MotherDuck supports loading data over HTTPS.

-- Reads the Central Park Squirrel Data
SELECT * FROM read_csv_auto('https://docs.google.com/spreadsheets/d/e/2PACX-1vQUZR6ikwZBRXWWQsFaUceEiYzJiVw4OQNGtwGBfcMfVatpCyfxxaWPdoKJIHlwNM-ow1oeW_2F-pO5/pub?gid=2035607922&single=true&output=csv');
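
To keep the data in MotherDuck rather than re-reading it over HTTPS on every query, the result can be materialized with the same CTAS pattern used for S3 above. A minimal sketch; the table name squirrels is an illustrative assumption.

```sql
-- Materialize the CSV served over HTTPS into a table (table name is hypothetical)
CREATE OR REPLACE TABLE squirrels AS
SELECT * FROM read_csv_auto('https://docs.google.com/spreadsheets/d/e/2PACX-1vQUZR6ikwZBRXWWQsFaUceEiYzJiVw4OQNGtwGBfcMfVatpCyfxxaWPdoKJIHlwNM-ow1oeW_2F-pO5/pub?gid=2035607922&single=true&output=csv');

-- Check the column names and types that were inferred from the CSV
DESCRIBE squirrels;
```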