From Cloud Storage or over HTTPS

From Public Cloud Storage

MotherDuck supports several cloud storage providers, including Amazon S3, Azure, Google Cloud and Cloudflare R2.

Note

MotherDuck is currently hosted in the AWS region us-east-1. We strongly encourage you to locate your data in this region when working with MotherDuck.

The following example features Amazon S3.

If you haven't already connected to MotherDuck, do so first, then run the queries:

-- assuming the db my_db exists
ATTACH 'md:my_db';

-- CTAS a table from a publicly available demo dataset stored in s3
CREATE OR REPLACE TABLE pypi_small AS
    SELECT * FROM 's3://motherduck-demo/pypi.small.parquet';

-- JOIN the demo dataset against a larger table to find the most common duplicate URLs
-- Note that you can refer to the S3 URL directly as a table
SELECT pypi_small.url, COUNT(*) 
    FROM pypi_small 
    JOIN 's3://motherduck-demo/pypi_downloads.parquet' AS s3_pypi 
      ON pypi_small.url = s3_pypi.url 
    GROUP BY pypi_small.url 
    ORDER BY COUNT(*) DESC 
    LIMIT 10;
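
Before loading a remote file into a table, it can help to check its schema and preview a few rows. The following is a minimal sketch that assumes the same public demo file used above; `DESCRIBE` and `read_parquet` are standard DuckDB features, and the `LIMIT` value is arbitrary.

```sql
-- Inspect column names and types without creating a table
DESCRIBE SELECT * FROM 's3://motherduck-demo/pypi.small.parquet';

-- Explicit form of the quoted-path shorthand; preview a handful of rows
SELECT *
    FROM read_parquet('s3://motherduck-demo/pypi.small.parquet')
    LIMIT 5;
```

The quoted-path shorthand and `read_parquet` should behave the same for a `.parquet` file; the explicit function is mainly useful when you need to pass reader options.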

From a Secure Cloud Storage Provider

MotherDuck supports several cloud storage providers, including Azure, Google Cloud and Cloudflare R2. To access them securely, you must first create a secret.

CREATE SECRET IN MOTHERDUCK (
    TYPE S3,
    KEY_ID 'access_key',
    SECRET 'secret_key',
    REGION 'us-east-1',
    SCOPE 'my-bucket-path'
);

-- Now you can query from a secure S3 bucket
CREATE OR REPLACE TABLE mytable AS SELECT * FROM 's3://...';
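
The same pattern should carry over to the other providers mentioned above. The sketch below assumes a Cloudflare R2 bucket and uses the `R2` secret type with an `ACCOUNT_ID` parameter, as in DuckDB's secrets syntax. The credentials, account ID, bucket name, file path, and table name are all hypothetical placeholders, and you should confirm the exact parameters against the MotherDuck documentation for your provider.

```sql
-- Hypothetical placeholder values throughout; replace with your own
CREATE SECRET IN MOTHERDUCK (
    TYPE R2,
    KEY_ID 'access_key',
    SECRET 'secret_key',
    ACCOUNT_ID 'my_account_id'
);

-- Query the R2 bucket using the r2:// URL scheme (bucket and file are assumptions)
CREATE OR REPLACE TABLE my_r2_table AS
    SELECT * FROM 'r2://my-r2-bucket/data.parquet';
```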

Over HTTPS

MotherDuck supports loading data over HTTPS.

-- Reads the Central Park Squirrel Data
SELECT * FROM read_csv_auto('https://docs.google.com/spreadsheets/d/e/2PACX-1vQUZR6ikwZBRXWWQsFaUceEiYzJiVw4OQNGtwGBfcMfVatpCyfxxaWPdoKJIHlwNM-ow1oeW_2F-pO5/pub?gid=2035607922&single=true&output=csv');
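
To keep the data in MotherDuck rather than re-reading it over HTTPS on every query, the result can be stored in a table with the same CTAS pattern used for S3. This is a minimal sketch; the table name `squirrels` is a hypothetical choice, and the URL is the one from the example above.

```sql
-- Load the CSV once and persist it as a table (table name is an assumption)
CREATE OR REPLACE TABLE squirrels AS
    SELECT * FROM read_csv_auto('https://docs.google.com/spreadsheets/d/e/2PACX-1vQUZR6ikwZBRXWWQsFaUceEiYzJiVw4OQNGtwGBfcMfVatpCyfxxaWPdoKJIHlwNM-ow1oeW_2F-pO5/pub?gid=2035607922&single=true&output=csv');

-- Subsequent queries read from the stored table instead of the remote file
SELECT COUNT(*) FROM squirrels;
```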
