MD_RUN parameter
For certain DuckDB table functions, MotherDuck provides an additional parameter, MD_RUN, that gives explicit control over where the query is executed.
This parameter is available to the following functions:
read_csv()
read_csv_auto()
read_json()
read_json_auto()
read_parquet() and its alias parquet_scan()
The MD_RUN parameter accepts one of the following values:
MD_RUN=LOCAL executes the function in your local DuckDB environment.
MD_RUN=REMOTE executes the function in MotherDuck-hosted DuckDB runtimes in the cloud.
MD_RUN=AUTO executes remotely all s3://, http://, and https:// requests, except those to localhost/127.0.0.1. This is the default option.
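The AUTO rule can be easier to reason about when written out as a small decision function. The following Python sketch only models the rule as described above; it is not MotherDuck's implementation. The function name resolve_auto and the constants are hypothetical, and treating every path without an s3://, http://, or https:// scheme as local is an assumption based on the wording of the rule.

```python
from urllib.parse import urlparse

# Schemes that the AUTO rule, as described above, sends to remote execution.
REMOTE_SCHEMES = {"s3", "http", "https"}
# Hosts that the rule excludes from remote execution.
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def resolve_auto(path: str) -> str:
    """Return "REMOTE" or "LOCAL" for a path under the described AUTO rule.

    Hypothetical helper for illustration only; MotherDuck's actual
    decision logic may differ.
    """
    parsed = urlparse(path)
    if parsed.scheme.lower() in REMOTE_SCHEMES:
        host = (parsed.hostname or "").lower()
        # Requests to the local machine stay local even over http(s).
        if host in LOCAL_HOSTS:
            return "LOCAL"
        return "REMOTE"
    # Assumption: anything else (for example a plain file path) runs locally.
    return "LOCAL"


if __name__ == "__main__":
    for p in (
        "https://github.com/duckdb/duckdb/raw/main/data/csv/ips.csv.gz",
        "http://localhost:8000/ips.csv",
        "data/ips.csv",
    ):
        print(p, "->", resolve_auto(p))
```

Under this model, an explicit MD_RUN=LOCAL or MD_RUN=REMOTE simply overrides whatever the function would return.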
The following is an example of invoking this parameter to execute the function remotely:
SELECT *
FROM read_csv_auto(
'https://github.com/duckdb/duckdb/raw/main/data/csv/ips.csv.gz',
MD_RUN=REMOTE)
LIMIT 100
In this example, MD_RUN=REMOTE is redundant: omitting it implies MD_RUN=AUTO, and because this is a non-local https:// resource, MotherDuck will automatically choose remote execution.
One can force local execution with MD_RUN=LOCAL. Be aware that DuckDB-WASM does not yet support reading compressed files, so forcing local execution inside the web browser produces an error for this particular file, because ips.csv.gz is gzip-compressed. It does work locally from the CLI or, for example, a Python notebook.