PlanetScale
PlanetScale offers hosted PostgreSQL and MySQL Vitess Databases. MotherDuck supports PlanetScale Postgres via the pg_duckdb extension, as well as the Postgres Connector. In our internal benchmarking, pg_duckdb offers 100x or greater query acceleration for analytical queries when compared to vanilla Postgres.
Prerequisites
Before connecting PlanetScale to MotherDuck, ensure you have:
- A PlanetScale account with a Postgres database created
- The
pg_duckdbextension enabled in your PlanetScale database (see PlanetScale extension documentation) - A MotherDuck account and authentication token (get your token from the MotherDuck dashboard)
- Database connection credentials from your PlanetScale dashboard (host, port, username, password, database name)
Connecting pg_duckdb to MotherDuck
To run pg_duckdb, make sure to add it your extensions in PlanetScale.
Review the configuration parameters before deploying the extension. Once deployed, you can connect to MotherDuck with the following SQL statements.
-- Grant necessary permissions to the PlanetScale superuser
GRANT CREATE ON SCHEMA public to pscale_superuser;
-- Create the pg_duckdb extension in your Postgres database
CREATE EXTENSION pg_duckdb;
-- Enable a MotherDuck connection with your authentication token
CALL duckdb.enable_motherduck(<your token>);
To swap tokens, you can drop the MotherDuck connection and then re-add with:
-- Remove the existing MotherDuck server connection
DROP SERVER motherduck CASCADE;
-- Re-enable MotherDuck with a new authentication token
CALL duckdb.enable_motherduck(<your token>);
Using Read Replicas with PlanetScale
Pg_duckdb will automatically round-robin between your replicas when you use a read-only token. When switching between a read-write and a read-only token, you will want to snapshot your database and then force sync as part of the hand-off.
Switching from read-write to read-only is done with the following SQL statement in Postgres:
-- Create a snapshot of your MotherDuck database to ensure consistency
SELECT * FROM duckdb.raw_query('CREATE SNAPSHOT OF <db_name>');
-- Drop the existing MotherDuck connection
DROP SERVER motherduck CASCADE;
-- Re-enable MotherDuck with your read-only token
CALL duckdb.enable_motherduck(<your read only token>);
-- Refresh the database to sync with the snapshot
SELECT * FROM duckdb.raw_query('REFRESH DATABASE <db_name>');
Reading from MotherDuck
By default, data in MotherDuck is mapped to Postgres in two different ways. This is because MotherDuck is designed to hold many databases in its global catalog, while Postgres traditionally has a single database in its catalog.
- For data in
my_db.main, it is mapped directly to thepublicschema in the Postgres database. - For data in any other database & schema, it is mapped to
ddb$database$schemain the Postgres database.
Once the catalog is in sync between MotherDuck and Postgres, the data can be queried directly from Postgres. If it is out of sync for any reason, it can be re-sync'd with the following SQL command:
-- Terminate the pg_duckdb sync worker to force a re-sync
SELECT * FROM pg_terminate_backend((
SELECT pid FROM pg_stat_activity WHERE backend_type = 'pg_duckdb sync worker'
));
Sample MotherDuck Queries
Once the catalog is synchronized to Postgres, we can query the data as if it was normal data in Postgres.
-- Query data from a MotherDuck database and schema
-- Note: Non-main schemas use the ddb$database$schema naming convention
SELECT *
FROM "ddb$sample_data$nyc".taxi
ORDER BY tpep_dropoff_datetime DESC
LIMIT 10;
Of course, we can also join with data in Postgres.
-- Join MotherDuck data with local Postgres tables
SELECT a.col1, b.col2
-- MotherDuck table from a non-main schema
FROM "ddb$my_database$my_schema".my_table AS a
-- Local Postgres table in the public schema
LEFT JOIN public.another_table AS b on a.key = b.key
The DuckDB iceberg_scan function also works as well:
-- Use DuckDB's iceberg_scan function to query Iceberg tables
SELECT COUNT(*)
FROM iceberg_scan('https://motherduck-demo.s3.amazonaws.com/iceberg/lineitem_iceberg', allow_moved_paths := true)
Two special helper functions exist to run queries directly with DuckDB:
duckdb.query: Returns tabular data, use for SELECT queriesduckdb.raw_query: Returns void, use for DDL queries such as Snapshot Creation and Database Refresh. This function keeps the database in-sync when handing off between read and write nodes.
-- Use duckdb.query for SELECT queries that return tabular data
-- This example lists all databases in MotherDuck
SELECT * FROM duckdb.query('FROM md_databases()')
-- Use duckdb.raw_query for DDL queries that return void
-- This example drops a table in MotherDuck
SELECT * FROM duckdb.raw_query('DROP TABLE my_database.my_schema.some_table')
Replicating data to MotherDuck
For smaller tables, data can be replicated using simple SQL statements.
-- Create a table in MotherDuck and populate it with data from Postgres
-- Replace my_database and my_schema with your target database and schema names
CREATE TABLE "ddb$my_database$my_schema".my_table USING duckdb AS
SELECT * FROM public.my_table
For larger tables, state management, and tighter SLAs & requirements, MotherDuck offers integrations to various other ingestion partners.
Further reading
The pg_duckdb github repo contains further documentation of all available functions.
For ease of finding the documentation, a table of the documentation sections is below:
| Topic | Description |
|---|---|
| Functions | Complete reference for all available functions |
| Syntax Guide & Gotchas | Quick reference for common SQL patterns and things to know |
| Types | Supported data types and type mappings |
| Extensions | DuckDB extension installation and usage |
| Settings | Configuration options and parameters |
| Transactions | Transaction behavior and limitations |
Connecting with the Postgres Extension
You can also connect to PlanetScale Postgres with the DuckDB Postgres extension. This approach allows you to query PlanetScale data directly from DuckDB or MotherDuck.
Install and Load the Extension
-- Install the Postgres extension from DuckDB's extension registry
INSTALL postgres;
-- Load the extension to enable Postgres connectivity
LOAD postgres;
-- Attach your PlanetScale database using a connection string
ATTACH '<connection string>' AS postgres_db (TYPE postgres);
Connection String Format
The connection string format follows PostgreSQL's standard connection parameters. Here's an example with explanations:
ATTACH 'host=<host> port=<port> user=<user> password=<pw> dbname=<db> sslmode=require'
AS planetscale (TYPE postgres);
Connection Parameters:
host: Your PlanetScale database hostname (found in your PlanetScale dashboard)port: The database port (typically 3306 for MySQL or 5432 for Postgres)user: Your PlanetScale database usernamepassword: Your PlanetScale database passworddbname: The name of your database in PlanetScalesslmode=require: Ensures SSL encryption is used (required for PlanetScale)
The above connection string works with DuckDB. PlanetScale suggests also using the sslnegotation and sslrootcert keys when connecting to Postgres, but these keys are not supported by the libpq version that is included in DuckDB. The sslmode=require parameter is sufficient for secure connections.