Soda

Soda is a platform for monitoring and managing data quality in your pipelines. It integrates with MotherDuck to monitor tables as part of data quality and observability workflows.

How it works with MotherDuck

Soda connects to MotherDuck through the soda-duckdb package and runs quality scans against a MotherDuck md: database connection.

Prerequisites

  • Soda installed in the environment that will run scans.
  • The soda-duckdb package.
  • A MotherDuck access token and database path.

Setup

  1. Install the Soda DuckDB package:

    pip install soda-duckdb
  2. Add a MotherDuck data source to your Soda configuration:

    data_source motherduck:
      type: duckdb
      database: "md:sample_data?motherduck_token=<motherduck_token>"
      read_only: true
  3. Test the connection:

    soda test-connection -d motherduck -c configuration.yml -V

Authentication and configuration

  • The MotherDuck token can be passed in the md: connection string shown in Soda's reference configuration.
  • Keep the token in your deployment secret manager or CI secret store and inject it when the Soda configuration is rendered, rather than committing it to the configuration file.
  • Set read_only: true for scan-only workflows.
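
As a minimal sketch of rendering the configuration from a secret, the snippet below fills the md: connection string from an environment variable at deploy time. The variable name MOTHERDUCK_TOKEN and the helper names render_soda_config and render_from_env are assumptions for illustration; Soda does not prescribe them.

```python
import os

# Template mirrors the data source block from the Setup section.
CONFIG_TEMPLATE = """\
data_source motherduck:
  type: duckdb
  database: "md:{database}?motherduck_token={token}"
  read_only: true
"""


def render_soda_config(database: str, token: str) -> str:
    """Return the Soda configuration text for a MotherDuck database."""
    if not token:
        # Fail early instead of writing a config with an empty token.
        raise ValueError("MotherDuck token is empty")
    return CONFIG_TEMPLATE.format(database=database, token=token)


def render_from_env(database: str, env_var: str = "MOTHERDUCK_TOKEN") -> str:
    """Read the token from an environment variable (hypothetical name)."""
    return render_soda_config(database, os.environ.get(env_var, ""))
```

A deploy step could write the result of render_from_env("sample_data") to configuration.yml just before the scan runs, so the token never lands in version control.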

Important notes

  • Some Soda users report that the path key works in place of database. If database does not work in your environment, try path with the same md: value.
  • Keep Soda checks focused on the tables and columns you need to monitor so scans remain predictable.
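
One way to keep checks focused is to generate the SodaCL file from an explicit list of tables and columns. The sketch below is illustrative only: build_checks is a hypothetical helper, the table and column names in the usage note are placeholders, and it emits just two common SodaCL checks (row_count and missing_count).

```python
def build_checks(required_columns: dict[str, list[str]]) -> str:
    """Build SodaCL text with one 'checks for' block per monitored table.

    required_columns maps a table name to the columns that must not
    contain missing values.
    """
    lines: list[str] = []
    for table, columns in required_columns.items():
        lines.append(f"checks for {table}:")
        # Every monitored table must contain at least one row.
        lines.append("  - row_count > 0")
        for column in columns:
            # Each listed column must have no missing values.
            lines.append(f"  - missing_count({column}) = 0")
    return "\n".join(lines) + "\n"
```

For example, build_checks({"orders": ["order_id"]}) yields a single block for a placeholder orders table with a row-count check and one missing-value check, which can be saved as checks.yml.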

Use cases

  • Run SodaCL data quality checks against MotherDuck tables.
  • Validate pipeline outputs after loading data into MotherDuck.
  • Add MotherDuck quality scans to CI or scheduled data checks.
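
For the CI use case, a scan can also be started from Python. The sketch below assumes the installed Soda package exposes soda.scan.Scan as described in Soda's programmatic scan documentation; run_scan, its parameters, and the file names are hypothetical, and the scan_factory argument exists only so the function can be exercised without a live connection.

```python
def run_scan(configuration_path, checks_path,
             data_source="motherduck", scan_factory=None):
    """Run one Soda scan and return 0 on success, 1 on errors or failed checks."""
    if scan_factory is None:
        # Imported lazily so this module loads even where Soda is not installed.
        from soda.scan import Scan
        scan_factory = Scan

    scan = scan_factory()
    scan.set_data_source_name(data_source)
    scan.add_configuration_yaml_file(file_path=configuration_path)
    scan.add_sodacl_yaml_file(checks_path)
    scan.execute()

    # Print the scan logs so they show up in the CI job output.
    print(scan.get_logs_text())

    # A non-zero return value lets the CI job fail the pipeline.
    if scan.has_error_logs() or scan.has_check_fails():
        return 1
    return 0
```

A CI step could call sys.exit(run_scan("configuration.yml", "checks.yml")) so that failed checks stop the pipeline.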