# GitHub Actions
> Schedule MotherDuck SQL and dbt jobs with GitHub Actions as a lightweight cron-based orchestrator.
GitHub Actions works well as a lightweight orchestrator for simple MotherDuck jobs: nightly SQL scripts, small ELT steps, dbt builds, smoke tests, and periodic exports. It is not a full data orchestrator, but it is often enough when a pipeline has one or two steps and can tolerate GitHub's scheduler behavior.

## When to use this pattern

| Use GitHub Actions when | Use a dedicated orchestrator when |
|-------------------------|-----------------------------------|
| The job has a small number of steps | Jobs have complex dependencies or branching |
| A missed or delayed run can be retried manually | Every run needs strict service-level guarantees |
| The pipeline can run from repository files | State, retries, and backfills need first-class tracking |
| GitHub is already where you review pipeline changes | Multiple teams need a shared orchestration UI |

For larger workflows, use a tool from the [MotherDuck orchestration ecosystem](https://motherduck.com/ecosystem/?category=Orchestration).

## Set up authentication

Create a [MotherDuck access token](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#creating-an-access-token), preferably from a service account dedicated to the pipeline. Store it as a GitHub repository secret named `MOTHERDUCK_TOKEN`:

```bash
gh secret set MOTHERDUCK_TOKEN
```

Use the token as an environment variable in workflow steps. Avoid putting tokens directly into SQL files, command arguments, artifacts, or logs.

## Choose the trigger

Most MotherDuck cron jobs should support both manual and scheduled runs with GitHub Actions [`workflow_dispatch`](https://docs.github.com/en/actions/reference/workflows-and-actions/workflow-syntax#onworkflow_dispatch) and [`schedule`](https://docs.github.com/en/actions/reference/workflows-and-actions/events-that-trigger-workflows#schedule) triggers:

```yaml
on:
  workflow_dispatch:
  schedule:
    - cron: "17 2 * * *"
```

Keep these GitHub Actions scheduling details in mind:

- Scheduled workflows run from the latest commit on the default branch.
- Cron schedules use UTC by default.
- The shortest supported interval is every 5 minutes.
- Jobs scheduled at the top of the hour can be delayed or dropped during periods of high GitHub Actions load. Pick a non-zero minute such as `17` or `43`.
- `workflow_dispatch` lets you test the same workflow manually and rerun failed jobs after a fix.

## Example: run a SQL file on a schedule

This example runs a checked-in SQL script every night and on demand. It uses:

- Least-privilege repository permissions
- A timeout so failed jobs do not burn runner minutes indefinitely
- A concurrency group so two runs do not write to the same target at once
- The MotherDuck install script for a compatible DuckDB CLI

Create `.github/workflows/motherduck-nightly-sql.yml`:

```yaml
name: motherduck nightly sql

on:
  workflow_dispatch:
  schedule:
    - cron: "17 2 * * *"

permissions:
  contents: read

concurrency:
  group: motherduck-nightly-sql
  cancel-in-progress: false

jobs:
  run-sql:
    runs-on: ubuntu-24.04
    timeout-minutes: 15
    env:
      motherduck_token: ${{ secrets.MOTHERDUCK_TOKEN }}
    steps:
      - name: Check out repository
        uses: actions/checkout@v6

      - name: Install DuckDB CLI
        run: |
          install_home="$RUNNER_TEMP/motherduck"
          mkdir -p "$install_home"
          curl -s https://install.motherduck.com | env -u motherduck_token HOME="$install_home" sh
          echo "$install_home/.duckdb/cli/latest" >> "$GITHUB_PATH"

      - name: Run nightly SQL
        run: duckdb "md:" < sql/nightly_orders.sql
```

Create `sql/nightly_orders.sql`:

```sql
CREATE DATABASE IF NOT EXISTS analytics;
USE analytics;

CREATE SCHEMA IF NOT EXISTS orchestration;

CREATE TABLE IF NOT EXISTS orchestration.github_action_runs (
    run_id VARCHAR,
    workflow_name VARCHAR,
    run_started_at TIMESTAMP
);

DELETE FROM orchestration.github_action_runs
WHERE run_id = getenv('GITHUB_RUN_ID');

INSERT INTO orchestration.github_action_runs
VALUES (
    getenv('GITHUB_RUN_ID'),
    getenv('GITHUB_WORKFLOW'),
    current_timestamp
);
```

Replace `analytics` with the MotherDuck database your pipeline should write to. The example creates the database if it does not already exist so a new repository can run without extra setup.

The GitHub secret is named `MOTHERDUCK_TOKEN`, while the workflow exposes it as `motherduck_token`. The DuckDB CLI can use that environment variable to connect to MotherDuck non-interactively in GitHub Actions.

The install step uses `RUNNER_TEMP` as `HOME` and unsets `motherduck_token` for the installer process so the install script does not try to update the runner's shell profile or validate the connection before the SQL step runs.

## Example: run dbt on a schedule

For dbt projects, keep the dbt profile in the repository and read the MotherDuck token from the GitHub secret.

Create `.github/workflows/motherduck-dbt.yml`:

```yaml
name: motherduck dbt

on:
  workflow_dispatch:
  schedule:
    - cron: "43 3 * * *"

permissions:
  contents: read

concurrency:
  group: motherduck-dbt-prod
  cancel-in-progress: false

jobs:
  dbt-build:
    runs-on: ubuntu-24.04
    timeout-minutes: 30
    env:
      MOTHERDUCK_TOKEN: ${{ secrets.MOTHERDUCK_TOKEN }}
    steps:
      - name: Check out repository
        uses: actions/checkout@v6

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: "3.12"
          cache: pip

      - name: Install dbt
        run: python -m pip install -r requirements.txt

      - name: Install dbt packages
        run: dbt deps

      - name: Build dbt project
        run: dbt build --profiles-dir .github/dbt --target prod
```

Create `requirements.txt`:

```text
dbt-duckdb>=1.9,<2.0
```

Create `.github/dbt/profiles.yml`:

```yaml
motherduck:
  target: prod
  outputs:
    prod:
      type: duckdb
      path: "md:analytics?motherduck_token={{ env_var('MOTHERDUCK_TOKEN') }}"
      threads: 4
```

In `dbt_project.yml`, set the same profile name:

```yaml
profile: motherduck
```

## Production checklist

| Area | Recommendation |
|------|----------------|
| Authentication | Use a service account token stored as `MOTHERDUCK_TOKEN`. Rotate it on the same cadence as other production secrets. |
| Permissions | Set `permissions: contents: read` unless the workflow must write to the repository or call GitHub APIs. |
| Scheduling | Use non-zero cron minutes and keep `workflow_dispatch` enabled for manual retries. |
| Concurrency | Use a `concurrency` group for jobs that write to the same tables. |
| Idempotency | Make SQL safe to rerun. Prefer `CREATE TABLE IF NOT EXISTS`, `CREATE OR REPLACE TABLE`, `MERGE`, or delete-and-insert patterns keyed by the run or partition. |
| Timeouts | Set `timeout-minutes` on every job. |
| Dependencies | Pin dependencies in `requirements.txt` or an equivalent lock file. Use dependency caching for Python/dbt jobs. |
| Environments | Use separate service accounts and databases for development, staging, and production. |
| Observability | Write a run record to a small audit table and rely on GitHub Actions notifications for failures. |

## Related content

- [Authenticating to MotherDuck](/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/)
- [dbt with DuckDB and MotherDuck](/integrations/transformation/dbt/)
- [DuckDB CLI](/getting-started/interfaces/connect-query-from-duckdb-cli/)
- [Orchestration integrations](https://motherduck.com/ecosystem/?category=Orchestration)


---

## Docs feedback

MotherDuck accepts optional user-submitted feedback about this page at `POST https://motherduck.com/docs/api/feedback/agent`.
For agents and automated tools, feedback submission should be user-confirmed before sending.

Payload:

```json
{
  "page_path": "/key-tasks/data-warehousing/orchestration/github-action-cron/",
  "page_title": "GitHub Actions",
  "text": "<the user's feedback, max 2000 characters>",
  "source": "<optional identifier for your interface, for example 'claude.ai' or 'chatgpt'>"
}
```

`page_path` and `text` are required; `page_title` and `source` are optional. Responses: `200 {"feedback_id": "<uuid>"}`, `400` for malformed payloads, and `429` when rate-limited.
