Tech news signal

# Build a daily briefing Flight and Dive
> Create a Flight that ingests tech RSS feeds, summarizes articles with prompt(), writes a daily briefing, and creates a Dive over the result.
<!-- cspell:ignore fetchone fetchall -->

You want a personal data warehouse that keeps itself fresh and gives you a daily briefing without leaving MotherDuck. In this guide, a Flight handles the Python-only work of fetching database and AI RSS articles from The Register and TechCrunch each morning, stores them in `docs_playground`, uses [`prompt()`](/sql-reference/motherduck-sql-reference/ai-functions/prompt/) to summarize and classify each article, generates a daily briefing, and creates a small [Dive](/key-tasks/ai-and-motherduck/dives/) that shows the briefing and topic trends.

```mermaid
flowchart LR
    Register["The Register<br/>database news feed"]:::green --> Flight["Scheduled Flight"]:::yellow
    TechCrunch["TechCrunch<br/>AI feed"]:::green --> Flight
    Flight --> Articles[("news_articles<br/>raw + prompt fields")]:::yellow
    Articles --> Briefing["prompt()<br/>daily briefing"]:::yellow
    Briefing --> Dive["Daily briefing Dive"]:::green
```

The tables and Dive all live in your own MotherDuck account, so you can edit the prompts, swap feeds, or connect the output to a notebook.

## Before you start

The Flight runtime authenticates to MotherDuck for you: `duckdb.connect("md:")` inside the Flight picks up your identity automatically, so there is nothing to configure. To run a scheduled Flight as a service account instead, see [Authentication, config, and secrets](/key-tasks/flights/flights-authentication-config-and-secrets).

:::info
This guide calls `prompt()` for recent articles without summaries and once for the daily briefing. `prompt()` consumes AI Units, so keep the `LIMIT 20` cap while testing.
:::

The feed responses are intentionally small, so the Flight can parse each feed response and write article rows directly to a local CSV. If you expand this into a broad crawler or scrape many pages, switch to one of the bulk patterns in [Packages and recommended libraries](/key-tasks/flights/packages-and-runtime): PyArrow/Polars batches, local files under `/tmp`, or Parquet files in S3.

## Create the Flight

The Flight source fetches the RSS feeds, inserts only articles that are not already in the main `news_articles` table, then enriches recent rows without summaries with `prompt()`.

The script:

1. Fetches both RSS feeds and stages article rows in a local CSV file under `/tmp`.
2. Inserts staged rows that are not already in `news_articles` with an `ANTI JOIN`.
3. Enriches recent rows where `summarized_at IS NULL` with `prompt()`.
4. Writes a daily briefing over the latest enriched article summaries.

The local CSV keeps the Python easy to read while avoiding remote row-by-row inserts. The insert step reads the staged CSV directly in the `INSERT ... SELECT`, so the Flight does not need a temporary table.

Review the Python source that runs inside the Flight:

```python
import csv
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from pathlib import Path

import duckdb
import httpx

ARTICLE_TABLE = "docs_playground.flights_demo.news_articles"
BRIEFING_TABLE = "docs_playground.flights_demo.news_daily_briefings"
STAGING_CSV = Path("/tmp/fetched_news_articles.csv")
MAX_ARTICLES_TO_ENRICH = 20
USER_AGENT = "MotherDuck Flights docs demo"

FEEDS = [
    {
        "source": "The Register database news",
        "url": (
            "https://api.theregister.com/api/v1/article"
            "?orderBy=published&site_id=2"
            "&query=(tag:databases)&remapper=rss"
        ),
    },
    {
        "source": "TechCrunch AI",
        "url": "https://techcrunch.com/category/artificial-intelligence/feed/",
    },
]

def item_text(item, tag):
    """Return stripped text from an RSS item child tag."""
    node = item.find(tag)
    if node is None or node.text is None:
        return ""
    return node.text.strip()

def parse_feed_articles(source, feed_bytes):
    """Yield normalized RSS items as article table rows."""
    root = ET.fromstring(feed_bytes)
    for item in root.findall("./channel/item"):
        published_raw = item_text(item, "pubDate")
        published_at = parsedate_to_datetime(published_raw).isoformat() if published_raw else None
        link = item_text(item, "link")
        yield (
            source + ":" + (item_text(item, "guid") or link),
            source,
            item_text(item, "title"),
            link,
            published_at,
            item_text(item, "description"),
        )

def write_article_csv_header(writer):
    """Write the CSV columns expected by the insert query."""
    writer.writerow([
        "article_id",
        "source",
        "title",
        "link",
        "published_at",
        "description",
    ])

def stage_feed_articles():
    """Download each configured feed and stream article rows to a local CSV."""
    staged_count = 0
    with STAGING_CSV.open("w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        write_article_csv_header(writer)

        for feed in FEEDS:
            response = httpx.get(
                feed["url"],
                timeout=30,
                headers={"User-Agent": USER_AGENT},
            )
            response.raise_for_status()

            for row in parse_feed_articles(feed["source"], response.content):
                writer.writerow(row)
                staged_count += 1

    return STAGING_CSV, staged_count

def ensure_demo_schema(con):
    """Create the schema and tables this Flight owns."""
    con.execute("CREATE SCHEMA IF NOT EXISTS docs_playground.flights_demo")
    con.execute(f"""
        CREATE TABLE IF NOT EXISTS {ARTICLE_TABLE} (
            article_id VARCHAR,
            source VARCHAR,
            title VARCHAR,
            link VARCHAR,
            published_at TIMESTAMPTZ,
            description VARCHAR,
            summary VARCHAR,
            primary_topic VARCHAR,
            topics VARCHAR[],
            audience VARCHAR,
            model VARCHAR,
            loaded_at TIMESTAMPTZ,
            summarized_at TIMESTAMPTZ
        )
    """)

    con.execute(f"""
        CREATE TABLE IF NOT EXISTS {BRIEFING_TABLE} (
            briefing_date DATE,
            article_count INTEGER,
            briefing VARCHAR,
            created_at TIMESTAMPTZ DEFAULT current_timestamp
        )
    """)

def insert_new_articles(con, csv_path):
    """Insert staged CSV rows whose IDs are not in MotherDuck yet."""
    inserted_rows = con.execute(
        f"""
        INSERT INTO {ARTICLE_TABLE} (
            article_id,
            source,
            title,
            link,
            published_at,
            description,
            loaded_at
        )
        SELECT
            fetched.article_id,
            fetched.source,
            fetched.title,
            fetched.link,
            fetched.published_at,
            fetched.description,
            current_timestamp
        FROM (
            SELECT
                article_id,
                source,
                title,
                link,
                published_at,
                description,
                row_number() OVER (
                    PARTITION BY article_id
                    ORDER BY published_at DESC NULLS LAST
                ) AS article_rank
            FROM read_csv(
                ?,
                header := true,
                columns := {{
                    'article_id': 'VARCHAR',
                    'source': 'VARCHAR',
                    'title': 'VARCHAR',
                    'link': 'VARCHAR',
                    'published_at': 'TIMESTAMPTZ',
                    'description': 'VARCHAR'
                }}
            )
        ) AS fetched
        ANTI JOIN {ARTICLE_TABLE} AS existing
            USING (article_id)
        WHERE fetched.article_rank = 1
        RETURNING article_id
        """,
        [str(csv_path)],
    ).fetchall()
    return len(inserted_rows)

def enrich_recent_articles(con):
    """Summarize recent rows that do not have prompt outputs yet."""
    pending_count = con.execute(f"""
        SELECT count(*)
        FROM (
            SELECT article_id
            FROM {ARTICLE_TABLE}
            WHERE summarized_at IS NULL
              AND loaded_at > current_date - INTERVAL 3 DAY
            ORDER BY published_at DESC NULLS LAST
            LIMIT {MAX_ARTICLES_TO_ENRICH}
        )
    """).fetchone()[0]

    if pending_count == 0:
        return 0

    con.execute(f"""
        UPDATE {ARTICLE_TABLE} AS article
        SET
            summary = enriched.summary,
            primary_topic = enriched.primary_topic,
            topics = enriched.topics,
            audience = enriched.audience,
            model = 'gpt-5-nano',
            summarized_at = current_timestamp
        FROM (
            SELECT
                article_id,
                extracted.summary AS summary,
                extracted.primary_topic AS primary_topic,
                extracted.topics AS topics,
                extracted.audience AS audience
            FROM (
                SELECT
                    article_id,
                    prompt(
                        'Summarize this technology article for a data practitioner. Return one specific primary topic, three short topics, the likely audience, and a one-sentence summary. Title: '
                        || title
                        || '. Description: '
                        || coalesce(description, ''),
                        model := 'gpt-5-nano',
                        reasoning_effort := 'minimal',
                        struct := {{
                            summary: 'VARCHAR',
                            primary_topic: 'VARCHAR',
                            topics: 'VARCHAR[]',
                            audience: 'VARCHAR'
                        }},
                        struct_descr := {{
                            primary_topic: 'A concise topic label such as databases, AI agents, cloud infrastructure, chips, security, data engineering, or startups',
                            topics: 'Three short topic labels',
                            audience: 'The reader who would care most, such as data engineer, analytics engineer, founder, developer, or CIO'
                        }}
                    ) AS extracted
                FROM {ARTICLE_TABLE}
                WHERE summarized_at IS NULL
                  AND loaded_at > current_date - INTERVAL 3 DAY
                ORDER BY published_at DESC NULLS LAST
                LIMIT {MAX_ARTICLES_TO_ENRICH}
            ) AS prompt_rows
        ) AS enriched
        WHERE article.article_id = enriched.article_id
    """)
    return pending_count

def latest_summarized_articles(con):
    """Return recent summaries as Python dictionaries for the briefing prompt."""
    rows = con.execute(f"""
        SELECT
            source,
            title,
            summary,
            primary_topic,
            topics
        FROM {ARTICLE_TABLE}
        WHERE summary IS NOT NULL
        ORDER BY published_at DESC NULLS LAST
        LIMIT 20
    """).fetchall()

    return [
        {
            "source": source,
            "title": title,
            "summary": summary,
            "primary_topic": primary_topic,
            "topics": list(topics or []),
        }
        for source, title, summary, primary_topic, topics in rows
    ]

def build_briefing_prompt(articles):
    """Turn summarized articles into the prompt for the daily briefing."""
    newline = chr(10)
    article_notes = []
    for article in articles:
        topics = ", ".join(article["topics"]) or "No extracted topics"
        article_notes.append(
            f"- [{article['source']}] {article['title']}: {article['summary']} "
            f"Primary topic: {article['primary_topic']}. Topics: {topics}"
        )

    return (
        "Write a concise daily briefing for a data and AI practitioner. "
        "Start with the main theme, then give three bullet points and one thing to watch. "
        "Base the briefing only on these article notes:"
        + newline
        + newline.join(article_notes)
    )

def write_daily_briefing(con):
    """Append today's briefing from the latest summarized article objects."""
    articles = latest_summarized_articles(con)
    if articles:
        briefing = con.execute(
            """
            SELECT prompt(
                ?,
                model := 'gpt-5-nano',
                reasoning_effort := 'minimal'
            )
            """,
            [build_briefing_prompt(articles)],
        ).fetchone()[0]
    else:
        briefing = "No enriched articles are available yet."

    con.execute(
        f"""
        INSERT INTO {BRIEFING_TABLE} (briefing_date, article_count, briefing)
        VALUES (current_date, ?, ?)
        """,
        [len(articles), briefing],
    )
    return len(articles)

def main():
    con = duckdb.connect("md:")
    ensure_demo_schema(con)

    csv_path, staged_count = stage_feed_articles()
    inserted_count = insert_new_articles(con, csv_path)
    enriched_count = enrich_recent_articles(con)
    briefing_count = write_daily_briefing(con)

    print(
        f"staged {staged_count} articles, inserted {inserted_count} new articles, "
        f"enriched {enriched_count} articles, and briefed on {briefing_count} summaries"
    )

if __name__ == "__main__":
    main()
```

Run the SQL. It creates an on-demand Flight first so you can test the ingestion and prompts before you add a schedule. The SQL editor embeds the same Python source inside `$flight$`.

#### Create a news briefing Flight

Database: `docs_playground`

```sql
SELECT flight_id, flight_name, current_version
FROM MD_CREATE_FLIGHT(
    name := 'docs_news_briefing',
    requirements_txt := array_to_string([
        'duckdb==1.5.3',
        'httpx==0.28.1'
    ], chr(10)),
    source_code := $flight$
import csv
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from pathlib import Path

import duckdb
import httpx

ARTICLE_TABLE = "docs_playground.flights_demo.news_articles"
BRIEFING_TABLE = "docs_playground.flights_demo.news_daily_briefings"
STAGING_CSV = Path("/tmp/fetched_news_articles.csv")
MAX_ARTICLES_TO_ENRICH = 20
USER_AGENT = "MotherDuck Flights docs demo"

FEEDS = [
    {
        "source": "The Register database news",
        "url": (
            "https://api.theregister.com/api/v1/article"
            "?orderBy=published&site_id=2"
            "&query=(tag:databases)&remapper=rss"
        ),
    },
    {
        "source": "TechCrunch AI",
        "url": "https://techcrunch.com/category/artificial-intelligence/feed/",
    },
]

def item_text(item, tag):
    """Return stripped text from an RSS item child tag."""
    node = item.find(tag)
    if node is None or node.text is None:
        return ""
    return node.text.strip()

def parse_feed_articles(source, feed_bytes):
    """Yield normalized RSS items as article table rows."""
    root = ET.fromstring(feed_bytes)
    for item in root.findall("./channel/item"):
        published_raw = item_text(item, "pubDate")
        published_at = parsedate_to_datetime(published_raw).isoformat() if published_raw else None
        link = item_text(item, "link")
        yield (
            source + ":" + (item_text(item, "guid") or link),
            source,
            item_text(item, "title"),
            link,
            published_at,
            item_text(item, "description"),
        )

def write_article_csv_header(writer):
    """Write the CSV columns expected by the insert query."""
    writer.writerow([
        "article_id",
        "source",
        "title",
        "link",
        "published_at",
        "description",
    ])

def stage_feed_articles():
    """Download each configured feed and stream article rows to a local CSV."""
    staged_count = 0
    with STAGING_CSV.open("w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        write_article_csv_header(writer)

        for feed in FEEDS:
            response = httpx.get(
                feed["url"],
                timeout=30,
                headers={"User-Agent": USER_AGENT},
            )
            response.raise_for_status()

            for row in parse_feed_articles(feed["source"], response.content):
                writer.writerow(row)
                staged_count += 1

    return STAGING_CSV, staged_count

def ensure_demo_schema(con):
    """Create the schema and tables this Flight owns."""
    con.execute("CREATE SCHEMA IF NOT EXISTS docs_playground.flights_demo")
    con.execute(f"""
        CREATE TABLE IF NOT EXISTS {ARTICLE_TABLE} (
            article_id VARCHAR,
            source VARCHAR,
            title VARCHAR,
            link VARCHAR,
            published_at TIMESTAMPTZ,
            description VARCHAR,
            summary VARCHAR,
            primary_topic VARCHAR,
            topics VARCHAR[],
            audience VARCHAR,
            model VARCHAR,
            loaded_at TIMESTAMPTZ,
            summarized_at TIMESTAMPTZ
        )
    """)

    con.execute(f"""
        CREATE TABLE IF NOT EXISTS {BRIEFING_TABLE} (
            briefing_date DATE,
            article_count INTEGER,
            briefing VARCHAR,
            created_at TIMESTAMPTZ DEFAULT current_timestamp
        )
    """)

def insert_new_articles(con, csv_path):
    """Insert staged CSV rows whose IDs are not in MotherDuck yet."""
    inserted_rows = con.execute(
        f"""
        INSERT INTO {ARTICLE_TABLE} (
            article_id,
            source,
            title,
            link,
            published_at,
            description,
            loaded_at
        )
        SELECT
            fetched.article_id,
            fetched.source,
            fetched.title,
            fetched.link,
            fetched.published_at,
            fetched.description,
            current_timestamp
        FROM (
            SELECT
                article_id,
                source,
                title,
                link,
                published_at,
                description,
                row_number() OVER (
                    PARTITION BY article_id
                    ORDER BY published_at DESC NULLS LAST
                ) AS article_rank
            FROM read_csv(
                ?,
                header := true,
                columns := {{
                    'article_id': 'VARCHAR',
                    'source': 'VARCHAR',
                    'title': 'VARCHAR',
                    'link': 'VARCHAR',
                    'published_at': 'TIMESTAMPTZ',
                    'description': 'VARCHAR'
                }}
            )
        ) AS fetched
        ANTI JOIN {ARTICLE_TABLE} AS existing
            USING (article_id)
        WHERE fetched.article_rank = 1
        RETURNING article_id
        """,
        [str(csv_path)],
    ).fetchall()
    return len(inserted_rows)

def enrich_recent_articles(con):
    """Summarize recent rows that do not have prompt outputs yet."""
    pending_count = con.execute(f"""
        SELECT count(*)
        FROM (
            SELECT article_id
            FROM {ARTICLE_TABLE}
            WHERE summarized_at IS NULL
              AND loaded_at > current_date - INTERVAL 3 DAY
            ORDER BY published_at DESC NULLS LAST
            LIMIT {MAX_ARTICLES_TO_ENRICH}
        )
    """).fetchone()[0]

    if pending_count == 0:
        return 0

    con.execute(f"""
        UPDATE {ARTICLE_TABLE} AS article
        SET
            summary = enriched.summary,
            primary_topic = enriched.primary_topic,
            topics = enriched.topics,
            audience = enriched.audience,
            model = 'gpt-5-nano',
            summarized_at = current_timestamp
        FROM (
            SELECT
                article_id,
                extracted.summary AS summary,
                extracted.primary_topic AS primary_topic,
                extracted.topics AS topics,
                extracted.audience AS audience
            FROM (
                SELECT
                    article_id,
                    prompt(
                        'Summarize this technology article for a data practitioner. Return one specific primary topic, three short topics, the likely audience, and a one-sentence summary. Title: '
                        || title
                        || '. Description: '
                        || coalesce(description, ''),
                        model := 'gpt-5-nano',
                        reasoning_effort := 'minimal',
                        struct := {{
                            summary: 'VARCHAR',
                            primary_topic: 'VARCHAR',
                            topics: 'VARCHAR[]',
                            audience: 'VARCHAR'
                        }},
                        struct_descr := {{
                            primary_topic: 'A concise topic label such as databases, AI agents, cloud infrastructure, chips, security, data engineering, or startups',
                            topics: 'Three short topic labels',
                            audience: 'The reader who would care most, such as data engineer, analytics engineer, founder, developer, or CIO'
                        }}
                    ) AS extracted
                FROM {ARTICLE_TABLE}
                WHERE summarized_at IS NULL
                  AND loaded_at > current_date - INTERVAL 3 DAY
                ORDER BY published_at DESC NULLS LAST
                LIMIT {MAX_ARTICLES_TO_ENRICH}
            ) AS prompt_rows
        ) AS enriched
        WHERE article.article_id = enriched.article_id
    """)
    return pending_count

def latest_summarized_articles(con):
    """Return recent summaries as Python dictionaries for the briefing prompt."""
    rows = con.execute(f"""
        SELECT
            source,
            title,
            summary,
            primary_topic,
            topics
        FROM {ARTICLE_TABLE}
        WHERE summary IS NOT NULL
        ORDER BY published_at DESC NULLS LAST
        LIMIT 20
    """).fetchall()

    return [
        {
            "source": source,
            "title": title,
            "summary": summary,
            "primary_topic": primary_topic,
            "topics": list(topics or []),
        }
        for source, title, summary, primary_topic, topics in rows
    ]

def build_briefing_prompt(articles):
    """Turn summarized articles into the prompt for the daily briefing."""
    newline = chr(10)
    article_notes = []
    for article in articles:
        topics = ", ".join(article["topics"]) or "No extracted topics"
        article_notes.append(
            f"- [{article['source']}] {article['title']}: {article['summary']} "
            f"Primary topic: {article['primary_topic']}. Topics: {topics}"
        )

    return (
        "Write a concise daily briefing for a data and AI practitioner. "
        "Start with the main theme, then give three bullet points and one thing to watch. "
        "Base the briefing only on these article notes:"
        + newline
        + newline.join(article_notes)
    )

def write_daily_briefing(con):
    """Append today's briefing from the latest summarized article objects."""
    articles = latest_summarized_articles(con)
    if articles:
        briefing = con.execute(
            """
            SELECT prompt(
                ?,
                model := 'gpt-5-nano',
                reasoning_effort := 'minimal'
            )
            """,
            [build_briefing_prompt(articles)],
        ).fetchone()[0]
    else:
        briefing = "No enriched articles are available yet."

    con.execute(
        f"""
        INSERT INTO {BRIEFING_TABLE} (briefing_date, article_count, briefing)
        VALUES (current_date, ?, ?)
        """,
        [len(articles), briefing],
    )
    return len(articles)

def main():
    con = duckdb.connect("md:")
    ensure_demo_schema(con)

    csv_path, staged_count = stage_feed_articles()
    inserted_count = insert_new_articles(con, csv_path)
    enriched_count = enrich_recent_articles(con)
    briefing_count = write_daily_briefing(con)

    print(
        f"staged {staged_count} articles, inserted {inserted_count} new articles, "
        f"enriched {enriched_count} articles, and briefed on {briefing_count} summaries"
    )

if __name__ == "__main__":
    main()
$flight$
);
```

## Run it once

Store the news briefing Flight ID in a SQL variable. The next cells use that variable, so you can run, inspect, and schedule the same Flight without repeating the lookup.

#### Set the news briefing Flight ID

Database: `docs_playground`

```sql
SET VARIABLE news_briefing_flight_id = (
    SELECT flight_id
    FROM MD_LIST_FLIGHTS()
    WHERE flight_name = 'docs_news_briefing'
    ORDER BY created_at DESC
    LIMIT 1
);
```

Trigger one run from the docs SQL editor so you can inspect the result before adding a cron schedule.

#### Run the news briefing Flight

Database: `docs_playground`

```sql
SELECT *
FROM MD_RUN_FLIGHT(
    flight_id := getvariable('news_briefing_flight_id')
);
```

Runs are asynchronous. Poll the latest run until it reaches `RUN_STATUS_SUCCEEDED` or `RUN_STATUS_FAILED`.

#### Check the latest run

Database: `docs_playground`

```sql
SELECT run_number, status, flight_version, created_at
FROM MD_LIST_FLIGHT_RUNS(
    flight_id := getvariable('news_briefing_flight_id')
)
ORDER BY run_number DESC
LIMIT 5;
```

If the run fails, read the log before editing the source. First, store the latest run number in a variable.

#### Set the latest run number

Database: `docs_playground`

```sql
SET VARIABLE news_briefing_run_number = (
    SELECT max(run_number)
    FROM MD_LIST_FLIGHT_RUNS(
        flight_id := getvariable('news_briefing_flight_id')
    )
);
```

Then read the log.

#### Read the latest run log

Database: `docs_playground`

```sql
SELECT logs
FROM MD_GET_FLIGHT_LOGS(
    flight_id := getvariable('news_briefing_flight_id'),
    run_number := getvariable('news_briefing_run_number')
);
```

## Inspect the AI-enriched article table

The Flight used `prompt()` while inserting new feed items into the main article table. Start with the latest enriched article rows:

#### Read enriched articles

Database: `docs_playground`

```sql
SELECT source, title, primary_topic, topics, audience, summary
FROM docs_playground.flights_demo.news_articles
ORDER BY summarized_at DESC
LIMIT 10;
```

Then read the daily briefing generated across those article summaries:

#### Read the daily briefing

Database: `docs_playground`

```sql
SELECT briefing_date, article_count, briefing
FROM docs_playground.flights_demo.news_daily_briefings
ORDER BY briefing_date DESC, created_at DESC
LIMIT 3;
```

The topic query is what the Dive will use for a compact chart:

#### Read topic trends

Database: `docs_playground`

```sql
SELECT article_date, topic, article_count
FROM (
    SELECT
        CAST(published_at AS DATE) AS article_date,
        topic,
        count(*) AS article_count
    FROM docs_playground.flights_demo.news_articles,
         unnest(topics) AS topic_table(topic)
    GROUP BY ALL
)
ORDER BY article_date DESC, article_count DESC
LIMIT 20;
```

## Schedule future briefings

After the manual run succeeds, add the daily `07:00 UTC` schedule. Schedule updates are metadata-only; they do not create a new Flight version.

#### Schedule the briefing Flight

Database: `docs_playground`

```sql
CALL MD_UPDATE_FLIGHT(
    flight_id := getvariable('news_briefing_flight_id'),
    schedule_cron := '0 7 * * *'
);
```

## Create the daily briefing Dive

Run this SQL once after the Flight has produced the briefing tables. It creates a Dive that shows the latest briefing, topic counts derived from the enriched article table, and the most recent articles. The Dive declares the database it reads with `REQUIRED_DATABASES`, and the query result includes a `url` column that links straight to the new Dive. The editor folds the `$dive$` body by default so the runnable SQL stays scannable.

#### Create the daily briefing Dive

Database: `docs_playground`

```sql
SELECT
    'https://app.motherduck.com/dives/daily-tech-briefing-from-flights-' || id AS url,
    id,
    title,
    current_version
FROM MD_CREATE_DIVE(
  title = 'Daily tech briefing from Flights',
  description = 'Daily briefing and topic trends generated by a scheduled Flight',
  content = $dive$
import { useSQLQuery } from "@motherduck/react-sql-query";

export const REQUIRED_DATABASES = [
  { type: 'database', path: 'md:docs_playground', alias: 'docs_playground' }
];

const N = (value) => (value != null ? Number(value) : 0);

const C = { bg: '#faf6ec', panel: '#f3edda', ink: '#2b2620', muted: '#8a8070', line: '#e6dcc6', dot: '#b3502d' };
const SERIF = "'Iowan Old Style', 'Palatino Linotype', Palatino, Georgia, serif";
const META = { color: C.muted, fontSize: 12, textTransform: 'uppercase', letterSpacing: '0.08em' };

export default function Dive() {
  const briefingQuery = useSQLQuery('SELECT CAST(briefing_date AS VARCHAR) AS briefing_date, article_count, briefing FROM docs_playground.flights_demo.news_daily_briefings ORDER BY briefing_date DESC, created_at DESC LIMIT 1');
  const topicsQuery = useSQLQuery('SELECT CAST(article_date AS VARCHAR) AS article_date, topic, article_count FROM (SELECT CAST(published_at AS DATE) AS article_date, topic, count(*) AS article_count FROM docs_playground.flights_demo.news_articles, unnest(topics) AS topic_table(topic) GROUP BY ALL) ORDER BY article_date DESC, article_count DESC LIMIT 80');
  const articlesQuery = useSQLQuery("SELECT title, link, source, summary, strftime(published_at, '%b %d') AS published_label FROM docs_playground.flights_demo.news_articles ORDER BY published_at DESC LIMIT 10");

  const briefingRows = Array.isArray(briefingQuery.data) ? briefingQuery.data : [];
  const topicRows = Array.isArray(topicsQuery.data) ? topicsQuery.data : [];
  const articleRows = Array.isArray(articlesQuery.data) ? articlesQuery.data : [];
  const briefing = briefingRows[0];

  const topicTotals = Object.values(topicRows.reduce((acc, row) => {
    const key = row.topic || 'Uncategorized';
    acc[key] = acc[key] || { topic: key, count: 0 };
    acc[key].count += N(row.article_count);
    return acc;
  }, {})).sort((a, b) => b.count - a.count).slice(0, 8);

  const maxCount = Math.max(1, ...topicTotals.map((row) => row.count));

  if (briefingQuery.isLoading || topicsQuery.isLoading || articlesQuery.isLoading) {
    return <main style={{ fontFamily: SERIF, background: C.bg, color: C.muted, padding: 24, minHeight: '100vh', boxSizing: 'border-box' }}>Gathering the briefing...</main>;
  }

  return (
    <main style={{ fontFamily: SERIF, color: C.ink, background: C.bg, padding: '32px 24px 40px', minHeight: '100vh', boxSizing: 'border-box' }}>
      <div style={{ maxWidth: 820, margin: '0 auto' }}>
        <header style={{ textAlign: 'center' }}>
          <p style={{ ...META, margin: 0, letterSpacing: '0.12em', fontSize: 11 }}>Daily briefing</p>
          <h1 style={{ margin: '6px 0 0', fontSize: 28, fontWeight: 600, letterSpacing: '0.01em' }}>Tech news signal</h1>
          <p style={{ color: C.muted, fontStyle: 'italic', fontSize: 13, margin: '6px 0 0' }}>{briefing ? briefing.briefing_date : 'No briefing yet'} · {briefing ? N(briefing.article_count) : 0} articles</p>
        </header>

        <section style={{ marginTop: 24, background: C.panel, border: '1px solid ' + C.line, borderRadius: 3, padding: '18px 22px' }}>
          <p style={{ whiteSpace: 'pre-wrap', lineHeight: 1.7, fontSize: 14.5, margin: 0 }}>{briefing ? briefing.briefing : 'Run the Flight to generate a briefing.'}</p>
        </section>

        <section style={{ marginTop: 32 }}>
          <h2 style={{ ...META, fontSize: 13, fontWeight: 700, borderBottom: '1px solid ' + C.line, paddingBottom: 8 }}>Topics over time</h2>
          {topicTotals.length === 0 ? (
            <p style={{ color: C.muted }}>No topic rows yet.</p>
          ) : (
            topicTotals.map((row) => (
              <div key={row.topic} style={{ marginBottom: 12 }}>
                <div style={{ display: 'flex', justifyContent: 'space-between', gap: 12, fontSize: 14 }}>
                  <span style={{ fontWeight: 600 }}>{row.topic}</span>
                  <span style={{ color: C.muted }}>{row.count}</span>
                </div>
                <div style={{ height: 6, marginTop: 4, background: C.panel }}>
                  <div style={{ height: '100%', width: (row.count / maxCount * 100) + '%', background: C.dot }} />
                </div>
              </div>
            ))
          )}
        </section>

        <section style={{ marginTop: 32 }}>
          <h2 style={{ ...META, fontSize: 13, fontWeight: 700, borderBottom: '1px solid ' + C.line, paddingBottom: 8, marginBottom: 0 }}>Recent articles</h2>
          {articleRows.length === 0 ? (
            <p style={{ color: C.muted }}>No articles yet.</p>
          ) : (
            articleRows.map((row, i) => (
              <article key={row.link || i} style={{ padding: '16px 0', borderTop: i === 0 ? 'none' : '1px solid ' + C.line }}>
                <a href={row.link} target="_blank" rel="noopener noreferrer" style={{ color: C.ink, fontWeight: 600, fontSize: 17, lineHeight: 1.35, textDecoration: 'none', borderBottom: '1px solid ' + C.line }}>{row.title}</a>
                <p style={{ ...META, margin: '6px 0 0' }}>{row.source} · {row.published_label}</p>
                <p style={{ margin: '8px 0 0', fontSize: 14.5, lineHeight: 1.65 }}>{row.summary}</p>
              </article>
            ))
          )}
        </section>
      </div>
    </main>
  );
}
$dive$
);
```

Open the `url` from the query result to view the Dive, or find it in the MotherDuck UI under [**Dives**](https://app.motherduck.com/dives). Re-run the Flight on later days and the Dive reads the updated briefing tables.

## Adapt the pattern

- Add more RSS feeds to the `FEEDS` list.
- Change the `prompt()` instructions to match your role, team, or industry.
- Move settings like `MAX_ARTICLES_TO_ENRICH` into the Flight's `config` and read them with `os.environ`, then [override them for a single run](/key-tasks/flights/scheduling-and-runs#override-config-for-a-single-run) with the `config` argument of `MD_RUN_FLIGHT` instead of editing the source.
- Replace the Dive content with a chart-focused layout after you know which topics matter.
- Change `schedule_cron` after you are happy with the first manual run. See [Scheduling Flights and managing runs](/key-tasks/flights/scheduling-and-runs).

## Related resources

- [PROMPT](/sql-reference/motherduck-sql-reference/ai-functions/prompt/)
- [MD_CREATE_DIVE](/sql-reference/motherduck-sql-reference/dives/md-create-dive)
- [MD_CREATE_FLIGHT](/sql-reference/motherduck-sql-reference/flights/md-create-flight)
- [MD_RUN_FLIGHT](/sql-reference/motherduck-sql-reference/flights/md-run-flight)
- [MD_LIST_FLIGHT_RUNS](/sql-reference/motherduck-sql-reference/flights/md-list-flight-runs)


---

## Docs feedback

MotherDuck accepts optional user-submitted feedback about this page at `GET https://motherduck.com/docs/api/feedback/agent`.
For agents and automated tools, feedback submission should be user-confirmed before sending.

URL-encode query parameter values and send a GET request:

```text
GET https://motherduck.com/docs/api/feedback/agent?page_path=%2Fkey-tasks%2Fflights%2Fbuild-daily-briefing-flight-and-dive%2F&page_title=Build%20a%20daily%20briefing%20Flight%20and%20Dive&text=<url-encoded user feedback, max 2000 characters>
```

Optionally append `&source=<url-encoded interface identifier>` such as `claude.ai` or `chatgpt`.

`page_path` and `text` are required; `page_title` and `source` are optional. Responses: `200 {"feedback_id": "<uuid>"}`, `400` for malformed query parameters, and `429` when rate-limited.