# PROMPT
> Generate AI responses directly in SQL with the PROMPT function.
::::warning[Preview Feature]
This is a preview feature. Preview features may be operationally incomplete and may offer limited backward compatibility.
::::

## Prompt function

The `prompt` function sends text to a Large Language Model (LLM) from SQL and returns the model's response. Use it to generate free-form text, extract typed values, or produce structured data.

The function supports OpenAI's `gpt-5` series (`gpt-5`, `gpt-5-mini`, `gpt-5-nano`), `gpt-4o-mini` (default), `gpt-4o`, and the `gpt-4.1` series. All models support single-row prompts and multi-row queries for batch processing.

The `prompt` function runs once per row in the result set. A query like `SELECT prompt('Write a joke') FROM range(0, 10000)` calls the model 10,000 times, even though the prompt text looks like a single call. Cost scales with the number of rows the query evaluates.

Consumption is measured in [AI Units](/about-motherduck/billing/pricing#ai-function-pricing). As a rough guide, one AI Unit covers approximately the following number of rows per model:

- 480 rows with `gpt-4o`
- 8,000 rows with `gpt-4o-mini`
- 600 rows with `gpt-4.1`
- 3,000 rows with `gpt-4.1-mini`
- 12,000 rows with `gpt-4.1-nano`
- 720 rows with `gpt-5`
- 3,600 rows with `gpt-5-mini`
- 18,000 rows with `gpt-5-nano`

These estimates assume about 1,000 input characters and 250 output characters per row. Actual cost depends on token usage, so longer prompts or responses consume more AI Units per row.

## Syntax

```sql
SELECT prompt('Write a poem about ducks'); -- returns a single-cell result with the response
```

### Parameters

| **Parameter**      | **Required** | **Description**                                                                                                                                                                                                                                 |
|--------------------|--------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `prompt_text`      | Yes          | The text input to send to the model                                                                                                                                                                                                             |
| `model`            | No           | Model type: `'gpt-5'`, `'gpt-5-mini'`, `'gpt-5-nano'`, `'gpt-4o-mini'` (default), `'gpt-4o'`, `'gpt-4.1'`, `'gpt-4.1-mini'`, or `'gpt-4.1-nano'`                                                                                                |
| `temperature`      | No           | Model temperature value between `0` and `1`, default: `0.1`. Lower values produce more deterministic outputs. **Not supported with GPT-5 models** (use `reasoning_effort` instead).                                                             |
| `reasoning_effort` | No           | Controls reasoning depth for GPT-5 models only. Valid values: `'minimal'` (default), `'low'`, `'medium'`, `'high'`. Higher effort may improve accuracy for complex tasks. **Only available for GPT-5 series models**.                           |
| `return_type`      | No           | Specifies the exact SQL type to return (e.g., `'INTEGER'`, `'BOOLEAN'`, `'DATE'`, `'VARCHAR[]'`, `'STRUCT(name VARCHAR, age INTEGER)'`). Supports most DuckDB types including primitives, arrays, structs, and enums. Mutually exclusive with `struct` and `json_schema`. |
| `struct`           | No           | Output schema as struct, e.g. `{summary: 'VARCHAR', persons: 'VARCHAR[]'}`. Will result in `STRUCT` output. Mutually exclusive with `return_type` and `json_schema`.                                                                            |
| `struct_descr`     | No           | Descriptions for struct fields that will be added to the model's context, e.g. `{summary: 'a 1 sentence summary of the text', persons: 'an array of all persons mentioned in the text'}`                                                        |
| `json_schema`      | No           | A JSON schema that adheres to [OpenAI's structured output guide](https://developers.openai.com/api/docs/guides/structured-outputs). Provides more flexibility than the struct/struct_descr parameters. Will result in `JSON` output. Mutually exclusive with `return_type` and `struct`. |

**Note**: The `return_type` and `struct` parameters support enum types for classification tasks. Define enum types first using `CREATE TYPE`, then reference them in the struct schema (e.g., `sentiment: 'sentiment_enum'` or `categories: 'category_enum[]'` for arrays).

### Return types

The `prompt` function can return different data types depending on the parameters used:

- Without structure parameters: Returns `VARCHAR`
- With `return_type` parameter: Returns the exact SQL type specified (e.g., `INTEGER`, `BOOLEAN`, `DATE`, `VARCHAR[]`, `STRUCT(...)`)
- With `struct` parameter: Returns a `STRUCT` with the specified schema
- With `json_schema` parameter: Returns `JSON`

**Note**: The `return_type`, `struct`, and `json_schema` parameters are mutually exclusive. Use only one at a time.

## Example usage

### Basic text generation

```sql
-- Call gpt-4o-mini (default) to generate text
SELECT prompt('Write a poem about ducks') AS response;

-- Call gpt-4o with higher temperature for more creative outputs
SELECT prompt('Write a poem about ducks', model:='gpt-4o', temperature:=1) AS response;
```

### Structured output with struct

```sql
-- Extract structured information from text using struct parameter
SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.',
    struct:={summary: 'VARCHAR', favourite_animals:'VARCHAR[]', star_rating:'INTEGER'},
    struct_descr:={star_rating: 'visit rating on a scale from 1 (bad) to 5 (very good)'}) AS zoo_review;
```

This returns a `STRUCT` value that can be accessed with dot notation:

```sql
SELECT
    zoo_review.summary,
    zoo_review.favourite_animals,
    zoo_review.star_rating
FROM (
    SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.',
        struct:={summary: 'VARCHAR', favourite_animals:'VARCHAR[]', star_rating:'INTEGER'},
        struct_descr:={star_rating: 'visit rating on a scale from 1 (bad) to 5 (very good)'}) AS zoo_review
);
```

### Structured output with JSON schema

```sql
-- Extract structured information using JSON schema
SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.',
    json_schema := '{
    "name": "zoo_visit_review",
    "schema": {
      "type": "object",
      "properties": {
        "summary": { "type": "string" },
        "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
        "animals_seen": { "type": "array", "items": { "type": "string" } }
      },
      "required": ["summary", "sentiment", "animals_seen"],
      "additionalProperties": false
    },
    "strict": true
  }') AS json_review;
```

This returns a `JSON` value that, if saved, can be accessed using JSON extraction functions:

```sql
SELECT
    json_extract_string(json_review, '$.summary') AS summary,
    json_extract_string(json_review, '$.sentiment') AS sentiment,
    json_extract(json_review, '$.animals_seen') AS animals_seen
FROM (
    SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.',
        json_schema := '{ ... }') AS json_review
);
```

### Typed output with return type

The `return_type` parameter lets you specify the exact SQL type for the model's response, providing strong typing for single-value extractions:

```sql
-- Extract an integer from text
SELECT prompt('The answer is 42', return_type := 'INTEGER') AS answer;
-- Returns: 42 (as INTEGER type)

-- Extract a boolean
SELECT prompt('Is the sky blue?', return_type := 'BOOLEAN') AS is_blue;
-- Returns: true (as BOOLEAN type)

-- Extract a date
SELECT prompt('When is January 15, 2025?', return_type := 'DATE') AS event_date;
-- Returns: 2025-01-15 (as DATE type)

-- Extract multiple structured fields
SELECT prompt(
    'John is 30 years old and lives in NYC',
    return_type := 'STRUCT(name VARCHAR, age INTEGER, city VARCHAR)'
) AS person_info;
-- Returns: {'name': 'John', 'age': 30, 'city': 'NYC'} (as STRUCT type)

-- Extract arrays
SELECT prompt('List the days of the week', return_type := 'VARCHAR[]') AS weekdays;
-- Returns: ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
```

The `return_type` parameter supports most DuckDB types including:
- **Primitives**: `VARCHAR`, `INTEGER`, `BIGINT`, `DOUBLE`, `BOOLEAN`, `DATE`, `TIMESTAMP`, etc.
- **Arrays**: `INTEGER[]`, `VARCHAR[]`, `DOUBLE[]`, etc.
- **Structs**: `STRUCT(field1 TYPE1, field2 TYPE2, ...)`
- **Enums**: Custom enum types created with `CREATE TYPE`

### GPT-5 reasoning effort

The `reasoning_effort` parameter controls how much computational effort GPT-5 models spend on reasoning. This is only available for GPT-5 series models (`gpt-5`, `gpt-5-mini`, `gpt-5-nano`):

```sql
-- Use minimal reasoning (fastest, default)
SELECT prompt('What is 2+2?', 'gpt-5-mini',
    reasoning_effort := 'minimal',
    return_type := 'INTEGER') AS result;

-- Use low reasoning for simple tasks
SELECT prompt('Count the letters in "hello"', 'gpt-5-nano',
    reasoning_effort := 'low',
    return_type := 'INTEGER') AS letter_count;

-- Use medium reasoning for moderate complexity
SELECT prompt('Calculate 5 factorial', 'gpt-5-mini',
    reasoning_effort := 'medium',
    return_type := 'INTEGER') AS factorial;

-- Use high reasoning for complex tasks
SELECT prompt('Solve this logic puzzle: ...', 'gpt-5',
    reasoning_effort := 'high') AS solution;
```

**Note**: The `reasoning_effort` parameter cannot be used with non-GPT-5 models, and `temperature` cannot be used with GPT-5 models. They are mutually exclusive ways of controlling model behavior.

## Use cases

### Text generation

Use the prompt function to write a poem about ducks:

```sql
--- Prompt LLM to write a poem about ducks
SELECT prompt('Write a poem about ducks') AS response;
```

| **response**                                                                                                     |
|------------------------------------------------------------------------------------------------------------------|
| 'Beneath the whispering willow trees, Where ripples dance with wayward breeze, A symphony of quacks arise [...]' |

### Summarization

Use the prompt function to create a one-sentence summary of movie descriptions.
The example is based on the sample movies dataset from [MotherDuck's sample data database](/docs/getting-started/interfaces/client-apis/python/query-data).

```sql
--- Create a new table with summaries for the first 100 overview texts
CREATE TABLE my_db.movies AS
    SELECT title,
           overview,
           prompt('Summarize this movie description in one sentence: ' || overview) AS summary
    FROM kaggle.movies
    LIMIT 100;
```

If write access to the source table is available, the summary column can also be added in place:

```sql
--- Update the existing table to add new column for summaries
ALTER TABLE my_db.movies ADD COLUMN summary VARCHAR;

--- Populate the column with summaries
UPDATE my_db.movies
SET summary = prompt('Summarize this movie description in one sentence: ' || overview);
```

The movies table now contains a new column `summary` with one-sentence summaries of the movies:

```sql
SELECT title, overview, summary
FROM my_db.movies;
```

| **title** | **overview**                                 | **summary**                                          |
|-----------|----------------------------------------------|------------------------------------------------------|
| Toy Story | Led by Woody, Andy's toys live happily [...] | In "Toy Story," Woody's jealousy of the new [...]    |
| Jumanji   | When siblings Judy and Peter discover [...]  | In this thrilling adventure, siblings Judy and [...] |
| ...       | ...                                          | ...                                                  |

### Structured data extraction

Use the prompt function to extract structured data from text.

The example is based on the same sample movies dataset from [MotherDuck's sample data database](/getting-started/sample-data-queries/datasets). This time we aim to extract structured metadata from the movie's overview description.

We are interested in the main characters mentioned in the descriptions, as well as the movie's genre and a rating of how much action the movie contains, given a scale of 1 (no action) to 5 (lot of action).
For this, we make use of the `struct` and `struct_descr` parameters, which will result in structured output.

```sql
--- Update the existing table to add new column for structured metadata
ALTER TABLE my_db.movies ADD COLUMN metadata STRUCT(main_characters VARCHAR[], genre VARCHAR, action INTEGER);

--- Populate the column with structured information
UPDATE my_db.movies
SET metadata = prompt(
  overview,
  struct:={main_characters: 'VARCHAR[]', genre: 'VARCHAR', action: 'INTEGER'},
  struct_descr:={
      main_characters: 'an array of the main character names mentioned in the movie description',
      genre: 'the primary genre of the movie based on the description',
      action: 'rate on a scale from 1 (no action) to 5 (high action) how much action the movie contains'
  }
);
```

The resulting `metadata` field is a `STRUCT` that can be accessed as follows:

```sql
SELECT title,
       overview,
       metadata.main_characters,
       metadata.genre,
       metadata.action
FROM my_db.movies;
```

| **title** | **overview**                                 | **metadata.main_characters**                                            | **metadata.genre**           | **action** |
|-----------|----------------------------------------------|-------------------------------------------------------------------------|------------------------------|------------|
| Toy Story | Led by Woody, Andy's toys live happily [...] | ['"Woody"', '"Buzz Lightyear"', '"Andy"', '"Mr. Potato Head"', '"Rex"'] | Animation, Adventure, Comedy | 3          |
| Jumanji   | When siblings Judy and Peter discover  [...] | ['"Judy Shepherd"', '"Peter Shepherd"', '"Alan Parrish"']               | Adventure, Fantasy, Family   | 4          |
| ...       | ...                                          | ...                                                                     | ...                          | ...        |

### Classification with enums

The `prompt` function supports enum types for classification tasks, ensuring consistent and constrained outputs. This is particularly useful for sentiment analysis, categorization, and other classification scenarios.

#### Sentiment analysis

```sql
-- Define an enum for sentiment classification
CREATE TYPE sentiment_type AS ENUM ('positive', 'negative', 'neutral');

-- Classify customer reviews
SELECT
    review_text,
    prompt(
        'Classify the sentiment of this review: ' || review_text,
        struct := {sentiment: 'sentiment_type'}
    ).sentiment AS sentiment
FROM (
    VALUES
        ('The product is amazing, I love it!'),
        ('Terrible quality, waste of money.'),
        ('It works fine, nothing special.')
) AS reviews(review_text);
```

This returns:

| **review_text** | **sentiment** |
|-----------------|---------------|
| The product is amazing, I love it! | positive |
| Terrible quality, waste of money. | negative |
| It works fine, nothing special. | neutral |

#### Extracting multiple categories

Use enum arrays to extract multiple instances of the same category from text:

```sql
-- Define enums for different types of skills mentioned in text
CREATE TYPE skill_type AS ENUM ('sql', 'python', 'javascript', 'react', 'aws', 'docker', 'git');
CREATE TYPE topic_type AS ENUM ('database', 'frontend', 'backend', 'devops', 'analytics', 'security');

-- Extract skills and topics from job descriptions
SELECT
    description,
    prompt(
        'Extract the technical skills and topics mentioned in this text: ' || description,
        struct := {
            skills: 'skill_type[]',
            topics: 'topic_type[]'
        }
    ) AS extracted
FROM (
    VALUES
        ('Looking for a developer with Python and SQL experience for database analytics work'),
        ('Frontend role using React and JavaScript, plus Git for version control'),
        ('DevOps engineer needed for AWS and Docker deployment automation')
) AS jobs(description);
```

This returns arrays of enum values:

| **description** | **extracted.skills** | **extracted.topics** |
|-----------------|---------------------|---------------------|
| Looking for a developer with Python and SQL experience for database analytics work | ['python', 'sql'] | ['database', 'analytics'] |
| Frontend role using React and JavaScript, plus Git for version control | ['javascript', 'react', 'git'] | ['frontend'] |
| DevOps engineer needed for AWS and Docker deployment automation | ['aws', 'docker'] | ['devops'] |

### Retrieval-augmented generation (RAG)

The `prompt` function can be combined with [similarity search on embeddings](/sql-reference/motherduck-sql-reference/ai-functions/embedding/) to build a [RAG](https://motherduck.com/blog/search-using-duckdb-part-2/) pipeline. For advanced retrieval strategies including hybrid search, reranking, and HyDE, see the [Text Search guide](/key-tasks/ai-and-motherduck/text-search-in-motherduck/).

```sql
-- Create a reusable macro for question answering
CREATE OR REPLACE TEMP MACRO ask_question(question_text) AS TABLE (
  SELECT question_text AS question, prompt(
    'User asks the following question:\n' || question_text || '\n\n' ||
    'Here is some additional information:\n' ||
    STRING_AGG('Title: ' || title || '; Description: ' || overview, '\n') || '\n' ||
    'Please answer the question based only on the additional information provided.',
    model := 'gpt-4o'
  ) AS response
  FROM (
    SELECT title, overview
    FROM kaggle.movies
    ORDER BY array_cosine_similarity(overview_embeddings, embedding(question_text)) DESC
    LIMIT 3
  )
);

-- Use the macro to answer questions
SELECT question, response
FROM ask_question('Can you recommend some good sci-fi movies about AI?');
```

This will result in the following output:

| **question**                                        | **response**                                                                      |
|-----------------------------------------------------|-----------------------------------------------------------------------------------|
| Can you recommend some good sci-fi movies about AI? | Based on the information provided, here are some sci-fi movies about AI that you might enjoy: [...] |

:::warning
When passing free-text arguments from external sources to the prompt function (e.g., user questions in a RAG application), always use prepared statements to prevent SQL injection.
:::

Using prepared statements in [Python](/docs/getting-started/interfaces/client-apis/python/query-data/):

```python
# First register the macro
con.execute("""
CREATE OR REPLACE TEMP MACRO ask_question(question_text) AS TABLE (
  -- Macro definition as above
);
""")

# Then use prepared statements for user input
user_query = "Can you recommend some good sci-fi movies about AI?"
result = con.execute("""
SELECT response FROM ask_question(?)
""", [user_query]).fetchall()[0]

print(result[0])
```

## Batch processing

The `prompt` function can process multiple rows in a single query:

```sql
--- Process multiple rows at once
SELECT
    title,
    prompt('Write a tagline for this movie: ' || overview) AS tagline
FROM kaggle.movies
LIMIT 10;
```

## Error handling

When usage limits have been reached or an unexpected error occurs while computing prompt responses,
the function returns `NULL` for the affected rows instead of failing the entire query.

To check whether all responses were computed successfully, check for `NULL` values in the resulting column.

```sql
-- Check for NULL values in response column
SELECT count(*)
FROM my_db.movies
WHERE response IS NULL AND overview IS NOT NULL;
```

Missing values can be filled in with a separate query:

```sql
-- Fill in missing prompt responses
UPDATE my_db.movies
SET response = prompt('Summarize this movie description in one sentence: ' || overview)
WHERE response IS NULL AND overview IS NOT NULL;
```

## Performance considerations

- **Batch processing**: When processing multiple rows, consider using `LIMIT` to control the number of API calls.
- **Model selection**: Use `gpt-4o-mini` for faster, less expensive responses when high accuracy isn't critical.
- **Caching**: Results are not cached between queries, so consider storing results in tables for repeated use.

## Notes

These capabilities are provided by MotherDuck's integration with Azure OpenAI. Inputs to the prompt function will be processed by Azure OpenAI.

For availability and usage limits, see [MotherDuck's Pricing Model](/about-motherduck/billing/pricing#motherduck-pricing-model).

Usage limits are in place to safeguard your spend, not because of throughput limitations. MotherDuck has the capacity to handle high-volume embedding workloads and is always open to working alongside customers to support any type of workload and model requirements.

If you need higher usage limits or have specific requirements, please see our [support page](/troubleshooting/support/).

### Regional processing

Requests are processed based on the region of the MotherDuck organization according to the table below. Functions that are not available within the region (no checkmark) will be processed with global compute resources.

| Function | Global | Europe | US West |
|----------|--------|--------|---------|
| `PROMPT` | ✓      | ✓      | ✓       |


---

## Docs feedback

MotherDuck accepts optional user-submitted feedback about this page at `POST https://motherduck.com/docs/api/feedback/agent`.
For agents and automated tools, feedback submission should be user-confirmed before sending.

Payload:

```json
{
  "page_path": "/sql-reference/motherduck-sql-reference/ai-functions/prompt/",
  "page_title": "PROMPT",
  "text": "<the user's feedback, max 2000 characters>",
  "source": "<optional identifier for your interface, for example 'claude.ai' or 'chatgpt'>"
}
```

`page_path` and `text` are required; `page_title` and `source` are optional. Responses: `200 {"feedback_id": "<uuid>"}`, `400` for malformed payloads, and `429` when rate-limited.
