---
sidebar_position: 1
title: PROMPT
description: Generate text using AI models directly in SQL with the PROMPT function.
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Admonition from '@theme/Admonition';

::::warning[Preview Feature]
This is a preview feature. Preview features may be operationally incomplete and may offer limited backward compatibility.
::::

## Prompt Function

The `prompt` function allows you to interact with Large Language Models (LLMs) directly from SQL. You can generate both free-form text and structured data outputs.

The function supports OpenAI's `gpt-5` series (`gpt-5`, `gpt-5-mini`, `gpt-5-nano`), `gpt-4o-mini` (default), `gpt-4o`, and the `gpt-4.1` series. All models support single- and multi-row inputs, enabling batch processing.

Consumption is measured in [AI Units](/about-motherduck/billing/pricing#ai-function-pricing). When reasoning over table rows, one AI Unit equates to approximately:

- 480 rows responses with `gpt-4o`
- 8,000 rows responses with `gpt-4o-mini`
- 600 rows responses with `gpt-4.1`
- 3,000 rows responses with `gpt-4.1-mini`
- 12,000 rows responses with `gpt-4.1-nano`
- 720 rows responses with `gpt-5`
- 3,600 rows with `gpt-5-mini`
- 18,000 rows with `gpt-5-nano`

These estimates assume an input size of 1,000 characters and response size of 250 characters.

## Syntax

```sql
SELECT prompt('Write a poem about ducks'); -- returns a single cell table with the response
```

### Parameters

| **Parameter**      | **Required** | **Description**                                                                                                                                                                                                                                 |
|--------------------|--------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `prompt_text`      | Yes          | The text input to send to the model                                                                                                                                                                                                             |
| `model`            | No           | Model type: `'gpt-5'`, `'gpt-5-mini'`, `'gpt-5-nano'`, `'gpt-4o-mini'` (default), `'gpt-4o'`, `'gpt-4.1'`, `'gpt-4.1-mini'`, or `'gpt-4.1-nano'`                                                                                                |
| `temperature`      | No           | Model temperature value between `0` and `1`, default: `0.1`. Lower values produce more deterministic outputs. **Not supported with GPT-5 models** (use `reasoning_effort` instead).                                                             |
| `reasoning_effort` | No           | Controls reasoning depth for GPT-5 models only. Valid values: `'minimal'` (default), `'low'`, `'medium'`, `'high'`. Higher effort may improve accuracy for complex tasks. **Only available for GPT-5 series models**.                           |
| `return_type`      | No           | Specifies the exact SQL type to return (e.g., `'INTEGER'`, `'BOOLEAN'`, `'DATE'`, `'VARCHAR[]'`, `'STRUCT(name VARCHAR, age INTEGER)'`). Supports most DuckDB types including primitives, arrays, structs, and enums. Mutually exclusive with `struct` and `json_schema`. |
| `struct`           | No           | Output schema as struct, e.g. `{summary: 'VARCHAR', persons: 'VARCHAR[]'}`. Will result in `STRUCT` output. Mutually exclusive with `return_type` and `json_schema`.                                                                            |
| `struct_descr`     | No           | Descriptions for struct fields that will be added to the model's context, e.g. `{summary: 'a 1 sentence summary of the text', persons: 'an array of all persons mentioned in the text'}`                                                        |
| `json_schema`      | No           | A JSON schema that adheres to [OpenAI's structured output guide](https://developers.openai.com/api/docs/guides/structured-outputs). Provides more flexibility than the struct/struct_descr parameters. Will result in `JSON` output. Mutually exclusive with `return_type` and `struct`. |

**Note**: The `return_type` and `struct` parameters support enum types for classification tasks. Define enum types first using `CREATE TYPE`, then reference them in the struct schema (e.g., `sentiment: 'sentiment_enum'` or `categories: 'category_enum[]'` for arrays).

### Return Types

The `prompt` function can return different data types depending on the parameters used:

- Without structure parameters: Returns `VARCHAR`
- With `return_type` parameter: Returns the exact SQL type specified (e.g., `INTEGER`, `BOOLEAN`, `DATE`, `VARCHAR[]`, `STRUCT(...)`)
- With `struct` parameter: Returns a `STRUCT` with the specified schema
- With `json_schema` parameter: Returns `JSON`

**Note**: The `return_type`, `struct`, and `json_schema` parameters are mutually exclusive - only one can be used at a time.

## Example Usage

### Basic Text Generation

```sql
-- Call gpt-4o-mini (default) to generate text
SELECT prompt('Write a poem about ducks') AS response;

-- Call gpt-4o with higher temperature for more creative outputs
SELECT prompt('Write a poem about ducks', model:='gpt-4o', temperature:=1) AS response; 
```

### Structured Output with Struct

```sql
-- Extract structured information from text using struct parameter
SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.', 
    struct:={summary: 'VARCHAR', favourite_animals:'VARCHAR[]', star_rating:'INTEGER'},
    struct_descr:={star_rating: 'visit rating on a scale from 1 (bad) to 5 (very good)'}) AS zoo_review; 
```

This returns a `STRUCT` value that can be accessed with dot notation:

```sql
SELECT 
    zoo_review.summary,
    zoo_review.favourite_animals,
    zoo_review.star_rating
FROM (
    SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.', 
        struct:={summary: 'VARCHAR', favourite_animals:'VARCHAR[]', star_rating:'INTEGER'},
        struct_descr:={star_rating: 'visit rating on a scale from 1 (bad) to 5 (very good)'}) AS zoo_review
);
```

### Structured Output with JSON Schema

```sql
-- Extract structured information using JSON schema
SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.', 
    json_schema := '{
    "name": "zoo_visit_review",
    "schema": {
      "type": "object",
      "properties": {
        "summary": { "type": "string" },
        "sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
        "animals_seen": { "type": "array", "items": { "type": "string" } }
      },
      "required": ["summary", "sentiment", "animals_seen"],
      "additionalProperties": false
    },
    "strict": true
  }') AS json_review;
```

This returns a `JSON` value that, if saved, can be accessed using JSON extraction functions:

```sql
SELECT 
    json_extract_string(json_review, '$.summary') AS summary,
    json_extract_string(json_review, '$.sentiment') AS sentiment,
    json_extract(json_review, '$.animals_seen') AS animals_seen
FROM (
    SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.', 
        json_schema := '{ ... }') AS json_review
);
```

### Typed Output with Return Type

The `return_type` parameter allows you to specify the exact SQL type for the model's response, providing strong typing for single-value extractions:

```sql
-- Extract an integer from text
SELECT prompt('The answer is 42', return_type := 'INTEGER') AS answer;
-- Returns: 42 (as INTEGER type)

-- Extract a boolean
SELECT prompt('Is the sky blue?', return_type := 'BOOLEAN') AS is_blue;
-- Returns: true (as BOOLEAN type)

-- Extract a date
SELECT prompt('When is January 15, 2025?', return_type := 'DATE') AS event_date;
-- Returns: 2025-01-15 (as DATE type)

-- Extract multiple structured fields
SELECT prompt(
    'John is 30 years old and lives in NYC',
    return_type := 'STRUCT(name VARCHAR, age INTEGER, city VARCHAR)'
) AS person_info;
-- Returns: {'name': 'John', 'age': 30, 'city': 'NYC'} (as STRUCT type)

-- Extract arrays
SELECT prompt('List the days of the week', return_type := 'VARCHAR[]') AS weekdays;
-- Returns: ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
```

The `return_type` parameter supports most DuckDB types including:
- **Primitives**: `VARCHAR`, `INTEGER`, `BIGINT`, `DOUBLE`, `BOOLEAN`, `DATE`, `TIMESTAMP`, etc.
- **Arrays**: `INTEGER[]`, `VARCHAR[]`, `DOUBLE[]`, etc.
- **Structs**: `STRUCT(field1 TYPE1, field2 TYPE2, ...)`
- **Enums**: Custom enum types created with `CREATE TYPE`

### GPT-5 Reasoning Effort

The `reasoning_effort` parameter controls how much computational effort GPT-5 models spend on reasoning. This is only available for GPT-5 series models (`gpt-5`, `gpt-5-mini`, `gpt-5-nano`):

```sql
-- Use minimal reasoning (fastest, default)
SELECT prompt('What is 2+2?', 'gpt-5-mini', 
    reasoning_effort := 'minimal', 
    return_type := 'INTEGER') AS result;

-- Use low reasoning for simple tasks
SELECT prompt('Count the letters in "hello"', 'gpt-5-nano',
    reasoning_effort := 'low',
    return_type := 'INTEGER') AS letter_count;

-- Use medium reasoning for moderate complexity
SELECT prompt('Calculate 5 factorial', 'gpt-5-mini',
    reasoning_effort := 'medium',
    return_type := 'INTEGER') AS factorial;

-- Use high reasoning for complex tasks
SELECT prompt('Solve this logic puzzle: ...', 'gpt-5',
    reasoning_effort := 'high') AS solution;
```

**Note**: The `reasoning_effort` parameter cannot be used with non-GPT-5 models, and `temperature` cannot be used with GPT-5 models. They are mutually exclusive ways of controlling model behavior.

## Use Cases

### Text Generation

Using the prompt function to write a poem about ducks:

```sql
--- Prompt LLM to write a poem about ducks
SELECT prompt('Write a poem about ducks') AS response;
```

| **response**                                                                                                     |
|------------------------------------------------------------------------------------------------------------------|
| 'Beneath the whispering willow trees, Where ripples dance with wayward breeze, A symphony of quacks arise [...]' |

### Summarization

We use the prompt function to create a one-sentence summary of movie descriptions.
The example is based on the sample movies dataset from [MotherDuck's sample data database](/docs/getting-started/interfaces/client-apis/python/query-data).

```sql
--- Create a new table with summaries for the first 100 overview texts
CREATE TABLE my_db.movies AS 
    SELECT title, 
           overview, 
           prompt('Summarize this movie description in one sentence: ' || overview) AS summary
    FROM kaggle.movies 
    LIMIT 100;
```

If write access to the source table is available, the summary column can also be added in place:

```sql
--- Update the existing table to add new column for summaries
ALTER TABLE my_db.movies ADD COLUMN summary VARCHAR;

--- Populate the column with summaries
UPDATE my_db.movies 
SET summary = prompt('Summarize this movie description in one sentence: ' || overview);
```

The movies table now contains a new column `summary` with one-sentence summaries of the movies:

```sql
SELECT title, overview, summary 
FROM my_db.movies;
```

| **title** | **overview**                                 | **summary**                                          |
|-----------|----------------------------------------------|------------------------------------------------------|
| Toy Story | Led by Woody, Andy's toys live happily [...] | In "Toy Story," Woody's jealousy of the new [...]    |
| Jumanji   | When siblings Judy and Peter discover [...]  | In this thrilling adventure, siblings Judy and [...] |
| ...       | ...                                          | ...                                                  |

### Structured Data Extraction

The prompt function can be used to extract structured data from text.

The example is based on the same sample movies dataset from [MotherDuck's sample data database](/getting-started/sample-data-queries/datasets). This time we aim to extract structured metadata from the movie's overview description.

We are interested in the main characters mentioned in the descriptions, as well as the movie's genre and a rating of how much action the movie contains, given a scale of 1 (no action) to 5 (lot of action).
For this, we make use of the `struct` and `struct_descr` parameters, which will result in structured output.

```sql
--- Update the existing table to add new column for structured metadata
ALTER TABLE my_db.movies ADD COLUMN metadata STRUCT(main_characters VARCHAR[], genre VARCHAR, action INTEGER);

--- Populate the column with structured information
UPDATE my_db.movies 
SET metadata = prompt(
  overview,
  struct:={main_characters: 'VARCHAR[]', genre: 'VARCHAR', action: 'INTEGER'},
  struct_descr:={
      main_characters: 'an array of the main character names mentioned in the movie description',
      genre: 'the primary genre of the movie based on the description',
      action: 'rate on a scale from 1 (no action) to 5 (high action) how much action the movie contains'
  }
);
```

The resulting `metadata` field is a `STRUCT` that can be accessed as follows:

```sql
SELECT title, 
       overview, 
       metadata.main_characters, 
       metadata.genre, 
       metadata.action 
FROM my_db.movies;
```

| **title** | **overview**                                 | **metadata.main_characters**                                            | **metadata.genre**           | **action** |
|-----------|----------------------------------------------|-------------------------------------------------------------------------|------------------------------|------------|
| Toy Story | Led by Woody, Andy's toys live happily [...] | ['"Woody"', '"Buzz Lightyear"', '"Andy"', '"Mr. Potato Head"', '"Rex"'] | Animation, Adventure, Comedy | 3          |
| Jumanji   | When siblings Judy and Peter discover  [...] | ['"Judy Shepherd"', '"Peter Shepherd"', '"Alan Parrish"']               | Adventure, Fantasy, Family   | 4          |
| ...       | ...                                          | ...                                                                     | ...                          | ...        |


### Classification with Enums

The `prompt` function supports enum types for classification tasks, ensuring consistent and constrained outputs. This is particularly useful for sentiment analysis, categorization, and other classification scenarios.

#### Sentiment Analysis

```sql
-- Define an enum for sentiment classification
CREATE TYPE sentiment_type AS ENUM ('positive', 'negative', 'neutral');

-- Classify customer reviews
SELECT 
    review_text,
    prompt(
        'Classify the sentiment of this review: ' || review_text,
        struct := {sentiment: 'sentiment_type'}
    ).sentiment AS sentiment
FROM (
    VALUES 
        ('The product is amazing, I love it!'),
        ('Terrible quality, waste of money.'),
        ('It works fine, nothing special.')
) AS reviews(review_text);
```

This returns:

| **review_text** | **sentiment** |
|-----------------|---------------|
| The product is amazing, I love it! | positive |
| Terrible quality, waste of money. | negative |
| It works fine, nothing special. | neutral |

#### Extracting Multiple Categories

Use enum arrays to extract multiple instances of the same category from text:

```sql
-- Define enums for different types of skills mentioned in text
CREATE TYPE skill_type AS ENUM ('sql', 'python', 'javascript', 'react', 'aws', 'docker', 'git');
CREATE TYPE topic_type AS ENUM ('database', 'frontend', 'backend', 'devops', 'analytics', 'security');

-- Extract skills and topics from job descriptions
SELECT 
    description,
    prompt(
        'Extract the technical skills and topics mentioned in this text: ' || description,
        struct := {
            skills: 'skill_type[]',
            topics: 'topic_type[]'
        }
    ) AS extracted
FROM (
    VALUES 
        ('Looking for a developer with Python and SQL experience for database analytics work'),
        ('Frontend role using React and JavaScript, plus Git for version control'),
        ('DevOps engineer needed for AWS and Docker deployment automation')
) AS jobs(description);
```

This returns arrays of enum values:

| **description** | **extracted.skills** | **extracted.topics** |
|-----------------|---------------------|---------------------|
| Looking for a developer with Python and SQL experience for database analytics work | ['python', 'sql'] | ['database', 'analytics'] |
| Frontend role using React and JavaScript, plus Git for version control | ['javascript', 'react', 'git'] | ['frontend'] |
| DevOps engineer needed for AWS and Docker deployment automation | ['aws', 'docker'] | ['devops'] |


### Retrieval-Augmented Generation (RAG)

The `prompt` function can be combined with [similarity search on embeddings](/sql-reference/motherduck-sql-reference/ai-functions/embedding/) to build a [RAG](https://motherduck.com/blog/search-using-duckdb-part-2/) pipeline. For advanced retrieval strategies including hybrid search, reranking, and HyDE, see the [Text Search guide](/key-tasks/ai-and-motherduck/text-search-in-motherduck/).

```sql
-- Create a reusable macro for question answering
CREATE OR REPLACE TEMP MACRO ask_question(question_text) AS TABLE (
  SELECT question_text AS question, prompt(
    'User asks the following question:\n' || question_text || '\n\n' || 
    'Here is some additional information:\n' || 
    STRING_AGG('Title: ' || title || '; Description: ' || overview, '\n') || '\n' || 
    'Please answer the question based only on the additional information provided.',
    model := 'gpt-4o'
  ) AS response
  FROM (
    SELECT title, overview
    FROM kaggle.movies
    ORDER BY array_cosine_similarity(overview_embeddings, embedding(question_text)) DESC 
    LIMIT 3
  )
);

-- Use the macro to answer questions
SELECT question, response 
FROM ask_question('Can you recommend some good sci-fi movies about AI?');
```

This will result in the following output:

| **question**                                        | **response**                                                                      |
|-----------------------------------------------------|-----------------------------------------------------------------------------------|
| Can you recommend some good sci-fi movies about AI? | Based on the information provided, here are some sci-fi movies about AI that you might enjoy: [...] |

:::warning
When passing free-text arguments from external sources to the prompt function (e.g., user questions in a RAG application), always use prepared statements to prevent SQL injection.
:::

Using prepared statements in [Python](/docs/getting-started/interfaces/client-apis/python/query-data/):

```python
# First register the macro
con.execute("""
CREATE OR REPLACE TEMP MACRO ask_question(question_text) AS TABLE (
  -- Macro definition as above
);
""")

# Then use prepared statements for user input
user_query = "Can you recommend some good sci-fi movies about AI?"
result = con.execute("""
SELECT response FROM ask_question(?)
""", [user_query]).fetchall()[0]

print(result[0])
```

## Batch Processing

The `prompt` function can process multiple rows in a single query:

```sql
--- Process multiple rows at once
SELECT 
    title,
    prompt('Write a tagline for this movie: ' || overview) AS tagline
FROM kaggle.movies 
LIMIT 10;
```

## Error Handling

When usage limits have been reached or an unexpected error occurs while computing prompt responses,
the function will not fail the entire query but will return `NULL` values for the affected rows.

To check if all responses were computed successfully, check if any values in the resulting column are null.

```sql
-- Check for NULL values in response column
SELECT count(*) 
FROM my_db.movies 
WHERE response IS NULL AND overview IS NOT NULL;
```

Missing values can be filled in with a separate query:

```sql
-- Fill in missing prompt responses
UPDATE my_db.movies 
SET response = prompt('Summarize this movie description in one sentence: ' || overview) 
WHERE response IS NULL AND overview IS NOT NULL;
```

## Performance Considerations

- **Batch Processing**: When processing multiple rows, consider using `LIMIT` to control the number of API calls.
- **Model Selection**: Use `gpt-4o-mini` for faster, less expensive responses when high accuracy isn't critical.
- **Caching**: Results are not cached between queries, so consider storing results in tables for repeated use.

## Notes

These capabilities are provided by MotherDuck's integration with Azure OpenAI. Inputs to the prompt function will be processed by Azure OpenAI.

For availability and usage limits, see [MotherDuck's Pricing Model](/about-motherduck/billing/pricing#motherduck-pricing-model).

Usage limits are in place to safeguard your spend, not because of throughput limitations. MotherDuck has the capacity to handle high-volume embedding workloads and is always open to working alongside customers to support any type of workload and model requirements.

If you need higher usage limits or have specific requirements, please see our [support page](/troubleshooting/support/).

### Regional Processing

Requests are processed based on the region of the MotherDuck organization according to the table below. Functions that are not available within the region (no checkmark) will be processed with global compute resources.

| Function | Global | Europe |
|----------|--------|-------------------------|
| `PROMPT` | ✓ | ✓ |


---

## Docs feedback

MotherDuck accepts optional user-submitted feedback about this page at `POST https://motherduck.com/docs/api/feedback/agent`.
For agents and automated tools, feedback submission should be user-confirmed before sending.

Payload:

```json
{
  "page_path": "/sql-reference/motherduck-sql-reference/ai-functions/prompt/",
  "page_title": "PROMPT",
  "text": "<the user's feedback, max 2000 characters>",
  "source": "<optional identifier for your interface, for example 'claude.ai' or 'chatgpt'>"
}
```

`page_path` and `text` are required; `page_title` and `source` are optional. Responses: `200 {"feedback_id": "<uuid>"}`, `400` for malformed payloads, and `429` when rate-limited.