PROMPT
This is a preview feature. Preview features may be operationally incomplete and may offer limited backward compatibility.
Prompt Function

The `prompt` function allows you to interact with Large Language Models (LLMs) directly from SQL. You can generate both free-form text and structured data outputs.

The function uses OpenAI's models: `gpt-4o-mini` (default) or `gpt-4o`. Both models support single- and multi-row inputs, enabling batch processing.
Consumption is measured in AI Units. One AI Unit equates to approximately:

- 4,000 prompt responses with `gpt-4o-mini`
- 250 prompt responses with `gpt-4o`

These estimates assume an input size of 1,000 characters and a response size of 250 characters.
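Under those stated averages, the AI-Unit cost of a batch job can be estimated with simple arithmetic. A minimal Python sketch (the function name is ours for illustration, not part of the product, and the figures are estimates, not billing guarantees):

```python
# Documented averages: responses per AI Unit, assuming ~1,000-char
# inputs and ~250-char responses. Estimates only, not billing guarantees.
RESPONSES_PER_AI_UNIT = {"gpt-4o-mini": 4000, "gpt-4o": 250}

def estimated_ai_units(num_rows: int, model: str = "gpt-4o-mini") -> float:
    """Rough AI Units consumed by num_rows prompt() calls."""
    return num_rows / RESPONSES_PER_AI_UNIT[model]

print(estimated_ai_units(100))            # 100 rows on gpt-4o-mini -> 0.025
print(estimated_ai_units(100, "gpt-4o"))  # same batch on gpt-4o    -> 0.4
```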
Syntax

SELECT prompt('Write a poem about ducks'); -- returns a single-cell table with the response
Parameters

| Parameter | Required | Description |
|---|---|---|
| `prompt_text` | Yes | The text input to send to the model. |
| `model` | No | Model type: `'gpt-4o-mini'` (default) or `'gpt-4o-2024-08-06'` (alias: `'gpt-4o'`). |
| `temperature` | No | Model temperature between `0` and `1`, default `0.1`. Lower values produce more deterministic outputs. |
| `struct` | No | Output schema as a struct, e.g. `{summary: 'VARCHAR', persons: 'VARCHAR[]'}`. Results in `STRUCT` output. |
| `struct_descr` | No | Descriptions for struct fields that will be added to the model's context, e.g. `{summary: 'a 1 sentence summary of the text', persons: 'an array of all persons mentioned in the text'}`. |
| `json_schema` | No | A JSON schema that adheres to OpenAI's structured output guide. Provides more flexibility than the `struct`/`struct_descr` parameters. Results in `JSON` output. |
Return Types

The `prompt` function can return different data types depending on the parameters used:

- Without structure parameters: returns `VARCHAR`
- With the `struct` parameter: returns a `STRUCT` with the specified schema
- With the `json_schema` parameter: returns `JSON`
Examples

Basic Text Generation
-- Call gpt-4o-mini (default) to generate text
SELECT prompt('Write a poem about ducks') AS response;
-- Call gpt-4o with higher temperature for more creative outputs
SELECT prompt('Write a poem about ducks', model:='gpt-4o', temperature:=1) AS response;
Structured Output with Struct
-- Extract structured information from text using struct parameter
SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.',
struct:={summary: 'VARCHAR', favourite_animals:'VARCHAR[]', star_rating:'INTEGER'},
struct_descr:={star_rating: 'visit rating on a scale from 1 (bad) to 5 (very good)'}) AS zoo_review;
This returns a `STRUCT` value that can be accessed with dot notation:
SELECT
zoo_review.summary,
zoo_review.favourite_animals,
zoo_review.star_rating
FROM (
SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.',
struct:={summary: 'VARCHAR', favourite_animals:'VARCHAR[]', star_rating:'INTEGER'},
struct_descr:={star_rating: 'visit rating on a scale from 1 (bad) to 5 (very good)'}) AS zoo_review
);
Structured Output with JSON Schema
-- Extract structured information using JSON schema
SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.',
json_schema := '{
"name": "zoo_visit_review",
"schema": {
"type": "object",
"properties": {
"summary": { "type": "string" },
"sentiment": { "type": "string", "enum": ["positive", "negative", "neutral"] },
"animals_seen": { "type": "array", "items": { "type": "string" } }
},
"required": ["summary", "sentiment", "animals_seen"],
"additionalProperties": false
},
"strict": true
}') AS json_review;
This returns a `JSON` value that, once stored, can be accessed using JSON extraction functions:
SELECT
json_extract_string(json_review, '$.summary') AS summary,
json_extract_string(json_review, '$.sentiment') AS sentiment,
json_extract(json_review, '$.animals_seen') AS animals_seen
FROM (
SELECT prompt('My zoo visit was amazing, I saw elephants, tigers, and penguins. The staff was friendly.',
json_schema := '{ ... }') AS json_review
);
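The JSON path expressions above behave like ordinary key lookups. A small Python sketch on an illustrative response (the literal value below is made up; real model output will differ) shows the equivalence:

```python
import json

# Illustrative value of the shape the json_schema example returns;
# the actual model output will vary.
json_review = '''{
  "summary": "A very positive zoo visit with friendly staff.",
  "sentiment": "positive",
  "animals_seen": ["elephants", "tigers", "penguins"]
}'''

doc = json.loads(json_review)
# Mirrors json_extract_string(json_review, '$.summary'), etc.
print(doc["summary"])
print(doc["sentiment"])
print(doc["animals_seen"])
```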
Use Cases

Text Generation

Using the prompt function to write a poem about ducks:

-- Prompt the LLM to write a poem about ducks
SELECT prompt('Write a poem about ducks') AS response;

| response |
|---|
| 'Beneath the whispering willow trees, Where ripples dance with wayward breeze, A symphony of quacks arise [...]' |
Summarization

We use the prompt function to create a one-sentence summary of movie descriptions. The example is based on the sample movies dataset from MotherDuck's sample data database.

-- Create a new table with summaries for the first 100 overview texts
CREATE TABLE my_db.movies AS
SELECT title,
overview,
prompt('Summarize this movie description in one sentence: ' || overview) AS summary
FROM kaggle.movies
LIMIT 100;
If write access to the source table is available, the summary column can also be added in place:
-- Update the existing table to add a new column for summaries
ALTER TABLE my_db.movies ADD COLUMN summary VARCHAR;
-- Populate the column with summaries
UPDATE my_db.movies
SET summary = prompt('Summarize this movie description in one sentence: ' || overview);
The movies table now contains a new column `summary` with one-sentence summaries of the movies:
SELECT title, overview, summary
FROM my_db.movies;
| title | overview | summary |
|---|---|---|
| Toy Story | Led by Woody, Andy's toys live happily [...] | In "Toy Story," Woody's jealousy of the new [...] |
| Jumanji | When siblings Judy and Peter discover [...] | In this thrilling adventure, siblings Judy and [...] |
| ... | ... | ... |
Structured Data Extraction

The prompt function can be used to extract structured data from text.

The example is based on the same sample movies dataset from MotherDuck's sample data database. This time we aim to extract structured metadata from each movie's overview description. We are interested in the main characters mentioned in the description, the movie's genre, and a rating of how much action the movie contains, on a scale from 1 (no action) to 5 (a lot of action).

For this, we make use of the `struct` and `struct_descr` parameters, which produce structured output.
-- Update the existing table to add a new column for structured metadata
ALTER TABLE my_db.movies ADD COLUMN metadata STRUCT(main_characters VARCHAR[], genre VARCHAR, action INTEGER);
-- Populate the column with structured information
UPDATE my_db.movies
SET metadata = prompt(overview,
struct:={main_characters: 'VARCHAR[]', genre: 'VARCHAR', action: 'INTEGER'},
struct_descr:={
main_characters: 'an array of the main character names mentioned in the movie description',
genre: 'the primary genre of the movie based on the description',
action: 'rate on a scale from 1 (no action) to 5 (high action) how much action the movie contains'
});
The resulting `metadata` field is a `STRUCT` that can be accessed as follows:
SELECT title,
overview,
metadata.main_characters,
metadata.genre,
metadata.action
FROM my_db.movies;
| title | overview | metadata.main_characters | metadata.genre | metadata.action |
|---|---|---|---|---|
| Toy Story | Led by Woody, Andy's toys live happily [...] | ['"Woody"', '"Buzz Lightyear"', '"Andy"', '"Mr. Potato Head"', '"Rex"'] | Animation, Adventure, Comedy | 3 |
| Jumanji | When siblings Judy and Peter discover [...] | ['"Judy Shepherd"', '"Peter Shepherd"', '"Alan Parrish"'] | Adventure, Fantasy, Family | 4 |
| ... | ... | ... | ... | ... |
Batch Processing

The `prompt` function can process multiple rows in a single query:

-- Process multiple rows at once
SELECT
title,
prompt('Write a tagline for this movie: ' || overview) AS tagline
FROM kaggle.movies
LIMIT 10;
Retrieval-Augmented Generation (RAG)

The `prompt` function can be combined with similarity search on embeddings to build a RAG pipeline:
-- Create a reusable macro for question answering
CREATE OR REPLACE TEMP MACRO ask_question(question_text) AS TABLE (
SELECT question_text AS question, prompt(
'User asks the following question:\n' || question_text || '\n\n' ||
'Here is some additional information:\n' ||
STRING_AGG('Title: ' || title || '; Description: ' || overview, '\n') || '\n' ||
'Please answer the question based only on the additional information provided.',
model := 'gpt-4o'
) AS response
FROM (
SELECT title, overview
FROM kaggle.movies
ORDER BY array_cosine_similarity(overview_embeddings, embedding(question_text)) DESC
LIMIT 3
)
);
-- Use the macro to answer questions
SELECT question, response
FROM ask_question('Can you recommend some good sci-fi movies about AI?');
This will result in the following output:

| question | response |
|---|---|
| Can you recommend some good sci-fi movies about AI? | Based on the information provided, here are some sci-fi movies about AI that you might enjoy: [...] |
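The `ORDER BY array_cosine_similarity(...) DESC LIMIT 3` step ranks overviews by how close their embeddings are to the question's embedding. Conceptually, it computes this quantity (toy three-dimensional vectors for illustration; real `embedding()` output has far more dimensions):

```python
import math

def cosine_similarity(a, b):
    # Same quantity array_cosine_similarity computes: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy embeddings, not real embedding() output
question = [0.9, 0.1, 0.0]
overviews = {
    "AI thriller": [0.8, 0.2, 0.1],
    "Romantic comedy": [0.1, 0.9, 0.3],
}

# Highest similarity first, like ORDER BY ... DESC
ranked = sorted(overviews,
                key=lambda t: cosine_similarity(overviews[t], question),
                reverse=True)
print(ranked)  # the AI thriller ranks first for an AI question
```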
When passing free-text arguments from external sources to the prompt function (e.g., user questions in a RAG application), always use prepared statements to prevent SQL injection.
Using prepared statements in Python:
# First register the macro
con.execute("""
CREATE OR REPLACE TEMP MACRO ask_question(question_text) AS TABLE (
-- Macro definition as above
);
""")
# Then use prepared statements for user input
user_query = "Can you recommend some good sci-fi movies about AI?"
result = con.execute("""
SELECT response FROM ask_question(?)
""", [user_query]).fetchone()
print(result[0])
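To see why parameter binding matters, here is a sketch using stdlib `sqlite3` (chosen only because it ships with Python; DuckDB's `con.execute(query, [params])` binds the same way): the hostile string is stored as plain data and never executed as SQL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE questions (text TEXT)")

# Hostile input that would break a naively concatenated statement
user_query = "sci-fi'); DROP TABLE questions; --"

# With a bound parameter, the value stays data, never SQL
con.execute("INSERT INTO questions VALUES (?)", [user_query])

stored = con.execute("SELECT text FROM questions").fetchone()[0]
print(stored == user_query)  # True: stored verbatim, no injection occurred
```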
Error Handling

When usage limits have been reached or an unexpected error occurs while computing prompt responses, the function will not fail the entire query; instead, it returns `NULL` for the affected rows. To verify that all responses were computed successfully, check the resulting column for `NULL` values.
-- Check for NULL values in the summary column
SELECT count(*)
FROM my_db.movies
WHERE summary IS NULL AND overview IS NOT NULL;
Missing values can be filled in with a separate query:
-- Fill in missing prompt responses
UPDATE my_db.movies
SET summary = prompt('Summarize this movie description in one sentence: ' || overview)
WHERE summary IS NULL AND overview IS NOT NULL;
Performance Considerations

- Batch Processing: When processing multiple rows, consider using `LIMIT` to control the number of API calls.
- Model Selection: Use `gpt-4o-mini` for faster, less expensive responses when high accuracy isn't critical.
- Caching: Results are not cached between queries, so consider storing results in tables for repeated use.
Notes
These capabilities are provided by MotherDuck's integration with OpenAI. Inputs to the prompt function will be processed by OpenAI.
For availability and usage limits, see MotherDuck's Pricing Model. If higher usage limits are needed, please reach out directly to the Slack support channel or email support@motherduck.com.