
2023/12/07 - Mehdi Ouazza
The Future of BI: Exploring the Impact of BI-as-Code Tools with DuckDB
The Future of BI: Exploring the Impact of BI-as-Code Tools with DuckDB
- 9 min read
BY- 9 min read
BYToday we’re releasing FixIt, MotherDuck’s first AI-powered UI feature. FixIt helps you resolve common SQL errors by offering fixes in-line. You can try it out starting now in our web app. See it in action:
Why did we focus on fixing errors? If you’re anything like us, your workflow for writing analytical SQL probably looks something like this:
FixIt collapses all those tedious error-fixing steps into one. It’s like watching a SQL expert speed-run all your fixes for you. See it in action here:
FixIt uses a large language model (LLM) to generate suggestions; it feeds the error, the query, and additional context into an LLM to generate a new line that fixes the query. Here’s how it works in the UI:
Much like our other AI features, FixIt is powered by a new function, fix_single_line, available to all MotherDuck customers and users of our extension. This table function is a great option for customers and partners building their own editor-driven interfaces on top of DuckDB.
For MotherDuck UI users, we think FixIt is special for three reasons:
We’ve all seen incredibly impressive demos of LLMs writing SQL from scratch, but building reliable text-to-SQL systems requires moving beyond fragile one-shot prompting. FixIt, by contrast, takes a more humble approach. Rather than attempt to fix every possible error in your query in one go, FixIt will only fix whatever line it thinks will resolve your SQL error.
We think this unreasonable simplicity also makes it unreasonably effective. In truth, the most common SQL errors tend to have simple, one-line fixes. Even if the error requires multiple fixes across several lines, FixIt will still suggest fixes one line at a time, often iterating itself into a functional query. These assumptions enable FixIt to effortlessly correct many common SQL errors like:
To complement FixIt’s simplicity, we designed it to be as non-intrusive as possible. If you type anywhere in the editor while a fix is either being generated or shown, we remove the suggestion. So you can ignore the feature entirely if it isn’t helping. Additionally, if you find that the suggestions are far off-base, you can turn off the feature in the MotherDuck UI under the settings panel.
Of course, FixIt does not do well in cases where your query is fundamentally wrong, and the SQL you’ve written doesn’t give it enough clues to iteratively fix its way to a solution. FixIt is a feature designed for people that more-or-less know enough SQL to make a mostly-coherent query. Think of it as “lane assist” on a car. You still need your hands on the wheel.
Because FixIt excels at simple one-line fixes, it only needs to generate one line of SQL at a time. This tends to be pretty fast! When a suggestion shows up quickly, it is more likely to match your natural working tempo. And when it matches your tempo, it is more likely to be integrated into your workflow.
We only figured this out after first prototyping the wrong approach – completely regenerating a new query from scratch – which resulted in three problems:
It felt like watching a human manually rewrite a new version of your query start-to-finish while trying to keep the 95% of the old query that was working. Impressive to see in isolation, but not a great user experience in practice.
We then tried the next-obvious approach – generating only the line number and the fixed line for the given error. In our tests, we found that this reduced the total query time down to 1-3 sec for most cases and gave the LLM fewer opportunities to unhelpfully steer the unbroken parts of a query into a worse direction.
A latency difference of 1-3 seconds vs. 10-20 seconds is profound. Aside from dealing with errors, waiting for analytical queries to finish is one of the most cognitively costly parts of ad hoc data work. DuckDB already drastically reduces this cost when queries are correct; we want FixIt to also feel just as effortless when queries have errors.
FixIt’s simplicity gives it another big advantage – it is a lot easier for users to validate changes to a single line compared to a completely rewritten query.
No matter what role an LLM has in generating and editing a query, people still need to understand how their query works. By providing small suggestions, users are much more likely to both comprehend and accept them. We think of it like we do Github pull requests; a smaller, easy-to-follow change is much easier to verify by the reviewer.
Given that we went from ideation to release in a month, we decided to try using traditional prompt engineering to reduce the technical risk and focus on the user experience. There’s simply no faster way to prototype than to use OpenAI’s GPT-4 with its powerful prompting capabilities. We were happy to discover that our approach was quite simple, highly effective, and low-latency enough to be an actual product feature.
But of course we did spend a lot of time iterating on the prompt itself. Here are some high-level insights:
We evaluated the fixing quality with different OpenAI models on our test corpus with randomly corrupted queries. While GPT-3.5 is the lowest-cost option, it also provided noticeably lower-quality results than GPT-4.
Evaluating different OpenAI models:
| Model | Version | Fix Success Rate* | Median Response Time per Fix* |
|---|---|---|---|
| GPT-3.5-turbo | 1106 | 62 % | 0.75s |
| GPT-4 | 1106-preview (turbo) | 78 % | 1.67s |
| GPT-4 | 0613 | 85 % | 1.74s |
* Fix Success Rate: Percentage of randomly corrupted queries that execute successfully after the suggested fix. A set of approx. 3000, mostly human-written, queries is corrupted by dropping a sequence of 1-4 characters from a random position in the query string. We filter out queries that do not result in any parser, binder or execution error.
* Median Response Time: Using Azure OpenAI Service Endpoints
In the future, we aim to improve both inference time and quality while reducing cost per fix:
We think FixIt strikes a nice balance between being actually useful for everyday SQL work, and fairly novel low-latency experience. Try it out and give us feedback - on Slack or via email!

2023/12/07 - Mehdi Ouazza
The Future of BI: Exploring the Impact of BI-as-Code Tools with DuckDB
2023/12/28 - Marcos Ortiz
DuckDB news: WASM extensions load directly in browsers. Query 32GB anti-money laundering datasets locally. BI-as-code tools Rill and Evidence gain traction. DuckCon #4 announced.