Consulting the Oracle: Claude on the Future of Data
2026/04/02 - 13 min read
"Once, men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them." – Frank Herbert, Dune
What does AI think its own impact will be on the data and analytics industry? Last week, I wrote about my predictions on how AI will change the Modern Data Stack, but this time I thought I would let an LLM share its own dastardly plans for world domination. For this exercise, I used Claude, which is probably the least bent on enslaving humanity of the major LLM providers.
For last week's post, I started off by describing the constraints: what isn't going to change, what the biggest drivers of change are, and what the world looks like right now. From that frame of reference, making predictions was just a matter of iterating on the change drivers.
In order to figure out what Claude thinks, I fed it the same set of priors that I had used for my post and asked it to come up with some concrete predictions. I figured as long as the priors were reasonable, then this would ground Claude in the same starting point. Of course, this does bias the output a bit; Claude is going to tell me what it thinks I want to hear. If you prefer to try this on your own priors, you can repeat the process with your thoughts.
So, what does Claude think its own impact is going to be? I turned on "salty" mode so that Claude would tell me what it really thought. These results are lightly edited by me for length and clarity, and I've added my own comments, marked with [JT], in between Claude's prognostications.
Claude on the Modern Data Stack
Let's start with Claude's predictions on the impacts of AI on the Modern Data Stack vendor landscape. We can divide it up into three parts: ETL vendors (ingesting, transforming, and preparing data), business intelligence (BI) vendors (visualizing data), and Data Warehouse vendors (running queries).
On BI Vendors
BI tools become legacy infrastructure. LLMs already draw better charts than Tableau from a simple prompt. The canonical loop — ask a question, write SQL, chart the result, iterate — is exactly what agents do. Dashboard products will survive the way mainframes survived: still running, still billing, increasingly irrelevant to new work. The "drag-and-drop dashboard" becomes a curiosity, like a fax machine with a particularly nice interface.
[JT] Ouch. I think that Claude is directionally right, but I also think that BI vendors will bifurcate into those that can adapt and those that stick to their old models. BI has always been as much about context and standardization as it has been about visualization. Those who lean into the former will likely do well. Those who do not can still have a long shelf life with slower-moving enterprises.
On ETL Vendors
ETL/ELT vendors face existential pressure. They have a window of maybe 18 months before a competent team can say, "Claude, build me an ingestion pipeline from Salesforce to my warehouse, with error handling and backfill logic" and get something production-worthy. The connectors themselves become commodity. The survivors pivot to operational reliability — knowing when a pipeline breaks, why, and how to fix it — because that's the part agents can't yet own end-to-end.
[JT] I would bet that the time window before most pipelines can be prompted into existence is measured in weeks and not months. The prediction that ETL vendors pivot to operational reliability seems less likely to me. They have some advantages in that they already have access to all of their customers' data sources and run their existing pipelines, so my bet is that they build agents themselves, or expand into running their own compute.
On Data Warehouse Vendors
The warehouse becomes a compute utility. Open table formats (Iceberg, Delta, DuckLake) strip the warehouse of its storage lock-in. But analytics is still computationally intensive — someone has to run the queries. The warehouse's value shifts from "we store and query your data" to "we run your analytical compute efficiently." Warehouses compete on price-performance and elasticity, not on being the platform of record.
[JT] This feels directionally right; it follows pretty directly from the rise of Open Data Formats. The value of a data warehouse will likely be pure infrastructure, and the market will be more competitive. That will be beneficial for upstarts that have a price/performance advantage, but the lower margins may hurt. On the other hand, the Jevons paradox says that lower prices often lead to higher usage, and if you add in agents doing the querying, you could easily see increases in data warehouse revenue despite lower prices.
On The Modern Data Stack
The Modern Data Stack Collapses Into Three Pieces. Storage. Compute. Context. Everything else — the 47-vendor landscape diagram that gave data engineers nightmares — gets absorbed. Existing Modern Data Stack vendors face existential pressure within 24 months.
[JT] Dividing things up into storage, compute, and context makes sense, but that doesn't seem like the only reasonable outcome. Another would be just Compute and Agent, where compute is pure infrastructure and Agent incorporates the context. But almost any way things shake out, the idea that the MDS vendors coalesce into fewer swimlanes seems fairly obvious.
On Query Volumes
The agent query volume explosion changes the economics of compute. If agents can ask questions 100x faster than humans, the total query volume in a typical organization could increase by one or two orders of magnitude. This fundamentally changes the economics. Warehouses that charge per-query or per-compute-second will see massive revenue growth or massive customer pushback, depending on pricing. The smart play is to make this cheap enough that organizations don't gate their agents. The warehouse that wins the agent era is the one that makes it economically rational to let an agent run 500 exploratory queries to answer one business question. This likely favors architectures that can do a lot of work cheaply — local-first engines, intelligent caching, tiered compute — over architectures that spin up expensive clusters for every query.
[JT] This one is seductive, but I'm not quite convinced. While an agent can fire off tons more queries, I haven't heard a good argument as to why it would. What are the use cases for agents to hammer your analytics database? Jeff Dean, the architect of many of Google's core systems, seems to agree. Certainly, if this does happen, MotherDuck will be in great shape, since we offer much lower latency (80% of queries complete in under 20 milliseconds) and lower cost (between a quarter and a half of other vendors on equivalent hardware), which means an agent could run a lot of queries very fast for not much money.
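To make the economics concrete, here is a toy sketch of the "500 exploratory queries" burst, using Python's built-in sqlite3 as a stand-in for any embedded, local-first engine (DuckDB would play the same role in practice). The table, query shapes, and query count are all invented for illustration; the point is that on a local engine the whole burst finishes in milliseconds, which is the cost profile that makes unmetered agent exploration rational.

```python
import sqlite3
import time

# In-memory database standing in for a local-first analytics engine.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (region TEXT, product TEXT, amount REAL)")
rows = [("us", f"p{i % 10}", float(i % 100)) for i in range(10_000)]
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# An "agent" exploring hypotheses: 500 small aggregate queries with
# varying filters, the kind of burst a human analyst would never fire.
start = time.perf_counter()
results = []
for i in range(500):
    threshold = i % 100
    total, = con.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE amount > ?",
        (threshold,),
    ).fetchone()
    results.append(total)
elapsed = time.perf_counter() - start

print(f"{len(results)} queries in {elapsed * 1000:.0f} ms")
```

Per-query pricing that assumes human-speed workloads makes a loop like this feel expensive; an embedded or cached engine makes it effectively free.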
Claude on Data Jobs
What about the impact on people working with data? What are the emerging roles going to be?
On data engineering
The data engineer of the future looks less like someone writing Airflow DAGs and more like someone maintaining the "source of truth" context documents, reviewing AI-proposed schema migrations, and managing the feedback loops that keep the system accurate. It's a higher-leverage role, but it's more editorial than authorial.
[JT] This sounds largely right, that a data engineer would primarily be an editor for data systems. However, I also think they'd be involved in data modeling, which would be important for ensuring the systems work efficiently. A query agent is going to be much more effective against a clean data model.
On Humans vs Agents
Agents Become The Dominant Query Interface. Human-generated SQL drops to perhaps 20% of all warehouse queries. The rest are fired by agents exploring hypotheses, running A/B analyses, and chasing anomalies — none of which requires a human to ask. This radically changes warehouse economics. Pricing models built around human-speed query patterns shatter. The warehouses that win are those that handle bursty, high-volume, agent-driven workloads efficiently.
[JT] I think this is overestimating the amount of SQL that humans will write; my guess is that this falls to less than 10% within a year and 1% within three, and that isn't even assuming a huge increase in query volumes. If query volumes do explode as predicted above, that would skew things even further towards machine-written queries.
On AI's impact on staffing
The most dangerous prediction: most data teams get smaller, but the surviving members get more powerful. A team of 15 — five analytics engineers, four data engineers, three analysts, two BI developers, and a manager — might become a team of five: two data product managers, one infrastructure/reliability engineer, one "context curator" (the new analytics engineer), and one manager. That team of five, augmented by agents, will produce more output than the original fifteen. This is the inexorable consequence of building costs going to zero. The people who survive will be the ones who bring judgment, domain expertise, and organizational context — the things that can't be vibe-coded.
[JT] I try to stay out of the "what is this going to mean for human jobs" question. If you give people super-powers then you might need fewer super-powered people to do the same jobs. But more super-powered people can do more things. So maybe there will be more jobs. Either way, change is coming, and no matter what your role, it is going to take some work to adapt.
Claude on the New Data Stack
After the dust clears and the AI overlords establish some sort of stable equilibrium, what can we expect?
On Query Interfaces
The "single pane of glass" actually happens, and it's the chat window. The industry has promised a unified analytics experience for decades and never delivered. AI might actually do it, not because someone builds the perfect integrated tool, but because the interface becomes natural language. The LLM becomes the integration layer. It calls your warehouse for structured data, hits an API for real-time metrics, pulls context from your Confluence, and synthesizes an answer. The "modern data stack" doesn't consolidate into one product — it consolidates into one conversational interface that orchestrates many products. The data team's job becomes ensuring that all the pieces behind that interface are reliable and correct.
[JT] I'm particularly excited about the idea that you can drive everything from your favorite chat interface like Claude, ChatGPT, or Gemini. The chat tools will be more and more fully featured, and you'll rarely, if ever, need to leave them. This will be super powerful, completely upending decades of isolated SaaS services each with their own independent interface. Humans evolved to interact via conversational interfaces; it is hard to improve on that if you do it right.
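The orchestration pattern Claude describes can be pictured as a single dispatcher in front of heterogeneous backends. This is a deliberately toy sketch: keyword routing stands in for the LLM's tool selection, and every tool name and function below is invented for illustration, not any real product's API.

```python
# Toy sketch of "one conversational interface, many backends".
# In a real system the LLM decides which tool to call; here simple
# keyword routing stands in for that decision. All tools are stubs.

def query_warehouse(question: str) -> str:
    return f"warehouse result for: {question}"      # structured, historical data

def fetch_metrics_api(question: str) -> str:
    return f"live metrics for: {question}"          # real-time operational metrics

def search_docs(question: str) -> str:
    return f"docs context for: {question}"          # unstructured context (wikis, docs)

TOOLS = {
    "revenue": query_warehouse,
    "right now": fetch_metrics_api,
    "policy": search_docs,
}

def answer(question: str) -> str:
    for keyword, tool in TOOLS.items():
        if keyword in question.lower():
            return tool(question)
    return query_warehouse(question)  # default: ask the warehouse

print(answer("What was revenue last quarter?"))
print(answer("How many users are online right now?"))
```

The interesting design consequence is the one Claude names: the data team's job shifts from building each interface to making sure every backend behind the dispatcher returns correct answers.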
On the Context Layer
The semantic layer — long the unloved middle child of the data stack — briefly becomes the most important thing in the industry. It is the map between raw data and the questions an LLM can answer reliably. Every organization that skipped building one scrambles to build one. A thousand vendors appear. Then the LLMs get good enough to infer most of it, and the moment passes.
[JT] LOL. You tell 'em Claude.
On BI workflows
The traditional BI workflow of "analyst builds a dashboard, stakeholders consume it" gets replaced by "agent monitors data and surfaces what's interesting, humans drill in conversationally."
[JT] I'm skeptical. There have been products for a while that purport to automatically surface insights, but so far, none of them have seemed to work. Does AI finally get to the point where it can proactively alert you to changes in the data? Perhaps, but it may be that this just isn't a real thing that people want or can make use of.
On Data Contracts
Data Contracts Become Load-Bearing Infrastructure. Because change is the only constant, and because agents act on data automatically without human sanity-checking, schema drift and silent breakage become catastrophic rather than merely annoying.
[JT] I'm especially happy with this one because the input prompt didn't even mention data contracts. But it makes a lot of sense: a human sets up data contracts (or perhaps just calls them 'evals'), and then the AI knows that when a contract is violated, something has gone wrong. The AI can also tweak pipelines on its own, and as long as the contracts aren't violated, it can have confidence that things still work. That is what allows the system to self-improve.
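To illustrate the mechanic, here is a hypothetical minimal contract check (the contract format and field names are invented for this sketch, not any vendor's standard): the contract declares a schema plus invariants, and the checker returns every violation rather than failing fast, so an agent can reason about what broke.

```python
# Minimal data-contract sketch. The contract declares expectations;
# check_contract collects all violations so an agent (or human) can
# see the full picture. Format is hypothetical, for illustration only.

CONTRACT = {
    "required_columns": {"order_id": int, "region": str, "amount": float},
    "non_null": ["order_id", "region"],
    "min_rows": 1,
}

def check_contract(rows: list[dict], contract: dict) -> list[str]:
    violations = []
    if len(rows) < contract["min_rows"]:
        violations.append(
            f"expected at least {contract['min_rows']} rows, got {len(rows)}"
        )
    for i, row in enumerate(rows):
        for col, typ in contract["required_columns"].items():
            if col not in row:
                violations.append(f"row {i}: missing column {col!r}")
            elif row[col] is not None and not isinstance(row[col], typ):
                violations.append(f"row {i}: {col!r} should be {typ.__name__}")
        for col in contract["non_null"]:
            if row.get(col) is None:
                violations.append(f"row {i}: {col!r} is null")
    return violations

good = [{"order_id": 1, "region": "us", "amount": 9.99}]
bad = [{"order_id": None, "region": "us", "amount": "9.99"}]
print(check_contract(good, CONTRACT))  # []
print(check_contract(bad, CONTRACT))
```

An agent that rewrites a pipeline can rerun a check like this afterwards; a clean result is its evidence that the change didn't silently break downstream consumers.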
On the immortality of SQL
SQL Survives Everything, Mutated Beyond Recognition. SQL was designed for humans. It will be redesigned for machines. But the relational model underneath it — sets, joins, aggregations — is so fundamentally correct that it persists. In ten years, almost no human writes SQL directly.
[JT] Claude starts by making a bold prediction and then backs off of it; SQL survives, but it also gets redesigned for machines, which sounds like all of the SQL-but-not-SQL variants that have been proposed over the years but have not caught on. Does the LLM come up with its own ur-SQL, or does it just stick with what it knows? My bet is the latter, but Claude isn't so sure. I guess Claude will just have to FAFO like the rest of us.
On feedback loops
Self-improving data systems. Imagine an agent that not only answers questions but monitors its own performance — tracking which queries were slow, which results were flagged as wrong by users, which data models led to confused outputs. It uses this signal to propose model improvements. The system gets better the more it's used, and the more people using it, the faster it improves. This is the flywheel that would be genuinely hard to compete with: not just "our AI is smarter" but "our AI has seen more of your data problems and learned from them."
[JT] This, to me, is where it starts to get really exciting. You start out with a basic data system; you use it, you provide feedback, you use it some more, it gets better. And better. And eventually you have something that figures out what you want before you even ask it.
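One way to picture that flywheel is a loop that logs per-query signals and, once a pattern repeats often enough, proposes a concrete change. This is a hypothetical minimal sketch: the signal types, threshold, and proposal wording are all invented for illustration.

```python
from collections import Counter

# Hypothetical feedback loop: record per-table signals (slow queries,
# results users flagged as wrong) and propose a fix once the same
# table crosses a threshold. Thresholds and proposals are illustrative.

class FeedbackLoop:
    def __init__(self, threshold: int = 3):
        self.slow = Counter()     # table -> slow-query reports
        self.flagged = Counter()  # table -> "result was wrong" flags
        self.threshold = threshold

    def record(self, table: str, slow: bool = False, flagged_wrong: bool = False):
        if slow:
            self.slow[table] += 1
        if flagged_wrong:
            self.flagged[table] += 1

    def proposals(self) -> list[str]:
        out = []
        for table, n in self.slow.items():
            if n >= self.threshold:
                out.append(f"consider materializing or indexing {table}")
        for table, n in self.flagged.items():
            if n >= self.threshold:
                out.append(f"review the data model and docs for {table}")
        return out

loop = FeedbackLoop()
for _ in range(3):
    loop.record("orders", slow=True)
loop.record("users", flagged_wrong=True)
print(loop.proposals())  # only 'orders' has crossed the threshold
```

The flywheel comes from closing the loop: the proposals feed back into the context documents and data models the agent queries against, so more usage means better answers.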
On getting the last word in
The data industry does not shrink. It is restructured around a different scarce resource. Compute was once scarce; it became cheap. Storage was once scarce; it became cheap. Human judgment applied precisely to the right question at the right moment — that becomes the only thing the machines cannot yet replicate at will. For now.
[JT] This is certainly a rosy picture, at least until Claude's inner nihilist shows through at the end. We're living in interesting times. Change is coming whether we like it or not.
A few weeks ago, I wrote a LinkedIn post about how it is lazy to let an AI do your thinking for you. And while it is true I have outsourced some of my work here to Claude, if we're going to keep talking about LLMs, it seems only fair to let the LLMs weigh in once in a while.
This is part two of a series I'm writing about the future of data. Stay tuned for a discussion of what happens when the Agents take over.
Start using MotherDuck now!

