Stop Burning Tokens: Data Observability and Quality as Your AI's System of Record

TL;DR: Embedding AI in every tool creates token bloat, inconsistency, and siloed context. The better pattern is to expose data quality and observability as a deterministic service via MCP and let whatever AI you already use call it. Use AI to discover what tests to add. Add them to TestGen. From that point forward, the tests run without burning another token.

Everyone Is Adding AI. That Is the Problem.

Atlassian has an AI assistant. Docker has one. Monte Carlo has AI, root cause analysis, lineage/impact analysis, and cost optimization. Every tool wants to be your AI interface and include every feature.

Here is what I notice when I use those tools: I ignore the embedded AI and just ask Claude. The tool’s AI does not have my context. It does not know my terminology, my priorities, or what happened last week. And now companies are starting to realize that convenience carries a bill. A DoiT survey (Sapio Research, Feb 2026, 500 finance leaders at orgs with 1,000+ employees) found that 79% of enterprises overspent on AI in 2026, even as per-token prices kept falling. Volume is the culprit, and it only gets worse when every tool in your stack burns tokens independently.

Context is the new oil. The question is: who does the refining?

The Factory Analogy

In a factory, you do not have nuts and bolts in random sizes. You standardize them, and that makes everything downstream faster and more reliable. You also do not manufacture the nuts and bolts yourself. You buy commodity parts and save your expertise for the next level of value.

I ran a demo recently where I typed one prompt to Claude: “Look at all the hygiene issues and data test failures in TestGen and list the first three I should investigate.” Claude used the MCP interface (the Model Context Protocol, which lets AI tools call external services directly) and came back with three findings: duplicate customer IDs, two completely empty columns, and a product category field with inconsistent values. I asked Claude to fix the category field. It wrote SQL to normalize all variants to the dominant form, which is exactly the kind of query I write myself. Then I asked what custom test I should add. Claude scanned 214 existing single-column tests and suggested a cross-table check: discount amount should not exceed the maximum allowed discount for that product. When I asked how it knew to recommend that, it said the field names were descriptive enough to make the relationship obvious.
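To make the two data steps from that demo concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the SQL Claude actually wrote. The table and column names (products, order_lines, category, max_discount, discount_amount) and the sample rows are assumptions invented for illustration.

```python
import sqlite3

# Hypothetical schema and sample rows, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT, max_discount REAL);
    CREATE TABLE order_lines (order_id INTEGER, product_id INTEGER, discount_amount REAL);
    INSERT INTO products VALUES
        (1, 'Electronics', 50.0), (2, 'electronics', 20.0),
        (3, 'Electronics', 10.0), (4, 'ELECTRONICS ', 5.0);
    INSERT INTO order_lines VALUES (100, 1, 25.0), (101, 2, 30.0), (102, 3, 10.0);
""")


def normalize_category_variants(conn):
    """Rewrite case/whitespace variants of each category to its most frequent spelling."""
    rows = conn.execute("""
        SELECT LOWER(TRIM(category)) AS norm, category, COUNT(*) AS n
        FROM products
        GROUP BY norm, category
        ORDER BY norm, n DESC, category
    """).fetchall()
    dominant = {}
    for norm, spelling, _count in rows:
        # Rows are sorted by descending count, so the first spelling seen wins.
        dominant.setdefault(norm, spelling)
    for norm, spelling in dominant.items():
        conn.execute(
            "UPDATE products SET category = ? WHERE LOWER(TRIM(category)) = ?",
            (spelling, norm),
        )
    return dominant


def discount_violations(conn):
    """Cross-table check: order lines whose discount exceeds the product's maximum."""
    return conn.execute("""
        SELECT o.order_id, o.discount_amount, p.max_discount
        FROM order_lines AS o
        JOIN products AS p ON p.product_id = o.product_id
        WHERE o.discount_amount > p.max_discount
        ORDER BY o.order_id
    """).fetchall()


print(normalize_category_variants(conn))
print(discount_violations(conn))
```

The second function is the kind of check that a set of single-column tests cannot express, because the rule only exists in the relationship between two tables.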

That is the right division of labor. TestGen (DataKitchen’s data quality and observability platform) held the tests (the rules) and the test results. The AI read that context and reasoned from there. Each did its own job and nothing more.

Two Jobs That Should Stay Separate

Most vendors blur a boundary that should stay clear. On one side sit determinism, auditability, and a system of record. On the other sit reasoning, synthesis, and discovery.

TestGen lives on the first side. When you add a test, it runs. Every time. At scale. It never forgets, never drifts, and keeps a full history: when a test ran, when it failed, what the data looked like at that moment. For regulated industries and compliance-conscious teams, that auditability is required, not optional.
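As a rough illustration of what "system of record" means here, the sketch below runs a stored check and appends every outcome to a history table. This is not TestGen's implementation. The test_runs table, the run_test helper, and the sample customers data are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def run_test(conn, test_name, failing_rows_sql):
    """Run one stored check and append its outcome to an append-only history table."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS test_runs (
            test_name TEXT, ran_at TEXT, failing_rows INTEGER, status TEXT)
    """)
    failing = conn.execute(failing_rows_sql).fetchone()[0]
    status = "passed" if failing == 0 else "failed"
    conn.execute(
        "INSERT INTO test_runs VALUES (?, ?, ?, ?)",
        (test_name, datetime.now(timezone.utc).isoformat(), failing, status),
    )
    return status, failing


# Hypothetical sample data with one duplicated customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, email TEXT);
    INSERT INTO customers VALUES
        (1, 'a@example.com'), (2, 'b@example.com'), (2, 'c@example.com');
""")

# Hypothetical check: count customer_id values that appear more than once.
DUPLICATE_IDS = """
    SELECT COUNT(*) FROM (
        SELECT customer_id FROM customers GROUP BY customer_id HAVING COUNT(*) > 1
    )
"""

first = run_test(conn, "unique_customer_id", DUPLICATE_IDS)
second = run_test(conn, "unique_customer_id", DUPLICATE_IDS)
print(first, second)
```

The same rule over the same data gives the same result on every run, and each run leaves a timestamped row behind. A model asked the same question twice offers neither guarantee.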

AI lives on the second side. It is good at reasoning over context and surfacing what you did not think to look for. Ask an AI to maintain a test suite indefinitely and you will eventually be disappointed. Ask it to reason over profiled, tested data exposed via MCP, and you will be impressed.

Embedding AI inside every tool collapses this boundary. You get a reasoning engine trying to act as a system of record, burning tokens every time, and producing slightly different answers each run. That is the wrong architecture.

MCP Changes the Equation

Connect TestGen as an MCP service and whatever AI you already use (Claude, Cursor, Databricks Genie, Snowflake Cortex) can talk to it directly. Your AI does not change. Your workflow does not change. TestGen becomes the backend service your AI calls when it needs to know what the data looks like, what tests exist, and what has failed.
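Under the hood, an MCP client invokes a server's tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a message with the standard library only. The tool name list_failed_tests and its arguments are hypothetical, since TestGen's actual tool names are not given here.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build the JSON-RPC 2.0 message an MCP client sends to invoke a server tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(1, "list_failed_tests", {"table_group": "sales", "limit": 3})
print(json.dumps(request, indent=2))
```

In practice the AI client builds and sends this message for you. The point is that the answer comes from the service's stored results, not from the model's reasoning.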

This is also why we decided not to build lineage into TestGen. The answer to “what will this error affect” lives in the code, and any developer already has an AI connected to their repository. No reason to duplicate that inside a data quality tool. That leaves three pieces of context, each living where it belongs and each accessible via the right tool: your priorities and terminology with your AI, the tests and results in TestGen, and the dependencies in the code. The architecture does not have to be complicated; it just has to respect the boundary.

Save Tokens. Get Consistency.

I asked Claude what tests I should add. Claude recommended the cross-table discount check and added it to TestGen. That test now runs on every cycle without burning a single token. I paid for the reasoning once and the test runs forever.

That is the compounding advantage. Every test you encode is a question you never pay to answer again. TestGen always returns the same answer, auditable and traceable to the specific run.
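The arithmetic behind "pay once" can be sketched in a few lines. Both numbers below, 20,000 tokens per question and one run per day for a year, are assumptions chosen for illustration, not measurements. The sketch also ignores the ordinary compute cost of executing the test itself.

```python
def tokens_spent(tokens_per_question, runs, encoded_as_test):
    """Total tokens spent answering the same data question across many runs.

    If the answer is encoded as a deterministic test, only the first
    (discovery) question costs tokens; otherwise every run pays again.
    """
    if encoded_as_test:
        return tokens_per_question
    return tokens_per_question * runs


# Assumed figures, for illustration only.
TOKENS_PER_QUESTION = 20_000
RUNS_PER_YEAR = 365

ask_every_time = tokens_spent(TOKENS_PER_QUESTION, RUNS_PER_YEAR, encoded_as_test=False)
ask_once = tokens_spent(TOKENS_PER_QUESTION, RUNS_PER_YEAR, encoded_as_test=True)
print(ask_every_time, ask_once, ask_every_time // ask_once)
```

Under these assumptions the gap grows linearly with the number of runs, and it multiplies again with every additional check you encode.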

All the features. A tenth the price. And we save you the tokens.


Gil Benghiat
