Stop Burning Tokens: Data Observability and Quality as Your AI's System of Record

Embedding AI in every data tool creates token bloat, inconsistency, and siloed context. The better pattern: expose data quality and observability as a deterministic service via MCP and let your AI call it.

Written by Gil Benghiat on June 16, 2026


TL;DR: Embedding AI in every tool creates token bloat, inconsistency, and siloed context. The better pattern is to expose data quality and observability as a deterministic service via MCP and let whatever AI you already use call it. Use AI to discover what tests to add. Add them to TestGen. From that point forward, the tests run without burning another token.

Everyone Is Adding AI. That Is the Problem.

Atlassian has an AI assistant. Docker has one. Monte Carlo has AI, root cause analysis, lineage/impact analysis, and cost optimization. Every tool wants to be your AI interface and include every feature.

Here is what I notice when I use those tools: I ignore the embedded AI and just ask Claude. The tool’s AI does not have my context. It does not know my terminology, my priorities, or what happened last week. And now companies are starting to realize that convenience carries a bill. The FinOps Foundation found that 73% of enterprises in 2026 reported AI costs exceeded original projections, even as token prices fell 98%. Volume is the culprit, and it only gets worse when every tool in your stack burns tokens independently.

Context is the new oil. The question is: who does the refining?

The Factory Analogy

In a factory, you do not have nuts and bolts in random sizes. You standardize them, and that makes everything downstream faster and more reliable. You also do not manufacture the nuts and bolts yourself. You buy commodity parts and save your expertise for the next level of value.

I ran a demo recently where I typed one prompt to Claude: “Look at all the hygiene issues and data test failures in TestGen and list the first three I should investigate.” Claude used the MCP interface (the Model Context Protocol, which lets AI tools call external services directly) and came back with three findings: duplicate customer IDs, two completely empty columns, and a product category field with inconsistent values. I asked Claude to fix the category field. It wrote SQL to normalize all variants to the dominant form, which is exactly the kind of query I write myself. Then I asked what custom test I should add. Claude scanned 214 existing single-column tests and suggested a cross-table check: discount amount should not exceed the maximum allowed discount for that product. When I asked how it knew to recommend that, it said the field names were descriptive enough to make the relationship obvious.
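To make the two suggestions from that demo concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not the SQL Claude wrote and not TestGen's implementation. The table names (products, orders), the column names, and the sample rows are all hypothetical, and "variants" are assumed to differ only in case and surrounding whitespace.

```python
import sqlite3


def normalize_to_dominant(conn, table, column):
    """Rewrite spelling variants of a text value to the most frequent variant.

    Variants are grouped by a trimmed, lowercased key. Table and column names
    are interpolated into the SQL, so pass trusted identifiers only.
    """
    counts = conn.execute(
        f"SELECT {column}, COUNT(*) FROM {table} "
        f"WHERE {column} IS NOT NULL GROUP BY {column}"
    ).fetchall()

    dominant = {}  # key -> (raw spelling, row count)
    for raw, n in counts:
        key = raw.strip().lower()
        if key not in dominant or n > dominant[key][1]:
            dominant[key] = (raw, n)

    # Rewrite every non-dominant spelling to the dominant one in its group.
    for raw, _ in counts:
        target = dominant[raw.strip().lower()][0]
        if raw != target:
            conn.execute(
                f"UPDATE {table} SET {column} = ? WHERE {column} = ?",
                (target, raw),
            )
    conn.commit()


def discount_violations(conn):
    """Cross-table check: orders whose discount exceeds the product's maximum."""
    return conn.execute(
        "SELECT o.order_id, o.discount_amount, p.max_discount "
        "FROM orders AS o JOIN products AS p ON p.product_id = o.product_id "
        "WHERE o.discount_amount > p.max_discount "
        "ORDER BY o.order_id"
    ).fetchall()


def build_demo():
    """Create a tiny in-memory database with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE products (product_id INTEGER PRIMARY KEY,
                               category TEXT, max_discount REAL);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                             product_id INTEGER, discount_amount REAL);
        INSERT INTO products VALUES
            (1, 'Electronics', 50.0), (2, 'Electronics', 20.0),
            (3, 'electronics', 10.0), (4, 'ELECTRONICS ', 5.0),
            (5, 'Toys', 15.0);
        INSERT INTO orders VALUES
            (100, 1, 40.0), (101, 2, 25.0), (102, 5, 15.0);
        """
    )
    return conn
```

Calling normalize_to_dominant(conn, "products", "category") on the demo data collapses the three spellings into 'Electronics', and discount_violations(conn) returns only order 101, whose discount of 25.0 exceeds that product's maximum of 20.0. A real data set would probably need a richer notion of "variant" than case and whitespace.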

That is the right division of labor. TestGen (DataKitchen’s data quality and observability platform) held the standardized, profiled data. The AI read that context and reasoned from there. Each did its own job and nothing more.

Two Jobs That Should Stay Separate

Most vendors blur a boundary that should stay clear. On one side sit determinism, auditability, and a system of record. On the other sit reasoning, synthesis, and discovery.

TestGen lives on the first side. When you add a test, it runs. Every time. At scale. It never forgets, never drifts, and keeps a full history: when a test ran, when it failed, what the data looked like at that moment. For regulated industries and compliance-conscious teams, that auditability is required, not optional.
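As a rough illustration of what "deterministic with a full history" means, here is a small Python sketch. It is not TestGen's data model: the RunRecord fields, the null_count measure, and the threshold rule are assumptions chosen to mirror the three things named above (when a test ran, whether it failed, and what the data looked like).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class RunRecord:
    """One immutable entry in a test's history (hypothetical shape)."""
    test_name: str
    ran_at: datetime   # when the test ran
    passed: bool       # whether it passed
    observed: int      # what the data looked like at that moment


def null_count(rows: Sequence[dict], column: str) -> int:
    """Count rows where the column is missing or None."""
    return sum(1 for row in rows if row.get(column) is None)


def run_test(
    name: str,
    measure: Callable[[Sequence[dict]], int],
    threshold: int,
    rows: Sequence[dict],
    history: List[RunRecord],
    now: Optional[datetime] = None,
) -> RunRecord:
    """Evaluate a fixed rule and append the outcome to the history.

    The rule is plain code, so the same rows always give the same result.
    """
    observed = measure(rows)
    record = RunRecord(
        test_name=name,
        ran_at=now or datetime.now(timezone.utc),
        passed=observed <= threshold,
        observed=observed,
    )
    history.append(record)
    return record
```

Running the same test twice against the same rows produces two history entries with identical passed and observed values. That repeatability is the property a reasoning engine cannot promise.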

AI lives on the second side. It is good at reasoning over context and surfacing what you did not think to look for. Ask an AI to maintain a test suite indefinitely and you will eventually be disappointed. Ask it to reason over profiled, tested data exposed via MCP, and you will be impressed.

Embedding AI inside every tool collapses this boundary. You get a reasoning engine trying to act as a system of record, burning tokens every time, and producing slightly different answers each run. That is the wrong architecture.

MCP Changes the Equation

Connect TestGen as an MCP service and whatever AI you already use (Claude, Cursor, Databricks Genie, Snowflake Cortex) can talk to it directly. Your AI does not change. Your workflow does not change. TestGen becomes the backend service your AI calls when it needs to know what the data looks like, what tests exist, and what has failed.
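Under the hood, an MCP tool call is a JSON-RPC 2.0 message with the method tools/call. The Python sketch below builds such a request and reads the text parts of a reply. The tool name list_test_failures and its arguments are hypothetical and are not taken from TestGen's documentation. The transport and the initialization handshake are left out.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def text_from_response(response):
    """Return the concatenated text parts of a tools/call response."""
    if "error" in response:
        # Protocol-level failure, e.g. unknown tool or invalid arguments
        raise RuntimeError(response["error"].get("message", "JSON-RPC error"))
    result = response["result"]
    if result.get("isError"):
        # The tool itself ran and reported a failure
        raise RuntimeError("tool reported an error")
    return "\n".join(
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    )


# Hypothetical tool name and arguments, for illustration only
request = build_tool_call("list_test_failures", {"limit": 3})
wire_message = json.dumps(request)
```

In practice the AI client writes and sends these messages for you. The point of the sketch is that the service on the other end answers a structured request and does not need a model of its own.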

This is also why we decided not to build lineage into TestGen. The answer to “what will this error affect” lives in the code, and any developer already has an AI connected to their repository. There is no reason to duplicate that inside a data quality tool. That leaves three pieces of context, each living where it belongs and each accessible via the right tool: data quality results in TestGen, lineage in the code repository, and your own terminology and priorities in the AI you already use. The architecture does not have to be complicated; it just has to respect the boundary.

Save Tokens. Get Consistency.

I asked Claude what tests I should add. Claude recommended the cross-table discount check and added it to TestGen. That test now runs on every cycle without burning a single token. I paid for the reasoning once and the test runs forever.

That is the compounding advantage. Every test you encode is a question you never pay to answer again. TestGen always returns the same answer, auditable and traceable to the specific run.
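The arithmetic behind "pay once" is simple enough to sketch in Python. The token counts in the usage note are made-up round numbers, not measurements from the demo. The model assumes an encoded test costs zero tokens per run, which is the claim above.

```python
def tokens_if_ai_checks_every_run(tokens_per_check, runs):
    """Total tokens when the AI re-answers the same question on every run."""
    return tokens_per_check * runs


def tokens_if_encoded_once(discovery_tokens):
    """Total tokens when the AI discovers the test once and it then runs as code."""
    return discovery_tokens


def break_even_runs(discovery_tokens, tokens_per_check):
    """Smallest number of runs at which encoding costs no more than re-asking."""
    return -(-discovery_tokens // tokens_per_check)  # integer ceiling division
```

With an assumed 20,000 tokens for the one-time discovery conversation and 4,000 tokens per AI-driven check, encoding the test breaks even at 5 runs. Over 365 daily runs the comparison is 1,460,000 tokens against 20,000.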

All the features. A tenth the price. And we save you the tokens.

Gil Benghiat

Co-founder and VP of Products & Implementation at DataKitchen. Helping data teams find data quality issues before their customers do.
