
How TestGen Complements Microsoft Purview for Enterprise Data Quality

DataKitchen's TestGen and Microsoft's Purview complement each other: Purview serves as the governance and catalog "source of truth," while TestGen is the deep, automated data-quality and testing engine that writes thousands of data quality tests in seconds.

Written by Gil Benghiat on January 20, 2026

Data Quality, DataOps, Data Observability, DataOps Observability, DataOps TestGen, Open Source

How does DataKitchen’s TestGen complement Microsoft’s Purview?

Organizations that deploy Microsoft Purview gain a powerful foundation for data governance, cataloging, and lineage across their Microsoft ecosystem. But once teams begin governing data assets, they quickly encounter the challenge of validating data quality at scale and in an automated way. That is where DataKitchen’s TestGen and Purview complement each other: Purview serves as the governance and catalog “source of truth,” while TestGen is the deep, automated data-quality and testing engine that operationalizes trust at scale.

What is TestGen?

TestGen automatically profiles datasets and generates comprehensive data tests based on the structure, distributions, patterns, and irregularities it discovers. It provides detailed issue listings with context that allow data teams to decide whether bad records should be passed, patched, purged, or returned to an upstream system. TestGen’s entire approach is designed to minimize manual configuration, allowing organizations to stand up meaningful data quality assurance in days rather than months.

What Does TestGen Do That Purview Does Not?

TestGen delivers specific capabilities that Purview does not provide. Most importantly, TestGen automatically generates tests from data profiles using AI- and heuristic-driven techniques, rather than requiring stewards or engineers to author rules manually. The result is comprehensive, automatic coverage that requires no configuration to get started. TestGen also supports exploratory data-quality discovery across large estates, scanning many tables to surface anomalies, distribution shifts, odd patterns, and other hard-to-anticipate defects.

TestGen further supports multi-table and relational data quality through “fill in the blank” test specifications, which help teams express domain logic without manually coding SQL. It offers an opinionated Pass / Patch / Purge / Push Back workflow that not only identifies issues but helps teams decide what to do with them by providing record-level detail. Operationally, TestGen is pipeline-adjacent: it can integrate with CI/CD and orchestration tools, run wherever the data lives (multi-cloud, on-prem, non-Microsoft), and is open source and tool-agnostic, allowing it to work across heterogeneous data estates.
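The Pass / Patch / Purge / Push Back decision can be pictured as a small triage step over records that failed a test. The following is a minimal sketch, not TestGen's implementation: the `Disposition` enum, the `triage` function, the sample records, and the decision rule are all hypothetical and only illustrate the four outcomes described above.

```python
from enum import Enum


class Disposition(Enum):
    PASS = "pass"            # accept the record as-is
    PATCH = "patch"          # correct the record, then keep it
    PURGE = "purge"          # drop the record
    PUSH_BACK = "push back"  # return the record to the upstream system


def triage(records, decide, patch):
    """Split records into those kept downstream and those returned upstream."""
    kept, returned = [], []
    for record in records:
        disposition = decide(record)
        if disposition is Disposition.PASS:
            kept.append(record)
        elif disposition is Disposition.PATCH:
            kept.append(patch(record))
        elif disposition is Disposition.PUSH_BACK:
            returned.append(record)
        # Disposition.PURGE: the record is intentionally dropped.
    return kept, returned


# Hypothetical decision rule for illustration only.
def decide(record):
    if record["id"] < 0:
        return Disposition.PURGE
    if record["country"] is None:
        return Disposition.PUSH_BACK
    if record["country"] == "usa":
        return Disposition.PATCH
    return Disposition.PASS


def patch(record):
    # Return a corrected copy rather than mutating the input.
    return {**record, "country": "US"}


records = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "usa"},
    {"id": -1, "country": "US"},
    {"id": 4, "country": None},
]
kept, returned = triage(records, decide, patch)
print(kept)
print(returned)
```

In practice the decision is made by people reviewing record-level detail; the sketch only shows how each outcome routes a record differently.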

What Does Purview Do That TestGen Does Not?

Microsoft Purview brings governance capabilities that TestGen purposely does not address. It provides an enterprise data catalog and search experience that acts as the “system of record” for schemas, datasets, classifications, reports, and data products. It offers end-to-end lineage visualization for data flowing across databases, data lakes, workflows, Fabric, Power BI, and other services — giving governance and BI teams insight into where data originates and how it changes.

Purview also handles broad governance features, including business glossaries, classifications, sensitivity labels, access control, and domain ownership. Because it is tightly integrated with the Microsoft ecosystem (e.g., Azure, Fabric, Power BI, M365, SQL), Purview provides both a consistent user experience and an enforcement model in environments where Microsoft is already the dominant stack. These governance and discovery features make Purview invaluable to stewards, security teams, and leadership.

Where Do TestGen and Purview Overlap?

There are several functional areas where Purview and TestGen both provide capability, but with different approaches and trade-offs. Both can profile data to compute statistics at the dataset and column level, helping teams understand distributions, null patterns, distinct counts, and schema characteristics. Both can define and run data-quality checks such as null checks, value ranges, pattern validation, and referential consistency. The difference is that TestGen automatically generates thousands of checks based on data profiles, while Purview requires manual configuration of rules.
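To make the idea of generating checks from a profile concrete, here is a minimal sketch of the general technique. It is not TestGen's algorithm or API: the `ColumnProfile` class, the `profile_column` and `generate_tests` functions, and the `max_categories` threshold are hypothetical names chosen for illustration, and real tools use far richer profiling and heuristics.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ColumnProfile:
    name: str
    null_count: int
    distinct_values: frozenset
    minimum: Optional[float]
    maximum: Optional[float]


def profile_column(name: str, values: list[Any]) -> ColumnProfile:
    """Compute simple column statistics: nulls, distinct values, numeric range."""
    present = [v for v in values if v is not None]
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    all_numeric = bool(present) and len(numeric) == len(present)
    return ColumnProfile(
        name=name,
        null_count=len(values) - len(present),
        distinct_values=frozenset(present),
        minimum=min(numeric) if all_numeric else None,
        maximum=max(numeric) if all_numeric else None,
    )


def generate_tests(profile: ColumnProfile, max_categories: int = 5) -> list[str]:
    """Derive candidate checks from what the profile observed."""
    tests = []
    if profile.null_count == 0:
        tests.append(f"{profile.name}: not null")
    if profile.minimum is not None:
        tests.append(
            f"{profile.name}: between {profile.minimum} and {profile.maximum}"
        )
    elif 0 < len(profile.distinct_values) <= max_categories:
        allowed = sorted(str(v) for v in profile.distinct_values)
        tests.append(f"{profile.name}: one of {allowed}")
    return tests


# Hypothetical sample data.
table = {
    "age": [34, 51, 28, 45],
    "status": ["active", "closed", "active", None],
}
for column, values in table.items():
    for test in generate_tests(profile_column(column, values)):
        print(test)
```

The point of the sketch is the division of labor: a person writing rules by hand has to supply each check, while a profile-driven generator proposes them from the observed data and leaves the person to review.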

Both tools can produce quality outcomes, including passed and failed checks, and both can support governance conversations by making data quality visible to a wider audience. Each platform helps different stakeholders prioritize fixes: TestGen supports engineers and operators by showing exactly which records are bad, while Purview helps stewards and leaders understand which assets and domains present governance risk.

What is an ROI Example for TestGen?

To understand the operational impact of TestGen’s automatic test generation, consider an illustrative scenario from DataKitchen’s internal benchmarking: covering 20 tables containing a total of 1,000 columns with an average of 2.5 tests per column, or about 2,500 tests. Using TestGen, a junior operator can complete the task in two steps: profile and generate tests.

For a Data Engineer to accomplish this task manually, assume it takes 30 minutes to write each test and that the engineer has no meetings, breaks, or vacation. Writing the tests would consume roughly 1,250 hours, or about 156 eight-hour working days, which translates to about 31 weeks or 7.2 months.

This simple illustration shows that automation in test generation does not merely reduce labor — it enables scale that would otherwise be operationally impossible.
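The arithmetic behind the manual estimate can be checked in a few lines. The column count, tests per column, and minutes per test come from the scenario above; the 8-hour day, 5-day week, and 52-weeks-per-12-months conversion are assumptions that happen to reproduce the stated figures.

```python
# Figures from the scenario in the text.
columns = 1000
tests_per_column = 2.5
minutes_per_test = 30

# Assumptions (not stated in the text): 8-hour days, 5-day weeks,
# and 52 weeks spread evenly over 12 months.
hours_per_day = 8
days_per_week = 5
weeks_per_month = 52 / 12

total_tests = columns * tests_per_column
total_hours = total_tests * minutes_per_test / 60
working_days = total_hours / hours_per_day
weeks = working_days / days_per_week
months = weeks / weeks_per_month

print(f"{total_tests:,.0f} tests")
print(f"{total_hours:,.0f} hours")
print(f"{working_days:.0f} working days")
print(f"{weeks:.0f} weeks")
print(f"{months:.1f} months")
```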

Conclusion

For organizations already using Microsoft Purview, TestGen is a natural complement that fills the operational testing and anomaly-discovery gaps that governance alone cannot address. Together, they share a common interest in making data trustworthy and offer a clear division of responsibility: Purview for governance and stewardship, TestGen for automated quality.

As data estates continue to grow in size and complexity, combining TestGen’s automated testing with Purview’s governance and lineage gives data quality, data governance, and data engineering teams a practical path to improving data reliability at enterprise scale.

Frequently Asked Questions

What is a short summary of how DataOps Data Quality TestGen complements Microsoft Purview?

Here is what TestGen does that Purview does not: it generates tests comprehensively and automatically – no configuration required.

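What Purview does that TestGen does not: it provides an enterprise data catalog and search experience, end-to-end lineage visualization, and broad governance features such as business glossaries, classifications, sensitivity labels, access control, and domain ownership.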

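Where they overlap: both can profile data, define and run data-quality checks, and report passed and failed checks. For these features, you will need to decide which tool you want to use.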

Here is an ROI calculation using TestGen:

With TestGen, a junior operator can generate 2,526 tests with two steps (profile, generate tests).
It would take a trained Data Engineer 7.2 months to achieve the same results, with no time for meetings, breaks, or vacations.

What is a quick summary of the blog?

This blog details how Microsoft Purview and DataKitchen’s TestGen work together to manage enterprise data quality and governance. While Purview acts as the primary system of record for data cataloging, lineage, and policy enforcement, TestGen provides an automated engine for deep data profiling and test generation. The post highlights that TestGen significantly reduces manual labor by using AI- and heuristic-driven techniques to create thousands of tests in minutes, a task that would otherwise take a human engineer months to complete. By using the two tools together, organizations can combine broad stewardship with operational automation to improve data reliability across diverse technical environments. The combination allows teams to identify anomalies and unknown defects while maintaining a consistent governance framework within the Microsoft ecosystem.

Gil Benghiat

Co-founder and VP of Products & Implementation at DataKitchen. 35+ years in software engineering with experience at AT&T Bell Labs, Sybase, and Oracle.
