Catch bad data before your customer does

Your dashboards run green while bad data slips through. TestGen is open source. It profiles every table, generates thousands of data quality tests, scores every column, and gives you a quality dashboard the team will actually look at. The DataOps way to data quality. No vendor lock-in.

Run open-source TestGen yourself

TestGen is open source and ships in a Docker container. Stand it up in 15 minutes against PostgreSQL, Snowflake, Redshift, BigQuery, Databricks, Oracle, or SAP HANA. Point it at a schema. It profiles every column, scores the data, and generates the test suite from what it finds. A junior operator can run it against a demo cluster without help. No per-row pricing. No vendor lock-in. The source is on GitHub.

The DataOps way to data quality

Most teams write a data quality plan, schedule a quarterly review, and call it shipped. By the time the review happens, the schema has changed three times and an analyst has filed a ticket. The DataOps way is the opposite: measure first, score what you measured, generate tests from the results, run those tests on every refresh, and iterate. You get coverage in days instead of two quarters. Shift left to catch issues at the source, where they cost a dollar a record to fix, not at the dashboard, where they cost a hundred.
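
The measure-then-generate loop can be pictured with a small sketch. This is not TestGen's code or API. It is a hypothetical, standard-library illustration: `profile_column`, `generate_tests`, and `run_tests` are invented names, and the three rules (not-null, unique, in-range) are assumed stand-ins for the much larger set of tests a real tool would generate.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ColumnProfile:
    """What we measured about one column on a baseline load (hypothetical shape)."""
    name: str
    row_count: int
    null_rate: float
    distinct_count: int
    min_value: Any
    max_value: Any


def profile_column(name: str, values: List[Any]) -> ColumnProfile:
    """Measure first: collect basic statistics for a column."""
    non_null = [v for v in values if v is not None]
    return ColumnProfile(
        name=name,
        row_count=len(values),
        null_rate=(1 - len(non_null) / len(values)) if values else 0.0,
        distinct_count=len(set(non_null)),
        min_value=min(non_null) if non_null else None,
        max_value=max(non_null) if non_null else None,
    )


def generate_tests(profile: ColumnProfile) -> Dict[str, Callable[[List[Any]], bool]]:
    """Generate tests from the profile: each observed property becomes a check."""
    tests: Dict[str, Callable[[List[Any]], bool]] = {}
    if profile.row_count and profile.null_rate == 0:
        # The baseline had no nulls, so future loads should not either.
        tests["not_null"] = lambda vals: all(v is not None for v in vals)
    if profile.row_count and profile.distinct_count == profile.row_count:
        # Every baseline value was distinct, so treat the column as a key.
        tests["unique"] = lambda vals: len(set(vals)) == len(vals)
    if profile.min_value is not None:
        lo, hi = profile.min_value, profile.max_value
        # Values should stay inside the range seen at profile time.
        tests["in_range"] = lambda vals: all(lo <= v <= hi for v in vals if v is not None)
    return tests


def run_tests(tests: Dict[str, Callable[[List[Any]], bool]], values: List[Any]) -> Dict[str, bool]:
    """Run on every refresh: True means the check passed."""
    return {name: check(values) for name, check in tests.items()}
```

Profiling a clean baseline such as `[1, 2, 3, 4]` yields all three tests. Running them against a later refresh like `[1, 2, 2, None, 9]` fails all three, which is the point: nobody wrote those tests by hand.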

“So much of what we do involves business questions that are fire drills. Executives don't trust our analytics. DataKitchen enabled us to deliver over 10,000 data quality validation tests that run every release. Now, they trust us!”

Manager, Data Quality

Build a quality dashboard the team will read

Most quality dashboards die in a Confluence page nobody opens. TestGen builds a scorecard the team uses: a total score, a CDE-only score, and breakdowns by accuracy, completeness, consistency, timeliness, uniqueness, and validity. Drill from a scorecard into a dimension, into a column, into the failing test. Add custom scorecards for a specific table group, a critical pipeline, or a domain owner's slice of the catalog. The score updates with every profiling run. Nobody is updating it by hand at the end of the quarter.
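
To make the scorecard idea concrete, here is a minimal sketch of how a total score, a CDE-only score, and per-dimension scores could be rolled up from test results. The formula (percentage of tested records that passed) and the names `CheckResult`, `score`, and `build_scorecard` are assumptions for illustration only. TestGen's actual scoring method is not described here and may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CheckResult:
    """One test outcome on one column (hypothetical shape)."""
    column: str
    dimension: str          # e.g. "completeness", "validity", "uniqueness"
    records_tested: int
    records_failed: int
    is_cde: bool = False    # True if the column is a critical data element


def score(results: List[CheckResult]) -> Optional[float]:
    """Assumed formula: percent of tested records that passed, or None if nothing was tested."""
    tested = sum(r.records_tested for r in results)
    failed = sum(r.records_failed for r in results)
    if tested == 0:
        return None
    return round(100 * (1 - failed / tested), 1)


def build_scorecard(results: List[CheckResult]) -> Dict[str, object]:
    """Roll results up into a total score, a CDE-only score, and a score per dimension."""
    by_dimension: Dict[str, List[CheckResult]] = defaultdict(list)
    for r in results:
        by_dimension[r.dimension].append(r)
    return {
        "total": score(results),
        "cde": score([r for r in results if r.is_cde]),
        "by_dimension": {dim: score(rs) for dim, rs in by_dimension.items()},
    }
```

Because the scorecard is a pure function of the latest results, it can be recomputed after every run, and a custom scorecard is just the same function applied to a filtered list of results.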

Find hygiene issues at profile time

TestGen scans every column for hygiene issues before you ever write a test. Non-standard blank values. Quoted values stored as strings. Dates that haven't moved in two years. Columns that look like PII but aren't tagged. Each issue lands in a triage view with table, column, likelihood, and a one-line detail. Mark it definite, dismiss it, or push it back to the data engineer who owns the source. The dirty columns surface themselves.
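
As a rough illustration of what a profile-time hygiene scan might look for, the sketch below checks one column for three of the issues named above. It is not TestGen's implementation. The placeholder token list, the 730-day staleness cutoff, and the likelihood labels are all assumptions chosen for the example.

```python
from datetime import date, timedelta
from typing import Any, Dict, List

# Assumed list of placeholder strings that stand in for a real NULL.
BLANK_TOKENS = {"", "n/a", "na", "null", "none", "-", "?", "unknown"}


def scan_column(table: str, column: str, values: List[Any], today: date) -> List[Dict[str, str]]:
    """Return one triage row per hygiene issue found in the column."""
    issues: List[Dict[str, str]] = []

    def add(issue: str, likelihood: str, detail: str) -> None:
        issues.append({"table": table, "column": column, "issue": issue,
                       "likelihood": likelihood, "detail": detail})

    strings = [v for v in values if isinstance(v, str)]

    # Non-standard blanks: placeholder text used instead of NULL.
    blanks = [v for v in strings if v.strip().lower() in BLANK_TOKENS]
    if blanks:
        add("Non-standard blank values", "Likely",
            f"{len(blanks)} of {len(values)} values are placeholder blanks")

    # Quoted values: the quote characters were stored as part of the string.
    quoted = [v for v in strings if len(v) >= 2 and v[0] == v[-1] and v[0] in "\"'"]
    if quoted:
        add("Quoted values", "Possible",
            f"{len(quoted)} of {len(values)} values are wrapped in quotes")

    # Stale dates: the newest date is more than two years (730 days) old.
    # Assumes plain date objects, not datetimes.
    dates = [v for v in values if isinstance(v, date)]
    if dates and max(dates) < today - timedelta(days=730):
        add("Stale dates", "Possible", f"most recent value is {max(dates).isoformat()}")

    return issues
```

Each returned row carries the table, column, likelihood, and one-line detail a triage view would need. Passing `today` in explicitly keeps the scan repeatable.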

Smart, continuous table monitoring

TestGen runs the suite on every refresh and watches for the things you didn't think to test for: freshness, volume, schema, and custom metrics. The Monitors page rolls anomalies up by table over a 14-day window, so you can see at a glance which tables drifted overnight and which broke. Drill into a table for a trend chart with anomaly markers. Notifications fire when a test fails.
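
One simple way to picture a volume monitor is a z-score check of the latest row count against the trailing 14 days. The sketch below is an assumed approach for illustration, not a description of how TestGen detects anomalies. The function names, the 3-sigma threshold, and the input shape are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def volume_anomaly(daily_row_counts: List[int], window: int = 14, z_threshold: float = 3.0) -> bool:
    """Flag the latest row count if it sits far outside the trailing window.

    Assumes one count per day, oldest first, with the latest count last.
    """
    if len(daily_row_counts) < 3:
        return False  # not enough history to judge
    history = daily_row_counts[-(window + 1):-1]  # up to `window` days before the latest
    latest = daily_row_counts[-1]
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is worth a look.
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold


def drifted_tables(counts_by_table: Dict[str, List[int]]) -> List[str]:
    """Roll anomalies up by table: names of tables whose latest load looks off."""
    return sorted(name for name, counts in counts_by_table.items() if volume_anomaly(counts))
```

A freshness monitor could follow the same pattern, comparing hours since the last update against the table's usual refresh interval.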

See TestGen run against your data

Stand up open-source TestGen against your own schema in 15 minutes. Free. No vendor lock-in.