Failing Your Way to Success with DataOps

Have you ever failed at work? Most people have done something cringeworthy at some point. I have plenty of funny stories to tell. There's the time I set off the fire alarm on my first day of a new management job. Perhaps even more notable are the epic fails. I once worked at a company where, on a near-daily basis, I found myself on the phone with an angry customer. There was another data error. If it wasn't fixed immediately, the customer would find a different vendor. My reputation and career were constantly on the line. Not fun. At first, I worried that the problem was me. I always thought of myself as a dependable person who gets the job done. I wasn't used to coming up short week after week.

Fail Faster

Over the decades, I have come to understand that good people underperform when operating within a bad system. If the data team can't deliver quickly enough or suffers from abominable quality, the culprit is likely the methodologies the team is using. Some companies become overly dependent on a star employee whom they drive to work long hours. When people work in teams, task coordination and the elimination of process inefficiencies are far more consistent and impactful than individual ability or heroism.

In the industrial design domain, attitudes toward failure have matured. The adage “fail faster,” attributed to David Kelley of IDEO, expresses how failure fuels innovation. The phrase suggests that speed is the point, but it is more generally a call to minimize the consequences and cost of failure. When failure is cheap, designers feel freer to take big risks, and that risk-taking drives the creativity and innovation that power progress and growth.

Operations managers widely apply the axiom “fail fast” to lean manufacturing. As a product progresses through a manufacturing process, its cost of goods sold increases. Every factory manager knows it is much less costly to screen out a faulty component prior to assembly than to invest in mid- or post-production rework.

Failing Faster in Data Analytics

When data professionals apply these same principles to data analytics, they can reap the rewards of lower costs and higher creativity that we see in industrial manufacturing. In the data industry, the application of these methodologies across the data lifecycle is called DataOps. In summary, DataOps reduces the cost and consequences of data errors and data-analytics bugs. You could say that “failing faster” is the common theme that unifies all aspects of DataOps.

Data Pipeline Errors

Data operations consist of a set of data sources that progress through a series of processing steps: for example, integration, cleaning, processing, transformation and publication (as charts, graphics and reports). In our recent DataOps survey, 30% of respondents reported more than 11 errors per month. In a significant number of enterprises, data errors are regularly flowing into user analytics with potentially catastrophic results.

DataOps places tests at each stage of the data-operations pipeline. It checks and monitors data at its source before it enters the pipeline. Does it conform to business logic? Does it fall within statistical norms? DataOps also tests inputs and outputs at each stage of transformation. If a fork or join fails, DataOps will catch it before it corrupts analytics.

DataOps implements automated statistical and process controls on data operations, much like a manufacturing plant. If data flowing through a multi-stage data-analytics pipeline violates business logic or statistical norms, then tests alert the data team or, in an extreme case, stop the flow of data.
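To make this concrete, here is a minimal sketch of what such a stage-level test might look like in Python. The column names, thresholds and baseline statistics are hypothetical; in practice, the statistical norms would be derived from historical pipeline runs.

```python
import pandas as pd

def test_incoming_orders(df: pd.DataFrame) -> list[str]:
    """Return a list of failure messages; an empty list means the data passes."""
    failures = []

    # Business-logic checks: conditions the domain says must always hold.
    if (df["order_total"] < 0).any():
        failures.append("order_total contains negative values")
    if df["customer_id"].isna().any():
        failures.append("customer_id has missing values")

    # Statistical-process-control check: flag a row count far outside
    # the historical norm (here, more than three standard deviations).
    historical_mean, historical_std = 10_000, 750  # assumed baseline
    if abs(len(df) - historical_mean) > 3 * historical_std:
        failures.append(f"row count {len(df)} outside statistical norms")

    return failures

failures = test_incoming_orders(pd.read_csv("orders.csv"))
if failures:
    # Alert the data team or, in an extreme case, stop the flow of data.
    raise RuntimeError("Data tests failed: " + "; ".join(failures))
```

A check like this runs automatically at each stage, so a bad input is caught at the point of entry rather than after it has corrupted downstream analytics.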

Development Errors

In many organizations, new analytics are developed directly on operational systems. DataOps instead allows data analysts to create development sandboxes. Built with virtualization technology, sandboxes closely match the target production environment, minimizing unexpected regressions. Sandboxes inherit analytics components and automated orchestrations, along with their associated tests, so data scientists can better leverage each other's work. If a development risk doesn't pay off, a data scientist can abandon the sandbox and revert to the baseline analytics code and configuration.

Integration Errors

DataOps employs continuous integration methods like those that enable leading software organizations to deploy millions of code releases per year. Work in a development sandbox stays isolated from production until it progresses through an automated release workflow that includes unit, functional, integration and other tests. Tests catch issues before analytics migrate into data operations.
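As an illustration, promotion through such a workflow might be gated on automated tests like the following pytest-style sketch. The transformation function and expected values are hypothetical stand-ins for real analytics code.

```python
import pandas as pd

# Hypothetical transformation under test: a step in the analytics code
# that a sandbox is attempting to promote into production.
def deduplicate_customers(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the first record seen for each customer_id."""
    return df.drop_duplicates(subset="customer_id", keep="first")

def test_deduplicate_removes_repeats():
    raw = pd.DataFrame({"customer_id": [1, 1, 2], "region": ["N", "N", "S"]})
    result = deduplicate_customers(raw)
    assert list(result["customer_id"]) == [1, 2]

def test_deduplicate_preserves_columns():
    raw = pd.DataFrame({"customer_id": [1], "region": ["N"]})
    result = deduplicate_customers(raw)
    assert set(result.columns) == {"customer_id", "region"}
```

Running a suite like this (for example, with pytest) on every proposed change means a broken transformation fails the release workflow instead of reaching production.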

Product Management Errors

Developing analytics that no one wants or needs is a costly product management failure. When DataOps teams implement Agile Development, they create short-term value by iterating rapidly. It's much easier to be correct about what feature you need this week than about what you will need 24 months from now. Agile teams also receive immediate feedback on what they have produced, so they can course-correct. Often, users don't know what they want until they see it. Teams can be much more innovative when they create a rough, approximate solution and keep iterating on it.

Failing Your Way to Success

DataOps focuses on the identification and elimination of data pipeline, development, integration and product management errors. When DataOps minimizes the cost and consequences of errors, data analysts are free to work more closely with business users. Together they can play with ideas, try new things, and act on hunches. When this process plays out, it unlocks tremendous creativity. With failures minimized and identified early, DataOps enables enterprises to deliver on the promise of leveraging data for competitive advantage. We have seen many companies use DataOps methods to vault forward, taking a leadership position in the market. Fail faster using DataOps.
