Failing Your Way to Success with DataOps

Jan 21, 2020 | Blog, DataOps Principles

Have you ever failed at work? Most people have done something cringeworthy at some point. I have plenty of funny stories to tell, like the time I set off the fire alarm on my first day of a new management job. Even more notable are the epic fails. I once worked at a company where, on a near-daily basis, I would find myself on the phone with an angry customer. There was another data error, and if it wasn’t fixed immediately, the customer would find a different vendor. My reputation and career were constantly on the line. Not fun. At first, I worried that the problem was me. I had always thought of myself as a dependable person who gets the job done, and I wasn’t used to coming up short week after week.

Fail Faster

Over the decades, I have come to understand that good people underperform when operating within a bad system. If the data team can’t deliver quickly enough or suffers from abominable quality, the culprit is likely the methodologies that they are using. Some companies become overly dependent on a star employee whom they drive to work long hours. When people work in teams, task coordination and the elimination of process inefficiencies are far more consistent and impactful than individual ability or heroism.

In the industrial design domain, attitudes toward failure have matured. The adage “fail faster,” attributed to David Kelley of IDEO, expresses how failure fuels innovation. The phrase suggests that it is about speed, but it is more generally a call to minimize the consequences and cost of failure. Failing fast frees designers to take big risks, leading to the creativity and innovation that power progress and growth.

Operations managers widely apply the axiom “fail fast” to lean manufacturing. As a product progresses through a manufacturing process, its cost-of-goods-sold increases. Every factory manager knows it is much less costly to screen out a faulty component prior to assembly than to invest in mid- or post-production rework.

Failing Faster in Data Analytics

When data professionals apply these same principles to data analytics, they can reap the rewards of lower costs and higher creativity that we see in industrial manufacturing. In the data industry, the application of these methodologies across the data lifecycle is called DataOps. In summary, DataOps reduces the cost and consequences of data errors and data-analytics bugs. You could say that “failing faster” is the common theme that unifies all aspects of DataOps.

Data Pipeline Errors

Data operations consist of a set of data sources that progress through a series of processing steps, for example, integration, cleaning, processing, transformation and publication (as charts, graphics and reports). In our recent DataOps survey, 30% of respondents reported more than 11 errors per month. In a significant number of enterprises, data errors are regularly flowing into user analytics with potentially catastrophic results.

DataOps places tests at each stage of the data-operations pipeline. It checks and monitors data at its source before it enters the pipeline. Does it conform to business logic? Does it fall within statistical norms? DataOps also tests inputs and outputs at each stage of transformation. If a fork or join fails, DataOps will catch it before it corrupts analytics.
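These input checks can be sketched in a few lines of code. The example below is a minimal illustration, not a DataKitchen API: it assumes a hypothetical orders feed with `order_id`, `amount` and `region` fields, and the business rules and numeric ranges are invented for the sketch.

```python
def validate_input(rows):
    """Return a list of error messages; an empty list means the batch passes."""
    errors = []
    valid_regions = {"NA", "EMEA", "APAC"}  # business logic: known regions only
    for i, row in enumerate(rows):
        if row.get("order_id") is None:          # business logic: key must exist
            errors.append(f"row {i}: missing order_id")
        amount = row.get("amount", 0)
        if not (0 < amount < 1_000_000):         # statistical norm: plausible range
            errors.append(f"row {i}: amount {amount} outside expected range")
        if row.get("region") not in valid_regions:
            errors.append(f"row {i}: unknown region {row.get('region')!r}")
    return errors
```

Running such a gate before data enters the pipeline is what lets the team reject a bad batch at the source, where fixing it is cheapest.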

DataOps implements automated statistical and process controls on data operations, much like a manufacturing plant. If data flowing through a multi-stage data-analytics pipeline violates business logic or statistical norms, then tests alert the data team or, in an extreme case, stop the flow of data.
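A statistical process control check like the one described can be sketched as follows. The control limits (two- and three-sigma bands) and the "warn versus stop" policy are assumptions for illustration; any real deployment would tune these to its own pipeline.

```python
import statistics

def check_control_limits(history, latest, sigmas=3.0):
    """Compare the latest batch metric (e.g., daily row count) against
    control limits derived from recent history.
    Returns 'ok', 'warn', or 'stop'."""
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    deviation = abs(latest - mean)
    if deviation > sigmas * stdev:
        return "stop"    # extreme violation: halt the pipeline run
    if deviation > 2 * stdev:
        return "warn"    # drifting out of norm: alert the data team
    return "ok"
```

The "stop" outcome corresponds to the extreme case in the text, where the pipeline halts rather than letting suspect data corrupt downstream analytics.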

Development Errors

In many organizations, new analytics are developed directly on operational systems. DataOps instead allows data analysts to create development sandboxes. With virtualization technology, sandboxes closely match the target production environment, minimizing unexpected regressions. Sandboxes inherit analytics components and automated orchestrations, along with associated tests, so data scientists can better leverage each other’s work. If a data scientist takes a development risk that doesn’t pan out, they can abandon the sandbox and revert to the baseline analytics code and configuration.

Integration Errors

DataOps employs continuous integration methods like those that enable leading software organizations to deploy millions of code releases per year. Development sandboxes are isolated from production unless they progress through an automated release workflow that includes integration, functional, unit and other tests. Tests catch issues before analytics migrate into data operations.
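The release gate at the heart of this workflow can be sketched simply: changes are promoted to production only if every registered test passes. The test names and the `deploy` callable below are hypothetical placeholders, not part of any specific CI tool.

```python
def run_release_workflow(tests, deploy):
    """Run each (name, test_fn) pair; promote to production only if all pass.

    tests  -- list of (name, callable) pairs; each callable returns True/False
    deploy -- callable invoked only when every test passes
    """
    failures = [name for name, test_fn in tests if not test_fn()]
    if failures:
        return {"deployed": False, "failures": failures}
    deploy()
    return {"deployed": True, "failures": []}
```

For example, a sandbox whose integration test fails is blocked automatically, so the issue is caught before the analytics migrate into data operations.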

Product Management Errors

Developing analytics that no one wants or needs is a costly product management failure. When DataOps teams implement Agile development, they create short-term value by iterating rapidly. It’s much easier to be correct about what feature is needed this week than about what will be needed 24 months from now. Agile teams also receive immediate feedback on what they have produced, so they can course-correct. Often, users don’t know what they want until they see it. Teams can be much more innovative when they create a rough, approximate solution and keep iterating on it.

Failing Your Way to Success

DataOps focuses on the identification and elimination of data pipeline, development, integration and product management errors. When DataOps minimizes the cost and consequences of errors, data analysts are free to work more closely with business users. Together they can play with ideas, try new things, and act on hunches. When this process plays out, it unlocks tremendous creativity. With failures minimized and identified early, DataOps enables enterprises to deliver on the promise of leveraging data for competitive advantage. We have seen many companies use DataOps methods to vault forward into a leadership position in their market. Fail faster using DataOps.
