Since the term was coined, DataOps has expanded the way that people think about data analytics teams and their potential. 2020 was a huge year for industry acceptance of DataOps. Media mentions of DataOps are on track to increase 52% over the prior year. To date in 2020, DataKitchen has seen an additional 5K downloads of the DataOps Cookbook. Industrywide, searches on DataOps (and derivatives) are up 500% over the past three years. Gartner analyst inquiries related to DataOps are up over 1000%.

Looking ahead to 2021, we see signs of DataOps adoption in large enterprises, expansion of DataOps into new domains, new players, and a radical rethinking of how work gets done in data organizations. Here, then, are our top six DataOps trends for 2021.

Enterprise DataOps Transformation

We started tracking the DataOps enterprise software ecosystem back in 2017. Today there are over 75 vendors of enterprise software that market a tool addressing some aspect of DataOps. As the maturity of the ecosystem grows, enterprises have begun to struggle with implementation, asking questions such as how to choose a first DataOps project, how to formulate a long-term DataOps strategy, how to assess DataOps organizational readiness, and how to structure a DataOps team and organization. These challenges are a natural part of any transformative methodology that demands a complete rethinking and replanning of workflows. In 2021, we will see a rise in services that address these concerns. The early adopters have crossed the DataOps chasm and can offer guidance to the early majority. DataOps experts will offer services that enable and support Enterprise DataOps Transformation that automates workflows and relieves bottlenecks associated with analytics development and operations.

Data Observability

Complex data operations pipelines integrate data from dozens or hundreds of disparate sources. Data then traverses data centers, toolchains, and distributed teams on its way to producing value for the enterprise. The advent of self-service analytics tools adds another dimension to analytics workflows, with ad hoc data pipelines instantiated on demand. The evolving topologies that comprise data operations pipelines present daunting challenges. How can the data team prevent non-conforming data or other errors from corrupting analytics that serve as the backbone of enterprise operations? In 2021, we anticipate broad investment in data observability as companies leverage automation to address the mammoth challenge of determining the health of centralized and decentralized data pipelines. Data observability integrates testing, monitoring, and real-time alerts into data pipeline orchestration so that data organizations can view data operations at a glance and optionally drill down into the details.
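The core idea can be illustrated with a minimal sketch. The function names and checks below are hypothetical, not a real DataOps product's API; the point is that quality tests run as part of the pipeline itself and surface failures as alerts rather than waiting for a downstream consumer to notice bad data:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class CheckResult:
    """Outcome of one observability check on a pipeline stage's output."""
    name: str
    passed: bool

def run_observability_checks(
    rows: list[dict],
    checks: dict[str, Callable[[list[dict]], bool]],
) -> list[CheckResult]:
    """Run each data check against a stage's output and collect results."""
    return [CheckResult(name, check(rows)) for name, check in checks.items()]

# Simulated stage output containing one non-conforming row (negative amount).
rows = [{"order_id": 1, "amount": 25.0}, {"order_id": 2, "amount": -3.0}]
checks = {
    "row_count_nonzero": lambda r: len(r) > 0,
    "amounts_nonnegative": lambda r: all(x["amount"] >= 0 for x in r),
}

results = run_observability_checks(rows, checks)
failures = [r.name for r in results if not r.passed]
# A real orchestrator would raise a real-time alert here; we just report.
print(failures)  # -> ['amounts_nonnegative']
```

In a production orchestration, each stage would run checks like these automatically, and a failure would block promotion of the data and page the responsible team.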

DataGovOps

Busy data scientists sometimes treat governance as an afterthought. With the pressure to cope with an endless stream of analytics updates and the exponential growth of data, there isn't always time to update and maintain the business glossary, data catalog, and data lineage. Data quality assessments are important, but they are time-consuming and resource-intensive, and by the time they are complete, they reflect only an outdated snapshot. Governance is the next major domain ripe for DataOps automation. In 2021, companies will begin embracing DataGovOps (governance-as-code), the idea that governance can be implemented as a set of automated workflows that execute as part of analytics continuous deployment.

A prime example of DataGovOps is DataOps "Kitchens": self-service analytics development environments created on demand with built-in background processes that monitor governance. If a user violates policies by adding a table to a database or exporting sensitive data from the sandbox environment, an automated alert can be forwarded to the appropriate data governance team member. The code and logs associated with development are stored in source control, providing a thorough audit trail. DataGovOps actively promotes the safe use of data with automation that improves governance while freeing data analysts and scientists from onerous manual tasks.
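The table-creation alert described above can be sketched as a simple governance-as-code check. This is an illustrative example, not DataKitchen's implementation; the function and message format are hypothetical:

```python
def governance_diff(allowed_tables: set[str], observed_tables: set[str]) -> list[str]:
    """Compare the tables observed in a sandbox against governed policy.

    Returns one alert message per table created outside the approved set;
    a real deployment would forward these to the governance team instead
    of returning them.
    """
    violations = sorted(observed_tables - allowed_tables)
    return [f"ALERT: unapproved table '{t}' created in sandbox" for t in violations]

# A background process snapshots the sandbox schema and runs the check.
alerts = governance_diff(
    allowed_tables={"orders", "customers"},
    observed_tables={"orders", "customers", "tmp_export"},
)
print(alerts)
```

Because the check is code, it runs on every deployment and its results land in source control alongside the analytics, producing the audit trail automatically.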

Rise of the DataOps Engineer

As 2020 stretches into 2021, expect to see increased attention on the DataOps engineer, a relatively new role in many data organizations. Following in the footsteps of DevOps Engineers in the software industry, we predict that the DataOps Engineer will become one of the most sought-after and highly compensated members of the data analytics team.

DataOps Engineers implement the continuous deployment of data analytics. They give data scientists tools to instantiate development sandboxes on demand. They automate the data operations pipeline and create platforms used to test and monitor data from ingestion to published charts and graphs, promoting observability. Through tools automation, the DataOps Engineer eliminates data lifecycle bottlenecks, which sap data team productivity. A DataOps Engineer who understands how to automate and streamline data workflows can increase the productivity of a data team by orders of magnitude.

Self-Service Sandboxes

The key to maximizing the productivity of data scientists is to alleviate the bottlenecks in their development workflows. For example, it may take 10-20 weeks for a data scientist to create a development environment when a project begins. We envision a trend in 2021 toward automation of self-service sandboxes that reduces the time frame for environment creation from months to days or hours. We call these on-demand environments "Kitchens," and they include everything a data analyst or data scientist needs in order to create analytics: complete toolchain, reusable microservices, security, prepackaged data sets, integration with team workflows, continuous deployment automation, observability, and governance. The environment is "self-service" because it can be created on demand by any authenticated user. Looking forward, enterprises that stubbornly insist on creating environments manually will find themselves unable to match the agility of competitors.
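The self-service pattern can be sketched as follows. The function, the spec fields, and the branch-naming convention are all hypothetical illustrations of the idea, not a real product API:

```python
def create_kitchen(user: str, base_config: dict, authenticated_users: set[str]) -> dict:
    """Instantiate a self-service sandbox ('Kitchen') for an authenticated user.

    The sandbox is derived from a governed base configuration, so every
    environment ships with the same toolchain, data sets, and controls.
    """
    if user not in authenticated_users:
        # Self-service does not mean ungoverned: only authenticated
        # users may create environments.
        raise PermissionError(f"{user} is not authorized to create a sandbox")
    return {
        "owner": user,
        "toolchain": list(base_config["toolchain"]),
        "datasets": list(base_config["datasets"]),
        "branch": f"kitchen/{user}",  # isolated source-control branch per sandbox
    }

base_config = {"toolchain": ["sql", "python"], "datasets": ["sales_sample"]}
kitchen = create_kitchen("ana", base_config, authenticated_users={"ana"})
print(kitchen["branch"])  # -> kitchen/ana
```

The point of the sketch is that environment creation becomes a single governed function call instead of a weeks-long ticket queue.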


Expansion of the DataOps Tools Landscape

Some organizations try to implement DataOps from scratch using DevOps and workflow tools; however, this approach overlooks the unique requirements of data analytics automation: management of the data factory, the creative managed chaos of self-service analytics, wide divergence in toolchains, meta-orchestrations spanning toolchains, complexity in dev environments, data observability, and the predisposition against software complexity among data professionals. Expect 2021 to bring new DataOps tools and solutions to the market that address the unique needs of data scientists and analytics creation. We expect to see DataOps solutions attracting unprecedented venture investment and existing players adding features that tailor their offerings to the data analytics market. Should be fun to watch.

 

2020 was a tumultuous year. Strategic plans that were approved in late 2019 had to be scrapped amidst tremendous economic uncertainty. As data professionals, we know that data-driven decision making is now more critical than ever. Data analytics can serve as a company's most valuable tool when navigating a rapidly evolving environment, and data-analytics teams are among its most valuable assets. DataOps helps data organizations increase productivity, improve responsiveness, and deliver insights that shed light on the challenges ahead.
