A Data Prediction for 2025

We’ve read many predictions for 2023 in the data field: they cover excellent topics like data mesh, observability, governance, lakehouses, LLMs, etc. Here at DataKitchen, we wanted to take a different approach: look at a three-year horizon. What will the world of data tools be like at the end of 2025? The crazy idea is that data teams are beyond the boom decade of “spending extravagance” and need to focus on doing more with less. This will drive a new consolidated set of tools the data team will leverage to help them govern, manage risk, and increase team productivity.

 

What will exist at the end of 2025?

A combined, interoperable suite of tools for data team productivity, governance, and security for large and small data teams.

  • DataOps Automation (Orchestration, Environment Management, Deployment Automation)
  • DataOps Observability (Monitoring, Test Automation)
  • Data Governance (Catalogs, Lineage, Stewardship)
  • Data Privacy (Access and Compliance)
  • Data Team Management (Projects, Tickets, Documentation, Value Stream Management)

 

What are the drivers of this consolidation?

  • Recession: the party is over. Central IT Data Teams focus on standards, compliance, and cost reduction. They are moving away from direct connection with ‘the business’ and toward data enablement rather than direct value delivery. Their software purchasing behavior will align with enabling standards for line-of-business data teams who use various tools that act on data.
  • Enterprises are more challenged than ever by their data sprawl, so reducing risk and lowering costs drive software spending decisions. Driving new opportunities and expansion takes a back seat.
  • Vendors addressing risk, cost, productivity, and governance often overlap. The resulting vendor sprawl leaves enterprises to integrate and rationalize their own approach.
  • Ultimately, there will be an interoperable toolset for running the data team, just as there will be a more focused toolset (ELT/Data Science/BI) for acting upon data. As an analogy, the DevOps space has seen consolidation of code storage, CI/CD, team workflow, value stream management, testing, and other tools into one platform. And the tools for acting on data are consolidating: Tableau does data prep, Alteryx does data science, Qlik joined with Talend, etc.
  • It’s code -> governance, not governance -> code. Most data governance tools today start with the slow, waterfall building of metadata with data stewards and then hope to use that metadata to drive code that runs in production. In reality, the ‘active metadata’ is just a written specification for a data developer to write their code. Modern tools like dbt auto-generate lineage and catalogs from the code, giving a more accurate and timely set of governance metadata.
  • Data Observability is booming in popularity, with a dozen startups in the space. This is because data teams realize that untested production systems produce error-prone output, their business customers want more trust in the data, and they need to avoid time-consuming rework.
  • Data Infrastructure cost management (or FinOps) will become necessary. Those big cloud bills will come under scrutiny, and the CFO will ask hard questions: What work is important? Can we do this cheaper? Data teams will have to provide the data needed to understand the cost per data journey, by team.
  • There is an opportunity to enable Data Stewards to go beyond their current passive role in governance definition to a more active role in helping improve data testing and observability by reviewing and configuring production data tests.
  • Ultimately, data teams want to change their production data systems quickly and with low risk. They need a complete ‘deployable unit’ that bundles ETL/Data Science/BI code with changes to the catalog, lineage, and privacy/security metadata. Can that be done with a minimal set of vendor tools?
  • The code, configuration, and metadata about your data are the Intellectual Property of your data teams. The vendor who owns that database of record in the enterprise will drive lots of value and ‘stickiness.’
  • The explosion of tools that act on data: there are 50 ELT or ETL tools, 50 data science tools, and 50 data visualization tools. Most companies have several types of tools in production. Most companies are moving to the cloud and have their own proprietary data tools suite. Our problems are simple: too many tools, too many errors, and insufficient value delivered.
  • Data Privacy and the risk of data misuse are part of the compliance and standards central data teams need to enforce.
  • Data teams are underperforming; they will look to the changes that IT organizations have adopted with Agile and DevOps over the past five years and up their game with an Agile, DataOps approach.
  • The changing role of central IT data teams in large companies: large companies typically have a ‘corporate’ data function. These central teams were once the center of the data universe at many companies. However, over the past decade, line-of-business data teams have carried the burden of insight generation for the business. The central teams are now the hub, helping to enable many spokes. With a hub-and-spoke or data enablement model, central teams are about guardrails. Each line-of-business spoke has the freedom to make its own ETL/BI/data science tool choice while delivering value to its data customer. The central teams will become raw data loading functions, with required standards for governance, security, privacy, observability, testing, and production operations.
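
To make the ‘code -> governance’ point above concrete, here is a minimal sketch of lineage derived directly from transformation code rather than from hand-maintained metadata. It assumes models reference each other with a dbt-style {{ ref('...') }} call; the model names, the SQL, and the helper functions build_lineage and upstream are hypothetical and only illustrate the idea. This is not how dbt itself is implemented.

```python
import re

# Hypothetical models: name -> SQL text. Only the {{ ref('...') }} convention
# is borrowed from dbt; everything else here is made up for illustration.
MODELS = {
    "stg_orders": "select * from raw.orders",
    "stg_customers": "select * from raw.customers",
    "customer_orders": """
        select c.id, count(o.id) as order_count
        from {{ ref('stg_customers') }} c
        join {{ ref('stg_orders') }} o on o.customer_id = c.id
        group by c.id
    """,
    "revenue_report": "select * from {{ ref('customer_orders') }}",
}

# Matches {{ ref('name') }} or {{ ref("name") }} and captures the model name.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def build_lineage(models):
    """Map each model to the sorted list of models it reads from directly."""
    return {
        name: sorted(set(REF_PATTERN.findall(sql)))
        for name, sql in models.items()
    }


def upstream(model, lineage):
    """Return every direct and indirect dependency of a model, sorted."""
    seen = set()
    stack = list(lineage.get(model, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(lineage.get(parent, []))
    return sorted(seen)


lineage = build_lineage(MODELS)
print(lineage["customer_orders"])
print(upstream("revenue_report", lineage))
```

Because the lineage is recomputed from the code on every run, it cannot drift from what actually executes in production, which is the argument for letting code drive governance metadata.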

 

Why would this consolidation not happen?

  • The software people and DevOps tools will take over the data space. Data Teams and Software/IT want to be agile, but as more software people work in data teams, they will consolidate on a single toolset rooted in software developers’ tools.
  • The cloud vendors are building data capabilities at a frantic rate, and their ‘walled garden’ will make it hard for any vendor to compete on tools for the data team. Azure is ahead in this area.
  • Like Oracle in the 2000s, Snowflake and Databricks will win on being the everything platform for data analytics.

 

Conclusion

We are entering a tough few years economically. We are heading into a ‘data winter,’ just like the multi-year crunch the software field had 20 years ago. Perhaps out of this will come a data culture obsessed with creating value for its customers instead of adopting the latest cool tech buzzword, a culture that tests, iterates, and continuously improves efficiency.

Enterprise data teams are still challenged with their data sprawl and making their customers happy. They can’t just spend millions on new tech and hope it will deliver value next year. So the prime drivers will be reducing risk and lowering costs. Driving new opportunities and expansion takes a back seat. The prediction is this: those challenges create an opportunity to create a single integrated set of tools rooted in DataOps principles to help these teams govern, manage risk, and increase team productivity.
