Question: What is something the data industry is missing?
I think it’s observability-led DataOps. I’ve come to believe that we, as an industry, will not change how people run the systems they have already built. They are already acting as heroes, and what they get for it is pain, unhappiness, and poor results.
The first step to enlightenment
The first step in solving that pain is to observe what’s happening across your data and analytics ‘estate’: stick little thermometers at various points in the process and measure. While your data and tools are running in production, have those thermometers check a few things, for example:
- Did the production process start? End? Is it late?
- Is the raw data correct?
- Is the integrated data right?
- Is the model still predicting accurately?
- Is the dashboard showing accurate information?
- Will the server run out of space?
- And a few more things.
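As a rough illustration of what a few of these thermometers could look like, here is a minimal Python sketch. Every name in it (`CheckResult`, `check_run_timing`, `check_required_fields`, `check_disk_space`) and every threshold is hypothetical and chosen only for this example; it is not the API of any particular observability tool.

```python
import shutil
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class CheckResult:
    """One thermometer reading: which check ran, whether it passed, and why."""
    name: str
    passed: bool
    detail: str


def check_run_timing(started_at: Optional[datetime],
                     ended_at: Optional[datetime],
                     deadline: datetime,
                     now: datetime) -> CheckResult:
    """Did the production process start? End? Is it late?"""
    if started_at is None:
        return CheckResult("run_timing", False, "process never started")
    if ended_at is None:
        # Still running: only a problem once the deadline has passed.
        late = now > deadline
        detail = "still running past deadline" if late else "still running"
        return CheckResult("run_timing", not late, detail)
    late = ended_at > deadline
    detail = "finished late" if late else "finished on time"
    return CheckResult("run_timing", not late, detail)


def check_required_fields(rows: List[Dict[str, object]],
                          required: List[str]) -> CheckResult:
    """Is the raw data correct? Here: no required field is missing or None."""
    bad = sum(1 for row in rows
              if any(row.get(field) is None for field in required))
    detail = f"{bad} of {len(rows)} rows missing a required field"
    return CheckResult("required_fields", bad == 0, detail)


def check_disk_space(path: str, min_free_fraction: float) -> CheckResult:
    """Will the server run out of space? Compare free space to a threshold."""
    usage = shutil.disk_usage(path)
    free_fraction = usage.free / usage.total if usage.total else 0.0
    detail = f"{free_fraction:.0%} free, threshold {min_free_fraction:.0%}"
    return CheckResult("disk_space", free_fraction >= min_free_fraction, detail)
```

Each function returns a reading instead of raising an error, so the checks can run alongside the existing production process without changing it; the readings can then be logged and reviewed later.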
You are going to be surprised at what you see in that information. And that data is going to drive your behavior and your team’s.
You can’t improve what you can’t measure.
And what you will see is that those checks, tests, and monitors find a variety of problems. Maybe your data providers are giving you bad data. Maybe the code your data engineer is running is faulty. Perhaps the data scientist needs to tune up their model. It could be that your analysts messed up the Tableau dashboard. Or maybe everything is perfect but your dashboards are not being used, so you can take them down and save some money.
So the idea of observability-first DataOps is to stick a bunch of thermometers all over your data pipelines, models, visualizations, and tech stack, then measure all that stuff. Then look at the data and find where the bottlenecks and errors are. That gives you the evidence to justify the work of fixing those problems through automation and testing.
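To make the "look at the data and find where the errors are" step concrete, here is a small Python sketch that counts failed readings per stage. The function name `failures_by_stage`, the stage labels, and the sample readings are all made up for illustration; real readings would come from whatever checks you have running in production.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def failures_by_stage(readings: Iterable[Tuple[str, bool]]) -> List[Tuple[str, int]]:
    """Count failed readings per stage, most failures first.

    Each reading is a (stage, passed) pair from one thermometer.
    """
    counts = Counter(stage for stage, passed in readings if not passed)
    return counts.most_common()


# Hypothetical sample readings collected over a few production runs.
sample_readings = [
    ("raw_data", False),
    ("raw_data", False),
    ("integration", True),
    ("model", False),
    ("dashboard", True),
    ("raw_data", True),
]

print(failures_by_stage(sample_readings))
# prints [('raw_data', 2), ('model', 1)]
```

In this made-up sample, the raw data stage fails most often, so that is where the evidence points for the first round of automation and testing.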
Why observability-first DataOps? Data convinces data people. Otherwise, our experience is that people will continue to “hero out.” They will build these systems where they rush to get something done, are afraid to change it once it’s running, and live with a constant stream of problems from their customers until frustration makes them quit.
Don’t change what you have. Just observe it, get evidence and incrementally improve.
I believe that data can convince our team members to make better decisions. For this approach to work, you need measurements throughout the entire process and an understanding of what’s happening with your data estate as it runs in production. That way, you know when there are problems or opportunities ahead.