← Back to Blog

What is a Data Mesh?

Written by DataKitchen Marketing Team on August 3, 2021

Data Mesh
What is a Data Mesh?

The data mesh design pattern breaks giant, monolithic enterprise data architectures into subsystems or domains, each managed by a dedicated team. With an architecture comprised of numerous domains, enterprises need to manage order-of-operations issues, inter-domain communication, and shared services like environment creation and meta-orchestration. A DataOps superstructure provides the foundation to address the many challenges inherent in operating a group of interdependent domains. DataOps helps the data mesh deliver greater business agility by enabling decentralized domains to work in concert.

This post (1 of 5) is the beginning of a series that explores the benefits and challenges of implementing a data mesh and reviews lessons learned from a pharmaceutical industry data mesh example. But first, let’s define the data mesh design pattern.

The past decades of enterprise data platform architectures can be summarized in 69 words. Note, this is based on a post by Zhamak Dehghani of Thoughtworks.

See the pattern? It’s no fun working in data analytics/science when you are the bottleneck in your company’s business processes. A barrage of errors, missed deadlines, and slow response time can overshadow the contributions of even the most brilliant data scientist or engineer. The problem is not “you.” It lies somewhere in between your enterprise data platform architecture and the enterprise’s business processes. Below are some reflections upon the failure of modern enterprise architectures to deliver on data analytics agility and reliability:

Figure 1: The organization structure builds walls and barriers to change. Source: Thoughtworks

Introduction to Data Mesh

The data mesh addresses the problems characteristic of large, complex, monolithic data architectures by dividing the system into discrete domains that are managed by smaller, cross-functional teams. Data mesh proponents borrow the term “domain” from the software engineering concept of “domain-driven design (DDD),” a term coined by Eric Evans. DDD divides a system or model into smaller subsystems called domains. Each domain is an independently deployable cluster of related microservices which communicate with users or other domains through modular interfaces. Each domain has an important job to do and a dedicated team – five to nine members – who develop an intimate knowledge of data sources, data consumers and functional nuances. The domain team adopts an entrepreneurial mindset, viewing the domain as if it were a product and users or other domains as customers. The domain includes data, code, workflows, a team, and a technical environment. Data mesh applies DDD principles, proven in software development, to data analytics. If you’ve been following DataKitchen at all, you know we are all about transferring software development methods to data analytics.

The organizational concepts behind data mesh are summarized as follows.

Benefits of a Domain

A well-implemented domain has certain attributes that offer benefits to the customers or domain users:

Organizing a data architecture into domains is a first-order decision that drives organizational structure, incentives and workflows that influence how data consumers use data. Domains change the focus from the data itself to the use cases for the data. The customer use cases in turn drive the domain team to focus on services, service level agreements (SLA) and APIs. Domains promote data decentralization. Instead of relying on a centralized team, where the fungibility of human resources is an endless challenge, the domain team is dedicated and focused on a specific set of problems and data sets. Decentralization promotes creativity and empowerment. In the software industry, there’s an adage, “you build it, you run it.” Clear ownership and accountability keep the data teams focused on making their customers successful. As an organizing principle, domains favor a decentralized ecosystem of interdependent data products versus a centralized data lake or data warehouse with all its inherent bottlenecks.

If you have worked in the data industry, you already know that while decentralization offers many benefits it also comes with a cost. We’ll cover some of the potential challenges facing data mesh enterprise architectures in our next blog,Use DataOps with Your Data Mesh to Prevent Data Mush.

DataKitchen Marketing Team

The DataKitchen marketing team curates industry news, resources, and thought leadership on DataOps, data quality, and data observability.