The data mesh challenge: closing the gap between strategy and operation at scale
"A new world is not built by discarding the old, but by reshaping its enduring truths into novel forms." — Anonymous
“We chose data mesh and a decentralized data strategy, and we’ve actually made some progress….but our implementation isn’t exactly data mesh. The technology available just isn’t ready for that.” We heard this said in a meeting with a global bank just a few weeks ago. In fact, I have heard some version of this, almost as an apology, from hundreds of data leaders of some of the largest and most critical companies and governments of the global economy. These leaders are committed to unlocking data potential in a decentralized way, but they can’t quite get there, certainly not at scale.
Since founding Nextdata, my mission has shifted from being the architect and champion of data mesh to providing a transformative technology to make data mesh at scale a reality for everyone. This shift has given my team and me a new perspective and experience that I will share with you here.
First, I would like to unpack the radical shifts in the data operating model that companies have to grapple with to adopt data mesh; second, examine the antipatterns, or partial data mesh implementations, that have emerged from the difficulties of adopting a new operating model; and lastly, offer a new approach to technology to help organizations reach escape velocity—overcoming the collective inertia of the legacy data stack, legacy processes and legacy culture—and implement data mesh at scale.
Data mesh has been hard. But it doesn’t have to be.
The hard parts of the data mesh operating model
The data operating model is the blueprint for how an organization delivers value using its data. It includes the principles (assumptions), structure, processes, people, technology, and governance that enable the organization to operate and achieve its goals using its data.
Data mesh introduced a radical departure from how companies and teams have managed data for decades, proposing fundamental changes to their technology, processes, and organizational structure.
The incumbent operating model
Let’s take a minute and recall the data operating model of many organizations when data mesh arrived on the scene.
- Data responsibilities were confined to a centralized data platform team, selected for their experience and knowledge of data engineering technologies. This team very quickly became a bottleneck, often requiring weeks to months to make the right data, in the right shape, available for the right use case.
- The data platform team faced a never-ending task of centralizing the data on the next, and then the next, data platform — data warehouse, lake, lakehouse, and so on — so that the rest of the organization could use the data for analytics and machine learning.
- The data teams were building and feeding an ever-growing hairball of data pipelines that had led to unmanageable complexity, with many fragile and intricate steps to continuously reshape data, copy it, and move it around.
- Data governance was struggling to create trust in data and to control it without becoming a bottleneck to data innovation.
- Ultimately, there was a ubiquitous gap between data investments and returns, with no clear visibility into the data ROI.
The bottom line: the operating model had plateaued in its ability to deliver value from an organization's data investments, and growing complexity was making it worse.
The data mesh operating model
Looking at the shortcomings of the existing operating model, the root cause was evident. Some dearly held, multi-decade assumptions were clearly no longer true:
- Data must be centralized to be useful for analytics/ML/GenAI
- Data pipelines are the only way to accomplish data transformation and processing
- Data access must only be centrally governed through catalogs and storage infrastructure
In contrast, the solution according to data mesh became obvious:
- Break down the monolith and decentralize data ownership across business domains
- Stop building data pipelines and start building data products
- Automate governance as code and embed its execution into every single data product
- Create a new kind of self-service platform to enable domains to own, discover, and exchange data seamlessly across infrastructure and organizational boundaries.
Data mesh required a new operating model, a new set of principles, structure, processes and technology. In chapter one of my book, I summarized the shifts in the operating model across six dimensions:
- Organizationally, the data mesh operating model shifts from centralized ownership of data to a decentralized data ownership model, pushing ownership and accountability for the data (from production to consumption) back to the business domains where the data is produced or used.
- Architecturally and structurally, it shifts from collecting data through pipelines (pipes) and loading them into monolithic storage platforms (sinks), to a distributed mesh of live data products connected through standardized APIs and experiences.
- Technologically, it shifts from the fragmentation of data as files or tables that are dumb byproducts (“data exhaust”) of upstream compute, to the encapsulation of data, compute, semantics, and all other necessary pieces into an autonomous data product; self-orchestrating and self-governing.
- Operationally, it shifts data governance from a top-down centralized operational model with ongoing, costly human interventions to a federated model with computational policies embedded as code in each autonomous data product.
- Philosophically, it shifts from valuing data as water, oil, or other kinds of assets that move from one stockpile to another, to valuing it as a fully functioning, autonomous product that continuously delivers value.
- Infrastructurally, it shifts from building platforms as a set of low-level capabilities — data ingestion, data cataloging, data transformation, etc. — to a well-integrated experience oriented around autonomous data products as the first-class primitive: building and sharing data products, discovering data products, governing data products, and importantly, trusting and relying on data products.
This is a big shift: from an industrial design of pipelines to a quasi-organic design of dynamic, collaborating, complete cells, a.k.a. autonomous data products.
Data mesh antipatterns
In the face of a hard pivot, large organizations almost always take small steps. And as long as these steps are relentlessly consistent, accretive, and taken in the same direction, seismic transformations are within reach. But in the case of data mesh, like many other transformative paradigms, many companies have settled for a halfway solution, limited by the existing ways of doing data, and have missed out on the transformational power of true decentralization.
For many, data mesh has been reduced to data product management, with data products downgraded to nothing more than purportedly clean datasets with metadata, and deployed using existing data tools, with only marginal domain engagement.
A fun fact and a quick tangent: I asked ChatGPT to give me a few examples of misinterpreted big ideas, and to my surprise it listed data mesh as one of the top 10 examples. Here are a few fun ones, along with its descriptions:
- Darwin’s theory of evolution was misunderstood as “survival of the fittest”, reducing the concept to a simplistic view of strength and dominance and neglecting cooperation and adaptability.
- Agile methodology was interpreted as scrum rituals without embracing the mindset shift toward adaptive, collaborative, and value-driven software development.
- Human rights were interpreted selectively and politically, focusing on some rights while neglecting others based on geopolitical interests.
- Data mesh is a decentralized approach to data management, treating data as a product owned by domain teams. Many organizations reduce it to simply deploying data tools or rebranding data warehouses as “data products,” missing the cultural and organizational changes required for true decentralization.
Below are a few common ways that companies have partially adopted data mesh. I wonder if you recognize any of them.
Partial data mesh
The most prevalent antipattern of data mesh implementation is the partial adoption of the principles, where only a sliver of some principles has come to life, often with a radically reduced realization of data products, minimal involvement of domains, no computational federated governance, and some incremental improvement of infrastructure to outsource a small portion of data responsibilities to domains.
Let’s go deeper on some of these partial facets.
Data as a product was a new concept and the core idea of data mesh, and it has been reinterpreted through competing vendor agendas and data practitioner worldviews. It has been cherry-picked and reduced to static data and metadata: rows in catalogs, tables in a data lake, or views in a warehouse. This is how most analysts and other observers who aren’t hands-on with data platforms use data mesh terminology. This downscoping made data products “dumb,” and that is the root cause of the biggest gap in partial data mesh implementations. Data products should really be autonomous. That means they encapsulate the soup-to-nuts of data management to deliver domain-data-as-a-product to all their customers. For example, an autonomous data product consumes data, transforms data, serves data, governs data, makes it discoverable, and so on… all on its own.
Computational federated governance has also been left out, simply because without the encapsulation of autonomous data products there is no place to embed, execute, and monitor policies as code.
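To make “policies as code” a little more concrete, here is a minimal sketch in Python of what embedding policies inside a data product could look like. The names (Policy, DataProduct, serve) and the example policies are illustrative assumptions made for this post, not Nextdata APIs.

```python
# Minimal sketch: policies as code carried and executed by a data product.
# All names here are illustrative assumptions, not a prescribed implementation.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]
Policy = Callable[[Record], Record]  # a policy transforms or rejects a record


def mask_pii(record: Record) -> Record:
    """Example policy: mask a hypothetical 'email' field before serving."""
    masked = dict(record)
    if "email" in masked:
        masked["email"] = "***@***"
    return masked


def require_owner(record: Record) -> Record:
    """Example policy: refuse to serve records that carry no 'owner' attribute."""
    if "owner" not in record:
        raise ValueError("governance violation: record has no owner")
    return record


@dataclass
class DataProduct:
    """A data product that carries its own governance policies."""
    name: str
    records: List[Record]
    policies: List[Policy] = field(default_factory=list)

    def serve(self) -> List[Record]:
        # Every record passes through every embedded policy at serve time,
        # so governance executes inside the product, not after the fact.
        out = []
        for record in self.records:
            for policy in self.policies:
                record = policy(record)
            out.append(record)
        return out


customers = DataProduct(
    name="customer-profiles",
    records=[{"owner": "crm-team", "email": "jane@example.com"}],
    policies=[require_owner, mask_pii],
)
print(customers.serve())  # [{'owner': 'crm-team', 'email': '***@***'}]
```

The point of the sketch is the placement: the policies travel with the product and run every time it serves data, rather than being checked later by a central team.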
Without a rich and complete implementation of autonomous data products as the primitive of the data value chain, self-service data platforms remain as-is: a collection of low-level data technologies, sometimes referred to as the modern data stack, with some progress toward automation of infrastructure through templates.
Open loop data mesh
Data mesh creates a closed loop between operational and analytical systems, organized and structured by domains. The following diagram shows an example of this, where analytical data products consume data directly from operational systems and vice versa, in a continuous data loop.
In reality, very few organizations have been able to implement this closed loop. The open loop implementation of data mesh only contains static (not autonomous) data products built downstream and served from a lake or warehouse. This model lacks the domain-centric collaboration and data feedback loop between operational and analytical systems. This antipattern does not fix the long lead time between a business application changing and that change unlocking an ML/analytics business use case, as the two remain siloed, running in parallel.
The reasons for this antipattern are that (1) a complex hairball of data pipelines sits between the operational systems and the warehouse/lake (the centralized storage), and untangling these pipelines is a very costly refactoring job that offers little ROI on its own; and (2) domains haven't had the tools or expertise to participate in data product creation, which requires a mature, data-product-centric generalist technologist and a data product manager focused on value creation.
Many smaller and digital-native organizations are only just beginning to build a mesh that is in fact connected directly to the upstream transactional systems and closes the loop.
Proxy data mesh
Domain data ownership is the fundamental principle to enable data innovation at scale. It has been particularly difficult to achieve. Many organizations have resorted to a model where centralized data teams, the people with the technical know-how, remain the proxy for the business domains. And in some cases, business domains have built shadow, siloed data teams.
This takes us back to square one, the pre-data-mesh era, with domains working in data silos using their platform, data stack, and processes of choice, disconnected from the whole; or with data platform teams remaining centralized despite best attempts to mirror the business’s organizational structure. The business shadows the data team, and the platform shadows the business.
Building blocks of scalable data mesh
Escaping the gravity of legacy data operating models and rapidly evolving to data mesh at scale has occupied Nextdata for the last few years. For us, the answer lies in the essence of computing — encapsulating complexity with a new abstraction — and shifting the abstraction from low to high. This is, and has always been, the key to transformational change.
When we analyze the root cause of complexity in data mesh, we see attempts to force-fit a completely new way of operating on top of the old processes, thinking and tools, frustrating change agents and slowing everything down.
So, we looked for ways to abstract these old ways into something simpler and better. And as a result, we are introducing two interconnected abstractions: the autonomous data product and the data mesh operating system.
Autonomous data products
An autonomous data product encapsulates the whole data supply chain—all facets of data management—for a single domain concept. The autonomous data product is a new data primitive, the thing that we build, share, discover, monitor, access and use.
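As a rough illustration of that scope, here is a hypothetical declaration of an autonomous data product, sketched in Python. Every field name here is an assumption made for this sketch; the intent is only to show that inputs, transformation, output contract, semantics, governance, and discoverability live together in one unit rather than in separate tools.

```python
# A hypothetical declaration of an autonomous data product, showing the scope
# of what it encapsulates. Field names are illustrative assumptions only.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AutonomousDataProduct:
    domain: str                                     # owning business domain
    name: str                                       # the single domain concept it serves
    inputs: List[str]                               # upstream sources or other data products
    transform: Callable[[List[dict]], List[dict]]   # transformation logic, batch or streaming
    output_schema: Dict[str, str]                   # the contract consumers rely on
    semantics: Dict[str, str]                       # the meaning of the data it serves
    policies: List[str]                             # governance policies executed as code
    tags: List[str] = field(default_factory=list)   # discoverability metadata


orders = AutonomousDataProduct(
    domain="retail",
    name="daily-order-totals",
    inputs=["orders-service", "returns-service"],
    transform=lambda rows: rows,  # placeholder transformation
    output_schema={"order_date": "date", "total": "decimal"},
    semantics={"total": "gross order value per day, net of returns"},
    policies=["mask_pii", "retain_7_years"],
    tags=["orders", "finance"],
)
```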
Allow me to clarify this new abstraction through a series of provocations:
“What if we stopped building data pipelines?”
“What if we stopped orchestrating data pipelines?”
“What if we stopped cataloging data?”
“What if we stopped accessing data through storage?”
“What if we stopped understanding data using schemas?”
“What if we stopped layering in semantic models?”
“What if we stopped adding data lineage?”
“What if we stopped measuring data quality after data is created?”
“What if we stopped discovering data looking for tables, files and schemas?”
“What if we stopped copying data across stacks for each data use case, GenAI, AI or analytics?”
“What if we stopped separating batch and streaming processes?”
“What if we stopped governing data after the fact?”
These provocations challenge the legacy way of doing things, the old ways of thinking about the problem, in order to eradicate the sources of complexity.
We looked for an encapsulation that simplifies all these broken processes into one simple creative process:
“What if we just build autonomous data products?”
…and all other functionality becomes available through this simple act.
This implies autonomous data products are self-orchestrating, self-governing, self-semantically-defining, etc. These capabilities are all built in. The technology that enables this encapsulation is the data product container, a unique Nextdata innovation.
Data mesh operating system
Now that we have a new abstraction, an autonomous data product (ADP), we need to build a few simple tools to perform a complete set of simple functions on this new abstraction. The tools to:
- Build an ADP through declaration and coding of its components
- Run an ADP as an interoperable and standard process
- Find an ADP through search or exploration
- Access an ADP through its global URL
- Use an ADP by connecting to its data sources
- Control an ADP by commanding and configuring its behavior
We call this collection of functions a data mesh operating system, and while simple, they can be combined to create transformative impact.
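To illustrate how thin these operating system functions can be relative to the smart primitive they operate on, here is a hypothetical Python interface mirroring the six functions listed above. The signatures are assumptions made for the sake of the sketch, not Nextdata OS APIs.

```python
# Sketch of the operating-system functions as a thin interface over the
# autonomous data product (ADP) primitive. Method names mirror the list above;
# signatures are illustrative assumptions.
from typing import Iterable, Protocol


class ADP(Protocol):
    """Whatever the autonomous data product exposes; intentionally opaque here."""
    url: str


class DataMeshOS(Protocol):
    def build(self, spec_path: str) -> ADP:
        """Build an ADP from its declared components and code."""
        ...

    def run(self, adp: ADP) -> None:
        """Run the ADP as a standard, interoperable process."""
        ...

    def find(self, query: str) -> Iterable[ADP]:
        """Find ADPs by search or exploration of the mesh."""
        ...

    def access(self, url: str) -> ADP:
        """Resolve an ADP through its global URL."""
        ...

    def use(self, adp: ADP) -> Iterable[dict]:
        """Consume the data the ADP serves."""
        ...

    def control(self, adp: ADP, command: str, **config: object) -> None:
        """Command and configure the ADP's behavior."""
        ...
```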
I invite you to shift your perspective — autonomous data products are smart primitives and the OS (platform) functions are rather simple and dumb.
The operating system needs two other key capabilities to fit into an existing environment.
- Integrate any ADP into the existing set of diverse data stacks: any kind of storage, compute, identity management, security, and data transport. We introduce this integration as a set of extensible operating system drivers loaded by each autonomous data product (see the sketch after this list).
- Bootstrap a mesh of ADPs from companies’ existing data and other assets. History shows that self-bootstrapping, to accelerate a new way of working and understanding, is a prerequisite to a transformation of this scale. This is a crucial feature of the data mesh operating system.
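For the driver idea in particular, the sketch below shows one plausible shape of an extensible driver: a narrow contract that an autonomous data product loads in order to talk to a specific technology. The StorageDriver contract and the local-file example are assumptions for illustration only; a real integration surface would also cover compute, identity, security, and transport.

```python
# Hypothetical driver interface that an autonomous data product could load to
# integrate with an existing stack. The StorageDriver contract and the
# local-file example are illustrative assumptions, not actual Nextdata OS drivers.
from abc import ABC, abstractmethod
from typing import Iterable


class StorageDriver(ABC):
    """Adapter between an ADP and one concrete storage technology."""

    @abstractmethod
    def read(self, path: str) -> Iterable[bytes]:
        ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None:
        ...


class LocalFileDriver(StorageDriver):
    """Example driver backed by the local filesystem."""

    def read(self, path: str) -> Iterable[bytes]:
        with open(path, "rb") as f:
            yield f.read()

    def write(self, path: str, data: bytes) -> None:
        with open(path, "wb") as f:
            f.write(data)


# An ADP would declare which drivers it loads, e.g. {"storage": LocalFileDriver()},
# keeping the product's own logic independent of the underlying stack.
```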
One studies the past to define the future.¹ We’re inspired by past pivotal shifts in our computing history, from multi-user interactive systems like Unix, to HTTP and the World Wide Web, to globally scalable cloud computing. Cloud computing changed the way companies operated, how we worked, and how hundreds of millions of people manage their lives and navigate their days. The shift required defining cloud-native applications, a new abstraction for building software, and a stack of technologies that included virtualization, containers, and container orchestrators as the pieces of a cloud operating system.
Today data is broken, and getting more broken as AI and other innovations are making systems and processes exponentially more complex…and even so, many of us keep thinking of data as exhaust that must be stored and piped from place to place.
We’ve been motivated to address this problem by wrestling with this complexity, along with many of you reading this. We see data complexity as the bottleneck that stops innovation and creates confusion, the biggest, longest-lead-time barrier to almost every large-scale transformation we encounter.
Our plan is to encapsulate data complexity with a simple abstraction, the autonomous data product, and manage it with a set of core operations, the data mesh operating system, to unblock a wide range of new data-driven applications and AIs that create unimaginable new value for users.
We’re excited to do this with you. Reach out to us if you are interested in learning more about a real-world implementation of autonomous data products and Nextdata OS. We look forward to continuing to share our breakthrough in managing data complexity in an ever-evolving landscape of data and AI.
1 "Study the past if you would define the future." – Confucius
We also recorded an episode of our Nextdata Technical Support series on this topic. Please enjoy!