Optimizing data governance with data mesh and data fabric strategies

Data stored across any organization has immense value, and the knowledge derived from it can differentiate one company from its competitors. Not having a solid strategy for breaking data silos is a strategic mistake.

While traditional methods like ETL pipelines and data lakes are common, more innovative, distributed approaches like data mesh and data fabric are gaining traction. The end goal of these strategies is to democratize data access, fostering a self-serve model and promoting a more collaborative, data-driven culture. Organizations must stay agile, adapting to these evolving concepts and technologies to maintain a competitive edge.

Unveiling the evolution of data mesh & data fabrics

A data mesh is a data architecture designed to facilitate data sharing across an organization. It is technology-agnostic and is defined by four tenets.

Domain Ownership

The business function that collects data holds authority over it.

Data Product

Data is packaged into data products, simplifying sharing across the organization.

Self Service

Data and data products must be accessible to non-technical people for independent analysis without requiring assistance from IT or the domain sharing it.

Federated Governance

The responsibility to govern and secure data is shared between the domain and central IT authorities.
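
One way to picture how the four tenets fit together is as fields on a data product descriptor. The Python sketch below is only an illustration under assumed names: the DataProduct class, its fields, and the policy strings are hypothetical and do not come from any particular platform.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataProduct:
    """Hypothetical descriptor for a data product (illustrative fields only)."""
    name: str
    owner_domain: str   # domain ownership: the business function that collects the data
    description: str    # plain-language summary so non-technical users can self-serve
    source_assets: List[str] = field(default_factory=list)
    domain_policies: List[str] = field(default_factory=list)   # rules set by the owning domain
    central_policies: List[str] = field(default_factory=list)  # rules set by central IT

    def effective_policies(self) -> List[str]:
        # Federated governance: the domain's rules and central IT's rules both apply.
        return sorted(set(self.domain_policies) | set(self.central_policies))


product = DataProduct(
    name="customer_orders",
    owner_domain="sales",
    description="One row per customer order, refreshed daily.",
    source_assets=["sales.orders", "sales.customers"],
    domain_policies=["mask_customer_email"],
    central_policies=["retain_7_years", "mask_customer_email"],
)
print(product.effective_policies())
```

The union in effective_policies is one possible reading of "shared responsibility"; a real implementation might instead give one side precedence when rules conflict.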

To learn more about data mesh, read our blog on what a data mesh is and why you need one.

Gartner defines data fabric as a design concept that serves as an integration layer of data and connecting processes. It uses continuous analytics over existing, discoverable, and inferenced metadata assets to support the design, deployment, and utilization of integrated and reusable data across all environments. Data mesh and data fabric share a common goal: to break down data silos and improve access to data within organizations.

Changing distributed data strategies

Since the inception of the data mesh concept, the strategy has evolved. In the early days, there was an inclination toward letting domains use any tools they liked to create and share data products. This thinking has matured as concerns around standardization and interoperability arose. Letting every domain choose its own tooling can reinforce data silos, and leaving undefined how data products interoperate may not be the best approach, even if the domain leaders have the best understanding of the data. Today’s data mesh implementations emphasize standardized processes and platforms, ensuring easy creation, sharing, and integration of data products.

Concurrently, data fabric architectures have also emerged, focusing on technology, automation, and central governance control. While data mesh and data fabric do not necessarily compete, they influence each other, prompting adaptations to meet market needs. Modern data practitioners explore how data fabric architecture can support data mesh concepts such as federated governance, data products, and domain ownership. This intersection reflects an ongoing evolution in data management strategies.

Data Mesh vs. Data Fabric

Data integration is key to both data mesh and data fabric, with data democratization through virtualization emerging as the architecture of choice. Virtualization leaves data in its source domain and exposes it as virtual data sets, so consumers can query it without the data being copied. However, the concepts of data fabric and data mesh diverge in terms of governance, automation, and consumption/discovery.

Automation

Data fabric leverages automation to enable self-service, whereas data mesh relies on domain experts to embed their expertise in data products.

Governance

Data fabric relies on central governance control, while data mesh adopts a federated approach with domains responsible for governing their own data.

Consumption

Data fabric consolidates data assets in data catalogs or deploys knowledge graphs to map data assets across the organization. A data mesh approach exposes data through domain-created data products, typically published through a data product marketplace.

As these concepts of data mesh and the technology of data fabrics evolve, they have begun to converge. Practitioners are experimenting with various levels of control, data consolidation, and automation. AI is playing an important role in enabling this convergence.

As the market evolves, the question becomes less about automation vs. people, federated vs. central governance, or data assets vs. data products, and more about strategies that incorporate the best features of each and use the right tool for the right job. Data management platforms and analytics gateways are supporting these integrated approaches.

Automation – people & machines

Modern data mesh and data fabric approaches both strike a balance between domain experts and automation, but they combine the two in distinct ways. Data fabrics use automation to integrate data in real time. Humans play a more passive role, stepping in to address issues flagged by AI alerts.

Data mesh focuses on data products created by data producers. AI helps producers automate repetitive tasks, eliminating the need for coding skills; however, the human who understands the nuance of the data remains central to the process. Automated data wrangling processes and AI-assisted data classification are examples of this symbiotic relationship.

The approaches can coexist in the same strategy with different participants in the process relying on automation in different ways. The key is finding the right balance between human expertise and automation to optimize data processes effectively.

Consumption & discovery – data products vs. data assets

Data fabric architectures produce data assets, while a data mesh produces data products. Both discovery and consumption approaches can exist in a combined strategy, with the data mesh adding more controls to package data assets into data products.

The data mesh approach focuses on the data product as the main vehicle for sharing data. Data products published on a data product marketplace are richer and arguably more valuable. They typically are made up of data assets that have been merged and normalized under the guidance of a knowledgeable domain expert. Data products are reusable, more permanent, and better for external use beyond specific data domains.
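
As a rough illustration of how a producer might merge and normalize two data assets into a single data product, here is a minimal Python sketch. The asset names, field names, and the build_revenue_product helper are all hypothetical, and a real platform would more likely do this with SQL or a transformation tool than with hand-written loops.

```python
# Two hypothetical data assets from different source systems.
crm_customers = [
    {"customer_id": "C1", "country": "us"},
    {"customer_id": "C2", "country": "De"},
]
billing_invoices = [
    {"cust": "C1", "total_cents": 12000},
    {"cust": "C1", "total_cents": 3050},
    {"cust": "C2", "total_cents": 8000},
]


def build_revenue_product(customers, invoices):
    """Merge both assets on customer ID and normalize units and casing."""
    totals = {}
    for invoice in invoices:
        key = invoice["cust"]
        totals[key] = totals.get(key, 0) + invoice["total_cents"]

    return [
        {
            "customer_id": customer["customer_id"],
            "country": customer["country"].upper(),                  # normalize country codes
            "revenue": totals.get(customer["customer_id"], 0) / 100,  # cents to currency units
        }
        for customer in customers
    ]


product_rows = build_revenue_product(crm_customers, billing_invoices)
for row in product_rows:
    print(row)
```

The domain knowledge lives in the small decisions: which key joins the two assets, and which units and casing consumers should see. Those are the choices a domain expert is best placed to make.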

Combined approaches may expose consolidated data catalogs to less technical data consumers, allowing them to create data products for sharing. Leveraging AI to expose these data assets to data consumers, similar to a data fabric, reduces the technical skills required to access data. LLMs empower data consumers with limited SQL expertise to explore and query data assets effectively.

Whether it is a data fabric or mesh, the data catalog becomes a very important piece of the strategy. Gateway platforms are creating unified data catalogs that span the entire organization and organize data assets efficiently. These platforms also leverage GenAI tools to reduce manual work, helping in data classification and data normalization to support robust data models and business glossaries.

Ongoing advancements in AI will continue to make data producers more efficient at creating data products through automation. Experts also have the opportunity to train AI to help data consumers get the most from their data. This synergy between skilled humans and powerful machines represents a best-of-both-worlds approach in the evolving landscape of data management.

Data Governance – federated vs. centralized

Emerging platforms and tools are enabling greater federation of governance. Governance tools make it easier for central IT to relinquish more control while maintaining effective oversight.

Integration of data governance controls into data management platforms empowers all data team members to actively participate in and take responsibility for governance.

Domain Manager Controls	IT Manager Controls	Data Producer Controls
Controls access to domains	Controls access to data platforms	Fine-grained access controls down to the table level
Controls granular access to data	Controls how domains are organized	
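
The split of controls in the table could be modeled as a simple role-to-permission mapping. The sketch below is an assumption-laden illustration: the role keys, permission names, and the can helper are hypothetical, and the assignment of the second table row to the domain manager and IT manager columns is one reading of the table.

```python
# Hypothetical mapping of roles to the controls listed in the table.
ROLE_PERMISSIONS = {
    "domain_manager": {"access_domain", "granular_data_access"},
    "it_manager": {"access_platform", "organize_domains"},
    "data_producer": {"table_level_access"},
}


def can(role: str, action: str) -> bool:
    """Return True if the given role holds the given control."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(can("domain_manager", "access_domain"))
print(can("data_producer", "organize_domains"))
```

Keeping the mapping in one place means central IT can still audit who holds which control, even though the controls themselves are spread across roles.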

The integration of automation into data governance is evolving with the emergence of active data governance – a technology that monitors data assets and delivers alerts to producers and consumers when issues arise.
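
To make the idea of active data governance more concrete, the following Python sketch checks one data asset for staleness and missing values, then produces an alert for each subscribed producer or consumer. Everything here is assumed for illustration: the AssetStatus fields, the check_asset function, and the 24-hour and 5% thresholds are hypothetical, not taken from any specific governance product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class AssetStatus:
    """Hypothetical snapshot of a monitored data asset."""
    name: str
    last_updated: datetime
    null_ratio: float        # share of missing values, from 0.0 to 1.0
    subscribers: List[str]   # producers and consumers to notify


def check_asset(asset: AssetStatus, now: datetime,
                max_age: timedelta = timedelta(hours=24),
                max_null_ratio: float = 0.05) -> List[str]:
    """Return one alert message per issue per subscriber (assumed thresholds)."""
    issues = []
    if now - asset.last_updated > max_age:
        issues.append("stale")
    if asset.null_ratio > max_null_ratio:
        issues.append("too_many_nulls")
    return [
        f"{who}: {asset.name} is {issue}"
        for issue in issues
        for who in asset.subscribers
    ]


now = datetime(2024, 1, 2, 12, 0)
asset = AssetStatus(
    name="sales.orders",
    last_updated=datetime(2024, 1, 1, 0, 0),
    null_ratio=0.01,
    subscribers=["producer", "consumer"],
)
alerts = check_asset(asset, now)
for alert in alerts:
    print(alert)
```

In practice a scheduler would run checks like this continuously and route the messages to email or chat; the sketch only shows the decision logic.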

In the context of data mesh, governance extends beyond data assets to cover the end-to-end data lifecycle, from source to data product. Managing data governance and quality does not end with the data asset in a data mesh. Public data products are continuously improved and monitored through human feedback loops. This iterative process ensures that data products remain relevant and valuable to consumers.

With the capabilities of data fabrics and data mesh converging, there is growing flexibility in how data is accessed. Users can access and discover data through an approach that aligns with their technical skills and understanding of the data. The future will likely see a blending of data mesh and data fabric elements, resulting in unique combinations that leverage the strengths of people, machines, governance, and consumption tactics. The distinction between data meshes and data fabrics may fade, giving rise to more personalized and adaptable data management strategies.