Designing elegant data products




What is a data product?

The way organizations think about data and how they access trustworthy information is changing quickly. Demand for insights is growing exponentially and strategies to manage data more efficiently are emerging. At the center of this change is a gradual shift in mindset. Organizations are beginning to think about data as a product, a packaged offering that is reusable and refined. This approach gets away from the project-based mindset where every request for data is fulfilled with a new one-off data pipeline.

The key benefits of data products are:

Reusability
Easy access
Shareability

As with any product, the way that data products are designed and presented to users makes a significant difference. Now let’s see how to design elegant data products.

When we talk about data products, we are referring to them in the context of a larger IT strategy or data mesh. This is not to be confused with a data product as a part of a core business strategy where a data product targeting customers is an organization’s primary revenue generator. We are not talking about data products such as Google Analytics or Bloomberg.

Gartner defines a data product as:
“a curated and self-contained combination of data, metadata, semantics and templates. It includes access and implementation logic certified for tackling specific business scenarios and reuse. A data product must be consumption-ready (trusted by consumers), kept up to date (by engineering teams) and approved for use (governed). Data products enable various data and analytics (D&A) use cases, such as data sharing, data monetization, domain analytics and application integration.”

This very detailed and complex definition may be accurate; however, a more elegant definition might come from J. Majchrzak, who defines a data product as “an autonomous, read-optimized, standardized data unit containing at least one dataset (Domain Dataset), created for satisfying user needs.”

While both definitions are accurate, one is simpler and easier to consume. Similarly, elegant data products are easier to consume and hence valuable.

What is an elegant design?

How do we know if a design is elegant? Albert Einstein is credited with saying, “Everything should be made as simple as possible, but not simpler.” An elegant data product, therefore, must be as simple as possible in order to obtain the best outcome.

Let’s look at the other must-haves of an elegant solution:

  • Focused and efficient enough to affect a defined outcome with bounded resources
  • Coherent enough to handle edge cases in core logic, not through bolted-on capabilities
  • Powerful enough to be applied to multiple applications

Why is elegant design important? Less complexity makes things easier and more enjoyable to consume, driving greater value. A simple but effective solution will outperform a complex one.

Data product mindset

The first step to designing and creating elegant data products is to adopt a data product mindset. Often, this can be the biggest hurdle.

To adopt a data product mindset, you need to get rid of the project mindset. This is the idea that each time a data request is received by the data engineering group, a new project is created and executed. This project mindset is much more reactive, with data engineers constantly scrambling to build data pipelines per the stakeholders’ requirements. Once one project is done, it is time to forget it and move on to the next one.

The product mindset is evolved. Data engineers, analysts and data stewards think more proactively about data. Instead of waiting for ad-hoc data requests, analysts, engineers, and managers work together to create data products before they are required. This approach requires thorough research and insight to create data products that will be most useful to a larger set of users, driving greater value per output.

Data products are also reusable, so they stay relevant throughout their lifecycle, which includes ongoing maintenance and improvement. As data products take on a life of their own, feedback can be easily incorporated into new versions.

The biggest challenge in building effective and elegant data products is creating the right mindset. When you shift from a data project strategy to a data product strategy, success is measured by outcomes, not outputs. While data products do evolve, effective planning and design up front will help set the foundation for elegant data products.

Key traits of good data products

Effective and powerful data products typically exhibit certain traits. Designers should keep these traits in mind when creating the data products:

Discoverable

For data products to be impactful, they need to be discoverable. Even a fantastic product will not fulfill its potential if no one knows it exists. Data product marketplaces are great ways to get data products into the hands of users. Some data product marketplaces use AI and predictive analytics to suggest data products to users, similar to how Netflix suggests new movies or shows to viewers. Elegance is not only about how you design the product but also about how you bring it to market and make it accessible to users.

Quality

Clean and accurate data is a must-have attribute for any data product. If data analysts cannot trust the data, your data product will not be valued by decision-makers. Designing and building data products must include a reliable process for cleaning and normalizing the data as it is merged and integrated.

Once the process is set, you need to ensure, and prove to your audience, that it works. This involves tracking and sharing data quality metrics that measure variability and completeness, among other qualities.
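As a rough illustration of the two metrics named above, the sketch below computes completeness and variability for one field. The function names, the sample field `amount`, and the choice of coefficient of variation as the variability measure are all assumptions made for this example, not part of any particular platform.

```python
from statistics import mean, pstdev


def completeness(rows, field):
    """Fraction of rows in which `field` is present and not None or empty."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def coefficient_of_variation(rows, field):
    """Population standard deviation divided by the mean of the numeric values.

    Used here as one possible variability measure (an assumption).
    """
    values = [r[field] for r in rows if isinstance(r.get(field), (int, float))]
    if len(values) < 2:
        return 0.0
    m = mean(values)
    return pstdev(values) / abs(m) if m else float("inf")


# Hypothetical sample rows; one record is missing its amount.
rows = [{"amount": 10}, {"amount": 20}, {"amount": None}, {"amount": 30}]
print(f"completeness: {completeness(rows, 'amount'):.2f}")
print(f"variability:  {coefficient_of_variation(rows, 'amount'):.3f}")
```

Publishing numbers like these alongside the data product gives consumers a concrete basis for trusting it.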

Secure

Keeping data safe is a requirement of any IT strategy, but building security into your data product can be nuanced. Elegantly designed data products can provide granular access to data assets. Designing access rules that consider both user roles and data attributes balances access and security. These access controls, together with data masking, also make efficient use of data tables, because the same table can serve users with different permissions.
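One way to picture role- and attribute-aware access with masking is the minimal sketch below. The roles, column names, `ColumnPolicy` class, and masking rule are hypothetical and chosen only for illustration; a real deployment would rely on the access controls of its own platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnPolicy:
    allowed_roles: frozenset      # roles that may see the raw value
    mask_for_others: bool = True  # if False, the column is hidden entirely


# Hypothetical policies keyed by column (the data attribute).
POLICIES = {
    "customer_id": ColumnPolicy(frozenset({"analyst", "steward"})),
    "email": ColumnPolicy(frozenset({"steward"})),
    "region": ColumnPolicy(frozenset({"analyst", "steward", "viewer"})),
}


def mask(value):
    """Keep the first character and replace the rest with asterisks."""
    s = str(value)
    return s[:1] + "*" * (len(s) - 1) if s else s


def apply_policy(row, role, policies=POLICIES):
    """Return the view of `row` that `role` is allowed to see."""
    out = {}
    for column, value in row.items():
        policy = policies.get(column)
        if policy is None:
            continue  # default deny: columns without a policy are not exposed
        if role in policy.allowed_roles:
            out[column] = value
        elif policy.mask_for_others:
            out[column] = mask(value)
    return out


row = {"customer_id": 42, "email": "ann@example.com", "region": "EMEA"}
print(apply_policy(row, "analyst"))
print(apply_policy(row, "viewer"))
```

Because the policy is applied per column at read time, one underlying table can back several audiences without being copied.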

Another important trait is the inclusion of strong encryption, ensuring data is protected as it moves from the database to analysis.

Observable

To ensure continuous quality, great data products have built-in observability capabilities. Data products are only as good as the quality of the data they deliver. If decision makers do not trust the data that data products deliver, the products lose their value. Data products should be designed with integrated monitoring features that detect anomalies and errors. This reduces the likelihood of bad data making its way into an executive’s analysis or being used to train AI models.
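A monitoring check of this kind can be very small. The sketch below flags a daily row count that deviates sharply from recent history using a z-score; the metric (row count), the threshold of 3, and the function name are assumptions for the example, and production monitoring would typically track many more signals.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Return True if `latest` is more than `threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical daily row counts for a data product's main table.
daily_rows = [1000, 1020, 980, 1010, 990]
print(is_anomalous(daily_rows, 1005))  # a typical day
print(is_anomalous(daily_rows, 400))   # a sudden drop worth an alert
```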

Scalable

One of the other benefits of adopting a product-based approach is that the more a data product is used, the more value it contributes to the organization. Data products are very flexible and can be applied to several use cases, thereby increasing their utility. Consequently, data products must be designed to scale and meet growing user demand.

Collaborative

It’s essential to get inputs from various sources for data products to be powerful enough to solve multiple problems.

Building a diverse team to build data products and the supporting frameworks is vital. Multiple stakeholders play a role in creating successful data products including data product producers, domain owners and consumers.

Data product producers are most invested in the success of a data product and hence take the lead. They may have data engineering or data analysis skills, but their primary focus is on understanding the needs of the consumers. Those with a background in product management or product ownership understand the product mindset.

Domain owners also play a vital role and are typically responsible for ensuring proper governance. Governance sets the controls and policies that can determine the success or failure of a data product, which makes the role of the domain owner important.

Data product consumers are also a key piece of the ongoing lifecycle of data products. Their engagement and feedback provide the input to improve the utility of data products. They can rate their satisfaction with individual data products and how well they fit their needs. Tracking data product consumers’ behavior is also a big part of incorporating consumers into the process.

Accessible

Like discoverability, accessibility is an important trait of quality data products. Easy accessibility makes obtaining data products and using them for analysis as simple as possible, leading to faster time-to-insight. One of the barriers to rapid access is importing data products into your BI tool or AI model-building tool. Elegant data product designs enable data products to be accessed from within any preferred analytics package.

The second and perhaps more difficult barrier is gaining the authority to access data. Setting up the right protocols to enable access makes the process safer and more efficient. Clearly defining who is responsible for enabling access is an important part of defining elegant protocols. In a more distributed framework, domain managers who oversee data collection in their group have the authority to provide access.

Subscriptions and data contracts define the duration of access and how data can and cannot be used. By standardizing these agreements up front, users don’t have to go through the process each time they want access to a data product, simplifying the process.
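To make the idea of a standardized agreement concrete, here is a minimal sketch of a data contract that records the duration of access and the permitted uses. The `DataContract` class, its field names, and the sample values are hypothetical; real contracts usually carry more terms, such as schema and service-level expectations.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataContract:
    product: str
    consumer: str
    valid_from: date
    valid_until: date
    allowed_uses: frozenset  # how the data may be used

    def permits(self, use, on):
        """True if `use` is allowed and `on` falls within the access window."""
        in_window = self.valid_from <= on <= self.valid_until
        return in_window and use in self.allowed_uses


# Hypothetical subscription: marketing may use the product for reporting in 2024.
contract = DataContract(
    product="sales_summary",
    consumer="marketing",
    valid_from=date(2024, 1, 1),
    valid_until=date(2024, 12, 31),
    allowed_uses=frozenset({"reporting"}),
)
print(contract.permits("reporting", date(2024, 6, 1)))
print(contract.permits("model_training", date(2024, 6, 1)))
```

Once such a contract exists, each access request can be checked against it automatically instead of being renegotiated.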

 

Customizable and Interoperable

To meet the needs of users, data products should be adaptable to specific business requirements and user preferences.

Instead of bolting on awkward data features, elegant data products should also be designed to interoperate with other data products. With interoperability built into the design, data products can be easily combined to create richer and more valuable super data products.

Auditable

As data products evolve, some changes will be improvements, but not all. Changing data products can also expose vulnerabilities, such as security and compliance risks. To ensure that data products are of the highest quality, they must include audit trails and versioning data. Quickly identifying errors and pinpointing their source will help to keep your data product running securely and efficiently.
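A simple way to think about audit trails and versioning is an append-only history in which every change gets a version number, an author, and a timestamp. The sketch below assumes hypothetical class names (`AuditEntry`, `DataProductHistory`) and an in-memory list; a real system would persist this history durably.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    version: int
    author: str
    change: str
    timestamp: datetime


@dataclass
class DataProductHistory:
    name: str
    entries: list = field(default_factory=list)

    def record(self, author, change):
        """Append a new entry; versions start at 1 and increase by one."""
        entry = AuditEntry(
            version=len(self.entries) + 1,
            author=author,
            change=change,
            timestamp=datetime.now(timezone.utc),
        )
        self.entries.append(entry)
        return entry

    def changes_since(self, version):
        """Entries added after `version`, useful when tracing a regression."""
        return [e for e in self.entries if e.version > version]


history = DataProductHistory("sales_summary")
history.record("ana", "initial release")
history.record("raj", "added region column")
for e in history.changes_since(1):
    print(e.version, e.author, e.change)
```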

Use-Case Driven

To be comprehensive and consistent, data products should be able to solve users’ problems effectively every time. To be able to achieve this they should be designed like any other product keeping the end user at the center of the process. Whether the user is a data engineer, data analyst, business analyst, business executive, customer, or partner, having a comprehensive understanding of their needs is key for success.

Comprehensive data products incorporate a wide breadth of data sources to ensure extensive and consistent coverage of use cases. Enriching data with partner or third-party sources can add depth to a data product. For example, using zip code databases to fill in missing address data and standardize it can make data products more comprehensive and consistent.
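The zip code example could look something like the sketch below. The tiny in-memory `ZIP_LOOKUP` table stands in for a real zip code database, and the field names (`zip`, `city`, `state`) are assumptions about the record layout.

```python
# Hypothetical stand-in for a third-party zip code database.
ZIP_LOOKUP = {
    "10001": ("New York", "NY"),
    "94105": ("San Francisco", "CA"),
}


def enrich_address(record, lookup=ZIP_LOOKUP):
    """Fill in a missing city/state from the zip code and standardize format."""
    enriched = dict(record)  # leave the caller's record untouched
    zip_code = str(record.get("zip") or "").strip()[:5]  # ignore any ZIP+4 suffix
    match = lookup.get(zip_code)
    if match:
        city, state = match
        if not enriched.get("city"):
            enriched["city"] = city
        if not enriched.get("state"):
            enriched["state"] = state
    # Standardize whatever values are present.
    if enriched.get("city"):
        enriched["city"] = enriched["city"].strip()
    if enriched.get("state"):
        enriched["state"] = enriched["state"].strip().upper()
    return enriched


print(enrich_address({"zip": "10001-1234", "city": "", "state": "ny"}))
```

Records whose zip code is not in the lookup are returned with only the standardization applied, so the enrichment never invents values.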

Users must be able to clearly understand what the data within your data product represents to be applicable to their use-case. This can be a challenge as data originates from all over an organization. Proper metadata management is important in creating powerful data products and ensuring context is preserved. Making certain that users understand the terminology used to describe the data in the data product is also important. Incorporating business glossaries is one way to help standardize terminology.

Lifecycle Management

One of the key differentiators between data products and data projects is the performance of data products and their capacity to be constantly improved and enhanced. Even if we do our best to design a data product to meet the needs of our audience, it will not always hit the mark, or it may simply require change over time. Building a mechanism to capture feedback from users is essential to continuously delivering great data products.

Tracking data products and understanding how they resonate with users is crucial for connecting products with users. A data product marketplace littered with aging and irrelevant data products does not lend itself to an elegant process. Data products should be archived and retired when they reach the end of their lifecycle, reducing noise. Ensure that you curate your data product marketplace to optimize the user experience.
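Curation of this kind can start from usage tracking alone. The sketch below lists data products that have not been used within a chosen idle window so they can be reviewed for archiving; the 180-day default, the function name, and the sample product names are assumptions for illustration.

```python
from datetime import date, timedelta


def products_to_retire(last_used, today, max_idle_days=180):
    """Names of products whose last recorded use is older than the idle window.

    `last_used` maps product name to the date it was last accessed.
    """
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(name for name, used in last_used.items() if used < cutoff)


# Hypothetical usage log for a small marketplace.
last_used = {
    "sales_summary": date(2024, 6, 1),
    "legacy_orders": date(2023, 1, 15),
}
print(products_to_retire(last_used, today=date(2024, 6, 30)))
```

A list like this is a starting point for review by the product's owner rather than an automatic deletion rule.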

Process

Elegant data products do not just happen on their own; they require the right process to support their creation. Without it, there is a tendency to keep adding data, which adds complexity. A process ensures data is added deliberately. Elegant designs are produced by iterative and collaborative processes.

Iterative design processes support elegant design because each step or cycle gets you closer to a simpler, more powerful solution. The first versions of data products may not be the optimal solution, so they need to evolve. Features that are unused or disrupt the path to the best outcome can be eliminated through iteration. New users can find innovative applications for data products that spawn new features or a split from the original data product into something new and more impactful. Your process should embrace and institutionalize feedback to better understand how your data product meets its objective. As data products evolve, and feedback is collected, ideas emerge for new data products.

Building great data products is no small feat. Creating them from scratch without a solid technological foundation can be even harder. Data product platforms can make the process much easier. Extrica is a modern data analytics platform that is designed from the bottom up to streamline the creation of data products. To learn more about the capabilities of Extrica and how the platform can help you create elegant data products, schedule a demo.
