Unified data access from multi-cloud to on-prem




The emergence of cloud computing has created a tectonic shift in the digital landscape, making possible computing architectures that are more flexible, resilient, and scalable.

Before the arrival of COVID-19, cloud migrations were well under way, but the pandemic accelerated the shift and made cloud computing essential. The flexibility of cloud architecture enables organizations to distribute compute processes and data storage across a variety of environments, mixing and matching whichever platform is the best fit for the respective computing requirements. This includes architectures that span across on-prem, public cloud and hybrid cloud resources, and designs that balance computing jobs and data across several public clouds in a multi-cloud configuration. Today, 80% of organizations deploy hybrid cloud approaches, and 89% use multi-cloud.

This wholesale adoption of cloud computing has shifted mindsets. Architectures are designed to take advantage of the unique capabilities of universal connectivity and instantaneous scalability. Applications no longer need to be built as monolithic programs that run on a single machine, but rather, as collections of code that run independently across different servers and clouds to achieve a predetermined outcome. This more granular approach to computing has created vastly more dynamic environments, spawning new opportunities for innovation.

Why Strategic Hybrid Multi-Cloud?

Adopting a multi-cloud strategy, a hybrid cloud strategy, or a combination of the two (hybrid multi-cloud) has many benefits, including opportunities to optimize data management. Hybrid multi-clouds help organizations reduce risks and costs while improving performance and regulatory compliance.

Reduce Risk With Increased Resilience

One of the most significant risks to any system is downtime. System resilience is paramount for providing adequate service to your customers and stakeholders. Hybrid multi-cloud strategies offer more options for shifting compute jobs and for maintaining copies of databases on separate platforms. If one cloud or server runs into issues, another platform is ready to take over.

Reduce Compliance Risk With Data Sovereignty

Data sovereignty dictates that countries have the right to set rules for the data that is collected within their borders. Remaining compliant with these regulations while managing data across multiple countries with unique rules is challenging. Some countries require data collected in their territory to be stored on servers within their borders. Hybrid multi-cloud architectures allow organizations to keep data within a country's borders while making it accessible to applications worldwide.

Flexibility Reduces Risk

The ability to quickly shift workloads from one cloud vendor to another keeps any single provider from gaining too much leverage over your data and applications. The flexibility to easily move data and compute jobs from on-prem to the cloud, or from one provider to another, enables organizations to optimize their processes and take advantage of the strengths of each platform. For example, parts of legacy applications can be refactored to run in the cloud to take advantage of cloud capabilities while keeping legacy data on-prem.

Performance

For many applications, speed is crucial to ensuring optimal business outcomes, such as with Wall Street financial data, sophisticated security applications, and cutting-edge customer experiences. Anticipating data needs and positioning high-demand data closer to the server running the compute job reduces app latency. Hybrid multi-cloud strategies provide more choices for how and where you can stage your data to get it closer to compute engines and/or the end-user. Applications do not need to retrieve data from the center of the network every time it is needed. With hybrid multi-cloud strategies, there is more flexibility in accessing new resources and scaling applications. Moving data closer to users also makes data more accessible.

Hybrid Multi-Cloud Challenges

Hybrid multi-cloud strategies have enabled application designs and architectures to change significantly. Docker containers and orchestrators such as Kubernetes have standardized the abstraction layer that allows application code to run independently of infrastructure. This technology enables any part of an application to run in any cloud environment. Applying the same approach to data is much harder.

Data is constantly changing and is much less mobile than code. While data can be replicated and mirrored, the more copies of data that exist, the harder it is to maintain consistency and know which copy is authoritative. This challenge is exacerbated by cloud sprawl.

Cloud Sprawl

The ease of provisioning a new cloud or SaaS application has led to cloud sprawl. Shadow IT has led to the growth of unchecked data stores in unsanctioned SaaS applications, leading to unconnected and siloed databases. This trend has resulted in growing complexity, making it difficult to analyze data across an organization.

The efficiency and cost savings enabled by the cloud are being negated by cloud sprawl. The ease of provisioning a new app or cloud without the appropriate governance is leading to redundancies, inefficiencies, and unused resources. Hybrid multi-cloud strategies limit the ability to see and manage an organization's entire cloud estate, leading to cost overruns.

Fragmented Governance

Lack of uniform governance across a multi-cloud strategy also has security and access implications. Unique access rules and policies must be maintained to manage access across hybrid multi-clouds. This is complex, as each cloud has its own nuanced controls. Information technology professionals must understand the particularities of each cloud to maintain proper security and governance. The lack of uniform governance makes it difficult to manage access rules that are granular enough to support the data access required by authorized users. IT departments fall back on broader, more restrictive access rules to simplify this challenge.

Application Modernization Challenges

The cloud's promise of efficiency does not always meet expectations when modernizing legacy applications. Legacy databases are not always compatible with cloud architecture, which limits the ability to migrate entire applications to the cloud and hinders modernization. When some data is in the cloud while other data remains on-prem, accessing both at the same time is not simple and requires separate API calls to each store. Many in the industry realize that some legacy data and applications may never make it to the cloud and will live forever on-prem. This fact necessitates new strategies to incorporate these older applications and databases into a successful hybrid multi-cloud strategy.

With data scattered across clouds by cloud sprawl and stranded in legacy apps and databases, access to data on hybrid multi-cloud architectures is not always optimal. Federated queries and virtualized access layers can help overcome some of these challenges.

Solution: Federated Queries and Data Products

A federated query enables data to be pulled from multiple databases with a single query. This negates the need to write multiple complex queries and merge the results to create a single dataset. Simply write a single SQL query, and data from multiple databases can be extracted as if they were a single database. Data stranded in various cloud data stores and SaaS platforms by cloud sprawl can be accessed and combined across these data silos, driving greater data access and business insight throughout the organization.
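As a minimal sketch of the idea, the example below uses SQLite's `ATTACH DATABASE` to let one SQL statement join tables that live in two separate databases. SQLite is only a stand-in here: a real federated query engine connects to heterogeneous remote systems, and the schema names (`crm`), tables, and sample rows are all hypothetical.

```python
import sqlite3


def build_federated_connection() -> sqlite3.Connection:
    """Return one connection that can see two independent in-memory databases."""
    conn = sqlite3.connect(":memory:")
    # Attach a second, separate in-memory database under the schema name "crm".
    conn.execute("ATTACH DATABASE ':memory:' AS crm")

    # Hypothetical tables: orders live in one database, customers in the other.
    conn.execute("CREATE TABLE main.orders (customer_id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO main.orders VALUES (?, ?)",
        [(1, 120.0), (2, 75.5), (1, 30.0)],
    )
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    return conn


def revenue_by_customer(conn: sqlite3.Connection) -> list:
    """Join across both databases with a single SQL statement."""
    query = """
        SELECT c.name, SUM(o.amount) AS total
        FROM main.orders AS o
        JOIN crm.customers AS c ON c.id = o.customer_id
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(revenue_by_customer(build_federated_connection()))
```

The caller writes one query and never merges result sets by hand, which is the property the paragraph above describes.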

Once federated queries are executed, data is fed into a virtual database. This virtualization layer separates the complexity of the compute layer and database structure from the actual data. This layer functions as a single point of access where a unified set of governance rules and access protocols can be implemented, enabling enhanced visibility and management of data, whether it is stored on-prem or in multiple clouds. Less complexity enables granular access rules to be applied. This layer simplifies managing data across different clouds by abstracting the data from the various cloud protocols and controls. Instead of resorting to one-size-fits-all access rules and governance, access can be managed with the data at the center. The limitations of system policies no longer define what data users can and cannot access.
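The single point of access can be illustrated with a small sketch. It assumes a hypothetical `VirtualLayer` class that holds one set of policies and applies them to every read, regardless of where the dataset is physically stored. The class, the `Policy` fields, and the dataset names are illustrative assumptions, not the API of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """One access rule: which columns a role may read from a dataset."""
    role: str
    dataset: str
    allowed_columns: frozenset


class VirtualLayer:
    """Hypothetical access layer that enforces one rule set for all backends."""

    def __init__(self, sources: dict, policies: list):
        # sources maps a dataset name to its rows; plain lists of dicts stand in
        # for on-prem databases, cloud warehouses, or SaaS stores.
        self._sources = sources
        self._policies = {(p.role, p.dataset): p for p in policies}

    def read(self, role: str, dataset: str) -> list:
        policy = self._policies.get((role, dataset))
        if policy is None:
            raise PermissionError(f"{role} may not read {dataset}")
        # Column-level filtering is applied here, not in each backend.
        return [
            {k: v for k, v in row.items() if k in policy.allowed_columns}
            for row in self._sources[dataset]
        ]


layer = VirtualLayer(
    sources={
        "onprem_customers": [{"id": 1, "name": "Acme", "tax_id": "X-1"}],
        "cloud_orders": [{"id": 10, "customer_id": 1, "amount": 120.0}],
    },
    policies=[
        Policy("analyst", "onprem_customers", frozenset({"id", "name"})),
        Policy("analyst", "cloud_orders", frozenset({"id", "customer_id", "amount"})),
    ],
)
```

Because the rule lives with the data rather than with each cloud's own controls, a granular restriction such as hiding `tax_id` from analysts is written once.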

This federated data layer also provides the foundation for data products. Data products add a further control layer on top of it, giving data product producers and domain managers greater control.

Optimizing Queries Across Hybrid Multi-Clouds

Federated queries create several opportunities to optimize data access and cloud data architectures. When data no longer needs to be stored in a single location for an app to easily access it, other factors can be considered when deciding where to store your data. With cloud costs and waste getting out of control, this flexibility can be a significant asset for your FinOps strategies.

For example, if you have a legacy application with a database that may not be compatible with the cloud or is too sensitive to store in the cloud, a hybrid database strategy can be deployed where some application data is stored in the cloud while other data remains on-prem. With a federated query, both databases can be accessed with a single SQL query as if they were one. This can optimize storage and bandwidth costs.

Running databases in parallel can also be beneficial when migrating sensitive data to the cloud. Migrating legacy applications offers many benefits, but the process can be risky. There is always a chance that old applications will not run correctly on the new platform, or that data could be compromised. Many organizations will run the same database in parallel in the cloud and on-prem to reduce the risk of data loss. If an issue arises with data in the cloud, federated queries can draw data from the redundant on-prem database without any disruption.
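The fallback behavior might look like the sketch below. It assumes each database is reachable through a callable that either returns rows or raises `ConnectionError`. The function name `query_with_fallback` and the two stand-in sources are hypothetical, and a production engine would also need to confirm that the replicas are in sync.

```python
def query_with_fallback(query: str, sources: list) -> tuple:
    """Try each (name, run) source in order and return the first answer.

    Returns a (source_name, rows) tuple; raises ConnectionError only if
    every source fails.
    """
    errors = []
    for name, run in sources:
        try:
            return name, run(query)
        except ConnectionError as exc:
            # Remember the failure and move on to the next replica.
            errors.append(f"{name}: {exc}")
    raise ConnectionError("all sources failed: " + "; ".join(errors))


def cloud_db(query: str) -> list:
    """Stand-in for a cloud database that is currently unavailable."""
    raise ConnectionError("cloud region unreachable")


def onprem_db(query: str) -> list:
    """Stand-in for the redundant on-prem copy of the same data."""
    return [("order-1", 120.0)]


source, rows = query_with_fallback(
    "SELECT id, amount FROM orders",
    [("cloud", cloud_db), ("on-prem", onprem_db)],
)
```

Here the caller receives the on-prem rows and never sees the cloud outage.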

Optimizing Query Performance

Federated query engines can improve performance and optimize resource utilization automatically through premium connectors. Well-built connectors that connect databases to the virtualization layer can organize processes and execute them on the most appropriate platform. Using a technique known as predicate pushdown (or, more broadly, query pushdown), connectors can delegate part of the data wrangling to the source database's own compute resources. Operations such as filtering, aggregation, sorting, and column pruning can be performed before the dataset is pulled into the virtualization layer. This process can reduce bandwidth costs since only pared-down data sets are extracted from the database. It also reduces query times and avoids unnecessary computation downstream.
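A small sketch can show the effect of pushdown on how much data leaves the source. SQLite stands in for a remote database, and the `events` table and its rows are hypothetical. Both functions compute the same total, but only the second lets the source filter and aggregate before returning anything.

```python
import sqlite3


def make_source() -> sqlite3.Connection:
    """Create a stand-in source database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (region TEXT, amount REAL, payload TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [("eu", 10.0, "x"), ("us", 20.0, "y"), ("eu", 5.0, "z"), ("apac", 7.0, "w")],
    )
    return conn


def without_pushdown(conn: sqlite3.Connection, region: str) -> tuple:
    """Pull every row and column, then filter and sum in the access layer."""
    rows = conn.execute("SELECT * FROM events").fetchall()
    total = sum(amount for r, amount, _payload in rows if r == region)
    return total, len(rows)  # (result, rows transferred)


def with_pushdown(conn: sqlite3.Connection, region: str) -> tuple:
    """Let the source filter, prune columns, and aggregate before returning."""
    rows = conn.execute(
        "SELECT SUM(amount) FROM events WHERE region = ?", (region,)
    ).fetchall()
    return rows[0][0], len(rows)  # (result, rows transferred)


if __name__ == "__main__":
    source = make_source()
    print(without_pushdown(source, "eu"))
    print(with_pushdown(source, "eu"))
```

With this sample data the first approach transfers four rows and the second transfers one, while both return the same total.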

Data Product as a Container

As we mentioned earlier, Docker and Kubernetes enabled application code to be packaged and run anywhere in the cloud. These cloud-native architectures are much more efficient, resilient, and valuable because they can easily be changed. In the data world, data products supported by virtualization and federated queries are emerging as a similar abstraction layer for data. Data products are prepackaged and designed to be interchangeable across various BI tools or modeling platforms. They are also continually refined: data product producers, like application developers, collect feedback and incorporate it to improve their products.

The emergence of hybrid multi-cloud architecture has significantly impacted application development and how and where compute jobs are executed. This trend has not had the same level of impact on data management due to the constantly changing nature and sensitivity of data. Greater adoption of federated queries, virtualization, data-centered governance, and data products brings this flexibility to data management, making data more agile and accessible. This evolution is driving innovation and greater insights into decision-making.
