As enterprise business has become increasingly digital, data collection and analysis have emerged as central components of strategic planning and business plan execution. With the right eye on data, companies can unlock insights to transform the customer experience. However, the more data that’s collected and produced, and the more people who consume it, the more important good data governance becomes.
This is especially true when companies migrate to the cloud environment. The cloud’s decentralized nature is conducive to the creation of a fluid, accessible data environment, but it also makes it all the more critical to ensure that data are reliable, valid, complete and available for analysts to access and consume. Businesses that adopt a comprehensive and future-facing position through data governance are best situated not only to navigate the changes brought about by cloud migration, but also to maintain that position as new — and even more decentralized — cloud models emerge.
Below, hear from expert voices in data governance on the strategies and tools that could future-proof business operations in the cloud.
Data governance now
The current conversation around data governance centers on the ways in which cloud migration has raised questions of scope. “What you see is the number of datasets exploding in the cloud,” says Salim Syed, vice president of software engineering at Capital One. “The number of lines of business that are accessing the data is exploding. If you don’t have the right governance around to manage all that, you’re going to lose control of your data platform.”
Putting that governance into place requires the implementation of the right tools to ensure a company’s data is being produced and consumed properly. Snowflake has invested heavily in new governance capabilities, allowing its customers to satisfy increasing compliance needs. “Our philosophy,” says Artin Avanes, director of product management at Snowflake, “is to build one single service and engine that’s very integrated, making sure that all new data governance capabilities we are developing seamlessly work for all the different workloads, from traditional analytics and modern data applications, to data sharing and data exchanges. We offer capabilities that allow you to let the policies travel together with your data, wherever the data might need to be, and make sure that those data governance needs will apply not just within a single Snowflake account, within a single physical location or region, or even within a single cloud, but across different cloud regions and different clouds.”
There’s also a cultural element at play, thanks to the data decentralization that comes with moving away from the data center model and into the cloud. “You need to make sure that there is a centralized team that sets enterprise governance policies and monitors compliance,” says Syed. “And then, you want to make sure that the infrastructure, ownership of data and the process is federated. The central team just doesn’t have the domain knowledge to set all those rules. It’s really important in the cloud world to federate so that the ownership of data goes to the lines of business.”
How to implement and succeed with data governance at scale
With those structures and tools in place, companies that have migrated to the cloud can effectively manage the flow of, and access to, vital data across multiple internal teams. Governance works best, however, when it is part of the transition plan from the start. “The sooner you start understanding the roles, building the right data models and the corresponding governance controls, it’s a key recipe for accelerating and succeeding with the migration to the cloud,” Avanes says.
This is especially true at the enterprise level, where questions of scale come into play. According to Syed, this is what led Capital One to partner with Snowflake. “We had thousands of users running millions of queries per day,” he says. “Hundreds of queries were running simultaneously during peak hours. This is a big problem for traditional data vendors; you just can’t handle that kind of speed and scale. Snowflake, with its separation of storage and compute architecture, really for the first time, enabled us to scale to whatever our users demanded, but then scale it down whenever usage was gone.”
Capital One uses data to drive customer insights. Its engineers and analysts use machine learning and real-time data at scale to guide investment decisions, detect fraud and provide more intelligent digital products to help its customers manage their finances.
“Capital One really was one of the first, I think, who recognized the potential that the cloud offers in terms of scalability and bringing all of your data sets together, being able to build modern analytical solutions at large scale,” Avanes says. “At Snowflake, we are continuously innovating the engine, resulting in a faster and more scalable platform powering the data cloud, and fully transparent for the end user oftentimes. We are taking away the burden from our customers to think about knobs and diverse configuration settings for their different application needs. So, there are two dimensions coming together: Us innovating in performance and scale, and our customers adopting new capabilities to make sure that they have the right visibility, transparency and governance around their data. This way, our customers can scale with ease, with a high level of confidence and peace of mind while onboarding new workloads to the cloud.”
Data governance and the new cloud frontier
Enterprise technology is always moving forward, and so, as more businesses move to a cloud-focused strategy, the boundaries of what that means are evolving. New models such as serverless and multi-cloud are redefining the ways in which companies will need to manage the flow and ownership of their data, and they’ll require new ways of thinking about how data is governed.
According to Syed, these new models will make the ability to decentralize data architecture while maintaining centralized governance policies even more important. “A lot of companies are going to invest in trying to figure out, ‘How do I build something that combines not just my one data source, but my data warehouse, my data lake, my low-latency data store and pretty much any data object I have?’ How do you bring it all together under one umbrella? The tooling has to be very configurable and flexible to meet all the different lines of businesses’ unique requirements, but also ensure all the central policies are being enforced while you are producing and consuming the data.”
Avanes agrees that striking this balance between central governance and widely distributed data will be critical to future-proofing cloud-based business. “What we are seeing now is that there is a danger of running into similar challenges from the past: data sets that are being created in separate clouds or separate regions. So, building a system that allows you to have governance at a global scale across different clouds and be able to really apply it to all of your data sets is the next challenge ahead. In an increasing world of multi-cloud, data governance policies need to apply seamlessly everywhere. That is not trivial.”
Making data governance work
As these more distributed cloud models proliferate, properly managed data stands to become one of the most vital resources in an organization’s portfolio. Keeping a handle on governance could therefore make a significant bottom-line difference in both operations and customer interactions. As the partnership between Capital One and Snowflake demonstrates, pairing the right tools with the right policies can be decisive in preparing for this distributed future.
“It is a balance,” Syed says. “I would say 70% is creating the right tooling to make it easy, but then 30% would be the right organization structure, the right culture. All of that is going to be important.”
From Capital One:
For more on how enterprises can become cloud-powered, visit https://www.capitalone.com/tech/cloud/