The golden era of science fiction produced a frightful 1930s novel, later a 1950s movie, called “When Worlds Collide.” I’ve been thinking about another collision lately: what happens when exabytes and zettabytes of data collide with the growing need for governance in the form of Europe’s General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), the Sarbanes-Oxley (SOX) Act, the Health Insurance Portability and Accountability Act (HIPAA), and what is likely to become a wave of similar regulatory mandates on how we handle our growing ocean of data.
Our datasphere is growing by an estimated 250 exabytes per day, which is to say 250 million terabytes per day. Within the next five years the global datasphere is expected to total about 160 zettabytes (1 zettabyte = 1,000 exabytes). The vastness of the data defies imagination, as does the challenge of meeting two contradictory demands.
The first demand is to harvest value from this immense store of information through data mining, feeding artificial intelligence, and enabling deep-learning machine-to-machine interactions. The second demand is to do all of this while adhering to regulatory mandates protecting privacy and fending off cyber criminals who are constantly probing the global threatscape for entry points, of which there are many.
An End to Business as Usual
The seemingly endless headlines about major data breaches helped give birth to the GDPR and CCPA. Such regulations should push organizations to protect their digital assets more tightly, not just because it is the right thing to do, but because allowing a breach or falling out of privacy compliance has become even more painful.
For example, if the 2019 Capital One breach, in which cyber criminals gained access to more than 100 million customer records, happened today, the penalties under CCPA would be staggering. If just 10% of their customers were subject to CCPA, the maximum fine could be $35 billion. Risk like that means an end to business as usual. Tight compliance and data security are essential.
Needed: Data Stewardship
Data is a powerful resource, likened to the oil that fuels AI and to the gold that comes from harvesting its insights. Data stewardship has become a C-suite affair, with Chief Information Officers and Chief Data Officers being tasked to balance extracting value from data against remaining secure and compliant. The task is made exponentially more difficult not just by the rapid growth of data but also by the diverse sources from which it emerges. The average large company has more than 400 custom applications, many of them mission-critical line-of-business applications, and many of them running on legacy platforms.
Analysts and other knowledge workers need to tap into data generated through such applications, whether for data mining, AI training, or simply to support application integration, digital transformation, or legacy modernization. This places pressure on data stewards to support these high-value efforts, while ensuring data usage remains compliant. Similar demands are placed on application developers — with 92% of large organizations having in-house software development teams.
Data Stewardship Tools
To draw value from the terabytes of data an organization might have in its own systems, or from the exabytes of data beyond, new types of tools will be required. I think of them as data stewardship tools. CIOs and Chief Data Officers need a central clearinghouse application that can filter data requests in real time to ensure that data, whether going to an analyst or streaming into an AI application, is safe to release from a compliance and security standpoint. With such a tool, data stewards could work from a single console to apply internal policies and regulatory requirements across all of their applications and data stores.
Such a system should also have a powerful SDK that frees developers from having to write their own code to enforce compliance across their applications. Rather than having 40 applications with 40 compliance solutions, an organization’s development teams should be able to simply plug into a set of data connectors that integrate with the compliance console, providing consistent protection along with visibility and control of data at enterprise scale.
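To make the clearinghouse idea concrete, here is a minimal sketch in Python of a single policy point that every application routes its data requests through. It is an illustration only, not any product's actual API: the names `Clearinghouse`, `DataRequest`, `SENSITIVE_FIELDS`, and `filter_record` are all hypothetical, and the policy (mask sensitive fields unless the requester's role is explicitly allowed to see them) is an assumed example of what a steward might configure.

```python
from dataclasses import dataclass, field

# Hypothetical classification of sensitive fields. A real deployment would
# presumably load this from a data catalog rather than hard-coding it.
SENSITIVE_FIELDS = {"ssn", "email", "diagnosis"}


@dataclass(frozen=True)
class DataRequest:
    requester_role: str  # e.g. "analyst" or "compliance_officer"
    purpose: str         # recorded for audit; not used in the decision here
    fields: tuple        # names of the fields the requester is asking for


@dataclass
class Clearinghouse:
    # Maps a role to the sensitive fields it may see unmasked.
    allowed_sensitive: dict = field(default_factory=dict)

    def filter_record(self, request, record):
        """Return only the requested fields, masking disallowed sensitive ones."""
        allowed = self.allowed_sensitive.get(request.requester_role, set())
        released = {}
        for name in request.fields:
            if name not in record:
                continue  # never invent data that was not in the record
            if name in SENSITIVE_FIELDS and name not in allowed:
                released[name] = "***"
            else:
                released[name] = record[name]
        return released


# One policy, defined once, applied to every caller.
hub = Clearinghouse(allowed_sensitive={"compliance_officer": {"ssn", "email"}})
record = {"customer_id": 17, "email": "a@example.com",
          "ssn": "000-00-0000", "balance": 250.0}

analyst_request = DataRequest("analyst", "churn model",
                              ("customer_id", "email", "balance"))
print(hub.filter_record(analyst_request, record))
# {'customer_id': 17, 'email': '***', 'balance': 250.0}
```

The design point is that each application calls the same `filter_record` rather than carrying its own masking logic, so a change to the policy takes effect everywhere at once.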
In high-level terms, this is what we are creating at Datafi: an integrated data stewardship tool that allows organizations to unlock the enormous value tied up in their data, while ensuring that all data access is granted in a compliant and secure manner. As the world continues moving from terabytes to petabytes to exabytes to zettabytes and beyond, we will be there to help developers and data stewards maximize the value they can draw from data while protecting against the chance of noncompliance. We want to keep worlds from colliding.