How storage is changing in the age of big data

Have you ever tracked all the ways you use data in a single day? How many of your calories, activities, tasks, messages, projects, correspondences, records and more are saved and accessed through data storage every day? I bet you won’t be able to stop once you start counting.

Many of us never pause to consider what that means, but data is growing exponentially, with no end in sight. There are already more than a billion cellphones in the world, emitting 18 exabytes of data (an exabyte is a billion gigabytes) every month. As more devices connect to the Internet of Things, sensors on everything from automobiles to appliances push the data output even higher.

IDC predicts that by 2020 the amount of data in the world will have grown tenfold from 2013 levels, reaching a staggering 44 zettabytes. The only logical response to this data deluge is to create more ways to store and make use of all this information.

Artificial intelligence and machine learning have become major areas of research and development in recent years in response to this data flood, with algorithms working to find patterns that can help manage the data. While this is a step in the right direction in terms of learning from data, it still doesn’t solve the storage problem. And while interesting advances are being made in data storage on DNA molecules, for now, realistic data storage options are still a little less sci-fi sounding. Here are four viable solutions to our storage capacity woes.

The hybrid cloud

We all understand the concept of the cloud. Hybrid cloud storage is a little different, though, in that it combines storage in the cloud with on-site storage or hardware. This creates more value through a “mash-up” that draws on either kind of storage, depending on security requirements and the need for accessibility.

A hybrid data storage solution addresses common fears about security, compliance and latency that pure cloud storage raises. Data can be housed either on-site or in the cloud, depending on its risk classification, latency and bandwidth needs. Enterprises that choose hybrid cloud storage are drawn to its scalability and cost-effectiveness, combined with the option of keeping sensitive data out of the public cloud.
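
To make the idea concrete, here is a minimal sketch in Python of the kind of placement decision a hybrid setup implies. The dataset names, tier labels and latency threshold are hypothetical, invented for illustration; real hybrid cloud products ship their own policy engines.

```python
from dataclasses import dataclass

# Hypothetical tier labels for illustration only.
ON_PREMISES = "on-premises"
PUBLIC_CLOUD = "public cloud"

@dataclass
class Dataset:
    name: str
    sensitive: bool      # regulated or confidential data
    max_latency_ms: int  # how quickly applications need reads back

def place(dataset: Dataset, cloud_round_trip_ms: int = 40) -> str:
    """Route a dataset to a storage tier by risk and latency needs."""
    if dataset.sensitive:
        return ON_PREMISES  # keep sensitive data out of the public cloud
    if dataset.max_latency_ms < cloud_round_trip_ms:
        return ON_PREMISES  # the cloud round trip alone would be too slow
    return PUBLIC_CLOUD     # everything else gets the cloud's scalability

print(place(Dataset("patient-records", sensitive=True, max_latency_ms=500)))
print(place(Dataset("web-analytics", sensitive=False, max_latency_ms=200)))
```

The point is simply that placement becomes a routing decision driven by risk and latency, rather than a property of whichever box the data happened to land on first.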

All flash, all the time

Flash is the most widely used form of data storage in consumer tech, including cellphones. Unlike traditional storage, which keeps data on spinning disks, flash stores and accesses it directly on semiconductor chips, with no moving parts. With flash prices continuing to fall as the technology packs more data into the same amount of space, flash makes sense for a lot of medium-sized enterprises.

Recent breakthroughs by data storage company Pure Storage aim to scale flash to the next level, making it a real contender for large enterprises in the big data storage war. Pure extended its all-flash approach with FlashBlade, a box designed to store petabytes of unstructured data at an unprecedented scale. The refrigerator-sized box can store up to 16 petabytes of data, and co-founder John Hayes believes that amount can be doubled by 2017. Sixteen petabytes is already five times the capacity of comparable traditional storage devices, so Pure’s scalable blade approach is clearly a step in the right direction.

I-SDS

Intelligent software-defined storage (I-SDS) removes the cumbersome proprietary hardware stacks generally associated with data storage and replaces them with infrastructure that is managed and automated by intelligent software rather than dedicated hardware. I-SDS is also more cost-efficient, with faster response times, than traditional hardware-bound storage.

I-SDS moves toward a storage design that mimics how the human brain stores vast amounts of data yet can call it up at a moment’s notice. Essentially, I-SDS lets big data streams be clustered as they arrive. Approximate search and stream extraction combine to process huge amounts of data while pulling out the most frequent and most relevant results. Trading a small, controlled amount of precision for speed is what gives I-SDS its advantage over older storage models: answers come back fast and are still accurate enough to act on.
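
The article doesn’t name a specific algorithm, but a count-min sketch is a standard example of this approximate, stream-oriented style: it estimates how often each item appears in a stream using a small, fixed amount of memory, overcounting slightly rather than storing every item. A minimal Python sketch, with illustrative sizes:

```python
import hashlib

class CountMinSketch:
    """Fixed-memory frequency estimator: estimates may overshoot the
    true count but never undershoot it, and memory stays at
    width * depth counters no matter how many distinct items arrive."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        # Derive `depth` bucket indices from slices of one hash digest.
        digest = hashlib.sha256(item.encode()).digest()
        for row in range(self.depth):
            chunk = digest[row * 4:(row + 1) * 4]
            yield row, int.from_bytes(chunk, "big") % self.width

    def add(self, item):
        for row, col in self._buckets(item):
            self.table[row][col] += 1

    def estimate(self, item):
        # The true count is at most the minimum across all rows.
        return min(self.table[row][col] for row, col in self._buckets(item))

# Example: roughly how often did "sensor-42" appear in the stream?
sketch = CountMinSketch()
for event in ["sensor-42", "sensor-7", "sensor-42", "sensor-42", "sensor-7"]:
    sketch.add(event)
print(sketch.estimate("sensor-42"))  # 3 (may overestimate, never under)
```

Width and depth trade memory for accuracy, which is exactly the kind of bounded imprecision that approximate search accepts in exchange for speed.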

Cold storage archiving

Cold storage is economical for data that isn’t often used. By keeping data that doesn’t need to be readily available on slower, less expensive disks, enterprises free up space on faster disks for information that does need to be close at hand. This option makes sense for large enterprises with backlogs of information that doesn’t need to be accessed regularly.

Such enterprises can store their data based on its “temperature,” keeping hotter data on flash, where it can be accessed quickly, and archiving colder information in cost-effective cold storage. However, the deluge of big data means enterprises are gleaning so much data at once that it isn’t always clear what is valuable and what can be put on the back burner.
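
A simple way to picture a temperature policy is classification by last access time. The thresholds below are hypothetical and real archiving policies are tuned per workload; the hard part, as noted above, is knowing in advance which data will turn out to be valuable.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds for illustration only.
HOT_WINDOW = timedelta(days=7)
WARM_WINDOW = timedelta(days=90)

def temperature(last_access: datetime, now: datetime) -> str:
    """Classify data by how recently it was read."""
    age = now - last_access
    if age <= HOT_WINDOW:
        return "hot"   # keep on flash for fast access
    if age <= WARM_WINDOW:
        return "warm"  # slower, cheaper disk
    return "cold"      # archive tier: cheapest, slowest to retrieve

now = datetime(2016, 6, 1)
print(temperature(datetime(2016, 5, 30), now))  # hot
print(temperature(datetime(2015, 11, 1), now))  # cold
```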

Bigger data, smarter storage

While the sheer volume of data continues to grow exponentially, so too does its perceived value to companies eager to glean information about their consumers and products. Data storage needs to be fast, intuitive, effective, safe and cost-effective: a tall order in a world where data growth now far outpaces population growth. It will be interesting to see which method can best address all of these needs at once.