Four Patents That Changed Enterprise Storage

Editor’s note: Satyam Vaghani is co-founder and CTO of PernixData, a Silicon Valley-based enterprise storage startup.

Sports fans get very excited about Final Fours — whether it’s the new four-team playoff in college football, the semi-finals competitors at Wimbledon or, most famously, the remaining competitors in NCAA basketball’s March Madness. As we are in the midst of this year’s tournament, I started pondering a Fab Four, if not the Final Four, of the world’s impactful storage patents to date.

This was quite a competition. College basketball’s March Madness starts with 68 teams, but my game had a field of more than 11 million patents issued by the U.S. Patent and Trademark Office in the last 50 years. Okay, I have no clue how many of those patents involved IT storage, but it’s safe to say that the number is in the many thousands.

I made my picks based on some pretty simple criteria: the patents had to represent truly great, game-changing technology innovation; they had to have led to significant business success for the companies where they were invented; and they had to be sexy — at least as we define “sexy” in the storage world.

So here we go. Break out your bracket sheets, and think of me as a younger and more soft-spoken Dick Vitale. I present to you The Four Patents that changed enterprise storage (in no particular order).

1. Patent 6,928,526 granted to Data Domain in 2005

Data Domain was a Silicon Valley company founded at the start of the 21st century with a clear mission: Disrupt the backup storage market with a cost-effective, disk-based substitute for tape.

In those days, secondary or backup storage was exclusively tape-based. Tape could hold more data and was much cheaper. Using expensive disk storage to hold backup data that companies didn’t need to access on a daily basis made no sense.

Then on Aug. 22, 2005, Data Domain announced that the U.S. Patent and Trademark Office had granted it a patent entitled “Efficient Data Storage System,” invented by Ben Zhu, Hugo Patterson and Chief Scientist Kai Li. The casual-sounding title belied a revolutionary innovation: deduplication.

Deduplication identifies and removes redundant data segments at high speed, thus allowing organizations to reduce the number of backup copies they need to make of similar files. So if you were to store a file five times, in five different directories, Data Domain invented a way to figure out there are duplicates and to store the file only once.
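To make the five-copies example concrete, here is a minimal sketch of the general idea, not Data Domain's patented method. It assumes fixed-size segments fingerprinted with SHA-256; the class name `DedupStore` and its methods are hypothetical names chosen for illustration.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: each unique segment is kept only once.

    Assumption for illustration: fixed-size segments and SHA-256
    fingerprints. Production systems typically use more elaborate
    segmenting and indexing.
    """

    def __init__(self, segment_size=4096):
        self.segment_size = segment_size
        self.segments = {}  # fingerprint -> segment bytes
        self.files = {}     # path -> ordered list of fingerprints

    def put(self, path, data):
        recipe = []
        for start in range(0, len(data), self.segment_size):
            segment = data[start:start + self.segment_size]
            fingerprint = hashlib.sha256(segment).hexdigest()
            # Keep the segment only if this fingerprint has not been seen.
            self.segments.setdefault(fingerprint, segment)
            recipe.append(fingerprint)
        self.files[path] = recipe

    def get(self, path):
        # Rebuild the file by following its list of fingerprints.
        return b"".join(self.segments[fp] for fp in self.files[path])

    def stored_bytes(self):
        return sum(len(segment) for segment in self.segments.values())
```

Storing the same payload under five different paths with this sketch leaves `stored_bytes()` equal to the size of one copy, while each path can still be read back in full.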

The technology leveled the playing field, making disk storage a cost-effective replacement for tape worldwide. And you can argue that it set the stage for emerging technologies like cloud-based backup because you’re not going to do that via tape but rather over a network to disk. Data Domain’s patent fundamentally changed the world in terms of how data backups take place.

In 2009, EMC acquired Data Domain for more than $2 billion.

2. Patent 6,289,356 granted to NetApp in 2001

NetApp set out to solve a problem with disks. It’s very efficient to write data sequentially to disks or read data sequentially from disks, but disks are excruciatingly slow if the writing or reading is random. The world has very few well-behaved applications that read and write sequentially.

NetApp’s David Hitz, Michael Malcolm, James Lau and Byron Rakitzis found a way to eliminate the need to store data or metadata in predetermined locations on disks. They called it Write Anywhere File Layout, or WAFL. Voila! Sequential/random problem solved.

An important feature of WAFL is its ability to very quickly create snapshots, which are read-only copies of the file system. This lets users recover files that have been accidentally deleted and supports business continuity in the event of data loss. WAFL productized snapshotting as a primary function of enterprise-class storage systems. While flash arrays now threaten WAFL with dinosaur status, it’s an exceedingly important technology in the evolution of storage and how NetApp made its name. Also, the disruption flash has caused to WAFL’s relevance is one of the best lessons on the perils of resting on past achievements.
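The link between writing anywhere and cheap snapshots can be sketched in a few lines. This is a toy model under stated assumptions, not NetApp's implementation: `WriteAnywhereVolume`, its flat block map and its append-only block list are hypothetical simplifications.

```python
class WriteAnywhereVolume:
    """Toy volume that never overwrites a block in place."""

    def __init__(self):
        self.blocks = []      # physical blocks, only ever appended to
        self.block_map = {}   # logical block number -> index in self.blocks
        self.snapshots = {}   # snapshot name -> frozen copy of a block map

    def write(self, logical_block, data):
        # Put the new data in the next free location, then repoint the map.
        self.blocks.append(data)
        self.block_map[logical_block] = len(self.blocks) - 1

    def snapshot(self, name):
        # Only the map is copied; no data blocks are duplicated.
        self.snapshots[name] = dict(self.block_map)

    def read(self, logical_block, snapshot=None):
        mapping = self.block_map if snapshot is None else self.snapshots[snapshot]
        return self.blocks[mapping[logical_block]]
```

Because old blocks are left untouched, a snapshot taken before an overwrite keeps returning the earlier contents, while reads of the live volume return the new data.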

3. Patent 7,146,524 granted to Isilon Systems in 2006

Like most patents, this one had a very literal name: “Systems and methods for providing a distributed file system incorporating a virtual hot spare.” Behind the moniker was a game-changing advance in the storing of file data among a multitude of smart storage units that are accessed as a single, logical file system.

As the company’s press release at the time said: “This core data protection feature… dynamically recreates data in the free-space of an Isilon IQ storage cluster, eliminating the need for idle, additional storage units, servers or spare disk drives – dramatically reducing the cost and complexity of a customer’s storage environment.”
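The idea of rebuilding into free space rather than onto an idle spare can be illustrated with a small sketch. It is not Isilon's algorithm: it assumes simple two-way mirroring for protection, and the `Cluster` class and its methods are hypothetical names.

```python
class Cluster:
    """Toy cluster that re-protects data in free space after a node failure."""

    def __init__(self, node_names, capacity):
        self.capacity = capacity  # blocks per node (assumed uniform)
        self.nodes = {name: {} for name in node_names}  # node -> {block_id: data}

    def free_space(self, name):
        return self.capacity - len(self.nodes[name])

    def _emptiest(self, exclude):
        candidates = [n for n in self.nodes
                      if n not in exclude and self.free_space(n) > 0]
        if not candidates:
            raise RuntimeError("no free space left in the cluster")
        return max(candidates, key=self.free_space)

    def write(self, block_id, data, copies=2):
        holders = []
        for _ in range(copies):
            node = self._emptiest(exclude=holders)
            self.nodes[node][block_id] = data
            holders.append(node)

    def fail_node(self, name):
        lost = self.nodes.pop(name)
        for block_id in lost:
            holders = [n for n, blocks in self.nodes.items() if block_id in blocks]
            if not holders:
                continue  # no surviving copy, so nothing to rebuild from
            # Recreate the missing copy in the free space of another node.
            target = self._emptiest(exclude=holders)
            self.nodes[target][block_id] = self.nodes[holders[0]][block_id]
```

After `fail_node`, every block that had a surviving copy is back to two copies, using capacity that was already serving data rather than a drive held in reserve.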

This patent, invented by Sujal Patel, Paul Mikesell, Darren Schack and Aaron Passey, led Isilon to ship a practical distributed storage system – probably the most successful commercial product in its class at the time.

Distributed systems software is very hard to engineer because of the number of components and idiosyncrasies one needs to handle. The folks at Isilon wrote their file system with specific hardware in mind, thus greatly reducing the number of corner cases they needed to handle. This is a good engineering lesson, too: good engineering is about making the most practical product to meet demand now, not necessarily the most perfect product 10 years from now. Good engineering is also iterative; good products can be made great with time and with empirical usage data.

EMC acquired Isilon in 2010 for $2.25 billion and has continued to update the core technology for Hadoop and other new demands.

4. Patent 8,650,359 granted to VMware in 2014

Obviously, VMware pioneered the concept of partitioning a physical computer into many virtual computers – the first company to virtualize x86 architecture. (Patent geeks may be interested to know that early VMware leaders Scott Devine, Edouard Bugnion and Mendel Rosenblum filed patent number 6,397,242 – “Virtualization system including a virtual machine monitor for a Computer with segmented Architecture” – on Oct. 26, 1998 and it was published on May 28, 2002.)

But VMware also developed several patented technologies that have advanced storage capabilities. (Full disclosure: I personally have been involved with several of these patents as the former CTO of VMware’s storage group.)

Patent 8,266,099, for example, describes VAAI. It lets a virtualization platform convey a virtual machine operation to a storage system at a meta level, so that the storage system can do many VM operations internally and efficiently without involving servers. Patent 7,849,098 is the invention that led VMFS to be the only clustered file system in the world that does not require server-to-server network communication to work. All you need to do is plug in shared storage.
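The benefit of conveying an operation to the array instead of moving the data through the server can be shown with a toy comparison. This is not the VAAI interface: `Array`, `clone` and `host_copy` are hypothetical names, and the byte counter is only a stand-in for traffic between server and storage.

```python
class Array:
    """Hypothetical storage array holding named virtual disks."""

    def __init__(self):
        self.disks = {}
        self.bytes_over_wire = 0  # data moved between server and array

    def read(self, name):
        data = self.disks[name]
        self.bytes_over_wire += len(data)
        return data

    def write(self, name, data):
        self.bytes_over_wire += len(data)
        self.disks[name] = data

    def clone(self, source, target):
        # Offloaded path: the array copies internally, so no disk data
        # crosses the wire (the small command itself is not counted here).
        self.disks[target] = self.disks[source]


def host_copy(array, source, target):
    # Server-driven path: every byte is read out and written back.
    array.write(target, array.read(source))
```

Copying a virtual disk with `host_copy` moves twice its size over the wire, whereas `clone` leaves the counter unchanged, which is the kind of saving the offload approach is after.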

But patent 8,650,359, entitled “Computer system accessing object storage system,” holds the most significance for the future of storage, in my opinion. Created by me and several VMware colleagues (Ilia Sokolinski, Tejasvi Aswathanarayana and Sujay Godbole), this patent enables storage systems to store and operate on VMs as first-class objects.

It allows any SCSI or NFS storage system to morph into a VM-aware storage system. This patent is the foundation for Virtual Volumes, or VVOLs, which are just now being introduced into the market. VVOLs closely match the requirements of a virtual machine with the underlying storage, bringing storage into the virtualization age.

There is so much happening in the storage world today, and it remains unclear if these patents will stay in the top spots. With the emergence of new approaches, such as decoupled storage, the need arises to put storage intelligence into servers, increase VM analytics and manage resources effectively. To this end, I’m thrilled to see what is in store for the future.