Letting Data Die A Natural Death


The big story today is about Microsoft subsidiary Danger losing all T-Mobile Sidekick customer data from their servers. Danger is the company noted for the T-Mobile Sidekick, the revolution in cloud mobile, and most memorably, almost everybody living in 90210 having to get new phone numbers because of Paris Hilton. Valued T-Mobile Sidekick customers received a notice today from the company updating them on the “data disruption” problem. The good news is that data is no longer being disrupted. The bad news is that there is no data left to be disrupted.

This latest large-scale publicized data loss will surely lead to managers everywhere forwarding a link to the story to their IT departments asking “what are we doing so that this doesn’t happen to us?” It will lead to the issue of data loss and backups being written about ad nauseam by technology pundits. Research companies will rub their hands together as they prepare new 80-page whitepapers with titles such as “How Companies Who Pay Us Money Can Prevent Your Data Being Lost” (complete with FDA “may cause drowsiness” warning label on the cover). Consultants will flock to their customers, pat them on the head, and reassure them that everything is OK because their project specification PowerPoint shows that they included two of everything (and charged for it).

Backups are a hard sell. Most of us don’t want to think about things going wrong (or put more colloquially, shit hitting the fan). Spending your Saturday afternoon staring at a progress meter that seems to be moving backwards is the polar opposite of fun. If there were a brainwave study of people in the process of backing up data, it would probably show no activity at all (but they could use the results to help calibrate the machines). Furthering the point of no interest, Google Trends shows that while the volume of news stories about backups and data loss is increasing over time, the volume of searches about it is proportionately decreasing. We are only shaken out of this slumber briefly when there is an incident such as the one at Danger this week.

Like the death of a celebrity from a drug overdose, publicized data loss incidents remind us that we should probably do something about taking better care of our data. But we usually don’t, because we quickly remind ourselves that backups are boring as hell, and that it’s Shark Week on Discovery. Our previously well-thought-out backup and recovery plans are expunged as we scan the perimeter of the clinic for the shortest fence to jump over and bolt back to freedom.

Those who are organized and back up their data usually discover the later, larger part of the problem – restoring from a backup: Where did I put the backup? It’s an old copy. That file I was just working on isn’t there. It was never actually backing up. No software I use can read this stupid fucking format, etc. For most of us, by backing up, we are only setting ourselves up for a bigger failure down the road.

If you read almost any technology website or newspaper, you could be forgiven for thinking that “The Cloud” solves everything. When “The Cloud” is proposed as a solution to any problem, most nod in agreement, not wanting to appear out of the loop by asking what the hell it even means. It certainly isn’t a solution to backups – as Sidekick users found out today, and ironically, as 7,500 users of online backup provider Carbonite found out after the company lost their backups (Carbonite can take some comfort in that they now rank very well for ‘data loss’ in search engines because of the incident. What do they say about bad publicity?).

In the Danger case, it appears from initial speculation that the data was lost because they attempted to upgrade a storage array without backing it up first. Here is a case of smart and rational people who do this for a living at one of the best companies in the world, and they didn’t even bother making a backup – so what hope do we have? Relying on the cloud as a backup didn’t work, because somebody forgot to back up the backup. Roman poet Juvenal foreshadowed this very problem when he wrote “Quis custodiet ipsos custodes?” (at least I think he did, hard to tell because there was no word for “backup” back then).

Storage technology does a reasonable job of keeping data intact, considering that it is only a spilt Red Bull away from not functioning at all. The methods used to store data are vulnerable to simple things such as a magnet, and we live on one of those (hint: The Earth). We have become far too reliant on something that is inherently unreliable.

Every systems administrator has at some point in their life experienced the sickening feeling of realizing that they have lost data – and do not have a backup. It is so common that Eminem even wrote a song about it (Lose Yourself, about a sysadmin who, on realizing he doesn’t have a backup, decides it is time for another career (replace ‘music’ with ‘man tar’ in the lyrics for the full effect)). That sick feeling comes from the pressure and responsibility of the situation: sysadmins run the technology, and we expect technology to solve this problem.


The solution may be to do nothing, certainly not to panic. The biggest problem is that we hoard data. We produce more data and information than we ever have, and we are all vain enough to believe that the data we create is so fantastic that it should live on for eternity. Losing the contact list on your phone shouldn’t be a problem – you should know who your friends are anyway. If you are losing sleep because you can’t find an old email you wrote, you likely have deeper issues to address.

Technology has spoiled us to the point where we feel nostalgic when we lose data that didn’t really matter in the first place. If it did matter, a primal instinct would have driven us to do more to preserve it, rather than rely on a sleep-deprived sysadmin on the other side of the country. If you didn’t care enough to take care of it yourself, then you didn’t really need it. It is our misguided expectation of technology that causes us to panic when we lose data. The only people who have a larger incentive than you to preserve your data are those who are using it to target an advertisement at you, or sell you something.

Not only is a lot of this data not important, but do we really want to keep it? I certainly would not want a full account of everything I did in my youth sitting on a server somewhere. I am also certain that we do not want the record of our time as a society to be documented and discovered by future civilizations based on Twitter messages.

Data experiences its own form of natural selection. What is important will survive; the remainder will thankfully fade away.