All Your Metadata Shall Be In Water Writ

The power of the internet lies in its near-infinite mutability. It’s an edifice of information being added to and sculpted by as many hands as there are eyes viewing it. Truly democratic and increasingly accessible, it will soon be the vector for most communication that takes place on our world.

But its mutability is also a weakness, as so many great strengths are. The weakness arises from a lack of permanence: it is impossible to make an indelible mark.

Lack of permanence! you say. Why, I can request 500 pages of data on file at Facebook, and the NSA is building a profile on me that includes every cookie I’ve ever been issued. True, but the data itself is impermanent. Vulnerable in a dozen ways to being rewritten, manipulated, retouched, softened, or otherwise reduced from a record to a falsification.

The data we create today is not etched in stone but “writ in water.” The benefits of this we have seen, and monumental they are, but soon we will know its danger, too.

The chocolate ration has been increased

Broadly speaking, the potential for fraud on the internet is as great as the potential for good, but the idea of this article isn’t to enumerate the many ways in which data can be manipulated for nefarious purposes. The problem lies at a higher level of organization: higher than engineering (e.g. “video is too high-bandwidth”) but not quite philosophical (e.g. “why should we pay for software”). It’s simply this:

There is no simple and reliable way to tell whether the information you are looking at has been altered in any way. Every word, every image, every byte has to some significant degree an unknown provenance.

There are ways to be reasonably sure about some things, like the complicated subpixel analysis to which photographs can be subjected, or textual analysis, or timestamps. But like all countermeasures these are soon defeated, and you’re back where you started.

It’s not a problem unique to digital data; counterfeiting and data manipulation go back to prehistory. We don’t have to solve the problem of human iniquity. But it would be nice if there were some way of ensuring that any given portion of data, be it text, image, moving image, audio, or what have you, has remained undisturbed since it was set down.

In a way it becomes a philosophical problem here at last, but one that ends in a sort of informational nihilism. How can you really be sure something is original? How do we know someone hasn’t modified the Constitution, replaced the surveillance tapes, bribed the scholars, a la George Orwell? That way lies madness. But it is practical to want to know whether a news report you just read was changed after publication, or whether a photo has been retouched in any way.
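The practical version of that question can at least be sketched. What follows is a minimal illustration, not a solution: it assumes the reader saved a cryptographic digest of the article at the moment of publication and keeps that digest somewhere the publisher cannot touch. The article text and the names `fingerprint` and `unchanged` are invented for the example.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return the SHA-256 digest of a text, encoded as UTF-8."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def unchanged(current_text: str, digest: str) -> bool:
    """True if the text still matches the digest recorded earlier."""
    return fingerprint(current_text) == digest


# Hypothetical article text, made up for illustration.
published = "The chocolate ration has been reduced to twenty grams."

# Assumption: the reader records this digest at publication time
# and stores it out of the publisher's reach.
recorded_digest = fingerprint(published)

# Later, the reader fetches the page again and compares.
fetched_later = "The chocolate ration has been increased to twenty grams."

print(unchanged(published, recorded_digest))      # text as first published
print(unchanged(fetched_later, recorded_digest))  # text after a quiet edit
```

The catch is the one this article is about: the check is only as trustworthy as the place the digest is kept. If the record of the digest can be rewritten too, you are back where you started.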

Of course it’s not always necessary to know these things. With very little in the way of real integrity, the internet has come extremely far. Just as it is not necessary to have every exchange of goods notarized and every conversation recorded, it’s not necessary to record everything on the internet in some irreversible, indelible way. But it’s troubling that even if we wanted to write something in stone, there is no established way to do so.

It’s for this reason that the cautious (and indeed, the paranoid) are unwilling to renounce local storage. And although you may not ever understand the technology you use, you at least can ensure to your own satisfaction that your data is yours, and furthermore is the same as it was last week.

Write-once memory

This impermanence is not a fundamental limitation of the internet or of computing, which means that eventually it will be solved. Will there be data centers stacked high with write-once solid-state storage where the data is seared permanently into nanostructures? Will a distributed network check the internet’s sums at the drop of every packet? Will people just have to trust each other?
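To make the idea of checking sums a little more concrete, here is a toy sketch of one well-known approach, a hash chain, in which each record’s digest also covers the digest before it. It is an illustration under stated assumptions, not a prediction of how the problem will be solved: the names `link`, `build_chain`, and `first_tampered` are hypothetical, and the sketch assumes the list of digests is itself kept somewhere safe from tampering.

```python
import hashlib

# Fixed starting value for the chain (an arbitrary choice for this sketch).
GENESIS = "0" * 64


def link(prev_digest: str, record: str) -> str:
    """Digest of a record combined with the digest that precedes it."""
    data = (prev_digest + record).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def build_chain(records):
    """Return one digest per record, each depending on all earlier records."""
    digests = []
    prev = GENESIS
    for record in records:
        prev = link(prev, record)
        digests.append(prev)
    return digests


def first_tampered(records, digests):
    """Return the index of the first record that fails its check, or None."""
    prev = GENESIS
    for i, (record, digest) in enumerate(zip(records, digests)):
        if link(prev, record) != digest:
            return i
        prev = digest
    return None


# Hypothetical records, made up for illustration.
log = ["report filed", "photo attached", "report published"]
sums = build_chain(log)

altered = ["report filed", "photo retouched", "report published"]
print(first_tampered(log, sums))      # untouched log
print(first_tampered(altered, sums))  # log with its second entry rewritten
```

Because every digest depends on the ones before it, quietly rewriting an old record means recomputing every later digest as well, which is much harder to do unnoticed when many independent parties hold copies of the sums.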

For now the web of trust suffices, but when a murder trial hangs on information that one party says was manipulated and the other says is original, trust is no longer an option. Originality must be proved, or manipulation disproved, beyond a reasonable doubt. And doubts are becoming ever more reasonable; how many news photographs have you heard of recently that were modified in nontrivial ways to advance a political agenda? Wars have been started over less.

As the virtual world continues to merge and integrate with the real, they both take on each other’s aspects. The visual and physical vocabulary of computers, for instance, reaches back to the earliest compatible representation of a concept, as I wrote last month in Iconoclasm. And why should there not develop an analog for authenticity that does not rely on an authority or certificate? I know a mountain is old not because a geologist tells me so, but because it is self-evident that mountains are old, just as it’s self-evident that you can’t touch the moon (well, most of us can’t), that you can’t pick up a house, that you can’t breathe in water.

A real and powerful mechanism for establishing authenticity must be a cornerstone of the internet if it is to be used with confidence in matters of gravity. How it will ultimately work will likely be a surprise, but as with most interesting developments in technology, today’s fantasy is tomorrow’s necessity.