The Internet Of Sound

Patrick Bergel Contributor

Patrick Bergel is a creative technologist and one of the founders of Chirp, a platform for sharing data with sound.

How do birds communicate? By singing. And now it’s the turn of the machines. A new crop of businesses is creating what’s referred to as the Internet of Sound.

A Brief History

Let’s rewind. The history of sound-as-signal is deep. In the beginning, horns, drums and bells rang the alarm, roused the congregation and directed military troops and urban workers: city-sized ringtones guided our lives. These sounds were primarily communicative; musicality was a secondary concern.

Analog sonic codes can be found in unlikely places: composers from Mozart to Schumann hid private audio-numerological jokes in their music, underwater modems guided naval vessels, telephone networks babbled dial-tone enharmonics.

For many (including me), the first experience of the Internet was the grackle squawk of a modem, of PCM-encoded games on cassette — sound not as data per se but as a by-product of data transmission, designed neither for the air nor the ear, but for the wire.

We arrive at the digital age with modern systems that encode URLs as sequences of notes, transmitted over the air as tiny audio clips that can be decoded in real time on mobile devices. In the weightless world of digital, it’s easy to forget that information is a thing; it’s stuff. So why not sound? Now, in the modern era, the machines can sing. And they can sing anything, from pictures to payments.

The Internet Of Sound Is Here

This is what is called the Internet of Sound, the speakernet. Using sound to send links to remote networks is one thing, but on-device lookup tables obviate the need for another network: the sound is the network. Today’s technology can send only tiny amounts of data within or above the audible range, but that is enough to encode any URL ever created as audio lasting a few seconds or less.
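To make the lookup-table idea concrete, here is a minimal Python sketch. It is an illustration under assumptions, not any product's actual scheme: the table contents, the `resolve` and `bits_needed` names, and the example.com URLs are all hypothetical. The point is only that a short code can stand in for a long URL, and that very few bits are needed to index a very large table.

```python
# Hypothetical on-device lookup table: short numeric codes stand in for full URLs.
# The entries and URLs below are placeholders invented for this sketch.
LOOKUP = {
    0: "https://example.com/menu",
    1: "https://example.com/map",
    2: "https://example.com/pay",
}

def resolve(code):
    """Return the URL for a received code, or None if the code is unknown."""
    return LOOKUP.get(code)

def bits_needed(n_items):
    """Smallest number of bits that can index n_items distinct entries."""
    return max(1, (n_items - 1).bit_length())

print(resolve(1))
print(bits_needed(len(LOOKUP)))   # bits to index this tiny table
print(bits_needed(10**12))        # bits to index a trillion entries (illustrative figure)
```

Because the sound carries only the index, the size of the thing being shared has no effect on how long the sound lasts.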

Why Is This A Powerful Idea?

Consider this: There are more tiny, cheap speakers on the planet than there are people. Why not leverage this ubiquitous, commodity technology? We see a huge opportunity to connect a very large number of devices simply and intuitively.

Sound can go where other networks do not, and sound can be a valuable part of the network ecosystem alongside existing protocols. We can trivially repurpose ATMs, TVs, toys, radios and tablets already — if it carries sound, it can send data.

How Does It Work?

There are a number of techniques for building the Internet of Sound, which fall loosely into three categories:

Descriptive: Selected pre-existing features of the signal uniquely distinguish one signal from another, aka ‘audio fingerprinting.’ Examples include music-recognition services, like Shazam or SoundHound. Network geeks, forgive me: While it’s a stretch to call these true speakernet technologies, even music can be repurposed for point-to-point communication.

Additive: Features that can be retrieved by a machine but not detected by a human listener (aka ‘audio watermarking’) are added to an arbitrary audio signal. Examples include codes used for audience-tracking in broadcast radio. One method, used by companies like Infrasonics, mufin and Civolution, relies on tiny, imperceptibly fast echoes that are normally suppressed by human hearing. These give bitrates roughly equivalent to fingerprinting.

Coded: Instead of plucking a sound from a large online library of known sounds, or adding watermarks on the fly, the entire audio signal (pitch, tone-color, phase or amplitude) is itself the code. Modern examples designed to be robust to noise, distortion and compression include multi-pitch systems like Chirp and LISNR. More recently, Google Tone, another form of multiple-frequency encoding, has entered the market. Pure-code signals ‘in plain hearing’ have the advantage of being much faster than the first two techniques.
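The third, coded category is the easiest to sketch in a few lines. The Python below is a toy illustration of the general idea of multi-pitch encoding, and emphatically not the actual protocol used by Chirp, LISNR or Google Tone. Every constant (sample rate, note length, the 16 frequencies) and every function name is an assumption chosen for this example. Each half-byte is played as one of 16 pure tones, and the decoder picks the loudest candidate pitch in each note using the Goertzel algorithm, a standard way of measuring the energy at a single frequency.

```python
import math

SAMPLE_RATE = 44100        # samples per second (assumed)
SAMPLES_PER_TONE = 2205    # 50 ms per note at 44.1 kHz (assumed)
# One pitch per 4-bit symbol: 2000 Hz, 2200 Hz, ... 5000 Hz (assumed spacing).
FREQS = [2000.0 + 200.0 * i for i in range(16)]

def encode(data):
    """Turn bytes into audio samples: two notes per byte, one per half-byte."""
    samples = []
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            freq = FREQS[nibble]
            samples.extend(
                math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
                for n in range(SAMPLES_PER_TONE)
            )
    return samples

def goertzel_power(block, freq):
    """Energy of `block` at `freq`, computed with the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in block:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def decode(samples):
    """Recover bytes by finding the strongest candidate pitch in each note."""
    nibbles = []
    for start in range(0, len(samples) - SAMPLES_PER_TONE + 1, SAMPLES_PER_TONE):
        block = samples[start:start + SAMPLES_PER_TONE]
        powers = [goertzel_power(block, f) for f in FREQS]
        nibbles.append(powers.index(max(powers)))
    pairs = zip(nibbles[0::2], nibbles[1::2])
    return bytes((hi << 4) | lo for hi, lo in pairs)

audio = encode(b"chirp")
print(decode(audio))
```

With these assumed settings, each note carries 4 bits in 50 ms, or 80 bits per second. A real system would also need synchronization, error correction and tolerance for noise and reverberation, none of which this sketch attempts.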

But…

All methods have relative strengths and weaknesses, and what helps usability often hurts engineering, and vice versa. The signal can be rich in data, but hideous to hear. The signal can be embedded imperceptibly in a string quartet, but encode minuscule amounts of data. The signal may be hopelessly fragile to real-world noise or reverberation. Or the signal may be both unlovely and unreliable at the same time.

Data over audio is necessarily slow. Data over pleasant-sounding audio is slower still. Thus, the trick is to send pointers instead of files. Sonic data also raises issues of security: One-to-many data sharing is hugely useful, but insecure by design. What if you want to share with just one person? This is an interesting challenge, and novel solutions are being worked on as we speak.
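A little arithmetic shows why pointers beat files. The Python sketch below assumes a bitrate of 100 bits per second and payload sizes of 8 bytes for a pointer and 2 megabytes for a photo. These figures are illustrative assumptions, not measurements of any real system, and `seconds_to_send` is a hypothetical helper.

```python
def seconds_to_send(n_bytes, bits_per_second):
    """Time to transmit n_bytes at the given bitrate, ignoring any overhead."""
    return n_bytes * 8 / bits_per_second

RATE = 100                 # bits per second (assumed)
POINTER_BYTES = 8          # a short identifier that points at the real content
PHOTO_BYTES = 2_000_000    # a typical photo, roughly 2 MB (assumed)

pointer_time = seconds_to_send(POINTER_BYTES, RATE)
photo_time = seconds_to_send(PHOTO_BYTES, RATE)

print(f"pointer: {pointer_time:.2f} seconds")
print(f"photo:   {photo_time / 3600:.1f} hours")
```

Under these assumptions the pointer takes well under a second, while the photo itself would take almost two days.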

Lastly, and most importantly, do we really need the Internet of Sound? I believe so. Sound is in many ways the first, and yet the most neglected, network: a bridge over the last few feet, a medium that goes where other networks cannot.

Take Care

The sound itself matters. It is vital that sound is used respectfully, with care for the sonic environment. We have enough casual and arrogant noise pollution as it is, from headphone spill to street noise, to sonic UIs as a tedious afterthought: mere keypress beeps.

As audio geeks and sound designers, we take great care when designing the sounds we send, and we explicitly model our audio on sound from nature, specifically from the language of birds. The Internet of Sound needs to be a humane form of communication: that is, one that respects users by putting the human ear first and foremost.

Sound Is Everywhere

It’s time to get excited. The opportunities for the Internet of Sound are frankly mind- (if not ear-) boggling, from the range of applications available to the ease of deployment. The potential now exists to re-engineer speech, music and sound design for data sharing, and my company is active here.

New hardware will accelerate uptake: Ever-cheaper dedicated digital signal processing (DSP) chips in mobiles and connected devices, built for always-on listening and hands-free UIs, increase the reach and efficiency of Internet of Sound products. We have already seen audio used in payments to send bitcoin over the radio, in classrooms to share pictures and on web pages to pass maps to mobiles. And we’re only just getting started. Bottom line: wherever there is sound, there is data, from dumb phones to doorbells.

The Internet of Sound is coming, and one thing is for sure — you ain’t heard nothing yet.