Algorithmically Generated Realistic Sound On Show At SIGGRAPH

Researchers at Cornell University are hard at work on a project that sounds odd at first, but is in fact a perfectly natural extension of existing 3D and computing technology. They’re making an engine for producing the sounds of colliding objects by simulating the materials of the objects themselves in a virtual space, and then calculating the forces and vibrations that would be produced. Academically it’s a challenging proposition, but it has plenty of practical applications as well.
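
The article doesn’t spell out the exact method, but physically based contact-sound systems commonly work by modal synthesis: an object’s vibration is represented as a bank of damped sinusoidal modes, and a collision excites those modes in proportion to the impact force. Here’s a minimal Python sketch of that general idea; the mode frequencies, dampings, and gains are made-up illustrative values, not the researchers’ data.

```python
# Minimal sketch of modal sound synthesis: an object's vibration is modeled
# as a bank of damped sinusoidal modes, each excited by an impact force.
import numpy as np

SAMPLE_RATE = 44100

def impact_sound(modes, impact_force, duration=1.0):
    """Sum damped sinusoids for a single impact.

    modes: list of (frequency_hz, damping_per_sec, gain) tuples.
    impact_force: scalar strength of the collision impulse.
    """
    t = np.arange(int(duration * SAMPLE_RATE)) / SAMPLE_RATE
    signal = np.zeros_like(t)
    for freq, damping, gain in modes:
        signal += impact_force * gain * np.exp(-damping * t) * np.sin(2 * np.pi * freq * t)
    return signal

# A hypothetical "ceramic plate": a few stiff, lightly damped modes.
plate_modes = [(1200.0, 8.0, 1.0), (2750.0, 12.0, 0.6), (4100.0, 20.0, 0.3)]
clip = impact_sound(plate_modes, impact_force=0.8)
```

In practice the mode data would be derived from an object’s geometry and material properties ahead of time, so only the relatively cheap summation has to happen when a collision occurs.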

Simulated sound would perhaps be most easily applied in 3D games, which, despite having nearly photorealistic models, textures, and lighting, still rely on a limited cache of pre-recorded sounds to play when, say, a table tips over. By simulating every object on the table and tracking the physical effects of collision with the floor and with other objects, along with the resulting reverberations, a more realistic and accurate sound can be created on the fly, or at least that’s the theory.

Right now the researchers acknowledge two obstacles. First, the physical world needs to be simplified greatly in some cases in order to keep the amount of data workable. A ball hitting the floor is one thing, with only a few factors to calculate, but what about a stack of dishes rattling against each other on a table that has been jostled? The number of contact points must be reduced so that thousands or millions of interactions don’t have to be tracked separately, while enough must remain to produce a realistic sound. It’s a balancing act governed by the number and type of objects and the computing power at hand.
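
The paper’s actual reduction scheme isn’t described in the article, but the general idea can be sketched as merging contact points that lie close together, so a cluster of micro-collisions is treated as one stronger contact. The greedy grouping and merge radius below are assumptions purely for illustration.

```python
# Illustrative contact-point reduction (not the researchers' algorithm):
# nearby contacts are merged into a single representative contact whose
# force is the sum of its members, trading detail for tractability.
import numpy as np

def reduce_contacts(positions, forces, merge_radius=0.01):
    """Greedily merge contact points closer than merge_radius (meters).

    positions: (N, 3) array of contact locations.
    forces:    (N,) array of positive impulse magnitudes.
    Returns the merged (positions, forces) arrays.
    """
    merged_pos, merged_force = [], []
    used = np.zeros(len(positions), dtype=bool)
    for i in range(len(positions)):
        if used[i]:
            continue
        # Group this contact with every not-yet-merged contact within the radius.
        dists = np.linalg.norm(positions - positions[i], axis=1)
        group = (dists < merge_radius) & ~used
        used |= group
        # Represent the group by its force-weighted centroid and total force.
        total = forces[group].sum()
        merged_pos.append((positions[group] * forces[group][:, None]).sum(axis=0) / total)
        merged_force.append(total)
    return np.array(merged_pos), np.array(merged_force)
```

Dialing the merge radius up leaves fewer interactions to simulate at the cost of audio detail, which is exactly the balancing act described above.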

And it seems that not everything can be generated completely from scratch just yet. Their demo at SIGGRAPH has the stack of dishes mentioned above, but apparently soundtracking flames isn’t so easy. The low-frequency part they’ve got, but for the rest they had to base their models on recorded fire sounds and then “paint” them onto the low end. That said, most common sounds are predictable in the same way physical interactions are predictable (being that they are themselves sums of physical reactions), and it’s just a matter of building the tools to simulate them.
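
As a rough illustration of that “painting” step (a guess at the spirit of the approach, not the researchers’ pipeline), one could keep the simulated signal below some cutoff frequency and take everything above it from a recording. The cutoff and signal names below are illustrative assumptions.

```python
# Rough sketch of "painting" recorded high frequencies onto a simulated low end:
# low-pass the simulated signal, high-pass a recorded fire sample, and mix.
import numpy as np
from scipy.signal import butter, sosfilt

SAMPLE_RATE = 44100
CUTOFF_HZ = 500.0  # assumed split point between simulated and recorded bands

def combine_fire_sound(simulated_low, recorded_fire):
    """Keep the simulated signal below CUTOFF_HZ and the recording above it."""
    lowpass = butter(4, CUTOFF_HZ, btype="low", fs=SAMPLE_RATE, output="sos")
    highpass = butter(4, CUTOFF_HZ, btype="high", fs=SAMPLE_RATE, output="sos")
    n = min(len(simulated_low), len(recorded_fire))
    return sosfilt(lowpass, simulated_low[:n]) + sosfilt(highpass, recorded_fire[:n])
```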

Parallel processing hardware (like graphics cards or many-core CPUs) will be necessary to make these calculations in real time, though: simulating the fire noise takes hours just for a short clip. But the very idea is compelling to anyone who’s heard the same “glass breaking” or “ricochet” noises in games or even movies, where the catalog of sounds is limited.

Right now it’s still in the labs, but this is definitely the kind of thing that gets turned into a product and sold. A company like Nvidia or Havok would love to get their hands on this. Unfortunately there’s no video, but if one becomes available after it’s shown at SIGGRAPH, we’ll put it here.

Update: it turns out that there are videos, just not of the presentation: