Solving entertainment's globalization problem with AI and ML

Teresa Phillips Contributor

Teresa Phillips is the CEO and co-founder of Spherex, a data and technology company that is pioneering the culturalization of media for international film and television distribution.

The recent controversy surrounding the mistranslations found in the Netflix hit “Squid Game” and other films highlights technology’s challenges when releasing content that bridges languages and cultures internationally.

Every year across the global media and entertainment industry, tens of thousands of movies and TV episodes exhibited on hundreds of streaming platforms are released with the hope of finding an audience among 7.2 billion people living in nearly 200 countries. No audience is fluent in all of the roughly 7,000 recognized languages, so content released internationally needs subtitles and audio dubs prepared for global distribution.

Known in the industry as “localization,” creating “subs and dubs” has, for decades, been a human-centered process, where someone with a thorough understanding of another language sits in a room, reads a transcript of the screen dialogue, watches the original language content (if available) and translates it into an audio dub script. It is not uncommon for this step to take several weeks per language from start to finish.

Once the translations are complete, the script is then performed by voice actors who make every effort to match the action and lip movements as closely as possible. Audio dubs follow the final cut dialogue, and then subtitles are generated from each audio dub. Any compromise made in the language translation may, then, be subjected to further compromise in the production of subtitles. It’s easy to see where mistranslations or changes in a story can occur.

The most conscientious localization process does include some level of cultural awareness because some words, actions or contexts are not universally translatable. For this reason, Bong Joon-ho, the director of the 2019 Oscar-winning film “Parasite,” sent detailed notes to his translation team before they began work. Bong and others have pointed out that limitations of time, available screen space for subtitles, and the need for cultural understanding further complicate the process. Still, when localization is done well, it contributes to greater enjoyment of the film.

The exponential growth of distribution platforms and the increasing and continuous flow of fresh content are pushing those involved in the localization process to seek new ways to speed production and increase translation accuracy. Artificial intelligence (AI) and machine learning (ML) are highly anticipated answers to this problem, but neither has reached the point of replacing the human localization component. Directors of titles such as “Squid Game” or “Parasite” are not yet ready to make that leap. Here’s why.

Culture matters

First, literal translation cannot capture all of the linguistic, cultural or contextual nuance that a story carries in its script, inflection or action. AI companies themselves admit to these limitations, commonly describing machine-based translations as “more like dictionaries than translators” and reminding us that computers can do only what we teach them and lack understanding.

For example, the English title of the first episode of “Squid Game” is “Red Light, Green Light,” the name of the children’s game played in that episode. The original Korean title is “무궁화 꽃이 피던 날” (“Mugunghwa Kkoch-I Pideon Nal”), which translates directly as “The Day the Mugunghwa Bloomed” and has nothing to do with the game they’re playing.

In Korean culture, the title symbolizes new beginnings, which is the game’s protagonists’ promise to the winner. “Red Light, Green Light” is related to the episode, but it misses the broader cultural reference of a promised fresh start for people down on their luck, a significant theme of the series. Some may believe it is no big deal that the episode was named after the game because the translators did not know the cultural metaphor behind the original title, but it is.

How can we expect to train machines to recognize these differences and apply them autonomously when humans don’t make the connection and apply them themselves?

Knowing versus knowledge

It’s one thing for a computer to translate Korean into English. It is another altogether for it to have knowledge about relationship differences like those in “Squid Game” (between immigrants and natives, strangers and family members, employees and bosses) and how those relationships impact the story. Programming cultural understanding and emotional recognition into AI is challenging, especially when emotions are displayed without words, such as through a look on someone’s face. Even then, it is hard to predict emotional facial responses, which may vary by culture.

AI is still a work in progress when it comes to explainability, interpretability and algorithmic bias. The idea that machines will train themselves is far-fetched given where the industry stands in executing AI/ML. For a content-heavy, creative industry like media and entertainment, context is everything; there is the content creator’s expression of context, and then there is the audience’s perception of it.

Moreover, with respect to global distribution, context equals culture. A digital nirvana is achieved when a system can orchestrate and predict the audio, video and text in addition to the multiple layers of cultural nuance that are at play at any given frame, scene, theme and genre level. At the core, it all starts with good-quality training data — essentially, taking a data-centric approach versus a model-centric one.

Recent reports indicate Facebook catches only 3% to 5% of problematic content on its platform. Even with millions of dollars available for development, programming AI to understand context and intent is very hard to do. Fully autonomous translation solutions are still some way off, but that doesn’t mean AI/ML cannot reduce the workload today. It can.

Through analysis of millions of films and TV shows combined with the cultural knowledge of individuals from nearly 200 countries, a two-step human and AI/ML process can provide the detailed insights needed to identify content that any country or culture may find objectionable. In “culturalization,” this cultural roadmap is then used in the localization process to ensure story continuity, avoid cultural missteps and obtain global age ratings — all of which reduce post-production time and costs without regulatory risk.

Audiences today have more content choices than ever before. Winning in the global marketplace means content creators have to pay more attention to their audience, not just at home but in international markets.

The fastest path to success for content creators and streaming platforms is working with companies that understand local audiences and what matters to them so their content is not lost in translation.