• New machine can transcribe discussions in real time, capture emotions

    Serkan Toto

    Dr. Serkan Toto is an independent consultant and advisor focusing on Japan’s web, mobile and social gaming industries. Based in Tokyo, he works with financial institutions and startups worldwide. Serkan has been the Japan contributor for TechCrunch.com since 2008. He speaks seven languages and holds an MBA and a PhD in economics.

    Wednesday, July 7th, 2010

    Japan’s telecommunications behemoth NTT is working on a device that can transcribe discussions in meetings automatically and in real time. Japanese daily The Nikkei is reporting that the current prototype features two cameras with fish-eye lenses and eight microphones to capture what is being said and detect who is speaking.

    Technical details are scarce at this point, but the system apparently identifies a speaker by, for example, calculating the time it takes for their voice to reach each of the microphones and visually identifying them via the cameras. NTT claims that, in contrast to existing transcription systems, its own device can handle speech from multiple people during meetings.
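
    NTT hasn’t said how its timing calculation works, so the following is only a rough sketch of the general idea (time difference of arrival between two microphones), not the company’s method. The function names, the 16 kHz sample rate, the 20 cm microphone spacing and the toy signal are all assumptions made up for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def estimate_delay_samples(sig_a, sig_b, max_lag):
    """Return the lag (in samples) by which sig_b trails sig_a.

    Brute-force cross-correlation: try every lag in [-max_lag, max_lag]
    and keep the one where the two signals line up best.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(sig_a)):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += sig_a[i] * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_degrees(delay_samples, sample_rate, mic_spacing):
    """Convert a delay between two microphones into an angle of arrival.

    0 degrees means the speaker is straight ahead (equidistant from both
    microphones); the sign says which microphone is closer.
    """
    delay_seconds = delay_samples / sample_rate
    # Clamp to [-1, 1] so rounding noise cannot break asin().
    ratio = max(-1.0, min(1.0, delay_seconds * SPEED_OF_SOUND / mic_spacing))
    return math.degrees(math.asin(ratio))


# Toy example (hypothetical data): the same short pulse reaches
# microphone B three samples later than microphone A.
pulse = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0]
mic_a = pulse + [0.0] * 10
mic_b = [0.0] * 3 + pulse + [0.0] * 7

lag = estimate_delay_samples(mic_a, mic_b, max_lag=8)
angle = bearing_degrees(lag, sample_rate=16000, mic_spacing=0.2)
print(f"delay: {lag} samples, bearing: {angle:.1f} degrees")
```

    With these made-up numbers the pulse is found three samples late at microphone B, which works out to a bearing of a little under 19 degrees. A real device with eight microphones would presumably combine many such pairs and cross-check the result against the camera images, but how NTT does that is not public.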

    All spoken words are captured by the microphones, processed by proprietary voice recognition technology and turned into documents. The company even claims the device is able to detect certain emotions during meetings, for example when a speaker draws laughter or stares from other people. It can then “transcribe” those emotions and work them into the document in order to reflect the atmosphere during meetings.

    Sorry, NTT hasn’t released pictures of the device yet.
