Japan’s telecommunications behemoth NTT is working on a device that can automatically transcribe discussions in meetings in real time. Japanese daily The Nikkei is reporting that the current prototype features two cameras with fish-eye lenses and eight microphones to capture what is being said and detect who is speaking.
Technical details are scarce at this point, but the system apparently identifies a speaker by, for example, calculating the time it takes for their voice to reach each of the microphones and by visually identifying them via the cameras. NTT claims that, in contrast to existing transcription systems, its own device can handle speech from multiple people during meetings.
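NTT hasn’t said how its localization actually works, so the following is only a generic sketch of the arrival-time idea, not the company’s method. It assumes a single pair of microphones, a far-away speaker and sound travelling at roughly 343 m/s; the function names, the 0.2 m spacing and the 16 kHz sample rate are all made up for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed)


def estimate_delay(sig_a, sig_b, sample_rate, max_lag):
    """Delay of sig_b relative to sig_a, in seconds.

    Brute-force cross-correlation: try every lag in [-max_lag, max_lag]
    samples and keep the one where the two signals overlap best.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


def bearing_from_delay(delay_s, mic_spacing_m):
    """Angle of the speaker in degrees, measured from straight ahead.

    Far-field approximation for two microphones:
    sin(angle) = speed_of_sound * delay / spacing.
    """
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against noisy estimates
    return math.degrees(math.asin(ratio))


# Hypothetical input: the same click reaches microphone B three samples
# later than microphone A.
mic_a = [0.0] * 32
mic_b = [0.0] * 32
mic_a[10] = 1.0
mic_b[13] = 1.0

delay = estimate_delay(mic_a, mic_b, sample_rate=16000, max_lag=8)
angle = bearing_from_delay(delay, mic_spacing_m=0.2)
```

With eight microphones, a real system could combine the delays from many such pairs to narrow down a speaker’s position, and the cameras could then confirm who is sitting there.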
All spoken words are captured by the microphones, processed by proprietary voice recognition technology and turned into documents. The company even claims the device is able to detect certain emotional cues during meetings, for example when a speaker draws laughter or stares from other people. It can then “transcribe” those reactions and work them into the document to reflect the atmosphere during meetings.
Sorry, NTT hasn’t released pictures of the device yet.