Irish startup SoapBox Labs is building speech recognition tech for kids

Irish startup SoapBox Labs is on a mission to create what it calls “the world’s most accurate and accessible speech technology for children”, tech it plans to offer to third-party hardware and app developers. Target use cases span educational apps that support reading and language development, children’s voice control for IoT devices in the home, smart toys, and AR/VR experiences.

Founded in 2013 by Dr. Patricia Scanlon, an ex-Bell Labs researcher and PhD with nearly 20 years’ experience in speech recognition technologies, the young company is based on the premise that speech recognition tech built for adults, such as that most recently found in devices like the Amazon Echo or Google Home, doesn’t work as well as it could for kids.

That’s because children have higher-pitched voices and different speech patterns. Crucially, younger children also don’t tend to adapt their speech to suit machines, something adults do, consciously or unconsciously, in order to improve the utility of voice-enabled user interfaces and so-called smart assistants.

In a call, Scanlon explained that when she and the SoapBox Labs team began working on this problem in 2013, they had to disregard a lot of what they already understood about how to build speech technology. After an extensive research phase, it became clear that “children’s speech behaviours are vastly different to adults,” and more so the younger the child. Speech recognition tech that is developed using adult voice data and models adult behaviours performs poorly when used by young kids.

Instead, SoapBox Labs has created its own children’s speech dataset (consisting of thousands of hours of children’s speech data), and combined this with the team’s understanding of children’s voices and behaviours. The resulting platform is said to leverage deep learning (AI) techniques to power the startup’s proprietary models and scoring algorithms, and ultimately provide far better speech technology targeted at children.

This has seen SoapBox Labs release a version of its English language children’s speech recognition API for use by third parties, whilst I’m told a number of partnerships are to be announced as early as next month.

The company is also disclosing further funding: €2.1 million, capital it plans to use to add multiple languages to its speech recognition platform. The cash injection consists of a €1.5 million EU grant, and €600,000 from existing backers. It brings total funding for SoapBox Labs to just over €3 million.

Discussing the future of children’s speech recognition tech, Scanlon tells me she can see a situation where devices will recognise whether it is a child or an adult speaking and switch underlying data sets and models accordingly. That’s because, she says, kids’ speech tech, whilst arguably harder to develop, doesn’t work any better for adults. For now, two separate solutions are optimal.

Additionally, a device or app that knows it is interacting with a child could change its behaviour or interaction permissions. In some situations, you really wouldn’t want a child to be in control, however well they are understood.