Amazon’s Alexa AI team has developed a new training method for the virtual assistant that could greatly improve its ability to handle tricky questions. In a blog post, team lead Abdalghani Abujabal details the new method, which combines both text-based search and a custom-built knowledge graph, two methods which normally compete.
Abujabal suggests the following scenario: You ask Alexa “Which Nolan films won an Oscar but missed a Golden Globe?” Answering this question takes a lot of work: you need to identify that the “Nolan” referred to is director Christopher Nolan, figure out which movies he has directed (even his role as “director” has to be inferred), and then cross-reference the ones that won an Oscar (List A) against the ones that won a Golden Globe (List B), keeping those that appear on List A but not on List B.
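To make the reasoning in that scenario concrete, here is a minimal Python sketch of the three steps: resolving “Nolan” to a director, collecting his films, and keeping the ones on List A but not on List B. The film titles, the award sets and the function names are placeholders invented for illustration; they are not real award records, and this is not how Alexa itself stores or queries the data.

```python
# Placeholder data for illustration only; these are NOT real award records.
films_by_director = {
    "Christopher Nolan": {"Film A", "Film B", "Film C"},
    "Someone Else": {"Film D"},
}
oscar_winners = {"Film A", "Film B", "Film D"}    # "List A"
golden_globe_winners = {"Film B", "Film D"}       # "List B"


def resolve_director(partial_name, directors):
    """Return the one director whose full name contains the partial name."""
    matches = [d for d in directors if partial_name.lower() in d.lower()]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, got {matches}")
    return matches[0]


def oscar_but_no_globe(partial_name):
    """Films by the resolved director that are on List A but not on List B."""
    director = resolve_director(partial_name, films_by_director)
    films = films_by_director[director]
    # Intersect with Oscar winners, then subtract Golden Globe winners.
    return sorted((films & oscar_winners) - golden_globe_winners)


print(oscar_but_no_globe("Nolan"))  # ['Film A']
```

The hard part in practice is not the set arithmetic but getting clean versions of these three collections in the first place, which is what the rest of the method is about.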
Amazon’s method for answering this kind of difficult question starts by gathering the most complete data set possible. That initial data set is large and very noisy (i.e. filled with unnecessary data), so the system automatically builds a curated knowledge graph out of it, using algorithms the research team created to cut the chaff and arrive at mostly meaningful results.
The system devised by Amazon is relatively simple on its face; or rather, it combines two relatively simple methods. The first is a basic web search that uses the full text of the question as the query, just as if you’d typed “Which Nolan films won an Oscar but missed a Golden Globe?” into Google (in practice, the researchers used multiple search engines). The system then grabs the top 10 ranked pages and breaks them down into identified names and grammatical units.
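As a rough illustration of that retrieval step, the Python sketch below keeps the top 10 pages from an already-ranked list, splits them into sentences and pulls out capitalized word runs as candidate names. The function names are hypothetical, and the regular expressions are a deliberately naive stand-in for the name recognition and grammatical analysis the researchers would actually use (for example, any capitalized word at the start of a sentence is also picked up).

```python
import re


def top_pages(ranked_pages, k=10):
    """Keep only the k highest-ranked pages (the input is already ranked)."""
    return ranked_pages[:k]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_names(sentence):
    """Naive name spotting: runs of capitalized words, e.g. 'Golden Globe'."""
    return re.findall(r"[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", sentence)


# Stand-in for the text of retrieved pages; a real system would fetch these.
pages = ["Nolan directed Inception. It was released in theaters."] * 12

for page in top_pages(pages)[:1]:
    for sentence in split_sentences(page):
        print(sentence, "->", candidate_names(sentence))
```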
On top of that resulting data set, Alexa AI’s approach then looks for clues in the structure of sentences to flag and weight significant statements in the top texts, like “Nolan directed Inception,” and discounts the rest. This builds the ad-hoc knowledge graph, which the system then assesses to identify “cornerstones” within it. Cornerstones are basically dead ringers for words in the original search string (i.e. “Which Nolan films won an Oscar but missed a Golden Globe?”). The system takes those out and focuses instead on the information in between as the source of the actual answers to the question.
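The cornerstone idea can be sketched in a few lines of Python. The example below builds a tiny graph from subject-verb-object triples, marks nodes whose words all appear in the question as cornerstones, and ranks the remaining nodes by how many cornerstones they touch. Everything here is an assumption made for illustration: the triples are hand-written rather than extracted from web pages, “Film X” is a placeholder title, the function names are invented, and the neighbor-count score is a simplified stand-in for the weighting the Amazon team describes. It also makes no attempt to handle the “missed a Golden Globe” part of the question.

```python
import re
from collections import defaultdict

question = "Which Nolan films won an Oscar but missed a Golden Globe?"

# Hand-written triples standing in for statements pulled from retrieved text.
# "Film X" is a placeholder, not a real title.
triples = [
    ("Nolan", "directed", "Inception"),
    ("Inception", "won", "Oscar"),
    ("Nolan", "directed", "Film X"),
]


def build_graph(triples):
    """Undirected adjacency map connecting each subject to its object."""
    neighbors = defaultdict(set)
    for subject, _verb, obj in triples:
        neighbors[subject].add(obj)
        neighbors[obj].add(subject)
    return neighbors


def is_cornerstone(node, question_words):
    """A node is a cornerstone if every word in it appears in the question."""
    return all(word in question_words for word in node.lower().split())


def rank_candidates(triples, question):
    """Rank non-cornerstone nodes by the number of cornerstones they touch."""
    question_words = set(re.findall(r"\w+", question.lower()))
    neighbors = build_graph(triples)
    scores = {}
    for node, adjacent in neighbors.items():
        if is_cornerstone(node, question_words):
            continue  # cornerstones anchor the search; they are not answers
        scores[node] = sum(is_cornerstone(n, question_words) for n in adjacent)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


print(rank_candidates(triples, question))
```

In this toy graph, “Inception” sits between two cornerstones (“Nolan” and “Oscar”) while “Film X” touches only one, so “Inception” is ranked first.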
With some final weighting and sorting of the remaining data, the algorithm correctly returns “Inception” as the answer. Amazon’s team found that this method beat state-of-the-art approaches that were much more involved but that relied on text search alone or on a curated knowledge graph alone. Still, they think they can tweak their approach to be even better, which is good news for Alexa users hoping their smart speakers will be able to settle heated debates about advanced Trivial Pursuit questions.