For IBM Watson CTO Rob High, the biggest technological challenge in machine learning right now is figuring out how to train models with less data. “It’s a challenge, it’s a goal and there’s certainly reason to believe that it’s possible,” High told me during an interview at the annual Mobile World Congress in Barcelona.
With this, he echoes similar statements all across the industry. Google’s AI chief John Giannandrea, for example, also recently listed this as one of the main challenges the search giant’s machine learning groups are trying to tackle. Typically, machine learning models need to be trained on large amounts of data to ensure that they are accurate, but for many problems, that large data set simply doesn’t exist.
High, however, believes this is a solvable problem. Why? “Because humans do it. We have a data point,” he said. One thing to keep in mind is that even when we see that evidenced in what humans are doing, you have to recognize it’s not just that session, it’s not just that moment that is informing how humans learn. We bring all of this context to the table.” For High, it’s this context that’ll make possible training models with less data, as well as recent advances in transfer learning, that is, the ability to take one trained model and then use this data to kickstart the training of another model where less data may exist.
The challenges for AI — and especially conversational AI — go beyond that, though. “On the other end is really trying to understand how better to interact with humans in ways that they would find natural and that are influential to their thinking,” says High. “Humans are influenced by not just the words that they exchange but also by how we encase those words in vocalizations, inflection, intonation, cadence, temper, facial expression, arm and hand gestures.” High doesn’t think an AI necessarily needs to mimic these in some kind of anthropomorphic form, but maybe in some other form like visual cues on a device.
At the same time, most AI systems also still need to get better at understanding the intent of a question and how that relates to individuals’ previous questions about something, as well as their current state of mind and personality.
That brings up another question, though. Many of these machine learning models that are in use right now are inherently biased because of the data with which they were trained. That often means that a given model will work great for you if you’re a white male but then fails black women, for example. “First of all, I think that there’s two sides to that equation. One is, there may be aggregate bias to this data and we have to be sensitive to that and force ourselves to consider data that broadens the cultural and demographic aspects of the people it represents,” said High. “The flip side of that, though, is that you actually want aggregate bias in these kind of systems over personal bias.”
As an example, High cited work IBM did with the Sloan Kettering Cancer Center. IBM and the hospital trained a model based on the work of some of the best cancer surgeons. “But Sloan Kettering has a particular philosophy about how to do medicine. So that philosophy is embodied in their biases. It’s their institutional biases, it’s their brand. […] And any system that is going to be used outside of Sloan Kettering needs to carry that same philosophy forward.”
“A big part of making sure that these things are biased in the right way is both making sure that you have the right people submitting for and who these people are representative of — of the broader culture.” That’s a discussion that High says now regularly comes up with IBM’s clients, too, which is a positive sign in an industry that still often ignores these kind of topics.