Google unveils MedLM, a family of healthcare-focused generative AI models

Google thinks that there’s an opportunity to offload more healthcare tasks to generative AI models — or at least, an opportunity to recruit those models to aid healthcare workers in completing their tasks.

Today, the company announced MedLM, a family of models fine-tuned for the healthcare industry. Based on Med-PaLM 2, a Google-developed model that performs at an “expert level” on dozens of medical exam questions, MedLM is available to Google Cloud customers in the U.S. (it’s in preview in certain other markets) who’ve been whitelisted through Vertex AI, Google’s fully managed AI dev platform.

There are two MedLM models available currently: a larger model designed for what Google describes as “complex tasks” and a smaller, fine-tunable model best for “scaling across tasks.”
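For developers wondering what access might look like in practice, here’s a minimal sketch using the standard Vertex AI Python SDK. The model identifier, project name and prompt are assumptions for illustration only; Google hasn’t published those specifics in today’s announcement, and actual availability depends on being allowlisted.

```python
# Hypothetical sketch: calling a MedLM model through the Vertex AI Python SDK.
# The model ID ("medlm-large") and project name are assumptions, not confirmed details.
import vertexai
from vertexai.language_models import TextGenerationModel

# Initialize the SDK against a Google Cloud project that has been granted access.
vertexai.init(project="my-healthcare-project", location="us-central1")

# Load the larger of the two models, the one Google pitches for "complex tasks".
model = TextGenerationModel.from_pretrained("medlm-large")

# Example task: summarizing a clinician-patient conversation for the chart.
response = model.predict(
    "Summarize the following encounter note for the patient's chart:\n"
    "Patient reports three days of mild shortness of breath ...",
    temperature=0.2,          # keep generations conservative for clinical text
    max_output_tokens=512,
)
print(response.text)
```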

“Through piloting our tools with different organizations, we’ve learned that the most effective model for a given task varies depending on the use case,” reads a blog post penned by Yossi Matias, VP of engineering and research at Google, provided to TechCrunch ahead of today’s announcement. “For example, summarizing conversations might be best handled by one model, and searching through medications might be better handled by another.”

Google says that one early MedLM user, the for-profit facility operator HCA Healthcare, has been piloting the models with physicians to help draft patient notes at emergency department hospital sites. Another tester, BenchSci, has built MedLM into its “evidence engine” for identifying, classifying and ranking novel biomarkers.

“We’re working in close collaboration with practitioners, researchers, health and life science organizations and the individuals at the forefront of healthcare every day,” writes Matias.

Google, along with chief rivals Microsoft and Amazon, is racing to corner a healthcare AI market that could be worth tens of billions of dollars by 2032. Recently, Amazon launched AWS HealthScribe, which uses generative AI to transcribe, summarize and analyze notes from patient-doctor conversations. Microsoft is piloting various AI-powered healthcare products, including medical “assistant” apps underpinned by large language models.

But there’s reason to be wary of such tech. AI in healthcare, historically, has been met with mixed success.

Babylon Health, an AI startup backed by the U.K.’s National Health Service, has found itself under repeated scrutiny for making claims that its disease-diagnosing tech can perform better than doctors. And IBM was forced to sell its AI-focused Watson Health division at a loss after technical problems led customer partnerships to deteriorate.

One might argue that generative models like those in Google’s MedLM family are much more sophisticated than what came before them. But research has shown that generative models aren’t particularly accurate when it comes to answering healthcare-related questions, even fairly basic ones.

One study co-authored by a group of ophthalmologists asked ChatGPT and Google’s Bard chatbot questions about eye conditions and diseases, and found that the majority of responses from both chatbots were wildly off the mark. Other research has found that ChatGPT generates cancer treatment plans full of potentially deadly errors, and that models including ChatGPT and Bard spew racist, debunked medical ideas in response to queries about kidney function, lung capacity and skin.

In October, the World Health Organization (WHO) warned of the risks from using generative AI in healthcare, noting the potential for models to generate harmful wrong answers, propagate disinformation about health issues and reveal health data or other sensitive info. (Because models occasionally memorize training data and return portions of this data given the right prompt, it’s not out of the question that models trained on medical records could inadvertently leak those records.)

“While WHO is enthusiastic about the appropriate use of technologies, including [generative AI], to support healthcare professionals, patients, researchers and scientists, there’s concern that caution that would normally be exercised for any new technology is not being exercised consistently with [generative AI],” the WHO said in a statement. “Precipitous adoption of untested systems could lead to errors by healthcare workers, cause harm to patients, erode trust in AI and thereby undermine or delay the potential long-term benefits and uses of such technologies around the world.”

Google has repeatedly claimed that it’s being exceptionally cautious in releasing generative AI healthcare tools — and it’s not changing its tune today.

“[W]e’re focused on enabling professionals with a safe and responsible use of this technology,” Matias continued. “And we’re committed to not only helping others advance healthcare, but also making sure that these benefits are available to everyone.”