Google DeepMind forms a new org focused on AI safety

If you ask Gemini, Google’s flagship GenAI model, to write deceptive content about the upcoming U.S. presidential election, it will, given the right prompt. Ask about a future Super Bowl game and it’ll invent a play-by-play. Or ask about the Titan submersible implosion and it’ll serve up disinformation, complete with convincing-looking but untrue citations.

It’s a bad look for Google, needless to say, and it is provoking the ire of policymakers, who’ve signaled their displeasure at the ease with which GenAI tools can be harnessed to spread disinformation and to mislead generally.

So in response, Google — thousands of jobs lighter than it was last fiscal quarter — is funneling investments toward AI safety. At least, that’s the official story.

This morning, Google DeepMind, the AI R&D division behind Gemini and many of Google’s more recent GenAI projects, announced the formation of a new organization, AI Safety and Alignment — made up of existing teams working on AI safety but also broadened to encompass new, specialized cohorts of GenAI researchers and engineers.

Beyond the job listings on DeepMind’s site, Google wouldn’t say how many hires would result from the formation of the new organization. But it did reveal that AI Safety and Alignment will include a new team focused on safety around artificial general intelligence (AGI), or hypothetical systems that can perform any task a human can.

Similar in mission to the Superalignment division rival OpenAI formed last July, the new team within AI Safety and Alignment will work alongside DeepMind’s existing AI-safety-centered research team in London, Scalable Alignment — which is also exploring solutions to the technical challenge of controlling yet-to-be-realized superintelligent AI.

Why have two groups working on the same problem? Valid question — and one that calls for speculation given Google’s reluctance to reveal much in detail at this juncture. But it seems notable that the new team — the one within AI Safety and Alignment — is stateside as opposed to across the pond, proximate to Google HQ at a time when the company’s moving aggressively to maintain pace with AI rivals while attempting to project a responsible, measured approach to AI.

The AI Safety and Alignment organization’s other teams are responsible for developing and incorporating concrete safeguards into Google’s Gemini models, both current and in development. Safety is a broad purview. But a few of the organization’s near-term focuses will be preventing bad medical advice, ensuring child safety and “preventing the amplification of bias and other injustices.”

Anca Dragan, formerly a Waymo staff research scientist and a UC Berkeley professor of computer science, will lead the team.

“Our work [at the AI Safety and Alignment organization] aims to enable models to better and more robustly understand human preferences and values,” Dragan told TechCrunch via email, “to know what they don’t know, to work with people to understand their needs and to elicit informed oversight, to be more robust against adversarial attacks and to account for the plurality and dynamic nature of human values and viewpoints.”

Dragan’s consulting work with Waymo on AI safety systems might raise eyebrows, considering the Google autonomous car venture’s rocky driving record as of late.

So might her decision to split time between DeepMind and UC Berkeley, where she heads a lab focusing on algorithms for human-AI and human-robot interaction. One might assume issues as grave as AGI safety (and the longer-term risks the AI Safety and Alignment organization intends to study, including preventing AI from “aiding terrorism” and “destabilizing society”) require a director’s full-time attention.

Dragan insists, however, that her UC Berkeley lab’s and DeepMind’s research are interrelated and complementary.

“My lab and I have been working on … value alignment in anticipation of advancing AI capabilities, [and] my own Ph.D. was in robots inferring human goals and being transparent about their own goals to humans, which is where my interest in this area started,” she said. “I think the reason [DeepMind CEO] Demis Hassabis and [chief AGI scientist] Shane Legg were excited to bring me on was in part this research experience and in part my attitude that addressing present-day concerns and catastrophic risks are not mutually exclusive — that on the technical side mitigations often blur together, and work contributing to the long term improves the present day, and vice versa.”

To say Dragan has her work cut out for her is an understatement.

Skepticism of GenAI tools is at an all-time high — particularly where it relates to deepfakes and misinformation. In a poll from YouGov, 85% of Americans said that they were very concerned or somewhat concerned about the spread of misleading video and audio deepfakes. A separate survey from The Associated Press-NORC Center for Public Affairs Research found that nearly 60% of adults think AI tools will increase the volume of false and misleading information during the 2024 U.S. election cycle.

Enterprises, too — the big fish Google and its rivals hope to lure with GenAI innovations — are wary of the tech’s shortcomings and their implications.

Intel subsidiary Cnvrg.io recently conducted a survey of companies in the process of piloting or deploying GenAI apps. It found that around a fourth of the respondents had reservations about GenAI compliance and privacy, reliability, the high cost of implementation and a lack of technical skills needed to use the tools to their fullest.

In a separate poll from Riskonnect, a risk management software provider, over half of execs said that they were worried about employees making decisions based on inaccurate information from GenAI apps.

They’re not unjustified in those concerns. Last week, The Wall Street Journal reported that Microsoft’s Copilot suite, powered by GenAI models similar architecturally to Gemini, often makes mistakes in meeting summaries and spreadsheet formulas. To blame is hallucination — the umbrella term for GenAI’s fabricating tendencies — and many experts believe it can never be fully solved.

Recognizing the intractability of the AI safety challenge, Dragan makes no promise of a perfect model — saying only that DeepMind intends to invest more resources into this area going forward and commit to a framework for evaluating GenAI model safety risk “soon.”

“I think the key is to … [account] for remaining human cognitive biases in the data we use to train, good uncertainty estimates to know where gaps are, adding inference-time monitoring that can catch failures and confirmation dialogues for consequential decisions and tracking where [a] model’s capabilities are to engage in potentially dangerous behavior,” she said. “But that still leaves the open problem of how to be confident that a model won’t misbehave some small fraction of the time that’s hard to empirically find, but may turn up at deployment time.”

I’m not convinced customers, the public and regulators will be so understanding. It’ll depend, I suppose, on just how egregious those misbehaviors are — and who exactly is harmed by them.

“Our users should hopefully experience a more and more helpful and safe model over time,” Dragan said. Indeed.
