Google DeepMind forms a new org focused on AI safety

DeepMind logo
Image Credits: Google DeepMind

If you ask Gemini, Google’s flagship GenAI model, to write deceptive content about the upcoming U.S. presidential election, it will, given the right prompt. Ask about a future Super Bowl game and it’ll invent a play-by-play. Or ask about the Titan submersible implosion and it’ll serve up disinformation, complete with convincing-looking but untrue citations.

It’s a bad look for Google, needless to say, and it’s provoking the ire of policymakers, who’ve signaled their displeasure at the ease with which GenAI tools can be harnessed to spread disinformation and, more generally, to mislead.

So in response, Google — thousands of jobs lighter than it was last fiscal quarter — is funneling investments toward AI safety. At least, that’s the official story.

This morning, Google DeepMind, the AI R&D division behind Gemini and many of Google’s more recent GenAI projects, announced the formation of a new organization, AI Safety and Alignment — made up of existing teams working on AI safety but also broadened to encompass new, specialized cohorts of GenAI researchers and engineers.

Beyond the job listings on DeepMind’s site, Google wouldn’t say how many hires would result from the formation of the new organization. But it did reveal that AI Safety and Alignment will include a new team focused on safety around artificial general intelligence (AGI), or hypothetical systems that can perform any task a human can.

Similar in mission to the Superalignment division rival OpenAI formed last July, the new team within AI Safety and Alignment will work alongside DeepMind’s existing AI-safety-centered research team in London, Scalable Alignment — which is also exploring solutions to the technical challenge of controlling yet-to-be-realized superintelligent AI.

Why have two groups working on the same problem? Valid question — and one that calls for speculation given Google’s reluctance to reveal much in detail at this juncture. But it seems notable that the new team — the one within AI Safety and Alignment — is stateside as opposed to across the pond, proximate to Google HQ at a time when the company’s moving aggressively to maintain pace with AI rivals while attempting to project a responsible, measured approach to AI.

The AI Safety and Alignment organization’s other teams are responsible for developing and incorporating concrete safeguards into Google’s Gemini models, both current and in development. Safety is a broad purview. But a few of the organization’s near-term focuses will be preventing bad medical advice, ensuring child safety and “preventing the amplification of bias and other injustices.”

Anca Dragan, formerly a Waymo staff research scientist and a UC Berkeley professor of computer science, will lead the team.

“Our work [at the AI Safety and Alignment organization] aims to enable models to better and more robustly understand human preferences and values,” Dragan told TechCrunch via email, “to know what they don’t know, to work with people to understand their needs and to elicit informed oversight, to be more robust against adversarial attacks and to account for the plurality and dynamic nature of human values and viewpoints.”

Dragan’s consulting work with Waymo on AI safety systems might raise eyebrows, considering the Google autonomous car venture’s rocky driving record as of late.

So might her decision to split time between DeepMind and UC Berkeley, where she heads a lab focusing on algorithms for human-AI and human-robot interaction. One might assume issues as grave as AGI safety (and the longer-term risks the AI Safety and Alignment organization intends to study, including preventing AI from “aiding terrorism” and “destabilizing society”) require a director’s full-time attention.

Dragan insists, however, that her UC Berkeley lab’s and DeepMind’s research are interrelated and complementary.

“My lab and I have been working on … value alignment in anticipation of advancing AI capabilities, [and] my own Ph.D. was in robots inferring human goals and being transparent about their own goals to humans, which is where my interest in this area started,” she said. “I think the reason [DeepMind CEO] Demis Hassabis and [chief AGI scientist] Shane Legg were excited to bring me on was in part this research experience and in part my attitude that addressing present-day concerns and catastrophic risks are not mutually exclusive — that on the technical side mitigations often blur together, and work contributing to the long term improves the present day, and vice versa.”

To say Dragan has her work cut out for her is an understatement.

Skepticism of GenAI tools is at an all-time high — particularly where it relates to deepfakes and misinformation. In a poll from YouGov, 85% of Americans said that they were very concerned or somewhat concerned about the spread of misleading video and audio deepfakes. A separate survey from The Associated Press-NORC Center for Public Affairs Research found that nearly 60% of adults think AI tools will increase the volume of false and misleading information during the 2024 U.S. election cycle.

Enterprises, too — the big fish Google and its rivals hope to lure with GenAI innovations — are wary of the tech’s shortcomings and their implications.

Intel subsidiary Cnvrg.io recently conducted a survey of companies in the process of piloting or deploying GenAI apps. It found that around a fourth of the respondents had reservations about GenAI compliance and privacy, reliability, the high cost of implementation and a lack of technical skills needed to use the tools to their fullest.

In a separate poll from Riskonnect, a risk management software provider, over half of execs said that they were worried about employees making decisions based on inaccurate information from GenAI apps.

They’re not unjustified in those concerns. Last week, The Wall Street Journal reported that Microsoft’s Copilot suite, powered by GenAI models similar architecturally to Gemini, often makes mistakes in meeting summaries and spreadsheet formulas. To blame is hallucination — the umbrella term for GenAI’s fabricating tendencies — and many experts believe it can never be fully solved.

Recognizing the intractability of the AI safety challenge, Dragan makes no promise of a perfect model — saying only that DeepMind intends to invest more resources into this area going forward and commit to a framework for evaluating GenAI model safety risk “soon.”

“I think the key is to … [account] for remaining human cognitive biases in the data we use to train, good uncertainty estimates to know where gaps are, adding inference-time monitoring that can catch failures and confirmation dialogues for consequential decisions and tracking where [a] model’s capabilities are to engage in potentially dangerous behavior,” she said. “But that still leaves the open problem of how to be confident that a model won’t misbehave some small fraction of the time that’s hard to empirically find, but may turn up at deployment time.”

I’m not convinced customers, the public and regulators will be so understanding. It’ll depend, I suppose, on just how egregious those misbehaviors are — and who exactly is harmed by them.

“Our users should hopefully experience a more and more helpful and safe model over time,” Dragan said. Indeed.
