Vera wants to use AI to cull generative models’ worst behaviors

Liz O’Sullivan is on a mission to make AI “a little bit safer,” in her own words.

A member of the National AI Advisory Committee, which drafts recommendations to the White House and Congress on how to foster AI adoption while regulating its risks, O’Sullivan spent 12 years on the business side of AI startups overseeing data labeling, operations and customer success. In 2019, she took a job at the Surveillance Technology Oversight Project, mounting campaigns to protect New Yorkers’ civil liberties, and co-founded Arthur AI, a startup that partners with civil society and academia to shine light into AI’s “black box.”

Now, O’Sullivan is gearing up for her next act with Vera, a startup building a toolkit that allows companies to establish “acceptable use policies” for generative AI — the type of AI models that generate text, images, music and more — and enforce these policies across open source and custom models.

Vera today closed a $2.7 million funding round led by Differential Venture Partners with participation from Essence VC, Everywhere VC, Betaworks, Greycroft and ATP Ventures. Bringing Vera’s total raised to $3.3 million, the new cash will be put toward growing Vera’s five-person team, R&D and scaling enterprise deployments, O’Sullivan says.

“Vera was founded because we’ve seen, firsthand, the power of AI to address real problems, just as we’ve seen the wild and wacky ways it can cause damage to companies, the public and the world,” O’Sullivan told TechCrunch in an email interview. “We need to responsibly shepherd this technology into the world, and as companies race to define their generative AI strategies, we’re entering an age where it’s critical that we move beyond AI principles and into practice. Vera is a team that can actually help.”

O’Sullivan co-founded Vera in 2021 with Justin Norman, formerly a research scientist at Cisco, a lead data scientist in Cloudera’s AI research lab and the VP of data science at Yelp. In September, Norman was appointed a member of the Department of the Navy Science and Technology board, which provides advice and counsel to the U.S. Navy on matters and policies relating to scientific, technical and related functions.

Vera’s platform attempts to identify risks in model inputs — for example, a prompt like “write a cover letter for a software engineering role” to a text-generating model — and block, redact or otherwise transform requests that might contain things like personally identifiable information, security credentials, intellectual property and prompt injection attacks. (Prompt injection attacks, essentially carefully worded malicious prompts, are often used to “trick” models into bypassing safety filters.)
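As a rough illustration of that input-side step (not Vera’s actual implementation, whose models are proprietary), a policy layer might scan a prompt for sensitive patterns and then block it or redact the matches before anything reaches the model. The pattern list, policy names and the `screen_prompt` helper below are invented for the example:

```python
import re

# Hypothetical patterns for sensitive content; a real deployment would use far
# broader detectors (and, per O'Sullivan, trained language and vision models).
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|key)-[A-Za-z0-9]{20,}\b"),
}

def screen_prompt(prompt: str, policy: str = "redact") -> tuple[str, bool]:
    """Return a (possibly transformed) prompt and whether it may proceed."""
    hits = {name for name, rx in PATTERNS.items() if rx.search(prompt)}
    if not hits:
        return prompt, True              # clean prompt passes through untouched
    if policy == "block":
        return prompt, False             # refuse to forward the request at all
    redacted = prompt
    for name in hits:                    # replace each match with a placeholder tag
        redacted = PATTERNS[name].sub(f"[REDACTED_{name.upper()}]", redacted)
    return redacted, True
```

Whether a given policy blocks, redacts or transforms is exactly the kind of “acceptable use” decision Vera says it lets companies set per organization.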

Vera also places constraints on what models can “say” in response to prompts, according to O’Sullivan, giving companies greater control over the behavior of their models in production.

How does Vera achieve this? By using what O’Sullivan describes as “proprietary language and vision models” that sit between users and internal or third-party models (e.g. OpenAI’s GPT-4) and detect problematic content. Vera can block “inappropriate” prompts to, or answers from, a model in any form, O’Sullivan claims, whether text, code, image or video.
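Architecturally, that is a proxy pattern: a guard layer screens traffic in both directions before and after the model call. Here is a minimal sketch, assuming a toy keyword check and a stubbed `call_model` as stand-ins for Vera’s proprietary classifiers and for whatever internal or third-party model is in use:

```python
# Toy stand-ins: a real guard layer would use trained classifiers and call an
# actual model API. Names below are assumptions for illustration only.
BLOCKED_MESSAGE = "This request was blocked by your organization's AI use policy."
DISALLOWED_TERMS = ("password", "social security number", "credit card")

def violates_policy(text: str) -> bool:
    # Crude keyword screen in place of Vera's language/vision models.
    lowered = text.lower()
    return any(term in lowered for term in DISALLOWED_TERMS)

def call_model(prompt: str) -> str:
    # Stand-in for the call to an internal or third-party generative model.
    return f"(model output for: {prompt!r})"

def guarded_completion(prompt: str) -> str:
    # 1. Screen the inbound prompt before it ever reaches the model.
    if violates_policy(prompt):
        return BLOCKED_MESSAGE
    # 2. Forward only prompts that passed the policy check.
    response = call_model(prompt)
    # 3. Screen the model's answer on the way back to the user.
    if violates_policy(response):
        return BLOCKED_MESSAGE
    return response

if __name__ == "__main__":
    print(guarded_completion("Summarize this quarter's sales figures."))
    print(guarded_completion("Here is my credit card number, file my expenses."))
```

Because the checks wrap the model rather than living inside it, the same policy layer can, in principle, be pointed at different underlying models, which is the vendor-agnostic pitch O’Sullivan makes later on.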

“Our deep tech approach to enforcing policies goes beyond passive forms of documentation and checklists to address the direct points at which these risks occur,” O’Sullivan said. “Our solution … prevents riskier responses that may include criminal material or encourage users to self-harm.”

Companies are certainly encountering challenges — mainly compliance-related — in adopting generative AI models for their purposes. They’re worried, for instance, that confidential data fed into the models will end up with the developers who train those models on user data; in recent months, major corporations including Apple, Walmart and Verizon have banned employees from using tools like OpenAI’s ChatGPT.

And offensive models are obviously bad for publicity. No brand wants the text-generating model powering their customer service chatbot, say, to spout racial epithets or give self-destructive advice.

But this reporter wonders if Vera’s approach is as reliable as O’Sullivan suggests.

No model is perfect — not even Vera’s — and it’s been demonstrated time and time again that content moderation models are prone to a whole host of biases. Some AI models trained to detect toxicity in text see phrases in African-American Vernacular English, the informal grammar used by some Black Americans, as disproportionately “toxic.” Meanwhile, certain computer vision algorithms have been found to label thermometers held by Black people as “guns” while labeling thermometers held by light-skinned subjects as “electronic devices.”

To be fair to O’Sullivan, she doesn’t claim Vera’s models are bulletproof — only that they can cull the worst of a generative AI model’s behaviors. There may be some truth to that, depending on the model in question and the degree to which Vera has iterated on and refined its own models.

“Today’s AI hype cycle obscures the very serious, very present risks that affect humans alive today,” O’Sullivan said. “Where AI overpromises, we see real people hurt by unpredictable, harmful, toxic and potentially criminal model behavior … AI is a powerful tool and like any powerful tool, should be actively controlled so that its benefits outweigh these risks, which is why Vera exists.”

Vera’s possible shortcomings aside, the company has competition in the nascent market for model-moderating tech.

Similar to Vera, Nvidia’s NeMo Guardrails and Salesforce’s Einstein Trust Layer attempt to prevent text-generating models from retaining or regurgitating sensitive data, such as customer purchase orders and phone numbers. Microsoft offers its own AI service for moderating text and image content, including content generated by models. Elsewhere, startups like HiddenLayer, DynamoFL and Protect AI are creating tooling to defend generative AI models against prompt engineering attacks.

So far as I can tell, Vera’s value proposition is that it tackles a whole range of generative AI threats at once — or promises to at the very least. Assuming that the tech works as advertised, that’s bound to be attractive for companies in search of a one-stop content moderation, AI-model-attack-fighting shop.

Indeed, O’Sullivan says that Vera already has a handful of customers. The waitlist for more opens today.

“CTOs, CISOs and CIOs all over the world are struggling to strike the ideal balance between AI-enhanced productivity and the risks these models present,” O’Sullivan said. “Vera unlocks generative AI capabilities with policy enforcement that can be transferred not just to today’s models, but to future models without the vendor lock-in that occurs when you choose a one-model or one-size-fits-all approach to generative AI.”