Meta releases a dataset to probe computer vision models for biases

Continuing on its open source tear, Meta today released a new AI benchmark, FACET, designed to evaluate the “fairness” of AI models that classify and detect things in photos and videos, including people.

Made up of 32,000 images containing 50,000 people labeled by human annotators, FACET — a tortured acronym for “FAirness in Computer Vision EvaluaTion” — accounts for classes related to occupations and activities like “basketball player,” “disc jockey” and “doctor” in addition to demographic and physical attributes, allowing for what Meta describes as “deep” evaluations of biases against those classes.

“By releasing FACET, our goal is to enable researchers and practitioners to perform similar benchmarking to better understand the disparities present in their own models and monitor the impact of mitigations put in place to address fairness concerns,” Meta wrote in a blog post shared with TechCrunch. “We encourage researchers to use FACET to benchmark fairness across other vision and multimodal tasks.”

Certainly, benchmarks to probe for biases in computer vision algorithms aren’t new. Meta itself released one several years ago to surface age, gender and skin tone discrimination in both computer vision and audio machine learning models. And a number of studies have been conducted on computer vision models to determine whether they’re biased against certain demographic groups. (Spoiler alert: they usually are.)

Then, there’s the fact that Meta doesn’t have the best track record when it comes to responsible AI.

Late last year, Meta was forced to pull an AI demo after it wrote racist and inaccurate scientific literature. Reports have characterized the company’s AI ethics team as largely toothless and the anti-AI-bias tools it’s released as “completely insufficient.” Meanwhile, academics have accused Meta of exacerbating socioeconomic inequalities in its ad-serving algorithms and of showing a bias against Black users in its automated moderation systems.

But Meta claims FACET is more thorough than any of the computer vision bias benchmarks that came before it — able to answer questions like “Are models better at classifying people as skateboarders when their perceived gender presentation has more stereotypically male attributes?” and “Are any biases magnified when the person has coily hair compared to straight hair?”
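A question like the skateboarder one can be framed as a gap in per-group recall for a single class. The sketch below is illustrative only: the record layout, the group labels and the function names are assumptions made for this example, not FACET's actual schema or Meta's evaluation code, and the toy records are invented rather than drawn from the dataset.

```python
from collections import defaultdict


def recall_by_group(records, target_class):
    """Return {group: recall} for one class.

    Each record is a (true_class, predicted_class, group) tuple.
    This layout is a hypothetical stand-in, not FACET's format.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for true_class, predicted_class, group in records:
        if true_class != target_class:
            continue  # recall only considers images that truly show the class
        totals[group] += 1
        if predicted_class == target_class:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


def recall_gap(records, target_class, group_a, group_b):
    """Recall for group_a minus recall for group_b on target_class."""
    recalls = recall_by_group(records, target_class)
    return recalls[group_a] - recalls[group_b]


# Invented toy predictions; the group labels are hypothetical.
records = [
    ("skateboarder", "skateboarder", "more_stereotypically_male"),
    ("skateboarder", "skateboarder", "more_stereotypically_male"),
    ("skateboarder", "skateboarder", "more_stereotypically_male"),
    ("skateboarder", "dancer", "more_stereotypically_male"),
    ("skateboarder", "skateboarder", "other_presentation"),
    ("skateboarder", "dancer", "other_presentation"),
    ("doctor", "doctor", "other_presentation"),
]

recalls = recall_by_group(records, "skateboarder")
gap = recall_gap(
    records,
    "skateboarder",
    "more_stereotypically_male",
    "other_presentation",
)
print(recalls, gap)
```

A gap near zero would suggest the model finds skateboarders about equally well in both groups, while a large positive or negative value points to a disparity worth investigating.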

To create FACET, Meta had the aforementioned annotators label each of the 32,000 images for demographic attributes (e.g. the pictured person’s perceived gender presentation and age group), additional physical attributes (e.g. skin tone, lighting, tattoos, headwear and eyewear, hairstyle and facial hair, etc.) and classes. They combined these labels with other labels for people, hair and clothing taken from Segment Anything 1 Billion, a Meta-designed dataset for training computer vision models to “segment,” or isolate, objects and animals from images.

The images in FACET were sourced from Segment Anything 1 Billion, Meta tells me, and those images were in turn purchased from a “photo provider.” But it’s unclear whether the people pictured in them were made aware that the pictures would be used for this purpose. And, at least in the blog post, it’s not clear how Meta recruited the annotator teams or what wages they were paid.

Historically and even today, many of the annotators employed to label datasets for AI training and benchmarking come from developing countries and have incomes far below the U.S.’ minimum wage. Just this week, The Washington Post reported that Scale AI, one of the largest and best-funded annotation firms, has paid workers at extremely low rates, routinely delayed or withheld payments and provided few channels for workers to seek recourse.

In a white paper describing how FACET came together, Meta says that the annotators were “trained experts” sourced from “several geographic regions” including North America (United States), Latin America (Colombia), Middle East (Egypt), Africa (Kenya), Southeast Asia (Philippines) and East Asia (Taiwan). Meta used a “proprietary annotation platform” from a third-party vendor, it says, and annotators were compensated “with an hour wage set per country.”

Setting aside FACET’s potentially problematic origins, Meta says that the benchmark can be used to probe classification, detection, “instance segmentation” and “visual grounding” models across different demographic attributes.

As a test case, Meta applied FACET to its own DINOv2 computer vision algorithm, which as of this week is available for commercial use. FACET uncovered several biases in DINOv2, Meta says, including a bias against people with certain gender presentations and a tendency to stereotypically identify pictures of women as “nurses.”

“The preparation of DINOv2’s pre-training dataset may have inadvertently replicated the biases of the reference datasets selected for curation,” Meta wrote in the blog post. “We plan to address these potential shortcomings in future work and believe that image-based curation could also help avoid the perpetuation of potential biases arising from the use of search engines or text supervision.”

No benchmark is perfect. And Meta, to its credit, acknowledges that FACET might not sufficiently capture real-world concepts and demographic groups. It also notes that many depictions of professions in the dataset might’ve changed since FACET was created. For example, most doctors and nurses in FACET, photographed during the COVID-19 pandemic, are wearing more personal protective equipment than they would’ve before the health crisis.

“At this time we do not plan to have updates for this dataset,” Meta writes in the whitepaper. “We will allow users to flag any images that may be objectionable content, and remove objectionable content if found.”

In addition to the dataset itself, Meta has made available a web-based dataset explorer tool. To use it and the dataset, developers must agree not to train computer vision models on FACET — only evaluate, test and benchmark them.
