Three ways to avoid bias in machine learning

At this moment in history it’s impossible not to see the problems that arise from human bias. Now magnify that bias by the scale of modern computing and you start to get a sense of just how dangerous human bias via machine learning can be. The damage can be twofold:

  • Influence. “If the AI said so, it must be true.” People trust the outputs of AI, so human bias that goes unnoticed in training can compound the problem by spreading to everyone who relies on those outputs.
  • Automation. AI models are sometimes plugged directly into a programmatic function, which can automate the bias.

But there is potentially a silver machine-learned lining. Because AI can help expose truth inside messy data sets, it’s possible for algorithms to help us better understand bias we haven’t already isolated, and spot ethically questionable ripples in human data so we can check ourselves. Exposing human data to algorithms exposes bias, and if we consider the outputs rationally, we can put machine learning’s aptitude for spotting anomalies to work.

But the machines can’t do it on their own. Even unsupervised learning is supervised in a loose sense, because data scientists choose the training data that goes into the models. If a human is the chooser, bias can be present. How the heck do we tackle such a bias beast? We will attempt to pick it apart.

The landscape of ethical concerns with AI

Bad examples abound. Consider the Carnegie Mellon finding that women were shown significantly fewer online ads for high-paying jobs than men were. Or recall the sad case of Tay, Microsoft’s teen-slang Twitter bot that had to be taken down after producing racist posts.

In the near future, such mistakes could result in hefty fines or compliance investigations, a conversation that’s already occurring in the U.K. Parliament. All mathematicians and machine learning engineers should consider bias to some degree, but that degree varies from instance to instance. A small company with limited resources will often be forgiven for accidental bias as long as the algorithmic vulnerability is fixed quickly; a Fortune 500 company, which presumably has the resources to ensure an unbiased algorithm, will be held to a tighter standard.

Of course, an algorithm that recommends novelty T-shirts does not need nearly as much oversight as an algorithm that decides what dose of radiation to give to a cancer patient. It’s these high-stakes decisions that will become the most pronounced when legal liability enters the discussion.

It’s important for builders and business leaders to establish a process for monitoring the ethical behavior of their AI systems.

Three keys to managing bias when building AI

There are already signs of self-correction in the AI industry. Researchers, for example, are looking at ways to reduce bias and strengthen ethics in rule-based artificial systems by taking human biases into account.

These are good practices to follow; it’s important to be thinking proactively about ethics regardless of the regulatory environment. Let’s take a look at three points to keep in mind as you work on your AI.

1. Choose the right learning model for the problem.

There’s a reason all AI models are unique: Each problem requires a different solution and provides varying data resources. There’s no single model to follow that will avoid bias, but there are considerations that can inform your team as it’s building.

For example, supervised and unsupervised learning models have their respective pros and cons. Unsupervised models that cluster or do dimensionality reduction can learn bias from their data set. If belonging to group A correlates highly with behavior B, the model can conflate the two. And while supervised models allow for more control over bias in data selection, that control can introduce human bias into the process.
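
One quick way to see whether group membership and a behavior move together in a data set is to measure their association before training. The sketch below computes the phi coefficient (the Pearson correlation for two yes/no variables) on made-up data. The function name `phi_coefficient` and the toy records are illustrative assumptions, not part of any particular library.

```python
import math

def phi_coefficient(pairs):
    """Phi coefficient for a list of (in_group, shows_behavior) booleans."""
    n11 = sum(1 for g, b in pairs if g and b)          # in group, shows behavior
    n10 = sum(1 for g, b in pairs if g and not b)      # in group, no behavior
    n01 = sum(1 for g, b in pairs if not g and b)      # outside group, shows behavior
    n00 = sum(1 for g, b in pairs if not g and not b)  # outside group, no behavior
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # one variable is constant, so no association can be measured
    return (n11 * n00 - n10 * n01) / denom

# Hypothetical toy data: 10 people in group A, 10 outside it.
pairs = (
    [(True, True)] * 8 + [(True, False)] * 2 +
    [(False, True)] * 2 + [(False, False)] * 8
)
phi = phi_coefficient(pairs)
print(f"association between group A and behavior B: {phi:.2f}")
```

A value near 0 suggests little association. A value near 1 or -1 suggests that a model trained on this data could treat the group label and the behavior as nearly interchangeable.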

Non-bias through ignorance (excluding sensitive information from the model) may seem like a workable solution, but it still has vulnerabilities. In college admissions, sorting applicants by ACT scores is standard, while taking their ZIP code into account might seem discriminatory. But because test scores can be affected by the preparatory resources available in a given area, including the ZIP code in the model could actually decrease bias.

Require your data scientists to identify the best model for a given situation. Sit down and talk them through the different strategies they can take when building a model. Troubleshoot ideas before committing to them. It’s better to find and fix vulnerabilities now, even if it means taking longer, than to have regulators find them later on.

2. Choose a representative training data set.

Your data scientists may do much of the legwork, but it’s up to everyone participating in an AI project to actively guard against bias in data selection. There’s a fine line you have to walk. Making sure the training data is diverse and includes different groups is essential, but segmentation in the model can be problematic unless the real data is similarly segmented.

It’s inadvisable, both computationally and in terms of public relations, to have different models for different groups. When there is insufficient data for one group, you could use weighting to increase its importance in training, but this should be done with extreme caution because it can lead to unexpected new biases.

For example, if you have only 40 people from Cincinnati in a data set and you try to force the model to consider their trends, you might need to use a large weight multiplier. Your model would then have a higher risk of mistaking random noise for trends, and you could end up with results like “people named Brian have criminal histories.” This is why you need to be careful with weights, especially large ones.
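
To get a feel for how much a large multiplier concentrates influence, one common diagnostic is Kish’s effective sample size, (sum of weights)² / (sum of squared weights). The sketch below applies it to the 40-person example. The total data set size of 10,000 and the multiplier of 50 are made-up assumptions for illustration only.

```python
def effective_sample_size(weights):
    """Kish's effective sample size: (sum w)^2 / (sum w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

# Hypothetical numbers: only the 40 comes from the example above.
n_total = 10_000     # assumed size of the whole data set
n_small = 40         # people from the under-represented group
multiplier = 50.0    # assumed weight applied to each of those 40 records

weights = [1.0] * (n_total - n_small) + [multiplier] * n_small
n_eff = effective_sample_size(weights)

# Share of the total training weight carried by the 40 up-weighted records.
small_share = multiplier * n_small / sum(weights)

print(f"effective sample size: {n_eff:.0f} of {n_total}")
print(f"weight share held by {n_small} records: {small_share:.1%}")
```

With these assumed numbers, 40 records carry roughly a sixth of the total weight and the effective sample size falls from 10,000 to roughly 1,300, which is one way to see why quirks in a handful of records can start to look like trends.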

3. Monitor performance using real data.

No company is knowingly creating biased AI, of course — all these discriminatory models probably worked as expected in controlled environments. Unfortunately, regulators (and the public) don’t typically take best intentions into account when assigning liability for ethical violations. That’s why you should be simulating real-world applications as much as possible when building algorithms.

It’s unwise, for example, to rely on test groups to vet algorithms already in production. Instead, run your statistical methods against real data whenever possible. Ask the data team to check simple test questions like “Do tall people default on AI-approved loans more than short people?” If they do, determine why.
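
A simple test question like the one above can be checked with a standard two-proportion z-test. The sketch below is one way to do it with plain Python. The function name `two_proportion_z` and all of the counts are hypothetical, chosen only to show the arithmetic.

```python
import math

def two_proportion_z(defaults_a, total_a, defaults_b, total_b):
    """z statistic for the difference between two default rates (pooled)."""
    rate_a = defaults_a / total_a
    rate_b = defaults_b / total_b
    pooled = (defaults_a + defaults_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 0.0  # nobody (or everybody) defaulted, so there is nothing to compare
    return (rate_a - rate_b) / se

# Hypothetical counts of defaults among AI-approved loans.
tall_defaults, tall_loans = 30, 200
short_defaults, short_loans = 20, 200

z = two_proportion_z(tall_defaults, tall_loans, short_defaults, short_loans)
print(f"tall default rate:  {tall_defaults / tall_loans:.1%}")
print(f"short default rate: {short_defaults / short_loans:.1%}")
print(f"z statistic: {z:.2f}")
```

By the usual convention, an absolute z value above about 1.96 corresponds to a difference that is significant at the 5% level in a two-sided test. With these made-up counts z comes out at about 1.5, so the gap would not clear that bar, but a real gap that did would be the cue to determine why.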

When you’re examining data, you could be looking for two types of equality: equality of outcome and equality of opportunity. If you’re working on AI for approving loans, equality of outcome would mean that people from all cities get loans at the same rates; equality of opportunity would mean that people who would have repaid the loan if given the chance are approved at the same rates regardless of city. Without the latter, the former can still hide unfairness: if defaulting on loans is far more common in one city, equal approval rates across cities can mean that reliable borrowers are treated differently depending on where they live.

Equality of outcome is easier to prove, but it also means you’ll knowingly accept potentially skewed data. Equality of opportunity is harder to prove, but it is at least morally defensible. It’s often practically impossible to ensure both types of equality, but oversight and real-world testing of your models should give you the best shot.
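
The two kinds of equality can be measured side by side. The sketch below computes the approval rate per city (equality of outcome) and the approval rate among applicants who would have repaid (equality of opportunity). The cities, the records and the function name `fairness_rates` are hypothetical, and the toy data assumes you know who would have repaid, which in practice is only observed for loans that were actually granted.

```python
from collections import defaultdict

def fairness_rates(records):
    """records: iterable of (city, would_repay, was_approved) tuples."""
    applied = defaultdict(int)
    approved = defaultdict(int)
    repayers = defaultdict(int)
    approved_repayers = defaultdict(int)
    for city, would_repay, was_approved in records:
        applied[city] += 1
        approved[city] += int(was_approved)
        if would_repay:
            repayers[city] += 1
            approved_repayers[city] += int(was_approved)
    # Equality of outcome: share of all applicants approved, per city.
    outcome = {c: approved[c] / applied[c] for c in applied}
    # Equality of opportunity: share of would-be repayers approved, per city.
    opportunity = {c: approved_repayers[c] / repayers[c] for c in repayers}
    return outcome, opportunity

# Hypothetical toy data: 10 applicants per city.
records = (
    [("X", True, True)] * 5 + [("X", True, False)] * 3 + [("X", False, False)] * 2 +
    [("Y", True, True)] * 5 + [("Y", False, False)] * 5
)
outcome, opportunity = fairness_rates(records)
print("approval rate per city:", outcome)
print("approval rate among would-be repayers:", opportunity)
```

In this made-up data both cities have the same overall approval rate, yet reliable borrowers in city X are approved less often than reliable borrowers in city Y. The outcome numbers look equal while the opportunity numbers do not.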

Eventually, these ethical AI principles will be enforced by legal penalties. If New York City’s early attempts at regulating algorithms are any indication, those laws will likely involve government access to the development process, as well as stringent monitoring of the real-world consequences of AI. The good news is that proper modeling principles can greatly reduce or eliminate bias, and those working on AI can help expose accepted biases, create a more ethical understanding of tricky problems and stay on the right side of the law, whatever it ends up being.