Four steps for drafting an ethical data practices blueprint

In 2019, UnitedHealthcare’s health-services arm, Optum, rolled out a machine learning algorithm to 50 healthcare organizations. With the aid of the software, doctors and nurses were able to monitor patients with diabetes, heart disease and other chronic ailments, as well as help them manage their prescriptions and arrange doctor visits. Optum is now under investigation after research revealed that the algorithm allegedly recommended paying more attention to white patients than to sicker Black patients.

Today’s data and analytics leaders are charged with creating value with data. Given their skill set and purview, they are also in the organizationally unique position to be responsible for spearheading ethical data practices. Without an operationalizable, scalable and sustainable data ethics framework, organizations risk bad business practices, violations of stakeholder trust, damage to brand reputation, regulatory investigations and lawsuits.

Here are four key practices that chief data officers/scientists and chief analytics officers (CDAOs) should employ when creating their own ethical data and business practice framework.

Identify an existing expert body within your organization to handle data risks

The CDAO must identify and execute on the economic opportunity for analytics, and with opportunity comes risk. Whether the use of data is internal — for instance, increasing customer retention or supply chain efficiencies — or built into customer-facing products and services, these leaders need to explicitly identify and mitigate risk of harm associated with the use of data.

A great way to begin building ethical data practices is to look to existing groups, such as a data governance board, that already tackle questions of privacy, compliance and cyber-risk. Dovetailing an ethics framework with existing infrastructure increases the probability of successful and efficient adoption. Alternatively, if no such body exists, a new body should be created with relevant experts from within the organization. The data ethics governing body should be responsible for formalizing data ethics principles and operationalizing those principles for products or processes in development or already deployed.

Ensure that data collection and analysis are appropriately transparent and protect privacy

All analytics and AI projects require a data collection and analysis strategy. Ethical data collection must, at a minimum, include: securing informed consent when collecting data from people; ensuring legal compliance, such as adhering to GDPR; anonymizing personally identifiable information so that it cannot reasonably be reverse-engineered to reveal identities; and protecting privacy.
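One common building block for the anonymization step is pseudonymization: stripping free-text identifiers and replacing join keys with a keyed hash so records can still be linked for analysis. The sketch below is a minimal illustration, not a complete anonymization pipeline; the field names (`name`, `email`, `patient_id`) and the salt are assumptions, and keyed hashing alone does not guarantee anonymity if quasi-identifiers (age, ZIP code, etc.) remain in the data.

```python
import hashlib
import hmac

# Assumed secret key for illustration only; in practice, load it from a
# secrets manager rather than hard-coding it in source.
SALT = b"replace-with-a-secret-key"

def pseudonymize(value: str) -> str:
    """Replace a direct identifier with a keyed hash, so records can be
    joined for analysis without exposing the raw identifier."""
    return hmac.new(SALT, value.encode("utf-8"), hashlib.sha256).hexdigest()

def scrub_record(record: dict) -> dict:
    """Drop direct identifiers and pseudonymize join keys.

    Field names below are hypothetical; a real pipeline would take them
    from a reviewed data dictionary.
    """
    direct_identifiers = {"name", "email", "phone"}  # removed outright
    join_keys = {"patient_id"}                       # hashed, not dropped
    cleaned = {}
    for key, value in record.items():
        if key in direct_identifiers:
            continue
        if key in join_keys:
            cleaned[key] = pseudonymize(str(value))
        else:
            cleaned[key] = value
    return cleaned
```

Because the hash is deterministic for a given salt, the same patient maps to the same pseudonym across datasets, which preserves the ability to link records while keeping the raw identifier out of analysts' hands.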

Some of these standards, like privacy protection, do not necessarily have a hard and fast level that must be met. CDAOs need to assess the right balance between what is ethically wise and how their choices affect business outcomes. These standards must then be translated to the responsibilities of product managers who, in turn, must ensure that the front-line data collectors act according to those standards.

CDAOs also must take a stance on algorithmic ethics and transparency. For instance, should an AI-driven search function or recommender system strive for maximum predictive accuracy, providing a best guess as to what the user really wants? Is it ethical to micro-segment, limiting the results or recommendations to what other “similar people” have clicked on in the past? And is it ethical to include results or recommendations that are not, in fact, predictive, but profit-maximizing to some third party? How much algorithmic transparency is appropriate, and how much do users care? A strong ethical blueprint requires tackling these issues systematically and deliberately, rather than pushing these decisions down to individual data scientists and tech developers who lack the training and experience to make them.

Anticipate – and avoid – inequitable outcomes

Division and product managers need guidance on how to anticipate inequitable and biased outcomes. Inequalities and biases can arise simply from data collection imbalances — for instance, a facial recognition tool that has been trained on 100,000 male faces and 5,000 female faces will likely perform unevenly across genders. CDAOs must help ensure balanced and representative data sets.
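A simple, automatable first check for this kind of imbalance is to report each group's share of the training data and flag groups that fall below a review threshold. This is a minimal sketch; the 20% cutoff is an assumed, illustrative value, and what counts as "balanced" depends on the use case, not a single number.

```python
from collections import Counter

def group_balance(labels, threshold=0.2):
    """Report each group's count and share of the dataset, flagging any
    group whose share falls below `threshold`.

    `threshold=0.2` is an arbitrary illustrative cutoff, not a standard.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    report = {}
    for group, n in counts.items():
        share = n / total
        report[group] = {
            "count": n,
            "share": round(share, 3),
            "underrepresented": share < threshold,
        }
    return report

# Using the article's example: 100,000 male faces vs. 5,000 female faces.
report = group_balance(["male"] * 100_000 + ["female"] * 5_000)
# The female group's share is under 5%, so it is flagged for review.
```

Such a check catches only representation gaps in the raw counts; it says nothing about label quality or historical bias, which need the deeper review described below.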

Other biases are less obvious, but just as important. In 2019, Apple Card and Goldman Sachs were accused of gender bias when extending higher credit lines to men than women. Though Goldman Sachs maintained that creditworthiness — not gender — was the driving factor in credit decisions, the fact that women have historically had fewer opportunities to build credit likely meant that the algorithm favored men.

To mitigate inequities, CDAOs must help tech developers and product managers alike navigate what it means to be fair. While the computer science literature offers myriad metrics and definitions of fairness, developers cannot reasonably choose one without collaborating with the business managers and external experts who can offer a deep contextual understanding of how the data will eventually be used. Once standards for fairness are chosen, they must also be effectively communicated to data collectors to ensure adherence.
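To make concrete what one of those fairness metrics looks like, the sketch below computes the demographic parity difference: the gap between the highest and lowest positive-outcome rates across groups (e.g., approval rates for a credit line). This is only one metric among many — it can conflict with others, such as equalized odds — so choosing it is exactly the kind of context-dependent business decision described above, not a purely technical one.

```python
def demographic_parity_difference(predictions, groups, positive=1):
    """Gap between the highest and lowest positive-prediction rates
    across groups. 0.0 means every group receives the positive outcome
    at the same rate; larger values indicate greater disparity."""
    tallies = {}  # group -> (total, positives)
    for pred, group in zip(predictions, groups):
        n, pos = tallies.get(group, (0, 0))
        tallies[group] = (n + 1, pos + (pred == positive))
    rates = {g: pos / n for g, (n, pos) in tallies.items()}
    return max(rates.values()) - min(rates.values())

# Hypothetical approval decisions for two groups: group "a" is approved
# 3 times out of 4, group "b" only 1 time out of 4.
gap = demographic_parity_difference(
    [1, 1, 1, 0, 1, 0, 0, 0],
    ["a", "a", "a", "a", "b", "b", "b", "b"],
)
# gap is 0.5 (0.75 vs. 0.25) — a disparity a review body would examine.
```

A dashboard of such metrics gives the ethics body and department "ethics champions" a shared, quantitative vocabulary for escalating concerns.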

Align organizational structure with the process for identifying ethical risk

CDAOs often build analytics capacity in one of two ways: via a center of excellence, in service to an entire organization, or a more distributed model, with data scientists and analytics investments committed to specific functional areas, such as marketing, finance or operations. Regardless of organizational structure, the processes and rubrics for identifying ethical risk must be clearly communicated and appropriately incentivized.

Key steps include:

  • Clearly establishing accountability by creating linkages from the data ethics body to departments and teams. This can be done by having each department or team designate its own “ethics champion” to monitor ethics issues. Champions need to be able to elevate concerns to the data ethics body, which can advise on mitigation strategies, such as augmenting existing data, improving transparency or creating a new objective function.
  • Ensuring consistent definitions and processes across teams through education and training around data and AI ethics.
  • Broadening teams’ perspectives on how to identify and remediate ethical problems by facilitating collaborations across internal teams and sharing examples and research from other domains.
  • Creating incentives — financial or other recognitions — to build a culture that values the identification and mitigation of ethical risk.

CDAOs are charged with the strategic use and deployment of data to drive revenue with new products and to create greater internal efficiencies. Too many business and data leaders today attempt to “be ethical” by simply weighing the pros and cons of decisions as they arise. This short-sighted view creates unnecessary reputational, financial and organizational risk. Just as a strategic approach to data requires a data governance program, good data governance requires an ethics program. Simply put, good data governance is ethical data governance.