In the fall of 2018, the New York Times published a piece about an experiment in which they used an algorithm to produce Halloween costume ideas. The results were amusing: “baseball clown”, “cat witch”, “king dog.” The algorithm combined random letters to make words, which it then compared to the set of real words in its training data. If it found a match, it kept the word and paired it with another one.
In this example, and in many more serious ones like it, the algorithm was given pre-existing patterns and taught to replicate them. The algorithm never “understood” the meaning of the words; it simply produced results that matched the patterns it had learned. If a word had been misspelled in the training data, the algorithm would perpetuate the error. If a word was obsolete, the algorithm's output would be out of date too. If a word had negative associations, the algorithm would be “rude.”
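The mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the published experiment: the vocabulary, the fixed three-letter word length, and the function names `random_word` and `generate_costume` are all invented here. The vocabulary deliberately contains a misspelling ("kat") to show that a pattern matcher cannot tell an error from a real word.

```python
import random
import string

# Hypothetical "training data": real words plus one typo ("kat").
VOCABULARY = {"cat", "dog", "bat", "elf", "kat"}
WORD_LENGTH = 3  # assumption: every word has the same length, to keep the sketch small


def random_word(rng, length=WORD_LENGTH):
    """Combine random lowercase letters into a candidate word."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def generate_costume(vocabulary, rng, max_tries=200_000):
    """Keep random strings that match the vocabulary and pair the first two."""
    kept = []
    for _ in range(max_tries):
        candidate = random_word(rng)
        if candidate in vocabulary:  # a match is kept, typo or not
            kept.append(candidate)
            if len(kept) == 2:
                return " ".join(kept)
    return None  # gave up without finding two matches


costume = generate_costume(VOCABULARY, random.Random(2018))
print(costume)
```

Nothing in the loop checks what a word means or whether it is spelled correctly. Membership in the training vocabulary is the only test, so whatever the data contains is what comes out.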
Bad feedback loops
Several real-world cases show how bad feedback loops like the one described above can harm society.
To name one, Amazon developed an algorithm to inform recruitment decisions and trained it on historical data from the previous 10 years, a period in which men dominated the technology industry. As a result, Amazon found that its algorithm discriminated against female applicants. Or take another example, from the US criminal justice system: a biased algorithm that gave black defendants higher risk scores was used to predict the risk of recidivism and influence sentencing in several states.
The problem is that simple algorithms treat all data as immutable, even data about our preferences, income, life situation or countless other shifting patterns. As a result, algorithms can trap people in their origins, their history or a stereotype. This should compel everyone who develops algorithms to pinpoint and address potential unintended consequences.
The Future of Privacy Forum (FPF), a nonprofit think tank, has identified four main types of harm—or unintended consequences—that algorithms can cause.
- Loss of Opportunity. As the Amazon example above suggests, biased hiring algorithms can lock some groups out of employment opportunities. Similar biases could keep people out of other opportunities such as higher education, welfare programs, health care plans and business loans.
- Economic Loss. Differential pricing and credit availability are two of the most common examples of economic loss. In one of the earliest examples of racially biased algorithms, Mr. Johnson, an African-American, had his credit limit drastically reduced from $10,800 to $3,800. The reason? He was shopping at places where the customer base was expected to have a poor credit repayment history.
- Social Detriment. Some examples of social detriments include confirmation bias, stereotypes, and other phenomena that impact how individuals organize and relate to each other. A simple example is your Facebook news feed, which is so well tailored to your online activity that you are most likely to see and read ideas which confirm your own beliefs about the world, regardless of whether they are biased or not. Any future recommendations will also be in line with your views.
- Loss of Liberty. The most serious harm of all is the loss of liberty. As in the criminal justice example above, a racially biased algorithm can lead to false crime predictions based on little more than race.
Most organizations are committed to avoiding these harms, and indeed most bias results from accident or negligence rather than intent. To avoid both, we should ask the following questions in the planning, design and evaluation phases of our work with algorithms and related AI applications:
- Is it appropriate to make this machine learning system?
This question puts the human back at the center of the algorithm and challenges its purpose. Why are we making this algorithm? Is its purpose to give someone an advantage over others? And if yes, is this an appropriate competitive space? What problem are we trying to solve with it? Perhaps there are some inherent biases in the historical data used to train the algorithm, and therefore the outcome will only reinforce those biases.
- In building this machine learning system, what is an inclusive and comprehensive technical approach?
Having established that the purpose of your algorithm is appropriate, what are the means that will get you there? This is what will ensure not only that your algorithm is inclusive and comprehensive, but that it continues to remain that way.
- Now that this machine learning system is built, are the results fair?
This question encourages you to evaluate the outputs of your algorithm. Just because the algorithm is working doesn’t mean it’s operating as intended, as we’ve seen in the examples above. Because we are still at a relatively early stage of AI development, the results from machine learning algorithms should always be scrutinized.
The difficulty in this case is defining what is “fair,” a complex question that transcends AI ethics. Even from a technical perspective, differing ways of quantifying “fairness” exist that are at odds with each other. In order to understand whether the results of an algorithm are fair, the creator needs to engage the impacted audiences to understand the context of fairness in the application of the algorithm.
- What second-order harms could exist?
Apart from the four main types of harm, there are others that might only become apparent over medium- to long-term use of the algorithm. Will the algorithm threaten people’s data privacy, for instance? What other uses of the algorithm and its results could there be? How will those affect people and society as a whole? Even though these are only speculations, it is important that developers address them early on, because doing so forces them to think about the non-technical, ethical side of creating the algorithm.
- Will the algorithm provide deterministic suggestions/outputs?
Finally, this question addresses how the algorithm will change over time. As the algorithm will use historical data to begin with, it will probably reinforce some of its patterns. If it doesn’t have the option to learn from new data, its outcomes will be deterministic. Thus, this question is about how we can improve the methods behind the algorithm and ensure that it learns from new data sets or that it recognizes subtle differences, so that it doesn’t repeat or exaggerate past trends.
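To make the third question more concrete: the point that different ways of quantifying fairness can be at odds with each other can be shown with a toy calculation. The sketch below is an illustration only. The applicant counts are invented, the names `Applicant`, `make_applicants`, `selection_rate` and `true_positive_rate` are hypothetical, and the two measures used (equal selection rates across groups, and equal selection rates among qualified people) are just two of many possible definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    group: str
    qualified: bool  # ground truth, assumed known for the sake of illustration
    selected: bool   # the algorithm's decision


def make_applicants(group, counts):
    """Expand {(qualified, selected): count} into a list of Applicant records."""
    return [
        Applicant(group, qualified, selected)
        for (qualified, selected), count in counts.items()
        for _ in range(count)
    ]


def selection_rate(applicants, group):
    """Share of the whole group that the algorithm selects."""
    members = [a for a in applicants if a.group == group]
    return sum(a.selected for a in members) / len(members)


def true_positive_rate(applicants, group):
    """Share of the group's qualified members that the algorithm selects."""
    qualified = [a for a in applicants if a.group == group and a.qualified]
    return sum(a.selected for a in qualified) / len(qualified)


# Invented data: ten applicants per group, five selected from each.
APPLICANTS = (
    make_applicants("A", {(True, True): 5, (True, False): 1, (False, False): 4})
    + make_applicants("B", {(True, True): 2, (False, True): 3, (False, False): 5})
)

parity_gap = abs(selection_rate(APPLICANTS, "A") - selection_rate(APPLICANTS, "B"))
opportunity_gap = abs(
    true_positive_rate(APPLICANTS, "A") - true_positive_rate(APPLICANTS, "B")
)
print(f"selection-rate gap: {parity_gap:.3f}")
print(f"qualified-selection gap: {opportunity_gap:.3f}")
```

By the first measure the two groups are treated identically, since half of each is selected. By the second, a qualified applicant in group A is less likely to be selected than a qualified applicant in group B. Which measure matters more is exactly the kind of question that needs input from the people affected.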
Putting ethics at the heart of development
We program algorithms to give us exactly what we have asked for, so we shouldn’t be surprised when they do.
None of the issues mentioned in this article are inherent in machine learning algorithms themselves. Instead, issues arise from the way they interact with society and the unintended consequences that can result from those interactions. As such, putting the ethical implications at the heart of the development of each new algorithm is vital.
One way to ensure this is by embracing public health models of governance, which treat issues as indicative of underlying drivers, rather than problems to be solved per se. Another would be to ensure algorithms can be adapted more readily to newer or better data, in ways that do not exaggerate historical patterns. We see this every day in the way AI at Spotify or Amazon quickly adapts recommendations to our latest searches.
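The difference between an algorithm frozen on historical data and one that adapts to newer data can be illustrated with a very small sketch. Everything here is an assumption made for illustration: the numbers are invented, the function names `frozen_estimate` and `adaptive_estimate` are hypothetical, and the exponential moving average is simply one easy-to-read update rule, not a method this article prescribes.

```python
def frozen_estimate(history):
    """Estimate learned once from historical data and never revised."""
    return sum(history) / len(history)


def adaptive_estimate(history, new_observations, learning_rate=0.3):
    """Start from the historical estimate, then nudge it toward each new observation.

    This is an exponential moving average: learning_rate controls how quickly
    older data is forgotten (an illustrative value, not a recommendation).
    """
    estimate = sum(history) / len(history)
    for observation in new_observations:
        estimate += learning_rate * (observation - estimate)
    return estimate


# Invented numbers: a rate that sat at 20% historically but is now 50%.
HISTORY = [0.2] * 10
RECENT = [0.5] * 10

frozen = frozen_estimate(HISTORY)
adaptive = adaptive_estimate(HISTORY, RECENT)
print(f"frozen: {frozen:.2f}, adaptive: {adaptive:.2f}")
```

The frozen estimate keeps reporting the historical pattern no matter what happens afterwards, while the adaptive one moves most of the way toward the recent data. Adapting is not automatically fair, though: if the new data is itself shaped by the algorithm's earlier decisions, updating on it can reinforce the old pattern rather than correct it.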
Finally, targeted research identifying individual problems and solutions is critical to the success of any effort to create more ethical AI. We need to see more resources—and more senior leadership attention—directed at ensuring algorithms do not have negative impacts on individuals or society. Just as data privacy and cyber security have moved from departmental to board-level issues, responsible governance of AI must be quickly elevated in importance by all organizations that use it.