With AI and Criminal Justice, the Devil Is in the Data

If we have learned anything in the last decade about our criminal justice system, it is how astonishingly dysfunctional it is

Vincent Southerland, Executive Director, Center on Race, Inequality, and the Law, NYU Law.

April 9, 2018

If we have learned anything in the last decade about our criminal justice system, it is how astonishingly dysfunctional it is. Extensive investigations have revealed persistent racial disparities at every stage, a different kind of justice for the haves and the have nots, and a system that neither rehabilitates individuals nor ensures public safety. In short, the system is in crisis.

Rather than scrapping everything and starting anew, many criminal justice stakeholders have turned to technology to repair the breach through “risk assessment tools.” Also labeled artificial intelligence, automated decision-making, or predictive analytics, these tools have been touted as carrying with them the potential to save a broken system, and they now play a role at nearly every critical stage of the criminal justice process. If we’re not careful, however, these tools may exacerbate the same problems they are ostensibly meant to help solve.

It begins on the front lines of the criminal justice system with policing. Law enforcement agencies have embraced predictive analytics, which can pinpoint areas allegedly prone to criminal activity by examining historical patterns, and then deploy officers to those areas. In Chicago, for example, the predictive tools analyze complex social networks through publicly accessible data in an attempt to forecast likely perpetrators and victims of violent crime.

Once an individual is arrested, they are likely to be subjected to a pre-trial risk assessment tool. Such tools are used to inform the thinking of a judge who must decide whether to incarcerate that person pending trial or release them. Pre-trial risk assessments attempt to predict which of the accused will fail to appear in court or will be rearrested. Some states have used these pre-trial tools at the sentencing and parole stage, in an attempt to predict the likelihood that someone will commit a new offense if released from prison.

While all of this technology may seem to hold great promise, it also can come with staggering costs. The potential for bias to creep into the deployment of the tools is enormous. Simply put, the devil is in the data. Risk assessment tools generally rely on historical, actuarial data. Often, that data relates to the behavior of a class of people, such as individuals with criminal records. Sometimes it relates to the characteristics of a neighborhood. That information is run through an algorithm, a set of instructions that tells a computer model what to do. In the case of risk assessment tools, the model produces a forecast of the probability that an individual will engage in some particular behavior.
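
As a rough illustration of what "actuarial" means here, the sketch below estimates a probability for a class of people as nothing more than the frequency of a recorded outcome in past data for that class. It is a toy under stated assumptions, not any vendor's actual method: the function names `fit_base_rates` and `forecast`, the class labels, and the eight records are all hypothetical and invented for this example.

```python
from collections import defaultdict

def fit_base_rates(history):
    """Estimate, per class, how often the recorded outcome occurred.

    history: iterable of (class_label, outcome) pairs, where outcome is
    True if the event (for example, a rearrest) was RECORDED.
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [events, total]
    for label, outcome in history:
        counts[label][0] += int(outcome)
        counts[label][1] += 1
    return {label: events / total for label, (events, total) in counts.items()}

def forecast(base_rates, label):
    """The 'forecast' for an individual is simply their class's past rate."""
    return base_rates[label]

# Synthetic, made-up records used only to show the mechanics.
history = [
    ("prior record", True), ("prior record", True),
    ("prior record", True), ("prior record", False),
    ("no prior record", True), ("no prior record", False),
    ("no prior record", False), ("no prior record", False),
]

base_rates = fit_base_rates(history)
print(forecast(base_rates, "prior record"))
print(forecast(base_rates, "no prior record"))
```

Note that the model never sees behavior itself, only what was recorded. If the records reflect who was policed rather than who offended, the forecast inherits that distortion.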

That order of operations can be problematic given the range of data that fuels the forecast. Data scientists often refer to this type of problem as “garbage in, garbage out.” In a historically biased criminal justice system, the “garbage in” can have grave consequences. Imagine, for a moment, a city where Black people made up 67 percent of the population, but accounted for 85 percent of vehicle stops, 90 percent of citations issued, 95 percent of jaywalking charges, 94 percent of those charged with disobeying the order of an officer, and 93 percent of the arrests made by the city’s officers.

What about a city where Black people comprised 54 percent of the population, but accounted for 85 percent of pedestrian stops and 79 percent of arrests by police, and were 2.5 times more likely to be stopped by the police than their white counterparts? Or a police department that singled out a city’s Black and Latino residents for 83 percent of all stops, even though 88 percent of those stops resulted in no further action?

These aren’t imaginary cities or made-up numbers. They are drawn from Ferguson, Missouri; Newark, New Jersey; and New York City, respectively. It is now well known that the police forces in these cities engaged in racially biased policing on the false assumption that doing so was an effective means of fighting crime. In the case of Ferguson, fighting crime was only half the goal; generating revenue for the municipality through law enforcement was the other.
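
To put those percentages on a common scale, the sketch below divides a group's share of police activity by its share of the population, using only the Ferguson and Newark figures quoted above. The helper name `disparity_ratio` is hypothetical, and population share is a crude benchmark chosen here for simplicity; a ratio of 1.0 would mean stops or arrests were proportional to population.

```python
def disparity_ratio(share_of_activity, share_of_population):
    """Ratio of a group's share of police activity to its population share."""
    if not 0 < share_of_population <= 1:
        raise ValueError("share_of_population must be in (0, 1]")
    return share_of_activity / share_of_population

# Figures quoted in the article, expressed as fractions.
BLACK_POPULATION_SHARE = {"Ferguson": 0.67, "Newark": 0.54}
BLACK_SHARE_OF_ACTIVITY = {
    "Ferguson": {"vehicle stops": 0.85, "citations": 0.90, "arrests": 0.93},
    "Newark": {"pedestrian stops": 0.85, "arrests": 0.79},
}

for city, activities in BLACK_SHARE_OF_ACTIVITY.items():
    for activity, share in activities.items():
        ratio = disparity_ratio(share, BLACK_POPULATION_SHARE[city])
        print(f"{city}, {activity}: {ratio:.2f}x population share")
```

Every ratio computed from these figures is above 1.0, which is the pattern a model trained on the same records would absorb.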

Now consider the potential harm done when police departments like these use their crime data to feed the algorithms and models used to predict behavior. If one examined only the data, the working assumption would be that white people rarely engage in criminal activity. Most algorithms would simply treat these disparate numbers as a real, consistent pattern of criminal behavior by individuals.

The data provides a distorted picture of the neighborhoods where crime is happening that, in turn, drives more police to those neighborhoods. Police then come into contact with more people from those communities, and by virtue of more contact, make more arrests. Those arrests — regardless of their validity or constitutionality — are interpreted as indicative of criminal activity in a neighborhood, leading to a greater police presence. The result, as mathematician and data scientist Cathy O’Neil calls it in “Weapons of Math Destruction,” is “a pernicious feedback loop,” where “the policing itself spawns new data, which justifies more policing.”
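
The loop can be sketched with a deliberately simple simulation. Everything in it is an assumption made for illustration, not a description of any deployed system: two neighborhoods share the same underlying crime rate, patrols are allocated in proportion to recorded arrests raised to a hypothetical `emphasis` exponent, and new arrests are proportional to patrols times the crime rate.

```python
def simulate_feedback(initial_arrests, true_crime_rate, rounds=5,
                      emphasis=1.5, total_patrols=100.0):
    """Return each neighborhood's share of recorded arrests after each round.

    Toy model: patrol share is proportional to recorded_arrests ** emphasis,
    and arrests recorded in a round are patrols * true_crime_rate.
    """
    arrests = list(initial_arrests)
    history = []
    for _ in range(rounds):
        weights = [a ** emphasis for a in arrests]
        total_weight = sum(weights)
        patrols = [total_patrols * w / total_weight for w in weights]
        # Recorded arrests depend on police presence, not only on crime.
        arrests = [p * r for p, r in zip(patrols, true_crime_rate)]
        total = sum(arrests)
        history.append([a / total for a in arrests])
    return history

# Same true crime rate in both places; neighborhood 0 simply starts with
# more recorded arrests because it was policed more heavily in the past.
shares = simulate_feedback(initial_arrests=[60.0, 40.0],
                           true_crime_rate=[0.1, 0.1])
for round_number, (a, b) in enumerate(shares, start=1):
    print(f"round {round_number}: neighborhood 0 = {a:.2f}, neighborhood 1 = {b:.2f}")
```

With `emphasis` above 1, which stands in for concentrating patrols on apparent hot spots, the recorded gap widens every round even though the two neighborhoods are identical in actual crime. With `emphasis=1.0` the gap does not grow, but it also never corrects: the historical disparity keeps reproducing itself in the new data.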

Any system that relies on criminal justice data must contend with the vestiges of slavery, de jure and de facto segregation, racial discrimination, biased policing, and explicit and implicit bias, which are part and parcel of the criminal justice system. Otherwise, these automated tools will simply exacerbate, reproduce, and calcify the biases they are meant to correct.

These concerns aren’t theoretical. In a piece two years ago, reporters at ProPublica sparked a debate about these tools by highlighting the racial bias embedded in risk assessments at pretrial bail hearings and at sentencing. That study found that Black defendants were more likely to be wrongly labeled high risk than white defendants.
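
"Wrongly labeled high risk" corresponds to what statisticians call a false positive: a person flagged as high risk who did not go on to reoffend. A minimal sketch of how that rate could be compared across groups follows. The `Record` type, the function `false_positive_rate`, the group names, and the records themselves are hypothetical and synthetic; they are not ProPublica's data or code.

```python
from collections import namedtuple

Record = namedtuple("Record", ["group", "labeled_high_risk", "reoffended"])

def false_positive_rate(records, group):
    """Among people in `group` who did NOT reoffend, the fraction labeled high risk."""
    did_not_reoffend = [r for r in records
                        if r.group == group and not r.reoffended]
    if not did_not_reoffend:
        return None  # rate is undefined without any non-reoffenders
    wrongly_flagged = sum(1 for r in did_not_reoffend if r.labeled_high_risk)
    return wrongly_flagged / len(did_not_reoffend)

# Synthetic, made-up records used only to show the calculation.
records = [
    Record("group A", True, False), Record("group A", True, False),
    Record("group A", False, False), Record("group A", False, False),
    Record("group A", True, True),
    Record("group B", True, False), Record("group B", False, False),
    Record("group B", False, False), Record("group B", False, False),
    Record("group B", True, True),
]

for group in ("group A", "group B"):
    print(group, false_positive_rate(records, group))
```

A tool can look equally accurate overall for two groups and still produce very different false positive rates, which is why an audit needs to break errors out by group rather than report a single accuracy figure.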

Humans have always deployed technology with the hope of improving the systems that operate around them. For risk assessments to advance justice, those who seek to use them must confront racism head-on, recognize that it is infecting decisions and leading to unjust outcomes, and make its eradication the ultimate goal of any tool used. When the data reveals racism and bias in the system, risk assessment tools must account for that bias.

This means privileging the voices of communities and those with experience in the criminal justice system so that the quantitative data is informed by qualitative information about those numbers and the human experiences behind them. It means employing the tool in a criminal justice ecosystem that is devoted to due process, fairness, and decarceration.

Finally, it requires the implementation of frameworks that ensure algorithmic accountability. An Algorithmic Impact Assessment is one such framework, proposed by the research institute AI Now in the context of New York City’s efforts to hold public agencies accountable in their automated decision-making. AIAs do so by publicly listing how and when algorithms are used to make decisions in people’s lives, providing meaningful access for independent auditing of these tools, increasing the expertise and capacity of agencies that use the tools, and allowing the public opportunities to assess and dispute the way entities deploy the tools.

No system or tool is perfect. But we should not add to the problems in the criminal justice system with mechanisms that exacerbate racism and inequity. Only by making a commitment to antiracist and egalitarian values and to frameworks for accountability can well-intended reformers ensure that these new tools are used for the public good.

Vincent Southerland is the executive director of the Center on Race, Inequality, and the Law at NYU School of Law. He previously served as a senior counsel with the NAACP Legal Defense and Educational Fund, where he focused on race and criminal justice, and as a public defender with The Bronx Defenders and the Federal Defenders of New York.

This piece is part of a series exploring the impacts of artificial intelligence on civil liberties. The views expressed here do not necessarily reflect the views or positions of the ACLU.
