Studies Question Value of AI-Assisted Police Reports
Some US police departments seem to be adopting AI-assisted police report products despite the technology’s deal-breaking civil liberties problems, some of which (such as evidentiary problems and the contamination of officers’ memories) are not even theoretically solvable. We laid this out in our 2024 white paper on AI-assisted police reports. But there have also been questions about the degree to which the technology even offers advantages for police officers and departments, and recent work by criminologists is intensifying those questions. We can debate the civil liberties problems with a new technology all day long, but if the tech doesn’t work, why bother?
In our white paper we cited an early study finding AI police report products didn’t actually save officers any time, and customer reports suggesting that Axon’s product “performs worse with longer interviews, pursuits, and traffic accidents” and “in loud environments or when unrelated conversations interject (such as radio chatter).”
A new report, along with others published since our white paper, has added to the evidence that this technology’s benefits may be illusory.
In the latest study, researchers tried to assess whether AI-assisted police reports are of higher quality than officer-drafted reports. They recruited 92 senior law enforcement officers (sergeants and above) from police departments across the US who had an average of 22 years of experience approving police reports. They gave these reviewers 80 police reports, 20 of which had been drafted with the Draft One AI product made by Axon.
The researchers asked the reviewers to judge which of the reports were written with AI, and they were unable to do so, with their accuracy on that score no better than a coin flip. Thus demonstrably ignorant of which reports were AI-assisted, the reviewers were then asked to rate each report on five measures: clarity, completeness, grammar, accuracy, and utility.
Report co-author Ian T. Adams, a criminology professor at the University of South Carolina (and former police officer), summarized the findings:
First, on the objective text measures: yes, AI is “better” in a narrow sense. AI assistance produces “better” writing, if “better” means longer words… more syllables per word...higher grade level, and reduced readability.
Second, on the subjective measures from 92 independent expert reviewers blind to condition: no reliable improvement on the composite [overall] quality measure.
Third, and I think the one that matters: AI reports were rated substantively and significantly worse on ACCURACY (p = .038). For a document whose downstream readers are prosecutors deciding to charge, defense attorneys probing for inconsistencies, and judges and juries reconstructing events months later, accuracy is the threshold condition. The estimated effect moves a report from roughly the 50th to the 36th percentile on perceived accuracy.
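For readers who want a feel for the size of that percentile shift, here is a rough back-of-the-envelope sketch. It assumes perceived-accuracy ratings are roughly normally distributed, which is our simplifying assumption and not necessarily how the study computed its estimate. The function name `percentile_shift_in_sd` is hypothetical and used only for illustration.

```python
from statistics import NormalDist


def percentile_shift_in_sd(start_pct: float, end_pct: float) -> float:
    """Convert a move between two percentiles into standard-deviation units.

    Assumes a standard normal distribution of ratings (an illustrative
    assumption, not something taken from the study).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.inv_cdf(end_pct / 100) - std_normal.inv_cdf(start_pct / 100)


# The drop Adams describes: from the 50th to the 36th percentile.
shift = percentile_shift_in_sd(50, 36)
print(f"{shift:.2f} standard deviations")
```

Under that assumption, the drop works out to a bit more than a third of a standard deviation.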
What worries me is that because of their access to minute data from body camera transcripts, AI-assisted reports may provide more detail than the average human and, with their trained-to-be-authoritative AI voice, could seem superior to what humans produce. But this report suggests that, whoever else might be fooled by AI “quality,” experienced police managers are not; they seem to recognize that in terms of accuracy and usefulness, the AI is a step backward, not forward. Given the critical role of police reports in our criminal justice system, and the other flaws with AI-assisted police reports, we must hope that not just the civil liberties problems but also the reality reflected in these officers’ assessments will overcome the hype about AI, the resulting drive to deploy it in many contexts where it is inappropriate, and the marketing power of Axon.
Adams sums up the implication of the report and the overall state of research on AI-assisted reports (hyperlinks added):
Tech in this space sells on two main claims: speed/efficiency and quality. The independent literature on those claims is limited, but it is not non-existent, and what exists points consistently the same direction. On speed and efficiency claims, Adams et al. (2024) finds no time savings from Draft One; Walley & Glasspoole-Bird (2025) finds an 18.5-minute increase per incident in a different AI tool; Becker et al. (2025) finds experienced developers slowed by 19% with AI coding tools; Rotenstein et al. (2026) finds modest time savings (~13 minutes per eight scheduled patient hours) for AI clinical scribes with no reduction in after-hours work (sample opts-in).
But even granting the literature is still developing, we can’t let the water get muddied: there is not a single independent evaluation that supports the vendor claims being made to police departments right now.
I had a job once where I had to clean up and sort a messy data set. I decided to write a macro that would do the job for me. The database wasn’t that large, and in the time it took me to code, test, debug, and run the macro I probably could have gone through the data and manually fixed the problems. But coding a solution was so much more interesting and fun! Fixing it that way may have taken longer, but it didn’t seem longer. I suspect this effect explains why people think AI saves time when it actually doesn’t, a gap that seems to be a trend in the research that has emerged so far. Large language models are still novel, and using them and fixing their work (if that is in fact done) may be more stimulating than trying to write from scratch.
But if the quality is lower, and it takes more time, and it raises insurmountable civil liberties issues, then we need our officers to stop playing with AI in their reports and just describe what happened, human-to-human.