Tips for Data Journalism in the Shadow of an Overbroad Anti-Hacking Law

How can we know if a housing website is suggesting the same homes to home-seekers of different races, or illegally steering some users toward neighborhoods where they demographically “belong”? Or whether employment websites are showing qualified women’s resumes to employers at the same rate as men’s?

Because such websites tend to make decisions that are automated by code that is proprietary and hidden, members of the public wouldn’t know unless someone tested the outcomes produced by these algorithms. Researchers and journalists want to do this testing to inform the debate about online business practices, just as they have long done in the offline world.

Unfortunately, they conduct these investigations in the shadow of a federal criminal statute called the Computer Fraud and Abuse Act, which perversely grants businesses that operate online the power to shut down any testing of their practices they don’t like. Intended to punish malicious hacking, the CFAA contains broad and vague language making it a crime to access a website in a manner that “exceeds authorized access.” This provision has been interpreted to prohibit an individual from visiting a website in a manner that violates the website’s terms of service. But common website terms of service prohibit activities like copying publicly available information (“scraping”), creating multiple accounts, or providing false information — even though these activities are often necessary for robust testing, including the kind of testing that would uncover discrimination on the internet. 

We are challenging this provision in federal court on behalf of a group of academic researchers and The Intercept, a media organization. The lawsuit seeks to remove the barrier posed by the CFAA’s overbroad criminal prohibitions. In the meantime, here is some advice for journalists and researchers doing this important work. You can read our full paper on this subject here.

First, do no harm.

To reduce the risk of liability, journalists should design their investigations to avoid placing too much stress on the target’s computers or servers. The idea is to ensure, to the extent possible, that the servers continue to function as they would without the investigation. Conducting a careful investigation makes it less likely that the target company can claim damage to its machines or its regular business operations.

Practically speaking, this means, for example, designing software to make a small number of requests repeatedly over a long period of time, rather than overwhelming a server by running all of the requests at once. Journalists should also consider running bots and scrapers at off-hours, when servers are not likely to be experiencing much traffic, though this may be impossible with some services (like trip or route planners) that are highly sensitive to the time of day they are tested. Finally, investigations that trigger real-world events — for example, hailing a car service or reserving lodging — should be limited in scope.
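As a rough illustration of the pacing described above, here is a minimal Python sketch. It is not drawn from any particular investigation: the function names (paced, in_off_hours), the 30-second delay, the jitter, and the 1 a.m. to 5 a.m. window are all assumptions chosen for the example, and the right values will depend on the service being tested. The sketch only spaces out the work and caps its total volume; how each request is actually made is left to the caller.

```python
import random
import time
from datetime import datetime, time as dtime
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def in_off_hours(now: datetime,
                 start: dtime = dtime(1, 0),
                 end: dtime = dtime(5, 0)) -> bool:
    """Return True if `now` falls in the off-hours window [start, end).

    The default 01:00-05:00 window is an assumption for illustration only.
    """
    t = now.time()
    if start <= end:
        return start <= t < end
    # The window wraps past midnight, e.g. 23:00 to 04:00.
    return t >= start or t < end


def paced(items: Iterable[T],
          min_delay_s: float = 30.0,
          jitter_s: float = 10.0,
          max_items: Optional[int] = None,
          sleep: Callable[[float], None] = time.sleep,
          rng: Callable[[], float] = random.random) -> Iterator[T]:
    """Yield items one at a time, pausing between consecutive items.

    min_delay_s and jitter_s (hypothetical defaults) spread requests over a
    long period instead of sending them all at once. max_items caps the total,
    which is one way to limit the scope of tests that trigger real-world
    events such as bookings.
    """
    for i, item in enumerate(items):
        if max_items is not None and i >= max_items:
            break
        if i > 0:
            # Wait before every item except the first.
            sleep(min_delay_s + jitter_s * rng())
        yield item
```

A caller might loop over paced(urls, max_items=200), check in_off_hours(datetime.now()) before each request, and stop for the day once the window closes. For services that are sensitive to time of day, the off-hours check would simply be dropped and the delay lengthened instead.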

Does fear of negative publicity protect you?

Imagine: A data journalist working for a major publication conducts an investigation that reveals that a platform operated by a large and publicly traded company systematically disadvantages women or people of color in some way. When the platform gets wind of the investigation, it sues the journalist and the publication claiming damages from the test. How would the company look to the public when news of this retaliatory suit got out?

In recent years, many technology companies have been sensitive to allegations of discrimination and to any publicity that makes them look like bad actors. This sensitivity might offer data journalists some protection. Exactly how much will depend on the footprint of the journalist and publication involved, the size and corporate culture of the target, and the extent to which its business is public-facing and reliant on the trust of its consumer base. It will also depend on the details of the investigation: the more newsworthy the topic, the more protection a journalist may have. For example, an investigation into gender discrimination in job recruiting may generate widespread interest, and the public attention that follows may offer more protection.

Consider informing the investigated entity.

Researchers might consider seeking permission for testing from the entities they want to investigate. If a target grants permission, that would preclude any argument that the testing activities violated the CFAA’s authorization provisions. However, if the targeted entity refuses permission, a researcher may find herself in a worse legal position than before if she goes ahead with the research. (There may be, of course, other downsides to seeking permission, including the possibility that the targeted entity makes it technologically impossible to conduct the proposed research.)

Mount a defense based on civil rights enforcement.

If a journalist conducting research into algorithmic discrimination is alleged to have accessed, copied, or published information obtained through falsity or deception, she could raise the defense that the online testing was the equivalent of offline testing long approved by the courts. Courts recognize, in the context of fair housing, that testers are necessary for enforcement, even though they are not genuinely interested in the housing they claim to seek during the test. Courts have even acknowledged that deception is involved in testing, and nonetheless have permitted it. As one appellate court put it:

“It is surely regrettable that testers must mislead commercial landlords and home owners as to their real intentions. . . Nonetheless, we have long recognized that this requirement of deception was a relatively small price to pay to defeat racial discrimination. The evidence produced by testers . . . is a major resource in society’s continuing struggle to eliminate the subtle but deadly poison of racial discrimination.”

Congress passed a statute ensuring that the federal government directly funds testing related to fair housing issues. Testing has similarly been recognized by some courts as a vital part of the enforcement of anti-discrimination laws in employment. The more closely an online audit test resembles these offline tests, the more persuasive this argument will likely be to a court.

Journalists and researchers can find more on how to protect themselves while conducting online investigations here.
