Journalists champion Wayback Machine after news publishers limit article archiving
In January, Hanaa’ Tameez and I broke the story that The New York Times, The Guardian, and USA Today Co. had begun limiting the Wayback Machine’s access to their news articles. Our reporting showed that these decisions, including a “hard block” by the Times that started late last year, were driven by publishers’ concern that the Internet Archive’s free library of webpage snapshots could be scraped by AI companies to train their commercial models.
Now, journalists and digital rights nonprofit organizations are pushing back against this trend and advocating for news publishers to lift their restrictions.
On Monday, Wired first reported on the publication of a new petition organized by the digital rights nonprofit Fight for the Future. The open letter does not call for any specific policy from publishers, but “applauds” the Wayback Machine for its work “at a time where many major media outlets are questioning whether to allow the Wayback Machine to continue to preserve journalism.” The petition has already been signed by over 120 journalists, including Cory Doctorow, Taylor Lorenz, and Ron Suskind.
“The Internet Archive is a national treasure. I use it daily, and have for many, many years. I cannot imagine doing the work I do without it,” MS Now host Rachel Maddow wrote in a testimonial published alongside the letter.
“The Internet Archive preserves over two decades of original reporting on music and popular culture by MTV News,” wrote Michael Alex, the founding editor of the now-shuttered music and popular culture news site. “History needs stewards. The people of the Internet Archive do an outstanding job of preserving irreplaceable work and making it available to journalists and researchers.”
PressProgress reporter Brishti Basu also signed the petition, detailing an incident in which the Vancouver Police Department edited a press release after she published an article criticizing it for making misleading statements. The department then publicly accused her of falsifying information.
“I was able to use the Wayback Machine to immediately prove that the police department had changed their initial statement to make it look like I had lied in my article,” wrote Basu.
The petition follows a blog post published last month by the digital rights nonprofit Electronic Frontier Foundation (EFF), which cited our reporting. Joe Mullin, a senior policy analyst at EFF, called on news publishers to lift their limits on the Wayback Machine and instead take violating AI companies to court.
“In many cases, articles get edited, changed, or removed — sometimes openly, sometimes not. The Internet Archive often becomes the only source for seeing those changes,” wrote Mullin, noting that Wikipedia links to over 2.6 million news articles preserved by the Wayback Machine across 249 languages. “There are real disputes over AI training that must be resolved in courts. But sacrificing the public record to fight those battles would be a profound, and possibly irreversible, mistake.”
The recent rallying efforts by digital rights organizations echo public comments made by the Wayback Machine’s director, Mark Graham, in the weeks after our reporting was first published. In February, Graham published an opinion piece on the tech policy blog Techdirt.
“Whatever legitimate concerns people may have about generative AI, libraries are not the problem, and blocking access to web archives is not the solution; doing so risks serious harm to the public record,” Graham wrote.