The Tenuous Nature of Online Archives
If historians are going to keep writing great political histories, or histories of any kind, we need access to primary sources. Increasingly, those primary sources are online. But as we know, anything online can disappear in a blink. There was a moment early in Google’s history when it hoped to digitize everything. Then it decided it had no interest in providing great public services it couldn’t monetize, so that effort died. In Milwaukee, the newspaper’s archives simply became so expensive that the public library couldn’t afford the service.
Where had Milwaukee’s history gone?
The archive had initially been made available on Google around 2008 as part of the company’s effort to digitize historical newspapers. That project ended in 2011, but not before Google had scanned more than 60 million pages covering 250 years of history’s first drafts. Those newspapers have remained publicly accessible, and serve both professional historians and home genealogists.
When the Milwaukee project began, Google used microfilms from the papers that had already been uploaded to the ProQuest research database. Because some things were missing from ProQuest, the Journal-Sentinel asked the Milwaukee Public Library to help out. The library let the company digitize decades of microfilms to bulk out the digital archives.
But as Google discontinued support for the project, the paper decided to construct its own archive. “It takes a long time to scan and get the archives up,” said James Conigliaro, the paper’s vice-president of digital strategy. “So we’ve been working on that.”
The paper had an existing relationship with Newsbank, a digitization and archiving company based in Florida. In 2014, Newsbank approached the Milwaukee Public Library about buying the rights to the Journal-Sentinel archives. The MPL already subscribed to two Newsbank services (an obituary archive and a modern database of the Journal-Sentinel) and regularly purchases proprietary databases whose subscription fees are in the low five figures. But it couldn’t afford the Journal-Sentinel archives.
In May, Newsbank came to the MPL again, offering a menu of purchase options. The most expansive offer was almost $1.5 million, with an annual hosting fee. That nearly amounted to the library’s entire $1.7 million annual materials budget. “To be asked to purchase outright something for a million dollars is just out of our scope of possibility,” said Paula Kiely, the library director.
Then, in August, Newsbank let the other shoe drop: According to Urban Milwaukee, Gannett—which purchased the paper in April—asked the Journal-Sentinel to ask Google to remove the paper’s digital archives, which the company did. It’s harder to sell a product when it’s being given away for free, after all.
Someone figured they could make money on history. They can only do that by charging an absolutely exorbitant amount of money. Many cities can’t pay, so the historical sources disappear. If someone must profit before primary sources can be made public, the future of the historical profession is grim indeed.