The Tow Center for Digital Journalism has researched a subject close to my heart: digital archiving. How well are our news organizations preserving their own reporting, both to provide context and to serve as a resource for future generations?
"What we found was that the majority of news outlets had not given any thought to even basic strategies for preserving their digital content, and not one was properly saving a holistic record of what it produces. Of the 21 news organizations in our study, 19 were not taking any protective steps at all to archive their web output. The remaining two lacked formal strategies to ensure that their current practices have the kind of longevity to outlast changes in technology."
There are a number of factors to consider here. At least some of this problem is driven by the mindsets of people who run news organisations. Almost by definition, people who run those organisations tend to be news-obsessed; their focus is on the now or the very recent past. Thus their approach to archive content is very much in line with that outdated British joke: today's newspaper is tomorrow's fish and chips wrapper.
Possibly as a result of that, most working journalists massively underestimate the amount of their site's traffic that heads into what we might call the "archives" - non-current content. Part of that is down to the appeal of "evergreen" content - journalism that is not rooted in time. But it's also worth remembering that even non-evergreen journalism can take on new meaning through connection with later events, or through anniversaries. The internet allows us to do what paper never could: interconnect today's events with the previous ones that led us there.
Attention on particular subjects or events can go through peaks and troughs, and the ability to capitalize on that as spikes in attention emerge can be really valuable for publishers. The Financial Times has actually found a way to automate this:
"On average, the articles flagged by the tool and re-promoted have seen more engagement than the FT's average Facebook posts, getting three times more clicks. They have also performed better than archival content shared by editors without using the dashboard, receiving between 1.5 and 1.7 more clicks on average, as well as more shares and comments."
(That's a two-year-old piece from journalism.co.uk - see the value?)
That sort of approach drives traffic, which can, in turn, drive revenue - either through advertising, or through proving that the site has deeper value to a paying subscriber. That latter factor has been important for many paywalled B2B titles, which have found that their subscribers are as likely to use the site as a research tool as a news one.
There's gold in them there archives
However, more philosophically, surfacing and inter-linking archive content allows much deeper context setting, exploration of the history of events - and thus, hopefully, deeper understanding. In an age of declining trust in journalism, and the rise of hyper-partisan or misinformation sites, our depth of history can be a compelling selling point in building loyalty. A recent, malicious site can imitate the look and style of traditional reporting - but it can't generate the history and depth of reporting. It's an advantage we refuse to acknowledge that we have.
So, this is an issue that's just as much about good business sense - making use of content you've already paid to generate - as it is about the historical record. Most news organisations have woefully neglected their archives, leaving print material undigitized, and allowing older digital content to fall off the web, or become less findable through link rot.
Not everyone is guilty of this. A client I worked with last year had, much to my delight, already paid to digitize the majority of its archives going back decades. That gave it a significant competitive advantage as it started working on a paywall strategy. They are still very much the exception, rather than the rule.
Journalism's changing relationship with time
In fact, this lack of attention to our old content is a symptom of a quite deep problem: the industry's lack of awareness that the internet has fundamentally changed our relationship with time. One aspect of that is well-discussed, as many journalists are ground down by the pressure to become a 24-hour rolling news service, with an ever-hastening news cycle.
The other shift - that of our old content being available, potentially forever - is less well-discussed. Smart publishers are finding ways to exploit the evergreen content in their archives. Less smart ones are purging them. I once had a meeting with a project manager who was proposing to dump all site archives older than 1000 or so articles, to address a technical limitation in the new CMS they were implementing. The way the colour drained from his face when he saw how much traffic arrived on those older, at-risk posts still sticks in my mind, the best part of a decade later.
This is the reason I tend to use the phrase "journalism industry" rather than "news business". Our work can be valuable long after it ceases to be news, in this day and age, at least. But taking advantage of that requires a pretty fundamental mindset shift.
And that might require hiring different people, whose instinct is not rooted in the now, but in the narrative. People who like to archive, and connect those archives to make them useful. In my experience, there's a heavy overlap here with people who are more interested in feature writing than news reporting; people for whom depth and connection resonate more than the adrenaline rush of breaking a story.
As the report's executive summary puts it:
"While there are a number of news archiving technologies being developed by both individuals and nonprofits, it is worth noting that preserving digital content is not, first and foremost, a technical challenge. Rather, it’s a test of human decision-making and a matter of priority. The first step in tackling an archival process is the intention to save content. News organizations must get there."
And perhaps we first need to accept that we are journalism organizations, in which news is only the beginning of what we do, rather than the entirety of it.