Mirko Tobias Schäfer / Assistant Professor
University of Utrecht Department for Media and Culture Studies

Making Sense of News Archives

The Guardian's archive consists of more than a million articles going as far back as 1999, as well as multimedia content such as podcasts, videos and images. The Guardian came up with a three-level model for making the content available. On the first level, users can employ headlines of the Guardian's articles, tags and metadata for free and without prior registration. The second level provides access to the full articles, which feature advertising embedded by the Guardian and are tracked for performance. For the second level, users are required to register; they then receive an access key in order to integrate the Guardian's content into their application. The Guardian does not demand a share of any revenue generated by web applications featuring content from its archive. On the third level, customized solutions can be negotiated directly with the Guardian. The Guardian also uses data from various other sources, which are provided through the Open Platform. World Government Data, for example, is a gateway to various open government archives around the world, mostly UK- and US-based.

While other newspapers feel threatened by blogs and other services employing their content, the Guardian recognizes the potential of its otherwise idle archive. The Open Platform is a good example of what Tim O'Reilly has dubbed an "architecture of participation". By integrating the well-developed media practice of employing newspaper content in blogs and other services, the Guardian finds a way of reaching out to new audiences, monetizing yesterday's news and, even more importantly, exploring completely new ways of using the newspaper's archive.

By opening the archive through an Application Programming Interface (API), the Guardian lets users integrate the content in a variety of unpredictable ways. The results are surprising mash-up sites featuring content from various sources. A remarkable example is the Gaza Attack Map, which provides a Google Maps view, a timeline and the related news footage for the 21 days of the Israeli attack on Gaza.
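To give an idea of what such an integration can look like in practice, here is a minimal Python sketch. It is an illustration under assumptions, not code from the Guardian: the endpoint `https://content.guardianapis.com/search`, the parameters `q`, `api-key`, `show-tags` and `page-size`, and the `response` / `results` / `tags` / `webTitle` layout of the JSON answer reflect my understanding of the publicly documented Content API and are not described in this article. The function names `build_search_url` and `count_tags` are hypothetical. The sketch only builds a request URL and tallies tags from an already decoded response; it does not contact the server.

```python
from collections import Counter
from urllib.parse import urlencode

# Assumed search endpoint of the Guardian's Content API (not given in the article).
SEARCH_ENDPOINT = "https://content.guardianapis.com/search"


def build_search_url(query, api_key, page_size=10):
    """Build a search URL for a keyword query (hypothetical helper).

    The parameter names are assumptions based on the public API documentation.
    """
    params = {
        "q": query,               # keyword(s) to search the archive for
        "api-key": api_key,       # access key received after registration
        "show-tags": "keyword",   # ask for the content tags of each article
        "page-size": page_size,   # number of results per page
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def count_tags(decoded_response):
    """Count how often each tag title occurs in a decoded JSON response.

    Assumes the layout {"response": {"results": [{"tags": [{"webTitle": ...}]}]}}.
    """
    counts = Counter()
    for article in decoded_response.get("response", {}).get("results", []):
        for tag in article.get("tags", []):
            counts[tag["webTitle"]] += 1
    return counts


if __name__ == "__main__":
    # "YOUR-KEY" is a placeholder, not a working access key.
    print(build_search_url("gaza", "YOUR-KEY"))
```

A mash-up site would fetch the URL, decode the JSON and pass the result to something like `count_tags` to decide, for instance, how large each tag should be drawn in a visualization.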

The Guardian Topic Researcher is a search function that trawls the database for various keywords, retrieving all related articles. The Guardian Tag Bubbles visualize the news as bubbles according to the content tags. Guardian Trends visualizes the headlines using the same Google graphs that display stock market information in Google Finance. I am quite sure that many more applications will appear soon, finding many different ways of displaying, researching and visualizing the Guardian's content. The various applications will be published in the Application Gallery.

The Guardian's strategy of using an open platform to harness its archival data can be seen as an integration of recently established media practices into its traditional newspaper business. In contrast to many of its competitors, the Guardian does not seek simply to replace the old-fashioned way of selling printed paper with an online edition, but finds new ways to profit from its otherwise idle archive and to reach out to a community of potential collaborators and co-developers.

Date May 2010 Category News

While other newspapers have recently decided to lock their content behind paid-access gates or even exclude it from search engines, the Guardian chooses a timely approach for its archive. With the Guardian Open Platform, the British newspaper makes its archive accessible to users, who can integrate the content into all kinds of mash-up sites and web applications. The benefits are mutual, of course: while users gain access to a large archive of newspaper content, the Guardian turns dead data into a dynamic resource that will thrive on the labour of others and simultaneously reach audiences for advertising.

