New laws needed to preserve public documents in digital age

Central repository for electronic documents would guard against risk of disappearance

Official documents at the National Archives. A growing number of public bodies are scanning past publications to make them more widely available. Photograph: Eric Luke

A big issue in managing the world of social media is the “right to be forgotten”. However, the move from paper to electronic communications raises other important issues, such as the “duty to remember” and how it can best be implemented.

More than 2,000 years ago, the library of Alexandria was one of the great institutions of the ancient world. For centuries, it was a famed repository of learning, and its destruction meant the loss of much accumulated wisdom.

Since then, libraries have played a vital role in collecting the fruits of human scholarship and making them available to a wider public. Over the last two millennia, drawing on libraries, every generation has been able to build on the accumulated knowledge of past generations, ensuring progress across all fields of learning. With the advent of printing, this accumulated learning gradually became available to a mass market.

In a move to ensure the preservation of the growing range of publications, the Copyright Act of 1911 required all publishers in these islands to send a copy of their output to a select range of libraries, including Trinity College Dublin. As a result, any material published in Ireland or Britain has been preserved for future generations.

Web pages

Today, the web has greatly expanded our access to knowledge. However, while it has disrupted the previous technology for storing and disseminating knowledge, the new medium has not yet been fully adapted to take over the role of the libraries of the past. In particular, there is a serious danger that important publications on the web may disappear, leaving a vital gap in our history.

Many publications by government agencies are now published only as web pages. However, as publications are superseded by later versions, as institutions change and as websites get redesigned, web pages also disappear. We very frequently encounter broken links to web pages that no longer exist. Thus publication to a web page is often an ephemeral matter.

Furthermore, some early publications may be in formats no longer easily read by today’s computer programs.

To address these issues, it is essential that public bodies, whether or not they publish directly to a web page, also publish these documents in portable document format (PDF). This format is so widely used that any successor format is very likely to provide backward access to PDF documents, ensuring their readability by future generations.

In addition, it is essential that all public bodies adopt the standard library practice of storing copies of past documents and providing structured access to them.

Ideally there should be a central repository for all Irish public documents published in electronic format. This would guard against the danger that the disappearance of a public body could result in the disappearance of its historical publications.

It has taken time for public bodies to adopt a suitable approach to archiving past publications. The Central Statistics Office has been doing it for some time, and there has been a notable improvement in the approach of the departments of finance and public expenditure to this task over the last year. However, there are still many documents falling through the electronic cracks into possible oblivion.

Scanning

A growing number of public bodies are scanning past publications to make them more widely available. However, unless the scans are very rigorously checked, it is vital that the paper copies are maintained. For example, the Oireachtas library makes a wide range of government publications available to the public in scanned form, but in many cases the scans are imperfect. If the paper copies were scrapped, we would permanently lose the information that they contain.

A further potential risk with electronic publication arises where documents include statistical appendices. Increasingly these appendices are being replaced with references to websites where the data is available in a more convenient format. However, the data on the relevant website is frequently updated. This makes it impossible to access the data as it stood when the publication was issued.

This can be a major problem when trying to understand past policy decisions. Some organisations have adopted the good practice of publishing spreadsheets online, making data easy to use. However, public bodies have further work to do to ensure the suitable preservation and archiving of historic data.
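One way a public body might keep the data behind a publication retrievable is to store a fixed copy alongside a short record of where it came from and when it was published. The following is only a minimal sketch in Python under assumptions of my own: the function names, the file layout and the example URL are all hypothetical and do not describe any existing archive's practice.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def archive_snapshot(data: bytes, source_url: str, archive_dir: Path, published: date) -> Path:
    """Store an unchanging copy of a dataset as it stood on the publication date.

    Hypothetical layout: the copy is named after its SHA-256 digest, with a
    small JSON record beside it noting the source and publication date.
    """
    digest = hashlib.sha256(data).hexdigest()
    archive_dir.mkdir(parents=True, exist_ok=True)

    snapshot = archive_dir / f"{digest}.dat"
    if not snapshot.exists():  # identical data is stored only once
        snapshot.write_bytes(data)

    record = {
        "source_url": source_url,
        "published": published.isoformat(),
        "sha256": digest,
    }
    (archive_dir / f"{digest}.json").write_text(json.dumps(record, indent=2))
    return snapshot


def verify_snapshot(snapshot: Path) -> bool:
    """Check that the stored copy still matches the digest in its file name."""
    return hashlib.sha256(snapshot.read_bytes()).hexdigest() == snapshot.stem
```

Because the copy is named after its own digest, a later update to the live website cannot silently replace it, and anyone can check that the archived file is the one the publication relied on.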

The Copyright Act of 1911 has served us well for more than a century. However, this legislation needs to be updated to ensure that, in our digital age, official documents and data are preserved for future generations.