Almost everyone realises that it may be difficult to access electronic information in the distant future. The truth is that there is a high risk that information will be lost (more accurately, that access to information will be lost – which amounts to the same thing) over the course of years if no action is taken to avoid losses.
The factors that combine to cause this risk are related to physical and chemical degradation of storage media, and to the evolution of technologies. The first, degradation, is relatively easy to avoid by good data management disciplines (backup, offsite storage, good environments, media rotation and so on); you are not likely to need consultants to advise on this. The second, evolution of technologies, causes two problems: one related to hardware, the other to software. Here again, the former is easy to handle, simply by ensuring that information is continually migrated to current media. But the last issue – software and format obsolescence – is less easy.
Strategies exist to minimise the risks of software and format obsolescence. But few organisations have programmes to implement them. Inforesight brings together consultants who have decades of experience in digital preservation; we can work with you to determine a strategy, and to design a programme, that will minimise the risks to a level appropriate to your organisation.
Here is just one example of a perhaps-unexpected digital preservation issue.
Most people think that PDF is a relatively future-proof storage format, but few realise the problems it can bring in practice. One problem is the proportion of PDF files that do not conform to the standard definition of PDF, and so are difficult or impossible to access. The Jhove project recently reported that, in a sample of 9,141,011 PDF files, 1% were ill-formed and 8% were not valid. That means 9% were problematic: out of a sample of over 9 million files, roughly 800,000. That’s a lot of files to fix.
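The arithmetic behind that figure can be checked with a short Python sketch. The sample size and the two percentages come from the text above; the low and high bounds rest on an assumption that is not in the reported figures, namely that each percentage was rounded to the nearest whole number.

```python
# Sample size and percentages as quoted in the text.
SAMPLE_SIZE = 9_141_011
ILL_FORMED_PCT = 1
NOT_VALID_PCT = 8


def estimate(percent: float) -> int:
    """Number of files corresponding to a percentage of the sample."""
    return round(SAMPLE_SIZE * percent / 100)


# Point estimate using the percentages exactly as quoted.
point = estimate(ILL_FORMED_PCT + NOT_VALID_PCT)

# ASSUMPTION: each percentage was rounded to the nearest whole number,
# so the true combined share lies somewhere between 8% and 10%.
low = estimate((ILL_FORMED_PCT - 0.5) + (NOT_VALID_PCT - 0.5))
high = estimate((ILL_FORMED_PCT + 0.5) + (NOT_VALID_PCT + 0.5))

print(f"Point estimate: {point:,} problematic files")
print(f"Plausible range: {low:,} to {high:,}")
```

Because the quoted percentages are whole numbers, the result is best read as an order of magnitude rather than an exact count.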
Another unexpected problem can be caused by some TIFF files.
TIFF, regarded as a standard despite its proprietary nature, is also thought of as relatively future-proof. Yet some early document imaging programs created non-standard TIFF files, which cannot reliably be read by all TIFF viewers. One example is the early versions of "Imaging for Windows", distributed in the mid-to-late 1990s. If your organisation has large numbers of files created with those versions, you may have a large problem building up for the future.
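One possible first step in sizing such a problem is to inventory which program wrote each TIFF file. The Python sketch below reads the optional Software field (tag 305 in the TIFF 6.0 specification) from the first image directory of a file. It is an illustration under stated assumptions, not a validator: it assumes the creating program recorded its name in that field, which many programs do not, and the function name `tiff_software` is hypothetical.

```python
import struct

SOFTWARE_TAG = 305  # TIFF 6.0 "Software" field
ASCII_TYPE = 2      # TIFF field type for NUL-terminated ASCII strings


def tiff_software(data: bytes):
    """Return the Software string from the first IFD, or None if absent.

    Hypothetical helper: it only reads the first image file directory
    and does not check whether the file is otherwise well-formed.
    """
    if len(data) < 8:
        return None
    if data[:2] == b"II":      # little-endian byte order
        endian = "<"
    elif data[:2] == b"MM":    # big-endian byte order
        endian = ">"
    else:
        return None
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42 or ifd_offset + 2 > len(data):
        return None
    (n_entries,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(n_entries):
        start = ifd_offset + 2 + 12 * i
        entry = data[start:start + 12]
        if len(entry) < 12:
            return None
        tag, ftype, count, value = struct.unpack(endian + "HHI4s", entry)
        if tag == SOFTWARE_TAG and ftype == ASCII_TYPE:
            if count <= 4:
                raw = value[:count]  # short strings are stored inline
            else:
                (offset,) = struct.unpack(endian + "I", value)
                raw = data[offset:offset + count]
            return raw.split(b"\x00")[0].decode("ascii", errors="replace")
    return None


# Build a tiny synthetic TIFF header with one Software entry to try it out.
# The value "Scanner" is a made-up placeholder, not a real product string.
sample = (
    b"II" + struct.pack("<HI", 42, 8)       # header, first IFD at offset 8
    + struct.pack("<H", 1)                   # one directory entry
    + struct.pack("<HHII", 305, 2, 8, 26)    # Software, ASCII, 8 bytes at 26
    + struct.pack("<I", 0)                   # no further IFDs
    + b"Scanner\x00"
)
print(tiff_software(sample))
```

Run over a file store, a helper like this could group files by reported creator so that suspect batches are tested in several viewers; a missing or unfamiliar Software value says nothing either way about whether a file is standard.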