Friday, September 28, 2012

Family Archival Storage

One of the things my paternal grandmother left -- in addition to a marvelous music box -- was an overstuffed three-ring binder full of notes about the family tree.  I've finished scanning all 705 pages that she left.  Most of the pages are either pencil-on-paper or typed with a fabric ribbon; even the ones that are approaching 50 years old are in pretty good shape.  A few pictures are included that go back much farther, and they are also well preserved.  The old black-and-white photos are, in fact, in better shape than the few newer color photos.  Copies of the scanned pages are being distributed to several family members to reduce the risk of losing everything in a single fire, tornado, or other disaster here.

I used to have lunchtime discussions about archival storage, intentional or otherwise, with the woman who managed the company's technical library.  She maintained that we have a much better idea of what the common man thought and wrote about the US Civil War than people living 100 years from now will have about what we think today.  Her fundamental reason was that during the Civil War, people left their thoughts on media with an inherently long shelf life: silver-based black-and-white photography and pigment-based inks on relatively low-acid paper.  Letters or a diary written on such material can be tucked away in a trunk in the attic and still last for a long time.

Today, of course, we send e-mail and post stuff on Facebook and tweet.  Most of it never reaches paper, and all of it is subject to the vagaries of computers, both our own and those "in the cloud."  I have CD-Rs that are approaching 20 years old and seem to work fine on those occasions when I need something from one of them, but digital media are often an all-or-nothing proposition: they work perfectly until they fail, and once they fail they're unusable.  Then there are file system formats, and formats for the file contents themselves.  In 100 years, even if the bits on a CD-R are still good, will there be a drive that can read it, or software that understands the ISO 9660 file system, or applications that still handle JPEG image files?

Even if you assume conscientious descendants who periodically copy the material from the old physical medium to a new one, and transcode images from one format to another, there's a more subtle problem.  Serial encoding of images with lossy algorithms (e.g., JPEG as it is almost always used) results in steady deterioration.  The first encoding introduces small errors in the reconstructed image.  Many of these errors produce visible artifacts in the image if you know what you're looking for.  The image to the left illustrates ringing errors in an MRI image (in this case, visible "echoes" of the sharp dark/light boundary) [1].  The serial encoding problem occurs because a future encoding with a different algorithm will waste bits trying to accurately reproduce the artifacts, while adding new errors of its own.
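
The compounding effect can be seen in a toy model.  The sketch below is not JPEG: it assumes a made-up "codec" that simply snaps each sample to the nearest multiple of a step size, and the names quantize and max_error, along with the step sizes 7 and 10, are invented for illustration.  It compares encoding the original once against encoding it with one step size and then re-encoding the result with another, roughly what happens when an image is transcoded from one lossy format to a different one.  It shows only the accumulated error, not the wasted bits.

```python
def quantize(samples, step):
    """Toy lossy 'codec' (an assumption, not JPEG): snap each sample
    to the nearest multiple of step."""
    return [round(s / step) * step for s in samples]


def max_error(original, decoded):
    """Largest absolute difference between original and decoded samples."""
    return max(abs(o - d) for o, d in zip(original, decoded))


# Stand-in for the 8-bit pixel values of a scanned page.
original = list(range(256))

# One generation: encode the original directly with the "new format".
direct = quantize(original, 10)

# Two generations: encode with an "old format" first, then transcode.
serial = quantize(quantize(original, 7), 10)

direct_error = max_error(original, direct)
serial_error = max_error(original, serial)

print("worst-case error, encoded once: ", direct_error)
print("worst-case error, encoded twice:", serial_error)
```

In this toy model, re-encoding with the same step size changes nothing; the extra damage comes from moving between two different lossy schemes.  That suggests keeping the original scans in a lossless format and treating any lossy copies as disposable.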

I find myself searching for some sort of middle road, one that lets me take advantage of the ability of contemporary tech to impose indexing and organization on Grandma's work and additions to it, and at the same time lets both Grandma's stuff and any material I add survive something as extreme as skipping a generation along the way.

[1] Image taken from "An Introduction to the Fourier Transform," American Journal of Roentgenology.
