Monday, May 6, 2013

How can we preserve our digital history?


Several years ago, I was asked to sign a purchase order while my boss was out of town.  When I looked at it, it turned out to be for a mobile data center to give to the Internet Archive.  Brewster Kahle had recently come to visit us and had spoken quite eloquently about his work creating the Internet Archive.  It is a great idea/cause/resource.  I was happy to help support it then and I still believe in the vision of the Internet Archive.

For those of you who are unfamiliar with the Internet Archive, it is a repository of the web.  More and more of our records of events are being put online.  The problem is that, in this digital age, web sites come and go and content changes regularly.  There is no trail of paper documents.  There are no paper photographs.  There are only bits, and bits are volatile.  So what is the library for the internet?  Where do we find the equivalent of rare books?  How do we make sure that our history isn't erased over time as we update the web?  How do we make sure that we can still access old information as data formats change?  Paper has the advantage that it can be read by the human eye, but electronic media often lack this quality.  If you have any old VHS tapes or even 8mm films lying around, you know what I'm talking about.  Who is making sure that we don't lose our history, especially the latent events that may not seem important at the time but later prove to be critical?

The Internet Archive takes snapshots of as many websites as it can and makes them accessible so that we can look back in time.  It stores all the data it can grab on a vast array of disk drives, and you can browse that data through the Wayback Machine.  You can type in a URL for your favorite website and it will let you see what it looked like in the past.  The Internet is growing so fast that it is impossible to store everything, but the Archive is capturing as much as it possibly can.  I know it has a personal website of mine going back to 1996.  Pretty impressive.
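If you would rather jump straight to a snapshot than click through the calendar, the Wayback Machine's addresses follow a simple pattern: a timestamp followed by the original URL.  Here is a minimal Python sketch of that pattern.  The function names are my own for illustration, and the second function assumes the Archive's public availability endpoint, which reports the capture closest to a given date.  Nothing here touches the network; it only builds the addresses.

```python
from datetime import datetime
from urllib.parse import urlencode

# Address patterns used by the Wayback Machine (assumed stable; check the
# Internet Archive's own documentation before relying on them).
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def snapshot_url(site: str, when: datetime) -> str:
    """Build the address of the capture of `site` nearest to `when`.

    The timestamp is written as YYYYMMDDhhmmss. If no capture exists at
    that exact moment, the Wayback Machine redirects to the closest one.
    """
    stamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{site}"


def availability_query(site: str, when: datetime) -> str:
    """Build a query asking which capture of `site` is closest to `when`."""
    params = urlencode({"url": site, "timestamp": when.strftime("%Y%m%d")})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


if __name__ == "__main__":
    # Paste the printed address into a browser to look back in time.
    print(snapshot_url("whitehouse.gov", datetime(2000, 1, 1)))
    print(availability_query("whitehouse.gov", datetime(2000, 1, 1)))
```

Paste the first printed address into a browser and you should land on whatever capture is nearest to that date, if one exists.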

The video below gives you some idea of what is at stake, some of the challenges, and what the Internet Archive is doing about them.  It is also very impressive how much data they can fit in such a small space.


Do yourself a favor and go explore the Internet Archive for a little while today.  It is worthwhile.  Take a look at what Google looked like in the late 1990s, or how whitehouse.gov has changed over the years.  It will give you some idea of just how difficult this undertaking is.