There are plenty of good changes in the new whitehouse.gov site, such as a better copyright policy that more clearly permits copying and remixing, and a much shorter robots.txt file, which makes it easier for search engines and archivists to index and archive the site. (Compare the current 4-line Obama robots file to a 2,300+ line version from apparently late in the Bush era.)
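To make concrete how a robots.txt file can keep a crawler away from parts of a site, here is a minimal sketch using Python's standard `urllib.robotparser`. The sample rules and paths are hypothetical and are not taken from either administration's actual file, and the `ia_archiver` user-agent name is my assumption about how an archive crawler might identify itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only;
# NOT the real Bush-era or Obama-era file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /president/text/
Disallow: /news/releases/text/
"""


def blocked_paths(robots_text, user_agent, paths):
    """Return the subset of paths that the given robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Hypothetical paths an archive crawler might want to fetch.
paths = ["/", "/president/text/bio.html", "/news/releases/2008/"]

# "ia_archiver" is an assumed crawler name, used here only as an example.
print(blocked_paths(SAMPLE_ROBOTS, "ia_archiver", paths))
```

Every `Disallow` line is one more place a well-behaved crawler will skip, which is why a 2,300+ line file raises more questions about coverage than a 4-line one.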
But what about Bush’s old website? Shouldn’t that be preserved? (Well, yeah!) But when President Obama took the oath of office, things switched over and the Bush site was gone from public view. Did anybody keep a copy? Well, yes, kind of. The Internet Archive archives the whitehouse.gov site, but I have deep concerns about the completeness of its archive. See below for a screen cap of the Internet Archive’s database of http://www.whitehouse.gov.
I think it can be taken as an axiom that in a free society, it’s vital that governmental sites be archived frequently, deeply, and accurately, and that the archives be made available for scrutiny quickly. But the depth of the Internet Archive’s archive of whitehouse.gov is unclear. First, to the extent that the Bush administration’s robots.txt file told search engines and archives to stay away, did the Internet Archive fail to archive governmental content? (Maybe not, but how can we be sure?) Second, the Internet Archive is not up to date: as of this writing, the most recent public archive of whitehouse.gov is dated Mar. 25, 2008. Finally, and even more disturbingly, the Internet Archive’s capture frequency is poor. It contains only 53 captures of the main whitehouse.gov page for 2007 (roughly one a week), and only 15 have yet been posted from calendar year 2008. We can do better.
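For anyone who wants to repeat the frequency check, the tally is simple once you have a list of capture timestamps. The sketch below assumes the 14-digit `YYYYMMDDhhmmss` form that appears in Wayback Machine capture URLs; the sample timestamps are made up for illustration and are not real captures of whitehouse.gov.

```python
from collections import Counter


def captures_per_year(timestamps):
    """Count captures per calendar year from YYYYMMDDhhmmss strings."""
    # The first four characters of each timestamp are the year.
    return dict(Counter(int(ts[:4]) for ts in timestamps))


# Made-up timestamps, for illustration only.
sample = ["20070103120000", "20071115083000", "20080325000000"]

for year, count in sorted(captures_per_year(sample).items()):
    # Average spacing if captures were spread evenly over the year.
    print(year, count, "captures, about one every", round(365 / count), "days")
```

Run against a real list of timestamps, the same tally would show at a glance which years are thinly covered.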
Interestingly, it appears that government archivists are now dipping their toes in the water. At least part of the Bush website as it stood in 2009 is now being hosted by the National Archives and Records Administration (“NARA”), which administers the George W. Bush Presidential Library. According to the site:
To preserve the historical record of the George W. Bush administration’s presence on the web, the White House took a “snapshot” of the Whitehouse.gov web site. This is historical material, “frozen in time.” The web site is no longer updated and links to external web sites and some internal pages will not work.
Having NARA archivists maintain an archive is a good start. (Though there should always be archives maintained by disinterested third parties as well.) But it’s not enough to have a “snapshot” of a presidential website. Not only does the archive lack temporal depth (it covers only materials existing in January 2009), but it appears to be incomplete as well, since the site itself admits that some internal links do not work. Plus, as the site indicates, the “White House” took the snapshot. I take this to mean that it was taken by interested White House insiders rather than by (hopefully) disinterested professional archivists at NARA.
H/T on Bush Archive to BushLegacy via Twitter.