Obama’s Change.gov promise to protect whistleblowers? Scrubbed from the Web

Well, this pissed me off. Long-time readers of this site may recall my interest in the Internet Archive’s Wayback Machine, which aims to preserve the historical web. I’ve previously written to criticize the Bush administration for its lengthy robots.txt exclusion file (thousands of lines long), which could be viewed as an attempt to prevent the Wayback Machine and others from archiving portions of the White House website. I also wrote to compliment the new Obama White House website for its much shorter, and much more archive-friendly, robots file.

But now the Obama administration is scrubbing the web, too. John Wonderlich at the Sunlight Foundation reports that materials from Obama’s old transition website at Change.gov have recently been deleted. Although the main page has referred users for a while to the Whitehouse.gov site, internal pages regarding his agenda were still online, and “until recently, you could still continue on to see the materials and agenda laid out by the administration.”

So why the change? Wonderlich speculates, and I think he is 100% correct, that the internal Change.gov pages were removed because of now-broken promises in the transition team’s “Obama-Biden Plan” to protect whistleblowers. Given the administration’s consistently aggressive prosecution of whistleblowers such as Edward Snowden, it likely decided to scrub promises from the transition period that had become inconvenient.

But in an era of permanent digital records (hello, NSA and its yottabytes of storage in Utah!), how can the Obama administration be so naïve as to think that somebody wouldn’t: 1) notice the missing pages; 2) find the old site; and 3) point it out? As a prosecutor might say, destroying evidence may be proof of a guilty conscience. The administration’s naïveté is positively striking, considering that Obama’s people are widely touted as being extremely tech-savvy.

See for yourself. In an Internet Archive capture of the Change.gov site from June 7, 2013 (barely a month ago), a page on ethics (!) in the Obama-Biden Plan promised to protect whistleblowers:

Protect Whistleblowers: Often the best source of information about waste, fraud, and abuse in government is an existing government employee committed to public integrity and willing to speak out. Such acts of courage and patriotism, which can sometimes save lives and often save taxpayer dollars, should be encouraged rather than stifled. We need to empower federal employees as watchdogs of wrongdoing and partners in performance. Barack Obama will strengthen whistleblower laws to protect federal workers who expose waste, fraud, and abuse of authority in government. Obama will ensure that federal agencies expedite the process for reviewing whistleblower claims and whistleblowers have full access to courts and due process.

Here’s a screen cap. According to the Wayback Machine, this was still online as recently as June 7:

[Screen capture: the archived Change.gov page]

Post-Snowden, this is what you see today:

[Screen capture: the current Change.gov page]

The difference? No doubt it’s the Snowden affair, which broke in early June. A Google search of Change.gov for “whistleblowers” conducted today (screen cap here) shows no hits, so the page apparently has not been moved to another URL on the site. It simply seems to be gone.
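For readers who want to check the archive themselves, here is a minimal Python sketch of how a Wayback Machine lookup for a given date can be put together. It only builds the URLs and makes no network requests. The function names are my own, and because this post does not give the exact path of the ethics page, the example uses the Change.gov site root as a stand-in. As far as I know, the Wayback Machine redirects a date-only timestamp to the nearest capture it holds.

```python
from datetime import date
from urllib.parse import urlencode


def wayback_capture_url(page_url: str, when: date) -> str:
    """Build a Wayback Machine URL for the capture nearest to a date.

    Hypothetical helper: the timestamp is written as YYYYMMDD, and the
    archive is assumed to redirect to the closest capture it has.
    """
    return f"https://web.archive.org/web/{when:%Y%m%d}/{page_url}"


def availability_query_url(page_url: str, when: date) -> str:
    """Build a query URL for the Internet Archive availability endpoint.

    Hypothetical helper: it only constructs the URL; fetching it and
    reading the JSON response is left to the caller.
    """
    query = urlencode({"url": page_url, "timestamp": f"{when:%Y%m%d}"})
    return f"https://archive.org/wayback/available?{query}"


if __name__ == "__main__":
    capture_date = date(2013, 6, 7)  # the capture date cited in this post
    print(wayback_capture_url("http://change.gov/", capture_date))
    print(availability_query_url("change.gov", capture_date))
```

Pasting the first printed URL into a browser should land on the capture closest to June 7, 2013, which can then be compared with what the live site shows today.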

Even more disturbingly, this may reflect a broader trend of digital scrubbing. Wonderlich notes that this is not the first time that Obama administration documents have disappeared from the internet. An earlier posting of his includes a letter the Sunlight Foundation and others sent to the Department of Labor criticizing the administration for removing materials. As the letter states, “No major administration decision should be accompanied by related materials disappearance from public view.”

HT Animal. Cross-posted to Infoglut Tumblr.

Major expansion of Wayback Machine’s archive of the historical internet

The Next Web reports that the Internet Archive has vastly increased its historical database of the web:

The Internet Archive has updated its Wayback Machine with a significant bump in coverage: the service has gone from 150,000,000,000 URLs to having 240,000,000,000 URLs, a total of about 5 petabytes of data. More specifically, the Wayback Machine now covers the Web from late 1996 to December 9, 2012.

Cross-posted to Infoglut Tumblr.

Social networking word-of-the-day: “thinvisibility”

A new word for Facebookers and social networkers who cavalierly post embarrassing information about themselves to the web: thinvisibility.  Here’s a starting definition:

Thinvisibility: n.

  1. Being neither completely visible nor completely invisible.
  2. Being a tiny, shiny needle in a haystack of information overload.
  3. Being invisible to everyone except data aggregators and digital preservationists such as Google, the Wayback Machine, the NSA, and others.
  4. Being invisible to employers, colleges, police, neighbors, friends, exes, stalkers, acquaintances, and others, who are not interested in you, until they are.
  5. Being visible.

President Obama and White House robots

I’ve written before about the Bush administration’s use of the robots exclusion standard to tell search engines and web archives to stay away from chunks of the Whitehouse.gov website.  As of last November, the robots.txt file was nearly 2300 lines long.  (Here’s a copy of it.)  In my opinion, such materials should be freely archivable for researchers and historians.  It’s appalling that search engines or the Internet Archive would be asked to stay away.

President Obama has been in office for barely more than an hour, and the Whitehouse.gov website has already been updated.  And the robots.txt file?  Not 2000 lines long.  It’s only two lines long:

User-agent: *
Disallow: /includes/

Much better.
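To see what those two lines actually permit, here is a small sketch using Python's standard-library `urllib.robotparser`. The `can_archive` helper, the example paths, and the `ia_archiver` user-agent string are my own assumptions for illustration. Because the rule applies to `*`, any crawler name gives the same answer.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /includes/
"""


def can_archive(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url.

    Hypothetical helper: parses the rules from a string instead of
    downloading them, so it runs without network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Example paths are made up; only /includes/ comes from the file itself.
    for path in ("/", "/briefing-room/", "/includes/header.html"):
        url = "https://www.whitehouse.gov" + path
        print(path, can_archive(ROBOTS_TXT, "ia_archiver", url))
```

Everything outside `/includes/` is open to crawlers, which is the point: a well-behaved archive spider is asked to skip one directory rather than thousands of paths.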

UPDATE:  The EFF has a posting on presidential orders and memoranda regarding transparency and openness in government.  H/T to BoingBoing via Twitter.

A presidential “legacy” via rewritten history

Web archiving is a topic of great interest to me and the subject of an article I’m writing.  Part of the paper addresses the Bush administration’s questionable conduct regarding the content of the White House website.  For example, the White House website’s robots exclusion file — a mechanism that can be used to ask search engine and web archive spiders to stay away — is nearly 2300 lines long.  2300 lines?  Simply absurd.  (Click here for a copy of the White House robots file that I downloaded on Nov. 25, 2008.)

Today, researchers at the University of Illinois released a study showing how the White House has deleted or modified portions of its website.  Their findings are, sadly, unsurprising:

Legacies are in the air as President Bush prepares to leave the White House. How future historians will judge the president remains to be seen, but one thing is certain: future historians won’t have all the facts needed to make that judgment. One legacy at risk of being forgotten is the way the Bush White House has quietly deleted or modified key documents in the public record that are maintained under its direct control.

Remember the “Coalition of the Willing” that sided with the United States during the 2003 invasion of Iraq? If you search the White House web site today you’ll find a press release dated March 27, 2003 listing 49 countries forming the coalition. A key piece of evidence in the historical record, but also a troubling one. It is an impostor.

And although there were only 45 coalition members on the eve of the Iraq invasion, later deletions and revisions to key documents make it seem that there were always 49.

The study is a disturbing read.  Rightly or not, a primary source of history for many researchers is the web.  And any effort by the government to modify or delete historical records is appalling.  As the authors note:

Updating lists to keep up with the times is one thing. Deleting original documents from the White House archives is another. Back-dating later documents and using them to replace the originals goes beyond irresponsible stewardship of the public record. It is rewriting history.

H/T: New York Times.