<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>nathenson&#039;s digital garbage &#187; Digital Preservation</title>
	<atom:link href="http://digitalgarbage.net/category/digital-preservation/feed/" rel="self" type="application/rss+xml" />
	<link>http://digitalgarbage.net</link>
	<description>dumpster-diving for bits about law, info, tech, and culture</description>
	<lastBuildDate>Wed, 16 Nov 2011 05:00:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Washington Declaration on Intellectual Property and the Public Interest</title>
		<link>http://digitalgarbage.net/2011/09/06/washington-declaration-on-intellectual-property-and-the-public-interest/</link>
		<comments>http://digitalgarbage.net/2011/09/06/washington-declaration-on-intellectual-property-and-the-public-interest/#comments</comments>
		<pubDate>Wed, 07 Sep 2011 01:24:52 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Digital Preservation]]></category>
		<category><![CDATA[Fair Use]]></category>
		<category><![CDATA[Information]]></category>
		<category><![CDATA[Intellectual Property]]></category>
		<category><![CDATA[Law]]></category>
		<category><![CDATA[Law Professors]]></category>
		<category><![CDATA[Libraries]]></category>
		<category><![CDATA[Patents]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[ACTA]]></category>
		<category><![CDATA[Enforcement]]></category>
		<category><![CDATA[Global Congress]]></category>
		<category><![CDATA[Participation]]></category>
		<category><![CDATA[Transparency]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=2335</guid>
		<description><![CDATA[Despite the slings and arrows of Hurricane Irene hitting Washington a week ago, the recent Global Congress on Intellectual Property Law and the Public Interest has produced an important document calling for more transparency and public participation in the crafting of &#8230; <a href="http://digitalgarbage.net/2011/09/06/washington-declaration-on-intellectual-property-and-the-public-interest/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Despite the slings and arrows of <a href="http://digitalgarbage.net/2011/08/27/the-earth-and-hurricane-irene/">Hurricane Irene</a> hitting Washington a week ago, the recent <a href="http://infojustice.org/public-events/global-congress">Global Congress on Intellectual Property Law and the Public Interest</a> has produced an important document calling for more transparency and public participation in the crafting of IP law. The <a href="http://infojustice.org/archives/5406">Washington Declaration on Intellectual Property and the Public Interest</a> is an important step in the fight for the public interest and against governments that have been co-opted by copyright and patent owners. Truly a global effort, the Global Congress included over 180 experts from 35 countries on six continents and was held (during Irene!) at American University Washington College of Law.</p>
<p>As argued in my recent article on <a href="http://ssrn.com/abstract=1699429">private copyright enforcement and feedback loops</a>, a deficit of transparency and public participation in private copyright enforcement has fostered gross overreach by copyright owners. A recent example of copyright overreach is the so-called Anti-Counterfeiting Trade Agreement, which was negotiated secretly and addresses far more than mere “counterfeiting.” (See <a href="http://www.wcl.american.edu/pijip/download.cfm?downloadfile=83CE3453-EFC7-45B0-7CBA50D842A84563&amp;typename=dmFile&amp;fieldname=filename">here</a> for a law professors’ letter I’ve signed against ACTA.)</p>
<p>It’s good to see such concerns echoed in the Congress’ just-released Declaration. For example:</p>
<blockquote><p>International intellectual property policy making should be conducted through mechanisms of transparency and openness that encourage broad public participation. New rules should be made within the existing forums responsible for intellectual property policy, where both developed and developing countries have full representation, and where the texts of and forums for considering proposals are open. All new international intellectual property standards must be subject to democratic checks and balances, including domestic legislative approval and opportunities for judicial review.</p></blockquote>
<p>Along similar lines, the Declaration takes excessive IP enforcement to task, noting that “Government and private IP enforcement are commandeering greater social resources in order to impose stricter penalties than ever before, with fewer safeguards and less procedural fairness.” The Declaration contains many other important ideas, such as making sure that new IP protections are rooted in transparent research that demonstrates the need for new IP rights and that accounts for the fact that fair uses and other IP limitations also generate economic value. It also addresses the importance of libraries and archives, strengthening IP exceptions, rejuvenating notice-based formalities, and much more.</p>
<p>I&#8217;d go on, but instead you should read the full document at <a href="http://infojustice.org/washington-declaration">http://infojustice.org/washington-declaration</a>. Even better, sign it. (I did: I’m #95.)</p>
<p><iframe src="http://docs.google.com/viewer?url=http%3A%2F%2Finfojustice.org%2Fwp-content%2Fuploads%2F2011%2F09%2FWashington-Declaration.pdf&amp;hl=en_US&amp;embedded=true" frameborder="0" width="100%" height="800"></iframe></p>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2011/09/06/washington-declaration-on-intellectual-property-and-the-public-interest/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Social networking word-of-the-day: &#8220;thinvisibility&#8221;</title>
		<link>http://digitalgarbage.net/2010/08/10/thinvisibility/</link>
		<comments>http://digitalgarbage.net/2010/08/10/thinvisibility/#comments</comments>
		<pubDate>Tue, 10 Aug 2010 09:20:06 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Data Retention]]></category>
		<category><![CDATA[Digital Preservation]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[Information]]></category>
		<category><![CDATA[Internet Archive]]></category>
		<category><![CDATA[Language]]></category>
		<category><![CDATA[Reputation]]></category>
		<category><![CDATA[Social Networking]]></category>
		<category><![CDATA[Surveillance]]></category>
		<category><![CDATA[Wayback Machine]]></category>
		<category><![CDATA[Web 2.0]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Infoglut]]></category>
		<category><![CDATA[MySpace]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=229</guid>
		<description><![CDATA[A new word for Facebookers and social networkers who cavalierly post embarrassing information about themselves to the web: thinvisibility. Here&#8217;s a starting definition: Thinvisibility: n. Being neither completely visible nor completely invisible. Being a tiny, shiny needle in a haystack of &#8230; <a href="http://digitalgarbage.net/2010/08/10/thinvisibility/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A new word for Facebookers and social networkers who cavalierly post embarrassing information about themselves to the web: <em>thinvisibility</em>. Here&#8217;s a starting definition:</p>
<p><em>Thinvisibility</em>: <em>n.</em></p>
<ol>
<li>Being neither completely visible nor completely invisible.</li>
<li>Being a tiny, shiny needle in a haystack of information overload.</li>
<li>Being invisible to everyone except data aggregators and digital preservationists such as Google, the Wayback Machine, the NSA, and others.</li>
<li>Being invisible to employers, colleges, police, neighbors, friends, exes, stalkers, acquaintances, and others, who are not interested in you, until they are.</li>
<li>Being visible.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2010/08/10/thinvisibility/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>1934: Building a brick &amp; mortar archive</title>
		<link>http://digitalgarbage.net/2010/03/07/1934-building-a-brick-mortar-archive/</link>
		<comments>http://digitalgarbage.net/2010/03/07/1934-building-a-brick-mortar-archive/#comments</comments>
		<pubDate>Sun, 07 Mar 2010 07:19:51 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Digital Preservation]]></category>
		<category><![CDATA[Excerpts]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[Archiving]]></category>
		<category><![CDATA[Construction]]></category>
		<category><![CDATA[National Archives]]></category>
		<category><![CDATA[Photos]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=1564</guid>
		<description><![CDATA[The logo I&#8217;ve always used for the site is an image of the National Archives Building.  Amazingly, Congress did not approve such a building until 1926.  The architect was John Russell Pope, who also designed the Jefferson Memorial and the &#8230; <a href="http://digitalgarbage.net/2010/03/07/1934-building-a-brick-mortar-archive/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The logo I&#8217;ve always used for the site is an image of the National Archives Building.  Amazingly, Congress did not approve such a building <a href="http://www.archives.gov/about/history/building-an-archives/building.html">until 1926</a>.  The architect was <a href="http://www.archives.gov/about/history/building-an-archives/pope.html">John Russell Pope</a>, who also designed the Jefferson Memorial and the National Gallery of Art.  Ground was broken in 1931 and the building was mostly completed by 1935.  According to the <a href="http://www.archives.gov/about/history/building-an-archives/building.html">online history</a>:</p>
<blockquote><p>By the time President Herbert Hoover laid the cornerstone of the building in February 1933, significant problems had arisen. Because the massive structure was to be constructed above an underground stream, 8,575 piles had been driven into the unstable soil, before pouring a huge concrete bowl as a foundation. Another difficulty arose over the choice of building materials. Both limestone and granite were authorized as acceptable, but construction began during the darkest days of the Great Depression, and suppliers of each material lobbied fiercely to have the government use their stone. Ultimately, as in the other Federal Triangle buildings, limestone was used for the exterior superstructure and granite for the base.</p></blockquote>
<p>Here are some <a href="http://www.archives.gov/about/history/building-an-archives/construction.html">construction</a> <a href="http://www.archives.gov/calendar/images/national-archives-bldg-1934-large.jpg">images</a>:</p>
<div id="attachment_1567" class="wp-caption alignnone" style="width: 310px"><a href="http://digitalgarbage.net/wp-content/uploads/2010/03/nara_const_1.gif"><img class="size-medium wp-image-1567 " title="September 30, 1932" src="http://digitalgarbage.net/wp-content/uploads/2010/03/nara_const_1-300x228.gif" alt="" width="300" height="228" /></a><p class="wp-caption-text">September 30, 1932</p></div>
<p><span id="more-1564"></span></p>
<div id="attachment_1568" class="wp-caption alignnone" style="width: 310px"><a href="http://digitalgarbage.net/wp-content/uploads/2010/03/nara_const_2.gif"><img class="size-medium wp-image-1568" title="September 5, 1933" src="http://digitalgarbage.net/wp-content/uploads/2010/03/nara_const_2-300x232.gif" alt="" width="300" height="232" /></a><p class="wp-caption-text">September 5, 1933</p></div>
<div id="attachment_1569" class="wp-caption alignnone" style="width: 310px"><a href="http://digitalgarbage.net/wp-content/uploads/2010/03/nara_const_3.gif"><img class="size-medium wp-image-1569" title="December 4, 1933" src="http://digitalgarbage.net/wp-content/uploads/2010/03/nara_const_3-300x231.gif" alt="" width="300" height="231" /></a><p class="wp-caption-text">December 4, 1933</p></div>
<div id="attachment_1570" class="wp-caption alignnone" style="width: 310px"><a href="http://digitalgarbage.net/wp-content/uploads/2010/03/nara_const_4.gif"><img class="size-medium wp-image-1570" title="October 1, 1934" src="http://digitalgarbage.net/wp-content/uploads/2010/03/nara_const_4-300x228.gif" alt="" width="300" height="228" /></a><p class="wp-caption-text">October 1, 1934</p></div>
<p>As the National Archives notes, not only was the structure built in a troublesome location, but archives have special needs, further complicating the construction:</p>
<blockquote><p>Constructing the National Archives was a mammoth task. Not only was the building the most ornate structure on the Federal Triangle, but it also called for installation of specialized air-handling systems and filters, reinforced flooring, and thousands of feet of shelving to meet the building&#8217;s archival storage requirements. The building&#8217;s exterior took more than 4 years to finish and required a host of workers ranging from sculptors and model makers to air-conditioning contractors and structural-steel workers.</p></blockquote>
<p>Here&#8217;s a <a href="http://www.archives.gov/calendar/images/national-archives-bldg-1934-large.jpg">shot</a> similar to the one used for the site logo.</p>
<div id="attachment_1571" class="wp-caption alignnone" style="width: 541px"><a href="http://digitalgarbage.net/wp-content/uploads/2010/03/national-archives-bldg-1934-large.jpg"><img class="size-full wp-image-1571 " title="national-archives-bldg-1934-large" src="http://digitalgarbage.net/wp-content/uploads/2010/03/national-archives-bldg-1934-large.jpg" alt="" width="531" height="420" /></a><p class="wp-caption-text">June 1, 1934</p></div>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2010/03/07/1934-building-a-brick-mortar-archive/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The hindsight of archives: “Popular Science” &amp; incorrect technology predictions</title>
		<link>http://digitalgarbage.net/2010/03/05/the-hindsight-of-archives-popular-science-incorrect-technology-predictions/</link>
		<comments>http://digitalgarbage.net/2010/03/05/the-hindsight-of-archives-popular-science-incorrect-technology-predictions/#comments</comments>
		<pubDate>Fri, 05 Mar 2010 18:44:23 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Digital Preservation]]></category>
		<category><![CDATA[Excerpts]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[History]]></category>
		<category><![CDATA[Popular Science]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=1529</guid>
		<description><![CDATA[Sci-fi and tech site IO9.com reports that Popular Science Magazine is now making its archives available online dating back to 1872.  The archives can be searched either at the magazine&#8217;s website or via Google Books.  In the archive, I was &#8230; <a href="http://digitalgarbage.net/2010/03/05/the-hindsight-of-archives-popular-science-incorrect-technology-predictions/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Sci-fi and tech site IO9.com <a href="http://io9.com/5486311/read-the-archives-of-popular-science-back-to-1872/gallery/">reports</a> that Popular Science Magazine is now making its archives available online dating back to 1872.  The archives can be searched either at the magazine&#8217;s <a href="http://www.popsci.com/archive-viewer">website</a> or via <a href="http://books.google.com/books/serial/ISSN:01617370?rview=1">Google Books</a>.  In the archive, I was able to quickly find articles of historical interest, each showing a technological prediction that didn&#8217;t pan out.  Of course, for technology, such history can be shockingly recent.</p>
<p>Here&#8217;s a failed prediction of <em>success</em>.  1980s computer buffs may remember the venerable <a href="http://www.flickr.com/photos/nathenson/galleries/72157623559784616/">Amiga</a>.  A 1985 <a href="http://books.google.com/books?id=oQAAAAAAMBAJ&amp;lpg=PP1&amp;pg=PA89#v=onepage&amp;q=&amp;f=false">article</a> describes the $1295 machine in glowing terms.  Even with <em>only 256 kilobytes</em> of memory, the machine could run a Mac-like operating system, and with an emulator, also PC programs like Lotus 1-2-3 (a popular pre-Excel spreadsheet).  An Amiga representative predicted that Amiga would become &#8220;the new standard for home- and small-business computer needs.&#8221;  Needless to say, this prediction did not become reality, and the Amiga never became a widely used platform, instead outgunned and outnumbered by the less-powerful Macs and PCs of the era.</p>
<p>Here&#8217;s a failed prediction of <em>failure</em>, and a good reality check on how far we&#8217;ve come.  A 1995 <a href="http://books.google.com/books?id=KrfIjdl-EMwC&amp;lpg=PP1&amp;pg=PA78#v=onepage&amp;q=&amp;f=false">article</a> discusses the emerging use of the Internet:</p>
<blockquote><p>Set aside for a moment the hype about what the Internet represents (“the assembly line of the electronic era”), what it could become (“the bedrock of the information superhighway”), or what it might turn us into (“a global community of data-seeking homebodies”).  Instead, let’s take stock of what it is.  This worldwide computer network you hear and read so much about is today little more than a high-tech candy dispenser for the eyes, ears, and mind.  <strong>It is fuzzy satellite weather maps, canned audio clips from the President, unfettered access to obscure college journals, and very likely, not one damn thing that will make a lasting difference in how you work, play, or live.</strong></p></blockquote>
<p>In fairness to the author, much of what he said was true in 1995.  He understandably bemoans the &#8220;impractical&#8221; nature of the web of its time, noting that &#8220;you can&#8217;t stop and make plane or hotel reservations&#8221; online.  But to be sure, the web very quickly made, and continues to make, a transforming difference in our lives.  But enough for now.  I have to pull up Expedia to get some plane tickets before getting back to the work I&#8217;m doing from home over Spring Break.  Later on, maybe I&#8217;ll order some coffee from Amazon, or watch some Hulu.  Or better yet, maybe I &#8212; a &#8220;data-seeking homebody&#8221; &#8212; should unplug and walk the dog, who couldn&#8217;t care less about computers and archives.</p>
<p><a title="Sleepy sepia golden retriever by Ira Nathenson, on Flickr" href="http://www.flickr.com/photos/nathenson/4399778546/"><img src="http://farm5.static.flickr.com/4059/4399778546_739b313a3c.jpg" alt="Sleepy sepia golden retriever" width="334" height="500" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2010/03/05/the-hindsight-of-archives-popular-science-incorrect-technology-predictions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>&#8220;Here today, gone tomorrow&#8221; &#8211; Pogue on data rot</title>
		<link>http://digitalgarbage.net/2009/03/27/here-today-gone-tomorrow-pogue-on-data-rot/</link>
		<comments>http://digitalgarbage.net/2009/03/27/here-today-gone-tomorrow-pogue-on-data-rot/#comments</comments>
		<pubDate>Sat, 28 Mar 2009 02:24:04 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Digital Preservation]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=1198</guid>
		<description><![CDATA[Below is an excellent news report on &#8220;data rot&#8221; by New York Times technology writer David Pogue, noting the importance of vigilant migration of files, photos, and other data to keep ahead of degrading media and ever-shifting data formats.  Ironically, &#8230; <a href="http://digitalgarbage.net/2009/03/27/here-today-gone-tomorrow-pogue-on-data-rot/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Below is an excellent news report on &#8220;data rot&#8221; by New York Times technology writer David Pogue, noting the importance of vigilant migration of files, photos, and other data to keep ahead of degrading media and ever-shifting data formats.  Ironically, a 400-year-old book is likely to outlive a 10-year-old DVD.  Additional info from Pogue <a href="http://www.nytimes.com/2009/03/26/technology/personaltech/26pogue-email.html">here</a>.</p>
<p><object width="425" height="324" data="http://www.cbs.com/thunder/swf30can10cbsnews/rcpHolderCbs-3-4x3.swf" type="application/x-shockwave-flash"><param name="flashvars" value="link=http%3A%2F%2Fwww%2Ecbsnews%2Ecom%2Fvideo%2Fwatch%2F%3Fid%3D4836762n%253fsource%3Dsearch%5Fvideo&amp;partner=news&amp;vert=News&amp;autoPlayVid=false&amp;releaseURL=http://release.theplatform.com/content.select?pid=L_QupwByFtIBStQKP_RCvwJ9vdTK9EHd&amp;name=cbsPlayer&amp;allowScriptAccess=always&amp;wmode=transparent&amp;embedded=y&amp;scale=noscale&amp;rv=n&amp;salign=tl" /><param name="src" value="http://www.cbs.com/thunder/swf30can10cbsnews/rcpHolderCbs-3-4x3.swf" /><param name="allowfullscreen" value="true" /></object><br />
<a href="http://www.cbs.com">Watch CBS Videos Online</a></p>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2009/03/27/here-today-gone-tomorrow-pogue-on-data-rot/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NARA hosting &#8220;lite&#8221; Bush website archive</title>
		<link>http://digitalgarbage.net/2009/02/16/nara-bush-archive/</link>
		<comments>http://digitalgarbage.net/2009/02/16/nara-bush-archive/#comments</comments>
		<pubDate>Mon, 16 Feb 2009 05:02:08 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Digital Preservation]]></category>
		<category><![CDATA[Wayback Machine]]></category>
		<category><![CDATA[NARA]]></category>
		<category><![CDATA[Robots.txt]]></category>
		<category><![CDATA[White House]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=1088</guid>
		<description><![CDATA[There are plenty of good changes in the new whitehouse.gov site, such as a better copyright policy that enables clearer copying and remix, and a much shorter robots.txt file, which makes it easier for search engines and archivists to index &#8230; <a href="http://digitalgarbage.net/2009/02/16/nara-bush-archive/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>There are plenty of good changes in the new <a href="http://www.whitehouse.gov/">whitehouse.gov</a> site, such as a better <a href="http://www.whitehouse.gov/copyright/">copyright policy</a> that more clearly permits copying and remix, and a much shorter robots.txt file, which makes it easier for search engines and archivists to index and archive the site.  (Compare the <a href="http://www.whitehouse.gov/robots.txt">current</a> 4-line Obama robots file to a 2300+ line <a href="http://georgewbush-whitehouse.archives.gov/robots.txt">version</a> from apparently late in the Bush era.)</p>
<p>But what about Bush&#8217;s old website?  Shouldn&#8217;t that be preserved?  (Well, yeah!)  But when President Obama took the oath of office, things switched over and the Bush site was gone from public view.  Did anybody keep a copy?  Well, yes, kind of.  The Internet Archive archives the whitehouse.gov site, but I have deep concerns about the completeness of its archive.  See below for a screen cap of the Internet Archive&#8217;s <a href="http://web.archive.org/web/*sa_/http://www.whitehouse.gov">database</a> of http://www.whitehouse.gov.</p>
<p><a href="http://digitalgarbage.net/wp-content/uploads/2009/02/wh-ia-2.jpg"><img class="aligncenter size-medium wp-image-1106" title="Internet Archive captures of whitehouse.gov" src="http://digitalgarbage.net/wp-content/uploads/2009/02/wh-ia-2-300x240.jpg" alt="Internet Archive captures of whitehouse.gov" width="300" height="240" /></a></p>
<p>I think it can be taken as an axiom that in a free society, it&#8217;s vital that governmental sites are archived frequently, deeply, accurately, and made available for scrutiny quickly.  But the depth of the Internet Archive&#8217;s archive of whitehouse.gov is unclear.  First, to the extent that the Bush administration&#8217;s robots.txt file told search engines and archives to stay away, did the Internet Archive fail to archive governmental content?  (Maybe not, but how can we be sure?)  Second, the Internet Archive is not up-to-date: as of this writing, the most recent public archive of whitehouse.gov is dated Mar. 25, 2008.  Finally and even more disturbingly, the Internet Archive&#8217;s frequency is poor.  It contains only 53 captures of the main whitehouse.gov page for 2007, and only 15 have yet been posted from calendar year 2008.  We can do better.</p>
<p>Interestingly, it appears that government archivists are now dipping their feet in the water.  At least part of the legacy Bush 2009 website is now being hosted by the National Archives and Records Administration (&#8220;NARA&#8221;), which administers the <a href="http://www.georgewbushlibrary.gov/">George W. Bush Presidential Library</a>.  According to the <a href="http://www.georgewbushlibrary.gov/white-house/">site</a>:</p>
<blockquote><p>To preserve the historical record of the George W. Bush administration&#8217;s presence on the web, the White House took a &#8220;snapshot&#8221; of the Whitehouse.gov web site. This is historical material, &#8220;frozen in time.&#8221; The web site is no longer updated and links to external web sites and some internal pages will not work.</p></blockquote>
<p>Having NARA archivists maintain an archive is a good start.  (Though there should always be archives maintained by disinterested third parties as well.)  But it&#8217;s not enough to have a &#8220;snapshot&#8221; of a presidential website.  Not only does the archive lack temporal depth (it&#8217;s only from materials existing in January 2009), but it appears to be incomplete as well, as even some internal links are admitted not to function.  Plus, as the site indicates, the &#8220;White House&#8221; took the snapshot.  I take this to mean that it was taken by interested White House insiders rather than by (hopefully) disinterested professional archivists at NARA.</p>
<p>H/T on Bush Archive to <a href="http://twitter.com/BushLegacy/status/1213166967">BushLegacy</a> via Twitter.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2009/02/16/nara-bush-archive/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>President Obama and White House robots</title>
		<link>http://digitalgarbage.net/2009/01/20/president-obama-and-white-house-robots/</link>
		<comments>http://digitalgarbage.net/2009/01/20/president-obama-and-white-house-robots/#comments</comments>
		<pubDate>Tue, 20 Jan 2009 18:47:11 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Digital Preservation]]></category>
		<category><![CDATA[Internet Archive]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=894</guid>
		<description><![CDATA[I&#8217;ve written before about the Bush administration&#8217;s use of the robots exclusion standard to tell search engines and web archives to stay away from chunks of the Whitehouse.gov website.  As of last November, the robots.txt file was nearly 2300 lines &#8230; <a href="http://digitalgarbage.net/2009/01/20/president-obama-and-white-house-robots/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve <a href="http://digitalgarbage.net/2008/11/25/presidential-legacy/">written before</a> about the Bush administration&#8217;s use of the robots exclusion standard to tell search engines and web archives to stay away from chunks of the <a href="http://www.whitehouse.gov">Whitehouse.gov</a> website.  As of last November, the robots.txt file was nearly 2300 lines long.  (Here&#8217;s a <a href="http://digitalgarbage.net/wp-content/uploads/2008/11/wh-robots-2008-11-25.txt">copy</a> of it.)  In my opinion, such materials should be freely archivable for researchers and historians.  It&#8217;s appalling that search engines or the Internet Archive would be asked to stay away.</p>
<p>President Obama has been in office for barely more than an hour, and the <a href="http://www.whitehouse.gov">Whitehouse.gov</a> website has already been updated.  And the <a href="http://www.whitehouse.gov/robots.txt">robots.txt</a> file?  Not 2000 lines long.  It&#8217;s only two lines long:</p>
<blockquote><p>User-agent: *<br />
Disallow: /includes/</p></blockquote>
<p>Much better.</p>
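<p>As a rough sketch only (not anything the White House or any real crawler is known to run), Python&#8217;s standard <code>urllib.robotparser</code> module can show how a well-behaved spider would read those two lines. The bot name <code>ExampleArchiveBot</code> and the sample paths are made up for illustration.</p>

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt quoted above.
ROBOTS_TXT = """User-agent: *
Disallow: /includes/
"""


def allowed(path, agent="ExampleArchiveBot"):
    """Return True if the (hypothetical) agent may fetch `path`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://www.whitehouse.gov" + path)


if __name__ == "__main__":
    # Illustrative paths only; they are not taken from the real site.
    for path in ("/", "/briefing-room/", "/includes/header.html"):
        print(path, allowed(path))
```

<p>Under these assumptions, only paths beneath <code>/includes/</code> are off limits to a compliant spider; everything else on the site stays open to indexing and archiving.</p>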
<p>UPDATE:  The EFF has a <a href="http://www.eff.org/deeplinks/2009/01/on-day-one-obama-demands-open-government">posting</a> on presidential orders and memoranda regarding transparency and openness in government.  H/T to <a href="http://twitter.com/BoingBoing/status/1138515506">BoingBoing</a> via Twitter.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2009/01/20/president-obama-and-white-house-robots/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A presidential &#8220;legacy&#8221; via rewritten history</title>
		<link>http://digitalgarbage.net/2008/11/25/presidential-legacy/</link>
		<comments>http://digitalgarbage.net/2008/11/25/presidential-legacy/#comments</comments>
		<pubDate>Tue, 25 Nov 2008 21:01:35 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Digital Preservation]]></category>
		<category><![CDATA[Internet Archive]]></category>
		<category><![CDATA[White House]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=543</guid>
		<description><![CDATA[Web archiving is a topic of great interest to me and the subject of an article I&#8217;m writing.  Part of the paper addresses the Bush administration&#8217;s questionable conduct regarding the content of the White House website.  For example, the White &#8230; <a href="http://digitalgarbage.net/2008/11/25/presidential-legacy/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Web archiving is a topic of great interest to me and the subject of an article I&#8217;m writing.  Part of the paper addresses the Bush administration&#8217;s questionable conduct regarding the content of the White House website.  For example, the White House website&#8217;s <a href="http://www.whitehouse.gov/robots.txt">robots exclusion file</a> &#8212; a <a href="http://www.robotstxt.org/">mechanism</a> that can be used to ask search engine and web archive spiders to stay away &#8212; is nearly 2300 lines long.  2300 lines?  Simply absurd.  (Click <a href="http://digitalgarbage.net/wp-content/uploads/2008/11/wh-robots-2008-11-25.txt">here</a> for a copy of the White House robots file that I downloaded on Nov. 25, 2008.)</p>
<p>Today, researchers at the University of Illinois released a <a href="http://www.clinecenter.uiuc.edu/airbrushing_history/">study</a> showing how the White House has deleted or modified portions of its website.  Their findings are, sadly, unsurprising:</p>
<blockquote>
<p align="justify">Legacies are in the air as President Bush prepares to leave the  White House. How future historians will judge the president remains to be seen,  but one thing is certain: future historians won&#8217;t have all the facts needed to  make that judgment. One legacy at risk of being forgotten is the way the Bush  White House has quietly deleted or modified key documents in the public record  that are maintained under its direct control.</p>
<p align="justify">Remember the &#8220;Coalition of the Willing&#8221; that sided with the  United States during the 2003 invasion of Iraq? If you search the White House  web site today you&#8217;ll find a <a href="http://www.whitehouse.gov/infocus/iraq/news/20030327-10.html">press  release</a> dated March 27, 2003 listing 49 countries forming the coalition. A  key piece of evidence in the historical record, but also a troubling one. It is  an impostor.</p>
<p align="justify">And although there were only 45 coalition members on the eve of the Iraq  invasion, later deletions and revisions to key documents make it seem that there  were always 49.</p>
</blockquote>
<p>The study is a disturbing read.  Rightly or not, a primary source of history for many researchers is the web.  And any effort by the government to modify or delete historical records is appalling.  As the authors note:</p>
<blockquote><p>Updating lists to keep up with the times is one thing. Deleting original  documents from the White House archives is another. Back-dating later documents  and using them to replace the originals goes beyond irresponsible stewardship of  the public record. It is rewriting history.</p></blockquote>
<p style="text-align: left;">H/T: <a href="http://www.nytimes.com/2008/11/25/washington/25documents.html">New York Times</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2008/11/25/presidential-legacy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is Zoetrope the next-gen Internet Archive?</title>
		<link>http://digitalgarbage.net/2008/11/22/is-zoetrope-the-next-gen-internet-archive/</link>
		<comments>http://digitalgarbage.net/2008/11/22/is-zoetrope-the-next-gen-internet-archive/#comments</comments>
		<pubDate>Sat, 22 Nov 2008 15:06:43 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Digital Preservation]]></category>
		<category><![CDATA[Internet Archive]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[Searching]]></category>
		<category><![CDATA[Wayback Machine]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=456</guid>
		<description><![CDATA[Although the Internet Archive&#8217;s Wayback Machine is a great research tool, its utility is hampered by a lack of basic search mechanisms.  One can search by URL and archived links, but basic Google-style boolean searching isn&#8217;t available.  The Archive once &#8230; <a href="http://digitalgarbage.net/2008/11/22/is-zoetrope-the-next-gen-internet-archive/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Although the Internet Archive&#8217;s <a href="http://web.archive.org">Wayback Machine</a> is a great research tool, its utility is hampered by a lack of basic search mechanisms.  One can search by URL and archived links, but basic Google-style boolean searching isn&#8217;t available.  The Archive once offered a beta boolean search tool, but it never worked and it was later withdrawn.</p>
<p>However, a new application may significantly expand our ability to data-mine archived webdata. Reports give a sneak peek at Zoetrope, an application being developed by researchers at Adobe and the <a href="http://uwnews.washington.edu/ni/article.asp?articleID=45255">University of Washington</a>.  As put by the researchers:</p>
<blockquote><p>The Web is ephemeral. Pages change frequently, and it is nearly impossible to find data or follow a link after the underlying page evolves. We present Zoetrope, a system that enables interaction with the historical Web (pages, links, and embedded data) that would otherwise be lost to time. Using a number of novel interactions, the temporal Web can be manipulated, queried, and analyzed from the context of familar [sic] pages. Zoetrope is based on a set of operators for manipulating <em>content streams</em>. We describe these primitives and the associated indexing strategies for handling temporal Web data. They form the basis of Zoetrope and enable our construction of new temporal interactions and visualizations.</p></blockquote>
<p>The demo video shows how historical webdata could be manipulated and compared, as the authors note, in a variety of &#8220;novel&#8221; ways.  Even more significantly, researcher <a href="http://uwnews.washington.edu/ni/article.asp?articleID=45255">Eytan Adar</a> &#8220;hopes to eventually incorporate information from the Internet Archive&#8217;s nearly  14 years of records.&#8221; Such a combination would massively increase the utility of web archives, but would also &#8212; as discussed in a paper I&#8217;m writing &#8212; exacerbate concerns over informational autonomy.</p>
<p><a href="http://digitalgarbage.net/2008/11/22/is-zoetrope-the-next-gen-internet-archive/"><em>Click here to view the embedded video.</em></a></p>
<p>The research paper can be found <a href="http://www.cond.org/z.pdf">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2008/11/22/is-zoetrope-the-next-gen-internet-archive/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Yet another report on digital preservation</title>
		<link>http://digitalgarbage.net/2008/07/22/more-digital-preservation/</link>
		<comments>http://digitalgarbage.net/2008/07/22/more-digital-preservation/#comments</comments>
		<pubDate>Wed, 23 Jul 2008 00:58:32 +0000</pubDate>
		<dc:creator>Ira Nathenson</dc:creator>
				<category><![CDATA[Copyright]]></category>
		<category><![CDATA[Digital Preservation]]></category>
		<category><![CDATA[European Union]]></category>
		<category><![CDATA[Green Paper]]></category>

		<guid isPermaLink="false">http://digitalgarbage.net/?p=151</guid>
		<description><![CDATA[It must be Digital Preservation Week. Just a few days ago, I wrote about the Library of Congress&#8217; new report on digital preservation (which itself followed the report of the Section 108 Study Group issued last March).  Now, the Commission &#8230; <a href="http://digitalgarbage.net/2008/07/22/more-digital-preservation/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>It must be Digital Preservation Week.</p>
<p>Just a few days ago, I <a href="http://digitalgarbage.net/2008/07/19/copyright-digital-preservation/">wrote</a> about the Library of Congress&#8217; new report on digital preservation (which itself followed the <a href="http://www.section108.gov/docs/Sec108StudyGroupReport.pdf">report</a> of the <a href="http://www.section108.gov/">Section 108 Study Group</a> issued last March).  Now, the Commission of the European Communities has released a <a href="http://ec.europa.eu/internal_market/copyright/docs/copyright-infso/greenpaper_en.pdf">green paper</a> entitled <em>Copyright in the Knowledge Economy</em>, which discusses, among other things, digital preservation, the making available of digitized works, and orphan works.</p>
<p>Hat tip: <a href="http://www.libraryjournal.com/info/CA6580979.html#news1">LibraryJournal.com</a></p>
]]></content:encoded>
			<wfw:commentRss>http://digitalgarbage.net/2008/07/22/more-digital-preservation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

