Is Zoetrope the next-gen Internet Archive?

Although the Internet Archive’s Wayback Machine is a great research tool, its utility is hampered by a lack of basic search mechanisms. One can search by URL and archived links, but basic Google-style Boolean searching isn’t available. The Archive once offered a beta Boolean search tool, but it never worked and was later withdrawn.

However, a new application may significantly expand our ability to data-mine archived webdata. Reports give a sneak peek at Zoetrope, an application being developed by researchers at Adobe and the University of Washington.  As put by the researchers:

The Web is ephemeral. Pages change frequently, and it is nearly impossible to find data or follow a link after the underlying page evolves. We present Zoetrope, a system that enables interaction with the historical Web (pages, links, and embedded data) that would otherwise be lost to time. Using a number of novel interactions, the temporal Web can be manipulated, queried, and analyzed from the context of familar [sic] pages. Zoetrope is based on a set of operators for manipulating content streams. We describe these primitives and the associated indexing strategies for handling temporal Web data. They form the basis of Zoetrope and enable our construction of new temporal interactions and visualizations.

The demo video shows how historical webdata could be manipulated and compared, as the authors note, in a variety of “novel” ways.  Even more significantly, researcher Eytan Adar “hopes to eventually incorporate information from the Internet Archive’s nearly 14 years of records.” Such a combination would massively increase the utility of web archives, but would also — as discussed in a paper I’m writing — exacerbate concerns over informational autonomy.


The research paper can be found here.

Facebook: job-hunting, non-invisibility, and the creepiness factor

Note to job applicants: your potential employers aren’t just looking at Google and Yahoo.

Sunday’s New York Times includes a really interesting article by Alan Finder on how some companies now investigate job applicants on social networking sites such as Facebook, MySpace, Xanga, and Friendster. See “For Some, Online Persona Undermines a Résumé.”

The article underscores a simple but important fact: users of social network sites shouldn’t assume that their postings are private. Although names like “MySpace” paint an image of personal spaces, personal doesn’t mean private. It’s not difficult to get into these sites – as the article notes, for some sites such as MySpace, you generally only need to register. For Facebook, to view entries for a particular college, you only need an e-mail address from that college.

That means an awful lot of people can view Facebook entries: alumni with e-mail addresses (which could include potential employers), professors, even campus police. Despite this, at an emotional level, many people assume that their personal websites, blogs, and social network postings are relatively personal spaces that won’t be noticed or invaded by others. These assumptions are wrong in at least two ways.

