10 October 2007
Today's applied stats talk by Fernanda Viegas and Martin Wattenberg covered a wide array of interesting data visualization tools that they and their colleagues have been developing at IBM Research. One of the early efforts they described is an applet called History Flow, which lets users visualize the evolution of a text document edited by many people, such as a Wikipedia entry or computer source code. You can track which authors contributed over time, how long certain parts of the text have remained in place, and how text moves from one part of the document to another. To give you a flavor of what is possible, here is a visualization of the history of the Wikipedia page for Gary King (who is the only blog contributor who has one at the moment):
This shows how the page became longer over time and that it was primarily written by one author. The applet also allows you to connect textual passages from earlier versions to their authors. We noticed this one from Gary's entry:
"Ratherclumsy"'s contribution to the article survived for only 24 minutes before another user deleted it, with best wishes for becoming "un-screwed". Kidding aside, this is a really interesting tool for text-based projects. Beyond its potential for analysis, it would be useful for people collaborating on code. I can think of more than one R function I've worked on where it would be nice to know who wrote a particular section of code....
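To make the "who wrote this section" idea concrete, here is a rough sketch of line-level attribution across a series of revisions. This is not History Flow's actual algorithm, which the talk summary above does not describe; it is just one simple way to approach the problem using Python's standard `difflib`. The function name `attribute_lines`, the `(author, text)` input format, and the sample authors are all made up for illustration.

```python
from difflib import SequenceMatcher


def attribute_lines(revisions):
    """Attribute each line of the latest revision to the author who introduced it.

    `revisions` is a hypothetical input format: a chronological list of
    (author, full_text) pairs. Returns a list of (line, author) pairs.
    """
    attributed = []  # (line, author) pairs for the previous revision
    for author, text in revisions:
        new_lines = text.splitlines()
        old_lines = [line for line, _ in attributed]
        matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep their original author.
                updated.extend(attributed[i1:i2])
            elif tag in ("replace", "insert"):
                # New or rewritten lines are credited to the current editor.
                updated.extend((line, author) for line in new_lines[j1:j2])
            # "delete": removed lines are simply not carried forward.
        attributed = updated
    return attributed


if __name__ == "__main__":
    # Invented sample history, purely for demonstration.
    history = [
        ("alice", "one\ntwo\nthree"),
        ("bob", "one\ntwo and a half\nthree"),
        ("carol", "zero\none\ntwo and a half\nthree"),
    ]
    for line, author in attribute_lines(history):
        print(f"{author:>6}: {line}")
```

One limitation of this sketch: a line that moves within the document, or is edited even slightly, gets credited to whoever moved or edited it. Tracking moved text, as the applet apparently does, would take more work. For code kept under version control, the built-in annotate or blame command does this kind of bookkeeping already.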