23 March 2010
During a press conference at last week's SxSW conference, Todd Jackson, product manager of Google's Gmail team, revealed an interesting bit of information about the company's problem-ridden new service Google Buzz:
Jackson told the crowd, as he's previously said to reporters, that too much was assumed about how Buzz would work best and be received based on Google's internal testing. Google employees didn't have a strong use case for "muting" their fellow Google employees, and the people they'd want to follow and be followed by closely matched up to their contact lists. In general, too, Jackson suggested that Google underestimated the impact of "having a social, public service appear inside ... what is a very private thing (email) for some people."
So by testing their social service inside a single context (Google employees only), the developers failed to notice that in real life, people participate in multiple contexts (family, work, friends, etc.) that they work actively to keep separate. The reasons for wanting to keep these groups separate can range from hiding an illicit affair from your spouse to political activists in oppressive regimes wanting to keep certain connections secret from the government. Another important reason to keep our communities separate is that we often play different roles - and communicate differently - in different contexts, as illustrated beautifully in the following clip from TV's Seinfeld:
So, ironically, the key problem for Buzz, Google's social network service, was that the engineers at the Googleplex had failed to understand an essential property of real-world social networks. Figure 1 illustrates the problem:
Figure 1A shows a cartoon version of Google's internal testing situation. It's clear that in this situation, since an individual (the gray node) only belongs to a single social context, sharing contact information with his neighbors reveals no new information to his social network. However, an ego-centered network in the wild looks more like the situation depicted in Figure 1B. Here, the gray node is a member of several communities (nodes with different colors) with very little communication between communities. Now, because people typically manage all of their 'worlds' from their email inbox, what Google did when they created Buzz's automatic friends-lists was to implicitly link people's worlds, revealing precisely the information that people work to suppress. Sometimes with serious implications.
It is interesting to consider what the structure displayed in Figure 1B implies for the full graph. For an individual, the world breaks neatly into a small set of social contexts, but when every single node is in this situation, the resulting total structure becomes very different from many of the model networks that are currently in use. In my own corner of the complex networks world, this has serious implications for the rapidly growing field of community detection. Currently, most algorithms are designed to search for densely connected sets of nodes that are weakly connected to the rest of the network, and while some methods do include the possibility of community overlap, most break down if the overlapping nodes make up more than a small fraction of the network. If Figure 1B is correct and overlap is present for all nodes, then the idea of communities as weakly connected to the remainder of the network is false, since communities will have many more links to the outside world than to the inside.
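A toy sketch makes that last point concrete. It is purely illustrative and rests on strong assumptions that are not part of the argument above: nodes sit on a small grid, every node belongs to exactly three contexts (one per grid axis), and every context is a fully connected group. The function names `build_overlapping_network` and `internal_external` are made up for this example.

```python
from itertools import combinations, product


def build_overlapping_network(side=4, contexts=3):
    """Place nodes on a grid; each axis-aligned line is one fully connected community.

    Every node ends up in exactly `contexts` communities, so overlap is pervasive.
    """
    nodes = list(product(range(side), repeat=contexts))
    communities = {}
    for node in nodes:
        for axis in range(contexts):
            # Nodes that agree on every coordinate except `axis` share a community.
            key = (axis, node[:axis] + node[axis + 1:])
            communities.setdefault(key, set()).add(node)

    edges = set()
    for members in communities.values():
        # Connect every pair of members inside the same community.
        for u, v in combinations(sorted(members), 2):
            edges.add((u, v))
    return communities, edges


def internal_external(community, edges):
    """Count links with both endpoints inside the community vs. exactly one inside."""
    internal = sum(1 for u, v in edges if u in community and v in community)
    external = sum(1 for u, v in edges if (u in community) != (v in community))
    return internal, external


communities, edges = build_overlapping_network()
first = next(iter(communities.values()))
internal, external = internal_external(first, edges)
print(internal, external)  # 6 24
```

In this construction each group of four has 6 internal links but 24 links leaving it, because every member also belongs to two other groups. A method that looks for groups that are only weakly connected to the rest of the network would have nothing to find here, even though every group is a perfect clique.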
I hope to see more research investigating this problem!
Oh - and George Costanza gets to have the last word...
Update April 3rd, 2010
I've just become aware of a few excellent blog posts that discuss problems related to Buzz, drawing on ideas very similar to what I present above. Fred Stutzman writes eloquently about Buzz and colliding worlds, inspired by Erving Goffman, here. That post sparked additional 'world-colliding' thoughts from David Truss (via this post from George Siemens).
 Santo Fortunato. Community detection in graphs. Physics Reports 486:75-174 (2010).
Posted by Sune Lehmann at 7:17 PM
15 March 2010
Hey all you behavioral scientists out there ..... do you have Physics envy? Are you afraid that the big bad physicists will kick sand in your face? Well, just remember, they are probably much geekier than you, and have probably borrowed your really good network methods too (without citations!).
Well, we now have research on why people, even children, think physicists SHOULD kick sand on behavioral scientists ....
A study published in the Journal of Experimental Psychology took a look at which disciplines children and adults thought were the most difficult to learn. For the most part, people of all ages think psychology is easy and physics is hard. That bias begins early and changes some, but not much, the older we get.
In one phase of the study, kindergartners, second graders, fourth graders, and college-age adults were asked to rank how hard it would be to learn certain disciplines on their own. Everyone said psychology would be the easiest. Interestingly, kindergartners thought that economics would be the hardest to learn on one's own, while adults thought it would be easier than the natural sciences (but not as easy as psychology which, again, we all agree is a snap).
So, all you psychologists out there ..... head to the gym! The physical one AND the mental one! And as for you social scientists ..... no question --- you need it as well.
Posted by Stan Wasserman at 6:30 AM
12 March 2010
Many researchers have asked me how they can get their hands on behavioral data logging programs that work on cell phones, such as the ones used in Nathan Eagle and Sandy Pentland's landmark Reality Mining study. That study was back in 2004, and they were using old Nokia phones with the Symbian OS, which presented a host of problems. Below I'll go through the currently available data logging applications for phones, and I'll describe a new system being built on top of Android that will give social scientists a greatly enhanced platform. All of these applications log Bluetooth proximity information, call logs, and cell tower IDs, but some log additional information such as WiFi access points, SMS messages, and accelerometer data. Here are the dominant data logging applications available today:
On Symbian, only Nokia 6600 phones are officially supported, but the Context Group at the University of Helsinki has developed a number of behavior logging applications for these phones, available for download here (use mitv2).
The iPhone is nice because a lot of people have them, but it's a poor choice for data logging because it does not allow processes to run in the background. This means you have to have jailbroken iPhones to run these applications, and it also means you can't offer them for download on the official app store. Anmol Madan from our group has made an iPhone app available for download here, and he also wrote a short tutorial on how to get this application running. Your iPhones have to have older versions of the firmware, however, and it doesn't work with the new 3G iPhones.
Windows Mobile 6.x is still a widely used phone OS, and Anmol has written a fairly robust data logging application that eclipses all of the previous versions in functionality, with WiFi access point logging, a survey launcher, and an automatic updating tool. He hasn't made it available for download quite yet, but it should be appearing in the next few weeks on the Human Dynamics Social Evolution website. Unfortunately, this version will not be useful for new phones in a few months because Microsoft is releasing Windows Mobile 7.0, which is not compatible with the old 6.x version that this application is written for.
Now the good news: Android phones are becoming increasingly popular and will most likely eclipse all other platforms as the dominant phone OS. Almost every cell phone manufacturer is producing Android phones, and with a unified and unrestricted app store there is an opportunity to easily reach millions of people after a short development period. Android also makes automatic updates easy.
Nadav Aharony from our group is spearheading the project for creating an Android data logging application, and he has already deployed it on over 50 phones in a new study of consumption patterns among family groups (rather than the usual study of college students in dorms). This application logs most of the usual suspects (Bluetooth, WiFi access points, call logs), but it also hashes the contents of text messages, allowing researchers to see not just who texts whom, but also get an idea of how topics spread (not the actual content, since the words are hashed, but just that topic A passed from person 1 to person 2). In fact, this application has been running on my phone for over a month with no real problems. The platform also comes with a special app store that allows researchers to log what applications people install, allowing you to look at how application usage spreads among friends. Soon Nadav is also planning to allow researchers to deploy their own apps over this new app store so that researchers can push surveys or more sophisticated logging tools to study participants. Instead of paying for apps, though, users will get paid to download apps so that they will participate (sort of like Mechanical Turk).
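To give a feel for how hashed text messages can still reveal topic spread, here is a minimal sketch. It is not the platform's actual implementation, which isn't described here; it assumes each word is replaced by a keyed hash (HMAC-SHA256) under a study-wide key, and the names `STUDY_KEY`, `hash_message`, and `shared_tokens` are hypothetical.

```python
import hashlib
import hmac
import re

# Assumption: one secret key shared by all phones in the study, so the same
# word always maps to the same token without the word itself being stored.
STUDY_KEY = b"hypothetical-study-secret"


def hash_message(text, key=STUDY_KEY):
    """Return a set of keyed hashes, one per distinct lowercased word."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {
        hmac.new(key, word.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
        for word in words
    }


def shared_tokens(tokens_a, tokens_b):
    """Hashed words that appear in both messages."""
    return tokens_a & tokens_b


# Person 1 texts person 2, then person 2 texts person 3.
msg_1_to_2 = hash_message("Have you tried the new ramen place?")
msg_2_to_3 = hash_message("We should get ramen on Friday")
msg_unrelated = hash_message("Running late, sorry")

overlap = shared_tokens(msg_1_to_2, msg_2_to_3)
print(len(overlap))  # 1
print(len(shared_tokens(msg_1_to_2, msg_unrelated)))  # 0
```

The researcher only ever sees opaque tokens, yet can tell that one token appeared in a message from person 1 to person 2 and later in a message from person 2 to person 3. A real deployment would also need to think about common words and about how the key is protected, which this sketch ignores.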
The Android Reality Mining platform promises to be extremely powerful, and the results from the current study should further push the boundaries of computational social science.
Posted by Ben Waber at 10:33 AM
Steve Strogatz, a marvelous network scientist, has been very good in his series of weekly pieces for The New York Times. My favorite so far, which gets right into balance theory and transitivity, is this one. Follow him on Mondays in the NYT, in the Opinionator Blog.
Posted by Stan Wasserman at 7:12 AM
5 March 2010
Along with Riley Crane (of DARPA Challenge and Colbert Report fame), physicist Gourab Ghoshal, and quantitatively minded art historian Max Schich, I'm putting together a workshop on High Throughput Humanities as a satellite meeting at this year's European Conference on Complex Systems in Lisbon this September. The general idea is to bring together people who ask interesting questions of massive data sets. More specifically - as the title implies - we want to figure out how to use computers to do research in the humanities in a way that extends beyond what can currently be accomplished by human beings.
Entire libraries are in the process of being scanned and we would like to begin to investigate questions like: Are there patterns in history that are currently 'invisible' due to the fact that humans have limited bandwidth - that we can only read a small fraction of all books in a lifetime?
We have an exciting program committee so it should be an interesting day!
Confirmed Programme Committee Members
Albert-László Barabási, CCNR Northeastern University, USA.
Guido Caldarelli, INFM-CNR Rome, Italy.
Gregory Crane, Tufts University, USA.
Lars Kai Hansen, Technical University of Denmark.
Bernardo Huberman, HP Laboratories, USA.
Martin Kemp, Trinity College, Oxford, UK.
Roger Malina, Leonardo/ISAST, France.
Franco Moretti, Stanford University, USA.
Didier Sornette, ETH Zurich, Switzerland.
Practical information can be found at the conference website. Oh, and did I mention that Lisbon is beautiful in September? Sign up and join us. The workshop abstract is reprinted below.
The High Throughput Humanities satellite event at ECCS'10 establishes a forum for high throughput approaches in the humanities and social sciences, within the framework of complex systems science. The symposium aims to go beyond massive data acquisition and to present results beyond what can be manually achieved by a single person or a small group. Bringing together scientists, researchers, and practitioners from relevant fields, the event will stimulate and facilitate discussion, spark collaboration, as well as connect approaches, methods, and ideas.
The main goal of the event is to present novel results based on analyses of Big Data (see NATURE special issue 2009), focusing on emergent complex properties and dynamics, which allow for new insights, applications, and services.
With the advent of the 21st century, increasing amounts of data from the domain of qualitative humanities and social science research have become available for quantitative analysis. Private enterprises (Google Books and Earth, YouTube, Flickr, Twitter, Freebase, IMDb, among others) as well as public and non-profit institutions (Europeana, Wikipedia, DBpedia, Project Gutenberg, WordNet, Perseus, etc.) are in the process of collecting, digitizing, and structuring vast amounts of information, and creating technologies, applications, and services (Linked Open Data, Open Calais, Amazon's Mechanical Turk, ReCaptcha, ManyEyes, etc.), which are transforming the way we do research.
Utilizing a complex systems approach to harness these data, the contributors of this event aim to make headway into the territory of traditional humanities and social sciences, understanding history, arts, literature, and society on a global-, meso- and granular level, using computational methods to go beyond the limitations of the traditional researcher.
Posted by Sune Lehmann at 1:49 PM
4 March 2010
Many readers of this blog will find the videos of the following conference, which took place at USC a couple of weeks ago, quite interesting; it's a fabulous lineup.
The international Network Theory Conference, organized by the ANN and SONIC research centers, took place on Feb 19-20 at the University of Southern California. Bruno Latour delivered the keynote speech titled "Networks, Societies, Spheres: Reflections of an Actor-network theorist." The four panels were focused on conceptual and methodological aspects of network theory, network inclusion and exclusion, network theories of power, and the semantic web. The list of presenters includes: Noshir Contractor, Peter Monge, Paul Leonardi, Yochai Benkler, Ernest J. Wilson III, Rahul Tongia, Karine Barzilai-Nahon, Wendy Hall, Nigel Shadbolt, David Grewal, and Manuel Castells.
Posted by David Lazer at 4:23 PM
2 March 2010
For those in Cambridge/Boston, this post is a reminder that Ignite Boston 7 is taking place this Thursday evening at the Microsoft NERD office near Kendall Square in Cambridge, MA. For the uninitiated, Ignite talks work as follows: Presenters get 20 slides that are displayed 15 seconds each, for a grand total of five minutes to make their point. The results of these strict constraints are creative and (usually) exciting talks about a variety of subjects.
As always, the list of speakers is varied with many titles that should be interesting to readers of this blog. In particular, I look forward to Tim Hwang's talk On the Ecology of Awesome. Shameless plug: I will play a minor role in Max Schich's talk about the upcoming High Throughput Humanities Symposium at the ECCS'2010 conference this summer.
Posted by Sune Lehmann at 1:46 PM