26 July 2011
The International Conference on Weblogs and Social Media took place in Barcelona from July 17th through July 21st. In attendance were hundreds of international researchers representing the social and computational sciences, physics, economics, media studies and the humanities, as well as key figures from industry and even the intelligence community.
Several key themes emerged during the course of the meeting, and they characterize in broad terms the challenges and opportunities facing the discipline. Among them: inference of causality and influence from historical network data; the role of selective perception and homophily in shaping exposure to web-based information resources; the limitations of unipartite models for fundamentally multiplex, multimode networks; the professionalization of journalism in the modern age; the struggle for balance between qualitative and quantitative treatments of large-scale datasets; and the role of web science in shaping our understanding of human behavior. Here we record a few of the many highlights from the conference's presentations, poster sessions, and distinguished keynotes.
Jimmy Lin, a University of Maryland professor on industrial sabbatical with Twitter, took several provocative positions in his morning keynote. Most contentiously, he argued that academics should not engage in research that industry 'can do better'. According to Lin, work of this type encompasses incremental improvements in information retrieval tasks and descriptive analyses of technological systems. Instead, he argued that researchers should focus on fundamental, transformative questions, such as how information spreads, the identification of influential individuals in social networks, and the qualities of a service that give it 'addictive' potential. Whether these are meaningful distinctions, and whether industry actually is better suited to certain types of analyses, are questions open for debate. Regardless, insofar as the goal of the talk was to catalyze this kind of discussion, Dr. Lin succeeded admirably.
Jimmy Lin's Homepage
In a wide-ranging talk addressing the factors underlying social influence and causality, Dr. Sinan Aral began by questioning the basic character of the definitions that have traditionally motivated measures of 'influence' in social networks. One of the basic tenets of his argument is that influence is best understood as the ability of an individual to initiate change in a 'system of behaviors'. He proposes that the goal of research in this domain should be the development of models that can capture indirect causal dynamics, and offers as an example a scenario in which A changes his exercise regimen in response to a change in the diet of B. Touching on the problem of distinguishing homophily from contagion, Aral acknowledges that 'most influence is really just observable homophily', but argues that with rigorously constructed statistical models we can begin to place bounds around the extent to which homophily is a driver of behavior in a system.
Sinan Aral's Homepage
Prominent sociologist and communications scholar Manuel Castells gave the opening keynote and addressed many issues relating to the role of social media in sociopolitical change. Castells contends that the prevalence of social media leads to a 'culture of free communication,' fostering in individuals a sense of self-determination and autonomy often lacking in authoritarian societies. He argues that this precipitates a cultural shift, in which people begin to see themselves as agents of change, a factor he asserts is critical to the revolutionary process. His thoughtful treatment of the subject serves as a welcome and well-reasoned counterpoint to the arguments put forth by Malcolm Gladwell in the controversial article, 'Small Change: Why the Revolution Will Not Be Tweeted.' Although this is admittedly a biased audience, the consensus among scholars in the social media community seems to be that the true role of social media in political upheaval is complex and subtle, and Castells' treatment speaks to that directly.
Manuel Castells' Homepage
Analyzing Twitter for Public Health
In a standout work, Michael Paul and Mark Dredze show that Twitter data can be used to track the timing and geospatial properties of communication relating to different health conditions throughout the United States. This work departs from other research on the subject (such as Google Flu Trends) in that it does not rely on a pre-specified set of ailments and keywords, but instead leverages structured topic models to infer classes of tweets relating to different conditions, such as allergies, obesity and insomnia. The authors show that these features are correlated with data from the Centers for Disease Control, and document the spread of allergy symptoms through different regions of the US over the course of the year.
Given the ability to tie this kind of public data to health conditions, one could envision privacy concerns were insurers to integrate a person's social media history into actuarial pricing structures. During the Q&A, however, the authors emphasized that, in its current form, this model cannot be used to make such inferences at the individual level.
Michael Paul, Mark Dredze. You Are What You Tweet: Analyzing Twitter for Public Health. [link]
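To make the comparison with official data concrete, here is a minimal sketch of one way such a check could be run: a Pearson correlation between a weekly, topic-derived tweet rate and an official surveillance rate. The choice of Pearson correlation, the function name pearson, and the numbers are all illustrative assumptions, not the authors' method or data.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and the two root sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


# Hypothetical weekly values, invented purely for illustration:
# the share of tweets assigned to an ailment topic, and an official rate.
tweet_topic_rate = [1.0, 2.0, 3.0, 4.0]
official_rate = [2.0, 4.0, 6.0, 8.0]

r = pearson(tweet_topic_rate, official_rate)
print(f"correlation: {r:.3f}")
```

A high coefficient on aggregate series like these says nothing about any individual user, which is consistent with the authors' caution during the Q&A.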
Disaster Response & Situational Awareness
As evidenced most recently by the response to the political violence in Norway, social media represent a rich channel for real-time information and communication relating to emergencies. This point is emphasized with a quote from FEMA director Craig Fugate, in which he claims that tools like Twitter can provide better situational awareness than official sources were able to produce 4-5 years ago. One of the key challenges, however, is separating tactical, actionable information from other content, such as empathetic expressions of support, that provides little leverage in terms of emergency management. In this work, the authors propose a machine learning apparatus that relies on various linguistic features, including those from natural language processing and part-of-speech annotation tools, to isolate tweets that provide this kind of critical information. Promising though this work is, further challenges relating to generalizability and the extraction of specific units of actionable information still remain.
Sudha Verma, Sarah Vieweg, William J. Corvey, Leysia Palen, James H. Martin, Martha Palmer, Aaron Schram, Kenneth M. Anderson. Natural Language Processing to the Rescue? Extracting "Situational Awareness" Tweets during Mass Emergency. [link]
FEMA Director Craig Fugate, Statement on Social Media and Disaster Management
Media Landscape and Heterogeneous Information Sources
The problem of the 'filter bubble', whereby users selectively filter their information sources so as to consume only content that reflects their pre-existing interests and beliefs, was a recurrent theme across many presentations. In this work, An et al. tackle the problem by examining the diversity of information sources to which an individual is exposed as a result of using the Twitter platform. They conclude that while each individual explicitly subscribes to relatively few media outlets (through follower relationships), users are exposed to a much broader range of information sources as a result of diffusion processes operating on the underlying social network. The result is that a user's exposure is limited by the extent to which homophily dominates their social linkages, rather than whether they subscribe to a narrow set of media sources.
Jisun An, Meeyoung Cha, Krishna Gummadi, Jon Crowcroft. Media landscape in Twitter: A world of new conventions and political diversity. [link]
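The distinction between direct subscription and indirect exposure can be sketched in a few lines. This is an illustration only, under simplifying assumptions: the function name exposure, the data layout, and the account names are hypothetical, and indirect exposure is approximated as media accounts retweeted by the non-media accounts a user follows.

```python
from typing import Dict, Set, Tuple


def exposure(
    user: str,
    follows: Dict[str, Set[str]],
    media_accounts: Set[str],
    retweets: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str]]:
    """Return (direct, indirect) media sources that reach `user`.

    direct:   media accounts the user follows.
    indirect: media accounts retweeted by followed non-media accounts,
              excluding any source the user already follows directly.
    """
    followed = follows.get(user, set())
    direct = followed & media_accounts
    indirect: Set[str] = set()
    for friend in followed - media_accounts:
        indirect |= retweets.get(friend, set()) & media_accounts
    return direct, indirect - direct


# Hypothetical toy network, invented for illustration.
media = {"outlet_1", "outlet_2", "outlet_3", "outlet_4"}
follows = {"alice": {"outlet_1", "bob", "carol"}}
retweets = {"bob": {"outlet_1", "outlet_2"}, "carol": {"outlet_3"}}

direct, indirect = exposure("alice", follows, media, retweets)
print(sorted(direct), sorted(indirect))
```

In this toy case the user subscribes to one outlet but is reached by two more through her friends; if her friends all retweeted the same outlet she follows, the indirect set would be empty, which is the homophily limit described above.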
Hypergraph Analysis of Clandestine Networks
In an increasing variety of situations we are confronted with the limitations of the traditional unipartite graph models used to represent complex social systems. Motivated by the lossy nature of bipartite projections, the authors develop a set of hypergraph techniques that support the identification of networks of illicit 'gold farmers' in the online roleplaying game 'Everquest 2.' In a particularly creative stroke, the authors supplement traditional descriptive network statistics, such as degree and centrality, with additional pattern-based features. Specifically, after identifying hypergraph motifs in the player and resource interaction network, they use market basket analysis to compute measures of support and confidence for each motif. Using this approach they are able to identify a set of network structures commonly associated with illicit activity on the platform. Work like this underscores the range of interactions available to users of digital communication and interaction platforms, and highlights the importance of methodologies that preserve the information encoded in these relationships.
Muhammad Aurangzeb Ahmad, Brian Keegan, Dmitri Williams, Jaideep Srivastava, Noshir Contractor. Trust Amongst Rogues? A Hypergraph Approach for Comparing Clandestine Trust Networks in MMOGs. [link]
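For readers unfamiliar with market basket analysis, the sketch below computes the two standard measures mentioned above. Support is the fraction of baskets containing an itemset, and confidence is the support of antecedent and consequent together divided by the support of the antecedent. Treating each player's observed motifs as a "basket", along with the motif labels and the "flagged" marker, is a hypothetical encoding chosen for illustration; the paper's actual representation may differ.

```python
from typing import FrozenSet, List


def support(baskets: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of baskets that contain every item in `itemset`."""
    if not baskets:
        return 0.0
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def confidence(
    baskets: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """Estimate P(consequent | antecedent) as a ratio of supports."""
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0
    return support(baskets, antecedent | consequent) / base


# Hypothetical baskets: motif labels seen around each player, plus a
# "flagged" marker for accounts identified as gold farmers.
baskets = [
    frozenset({"motif_a", "motif_b", "flagged"}),
    frozenset({"motif_a", "flagged"}),
    frozenset({"motif_a", "motif_c"}),
    frozenset({"motif_b"}),
]

s = support(baskets, frozenset({"motif_a"}))
c = confidence(baskets, frozenset({"motif_a"}), frozenset({"flagged"}))
print(f"support={s:.2f} confidence={c:.2f}")
```

A motif with both high support and high confidence toward the flagged label would be a candidate structural signature of illicit activity, which is the kind of pattern-based feature the authors describe.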
Social Science Research with Amazon's Mechanical Turk
A growing body of work demonstrates the usefulness of Mechanical Turk as an experimental platform for research in the social and behavioral sciences. In a three-hour tutorial, Yahoo researcher Winter Mason covered several important features of the Mechanical Turk crowdsourcing platform.
Notably, there are no studies that deal with the ways Mechanical Turk fails to reproduce canonical results from traditional lab settings. This may be a result of a tendency towards confirmation bias among Turk researchers, and one could envision a useful discussion resulting from evidence identifying Turk's weaknesses as an experimental platform.
Mason, W., Suri, S. A Guide to Behavioral Experiments on Mechanical Turk, Behavior Research Methods. [link]
Posted by Michael Conover at July 26, 2011 11:51 AM