29 September 2011

Tweeting how you feel

Benedict Carey of the New York Times discusses a paper (gated) by Scott Golder and Michael Macy showing that people's moods -- as expressed within the character limit of Twitter -- follow remarkably predictable patterns. The authors' interpretation is that our moods are fundamentally linked to our circadian rhythms.

First, I really like this paper and I'm glad to see it come out in Science. An earlier version was presented at one of the conferences put on by Arthur Spirling and the Harvard Program on Text Research, and it caught my eye then.

On one hand, it's obviously innovative research that is making great use of the reams of data that are now sitting in the interwebs somewhere, waiting to be analyzed. The possibilities in this new, data-rich realm are seemingly endless: the culturomics/ngrams project, work on political blogs (Abe Gong), congressional tweeting (Drew Conway), the news cycle (Leskovec, Backstrom, and Kleinberg), and so on.

But we also should spend more time stepping back and asking hard questions about the data. Are tweets really a great measure of sentiment if the decision to tweet isn't random? Who is online and how do they differ from the offline folks? There is basically no discussion of this in the Golder and Macy article. Perhaps the lack of attention to the limitations of "big data" research is just an inevitable part of the fad cycle, but that doesn't mean we should let our standards slide just because someone has cool data.
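
To make the selection worry concrete, here is a toy sketch. It is not from the Golder and Macy paper; the mood scores, the tweeting probabilities, and the function name `observed_vs_true` are all made up for illustration. It compares the true average mood in a small population with the expected average mood among the tweets we would actually see, first when everyone is equally likely to tweet and then when happier people tweet more.

```python
def observed_vs_true(moods, tweet_prob):
    """Return (true mean mood, expected mean mood among tweets).

    moods: hypothetical mood scores, one per person.
    tweet_prob: function mapping a mood score to the probability
        that a person with that mood tweets (an assumption, not data).
    """
    true_mean = sum(moods) / len(moods)
    # Each person's mood is weighted by how likely they are to tweet,
    # so the "observed" mean is a probability-weighted average.
    weights = [tweet_prob(m) for m in moods]
    observed_mean = sum(w * m for w, m in zip(weights, moods)) / sum(weights)
    return true_mean, observed_mean


# Made-up population: moods spread evenly from -1 (gloomy) to 1 (cheerful).
moods = [-1.0, -0.5, 0.0, 0.5, 1.0]

# Case 1: the decision to tweet is unrelated to mood.
random_case = observed_vs_true(moods, lambda m: 0.1)

# Case 2: happier people are more likely to tweet (hypothetical rule).
selected_case = observed_vs_true(moods, lambda m: 0.1 + 0.05 * (m + 1))

print("tweeting unrelated to mood:", random_case)
print("happier people tweet more: ", selected_case)
```

In the first case the tweets recover the population average; in the second they overstate it, even though nobody's mood has changed. Whether anything like the second rule holds for real Twitter users is exactly the kind of question I'd like to see discussed.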

Posted by Richard Nielsen at September 29, 2011 8:35 PM