31 January 2007
Most of us are aware of various distortions in reasoning that people are vulnerable to, mainly because of the heuristics we use to make decisions easier. I recently came across an article in Psychological Science called "Choosing an Inferior Alternative" that demonstrates a technique for causing people to choose an option that they themselves have previously acknowledged to be personally inferior. This is interesting for two reasons: first, because exactly how and why it works tells us something about the process by which our brains update (at least some sorts of) information; and second, because I expect advertisers, politicians, and master manipulators to start using these techniques any day now, and maybe if we know about them in advance we'll be more resistant. One can hope, anyway.
So what's the idea?
It's been known for a while that decision makers tend to slightly bias their evaluations of new data to support whatever alternative is currently leading. For instance, if I'm trying to choose between alternatives A, B, and C -- let's say they are restaurants and I'm trying to decide where to go eat -- when I learn about one attribute, say price, I'll tentatively rank them and decide that (for now) A is the best option. If I then learn about another attribute, say variety, I'll rerank them, but not in the same way I would have if I'd seen those two attributes at the same time: I'll actually bias it somewhat so that the second attribute favors A more than it otherwise would have. This effect is generally only slight, so if restaurant B is much better on variety and only slightly worse on price, I'll still end up choosing restaurant B: but if A and B were objectively about equal, or B was even slightly better, then I might choose A anyway.
Well, you can see where this is going. These researchers presented subjects with a set of restaurants and attributes to determine each subject's objective "favorite." Then, two weeks later, they brought the same subjects in again and presented them with the same restaurants. This time, though, they had determined -- individually, for each subject -- the order of attributes that would most favor choosing the inferior alternative. (It gets a little more complicated than this: to try to ensure that subjects didn't recognize their earlier choice, the researchers combined nine attributes into six, but that's the essential idea.) Essentially, they picked the attribute that most favored the inferior choice and put it first, hoping to install the inferior choice as the leader. The attribute that second-most favored the inferior choice went last, to take advantage of recency effects. The remaining attributes were presented in pairs, specifically chosen so that the ones that most favored the superior alternative were paired with neutral or less favorable ones (thus, hopefully, "drowning them out").
The results: when presented with the information in this order, 61% of subjects chose the inferior alternative. The good news, I guess, is that it wasn't more than 61% -- some people were not fooled -- but the rate was robustly different from chance, and certainly higher than you'd expect (since, after all, it was the inferior alternative, and one would hope you'd choose it less often). Moreover, people didn't realize they were doing this at all: they were actually more confident in their choice when they had picked the inferior alternative. Even when told about the effect and asked whether they thought they themselves had succumbed to it, they tended not to think so (and the participants who showed the effect most strongly were no more likely to think they had than the ones who didn't).
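To make the mechanism concrete, here is a toy simulation of the leader-bias idea. This is my own sketch, not the authors' model: the attribute scores and the size of the bias are made-up numbers, and the only point is that nudging each new attribute toward the current leader lets presentation order alone flip the final choice.

```python
# Toy simulation of leader-biased sequential evaluation (illustrative only;
# scores and bias size are invented, not taken from the paper).

def choose(scores_by_attribute, bias=0.3):
    """Accumulate attribute scores one at a time; every attribute after
    the first is nudged by `bias` in favor of the current leader."""
    n = len(scores_by_attribute[0])
    totals = [0.0] * n
    for k, scores in enumerate(scores_by_attribute):
        if k > 0:  # no leader exists before the first attribute is seen
            leader = max(range(n), key=lambda i: totals[i])
            scores = [s + bias if i == leader else s
                      for i, s in enumerate(scores)]
        totals = [t + s for t, s in zip(totals, scores)]
    return max(range(n), key=lambda i: totals[i])

# Restaurant A (index 0) vs. B (index 1); B is objectively better
# (raw sums: A = 2.0, B = 2.2), but A wins on price.
price   = [1.0, 0.8]
variety = [0.5, 0.7]
quality = [0.5, 0.7]

print(choose([price, variety, quality]))   # price first installs A as leader -> 0
print(choose([variety, quality, price]))   # B leads throughout -> 1
```

With price shown first, the inferior restaurant A takes the lead and the bias snowballs; with any B-favoring attribute first, the objectively better option wins.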
I always get kind of depressed at this sort of result, mainly because I become convinced that this sort of knowledge will be used by unscrupulous people to manipulate others. It has probably always been used somewhat subconsciously that way, but making it explicit makes it potentially more powerful. On the plus side, it really does imply interesting things about how we process and update information -- and it raises the question of why we bias evaluations toward the leading alternative at all, given that doing so is demonstrably vulnerable to order effects. Just to make ourselves feel better about our current choice? But why would this biasing accomplish that? Wouldn't we feel best of all if we knew we were being utterly rational the whole time? It's a puzzle.
30 January 2007
Here is a question for you: imagine you are asked to conduct an observational study to estimate the effect of wearing a helmet on the risk of death in motorcycle crashes. You have to choose one of two datasets for this study: either a large, rather heterogeneous sample of crashes (these happened on different roads, at different speeds, etc.) or a smaller, more homogeneous sample of crashes (say they all occurred on the same road). Your goal is to obtain a trustworthy estimate of the treatment effect, one as close as possible to the "truth," i.e. the effect estimate that an (unethical) experimental study of the same question would yield. Which sample do you prefer?
Naturally, most people tend to choose the large sample. Larger sample, smaller standard error, less uncertainty, better inference... we've heard it all before. Interestingly, in a recent paper entitled "Heterogeneity and Causality: Unit Heterogeneity and Design Sensitivity in Observational Studies," Paul Rosenbaum comes to the opposite conclusion. He demonstrates that heterogeneity, not sample size, matters for the sensitivity of your inference to hidden bias (a topic we blogged about previously here and here). He concludes that:
“In observational studies, reducing heterogeneity reduces both sampling variability and sensitivity to unobserved bias—with less heterogeneity, larger biases would need to be present to explain away the same effect. In contrast, increasing the sample size reduces sampling variability, which is, of course useful, but it does little to reduce concerns about unobserved bias.”
This basic insight about the role of unit heterogeneity in causal inference goes back to John Stuart Mill's 1864 System of Logic. In this regard, Rosenbaum's paper is a nice complement to Jas's view on Mill's methods. Of course, Fisher dismissed Mill's plea for unit homogeneity because in experiments, when you have randomization working for you, hidden bias is not a real concern, so you may as well go for the larger sample.
Now you may say: well it all depends on the estimand, no? Do I care about the effect of helmets in the US as a whole or only on a single road? This point is well taken, but keep in mind that for causal inference from observational data we often care about internal validity first and not necessarily generalizability (most experiments are also done on highly selective groups). In any case, Rosenbaum’s basic intuition remains and has real implications for the way we gather data and judge inferences. Next time you complain about a small sample size, you may want to think about heterogeneity first.
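Rosenbaum's intuition can be sketched with a standard result from his sensitivity-analysis framework: for the sign test applied to matched pair differences, the largest hidden bias Γ that an effect can survive in large samples (the "design sensitivity") works out to p/(1-p), where p = Pr(D_i > 0). Crucially, that quantity depends on the heterogeneity of the pair differences and not on the sample size. A minimal sketch, assuming normally distributed pair differences (the τ and σ values here are made up for illustration):

```python
# Design sensitivity of the sign test for matched pair differences
# D_i ~ N(tau, sigma^2): asymptotically, a hidden bias of magnitude
# Gamma can explain away the effect only if Gamma >= p / (1 - p),
# where p = Pr(D_i > 0).  Note that n appears nowhere in the formula.
from math import erf, sqrt

def design_sensitivity(tau, sigma):
    p = 0.5 * (1.0 + erf((tau / sigma) / sqrt(2.0)))  # Pr(D_i > 0) under normality
    return p / (1.0 - p)

# Same treatment effect tau, different unit heterogeneity sigma:
print(design_sensitivity(tau=0.5, sigma=1.0))  # heterogeneous pairs
print(design_sensitivity(tau=0.5, sigma=0.5))  # homogeneous pairs: a much
                                               # larger bias is needed to
                                               # explain the effect away
```

Halving the heterogeneity (not doubling the sample) is what makes the inference robust to larger hidden biases, which is exactly the point of the paper.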
So finally back to the helmet example. Rosenbaum cites an observational study that deals with the heterogeneity issue in a clever way: “Different crashes occur on different motorcycles, at different speeds, with different forces, on highways or country roads, in dense or light traffic, encountering deer or Hummers. One would like to compare two people, one with a helmet, the other without, on the same type of motorcycle, riding at the same speed, on the same road, in the same traffic, crashing into the same object. Is this possible? It is when two people ride the same motorcycle, a driver and a passenger, one helmeted, the other not. Using data from the Fatality Analysis Reporting System, Norvell and Cummings (2002) performed such a matched pair analysis using a conditional model with numerous pair parameters, estimating approximately a 40% reduction in risk associated with helmet use.”
29 January 2007
The Applied Statistics Workshop resumes this week with a talk by Holger Lutz Kern, a Ph.D. candidate at Cornell University currently visiting Harvard. His research focuses on comparative political economy and political behavior, with an emphasis on causal inference from observational data. His work has appeared in the Journal of Legislative Studies. He will present a talk entitled "The Effect of Free Foreign Media on the Stability of Authoritarian Regimes: Evidence from a Natural Experiment in Communist East Germany," based on joint work with Jens Hainmueller. The presentation will be at noon on Wednesday, January 31, in Room N354, CGIS North, 1737 Cambridge St. As always, lunch will be provided. An abstract of the talk follows after the jump:
A common claim in the democratization literature is that free foreign media undermine authoritarian rule. In this case study, we exploit a natural experiment to estimate the causal effect of exposure to West German television on East Germans' political attitudes. While one could receive West German television in most parts of East Germany, residents of Dresden were cut off from West German television due to Dresden's location in the Elbe valley. Using an ordered probit LARF instrumental variable estimator and survey data collected in 1988/9, we find that East Germans who watched West German television were *more* satisfied with life in East Germany and the communist regime. To explain this surprising finding, we demonstrate that East Germans consumed West German television primarily for its entertainment value and not because of its unbiased news reporting. Archival data on the internal debates of the East German regime corroborate our argument.
26 January 2007
In this past Sunday’s New York Times Book Review, Scott Stossel covers a book by Sarah E. Igo, a professor in the history department at the University of Pennsylvania. The Averaged American – which I haven’t read but plan to pick up soon – discusses how the development of statistical measurement after World War I impacted not only social science, but also, well, the average American. According to the review, Igo argues that statistical groundbreakers like the Gallup poll and the Kinsey reports created a societal self-awareness that hadn’t existed before.
What struck me, though, was the reviewer’s closing comment. Stossel writes, “Even as we have moved toward ever-finer calibrations of statistical measurement, the knowledge that social science can produce is, in the end, limited. Is the statistical average rendered by pollsters the distillation of America? Or its grinding down into porridge? For all of the hunger Americans have always expressed for cold, hard, data about who we are, literary ways of knowing may be profounder than statistical ones.”
Keep in mind that these words come from a literary person immersed in the literary world (specifically, Stossel is the managing editor of The Atlantic Monthly) and should be understood in that context. Still, I hope that Stossel and the average American see the value of cold, hard data handled well. I also think that we as social scientists and statisticians should accept his challenge to keep the porridge limited, the ideas unlimited, and our impact on the national consciousness profound! And maybe we should be a little offended, too.
25 January 2007
Dr. King, Esteemed Faculty, Members of the Advisory Board, My Fellow Stats Brats:
The rite of custom brings us together at a defining hour when decisions are hard and courage is needed. We enter the year 2007 with large endeavors under way and others that are ours to begin. In all of this, much is asked of us. We must have the will to face difficult challenges and determined reviewers, and the wisdom to face them together.
We’re not the first to come here with allegiances divided between structural equation modeling and proper counterfactual reasoning and Bayesian uncertainty in the air. Like many before us, we can work through our difference in difference equations, and we can achieve big things for the scientific community. Our readers don’t much care which department we sit in as long as we are willing to walk across campus when there is work to be done. Our job is to make research better for our readers, and to help them to build CVs of hope and opportunity, and this is the business before us tonight.
A future of hope and opportunity begins with a growing software library, and that is what we have. We’re now in the 19th month of uninterrupted dissertation research by many proud graduate students at IQSS, an effort that has created 1,947 pages of prose and equations, so far. Unemployment is low, ignorance is low, wages are rising. These dissertations are on the move, and our job is to keep it that way, not with more error term jabber but with more attention to potential outcomes and causality.
Next week, I’ll deliver a bound report on the state of my dissertation to the Registrar. Tonight, I want to discuss one statistical reform that deserves to be a priority for this Institute.
In particular, there’s the matter of appealing to causality fraudulently. These appeals are often slipped into manuscripts at the last hour when not even copy editors are watching. In 2005 alone, the number of appeals to causality across journals grew to over 13,000. Reviewers did not vet them. Don did not sign off on them. Yet they are treated as if they have the blessing of Don. The time has come to end this practice. So let us work together to reform the review process — expose every slippage to the light of day and cut the number of unfounded appeals to causality at least in half by the end of this session.
This is a decent and honorable Institute, and resilient too. We’ve been through a lot together. We’ve met challenges and faced dangers, and we know that more lie ahead. Yet we can go forward with confidence, because the stats of our union is strong, our cause in the world is right, and tonight that cause goes on.
24 January 2007
I’ll be giving the talk at the Gov 3009 seminar in early February, presenting a paper I’m writing with Don Rubin on applying the potential outcomes framework of causation to what lawyers call “immutable characteristics” (race, gender, and national origin, for example). I’ll be previewing some of the ideas from this paper on the blog.
One key point from this paper is the recognition that in law (specifically, in an anti-discrimination setting), the goal of causal inference may differ from that in a more traditional social science setting. A sociologist, for example, might study the effect of tax breaks for married couples on marriage rates; the obvious goal of the study is to see whether a contemplated intervention (tax breaks) has a desired effect. An economist might evaluate a job training program for a similar reason. In anti-discrimination law, however, we study the effect of units’ perceived races (or genders, or whatever) on some outcome (e.g., hiring or promotion), but we have no interest in intervening to change these perceptions. Rather, we’re contemplating action that would mitigate the effects we find. The “intervention” we’re considering might be compensating the victim of discrimination, as in an employment discrimination suit. Or it might be ceasing a certain type of government action, such as the death penalty. But we’re not interested in implementing a policy promoting or effectuating the treatment that we’re studying.
23 January 2007
So it's finally getting cold in Boston after some days that resembled spring more than anything. Outside the buildings, smokers in T-shirts and flip-flops? The first flowers blooming?? But all is not lost: I was just reading that an early spring or a short interval of warm temperatures doesn't really matter much for plants and animals. Plants just grow new buds or skip a year. Animals adjust their sleep patterns. But maybe Mother Nature is also smart about predicting when it's the right time to wake up. Are plants and animals Bayesians who have learned to give more weight to a signal that predicts the change of seasons better than temperature does?
Apparently plants and animals have an internal clock that measures the length of day and night, using the length of sunlight exposure as a proxy. Having been around a couple of hundred years, they might know that relying on the length of day is a safer bet than relying too much on temperature. I wonder whether there is evidence of Mother Nature changing those weights over time, as one signal becomes more reliable than another? Maybe temperature was a better predictor when the Little Ice Age began? It wouldn't be so great to wake up when it's well below zero in late May. This would be a good example for Amy's post on Bayesian inference and natural selection (see here).
Here in the computer lab of an unnamed basement in Cambridge, MA, yours truly won't be fooled by the temperatures either. I'll take a nap now.
22 January 2007
... it may extend your life by up to two years, according to a new paper by Matthew Rablen and Andrew Oswald from the University of Warwick, as reported in this week's Economist. They suggest that the increase in status associated with winning a Nobel Prize increases longevity compared to those who are nominated but never win. As the authors note, looking at Nobel nominees and laureates presents some problems because nominees have to be alive at the time of their nomination (and, since 1971, have to remain alive until the prize is awarded). This implies that living longer increases your chances of winning the prize in the first place.
One way the authors try to deal with this problem is by matching Nobel winners to nominees who were nominated at the same age but never won. Now, we obviously like matching here at Harvard, but my sense is that this doesn't quite take care of the problem. By dividing the groups into "winners" and "never winners," you still have the problem that some of the "never winners" stay in that category because they don't live long enough to be recognized with a prize. It seems to me that a better approach would be to compare winners to individuals who were unsuccessful nominees at the same age, whether or not they went on to win a Nobel later in life. I think this is closer to the actual treatment, which is not "win or don't win," but rather "win now or stay in the pool." My guess is that this comparison would reduce the matching-based estimate of the increase in lifespan.
On the other hand, there doesn't appear to be any evidence that winning a Nobel shortens your lifespan, so tell your friends that they should go ahead and nominate you (unless you agree with Andrew Gelman on this...).
17 January 2007
I saw a thought-provoking post at John Baez's diary the other day pointing out an interesting analogy between natural selection and Bayesian inference, and I can't decide if I should classify it as just "neat" or if it might also be "neat, and potentially deep" (which is where I'm leaning). Because it's a rather lengthy post, I'll just quote the relevant bits:
The analogy is mathematically precise, and fascinating. In rough terms, it says that the process of natural selection resembles the process of Bayesian inference. A population of organisms can be thought of as having various "hypotheses" about how to survive - each hypothesis corresponding to a different allele. (Roughly, an allele is one of several alternative versions of a gene.) In each successive generation, the process of natural selection modifies the proportion of organisms having each hypothesis, according to Bayes' law!
Now let's be more precise:
Bayes' law says if we start with a "prior probability" for some hypothesis to be true, divide it by the probability that some observation is made, then multiply by the "conditional probability" that this observation will be made given that the hypothesis is true, we'll get the "posterior probability" that the hypothesis is true given that the observation is made.
Formally, the exact same equation shows up in population genetics! In fact, Chris showed it to me - it's equation 9.2 on page 30 of this book:
* R. Bürger, The Mathematical Theory of Selection, Recombination and Mutation, section I.9: Selection at a single locus, Wiley, 2000.
But, now all the terms in the equation have different meanings!
Now, instead of a "prior probability" for a hypothesis to be true, we have the frequency of occurrence of some allele in some generation of a population. Instead of the probability that we make some observation, we have the expected number of offspring of an organism. Instead of the "conditional probability" of making the observation, we have the expected number of offspring of an organism given that it has this allele. And, instead of the "posterior probability" of our hypothesis, we have the frequency of occurrence of that allele in the next generation.
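The correspondence above is easy to verify numerically. A minimal sketch (the allele frequencies and fitnesses below are made-up numbers): one generation of selection is literally the same arithmetic as a discrete Bayes update, with frequency playing the role of the prior, fitness the likelihood, and mean fitness the marginal probability of the observation.

```python
# Discrete Bayes update and one generation of selection are the same
# computation: frequency ~ prior, fitness ~ likelihood, and mean
# fitness ~ marginal probability of the observation.

def bayes_update(prior, likelihood):
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return [p * l / evidence for p, l in zip(prior, likelihood)]

def next_generation(freqs, fitness):
    mean_fitness = sum(q * w for q, w in zip(freqs, fitness))
    return [q * w / mean_fitness for q, w in zip(freqs, fitness)]

freqs   = [0.7, 0.3]   # allele frequencies / prior probabilities (made up)
fitness = [1.0, 1.5]   # expected offspring / likelihoods (made up)

print(bayes_update(freqs, fitness))
print(next_generation(freqs, fitness))  # identical numbers either way
```

The fitter allele gains frequency for exactly the reason a better-supported hypothesis gains posterior probability.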
Baez goes on to wonder, as I do, if people doing work on genetic programming or Bayesian approaches to machine learning have noticed this relationship. I feel like I would have remembered if I'd seen something like this (at least recently), and I don't remember anything, but that doesn't mean it's not there -- any pointers, anyone? [The closest I can think of is an interesting chapter (pdf) by David MacKay called "Why have sex? Information acquisition and evolution", but it's mainly about how one can use information theory to quantify the argument for why recombination (sex) is a better way to spread useful mutations and clear less-useful ones].
Also, re: the conceptual depth of this point... I've long thought (and I'm sure I'm not alone in this) that it's useful to see natural selection as a guided search over genotype (or phenotype) space; Bayesian inference, i.e., searching over "problem space" so as to maximize posterior probability, seems to be a valuable and useful thing to do in machine learning and cognitive science. [Incidentally, I've also found it to be a useful rhetorical tool in discussing evolution with creationists -- the idea that computers can do intelligent searches over large spaces and find things with small "chance" probability is one that many of them can accept, and from there it's not so much of a leap to think that evolution might be analogous; it also helps them to understand how "natural selection" is not "random chance," which seems to be the common misunderstanding.] Anyway, in that superficial sense, it's perhaps not surprising that this analogy exists; on the other hand, the analogy goes deeper than "they are both searches over a space" -- it's more along the lines of "they are both, essentially, maximizing the same quantity (posterior probability)."
Anyway, I'm now speculating about things I know very little about, and I should go read the Bürger book (which has been duly added to my ever-expanding reading list). But I thought I'd throw out these speculations now anyway, since you all might find them interesting. And if anyone has any other references, I'd love to see them.
16 January 2007
The Applied Statistics Workshop will resume for the spring semester on January 31, 2007. We will continue to meet in the CGIS Knafel Building, Room N354 on the third floor at noon on Wednesdays. The Workshop has a new website that has the tentative schedule posted for the semester. We will be moving the archives of papers from the previous semesters to the new site in the coming weeks, so you can track down your favorite talks from years past. As a preview of what's to come, here are the names and affiliations of some of the speakers presenting in the next month:
Holger Lutz Kern
Department of Government
Department of Statistics
Alberto Abadie, Alexis Diamond, and Jens Hainmueller
Kennedy School of Government and Department of Government
Department of Government
11 January 2007
We are fortunate to have Amy continuing to write for the blog at the same time as she continues her Bayesian studies of language, how kids learn, and a variety of other interesting issues in cognitive science.
9 January 2007
Courtesy of Aleks at Columbia, who brought this to my attention:
A very interesting collection of visualizations for projects, proposals, and presentations. The periodic table arrangement itself is not at all useful, but the depth and organization sure are.