



7 March 2011

Type I and Type II Errors

A well-known social scientist once confessed to me that, after decades of doing social research, he still couldn't remember the difference between Type I and Type II errors. Since I suspect that many others share this problem, I thought I would pass along a mnemonic I learned from a statistics professor. Recall that a Type I error occurs when the null hypothesis is rejected when it is in fact true, while a Type II error occurs when the null hypothesis is not rejected when it is actually false. Many people, of course, find this distinction difficult to keep straight.

So here's the mnemonic: first, a Type I error can be viewed as a "false alarm" and a Type II error as a "missed detection"; second, note that the phrase "false alarm" has fewer letters than "missed detection," just as the numeral 1 (for a Type I error) is smaller than 2 (for a Type II error). Since learning this mnemonic, I have not forgotten the difference between Type I and Type II errors!
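To make the definitions concrete, here is a small simulation sketch in Python (my own illustration, not from the post; it assumes numpy and scipy are installed). When the null is true, any rejection is a false alarm; when the null is false, any failure to reject is a missed detection.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    alpha, n, reps = 0.05, 30, 5000

    # Type I errors: the null (mean = 0) is true, yet we sometimes
    # reject anyway -- false alarms.
    false_alarms = 0
    for _ in range(reps):
        sample = rng.normal(0.0, 1.0, n)      # null hypothesis is true
        _, p = stats.ttest_1samp(sample, 0)
        if p < alpha:
            false_alarms += 1                 # rejected a true null

    # Type II errors: the null is false (true mean = 0.5), yet we
    # sometimes fail to reject -- missed detections.
    missed = 0
    for _ in range(reps):
        sample = rng.normal(0.5, 1.0, n)      # null hypothesis is false
        _, p = stats.ttest_1samp(sample, 0)
        if p >= alpha:
            missed += 1                       # failed to reject a false null

    print("Type I rate (false alarms):       %.3f" % (false_alarms / reps))
    print("Type II rate (missed detections): %.3f" % (missed / reps))

The Type I rate should come out near the chosen alpha of 0.05; the Type II rate depends on the sample size and the true effect.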

Posted by Ethan Fosse at 8:29 PM

Rubin on "Are Job-Training Programs Effective?"

We hope you can join us at the Applied Statistics Workshop this Wednesday, March 9th, when we are excited to have Don Rubin, the John L. Loeb Professor of Statistics here at Harvard University, who will be presenting recent work on job-training programs. You will find an abstract below. As usual, we will begin with a light lunch at noon, with the presentation starting at 12:15 pm and wrapping up by 1:30 pm.

“Are Job-Training Programs Effective?”
Don Rubin
John L. Loeb Professor of Statistics, Harvard University
Wednesday, March 9th 12:00pm - 1:30pm
CGIS Knafel K354 (1737 Cambridge St)

Abstract:

In recent years, job-training programs have become more important in many developed countries with rising unemployment. It is widely accepted that the best way to evaluate such programs is to conduct randomized experiments. With these, among a group of people who indicate that they want job-training, some are randomly assigned to be offered the training and the others are denied such offers, at least initially. Then, according to a well-defined protocol, outcomes, such as employment statuses or wages for those who are employed, are measured for those who were offered the training and compared to the same outcomes for those who were not offered the training. Despite the high cost of these experiments, their results can be difficult to interpret because of inevitable complications when doing experiments with humans. In particular, some people do not comply with their assigned treatment, others drop out of the experiment before outcomes can be measured, and others who stay in the experiment are not employed, and thus their wages are not cleanly defined. Statistical analyses of such data can lead to important policy decisions, and yet the analyses typically deal with only one or two of these complications, which may obfuscate subtle effects. An analysis that simultaneously deals with all three complications generally provides more accurate conclusions, which may affect policy decisions. A specific example will be used to illustrate essential ideas that need to be considered when examining such data. Mathematical details will not be pursued.
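For intuition about one of the complications the abstract mentions, noncompliance, a textbook approach in the Angrist-Imbens-Rubin tradition (though not necessarily the analysis in the talk) is to compare outcomes by the randomized offer and then scale by the offer's effect on take-up. Here is a minimal simulation sketch in Python, with hypothetical numbers throughout:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10000

    # Randomized offer of job training (the instrument).
    offer = rng.integers(0, 2, n)

    # Hypothetical compliance: 60% take training if (and only if) offered.
    complier = rng.random(n) < 0.6
    trained = offer * complier            # training actually received

    # Hypothetical earnings: training adds 2.0 for those who receive it.
    earnings = 5.0 + 2.0 * trained + rng.normal(0.0, 1.0, n)

    # Intention-to-treat effect: compare by offer, ignoring who complied.
    itt = earnings[offer == 1].mean() - earnings[offer == 0].mean()

    # Wald / IV estimate of the complier average causal effect:
    # the ITT scaled by the offer's effect on take-up.
    take_up = trained[offer == 1].mean() - trained[offer == 0].mean()
    cace = itt / take_up

    print("ITT:  %.2f" % itt)     # roughly 2.0 * 0.6 = 1.2
    print("CACE: %.2f" % cace)    # roughly 2.0

This handles noncompliance alone; the point of the abstract is that dropout and undefined wages for the unemployed need to be modeled at the same time, which this sketch does not attempt.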

Posted by Matt Blackwell at 10:20 AM

Machine Learning Tutorials

Andrew Moore has a fairly long list of tutorials on various topics in Machine Learning and Statistics. Here is the description:

The following links point to a set of tutorials on many aspects of statistical data mining, including the foundations of probability, the foundations of statistical data analysis, and most of the classic machine learning and data mining algorithms.

These include classification algorithms such as decision trees, neural nets, Bayesian classifiers, Support Vector Machines and case-based (aka non-parametric) learning. They include regression algorithms such as multivariate polynomial regression, MARS, Locally Weighted Regression, GMDH and neural nets. And they include other data mining operations such as clustering (mixture models, k-means and hierarchical), Bayesian networks and Reinforcement Learning.

There is a little modesty in that description. The slides I have looked at do a great job of motivating the methods with intuition, which is often hugely lacking in this kind of material.
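As a taste of how simple some of these algorithms are at their core, here is a bare-bones k-means sketch in Python (my own illustration, not taken from Moore's slides; it assumes numpy):

    import numpy as np

    def kmeans(x, k, n_iter=100, seed=0):
        """Plain k-means: alternate nearest-center assignment and mean update."""
        rng = np.random.default_rng(seed)
        centers = x[rng.choice(len(x), size=k, replace=False)]
        for _ in range(n_iter):
            # Assign each point to its nearest center.
            dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Move each center to the mean of its assigned points
            # (keep the old center if a cluster ends up empty).
            new_centers = np.array([
                x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        return centers, labels

    # Two well-separated hypothetical clusters.
    rng = np.random.default_rng(1)
    x = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
    centers, labels = kmeans(x, k=2)
    print(centers)    # should land near (0, 0) and (5, 5)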

Posted by Matt Blackwell at 9:50 AM