Whatever changes in the next 24 hours as the strike seems to be winding down, standardized testing is simply not an effective means of evaluating teacher performance. The Sun-Times ran one of the better op-eds I’ve seen on the issue yesterday.
The first important consideration of testing is purpose. The process of test construction is so specialized that an instrument designed for one purpose cannot be used for another. Even if we use the best tests possible, it is a core truth of psychometrics that no test is completely reliable: Error is part of every score.
For this reason, test developers, academic bodies and professional associations alike warn against attaching severe consequences to performance on any test.
It gets even worse from there. The way that CPS plans to use test scores in teacher evaluation, referred to as value-added, is so incredibly flawed that almost no one with a knowledge base in this area thinks it’s a good idea.
The National Research Council wrote a letter to the Obama administration warning against including value-added in Race to the Top federal grant program because of a lack of research support. The Educational Testing Service, an organization that stands to benefit tremendously from any expansion of testing, issued a report concluding that value-added is improper test use.
These are the people who know the statistics, and none of them thinks the models work. There is a list of obstacles:
One: A correlation does not mean a causality. Researchers have found fifth-grade teacher “effects” on fourth-grade scores using these models. Ridiculous, right?
That’s because the models don’t work. For one thing, there must be random assignment of students for this kind of comparison among teachers to work — and no administration that cared about students would ever do that. There are deep statistical problems, and no way to reduce the amount of error to an acceptable level. The biggest problem of all, though, is that this is a ranking. So half of all teachers will always be below the 50th percentile. That’s math.
Amen. Eric has a collection of more in-depth articles available here. Clearly, people who worry about such things as validity and reliability in social science are French commie pinko fascists librul elites.
And yes, you can blame Barack Obama and Arne Duncan for this, as Race to the Top requires tying teacher evaluation to standardized test scores. It’s only slightly less ridiculous than the requirement under NCLB for all students to be average or above.
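For anyone who wants to see the op-ed's two statistical points in action (error is part of every score, and a ranking always puts half of teachers below the 50th percentile), here is a rough toy simulation. To be clear, this is not a value-added model and not anything CPS uses. The function names, the 1,000 made-up teachers, and especially the assumption that measurement noise is as large as the real spread in teacher quality are all mine, picked only for illustration.

```python
import random
import statistics


def simulate_scores(n_teachers=1000, noise_sd=1.0, seed=0):
    """Give each hypothetical teacher a fixed 'true' effect, then two noisy yearly scores.

    noise_sd=1.0 (noise as large as the true spread) is an illustrative
    assumption, not an estimate taken from any real evaluation system.
    """
    rng = random.Random(seed)
    true_effect = [rng.gauss(0, 1) for _ in range(n_teachers)]
    year1 = [t + rng.gauss(0, noise_sd) for t in true_effect]
    year2 = [t + rng.gauss(0, noise_sd) for t in true_effect]
    return true_effect, year1, year2


def share_below_median(scores):
    """Fraction of teachers who score strictly below the median."""
    med = statistics.median(scores)
    return sum(s < med for s in scores) / len(scores)


def share_switching_halves(scores_a, scores_b):
    """Fraction of teachers who land in a different half in the two years."""
    med_a = statistics.median(scores_a)
    med_b = statistics.median(scores_b)
    switched = sum((a < med_a) != (b < med_b) for a, b in zip(scores_a, scores_b))
    return switched / len(scores_a)


if __name__ == "__main__":
    true_effect, year1, year2 = simulate_scores()
    # Make every teacher far better: the ranking still puts half below the median.
    improved = [s + 10 for s in year1]
    print("below median, original scores:", share_below_median(year1))
    print("below median, everyone improved:", share_below_median(improved))
    # Same teachers, same true effects, different noise: many change halves.
    print("switched halves between years:", share_switching_halves(year1, year2))
```

Adding 10 points to every score leaves the share below the median untouched, which is the "that's math" point. And under the noise assumption above, a sizable share of teachers flip between the top and bottom half from one year to the next even though nothing about their actual teaching changed in the simulation.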