By Larry Miller 3/16/14
Albert Einstein had a sign hanging in his office at Princeton that read, “Not everything that counts can be counted, and not everything that can be counted counts.” While Einstein created a revolution using the scientific method of research, he cautioned against over-simplification of the use of data in assessing human behavior.
The world of teaching and learning is mired in debate over teacher evaluation and, by extension, “merit” pay. States and school districts have devised policies employing what’s known as VAM, or Value-Added Measurement, to determine a teacher’s worth largely on the basis of standardized test scores. VAM models (also called student growth models) use statistical algorithms to estimate the contributions that teachers make over time to student learning and achievement.
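To make the mechanics concrete, here is a toy sketch of the simplest kind of value-added calculation. This is not VARC’s actual model, and every number and name below is invented for illustration: current scores are adjusted for prior scores with a single regression, and a teacher’s “value added” is taken as the average leftover (residual) for that teacher’s students.

```python
# Toy illustration (NOT the VARC model; all data invented): a minimal
# "value-added" estimate computed as each teacher's mean residual after
# a linear adjustment for prior-year scores.
from statistics import mean

# (student, teacher, prior_score, current_score) -- hypothetical records
records = [
    ("s1", "A", 60, 68), ("s2", "A", 70, 75), ("s3", "A", 80, 83),
    ("s4", "B", 55, 58), ("s5", "B", 65, 72), ("s6", "B", 75, 77),
]

# Fit current = a + b * prior by least squares (one covariate only;
# real models claim to adjust for dozens of variables).
xs = [r[2] for r in records]
ys = [r[3] for r in records]
b = sum((x - mean(xs)) * (y - mean(ys)) for x, y in zip(xs, ys)) / \
    sum((x - mean(xs)) ** 2 for x in xs)
a = mean(ys) - b * mean(xs)

# "Value added" per teacher = average residual of that teacher's students.
residuals = {}
for _, teacher, prior, current in records:
    residuals.setdefault(teacher, []).append(current - (a + b * prior))
value_added = {t: mean(res) for t, res in residuals.items()}
print(value_added)
```

Even this stripped-down version shows why the approach is contested: the teacher’s rating is whatever is left over after the model’s adjustments, so everything the model omits, from trauma to a bad testing day, lands in the teacher’s score.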
Despite strong research disputing the validity of VAM, a number of states and school districts are moving forward with programs that rely on it. One model, called VARC, the creation of the Value Added Research Center at the University of Wisconsin-Madison, is currently used by state departments of education in Minnesota, New York, North Dakota, South Dakota, and Wisconsin. VARC is also contracting with many large school districts in cities from Atlanta to Los Angeles.
VARC claims to take into account up to 30 variables including race, gender, ethnicity, levels of poverty, students’ levels of English language proficiency, and special education status. VARC also uses other variables when available, such as student attendance, suspension, and retention records.
I wonder how someone factors in a “variable” such as race to determine intellectual progress. What variance is used to distinguish between a black student and a white student?
How about trauma? Can VARC take into account a student’s experience with violence, an event that occurred on the way to school, or ongoing psychological distress?
Although research tells us that teacher quality has an effect on test scores, this does not mean that a specific teacher is responsible for how a specific student performs on any given standardized test. Nor does it mean we can equate effective teaching (or actual learning) with higher test scores.
A long list of researchers has documented significant statistical error rates in VAM when comparing test scores across years. They have shown that the scores of students taught by the same teacher fluctuate significantly from year to year. A one-time, randomly occurring factor on the day of a test can significantly affect a student’s results. Moreover, the complexities of learning, and the cognitive transfer of skills that students acquire across different subjects, cannot be attributed to an individual teacher. We can never be certain which class and which teacher contributed to a given student’s test performance in any given subject. In Florida, art teachers are being evaluated on the basis of test scores even though art is not on the test.
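The instability the researchers describe can be illustrated with a hypothetical simulation (every parameter below is invented, not drawn from any real study): hold a teacher’s “true” effect perfectly constant and watch how much the estimated rating still swings from year to year, simply because one class of 25 students is a small, noisy sample.

```python
# Hypothetical simulation (invented parameters): even with a fixed "true"
# teacher effect, small classes plus noisy tests make the estimated
# value-added rating swing from year to year.
import random
from statistics import mean

random.seed(1)
TRUE_EFFECT = 1.0      # the teacher's real contribution, held constant
CLASS_SIZE = 25
TEST_NOISE_SD = 10.0   # one-day factors: illness, distraction, luck

def estimated_effect():
    # The yearly estimate is just the mean observed gain of that
    # year's class of CLASS_SIZE students.
    gains = [TRUE_EFFECT + random.gauss(0, TEST_NOISE_SD)
             for _ in range(CLASS_SIZE)]
    return mean(gains)

yearly = [round(estimated_effect(), 1) for _ in range(5)]
print(yearly)  # five "years" of ratings for the same unchanging teacher
```

With these assumed numbers, the standard error of a single year’s estimate is about 2 points, twice the size of the true effect itself, so the same teacher can look strong one year and weak the next without anything about the teaching having changed.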
In addition, out-of-school factors such as inadequate access to health care, food insecurity, and poverty-related stress negatively impact the in-school achievement of students so profoundly that they severely limit what schools and teachers can do on their own.
According to a U.S. Department of Education report, “More than 90 percent of the variation in student gain scores is due to the variation in student-level factors that are not under control of the teacher.” Yet that has not mattered to the states hopping on the VAM bandwagon.
The leaders of these states are not listening to the National Research Council of the National Academy of Sciences, which stated that “VAM estimates of teacher effectiveness should not be used to make operational decisions because such estimates are far too unstable to be considered fair or reliable.”
In Tennessee, Florida, and Ohio, all districts must use value-added ratings as part of a teacher’s total evaluation score. In New York, the new teacher evaluation framework requires them as well.
Education should not model business algorithms in which the “bottom line” and market forces drive outcomes. Education should be child-centered and collaborative, taking into account the individual knowledge and skill set of each child. No algorithm or computer program can replace a teacher’s running record of a student’s progress in learning to read. No algorithm or computer program can read an essay or evaluate a debate.
Only teachers who know and are invested in their students can produce outcomes that serve a democratic society and confront the present status quo of inequality and injustice. Imprecise “science” should not determine high-stakes education outcomes, instructional quality, or teacher compensation.
I hear Einstein weighing in on this debate. “Technological progress,” he wrote, “is like an axe in the hands of a pathological criminal.”