Educate All Students, Support Public Education

March 16, 2014

What would Einstein say about VAM: the Value Added Measurement for Evaluating Teachers?

Filed under: teacher evaluation — millerlf @ 10:55 am

By Larry Miller, 3/16/14

Albert Einstein had a sign hanging in his office at Princeton that read, “Not everything that counts can be counted, and not everything that can be counted counts.” While Einstein created a revolution using the scientific method of research, he cautioned against oversimplifying the use of data to assess human behavior.

The world of teaching and learning is mired in debate over teacher evaluation and, by extension, “merit” pay. States and school districts have devised policies built on what’s known as VAM, or Value-Added Measurement, which determines a teacher’s worth largely on the basis of standardized test scores. VAM models (also referred to as student growth models) apply statistical procedures (algorithms) to test-score data in an attempt to measure the contribution a teacher makes to student learning and achievement over time.
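For readers curious about the mechanics, here is a deliberately simplified sketch in Python of the generic idea behind growth models: predict each student’s score from prior achievement, then credit each teacher with the average gap between actual and predicted scores. This is not VARC’s proprietary algorithm; every number below is invented for illustration.

```python
import random
random.seed(0)

# Three hypothetical teachers with made-up "true" effects on scores.
true_effect = {"A": 5.0, "B": 0.0, "C": -5.0}
students = []
for teacher, effect in true_effect.items():
    for _ in range(200):
        prior = random.gauss(500, 50)
        # Current score = persistence of prior skill + teacher effect + noise
        current = 0.8 * prior + 100 + effect + random.gauss(0, 20)
        students.append((teacher, prior, current))

# Step 1: fit the "expected growth" line (least squares on prior score).
n = len(students)
mx = sum(s[1] for s in students) / n
my = sum(s[2] for s in students) / n
b = sum((s[1] - mx) * (s[2] - my) for s in students) / \
    sum((s[1] - mx) ** 2 for s in students)
a = my - b * mx

# Step 2: a teacher's "value-added" is the mean residual of their students.
vam = {}
for t in true_effect:
    resids = [cur - (a + b * pri) for (tt, pri, cur) in students if tt == t]
    vam[t] = sum(resids) / len(resids)

ranking = sorted(vam, key=vam.get, reverse=True)
print(ranking)  # with 200 students per teacher, usually recovers A > B > C
```

Real systems like VARC layer dozens of covariates and statistical shrinkage on top of this, but the core computation — teacher effect as an average residual from a predicted score — is the same family of calculation.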

Despite strong research challenging the validity of VAM, a number of states and school districts are moving forward with programs that rely on it. One model called VARC, the creation of the Value Added Research Center at the University of Wisconsin-Madison, is currently being used by state departments of education in Minnesota, New York, North Dakota, South Dakota, and Wisconsin. VARC is also contracting with many large school districts in cities from Atlanta to Los Angeles.

VARC claims to take into account up to 30 variables including race, gender, ethnicity, levels of poverty, students’ levels of English language proficiency, and special education statuses. VARC also uses other variables when available including, for example, student attendance, suspension, and retention records.

I wonder how anyone factors in a “variable” such as race to determine intellectual progress. What adjustment is supposed to distinguish a black student from a white student?

How about trauma? Can VARC account for a student’s experience of violence, an incident on the way to school, or ongoing psychological distress?

Although research tells us that teacher quality has an effect on test scores, this does not mean that a specific teacher is responsible for how a specific student performs on any given standardized test. Nor does it mean we can equate effective teaching (or actual learning) with higher test scores.

A long list of researchers has documented significant statistical error in VAM when test scores are compared across years. They have shown that the scores of students taught by the same teacher fluctuate significantly from year to year. A one-time, randomly occurring factor on the day of a test can significantly affect a student’s results. Moreover, learning is complex: students transfer skills across subjects in ways that cannot be attributed to an individual teacher. We can never be certain which class and which teacher contributed to a given student’s test performance in any given subject. In Florida, art teachers are being evaluated on the basis of test scores even though art is not a tested subject.
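The year-to-year instability these researchers describe is easy to reproduce in a toy simulation. Give each simulated teacher a stable “true” effect, add fresh classroom-level noise each year, and the correlation between one year’s rating and the next collapses. The spread parameters below are assumptions chosen for illustration, not empirical estimates.

```python
import random
random.seed(1)

# Assumed spreads: true teacher effects vary less (sd 3) than the yearly
# sampling noise from a single small class (sd 5).
TRUE_SD, NOISE_SD, N = 3.0, 5.0, 2000

true = [random.gauss(0, TRUE_SD) for _ in range(N)]
year1 = [t + random.gauss(0, NOISE_SD) for t in true]  # year-1 VAM estimate
year2 = [t + random.gauss(0, NOISE_SD) for t in true]  # year-2 VAM estimate

def corr(x, y):
    """Pearson correlation, computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    return cov / (vx * vy) ** 0.5

r = corr(year1, year2)
print(round(r, 2))  # theory predicts about 9 / (9 + 25), i.e. roughly 0.26
```

Under these assumptions, the same teacher’s two annual ratings correlate only weakly — which is exactly the year-to-year fluctuation the research literature reports.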

In addition, out-of-school factors such as inadequate access to health care, food insecurity, and poverty-related stress, among others, negatively impact the in-school achievement of students so profoundly that they severely limit what schools and teachers can do on their own.

According to a U.S. Department of Education report, “More than 90 percent of the variation in student gain scores is due to the variation in student-level factors that are not under control of the teacher.” Yet that has not mattered to the states hopping on the VAM bandwagon.

The leaders of these states are not listening to the National Research Council of the National Academy of Sciences, which stated that “VAM estimates of teacher effectiveness should not be used to make operational decisions because such estimates are far too unstable to be considered fair or reliable.”

In Tennessee, Florida, and Ohio, all districts must use value-added ratings as part of a teacher’s total evaluation score. In New York, the new teacher-evaluation framework must incorporate value-added ratings.

Education should not mimic business algorithms, where the “bottom line” and market forces drive outcomes. Education should be child-centered and collaborative, taking into account the individual knowledge and skills of each child. No algorithm or computer program can replace a teacher’s running record of a student’s progress in learning to read. No algorithm or computer program can read an essay or evaluate a debate.

Only teachers who know their students and are invested in them can produce outcomes that serve a democratic society and confront the present status quo of inequality and injustice. Imprecise “science” should not determine high-stakes education outcomes, instructional quality, or teacher compensation.

I hear Einstein weighing in on this debate. “Technological progress,” he wrote, “is like an axe in the hands of a pathological criminal.”


February 2, 2011

Problems With Value Added Assessment by Diane Ravitch

Filed under: Arne Duncan,teacher evaluation — millerlf @ 12:35 pm

January 18, 2011 The Pitfalls of Putting Economists in Charge Of Education/Building Bridges Blog

Dear Deborah,

A few weeks ago, Mike Rose posted a list of his New Year’s resolutions. One of them was that we should “make do with fewer economists in education. These practitioners of the dismal science have flocked to education reform, though most know little about teaching and learning.” Mike suggested that so few economists were able to give useful advice about the financial and housing markets that we should now be skeptical about expecting them “to change education for the better.”


I agree with Mike. It is astonishing to realize the extent to which education debates are now framed and dominated by economists, not by educators or sociologists or cognitive psychologists or anyone else who actually spends time in classrooms. My bookshelves are chock full of books that analyze the teaching of reading, science, history, and other subjects; books that examine the lives of children; books that discuss the art and craft of teaching; books about the history of educational philosophy and practice; books about how children learn.

Now such considerations seem antique. Now we are in an age of data-based decision-making, where economists rule. They tell us that nothing matters but performance, and performance can be quantified, and those who do the quantification need never enter a classroom or think about how children learn.

So the issue of our day is: How do we measure teacher effectiveness? Most of the studies by economists warn that there is a significant margin of error in “value-added assessment” (VAA) or “value-added modeling” (VAM). The basic idea of VAA is that teacher quality can be measured by the test-score gains of their students. Proponents of VAA see it as the best way to identify teachers who should get merit pay and teachers who should be fired. Critics say that the method is too flawed to use for high-stakes purposes such as these.

Last July, the U.S. Department of Education published a study by Mathematica Policy Research, which estimated that even with three years of data, there was an error rate of 25 percent. A few months ago, I signed onto a statement by a group of testing experts, which cautioned that such strategies were likely to misidentify which teachers were effective and which were ineffective, to promote teaching narrowly to the test, and to cause a narrowing of the curriculum.
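A toy simulation makes it plausible how substantial misclassification persists even when three years of data are averaged together. The parameters below are invented for illustration and are not drawn from the Mathematica study.

```python
import random
random.seed(2)

# Assumed spreads: true effects (sd 3) vs. one year of classroom noise (sd 5).
# Averaging three years shrinks the noise by sqrt(3) but does not remove it.
N, TRUE_SD, YEAR_SD = 4000, 3.0, 5.0
three_year_sd = YEAR_SD / 3 ** 0.5

teachers = []
for _ in range(N):
    true = random.gauss(0, TRUE_SD)
    est = true + random.gauss(0, three_year_sd)  # three-year average estimate
    teachers.append((true, est))

# Flag the bottom quartile of *estimates* as "ineffective".
cutoff = sorted(e for _, e in teachers)[N // 4]
flagged = [(t, e) for t, e in teachers if e < cutoff]

# How many flagged teachers actually have above-median true effects?
median_true = sorted(t for t, _ in teachers)[N // 2]
false_flags = sum(1 for t, _ in flagged if t > median_true)
frac = false_flags / len(flagged)
print(f"{frac:.0%} of flagged teachers have above-median true effects")
```

Even with the noise shrunk by three years of averaging, a noticeable share of the teachers flagged as “ineffective” are, by construction, above average — the same kind of error the Mathematica study quantified.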

None of these cautions has stemmed the tide of rating teachers by student test scores and releasing the ratings. Last year, the Los Angeles Times published an online database that rated 6,000 teachers as to their effectiveness (one of them, elementary school teacher Rigoberto Ruelas, committed suicide a few weeks later). New York City is poised to make a public release of the names and ratings of 12,000 teachers, if the courts give the go-ahead (in the first trial, a judge ruled that the data could be released even if it was inaccurate).

This trend did not just happen. It was encouraged by the Obama administration’s Race to the Top, which urged states to develop quantitative measures of teacher effectiveness. Secretary of Education Arne Duncan has issued statements endorsing the publication of teachers’ names and ratings, although few testing experts agree with this practice.

The bulk of studies warn about the inaccuracy and instability of these measures, but the Gates Foundation recently released a study called “Measures of Effective Teaching” (MET) that supports the use of VAA and VAM. As is customary for the Gates Foundation, it hired an impressive list of economists at institutions across the nation to give the gloss of authority to its work. Among its key findings was this one: “Teachers with high value-added on state tests tend to promote deeper conceptual understanding as well.” Ah, said the proponents of measuring teacher quality by the rise and fall of student test scores, this study vindicates these methods and effectively counters all those cautionary warnings.

But now comes a re-analysis of the Gates study by University of California-Berkeley economist Jesse Rothstein, which says that the MET study reached the wrong conclusions and that its data demonstrate that VAA misidentifies which teachers are more effective and is not much better than a coin toss. Even the claim that teachers whose students get high scores on state tests will also get high scores on tests of “deeper conceptual understanding” is flawed, writes Rothstein.


December 29, 2010

Ranking Teachers Riddled With Problems

Filed under: teacher evaluation — millerlf @ 3:18 pm

Hurdles Emerge in Rising Effort to Rate Teachers

By SHARON OTTERMAN Published: December 26, 2010 NYTimes

For the past three years, Katie Ward and Melanie McIver have worked as a team at Public School 321 in Park Slope, Brooklyn, teaching a fourth-grade class. But on the reports that rank the city’s teachers based on their students’ standardized test scores, Ms. Ward’s name is nowhere to be found.

[Photo caption: Melanie McIver, a teacher at Public School 321 in Park Slope, Brooklyn, with the school’s principal, Elizabeth Phillips. Both have seen problems with the city’s system of ranking teachers, which is at the heart of a lawsuit in State Supreme Court in Manhattan.]

“I feel as though I don’t exist,” she said last Monday, looking up from playing a vocabulary game with her students.

Down the hall, Deirdre Corcoran, a fifth-grade teacher, received a ranking for a year when she was out on child-care leave. In three other classrooms at this highly ranked school, fourth-grade teachers were ranked among the worst in the city at teaching math, even though their students’ average score on the state math exam was close to four, the highest score.

“If I thought they gave accurate information, I would take them more seriously,” the principal of P.S. 321, Elizabeth Phillips, said about the rankings. “But some of my best teachers have the absolute worst scores,” she said, adding that she had based her assessment of those teachers on “classroom observations, talking to the children and the number of parents begging me to put their kids in their classes.”

It is becoming common practice nationally to rank teachers for their effectiveness, or value added, a measure that is defined as how much a teacher contributes to student progress on standardized tests. The practice was strongly supported by President Obama’s education grant competition, Race to the Top, and large school districts, including those in Houston, Dallas, Denver, Minneapolis and Washington, have begun to use a form of it.


November 11, 2010

Respected Teacher Commits Suicide Following LA Times Published Ratings of All Los Angeles Public School Teachers

Rigoberto Ruelas was rated “less effective than average” by the school district. This was published in the Los Angeles Times.

Teacher’s Death Exposes Tensions in Los Angeles

By IAN LOVETT Published: November 9, 2010 NYTimes

LOS ANGELES — Colleagues of Rigoberto Ruelas were alarmed when he failed to show up for work one day in September. They described him as a devoted teacher who tutored students before school, stayed with them after and, on weekends, took students from his South Los Angeles elementary school to the beach.

When his body was found in a ravine in the Angeles National Forest, and the coroner ruled it a suicide, Mr. Ruelas’s death became a flash point, drawing the city’s largest newspaper into the middle of the debate over reforming the nation’s second-largest school district.

When The Los Angeles Times released a database of “value-added analysis” of every teacher in the Los Angeles Unified School District in August, Mr. Ruelas was rated “less effective than average.” Colleagues said he became noticeably depressed, and family members have guessed that the rating contributed to his death.

On Monday, a couple hundred people marched to the Los Angeles Times building, where they waved signs and chanted, demanding that the newspaper remove Mr. Ruelas’s name from the online database.

“Who got the ‘F’? L.A. Times,” chanted the crowd, which was made up mostly of students, teachers and parents from Miramonte Elementary School, where Mr. Ruelas taught fifth grade.

The value-added assessment of teachers — which uses improvements in student test scores to evaluate teacher effectiveness — has grown in popularity across the country with support from the federal Department of Education, which has tied teacher evaluations to the Race to the Top state-grant program.

But their use remains controversial. Teachers’ unions argue that the method is unfair and incomplete and have fought its implementation across the country.

The Los Angeles Times compiled its database using seven years of standardized test scores obtained through a public records request.

A. J. Duffy, president of the union, United Teachers Los Angeles, which helped organize Monday’s event, held up Mr. Ruelas as an example of the problems with value-added assessments.

“Value-added assessments are a flawed system,” Mr. Duffy said. “This was a great teacher who gave a lot to the community.”

The newspaper has refrained from commenting on the issue beyond a statement issued after Mr. Ruelas’s death: “The Times continues to extend our sympathy to Mr. Ruelas’s family, students, friends and colleagues. The Times published the database, which is based on seven years of state test scores in the L.A.U.S.D. schools, because it bears directly on the performance of public employees who provide an important service, and in the belief that parents and the public have a right to judge the data for themselves.”

Teachers’ unions have largely opposed moves away from the tenure system, in which layoffs are based on seniority, not performance.

Recently, in Washington, where the school chancellor, Michelle Rhee, used comprehensive teacher evaluations to fire hundreds of “ineffective” teachers, their unions poured hundreds of thousands of dollars into a campaign to unseat her main supporter, Mayor Adrian M. Fenty. Mr. Fenty lost the Democratic primary in September, and Ms. Rhee resigned the next month.

Despite opposition from the teachers union, Education Secretary Arne Duncan came out in support of greater transparency in teacher evaluations, and the New York City Department of Education is also preparing to release data reports on its teachers, pending the result of a court hearing later this month.

In Los Angeles, where the school district has moved toward significant reforms, like handing control of some chronically low-performing campuses to charter school operators, members of the school board have increasingly pushed to implement value-added assessments.

“Not including value-added measures is not acceptable,” said Yolie Flores, a board member of the Los Angeles Unified School District. “But it also has to be part of a more comprehensive system of evaluation.”

Eric A. Hanushek, a senior fellow at Stanford’s Hoover Institution who studies school accountability systems, said the value-added assessments should be combined with other factors. But he said the tenure system did not offer any meaningful evaluation of teacher performance.

“Now that The L.A. Times has published these scores, I think the genie is out of the bottle, and parents are going to want this information,” Mr. Hanushek said. “I presume the union’s opposition is a last effort of the teachers’ union to say that you should never evaluate teachers. This is their attempt to take a tragic situation and turn it into one that they can use for their own political advantage.”

But Randi Weingarten, president of the American Federation of Teachers, argued that reliance on value-added assessments actually hindered efforts to carry out comprehensive teacher evaluations.

“Our union has proposed a comprehensive system of teacher evaluation that more than 50 districts have adopted,” Ms. Weingarten said. “The good work we’re doing trying to make comprehensive teacher evaluations will actually be hurt by this fixation on a value-added system.”
