I came across this article, Mute the Messenger, in my RSS feed this week and found it fascinating. Essentially, it is the tale of standardized testing and what could potentially be the ugly reality of assessment. It is not the shortest of articles, but it is a great read.

Now, take the article with a bit of skepticism. Still, it is a very powerful article. Yes, many of the points may be simply circumstantial. Yes, there could be a lot of information that is missing. Still.

Let’s take a look at a few of the quotes.

Testing advocates believed that more rigorous curricula and tests would boost student achievement—the “rising tide lifts all boats” theory. But that’s not how it worked out.

This is one of the most powerful quotations for me. There is a fundamental belief that making the curriculum and assessments more rigorous would “obviously” lead to more learning. Funny thing about learning, though: sometimes it is more complex than people want to think.

Texas Education Commissioner Robert Scott, long an advocate of using tests to hold schools accountable, broke from orthodoxy when he called the STAAR test a “perversion of its original intent.”

Yep. Some are starting to realize that just increasing testing isn’t the panacea that some want to think that it is.

Stroup sat down at the witness table and offered the scientific basis behind the widely held suspicion that what the tests measured was not what students have learned but how well students take tests.

…his testimony to the committee broke through the usual assumption that equated standardized testing with high standards. He reframed the debate over accountability by questioning whether the tests were the right tool for the job. The question wasn’t whether to test or not to test, but whether the tests measured what we thought they did.

This points out a profound function of testing that all too many take for granted. What does testing really measure? Yes, we end up with a number at the end of testing. However, what does that number really mean? What do tests really measure? These are crucially important questions.

Stroup argued that the tests were working exactly as designed

Stroup had caught the government using a bathroom scale to measure a student’s height.

The scale wasn’t broken or badly made. The scale was working exactly as designed. It was just the wrong tool for the job. The tests, Stroup said, simply couldn’t measure how much students learned in school.

Here is the crux of the matter. Are we really using the right tools? Are we using assessments correctly? Are we sure that the assessments measure what we think that they measure? I remember times as a principal when the number one question was “what was the topic of the writing section?” Once we knew what the topic was, we were pretty sure (and always right on) about how the students would do on the assessment. Quite frankly, we knew that the topic was really, really important. We knew how well the students could write. Even more importantly, we knew that if the topic was something that the students weren’t interested in, they would not do well on the assessment.

Well, one of the legislators called for Stroup and Pearson to have a debate. That debate would never happen.

…standardized tests have become the pre-eminent yardstick of classroom learning in America, and Pearson is selling the most yardsticks.

Pearson is heavily invested (literally) in assessment. Quite frankly, they are selling the yardsticks.

But, here’s one of the interesting things. Stroup was also teaching kids. He had developed a program that helped students learn math. He knew that the kids were being successful, but that success wasn’t showing up on the statewide standardized tests. He started looking at why.

Stroup knew from his experience teaching impoverished students in inner-city Boston, Mexico City and North Texas that students could improve their mastery of a subject by more than 15 percent in a school year, but the tests couldn’t measure that change. Stroup came to believe that the biggest portion of the test scores that hardly changed—that 72 percent—simply measured test-taking ability. For almost $100 million a year, Texas taxpayers were sold these tests as a gauge of whether schools are doing a good job. Lawmakers were using the wrong tool.

So, he does the research and finds out that what the tests really measure is how well students take the test. His research found that 72% of the test score was “insensitive to instruction”. Essentially, this means that teachers, schools and educators can’t change about 70% of the test results. Pearson cried foul. They stated that he had made a mistake. According to Pearson, only 50% of the test is “insensitive to instruction”. That’s right. Pearson admitted that about half of the score that would determine how well teachers were teaching was unchangeable by the teacher. Honestly, teachers are being evaluated by these scores. Jobs, reputations, etc. are all determined by these tests. Yet, here is Pearson admitting that 50% of that score is determined by the student’s ability to take a test. Nothing the teacher or school could do would affect this part of the score.

Stroup concluded that the tests were 72 percent “insensitive to instruction,” a graduate-school way of saying that the tests don’t measure what students learn in the classroom.

After correcting what Pearson interpreted as the mislabeled column, Way wrote, the tests were “only 50 percent” insensitive to instruction.

“teachers account for about 1% to 14% of the variability in test scores,” largely confirming Stroup’s apparently controversial conclusion.

If it’s true that the test measured primarily students’ ability to take a test, then, Stroup reasoned to the House Public Education Committee in June 2012, “it is rational game theory strategy to target the 72 percent.” That means more Pearson worksheets and fewer field trips, more multiple-choice literary analysis and fewer book reports, and weeks devoted to practice tests and less classroom time devoted to learning new things. In other words, logic explained exactly what was going on in Texas’ public schools.

Oh, and the legislator who had called for a debate between Dr. Stroup and Pearson? The debate that never happened? Well, he retired. He is now a lobbyist for Pearson.

Source: http://www.texasobserver.org/walter-stroup-standardized-testing-pearson/