
Education “Reform” and the Worthlessness of Standardized Testing

August 23, 2012

Kristina Rizga’s piece at Mother Jones on her experiences in a supposedly failing San Francisco high school is just fantastic. Following a Salvadoran immigrant who saw some pretty horrible stuff back home, Rizga shows how dedicated teachers and administrators created a positive learning environment in a diverse school.

One of the most diverse high schools in the country, Mission has 925 students holding 47 different passports. The majority are Latino, African American, and Asian American, and 72 percent are poor. Yet even as the school was being placed on the list of lowest-performing schools, 84 percent of the graduating class went on to college, higher than the district average; this year, 88 percent were accepted. (Nationally, 32 percent of Latino and 38 percent of African American students go to college.) That same year, Mission improved Latinos’ test scores more than any other school in the district. And while suspensions are skyrocketing across the nation, they had gone down by 42 percent at Mission. Guthertz had seen dropout rates fall from 32 percent to 8 percent. Was this what a failing school looked like?

By the metrics used under No Child Left Behind and by anti-teacher “reformers” like Michelle Rhee, yes. And this is incredibly stupid, as are the standards themselves. As a historian, this bit on standardized testing in history made me furious:

As Roth retreated to his desk, Maria stared at the rows of empty bubbles. A sharp, pounding pain filled her head. She picked up a pencil and read the first question:

During the late 19th and 20th centuries, urban immigrants generally supported local political machines that:

(a) discouraged the new immigrants from participating in civic affairs.

(b) were usually supported by urban reformers.

(c) provided essential services to the immigrants.

(d) reminded immigrants of political practices in their homelands.

As always, Maria started translating the words into Spanish. Then she got to discouraged. She’d seen the word many times before, but it was usually in a context where she could guess the meaning of the passage without knowing every term. In this short sentence, though, there were no hints.

She tried to remember the word’s meaning for a few minutes. Nothing.

Affairs was another word she’d heard before but couldn’t remember. She translated the rest of the sentence—new immigrants from participating—but that didn’t help. She took a deep breath and translated the rest of the answers. B was a possibility, she thought, but something felt off. C seemed right. But what about A? What if that was really the answer? There was no way of knowing. She filled in C for now.

“Five more minutes, everyone!” Roth interrupted. An ambulance siren wailed outside. Maria had spent too much time on the first five questions, and now she had to rush. She translated another page and randomly bubbled in the rest.

When she switched to the written section of the test, her leg stopped bouncing. When the bell rang, Maria kept writing, and didn’t stop until Roth collected the pages from her.

Roth waited until the last student had left the room, and we looked over Maria’s test together. She got almost all the answers wrong on the practice multiple-choice section, the only one that would have counted for the state. On Roth’s essay question, she got an A+.

There is so much here to protest. First, these tests automatically discriminate against people for whom English is not their first language. The profiled student did not really learn English until 9th grade. So it’s hardly surprising that she might not know what some of these words meant. It doesn’t mean she’s not smart or learning rapidly or that the school is failing her. It means that for reasons outside of her or the school’s control, she doesn’t have the English language skills that native-born speakers do. I’d like to see Michelle Rhee take a standardized test in Spanish about Mexican history after just 3 years of learning the language, despite the obvious educational advantages someone like Rhee has.

Second, why are we doing multiple choice tests in history? What does multiple choice show? Anything? The ability to memorize I guess. Does it promote understanding? Critical thinking? Real-life skill building? No. None of it. The article doesn’t really explore what the essay portion that the young woman scored so well on was about, but wouldn’t that be so much more valuable? If California wants to use standardized testing, go all AP on it and hire readers to go over these essays and grade them. At least that would give these students a chance to show their ability. Instead, that essay doesn’t really count for anything in standardized testing.

I do not use multiple choice tests in my college classes. Why? Because they test nothing valuable and because they can destroy the grades of students who may not have a great factual basis in the material or a good memory for these types of things. Instead, short answer essays allow students to show what they do know instead of what they don’t know and create scenarios for partial credit.

Meanwhile, despite the huge obstacles this school faces, they are placing kids in college, graduating them at high rates, etc. But I guess we should fire the principle or something because the test scores aren’t high enough. Absolutely absurd and outrageous.

But hey, instead of focusing on the poverty, language barriers, and economic inequality that creates educational disparities, let’s instead have our wealthy white California high schools hold “Seniores and Señoritas” dress-up days, where our next generation of plutocrats can dress up as cholos and wear sombreros.

Even better, why not just force immigrants into a permanent underclass? That’s the new official policy of the Republican Party, after all, which just included a plank in its platform that would deny federal education funding to any state that offered in-state tuition to undocumented immigrants. California is one of 12 states that still offer this to their residents, and this policy would ruin any chance at higher education for the young woman profiled in Rizga’s piece.



Comments (66)


  1. Nathan Williams says:

    Not to disagree with the overall point, but this:

    If California wants to use standardized testing, go all AP on it and hire readers to go over these essays and grade them

    seems wrong. Maybe the multiple-choice tests are worse, but the failings of AP essay grading are legion.

    • Richard says:

      And would be probably worse for kids who have to write in English after only learning the language for the last three years. Is there any reason, other than the anecdote in Erik’s post, to believe that new English speakers would do better on essays required to be in English than on standardized tests?

      • Erik Loomis says:

        Well, the student profiled in the essay received an A+ on her essay.

        And for the problems with AP grading, as a veteran AP grader, I can say that most students get a fair shake.

        Though no question that the language barrier could and probably would be a problem there too.

        • Richard says:

          This one student did. But in general I don’t know why students with less than three years’ exposure to English would do better on essays than on standardized tests. I’ve been married to a woman who was born in Mexico for the last eleven years and have picked up a little Spanish. I know for a fact that I would do better on a standardized test in Spanish than on a test requiring me to write an essay in Spanish.

          With regard to grading AP tests, I’m sure you try to do a fair job but my understanding is that there have been studies showing great variety from grader to grader which, whatever else can be said against them, is not the case with standardized tests.

          Also, as I understand the California school grading system, what it primarily does is measure improvement (or lack of improvement) from one school year to the next. So even if the school has very low scores on the standardized tests based on poor language skills, the scores should, at the least, marginally improve the next year since the students are getting more conversant in English. As long as there is substantial improvement, the school doesn’t get a failing grade (I’m no expert on the system; other people more knowledgeable can feel free to correct me). I don’t believe a school fails just because it has low scores. It fails when the scores don’t improve (assuming no significant change in the school demographic which brought in more Spanish speakers or more students without good parental support at home).

          • djw says:

            my understanding is that there have been studies showing great variety from grader to grader which, whatever else can be said against them, is not the case with standardized tests

            I’d be curious to see some sort of link to back up this claim. My experience (Government) with AP grading suggests something rather different: we start out giving divergent scores at the beginning of the week. We’re checked and re-checked early, and beaten into submission. By the third day, we’re achieving reliability that I certainly wouldn’t have thought possible. If specific questions can’t produce reliability, it is not unprecedented to drop them from the grading and weight the others differently. I also was told that if a particular grader continues to score inaccurately (as determined by spot-checks), his/her scores will be thrown out and the essays will be rescored by competent graders.

            If I were to criticize the AP grading, it would be in precisely the opposite direction from the one your comment suggests. At the altar of reliability, we are forced to sacrifice validity. While it didn’t happen that often, there were always some essays that took the question in a somewhat different direction than was intended–but nevertheless demonstrated substantial knowledge–that we had to give low scores because of the almighty rubric.

            • Richard says:

              If that is the practice in effect with regard to AP tests, you are probably right. I was referring to studies in general on the grading of essay tests showing that, without the norming controls you describe, there is great variety from grader to grader.

          • Timb says:

            who was born in Mexico for the last eleven years

            That is amazing!**

            **this pedantic quibble brought to you with a) no opinion about the discussion, b) a solid personal history of poor typing on my part, and c) no animus. I just think her being born in Mexico 11 years in a row is amazing

  2. Murc says:

    I do not use multiple choice tests in my college classes.

    Do you teach intro or survey classes?

    I have a friend who teaches for a motley combination of Southwestern, Sam Houston, and a couple other Texas unis local to him while he desperately seeks tenure. A lot of American History to 1860 and other such things. He would LIKE to be able to make his tests all essays and short answers, but his department forbids it, on the grounds that failing that many freshmen is untenable.

    • Linnaeus says:

      That’s interesting. As far as I know, my department has no such rule. Furthermore, I’ve never seen any multiple choice tests used in my department. Now, that doesn’t mean that none of our instructors or TAs has ever used them, but I’m confident in saying that if they are used, it’s quite rare. And this applies across the board from intro-level to senior seminar classes.

      • Murc says:

        Well, perhaps saying “forbids” was too strong a word. His department has certain guidelines he has to follow (has to have a midterm and a final, things like that) and some of those guidelines are related to test format.

        The flip side is that he’s also required to make them do a certain amount of long-form writing, and he has a lot of latitude in the format that can take.

    • Erik Loomis says:

      Yes, I teach a 125 student survey every fall.

      Also, I taught at Southwestern U. for 3 years. That school at least has no such rule. And students fail far more with multiple choice than essays and short answers.

      • Scott Lemieux says:

        And students fail far more with multiple choice than essays and short answers.

        I tried multiple choice exams a couple times when teaching 250-student Am Gov surveys to classes that had a lot of ESL students. The grades were worse than on my essay exams.

        • I tried MC tests in my expanded World surveys last year, and gave them up, despite the fact that my class sizes are still up (2×72) and I now have a collection of MC questions written to my own standards.

          Students did very poorly, and I came to the conclusion that MC tests are very good at identifying what students *don’t* know, but useless for evaluating learning in a fair or meaningful way.

      • DrDick says:

        I teach intro anthropology classes with between 180-25 students and do use multiple choice there for logistical reasons, as I simply cannot grade that many essay exams, even with my one TA, in a reasonable amount of time. I share your distaste for multiple choice, however, and do not use them in any other classes.

        One thing I have noticed is that even with my largely white bread Anglo students, vocabulary is often a big problem for them and I am continually astounded at the words they do not know (because they will ask me). I think it has a large negative effect on their grades, but unless the university gives us more resources for TAs or I reduce my class size (which is not a viable option) I have no real choice.

        • Ruviana says:


          I too teach fairly well-prepared white-bread types and I’m amazed at the words they don’t know–egalitarian? I think that it’s the degree to which many students don’t read for pleasure and thus don’t get exposed to lots of words.

          About testing, I always use short essay, particularly for introductory courses (I too am an anthropology professor). Students usually do worse when I include objective-style questions and I rarely if ever use them anymore. I do have the luxury of smallish courses since I teach at a very small institution.

        • sparks says:

          Seems your class size is highly variable as it is.

        • Eli Rabett says:

          Eli (chemistry) has a little trick, multiple choice questions with a few lines below to write out the answers. The written part can earn partial credit, but the marker can zip through the better students’ exams and the zeros fast enough.

          The same thing should work with other areas, even history or poly sci.

      • Bijan Parsia says:

        The classic trade-off is that MCQs are faster to take and mark but much slower (and harder) to write (and the reverse for subjectively marked questions). Because they’re faster to take, you can cover more material in the same amount of exam time, thus achieving (possibly) greater validity and reliability.

        But they are very difficult to construct. Students surprisingly don’t know how to take them either (e.g., they are always surprised when I tell them that switching answers is the better move; I remember being told that “my gut answer” was likely right and that’s just mistaken).

        I use a combination of MCQs and essays. They generally do worse on the MCQ part, but as a whole things seem to be getting better.

    • djw says:

      He would LIKE to be able to make his tests all essays and short answers, but his department forbids it, on the grounds that failing that many freshmen is untenable

      I find the presumption that essay exams automatically lead there, while MC do not, quite odd–and, indeed, counterintuitive. My experiences as a TA don’t support it. If someone writes shitty essays, you can always grit your teeth and give them a C-. A 16/50 on an MC test, though, seems harder to make appear worthy of a passing grade.

      • Increase Mather says:

        Which no doubt is the counter-argument to Erik. Students do better on essays because teachers grade them more leniently. I mean, this girl who struggles to understand the MC question gets an A+ on her essay. (BTW, was the essay written in English? Do they allow them to do otherwise?)

        I’m not saying I agree with them, but this story isn’t going to change many minds.

        • djw says:

          I didn’t mean to suggest it’s a general practice. I meant to suggest that if an authority above the instructor wants to pressure the instructor to pass students regardless of whether they deserve it, essay exams seem much more amenable to such a project.

        • It’s not grade inflation, but a willingness to give credit for what the student *did* achieve. MC tests are all-or-nothing, no partial credit or consideration.

      • John says:

        The one time I graded a class based on multiple choice exams (this was not required, but strongly encouraged by the department; there were 70 students in the class with no TA, the school was a two-hour commute each way, and I was teaching two other classes at another school, so I wasn’t really inclined to give myself more work), I basically just set the grades so that the range of grades would be about the same as in a normal class with essays. Since the median was about 50% correct, this meant accepting some absurdly low grades as passing. But I’m not sure this was any worse than some of the awful essays I’ve given passing grades to.

        And I don’t see how it’s any more artificial than any other grading scale. I have no particular reason to think that the arbitrary 90% A, 80% B, etc., scale has any real relationship to student performance.

        • elm says:

          Yeah. I always curve. Insisting that you must get a 90 to get an A is ridiculous unless you are so super-confident in your ability to design a test where a 90 does indeed match up to an A. Some tests are harder than others, and I’m not going to penalize my students because I wrote a harder test that semester.

          • steverino says:

            Heh. Class size. When in the Navy I was LPO of a 5-man division, and did the training and the testing. Navy requirements strongly favor having some failing scores and some high scores (to demonstrate the test was neither too hard nor too easy). The Chief didn’t do training. I was the instructor. That left three to meet those testing goals. The result was that I gave an honest test and graded as deserved, and gundecked the training report to match the Navy’s goals.

            Also, “principal.”

      • mark f says:

        I had a professor who made it even easier on himself by issuing 200-question MC tests and awarding one point per correct answer, with a maximum possible score of 100.

    • elm says:

      Actually, well-designed multiple choice tests can test things beyond just memorization, at least in political science if not in history. I use them even in my upper level classes: students prefer them, they’re easier for me to grade (though harder to write, but I prefer the latter to the former), and for most students, they do just as good a job (if not better) at discriminating between students of varying quality.

      I do worry that it puts students who are non-native speakers at a disadvantage, but my tests are usually pretty short, so they have plenty of time. I also worry about students who just aren’t good multiple-choice test-takers. There’s definitely a skill involved that is imperfectly correlated (though not uncorrelated, as far as I can tell) with overall academic ability. I do what I can to help these students improve their test-taking skills, but it’s a rare student who does significantly worse in multiple-choice based classes than they do in essay-based classes. (And, there are probably also students who are just not good essay writers and their grades might not adequately reflect their understanding.)

      The biggest issue, as others have raised, is that there tends to be much more grade variance: some students get 100s while others sometimes get 30s or 40s. On essays, that never happens. I try to compensate by not being overly punitive to people who fail. But it is certainly easier to fail an MC exam than an essay. I find it’s also easier to get an A, though. I guess students expect they’ll fall in the latter group rather than the former, and so prefer multiple choice.

      • elm says:

        Oh, and I meant to add: none of this is to imply that standardized testing at the high school level (or earlier) is a good idea. Part of what “a well-designed test” entails is being matched up to the teaching the student received. If someone other than the teacher is designing the test, then the only way to approach that is for the teacher to teach to the test, which is not going to work.

    • Karla says:

      I make the first page of my exams multiple choice, and intentionally make the first one easy as a confidence builder. It’s the second page more students hate, where they demonstrate their understanding of vocabulary and fundamental concepts by fixing false sentences to make them true. There are, of course, multiple routes to correct answers, but these items are perceived as tricky. The rest of the exam is short answer and essay (to be honest, longer short answer).

      ELL students are given more time if they need it.

    • CD says:

      I have never used multiple-choice questions on exams in 20 years of college teaching, including intro courses. Most of my objection is that this kind of question tries to fake students out: a “good” multiple-choice question has plausible wrong answers. I cannot make the turn from trying to help students to trying to bamboozle them.

      And yes, on the current issue, you want to provide students who are not comfortable with the terms the opportunity to show what they can do.

      I was a champ at standardized multiple-choice tests as a kid because I had the fine-grained cultural knowledge to figure out what a question was after and spot the fake-outs. But this is just affirmative action for the culturally privileged.

    • Cody says:

      I always enjoyed History classes with Multiple Choice questions because I’ve never run into one I couldn’t logically reduce down.

      However, I’ve only taken fairly easy history classes (200 level).

      I really liked my Latin American class, where every question was obvious….

      • Bill Murray says:

        I hear you on this. MC tests are by far the easiest for me. Made taking the ACT and SAT easy. Well, except for one section of the ACT where I had kind of fallen asleep because of our end-of-season football party the night before

  3. David Kaib says:

    The Prospect has a great piece on education and segregation.

    Politicians and experts typically refer to schools as “failing” if they are filled with low-income children with low test scores. Faced with enormous challenges, such schools may be doing as well as they possibly can, though. African American children from low-income urban families often suffer from health problems that lead to school absences; from frequent or sustained parental unemployment that provokes family crises; from rent or mortgage defaults causing household moves that entail changes of teachers and schools, with a resulting loss of instructional continuity; and from living in communities with high levels of crime and disorder, where schools spend more time on discipline and less on instruction and where stress interferes with academic achievement. With school segregation continuing to increase, these children are often isolated from the positive peer influences of middle-class children who were regularly read to when young, whose homes are filled with books, whose environment includes many college-educated professional role models, and whose parents have greater educational experience and the motivation such experience brings as well as the time, confidence, and ability to monitor schools for academic quality.

    The contrast between George and Mitt Romney is really instructive.

  4. Gary K says:

    And I’ll confess, despite having spoken English for 60 years, I don’t know how to answer that question. I’d take a stab at (c), but (d) is pretty appealing too.

  5. I teach immigrant kids, at the middle school level, and you are so right.

  6. jeer9 says:

    I just attended an AP conference this July because our school has decided to move from an Honors program to AP. I disagreed with the change but was overruled by the new principal, who argued unconvincingly that AP courses are strongly preferred by admissions officers at UCs and can make the difference between acceptance and relegation to one’s second or third choice. The argument was poorly supported by statistical evidence and didn’t seem to warrant the change. (It later turned out that one of the major newsweeklies ranked high schools based upon the number of students enrolled in AP courses, and our school had received a bronze star rather than the gold one achieved by our “arch-enemy.” This result could not be tolerated, we would capture another 200 students with a switch, and instructional priorities be damned.)

    That said, I do think the AP essays are scored in a fair fashion but that the M/C component is a nightmare. (I scored 7 out of 10 on a poetry example. Needless to say, a bit of argumentation ensued.) My problem with AP in English is that it focuses on a forty-minute essay and the recognition of literary techniques employed in a work that the student has probably never seen before; it is therefore unlike any blue book midterm or final she will see at university, where students know what they are to be tested upon. I don’t want to denigrate forty-minute essay skills; they are certainly useful. But the Honors course is much more concerned with the development of the skills needed to write an 8 to 10 page paper (honing of thesis, research methods and documentation, subtopics which fully explore the inter-relationship of ideas) that a student is likely to be assigned at a UC. All of my students who chose to take the test last year passed (though a 3 is no longer a high enough score for the UCs; testing company and university work a little tag team on schools and parents regarding admission anxiety and tuition).

    The AYPs are not even close to being an adequate measure of a school’s quality. Both of our children attended my wife’s elementary school which was designated as “failing” because of its intense poverty and large percentage of second language/migrant farm kids. They received a great education. The teacher morale there, however, was very low as year after year teachers were reminded by administrators of their incompetence/ineptitude because kids who had never spoken English in the home weren’t reading at the appropriate second grade level. Pressure to do more to raise those scores was unwavering, and a teacher who was discovered engaged in an art or music project could be expected to be reprimanded immediately and/or written up. Race To The Top indeed.

    * I second everything djw said about the grading of essays above, though in the English scoring of essays, style (8/9) almost always triumphs over well-structured substance (7).

    • mpowell says:

      There used to be an AP in writing. Does it still exist? That’s what you’re looking for. The AP in English is really an AP in literature (which is even its actual name, I think). So they test students’ ability to identify literary techniques. Maybe you don’t think this is something they should be focused on testing, but being forced to examine a previously unread text is a pretty damn good approach to testing understanding of literature, imo.

      • jmack says:

        The non-fiction course is AP Language and Composition. In addition to analyzing texts, students have to evaluate argument, as well as create their own through a more open question. As of a few years ago, they also write a synthesis essay (not unlike the document based question on the AP US exam) where they have to synthesize information from multiple sources into a coherent argument.

  7. mch says:

    I agree with just about everything in this post but want to register one plea: don’t dismiss the value of memorization. Memorizing is a hugely valuable skill and shouldn’t get dismissed just because a-holes confuse it with the fullness of learning. I would venture that one reason Maria could write an A+ essay was that she’d developed both the skills of reasoning and the access to information to support her arguments that memorizing can foster/be part and parcel of. I speak here as a teacher of foreign languages and literature, but a biologist or chemist could offer the same observations. (And history professors I know regularly include some short answer IDs, timeline and/or map questions as preludes to the big essay questions.) Especially when it comes to “the big test,” the thing to do is to give students an opportunity to show what they DO know and the insights they can argue for. You’ve helped them if you’ve previously encouraged and helped them develop, among other things, the capacity to “memorize.”

  8. E-none of the above says:

    — “During the late 19th and 20th centuries…” so 1850-1899, and also 1950-1999 (not accounting for year 0 “AD” or “BC” issues)? Hmm, already test-taker me is wondering if they elided an “early” before their 20th.

    –“…urban immigrants…” so they’re urbanized now, or were urban wherever they came from when they left?

    — “generally supported” so there are polls or truth serums so we know what they actually thought, by 50% or more then? Or are you talking inference from votes or something? Because the latter means, for a lot of that time, women, and for all of that time, minors/apathetics/sickly/jailed/intimidated, did not vote at all, so 50%+1 absolutely cannot be determined.

    — “local political machines that:…” means, no doubt, the inner Solar system (and not the Local Group, of which the Milky Way is a member, albeit just one of several non-remarkable galaxies) locality, and “machine” can only be a judgement-neutral description of any one political organization these immigrants might encounter.

    — of the four choices I am trying to imagine any political organization that might fail to use any of these choices, including as a caution to the prospective voter: as in d), if you don’t choose wisely it’ll be just like the old country again and we don’t want that.

    It’s a terrible question for this English speaker, and even with wild stabs at its poser’s probable intents I cannot see a single given answer as correct. If only there were a way to discuss this, in prose, on the test. Like an essay.

  9. As a physics/math major in college (and later physics grad student) I have literally never, not once, seen a multiple-choice exam, with the exception of the physics GRE. Multiple choice is absolutely worthless when it comes to determining whether the student actually understands the concepts being tested on; if someone accidentally carries a factor incorrectly somewhere, they’re marked just as wrong as someone who literally knows nothing about the subject. Giving multiple choice tests in a science class is just as misguided as giving them in an English or history class.

    • Bijan Parsia says:

      Multiple choice is absolutely worthless when it comes to determining whether the student actually understands the concepts being tested on;

      That’s just nonsense.

      if someone accidentally carries a factor incorrectly somewhere, they’re marked just as wrong as someone who literally knows nothing about the subject.

      If your exam depends heavily on the result of one question then, indeed, this would be a problem.

      But that’s a badly designed exam.

      Giving multiple choice tests in a science class is just as misguided as giving them in an English or history class.

      Which is to say that it’s not necessarily misguided at all.

      I do recommend reading some of the literature. I was similarly confident a few years back, but I’ve changed my mind. It helps to think about what you are trying to achieve with the test and what you’re willing to trade off.

  10. Lit3Bolt says:

    NCLB was designed to set impossible, arbitrary standards for public schools, force them to get arbitrary “failing grades” that would cut off funding, and then have local “education reform” companies take over the school, siphon all funding into administrative costs, and eventually the school would shut down entirely due to parent, teacher, and student avoidance.
