By Stuart Singer, The Teacher Leader
Editor’s Note: Stu Singer taught high school mathematics for forty years. He taught a wide variety of math courses including Advanced Placement and International Baccalaureate courses.
Accurately evaluating the mountain of testing statistics produced by students, teachers, schools and districts is a daunting task. Unfortunately, there is no "one size fits all" technique that can precisely determine the level of success or failure of each of those groups. Consequently, a multi-layered approach with varying assessment tools is required.
Apples and oranges
The machinations of Race to the Top, No Child Left Behind and local policy makers notwithstanding, mathematical calculations of academic work are more complicated than a single number on a piece of paper.
During my years as a department chair, I often found myself at statistical odds with well-meaning assistant principals. One constant battle was the discussion of the school's math teachers' "D/F ratio." (For the record, this was not a semantic issue, despite the fact that a percentage measure of the number of D and F grades is a rate, not a ratio. Sadly, many of the mathematicians in the building were guilty of the same mistaken vocabulary.) One AP would always combine an individual's entire teaching schedule to compute the rate of poor grades. I remember one conversation in which she said to me, "I don't understand why your ratio is so low and this person's is three times as high." One reason was that my stats included the yearbook class, where a grade below A was virtually nonexistent. In addition, the two Honors Algebra 2 sections on my schedule made any comparison to a staffer working with four or five classes of Algebra 1 unfair.
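The distortion that comes from pooling a whole schedule is easy to see with a quick calculation. The sketch below uses invented enrollments and grade counts, not figures from the article, purely to illustrate how class mix alone can triple a blended D/F rate.

```python
# Hypothetical numbers showing why pooling a teacher's entire schedule
# makes D/F-rate comparisons misleading (all figures are invented).

def df_rate(sections):
    """Percent of D and F grades across (d_and_f_count, enrollment) pairs."""
    low = sum(df for df, _ in sections)
    total = sum(n for _, n in sections)
    return 100 * low / total

chair = [(0, 25), (3, 30), (3, 30)]   # yearbook plus two honors sections
staffer = [(6, 28)] * 5               # five sections of Algebra 1

print(round(df_rate(chair), 1))       # 7.1
print(round(df_rate(staffer), 1))     # 21.4
```

With these made-up rosters the staffer's blended rate comes out roughly three times the chair's, with no difference in teaching quality assumed anywhere in the numbers.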
Often, separating success from failure requires more than a simple percentage. In many cases a powerful argument could be made that a grade of D+ represents an educational triumph. Consider a student whose academic history tells a story of years of struggle in math, most recently a miserable failure in an Algebra 1 class. The next year, with the constant support of the teacher and a diligent work ethic, this individual, despite a woeful background in the basics of the subject, secures a high D, missing a C by a single point. In too many stat books this is deemed a failure. On that same sheet of data, students who should be earning excellent grades but who, through a lack of motivation or a weak teacher, flounder to a C are listed as statistical successes.
Measuring actual achievement
An example of the potential pitfalls in assessing student and teacher performance can be seen in the Virginia Standards of Learning (SOL) exams, administered in eleven courses as requirements for graduation. In the vast majority of cases, the success or failure of a teacher is based on the percentage of students who pass the exams. In the best case, this number would be carefully considered in the context of multiple factors: the scores of other teachers with the same classes, both within the school and in other buildings in the district; the demographics of the student body in comparison to others in the system; and the students' prior knowledge, found in part in the level of success attained by feeder schools in the pyramid. But even if all of those factors are taken into consideration, there are other important numbers to consider.
It is not as simple as a single number
Once again the Virginia SOLs, tests that have been in place since 1997, can provide insight into the complexity of assessing the performance of a group of students. In virtually every measurement, the primary consideration is the percentage of successes compared to failures. But what does "passing" a test like the SOL actually mean? The definition can be a bit elusive.
The range of possible scores on these exams is 200-600, with a score of 400 dividing success from failure. Scaled scores are produced by norming raw scores (the number of correct answers); on most tests the raw score required for a 400 is between 28 and 30 out of 50 questions, indicating student mastery at a level of only 56% to 60%.
As low as those requirements might appear, they are actually misleading. Ten of these tests (the exception is English Writing) consist of four-option multiple-choice questions with no penalty for wrong answers. Under these conditions, the laws of probability become a powerful component of the scoring. When guessing yields the correct answer at a rate of one in four (25%), "passing" becomes a great deal easier. A student who knows the answers to 22 questions and guesses on the remaining 28 should gain about 7 more correct answers, for an expected 29 correct, a potential pass despite actual mastery of only 44%. Two years ago the raw score required to pass the Algebra 1 exam was set at 25, a number that could easily be reached with less than 40% mastery.
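The guessing arithmetic above can be sketched directly; the function name is illustrative, but the figures (50 questions, four options, no penalty) come from the article.

```python
# Expected raw score on a 50-question, four-option multiple-choice test
# with no penalty for wrong answers.

def expected_raw_score(known, total=50, options=4):
    """Questions answered from mastery plus expected correct guesses."""
    # Each blind guess succeeds 1 time in `options` (here, 1 in 4).
    return known + (total - known) / options

print(expected_raw_score(22))   # 29.0 -- at or above the usual 28-30 cut
print(22 / 50)                  # 0.44 -- the student's true mastery
```

With the Algebra 1 cut set at a raw score of 25, even a student who knows only 18 answers (36% mastery) has an expected raw score of 26, consistent with the article's point that less than 40% mastery can clear the bar.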
A cautionary tale
At one Virginia school, students who wanted to take a two-year advanced Biology course did not take any class in the subject until their junior year. After studying slightly less than half of the curriculum, they would take the Biology SOL exam in May of that first year. In five years of testing, none of these individuals failed the test; more than 60% scored "pass advanced" (500 or higher). Meanwhile, 20% of the students in the full-year regular Biology 1 classes were failing the identical test. Such results, and a study of the math, lead to one conclusion: the Virginia SOL exams are minimum-competency tests that talented students can pass with little preparation.
Measurement requires more than numbers
The lessons from the SOLs are basic. The construction, grading and scope of standardized testing must be incorporated into any assessment of teacher or student performance. The quality of the teacher may play less of a role in passing than the test itself. Likewise, the demographics of a student body can be a critical component of success or failure. In many situations, individuals with a strong academic background can do well on such assessments regardless of the quality of the teacher. When comparing the scores of different schools, ELL population, poverty levels and student mobility must be considered. And when studying an individual student's success, past history can be as important as any current result.
Assessing performance is becoming more and more important in education, but it cannot be captured in a single number.