A
perennial issue at every college and university is how to measure teacher
quality. It is important because it directly influences decisions about
retention, tenure, and promotion. Everyone complains about basing such
decisions on end-of-course evaluations. This column will explore a recent study
by Scott Carrell and James West [1], undertaken at the United States Air Force Academy
(USAFA), that strongly suggests that such evaluations are even less useful than
commonly believed and that the greatest long-term learning does not come from
those instructors who receive the strongest evaluations at the end of the
class.
The study authors chose to measure teacher effectiveness in Calculus I by examining
value added in both Calculus I and Calculus II: comparing student performance
on course assessments for each instructor after controlling for student
background variables, including academic preparation, SAT
verbal and math scores, sex, and race and ethnicity. It is generally
acknowledged that better teachers produce better results in their students, but
this has been extensively studied only among elementary school students, and even there
it is not without its problems. The authors reference a 2010 study by Rothstein
[2] that shows a strong positive correlation between the quality of fifth grade teachers and student performance on assessments taken in fourth grade, suggesting a significant selection bias: The best students seek out the
best teachers. This may be even truer at the university level, where students
have much more control over whose classes they take. For this reason,
Carrell and West were very careful to measure the comparability of the classes.
At USAFA, everyone takes Calculus I, and there is little personal choice in
which section to take, so such selection bias is less likely to occur. The
authors also tested for, and found no evidence of, a backward correlation in which
the instructors whose students did best in Calculus II would also be associated with higher grades in Calculus I.
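To make the value-added approach concrete, here is a minimal sketch of how such an analysis might be set up. The data set, the column names, and the simple fixed-effects regression are hypothetical illustrations, not the authors' actual model:

```python
# Minimal value-added sketch (illustrative only; not the Carrell-West model).
# Assumes a hypothetical DataFrame `df` with one row per student containing:
#   calc1_score     : standardized Calculus I examination score
#   instructor      : Calculus I instructor identifier
#   sat_math, sat_verbal, acad_background : preparation controls
#   sex, race_ethnicity                   : demographic controls
import pandas as pd
import statsmodels.formula.api as smf

def instructor_value_added(df: pd.DataFrame) -> pd.Series:
    """Estimate instructor fixed effects on Calculus I scores after
    controlling for student preparation and background."""
    model = smf.ols(
        "calc1_score ~ C(instructor) + sat_math + sat_verbal"
        " + acad_background + C(sex) + C(race_ethnicity)",
        data=df,
    ).fit()
    # Keep only the instructor fixed-effect coefficients; larger values mean
    # students scored higher than the control variables alone would predict.
    effects = model.params.filter(like="C(instructor)")
    return effects.sort_values(ascending=False)
```

Rerunning the same regression with the Calculus II score as the outcome, while still grouping students by their Calculus I instructor, gives the longer-term component of value added that the study compares against the short-term one.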
The authors had a large sample with which to work: all of the students who
took Calculus I from fall 2000 through spring 2007, more than 10,000 students and 91
instructors. The faculty make-up at USAFA is unusual among post-secondary
institutions. There is a small core of permanent faculty. Only 15% of Calculus
I instructors held the rank of Associate or Full Professor, and only 31% held a
doctorate in mathematics or a mathematical science. Most of the teaching is
done by officers who hold a master’s degree and are doing a rotation through
USAFA. The average teaching experience among all Calculus I
instructors was less than four years. Because of this, the courses are tightly
controlled, which facilitates a careful statistical study. There are common
syllabi and examinations. All instructors get to see the examinations before
they are given so that there is opportunity, if an instructor so wishes, to
“teach to the test,” emphasizing those parts of the curriculum that are known
to be important for the assessment.
Positive
responses to the following prompts from the end-of-course student evaluations all had a
positive influence on student performance in Calculus I, significant at the 0.05 level:
- Instructor’s ability to provide clear, well-organized instruction.
- Value of questions and problems raised by instructor.
- Instructor’s knowledge of course material.
- The course as a whole.
- Amount you learned in the course.
- The instructor’s effectiveness in facilitating my learning in the course.
On the other hand, faculty rank, highest degree, and years of teaching experience were negatively correlated with examination performance in Calculus I but positively correlated with performance in Calculus II. For years of teaching experience, both the negative effect in Calculus I and the positive effect in Calculus II were statistically significant.
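As a rough illustration of what significance at the 0.05 level means here, the sketch below runs a simple Pearson correlation on hypothetical instructor-level data; the column names are invented, and this is a simplification of, not a substitute for, the regression framework the authors used:

```python
# Illustrative significance check on hypothetical instructor-level data.
# `inst` has one row per instructor with columns:
#   years_experience    : instructor's years of teaching experience
#   mean_calc1_residual : mean Calculus I score of that instructor's students,
#                         net of the student-background controls
#   mean_calc2_residual : the same quantity for Calculus II
import pandas as pd
from scipy.stats import pearsonr

def experience_correlations(inst: pd.DataFrame) -> None:
    for outcome in ("mean_calc1_residual", "mean_calc2_residual"):
        r, p = pearsonr(inst["years_experience"], inst[outcome])
        verdict = "significant" if p < 0.05 else "not significant"
        print(f"{outcome}: r = {r:+.2f}, p = {p:.3f} ({verdict} at the 0.05 level)")
```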
The suggested
implication is that less experienced instructors tend to focus on the
particular skills and abilities needed to succeed in the next assessment and
that students like that approach. Experienced instructors may pay more
attention to the foundational knowledge that will serve the student in
subsequent courses, and students appear to be less immediately appreciative of
what these instructors are able to bring to the class.
This
study strongly suggests that end-of-course student evaluations are, at best, an
incomplete measure of an instructor's effectiveness. It also suggests a
long-term weakness of simply preparing students for their next assessment,
though it should be emphasized that this represents merely a guess as to why
less experienced instructors appear to get better performance from their
students in Calculus I assessments.
At
Macalester College, we recognize the importance of student reflection on what
they learned months or years earlier. When a promotion or tenure case comes to
the Personnel Committee, we collect online student evaluations of that faculty
member from all of the students who have taken a course with him or her over
roughly the past five years, combining appraisals of recent and current courses with
longer-term assessments of the effect that instructor has had.
References:
[1] Scott E. Carrell and James E. West. 2010. "Does Professor Quality Matter? Evidence from Random Assignment of Students to Professors." Journal of Political Economy 118 (3): 409–432. Available at http://www.nber.org/papers/w14081
[2] Jesse Rothstein. 2010. "Teacher Quality in Educational Production: Tracking, Decay, and Student Achievement." Quarterly Journal of Economics 125 (1): 175–214. Available at http://gsppi.berkeley.edu/faculty/jrothstein/published/rothstein_vam_may152009.pdf