New-generation teacher evaluation systems invariably contain at least two competing goals: accountability and development. The challenge arises because both goals are important, and both can lead to positive outcomes for students and schools; but when they are applied to the same set of activities, they can mix messages, work at cross-purposes, and cancel out each other’s efforts. This is why time spent on evaluation sometimes serves neither goal.

Observations, in particular, can be used in ways that support and develop teachers or in ways that measure and sort them, but they cannot do both effectively. For example, when evaluators set out to support and develop, they are more likely to tailor the focus of an observation to an individual teacher’s strengths, needs, or interests.

In this situation, they are more likely to provide feedback aimed at areas specific to that teacher (whether or not those areas are emphasized or specified by the rubric or set of indicators that defines excellent teaching in the school, district, or state). Evaluators are also more likely to draw suggestions from a wide range of resources, rather than sticking to those outlined in the official rubric and related documents. This approach makes it harder to generate an objective rating of the teacher’s overall practice or to compare that teacher with others across grades or disciplines, but it may support the individual teacher’s key areas for development.

If, instead, an evaluator wants to measure and sort teachers in order to identify schoolwide trends and ensure everyone is held to the same standards, they are more likely to look for the same things in every classroom observation. They are likely to use the same rubric or checklist of indicators in each observation and to generate feedback and suggestions based on that rubric or set of indicators.

Evaluators might, for example, require all teachers to state and post a learning objective, refrain from cold calling, or try to engage students in peer-led discussions. Though this may not align with an individual teacher’s content- or grade-specific areas for growth, it does generate data that can be compared across classrooms and over time.

Is There a Middle Ground? 

It might seem as though there is a middle ground, a way to measure and sort while also supporting and developing. However, in my work with teachers and administrators implementing new-generation teacher evaluation systems across Connecticut, I have come to believe that even though both approaches are necessary, they cannot be applied simultaneously. In fact, whether observations can hold teachers accountable to high standards of practice or develop the effectiveness of that practice depends on how transparently each goal is addressed.

For example, consider the following situation, described to me by an assistant principal who was assigned to evaluate the English department at a large suburban high school.

Mr. Richards (all names are pseudonyms) took a measure-and-sort approach to classroom observations. He wanted to ensure every teacher had a clear learning objective for each lesson because he believed students need to know what they are learning and why in order to fully engage in a lesson. The indicators on his district’s rubric related to learning objectives required the teacher to post and orally state the objective in student-friendly language for an “effective” rating. For an “exemplary” rating, students had to be able to explain the objective when asked.

Richards decided to stop in to each of his teachers’ classrooms over a two-week period to see whether objectives were regularly being posted and stated. He planned to leverage his good relationships with students by quietly asking one or two what they thought the learning objective was when he dropped in to observe. He hoped to collect objective, comparable data about the whole department that he could use to plan group professional learning experiences and to compare with data from evaluators of other departments.

Ms. Mac, an English teacher in Mr. Richards’ department, had a support-and-develop approach in mind for observations. She was eager to receive feedback that would help her improve differentiation for students, especially as it related to writing instruction. In her own research on writing, she had read that individual goal-setting was important for self-directed student engagement. To work on this, she explained to students that they would use one-on-one conferences to identify a focus area for each writer during independent writing time. Though students all had the same writing assignment, Mac purposefully left the learning objective off the board on writing days to emphasize that each writer had his or her own goal to address.

When Richards stopped in for an unannounced observation, Mac was thrilled that he would have the chance to see her writing and conferencing time in action. She thought of this as some of the most powerful work she did with her students, and it aligned perfectly with her professional goal of better differentiation. Since Mac was not addressing the class, Richards did not hear her state the learning objective, nor did he see one posted on the board. When he asked students what they were learning in the lesson that day, they said there was no lesson because it was a writing day.

Though all students seemed to be engaged in a writing task, including those intensely conferring with Mac at their desks, Richards couldn’t find any of the indicators he had been looking for. He either had to rate her poorly or leave without a rating at all. Mac, who had been hoping for feedback in a particular area, got nothing but a note in her box saying he would try coming back some other time to see her teach.

At the end of the day, Richards didn’t get his objective, reliable data. Mac didn’t get her specific, actionable feedback. In other words, the teacher was not measured, sorted, supported, or developed. Their crossed purposes canceled out the potential benefit of pairing two educators who had each prepared thoughtfully for the observation with the best of intentions.

Shared Goals

Hypothetically, if Richards and Mac’s shared goal had been to measure and sort, Mac might have thought about how to frame individual goals as a learning objective to post for the entire class, and she might have discussed this objective with students, as she did in other lessons. She may not have gotten feedback from her principal exactly as she expected, but she would have ensured that her evaluator had the data needed to make comparisons. And perhaps Mac’s hopes for support and development could have been met outside the evaluation system by a nonevaluative mentor or coach.

On the other hand, if Richards and Mac’s shared goal had been to support and develop, Richards might have met with Mac before the observation to ask (or to ask her to reflectively identify) what he could attend to as a focus during the visit. If there was no time for a pre-observation meeting, Richards might have simply asked why there was no posted objective, so he could decide whether principled practice or simple negligence explained why his expectations were not met. He might also have reached beyond the rubric rows he had in mind to identify other sets of indicators or materials that might support Mac’s goal of differentiation. Richards may not have been able to gather his comparable, objective data points during that visit, but those data could have been gathered during subsequent visits.

Teachers and leaders need to transparently acknowledge the purpose of each visit and discuss what that means for the level of preparation and feedback each expects. If leaders aim to support and develop teachers to address the development goal of teacher evaluation systems, they need to be ready to color outside the lines of ready-made evaluation tools. Principals must: 

  • Ask the teacher what would be most helpful to observe and how it links to student achievement, rather than relying on standardized indicators and tools for a focus. 
  • Discuss what the teacher intended for the lesson, as well as the differences between intentions and reality if they do not match. 
  • Reach beyond evaluation templates and the online vending machine of professional development resources by generating personal suggestions based on a teacher’s specific context and focus areas.

When it’s time to measure and sort for the purpose of accountability, teachers and leaders need to be ready to apply the standard, generic tools across classrooms in ways that mean something to each teacher. Principals must:

  • Discuss what generic indicators of teacher quality should specifically look like in practice for a given grade and content area before the observation.
  • Ensure teachers are aware of the focus area(s) for each visit so that they can attempt to demonstrate what evaluators are coming to see. 
  • Take advantage of the shared language and resources for describing expectations for practice so that teachers can support one another in identifying ways to meet and exceed those expectations.

Rachael Gabriel is an associate professor of literacy education at the University of Connecticut in Storrs, CT.