This week, our district has about 50 professional educators (teachers) working on redeveloping the Student Learning Objective (SLO) assessments required as part of Georgia's Race to the Top grant. Under our evaluation system, TKES, about 40% of a teacher's evaluation will be based on student growth. Teachers of tested subjects and grades will use standardized tests; in Georgia, that means Math, ELA, social studies, and science in grades 4-8 and eight different End-of-Course Tests in high school. Every other teacher will need a pre/post assessment developed by districts under state DOE guidance. The SLO assessment process has been a challenge because we are trying to serve two masters with these assessments.
First, we are trying to serve the evaluation instrument's need for a measure of student learning over the course of the school year. This raises important considerations around the validity and reliability of the assessments. Although there is a fairly rigorous process in place to attempt to ensure both, there is no item analysis, distractor analysis, or other psychometric evaluation of the assessment items after they have been administered. We try very hard to train teachers in two or three days on how to construct good test items, but questions remain about the value of the data as an accurate measure of student growth over the year. This is in stark contrast to the "tested" subjects, whose industry-created and vetted items are field-tested against huge data sets, with teams of trained psychometricians ensuring the items are valid and reliable. Yet both the "tested" and "non-tested" growth models affect teacher evaluation in the same manner.
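To make concrete what is being skipped: the most basic pieces of post-administration item analysis are item difficulty (the proportion of students answering correctly) and item discrimination (whether students who did well on the rest of the test also tended to get the item right). Here is a minimal sketch of those two statistics, assuming scored 0/1 responses; the function name and data layout are illustrative, not part of any state or vendor tool.

```python
# Minimal item-analysis sketch: difficulty and point-biserial discrimination
# for dichotomously scored (0/1) items. Illustrative only.
from statistics import mean, pstdev

def item_stats(responses):
    """responses: list of per-student lists of 0/1 item scores.
    Returns a (difficulty, discrimination) pair for each item."""
    n_items = len(responses[0])
    stats = []
    for i in range(n_items):
        item = [r[i] for r in responses]
        rest = [sum(r) - r[i] for r in responses]  # total score excluding this item
        difficulty = mean(item)  # proportion of students answering correctly
        sd_i, sd_r = pstdev(item), pstdev(rest)
        if sd_i == 0 or sd_r == 0:
            disc = 0.0  # no variance: discrimination is undefined, report 0
        else:
            # point-biserial = Pearson correlation of item score with rest-of-test score
            cov = mean(x * y for x, y in zip(item, rest)) - mean(item) * mean(rest)
            disc = cov / (sd_i * sd_r)
        stats.append((difficulty, disc))
    return stats
```

An item with near-zero or negative discrimination is exactly the kind of flawed item this process would flag for revision, and it is the kind of check no one runs on the district-built SLO assessments after administration.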
The second master we want to serve with pre/post assessments is the instructional work of teachers in their classrooms. There is significant research supporting the value of pre-assessments and of the instructional decisions made from the data they return. However, that value tends to show up at the unit or lesson level and is significantly less impactful at the course level. We can determine what students know or don't know about course content, yes, but when you attempt to survey all the knowledge and skills of a Chemistry I course with students who arrive with little chemistry background, the test offers limited meaningful data for the teacher. Thus, the SLO assessments have little ability to inform instructional decisions on the fly, or at all beyond the initial work in the first month of school.
This data problem frustrates teachers and administrators alike. We are once again taking at least two days of instruction to collect "autopsy" data that in theory could guide instruction but in practice offers little for teachers in the classroom. Not only are we taking that instructional time away, but we have invested hundreds of thousands of dollars in "professional learning" around building these SLO assessments. There is value in exposing teachers to Webb's Depth of Knowledge and how to use it to match assessment items to standards, and there is value in training teachers in what makes a good assessment. But that value is limited in scope and influence because it is focused on SLO assessment development rather than on an overall assessment plan and vision that shapes instruction for all kids from day to day.
The challenge continues to be how to effectively evaluate the impact of an individual teacher on student achievement and how to isolate that impact from all the other factors that affect student learning. We all agree that great teachers have a significant impact on student learning, but quantifying that impact in a fair and reliable way is difficult, complex work.