Chapter 5
How We Learn About Teacher Learning
Mary M. Kennedy
Michigan State University
Human beings have taught one another for centuries, and for most of that time
everyone invented their own approaches to teaching, without the guidance of
mentors, administrators, teacher educators, or professional developers. Today, teach-
ers receive guidance from almost every corner. They are formally certified to teach,
and once certified, they continue to take additional courses, called professional devel-
opment, or PD, throughout their teaching careers. In addition, states and school dis-
tricts also regulate many aspects of their work through performance appraisals and
student assessments.
This chapter addresses one specific form of guidance: professional development, or PD. The literature on PD has grown substantially over time, and standards for
research have also changed. Twenty years ago, I reviewed studies of PD effectiveness
within math and science education (Kennedy, 1998), limiting my review to studies
that provided evidence of student achievement and that included a comparison
group. I found only 12 such studies, most of which are not acceptable by today’s
standards. Some had very small samples, some did not randomly assign teachers to
groups, and some provided instructional materials as well as PD, so that the effects of
the PD were confounded with the effects of the materials. Since then, the literature
has grown substantially, so that we are able to raise our standards for what counts as
a “good study.” Two years ago, I reviewed 28 studies of PD (Kennedy, 2016), all of
which used random assignment. Since then, even more such studies have been pub-
lished. For this chapter, I now raise my standards again and remove studies that were
based on fewer than 20 teachers.1
PD studies represent our way, as researchers, of learning about teacher learning.
We generate hypotheses about what good teaching consists of, about what teachers
need to learn in order to do good teaching, or about what kind of activities or experi-
ences provoke learning in teachers. Then we try different kinds of interventions to see
how they work. It is not a perfect system, for every PD study simultaneously involves
all three of these types of hypotheses: what teachers need to learn, how they learn, and
how we will know whether they have learned enough. Thus, a given study of PD can
fail if any one of these hypotheses is wrong, and we may not know where our error is.
Furthermore, teachers themselves may learn about teaching independently, in ways
we don’t see. They take formal courses, they read things, they ruminate about their
own experiences, and they seek advice from colleagues. They may even get a brain-
storm about their teaching while watching a movie.
My aim in this chapter is to examine our existing oeuvre of experimental research
on PD both from the standpoint of what we have learned about teacher learning and
from the standpoint of what we have learned about how to learn about teacher learn-
ing—that is, how to design informative studies. I begin with a brief overview of how
we think about teaching as a phenomenon.
But the ideas we form about teaching are naive in the sense that they are formed
without any awareness of what really causes events to turn out as they do. Just as a
child might form the naive conception that the sun circles around the earth, she/he
might form the conception that teaching practice comes naturally, or is effortless,
because teachers always appear to know what to do. Moreover, we remain confident
in our judgments that one teacher is “better” than another.
This is an important preface to any discussion about learning about teaching
because our impressions could be wrong. We may be aware of the effects of a teacher's actions but not of their purposes. We see their actions but not their thoughts,
their goals, their motives, their frustrations. Moreover, we don’t see what they see,
from their vantage point at the front of the classroom and from their vantage point
of trying to lead the class in a particular direction. This lack of awareness, in turn, can
lead us to think that teaching practices come naturally or that the decisions about
“what to do next” are always self-evident in the moment.
I became especially aware of the difference between an observer’s view and a teach-
er’s view in a study of teachers’ in-the-moment decision making (Kennedy, 2005,
2010b). When I asked teachers about discrete actions they took during a lesson, they
nearly always referred to something they saw at that moment. Teachers would say “I
could see that Billy was about to jump out of his seat,” or “I realized I didn’t have
enough handouts to go around,” or “Juan rarely speaks and I wanted to encourage
him.” These conversations reveal a highly contingent aspect of teaching that is quite
different from our naive conceptions of teachers as entirely self-directed and always
knowing what to do next.
We tend to assume that the actions of others are caused entirely by their own character, not by the situations they are confronting. Consider an episode involving a teacher, Katlaski, whose lesson on multiplying fractions involved this computation:
9/1 × 2/3 = 18/3 = 6.
A student proposed a different approach, whose solution would have yielded this computation:
36/4 × 2/3 = 72/12 = 6.
Katlaski knew that the student’s proposal would be too complicated for her stu-
dents to follow, so she immediately faced a dilemma: accept the student’s solution
and solve the problem on the board, even if most students couldn’t follow it, or reject
the student’s solution. In a fluster, she said, “No, that won’t work.”
To an observer, Katlaski's behavior might imply that she did not know her mathematics, but her
real problem was a logistical one of how to respond to a proposal that would be
difficult for her students to follow (Kennedy, 2010a). The term “attribution error”
refers to this tendency to attribute the actions of others to stable personal traits
rather than to the situations in which they find themselves. In Katlaski’s case, the
situation presented something she wasn’t ready for, and an unknowing observer
could easily attribute that error to a lack of sufficient content knowledge. Because
we are all vulnerable to the assumption that teachers always know what to do, we
are especially guilty, even as grown-ups and even as researchers, of attributing
teachers’ actions to their content knowledge or their character traits rather than to
the situations they face.
One useful point of comparison comes from a study by Papay, Taylor, Tyler, and Laski (2016), in which more effective teachers were paired with less effective colleagues and the pairs discussed specific practices in which protégés were known to be less effective, but there were no rules regarding the content or format of these conversations. One teacher might pres-
ent a specific behavior as a requirement, to be done at least twice every day, while
another might cast it as useful in specific types of situations.
This program provides a useful alternative to conventional PD with conventional
curricula: The program had no cost, no formal schedule, and no uniform curriculum.
Formal PD programs have all these things—standards, goals, models of good practice,
admonitions, and so forth—but they also have substantial cost and take up a lot of
teachers’ time. Since virtually all approaches to PD will be more expensive, we should
expect them to demonstrate greater value than this simple “bootstrap” approach.
Table 1 presents average effect sizes for studies using these two kinds of outcomes.3
The top two rows of Table 1 provide two possible benchmark values against which to
compare studies using broader outcome measures. First, we see the average effect size
of all education studies examined by Hill et al. (2008), which was 0.07, and then we
see the effect size of the “bootstrap” program described above, a program that simply
pairs more effective teachers with less effective teachers.
After these rows in Table 1 are two rows showing the average effects of PD pro-
grams, first those that were evaluated with broad achievement tests in mathematics
and language arts, and then those with specialized tests in the sciences or in English
language learning. Finally, to help us understand the value of a second set of out-
comes, the last line of Table 1 shows the average effect size Hill and others found
when their studies used more narrow measures.
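For reference, the effect sizes discussed throughout this chapter are standardized mean differences of roughly the following form (the exact estimators and covariate adjustments vary from study to study):

$$ d = \frac{\bar{Y}_{\text{PD}} - \bar{Y}_{\text{control}}}{SD_{\text{pooled}}} $$

On this metric, the bootstrap program's effect of 0.12 means that students of participating teachers scored, on average, about 0.12 pooled standard deviations higher than students of comparison teachers.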
This comparison is not very encouraging. On average, our myriad PD
programs are almost indistinguishable from Papay et al.’s (2016) inexpensive boot-
strap approach that encourages teachers to help one another. Yet most of these pro-
grams are far more expensive and time-consuming.
Still, the programs gathered here are quite various, and it behooves us to exam-
ine them further to see what else we might learn about the potential for PD to
improve teaching practice. In the remainder of this chapter, I use patterns of pro-
gram effects to address a variety of questions about how PD works and what we
can expect from it.4
Procedures
One stream of research focuses on what teachers are doing, often with little regard to why they do those things or to what their students are doing. This was the first
approach we used to define the practices of teaching. A vocal advocate for this line of
work, Nate Gage (1977) argued that the field needed a scientific basis for what had
previously been thought of as “the art of teaching.” To this end, researchers tried to
partition teaching into a collection of discrete practices and then see which practices
were correlated with student achievement gains. Once they became convinced that
they had identified a set of such behaviors, they began devising PD programs to teach
those behaviors to teachers.
Table 1. Overall Effects of Professional Development
I found seven experimental studies that were designed to prescribe specific things
teachers should do. Most focused on generic teaching practices such as questioning
techniques or management techniques. One (Borman, Gamoran, & Bowdon, 2008)
provided highly specified behavioral guidance on how to implement a new science
curriculum. For instance, here is a passage from the teachers’ manual describing a
single fourth-grade unit (Rot It Right: The Cycling of Matter and the Transfer of Energy.
4th Grade Science Immersion unit, 2006):
• To set the tone for this investigation as an exploration, generate a class discussion and class list about what plants need for growth and development.
• Use the Think Aloud technique to model how to refine a wondering into a good scientific investigation. From the students' list about what plants need, form the question—What effect does sunlight have on radish plant growth and development?
• Continue the Think Aloud to model assembling the Terraqua Columns using proper experimental procedures, and designing an experiment that has only one factor that is varied.
• Have students record and explain their predictions for each set of columns for later reference. (p. 21)
• . . .
Content Knowledge
The second stream of work focuses on teachers’ content knowledge. Interest in
content knowledge arose relatively early in our history of PD research, and it derived
from a study of teaching behaviors (Good & Grouws, 1979). These authors tested a
PD model that stipulated a sequence of lesson segments for mathematics lessons. The
guideline suggested that teachers spend about 8 minutes reviewing concepts that had been covered in earlier lessons.
Program effects
Do these different conceptions of teaching make a difference? The easiest way to
compare these three approaches to PD is to focus on a subset of programs with com-
mon study designs. In this case, I compare 22 programs that all worked with teachers
for a single academic year and used a general achievement test to measure student
achievement. Figure 1 arrays the effects of these programs along a horizontal scale.
Each program is characterized by a single circle. The figure does not include the sci-
ence or English language learner studies, where test metrics can yield different effect
sizes, or the studies that worked with teachers for only a portion of the year.
You can see that program effects are quite various across these studies, and that
many of them appear to be less effective (though more expensive) than the bootstrap
program, whose effect of 0.12 is shown in the top row. However, a larger fraction
(2/3) of the studies in the third group yielded program effects larger than the
0.12 benchmark.5
Figure 1. Distribution of 1-Year Program Effects by Conceptions of Teaching
The third conception also appears to be more widespread, with 13 research groups
testing programs based on this vision of teaching. Thus we might benefit from a
closer look at this approach to PD. Teachers are nearly always aware of multiple
things continuously unfolding in their classrooms. They see that Ronald is confused,
that Juanita is getting restless, that Mark is eager to show off what he has figured out,
that someone has spilled something sticky on the floor near her desk, that the room
is getting too hot because the janitor has not yet fixed the heater, and that the lunch
bell will ring in 10 minutes. Regardless of what teachers plan to do during a given
lesson, their in-the-moment actions are often responses to these in-the-moment
observations. They want to suppress the show-off, calm down the fidgeter, help the
confused student, open the window, and make sure the lesson reaches an appropriate
closure before the bell rings. Then the mess can be cleaned up.
Strategic thinking is not merely about finding the best way to achieve the lesson
goal; it is also about seeing things that might interfere with or facilitate the direction
of the lesson, watching for signs of restlessness or confusion, inventing ways to avoid
or capitalize on these moments, and generally being aware of what all the students are
thinking and doing. Much of the strategically oriented PD had to do with interpret-
ing students’ comments and recognizing signs of confusion or disorientation that
need to be addressed. I suspect that one reason why strategically oriented PD was
more successful is that it helped teachers get better at seeing signals within their own
classrooms.
These programs offer two things that are often missing when programs focus on
procedures or content knowledge. One is classroom artifacts. Many of these pro-
grams rely on videotapes of classroom events, examples of student work, interviews
with children, or other artifacts that demonstrate to teachers the issues on which they
want to focus. Thus, conversations about teaching are not about universal methods
but about interpreting and responding to specific types of situations. Second, the
people who provide this sort of PD tend to be people who have themselves spent a
great deal of time in classrooms and are cognizant of all these nuances of classroom
life. They themselves have an intimate familiarity with the complications of teaching.
They are not merely telling teachers what to do or what to say; they are showing teachers what to look for. This raises a further question about whether such programs can be packaged for wider use.
Procedural Knowledge
In one of the first studies of PD ever conducted, a group of researchers (Anderson,
Evertson, & Brophy, 1979) generated a list of procedures that had been shown to be
related to student learning, converted these into a list of recommended practices, and
accompanied each recommendation with a very brief rationale. For instance, one said,
“The introduction to the lesson should give an overview of what is to come in order
to mentally prepare the students for the presentation.” Another said, “It is also at the
beginning of the lesson that new words and sounds should be presented to the chil-
dren so that they can use them later when they are reading or answering questions.”
The PD itself was remarkably brief, consisting of a single 3-hour orientation during which the principal investigator presented the list as a whole, discussed its use, and
allowed teachers to ask questions. They then asked teachers to try to use these admo-
nitions for the entire school year. The program had a yearlong effect of 0.24 on student achievement, making it the second most effective procedural program shown in Figure 2.
Figure 2. Distribution of Program Effects for Packaged Versus Original Programs
Now for the contrast: A few years after that study was done, another group
(Coladarci & Gage, 1984) took the same list of admonitions and mailed it to a group
of teachers to see if they could get the same effect. I consider this mailed list of admo-
nitions to be a packaged message in part because it is more impersonal but also
because there was no opportunity to discuss or clarify any of the admonitions, help
teachers envision the kind of situations where they would be applicable, or respond
to any questions. This mailed-in program had an effect of −0.04, compared to the
0.24 from the original study.
Strategies
I use the term strategic to distinguish programs that are focused on interpreting
events and adapting instruction to circumstances. In general, strategies are more flex-
ible than procedures, more responsive to unique circumstances, and more responsive
to differences among students.
One of the most effective of these programs (Gersten, Dimino, Jayanthi, Kim, &
Santoro, 2010) introduced first-grade teachers to research findings regarding early
reading instruction, and did this by helping teachers incorporate these findings into
their local lesson plans. Teachers met in groups throughout the academic year to
jointly plan their reading lessons, and throughout these meetings, they were regularly
introduced to new research findings. Each planning meeting followed a four-step
process: First, teachers would report what happened when they implemented their
previously planned lessons. Then they would discuss their newest report on research
findings. In this phase, the group facilitator focused their attention on the central
concepts to make sure everyone understood them. In the third phase, they would
review the publisher’s recommended lesson and discuss its strengths and weaknesses.
Finally, they would work together to design a lesson of their own that incorporated
the research principle they had just read about. Thus, in this PD, even though the
program gave teachers a standardized curriculum, it did so by embedding the content
into the lesson planning process. Each planning group made sense of the findings in
the context of its own classrooms and then directly applied the new knowledge to its next lessons.
There is another program that also taught teachers about findings from reading
research, but instead of working with teachers and helping them incorporate the
findings into their lesson plans, this program packaged the material and presented it
to teachers through a series of daylong seminar sessions, each accompanied by a text-
book. So both programs wanted teachers to get better at teaching language arts, and
both aimed to do so by introducing teachers to research findings in that area. The
first had an effect size of 0.23 and appears in the “strategy” row of Figure 2, while the
second had an effect of 0.05 and appears in the “content knowledge” row of Figure 2.
By definition, strategic programs are less amenable to packaging. They aim to
engage teachers in classroom-based problem-solving and to help them “see” their own
classrooms differently, a goal that seems to require program faculty who have inti-
mate familiarity with classroom life, so much so that they can help their teachers
interpret their own experiences differently.
One program in this group has been working toward standardization for several
years, and its progress might be instructive here. The Cognitively Guided Instruction
program, or CGI, was initially designed and tested by a group of mathematics faculty
and graduate students at the University of Wisconsin in 1989 (Carpenter et al., 1989).
At that time it was a unique program, one of the first programs to move away from
direct instruction and toward strategic thinking. It had a modest effect of 0.13. But the
authors, along with their colleagues and graduate students, continued to use CGI in
their college courses and in local PD programs for many years and to expand its influ-
ence. After about 10 years, they developed a guide for workshop leaders (Fennema,
Carpenter, Levi, Franke, & Empson, 1999). Then, after another 10 years had passed,
the younger generation of CGI mathematics educators (Jacobs, Franke, Carpenter,
Levi, & Battey, 2007) carried out a second experimental test of CGI and achieved a
much higher effect of 0.26. Presumably, this improvement reflected a series of refine-
ments over time as all the members of this group became more familiar with teachers
and their needs.
After that, some members of this group decided to create a formal organization to
provide PD and related services. Called the Teachers Development Group, this orga-
nization sought to further disseminate the concept of CGI by providing written
materials and making PD available on a broader scale. In other words, they sought to
package the CGI program. But large-scale expansion runs the risk of relying on inex-
perienced PD providers who may have neither the personal, situated understanding
of the program that the founders had nor the intimate knowledge of how teachers
responded to CGI.
Now we have yet a third test (Schoen, LaVenia, Tazaz, & Faraina, 2018) of CGI,
this one based on the new packaged version of the program. This new packaged ver-
sion of CGI yielded an average effect of zero.
These three tests of CGI represent three different levels of PD provider experi-
ence. In the first test, yielding an effect of 0.13, the providers had knowledge of stu-
dent learning and had experience teaching teachers in their classes but had no
experience providing a PD program. In the second, yielding an effect of 0.23, pro-
gram staff had knowledge of how children learn as well as more experience providing
PD. But in the third study, yielding an average effect of 0.0, the program had been
packaged for large-scale distribution, and I suspect local providers lack the kind of
intimate familiarity with classroom life that is needed to help teachers alter their
perceptions of their own experiences.
The pairs of outcomes I share here, of course, could reflect nothing more than
ordinary statistical variations. However, the pattern of differential program effective-
ness, across over 20 independent studies, raises important questions about the reli-
ability of program effectiveness and, ultimately, about the value of our PD research if our
findings cannot be reliably expanded or replicated. Even if my hypothesis about
packaging is wrong, we still need to think more about how we define salient program
features that should be part of any replication effort and whether program staff expe-
riences are a necessary “feature” of the program.
Providers of procedural PD assume that if they teach a set of specific procedures to teachers, and teachers implement
them, those specific behaviors will foster student learning. If we teach content knowl-
edge, we are assuming that teachers will be more able, on their own, to teach that
content. But we still know very little about how to actually foster these changes, or
about how much time is needed to foster such change. In an effort to help teachers
make these changes, many PD providers send mentors or coaches into the schools,
people who visit teachers within their classrooms and help them “see” new things and
try new things. Thus, we may think about PD as having a cascading sequence of influences: from the PD itself, to changes in teachers' thinking, to changes in their teaching practices, and ultimately to changes in student learning.
The modal PD program works with teachers throughout a single full aca-
demic year, implying that researchers expect teachers to alter their habits and routines, and to adopt the new recommendations, relatively quickly. But there is another timing problem inherent in this approach to PD:
Researchers typically measure changes in student achievement during that same
academic year. This schedule is popular in part because student achievement is
typically measured by school districts at the end of each academic year. So a PD
provider who comes in, say September, might consider last spring’s school test
as his pretest. Normally, we think of causes as preceding effects, so this schedule
raises a variety of questions: How quickly do we expect teachers to alter their
practice based on what they have just learned? Do we expect them to alter their
methods the next day? Within a week or a month? On the other hand, if change
is slow, and if teachers need time to alter their habits, can we expect to see the
effect of that change on student achievement gains that are measured concurrent
with the treatment itself?
These complications in PD research designs invite questions about the array of program effects, for virtually all of the studies could be underestimating them: Students' annual achievement gains are almost always the result of teaching events that occurred before the PD had exerted its full influence.
But there are also scenarios that would lead us to overestimate effectiveness:
Suppose teachers privately dislike the approaches being taught but comply with
them only to be polite or to get their coaches to leave them alone. If this
occurred, we might see a gain during the program year, but the gain would
reflect compliance rather than genuine learning and it would go away the follow-
ing year.
These scheduling problems provide another example of an area in which we
need to learn more about how to learn about teacher learning, how to design our
studies, and how to map exposure to PD with changes in practice and, in turn,
changes in student learning. Most important, we need to learn more about
whether program effects are sustained over time, and whether they accumulate
over time.
Figure 3. Delayed Effects From Different Approaches to Professional Development
"insights" part, they tested three approaches: Some teachers examined their own stu-
dents’ work, others examined written cases of real teaching episodes, and still others
examined their own experiences as learners. All three approaches had strong effects,
and Figure 4 shows their average effectiveness.
The other program (Allen, Pianta, Gregory, Mikami, & Lun, 2011) consisted of
ongoing consultations between teachers and mentors. Teachers videotaped sample
lessons approximately every 2 weeks and sent their tapes to an online “teaching part-
ner.” Then the two of them would talk about the lesson. The nature of these conver-
sations helps us understand the difference between prescriptions and insights. Instead
of correcting teachers’ behaviors, prescribing recommended practices, or evaluating
what they saw on the video, these mentors used “prompts” to help teachers examine
and think about specific events that had occurred. For instance, a “nice work” prompt
might say, “You do a nice job letting the students talk. It seems like they are really
feeling involved. Why do you think this worked?” And a “consider this” prompt
might look like this:
One aspect of “Teacher Sensitivity” is when you consistently monitor students for cues and when you
notice they need extra support or assistance. In this clip, what does the boy in the front row do that shows
you that he needs your support at this moment? What criteria did you use to gauge when to move on?
Notice that the teaching partner was not directly recommending any specific pro-
cedures or rules for teachers to follow, but there was a set of concepts the mentor
wanted teachers to understand. Teaching partners posed questions that might help
teachers think harder about their classroom experiences, about the relationship
between their own behaviors and the behaviors of their students, and about the
enacted meaning of these concepts.
This kind of conversation, of course, requires that mentors themselves be able to select revealing moments for examination and be able to pose provoca-
tive questions rather than recommend specific behaviors. If such a program wanted to expand, it would not be easy to hire more mentors, or even to define their
selection criteria. As PD providers shift their programs away from procedures and
knowledge and toward strategic thinking, they depend more and more on PD pro-
viders who themselves have enough depth of experience that they can recognize
“teachable moments” within the PD process.
Figure 4. Cumulative Program Effects
Where to next?
The first study I described here (Anderson et al., 1979) was conducted almost 40
years ago, and I suspect it was the first experimental study of PD ever published in an
educational journal. In the intervening years, education researchers have continued
to pursue questions about what makes one teacher better than another, and about
how we can provide guidance that would help teachers improve their practice. Many
of our efforts have been naive in the sense that we thought teaching was much sim-
pler than it has turned out to be.
I sorted these PD programs into three ways of thinking about how to improve
teaching: one focusing on teaching behaviors, one on increasing content knowledge,
and one on strategic thinking. The evidence we have now suggests that the third
approach has had the greatest positive impact on teachers’ effectiveness. Furthermore,
there is some evidence that this approach enables teachers to continue to improve
their own practice independently after the formal PD is finished. I suspect that the
reason for this delayed success has to do with its emphasis on purpose, which in turn
helps teachers function autonomously after the PD providers are gone.
I hope over time it will become customary for PD researchers to follow teachers
for at least one full school year beyond the program’s duration. As Huberman (1993)
pointed out a long time ago, teachers are essentially tinkerers. They are accustomed
to working in isolation, they depend heavily on their own personal innovations, and
they depend on automated habits and routines. It makes sense, then, that they would
need time to incorporate new ideas into their habits and routines. Though a few
studies have followed teachers for a year beyond their treatment, the data shown here
are too skimpy to yield any firm conclusion.
An important remaining problem has to do with replication. The most effective
PD programs appear to be designed and carried out by people who have gained deep
personal knowledge of the intricacies of teaching. The patterns shown above suggest
that their effectiveness is at least partially a function of this intimate knowledge. It is
not clear whether or how PD providers can share this form of knowledge with other
PD providers, thus raising questions about whether these programs can be expanded
very much. We have reached a situation in which our knowledge about how to con-
duct productive PD is increasing but our ability to spread that knowledge is not.
Meantime, teachers are being “treated” with ever-increasing volumes of packaged
PD, at great expense to school districts and with almost no benefit for themselves or
their students.
Notes
1The present population of studies differs from the 2016 review as follows: (a) It excludes
four studies whose samples included fewer than 20 teachers; (b) it removes one study that did
not use random assignment and that I had mistakenly included earlier; (c) it includes four
studies that followed teachers for less than a full academic year (my 2016 criteria required a minimum of a full academic year); and (d) it adds six studies published since that review
was completed.
2Randomization can be done in many ways. Researchers may assign individual teachers,
whole school populations, or subgroups of teachers within schools. Sometimes they solicit
volunteers first and then assign only volunteers to groups. The most common mistake in PD research is for researchers to solicit volunteers for their program and then to seek out a group of seemingly comparable teachers for the comparison. This design overlooks the importance of motivation to learn
as a factor in learning, and I rejected all of the studies based on matched groups.
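To make this design distinction concrete, here is a minimal sketch, in Python, using hypothetical school and teacher identifiers, of within-school random assignment of volunteer teachers. Because both conditions draw from the same pool of volunteers, motivation to learn is balanced across groups by design, which is precisely what matched-comparison designs fail to guarantee.

import random

def assign_within_schools(volunteers, seed=0):
    """Randomly split volunteer teachers into PD and control groups within each school.

    volunteers: dict mapping a school id to a list of volunteer teacher ids.
    Returns a dict mapping each teacher id to "PD" or "control".
    """
    rng = random.Random(seed)
    assignment = {}
    for school, teachers in volunteers.items():
        shuffled = list(teachers)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for teacher in shuffled[:half]:
            assignment[teacher] = "PD"
        for teacher in shuffled[half:]:
            assignment[teacher] = "control"
    return assignment

# Hypothetical example with two schools' worth of volunteers.
volunteers = {"school_A": ["t1", "t2", "t3", "t4"], "school_B": ["t5", "t6", "t7"]}
print(assign_within_schools(volunteers))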
3Readers are referred to my earlier article (Kennedy, 2016) in the Review of Educational Research.
4My aim is not to rank individual programs according to their effectiveness but rather to use outcome patterns to generate hypotheses about teacher learning and about how PD fosters learning.
5I have not formally tested for differences among discrete program effects. Study sample
sizes ranged from 20 to over 400 with more recent studies using larger samples.
Studies Examined
Allen, J. P., Pianta, R. C., Gregory, A., Mikami, A. Y., & Lun, J. (2011). An interaction-based
approach to enhancing secondary school instruction and student achievement. Science,
333, 1034–1037.
Anderson, L. M., Evertson, C. M., & Brophy, J. E. (1979). An experimental study of effective
teaching in first-grade reading groups. Elementary School Journal, 79, 193–223.
Babinski, L., Amendum, S. J., Knotek, S. E., Sanche, M., & Malone, P. (2018). Improving
young English learners’ language and literacy skills through teacher professional develop-
ment: A randomized controlled trial. American Educational Research Journal, 55, 117–143.
Borman, G. D., Gamoran, A., & Bowdon, J. (2008). A randomized trial of teacher devel-
opment in elementary science: First-year achievement effects. Journal of Research on
Educational Effectiveness, 1, 237–264.
Campbell, P. F., & Malkus, N. N. (2011). The impact of elementary mathematics coaches on
student achievement. Elementary School Journal, 111, 430–454.
Carpenter, T. P., Fennema, E., Peterson, P. L., Chiang, C.-P., & Loef, M. (1989). Using
knowledge of children’s mathematics thinking in classroom teaching: An experimental
study. American Educational Research Journal, 26, 499–531.
Coladarci, T., & Gage, N. L. (1984). Effects of a minimal intervention on teacher behavior
and student achievement. American Educational Research Journal, 21, 539–555.
Duffy, G. G., Roehler, L. R., Sivan, E., Rackliffe, G., Book, C., Meloth, M. S., . . . Bassiri,
D. (1987). Effects of explaining the reasoning associated with using reading strategies.
Reading Research Quarterly, 22, 347–368.
Garet, M. S., Cronen, S., Eaton, M., Kurki, A., Ludwig, M., Jones, W., . . . Sztejnberg, L.
(2008). The impact of two professional development interventions on early reading instruc-
tion and achievement. Washington, DC: National Center for Educational Evaluation and
Regional Assistance, Institute of Education Sciences. Retrieved from http://ies.ed.gov/
ncee/pdf/20084031
Garet, M. S., Heppen, J. B., Walters, K., Parkinson, J., Smith, T. M., Song, M., . . . Yang, R.
(2016). Focusing on mathematical content knowledge: The impact of content-intensive teacher
professional development. Washington, DC: U.S. Department of Education.
Garet, M. S., Wayne, A. J., Stancavage, F., Taylor, J., Eaton, M., Walters, K., . . . Doolittle,
F. (2011). Middle school mathematics professional development impact study: Findings after
the second year of implementation. Washington, DC: U.S. Department of Education.
Retrieved from http://ies.ed.gov/pubsearch/pubsinfo.asp?pubid=NCEE20114024
Gersten, R., Dimino, J., Jayanthi, M., Kim, J. S., & Santoro, L. E. (2010). Teacher study
group: Impact of the professional development model on reading instruction and student
outcomes in first grade classrooms. American Educational Research Journal, 47, 694–739.
Glazerman, S., Isenberg, E., Dolfin, S., Bleeker, M., Johnson, A., Grider, M., & Jacobus,
M. (2010). Impacts of comprehensive teacher induction: Final results from a randomized
controlled study. Washington, DC: National Center for Education Evaluation. Retrieved
from https://files.eric.ed.gov/fulltext/ED565837.pdf
Good, T. L., & Grouws, D. A. (1979). The Missouri Mathematics Effectiveness Project:
An experimental study in fourth grade classrooms. Journal of Educational Psychology, 71,
355–362.
Greenleaf, C. L., Litman, C., Hanson, T. L., Rosen, R., Boscardin, C. K., Herman, J.,
. . . Jones, B. (2011). Integrating literacy and science in biology: Teaching and learn-
ing impacts of reading apprenticeship professional development. American Educational
Research Journal, 48, 647–717.
Heller, J. I., Daehler, K. R., Wong, N., Shinohara, M., & Miratrix, L. W. (2012).
Differential effects of three professional development models on teacher knowledge
and student achievement in elementary science. Journal of Research in Science Teaching,
49, 333–362.
Jacobs, V. R., Franke, M. L., Carpenter, T., Levi, L., & Battey, D. (2007). Professional
development focused on children’s algebraic reasoning in elementary school. Journal for
Research in Mathematics Education, 38, 258–288.
Jayanthi, M., Gersten, R., Taylor, M. J., Smolkowski, K., & Dimino, J. (2017). Impact of the
developing mathematical ideas professional development program on grade 4 students’ and
teachers’ understanding of fractions. Washington, DC: National Center for Educational
Evaluation and Regional Assistance.
Matsumura, L. C., Garnier, H. E., Correnti, R., Junker, B., & Bickel, D. D. (2010).
Investigating the effectiveness of a comprehensive literacy coaching program in schools
with high teacher mobility. Elementary School Journal, 111, 35–62.
McMeeking, L. B. S., Orsi, R., & Cobb, R. B. (2012). Effects of a teacher professional devel-
opment program on the mathematics achievement of middle school students. Journal for
Research in Mathematics Education, 43, 159–181.
Myers, C. V., Molefe, A., Brandt, W. C., Zhu, B., & Dhillon, S. (2016). Impact results of
the eMints professional development validation study. Educational Evaluation and Policy
Analysis, 38, 455–476.
Niess, M. (2005). Oregon ESEA Title IIB MSP: Central Oregon Consortium. Corvallis:
Department of Science and Mathematics Education, Oregon State University.
Papay, J. P., Taylor, E. S., Tyler, J. H., & Laski, M. (2016). Learning job skills from colleagues
at work: Evidence from a field experiment using teacher performance data. Cambridge, MA:
National Bureau of Economic Research.
Penuel, W. R., Gallagher, L. P., & Moorthy, S. (2011). Preparing teachers to design sequences
of instruction in earth science: A comparison of three professional development programs.
American Educational Research Journal, 48, 996–1025.
Roth, K. J., Garnier, H. E., Chen, C., Lemmens, M., Schwille, K., & Wickler, N. I. Z. (2011).
Videobased lesson analysis: Effective science PD for teacher and student learning. Journal
for Research in Science Teaching, 48, 117–148.
Sailors, M., & Price, L. R. (2010). Professional development that supports the teaching of
cognitive reading strategy instruction. Elementary School Journal, 110, 301–322.
Santagata, R., Kersting, N., Givvin, K. B., & Stigler, J. W. (2011). Problem implementa-
tion as a lever for change: An experimental study of the effects of a professional devel-
opment program on students’ mathematics learning. Journal of Research on Educational
Effectiveness, 4, 1–24.
Schoen, R. C., LaVenia, M., Tazaz, A. M., & Faraina, K. (2018). Replicating the CGI experi-
ment in diverse environments: Effects of Year 1 on student achievement. Tallahassee: Florida
State University Learning Systems Institute.
Supovitz, J. (2013, April). The linking study: An experiment to strengthen teachers’ engagement
with data on teaching and learning. Paper presented at the American Educational Research
Association conference, San Francisco, CA. Retrieved from https://files.eric.ed.gov/fulltext/ED547667.pdf
References
Allen, J. P., Pianta, R. C., Gregory, A., Mikami, A. Y., & Lun, J. (2011). An interaction-based
approach to enhancing secondary school instruction and student achievement. Science,
333, 1034–1037.
Anderson, L. M., Evertson, C. M., & Brophy, J. E. (1979). An experimental study of effective
teaching in first-grade reading groups. Elementary School Journal, 79, 193–223.
Borman, G. D., Gamoran, A., & Bowdon, J. (2008). A randomized trial of teacher devel-
opment in elementary science: First-year achievement effects. Journal of Research on
Educational Effectiveness, 1, 237–264.
Bredeson, P. V. (2001). Negotiated learning: Union contracts and teacher professional devel-
opment. Education Policy Analysis Archives, 9. Retrieved from https://epaa.asu.edu/ojs/
article/viewFile/355/481
Carpenter, T. P., Fennema, E., Peterson, P. L., Chiang, C.-P., & Loef, M. (1989). Using
knowledge of children’s mathematics thinking in classroom teaching: An experimental
study. American Educational Research Journal, 26, 499–531.
Clark, C. M., & Peterson, P. L. (1986). Teachers’ thought processes. In M. C. Wittrock (Ed.),
Handbook of research on teaching (3rd ed., pp. 255–296). New York, NY: Macmillan.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence
Erlbaum.
Coladarci, T., & Gage, N. L. (1984). Effects of a minimal intervention on teacher behavior
and student achievement. American Educational Research Journal, 21, 539–555.
Fennema, E., Carpenter, T., Levi, L., Franke, M. L., & Empson, S. B. (1999). Children’s
mathematics: Cognitively guided instruction: A guide for workshop leaders. Portsmouth, NH:
Heinemann.
Gage, N. L. (1977). The scientific basis of the art of teaching. New York, NY: Teachers College
Press.
Garet, M. S., Cronen, S., Eaton, M., Kurki, A., Ludwig, M., Jones, W., . . . Sztejnberg, L.
(2008). The impact of two professional development interventions on early reading instruc-
tion and achievement. Washington, DC: National Center for Educational Evaluation and
Regional Assistance, Institute of Education Sciences. Retrieved from http://ies.ed.gov/
ncee/pdf/20084031
Garet, M. S., Heppen, J. B., Walters, K., Parkinson, J., Smith, T. M., Song, M., . . . Yang, R.
(2016). Focusing on mathematical content knowledge: The impact of content-intensive teacher
professional development. Washington, DC: U.S. Department of Education.
Garet, M. S., Wayne, A. J., Stancavage, F., Taylor, J., Eaton, M., Walters, K., . . . Doolittle,
F. (2011). Middle school mathematics professional development impact study: Findings after
the second year of implementation. Washington, DC: U.S. Department of Education.
Retrieved from http://ies.ed.gov/pubsearch/pubsinfo.asp?pubid=NCEE20114024
Garet, M. S., Wayne, A. J., Stancavage, F., Taylor, J., Walters, K., Song, M., . . . Hurlburt, S.
(2010). Middle school mathematics professional development impact study: Findings after the
first year of implementation. Washington, DC: U.S. Department of Education. Retrieved
from http://ies.ed.gov/ncee/pubs/20104009/
Gersten, R., Dimino, J., Jayanthi, M., Kim, J. S., & Santoro, L. E. (2010). Teacher study
group: Impact of the professional development model on reading instruction and stu-
dent outcomes in first grade classrooms. American Educational Research Journal, 47,
694–739.
Good, T. L., & Grouws, D. A. (1979). The Missouri Mathematics Effectiveness Project:
An experimental study in fourth grade classrooms. Journal of Educational Psychology, 71,
355–362.
Heller, J. I., Daehler, K. R., Wong, N., Shinohara, M., & Miratrix, L. W. (2012). Differential
effects of three professional development models on teacher knowledge and student
achievement in elementary science. Journal of Research in Science Teaching, 49, 333–362.
Hill, C. J., Bloom, H. S., Black, A. R., & Lipsey, M. W. (2008). Empirical benchmarks for
interpreting effect sizes in research. Child Development Perspectives, 2, 172–177.
Huberman, M. (1993). The model of the independent artisan in teachers’ professional rela-
tions. In J. W. Little, & M. W. McLaughlin (Eds.), Teachers’ work: Individuals, colleagues,
context (pp. 11–50). New York, NY: Teachers College Press.
Jacobs, V. R., Franke, M. L., Carpenter, T., Levi, L., & Battey, D. (2007). Professional
development focused on children’s algebraic reasoning in elementary school. Journal for
Research in Mathematics Education, 38, 258–288.
Jayanthi, M., Gersten, R., Taylor, M. J., Smolkowski, K., & Dimino, J. (2017). Impact of the
Developing Mathematical Ideas professional development program on grade 4 students’ and
teachers’ understanding of fractions. Washington, DC: National Center for Educational
Evaluation and Regional Assistance.
Kennedy, M. M. (1998). Form and substance in inservice teacher education. Madison:
University of Wisconsin National Institute for Science Education. Retrieved from www.
msu.edu/~mkennedy/publications/valuePD.html
Kennedy, M. M. (2005). Inside teaching: How classroom life undermines reform. Cambridge,
MA: Harvard University Press.
Kennedy, M. M. (2010a). Attribution error and the quest for teacher quality. Educational
Researcher, 39, 591–598.
Kennedy, M. M. (2010b). Teacher assessment and the quest for teacher quality: A handbook. San
Francisco, CA: Jossey Bass.
Kennedy, M. M. (2016). How does professional development improve teaching? Review of
Educational Research, 86, 945–980.
Lortie, D. C. (1975). Schoolteacher: A sociological study. Chicago, IL: University of Chicago
Press.