Progress Testing: Evaluation of Four Years
ORIGINAL RESEARCH
Tomic ER, Martins MA, Lotufo PA, Benseñor IM. Progress testing: evaluation of four years of application in the School of
Medicine, University of São Paulo. Clinics. 2005;60(5):389-96.
BACKGROUND: Progress testing is a longitudinal tool for evaluating knowledge gains during the medical school years.
OBJECTIVES: (1) To implement progress testing as a form of routine evaluation; (2) to verify whether cognitive gain is a
continuous variable or not; and (3) to evaluate whether there is loss of knowledge relating to basic sciences in the final years of
medical school.
METHODS: A progress test was applied twice a year to all students from 2001 to 2004. For each test, the mean percentage score was calculated for each school year, and scores were compared using ANOVA with post hoc Bonferroni tests.
RESULTS: Progress testing was implemented as a routine procedure over these 4 years. The results suggest a cognitive gain from
first to sixth year in all eight tests, as a continuum (P for trend < .0001). Gain was found to be continuous for basic sciences (taught
during the first 2 years), clinical sciences (P < .0001), and clerkship rotation (P < .0001). There was no difference between the
performance of men and women.
CONCLUSION: Progress testing was implemented as a routine, applied twice a year. Data suggest that cognitive gain during
medical training appears to be a continuum, even for basic science issues.
Department of Internal Medicine, Hospital das Clínicas, Faculty of Medicine, University of São Paulo – São Paulo/SP, Brazil.
Email: [email protected]
Received for publication on May 16, 2005. Accepted for publication on July 19, 2005.

One of the most important aspects of medical competence and clinical reasoning is the need for physicians to acquire a great capacity to accumulate information in an organized manner.1-3 This ability must be taught from the outset of medical school. In addition, if so much information is needed for a student to become a good professional, medical schools ought to create specific tools to evaluate the acquisition of knowledge during the school years.

The quantification of knowledge gain over the course of the medical school years is a challenge. Recently, the longitudinal tool of progress testing has been shown to be suitable for application to curricula involving problem-based teaching in a number of medical schools. Progress testing was especially developed to measure cognitive skills in a problem-based learning universe. However, more recent data suggest there is no dilemma regarding its use in a school with a nonproblem-based learning curriculum.4 In fact, a similar testing procedure, named the Quarterly Profile Examination, has been developed in parallel to progress testing at the School of Medicine of the University of Missouri, Kansas City.5

In 2001, the School of Medicine at the University of São Paulo decided to apply progress testing twice a year to all students from first to sixth year as a way of evaluating the learning of cognitive skills in a nonproblem-based curriculum. The present study describes the implementation of progress testing at our school. Adaptations we have made over the last 3 years may be useful as a model for future applications of progress testing in other schools with similar curricula. In this study, we have also tried to address 3 specific issues: (1) to implement progress testing as a form of routine evaluation; (2) to verify whether cognitive gain is a continuous variable or not; and (3) to evaluate whether there is loss of knowledge relating to basic sciences in the final years of medical school.
METHODS

The curriculum of the School of Medicine at the University of São Paulo is divided into 3 cycles, each lasting 2 years. The first 2 years cover basic sciences (anatomy, physiology, cell biology, and others); the next 2 years cover clinical sciences; and the final 2 years consist of clerkship rotation, basically in outpatient clinical facilities, general wards, and emergency rooms. Students rotate through internal medicine, pediatrics, surgery, obstetrics, and gynecology.

Characteristics of progress testing at the School of Medicine, University of São Paulo

Progress testing was introduced to our students in the first semester of 2001. We decided to apply the test only twice a year until we were more familiar with it. In every semester, the students have a class-free day dedicated to evaluating the medical course with staff members. This evaluation is held in seminar format during the morning, while the progress test is given during the afternoon.

In its first application, the test consisted of 130 questions on basic sciences, clinical sciences, and clerkship rotation. The test format was then restructured to include 100 questions, subdivided into 33 questions about basic sciences, 33 about clinical sciences, and 34 about clerkship rotation issues. The number of questions relating to each discipline was calculated on the basis of the number of hours allotted to that discipline in the school curriculum. As the test is applied 1 month before the end of the school semester, we expected, for instance, that a student in the fourth semester would be able to answer all the questions from the first, second, and third semesters, and 80% of the questions relating to the fourth semester.

In Brazil, the most common type of question used in tests to evaluate students is a multiple-choice question with 5 alternatives. Because most staff members are familiar with devising questions in this format, and students are trained to answer such questions, we decided to use only this type. We did not include an alternative in the "I do not know" format, because there is no tradition for the use of this type of question in Brazil. Students would be afraid to say that they did not know how to answer questions about disciplines they had already completed. We did not penalize students who answered questions incorrectly. So, the scores were calculated by adding the number of correct answers in the test, and the results were subsequently presented as percentages. We tried to adapt the format of the test to match the format of our routine tests as a way of decreasing possible dissatisfaction with the implementation of a new test.

The questions used in the test were devised by staff members in all the departments of the medical school and were selected and compiled by the team responsible for organizing the progress testing. The questions could include figures or graphs and had to be restricted to issues that were fundamental to the respective discipline for a future physician.

The first 4 applications of the progress test were nonmandatory. Initially, we tried to discuss its importance with the students. The discussion focused on the fact that the test was not being used for promotional purposes, since each discipline has its own form of evaluating students. After 2 years of testing, we decided that the test should become compulsory and that absent students would have to justify their absence.

After results are released, students have 7 days to register complaints by e-mail. All complaints are analyzed, and the questions with which the students had the greatest problems are disregarded.

Degree of difficulty and discrimination in the tests

In the first 4 tests, we did not include all disciplines, either because they failed to comply with the deadline or because the quality of their questions was unacceptable. However, from the second test onwards, at least 90% of all disciplines were included. There was no difference in the mean degree of difficulty of the questions used in the tests over the years. However, the questions gradually became more discriminative. Thus, in the last 4 tests, the questions had good discrimination power in clinical sciences and clerkship rotation, while still needing some improvement in basic sciences.

We continue to request new questions every semester with the objective of creating a good-quality question bank. After 3 years, we have at least 2000 good-quality questions from all disciplines. However, we are still collecting new questions during the preparatory period before test application.

Statistics

Mean and standard deviation scores for all students were calculated according to gender and school year for each occasion on which the test was applied. Mean scores for each discipline were calculated for all the students according to their school year. Mean scores for the basic course, clinical course, and clerkship rotation years were also calculated. Comparisons between the mean scores for students from first to sixth years on each test occasion were made using ANOVA with post hoc Bonferroni tests.
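The apportionment of the 100 questions in proportion to curricular hours, described in the Methods above, can be illustrated with a short sketch. The disciplines and hour counts below are hypothetical (the paper does not report them), and largest-remainder rounding is one reasonable way to make the per-discipline counts sum exactly to the test length.

```python
# Sketch: split a 100-question test across disciplines in proportion to
# curricular hours. The disciplines and hours are hypothetical examples.
curricular_hours = {
    "anatomy": 300, "physiology": 250, "cell biology": 150,
    "internal medicine": 500, "pediatrics": 300, "surgery": 400,
}

def allocate_questions(hours, total_questions=100):
    """Largest-remainder apportionment: counts sum exactly to total_questions."""
    total_hours = sum(hours.values())
    exact = {d: total_questions * h / total_hours for d, h in hours.items()}
    counts = {d: int(q) for d, q in exact.items()}  # floor of each exact share
    leftover = total_questions - sum(counts.values())
    # give the remaining questions to the largest fractional remainders
    by_remainder = sorted(exact, key=lambda d: exact[d] - counts[d], reverse=True)
    for d in by_remainder[:leftover]:
        counts[d] += 1
    return counts

print(allocate_questions(curricular_hours))
```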
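The paper does not state how its difficulty and discrimination indices were computed. A common convention, assumed in the sketch below, is to take difficulty as the proportion of students answering an item correctly and discrimination as the corrected item-total (point-biserial) correlation; the response matrix here is simulated toy data standing in for a real answer sheet.

```python
import numpy as np

# Rows are students, columns are test items; 1 = correct, 0 = wrong.
rng = np.random.default_rng(0)
responses = (rng.random((500, 100)) < 0.55).astype(float)

total_score = responses.sum(axis=1)

# Difficulty: proportion of students answering each item correctly.
difficulty = responses.mean(axis=0)

# Discrimination: correlation of each item with the total score over the
# remaining items (corrected item-total, i.e., point-biserial), so that an
# item does not inflate its own criterion.
rest_score = total_score[:, None] - responses
discrimination = np.array([
    np.corrcoef(responses[:, j], rest_score[:, j])[0, 1]
    for j in range(responses.shape[1])
])

print(f"mean difficulty: {difficulty.mean():.2f}")
print(f"items with discrimination above 0.20: {(discrimination > 0.20).sum()}")
```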
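A minimal sketch of the comparison described in the Statistics section, under the assumption that "post hoc Bonferroni test evaluation" means pairwise t-tests with Bonferroni-adjusted P values; the percentage scores are simulated, where real data would come from the score sheets of one test occasion.

```python
import numpy as np
from itertools import combinations
from scipy import stats

# Hypothetical percentage scores for one test occasion, grouped by school year.
rng = np.random.default_rng(1)
scores_by_year = {year: rng.normal(35 + 5 * year, 8, size=100)
                  for year in range(1, 7)}

# One-way ANOVA across the six school years.
f_stat, p_anova = stats.f_oneway(*scores_by_year.values())
print(f"ANOVA: F = {f_stat:.1f}, P = {p_anova:.2g}")

# Post hoc pairwise comparisons with Bonferroni correction (15 pairs).
pairs = list(combinations(scores_by_year, 2))
for a, b in pairs:
    _, p = stats.ttest_ind(scores_by_year[a], scores_by_year[b])
    p_adjusted = min(1.0, p * len(pairs))  # Bonferroni: multiply by number of tests
    print(f"year {a} vs year {b}: adjusted P = {p_adjusted:.3g}")
```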
RESULTS

Table 1 shows the student attendance rates in the 8 tests, according to school year. In the first 4 applications, attendance was not compulsory. After the test became mandatory, attendance of first-year students increased, but attendance of final-year students was lower than in the previous tests. Except for the first test, in which attendance was higher for women than for men (P < .0001), there was no difference between genders in the other 3 tests for which attendance was not compulsory.

Figure 1 shows mean percentage scores according to undergraduate year and test number. Sixth-year students performed worse than fifth-year students in 2 of the 4 tests with compulsory attendance. However, the gain in knowledge was still significant (P for trend < .0001) in all tests.

Figures 2, 3, and 4 show the mean percentage scores relating specifically to basic sciences, clinical sciences, and clerkship rotation issues for students from first to sixth year (P for trend < .001). Except for basic sciences in the second test (P = .04) and fourth test (P = .03), the results suggest progressive cognitive improvement over the course of the undergraduate years.

There was no difference in mean percentage score between men and women (data not shown).
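The results above rely on a "P for trend" without naming the procedure. One standard choice, assumed in this sketch, is to regress individual scores on school year and test whether the slope differs from zero; simulated data stand in for the actual test results.

```python
import numpy as np
from scipy import stats

# Simulated pooled (school year, score) pairs for one test occasion.
rng = np.random.default_rng(2)
years = np.repeat(np.arange(1, 7), 100)           # 100 students per year
scores = 35 + 5 * years + rng.normal(0, 8, size=years.size)

# Test for linear trend: regress score on school year; the slope's P value
# plays the role of the "P for trend" reported in the Results.
fit = stats.linregress(years, scores)
print(f"slope = {fit.slope:.2f} points per year, P for trend = {fit.pvalue:.2g}")
```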
Table 1 - Student attendance in the 8 applications of progress testing, according to school year
Figure 1 - Mean scores (%) for all questions, for students from first to sixth year*, according to occasion on which the test was applied (tests 1-8) (for all tests, P for trend < .0001). *For each sequence of columns, the first represents first-year students; the second, second-year students; the third, third-year students; the fourth, fourth-year students; the fifth, fifth-year students; and the sixth, sixth-year students.
Figure 2 - Mean scores (%) for basic science questions, for students from first to sixth year*, according to occasion on which the test was applied (tests 1-8) (tests 1, 3, 5, 6, 7, and 8, P < .0001; test 2, P = .04; test 4, P = .03). *For each sequence of columns, the first represents first-year students; the second, second-year students; the third, third-year students; the fourth, fourth-year students; the fifth, fifth-year students; and the sixth, sixth-year students.
Figure 3 - Mean scores (%) for clinical science questions, for students from first to sixth year*, according to occasion on which the test was applied (tests 1-8) (tests 1 to 8, P < .0001). *For each sequence of columns, the first represents first-year students; the second, second-year students; the third, third-year students; the fourth, fourth-year students; the fifth, fifth-year students; and the sixth, sixth-year students.
Figure 4 - Mean scores (%) for clerkship rotation questions, for students from first to sixth year*, according to occasion on which the test was applied (tests 1-8) (for all tests, P < .0001). *For each sequence of columns, the first represents first-year students; the second, second-year students; the third, third-year students; the fourth, fourth-year students; the fifth, fifth-year students; and the sixth, sixth-year students.
DISCUSSION

Our results suggest a progressive cognitive gain from first to sixth year in all tests. Even for basic sciences, the data suggest a possible continuous cognitive gain over the entire course of the undergraduate years. Men and women had similar performances. Progress testing seems to be a good longitudinal tool for evaluating gain of knowledge at the School of Medicine at the University of São Paulo. After eight applications, the test has been incorporated into the routine of each semester.

The evaluation of progress testing at the School of Medicine of the University of São Paulo differed in a few aspects from the evaluations at McMaster University (Canada) and the University of Maastricht (Netherlands).6,7 The first difference is that we did not use "I don't know" answers, because there is no tradition in Brazil of using this type of alternative.8 Our students would probably be afraid to answer "I don't know," thinking that they could be penalized for doing so. Consequently, the scores from the progress tests at our school were calculated using only the questions answered correctly.

However, the major difference is that in our school, each discipline was responsible for devising questions for progress testing, and the number of questions was calculated on the basis of the number of hours allotted to each discipline in the school curriculum. This is very different from the University of Maastricht, where questions were selected on the basis of a blueprint. Therefore, even though the mean scores of between 50% and 60% obtained at our school are very similar to the results from other schools such as Maastricht (mean score of 58%, calculated from the correct answers alone), it is possible that the results are not comparable.8,9

The other difference is that, since we had no experience with "true or false" questions, we used multiple-choice questions, as used by McMaster University. Through this, we aimed to conduct the progress testing in the manner used for all other evaluations at our school. Results also show that overall knowledge increased uniformly with time as training progressed from first to sixth year; this result is similar to that observed at the University of Maastricht and other institutions.10,11
We had no difficulty in adapting progress testing to a medical school with a traditional curriculum. Although progress testing was created for evaluating schools using problem-based learning, comparisons of student performance between medical schools with and without problem-based learning have shown only small differences. There was no difference in overall cognitive performance between schools using either type of curriculum. When the comparison was divided into 3 categories (basic, clinical, and social sciences), few differences were found.12,13 Students not using problem-based learning scored better in basic sciences, while students using problem-based learning scored better in social sciences.12,13

It is possible to speculate about our data. The change from optional to compulsory attendance brought some modifications in student attendance according to undergraduate year. When the test was optional, students in the final years, who were more concerned with using the test as a training ground for the medical residency admission exams, had the highest attendance. Some first-year students feared that the test results could interfere with their progression to the next school year and consequently did not come to the first tests. After the test became compulsory, several sixth-year students came to the test merely to register their attendance, and either did not answer any questions or answered only the questions relating to clinical rotation issues. This could explain the lower scores among sixth-year students in the fifth and sixth tests. However, the gain of knowledge still appears to be significant. Comparing the curves from the first 4 tests, for which attendance was not compulsory, with those for the last 4 tests, for which attendance was mandatory, there is no great difference. The lower scores in the last 4 tests can be attributed to the change in student attendance, but can also be explained by the greater discriminative power of the questions, which improved over the 4 years of testing. Except for the first test, attendance by men and women was the same.

We understand that having no apparent loss of cognitive gain relating to basic sciences over the school years is good news. One possible explanation for these results is that some basic science issues are revisited during the clinical course and hospital rotations.

In Brazil, more women than men apply for the selection exams to enter medical school each year. However, at our medical school we have more men than women. As men and women have similar performances regarding cognitive knowledge in the medical school, this suggests that there may be some bias in the entrance exams.

Implementation of progress testing at our school was more difficult than expected. Some complaints by the students related to their fear that we were creating a new form of evaluation that could be used in assessments for progression to the next school year. Some students also said that the test stimulated competition between them. Most of the students only accepted the importance of the test when the results were shown to them for the first time. Now, after 4 years, the test has become part of the routine at our school. Students demand detailed comments on all the alternatives for each question after each test, as a way of using the test to help them in their learning.

Elsewhere in Brazil, progress testing has only been applied regularly at the Federal University of São Paulo, once per year for 4 consecutive years. Those results also suggest a progressive gain of knowledge over the years, even for questions about basic sciences.14 Several schools are now implementing progress testing, for example, the Federal University of Minas Gerais, the School of Medicine of Marília, and the State University of Londrina. This will enable future exchange of information.

There are a number of limitations associated with the way in which we implemented progress testing, and these need to be considered. The format of the test, with only 100 questions, means that some disciplines are represented by only a single question or compete for a question with several other disciplines with small representation in the medical curriculum. If the question is unrepresentative or inadequate, the evaluation of the discipline could produce biased results. Another point is that the data might not be representative of all the students because of the lower attendance in the first 4 tests, when the test was not mandatory. Even in the last 4 applications of the test, attendance ranged from 66% to 84%, which allows for the possibility of some kind of selection bias.

The data suggest that cognitive gain appears to be a continuum over the course of the undergraduate years, even for basic sciences. Therefore, progress testing could be used as a further instrument for evaluating gain of knowledge, even considering its limitations. It could also be used for further evaluation of minor or major changes in the school curriculum in the future.
RESUMO

Tomic ER, Martins MA, Lotufo PA, Benseñor IM. Results of eight applications of progress testing at the School of Medicine, University of São Paulo. Clinics. 2005;60(5):389-96.

Progress testing was introduced at the School of Medicine, University of São Paulo, in 2001.

OBJECTIVES: (1) To test the feasibility of routine application of the test; (2) to verify whether the gain of knowledge was progressive and continuous throughout undergraduate training; (3) to determine whether this gain of knowledge also includes the basic science disciplines.

METHODS: The test was applied twice a year between 2001 and 2004. For each test, the mean score of correct answers per school year was calculated, using ANOVA with Bonferroni correction for multiple comparisons.

RESULTS: Progress testing was implemented as a routine between 2001 and 2004. The results suggest a continuous and progressive cognitive gain over the course of undergraduate training (P < .0001) in the eight tests applied to date. This gain was significant even for the basic course disciplines (P < .05), the clinical course (P < .0001), and the clerkship rotation (P < .0001). There was no difference in performance by gender.

CONCLUSION: Progress testing was implemented as a routine, applied twice a year. The results suggest that cognitive gain appears to be continuous and progressive over the 6 years, even for the basic course disciplines.

KEY WORDS: Progress testing. Longitudinal testing. Evaluation. Teaching. Undergraduate medical education.
REFERENCES
1. Waldrop MM. Toward a unified theory of cognition. Science. 1988;241(4861):27-9.

2. Norman GR, Tugwell P, Feightner JW, Muzzin LJ, Jacoby LL. Knowledge and clinical problem-solving. Med Educ. 1985;19(5):344-56.

3. Kassirer JP. Diagnostic reasoning. Ann Intern Med. 1989;110(11):893-900.

4. Van der Vleuten CP, Verwijnen GM, Wijnen WH. Fifteen years of experience with progress testing in a problem-based learning curriculum. Medical Teacher. 1996;18(2):103-9.

5. Arnold L, Willoughby TL. The Quarterly Profile Examination. Acad Med. 1990;65(8):515-6.

6. Blake JM, Norman GR, Keane DR, Mueller CB, Cunnington J, Didyk N. Introducing progress testing in McMaster University's problem-based learning curriculum: psychometric properties and effect on learning. Acad Med. 1996;71(9):1002-7.

7. Boshuizen HP, Van der Vleuten CPM, Schmidt HG, Machiels-Bongaerts M. Measuring knowledge and clinical reasoning skills in a problem-based learning curriculum. Med Educ. 1997;31(2):115-21.

8. Muijtjens AMM, Van Mameren H, Hoogenboom RJI, Evers JLH, Van der Vleuten CPM. The effect of a "don't know" option on test scores: number-right and formula scoring compared. Med Educ. 1999;33(4):267-75.

9. Albano MG, Cavallo F, Hoogenboom R, Magni F, Majoor G, Manenti F, et al. An international comparison of the knowledge levels of medical students: the Maastricht progress test. Med Educ. 1996;30(4):239-45.
10. Vosti KL, Bloch DA, Jacobs CD. The relationship of clinical knowledge to months of clinical training among medical students. Acad Med. 1997;72(4):1097-102.

11. Blake JM, Norman GR, Smith EK. Report card from McMaster: student evaluation at a problem-based medical school. Lancet. 1995;345(8954):899-902.

12. Verhoeven BH, Verwijnen GM, Scherpbier AJJA, Holdrinet RSG, Oeseburg BM, Bulte JA, et al. An analysis of progress test results of PBL and non-PBL students. In: van der Vleuten CPM, Scherpbier AJJA, Wijnen WHFW, editors. Progress testing: the utility of an assessment concept. Groningen: Bas Verhoeven; 2003. p. 125-36.

13. Schmidt HG. Innovative and conventional curricula compared: what can be said about their effects? In: Nooman ZM, Schmidt HG, Ezzat ES, editors. Innovation in medical education: an evaluation of its present status. New York: Springer Publishing Company; 1990. p. 1-7.

14. Borges DR, Stella RCR. Avaliação do ensino de Medicina na Universidade Federal de São Paulo. Rev Bras Educ Med. 1999;23(1):11-7.