Journal of Educational and Behavioral Statistics, Nov 3, 2016
Causal mediation analysis is the study of mechanisms: variables measured between a treatment and an outcome that partially explain their causal relationship. The past decade has seen an explosion of research in causal mediation analysis, resulting in both conceptual and methodological advancements. However, many of these methods have been out of reach for applied quantitative researchers, due to their complexity and the difficulty of implementing them in standard statistical software distributions. The mediation package in R provides a set of simple commands that execute some of the newer causal mediation methods. This article will summarize some of the recent advances in mediation analysis, critically review the mediation package, and demonstrate, by example, some of its capabilities.
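As a rough illustration of what "partially explain" means here, the sketch below decomposes a total effect into an indirect (mediated) part and a direct part using two linear regressions. It is not the mediation package's interface, which is in R; the simulated data, the coefficient values, and the helper `ols` are all assumptions made for the example, and the decomposition only has a causal reading under strong assumptions (linear models, no treatment-mediator interaction, no unmeasured confounding of the mediator).

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data (hypothetical): binary treatment T, mediator M, outcome Y.
n = 2000
a_true, b_true, direct_true = 0.5, 0.8, 0.3
T = rng.integers(0, 2, size=n).astype(float)
M = a_true * T + rng.normal(size=n)
Y = direct_true * T + b_true * M + rng.normal(size=n)


def ols(columns, y):
    """Least-squares coefficients, intercept first, then one per column."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


a_hat = ols([T], M)[1]                  # effect of treatment on mediator
_, direct_hat, b_hat = ols([T, M], Y)   # direct effect, mediator slope
total_hat = ols([T], Y)[1]              # total effect of treatment on outcome

indirect_hat = a_hat * b_hat            # product-of-coefficients estimate

print(f"total    = {total_hat:.3f}")
print(f"direct   = {direct_hat:.3f}")
print(f"indirect = {indirect_hat:.3f}")
```

With ordinary least squares fit on the same sample, the total effect equals the direct effect plus the product of coefficients exactly, which is why the three printed numbers add up. Methods of the kind the abstract refers to relax the linearity assumption that this sketch relies on.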
Conventionally, regression discontinuity analysis contrasts a univariate regression's limits as its independent variable, R, approaches a cut-point, c, from either side. Alternative methods target the average treatment effect in a small region around c, at the cost of an assumption that treatment assignment, I[R < c], is ignorable vis-à-vis potential outcomes. Instead, the method presented in this paper assumes Residual Ignorability, ignorability of treatment assignment vis-à-vis detrended potential outcomes. Detrending is effected not with ordinary least squares but with MM-estimation, following a distinct phase of sample decontamination. The method's inferences acknowledge uncertainty in both of these adjustments, despite its applicability whether R is discrete or continuous; it is uniquely robust to leading validity threats facing regression discontinuity designs.
A scientific community can be modeled as a collection of epistemic agents attempting to answer questions, in part by communicating about their hypotheses and results. We can treat the pathways of scientific communication as a network. When we do, it becomes clear that the interaction between the structure of the network and the nature of the question under investigation affects epistemic desiderata, including accuracy and speed to community consensus. Here we build on previous work, both our own and others’, in order to get a firmer grasp on precisely which features of scientific communities interact with which features of scientific questions in order to influence epistemic outcomes. Here we introduce a measure on the landscape meant to capture some aspects of the difficulty of answering an empirical question. We then investigate both how different communication networks affect whether the community finds the best answer and the time it takes for the community to reach consensus on ...
Supplemental material, sj-pdf-1-cjn-10.1177_0844562120927535 for A Cross-Sectional Exploration of Cytokine–Symptom Networks in Breast Cancer Survivors Using Network Analysis by Ashley Henneghan, Michelle L. Wright, Garrett Bourne and Adam C. Sales in Canadian Journal of Nursing Research
The proliferation of computerized technology in education has slowly been followed by a (smaller) proliferation of evaluations of educational technology. These randomized studies are entirely classical. However, they produce an entirely new type of data as a byproduct: computer log data from subjects assigned to the treatment condition. For instance, in an effectiveness trial of the Cognitive Tutor Algebra I (CTA1) curriculum the researchers collected log data from students in the treatment group. What insights about the CTA1 effect may be extracted from this rich supplementary dataset? This paper will compare and contrast three different causal techniques in three parallel analyses of the CTA1 dataset. Specifically, we will examine the role of hints in CTA1's effect. One approach discards the control group (and hence the randomization) and analyzes usage and outcome data in the treatment group as an observational study. Another, causal mediation analysis, contextualizes the ef...
Randomized A/B tests in educational software are not run in a vacuum: often, reams of historical data are available alongside the data from a randomized trial. This paper proposes a method to use this historical data, often high-dimensional and longitudinal, to improve causal estimates from A/B tests. The method proceeds in three steps: first, fit a machine learning model to the historical data predicting students’ outcomes as a function of their covariates. Then, use that model to predict the outcomes of the randomized students in the A/B test. Finally, use design-based methods to estimate the treatment effect in the A/B test, using prediction errors in place of outcomes. This method retains all of the advantages of design-based inference, while, under certain conditions, yielding more precise estimators. This paper gives a theoretical condition under which the method improves statistical precision, and demonstrates it using a deep learning algorithm to help estimate effects in a se...
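The steps described in this abstract can be sketched roughly as follows. This is a minimal illustration on simulated data, not the paper's implementation: a least-squares fit stands in for the machine learning model (the abstract mentions deep learning), the difference in means with a Neyman-style standard error stands in for the design-based estimator, and every name, sample size, and coefficient below is an assumption made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)


def fit_linear(X, y):
    """Least-squares fit with intercept; a stand-in for the ML model."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predictions from the fitted stand-in model."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef


def diff_in_means(values, treated):
    """Difference in means with a conservative Neyman-style standard error."""
    t, c = values[treated], values[~treated]
    est = t.mean() - c.mean()
    se = np.sqrt(t.var(ddof=1) / len(t) + c.var(ddof=1) / len(c))
    return est, se


# Hypothetical data: a large historical sample and a small randomized trial.
n_hist, n_rct, true_effect = 5000, 200, 0.3
beta = np.array([1.0, -0.5, 0.8])
X_hist = rng.normal(size=(n_hist, 3))
y_hist = X_hist @ beta + rng.normal(scale=0.5, size=n_hist)

X_rct = rng.normal(size=(n_rct, 3))
treated = np.zeros(n_rct, dtype=bool)
treated[rng.choice(n_rct, size=n_rct // 2, replace=False)] = True
y_rct = X_rct @ beta + true_effect * treated + rng.normal(scale=0.5, size=n_rct)

# Step 1: fit the outcome model on historical data only.
coef = fit_linear(X_hist, y_hist)

# Step 2: predict outcomes for the randomized students.
errors = y_rct - predict_linear(coef, X_rct)

# Step 3: estimate the effect from prediction errors in place of outcomes.
raw_est, raw_se = diff_in_means(y_rct, treated)
adj_est, adj_se = diff_in_means(errors, treated)

print(f"unadjusted: {raw_est:.3f} (SE {raw_se:.3f})")
print(f"adjusted:   {adj_est:.3f} (SE {adj_se:.3f})")
```

Because the model is fit only on the historical data, the predictions do not depend on treatment assignment, so swapping outcomes for prediction errors leaves the randomization-based justification intact. The precision gain in this sketch depends on how well the historical model predicts outcomes in the trial sample.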
Researchers faced with a sequence of candidate model specifications must often choose the best specification that does not violate a testable identification assumption. One option in this scenario is sequential specification tests: hypothesis tests of the identification assumption over the sequence. Borrowing an idea from the change-point literature, this paper shows how to use the distribution of p-values from sequential specification tests to estimate the point in the sequence where the identification assumption ceases to hold. Unlike current approaches, this method is robust to individual errant p-values and does not require choosing a test level or tuning parameter. This paper demonstrates the method's properties with a simulation study, and illustrates it by application to the problems of choosing a bandwidth in a regression discontinuity design while maintaining covariate balance and of choosing a lag order for a time series model.
Principal stratification (PS), which measures variation in a causal effect as a function of post-treatment variables, can have wide applicability in educational data mining. Under the PS framework, researchers can model the effect of an intelligent tutor as a function of log data, can account for attrition, and study causal mechanisms. Participants in this tutorial will learn how and when PS works and doesn’t work, and will learn three methods of estimating principal effects.
1. PRINCIPAL STRATIFICATION IN EDM RESEARCH
Educational data miners are increasingly interested in causal questions: what interventions work, for whom, and how. Accompanying this interest is the widespread realization that there is no such thing as “the effect”: actually, effects can vary widely between individuals. Estimating the differences in effects between types of learners is (in principle) straightforward for types defined prior to the onset of an experiment. But what about learners who use the software i...
This paper reports an application to educational intervention of Principal Stratification, a statistical method for estimating the effect of a treatment even when there are different rates of dropout in experimental and control conditions. We consider the potential value of using principal stratification to identify “Tough Love Interventions”: interventions that have a large effect but also increase the propensity of students to drop out. This method allowed us to generate an estimate of the treatment effect in an RCT without the selection bias induced by differential attrition by restricting analysis to just the inferred “stratum” of students who would not drop out in either condition. This paper provides a case study of how to appropriate the method of principal stratification from statistics and medical research fields to educational data mining, where it has been largely absent despite increasing relevance to online learning.
The Cognitive Tutor (Anderson, Corbett, Koedinger, and Pelletier, 1995) is a piece of software designed to teach math, alongside traditional teachers. In the second year of a large-scale effectiveness study of the Cognitive Tutor Algebra I (CTAI) curriculum, the intervention had a moderate positive effect on high school post-test scores (Pane, Griffin, McCaffrey, and Karam, 2014). One of CTAI’s mechanisms is “mastery learning” (Bloom, 1968): the software estimates students’ skill mastery after each worked problem, and (ideally) only advances them to the next section after they have mastered all of the current section’s skills. When each student advances at his or her own pace, academically diverse students can learn together in the same classroom. What role did mastery learning play in CTAI’s successes? In practice, students will sometimes exhaust all of the problems in a section without mastering its skills, in which case they are “promoted” to the next section. Did students who wer...
The ASSISTments online homework tool includes a platform, called the TestBed, on which education researchers can propose experimental modifications of specific modules within ASSISTments. Then, students whose teachers assign those modules are individually randomized between treatment conditions. In one RCT, 614 students working on a Pythagorean Theorem module were randomized to receive hints either as text or as videos. Researchers may estimate the effect of hint type on module completion rates by comparing outcomes between the randomized treatment groups, and may adjust those comparisons with provided covariates, such as prior problem correctness and completion rates. However, much more “auxiliary” data is available for estimating causal effects. Hundreds of thousands of students have used ASSISTments. Could we use data from the “remnant” from the experiment, i.e. students who were not randomized to either condition, to increase the precision of effect estimates? ASSISTments gathers...
The Cognitive Tutor Algebra I (CTAI) curriculum, which includes both textbook and online components, has been shown to boost student learning by about 0.2 standard deviations in a randomized effectiveness trial. Students who were assigned to the experimental condition varied substantially in how, and how much, they used the online component of CTAI, but original analyses of the experimental data focused on estimating average effects, and did not examine whether the CTAI treatment effect varied by the amount or style of usage. This study leverages log data from the experiment to present a more nuanced analysis. It uses the framework of Principal Stratification, which estimates the varying CTAI treatment effect as a function of “potential” usage: either how students used the program, or how they would have used it had they been assigned to the treatment condition. With experimental data, Principal Stratification does not require that we assume that all relevant variables have been measu...
Cognitive Tutor Algebra I (CTAI), published by Carnegie Learning, Inc., is an Algebra I curriculum, including both textbook components and an automated computer application that is designed to deliver individualized instruction to students. A recent randomized controlled effectiveness trial found that CTAI increased students’ test scores by about 0.2 standard deviations. However, the study raised a number of questions, in the form of evidence for treatment effect heterogeneity. The experiment generated student log data from the computer application. This study attempts to use that data to shed light on CTAI’s causal mechanisms, via principal stratification. Principal strata are categories of both treatment and control students according to their potential CTAI usage; they allow researchers to estimate differences in treatment effect between usage subgroups. Importantly, randomization satisfies the principal stratification identification assumptions. We present the results of our first ...
The design of the Cognitive Tutor Algebra I (CTA1) intelligent tutoring system assumes that students work through sections of material following a pre-specified order, and only move on from one section to the next after mastering the first section’s skills. However, the software gives teachers the flexibility to override that structure, by reassigning students to different sections of the curriculum. Which students get reassigned? Does reassignment hurt student learning? Does it help? This paper used data from the treatment arm of a large effectiveness study of the CTA1 curriculum to estimate the effects of reassignment on students’ scores on an Algebra I posttest. Since reassignment is not randomized, we used a multilevel propensity score matching design, along with assessments of sensitivity to bias from unmeasured confounding, to estimate the effects of reassignment. We found that reassignment reduces posttest scores by roughly 0.2 standard deviations, about the same as the overa...
Randomized controlled trials (RCTs) are increasingly prevalent in education research, and are often regarded as a gold standard of causal inference. Two main virtues of randomized experiments are that they (1) do not suffer from confounding, thereby allowing for an unbiased estimate of an intervention’s causal impact, and (2) allow for design-based inference, meaning that the physical act of randomization largely justifies the statistical assumptions made. However, RCT sample sizes are often small, leading to low precision; in many cases RCT estimates may be too imprecise to guide policy or inform science. Observational studies, by contrast, have strengths and weaknesses complementary to those of RCTs. Observational studies typically offer much larger sample sizes, but may suffer confounding. In many contexts, experimental and observational data exist side by side, allowing the possibility of integrating “big observational data” with “small but high-quality experimental data” to get...
Mastery learning, the notion that students learn best if they move on from studying a topic only after having demonstrated mastery, sits at the foundation of the theory of intelligent tutoring. This paper is an exploration of how mastery learning plays out in practice, based on log data from a large randomized effectiveness trial of the Cognitive Tutor Algebra I (CTAI) curriculum. We find that students frequently progressed from CTAI sections they were working on without demonstrating mastery and worked units out of order. Moreover, these behaviors were substantially more common in the second year of the study, in which the CTAI effect was significantly larger. We explore the various ways students departed from the official CTAI curriculum, focusing on heterogeneity between years, states, schools, and students. The paper concludes with an observational study of the effect on post-test scores of teachers reassigning students out of their current sections before they mastered the requ...
Journal of Educational and Behavioral Statistics, Nov 3, 2016
Causal mediation analysis is the study of mechanisms-variables measured between a treatment and a... more Causal mediation analysis is the study of mechanisms-variables measured between a treatment and an outcome that partially explain their causal relationship. The past decade has seen an explosion of research in causal mediation analysis, resulting in both conceptual and methodological advancements. However, many of these methods have been out of reach for applied quantitative researchers, due to their complexity and the difficulty of implementing them in standard statistical software distributions. The mediation package in R provides a set of simple commands that execute some of the newer causal mediation methods. This article will summarize some of the recent advances in mediation analysis, critically review the mediation package, and demonstrate, by example, some of its capabilities.
Conventionally, regression discontinuity analysis contrasts a univariate regression's limits as i... more Conventionally, regression discontinuity analysis contrasts a univariate regression's limits as its independent variable, R, approaches a cut-point, c, from either side. Alternative methods target the average treatment effect in a small region around c, at the cost of an assumption that treatment assignment, I [R < c], is ignorable vis a vis potential outcomes. Instead, the method presented in this paper assumes Residual Ignorability, ignorability of treatment assignment vis a vis detrended potential outcomes. Detrending is effected not with ordinary least squares but with MM-estimation, following a distinct phase of sample decontamination. The method's inferences acknowledge uncertainty in both of these adjustments, despite its applicability whether R is discrete or continuous; it is uniquely robust to leading validity threats facing regression discontinuity designs.
The accuracy of the Content should not be relied upon and should be independently verified with p... more The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden.
A scientific community can be modeled as a collection of epistemic agents attempting to answer qu... more A scientific community can be modeled as a collection of epistemic agents attempting to answer questions, in part by communicating about their hypotheses and results. We can treat the pathways of scientific communication as a network. When we do, it becomes clear that the interaction between the structure of the network and the nature of the question under investigation affects epistemic desiderata, including accuracy and speed to community consensus. Here we build on previous work, both our own and others’, in order to get a firmer grasp on precisely which features of scientific communities interact with which features of scientific questions in order to influence epistemic outcomes.Here we introduce a measure on the landscape meant to capture some aspects of the difficulty of answering an empirical question. We then investigate both how different communication networks affect whether the community finds the best answer and the time it takes for the community to reach consensus on ...
Supplemental material, sj-pdf-1-cjn-10.1177_0844562120927535 for A Cross-Sectional Exploration of... more Supplemental material, sj-pdf-1-cjn-10.1177_0844562120927535 for A Cross-Sectional Exploration of Cytokine–Symptom Networks in Breast Cancer Survivors Using Network Analysis by Ashley Henneghan, Michelle L. Wright, Garrett Bourne and Adam C. Sales in Canadian Journal of Nursing Research
The proliferation of computerized technology in education has slowly been followed by a (smaller)... more The proliferation of computerized technology in education has slowly been followed by a (smaller) proliferation of evaluations of educational technology. These randomized studies are entirely classical. However, they produce an entirely new type of data as a byproduct: computer log data from subjects assigned to the treatment condition. For instance, in an effectiveness trial of the Cognitive Tutor Algebra I (CTA1) curriculum the researchers collected log data from students in the treatment group. What insights about the CTA1 effect may be extracted from this rich supplementary dataset? This paper will compare and contrast three different causal techniques in three parallel analyses of the CTA1 dataset. Specifically, we will examine the role of hints in CTA1's effect. One approach discards the control group--and hence, the randomization--and analyzes usage and outcome data in the treatment group as an observational study. Another, causal mediation analysis, contextualizes the ef...
Randomized A/B tests in educational software are not run in a vacuum: often, reams of historical ... more Randomized A/B tests in educational software are not run in a vacuum: often, reams of historical data are available alongside the data from a randomized trial. This paper proposes a method to use this historical data–often highdimensional and longitudinal–to improve causal estimates from A/B tests. The method proceeds in two steps: first, fit a machine learning model to the historical data predicting students’ outcomes as a function of their covariates. Then, use that model to predict the outcomes of the randomized students in the A/B test. Finally, use design-based methods to estimate the treatment effect in the A/B test, using prediction errors in place of outcomes. This method retains all of the advantages of design-based inference, while, under certain conditions, yielding more precise estimators. This paper will give a theoretical condition under which the method improves statistical precision, and demonstrates it using a deep learning algorithm to help estimate effects in a se...
Researchers faced with a sequence of candidate model specifications must often choose the best sp... more Researchers faced with a sequence of candidate model specifications must often choose the best specification that does not violate a testable identification assumption. One option in this scenario is sequential specification tests: hypothesis tests of the identification assumption over the sequence. Borrowing an idea from the change-point literature, this paper shows how to use the distribution of p-values from sequential specification tests to estimate the point in the sequence where the identification assumption ceases to hold. Unlike current approaches, this method is robust to individual errant p-values and does not require choosing a test level or tuning parameter. This paper demonstrates the method's properties with a simulation study, and illustrates it by application to the problems of choosing a bandwidth in a regression discontinuity design while maintaining covariate balance and of choosing a lag order for a time series model.
Principal stratification (PS), which measures variation in a causal effect as a function of post-... more Principal stratification (PS), which measures variation in a causal effect as a function of post-treatment variables, can have wide applicability in educational data mining. Under the PS framework, researchers can model the effect of an intelligent tutor as a function of log data, can account for attrition, and study causal mechanisms. Participants in this tutorial will learn how and when PS works and doesn’t work, and will learn three methods of estimating principal effects. 1. PRINCIPAL STRATIFICATION IN EDM RESEARCH Educational data miners are increasingly interested in causal questions—what interventions work, for whom, and how. Accompanying this interest is the widespread realization that there is no such thing as “the effect”: actually, effects can vary widely between individuals. Estimating the differences in effects between types of learners is (in principal) straightforward for types defined prior to the onset of an experiment. But what about learners who use the software i...
This paper reports an application to educational intervention of Principal Stratification, a stat... more This paper reports an application to educational intervention of Principal Stratification, a statistical method for estimating the effect of a treatment even when there are different rates of dropout in experimental and control conditions. We consider the potential value for using principal stratification to identify “Tough Love Interventions” – interventions that have a large effect but also increase the propensity of students to drop out. This method allowed us to generate an estimate of the treatment effect in an RCT without the selection bias induced by differential attrition by restricting analysis to just the inferred “stratum” of students who would not drop out in either condition. This paper provides a case study of how to appropriate the method of principal stratification from statistics and medical research fields to educational data mining, where it has been largely absent despite increasing relevance to online learning.
The Cognitive Tutor (Anderson, Corbett, Koedinger, and Pelletier, 1995) is a piece of software de... more The Cognitive Tutor (Anderson, Corbett, Koedinger, and Pelletier, 1995) is a piece of software designed to teach math, alongside traditional teachers. In the second year of a largescale effectiveness study of the Cognitive Tutor Algebra I (CTAI) curriculum, the intervention had a moderate positive effect on high school post-test scores (Pane, Griffin, McCaffrey, and Karam, 2014). One of CTAI’s mechanisms is “mastery learning” (Bloom, 1968): the software estimates students’ skill mastery after each worked problem, and (ideally) only advances them to the next section after they have mastered all of the current section’s skills. When each student advances at his or her own pace, academically diverse students can learn together in the same classroom. What role did mastery learning play in CTAI’s successes? In practice, students will sometimes exhaust all of the problems in a section without mastering its skills, in which case they are “promoted” to the next section. Did students who wer...
The ASSISTments online homework tool includes a platform, called the TestBed, on which education ... more The ASSISTments online homework tool includes a platform, called the TestBed, on which education researchers can propose experimental modifications of specific modules within ASSISTments. Then, students whose teachers assign those modules are individually randomized between treatment conditions. In one RCT, 614 students working on a Pythagorean Theorem module were randomized to receive hints either as text or as videos. Researchers may estimate the effect of hint type on module completion rates by comparing outcomes between the randomized treatment groups, and may adjust those comparisons with provided covariates, such as prior problem correctness and completion rates. However, much more “auxiliary” data is available for estimating causal effects. Hundreds of thousands of students have used ASSISTments—could we use data from the “remnant” from the experiment, i.e. students who were not randomized to either condition, to increase the precision of effect estimates? ASSISTments gathers...
The Cognitive Tutor Algebra I (CTAI) curriculum, which includes both textbook and online componen... more The Cognitive Tutor Algebra I (CTAI) curriculum, which includes both textbook and online components, has been shown to boost student learning by about 0.2 standard deviations in a randomized effectiveness trial. Students who were assigned to the experimental condition varied substantially in how, and how much, the used the online component of CTAI, but original analyses of the experimental data focused on estimating average effects, and did not examine whether the CTAI treatment effect varied by the amount of style of usage. This study leverages log data from the experiment to present a more nuanced analysis. It uses the framework of Principal Stratification, which estimates the varying CTAI treatment effect as a function of “potential” usage—either how students used the program, or how they would have used it had they been assigned to the treatment condition. With experimental data, Principal Stratification does not require that we assume that all relevant variables have been measu...
Cognitive Tutor Algebra I (CTAI), published by Carnegie Learning, Inc., is an Algebra I curriculu... more Cognitive Tutor Algebra I (CTAI), published by Carnegie Learning, Inc., is an Algebra I curriculum, including both textbook components and an automated, computer application that is designed to deliver individualized instruction to students. A recent randomized controlled effectiveness trial, found that CTAI increased students’ test scores by about 0.2 standard deviations. However, the study raised a number of questions, in the form of evidence for treatmenteffect-heterogeneity. The experiment generated student logdata from the computer application. This study attempts to use that data to shed light on CTAI’s causal mechanisms, via principal stratification. Principal strata are categories of both treatment and control students according their potential CTAI usage; they allow researchers to estimate differences in treatment effect between usage subgroups. Importantly, randomization satisfies the principal stratification identification assumptions. We present the results of our first ...
The design of the Cognitive Tutor Algebra I (CTA1) intelligent tutoring system assumes that students work through sections of material following a pre-specified order, and only move on from one section to the next after mastering the first section’s skills. However, the software gives teachers the flexibility to override that structure, by reassigning students to different sections of the curriculum. Which students get reassigned? Does reassignment hurt student learning? Does it help? This paper used data from the treatment arm of a large effectiveness study of the CTA1 curriculum to estimate the effects of reassignment on students’ scores on an Algebra I posttest. Since reassignment is not randomized, we used a multilevel propensity score matching design, along with assessments of sensitivity to bias from unmeasured confounding, to estimate the effects of reassignment. We found that reassignment reduces posttest scores by roughly 0.2 standard deviations, about the same as the overa...
Randomized controlled trials (RCTs) are increasingly prevalent in education research, and are often regarded as a gold standard of causal inference. Two main virtues of randomized experiments are that they (1) do not suffer from confounding, thereby allowing for an unbiased estimate of an intervention’s causal impact, and (2) allow for design-based inference, meaning that the physical act of randomization largely justifies the statistical assumptions made. However, RCT sample sizes are often small, leading to low precision; in many cases RCT estimates may be too imprecise to guide policy or inform science. Observational studies, by contrast, have strengths and weaknesses complementary to those of RCTs. Observational studies typically offer much larger sample sizes, but may suffer confounding. In many contexts, experimental and observational data exist side by side, allowing the possibility of integrating “big observational data” with “small but high-quality experimental data” to get...
Mastery learning, the notion that students learn best if they move on from studying a topic only after having demonstrated mastery, sits at the foundation of the theory of intelligent tutoring. This paper is an exploration of how mastery learning plays out in practice, based on log data from a large randomized effectiveness trial of the Cognitive Tutor Algebra I (CTAI) curriculum. We find that students frequently progressed from CTAI sections they were working on without demonstrating mastery and worked units out of order. Moreover, these behaviors were substantially more common in the second year of the study, in which the CTAI effect was significantly larger. We explore the various ways students departed from the official CTAI curriculum, focusing on heterogeneity between years, states, schools, and students. The paper concludes with an observational study of the effect on post-test scores of teachers reassigning students out of their current sections before they mastered the requ...
Papers by Adam Sales