
Guanglei Hong
University of Michigan



Causal inference for multi-level observational data with applications to educational research



FINAL REPORT:

Rubin's (1978) potential-outcomes causal framework laid the foundation for conceptualizing causal problems and developing statistical solutions. For simplicity, he presented the framework under the stable unit treatment value assumption (SUTVA). SUTVA assumes that there is a single value of each potential outcome for each experimental unit regardless of what treatments are received by other experimental units, how the treatment is assigned, who delivers the treatment, and the context in which the treatment is delivered. This assumption is hardly tenable when treatment assignment occurs in multi-level settings, partly because resources are shared and competed for within and between organizations, and partly because of agent effects in treatment delivery. Conceptually, there is a distinct set of potential outcomes for a unit corresponding to each possible group composition, agent allocation, and treatment allocation. Also yet to be explored is the applicability of propensity score-based causal inference methods to multi-level data. The purpose of this study is to extend the potential-outcomes causal framework to encompass multi-level data. I handle the multiplicity of potential outcomes by replacing the stable unit treatment value assumption with the exchangeability assumption. I define the causal effects of treatments for three basic types of multi-level experimental designs: multi-site randomized design, cluster randomized design, and joint multi-level randomized design. For the corresponding multi-level observational designs, I investigate the applicability of various propensity score-based causal inference approaches. Using multi-level, longitudinal observational data for illustration, I apply the extended causal framework and the propensity score-based causal inference techniques to an empirical study of kindergarten retention.
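The contrast between SUTVA and the multi-level setting can be written compactly. The notation below is an illustrative assumption, not the notation of the report itself: z denotes a unit's own treatment, and the bold symbols stand for the full treatment allocation, group composition, and agent allocation across all units.

```latex
% Illustrative notation only; symbols are assumptions, not taken from the report.
% Under SUTVA, unit i has a single potential outcome for each treatment z in {0, 1},
% so the unit-level causal effect is a simple difference:
Y_i(z), \qquad \delta_i = Y_i(1) - Y_i(0).

% Without SUTVA, the potential outcome of unit i may depend on the full
% treatment allocation \mathbf{z} = (z_1, \dots, z_N), the group
% composition \mathbf{g}, and the agent allocation \mathbf{a}:
Y_i(\mathbf{z}, \mathbf{g}, \mathbf{a}).
```

Written this way, the number of potential outcomes per unit grows with the number of possible allocations, which is the multiplicity that the exchangeability assumption is meant to handle.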

Retention as a remedial device for poor-performing students has been controversial for decades. I use data from the Early Childhood Longitudinal Study, Kindergarten Cohort (ECLS-K) to investigate the causal effect of kindergarten retention on children's literacy growth. I define the retention effect for a student at risk of repeating kindergarten as the difference between the potential outcomes corresponding to retention and promotion, respectively. In this particular study, I choose propensity score matching as the primary analytic approach to causal inference, evaluate the stability and sensitivity of the analytical results, and generate retention effect estimates using conventional multiple regression methods for comparison. The results show that students at risk of repeating kindergarten are expected to achieve a significantly higher level of literacy growth if they are promoted to the first grade instead. I also demonstrate the inadequacy of conventional multiple regression methods in removing selection bias.
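The general idea of propensity score matching can be illustrated with a minimal single-level sketch. It is not the analysis carried out in the study: the data are simulated rather than drawn from ECLS-K, the multi-level structure is ignored, and the function names, the covariates, and the simulated effect of -2.0 are all hypothetical choices made for illustration. The sketch fits a logistic model for the probability of treatment, matches each treated unit to the control unit with the nearest estimated propensity score, and compares the matched estimate with a naive difference in means.

```python
import numpy as np


def fit_logistic(X, z, n_iter=25):
    """Fit a logistic regression of z on X (intercept added) by Newton-Raphson."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = p * (1.0 - p)
        grad = Xd.T @ (z - p)
        # Tiny ridge term keeps the Hessian numerically invertible.
        hess = (Xd * w[:, None]).T @ Xd + 1e-8 * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def propensity_scores(X, beta):
    """Estimated probability of treatment for each unit."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xd @ beta))


def matched_att(y, z, ps):
    """Effect on the treated via 1-nearest-neighbor matching with replacement."""
    treated = np.flatnonzero(z == 1)
    control = np.flatnonzero(z == 0)
    # Distance in propensity score between every treated and control unit.
    dist = np.abs(ps[treated][:, None] - ps[control][None, :])
    nearest = control[dist.argmin(axis=1)]
    return float(np.mean(y[treated] - y[nearest]))


def simulate(n=4000, true_effect=-2.0, seed=0):
    """Simulated data (hypothetical): units with low x[:, 0] are treated more often."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 2))
    p = 1.0 / (1.0 + np.exp(-(-1.0 - 1.5 * x[:, 0] + 0.5 * x[:, 1])))
    z = rng.binomial(1, p)
    y = 10.0 + 3.0 * x[:, 0] + 1.0 * x[:, 1] + true_effect * z + rng.normal(size=n)
    return x, z, y


if __name__ == "__main__":
    x, z, y = simulate()
    ps = propensity_scores(x, fit_logistic(x, z))
    naive = y[z == 1].mean() - y[z == 0].mean()
    print("naive difference in means:", round(float(naive), 2))
    print("matched estimate:", round(matched_att(y, z, ps), 2))
```

Because treatment in the simulation depends on the same covariates that drive the outcome, the naive difference in means mixes the treatment effect with selection bias, while the matched estimate compares units with similar estimated chances of treatment. A full application would also check covariate balance after matching and account for clustering of students within schools.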



