This thesis concerns doubly robust (DR) estimation in missing data contexts. Previous research is not unanimous as to which estimators perform best and in which situations DR is to be preferred over other estimators. We observe that the conditions under which DR and other estimators are compared vary across previous studies. We therefore focus on the effects of three distinct aspects of study design on the performance of one DR estimator in comparison to outcome regression (OR). These aspects are sample size, the way in which models are misspecified, and the degree of association between the covariates and propensities. We find that while the type of model misspecification has no drastic effect, all three aspects do affect how DR compares to OR. The results can be used to better understand the divergent conclusions of previous research.
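To make the comparison concrete, the following is a minimal sketch of an OR estimator and one common DR estimator (the augmented inverse probability weighting form) for the mean of an outcome that is missing at random. The data-generating process, the function names `fit_logistic` and `or_and_dr_means`, and the choice of the AIPW form are illustrative assumptions; the thesis does not state here which DR estimator it studies. Both working models are correctly specified in this example, so both estimators should land close to the true mean of 1, while the complete-case mean is biased.

```python
import numpy as np


def fit_logistic(X, r, n_iter=25):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (r - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def or_and_dr_means(x, y, r):
    """Estimate E[Y] when y is only observed where r == 1.

    x: covariate (n,), y: outcome (n,), NaN where missing, r: 0/1 response indicator.
    Returns (OR estimate, DR/AIPW estimate).
    """
    X = np.column_stack([np.ones_like(x), x])
    obs = r == 1
    # Outcome model: linear regression fitted on the complete cases.
    gamma, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    m_hat = X @ gamma
    # Propensity model: logistic regression for the probability of being observed.
    beta = fit_logistic(X, r.astype(float))
    pi_hat = 1.0 / (1.0 + np.exp(-X @ beta))
    # OR: average the predictions over the full sample.
    mu_or = m_hat.mean()
    # DR: add inverse-probability-weighted residuals of the observed cases.
    resid = np.where(obs, y - m_hat, 0.0)
    mu_dr = np.mean(m_hat + resid / pi_hat)
    return mu_or, mu_dr


# Hypothetical simulated data: true mean of Y is 1, missingness depends on x.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)
p_resp = 1.0 / (1.0 + np.exp(-(0.5 + x)))
r = rng.binomial(1, p_resp)
y = np.where(r == 1, y_full, np.nan)

mu_cc = y[r == 1].mean()  # complete-case mean, biased under this mechanism
mu_or, mu_dr = or_and_dr_means(x, y, r)
print(f"complete case: {mu_cc:.3f}, OR: {mu_or:.3f}, DR: {mu_dr:.3f}")
```

Misspecifying one of the two working models (for example, dropping `x` from the outcome regression) is one way to explore the kind of comparison the thesis describes.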
This paper examines gender differences in job autonomy in the United States and Sweden. It analyzes data from the 2005 Work Orientations III module of the International Social Survey Program, using multiple linear regression analysis. Women's concentration in the public sector, as a form of occupational segregation, as well as gender differences in unionization are assessed as possible explanations. Since these two factors vary greatly between the US and Sweden, these two cases are used to test the suitability of the explanatory approaches. While there are no gender differences in job autonomy in the US, Swedish women experience significantly lower job autonomy than Swedish men. These gender differences are primarily due to the fact that women in Sweden are concentrated in public sector employment, which offers lower autonomy. This supports occupational segregation as an explanation for gender differences in job autonomy. Meanwhile, the hypothesis that women's higher degree of unionization in Sweden would lead to a smaller gender gap in autonomy does not receive support from the data.
Our goal is to improve the estimation of the average treatment effect among the treated (ATT) from longitudinal data. When the ATT is estimated at one time point (or separately at each), outcome regression (OR), inverse probability weighting and doubly robust estimators can be used. These methods involve estimating the relationships that the covariates have with the outcome and/or the propensity score, in different regression models. Assuming these relationships do not vary drastically between nearby time points, we can improve estimation by also using information from neighboring points.
We use local regression to smooth the coefficient estimates of the outcome and propensity score models over time. Our simulation study shows that when the true coefficients are constant over time, the performance of all estimators is improved by smoothing. The improvement, especially in terms of precision, is greater the more the coefficient estimates are smoothed. We also evaluate the OR estimator in more complex scenarios where the true regression coefficients vary linearly and non-linearly over time. Here we find that larger degrees of smoothing have a negative effect on the estimators' accuracy, but continue to improve their precision. This is especially prominent in the non-linear scenario.
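The smoothing step can be sketched as follows, under stated assumptions: a local linear smoother with a Gaussian kernel (the function `local_linear_smooth` and the bandwidth value are hypothetical choices, not the ones used in the study) is applied to slope estimates from separate regressions at 20 time points. The simulated outcome model has a constant true slope of 2, which corresponds to the simplest scenario described above. The sketch smooths only a single coefficient and does not carry the smoothed coefficients through to an ATT estimate.

```python
import numpy as np


def local_linear_smooth(t, b, bandwidth):
    """Smooth values b observed at times t by local linear regression.

    At each target time t0, a weighted least-squares line is fitted with
    Gaussian kernel weights, and its intercept is the smoothed value at t0.
    """
    t = np.asarray(t, dtype=float)
    b = np.asarray(b, dtype=float)
    out = np.empty_like(b)
    for i, t0 in enumerate(t):
        w = np.exp(-0.5 * ((t - t0) / bandwidth) ** 2)
        D = np.column_stack([np.ones_like(t), t - t0])
        WD = D * w[:, None]
        coef = np.linalg.solve(D.T @ WD, WD.T @ b)
        out[i] = coef[0]
    return out


def raw_slopes(rng, times, n, true_slope):
    """Fit a separate OLS regression at each time point and return the slopes."""
    slopes = np.empty(len(times))
    for j in range(len(times)):
        x = rng.normal(size=n)
        y = 1.0 + true_slope * x + rng.normal(size=n)
        D = np.column_stack([np.ones(n), x])
        coef, *_ = np.linalg.lstsq(D, y, rcond=None)
        slopes[j] = coef[1]
    return slopes


# Hypothetical simulation: the true slope is constant over time.
rng = np.random.default_rng(0)
times = np.arange(20.0)
reps = 50
mse_raw, mse_smooth = 0.0, 0.0
for _ in range(reps):
    raw = raw_slopes(rng, times, n=100, true_slope=2.0)
    smooth = local_linear_smooth(times, raw, bandwidth=3.0)
    mse_raw += np.mean((raw - 2.0) ** 2) / reps
    mse_smooth += np.mean((smooth - 2.0) ** 2) / reps

print(f"MSE of raw slopes: {mse_raw:.5f}, MSE of smoothed slopes: {mse_smooth:.5f}")
```

A local linear fit reproduces linearly varying coefficients exactly, so in this sketch bias from smoothing can only arise when the true coefficients vary non-linearly over time.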
In this thesis, we present methods for studying patterns of income accumulation over time using functional data analysis. This is made possible by the availability of large-scale longitudinal register data in Sweden. By modelling individuals’ cumulative earnings trajectories as continuous functions of time, we can explore temporal dynamics as well as divergences in these trajectories based on initial labour market conditions. A major contribution of this thesis consists of extending the potential outcome framework for causal inference to functional data analysis.
In Paper I, we use functional-on-scalar linear regression and an interval-wise testing procedure to study the associations between initial labour market size and income trajectories for one Swedish birth cohort. In Paper II, we present methods to draw causal conclusions in this setting. We introduce the functional average treatment effect (FATE), as well as an outcome-regression based estimator for this parameter. In addition, we show the finite sample distribution of this estimator under certain regularity conditions and demonstrate how simultaneous confidence bands can be used for inferences about the FATE. An application study in this paper estimates the causal effect of initial labour market size on income accumulation trajectories.
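As an illustration of an outcome-regression estimator for a functional treatment effect, the sketch below assumes the pointwise definition FATE(t) = E[Y1(t) - Y0(t)] on a discrete grid and fits a separate linear model per treatment group at every grid point. The definition, the function name `fate_or`, the linear working model and the simulated data are all assumptions made for this example and need not match the estimator developed in Paper II.

```python
import numpy as np


def fate_or(X, z, Y):
    """Outcome-regression estimate of a functional average treatment effect.

    X: (n, p) covariates, z: (n,) binary treatment, Y: (n, G) outcome curves
    evaluated on a common grid. Returns a (G,) vector of effect estimates.
    """
    D = np.column_stack([np.ones(len(X)), X])
    # One least-squares fit per group; lstsq handles all G grid points at once.
    B1, *_ = np.linalg.lstsq(D[z == 1], Y[z == 1], rcond=None)
    B0, *_ = np.linalg.lstsq(D[z == 0], Y[z == 0], rcond=None)
    # Predict both potential outcome curves for everyone and average the difference.
    return (D @ B1 - D @ B0).mean(axis=0)


# Hypothetical simulated data with true effect curve sin(pi * t).
rng = np.random.default_rng(1)
n, G = 2000, 30
t = np.linspace(0.0, 1.0, G)
x = rng.normal(size=n)
z = rng.binomial(1, 1.0 / (1.0 + np.exp(-x)))  # treatment depends on x
noise = rng.normal(scale=0.5, size=(n, 1)) + rng.normal(scale=0.5, size=(n, 1)) * t
Y0 = x[:, None] * (1.0 + t) + noise
true_fate = np.sin(np.pi * t)
Y = Y0 + z[:, None] * true_fate

fate_hat = fate_or(x[:, None], z, Y)
max_err = np.max(np.abs(fate_hat - true_fate))
print(f"largest absolute error over the grid: {max_err:.3f}")
```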
In Paper III, these methods are applied to study the effect of initial firm age on earnings accumulation. Paper IV presents an outcome-regression-based estimator and a doubly robust estimator for the mean of functional outcomes when some of these outcome functions are missing at random. We derive the asymptotic distributions of these two estimators as well as their covariance structure under more general conditions.
This paper examines gender differences in unemployment experiences in 30 European countries. Multilevel regression analysis is used to test whether the effect of not working on wellbeing is moderated by gender, and whether this moderation varies between national contexts. Based on the premise that the need for employment and its financial and psychological benefits differ for men and women, institutionalized gender roles regarding employment and family commitment are assessed as a theoretical explanation for cross-country divergences. The results indicate that not being in work is associated with lower wellbeing, and more so for men than for women. Overall, the moderating effect of gender varies with gender differences in labor force participation, yet this cross-national variation cannot be observed when controlling for subjective income. These initial findings demonstrate possibilities for further, in-depth research in this area.
This article presents methods to study the causal effect of a binary treatment on a functional outcome with observational data. We define a Functional Average Treatment Effect (FATE) and develop an outcome regression estimator. We show how to obtain valid inference on the FATE using simultaneous confidence bands, which cover the FATE with a given probability over the entire domain. Simulation experiments illustrate how the simultaneous confidence bands take the multiple comparison problem into account. Finally, we use the methods to infer the effect of early adult location on subsequent income development for one Swedish birth cohort.
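The difference between pointwise and simultaneous bands can be sketched with a Gaussian multiplier bootstrap for the supremum of the standardized estimate over the grid. This is a generic, swapped-in construction used only for illustration; the article's own band construction is not described here, and the simulated curves, the grid and the number of bootstrap draws are hypothetical. The estimated function is a simple sample mean of curves rather than a treatment effect estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
n, G = 300, 40
t = np.linspace(0.0, 1.0, G)

# Hypothetical individual curves: a mean function plus smooth random variation.
scores = rng.normal(size=(n, 3))
basis = np.vstack([np.ones(G), np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
curves = scores @ basis + np.sin(np.pi * t)

est = curves.mean(axis=0)          # pointwise estimate of the mean function
sd = curves.std(axis=0, ddof=1)    # pointwise standard deviation
se = sd / np.sqrt(n)

# Multiplier bootstrap: perturb the standardized, centered curves with Gaussian
# weights and record the largest absolute deviation over the whole grid.
centered = (curves - est) / sd
n_boot = 2000
g = rng.normal(size=(n_boot, n))
sup_stats = np.abs(g @ centered / np.sqrt(n)).max(axis=1)

c_sim = np.quantile(sup_stats, 0.95)   # critical value for a simultaneous 95% band
c_point = 1.959963984540054            # 0.975 quantile of the standard normal

lower_sim, upper_sim = est - c_sim * se, est + c_sim * se
lower_pt, upper_pt = est - c_point * se, est + c_point * se
print(f"pointwise critical value: {c_point:.2f}, simultaneous: {c_sim:.2f}")
```

The simultaneous critical value exceeds the pointwise one because it has to control the coverage error jointly over all grid points, which is the multiple comparison issue mentioned above.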
We use longitudinal register data from Sweden to study patterns and dynamics in lifetime income trajectories. We examine divergences in these income trajectories by local economic conditions at labour market entry, in combination with other factors such as gender, education level and socio-economic background. We cannot assume that these relationships are constant over the course of individuals’ working lives. Therefore, we use methods from functional data analysis, allowing for a time-varying relationship between income and the explanatory variables. Our results show a large degree of heterogeneity in how lifetime income trajectories develop for different subgroups. We find that, for men, entering the labour market in an urban area is associated with higher cumulative lifetime income, especially later in life. The exception is men with only primary education, for whom those starting their working lives in a large city have lower incomes on average. This divergence increases in size over time. Women who enter into a large urban labour market receive higher lifetime income at all education levels. This relationship is strongest for women with primary education but decreases in strength over time for these women.