Covariate selection for the estimation of marginal hazard ratios in high-dimensional data
Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics. ORCID iD: 0000-0002-9313-3499
Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics. ORCID iD: 0000-0002-9086-7403
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Hazard ratios are frequently reported in time-to-event and epidemiological studies to assess treatment effects. In observational studies, combining propensity score weights with the Cox proportional hazards model makes it possible to estimate the marginal hazard ratio (MHR). The methods for estimating the MHR are analogous to those used for estimating common causal parameters, such as the average treatment effect. However, MHR estimation in the context of high-dimensional data remains unexplored. This paper seeks to address this gap through a simulation study that considers variable selection methods from causal inference combined with a recently proposed multiply robust approach for MHR estimation. Additionally, a case study using stroke register data is conducted to demonstrate the application of these methods. The results from the simulation study indicate that the double selection covariate selection method is preferable to several other strategies when estimating the MHR. Nevertheless, the estimation can be further improved by applying the multiply robust approach to the set of propensity score models obtained during the double selection process.
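To illustrate the combination of propensity score weights with a Cox model mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the multiply robust or double selection procedure studied in the paper. The simulation design (one binary confounder `x`, no treatment effect, censoring at time 1.5), the function names, and all parameter values are assumptions chosen for illustration. Because the simulated treatment has no effect, the true log MHR is 0, so the unweighted estimate shows the confounding bias and the weighted estimate should be close to 0.

```python
import math
import random

def weighted_cox_log_hr(times, events, treat, weights, iters=50):
    """Newton-Raphson on a weighted Cox partial likelihood with one binary
    covariate. Illustrative helper; assumes no tied event times."""
    order = sorted(range(len(times)), key=lambda i: -times[i])
    beta = 0.0
    for _ in range(iters):
        s0 = s1 = score = info = 0.0
        for i in order:  # latest times first, so the risk set accumulates
            r = weights[i] * math.exp(beta * treat[i])
            s0 += r
            s1 += r * treat[i]
            if events[i]:
                p = s1 / s0  # weighted share of treated in the risk set
                score += weights[i] * (treat[i] - p)
                info += weights[i] * p * (1.0 - p)
        step = score / info
        beta += step
        if abs(step) < 1e-10:
            break
    return beta

def simulate(n, rng, censor_time=1.5):
    """Hypothetical data: x confounds treatment a and event time; a has no effect."""
    x, a, t, d = [], [], [], []
    for _ in range(n):
        xi = int(rng.random() < 0.5)
        ai = int(rng.random() < 0.3 + 0.4 * xi)   # treatment depends on x
        ti = rng.expovariate(math.exp(1.0 * xi))  # hazard depends on x only
        x.append(xi)
        a.append(ai)
        t.append(min(ti, censor_time))
        d.append(int(ti <= censor_time))
    return x, a, t, d

def ipw_weights(x, a):
    """Inverse probability of treatment weights from stratum proportions."""
    ps = {}
    for level in (0, 1):
        stratum = [ai for xi, ai in zip(x, a) if xi == level]
        ps[level] = sum(stratum) / len(stratum)
    return [1.0 / ps[xi] if ai else 1.0 / (1.0 - ps[xi]) for xi, ai in zip(x, a)]

rng = random.Random(1)
x, a, t, d = simulate(4000, rng)
naive = weighted_cox_log_hr(t, d, a, [1.0] * len(t))
ipw = weighted_cox_log_hr(t, d, a, ipw_weights(x, a))
print("unweighted log HR:", round(naive, 3))
print("IPW log HR:", round(ipw, 3))
```

With a single binary confounder the propensity score can be estimated by stratum proportions; in high-dimensional settings such as the one the paper addresses, a covariate selection step would be needed before fitting the propensity score model.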

National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:umu:diva-218976; OAI: oai:DiVA.org:umu-218976; DiVA, id: diva2:1824024
Available from: 2024-01-03 Created: 2024-01-03 Last updated: 2024-01-09
In thesis
1. Estimation of hazard ratios from observational data with applications related to stroke
2024 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The objective of this thesis is to examine some challenges that may emerge when conducting time-to-event studies based on observational data. Time-to-event (also called survival) analysis studies how different factors may influence the length of time until an individual experiences the event of interest. This type of analysis is commonly applied in fields such as medical research and epidemiology. In this thesis, which focuses on stroke, we are interested in the time to a recurrent stroke or the death of a patient who survived a first stroke.

Hazard ratios are among the main parameters estimated in time-to-event studies. A hazard ratio compares the risk of experiencing the event between two groups, usually a treated group and an untreated group. It can also compare groups defined by other factors, such as different age groups. Hazard ratios can be estimated from the data by using the Cox regression model.
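As a small worked example of what a hazard ratio measures, the sketch below simulates two groups with constant hazards and estimates each hazard as events divided by total follow-up time. This is a simplification that holds for exponential event times and is not the Cox regression used in the thesis. The group sizes, hazards, censoring time, and function names are assumptions made for illustration.

```python
import random

def simulate_group(n, hazard, censor_time, rng):
    """Exponential event times, censored at censor_time.
    Returns a list of (observed_time, event_indicator) pairs."""
    data = []
    for _ in range(n):
        event_time = rng.expovariate(hazard)
        if event_time <= censor_time:
            data.append((event_time, 1))   # event observed
        else:
            data.append((censor_time, 0))  # censored at end of follow-up
    return data

def hazard_rate(data):
    """Events per unit of follow-up time (the estimate under a constant hazard)."""
    n_events = sum(event for _, event in data)
    person_time = sum(time for time, _ in data)
    return n_events / person_time

rng = random.Random(0)
treated = simulate_group(5000, hazard=0.5, censor_time=2.0, rng=rng)
untreated = simulate_group(5000, hazard=1.0, censor_time=2.0, rng=rng)

hr = hazard_rate(treated) / hazard_rate(untreated)
print("estimated hazard ratio:", round(hr, 2))  # true ratio in this setup is 0.5
```

A hazard ratio below 1 here means the treated group experiences the event at a lower rate than the untreated group at any given time.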

Observational data, in contrast to experimental data, are collected without any intervention or random assignment of treatment to the individuals. Confounders, that is, variables that distort or obscure the true relationship between treatment and outcome, are always present and need to be controlled for in observational studies.

National registers are an important source of observational data. A national register is a centralized database or system that collects, stores, and maintains information about a specific population or group of individuals within a country. Sweden is known for its detailed and complete national registers. In this thesis, data from the Swedish Stroke Register (Riksstroke) are used to study factors related to stroke.

In time-to-event studies involving observational data, several challenges may arise for the researcher during data analysis. Some individuals may not experience the event during the observation period, so the information about their time until the event is incomplete. These individuals are considered censored. Some individuals may experience an event other than the one of interest, known as a competing risk. Additionally, models must be properly constructed, with researchers selecting variables and determining a suitable functional form.
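To make the role of censoring concrete, here is a minimal sketch of the Kaplan-Meier estimator of the survival function, a standard tool that uses censored individuals for as long as they are under observation. It does not handle competing risks. The function name and the toy data are assumptions for illustration and are not taken from the thesis.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_censored = 0
        # Group everyone who leaves the risk set at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1] == 1:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Censored individuals count as at risk at t and are removed afterwards.
        at_risk -= n_events + n_censored
    return curve

toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]  # the second person at time 3 and the one at 8 are censored
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(t, round(s, 3))
```

For the toy data the estimated survival probabilities are 0.8 at time 2, 0.6 at time 3, and 0.3 at time 5. The individual censored at time 3 still counts in the risk set at that time, which is how partial follow-up contributes information.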

Four papers are included in the thesis. Paper I demonstrates how to handle competing risks in survival analysis. The study involves comparing individuals with and without standard modifiable risk factors and their risks of a recurrent stroke or death using data from the Swedish Stroke Register.

The estimation of marginal hazard ratios is a common theme in the other three papers. All involve simulation studies in order to extend methods and explore best practices when estimating marginal hazard ratios.

Paper II explores non-parametric methods that can be used as alternatives to more traditional parametric methods when balancing datasets in order to estimate a marginal hazard ratio. A case study was also conducted using data from the Swedish Stroke Register involving the prescription of anticoagulants at hospital discharge after a stroke.

Paper III is about how censoring affects marginal hazard ratio estimation, even with perfect balancing of the dataset. We study this issue, taking into consideration varying effect sizes and censoring rates. A procedure to attenuate the problem is also studied.

Paper IV concerns covariate selection in the case of high-dimensional data. High-dimensional data involves cases in which the number of covariates in the study is comparable to the number of individuals, and therefore covariate selection methods are needed. In the paper, we explore some of these methods and suggest a best-performing procedure. Like Paper II, Paper IV involves a case study of anticoagulant prescription using data from the Swedish Stroke Register.

Place, publisher, year, edition, pages
Umeå University, 2024. p. 19
Series
Statistical studies, ISSN 1100-8989 ; 57
Keywords
survival analysis, causal inference, hazard ratios, marginal hazard ratio, stroke, balancing
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-219201 (URN); 978-91-8070-240-9 (ISBN); 978-91-8070-241-6 (ISBN)
Public defence
2024-02-02, Hörsal NBET.A.101, Norra Beteendevetarhuset, Mediegränd 14, 907 36, Umeå, 10:00 (English)
Available from: 2024-01-12 Created: 2024-01-09 Last updated: 2024-01-10 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Barros, Guilherme; Häggström, Jenny
