Variable selection in Cox model to optimise cardiovascular disease risk prediction in Northern Sweden.
Umeå University, Faculty of Medicine, Department of Public Health and Clinical Medicine, Epidemiology and Global Health.
2017 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 10 credits / 15 HE credits. Student thesis
Abstract [en]

Background: Selecting variables for cardiovascular disease (CVD) risk prediction is important because it reduces the cost of data collection and thereby promotes the uptake of risk screening among practitioners and the general public. Alongside expert knowledge, statistical modelling is the other major component of variable selection. The most common approach to constructing a CVD risk prediction model is to select variables by p-value in Cox proportional hazards regression after an initial expert selection. Many other methods are available under the Cox model, such as step-wise Cox selection, LASSO, and survival boosted regression trees. These methods can be applied when the researcher faces a large number of potential variables.
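
For orientation, the Cox model and its LASSO-penalised variant are usually written as below. These are the standard textbook forms (no tied event times assumed), not necessarily the exact formulation used in the thesis; the symbols $h_0$, $\beta$, $\delta_i$, $R(t_i)$ and $\lambda$ are notation introduced here for illustration.

```latex
% Cox proportional hazards model: baseline hazard h_0(t) scaled by the covariates x
h(t \mid x) = h_0(t)\,\exp\!\left(x^{\top}\beta\right)

% Log partial likelihood; \delta_i = 1 marks an observed event,
% R(t_i) is the set of individuals still at risk at time t_i
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right) \right]

% LASSO in the Cox model: an L1 penalty shrinks some coefficients exactly to zero,
% which is what performs the variable selection
\hat{\beta} = \arg\max_{\beta} \left\{ \ell(\beta) - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert \right\}
```

Larger values of $\lambda$ remove more variables from the model; $\lambda = 0$ recovers the ordinary Cox fit.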

Objective: This study aims to 1) identify the best subset of predictors for the first CVD event within 10 years using different statistical methods, and 2) compare the predictive performance of Cox proportional hazards regression, step-wise Cox selection, LASSO, and survival boosted regression trees, each with its corresponding set of predictors.

Method: Empirical data were extracted from the Västerbotten Intervention Programme (VIP), the death registry, and hospitalisation records. The cohort selected for analysis comprises individuals who were examined at primary care centres from 1990 to 2005 and followed up for ten years using the registry data. Data from 1990 to 1999 were used to fit the models, which were then tested on data from 2000 to 2005. Statistical predictor selection was performed with Cox backward selection, LASSO in the Cox model, and a survival boosted regression tree model (sBRT). The predictor set based on expert knowledge and p-value was derived from the Framingham risk score. Predictor subsets from LASSO and sBRT were passed to Cox regression. All models were compared on predictive performance, specifically graphical calibration and AUC.
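
The temporal split and the AUC comparison described in the method can be sketched as follows. This is a minimal illustration, not the thesis code: the records are synthetic, the field names `exam_year`, `risk` and `event_10y` are hypothetical, and `risk` stands in for whatever score a fitted model would produce. It also assumes a binary 10-year outcome with complete follow-up, so censoring is ignored.

```python
def temporal_split(records, last_train_year=1999):
    """Split records by examination year: fit on the early period, test on the later one."""
    train = [r for r in records if r["exam_year"] <= last_train_year]
    test = [r for r in records if r["exam_year"] > last_train_year]
    return train, test


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


# Synthetic records (hypothetical values, for illustration only).
records = [
    {"exam_year": 1992, "risk": 0.10, "event_10y": 0},
    {"exam_year": 1998, "risk": 0.40, "event_10y": 1},
    {"exam_year": 2001, "risk": 0.80, "event_10y": 1},
    {"exam_year": 2002, "risk": 0.30, "event_10y": 0},
    {"exam_year": 2004, "risk": 0.60, "event_10y": 1},
    {"exam_year": 2005, "risk": 0.70, "event_10y": 0},
]

train, test = temporal_split(records)
test_auc = auc([r["risk"] for r in test], [r["event_10y"] for r in test])
print(f"train n = {len(train)}, test n = {len(test)}, test AUC = {test_auc:.2f}")
```

Evaluating on a later calendar period rather than a random hold-out checks whether a model fitted on earlier examinations still discriminates in more recent ones.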

Result: Cox backward selection, LASSO, and sBRT identified variables that are not in the expert variable subset, such as education, long sick leave, relationship with colleagues, berry picking, and alcohol consumption. In terms of predictive performance, the Cox proportional hazards regression model with the expert variable subset performed best. The differences in AUC between models with the various variable subsets were small. In our case, LASSO and sBRT did not predict better than Cox regression.

Conclusion: Different statistical models may yield different variable subsets. Variables selected on the basis of expert knowledge are of high importance in constructing prediction models. For predicting a CVD event within 10 years in Northern Sweden, Cox proportional hazards regression shows predictive performance similar to that of LASSO and sBRT.

Place, publisher, year, edition, pages
2017. p. 33
Series
Centre for Public Health Report Series, ISSN 1651-341X ; 2017:48
Keywords [en]
Cox model, Cardiovascular disease, Northern Sweden
National Category
Public Health, Global Health, Social Medicine and Epidemiology
Identifiers
URN: urn:nbn:se:umu:diva-152617; OAI: oai:DiVA.org:umu-152617; DiVA, id: diva2:1256240
External cooperation
VIP - Västerbotten Intervention Program
Educational program
Master's Programme in Public Health
Presentation
2017-05-22, Dentistry building, 9th floor, room D, Norrlands University Hospital, Umeå, 11:00 (English)
Supervisors
Examiners
Available from: 2018-10-22. Created: 2018-10-16. Last updated: 2018-10-22. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Author
Tsang, Ka Chun