
Review Article: Scientific Contributions of Population-Based Studies to Cardiovascular Epidemiology in the GWAS Era

  • 1 Institute of Epidemiology, Kiel University, Kiel, Germany
  • 2 Framingham Heart Study (FHS), Framingham, MA, United States
  • 3 Section of Preventive Medicine and Epidemiology, Boston University School of Medicine, Boston, MA, United States

Longitudinal, well-phenotyped, population-based cohort studies offer unique research opportunities in the context of genome-wide association studies (GWAS), including GWAS for new-onset (incident) cardiovascular disease (CVD) events, the assessment of gene x lifestyle interactions, and the evaluation of the incremental predictive utility of genetic information in apparently healthy individuals. Furthermore, comprehensively phenotyped community-dwelling samples have contributed to GWAS of numerous traits that reflect normal organ function (e.g., cardiac structure and systolic and diastolic function) and of many traits along the CVD continuum (e.g., risk factors, circulating biomarkers, and subclinical disease traits). To date, these GWAS have identified many genetic loci implicated in normal organ function and in different stages of the CVD continuum. Finally, population-based cohort studies have made important contributions to Mendelian Randomization analyses, a statistical approach that uses genetic information to assess observed associations between cardiovascular traits and clinical CVD outcomes for potential causality.

What Are Key Features of Population-Based Cohort Studies?

As a brief introduction, we highlight important design features of population-based studies. First, as opposed to hospital-based referral samples, population-based epidemiological studies examine community-dwelling or random samples from the general population. Study participants are therefore not selected based on a given disease, but rather to represent the general population of the areas sampled, so that observations from such a sample are generalizable to the underlying source population. It has to be kept in mind, though, that the response rate of some landmark cohort studies is rather low [e.g., 5.5% for the UK Biobank ( 1 )], which increases the potential for selection bias ( 2 ). Second, most population-based studies are longitudinal and re-examine their participants every few years, so that repeated measures of several traits are available and trajectories over time (and their genetic underpinning) can be assessed, as opposed to analyses of single-occasion measurements of select traits in typical referral samples. Thus, population-based cohort studies include many individuals who are free of the disease of interest at the beginning of the study but who might develop it over the course of the study. Therefore, population-based cohort studies are ideal for studying risk factors and intermediate traits for the development of chronic disease conditions and for estimating measures of disease incidence ( 3 , 4 ).

Third, many population-based cohort studies perform deep physiological/clinical and molecular phenotyping of their study participants ( 5 ). For example, comprehensive physiological, biochemical, subclinical, and clinical measurements are obtained on the participants using highly standardized methods. Similarly, clinical endpoints are adjudicated in a comprehensive and highly standardized process, which enhances the accuracy and validity of endpoint data from population-based cohort studies. The molecular characterization may include the assessment of common and rare genetic variation and other OMICs measurements, such as epigenomics, transcriptomics, lipidomics, proteomics, and metabolomics ( 5 ). These key features of population-based studies allow specific research questions to be addressed in the context of genome-wide association studies (GWAS). For example, the detailed phenotyping allows comprehensive adjustments and mediation analyses in order to delineate whether an observed association between a genetic variant and cardiovascular outcomes is independent of traditional risk factors and whether traditional risk factors or biomarkers might mediate the observed association. Overall, population-based studies have made a substantial contribution to scientific discoveries in the GWAS era. A few illustrative highlights of such findings from cohort studies are described below.

Reference Sample for Genetic-Epidemiological Analyses

Since many community-dwelling samples are representative of the general population, population-based studies have served as reference (“control”) samples for many genetic case-control analyses. In essence, genetic case-control studies compare allelic frequencies of genetic variants in prevalent cases (patients who have the disease of interest when they are sampled) and controls. Ideally, the control sample captures the distribution of the exposure (in this case, the allele frequencies of putative genetic variants) in the source population from which the cases were derived ( 6 ). Therefore, population-based studies have provided controls for genetic case-control studies of a broad spectrum of traits, including myocardial infarction (MI)/coronary artery disease (CAD) ( 7 ), stroke ( 8 , 9 ), and dilated cardiomyopathy ( 10 ). Importantly, as detailed below, GWAS might reveal different results depending on whether prevalent or incident cases are being analyzed.
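The allele-frequency comparison described above can be illustrated with a minimal sketch. It computes an allelic odds ratio with an approximate 95% confidence interval (Woolf's logit method) from a 2x2 table of allele counts. The function name and all counts are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
import math


def allelic_odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Return (odds ratio, lower 95% limit, upper 95% limit) for a 2x2 allele table.

    Each person contributes two alleles, so the counts are allele counts,
    not person counts. The interval uses Woolf's logit approximation.
    """
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    se_log_or = math.sqrt(
        1 / case_alt + 1 / case_ref + 1 / control_alt + 1 / control_ref
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - 1.96 * se_log_or)
    upper = math.exp(log_or + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts: 1,000 cases and 1,000 population-based controls.
odds_ratio_value, ci_lower, ci_upper = allelic_odds_ratio(
    case_alt=1100, case_ref=900, control_alt=1000, control_ref=1000
)
print(f"OR = {odds_ratio_value:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

In practice, GWAS typically use regression models that adjust for covariates such as ancestry; the unadjusted table above only conveys the basic idea of comparing allele frequencies between cases and a reference sample.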

GWAS Analyses for a Broad Spectrum of Phenotypic Traits and Biomarkers Along the Cardiovascular Disease Continuum

The broad and highly standardized phenotyping of their study participants has enabled population-based cohort studies to contribute to GWAS in many different ways. Specifically, researchers from population-based studies have performed and contributed to numerous GWAS for traits along the cardiovascular disease continuum, including traditional CVD risk factors [e.g., lipids ( 11 ), blood pressure ( 12 , 13 ), and glycemic traits ( 14 )], circulating cardiovascular biomarkers [e.g., B-type natriuretic peptide (BNP) ( 15 ), C-reactive protein ( 16 ), troponin ( 17 ), aldosterone, renin concentration, renin activity ( 18 ), adipokines ( 19 ), and fibrinogen levels ( 20 )], and subclinical cardiovascular disease traits [such as indices of left ventricular structure and function ( 21 , 22 ), carotid intima media thickness (IMT) ( 23 ), and coronary artery calcification ( 24 )]. Of note, cardiac function can be assessed by different modalities, including ECG, echocardiography, MRI/CT, and circulating biomarkers; genome-wide genetic analyses have been conducted for several of these traits, including ECG parameters ( 25 ), echocardiographic traits ( 21 , 22 ), MRI measures of cardiac structure and function ( 26 ), and relevant biomarkers ( 15 ).

It is important to keep in mind that community-based samples (as opposed to clinical samples with established disease) include many individuals free of CVD at the time of inclusion in the study so that population-based cohort studies offer great opportunities to study the development of cardiovascular disease conditions over the adult life course ( 27 ), including very early (clinically asymptomatic) stages of the disease process and the genetic underpinning of these early stages. Thus, the above-mentioned GWAS have described to what extent different stages along the CVD continuum are associated with genetic variation and which genes might be involved.

Furthermore, given the large proportion of apparently healthy individuals in population-based cohort studies (as opposed to clinical samples), these studies have conducted GWAS of many traits that reflect relatively normal organ function, including biomarkers of cardiac structure and of systolic and diastolic function ( 21 , 22 ). These studies provided important insights into how physiological organ function is influenced by genetic variation, and how organ dysfunction might contribute to different disease processes ( 21 , 22 ).

Assessment of Gene X Lifestyle Interactions

Quantifying the contribution of genes and of different lifestyle factors (and their interactions) to inter-individual variation in cardiovascular risk factor levels and disease risk is an important and growing area of research. Since well-phenotyped cohort studies usually have comprehensive genetic data and detailed lifestyle information available, population-based studies represent an ideal setting to study gene x lifestyle interactions. The interaction of a genetic risk score (based on 50 SNPs) and a lifestyle score (including information on smoking, obesity, physical activity, and diet) on the incidence of CAD has been analyzed in several large community-based cohorts ( 28 ). Key observations from these analyses were that (i) both scores, the genetic risk score and the lifestyle score, were independently associated with the risk of incident CVD and that (ii) a favorable lifestyle was associated with an almost 50% reduction in the relative risk for CAD, as compared to an unfavorable lifestyle profile ( 28 ). This reduction in the relative risk of CAD by a favorable lifestyle was observed in individuals with high genetic risk, but also in individuals with low and intermediate genetic risk ( 28 ). Very similar observations were made in more than 270,000 participants of the UK Biobank, when a polygenic risk score representing 314 blood pressure (BP)-associated loci and a slightly different lifestyle score (including information on body mass index, healthy diet, sedentary lifestyle, alcohol consumption, smoking, and urinary sodium excretion levels) were related to different BP traits and to incident CVD ( 29 ). Both the genetic risk score and the lifestyle score were associated with BP traits and incident CVD. Importantly, a favorable lifestyle as compared to an unfavorable lifestyle was associated with substantially lower average BP values in all categories of genetic risk (low, intermediate, high) and with an approximately 30% lower relative risk for incident CVD ( 29 ).
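To make the notion of a genetic risk score more concrete, the following is a minimal sketch of one common construction: a weighted sum of risk-allele counts, with each weight being a per-allele effect size (for example, a log odds ratio from a published GWAS). The function name, the three-variant example, and all numbers are hypothetical; the scores used in the cited studies ( 28 , 29 ) were built from many more variants and may differ in detail.

```python
def genetic_risk_score(risk_allele_counts, weights):
    """Weighted genetic risk score: sum of (risk-allele count x per-allele weight).

    risk_allele_counts: number of risk alleles (0, 1, or 2) per variant.
    weights: per-allele effect sizes, e.g., log odds ratios (assumed inputs).
    """
    if len(risk_allele_counts) != len(weights):
        raise ValueError("counts and weights must have the same length")
    return sum(count * weight for count, weight in zip(risk_allele_counts, weights))


# Hypothetical individual genotyped at three variants.
example_counts = [0, 1, 2]
example_weights = [0.10, 0.05, 0.20]
example_score = genetic_risk_score(example_counts, example_weights)
print(f"Genetic risk score = {example_score:.2f}")
```

Individuals are then usually grouped into categories of genetic risk (e.g., low, intermediate, high) based on the distribution of the score in the cohort, and lifestyle effects are compared within each category.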

The same lifestyle score as in Reference ( 28 ) was used in a sample of young women (aged 25 to 40 years) from the Dutch Lifelines cohort to assess the contribution of rare and common genetic variation and of lifestyle factors to very low (≤1st age- and sex-specific percentile) and very high (≥99th age- and sex-specific percentile) levels of LDL-C. The study revealed that about two thirds of the women with very low LDL-C levels had a likely genetic cause (either a relevant mutation in an established gene for monogenic hypocholesterolemia or a very low polygenic risk score), whereas the lifestyle score ( 28 ) was not statistically significantly associated with low LDL-C concentrations ( 30 ). In women with hypercholesterolemia, however, an unfavorable lifestyle seems to be more relevant. Only about 40% of these women had a genetic cause (relevant mutations in genes for monogenic familial hypercholesterolemia) or predisposition (high polygenic risk score) for high LDL-C; and of the women without a genetic cause for hypercholesterolemia, more than half displayed an unfavorable lifestyle profile ( 30 ).

Community-based studies have also been involved in studying uncommon loss-of-function variants that may offer insights into gene function. For example, (gain-of-function) mutations in the PCSK9 (proprotein convertase subtilisin/kexin type 9) serine protease gene were initially identified in families with autosomal dominant hypercholesterolemia ( 31 ). Subsequently, loss-of-function mutations were reported in individuals with low circulating low-density lipoprotein (LDL) cholesterol levels ( 32 ). Analyses in population-based studies revealed that low-frequency sequence variants in the PCSK9 gene and a PCSK9 genetic score were associated with lower circulating LDL cholesterol levels and a reduced risk of cardiovascular events in the general population ( 33 , 34 ). Recently, PCSK9 inhibitors have been tested in randomized controlled trials ( 35 ).

The Genetic Underpinning of Change in Cardiovascular Traits Over the Life Course

Due to the availability of repeated measures over time, cohort studies are also suitable for exploring the genetic underpinning of changes in cardiovascular risk factors over time and of the longitudinal progression of subclinical CVD traits. For example, a GWAS for carotid IMT measured at different time points over a 10-year period has recently been published ( 36 ). Furthermore, several researchers assessed the association of risk factor-associated genetic variants with trajectories of the respective risk factor over the life course. For example, BMI-associated genetic variants have been related to repeated measures of BMI over time ( 37 ). Interestingly, BMI in childhood and BMI in adulthood were associated with different sets of single nucleotide polymorphisms (SNPs) ( 37 ), consistent with the concept that genetic effects on risk factors might be age-dependent. In line with this concept, genetic linkage analyses for BMI provided evidence for age-dependent effects of select genetic loci ( 38 ).

On a parallel note, a genetic risk score consisting of 29 SNPs was not only associated with blood pressure and hypertension prevalence at baseline, but also with new-onset hypertension and change in blood pressure over the life course in a large Swedish cohort study ( 39 ).

GWAS for Incident Disease Conditions

The longitudinal character of population-based cohort studies allows genetic variation to be studied in relation to disease incidence. For example, population-based cohort studies have facilitated GWAS for incident heart failure ( 40 ), incident stroke ( 41 ) and incident MI/coronary heart disease (CHD) ( 3 ). Interestingly, GWAS for incident MI/CHD ( 3 ) reported partially discrepant results as compared to GWAS using prevalent CAD cases ( 7 ). As an example, the chromosome 9p21 locus – consistently replicated in case-control GWAS for CAD/MI ( 7 , 42 ) – provided only modest evidence for association in a GWAS for incident MI/CHD within the CHARGE consortium ( 3 ). Of note, the CHARGE consortium (Cohorts for Heart and Aging Research in Genomic Epidemiology) was founded to coordinate joint GWAS analyses of several traits in large population-based cohort studies and to provide opportunities for mutual replication efforts ( 43 ).

It is well known that analyses based on prevalent disease cases and those based on incident cases might reveal different results if the association between an exposure and the disease outcome differs by disease severity or disease duration (a phenomenon referred to as prevalence-incidence bias) ( 44 ). In order to be included in a case-control study as prevalent MI/CAD case, MI patients have to survive the acute event until they are sampled. Given that MI is still associated with substantial case fatality ( 45 , 46 ), case-control studies are likely enriched for MI/CAD survivors with rather long survival ( 3 ). Thus, alleles associated with prevalent CAD in case-control analyses could be related to the risk of developing the CAD event, but could also be related to the chances of surviving the acute CAD event. In line with this concept, the CAD risk allele at the 9p21 locus was associated with longer survival after MI in several population-based cohorts within CHARGE ( 3 ).
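The mechanism behind prevalence-incidence bias can be illustrated with a small expected-value calculation. In the sketch below, a hypothetical allele has no effect on the risk of developing the event but improves survival after it. The odds ratio among all incident cases is then 1, whereas the odds ratio among surviving (prevalent) cases is inflated by the ratio of the survival probabilities. The function name and all probabilities are assumptions made for illustration and are not estimates from the cited studies.

```python
def carrier_odds_ratios(carrier_freq, event_risk_carrier, event_risk_noncarrier,
                        survival_carrier, survival_noncarrier):
    """Return (OR using incident cases, OR using surviving prevalent cases).

    All inputs are probabilities. Controls are individuals without the event.
    """
    # Expected population fractions of incident cases by carrier status.
    incident_carrier = carrier_freq * event_risk_carrier
    incident_noncarrier = (1 - carrier_freq) * event_risk_noncarrier
    # Expected population fractions of event-free controls by carrier status.
    control_carrier = carrier_freq * (1 - event_risk_carrier)
    control_noncarrier = (1 - carrier_freq) * (1 - event_risk_noncarrier)
    # Only cases who survive until sampling can enter a case-control study.
    prevalent_carrier = incident_carrier * survival_carrier
    prevalent_noncarrier = incident_noncarrier * survival_noncarrier

    control_odds = control_carrier / control_noncarrier
    or_incident = (incident_carrier / incident_noncarrier) / control_odds
    or_prevalent = (prevalent_carrier / prevalent_noncarrier) / control_odds
    return or_incident, or_prevalent


# Hypothetical scenario: same event risk, but better survival in carriers.
or_incident, or_prevalent = carrier_odds_ratios(
    carrier_freq=0.3,
    event_risk_carrier=0.1,
    event_risk_noncarrier=0.1,
    survival_carrier=0.8,
    survival_noncarrier=0.6,
)
print(f"OR (incident cases) = {or_incident:.2f}")
print(f"OR (prevalent cases) = {or_prevalent:.2f}")
```

Under these assumptions, an allele that only influences survival would appear as a risk allele in an analysis of prevalent cases, which is one way in which case-control and incident-case GWAS can diverge.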

Impact of Genetic Variation on Risk Prediction

Furthermore, community-based prospective cohorts allow assessing whether genetic information improves risk prediction models beyond traditional risk factors. Indeed, one of the main motivations of the Human Genome Project was to use genetic information to predict disease risks in healthy individuals and to predict the response to a given therapy among patients. Several analyses conducted in various population-based cohorts assessed whether genetic variation (e.g., in an aggregated form as risk scores) improved performance measures of risk prediction models for a first CVD event, including discrimination, calibration, and reclassification ( 47 – 50 ). Although the results from individual studies vary, in most cases the genetic risk scores displayed clear statistically significant associations with CVD endpoints, but improvements in discrimination (e.g., C-statistics; integrated discrimination improvement) and reclassification (e.g., net reclassification index) were more modest ( 47 , 48 ), and some studies did not provide evidence for improvement in these performance metrics beyond traditional risk factors ( 49 , 50 ).
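As an illustration of what a discrimination measure captures, the sketch below computes a C-statistic for a binary outcome as the proportion of case/non-case pairs in which the case has the higher predicted risk (ties count as one half). It then compares a hypothetical model without and with a genetic score. This is a simplified version for binary outcomes; analyses of time-to-event data generally use extensions that account for censoring. All names and predicted risks are invented for illustration.

```python
def c_statistic(predicted_risks, events):
    """C-statistic for a binary outcome via pairwise comparison.

    predicted_risks: predicted probabilities, one per individual.
    events: 1 if the individual developed the event, 0 otherwise.
    """
    concordant = 0.0
    n_pairs = 0
    for risk_case, event_case in zip(predicted_risks, events):
        if event_case != 1:
            continue
        for risk_noncase, event_noncase in zip(predicted_risks, events):
            if event_noncase != 0:
                continue
            n_pairs += 1
            if risk_case > risk_noncase:
                concordant += 1.0
            elif risk_case == risk_noncase:
                concordant += 0.5  # ties contribute one half
    if n_pairs == 0:
        raise ValueError("need at least one case and one non-case")
    return concordant / n_pairs


# Hypothetical data: two cases followed by three non-cases.
observed_events = [1, 1, 0, 0, 0]
risks_traditional = [0.30, 0.10, 0.20, 0.05, 0.10]
risks_with_genetic_score = [0.30, 0.15, 0.20, 0.05, 0.10]

c_traditional = c_statistic(risks_traditional, observed_events)
c_with_genetics = c_statistic(risks_with_genetic_score, observed_events)
print(f"C-statistic, traditional risk factors: {c_traditional:.3f}")
print(f"C-statistic, plus genetic score:       {c_with_genetics:.3f}")
```

A C-statistic of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination. Because the measure depends only on the ranking of predicted risks, a predictor can be clearly associated with the outcome and still change the C-statistic very little.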

Mendelian Randomization Analyses for Cardiovascular Traits

Genetic information in population-based cohort studies has also been used to assess causality between cardiovascular risk factors or circulating biomarkers and cardiovascular outcomes (incident CVD events) using instrumental variable analyses, a statistical approach referred to as Mendelian Randomization (MR) ( 51 – 53 ). The term MR refers to the random assortment of alleles of a given locus at meiosis ( 51 , 52 ). Thus, if a genetic locus (or a genetic risk score) is strongly associated with circulating biomarker levels or with risk factor levels, individuals are “randomized” to genetically determined high or low biomarker/risk factor levels ( 51 , 52 , 54 ). If the biomarker/risk factor is causally related to CVD, this difference in genetically determined biomarker/risk factor levels should translate into corresponding quantitative differences in disease risk ( 51 , 52 , 54 ). Therefore, in addition to the association between the genetic variant and the risk factor/biomarker of interest, MR analyses also assess the associations between the risk factor/biomarker and incident CVD as well as between the genetic variant and incident CVD ( 52 ); the two latter analyses are facilitated by population-based cohort studies. By using genetic information as an instrumental variable for the biomarker/risk factor of interest, MR analyses try to avoid two important limitations of observational studies, reverse causality and confounding ( 54 , 55 ). Using MR analyses in population-based samples, several traits along the CVD continuum and biomarkers have been tested for potentially causal relations with incident CVD, including high-density lipoprotein (HDL) cholesterol ( 53 ), C-reactive protein ( 56 ), lipoprotein(a) ( 57 ), and many others. It has to be kept in mind, though, that instrumental variable analyses can be affected by different types of selection bias. For example, such analyses might be biased if a genetic variant is related to mortality and the MR analyses are conducted in an elderly sample ( 58 , 59 ).
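The logic of an instrumental variable analysis with a single genetic variant can be sketched with the ratio (Wald) estimator: the variant-outcome association divided by the variant-exposure association. This is one simple MR estimator among several, shown here under the usual instrumental variable assumptions; it is not necessarily the method used in the cited studies. The function name and all effect sizes are hypothetical, and the standard error is a first-order approximation that ignores uncertainty in the variant-exposure estimate.

```python
import math


def wald_ratio(beta_variant_exposure, beta_variant_outcome, se_variant_outcome):
    """Ratio (Wald) estimate of the effect of an exposure on an outcome.

    beta_variant_exposure: per-allele change in the exposure (e.g., in SD units).
    beta_variant_outcome: per-allele log odds ratio for the outcome.
    se_variant_outcome: standard error of beta_variant_outcome.
    Returns (estimate, approximate standard error) on the log odds scale
    per unit of exposure.
    """
    if beta_variant_exposure == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_variant_outcome / beta_variant_exposure
    # First-order approximation: treats the variant-exposure effect as known.
    standard_error = abs(se_variant_outcome / beta_variant_exposure)
    return estimate, standard_error


# Hypothetical summary statistics for one variant.
mr_estimate, mr_se = wald_ratio(
    beta_variant_exposure=0.5,   # 0.5 SD higher biomarker level per allele
    beta_variant_outcome=0.10,   # log odds ratio for incident CVD per allele
    se_variant_outcome=0.02,
)
mr_odds_ratio = math.exp(mr_estimate)
print(f"Estimated OR per 1 SD higher biomarker level: {mr_odds_ratio:.2f}")
```

If the biomarker were not causal, the variant-outcome association would be expected to be close to zero despite a strong variant-exposure association, and the ratio estimate would be close to zero as well.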

Conclusions

Population-based studies have substantially improved our understanding of the genetic architecture of normal and abnormal organ function, CVD risk factors, circulating biomarkers, subclinical disease, and overt CVD traits over the life course. Furthermore, they were essential in exploring gene x lifestyle interactions and in evaluating genetic variation in the context of risk prediction models for incident CVD. In addition, population-based cohort studies provided great opportunities to conduct GWAS for incident CVD events, such as MI, stroke, and heart failure, and thereby to overcome classic limitations of case-control GWAS, including prevalence-incidence bias. Finally, population-based cohort studies used genetic information as instrumental variables to assess whether cardiovascular risk factors or biomarkers are causally related to clinical CVD (Mendelian Randomization analyses).

Author Contributions

WL and RV wrote the article together.

Funding

This work was supported in part by the National Heart, Lung, and Blood Institute (NHLBI) contracts NO1-HL 25195 and HHSN268201500001I (RSV). Dr. Vasan is supported by the Evans Medical Foundation and the Jay and Louis Coffman Endowment. Dr. Lieb received grant funding from the German Ministry of Education and Research (01ER1301/13; 01ZX1606A).

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1. Ganna A, Ingelsson E. 5 year mortality predictors in 498,103 UK Biobank participants: a prospective population-based study. Lancet (2015) 386(9993):533–40. doi: 10.1016/S0140-6736(15)60175-1

2. Riedel-Heller SG, Schork A, Matschinger H, Angermeyer MC. Recruitment procedures and their impact on the prevalence of dementia. Results from the Leipzig Longitudinal Study of the Aged (LEILA75+). Neuroepidemiology (2000) 19(3):130–40. doi: 10.1159/000026248

3. Dehghan A, Bis JC, White CC, Smith AV, Morrison AC, Cupples LA, et al. Genome-wide association study for incident myocardial infarction and coronary heart disease in prospective cohort studies: the CHARGE consortium. PLoS One (2016) 11(3):e0144997. doi: 10.1371/journal.pone.0144997

4. Oleckno WA. "Cohort Studies". In: Epidemiology: Concepts and Methods. United States: Waveland Press (2008). p. 315–70.

5. Wijmenga C, Zhernakova A. The importance of cohort studies in the post-GWAS era. Nat Genet (2018) 50(3):322–8. doi: 10.1038/s41588-018-0066-3

6. Rothman K. "Case-Control Studies". In: Epidemiology: An Introduction. United Kingdom: Oxford University Press (2002). 73 p.

7. Samani NJ, Erdmann J, Hall AS, Hengstenberg C, Mangino M, Mayer B, et al. Genomewide association analysis of coronary artery disease. N Engl J Med (2007) 357(5):443–53. doi: 10.1056/NEJMoa072366

8. Cheng YC, Stanne TM, Giese AK, Ho WK, Traylor M, Amouyel P, et al. Genome-wide association analysis of young-onset stroke identifies a locus on chromosome 10q25 near HABP2. Stroke (2016) 47(2):307–16. doi: 10.1161/STROKEAHA.115.011328

9. Malik R, Chauhan G, Traylor M, Sargurupremraj M, Okada Y, Mishra A, et al. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nat Genet (2018) 50(4):524–37. doi: 10.1038/s41588-018-0058-3

10. Meder B, Rühle F, Weis T, Homuth G, Keller A, Franke J, et al. A genome-wide association study identifies 6p21 as novel risk locus for dilated cardiomyopathy. Eur Heart J (2014) 35(16):1069–77. doi: 10.1093/eurheartj/eht251

11. Teslovich TM, Musunuru K, Smith AV, Edmondson AC, Stylianou IM, Koseki M, et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature (2010) 466(7307):707–13. doi: 10.1038/nature09270

12. Newton-Cheh C, Johnson T, Gateva V, Tobin MD, Bochud M, Coin L, et al. Genome-wide association study identifies eight loci associated with blood pressure. Nat Genet (2009) 41(6):666–76. doi: 10.1038/ng.361

13. Warren HR, Evangelou E, Cabrera CP, Gao H, Ren M, Mifsud B, et al. Genome-wide association analysis identifies novel blood pressure loci and offers biological insights into cardiovascular risk. Nat Genet (2017) 49(3):403–15. doi: 10.1038/ng.3768

14. Dupuis J, Langenberg C, Prokopenko I, Saxena R, Soranzo N, Jackson AU, et al. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nat Genet (2010) 42(2):105–16. doi: 10.1038/ng.520

15. Musani SK, Fox ER, Kraja A, Bidulescu A, Lieb W, Lin H, et al. Genome-wide association analysis of plasma B-type natriuretic peptide in blacks: the Jackson Heart Study. Circ Cardiovasc Genet (2015) 8(1):122–30. doi: 10.1161/CIRCGENETICS.114.000900

16. Dehghan A, Dupuis J, Barbalic M, Bis JC, Eiriksdottir G, Lu C, et al. Meta-analysis of genome-wide association studies in >80 000 subjects identifies multiple loci for C-reactive protein levels. Circulation (2011) 123(7):731–8. doi: 10.1161/CIRCULATIONAHA.110.948570

17. Yu B, Barbalic M, Brautbar A, Nambi V, Hoogeveen RC, Tang W, et al. Association of genome-wide variation with highly sensitive cardiac troponin-T levels in European Americans and Blacks: a meta-analysis from Atherosclerosis Risk in Communities and Cardiovascular Health studies. Circ Cardiovasc Genet (2013) 6(1):82–8. doi: 10.1161/CIRCGENETICS.112.963058

18. Lieb W, Chen MH, Teumer A, de Boer RA, Lin H, Fox ER, et al. Genome-wide meta-analyses of plasma renin activity and concentration reveal association with the kininogen 1 and prekallikrein genes. Circ Cardiovasc Genet (2015) 8(1):131–40. doi: 10.1161/CIRCGENETICS.114.000613

19. Kilpeläinen TO, Carli JF, Skowronski AA, Sun Q, Kriebel J, Feitosa MF, et al. Genome-wide meta-analysis uncovers novel loci influencing circulating leptin levels. Nat Commun (2016) 7:10494. doi: 10.1038/ncomms10494

20. Dehghan A, Yang Q, Peters A, Basu S, Bis JC, Rudnicka AR, et al. Association of novel genetic loci with circulating fibrinogen levels: a genome-wide association study in 6 population-based cohorts. Circ Cardiovasc Genet (2009) 2(2):125–33. doi: 10.1161/CIRCGENETICS.108.825224

21. Vasan RS, Glazer NL, Felix JF, Lieb W, Wild PS, Felix SB, et al. Genetic variants associated with cardiac structure and function: a meta-analysis and replication of genome-wide association data. JAMA (2009) 302(2):168–78. doi: 10.1001/jama.2009.978-a

22. Wild PS, Felix JF, Schillert A, Teumer A, Chen MH, Leening MJG, et al. Large-scale genome-wide analysis identifies genetic variants associated with cardiac structure and function. J Clin Invest (2017) 127(5):1798–812. doi: 10.1172/JCI84840

23. Bis JC, Kavousi M, Franceschini N, Isaacs A, Abecasis GR, Schminke U, et al. Meta-analysis of genome-wide association studies from the CHARGE consortium identifies common variants associated with carotid intima media thickness and plaque. Nat Genet (2011) 43(10):940–7. doi: 10.1038/ng.920

24. O'Donnell CJ, Kavousi M, Smith AV, Kardia SL, Feitosa MF, Hwang SJ, et al. Genome-wide association study for coronary artery calcification with follow-up in myocardial infarction. Circulation (2011) 124(25):2855–64. doi: 10.1161/CIRCULATIONAHA.110.974899

25. Eijgelsheim M, Newton-Cheh C, Sotoodehnia N, de Bakker PI, Müller M, Morrison AC, et al. Genome-wide association analysis identifies multiple loci related to resting heart rate. Hum Mol Genet (2010) 19(19):3885–94. doi: 10.1093/hmg/ddq303

26. Fox ER, Musani SK, Barbalic M, Lin H, Yu B, Ogunyankin KO, et al. Genome-wide association study of cardiac structure and systolic function in African Americans: the Candidate Gene Association Resource (CARe) study. Circ Cardiovasc Genet (2013) 6(1):37–46. doi: 10.1161/CIRCGENETICS.111.962365

27. Vasan RS, Kannel WB. Strategies for cardiovascular risk assessment and prevention over the life course: progress amid imperfections. Circulation (2009) 120(5):360–3. doi: 10.1161/CIRCULATIONAHA.109.881995

28. Khera AV, Emdin CA, Drake I, Natarajan P, Bick AG, Cook NR, et al. Genetic risk, adherence to a healthy lifestyle, and coronary disease. N Engl J Med (2016) 375(24):2349–58. doi: 10.1056/NEJMoa1605086

29. Pazoki R, Dehghan A, Evangelou E, Warren H, Gao H, Caulfield M, et al. Genetic predisposition to high blood pressure and lifestyle factors: associations with midlife blood pressure levels and cardiovascular events. Circulation (2018) 137(7):653–61. doi: 10.1161/CIRCULATIONAHA.117.030898

30. Balder JW, Rimbert A, Zhang X, Viel M, Kanninga R, van Dijk F, et al. Genetics, lifestyle, and low-density lipoprotein cholesterol in young and apparently healthy women. Circulation (2018) 137(8):820–31. doi: 10.1161/CIRCULATIONAHA.117.032479

31. Abifadel M, Varret M, Rabès JP, Allard D, Ouguerram K, Devillers M, et al. Mutations in PCSK9 cause autosomal dominant hypercholesterolemia. Nat Genet (2003) 34(2):154–6. doi: 10.1038/ng1161

32. Cohen J, Pertsemlidis A, Kotowski IK, Graham R, Garcia CK, Hobbs HH. Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9. Nat Genet (2005) 37(2):161–5. doi: 10.1038/ng1509

33. Cohen JC, Boerwinkle E, Mosley TH, Hobbs HH. Sequence variations in PCSK9, low LDL, and protection against coronary heart disease. N Engl J Med (2006) 354(12):1264–72. doi: 10.1056/NEJMoa054013

34. Ference BA, Robinson JG, Brook RD, Catapano AL, Chapman MJ, Neff DR, et al. Variation in PCSK9 and HMGCR and risk of cardiovascular disease and diabetes. N Engl J Med (2016) 375(22):2144–53. doi: 10.1056/NEJMoa1604304

35. Sabatine MS, Giugliano RP, Keech AC, Honarpour N, Wiviott SD, Murphy SA, et al. Evolocumab and clinical outcomes in patients with cardiovascular disease. N Engl J Med (2017) 376(18):1713–22. doi: 10.1056/NEJMoa1615664

36. Xie G, Myint PK, Voora D, Laskowitz DT, Shi P, Ren F, et al. Genome-wide association study on progression of carotid artery intima media thickness over 10 years in a Chinese cohort. Atherosclerosis (2015) 243(1):30–7. doi: 10.1016/j.atherosclerosis.2015.08.034

37. Mei H, Chen W, Jiang F, He J, Srinivasan S, Smith EN, et al. Longitudinal replication studies of GWAS risk SNPs influencing body mass index over the course of childhood and adulthood. PLoS One (2012) 7(2):e31470. doi: 10.1371/journal.pone.0031470

38. Atwood LD, Heard-Costa NL, Fox CS, Jaquish CE, Cupples LA. Sex and age specific effects of chromosomal regions linked to body mass index in the Framingham Study. BMC Genet (2006) 7:7. doi: 10.1186/1471-2156-7-7

39. Fava C, Sjögren M, Montagnana M, Danese E, Almgren P, Engström G, et al. Prediction of blood pressure changes over time and incidence of hypertension by a genetic risk score in Swedes. Hypertension (2013) 61(2):319–26. doi: 10.1161/HYPERTENSIONAHA.112.202655

40. Smith NL, Felix JF, Morrison AC, Demissie S, Glazer NL, Loehr LR, et al. Association of genome-wide variation with the risk of incident heart failure in adults of European and African ancestry: a prospective meta-analysis from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium. Circ Cardiovasc Genet (2010) 3(3):256–66. doi: 10.1161/CIRCGENETICS.109.895763

41. Ikram MA, Seshadri S, Bis JC, Fornage M, Destefano AL, Aulchenko YS, et al. Genomewide association studies of stroke. N Engl J Med (2009) 360(17):1718–28. doi: 10.1056/NEJMoa0900094

42. Schunkert H, Götz A, Braund P, McGinnis R, Tregouet DA, Mangino M, et al. Repeated replication and a prospective meta-analysis of the association between chromosome 9p21.3 and coronary artery disease. Circulation (2008) 117(13):1675–84. doi: 10.1161/CIRCULATIONAHA.107.730614

43. Psaty BM, O'Donnell CJ, Gudnason V, Lunetta KL, Folsom AR, Rotter JI, et al. Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium: design of prospective meta-analyses of genome-wide association studies from 5 cohorts. Circ Cardiovasc Genet (2009) 2(1):73–80. doi: 10.1161/CIRCGENETICS.108.829747

44. Oleckno WA. "Assessing the Accuracy of Epidemiologic Studies". In: Oleckno WA, editor. Epidemiology: Concepts and Methods. United States: Waveland Press (2008). p. 195–231.

45. Abildstrom SZ, Rasmussen S, Rosén M, Madsen M. Trends in incidence and case fatality rates of acute myocardial infarction in Denmark and Sweden. Heart (2003) 89(5):507–11. doi: 10.1136/heart.89.5.507

46. Lehto HR, Lehto S, Havulinna AS, Ketonen M, Lehtonen A, Kesäniemi YA, et al. Sex differences in short- and long-term case-fatality of myocardial infarction. Eur J Epidemiol (2011) 26(11):851–61. doi: 10.1007/s10654-011-9601-6

47. Ripatti S, Tikkanen E, Orho-Melander M, Havulinna AS, Silander K, Sharma A, et al. A multilocus genetic risk score for coronary heart disease: case-control and prospective cohort analyses. Lancet (2010) 376(9750):1393–400. doi: 10.1016/S0140-6736(10)61267-6

48. Hughes MF, Saarela O, Stritzke J, Kee F, Silander K, Klopp N, et al. Genetic markers enhance coronary risk prediction in men: the MORGAM prospective cohorts. PLoS One (2012) 7(7):e40922. doi: 10.1371/journal.pone.0040922

49. Paynter NP, Chasman DI, Paré G, Buring JE, Cook NR, Miletich JP, et al. Association between a literature-based genetic risk score and cardiovascular events in women. JAMA (2010) 303(7):631–7. doi: 10.1001/jama.2010.119

50. Brautbar A, Pompeii LA, Dehghan A, Ngwa JS, Nambi V, Virani SS, et al. A genetic risk score based on direct associations with coronary heart disease improves coronary heart disease risk prediction in the atherosclerosis risk in communities (ARIC), but not in the Rotterdam and Framingham offspring, studies. Atherosclerosis (2012) 223(2):421–6. doi: 10.1016/j.atherosclerosis.2012.05.035

51. Emdin CA, Khera AV, Kathiresan S. Mendelian Randomization. JAMA (2017) 318(19):1925–6. doi: 10.1001/jama.2017.17219

52. Swerdlow DI, Kuchenbaecker KB, Shah S, Sofat R, Holmes MV, White J, et al. Selecting instruments for Mendelian randomization in the wake of genome-wide association studies. Int J Epidemiol (2016) 45(5):1600–16. doi: 10.1093/ije/dyw088

53. Voight BF, Peloso GM, Orho-Melander M, Frikke-Schmidt R, Barbalic M, Jensen MK, et al. Plasma HDL cholesterol and risk of myocardial infarction: a mendelian randomisation study. Lancet (2012) 380(9841):572–80. doi: 10.1016/S0140-6736(12)60312-2

54. Jansen H, Samani NJ, Schunkert H. Mendelian randomization studies in coronary artery disease. Eur Heart J (2014) 35(29):1917–24. doi: 10.1093/eurheartj/ehu208

55. Schunkert H, Samani NJ. Elevated C-reactive protein in atherosclerosis-chicken or egg? N Engl J Med (2008) 359(18):1953–5. doi: 10.1056/NEJMe0807235

56. Zacho J, Tybjaerg-Hansen A, Jensen JS, Grande P, Sillesen H, Nordestgaard BG. Genetically elevated C-reactive protein and ischemic vascular disease. N Engl J Med (2008) 359(18):1897–908. doi: 10.1056/NEJMoa0707402

57. Kamstrup PR, Tybjaerg-Hansen A, Steffensen R, Nordestgaard BG. Genetically elevated lipoprotein(a) and increased risk of myocardial infarction. JAMA (2009) 301(22):2331–9. doi: 10.1001/jama.2009.801

58. Vansteelandt S, Dukes O, Martinussen T. Survivor bias in Mendelian randomization analysis. Biostatistics (2017). doi: 10.1093/biostatistics/kxx050

59. Boef AG, Le Cessie S, Dekkers OM. Mendelian randomization studies in the elderly. Epidemiology (2015) 26(2):e15–16. doi: 10.1097/EDE.0000000000000243

Keywords: GWAS (genome-wide association study), population, genetic variation, genetic predisposition to disease, risk prediction

Citation: Lieb W and Vasan RS (2018). Scientific Contributions of Population-Based Studies to Cardiovascular Epidemiology in the GWAS Era. Front. Cardiovasc. Med. 5:57. doi: 10.3389/fcvm.2018.00057

Received: 21 February 2018; Accepted: 11 May 2018; Published: 07 June 2018

Copyright © 2018 Lieb and Vasan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Wolfgang Lieb

This article is part of the Research Topic

From GWAS Hits to Treatment Targets

Figure 1. Model of classification by the Hematology Review Committee. Hb indicates hemoglobin; Hct, hematocrit.

Figure 2. Overview of study results.

van der Klauw MM, Goudsmit R, Halie MR, et al. A Population-Based Case-Cohort Study of Drug-Associated Agranulocytosis. Arch Intern Med. 1999;159(4):369–374. doi:10.1001/archinte.159.4.369

A Population-Based Case-Cohort Study of Drug-Associated Agranulocytosis

From the Departments of Internal Medicine II (Drs van der Klauw and Wilson) and Epidemiology and Biostatistics (Dr Stricker), Erasmus University Medical School, Rotterdam; Drug Safety Unit, Inspectorate for Health Care, Rijswijk, Amsterdam (Drs van der Klauw and Stricker); Department of Hematology, University Hospital Groningen, Groningen (Dr Halie); Department of Hematology, Dr Daniel den Hoed Cancer Centre, Rotterdam (Dr van't Veer); and Department of Pharmacoepidemiology and Pharmacotherapy, University of Utrecht, Utrecht (Dr Herings), the Netherlands. Dr Goudsmit is in private practice in Amsterdam.

Background   Agranulocytosis is a life-threatening disorder, often caused by drugs. Incidences or risks of drug-induced agranulocytosis are not well known, since it is rare.

Methods   To determine the risk of drug-associated agranulocytosis as a reason for admission to Dutch hospitals, we performed a population-based case-cohort study. Hospital discharge data came from the Dutch Centre for Health Care Information, Utrecht, which contains data on all general and university hospitals in the Netherlands. The reference cohort consisted of all persons in the catchment area of the Pharmaco Morbidity Record Linkage System (PHARMO RLS) in the Netherlands, comprising a population that grew from approximately 220,000 to 484,000 persons between 1987 and 1990. All admissions during that period with agranulocytosis or related diagnoses were included in the study (n=923). The potential causes of agranulocytosis were assessed in all cases classified as probable or possible agranulocytosis.

Results   Discharge summaries were received for 753 admissions, of which 678 contained enough information for analysis. Of the 678, 108 were classified as "agranulocytosis probable" or as "agranulocytosis possible." In 75 of these 108 cases, agranulocytosis had been the reason for admission. Fifteen patients had used methimazole within 10 days before developing agranulocytosis; 2, carbimazole; 9, sulfasalazine; 8, sulfamethoxazole-trimethoprim; 4, clomipramine hydrochloride; and 2, dipyrone with analgesics, yielding adjusted relative risks of agranulocytosis of 114.8 (for thyroid inhibitors combined) (95% confidence interval [CI], 60.5-218.6), 74.6 (95% CI, 36.3-167.8), 25.1 (95% CI, 11.2-55.0), 20.0 (95% CI, 6.1-57.6), and 26.4 (95% CI, 4.4-111.1), respectively.

Conclusions   The highest relative risks were found for thyroid inhibitors, sulfamethoxazole-trimethoprim, sulfasalazine, clomipramine, and dipyrone combined with analgesics.

AGRANULOCYTOSIS IS a life-threatening disorder that frequently occurs as an adverse reaction to drugs. 1 Some drugs are well-known causes of agranulocytosis, but there are several drugs for which this is less certain. In the medical literature, case reports continue to appear about agranulocytosis as an adverse reaction to drugs, but the risk of these drugs, expressed as a relative risk or incidence, is difficult to estimate. In 1980 through 1986, the International Agranulocytosis and Aplastic Anemia Study (IAAAS) was performed as a population-based case-control study involving several study centers across Europe and in Israel, and encompassing a potential population base of approximately 23 million people. 2 - 18 We performed a study in the Netherlands for the following reasons. First, in the IAAAS, large differences in relative risks between regions in Europe were found, and no epidemiological study has ever included all admitted cases of agranulocytosis from a whole country. Moreover, the IAAAS was criticized for potential biases inherent in its design. 15 , 17 Second, the IAAAS encompassed the years 1980 to 1986, but since then other drugs have been developed and marketed. We therefore performed a study to assess the relative and attributable risks of drug-associated agranulocytosis in the Netherlands, with a population-based case-cohort design.

Data on morbidity were obtained from the Dutch Centre for Health Care Information, Utrecht, which holds a standardized computerized register of hospital diagnoses. Admission data are filed continuously from all general and university hospitals in the Netherlands. Whenever a patient is discharged from a hospital, data on sex, date of birth, dates of admission and discharge, 1 principal diagnosis (mandatory), and up to 9 additional diagnoses (optional) are recorded. All diagnoses are coded according to the International Classification of Diseases, Ninth Revision, Clinical Modification . 19 At the time of initiation of this study, the most recent years on file available were 1987 through 1990. In this study, we analyzed all records containing potential cases of agranulocytosis, ie, admissions with the codes 288.0 (agranulocytosis), 288.1 (functional disorders of polymorphonuclear neutrophils), 288.2 (genetic anomalies of leukocytes), and 288.9 (unspecified diseases of white blood cells) as principal diagnoses.

Data on dispensed drugs were obtained from the Pharmaco Morbidity Record Linkage System (PHARMO RLS), a registry of community pharmacy data, with a complete coverage of filled prescriptions in its catchment area of approximately 220,000 persons in 1987, 331,000 persons in 1988, 419,000 persons in 1989, and 484,000 persons in 1990. 20 All data on prescription-only drugs dispensed by all pharmacies in the catchment area are registered, as well as sex and date of birth of the patients these are dispensed to. In the Dutch health care system, all patients are designated to 1 pharmacy for filling their prescriptions. The vital statistics concerning age (overall and stratified) and sex were similar to those of the total Dutch population. It has been demonstrated that these data are good estimators of drug exposure in the Dutch population. 20

In this study, a population-based case-cohort design was used, in which drug use in cases was compared with drug use in a reference cohort. 21 In the case-cohort design, the reference cohort may contain 1 or more cases. Cases were patients admitted to a hospital with a validated diagnosis of agranulocytosis. The reference cohort consisted of all people in the catchment area of all pharmacies included in the PHARMO RLS.

Agranulocytosis was defined as severe neutropenia (neutrophil count, ≤0.5×10⁹/L) in an individual 2 years of age or older who previously had normal hematologic values and who had symptoms compatible with agranulocytosis, notably fever and infections. In addition, cases had to comply with all of the following criteria: (1) hemoglobin level of 6.5 mmol/L or more or hematocrit of 0.32 or more if normochromic (men and women); (2) platelet count of 100×10⁹/L or more; and (3) bone marrow aspirate or biopsy that confirmed the diagnosis, 22 or if there was none, recovery of the absolute number of neutrophilic granulocytes within 30 days to greater than 1.5×10⁹/L unless the patient died.

For every case, an index day was defined as the first day of the onset of fever (temperature ≥38°C), chills, or a sore throat. Furthermore, if the symptoms disappeared 5 days before admission or earlier, these were not taken into account. For every case, a risk time window was defined as the 10-day period preceding the index day. In all cases of a reaction classified as "agranulocytosis probable" or "agranulocytosis possible," the reporting consultant was asked for permission to contact the general practitioner and the pharmacist of the patient to assess the use of drugs in the 3 months before admission. These data were used as exposure data, in combination with the data from the patient record. If not available, data on exposure to drugs were collected from the patients' hospital records only. For every drug, the exposure period was calculated by dividing the total number of dispensed tablets or capsules by the prescribed daily number of tablets or capsules. To correct for undercompliance and carryover effects, this period was multiplied by a factor of 1.1, with a maximum of 14 days. Cases were considered exposed to all drugs for which the exposure period fell (partly) within the 10-day risk time window. If the drug was discontinued before the index day, the last day of use of the particular drug had to be within 10 days before the index day. Since the data from the reference cohort include only data from community pharmacies and not from hospital pharmacies, patients who developed agranulocytosis during hospital admission (and thus probably caused by drugs supplied by a hospital pharmacy) were excluded from the study.
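The exposure rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' software: all function and parameter names are hypothetical, and reading "with a maximum of 14 days" as a cap on the extra days added by the factor of 1.1 is an assumption, since the text does not state whether the cap applies to the extension or to the whole period.

```python
import math
from datetime import date, timedelta


def exposure_days(dispensed_units, units_per_day, factor=1.1, max_extension_days=14):
    """Days covered by one dispensing, extended for undercompliance and carryover.

    Assumption: the 14-day maximum limits the days added by the 1.1 factor.
    """
    base = dispensed_units / units_per_day
    return min(base * factor, base + max_extension_days)


def risk_window(index_day, window_days=10):
    """First and last day of the 10-day period preceding the index day."""
    return index_day - timedelta(days=window_days), index_day - timedelta(days=1)


def is_exposed(dispensing_date, days_covered, index_day, window_days=10):
    """True if the exposure period falls at least partly within the risk window."""
    window_start, window_end = risk_window(index_day, window_days)
    last_day_of_use = dispensing_date + timedelta(days=math.ceil(days_covered) - 1)
    return dispensing_date <= window_end and last_day_of_use >= window_start


# Hypothetical example: 30 tablets at 1 per day, dispensed on 1 January 1989.
covered = exposure_days(30, 1)
print(is_exposed(date(1989, 1, 1), covered, index_day=date(1989, 2, 10)))
print(is_exposed(date(1989, 1, 1), covered, index_day=date(1989, 3, 15)))
```

The same overlap test would apply to members of the reference cohort, with the randomly chosen 10-day period taking the place of the risk window.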

For every member of the reference cohort aged 2 years or older, a random 10-day period was chosen in each year separately. People in the reference cohort were considered exposed to all drugs of which the exposure period fell within this 10-day period. For every drug, the exposure period was calculated as defined above. The average number of users in each year of the study period was calculated in each age and sex stratum, standardized to the population size in the PHARMO RLS catchment area in 1990 (n=471,812).

In 1992, a request for information was sent to all hospitals where patients had been discharged in the years 1987 through 1990 with 1 of the principal diagnoses mentioned above. All physicians involved in the treatment of these patients received a request for a copy of the discharge summary, laboratory results, and, if available, descriptions of bone marrow material, after removal of the patient identification.

If the data received were too scanty, further information was requested. All patient data were analyzed, without prior knowledge of the suspected cause of agranulocytosis, as follows.

Every admission was analyzed according to a predefined algorithm and classified as "agranulocytosis probable," "agranulocytosis possible," "agranulocytosis unlikely," or "agranulocytosis unclassifiable" ( Figure 1 ). A Hematology Review Committee assessed the clinical details of those admissions, without knowledge of the suspected cause, where the diagnosis was not straightforward. Also, other diseases that have been associated with leukopenia, such as preceding sepsis, systemic lupus erythematosus, Felty syndrome, and leukemia were excluded. If 2 members differed in their opinion on the classification of an admission, it was discussed in a joint meeting of the committee. Then, final classification was based on consensus (same classification by all 3 members) or on majority of votes in case of a minor discrepancy (eg, agranulocytosis possible vs agranulocytosis unlikely). If no agreement was obtained, the admission was classified as agranulocytosis unclassifiable. Furthermore, a random 10% sample of the remainder of admissions was reanalyzed by 1 of the members of the Hematology Review Committee (R.G.) to check the validity of the first analysis.

An admission because of agranulocytosis was classified as severe if the patient developed sepsis or septic shock caused by the agranulocytosis.

The relative risk of developing agranulocytosis when being exposed to a certain drug (group) compared with not being exposed was estimated by dividing the ratio of cases exposed ($c_1$) and not exposed ($c_0$) to drug (group) X by the ratio of cohort members exposed ($b_1$) and not exposed ($b_0$) to this drug (group): $\text{relative risk} = (c_1/c_0)/(b_1/b_0)$.
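As a minimal sketch, the crude estimator above can be computed as follows. The function name is hypothetical, and the numbers in the example are invented for illustration; they are not taken from Table 1. The age-, sex-, and comedication-adjusted estimates and the confidence intervals reported in the study require the stratified methods cited in the text and are not reproduced here.

```python
def case_cohort_relative_risk(c1, c0, b1, b0):
    """Crude relative risk: (c1/c0) / (b1/b0).

    c1, c0: exposed and unexposed cases
    b1, b0: exposed and unexposed members of the reference cohort
    """
    if c0 <= 0 or b1 <= 0 or b0 <= 0:
        raise ValueError("c0, b1 and b0 must be positive")
    return (c1 / c0) / (b1 / b0)


# Hypothetical counts: 5 of 75 cases exposed, 200 of 400,000 cohort members exposed.
rr = case_cohort_relative_risk(c1=5, c0=70, b1=200, b0=399_800)
print(round(rr, 1))
```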

Point estimates were calculated with their 95% confidence intervals for case-cohort studies. 23 , 24

The etiologic fractions and excess risks were calculated according to standard procedures. 25 All causes that were significantly associated with agranulocytosis in the univariate analysis were subsequently adjusted for age, sex, and concomitant drug use in a stratified analysis 24 where concomitant use of drugs that have been associated with agranulocytosis was included as a dichotomous variable.

From January 1, 1987, through December 31, 1990, there were 923 admissions with a principal diagnosis coded as agranulocytosis (288.0) (n=859), functional disorders of polymorphonuclear neutrophils (288.1) (n=26), genetic anomalies of leukocytes (288.2) (n=2), and unspecified diseases of white blood cells (288.9) (n=36). A response was received to the request for information for 753 admissions (81.6%). In approximately 50% of the cases, all relevant information was received (ie, at least a copy of the discharge summary, and the laboratory and bone marrow results). In the remainder, the hospitals were asked for additional information, resulting in data on 678 admissions, of which 66 concerned patients who had been admitted more than once. Another 86 cases were excluded because of insufficient data (eg, no description of symptoms or leukocyte counts) and 114 were excluded because the patient was younger than 2 years or had no symptoms on admission (ie, agranulocytosis was discovered by coincidence).

The remaining 478 admissions were classified as follows: agranulocytosis probable (n=72), agranulocytosis possible (n=36), agranulocytosis unlikely (n=363), and agranulocytosis unclassifiable (n=7) ( Figure 2 ). Of the 108 admissions classified as agranulocytosis probable or agranulocytosis possible, 78 concerned adverse reactions that had their onset outside the hospital and that were the direct reason for admission. The remaining 30 consisted of reactions that occurred either in the outpatient clinic or inside the clinic during admission.

Of the admissions coded as diagnosis 288.0 (agranulocytosis), 333 (74.5% of classified admissions) were classified as agranulocytosis unlikely ( Figure 2 ). Most of these patients had been admitted with pancytopenia or a combination of leukocytopenia with anemia or thrombocytopenia caused by chemotherapy.

Only the events that occurred outside the hospital and led to admission and that were classified as agranulocytosis probable or agranulocytosis possible were used in the further analysis, as cases occurring in the hospital could not be related to the exposure data acquired from community pharmacies. This group consisted of 78 cases. Of these, 6 patients died (8%), and in an additional 6 patients the event was severe in view of development of sepsis or septic shock. Fever was present in an additional 66 patients, often with chills.

Blood cultures were performed in 65 patients, of which 39 were positive. Bone marrow was examined in 47 patients, and in 44 it confirmed the diagnosis of agranulocytosis. In 3 patients the results were inconclusive. Once the cause of agranulocytosis was discontinued, the neutrophil count recovered within 30 days in 43 patients, it did not recover within this period in 8 patients, there were no data in 22 patients, and 5 other patients died before recovery of their neutrophil count.

Five patients were admitted twice, 2 of these on separate occasions. Three patients, however, were transferred from one hospital to another for the same diagnosis. These 3 admissions were therefore excluded.

After exclusion, 75 cases remained, 30 men (median age, 48.5 years; 25th-75th percentile, 32-67 years) and 45 women (median age, 61 years; 25th-75th percentile, 42-73 years).
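The counts reported above can be tallied as a quick consistency check. The sketch below only restates numbers given in the text; the variable names are illustrative.

```python
# Admissions identified by principal diagnosis code, 1987 through 1990.
by_code = {"288.0": 859, "288.1": 26, "288.2": 2, "288.9": 36}
admissions = sum(by_code.values())                 # 923

responses = 753                                    # replies to the request for information
response_rate = 100 * responses / admissions       # about 81.6%

analysable = 678
excluded_insufficient_data = 86
excluded_young_or_asymptomatic = 114
classified = analysable - excluded_insufficient_data - excluded_young_or_asymptomatic

by_class = {"probable": 72, "possible": 36, "unlikely": 363, "unclassifiable": 7}
probable_or_possible = by_class["probable"] + by_class["possible"]

community_onset = 78                               # onset outside hospital, reason for admission
in_hospital_or_outpatient = 30
transfers_excluded = 3                             # same episode, second hospital
final_cases = community_onset - transfers_excluded
men, women = 30, 45

print(admissions, round(response_rate, 1), classified, probable_or_possible, final_cases)
```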

The incidence of agranulocytosis was estimated at 1.7 per million inhabitants in 1987, 2.2 per million in 1988, 2.5 per million in 1989, and 1.6 per million in 1990.

In the cases classified as agranulocytosis probable or agranulocytosis possible, in which the event had been the reason for admission, the main drugs used before the index day were methimazole (n=15), digoxin (n=12), prednisone (n=10), sulfasalazine (n=9), sulfamethoxazole-trimethoprim (n=8), acetaminophen including combinations (n=8), furosemide (n=6), hydrochlorothiazide with potassium-sparing drugs (n=6), levothyroxine sodium (n=5), ibuprofen (n=5), acenocoumarol (n=5), propranolol hydrochloride (n=5), and oxazepam (n=5). The prevalence of use in the reference cohort is also given in Table 1 for drug groups and for the individual drugs most frequently used before agranulocytosis. The relative risks of hospital admissions because of agranulocytosis, adjusted for age, sex, and concomitant drug use, are also shown in Table 1 , as are the etiologic fraction and excess risk for drugs for which the adjusted relative risk was significantly elevated.

This study was performed to examine the drug-related hospital admissions because of agranulocytosis in the Netherlands, with a population-based case-cohort design. Excluded admissions mostly pertained to diagnoses closely related to agranulocytosis. These admissions were all excluded from our study, since they mostly concerned hospital-acquired pancytopenia, aplastic anemia, or a combination of agranulocytosis with anemia or thrombocytopenia, which were not our topic of interest. Moreover, only admissions of patients with community-acquired agranulocytosis could be used for the relative risk and incidence estimations, as in-hospital exposure data were not available. A large group of patients developed agranulocytosis to chemotherapy, an adverse reaction that has already been studied extensively.

In this study, we were not able to assess an incidence rate of mild leukopenia, as not all such patients would have been admitted. It is likely, however, that few cases of agranulocytosis (which is mostly symptomatic) have been missed and that our study gives a fairly accurate estimation of the incidence of community-acquired symptomatic agranulocytosis. Although psychiatric hospitals were not included, symptomatic cases of agranulocytosis are almost always referred to a general hospital. It should be noted that patients could have been admitted with agranulocytosis and coded otherwise in the registry of the Dutch Centre for Health Care Information. To assess false-negative misclassification, we added 3 diagnosis codes that could have included cases of agranulocytosis, and found only 1 possible case. Thyroid inhibitors had the highest relative risk and excess risk of drug-associated agranulocytosis, but also sulfamethoxazole-trimethoprim, sulfasalazine, clomipramine hydrochloride, and dipyrone combined with analgesics were associated with high risk estimates.

For this study we used a case-cohort design. Because of the low incidence of agranulocytosis, we did not consider a cohort study a useful approach. Case-control studies are suitable for studying rare diseases, but we had several reasons for not using a case-control design. First, recall bias would have been likely. As agranulocytosis is an impressive event that patients are not likely to forget, it would not have been easy to find controls subject to the same recall of exposure as cases. Second, as drugs are a well-known cause of agranulocytosis, physicians might inquire more insistently about drug use in the index group than in the control group. Third, although case-control studies suit rare outcomes such as agranulocytosis, the low population exposure prevalence of some drugs (eg, thyroid inhibitors) could have meant that none of the controls would have been exposed to those drugs.

Theoretically, selection bias might occur if agranulocytosis to one drug is more severe than agranulocytosis to other drugs, or if patients with agranulocytosis to a particular drug are admitted more readily. However, there are no reasons to believe that agranulocytosis to orally administered methimazole or sulfasalazine has a worse prognosis than agranulocytosis to other orally administered drugs. Hence, this will mean that the proportion of community-acquired cases of agranulocytosis that leads to admission is more or less the same for these drugs. Information bias might result if physicians who anticipate an increased risk of agranulocytosis perform more blood tests. This could occur, for instance, in patients taking thyroid inhibitors, as these are a well-known cause of agranulocytosis. Therefore, we excluded all cases that were asymptomatic and discovered only because of a routine blood test. Information bias by differential recall of drug use by patients (recall bias) was not a problem because the information came from automated pharmacies and had been gathered before disease onset. In patients in whom drug use could be checked in pharmacy data or general practitioner's records, 85% of drugs mentioned in the hospital data could be confirmed. Although it was possible to obtain the filling data on most cases, it was virtually impossible to get these data during a longer episode than the risk period. Therefore, dose- and duration-related risk estimates could not be obtained. Confounding is unlikely, as apart from drugs there are few independent risk factors for agranulocytosis, and we adjusted for age, sex, and concurrent drug use.

In the IAAAS, the overall incidence of community-acquired agranulocytosis was estimated at 3.4 per million inhabitants per year, 26 which is slightly higher than the 1.6 to 2.5 per million inhabitants per year found in our study. The IAAAS has been heavily debated, since bias was thought to play a role. 15 - 18 One of the difficulties was that the rate ratio regarding dipyrone varied between regions from 0.8 to 23.7. 4 Insofar as we are aware, our study is the first that includes all admitted community-acquired cases of agranulocytosis from a whole country. Our results were comparable with those of the IAAAS with regard to the elevated risks found for thyroid inhibitors and dipyrone, although the absolute number of cases involving dipyrone was small. For thyroid inhibitors, a relative risk of 102 was found in the IAAAS (excess risk, 6.3 per million users during 1 week of exposure), 5 which is comparable with the relative risk of 115 (excess risk, 4.9 per million users during 10 days of exposure) found in the current study. Also, in the IAAAS article, the risk for methimazole seems to be higher than that for carbimazole. 5 Since carbimazole is converted to methimazole in vivo, this higher risk is difficult to explain. With regard to anti-infective agents, an elevated relative risk was found for sulfamethoxazole-trimethoprim (12; excess risk, 1.6 per million with 2 weeks of exposure) and the macrolides (excess risk, 7.1 per million). 6 Several drugs with an elevated relative risk in our study have been associated with agranulocytosis in the medical literature, including diuretics (eg, chlorthalidone), antithyroid drugs (carbimazole, methimazole, and propylthiouracil), penicillins, indomethacin, acetaminophen, dipyrone, benzodiazepines, antidepressants (eg, amitriptyline), sulfasalazine, sulfamethoxazole-trimethoprim, carbamazepine, and phenothiazines. 1 For several other drugs, eg, coumarins, digoxin, and prednisone, this was not the case, although in the IAAAS an elevated relative risk was also found for digoxin and prednisone, for which the authors had no explanation. Since agranulocytosis disappeared in our patients despite continuation of these drugs, the association with these 2 drugs is probably not causal. Clozapine, which has been studied extensively because of its ability to cause agranulocytosis, was not registered in the Netherlands during the study period.

In conclusion, we found a slightly lower cumulative yearly incidence of community-acquired agranulocytosis in the Netherlands than was found in the multicenter IAAAS. In our study, thyroid inhibitors, sulfamethoxazole-trimethoprim, sulfasalazine, clomipramine, and dipyrone combined with analgesics were associated with the highest risks of agranulocytosis.

Accepted for publication March 26, 1998.

Corresponding author: Bruno H. Ch. Stricker, MB, PhD, Pharmacoepidemiology Unit, Department of Epidemiology and Biostatistics, Erasmus University Medical School, Dr Molewaterplein 50, 3015 GE Rotterdam, the Netherlands.

  • Open access
  • Published: 03 April 2024

Addition of inflammation-related biomarkers to the CAIDE model for risk prediction of all-cause dementia, Alzheimer’s disease and vascular dementia in a prospective study

  • Kira Trares 1 ,
  • Manuel Wiesenfarth 2 ,
  • Hannah Stocker 1 ,
  • Laura Perna 3 , 4 ,
  • Agnese Petrera 5 ,
  • Stefanie M. Hauck 5 ,
  • Konrad Beyreuther 6 ,
  • Hermann Brenner 1 &
  • Ben Schöttker 1  

Immunity & Ageing volume 21, Article number: 23 (2024)

It is of interest whether inflammatory biomarkers can improve dementia prediction models, such as the widely used Cardiovascular Risk Factors, Aging and Dementia (CAIDE) model.

The Olink Target 96 Inflammation panel was assessed in a nested case-cohort design within a large, population-based German cohort study (n = 9940; age range: 50–75 years). All study participants who developed dementia over 20 years of follow-up and had complete CAIDE variable data (n = 562, including 173 Alzheimer’s disease (AD) and 199 vascular dementia (VD) cases) as well as n = 1,356 controls were selected for measurements. A total of 69 inflammation-related biomarkers were eligible for use. LASSO logistic regression and bootstrapping were utilized to select relevant biomarkers and determine areas under the curve (AUCs).

The CAIDE model 2 (including Apolipoprotein E ( APOE ) ε4 carrier status) predicted all-cause dementia, AD, and VD better than CAIDE model 1 (without APOE ε4) with AUCs of 0.725, 0.752 and 0.707, respectively. Although 20, 7, and 4 inflammation-related biomarkers were selected by LASSO regression to improve CAIDE model 2, the AUCs did not increase markedly. CAIDE models 1 and 2 generally performed better in mid-life (50–64 years) than in late-life (65–75 years) sub-samples of our cohort, but again, inflammation-related biomarkers did not improve their predictive abilities.

Conclusions

Despite a lack of improvement in dementia risk prediction, the selected inflammation-related biomarkers were significantly associated with dementia outcomes and may serve as a starting point to further elucidate the pathogenesis of dementia.

Introduction

The number of dementia cases worldwide is continuously rising and is projected to nearly double every 20 years [ 1 ]. With the approval of Aduhelm, Leqembi, and Donanemab as the first effective treatments against Alzheimer’s disease (AD) by the U.S. Food and Drug Administration (FDA), there is hope for significant advancements in AD therapy. Although the drugs' efficacy, safety, and clinical application are still controversial [ 2 , 3 , 4 , 5 ], they can be considered a first step towards an effective dementia treatment. These and future improved drugs will likely be most effective in early AD. Thus, it is vital to perform dementia risk assessments and make diagnoses early [ 6 , 7 ].

The scientific literature on dementia risk prediction has grown rapidly as new risk factors and biomarkers have been identified in recent years. However, sample sizes and follow-up durations vary widely, and external validation is often lacking [ 6 ]. The underlying study populations also differ considerably. Risk prediction models combining demographic, cognitive, physical, and health risk factors are often the best suited and most versatile [ 8 , 9 ]. The Cardiovascular Risk Factors, Aging and Dementia (CAIDE) model, which is based on data from a Finnish population-based study, is such a risk model [ 10 ]. Using several risk factors of dementia, the authors could predict the risk of developing dementia with an area under the curve (AUC) of 0.769 (95% confidence interval (CI): 0.709 – 0.829). A second model that additionally included Apolipoprotein E ( APOE ) ε4 performed slightly better (AUC [95% CI]: 0.776 [0.717 – 0.836]). The CAIDE model was internally and externally validated in many cohorts, including cohorts from high-income countries and of various ethnicities [ 11 , 12 , 13 , 14 , 15 ]. However, the performance of the model was attenuated when applied to low-income countries as well as late-life cohorts [ 16 , 17 ].

Dementia prediction models, including the CAIDE model, do not contain inflammatory biomarkers, although inflammation is a critical mechanism contributing to dementia pathogenesis [ 18 ]. Previously, we showed that most of the 92 inflammation-related biomarkers of the Olink Target 96 inflammation panel were significantly associated with all-cause dementia [ 19 ].

In this study, we fitted the CAIDE model to a large prospective cohort study and aimed to assess the potential of improving its ability to predict dementia risk by including inflammation-related biomarkers. Separate models were created for all-cause dementia, AD, and vascular dementia (VD), as well as for a mid-life and a late-life population.

Methods

Study population

This study was based on data from the ESTHER study. The ESTHER study (Epidemiologische Studie zu Chancen der Verhütung, Früherkennung und optimierten Therapie chronischer Erkrankungen in der älteren Bevölkerung [German]) is a prospective cohort study conducted in Saarland, Germany. Participants were recruited during a general health checkup at their general practitioners (GP) between 2000 and 2002 and were followed up 2, 5, 8, 11, 14, 17, and 20 years after baseline. The study comprises 9940 men and women between 50 and 75 years. Details have been described elsewhere [ 20 ]. Sociodemographic baseline characteristics were similarly distributed in the respective age categories as in a German National Health Survey conducted in a representative sample of the German population around the time of recruitment [ 20 ]. The study was approved by the ethics committees of the Medical Faculty of Heidelberg and the state medical board of Saarland, Germany.

Dementia ascertainment and case-cohort design sample

Dementia information was collected during the 14-, 17-, and 20-year follow-up (median (interquartile range) follow-up time: 16.3 years (13.5–17.0 years)) via standardized questionnaires sent to the GPs of the ESTHER study’s participants. In this questionnaire, the GPs were asked whether dementia has been diagnosed among their patients and, if so, to provide all medical records from neurologists, psychiatrists, memory clinics, or other specialized providers. This query was also sent to the GPs of study participants who had already dropped out due to ill health or death. Overall, information on whether dementia was diagnosed during 20 years of follow-up or not could be ascertained for n  = 6,466 study participants (65% of the original cohort). A flowchart of the study population is shown in Fig. 1 .

Figure 1. Flowchart of dementia ascertainment during the 14-, 17-, and 20-year follow-up of the ESTHER study and study participant selection. Abbreviations: GP, general practitioner.

After excluding subjects with missing blood samples ( n  = 73) from participants with ascertained dementia information, 6,297 participants were eligible to be drawn for the case-cohort sample and measurements of the Olink Target 96 inflammation panel. The randomly selected sample consisted of 1,611 study participants, of whom 115 were diagnosed with dementia during follow-up. Among the remaining 4,686 study participants not randomly selected, 541 were incident dementia cases and added to the data set as well, resulting in 656 dementia cases overall. However, due to quality control warnings during the biomarker measurements, 75 participants were additionally excluded. Participants with missing data for any of the aforementioned CAIDE model variables were further excluded ( n  = 159). For the last exclusion step, we compared the data of included and excluded participants with respect to age, sex, and education, and no indication of selection bias was detected (Supplemental Table  1 ). The final sample included a total of 562 dementia cases and 1,356 controls.
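The sample counts in the paragraph above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only numbers stated in this section. The variable names are illustrative, and the split of the 234 exclusions into cases and non-cases is derived here rather than reported in the text.

```python
# Counts taken from the case-cohort description; names are illustrative.
eligible = 6297                 # participants eligible for the case-cohort draw
random_subcohort = 1611         # randomly selected sample
cases_in_subcohort = 115        # dementia cases inside the random sample
cases_outside_subcohort = 541   # incident cases added from the remainder

not_selected = eligible - random_subcohort                      # 4686
cases_before_exclusions = cases_in_subcohort + cases_outside_subcohort  # 656
measured = random_subcohort + cases_outside_subcohort           # 2152

excluded_qc = 75                # quality control warnings
excluded_missing_caide = 159    # missing CAIDE model variables
final_sample = measured - excluded_qc - excluded_missing_caide  # 1918

final_cases = 562
final_controls = 1356

# Derived (not reported in the text): how the exclusions split by status.
excluded_cases = cases_before_exclusions - final_cases                            # 94
excluded_controls = (random_subcohort - cases_in_subcohort) - final_controls      # 140

assert not_selected == 4686
assert cases_before_exclusions == 656
assert final_sample == final_cases + final_controls
assert excluded_cases + excluded_controls == excluded_qc + excluded_missing_caide

print(f"final sample: {final_sample} ({final_cases} cases, {final_controls} controls)")
```

All reported totals are mutually consistent under this reading of the paragraph.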

Origin, assessment and modifications of the CAIDE model

The CAIDE model originates from the CAIDE study, a population-based cohort study from Finland assessing cardiovascular risk factors, aging, and dementia [ 21 ]. For the development of the CAIDE model, 1,409 participants aged between 39 and 64 years of the original CAIDE study were included [ 10 ]. Of those, 61 developed dementia during 20 years of follow-up. CAIDE model 1 consists of the variables age, education, sex, systolic blood pressure, body mass index (BMI), total cholesterol, and physical activity, while CAIDE model 2 additionally includes APOE ε4 status.

In the ESTHER study, the CAIDE model variables age, sex, education, body mass index (BMI), and physical activity of participants were assessed during the baseline assessment by standardized self-administered questionnaires. The systolic blood pressure of participants was measured at baseline by the GP. Total cholesterol levels were measured from serum samples by an enzymatic colorimetric test with the Synchron LX multicalibrator system (Beckman Coulter, Galway, Ireland). APOE genotypes were determined by TaqMan single-nucleotide polymorphism (SNP) genotyping assays (Applied Biosystems, California, USA). Endpoint allelic discrimination reads were used to analyze genotypes with the Bio-RAD CFX Connect System (Bio-Rad Laboratories, CA, USA). In the case of missing directly genotyped APOE data ( n  = 70), imputed quality-controlled data was used. For details, see Stocker et al. 2020 [ 22 ].

All variables used in the CAIDE model were available, but the model needed to be recalibrated because the ESTHER cohort differs from the CAIDE study in age range, school education history, and physical activity assessment. Fractional polynomials were utilized to determine the best-fitting function of the continuous variables in the prediction of all-cause dementia, AD, and VD [ 23 ] (data not shown). Because the linear function fitted best for systolic blood pressure and BMI, they were kept as continuous variables. Although the best-fitting function was x^(−2) for age and total cholesterol for all-cause dementia and VD, they were still modelled with the linear function because the difference in model fit was small. Education, physical activity, and APOE genotypes were dichotomized by combining categories with very similar odds ratios (ORs) for the association with all-cause dementia (data not shown).

Measurement of inflammation-related biomarkers

Levels of inflammation-related proteins were measured in baseline serum samples using the Olink Target 96 inflammation panel (Olink Proteomics, Uppsala, Sweden). Details are described in Supplemental Text 1 . In addition, a list of all biomarkers is depicted in Supplemental Table 2 . 

Statistical analyses

The associations of the CAIDE model variables with the outcomes of all-cause dementia, AD, and VD were determined by a multivariate logistic regression model adjusted for age, education, sex, systolic blood pressure, BMI, total cholesterol, physical activity, and APOE ε4 status.

The predictive accuracy of the CAIDE model, including baseline variables and the inflammatory biomarkers measured from baseline serum samples, was assessed for dementia diagnoses collected over 20 years of follow-up, using least absolute shrinkage and selection operator (LASSO) logistic regression models. LASSO is a penalized regression method that shrinks coefficients and sets those of variables that are not useful for the prediction to zero, thereby excluding them [ 24 ]. This makes the final equation simpler and easier to interpret. The CAIDE model variables were defined as not being penalized by the LASSO regression and thus forced into the model. In a sensitivity analysis, all variables were penalized. The penalty parameter λ was determined by five-fold cross-validation. The AUCs and 95% CIs were estimated using 500 bootstrap samples for the CAIDE model and the CAIDE model + inflammatory biomarkers, with all-cause dementia, AD, and VD as the outcome, respectively. While the CAIDE model only included the CAIDE model variables, the CAIDE model + inflammatory biomarkers additionally included those of the 69 inflammation-related biomarkers selected by the LASSO regression. Moreover, we distinguished CAIDE models 1 and 2, with only the latter including APOE ε4 carrier status among the unpenalized CAIDE model variables. To determine if the differences between the CAIDE model and the CAIDE model + inflammatory biomarkers were statistically significant, bootstrap intervals for the differences in AUCs were computed. This involves calculating the AUC difference between the two models for every bootstrap sample, sorting these differences, and assessing the observed AUC difference against their distribution. The probability of a variable being selected by the LASSO regression was additionally determined using bootstrap inclusion frequencies [ 25 , 26 ], which indicate how often each variable was selected across the bootstrap samples. High inclusion frequencies indicate a consistent contribution of the respective variables to the model’s performance.
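The bootstrap comparison of two models' AUCs described above can be illustrated as follows. This is a minimal sketch in plain Python, not the authors' R code. It assumes that predicted risks from both models are already available for each participant and uses a simple percentile interval. It does not refit the models within each resample, which the original procedure may have done. The function names and toy data are hypothetical.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_difference(scores_a, scores_b, labels, n_boot=500, alpha=0.05, seed=0):
    """Observed AUC(b) - AUC(a) and a percentile bootstrap interval for it."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # resample lacks cases or controls; draw again
        a = auc([scores_a[i] for i in idx], y)
        b = auc([scores_b[i] for i in idx], y)
        diffs.append(b - a)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    observed = auc(scores_b, labels) - auc(scores_a, labels)
    return observed, (lower, upper)

# Toy data (made up): 1 = dementia case, 0 = control.
labels = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
base_scores = [0.10, 0.30, 0.45, 0.20, 0.40, 0.70, 0.35, 0.80, 0.60, 0.55]
extended_scores = [0.15, 0.25, 0.40, 0.20, 0.50, 0.75, 0.30, 0.85, 0.55, 0.60]

observed, (lower, upper) = bootstrap_auc_difference(base_scores, extended_scores, labels)
print(f"AUC difference: {observed:.3f}, 95% bootstrap interval: [{lower:.3f}, {upper:.3f}]")
```

An interval that contains zero would be read as no statistically significant difference in discrimination between the two models.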

Besides calculations for the total sample, the models' discrimination performance was also evaluated in subgroups for mid-life (50–64 years) and late-life (65–75 years) for all three dementia outcomes and CAIDE model 1 and CAIDE model 2.

The Statistical Analysis System (SAS, version 9.4, Cary, North Carolina, USA) was used for multivariate logistic regression. Statistical tests were two-sided, using an alpha level of 0.05. LASSO regression was performed using the R package “glmnet” (R, version 3.6.3; glmnet package version 4.1–2) [ 27 ]. For AUC computation and bootstrapping, the R package ModelGood (R, version 3.6.3; ModelGood package version 1.0.9) was used [ 28 ].

Results

Table 1 shows the CAIDE model variables of all included study participants separately for all-cause dementia ( n  = 562), AD ( n  = 173), and VD ( n  = 199) cases, as well as healthy controls ( n  = 1356). Most all-cause dementia cases were represented in the late-life sub-sample (63.2%). Furthermore, the proportion of subjects with more than the basic school education of 9 years was larger among controls (23.6%) than among the all-cause dementia cases (20.3%). Slightly more females than males were included among both cases (53.7%) and controls (54.7%). Mean values for systolic blood pressure, BMI, and total cholesterol levels were comparable between all-cause dementia cases and controls. In addition, all-cause dementia cases included a higher proportion of physically inactive participants (26.0% compared to 17.6%) and a much higher proportion of APOE ε4 carriers than controls (39.5% compared to 24.3%). In a multivariate logistic regression model, only age, total cholesterol (inversely), physical activity (inversely) and APOE genotype were statistically significantly associated with all-cause dementia (Supplemental Table  3 ). In the model for AD (Supplemental Table  4 ), BMI was additionally significant and total cholesterol lost statistical significance in CAIDE model 1. In the model for VD (Supplemental Table  5 ), physical activity was not statistically significant. Age and APOE genotype were statistically significantly associated with all dementia outcomes.

Table 2 shows the discriminative performances of various prediction models for all-cause dementia, AD, and VD. All CAIDE models had a high discriminative performance in the total cohort with an AUC ≥ 0.7 (Fig.  2 ). However, inflammatory biomarkers selected by the LASSO logistic regression did not improve the models’ discriminative performance. The inflammation-related biomarkers selected by LASSO regression are shown in Table  3 . In total, 20, 7, and 4 inflammatory biomarkers were added to the CAIDE model 2 for all-cause dementia, AD, and VD, respectively. The selected biomarkers differed between the outcomes but were similar for CAIDE model 1 and 2 for each outcome. The β-coefficients of all variables needed to calculate risk scores for the CAIDE + inflammatory biomarkers models and bootstrap inclusion frequencies for all-cause dementia, AD and VD can be found in Supplemental Tables 6 , 7 , 8  respectively. Bootstrap inclusion frequencies showed a relatively clear cutoff for variables selected by LASSO compared to non-selected ones (data not shown).
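For readers unfamiliar with how a risk score is obtained from the β-coefficients of a logistic model, the calculation can be sketched as below. This is an illustration only. The intercept, coefficient values and predictor names are hypothetical placeholders, not the fitted values from the supplemental tables, and the function `predicted_risk` is not part of the study's code.

```python
import math

# HYPOTHETICAL values for illustration only; the fitted coefficients are in the
# paper's supplemental tables and are not reproduced here.
INTERCEPT = -6.0
COEFFICIENTS = {
    "age_years": 0.08,
    "apoe_e4_carrier": 0.7,     # 1 = carrier, 0 = non-carrier
    "physically_active": -0.4,  # 1 = active, 0 = inactive
}

def predicted_risk(intercept, coefficients, values):
    """Logistic-model probability: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    missing = set(coefficients) - set(values)
    if missing:
        raise KeyError(f"missing predictor values: {sorted(missing)}")
    linear_predictor = intercept + sum(
        beta * values[name] for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))

younger = predicted_risk(
    INTERCEPT, COEFFICIENTS,
    {"age_years": 55, "apoe_e4_carrier": 0, "physically_active": 1},
)
older_carrier = predicted_risk(
    INTERCEPT, COEFFICIENTS,
    {"age_years": 70, "apoe_e4_carrier": 1, "physically_active": 0},
)
print(f"younger non-carrier: {younger:.3f}, older carrier: {older_carrier:.3f}")
```

Because cases are oversampled in a case-cohort sample, probabilities computed this way are best read as a ranking score rather than as absolute population risks.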

Figure 2. ROC curves of the all-cause dementia, Alzheimer’s disease, and vascular dementia risk prediction models for the total cohort. ROC curves for CAIDE model 1 (including age, education, sex, systolic blood pressure, BMI, total cholesterol, and physical activity) and CAIDE model 2 (additionally including APOE ε4 carrier status) are depicted in black, while curves of the CAIDE models plus inflammatory biomarkers chosen by LASSO regression (cf. Table 3) are depicted in grey. AUCs and 95% bootstrap confidence intervals are provided with the respective graphs. The AUCs were obtained in a nested case-cohort study with n = 1,356 healthy controls and n = 562, n = 173, and n = 199 cases for all-cause dementia, Alzheimer’s disease, and vascular dementia, respectively. Abbreviations: BMI, body mass index; APOE, apolipoprotein E; LASSO, least absolute shrinkage and selection operator.

Compared with CAIDE model 1, CAIDE model 2 improved prediction more for AD and all-cause dementia than for VD. Overall, the highest discriminative performance of all models was achieved for AD with CAIDE model 2 without inflammatory biomarkers (AUC [95% CI]: 0.752 [0.704–0.798]).

In a further step, we split the cohort into a mid-life (50–64 years) and late-life (65–75 years) sub-sample. A clear difference in dementia prediction between the age groups became apparent (Table  2 , Supplemental Figs.  1   and  2 ). While the AUCs for the various models for all-cause dementia, AD, and VD varied between 0.665 and 0.751 in the mid-life sample, AUCs in the late-life sample were consistently lower and ranged between 0.547 and 0.651. Inflammatory biomarkers selected by the LASSO regression did not lead to improvements in the models' AUCs, neither in the mid-life nor the late-life subsample. The inflammatory biomarkers selected by the LASSO regression and the β-coefficients for their associations with all-cause dementia, AD and VD, as well as the other CAIDE variables needed to calculate the risk prediction models and bootstrap inclusion frequencies, are shown in Supplemental Tables 9 , 10 , 11 for the mid-life and Supplemental Tables 12 , 13 , 14 for the late-life sample, respectively. Comparable to the total cohort, the highest AUCs were achieved for AD when the inflammatory biomarkers were not included in CAIDE model 2 (AUC [95% CI]: 0.751 [0.678–0.830] and 0.651 [0.566–0.724] for the mid-life and late-life sample, respectively).

In a sensitivity analysis, we penalized not only the Olink inflammation biomarkers but also the variables of CAIDE model 1 in the LASSO regression. This analysis was conducted as an example for CAIDE model 1 and the outcome of all-cause dementia. Interestingly, all CAIDE model variables except sex and education were selected, and the same list of inflammatory biomarkers was chosen, with only one addition (CXCL5). In addition, the AUC of this sensitivity analysis (0.703 [0.674–0.734]) was almost identical to the one from the main analysis (0.702 [0.669–0.732]).

Discussion

In this prospective cohort study, we aimed to explore the potential for improving the predictive ability of the CAIDE model by including the serum levels of inflammation-related proteins. Although several biomarkers were selected by LASSO regression for addition to the CAIDE model for the prediction of all-cause dementia, AD, and VD, the AUCs did not improve. Nevertheless, these are still important findings in this research field.

Previous studies

In previous studies, the CAIDE score showed good external validity in five cohorts without any adjustments to the model [ 11 , 12 , 14 , 15 , 29 ]. All of them reported a similar discriminative performance of the score. Moreover, a recent Cochrane review performed a meta-analysis of three studies externally validating the CAIDE model [ 30 ]. Overall, the meta-analysis revealed a good predictive ability of the CAIDE model (AUC [95% CI]: 0.71 [0.66–0.76]). However, the authors expressed concerns about the certainty of the underlying data. In addition, the CAIDE risk score was evaluated as a tool for dementia risk prediction in different ethnicities and showed good predictive ability in subgroups of Asian and Black participants [ 11 ]. However, its predictive performance was poor in cohorts of Hispanic/Latino Americans and Japanese American men [ 13 , 31 ]. Furthermore, Stephan and colleagues recently showed that the CAIDE score has poor predictive ability in low- and middle-income countries (0.52 ≤ c ≤ 0.63) [ 17 ]. Poor performance of the CAIDE model was also observed in late-life samples in previous studies by Anstey et al. and Kivimäki et al. [ 16 , 29 ] and Fayosse et al. [ 12 ]. The latter showed that the CAIDE model only significantly predicted dementia at a mean age of 55 but not at 60 or 65 years, when examining participants separately. Thus, despite its unquestionable merits, improvements of the CAIDE score are needed.

To our knowledge, four modifications of the CAIDE score are available: Tolea and colleagues designed a modified version of the CAIDE score (mCAIDE) to simplify the application of the model in a community-based setting [ 32 ]. Therefore, laboratory measurements of cholesterol levels were replaced by self-reported information about high cholesterol levels (yes or no). In addition, physical activity assessment was replaced by the mini Physical Performance Testing (mPPT). The mCAIDE score was first applied to a cohort of 230 community-dwelling older adults in which it slightly improved the discrimination between cognitively impaired and unimpaired individuals (AUC mCAIDE: 0.78 [0.71–0.85], AUC CAIDE: 0.71 [0.61–0.80]). Afterwards, the score was additionally validated in an independent clinical cohort of 219 participants and demonstrated to discriminate well between different stages of dementia.

Exalto and colleagues aimed to improve the predictive performance of the CAIDE score by including diabetes mellitus, depressed mood, head trauma, central obesity, lung function, and smoking as additional mid-life risk factors [ 11 ]. However, the added variables did not improve its predictive abilities.

Harrison and colleagues tested if adding a composite score of two biomarkers of inflammation (interleukin-6 and C-reactive protein) and one of oxidative stress (homocysteine) to the CAIDE score would improve the ability to predict cognitive decline for study participants of two cohorts aged 85 years or older [ 33 ]. Adding the biomarkers to the CAIDE score increased the hazard ratio (HR) for comparison of a high- and low-risk group from 1.14 (95% CI: 0.64–2.03, p  = 0.65) to 1.96 (1.27–3.42, p  = 0.02) in the first cohort and from 1.64 (1.04–2.58, p  = 0.03) to 1.89 (1.18–3.02, p  = 0.08) in the second cohort.

Finally, Geethadevi and colleagues applied the CAIDE model and two other dementia risk prediction models to an Australian cohort study, compared their predictive ability, and created a hybrid model including several variables of all three models chosen by a machine learning algorithm [ 34 ]. The CAIDE model showed the lowest predictive ability for dementia of all models in this cohort of 3360 participants (AUC [95% CI]: 0.54 [0.49–0.58]). Nonetheless, the created hybrid model included all variables of the CAIDE model as well as history of head injury, depression, diabetes mellitus, smoking status, alcohol consumption, social activity, cognitive activity, fish intake, history of coronary artery disease (CAD), and APOE ε4. With this set of variables, the authors achieved an AUC of 0.80 (95% CI: 0.78–0.83). However, the hybrid model still lacks external validation.

Interpretation of findings

Compared to the original CAIDE model, the predictive ability in our study was lower but still good (AUCs of 0.769 and 0.776 for CAIDE model 1 and 2 in the original study compared to 0.702 and 0.725, respectively, for all-cause dementia in our study). In agreement with previous studies, we also observed a better predictive ability of the CAIDE model in mid-life than in late-life [ 12 , 16 , 29 ]. However, since it is more important to have suitable dementia risk assessment tools in mid-life than in late-life this is not critical. Targeting dementia risk factors in mid-life has a greater potential to prevent or delay the onset of the disease.

Although inflammation is considered to have a crucial role in dementia pathogenesis [ 35 , 36 ], the discriminative ability of the CAIDE model did not increase when the inflammation-related biomarkers were added – neither in the total sample nor in the mid-life nor late-life sub-sample. This suggests that the variables included in the CAIDE model are already strong dementia predictors capturing the predictive ability of inflammatory biomarkers because there is some conceptual overlap (e.g., between age and inflammation or between low physical activity and inflammation). Apart from this, due to the long follow-up duration in our study and the single measurement at baseline, it is possible that the biomarker measurements only reflect a beginning inflammatory response of the immune system to early dementia onset and are not predictive for clinical dementia diagnoses in the long run.

Despite the lack of an added predictive value by the biomarkers, these results are still important for this research field. First, they underscore the robustness of the CAIDE model, which already encompasses key risk factors for dementia. CAIDE model 2, which comprises the APOE ε4 carrier status, reached the highest AUC without including the inflammation-related biomarkers. This is essential information for researchers aiming to improve the predictive abilities of the CAIDE and other dementia risk prediction models since it might be more promising to spend the time and resources on testing biomarkers addressing other aspects of dementia etiology.

Moreover, the inflammatory biomarkers chosen by LASSO regression might shed more light on the biological mechanisms underlying dementia pathogenesis. Notably, EN-RAGE and latency-associated peptide transforming growth factor beta-1 (LAP TGF-beta 1) were among the biomarkers chosen by LASSO regression for all-cause dementia, AD, and VD. EN-RAGE also showed the highest and most consistent bootstrap inclusion frequencies of > 73% for all outcomes (total cohort). The biomarker vascular endothelial growth factor-A (VEGF-A) was additionally chosen for all-cause dementia. In a previous analysis with our case-cohort sample from the ESTHER study, we showed that these biomarkers were independently associated with at least one of the outcomes and discussed the potential mechanisms involving different aspects of dementia pathogenesis, namely neurodegeneration (EN-RAGE), amyloid beta (Aß) deposition (LAP TGF-beta 1), and blood brain barrier permeability (VEGF-A) [ 37 ].

Strengths and limitations

This study is characterized by the prospective cohort design, a long follow-up period of 20 years, its large sample size and its representativeness of the German healthcare setting. In addition, appropriate measures were taken to prevent overfitting of the developed models by applying LASSO logistic regression and bootstrapping [ 24 , 28 ].

In the ESTHER study, dementia diagnoses are collected in a community-based setting. Although diagnoses were collected from medical records, a thorough assessment of subtypes is often lacking in the community setting. This might also explain the comparatively low proportion of AD cases. However, the most important outcome for dementia risk assessment in the community setting is all-cause dementia. Moreover, due to a different age structure, education system, and physical activity assessment in the ESTHER study compared to the CAIDE study, the CAIDE model needed to be refitted. This hampers a direct comparison to the results of the CAIDE model. Due to cost reasons, biomarker measurements were conducted in a case-cohort study design rather than a cohort design using the entire study population. In addition, biomarker measurements could only be performed once in baseline blood samples rather than in follow-up samples. This limitation may have resulted in an underestimation of the AUC because the inflammation status could change during follow-up. Finally, the results of this study originate from a study population that consists mainly of participants of European descent aged 50 to 75 years. Hence, the results might not be generalizable to other populations.

Conclusions

This large, prospective cohort study showed that adding inflammation-related, blood-based biomarkers to the CAIDE model does not improve the model’s discriminative ability for all-cause dementia, AD, or VD. Nevertheless, as previously shown, the biomarkers selected by LASSO regression were significantly associated with the assessed outcomes and could thus serve as a starting point to further elucidate the pathogenesis of dementia. Other factors, less conceptually related to the variables already included in the CAIDE model, should be included in future studies to improve its predictive value.

Availability of data and materials

The data that support the findings of this study are not openly available due to reasons of sensitivity and are available from the corresponding author upon reasonable request. Data are located in controlled access data storage at the German Cancer Research Center.

Abbreviations

AD: Alzheimer’s disease
APOE: Apolipoprotein E
Aß: Amyloid beta
AUC: Area under the curve
BMI: Body mass index
CAD: Coronary artery disease
CAIDE: Cardiovascular Risk Factors, Aging and Dementia
CI: Confidence interval
EN-RAGE: Protein S100-A12
ESTHER: Epidemiologische Studie zu Chancen der Verhütung, Früherkennung und optimierten Therapie chronischer Erkrankungen in der älteren Bevölkerung [German]
FDA: Food and Drug Administration
GP: General practitioners
HR: Hazard ratio
LAP TGF-beta 1: Latency-associated peptide transforming growth factor beta-1
LASSO: Least absolute shrinkage and selection operator
mCAIDE: Modified version of the CAIDE score
mPPT: Mini Physical Performance Testing
SNP: Single-nucleotide polymorphism
VD: Vascular dementia
VEGF-A: Vascular endothelial growth factor-A

Prince M, Wimo A, Guerchet M, Ali G-C, Wu Y-T, Prina M. World Alzheimer Report 2015 - The Global Impact of Dementia an Analysis of Prevalence, Incidence, Cost, and Trends. In: Alzheimers Dis Int. 2015. https://www.alzint.org/u/WorldAlzheimerReport2015.pdf . Accessed 07 Aug 2023.

Mahase E. Aducanumab: European agency rejects Alzheimer’s drug over efficacy and safety concerns. BMJ. 2021;375:n3127.

Article   PubMed   Google Scholar  

Alexander GC, Emerson S, Kesselheim AS. Evaluation of Aducanumab for Alzheimer Disease: Scientific Evidence and Regulatory Review Involving Efficacy, Safety, and Futility. JAMA. 2021;325(17):1717–8.

Perneczky R, Jessen F, Grimmer T, Levin J, Flöel A, Peters O, et al. Anti-amyloid antibody therapies in Alzheimer’s disease. Brain. 2023;146(3):842–9.

Cummings J. Anti-Amyloid Monoclonal Antibodies are Transformative Treatments that Redefine Alzheimer’s Disease Therapeutics. Drugs. 2023;83(7):569–76.

Article   CAS   PubMed   PubMed Central   Google Scholar  

Hou X-H, Feng L, Zhang C, Cao X-P, Tan L, Yu J-T. Models for predicting risk of dementia: a systematic review. J Neurol Neurosurg Psychiatry. 2019;90(4):373–9.

Goerdten J, Čukić I, Danso SO, Carrière I, Muniz-Terrera G. Statistical methods for dementia risk prediction and recommendations for future work: A systematic review. Alzheimer’s & Dementia: Translational Research & Clinical Interventions. 2019;5(1):563–9.

Article   Google Scholar  

Tang EYH, Harrison SL, Errington L, Gordon MF, Visser PJ, Novak G, et al. Current Developments in Dementia Risk Prediction Modelling: An Updated Systematic Review. PLoS ONE. 2015;10(9):e0136181.

Article   PubMed   PubMed Central   Google Scholar  

Stephan BCM, Kurth T, Matthews FE, Brayne C, Dufouil C. Dementia risk prediction in the population: are screening models accurate? Nat Rev Neurol. 2010;6(6):318–26.

Kivipelto M, Ngandu T, Laatikainen T, Winblad B, Soininen H, Tuomilehto J. Risk score for the prediction of dementia risk in 20 years among middle aged people: a longitudinal, population-based study. The Lancet Neurology. 2006;5(9):735–41.

Exalto LG, Quesenberry CP, Barnes D, Kivipelto M, Biessels GJ, Whitmer RA. Midlife risk score for the prediction of dementia four decades later. Alzheimers Dement. 2014;10(5):562–70.

Fayosse A, Nguyen D-P, Dugravot A, Dumurgier J, Tabak AG, Kivimäki M, et al. Risk prediction models for dementia: role of age and cardiometabolic risk factors. BMC Med. 2020;18(1):107.

Torres S, Alexander A, O’Bryant S, Medina LD. Cognition and the Predictive Utility of Three Risk Scores in an Ethnically Diverse Sample. J Alzheimers Dis. 2020;75(3):1049–59.

Licher S, Yilmaz P, Leening MJG, Wolters FJ, Vernooij MW, Stephan BCM, et al. External validation of four dementia prediction models for use in the general community-dwelling population: a comparative analysis from the Rotterdam Study. Eur J Epidemiol. 2018;33(7):645–55.

Virta JJ, Heikkilä K, Perola M, Koskenvuo M, Räihä I, Rinne JO, et al. Midlife cardiovascular risk factors and late cognitive impairment. Eur J Epidemiol. 2013;28(5):405–16.

Anstey KJ, Cherbuin N, Herath PM, Qiu C, Kuller LH, Lopez OL, et al. A Self-Report Risk Index to Predict Occurrence of Dementia in Three Independent Cohorts of Older Adults: The ANU-ADRI. PLoS ONE. 2014;9(1):e86141.

Stephan BCM, Pakpahan E, Siervo M, Licher S, Muniz-Terrera G, Mohan D, et al. Prediction of dementia risk in low-income and middle-income countries (the 10/66 Study): an independent external validation of existing models. Lancet Glob Health. 2020;8(4):e524–35.

Walker KA, Ficek BN, Westbrook R. Understanding the Role of Systemic Inflammation in Alzheimer’s Disease. ACS Chem Neurosci. 2019;10(8):3340–2.


Trares K, Bhardwaj M, Perna L, Stocker H, Petrera A, Hauck SM, et al. Association of the inflammation-related proteome with dementia development at older age: results from a large, prospective, population-based cohort study. Alzheimer’s Research & Therapy. 2022;14(1):128.


Stocker H, Beyer L, Trares K, Perna L, Rujescu D, Holleczek B, et al. Association of Kidney Function With Development of Alzheimer Disease and Other Dementias and Dementia-Related Blood Biomarkers. JAMA Netw Open. 2023;6(1):e2252387.

Kivipelto M, Helkala E-L, Hänninen T, Laakso MP, Hallikainen M, Alhainen K, et al. Midlife vascular risk factors and late-life mild cognitive impairment: a population-based study. Neurology. 2001;56(12):1683–9.


Stocker H, Perna L, Weigl K, Möllers T, Schöttker B, Thomsen H, et al. Prediction of clinical diagnosis of Alzheimer’s disease, vascular, mixed, and all-cause dementia by a polygenic risk score and APOE status in a community-based cohort prospectively followed over 17 years. Molecular Psychiatry. 2021;26(10):5812–22.

Royston P, Sauerbrei W. Building multivariable regression models with continuous covariates in clinical epidemiology–with an emphasis on fractional polynomials. Methods Inf Med. 2005;44(4):561–71.

Tibshirani R. Regression Shrinkage and Selection via the Lasso. J Roy Stat Soc: Ser B (Methodol). 1996;58(1):267–88.


Heinze G, Wallisch C, Dunkler D. Variable selection – A review and recommendations for the practicing statistician. Biom J. 2018;60(3):431–49.

Sauerbrei W, Perperoglou A, Schmid M, Abrahamowicz M, Becher H, Binder H, et al. State of the art in selection of variables and functional forms in multivariable analysis—outstanding issues. Diagnostic and Prognostic Research. 2020;4(1):3.

Friedman JH, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010;33(1):1–22.

Gerds TA. ModelGood: Validation of risk prediction models. R package version 1.0.9. 2015. https://cran.r-project.org/web/packages/ModelGood/index.html.

Kivimäki M, Livingston G, Singh-Manoux A, Mars N, Lindbohm JV, Pentti J, et al. Estimating Dementia Risk Using Multifactorial Prediction Models. JAMA Netw Open. 2023;6(6):e2318132.

Mohanannair Geethadevi G, Quinn TJ, George J, Anstey KJ, Bell JS, Sarwar MR, Cross AJ. Multi‐domain prognostic models used in middle‐aged adults without known cognitive impairment for predicting subsequent dementia. Cochrane Database Syst Rev. 2023;6:CD014885.

Chosy EJ, Edland SD, Gross N, Meyer MJ, Liu CY, Launer LJ, et al. The CAIDE Dementia Risk Score and the Honolulu-Asia Aging Study. Dement Geriatr Cogn Disord. 2019;48(3–4):164–71.

Tolea MI, Heo J, Chrisphonte S, Galvin JE. A Modified CAIDE Risk Score as a Screening Tool for Cognitive Impairment in Older Adults. J Alzheimers Dis. 2021;82:1755–68.

Harrison SL, de Craen AJM, Kerse N, Teh R, Granic A, Davies K, et al. Predicting Risk of Cognitive Decline in Very Old Adults Using Three Models: The Framingham Stroke Risk Profile; the Cardiovascular Risk Factors, Aging, and Dementia Model; and Oxi-Inflammatory Biomarkers. J Am Geriatr Soc. 2017;65(2):381–9.

Geethadevi GM, Peel R, Bell JS, Cross AJ, Hancock S, Ilomaki J, et al. Validity of three risk prediction models for dementia or cognitive impairment in Australia. Age Ageing. 2022;51(12):afac307.

Kinney JW, Bemiller SM, Murtishaw AS, Leisgang AM, Salazar AM, Lamb BT. Inflammation as a central mechanism in Alzheimer’s disease. Alzheimer’s & dementia (New York, N Y). 2018;4:575–90.

Raz L, Knoefel J, Bhaskar K. The neuropathology and cerebrovascular mechanisms of dementia. J Cereb Blood Flow Metab. 2016;36(1):172–86.

Trares K, Bhardwaj M, Perna L, Stocker H, Petrera A, Hauck SM, et al. Association of the inflammation-related proteome with dementia development at older age: results from a large, prospective, population-based cohort study. Alzheimer’s Res Ther. 2022;14(1):128.

Ranstam J, Cook JA. LASSO regression. Br J Surg. 2018;105(10):1348.


Acknowledgements

Not applicable.

Open Access funding enabled and organized by Projekt DEAL. Financial support for research staff involved in this project was granted by the Baden-Württemberg State Ministry of Science, Research and Arts (Stuttgart, Germany), the Robert-Bosch-Stiftung (Stuttgart, Germany) and the Klaus-Tschira-Stiftung gGmbH (Heidelberg, Germany). The ESTHER study was funded by grants from the Baden-Württemberg state Ministry of Science, Research and Arts (Stuttgart, Germany), the Federal Ministry of Education and Research (Berlin, Germany), the Federal Ministry of Family Affairs, Senior Citizens, Women and Youth (Berlin, Germany), and the Saarland state ministry for Social Affairs, Health, Women and Family Affairs (Saarbrücken, Germany).

Author information

Authors and affiliations.

Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Im Neuenheimer Feld 581, Heidelberg, 69120, Germany

Kira Trares, Hannah Stocker, Hermann Brenner & Ben Schöttker

Division of Biostatistics, German Cancer Research Center, Heidelberg, Germany

Manuel Wiesenfarth

Department of Genes and Environment, Max Planck Institute of Psychiatry, Kraepelinstraße 2-10, Munich, 80804, Germany

Laura Perna

Division of Mental Health of Older Adults, Department of Psychiatry and Psychotherapy, University Hospital, LMU Munich, Munich, 80336, Germany

Metabolomics and Proteomics Core, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Neuherberg, Germany

Agnese Petrera & Stefanie M. Hauck

Network Aging Research, Heidelberg University, Bergheimer Straße 20, Heidelberg, 69115, Germany

Konrad Beyreuther


Contributions

K.T., M.W., and B.S. contributed to the conception and design of the study. K.T., H.B., and B.S. contributed to the acquisition and analysis of data. K.T., M.W., H.S., L.P., A.P., S.M.H., K.B., H.B., and B.S. contributed to drafting the text, figures, or tables.

Corresponding author

Correspondence to Ben Schöttker .

Ethics declarations

Ethics approval and consent to participate.

The study was approved by the ethics committees of Heidelberg University and the state medical board of Saarland, Germany. Written informed consent was obtained from all participants in the study.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Trares, K., Wiesenfarth, M., Stocker, H. et al. Addition of inflammation-related biomarkers to the CAIDE model for risk prediction of all-cause dementia, Alzheimer’s disease and vascular dementia in a prospective study. Immun Ageing 21, 23 (2024). https://doi.org/10.1186/s12979-024-00427-2


Received: 14 February 2024

Accepted: 20 March 2024

Published: 03 April 2024

DOI: https://doi.org/10.1186/s12979-024-00427-2


  • Inflammation
  • Risk prediction
  • Cohort study

Immunity & Ageing

ISSN: 1742-4933


The Case-Population Study Design

An Analysis of its Application in Pharmacovigilance

  • Original Research Article
  • Published: 03 January 2013
  • Volume 34, pages 861–868 (2011)


  • Hélène Théophile 1 , 2 , 3 ,
  • Joan-Ramon Laporte 4 ,
  • Nicholas Moore 1 , 2 , 3 ,
  • Karin-Latry Martin 1 , 2 , 3 &
  • Bernard Bégaud 1 , 2 , 3  


Background: The case-population approach, or population-based case-cohort approach, is derived from the case-control design and consists of comparing past exposure to a given risk factor in subjects presenting a given disease or symptom (cases) with the exposure rate to this factor in the whole cohort or in the source population of cases. In the same way as the case-control approach, the case-population approach measures the disproportionality of exposure between cases of a given disease and their source population, expressed in the form of an odds ratio approximating the ratio of the risks in exposed and non-exposed populations (relative risk).

Objective: The aim of this study was to (i) present the case-population principle design in a way understandable for non-statisticians; (ii) propose the easiest way of using it for pharmacovigilance purposes (mainly alerting and hypothesis testing); (iii) propose simple formulae for computing an odds ratio and its confidence interval; (iv) apply the approach to several practical and published examples; and (v) discuss its pros and cons in the context of real life.

Methods: The approach used is derived from that comparing two rates expressed as person-time denominators. It allows easy computation of an odds ratio and its confidence interval under several hypotheses. Results obtained with the case-population approach were compared with those of case-control studies published in the literature.

Results: Relevance and limits of the proposed approach are illustrated by examples taken from published pharmacoepidemiological studies. The odds ratio (OR) reported in a European case-control study on centrally acting appetite suppressants and primary pulmonary hypertension was 23.1 (95% CI 6.9, 77.7) versus 31 (95% CI 16.2, 59.2) using the case-population approach. In the European case-control studies SCAR (Severe Cutaneous Adverse Reactions) and EuroSCAR on the risk of toxic epidermal necrolysis associated with the use of medicines, the OR for cotrimoxazole was 160 and 102, respectively, versus 44.4 using the case-population approach. Similarly, these two case-control studies found ORs of 12 and 72 for carbamazepine versus 24.4 using the case-population approach, 8.7 and 16 for phenobarbital versus 21.9, 12 for piroxicam (analysed in the SCAR study only) versus 14.5, and 5.5 and 18 for allopurinol versus 3.4 using the case-population approach.

Conclusions: Because it relies on an estimate, derived from sales statistics, of the total exposure time in the source population of cases, the method can be used even when there is no information about the actual number of exposed subjects in this population. The case-population approach has limitations that stem from its main advantage: without an ad hoc control group, it cannot control for possible confounders or reliably quantify the strength of associations. It is nonetheless particularly useful in routine practice, mainly for signal generation and hypothesis testing in drug surveillance.
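The Methods paragraph above describes the estimate as a comparison of two rates with person-time denominators. A minimal sketch of that idea follows. It is not necessarily the formula proposed in the article: it assumes the exposed person-time is estimated from sales data, treats case counts as Poisson, and uses the standard log-scale approximation var(ln ratio) ≈ 1/a + 1/b for the confidence interval. The function name and all numbers are hypothetical illustrations.

```python
import math

def case_population_ratio(exposed_cases, unexposed_cases,
                          exposed_person_time, total_person_time, z=1.96):
    """Ratio of the case rate in exposed vs. unexposed person-time.

    Assumption (not taken from the article): counts are Poisson, so the
    variance of the log ratio is approximated by 1/a + 1/b.
    """
    unexposed_person_time = total_person_time - exposed_person_time
    rate_exposed = exposed_cases / exposed_person_time
    rate_unexposed = unexposed_cases / unexposed_person_time
    ratio = rate_exposed / rate_unexposed
    se_log = math.sqrt(1 / exposed_cases + 1 / unexposed_cases)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper

# Hypothetical inputs: 60 cases, 12 of them exposed; sales data suggest
# 50,000 exposed person-years in a source population of 5,000,000 person-years.
ratio, lower, upper = case_population_ratio(12, 48, 50_000, 5_000_000)
print(f"ratio = {ratio:.2f} (95% CI {lower:.1f}-{upper:.1f})")
```

Only the exposure of the cases and an aggregate exposure time for the source population are needed, which is why the design suits routine surveillance. The same simplicity is why confounders cannot be adjusted for.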



Acknowledgements

The authors thank Ray Cooke who kindly supervised the English of this paper. No sources of funding were used to conduct this study or prepare this manuscript. The authors have no conflicts of interest to declare that are directly relevant to the content of this study.

Author information

Authors and affiliations.

Univ. de Bordeaux, Bordeaux, France

Hélène Théophile, Nicholas Moore, Karin-Latry Martin & Bernard Bégaud

Pharm D, INSERM U657, Département de Pharmacologie, Université de Bordeaux Segalen, 33076, Bordeaux Cedex, France

Service de Pharmacologie, Centre de Pharmacovigilance, CHU, Bordeaux, France

WHO Collaborative Centre for Research and Training in Pharmacoepidemiology, Institut Catalá de Farmacologia, Universitat Autonoma de Barcelona, CSU Vall d’Hebron, Pg Vall d’Hebron, Barcelona, Spain

Joan-Ramon Laporte


Corresponding author

Correspondence to Hélène Théophile .


About this article

Théophile, H., Laporte, JR., Moore, N. et al. The Case-Population Study Design. Drug-Safety 34, 861–868 (2011). https://doi.org/10.2165/11592140-000000000-00000


Published: 03 January 2013

Issue Date: October 2011

DOI: https://doi.org/10.2165/11592140-000000000-00000


  • Source Population
  • Toxic Epidermal Necrolysis
  • Primary Pulmonary Hypertension
  • Catchment Population
  • Open access
  • Published: 02 April 2024

Association between statin use and the risk for idiopathic pulmonary fibrosis and its prognosis: a nationwide, population-based study

  • Jimyung Park 1 ,
  • Chang-Hoon Lee 1 ,
  • Kyungdo Han 2 &
  • Sun Mi Choi 1  

Scientific Reports volume 14, Article number: 7805 (2024)


  • Epidemiology
  • Respiratory tract diseases

Given the pleiotropic effects of statins beyond lipid lowering, several studies have evaluated the role of statin therapy in idiopathic pulmonary fibrosis (IPF), but their results have been inconclusive. Data from the National Health Insurance Service (NHIS) database of South Korea were used to investigate the effects of statin therapy on IPF. The IPF cohort consisted of 10,568 patients who were newly diagnosed with IPF between 2010 and 2017. These patients were matched in a 1:3 ratio, by age and sex, to 31,704 subjects from a control cohort without IPF. A case–control study was performed to evaluate the association between statin use and the risk for IPF; in multivariable analysis, statin use was associated with a lower risk for IPF (adjusted OR 0.847, 95% CI 0.800–0.898). Using the IPF cohort, we also evaluated whether statin use at the time of diagnosis was associated with subsequent clinical outcomes. Statin use at the time of IPF diagnosis was associated with improved overall survival (adjusted HR 0.779, 95% CI 0.709–0.856). Further prospective studies are needed to clarify the role of statin therapy in IPF.


Introduction

Idiopathic pulmonary fibrosis (IPF) is a prototype of progressive fibrotic lung disease. Progressive fibrosis is mainly driven by repetitive microinjuries to the alveolar epithelium, leading to an aberrant repair process 1 . Although the development of anti-fibrotic drugs has slowed the decline in lung function, there is still an unmet need to improve the prognosis of patients with IPF 2 .

While aberrant epithelial-mesenchymal crosstalk is regarded as the key pathogenetic factor in IPF, inflammation is also considered to play a role 3 . Currently approved anti-fibrotic drugs for IPF, such as pirfenidone and nintedanib, have anti-inflammatory effects in addition to their well-known anti-fibrotic effects 4 , 5 . Although aggressive immunosuppressive therapy has been shown to be potentially harmful in IPF 6 , an adequate level of control of inflammatory responses may be helpful 3 .

Statins are widely used to treat dyslipidemia by inhibiting 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase. In addition to lipid-lowering effects, statins have anti-inflammatory and anti-oxidative effects 7 . In this regard, the role of statins in IPF has been evaluated in a few studies. Some studies have reported beneficial effects of statins in delaying disease progression and reducing mortality 8 , 9 , 10 , 11 , while others have not 12 . A recent meta-analysis also failed to provide conclusive evidence supporting the beneficial effects of statins in IPF 13 . Furthermore, most available studies have focused on the effects of statins in patients already diagnosed with IPF. Data are lacking on whether the use of statins could alter the risk for IPF in the general population or in subjects at risk for IPF, including those with interstitial lung abnormality (ILA).

As such, we aimed to investigate the role of statin therapy in IPF and designed two distinct studies for this purpose. First, we conducted a case–control study to evaluate the association between statin use and the risk for IPF by comparing patients with IPF and control subjects without IPF. Second, we conducted a retrospective cohort study restricted to patients with IPF to evaluate whether clinical outcomes differed between statin users and nonusers. Nationwide, population-based cohort data from the National Health Insurance Service (NHIS) database of South Korea were used for this study.

Characteristics of the study population

A search of the NHIS database identified 23,370 patients newly diagnosed with IPF during the study period. After excluding patients in whom the operational definition of statin use could not be applied (i.e., only transient use) and those who did not undergo health screening programs within 2 years before the IPF diagnosis, 10,568 patients were ultimately included in the IPF cohort. For this IPF cohort, a control cohort comprising 31,704 subjects without IPF was matched at a ratio of 1:3 according to age and sex (Fig. 1). The mean age of the patients was 65 years, and 69% were male. Cardiovascular-related comorbidities, including diabetes and ischemic heart disease, were more common in the IPF cohort (Table 1).

Figure 1. Flowchart illustrating selection of the study population.

Statin use and the risk for IPF

Among the 10,568 patients in the IPF cohort, 3072 patients (29.1%) were statin users and, among the 31,704 subjects in the control cohort, 8569 subjects (27.0%) were statin users. When adjusted for covariates including comorbidities, statin use was associated with a lower risk for IPF (adjusted OR 0.847, 95% CI 0.800–0.898) than statin nonuse. The protective effects of statin use were consistent regardless of smoking history and sex (Table 2 ).
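The counts in the paragraph above are enough to work out the unadjusted (crude) odds ratio, which helps show how much the covariate adjustment matters. The sketch below is an illustration, not the authors' analysis: it ignores the age and sex matching and the covariates, and uses the standard Woolf log-scale interval. The function name is hypothetical.

```python
import math

def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio from a 2x2 table with a Woolf (log-scale) CI."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    se_log = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                       + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Counts reported in the text.
ipf_total, ipf_statin = 10_568, 3_072
control_total, control_statin = 31_704, 8_569

crude_or, crude_lower, crude_upper = crude_odds_ratio(
    ipf_statin, ipf_total - ipf_statin,
    control_statin, control_total - control_statin)
print(f"crude OR = {crude_or:.3f} (95% CI {crude_lower:.3f}-{crude_upper:.3f})")
```

With these counts the crude odds ratio is slightly above 1 (about 1.11), whereas the reported adjusted odds ratio is 0.847. One reading, offered here as an interpretation rather than a finding of the study, is that comorbidities that prompt statin prescription are also more common among patients with IPF, so the direction of the association changes once they are adjusted for.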

Statin use and clinical outcomes of IPF

Among the 10,568 patients in the IPF cohort, 9784 patients who did not experience myocardial infarction or stroke before the diagnosis of IPF were analyzed to assess the effects of statins on clinical outcomes (Fig. 1). Significant differences were observed in baseline characteristics between statin users and nonusers (Table 3). Statin users were older, and cardiovascular comorbidities such as hypertension and diabetes were significantly more common among them, at roughly twice the frequency seen in nonusers.

The clinical outcomes of the two groups were compared and analyzed (Table 4 ). When adjusted for covariates, statin users exhibited lower overall mortality than statin nonusers (adjusted HR 0.779, 95% CI 0.709–0.856). When the time to first hospitalization or ER visit for any cause was analyzed, no significant difference was observed between the two groups (adjusted HR 0.974, 95% CI 0.908–1.044). However, the risk for hospitalization or ER visits due to respiratory causes was lower in statin users than in statin nonusers (adjusted HR 0.818, 95% CI 0.728–0.920).
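A reported hazard ratio and its 95% confidence interval can be converted back into an approximate standard error and Wald z statistic on the log scale, which makes the three results above easier to compare. The sketch below assumes the intervals are symmetric on the log scale (as is usual for Cox models); the helper names are hypothetical and the p-values are approximations, not values reported by the authors.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error of log(HR) from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def wald_z(estimate, lower, upper):
    """Wald z statistic for the null hypothesis HR = 1."""
    return math.log(estimate) / se_from_ci(lower, upper)

def two_sided_p(z):
    """Two-sided p-value under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Adjusted hazard ratios and 95% CIs as reported in the text.
reported = {
    "overall mortality": (0.779, 0.709, 0.856),
    "hospitalization or ER visit, any cause": (0.974, 0.908, 1.044),
    "hospitalization or ER visit, respiratory": (0.818, 0.728, 0.920),
}

for outcome, (hr, lower, upper) in reported.items():
    z = wald_z(hr, lower, upper)
    print(f"{outcome}: z = {z:.2f}, approximate p = {two_sided_p(z):.3g}")
```

An interval that includes 1, as for any-cause hospitalization or ER visits, corresponds to |z| below 1.96 and an approximate p-value above 0.05, matching the "no significant difference" wording in the text.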

Impact of duration of statin use

We evaluated whether the duration of statin use influenced the association between statin use and the risk for and clinical outcomes of IPF (Fig. 2). With statin nonusers as the reference group, statin use ≥ 5 years was associated with a lower risk for IPF (adjusted OR 0.786, 95% CI 0.732–0.845) than statin use < 5 years (adjusted OR 0.917, 95% CI 0.854–0.986), and the difference was statistically significant ( P  < 0.001). This trend was more pronounced in ever-smokers ( P  < 0.001) and male subjects ( P  < 0.001). In particular, in the subgroups of ever-smokers and male subjects, statin use < 5 years was not associated with a lower risk for IPF relative to statin nonuse (adjusted OR 1.008, 95% CI 0.909–1.119 and 0.966, 95% CI 0.885–1.055, respectively). However, when the association between statin use and clinical outcomes was analyzed, the degrees of association were similar regardless of the duration of statin use (Fig. 3).

Figure 2. Association between statin use and risk for IPF according to duration of statin use.

Figure 3. Association between statin use and clinical outcomes of IPF according to duration of statin use.

In this study, we evaluated the association between statin use and the risk for IPF, and the effects of statin use on the clinical outcomes of IPF, using large-scale, nationwide cohort data. When case–control analysis was performed by matching patients with IPF with the control cohort according to age and sex, statin use was associated with a lower risk for IPF than statin nonuse. In addition, in the IPF cohort, statin use was associated with a lower risk for overall mortality and first hospitalization or ER visits related to respiratory causes.

When investigating the causes or risk factors for a specific disease, a longitudinal prospective cohort study involving healthy subjects and monitoring the development of the disease is the best option. However, for rare diseases with low incidence, such as IPF, such a study is difficult to perform because it would require a very large number of subjects and a follow-up period of several decades; a case–control study is therefore a good alternative 14 . As such, we performed a case–control study to investigate whether statin use could alter the risk for IPF and demonstrated promising results, suggesting a protective role of statins against IPF.

Repetitive alveolar microinjuries play a key role in the pathogenesis of IPF and result from multiple genetic and environmental risk factors 15 . While diabetes has been suggested to be associated with IPF 16 , metformin, a representative drug for diabetes, was shown to attenuate pulmonary fibrosis in an in-vivo model 17 . However, data regarding whether a specific drug can modify the risk for IPF are scarce. In our case–control analysis, statin use was associated with a lower risk for IPF than statin nonuse. Considering that statins have a favorable safety profile and are widely used in the general population, further studies should be performed to determine whether our findings are reproducible.

Nevertheless, our study has a limitation in that we used a dataset consisting of case and control cohorts matched only for age and sex. There may have been unmeasured biases despite our effort to adjust for confounding factors through multivariable analysis. Although incorporating all available variables for matching would have resulted in more balanced samples, this was practically unattainable because we did not have access to the complete NHIS dataset covering the entire Korean population.

In addition, our study is limited in that we were unable to determine the precise timeframe required for statins to exert a certain pharmacologic effect on the risk of developing IPF. While we attempted to provide a clear definition of statin exposure by classifying patients based on whether statins were prescribed within 1 year before their diagnosis of IPF, we recognize the limitation that statin use for a wide range of durations would have been grouped together as statin users. Exposure to statins over only a few years may not be sufficient to significantly influence the risk of IPF. To address this concern, we performed a subgroup analysis based on the duration of statin use, which revealed that statin use ≥ 5 years conferred a greater protective effect regarding the risk of IPF compared with statin use < 5 years. It is likely that a longer period of time is required for a particular drug to modify the risk of IPF, as supported by a recent study showing that approximately 3 years was required for significant progression of ILA 18 .

Recently, ILA has been increasingly detected and recognized on chest computed tomography in asymptomatic populations. ILA is classified into fibrotic and non-fibrotic types, and fibrotic ILA can be a potential precursor to IPF 19 . However, it remains unclear how to prevent disease progression in the subclinical status of ILA, and systematic close follow-up is recommended for high-risk subjects 20 . Currently approved anti-fibrotic drugs, pirfenidone and nintedanib, would not be appropriate in the setting of ILA, given the non-negligible costs and potential drug toxicities. The results of our study suggest that statins may merit investigation of their role in preventing the progression of ILA to IPF.

The protective effects of statins against IPF are biologically plausible in several respects. IPF is known to be a disease of the aging lung, with telomere shortening as one of the contributing factors, which makes the lungs more susceptible to maladaptive responses to alveolar micro-injuries 21 . There is some evidence that statins have anti-aging effects linked to their ability to inhibit telomere shortening 22 . These pleiotropic effects of statins, beyond their lipid-lowering effects, may play a protective role against the development of IPF. Interestingly, in our study, statin use ≥ 5 years was associated with a lower risk for IPF in ever-smokers, whereas statin use < 5 years was not. Given the results of previous studies reporting that cigarette smoking could result in shorter telomeres 23 , longer use of statins may be needed to offset the detrimental effects of cigarette smoking in ever-smokers.

Among patients with newly diagnosed IPF, statin use at baseline was associated with improvement in clinical outcomes in our study compared to statin nonuse, especially for overall survival. There have been a few attempts to evaluate the impact of statin therapy on clinical outcomes in IPF, but a meta-analysis of these studies could not draw a definitive conclusion on the effects of statin use on overall mortality 13 . In fact, most of these previous studies had limited sample sizes, usually including hundreds of patients 8 , 9 , 10 , 11 , 12 . Our study included approximately 10,000 patients with IPF, which provided greater statistical power. Although statin users had more cardiovascular comorbidities, which was not surprising, when these confounding factors were adjusted for, statin use was associated with lower overall mortality. However, there is a possibility of the presence of unmeasured confounding factors in our analysis. Because our analysis was retrospective and relied on the NHIS database, which is a health claims database, this study cannot completely eliminate every potential bias. To fully address this concern, it would be imperative to perform a propensity score-matched analysis within a meticulously established cohort of patients with IPF or, ideally, to conduct a well-designed prospective study.

There is some experimental evidence supporting the anti-fibrotic potential of statins 24 , 25 , 26 , which are also known to have anti-inflammatory effects 27 . Both anti-fibrotic and anti-inflammatory effects may be beneficial in slowing the progression of IPF. However, in our study, although we demonstrated that overall survival was better in statin users among patients with IPF, it is not clear whether this finding was due to differences in IPF-related mortality or mortality related to other causes, because we could not determine the cause of death in this study. Considering that cardiovascular diseases and lung cancers are also the main causes of death in patients with IPF 28 , the beneficial effects of statins on overall mortality may stem from their impact on these conditions. Statins are well-known to have cardiovascular protective effects and, in addition, a recent study using the Taiwan national health insurance database reported that statin use was associated with lower risk for lung cancers in patients with interstitial lung diseases 29 . However, our finding that statin use was associated with a lower risk for events caused by respiratory causes indicates that statins may have beneficial effects on IPF itself.

The strength of our study is that we used representative nationwide data from a large number of patients captured from real-world clinical practice. However, this study had some limitations that should be addressed. First, although our study suggested a protective role of statins against the risk for IPF, a case–control study cannot confirm a causal relationship. The effects of statins on the prognosis of patients with IPF could not be definitively addressed in our study due to its retrospective design. As such, a well-designed, prospective study should be conducted in the future. Second, about 50% of patients who met the operational diagnosis of newly diagnosed IPF were excluded from the final analysis. The most common reason for exclusion was that they had not undergone the health screening program within 2 years. Patients with relatively higher health motivation may therefore have been preferentially included in this study. Third, the NHIS database did not have information about the severity of IPF, such as lung function parameters. Therefore, we could not fully adjust for confounding factors when analyzing the clinical outcomes of patients with IPF. Fourth, we could not assess the effects of anti-fibrotic therapy. This is because in Korea, pirfenidone only began to be covered by the National Health Insurance in October 2015, and nintedanib has not yet been covered. Given that our study period was up to December 2017, we did not have sufficient data regarding the use of anti-fibrotic drugs. However, we suspect that the prescription of anti-fibrotic drugs would not have differed substantially between statin users and nonusers. Considering the mortality-reducing effects of anti-fibrotic therapy 30 , future studies should gather information about the use of anti-fibrotic drugs when evaluating whether a specific drug could improve the prognosis of IPF. Fifth, our study only investigated the impact of statin use preceding the diagnosis of IPF. Given the possibility that some patients may have started statin therapy after the diagnosis of IPF, statistical methods such as time-varying Cox regression or landmark time analysis may be useful for addressing this issue 31 . Sixth, the NHIS database used for this study predominantly covered only Asian populations, limiting our ability to evaluate patients of diverse ethnic backgrounds. Further studies should be performed in other cohorts from different countries. Seventh, we employed unconditional logistic regression in the analysis of our case–control matched dataset, under the premise that our matching, which was limited to only age and sex, was comparatively loose 32 , 33 . The absence of conditional matched regression in our statistical methodology raises concerns regarding potential biases in our results. Finally, because this study was based on information gathered from a database, misclassification related to statin use and/or diagnosis of IPF could have occurred. A prescription history of statins may not necessarily mean that the patient actually took the drug. However, in clinical practice, statins are usually taken continuously, unless there are serious side effects, which are relatively rare. Thus, the chances of exposure misclassification appear to be low. In addition, the diagnosis of IPF is strictly reviewed for registration in Korea, because patients with ICD-10 codes for IPF are financially supported by the government. Furthermore, we included only patients diagnosed at referral hospitals, which makes us more confident in the reliability of IPF diagnosis. However, relying solely on the health claims database and utilizing ICD codes for identifying cases with IPF presents inherent limitations, primarily due to our limited ability to verify the diagnostic accuracy.

In conclusion, this large, nationwide, population-based cohort study conducted in a real-world setting found that statin use was associated with a lower risk for IPF than statin nonuse. In addition, statin use was associated with improved overall survival and reduced risk for respiratory-related hospitalization or ER visits among patients with IPF. Prospective studies aiming to confirm the potential beneficial effects of statins on IPF are warranted.

Study data source

This study used data from the health claims database established by the NHIS of South Korea, a single insurer managed by the Korean government. Because it is mandatory for every Korean citizen to subscribe to the NHIS, the NHIS database contains extensive health-related data from the entire Korean population, including personal and sociodemographic information, data regarding every inpatient and outpatient service, prescriptions, and mortality data 34 .

In South Korea, the NHIS has been providing national health screening checkup programs since 1995 to improve the health status of the general population. Health-related data obtained through this health screening program are also available in the NHIS database 35 . The data and materials of the NHIS are accessible to the public and are widely used by medical researchers. Requirement for informed consent was waived by the institutional review board of Seoul National University Hospital because information regarding personal identification was completely removed while establishing the database. The study protocol was approved by the institutional review board of Seoul National University Hospital (IRB No. E-1904-001-1020), and all methods were carried out in accordance with relevant guidelines and regulations.

Study design and population

The present study was a retrospective analysis based on data from the NHIS database, which included two distinct parts. The first is the case–control study comparing patients with IPF and control populations, and the second is the retrospective cohort study focusing only on the IPF cohort to evaluate the beneficial effect of statins in IPF.

A case–control study was performed to evaluate the association between statin use and risk for IPF. For this purpose, a cohort of patients with newly diagnosed IPF was established. First, patients who had medical claims with International Classification of Diseases (ICD-10) codes for IPF (J841 or J8418) between January 2010 and December 2017 were screened. We chose to exclude patients diagnosed prior to 2010 given the considerable differences in treatment strategies and clinical outcomes between patients diagnosed in the earlier period and those diagnosed more recently 36 . Subsequently, an operational definition of newly diagnosed IPF, as detailed in our previous study, was applied 37 .

Briefly, we first searched for patients in whom ICD-10 codes for IPF were registered by physicians working in referral hospitals, not primary care clinics, considering that an accurate diagnosis of IPF requires multidisciplinary discussion. Among these, patients with claims for chest computed tomography (CT) and pulmonary function tests within 1 year and 6 months, respectively, before the first registration of ICD-10 codes for IPF were included. Patients with ICD-10 codes for autoimmune or connective tissue diseases, or other pulmonary diseases were excluded. Among the selected patients fulfilling this operational definition, only those who had participated in national health screening programs within 2 years before the diagnosis of IPF were considered to be eligible for this study to use the health-related information that could be obtained through the programs, such as smoking history.

Study patients were classified into statin users and nonusers according to whether they were being prescribed statins at the time of IPF diagnosis. An operational definition of statin use was applied for this purpose because the exact timeframe for statins to exert potential effects on the risk of IPF is not certain. A patient was regarded to be a statin user if statins were prescribed at least twice within 1 year before the index date, the date of the first registration of the ICD-10 codes for IPF. To define only those who had never been exposed to statins as statin nonusers, patients in whom statins were prescribed only once within 1 year before the index date or statins were ever prescribed before but not within 1 year before the index date were excluded.
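The operational definition of statin exposure above can be sketched as a small classification rule. This is only an illustration, not the authors' SAS code: the function name `classify_statin_exposure`, the 365-day approximation of "1 year", and the three labels are assumptions introduced here.

```python
from datetime import date, timedelta


def classify_statin_exposure(prescription_dates, index_date, window_days=365):
    """Classify a patient as 'user', 'nonuser', or 'excluded' (hypothetical labels).

    user:     statins prescribed at least twice within 1 year before the index date
    nonuser:  no statin prescription at any time before the index date
    excluded: a single prescription within the window, or earlier prescriptions only
    """
    # Only prescriptions issued before the index date count as prior exposure.
    prior = [d for d in prescription_dates if d < index_date]
    window_start = index_date - timedelta(days=window_days)  # assumption: 1 year = 365 days
    recent = [d for d in prior if d >= window_start]

    if len(recent) >= 2:
        return "user"
    if not prior:
        return "nonuser"
    return "excluded"


index = date(2015, 6, 1)
print(classify_statin_exposure([date(2015, 1, 10), date(2015, 4, 10)], index))
print(classify_statin_exposure([], index))
print(classify_statin_exposure([date(2012, 3, 1)], index))
```

Excluding the intermediate group keeps the nonuser category restricted to patients with no recorded statin exposure, which is the stated aim of the definition.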

Following the establishment of the IPF cohort, a control cohort without IPF, matched 1:3 on age and sex, was established using an exact matching algorithm. The control population was selected without replication among subjects without ICD-10 codes for IPF. Statin users and nonusers among controls were defined using the same criteria as in the IPF cohort. The index date for each control was set to the index date of the matched case patient.
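A minimal sketch of 1:3 exact matching on age and sex is given below. It is not the algorithm actually run on the NHIS data. It assumes that "without replication" means each control is used at most once, that records are plain dictionaries with hypothetical keys (`id`, `age`, `sex`, `index_date`), and that cases with fewer than three eligible controls are simply left unmatched.

```python
import random
from collections import defaultdict


def match_controls(cases, controls, ratio=3, seed=0):
    """Return (case_id, control_id, index_date) tuples for exact age/sex matches."""
    rng = random.Random(seed)

    # Group candidate controls into exact age/sex strata and shuffle each stratum.
    pool = defaultdict(list)
    for ctrl in controls:
        pool[(ctrl["age"], ctrl["sex"])].append(ctrl)
    for stratum in pool.values():
        rng.shuffle(stratum)

    matches = []
    for case in cases:
        stratum = pool.get((case["age"], case["sex"]), [])
        if len(stratum) < ratio:
            continue  # assumption: cases without enough controls stay unmatched
        for _ in range(ratio):
            ctrl = stratum.pop()  # pop() removes the control, so it cannot be reused
            # Each control inherits the index date of its matched case.
            matches.append((case["id"], ctrl["id"], case["index_date"]))
    return matches
```

Assigning the case's index date to each matched control lets the same 1-year exposure window be applied to cases and controls.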

After conducting a case–control study, a subsequent analysis was performed using follow-up data from study patients in the IPF cohort as a retrospective cohort study. We evaluated the association between statin use and clinical outcomes. For this analysis, patients who had already been diagnosed with myocardial infarction or stroke before the index date were excluded because incident myocardial infarction or stroke was one of the outcome events of interest.

Study outcome and covariates

In the case–control study, the association between statin use and the risk for IPF was assessed by calculating odds ratios (ORs). After matching for age and sex, additional adjustments were performed for the following covariates: smoking history, alcohol consumption pattern, body mass index, income level, and comorbidities including diabetes, hypertension, ischemic heart disease, stroke, and chronic kidney disease. The presence of comorbidities was assessed using the relevant ICD-10 codes for each disease. Covariates were selected among clinically relevant variables that could be associated with statin use and identified through the NHIS database.

Using only the IPF cohort, we investigated whether statin use was associated with improvement in clinical outcomes, including overall mortality, hospitalization or emergency room (ER) visits, and incident myocardial infarction or stroke. After the overall events of hospitalization or ER visits were assessed, events associated with a primary diagnosis of respiratory diseases were further analyzed using ICD-10 codes of J00–J99. However, the cause of mortality could not be identified through our database. The covariates adjusted for were similar to those used in the case–control analysis and included smoking history, alcohol consumption pattern, body mass index, income level, and comorbidities including diabetes, hypertension, and chronic kidney disease.

Statistical analysis

Descriptive statistics were used to summarize the baseline characteristics of the study population. Continuous variables were summarized as means with standard deviations and categorical variables were reported as frequencies (percentages). The association between statin use and the risk for IPF was evaluated using multivariable logistic regression, and adjusted ORs were reported with corresponding 95% confidence intervals (CIs). Subgroup analysis according to smoking history and sex was performed as well.
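For orientation, the crude (unadjusted) odds ratio from a 2×2 table and its Wald 95% CI can be computed as sketched below. This is a simpler quantity than the adjusted ORs reported in the study, which come from multivariable logistic regression. The function name and the counts in the usage line are hypothetical and are not study data.

```python
import math


def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Wald confidence interval on the log scale."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) from the four cell counts (Woolf method).
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
print(crude_odds_ratio(20, 80, 30, 70))
```

An OR below 1 with an upper confidence limit below 1 would correspond to a statistically significant lower odds of exposure among cases.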

The association between statin use and clinical outcomes among patients in the IPF cohort was evaluated using Cox proportional hazard regression, and adjusted hazard ratios (HRs) were calculated with corresponding 95% CIs. If a patient started statin therapy before development of the events of interest, further follow-up was censored.
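The censoring rule can be illustrated by deriving, for each patient, the follow-up time and event indicator that a Cox model would take as input. The sketch below reflects one reading of the text: it assumes the rule applies to patients who begin statin therapy after the index date, and that follow-up otherwise ends at the event or at a hypothetical administrative end date. The function and parameter names are illustrative.

```python
from datetime import date


def follow_up(index_date, study_end, event_date=None, statin_start=None):
    """Return (days_of_follow_up, event_observed) for one patient."""
    end = study_end
    observed = False

    # An event on or before the study end stops follow-up and counts as observed.
    if event_date is not None and event_date <= end:
        end = event_date
        observed = True

    # Starting statins after the index date but before the current end censors follow-up.
    if statin_start is not None and index_date < statin_start < end:
        end = statin_start
        observed = False

    return (end - index_date).days, observed


# Event at 1 year, no statin initiation: the event is observed.
print(follow_up(date(2015, 1, 1), date(2017, 12, 31), event_date=date(2016, 1, 1)))
# Statin started before the event: follow-up is censored at statin initiation.
print(follow_up(date(2015, 1, 1), date(2017, 12, 31),
                event_date=date(2016, 1, 1), statin_start=date(2015, 7, 1)))
```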

Given that the operational definition of statin use could not provide detailed information about the duration of drug use, additional analysis was performed by dividing statin users according to the duration of drug use, using 5 years as the cut-off point (statin use ≥ 5 years vs. < 5 years vs. nonuse). We compared the effects of statin use ≥ 5 years and statin use < 5 years by including an interaction term in the regression model. SAS version 9.4 (SAS Institute, Cary, NC, USA) was used for statistical analyses, and P values < 0.05 for two-tailed tests were considered to be statistically significant.

Data availability

The data that support the findings of this study are available from the NHIS database, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. Data are, however, available from the corresponding author upon reasonable request and with permission of the NHIS.

Abbreviations

HMG-CoA: 3-hydroxy-3-methylglutaryl coenzyme A

CI: Confidence interval

ER: Emergency room

HR: Hazard ratio

ICD: International Classification of Diseases

ILA: Interstitial lung abnormality

IPF: Idiopathic pulmonary fibrosis

NHIS: National Health Insurance Service

References

1. Sgalla, G. et al. Idiopathic pulmonary fibrosis: Pathogenesis and management. Respir. Res. 19, 32 (2018).

2. Rogliani, P., Calzetta, L., Cavalli, F., Matera, M. G. & Cazzola, M. Pirfenidone, nintedanib and N-acetylcysteine for the treatment of idiopathic pulmonary fibrosis: A systematic review and meta-analysis. Pulm. Pharmacol. Ther. 40, 95–103 (2016).

3. Heukels, P., Moor, C. C., von der Thusen, J. H., Wijsenbeek, M. S. & Kool, M. Inflammation and immunity in IPF pathogenesis and treatment. Respir. Med. 147, 79–91 (2019).

4. Ruwanpura, S. M., Thomas, B. J. & Bardin, P. G. Pirfenidone: Molecular mechanisms and potential clinical applications in lung disease. Am. J. Respir. Cell Mol. Biol. 62, 413–422 (2020).

5. Wollin, L., Maillet, I., Quesniaux, V., Holweg, A. & Ryffel, B. Antifibrotic and anti-inflammatory activity of the tyrosine kinase inhibitor nintedanib in experimental models of lung fibrosis. J. Pharmacol. Exp. Ther. 349, 209–220 (2014).

6. Idiopathic Pulmonary Fibrosis Clinical Research Network et al. Prednisone, azathioprine, and N-acetylcysteine for pulmonary fibrosis. N. Engl. J. Med. 366, 1968–1977 (2012).

7. Quist-Paulsen, P. Statins and inflammation: An update. Curr. Opin. Cardiol. 25, 399–405 (2010).

8. De Sadeleer, L. J. et al. Statins: Cause of fibrosis or the opposite? Effect of cardiovascular drugs in idiopathic pulmonary fibrosis. Respir. Med. 176, 106259 (2021).

9. Kreuter, M. et al. Statin therapy and outcomes in trials of nintedanib in idiopathic pulmonary fibrosis. Respiration 95, 317–326 (2018).

10. Kreuter, M. et al. Effect of statins on disease-related outcomes in patients with idiopathic pulmonary fibrosis. Thorax 72, 148–153 (2017).

11. Vedel-Krogh, S., Nielsen, S. F. & Nordestgaard, B. G. Statin use is associated with reduced mortality in patients with interstitial lung disease. PLoS One 10, e0140571 (2015).

12. Ekstrom, M. & Bornefalk-Hermansson, A. Cardiovascular and antacid treatment and mortality in oxygen-dependent pulmonary fibrosis: A population-based longitudinal study. Respirology 21, 705–711 (2016).

13. Kim, J. W., Barrett, K., Loke, Y. & Wilson, A. M. The effect of statin therapy on disease-related outcomes in idiopathic pulmonary fibrosis: A systematic review and meta-analysis. Respir. Med. Res. 80, 100792 (2021).

14. Dey, T., Mukherjee, A. & Chakraborty, S. A practical overview of case-control studies in clinical practice. Chest 158, S57–S64 (2020).

15. Richeldi, L., Collard, H. R. & Jones, M. G. Idiopathic pulmonary fibrosis. Lancet 389, 1941–1952 (2017).

16. Bai, L. et al. Idiopathic pulmonary fibrosis and diabetes mellitus: A meta-analysis and systematic review. Respir. Res. 22, 175 (2021).

17. Rangarajan, S. et al. Metformin reverses established lung fibrosis in a bleomycin model. Nat. Med. 24, 1121–1127 (2018).

18. Park, S. et al. Long-term follow-up of interstitial lung abnormality: Implication in follow-up strategy and risk thresholds. Am. J. Respir. Crit. Care Med. 208, 858–867 (2023).

19. Hatabu, H. et al. Interstitial lung abnormalities detected incidentally on CT: A position paper from the Fleischner Society. Lancet Respir. Med. 8, 726–737 (2020).

20. Hata, A., Schiebler, M. L., Lynch, D. A. & Hatabu, H. Interstitial lung abnormalities: State of the art. Radiology 301, 19–34 (2021).

21. Duckworth, A. et al. Telomere length and risk of idiopathic pulmonary fibrosis and chronic obstructive pulmonary disease: A Mendelian randomisation study. Lancet Respir. Med. 9, 285–294 (2021).

22. Boccardi, V. et al. A new pleiotropic effect of statins in elderly: Modulation of telomerase activity. FASEB J. 27, 3879–3885 (2013).

23. Astuti, Y., Wardhana, A., Watkins, J., Wulaningsih, W. & Network, P. R. Cigarette smoking and telomere length: A systematic review of 84 studies and meta-analysis. Environ. Res. 158, 480–489 (2017).

24. Santos, D. M. et al. Screening for YAP inhibitors identifies statins as modulators of fibrosis. Am. J. Respir. Cell Mol. Biol. 62, 479–492 (2020).

25. Zhu, B., Ma, A. Q., Yang, L. & Dang, X. M. Atorvastatin attenuates bleomycin-induced pulmonary fibrosis via suppressing iNOS expression and the CTGF (CCN2)/ERK signaling pathway. Int. J. Mol. Sci. 14, 24476–24491 (2013).

26. Watts, K. L., Sampson, E. M., Schultz, G. S. & Spiteri, M. A. Simvastatin inhibits growth factor expression and modulates profibrogenic markers in lung fibroblasts. Am. J. Respir. Cell Mol. Biol. 32, 290–300 (2005).

27. Jain, M. K. & Ridker, P. M. Anti-inflammatory effects of statins: Clinical evidence and basic mechanisms. Nat. Rev. Drug Discov. 4, 977–987 (2005).

28. Ko, S. J., Choi, S. M., Han, K. D., Lee, C. H. & Lee, J. All-cause mortality of patients with idiopathic pulmonary fibrosis: A nationwide population-based cohort study in Korea. Sci. Rep. 11, 15145 (2021).

29. Yeh, J. J., Lai, J. N., Lin, C. L., Hsu, C. Y. & Kao, C. H. Time-dependent propensity-matched general population study of the effects of statin use on cancer risk in an interstitial lung disease and pulmonary fibrosis cohort. BMJ Open 11, e047039 (2021).

30. Petnak, T., Lertjitbanjong, P., Thongprayoon, C. & Moua, T. Impact of antifibrotic therapy on mortality and acute exacerbation in idiopathic pulmonary fibrosis: A systematic review and meta-analysis. Chest 160, 1751–1763 (2021).

31. Pazzagli, L. et al. Methods for time-varying exposure related problems in pharmacoepidemiology: An overview. Pharmacoepidemiol. Drug Saf. 27, 148–160 (2018).

32. Kuo, C. L., Duan, Y. & Grady, J. Unconditional or conditional logistic regression model for age-matched case-control data? Front. Public Health 6, 57 (2018).

33. Pearce, N. Analysis of matched case-control studies. BMJ 352, i969 (2016).

34. Cheol Seong, S. et al. Data resource profile: The national health information database of the national health insurance service in South Korea. Int. J. Epidemiol. 46, 799–800 (2017).

35. Seong, S. C. et al. Cohort profile: The national health insurance service-national health screening cohort (NHIS-HEALS) in Korea. BMJ Open 7, e016640 (2017).

36. Moon, S. W. et al. Longitudinal changes in clinical features, management, and outcomes of idiopathic pulmonary fibrosis. A nationwide cohort study. Ann. Am. Thorac. Soc. 18, 780–787 (2021).

37. Bae, W. et al. Impact of smoking on the development of idiopathic pulmonary fibrosis: Results from a nationwide population-based cohort study. Thorax 77, 470–476 (2022).

Acknowledgements

We thank the National Health Insurance Service (NHIS) of South Korea for supplying the data for this study.

Author information

Authors and affiliations.

Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Seoul National University Hospital, Seoul National University College of Medicine, 101, Daehak-ro, Jongno-gu, Seoul, 03080, South Korea

Jimyung Park, Chang-Hoon Lee & Sun Mi Choi

Department of Statistics and Actuarial Science, Soongsil University, Seoul, South Korea

Kyungdo Han

Contributions

SMC is the study lead and guarantor for this paper. JP, CHL, KH, and SMC contributed to conception and design of the study. JP and KH contributed to acquisition, analysis, and interpretation of data. KH performed the main statistical analysis, and JP, CHL, and SMC critically appraised those results. JP wrote the first draft of this paper, and CHL, KH, and SMC revised it critically for important intellectual content. JP, CHL, KH, and SMC had access to final version of this paper and approved it to be published.

Corresponding author

Correspondence to Sun Mi Choi .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Park, J., Lee, CH., Han, K. et al. Association between statin use and the risk for idiopathic pulmonary fibrosis and its prognosis: a nationwide, population-based study. Sci Rep 14 , 7805 (2024). https://doi.org/10.1038/s41598-024-58417-9

Received : 05 June 2023

Accepted : 28 March 2024

Published : 02 April 2024

DOI : https://doi.org/10.1038/s41598-024-58417-9

  • Case–control study
  • Observational study
  • Pharmacoepidemiology

Published on 1.4.2024 in Vol 10 (2024)

Timely Pulmonary Tuberculosis Diagnosis Based on the Epidemiological Disease Spectrum: Population-Based Prospective Cohort Study in the Republic of Korea

Original Paper

  • Yousang Ko 1 , MD, PhD   ; 
  • Jae Seuk Park 2 , MD, PhD   ; 
  • Jinsoo Min 3 , MPH, MD   ; 
  • Hyung Woo Kim 4 , MD   ; 
  • Hyeon-Kyoung Koo 5 , MD   ; 
  • Jee Youn Oh 6 , MD   ; 
  • Yun-Jeong Jeong 7 , MD   ; 
  • Eunhye Lee 8 , MD   ; 
  • Bumhee Yang 9 , MD   ; 
  • Ju Sang Kim 4 , MPH, MD   ; 
  • Sung-Soon Lee 5 , MD, PhD   ; 
  • Yunhyung Kwon 10 , PhD   ; 
  • Jiyeon Yang 10 , RN, MSN   ; 
  • Ji yeon Han 10 , PhD   ; 
  • You Jin Jang 10 , MS   ; 
  • Jinseob Kim 11 , MD, MPH  

1 Division of Pulmonary, Allergy and Critical Care Medicine, Department of Internal Medicine, Hallym University Kangdong Sacred Heart Hospital, Seoul, Republic of Korea

2 Division of Pulmonary Medicine, Department of Internal Medicine, Dankook University College of Medicine, Cheonan, Republic of Korea

3 Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea

4 Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Incheon St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Incheon, Republic of Korea

5 Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Ilsan Paik Hospital, Inje University College of Medicine, Ilsan, Republic of Korea

6 Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Internal Medicine, Korea University College of Medicine, Korea University Guro Hospital, Seoul, Republic of Korea

7 Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Dongguk University Ilsan Hospital, Ilsan, Republic of Korea

8 Division of Pulmonology, Allergy and Critical Care Medicine, Department of Internal Medicine, Yongin Severance Hospital, Yonsei University College of Medicine, Yongin, Republic of Korea

9 Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Chungbuk National University Hospital, Chungbuk National University College of Medicine, Cheong-Ju, Republic of Korea

10 Division of Tuberculosis Prevention and Control, Korea Disease Control and Prevention Agency, Osong, Republic of Korea

11 Zarathu Co Ltd, Seoul, Republic of Korea

Corresponding Author:

Yousang Ko, MD, PhD

Division of Pulmonary, Allergy and Critical Care Medicine

Department of Internal Medicine

Hallym University Kangdong Sacred Heart Hospital

Sung-an ro 150, Kangdonggu

Seoul, 05355/82

Republic of Korea

Phone: 82 2224 2561

Fax: 82 02 488 6925

Email: [email protected]

Background: Timely pulmonary tuberculosis (PTB) diagnosis is a global health priority for interrupting transmission and optimizing treatment outcomes. The traditional dichotomous time-divided approach for addressing time delays in diagnosis has limited clinical application because the time delay significantly varies depending on each community in question.

Objective: We aimed to reevaluate the diagnosis time delay based on the PTB disease spectrum using a novel scoring system that was applied at the national level in the Republic of Korea.

Methods: The Pulmonary Tuberculosis Spectrum Score (PTBSS) was developed based on previously published proposals related to the disease spectrum, and its validity was assessed by examining both all-cause and PTB-related mortality. In our analysis, we integrated the PTBSS into the Korea Tuberculosis Cohort Registry. We evaluated various time delays, including patient, health care, and overall delays, and their system-associated variables in line with each PTBSS. Furthermore, we reclassified the scores into distinct categories of mild (PTBSS=0-1), moderate (PTBSS=2-3), and severe (PTBSS=4-6) using a multivariate regression approach.

Results: Among the 14,031 Korean patients with active PTB whose data were analyzed from 2018 to 2020, 37% (n=5191), 38% (n=5328), and 25% (n=3512) were classified as having a mild, moderate, and severe disease status, respectively, according to the PTBSS. This classification can therefore reflect the disease spectrum of PTB by considering the correlation of the score with mortality. The time delay patterns differed according to the PTBSS. In health care delays according to the PTBSS, greater PTB disease progression was associated with a shorter diagnosis period, since the condition is microbiologically easy to diagnose. However, with respect to patient delays, the change in elapsed time showed a U-shaped pattern as PTB progressed. This means that a remarkable patient delay in the real-world setting might occur at both apical ends of the spectrum (ie, in both mild and severe cases of PTB). Independent risk factors for a severe PTB pattern were age (adjusted odds ratio 1.014) and male sex (adjusted odds ratio 1.422), whereas no significant risk factor was found for mild PTB.

Conclusions: Timely PTB diagnosis should be accomplished. This can be improved with use of the PTBSS, a simple and intuitive scoring system, which can be more helpful in clinical and public health applications compared to the traditional dichotomous time-only approach.

Introduction

In 2020, an estimated 10 million cases of tuberculosis (TB) were reported worldwide [ 1 ]. In the Republic of Korea (ROK), the number of notified TB cases had long remained stable without a decrease [ 2 , 3 ]; however, the number of notified TB cases has decreased significantly in the last decade following continuous nationwide efforts [ 4 , 5 ]. Specifically, in 2021, the overall notification rate for incident TB cases (n=18,335) was 35.7/100,000 persons, which constituted a 53.6% decrease in the number of cases compared with those notified in 2011 (39,557 incident TB cases; notification rate 78.9/100,000 persons) [ 3 ]. As a low-TB-burden country, it is now an appropriate time to take the next step toward the goal of TB elimination in the ROK.
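The reported decline can be checked directly from the two case counts quoted above. The following is a minimal sketch; the helper name `percent_decrease` is introduced here for illustration only and is not part of the original analysis.

```python
# Notified incident TB cases in the ROK, as quoted in the text.
cases_2011 = 39_557
cases_2021 = 18_335


def percent_decrease(old: float, new: float) -> float:
    """Relative decrease from `old` to `new`, expressed as a percentage."""
    return (old - new) / old * 100


decrease = percent_decrease(cases_2011, cases_2021)
print(f"Decrease in notified cases, 2011 to 2021: {decrease:.1f}%")
```

The decrease refers to the number of notified cases; the notification rates per 100,000 persons are based on different population denominators in each year and are not recomputed here.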

Timely TB diagnosis plays a crucial role in TB elimination by reducing the disease burden and preventing further community-based infections [ 6 , 7 ]. Numerous studies in various countries have focused on estimating the time delay for early TB diagnosis and treatment while identifying associated risk factors [ 7 - 19 ]. Previously, we also conducted research at the national level in the ROK, following a similar approach and drawing on references from previous studies [ 20 ]. However, we concluded that the binary time approach used in previous studies from other countries did not offer significant assistance in the context of the ROK. This is because the absolute value of the diagnostic delay time in the ROK was shorter than that observed in other countries. Therefore, we considered the clinical usefulness of this conventional dichotomous time-divided approach as limited for our context and attempted to approach the delayed diagnosis issue from a new perspective by focusing on the pulmonary TB (PTB) disease spectrum.

The “iceberg concept” has been applied to describe TB infection [ 21 ], illustrating that TB infection exists in varying quantities within the population. Furthermore, this concept helps to elucidate the disease spectrum, ranging from latent TB to clinical disease. According to the iceberg concept, two directions of effort are necessary to eliminate TB: the first involves moving from the bottom to the top of the iceberg in preventing progression from latent TB to active TB infection, whereas the second involves moving from the top to the bottom of the iceberg, reflecting the fact that early diagnosis and prompt treatment of active disease are crucial in preventing further community-based transmission of the infection.

In the ROK, active TB screening, including screening for progression from latent TB infection (LTBI) to clinical TB infection, especially in schools and the military, helped to significantly decrease the TB incidence among younger individuals in their 10s and 20s [ 4 ]. Active TB contact investigation and strongly recommended treatment for LTBI remain necessary. The remaining gaps in the early diagnosis of active TB therefore need to be assessed at the national level to help establish TB control policies and to indicate where efforts should be focused.

In this study, we investigated the diagnostic delay in timely PTB diagnosis; quantified the time-consuming processes contributing to patient, health care, and overall delays in access to TB treatment in the ROK based on the PTB disease spectrum; and compared demographic characteristics between cases diagnosed with mild versus severe disease. These data could reveal the epidemiological characteristics of PTB in the ROK, thereby informing the development of a real-world approach to PTB diagnosis that could provide important insights into the nature of the process wherein a patient develops symptoms and is finally diagnosed with PTB.

Korea TB Cohort and Recruitment

The Korea TB Cohort (KTBC) is a nationwide, prospective, and observational cohort comprising active TB cases from 172 public-private mix (PPM)–participating hospitals in 21 districts (>70% of all patients with TB in the ROK were treated in these PPM hospitals) since July 2018 [ 22 ]. Each patient with TB was notified, treated, and followed up every month until the completion of anti-TB treatment based on the national TB program. An investigation of the detailed data of the characteristics of active TB cases was planned to enable the establishment of a long-term plan in the future that shared the aim of an advanced National TB Elimination Project, which is operated by the Korean Academy of Tuberculosis and Respiratory Diseases under supervision of the Korea Disease Control and Prevention Agency (KDCA). The inclusion criteria for the KTBC include notified TB cases in all participating hospitals by the Korea National TB Surveillance System.

After enrollment in the KTBC, specialist TB nurses from each hospital conducted detailed interviews with patients with TB and completed standardized case-level forms. This process included comprehensive investigations into patient information, including comorbidities, height, body weight, economic status, employment status, social status, education level, and symptoms. Additionally, data related to the program were gathered, including details about treatment initiation, discontinuation, termination, and adverse effects, along with mortality. The collected data were checked by regional and central data managers. Following a regional and central audit, a central statistical team analyzed and organized the data every quarter.

Data collected from July 1, 2018, to December 31, 2020, were obtained from the KTBC. We included all PTB cases and excluded those with only extrapulmonary TB (EPTB) because the disease spectrum of EPTB could not be clearly determined. Furthermore, we excluded patients with rifampicin-resistant PTB to reduce heterogeneity when validating the developed Pulmonary Tuberculosis Spectrum Score (PTBSS) by comparing the disease spectrum of TB with outcomes ( Figure 1 ).

Definitions of Time Delays in the Diagnostic Pathway

We divided the PTB time interval into patient, health care, and overall delays [ 7 ]. A patient delay was defined as the duration between the onset of PTB-related symptoms and the first hospital visit. A health care delay was defined as the duration between the first hospital visit and initiation of anti-PTB treatment after a confirmed diagnosis of PTB. The overall delay was defined as the sum of the patient and health care delays.
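These three definitions translate into simple date arithmetic. The sketch below is illustrative only: the function name `diagnostic_delays`, its argument names, and the example dates are assumptions made for this example and do not come from the KTBC data dictionary.

```python
from datetime import date


def diagnostic_delays(symptom_onset: date, first_visit: date,
                      treatment_start: date) -> dict:
    """Compute patient, health care, and overall delays in days.

    patient delay     = first hospital visit - symptom onset
    health care delay = anti-PTB treatment start - first hospital visit
    overall delay     = patient delay + health care delay
    """
    patient = (first_visit - symptom_onset).days
    health_care = (treatment_start - first_visit).days
    if patient < 0 or health_care < 0:
        raise ValueError("dates must be ordered: onset <= visit <= treatment")
    return {
        "patient": patient,
        "health_care": health_care,
        "overall": patient + health_care,
    }


# Hypothetical patient: symptoms on March 1, first visit on March 15,
# treatment started on March 20.
example = diagnostic_delays(date(2019, 3, 1), date(2019, 3, 15),
                            date(2019, 3, 20))
print(example)
```

Because the patient delay depends on a self-reported symptom onset date, this component carries the recall uncertainty that the health care delay does not.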

Determination of the PTBSS Based on the PTB Disease Spectrum

The PTBSS was designed based on previously published proposals for the PTB disease spectrum ( Figure 2 ) [ 23 - 28 ]. Due to the lack of a true reference value that can accurately reflect the PTB disease spectrum, we identified factors that are commonly used in the diagnosis of PTB and can assess disease severity based on the epidemiological concept of PTB. We calculated the PTBSS according to six important variables: presence of symptoms, positive sputum in TB-polymerase chain reaction (PCR), positive sputum in an acid-fast bacilli (AFB) smear, sputum culture-positive Mycobacterium tuberculosis , cavitation detected on chest x-ray, and bilateral lung involvement of PTB on chest x-ray. However, since these six factors we identified and set were not weighted through statistical analysis, it may be challenging to view them as equally important. In addition, asymmetry can occur as the frequency of the manifestation of each factor can differ depending on the stage in the PTB disease spectrum. Therefore, for analysis, we proceeded by grouping in the following manner. Depending on the score assigned from 0 to 6 points, scores of 0-1, 2-3, and 4-6 were classified as mild, moderate, and severe disease, respectively.
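As an illustration of the scoring rule described above, the following sketch sums the six unweighted binary findings and maps the total to the three severity groups. The class and field names are hypothetical labels chosen for this example; only the six variables and the cutoffs (0-1, 2-3, 4-6) are taken from the text.

```python
from dataclasses import dataclass, astuple


@dataclass
class PTBFindings:
    """Six binary findings used in the score (field names are illustrative)."""
    symptoms: bool
    sputum_pcr_positive: bool
    afb_smear_positive: bool
    culture_positive: bool
    cavitation: bool
    bilateral_involvement: bool


def ptbss(findings: PTBFindings) -> int:
    """Unweighted score: one point per finding present (range 0-6)."""
    return sum(bool(value) for value in astuple(findings))


def severity(score: int) -> str:
    """Map a score to the severity group used in the analysis."""
    if not 0 <= score <= 6:
        raise ValueError("PTBSS must be between 0 and 6")
    if score <= 1:
        return "mild"
    if score <= 3:
        return "moderate"
    return "severe"


# Hypothetical patient: symptomatic and PCR positive, with no other findings.
patient = PTBFindings(True, True, False, False, False, False)
score = ptbss(patient)
print(score, severity(score))
```

Grouping the scores rather than using each value separately reflects the caveat above: the six factors are not statistically weighted, so adjacent scores are not assumed to be equally spaced in severity.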

Validation of the PTBSS

While the PTBSS was designed based on a consensus around the disease spectrum, as supported by existing scientific evidence, it is necessary to validate the PTBSS within appropriate clinical and public health contexts. However, effective and specific biomarkers that are applicable in real-world settings have not been used, and these biomarkers cannot be implemented within the framework of the KTBC, which consists of clinical practice data from a real-world setting. Therefore, we sought to validate our results by comparing mortality rates according to the PTBSS. In accordance with the iceberg concept, disease progression could ultimately result in death. Hence, higher mortality could be associated with a longer disease duration. If the group with a high PTB score exhibits a high mortality rate, this group may have a longer disease period, which can be interpreted as the time elapsed from disease onset to diagnosis.

Statistical Analysis

Categorical data are described as numbers and percentages, which were compared using the χ 2 or Fisher exact test. Continuous variables are expressed as mean (SD) or median (IQR) for normal or skewed distributions, respectively, and were compared using the t test or Mann-Whitney U test. The Kaplan-Meier method was used to estimate the cumulative survival rates of active PTB according to the PTBSS during anti-TB treatment. To determine intergroup differences, survival curves were compared by means of log-rank tests, and hazard ratios with 95% CIs were estimated using Cox regression analysis. A multivariate logistic regression analysis by binary classification was used to identify independent risk factors for each mild and severe disease condition at the time of diagnosis, based on the PTBSS, as measured by the estimated odds ratio with 95% CI, including variables with P <.20 on univariate analysis [ 29 ]. All analyses were two-sided and statistical significance was set at P <.05. Statistical analyses were performed using R version 4.2.0 (R Foundation for Statistical Computing) and GraphPad Prism 9.4.
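For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines. This is not the code used in the study, which was run in R; it is a plain-Python illustration, and the function name `kaplan_meier` and the toy follow-up data are assumptions made for this example.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up duration for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each time with a death.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Toy data (months of follow-up); not drawn from the KTBC.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t={t}: S(t)={s:.2f}")
```

Censored patients contribute to the number at risk until their follow-up ends but never reduce the survival estimate themselves, which is what distinguishes this estimate from a simple proportion of deaths.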

Ethical Considerations

The Institutional Review Board of Hallym University Kangdong Sacred Heart Hospital approved the study protocol (approval number 2022-08-003-001) and waived the requirement for written informed consent from the participants because of the purely observational, noninterventional study design and analysis of anonymized patient data. The data are stored by the KDCA with authority to use as surveillance data for public health and research purposes.

Participants

The flowchart in Figure 1 shows the recruitment of the study population after the exclusion of patients according to the above-described criteria. The distributions of these 14,031 patients with PTB according to the PTBSS are shown in Table 1 .

The largest proportion of active cases of PTB were diagnosed as grades II to III (22.2% and 21.1%, respectively, according to a PTBSS of 1-2) and the smallest proportion of cases were diagnosed as grade VII (3.0% according to a PTBSS of 6). After the reclassification of severity based on the PTBSS, 37.0%, 38.0%, and 25.0% of patients with PTB were diagnosed with mild, moderate, and severe forms, respectively ( Table 2 ). The detailed clinical characteristics of the enrolled 14,031 patients with PTB according to the PTBSS and reclassification of severity are presented in Table S1 in Multimedia Appendix 1 .
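The severity shares follow from the group counts reported in the abstract (5191, 5328, and 3512 patients). A short check, with variable names chosen for this example:

```python
# Group sizes after reclassification, as reported in the abstract.
counts = {"mild": 5191, "moderate": 5328, "severe": 3512}

total = sum(counts.values())
shares = {group: round(n / total * 100, 1) for group, n in counts.items()}

print(total)
print(shares)
```

The three groups sum to the full analysis population of 14,031 patients, so no patient is left unclassified.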

a TB PCR: tuberculosis polymerase chain reaction.

b AFB: acid-fast bacilli.

c MTB: Mycobacterium tuberculosis .

Validation of the PTBSS Based on Survival Analysis

We evaluated the cumulative survival rate of KTBC patients with respect to both all-cause and PTB-related mortality, according to both the PTBSS itself and reclassified disease severity based on the PTBSS. With regard to all-cause mortality ( Figure 3 , Table S2 in Multimedia Appendix 1 ), survival rates were higher in patients with a low PTBSS and mild disease than in those with high scores and moderate-to-severe disease (Kaplan-Meier analysis, log-rank test P <.001). The overall cumulative mortality rates during anti-TB treatment for active PTB were 4.9% (grade I, score of 0), 8.0% (grade II, score of 1), 11.0% (grade III, score of 2), 14.3% (grade IV, score of 3), 14.6% (grade V, score of 4), 15.6% (grade VI, score of 5), and 14.6% (grade VII, score of 6). After reclassification by severity, the overall cumulative mortality rates were 6.8% (mild disease), 12.4% (moderate disease), and 14.9% (severe disease).

With respect to PTB-related mortality ( Figure 4 , Table S3 in Multimedia Appendix 1 ), survival rates were higher in patients with a low PTBSS and mild disease compared with those in the high score and moderate-to-severe disease groups ( P <.001). The PTB-related cumulative mortality rates during anti-TB treatment for active PTB were 0.4% (grade I), 0.8% (grade II), 1.5% (grade III), 2.7% (grade IV), 4.2% (grade V), 7.3% (grade VI), and 8.1% (grade VII). After reclassification by severity, the PTB-related cumulative mortality rates were 0.7% (mild disease), 2.0% (moderate disease), and 5.7% (severe disease). Considering the correlation between the PTBSS and mortality, the PTBSS can be considered to reflect the natural course of PTB.

Time Delays According to the PTBSS in the Diagnostic Pathway

After evaluating whether the PTBSS reflects the natural course of PTB, we checked all possible types of time delays according to the scores ( Figure 5 ). As the PTBSS increased, the health care delay gradually decreased. However, the patient delay decreased with an increase in the PTBSS from 1 to 3 and then increased again from a score of 4 and above. The overall delay, as the sum of the health care and patient delays, gradually increased as the PTBSS increased. When including time delays in the diagnostic pathway and classifying them on a spectrum from mild to severe disease, health care and overall delays increased with increased disease severity. However, the patient delay exhibited a U-shaped pattern ( Figure 5 ). This means that different patient-related time delays in the real-world setting, ranging from symptom onset of PTB to the first hospital visit, appeared at both extremes (ie, in the mild and severe forms) of PTB. This further means that two different approaches are required to reduce the patient delay.

Furthermore, we conducted comparative analyses to verify the impact of each factor included in the construction of the PTBSS on patient delay, health care delay, and overall delay. The results are presented in Table 3 . Patients without symptoms exhibited differences in health care delay, likely indicating that symptomatic patients receive expedited testing in hospitals. Both sputum TB PCR positivity and AFB smear positivity were associated with differences in both patient and health care delays. In contrast, a positive Mycobacterium tuberculosis (MTB) culture in sputum did not exhibit any significant difference in either the patient or health care delay. The presence of cavitation on chest x-rays appeared to affect both patient and health care delays similarly. However, while bilateral lung involvement seemed to reduce the health care delay, it did not show a significant impact on the patient delay.

a N/A: not applicable.

b TB PCR: tuberculosis polymerase chain reaction.

c AFB: acid-fast bacilli.

d MTB: Mycobacterium tuberculosis .

e PTB: pulmonary tuberculosis.

Risk Factors for a Diagnosis of Mild or Severe Disease Based on the PTBSS

The independent risk factors associated with mild or severe disease at the time of diagnosis were investigated ( Table 4 ). In the multivariate analysis adjusted for potential confounding factors, age and male sex were independently associated with severe PTB at diagnosis (adjusted odds ratios of 1.014 and 1.422, respectively). The significant differences in symptoms such as cough and/or sputum, dyspnea, and hemoptysis may be attributable to the progression of PTB itself rather than to the longer patient delay seen in severe PTB. In contrast, no factor was significantly associated with mild PTB.

a PTB: pulmonary tuberculosis.

b TB: tuberculosis.

c EPTB: extrapulmonary tuberculosis.

Principal Results

The time elapsed from the onset of PTB to diagnosis could be heterogeneous depending on personal, cultural, and health system situations. For effective TB control, the best solution needs to be identified according to each situation. We have previously recognized that a time-only approach is insufficient for the timely diagnosis of active PTB. In this study, we developed a new approach based on the disease spectrum of PTB, deviating from the perspective that has been studied based on the time delay itself.

The main findings of this study are as follows. First, the PTBSS could reflect the disease spectrum of PTB by considering the correlation of the score with mortality. Second, the pattern of time delays differed according to the PTBSS. In health care delays according to the PTBSS, greater PTB progression was associated with a shorter diagnosis period, because the condition is microbiologically easy to diagnose. However, with respect to patient delays, the change in elapsed time showed a U-shaped pattern as PTB progressed. This means that a remarkable patient delay in the real-world setting might occur at both apical ends of the spectrum (ie, in mild and severe PTB). This further means that two different approaches are required to reduce the patient delay. Third, the independent risk factors of a late visit to a medical institution as patient delay factors were age and male sex in the severe form of PTB. In contrast, there were no significant risk factors for mild PTB. Considering the natural course of PTB infection within the population, our approach could be helpful for a diagnostic strategy of active PTB involving passive case finding for the severe form and active case finding for mild forms [ 26 ] ( Figure 5 ).

Comparison With Prior Work

A diagnostic approach for the timely diagnosis and subsequent treatment of PTB is essential to reduce ongoing transmission in the community and PTB-related morbidity and mortality [ 30 , 31 ]. This approach is composed of two different types of time delays: patient and health care delays [ 7 ]. Both patient and health care delays have different medical and public health implications. Patient delay is the major determining time period for the total duration of the PTB diagnostic pathway, and is associated with infectiousness due to long-term exposure to others in the community and poor outcomes due to disease progression [ 11 , 20 ]. In contrast, the health care delay is the period immediately preceding the diagnosis of PTB, which is characterized by the highest infectious state in the disease course of PTB and is associated with in-hospital transmission to health care workers or other patients [ 32 ]. To date, studies have investigated timely diagnosis and related risk factors according to the time dichotomy [ 8 , 9 , 11 , 12 , 14 , 20 , 33 ]. However, this approach has inherent limitations for general application because the time delay and risk factors can vary greatly depending on the environments within each country [ 7 , 9 , 10 , 12 , 13 , 15 , 19 , 33 ]. Among the types of time delays, the patient delay is greatly influenced by the culture of each country, such as the perception of disease, personal circumstances, and medical policy. Health care delay will be affected by the specialty of the health care provider and the medical resources of each country. Therefore, it was difficult to identify consistent risk factors in previous studies.

PTB can be presented as a dynamic spectrum along with pathophysiology resulting from bacterial progression and associated changes in the host response, which is distinct from the binary simple classification of active and latent TB infection [ 25 , 27 ]. To reach the goal of TB elimination, a promising approach that possesses the above-mentioned characteristics is essential for a timely diagnosis of PTB and includes recognizing the development of subclinical PTB from an LTBI state, early detection of active PTB from subclinical PTB, and timely differential diagnosis of severe PTB from mild PTB [ 24 , 27 , 28 , 34 ]. The insights and perspectives on PTB as a spectrum in a disease state are becoming increasingly accepted, leading to new diagnostic approaches for different stages of the disease spectrum. However, in clinical practice, in a real-world setting, this is considerably more complicated than what is usually expected because of the limited evidence on predictive biomarkers for disease progression within active PTB [ 24 ]. Therefore, a simple and accessible approach is needed from both clinical and public health perspectives. Thus, we attempted to design the PTBSS for clinical and public use, which is expected to be easily applied in clinical and public health practice, as the PTBSS consists of variables that are widely used in real-world settings.

We aimed to validate this proposed and devised scoring system by examining the correlation of the PTBSS and the mortality rate during the treatment of PTB. The results revealed that for all-cause mortality, there was no significant difference in the Kaplan-Meier curve and hazard ratio between high-grade scores (eg, grade IV to VII), while clear distinctions were observed when grouping into mild, moderate, and severe disease categories. Conversely, for PTB-related mortality, significant differences were observed either when considering each PTBSS grade or when classifying patients into three distinct groups. We believe that the PTBSS, as an operational definition to predict the disease spectrum of PTB, reflects PTB-related mortality relatively better than all-cause mortality, which indirectly might demonstrate its excellence in reflecting the disease spectrum of PTB. Among 1538 deaths (10.9%) recorded after a confirmed diagnosis and treatment initiation in the KTBC, 342 (2.4%) were related to PTB, while 1196 (8.5%) were due to other causes. Most PTB-related deaths were a result of respiratory failure (n=324, 94.7%), with secondary ischemic heart disease accounting for 20 deaths (5.4%) and death from massive hemoptysis for 5 cases (1.5%).

As shown in Table 3 , we conducted comparative analyses to assess the impact of each factor used in PTBSS construction on each delay. Asymptomatic patients experienced health care delays, suggesting that hospitals may expedite testing for those with symptoms. A positive sputum TB PCR result was linked to both patient and health care delays; health care providers might quickly decide on treatment for TB PCR–positive patients, thereby reducing health care delays. However, TB PCR positivity may also reflect longer patient delays due to later hospital visits after symptom onset. AFB smear positivity of sputum also affected both delays, likely for similar reasons. Conversely, positive MTB cultures in sputum did not influence either delay, possibly because treatment typically starts when PTB is suspected clinically rather than waiting for culture results. Cavitation on chest x-rays was associated with delays in a similar way to TB PCR and AFB smear positivity. Bilateral lung involvement seemed to shorten the health care delay but not the patient delay, as health care providers might act quickly in such cases, while the extent of disease (whether unilateral or bilateral) might not affect the patient's decision to seek care. With this novel approach, we determined the level of severity at which cases of PTB are diagnosed in the ROK, together with the corresponding degrees of patient delay and health care delay. This new perspective will help to change the classic perspective of active PTB and establish a more specific and realistic PTB elimination policy. We confirmed that the patient delay in the ROK did not increase linearly with PTB progression.

The definition of patient delay, which refers to the time taken by a patient to visit a health care facility after the onset of symptoms, can lead to discrepancies in the measured period and the associated risk factors due to the inherent inaccuracy of a patient’s subjective symptom onset. Because of these limitations, numerous studies have shown that while the health care delay remains somewhat consistent, the patient delay significantly varies depending on each country in question. Factors such as each society’s medical infrastructure, economic conditions, perspectives on the disease, and language barriers are complexly intertwined, suggesting that an integrated approach that a society can employ is necessary. From this standpoint, we believe that using objective factors, rather than just relying on a patient’s subjective symptom onset, and employing the PTBSS, which can reflect the natural progression of PTB, could be helpful in understanding the factors causing TB diagnostic delays in each society.

According to our study, older individuals with PTB were more likely to be diagnosed with advanced disease, possibly because they were unaware of TB-related symptoms because of their preexisting comorbidities or because they considered the possibility of diseases other than PTB. Thus, older adults were probably more likely to visit medical facilities late after symptom onset. Moreover, even if older people visit a medical institution, the diagnosis of PTB can be delayed because of the possibility of other diseases. Furthermore, male patients were more likely to be diagnosed with advanced disease. Previous observational studies conducted in different countries have reported findings both consistent and inconsistent with these results [ 8 , 11 , 12 , 14 ]; thus, more research is needed.

Limitations

This study has some limitations. First, this study was performed in the ROK, a country with a high level of medical resources, the highest rate of health care use among the Organisation for Economic Co-operation and Development (OECD) countries, an aging population, and a low prevalence of HIV. Therefore, our analysis may have overestimated or underestimated the prevalence of PTB. Second, the patient delay could be affected by recall bias because this factor is determined by the patient’s own symptom description. Third, we only evaluated the disease spectrum using clinical variables, without specific immunological or bacteriological data, due to a lack of laboratory-based information. Fourth, the timing data for patient delay may vary between passive and active case-finding. Active case-finding might result in a shorter patient delay compared to passive case-finding. Nevertheless, we believe that a patient delay would primarily occur in symptomatic patients in the ROK, where a nationwide health insurance system enables citizens to access medical services despite economic constraints. Koreans are more likely to seek medical care at facilities compared to citizens from other OECD countries due to the ease of accessibility to medical services. However, we acknowledge that despite these favorable circumstances, there could be instances where some symptomatic patients might not seek medical attention due to personal reasons. To gain a deeper understanding of these diagnostic situations, the KTBC made the decision, in March 2022 during a meeting with the KDCA, to investigate the reasons for hospital visits. As a result, we plan to consider these factors and incorporate relevant analyses in our future follow-up studies.

Conclusions

A diagnostic approach for the timely diagnosis of PTB would be improved if based on the disease spectrum rather than the traditional dichotomous time-only approach. The PTBSS, as a simple and intuitive scoring system, could facilitate clinical and public approaches for TB detection and elimination specific to the context of each country.

Acknowledgments

We would like to thank all the participants of this study. The nationwide Korea Tuberculosis Cohort was supported by the National Health Promotion Fund, funded by the Korea Disease Control and Prevention Agency, Republic of Korea.

Data Availability

The data that support the findings of this study are available upon request from the Korea Disease Control and Prevention Agency (KDCA). These data are not publicly available because of privacy or ethical restrictions. The Korea Tuberculosis Cohort data are currently being organized and there is a plan to make these data available to interested researchers in 2024, provided that prior permission is obtained from the KDCA.

Authors' Contributions

Y Ko, JM, JSK, and JSP conceptualized the study. Y Ko, JM, HWK, HKK, JYO, JSK, YJJ, EL, and BY curated the data. Y Ko, JM, HWK, HKK, JYO, JSK, YJJ, EL, BY, and JSP conducted the formal analysis. Y Ko, JM, and JSK designed the methodology. Y Kwon, JY, JYH, YJJ, JYK, SSL, JSP, and JSK supervised the study. Y Ko and JSP wrote the original draft. Y Ko, JM, HWK, HKK, JYO, JSK, YJJ, JSP, Y Kwon, JY, JYH, YJJ, EL, and BY reviewed and edited the paper. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest

None declared.

Characteristics and time delays of participants according to Pulmonary Tuberculosis Spectrum Score (Table S1); Cox proportional hazard model for all-cause mortality (Table S2); Cox proportional hazard model for pulmonary tuberculosis–related mortality (Table S3).

  • Global Tuberculosis Programme (GTB) WHO Team. Global Tuberculosis Report. Geneva, Switzerland. World Health Organization; 2018.
  • Park YK, Park Y, Na KI, Cho EH, Shin S, Kim HJ. Increased tuberculosis burden due to demographic transition in Korea from 2001 to 2010. Tuberc Respir Dis. Mar 2013;74(3):104-110. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Annual report on the notified tuberculosis in Korea, 2022. Korea Disease Control and Prevention Agency. URL: https://www.kdca.go.kr/board/board.es?mid=a31001000000&bid=0130 [accessed 2024-02-11]
Edited by A Mavragani; submitted 20.03.23; peer-reviewed by HJ Kim, T Li; comments to author 15.06.23; revised version received 01.08.23; accepted 31.01.24; published 01.04.24.

©Yousang Ko, Jae Seuk Park, Jinsoo Min, Hyung Woo Kim, Hyeon-Kyoung Koo, Jee Youn Oh, Yun-Jeong Jeong, Eunhye Lee, Bumhee Yang, Ju Sang Kim, Sung-Soon Lee, Yunhyung Kwon, Jiyeon Yang, Ji yeon Han, You Jin Jang, Jinseob Kim. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 01.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.


  • Volume 14, Issue 3
  • Association between physical activity over a 10-year period and current insomnia symptoms, sleep duration and daytime sleepiness: a European population-based study

  • http://orcid.org/0000-0002-6585-5777 Erla Bjornsdottir 1 ,
  • http://orcid.org/0000-0001-6561-3746 Elin Helga Thorarinsdottir 2 , 3 ,
  • Eva Lindberg 4 ,
  • Bryndis Benediktsdottir 3 , 5 ,
  • Karl Franklin 6 ,
  • Debbie Jarvis 7 , 8 ,
  • Pascal Demoly 9 ,
  • http://orcid.org/0000-0001-7034-0615 Jennifer L Perret 10 ,
  • Judith Garcia Aymerich 11 , 12 ,
  • http://orcid.org/0000-0002-0798-2153 Sandra Dorado-Arenas 13 ,
  • Joachim Heinrich 14 , 15 ,
  • http://orcid.org/0000-0001-8509-7603 Kjell Torén 16 ,
  • http://orcid.org/0000-0002-0003-1988 Vanessa Garcia Larsen 17 ,
  • Rain Jögi 18 ,
  • Thorarinn Gislason 3 , 5 ,
  • http://orcid.org/0000-0001-5093-6980 Christer Janson 4
  • 1 Department of psychology , Reykjavik University , Reykjavik , Iceland
  • 2 Department of psychology , Heilsugæsla Höfuðborgarsvæðisins , Reykjavik , Iceland
  • 3 Department of psychology, Faculty of Medicine , University of Iceland , Reykjavik , Iceland
  • 4 Department of Medical Sciences: Respiratory, Allergy and Sleep Research , Uppsala University , Uppsala , Sweden
  • 5 Department of Sleep , Landspítali Háskólasjúkrahús , Reykjavik , Iceland
  • 6 Department of Surgical and Perioperative Sciences , Umea Universitet , Umea , Sweden
  • 7 Population Health and Occupational Disease, National Heart and Lung Institute , Imperial College London School of Public Health , London , UK
  • 8 Department of psychology, MRC-PHE Centre for Environment and Health , Imperial College London , London , UK
  • 9 Department of psychology, University Hospital of Montpellier , University of Montpellier–INSERM UMR UA11 , Montpellier , France
  • 10 Department of psychology, Melbourne Medical School , The University of Melbourne , Melbourne , Victoria , Australia
  • 11 Department of psychology, Centre for Research in Environmental Epidemiology (CREAL) , ISGlobal , Barcelona , Spain
  • 12 Department of psychology , Universitat Pompeu Fabra , Barcelona , Spain
  • 13 Department of Pulmonology , Hospital Galdakao-Usansolo , Galdacano , Spain
  • 14 Department of psychology, Institute and Clinic for Occupational, Social and Environmental Medicine , Ludwig Maximilians University Munich , Munchen , Germany
  • 15 Department of psychology, Allergy and Lung Health Unit, Melbourne School of Population and Global Health, Melbourne Medical School , The University of Melbourne , Melbourne , Victoria , Australia
  • 16 Occupational and Environmental Medicine, Institutionen för Medicin , Göteborgs Universitet , Göteborg , Sweden
  • 17 Program in Human Nutrition, Department of International Health , Johns Hopkins University Bloomberg School of Public Health , Baltimore , Maryland , USA
  • 18 Department of psychology, The Lung Clinic , Tartu University Hospital , Tartu , Estonia
  • Correspondence to Dr Erla Bjornsdottir; erlabjo{at}gmail.com

Objectives To explore the relationship between physical activity over a 10-year period and current symptoms of insomnia, daytime sleepiness and estimated sleep duration in adults aged 39–67.

Design Population-based, multicentre cohort study.

Setting 21 centres in nine European countries.

Methods Included were 4339 participants in the third follow-up to the European Community Respiratory Health Survey (ECRHS III), who answered questions on physical activity at baseline (ECRHS II) and questions on physical activity, insomnia symptoms, sleep duration and daytime sleepiness at 10-year follow-up (ECRHS III). Participants who reported that they exercised two or more times a week, for 1 hour/week or more, were classified as being physically active. Changes in activity status were categorised into four groups: persistently non-active; became inactive; became active; and persistently active.

Main outcome measures Insomnia, sleep time and daytime sleepiness in relation to physical activity.

Results Altogether, 37% of participants were persistently non-active, 25% were persistently active, 20% became inactive and 18% became active from baseline to follow-up. Participants who were persistently active were less likely to report difficulties initiating sleep (OR 0.60, 95% CI 0.45–0.78), a short sleep duration of ≤6 hours/night (OR 0.71, 95% CI 0.59–0.85) and a long sleep of ≥9 hours/night (OR 0.53, 95% CI 0.33–0.84) than persistently non-active subjects after adjusting for age, sex, body mass index, smoking history and study centre. Daytime sleepiness and difficulties maintaining sleep were not related to physical activity status.

Conclusion Physically active people have a lower risk of some insomnia symptoms and extreme sleep durations, both long and short.

  • sleep medicine
  • epidemiology
  • primary care
  • public health
  • sports medicine

Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:  http://creativecommons.org/licenses/by-nc/4.0/ .

https://doi.org/10.1136/bmjopen-2022-067197

STRENGTHS AND LIMITATIONS OF THIS STUDY

The longitudinal study design, in which the exposure (physical activity) is measured 10 years prior to the sleep outcomes, enables an investigation into whether the consistency of physical activity over time has an impact on current symptoms of insomnia, sleep duration and daytime sleepiness.

Data were collected using standardised and validated procedures and instruments, increasing their internal validity.

Data were obtained from nine European countries, increasing the external validity of our findings.

One limitation of our study is that sleep variables are only available at the follow-up, which precluded testing their effect on baseline physical activity.

Insomnia symptoms, sleep durations and daytime sleepiness data were obtained by questionnaire and no sleep disorder diagnoses from medical providers or objective assessments were available.

Introduction

Disturbed sleep is common in the general population and impacts health and quality of life. 1–3 Chronic sleep disturbances are associated with cardiovascular disease, metabolic dysfunction, psychiatric disorders and increased mortality. 4–6

Physical activity and sleep

Regular exercise is associated with better health and several studies suggest that physical activity (PA) is beneficial to sleep and may improve symptoms of chronic insomnia. 7–10 It is, however, unclear how significant these benefits are and which factors may have a moderating effect on them. 11 The positive association between PA and sleep may be subject to multiple moderating factors such as gender, age, body mass index (BMI), fitness level, general health and the characteristics of the type of exercise in question. Therefore, sleep and PA probably influence each other through complex, reciprocal interactions including multiple physiological and psychological pathways. 7

PA and daytime sleepiness

There is evidence that more PA is associated with less daytime sleepiness. 12–17 Cross-sectional studies have shown that low PA is associated with an increased likelihood of excessive daytime sleepiness (EDS) 14–16 and that subjects participating in exercise are less likely to have EDS. 12 17 In older adults, increasing PA by doing home exercises has been shown to improve EDS and reduce the prevalence of insomnia symptoms, 13 while another study showed that increasing PA protected women from future insomnia. 18 Other studies have contradictory findings. In an epidemiological study of 4405 Koreans, daytime sleepiness was more common among those in the top quartile of PA compared with those in the lowest quartile group. 19 Among patients with obstructive sleep apnoea, increased PA was associated with a lower severity of disease and a 28% decrease in EDS. 20 The daily association between PA and sleep duration was described in 2021, based on a systematic review and meta-analysis of 33 peer-reviewed papers, which showed that, on the night following increased PA, there was a lower total sleep time. 21

Limitations of previous studies

There is a lack of epidemiological data from long-term follow-up studies of large cohorts exploring the association of PA with sleep length, daytime sleepiness and insomnia symptoms. Previous research on PA and sleep-related outcomes has several important limitations. Most studies are cross-sectional or have a short follow-up interval, which makes it impossible to determine whether increased PA improves sleep outcomes or whether reduced PA is a consequence of sleep problems. Finally, the effects of PA on sleep length, daytime sleepiness and insomnia symptoms have not been studied simultaneously.

Aims of the current study

Therefore, the aim of the present study was to assess the inter-relationship between PA, based on frequency, duration and intensity, and symptoms of insomnia, self-reported sleep durations and daytime sleepiness among middle-aged subjects from 21 centres in nine countries at two moments in time, 10 years apart, providing important longitudinal follow-up data.

Materials and methods

We studied participants from the second and third follow-up surveys of the European Community Respiratory Health Survey (ECRHS II and III, www.ecrhs.org ), an international, population-based, multicentre cohort study of asthma and allergic disease, which was first carried out in 1990. Detailed descriptions of the methods used for ECRHS I and ECRHS II have been published elsewhere. 22 23 Briefly, participating centres randomly selected samples from subjects aged 20–44 in order to track them for asthma, allergy and lung disease (see: www.ecrhs.org ). Participants completed a short postal questionnaire about asthma and asthma-like symptoms and, from those who responded, a random sample was selected to undergo a more detailed clinical examination. In ECRHS II, subjects who had participated in the clinical phase of ECRHS I (performed between 1991 and 1994) were invited to participate in the follow-up study. The clinical phase of ECRHS II was carried out between 1998 and 2002. ECRHS III is the second follow-up study and was carried out from February 2011 to January 2014. 22–24 The present study is based on data from ECRHS II and III (see figure 1 for the flow chart).

Figure 1. Flow chart of the study population in the European Community Respiratory Health Survey (ECRHS).

Health, habits and measurements

Subjects answered the core ECRHS questionnaires, which included questions on lifestyle, respiratory symptoms, smoking history and general health. ‘Current smokers’ were defined as those who smoked tobacco regularly during the last month. ‘Former smokers’ were defined as smokers who denied having smoked regularly in the month prior to the examination. Those who reported no regular smoking at the time of or prior to the examination were defined as ‘never smokers’. The participants’ height and weight were measured and their BMI was calculated. 24
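The smoking categories and the BMI calculation described above can be written down compactly. The following is a minimal sketch under stated assumptions, not the ECRHS coding scheme: the function names and the two boolean inputs (`ever_smoked_regularly`, `smoked_regularly_last_month`) are hypothetical, and BMI is taken as the standard weight in kilograms divided by height in metres squared.

```python
def smoking_status(ever_smoked_regularly: bool,
                   smoked_regularly_last_month: bool) -> str:
    """Map two yes/no answers (hypothetical inputs) to the three categories."""
    if smoked_regularly_last_month:
        return "current"   # smoked tobacco regularly during the last month
    if ever_smoked_regularly:
        return "former"    # regular smoker in the past, but not in the last month
    return "never"         # no regular smoking now or previously


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Standard BMI: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2
```

For example, `smoking_status(True, False)` gives `"former"`, and `body_mass_index(70, 1.75)` is about 22.9 kg/m².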

Assessment of PA

PA was assessed in ECRHS II and III using replies from questionnaires. The assessment of PA in ECRHS has previously been described in detail, including how both the frequency and duration of PA were used to divide the population into categories. 22 In brief, participants were asked how often and for how many hours per week they usually exercised to the point that they became out of breath or sweaty. Participants who exercised two or more times a week, for at least 1 hour/week, were classified as physically active. Changes in activity status from baseline to follow-up were categorised into four PA groups: persistently non-active (non-active at both baseline and follow-up), became inactive (active at baseline and non-active at follow-up), became active (non-active at baseline and active at follow-up) and persistently active (active at both baseline and follow-up).
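The classification rule above can be sketched in a few lines of code. This is an illustration only, not the authors' analysis script; the function and argument names (`is_active`, `pa_change_group`, `sessions_per_week`, `hours_per_week`) are hypothetical.

```python
def is_active(sessions_per_week: float, hours_per_week: float) -> bool:
    """Physically active: exercise two or more times a week AND at least 1 hour/week."""
    return sessions_per_week >= 2 and hours_per_week >= 1


def pa_change_group(active_baseline: bool, active_followup: bool) -> str:
    """Combine activity status at baseline (ECRHS II) and follow-up (ECRHS III)."""
    if active_baseline and active_followup:
        return "persistently active"
    if active_baseline and not active_followup:
        return "became inactive"
    if not active_baseline and active_followup:
        return "became active"
    return "persistently non-active"


# Hypothetical participant: 3 sessions and 2 hours/week at baseline,
# 1 session and 1 hour/week at follow-up.
group = pa_change_group(is_active(3, 2), is_active(1, 1))
```

Both conditions must hold for a participant to count as active, so someone exercising once a week for several hours is still classified as non-active under this rule.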

Sleep questionnaires and measurements

Sleep-related symptoms were assessed using the Basic Nordic Sleep Questionnaire, 25 where participants were asked about the frequency of insomnia symptoms. Answers were provided on a scale of 1–5: (1) never or very seldom, (2) less than once a week, (3) once to twice a week, (4) three to five times a week, (5) every day or almost every day of the week. Insomnia symptoms were defined using answers to three questions from the Basic Nordic Sleep Questionnaire: ‘I have difficulties falling asleep at night’ (difficulties initiating sleep), ‘I wake up often during the night’ (difficulties maintaining sleep) and ‘I wake up early in the morning and can’t fall back asleep’ (early morning awakenings). Those who reported these symptoms of insomnia ≥3 times a week (scores 4 and 5) were considered to have the corresponding insomnia subtype. Daytime sleepiness was evaluated using the Epworth Sleepiness Scale (ESS), a brief questionnaire that measures daytime sleepiness based on the likelihood of falling asleep in eight different situations. 26 Participants with an ESS score >10 were considered to have EDS. Participants were asked the question: how much sleep do you estimate that you get on average each night? According to their answers, they were classified as: short sleepers (≤6 hours/night), normal sleepers (6–9 hours/night) or long sleepers (≥9 hours/night).
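The outcome definitions above translate directly into threshold rules. The sketch below is an illustration with hypothetical function names, not the study's code. One assumption is labelled in the comments: because the stated ranges overlap at exactly 6 and 9 hours, the sketch assigns 6 hours to short sleep and 9 hours to long sleep, so that normal sleep is strictly between the two.

```python
def frequent_symptom(score: int) -> bool:
    """Symptom present if reported >=3 times a week (scores 4 or 5 on the 1-5 scale)."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("score must be an integer from 1 to 5")
    return score >= 4


def insomnia_symptom_count(initiating: int, maintaining: int, early_waking: int) -> int:
    """Number of insomnia subtypes present (0-3) from the three questionnaire items."""
    return sum(frequent_symptom(s) for s in (initiating, maintaining, early_waking))


def has_eds(ess_score: int) -> bool:
    """Excessive daytime sleepiness: Epworth Sleepiness Scale score above 10."""
    return ess_score > 10


def sleep_duration_category(hours: float) -> str:
    # Assumption: boundary values follow the <=6 and >=9 definitions,
    # so "normal" means strictly more than 6 and strictly less than 9 hours.
    if hours <= 6:
        return "short"
    if hours >= 9:
        return "long"
    return "normal"
```

Under these rules, a participant with item scores of 4, 2 and 5 has two insomnia symptoms, and "any insomnia symptom" corresponds to a count of at least one.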

Patient and public involvement

The study’s design did not involve patients or the general public. However, all participating patients were informed of the research objectives and their informed consent was obtained. The survey was completed by participants voluntarily and no input from patients was sought in interpreting or writing up the results. The results of the research will not be disseminated to the patients.

Statistical analysis

Data are presented as numbers and percentages or mean±SD, depending on distribution. For bivariate analysis, the χ² test and one-way analysis of variance were used for nominal and continuous variables, respectively. Logistic regression was used for multivariable analyses to estimate the association between PA and sleep-related outcomes. The model was adjusted for potential confounders including age, sex, BMI, smoking history and study centre. In the analysis, all variables, including study centre (n=21), were treated as fixed effects. STATA V.16 was used for all statistical analyses.
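For readers less familiar with how logistic regression output becomes an OR with a 95% CI, the relationship can be sketched as follows. This is not the authors' Stata code, and it assumes the reported intervals are Wald-type intervals computed on the log-odds scale; the function names are hypothetical. The example uses the short-sleep estimate given in the abstract (OR 0.71, 95% CI 0.59–0.85).

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def odds_ratio_ci(beta: float, se: float, z: float = Z_95):
    """Convert a logit coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_reported(or_: float, lower: float, upper: float, z: float = Z_95):
    """Approximate (beta, se) implied by a reported OR and its Wald 95% CI."""
    beta = math.log(or_)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Round trip with the abstract's short-sleep estimate.
beta, se = coef_from_reported(0.71, 0.59, 0.85)
or_, ci_low, ci_high = odds_ratio_ci(beta, se)
```

Because the interval is symmetric only on the log scale, the recovered limits match the published ones after rounding to two decimals rather than exactly.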

Participants and level of PA

From a total of 5850 participants in ECRHS II, we excluded those with missing data and included a total of 4339 participants (48% men) (see figure 1 ). Figure 2 shows the prevalence of insomnia symptoms, short and long sleep durations and daytime sleepiness among subjects in the different countries included in the study. From baseline to 10 years later, 36.9% of participants were persistently non-active, 17.9% became physically active at follow-up, 20.3% of participants became inactive and 24.9% were persistently active ( table 1 ). There were geographical differences in the level of PA between the ECRHS countries ( figure 3 ). Participants in Norway were most likely to be persistently active, while participants in Spain, followed by Estonia, were most likely to be persistently non-active ( figure 3 ).

Figure 2. Prevalence of any insomnia symptom, short sleep duration (≤6 hours/night), long sleep duration (≥9 hours/night) and daytime sleepiness (Epworth Sleepiness Scale (ESS) score >10) by country.

Figure 3. Activity levels by country.

Table 1. Characteristics and general health of the participants by the level of physical activity

General characteristics and health

Persistently active participants were more often men, they were younger and they had a slightly lower BMI ( table 1 ). They were also less likely to be current smokers and more likely to be currently working ( table 1 ).

Insomnia symptoms

In unadjusted analysis, there was a significant difference in reporting difficulties initiating sleep, early morning awakenings and any insomnia symptom where those persistently active were least likely to report these symptoms. Also, persistently active subjects were the least likely to report having two or three insomnia symptoms ( table 2 ). After adjusting for age, sex, BMI, smoking history and study centre, this negative association remained significant for difficulties initiating sleep (OR 0.58 (0.42–0.77)), any insomnia symptom (OR 0.78 (0.65–0.94)) and reporting two (OR 0.60 (0.43–0.82)) and three (OR 0.63 (0.41–0.98)) insomnia symptoms ( table 3 ). Additionally, in adjusted analysis, persistently active subjects were significantly less likely to report difficulties initiating sleep (OR 0.80 (0.66–0.97)) ( table 3 ). There were also independent associations between insomnia symptoms and age, female gender and BMI ( table 4 ).

Table 2. Insomnia symptoms, sleep duration and daytime sleepiness by level of physical activity

Table 3. Independent association between the level of physical activity and medical disorders, insomnia symptoms, daytime sleepiness and sleep duration expressed as adjusted* ORs (95% CI) with the persistently non-active group as reference

Table 4. Associations between age, sex, BMI and smoking history and sleep-related symptoms

Sleep duration and daytime sleepiness

In unadjusted analysis, there was a significant difference in short and long sleep durations between levels of activity. Those who were persistently active were most likely to be normal sleepers while the persistently non-active were least likely to be in that category (70.9% vs 59.2%, respectively) ( table 2 ). After adjusting for age, sex, BMI, smoking history and study centre, these results remained significant for persistently active subjects. They were significantly more likely to be normal sleepers (OR 1.55 (1.29–1.87)) and significantly less likely to be short sleepers (OR 0.71 (0.58–0.85)) or long sleepers (OR 0.48 (0.28–0.80)) ( table 3 ). Additionally, those who became active were more likely to be normal sleepers than those who were persistently non-active (OR 1.21 (1.00–1.47)) ( table 3 ).

However, there was no significant association between level of PA and either the mean ESS score or the percentage of participants with an ESS score >10 ( tables 2 and 3 ). Daytime sleepiness was also independently associated with smoking ( table 4 ).

The main results of this study were that participants who reported being physically active at the start and end of a 10-year follow-up period were less likely to report insomnia symptoms at the follow-up. We also found that subjects who are persistently active are more likely to sleep the recommended 6–9 hours. This association remained statistically significant after adjusting for sex, age, smoking history and BMI. We also found that persistently active participants were more often men, were younger, had a slightly lower BMI and were less likely to be current smokers and more likely to be currently working.

Our results are in line with previous studies that have shown the beneficial effect of PA on symptoms of insomnia, 9 10 but the current study additionally shows the importance of consistency in exercising over time, because the association was lost for initially active subjects who became inactive. A recent meta-analysis examining the effects of acute and regular exercise on a range of sleep variables showed that acute exercise (less than 1 week of exercise) has a small beneficial effect on many objective measures of sleep, such as total sleep time, insomnia symptoms and sleep quality. 7 Furthermore, this meta-analysis found greater benefits from regular exercise for both subjective and objective sleep parameters over time. Regular exercise had small beneficial effects on total sleep time and sleep efficiency, small-to-medium beneficial effects on sleep onset latency and moderate beneficial effects on sleep quality. 7

There are two recent systematic reviews and meta-analyses on the effects of PA on sleep 7 and insomnia, 9 both reviewing largely the same randomised controlled studies. Banno et al included nine studies with a total of 557 participants. 7 The majority of participants exercised three times or less per week and the follow-up interval was 4 months or shorter in all the studies except one. Their conclusion was that exercise could improve sleep, but that higher quality research was needed. 7 Five studies on insomnia, and, additionally, six on insomnia symptoms, showed shorter sleep latency and higher sleep efficiency, but the authors also acknowledged the small size of the literature and severe methodological limitations, often related to selection bias. 9 In addition, most previous studies are cross-sectional, which can also be considered a limitation.

Furthermore, a recent systematic review of PA and sleep showed that moderate exercise had a more promising outcome in terms of sleep quality than vigorous exercise. It is therefore important to study further the impact of the intensity of PA, in the context of age and gender, when exploring any beneficial impact on sleep. 27

This study has a long follow-up period (10 years) and indicates strongly that consistency in PA might be an important factor in optimising sleep duration and reducing the symptoms of insomnia. Most other studies have had a much shorter follow-up period, 7 which makes it more difficult to assess the consistency of activity over time.

Our results indicate that those who maintain a consistent level of PA are also less likely to be either short (≤6 hours) or long (≥9 hours) sleepers. Those who are physically active in general are also more likely to engage in a healthier lifestyle, 28 which can likewise have an effect on sleep. Lifestyle factors, such as a healthy diet and being physically active, are probably part of a phenotype that characterises those individuals who are generally engaged in a healthy lifestyle. A recent review highlighted the importance of focusing on the combination of sleep, diet and exercise when exploring healthy longevity. 29

The three groups reporting low PA in either of the ECRHS surveys, or at both points in time, all report a very similar prevalence of insomnia symptoms, extreme sleep lengths and daytime sleepiness. This is somewhat surprising, especially given that those who were active in the follow-up survey but not at the baseline have a very similar symptom profile to those who were inactive in both surveys. Our study found that consistency in a behaviour such as PA for more than a decade is strongly related to a lower incidence of insomnia and a more ‘normal’ sleep length. Important information concerning ‘the healthy phenotype’ would be missed if the PA data were available only at baseline or at follow-up but not at both timepoints.

A recent review of 22 randomised controlled trials on the effects of regular exercise (lasting at least 2 months) on self-reported sleep quality, insomnia and daytime sleepiness found that regular PA improved subjective sleep quality, insomnia severity and daytime sleepiness as measured with the ESS. 30 The results regarding insomnia symptoms are in line with our study, but the results on daytime sleepiness differ from ours. This discrepancy could be due to different study populations, as only two studies in this review measured daytime sleepiness using the ESS; one assessed it among people aged 60 years and older, 13 and the other among overweight and obese men. 31 Another recent review of 32 randomised controlled trials on the effects of exercise on sleep disturbances showed that exercise is beneficial in improving sleep quality, symptoms of insomnia, restless legs, sleep apnoea and daytime sleepiness. However, exercise only had significant effects on sleepiness if it had lasted for more than 12 weeks, whereas the duration of the exercise period did not matter for the associations with sleep quality and insomnia symptoms. 32

Another recent study showed that high or increasing levels of PA could protect women from future insomnia. 18 Therefore, exercise seems to have a stronger association with sleep quality and insomnia than with sleepiness, which is in line with our results. However, almost all previous studies share the limitation that the definition of sleepiness is restricted to the estimated likelihood of falling asleep and does not capture the general feeling of sleepiness, which we have shown is also an important part of sleepiness. 33 34 Another recent review exploring the associations of exercise, sleep and cognitive function among older adults showed that PA is associated with improved cognitive function but the association of sleep and cognitive function seems to be U shaped, as too much or too little sleep is negatively associated with cognitive function. 35 We did not explore cognitive function in the current study but it would be interesting for future studies to explore further how cognitive function is affected by the association of PA and sleep.

This study has several strengths, such as its population-based nature, the longitudinal study design and the large sample collected in the same manner at many centres in nine different countries. Another strength is the use of standardised and validated procedures and instruments. The long follow-up period is also a strength since data on PA were collected 10 years apart and subjects were categorised according to change in PA. This study is, however, not without limitations. It is not possible to know whether those who were active at both timepoints had been continuously physically active throughout the study period or only at these two timepoints. Furthermore, PA was only measured using a questionnaire. Another limitation of our study is that sleep variables are only available at the follow-up, and we only have information on insomnia symptoms but not the diagnosis of insomnia disorder. Sleep length and daytime sleepiness are also based on subjective data. Therefore, even though the measurement of PA is longitudinal, it may not be entirely appropriate to describe the associations between PA and sleep outcomes as longitudinal. Also, residual confounders that can influence both PA and sleep (eg, mental health, musculoskeletal disorders/chronic pain) were not explored in the current study and could have influenced the findings.

In conclusion, PA over time is associated with lower prevalence of insomnia symptoms and with sleeping between 6 and 9 hours/night.

Ethics statements

Patient consent for publication, ethics approval.

This study involves human participants and ethical approval from the local research ethics committees and written consent from participants were obtained from each site. Australia: Monash University Human Research Ethics Committee (project number CF11/1818-2010001012). Belgium: Comité voor Medische Ethiek (UZA/UA 11/41/288). Denmark: De Videnskabsetiske Komiteer for Region Midtjylland (M-20110106). Estonia: Research Ethics Committee of the University of Tartu (UT REC 209T-17 and 225/M-24). France: Etude ECRHS III: Promotion CHU de Grenoble. Ethical approval from CPP Sud est V 4 mars 2011. Approval from Ministry of Health (AFSSAPS B110053-70) (Paris, Grenoble, Montpellier, Bordeaux). Germany: Ethikkommission der Bayerischen Landesärztekammer (Positive Votum: 10015). Iceland: National Bioethics Committee of Iceland (VSN-11-121-S3). Italy: Ethics Committee of IRCCS ‘San Matteo’ Hospital Foundation, University of Pavia (approval number 24215/2011) (Pavia), ‘Comitato Etico per la sperimentazione dell’Azienda Ospedaliera Universitaria Integrata di Verona’ (N Prog 1393) (Verona). Norway: Regional Ethics Committee West Norway (2010/759). Spain: Ethics Committee of the Parc de Salut Mar, Barcelona (Comité etic d’investigacio clínica, CEIC)–Parc de Salut Mar, Barcelona (approval number 2009/3500/1). Switzerland: Swiss Academy of Sciences. Sweden: Regional Ethical Review Board in Uppsala (decision number 2010/432). UK: NRES Committee London-Stanmore REC (Ref 11/LO/0965). Participants gave informed consent to participate in the study before taking part.

EB and EHT are joint first authors.

TG and CJ are joint senior authors.

Contributors EHT and EB contributed equally to drafting and preparing the manuscript and were responsible for communications with other coauthors. TG and CJ participated in the design of the study, manuscript preparation and review of the manuscript at several stages. EHT performed the statistical analysis with help from CJ. EL, BB, KF, DJ, PD, JLP, JGA, SD-A, JH, KT, VGL and RJ participated in data collection and/or reviewing of the paper. TG is responsible for the writing of this manuscript and the accuracy of the data, accepts full responsibility for the work and/or the conduct of the study, had access to the data, and controlled the decision to publish.

Funding Financial support for ECRHS III: Australia: National Health and Medical Research Council. Belgium: Antwerp South, Antwerp City: Research Foundation Flanders (FWO), grant code G041008N10 (both sites). Estonia: Tartu-SF0180060s09 from the Estonian Ministry of Education. France: (All) Ministère de la Santé, Programme Hospitalier de Recherche Clinique (PHRC) National 2010; Bordeaux: INSERM U897, Université Bordeaux Segalen; Grenoble: Comite Scientifique AGIRadom 2011; Paris: Agence Nationale de la Santé, Région Ile de France, Domaine d’intérêt majeur (DIM). Germany: Erfurt: German Research Foundation HE 3294/10-1; Hamburg: German Research Foundation MA 711/6-1, NO 262/7-1. Iceland: Reykjavik, Landspitali University Hospital Research Fund, University of Iceland Research Fund, Icelandic College of Family Physicians Research Fund, ResMed Foundation, California, USA, Orkuveita Reykjavikur (Geothermal plant), Vegagerðin (Icelandic Road Administration, ICERA), Icelandic Research Fund (grant number 173701-052). Italy: All Italian centres were funded by the Italian Ministry of Health, Chiesi Farmaceutici. In addition, Verona was funded by Cariverona Foundation, Education Ministry (MIUR). Norway: Norwegian Research Council (grant number 214123), Western Norway Regional Health Authorities (grant number 911631), Bergen Medical Research Foundation. Spain: Fondo de Investigación Sanitaria (PS09/02457, PS09/00716 09/01511, PS09/02185, PS09/03190), Servicio Andaluz de Salud, Sociedad Española de Neumología y Cirurgía Torácica (SEPAR 1001/2010); Barcelona: Fondo de Investigación Sanitaria (FIS PS09/00716); Galdakao: Fondo de Investigación Sanitaria (FIS 09/01511); Huelva: Fondo de Investigación Sanitaria (FIS PS09/02185) and Servicio Andaluz de Salud; Oviedo: Fondo de Investigación Sanitaria (FIS PS09/03190). 
Sweden: All centres were funded by the Swedish Heart and Lung Foundation, Swedish Asthma and Allergy Association, Swedish Association Against Lung and Heart Disease, Swedish Research Council for Health, Working Life and Welfare (FORTE); Göteborg: Also received further funding from the Swedish Council for Working Life and Social Research; Umea also received funding from Vasterbotten Country Council ALF grant. Switzerland: Swiss National Science Foundation (grant numbers 33CSCO-134276/1, 33CSCO-108796, 3247BO-104283, 3247BO-104288, 3247BO-104284, 3247-065896, 3100-059302, 3200-052720, 3200-042532, 4026-028099), Federal Office for Forest, Environment and Landscape, Federal Office of Public Health, Federal Office of Roads and Transport, Canton’s government of Aargan, Basel-Stadt, Basel-Land, Geneva, Luzern, Ticino, Valais and Zürich, Swiss Lung League, Canton's Lung League of Basel Stadt/Basel, Landschaft, Geneva, Ticino, Valais and Zurich, SUVA, Freiwillige Akademische Gesellschaft, UBS Wealth Foundation, Talecris Biotherapeutics, Abbott Diagnostics, European Commission 018996 (GABRIEL), Wellcome Trust (WT 084703MA). UK: Medical Research Council (grant number 92091). Support was also provided by the National Institute for Health Research through the Primary Care Research Network.

Competing interests None declared.

Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

Provenance and peer review Not commissioned; externally peer reviewed.

  • Volume 58, Issue 8
  • Physical fitness in male adolescents and atherosclerosis in middle age: a population-based cohort study

  • http://orcid.org/0000-0002-2691-0315 Ángel Herraiz-Adillo 1 ,
  • http://orcid.org/0000-0003-1383-3194 Viktor H Ahlqvist 2 ,
  • http://orcid.org/0000-0001-5205-122X Sara Higueras-Fresnillo 1 , 3 ,
  • http://orcid.org/0000-0002-3751-7180 Kristofer Hedman 4 ,
  • Emil Hagström 5 ,
  • Melony Fortuin-de Smidt 6 ,
  • Bledar Daka 7 ,
  • Cecilia Lenander 8 ,
  • http://orcid.org/0000-0003-0616-7779 Daniel Berglind 2 , 9 ,
  • Carl Johan Östgren 1 , 10 ,
  • http://orcid.org/0000-0003-3120-0913 Karin Rådholm 1 , 11 ,
  • http://orcid.org/0000-0003-2001-1121 Francisco B Ortega 12 , 13 ,
  • http://orcid.org/0000-0003-2482-7048 Pontus Henriksson 1
  • 1 Department of Health, Medicine and Caring Sciences , Linköping University , Linköping , Sweden
  • 2 Department of Global Public Health , Karolinska Institutet , Stockholm , Sweden
  • 3 Department of Physical Education, Sport and Human Motricity , Universidad Autónoma de Madrid , Madrid , Spain
  • 4 Department of Clinical Physiology in Linköping, and Department of Health, Medicine and Caring Sciences , Linköping University , Linköping , Sweden
  • 5 Department of Medical Sciences, Cardiology , Uppsala University , Uppsala , Sweden
  • 6 Department of Public Health and Clinical Medicine , Umeå University , Umeå , Sweden
  • 7 School of Public Health and Community Medicine, Institute of Medicine, Sahlgrenska Academy , University of Gothenburg Sahlgrenska Academy , Goteborg , Sweden
  • 8 Department of Clinical Sciences in Malmö, Centre for Primary Health Care Research , Lund University , Lund , Sweden
  • 9 Centre for Epidemiology and Community Medicine , Region Stockholm, Stockholm , Sweden
  • 10 Centre of Medical Image Science and Visualization (CMIV) , Linköping University , Linköping , Sweden
  • 11 The George Institute for Global Health , University of New South Wales , Sydney , New South Wales , Australia
  • 12 Department of Physical Education and Sports, Faculty of Sport Sciences, Sport and Health University Research Institute (iMUDS) and CIBEROBN Physiopathology of Obesity and Nutrition , University of Granada , Granada , Spain
  • 13 Faculty of Sport and Health Sciences , University of Jyväskylä , Jyväskylä , Finland
  • Correspondence to Dr Pontus Henriksson, Department of Health, Medicine and Caring Sciences, Linköping University, Linköping, Östergötland, Sweden; pontus.henriksson{at}liu.se

Objectives To examine the associations between physical fitness in male adolescents and coronary and carotid atherosclerosis in middle age.

Methods This population-based cohort study linked physical fitness data from the Swedish Military Conscription Register during adolescence to atherosclerosis data from the Swedish CArdioPulmonary bioImage Study in middle age. Cardiorespiratory fitness was assessed using a maximal cycle-ergometer test, and knee extension muscular strength was evaluated through an isometric dynamometer. Coronary atherosclerosis was evaluated via Coronary Computed Tomography Angiography (CCTA) stenosis and Coronary Artery Calcium (CAC) scores, while carotid plaques were evaluated by ultrasound. The associations were analysed using multinomial logistic regression, adjusted (marginal) prevalences and restricted cubic splines.

Results The analysis included 8986 male adolescents (mean age 18.3 years) with a mean follow-up of 38.2 years. Physical fitness showed a reversed J-shaped association with CCTA stenosis and CAC, but no consistent association was observed for carotid plaques. After adjustments, compared with adolescents in the lowest tertile of cardiorespiratory fitness and muscular strength, those in the highest tertile had 22% (OR 0.78; 95% CI 0.61 to 0.99) and 26% (OR 0.74; 95% CI 0.58 to 0.93) lower ORs for severe (≥50%) coronary stenosis, respectively. The highest physical fitness group (high cardiorespiratory fitness and muscular strength) had 33% (OR 0.67; 95% CI 0.52 to 0.87) lower OR for severe coronary stenosis compared with those with the lowest physical fitness.

Conclusion This study supports that a combination of high cardiorespiratory fitness and high muscular strength in adolescence is associated with lower coronary atherosclerosis, particularly severe coronary stenosis, almost 40 years later.

  • Cardiovascular Diseases

Data availability statement

Data are available upon reasonable request. The data underlying this article cannot be shared publicly due to legal reasons as well as the privacy of individuals that participated in the study. However, by contacting the study organisation (www.scapis.org) or the corresponding author, information will be provided regarding the procedures for accessing data following Swedish legislation.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:  http://creativecommons.org/licenses/by-nc/4.0/ .

https://doi.org/10.1136/bjsports-2023-107663

WHAT IS ALREADY KNOWN ON THIS TOPIC

Higher physical fitness levels, including both cardiorespiratory and muscular fitness, are associated with lower cardiovascular disease-related non-fatal and fatal events in adults. This association has also been observed for fitness during adolescence and later cardiovascular disease incidence and mortality.

No previous study has examined physical fitness in adolescence in relation to the development of coronary atherosclerosis in middle age, which may link fitness and the risk of cardiovascular events.

WHAT THIS STUDY ADDS

Our study provides novel evidence supporting that the combination of high cardiorespiratory fitness and high muscular strength in adolescence is associated with lower coronary atherosclerosis, particularly severe coronary stenosis, almost 40 years later.

These results suggest that coronary atherosclerosis is likely one of the mechanisms underlying the association between physical fitness and cardiovascular disease morbidity and mortality.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

Our results support the clinical value of assessing both cardiorespiratory and muscular fitness for cardiovascular risk stratification.

Long-term interventions able to improve both cardiorespiratory fitness and muscular strength in adolescents could contribute to prevention of atherosclerosis in adulthood.

Introduction

Despite positive trends in the Western world during recent decades, 1 2 cardiovascular disease (CVD) remains the leading cause of mortality worldwide. 3 Atherosclerosis, an inflammatory condition affecting all arterial regions, is the principal pathway involved in CVD. 3 Subclinical atherosclerosis, characterised by the presence of plaques in the arterial walls, is an early marker of CVD and an important predictor of future cardiovascular events. 4 Thus, identification of early modifiable risk factors is crucial for effective prevention of CVD and mortality globally.

A high level of physical fitness, including cardiorespiratory fitness and muscular strength, is considered a crucial factor in preventing CVD, cardiovascular mortality and all-cause mortality. Thus, the American Heart Association recognises cardiorespiratory fitness as a vital clinical sign 5 due to its strong association with positive cardiovascular outcomes, including improved cardiac structure and function, reduced atherosclerosis and decreased risk of CVD and all-cause mortality. 6–10 Additionally, while the associations are less pronounced compared with cardiorespiratory fitness, increased muscular strength also exhibits beneficial effects, including lower prevalence of atherosclerosis, decreased risk of CVD and lower mortality. 11 12

Nevertheless, despite a recent emphasis on prevention of CVD in younger individuals, 13 there is a lack of evidence on the impact of cardiorespiratory fitness in adolescence on the long-term development of atherosclerosis during late middle age. Such evidence could help to elucidate whether physical fitness early in life is related to atherosclerosis development decades later, which may be of paramount importance for primary prevention of CVD. However, only one previous study has investigated the associations between physical fitness in adolescence and carotid atherosclerosis in middle age. 14 Furthermore, to the best of our knowledge, no previous study has examined associations of physical fitness in adolescence with coronary atherosclerosis later in life. In our study, Coronary Computed Tomography Angiography (CCTA), an accurate non-invasive imaging technique, enables a comprehensive assessment of the atherosclerotic burden, since CCTA allows the characterisation and quantification of both calcified and non-calcified plaques in the coronary arteries. 15 16

Therefore, the aim of this study was to examine the associations of physical fitness in male adolescents with coronary and carotid atherosclerosis in middle age, using a population-based sample and a notably long follow-up.

Study design and population

This cohort study linked information on atherosclerosis in middle age using data from the Swedish CArdioPulmonary bioImage Study (SCAPIS) (n=14 646) to information on cardiorespiratory fitness and muscular strength in male adolescents, obtained from the Swedish Military Conscription Register. Linkage of both databases, which determined our sample size (n=10 802), 17 was conducted through a personal identification number assigned to all Swedish residents at birth or on immigration. In our study, the Swedish Military Conscription Register comprised male adolescents born in Sweden between 1953 and 1968 who performed conscription between 1972 and 1987 (at ≈18 years of age). During this period, conscription was mandatory by law, except in rare circumstances, and the Swedish Military Conscription Register therefore includes 82%–92% of all Swedish men at the time of conscription. 18 SCAPIS is a collaborative project comprising six different universities in Sweden (Gothenburg, Linköping, Malmö/Lund, Stockholm, Umeå and Uppsala) aiming to predict and prevent cardiovascular and pulmonary disease. The participants included in SCAPIS were between 50 and 64 years old. Details about the SCAPIS protocol have been published elsewhere. 19

In this study, the inclusion criteria were: (1) men <20 years old at conscription with available data on cardiorespiratory fitness, muscular strength and covariates (age, site, body mass index (BMI), duration of smoking and conscription year) and (2) available data on coronary or carotid atherosclerosis and covariates (age, site and educational status) in SCAPIS.

Online supplemental figure 1 depicts a flow chart for the study. In brief, of the 14 646 male participants included in SCAPIS, 8986 male adolescents had data on exposures, covariates and at least one of the atherosclerosis outcomes. Thus, the final sample sizes consisted of 8006, 7849 and 8934 participants for the analysis of coronary stenosis, Coronary Artery Calcium (CAC) score and carotid plaques, respectively.

Exposures at conscription.

Details about cardiorespiratory fitness and muscular strength protocols have been published elsewhere. 20–22 Briefly, cardiorespiratory fitness was assessed with a maximal exercise test on an electrically braked cycle ergometer, provided participants had a normal ECG at rest. The conscription protocol commenced with a 5-min warm-up, during which the workload was determined based on the individual’s weight. Subsequently, the workload was increased stepwise by 25 W every minute until exhaustion or incapacity to maintain the intended pedal cadence (60–70 revolutions/min). Cardiorespiratory fitness was defined as the maximal work rate achieved (in W). 23

Three different measures were considered for muscular strength: knee extension, handgrip and elbow flexion strength (in N). Knee extension strength was considered as the main exposure since previous studies have suggested it to be the most powerful indicator of health-related muscular strength in the Swedish Military Conscription Register. 22 Strength variables were measured with an isometric dynamometer test performed at maximal contraction capacity. Knee extension and elbow flexion strength were evaluated in a sitting position with 90° flexion over the knee and elbow joint, respectively, while handgrip strength was measured by positioning the hand vertically, with 90° flexion over the elbow joint.

Atherosclerosis outcomes at SCAPIS

Coronary atherosclerosis.

The detailed imaging protocol for SCAPIS has been published elsewhere. 19 Participants with a technical failure in any of the four proximal segments on the CCTA images were excluded from the analysis of coronary plaques and CAC score. 19 24

Coronary plaques were studied through two different levels of characterisation: grade of lumen stenosis and composition of the plaques from an arterial tree level. In our study, regarding the grade of lumen stenosis, the participants were finally categorised considering the segment with the greatest amount of stenosis within the 11 clinically most relevant segments (1–3, 5–7, 9, 11–13, 17) 25 as follows: no stenosis, 1%–49% stenosis and severe (≥50%) stenosis. 19 24 The presence of a ‘calcium blooming’ artefact and stent were considered as 1%–49% stenosis and ≥50% stenosis, respectively. A segment involvement score was calculated as the total number of relevant coronary segments with atherosclerosis irrespective of the degree of stenosis (range 0–11). 26 Regarding composition of the plaques from an arterial tree level, coronary atherosclerosis was further characterised as: no plaque, only non-calcified plaque/s (all identified plaque/s are non-calcified), only calcified plaque/s (all identified plaque/s are calcified) and mixed composition (presence of both calcified and non-calcified segments in the arterial tree).
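The participant-level stenosis category and the segment involvement score described above can be sketched in Python. This is only an illustration, not the SCAPIS processing pipeline: the input layout (a mapping from segment number to percent lumen stenosis), the function names and the treatment of segments absent from the mapping as plaque-free are assumptions made for the example.

```python
# Illustrative sketch; the data layout and names are hypothetical.
# The 11 clinically most relevant segments listed in the text.
RELEVANT_SEGMENTS = (1, 2, 3, 5, 6, 7, 9, 11, 12, 13, 17)


def stenosis_category(segments):
    """Classify a participant by the most stenosed relevant segment.

    `segments` maps segment number -> percent lumen stenosis (0-100).
    Segments missing from the mapping are treated as 0% (an assumption).
    """
    worst = max(segments.get(s, 0) for s in RELEVANT_SEGMENTS)
    if worst == 0:
        return "no stenosis"
    if worst < 50:
        return "1-49% stenosis"
    return ">=50% stenosis"


def segment_involvement_score(segments):
    """Count relevant segments with any atherosclerosis (range 0-11)."""
    return sum(1 for s in RELEVANT_SEGMENTS if segments.get(s, 0) > 0)


# Segment 4 is not among the relevant segments, so it is ignored here.
example = {1: 0, 2: 30, 4: 90, 6: 60}
print(stenosis_category(example), segment_involvement_score(example))
```

In this sketch a calcium blooming artefact or a stent would have to be coded upstream as a value in the 1%–49% or ≥50% range, respectively, to mirror the rule stated in the text.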

In addition to CCTA images, a total CAC score was obtained according to an international standard protocol 27 by adding the calcium content in each coronary artery, 28 29 and the total CAC score was divided into three categories commonly used in clinical practice as follows: 0, 1–99 and ≥100 Agatston units. Subjects with an implanted stent or previous coronary artery bypass grafting were not evaluated for CAC.
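As a small illustration of the three CAC categories, the following Python sketch maps a total Agatston score to its category. The function name is hypothetical, and placing fractional scores between 0 and 1 in the middle category is an assumption, since the text only names the three bands.

```python
def cac_category(agatston_score):
    """Map a total CAC score (Agatston units) to 0, 1-99 or >=100."""
    if agatston_score < 0:
        raise ValueError("Agatston score cannot be negative")
    if agatston_score == 0:
        return "0"
    if agatston_score < 100:
        # Assumption: any positive score below 100 falls in this band.
        return "1-99"
    return ">=100"


print([cac_category(s) for s in (0, 42, 100, 350)])
```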

Carotid atherosclerosis

Carotid artery two‐dimensional grey scale images were examined using a standardised protocol with a Siemens Acuson S2000 ultrasound scanner equipped with a 9L4 linear transducer (Siemens, Forchheim, Germany) and interpreted by regularly trained operators. 19 Carotid plaque was defined in accordance with the Mannheim consensus. 30 Common carotid, bulb and internal carotid arteries were examined, and participants without valid readings in both right and left carotid arteries were excluded from the analysis. Participants were classified as having either no plaque, unilateral plaque/s or bilateral carotid plaques. 31 For splines analysis, a carotid plaque score was calculated as follows: no plaque=0, unilateral plaque/s=1 and bilateral plaques=2.
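The carotid plaque score used for the splines analysis can be written as a one-line function. This is a minimal sketch; the boolean per-side inputs and the function name are assumptions for illustration.

```python
def carotid_plaque_score(plaque_right, plaque_left):
    """Return 0 (no plaque), 1 (unilateral) or 2 (bilateral plaques)."""
    return int(bool(plaque_right)) + int(bool(plaque_left))


print(carotid_plaque_score(True, False))
```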

Covariates

BMI at conscription was calculated as weight (kg)/height squared (m²) obtained by standardised procedures. Years of smoking at conscription were calculated based on self-reported age of smoking initiation in SCAPIS (for those who reported previous or ongoing smoking). To account for temporal trend differences at conscription, the year of conscription (ranging from 1972 to 1987) was categorised into four distinct periods, each spanning 4 years. Educational status at SCAPIS was categorised as unfinished primary school, primary school, secondary school and university degree.
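The covariate derivations above can be sketched as follows. The BMI formula is as stated in the text. The exact period boundaries (1972–1975, 1976–1979, 1980–1983, 1984–1987) and the subtraction used for years of smoking are assumptions that are consistent with the description but not confirmed by it, and all function names are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    return weight_kg / height_m ** 2


def conscription_period(year):
    """Assign a conscription year (1972-1987) to one of four 4-year periods.

    Assumption: consecutive periods starting in 1972.
    """
    if not 1972 <= year <= 1987:
        raise ValueError("conscription year outside 1972-1987")
    start = 1972 + 4 * ((year - 1972) // 4)
    return f"{start}-{start + 3}"


def years_smoking_at_conscription(age_at_conscription, age_started):
    """Years smoked before conscription; 0 for never-smokers or later starters.

    Assumption: age at conscription minus self-reported age of initiation.
    """
    if age_started is None:
        return 0.0
    return max(0.0, age_at_conscription - age_started)


print(round(bmi(70, 1.80), 1), conscription_period(1979))
```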

Statistical analysis

We performed a complete case analysis excluding participants without complete data on exposures (0.7% in muscular strength and 13.2% in cardiorespiratory fitness), outcomes (0.8% in carotid plaque, 4.3% in CAC and 6.1% in coronary stenosis) and any covariates (14.3%). In total, 16.8% of participants had missing values in any exposure or covariate. Three types of analyses were performed. First, the adjusted non-linear associations between quantitative exposures and atherosclerosis outcomes (summarised as scores) were evaluated through linear regression models incorporating restricted cubic splines with four knots located at the 5th, 35th, 65th and 95th percentiles. 32 33 Second, the associations between tertiles of physical fitness in adolescence and atherosclerosis outcomes (CCTA coronary stenosis, CAC score and carotid plaque) in middle age were examined through multinomial logistic regression models and adjusted (obtained by marginalisation/parametric g-formula) prevalences. 34 Third, we fitted restricted cubic splines (four knots located at the 5th, 35th, 65th and 95th percentiles) within multinomial logistic regression models. The analyses had increasing levels of covariate control: (1) unadjusted model; (2) adjusted model (by age at conscription, age at SCAPIS, site in conscription, site in SCAPIS, conscription year, educational status at SCAPIS, BMI at conscription and years of smoking at conscription). We created a directed acyclic graph to illustrate the hypothesised associations of physical fitness with atherosclerosis ( online supplemental figure 2 ). Adjusted models were further adjusted for knee extension strength in analyses of cardiorespiratory fitness and for cardiorespiratory fitness in analyses of muscular strength. Adjusted models were selected as the main analysis in splines and multinomial regressions for a better understanding of the isolated contribution of exposures on atherosclerosis outcomes without the influence of known confounders. The reference category for fitness tertiles was selected as the lowest tertile, while in multinomial models, the absence of atherosclerosis was chosen as the reference. Cut-offs for the tertiles of the different exposures are shown in table 1 . Combined associations of cardiorespiratory fitness and knee extension strength were examined considering the first tertiles as the low categories, and the second and third tertiles as the high categories.
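To make the spline specification concrete, the sketch below builds a restricted cubic spline basis with four knots at the 5th, 35th, 65th and 95th percentiles of an exposure. It uses the standard truncated-power formulation scaled by the squared knot range; this is an assumption about the parameterisation, and statistical packages may use a different but equivalent basis or a different percentile convention. The function names and the toy exposure values are hypothetical.

```python
def percentile(values, p):
    """Percentile by linear interpolation (one of several conventions)."""
    xs = sorted(values)
    pos = (len(xs) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def rcs_basis(x, knots):
    """Return [x, s_1(x), ..., s_{k-2}(x)] for a restricted cubic spline.

    With k knots there are k-1 columns: the linear term plus k-2
    non-linear terms. The curve is linear beyond the outer knots.
    """
    t = list(knots)

    def cube(u):
        return max(u, 0.0) ** 3

    d = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2  # keeps all columns on the scale of x
    terms = [x]
    for j in range(len(t) - 2):
        s = (cube(x - t[j])
             - cube(x - t[-2]) * (t[-1] - t[j]) / d
             + cube(x - t[-1]) * (t[-2] - t[j]) / d)
        terms.append(s / scale)
    return terms


# Toy exposure values (hypothetical), e.g. maximal work rate in W.
exposure = list(range(101))
knots = [percentile(exposure, p) for p in (5, 35, 65, 95)]
print(knots, len(rcs_basis(50, knots)))
```

The resulting columns would enter the linear or multinomial logistic model in place of the raw exposure, alongside the adjustment covariates.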

Table 1. Descriptive characteristics of the participants in the study by tertiles of cardiorespiratory fitness and knee extension strength in adolescence

To examine the robustness of our main findings, we conducted a series of sensitivity analyses in coronary stenosis as follows: (1) including participants with data on all 18 coronary segments of the arterial tree (instead of including participants with data on the 11 most relevant segments), (2) recategorising calcium blooming as ≥50% stenosis (instead of analysing calcium blooming as 1%–49% stenosis), (3) excluding coronary segments with a stent (instead of considering stents as ≥50% stenosis), (4) excluding participants with self-reported CVD (myocardial infarction, coronary artery bypass grafting, percutaneous coronary intervention, stroke or peripheral arterial disease intervention) in SCAPIS, (5) excluding presumably submaximal exercise tests (either ≤85% or ≤90% of the predicted maximal heart rate, calculated as 208 − (0.7 × age)), 35 36 (6) further adjusting for height at conscription, (7) without adjusting for BMI at conscription and (8) without adjusting for muscular strength in cardiorespiratory fitness and without adjusting for cardiorespiratory fitness in muscular strength. Furthermore, to assess potential selection bias, we conducted multinomial logistic models that integrated inverse probability weighting to account for missing data in exposures, outcomes and the covariates used in the analysis. 37 Finally, a sensitivity analysis was conducted, incorporating quadratic and cubic terms for quantitative covariates to evaluate the presence of non-linearity in these covariates.
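Sensitivity analysis (5) relies on a simple formula, which the following sketch illustrates. The predicted maximal heart rate is taken from the text as 208 − (0.7 × age). Comparing it with the peak heart rate reached during the test, and the function names, are assumptions made for the example.

```python
def predicted_max_hr(age_years):
    """Predicted maximal heart rate (beats/min): 208 - 0.7 * age."""
    return 208 - 0.7 * age_years


def is_presumably_submaximal(peak_hr, age_years, threshold=0.85):
    """Flag a test whose peak heart rate is at or below the threshold
    fraction (0.85 or 0.90 in the text) of the predicted maximum.

    Assumption: `peak_hr` is the highest heart rate recorded in the test.
    """
    return peak_hr <= threshold * predicted_max_hr(age_years)


# For an 18-year-old the predicted maximum is about 195.4 beats/min.
print(predicted_max_hr(18), is_presumably_submaximal(160, 18))
```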

All statistical tests were two-sided and p<0.05 was considered statistically significant. Analyses were conducted using IBM-SPSS-28 (IBM Corp) and Stata V.18 (StataCorp 2021).

Equity, diversity and inclusion statement

This study uses data from the Swedish Military Conscription Register, which includes only male participants, a limitation we acknowledge in the limitations section. SCAPIS is a population-based study that includes men and women from various birth regions; 15.6% of the male participants were born outside of Sweden. We did not impose additional restrictions related to race, ethnicity, culture, socioeconomic status or representation from marginalised groups during the study’s design or data analysis.

The research team comprises a diverse group of clinical and academic researchers from different countries including both women and men (4 women and 9 men).

Results

Overall, included participants had a more favourable profile in smoking status, educational status, physical fitness and atherosclerosis compared with excluded participants ( online supplemental table 1 ).

The characteristics of the study population by tertiles of cardiorespiratory fitness and knee extension strength are presented in table 1 . At conscription, the mean age of participants was 18.3 years, whereas the mean cardiorespiratory fitness and knee extension strength were 259 W and 557 N, respectively. In SCAPIS, the mean age of participants was 56.5 years (mean follow-up 38.2 years), and 52.6% and 58.8% of participants had coronary stenosis and carotid plaques, respectively.

Cardiorespiratory fitness in adolescence and atherosclerosis in middle age

The continuous (left panel) and categorical associations (right panel) of cardiorespiratory fitness in adolescence with coronary and carotid atherosclerosis in middle age are shown in figure 1 ( online supplemental tables 2 and 3 depict the ORs and adjusted prevalences for such associations). In general, splines (left panel) showed a trend towards inverse associations of cardiorespiratory fitness with segment involvement scores and CAC scores that were more pronounced in the low range of cardiorespiratory fitness values. In adjusted models, compared with adolescents in the lowest tertile of cardiorespiratory fitness, those in the medium and highest tertiles had 18% (OR 0.82; 95% CI 0.66 to 1.02) and 22% (OR 0.78; 95% CI 0.61 to 0.99) lower ORs for severe (≥50%) coronary stenosis, respectively (right panel). However, there was no clear association between tertiles of cardiorespiratory fitness and 1%–49% coronary stenosis or CAC scores. Online supplemental figure 3 depicts the multinomial logistic splines for the association between cardiorespiratory fitness (as continuous variable) and atherosclerosis indicators.
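The percentages quoted alongside the ORs follow directly from the OR itself: an OR of 0.78 corresponds to (1 − 0.78) × 100 = 22% lower odds relative to the reference tertile. The sketch below shows this arithmetic; the function name is hypothetical, and the result describes a change in odds, not in risk or prevalence.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage difference in odds versus the reference category.

    Negative values mean lower odds, positive values mean higher odds.
    """
    return round((odds_ratio - 1) * 100)


# ORs reported in the text for severe coronary stenosis (medium and
# highest tertiles of cardiorespiratory fitness).
for odds_ratio in (0.82, 0.78):
    print(odds_ratio, percent_change_in_odds(odds_ratio))
```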


Associations of cardiorespiratory fitness in adolescence with coronary and carotid atherosclerosis in middle age. Left panel depicts adjusted restricted cubic splines with 95% confidence bands for the association of cardiorespiratory fitness in adolescence with segment involvement score (0–11), CAC score and carotid plaque score (0–2) in middle age. X-axes are trimmed to depict the associations for the 1st to 99th percentile of cardiorespiratory fitness values. Right panel depicts adjusted multinomial regression models with 95% CIs for the association of cardiorespiratory fitness in adolescence with coronary stenosis, CAC score and carotid plaques in middle age. Both splines and multinomial models are adjusted for age at conscription, age at SCAPIS, site in conscription, site in SCAPIS, conscription year, educational status at SCAPIS, BMI at conscription, years of smoking at conscription and knee extension strength. BMI, body mass index; CAC, coronary artery calcium; SCAPIS, Swedish CArdioPulmonary bioImage Study.

For carotid atherosclerosis, the pattern differed from that seen for coronary atherosclerosis. Compared with adolescents in the lowest tertile of cardiorespiratory fitness, those in the medium and highest tertiles had 18% (OR 1.18, 95% CI 1.04 to 1.34) and 17% (OR 1.17, 95% CI 1.02 to 1.35) higher ORs for unilateral carotid plaque/s, respectively, while there were no clear associations between cardiorespiratory fitness and bilateral carotid plaques.

Considering composition of the coronary plaques, individuals in the highest tertile of cardiorespiratory fitness had 22% (OR 0.78, 95% CI 0.61 to 0.99) lower odds of mixed composition in the arterial tree ( online supplemental figure 4 and online supplemental table 4 ).

Muscular strength in adolescence and atherosclerosis in middle age

Figure 2 depicts the continuous (left panel) and categorical (right panel) associations of knee extension strength in adolescence with coronary and carotid atherosclerosis in middle age ( online supplemental tables 5,6 depict the ORs and adjusted prevalences for such associations). Overall, splines showed inverse associations between knee extension strength and atherosclerosis outcomes. In consonance with cardiorespiratory fitness, there was a negative association between knee extension strength and severe (≥50%) coronary stenosis in the adjusted model. When compared with the lowest tertile of knee extension strength, those in the medium and highest tertiles had 11% (OR 0.89; 95% CI 0.72 to 1.10) and 26% (OR 0.74; 95% CI 0.58 to 0.93) lower ORs for severe coronary stenosis, respectively. Regarding CAC score, adolescents in the medium and highest tertiles had 19% (OR 0.81; 95% CI 0.68 to 0.96) and 16% (OR 0.84; 95% CI 0.70 to 1.01) lower ORs for a CAC score ≥100. No clear associations were observed between knee extension strength and carotid plaques. Online supplemental figure 5 depicts the multinomial logistic splines for the association between knee muscular strength (as continuous variable) and atherosclerosis indicators.

Associations of knee extension strength in adolescence with coronary and carotid atherosclerosis in middle age. Left panel depicts adjusted restricted cubic splines with 95% confidence bands for the association of knee extension strength in adolescence with segment involvement score (0–11), CAC score and carotid plaque score (0–2) in middle age. X-axes are trimmed to depict the associations for the 1st to 99th percentile of knee extension strength values. Right panel depicts adjusted multinomial regression models with 95% CIs for the association of knee extension strength in adolescence with coronary stenosis, CAC score and carotid plaques in middle age. Both splines and multinomial models are adjusted for age at conscription, age at SCAPIS, site in conscription, site in SCAPIS, conscription year, educational status at SCAPIS, BMI at conscription, years of smoking at conscription and cardiorespiratory fitness. BMI, body mass index; CAC, coronary artery calcium; SCAPIS, Swedish CArdioPulmonary bioImage Study.

Considering composition of the plaques, knee extension strength did not show clear associations with any types of coronary plaques ( online supplemental figure 4 and online supplemental table 7 ).

Handgrip strength and elbow flexion strength exhibited somewhat similar, although weaker, patterns of association with coronary and carotid atherosclerosis compared with knee extension strength ( online supplemental figure 6 and online supplemental tables 8,9 ).

Combined associations of cardiorespiratory fitness and knee extension strength in adolescence with atherosclerosis in middle age

Figure 3 depicts the combined associations of cardiorespiratory fitness and knee extension strength in adolescence with atherosclerosis in middle age, while online supplemental tables 10,11 depict the ORs and adjusted prevalences for these associations. There was a trend towards less severe (≥50%) coronary stenosis with higher levels of cardiorespiratory fitness and strength, with those in the highest physical fitness group having 33% (OR 0.67; 95% CI 0.52 to 0.87) lower OR compared with those with the lowest physical fitness. Regarding CAC score, participants in the highest physical fitness group had 24% (OR 0.76; 95% CI 0.62 to 0.93) lower OR for a CAC score ≥100 compared with those in the lowest physical fitness group. However, those in the highest physical fitness group did not have lower ORs for carotid plaques.

Combined associations of cardiorespiratory fitness and knee extension strength in adolescence with coronary stenosis, CAC score and carotid plaques in middle age. All models depict multinomial regression models with 95% CIs adjusted for age at conscription, age at SCAPIS, site in conscription, site in SCAPIS, conscription year, educational status at SCAPIS, BMI at conscription and years of smoking at conscription. Low categories refer to the first tertile, while high categories refer to the second and third tertiles. BMI, body mass index; CAC, coronary artery calcium; CRF, cardiorespiratory fitness; Strength, knee extension muscular strength; SCAPIS: Swedish CArdioPulmonary bioImage Study.
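The grouping described in the caption (low = first tertile, high = second and third tertiles, applied to both exposures) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the example cut-off values and the handling of values exactly at the cut-off are hypothetical, and the study's own tertile boundaries are not reproduced here.

```python
from statistics import quantiles


def first_tertile_cutoff(values):
    """Upper boundary of the first tertile of the observed values."""
    # quantiles(..., n=3) returns the two cut points that split the data into thirds
    return quantiles(list(values), n=3)[0]


def combined_group(crf_w, strength_n, crf_cutoff, strength_cutoff):
    """Combine two dichotomised exposures into one of four groups.

    Assumption: values at or below the first-tertile cut-off count as "low".
    """
    crf = "low" if crf_w <= crf_cutoff else "high"
    strength = "low" if strength_n <= strength_cutoff else "high"
    return f"{crf} CRF / {strength} strength"


# Hypothetical cut-offs, for illustration only (W for fitness, N for strength)
print(combined_group(crf_w=220, strength_n=610, crf_cutoff=240, strength_cutoff=500))
```

Dichotomising at the first tertile yields four groups, so that the group with high fitness and high strength can be contrasted with the group that is low on both.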

Sensitivity analyses

In coronary atherosclerosis, the inclusion of all coronary segments, the recategorised definition of ‘calcium blooming’ artefact or stent, as well as the exclusion of participants with CVD, did not significantly alter the associations between cardiorespiratory fitness and knee extension strength in relation to coronary stenosis ( online supplemental table 12 ). In a second sensitivity analysis, the exclusion of presumably non-maximal tests generally strengthened the associations with coronary stenosis ( online supplemental table 13 ). Further adjustment for height in adolescence generally attenuated the associations between cardiorespiratory fitness and coronary stenosis but did not influence corresponding associations with knee extension strength ( online supplemental table 14 ). As shown in online supplemental table 15 , removing the adjustment for BMI attenuated the associations of cardiorespiratory fitness and knee extension strength with coronary stenosis. Associations of cardiorespiratory fitness and knee extension strength were generally unaffected when they were not mutually adjusted for each other ( online supplemental table 16 ). Finally, associations between cardiorespiratory fitness and atherosclerosis outcomes remained robust in the inverse probability weighting analysis ( online supplemental table 17 ) and when incorporating quadratic and cubic terms for quantitative covariates in multinomial logistic ( online supplemental tables 18,19 ) and linear models (data not shown).

This large population-based study showed inverse associations between cardiorespiratory fitness during adolescence and coronary atherosclerosis, particularly severe (≥50%) coronary stenosis, almost 40 years later. Furthermore, knee extension strength in adolescence showed inverse associations not only with severe coronary stenosis but also with high CAC scores in middle age. However, neither cardiorespiratory fitness nor knee extension strength was robustly associated with the presence of bilateral carotid plaques. Finally, the combination of high cardiorespiratory fitness and knee extension strength levels was strongly associated with a lower presence of severe coronary stenosis and high CAC scores.

To the best of our knowledge, this is the first study analysing the associations between cardiorespiratory fitness in adolescence and coronary atherosclerosis in middle age measured with CCTA, an accurate non-invasive imaging technique that allows characterisation and quantification not only of calcified but also of non-calcified plaques. In our study, the splines in linear models suggested a somewhat reverse J-shaped pattern for the association between cardiorespiratory fitness and segment involvement score, with values below 240 W (≈first tertile) associated with a worse segment involvement score. In consonance with this, after adjustments, individuals in the highest tertile of cardiorespiratory fitness had 22% lower odds of having severe coronary stenosis. Interestingly, the splines in multinomial models also supported that higher cardiorespiratory fitness is associated with decreased coronary atherosclerosis. However, very high fitness levels may not confer the same protection as moderately high levels, and the splines even suggested a potential negative effect at very high levels of cardiorespiratory fitness (around the 95th percentile). Nevertheless, wide CIs at extreme fitness values preclude definitive conclusions. In this regard, it should be acknowledged that while better levels of cardiorespiratory fitness have been cross-sectionally associated with a lower risk of coronary calcification, 38 certain populations with very high levels of cardiorespiratory fitness such as endurance athletes seem to have an increased burden of coronary atherosclerosis, suggesting a U-shaped relationship. 38–41 Yet, the clinical significance of accelerated coronary artery atherosclerosis in athletes engaged in very high volume-intensity exercise remains unclear. 42 43 Further studies are needed in this context.

Interestingly, high cardiorespiratory fitness was associated with less prevalence of a mixed composition (presence of both calcified and non-calcified segments) in the arterial tree, which is consistent with the lack of association observed between tertiles of cardiorespiratory fitness and CAC. These findings align with previous studies that have reported lower prevalence of mixed plaques in the coronary artery among athletes 44 or individuals with high exercise volume, 41 which is of relevance given the clear association between cardiorespiratory fitness and exercise. 45 This observation may be of importance since individuals with non-calcified or mixed plaques have been associated with a worse prognosis compared with those with predominantly calcified plaques. 46 47

In our sensitivity analyses, associations between cardiorespiratory fitness and coronary stenosis were attenuated when estimates were not adjusted for BMI at conscription. This is intriguing and could be attributed to the selected cardiorespiratory fitness test (ie, non-weight-bearing). Notably, BMI and performance in cycle-ergometer tests (measured in W) often exhibit a positive correlation, 48 possibly because higher body mass can generate more power. However, BMI is also strongly linked to atherosclerosis risk, which might account for the observed attenuation in our sensitivity analyses. Additional research on this subject is needed.

Although no previous study has explored the associations of cardiorespiratory fitness in adolescence with later coronary atherosclerosis, our findings may be compared with previous studies that have linked cardiorespiratory fitness in adulthood to CAC later in life. The CARDIA study found that high levels of cardiorespiratory fitness in young adults were associated with 41% lower odds of coronary calcification after 15 years of follow-up. 7 However, another study also based on the CARDIA cohort found that although cardiorespiratory fitness was favourably associated with cardiac structure and function, it was not associated with CAC scores approximately 27 years later. 8 Although different levels of covariate adjustment or follow-up could partially explain these differences, the baseline level of cardiorespiratory fitness (and physical activity) could also influence the associations between cardiorespiratory fitness and CAC. In our study, despite a lack of clear association between tertiles of cardiorespiratory fitness and CAC, the pattern observed for CAC in linear and multinomial logistic regression splines was concordant with that observed for coronary stenosis, suggesting that being unfit (cardiorespiratory fitness levels below the first tertile, ≈240 W) is associated with greater risk.

In previous studies, the associations between muscular strength and CVD have generally been weaker compared with those observed for cardiorespiratory fitness. 49 However, in our study, the associations with coronary stenosis for muscular strength were similar or even slightly stronger than those for cardiorespiratory fitness. In fact, knee extension strength (more than handgrip strength or elbow flexion strength) was inversely associated not only with the presence of severe coronary stenosis, but also with a high CAC score, which was not clearly associated with tertiles of cardiorespiratory fitness. These findings are consistent with our results regarding the combined associations of cardiorespiratory fitness and knee extension strength. They indicated that achieving lower odds of coronary atherosclerosis requires the simultaneous presence of acceptable levels of cardiorespiratory fitness and knee extension strength, underscoring the integrated nature of physical fitness.

Our results regarding carotid plaques are intriguing: we did not observe consistent associations between cardiorespiratory fitness and bilateral plaques, but we found an inverted U-shaped association with unilateral plaques. This is in contrast with a previous study also analysing conscripted Swedish men, which found that cardiorespiratory fitness was associated with 19% lower odds of carotid plaques at 60 years of age. 14 However, that study analysed a sample 10 times smaller and considered carotid plaques as a dichotomous variable (no plaque, plaque/s) instead of as continuous and multinomial variables (no plaque, unilateral plaque/s, bilateral plaques) as in our study. In addition, the Cooper Center Longitudinal Study found that midlife cardiorespiratory fitness was inversely associated with carotid artery disease measured almost two decades later. 50 However, that study characterised low cardiorespiratory fitness as the first quintile and used a different definition of carotid artery disease than our study. Further studies are therefore needed to elucidate the associations of cardiorespiratory fitness in adolescence with the development of carotid plaques later in life.

Strengths and limitations

The main strength of this study was the utilisation of CCTA on a population-based scale, enabling the characterisation of calcified and non-calcified coronary plaques within a sizeable sample of the population. Furthermore, the study benefits from a young cohort that was followed up for nearly 40 years, minimising the possibility of reverse causation, as it is highly unlikely that disease in adolescence caused low physical fitness. The study is also informative of the very-long term prognostic value of cardiorespiratory fitness and muscular strength. Additionally, physical fitness was objectively assessed using standardised procedures and not self-reported.

However, some limitations should be acknowledged. First, since conscription was only mandatory for men before 2010, only male participants were included, which unfortunately does not help to reduce the gender gap in the understanding of cardiovascular risk in women. Second, physical fitness exposures and covariates were only measured in adolescence, which impedes evaluating the cumulative effect of these variables during the follow-up. Nevertheless, a meta-analysis showed that cardiorespiratory fitness and muscular strength exhibited moderate tracking from adolescence to adulthood. 51 Third, while our study is longitudinal, its observational nature limits our capacity to draw strong causal conclusions. Furthermore, the absence of certain covariates related to cardiovascular risk during conscription (eg, diet, physical activity or body composition) or measurement error in confounders restricts our ability to fully account for residual confounding in our models, which limits the assessment of the isolated contribution of physical fitness to atherosclerosis. In this sense, well-designed longitudinal studies and randomised controlled trials (despite the acknowledged difficulty of conducting long-term follow-ups) are needed to corroborate or contrast our findings. Fourth, excluded participants presented a slightly different profile compared with included participants, suggesting some selection bias. However, our sensitivity analyses, incorporating inverse probability weighting, did not change the study’s conclusions. Finally, despite the use of CCTA, our study did not enable a comprehensive characterisation of coronary plaques at a segment level. This limitation arose from the absence of information regarding mixed plaques (combining calcified and non-calcified components) within individual coronary segments. Instead, we were only able to assess a mixed composition at an arterial tree level, indicating the presence of both calcified and non-calcified segments in the arterial tree. Similarly, the characterisation of carotid plaques was imperfect, as we were unable to evaluate their phenotype, number and degree of stenosis.

Conclusions and implications

Our findings support that a combination of high cardiorespiratory fitness and high muscular strength in adolescence is associated with less atherosclerosis, particularly lower prevalence of severe coronary stenosis later in life compared with those with lower fitness levels. The effect size observed was modest, yet it is known that small changes at a population level can have important clinical and public health implications. For example, the adjusted prevalence of severe coronary stenosis was 38% lower (6.9% vs 9.5%) for those with high cardiorespiratory fitness and knee extension strength compared with those with low cardiorespiratory fitness and knee extension strength, which may have relevance for future CVD risk stratification at a population level. Furthermore, the decreasing secular trends in cardiorespiratory fitness observed in the last couple of decades in many countries 13 52–54 are a cause for concern since they are expected to increase the absolute risk for atherosclerosis in the future. Indeed, adequate levels of cardiorespiratory fitness, and to a lesser extent, muscular strength, have consistently demonstrated an inverse association with CVD morbidity 49 55 and mortality. 56 57 Our findings, which establish a link between physical fitness in adolescence and atherosclerosis in middle age, contribute to the existing evidence by showing that coronary atherosclerosis can be one of the mechanisms underlying the association between physical fitness and CVD morbidity and mortality. Thus, although further well-designed studies are needed, our findings suggest that adequate physical fitness already in adolescence may reduce coronary atherosclerosis later in life.
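How large a difference between two prevalences appears depends on whether it is expressed in absolute terms or relative to one of the groups. The following minimal Python sketch applies that arithmetic to the two adjusted prevalences quoted above (6.9% and 9.5%); the helper names are illustrative, and the convention behind the reported percentage is not stated in the text, so no convention is assumed here.

```python
def absolute_difference(value: float, reference: float) -> float:
    """Difference in percentage points between two prevalences."""
    return value - reference


def relative_difference(value: float, reference: float) -> float:
    """Difference expressed as a percentage of the reference prevalence."""
    if reference == 0:
        raise ValueError("reference prevalence must be non-zero")
    return (value - reference) / reference * 100.0


high_fitness = 6.9  # adjusted prevalence (%), high fitness and strength
low_fitness = 9.5   # adjusted prevalence (%), low fitness and strength

print(f"absolute: {absolute_difference(high_fitness, low_fitness):.1f} percentage points")
print(f"relative to low group: {relative_difference(high_fitness, low_fitness):.1f}%")
print(f"relative to high group: {relative_difference(low_fitness, high_fitness):.1f}%")
```

With the low-fitness group as the reference, 6.9% is about 27% lower than 9.5%; with the high-fitness group as the reference, 9.5% is about 38% higher than 6.9%. The absolute difference is 2.6 percentage points.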

Ethics statements

Patient consent for publication.

Consent obtained directly from patient(s).

Ethics approval

This study involves human participants and was approved by the Swedish Ethical Review Authority which granted ethical approval for this work (reference numbers: 2021-06408-01 and 2022-04375-02). Participants gave informed consent to participate in the study before taking part.

Acknowledgments

We thank participants and staff of the Swedish Military Conscription Register and SCAPIS project for their valuable contributions.



Contributors AH-A, VHA, SH-F, KR, FBO and PH contributed to the conception and design of the study. CJÖ, KR and PH contributed to data acquisition. AH-A, SH-F, VHA and PH conducted the statistical analysis while KH, EH, MF-S, BD, CL, DB, CJÖ, KR and FBO contributed to data analysis and interpretation. AH-A, FBO and PH drafted the manuscript, which was reviewed and revised by VHA, KH, EH, MF-S, BD, SH-F, CL, DB, CJÖ and KR. All authors approved the final version of the manuscript. PH and AH-A are the guarantors of the manuscript.

Funding The main funding body of The Swedish CArdioPulmonary bioImage Study (SCAPIS) is the Swedish Heart-Lung Foundation. The study is also funded by the Knut and Alice Wallenberg Foundation, the Swedish Research Council and VINNOVA (Sweden’s Innovation Agency), University of Gothenburg and Sahlgrenska University Hospital, Karolinska Institutet and Stockholm County Council, Linköping University and University Hospital, Lund University and Skåne University Hospital, Umeå University and University Hospital, and Uppsala University and University Hospital. In addition, this study is supported by the Joanna Cocozza Foundation for Children’s Medical Research. SH-F is supported by a Margarita Salas grant from the Autonomous University of Madrid. FBO’s research activity on this topic is supported by grants from the Andalusian Government (Junta de Andalucía, Plan Andaluz de Investigación, ref: P20_00124) and the Spanish Ministry of Science and Innovation (ref: PID2020-120249RB-I00).

Competing interests EH reports payments to institution from Pfizer and Amgen, small personal fees from Amgen, NovoNordisk, Bayer and AstraZeneca, small personal fee from Amarin AB for participation on advisory board. He is the co-chair of the Swedish secondary prevention registry and the national coordinator for the trials DalCore DAL301 DalGne, Regeneron R1500-CL-1643 and Aegis II/Perfuse. The remaining authors report no competing interests.

Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.


Overview: Cohort Study Designs

Bernadette Capili

Heilbrun Family Center for Research Nursing, The Rockefeller University, 1230 York Avenue, Hospital, Room 106, New York, NY 10065

Joyce K. Anastasi

New York University Rory Meyers College of Nursing, 380 Second Avenue, Suite 305, New York, NY 10010

This paper continues the series on the observational study designs, focusing on the cohort design. The word ‘cohort’ was adopted from the Roman term for a group of 300 to 600 soldiers who marched together ( Hood, 2009 ; Hulley, 2013 ). The epidemiology community began using ‘cohort’ during the 1930s to mean a “designated group which are followed or traced over a period of time” ( Hood, 2009 , p. E2). The term is currently defined as a group of people with pre-defined common characteristic(s) (e.g., smokers, people exposed to lead in drinking water, ICU nurses) followed longitudinally with periodic measurements to determine the incidence of specific health outcomes or events ( Alexander, 2015 ; Hulley, 2013 ; Song & Chung, 2010 ). Since cohort studies are observational, study participants are monitored, and study interventions are not provided. This paper describes the prospective and retrospective cohort designs, examines their strengths and weaknesses, and discusses methods to report the results.

Cohort Design

The cohort study design is an excellent method to understand an outcome or the natural history of a disease or condition in an identified study population ( Mann, 2012 ; Song & Chung, 2010 ). Since participants do not have the outcome or disease at study entry, the temporal causality between exposure and outcome(s) can be assessed using this design ( Hulley, 2013 ; Song & Chung, 2010 ). A vital feature of a cohort study is selecting the study participants based on mutual characteristics such as geographic location, birth year, or occupation ( Song & Chung, 2010 ). Cohorts are also selected based on exposure and non-exposure status ( Setia, 2016 ). Ideally, both groups are similar except for the exposure status. Additionally, the cohort can be divided based on exposure categories at study entry.

For example, an investigator could recruit people living with HIV (PLWH) who smoke and PLWH who have never smoked from the same community and follow them over five years to determine the relationship between smoking status and the incidence of heart disease and stroke in this population. Alternatively, at study entry, the smokers could be categorized based on their smoking pack-years (less than five pack-years or greater than five pack-years) to determine whether heart disease and stroke are associated with the amount and duration of smoking.
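Pack-years combine the amount and the duration of smoking: the number of packs smoked per day multiplied by the number of years smoked, with one pack conventionally taken as 20 cigarettes. The sketch below, in Python, shows how the two exposure categories in the example could be assigned at study entry. The function names are illustrative, and because the example does not say where exactly five pack-years falls, placing it in the higher category is an assumption.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float,
               cigarettes_per_pack: int = 20) -> float:
    """Packs smoked per day multiplied by years of smoking."""
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return cigarettes_per_day / cigarettes_per_pack * years_smoked


def exposure_category(total_pack_years: float, threshold: float = 5.0) -> str:
    """Assign the exposure group used in the example.

    Assumption: exactly five pack-years is placed in the higher category.
    """
    if total_pack_years < threshold:
        return "less than five pack-years"
    return "five or more pack-years"


# Half a pack a day for four years, then a pack a day for ten years
print(exposure_category(pack_years(10, 4)))   # less than five pack-years
print(exposure_category(pack_years(20, 10)))  # five or more pack-years
```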

Prospective Cohort Design

Prospective cohort studies are also referred to as longitudinal studies. They are used to answer a specific question or questions in a selected area. Investigators recruit a sample of participants and follow them over time, from the present to the future. At pre-determined time-points, characteristics are measured (using interviews, questionnaires, biological assays, physiologic measures) to understand the relationship between the cohort and study outcome. See figure 1 .

Figure 1. Prospective and Retrospective Cohort Designs

During the recruitment phase, the investigator must identify potential participants who plan to move or who may be difficult to reach during the study’s follow-up phase. The eligibility criteria should reflect this consideration. The investigator should collect contact information from the enrolled participants (telephone number, email address, and mailing address), as well as contact details for at least two friends or family members whom the investigator can contact if the participant moves or dies during the follow-up phase ( Hulley, 2013 ). Additionally, the study protocol should schedule periodic contact with the participants, such as telephone calls to provide assessment results, study newsletters, or study incentives (gift cards), to keep the participants engaged.

Continuing with the HIV study example, study participants are recruited from local New York City HIV primary care clinics. The study plans to evaluate participants annually for ten years to determine heart disease and stroke incidence. PLWH are eligible to join if they smoke cigarettes and have well-controlled HIV (undetectable viral load). At study entry, individual smoking exposure (smoking pack-years) is determined, and medical history and cardiovascular health are evaluated. Participants identified at baseline as having heart disease or a history of stroke are excluded from the study. For this study, participants are categorized into two groups based on smoking exposure: less than five pack-years or greater than five pack-years. The independent (predictor) variables (smoking pack-years, blood pressure, weight, waist circumference, lipid levels) and the dependent (outcome) variables (heart disease and stroke) are assessed annually. The longitudinal design allows investigators to compare changes over time (Fitzmaurice, 2008) and determine if the level of exposure (smoking pack-years) and other variables are associated with the outcome (incidence of heart disease and stroke).

Prospective Cohort Design: Strengths and Weaknesses

A primary strength of the prospective cohort design is that it allows investigators to determine the number of new cases (incidence) occurring over time. In our example, this is the incidence of new-onset heart disease and stroke among the study participants. Additionally, measuring the predictor variables before the onset of the outcome (heart disease and stroke) strengthens the ability to assess the sequence of events and infer the causal basis of an association between the predictor variables and the outcome ( Hulley, 2013 ).
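Incidence is commonly summarised in two ways: cumulative incidence (new cases divided by the number of people at risk at baseline) and incidence rate (new cases divided by the person-time at risk). The short Python sketch below illustrates both; the counts and person-years are hypothetical numbers chosen for illustration and do not come from the example study.

```python
def cumulative_incidence(new_cases: int, at_risk_at_baseline: int) -> float:
    """Proportion of the at-risk cohort that develops the outcome."""
    if at_risk_at_baseline <= 0:
        raise ValueError("the cohort must include at least one person at risk")
    return new_cases / at_risk_at_baseline


def incidence_rate(new_cases: int, person_years: float) -> float:
    """New cases per person-year of follow-up."""
    if person_years <= 0:
        raise ValueError("person-years must be positive")
    return new_cases / person_years


# Hypothetical cohort: 500 participants free of the outcome at baseline,
# 25 new cases during follow-up, 4500 person-years contributed in total
print(cumulative_incidence(25, 500))                 # 0.05
print(round(incidence_rate(25, 4500) * 1000, 1))     # 5.6 per 1000 person-years
```

The incidence rate accounts for participants who withdraw or are lost, because each person contributes only the time during which they were actually followed.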

A limitation of this design is that it requires a large sample size; Alexander and colleagues (2015) recommend at least 100 participants. Additionally, the study may be costly in terms of participant recruitment, the number of staff needed to conduct the research, and the collection, storage, and analysis of the outcome measurements. Moreover, some conditions (e.g., breast cancer, chronic obstructive disease), despite being relatively common, could occur at rates too low in any given evaluation period to provide meaningful results. Participants therefore need to be followed for a longer duration, which increases the cost and the possibility that participants withdraw from the study or are lost to follow-up ( Hulley, 2013 ).

Retrospective Cohort Design

Retrospective cohort studies are also called historical cohort studies. The term historical is fitting since data analysis occurs in the present time, but the participants’ baseline measurements and follow-ups happened in the past ( Hulley, 2013 ). This type of study is feasible if an investigator has access to a dataset that fits the research question. The dataset must also have adequate measurements about the predictor variables. See figure 1 .

Generally, the data for a retrospective cohort design were generated for other purposes, such as electronic medical records or an administrative database like Medicare ( Hulley, 2013 ). This design’s primary goal is to review past data (predictor variables) to examine events or outcomes. Institutional review board approval is required for this design even though actual patient interactions do not occur. For example, to ascertain the incidence of heart disease and stroke among PLWH who smoke, electronic medical records of 500 HIV patients from a local HIV primary clinic are examined over ten years, 2010–2020. For this illustration, HIV patients are categorized by their smoking exposure status: less than five pack-years or greater than five pack-years. The outcome of interest is the incidence of heart disease and stroke.

Retrospective Cohort Design: Strengths and Weaknesses

A strength of the retrospective cohort design is the immediate ability to analyze the outcome, since the dataset is already assembled with the collected measurements and the participants’ follow-ups. This type of design is also inexpensive to conduct. A primary limitation of this design is that the available dataset may be incomplete or inaccurate, or its measurements may not match the research question ( Hulley, 2013 ). In other words, the investigator(s) do not have control over the data collection methods and procedures.

Method to Report Results

During the scheduled evaluation periods, investigators count the incidence or the number of participants who develop the outcome of interest (i.e., heart disease and stroke). The methods to measure incidence are risks and rates ( Alexander, 2015 ). Both terms can provide additional information about the exposure of interest (smoking, nonsmoking) by calculating the risk ratio and rate ratio ( Alexander, 2015 ).

Risk and Risk Ratio

The term risk is also known as cumulative incidence . It is defined as the number of participants who develop the outcome of interest divided by the total population (participants from the cohort) at risk ( Alexander, 2015 ). For instance, investigators conduct a study to evaluate the association between smoking and heart disease and stroke among PLWH who attend an HIV primary clinic in lower Manhattan. The investigators follow a total of 1000 PLWH for ten years. Among the 1000 PLWH, 500 were smokers, and 500 were nonsmokers. Participants were evaluated annually. A total of 125 cases of heart disease and stroke were diagnosed in the smoking group, while 25 cases were diagnosed in the nonsmoking group. All the cases of heart disease and stroke were diagnosed at the fifth-year follow-up (see Table 1 for calculations).

Calculation Example

  • a = exposed participants who acquire the outcome of interest
  • b = exposed participants who do not acquire the outcome of interest
  • c = unexposed participants who acquire the outcome of interest
  • d = unexposed participants who do not acquire the outcome of interest
  • Risk (Cumulative Incidence) of PLWH diagnosed with heart disease/stroke: (a+c)/(a+b+c+d) = 150/1000 = .15 × 100 = 15%
  • Risk Ratio among PLWH who smoke for heart disease and stroke: [a/(a+b)] / [c/(c+d)] = (125/500)/(25/500) = .25/.05 = 5

Interpretation Risk Ratio or Rate Ratio

  • Risk Ratio or Rate Ratio = 1 Exposure is not preventive or harmful
  • Risk Ratio or Rate Ratio > 1 Exposure is harmful
  • Risk Ratio or Rate Ratio < 1 Exposure is protective

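Rate (Incidence Rate) of heart disease/stroke among PLWH over a ten-year period: (a + c) / {[(a × 5) + (b × 10)] + [(c × 5) + (d × 10)]} = 150/9250 = 0.016 cases/person-year, where each case contributes 5 person-years (all cases were diagnosed at the fifth-year follow-up) and each non-case contributes 10 person-years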

Rate Ratio (Incidence Rate Ratio (IRR)): {a / [(a × 5) + (b × 10)]} / {c / [(c × 5) + (d × 10)]} = (125/4375) / (25/4875) = 0.02857/0.00513 ≈ 5.57
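
The four measures can be recomputed directly from the counts in the worked example. The following is a minimal Python sketch, not part of the original column; the variable and function names are illustrative. It assumes, as the example states, that every case was diagnosed at the fifth-year follow-up (5 person-years each) and that every non-case was followed for the full ten years.

```python
# Counts from the worked example: 1000 PLWH followed for ten years.
a, b = 125, 375   # smokers: with outcome, without outcome
c, d = 25, 475    # nonsmokers: with outcome, without outcome

YEARS_CASE = 5      # assumption from the text: all cases diagnosed at year 5
YEARS_NONCASE = 10  # non-cases contribute the full ten years


def person_years(cases, noncases):
    """Total time at risk contributed by one exposure group."""
    return cases * YEARS_CASE + noncases * YEARS_NONCASE


# Risk (cumulative incidence) and risk ratio
risk = (a + c) / (a + b + c + d)
risk_ratio = (a / (a + b)) / (c / (c + d))

# Rate (incidence rate) and rate ratio, each group using its own person-time
py_exposed = person_years(a, b)      # 4375 person-years
py_unexposed = person_years(c, d)    # 4875 person-years
rate = (a + c) / (py_exposed + py_unexposed)
rate_ratio = (a / py_exposed) / (c / py_unexposed)

print(f"risk = {risk:.2f}, risk ratio = {risk_ratio:.1f}")
print(f"rate = {rate:.3f} cases/person-year, rate ratio = {rate_ratio:.2f}")
```

With these counts the sketch gives a risk of 0.15, a risk ratio of 5, a rate of about 0.016 cases per person-year, and a rate ratio of about 5.57 when each group's own person-time is used as its denominator.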

From the above example, 150 cases of heart disease and stroke were identified from the cohort sample size of 1000. Based on the calculations, the risk for developing heart disease and stroke was 15% among the study participants. Additional analyses using the risk ratio compared the risk between participants exposed (smoker) and unexposed (nonsmoker) to provide further information about the data. The risk ratio illustrates the relative increase or decrease in the incidence between the exposed and unexposed groups ( Alexander, 2015 ). (See Table 1 for calculations).

Using the formula from table 1 , the risk ratio was 5. The results demonstrate that PLWH who smoke (exposed) were five times more likely to be diagnosed with heart disease and stroke than PLWH who were nonsmokers. To further understand the meaning of the risk ratio, if the result was equal to 1, then the exposure (smoking) did not affect the outcome. In other words, the risk was the same for the exposed and unexposed groups. Similarly, a risk ratio of less than 1 indicates that the exposure (smoking) was protective for heart disease and stroke. The further the result is from 1, the stronger the association between the exposure and the outcome (see figure 2 ).

Figure 2. Risk Ratio or Rate Ratio Interpretation

Rate and Rate Ratio

The term rate is also known as an incidence rate (IR). It is defined as the number of participants who develop the outcome of interest (heart disease and stroke) divided by the person-time (days, months, years) at risk during follow-up ( Alexander, 2015 ). Person-time is the sum of each participant’s total time free (no heart disease and no stroke) from the outcome of interest. This measure captures both the accumulated events (cases of heart disease and stroke) and the speed at which new health outcomes occur in a study cohort. The rate ratio compares how quickly a health outcome occurs in the exposed group relative to the unexposed group.

Continuing with the example from above, the calculated rate was 0.016 (see Table 1 ). The result indicates that 0.016 cases of heart disease and stroke per person-year occurred in the sample. Dividing the rate in the exposed group (125/4375 = 0.0286 cases per person-year) by the rate in the unexposed group (25/4875 = 0.0051 cases per person-year) gives a rate ratio of approximately 5.6. This result indicates that heart disease and stroke rates were about 5.6 times greater in the exposed group than in the unexposed group. Similar to the risk ratio , if the result was equal to 1, then the smoking exposure did not affect the outcome. If the rate ratio was less than 1, smoking exposure was protective for heart disease and stroke. The further the rate ratio is from 1 (the null association, where the exposure is neither preventive nor harmful), the greater the impact of the exposure on the study cohort (see figure 2 ).

Reporting Recommendations

Continuing the Step by Step Research column's coverage of observational studies, the cohort design also has a reporting guideline to explain how a study was conducted and how the results were obtained. Like the cross-sectional study, the cohort study uses the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline ( von Elm et al., 2014 ). The guideline provides specific recommendations for cohort studies in a 22-item checklist that guides investigators in what to include in their manuscript. For consumers of the research, the checklist helps the reader better understand the paper's study planning, conduct, findings, and conclusions ( von Elm et al., 2014 ). Additionally, following the checklist ensures that a report contains enough information for the study to be replicated, to inform clinical decisions, and to be included in a systematic review ( https://www.equator-network.org/reporting-guidelines/strobe/ ).

The cohort design is an appropriate method to determine the incidence of a health outcome or an event. This design is especially helpful in understanding the natural history of disease and conditions in an identified study population. Additionally, this design allows an investigator to examine the timing between an exposure and outcome(s).

Acknowledgments

This manuscript is supported in part by grant # UL1TR001866 from the National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) program, and by the National Institutes of Health/National Institute of Nursing Research #R01NR017917.

Contributor Information

Bernadette Capili, Heilbrun Family Center for Research Nursing, The Rockefeller University, 1230 York Avenue, Hospital, Room 106, New York, NY 10065.

Joyce K. Anastasi, New York University Rory Meyers College of Nursing, 380 Second Avenue, Suite 305, New York, NY 10010.

  • Alexander L, Lopes B, Richetti-Masterson K, Yeatts KR (2015). Risk and Rate Measures in Cohort Studies. In: ERIC Notebook (2nd ed.). Durham, NC: Department of Epidemiology at the UNC Gillings School of Global Public Health.
  • Hood MN (2009). A review of cohort study design for cardiovascular nursing research. J Cardiovasc Nurs, 24(6), E1–9. doi: 10.1097/JCN.0b013e3181ada743
  • Hulley S, Cummings SR, Browner WS, Grady DG, Newman TB (Eds.) (2013). Designing Clinical Research (4th ed.). Philadelphia, PA: Wolters Kluwer/Lippincott Williams & Wilkins.
  • Mann CJ (2012). Observational research methods—Cohort studies, cross sectional studies, and case–control studies. African Journal of Emergency Medicine, 2(1), 38–46. doi: 10.1016/j.afjem.2011.12.004
  • Setia MS (2016). Methodology Series Module 1: Cohort Studies. Indian J Dermatol, 61(1), 21–25. doi: 10.4103/0019-5154.174011
  • Song JW, & Chung KC (2010). Observational studies: cohort and case-control studies. Plastic and Reconstructive Surgery, 126(6), 2234–2242. doi: 10.1097/PRS.0b013e3181f44abc
  • von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, & Vandenbroucke JP (2014). The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: guidelines for reporting observational studies. Int J Surg, 12(12), 1495–1499. doi: 10.1016/j.ijsu.2014.07.013
  • Open access
  • Published: 30 March 2024

Longitudinal BMI change and outcomes in Chronic Obstructive Pulmonary Disease: a nationwide population-based cohort study

  • Taeyun Kim (ORCID: orcid.org/0000-0001-7786-5051) 1,
  • Sun Hye Shin (ORCID: orcid.org/0000-0003-3164-889X) 2,
  • Hyunsoo Kim (ORCID: orcid.org/0000-0001-5241-289X) 3,
  • Yunjoo Im 2,
  • Juhee Cho (ORCID: orcid.org/0000-0001-9081-0266) 3, 4,
  • Danbee Kang (ORCID: orcid.org/0000-0003-0244-7714) 3, 4 &
  • Hye Yun Park (ORCID: orcid.org/0000-0002-5937-9671) 2

Respiratory Research volume 25, Article number: 150 (2024)


The association between longitudinal body mass index (BMI) change and clinical outcomes in patients with chronic obstructive pulmonary disease (COPD) has not been fully investigated.

This retrospective cohort study included 116,463 COPD patients aged ≥ 40 years with at least two health examinations, one within 2 years before and another within 3 years after COPD diagnosis (January 1, 2014, to December 31, 2019). Associations of BMI percentage change with all-cause mortality (the primary endpoint) and initial severe exacerbation were assessed.

BMI decreased by > 5% in 14,728 patients (12.6%), was maintained in 80,689 (69.2%), and increased by > 5% in 21,046 (18.1%) after COPD diagnosis. Compared with the maintenance group, the adjusted hazard ratio (aHR) for all-cause mortality was 1.70 in the BMI decrease group (95% CI: 1.61, 1.79) and 1.13 in the BMI increase group (95% CI: 1.07, 1.20). In subgroup analysis, a decrease in BMI showed a stronger association with mortality the lower the baseline BMI, while an increase in BMI was associated with increased mortality only in obese COPD patients, with an aHR of 1.18 (95% CI: 1.03, 1.36). The aHRs for severe exacerbation (BMI decrease group and increase group vs. maintenance group) were 1.30 (95% CI: 1.24, 1.35) and 1.12 (95% CI: 1.07, 1.16), respectively.

Conclusions

A decrease in BMI was associated with an increased risk of all-cause mortality in a dose-dependent manner in patients with COPD. This was most significant in underweight patients. Regular monitoring for weight loss might be an important component for COPD management.

Chronic obstructive pulmonary disease (COPD) is characterized by persistent respiratory symptoms and airflow limitation [ 1 ]. However, it is a complex and heterogeneous disease with symptoms and pathophysiological features that vary among individuals despite a similar degree of airflow obstruction [ 1 ]. Physical features, especially body mass index (BMI), vary widely in patients with COPD, ranging from underweight to morbidly obese. Low BMI in patients with COPD is generally associated with poor outcomes, including increased mortality, exacerbation, and lung function decline [ 2 , 3 , 4 ]. However, data regarding health and obesity in patients with COPD are conflicting. Several studies have suggested that being overweight or obese protects against exacerbation and mortality in COPD [ 2 , 3 , 4 ], which is more apparent in patients with severe disease [ 5 ]. In contrast, other studies have shown an increase in mortality in obese COPD patients [ 6 , 7 ]. However, these previous studies have evaluated the relationship between BMI and outcomes in patients with COPD based on BMI measured at one point, rather than considering changes in BMI [ 2 , 3 , 4 , 6 , 7 ].

The effects of weight change on COPD-related outcomes have also been reported. A study conducted as part of the Copenhagen City Heart Study revealed a significant dose-dependent association between weight loss and all-cause mortality [ 8 ]. However, this study recruited participants several decades ago, when recent treatment strategies for COPD were not yet being applied. Because comorbidities and BMI differ among race and ethnicity groups [ 9 ], studies in Asian populations are also relevant: studies in large Japanese and Korean cohorts found that weight or BMI reduction was associated with higher exacerbation and overall mortality [ 10 , 11 ]. However, these studies included few patients whose BMI changed, and weight change was assessed by questionnaire.

In this regard, this study aimed to evaluate the relationship between BMI changes and clinical outcomes in patients with COPD using a large nationally representative cohort from Korea by specifically investigating the association across different BMI groups based on the classification for Asians. This would enable clinicians to guide patients to make appropriate lifestyle modifications.

Materials and methods

Data source

We conducted a retrospective cohort study using data from the Korean National Health Insurance System (K-NHIS) database, which covers the entire South Korean population. This comprehensive database contains extensive information on demographics, medical treatments, procedures, prescription drugs, diagnostic codes, and hospital utilization. Diagnoses in the K-NHIS database were classified according to the International Classification of Diseases, 10th revision (ICD-10). Regular audits of the ICD-10 codes, procedure records, and prescription records are conducted by the K-NHIS to ensure accuracy and prevent unnecessary medical expenses. Moreover, the K-NHIS claims database incorporates data from the national health screening examination, a standardized health screening program provided to all insured individuals every two years [ 12 ]. Approximately 76% of the target population participated in the health screening examination [ 12 ]. The data collected during the health screening examination included a self-administered questionnaire on medical history, lifestyle habits, anthropometric measurements, and laboratory tests [ 12 ]. Health examination facilities are designated and regulated by the relevant national laws to ensure quality control. For more detailed information on the NHIS database and health examinations, please refer to previous publications [ 12 , 13 ].

Study population

Our database included all COPD patients aged ≥ 40 years between January 1, 2014, and December 31, 2019. COPD was defined as the presence of code J43-J44 (except J43.0) (ICD-10) and the prescription of COPD medication at least twice within a year. Medications for COPD include long-acting muscarinic antagonists (LAMAs), long-acting beta-2 agonists (LABAs), inhaled corticosteroids (ICS) plus LABAs, short-acting muscarinic antagonists, short-acting beta-2 agonists, methylxanthines, systemic beta-agonists, and phosphodiesterase-4 inhibitors [ 14 , 15 , 16 ].

As the purpose of this study was to evaluate BMI changes after COPD diagnosis in terms of mortality and severe exacerbation, we included patients who had health examination data within 2 years before (Exam 1) and within 3 years after (baseline, Exam 2) the date of COPD diagnosis. A period of 3 years was chosen a priori based on previous literature, as well as the anticipated sample size and follow-up duration [ 17 , 18 ]. After excluding 10,126 participants who had cancer before Exam 2, 118,849 participants remained. Furthermore, to minimize potential reverse causality, we excluded 2,386 participants who developed any cancer or died within the first 6 months of follow-up from Exam 2 (index date). The final sample size was 116,463, and the median duration of follow-up from the index date was 3.9 years (interquartile range [IQR]: 2.5–5.1; range: 0.5–7).

The Institutional Review Board of the Samsung Medical Center approved the study (approval no:2022-09-022) and waived the requirement for informed consent because the K-NHIS data were deidentified. The study was conducted in accordance with the principles of the Declaration of Helsinki.

Measurement

During each health examination, weight and height were measured by trained nurses. BMI was calculated as weight in kilograms divided by height in meters squared and was classified according to Asian-specific criteria (underweight, BMI < 18.5 kg/m²; normal weight, BMI 18.5 to 22.9 kg/m²; overweight, BMI 23 to 24.9 kg/m²; and obese, BMI ≥ 25 kg/m²) [ 19 , 20 ]. BMI change (%) was calculated as the difference in BMI from the last examination within 2 years before COPD diagnosis (Exam 1) to the last examination within 3 years after COPD diagnosis (baseline, Exam 2), divided by the BMI at Exam 1 and multiplied by 100. Participants were classified into three categories: decrease in BMI > 5%, increase in BMI > 5%, and no change (not more than 5%) [ 21 , 22 ].
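
The percentage change and the three-group classification described above can be illustrated with a short Python sketch. This is not the authors' code (the analysis used SAS and R); the function names are hypothetical and the example patient is invented for illustration.

```python
def bmi(weight_kg, height_m):
    """BMI = weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def bmi_change_percent(bmi_exam1, bmi_exam2):
    """Change from Exam 1 to Exam 2, relative to the Exam 1 value, in percent."""
    return (bmi_exam2 - bmi_exam1) / bmi_exam1 * 100


def classify_bmi_change(pct, threshold=5.0):
    """Three groups from the text: > 5% decrease, > 5% increase, otherwise maintained."""
    if pct < -threshold:
        return "decrease"
    if pct > threshold:
        return "increase"
    return "maintained"


# Hypothetical patient: 1.65 m tall, 60 kg at Exam 1 and 56 kg at Exam 2.
exam1 = bmi(60, 1.65)
exam2 = bmi(56, 1.65)
pct = bmi_change_percent(exam1, exam2)
print(f"{exam1:.1f} -> {exam2:.1f} kg/m^2, change {pct:.1f}%: {classify_bmi_change(pct)}")
```

A change of exactly 5% in either direction falls in the maintained group here, following the "not more than 5%" wording of the text.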

The primary endpoint was all-cause mortality rate. Any death events were recorded after Exam 2. The vital status and cause of death were obtained from death certifications collected by Statistics Korea from the Ministry of Strategy and Finance of South Korea [ 23 ].

The secondary endpoint was initial severe exacerbation after Exam 2. Severe exacerbation of COPD was defined as a hospitalization or emergency room visit with one of the following ICD-10 codes as the principal or secondary diagnosis: COPD (J43.X [except J43.0] or J44.X) or COPD-related disease (pneumonia [J12.X–J17.X], pulmonary thromboembolism [I26, I26.0, or I26.9], dyspnea [R06.0], or acute respiratory distress syndrome [J80]), and a prescription for systemic steroids or antibiotics at the same visit [ 24 ]. To minimize reverse causality (i.e., a previous severe exacerbation could affect both low BMI and subsequent exacerbation), only patients without a history of severe exacerbation between COPD diagnosis and Exam 2 were included in the analysis of severe exacerbation ( N  = 108,067).

Data on covariates were collected during Exam 2. Study participants completed a self-administered questionnaire with questions on medical history and lifestyle habits, including smoking and alcohol use in Exam 2 and medication use (LABA, LAMA, or ICS) within 1 year before Exam 2.

Residential areas and income levels were obtained from insurance eligibility. The residential areas were categorized as metropolitan cities (Seoul, Busan, Daegu, Daejeon, Gwangju, Incheon, and Ulsan). Income levels were categorized as Medical Aid, ≤ 30th, 30–70th, or > 70th percentile.

Comorbidities during the year before Exam 2 were obtained from claims data defined using ICD-10 codes and summarized using the Charlson Comorbidity Index (CCI) [ 25 ]. In addition to CCI, pulmonary tuberculosis (ICD-10: A15, A16, B90.9), interstitial lung disease (ICD-10: J84), bronchiectasis (ICD-10: J47), and pneumonia (ICD-10: J11–J18, J69) were determined using insurance claims data during a 1-year look-back period from Exam 2.

Statistical analysis

The incidence rates were calculated as the number of events per 100 person-years of follow-up. The cumulative incidence of each outcome was estimated using the Kaplan–Meier method, and log-rank tests were used to evaluate the differences between groups. We calculated the hazard ratio (HR) with a 95% confidence interval (CI) for all-cause mortality and severe exacerbations comparing participants with > 5% increase and > 5% decrease in BMI versus those who had maintained BMI during follow-up. The proportionality of hazards was confirmed by visual inspection of log-minus-log plots and Schoenfeld residuals. The models were adjusted for age, sex, smoking status, drinking status, residential area, income, CCI, regular moderate-to-vigorous physical activity (MVPA), previous severe exacerbation within a year before Exam 2 (baseline), medication use (ICS, LABA, or LAMA) within the year before Exam 2 (baseline), pulmonary tuberculosis, bronchiectasis, and pneumonia. The covariables were selected a priori based on their possible associations with BMI changes and outcomes.
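
To make the two descriptive estimators in this paragraph concrete, the sketch below computes an incidence rate per 100 person-years and a Kaplan–Meier survival curve in plain Python. It is an illustration on invented toy data, not the authors' SAS/R code, and the function names are hypothetical. Cumulative incidence at each time point is one minus the Kaplan–Meier survival estimate.

```python
from collections import Counter


def incidence_rate_per_100py(n_events, person_years):
    """Number of events per 100 person-years of follow-up."""
    return n_events / person_years * 100


def kaplan_meier(times, events):
    """Kaplan-Meier estimate; returns [(time, survival)] at each event time.

    times  : follow-up time for each participant
    events : 1 if the outcome occurred at that time, 0 if censored
    """
    events_at = Counter(t for t, e in zip(times, events) if e)
    leaving_at = Counter(times)  # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving_at):
        if events_at[t]:
            # Participants censored at t still count as at risk at t.
            survival *= 1 - events_at[t] / at_risk
            curve.append((t, survival))
        at_risk -= leaving_at[t]
    return curve


# Toy data (hypothetical): six participants, follow-up in years.
times = [1, 2, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0, 0]

rate = incidence_rate_per_100py(sum(events), sum(times))
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"year {t}: survival {s:.3f}, cumulative incidence {1 - s:.3f}")
```

On the toy data, survival falls to 5/6, then 2/3, then 4/9 at years 1, 2, and 3, and the incidence rate is 3 events over 17 person-years.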

In addition, we modeled BMI change as a continuous variable using restricted cubic splines with knots at the 5th, 35th, 65th, and 95th percentiles of the sample distribution to provide a flexible estimate of the dose-response relationship between BMI change and mortality incidence.
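
As an illustration of how a restricted cubic spline with percentile-based knots can be constructed, the Python sketch below places knots at the 5th, 35th, 65th, and 95th percentiles and builds the spline terms for one value of BMI change. It assumes the truncated-power parameterization commonly attributed to Harrell, scaled by the squared knot range; the paper does not state which parameterization its software used, so this is an assumption, and the function names are hypothetical.

```python
import statistics


def percentile_knots(values, percents=(5, 35, 65, 95)):
    """Knot locations at the given percentiles of the sample distribution."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")  # 99 cut points
    return [cuts[p - 1] for p in percents]


def rcs_basis(x, knots):
    """Restricted cubic spline terms for a single value x.

    Returns [x, s_1, ..., s_(k-2)] for k knots. The nonlinear terms are zero
    below the first knot and linear in x above the last knot.
    """
    def cube_plus(u):
        return max(u, 0.0) ** 3

    t_prev, t_last = knots[-2], knots[-1]
    scale = (knots[-1] - knots[0]) ** 2
    terms = [x]
    for t_j in knots[:-2]:
        s = (cube_plus(x - t_j)
             - cube_plus(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
             + cube_plus(x - t_last) * (t_prev - t_j) / (t_last - t_prev))
        terms.append(s / scale)
    return terms


# Hypothetical sample of BMI changes (%), evenly spread from -10 to +10.
sample = [i / 10 for i in range(-100, 101)]
knots = percentile_knots(sample)
print("knots:", knots)
print("basis at +3% change:", rcs_basis(3.0, knots))
```

With four knots the basis has three columns (one linear and two nonlinear terms), which would then enter a regression model such as the Cox model in place of the categorical BMI change variable.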

In subgroup analysis, we examined the association between percentage BMI change and mortality by BMI categories before COPD diagnosis (underweight, normal, overweight, or obese). A sensitivity analysis was additionally performed according to the BMI category based on the World Health Organization (WHO): underweight (< 18.5 kg/m²), normal (18.5–24.9 kg/m²), overweight (25–29.9 kg/m²), and obese (≥ 30 kg/m²).
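
The difference between the Asian-specific categories used in the main analysis and the WHO categories used in the sensitivity analysis can be shown with a small lookup. This Python sketch is illustrative only; the names are hypothetical, and it assumes that BMI values falling between the printed ranges (for example 22.95 kg/m²) belong to the lower category.

```python
# Upper bounds (exclusive) for each category; anything above the last bound is obese.
ASIAN_CUTOFFS = [(18.5, "underweight"), (23.0, "normal"), (25.0, "overweight")]
WHO_CUTOFFS = [(18.5, "underweight"), (25.0, "normal"), (30.0, "overweight")]


def bmi_category(bmi, cutoffs):
    """Return the BMI category under the given set of cutoffs."""
    for upper, label in cutoffs:
        if bmi < upper:
            return label
    return "obese"


for value in (17.9, 24.0, 27.0, 31.0):
    print(value, bmi_category(value, ASIAN_CUTOFFS), bmi_category(value, WHO_CUTOFFS))
```

A BMI of 24.0 kg/m² is overweight under the Asian-specific criteria but normal under the WHO criteria, and 27.0 kg/m² is obese under the former but overweight under the latter, which is why the subgroup composition differs between the two analyses.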

All statistical analyses were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC, USA) and R version 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria).

Of the 116,463 patients with COPD (median age, 67 years; male, 66%), 14,728 (12.6%) experienced a > 5% decrease in BMI, 80,689 (69.2%) maintained their BMI, and 21,046 (18.1%) experienced a > 5% increase in BMI after COPD diagnosis (Table  1 ). With respect to baseline BMI, 31.5% of underweight patients had a > 5% increase in BMI and 14.1% of obese patients had a > 5% decrease in BMI (Supplementary Table 1 ). Compared to those with maintained BMI, individuals with decreased and increased BMI were more likely to have experienced severe exacerbations in the previous year (6.6% for maintained BMI vs. 9.8% for decreased BMI vs. 7.8% for increased BMI, p  < 0.001). Co-existing pulmonary diseases (history of pulmonary tuberculosis, interstitial lung disease, and pneumonia) and comorbidities were more prevalent in the BMI decrease group than in the BMI maintenance group ( p  < 0.001, Table  1 ).

During a median follow-up of 3.9 years (IQR: 2.5–5.1) from the index date, 8,412 participants died. The mortality rate per 100 person-years was 1.6 in the maintenance group, 3.5 in the decrease group, and 1.9 in the increase group (log-rank test p-values < 0.01, Fig.  1 ). The adjusted HR for all-cause mortality was 1.70 in the BMI decrease group (95% CI:1.61, 1.79) and 1.13 in the BMI increase group (95% CI:1.07, 1.20), compared to the maintenance group (Table  2 ).

Fig. 1. Kaplan–Meier curves of ( A ) all-cause mortality and ( B ) severe exacerbation

In the spline regression models, the association between percentage BMI change and mortality was nonlinear, indicating that both a decrease and an increase in BMI were associated with an increase in mortality (Fig.  2 ).

Fig. 2. Multivariable-adjusted HRs (95% CI) for all-cause mortality by percentage BMI change. The curves represent multivariable-adjusted HRs (solid line) and 95% CIs (dashed lines) for mortality based on restricted cubic splines for percentage BMI change with knots at the 5th, 35th, 65th, and 95th percentiles of the sample distribution. The reference value (diamond dot) was set as no change in BMI. The model was adjusted for age, sex, smoking status, drinking status, residential area, income, CCI, regular MVPA, previous severe exacerbation within a year before Exam 2 (baseline), medication use (ICS, LABA, or LAMA) within 1 year before Exam 2 (baseline), pulmonary TB, bronchiectasis, and pneumonia. HR, hazard ratio; CI, confidence interval; BMI, body mass index; CCI, Charlson Comorbidity Index; MVPA, moderate-to-vigorous physical activity; ICS, inhaled corticosteroids; LABA, long-acting beta-2 agonist; LAMA, long-acting muscarinic antagonist; TB, tuberculosis

In the subgroup analysis according to BMI categories before COPD diagnosis, among patients who maintained their BMI after COPD diagnosis, the mortality rates per 100 person-years were 4.2, 2.0, 1.4, and 1.1 for underweight, normal weight, overweight, and obese individuals, respectively. When the BMI decreased, the mortality increased regardless of the BMI before COPD diagnosis, with rates per 100 person-years of 8.1, 4.6, 3.2, and 2.2 for underweight, normal weight, overweight, and obese individuals before COPD diagnosis, respectively. When the BMI increased, the mortality rates per 100 person-years were 3.7, 2.1, 1.3, and 1.3 for underweight, normal weight, overweight, and obese individuals before COPD diagnosis, respectively. Only patients who were obese before COPD diagnosis exhibited a significant association between BMI increase and elevated mortality (Fig.  3 ; Table  3 ). This observed relationship was similar in a sensitivity analysis based on the WHO classification of BMI (Supplementary Table 2 ).

Fig. 3. All-cause mortality rate by percentage BMI change according to the BMI before COPD diagnosis. The model was adjusted for age, sex, smoking status, drinking status, residential area, income, CCI, regular MVPA, previous severe exacerbation within a year before Exam 2 (baseline), medication use (ICS, LABA, or LAMA) within 1 year before Exam 2 (baseline), pulmonary TB, bronchiectasis, and pneumonia. BMI, body mass index; COPD, chronic obstructive pulmonary disease; CCI, Charlson Comorbidity Index; CI, confidence interval; HR, hazard ratio; MVPA, moderate-to-vigorous physical activity; ICS, inhaled corticosteroids; LABA, long-acting beta-2 agonist; LAMA, long-acting muscarinic antagonist; TB, tuberculosis

Among severe exacerbation-naïve patients ( N  = 108,067), 16,565 experienced severe exacerbations. The incidence of severe exacerbation per 100 person-years was 4.1 in the maintenance group, 6.0 in the decrease group, and 4.8 in the increase group (log-rank test, p  < 0.01; Fig.  1 ). The fully adjusted HRs for the risk of severe exacerbation (over 5% BMI decrease vs. maintenance, and over 5% BMI increase vs. maintenance) were 1.30 (95% CI: 1.24, 1.35) and 1.12 (95% CI: 1.07, 1.16), respectively (Table  2 ). This result was similar in a sensitivity analysis of all patients ( N  = 116,463) that adjusted for variables including previous severe exacerbation (Supplementary Table 3 ).

In this large national cohort study from Korea, a decrease in BMI was associated with an increased risk of severe exacerbation and all-cause mortality in COPD patients. In particular, there was a dose-dependent relationship between a decrease in BMI and all-cause mortality, which was prominent in underweight patients with COPD. In addition, an increase in BMI correlated with an increased risk of death only among obese patients with COPD. Our results highlight that monitoring BMI is important for the non-pharmacological management of COPD and the prediction of outcomes, especially in COPD patients with a low BMI.

Our study extends previous data on the U-shaped association between baseline BMI and clinical outcomes by employing longitudinal changes in BMI in patients with COPD. In particular, the impact on mortality and severe exacerbation was greater when patients with COPD experienced a decrease in BMI than when they experienced an increase in BMI (reverse J-shaped curve), and the negative impact of a decrease in BMI on all-cause mortality was more pronounced in underweight patients with COPD. The observed linear association between a decrease in BMI and increased mortality in our study is consistent with the findings of previous studies [ 8 , 10 , 11 , 26 ]. In a similar context, a history of previous severe exacerbation, co-existing pulmonary diseases, and comorbidities were more prevalent in the BMI decrease group than in the maintenance and increase groups in our study. Severe exacerbations lead to increased inflammation, metabolic stress, and accelerated muscle wasting [ 27 ], and co-existing pulmonary diseases can further exacerbate respiratory symptoms and contribute to a decrease in BMI. Nevertheless, because a decrease in BMI was independently associated with increased mortality even after adjustment for these covariates, any decrease in BMI in patients with COPD should be closely monitored.

Notably, we showed that a decrease in BMI was associated with a higher risk of all-cause mortality, even in overweight and obese patients with COPD. Epidemiological evidence in patients with cancer has shown that pre-obesity or early obesity status is associated with better outcomes, typically mortality [ 28 ]. This phenomenon has been consistently observed in patients with COPD. A meta-regression analysis of five randomized clinical trials revealed that high BMI has a protective effect against lung function decline. The lung function decline was lowest in COPD patients with BMI ≥ 30 kg/m² [ 4 ]. Another meta-analysis of 21,150 COPD patients reported that being overweight (25.0–29.9 kg/m²) and obese (≥ 30 kg/m²) were associated with lower mortality even compared with normal BMI (18.5–24.9 kg/m²) [ 2 ]. In this way, the so-called “obesity paradox” could explain our findings, in which overweight or obese status was related to a lower risk of death.

An increase in BMI was negatively associated with survival only among obese patients with COPD. Consistent with previous reports, our results suggest that worsening obesity can be detrimental. For example, in a large multinational cohort with moderate COPD, all-cause mortality increased progressively as BMI increased from 25–<30 kg/m² to ≥ 40 kg/m² [ 6 ]. Moderate or severe exacerbations were also more frequent in obese patients than in patients with normal BMI [ 6 ]. The negative impact of BMI increase was remarkable in COPD patients with a forced expiratory volume in 1 s < 50% of predicted [ 8 ]. Increased cardiovascular and respiratory mortality in obese patients could contribute to an increased risk of death in obese patients with COPD [ 6 ]. This result is inconsistent with the findings of a cohort study in Korea [ 10 ]. However, in that study, only 16 of 270 patients with COPD experienced an increase in BMI, and baseline BMI was lower in the BMI increase group (22.6 kg/m²) than in the no-change group (23.3 kg/m²). The small sample size and the different baseline characteristics between the groups may limit the strength of that finding.

In addition, we found a 13% reduction in all-cause mortality when underweight COPD patients increased their BMI, although statistical significance was not reached. A few explanations exist. First, the analysis was limited by a relatively small number of underweight patients, which may not have provided sufficient statistical power. Previous studies found that an increase in BMI, body weight, or body composition was not related to improved survival in underweight patients with COPD [ 8 , 10 , 11 , 26 ]. Our results, indicating a trend toward reduced mortality, may offer preliminary evidence that an increase in BMI in underweight COPD patients could have a protective effect on overall survival. Second, a 5% increase in BMI might not be substantial enough to yield a meaningful reduction in mortality. In a large population-based California cohort, weight gain of 5.1–15% showed a relative risk of death of 1.09 (95% CI: 0.95, 1.26), while weight gain exceeding 15% demonstrated a significant risk reduction of 10% (RR 0.90, 95% CI: 0.83, 0.98) in underweight individuals [ 29 ].
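The contrast between the two relative risks quoted from the California cohort can be checked with a short sketch that back-calculates an approximate two-sided p-value from a ratio estimate and its 95% CI. It assumes a Wald-type interval that is symmetric on the log scale, which is an assumption rather than something the source states. The function name `p_value_from_ci` is hypothetical, and because the published figures are rounded, the results are only approximate.

```python
import math


def p_value_from_ci(estimate: float, lower: float, upper: float,
                    z_crit: float = 1.959964) -> tuple[float, float]:
    """Approximate z statistic and two-sided p-value for a ratio measure.

    Assumes the 95% CI was built as exp(log(estimate) +/- z_crit * SE).
    """
    # Standard error of log(estimate), recovered from the CI width.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    # Values quoted in the text for underweight individuals [29].
    for label, rr, lo, hi in [("gain 5.1-15%", 1.09, 0.95, 1.26),
                              ("gain >15%", 0.90, 0.83, 0.98)]:
        z, p = p_value_from_ci(rr, lo, hi)
        print(f"{label}: z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions, the interval that spans 1 gives a p-value above 0.05 and the interval that excludes 1 gives a p-value below 0.05, which matches the way the text describes the two results.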

Although this large, nationally representative cohort study robustly and comprehensively evaluated the impact of BMI changes on mortality and exacerbation in patients with COPD, several limitations exist. First, as lung function data are not available in the K-NHIS database, the definition of COPD was based on claims data and diagnostic codes, which could lead to misclassification bias. However, this definition has been widely used and validated in several studies [ 14 , 15 , 16 ]. Second, there might be potential confounders that were not fully covered in the analysis, including the severity of airflow limitation, which is known to be associated with the risk of mortality in COPD patients [ 30 ]. However, severe exacerbations in the previous year, one of the predictive factors of mortality in COPD patients [ 31 ], were adjusted for in the mortality analysis. In addition, considering that the obesity paradox is more apparent in COPD patients with severe airway obstruction [ 5 , 8 ], the generalizability of our results to all patients with COPD should be explored further. Third, the reasons for changes in BMI are unknown. In particular, because it is not known whether outcomes differ when weight loss in overweight or obese individuals is intentional, further studies are necessary. Finally, the K-NHIS lacks data on body composition. One previous study analyzed changes in body composition, such as fat-free mass or fat mass [ 26 ]; however, in real clinical practice, assessing changes in BMI would be more realistic and feasible.

Using a large national cohort of COPD patients, our research showed a prominent relationship between a decrease in BMI and an increase in all-cause mortality, as well as severe exacerbation. This association was strongest in underweight patients with COPD. Additionally, an increase in BMI was associated with a higher risk of death only among obese patients with COPD. Our study underscores the importance of regularly monitoring BMI changes in patients with COPD.

Data availability

The data are available from the Korean National Health Insurance Sharing Service (NHISS; https://nhiss.nhis.or.kr/) database, which is open to researchers on request with approval by the Institutional Review Board.

Agustí A, Celli BR, Criner GJ, Halpin D, Anzueto A, Barnes P, Bourbeau J, Han MK, Martinez FJ, de Montes M et al. Global Initiative for Chronic Obstructive Lung Disease 2023 Report: GOLD Executive Summary. Eur Respir J 2023, 61.

Cao C, Wang R, Wang J, Bunjhoo H, Xu Y, Xiong W. Body mass index and mortality in chronic obstructive pulmonary disease: a meta-analysis. PLoS ONE. 2012;7:e43892.

Shin SH, Kwon SO, Kim V, Silverman EK, Kim TH, Kim DK, Hwang YI, Yoo KH, Kim WJ, Park HY. Association of body mass index and COPD exacerbation among patients with chronic bronchitis. Respir Res. 2022;23:52.

Sun Y, Milne S, Jaw JE, Yang CX, Xu F, Li X, Obeidat M, Sin DD. BMI is associated with FEV(1) decline in chronic obstructive pulmonary disease: a meta-analysis of clinical trials. Respir Res. 2019;20:236.

Spelta F, Fratta Pasini AM, Cazzoletti L, Ferrari M. Body weight and mortality in COPD: focus on the obesity paradox. Eat Weight Disord. 2018;23:15–22.

Brigham EP, Anderson JA, Brook RD, Calverley PMA, Celli BR, Cowans NJ, Crim C, Diserens JE, Martinez FJ, McCormack MC et al. Challenging the obesity paradox: extreme obesity and COPD mortality in the SUMMIT trial. ERJ Open Res 2021, 7.

Lambert AA, Putcha N, Drummond MB, Boriek AM, Hanania NA, Kim V, Kinney GL, McDonald MN, Brigham EP, Wise RA, et al. Obesity is associated with increased morbidity in moderate to severe COPD. Chest. 2017;151:68–77.

Prescott E, Almdal T, Mikkelsen KL, Tofteng CL, Vestbo J, Lange P. Prognostic value of weight change in chronic obstructive pulmonary disease: results from the Copenhagen City Heart Study. Eur Respir J. 2002;20:539–44.

Lee H, Shin SH, Gu S, Zhao D, Kang D, Joi YR, Suh GY, Pastor-Barriuso R, Guallar E, Cho J, Park HY. Racial differences in comorbidity profile among patients with chronic obstructive pulmonary disease. BMC Med. 2018;16:178.

Kim EK, Singh D, Park JH, Park YB, Kim SI, Park B, Park J, Kim JH, Kim MA, Lee JH, et al. Impact of body mass index change on the prognosis of chronic obstructive pulmonary disease. Respiration. 2020;99:943–53.

Wada H, Ikeda A, Maruyama K, Yamagishi K, Barnes PJ, Tanigawa T, Tamakoshi A, Iso H. Low BMI and weight loss aggravate COPD mortality in men, findings from a large prospective cohort: the JACC study. Sci Rep. 2021;11:1531.

Cheol Seong S, Kim YY, Khang YH, Heon Park J, Kang HJ, Lee H, Do CH, Song JS, Hyon Bang J, Ha S, et al. Data Resource Profile: the National Health Information Database of the National Health Insurance Service in South Korea. Int J Epidemiol. 2017;46:799–800.

Seong SC, Kim YY, Park SK, Khang YH, Kim HC, Park JH, Kang HJ, Do CH, Song JS, Lee EJ, et al. Cohort profile: the National Health Insurance Service-National Health Screening Cohort (NHIS-HEALS) in Korea. BMJ Open. 2017;7:e016640.

Kim T, Kim H, Kong S, Shin SH, Cho J, Kang D, Park HY. Association between regular moderate to vigorous physical activity initiation following COPD diagnosis and mortality: an emulated target trial using nationwide cohort data. Chest 2023.

Park HY, Kang D, Lee H, Shin SH, Kang M, Kong S, Rhee CK, Cho J, Yoo KH. Impact of chronic obstructive pulmonary disease on mortality: a large national cohort study. Respirology. 2020;25:726–34.

Park HY, Kang D, Shin SH, Yoo KH, Rhee CK, Suh GY, Kim H, Shim YM, Guallar E, Cho J, Kwon OJ. Chronic obstructive pulmonary disease and lung cancer incidence in never smokers: a cohort study. Thorax. 2020;75:506–9.

Tarasenko YN, Linder DF, Miller EA. Muscle-strengthening and aerobic activities and mortality among 3 + year cancer survivors in the U.S. Cancer Causes Control. 2018;29:475–84.

Irwin ML, Smith AW, McTiernan A, Ballard-Barbash R, Cronin K, Gilliland FD, Baumgartner RN, Baumgartner KB, Bernstein L. Influence of pre- and postdiagnosis physical activity on mortality in breast cancer survivors: the health, eating, activity, and lifestyle study. J Clin Oncol. 2008;26:3958–64.

Kim BY, Kang SM, Kang JH, Kang SY, Kim KK, Kim KB, Kim B, Kim SJ, Kim YH, Kim JH, et al. 2020 Korean Society for the Study of Obesity Guidelines for the management of obesity in Korea. J Obes Metab Syndr. 2021;30:81–92.

Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies. Lancet. 2004;363:157–63.

Williamson DA, Bray GA, Ryan DH. Is 5% weight loss a satisfactory criterion to define clinically significant weight loss? Obes (Silver Spring). 2015;23:2319–20.

Kompaniyets L, Freedman DS, Belay B, Pierce SL, Kraus EM, Blanck HM, Goodman AB. Probability of 5% or greater weight loss or BMI reduction to healthy weight among adults with overweight or obesity. JAMA Netw Open. 2023;6:e2327358.

Lee J, Lee JS, Park SH, Shin SA, Kim K. Cohort Profile: the National Health Insurance Service-National Sample Cohort (NHIS-NSC), South Korea. Int J Epidemiol. 2017;46:e15.

Kim J, Rhee CK, Yoo KH, Kim YS, Lee SW, Park YB, Lee JH, Oh Y, Lee SD, Kim Y, et al. The health care burden of high grade chronic obstructive pulmonary disease in Korea: analysis of the Korean Health Insurance Review and Assessment Service data. Int J Chron Obstruct Pulmon Dis. 2013;8:561–8.

Quan H, Li B, Couris CM, Fushimi K, Graham P, Hider P, Januel JM, Sundararajan V. Updating and validating the Charlson comorbidity index and score for risk adjustment in hospital discharge abstracts using data from 6 countries. Am J Epidemiol. 2011;173:676–82.

Rutten EP, Calverley PM, Casaburi R, Agusti A, Bakke P, Celli B, Coxson HO, Crim C, Lomas DA, Macnee W, et al. Changes in body composition in patients with chronic obstructive pulmonary disease: do they influence patient-related outcomes? Ann Nutr Metab. 2013;63:239–47.

Remels AH, Gosker HR, Langen RC, Schols AM. The mechanisms of cachexia underlying muscle dysfunction in COPD. J Appl Physiol (1985). 2013;114:1253–62.

Lennon H, Sperrin M, Badrick E, Renehan AG. The obesity paradox in cancer: a review. Curr Oncol Rep. 2016;18:56.

Corrada MM, Kawas CH, Mozaffar F, Paganini-Hill A. Association of body mass index and weight change with all-cause mortality in the elderly. Am J Epidemiol. 2006;163:938–49.

Guo C, Yu T, Chang LY, Bo Y, Yu Z, Wong MCS, Tam T, Lao XQ. Mortality risk attributable to classification of chronic obstructive pulmonary disease and reduced lung function: a 21-year longitudinal cohort study. Respir Med. 2021;184:106471.

Mullerova H, Maselli DJ, Locantore N, Vestbo J, Hurst JR, Wedzicha JA, Bakke P, Agusti A, Anzueto A. Hospitalized exacerbations of COPD: risk factors and outcomes in the ECLIPSE cohort. Chest. 2015;147:999–1007.

Acknowledgements

Not applicable.

Funding

This work was supported by the National Research Foundation of Korea grants funded by the South Korean government (Ministry of Science and ICT) [NRF-2021R1A2C2093987].

Author information

Taeyun Kim, Sun Hye Shin, Hyunsoo Kim, Danbee Kang, and Hye Yun Park contributed equally to this article.

Authors and Affiliations

Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Kosin University Gospel Hospital, Kosin University College of Medicine, Busan, Republic of Korea

Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-ro, Seoul, 06351, Republic of Korea

Sun Hye Shin, Yunjoo Im & Hye Yun Park

Center for Clinical Epidemiology, Samsung Medical Center, Seoul, Republic of Korea

Hyunsoo Kim, Juhee Cho & Danbee Kang

Department of Clinical Research Design and Evaluation, SAIHST, Sungkyunkwan University, 115 Irwon-ro, Seoul, 06335, South Korea

Juhee Cho & Danbee Kang

Contributions

TK & SHS: Writing and editing the original draft. HK & JC: Methodology, formal analysis, and investigation. HYP: Writing, review, editing, supervision, and project administration. DK: Methodology, formal analysis, investigation, writing, review, and editing. YI & JC: Validation. All the authors discussed the results and approved the final version of the manuscript.

Corresponding authors

Correspondence to Danbee Kang or Hye Yun Park.

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Kim, T., Shin, S.H., Kim, H. et al. Longitudinal BMI change and outcomes in Chronic Obstructive Pulmonary Disease: a nationwide population-based cohort study. Respir Res 25, 150 (2024). https://doi.org/10.1186/s12931-024-02788-0

Received: 29 November 2023

Accepted: 25 March 2024

Published: 30 March 2024

DOI: https://doi.org/10.1186/s12931-024-02788-0

  • Exacerbation

Respiratory Research

ISSN: 1465-993X

population based case cohort study

Body mass index trajectories and mortality risk in Japan using a population-based prospective cohort study: the Japan Public Health Center-based Prospective Study

Affiliations.

  • 1 School of Human Evolution and Social Change, Arizona State University, Tempe, AZ, USA.
  • 2 Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore.
  • 3 Department of Global Health Policy, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
  • 4 Department of Psychiatry, Yale University, New Haven, CT, USA.
  • 5 Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA.
  • 6 Department of Pediatrics, Indiana University School of Medicine-Indianapolis, Indianapolis, IN, USA.
  • 7 Division of Cohort Research, National Cancer Center Institute for Cancer Control, Tokyo, Japan.
  • 8 Division of Prevention, National Cancer Center Institute for Cancer Control, Tokyo, Japan.
  • PMID: 37878816
  • PMCID: PMC10859135
  • DOI: 10.1093/ije/dyad145

Background: Recent studies have found that long-term changes in weight during adulthood are associated with a high risk of mortality. The objective of this study was to characterize body mass index (BMI) trajectories during adulthood and to examine the association between BMI trajectories and risk of death in the Japanese population.

Methods: The data were extracted from the Japan Public Health Center-based Prospective Study, a population-based prospective cohort study in Japan with participants aged 40-69 years who were followed for over 20 years. The participants were categorized into multiple BMI trajectory groups using the latent class growth model. Cox proportional-hazards models were fitted with all-cause mortality and cause-specific mortality as outcomes and the identified BMI trajectory groups as the predictor. In total, 65 520 participants were included in the analysis.

Results: Six BMI trajectory groups were identified: underweight stable (Group 1), low-to-high normal (Group 2), high-to-low normal (Group 3), normal to overweight (Group 4), overweight to normal (Group 5) and normal to obese (Group 6). Our Cox models showed a higher hazard (risk) of all-cause mortality among participants in the BMI-declining groups [Group 3, adjusted hazard ratio (aHR): 1.10, 95% CI: 1.05-1.16; Group 5, aHR: 1.16, 95% CI: 1.08-1.26], underweight stable group (Group 1, aHR: 1.27, 95% CI: 1.21-1.33) and normal to obese group (Group 6, aHR: 1.22, 95% CI: 1.13-1.33) than Group 2 (low-to-high normal BMI trajectory).
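As a reading aid, the adjusted hazard ratios reported above can be re-expressed as percentage excess hazard relative to the reference trajectory (Group 2). The sketch below only rearranges the published point estimates; the dictionary `reported_ahr` and the helper `excess_hazard_pct` are hypothetical names introduced for illustration.

```python
# Adjusted hazard ratios for all-cause mortality as quoted in the Results,
# each relative to Group 2 (low-to-high normal BMI trajectory).
reported_ahr = {
    "Group 1 (underweight stable)": 1.27,
    "Group 3 (high-to-low normal)": 1.10,
    "Group 5 (overweight to normal)": 1.16,
    "Group 6 (normal to obese)": 1.22,
}


def excess_hazard_pct(hazard_ratio: float) -> float:
    """Convert a hazard ratio into percent excess hazard vs the reference."""
    return round((hazard_ratio - 1.0) * 100.0, 1)


# Rank the groups from the largest to the smallest reported aHR.
ranked = sorted(reported_ahr.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    for group, ahr in ranked:
        print(f"{group}: aHR {ahr:.2f} -> {excess_hazard_pct(ahr):+.1f}% vs Group 2")
```

A hazard ratio describes the relative instantaneous rate of death during follow-up, so these percentages should not be read as differences in cumulative risk.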

Conclusions: Stable underweight and weight loss were associated with a high risk of mortality, both of which were uniquely observed in a Japanese population.

Keywords: Japan Public Health Center-based Prospective Study; body mass index trajectory; latent class growth model.

© The Author(s) 2023. Published by Oxford University Press on behalf of the International Epidemiological Association.

  • Body Mass Index
  • Japan / epidemiology
  • Obesity / complications
  • Overweight* / epidemiology
  • Prospective Studies
  • Public Health
  • Risk Factors
  • Thinness* / complications
  • Weight Loss

Grants and funding

  • 23-A-31/National Cancer Center Research and Development Fund
  • Grant-in-Aid for Cancer Research from the Ministry of Health, Labour and Welfare of Japan
  • KAKENHI 18K18146/Japan Society for the Promotion of Science

COMMENTS

  1. The case for case-cohort: An applied epidemiologist's guide to re-framing case-cohort studies to improve usability and flexibility

    Visualization of case-cohort designs assuming a time-on-study time scale. (A) The case-cohort study includes (1) a sample of individuals from the cohort who have experienced the outcome of interest ("cases") and (2) a sample of individuals randomly selected from among the members of the full cohort observed at baseline (the "sub-cohort").

  2. Population-Based Study

    Population-based studies may include a variety of study types. They may include case-control studies, cross-sectional studies, twin studies, or prospective and retrospective cohort studies. The important issue is the selection of the individuals that are included into the study - they should be representative of all individuals in the a ...

  3. Treatment-resistant depression and risk of autoimmune diseases ...

    We conducted a population-based study using both cohort design and nested case-control design in parallel, with the intention to preserve the advantages and complement the limitations of the other ...

  4. Cohort Studies

    Population-based cohort studies provide robust results. These studies have made significant contributions to assessing risk factor-outcome associations. An outcome- or disease-free study population is first identified by the exposure of interest and then followed over time until the occurrence of the disease or the outcome of interest ...

  5. Population-Based Cohort Studies: Still Relevant?

    The RCT, despite its limitations, is the gold-standard research method for determining the effectiveness of clinical interventions; nevertheless, population-based cohort studies are still extremely relevant for other research purposes, such as scientific discovery, informing the design of RCTs, and assessing effects of harmful exposures.

  6. Population-Based Cohort Studies: Still Relevant?

    Population-based cohort studies are a specific category of epidemiology studies in which a defined population is followed up and observed longitudinally to assess exposure and outcome relationships. Some critics may argue that such studies have yielded little clinical impact recently—unlike decades ago when they helped uncover major ...

  7. Population-Based Birth Cohort Studies in Epidemiology

    Birth cohort studies are the most appropriate type of design to determine the causal relationship between potential risk factors during the prenatal or postnatal period and the health status of the newborn up to childhood and potentially adulthood. To date, there has been a growth in interest regarding observational population-based studies which are performed to provide answers to specific ...

  8. (PDF) The Case-Population Study Design

    Abstract and Figures. Background: The case-population approach or population-based case-cohort approach is derived from the case-control design and consists of comparing past exposure to a given ...

  9. A population-based cohort study of socio-demographic risk ...

    During the 1,189,484 person-years of observation, 17,181 deaths occurred in our study population between March 13, 2020 and May 7, 2020. Table 1 shows the distribution of population at risk, and ...

  10. Frontiers

    Finally, population-based cohort studies have made important contributions to Mendelian Randomization analyses, a statistical approach that uses genetic information to assess observed associations between cardiovascular traits and clinical CVD outcomes for potential causality. ... In order to be included in a case-control study as prevalent MI ...

  11. [PDF] Population-based cohort studies.

    A population based case-cohort study of drug-induced anaphylaxis. Drug-induced anaphylaxis was most frequently caused by penicillins, analgesics and non-steroidal antiinflammatory drugs (NSAID) with the highest point estimate of the risk relative to all other drugs of 10.7, 6.9 and 3.7 respectively.

  12. A Population-Based Case-Cohort Study of Drug-Associated Agranulocytosis

    In this study, a population-based case-cohort design was used, in which drug use in cases was compared with drug use in a reference cohort. 21 In the case-cohort design, the reference cohort may contain 1 or more cases. Cases were patients admitted to a hospital with a validated diagnosis of agranulocytosis. The reference cohort consisted of ...

  13. A population-based case-cohort study of drug-associated ...

    Methods: To determine the risk of drug-associated agranulocytosis as a reason for admission to Dutch hospitals, we performed a population-based case-cohort study. Hospital discharge data came from the Dutch Centre for Health Care Information, Utrecht, which contains data on all general and university hospitals in the Netherlands.

  14. Assessing Left Ventricular Trabeculation with Cardiac MRI in the World

    Assessing Left Ventricular Trabeculation with Cardiac MRI in the World's Largest Population-based Cohort Study. Nadine Kawel-Boehm. Published Online: Apr 2 2024.

  15. Use of 5‐alpha reductase inhibitors and risk of gastrointestinal

    This large population-based cohort study found no evidence of a reduction in the risk of colorectal or gastro-oesophageal cancers among users of 5ARis compared to alpha-blockers. ... A prior case-control study suggested a potential reduced risk of gastro-oesophageal cancer with use of finasteride using a 2-year lag (OR: 0.68, ...

  16. Addition of inflammation-related biomarkers to the CAIDE model for risk

    Background It is of interest whether inflammatory biomarkers can improve dementia prediction models, such as the widely used Cardiovascular Risk Factors, Aging and Dementia (CAIDE) model. Methods The Olink Target 96 Inflammation panel was assessed in a nested case-cohort design within a large, population-based German cohort study (n = 9940; age-range: 50-75 years). All study participants who ...

  17. Population-Based Study

    Population-based studies may include a variety of study types. They may include case-control studies, cross-sectional studies, twin studies, or prospective and retrospective cohort studies. The important issue is the selection of the individuals that are included into the study - they should be representative of all individuals in the a ...

  18. The Case-Population Study Design

    Background: The case-population approach or population-based case-cohort approach is derived from the case-control design and consists of comparing past exposure to a given risk factor in subjects presenting a given disease or symptom (cases) with the exposure rate to this factor in the whole cohort or in the source population of cases. In the same way as the case-control approach, the case ...

  19. Association between statin use and the risk for idiopathic ...

    These patients were then matched in a 1:3 ratio to 31,704 subjects from a control cohort without IPF, with matching based on age and sex. A case-control study was performed to evaluate the ...

  20. Timely Pulmonary Tuberculosis Diagnosis Based on the Epidemiological

    Timely Pulmonary Tuberculosis Diagnosis Based on the Epidemiological Disease Spectrum: Population-Based Prospective Cohort Study in the Republic of Korea. JMIR Public Health Surveill 2024;10:e47422. doi: 10.2196/47422. PMID: 38557939.

  21. Association between physical activity over a 10-year period and current

    Objectives To explore the relationship between physical activity over a 10-year period and current symptoms of insomnia, daytime sleepiness and estimated sleep duration in adults aged 39-67. Design Population-based, multicentre cohort study. Setting 21 centres in nine European countries. Methods Included were 4339 participants in the third follow-up to the European Community Respiratory ...

  22. Physical fitness in male adolescents and atherosclerosis in middle age

    Methods This population-based cohort study linked physical fitness data from the Swedish Military Conscription Register during adolescence to atherosclerosis data from the Swedish CArdioPulmonary bioImage Study in middle age. Cardiorespiratory fitness was assessed using a maximal cycle-ergometer test, and knee extension muscular strength was evaluated through an isometric dynamometer.

  23. Overview: Cohort Study Designs

    The cohort study design is an excellent method to understand an outcome or the natural history of a disease or condition in an identified study population (Mann, 2012; Song & Chung, 2010). Since participants do not have the outcome or disease at study entry, the temporal causality between exposure and outcome(s) can be assessed using this ...

  24. Longitudinal BMI change and outcomes in Chronic Obstructive Pulmonary

    The association between longitudinal body mass index (BMI) change and clinical outcomes in patients with chronic obstructive pulmonary disease (COPD) has not been fully investigated. This retrospective cohort study included 116,463 COPD patients aged ≥ 40, with at least two health examinations, one within 2 years before and another within 3 years after COPD diagnosis (January 1, 2014, to December ...

  25. Body mass index trajectories and mortality risk in Japan using a

    Methods: The data were extracted from Japan Public Health Center-based Prospective Study-a population-based prospective cohort study in Japan with participants aged 40-69 years followed over 20 years. The participants were categorized into multiple BMI trajectory groups using the latent class growth model. The Cox proportional-hazards model was ...

  26. Eating behaviors and incidence of type 2 ...

    This historical cohort study sought to research the relationship between eating behaviors and the incidence of type 2 diabetes in a large, long-term cohort of Japanese subjects. Materials and Methods. Panasonic Corporation employees who had no history of diabetes and attended yearly health surveys between 2008 and 2018 were included in this study.