What Is a Case-Control Study? | Definition & Examples
Published on February 4, 2023 by Tegan George. Revised on June 22, 2023.
A case-control study is an observational study design that compares a group of participants possessing a condition of interest to a very similar group lacking that condition. Here, the participants possessing the attribute of study, such as a disease, are called the “case,” and those without it are the “control.”
It’s important to remember that the case group is chosen because they already possess the attribute of interest. The point of the control group is to facilitate investigation, e.g., studying whether the case group was systematically more exposed to a suspected risk factor than the control group was.
Case-control studies are a type of observational study often used in fields like medical research, environmental health, or epidemiology. While most observational studies are qualitative in nature, case-control studies can also be quantitative, and they often are in healthcare settings. Case-control studies can be used for both exploratory and explanatory research, and they are a good choice for studying research topics like disease exposure and health outcomes.
A case-control study may be a good fit for your research if it meets the following criteria.
- Data on exposure (e.g., to a chemical or a pesticide) are difficult to obtain or expensive.
- The disease associated with the exposure you’re studying has a long incubation period or is rare or under-studied (e.g., AIDS in the early 1980s).
- The population you are studying is difficult to contact for follow-up questions (e.g., asylum seekers).
Retrospective cohort studies use existing secondary research data, such as medical records or databases, to identify a group of people with a common exposure or risk factor and to observe their outcomes over time. Case-control studies conduct primary research, comparing a group of participants possessing a condition of interest to a very similar group lacking that condition in real time.
Case-control studies are common in fields like epidemiology, healthcare, and psychology.
You would then collect data on your participants’ exposure to contaminated drinking water, focusing on variables such as the source of said water and the duration of exposure, for both groups. You could then compare the two to determine if there is a relationship between drinking water contamination and the risk of developing a gastrointestinal illness.
Example: Healthcare case-control study
You are interested in the relationship between the dietary intake of a particular vitamin (e.g., vitamin D) and the risk of developing osteoporosis later in life. Here, the case group would be individuals who have been diagnosed with osteoporosis, while the control group would be individuals without osteoporosis.
You would then collect information on dietary intake of vitamin D for both the cases and controls and compare the two groups to determine if there is a relationship between vitamin D intake and the risk of developing osteoporosis.
Example: Psychology case-control study
You are studying the relationship between early-childhood stress and the likelihood of later developing post-traumatic stress disorder (PTSD). Here, the case group would be individuals who have been diagnosed with PTSD, while the control group would be individuals without PTSD.
Case-control studies are a solid research method choice, but they come with distinct advantages and disadvantages.
Advantages of case-control studies
- Case-control studies are a great choice if ethical considerations about your participants preclude you from using a traditional experimental design.
- Case-control studies are time efficient and fairly inexpensive to conduct because they require fewer subjects than other research methods.
- If multiple exposures lead to a single outcome, case-control studies can examine them together. As such, they truly shine when used to study rare outcomes or outbreaks of a particular disease.
Disadvantages of case-control studies
- Case-control studies, like other observational studies, run a high risk of research biases. They are particularly susceptible to observer bias, recall bias, and interviewer bias.
- When the exposure under study is very rare, conducting a case-control study can be very time consuming and inefficient.
- Case-control studies in general have low internal validity and are not always credible.
Case-control studies by design focus on a single outcome. This makes them rigid and not generalizable, as no extrapolation can be made about other outcomes like risk recurrence or future exposure threat. This leads to less satisfying results than other methodological choices.
A case-control study differs from a cohort study because cohort studies are more longitudinal in nature and do not necessarily require a control group.
While one may be added if the investigator so chooses, members of the cohort are primarily selected because of a shared characteristic among them. In particular, retrospective cohort studies are designed to follow a group of people with a common exposure or risk factor over time and observe their outcomes.
Case-control studies, in contrast, require both a case group and a control group, as suggested by their name, and usually are used to identify risk factors for a disease by comparing cases and controls.
A case-control study differs from a cross-sectional study because case-control studies are naturally retrospective in nature, looking backward in time to identify exposures that may have occurred before the development of the disease.
On the other hand, cross-sectional studies collect data on a population at a single point in time. The goal here is to describe the characteristics of the population, such as their age, gender identity, or health status, and understand the distribution and relationships of these characteristics.
Cases and controls are selected for a case-control study based on their inherent characteristics. Participants already possessing the condition of interest form the “case,” while those without form the “control.”
Keep in mind that by definition the case group is chosen because they already possess the attribute of interest. The point of the control group is to facilitate investigation, e.g., studying whether the case group was systematically more exposed to a suspected risk factor than the control group was.
The strength of the association between an exposure and a disease in a case-control study is most often measured using the odds ratio (OR). Because cases and controls are sampled by outcome, relative risk (RR) usually cannot be calculated directly, but the OR approximates it when the disease is rare.
Case-control studies cannot establish causality on their own.
As observational studies, they can suggest associations between an exposure and a disease, but they cannot prove without a doubt that the exposure causes the disease. In particular, issues arising from timing, research biases like recall bias, and the selection of variables lead to low internal validity and the inability to determine causality.
Study Design 101
Case Control Study
A study that compares patients who have a disease or outcome of interest (cases) with patients who do not have the disease or outcome (controls), and looks back retrospectively to compare how frequently the exposure to a risk factor is present in each group to determine the relationship between the risk factor and the disease.
Case control studies are observational because no intervention is attempted and no attempt is made to alter the course of the disease. The goal is to retrospectively determine the exposure to the risk factor of interest from each of the two groups of individuals: cases and controls. These studies are designed to estimate odds.
Case control studies are also known as "retrospective studies" and "case-referent studies."
Advantages
- Good for studying rare conditions or diseases
- Less time needed to conduct the study because the condition or disease has already occurred
- Lets you simultaneously look at multiple risk factors
- Useful as initial studies to establish an association
- Can answer questions that could not be answered through other study designs
Disadvantages
- Retrospective studies have more problems with data quality because they rely on memory and people with a condition will be more motivated to recall risk factors (also called recall bias).
- Not good for evaluating diagnostic tests because it’s already clear that the cases have the condition and the controls do not
- It can be difficult to find a suitable control group
Design pitfalls to look out for
Care should be taken to avoid confounding, which arises when an exposure and an outcome are both strongly associated with a third variable. Controls should be subjects who might have been cases in the study but are selected independent of the exposure. Cases and controls should also not be "over-matched."
Is the control group appropriate for the population? Does the study use matching or pairing appropriately to avoid the effects of a confounding variable? Does it use appropriate inclusion and exclusion criteria?
Fictitious Example
There is a suspicion that zinc oxide, the white non-absorbent sunscreen traditionally worn by lifeguards, is more effective at preventing sunburns that lead to skin cancer than absorbent sunscreen lotions. A case-control study was conducted to investigate whether exposure to zinc oxide is a more effective skin cancer prevention measure. The study compared a group of former lifeguards who had developed cancer on their cheeks and noses (cases) with a group of lifeguards without this type of cancer (controls) and assessed their prior exposure to zinc oxide or absorbent sunscreen lotions.
This study would be retrospective in that the former lifeguards would be asked to recall which type of sunscreen they used on their face and approximately how often. This could be either a matched or unmatched study, but efforts would need to be made to ensure that the former lifeguards are of the same average age, and lifeguarded for a similar number of seasons and amount of time per season.
Real-life Examples
Boubekri, M., Cheung, I., Reid, K., Wang, C., & Zee, P. (2014). Impact of windows and daylight exposure on overall health and sleep quality of office workers: a case-control pilot study. Journal of Clinical Sleep Medicine: JCSM: Official Publication of the American Academy of Sleep Medicine, 10(6), 603-611. https://doi.org/10.5664/jcsm.3780
This pilot study explored the impact of exposure to daylight on the health of office workers (measuring well-being and sleep quality subjectively, and light exposure, activity level, and sleep-wake patterns via actigraphy). Individuals with windows in their workplaces had more light exposure, longer sleep duration, and more physical activity. They also reported better scores in the areas of vitality and role limitations due to physical problems, better sleep quality, and fewer sleep disturbances.
Togha, M., Razeghi Jahromi, S., Ghorbani, Z., Martami, F., & Seifishahpar, M. (2018). Serum Vitamin D Status in a Group of Migraine Patients Compared With Healthy Controls: A Case-Control Study. Headache, 58(10), 1530-1540. https://doi.org/10.1111/head.13423
This case-control study compared serum vitamin D levels in individuals who experience migraine headaches with those of their matched controls. Studied over a period of thirty days, higher levels of serum vitamin D were associated with lower odds of migraine headache.
Related Formulas
- Odds ratio in an unmatched study
- Odds ratio in a matched study
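The two formulas named above can be sketched in a few lines of Python. This is a minimal illustration, not the page's own formula sheet: the function names, the 2x2 cell labels (a, b, c, d), and the Woolf (log-based) confidence interval are assumptions introduced here. The unmatched odds ratio is the cross-product ratio of the 2x2 table; the matched odds ratio uses only the discordant pairs.

```python
import math


def odds_ratio_unmatched(a, b, c, d):
    """Odds ratio for an unmatched 2x2 table (hypothetical cell labels).

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    return (a * d) / (b * c)


def woolf_confidence_interval(a, b, c, d, z=1.96):
    """Approximate 95% interval on the log scale (Woolf method).

    Assumes every cell is non-zero; small or empty cells need other methods.
    """
    log_or = math.log(odds_ratio_unmatched(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def odds_ratio_matched(pairs):
    """Odds ratio for 1:1 matched pairs.

    pairs: iterable of (case_exposed, control_exposed) booleans.
    Only discordant pairs contribute; concordant pairs are ignored.
    """
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    return case_only / control_only
```

In the matched version, adding or removing pairs in which case and control share the same exposure status leaves the estimate unchanged, which is the point made about concordant pairs under "Matched Design."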
Related Terms
Case
A patient with the disease or outcome of interest.
Confounding
When an exposure and an outcome are both strongly associated with a third variable.
Control
A patient who does not have the disease or outcome.
Matched Design
Each case is matched individually with a control according to certain characteristics such as age and gender. It is important to remember that the concordant pairs (pairs in which the case and control are either both exposed or both not exposed) tell us nothing about the risk of exposure separately for cases or controls.
Observed Assignment
The method of assignment of individuals to study and control groups in observational studies when the investigator does not intervene to perform the assignment.
Unmatched Design
The controls are a sample from a suitable non-affected population.
Now test yourself!
1. Case Control Studies are prospective in that they follow the cases and controls over time and observe what occurs.
a) True b) False
2. Which of the following is an advantage of Case Control Studies?
a) They can simultaneously look at multiple risk factors. b) They are useful to initially establish an association between a risk factor and a disease or outcome. c) They take less time to complete because the condition or disease has already occurred. d) b and c only e) a, b, and c
Study Quality Assessment Tools
In 2013, NHLBI developed a set of tailored quality assessment tools to assist reviewers in focusing on concepts that are key to a study’s internal validity. The tools were specific to certain study designs and tested for potential flaws in study methods or implementation. Experts used the tools during the systematic evidence review process to update existing clinical guidelines, such as those on cholesterol, blood pressure, and obesity. Their findings are outlined in the following reports:
- Assessing Cardiovascular Risk: Systematic Evidence Review from the Risk Assessment Work Group
- Management of Blood Cholesterol in Adults: Systematic Evidence Review from the Cholesterol Expert Panel
- Management of Blood Pressure in Adults: Systematic Evidence Review from the Blood Pressure Expert Panel
- Managing Overweight and Obesity in Adults: Systematic Evidence Review from the Obesity Expert Panel
While these tools have not been independently published and would not be considered standardized, they may be useful to the research community. These reports describe how experts used the tools for the project. Researchers may want to use the tools for their own projects; however, they would need to determine their own parameters for making judgements. Details about the design and application of the tools are included in Appendix A of the reports.
Quality Assessment of Controlled Intervention Studies - Study Quality Assessment Tools
*CD, cannot determine; NA, not applicable; NR, not reported
Guidance for Assessing the Quality of Controlled Intervention Studies
The guidance document below is organized by question number from the tool for quality assessment of controlled intervention studies.
Question 1. Described as randomized
Was the study described as randomized? A study does not satisfy quality criteria as randomized simply because the authors call it randomized; however, it is a first step in determining if a study is randomized.
Questions 2 and 3. Treatment allocation–two interrelated pieces
Adequate randomization: Randomization is adequate if it occurred according to the play of chance (e.g., computer generated sequence in more recent studies, or random number table in older studies).
Inadequate randomization: Randomization is inadequate if there is a preset plan (e.g., alternation, where every other subject is assigned to the treatment arm, or another method of allocation, such as time or day of hospital admission or clinic visit, ZIP Code, phone number, etc.). In fact, this is not randomization at all–it is another method of assignment to groups. If assignment is not by the play of chance, then the answer to this question is no.
There may be some tricky scenarios that will need to be read carefully and considered for the role of chance in assignment. For example, randomization may occur at the site level, where all individuals at a particular site are assigned to receive treatment or no treatment. This scenario is used for group-randomized trials, which can be truly randomized, but often are "quasi-experimental" studies with comparison groups rather than true control groups. (Few, if any, group-randomized trials are anticipated for this evidence review.)
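The contrast between chance-based and preset assignment can be made concrete with a short sketch. This is an illustrative example only, assuming simple (unrestricted) randomization with two hypothetical arm labels; the function names and the `seed` parameter are introduced here and are not part of the NHLBI tool.

```python
import random


def computer_generated_sequence(n_participants, arms=("treatment", "control"), seed=None):
    """Simple randomization: each assignment is an independent random draw.

    A fixed seed makes the sequence reproducible for auditing; in practice
    the sequence would be kept concealed from those enrolling participants.
    """
    rng = random.Random(seed)
    return [rng.choice(arms) for _ in range(n_participants)]


def alternation_sequence(n_participants, arms=("treatment", "control")):
    """Alternation: a preset plan, so NOT randomization.

    The next assignment is fully predictable from the previous one.
    """
    return [arms[i % len(arms)] for i in range(n_participants)]
```

With alternation, anyone who knows the last assignment can predict the next one, which is why it fails this question even if a study report calls it "randomized."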
Allocation concealment: This means that one does not know in advance, or cannot guess accurately, to what group the next person eligible for randomization will be assigned. Methods include sequentially numbered opaque sealed envelopes, numbered or coded containers, central randomization by a coordinating center, computer-generated randomization that is not revealed ahead of time, etc.
Questions 4 and 5. Blinding
Blinding means that one does not know to which group–intervention or control–the participant is assigned. It is also sometimes called "masking." The reviewer assessed whether each of the following was blinded to knowledge of treatment assignment: (1) the person assessing the primary outcome(s) for the study (e.g., taking the measurements such as blood pressure, examining health records for events such as myocardial infarction, reviewing and interpreting test results such as x ray or cardiac catheterization findings); (2) the person receiving the intervention (e.g., the patient or other study participant); and (3) the person providing the intervention (e.g., the physician, nurse, pharmacist, dietitian, or behavioral interventionist).
Generally placebo-controlled medication studies are blinded to patient, provider, and outcome assessors; behavioral, lifestyle, and surgical studies are examples of studies that are frequently blinded only to the outcome assessors because blinding of the persons providing and receiving the interventions is difficult in these situations. Sometimes the individual providing the intervention is the same person performing the outcome assessment. This was noted when it occurred.
Question 6. Similarity of groups at baseline
This question relates to whether the intervention and control groups have similar baseline characteristics on average, especially those characteristics that may affect the intervention or outcomes. The point of randomized trials is to create groups that are as similar as possible except for the intervention(s) being studied in order to compare the effects of the interventions between groups. When reviewers abstracted baseline characteristics, they noted when there was a significant difference between groups. Baseline characteristics for intervention groups are usually presented in a table in the article (often Table 1).
Groups can differ at baseline without raising red flags if: (1) the differences would not be expected to have any bearing on the interventions and outcomes; or (2) the differences are not statistically significant. When concerned about baseline difference in groups, reviewers recorded them in the comments section and considered them in their overall determination of the study quality.
Questions 7 and 8. Dropout
"Dropouts" in a clinical trial are individuals for whom there are no end point measurements, often because they dropped out of the study and were lost to followup.
Generally, an acceptable overall dropout rate is considered 20 percent or less of participants who were randomized or allocated into each group. An acceptable differential dropout rate is an absolute difference between groups of 15 percentage points at most (calculated by subtracting the dropout rate of one group from the dropout rate of the other group). However, these are general rates. Lower overall dropout rates are expected in shorter studies, whereas higher overall dropout rates may be acceptable for studies of longer duration. For example, a 6-month study of weight loss interventions should be expected to have nearly 100 percent followup (almost no dropouts–nearly everybody gets their weight measured regardless of whether or not they actually received the intervention), whereas a 10-year study testing the effects of intensive blood pressure lowering on heart attacks may be acceptable if there is a 20-25 percent dropout rate, especially if the dropout rate between groups was similar. The panels for the NHLBI systematic reviews may set different levels of dropout caps.
Conversely, differential dropout rates are not flexible; there should be a 15 percentage point cap. If there is a differential dropout rate of 15 percentage points or higher between arms, then there is a serious potential for bias. This constitutes a fatal flaw, resulting in a poor quality rating for the study.
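The two dropout checks can be sketched as a small calculation. This is a rough illustration under stated assumptions: the function names and the input layout are hypothetical, the overall rate is computed across all randomized participants, and a difference of exactly 15 percentage points is treated as a flaw (the guidance is ambiguous at that boundary).

```python
def dropout_rate(randomized, completed):
    """Percentage of randomized participants without end point measurements."""
    return 100.0 * (randomized - completed) / randomized


def assess_dropout(groups, overall_cap=20.0, differential_cap=15.0):
    """Apply the general dropout thresholds to a set of study arms.

    groups: dict mapping arm name -> (n_randomized, n_completed).
    Assumption: a differential of exactly `differential_cap` is flagged.
    """
    rates = {arm: dropout_rate(r, c) for arm, (r, c) in groups.items()}
    total_randomized = sum(r for r, _ in groups.values())
    total_completed = sum(c for _, c in groups.values())
    overall = dropout_rate(total_randomized, total_completed)
    differential = max(rates.values()) - min(rates.values())
    return {
        "rates": rates,
        "overall": overall,
        "differential": differential,
        "overall_ok": overall <= overall_cap,
        "differential_ok": differential < differential_cap,
    }
```

The `overall_cap` parameter is left adjustable because the guidance allows higher overall dropout in long studies, whereas the differential cap is described as fixed.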
Question 9. Adherence
Did participants in each treatment group adhere to the protocols for assigned interventions? For example, if Group 1 was assigned to 10 mg/day of Drug A, did most of them take 10 mg/day of Drug A? Another example is a study evaluating the difference between a 30-pound weight loss and a 10-pound weight loss on specific clinical outcomes (e.g., heart attacks), but the 30-pound weight loss group did not achieve its intended weight loss target (e.g., the group only lost 14 pounds on average). A third example is whether a large percentage of participants assigned to one group "crossed over" and got the intervention provided to the other group. A final example is when one group that was assigned to receive a particular drug at a particular dose had a large percentage of participants who did not end up taking the drug or the dose as designed in the protocol.
Question 10. Avoid other interventions
Changes that occur in the study outcomes being assessed should be attributable to the interventions being compared in the study. If study participants receive interventions that are not part of the study protocol and could affect the outcomes being assessed, and they receive these interventions differentially, then there is cause for concern because these interventions could bias results. The following scenario is another example of how bias can occur. In a study comparing two different dietary interventions on serum cholesterol, one group had a significantly higher percentage of participants taking statin drugs than the other group. In this situation, it would be impossible to know if a difference in outcome was due to the dietary intervention or the drugs.
Question 11. Outcome measures assessment
What tools or methods were used to measure the outcomes in the study? Were the tools and methods accurate and reliable–for example, have they been validated, or are they objective? This is important as it indicates the confidence you can have in the reported outcomes. Perhaps even more important is ascertaining that outcomes were assessed in the same manner within and between groups. One example of differing methods is self-report of dietary salt intake versus urine testing for sodium content (a more reliable and valid assessment method). Another example is using BP measurements taken by practitioners who use their usual methods versus using BP measurements done by individuals trained in a standard approach. Such an approach may include using the same instrument each time and taking an individual's BP multiple times. In each of these cases, the answer to this assessment question would be "no" for the former scenario and "yes" for the latter. In addition, a study in which an intervention group was seen more frequently than the control group, enabling more opportunities to report clinical events, would not be considered reliable and valid.
Question 12. Power calculation
Generally, a study's methods section will address the sample size needed to detect differences in primary outcomes. The current standard is at least 80 percent power to detect a clinically relevant difference in an outcome using a two-sided alpha of 0.05. Often, however, older studies will not report on power.
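As a rough illustration of what such a sample size statement involves, the sketch below uses the standard normal-approximation formula for comparing two group means with equal group sizes. The function name and the choice of a continuous outcome are assumptions made here; real trials often use t-based or outcome-specific formulas that give slightly different numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Participants needed per arm to detect a mean difference `delta`.

    Normal approximation for a two-sided test with equal group sizes:
        n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta) ** 2
    sigma is the assumed common standard deviation of the outcome.
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)
```

Halving the clinically relevant difference roughly quadruples the required sample size, which is one reason the choice of `delta` deserves scrutiny when appraising a power statement.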
Question 13. Prespecified outcomes
Investigators should prespecify outcomes reported in a study for hypothesis testing–which is the reason for conducting an RCT. Without prespecified outcomes, the study may be reporting ad hoc analyses, simply looking for differences supporting desired findings. Investigators also should prespecify subgroups being examined. Most RCTs conduct numerous post hoc analyses as a way of exploring findings and generating additional hypotheses. The intent of this question is to give more weight to reports that are not simply exploratory in nature.
Question 14. Intention-to-treat analysis
Intention-to-treat (ITT) means everybody who was randomized is analyzed according to the original group to which they are assigned. This is an extremely important concept because conducting an ITT analysis preserves the whole reason for doing a randomized trial; that is, to compare groups that differ only in the intervention being tested. When the ITT philosophy is not followed, groups being compared may no longer be the same. In this situation, the study would likely be rated poor. However, if an investigator used another type of analysis that could be viewed as valid, this would be explained in the "other" box on the quality assessment form. Some researchers use a completers analysis (an analysis of only the participants who completed the intervention and the study), which introduces significant potential for bias. Characteristics of participants who do not complete the study are unlikely to be the same as those who do. The likely impact of participants withdrawing from a study treatment must be considered carefully. ITT analysis provides a more conservative (potentially less biased) estimate of effectiveness.
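The difference between an ITT analysis and a completers analysis can be shown with a toy calculation. Everything in this sketch is hypothetical: the record fields, the arm names, and the eight made-up participants exist only to show how restricting to completers can change the comparison.

```python
from collections import defaultdict

# Made-up records for illustration only; not data from any study.
TOY_RECORDS = [
    {"assigned": "drug", "completed": True, "event": 0},
    {"assigned": "drug", "completed": True, "event": 0},
    {"assigned": "drug", "completed": False, "event": 1},
    {"assigned": "drug", "completed": False, "event": 1},
    {"assigned": "placebo", "completed": True, "event": 1},
    {"assigned": "placebo", "completed": True, "event": 1},
    {"assigned": "placebo", "completed": True, "event": 0},
    {"assigned": "placebo", "completed": True, "event": 0},
]


def event_rates(records, completers_only=False):
    """Event rate per arm, grouped by the arm each person was assigned to.

    completers_only=False -> intention-to-treat (everyone randomized counts).
    completers_only=True  -> completers analysis (non-completers are dropped).
    """
    counts = defaultdict(lambda: [0, 0])  # arm -> [events, participants]
    for rec in records:
        if completers_only and not rec["completed"]:
            continue
        counts[rec["assigned"]][0] += rec["event"]
        counts[rec["assigned"]][1] += 1
    return {arm: events / n for arm, (events, n) in counts.items()}
```

In this toy data the two arms have the same event rate under ITT, while the completers analysis makes the drug arm look event-free because the participants who had events were the ones who stopped treatment.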
General Guidance for Determining the Overall Quality Rating of Controlled Intervention Studies
The questions on the assessment tool were designed to help reviewers focus on the key concepts for evaluating a study's internal validity. They are not intended to create a list that is simply tallied up to arrive at a summary judgment of quality.
Internal validity is the extent to which the results (effects) reported in a study can truly be attributed to the intervention being evaluated and not to flaws in the design or conduct of the study–in other words, the ability for the study to make causal conclusions about the effects of the intervention being tested. Such flaws can increase the risk of bias. Critical appraisal involves considering the risk of potential for allocation bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed in the questions above. High risk of bias translates to a rating of poor quality. Low risk of bias translates to a rating of good quality.
Fatal flaws: If a study has a "fatal flaw," then risk of bias is significant, and the study is of poor quality. Examples of fatal flaws in RCTs include high dropout rates, high differential dropout rates, no ITT analysis or other unsuitable statistical analysis (e.g., completers-only analysis).
Generally, when evaluating a study, one will not see a "fatal flaw;" however, one will find some risk of bias. During training, reviewers were instructed to look for the potential for bias in studies by focusing on the concepts underlying the questions in the tool. For any box checked "no," reviewers were told to ask: "What is the potential risk of bias that may be introduced by this flaw?" That is, does this factor cause one to doubt the results that were reported in the study?
NHLBI staff provided reviewers with background reading on critical appraisal, while emphasizing that the best approach to use is to think about the questions in the tool in determining the potential for bias in a study. The staff also emphasized that each study has specific nuances; therefore, reviewers should familiarize themselves with the key concepts.
Quality Assessment of Systematic Reviews and Meta-Analyses - Study Quality Assessment Tools
Guidance for Quality Assessment Tool for Systematic Reviews and Meta-Analyses
A systematic review is a study that attempts to answer a question by synthesizing the results of primary studies while using strategies to limit bias and random error. These strategies include a comprehensive search of all potentially relevant articles and the use of explicit, reproducible criteria in the selection of articles included in the review. Research designs and study characteristics are appraised, data are synthesized, and results are interpreted using a predefined systematic approach that adheres to evidence-based methodological principles.
Systematic reviews can be qualitative or quantitative. A qualitative systematic review summarizes the results of the primary studies but does not combine the results statistically. A quantitative systematic review, or meta-analysis, is a type of systematic review that employs statistical techniques to combine the results of the different studies into a single pooled estimate of effect, often given as an odds ratio. The guidance document below is organized by question number from the tool for quality assessment of systematic reviews and meta-analyses.
Question 1. Focused question
The review should be based on a question that is clearly stated and well-formulated. An example would be a question that uses the PICO (population, intervention, comparator, outcome) format, with all components clearly described.
Question 2. Eligibility criteria
The eligibility criteria used to determine whether studies were included or excluded should be clearly specified and predefined. It should be clear to the reader why studies were included or excluded.
Question 3. Literature search
The search strategy should employ a comprehensive, systematic approach in order to capture all of the evidence possible that pertains to the question of interest. At a minimum, a comprehensive review has the following attributes:
- Electronic searches were conducted using multiple scientific literature databases, such as MEDLINE, EMBASE, Cochrane Central Register of Controlled Trials, PsychLit, and others as appropriate for the subject matter.
- Manual searches of references found in articles and textbooks should supplement the electronic searches.
Additional search strategies that may be used to improve the yield include the following:
- Studies published in other countries
- Studies published in languages other than English
- Identification by experts in the field of studies and articles that may have been missed
- Search of grey literature, including technical reports and other papers from government agencies or scientific groups or committees; presentations and posters from scientific meetings, conference proceedings, unpublished manuscripts; and others. Searching the grey literature is important (whenever feasible) because sometimes only positive studies with significant findings are published in the peer-reviewed literature, which can bias the results of a review.
The review should describe the literature search strategy clearly enough that others can reproduce it with similar results.
Question 4. Dual review for determining which studies to include and exclude
Titles, abstracts, and full-text articles (when indicated) should be reviewed by two independent reviewers to determine which studies to include and exclude in the review. Disagreements should be resolved through discussion and consensus or with third parties. The review should clearly state the process, including the methods for settling disagreements.
Question 5. Quality appraisal for internal validity
Each included study should be appraised for internal validity (study quality assessment) using a standardized approach for rating the quality of the individual studies. Ideally, at least two independent reviewers should appraise each study for internal validity. However, there is not one commonly accepted, standardized tool for rating the quality of studies. So, in the research papers, reviewers looked for an assessment of the quality of each study and a clear description of the process used.
Question 6. List and describe included studies
All included studies should be listed in the review, along with descriptions of their key characteristics. This can be presented in either narrative or table format.
Question 7. Publication bias
Publication bias is a term used when studies with positive results have a higher likelihood of being published, being published rapidly, being published in higher impact journals, being published in English, being published more than once, or being cited by others.425,426 Publication bias can be linked to favorable or unfavorable treatment of research findings due to investigators, editors, industry, commercial interests, or peer reviewers. To minimize the potential for publication bias, researchers can conduct a comprehensive literature search that includes the strategies discussed in Question 3.
A funnel plot–a scatter plot of component studies in a meta-analysis–is a commonly used graphical method for detecting publication bias. If there is no significant publication bias, the graph looks like a symmetrical inverted funnel.
Reviewers assessed and clearly described the likelihood of publication bias.
Question 8. Heterogeneity
Heterogeneity is used to describe important differences in studies included in a meta-analysis that may make it inappropriate to combine the studies.427 Heterogeneity can be clinical (e.g., important differences between study participants, baseline disease severity, and interventions); methodological (e.g., important differences in the design and conduct of the study); or statistical (e.g., important differences in the quantitative results or reported effects).
Researchers usually assess clinical or methodological heterogeneity qualitatively by determining whether it makes sense to combine studies. For example:
- Should a study evaluating the effects of an intervention on CVD risk that involves elderly male smokers with hypertension be combined with a study that involves healthy adults ages 18 to 40? (Clinical Heterogeneity)
- Should a study that uses a randomized controlled trial (RCT) design be combined with a study that uses a case-control study design? (Methodological Heterogeneity)
Statistical heterogeneity describes the degree of variation in the effect estimates from a set of studies; it is assessed quantitatively. The two most common methods used to assess statistical heterogeneity are the Q test (also known as the χ² or chi-square test) and the I² test.
Reviewers examined studies to determine if an assessment for heterogeneity was conducted and clearly described. If the studies are found to be heterogeneous, the investigators should explore and explain the causes of the heterogeneity, and determine what influence, if any, the study differences had on overall study results.
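As a rough illustration of the two statistics named above, the sketch below pools study effects with inverse-variance (fixed-effect) weights, then computes Cochran's Q and I². The function name `pooled_fixed_effect` and all of the log odds ratios and standard errors are hypothetical, chosen only to show the arithmetic. A real meta-analysis would normally rely on a dedicated statistics package.

```python
import math


def pooled_fixed_effect(log_effects, std_errors):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2.

    log_effects: per-study effect estimates on the log scale (e.g., log odds ratios)
    std_errors:  standard errors of those estimates
    Returns (pooled log effect, Q, degrees of freedom, I^2 in percent).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / total_weight
    # Q: weighted sum of squared deviations of each study from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, df, i2


# Hypothetical data for four studies (not taken from any real review)
log_ors = [math.log(1.5), math.log(1.2), math.log(2.0), math.log(0.9)]
ses = [0.20, 0.15, 0.30, 0.25]

pooled, q, df, i2 = pooled_fixed_effect(log_ors, ses)
print(f"Pooled OR: {math.exp(pooled):.2f}")
print(f"Q = {q:.2f} on {df} degrees of freedom, I^2 = {i2:.1f}%")
```

Q is compared against a chi-square distribution with k - 1 degrees of freedom (k being the number of studies), while I² expresses the proportion of variation across studies that is due to heterogeneity rather than chance.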
Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies - Study Quality Assessment Tools
Guidance for Assessing the Quality of Observational Cohort and Cross-Sectional Studies
The guidance document below is organized by question number from the tool for quality assessment of observational cohort and cross-sectional studies.
Question 1. Research question
Did the authors describe their goal in conducting this research? Is it easy to understand what they were looking to find? This issue is important for any scientific paper of any type. Higher quality scientific research explicitly defines a research question.
Questions 2 and 3. Study population
Did the authors describe the group of people from which the study participants were selected or recruited, using demographics, location, and time period? If you were to conduct this study again, would you know who to recruit, from where, and from what time period? Is the cohort population free of the outcomes of interest at the time they were recruited?
An example would be men over 40 years old with type 2 diabetes who began seeking medical care at Phoenix Good Samaritan Hospital between January 1, 1990 and December 31, 1994. In this example, the population is clearly described as: (1) who (men over 40 years old with type 2 diabetes); (2) where (Phoenix Good Samaritan Hospital); and (3) when (between January 1, 1990 and December 31, 1994). Another example is women ages 34 to 59 years of age in 1980 who were in the nursing profession and had no known coronary disease, stroke, cancer, hypercholesterolemia, or diabetes, and were recruited from the 11 most populous States, with contact information obtained from State nursing boards.
In cohort studies, it is crucial that the population at baseline is free of the outcome of interest. For example, the nurses' population above would be an appropriate group in which to study incident coronary disease. This information is usually found either in descriptions of population recruitment, definitions of variables, or inclusion/exclusion criteria.
You may need to look at prior papers on methods in order to make the assessment for this question. Those papers are usually in the reference list.
If fewer than 50% of eligible persons participated in the study, then there is concern that the study population does not adequately represent the target population. This increases the risk of bias.
Question 4. Groups recruited from the same population and uniform eligibility criteria
Were the inclusion and exclusion criteria developed prior to recruitment or selection of the study population? Were the same underlying criteria used for all of the subjects involved? This issue is related to the description of the study population, above, and you may find the information for both of these questions in the same section of the paper.
Most cohort studies begin with the selection of the cohort; participants in this cohort are then measured or evaluated to determine their exposure status. However, some cohort studies may recruit or select exposed participants in a different time or place than unexposed participants, especially retrospective cohort studies–which is when data are obtained from the past (retrospectively), but the analysis examines exposures prior to outcomes. For example, one research question could be whether diabetic men with clinical depression are at higher risk for cardiovascular disease than those without clinical depression. So, diabetic men with depression might be selected from a mental health clinic, while diabetic men without depression might be selected from an internal medicine or endocrinology clinic. This study recruits groups from different clinic populations, so this example would get a "no."
However, the women nurses described in the question above were selected based on the same inclusion/exclusion criteria, so that example would get a "yes."
Question 5. Sample size justification
Did the authors present their reasons for selecting or recruiting the number of people included or analyzed? Do they note or discuss the statistical power of the study? This question is about whether or not the study had enough participants to detect an association if one truly existed.
A paragraph in the methods section of the article may explain the sample size needed to detect a hypothesized difference in outcomes. You may also find a discussion of power in the discussion section (such as the study had 85 percent power to detect a 20 percent increase in the rate of an outcome of interest, with a 2-sided alpha of 0.05). Sometimes estimates of variance and/or estimates of effect size are given, instead of sample size calculations. In any of these cases, the answer would be "yes."
However, observational cohort studies often do not report anything about power or sample sizes because the analyses are exploratory in nature. In this case, the answer would be "no." This is not a "fatal flaw." It just may indicate that attention was not paid to whether the study was sufficiently sized to answer a prespecified question–i.e., it may have been an exploratory, hypothesis-generating study.
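To make the idea of a sample size justification concrete, here is a minimal sketch of one common textbook calculation: the number of participants needed per group to detect a difference between two outcome proportions with a two-sided test. The function name `n_per_group`, the equal group sizes, and the baseline rate of 10 percent are assumptions for illustration. The "20 percent increase" is read here as a relative increase (10 percent to 12 percent), which is one possible interpretation.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.85):
    """Approximate participants needed per group (equal group sizes assumed)
    to detect a difference between outcome proportions p1 and p2
    with a two-sided test, using the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical scenario: baseline outcome rate 10%, exposed rate 12%
print(n_per_group(0.10, 0.12))              # 85% power, two-sided alpha 0.05
print(n_per_group(0.10, 0.12, power=0.80))  # lower power needs fewer participants
```

A paper that reports a calculation along these lines, or states the power achieved for a given effect size, would generally support a "yes" for this question.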
Question 6. Exposure assessed prior to outcome measurement
This question is important because, in order to determine whether an exposure causes an outcome, the exposure must come before the outcome.
For some prospective cohort studies, the investigator enrolls the cohort and then determines the exposure status of various members of the cohort (large epidemiological studies like Framingham used this approach). However, for other cohort studies, the cohort is selected based on its exposure status, as in the example above of depressed diabetic men (the exposure being depression). Other examples include a cohort identified by its exposure to fluoridated drinking water and then compared to a cohort living in an area without fluoridated water, or a cohort of military personnel exposed to combat in the Gulf War compared to a cohort of military personnel not deployed in a combat zone.
With either of these types of cohort studies, the cohort is followed forward in time (i.e., prospectively) to assess the outcomes that occurred in the exposed members compared to nonexposed members of the cohort. Therefore, you begin the study in the present by looking at groups that were exposed (or not) to some biological or behavioral factor, intervention, etc., and then you follow them forward in time to examine outcomes. If a cohort study is conducted properly, the answer to this question should be "yes," since the exposure status of members of the cohort was determined at the beginning of the study before the outcomes occurred.
For retrospective cohort studies, the same principle applies. The difference is that, rather than identifying a cohort in the present and following them forward in time, the investigators go back in time (i.e., retrospectively) and select a cohort based on their exposure status in the past and then follow them forward to assess the outcomes that occurred in the exposed and nonexposed cohort members. Because in retrospective cohort studies the exposure and outcomes may have already occurred (it depends on how long they follow the cohort), it is important to make sure that the exposure preceded the outcome.
Sometimes cross-sectional studies are conducted (or cross-sectional analyses of cohort-study data), where the exposures and outcomes are measured during the same timeframe. As a result, cross-sectional analyses provide weaker evidence than regular cohort studies regarding a potential causal relationship between exposures and outcomes. For cross-sectional analyses, the answer to Question 6 should be "no."
Question 7. Sufficient timeframe to see an effect
Did the study allow enough time for a sufficient number of outcomes to occur or be observed, or enough time for an exposure to have a biological effect on an outcome? In the examples given above, if clinical depression has a biological effect on increasing risk for CVD, such an effect may take years. In the other example, if higher dietary sodium increases BP, a short timeframe may be sufficient to assess its association with BP, but a longer timeframe would be needed to examine its association with heart attacks.
The issue of timeframe is important to enable meaningful analysis of the relationships between exposures and outcomes to be conducted. This often requires at least several years, especially when looking at health outcomes, but it depends on the research question and outcomes being examined.
Cross-sectional analyses allow no time to see an effect, since the exposures and outcomes are assessed at the same time, so those would get a "no" response.
Question 8. Different levels of the exposure of interest
If the exposure can be defined as a range (examples: drug dosage, amount of physical activity, amount of sodium consumed), were multiple categories of that exposure assessed? (for example, for drugs: not on the medication, on a low dose, medium dose, high dose; for dietary sodium, higher than average U.S. consumption, lower than recommended consumption, between the two). Sometimes discrete categories of exposure are not used, but instead exposures are measured as continuous variables (for example, mg/day of dietary sodium or BP values).
In any case, studying different levels of exposure (where possible) enables investigators to assess trends or dose-response relationships between exposures and outcomes–e.g., the higher the exposure, the greater the rate of the health outcome. The presence of trends or dose-response relationships lends credibility to the hypothesis of causality between exposure and outcome.
For some exposures, however, this question may not be applicable (e.g., the exposure may be a dichotomous variable like living in a rural setting versus an urban setting, or vaccinated/not vaccinated with a one-time vaccine). If there are only two possible exposures (yes/no), then this question should be given an "NA," and it should not count negatively towards the quality rating.
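The dose-response idea in this question can be sketched with a few lines of arithmetic: compute the outcome risk at each exposure level, express each as a risk ratio against the lowest level, and check whether risk rises with exposure. The exposure labels, the counts, and the helper names below are all hypothetical. A monotonic pattern is only descriptive and is not a formal test for trend.

```python
# Hypothetical counts per exposure level: (label, outcome events, participants)
levels = [("low", 20, 1000), ("medium", 35, 1000), ("high", 60, 1000)]


def risks_by_level(levels):
    """Outcome risk (events / participants) for each exposure level."""
    return [(label, events / n) for label, events, n in levels]


def risk_ratios(levels):
    """Risk ratio of each level relative to the first (lowest) level."""
    risks = risks_by_level(levels)
    reference = risks[0][1]
    return [(label, risk / reference) for label, risk in risks]


def is_monotonic_increasing(levels):
    """True if risk strictly increases from each level to the next."""
    values = [risk for _, risk in risks_by_level(levels)]
    return all(a < b for a, b in zip(values, values[1:]))


for label, rr in risk_ratios(levels):
    print(f"{label}: risk ratio vs. lowest level = {rr:.2f}")
print("Risk rises with exposure:", is_monotonic_increasing(levels))
```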
Question 9. Exposure measures and assessment
Were the exposure measures defined in detail? Were the tools or methods used to measure exposure accurate and reliable–for example, have they been validated or are they objective? This issue is important as it influences confidence in the reported exposures. When exposures are measured with less accuracy or validity, it is harder to see an association between exposure and outcome even if one exists. Also as important is whether the exposures were assessed in the same manner within groups and between groups; if not, bias may result.
For example, retrospective self-report of dietary salt intake is not as valid and reliable as prospectively using a standardized dietary log plus testing participants' urine for sodium content. Another example is measurement of BP, where there may be quite a difference between usual care, where clinicians measure BP however it is done in their practice setting (which can vary considerably), and use of trained BP assessors using standardized equipment (e.g., the same BP device which has been tested and calibrated) and a standardized protocol (e.g., patient is seated for 5 minutes with feet flat on the floor, BP is taken twice in each arm, and all four measurements are averaged). In each of these cases, the former would get a "no" and the latter a "yes."
Here is a final example that illustrates the point about why it is important to assess exposures consistently across all groups: If people with higher BP (exposed cohort) are seen by their providers more frequently than those without elevated BP (nonexposed group), it also increases the chances of detecting and documenting changes in health outcomes, including CVD-related events. Therefore, it may lead to the conclusion that higher BP leads to more CVD events. This may be true, but it could also be due to the fact that the subjects with higher BP were seen more often; thus, more CVD-related events were detected and documented simply because they had more encounters with the health care system. Thus, it could bias the results and lead to an erroneous conclusion.
Question 10. Repeated exposure assessment
Was the exposure for each person measured more than once during the course of the study period? Multiple measurements with the same result increase our confidence that the exposure status was correctly classified. Also, multiple measurements enable investigators to look at changes in exposure over time, for example, people who ate high dietary sodium throughout the followup period, compared to those who started out high then reduced their intake, compared to those who ate low sodium throughout. Once again, this may not be applicable in all cases. In many older studies, exposure was measured only at baseline. However, multiple exposure measurements do result in a stronger study design.
Question 11. Outcome measures
Were the outcomes defined in detail? Were the tools or methods for measuring outcomes accurate and reliable–for example, have they been validated or are they objective? This issue is important because it influences confidence in the validity of study results. Also important is whether the outcomes were assessed in the same manner within groups and between groups.
An example of an outcome measure that is objective, accurate, and reliable is death–the outcome measured with more accuracy than any other. But even with a measure as objective as death, there can be differences in the accuracy and reliability of how death was assessed by the investigators. Did they base it on an autopsy report, death certificate, death registry, or report from a family member? Another example is a study of whether dietary fat intake is related to blood cholesterol level (cholesterol level being the outcome), and the cholesterol level is measured from fasting blood samples that are all sent to the same laboratory. These examples would get a "yes." An example of a "no" would be self-report by subjects that they had a heart attack, or self-report of how much they weigh (if body weight is the outcome of interest).
Similar to the example in Question 9, results may be biased if one group (e.g., people with high BP) is seen more frequently than another group (people with normal BP) because more frequent encounters with the health care system increases the chances of outcomes being detected and documented.
Question 12. Blinding of outcome assessors
Blinding means that outcome assessors did not know whether the participant was exposed or unexposed. It is also sometimes called "masking." The objective is to look for evidence in the article that the person(s) assessing the outcome(s) for the study (for example, examining medical records to determine the outcomes that occurred in the exposed and comparison groups) is masked to the exposure status of the participant. Sometimes the person measuring the exposure is the same person conducting the outcome assessment. In this case, the outcome assessor would most likely not be blinded to exposure status because they also took measurements of exposures. If so, make a note of that in the comments section.
As you assess this criterion, think about whether it is likely that the person(s) doing the outcome assessment would know (or be able to figure out) the exposure status of the study participants. If the answer is no, then blinding is adequate. An example of adequate blinding of the outcome assessors is to create a separate committee, whose members were not involved in the care of the patient and had no information about the study participants' exposure status. The committee would then be provided with copies of participants' medical records, which had been stripped of any potential exposure information or personally identifiable information. The committee would then review the records for prespecified outcomes according to the study protocol. If blinding was not possible, which is sometimes the case, mark "NA" and explain the potential for bias.
Question 13. Followup rate
Higher overall followup rates are always better than lower followup rates, even though higher rates are expected in shorter studies, whereas lower overall followup rates are often seen in studies of longer duration. Usually, an acceptable overall followup rate is considered 80 percent or more of participants whose exposures were measured at baseline. However, this is just a general guideline. For example, a 6-month cohort study examining the relationship between dietary sodium intake and BP level may have over 90 percent followup, but a 20-year cohort study examining effects of sodium intake on stroke may have only a 65 percent followup rate.
Question 14. Statistical analyses
Were key potential confounding variables measured and adjusted for, such as by statistical adjustment for baseline differences? Logistic regression or other regression methods are often used to account for the influence of variables not of interest.
This is a key issue in cohort studies, because statistical analyses need to control for potential confounders, in contrast to an RCT, where the randomization process controls for potential confounders. All key factors that may be associated both with the exposure of interest and the outcome–that are not of interest to the research question–should be controlled for in the analyses.
For example, in a study of the relationship between cardiorespiratory fitness and CVD events (heart attacks and strokes), the study should control for age, BP, blood cholesterol, and body weight, because all of these factors are associated both with low fitness and with CVD events. Well-done cohort studies control for multiple potential confounders.
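To show why adjustment matters, the sketch below compares a crude risk ratio with a Mantel-Haenszel risk ratio stratified by age group. Note that this is stratification, a simpler technique than the regression methods mentioned above, swapped in here because it needs no external library. The strata, the counts, and the function names are hypothetical and were constructed so that age fully explains the crude association.

```python
# Hypothetical strata: (exposed events, exposed total, unexposed events, unexposed total)
# "Exposed" stands for low fitness; the outcome is a CVD event.
strata = {
    "younger": (10, 200, 40, 800),
    "older": (160, 800, 40, 200),
}


def crude_risk_ratio(strata):
    """Risk ratio ignoring the stratifying variable (all strata collapsed)."""
    a = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    c = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata.values():
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator


print(f"Crude risk ratio: {crude_risk_ratio(strata):.3f}")
print(f"Age-adjusted (Mantel-Haenszel) risk ratio: {mantel_haenszel_risk_ratio(strata):.3f}")
```

In this constructed data set, the risk within each age stratum is the same for exposed and unexposed participants, so the apparent crude association disappears once age is taken into account.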
Some general guidance for determining the overall quality rating of observational cohort and cross-sectional studies
The questions on the form are designed to help you focus on the key concepts for evaluating the internal validity of a study. They are not intended to create a list that you simply tally up to arrive at a summary judgment of quality.
Internal validity for cohort studies is the extent to which the results reported in the study can truly be attributed to the exposure being evaluated and not to flaws in the design or conduct of the study–in other words, the ability of the study to draw associative conclusions about the effects of the exposures being studied on outcomes. Any such flaws can increase the risk of bias.
Critical appraisal involves considering the potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed throughout the questions above. High risk of bias translates to a rating of poor quality. Low risk of bias translates to a rating of good quality. (Thus, the greater the risk of bias, the lower the quality rating of the study.)
In addition, the more attention in the study design to issues that can help determine whether there is a causal relationship between the exposure and outcome, the higher quality the study. These include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, sufficient timeframe to see an effect, and appropriate control for confounding–all concepts reflected in the tool.
Generally, when you evaluate a study, you will not see a "fatal flaw," but you will find some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, you should ask yourself about the potential for bias in the study you are critically appraising. For any box where you check "no" you should ask, "What is the potential risk of bias resulting from this flaw in study design or execution?" That is, does this factor cause you to doubt the results that are reported in the study or doubt the ability of the study to accurately assess an association between exposure and outcome?
The best approach is to think about the questions in the tool and how each one tells you something about the potential for bias in a study. The more you familiarize yourself with the key concepts, the more comfortable you will be with critical appraisal. Examples of studies rated good, fair, and poor are useful, but each study must be assessed on its own based on the details that are reported and consideration of the concepts for minimizing bias.
Quality Assessment of Case-Control Studies - Study Quality Assessment Tools
Guidance for Assessing the Quality of Case-Control Studies
The guidance document below is organized by question number from the tool for quality assessment of case-control studies.
Question 1. Research question
Did the authors describe their goal in conducting this research? Is it easy to understand what they were looking to find? This issue is important for any scientific paper of any type. High quality scientific research explicitly defines a research question.
Question 2. Study population
Did the authors describe the group of individuals from which the cases and controls were selected or recruited, while using demographics, location, and time period? If the investigators conducted this study again, would they know exactly who to recruit, from where, and from what time period?
Investigators identify case-control study populations by location, time period, and inclusion criteria for cases (individuals with the disease, condition, or problem) and controls (individuals without the disease, condition, or problem). For example, the population for a study of lung cancer and chemical exposure would be all incident cases of lung cancer diagnosed in patients ages 35 to 79, from January 1, 2003 to December 31, 2008, living in Texas during that entire time period, as well as controls without lung cancer recruited from the same population during the same time period. The population is clearly described as: (1) who (men and women ages 35 to 79 with (cases) and without (controls) incident lung cancer); (2) where (living in Texas); and (3) when (between January 1, 2003 and December 31, 2008).
Other studies may use disease registries or data from cohort studies to identify cases. In these cases, the populations are individuals who live in the area covered by the disease registry or included in a cohort study (i.e., nested case-control or case-cohort). For example, a study of the relationship between vitamin D intake and myocardial infarction might use patients identified via the GRACE registry, a database of heart attack patients.
NHLBI staff encouraged reviewers to examine prior papers on methods (listed in the reference list) to make this assessment, if necessary.
Question 3. Target population and case representation
In order for a study to truly address the research question, the target population–the population from which the study population is drawn and to which study results are believed to apply–should be carefully defined. Some authors may compare characteristics of the study cases to characteristics of cases in the target population, either in text or in a table. When study cases are shown to be representative of cases in the appropriate target population, it increases the likelihood that the study was well-designed per the research question.
However, because these statistics are frequently difficult or impossible to measure, publications should not be penalized if case representation is not shown. For most papers, the response to question 3 will be "NR." Those subquestions are combined because the answer to the second subquestion–case representation–determines the response to this item. However, it cannot be determined without considering the response to the first subquestion. For example, if the answer to the first subquestion is "yes," and the second, "CD," then the response for item 3 is "CD."
Question 4. Sample size justification
Did the authors discuss their reasons for selecting or recruiting the number of individuals included? Did they discuss the statistical power of the study and provide a sample size calculation to ensure that the study is adequately powered to detect an association (if one exists)? This question does not refer to a description of the manner in which different groups were included or excluded using the inclusion/exclusion criteria (e.g., "Final study size was 1,378 participants after exclusion of 461 patients with missing data" is not considered a sample size justification for the purposes of this question).
An article's methods section usually contains information on sample size and the size needed to detect differences in exposures and on statistical power.
Question 5. Groups recruited from the same population
To determine whether cases and controls were recruited from the same population, one can ask hypothetically, "If a control were to develop the outcome of interest (the condition that was used to select cases), would that person have been eligible to become a case?" Case-control studies begin with the selection of the cases (those with the outcome of interest, e.g., lung cancer) and controls (those in whom the outcome is absent). Cases and controls are then evaluated and categorized by their exposure status. For the lung cancer example, cases and controls were recruited from hospitals in a given region. One may reasonably assume that controls in the catchment area for the hospitals, or those already in the hospitals for a different reason, would attend those hospitals if they became a case; therefore, the controls are drawn from the same population as the cases. If the controls were recruited or selected from a different region (e.g., a State other than Texas) or time period (e.g., 1991-2000), then the cases and controls were recruited from different populations, and the answer to this question would be "no."
The following example further explores selection of controls. In a study, eligible cases were men and women, ages 18 to 39, who were diagnosed with atherosclerosis at hospitals in Perth, Australia, between July 1, 2000 and December 31, 2007. Appropriate controls for these cases might be sampled using voter registration information for men and women ages 18 to 39, living in Perth (population-based controls); they also could be sampled from patients without atherosclerosis at the same hospitals (hospital-based controls). As long as the controls are individuals who would have been eligible to be included in the study as cases (if they had been diagnosed with atherosclerosis), then the controls were selected appropriately from the same source population as cases.
In a prospective case-control study, investigators may enroll individuals as cases at the time they are found to have the outcome of interest; the number of cases usually increases as time progresses. At this same time, they may recruit or select controls from the population without the outcome of interest. One way to identify or recruit cases is through a surveillance system. In turn, investigators can select controls from the population covered by that system. This is an example of population-based controls. Investigators also may identify and select cases from a cohort study population and identify controls from outcome-free individuals in the same cohort study. This is known as a nested case-control study.
Question 6. Inclusion and exclusion criteria prespecified and applied uniformly
Were the inclusion and exclusion criteria developed prior to recruitment or selection of the study population? Were the same underlying criteria used for all of the groups involved? To answer this question, reviewers determined if the investigators developed I/E criteria prior to recruitment or selection of the study population and if they used the same underlying criteria for all groups. The investigators should have used the same selection criteria, except for study participants who had the disease or condition, which would be different for cases and controls by definition. Therefore, the investigators use the same age (or age range), gender, race, and other characteristics to select cases and controls. Information on this topic is usually found in a paper's section on the description of the study population.
Question 7. Case and control definitions
For this question, reviewers looked for descriptions of the validity of case and control definitions and processes or tools used to identify study participants as such. Was a specific description of "case" and "control" provided? Is there a discussion of the validity of the case and control definitions and the processes or tools used to identify study participants as such? They determined if the tools or methods were accurate, reliable, and objective. For example, cases might be identified as "adult patients admitted to a VA hospital from January 1, 2000 to December 31, 2009, with an ICD-9 discharge diagnosis code of acute myocardial infarction and at least one of the two confirmatory findings in their medical records: at least 2 mm of ST elevation changes in two or more ECG leads and an elevated troponin level." Investigators might also use ICD-9 or CPT codes to identify patients. All cases should be identified using the same methods. Unless the distinction between cases and controls is accurate and reliable, investigators cannot use study results to draw valid conclusions.
Question 8. Random selection of study participants
If a case-control study did not use 100 percent of eligible cases and/or controls (e.g., not all disease-free participants were included as controls), did the authors indicate that random sampling was used to select controls? When it is possible to identify the source population fairly explicitly (e.g., in a nested case-control study, or in a registry-based study), then random sampling of controls is preferred. When investigators used consecutive sampling, which is frequently done for cases in prospective studies, then study participants are not considered randomly selected. In this case, the reviewers would answer "no" to Question 8. However, this would not be considered a fatal flaw.
If investigators included all eligible cases and controls as study participants, then reviewers marked "NA" in the tool. If 100 percent of cases were included (i.e., NA for cases) but only 50 percent of eligible controls, then the response would be "yes" if the controls were randomly selected, and "no" if they were not. If this cannot be determined, the appropriate response is "CD."
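The response rules for Question 8 can be summarized as a small decision function. This is only one reading of the guidance above, written as a sketch; the function name and its three arguments are hypothetical and do not come from the assessment tool.

```python
from typing import Optional


def question8_response(all_cases_included: bool,
                       all_controls_included: bool,
                       sampled_randomly: Optional[bool]) -> str:
    """Return "NA", "yes", "no", or "CD" for Question 8.

    sampled_randomly describes the group(s) that were only partly included:
    True  -- the paper reports random sampling
    False -- sampling was not random (e.g., consecutive sampling)
    None  -- the paper does not say how participants were selected
    """
    # Every eligible case and control took part, so sampling is not an issue.
    if all_cases_included and all_controls_included:
        return "NA"
    # Some eligible participants were left out, but the method is not reported.
    if sampled_randomly is None:
        return "CD"
    return "yes" if sampled_randomly else "no"


# All cases included, half of the eligible controls selected at random.
print(question8_response(True, False, True))   # yes
```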
Question 9. Concurrent controls
A concurrent control is a control selected at the time another person became a case, usually on the same day. This means that one or more controls are recruited or selected from the population without the outcome of interest at the time a case is diagnosed. Investigators can use this method in both prospective case-control studies and retrospective case-control studies. For example, in a retrospective study of adenocarcinoma of the colon using data from hospital records, if hospital records indicate that Person A was diagnosed with adenocarcinoma of the colon on June 22, 2002, then investigators would select one or more controls from the population of patients without adenocarcinoma of the colon on that same day. This assumes they conducted the study retrospectively, using data from hospital records. The investigators could have also conducted this study using patient records from a cohort study, in which case it would be a nested case-control study.
Investigators can use concurrent controls in the presence or absence of matching and vice versa. A study that uses matching does not necessarily mean that concurrent controls were used.
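To make the idea of concurrent controls concrete, the sketch below picks controls for each case from the people who were still free of the outcome on the case's diagnosis date. It is a minimal illustration, not the procedure of any particular study. The record layout (a mapping from a participant ID to a diagnosis date, or `None` if never diagnosed), the function name, and the fixed random seed are assumptions. It also assumes that a person diagnosed later may serve as a control for an earlier case, since that person was outcome-free at the time.

```python
import random
from datetime import date
from typing import Dict, List, Optional


def select_concurrent_controls(diagnosis_dates: Dict[str, Optional[date]],
                               controls_per_case: int = 1,
                               seed: int = 0) -> Dict[str, List[str]]:
    """For each case, sample controls who were outcome-free on the diagnosis date."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    selected: Dict[str, List[str]] = {}
    cases = sorted((d, pid) for pid, d in diagnosis_dates.items() if d is not None)
    for case_date, case_id in cases:
        # Eligible: never diagnosed, or diagnosed only after this case's date.
        eligible = sorted(pid for pid, d in diagnosis_dates.items()
                          if pid != case_id and (d is None or d > case_date))
        k = min(controls_per_case, len(eligible))
        selected[case_id] = rng.sample(eligible, k)
    return selected


# Hypothetical records: Person A is diagnosed on June 22, 2002.
records = {
    "A": date(2002, 6, 22),
    "B": date(2003, 1, 10),
    "C": None,
    "D": None,
}
print(select_concurrent_controls(records, controls_per_case=2))
```

Matching on age, sex, or other variables would be a separate filter applied to the eligible list; as the text notes, concurrent selection and matching are independent design choices.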
Question 10. Exposure assessed prior to outcome measurement
Investigators first determine case or control status (based on presence or absence of outcome of interest), and then assess exposure history of the case or control; therefore, reviewers ascertained that the exposure preceded the outcome. For example, if the investigators used tissue samples to determine exposure, did they collect them from patients prior to their diagnosis? If hospital records were used, did investigators verify that the date a patient was exposed (e.g., received medication for atherosclerosis) occurred prior to the date they became a case (e.g., was diagnosed with type 2 diabetes)? For an association between an exposure and an outcome to be considered causal, the exposure must have occurred prior to the outcome.
Question 11. Exposure measures and assessment
Were the exposure measures defined in detail? Were the tools or methods used to measure exposure accurate and reliable–for example, have they been validated or are they objective? This is important, as it influences confidence in the reported exposures. Equally important is whether the exposures were assessed in the same manner within groups and between groups. This question pertains to bias resulting from exposure misclassification (i.e., exposure ascertainment).
For example, a retrospective self-report of dietary salt intake is not as valid and reliable as prospectively using a standardized dietary log plus testing participants' urine for sodium content because participants' retrospective recall of dietary salt intake may be inaccurate and result in misclassification of exposure status. Similarly, BP results from practices that use an established protocol for measuring BP would be considered more valid and reliable than results from practices that did not use standard protocols. A protocol may include using trained BP assessors, standardized equipment (e.g., the same BP device which has been tested and calibrated), and a standardized procedure (e.g., patient is seated for 5 minutes with feet flat on the floor, BP is taken twice in each arm, and all four measurements are averaged).
Question 12. Blinding of exposure assessors
Blinding or masking means that outcome assessors did not know whether participants were exposed or unexposed. To answer this question, reviewers examined articles for evidence that the outcome assessor(s) was masked to the exposure status of the research participants. An outcome assessor, for example, may examine medical records to determine the outcomes that occurred in the exposed and comparison groups. Sometimes the person measuring the exposure is the same person conducting the outcome assessment. In this case, the outcome assessor would most likely not be blinded to exposure status. A reviewer would note such a finding in the comments section of the assessment tool.
One way to ensure good blinding of exposure assessment is to have a separate committee, whose members have no information about the study participants' status as cases or controls, review research participants' records. To help answer the question above, reviewers determined if it was likely that the outcome assessor knew whether the study participant was a case or control. If it was unlikely, then the reviewers marked "no" to Question 12. Outcome assessors who used medical records to assess exposure should not have been directly involved in the study participants' care, since they probably would have known about their patients' conditions. If the medical records contained information on the patient's condition that identified him/her as a case (which is likely), that information would have had to be removed before the exposure assessors reviewed the records.
If blinding was not possible, which sometimes happens, the reviewers marked "NA" in the assessment tool and explained the potential for bias.
Question 13. Statistical analysis
Were key potential confounding variables measured and adjusted for, such as by statistical adjustment for baseline differences? Investigators often use logistic regression or other regression methods to account for the influence of variables not of interest.
This is a key issue in case-controlled studies; statistical analyses need to control for potential confounders, in contrast to RCTs in which the randomization process controls for potential confounders. In the analysis, investigators need to control for all key factors that may be associated with both the exposure of interest and the outcome and are not of interest to the research question.
A study of the relationship between smoking and CVD events illustrates this point. Such a study needs to control for age, gender, and body weight; all are associated with smoking and CVD events. Well-done case-control studies control for multiple potential confounders.
Matching is a technique used to improve study efficiency and control for known confounders. For example, in the study of smoking and CVD events, an investigator might identify cases that have had a heart attack or stroke and then select controls of similar age, gender, and body weight to the cases. For case-control studies, it is important that if matching was performed during the selection or recruitment process, the variables used as matching criteria (e.g., age, gender, race) should be controlled for in the analysis.
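The effect of adjusting for a confounder can be shown with a small worked example. The text mentions logistic regression; the sketch below instead uses the simpler Mantel-Haenszel stratified odds ratio, which needs no external libraries. The counts are invented purely for illustration (two age strata in a hypothetical study of smoking and CVD events), and the function names are assumptions.

```python
from typing import List, Tuple

# Each stratum: (exposed cases, exposed controls, unexposed cases, unexposed controls)
Table = Tuple[int, int, int, int]


def crude_odds_ratio(strata: List[Table]) -> float:
    """Odds ratio from the single 2x2 table obtained by ignoring the strata."""
    a = sum(t[0] for t in strata)
    b = sum(t[1] for t in strata)
    c = sum(t[2] for t in strata)
    d = sum(t[3] for t in strata)
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata: List[Table]) -> float:
    """Mantel-Haenszel summary odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts: younger stratum, then older stratum.
strata = [(10, 20, 40, 80), (60, 30, 20, 10)]

print(round(crude_odds_ratio(strata), 2))            # 2.1
print(round(mantel_haenszel_odds_ratio(strata), 2))  # 1.0
```

In this made-up data the odds ratio is 1.0 within each age stratum, yet the unadjusted odds ratio is 2.1 because the older stratum contains more smokers and more cases. The apparent association comes entirely from age, which is the kind of distortion Question 13 asks reviewers to check for.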
General Guidance for Determining the Overall Quality Rating of Case-Controlled Studies
NHLBI designed the questions in the assessment tool to help reviewers focus on the key concepts for evaluating a study's internal validity, not to use as a list from which to add up items to judge a study's quality.
Internal validity for case-control studies is the extent to which the associations between disease and exposure reported in the study can truly be attributed to the exposure being evaluated rather than to flaws in the design or conduct of the study. In other words, what is the ability of the study to draw associative conclusions about the effects of the exposures on outcomes? Any such flaws can increase the risk of bias.
In critically appraising a study, the following factors need to be considered: risk of potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed in the questions above. High risk of bias translates to a poor quality rating; low risk of bias translates to a good quality rating. Again, the greater the risk of bias, the lower the quality rating of the study.
In addition, the more attention in the study design to issues that can help determine whether there is a causal relationship between the outcome and the exposure, the higher the quality of the study. These include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, sufficient timeframe to see an effect, and appropriate control for confounding–all concepts reflected in the tool.
If a study has a "fatal flaw," then risk of bias is significant; therefore, the study is deemed to be of poor quality. An example of a fatal flaw in case-control studies is a lack of a consistent standard process used to identify cases and controls.
Generally, when reviewers evaluated a study, they did not see a "fatal flaw," but instead found some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, reviewers examined the potential for bias in the study. For any box checked "no," reviewers asked, "What is the potential risk of bias resulting from this flaw in study design or execution?" That is, did this factor lead to doubt about the results reported in the study or the ability of the study to accurately assess an association between exposure and outcome?
By examining questions in the assessment tool, reviewers were best able to assess the potential for bias in a study. Specific rules were not useful, as each study had specific nuances. In addition, being familiar with the key concepts helped reviewers assess the studies. Examples of studies rated good, fair, and poor were useful, yet each study had to be assessed on its own.
Quality Assessment Tool for Before-After (Pre-Post) Studies With No Control Group - Study Quality Assessment Tools
Guidance for Assessing the Quality of Before-After (Pre-Post) Studies With No Control Group
Question 1. Study question
Question 2. Eligibility criteria and study population
Did the authors describe the eligibility criteria applied to the individuals from whom the study participants were selected or recruited? In other words, if the investigators were to conduct this study again, would they know whom to recruit, from where, and from what time period?
Here is a sample description of a study population: men over age 40 with type 2 diabetes, who began seeking medical care at Phoenix Good Samaritan Hospital, between January 1, 2005 and December 31, 2007. The population is clearly described as: (1) who (men over age 40 with type 2 diabetes); (2) where (Phoenix Good Samaritan Hospital); and (3) when (between January 1, 2005 and December 31, 2007). Another sample description is women who were in the nursing profession, who were ages 34 to 59 in 1995, had no known CHD, stroke, cancer, hypercholesterolemia, or diabetes, and were recruited from the 11 most populous States, with contact information obtained from State nursing boards.
To assess this question, reviewers examined prior papers on study methods (listed in reference list) when necessary.
Question 3. Study participants representative of clinical populations of interest
The participants in the study should be generally representative of the population in which the intervention will be broadly applied. Studies on small demographic subgroups may raise concerns about how the intervention will affect broader populations of interest. For example, interventions that focus on very young or very old individuals may affect middle-aged adults differently. Similarly, researchers may not be able to extrapolate study results from patients with severe chronic diseases to healthy populations.
Question 4. All eligible participants enrolled
To further explore this question, reviewers may need to ask: Did the investigators develop the I/E criteria prior to recruiting or selecting study participants? Were the same underlying I/E criteria used for all research participants? Were all subjects who met the I/E criteria enrolled in the study?
Question 5. Sample size
Did the authors present their reasons for selecting or recruiting the number of individuals included or analyzed? Did they note or discuss the statistical power of the study? This question addresses whether there was a sufficient sample size to detect an association, if one did exist.
An article's methods section may provide information on the sample size needed to detect a hypothesized difference in outcomes and a discussion on statistical power (such as, the study had 85 percent power to detect a 20 percent increase in the rate of an outcome of interest, with a 2-sided alpha of 0.05). Sometimes estimates of variance and/or estimates of effect size are given, instead of sample size calculations. In any case, if the reviewers determined that the power was sufficient to detect the effects of interest, then they would answer "yes" to Question 5.
Question 6. Intervention clearly described
Another pertinent question regarding interventions is: Was the intervention clearly defined in detail in the study? Did the authors indicate that the intervention was consistently applied to the subjects? Did the research participants have a high level of adherence to the requirements of the intervention? For example, if the investigators assigned a group to 10 mg/day of Drug A, did most participants in this group take the specific dosage of Drug A? Or did a large percentage of participants end up not taking the specific dose of Drug A indicated in the study protocol?
Reviewers ascertained that changes in study outcomes could be attributed to study interventions. If participants received interventions that were not part of the study protocol and could affect the outcomes being assessed, the results could be biased.
Question 7. Outcome measures clearly described, valid, and reliable
Were the outcomes defined in detail? Were the tools or methods for measuring outcomes accurate and reliable–for example, have they been validated or are they objective? This question is important because the answer influences confidence in the validity of study results.
An example of an outcome measure that is objective, accurate, and reliable is death–the outcome measured with more accuracy than any other. But even with a measure as objective as death, differences can exist in the accuracy and reliability of how investigators assessed death. For example, did they base it on an autopsy report, death certificate, death registry, or report from a family member? Another example of a valid study is one whose objective is to determine if dietary fat intake affects blood cholesterol level (cholesterol level being the outcome) and in which the cholesterol level is measured from fasting blood samples that are all sent to the same laboratory. These examples would get a "yes."
An example of a "no" would be self-report by subjects that they had a heart attack, or self-report of how much they weigh (if body weight is the outcome of interest).
Question 8. Blinding of outcome assessors
Blinding or masking means that the outcome assessors did not know whether the participants received the intervention or were exposed to the factor under study. To answer the question above, the reviewers examined articles for evidence that the person(s) assessing the outcome(s) was masked to the participants' intervention or exposure status. An outcome assessor, for example, may examine medical records to determine the outcomes that occurred in the exposed and comparison groups. Sometimes the person applying the intervention or measuring the exposure is the same person conducting the outcome assessment. In this case, the outcome assessor would not likely be blinded to the intervention or exposure status. A reviewer would note such a finding in the comments section of the assessment tool.
In assessing this criterion, the reviewers determined whether it was likely that the person(s) conducting the outcome assessment knew the exposure status of the study participants. If not, then blinding was adequate. An example of adequate blinding of the outcome assessors is to create a separate committee whose members were not involved in the care of the patient and had no information about the study participants' exposure status. Using a study protocol, committee members would review copies of participants' medical records, which would be stripped of any potential exposure information or personally identifiable information, for prespecified outcomes.
Question 9. Followup rate
Higher overall followup rates are always preferable to lower ones, although higher rates are expected in shorter studies, and lower overall followup rates are often seen in longer studies. Usually, an overall followup rate of 80 percent or more of the participants whose interventions or exposures were measured at baseline is considered acceptable. However, this is only a general guideline.
In accounting for those lost to followup, in the analysis, investigators may have imputed values of the outcome for those lost to followup or used other methods. For example, they may carry forward the baseline value or the last observed value of the outcome measure and use these as imputed values for the final outcome measure for research participants lost to followup.
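The two ideas in this question, the followup rate and carrying a value forward for participants lost to followup, can be sketched as follows. The function names and the example values are hypothetical; the 80 percent threshold is the general guideline quoted above, not a fixed rule.

```python
from typing import List, Optional


def followup_rate(n_baseline: int, n_with_final_outcome: int) -> float:
    """Share of participants measured at baseline who also have a final outcome."""
    return n_with_final_outcome / n_baseline


def carry_forward(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill missing visits with the most recent observed value.

    If only the baseline value was observed, the baseline is carried forward.
    Leading missing values (nothing observed yet) stay as None.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


rate = followup_rate(n_baseline=200, n_with_final_outcome=150)
print(rate, rate >= 0.80)                      # 0.75 False
print(carry_forward([120, 118, None, None]))   # [120, 118, 118, 118]
```

Carrying values forward keeps every participant in the analysis, but it assumes no change after the last observation, so reviewers still need to judge how much bias the losses could introduce.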
Question 10. Statistical analysis
Were formal statistical tests used to assess the significance of the changes in the outcome measures between the before and after time periods? The reported study results should present values for statistical tests, such as p values, to document the statistical significance (or lack thereof) for the changes in the outcome measures found in the study.
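As an illustration of a formal test for a before-after change, the sketch below computes a two-sided p value with an exact sign-flip permutation test on the paired differences. This is a stand-in chosen because it needs only the standard library; published studies more often report a paired t test or a Wilcoxon signed-rank test. The function name and the blood pressure values are invented for illustration, and the method enumerates every sign pattern, so it is practical only for small samples.

```python
from itertools import product
from typing import Sequence


def paired_permutation_p(before: Sequence[float], after: Sequence[float]) -> float:
    """Two-sided exact p value for the mean paired change (sign-flip test)."""
    diffs = [a - b for b, a in zip(before, after)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Under the null hypothesis of no change, each difference is equally
    # likely to be positive or negative, so every sign pattern is tried.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical systolic BP (mm Hg) for six participants, before and after.
before = [150, 142, 138, 160, 155, 148]
after = [140, 135, 130, 150, 149, 141]
print(paired_permutation_p(before, after))  # 0.03125
```

Every participant improved, so only the two sign patterns with all differences in the same direction are as extreme as the observed change, giving 2 out of 64.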
Question 11. Multiple outcome measures
Were the outcome measures for each person measured more than once during the course of the before and after study periods? Multiple measurements with the same result increase confidence that the outcomes were accurately measured.
Question 12. Group-level interventions and individual-level outcome efforts
Group-level interventions are usually not relevant for clinical interventions such as bariatric surgery, in which the interventions are applied at the individual patient level. In those cases, the questions were coded as "NA" in the assessment tool.
General Guidance for Determining the Overall Quality Rating of Before-After Studies
The questions in the quality assessment tool were designed to help reviewers focus on the key concepts for evaluating the internal validity of a study. They are not intended to create a list from which to add up items to judge a study's quality.
Internal validity is the extent to which the outcome results reported in the study can truly be attributed to the intervention or exposure being evaluated, and not to biases, measurement errors, or other confounding factors that may result from flaws in the design or conduct of the study. In other words, what is the ability of the study to draw associative conclusions about the effects of the interventions or exposures on outcomes?
Critical appraisal of a study involves considering the risk of potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues throughout the questions above. High risk of bias translates to a rating of poor quality; low risk of bias translates to a rating of good quality. Again, the greater the risk of bias, the lower the quality rating of the study.
In addition, the more attention in the study design to issues that can help determine if there is a causal relationship between the exposure and outcome, the higher the quality of the study. These issues include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, and sufficient timeframe to see an effect.
Generally, when reviewers evaluate a study, they will not see a "fatal flaw," but instead will find some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, reviewers should ask themselves about the potential for bias in the study they are critically appraising. For any box checked "no," reviewers should ask, "What is the potential risk of bias resulting from this flaw in study design or execution?" That is, does this factor lead to doubt about the results reported in the study or doubt about the ability of the study to accurately assess an association between the intervention or exposure and the outcome?
The best approach is to think about the questions in the assessment tool and how each one reveals something about the potential for bias in a study. Specific rules are not useful, as each study has specific nuances. In addition, being familiar with the key concepts will help reviewers be more comfortable with critical appraisal. Examples of studies rated good, fair, and poor are useful, but each study must be assessed on its own.
Quality Assessment Tool for Case Series Studies - Study Quality Assessment Tools
Last updated: July, 2021
Krina T. Zondervan, Lon R. Cardon, Stephen H. Kennedy, What makes a good case–control study? Design issues for complex traits such as endometriosis, Human Reproduction , Volume 17, Issue 6, 1 June 2002, Pages 1415–1423, https://doi.org/10.1093/humrep/17.6.1415
The combined investigation of environmental and genetic risk-factors in complex traits will refocus attention on the case–control study. Endometriosis is an example of a complex trait for which most case–control studies have not followed the basic criteria of epidemiological study design. Appropriate control selection has been a particular problem. This article reviews the principles underlying the design of case–control studies, and their application to the study of endometriosis. Only if it is designed well is the case–control study a suitable alternative to the prospective cohort study. Use of newly diagnosed over prevalent cases is preferable, as the latter may alter risk estimates and complicate the interpretation of findings. Controls should be selected from the source population from which cases arose. Potential confounding should be addressed both in studies of environmental and genetic factors. For endometriosis, a possible design would be to: (i) use newly diagnosed cases with 'endometriotic' disease; (ii) collect information predating symptom onset; and (iii) use at least one population-based female control group matched on unadjustable confounders and screened for pelvic symptoms. In conclusion, future studies of complex traits such as endometriosis will have to incorporate both environmental and genetic factors. Only adequately designed studies will allow reliable results to be obtained and any true aetiologic heterogeneity expected to underlie a complex trait to be detected.
Investigating the aetiology of complex traits represents a major challenge. The multiple genetic and environmental factors they are caused by are likely to have only modest effect sizes that will vary across populations (Cardon and Bell, 2001). The need to incorporate environmental factors in analyses has refocused attention on traditional epidemiological study designs such as the case–control study (Clayton and McKeigue, 2001). In genetic research settings, concerns for analytical problems such as confounding by population of origin (population stratification) have in the past brought this type of study into disrepute. In epidemiological research, however, similar problems of confounding are regarded as mostly related to poor study design in terms of case and control selection.
Endometriosis is an example of a complex trait with several additional features complicating epidemiological study design, such as the lack of consensus about its precise definition and the need for an invasive procedure to establish the diagnosis. The condition is broadly defined as the presence of endometrial-like tissue outside the uterine cavity, associated with symptoms of dysmenorrhoea, dyspareunia, chronic pelvic pain and subfertility. It can only be diagnosed with certainty on histological examination. Disease severity has traditionally been classified using the revised American Fertility Society (rAFS) system into four stages (minimal to severe) on the basis of observed implant size, presence of cysts and adhesion formation (American Fertility Society, 1985). However, minimal or mild endometriosis is increasingly viewed as part of a normal physiological process, whereas the more severe forms (ovarian cysts and deeply infiltrating lesions) are considered 'endometriotic disease' (Koninckx et al., 1999). Thus, endometriosis can be seen as a continuum that is only considered pathological when a certain threshold of severity has been reached. This is similar to many other conditions that are regarded as quantitative traits with a threshold of clinical relevance, such as obesity or various psychological disorders.
Because of the many difficulties inherent in the epidemiological study of endometriosis, Holt and Weiss recently published some excellent recommendations for study design (Holt and Weiss, 2000). They stressed the importance of using a standard definition, and discussed the implications of selecting cases from various source populations. We wish to build on their recommendations by demonstrating how the principles of a well-designed case–control study can be applied in the investigation of both genetic and environmental risk-factors for endometriosis. We note that most of the points raised are not confined to the study of endometriosis, but are important in case–control studies of any complex trait.
Because of the need for a surgical diagnosis, the prevalence of endometriosis in the general population is unknown. Estimates from asymptomatic fertile subpopulations undergoing tubal ligation have varied greatly, from 0.7 to 43% around a mean of 4% (Eskenazi and Warner, 1997). However, up to 90% of these women were diagnosed with minimal or mild endometriosis.
The main aetiological hypothesis for endometriosis is retrograde menstruation (Sampson, 1927). However, retrograde menstruation has been observed in up to 90% of women (Halme et al., 1984), which implies that other factors must also be involved. Inevitably, the need for a surgical diagnosis has limited studies investigating risk-factors for endometriosis, since they have to be based on selected patient samples.
The evidence that endometriosis is a complex trait is highly suggestive. Reviews of environmental risk-factors, researched independently from genetic factors, have implicated prolonged and heavy menstruation and increased exposure to estrogen ( Mangtani and Booth, 1993 ; Eskenazi and Warner, 1997 ). Many of these studies failed to take account of basic epidemiological principles in their design. Of 100 studies of environmental risk-factors reviewed by Eskenazi and Warner, only six met the following basic criteria for adequate study design: (i) cohort or case–control design; (ii) surgically confirmed cases; (iii) clearly described criteria for control selection; and (iv) adjustment for confounding factors in the analysis ( Eskenazi and Warner, 1997 ). In a search for studies published since then, we have found only two more studies that conformed to these criteria ( Signorello et al ., 1997 ; Pauwels et al ., 2001 ). The total of eight studies, seven of which were of case–control design, varied widely in terms of case definition and control selection (Table I ). Apart from generally consistent associations with increasing age and prolonged menstruation, other findings such as for smoking, exercise, body mass index, parity and tampon use were either inconsistent or simply not tested in more than one study ( Eskenazi and Warner, 1997 ). Exposure to 2,3,7,8-tetrachlorodibenzo-p-dioxin has been implicated in primate studies ( Rier et al ., 1993 ), but evidence for a role in human endometriosis is limited ( Mayani et al ., 1997 ; Pauwels et al ., 2001 ).
Two recent reviews have discussed the evidence for a genetic aetiology of endometriosis ( Bischoff and Simpson, 2000 ; Zondervan et al ., 2001a ). Genetic factors were implicated by a large twin study, in which 51% of the variance of susceptibility to endometriosis was attributed to genes ( Treloar et al ., 1999 ), and by four case–control studies showing that the first-degree relatives of affected women are at 3–9 times increased risk of developing the disease compared with first-degree relatives of controls ( Simpson et al ., 1980 ; Lamb et al ., 1986 ; Coxhead and Thomas, 1993 ; Moen and Magnus, 1993 ). There have been 11 case–control studies (Table II ) that have assessed the influence of specific genetic variants (`functional' candidate genes), mainly focusing on genes involved in detoxification ( GSTM1 , GSTT1 , NAT2 ), galactose metabolism ( GALT ), differential expression of hormone receptors ( ESR1 ) and immunological dysfunction ( IL-1β ). Case–control studies of genetic variants have been substantially smaller than those of environmental factors, and generally lack the power required to detect the moderate effect sizes likely to apply to complex traits such as endometriosis. Most did not comply with the criteria of basic epidemiological study design described above, in particular that of appropriate control selection and adjustment for potential confounders ( Zondervan et al ., 2001a ). This may be due to a misconception that these principles have been developed specifically for studies of environmental factors, which are often difficult to measure, change over time, and the collection of which may be subject to information bias. Nevertheless, as will be discussed in this paper, the choice of case and control selection can also have a profound effect on the results of candidate gene studies.
In the following few paragraphs we briefly discuss the principles of the case–control study and the reasons why appropriate selection of cases, and in particular of controls, is so important.
A case–control study aims to derive a risk estimate for a particular factor of exposure (environmental or genetic) that is as close as possible to the estimate that would have been derived had a prospective cohort study been performed. Cohort studies (in which two or more groups of people free of the disease of interest but different in terms of exposures are followed to investigate who develops the disease) are the ‘gold standard’ for risk-factor analysis, because they allow the collection of unbiased risk-factor information. When a cohort study is unfeasible, the case–control study (in which exposures are compared between groups of people with and without the disease of interest) can be a good alternative, provided cases and controls are selected appropriately.
The above translates into the following general principles for designing a case–control study ( Rothman and Greenland, 1998a ). Firstly, cases should be incident (newly arising) cases that are recruited prospectively from a certain population during the time period of the study, and for whom risk-factor information is collected retrospectively. Secondly, as each new case arises, one or more controls are sampled from the same population and their risk-factor information is collected. Every time a control is selected, he/she is not removed from the sampling population but remains eligible for future sampling either as a control or case. This means that theoretically, an unaffected individual could be selected more than once as a control, and could subsequently be selected as a case if he/she develops the disease later in the study period (although in practice the likelihood of this scenario is small).
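The sampling scheme described above is often called risk-set (or incidence density) sampling, a label that is not used in the text itself. The following is a minimal Python sketch of it, assuming a hypothetical toy population in which each person has either a known onset time or no onset during the study period. The function name `risk_set_sample`, the person identifiers and the onset times are all illustrative assumptions.

```python
import random

def risk_set_sample(onset_time, controls_per_case=1, seed=0):
    """Sample controls for each case from those still disease-free at case onset.

    onset_time maps person -> time of disease onset, or None if the person
    stays unaffected during the study period (hypothetical input format).
    """
    rng = random.Random(seed)
    # Cases in order of onset.
    cases = sorted((t, p) for p, t in onset_time.items() if t is not None)
    matched_sets = []
    for t, case in cases:
        # Everyone not yet affected at time t is eligible, including
        # people who will become cases later in the study period.
        at_risk = [p for p, onset in onset_time.items()
                   if p != case and (onset is None or onset > t)]
        k = min(controls_per_case, len(at_risk))
        # Sampled controls are NOT removed from onset_time, so the same
        # person can be drawn again for a later case.
        matched_sets.append((case, rng.sample(at_risk, k)))
    return matched_sets

# Hypothetical population: p1 and p3 develop disease, the others do not.
cohort = {"p1": 2.0, "p2": None, "p3": 5.0, "p4": None, "p5": None}
sets = risk_set_sample(cohort, controls_per_case=2)
```

In this toy example p3 is eligible as a control for p1 (whose onset is earlier) and is later included as a case, which mirrors the scenario the text describes as theoretically possible but rare in practice.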
Incident versus prevalent cases
The main problem in applying the above guidelines often lies with the identification of incident cases. Endometriosis is no exception. Not only is there a substantial delay between symptom onset and diagnosis ( Hadfield et al ., 1996 ), but we also do not know what pathological changes qualify as ‘onset’ of disease, nor do we have the means to measure these changes as they occur. For most non-infectious conditions, the term ‘incident’ is a rather artificial concept. Disease onset is more appropriately viewed as a continuum of biological changes which, once a certain threshold has been reached, is considered clinically relevant and termed ‘disease’.
In practice, newly diagnosed cases are usually taken to be ‘incident’. Their use is highly preferable to that of prevalent cases (i.e. all those in a population having the disease at a specific point in time irrespective of time since diagnosis) because it minimizes the chance that an observed effect of an environmental risk-factor is the result of diagnosis. One can easily imagine this for behavioural risk-factors: a person may be more likely to stop smoking or change exercise frequency because of a certain diagnosis. Equally, a person may change habits because of the onset of symptoms, a time-point which could pre-date diagnosis considerably. In data collection every effort must therefore be made to determine environmental exposures prior to the onset of disease symptoms.
Even the use of newly diagnosed cases with risk-factor information pre-dating symptom onset cannot circumvent the problem that certain exposures may have changed as a result of subclinical disease. For example, heavy menstrual bleeding that pre-dated symptom onset in endometriosis may imply a role in its causation, but could also be a result of early physiological changes associated with the condition. Such possible explanations should be borne in mind when interpreting the study results.
( Rothman and Greenland, 1998a )
Where A = number of affecteds and N = number of individuals in the population from which affecteds are sampled.
In this situation, the OR only provides a reasonable approximation of the relative risk that would have been obtained in a cohort study if the disease is relatively rare (general rule of thumb: prevalence A/N <20%). For diseases more common than this, an increase in risk associated with an exposure will produce inflated ORs ( Kirkwood, 1988 ).
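The rare-disease rule of thumb can be illustrated with a small numerical sketch. The Python below compares the relative risk and the odds ratio computed from the same 2×2 table for a rare and a common outcome. All counts are hypothetical and chosen only so that the relative risk is 2.0 in both scenarios. The example uses full cohort-style tables for simplicity and is not a reconstruction of any formula from the paper.

```python
def relative_risk(a, b, c, d):
    """a, b: exposed with / without disease; c, d: unexposed with / without disease."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio of the same 2x2 table."""
    return (a * d) / (b * c)

# Hypothetical counts per 1000 exposed and 1000 unexposed people.
rare = (20, 980, 10, 990)      # risk 2% versus 1%
common = (600, 400, 300, 700)  # risk 60% versus 30%

rr_rare, or_rare = relative_risk(*rare), odds_ratio(*rare)
rr_common, or_common = relative_risk(*common), odds_ratio(*common)
```

With these assumed counts the odds ratio for the rare outcome is about 2.02, close to the relative risk of 2.0, whereas for the common outcome it is 3.5, well above the same relative risk.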
The main advantage of studying candidate genes rather than environmental factors is that the exposure of interest usually remains constant (except for situations in which differential expression at different times in life occurs, or somatic mutations influence outcome). Genetic factors cannot be a result of symptom onset or diagnosis. However, many of the concerns raised also apply to studies of genetic risk-factors. Every effort must be made to sample controls from the source population from which cases arose. The choice between incident and prevalent cases also has consequences. Sampling of prevalent cases will provide a mixture that is skewed towards individuals who have had the condition longer ( Freeman and Hutchison, 1980 ). Different genetic factors may be found in studies using prevalent cases than in those using incident cases: the former could be more important for disease maintenance, whereas the latter could relate more to the onset of disease. Prior hypotheses for the disease model could help in the choice of incident versus prevalent cases, and using duration of the disease as a co-variate in the analysis may also be of benefit.
Case definition in endometriosis research
As shown in Tables I and II , many different definitions of endometriosis have been used, most of which appear to be based on opportunistic groups of prevalent cases that were seen in the study clinic at the time. Holt and Weiss rightly commented that for study results to become comparable, a standard definition has to be used ( Holt and Weiss, 2000 ). The rAFS classification system ( American Fertility Society, 1985 ) is not particularly suited to this purpose, as it was principally designed to categorize women according to the probability of conceiving and does not correlate well with pelvic pain symptoms ( Porpora et al ., 1999 ). Because of the high frequency with which minimal/mild endometriosis is found in asymptomatic women, and current theories of these disease stages representing a normal physiological process, it appears logical to limit case definition to more severe stages. Holt and Weiss proposed a standard definition of definite and possible `endometriotic disease' based on a combination of surgical observations and pelvic symptoms (Table III ). In addition, a recent study showed that ovarian endometriosis, often present in endometriotic disease, can be accurately diagnosed through high-resolution ultrasound ( Eskenazi et al ., 2001 ).
Control selection
The choice of a control group is entirely determined by the definition and selection of the case group. More specifically, the source population from which controls are sampled should be that from which cases are also sampled. This is true for studies of environmental as well as genetic factors. An appropriate choice of controls will allow the allele frequencies of cases to be compared with those of their source population, and thus minimize the chance of population stratification (finding spurious associations).
A case–control study can be restricted to any (sub)type of case that may be of interest, as long as controls are selected appropriately for these case groups. Of course, the more restricted the definition of a case in terms of subtype or setting, the more difficult it becomes to identify the population from which such cases arose. It is important that controls should have had the same opportunity to develop the disease of interest and that, had they done so, they would have had the same opportunity as cases to be included in the study. By definition, this means they should be women, since men cannot express the phenotype. Some studies of candidate genes have used males as controls, given that their allele frequencies were known to represent those in the source population ( Hadfield et al ., 1999 ; Nakago et al ., 2001 ; Stefansson et al ., 2001 ). However, using male controls rather than female controls whose disease status is unknown has no added benefit: had they been women, they might have been cases. Moreover, the use of male controls may become a potential source of bias when environmental exposures are also included in the study, as many exposure profiles are likely to differ from those among women, thus producing spurious associations.
The choice of control groups in endometriosis research has focused on the concern that controls should be free of disease. Because of the requirement for a surgical diagnosis, these control groups have included fertile women undergoing laparoscopic sterilization and women who underwent laparoscopy for infertility unrelated to endometriosis. It is unlikely that these groups were representative of the source population from which cases were derived.
Rather than selecting controls who underwent laparoscopy, it would be advantageous to find a control group that is more population-based. The main objection to this option would be that such a control group could contain a substantial number of undiagnosed cases, thereby diluting the risk-factor effects. This concern is likely to be unjustified. In a community survey, Zondervan et al . found a prevalence of 24.0% for chronic pelvic pain in the UK ( Zondervan et al ., 2001b ). A total of 28.3% reported having had chronic pelvic pain, infertility (defined as inability to conceive for at least 12 months) or both (unpublished results). In their review, Eskenazi and Warner noted that in studies of women who underwent laparoscopy for pelvic pain or infertility, the prevalence of endometriosis was 20% on average, and that around a third of these had moderate/severe disease ( Eskenazi and Warner, 1997 ). From these findings, it appears that the community prevalence of more severe stages of endometriosis is probably <2%. Community-based control groups are therefore unlikely to contain many undiagnosed cases, especially if they are screened for moderate to severe pelvic symptoms, and the inclusion of undiagnosed cases will dilute the observed effects of risk-factors only to a marginal extent. It thus appears much more beneficial to focus research efforts on defining a control population that is representative of the source population of cases than to be overly concerned about obtaining a completely disease-free control group.
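One way to arrive at the figure of <2% is to multiply the three proportions quoted above. The short Python sketch below does this under two simplifying assumptions that are ours rather than stated in the text: that the 20% prevalence seen in women undergoing laparoscopy applies to all symptomatic women in the community, and that moderate/severe disease among women without symptoms is negligible. The variable names are illustrative.

```python
# Proportions quoted in the text.
symptomatic = 0.283          # chronic pelvic pain, infertility or both
endo_given_symptoms = 0.20   # endometriosis among women laparoscoped for these symptoms
severe_given_endo = 1 / 3    # "around a third" with moderate/severe disease

# Approximate community prevalence of moderate/severe endometriosis.
severe_prevalence = symptomatic * endo_given_symptoms * severe_given_endo
print(f"{severe_prevalence:.1%}")
```

Under these assumptions the product is roughly 1.9%, consistent with the statement that the community prevalence of the more severe stages is probably below 2%.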
Community-based controls can be a random sample of a particular population, or more selected groups such as neighbourhood and friend controls. Neighbourhood controls are subjects who are sampled from the same neighbourhood as cases arose from ( Rothman and Greenland, 1998a ). However, if any of the exposures of interest are related to the living environment of cases, using neighbourhood controls will prevent this factor from being identified as a risk-factor (over-matching). Using friend controls could cause similar problems, as friends may be more similar to cases in certain behavioural exposures. In addition, friend controls are identified by cases themselves and are therefore less likely to be chosen independent of their exposure status (thus potentially causing bias).
An alternative strategy is to use hospital-based controls diagnosed with unrelated conditions. The main hypothesis behind this approach is that such controls are better matched to cases on various potential confounding factors that are impossible to measure, such as referral or health care seeking pattern and socio-economic status. A potential problem is that they are not selected at random from the source population of cases and may thus be unrepresentative of the exposure distribution in that population. Using several different diagnostic groups can dilute the biasing effects of including a specific diagnostic group that is unrepresentative of the source population. Suggestions for appropriate groups very much depend on the health care system of the country under study. Essentially, controls should have had the same opportunity and inclination to attend for their respective diagnoses as the case group did for theirs. For endometriosis, an example could be women attending for treatment of chronic, non-cancerous conditions such as asthma.
Lastly, as there is almost never one ideal control group, an obvious solution is to use multiple groups. However, this makes the study more expensive and the interpretation of the analyses more complicated. If control groups differ in their exposure patterns, it is difficult to find out which one most represents the true exposure distribution of the source population ( Rothman and Greenland, 1998a ).
All control groups that are not randomly selected from a population (neighbourhood, friend and hospital-based controls) essentially represent forms of matching. Matching refers to the selection of controls that are as similar as possible to cases with respect to the distribution of one or more potential confounding factors that are difficult to measure. Cases can be matched to controls on an individual basis or in strata of exposure values (frequency matching). Matching is not without disadvantages. Matching factors can no longer be investigated in the analysis, as they need to be used as stratification variables. Furthermore, there are situations in which over-matching can occur ( Rothman and Greenland, 1998b ). Matching on a variable that is associated only with exposure, not with disease, reduces statistical efficiency: the investigator has to stratify on the matching variable, while in an unmatched design adjustment for the variable would have been unnecessary. Matching on variables that are affected by exposure or disease (such as symptoms) can cause bias and thus affect the validity of the results. In a study of endometriosis, for example, one would never match on level of pelvic pain or parity, as this would make controls more similar to cases for various potential risk-factors that affect both endometriosis and pain or infertility.
There are various situations in which individual matching could be appropriate in the study of endometriosis. As endometriosis is an age-related condition ( Eskenazi and Warner, 1997 ) and age is also related to many exposures, it is an important confounder. It can be controlled for in the analysis, but it may be statistically more efficient to match on age. If there is concern for bias due to differential access to health care, various forms of matching could be used. For example, in countries where patients have to be referred from a primary care doctor to a gynaecological clinic, controls could be matched to cases on the particular primary care doctor that referred the case. In multicentre studies, one would also match on centre.
A special form of matching, developed in genetic research, is the use of family-based controls such as siblings or cousins. The main reason for its development was to avoid confounding in comparing allele frequencies between cases and controls by population origin (population stratification), which was thought to be a major concern when using population-based controls. Although family-based controls are appropriate when studying the effect of genes only, they become a problem when environmental factors are to be incorporated in the analysis. The main disadvantage is over-matching on environmental factors. For example, siblings are more likely to share environmental exposures because they have been brought up in the same surroundings. This means that the number of discordant case–control units is smaller and therefore many more are needed for the study to have sufficient power. The power is further decreased by the fact that a stratified analysis has to be performed to allow for the matching on family. Lastly, cases may or may not have (a different number of) eligible relatives. This creates the problem of whom to select, and how to adjust the analysis to incorporate within-family dependency of the measurements ( Weinberg and Umbach, 2000 ). The latter problem is an important limitation of the design for endometriosis research. As endometriosis is heritable and related to infertility, cases may, on average, have fewer eligible blood relatives than controls without the condition.
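The point about discordant case–control units can be made concrete with the standard matched-pair odds ratio, which uses only pairs in which case and control differ in exposure. The Python sketch below compares two hypothetical sets of 100 pairs, one with unrelated controls and one with sibling controls who share more exposures with the case. All counts and the function name `matched_pair_or` are illustrative assumptions, not data from any study.

```python
def matched_pair_or(pairs):
    """Matched-pair odds ratio from (case_exposed, control_exposed) booleans.

    Returns (odds_ratio, number_of_discordant_pairs). Concordant pairs
    carry no information about the exposure effect.
    """
    pairs = list(pairs)
    case_only = sum(1 for case, ctrl in pairs if case and not ctrl)
    ctrl_only = sum(1 for case, ctrl in pairs if ctrl and not case)
    ratio = case_only / ctrl_only if ctrl_only else None
    return ratio, case_only + ctrl_only

# Hypothetical data: 100 pairs each, same underlying odds ratio of 3.
unrelated = ([(True, False)] * 30 + [(False, True)] * 10 +
             [(True, True)] * 25 + [(False, False)] * 35)
siblings = ([(True, False)] * 12 + [(False, True)] * 4 +
            [(True, True)] * 40 + [(False, False)] * 44)

or_unrelated, n_unrelated = matched_pair_or(unrelated)
or_siblings, n_siblings = matched_pair_or(siblings)
```

Both designs give the same point estimate here, but the sibling design rests on 16 informative pairs instead of 40, which is why many more pairs would be needed to reach comparable power.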
Other study designs have been developed to avoid population stratification in genetic studies, such as the case-only design and various case-parent designs (haplotype relative risk, transmission disequilibrium test, pseudo-sib) that use non-transmitted alleles from parents in various ways to create ‘controls’. The relative power of these various types of association studies as opposed to linkage studies has been well described within the field of statistical genetics ( Risch and Merikangas, 1996 ; Risch and Teng, 1998 ; Teng and Risch, 1998 ). Weinberg and Umbach have also provided a detailed comparison between such family-based methods of association and population-based case–control designs, concluding that most family-based methods were unsuitable for the estimation of the main effects of genes and exposures as well as their interaction ( Weinberg and Umbach, 2000 ). A method worth mentioning that tries to address the potential problem of population stratification in standard case–control studies is that of ‘genomic control’. This tests for the presence of population stratification in a standard case–control study by comparing allelic frequencies of randomly selected anonymous genes between cases and controls ( Bacanu et al ., 2000 ; Pritchard et al ., 2000 ). If these frequencies differ systematically then population stratification is likely and should be adjusted for by using stratified analyses. This approach could be suitable for the study of endometriosis. However, although the power of the method may already be reasonable when using 20 markers, its application may at present be limited because of genotyping cost in the large case–control studies needed to study complex traits.
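As a rough illustration of the idea of checking unlinked markers for systematic case–control differences, the Python sketch below computes an allelic chi-square statistic for each of a set of anonymous markers and summarizes them as an inflation factor: the median statistic divided by the median of a chi-square distribution with one degree of freedom. This summary comes from the general genomic control literature and is not spelled out in the text above. The function names and the example allele counts are hypothetical, and the sketch assumes no marker has an empty row or column.

```python
from statistics import median

CHI2_1DF_MEDIAN = 0.4549  # median of a chi-square distribution with 1 df

def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts."""
    n = case_a + case_b + ctrl_a + ctrl_b
    numerator = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2
    denominator = ((case_a + case_b) * (ctrl_a + ctrl_b) *
                   (case_a + ctrl_a) * (case_b + ctrl_b))
    return numerator / denominator

def inflation_factor(null_marker_tables):
    """Median chi-square over anonymous markers, scaled to its null median.

    Values well above 1 suggest systematic allele-frequency differences
    between cases and controls; the result is floored at 1.
    """
    stats = [allelic_chi2(*table) for table in null_marker_tables]
    return max(1.0, median(stats) / CHI2_1DF_MEDIAN)

# Hypothetical allele counts (case A, case B, control A, control B).
no_structure = [(50, 50, 50, 50)] * 3
example_stat = allelic_chi2(60, 40, 40, 60)
lam = inflation_factor(no_structure)
```

In practice a factor close to 1 across a panel of such markers would give some reassurance that cases and controls come from a comparable genetic background, whereas a clearly larger value would point towards the stratified analysis mentioned above.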
The study of gene–environment interaction
Recent debate has somewhat questioned the scientific merit of studying statistical interactions between genes and environment in complex diseases ( Clayton and McKeigue, 2001 ). Nevertheless, it also suggested that the population-based case–control study is the design best suited to such investigations. However, an important limiting factor in studying interaction is sufficient power. Smith and Day calculated that to detect an interaction between two dichotomous variables with main effects of similar magnitude in an unmatched case–control study, study size would have to be increased at least 4-fold ( Smith and Day, 1984 ).
Matching and counter-matching can provide ways to improve the power for studying gene-environment interaction, but only in situations where the exposures are rare ( Smith and Day, 1984 ; Andrieu et al ., 2001 ). These methods also assume that the gene–environment interaction of interest is known prior to the design of the study, whereas in practice, investigators tend to look only for interaction between factors identified from the study with main effects that are large enough to be of interest. This approach does not allow for factors that have small main effects, but for which the interaction produces higher risk estimates. This is more likely to be the scenario in the investigation of complex traits including endometriosis.
Appropriate case definition and control selection are vital in determining the validity and reproducibility of case–control studies of complex traits. With respect to endometriosis, insufficient attention has been paid to this topic, especially in studies investigating the effects of candidate genes.
Future studies of complex traits will increasingly have to incorporate both environmental and genetic factors. Since individual effect sizes for risk-factors underlying complex traits are unlikely to be large, the collection of accurate, unbiased and comparable data from sufficiently large samples will be of the utmost importance. It is only if designs of studies in different populations are valid and consistent that we will be able to compare their results and differentiate between true aetiologic heterogeneity expected to underlie a complex trait and effects due to design differences and inadequacies. In view of the generally poor study designs to date, this appears of particular relevance to endometriosis.
Table I. The eight studies of environmental risk-factors for endometriosis that were considered of adequate epidemiological design. Adapted from Eskenazi and Warner (1997).
Table II. The 11 case–control studies of genetic risk-factors for endometriosis
Table III. Standard definition proposed by Holt and Weiss (2000) of endometriotic disease
K.T.Z. is supported by an MRC Special Training Fellowship in Bioinformatics.
American Fertility Society ( 1985 ) Revised American Fertility Society classification of endometriosis. Fertil. Steril. , 43 , 351 –352.
Andrieu, N., Goldstein, A.M., Thomas, D.C. and Langholz, B. ( 2001 ) Counter-matching in studies of gene–environment interaction: efficiency and feasibility. Am. J. Epidemiol. , 153 , 265 –274.
Bacanu, S., Devlin, B. and Roeder, K. ( 2000 ) The power of genomic control. Am. J. Hum. Genet. , 66 , 1933 –1944.
Baranov, V.S., Ivaschenko, T., Bakay, B., Aseev, M., Belotserkovskaya, R., Baranova, H., Malet, P., Perriot, J., Mouraire, P., Baskakov, V.N. et al . ( 1996 ) Proportion of the GSTM1 0/0 genotype in some Slavic populations and its correlation with cystic fibrosis and some multifactorial diseases. Hum. Genet. , 97 , 516 –520.
Baranova, H., Bothorishvilli, R., Canis, M., Albuisson, E., Perriot, S., Glowaczower, E., Bruhat, M.A., Baranov, V. and Malet, P. ( 1997 ) Glutathione S-transferase M1 gene polymorphism and susceptibility to endometriosis in a French population. Mol. Hum. Reprod. , 3 , 775 –780.
Baranova, H., Canis, M., Ivaschenko, T., Albuisson, E., Bothorishvilli, R., Baranov, V., Malet, P. and Bruhat, M.A. ( 1999 ) Possible involvement of arylamine N-acetyltransferase 2, glutathione S-transferases M1 and T1 genes in the development of endometriosis. Mol. Hum. Reprod. , 5 , 636 –641.
Baxter, S.W., Thomas, E.J. and Campbell, I.G. ( 2001 ) GSTM1 null polymorphism and susceptibility to endometriosis and ovarian cancer. Carcinogenesis , 22 , 63 –65.
Bischoff, F.Z. and Simpson, J.L. ( 2000 ) Heritability and molecular genetic studies of endometriosis. Hum. Reprod. Update , 6 , 37 –44.
Candiani, G.B., Danesino, V., Gastaldi, A., Parazzini, F. and Ferraroni, M. ( 1991 ) Reproductive and menstrual factors and risk of peritoneal and ovarian endometriosis. Fertil. Steril. , 56 , 230 –234.
Cardon, L.R. and Bell, J.I. ( 2001 ) Association study designs for complex diseases. Nat. Rev. Genet , 2 , 91 –99.
Clayton, D. and McKeigue, P.M. ( 2001 ) Epidemiological methods for studying genes and environmental factors in complex disease. Lancet , 358 , 1356 –1360.
Coxhead, D. and Thomas, E.J. ( 1993 ) Familial inheritance of endometriosis in a British population. A case control study. J. Obstet. Gynaecol. , 13 , 42 –44.
Cramer, D.W., Wilson, E., Stillman, R.J., Berger, M.J., Belisle, S., Schiff, I., Albrecht, B., Gibson, M., Stadel, B.V. and Schoenbaum, S.C. ( 1986 ) The relation of endometriosis to menstrual characteristics, smoking and exercise. JAMA , 255 , 1904 –1908.
Cramer, D.W., Hornstein, M.D. and Barbieri, R.L. ( 1996 ) Endometriosis associated with the N314D mutation of galactose-1-phosphate uridyl transferase (GALT). Mol. Hum. Reprod. , 2 , 149 –152.
Darrow, S.L., Vena, J.E., Batt, R.E., Zielezny, M.A., Michalek, A.M. and Selman, S. ( 1993 ) Menstrual cycle characteristics and the risk of endometriosis. Epidemiology , 4 , 135 –142.
Darrow, S.L., Selman, S., Batt, R.E., Zielezny, M.A. and Vena, J.E. ( 1994 ) Sexual activity, contraception and reproductive factors in predicting endometriosis. Am. J. Epidemiol. , 140 , 500 –509.
Eskenazi, B. and Warner, M.L. ( 1997 ) Epidemiology of endometriosis. Obstet. Gynecol. Clin. North Am. , 24 , 235 –258.
Eskenazi, B., Warner, M., Bonsignore, L., Olive, D., Samuels, S. and Vercellini, P. ( 2001 ) Validation study of nonsurgical diagnosis of endometriosis. Fertil. Steril. , 76 , 929 –935.
Freeman, J. and Hutchison, G.B. ( 1980 ) Prevalence, incidence and duration. Am. J. Epidemiol. , 112 , 707 –723.
Georgiou, I., Syrrou, M., Bouba, I., Dalkalitsis, N., Paschopoulos, M., Navrozoglou, I. and Lolis, D. ( 1999 ) Association of estrogen receptor gene polymorphisms with endometriosis. Fertil. Steril. , 72 , 164 –166.
Greenland, S. and Rothman, K.J. (1998) Measures of effect and measures of association. In Modern Epidemiology , 2nd edn. Lippincott-Raven, Philadelphia, USA, pp. 47–64.
Grodstein, F., Goldman, M.B., Ryan, L. and Cramer, D.W. ( 1993 ) Relation of female infertility to consumption of caffeinated beverages. Am. J. Epidemiol. , 137 , 1353 –1360.
Grodstein, F., Goldman, M.B. and Cramer, D.W. ( 1994 ) Infertility in women and moderate alcohol use [see comments]. Am. J. Public Health , 84 , 1429 –1432.
Hadfield, R., Mardon, H., Barlow, D.H. and Kennedy, S.H. ( 1996 ) Delay in diagnosis of endometriosis: a survey of women from the USA and the UK. Hum. Reprod. , 11 , 878 –880.
Hadfield, R.M., Manek, S., Nakago, S., Mukherjee, S., Weeks, D.E., Mardon, H.J., Barlow, D.H. and Kennedy, S.H. ( 1999 ) Absence of a relationship between endometriosis and the N314D polymorphism of galactose-1-phosphate uridyl transferase in a UK population. Mol. Hum. Reprod. , 5 , 990 –993.
Hadfield, R.M., Manek, S., Weeks, D.E., Mardon, H.J., Barlow, D.H., Kennedy, S.H. and OXEGENE Collaborative Group ( 2001 ) Linkage and association studies of the relationship between endometriosis and genes encoding the detoxification enzymes GSTM1, GSTT1 and CYP1A1. Mol. Hum. Reprod. , 7 , 1073 –1078.
Halme, J., Hammond, M.G., Hulka, J.F., Raj, S.G. and Talbert, L.M. ( 1984 ) Retrograde menstruation in healthy women and in patients with endometriosis. Obstet. Gynecol. , 64 , 151 –154.
Holt, V.L. and Weiss, N.S. ( 2000 ) Recommendations for the design of epidemiologic studies of endometriosis. Epidemiology , 11 , 654 –659.
Hsieh, Y.Y., Chang, C.C., Tsai, F.J., Wu, J.Y., Shi, Y.R., Tsai, H.D. and Tsai, C.H. ( 2001a ) Polymorphisms for interleukin-1 beta (IL-1 beta)-511 promoter, IL-1 beta exon 5 and IL-1 receptor antagonist: nonassociation with endometriosis. J. Assist. Reprod. Genet , 18 , 506 –511.
Hsieh, Y.Y., Tsai, F.J., Chang, C.C., Chen, W.C., Tsai, C.H., Tsai, H.D. and Lin, C.C. ( 2001b ) p21 gene codon 31 arginine/serine polymorphism: non-association with endometriosis. J. Clin. Lab. Anal. , 15 , 184 –187.
Kirkwood, B.R. (1988) Cohort and case–control studies. In Essentials of Medical Statistics . Blackwell Scientific Publications, Oxford, UK, pp. 173–183.
Kitawaki, J., Obayashi, H., Ishihara, H., Koshiba, H., Kusuki, I., Kado, N., Tsukamoto, K., Hasegawa, G., Nakamura, N. and Honjo, H. ( 2001 ) Oestrogen receptor-alpha gene polymorphism is associated with endometriosis, adenomyosis and leiomyomata. Hum. Reprod. , 16 , 51 –55.
Koninckx, P.R., Barlow, D. and Kennedy, S. ( 1999 ) Implantation versus infiltration: the Sampson versus the endometriotic disease theory. Gynecol. Obstet. Invest. , 47 (Suppl. 1), 3 –10.
Lamb, K., Hoffmann, R.G. and Nichols, T.R. ( 1986 ) Family trait analysis: a case–control study of 43 women with endometriosis and their best friends. Am. J. Obstet. Gynecol. , 154 , 601 .
Makhlouf-Obermeyer, C., Armenian, H.K. and Azoury, R. ( 1986 ) Endometriosis in Lebanon. A case–control study. Am. J. Epidemiol. , 124 , 762 –767.
Mangtani, P. and Booth, M. ( 1993 ) Epidemiology of endometriosis. J. Epidemiol. Commun. Health , 47 , 84 –88.
Mayani, A., Barel, S., Soback, S. and Almagor, M. ( 1997 ) Dioxin concentrations in women with endometriosis. Hum. Reprod. , 12 , 373 –375.
McCann, S.E., Freudenheim, J.L., Darrow, S.L., Batt, R.E. and Zielezny, M.A. ( 1993 ) Endometriosis and body fat distribution. Obstet. Gynecol. , 82 , 545 –549.
Moen, M.H. and Magnus, P. ( 1993 ) The familial risk of endometriosis. Acta Obstet. Gynecol. Scand. , 72 , 560 –564.
Morland, S.J., Jiang, X., Hitchcock, A., Thomas, E.J. and Campbell, I.G. ( 1998 ) Mutation of galactose-1-phosphate uridyl transferase and its association with ovarian cancer and endometriosis. Int. J. Cancer , 77 , 825 –827.
Nakago, S., Hadfield, R.M., Zondervan, K.T., Mardon, H., Manek, S., Weeks, D.E., Barlow, D.H. and Kennedy, S.H. ( 2001 ) Association between endometriosis and N-acetyl Transferase 2 polymorphisms in the UK population. Mol. Hum. Reprod. , 7 , 1079 –1083.
Parazzini, F., Ferraroni, M., Bocciolone, L., Tozzi, L., Rubessa, S. and La-Vecchia, C. ( 1994 ) Contraceptive methods and risk of pelvic endometriosis. Contraception , 49 , 47 –55.
Parazzini, F., Ferraroni, M., Fedele, L., Bocciolone, L., Rubessa, S. and Riccardi, A. ( 1995 ) Pelvic endometriosis: reproductive and menstrual risk factors at different stages in Lombardy, northern Italy. J. Epidemiol. Commun. Health , 49 , 61 –64.
Pauwels, A., Schepens, P.J., D'Hooghe, T., Delbeke, L., Dhont, M., Brouwer, A. and Weyler, J. ( 2001 ) The risk of endometriosis and exposure to dioxins and polychlorinated biphenyls: a case–control study of infertile women. Hum. Reprod. , 16 , 2050 –2055.
Porpora, M.G., Koninckx, P.R., Piazze, J., Natili, M., Colagrande, S. and Cosmi, E.V. ( 1999 ) Correlation between endometriosis and pelvic pain. J. Am. Assoc. Gynecol. Laparosc. , 6 , 429 –434.
Pritchard, J.K., Stephens, M., Rosenberg, N.A. and Donnelly, P. ( 2000 ) Association mapping in structured populations. Am. J. Hum. Genet. , 67 , 170 –181.
Rier, S.E., Martin, D.C., Bowman, R.E., Dmowski, W.P. and Becker, J.L. ( 1993 ) Endometriosis in rhesus monkeys ( Macaca Mulatta ) following chronic exposure to 2, 3, 7, 8-tetrachlorodibenzo-p-dioxin. Fund. Appl. Toxicol. , 21 , 433 –441.
Risch, N. and Merikangas, K. ( 1996 ) The future of genetic studies of complex human diseases. Science , 273 , 1516 –1517.
Risch, N. and Teng, J. ( 1998 ) The relative power of family-based and case–control designs for linkage disequilibrium studies of complex human diseases. I. DNA pooling. Genome Res. , 8 , 1273 –1288.
Rothman, K.J. and Greenland, S. (1998a) Case–control studies. In Modern Epidemiology , 2nd edn. Lippincott-Raven, Philadelphia, USA, pp. 93–114.
Rothman, K.J. and Greenland, S. (1998b) Matching. In Modern Epidemiology , 2nd edn. Lippincott-Raven, Philadelphia, USA, pp. 147–162.
Sampson, J.A. ( 1927 ) Peritoneal endometriosis due to the menstrual dissemination of endometrial tissue into the peritoneal cavity. Am. J. Obstet. Gynecol. , 14 , 469
Sangi-Haghpeykar, H. and Poindexter, A.N. ( 1995 ) Epidemiology of endometriosis among parous women. Obstet. Gynecol. , 85 , 983 –992.
Signorello, L.B., Harlow, B.L., Cramer, D.W., Spiegelman, D. and Hill, J.A. ( 1997 ) Epidemiologic determinants of endometriosis: a hospital-based case–control study. Ann. Epidemiol. , 7 , 267 –274.
Simpson, J.L., Elias, S., Malinak, L.R. and Buttram, V.C.J. ( 1980 ) Heritable aspects of endometriosis. I. Genetic studies. Am. J. Obstet. Gynecol. , 137 , 327 –331.
Smith, P.G. and Day, N.E. ( 1984 ) The design of case–control studies: the influence of confounding and interaction effects. Int. J. Epidemiol. , 13 , 356 –365.
Stefansson, H., Einarsdottir, A., Geirrson, R.T., Jonsdottir, K., Sverrisdottir, G., Gudnadottir, V.G., Gunnarsdottir, S., Manolescu, A., Gulcher, J. and Stefansson, K. ( 2001 ) Endometriosis is not associated with or linked to the GALT gene. Fertil. Steril. , 76 , 1019 –1022.
Teng, J. and Risch, N. ( 1998 ) The relative power of family-based and case–control designs for linkage disequilibrium studies of complex human diseases. II. Individual genotyping. Genome Res. , 9 , 241
Treloar, S.A., O'Connor, D.T., O'Connor, V.M. and Martin, N.G. ( 1999 ) Genetic influences of endometriosis in an Australian twin sample. Fertil. Steril. , 71 , 701 –710.
Vessey, M.P., Villard-Mackintosh, L. and Painter, R. ( 1993 ) Epidemiology of endometriosis in women attending family planning clinics. Br. Med. J. , 306 , 182 –184.
Weinberg, C.R. and Umbach, D.M. ( 2000 ) Choosing a retrospective design to assess joint genetic and environmental contributions to risk. Am. J. Epidemiol. , 152 , 197 –203.
Zondervan, K.T., Cardon, L.R. and Kennedy, S.H. ( 2001a ) The genetic basis of endometriosis. Curr. Opin. Obstet. Gynecol. , 13 , 309 –314.
Zondervan, K.T., Yudkin, P.L., Vessey, M.P., Jenkinson, C.P., Dawes, M.G., Barlow, D.H. and Kennedy, S.H. ( 2001b ) The community prevalence of chronic pelvic pain in women and associated illness behaviour. Br. J. Gen. Pract. , 51 , 541 –547.
- Online ISSN 1460-2350
- Print ISSN 0268-1161
- Copyright © 2023 European Society of Human Reproduction and Embryology
What makes a good case-control study? Design issues for complex traits such as endometriosis
Affiliation.
- 1 Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 6BN, UK.
- PMID: 12042253
- DOI: 10.1093/humrep/17.6.1415
The combined investigation of environmental and genetic risk-factors in complex traits will refocus attention on the case-control study. Endometriosis is an example of a complex trait for which most case-control studies have not followed the basic criteria of epidemiological study design. Appropriate control selection has been a particular problem. This article reviews the principles underlying the design of case-control studies, and their application to the study of endometriosis. Only if it is designed well is the case-control study a suitable alternative to the prospective cohort study. Use of newly diagnosed over prevalent cases is preferable, as the latter may alter risk estimates and complicate the interpretation of findings. Controls should be selected from the source population from which cases arose. Potential confounding should be addressed both in studies of environmental and genetic factors. For endometriosis, a possible design would be to: (i) use newly diagnosed cases with 'endometriotic' disease; (ii) collect information predating symptom onset; and (iii) use at least one population-based female control group matched on unadjustable confounders and screened for pelvic symptoms. In conclusion, future studies of complex traits such as endometriosis will have to incorporate both environmental and genetic factors. Only adequately designed studies will allow reliable results to be obtained and any true aetiologic heterogeneity expected to underlie a complex trait to be detected.
Publication types
- Research Support, Non-U.S. Gov't
- Case-Control Studies*
- Endometriosis / epidemiology
- Endometriosis / etiology*
- Endometriosis / genetics
- Environment
- Epidemiologic Factors
- Risk Factors

- Indian J Dermatol
- v.61(2); Mar-Apr 2016
Methodology Series Module 2: Case-control Studies
Maninder Singh Setia
Epidemiologist, MGM Institute of Health Sciences, Navi Mumbai, Maharashtra, India
Case-control study design is a type of observational study. In this design, participants are selected for the study based on their outcome status. Thus, some participants have the outcome of interest (referred to as cases), whereas others do not have the outcome of interest (referred to as controls). The investigator then assesses the exposure in both these groups. The investigator should define the cases as specifically as possible. Sometimes, the definition of a disease may be based on multiple criteria; thus, all these points should be explicitly stated in the case definition. An important aspect of selecting controls is that they should be from the same ‘study base’ as the cases. We can select controls from a variety of groups, such as the general population, relatives or friends, and hospital patients. Matching is often used in case-control studies to ensure that the cases and controls are similar in certain characteristics, and it is a useful technique to increase the efficiency of the study. Case-control studies can usually be conducted relatively quickly and inexpensively – particularly when compared with prospective cohort studies. The design is useful for studying rare outcomes and outcomes with long latent periods. It is not very useful for studying rare exposures. Furthermore, case-control studies may be prone to certain biases – selection bias and recall bias.
Introduction
Case-Control study design is a type of observational study design. In an observational study, the investigator does not alter the exposure status. The investigator measures the exposure and outcome in study participants, and studies their association.
In a case-control study, participants are selected for the study based on their outcome status. Thus, some participants have the outcome of interest (referred to as cases), whereas others do not have the outcome of interest (referred to as controls). The investigator then assesses the exposure in both these groups. Thus, by design, in a case-control study the outcome has to occur in some of the participants that have been included in the study.
As seen in Figure 1 , at the time of entry into the study (sampling of participants), some of the study participants have the outcome (cases) and others do not have the outcome (controls). During the study procedures, we will examine the exposure of interest in cases as well as controls. We will then study the association between the exposure and outcome in these study participants.

Figure 1: Example of a case-control study
Examples of Case-Control Studies
Smoking and lung cancer study.
In their landmark study, Doll and Hill (1950) evaluated the association between smoking and lung cancer. They included 709 patients with lung carcinoma (defined as cases). They also included 709 controls from general medical and surgical patients. The selected controls were similar to the cases with respect to age and sex. Thus, they included 649 males and 60 females among cases as well as controls.
They found that only 0.3% of male cases were non-smokers. However, the proportion of non-smokers among male controls was 4.2%; the difference was statistically significant ( P = 0.00000064). Similarly, they found that about 31.7% of female cases were non-smokers compared with 53.3% of female controls; this difference was also statistically significant (0.01 < P < 0.02).
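The percentages above can be turned into an approximate odds ratio. The sketch below rebuilds the 2x2 tables by rounding the reported percentages against the group sizes (649 males and 60 females per group), so the counts are approximations rather than figures quoted from the paper; the function names are illustrative only.

```python
# Counts are reconstructed by rounding the reported percentages,
# so they are approximations, not figures quoted from the paper.

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def smoking_table(n_per_group, pct_nonsmokers_cases, pct_nonsmokers_controls):
    """Rebuild a 2x2 table (smoker = exposed) from group size and percentages."""
    ns_cases = round(n_per_group * pct_nonsmokers_cases / 100)
    ns_controls = round(n_per_group * pct_nonsmokers_controls / 100)
    return (n_per_group - ns_cases, ns_cases,
            n_per_group - ns_controls, ns_controls)


male_table = smoking_table(649, 0.3, 4.2)      # smokers/non-smokers in cases, then controls
female_table = smoking_table(60, 31.7, 53.3)

male_or = odds_ratio(*male_table)
female_or = odds_ratio(*female_table)

print(f"Males:   table={male_table}, OR for smoking = {male_or:.2f}")
print(f"Females: table={female_table}, OR for smoking = {female_or:.2f}")
```

Because there were so few non-smokers among the male cases, the male estimate is sensitive to the rounding of a single count and should be read as a rough magnitude only.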
Melanoma and tanning (Lazovic et al., 2010)
The authors conducted a case-control study to study the association between melanoma and tanning. The 1167 cases, individuals with invasive cutaneous melanoma, were selected from the Minnesota Cancer Surveillance System. The 1101 controls were selected randomly from the Minnesota State Driver's License list; they were matched for age (+/- 5 years) and sex.
The data were collected by self-administered questionnaires and telephone interviews. The investigators assessed the use of tanning devices (using photographs), the number of years of use, and the frequency of use of these devices. They also collected information on other variables (such as sun exposure; presence of freckles and moles; and colour of skin and hair, among other exposures).
They found that the risk of melanoma was higher in individuals who used UVB-enhanced and primarily UVA-emitting devices. The risk of melanoma also increased with increasing years of use, hours of use, and number of sessions.
Risk factors for erysipelas (Pitché et al, 2015)
Pitché et al (2015) conducted a case-control study to assess the factors associated with leg erysipelas in sub-Saharan Africa. This was a multi-centre study; the cases and controls were recruited from eight countries in sub-Saharan Africa.
They recruited cases of acute leg cellulitis in these eight countries. They recruited two controls for each case; these were matched for age (+/- 5 years) and sex. Thus, the final study had 364 cases and 728 controls. They found that leg erysipelas was associated with obesity, lymphoedema, neglected traumatic wounds, toe-web intertrigo, and voluntary cosmetic depigmentation.
We have provided details of all the three studies in the bibliography. We strongly encourage the readers to read the papers to understand some practical aspects of case-control studies.
Selection of Cases and Controls
Selection of cases and controls is an important part of this design. Wacholder and colleagues (1992 a, b, and c) have published excellent manuscripts on the design and conduct of case-control studies in the American Journal of Epidemiology. The discussion in the next few sections is based on these manuscripts.

Selection of case
The investigator should define the cases as specifically as possible. Sometimes, definition of a disease may be based on multiple criteria; thus, all these points should be explicitly stated in case definition.
For example, in the above-mentioned Melanoma and Tanning study, the researchers defined their cases as individuals with any histologic variety of invasive cutaneous melanoma. However, they added another important criterion: these individuals should have a driver's license or State identity card. This is probably not directly related to the clinical condition, so why did they add this criterion? We will discuss this in detail in the next few paragraphs.
Selection of a control
The next important point in designing a case-control study is the selection of control patients.
In fact, Wacholder and colleagues have extensively discussed aspects of design of case control studies and selection of controls in their article.
According to them, an important aspect of selecting controls is that they should be from the same ‘study base’ as the cases. Thus, the population pool from which the cases and controls are enrolled should be the same. For instance, in the Tanning and Melanoma study, the researchers recruited cases from the Minnesota Cancer Surveillance System; however, it was also required that these cases should have either a State identity card or a driver's license. This was important since controls were randomly selected from the Minnesota State driver's license list (which also included individuals who have the State identity card).
Another important aspect of a case-control study is that we should measure the exposure similarly in cases and controls. For instance, if we design a research protocol to study the association between metabolic syndrome (exposure) and psoriasis (outcome), we should ensure that we use the same criteria (clinically and biochemically) for evaluating metabolic syndrome in cases and controls. If we use different criteria to measure the metabolic syndrome, then it may cause information bias.
Types of Controls
We can select controls from a variety of groups. Some of them are: General population; relatives or friends; or hospital patients.
Hospital controls
An important source of controls is patients attending the hospital for diseases other than the outcome of interest. These controls are easy to recruit and are more likely to have similar quality of medical records.
However, we have to be careful while recruiting these controls. In the above example of metabolic syndrome and psoriasis, we recruit psoriasis patients from the Dermatology department of the hospital as cases. We recruit patients who do not have psoriasis and present to the Dermatology department as controls. Some of these individuals have presented to the Dermatology department with tinea pedis. Do we recruit these individuals as controls for the study? What is the problem if we recruit these patients? Some studies have suggested that diabetes mellitus and obesity are predisposing factors for tinea pedis. As we know, fasting plasma glucose of >100 mg/dl and raised triglycerides (>=150 mg/dl) are criteria for diagnosis of metabolic syndrome. Thus, it is quite likely that if we recruit many of these tinea pedis patients, the exposure of interest may turn out to be similar in cases and controls; this exposure may not reflect the truth in the population.
Relative and friend controls
Relative controls are relatively easy to recruit. They can be particularly useful when we are interested in trying to ensure that some of the measurable and non-measurable confounders are relatively equally distributed in cases and controls (such as home environment, socio-economic status, or genetic factors).
Another source of controls is a list of friends referred by the cases. These controls are easy to recruit and they are also more likely to be similar to the cases in socio-economic status and other demographic factors. However, they are also more likely to have similar behaviours (alcohol use, smoking etc.); thus, it may not be prudent to use these as controls if we want to study the effect of these exposures on the outcome.
Population controls
These controls can be easily recruited if a list of all individuals is available, for example, a list of state identity cards or a voter registration list. In the Tanning and Melanoma study, the researchers used population controls. They were identified from the Minnesota State driver's license list.
We may have to use sampling methods (such as random digit dialing or multistage sampling methods) to recruit controls from the population. A main advantage is that these controls are likely to satisfy the ‘study-base’ principle (described above) as suggested by Wacholder and colleagues. However, they can be expensive and time consuming. Furthermore, many of these controls will not be inclined to participate in the study; thus, the response rate may be very low.
Matching in a Case-Control Study
Matching is often used in case-control studies to ensure that the cases and controls are similar in certain characteristics. For example, in the smoking and lung cancer study, the authors selected controls that were similar in age and sex to the carcinoma cases. Matching is a useful technique to increase the efficiency of the study.
‘Individual matching’ is one common technique used in case-control studies. For example, in the above-mentioned metabolic syndrome and psoriasis study, we can decide that for each case enrolled in the study, we will enroll a control that is matched for sex and age (+/- 2 years). Thus, if a 40-year-old male patient with psoriasis is enrolled in the study as a case, we will enroll a 38-42-year-old male patient without psoriasis (and who is not excluded for any other reason) as a control.
If the study has used ‘individual matching’ procedures, then the data should also reflect the same. For instance, if you have 45 males among cases, you should also have 45 males among controls. If you show 60 males among controls, you should explain the discrepancy.
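The individual-matching rule described above (same sex, age within +/- 2 years, one control per case) can be sketched as a simple greedy procedure. This is only an illustration: the `Person` record, the participant identifiers, and the "closest age first" tie-break are assumptions, not part of the article, and real studies often use more careful matching algorithms.

```python
from dataclasses import dataclass


@dataclass
class Person:
    pid: str   # hypothetical participant identifier
    sex: str
    age: int


def match_controls(cases, candidates, age_window=2):
    """Greedy 1:1 matching on sex and age (within +/- age_window years).

    Each candidate control is used at most once. A case with no eligible
    control is mapped to None so the shortfall stays visible.
    """
    available = list(candidates)
    pairs = {}
    for case in cases:
        eligible = [c for c in available
                    if c.sex == case.sex and abs(c.age - case.age) <= age_window]
        if not eligible:
            pairs[case.pid] = None
            continue
        best = min(eligible, key=lambda c: abs(c.age - case.age))
        pairs[case.pid] = best.pid
        available.remove(best)
    return pairs


cases = [Person("case1", "M", 40), Person("case2", "F", 33)]
candidates = [Person("ctrl1", "M", 45), Person("ctrl2", "M", 41),
              Person("ctrl3", "F", 36), Person("ctrl4", "F", 32)]

matches = match_controls(cases, candidates)
print(matches)
```

With this kind of matching, the sex distribution of the matched controls necessarily mirrors that of the cases, which is the consistency check the text asks readers to look for.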
Even though matching is used to increase the efficiency of case-control studies, it may have its own problems. It may be difficult to find an exactly matching control for each case; we may have to screen many potential enrollees before we are able to recruit one control for each case recruited. Thus, it may increase the time and cost of the study.
Nonetheless, matching may be useful to control for certain types of confounders. For instance, environment variables may be accounted for by matching controls for neighbourhood or area of residence. Household environment and genetic factors may be accounted for by enrolling siblings as controls.
If we use controls from the past (a time period when cases did not occur), then the controls are sometimes referred to as historic controls. Such controls may be recruited from past hospital records.
Strengths of a Case-Control Study
- Case-control studies can usually be conducted relatively quickly and inexpensively – particularly when compared with prospective cohort studies
- It is useful to study rare outcomes and outcomes with long latent periods. For example, if we wish to study the factors associated with melanoma in India, it will be useful to conduct a case-control study. We will recruit patients with melanoma as cases at one study site or multiple study sites. If we were to conduct a cohort study for this research question, we may have to follow individuals (with the exposure under study) for many years before the occurrence of the outcome
- It is also useful to study multiple exposures in the same outcome. For example, in the metabolic syndrome and psoriasis study, we can study other factors such as Vitamin D levels or genetic markers
- Case-control studies are useful to study the association of risk factors and outcomes in outbreak investigations. For instance, Freeman and colleagues (2015) conducted a case-control study to evaluate the role of proton pump inhibitors in an outbreak of non-typhoidal salmonellosis.
Limitations of a Case-control Study
- The design, in general, is not useful to study rare exposures. It may be prudent to conduct a cohort study for rare exposures
Since the investigator chooses the number of cases and controls, the proportion of cases may not be representative of the proportion in the population. For instance, if we choose 50 cases of psoriasis and 50 controls, the proportion of psoriasis cases in our study will be 50%. This is not the true prevalence. If we had chosen 50 cases of psoriasis and 100 controls, then the proportion of cases would be 33%.
- The design is not useful to study multiple outcomes. Since the cases are selected based on the outcome, we can only study the association between exposures and that particular outcome
- Sometimes the temporality of the exposure and outcome may not be clearly established in case-control studies
- The case-control studies are also prone to certain biases
If the cases and controls are not selected similarly from the study base, then it will lead to selection bias.
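The limitation noted earlier in this list, that the proportion of cases is set by the investigator, can be made concrete with a short sketch. The exposure counts below are hypothetical and assume that the additional controls have the same exposure distribution as the original ones; under that assumption the case proportion changes with the sampling ratio while the odds ratio does not.

```python
def case_proportion(n_cases, n_controls):
    """Share of cases in the study sample; fixed by design, not by nature."""
    return n_cases / (n_cases + n_controls)


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# 50 cases in both designs (hypothetical split: 30 exposed, 20 unexposed).
# Design A has 50 controls (15 exposed, 35 unexposed);
# design B has 100 controls with the same exposure distribution (30 and 70).
prop_a = case_proportion(50, 50)
prop_b = case_proportion(50, 100)
or_a = odds_ratio(30, 20, 15, 35)
or_b = odds_ratio(30, 20, 30, 70)

print(f"Design A: case proportion = {prop_a:.0%}, OR = {or_a:.2f}")
print(f"Design B: case proportion = {prop_b:.0%}, OR = {or_b:.2f}")
```

This is why a case-control study cannot report prevalence or incidence directly, yet can still estimate an association.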
- Odds Ratio: We are able to calculate the odds ratios (OR) from a case-control study. Since we are not able to measure incidence data in case-control study, an odds ratio is a reasonable measure of the relative risk (under some assumptions). Additional details about OR will be discussed in the biostatistics section.
The OR in the above study is 3.5. Since the OR is greater than 1, the outcome is more likely in those exposed (those who are diagnosed with metabolic syndrome) compared with those who are not exposed (those who are not diagnosed with metabolic syndrome). However, we will require confidence intervals to comment further on the interpretation of the OR (this will be discussed in detail in the biostatistics section).
- Other analysis : We can use logistic regression models for multivariate analysis in case-control studies. It is important to note that conditional logistic regressions may be useful for matched case-control studies.
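As a rough illustration of the odds ratio and the confidence interval mentioned above, the sketch below computes an unadjusted OR from a 2x2 table together with a 95% confidence interval on the log scale (often called Woolf's method). The counts are hypothetical, chosen only so that the OR equals 3.5; they are not taken from the article's table, and the function name is illustrative. A matched study would normally call for a matched analysis such as conditional logistic regression instead.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with an approximate 95% CI (log method).

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Hypothetical counts (assumption): 30 exposed and 20 unexposed cases,
# 15 exposed and 35 unexposed controls.
estimate, lower, upper = odds_ratio_with_ci(a=30, b=15, c=20, d=35)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

If the interval excludes 1, as it does for these illustrative counts, the association would conventionally be called statistically significant at the 5% level.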
Calculating an Odds Ratio (OR)

Hypothetical study of metabolic syndrome and psoriasis

Additional Points in A Case-Control Study
How many controls can I have for each case?
The optimal case-to-control ratio is 1:1. Jewell (2004) has suggested that, for a fixed sample size, the chi-square test for independence is most powerful if the number of cases is the same as the number of controls. However, in many situations we may not be able to recruit a large number of cases, and it may be easier to recruit more controls for the study. It has been suggested that we can increase the number of controls to increase the statistical power of the study (if we have a limited number of cases). If data are available at no extra cost, then we may recruit multiple controls for each case. However, if it is expensive to collect exposure and outcome information from cases and controls, then the optimal ratio is 4 controls to 1 case. It has been argued that the increase in statistical power from additional controls (beyond four per case) may be small compared with the cost involved in recruiting them.
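One way to see why gains flatten beyond about four controls per case is a commonly quoted approximation that is not stated in the article itself: with a fixed number of cases, the statistical efficiency of using k controls per case, relative to an unlimited number of controls, is roughly k / (k + 1). The sketch below tabulates that approximation; treat it as a rule of thumb under the assumption of a weak association, not as an exact power calculation.

```python
def relative_efficiency(controls_per_case):
    """Approximate efficiency of k controls per case relative to unlimited controls."""
    return controls_per_case / (controls_per_case + 1)


previous = 0.0
for k in range(1, 7):
    eff = relative_efficiency(k)
    # The gain column shows how little each extra control adds as k grows
    print(f"{k} control(s) per case: efficiency ~ {eff:.3f} (gain {eff - previous:+.3f})")
    previous = eff
```

Under this approximation, moving from one to four controls per case raises the relative efficiency from about 0.50 to 0.80, while each additional control beyond four adds only a few percentage points.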
I have conducted a randomised controlled trial. I have included a group which received the intervention and another group which did not receive the intervention. Can I call this a case-control study?
A randomised controlled trial is an experimental study. In contrast, case-control studies are observational studies. These are two different groups of studies. One should not use the term case-control study for a randomised controlled trial (even though you have a control group in the study). Not every study with a control group is a case-control study. For a study to be classified as a case-control study, the study should be an observational study and the participants should be recruited based on their outcome status (some have the disease and some do not).
Should I call case-control studies prospective or retrospective studies?
In ‘The Dictionary of Epidemiology’ by Porta (2014), the authors have suggested that even though the term ‘retrospective’ was used for case-control studies, the study participants are often recruited prospectively. In fact, the study on risk factors for erysipelas (Pitché et al., 2015) was a prospective case-control study. Thus, it is important to remember that the nature of the study (case-control or cohort) depends on the sampling method. If we sample the study participants based on exposure and move towards the outcome, it is a cohort study. However, if we sample the participants based on the outcome (some with the outcome and some without) and study the exposures in both these groups, it is a case-control study.
In case-control studies, participants are recruited on the basis of disease status. Thus, some of the participants have the outcome of interest (referred to as cases), whereas others do not have the outcome of interest (referred to as controls). The investigator then assesses the exposure in both these groups. Case-control studies are less expensive and quicker to conduct (compared with prospective cohort studies at least). The measure of association in this type of study is an odds ratio. This type of design is useful for rare outcomes and those with long latent periods. However, these studies may also be prone to certain biases – selection bias and recall bias.
Financial support and sponsorship
Conflicts of interest.
There are no conflicts of interest.
Bibliography
What Is A Case Control Study?
Julia Simkus
Saul Mcleod, PhD
A case-control study is a research method where two groups of people are compared – those with the condition (cases) and those without (controls). By looking at their past, researchers try to identify what factors might have contributed to the condition in the ‘case’ group.
A case-control study looks at people who already have a certain condition (cases) and people who don’t (controls). By comparing these two groups, researchers try to figure out what might have caused the condition. They look into the past to find clues, like habits or experiences, that are different between the two groups.
The “cases” are the individuals with the disease or condition under study, and the “controls” are similar individuals without the disease or condition of interest.
The controls should have similar characteristics (e.g., age, sex, demographics, health status) to the cases to mitigate the effects of confounding variables.
Case-control studies identify any associations between an exposure and an outcome and help researchers form hypotheses about a particular population.
Researchers will first identify the two groups, and then look back in time to investigate which subjects in each group were exposed to the suspected risk factor.
If the exposure is found more commonly in the cases than the controls, the researcher can hypothesize that the exposure may be linked to the outcome of interest.

Figure: Schematic diagram of case-control study design. Kenneth F. Schulz and David A. Grimes (2002) Case-control studies: research in reverse . The Lancet Volume 359, Issue 9304, 431 – 434
Quick, inexpensive, and simple
Because these studies use already existing data and do not require any follow-up with subjects, they tend to be quicker and cheaper than other types of research. Case-control studies also do not require large sample sizes.
Beneficial for studying rare diseases
Researchers in case-control studies start with a population of people known to have the target disease instead of following a population and waiting to see who develops it. This enables researchers to identify current cases and enroll a sufficient number of patients with a particular rare disease.
Useful for preliminary research
Case-control studies are beneficial for an initial investigation of a suspected risk factor for a condition. The information obtained from case-control studies then enables researchers to conduct further data analyses to explore any relationships in more depth.
Limitations
Subject to recall bias.
Participants might be unable to remember when they were exposed or omit other details that are important for the study. In addition, those with the outcome are more likely to recall and report exposures more clearly than those without the outcome.
Difficulty finding a suitable control group
It is important that the case group and the control group have almost the same characteristics, such as age, gender, demographics, and health status.
Forming an accurate control group can be challenging, so sometimes researchers enroll multiple control groups to bolster the strength of the case-control study.
Do not demonstrate causation
Case-control studies may show an association between exposures and outcomes, but they cannot demonstrate causation.
A case-control study is an observational study in which researchers analyze two groups of people (cases and controls) to look at factors associated with particular diseases or outcomes.
Below are some examples of case-control studies:
- Investigating the impact of exposure to daylight on the health of office workers (Boubekri et al., 2014).
- Comparing serum vitamin D levels in individuals who experience migraine headaches with their matched controls (Togha et al., 2018).
- Analyzing correlations between parental smoking and childhood asthma (Strachan and Cook, 1998).
- Studying the relationship between elevated concentrations of homocysteine and an increased risk of vascular diseases (Ford et al., 2002).
- Assessing the magnitude of the association between Helicobacter pylori and the incidence of gastric cancer (Helicobacter and Cancer Collaborative Group, 2001).
- Evaluating the association between breast cancer risk and saturated fat intake in postmenopausal women (Howe et al., 1990).
Frequently asked questions
1. What’s the difference between a case-control study and a cross-sectional study?
Case-control studies differ from cross-sectional studies in that case-control studies compare groups retrospectively, while cross-sectional studies analyze information about a population at a specific point in time.
In cross-sectional studies, researchers simply examine a group of participants and describe what already exists in the population.
2. What’s the difference between a case-control study and a longitudinal study?
Case-control studies compare groups retrospectively, while longitudinal studies can compare groups either retrospectively or prospectively.
In a longitudinal study, researchers monitor a population over an extended period of time. Longitudinal studies can be used to study developmental shifts and understand how certain things change as we age.
In addition, case-control studies assess each participant once, after the outcome has already occurred, whereas longitudinal studies take repeated measurements of the same subjects over time.
3. What’s the difference between a case-control study and a retrospective cohort study?
Case-control studies are retrospective as researchers begin with an outcome and trace backward to investigate exposure; however, they differ from retrospective cohort studies.
In a retrospective cohort study, researchers use existing records to examine a group as it was before any of the subjects had developed the disease, then examine any factors that differed between the individuals who developed the condition and those who did not.
Thus, retrospective cohort studies start from the exposure and then determine the outcome, whereas case-control studies start from the outcome and then look back to determine the exposure.
Boubekri, M., Cheung, I., Reid, K., Wang, C., & Zee, P. (2014). Impact of windows and daylight exposure on overall health and sleep quality of office workers: a case-control pilot study. Journal of Clinical Sleep Medicine: JCSM: Official Publication of the American Academy of Sleep Medicine, 10 (6), 603-611.
Ford, E. S., Smith, S. J., Stroup, D. F., Steinberg, K. K., Mueller, P. W., & Thacker, S. B. (2002). Homocyst(e)ine and cardiovascular disease: a systematic review of the evidence with special emphasis on case-control studies and nested case-control studies. International Journal of Epidemiology, 31 (1), 59-70.
Helicobacter and Cancer Collaborative Group. (2001). Gastric cancer and Helicobacter pylori: a combined analysis of 12 case control studies nested within prospective cohorts. Gut, 49 (3), 347-353.
Howe, G. R., Hirohata, T., Hislop, T. G., Iscovich, J. M., Yuan, J. M., Katsouyanni, K., … & Shunzhang, Y. (1990). Dietary factors and risk of breast cancer: combined analysis of 12 case-control studies. JNCI: Journal of the National Cancer Institute, 82 (7), 561-569.
Lewallen, S., & Courtright, P. (1998). Epidemiology in practice: case-control studies. Community Eye Health, 11 (28), 57-58.
Strachan, D. P., & Cook, D. G. (1998). Parental smoking and childhood asthma: longitudinal and case-control studies. Thorax, 53 (3), 204-212.
Tenny, S., Kerndt, C. C., & Hoffman, M. R. (2021). Case Control Studies. In StatPearls. StatPearls Publishing.
Togha, M., Razeghi Jahromi, S., Ghorbani, Z., Martami, F., & Seifishahpar, M. (2018). Serum Vitamin D Status in a Group of Migraine Patients Compared With Healthy Controls: A Case-Control Study. Headache, 58 (10), 1530-1540.
Further Information
Schulz, K. F., & Grimes, D. A. (2002). Case-control studies: research in reverse. The Lancet, 359(9304), 431-434.
Case Report vs Case-Control Study: A Simple Explanation
A case report is the description of the clinical story of a single patient, whereas a case-control study compares 2 groups of participants differing in outcome in order to determine if a suspected exposure in their past caused that difference.
Here’s the evidence pyramid showing the level of evidence for different study designs:

Cohort Study vs Case-Control: Pros, Cons, and Differences
Case-Control Studies
A case-control study is a research design in which two existing groups that differ in outcome are identified and compared on the basis of some hypothesized causal characteristic.
In case-control research, subjects are chosen based on disease status and assessed for previous exposure to a risk factor of interest. “Cases” are those determined to have the disease or outcome of interest. “Controls” are free from the disease or outcome of interest. The exposure data can come from a variety of sources, such as existing data in a medical record or surveys of the participants.
You can think of this as a “flashback” study: it looks back in time to assess past characteristics or exposures in two groups of people, cases and controls.
Case-Control Study Design
At the present time, cases and controls are identified and past exposures are measured. The study determines the odds of having the exposure among cases and among controls, and then compares these groups to determine the association between the exposure and the outcome. Unbiased selection of cases and controls is crucial to this research design, because selection bias poses a substantial threat to the validity of the findings.
In other words, case-control studies measure the odds of having an exposure or characteristic in the case and control populations. These odds are then compared using the odds ratio, a measure of association.
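The odds ratio described above can be sketched in a few lines. The function name `odds_ratio` and the counts below are hypothetical and chosen only for illustration; this is a minimal sketch of the standard 2x2 calculation, not the method of any particular study.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from a 2x2 table of exposure by case/control status."""
    if 0 in (unexposed_cases, exposed_controls, unexposed_controls):
        raise ValueError("a zero cell makes the odds ratio undefined")
    odds_in_cases = exposed_cases / unexposed_cases          # odds of exposure among cases
    odds_in_controls = exposed_controls / unexposed_controls  # odds of exposure among controls
    return odds_in_cases / odds_in_controls


# Made-up counts: 40 of 100 cases and 20 of 100 controls were exposed.
or_estimate = odds_ratio(
    exposed_cases=40, unexposed_cases=60,
    exposed_controls=20, unexposed_controls=80,
)
print(round(or_estimate, 2))  # 2.67
```

An odds ratio above 1 indicates that exposure is more common among cases than among controls; on its own it does not show that the exposure caused the outcome.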
Cohort Study
A cohort study, described here in its prospective form, is a research design in which study subjects are disease-free at enrollment and are chosen based on exposure status. Unexposed and exposed groups are followed for the same amount of time to determine who develops the disease of interest. This design helps support a causal relation between exposure and outcome: since the exposure is established first and the potential effect is captured prospectively, the temporal association is actually witnessed during the course of the study.
You can think of it as a “motion picture” study: it follows groups of disease-free individuals through a period of time to determine whether the disease develops.
Cohort Study Design
At the present time, exposed and unexposed people are recruited into the study; they are then followed prospectively to see whether they develop the outcome or disease of interest. The exposed and unexposed groups are then compared to determine the association with the outcome. Data from cohort studies are used to calculate the risk or rate of the health characteristic or disease.
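The risk comparison described above can be sketched as follows. The helper names `risk` and `risk_ratio` and all counts are hypothetical; the ratio of the two risks is one common way to compare the groups, shown here only as an illustration.

```python
def risk(events, total):
    """Proportion of a group that develops the outcome during follow-up."""
    if total <= 0:
        raise ValueError("group size must be positive")
    return events / total


def risk_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    unexposed_risk = risk(unexposed_events, unexposed_total)
    if unexposed_risk == 0:
        raise ValueError("risk ratio is undefined when the unexposed risk is zero")
    return risk(exposed_events, exposed_total) / unexposed_risk


# Made-up follow-up results: 30 of 1,000 exposed and 10 of 1,000 unexposed
# participants developed the disease.
rr = risk_ratio(30, 1000, 10, 1000)
print(round(rr, 2))  # 3.0
```

Because a cohort study follows whole groups forward, it can estimate these risks directly, which a case-control study usually cannot.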
Pros and Cons
Strengths of Case-Control Studies:
- Can be done relatively quickly;
- Can be relatively inexpensive;
- Good for diseases with short or long latency/duration;
- Good for rare diseases.
Weaknesses of Case-Control Studies:
- Inefficient for rare exposures;
- Usually unable to determine the prevalence or incidence in the population;
- More prone to bias, especially selection and recall bias.
Strengths of Cohort Studies:
- Good for rare exposures;
- Exposure clearly precedes disease;
- Can examine multiple effects of an exposure;
- Able to determine incidence of disease in population.
Weaknesses of Cohort Studies:
- Can be expensive;
- Inefficient for rare diseases;
- Once begun, difficult to examine other study factors (exposures);
- The prospective study may take a long time if the disease has a long latency.
ORIGINAL RESEARCH article
This article is part of the research topic.
Innovative Approaches In The Management Of Bone and Joint Infection, Volume II
A clinical prediction model for predicting the surgical site infection after Open Reduction and Internal Fixation (ORIF) procedure considering the NHSN/SIR risk model: A multicenter case-control study Type of study: Case-control
- 1 Infectious Diseases and Tropical Medicine Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- 2 Department of Epidemiology, School of Public Health and Safety, Shahid Beheshti University of Medical Sciences, Iran
- 3 Department of Infectious Diseases, School of Medicine, Tehran University of Medical Sciences, Iran
- 4 Prevention of Cardiovascular Disease Research Center, Department of Epidemiology, School of Public Health and Safety, Shahid Beheshti University of Medical Sciences, Tehran, Iran
The final, formatted version of the article will be published soon.
Introduction: Surgical Site Infection (SSI) is one of the most common surgery-related problems in the world, especially in developing countries. SSI is responsible for mortality, long hospitalization periods, and a high economic burden.
Method: This hospital-based case-control study was conducted in six educational hospitals in Tehran, Iran. In total, 244 patients aged 18-85 years who had ORIF surgery were selected. Of those, 122 patients who developed SSIs were compared with 122 non-infected controls. In the second stage, all patients (n = 350) who underwent ORIF surgery in one hospital were selected for estimation of the Standardized Infection Ratio (SIR). A logistic regression model was used to identify the most important factors associated with the occurrence of SSIs. Finally, the performance of the ORIF prediction model was evaluated using discrimination and calibration indices. Data were analyzed using R 3.6.2 and Stata 14.
Results: Klebsiella (14.75%) was the most frequently detected bacterium in SSIs following ORIF surgery. The most important factors associated with SSI following the ORIF procedure were older age, elective surgery, prolonged operation time, ASA score ≥ 2, class 3 and 4 wounds, and preoperative blood glucose levels of > 200 mg/dL, while higher preoperative haemoglobin (g/dL) was a protective factor. Evidence of interaction effects between age and gender, BMI and gender, and age and elective surgery was also observed. After assessment of internal validity, the overall performance of the model was good, with an area under the curve of 95%. The SIR of SSI for ORIF surgery in the selected hospital was 0.66 among patients aged 18-85 years.
New risk prediction models can help to detect high-risk patients and monitor the infection rate in hospitals based on infection prevention and control programs. Physicians using prediction models can recognize high-risk patients with these factors before the ORIF procedure.
Keywords: Surgical site infection, ORIF surgery, Prediction model, Standardized infection ratio, Nosocomial infection
Received: 18 Mar 2023; Accepted: 31 Aug 2023.
Copyright: © 2023 Taherpour, Mehrabi, Seifi and Hashemi-Nazari. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Dr. Seyed Saeed Hashemi-Nazari, Prevention of Cardiovascular Disease Research Center, Department of Epidemiology, School of Public Health and Safety, Shahid Beheshti University of Medical Sciences, Tehran, Iran