
About Systematic Reviews

Summary of Findings Table in a Systematic Review


What Is A Summary Of Findings Table?

Cochrane defines the “summary of findings table” as a structured tabular format that presents the primary findings of a review, particularly information related to the quality of evidence, the magnitude of the effects of the studied interventions, and the aggregate of available data on the main outcomes. It includes multiple pieces of data derived from both quantitative and qualitative data analysis in systematic reviews. These include information about the main outcomes, the type and number of studies included, the estimates (both relative and absolute) of the effect or association, and important comments about the review, all written in plain language so that the table is easily interpreted. It also includes a grade of the quality of evidence; i.e., a rating of its certainty.

Most systematic reviews are expected to have one summary of findings table. However, a review may have more than one if it addresses more than one comparison or deals with substantially different populations that require separate tables. The studies in a table can also be grouped by intervention type, type of outcome measure, type of participants, study design, etc.

How Do You Make A Summary Of Findings Table For A Systematic Review?


What Does A Summary Of Findings Table Include?

A summary of findings table typically includes the following information:

  • A description of the population and setting addressed by the available evidence
  • A description of comparisons addressed in the table, including all interventions
  • A list of the most important outcomes, whether desirable or undesirable (limited to seven)
  • A measure of the burdens of each outcome
  • The magnitude of effect measured for each outcome (both absolute and relative)
  • The participants and studies analyzed for each outcome
  • An assessment of the certainty of the evidence for each outcome (typically using GRADE)
  • Explanations

It’s best to include evidence profiles, i.e. additional tables that support the data in the summary of findings, to which the review may be linked. It can also be useful to have a study descriptor table that is separate from the results table. The study descriptor table shows information about the characteristics of included studies, like study design, study region, participant information, etc. The results table mostly contains outcomes, outcome measures, study results, etc. These can help provide readers with more context about the review and its conclusions.

Final Takeaway


Cochrane Training

Chapter 15: Interpreting results and drawing conclusions

Holger J Schünemann, Gunn E Vist, Julian PT Higgins, Nancy Santesso, Jonathan J Deeks, Paul Glasziou, Elie A Akl, Gordon H Guyatt; on behalf of the Cochrane GRADEing Methods Group

Key Points:

  • This chapter provides guidance on interpreting the results of synthesis in order to communicate the conclusions of the review effectively.
  • Methods are presented for computing, presenting and interpreting relative and absolute effects for dichotomous outcome data, including the number needed to treat (NNT).
  • For continuous outcome measures, review authors can present summary results for studies using natural units of measurement or as minimal important differences when all studies use the same scale. When studies measure the same construct but with different scales, review authors will need to find a way to interpret the standardized mean difference, or to use an alternative effect measure for the meta-analysis such as the ratio of means.
  • Review authors should not describe results as ‘statistically significant’, ‘not statistically significant’ or ‘non-significant’ or unduly rely on thresholds for P values, but report the confidence interval together with the exact P value.
  • Review authors should not make recommendations about healthcare decisions, but they can – after describing the certainty of evidence and the balance of benefits and harms – highlight different actions that might be consistent with particular patterns of values and preferences and other factors that determine a decision such as cost.

Cite this chapter as: Schünemann HJ, Vist GE, Higgins JPT, Santesso N, Deeks JJ, Glasziou P, Akl EA, Guyatt GH. Chapter 15: Interpreting results and drawing conclusions. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook.

15.1 Introduction

The purpose of Cochrane Reviews is to facilitate healthcare decisions by patients and the general public, clinicians, guideline developers, administrators and policy makers. They also inform future research. A clear statement of findings, a considered discussion and a clear presentation of the authors’ conclusions are, therefore, important parts of the review. In particular, the following issues can help people make better informed decisions and increase the usability of Cochrane Reviews:

  • information on all important outcomes, including adverse outcomes;
  • the certainty of the evidence for each of these outcomes, as it applies to specific populations and specific interventions; and
  • clarification of the manner in which particular values and preferences may bear on the desirable and undesirable consequences of the intervention.

A ‘Summary of findings’ table, described in Chapter 14, Section 14.1, provides key pieces of information about health benefits and harms in a quick and accessible format. It is highly desirable that review authors include a ‘Summary of findings’ table in Cochrane Reviews alongside a sufficient description of the studies and meta-analyses to support its contents. This description includes the rating of the certainty of evidence, also called the quality of the evidence or confidence in the estimates of the effects, which is expected in all Cochrane Reviews.

‘Summary of findings’ tables are usually supported by full evidence profiles which include the detailed ratings of the evidence (Guyatt et al 2011a, Guyatt et al 2013a, Guyatt et al 2013b, Santesso et al 2016). The Discussion section of the text of the review provides space to reflect and consider the implications of these aspects of the review’s findings. Cochrane Reviews include five standard subheadings to ensure the Discussion section places the review in an appropriate context: ‘Summary of main results (benefits and harms)’; ‘Potential biases in the review process’; ‘Overall completeness and applicability of evidence’; ‘Certainty of the evidence’; and ‘Agreements and disagreements with other studies or reviews’. Following the Discussion, the Authors’ conclusions section is divided into two standard subsections: ‘Implications for practice’ and ‘Implications for research’. The assessment of the certainty of evidence facilitates a structured description of the implications for practice and research.

Because Cochrane Reviews have an international audience, the Discussion and Authors’ conclusions should, so far as possible, assume a broad international perspective and provide guidance for how the results could be applied in different settings, rather than being restricted to specific national or local circumstances. Cultural differences and economic differences may both play an important role in determining the best course of action based on the results of a Cochrane Review. Furthermore, individuals within societies have widely varying values and preferences regarding health states, and use of societal resources to achieve particular health states. For all these reasons, and because information that goes beyond that included in a Cochrane Review is required to make fully informed decisions, different people will often make different decisions based on the same evidence presented in a review.

Thus, review authors should avoid specific recommendations that inevitably depend on assumptions about available resources, values and preferences, and other factors such as equity considerations, feasibility and acceptability of an intervention. The purpose of the review should be to present information and aid interpretation rather than to offer recommendations. The discussion and conclusions should help people understand the implications of the evidence in relation to practical decisions and apply the results to their specific situation. Review authors can aid this understanding of the implications by laying out different scenarios that describe certain value structures.

In this chapter, we address first one of the key aspects of interpreting findings that is also fundamental in completing a ‘Summary of findings’ table: the certainty of evidence related to each of the outcomes. We then provide a more detailed consideration of issues around applicability and around interpretation of numerical results, and provide suggestions for presenting authors’ conclusions.

15.2 Issues of indirectness and applicability

15.2.1 The role of the review author

“A leap of faith is always required when applying any study findings to the population at large” or to a specific person. “In making that jump, one must always strike a balance between making justifiable broad generalizations and being too conservative in one’s conclusions” (Friedman et al 1985). In addition to issues about risk of bias and other domains determining the certainty of evidence, this leap of faith is related to how well the identified body of evidence matches the posed PICO (Population, Intervention, Comparator(s) and Outcome) question. As to the population, no individual can be entirely matched to the population included in research studies. At the time of decision, there will always be differences between the study population and the person or population to whom the evidence is applied; sometimes these differences are slight, sometimes large.

The terms applicability, generalizability, external validity and transferability are related, sometimes used interchangeably and have in common that they lack a clear and consistent definition in the classic epidemiological literature (Schünemann et al 2013). However, all of the terms describe one overarching theme: whether or not available research evidence can be directly used to answer the health and healthcare question at hand, ideally supported by a judgement about the degree of confidence in this use (Schünemann et al 2013). GRADE’s certainty domains include a judgement about ‘indirectness’ to describe all of these aspects including the concept of direct versus indirect comparisons of different interventions (Atkins et al 2004, Guyatt et al 2008, Guyatt et al 2011b).

To address adequately the extent to which a review is relevant for the purpose to which it is being put, there are certain things the review author must do, and certain things the user of the review must do to assess the degree of indirectness. Cochrane and the GRADE Working Group suggest using a very structured framework to address indirectness. We discuss here and in Chapter 14 what the review author can do to help the user. Cochrane Review authors must be extremely clear on the population, intervention and outcomes that they intend to address. Chapter 14, Section 14.1.2, also emphasizes a crucial step: the specification of all patient-important outcomes relevant to the intervention strategies under comparison.

In considering whether the effect of an intervention applies equally to all participants, and whether different variations on the intervention have similar effects, review authors need to make a priori hypotheses about possible effect modifiers, and then examine those hypotheses (see Chapter 10, Section 10.10 and Section 10.11). If they find apparent subgroup effects, they must ultimately decide whether or not these effects are credible (Sun et al 2012). Differences between subgroups, particularly those that correspond to differences between studies, should be interpreted cautiously. Some chance variation between subgroups is inevitable so, unless there is good reason to believe that there is an interaction, review authors should not assume that the subgroup effect exists. If, despite due caution, review authors judge subgroup effects in terms of relative effect estimates as credible (i.e. the effects differ credibly), they should conduct separate meta-analyses for the relevant subgroups, and produce separate ‘Summary of findings’ tables for those subgroups.

The user of the review will be challenged with ‘individualization’ of the findings, whether they seek to apply the findings to an individual patient or a policy decision in a specific context. For example, even if relative effects are similar across subgroups, absolute effects will differ according to baseline risk. Review authors can help provide this information by identifying groups of people with varying baseline risks in the ‘Summary of findings’ tables, as discussed in Chapter 14, Section 14.1.3. Users can then identify their specific case or population as belonging to a particular risk group, if relevant, and assess their likely magnitude of benefit or harm accordingly. A description of the identifying prognostic or baseline risk factors in a brief scenario (e.g. age or gender) will help users of a review further.

Another decision users must make is whether their individual case or population of interest is so different from those included in the studies that they cannot use the results of the systematic review and meta-analysis at all. Rather than rigidly applying the inclusion and exclusion criteria of studies, it is better to ask whether or not there are compelling reasons why the evidence should not be applied to a particular patient. Review authors can sometimes help decision makers by identifying important variation where divergence might limit the applicability of results (Rothwell 2005, Schünemann et al 2006, Guyatt et al 2011b, Schünemann et al 2013), including biologic and cultural variation, and variation in adherence to an intervention.

In addressing these issues, review authors cannot be aware of, or address, the myriad of differences in circumstances around the world. They can, however, address differences of known importance to many people and, importantly, they should avoid assuming that other people’s circumstances are the same as their own in discussing the results and drawing conclusions.

15.2.2 Biological variation

Issues of biological variation that may affect the applicability of a result to a reader or population include divergence in pathophysiology (e.g. biological differences between women and men that may affect responsiveness to an intervention) and divergence in a causative agent (e.g. for infectious diseases such as malaria, which may be caused by several different parasites). The discussion of the results in the review should make clear whether the included studies addressed all or only some of these groups, and whether any important subgroup effects were found.

15.2.3 Variation in context

Some interventions, particularly non-pharmacological interventions, may work in some contexts but not in others; the situation has been described as program by context interaction (Hawe et al 2004). Contextual factors might pertain to the host organization in which an intervention is offered, such as the expertise, experience and morale of the staff expected to carry out the intervention, the competing priorities for the clinician’s or staff’s attention, the local resources such as service and facilities made available to the program and the status or importance given to the program by the host organization. Broader context issues might include aspects of the system within which the host organization operates, such as the fee or payment structure for healthcare providers and the local insurance system. Some interventions, in particular complex interventions (see Chapter 17), can be only partially implemented in some contexts, and this requires judgements about indirectness of the intervention and its components for readers in that context (Schünemann 2013).

Contextual factors may also pertain to the characteristics of the target group or population, such as cultural and linguistic diversity, socio-economic position, rural/urban setting. These factors may mean that a particular style of care or relationship evolves between service providers and consumers that may or may not match the values and technology of the program.

For many years these aspects have been acknowledged when decision makers have argued that results of evidence reviews from other countries do not apply in their own country or setting. Whilst some programmes/interventions have been successfully transferred from one context to another, others have not (Resnicow et al 1993, Lumley et al 2004, Coleman et al 2015). Review authors should be cautious when making generalizations from one context to another. They should report on the presence (or otherwise) of context-related information in intervention studies, where this information is available.

15.2.4 Variation in adherence

Variation in the adherence of the recipients and providers of care can limit the certainty in the applicability of results. Predictable differences in adherence can be due to divergence in how recipients of care perceive the intervention (e.g. the importance of side effects), economic conditions or attitudes that make some forms of care inaccessible in some settings, such as in low-income countries (Dans et al 2007). It should not be assumed that high levels of adherence in closely monitored randomized trials will translate into similar levels of adherence in normal practice.

15.2.5 Variation in values and preferences

Decisions about healthcare management strategies and options involve trading off health benefits and harms. The right choice may differ for people with different values and preferences (i.e. the importance people place on the outcomes and interventions), and it is important that decision makers ensure that decisions are consistent with a patient or population’s values and preferences. The importance placed on outcomes, together with other factors, will influence whether the recipients of care will or will not accept an option that is offered (Alonso-Coello et al 2016) and, thus, can be one factor influencing adherence. In Section 15.6, we describe how the review author can help this process and the limits of supporting decision making based on intervention reviews.

15.3 Interpreting results of statistical analyses

15.3.1 Confidence intervals

Results for both individual studies and meta-analyses are reported with a point estimate together with an associated confidence interval. For example, ‘The odds ratio was 0.75 with a 95% confidence interval of 0.70 to 0.80’. The point estimate (0.75) is the best estimate of the magnitude and direction of the experimental intervention’s effect compared with the comparator intervention. The confidence interval describes the uncertainty inherent in any estimate, and describes a range of values within which we can be reasonably sure that the true effect actually lies. If the confidence interval is relatively narrow (e.g. 0.70 to 0.80), the effect size is known precisely. If the interval is wider (e.g. 0.60 to 0.93) the uncertainty is greater, although there may still be enough precision to make decisions about the utility of the intervention. Intervals that are very wide (e.g. 0.50 to 1.10) indicate that we have little knowledge about the effect and this imprecision affects our certainty in the evidence, and that further information would be needed before we could draw a more certain conclusion.

A 95% confidence interval is often interpreted as indicating a range within which we can be 95% certain that the true effect lies. This statement is a loose interpretation, but is useful as a rough guide. The strictly correct interpretation of a confidence interval is based on the hypothetical notion of considering the results that would be obtained if the study were repeated many times. If a study were repeated infinitely often, and on each occasion a 95% confidence interval calculated, then 95% of these intervals would contain the true effect (see Section 15.3.3 for further explanation).
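The repeated-sampling interpretation can be illustrated with a small simulation. The sketch below is not taken from the Handbook: it assumes a normally distributed outcome with a known standard deviation, and the true effect, sample size and number of repetitions are arbitrary illustrative values.

```python
import random
from statistics import NormalDist, mean

def coverage(true_effect=0.5, sd=1.0, n=50, repeats=2000, level=0.95, seed=1):
    """Fraction of simulated studies whose confidence interval contains the true effect."""
    rng = random.Random(seed)                    # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 for a 95% interval
    se = sd / n ** 0.5                           # standard error of the mean (sd assumed known)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_effect, sd) for _ in range(n)]
        estimate = mean(sample)
        lower, upper = estimate - z * se, estimate + z * se
        if lower <= true_effect <= upper:
            hits += 1
    return hits / repeats

observed = coverage()
print(f"Proportion of intervals containing the true effect: {observed:.3f}")
```

With a finite number of repetitions the proportion will be close to, but not exactly, 0.95.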

The width of the confidence interval for an individual study depends to a large extent on the sample size. Larger studies tend to give more precise estimates of effects (and hence have narrower confidence intervals) than smaller studies. For continuous outcomes, precision depends also on the variability in the outcome measurements (i.e. how widely individual results vary between people in the study, measured as the standard deviation); for dichotomous outcomes it depends on the risk of the event (more frequent events allow more precision, and narrower confidence intervals), and for time-to-event outcomes it also depends on the number of events observed. All these quantities are used in computation of the standard errors of effect estimates from which the confidence interval is derived.

The width of a confidence interval for a meta-analysis depends on the precision of the individual study estimates and on the number of studies combined. In addition, for random-effects models, precision will decrease with increasing heterogeneity and confidence intervals will widen correspondingly (see Chapter 10, Section 10.10.4). As more studies are added to a meta-analysis the width of the confidence interval usually decreases. However, if the additional studies increase the heterogeneity in the meta-analysis and a random-effects model is used, it is possible that the confidence interval width will increase.
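The effect of heterogeneity on interval width can be sketched in a few lines. The example below assumes inverse-variance pooling and, for the random-effects model, the DerSimonian-Laird estimate of between-study variance; this section does not say which estimator is intended, so that choice is an assumption, and the four study results are hypothetical.

```python
from statistics import NormalDist

def pool(estimates, std_errors, random_effects=False, level=0.95):
    """Inverse-variance pooled estimate and CI on the analysis scale (e.g. log odds ratio)."""
    w = [1 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    if random_effects:
        # DerSimonian-Laird moment estimate of the between-study variance (tau squared)
        q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (len(estimates) - 1)) / c)
        # Random-effects weights add tau squared to each study's variance
        w = [1 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    se_pooled = (1 / sum(w)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled

# Hypothetical log odds ratios and standard errors from four heterogeneous studies
log_ors = [-0.60, -0.10, -0.45, 0.05]
ses = [0.12, 0.10, 0.15, 0.11]

_, f_lo, f_hi = pool(log_ors, ses)
_, r_lo, r_hi = pool(log_ors, ses, random_effects=True)
fixed_width = f_hi - f_lo
random_width = r_hi - r_lo
print(f"Fixed-effect CI width:   {fixed_width:.3f}")
print(f"Random-effects CI width: {random_width:.3f}")
```

When the study estimates agree closely, the between-study variance estimate is zero and the two intervals coincide.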

Confidence intervals and point estimates have different interpretations in fixed-effect and random-effects models. While the fixed-effect estimate and its confidence interval address the question ‘what is the best (single) estimate of the effect?’, the random-effects estimate assumes there to be a distribution of effects, and the estimate and its confidence interval address the question ‘what is the best estimate of the average effect?’ A confidence interval may be reported for any level of confidence (although they are most commonly reported for 95%, and sometimes 90% or 99%). For example, the odds ratio of 0.80 could be reported with an 80% confidence interval of 0.73 to 0.88; a 90% interval of 0.72 to 0.89; and a 95% interval of 0.70 to 0.92. As the confidence level increases, the confidence interval widens.
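The widening of the interval with the confidence level can be reproduced approximately. The sketch below assumes a normal approximation on the log odds ratio scale and recovers the standard error from the 95% interval quoted above; the function name is illustrative. Because the quoted limits are rounded, the results may differ from the text in the second decimal place.

```python
from math import exp, log
from statistics import NormalDist

def ci_at_level(point, lower95, upper95, level):
    """Re-express the 95% CI of a ratio measure at another confidence level.

    Assumes the interval is symmetric on the log scale (normal approximation).
    """
    z95 = NormalDist().inv_cdf(0.975)
    se = (log(upper95) - log(lower95)) / (2 * z95)   # standard error of log(OR)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return exp(log(point) - z * se), exp(log(point) + z * se)

for level in (0.80, 0.90, 0.95, 0.99):
    lo, hi = ci_at_level(0.80, 0.70, 0.92, level)
    print(f"{int(level * 100)}% CI: {lo:.2f} to {hi:.2f}")
```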

There is logical correspondence between the confidence interval and the P value (see Section 15.3.3). The 95% confidence interval for an effect will exclude the null value (such as an odds ratio of 1.0 or a risk difference of 0) if and only if the test of significance yields a P value of less than 0.05. If the P value is exactly 0.05, then either the upper or lower limit of the 95% confidence interval will be at the null value. Similarly, the 99% confidence interval will exclude the null if and only if the test of significance yields a P value of less than 0.01.
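This correspondence can be checked numerically. The sketch below assumes a ratio measure whose interval was built with a normal approximation on the log scale, and back-calculates an approximate two-sided P value from the 95% limits. The second interval (0.60 to 1.07) is a hypothetical example chosen to include the null value.

```python
from math import log
from statistics import NormalDist

def p_from_ci(point, lower95, upper95, null=1.0):
    """Approximate two-sided P value for a ratio measure from its 95% CI."""
    z95 = NormalDist().inv_cdf(0.975)
    se = (log(upper95) - log(lower95)) / (2 * z95)   # standard error on the log scale
    z = (log(point) - log(null)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_excludes = p_from_ci(0.80, 0.70, 0.92)   # interval excludes an odds ratio of 1.0
p_includes = p_from_ci(0.80, 0.60, 1.07)   # hypothetical interval that includes 1.0
print(f"Interval excluding the null: P = {p_excludes:.4f}")
print(f"Interval including the null: P = {p_includes:.4f}")
```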

Together, the point estimate and confidence interval provide information to assess the effects of the intervention on the outcome. For example, suppose that we are evaluating an intervention that reduces the risk of an event and we decide that it would be useful only if it reduced the risk of an event from 30% by at least 5 percentage points to 25% (these values will depend on the specific clinical scenario and outcomes, including the anticipated harms). If the meta-analysis yielded an effect estimate of a reduction of 10 percentage points with a tight 95% confidence interval, say, from 7% to 13%, we would be able to conclude that the intervention was useful since both the point estimate and the entire range of the interval exceed our criterion of a reduction of 5% for net health benefit. However, if the meta-analysis reported the same risk reduction of 10% but with a wider interval, say, from 2% to 18%, although we would still conclude that our best estimate of the intervention effect is that it provides net benefit, we could not be so confident as we still entertain the possibility that the effect could be between 2% and 5%. If the confidence interval was wider still, and included the null value of a difference of 0%, we would still consider the possibility that the intervention has no effect on the outcome whatsoever, and would need to be even more sceptical in our conclusions.

Review authors may use the same general approach to conclude that an intervention is not useful. Continuing with the above example where the criterion for an important difference that should be achieved to provide more benefit than harm is a 5% risk difference, an effect estimate of 2% with a 95% confidence interval of 1% to 4% suggests that the intervention does not provide net health benefit.
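The reasoning in the two paragraphs above amounts to comparing the whole confidence interval with the decision threshold. The sketch below expresses that comparison for an absolute risk reduction in percentage points; the function name and the wording of the returned labels are illustrative and not part of the Handbook.

```python
def interpret(ci_lower, ci_upper, threshold=5.0):
    """Compare a CI for absolute risk reduction (percentage points) with a decision threshold."""
    if ci_lower >= threshold:
        return "useful: the whole interval meets the threshold"
    if ci_upper < threshold:
        return "not useful: the whole interval falls below the threshold"
    if ci_lower <= 0:
        return "uncertain: the interval includes no effect"
    return "uncertain: the interval includes effects below the threshold"

# Intervals taken from the worked example in the text, plus one that crosses zero
for interval in [(7, 13), (2, 18), (-2, 22), (1, 4)]:
    print(interval, "->", interpret(*interval))
```

The threshold itself remains a clinical judgement that depends on the scenario, the outcome and the anticipated harms.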

15.3.2 P values and statistical significance

A P value is the standard result of a statistical test, and is the probability of obtaining the observed effect (or larger) under a ‘null hypothesis’. In the context of Cochrane Reviews there are two commonly used statistical tests. The first is a test of overall effect (a Z-test), and its null hypothesis is that there is no overall effect of the experimental intervention compared with the comparator on the outcome of interest. The second is the Chi² test for heterogeneity, and its null hypothesis is that there are no differences in the intervention effects across studies.

A P value that is very small indicates that the observed effect is very unlikely to have arisen purely by chance, and therefore provides evidence against the null hypothesis. It has been common practice to interpret a P value by examining whether it is smaller than particular threshold values. In particular, P values less than 0.05 are often reported as ‘statistically significant’, and interpreted as being small enough to justify rejection of the null hypothesis. However, the 0.05 threshold is an arbitrary one that became commonly used in medical and psychological research largely because P values were determined by comparing the test statistic against tabulations of specific percentage points of statistical distributions. If review authors decide to present a P value with the results of a meta-analysis, they should report a precise P value (as calculated by most statistical software), together with the 95% confidence interval. Review authors should not describe results as ‘statistically significant’, ‘not statistically significant’ or ‘non-significant’ or unduly rely on thresholds for P values, but report the confidence interval together with the exact P value (see MECIR Box 15.3.a).

We discuss interpretation of the test for heterogeneity in Chapter 10, Section 10.10.2; the remainder of this section refers mainly to tests for an overall effect. For tests of an overall effect, the computation of P involves both the effect estimate and precision of the effect estimate (driven largely by sample size). As precision increases, the range of plausible effects that could occur by chance is reduced. Correspondingly, the statistical significance of an effect of a particular magnitude will usually be greater (the P value will be smaller) in a larger study than in a smaller study.
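The dependence of P on sample size can be seen by holding the effect fixed and changing only the number of participants. The sketch below assumes a simple Z-test on the risk difference with equal arm sizes and an unpooled standard error; the risks (25% versus 30%) and the arm sizes are hypothetical.

```python
from statistics import NormalDist

def risk_difference_p(risk_exp, risk_comp, n_per_arm):
    """Two-sided P value from a Z-test on the risk difference (normal approximation)."""
    diff = risk_exp - risk_comp
    # Unpooled standard error of the difference between two proportions
    se = (risk_exp * (1 - risk_exp) / n_per_arm
          + risk_comp * (1 - risk_comp) / n_per_arm) ** 0.5
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_small = risk_difference_p(0.25, 0.30, n_per_arm=100)
p_large = risk_difference_p(0.25, 0.30, n_per_arm=1000)
print(f"100 per arm:  P = {p_small:.3f}")
print(f"1000 per arm: P = {p_large:.3f}")
```

The observed risk difference is identical in both cases; only the precision, and therefore the P value, changes.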

P values are commonly misinterpreted in two ways. First, a moderate or large P value (e.g. greater than 0.05) may be misinterpreted as evidence that the intervention has no effect on the outcome. There is an important difference between this statement and the correct interpretation that there is a high probability that the observed effect on the outcome is due to chance alone. To avoid such a misinterpretation, review authors should always examine the effect estimate and its 95% confidence interval.

The second misinterpretation is to assume that a result with a small P value for the summary effect estimate implies that an experimental intervention has an important benefit. Such a misinterpretation is more likely to occur in large studies and meta-analyses that accumulate data over dozens of studies and thousands of participants. The P value addresses the question of whether the experimental intervention effect is precisely nil; it does not examine whether the effect is of a magnitude of importance to potential recipients of the intervention. In a large study, a small P value may represent the detection of a trivial effect that may not lead to net health benefit when compared with the potential harms (i.e. harmful effects on other important outcomes). Again, inspection of the point estimate and confidence interval helps correct interpretations (see Section 15.3.1).

MECIR Box 15.3.a Relevant expectations for conduct of intervention reviews

15.3.3 Relation between confidence intervals, statistical significance and certainty of evidence

The confidence interval (and imprecision) is only one domain that influences overall uncertainty about effect estimates. Uncertainty resulting from imprecision (i.e. statistical uncertainty) may be no less important than uncertainty from indirectness, or any other GRADE domain, in the context of decision making (Schünemann 2016). Thus, the extent to which interpretations of the confidence interval described in Sections 15.3.1 and 15.3.2 correspond to conclusions about overall certainty of the evidence for the outcome of interest depends on these other domains. If there are no concerns about other domains that determine the certainty of the evidence (i.e. risk of bias, inconsistency, indirectness or publication bias), then the interpretation in Sections 15.3.1 and 15.3.2 about the relation of the confidence interval to the true effect may be carried forward to the overall certainty. However, if there are concerns about the other domains that affect the certainty of the evidence, the interpretation about the true effect needs to be seen in the context of further uncertainty resulting from those concerns.

For example, nine randomized controlled trials in almost 6000 cancer patients indicated that the administration of heparin reduces the risk of venous thromboembolism (VTE), with a relative risk reduction of 43% (95% CI 19% to 60%) (Akl et al 2011a). For patients with a plausible baseline risk of approximately 4.6% per year, this relative effect suggests that heparin leads to an absolute risk reduction of 20 fewer VTEs (95% CI 9 fewer to 27 fewer) per 1000 people per year (Akl et al 2011a). Now consider that the review authors or those applying the evidence in a guideline have lowered the certainty in the evidence as a result of indirectness. While the confidence intervals would remain unchanged, the certainty in that confidence interval and in the point estimate as reflecting the truth for the question of interest will be lowered. In fact, the certainty range will have unknown width so there will be unknown likelihood of a result within that range because of this indirectness. The lower the certainty in the evidence, the less we know about the width of the certainty range, although methods for quantifying risk of bias and understanding potential direction of bias may offer insight when lowered certainty is due to risk of bias. Nevertheless, decision makers must consider this uncertainty, and must do so in relation to the effect measure that is being evaluated (e.g. a relative or absolute measure). We will describe the impact on interpretations for dichotomous outcomes in Section 15.4.
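The absolute figures in the heparin example follow from multiplying the baseline risk by the relative reduction. The sketch below treats the 43% (19% to 60%) as a relative risk reduction and uses the rounded inputs quoted above; the function name is illustrative. With these inputs the bound corresponding to the 60% reduction comes out at roughly 27.6 per 1000 rather than the 27 quoted, which presumably reflects rounding of the published inputs.

```python
def fewer_per_1000(baseline_risk, relative_risk_reduction):
    """Absolute risk reduction per 1000 people, given baseline risk and relative risk reduction."""
    return 1000 * baseline_risk * relative_risk_reduction

baseline = 0.046                            # about 4.6% per year
point = fewer_per_1000(baseline, 0.43)      # point estimate of the relative reduction
lower = fewer_per_1000(baseline, 0.19)      # smaller relative reduction (one CI limit)
upper = fewer_per_1000(baseline, 0.60)      # larger relative reduction (other CI limit)
print(f"{point:.1f} fewer per 1000 (95% CI {lower:.1f} fewer to {upper:.1f} fewer)")
```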

15.4 Interpreting results from dichotomous outcomes (including numbers needed to treat)

15.4.1 Relative and absolute risk reductions

Clinicians may be more inclined to prescribe an intervention that reduces the relative risk of death by 25% than one that reduces the risk of death by 1 percentage point, although both presentations of the evidence may relate to the same benefit (i.e. a reduction in risk from 4% to 3%). The former refers to the relative reduction in risk and the latter to the absolute reduction in risk. As described in Chapter 6, Section 6.4.1 , there are several measures for comparing dichotomous outcomes in two groups. Meta-analyses are usually undertaken using risk ratios (RR), odds ratios (OR) or risk differences (RD), but there are several alternative ways of expressing results.

Relative risk reduction (RRR) is a convenient way of re-expressing a risk ratio as a percentage reduction:

\[ \text{RRR} = 100\% \times (1 - \text{RR}) \]

For example, a risk ratio of 0.75 translates to a relative risk reduction of 25%, as in the example above.

The risk difference is often referred to as the absolute risk reduction (ARR) or absolute risk increase (ARI), and may be presented as a percentage (e.g. 1%), as a decimal (e.g. 0.01), or as a count (e.g. 10 out of 1000). We consider different choices for presenting absolute effects in Section 15.4.3. We then describe computations for obtaining these numbers from the results of individual studies and of meta-analyses in Section 15.4.4.
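The relationship between the relative and absolute presentations can be sketched in a few lines of Python. This is only an illustration of the definitions above; the function name `risk_reductions` and its dictionary keys are hypothetical and not part of any established tool.

```python
def risk_reductions(comparator_risk: float, intervention_risk: float) -> dict:
    """Express one pair of risks as relative and absolute measures."""
    risk_ratio = intervention_risk / comparator_risk
    # Risk difference = intervention risk minus comparator risk (negative means fewer events).
    risk_difference = intervention_risk - comparator_risk
    return {
        "risk_ratio": risk_ratio,
        "relative_risk_reduction_pct": 100 * (1 - risk_ratio),
        "absolute_risk_reduction": -risk_difference,
        "fewer_per_1000": -1000 * risk_difference,
    }


# The example from the text: the risk of death falls from 4% to 3%.
summary = risk_reductions(comparator_risk=0.04, intervention_risk=0.03)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The same benefit appears as a 25% relative reduction, a 0.01 absolute reduction, or 10 fewer events per 1000 people.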

15.4.2 Number needed to treat (NNT)

The number needed to treat (NNT) is a common alternative way of presenting information on the effect of an intervention. The NNT is defined as the expected number of people who need to receive the experimental rather than the comparator intervention for one additional person to either incur or avoid an event (depending on the direction of the result) in a given time frame. Thus, for example, an NNT of 10 can be interpreted as ‘it is expected that one additional (or one fewer) person will incur an event for every 10 participants receiving the experimental intervention rather than comparator over a given time frame’. It is important to be clear that:

  • since the NNT is derived from the risk difference, it is still a comparative measure of effect (experimental versus a specific comparator) and not a general property of a single intervention; and
  • the NNT gives an ‘expected value’. For example, NNT = 10 does not imply that one additional event will occur in each and every group of 10 people.

NNTs can be computed for both beneficial and detrimental events, and for interventions that cause both improvements and deteriorations in outcomes. In all instances NNTs are expressed as positive whole numbers. Some authors use the term ‘number needed to harm’ (NNH) when an intervention leads to an adverse outcome, or a decrease in a positive outcome, rather than improvement. However, this phrase can be misleading (most notably, it can easily be read to imply the number of people who will experience a harmful outcome if given the intervention), and it is strongly recommended that ‘number needed to harm’ and ‘NNH’ are avoided. The preferred alternative is to use phrases such as ‘number needed to treat for an additional beneficial outcome’ (NNTB) and ‘number needed to treat for an additional harmful outcome’ (NNTH) to indicate direction of effect.

As NNTs refer to events, their interpretation needs to be worded carefully when the binary outcome is a dichotomization of a scale-based outcome. For example, if the outcome is pain measured on a ‘none, mild, moderate or severe’ scale it may have been dichotomized as ‘none or mild’ versus ‘moderate or severe’. It would be inappropriate for an NNT from these data to be referred to as an ‘NNT for pain’. It is an ‘NNT for moderate or severe pain’.


15.4.3 Expressing risk differences

Users of reviews are liable to be influenced by the choice of statistical presentations of the evidence. Hoffrage and colleagues suggest that physicians’ inferences about statistical outcomes are more appropriate when they deal with ‘natural frequencies’ – whole numbers of people, both treated and untreated (e.g. treatment results in a drop from 20 out of 1000 to 10 out of 1000 women having breast cancer) – than when effects are presented as percentages (e.g. 1% absolute reduction in breast cancer risk) (Hoffrage et al 2000). Probabilities may be more difficult to understand than frequencies, particularly when events are rare. While standardization may be important in improving the presentation of research evidence (and participation in healthcare decisions), current evidence suggests that the presentation of natural frequencies for expressing differences in absolute risk is best understood by consumers of healthcare information (Akl et al 2011b). This evidence provides the rationale for presenting absolute risks in ‘Summary of findings’ tables as numbers of people with events per 1000 people receiving the intervention (see Chapter 14 ).

RRs and RRRs remain crucial because relative effects tend to be substantially more stable across risk groups than absolute effects (see Chapter 10, Section 10.4.3). Review authors can use their own data to study this consistency (Cates 1999, Smeeth et al 1999). Risk differences from studies are least likely to be consistent across baseline event rates; thus, they are rarely appropriate for computing numbers needed to treat in systematic reviews. If a relative effect measure (OR or RR) is chosen for meta-analysis, then a comparator group risk needs to be specified as part of the calculation of an RD or NNT. In addition, if there are several different groups of participants with different levels of risk, it is crucial to express absolute benefit for each clinically identifiable risk group, clarifying the time period to which this applies. Studies in patients with differing severity of disease, or studies with different lengths of follow-up will almost certainly have different comparator group risks. In these cases, different comparator group risks lead to different RDs and NNTs (except when the intervention has no effect). A recommended approach is to re-express an odds ratio or a risk ratio as a range of RDs or NNTs across a range of assumed comparator risks (ACRs) (McQuay and Moore 1997, Smeeth et al 1999). Review authors should bear these considerations in mind not only when constructing their ‘Summary of findings’ table, but also in the text of their review.

For example, a review of oral anticoagulants to prevent stroke presented information to users by describing absolute benefits for various baseline risks (Aguilar and Hart 2005, Aguilar et al 2007). They presented their principal findings as “The inherent risk of stroke should be considered in the decision to use oral anticoagulants in atrial fibrillation patients, selecting those who stand to benefit most for this therapy” (Aguilar and Hart 2005). Among high-risk atrial fibrillation patients with prior stroke or transient ischaemic attack who have stroke rates of about 12% (120 per 1000) per year, warfarin prevents about 70 strokes yearly per 1000 patients, whereas for low-risk atrial fibrillation patients (with a stroke rate of about 2% per year or 20 per 1000), warfarin prevents only 12 strokes. This presentation helps users to understand the important impact that typical baseline risks have on the absolute benefit that they can expect.

15.4.4 Computations

Direct computation of risk difference (RD) or a number needed to treat (NNT) depends on the summary statistic (odds ratio, risk ratio or risk differences) available from the study or meta-analysis. When expressing results of meta-analyses, review authors should use, in the computations, whatever statistic they determined to be the most appropriate summary for meta-analysis (see Chapter 10, Section 10.4.3 ). Here we present calculations to obtain RD as a reduction in the number of participants per 1000. For example, a risk difference of –0.133 corresponds to 133 fewer participants with the event per 1000.

RDs and NNTs should not be computed from the aggregated total numbers of participants and events across the trials. This approach ignores the randomization within studies, and may produce seriously misleading results if there is unbalanced randomization in any of the studies. Using the pooled result of a meta-analysis is more appropriate. When computing NNTs, the values obtained are by convention always rounded up to the next whole number.

15.4.4.1 Computing NNT from a risk difference (RD)

An NNT may be computed from a risk difference as

\[ \text{NNT} = \frac{1}{|\text{RD}|} \]

where the vertical bars (‘absolute value of’) in the denominator indicate that any minus sign should be ignored. It is convention to round the NNT up to the next whole number. For example, if the risk difference is –0.12 the NNT is 9; if the risk difference is –0.22 the NNT is 5. Cochrane Review authors should qualify the NNT as referring to benefit (improvement) or harm by denoting the NNT as NNTB or NNTH. Note that this approach, although feasible, should be used only for the results of a meta-analysis of risk differences. In most cases meta-analyses will be undertaken using a relative measure of effect (RR or OR), and those statistics should be used to calculate the NNT (see Sections 15.4.4.2 and 15.4.4.3).
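As a minimal sketch of this calculation, the Python below takes the reciprocal of the absolute risk difference and rounds up. The function names and the `event_is_undesirable` flag are hypothetical, introduced here only to show how the NNTB/NNTH label could follow from the sign of the risk difference (taken as intervention risk minus comparator risk).

```python
import math


def nnt_from_rd(risk_difference: float) -> int:
    """NNT = 1 / |RD|, rounded up to the next whole number."""
    if risk_difference == 0:
        raise ValueError("NNT is undefined (infinite) when the risk difference is zero")
    raw = 1 / abs(risk_difference)
    # Round to 9 decimals before taking the ceiling so that floating-point
    # noise cannot push an exact whole-number result up by one.
    return math.ceil(round(raw, 9))


def label_nnt(risk_difference: float, event_is_undesirable: bool = True) -> str:
    """Attach NNTB or NNTH according to the direction of effect."""
    fewer_events = risk_difference < 0
    # Fewer undesirable events, or more desirable events, counts as benefit.
    beneficial = fewer_events == event_is_undesirable
    return f"{'NNTB' if beneficial else 'NNTH'} {nnt_from_rd(risk_difference)}"


print(nnt_from_rd(-0.12))  # 9, as in the text
print(nnt_from_rd(-0.22))  # 5, as in the text
print(label_nnt(-0.12))    # NNTB 9
```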

15.4.4.2 Computing risk differences or NNT from a risk ratio

To aid interpretation of the results of a meta-analysis of risk ratios, review authors may compute an absolute risk reduction or NNT. In order to do this, an assumed comparator risk (ACR) (otherwise known as a baseline risk, or risk that the outcome of interest would occur with the comparator intervention) is required. It will usually be appropriate to do this for a range of different ACRs. The computation proceeds as follows:

\[ \text{number fewer per 1000} = 1000 \times \text{ACR} \times (1 - \text{RR}) \]

As an example, suppose the risk ratio is RR = 0.92, and an ACR = 0.3 (300 per 1000) is assumed. Then the effect on risk is 24 fewer per 1000:

\[ 1000 \times 0.3 \times (1 - 0.92) = 24 \]

The NNT is 42:

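\[ \text{NNT} = \left| \frac{1}{\text{ACR} \times (1 - \text{RR})} \right| = \left| \frac{1}{0.3 \times (1 - 0.92)} \right| \approx 41.7, \text{ rounded up to } 42 \]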
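The risk ratio calculation can be sketched in Python and repeated across a range of assumed comparator risks. The function names are hypothetical, and the ACR values of 0.1 and 0.5 are illustrative additions; only ACR = 0.3 with RR = 0.92 comes from the example above.

```python
import math


def fewer_per_1000_from_rr(risk_ratio: float, acr: float) -> float:
    """Absolute effect per 1000 people: 1000 x ACR x (1 - RR)."""
    return 1000 * acr * (1 - risk_ratio)


def nnt_from_rr(risk_ratio: float, acr: float) -> int:
    """NNT = |1 / (ACR x (1 - RR))|, rounded up to the next whole number."""
    raw = abs(1 / (acr * (1 - risk_ratio)))
    # Round to 9 decimals first so floating-point noise does not inflate
    # an exact whole-number result.
    return math.ceil(round(raw, 9))


risk_ratio = 0.92
for acr in (0.1, 0.3, 0.5):  # 0.1 and 0.5 are illustrative assumptions
    fewer = round(fewer_per_1000_from_rr(risk_ratio, acr))
    print(f"ACR {acr:.1f}: {fewer} fewer per 1000, NNT {nnt_from_rr(risk_ratio, acr)}")
```

The same risk ratio gives a different absolute effect and NNT at each comparator risk, which is why reporting a range of ACRs is usually appropriate.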

15.4.4.3 Computing risk differences or NNT from an odds ratio

Review authors may wish to compute a risk difference or NNT from the results of a meta-analysis of odds ratios. In order to do this, an ACR is required. It will usually be appropriate to do this for a range of different ACRs. The computation proceeds as follows:

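\[ \text{number fewer per 1000} = 1000 \times \left( \text{ACR} - \frac{\text{OR} \times \text{ACR}}{1 - \text{ACR} + \text{OR} \times \text{ACR}} \right) \]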

As an example, suppose the odds ratio is OR = 0.73, and a comparator risk of ACR = 0.3 is assumed. Then the effect on risk is 62 fewer per 1000:

\[ 1000 \times \left( 0.3 - \frac{0.73 \times 0.3}{1 - 0.3 + 0.73 \times 0.3} \right) \approx 62 \]

The NNT is 17:

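\[ \text{NNT} = \left| \frac{1}{\text{ACR} - \dfrac{\text{OR} \times \text{ACR}}{1 - \text{ACR} + \text{OR} \times \text{ACR}}} \right| = \left| \frac{1}{0.3 - \dfrac{0.73 \times 0.3}{1 - 0.3 + 0.73 \times 0.3}} \right| \approx 16.2, \text{ rounded up to } 17 \]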
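A short Python sketch of the odds ratio calculation follows, reproducing the worked example. The function names are hypothetical; the intermediate step converts the odds ratio and ACR into the risk expected with the intervention.

```python
import math


def intervention_risk_from_or(odds_ratio: float, acr: float) -> float:
    """Risk with the intervention implied by an odds ratio and a comparator risk."""
    return odds_ratio * acr / (1 - acr + odds_ratio * acr)


def fewer_per_1000_from_or(odds_ratio: float, acr: float) -> float:
    """Absolute effect per 1000 people (positive means fewer events)."""
    return 1000 * (acr - intervention_risk_from_or(odds_ratio, acr))


def nnt_from_or(odds_ratio: float, acr: float) -> int:
    """NNT = 1 / |ACR - intervention risk|, rounded up to the next whole number."""
    raw = 1 / abs(acr - intervention_risk_from_or(odds_ratio, acr))
    # Round to 9 decimals first to guard against floating-point noise.
    return math.ceil(round(raw, 9))


print(round(fewer_per_1000_from_or(0.73, 0.3)))  # 62 fewer per 1000
print(nnt_from_or(0.73, 0.3))                    # 17
```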

15.4.4.4 Computing risk ratio from an odds ratio

Because risk ratios are easier to interpret than odds ratios, but odds ratios have favourable mathematical properties, a review author may decide to undertake a meta-analysis based on odds ratios, but to express the result as a summary risk ratio (or relative risk reduction). This requires an ACR. Then

\[ \text{RR} = \frac{\text{OR}}{1 - \text{ACR} \times (1 - \text{OR})} \]

It will often be reasonable to perform this transformation using the median comparator group risk from the studies in the meta-analysis.
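The conversion can be sketched in Python as below. The function name is hypothetical, and the inputs OR = 0.73 and ACR = 0.3 are reused from Section 15.4.4.3 purely for illustration. The final lines check that the result agrees with the ratio of the implied intervention risk to the ACR.

```python
def rr_from_or(odds_ratio: float, acr: float) -> float:
    """RR = OR / (1 - ACR x (1 - OR))."""
    return odds_ratio / (1 - acr * (1 - odds_ratio))


odds_ratio, acr = 0.73, 0.3
risk_ratio = rr_from_or(odds_ratio, acr)
print(f"RR = {risk_ratio:.3f}, RRR = {100 * (1 - risk_ratio):.1f}%")

# Consistency check: RR should equal (intervention risk) / ACR.
intervention_risk = odds_ratio * acr / (1 - acr + odds_ratio * acr)
print(abs(risk_ratio - intervention_risk / acr) < 1e-12)
```

Because the result depends on the ACR, the same odds ratio yields a risk ratio closer to the odds ratio when the comparator risk is low and further from it when the comparator risk is high.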

15.4.4.5 Computing confidence limits

Confidence limits for RDs and NNTs may be calculated by applying the above formulae to the upper and lower confidence limits for the summary statistic (RD, RR or OR) (Altman 1998). Note that this confidence interval does not incorporate uncertainty around the ACR.

If the 95% confidence interval of OR or RR includes the value 1, one of the confidence limits will indicate benefit and the other harm. Thus, appropriate use of the words ‘fewer’ and ‘more’ is required for each limit when presenting results in terms of events. For NNTs, the two confidence limits should be labelled as NNTB and NNTH to indicate the direction of effect in each case. The confidence interval for the NNT will include a ‘discontinuity’, because increasingly smaller risk differences that approach zero will lead to NNTs approaching infinity. Thus, the confidence interval will include both an infinitely large NNTB and an infinitely large NNTH.
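The labelling of confidence limits can be illustrated with a small Python sketch. The confidence interval used here (RR 0.80 to 1.05) is a hypothetical value chosen so that the interval includes 1; it does not come from any review. The sketch also assumes the event is undesirable, so fewer events means benefit, and the function name `describe_limit` is hypothetical.

```python
import math


def describe_limit(risk_ratio: float, acr: float) -> tuple:
    """Express one limit of an RR as events per 1000 and as an NNTB or NNTH.

    Assumes the event is undesirable, so fewer events is a benefit.
    """
    risk_difference = acr * (risk_ratio - 1)  # intervention risk minus comparator risk
    if risk_difference == 0:
        return ("no difference", "NNT infinite")
    per_1000 = round(1000 * abs(risk_difference))
    # Round to 9 decimals first to guard against floating-point noise.
    nnt = math.ceil(round(1 / abs(risk_difference), 9))
    if risk_difference < 0:
        return (f"{per_1000} fewer per 1000", f"NNTB {nnt}")
    return (f"{per_1000} more per 1000", f"NNTH {nnt}")


acr = 0.3
lower_rr, upper_rr = 0.80, 1.05  # hypothetical 95% CI that includes 1
print(describe_limit(lower_rr, acr))  # ('60 fewer per 1000', 'NNTB 17')
print(describe_limit(upper_rr, acr))  # ('15 more per 1000', 'NNTH 67')
```

In this sketch the interval for the NNT would be read as running from NNTB 17 through infinity to NNTH 67, which is the discontinuity described above.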

15.5 Interpreting results from continuous outcomes (including standardized mean differences)

15.5.1 Meta-analyses with continuous outcomes

Review authors should describe in the study protocol how they plan to interpret results for continuous outcomes. When outcomes are continuous, review authors have a number of options to present summary results. These options differ if studies report the same measure that is familiar to the target audiences, studies report the same or very similar measures that are less familiar to the target audiences, or studies report different measures.

15.5.2 Meta-analyses with continuous outcomes using the same measure

If all studies have used the same familiar units, for instance, results are expressed as durations of events, such as symptoms for conditions including diarrhoea, sore throat, otitis media, influenza or duration of hospitalization, a meta-analysis may generate a summary estimate in those units, as a difference in mean response (see, for instance, the row summarizing results for duration of diarrhoea in Chapter 14, Figure 14.1.b and the row summarizing oedema in Chapter 14, Figure 14.1.a). For such outcomes, the ‘Summary of findings’ table should include a difference of means between the two interventions. However, the units of such outcomes may be difficult to interpret, particularly when they relate to rating scales (again, see the oedema row of Chapter 14, Figure 14.1.a). ‘Summary of findings’ tables should include the minimum and maximum of the scale of measurement, and the direction. Knowledge of the smallest change in instrument score that patients perceive is important – the minimal important difference (MID) – and can greatly facilitate the interpretation of results (Guyatt et al 1998, Schünemann and Guyatt 2005). Knowing the MID allows review authors and users to place results in context. Review authors should state the MID – if known – in the Comments column of their ‘Summary of findings’ table. For example, the chronic respiratory questionnaire has possible scores in health-related quality of life ranging from 1 to 7 and 0.5 represents a well-established MID (Jaeschke et al 1989, Schünemann et al 2005).

15.5.3 Meta-analyses with continuous outcomes using different measures

When studies have used different instruments to measure the same construct, a standardized mean difference (SMD) may be used in meta-analysis for combining continuous data. Without guidance, clinicians and patients may have little idea how to interpret results presented as SMDs. Review authors should therefore consider issues of interpretability when planning their analysis at the protocol stage and should consider whether there will be suitable ways to re-express the SMD or whether alternative effect measures, such as a ratio of means, or possibly as minimal important difference units (Guyatt et al 2013b) should be used. Table 15.5.a and the following sections describe these options.

Table 15.5.a Approaches and their implications to presenting results of continuous variables when primary studies have used different instruments to measure the same construct. Adapted from Guyatt et al (2013b)

15.5.3.1 Presenting and interpreting SMDs using generic effect size estimates

The SMD expresses the intervention effect in standard units rather than the original units of measurement. The SMD is the difference in mean effects between the experimental and comparator groups divided by the pooled standard deviation of participants’ outcomes, or external SDs when studies are very small (see Chapter 6, Section 6.5.1.2 ). The value of a SMD thus depends on both the size of the effect (the difference between means) and the standard deviation of the outcomes (the inherent variability among participants or based on an external SD).

If review authors use the SMD, they might choose to present the results directly as SMDs (row 1a, Table 15.5.a and Table 15.5.b ). However, absolute values of the intervention and comparison groups are typically not useful because studies have used different measurement instruments with different units. Guiding rules for interpreting SMDs (or ‘Cohen’s effect sizes’) exist, and have arisen mainly from researchers in the social sciences (Cohen 1988). One example is as follows: 0.2 represents a small effect, 0.5 a moderate effect and 0.8 a large effect (Cohen 1988). Variations exist (e.g. <0.40=small, 0.40 to 0.70=moderate, >0.70=large). Review authors might consider including such a guiding rule in interpreting the SMD in the text of the review, and in summary versions such as the Comments column of a ‘Summary of findings’ table. However, some methodologists believe that such interpretations are problematic because patient importance of a finding is context-dependent and not amenable to generic statements.

15.5.3.2 Re-expressing SMDs using a familiar instrument

The second possibility for interpreting the SMD is to express it in the units of one or more of the specific measurement instruments used by the included studies (row 1b, Table 15.5.a and Table 15.5.b ). The approach is to calculate an absolute difference in means by multiplying the SMD by an estimate of the SD associated with the most familiar instrument. To obtain this SD, a reasonable option is to calculate a weighted average across all intervention groups of all studies that used the selected instrument (preferably a pre-intervention or post-intervention SD as discussed in Chapter 10, Section 10.5.2 ). To better reflect among-person variation in practice, or to use an instrument not represented in the meta-analysis, it may be preferable to use a standard deviation from a representative observational study. The summary effect is thus re-expressed in the original units of that particular instrument and the clinical relevance and impact of the intervention effect can be interpreted using that familiar instrument.
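The re-expression can be sketched in Python as follows. All numbers are hypothetical, and weighting the SDs by group size is an assumption made for this sketch; the text above does not specify the weights.

```python
def weighted_average_sd(groups: list) -> float:
    """Average SD across intervention groups, weighted by group size.

    `groups` is a list of (n_participants, sd) pairs. Weighting by sample
    size is an assumption of this sketch, not a prescribed method.
    """
    total_n = sum(n for n, _ in groups)
    return sum(n * sd for n, sd in groups) / total_n


def smd_to_instrument_units(smd: float, instrument_sd: float) -> float:
    """Difference in means on the familiar instrument: SMD x SD."""
    return smd * instrument_sd


# Hypothetical groups that used the familiar instrument (a 0 to 100 scale).
groups = [(40, 20.0), (60, 25.0)]
instrument_sd = weighted_average_sd(groups)           # 23.0
mean_difference = smd_to_instrument_units(-0.5, instrument_sd)
print(f"SD = {instrument_sd:.1f}, difference in means = {mean_difference:.1f} points")
```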

The same approach of re-expressing the results for a familiar instrument can also be used for other standardized effect measures such as when standardizing by MIDs (Guyatt et al 2013b): see Section 15.5.3.5 .

Table 15.5.b Application of approaches when studies have used different measures: effects of dexamethasone for pain after laparoscopic cholecystectomy (Karanicolas et al 2008). Reproduced with permission of Wolters Kluwer

1 Certainty rated according to GRADE from very low to high certainty.
2 Substantial unexplained heterogeneity in study results.
3 Imprecision due to wide confidence intervals.
4 The 20% comes from the proportion in the control group requiring rescue analgesia.
5 Crude (arithmetic) means of the post-operative pain mean responses across all five trials when transformed to a 100-point scale.

15.5.3.3 Re-expressing SMDs through dichotomization and transformation to relative and absolute measures

A third approach (row 1c, Table 15.5.a and Table 15.5.b ) relies on converting the continuous measure into a dichotomy and thus allows calculation of relative and absolute effects on a binary scale. A transformation of a SMD to a (log) odds ratio is available, based on the assumption that an underlying continuous variable has a logistic distribution with equal standard deviation in the two intervention groups, as discussed in Chapter 10, Section 10.6  (Furukawa 1999, Guyatt et al 2013b). The assumption is unlikely to hold exactly and the results must be regarded as an approximation. The log odds ratio is estimated as

\[ \ln(\text{OR}) = \frac{\pi}{\sqrt{3}} \times \text{SMD} \]

(or approximately 1.81✕SMD). The resulting odds ratio can then be presented as normal, and in a ‘Summary of findings’ table, combined with an assumed comparator group risk to be expressed as an absolute risk difference. The comparator group risk in this case would refer to the proportion of people who have achieved a specific value of the continuous outcome. In randomized trials this can be interpreted as the proportion who have improved by some (specified) amount (responders), for instance by 5 points on a 0 to 100 scale. Table 15.5.c shows some illustrative results from this method. The risk differences can then be converted to NNTs or to people per thousand using methods described in Section 15.4.4 .
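A minimal Python sketch of this transformation is shown below. The SMD of 0.5 and the comparator proportion of 0.2 are hypothetical inputs, and the function names are invented for illustration. The result inherits the logistic-distribution assumption and should be read as an approximation.

```python
import math


def smd_to_odds_ratio(smd: float) -> float:
    """ln(OR) = (pi / sqrt(3)) x SMD, under the logistic assumption."""
    return math.exp(math.pi / math.sqrt(3) * smd)


def risk_difference_from_smd(smd: float, comparator_proportion: float) -> float:
    """Difference in the proportion of responders (intervention minus comparator)."""
    odds_ratio = smd_to_odds_ratio(smd)
    intervention_proportion = (
        odds_ratio * comparator_proportion
        / (1 - comparator_proportion + odds_ratio * comparator_proportion)
    )
    return intervention_proportion - comparator_proportion


smd = 0.5                    # hypothetical pooled SMD favouring the intervention
comparator_proportion = 0.2  # hypothetical proportion improved with comparator
odds_ratio = smd_to_odds_ratio(smd)
risk_difference = risk_difference_from_smd(smd, comparator_proportion)
print(f"OR = {odds_ratio:.2f}, risk difference = {risk_difference:.3f}")
```

Here the outcome counted is improvement, so a positive risk difference means more responders with the intervention.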

Table 15.5.c Risk difference derived for specific SMDs for various given ‘proportions improved’ in the comparator group (Furukawa 1999, Guyatt et al 2013b). Reproduced with permission of Elsevier 

15.5.3.4 Ratio of means

A more frequently used approach is based on calculation of a ratio of means between the intervention and comparator groups (Friedrich et al 2008) as discussed in Chapter 6, Section 6.5.1.3 . Interpretational advantages of this approach include the ability to pool studies with outcomes expressed in different units directly, to avoid the vulnerability of heterogeneous populations that limits approaches that rely on SD units, and for ease of clinical interpretation (row 2, Table 15.5.a and Table 15.5.b ). This method is currently designed for post-intervention scores only. However, it is possible to calculate a ratio of change scores if both intervention and comparator groups change in the same direction in each relevant study, and this ratio may sometimes be informative.

Limitations to this approach include its limited applicability to change scores (since it is unlikely that both intervention and comparator group changes are in the same direction in all studies) and the possibility of misleading results if the comparator group mean is very small, in which case even a modest difference from the intervention group will yield a large and therefore misleading ratio of means. It also requires that separate ratios of means be calculated for each included study, and then entered into a generic inverse variance meta-analysis (see Chapter 10, Section 10.3 ).

The ratio of means approach illustrated in Table 15.5.b suggests a relative reduction in pain of only 13%, meaning that those receiving steroids have a pain severity 87% of those in the comparator group, an effect that might be considered modest.

15.5.3.5 Presenting continuous results as minimally important difference units

To express results in MID units, review authors have two options. First, they can be combined across studies in the same way as the SMD, but instead of dividing the mean difference of each study by its SD, review authors divide by the MID associated with that outcome (Johnston et al 2010, Guyatt et al 2013b). Instead of SD units, the pooled results represent MID units (row 3, Table 15.5.a and Table 15.5.b ), and may be more easily interpretable. This approach avoids the problem of varying SDs across studies that may distort estimates of effect in approaches that rely on the SMD. The approach, however, relies on having well-established MIDs. The approach is also risky in that a difference less than the MID may be interpreted as trivial when a substantial proportion of patients may have achieved an important benefit.

The other approach makes a simple conversion (not shown in Table 15.5.b), before undertaking the meta-analysis, of the means and SDs from each study to means and SDs on the scale of a particular familiar instrument whose MID is known. For example, one can rescale the mean and SD of other chronic respiratory disease instruments (e.g. rescaling a 0 to 100 score of an instrument) to the 1 to 7 score in Chronic Respiratory Disease Questionnaire (CRQ) units (by assuming 0 equals 1 and 100 equals 7 on the CRQ). Given the MID of the CRQ of 0.5, a mean difference in change of 0.71 after rescaling of all studies suggests a substantial effect of the intervention (Guyatt et al 2013b). This approach, presenting in units of the most familiar instrument, may be the most desirable when the target audiences have extensive experience with that instrument, particularly if the MID is well established.
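The rescaling described here is a linear map, which can be sketched in Python as below. The function names are hypothetical, and the study mean of 50 and SD of 20 are invented inputs; the 0 to 100 and 1 to 7 ranges, the MID of 0.5 and the mean difference of 0.71 come from the example above.

```python
def rescale_mean(score: float, old_range=(0, 100), new_range=(1, 7)) -> float:
    """Map a mean linearly from one scale to another (0 -> 1 and 100 -> 7 by default)."""
    old_min, old_max = old_range
    new_min, new_max = new_range
    return (score - old_min) / (old_max - old_min) * (new_max - new_min) + new_min


def rescale_sd(sd: float, old_range=(0, 100), new_range=(1, 7)) -> float:
    """SDs (and differences in means) scale by the ratio of the range widths only."""
    return sd / (old_range[1] - old_range[0]) * (new_range[1] - new_range[0])


def in_mid_units(mean_difference: float, mid: float) -> float:
    """Express a mean difference as a multiple of the MID."""
    return mean_difference / mid


print(rescale_mean(50))          # hypothetical study mean of 50 becomes 4.0 on the CRQ scale
print(rescale_sd(20))            # hypothetical SD of 20 becomes about 1.2
print(in_mid_units(0.71, 0.5))   # the 0.71 difference is 1.42 MIDs
```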

15.6 Drawing conclusions

15.6.1 Conclusions sections of a Cochrane Review

Authors’ conclusions in a Cochrane Review are divided into implications for practice and implications for research. While Cochrane Reviews about interventions can provide meaningful information and guidance for practice, decisions about the desirable and undesirable consequences of healthcare options require evidence and judgements for criteria that most Cochrane Reviews do not provide (Alonso-Coello et al 2016). In describing the implications for practice and the development of recommendations, however, review authors may consider the certainty of the evidence, the balance of benefits and harms, and assumed values and preferences.

15.6.2 Implications for practice

Drawing conclusions about the practical usefulness of an intervention entails making trade-offs, either implicitly or explicitly, between the estimated benefits, harms and the values and preferences. Making such trade-offs, and thus making specific recommendations for an action in a specific context, goes beyond a Cochrane Review and requires additional evidence and informed judgements that most Cochrane Reviews do not provide (Alonso-Coello et al 2016). Such judgements are typically the domain of clinical practice guideline developers for which Cochrane Reviews will provide crucial information (Graham et al 2011, Schünemann et al 2014, Zhang et al 2018a). Thus, authors of Cochrane Reviews should not make recommendations.

If review authors feel compelled to lay out actions that clinicians and patients could take, they should – after describing the certainty of evidence and the balance of benefits and harms – highlight different actions that might be consistent with particular patterns of values and preferences. Other factors that might influence a decision should also be highlighted, including any known factors that would be expected to modify the effects of the intervention, the baseline risk or status of the patient, costs and who bears those costs, and the availability of resources. Review authors should ensure they consider all patient-important outcomes, including those for which limited data may be available. In the context of public health reviews the focus may be on population-important outcomes as the target may be an entire (non-diseased) population and include outcomes that are not measured in the population receiving an intervention (e.g. a reduction of transmission of infections from those receiving an intervention). This process implies a high level of explicitness in judgements about values or preferences attached to different outcomes and the certainty of the related evidence (Zhang et al 2018b, Zhang et al 2018c); this and a full cost-effectiveness analysis is beyond the scope of most Cochrane Reviews (although they might well be used for such analyses; see Chapter 20 ).

A review on the use of anticoagulation in cancer patients to increase survival (Akl et al 2011a) provides an example for laying out clinical implications for situations where there are important trade-offs between desirable and undesirable effects of the intervention: “The decision for a patient with cancer to start heparin therapy for survival benefit should balance the benefits and downsides and integrate the patient’s values and preferences. Patients with a high preference for a potential survival prolongation, limited aversion to potential bleeding, and who do not consider heparin (both UFH or LMWH) therapy a burden may opt to use heparin, while those with aversion to bleeding may not.”

15.6.3 Implications for research

The second category for authors’ conclusions in a Cochrane Review is implications for research. To help people make well-informed decisions about future healthcare research, the ‘Implications for research’ section should comment on the need for further research, and the nature of the further research that would be most desirable. It is helpful to consider the population, intervention, comparison and outcomes that could be addressed, or addressed more effectively in the future, in the context of the certainty of the evidence in the current review (Brown et al 2006):

  • P (Population): diagnosis, disease stage, comorbidity, risk factor, sex, age, ethnic group, specific inclusion or exclusion criteria, clinical setting;
  • I (Intervention): type, frequency, dose, duration, prognostic factor;
  • C (Comparison): placebo, routine care, alternative treatment/management;
  • O (Outcome): which clinical or patient-related outcomes will the researcher need to measure, improve, influence or accomplish? Which methods of measurement should be used?

While Cochrane Review authors will find the PICO domains helpful, the domains of the GRADE certainty framework further support understanding and describing what additional research will improve the certainty in the available evidence. Note that as the certainty of the evidence is likely to vary by outcome, these implications will be specific to certain outcomes in the review. Table 15.6.a shows how review authors may be aided in their interpretation of the body of evidence and drawing conclusions about future research and practice.

Table 15.6.a Implications for research and practice suggested by individual GRADE domains

The review of compression stockings for prevention of deep vein thrombosis (DVT) in airline passengers described in Chapter 14 provides an example where there is some convincing evidence of a benefit of the intervention: “This review shows that the question of the effects on symptomless DVT of wearing versus not wearing compression stockings in the types of people studied in these trials should now be regarded as answered. Further research may be justified to investigate the relative effects of different strengths of stockings or of stockings compared to other preventative strategies. Further randomised trials to address the remaining uncertainty about the effects of wearing versus not wearing compression stockings on outcomes such as death, pulmonary embolism and symptomatic DVT would need to be large.” (Clarke et al 2016).

A review of therapeutic touch for anxiety disorder provides an example of the implications for research when no eligible studies had been found: “This review highlights the need for randomized controlled trials to evaluate the effectiveness of therapeutic touch in reducing anxiety symptoms in people diagnosed with anxiety disorders. Future trials need to be rigorous in design and delivery, with subsequent reporting to include high quality descriptions of all aspects of methodology to enable appraisal and interpretation of results.” (Robinson et al 2007).

15.6.4 Reaching conclusions

A common mistake is to confuse ‘no evidence of an effect’ with ‘evidence of no effect’. When the confidence intervals are too wide (e.g. including no effect), it is wrong to claim that the experimental intervention has ‘no effect’ or is ‘no different’ from the comparator intervention. Review authors may also incorrectly ‘positively’ frame results for some effects but not others. For example, when the effect estimate is positive for a beneficial outcome but confidence intervals are wide, review authors may describe the effect as promising. However, when the effect estimate is negative for an outcome that is considered harmful but the confidence intervals include no effect, review authors report no effect. Another mistake is to frame the conclusion in wishful terms. For example, review authors might write, “there were too few people in the analysis to detect a reduction in mortality” when the included studies showed a reduction or even increase in mortality that was not ‘statistically significant’. One way of avoiding errors such as these is to consider the results blinded; that is, consider how the results would be presented and framed in the conclusions if the direction of the results was reversed. If the confidence interval for the estimate of the difference in the effects of the interventions overlaps with no effect, the analysis is compatible with both a true beneficial effect and a true harmful effect. If one of the possibilities is mentioned in the conclusion, the other possibility should be mentioned as well. Table 15.6.b suggests narrative statements for drawing conclusions based on the effect estimate from the meta-analysis and the certainty of the evidence.

Table 15.6.b Suggested narrative statements for phrasing conclusions

Another common mistake is to reach conclusions that go beyond the evidence. Often this is done implicitly, without referring to the additional information or judgements that are used in reaching conclusions about the implications of a review for practice. Even when additional information and explicit judgements support conclusions about the implications of a review for practice, review authors rarely conduct systematic reviews of the additional information. Furthermore, implications for practice are often dependent on specific circumstances and values that must be taken into consideration. As we have noted, review authors should always be cautious when drawing conclusions about implications for practice and they should not make recommendations.

15.7 Chapter information

Authors: Holger J Schünemann, Gunn E Vist, Julian PT Higgins, Nancy Santesso, Jonathan J Deeks, Paul Glasziou, Elie Akl, Gordon H Guyatt; on behalf of the Cochrane GRADEing Methods Group

Acknowledgements: Andrew Oxman, Jonathan Sterne, Michael Borenstein and Rob Scholten contributed text to earlier versions of this chapter.

Funding: This work was in part supported by funding from the Michael G DeGroote Cochrane Canada Centre and the Ontario Ministry of Health. JJD receives support from the National Institute for Health Research (NIHR) Birmingham Biomedical Research Centre at the University Hospitals Birmingham NHS Foundation Trust and the University of Birmingham. JPTH receives support from the NIHR Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

15.8 References

Aguilar MI, Hart R. Oral anticoagulants for preventing stroke in patients with non-valvular atrial fibrillation and no previous history of stroke or transient ischemic attacks. Cochrane Database of Systematic Reviews 2005; 3 : CD001927.

Aguilar MI, Hart R, Pearce LA. Oral anticoagulants versus antiplatelet therapy for preventing stroke in patients with non-valvular atrial fibrillation and no history of stroke or transient ischemic attacks. Cochrane Database of Systematic Reviews 2007; 3 : CD006186.

Akl EA, Gunukula S, Barba M, Yosuico VE, van Doormaal FF, Kuipers S, Middeldorp S, Dickinson HO, Bryant A, Schünemann H. Parenteral anticoagulation in patients with cancer who have no therapeutic or prophylactic indication for anticoagulation. Cochrane Database of Systematic Reviews 2011a; 1 : CD006652.

Akl EA, Oxman AD, Herrin J, Vist GE, Terrenato I, Sperati F, Costiniuk C, Blank D, Schünemann H. Using alternative statistical formats for presenting risks and risk reductions. Cochrane Database of Systematic Reviews 2011b; 3 : CD006776.

Alonso-Coello P, Schünemann HJ, Moberg J, Brignardello-Petersen R, Akl EA, Davoli M, Treweek S, Mustafa RA, Rada G, Rosenbaum S, Morelli A, Guyatt GH, Oxman AD, Group GW. GRADE Evidence to Decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: Introduction. BMJ 2016; 353 : i2016.

Altman DG. Confidence intervals for the number needed to treat. BMJ 1998; 317 : 1309-1312.

Atkins D, Best D, Briss PA, Eccles M, Falck-Ytter Y, Flottorp S, Guyatt GH, Harbour RT, Haugh MC, Henry D, Hill S, Jaeschke R, Leng G, Liberati A, Magrini N, Mason J, Middleton P, Mrukowicz J, O'Connell D, Oxman AD, Phillips B, Schünemann HJ, Edejer TT, Varonen H, Vist GE, Williams JW, Jr., Zaza S. Grading quality of evidence and strength of recommendations. BMJ 2004; 328 : 1490.

Brown P, Brunnhuber K, Chalkidou K, Chalmers I, Clarke M, Fenton M, Forbes C, Glanville J, Hicks NJ, Moody J, Twaddle S, Timimi H, Young P. How to formulate research recommendations. BMJ 2006; 333 : 804-806.

Cates C. Confidence intervals for the number needed to treat: Pooling numbers needed to treat may not be reliable. BMJ 1999; 318 : 1764-1765.

Clarke MJ, Broderick C, Hopewell S, Juszczak E, Eisinga A. Compression stockings for preventing deep vein thrombosis in airline passengers. Cochrane Database of Systematic Reviews 2016; 9 : CD004002.

Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Hillsdale (NJ): Lawrence Erlbaum Associates, Inc.; 1988.

Coleman T, Chamberlain C, Davey MA, Cooper SE, Leonardi-Bee J. Pharmacological interventions for promoting smoking cessation during pregnancy. Cochrane Database of Systematic Reviews 2015; 12 : CD010078.

Dans AM, Dans L, Oxman AD, Robinson V, Acuin J, Tugwell P, Dennis R, Kang D. Assessing equity in clinical practice guidelines. Journal of Clinical Epidemiology 2007; 60 : 540-546.

Friedman LM, Furberg CD, DeMets DL. Fundamentals of Clinical Trials. 2nd ed. Littleton (MA): John Wright PSG, Inc.; 1985.

Friedrich JO, Adhikari NK, Beyene J. The ratio of means method as an alternative to mean differences for analyzing continuous outcome variables in meta-analysis: a simulation study. BMC Medical Research Methodology 2008; 8 : 32.

Furukawa T. From effect size into number needed to treat. Lancet 1999; 353 : 1680.

Graham R, Mancher M, Wolman DM, Greenfield S, Steinberg E. Committee on Standards for Developing Trustworthy Clinical Practice Guidelines, Board on Health Care Services: Clinical Practice Guidelines We Can Trust. Washington, DC: National Academies Press; 2011.

Guyatt G, Oxman AD, Akl EA, Kunz R, Vist G, Brozek J, Norris S, Falck-Ytter Y, Glasziou P, DeBeer H, Jaeschke R, Rind D, Meerpohl J, Dahm P, Schünemann HJ. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. Journal of Clinical Epidemiology 2011a; 64 : 383-394.

Guyatt GH, Juniper EF, Walter SD, Griffith LE, Goldstein RS. Interpreting treatment effects in randomised trials. BMJ 1998; 316 : 690-693.

Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, Schünemann HJ. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008; 336 : 924-926.

Guyatt GH, Oxman AD, Kunz R, Woodcock J, Brozek J, Helfand M, Alonso-Coello P, Falck-Ytter Y, Jaeschke R, Vist G, Akl EA, Post PN, Norris S, Meerpohl J, Shukla VK, Nasser M, Schünemann HJ. GRADE guidelines: 8. Rating the quality of evidence--indirectness. Journal of Clinical Epidemiology 2011b; 64 : 1303-1310.

Guyatt GH, Oxman AD, Santesso N, Helfand M, Vist G, Kunz R, Brozek J, Norris S, Meerpohl J, Djulbegovic B, Alonso-Coello P, Post PN, Busse JW, Glasziou P, Christensen R, Schünemann HJ. GRADE guidelines: 12. Preparing summary of findings tables-binary outcomes. Journal of Clinical Epidemiology 2013a; 66 : 158-172.

Guyatt GH, Thorlund K, Oxman AD, Walter SD, Patrick D, Furukawa TA, Johnston BC, Karanicolas P, Akl EA, Vist G, Kunz R, Brozek J, Kupper LL, Martin SL, Meerpohl JJ, Alonso-Coello P, Christensen R, Schünemann HJ. GRADE guidelines: 13. Preparing summary of findings tables and evidence profiles-continuous outcomes. Journal of Clinical Epidemiology 2013b; 66 : 173-183.

Hawe P, Shiell A, Riley T, Gold L. Methods for exploring implementation variation and local context within a cluster randomised community intervention trial. Journal of Epidemiology and Community Health 2004; 58 : 788-793.

Hoffrage U, Lindsey S, Hertwig R, Gigerenzer G. Medicine. Communicating statistical information. Science 2000; 290 : 2261-2262.

Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Controlled Clinical Trials 1989; 10 : 407-415.

Johnston B, Thorlund K, Schünemann H, Xie F, Murad M, Montori V, Guyatt G. Improving the interpretation of health-related quality of life evidence in meta-analysis: The application of minimal important difference units. Health and Quality of Life Outcomes 2010; 11 : 116.

Karanicolas PJ, Smith SE, Kanbur B, Davies E, Guyatt GH. The impact of prophylactic dexamethasone on nausea and vomiting after laparoscopic cholecystectomy: a systematic review and meta-analysis. Annals of Surgery 2008; 248 : 751-762.

Lumley J, Oliver SS, Chamberlain C, Oakley L. Interventions for promoting smoking cessation during pregnancy. Cochrane Database of Systematic Reviews 2004; 4 : CD001055.

McQuay HJ, Moore RA. Using numerical results from systematic reviews in clinical practice. Annals of Internal Medicine 1997; 126 : 712-720.

Resnicow K, Cross D, Wynder E. The Know Your Body program: a review of evaluation studies. Bulletin of the New York Academy of Medicine 1993; 70 : 188-207.

Robinson J, Biley FC, Dolk H. Therapeutic touch for anxiety disorders. Cochrane Database of Systematic Reviews 2007; 3 : CD006240.

Rothwell PM. External validity of randomised controlled trials: "to whom do the results of this trial apply?". Lancet 2005; 365 : 82-93.

Santesso N, Carrasco-Labra A, Langendam M, Brignardello-Petersen R, Mustafa RA, Heus P, Lasserson T, Opiyo N, Kunnamo I, Sinclair D, Garner P, Treweek S, Tovey D, Akl EA, Tugwell P, Brozek JL, Guyatt G, Schünemann HJ. Improving GRADE evidence tables part 3: detailed guidance for explanatory footnotes supports creating and understanding GRADE certainty in the evidence judgments. Journal of Clinical Epidemiology 2016; 74 : 28-39.

Schünemann HJ, Puhan M, Goldstein R, Jaeschke R, Guyatt GH. Measurement properties and interpretability of the Chronic respiratory disease questionnaire (CRQ). COPD: Journal of Chronic Obstructive Pulmonary Disease 2005; 2 : 81-89.

Schünemann HJ, Guyatt GH. Commentary--goodbye M(C)ID! Hello MID, where do you come from? Health Services Research 2005; 40 : 593-597.

Schünemann HJ, Fretheim A, Oxman AD. Improving the use of research evidence in guideline development: 13. Applicability, transferability and adaptation. Health Research Policy and Systems 2006; 4 : 25.

Schünemann HJ. Methodological idiosyncracies, frameworks and challenges of non-pharmaceutical and non-technical treatment interventions. Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen 2013; 107 : 214-220.

Schünemann HJ, Tugwell P, Reeves BC, Akl EA, Santesso N, Spencer FA, Shea B, Wells G, Helfand M. Non-randomized studies as a source of complementary, sequential or replacement evidence for randomized controlled trials in systematic reviews on the effects of interventions. Research Synthesis Methods 2013; 4 : 49-62.

Schünemann HJ, Wiercioch W, Etxeandia I, Falavigna M, Santesso N, Mustafa R, Ventresca M, Brignardello-Petersen R, Laisaar KT, Kowalski S, Baldeh T, Zhang Y, Raid U, Neumann I, Norris SL, Thornton J, Harbour R, Treweek S, Guyatt G, Alonso-Coello P, Reinap M, Brozek J, Oxman A, Akl EA. Guidelines 2.0: systematic development of a comprehensive checklist for a successful guideline enterprise. CMAJ: Canadian Medical Association Journal 2014; 186 : E123-142.

Schünemann HJ. Interpreting GRADE's levels of certainty or quality of the evidence: GRADE for statisticians, considering review information size or less emphasis on imprecision? Journal of Clinical Epidemiology 2016; 75 : 6-15.

Smeeth L, Haines A, Ebrahim S. Numbers needed to treat derived from meta-analyses--sometimes informative, usually misleading. BMJ 1999; 318 : 1548-1551.

Sun X, Briel M, Busse JW, You JJ, Akl EA, Mejza F, Bala MM, Bassler D, Mertz D, Diaz-Granados N, Vandvik PO, Malaga G, Srinathan SK, Dahm P, Johnston BC, Alonso-Coello P, Hassouneh B, Walter SD, Heels-Ansdell D, Bhatnagar N, Altman DG, Guyatt GH. Credibility of claims of subgroup effects in randomised controlled trials: systematic review. BMJ 2012; 344 : e1553.

Zhang Y, Akl EA, Schünemann HJ. Using systematic reviews in guideline development: the GRADE approach. Research Synthesis Methods 2018a: doi: 10.1002/jrsm.1313.

Zhang Y, Alonso-Coello P, Guyatt GH, Yepes-Nunez JJ, Akl EA, Hazlewood G, Pardo-Hernandez H, Etxeandia-Ikobaltzeta I, Qaseem A, Williams JW, Jr., Tugwell P, Flottorp S, Chang Y, Zhang Y, Mustafa RA, Rojas MX, Schünemann HJ. GRADE Guidelines: 19. Assessing the certainty of evidence in the importance of outcomes or values and preferences-Risk of bias and indirectness. Journal of Clinical Epidemiology 2018b: doi: 10.1016/j.jclinepi.2018.1001.1013.

Zhang Y, Alonso Coello P, Guyatt G, Yepes-Nunez JJ, Akl EA, Hazlewood G, Pardo-Hernandez H, Etxeandia-Ikobaltzeta I, Qaseem A, Williams JW, Jr., Tugwell P, Flottorp S, Chang Y, Zhang Y, Mustafa RA, Rojas MX, Xie F, Schünemann HJ. GRADE Guidelines: 20. Assessing the certainty of evidence in the importance of outcomes or values and preferences - Inconsistency, Imprecision, and other Domains. Journal of Clinical Epidemiology 2018c: doi: 10.1016/j.jclinepi.2018.1005.1011.


Interactive Summary of Findings tables: the way to present and understand results of systematic reviews

Affiliation

  • 1 Department of Health Research Methods, Evidence and Impact (formerly Clinical Epidemiology and Biostatistics), McMaster GRADE Centre, Michael G DeGroote Cochrane Canada Centre, McMaster University, Hamilton, Canada; Department of Medicine, McMaster University, Hamilton, Canada.
  • PMID: 30870328
  • DOI: 10.11124/JBISRIR-D-19-00059

Publication types

  • Systematic Review
  • Delivery of Health Care / organization & administration*
  • Delivery of Health Care / standards
  • Evidence-Based Medicine / trends*
  • Guidelines as Topic
  • Information Dissemination / methods*
  • Statistics as Topic


Summary of findings tables for communicating key findings of systematic reviews

To assess the effects of 'Summary of findings' tables on communicating key findings of systematic reviews of the effects of healthcare interventions.

This will be achieved by:

  • assessing the effects of 'Summary of findings' tables versus full versions of systematic reviews on communicating key findings of systematic reviews of the effects of healthcare interventions;
  • assessing the effects of 'Summary of findings' tables plus full review versus full review (no 'Summary of findings' tables);
  • assessing the effects of 'Summary of findings' tables versus other summaries of systematic reviews on communicating key findings of systematic reviews of the effects of healthcare interventions;
  • assessing the effects of interactive 'Summary of findings' tables versus static 'Summary of findings' tables on communicating key findings of systematic reviews of the effects of healthcare interventions;
  • assessing the effects of 'Summary of findings' tables versus other formats of 'Summary of findings' tables on communicating key findings of systematic reviews of the effects of healthcare interventions;
  • assessing how particular participant groups e.g. patients, healthcare providers, policy makers, understand and apply the information from the 'Summary of findings' tables.

This is a protocol.

Cochrane Norway

Summary of Findings table

The Summary of Findings table aims to help readers understand the results of a Cochrane review more correctly and find key information faster by:

  • highlighting the most important outcomes, both benefits and harms
  • presenting what is known and not known about each of these outcomes
  • presenting how sure we can be of the evidence for each outcome

The Summary of Findings table is one of the outputs of the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system for evaluating certainty of evidence. Authors produce a Summary of Findings table by grading the evidence, one outcome at a time.
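As a rough illustration of grading one outcome at a time, the sketch below starts each outcome at a level of certainty and moves it down one level per concern. It is a simplification under stated assumptions: the four level names follow GRADE, but the function `grade_outcome`, the outcome names, and the downgrade counts are hypothetical. Real GRADE judgements are qualitative and can also involve rating up, which is not modelled here.

```python
# GRADE certainty levels, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_outcome(start: str, downgrades: dict) -> str:
    """Lower the starting certainty by the total levels downgraded (floor: very low)."""
    if start not in LEVELS:
        raise ValueError(f"unknown level: {start}")
    if any(count < 0 for count in downgrades.values()):
        raise ValueError("rating up is not modelled in this sketch")
    total = sum(downgrades.values())
    index = max(LEVELS.index(start) - total, 0)
    return LEVELS[index]


# Hypothetical outcomes and judgements, for illustration only.
outcomes = {
    "mortality": {"risk of bias": 0, "imprecision": 1},
    "adverse events": {"risk of bias": 1, "inconsistency": 1, "imprecision": 1},
}

# One certainty rating per outcome, as in the rows of a Summary of Findings table.
table = {name: grade_outcome("high", concerns) for name, concerns in outcomes.items()}
for name, certainty in table.items():
    print(f"{name}: {certainty}")
```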

The Summary of Findings table format is based on extensive feedback from stakeholders and testing with users.

The Summary of Findings table is a key building block for many other derivative products, including our plain language summary formats, SUPPORT Summaries, SURE rapid responses and policy briefs, and DECIDE frameworks for going from evidence to decisions or recommendations.

How is the Summary of Findings table format currently being used?

Summary of Findings tables are increasingly being incorporated into Cochrane Reviews and are also used by authors of other systematic reviews, for instance at the Norwegian Institute of Public Health. The Cochrane Collaboration now defines Summary of Findings tables as “highly desirable” in their author guidelines (MECIR).

Read more about our ongoing work to explore an interactive format for Summary of Findings Tables.

Templates and instructions for use

See the Cochrane Handbook’s chapter 11.5.

Relevant publications from staff at Cochrane Norway

  • Rosenbaum SE, Glenton C, Oxman AD. Summary-of-findings tables in Cochrane reviews improved understanding and rapid retrieval of key information. Journal of Clinical Epidemiology, 2010 Jun;63(6):620-6.
  • Rosenbaum SE, Glenton C, Nylund HK, Oxman AD. User testing and stakeholder feedback contributed to the development of understandable and useful Summary of Findings tables for Cochrane reviews. Journal of Clinical Epidemiology, 2010 Jun;63(6):607-19.
  • Glenton C, Kho M, Underland V, Nilsen, ES, Oxman A. Summaries of findings, descriptions of interventions and information about adverse effects would make reviews more informative. Journal of Clinical Epidemiology, 2006, 59 (8): 770-778.

This research has been supported through funds provided by the Cochrane Collaboration Steering Group and the Norwegian Knowledge Centre for the Health Services. Ongoing work is supported in part by the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 258583 (DECIDE project).

11.5.1 Introduction to ‘Summary of findings’ tables

‘Summary of findings’ tables present the main findings of a review in a transparent and simple tabular format. In particular, they provide key information concerning the quality of evidence, the magnitude of effect of the interventions examined, and the sum of available data on the main outcomes. Most reviews would be expected to have a single ‘Summary of findings’ table. Other reviews may include more than one, for example if the review addresses more than one major comparison, or substantially different populations. In the CDSR, the principal ‘Summary of findings’ table of a review will appear at the beginning, before the Background section. Other ‘Summary of findings’ tables will appear between the Results and Discussion sections.

The planning for the ‘Summary of findings’ table comes early in the systematic review, with the selection of the outcomes to be included in (i) the review and (ii) the ‘Summary of findings’ table. Because this is a crucial step, and one typically not formally addressed in traditional Cochrane reviews, we will review the issues in selecting outcomes here.

Five tips for developing useful literature summary tables for writing review articles

  • Ahtisham Younas 1,2 (http://orcid.org/0000-0003-0157-5319)
  • Parveen Ali 3,4 (http://orcid.org/0000-0002-7839-8130)
  • 1 Memorial University of Newfoundland, St John's, Newfoundland, Canada
  • 2 Swat College of Nursing, Pakistan
  • 3 School of Nursing and Midwifery, University of Sheffield, Sheffield, South Yorkshire, UK
  • 4 Sheffield University Interpersonal Violence Research Group, Sheffield University, Sheffield, UK
  • Correspondence to Ahtisham Younas, Memorial University of Newfoundland, St John's, NL A1C 5C4, Canada; ay6133{at}mun.ca

https://doi.org/10.1136/ebnurs-2021-103417


Introduction

Literature reviews offer a critical synthesis of empirical and theoretical literature to assess the strength of evidence, develop guidelines for practice and policymaking, and identify areas for future research. 1 A literature review is often essential and usually the first task in any research endeavour, particularly in master's or doctoral level education. For effective data extraction and rigorous synthesis in reviews, the use of literature summary tables is of utmost importance. A literature summary table provides a synopsis of an included article. It succinctly presents its purpose, methods, findings and other relevant information pertinent to the review. The aim of developing these literature summary tables is to provide the reader with the information at a glance. Since there are multiple types of reviews (eg, systematic, integrative, scoping, critical and mixed methods) with distinct purposes and techniques, 2 there could be various approaches for developing literature summary tables, making it a complex task, especially for novice researchers or reviewers. Here, we offer five tips for authors of review articles, relevant to all types of reviews, for creating useful and relevant literature summary tables. We also provide examples from our published reviews to illustrate how useful literature summary tables can be developed and what sort of information should be provided.

Tip 1: provide detailed information about frameworks and methods


Figure 1. Tabular literature summaries from a scoping review. Source: Rasheed et al. 3

The provision of information about conceptual and theoretical frameworks and methods is useful for several reasons. First, in quantitative (reviews synthesising the results of quantitative studies) and mixed reviews (reviews synthesising the results of both qualitative and quantitative studies to address a mixed review question), it allows the readers to assess the congruence of the core findings and methods with the adapted framework and tested assumptions. In qualitative reviews (reviews synthesising results of qualitative studies), this information is beneficial for readers to recognise the underlying philosophical and paradigmatic stance of the authors of the included articles. For example, imagine the authors of an article, included in a review, used phenomenological inquiry for their research. In that case, the review authors and the readers of the review need to know what kind of (transcendental or hermeneutic) philosophical stance guided the inquiry. Review authors should, therefore, include the philosophical stance in their literature summary for the particular article. Second, information about frameworks and methods enables review authors and readers to judge the quality of the research, which allows for discerning the strengths and limitations of the article. For example, suppose the authors of an included article intended to develop a new scale and test its psychometric properties. To achieve this aim, they used a convenience sample of 150 participants and performed exploratory (EFA) and confirmatory factor analysis (CFA) on the same sample. Such an approach would indicate a flawed methodology because EFA and CFA should not be conducted on the same sample. The review authors must include this information in their summary table. Omitting this information from a summary could lead to the inclusion of a flawed article in the review, thereby jeopardising the review’s rigour.

Tip 2: include strengths and limitations for each article

Critical appraisal of individual articles included in a review is crucial for increasing the rigour of the review. Despite using various templates for critical appraisal, authors often do not provide detailed information about each reviewed article’s strengths and limitations. Merely noting the quality score based on standardised critical appraisal templates is not adequate because the readers should be able to identify the reasons for assigning a weak or moderate rating. Many recent critical appraisal checklists (eg, Mixed Methods Appraisal Tool) discourage review authors from assigning a quality score and recommend noting the main strengths and limitations of included studies. It is also vital that methodological and conceptual limitations and strengths of the articles included in the review are provided because not all review articles include empirical research papers. Rather, some reviews synthesise the theoretical aspects of articles. Providing information about conceptual limitations is also important for readers to judge the quality of the foundations of the research. For example, if you included a mixed-methods study in the review, reporting the methodological and conceptual limitations about ‘integration’ is critical for evaluating the study’s strength. Suppose the authors only collected qualitative and quantitative data and did not state the intent and timing of integration. In that case, the study is weak: integration occurred only at the level of data collection and may not have occurred at the analysis, interpretation and reporting levels.

Tip 3: write conceptual contribution of each reviewed article

While reading and evaluating review papers, we have observed that many review authors only provide core results of the article included in a review and do not explain the conceptual contribution offered by the included article. We refer to conceptual contribution as a description of how the article’s key results contribute towards the development of potential codes, themes or subthemes, or emerging patterns that are reported as the review findings. For example, the authors of a review article noted that one of the research articles included in their review demonstrated the usefulness of case studies and reflective logs as strategies for fostering compassion in nursing students. The conceptual contribution of this research article could be that experiential learning is one way to teach compassion to nursing students, as supported by case studies and reflective logs. This conceptual contribution of the article should be mentioned in the literature summary table. Delineating each reviewed article’s conceptual contribution is particularly beneficial in qualitative reviews, mixed-methods reviews, and critical reviews that often focus on developing models and describing or explaining various phenomena. Figure 2 offers an example of a literature summary table. 4

Figure 2. Tabular literature summaries from a critical review. Source: Younas and Maddigan. 4

Tip 4: compose potential themes from each article during summary writing

While developing literature summary tables, many authors use themes or subthemes reported in the given articles as the key results of their own review. Such an approach prevents the review authors from understanding the article’s conceptual contribution, developing rigorous synthesis and drawing reasonable interpretations of results from an individual article. Ultimately, it affects the generation of novel review findings. For example, one of the articles about women’s healthcare-seeking behaviours in developing countries reported a theme ‘social-cultural determinants of health as precursors of delays’. Instead of using this theme as one of the review findings, the reviewers should read and interpret beyond the given description in an article, and compare and contrast the themes and findings of one article with those of another to find similarities and differences and to understand and explain the bigger picture for their readers. Therefore, while developing literature summary tables, think twice before using the predeveloped themes. Including your themes in the summary tables (see figure 1) demonstrates to the readers that a robust method of data extraction and synthesis has been followed.

Tip 5: create your personalised template for literature summaries

Often templates are available for data extraction and development of literature summary tables. The available templates may be in the form of a table, chart or a structured framework that extracts some essential information about every article. The commonly used information may include authors, purpose, methods, key results and quality scores. While extracting all relevant information is important, such templates should be tailored to meet the needs of the individual review. For example, for a review about the effectiveness of healthcare interventions, a literature summary table must include information about the intervention, its type, content, timing, duration, setting, effectiveness, negative consequences, and receivers and implementers’ experiences of its usage. Similarly, literature summary tables for articles included in a meta-synthesis must include information about the participants’ characteristics, research context and conceptual contribution of each reviewed article so as to help the reader make an informed decision about the usefulness or lack of usefulness of the individual article in the review and the whole review.
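One way to keep a personalised template consistent across articles is to define it once as a small data structure. The sketch below is only an illustration: the class `SummaryRow`, the helper `missing_fields`, and the field names are hypothetical, chosen to mirror the items listed above for a review of healthcare interventions, and the example entry is invented placeholder text rather than a real study.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SummaryRow:
    """One row of a literature summary table (hypothetical template)."""
    authors: str
    purpose: str
    methods: str
    key_results: str
    strengths: str
    limitations: str
    conceptual_contribution: str
    # Extra columns tailored to a review of intervention effectiveness.
    intervention: str = ""
    setting: str = ""
    duration: str = ""


def missing_fields(row: SummaryRow) -> List[str]:
    """List the columns left blank so gaps in data extraction stay visible."""
    return [f.name for f in fields(row) if not getattr(row, f.name).strip()]


# Placeholder entry, not a real article.
row = SummaryRow(
    authors="Example and Example (placeholder)",
    purpose="Evaluate a teaching strategy",
    methods="Mixed methods",
    key_results="Placeholder result",
    strengths="Clear sampling description",
    limitations="Intent and timing of integration not stated",
    conceptual_contribution="Experiential learning as a teaching route",
    intervention="Case studies and reflective logs",
)
print(missing_fields(row))
```

Adding or removing fields for a different review type (for example, participants’ characteristics and research context for a meta-synthesis) changes the template in one place only.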

In conclusion, narrative or systematic reviews are almost always conducted as a part of any educational project (thesis or dissertation) or academic or clinical research. Literature reviews are the foundation of research on a given topic. Robust and high-quality reviews play an instrumental role in guiding research, practice and policymaking. However, the quality of reviews is also contingent on rigorous data extraction and synthesis, which require developing literature summaries. We have outlined five tips that could enhance the quality of the data extraction and synthesis process by developing useful literature summaries.

  • Aromataris E ,
  • Rasheed SP ,

Twitter @Ahtisham04, @parveenazamali

Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Patient consent for publication Not required.

Provenance and peer review Not commissioned; externally peer reviewed.


J Med Libr Assoc. 2021 Jan 1; 109(1)

Use of a search summary table to improve systematic review search methods, results, and efficiency

Alison C. Bethel

1 Information Specialist, Evidence Synthesis Team, University of Exeter Medical School, Exeter, United Kingdom

Morwenna Rogers

2 Evidence Synthesis Team, National Institute for Health Research Applied Research Collaboration South West Peninsula, University of Exeter Medical School, Exeter, United Kingdom

Rebecca Abbott

3 Evidence Synthesis Team, National Institute for Health Research Applied Research Collaboration South West Peninsula, University of Exeter Medical School, Exeter, United Kingdom

Background:

Systematic reviews are comprehensive, robust, inclusive, transparent, and reproducible when bringing together the evidence to answer a research question. Various guidelines provide recommendations on the expertise required to conduct a systematic review, where and how to search for literature, and what should be reported in the published review. However, the finer details of the search results are not typically reported to allow the search methods or search efficiency to be evaluated.

Case Presentation:

This case study presents a search summary table, containing the details of which databases were searched, which supplementary search methods were used, and where the included articles were found. It was developed and published alongside a recent systematic review. This simple format can be used in future systematic reviews to improve search results reporting.

Conclusions:

Publishing a search summary table in all systematic reviews would add to the growing evidence base about information retrieval, which would help in determining which databases to search for which type of review (in terms of either topic or scope), what supplementary search methods are most effective, what type of literature is being included, and where it is found. It would also provide evidence for future searching and search methods research.

Systematic reviews are designed to be comprehensive, robust, inclusive, transparent, and reproducible when bringing together the evidence to answer a research question. Depending on the field and topic, they may be large and time consuming with many included studies, or they can contain no relevant studies at all, finding that the area urgently requires primary research [ 1 ]. The timescales to publication can vary widely [ 2 ], and some systematic reviews are regularly updated, particularly if there is new relevant evidence in the field being researched [ 3 ]. However, research consistently shows that search strategies are not recorded well enough to allow them to be reproduced [ 4 – 6 ].

Systematic review guidelines recommend that the systematic review team include expertise in systematic review methods, including information retrieval [ 7 – 10 ]. Information retrieval is a core competency for librarians and information specialists who are involved in systematic reviews [ 11 ], and having a librarian or information specialist as part of the team is associated with significantly higher-quality search strategies [ 12 ]. The role of a librarian or information specialist in a systematic review can vary, ranging from the more limited role of checking searches written by others in the team to taking on many, if not all, aspects of search development and information retrieval [ 7 , 9 , 13 ].

Once a protocol is in place, one of the first tasks undertaken by the librarian or information specialist is to create search strategies for the predefined bibliographic databases listed in the protocol. Ideally, this task is then followed up by supplementary searching, such as forward and backward citation searching or table of contents searching in journals that are relevant to the topic [ 14 ]. Searching for grey literature is also recognized as an important part of a comprehensive search strategy for systematic reviews, and previous literature describes which resources are best suited to finding it and the contribution it can make to a systematic review [ 15 – 18 ]. Finally, depending on how long the systematic review takes to publication, update searches may also be part of the process [ 3 ].

Guidelines and guidance for conducting systematic reviews are available from various organizations, including Cochrane, the Campbell Collaboration, the Centre for Reviews and Dissemination, and the Joanna Briggs Institute. These guidelines all include some detail about how the searching should be undertaken, but there is no clear consensus about how many or which databases should be searched. Similarly, tools for assessing the quality of systematic reviews vary somewhat in their recommendations of what reviewers should look for in the search methods. Table 1 highlights a few of the different guidelines and checklists for undertaking, reporting, and appraising the searching component of systematic reviews, organized by the tool or author or organization.

Recommendations for systematic review searching from guidelines and checklists

As well as understanding where and how to search for information, it is important to understand how well the search strategies perform. Cooper et al. suggest there are six summative metrics of search effectiveness: sensitivity, specificity, precision, accuracy, number needed to read (NNR), and yield [ 23 ]. However, while there are suggested standards for reporting search methods and strategies [ 24 – 26 ], there currently are no requirements to report on these effectiveness metrics. Some research has been published on search effectiveness, but this research seems to be restricted to systematic reviews of certain conditions [ 27 – 32 ] or from specific organizations [ 33 – 35 ]. By reporting search effectiveness as well as search methods in more detail, evidence about information retrieval would accumulate, which could then inform guidelines about how many and which databases to search and which supplementary search methods to use for particular topics or types of evidence synthesis.

The aim of this study was to develop a search summary table (SST) that could report on search methods as well as search effectiveness. The authors demonstrate what an SST could look like and how it can be used. Of the metrics suggested by Cooper et al. [ 23 ], the only one that the suggested SST does not cover is “specificity,” because calculating it requires a known number of references (e.g., when developing a search filter).

CASE PRESENTATION

The SST was tested by the Evidence Synthesis Team at the University of Exeter in a systematic review, “‘They've Walked the Walk': A Systematic Review of Quantitative and Qualitative Evidence for Parent-to-Parent Support for Parents of Babies in the Neonatal Unit” by Hunt et al. [ 36 ]. The database search strategies used for the review are provided in supplemental Appendix A , and a blank SST template is provided in supplemental Appendix B .

Completion of a search summary table (SST)

The SST was completed in two stages. In stage one, all the references that were downloaded or exported from every electronic database, including all duplicates, were recorded and saved in an EndNote library. Every record included a code for the name of the database where the record was found. As per traditional systematic review methods, the numbers of records screened at the title-and-abstract stage and at the full-text stage were recorded, as were the final number of included references and the supplementary search methods that were undertaken. Stage two involved rerunning the searches in the databases where most included references had been found, in order to discover whether references that were not found during the original search were in the database and, if they were, whether they were retrieved by the search.

The SST presents the search information used to inform the PRISMA flow diagram, the search methods, and additional information gathered by the librarian or information specialist in their search log. Completion of stage one took approximately forty minutes and completion of stage two approximately one hour. Using this format to present the information allows calculation of various search effectiveness metrics.

Table 2 shows the key features of the SST. The first five metrics (numbered 1 to 5) are summative metrics of effective searching suggested by Cooper et al. [ 23 ]. Three additional metrics (numbered 6 to 8) provide further useful search-related information for the librarian or information specialist.

Metrics used in the search summary table (SST)

Sensitivity/recall and precision calculations are given in the SST for each database searched and overall, using the total number of references that have been found from database searching, the number of included (i.e., relevant) references from database searching, and the total number of included (i.e., relevant) references from all search methods. Reporting these metrics in this manner shows the effectiveness of search strategies for each individual database as well as database searching as a whole.
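The two calculations described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the conventional definitions (sensitivity/recall as included references retrieved divided by all included references, precision as included references retrieved divided by all references retrieved). The function names and the numbers are hypothetical and are not taken from the SST in this case study.

```python
def sensitivity(included_found: int, included_total: int) -> float:
    """Share of all included references that this search retrieved (recall)."""
    if included_total <= 0:
        raise ValueError("included_total must be positive")
    return included_found / included_total


def precision(included_found: int, retrieved_total: int) -> float:
    """Share of the retrieved references that ended up being included."""
    if retrieved_total <= 0:
        raise ValueError("retrieved_total must be positive")
    return included_found / retrieved_total


# Hypothetical figures for one database (not from the case study).
retrieved_total = 1000   # references downloaded from the database
included_found = 6       # of those, references included in the review
included_total = 12      # included references from all search methods

print(f"sensitivity: {sensitivity(included_found, included_total):.1%}")
print(f"precision:   {precision(included_found, retrieved_total):.2%}")
```

The same two functions can be applied to the combined database results to obtain the overall figures.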

NNR usually indicates the number of references needed to screen at the title and abstract stage to find one included reference. However, the value of splitting this metric into two additional metrics can be seen: (1) number needed to screen (NNS), which is the number of references that needed to be screened during title and abstract screening to identify one reference to undergo full-text screening; and (2) number needed to read at full text (NNR FT), which is the number of references that needed to be read during full-text screening to include one reference in the systematic review. Reporting these three metrics separately increases the transparency of the searching and selection process.
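As a worked illustration of the three screening metrics, the sketch below computes NNS, NNR FT, and the overall NNR from three counts. The helper name and the counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def number_needed(examined: int, passed: int) -> float:
    """References examined at a stage per reference that passed that stage."""
    if passed <= 0:
        raise ValueError("at least one reference must pass the stage")
    return examined / passed


# Hypothetical counts (not from the case study).
title_abstract_screened = 2000   # references screened on title and abstract
full_text_screened = 50          # references that went on to full-text screening
included = 5                     # references included in the review

nns = number_needed(title_abstract_screened, full_text_screened)   # 40.0
nnr_ft = number_needed(full_text_screened, included)               # 10.0
nnr = number_needed(title_abstract_screened, included)             # 400.0

print(f"NNS={nns:.0f}, NNR FT={nnr_ft:.0f}, NNR={nnr:.0f}")
```

With these definitions the overall NNR is the product of NNS and NNR FT, which is why reporting the two parts separately shows where the screening effort was spent.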

Table 3 shows the SST for Hunt et al.'s systematic review [ 36 ].

Completed SST for Hunt et al.'s systematic review, “‘They've Walked the Walk’: A Systematic Review of Quantitative and Qualitative Evidence for Parent-To-Parent Support for Parents of Babies in Neonatal Care” [ 36 ]

Codes: x=found from the search; y=in database, found when search strategy rerun; n=not in the database; z=in the database, not found using the search strategy; qL=qualitative; qT=quantitative.

Format codes: jnl=journal article; ths=PhD thesis.

Supplementary search codes: fcs=forward citation search; bcs=backward citation search; hs=hand search; wss=website search; org=from contacting organizations.

Databases listed: ASSIA=Applied Social Sciences Index and Abstracts; BNI=British Nursing Index; HMIC=Health Management Information Consortium; PQDT=ProQuest Dissertations and Theses; SPP=Social Policy and Practice; WoS=Web of Science.

Contextual consideration of the findings

Key findings can be surmised from the metrics reported in the SST for this example systematic review, which involved a search for both qualitative and quantitative evidence.

Grey literature.

Two doctoral (PhD) theses were included in the systematic review, and both were found by searching in PsycINFO. One was also found by searching CINAHL. This was surprising, because grey literature searching is often seen as separate from the database searching process, yet these theses were found by searching bibliographic databases as opposed to ProQuest Dissertations and Theses Global.

Search strategy comprehensiveness.

Three included references were found from citation searching, two of which were in both EMBASE and MEDLINE but were not retrieved by the database search strategies. If the search strategies had included the free-text search term “council*” ( supplemental Appendix A ), then these references would have been retrieved. This was an extremely valuable learning point for the information specialist in the team and reaffirmed the purpose of supplementary searching.

Unique references.

The only database to retrieve unique references (n=2) was PsycINFO, demonstrating the high degree of duplication among bibliographic databases.
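A count of unique references per database can be derived directly from the per-reference codes in an SST. The sketch below assumes the table has been transcribed into a mapping from database name to the set of included references that its search retrieved. The database names and reference identifiers are placeholders, not data from this review.

```python
from collections import Counter

# Hypothetical transcription of an SST: database -> included references retrieved.
found_in = {
    "DatabaseA": {"ref1", "ref2", "ref3"},
    "DatabaseB": {"ref2", "ref3"},
    "DatabaseC": {"ref3", "ref4", "ref5"},
}


def unique_references(found_in: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per database, the references that no other database retrieved."""
    counts = Counter(ref for refs in found_in.values() for ref in refs)
    return {
        db: {ref for ref in refs if counts[ref] == 1}
        for db, refs in found_in.items()
    }


for db, refs in unique_references(found_in).items():
    print(db, len(refs), sorted(refs))
```

A database with an empty set contributed only references that were also retrieved elsewhere.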

Supplementary searching.

Hand searching, website searching, and organization searching were carried out but found no additional relevant references. Although forward citation searching found two additional relevant references, both of these (and a third additional relevant reference) were also found by backward citation searching. The time spent on these methods of supplementary searching was not recorded but might be useful in the future.

Qualitative references.

The CINAHL search retrieved only two of the eight qualitative references and did not retrieve any unique qualitative references. This was surprising, because previous research showed that this database was a good source of qualitative studies [ 37 ].

Quantitative references.

All the quantitative references were found from searching MEDLINE and citation searching.

Number needed to read.

Reporting the overall NNR as well as splitting this metric into two metrics (NNS and NNR FT) allowed more accurate and transparent reporting of the screening stages. Concerning the NNS, for every thirty-nine references that underwent title and abstract screening, one underwent full-text screening. Concerning the NNR FT, for every ten references read in full-text screening, one was included in the systematic review.

Often MEDLINE and EMBASE are suggested as the basic minimum for searching on health care topics [ 7 , 19 ]; however, in this case, neither database provided unique records, although MEDLINE had higher sensitivity and precision than EMBASE. For this particular systematic review, searching MEDLINE and PsycINFO along with backward citation searching would have found all of the included references. If this had been done, then the maximum number of references to screen would have been 2,835 (total number of references downloaded), a reduction of over 1,600.
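The question of which combination of sources would have been sufficient can be checked retrospectively once an SST exists. The sketch below is one possible approach, not the authors' method: it tries combinations of sources in increasing size and returns the smallest ones that together cover every included reference. The source names and reference numbers are hypothetical.

```python
from itertools import combinations


def smallest_covering_sets(
    sources: dict[str, set[int]], included: set[int]
) -> list[tuple[str, ...]]:
    """Return all smallest combinations of sources that find every included reference."""
    names = sorted(sources)
    for size in range(1, len(names) + 1):
        hits = [
            combo
            for combo in combinations(names, size)
            if included <= set().union(*(sources[name] for name in combo))
        ]
        if hits:
            return hits
    return []  # no combination covers all included references


# Hypothetical SST data: source -> included references it retrieved.
sources = {
    "DatabaseA": {1, 2, 3},
    "DatabaseB": {2, 3},
    "DatabaseC": {3, 4},
    "backward_citation": {5},
}
included = {1, 2, 3, 4, 5}

print(smallest_covering_sets(sources, included))
```

The exhaustive search grows exponentially with the number of sources, which is acceptable for the handful of databases and supplementary methods in a typical review. A result of this kind describes one review only and does not show that the same combination would suffice for a different question.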

The optimal number of databases that need to be searched varies depending on the review question. However, it is commonly agreed that searching only one database is not sufficient, and supplementary searching in some form is also needed. In this case, backward citation searching found additional studies.

An SST can report data related to database and search performance and effectiveness in terms of sensitivity, precision, recall, NNR, yield, and number of unique records. Furthermore, additional information gathered during supplementary searching (e.g., citation searching and hand-searching) indicates the effectiveness of search strategies for individual databases and which methods of supplementary searching were most useful. This information could allow librarians and information specialists to be more selective when choosing databases and supplementary search methods. Publishing an SST as part of a systematic review would help to develop and make more explicit, rather than tacit, the model of the literature search process as described by Cooper et al. [ 38 ].

Future systematic reviews on a similar condition or population could use a related SST that is already available, either in-house or one that has been published, to enable a more evidence-based approach to database and search methods selection. For rapid reviews [ 39 ] and scoping reviews [ 40 ], in which searching might not be as exhaustive, this information could provide evidence about where to focus the search. SSTs would be particularly valuable in updating a systematic review [ 3 ]; if, for example, two databases that are always searched consistently do not contain any of the included studies, then perhaps they need not be searched in the future.

An SST only provides evidence for one particular systematic review; teams using them for future systematic reviews might not be fully confident that the same results would be produced for their specific questions. However, if all systematic reviews completed and published an SST as standard, then there would be more evidence available for making evidence-based decisions on which databases to search, which supplementary search methods would be most valuable, and which search strategies and terms would find the most relevant references for specific questions. Broad generalizations on searching cannot be made until more SSTs are available, but they can still be a valuable learning tool for all those involved in searching for systematic reviews, as their creation requires reflection on what was done and why, which can be carried on into the next systematic review. SSTs can also provide evidence for other purposes of searching, such as update searches [ 3 ] or scoping and preliminary searches [ 40 ].

SSTs can be useful to librarians and information specialists in several ways. First, for individuals who are new to the topic area or to systematic reviews, they provide a valuable source of evidence on which to base database and search method choices and recommendations. Second, they provide evidence about which databases are essential for undertaking specific systematic reviews, which could be useful for groups or individuals in negotiating database access with their institutions. Third, SSTs could help librarians and information specialists audit their database selections and search strategies, as they would show whether a database contains a reference and whether it would be captured by their search strategy. Fourth, a librarian or information specialist's knowledge would be built up more quickly, because completing an SST would help them reflect on their search strategies, search methods, and database selection.

Another area in which SSTs could be useful is in search methods and information retrieval research. If SSTs are published as part of a systematic review, then the searching becomes more transparent, replicable, and open, which is a fundamental component of good quality systematic reviews. Librarians and information specialists could use the data provided in SSTs to perform more thorough analyses on where studies are likely to be found and which databases suit particular topics. Trends might be observed, such as country-specific biases in database selection and use, and knowledge about specific databases could be shared in an easy format.

One specific area for monitoring is grey literature. By reviewing and analyzing SSTs, librarians and information specialists would be able to determine the extent to which grey literature publications are included in systematic reviews and how they are found, which would help to focus search time and energy. Future research following from this project may include finding a simple way to retrospectively evaluate search strategies, which could help improve future search strategy or search methods development or aid in the creation of a repository where all SSTs could be shared and accessed.

Cooper et al.'s systematic review identified fifty studies of the effectiveness of literature searching, which was a representative sample of the available literature [ 23 ]. SSTs would add to this literature and help move forward the discussion about what constitutes an effective search for a systematic review.

The SST is a simple way to collate the search information generated from a systematic review. Creating and reporting an SST as part of a systematic review would add to the knowledgebase on database selection and supplementary search methods and provide evidence for future searching and search methods research.

  • Systematic review
  • Open access
  • Published: 19 February 2024

‘It depends’: what 86 systematic reviews tell us about what strategies to use to support the use of research in clinical practice

  • Annette Boaz   ORCID: orcid.org/0000-0003-0557-1294 1 ,
  • Juan Baeza 2 ,
  • Alec Fraser   ORCID: orcid.org/0000-0003-1121-1551 2 &
  • Erik Persson 3  

Implementation Science volume 19, Article number: 15 (2024)

The gap between research findings and clinical practice is well documented and a range of strategies have been developed to support the implementation of research into clinical practice. The objective of this study was to update and extend two previous reviews of systematic reviews of strategies designed to implement research evidence into clinical practice.

We developed a comprehensive systematic literature search strategy based on the terms used in the previous reviews to identify studies that looked explicitly at interventions designed to turn research evidence into practice. The search was performed in June 2022 in four electronic databases: Medline, Embase, Cochrane and Epistemonikos. We searched from January 2010 up to June 2022 and applied no language restrictions. Two independent reviewers appraised the quality of included studies using a quality assessment checklist. To reduce the risk of bias, papers were excluded following discussion between all members of the team. Data were synthesised using descriptive and narrative techniques to identify themes and patterns linked to intervention strategies, targeted behaviours, study settings and study outcomes.

We identified 32 reviews conducted between 2010 and 2022. The reviews are mainly of multi-faceted interventions ( n  = 20) although there are reviews focusing on single strategies (ICT, educational, reminders, local opinion leaders, audit and feedback, social media and toolkits). The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. Furthermore, a lot of nuance lies behind these headline findings, and this is increasingly commented upon in the reviews themselves.

Combined with the two previous reviews, 86 systematic reviews of strategies to increase the implementation of research into clinical practice have been identified. We need to shift the emphasis away from isolating individual and multi-faceted interventions to better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice. This will involve drawing on a wider range of research perspectives (including social science) in primary studies and diversifying the types of synthesis undertaken to include approaches such as realist synthesis which facilitate exploration of the context in which strategies are employed.

Contribution to the literature

Considerable time and money is invested in implementing and evaluating strategies to increase the implementation of research into clinical practice.

The growing body of evidence is not providing the anticipated clear lessons to support improved implementation.

Instead what is needed is better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice.

This would involve a more central role in implementation science for a wider range of perspectives, especially from the social, economic, political and behavioural sciences and for greater use of different types of synthesis, such as realist synthesis.

Introduction

The gap between research findings and clinical practice is well documented and a range of interventions has been developed to increase the implementation of research into clinical practice [ 1 , 2 ]. In recent years researchers have worked to improve the consistency in the ways in which these interventions (often called strategies) are described to support their evaluation. One notable development has been the emergence of Implementation Science as a field focusing explicitly on “the scientific study of methods to promote the systematic uptake of research findings and other evidence-based practices into routine practice” ([ 3 ] p. 1). The work of implementation science focuses on closing, or at least narrowing, the gap between research and practice. One contribution has been to map existing interventions, identifying 73 discrete strategies to support research implementation [ 4 ] which have been grouped into 9 clusters [ 5 ]. The authors note that they have not considered the evidence of effectiveness of the individual strategies and that a next step is to understand better which strategies perform best in which combinations and for what purposes [ 4 ]. Other authors have noted that there is also scope to learn more from other related fields of study such as policy implementation [ 6 ] and to draw on methods designed to support the evaluation of complex interventions [ 7 ].

The increase in activity designed to support the implementation of research into practice and improvements in reporting provided the impetus for an update of a review of systematic reviews of the effectiveness of interventions designed to support the use of research in clinical practice [ 8 ] which was itself an update of the review conducted by Grimshaw and colleagues in 2001. The 2001 review [ 9 ] identified 41 reviews considering a range of strategies, from educational interventions, audit and feedback and computerised decision support to financial incentives and combined interventions. The authors concluded that all the interventions had the potential to promote the uptake of evidence in practice, although no one intervention seemed to be more effective than the others in all settings. They concluded that combined interventions were more likely to be effective than single interventions. The 2011 review identified a further 13 systematic reviews containing 313 discrete primary studies. Consistent with the previous review, four main strategy types were identified: audit and feedback; computerised decision support; opinion leaders; and multi-faceted interventions (MFIs). Nine of the reviews reported on MFIs. The review highlighted the small effects of single interventions such as audit and feedback, computerised decision support and opinion leaders. MFIs claimed an improvement in effectiveness over single interventions, although effect sizes remained small to moderate and this improvement in effectiveness relating to MFIs has been questioned in a subsequent review [ 10 ]. In updating the review, we anticipated a larger pool of reviews and an opportunity to consolidate learning from more recent systematic reviews of interventions.

This review updates and extends our previous review of systematic reviews of interventions designed to implement research evidence into clinical practice. To identify potentially relevant peer-reviewed research papers, we developed a comprehensive systematic literature search strategy based on the terms used in the Grimshaw et al. [ 9 ] and Boaz, Baeza and Fraser [ 8 ] overview articles. To ensure optimal retrieval, our search strategy was refined with support from an expert university librarian, considering the ongoing improvements in the development of search filters for systematic reviews since our first review [ 11 ]. We also wanted to include technology-related terms (e.g. apps, algorithms, machine learning, artificial intelligence) to find studies that explored interventions based on the use of technological innovations as mechanistic tools for increasing the use of evidence into practice (see Additional file 1 : Appendix A for full search strategy).

The search was performed in June 2022 in the following electronic databases: Medline, Embase, Cochrane and Epistemonikos. We searched for articles published since the 2011 review. We searched from January 2010 up to June 2022 and applied no language restrictions. Reference lists of relevant papers were also examined.

We uploaded the results to EPPI-Reviewer, a web-based tool that facilitated semi-automation of the screening process and removal of duplicate studies. We made particular use of a priority screening function to reduce screening workload and avoid ‘data deluge’ [ 12 ]. One reviewer screened a smaller number of records ( n  = 1200) to train the software, through machine learning, to predict whether a given record was more likely to be relevant or irrelevant, thus pulling the relevant studies towards the beginning of the screening process. This automation did not replace manual work but helped the reviewer to identify eligible studies more quickly. During the selection process, we included studies that looked explicitly at interventions designed to turn research evidence into practice. Studies were included if they met the following pre-determined inclusion criteria:

The study was a systematic review

Search terms were included

Focused on the implementation of research evidence into practice

The methodological quality of the included studies was assessed as part of the review

Study populations included healthcare providers and patients. The EPOC taxonomy [ 13 ] was used to categorise the strategies. The EPOC taxonomy has four domains: delivery arrangements, financial arrangements, governance arrangements and implementation strategies. The implementation strategies domain includes 20 strategies targeted at healthcare workers. Numerous EPOC strategies were assessed in the review including educational strategies, local opinion leaders, reminders, ICT-focused approaches and audit and feedback. Some strategies that did not fit easily within the EPOC categories were also included. These were social media strategies and toolkits, and multi-faceted interventions (MFIs) (see Table  2 ). Some systematic reviews included comparisons of different interventions while other reviews compared one type of intervention against a control group. Outcomes related to improvements in health care processes or patient well-being. Numerous individual study types (RCT, CCT, BA, ITS) were included within the systematic reviews.

We excluded papers that:

Focused on changing patient rather than provider behaviour

Had no demonstrable outcomes

Made unclear or no reference to research evidence

The last of these criteria was sometimes difficult to judge, and there was considerable discussion amongst the research team as to whether the link between research evidence and practice was sufficiently explicit in the interventions analysed. As we discussed in the previous review [ 8 ] in the field of healthcare, the principle of evidence-based practice is widely acknowledged and tools to change behaviour such as guidelines are often seen to be an implicit codification of evidence, despite the fact that this is not always the case.

Reviewers employed a two-stage process to select papers for inclusion. First, all titles and abstracts were screened by one reviewer to determine whether the study met the inclusion criteria. Two papers [ 14 , 15 ] were identified that fell just before the 2010 cut-off. As they were not identified in the searches for the first review [ 8 ] they were included and progressed to assessment. Each paper was rated as include, exclude or maybe. The full texts of 111 relevant papers were assessed independently by at least two authors. To reduce the risk of bias, papers were excluded following discussion between all members of the team. Thirty-two papers met the inclusion criteria and proceeded to data extraction. The study selection procedure is documented in a PRISMA literature flow diagram (see Fig.  1 ). We were able to include French, Spanish and Portuguese papers in the selection reflecting the language skills in the study team, but none of the papers identified met the inclusion criteria. Other non-English language papers were excluded.

Figure 1. PRISMA flow diagram. Source: authors

One reviewer extracted data on strategy type, number of included studies, local, target population, effectiveness and scope of impact from the included studies. Two reviewers then independently read each paper and noted key findings and broad themes of interest which were then discussed amongst the wider authorial team. Two independent reviewers appraised the quality of included studies using a Quality Assessment Checklist based on Oxman and Guyatt [ 16 ] and Francke et al. [ 17 ]. Each study was given a quality score ranging from 1 (extensive flaws) to 7 (minimal flaws) (see Additional file 2 : Appendix B). All disagreements were resolved through discussion. Studies were not excluded in this updated overview based on methodological quality as we aimed to reflect the full extent of current research into this topic.

The extracted data were synthesised using descriptive and narrative techniques to identify themes and patterns in the data linked to intervention strategies, targeted behaviours, study settings and study outcomes.

Thirty-two studies were included in the systematic review. Table 1 provides a detailed overview of the included systematic reviews comprising reference, strategy type, quality score, number of included studies, local, target population, effectiveness and scope of impact (see Table  1 at the end of the manuscript). Overall, the quality of the studies was high. Twenty-three studies scored 7, six studies scored 6, one study scored 5, one study scored 4 and one study scored 3. The primary focus of the review was on reviews of effectiveness studies, but a small number of reviews did include data from a wider range of methods including qualitative studies which added to the analysis in the papers [ 18 , 19 , 20 , 21 ]. The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. In this section, we discuss the different EPOC-defined implementation strategies in turn. Interestingly, we found only two ‘new’ approaches in this review that did not fit into the existing EPOC approaches. These are a review focused on the use of social media and a review considering toolkits. In addition to single interventions, we also discuss multi-faceted interventions. These were the most common intervention approach overall. A summary is provided in Table  2 .
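The reported distribution of quality scores can be checked and summarised with a short tally. The sketch below uses only the counts stated above. The summary statistics (mean, median, share scoring 6 or 7) are derived here for illustration and are not reported by the authors.

```python
from statistics import mean, median

# Quality score -> number of reviews, as reported in the text above.
score_counts = {7: 23, 6: 6, 5: 1, 4: 1, 3: 1}

total_reviews = sum(score_counts.values())
scores = [score for score, n in score_counts.items() for _ in range(n)]
high_quality_share = (score_counts[7] + score_counts[6]) / total_reviews

print(f"reviews: {total_reviews}")
print(f"mean score: {mean(scores):.2f}, median score: {median(scores)}")
print(f"share scoring 6 or 7: {high_quality_share:.1%}")
```

The counts sum to the 32 included reviews, which confirms that the score breakdown is complete.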

Educational strategies

The overview identified three systematic reviews focusing on educational strategies. Grudniewicz et al. [ 22 ] explored the effectiveness of printed educational materials on primary care physician knowledge, behaviour and patient outcomes and concluded they were not effective in any of these aspects. Koota, Kääriäinen and Melender [ 23 ] focused on educational interventions promoting evidence-based practice among emergency room/accident and emergency nurses and found that interventions involving face-to-face contact led to significant or highly significant effects on patient benefits and emergency nurses’ knowledge, skills and behaviour. Interventions using written self-directed learning materials also led to significant improvements in nurses’ knowledge of evidence-based practice. Although the quality of the studies was high, the review primarily included small studies with low response rates, and many of them relied on self-assessed outcomes; consequently, the strength of the evidence for these outcomes is modest. Wu et al. [ 20 ] asked whether educational interventions aimed at nurses to support the implementation of evidence-based practice improve patient outcomes. Although based on evaluation projects and qualitative data, their results also suggest that positive changes in patient outcomes can be made following the implementation of specific evidence-based approaches (or projects). The differing positive outcomes for educational strategies aimed at nurses might indicate that the target audience is important.

Local opinion leaders

Flodgren et al. [ 24 ] was the only systematic review focusing solely on opinion leaders. The review found that local opinion leaders alone, or in combination with other interventions, can be effective in promoting evidence-based practice, but this varies both within and between studies and the effect on patient outcomes is uncertain. Overall, any intervention involving opinion leaders probably improves healthcare professionals’ compliance with evidence-based practice, although this effect also varies within and across studies. However, how opinion leaders had an impact could not be determined because insufficient details were provided, illustrating that reporting specific details in published studies is important if effective methods of increasing evidence-based practice are to be spread across a system. The usefulness of this review is questionable because it cannot provide evidence of what makes an effective opinion leader, whether teams of opinion leaders or a single opinion leader are most effective, or which methods used by opinion leaders are most effective.

Reminders

Pantoja et al. [ 26 ] was the only systematic review included in the overview that focused solely on manually generated reminders delivered on paper. The review explored how these affected professional practice and patient outcomes. The review concluded that manually generated reminders delivered on paper as a single intervention probably led to small to moderate increases in adherence to clinical recommendations, and they could be used as a single quality improvement intervention. However, the authors indicated that this intervention would make little or no difference to patient outcomes. The authors state that such a low-tech intervention may be useful in low- and middle-income countries where paper records are more likely to be the norm.

ICT-focused approaches

The three ICT-focused reviews [ 14 , 27 , 28 ] showed mixed results. Jamal, McKenzie and Clark [ 14 ] explored the impact of health information technology on the quality of medical and health care. They examined the impact of electronic health records, computerised provider order-entry, or decision support systems. The review showed an improvement in adherence to evidence-based guidelines but not in patient outcomes. The number of studies included in the review was low and so a conclusive recommendation could not be reached based on this review. Similarly, Brown et al. [ 28 ] found that technology-enabled knowledge translation interventions may improve knowledge of health professionals, but all eight studies raised concerns of bias. The De Angelis et al. [ 27 ] review was more promising, reporting that ICT can be a good way of disseminating clinical practice guidelines, but concluding that it is unclear which type of ICT method is the most effective.

Audit and feedback

Sykes, McAnuff and Kolehmainen [ 29 ] examined whether audit and feedback were effective in dementia care and concluded that it remains unclear which ingredients of audit and feedback are successful as the reviewed papers illustrated large variations in the effectiveness of interventions using audit and feedback.

Non-EPOC listed strategies: social media, toolkits

There were two new (non-EPOC listed) intervention types identified in this review compared to the 2011 review, fewer than anticipated. We categorised a third, ‘care bundles’ [ 36 ], as a multi-faceted intervention due to its description in practice, and a fourth, ‘Technology Enhanced Knowledge Transfer’ [ 28 ], was classified as an ICT-focused approach. The first new strategy was identified in Bhatt et al.’s [ 30 ] systematic review of the use of social media for the dissemination of clinical practice guidelines. They reported that the use of social media resulted in a significant improvement in knowledge and compliance with evidence-based guidelines compared with more traditional methods. They noted that a wide selection of different healthcare professionals and patients engaged with this type of social media and its global reach may be significant for low- and middle-income countries. This review was also noteworthy for developing a simple stepwise method for using social media for the dissemination of clinical practice guidelines. However, it is debatable whether social media can be classified as an intervention or just a different way of delivering an intervention. For example, the review discussed involving opinion leaders and patient advocates through social media. In addition, this was a small review that included only five studies, so further research in this new area is needed. Yamada et al. [ 31 ] draw on 39 studies to explore the application of toolkits, 18 of which had toolkits embedded within larger KT interventions, and 21 of which evaluated toolkits as standalone interventions. The individual component strategies of the toolkits were highly variable, though the authors suggest that they align most closely with educational strategies. The authors conclude that toolkits as either standalone strategies or as part of MFIs hold some promise for facilitating evidence use in practice but caution that the quality of many of the primary studies included is considered weak, limiting these findings.

Multi-faceted interventions

The majority of the systematic reviews (n = 20) reported on more than one intervention type. Some of these systematic reviews focus exclusively on multi-faceted interventions, whilst others compare different single or combined interventions aimed at achieving similar outcomes in particular settings. While these two approaches are often described in a similar way, they are actually quite distinct from each other: the former report how multiple strategies may be strategically combined in pursuance of an agreed goal, whilst the latter report how different strategies may be incidentally used in sometimes contrasting settings in the pursuance of similar goals. Ariyo et al. [ 35 ] helpfully summarise five key elements often found in effective MFI strategies in LMICs, which may also be transferable to HICs. First, effective MFIs encourage a multi-disciplinary approach acknowledging the roles played by different professional groups to collectively incorporate evidence-informed practice. Second, they utilise leadership drawing on a wide set of clinical and non-clinical actors including managers and even government officials. Third, multiple types of educational practices are utilised, including input from patients as stakeholders in some cases. Fourth, protocols, checklists and bundles are used, most effectively when local ownership is encouraged. Finally, most MFIs included an emphasis on monitoring and evaluation [ 35 ]. In contrast, other studies offer little information about the nature of the different MFI components of included studies, which makes it difficult to extrapolate much learning from them in relation to why or how MFIs might affect practice (e.g. [ 28 , 38 ]). Ultimately, context matters, which some review authors argue makes it difficult to say with real certainty whether single or MFI strategies are superior (e.g. [ 21 , 27 ]). Taking all the systematic reviews together we may conclude that MFIs appear to be more likely to generate positive results than single interventions (e.g. [ 34 , 45 ]) though other reviews should make us cautious (e.g. [ 32 , 43 ]).

While multi-faceted interventions still seem to be more effective than single-strategy interventions, there were important distinctions between how the results of reviews of MFIs are interpreted in this review as compared to the previous reviews [ 8 , 9 ], reflecting greater nuance and debate in the literature. This was particularly noticeable where the effectiveness of MFIs was compared to single strategies, reflecting developments widely discussed in previous studies [ 10 ]. We found that most systematic reviews are bounded by their clinical, professional, spatial, system, or setting criteria and often seek to draw out implications for the implementation of evidence in their areas of specific interest (such as nursing or acute care). Frequently this means combining all relevant studies to explore the respective foci of each systematic review. Therefore, most reviews we categorised as MFIs actually include highly variable numbers and combinations of intervention strategies and highly heterogeneous original study designs. This makes statistical analyses of the type used by Squires et al. [ 10 ] on the three reviews in their paper not possible. Further, it also makes extrapolating findings and commenting on broad themes complex and difficult. This may suggest that future research should shift its focus from merely examining ‘what works’ to ‘what works where and what works for whom’ — perhaps pointing to the value of realist approaches to these complex review topics [ 48 , 49 ] and other more theory-informed approaches [ 50 ].

Some reviews have a relatively small number of studies (i.e. fewer than 10) and the authors are often understandably reluctant to engage with wider debates about the implications of their findings. Other larger studies do engage in deeper discussions about internal comparisons of findings across included studies and also contextualise these in wider debates. Some of the most informative studies (e.g. [ 35 , 40 ]) move beyond EPOC categories and contextualise MFIs within wider systems thinking and implementation theory. This distinction between MFIs and single interventions can actually be very useful as it offers lessons about the contexts in which individual interventions might have bounded effectiveness (i.e. educational interventions for individual change). Taken as a whole, this may also then help in terms of how and when to conjoin single interventions into effective MFIs.

In the two previous reviews, a consistent finding was that MFIs were more effective than single interventions [ 8 , 9 ]. However, like Squires et al. [ 10 ], this overview is more equivocal on this important issue. There are four points which may help account for the differences in findings in this regard. Firstly, the diversity of the systematic reviews in terms of clinical topic or setting is an important factor. Secondly, there is heterogeneity of the studies within the included systematic reviews themselves. Thirdly, there is a lack of consistency with regard to the definition of MFIs and the strategies included within them. Finally, there are epistemological differences across the papers and the reviews. This means that the results that are presented depend on the methods used to measure, report, and synthesise them. For instance, some reviews highlight that education strategies can be useful to improve provider understanding, but without wider organisational or system-level change, they may struggle to deliver sustained transformation [ 19 , 44 ].

It is also worth highlighting the importance of the theory of change underlying the different interventions. Where authors of the systematic reviews draw on theory, there is space to discuss/explain findings. We note a distinction between theoretical and atheoretical systematic review discussion sections. Atheoretical reviews tend to present acontextual findings (for instance, one study found very positive results for one intervention, and this gets highlighted in the abstract) whilst theoretically informed reviews attempt to contextualise and explain patterns within the included studies. Theory-informed systematic reviews seem more likely to offer more profound and useful insights (see [ 19 , 35 , 40 , 43 , 45 ]). We find that the most insightful systematic reviews of MFIs engage in theoretical generalisation — they attempt to go beyond the data of individual studies and discuss the wider implications of the findings of the studies within their reviews drawing on implementation theory. At the same time, they highlight the active role of context and the wider relational and system-wide issues linked to implementation. It is these types of investigations that can help providers further develop evidence-based practice.

This overview has identified a small, but insightful set of papers that interrogate and help theorise why, how, for whom, and in which circumstances it might be the case that MFIs are superior (see [ 19 , 35 , 40 ] once more). At the level of this overview — and in most of the systematic reviews included — it appears to be the case that MFIs struggle with the question of attribution. In addition, there are other important elements that are often unmeasured, or unreported (e.g. costs of the intervention — see [ 40 ]). Finally, the stronger systematic reviews [ 19 , 35 , 40 , 43 , 45 ] engage with systems issues, human agency and context [ 18 ] in a way that was not evident in the systematic reviews identified in the previous reviews [ 8 , 9 ]. The earlier reviews lacked any theory of change that might explain why MFIs might be more effective than single ones — whereas now some systematic reviews do this, which enables them to conclude that sometimes single interventions can still be more effective.

As Nilsen et al. ([ 6 ] p. 7) note ‘Study findings concerning the effectiveness of various approaches are continuously synthesized and assembled in systematic reviews’. We may have gone as far as we can in understanding the implementation of evidence through systematic reviews of single and multi-faceted interventions and the next step would be to conduct more research exploring the complex and situated nature of evidence used in clinical practice and by particular professional groups. This would further build on the nuanced discussion and conclusion sections in a subset of the papers we reviewed. This might also support the field to move away from isolating individual implementation strategies [ 6 ] to explore the complex processes involving a range of actors with differing capacities [ 51 ] working in diverse organisational cultures. Taxonomies of implementation strategies do not fully account for the complex process of implementation, which involves a range of different actors with different capacities and skills across multiple system levels. There is plenty of work to build on, particularly in the social sciences, which currently sits at the margins of debates about evidence implementation (see for example, Normalisation Process Theory [ 52 ]).

There are several changes that we have identified in this overview of systematic reviews in comparison to the review we published in 2011 [ 8 ]. A consistent and welcome finding is that the overall quality of the systematic reviews themselves appears to have improved between the two reviews, although this is not reflected upon in the papers. This is exhibited through better, clearer reporting mechanisms in relation to the mechanics of the reviews, alongside a greater attention to, and deeper description of, how potential biases in included papers are discussed. Additionally, there is an increased, but still limited, inclusion of original studies conducted in low- and middle-income countries as opposed to just high-income countries. Importantly, we found that many of these systematic reviews are attuned to, and comment upon, the contextual distinctions of pursuing evidence-informed interventions in health care settings in different economic settings. Furthermore, systematic reviews included in this updated article cover a wider set of clinical specialities (both within and beyond hospital settings) and have a focus on a wider set of healthcare professions, discussing the similarities, differences and inter-professional challenges faced therein, compared to the earlier reviews. This wider range of studies highlights that a particular intervention or group of interventions may work well for one professional group but be ineffective for another. This diversity of study settings allows us to consider the important role context (in its many forms) plays in implementing evidence into practice. Examining the complex and varied context of health care will help us address what Nilsen et al. ([ 6 ] p. 1) described as, ‘society’s health problems [that] require research-based knowledge acted on by healthcare practitioners together with implementation of political measures from governmental agencies’. This will help us shift implementation science to move, ‘beyond a success or failure perspective towards improved analysis of variables that could explain the impact of the implementation process’ ([ 6 ] p. 2).

This review brings together 32 papers considering individual and multi-faceted interventions designed to support the use of evidence in clinical practice. The majority of reviews report strategies achieving small impacts (normally on processes of care). There is much less evidence that these strategies have shifted patient outcomes. Combined with the two previous reviews, 86 systematic reviews of strategies to increase the implementation of research into clinical practice have been conducted. As a whole, this substantial body of knowledge struggles to tell us more about the use of individual interventions and MFIs than: ‘it depends’. To really move forwards in addressing the gap between research evidence and practice, we may need to shift the emphasis away from isolating individual and multi-faceted interventions to better understanding and building more situated, relational and organisational capability to support the use of research in clinical practice. This will involve drawing on a wider range of perspectives, especially from the social, economic, political and behavioural sciences in primary studies, and diversifying the types of synthesis undertaken to include approaches such as realist synthesis, which facilitate exploration of the context in which strategies are employed. Harvey et al. [ 53 ] suggest that when context is likely to be critical to implementation success there are a range of primary research approaches (participatory research, realist evaluation, developmental evaluation, ethnography, quality/rapid cycle improvement) that are likely to be appropriate and insightful. While these approaches often form part of implementation studies in the form of process evaluations, they are usually relatively small scale in relation to implementation research as a whole. As a result, the findings often do not make it into the subsequent systematic reviews. This review provides further evidence that we need to bring qualitative approaches in from the periphery to play a central role in many implementation studies and subsequent evidence syntheses. It would be helpful for systematic reviews, at the very least, to include more detail about the interventions and their implementation in terms of how and why they worked.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

BA: Before and after study

CCT: Controlled clinical trial

EPOC: Effective Practice and Organisation of Care

HICs: High-income countries

ICT: Information and Communications Technology

ITS: Interrupted time series

KT: Knowledge translation

LMICs: Low- and middle-income countries

RCT: Randomised controlled trial

References

Grol R, Grimshaw J. From best evidence to best practice: effective implementation of change in patients’ care. Lancet. 2003;362:1225–30. https://doi.org/10.1016/S0140-6736(03)14546-1 .

Green LA, Seifert CM. Translation of research into practice: why we can’t “just do it.” J Am Board Fam Pract. 2005;18:541–5. https://doi.org/10.3122/jabfm.18.6.541 .

Eccles MP, Mittman BS. Welcome to Implementation Science. Implement Sci. 2006;1:1–3. https://doi.org/10.1186/1748-5908-1-1 .

Powell BJ, Waltz TJ, Chinman MJ, Damschroder LJ, Smith JL, Matthieu MM, et al. A refined compilation of implementation strategies: results from the Expert Recommendations for Implementing Change (ERIC) project. Implement Sci. 2015;10:2–14. https://doi.org/10.1186/s13012-015-0209-1 .

Waltz TJ, Powell BJ, Matthieu MM, Damschroder LJ, et al. Use of concept mapping to characterize relationships among implementation strategies and assess their feasibility and importance: results from the Expert Recommendations for Implementing Change (ERIC) study. Implement Sci. 2015;10:1–8. https://doi.org/10.1186/s13012-015-0295-0 .

Nilsen P, Ståhl C, Roback K, et al. Never the twain shall meet? - a comparison of implementation science and policy implementation research. Implementation Sci. 2013;8:2–12. https://doi.org/10.1186/1748-5908-8-63 .

Rycroft-Malone J, Seers K, Eldh AC, et al. A realist process evaluation within the Facilitating Implementation of Research Evidence (FIRE) cluster randomised controlled international trial: an exemplar. Implementation Sci. 2018;13:1–15. https://doi.org/10.1186/s13012-018-0811-0 .

Boaz A, Baeza J, Fraser A, European Implementation Score Collaborative Group (EIS). Effective implementation of research into practice: an overview of systematic reviews of the health literature. BMC Res Notes. 2011;4:212. https://doi.org/10.1186/1756-0500-4-212 .

Grimshaw JM, Shirran L, Thomas R, Mowatt G, Fraser C, Bero L, et al. Changing provider behavior – an overview of systematic reviews of interventions. Med Care. 2001;39 8Suppl 2:II2–45.

Squires JE, Sullivan K, Eccles MP, et al. Are multifaceted interventions more effective than single-component interventions in changing health-care professionals’ behaviours? An overview of systematic reviews. Implement Sci. 2014;9:1–22. https://doi.org/10.1186/s13012-014-0152-6 .

Salvador-Oliván JA, Marco-Cuenca G, Arquero-Avilés R. Development of an efficient search filter to retrieve systematic reviews from PubMed. J Med Libr Assoc. 2021;109:561–74. https://doi.org/10.5195/jmla.2021.1223 .

Thomas JM. Diffusion of innovation in systematic review methodology: why is study selection not yet assisted by automation? OA Evid Based Med. 2013;1:1–6.

Effective Practice and Organisation of Care (EPOC). The EPOC taxonomy of health systems interventions. EPOC Resources for review authors. Oslo: Norwegian Knowledge Centre for the Health Services; 2016. epoc.cochrane.org/epoc-taxonomy . Accessed 9 Oct 2023.

Jamal A, McKenzie K, Clark M. The impact of health information technology on the quality of medical and health care: a systematic review. Health Inf Manag. 2009;38:26–37. https://doi.org/10.1177/183335830903800305 .

Menon A, Korner-Bitensky N, Kastner M, et al. Strategies for rehabilitation professionals to move evidence-based knowledge into practice: a systematic review. J Rehabil Med. 2009;41:1024–32. https://doi.org/10.2340/16501977-0451 .

Oxman AD, Guyatt GH. Validation of an index of the quality of review articles. J Clin Epidemiol. 1991;44:1271–8. https://doi.org/10.1016/0895-4356(91)90160-b .

Francke AL, Smit MC, de Veer AJ, et al. Factors influencing the implementation of clinical guidelines for health care professionals: a systematic meta-review. BMC Med Inform Decis Mak. 2008;8:1–11. https://doi.org/10.1186/1472-6947-8-38 .

Jones CA, Roop SC, Pohar SL, et al. Translating knowledge in rehabilitation: systematic review. Phys Ther. 2015;95:663–77. https://doi.org/10.2522/ptj.20130512 .

Scott D, Albrecht L, O’Leary K, Ball GDC, et al. Systematic review of knowledge translation strategies in the allied health professions. Implement Sci. 2012;7:1–17. https://doi.org/10.1186/1748-5908-7-70 .

Wu Y, Brettle A, Zhou C, Ou J, et al. Do educational interventions aimed at nurses to support the implementation of evidence-based practice improve patient outcomes? A systematic review. Nurse Educ Today. 2018;70:109–14. https://doi.org/10.1016/j.nedt.2018.08.026 .

Yost J, Ganann R, Thompson D, Aloweni F, et al. The effectiveness of knowledge translation interventions for promoting evidence-informed decision-making among nurses in tertiary care: a systematic review and meta-analysis. Implement Sci. 2015;10:1–15. https://doi.org/10.1186/s13012-015-0286-1 .

Grudniewicz A, Kealy R, Rodseth RN, Hamid J, et al. What is the effectiveness of printed educational materials on primary care physician knowledge, behaviour, and patient outcomes: a systematic review and meta-analyses. Implement Sci. 2015;10:2–12. https://doi.org/10.1186/s13012-015-0347-5 .

Koota E, Kääriäinen M, Melender HL. Educational interventions promoting evidence-based practice among emergency nurses: a systematic review. Int Emerg Nurs. 2018;41:51–8. https://doi.org/10.1016/j.ienj.2018.06.004 .

Flodgren G, O’Brien MA, Parmelli E, et al. Local opinion leaders: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2019. https://doi.org/10.1002/14651858.CD000125.pub5 .

Arditi C, Rège-Walther M, Durieux P, et al. Computer-generated reminders delivered on paper to healthcare professionals: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2017. https://doi.org/10.1002/14651858.CD001175.pub4 .

Pantoja T, Grimshaw JM, Colomer N, et al. Manually-generated reminders delivered on paper: effects on professional practice and patient outcomes. Cochrane Database Syst Rev. 2019. https://doi.org/10.1002/14651858.CD001174.pub4 .

De Angelis G, Davies B, King J, McEwan J, et al. Information and communication technologies for the dissemination of clinical practice guidelines to health professionals: a systematic review. JMIR Med Educ. 2016;2:e16. https://doi.org/10.2196/mededu.6288 .

Brown A, Barnes C, Byaruhanga J, McLaughlin M, et al. Effectiveness of technology-enabled knowledge translation strategies in improving the use of research in public health: systematic review. J Med Internet Res. 2020;22:e17274. https://doi.org/10.2196/17274 .

Sykes MJ, McAnuff J, Kolehmainen N. When is audit and feedback effective in dementia care? A systematic review. Int J Nurs Stud. 2018;79:27–35. https://doi.org/10.1016/j.ijnurstu.2017.10.013 .

Bhatt NR, Czarniecki SW, Borgmann H, et al. A systematic review of the use of social media for dissemination of clinical practice guidelines. Eur Urol Focus. 2021;7:1195–204. https://doi.org/10.1016/j.euf.2020.10.008 .

Yamada J, Shorkey A, Barwick M, Widger K, et al. The effectiveness of toolkits as knowledge translation strategies for integrating evidence into clinical care: a systematic review. BMJ Open. 2015;5:e006808. https://doi.org/10.1136/bmjopen-2014-006808 .

Afari-Asiedu S, Abdulai MA, Tostmann A, et al. Interventions to improve dispensing of antibiotics at the community level in low and middle income countries: a systematic review. J Glob Antimicrob Resist. 2022;29:259–74. https://doi.org/10.1016/j.jgar.2022.03.009 .

Boonacker CW, Hoes AW, Dikhoff MJ, Schilder AG, et al. Interventions in health care professionals to improve treatment in children with upper respiratory tract infections. Int J Pediatr Otorhinolaryngol. 2010;74:1113–21. https://doi.org/10.1016/j.ijporl.2010.07.008 .

Al Zoubi FM, Menon A, Mayo NE, et al. The effectiveness of interventions designed to increase the uptake of clinical practice guidelines and best practices among musculoskeletal professionals: a systematic review. BMC Health Serv Res. 2018;18:2–11. https://doi.org/10.1186/s12913-018-3253-0 .

Ariyo P, Zayed B, Riese V, Anton B, et al. Implementation strategies to reduce surgical site infections: a systematic review. Infect Control Hosp Epidemiol. 2019;3:287–300. https://doi.org/10.1017/ice.2018.355 .

Borgert MJ, Goossens A, Dongelmans DA. What are effective strategies for the implementation of care bundles on ICUs: a systematic review. Implement Sci. 2015;10:1–11. https://doi.org/10.1186/s13012-015-0306-1 .

Cahill LS, Carey LM, Lannin NA, et al. Implementation interventions to promote the uptake of evidence-based practices in stroke rehabilitation. Cochrane Database Syst Rev. 2020. https://doi.org/10.1002/14651858.CD012575.pub2 .

Pedersen ER, Rubenstein L, Kandrack R, Danz M, et al. Elusive search for effective provider interventions: a systematic review of provider interventions to increase adherence to evidence-based treatment for depression. Implement Sci. 2018;13:1–30. https://doi.org/10.1186/s13012-018-0788-8 .

Jenkins HJ, Hancock MJ, French SD, Maher CG, et al. Effectiveness of interventions designed to reduce the use of imaging for low-back pain: a systematic review. CMAJ. 2015;187:401–8. https://doi.org/10.1503/cmaj.141183 .

Bennett S, Laver K, MacAndrew M, Beattie E, et al. Implementation of evidence-based, non-pharmacological interventions addressing behavior and psychological symptoms of dementia: a systematic review focused on implementation strategies. Int Psychogeriatr. 2021;33:947–75. https://doi.org/10.1017/S1041610220001702 .

Noonan VK, Wolfe DL, Thorogood NP, et al. Knowledge translation and implementation in spinal cord injury: a systematic review. Spinal Cord. 2014;52:578–87. https://doi.org/10.1038/sc.2014.62 .

Albrecht L, Archibald M, Snelgrove-Clarke E, et al. Systematic review of knowledge translation strategies to promote research uptake in child health settings. J Pediatr Nurs. 2016;31:235–54. https://doi.org/10.1016/j.pedn.2015.12.002 .

Campbell A, Louie-Poon S, Slater L, et al. Knowledge translation strategies used by healthcare professionals in child health settings: an updated systematic review. J Pediatr Nurs. 2019;47:114–20. https://doi.org/10.1016/j.pedn.2019.04.026 .

Bird ML, Miller T, Connell LA, et al. Moving stroke rehabilitation evidence into practice: a systematic review of randomized controlled trials. Clin Rehabil. 2019;33:1586–95. https://doi.org/10.1177/0269215519847253 .

Goorts K, Dizon J, Milanese S. The effectiveness of implementation strategies for promoting evidence informed interventions in allied healthcare: a systematic review. BMC Health Serv Res. 2021;21:1–11. https://doi.org/10.1186/s12913-021-06190-0 .

Zadro JR, O’Keeffe M, Allison JL, Lembke KA, et al. Effectiveness of implementation strategies to improve adherence of physical therapist treatment choices to clinical practice guidelines for musculoskeletal conditions: systematic review. Phys Ther. 2020;100:1516–41. https://doi.org/10.1093/ptj/pzaa101 .

Van der Veer SN, Jager KJ, Nache AM, et al. Translating knowledge on best practice into improving quality of RRT care: a systematic review of implementation strategies. Kidney Int. 2011;80:1021–34. https://doi.org/10.1038/ki.2011.222 .

Pawson R, Greenhalgh T, Harvey G, et al. Realist review–a new method of systematic review designed for complex policy interventions. J Health Serv Res Policy. 2005;10 Suppl 1:21–34. https://doi.org/10.1258/1355819054308530 .

Rycroft-Malone J, McCormack B, Hutchinson AM, et al. Realist synthesis: illustrating the method for implementation research. Implementation Sci. 2012;7:1–10. https://doi.org/10.1186/1748-5908-7-33 .

Johnson MJ, May CR. Promoting professional behaviour change in healthcare: what interventions work, and why? A theory-led overview of systematic reviews. BMJ Open. 2015;5:e008592. https://doi.org/10.1136/bmjopen-2015-008592 .

Metz A, Jensen T, Farley A, Boaz A, et al. Is implementation research out of step with implementation practice? Pathways to effective implementation support over the last decade. Implement Res Pract. 2022;3:1–11. https://doi.org/10.1177/26334895221105585 .

May CR, Finch TL, Cornford J, Exley C, et al. Integrating telecare for chronic disease management in the community: What needs to be done? BMC Health Serv Res. 2011;11:1–11. https://doi.org/10.1186/1472-6963-11-131 .

Harvey G, Rycroft-Malone J, Seers K, Wilson P, et al. Connecting the science and practice of implementation – applying the lens of context to inform study design in implementation research. Front Health Serv. 2023;3:1–15. https://doi.org/10.3389/frhs.2023.1162762 .

Acknowledgements

The authors would like to thank Professor Kathryn Oliver for her support in planning the review, Professor Steve Hanney for reading and commenting on the final manuscript, and the staff at LSHTM library for their support in planning and conducting the literature search.

This study was supported by LSHTM’s Research England QR strategic priorities funding allocation and the National Institute for Health and Care Research (NIHR) Applied Research Collaboration South London (NIHR ARC South London) at King’s College Hospital NHS Foundation Trust. Grant number NIHR200152. The views expressed are those of the author(s) and not necessarily those of the NIHR, the Department of Health and Social Care or Research England.

Author information

Authors and affiliations.

Health and Social Care Workforce Research Unit, The Policy Institute, King’s College London, Virginia Woolf Building, 22 Kingsway, London, WC2B 6LE, UK

Annette Boaz

King’s Business School, King’s College London, 30 Aldwych, London, WC2B 4BG, UK

Juan Baeza & Alec Fraser

Federal University of Santa Catarina (UFSC), Campus Universitário Reitor João Davi Ferreira Lima, Florianópolis, SC, 88.040-900, Brazil

Erik Persson

Contributions

AB led the conceptual development and structure of the manuscript. EP conducted the searches and data extraction. All authors contributed to screening and quality appraisal. EP and AF wrote the first draft of the methods section. AB, JB and AF performed result synthesis and contributed to the analyses. AB wrote the first draft of the manuscript and incorporated feedback and revisions from all other authors. All authors revised and approved the final manuscript.

Corresponding author

Correspondence to Annette Boaz .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Appendix A.

Additional file 2: Appendix B.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Boaz, A., Baeza, J., Fraser, A. et al. ‘It depends’: what 86 systematic reviews tell us about what strategies to use to support the use of research in clinical practice. Implementation Sci 19, 15 (2024). https://doi.org/10.1186/s13012-024-01337-z

Received: 01 November 2023

Accepted: 05 January 2024

Published: 19 February 2024

DOI: https://doi.org/10.1186/s13012-024-01337-z

  • Implementation
  • Interventions
  • Clinical practice
  • Research evidence
  • Multi-faceted

Implementation Science

ISSN: 1748-5908

  • Open access
  • Published: 23 February 2024

Comparison of clinical and radiological outcomes for the anterior and medial approaches to open reduction in the treatment of bilateral developmental dysplasia of the hip: a systematic review protocol

  • Edward Alan Jenner   ORCID: orcid.org/0000-0003-0803-5091 1 ,
  • Govind Singh Chauhan 1 ,
  • Abdus Burahee 2 , 3 ,
  • Junaid Choudri 2 ,
  • Adrian Gardner 2 , 3 &
  • Christopher Edward Bache 1  

Systematic Reviews volume 13, Article number: 72 (2024)

Developmental dysplasia of the hip (DDH) affects 1–3% of newborns and 20% of cases are bilateral. The optimal surgical management strategy for patients with bilateral DDH who fail bracing, closed reduction or present too late for these methods to be used is unclear. There are proponents of both medial approach open reduction (MAOR) and anterior approach open reduction (AOR); however, there is little evidence to inform this debate.

We will perform a systematic review designed according to the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocol. We will search the medical and scientific databases including the grey and difficult to locate literature. The Medical Subject Headings “developmental dysplasia of the hip”, “congenital dysplasia of the hip”, “congenital hip dislocation”, “developmental hip dislocation”, and their abbreviations, “DDH” and “CDH” will be used, along with the qualifier “bilateral”. Reviewers will independently screen records for inclusion and then independently extract data on study design, population characteristics, details of operative intervention and outcomes from the selected records. Data will be synthesised and a meta-analysis performed if possible. If not possible we will analyse data according to the Synthesis Without Meta-analysis guidance. All studies will be assessed for risk of bias. For each outcome measure a summary of findings will be presented in a table with the overall quality of the recommendation assessed using the Grading of Recommendations Assessment, Development and Evaluation approach.

The decision to perform MAOR or AOR in patients with bilateral DDH who have failed conservative management is not well informed by the current literature. High-quality, comparative studies are exceptionally challenging to perform for this patient population and likely to be extremely uncommon. A systematic review provides the best opportunity to deliver the highest possible quality of evidence for bilateral DDH surgical management.

Systematic review registration

The protocol has been registered in the International Prospective Register of Systematic Reviews (PROSPERO ID CRD42022362325).

Introduction

Developmental dysplasia of the hip (DDH) describes a spectrum of abnormalities in the infant’s hip, from subluxation to frank dislocation, due to incomplete acetabular and femoral head development [ 1 ]. It affects 1–3% of newborns, and 20% of cases are bilateral [ 2 , 3 , 4 ]. Although many cases of DDH resolve spontaneously as the child grows [ 5 ], those in whom the hip(s) remain shallow, subluxed or dislocated will go on to develop gait abnormalities, hip pain and early-onset osteoarthritis [ 6 ]. This often requires early hip arthroplasty [ 7 ]. Some authors have reported worse clinical and radiological outcomes for children with bilateral DDH than for children with unilateral DDH [ 8 , 9 , 10 ], whereas others have found no difference [ 11 , 12 ].

The aim of treatment in bilateral DDH is to achieve concentrically reduced hips, without significant deformity or residual dysplasia. If bilateral DDH is detected in the neonatal period, abduction bracing is attempted, although failure rates are higher than for unilateral disease [ 8 , 13 , 14 ]. Patients who fail bracing proceed to examination under anaesthetic and arthrogram, aiming for closed reduction and hip spica. Typically, this is performed before age 6 months. Bilateral DDH represents a significant risk factor for failure of conservative treatment [ 8 , 9 ], and patients failing closed reduction proceed to open reduction.

Operative options are medial approach open reduction (MAOR) or anterior approach open reduction (AOR). MAOR is performed between 6 and 18 months of age [ 15 ]. This approach requires limited soft tissue dissection through a small, cosmetically acceptable, anteromedial incision with minimal blood loss. The anatomical blocks to reduction (capsular constriction, transverse acetabular ligament, ligamentum teres and iliopsoas tendon) are well visualised and released. Both hips are usually operated on at the same sitting and the patient is immobilised in a hip spica for 6–12 weeks postoperatively. Critics suggest that MAOR increases the risk of femoral head avascular necrosis (AVN), prevents the blocks to femoral head reduction from being fully addressed and does not allow capsulorrhaphy [ 16 , 17 , 18 ]. Rates of residual dysplasia may also be higher. It has been reported that MAOR may have worse outcomes compared to AOR [ 16 , 17 , 18 ]; however, these studies relate to unilateral cases and limited data, specific to bilateral DDH, has been published. The data relating to unilateral disease is itself heterogeneous and contradictory [ 15 , 19 , 20 ].

AOR is usually performed around 12–24 months of age through a bikini line incision via the ilio-inguinal approach. This results in a larger, less cosmetically acceptable scar, more soft tissue dissection, potentially greater blood loss and a risk of damage to the lateral femoral cutaneous nerve [ 21 , 22 ]. Proponents argue that AOR allows all the potential blocks to femoral head reduction to be addressed and capsulorrhaphy to be performed, thereby improving outcomes [ 23 ]. Pelvic osteotomy can be performed through the same approach, and this is usually required when surgery is undertaken after age 2 years [ 24 , 25 , 26 , 27 ]. Typically, in AOR, one hip is operated on at each sitting with a 6-week gap between surgeries during which the patient is immobilised in a hip spica cast [ 11 , 12 ]. Some authors have reported single-sitting bilateral surgery in AOR [ 25 ]; however, this remains rare.

The choice of AOR or MAOR depends on a number of factors, including the patient’s age, the surgeon’s training and experience and the perceived advantages and disadvantages of each technique. Each of these surgical management strategies for bilateral DDH has its proponents; however, there is limited evidence to inform decision-making. To the best of our knowledge, this will be the first systematic review comparing outcomes for AOR vs MAOR in bilateral DDH.

Our aim is to establish whether there is a difference in the clinical and radiological outcomes for children with bilateral DDH who have been treated with MAOR compared to AOR. We will examine a range of clinical and radiological outcome measures and if possible perform a quantitative analysis. We will summarise the evidence available and give recommendations for management. This will help to inform decision-making in the management of bilateral DDH.

Design and methods

This protocol has been designed according to the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocol (PRISMA-P) [ 28 , 29 ]. The design and method have been formed through discussion between experts in the management of DDH and experts in the methodology of systematic reviews. The protocol has been registered in the International Prospective Register of Systematic Reviews (PROSPERO ID CRD42022362325).

Eligibility criteria

Population

Children with idiopathic bilateral developmental dysplasia of the hip undergoing surgical management of both hips.

Exclusion criteria—children with bilateral DDH in whom one hip is managed through harness treatment alone, children with teratologic bilateral developmental dysplasia of the hip, children undergoing revision surgery and surgery for acetabular dysplasia in adolescence.

Intervention

Medial approach open reduction of the hip (MAOR).

Comparator

Anterior approach open reduction of the hip (AOR).

Outcomes

Rate and severity of avascular necrosis of the femoral head at the latest follow-up using Kalamchi and MacEwen [ 30 ] or Bucholz and Ogden classification [ 31 ] or other appropriate scoring system.

Radiological outcome at the latest follow-up using acetabular index measured in degrees, Severin Score [ 32 ] or other appropriate scoring system.

Clinical outcomes at the latest follow-up including Modified McKay criteria [ 33 ], Children’s Hospital of Oakland Hip Evaluation Scale [ 34 ], Pediatric Outcomes Data Collection Instrument (PODCI) [ 35 ] or other appropriate scoring system.

Prevalence, event rate or time-to-event surgical complications assessed according to the Clavien-Dindo system [ 36 , 37 ] or other appropriate scoring system.

Prevalence, event rate or time to event of secondary surgery.

Study design

Inclusion criteria—clinical studies, level IV (retrospective case series) and above, with a clear description of the operative management with a set of clinical and/or radiological outcomes included, published in English.

Exclusion criteria—case reports, technical or cadaveric studies, studies without a clear description of the operative management or where this is unobtainable, studies without a clear description of clinical and/or radiological outcomes or where this is unobtainable. Full-text studies not available in English will be excluded.

Search strategy

The electronic medical and scientific databases PubMed, MEDLINE, the Cochrane Library, Embase, Google Scholar, Web of Science and Scopus will be searched from the date of first entry until the date of search. The grey and difficult-to-locate literature (including theses and dissertations) will be searched via the Open Grey [ 38 ] and Open Access Theses and Dissertations [ 39 ] databases. The Medical Subject Headings (MeSH terms) “developmental dysplasia of the hip”, “congenital dysplasia of the hip”, “congenital hip dislocation”, “developmental hip dislocation”, and their abbreviations, “DDH” and “CDH” will be used, along with the qualifier “bilateral”. The search strategy will be developed in MEDLINE and then applied to the other databases. An example of the search strategy can be found in Additional file 1. Only full-text studies published in English will be included. No time limit will be imposed.
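As a rough illustration of how the terms listed above might be combined, the sketch below ORs the condition terms together and ANDs the result with the qualifier. The helper name `build_query` and the exact Boolean layout are assumptions made for illustration; the authors' actual strategy is the one in Additional file 1, which is not reproduced here, and real database syntax (field tags, MeSH explosion) will differ.

```python
# Hypothetical sketch: not the authors' actual search strategy.
CONDITION_TERMS = [
    "developmental dysplasia of the hip",
    "congenital dysplasia of the hip",
    "congenital hip dislocation",
    "developmental hip dislocation",
    "DDH",
    "CDH",
]
QUALIFIER = "bilateral"


def build_query(terms, qualifier):
    """OR the quoted condition terms together, then AND them with the qualifier."""
    quoted = " OR ".join(f'"{term}"' for term in terms)
    return f'({quoted}) AND "{qualifier}"'


query = build_query(CONDITION_TERMS, QUALIFIER)
print(query)
```

Keeping the terms in a list makes it straightforward to adapt the same strategy to each database once it has been developed in one of them.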

Study selection

Two reviewers (EJ and GC) will independently screen the title and abstract of records for inclusion according to the eligibility criteria. Once preliminary screening has been performed, selected studies will be screened as full text. Researchers will be blinded to each other’s decisions. Where there is disagreement a separate reviewer (CEB) will arbitrate. Screening decisions at the full-text stage will be fully recorded. The results of the screening will be presented in a PRISMA flow diagram [ 29 ].

Data management

The selected studies will be collated in the Zotero citation management system, screened for duplicates, and exported to Systematic Review Data Repository-Plus [ 40 ]. This database will be used to aid data extraction and management. Extracted data will be exported to RevMan software for analysis.

Data extraction

Data will be extracted in a predefined electronic data extraction form. Data on study design, population characteristics, details of operative intervention (intervention and comparison), and outcomes (clinical, radiological, complications and rate of secondary surgery) will be extracted. A summary of intended data items for extraction is shown in Table 1. Four reviewers (EJ, GC, MJC and AB) will be allocated the selected studies and will independently extract data. Each reviewer will be blinded to the other reviewers’ data extraction. Where possible, corresponding authors will be contacted for unreported data. Data will be extracted to a secured anonymised form on Systematic Review Data Repository-Plus and then exported to RevMan for analysis.

Data synthesis

The extracted data will be summarised in a structured table format, grouped and ordered by study design (according to the hierarchy of evidence) or by risk of bias if study designs are similar, and including the data items specific to the outcomes of interest. This will help to assess clinical and methodological heterogeneity across the studies and determine the feasibility of performing a meta-analysis. We do not expect the included studies to be of sufficient quality or consistency to allow a meta-analysis to be performed. In this instance, we will follow the Synthesis Without Meta-analysis (SWiM) guidance [ 41 ] and analyse data according to this and the recommendations in the Cochrane Handbook Chapter 12 [ 42 ]. Studies will be grouped and tabulated as described.

We expect that the key outcome data for radiological and clinical outcomes will be in short ordinal scales (e.g. Severin Score [ 32 ]). Where possible we will transform these data to dichotomous outcomes and present them as a relative risk with 95% confidence intervals for MAOR in comparison to AOR. Longer ordinal scales such as the Pediatric Outcomes Data Collection Instrument [ 35 ] will be transformed to continuous data. Complications and secondary surgery data will be transformed to an incidence estimate, event rate or time-to-event data. For non-comparative studies, we will transform extracted data as described above and use them to generate a crude estimate of incidence, prevalence or event rate. Where possible we will pool these data using a random effects model as per the recommendation in Murad et al. [ 43 ].

Results will be reported according to the guidance in the Cochrane Handbook Chapter 12 [ 42 ]. Where sufficient information is available but synthesis cannot be performed, a structured reporting of effects will be used. When effect estimates are available without measures of precision, an illustrated synthesis of summary statistics will be used. If P values are available, an illustrated synthesis of P values will be used. Where directions of effect are available, an illustrated synthesis using vote-counting based on direction of effect will be used.
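The two calculations named above, a relative risk with a 95% confidence interval and a random effects pooled estimate, can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis the authors will run in RevMan: the log-scale (Katz) interval and the DerSimonian-Laird estimator are common choices that the protocol does not specify, the function names are hypothetical, and the counts are invented purely to show the calls.

```python
import math


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of group A versus group B with a CI built on the log scale.

    Assumes non-zero event counts in both groups (no continuity correction).
    """
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR), Katz method.
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random effects pooling (assumed estimator).

    Returns (pooled estimate, standard error, between-study variance tau^2).
    """
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    star = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(star, estimates)) / sum(star)
    return pooled, math.sqrt(1 / sum(star)), tau2


# Invented counts for illustration only: poor outcomes in 10/40 MAOR hips
# versus 5/40 AOR hips.
rr, lower, upper = relative_risk(10, 40, 5, 40)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Pooling would normally be done on a transformed scale (for example log relative risks or logit proportions) and back-transformed for reporting.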

We aim to limit publication bias by a thorough and systematic search of the literature including the grey literature as described in the search strategy. Where possible publication bias will be assessed across studies by generation of funnel plots. These will be inspected for asymmetry and analysed via Egger’s test [ 44 ].
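Egger's test regresses each study's standardised effect (effect divided by its standard error) on its precision (one divided by the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch below implements that regression with ordinary least squares. The function name and the input values are hypothetical, and the p-value step (comparing the t statistic against a t distribution with n − 2 degrees of freedom) is left to a statistics package.

```python
import math


def egger_intercept(effects, std_errors):
    """Egger regression of standardised effect on precision.

    Returns (intercept, standard error of intercept, t statistic).
    Needs at least three studies.
    """
    x = [1 / se for se in std_errors]                      # precision
    y = [e / se for e, se in zip(effects, std_errors)]     # standardised effect
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - 2)
    se_intercept = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    # Guard against a perfect fit, where the standard error is zero.
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return intercept, se_intercept, t_stat


# Invented log relative risks and standard errors, for illustration only.
intercept, se_intercept, t_stat = egger_intercept(
    [0.42, 0.10, 0.65, 0.31, 0.80], [0.15, 0.10, 0.30, 0.20, 0.40]
)
print(intercept, se_intercept, t_stat)
```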

Risk of bias

Randomised trials will be assessed using the Cochrane Risk of Bias 2 (RoB 2) tool [ 45 ]. However, included studies are most likely to be non-randomised, observational studies. For comparative studies (cohort or case–control) we will use the ROBINS-I tool to assess risk of bias [ 46 ]. For case series, we will use Murad et al.’s method for evaluating the methodological quality across four domains: selection, ascertainment, causality and reporting [ 43 ]. Four reviewers (EJ, GC, MJC and AB) will assess included studies for risk of bias. A separate reviewer (CEB) will resolve disagreements through discussion. A summary figure of the risk of bias analysis will be included in the final manuscript.

Assessment of quality

For each outcome measure a summary of findings will be presented in a table [ 47 ] with the overall quality of the recommendation assessed using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach [ 48 ]. This approach uses five factors (risk of bias, inconsistency, indirectness, imprecision and publication bias) to assess the quality of evidence and produce a rating of “high”, “moderate”, “low” or “very low”. GRADEpro GDT software [ 49 ] will be used to aid decision-making when assessing the quality of evidence.
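The way the five factors combine into a rating can be sketched as a simple downgrading rule. The sketch assumes the usual GRADE convention, which the protocol does not spell out, that randomised evidence starts at “high” and observational evidence at “low”, with one level removed per serious concern; upgrading factors are omitted, and the names `grade_rating` and `DOMAINS` are hypothetical. In practice these are judgements made by reviewers (here with GRADEpro GDT), not the output of a formula.

```python
LEVELS = ["very low", "low", "moderate", "high"]
DOMAINS = (
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
)


def grade_rating(randomised, downgrades):
    """Return a GRADE-style rating from per-domain downgrade counts.

    `downgrades` maps a domain name to 0, 1 (serious) or 2 (very serious).
    Assumed starting point: "high" if randomised, otherwise "low".
    """
    unknown = set(downgrades) - set(DOMAINS)
    if unknown:
        raise ValueError(f"unknown domains: {sorted(unknown)}")
    start = 3 if randomised else 1
    total = sum(downgrades.get(domain, 0) for domain in DOMAINS)
    # The rating cannot fall below "very low".
    return LEVELS[max(0, start - total)]


# A body of case series with serious risk of bias and serious imprecision.
print(grade_rating(False, {"risk_of_bias": 1, "imprecision": 1}))
```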

Discussion and implications of review

Management of bilateral DDH represents a significant challenge for the paediatric orthopaedic surgeon. The aim of treatment is to achieve concentrically reduced hips, without significant deformity or residual dysplasia. The decision to perform MAOR or AOR in patients with bilateral DDH who have failed conservative management is not well informed by the current literature. High-quality, comparative studies are exceptionally challenging to perform for this patient population and likely to be extremely uncommon. A systematic review provides the best opportunity to deliver the highest possible quality of evidence for bilateral DDH surgical management. We are not aware of any systematic reviews that compare the outcomes of MAOR with AOR for bilateral DDH. This study aims to identify whether there are any significant differences in the clinical or radiological outcomes for patients with bilateral DDH surgically treated with MAOR compared to AOR so that surgeons can make better-informed decisions about the management strategy they will offer to patients.

Limitations

We expect that this review will be limited by studies that have a small sample size and have a retrospective, non-comparative study design. We expect result reporting to be heterogeneous and incomplete. These limitations will place all studies at a high risk of bias and therefore limit the quality of evidence that can be derived from the systematic review.

Availability of data and materials

Not applicable.

Abbreviations

  • DDH: Developmental dysplasia of the hip
  • MAOR: Medial approach open reduction
  • AOR: Anterior approach open reduction
  • CDH: Congenital dysplasia of the hip
  • AVN: Avascular necrosis
  • PRISMA-P: Preferred Reporting Items for Systematic Review and Meta-Analysis Protocol
  • PODCI: Pediatric Outcomes Data Collection Instrument
  • MeSH: Medical Subject Headings
  • SWiM: Synthesis Without Meta-analysis
  • RoB 2: Cochrane Risk of Bias 2
  • GRADE: Grading of Recommendations Assessment, Development and Evaluation approach

Zhang S, Doudoulakis KJ, Khurwal A, Sarraf KM. Developmental dysplasia of the hip. Br J Hosp Med Lond Engl 2005. 2020;81(7):1–8.

Sewell MD, Rosendahl K, Eastwood DM. Developmental dysplasia of the hip. BMJ. 2009;339:b4454.

Marks DS, Clegg J, Al-Chalabi AN. Routine ultrasound screening for neonatal hip instability. Can it abolish late-presenting congenital dislocation of the hip? J Bone Joint Surg Br. 1994;76(4):534–8.

Macnicol MF. Results of a 25-year screening programme for neonatal hip instability. J Bone Joint Surg Br. 1990;72(6):1057–60.

Bialik V, Bialik GM, Blazer S, Sujov P, Wiener F, Berant M. Developmental dysplasia of the hip: a new approach to incidence. Pediatrics. 1999;103(1):93–9.

Cooperman DR, Wallensten R, Stulberg SD. Acetabular dysplasia in the adult. Clin Orthop. 1983;175:79–85.

Furnes O, Lie SA, Espehaug B, Vollset SE, Engesaeter LB, Havelin LI. Hip disease and the prognosis of total hip replacements. A review of 53,698 primary total hip replacements reported to the Norwegian Arthroplasty Register 1987–99. J Bone Joint Surg Br. 2001;83(4):579–86.

Kitoh H, Kawasumi M, Ishiguro N. Predictive factors for unsuccessful treatment of developmental dysplasia of the hip by the Pavlik harness. J Pediatr Orthop. 2009;29(6):552–7.

Viere RG, Birch JG, Herring JA, Roach JW, Johnston CE. Use of the Pavlik harness in congenital dislocation of the hip. An analysis of failures of treatment. J Bone Joint Surg Am. 1990;72(2):238–44.

Greene WB, Drennan JC. A comparative study of bilateral versus unilateral congenital dislocation of the hip. Clin Orthop. 1982;162:78–86.

Zionts LE, MacEwen GD. Treatment of congenital dislocation of the hip in children between the ages of one and three years. J Bone Joint Surg Am. 1986;68(6):829–46.

Wang TM, Wu KW, Shih SF, Huang SC, Kuo KN. Outcomes of open reduction for developmental dysplasia of the hip: does bilateral dysplasia have a poorer outcome? J Bone Jt Surg Am. 2013;95(12):1081–6.

Segal LS, Boal DK, Borthwick L, Clark MW, Localio AR, Schwentker EP. Avascular necrosis after treatment of DDH: the protective influence of the ossific nucleus. J Pediatr Orthop. 1999;19(2):177–84.

Lerman JA, Emans JB, Millis MB, Share J, Zurakowski D, Kasser JR. Early failure of Pavlik harness treatment for developmental hip dysplasia: clinical and ultrasound predictors. J Pediatr Orthop. 2001;21(3):348–53.

Akilapa O. The medial approach open reduction for developmental dysplasia of the hip: do the long-term outcomes validate this approach? A systematic review of the literature. J Child Orthop. 2014;8(5):387–97.

Okano K, Yamada K, Takahashi K, Enomoto H, Osaki M, Shindo H. Long-term outcome of Ludloff’s medial approach for open reduction of developmental dislocation of the hip in relation to the age at operation. Int Orthop. 2009;33(5):1391–6.

Mankey MG, Arntz GT, Staheli LT. Open reduction through a medial approach for congenital dislocation of the hip. A critical review of the Ludloff approach in sixty-six hips. J Bone Joint Surg Am. 1993;75(9):1334–45.

Koizumi W, Moriya H, Tsuchiya K, Takeuchi T, Kamegaya M, Akita T. Ludloff’s medial approach for open reduction of congenital dislocation of the hip. A 20-year follow-up. J Bone Joint Surg Br. 1996;78(6):924–9.

Gardner ROE, Bradley CS, Howard A, Narayanan UG, Wedge JH, Kelley SP. The incidence of avascular necrosis and the radiographic outcome following medial open reduction in children with developmental dysplasia of the hip: a systematic review. Bone Jt J. 2014;96-B(2):279–86.

Hoellwarth JS, Kim YJ, Millis MB, Kasser JR, Zurakowski D, Matheney TH. Medial versus anterior open reduction for developmental hip dislocation in age-matched patients. J Pediatr Orthop. 2015;35(1):50–6.

Jia G, Wang E, Lian P, Liu T, Zhao S, Zhao Q. Anterior approach with mini-bikini incision in open reduction in infants with developmental dysplasia of the hip. J Orthop Surg. 2020;15(1):180.

Rudin D, Manestar M, Ullrich O, Erhardt J, Grob K. The anatomical course of the lateral femoral cutaneous nerve with special attention to the anterior approach to the hip joint. JBJS. 2016;98(7):561–7.

Herring JA, Tachdjian MO, Texas Scottish Rite Hospital for Children. Tachdjian’s Pediatric Orthopaedics. 4th ed. Philadelphia: Saunders/Elsevier; 2008. Available from: https://storage.googleapis.com/global-help-publications/books/help_tachdjiansv1c15.pdf . [cited 2023 Sep 25].

Subasi M, Arslan H, Cebesoy O, Buyukbebeci O, Kapukaya A. Outcome in unilateral or bilateral DDH treated with one-stage combined procedure. Clin Orthop. 2008;466(4):830–6.

Ezirmik N, Yildiz K. Advantages of single-stage surgical treatment with salter innominate osteotomy and pemberton pericapsular osteotomy for developmental dysplasia of both hips. J Int Med Res. 2012;40(2):748–55.

Agus H, Bozoglan M, Kalenderer Ö, Kazımoğlu C, Onvural B, Akan İ. How are outcomes affected by performing a one-stage combined procedure simultaneously in bilateral developmental hip dysplasia? Int Orthop. 2014;38(6):1219–24.

Kotzias Neto A, Ferraz A, Bayer Foresti F, Barreiros HR. Bilateral developmental dysplasia of the hip treated with open reduction and Salter osteotomy: analysis on the radiographic results. Rev Bras Ortop Engl Ed. 2014;49(4):350–8.

Rethlefsen ML, Kirtley S, Waffenschmidt S, Ayala AP, Moher D, Page MJ, et al. PRISMA-S: an extension to the PRISMA statement for reporting literature searches in systematic reviews. Syst Rev. 2021;10(1):39.

Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev. 2015;4(1):1.

Kalamchi A, MacEwen GD. Avascular necrosis following treatment of congenital dislocation of the hip. J Bone Joint Surg Am. 1980;62(6):876–88.

Patterns of ischemic necrosis of the proximal femur in nonoperatively treated congenital hip disease – ScienceOpen. Available from: https://www.scienceopen.com/document?vid=397821b3-1187-46ae-a255-9f9805906ecf . [cited 2022 Oct 12].

Severin E. Congenital dislocation of the hip; development of the joint after closed reduction. J Bone Joint Surg Am. 1950;32-A(3):507–18.

McKay DW. A comparison of the innominate and the pericapsular osteotomy in the treatment of congenital dislocation of the hip. Clin Orthop. 1974;98:124–32.

Aguilar CM, Neumayr LD, Eggleston BE, Earles AN, Robertson SM, Jergesen HE, et al. Clinical evaluation of avascular necrosis in patients with sickle cell disease: children’s hospital Oakland hip evaluation scale–a modification of the harris hip score. Arch Phys Med Rehabil. 2005;86(7):1369–75.

Daltroy LH, Liang MH, Fossel AH, Goldberg MJ. The POSNA pediatric musculoskeletal functional health questionnaire: report on reliability, validity, and sensitivity to change. Pediatric Outcomes Instrument Development Group. Pediatric Orthopaedic Society of North America. J Pediatr Orthop. 1998;18(5):561–71.

Dindo D, Demartines N, Clavien PA. Classification of surgical complications: a new proposal with evaluation in a cohort of 6336 patients and results of a survey. Ann Surg. 2004;240(2):205–13.

Dodwell ER, Pathy R, Widmann RF, Green DW, Scher DM, Blanco JS, et al. Reliability of the modified Clavien-Dindo-Sink complication classification system in pediatric orthopaedic surgery. JBJS Open Access. 2018;3(4):e0020.

OPENGREY.EU - Grey literature database. Available from: https://opengrey.eu/ . [cited 2022 Oct 13].

OATD – Open access theses and dissertations. Available from: https://oatd.org/ . [cited 2022 Oct 13].

SRDR+. Available from: https://srdrplus.ahrq.gov/ . [cited 2022 Oct 12].

Campbell M, McKenzie JE, Sowden A, Katikireddi SV, Brennan SE, Ellis S, et al. Synthesis without meta-analysis (SWiM) in systematic reviews: reporting guideline. BMJ. 2020;368:l6890.

Chapter 12: synthesizing and presenting findings using other methods. Available from: https://training.cochrane.org/handbook/current/chapter-12 . [cited 2022 Oct 12].

Murad MH, Sultan S, Haffar S, Bazerbachi F. Methodological quality and synthesis of case series and case reports. BMJ Evid Based Med. 2018;23(2):60–3.

Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ. 1997;315(7109):629–34.


Sterne JAC, Savović J, Page MJ, Elbers RG, Blencowe NS, Boutron I, et al. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ. 2019;366:l4898.

Sterne JA, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ. 2016;355:i4919.

Guyatt G, Oxman AD, Akl EA, Kunz R, Vist G, Brozek J, et al. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. J Clin Epidemiol. 2011;64(4):383–94.

Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ. 2008;336(7650):924–6.

Guideline Development Tool. Available from: https://gdt.gradepro.org/app/#projects . [cited 2022 Oct 13].


Acknowledgements

Author information, authors and affiliations.

Birmingham Children’s Hospital, Steelhouse Lane, Birmingham, B4 6NH, UK

Edward Alan Jenner, Govind Singh Chauhan & Christopher Edward Bache

Royal Orthopaedic Hospital, Bristol Road South, Birmingham, B31 2AP, UK

Abdus Burahee, Junaid Choudri & Adrian Gardner

University of Birmingham, College of Medical & Dental Sciences, Birmingham, UK

Abdus Burahee & Adrian Gardner


Contributions

EJ, GC, AB, JC, AG and CEB all made substantial contributions to the conceptualisation, design, background, drafting and editing of this protocol.

Corresponding author

Correspondence to Edward Alan Jenner.

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Search strategy example.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Jenner, E.A., Chauhan, G.S., Burahee, A. et al. Comparison of clinical and radiological outcomes for the anterior and medial approaches to open reduction in the treatment of bilateral developmental dysplasia of the hip: a systematic review protocol. Syst Rev 13, 72 (2024). https://doi.org/10.1186/s13643-023-02444-6


Received: 19 January 2023

Accepted: 21 December 2023

Published: 23 February 2024

DOI: https://doi.org/10.1186/s13643-023-02444-6


  • Congenital hip dislocation
  • Developmental hip dislocation

Systematic Reviews

ISSN: 2046-4053



  • Open access
  • Published: 22 February 2024

Barriers and facilitators for recruiting and retaining male participants into longitudinal health research: a systematic review

  • Danielle J. Borg 1 , 2 ,
  • Melina Haritopoulou-Sinanidou 3 ,
  • Pam Gabrovska 4 ,
  • Hsu-Wen Tseng 5 ,
  • David Honeyman 6 ,
  • Daniel Schweitzer 2 , 7 &
  • Kym M. Rae 2 , 4  

BMC Medical Research Methodology volume 24, Article number: 46 (2024)


Successfully recruiting male participants into healthcare-related studies is important for study completion and for advancing our clinical knowledge base. To date, most research has examined the barriers and facilitators for female participants in longitudinal healthcare-related studies, with limited information available about the needs of males in longitudinal research. This systematic review examines the unique barriers and facilitators to male recruitment across longitudinal healthcare-related research studies.

Following PRISMA guidelines, the MEDLINE, Embase, CINAHL and Web of Science databases were systematically searched from 1900 to 2023 using the terms recruitment and/or retention, facilitators and/or barriers, and longitudinal studies, for studies which contained separate data on males aged 17–59 years. Health studies or interventions were defined as longitudinal if they were greater than or equal to 12 weeks in duration with 3 separate data collection visits.

Twenty-four articles published between 1976 and 2023 met the criteria. One-third of the studies had a predominantly male sample and four studies recruited only male participants. Males appear uninterested in participating in health research; however, this lack of enthusiasm can be overcome by clear, non-directive communication and by studies that support the participants’ interests. Facilitating factors are diverse and may require substantial time from research teams.

Conclusions

Future research should focus on the specific impact of these factors across the spectrum of longitudinal health-related studies. Based on the findings of this systematic review, researchers from longitudinal health-related clinical trials are encouraged to consider male-specific recruitment strategies to ensure successful recruitment and retention in their studies.

Registration

This systematic review is registered with the PROSPERO database (CRD42021254696).


Recruiting participants into clinical research, and keeping them engaged, remains a challenge for researchers [ 1 ]. This is particularly true for participants who identify as men, likely due to the social roles and norms associated with gender in society [ 2 ]. Recruitment is time-consuming and expensive, and the involvement and retention of male participants in longitudinal healthcare studies can be enormously demanding. It is likely that social constructs related to men, such as cultural perceptions and health-seeking behaviour [ 3 ], have contributed to the challenges of male participant recruitment in healthcare-related research. However, these specific barriers have not been systematically investigated in previous clinical research studies. Previous studies have explored the attitudes, beliefs and knowledge of the public towards research and research participation, focusing on clinical trials [ 4 ]. The public’s willingness to participate may be informed by a favourable attitude towards researchers, comprehension of the trial rationale, or the specific clinical circumstances (e.g., having a non-fatal disease with no known cure, being healthy, or being critically ill with a limited chance of survival) [ 4 ]. Research findings often require considerable time to transition into clinical practice, and it is essential to educate the public about this process to encourage their participation in studies, ultimately advancing the progress of research. Given the time lag between healthcare research findings and their implementation, an important component of participation is enhancing public understanding of healthcare-related research studies [ 1 , 4 , 5 ].

A range of population groups can be particularly challenging to recruit into longitudinal health studies, including disadvantaged, minority and vulnerable members of the community. While others have systematically reviewed the recruitment and retention of participants in health studies related to conditions including cancer, dementia, and HIV, as well as studies involving vulnerable populations [ 6 , 7 , 8 ], less is known about recruitment and retention of male participants in longitudinal health-related studies. This highlights the need to address the recruitment of males across a broad spectrum of healthcare-related research studies.

Although several healthcare-related studies have examined recruitment of male participants across diverse population groups, there is limited research identifying barriers and facilitators associated with overall male recruitment into healthcare-related studies. Notably, there are male-specific clinical changes across healthcare that can influence interest in related research [ 9 ]. Life expectancy is lower in males [ 3 ], especially those aged over 50 years, who often experience a greater disease burden [ 10 ]. Although previous studies demonstrate that men are disengaged from healthcare services, it is now recognised that males engage willingly and effectively with healthcare services that recognise their preferences [ 10 , 11 , 12 ]. Previous literature has investigated methods of improving male recruitment to health behaviour research [ 13 ]. Indeed, sex was an important determinant of health-risk and health-promoting behaviours [ 14 ], with males being more likely to engage in high-risk behaviours including smoking, unhealthy eating, excess alcohol consumption, and physical inactivity [ 3 , 15 ]. Despite this, males remained less likely to seek medical and psychological help when needed [ 16 ] or to participate in health-promotion programs [ 17 ]. Maher et al. reported that males comprise only about 20% of health behaviour research participants in mixed-sex studies [ 18 ], contributing to a lack of evidence on how to increase the uptake of health-promoting behaviours by males [ 19 ]. These findings highlight the need for highly effective, male-specific methods to assist recruitment and retention in research studies in line with current best practice and guidelines.

Research on effective long-term methods for recruiting males into longitudinal healthcare research consistently demonstrates that strategies should be tailored for age, interests, and sex. To facilitate the effective recruitment of men into research, different recruitment methods for different age groups of either sex can be effective [ 11 , 20 ]. Younger males may be recruited through online social network platforms including Facebook [ 21 , 22 ], an approach that is less effective in elderly males [ 23 ], while elderly men are more likely to participate if referred to the study by their affiliated health service provider, media coverage or mass mailings [ 11 ]. Facebook, in particular, is more effective at recruiting participants than all other social media platforms combined [ 21 ]. As of October 2020, more males (57%) than females (43%) used Facebook globally [ 24 ]. Yet a recent systematic review of recruitment using Facebook found little evidence of its effectiveness in recruiting participants of either sex aged over 35 years [ 22 ], highlighting that social media strategies were ineffective in this age group. Tolmie et al. reported that the need for ongoing health monitoring was the most important recruitment and retention motivator for older participants, in addition to fostering positive relationships between staff and participants and communicating the study’s progress to recruits [ 25 ].

The difficulty of recruiting and retaining males in research studies can adversely affect statistical power and the generalisability of study findings, particularly in studies with a longitudinal design, which consequently affects the applicability of results to the male population [ 26 , 27 ]. While sex (male) and gender (men) constructs are important considerations in society and within health, to date, the terms male and men are often used interchangeably in health literature. For these reasons, this systematic review has reviewed published studies that consider male participation, recognising that the terms male and men most often refer to the biological construct of male sex. This review has focused on health research or health interventions that use a longitudinal study design. The main aim was to identify specific barriers and facilitators of male recruitment and retention in longitudinal health-related research studies. Findings have the potential to inform future development of patient-centred and evidence-based strategies to enhance recruitment into longitudinal health-related studies for men.

This systematic review protocol was registered with the PROSPERO database (University of York Centre for Reviews and Dissemination) (CRD42021254696) and complies with reporting guidelines from the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [ 28 ].

Study identification

Studies published in English were identified through systematic searching. No limit was applied to publication dates, in order to explore the full breadth of the literature surrounding the topic and to determine which strategies remain relevant today. The databases MEDLINE (Ovid, 1946 to present), Embase (Embase.com, 1947 to present), CINAHL (EBSCO, 1981 to present), and Web of Science (Clarivate Analytics, 1900 to present) were searched on 20 October 2020. The searches were updated on 21 October 2021 and 11 November 2023 to identify any additional publications from the 2020–2023 period. The MEDLINE search strategy was translated for the other databases using the Polyglot Search Translator [ 29 ]. Specific search terms used (see Supplementary File 1 ) included recruitment and/or retention, facilitators and/or barriers, and longitudinal studies. Here, longitudinal research was defined as a study with a minimum of three repeated study visits, or research data collections, over a time greater than or equal to 12 weeks. Search terms were combined within the search strings using the Boolean operators “AND” and “OR”. Where appropriate and possible, search terms were truncated (*) to retrieve multiple variations of a word.
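As a rough illustration of how such a search string is assembled, the sketch below joins the synonyms within each concept with OR and then joins the concepts with AND, using the truncation symbol (*). The terms shown are hypothetical placeholders chosen for illustration only; the search terms actually used are those listed in Supplementary File 1, and `build_query` is an assumed helper name rather than part of any database interface.

```python
def build_query(concepts):
    """Join synonyms with OR inside each concept, then join concepts with AND."""
    groups = []
    for terms in concepts:
        groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Hypothetical, illustrative terms only (not the published search strategy).
concepts = [
    ["recruit*", "retention", "retain*"],   # recruitment and/or retention
    ["barrier*", "facilitat*"],             # barriers and/or facilitators
    ["longitudinal", "follow-up"],          # longitudinal studies
]

query = build_query(concepts)
print(query)
```

Each database has its own field tags and truncation rules, so a string built this way would still need to be adapted per database, which is the step the authors report handling with the Polyglot Search Translator.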

All retrieved articles, excluding duplicates, were exported into Covidence [ 30 ] to facilitate the screening process. Identified studies were screened by two independent reviewers from the review team (DJB, MH-S, PG, HWT, DH, DS) to identify eligible studies. Studies were first assessed for inclusion based on screening of titles and abstracts, with conflicts reviewed by a third independent reviewer (KMR). Full-text papers were then retrieved and assessed by two team members according to the inclusion and exclusion criteria, with conflicts reviewed by a third independent reviewer (KMR). The reference lists and citing articles of included studies, relevant reviews and systematic reviews were hand-searched for further potential papers for inclusion.

Inclusion criteria

Types of participants.

This review includes any participant who identifies as male (biological sex). Where the included publication does not make it clear whether this is the biological definition of males or the gendered definition of men, the assumption has been made that it refers to those who are biologically male, and the study was therefore included. Male participants between the ages of 17 and 59 years were included. We chose 17 years of age as the lower age limit to encompass studies involving adult males that did not require parental consent. The upper limit of 59 years was set to exclude older populations, as previous research has covered participation of older populations extensively and we aimed to investigate factors influencing the involvement of younger individuals. Studies covering a broader age range were considered only if they provided age-specific data. Male parents who were consenting on behalf of their child into a longitudinal health study or intervention were included. Parents who consent on behalf of a child are often needed to take part in certain aspects of the study; however, only studies that specifically identified parental sex were included. Studies that included females or participants of indeterminate sex were reviewed; however, these studies were only included if recruitment and/or retention of males and men were reported separately. Likewise, studies of parents and children, or family studies, were included if they reported recruitment and/or retention of male parents separately.

Types of studies

Any cross-sectional, longitudinal, survey, experimental or program evaluation study, or study involving qualitative or mixed methods, was eligible if it intentionally (i.e., stated a priori) or incidentally (i.e., noted a posteriori) included detailed commentary or analysis on the recruitment and/or retention of male participants in a longitudinal health intervention or health research study. This commentary was required to offer informative data rather than a generalised statement about participant recruitment or retention.

Types of exposures/interventions

This review excluded studies focused on Alzheimer’s/dementia [ 31 ], cancer [ 32 ], HIV [ 33 ] and illegal drugs [ 34 ] due to the wealth of existing systematic review literature. Studies focusing on fathers with young children in early childhood health intervention research were excluded due to a recent systematic review [ 35 ].

Any other health research program or health intervention was included, provided enrolled male participants had data collected on a minimum of three separate occasions over a period of greater than or equal to 12 weeks. Longitudinal studies that were less than 12 weeks in duration or had fewer than 3 study visits or data collections were excluded. A health intervention was defined as any study aimed at improving specific health behaviours or outcomes.
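The longitudinal criterion above can be written as a simple screening rule. The sketch below is only an illustration of that rule: the `StudyDesign` class, its field names and the `is_longitudinal` function are hypothetical and were not part of the review's screening workflow, which was carried out by reviewers in Covidence.

```python
from dataclasses import dataclass

# Thresholds taken from the stated criterion.
MIN_DATA_COLLECTIONS = 3   # at least three separate data collection occasions
MIN_DURATION_WEEKS = 12    # duration greater than or equal to 12 weeks


@dataclass
class StudyDesign:
    # Hypothetical record of the two design features needed for the rule.
    data_collections: int
    duration_weeks: float


def is_longitudinal(study: StudyDesign) -> bool:
    """Return True only if both the visit and the duration thresholds are met."""
    return (
        study.data_collections >= MIN_DATA_COLLECTIONS
        and study.duration_weeks >= MIN_DURATION_WEEKS
    )


print(is_longitudinal(StudyDesign(data_collections=3, duration_weeks=16)))  # meets both thresholds
print(is_longitudinal(StudyDesign(data_collections=2, duration_weeks=52)))  # too few data collections
print(is_longitudinal(StudyDesign(data_collections=5, duration_weeks=8)))   # too short
```

Both conditions must hold, which mirrors the exclusion of studies that were either shorter than 12 weeks or had fewer than three data collections.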

Types of outcome measures

Studies were included if they identified specific strategies for recruiting and retaining male participants into longitudinal research and longitudinal clinical practice and if findings were analysed, reported, or discussed separately.

Exclusion criteria

Study population.

Studies on recruitment, retention, barriers and facilitators of vulnerable populations, males < 16 years and males > 60 years of age were excluded. Vulnerable male populations were defined as socioeconomically disadvantaged populations or racial and ethnic minorities (including Indigenous and First Nations people). Due to the cultural, economic and other differences that a review of these communities would likely identify, it was deemed to be appropriate for these populations to be reviewed separately in the future.

Study topic

Conference abstracts, review or systematic review papers, incomplete studies including study protocols, and grey literature were excluded from this review. Articles were excluded at any time in the screening process if they did not (1) examine views or include discussions that considered retention, barriers, or facilitators for research/interventions; (2) include male-specific data (i.e., only discussed female participants); (3) determine the participant sex; or (4) focus on participant recruitment or retention as part of the health research/intervention.

Data extraction

Data extraction was performed by three members of the review team (DJB, PG, MH-S) and reported narratively. Extraction was cross-checked for accuracy and consistency by one other reviewer (either DS or KMR). The following information was extracted from each included study: publication information, study aim, methods (i.e., participants, procedures, demographics), recruitment of male participants, retention of male participants, timing of data collection, and types of data collected from male participants. Reported barriers and facilitators to male recruitment and retention were also extracted.

Quality assessment

Quality assessment is usually undertaken in systematic reviews that assess the individual results of a group of specific studies. As this review assesses barriers and facilitators to recruitment and/or retention rather than study results, a quality check of the included studies was not considered necessary.

Database searching and forward and backward citation checking yielded 16,457 and 13 papers, respectively (16,470 in total). After removal of 6,108 duplicates, 10,362 articles were available for screening (Fig. 1 ). Of these, 9,214 studies did not meet the inclusion criteria based on title and abstract screening, leaving 1,148 full-text studies for further screening (Fig. 1 ). A total of 1,124 studies were then excluded: 255 had no male-specific data, 166 were conference abstracts, 115 were HIV-related research, 106 were cancer-related research, 78 included no data on barriers or facilitators, 71 focused on males > 60 years, 69 focused on racial or ethnic minorities, 52 were unrelated to health recruitment and retention, 48 were Alzheimer’s or dementia research, 39 related to illegal drugs, 29 had fewer than 3 study visits, 24 involved males < 16 years of age, 22 were systematic review/review papers, 19 focused on socioeconomically disadvantaged populations, 14 were uncompleted studies or study protocols, 13 were < 12 weeks in duration, and 4 focused on fathers in early childhood interventions (Fig. 1 ).
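The counts reported in this paragraph can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only the figures stated in the text; the variable names and the shortened exclusion labels are illustrative.

```python
# Records identified: database searching plus citation checking.
identified = 16_457 + 13
duplicates = 6_108
screened = identified - duplicates

excluded_title_abstract = 9_214
full_text = screened - excluded_title_abstract

# Full-text exclusions by reason, as reported in the text.
full_text_exclusions = {
    "no male-specific data": 255,
    "conference abstracts": 166,
    "HIV-related research": 115,
    "cancer-related research": 106,
    "no data on barriers or facilitators": 78,
    "males > 60 years": 71,
    "racial or ethnic minority": 69,
    "unrelated to health recruitment and retention": 52,
    "Alzheimer's or dementia research": 48,
    "illegal drugs": 39,
    "fewer than 3 study visits": 29,
    "males < 16 years": 24,
    "systematic review/review papers": 22,
    "socioeconomically disadvantaged populations": 19,
    "uncompleted studies/study protocols": 14,
    "< 12 weeks duration": 13,
    "fathers in early childhood interventions": 4,
}
excluded_full_text = sum(full_text_exclusions.values())
included = full_text - excluded_full_text

print(identified, screened, full_text, excluded_full_text, included)
# prints: 16470 10362 1148 1124 24
```

The itemised reasons sum to the reported 1,124 full-text exclusions, leaving the 24 included articles.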

Fig. 1 PRISMA diagram depicting the search, screening, eligibility and inclusion results

A total of 24 articles remained, and their data were extracted and included in this review. The oldest of these studies was published in 1976 [ 36 ] and the most recent in 2023 [ 37 , 38 ]. All of the included studies were conducted in Western countries except Cheraghi et al., which was based in the Middle East [ 39 ], and Schilling et al., which was based in India [ 37 ]. Two were located in the United Kingdom [ 40 , 41 ], two in France [ 42 , 43 ], one in Finland [ 44 ], one in Sweden [ 45 ], one in The Netherlands [ 46 ], one across combined European nations [ 29 ], one in Germany [ 47 ], ten in North America [ 36 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 ] and three in Australia [ 38 , 57 , 58 ]. The included studies are described in Table 1 . Participant characteristics varied with study focus and included participants with specific health conditions, such as being overweight [ 41 , 57 ], having an occupational injury [ 40 , 41 ], having visited a sexually transmitted infection clinic [ 46 ], being treated for a psychological disorder [ 44 , 50 , 53 ], or having COVID-related issues [ 37 , 54 ], as well as habits such as alcohol abuse and smoking [ 56 ]. Some studies recruited participants from specific subgroups, including veterans [ 36 ], workers of an electricity company [ 42 ] and people who had attended a spouse abuse abatement program [ 50 ]. All twenty-four studies met the inclusion criteria for age. One of the studies was a family cohort study that recruited families of children with cystic fibrosis and congenital heart disease and required participation of both parents [ 51 ].

Of the included studies, 20 had male and female participants [ 37 , 38 , 39 , 40 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 ], with a number of these studies having a predominantly male sample [ 42 , 52 , 53 , 58 ]. Four studies recruited only male participants [ 36 , 41 , 50 , 61 ] (Table 2 ). The included mixed-sex studies either described male and female characteristics separately or clearly stated that there were no significant differences in recruitment and retention based on sex. All included studies used a minimum of three study visits or data collections; the maximum number of study visits or data collections was 95 [ 41 ], and one study had up to 300 interactions with participants [ 44 ]. The minimum study length of included studies was 16 weeks [ 50 ] and the maximum study duration was 43 years [ 45 ]. All included studies collected demographic data [ 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 61 ].

Recruitment

Overall, all studies provided information on recruitment rates and 19 provided information on retention rates [ 36 , 38 , 39 , 40 , 41 , 43 , 44 , 45 , 46 , 47 , 49 , 50 , 51 , 52 , 55 , 56 , 57 , 58 , 61 ] (Table 2 ). Methods for male participant recruitment included advertising [ 36 , 43 , 54 , 57 ], letters of invitation [ 39 , 40 , 41 , 42 , 43 , 47 , 52 , 56 , 57 , 58 , 61 ], selection of participants from larger cohorts [ 42 , 43 , 50 , 53 ], and recruitment from hospitals or registers [ 37 , 44 , 48 , 51 , 54 , 57 ] (Table 3 ). The most common method was sending letters of invitation, which was used in 11 of the 24 studies and yielded recruitment rates between 4.4% and 79.3% [ 47 , 52 ]. Irvine et al. recruited participants through letters of invitation and time-space sampling, and reported that time-space sampling was difficult and time-consuming and yielded only one participant per 11 field visits [ 23 ]. Snow et al. used multiple methods for recruitment, including recruitment from work sites and public sites, mass mailing, telephone, media, and referral methods, and reported that mass mailing was the best of these [ 55 ]. Rose et al. attributed their high recruitment rates to advertising: people who agreed to participate had done so voluntarily and were therefore more likely to be interested in the study and in health interventions in general [ 36 ]. To maximise male participation, van Wees et al. adapted their recruitment methods to target male participants, using flyers or personalised invitations to raise awareness and a greater sense of responsibility in terms of male health [ 46 ].

A variety of factors that interfered with male participation in longitudinal research were identified and are shown in Table 4 . Some of these were situational and included participant death or relocation [ 36 , 39 , 42 , 45 , 48 , 51 , 53 , 55 , 57 , 61 ], while other barriers included time commitment [ 40 , 58 ], reluctance to undergo medical testing [ 58 ], or the belief that the study was an invasion of privacy [ 58 ]. A large number of studies reported that men did not attend study visits [ 40 , 58 ] or were not interested in the study or could not be bothered to participate [ 36 , 40 , 41 , 44 , 48 , 49 , 51 , 58 , 61 ], or that study staff received no response to invitations [ 40 , 41 , 61 ].

Facilitators

Many studies employed a variety of strategies to increase participation for males (Table  5 ). These varied from offering free medical screening [ 36 ], reminders for appointments [ 40 , 42 , 46 , 48 , 51 , 52 , 56 , 57 , 58 , 61 ], or enrolment of wives [ 36 ] or family members [ 39 , 51 ] to assist in retention. Several studies used a range of strategies, particularly [ 43 , 56 , 57 ], with varying degrees of success.

We have undertaken a thorough assessment of longitudinal studies to determine strategies that facilitate recruitment and retention of male participants. Retention of male participants was particularly impressive in two studies, at 85.5% (from n = 69 recruited) [ 41 ] and 88.9% (from n = 2,280 recruited) [ 36 ] over a period of 5 months and 12 years, respectively. Irvine et al. undertook a trial to reduce alcohol consumption in order to reduce obesity and relied upon the perceived health benefits for their participants [ 41 ], while Rose et al. studied ageing in Veteran participants. Rose et al. retained participants over 12 years through a diverse range of approaches, including participant newsletters, short study visits, free medical screening, income supplementation, encouraging participant perceptions of being part of the ‘health elite’, and recruiting wives to assist with retention [ 36 ]. This study began in 1976, when there was a greater inclination by the public to follow suggestions, which was also particularly true of their target Veteran population [ 36 ]. Like Rose et al. [ 36 ], the Irvine et al. study team maintained regular contact, ensured convenient timing and location of study visits, and continued to highlight the perceived health benefits of the research project. The Irvine et al. research team spent additional time ensuring that their staff conversations and project literature were relaxed, friendly and non-directive in approach [ 41 ]. While Rose et al. provided reimbursement to employers for study attendance, neither of these top two studies [ 41 ] used a direct financial or gift incentive to participants but rather relied upon excellent communication strategies.

Several studies highlight that males of different ages are retained at different rates. For example, Cheraghi et al. retained 67% of males aged 40–59 years but only 55% of those aged > 60 years in the same study [ 39 ]. In the male-only studies, Hamberger et al. [ 50 ] and Lee et al. [ 61 ] showed varied retention rates based upon the age of the participants. Male-only studies have shown that diverse approaches can be successful in recruitment and retention. Communication that is non-directive in style, clear and delivered by supportive staff was important for Irvine et al. [ 41 ]. Continuing to maintain contact with male participants was important and included contact through family or a spouse [ 36 ], Christmas and birthday cards [ 61 ], and study-related newsletters [ 36 ]. Male-only studies have highlighted that where male participants have a vested interest, for example, weight loss or a desire for health education, these interests can be important drivers for recruitment [ 41 ].

Barriers to recruitment and retention of males

Barriers varied and were related to an inability of participants to participate due to lack of understanding of study objectives [ 58 ], language barriers [ 58 ] or lack of access to the internet for studies being conducted online [ 53 ]. Table 4 highlights reasons given for refusal to participate and reasons for non-completion of a study.

In intervention studies focused on lifestyle changes, barriers to participation included inability to adhere to the study activities [ 57 ] or lack of motivation to engage with new technologies [ 44 ]. In one study where participants had to make dietary changes and frequently visit the research centre, participants expressed frustration in trying to implement study content due to associated financial costs involving more expensive food, transportation or computer access [ 53 ]. Participant feedback indicated that transportation and free meal options would have been more enticing [ 44 ]. Travelling to the study centre was found to be a barrier in two studies [ 53 ], and in another study participants expressed a preference for the study to be online [ 53 ]. Another barrier reported in two studies was difficulty in arranging a follow-up session [ 53 ].

The major causes of refusal or dropout were time commitment issues and lack of interest in the study [ 40 , 44 , 51 , 58 ]. Time commitment issues related to having to make frequent visits to the study centre [ 51 , 57 , 58 ] and to lifestyle changes that required more time, such as exercising or taking time to cook meals [ 53 ]. Health issues also played a role in participant attrition: participants with health issues [ 42 , 51 , 53 , 57 , 58 ] and psychological issues [ 50 ] were most likely to refuse participation or drop out.

Different demographic characteristics were associated with refusal to join a study or a particular data collection point. These included low socioeconomic status [ 40 , 42 ], younger age [ 40 , 42 , 43 , 44 , 52 ], older age [ 40 , 61 ], poor lifestyle factors [ 42 ] and being unmarried [ 43 , 45 ]. In a birth cohort study, non-participation was linked to fathers being born outside the country where the study resided or having lower education [ 51 ].

Interestingly, Ullman et al. explored factors related to types of study responders (non-responders, reluctant responders and responders) in an ongoing longitudinal study. Findings demonstrated that males who considered themselves more attractive or having better relationships with others were more likely to respond, while those who felt worse about their own sense of self required more incentives and reminders in order to take part in the study [ 56 ].

Facilitators to recruitment and retention of male participants

While many of the facilitators listed in Table 5 would be suitable for either sex, using a male-centric approach would likely prove particularly useful. Study advertising on mainstream media or medical press was used as a method to establish study credibility [ 42 , 57 ]. Several studies maintained contact with their participants throughout the study [ 42 , 57 ]. In one study, a yearly letter, written by the principal investigator, was sent to participants [ 42 ]; other studies sent out a study newsletter [ 36 ] or monthly emails with health and nutritional scientific information [ 43 ]. Two studies sent an annual holiday letter [ 51 ] and in another, participants received birthday and Christmas cards [ 61 ]. These methods were thought to pique participant interest and motivate them to participate in study activities. Interestingly, Griffith Fillipo et al. used humour with participants, sending humorous GIFs following study visits, and found these to be a facilitating factor [ 49 ].

Other motivational techniques for participation included increasing study accessibility and minimising interference with participants' day-to-day activities. For example, one of the studies was designed to ensure that examinations only took a couple of hours [ 36 ]. Another study was designed to be exclusively online, which was a determining factor for participation in 46.45% of the sample [ 43 ]. In a trial where participants had to answer SMS messages, participants were able to choose the amount, timing, and frequency of texts they received, with the ability to change these options throughout the study course [ 44 ]. Two studies planned with employers to pay participants regular wages or give leave without penalty while they participated in the study [ 43 ]. Finally, one study reported that additional interventions were implemented for people who struggled to adhere to the required activities [ 55 ].

Incentives were successful in encouraging study participation. Six studies gave monetary incentives [ 46 , 47 , 48 , 49 , 56 , 57 ]. Other studies gave participants small gifts such as membership cards, certificates of completion, pens, tee shirts, mugs, etc. [ 56 , 57 ]. In an intervention study where participants had to consume specific products, these products were provided free to participants [ 57 ]. A few health interventions offered free medical screenings [ 43 , 46 , 57 ]. Participants in the Rose et al. study were notified of the outcome of their medical examinations and alerted if anything was abnormal, which in some cases prevented life-threatening issues [ 36 ]. To minimise attrition, participant reminders to complete questionnaires or arrange appointments were used in multiple studies [ 40 , 42 , 46 , 52 , 56 , 57 , 61 ]. One study found that when participants were contacted to assess reasons for refusal this prompted some to change their minds and participate in the study [ 40 , 42 , 46 , 52 , 56 , 57 , 61 ].

One aspect that was associated with male participant retention was the perceived health benefit gained from participating in the study [ 36 , 41 , 57 ]. One study specifically highlighted that its participants expressed satisfaction at being part of a “health elite”, which was associated with high retention rates [ 36 ]. Another beneficial factor was the idea that their involvement in the study aided research in the field of nutrition (22.24%) and advanced public health (61.37%) [ 43 ]. More recently this has been shown to be true during the COVID-19 pandemic, where males participated at high rates [ 36 , 41 , 57 ]. Méjean et al. reported that 67.02% of participants expressed satisfaction that the study was funded exclusively by public sources, which was perceived as unbiased, and this was particularly well received by male participants [ 43 ]. Finally, in an attempt to motivate male participants, Rose et al. enlisted participants’ wives in the study, which was found to have positive outcomes in retention [ 36 ].

The greatest challenge for data extraction for this review was the way in which authors report these figures in their studies. Many studies report on overall recruitment, retention and barriers but few studies clearly incorporated sex-specific findings. Interests and drivers for behaviour differ between sexes and therefore it is important that research projects report separate male and female specific findings. The most beneficial studies reviewed gave recruitment success with each approach; for example, Crichton et al. [ 57 ] highlighted the number of participants recruited through each strategy, including advertising via TV or newspaper, letter of invitation, from the hospital, or a notice in the library. The Ullman et al. [ 56 ] study was also clear in highlighting how many approaches were needed to have data returned, for example, immediately, after one reminder, or after multiple reminders and a financial incentive. Likewise, studies that reported when or how they noticed attrition in their research were incredibly valuable [ 42 , 44 , 46 , 51 , 57 , 61 ].

Strengths and limitations

The strengths of this systematic review lie in its comprehensive compilation of research data from the past 47 years of male recruitment and retention in longitudinal research studies which has historically posed many issues to researchers [ 5 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 ]. To the authors’ knowledge at the time of print, there is no other systematic review available on the barriers and facilitators of the recruitment and retention of males in longitudinal research. It has been evident that the barriers and facilitators are not unique to a specific study aim but have been experienced across the diverse range of studies. This systematic review offers a comprehensive list of strategies which have worked with particular populations and strategies which have failed for researchers looking to improve their male recruitment and retention rates and has a particular focus on longitudinal research studies. Primarily it has highlighted that multiple facilitators will be needed when designing longitudinal research inclusive of males, as the barriers to participation are diverse. The most challenging barrier to overcome is how to develop enthusiasm and urgency from men towards health research. Regardless of the purpose of the underlying study, the barriers and facilitators for male participants are relatively consistent.

The exclusion of several population groups limits this study; however, it was felt that each of these required a detailed separate systematic review to ensure that the unique barriers and facilitators for the recruitment and retention of these communities are clearly articulated. A further limitation is that for papers to be included in this systematic review, they had to specifically mention an issue that detailed barriers/facilitators to recruitment/retention in the title/abstract rather than stating “we recruited” in the full text. Therefore, we acknowledge that some publications that focused on longitudinal studies involving male participants may have been missed. In conclusion, this systematic review offers an in-depth look into the barriers and facilitators of the recruitment and retention strategies for males aged 17–59 years old for the past 47 years. It highlights that research teams will need to expend considerable time, expense and diverse approaches to successfully engage and retain male participants in longitudinal studies.

Availability of data and materials

All data generated or analysed during this systematic review are included in this published article.

Trauth JM, et al. Public attitudes regarding willingness to participate in medical research studies. J Health Soc Policy. 2000;12(2):23–43.

Barr E, et al. Gender as a social and structural variable: research perspectives from the National Institutes of Health (NIH). Transl Behav Med. 2024;14(1):13–22. https://doi.org/10.1093/tbm/ibad014 .

Pirkis J, et al. Cohort profile: ten to men (the Australian longitudinal study on male health). Int J Epidemiol. 2017;46(3):793–794i.

Burns KE, et al. Attitudes and views of the general public towards research participation. Intern Med J. 2013;43(5):531–40.

Mishra GD, et al. Recruitment via the Internet and social networking sites: the 1989–1995 cohort of the Australian longitudinal study on women’s health. J Med Internet Res. 2014;16(12):e279.

Bass SB, et al. Exploring the engagement of racial and ethnic minorities in HIV treatment and vaccine clinical trials: a scoping review of literature and implications for future research. AIDS Patient Care STDS. 2020;34(9):399–416.

Gilmore-Bykovskyi AL, et al. Recruitment and retention of underrepresented populations in Alzheimer’s disease research: a systematic review. Alzheimers Dement (N Y). 2019;5:751–70.

Todd A, et al. Age specific recruitment and retention to a large multicentre observational breast cancer trial in older women: the age gap trial. J Geriatr Oncol. 2021;12(5):714–23.

Chaudhari N, et al. Recruitment and retention of the participants in clinical trials: challenges and solutions. Perspect Clin Res. 2020;11(2):64–9.

Macdonald JJ. Shifting paradigms: a social-determinants approach to solving problems in men’s health policy and practice. Med J Aust. 2006;185(8):456–8.

Bracken K, et al. Recruitment strategies in randomised controlled trials of men aged 50 years and older: a systematic review. BMJ Open. 2019;9(4):e025580.

Smith JA, Robertson S. Men’s health promotion: a new frontier in Australia and the UK? Health Promot Int. 2008;23(3):283–9.

Ryan J, et al. It’s not raining men: a mixed-methods study investigating methods of improving male recruitment to health behaviour research. BMC Public Health. 2019;19(1):814.

Mahalik JR, Burns SM, Syzdek M. Masculinity and perceived normative health behaviors as predictors of men’s health behaviors. Soc Sci Med. 2007;64(11):2201–9.

Noble N, et al. Which modifiable health risk behaviours are related? A systematic review of the clustering of Smoking, Nutrition, Alcohol and Physical activity (‘SNAP’) health risk factors. Prev Med. 2015;81:16–41.

Yousaf O, Grunfeld EA, Hunter MS. A systematic review of the factors associated with delays in medical and psychological help-seeking among men. Health Psychol Rev. 2015;9(2):264–76.

Rongen A, et al. Workplace health promotion: a meta-analysis of effectiveness. Am J Prev Med. 2013;44(4):406–15.

Maher CA, et al. Are health behavior change interventions that use online social networks effective? A systematic review. J Med Internet Res. 2014;16(2):e40.

Robertson LM, et al. What works with men? A systematic review of health promoting interventions targeting men. BMC Health Serv Res. 2008;8:141.

Leuteritz K, et al. Recruiting young adult cancer patients: experiences and sample characteristics from a 12-month longitudinal study. Eur J Oncol Nurs. 2018;36:26–31.

Topolovec-Vranic J, Natarajan K. The use of social media in recruitment for medical research studies: a scoping review. J Med Internet Res. 2016;18(11):e286.

Whitaker C, Stevelink S, Fear N. The use of Facebook in recruiting participants for health research purposes: a systematic review. J Med Internet Res. 2017;19(8):e290.

Peel R, et al. Evaluating recruitment strategies for AUSPICE, a large Australian community-based randomised controlled trial. Med J Aust. 2019;210(9):409–15.

Tankovska H. Facebook: distribution of global audiences 2020, by gender. 2021.

Tolmie EP, et al. Understanding why older people participate in clinical trials: the experience of the Scottish PROSPER participants. Age Ageing. 2004;33(4):374–8.

Page SJ, Persch AC. Recruitment, retention, and blinding in clinical trials. Am J Occup Ther. 2013;67(2):154–61.

Roberts J, Waddy S, Kaufmann P. Recruitment and retention monitoring: facilitating the mission of the National Institute of Neurological Disorders and Stroke (NINDS). J Vasc Interv Neurol. 2012;5(suppl):14–9.

Page MJ, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.

Clark JM, et al. Improving the translation of search strategies using the polyglot search translator: a randomized controlled trial. J Med Libr Assoc. 2020;108(2):195–207.

Covidence systematic review software. Veritas Health Innovation; Melbourne; 2021. Available at https://www.covidence.org .

Cooper C, Ketley D, Livingston G. Systematic review and meta-analysis to estimate potential recruitment to dementia intervention studies. Int J Geriatr Psychiatry. 2014;29(5):515–25.

Forbes CC, et al. A systematic review of the feasibility, acceptability, and efficacy of online supportive care interventions targeting men with a history of prostate cancer. J Cancer Surviv. 2019;13(1):75–96.

Johnston RE, Heitzeg MM. Sex, age, race and intervention type in clinical studies of HIV cure: a systematic review. AIDS Res Hum Retroviruses. 2015;31(1):85–97.

Lewer D, et al. Frequency of health-care utilization by adults who use illicit drugs: a systematic review and meta-analysis. Addiction. 2020;115(6):1011–23.

Keys EM, et al. Recruitment and retention of fathers with young children in early childhood health intervention research: a systematic review and meta-analysis protocol. Syst Rev. 2019;8(1):300.

Rose CL, Bosse R, Szretter WT. The relationship of scientific objectives to population selection and attrition in longitudinal studies. The case of the normative aging study. Gerontologist. 1976;16(6):508–16.

Schilling J, et al. Development of a decentralized cohort for studying post-acute sequelae of COVID-19 in India in the Data4life study. Commun Med. 2023;3(1):117.

Amin S, et al. Participant perceptions in a long-term clinical trial of autosomal dominant polycystic kidney disease. Kidney Med. 2023;5(9):100691.

Cheraghi L, et al. Predisposing factors of long-term responsiveness in a cardio-metabolic cohort: Tehran lipid and glucose study. BMC Med Res Methodol. 2021;21(1):161.

Green E, et al. Exploring patterns of response across the lifespan: the Cambridge Centre for Ageing and Neuroscience (Cam-CAN) study. BMC Public Health. 2018;18(1):N.PAG-N.PAG.

Irvine L, et al. Modifying alcohol consumption to reduce obesity: a randomized controlled feasibility study of a complex community-based intervention for men. Alcohol Alcohol. 2017;52(6):677–84.

Goldberg M, et al. Health problems were the strongest predictors of attrition during follow-up of the GAZEL cohort. J Clin Epidemiol. 2006;59(11):1213–21.

Méjean C, et al. Motives for participating in a web-based nutrition cohort according to sociodemographic, lifestyle, and health characteristics: the NutriNet-Santé cohort study. J Med Internet Res. 2014;16(8):e189–e189.

Kannisto KA, et al. Factors associated with dropout during recruitment and follow-up periods of a mHealth-based randomized controlled trial for Mobile.Net to encourage treatment adherence for people with serious mental health problems. J Med Internet Res. 2017;19(2):1–1.

Kelfve S, Fors S, Lennartsson C. Getting better all the time? Selective attrition and compositional changes in longitudinal and life course studies. Longitud Life Course Stud. 2017;8(1):104–19.

van Wees DA, et al. Who drops out and when? Predictors of non-response and loss to follow-up in a longitudinal cohort study among STI clinic visitors. PLoS One. 2019;14(6):15.

Limmroth V, et al. Ascertaining medication use and patient-reported outcomes via an app and exploring gamification in patients with multiple sclerosis treated with interferon β-1b: observational study. JMIR Form Res. 2022;6(3):e31972.

Gourash WF, et al. Five-year attrition, active enrollment, and predictors of level of participation in the Longitudinal Assessment of Bariatric Surgery (LABS-2) study. Surg Obes Relat Dis. 2022;18(3):394–403.

Griffith Fillipo IR, et al. Participant retention in a fully remote trial of digital psychotherapy: comparison of incentive types. Front Digit Health. 2022;4:963741.

Hamberger LK, Lohr JM, Gottlieb M. Predictors of treatment dropout from a spouse abuse abatement program. Behav Modif. 2000;24(4):528–52.

Janus M, Goldberg S. Factors influencing family participation in a longitudinal study: comparison of pediatric and healthy samples. J Pediatr Psychol. 1997;22(2):245–62.

Oleske DM, et al. Participation in occupational health longitudinal studies: predictors of missed visits and dropouts. Ann Epidemiol. 2007;17(1):9–18.

Olmos-Ochoa TT, et al. Barriers to participation in web-based and in-person weight management interventions for serious mental illness. Psychiatr Rehabil J. 2019;42(3):220–8.

Pogue JR, et al. Strategies and lessons learned from a longitudinal study to maximize recruitment in the midst of a global pandemic. Bayl Univ Med Center Proc. 2022;35(3):309–14.

Snow WM, et al. Predictors of attendance and dropout at the lung health study 11-year follow-up. Contemp Clin Trials. 2007;28(1):25–32.

Ullman JB, Newcomb MD. Eager, reluctant, and nonresponders to a mailed longitudinal survey: attitudinal and substance use characteristics differentiate respondents. J Appl Soc Psychol. 1998;28(4):357–75.

Crichton GE, et al. Long-term dietary intervention trials: critical issues and challenges. Trials. 2012;13:111.

Markanday S, et al. Sex-differences in reasons for non-participation at recruitment: Geelong osteoporosis study. BMC Res Notes. 2013;6:104.

Azizi F, Zadeh-Vakili A, Takyar M. Review of rationale, design, and initial findings: tehran lipid and glucose study. Int J Endocrinol Metab. 2018;16(4 Suppl):e84777.

Shafto MA, et al. The Cambridge Centre for Ageing and Neuroscience (Cam-CAN) study protocol: a cross-sectional, lifespan, multidisciplinary examination of healthy cognitive ageing. BMC Neurol. 2014;14:204.

Lee DM, et al. The European male ageing study (EMAS): design, methods and recruitment. Int J Androl. 2009;32(1):11–24.

Newcomb MD. Psychosocial predictors and consequences of drug use: a developmental perspective within a prospective study. J Addict Dis. 1997;16(1):51–89.

Wong ATY, et al. Randomised controlled trial to determine the efficacy and safety of prescribed water intake to prevent kidney failure due to autosomal dominant polycystic kidney disease (PREVENT-ADPKD). BMJ Open. 2018;8(1):e018794.

Emery T, Henson RN, Tyler LK. Cambridge centre for ageing and neuroscience. 2010. Available from: https://www.cam-can.org/ . Cited 2023 31.01.

Pasco JA, Nicholson GC, Kotowicz MA. Cohort profile: Geelong osteoporosis study. Int J Epidemiol. 2011;41(6):1565–75.

Bell B, Rose CL, Damon A. The veterans administration longitudinal study of healthy aging. Gerontologist. 1966;6(4):179–84.

Acknowledgements

We acknowledge the original owners, the Turrbal people and the Jaggera people of the land which this data was analysed and publication was written on. We pay respect to the Elders of these communities, past, present and emerging.

This study has received cash and in-kind funding from the following organisations and institutes: Mater Foundation (NA), the University of Queensland. PG was supported by Mater Foundation, DJB was supported by Research Support Fellowship (the University of Queensland), DH was supported by the University of Queensland, HWT was supported by NHMRC ideas grant 1181053, DS was supported by Mater Health, KMR is supported by the Mater Foundation and Equity Trustees (ANZ QLD Community Foundation, QCF-ANZ Bank Fund, QCF-Thomas George Swallow Trust, The HJ Hinchey Cht Trust).

Author information

Danielle J. Borg and Melina Haritopoulou-Sinanidou are joint first author.

Authors and Affiliations

Pregnancy and Development Group, Mater Research – The University of Queensland, Aubigny Place, South Brisbane, 4101, Australia

Danielle J. Borg

Faculty of Medicine, University of Queensland, Herston, 4006, Australia

Danielle J. Borg, Daniel Schweitzer & Kym M. Rae

Experimental Melanoma Therapy Group, Faculty of Medicine, The University of Queensland, Herston, 4006, Australia

Melina Haritopoulou-Sinanidou

Indigenous Health Group, Mater Research Institute – The University of Queensland, Aubigny Place, South Brisbane, 4101, Australia

Pam Gabrovska & Kym M. Rae

Stem Cell Biology Group, Mater Research Institute – The University of Queensland, Translational Research Institute, 37 Kent Street, Woolloongabba, QLD, 4102, Australia

Hsu-Wen Tseng

Library, University of Queensland, St Lucia, 4072, Australia

David Honeyman

Department of Neurology, Mater Health, South Brisbane, 4101, Australia

Daniel Schweitzer

Contributions

DJB, MH-S, contributed to the study design, conducting the review, data extraction, analysis and appraisal, interpretation of the data, drafting the manuscript and obtained the final approval of the version to be published. PG, DH, H-WT, DS contributed to the study design, conducting the review, data extraction, analysis and appraisal, interpretation of the data, and revising the manuscript critically and gave a final approval of the version to be published. KMR contributed to the study conceptualisation, the study design, data extraction, analysis and appraisal, interpretation of the data, and revising the manuscript critically and gave a final approval of the version to be published.

Corresponding author

Correspondence to Kym M. Rae.

Ethics declarations

Ethics approval and consent to participate.

No protocol approval was needed for this systematic review since no participants were involved.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Final Search Strategies.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Borg, D.J., Haritopoulou-Sinanidou, M., Gabrovska, P. et al. Barriers and facilitators for recruiting and retaining male participants into longitudinal health research: a systematic review. BMC Med Res Methodol 24, 46 (2024). https://doi.org/10.1186/s12874-024-02163-z

Received : 10 August 2023

Accepted : 28 January 2024

Published : 22 February 2024

DOI : https://doi.org/10.1186/s12874-024-02163-z

  • Health research
  • Longitudinal study
  • Recruitment facilitators
  • Study retention

BMC Medical Research Methodology

ISSN: 1471-2288

  • Open access
  • Published: 24 February 2024

Gut microbiota in patients with prostate cancer: a systematic review and meta-analysis

  • Haotian Huang 1 ,
  • Yang Liu 1 ,
  • Zhi Wen 1 ,
  • Caixia Chen 1 ,
  • Chongjian Wang 1 ,
  • Hongyuan Li 1 &
  • Xuesong Yang 1  

BMC Cancer volume 24, Article number: 261 (2024)

Increasing evidence indicates that gut microbiota are closely related to prostate cancer. This study aims to assess the gut microbiota composition in patients with prostate cancer compared to healthy participants, thereby advancing understanding of gut microbiota's role in prostate cancer.

A systematic search was conducted across PubMed, Web of Science, and Embase databases, in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The methodological quality of included studies was evaluated using the Newcastle–Ottawa Scale (NOS), and pertinent data were analyzed. The kappa score assessed interrater agreement.

This study encompassed seven research papers, involving 250 prostate cancer patients and 192 controls. The kappa was 0.93. Meta-analysis results showed that alpha-diversity of gut microbiota in prostate cancer patients was significantly lower than in the control group. In terms of gut microbiota abundance, the ratio of Proteobacteria, Bacteroidia, Clostridia, Bacteroidales, Clostridiales, Prevotellaceae, Lachnospiraceae, Prevotella, Escherichia-Shigella, Faecalibacterium, and Bacteroides was higher in prostate cancer patients. Conversely, the abundance ratio of Actinobacteria, Bacteroidetes, Firmicutes, Selenomonadales, Veillonella, and Megasphaera was higher in the control group.

Our study reveals differences in alpha-diversity and abundance of gut microbiota between patients with prostate cancer and controls, indicating gut microbiota dysbiosis in those with prostate cancer. However, given the limited quality and quantity of selected studies, further research is necessary to validate these findings.

Prostate cancer (PCa) is the most prevalent malignant tumor in males, particularly in the United States [ 1 ], significantly impacting public health. In 2022, PCa constituted approximately 27% of newly diagnosed male cancer cases, with its mortality rate ranking second among male cancers [ 2 ]. The incidence of PCa is also rapidly increasing in many Asian countries [ 3 ]. Research has indicated potential influences of various factors on PCa development, including genetics, race, age, local inflammation, and lifestyle habits [ 4 , 5 , 6 , 7 ]. However, the definitive impact of these factors on PCa pathogenesis remains unconfirmed. Recent studies have highlighted an increasing association between human diseases and microbiota, notably the gut microbiota (GM). Consequently, microbial factors, such as urinary and gut microbiota, are attracting significant interest in their impact on health [ 8 , 9 ].

The term 'microbiota' denotes the collection of microorganisms residing in a specific biological environment, including bacteria, viruses, parasites, and fungi [ 10 ]. The mammalian gastrointestinal tract hosts a complex community of trillions of symbiotic entities, such as bacteria, fungi, archaea, and viruses, collectively known as the GM [ 11 ]. Research has linked the GM to various conditions, including diabetes, Alzheimer's disease, and ulcerative colitis [ 12 , 13 , 14 , 15 ]. Advances in next-generation sequencing technologies have greatly improved our understanding of the GM's composition, for example, through the sequencing of the 16S rRNA gene or its amplicons, based on the variability of small subunit ribosomal RNA sequences [ 16 , 17 ]. This has enabled deeper exploration into the GM's relationship with diseases.

Because the prostate is relatively distant from the gut, the impact of gut microbiota (GM) on PCa was initially unclear. However, recent studies have uncovered an association between GM and PCa. In 2018, Golombos et al. analyzed the GM of 20 male subjects, noting a higher prevalence of Bacteroides massiliensis in PCa patients, although GM diversity appeared similar when comparing PCa patients with healthy controls [ 18 ]. In 2022, Fernandes et al. observed differences in the relative abundance of phylum-level bacteria between PCa patients and healthy individuals [ 19 ]. These studies, which used GM sequencing to analyze PCa patient samples, suggest a significant link between GM and PCa.

Nevertheless, due to varying sample sizes and individual differences, the specific characteristics of GM in PCa patients remain ambiguous. To address this, our meta-analysis was conducted to examine changes in GM composition in PCa patients. This aims to discern GM's role in the etiology and progression of PCa and to explore new preventive and diagnostic methods.

This systematic review and meta-analysis adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and was prospectively registered with PROSPERO (CRD 42023476765).

Search strategy

We systematically retrieved relevant studies from the PubMed, EMBASE, and Web of Science databases from their inception until September 2023. Our search strategy was based on the PICOS principle: (P) Population: prostate cancer patients; (C) Comparison: healthy controls; (O) Outcome: diversity and abundance of gut microbiota; (S) Study Design: prospective studies, case–control studies, or cohort studies. The details of our search terms and strategy are presented in Additional file 1: Table S1 (using PubMed as an example).

Inclusion and exclusion criteria

The inclusion criteria for studies were: (1) prospective studies, cohort studies, or case–control studies; (2) original research comparing the GM of PCa patients with a control cohort; (3) use of 16S rRNA sequencing technology; and (4) studies reporting microbial communities in fecal samples. The exclusion criteria were: (1) studies not on topic; (2) animal experiments, reviews, summaries, conference abstracts, secondary research, and editorials; and (3) studies where microbiota originated from urine, prostatic fluid, or prostate tissue.

Study selection

EndNote reference management software was used for managing literature and eliminating duplicate records in our study. Records with 100% similarity were automatically removed, while those with 80–99% similarity were manually reviewed for removal. Two researchers (HH and LY) screened titles and abstracts for initial evaluation and categorization, determining which literature to include or exclude. They then read the remaining literature in full to confirm its relevance. The literature selection was conducted independently by these two investigators. Disagreements were resolved by consulting a third researcher.

Additionally, reference lists of included studies, systematic reviews, and reviews on the topic were scrutinized. All related articles were thoroughly read, and relevant articles were identified using the snowball technique.

Data extraction and quality assessment

Data extraction was independently performed by two researchers and included the first author’s name, publication year, country, participant number, sample collection method, and the alpha-diversity and abundance of GM. Disputes were resolved through discussion with a third researcher. We used the kappa score to assess interrater agreement. A kappa score ≤ 0.20 was considered poor agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 good agreement, and 0.81–1.00 very good agreement [ 20 ].
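As an illustration of the agreement bands above, the following sketch computes a kappa score for two raters and maps it to a label. It assumes Cohen's kappa for two raters on categorical decisions, which the text does not specify; the function names are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters (assumed variant; not stated in the text)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must rate the same non-empty item set")
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa: float) -> str:
    """Map a kappa score to the agreement bands listed above."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"
```

For example, `agreement_label(0.93)` falls in the highest band.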

For quality assessment, we used the Newcastle–Ottawa Scale (NOS) [ 21 ]. This tool, designed for observational studies, comprises eight components assessing the study's selectiveness, comparability of the exposed group, and outcome clarity. The total score is 9 points, with studies scoring 6 or above considered high-quality. Studies scoring below 6 were deemed low-quality and excluded from our analysis.

Outcome measures

The primary outcome assessed was the variation in alpha-diversity of gut microbiota between prostate cancer patients and the control group. "Alpha-diversity" is an evaluation of microbial diversity, which may include species richness, evenness of abundance, or both. Indexes such as Shannon and Simpson were utilized to assess alpha-diversity, while the Chao1 index and the count of Observed species or Operational Taxonomic Units (OTUs) estimated microbial richness. The secondary outcome evaluated the relative abundance of various taxa within the microbiota across studies, encompassing taxonomic categories like phylum, class, order, family, and genus.
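To make the indices above concrete, the sketch below computes them from a vector of per-taxon read counts for one sample. The exact variants used by the included studies are not stated, so the choices here are assumptions: the natural logarithm for Shannon, the 1 − Σp² form for Simpson, and the bias-corrected form of Chao1.

```python
import math


def observed_richness(counts: list[int]) -> int:
    """Number of taxa (e.g. OTUs) with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts: list[int]) -> float:
    """Shannon index H = -sum(p_i * ln p_i); natural log is an assumption."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts: list[int]) -> float:
    """Simpson diversity in the 1 - sum(p_i^2) form (assumed variant)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts: list[int]) -> float:
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of taxa seen exactly once and exactly twice.
    """
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed_richness(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))
```

Shannon and Simpson respond to both richness and evenness, whereas Chao1 and the observed count respond to richness only, which is why the two groups of indices are reported separately.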

Data analysis

A meta-analysis was conducted using StataMP 15.0 to assess both the primary outcome (alpha-diversity of GM) and the secondary outcome (relative abundance of various taxa). For continuous indicators, such as microbial alpha-diversity and abundance, we compiled and analyzed the overall mean, standard deviation (SD), and standard error (SE). SE was calculated using the formula SE = √(r*[1-r]/n) if not provided in the studies. For PCa and control groups with multiple subgroups, subgroup data were combined. The combined effect size (ES) was calculated using StataMP 15.0. Heterogeneity was quantitatively analyzed using I² [ 22 ]. A random-effects model was applied if I² > 50%; a fixed-effects model was used if I² < 50%. Sensitivity analysis was conducted by sequentially excluding individual studies to confirm the stability and reliability of our results. Publication bias was assessed using Begg's rank correlation test and Egger's linear regression test, with p < 0.05 considered statistically significant [ 23 ].
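The pooling workflow described above can be sketched in a few functions. This is not the Stata procedure used in the analysis. It is a minimal illustration that assumes inverse-variance weighting and the DerSimonian-Laird estimator for the between-study variance, neither of which is named in the text. It also reads r as a proportion and n as a sample size, and treats I² of exactly 50% as fixed-effects; both readings are assumptions.

```python
import math


def se_proportion(r: float, n: int) -> float:
    """SE = sqrt(r * (1 - r) / n); r as a proportion is an assumed reading."""
    return math.sqrt(r * (1.0 - r) / n)


def pool(effects: list[float], ses: list[float], tau2: float = 0.0):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)


def heterogeneity(effects: list[float], ses: list[float]):
    """Return Cochran's Q, I^2 in percent, and DerSimonian-Laird tau^2."""
    w = [1.0 / se ** 2 for se in ses]
    fixed, _ = pool(effects, ses)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2


def meta_analyse(effects: list[float], ses: list[float]) -> dict:
    """Choose the model from I^2, then pool the study effects."""
    _, i2, tau2 = heterogeneity(effects, ses)
    # I^2 == 50 is not covered by the stated rule; fixed is assumed here.
    model = "random" if i2 > 50.0 else "fixed"
    estimate, se = pool(effects, ses, tau2 if model == "random" else 0.0)
    return {"model": model, "estimate": estimate, "se": se, "i2": i2}


def leave_one_out(effects: list[float], ses: list[float]) -> list[dict]:
    """Sensitivity analysis: re-run the pooling with each study removed."""
    return [
        meta_analyse(effects[:i] + effects[i + 1:], ses[:i] + ses[i + 1:])
        for i in range(len(effects))
    ]
```

If the pooled estimates returned by `leave_one_out` stay close to the full-data estimate, no single study is driving the result, which is the sense of stability used in the sensitivity analysis.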

Study selection, characteristics, and quality of included studies

From an initial retrieval of 765 articles from the databases, 268 duplicates were identified and removed. Reviewing the titles and abstracts of the remaining 497 articles resulted in a further exclusion of 463 studies. After full-text assessments of the remaining 34 articles, 27 were excluded due to non-conforming inclusion populations, lack of GM-related data, non-fecal sample origin, or absence of a control group. Consequently, 7 articles were included in this study [ 24 , 25 , 26 , 27 , 28 , 29 , 30 ], as illustrated in Fig.  1 . These articles predominantly originated from China, the United States, Finland, and Israel, encompassing a total of 442 samples (250 PCa patients and 192 controls). All studies employed 16S rRNA sequencing, though the amplified regions of the 16S rRNA gene varied. Three studies targeted the V4 region [ 26 , 29 , 30 ], one the V3-V5 region [ 24 ], one the V3-V4 region [ 25 ], one the V6 region [ 27 ], and one the V1-V2 region [ 28 ]. For detailed population data and microbiota information, please see Table  1 .
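The selection counts reported above can be checked with simple arithmetic. The helper below is a hypothetical illustration of that bookkeeping, not part of the review's workflow.

```python
def prisma_flow(retrieved: int, duplicates: int,
                excluded_screening: int, excluded_full_text: int) -> dict:
    """Derive the record count remaining after each selection stage."""
    after_deduplication = retrieved - duplicates
    full_text_assessed = after_deduplication - excluded_screening
    included = full_text_assessed - excluded_full_text
    return {
        "after_deduplication": after_deduplication,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }


# Counts taken from the paragraph above.
flow = prisma_flow(retrieved=765, duplicates=268,
                   excluded_screening=463, excluded_full_text=27)
total_samples = 250 + 192  # PCa patients plus controls
```

With these inputs, the stage counts are 497, 34, and 7, and the sample total is 442, matching the figures in the text.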

figure 1

PRISMA flow diagram for the systematic review

The kappa score for data extraction was 0.93, which demonstrated “very good” interrater agreement. All 7 articles scored 6 or above on the NOS, signifying their high quality (Table  2 ).

Alpha diversity of GM

Six articles reported on the alpha-diversity of the GM [ 24 , 25 , 26 , 27 , 29 , 30 ]. However, the Evenness index, reported in only one article [ 30 ], was excluded from the quantitative analysis due to insufficient data. We analyzed the alpha-diversity of the PCa population and the control group, including the Chao1, Observed Species, Shannon, and Simpson indexes. The results showed reduced diversity in PCa patients compared to controls, as evidenced by lower scores in the Chao1 (Fig.  2 ), Observed Species (Fig.  3 ), and Shannon (Fig.  4 ) indexes. However, there was no significant difference in the Simpson index between PCa patients and controls (Fig.  5 ). Due to significant heterogeneity, a sensitivity analysis was conducted (Additional file 2 : Figure S1-4). This sensitivity analysis confirmed the stability of the results, indicating that excluding any single study did not significantly alter the overall effect size.

figure 2

Forest plot of alpha-diversity in Chao1 index

figure 3

Forest plot of alpha-diversity in Observed Species index

figure 4

Forest plot of alpha-diversity in Shannon index

figure 5

Forest plot of alpha-diversity in Simpson index

Bacterial phylum

Five studies reported data on the relative abundance of bacteria at the phylum level [ 24 , 25 , 26 , 28 , 30 ]. Forest plots showed that the GM of the control group had higher proportions of Actinobacteria, Bacteroidetes, and Firmicutes compared to the PCa group. In contrast, the relative abundance of Proteobacteria was higher in PCa patients. No statistically significant differences were observed in the relative abundance of Cyanobacteria, Verrucomicrobia, Fusobacteria, Synergistetes, and Spirochaetes between the PCa and control groups (Additional file 3 : Figure S5-13). Due to significant heterogeneity in some bacteria, further sensitivity analysis was conducted (Additional file 3 : Figure S14-15). The stability of the results was reaffirmed by this analysis, showing no significant changes in the overall effect size when any single study was excluded.

Bacterial class

Four studies reported data on the relative abundance of bacteria at the class level [ 24 , 25 , 26 , 28 ]. Forest plots revealed that the GM of the control group had higher proportions of Actinobacteria, Negativicutes, Betaproteobacteria, and Epsilonproteobacteria compared to the PCa group. However, in the PCa group, the relative abundance of Bacteroidia, Clostridia, Gammaproteobacteria, and Coriobacteriia exceeded that of the control group. No statistically significant differences were observed in the relative abundance of Bacilli, Erysipelotrichia, Deltaproteobacteria, Verrucomicrobiae, Fusobacteria, Alphaproteobacteria, Synergistia, and Spirochaetes between PCa patients and controls (Additional file 3 : Figure S16-31). Significant heterogeneity in some bacteria necessitated a sensitivity analysis (Additional file 3 : Figure S32-34). The results' stability was confirmed by this analysis, showing no significant changes in the overall effect size when any single study was excluded.

Bacterial order

Three studies reported data on the relative abundance of bacteria at the order level [ 24 , 25 , 26 ]. The GM of the control group exhibited higher proportions of Lactobacillales, Selenomonadales, and Actinomycetales compared to the PCa group. In contrast, the relative abundance of Bacteroidales, Clostridiales, Enterobacteriales, Bifidobacteriales, and Coriobacteriales was higher in PCa patients (Additional file 3 : Figure S35-42). Due to significant heterogeneity in some bacteria, a sensitivity analysis was conducted (Additional file 3 : Figure S43-45). The stability of the results was reaffirmed by this sensitivity analysis, indicating no significant changes in the overall effect size when any single study was excluded.

Bacterial family

Five studies reported data on the relative abundance of bacteria at the family level [ 24 , 25 , 26 , 27 , 29 ]. Forest plots showed that the GM of the control group had higher proportions of Corynebacteriaceae, Veillonellaceae, Bacteroidaceae, and Actinomycetaceae compared to the PCa group. However, the abundance of Prevotellaceae, Lachnospiraceae, Ruminococcaceae, Erysipelotrichaceae, Burkholderiaceae, and Bifidobacteriaceae was greater in the PCa group. There were no statistically significant differences in the relative abundance of Streptococcaceae, Acidaminococcaceae, and Enterobacteriaceae between PCa patients and controls (Additional file 3 : Figure S46-58). Significant heterogeneity in some bacteria necessitated sensitivity analysis (Additional file 3 : Figure S59-60). The results' stability was confirmed by this analysis, showing no significant changes in the overall effect size when any single study was excluded.

Bacterial genus

Three studies reported data on the relative abundance of bacteria at the genus level [ 24 , 25 , 26 ]. Forest plots indicated that the GM of the control group had higher proportions of Veillonella and Megasphaera compared to the PCa group. In contrast, the abundance of Prevotella, Escherichia-Shigella, Faecalibacterium, and Bacteroides was higher in PCa patients. No statistically significant differences in the abundance of Streptococcus were observed between PCa patients and controls (Additional file 3 : Figure S61-67).

Publication bias

A risk of bias assessment was conducted for each article included in our study. Based on Begg's correlation test and Egger's regression test, there was no statistically significant evidence of bias in the alpha-diversity of GM (Additional file 4 : Table S2). However, for the relative abundance of GM, publication bias was identified in certain bacteria, while no apparent bias was detected in others (Additional file 5 : Table S3-7).

Our comprehensive review represents the first meta-analysis examining gut microbiota composition in prostate cancer (PCa) patients. We observed notable variations in the composition of GM between PCa patients and non-PCa individuals. Our results indicated a decline in alpha-diversity of GM in PCa patients compared to the control group. Additionally, significant differences in bacterial relative abundance were evident at the phylum, class, order, family, and genus levels. Specifically, at the phylum level, a higher proportion of Proteobacteria was observed in PCa patients, while the proportions of Actinobacteria, Bacteroidetes, and Firmicutes were comparatively lower. At the genus level, increased abundance of Prevotella, Escherichia-Shigella, Faecalibacterium, and Bacteroides was noted in PCa patients, with a decreased abundance of Veillonella and Megasphaera.

Dysbiosis in the gut is defined as any alteration (increase or decrease) in GM that adversely affects the health of the host organism. The diversity of GM is increasingly recognized as a crucial factor in host health, and a decrease in microbial diversity has been associated with various gastrointestinal and systemic diseases [ 31 , 32 ]. Thus, GM is considered a regulatory factor in human health [ 31 ]. This aligns with our findings, where a decline in GM alpha-diversity was observed in PCa patients. Our study supports exploration of the correlation between PCa and GM but does not establish a causal relationship. The following factors may contribute to the decrease in GM alpha-diversity.

Changes in estrogen levels in humans may contribute to the decline in gut microbial alpha-diversity in patients diagnosed with PCa. Barrett-Connor et al. suggested a potential link between increased estrogen levels in the body and an increased risk associated with the prostate [ 33 ]. Thus, estrogen is considered a potential factor influencing the onset and progression of PCa [ 34 ]. Estrogen can indirectly suppress androgens by inhibiting the hypothalamic luteinizing hormone-releasing hormone (LHRH), reducing the stimulation of the pituitary gland to secrete luteinizing hormone (LH) and thereby constraining PCa progression. Some gut bacteria can metabolize and produce estrogen, known as the estrobolome, affecting the body's estrogen levels [ 35 ]. Normally, conjugated estrogen (glucuronide) produced in the liver cannot bind with estrogen receptors (ER). Gut microbiota can produce beta-glucuronidase to catalyze estrogen from a conjugated form to a dissociated form, which is closely related to human health. Dysbiosis of gut microbiota can impair this process, leading to decreased deconjugation and circulating estrogens, potentially linked with cancer emergence. Furthermore, estrogen might play a role in the progression of PCa, possibly via pathways such as genetic mutation, DNA damage, or chronic inflammation [ 36 ].

The implementation of androgen deprivation therapy (ADT) in patients diagnosed with PCa might be linked to a decrease in the alpha-diversity of GM. ADT, a standard treatment for PCa, aims to control disease progression by suppressing androgen production. Matsushita et al. identified a potential positive correlation between serum testosterone levels and the prevalence of Firmicutes [ 37 ]. A study involving PCa patients who underwent short-term, medium-term, and long-term ADT found that those receiving long-term ADT had significantly lower GM diversity compared to the other groups. At the phylum level, the abundance of Firmicutes and Bacteroidetes was higher in the long-term ADT group than in the other two subgroups [ 38 ]. Additionally, Sfanos' research, which analyzed the feces of PCa patients undergoing ADT, noted an enrichment of bacteria capable of steroid biosynthesis, such as Akkermansia muciniphila, Ruminococcaceae, or Lachnospiraceae, in the GM of these patients. Gut bacteria can also produce androgens from corticosteroids. These studies suggest that GM undergoes changes due to androgen deprivation and serves as a source of androgenic steroids, potentially contributing to resistance against ADT. This aligns with our findings, where Ruminococcaceae and Lachnospiraceae are proportionally higher in PCa patients. However, as various bacteria can perform steroid synthesis, further research is needed to identify specific androgenic steroid biosynthetic pathways activated within bacteria [ 39 ]. Therefore, the decline in GM diversity may be attributed to changes in testosterone levels [ 40 ].

Long-term intake of a high-fat diet (HFD) may also contribute to a decrease in the alpha-diversity of GM in patients with PCa. The composition of GM is influenced by various factors, including lifestyle habits, diet, illness conditions, and drug usage, with dietary factors having a particularly significant impact [ 41 ]. The consumption of HFD, dairy products, and processed meats has been confirmed as risk factors for prostate cancer [ 42 , 43 ]. A study using a prostate-specific Pten knockout mouse model suggests that a high-fat diet (HFD) promotes prostate cancer growth compared to a control diet, with the effects of the control diet being negated by administering broad-spectrum antibiotics [ 44 ]. Short-chain fatty acids (SCFA) produced by GM can signal through IGF1 on prostate epithelial cells, activating MAPK and PI3K signaling pathways and stimulating prostate tumor growth. Additionally, SCFA produced by gut bacteria may mitigate inflammation by regulating cytokine production (such as IL-10) and promoting regulatory T cell expansion, though the specific mechanisms are not fully understood. Recent research indicates that HFD consumption increases the abundance of anaerobic bacteria and Bacteroides in the gut. HFD can alter GM, increasing the translocation of Gram-negative bacteria into the bloodstream and mesenteric fat tissue through the intestinal mucosa, leading to inflammation [ 45 ]. HFD may also compromise the gut barrier, enhance intestinal permeability, and allow various intestinal metabolites or bacterial components to enter the host's circulation, triggering an inflammatory response. This inflammatory response is a crucial factor in HFD-induced prostate cancer growth, with HFD potentially leading to increased IL-6 expression in prostate tissue and triggering prostate cancer [ 46 ].

Quantitatively analyzed at the phylum level, the GM of PCa and control populations exhibited differences, particularly in Proteobacteria, Actinobacteria, Bacteroidetes, and Firmicutes. The equilibrium of GM is primarily maintained by these phyla [ 47 ], with Bacteroidetes and Firmicutes typically dominating the balance. A reduction in these bacteria often indicates gut dysbiosis, contributing to disease [ 48 ], which aligns with our research findings. Additionally, an increased abundance of Proteobacteria is considered indicative of GM dysbiosis. While a temporary rise in Proteobacteria in a healthy state may not cause clinical symptoms [ 48 , 49 ], a long-term overabundance might reflect microbiota dysbiosis or a diseased state [ 48 ]. The specific relationship between Proteobacteria and PCa, however, remains unclear and warrants further investigation.

Quantitatively analyzed at the genus level, the GM of PCa and control populations showed differences, particularly in Prevotella, Escherichia-Shigella, Faecalibacterium, Bacteroides, Veillonella, and Megasphaera. Among these, Faecalibacterium is a core genus in the human gut. Research indicates that Faecalibacterium can stimulate the NF-κB pathway and elevate the expression of multiple pro-inflammatory cytokine genes, potentially driving the progression of colorectal cancer [ 50 ]. While a direct link between Faecalibacterium and PCa has not been established, a connection is plausible given that the gut inflammatory response is a risk factor for PCa [ 46 ]. Studies have shown that the abundance of Prevotella is high in the GM of patients with colorectal cancer [ 51 ]. Interestingly, Prevotella is also abundant in the gut of PCa patients, suggesting a possible connection. However, the specifics of this relationship and its underlying mechanisms remain to be explored. Although we have conducted a thorough analysis at the phylum and genus levels, the role of GM at the order, class, and family levels in relation to PCa remains unclear. Future studies are required to explore these aspects and deepen our understanding of GM's role in PCa.

Additionally, dietary habits, medical procedures, race, geographic location, and other factors may contribute to the observed differences in diversity and abundance of GM between the PCa population and the control group. In terms of diet, Western-style diets are often associated with an increased risk of PCa compared to Chinese cuisine. However, current research yields inconsistent findings regarding whether the Western-style diet affects PCa risk through the mediation of GM, or through other factors such as metabolism or inflammation in prostate tissue [ 52 , 53 ]. Dietary nutrients, including fats, proteins, carbohydrates, vitamins (such as A, D, and E), and polyphenols, may also play a role in preventing PCa by influencing GM, though their specific mechanisms are not yet clear. Geographic variations also influence GM composition; for example, the gut microbiota in Japan exhibits a more abundant Actinobacteria phylum [ 54 ]. In terms of race, the participants in Alanee's studies were Caucasians [ 24 ], while those in Zhong's studies were Asians [ 25 ]. The diversity of subjects may impact the results, underscoring the need for more research to examine the influence of various factors on GM composition.

Given the presence of treatment-resistant cases in current PCa therapies, the GM offers a potential avenue for the prevention and treatment of prostate cancer. Understanding the intricate relationship between GM and PCa could lead to novel approaches in managing this disease.

Regarding screening potential, the use of serum PSA screening remains controversial due to modest risk reduction, a high rate of false positives, and questions about cost-efficacy at the population level [ 55 ]. Hence, detecting "unfavorable" characteristics in gut microbiota may be incorporated into prostate cancer risk screening. Our research results offer a reference for clinical physicians in this regard.

In terms of therapeutic potential, strategies aimed at transforming the gut microbiota of prostate cancer patients from unfavorable to favorable characteristics may aid in delaying or treating the disease. Various methods, such as fecal microbiota transplantation (FMT), prebiotics, probiotics, or synbiotics, can be used to treat the gut microbiota in prostate cancer. For instance, probiotics have seen wide application in patients with obesity and alcoholic liver disease [ 56 , 57 , 58 ]. Our research findings indicate potential bacterial differences between the cancer and control groups, which could guide future researchers in identifying "favorable" or "unfavorable" microbiota. This offers a reference for the development of future microbiota therapies in prostate cancer management.

Strengths and limitations

Our study exhibited several advantages. We maintained strict inclusion criteria, systematically retrieved all relevant studies that meet our predetermined conditions, and adhered to the PRISMA guidelines for reporting systematic reviews and meta-analyses. Additionally, our research included recent matching cohorts, providing an in-depth examination of the diversity and richness of gut microbiota in patients with PCa.

Despite these strengths, our study faced several limitations. 1. The number of articles included was limited, with only seven studies being available for quantitative analysis. 2. High heterogeneity among the included studies could have influenced the results, a common challenge in observational studies [ 59 ], as opposed to randomized controlled trials. 3. The included studies showed significant clinical and methodological heterogeneity, with factors like participant sample size, race, diet, residence, treatment methods, and age impacting GM composition. 4. Variations in DNA extraction methods, sequencing platforms, and sequencing depths used for sequencing the 16S rRNA gene region might have led to inconsistent results. 5. The methods of feces collection, such as stool samples and rectal swabs, also varied, potentially affecting the outcomes. The composition of the control group was not always consistent, and the inclusion of both healthy samples and benign prostatic hyperplasia (BPH) samples might have introduced biases.

Furthermore, our study could not encompass all bacterial strains associated with PCa. While we established a correlation between GM and PCa, this did not definitively imply a causal relationship. Future high-quality studies are required to validate these findings.

Conclusions

Overall, our meta-analysis indicated differences in both the abundance and alpha-diversity of GM between PCa patients and the control group. Microbial dysbiosis may be caused by ADT, HFD, and changes in endogenous estrogens. The impact of GM on the pathogenesis of PCa remains disputed. In the future, the gut microbiota may find broader applications in the screening and treatment of PCa. However, further foundational and clinical research is required to elucidate this connection.

Availability of data and materials

The datasets analyzed in this study are potentially available from the corresponding authors upon a reasonable request.

Abbreviations

  • PCa: Prostate cancer
  • GM: Gut microbiota
  • BPH: Benign prostatic hyperplasia
  • HFD: High-fat diet
  • LHRH: Luteinizing hormone-releasing hormone
  • LH: Luteinizing hormone
  • ADT: Androgen deprivation therapy

Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 Countries. CA A Cancer J Clin. 2021;71(3):209–49. https://doi.org/10.3322/caac.21660 .

Siegel RL, Miller KD, Fuchs HE, Jemal A. Cancer statistics, 2022. CA A Cancer J Clin. 2022;72(1):7–33. https://doi.org/10.3322/caac.21708 .

Kimura T, Egawa S. Epidemiology of prostate cancer in Asian countries. Int J Urol. 2018;25(6):524–31. https://doi.org/10.1111/iju.13593 .

Wong MC, Goggins WB, Wang HH, Fung FD, Leung C, Wong SY, Ng CF, Sung JJ. Global Incidence and mortality for prostate cancer: analysis of temporal patterns and trends in 36 countries. Eur Urol. 2016;70(5):862–74. https://doi.org/10.1016/j.eururo.2016.05.043 .

Wang G, Zhao D, Spring DJ, DePinho RA. Genetics and biology of prostate cancer. Genes Dev. 2018;32(17–18):1105–40. https://doi.org/10.1101/gad.315739.118 .

Garrett WS. Cancer and the microbiota. Science. 2015;348(6230):80–6. https://doi.org/10.1126/science.aaa4972 .

Fabiani R, Minelli L, Bertarelli G, Bacci S. A Western dietary pattern increases prostate cancer risk: a systematic review and meta-analysis. Nutrients. 2016;8(10):626. https://doi.org/10.3390/nu8100626 .

Massari F, Mollica V, Di Nunno V, Gatto L, Santoni M, Scarpelli M, Cimadamore A, Lopez-Beltran A, Cheng L, Battelli N, et al. The human microbiota and prostate cancer: friend or foe? Cancers (Basel). 2019;11(4):459. https://doi.org/10.3390/cancers11040459 .

Porter CM, Shrestha E, Peiffer LB, Sfanos KS. The microbiome in prostate inflammation and prostate cancer. Prostate Cancer Prostatic Dis. 2018;21(3):345–54. https://doi.org/10.1038/s41391-018-0041-1 .

Banerjee S, Robertson ES: Future Perspectives: Microbiome, Cancer and Therapeutic Promise. In.; 2019: 363–389.

Woo V, Alenghat T. Epigenetic regulation by gut microbiota. Gut Microbes. 2022;14(1):2022407. https://doi.org/10.1080/19490976.2021.2022407 .

Lacroix V, Cassard A, Mas E, Barreau F. Multi-omics analysis of gut microbiota in inflammatory bowel diseases: what benefits for diagnostic, prognostic and therapeutic tools? Int J Mol Sci. 2021;22(20):11255. https://doi.org/10.3390/ijms222011255 .

Morais LH, Schreiber HL 4th, Mazmanian SK. The gut microbiota-brain axis in behaviour and brain disorders. Nat Rev Microbiol. 2021;19(4):241–55. https://doi.org/10.1038/s41579-020-00460-0 .

Hou M, Xu G, Ran M, Luo W, Wang H. APOE-ε4 carrier status and gut microbiota dysbiosis in patients with alzheimer disease. Front Neurosci. 2021;15:619051. https://doi.org/10.3389/fnins.2021.619051 .

Palacios T, Vitetta L, Coulson S, Madigan CD, Lam YY, Manuel R, Briskey D, Hendy C, Kim JN, Ishoey T, et al. Targeting the intestinal microbiota to prevent type 2 diabetes and enhance the effect of metformin on glycaemia: a randomised controlled pilot study. Nutrients. 2020;12(7):2041. https://doi.org/10.3390/nu12072041 .

Allaband C, McDonald D, Vázquez-Baeza Y, Minich JJ, Tripathi A, Brenner DA, Loomba R, Smarr L, Sandborn WJ, Schnabl B, et al. Microbiome 101: studying, analyzing, and interpreting gut microbiome data for clinicians. Clin Gastroenterol Hepatol. 2019;17(2):218–30. https://doi.org/10.1016/j.cgh.2018.09.017 .

Langille MG, Zaneveld J, Caporaso JG, McDonald D, Knights D, Reyes JA, Clemente JC, Burkepile DE, Vega Thurber RL, Knight R, et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol. 2013;31(9):814–21. https://doi.org/10.1038/nbt.2676 .

Golombos DM, Ayangbesan A, O’Malley P, Lewicki P, Barlow L, Barbieri CE, Chan C, DuLong C, Abu-Ali G, Huttenhower C, et al. The role of gut microbiome in the pathogenesis of prostate cancer: a prospective, pilot study. Urology. 2018;111:122–8. https://doi.org/10.1016/j.urology.2017.08.039 .

Fernandes A, Oliveira A, Guedes C, Fernandes R, Soares R, Barata P. Effect of radium-223 on the gut microbiota of prostate cancer patients: a pilot case series study. Curr Issues Mol Biol. 2022;44(10):4950–9. https://doi.org/10.3390/cimb44100336 .

McHugh ML. Interrater reliability: the kappa statistic. Biochemia medica. 2012;22(3):276–82.

Stang A. Critical evaluation of the Newcastle-Ottawa scale for the assessment of the quality of nonrandomized studies in meta-analyses. Eur J Epidemiol. 2010;25(9):603–5. https://doi.org/10.1007/s10654-010-9491-z .

Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ (Clinical research ed). 2003;327(7414):557–60. https://doi.org/10.1136/bmj.327.7414.557 .

Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ (Clinical research ed). 1997;315(7109):629–34. https://doi.org/10.1136/bmj.315.7109.629 .

Alanee S, El-Zawahry A, Dynda D, Dabaja A, McVary K, Karr M, Braundmeier-Fleming A. A prospective study to examine the association of the urinary and fecal microbiota with prostate cancer diagnosis after transrectal biopsy of the prostate using 16sRNA gene analysis. Prostate. 2019;79(1):81–7. https://doi.org/10.1002/pros.23713 .

Zhong W, Wu K, Long Z, Zhou X, Zhong C, Wang S, Lai H, Guo Y, Lv D, Lu J, et al. Gut dysbiosis promotes prostate cancer progression and docetaxel resistance via activating NF-κB-IL6-STAT3 axis. Microbiome. 2022;10(1):94. https://doi.org/10.1186/s40168-022-01289-w .

Smith KS, Fruge AD, van der Pol W, Caston NE, Morrow CD, Demark-Wahnefried W, Carson TL. Gut microbial differences in breast and prostate cancer cases from two randomised controlled trials compared to matched cancer-free controls. Beneficial Microbes. 2021;12(3):239–48. https://doi.org/10.3920/bm2020.0098 .

Sfanos KS, Markowski MC, Peiffer LB, Ernst SE, White JR, Pienta KJ, Antonarakis ES, Ross AE. Compositional differences in gastrointestinal microbiota in prostate cancer patients treated with androgen axis-targeted therapies. Prostate Cancer Prostatic Dis. 2018;21(4):539–48. https://doi.org/10.1038/s41391-018-0061-x .

Liss MA, White JR, Goros M, Gelfond J, Leach R, Johnson-Pais T, Lai Z, Rourke E, Basler J, Ankerst D, et al. Metabolic biosynthesis pathways identified from fecal microbiome associated with prostate cancer. Eur Urol. 2018;74(5):575–82. https://doi.org/10.1016/j.eururo.2018.06.033 .

Kalinen S, Kallonen T, Gunell M, Ettala O, Jambor I, Knaapila J, Syvänen KT, Taimen P, Poutanen M, Ohlsson C et al: Gut microbiota affects prostate cancer risk through steroid hormone biosynthesis. In.; 2021.

Katz R, Ahmed MA, Safadi A, Abu Nasra W, Visoki A, Huckim M, Elias I, Nuriel-Ohayon M, Neuman H. Characterization of fecal microbiome in biopsy positive prostate cancer patients. BJUI compass. 2022;3(1):55–61. https://doi.org/10.1002/bco2.104 .

Fan Y, Pedersen O. Gut microbiota in human metabolic health and disease. Nat Rev Microbiol. 2021;19(1):55–71. https://doi.org/10.1038/s41579-020-0433-9 .

Frank DN, St Amand AL, Feldman RA, Boedeker EC, Harpaz N, Pace NR. Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proc Natl Acad Sci USA. 2007;104(34):13780–5. https://doi.org/10.1073/pnas.0706625104 .

Barrett-Connor E, Garland C, McPhillips JB, Khaw KT, Wingard DL. A prospective, population-based study of androstenedione, estrogens, and prostatic cancer. Cancer Res. 1990;50(1):169–73.

Huggins C, Hodges CV. Studies on prostatic cancer. I. The effect of castration, of estrogen and androgen injection on serum phosphatases in metastatic carcinoma of the prostate. CA Cancer J Clin. 1972;22(4):232–40. https://doi.org/10.3322/canjclin.22.4.232 .

Baker JM, Al-Nakkash L, Herbst-Kralovetz MM. Estrogen-gut microbiome axis: physiological and clinical implications. Maturitas. 2017;103:45–53. https://doi.org/10.1016/j.maturitas.2017.06.025 .

Nelles JL, Hu WY, Prins GS. Estrogen action and prostate cancer. Expert Rev Endocrinol Metab. 2011;6(3):437–51. https://doi.org/10.1586/eem.11.20 .

Matsushita M, Fujita K, Motooka D, Hatano K, Hata J, Nishimoto M, Banno E, Takezawa K, Fukuhara S, Kiuchi H, et al. Firmicutes in gut microbiota correlate with blood testosterone levels in elderly men. World J Men’s Health. 2022;40(3):517–25. https://doi.org/10.5534/wjmh.210190 .

Kure A, Tsukimi T, Ishii C, Aw W, Obana N, Nakato G, Hirayama A, Kawano H, China T, Shimizu F, et al. Gut environment changes due to androgen deprivation therapy in patients with prostate cancer. Prostate Cancer Prostatic Dis. 2023;26(2):323–30. https://doi.org/10.1038/s41391-022-00536-3 .

Pernigoni N, Guo C, Gallagher L, Yuan W, Colucci M, Troiani M, Liu L, Maraccani L, Guccini I, Migliorini D, et al. The potential role of the microbiota in prostate cancer pathogenesis and treatment. Nat Rev Urol. 2023;20(12):706–18. https://doi.org/10.1038/s41585-023-00795-2 .

Zha C, Peng Z, Huang K, Tang K, Wang Q, Zhu L, Che B, Li W, Xu S, Huang T, et al. Potential role of gut microbiota in prostate cancer: immunity, metabolites, pathways of action? Front Oncol. 2023;13:1196217. https://doi.org/10.3389/fonc.2023.1196217 .

Zhernakova A, Kurilshikov A, Bonder MJ, Tigchelaar EF, Schirmer M, Vatanen T, Mujagic Z, Vila AV, Falony G, Vieira-Silva S, et al. Population-based metagenomics analysis reveals markers for gut microbiome composition and diversity. Science. 2016;352(6285):565–9. https://doi.org/10.1126/science.aad3369 .

Newmark HL, Heaney RP. Dairy products and prostate cancer risk. Nutr Cancer-an Int J. 2010;62(3):297–9. https://doi.org/10.1080/01635580903407221 .

Punnen S, Hardin J, Cheng I, Klein EA, Witte JS. Impact of meat consumption, preparation, and mutagens on aggressive prostate cancer. PLoS ONE. 2011;6(11):e27711. https://doi.org/10.1371/journal.pone.0027711 .

Matsushita M, Fujita K, Hayashi T, Kayama H, Motooka D, Hase H, Jingushi K, Yamamichi G, Yumiba S, Tomiyama E, et al. Gut microbiota-derived short-chain fatty acids promote prostate cancer growth via IGF1 signaling. Cancer Res. 2021;81(15):4014–26. https://doi.org/10.1158/0008-5472.Can-20-4090 .

Amar J, Chabo C, Waget A, Klopp P, Vachoux C, Bermúdez-Humarán LG, Smirnova N, Bergé M, Sulpice T, Lahtinen S, et al. Intestinal mucosal adherence and translocation of commensal bacteria at the early onset of type 2 diabetes: molecular mechanisms and probiotic treatment. EMBO Mol Med. 2011;3(9):559–72. https://doi.org/10.1002/emmm.201100159 .

Fujita K, Hayashi T, Matsushita M, Uemura M, Nonomura N. Obesity, inflammation, and prostate cancer. J Clin Med. 2019;8(2):201. https://doi.org/10.3390/jcm8020201 .

Turnbaugh PJ, Ley RE, Mahowald MA, Magrini V, Mardis ER, Gordon JI. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature. 2006;444(7122):1027–31. https://doi.org/10.1038/nature05414 .

Shin NR, Whon TW, Bae JW. Proteobacteria: microbial signature of dysbiosis in gut microbiota. Trends Biotechnol. 2015;33(9):496–503. https://doi.org/10.1016/j.tibtech.2015.06.011 .

Caporaso JG, Lauber CL, Costello EK, Berg-Lyons D, Gonzalez A, Stombaugh J, Knights D, Gajer P, Ravel J, Fierer N, et al. Moving pictures of the human microbiome. Genome Biol. 2011;12(5):R50. https://doi.org/10.1186/gb-2011-12-5-r50 .

Brennan CA, Garrett WS. Fusobacterium nucleatum - symbiont, opportunist and oncobacterium. Nat Rev Microbiol. 2019;17(3):156–66. https://doi.org/10.1038/s41579-018-0129-6 .

Flemer B, Lynch DB, Brown JM, Jeffery IB, Ryan FJ, Claesson MJ, O’Riordain M, Shanahan F, O’Toole PW. Tumour-associated and non-tumour-associated microbiota in colorectal cancer. Gut. 2017;66(4):633–43. https://doi.org/10.1136/gutjnl-2015-309595 .

Richman EL, Kenfield SA, Stampfer MJ, Giovannucci EL, Chan JM. Egg, red meat, and poultry intake and risk of lethal prostate cancer in the prostate-specific antigen-era: incidence and survival. Cancer Prev Res (Phila). 2011;4(12):2110–21. https://doi.org/10.1158/1940-6207.Capr-11-0354 .

Meyer F, Bairati I, Shadmani R, Fradet Y, Moore L. Dietary fat and prostate cancer survival. Cancer causes & control : CCC. 1999;10(4):245–51. https://doi.org/10.1023/a:1008913307947 .

Nishijima S, Suda W, Oshima K, Kim SW, Hirose Y, Morita H, Hattori M. The gut microbiome of healthy Japanese and its microbial and functional uniqueness. DNA Res. 2016;23(2):125–33. https://doi.org/10.1093/dnares/dsw002 .

Catalona WJ. Prostate cancer screening. Med Clin North Am. 2018;102(2):199–214. https://doi.org/10.1016/j.mcna.2017.11.001 .

Musazadeh V, Roshanravan N, Dehghan P, Ahrabi SS. Effect of probiotics on liver enzymes in patients with non-alcoholic fatty liver disease: an umbrella of systematic review and meta-analysis. Front Nutr. 2022;9:844242. https://doi.org/10.3389/fnut.2022.844242 .

Keramati M, Kheirouri S, Musazadeh V, Alizadeh M. Association of high dietary acid load with the risk of cancer: a systematic review and meta-analysis of observational studies. Front Nutr. 2022;9:816797. https://doi.org/10.3389/fnut.2022.816797 .

Musazadeh V, Zarezadeh M, Ghalichi F, Ahrabi SS, Jamilian P, Jamilian P, Ghoreishi Z. Anti-obesity properties of probiotics; a considerable medical nutrition intervention: Findings from an umbrella meta-analysis. Eur J Pharmacol. 2022;928:175069. https://doi.org/10.1016/j.ejphar.2022.175069 .

Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, Moher D, Becker BJ, Sipe TA, Thacker SB. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. Jama. 2000;283(15):2008–12. https://doi.org/10.1001/jama.283.15.2008 .

Acknowledgements

We thank Zibo Yimore Translation CO. LTD for the language editing service.

The authors declared that no funds, grants, or other support were received during the preparation of this manuscript.

Author information

Authors and affiliations.

Department of Urology, Affiliated Hospital of North Sichuan Medical College, Nanchong, China

Haotian Huang, Yang Liu, Zhi Wen, Caixia Chen, Chongjian Wang, Hongyuan Li & Xuesong Yang

Contributions

All authors participated in the conception and design of the study. HH and LY conducted data collection and analysis. The initial draft of the manuscript was prepared by HH, and all authors provided comments on earlier drafts. All authors have reviewed and given their approval for the publication of the final manuscript.

Corresponding author

Correspondence to Xuesong Yang .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1, 2, 3, 4, 5 and 6.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Huang, H., Liu, Y., Wen, Z. et al. Gut microbiota in patients with prostate cancer: a systematic review and meta-analysis. BMC Cancer 24, 261 (2024). https://doi.org/10.1186/s12885-024-12018-x

Received : 08 November 2023

Accepted : 18 February 2024

Published : 24 February 2024

DOI : https://doi.org/10.1186/s12885-024-12018-x

  • 16S sequencing
  • Systematic review and meta-analysis

ISSN: 1471-2407

  • Open access
  • Published: 17 February 2024

Global prevalence of intimate partner violence during the COVID-19 pandemic among women: systematic review and meta-analysis

  • Mearg Eyasu Kifle 1 ,
  • Setognal Birara Aychiluhm 2 &
  • Etsay Woldu Anbesu 2  

BMC Women's Health volume 24, Article number: 127 (2024)

During the coronavirus pandemic, people faced strict preventive measures, including staying at home and maintaining social distance, which led to increasing rates of intimate partner violence. Women have been facing dual health emergencies: COVID-19 and domestic violence. Despite this, representative data on intimate partner violence during the COVID-19 pandemic are lacking, and the available findings are inconsistent.

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines were used to develop the systematic review and meta-analysis. All English-language studies conducted between 31 December 2019 and 15 May 2022 were extracted from PubMed/MEDLINE, CINAHL, and Google Scholar. The quality of the articles was assessed using the Joanna Briggs Institute Meta-Analysis of Statistics Assessment and Review Instrument (JBI-MAStARI). The I² statistic was used to assess heterogeneity among studies. Publication bias was assessed using funnel plot inspection and Egger’s test. A random-effects model was used for the analysis using RevMan and STATA 14 software.

A total of 5065 studies were retrieved, and 14 studies were included in the final meta-analysis. The pooled prevalence of intimate partner violence was 31% (95% CI: 22, 40). Subgroup analysis by region showed a higher prevalence of intimate partner violence in developing regions (33%; 95% CI: 23.0, 43.0) than in developed regions (14%; 95% CI: 11.0, 17.0). Subgroup analysis by country showed that Uganda had the highest prevalence of IPV (68%; 95% CI: 62.0, 72.0) and the USA the lowest (10%; 95% CI: 7.0, 15.0).

Nearly one in three women experienced intimate partner violence during the COVID-19 pandemic. Subgroup analysis by region showed that prevalence was highest in developing regions (33%). All forms of intimate partner violence (physical, sexual, emotional, and economic) were prevalent. Thus, available interventions should be implemented to reduce intimate partner violence against women during the COVID-19 pandemic and similar emerging and re-emerging pandemics, particularly in developing countries.

Trial registration

PROSPERO registration number: CRD42022334613 .

Introduction

Gender-based violence (GBV) is any cruelty directed at individuals based on their sex, gender identity, or socially defined way of femaleness and maleness [ 1 , 2 , 3 ]. Violence against women is the primary form of GBV and is a basic violation of women’s human rights [ 1 , 2 , 3 , 4 , 5 ]. Threats, coercion, and denial of liberty are among the forms of violence against women [ 5 , 6 , 7 ]. The main perpetrators of violence against women are male partners, including husbands, fiancés, or ex-partners, often referred to as intimate partners [ 5 , 6 , 7 , 8 ]. The World Health Organization (WHO) defines intimate partner violence as any behaviour within an intimate relationship by an intimate partner that causes physical, psychological, or sexual harm to those in the relationship, and it is one of the most common types of violence experienced by women [ 7 , 8 , 9 ].

Intimate partner violence is a serious, highly prevalent, preventable public health problem that violates women’s rights [ 10 ]. It has been exacerbated during the COVID-19 pandemic: control and prevention measures aimed at containing the pandemic, such as isolation, stay-at-home orders, and movement restrictions, have confined vulnerable women and potential perpetrators to the home setting and have increased the risk of IPV [ 11 , 12 , 13 ]. Globally, one in three women experiences physical, sexual, or psychological harm from an intimate partner or ex-partner [ 14 , 15 ]. Evidence from the World Health Organization (WHO) and the European Commission pointed to a ‘shadow pandemic’, with strong potential for increased IPV across the globe, as seen during the Ebola pandemic. At the beginning of the pandemic (March–April), community-based victim organizations reported a 25–50% increase in hotline calls, up to a 150% increase in website traffic, and a 12.5% increase in IPV-related police activity [ 16 , 17 , 18 , 19 ].

Lockdown declarations following the COVID-19 pandemic in several developed countries were followed by reported increases in intimate partner violence of 20%, 21–35%, 32–36%, and 30–50% [ 12 , 20 ]. In Africa, approximately 36.6% of women experience lifetime physical or sexual IPV [ 7 ]. During the COVID-19 pandemic, intimate partner violence was reported to be as high as before in Kenya (35%), Somalia (50%), South Africa (10660), Niger (499 cases) and Ethiopia (12.9%) [ 21 ]. In Ethiopia, more than 100 girls have been raped during COVID-19 within less than 2 months, and some of them are close family members [ 13 ].

Intimate partner violence has a complex and multifaceted health outcome, including physical, mental, sexual, and reproductive health issues, which, in turn, result in a high degree of women’s morbidity and mortality [ 22 ]. A study performed by the WHO showed that women who experienced violence were twice as likely to have an abortion and doubled their likelihood of falling into depression [ 23 ]. Approximately 41% of female IPV survivors experience some form of physical injury [ 24 ]. IPV can also extend beyond physical injury and result in death. Data from U.S. crime reports suggest that 16% of murder victims are killed by an intimate partner and that over 40% of female homicide victims in the U.S. are killed by an intimate partner [ 25 ].

Policies and strategies to overcome the problem were implemented at the global or local level just before and after the pandemic. They include teaching safe and healthy relationship skills, engaging influential adults and peers, disrupting developmental pathways toward IPV, creating protective environments, strengthening economic support for families, and supporting survivors to increase safety and lessen harm. They also call for commitment, cooperation, and leadership from numerous sectors, including public health, education, justice, health care, social services, business and labor, and government [ 26 , 27 , 28 , 29 , 30 , 31 ]. Despite these interventions, intimate partner violence remained a major public health problem during the COVID-19 pandemic. Moreover, representative data on intimate partner violence during the COVID-19 pandemic are lacking, and the available findings are inconsistent. Therefore, this systematic review and meta-analysis aimed to estimate the pooled prevalence of intimate partner violence during the COVID-19 pandemic among women.

Protocol and registration

This systematic review and meta-analysis was registered with the International Prospective Register of Systematic Reviews (PROSPERO) under ID number CRD42022334613, available at https://www.crd.york.ac.uk/prospero/#myprospero .

Forms of violence

Physical violence includes slapping, hitting, kicking and beating.

Sexual violence includes forced sexual intercourse and other forms of sexual coercion.

Emotional (psychological) abuse includes insults, belittling, constant humiliation, intimidation (e.g., destroying things), threats of harm, and threats to take away children.

Controlling behaviours include isolating a person from family and friends; monitoring their movements; and restricting access to financial resources, employment, education or medical care.

Search strategy and appraisal of studies

All published studies conducted in different countries that reported intimate partner violence during COVID-19 from December 2019 to May 2022 were included. The search was limited to peer-reviewed, indexed scientific journals and to articles written in English. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [ 32 ] were used to develop the present systematic review, using the PRISMA 2020 checklist (supplementary material file 1).

Article searches were conducted in databases including PubMed/MEDLINE, CINAHL, and Google Scholar. Medical Subject Headings (MeSH) terms and entry terms were used to search for studies, and the terms were adapted to each database. The key terms and entry terms were connected by Boolean operators (supplementary material file 2). Screening was conducted independently by two authors (ME and EW), and disagreements were resolved through discussion with the third author (SB). Using a snowballing technique, references of eligible studies and relevant reviews were also searched.
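As a rough illustration of how key terms and entry terms can be connected by Boolean operators, the sketch below joins synonyms with OR inside each concept and joins the concepts with AND. The term lists and the function name build_query are hypothetical placeholders; the authors' actual search strings are in supplementary material file 2 and are not reproduced here.

```python
def build_query(concepts):
    """Join synonyms with OR inside each concept, then join concepts with AND."""
    groups = []
    for synonyms in concepts:
        quoted = [f'"{term}"' for term in synonyms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical term lists, for illustration only (not the published strategy).
concepts = [
    ["intimate partner violence", "domestic violence"],
    ["COVID-19", "coronavirus"],
    ["women"],
]

query = build_query(concepts)
print(query)
```

In practice the string would still need database-specific field tags and syntax, which is the kind of per-database adaptation the paragraph above describes.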

Eligibility criteria

Studies were included if they were conducted in any country (globally); used an observational design (cross-sectional or cohort); were published or unpublished; reported the prevalence of intimate partner violence among women during COVID-19; were written in English; and were conducted after COVID-19 was identified in Wuhan, China, from December 2019 to May 2022 (database search dates: 15–30 May 2022). For studies that reported both quantitative and qualitative results, only the quantitative results were used. Studies were excluded if they were written in a language other than English, if the full text was unavailable, or if the corresponding author did not respond to an email request for relevant missing data. Case and conference reports, reviews, letters, and the qualitative results of studies that reported both quantitative and qualitative results were also excluded.

CoCoPop/PEO

Condition: Women’s intimate partner violence during the COVID-19 pandemic.

Context: worldwide.

Population: women with partners.

Exposure of interest: exposure is a determinant that increases or decreases the likelihood of intimate partner violence during the COVID-19 pandemic. The determinants can be but are not limited to age, residence, husbands’ educational status, decision-making power, social support, wealth index, history of abortion, arranged marriage, history of child death, controlling behaviour of the husband, and COVID-19 pandemic.

Outcome/condition: The outcome of the study was the pooled prevalence of intimate partner violence during COVID-19. Intimate partner violence includes physical, sexual, and emotional abuse and controlling behaviours by an intimate partner [ 7 ].

Study selection

Two independent reviewers (ME and EW) screened the retrieved studies. Duplicate articles were removed, articles were assessed by title and abstract, and irrelevant records were removed. A full-text review of relevant studies was performed before the inclusion of studies in the final meta-analysis. Disagreements among reviewers during the review process were resolved through discussion with the third author (SB). EndNote reference manager software [ 33 ] was used to collect records and to remove duplicates and irrelevant titles and abstracts.
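The duplicate-removal step can be pictured with a small sketch. This is not EndNote's matching algorithm, which the source does not describe; it is a simplified stand-in that treats two records as duplicates when they share a DOI or a normalized title. The record fields and example entries are hypothetical.

```python
import re


def normalize_title(title):
    """Lowercase and collapse punctuation/whitespace so trivial variants match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def remove_duplicates(records):
    """Keep the first record seen for each DOI or normalized title."""
    seen_dois, seen_titles = set(), set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").lower()
        title = normalize_title(rec.get("title", ""))
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # already seen under the same DOI or the same title
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(rec)
    return unique


# Hypothetical records exported from different databases.
records = [
    {"title": "IPV during COVID-19: a survey", "doi": "10.1000/example.1"},
    {"title": "IPV During COVID-19 - A Survey", "doi": None},
    {"title": "Another study", "doi": "10.1000/example.2"},
    {"title": "Another study (reprint)", "doi": "10.1000/EXAMPLE.2"},
]

unique_records = remove_duplicates(records)
print(len(unique_records))
```

Automated matching of this kind only narrows the list; the title, abstract, and full-text screening described above remains a manual judgement by the reviewers.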

Quality assessment

During the screening process, two independent reviewers (ME and EW) performed the quality assessment and evaluated the risk of bias in eligible studies. The “Joanna Briggs Institute Meta-Analysis of Statistics Assessment and Review Instrument (JBI-MAStARI)” tool was used to critically appraise the quality of the studies (supplementary material file 3) [ 34 ]. The components of the quality assessment were clear inclusion criteria, study population and setting, measurement criteria, event and exposure measurements, and appropriate statistical analysis. During the critical quality appraisal of the studies, any disagreement among the authors was resolved by discussion with the third author (SB).

Data extraction

Data were extracted independently by two authors (ME and EW) using a pilot-tested data extraction sheet in Excel and RevMan software. The data extraction format contained the authors’ names, publication year, countries, study design, study setting, and sample size. Any disagreement was resolved through discussion with the third author (SB). In the case of incomplete results, the corresponding author was contacted by email, and articles were excluded if no response was received.

Statistical analysis

The final included studies were imported into STATA version 14 to determine the pooled prevalence. The results were reported in narrative descriptions, tables, and graphs. A random-effects model was used to estimate the pooled effect with its 95% CI [ 35 ].

The results were reported using a forest plot with respective odds ratios and 95% CIs. Heterogeneity among the included studies was assessed by visual inspection of the forest plot [ 36 ] and statistically using the I² statistic [ 37 ]. I² values of 25%, 50%, and 75% were taken to indicate low, moderate, and high heterogeneity, respectively, with statistical significance set at p < 0.05.
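To make the random-effects model and the I² statistic concrete, here is a minimal sketch of DerSimonian-Laird pooling. It is an assumption that this estimator is representative: the source does not state which estimator or transformation STATA applied, and the sketch pools raw proportions with a simple binomial variance. The study proportions and sample sizes are hypothetical, not the review's data.

```python
import math


def prevalence_variance(p, n):
    """Binomial variance of a raw proportion (no transformation applied)."""
    return p * (1 - p) / n


def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooling with Cochran's Q and I^2."""
    k = len(estimates)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I^2 in percent
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "Q": q,
        "I2": i2,
    }


# Hypothetical study-level prevalences and sample sizes.
props = [0.10, 0.30, 0.60]
sizes = [500, 500, 500]
variances = [prevalence_variance(p, n) for p, n in zip(props, sizes)]

result = pool_random_effects(props, variances)
print(round(result["pooled"], 3), round(result["I2"], 1))
```

I² is computed as (Q − df)/Q, so widely scattered prevalences measured in large samples push it toward 100%, which is the pattern expected when study-level estimates differ far more than sampling error alone would allow.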

Publication bias was assessed by visual inspection of the funnel plot and statistically using Egger’s test [ 38 ], with p < 0.05 taken as evidence of bias. Sources of heterogeneity between the studies were explored by subgroup analysis and meta-regression [ 39 ] based on country, study area (developed/developing), and sample size.
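Egger's test can be sketched as an ordinary least-squares regression of the standardized effect (estimate divided by its standard error) on precision (one over the standard error); an intercept far from zero suggests funnel-plot asymmetry. The function name egger_intercept and the data below are hypothetical, and the sketch stops at the t statistic: the p-value would come from a t distribution with k − 2 degrees of freedom, which is left to a statistics package.

```python
import math


def egger_intercept(estimates, std_errors):
    """OLS of (estimate / SE) on (1 / SE); returns intercept, its SE, t, df."""
    y = [e / s for e, s in zip(estimates, std_errors)]
    x = [1.0 / s for s in std_errors]
    k = len(y)
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    df = k - 2
    s2 = sum(r ** 2 for r in resid) / df  # residual variance
    se_int = math.sqrt(s2 * (1.0 / k + mx ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t_stat, df


# Hypothetical data constructed so that estimate = 0.3 + 2 * SE, i.e. the
# standardized effect is exactly 2 + 0.3 * precision (intercept of 2).
ses = [0.02, 0.03, 0.04, 0.05, 0.06]
effects = [0.3 + 2 * s for s in ses]

intercept, se_int, t_stat, df = egger_intercept(effects, ses)
print(round(intercept, 6), df)
```

Because the example data are exactly linear, only the intercept and degrees of freedom are meaningful here; real study data would leave residual scatter and a finite t statistic.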

A total of 5065 articles were collected from PubMed/MEDLINE, CINAHL, and Google Scholar. All articles were imported into EndNote software (version X8; Thomson Reuters, New York, NY), and 37 articles were excluded as duplicates. A total of 4884 articles were excluded after a review of their titles and abstracts. A total of 144 articles were assessed for eligibility based on the preset criteria. A total of 130 articles were excluded because the outcome of interest was not reported or because they were qualitative studies. Finally, 14 articles were eligible and included in this meta-analysis (Fig.  1 ).

figure 1

Flow chart of study selection for meta-analysis of IPV among women during the COVID-19 pandemic, 2022
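The study-selection counts reported for Fig. 1 can be checked with simple arithmetic. The numbers below are taken directly from the text; only the variable names are introduced here for illustration.

```python
# Counts reported in the study-selection paragraph.
identified = 5065               # records retrieved from all databases
duplicates_removed = 37         # excluded as duplicates
excluded_title_abstract = 4884  # excluded after title/abstract review
excluded_full_text = 130        # excluded at the eligibility stage

screened = identified - duplicates_removed
full_text_assessed = screened - excluded_title_abstract
included = full_text_assessed - excluded_full_text

print(screened, full_text_assessed, included)
```

The intermediate and final counts agree with the 144 articles assessed for eligibility and the 14 articles included in the meta-analysis.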

Quality appraisal

All the included studies scored at least four out of eight (50% or above) on the JBI critical appraisal checklist. The inclusion criteria were clearly defined in all studies. Strategies to address confounding factors and appropriate statistical analyses were used in all included studies. However, since all the included studies were cross-sectional, the identification of confounding factors was not applicable for this study (supplementary material file  4 ).

Characteristics of included studies

All 14 studies in this systematic review and meta-analysis were cross-sectional studies; eight of them were conducted in Ethiopia [ 4 , 13 , 40 , 41 , 42 , 43 , 44 , 45 ], one was conducted in Congo [ 46 ], one was conducted in Uganda [ 47 ], one was conducted in Bangladesh [ 48 ], one was conducted in Arab countries [ 49 ], one was conducted in Canada [ 50 ], and one was conducted in the USA [ 51 ]. A total of 8335 women with intimate partners were involved in our study. The sample size of the studies ranged from 216 [ 50 ] to 2002 [ 46 ]. In this review, the lowest prevalence (7.1%) of intimate partner violence was in St. Paul’s Hospital, Ethiopia [ 45 ], while the highest prevalence (68%) of IPV was reported in Uganda [ 47 ] (Table  1 ).

Pooled prevalence of any form of intimate partner violence among women during COVID-19

The pooled prevalence of intimate partner violence among women was 31% (95% CI: 22, 40). In this review, the lowest prevalence (7.1%) of IPV was in St. Paul’s Hospital, Ethiopia [ 45 ], while the highest prevalence (68%) of intimate partner violence was reported in Uganda [ 47 ]. The included studies exhibited significant heterogeneity (I² = 99.07, p  < 0.001) (Fig.  2 ).

figure 2

Forest plot showing the pooled prevalence of IPV among women during the COVID-19 pandemic
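As a back-of-the-envelope check on the pooled estimate shown in Fig. 2, the standard error implied by the reported interval can be recovered from the interval's width. This assumes a symmetric normal-approximation interval (estimate ± 1.96 × SE), which the source does not state explicitly; the helper name se_from_ci is hypothetical.

```python
Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z_95):
    """Standard error implied by a symmetric confidence interval."""
    return (upper - lower) / (2 * z)


# Pooled prevalence and 95% CI as reported in the text (as proportions).
pooled, lower, upper = 0.31, 0.22, 0.40

se = se_from_ci(lower, upper)
midpoint = (lower + upper) / 2

print(round(se, 4), round(midpoint, 2))
```

The interval is centred on the reported 31%, and its half-width of 9 percentage points corresponds to a standard error of roughly 4.6 percentage points under that assumption.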

Subgroup analysis

Subgroup analysis was performed based on region (developed/developing) and country to identify the possible source of heterogeneity. Subgroup analysis by region showed a higher prevalence of intimate partner violence in developing regions (33%; 95% CI: 23.0, 43.0) than in developed regions (14%; 95% CI: 11.0, 17.0). High heterogeneity was observed among studies from developing countries (I² = 99.19; p  < 0.001) (Fig.  3 ).

figure 3

Forest plot showing subgroup analysis on IPV among women during the COVID-19 pandemic by region

Subgroup analysis by country showed that Uganda had the highest prevalence of intimate partner violence among women (68%; 95% CI: 62.0, 72.0) and the USA the lowest (10%; 95% CI: 7.0, 15.0). High heterogeneity was observed among studies performed in Ethiopia (I² = 98.78; p  < 0.001). Ethiopia had the highest weight (57.30), possibly because of the large number of included studies from that country, and Canada had the lowest weight (7.02) (Fig.  4 ).

figure 4

Forest plot subgroup prevalence of IPV among women during the COVID-19 pandemic by country

Meta-regression

Meta-regression was performed to identify the source of heterogeneity across the studies by considering continuous and categorical variables, including region (developed/developing), country, and sample size. None of these variables was a statistically significant source of heterogeneity ( p  > 0.05) (Table  2 ).

Publication bias

On visual inspection, the funnel plot appeared asymmetric, with six studies on the right and eight studies on the left (Fig.  5 ). However, Egger’s regression test was not statistically significant ( p  = 0.345) (Table  3 ).

figure 5

Funnel plot for publication bias, IPV among women during the COVID-19 pandemic

Sensitivity analysis

The results showed that no single study unduly influenced the overall estimate of intimate partner violence during the COVID-19 pandemic and its associated factors (supplementary Figure file S1).
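A sensitivity analysis of this kind is commonly run as a leave-one-out procedure: the pooled estimate is recomputed with each study omitted in turn, and the results are compared with the overall estimate. The sketch below assumes that approach and a DerSimonian-Laird pooling step, neither of which the source specifies; the estimates and variances are hypothetical.

```python
def dl_pool(est, var):
    """Pooled estimate under a DerSimonian-Laird random-effects model."""
    w = [1.0 / v for v in var]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, est)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, est))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(est) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in var]
    return sum(wi * yi for wi, yi in zip(w_re, est)) / sum(w_re)


def leave_one_out(est, var):
    """Re-pool the estimate with each study omitted in turn."""
    return [
        dl_pool(est[:i] + est[i + 1:], var[:i] + var[i + 1:])
        for i in range(len(est))
    ]


# Hypothetical study prevalences with equal variances.
est = [0.10, 0.25, 0.30, 0.35, 0.68]
var = [0.0004] * 5

overall = dl_pool(est, var)
omitted = leave_one_out(est, var)
for study, value in zip(est, omitted):
    print(f"omit {study:.2f}: pooled = {value:.3f}")
```

If omitting any one study moved the pooled value far from the overall estimate, that study would be flagged as influential; the paragraph above reports that no study did so in this review.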

Forms of intimate partner violence

In this study, the prevalence was calculated for each form of intimate partner violence.

Controlling violence

The prevalence of controlling violence in one study during the pandemic was 54% (95% CI: 49, 60) [ 47 ] (Fig.  6 ).

figure 6

Forest plot showing the prevalence of controlling violence among women during the COVID-19 pandemic

Verbal violence

The pooled prevalence of verbal violence faced by women with intimate partners during the pandemic in two studies was 53% (95% CI: 51, 56) [ 46 , 49 ] (Fig.  7 ).

figure 7

Forest plot showing the pooled prevalence of verbal violence among women during the COVID-19 pandemic

Emotional violence

The pooled prevalence of emotional violence faced by women with intimate partners during the pandemic in 13 studies was 25% (95% CI: 17, 32) [ 4 , 13 , 40 , 41 , 42 , 43 , 44 , 45 , 47 , 48 , 49 , 50 , 52 ] (Fig.  8 ).

figure 8

Forest plot showing the pooled prevalence of emotional violence among women during the COVID-19 pandemic

Economic violence

The pooled prevalence of economic violence faced by women with intimate partners during the pandemic in two studies was 17% (95% CI: 15, 20) [ 47 , 49 ] (Fig.  9 ).

figure 9

Sexual violence

The pooled prevalence of sexual violence faced by women with intimate partners during the pandemic in 14 studies was 14% (95% CI: 10, 18) [ 4 , 13 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 52 ] (Fig.  10 ).

figure 10

Forest plot showing the pooled prevalence of sexual violence among women during the COVID-19 pandemic

Physical violence

The pooled prevalence of physical violence faced by women with intimate partners during the pandemic in 14 studies was 14% (95% CI: 9, 18) [ 4 , 13 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 52 ] (Fig.  11 ).

figure 11

Forest plot showing the pooled prevalence of physical violence among women during the COVID-19 pandemic

This systematic review and meta-analysis aimed to estimate the pooled prevalence of intimate partner violence during the COVID-19 pandemic. To the best of our knowledge, no systematic review or meta-analysis has been conducted on the pooled prevalence of intimate partner violence during the COVID-19 pandemic. Moreover, there is a lack of representative data on intimate partner violence during the COVID-19 pandemic, and there are also inconsistent findings. Therefore, this systematic review and meta-analysis will help policy-makers, programmers, planners, clinicians, and researchers design appropriate strategies.

The pooled prevalence of any form of IPV among women during the COVID-19 pandemic was 31% (95% CI: 22–40). This prevalence was comparable to those reported by systematic reviews performed before the pandemic (30% [ 53 ] and 27% [ 54 ]) and during the pandemic (31% [ 55 ] and 33.4% [ 56 ]). However, the pooled prevalence was higher than that in studies performed before the pandemic: sub-Saharan Africa, 20% [ 54 ]; northern Africa, 15% [ 57 ]; southern Asia, 19% [ 14 ]; western Asia, 13% [ 7 ]; African countries, 15.23% [ 58 ]; China, 7.7% [ 59 ]; and France, 7% [ 60 ]. It was also higher than in studies performed during the pandemic in the United States (18.0% [ 17 ]), Ethiopia (26.6% [ 40 ]), and Arab countries (22.2% [ 49 ]). Our finding was lower than those of studies conducted in Peru (48.0% [ 61 ]), New Orleans (59% [ 62 ]), Jordan (40% [ 63 ]), Iran (65.4% [ 64 ]), and Bangladesh (45.29% [ 48 ]). These discrepancies might be due to differences in sample size, study setting, study period, availability of and access to health services, reproductive health information, geographical areas, and the cultures of study subjects.

In this study, the prevalence of each form of IPV during the pandemic was also determined: controlling, verbal, emotional, economic, sexual, and physical violence were all prevalent forms of violence faced by women with intimate partners during the pandemic.

One limitation of this study is that it included only articles published in English. Databases such as Scopus and EMBASE were not searched because of the lack of free access, and we recommend funding to expand the database search sources. Additionally, all studies included in this meta-analysis were cross-sectional; as a result, the outcome variables could be affected by confounding variables, and cause-and-effect relationships could not be determined. Furthermore, only studies from seven countries fulfilled the eligibility criteria, so the findings may not be representative. Despite these limitations, searching, selection and data extraction of the studies were performed based on eligibility criteria independently by two authors, and ambiguity was resolved by a third author.

Conclusions

Availability of data and materials.

All data generated or analysed during the current study are included in this manuscript and its supplementary information files.

Abbreviations

COVID-19: Coronavirus disease 2019

CINAHL: Cumulative Index to Nursing and Allied Health Literature

IPV: Intimate Partner Violence

MeSH: Medical Subject Headings

PRISMA: Preferred Reporting Items for Systematic Review & Meta-analysis

RevMan: Review Manager Software

Goldfarb S, Goldscheid J. International human rights law on violence against women and children and its impact on domestic law and action. In: Women and children as victims and offenders: background, prevention, reintegration. Springer; 2016. p. 3–45.

Russo NF, Pirlott A. Gender-based violence: concepts, methods, and findings. Ann N Y Acad Sci. 2006;1087(1):178–205.

Hossain M, McAlpine A. Gender based violence research methodologies in humanitarian settings: an evidence review and recommendations; 2017.

Ehitemariyam T. Assessment of intimate partner violence against women of reproductive age and associated factors during COVID-19 pandemic in Debre Berhan Town, Ethiopia 2021: a community-based cross-sectional study. 2021.

Garcia-Moreno C, Jansen HAFM, Ellsberg M, Heise L, Watts C. WHO multicountry study on women’s health and domestic violence against women: initial results on prevalence, health outcomes and women’s responses. Geneva: World Health Organization; 2005.

WHO P. Understanding and addressing violence against women. Geneva, Switzerland: The World Health Organization; 2012.

García-Moreno C, Pallitto C, Devries K, Stöckl H, Watts C, Abrahams N. Global and regional estimates of violence against women: prevalence and health effects of intimate partner violence and non-partner sexual violence. World Health Organization; 2013.

Krug EG, Mercy JA, Dahlberg LL, Zwi AB. The world report on violence and health. Lancet. 2002;360(9339):1083–8.

Benebo FO, Schumann B, Vaezghasemi M. Intimate partner violence against women in Nigeria: a multilevel study investigating the effect of women’s status and community norms. BMC Womens Health. 2018;18(1):1–17.

Tochie JN, Ofakem I, Ayissi G, Endomba FT, Fobellah NN, Wouatong C, et al. Intimate partner violence during the confinement period of the COVID-19 pandemic: exploring the French and Cameroonian public health policies. Pan Afr Med J. 2020;35(Suppl 2).

Duncan TK, Weaver JL, Zakrison TL, Joseph B, Campbell BT, Christmas AB, et al. Domestic violence and safe storage of firearms in the COVID-19 era. Ann Surg. 2020;272(2):e55.

Hatchimonji JS, Swendiman RA, Seamon MJ, Nance ML. Trauma does not quarantine: violence during the COVID-19 pandemic. Ann Surg. 2020;272(2):e53.

Shitu S, Yeshaneh A, Abebe H. Intimate partner violence and associated factors among reproductive age women during COVID-19 pandemic in southern Ethiopia, 2020. Reprod Health. 2021;18(1):1–10.

Devries KM, Mak JY, Garcia-Moreno C, Petzold M, Child JC, Falder G, et al. The global prevalence of intimate partner violence against women. Science. 2013;340(6140):1527–8.

Gaines C. Commentary on a cochrane review of screening for intimate partner violence in health care settings. Nurs Womens Health. 2017;21(6):439–41.

World Health Organization. COVID-19 and violence against women: what the health sector/system can do, 7 April 2020. World Health Organization; 2020.

Jetelina KK, Knell G, Molsberry RJ. Changes in intimate partner violence during the early stages of the COVID-19 pandemic in the USA. Inj Prev. 2021;27(1):93–7.

International Rescue Committee. “Everything on her shoulders”: rapid assessment on gender and violence against women and girls in the Ebola outbreak in Beni, DRC. New York: International Rescue Committee; 2019. Accessed 8 March 2021.

Piquero AR, Riddell JR, Bishopp SA, Narvey C, Reid JA, Piquero NL. Staying home, staying safe? A short-term analysis of COVID-19 on Dallas domestic violence. Am J Crim Justice. 2020;45(4):601–35.

Bradley NL, DiPasquale AM, Dillabough K, Schneider PS. Health care practitioners’ responsibility to address intimate partner violence related to the COVID-19 pandemic. CMAJ. 2020;192(22):E609–E10.

African Child Policy Forum. Under siege: impact of COVID-19 on girls in Africa. PLAN International/African Child Policy Forum (ACPF); 2020.

Sere Y, Roman NV, Ruiter RA. Coping with the experiences of intimate partner violence among south African women: systematic review and meta-synthesis. Front Psychiatry. 2021:522.

Usher K, Bhullar N, Durkin J, Gyamfi N, Jackson D. Family violence and COVID-19: increased vulnerability and reduced options for support. Int J Ment Health Nurs. 2020;

Breiding MJ, Chen J, Black MC. Intimate partner violence in the United States--2010; 2014.

Cooper A, Smith EL. Homicide trends in the United States, 1980–2008. Washington, DC: Bureau of Justice Statistics; 2011.

Capaldi DM, Knoble NB, Shortt JW, Kim HK. A systematic review of risk factors for intimate partner violence. Partn Abus. 2012;3(2):231–80.

Browning CR. The span of collective efficacy: extending social disorganization theory to partner violence. J Marriage Fam. 2002;64(4):833–50.

Matjasko JL, Niolon PH, Valle LA. The role of economic factors and economic support in preventing and escaping from intimate partner violence. J Policy Anal Manag. 2013;32(1):122.

Banyard VL. Toward the next generation of bystander prevention of sexual and relationship violence: action coils to engage communities. Springer; 2015.

Wasserman G, Seracini A, Loeber R, Farrington D. Child delinquents: development, intervention, and service needs. Thousand Oaks, CA: Sage Publication; 2001.

McMahon S, Stepleton K, Cusano J, O’Connor J, Gandhi K, McGinty F. Beyond sexual assault surveys: a model for comprehensive campus climate assessments. J Stud Aff Res Pract. 2018;55(1):78–90.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Int J Surg. 2021;88:105906.

Agrawal A, Rasouli M. EndNote 1–2-3 easy!: reference management for the professional. Springer Nature; 2019.

Munn Z, Tufanaru C, Aromataris E. JBI's systematic reviews: data extraction and synthesis. American J Nurs. 2014;114(7):49–54.

Berkey CS, Hoaglin DC, Mosteller F, Colditz GA. A random-effects regression model for meta-analysis. Stat Med. 1995;14(4):395–411.

Rücker G, Schwarzer G. Beyond the forest plot: the drapery plot. Res Synth Methods. 2021;12(1):13–9.

Higgins JP, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med. 2002;21(11):1539–58.

Pustejovsky JE, Rodgers MA. Testing for funnel plot asymmetry of standardized mean differences. Res Synth Methods. 2019;10(1):57–71.

Spineli LM, Pandis N. Problems and pitfalls in subgroup analysis and meta-regression. Am J Orthod Dentofac Orthop. 2020;158(6):901–4.

Gebrewahd GT, Gebremeskel GG, Tadesse DB. Intimate partner violence against reproductive age women during COVID-19 pandemic in northern Ethiopia 2020: a community-based cross-sectional study. Reprod Health. 2020;17(1):1–8.

Fetene G, Alie MS, Girma D, Negesse Y. Prevalence and its predictors of intimate partner violence against pregnant women amid COVID-19 pandemic in Southwest Ethiopia, 2021: a cross-sectional study. SAGE Open Med. 2022;10:20503121221079317.

Getinet W, Azale T, Getie E, Salelaw E, Amare T, Demilew D, et al. Intimate partner violence among reproductive-age women in Central Gondar zone, northwest, Ethiopia: a population-based study. BMC Womens Health. 2022;22(1):109.

Shewangzaw Engda A, Dargie Wubetu A, Kasahun Amogne F, Moltot KT. Intimate partner violence and COVID-19 among reproductive age women: a community-based cross-sectional survey, Ethiopia. Womens Health. 2022;18:17455065211068980.

Tadesse AW, Tarekegn SM, Wagaw GB, Muluneh MD, Kassa AM. Prevalence and associated factors of intimate partner violence among married women during COVID-19 pandemic restrictions: a community-based study. J Interpers Violence. 2022;37(11–12):NP8632–NP50.

Teshome A, Gudu W, Bekele D, Asfaw M, Enyew R, Compton SD. Intimate partner violence among prenatal care attendees amidst the COVID-19 crisis: the incidence in Ethiopia. Int J Gynecol Obstet. 2021;153(1):45–50.

Ditekemena JD, Luhata C, Mavoko HM, Siewe Fodjo JN, Nkamba DM, Van Damme W, et al. Intimate partners violence against women during a COVID-19 lockdown period: results of an online survey in 7 provinces of the Democratic Republic of Congo. Int J Environ Res Public Health. 2021;18(10).

Katushabe E, Chinweuba A, Omieibi A, Asiimwe JB. Prevalence and determinants of intimate-partner violence among pregnant women attending a City health Centre IV, south western Uganda, during the COVID-19 pandemic: a cross-sectional study. Stud J Health Res Afr. 2022;3(3):17.

Rayhan I, Akter K. Prevalence and associated factors of intimate partner violence (IPV) against women in Bangladesh amid COVID-19 pandemic. Heliyon. 2021;7(3):e06619.

El-Nimr NA, Mamdouh HM, Ramadan A, El Saeh HM, Shata ZN. Intimate partner violence among Arab women before and during the COVID-19 lockdown. J Egypt Public Health Assoc. 2021;96(1):1–8.

Muldoon KA, Denize KM, Talarico R, Boisvert C, Frank O, Harvey AL, et al. COVID-19 and perinatal intimate partner violence: a cross-sectional survey of pregnant and postpartum individuals in the early stages of the COVID-19 pandemic. BMJ Open. 2021;11(5):e049295.

Cannon CE, Ferreira R, Buttell F, First J. COVID-19, intimate partner violence, and communication ecologies. Am Behav Sci. 2021;65(7):992–1013.

Spencer CM, Gimarc C, Durtschi J. COVID-19 specific risk markers for intimate partner violence perpetration. J Fam Violence. 2022;37(6):881–91.

García-Moreno C, Stöckl H. Violence against women, its prevalence and health consequences. In: Violence against women and mental health. Vol. 178. Karger Publishers; 2013. p. 1–11.

Sardinha L, Maheu-Giroux M, Stöckl H, Meyer SR, García-Moreno C. Global, regional, and national prevalence estimates of physical or sexual, or both, intimate partner violence against women in 2018. Lancet. 2022;399(10327):803–13.

Berniell I, Facchini G. COVID-19 lockdown and domestic violence: evidence from internet-search behavior in 11 countries. Eur Econ Rev. 2021;136:103775.

Boxall H, Morgan A. Intimate partner violence during the COVID-19 pandemic: a survey of women in Australia. Australia's National Research Organisation for Women's Safety; 2021.

Maheu-Giroux M, Sardinha L, Stöckl H, Meyer SR, Godin A, Alexander M, et al. A framework to model global, regional, and national estimates of intimate partner violence. BMC Med Res Methodol. 2022;22(1):1–17.

Shamu S, Abrahams N, Temmerman M, Musekiwa A, Zarowsky C. A systematic review of African studies on intimate partner violence against pregnant women: prevalence and risk factors. PLoS One. 2011;6(3):e17591.

Wang T, Liu Y, Li Z, Liu K, Xu Y, Shi W, et al. Prevalence of intimate partner violence (IPV) during pregnancy in China: a systematic review and meta-analysis. PLoS One. 2017;12(10):e0175108.

Peraud W, Quintard B, Constant A. Factors associated with violence against women following the COVID-19 lockdown in France: results from a prospective online survey. PLoS One. 2021;16(9):e0257193.

Agüero JM. COVID-19 and the rise of intimate partner violence. World Dev. 2021;137:105217.

Buttell F, Ferreira RJ. The hidden disaster of COVID-19: intimate partner violence. Psychol Trauma Theory Res Pract Policy. 2020;12(S1):S197.

Abuhammad S. Violence against Jordanian women during COVID-19 outbreak. Int J Clin Pract. 2021;75(3):e13824.

Fereidooni R, Mootz J, Sabaei R, Khoshnood K, Heydari ST, Moradian MJ, et al. The COVID-19 pandemic, socioeconomic effects, and intimate partner violence against women: a population-based cohort study in Iran. 2021.

Acknowledgements

We would like to thank Samara University for providing network access and access to the HINARI database website.

No funding.

Author information

Authors and affiliations.

Family Guidance Association, Logia, Afar, Ethiopia

Mearg Eyasu Kifle

Department of Public Health, College of Medical and Health Sciences, Samara University, Samara, Ethiopia

Setognal Birara Aychiluhm & Etsay Woldu Anbesu

Contributions

ME and EW participated in screening, data extraction, and quality appraisal. SB acted as a mediator who resolved any disagreement between the two authors.

Corresponding author

Correspondence to Mearg Eyasu Kifle.

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Additional file 2.

Additional file 3.

Additional file 4.

Additional file 5.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Kifle, M.E., Aychiluhm, S.B. & Anbesu, E.W. Global prevalence of intimate partner violence during the COVID-19 pandemic among women: systematic review and meta-analysis. BMC Women's Health 24, 127 (2024). https://doi.org/10.1186/s12905-023-02845-8

Received: 27 December 2022

Accepted: 14 December 2023

Published: 17 February 2024

DOI: https://doi.org/10.1186/s12905-023-02845-8

  • Intimate partner violence
  • Pooled prevalence

BMC Women's Health

ISSN: 1472-6874

COMMENTS

  1. Chapter 14: Completing 'Summary of findings' tables and grading the

    A 'Summary of findings' table for a given comparison of interventions provides key information concerning the magnitudes of relative and absolute effects of the interventions examined, the amount of available evidence and the certainty (or quality) of available evidence.

  2. Summary of Findings Table in a Systematic Review

    A summary of findings table typically includes the following information: A description of the population and setting addressed by the available evidence A description of comparisons addressed in the table, including all interventions A list of the most important outcomes, whether desirable or undesirable (limited to seven)

  3. PDF Summary of findings tables for communicating key findings of systematic

    Summary of findings tables for communicating key findings of systematic reviews. Cochrane Database of Systematic Reviews, 2017(2), Article MR000044. https://doi.org/10.1002/14651858.MR000044 Published in: Cochrane Database of Systematic Reviews Document Version: Publisher's PDF, also known as Version of record

  4. PDF 'Summary of findings' tables in network meta-analysis (NMA)

    Outline: Summary of Findings (SoF) tables in Systematic Reviews and Meta-analysis. Part 2. NMA-SoF table: introduction to the NMA-SoF table project. Part 3. NMA-SoF table examples. Part 4. Q&A. ... Preparing summary of findings tables and evidence profiles-continuous outcomes. Journal of clinical epidemiology. 2013;66(2):173-83.

  5. Chapter 15: Interpreting results and drawing conclusions

    A 'Summary of findings' table, described in Chapter 14, Section 14.1, provides key pieces of information about health benefits and harms in a quick and accessible format.

  6. Completing 'Summary of findings' tables and grading the certainty of

    Planning for the 'Summary of findings' table starts early in the systematic review, with the selection of the outcomes to be included in: the review; and the 'Summary of findings' table. This is a crucial step, and one that review authors need to address carefully.

  7. Interactive Summary of Findings tables: the way to present... : JBI

    The JBI Database of Systematic Reviews and Implementation Reports continues to enhance its systematic reviews by including GRADE (Grading of Recommendations Assessment, Development and Evaluation) Summary of Findings (SoF) tables. Summary of Findings tables are concise, tabular summaries of the evidence that address a specific health-related question. 1 They include information about the main ...

  8. Interactive Summary of Findings tables: the way to present and ...

    Interactive Summary of Findings tables: the way to present and understand results of systematic reviews. JBI Database System Rev Implement Rep. 2019 Mar;17(3):259-260. doi: 10.11124/JBISRIR-D-19-00059. Authors: Holger J Schünemann, Nancy Santesso, Jan L Brozek.

  9. Summary of findings tables for communicating key findings of systematic

    Summary of findings tables for communicating key findings of systematic reviews To assess the effects of 'Summary of findings' tables on communicating key findings of systematic reviews of the effects of healthcare interventions. This will be achieved by:

  10. PDF Summary of Findings Tables for Joanna Briggs Institute Systematic Reviews

    Systematic reviews should be accompanied by a Summary of Findings table.6 The Summary of Findings table should include the question being investigated, the population, intervention and comparison, the outcomes assessed, estimated risk or odds for categorical data or weighted means for continuous data, relative effect, sample size as well as the ...

  11. Summary of Findings tables

    The Summary of Findings table aims to help readers understand the results of a Cochrane review more correctly and find key information faster by: highlighting the most important outcomes, both benefits and harms; presenting what is known and not known about each of these outcomes; and presenting how sure we can be of the evidence for each outcome.

  12. Presenting Results and 'Summary of Findings' Tables

    'Summary of findings' tables. Additional tables. Presenting results in the text. Writing an abstract. Writing a plain language summary. Chapter information. References. Cochrane Handbook for Systematic Reviews of Interventions: Cochrane Book Series.

  13. 11.5.1 Introduction to Summary of findings tables

    Other 'Summary of findings' tables will appear between the Results and Discussion sections. The planning for the 'Summary of findings' table comes early in the systematic review, with the selection of the outcomes to be included in (i) the review and (ii) the 'Summary of findings' table. Because this is a crucial step, and one ...

  14. Completing 'Summary of findings' tables and grading ...

    Abstract. Planning for the 'Summary of findings' table starts early in the systematic review, with the selection of the outcomes to be included in: the review; and the 'Summary of findings ...

  15. Assessment, Development and Evaluation (GRADE) Summary of Findings (SoF

    Objective. Summary of findings (SoF) tables present results of systematic reviews in a concise and explicit format. Adopted by many review groups including the Cochrane Collaboration and the Agency for Healthcare Research and Quality (AHRQ), optimal understanding of SoF table may be influenced by the type of information being conveyed and objectives or preferences of the end user.

  16. Five tips for developing useful literature summary tables for writing

    Literature summary tables are not only meant to provide an overview of basic information (authors, country, purpose and findings) about included articles, but they should also provide detailed information about the theoretical and conceptual frameworks and the methods used in the included article.

  17. About Cochrane Reviews

    What is a summary of findings table? A summary of findings table presents the main findings of a review in a transparent and simple tabular format. In particular, the tables provide key information concerning the quality of evidence, the magnitude of effect of the interventions examined, and the sum of available data on the main outcomes.

  18. Use of a search summary table to improve systematic review search

    Publishing a search summary table in all systematic reviews would add to the growing evidence base about information retrieval, which would help in determining which databases to search for which type of review (in terms of either topic or scope), what supplementary search methods are most effective, what type of literature is being included, an...

  19. Completing 'Summary of findings' tables and grading the certainty of

    Completing 'Summary of findings' tables and grading the certainty of the evidence. Cochrane Handbook for Systematic Reviews of Interventions, Wiley Online Library. Authors include Holger J Schünemann, Julian PT Higgins, and Matthew J. Page.

  20. 'It depends': what 86 systematic reviews tell us about what strategies

    Thirty-two studies were included in the systematic review. Table 1 provides a detailed overview of the included systematic reviews comprising reference, ... A summary is provided in Table ... Atheoretical reviews tend to present acontextual findings (for instance, one study found very positive results for one intervention, and this gets ...

  21. Synthesis methods used to combine observational studies and randomised

    The search strategy and selection process for eligible systematic reviews have been previously described in detail []. Briefly, we searched Medline via PubMed to identify systematic reviews that included RCTs and OSs evaluating the effect of healthcare interventions, published between January 2015 and December 2019 in general and internal medicine or public health journals with an impact factor ...

  22. Comparison of clinical and radiological outcomes for the anterior and

    A systematic review provides the best opportunity to deliver the highest possible quality of evidence for bilateral DDH surgical management. The protocol has been registered in the International Prospective Register of Systematic Reviews (PROSPERO ID CRD42022362325). ... Introduction-GRADE evidence profiles and summary of findings tables. J ...

  23. Global disease burden of and risk factors for acute lower respiratory

    We conducted a systematic review and meta-analysis of aggregated data from studies published between Jan 1, 1995, and Dec 31, 2021, identified from MEDLINE, Embase, and Global Health, and individual participant data shared by the Respiratory Virus Global Epidemiology Network on respiratory infectious diseases.

  24. Barriers and facilitators for recruiting and retaining male

    Based on the findings of this systematic review, researchers from longitudinal health-related clinical trials are encouraged to consider male-specific recruitment strategies to ensure successful recruitment and retention in their studies. This systematic review is registered with the PROSPERO database (CRD42021254696).

  25. Gut microbiota in patients with prostate cancer: a systematic review

    A systematic search was conducted across PubMed, Web of Science, and Embase databases, in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The methodological quality of included studies was evaluated using the Newcastle-Ottawa Scale (NOS), and pertinent data were analyzed.

  26. Global prevalence of intimate partner violence during the COVID-19

    Despite this, there is a lack of representative data on intimate partner violence during the COVID-19 pandemic and inconsistent findings. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines were used to develop the systematic review and meta-analysis.

  27. A protocol for an overview of systematic reviews to map photodynamic

    This is a protocol for an overview to summarize the findings of Systematic Reviews (SR) dealing with Photodynamic Inactivation (PDI) for control of oral diseases. Specific variables of oral infections will be considered as outcomes, according to dental specialty. Cochrane Database of Systematic Reviews (CDSR), MEDLINE, LILACS, Embase, and Epistemonikos will be searched, as well as reference lists.
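
Several entries above describe what a Summary of Findings table reports for each outcome: the number of participants and studies, the relative effect, the absolute effect, and the certainty of the evidence. As a rough illustration of how one row might be represented and how an absolute effect follows from a relative one, here is a minimal sketch. The class name `SoFRow`, its field names, and the numbers are hypothetical; only the arithmetic (corresponding risk = assumed comparator risk × risk ratio) reflects the usual GRADE presentation for a risk ratio.

```python
from dataclasses import dataclass


@dataclass
class SoFRow:
    """One outcome row of a Summary of Findings table (hypothetical layout)."""
    outcome: str
    participants: int
    studies: int
    relative_risk: float          # risk ratio, intervention vs comparator
    assumed_risk_per_1000: float  # baseline risk in the comparator group
    certainty: str                # GRADE rating: high, moderate, low, very low

    def corresponding_risk_per_1000(self) -> float:
        # Risk with the intervention, derived from the baseline risk and the RR
        return self.assumed_risk_per_1000 * self.relative_risk

    def absolute_difference_per_1000(self) -> float:
        # Negative values mean fewer events per 1000 with the intervention
        return self.corresponding_risk_per_1000() - self.assumed_risk_per_1000


# Hypothetical values for illustration only
row = SoFRow("Hypothetical outcome", participants=1200, studies=4,
             relative_risk=0.75, assumed_risk_per_1000=200.0, certainty="moderate")
print(row.outcome, row.corresponding_risk_per_1000(), row.absolute_difference_per_1000())
```

Keeping the relative and absolute effects in the same row makes it harder to report a relative effect without the baseline risk needed to interpret it.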