Section 8

Data Synthesis

Last updated: September 25th 2022

For CEE Standards for conduct and reporting of data synthesis click here.

8.1 Developing data synthesis methods

Data synthesis refers to the collation of all relevant evidence identified in the Systematic Review in order to answer the review question. A narrative synthesis of the data should always be planned, involving listing of eligible studies and tabulation of their key characteristics and outcomes. For Systematic Reviews, if evidence is available in a suitable format and quantity, then a quantitative synthesis, such as aggregation by meta-analysis, may also be planned.

The likely form of the data synthesis may be informed by the previous pilot-testing of the data extraction and critical appraisal steps. For example, the Review Team may identify whether the studies reported in the articles are likely to be of sufficient quality to allow relatively robust statistical synthesis and what sorts of study designs are appropriate to include. This pilot-testing process should also inform the approach to the synthesis by allowing, for example: the identification of the range of data types and methodological approaches; the determination of appropriate effect size metrics and analytical approaches (e.g. meta-analysis or qualitative synthesis); and the identification of study covariates.

This section includes an overview of different forms of synthesis: narrative, quantitative and qualitative. All Systematic Reviews should present some form of narrative synthesis and many will contain more than one of these approaches (e.g. Bowler et al. 2010). It is not the intention here to give detailed guidelines on synthesis methods since each has its own supporting literature. This Section concentrates on how to decide which form of synthesis is appropriate.

8.2 Systematic Reviews

8.2.1 Narrative synthesis

Narrative synthesis is the tabulation and/or visualisation (often with descriptive statistics) of the findings of individual primary studies, with supporting text to explain the context. A narrative synthesis is often viewed as preparatory when compared with quantitative synthesis. This may be true in terms of analytical rigour and statistical power, but narrative synthesis has advantages when dealing with broader questions and disparate outcomes. Often narrative synthesis is the only option when faced with a pool of disparate studies of relatively high susceptibility to bias, but such syntheses also accompany quantitative syntheses in order to provide context and background and to help characterise the full evidence base. Some form of narrative synthesis should be provided in any Systematic Review, simply to present the context and an overview of the evidence. A valuable guide to the conduct of narrative synthesis is provided by Popay (2006).

Narrative synthesis requires the construction of tables, developed from data coding and extraction forms (see Section 8), that provide details of the study or population characteristics, data quality, and relevant outcomes, all of which should have been defined a priori in the Protocol. Narrative synthesis should include a statement of the measured effect reported in each study and the Review Team’s assessment of study validity (including internal and external validity). Where the validity of studies varies greatly, reviewers may wish to give greater weight to some studies than others. In these instances it is vital that the studies have been subject to standardised a priori critical appraisal, with the value judgements regarding both internal and external validity clearly stated. Ideally these will have been subject to stakeholder scrutiny at the Protocol stage. The level of detail employed and the emphasis placed on narrative synthesis will depend on whether other types of synthesis are also employed. Examples of an entirely narrative synthesis (Davies et al. 2006) and of a narrative synthesis that complements a quantitative synthesis (Bowler et al. 2010) are available in the CEE Library.

Use of simple vote counting as a form of synthesis (e.g. comparing how many studies showed a positive versus negative or neutral outcome based on statistical significance of the results) should be avoided. Vote counting is misleading because it does not take into account differences in study validity and power. Moreover, vote counting does not provide an estimate of the magnitude of the effect in question. Whilst tabulation may make it easy for the reader to vote count, the authors should avoid its use in developing and reporting their findings.
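The following minimal sketch illustrates why vote counting can mislead. The five effect estimates and standard errors are hypothetical values invented for illustration, and the inverse-variance pooling used for comparison is one common weighting choice rather than a method prescribed here.

```python
import math

# Hypothetical (effect estimate, standard error) pairs; not real data.
studies = [(0.30, 0.20), (0.25, 0.18), (0.35, 0.22), (0.28, 0.19), (0.32, 0.21)]

def is_significant(effect, se, z_crit=1.96):
    """Two-sided z-test at roughly the 5% level for a single study."""
    return abs(effect / se) > z_crit

# Vote count: how many studies are individually "significant"?
votes_significant = sum(is_significant(e, s) for e, s in studies)

# Inverse-variance weighted pooled estimate across the same studies.
weights = [1.0 / s**2 for _, s in studies]
pooled = sum(w * e for w, (e, _) in zip(weights, studies)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))

print(f"Studies individually significant: {votes_significant} of {len(studies)}")
print(f"Pooled effect: {pooled:.3f} (SE {pooled_se:.3f}, z = {pooled / pooled_se:.2f})")
```

In this constructed case every study points in the same direction but none reaches significance alone, so a vote count would suggest no effect, whereas the pooled estimate reflects the consistent direction and the combined precision.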

Recording of key characteristics of each study included in a narrative synthesis is vital if the Systematic Review is to be useful in summarising the evidence base. Key characteristics are normally presented in tabular form and a minimum list is given below.

  • Article reference
  • Subject population
  • Intervention/exposure variable
  • Setting/context
  • Outcome measures
  • Methodological design
  • Relevant reported results

It should be noted here that the interpretation of the results provided by the authors of the study is normally not summarised as this could simply compound subjective assessments or decisions.
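As an illustration only, the minimum list above could be captured as a simple record structure and exported as a table. The field names, the `StudyRecord` class and the placeholder entry are hypothetical and not part of any CEE template.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

@dataclass
class StudyRecord:
    """One row of a key-characteristics table (hypothetical field names)."""
    article_reference: str
    subject_population: str
    intervention_exposure: str
    setting_context: str
    outcome_measures: str
    methodological_design: str
    reported_results: str

# Placeholder entry for illustration; not a real study.
records = [
    StudyRecord(
        article_reference="Example Author (year)",
        subject_population="placeholder population",
        intervention_exposure="placeholder intervention",
        setting_context="placeholder setting",
        outcome_measures="placeholder outcome",
        methodological_design="placeholder design",
        reported_results="placeholder result as reported",
    )
]

def to_csv(rows):
    """Write the records to CSV text with one column per key characteristic."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(StudyRecord)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()

table = to_csv(records)
print(table)
```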

8.2.2 Quantitative data synthesis

Usually, when attempting to measure the effect of an intervention or exposure, a quantitative synthesis is desirable. This provides a combined effect and a measure of its variance within and between studies. Quantitative syntheses can be powerful in the sense of enabling the study of the impacts of effect modifiers and increasing power to predict outcomes of interventions or exposures under varying environmental conditions.
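Before results can be combined, each study's outcome is usually expressed as an effect size with a sampling variance. The sketch below computes two metrics widely used in ecology, the log response ratio and the bias-corrected standardised mean difference (Hedges' g), from group means, standard deviations and sample sizes. The function names and example inputs are hypothetical, and the variance formulas are the usual large-sample approximations.

```python
import math

def log_response_ratio(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Log response ratio ln(mean_t / mean_c) and its approximate variance.

    Both means must be positive for the ratio to be defined.
    """
    effect = math.log(mean_t / mean_c)
    variance = sd_t**2 / (n_t * mean_t**2) + sd_c**2 / (n_c * mean_c**2)
    return effect, variance

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference with small-sample correction, and its variance."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample bias correction
    g = correction * d
    variance = correction**2 * ((n_t + n_c) / (n_t * n_c) + d**2 / (2.0 * (n_t + n_c)))
    return g, variance

# Hypothetical treatment and control summaries for one study.
lrr, lrr_var = log_response_ratio(20.0, 4.0, 10, 10.0, 2.0, 10)
g, g_var = hedges_g(12.0, 2.0, 10, 10.0, 2.0, 10)
print(f"lnRR = {lrr:.3f} (variance {lrr_var:.4f})")
print(f"Hedges' g = {g:.3f} (variance {g_var:.4f})")
```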

Meta-analysis and meta-regression are now commonly used in the environmental sciences and there is a well-developed supporting literature (e.g. Arnqvist & Wooster 1995; Osenberg et al. 1999; Gurevitch & Hedges 2001; Gates 2002; Borenstein et al. 2009; Koricheva et al. 2013) as well as online guidance and training; consequently, we have not provided detailed guidance here. Meta-analysis provides summary effect sizes with each dataset weighted according to some measure of its reliability (e.g. with more weight given to large studies with precise effect estimates and less to small studies with imprecise effect estimates). Generally, each study is weighted in proportion to its sample size or in inverse proportion to the variance of its effect. Meta-regression aims to provide summary effects after adjusting for study-level covariates.
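As a worked illustration of inverse-variance weighting, one common form of the weighted summary effect can be written as follows. The symbols are notation assumed here for illustration: $y_i$ is the effect size from study $i$, $v_i$ its sampling variance, and $k$ the number of studies.

```latex
w_i = \frac{1}{v_i}, \qquad
\bar{y} = \frac{\sum_{i=1}^{k} w_i \, y_i}{\sum_{i=1}^{k} w_i}, \qquad
\operatorname{Var}(\bar{y}) = \frac{1}{\sum_{i=1}^{k} w_i}
```

A study with half the sampling variance of another therefore receives twice the weight, and the variance of the summary effect shrinks as more (or more precise) studies are added.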

Pooling of individual effects can be undertaken with fixed-effects or random-effects statistical models. Fixed-effects models estimate the combined effect assuming there is a single true underlying effect across the studies, whereas random-effects models assume there is a distribution of effects that depends on study characteristics. Random-effects models include inter-study variability; thus, when there is heterogeneity, a random-effects model usually has wider confidence intervals on its pooled effect than a fixed-effects model (NHS CRD 2001; Khan et al. 2003). Random- or mixed-effects models (containing both random and fixed effects) are often more appropriate for the analysis of ecological data because the numerous complex interactions common in ecology are likely to result in heterogeneity between studies or sites. Exploration of heterogeneity is often more important than the overall pooling from a management perspective, as there is rarely a one-size-fits-all solution to environmental problems.
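The contrast between the two models can be sketched as follows. This example assumes the DerSimonian-Laird moment estimator for the between-study variance, which is one of several available estimators and is not specified in this guidance; the effect sizes and variances are hypothetical.

```python
import math

def pool(effects, variances, tau2=0.0):
    """Inverse-variance pooled mean and standard error.

    tau2 = 0 gives the fixed-effects estimate; tau2 > 0 adds the
    between-study variance to each study's variance (random effects).
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, effects)) / total
    return mean, math.sqrt(1.0 / total)

def dersimonian_laird_tau2(effects, variances):
    """Moment estimate of between-study variance (assumed estimator)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi**2 for wi in w) / sw
    df = len(effects) - 1
    return max(0.0, (q - df) / c)

# Hypothetical heterogeneous effect sizes and sampling variances.
effects = [0.10, 0.45, 0.80, -0.05, 0.60]
variances = [0.02, 0.03, 0.04, 0.02, 0.05]

tau2 = dersimonian_laird_tau2(effects, variances)
fixed_mean, fixed_se = pool(effects, variances)
random_mean, random_se = pool(effects, variances, tau2)

for label, mean, se in [("fixed", fixed_mean, fixed_se), ("random", random_mean, random_se)]:
    print(f"{label:>6}: {mean:.3f} [{mean - 1.96 * se:.3f}, {mean + 1.96 * se:.3f}]")
```

With heterogeneous inputs such as these, the estimated between-study variance is positive and the random-effects interval is wider than the fixed-effects interval, as described above.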

Relationships between differences in characteristics of individual studies and heterogeneity in results can be investigated as part of the meta-analysis, thus aiding the interpretation of ecological relevance of the findings. Exploration of these differences may be facilitated by construction of tables that group studies with similar characteristics and outcomes together. Important factors that could produce variation in effect size should be defined a priori and their relative importance considered prior to data extraction to make the most efficient use of data. These factors may include differing populations, interventions, outcomes, and methodology. Resulting variation in effect sizes across studies can then be explored by meta-regression.

If sufficient data exist, meta-analyses are often undertaken on subgroups and the significance of differences assessed. Subgroup analyses must be interpreted with caution because statistical power may be limited (Type II errors possible) and multiple analyses of numerous subgroups could result in spurious significance (Type I errors possible). A mixed-effects meta-regression approach might be adopted whereby statistical models including study-level covariates are fitted to the full dataset, with studies weighted according to the precision of the estimate of treatment effect after adjustment for covariates (Sharp 1998).
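A meta-regression with a single study-level covariate can be sketched as a weighted least-squares fit. This is a deliberately simplified version: it weights by inverse sampling variance only and omits the residual between-study variance that a mixed-effects model would add. The function name, covariate and data are hypothetical.

```python
import math

def weighted_meta_regression(effects, variances, covariate):
    """Weighted least-squares fit of effect size on one covariate.

    Weights are inverse sampling variances (treated as known), so this is a
    fixed-effects meta-regression, not the mixed-effects model itself.
    Returns (intercept, slope, standard error of the slope).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    slope_se = math.sqrt(1.0 / sxx)
    return intercept, slope, slope_se

# Hypothetical effect sizes, sampling variances and a study-level covariate.
effects = [0.12, 0.22, 0.24, 0.33, 0.31]
variances = [0.02, 0.03, 0.04, 0.02, 0.05]
covariate = [1.0, 2.0, 3.0, 4.0, 5.0]

intercept, slope, slope_se = weighted_meta_regression(effects, variances, covariate)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f} (SE {slope_se:.3f})")
```

In practice a dedicated meta-analysis package that also estimates residual heterogeneity would normally be preferred over a hand-rolled fit like this one.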

Despite the attempt to achieve objectivity in reviewing scientific data, considerable subjective judgement is involved when undertaking meta-analyses. These judgements include decisions about the choice of effect measure, how data are combined to form datasets, which datasets are relevant and which are methodologically sound enough to be included, the methods of meta-analysis, and whether and how to investigate sources of heterogeneity (Thompson 1994). Reviewers should state these decisions explicitly and distinguish between them to minimise bias and increase transparency.

If possible, a quantitative synthesis should be accompanied by an exploration of possible effects of publication bias. Positive and/or statistically significant results are more readily available than non-significant or negative results because they are more likely to be published in high-impact journals and in the English language. Whilst searching methodology can reduce this bias, it is still uncertain how influential it might be. There are a number of exploratory plots and tests for publication bias. One example is the funnel plot, often accompanied by the Egger test (Egger et al. 1997). This approach aims to test for a relationship between the size and precision of study effects, plotted on the x- and y-axes of the funnel plot. However, a funnel plot can change greatly depending on the scale of the precision (Lau et al. 2006), and only the trial size is appropriate for effect measures used in ecology, such as the standardised mean difference (SMD) or response ratio. Another approach is to calculate the fail-safe number, which is the number of null-result studies that would have to be added to a meta-analysis to lower the significance or the magnitude of the effect to a specified level (e.g. where it would be considered statistically or biologically non-significant), but see Scargle (2000). Wherever possible, grey literature and unpublished studies should be included in a meta-analysis to allow direct assessment of publication bias by comparison of effect sizes in published and unpublished studies.
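Two of the diagnostics mentioned above can be sketched as follows. The first function follows the classical form of Egger's regression (standardised effect regressed on inverse standard error, with the intercept indicating asymmetry); note that it uses standard error as the precision scale, which is subject to the caveat about precision scales given above. The second assumes Rosenthal's formulation of the fail-safe number, which is one of several variants. Function names and data are hypothetical.

```python
import math

def egger_intercept(effects, std_errors):
    """Intercept (and its SE) from regressing effect/SE on 1/SE.

    An intercept far from zero relative to its SE suggests funnel-plot
    asymmetry. Requires at least three studies with differing SEs.
    """
    z = [y / s for y, s in zip(effects, std_errors)]
    precision = [1.0 / s for s in std_errors]
    n = len(z)
    x_bar = sum(precision) / n
    z_bar = sum(z) / n
    sxx = sum((x - x_bar) ** 2 for x in precision)
    sxy = sum((x - x_bar) * (zi - z_bar) for x, zi in zip(precision, z))
    slope = sxy / sxx
    intercept = z_bar - slope * x_bar
    rss = sum((zi - intercept - slope * x) ** 2 for x, zi in zip(precision, z))
    resid_var = rss / (n - 2)
    intercept_se = math.sqrt(resid_var * (1.0 / n + x_bar**2 / sxx))
    return intercept, intercept_se

def rosenthal_fail_safe_n(effects, std_errors, z_crit=1.645):
    """Number of null-result studies needed to bring the combined
    (unweighted, one-tailed) z-score down to z_crit. Assumed variant."""
    z_sum = sum(y / s for y, s in zip(effects, std_errors))
    k = len(effects)
    return max(0.0, (z_sum / z_crit) ** 2 - k)

# Hypothetical data in which smaller studies (larger SE) report larger effects.
effects = [0.20, 0.25, 0.40, 0.55, 0.70]
std_errors = [0.05, 0.08, 0.15, 0.22, 0.30]

intercept, intercept_se = egger_intercept(effects, std_errors)
n_fs = rosenthal_fail_safe_n(effects, std_errors)
print(f"Egger intercept = {intercept:.2f} (SE {intercept_se:.2f})")
print(f"Fail-safe number = {n_fs:.1f}")
```

A formal test would compare the intercept divided by its standard error against a t distribution with n - 2 degrees of freedom; that step is left out here to keep the sketch dependency-free.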