Interpreting findings and reporting conduct
Last updated: September 25th 2022
9.1 The interpretation of evidence syntheses
CEE evidence synthesis methodologies seek to collate and synthesise data in order to present reliable evidence in relation to the review question. The strength of the evidence base and implications of the results for decision-making require careful consideration and interpretation. The discussion and conclusions may consider the implications of the evidence in relation to practical decisions, but the decision-making context may vary, leading to different decisions based on the same evidence. Authors should, where appropriate, explicitly acknowledge the variation in possible interpretation and simply present the evidence so as to inform rather than offer advice. Recommendations that depend on assumptions about resources and values should be avoided (Khan et al. 2003, Deeks et al. 2005).
Deeks et al. (2005) offer advice that is of relevance here. Authors and end-users should be wary of the pitfalls surrounding inconclusive evidence, and of unwittingly introducing bias in the desire to draw conclusions rather than pointing out the limits of current knowledge. Where reviews are inconclusive because there is insufficient evidence, it is important not to confuse ‘no evidence of an effect’ with ‘evidence of no effect’. The former may not provide a basis for change to existing policy or practice, but has an important bearing on future research, whereas the latter could have considerable ramifications for current policy or practice.
Review authors, and to a lesser extent end-users, may be tempted to reach conclusions that go beyond the evidence that is reviewed, or to present only some of the results. Authors must be careful to be balanced when reporting on and interpreting results. For example, if a ‘positive’ but statistically non-significant trend is described as ‘promising’, then a ‘negative’ effect of the same magnitude should be described as a ‘warning sign’. Other examples of unbalanced reporting include one-sided reporting of sensitivity analyses, or explaining non-significant positive results but not negative ones. If the confidence interval for the estimate of difference in the effects of interventions overlaps the null value, the analysis is compatible with both a true beneficial effect and a true harmful effect. If one of the possibilities is mentioned in the conclusion, the other possibility should be mentioned as well, and both should be given equal consideration in the discussion of results. One-sided attempts to explain results with reference to indirect evidence external to the review should be avoided. Considering results in a blinded manner can avoid these pitfalls (Deeks et al. 2005). Authors should consider how the results would be presented and framed in the conclusions and discussion if the direction of the results were reversed.
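The point about a confidence interval overlapping the null can be made concrete with invented numbers (a minimal sketch; the effect estimate and standard error below are hypothetical, not drawn from any review):

```python
# Hypothetical difference in effect between intervention and control,
# with its standard error (both values invented for illustration).
diff = 0.8   # estimated difference in effect
se = 0.5     # standard error of the difference

# 95% confidence interval using the normal approximation
z = 1.96
ci_low = diff - z * se
ci_high = diff + z * se

print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")

# The interval spans 0, so the data are compatible with both a true
# beneficial effect and a true harmful effect; a balanced report must
# mention both possibilities, not only the one in the expected direction.
overlaps_null = ci_low < 0 < ci_high
```

Here the point estimate is ‘positive’, yet the interval extends below zero, so describing the result as ‘promising’ without acknowledging the compatible harmful effect would be exactly the unbalanced reporting cautioned against above.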
9.1.1 Limitations of an evidence synthesis
Biases can arise during the evidence synthesis process that do not impair the raw data themselves but may affect the findings of the synthesis (through a biased sample of articles); these biases should be fully considered and reported (see review in Borenstein et al. 2009). For example:
Publication bias: statistically significant results are more likely to be published than non-significant ones. Yet there is no strict relationship between the quality of the methodology and the significance of results, and thus their publication: a sound methodology may lead to non-significant results that remain unpublished as grey literature.
Language bias: searching is generally undertaken in English because it is the most common language used in scientific writing. This may result in an over-representation of statistically significant results (Egger et al. 1997; Jüni et al. 2002) because they are more likely to be accepted in the English-language scientific literature.
Availability bias: only studies that are easily available are included in the analysis, whilst other significant results may exist but be less easily available (an increasing problem as many private companies have their own research teams and publish their own articles or reports). Similarly, a confidentiality bias may exist for some sensitive topics (e.g. GMOs, nuclear power) because some research results may be withheld for security reasons.
Cost bias: the time and resources necessary for a thorough search are not always available, which can lead to the selection only of studies available free of charge or at low cost.
Familiarity bias: the researcher limits the search to articles relevant to his or her own discipline.
Duplication bias: some studies with statistically significant results may be published more than once (Tramer et al. 1997).
Citation bias: studies with significant results are more likely to be cited by other authors and are thus easier to find during the search (Gøtzsche 1997; Ravnskov 1992).
All these biases can be considered when reporting the ‘limitations of the evidence synthesis’, and several methods exist to quantify their impact on the results (Borenstein et al. 2009).
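One widely used method for quantifying possible publication bias is Egger's regression test for funnel-plot asymmetry (Egger et al. 1997): the standardised effect (effect/SE) is regressed on precision (1/SE), and an intercept far from zero suggests small-study effects. The sketch below uses invented effect sizes and standard errors purely for illustration:

```python
# Egger's regression test for funnel-plot asymmetry.
# Effect sizes and standard errors are invented for demonstration only.
effects = [0.50, 0.42, 0.65, 0.30, 0.80, 0.55]
ses =     [0.10, 0.12, 0.20, 0.08, 0.30, 0.15]

# Regress the standardised effect (effect/SE) on precision (1/SE).
y = [e / s for e, s in zip(effects, ses)]
x = [1 / s for s in ses]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Ordinary least-squares slope and intercept
slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
intercept = my - slope * mx

# An intercept substantially different from zero hints at asymmetry
# (in practice its significance would be tested, e.g. with a t-test).
print(f"Egger intercept: {intercept:.2f}, slope: {slope:.2f}")
```

In a real synthesis the intercept's standard error and a formal significance test would be reported, and the result interpreted cautiously: asymmetry can reflect genuine heterogeneity as well as publication bias.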
9.2 Reporting conduct of evidence syntheses
CEE standards require a high level of reporting of the conduct of evidence syntheses so as to ensure transparency and repeatability, allowing others to test the replicability of findings.
Each of the conduct Sections (5–9) has guidance on reporting. In addition, CEE now recommends using the RepOrting standards for Systematic Evidence Syntheses (ROSES) checklist (Haddaway et al. in press), as this will be used by editors and peer reviewers when appraising reports.
9.3 Reporting findings of evidence syntheses
Evidence Syntheses are most often conducted to assess available evidence of effectiveness or of impact. In so doing, Systematic Reviews (not SMs) assess the strength of a causal inference (Hill 1971). Aspects that may be reported in the conclusion section include:
1. The quality/reliability of the included studies.
2. The relevance/external validity of the included studies.
3. The size and statistical significance of the observed effects.
4. The consistency of the effects across studies or sites and the extent to which this can be explained by other variables (effect modifiers).
5. The clarity of the relationship between the intensity of the intervention and the outcome.
6. The existence of any indirect evidence that supports or refutes the inference.
7. The lack of other plausible competing explanations of the observed effects (bias or confounding).
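Point 4 above, the consistency of effects across studies, is commonly quantified in meta-analysis with Cochran's Q statistic and the derived I² measure (Borenstein et al. 2009). The sketch below uses invented effect sizes and standard errors to show the calculation:

```python
# Fixed-effect pooling with Cochran's Q and I² as a consistency check.
# Effect sizes and standard errors are invented for illustration.
effects = [0.30, 0.45, 0.10, 0.60]
ses =     [0.12, 0.15, 0.10, 0.20]

# Inverse-variance weights and the pooled effect
weights = [1 / s**2 for s in ses]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted squared deviations from the pooled effect
q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
df = len(effects) - 1

# I²: percentage of variation across studies beyond what chance predicts
i2 = max(0.0, (q - df) / q) * 100

print(f"pooled = {pooled:.2f}, Q = {q:.2f}, I2 = {i2:.0f}%")
```

A high I² signals substantial heterogeneity, prompting the reviewer to explore effect modifiers (point 4) rather than to report a single pooled effect as if it applied uniformly.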
In a review concerning the impacts of liming streams and rivers on fish and invertebrates, Mant et al. (2011) discuss all of the above points in a good example of Systematic Review conclusions. In addition to discussing the limitations of their review, the authors describe the range of quality of the included studies, the size and consistency of the effect observed across studies, the link between intervention intensity and outcome, the presence of effect modifiers, the presence of evidence supporting or refuting the review findings, and the potential for other causative factors behind the observed effects.
There is a range of approaches to grading the strength of evidence presented in health-related reviews, but there is no universal approach (Deeks et al. 2005). We suggest that authors of reviews in environmental management explicitly state weaknesses associated with each of the aspects above, but the overall impact they make on conclusions can only be considered subjectively.
9.3.1 Implications for policy and practice
A key objective of a Systematic Review is to inform decision-makers of the implications of the best available evidence relating to a question of concern, and to enable them to place this evidence in context in order to decide on the best course of action. Providing evidence that increases the capacity to predict the outcomes of alternative actions should lead to better decision-making.
End-users will need to decide, either implicitly or explicitly, how applicable the evidence presented in a Systematic Review is to their particular circumstances (Deeks et al. 2005). This is particularly critical in environmental management where many factors may vary between sites and it seems likely that many interventions/actions will vary in their effectiveness/impact depending on a wide range of potential environmental variables. Authors should therefore highlight where the evidence is likely to be applicable and equally importantly where it may not be applicable with reference to variation between studies and study characteristics (see 8.3 External validity).
Clearly, variation in the ecological context and geographical location of studies can limit the applicability of results. Authors should be aware of the timescale of included studies, which may be too short to support long-term predictions. Variation in the application of the intervention may also be important (and difficult to predict); authors should be aware of differences between ex situ and in situ treatments (measuring efficacy and effectiveness respectively) where they are combined, and should also consider the implications of applying the same intervention at different scales. Variation in baseline risk may also be an important consideration in determining the applicability of results, as the net benefit of any intervention depends on the risk of adverse outcomes without intervention, as well as on the effectiveness of the intervention (Deeks et al. 2005).
Where review authors identify predictable variation in the relative effect of the intervention or exposure in relation to the specified reasons for heterogeneity, these should be highlighted. However, these relationships require cautious interpretation (because they are only correlations), particularly where sample sizes are small, data points are not fully independent and multiple confounding occurs. When reporting implications of the review findings for policy and practice, the emphasis should be on objective information and not on subjective advocacy.
9.3.2 Implications for research
Rather like primary scientific studies, most Systematic Reviews will generate more questions than they answer. Knowledge gaps will be frequent, as will areas where the quality of science conducted to date is inadequate. In conducting a Systematic Review, critically appraising the quality of existing studies and assessing the available evidence in terms of its fitness for purpose, reviewers should be able to draw conclusions concerning the need for further research. This need may simply be reported in the form of knowledge gaps, but may often consist of recommendations for the design of future studies that would generate data of sufficient quality to improve the evidence base and decrease the uncertainty surrounding the question.
9.4 Format for CEE Reports
The format for submitting full reports can be found on the Environmental Evidence website by following the links below:
For Systematic Maps
9.4.1 Additional files
To maximise transparency Systematic Reviews should normally be supported by a number of supplementary materials made available in additional files linked from the main text. Authors should refer to ROSES for guidance on what should be reported. The following is a minimal list of expected information (note other additional files may be provided depending on the size and complexity of the synthesis):
1. A report of literature scoping containing the combinations of search strings and the outcome of searches of different databases (usually provided as an appendix to the Protocol).
2. A list of articles excluded after reading the full text, including reasons for exclusion (note: a list of articles included is expected in the main text).
3. A list of articles that could not be obtained at full text: such articles are therefore potentially relevant but not fully screened.
4. Data extraction and validity assessment tables for Systematic Reviews or data coding tables for Systematic Maps; for example, Excel files with data extracted from each included study (this may be included in the main text if a small number of studies is included or may be provided in several files for larger Systematic Reviews).