What is CEESAT?
CEESAT is the CEE Synthesis Assessment Tool that is used to critically appraise each review and overview before inclusion in the CEEDER Database.
CEESAT has been supported by Mistra EviEM (Sweden), which hosted a CEE workshop for a group of invited experts working to further improve the tool.
Read about the CEESAT criteria in an article published by Paul Woodcock and colleagues in Biological Conservation and on the CEE website.
About current CEESAT Criteria
- Download the CEESAT criteria for Evidence Reviews
- Download the CEESAT criteria for Evidence Overviews
See here for the full CEEDER methodology.

The CEESAT workshop group. From left: Jacqui Eales, Neal Haddaway, Ruth Garside, Nicola Randall, Barbara Livoreil, Andrew Pullin, Geoff Frampton, Christian Kohl and Biljana Macura.
The CEESAT checklist provides a point-by-point appraisal of the confidence that can be placed in the findings of an evidence review. It assesses the rigour of the methods used in the review, the transparency with which those methods are reported, and the limitations imposed on synthesis by the quantity and quality of available primary data. Note that CEESAT does not distinguish between reviews that do not employ methodology that reduces risk of bias and increases reliability of findings, and reviews that may have employed such methodology but do not report it.
Each component of the review process is appraised according to a set of criteria. Each criterion is assigned a rating ranging from Red (poor quality) to Gold (high quality). Explore review components, sets of criteria, and thresholds for different ratings below.
How are evidence reviews identified and rated?
Step 1: We systematically search multiple databases and search engines to collect potential environmental evidence reviews. Searches are updated regularly.
Step 2: We use a set of eligibility criteria (see below) to screen potential reviews for inclusion in the CEEDER database.
Step 3: Eligible evidence reviews are randomly allocated to Review College members for rating. Members rate the reliability of each review using the CEESAT criteria (see below).
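As a loose illustration of the allocation step (the exact CEEDER workflow is not specified here, so every name and number below is invented), random allocation can be as simple as shuffling and dealing out the reviews:

```python
import random

# Invented example data: treat this as a sketch, not CEEDER's actual process.
eligible_reviews = [f"review_{i}" for i in range(10)]
college_members = ["member_a", "member_b", "member_c"]

random.shuffle(eligible_reviews)
# Deal the shuffled reviews out round-robin so each member receives
# a roughly equal, randomly chosen share.
allocation = {member: eligible_reviews[i::len(college_members)]
              for i, member in enumerate(college_members)}
print(allocation)
```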
Using the CEESAT criteria as an indicator of review reliability
The CEESAT criteria were developed to critically appraise reviews in terms of transparency, repeatability and risk of bias. For each of 16 elements, a review is rated in one of the following four categories (a short illustrative sketch follows the list):
Gold: Meets the standards of conduct and/or reporting that reduce risk of bias as much as could reasonably be expected. Lowest risk of bias – high repeatability – highest reliability/confidence in findings.
Green: Acceptable standard of conduct/reporting that reduces risk of bias. Acceptable risk of bias – repeatable – acceptable reliability/confidence in findings.
Amber: Deficiencies in conduct and/or reporting standards such that the risk of bias is increased (above Green); alternatively, the risk of bias may be less easy to assess. Medium risk of bias – not fully repeatable – low reliability/confidence in findings.
Red: Serious deficiencies in conduct and/or reporting such that the risk of bias is high. High risk of bias – not repeatable – little to no confidence in findings.
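As a small illustration, the ordered nature of the scale can be captured in a few lines of Python. The Rating enum and the appraisal dictionary here are our own invention, not an official CEEDER data format; only the category names and their ordering come from the definitions above.

```python
from enum import IntEnum

class Rating(IntEnum):
    """The four CEESAT categories, ordered so that higher means more reliable."""
    RED = 1    # serious deficiencies, high risk of bias
    AMBER = 2  # deficiencies, medium risk of bias
    GREEN = 3  # acceptable conduct/reporting, acceptable risk of bias
    GOLD = 4   # risk of bias reduced as much as could reasonably be expected

# A review's appraisal: one rating per element, keyed by element number.
# CEESAT has 16 elements; only two hypothetical entries are shown here.
appraisal = {3: Rating.GREEN, 4: Rating.GOLD}

# Because the enum is ordered, threshold checks read naturally:
assert appraisal[3] >= Rating.GREEN
```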
When finding a review, or comparing reviews, relevant to your evidence needs, you can either use the ratings as a whole to judge review reliability, or look at the particular elements that you feel are important for the context in which you are working. For example, you may feel that a comprehensive search strategy and clear eligibility criteria are crucial for you to have confidence in the findings, in which case you might want criteria 3 and 4 to be rated Gold or Green.
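A minimal sketch of that kind of element-level filter follows. The review records are invented for illustration; only the rule itself (elements 3 and 4 rated Gold or Green) comes from the example above.

```python
ACCEPTABLE = {"Gold", "Green"}

# Invented example records; real CEEDER entries are richer than this.
candidate_reviews = [
    {"title": "Review A", "ratings": {3: "Gold", 4: "Green"}},
    {"title": "Review B", "ratings": {3: "Amber", 4: "Gold"}},
]

def meets_threshold(ratings, elements, acceptable=ACCEPTABLE):
    """True if every listed element is rated Gold or Green."""
    return all(ratings.get(element) in acceptable for element in elements)

# Keep only reviews whose search strategy (element 3) and eligibility
# criteria (element 4) are both rated Gold or Green.
reliable = [r for r in candidate_reviews
            if meets_threshold(r["ratings"], elements=[3, 4])]
print([r["title"] for r in reliable])  # ['Review A']
```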
Any Red ratings should be considered carefully to decide what impact they may have on the findings. A single Red rating does not necessarily mean that you should have no confidence in the findings, but it might if that rating falls on what you consider a crucial element of review conduct (e.g. eligibility criteria).
Although the categories could also be given “scores” (e.g. from 1 to 4), using such total or mean scores to compare review reliability is not necessarily meaningful, and we advise against this in any context beyond a crude “eyeballing”. It may be more important to understand which elements of a review score Red or Amber and may therefore be deficient. Clearly, reviews whose ratings are all Red and Amber should be viewed with low confidence, but that does not mean the findings are wrong. At the other end of the scale, reviews rated mostly Gold and Green can be viewed with high confidence, but that does not mean the findings are right. Additionally, one or two Ambers or Reds among predominantly Gold and Green ratings can increase the risk of bias and decrease confidence substantially, and should therefore be considered carefully.
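To see why totals and means can mislead, consider a deliberately contrived comparison: a review with two Red elements can still out-score a review with no Reds at all, and the mean says nothing about which elements are deficient. The profiles below are invented.

```python
from statistics import mean

# Hypothetical profiles: 16 elements scored from 1 (Red) to 4 (Gold).
review_a = [4] * 14 + [1, 1]   # mostly Gold, but two Red elements
review_b = [4] * 8 + [3] * 8   # all Gold or Green, no Reds at all

print(mean(review_a))  # 3.625 -- the higher mean, despite two Reds
print(mean(review_b))  # 3.5   -- the lower mean, with no serious deficiencies
```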
Finally, the CEEDER ratings are not a substitute for reading the review. There may be other aspects of the review that make it more or less useful to you as a source of evidence.
Contact us: info@environmentalevidence.org
Decisions on eligibility involve some subjective judgement and we will not always get it right.
We welcome feedback from users on CEESAT and what is or is not included in CEEDER.