Appendix D Methodology checklist: cohort studies

Appendices B–G include checklists for those study designs that are expected to be used in the evidence reviews for NICE social care guidance. Other checklists can be found in the NICE clinical guidelines manual and Methods for the development of NICE public health guidance.

Checklist

Study identification

Include author, title, reference, year of publication

Guidance topic:

Review question no:

Checklist completed by:

Circle or highlight 1 option for each question:

A. Selection bias (systematic differences between the comparison groups)

The method of allocation to intervention groups was unrelated to potential confounding factors (that is, the reason for participant allocation to intervention groups is not expected to affect the outcome[s] under study)

Yes

Unclear

N/A

Attempts were made within the design or analysis to balance the comparison groups for potential confounders

Yes

Unclear

N/A

The groups were comparable at baseline, including all major confounding factors

Yes

Unclear

N/A

Based on your answers to the above, in your opinion was selection bias present? If so, what is the likely direction of its effect?

Low risk of bias

Unclear/unknown risk

High risk of bias

Likely direction of effect:

B. Performance bias (systematic differences between groups in the care provided, apart from the intervention under investigation)

The comparison groups received the same care and support apart from the intervention(s) studied

Yes

Unclear

N/A

Participants receiving care and support were kept 'blind' to intervention allocation

Yes

Unclear

N/A

Individuals administering care and support were kept 'blind' to intervention allocation

Yes

Unclear

N/A

Based on your answers to the above, in your opinion was performance bias present? If so, what is the likely direction of its effect?

Low risk of bias

Unclear/unknown risk

High risk of bias

Likely direction of effect:

C. Attrition bias (systematic differences between the comparison groups with respect to loss of participants)

All groups were followed up for an equal length of time (or analysis was adjusted to allow for differences in length of follow-up)

Yes

Unclear

N/A

a. How many participants did not complete the intervention in each group?

b. The groups were comparable for intervention completion (that is, there were no important or systematic differences between groups in terms of those who did not complete the intervention)

Yes

Unclear

N/A

a. For how many participants in each group were no outcome data available?

b. The groups were comparable with respect to the availability of outcome data (that is, there were no important or systematic differences between groups in terms of those for whom outcome data were not available)

Yes

Unclear

N/A

Based on your answers to the above, in your opinion was attrition bias present? If so, what is the likely direction of its effect?

Low risk of bias

Unclear/unknown risk

High risk of bias

Likely direction of effect:

D. Detection bias (bias in how outcomes are ascertained, diagnosed or verified)

The study had an appropriate length of follow-up

Yes

Unclear

N/A

The study used a precise definition of outcome

Yes

Unclear

N/A

A valid and reliable method was used to determine the outcome

Yes

Unclear

N/A

Investigators were kept 'blind' to participants' exposure to the intervention

Yes

Unclear

N/A

Investigators were kept 'blind' to other important confounding factors

Yes

Unclear

N/A

Based on your answers to the above, in your opinion was detection bias present? If so, what is the likely direction of its effect?

Low risk of bias

Unclear/unknown risk

High risk of bias

Likely direction of effect:

E. Overall assessment of internal validity. Are the study results internally valid?

Rate the study for internal validity below (for further information see notes on use of the methodology checklist)

++

+

−

Comments:

F. Overall assessment of external validity – Are the study results externally valid (i.e. generalisable to the source population)? Consider participants, interventions, settings, comparisons and outcomes.

Rate the study for external validity below (for further information see notes on use of the methodology checklist)

++

+

−

Comments:

Notes on use of Methodology checklist: cohort studies

Cohort studies are designed to answer questions about the relative effects of interventions, using an observational design. Such studies usually study 2 or more groups of people – cohorts – with similar characteristics. One group receives an intervention, is exposed to a risk factor or has a particular symptom and the other group does not. The study follows their progress over time and records what happens.

Please note some of the items on this checklist may need to be filled in individually for different outcomes reported by the study. It is therefore important that the systematic reviewer has a clear idea of what the important outcomes are before appraising a study. You are likely to need input from the Guidance Development Group in defining the important outcomes.

Checklist items are worded so that a 'yes' response always indicates that the study has been designed/conducted in such a way as to minimise the risk of bias for that item. An 'unclear' response to a question may arise when the item is not reported or is not reported clearly. 'N/A' should be used when a cohort study cannot give an answer of 'yes' no matter how well it has been done.

This checklist is designed to assess the internal and external validity of the study. Internal validity implies that the differences observed between groups of participants allocated to different interventions may (apart from the possibility of random error) be attributed to the intervention under investigation. Biases are characteristics that are likely to make estimates of effect differ systematically from the truth. External validity assesses the extent to which the findings for the study participants apply to the whole 'source population' (that is, the population they were chosen from).

This checklist contains 5 sections (A–E) on internal validity. Sections A–D each address a potential source of bias. At the end of each section you are asked to give your opinion on whether bias is present, and to estimate the likely direction of this bias – whether you think it will have increased or decreased the effect size reported by the study. It will not always be possible to determine the direction of bias, but thinking this through can help greatly in interpreting results. In section E you are asked to give an overall assessment of the internal validity of the study (using ++, +, −). Section F then requires you to assess and rate the external validity of the study (also using ++, +, −).

A: Selection bias

Selection bias can be introduced into a study when there are systematic differences between the participants in the different intervention groups. As a result, the differences in the outcome observed may be explained by pre-existing differences between the groups rather than because of the intervention itself. For example, if the people in one group are in poorer health or have higher levels of need, then they may be more likely to have a bad outcome than those in the other group, regardless of the effect of the intervention. The intervention groups should be similar at the start of the study – the only difference between the groups should be in terms of the intervention received.

The main difference between randomised trials and non-randomised studies is the potential susceptibility of the latter to selection bias. Randomisation should ensure that, apart from the intervention received, the intervention groups differ only because of random variation. However, care needs to be taken in the design and analysis of non-randomised studies to take account of potential confounding factors. There are 2 main ways of accounting for potential confounding factors in non-randomised studies. Firstly, participants can be allocated to intervention groups to ensure that the groups are equal with respect to the known confounders. Secondly, statistical techniques can be used within the analysis to take into account known differences between groups. Neither of these approaches is able to address unknown or unmeasurable confounding factors, and it is important to remember that measurement of known confounders is subject to error. Therefore, considerable judgement is needed to assess the internal validity of non-randomised studies; social care practitioner (or healthcare professional where appropriate) input may be needed to identify potential confounding factors that should be taken into consideration.

A1. The method of allocation to intervention groups was unrelated to potential confounding factors

In non-randomised studies, there will usually be a reason why participants are allocated to the intervention groups (often as a result of social care practitioner and/or service user choice). If this reason is linked to the outcome under study, this can result in confounding by indication (where the decision to treat is influenced by some factor that is related in turn to the intervention outcome). For example, if the participants who are the most ill or have the highest level of need are selected for the intervention, then the intervention group may experience worse outcomes because of this difference between the groups at baseline. It will not always be possible to determine from the report of a study which factors influenced the allocation of participants to intervention groups.
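Confounding by indication can be illustrated with a small simulation. The sketch below is illustrative only and is not part of the checklist: the function name, the 0 to 1 'level of need' scale and the allocation rule are all hypothetical assumptions. The intervention is given no effect at all, yet participants with higher need are more likely both to be allocated to it and to have a bad outcome.

```python
import random


def simulate_crude_comparison(n=10_000, seed=0):
    """Return the proportion of bad outcomes in the intervention group
    and in the comparison group, in that order (hypothetical data)."""
    rng = random.Random(seed)
    # allocated? -> [number of participants, number with a bad outcome]
    counts = {True: [0, 0], False: [0, 0]}
    for _ in range(n):
        need = rng.random()                # assumed level of need, 0 to 1
        allocated = rng.random() < need    # higher need: more likely to be allocated
        bad_outcome = rng.random() < need  # outcome depends on need only
        counts[allocated][0] += 1
        counts[allocated][1] += int(bad_outcome)
    return (counts[True][1] / counts[True][0],
            counts[False][1] / counts[False][0])


intervention_rate, comparison_rate = simulate_crude_comparison()
print(f"Intervention group, proportion with bad outcome: {intervention_rate:.2f}")
print(f"Comparison group, proportion with bad outcome:   {comparison_rate:.2f}")
```

Under these assumptions the intervention group shows a clearly higher proportion of bad outcomes, even though the intervention itself does nothing. The whole of the crude difference comes from how participants were allocated.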

A2. Attempts were made within the design or analysis to balance the comparison groups for potential confounders

This represents an attempt when designing the study to ensure that the groups are similar in terms of known confounding factors, in order to optimise comparability between the intervention groups. For example, in a matched design, the controls are deliberately chosen to be equivalent to the intervention group for any potential confounding variables, such as age and sex.

An alternative approach is to use statistical techniques to adjust for known confounding factors in the analysis.

A3. The groups were comparable at baseline, including all major confounding factors

Studies may report the distributions of potential confounding factors in the comparison groups, or important differences in these factors may be noted.

Formal tests comparing the groups are problematic – failure to detect a difference does not mean that a difference does not exist, and multiple comparisons of factors may falsely detect some differences that are not real.
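The multiple comparisons problem can be made concrete with a short calculation. This sketch assumes that each baseline factor is tested independently at a 5% significance level and that the groups do not truly differ on any factor; these assumptions, and the function name, are illustrative and do not come from the checklist.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests is 'significant'
    purely by chance when there is no real difference between groups."""
    if n_tests < 0:
        raise ValueError("n_tests must not be negative")
    return 1.0 - (1.0 - alpha) ** n_tests


for n_factors in (1, 5, 10, 20):
    p = prob_at_least_one_false_positive(n_factors)
    print(f"{n_factors:2d} baseline factors tested: {p:.3f}")
```

The probability rises steadily with the number of factors tested, so a single 'significant' baseline difference among many comparisons is weak evidence that the groups really differ.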

Social care practitioner (or healthcare professional where appropriate) input may be needed to determine whether all likely confounders have been considered. Confounding factors may differ according to outcome, so you will need to consider potential confounding factors for each of the outcomes that are of interest to your review.

B: Performance bias

Performance bias refers to systematic differences in the care provided to the participants in the comparison groups, other than the intervention under investigation.

This may consist of additional care, support, or advice, or even simply a belief about the effects of an intervention. If performance bias is present, it can be difficult to attribute any observed effect to the intervention rather than to the other factors.

Performance bias can be more difficult to determine in non-randomised studies than in randomised studies, because the latter are likely to have been better planned and executed according to strict protocols that specify standardised interventions and care. It may be particularly difficult to determine performance bias for retrospective studies, where there is usually no control over standardisation.

B1. The comparison groups received the same care and support apart from the intervention(s) studied

There should be no differences between the intervention groups apart from the intervention(s) received. If some participants received additional care or support (known as 'co-intervention'), this intervention is a potential confounding factor that may compromise the results.

Blinding

Blinding (also known as masking) refers to the process of withholding information about intervention allocation or exposure status from those involved in the study who could potentially be influenced by this information. This can include participants, investigators, those administering care and support, and those involved in data collection and analysis. If people are aware of the intervention allocation or exposure status ('unblinded'), this can bias the results of studies, either intentionally or unintentionally, through the use of other effective co-interventions, decisions about withdrawal, differential reporting of symptoms or influencing concordance with the intervention. Blinding of those assessing outcomes is covered in section D on detection bias.

Blinding of participants and carers is not always possible, particularly in studies of non-drug interventions used in social care, and so performance bias may be a particular issue in these studies. It is important to think about the likely size and direction of bias caused by failure to blind.

The terms 'single blind', 'double blind' and even 'triple blind' are sometimes used in studies. Unfortunately, they are not always used consistently. Commonly, when a study is described as 'single blind', only the participants are blind to their group allocation. When both participants and investigators are blind to group allocation the study is often described as 'double blind'. It is preferable to record exactly who was blinded, if reported, to avoid misunderstanding.

B2. Participants receiving care and support were kept 'blind' to intervention allocation

The knowledge of assignment to a particular intervention group may affect outcomes such as a study participant's reporting of symptoms, self-use of other known interventions or even dropping out of the study.

B3. Individuals administering care and support were kept 'blind' to intervention allocation

If individuals who are administering the intervention and/or other care and support to the participant are aware of intervention allocation, they may treat participants receiving one intervention differently from those receiving the comparison intervention; for example, by offering additional co-interventions.

C: Attrition bias

Attrition refers to the loss of participants during the course of a study. Attrition bias occurs when there are systematic differences between the comparison groups with respect to participants lost, or differences between the participants lost to the study and those who remain. Attrition can occur at any point after participants have been allocated to their intervention groups. As such, it includes participants who are excluded after allocation (and may indicate a violation of eligibility criteria), those who do not complete the intervention (whether or not they continue measurement) and those who do not complete outcome measurement (regardless of whether or not the intervention was completed). Consideration should be given to why participants dropped out, as well as how many. Participants who dropped out of a study may differ in some significant way from those who remained as part of the study throughout. Drop-out rates and reasons for dropping out should be similar across all intervention groups. The proportion of participants excluded after allocation should be stated in the study report and the possibility of attrition bias considered within the analysis; however, these are not always reported.

C1. All groups were followed up for an equal length of time (or analysis was adjusted to allow for differences in length of follow-up)

If the comparison groups are followed up for different lengths of time, then more events are likely to occur in the group followed up for longer, distorting the comparison. This may be overcome by adjusting the denominator to take the time into account; for example by using person-years.
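As a worked illustration of the person-years adjustment, the sketch below compares two groups using made-up numbers. The function name, the event counts and the follow-up totals are hypothetical and are not taken from any study.

```python
def rate_per_1000_person_years(events, person_years):
    """Event rate using total follow-up time as the denominator."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * events / person_years


# Hypothetical data: 100 participants per group, but the comparison
# group was followed up for 3 years and the intervention group for 2.
intervention = {"events": 12, "person_years": 100 * 2}
comparison = {"events": 18, "person_years": 100 * 3}

for label, group in (("Intervention", intervention), ("Comparison", comparison)):
    rate = rate_per_1000_person_years(group["events"], group["person_years"])
    print(f"{label}: {group['events']} events, {rate:.1f} per 1000 person-years")
```

In this example the raw event counts differ (12 against 18), but the rates per 1000 person-years are the same, so the apparent difference is explained by the longer follow-up alone.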

C2a. How many participants did not complete the intervention in each group?

A very high number of participants dropping out of a study should give concern. The drop-out rate may be expected to be higher in studies conducted over a longer period of time. The drop-out rate includes people who did not even start the intervention; that is, they were excluded from the study after allocation to intervention groups.

C2b. The groups were comparable for intervention completion (that is, there were no important or systematic differences between groups in terms of those who did not complete the intervention)

If there are systematic differences between groups in terms of those who did not complete the intervention, consider both why participants dropped out and whether any systematic differences in those who dropped out may be related to the outcome under study, such as potential confounders. Systematic differences between groups in terms of those who dropped out may also result in intervention groups that are no longer comparable with respect to potential confounding factors.
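When recording answers to C2a and C2b, it can help to express non-completion as a proportion of those allocated to each group. The following sketch uses hypothetical counts and a hypothetical function name. It deliberately applies no cut-off, because the checklist leaves the judgement about what counts as an important difference to the reviewer.

```python
def non_completion_proportion(allocated, completed):
    """Proportion of allocated participants who did not complete the
    intervention (including those excluded after allocation)."""
    if allocated <= 0 or not 0 <= completed <= allocated:
        raise ValueError("need 0 <= completed <= allocated and allocated > 0")
    return (allocated - completed) / allocated


# Hypothetical counts for two groups
groups = {
    "Intervention": {"allocated": 120, "completed": 90},
    "Comparison": {"allocated": 120, "completed": 108},
}

proportions = {
    name: non_completion_proportion(g["allocated"], g["completed"])
    for name, g in groups.items()
}
for name, p in proportions.items():
    print(f"{name}: {p:.0%} did not complete the intervention")

difference = proportions["Intervention"] - proportions["Comparison"]
print(f"Difference between groups: {difference:.0%}")
```

The same calculation can be applied to C3a and C3b by replacing the number who completed the intervention with the number for whom outcome data were available. The reasons for dropping out still need to be considered alongside the numbers.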

C3a. For how many participants in each group were no outcome data available?

A very high number of participants for whom no outcome data were available should give concern.

C3b. The groups were comparable with respect to the availability of outcome data (that is, there were no important or systematic differences between groups in terms of those for whom outcome data were not available)

If there are systematic differences between groups in terms of those for whom no outcome data were available, consider both why the outcome data were not available and whether there are any systematic differences between participants for whom outcome data were and were not available.

D: Detection bias (this section should be completed individually for each important relevant outcome)

The way outcomes are assessed needs to be standardised for the comparison groups; failure to 'blind' people who are assessing the outcomes can also lead to bias, particularly with subjective outcomes. Most studies report results for more than 1 outcome, and it is possible that detection bias may be present for some, but not all, outcomes. It is therefore recommended that this section is completed individually for each important outcome that is relevant to the guidance review question under study. To avoid biasing your review, you should identify the relevant outcomes before considering the results of the study. Social care practitioner (or healthcare professional where appropriate) input may be required to identify the most important outcomes for a review.

D1. The study had an appropriate length of follow-up

The follow-up of participants after intervention should be of an adequate length to identify the outcome of interest. This is particularly important when different outcomes of interest occur early and late after an intervention. A study that is too short will give an unbalanced assessment of the intervention. For events occurring later, a short study will give an imprecise estimate of the effect, which may or may not also be biased. For example, a late-occurring side effect will not be detected in the intervention arm if the study is too short.

D2. The study used a precise definition of outcome

D3. A valid and reliable method was used to determine the outcome

The outcome under study should be well defined and it should be clear how the investigators determined whether participants experienced, or did not experience, the outcome. The same methods for defining and measuring outcomes should be used for all participants in the study. Often there may be more than 1 way of measuring an outcome (for example, questionnaires, reporting of symptoms and functioning). The method of measurement should be valid (that is, it measures what it claims to measure) and reliable (that is, it measures something consistently).

D4. Investigators were kept 'blind' to participants' exposure to the intervention

D5. Investigators were kept 'blind' to other important confounding factors

In this context the 'investigators' are the individuals who are involved in making the decision about whether a participant has experienced the outcome under study. Investigators can introduce bias through differences in measurement and recording of outcomes, and making biased assessments of a participant's outcome based on the collected data. The degree to which lack of blinding can introduce bias will vary depending on the method of measuring an outcome, but will be greater for more subjective outcomes, such as reporting of pain or quality of life.

E. Overall assessment of internal validity

Rate the study for internal validity according to the list below:

++ All or most of the checklist criteria have been fulfilled; where they have not been fulfilled, the conclusions are very unlikely to alter

+ Some of the checklist criteria have been fulfilled; where they have not been fulfilled, or not adequately described, the conclusions are unlikely to alter

− Few or no checklist criteria have been fulfilled and the conclusions are likely or very likely to alter

F. Overall assessment of external validity

Rate the external validity of the study (also using ++, +, −). This is the extent to which the findings for the study participants apply to the whole source population (the population they were chosen from). This may also involve an assessment of the extent to which, if the study were replicated in a different setting but with similar population parameters, the results would have been the same or similar. If the study includes an 'intervention', then it should be assessed to see whether it would be feasible in settings other than that initially investigated.