Introduction to real-world evidence in NICE decision making

Corporate document

Background
What is real-world data?
What is real-world evidence?
Uses of real-world evidence in NICE guidance
Estimating intervention effects using real-world data
Challenges in generating real-world evidence

Introduction to real-world evidence in NICE decision making

Background

Real-world data can improve our understanding of health and social care delivery, patient health and experiences, and the effects of interventions on patient and system outcomes in routine settings.

As described in the NICE strategy 2021 to 2026, we want to use real-world data to resolve gaps in knowledge and drive forward access to innovations for patients.

We developed the real-world evidence framework to help deliver on this ambition. It does this by:

identifying when real-world data can be used to reduce uncertainties and improve guidance
clearly describing best practices for planning, conducting and reporting real-world evidence studies to improve the quality and transparency of evidence.

The framework aims to improve the quality of real-world evidence informing our guidance. It does not set minimum acceptable standards for the quality of evidence. Users should refer to relevant NICE manuals for further information on how recommendations are made (see the section on uses of real-world data in NICE guidance).

The framework is mainly targeted at those developing evidence to inform NICE guidance. It is also relevant to patients, those collecting data, and reviewers of evidence.

What is real-world data?

We define real-world data as data relating to patient health or experience or care delivery collected outside the context of a highly controlled clinical trial. Real-world data can be routinely collected during the delivery of health or social care. It can also be collected prospectively, to address 1 or more specific research questions. Most real-world data sources are observational (or non-interventional), that is, any interventions (or exposures) are not determined by a study protocol. Instead, medical interventions are decided by patients and healthcare professionals. And in public health or social care, interventions may be determined by individual behaviours, environmental exposures or policy makers.

Some interventional studies, such as pragmatic clinical trials, can also produce real-world evidence. Such trials may also make use of real-world data sources to design trials, recruit participants or collect outcome data. For more information, see the UK Medicines and Healthcare products Regulatory Agency's (MHRA) guideline on randomised controlled trials using real-world data to support regulatory decisions.

Table 2 describes common sources of non-interventional real-world data. These include original data collections (such as patient health records) and data curated from original sources (such as the data obtained from retrospective chart reviews). While each type of data source has some general strengths and weaknesses, the value for a given research question will depend on the characteristics of the specific data (for further information, see the section on assessing data suitability). Different sources of real-world data can be combined by linking or pooling to improve data quality and coverage, potentially allowing additional research questions to be answered.

Real-world data can be quantitative or qualitative. Common data types include patient demographics, health behaviours, medical history, clinical outcomes (including patient-reported outcomes), patient or user experiences, resource use, costs, omics, laboratory measurements, imaging, free text, test results and patient-generated data. We consider both national data collections and international data when making recommendations.

Table 2

**Common sources of real-world data**
Data source	Description	Examples
Electronic health records	Computerised individual patient records. These are typically used to inform the clinical management of patients. These sometimes integrate data from other information systems including laboratory, genomic, and imaging systems.	The Clinical Practice Research Datalink (CPRD) GOLD contains demographic and clinical information on patients enrolled in participating general practices across the UK.
Administrative data	Data collected for administrative purposes by health and social care services.	The Hospital Episode Statistics (HES) Admitted Patient Care dataset contains information on diagnoses and procedures done for all patients admitted to NHS hospitals or NHS-funded treatments in private hospitals. Its primary purpose is to inform the reimbursement of hospitals through payment by results and other operational activities.
Claims data	A type of administrative data on healthcare service use often collected from insurance-based systems.	Centers for Medicare & Medicaid Services data contains data on individuals in receipt of Medicare services derived from reimbursement information or payment of bills. The NHS Business Services Authority provides data on medicines dispensed in primary care in England.
Patient registries	Registries are organised systems that collect uniform data (clinical and other) to identify specified outcomes for a population defined by a particular disease, condition or exposure. Registries can serve several purposes including research, clinical care or policy. Registries can include interventional studies.	The Systemic Anti-Cancer Therapy (SACT) dataset contains information on all patients treated with anticancer therapies from NHS England providers. This data is widely used within NICE to provide information on drugs approved for use within the Cancer Drugs Fund. The UK Cystic Fibrosis Registry collects data on consenting people with cystic fibrosis across specialist centres in the UK. The registry data is used to improve the health of people with cystic fibrosis by facilitating research, guiding quality improvement at care centres and monitoring the safety of new drugs.
Patient-generated health data	Data generated directly by patients or their carers including from wearable medical or personal devices, mobile apps, social media, and other internet-based tools. Data can be collected actively (for example, by people entering data on a form) or passively (for example, a smart watch that measures people's activity level).	Pulse oximeters used to monitor people with COVID-19 treated at home to alert need for hospital admission (Greenhalgh et al. 2021). Self-reported data on COVID-19 and long-COVID symptoms from the ZOE app.
Chart reviews	Data extracted retrospectively from a review of patient health records (including paper or electronic records). Chart reviews are widely used in natural history studies. They may allow the extraction of data not reported in routine data sources.	Retrospective chart reviews are especially common in studies of rare diseases to model natural history of disease and treatment pathways (Garbade et al. 2021).
Audit and service evaluation	Clinical audits are done to understand how current standards of care measure against best practice or a set standard, and subsequently inform quality improvement. Data can be collected prospectively or retrospectively. Service evaluations are done to define and judge current care.	The Healthcare Quality Improvement Partnership manages national clinical audit programmes such as the Sentinel Stroke National Audit Programme (SSNAP). SSNAP is used to assess the quality of the organisation and delivery of multidisciplinary inpatient stroke health services in England, Wales and Northern Ireland.
Observational cohorts with primary data collection	Traditional prospective studies designed to answer one or more research questions.	The UK Biobank collects data on patient medical histories and genetics. It links to patient records for health outcomes. It was not designed for a specific research question but to enable epidemiological research. EMBRACE-I is a multicentre prospective cohort study to evaluate local tumour control and morbidity in patients undergoing MRI-based image guided adaptive brachytherapy for locally advanced cervical tumours.
Health surveys, interviews and focus groups	Health surveys involve systematic collection of data about health and disease in a human population through surveys. They have various purposes including understanding trends in health in a population or understanding patients' experiences of care. Interviews and focus groups are done to collect qualitative data such as patient perception and experiences.	The Health Survey for England is an annual representative household survey measuring trends in health in England. The 'Living with Lipoedema' 2021 survey by patient charity Lipoedema UK collects patient experience data from individuals with lipoedema. It evaluates experiences of patients having non-cosmetic liposuction or other treatments for lipoedema.

What is real-world evidence?

We define real-world evidence as evidence generated from the analysis of real-world data. It can cover a large array of evidence types including disease epidemiology, health service research or causal estimation (see the section on uses of real-world data in NICE guidance). It can be generated from a large range of study designs and analytical methods (including quantitative and qualitative methods) depending on the research question or use case. A real-world evidence study may use routinely collected data, bespoke data collection, or a combination of the two. We consider single-arm trials that use real-world data sources to create an external control to be real-world evidence studies.

Uses of real-world evidence in NICE guidance

NICE guidance

NICE has several guidance products that use the best available evidence to develop recommendations that guide decisions in health, public health and social care, including:

guidelines for clinical, social care and public health topics, which offer advisory guidance to health and social care professionals
evaluations of medical technologies including medicines, diagnostics, medical devices, digital health technologies and interventional procedures.

Guidelines are developed internally by NICE. Technology evaluations are usually informed by company submissions but may also use evidence submitted by manufacturers or other stakeholders or research commissioned from independent academic centres.

The processes and methods for technology evaluations differ across NICE's programmes. The Technology Appraisal Programme evaluates mostly medicines (including highly specialised technologies) but can also include medical devices and diagnostics. The Technology Appraisal and Diagnostic Guidance Programmes both consider the cost effectiveness of medical technologies. The Medical Technologies Evaluation Programme evaluates medical technologies including medical devices, digital health technologies and diagnostics that are expected to be cost-saving or cost-neutral and uses cost-consequence analysis considering patient and system outcomes. The Interventional Procedures Programme evaluates the efficacy and safety of interventional procedures without analysis of cost.

When NICE recommends a treatment 'as an option' through its Technology Appraisal Programme, the NHS must make sure it is available within 3 months (unless otherwise specified) of its date of publication. If a technology is potentially cost effective but there is substantial and resolvable uncertainty about its value, it can be recommended for use in a managed access agreement. After a specified period of collecting real-world data, the technology is reassessed through the Technology Appraisal Programme. Selected devices, diagnostic or digital technologies that are recommended in NICE guidance and are likely to be affordable and produce cost savings within 3 years of adoption can be funded through NHS England's MedTech funding mandate.

Methods and process manuals have been developed for different NICE programmes. Users of this framework should consult these manuals as appropriate:

Developing NICE guidelines: the manual explains the processes and methods used to develop and update NICE guidelines, the guidance that NICE develops covering topics across clinical care, social care and public health.
NICE's health technology evaluations manual describes the methods and processes for developing health technology evaluation, including for the Diagnostics Assessment Programme, the Medical Technologies Evaluation Programme, the Highly Specialised Technologies Programme, and the Technology Appraisal Programme.
NICE's interventional procedures programme manual describes the processes and methods for developing guidance in the Interventional Procedures Programme.

NICE's evidence standards framework for digital health technologies sets out what good levels of evidence for digital health technologies look like. It is aimed at innovators and commissioners of digital health technologies.

Use cases for real-world data

The differences between NICE's guidance programmes lead to variation in the uses and acceptability of real-world evidence.

Real-world data is already used across NICE programmes to generate different types of evidence, especially for questions that are not about the effects of interventions. Examples from previous NICE guidance include:

characterising health conditions, interventions, care pathways, and patient outcomes and experiences including natural history: NICE highly specialised technologies guidance on onasemnogene abeparvovec for treating spinal muscular atrophy used multiple sources of real-world data to characterise spinal muscular atrophy
estimating economic burden: NICE technology appraisal guidance on benralizumab for treating severe eosinophilic asthma reported data from CPRD GOLD linked to HES
designing, populating and validating economic models. Common types of evidence include:
- patient starting characteristics: NICE diagnostics guidance on QAngio XA 3D QFR and CAAS vFFR imaging software for assessing coronary stenosis during invasive coronary angiography reported data from the IRIS-FFR registry
- baseline rates of events: NICE guideline on chronic obstructive pulmonary disease (COPD) in over 16s: diagnosis and management reported data from CPRD GOLD on baseline COPD exacerbation rates by disease severity
- characterisation of treatment in routine practice: NICE technology appraisal guidance on fenfluramine for treating seizures associated with Dravet syndrome used multiple real-world studies to assess the average dose for comparator treatments in routine practice
- transition probabilities between health states or disease progression: NICE technology appraisal guidance on patiromer for treating hyperkalaemia used CPRD data to model transition between disease states for people with chronic kidney disease
- resource use and costs: NICE medical technologies guidance on HeartFlow FFRCT for estimating fractional flow reserve from coronary CT angiography used cost data on coronary revascularisation from NHS reference costs
- patient-reported outcomes, including quality of life: NICE highly specialised technologies guidance on elosulfase alfa for treating mucopolysaccharidosis type 4a used quality of life data from a survey
- extrapolation: NICE technology appraisal guidance on atezolizumab with carboplatin and etoposide for untreated extensive-stage small-cell lung cancer used data from the Flatiron Health database, which is derived from US electronic health records.
measuring patient experience: NICE medical technologies guidance on myCOPD for managing COPD used patient survey data on the ease of use of the technology
developing and validating digital health technologies including prognostic models: see the NICE evidence standards framework for digital health technologies for further information
identifying, characterising and addressing health inequalities: NICE technology appraisal guidance on crizanlizumab for preventing sickle cell crises in sickle cell disease reported evidence from the National Haemoglobinopathy Registry on the health and disproportionate burden of sickle cell disease in certain minority ethnic groups
estimating test accuracy or reproducibility of test results such as biomarkers: NICE medical technologies guidance on Zio XT for detecting cardiac arrhythmias reported data from a retrospective observational cohort study
estimating device or procedure failure rates: NICE guideline on joint replacement (primary): hip, knee and shoulder used data from the National Joint Registry on revision rates of knee replacements
measuring the impact of interventions (including tests) on service delivery and decisions about care: NICE diagnostics guidance on tumour profiling tests to guide adjuvant chemotherapy decisions in early breast cancer reported results from several prospective observational studies.

Real-world data can also be used to assess the applicability of trial results to patients in the NHS or even to estimate intervention effects (for further information, see the section on estimating intervention effects using real-world data).

While real-world evidence is already widely used for many of these types of evidence (Leahy et al. 2020, Makady et al. 2018), its use could be more commonplace. When data is representative of the target population and of sufficient quality it may be the preferred source of data. Background event rates or natural history data from trials may sometimes overestimate or underestimate event rates in the target population because of selective recruitment (Bloudek et al. 2021). In some cases, there may be value in performing studies using routinely collected data rather than relying on published evidence that has lower applicability to the research question.

Estimating intervention effects using real-world data

Uses and challenges of randomised controlled trials

Randomised controlled trials are the preferred study design for estimating the causal effects of interventions. This is because randomisation ensures that any differences in known and unknown baseline characteristics between groups are because of chance. Blinding (if applied) prevents knowledge of treatment allocation from influencing behaviours, and standardised protocols ensure consistent data collection.

However, randomised controlled trials are not always available or may not be sufficient to address the research question of interest.

Randomised trials may not be available for several reasons, including:

randomisation is considered unethical, for instance because of high unmet need
patients are unwilling to be allocated to one of the interventions in the trial
healthcare professionals are unwilling to randomise patients to an intervention which they consider less effective
a small number of eligible patients
financial or technical constraints on studies
not all treatment combinations (including treatment sequences) can be directly assessed.

Randomised controlled trials may be especially difficult to do for rare diseases, innovative and complex technologies, or in certain populations.

Similarly, high-quality randomised controlled trials can be challenging for medical devices and interventional procedures because of the difficulty of blinding, the importance of learning effects, changes to standard of care making the choice of comparator challenging, changes to the characteristics of the technology over time that may impact on performance, and limited research capacity or access to funding (Bernard et al. 2014).

Even if trials are available, they may not be directly applicable to the research question or to routine care in the NHS because of:

use of comparators that do not represent the standard of care in the NHS (including placebo control)
use of unvalidated surrogate outcomes
limited follow up
exclusion of eligible population groups (for example, individuals with comorbidities, pregnant women, and children)
differences in populations, care pathways, or settings that impact on the transferability of results to the target population in the NHS
differences in patient's use of a technology
clinical support that differs from routine practice
learning effects (that is, the effect of an intervention changes over time as users become more experienced)
methods used to address post-randomisation events such as treatment switching, loss to follow up or missing data.

Some of these challenges, such as the use of comparators that do not represent the standard of care in the NHS, can potentially be addressed through other approaches such as network meta-analysis under certain assumptions about the comparability of the trials. See the NICE Decision Support Unit report on sources and synthesis of evidence for further information.

Real-world evidence can also be generated from randomised controlled trials that use real-world data in their design or for measuring outcomes, such as pragmatic clinical trials. Such trials may provide substantial value in combining the internal validity from randomisation with the greater generalisability of data from routine practice. The UK MHRA has published guidance on producing real-world evidence from randomised controlled trials.

Real-world evidence

Real-world data can be used to contextualise randomised trials, to estimate effects of interventions in the absence of trials, or to complement trials to answer a broader range of questions about the impacts of interventions in routine settings.

Contextualisation

Contextualisation involves assessing whether the results from trials will translate well to the target population in the NHS. While this is an important use of real-world data across NICE programmes, NICE may require the collection of further data through managed access arrangements for medicines that are potentially cost effective and if uncertainties can be addressed through further data collection. This data is often used to understand the relevance of trials to the NHS.

Real-world data has been used in NICE guidance to contextualise clinical trials including for:

differences in eligible population in the NHS, treatment pathways, care settings and outcomes: NICE technology appraisal guidance on pegcetacoplan for treating paroxysmal nocturnal haemoglobinuria used UK registry data to show that urinary haemoglobin levels in UK practice were in line with the eligibility threshold for the randomised controlled trial
modelling the relationship between surrogate outcomes and final outcomes (including patient-reported outcomes): NICE highly specialised technologies guidance on lumasiran for treating primary hyperoxaluria type 1 used a registry-based study to model the relationship between plasma oxalate, a surrogate outcome, and kidney function
measuring the use of, and adherence to, interventions: NICE medical technologies guidance on Sleepio to treat insomnia and insomnia symptoms used data on usage collected from the app or website
assessing the appropriateness of assumptions about long-term outcomes or treatment effects beyond trial periods: NICE technology appraisal guidance on nintedanib for treating progressive fibrosing interstitial lung diseases used registry data to validate extrapolations of long-term outcomes.

NICE technology appraisal guidance on osimertinib for treating EGFR T790M mutation-positive advanced non-small-cell lung cancer used data from the Systemic Anti-Cancer Therapy (SACT) dataset to assess the relevance of results from the AURA3 trial to NHS patients. In particular, SACT data was used to compare:

overall survival
differences in patient characteristics including age, ethnicity, performance status and treatment history.

Estimation

Effects can be estimated for a range of different outcomes, including:

patient outcomes – clinical outcomes, biomarkers, patient-reported outcomes, behaviour change, user satisfaction and engagement
system outcomes – resource use, costs and processes of care.

Real-world data can be used to better understand the effects of an intervention over its life cycle. The potential uses of real-world data for estimating effects of interventions depend on the stage in their life cycle.

For new interventions (for example, those with recent marketing authorisation in the UK), there will be limited real-world data on their use and outcomes in the NHS. The uses of real-world data include:

creating a comparator arm (that is, external control) to estimate effects against a single-arm trial or to add to controls from a randomised controlled trial: NICE highly specialised technologies guidance on metreleptin for treating lipodystrophy used a natural history study to form an external control to a single-arm trial
using data from early access to medicines schemes: NICE technology appraisal guidance on berotralstat for preventing recurrent attacks of hereditary angioedema included early access to medicines scheme data to reduce uncertainty around long-term outcomes
estimating comparative effects in other countries in which the technology was available earlier than in the UK (Jonsson et al. 2021)
predicting outcomes and treatment effects in routine settings, for example, by reweighting results from trials to reflect characteristics of all eligible patients: NICE technology appraisal guidance on pembrolizumab with carboplatin and paclitaxel for untreated metastatic squamous non-small-cell lung cancer used prescribing data from the Cancer Drugs Fund to estimate outcomes weighted by subgroup prevalence.

Once medical technologies are used routinely or in pilot projects, the opportunities for real-world data are greater and include:

estimating effects of interventions in routine settings (see NICE medical technologies guidance on DyeVert Systems for reducing the risk of acute kidney injury in coronary and peripheral angiography)
providing head-to-head comparisons with preferred comparators: NICE technology appraisal guidance on mogamulizumab for previously treated mycosis fungoides and Sezary syndrome used HES data to provide a UK-specific standard-of-care comparator arm to the intervention arm of a randomised controlled trial
estimating effects in populations excluded from, or under-represented in, the available randomised controlled trials, or extrapolating results from trials: NICE technology appraisal guidance on casirivimab plus imdevimab, nirmatrelvir plus ritonavir, sotrovimab and tocilizumab for treating COVID-19 used OpenSAFELY electronic health records data to support outcomes observed in trial data, and included high-risk populations excluded from trial data
exploring heterogeneity in intervention effects: NICE technology appraisal guidance on pembrolizumab for treating relapsed or refractory classical Hodgkin lymphoma after stem cell transplant or at least 2 previous therapies used SACT data to model overall survival among those without previous stem-cell transplant
estimating effects on final outcomes of interest (rather than surrogate outcomes) and over longer time periods
estimating effects for combination therapies (including sequences) or decision strategies not examined in randomised controlled trials (Fu et al. 2021)
incorporating into evidence synthesis, for example, informing priors, increasing power or filling evidence gaps in a network meta-analysis (NICE Decision Support Unit report on sources and synthesis of evidence, Sarri et al. 2020).

The validity of real-world evidence for estimating intervention effects

A growing body of literature aims to understand the internal validity of real-world evidence (or, more generally, non-randomised studies) in comparison with randomised controlled trials. This includes meta-epidemiological studies, which compare results from studies of different designs addressing the same question (Woolacott et al. 2017), individual case studies (Dickerman et al. 2020) and systematic replication studies such as RCT Duplicate (Franklin et al. 2020).

These studies have demonstrated that high-quality non-randomised studies can produce valid estimates of relative treatment effects in many, but certainly not all, situations. There are some common design principles that improve the likelihood of valid estimates including:

the use of active comparators (alternative interventions for the same or similar indication, usually of the same modality) and
comparing new users (or initiators) of interventions rather than those who have been using an intervention for some time (prevalent users).

Validity may also depend on other factors including the characteristics of the disease, type of outcome (objective clinical outcomes are preferred), the treatment landscape, and data content and quality.

Challenges in generating real-world evidence

Real-world data has great potential for improving our understanding of the value of interventions in routine settings. However, there are important challenges that must be addressed to generate robust results and improve trust in the evidence. We describe key challenges below.

Trust in real-world evidence studies

Real-world data is often complex and requires substantial preparation before it can be analysed. Also, for some applications, such as the estimation of comparative effects, the methods of analysis can be advanced. When making use of already collected data, researchers may have access to data before finalising their statistical analysis plans. Data preparation and analytical decisions can have important effects on the resulting estimates.

Therefore, concerns about the integrity and trustworthiness of the resulting evidence (for example, resulting from data dredging or cherry-picking) need to be addressed. Concerns about the legitimate use of data have been highlighted by the retraction of high-profile studies about the effectiveness of repurposed medicines for treating COVID-19 from prominent medical journals.

Trust in real-world evidence studies can be improved by:

registering the study protocol before implementing the study (see the Real-World Evidence Transparency Initiative)
reporting checklists or tools (see Enhancing the Quality and Transparency of Health Research [EQUATOR] network)
requiring author statements to confirm the integrity of data access and study conduct (see learning from a retraction by the editors of the Lancet Group, 2020)
open publishing of data, code lists and analytical code
providing access to data through secure data environments and maintaining audit trails (see the Department of Health and Social Care's report on better, broader, safer: using health data for research and analysis).

See guidance on planning, conducting and reporting real-world evidence studies in the section on conduct of quantitative real-world evidence studies to generate real-world evidence.

Data quality and relevance

There are several common challenges with using real-world data. Some types of data are often, though not always, absent from real-world data sources (such as measures of tumour size or functional status). However, methods to extract data elements from unstructured data, such as doctor's notes, are increasingly used.

Other variables may be collected at an insufficiently granular level. For instance, a study may need knowledge of a specific drug or medical device, but the data may include only drug or device class. Similarly, a study may need to distinguish between haemorrhagic and ischaemic strokes while a data source may contain data on all strokes without further detail. Even if relevant items are collected with the needed granularity, the data may be missing or inaccurate, which can cause information bias. In addition, there may be variation in data-recording practices and quality across centres or individuals, and in the quality management processes for different sources of data.

In addition to the availability of data on relevant study elements, the relevance of a given data source to a research question may be affected by several factors. This includes the representativeness of the study sample and similarities in treatment patterns and healthcare delivery to routine care in the NHS, the timeliness of data, sample size and length of follow up. The key questions are whether the data is sufficient to produce robust estimates relevant to the decision problem and whether results are expected to translate or generalise to the target population in the NHS.

See the section on assessing data suitability for further information.

Risk of bias

Studies using real-world data are at risk of bias from a number of sources, depending on the use case. We describe key risks of bias that threaten validity in individual real-world evidence studies below. Detailed descriptions of risks of bias in non-randomised studies are available, such as the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCEPP) guide on methodological standards in pharmacoepidemiology and chapter 25 in the Cochrane handbook for systematic reviews of interventions.

Selection bias

In non-comparative studies, selection bias can occur if the people studied are not representative of the target population. This might result from non-random sampling of the source population, non-response to a questionnaire, or differences in behaviours and outcomes of those who volunteer to be part of research studies.

In comparative effect studies, selection bias occurs if the selection of participants or follow-up time is related to both the interventions and the outcomes. A lack of representativeness of the target population is not itself necessarily a cause of selection bias in comparative studies. Selection bias in comparative studies is distinct from confounding.

Common causes of selection bias at study entry include:

including prevalent users of a technology compared with non-users (users who had already experienced the event or not tolerated the intervention would be excluded from analysis)
excluding a period of follow up in which the outcome cannot occur (known as immortal time bias for survival outcomes)
selection into the study based on a characteristic (for example, admission to hospital) that is related to the intervention and outcome.

A common cause of selection bias at study exit is loss to follow up. Selection bias can also be caused by excluding participants from analysis, such as those with missing data.

Information bias

Information bias may result from missing or inaccurate data on population eligibility criteria, interventions or exposures, outcomes and covariates (as relevant). These limitations may occur because of low data quality, care patterns or data collection processes. They may also result from misspecification of the follow-up period.

The consequences of these issues depend on factors including the study type, whether limitations vary across intervention groups, whether they are random or systematic (that is, the missing data mechanism), the magnitude of the limitation and in which variables they occur. One common cause of differential misclassification across groups is detection bias. This occurs when the processes of care differ according to intervention status such that outcomes are more likely to identified in 1 group than in another. See the section on measurement error and misclassification for further information.

Confounding

Confounding occurs when there are common causes of the choice of intervention and the outcome. This is expected to be common in healthcare because healthcare professionals and patients make decisions about treatment initiation and continuation based on their expectations of benefits and risks (known as confounding by indication or channelling bias). Confounding bias may be intractable when comparing treatments with different indications and across types of intervention (for example, interventional procedure compared with drug treatment) and for studies of environmental exposures.

Bias may also arise because of inappropriate adjustment for covariates, for example, if a study controls for covariates on the causal pathway (such as blood pressure in the effect of anti-hypertensive medication on stroke), colliders (a variable influenced independently by both the exposure and the outcome), or instruments (defined as a variable that is associated with the exposure but unrelated with the outcome except through the exposure).

External validity bias

External validity refers to how well the findings from the analytical sample apply to the target population of interest. Study findings may be intended to be applied to a target population from which the study sample was drawn ('generalisability'), or to another target population, from which the study sample was not derived ('transportability').

Differences can occur between the study sample and target population for factors that affect outcomes on the scale of estimation (for example, relative versus absolute effects). These may include differences in patient or disease characteristics, healthcare settings, staff experience, treatment types and clinical pathways. Further differences may result from patient exclusions, drop out and data missingness in the analytical sample.

Methods to assess and adjust for some elements of external validity bias (those relating to differences in patient characteristics in studies of comparative treatment effects) are discussed in the section on addressing external validity bias.

Other forms of bias

Reverse causation (or protopathic bias) occurs when the intervention is a result of the outcome or a symptom of the outcome. This is most problematic in conditions with long latency periods such as several cancers. If present, this is a severe form of bias with major implications for internal validity.

Biases may also result from the statistical analysis of data (for example, model misspecification).

When assessing the body of literature on a research question there are further concerns about publication bias because of non-reporting of real-world evidence studies, especially if they show null results (Chan et al. 2014).