NICE process and methods

5 The reference case

This section details methods for assembling and synthesising evidence on the technology being appraised in order to estimate its clinical and cost effectiveness. The estimates of clinical and cost effectiveness are individual yet interdependent key inputs into the decision-making of the Appraisal Committee. The Institute seeks to promote high-quality analysis and to encourage consistency in analytical approaches, but also acknowledges the need to report studies in other ways to reflect particular circumstances.

5.1 Framework for estimating clinical and cost effectiveness

Directions on particular aspects of NICE health technology assessment and economic evaluation are presented below. The position statement of the Institute is set out (in bold), followed by explanation and justification.

The concept of the reference case

5.1.1 The Institute has to make decisions across different technologies and disease areas. It is, therefore, crucial that analyses of clinical and cost effectiveness undertaken to inform the appraisal adopt a consistent approach. To allow this, the Institute has defined a 'reference case' that specifies the methods considered by the Institute to be appropriate for the Appraisal Committee's purpose and consistent with an NHS objective of maximising health gain from limited resources. Submissions to the Institute should include an analysis of results generated using these reference case methods. This does not preclude additional analyses being presented when 1 or more aspects of methods differ from the reference case. However, these must be justified and clearly distinguished from the reference case.

5.1.2 There is considerable debate about the most appropriate methods to use for some aspects of health technology assessment. This uncertainty relates to choices that are essentially value judgements; for example, whose preferences to use (patients or the general public) for valuation of health outcomes. It also includes methodological choices that relate to more technical aspects of an analysis; for example, the most appropriate approach to measuring health-related quality of life. Although the reference case specifies the methods preferred by the Institute, it does not preclude the Appraisal Committee's consideration of non-reference-case analyses if appropriate. The key elements of analysis using the reference case are summarised in table 5.1.

Table 5.1 Summary of the reference case

Element of health technology assessment | Reference case | Section providing details

Defining the decision problem | The scope developed by NICE | 5.1.4 to 5.1.6

Comparator(s) | As listed in the scope developed by NICE | 2.2.4 to 2.2.6, 5.1.6, 5.1.14

Perspective on outcomes | All direct health effects, whether for patients or, when relevant, carers | 5.1.7, 5.1.8

Perspective on costs | NHS and PSS | 5.1.9 and 5.1.10

Type of economic evaluation | Cost–utility analysis with fully incremental analysis | 5.1.11 to 5.1.14

Time horizon | Long enough to reflect all important differences in costs or outcomes between the technologies being compared | 5.1.15 to 5.1.17

Synthesis of evidence on health effects | Based on systematic review | 5.2

Measuring and valuing health effects | Health effects should be expressed in QALYs. The EQ-5D is the preferred measure of health-related quality of life in adults. | 5.3.1

Source of data for measurement of health-related quality of life | Reported directly by patients and/or carers | 5.3.3

Source of preference data for valuation of changes in health-related quality of life | Representative sample of the UK population | 5.3.4

Equity considerations | An additional QALY has the same weight regardless of the other characteristics of the individuals receiving the health benefit | 5.4.1

Evidence on resource use and costs | Costs should relate to NHS and PSS resources and should be valued using the prices relevant to the NHS and PSS | 5.5.1

Discounting | The same annual rate for both costs and health effects (currently 3.5%) | 5.6.1

NICE, National Institute for Health and Care Excellence; NHS, National Health Service; PSS, personal social services; QALYs, quality-adjusted life years; EQ-5D, standardised instrument for use as a measure of health outcome.

5.1.3 There may be reasons for applying non-reference-case methods. In these cases, the reasons for not applying reference-case methods should be clearly specified and justified, and the likely implications should be quantified. The Appraisal Committee will then make a judgement regarding the weight it attaches to the results of such a non-reference-case analysis.

Defining the decision problem

5.1.4 Estimating clinical and cost effectiveness should begin with a clear statement of the decision problem that defines the technologies being compared and the relevant patient group(s). The decision problem should be consistent with the Institute's scope for the appraisal; any differences must be justified.

5.1.5 The main technology of interest, its expected place in the pathway of care, the comparator(s) and the relevant patient group(s) will be defined in the scope developed by the Institute (see section 2).

5.1.6 When selecting comparators for assessment, give particular consideration to the scope (see section 2), and to the evidence to allow a robust assessment of relative clinical and cost effectiveness.

Perspective

5.1.7 For the reference case, the perspective on outcomes should be all direct health effects, whether for patients or other people. The perspective adopted on costs should be that of the NHS and personal and social services.

5.1.8 The reference-case perspective on outcomes aims to maximise health gain from available healthcare resources. Some features of healthcare delivery often referred to as 'process characteristics' may ultimately have health consequences, for example, mode of treatment delivery through its impact on adherence. If characteristics of healthcare technologies have a value to people independent of any direct effect on health, the nature of these characteristics should be clearly explained and if possible the value of the additional benefit should be quantified. These characteristics may include convenience and the level of information available for patients.

5.1.9 The Institute does not set the budget for the NHS. The appropriate objective of the Institute's technology appraisal programme is to offer guidance that represents an efficient use of available NHS and personal social services resources. For these reasons, the reference-case perspective on costs is that of the NHS and personal social services.

5.1.10 Some health technologies may have substantial benefits to other government bodies (for example, treatments to reduce drug misuse may have the effect of reducing crime). These issues should be identified during the scoping stage of an appraisal. Appraisals that consider benefits to the government incurred outside of the NHS and personal social services will be agreed with the Department of Health (and other relevant government bodies as appropriate) and detailed in the remit from the Department of Health and the final scope. For these non-reference-case analyses the benefits and costs (or cost savings) should be presented separately from the reference-case analysis. Productivity costs are not included in either the reference-case or non-reference-case analyses.

Type of economic evaluation

5.1.11 For the reference case, cost-effectiveness (specifically cost–utility) analysis is the preferred form of economic evaluation. This seeks to establish whether differences in expected costs between options can be justified in terms of changes in expected health effects. Health effects should be expressed in terms of QALYs.

5.1.12 The focus on cost-effectiveness analysis is justified by the Institute's focus on maximising health gains from a fixed NHS and personal social services budget and the more extensive use and publication of these methods compared with cost–benefit analysis. Currently, the QALY is considered to be the most appropriate generic measure of health benefit that reflects both mortality and health-related quality of life effects. If the assumptions that underlie the QALY (for example, constant proportional trade-off and additive independence between health states) are inappropriate in a particular case, then evidence to this effect should be produced and analyses using alternative measures may be presented as an additional non-reference-case analysis.

5.1.13 Standard decision rules should be followed when combining costs and QALYs. When appropriate, these should reflect when dominance or extended dominance exists, presented through incremental cost–utility analysis. Incremental cost-effectiveness ratios (ICERs) reported must be the ratio of expected additional total cost to expected additional QALYs compared with alternative treatment(s). In addition to ICERs, expected net monetary or health benefits can be presented using values placed on a QALY gained of £20,000 and £30,000.
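As an illustration of these decision rules, the following is a minimal sketch of a fully incremental analysis. It assumes each option is summarised by its expected total cost and expected total QALYs; the option names, costs and QALYs are hypothetical, and the helper names (Option, efficiency_frontier, icer, net_monetary_benefit) are introduced here for illustration only.

```python
from typing import NamedTuple


class Option(NamedTuple):
    name: str
    cost: float   # expected total cost per patient (GBP), hypothetical
    qalys: float  # expected total QALYs per patient, hypothetical


def icer(new: Option, old: Option) -> float:
    """Expected additional cost divided by expected additional QALYs."""
    return (new.cost - old.cost) / (new.qalys - old.qalys)


def net_monetary_benefit(opt: Option, threshold: float) -> float:
    """QALYs valued at the given amount per QALY gained, minus cost."""
    return threshold * opt.qalys - opt.cost


def efficiency_frontier(options):
    """Drop dominated and extendedly dominated options; return the rest by rising cost."""
    ranked = sorted(options, key=lambda o: (o.cost, -o.qalys))
    frontier = []
    for opt in ranked:
        # Dominance: a cheaper (or equally costly) option is at least as effective.
        if frontier and opt.qalys <= frontier[-1].qalys:
            continue
        frontier.append(opt)
    # Extended dominance: remove an option whose ICER exceeds that of the
    # next, more effective option, then re-check its neighbours.
    i = 1
    while i < len(frontier) - 1:
        if icer(frontier[i], frontier[i - 1]) > icer(frontier[i + 1], frontier[i]):
            del frontier[i]
            i = max(i - 1, 1)
        else:
            i += 1
    return frontier


# Hypothetical inputs: D is dominated by C; B is extendedly dominated.
options = [
    Option("A", 10_000, 5.0),
    Option("B", 15_000, 5.1),
    Option("C", 18_000, 5.5),
    Option("D", 25_000, 5.4),
]
frontier = efficiency_frontier(options)
icers = {new.name: icer(new, old) for old, new in zip(frontier, frontier[1:])}

for opt in frontier:
    nmb20 = net_monetary_benefit(opt, 20_000)
    nmb30 = net_monetary_benefit(opt, 30_000)
    print(opt.name, icers.get(opt.name), nmb20, nmb30)
```

Each ICER in this sketch is calculated against the next less costly option remaining after dominated and extendedly dominated options have been removed, which is one common way of presenting a fully incremental analysis.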

5.1.14 In exceptional circumstances, if the comparators form part of a class of treatments, and evidence is available to support their clinical equivalence, estimates of QALYs gained for the class as a whole can be presented.

Time horizon

5.1.15 The time horizon for estimating clinical and cost effectiveness should be sufficiently long to reflect all important differences in costs or outcomes between the technologies being compared.

5.1.16 Many technologies have impacts on costs and outcomes over a patient's lifetime. In such instances, a lifetime time horizon for clinical and cost effectiveness is usually appropriate. A lifetime time horizon is required when alternative technologies lead to differences in survival or benefits that persist for the remainder of a person's life. For a lifetime time horizon, it is often necessary to extrapolate data beyond the duration of the clinical trials and to consider the associated uncertainty. When the impact of treatment beyond the results of the clinical trials is estimated, analyses that compare several alternative scenarios reflecting different assumptions about future treatment effects using different statistical models are desirable (see section 5.7 on modelling). These should include assuming that the treatment does not provide further benefit beyond the treatment period as well as more optimistic assumptions. Analyses that limit the time horizon to periods shorter than the expected impact of treatment do not usually provide the best estimates of benefits and costs.

5.1.17 A time horizon shorter than a patient's lifetime could be justified if there is no differential mortality effect between treatment options, and the differences in costs and health-related quality of life relate to a relatively short period (for example, in the case of an acute infection which has no long term sequelae).

5.2 Synthesis of evidence on health effects

5.2.1 The objective of the analysis of clinical effectiveness is an unbiased estimate of the mean clinical effectiveness of the technologies being compared. The analysis of clinical effectiveness must be based on data from all relevant studies of the best available quality and should consider the range of typical patients, normal clinical circumstances, clinically relevant outcomes, comparison with relevant comparators, and measures of both relative and absolute effectiveness with appropriate measures of uncertainty. The Institute has a preference for RCTs directly comparing the intervention with 1 or more relevant comparators and these should be presented in the reference-case analysis if available.

Systematic review

5.2.2 All health effects should be identified and quantified, with all data sources clearly described. In the reference case, evidence on outcomes should be obtained from a systematic review, defined as systematically locating, including, appraising and synthesising the evidence to obtain a reliable and valid overview of the data related to a clearly formulated question[1].

Relevant studies

5.2.3 RCTs directly comparing the technology under appraisal with relevant comparators provide the most valid evidence of relative efficacy. However, such evidence may not always be available and may not be sufficient to quantify the effect of treatment over the course of the disease. Therefore, data from non-randomised studies may be required to supplement RCT data. Any potential bias arising from the design of the studies used in the assessment should be explored and documented.

Study selection and data extraction

5.2.4 A systematic review of relevant studies of the technology being appraised should be conducted according to a previously prepared protocol to minimise the potential for bias, and should include studies investigating relevant comparators.

5.2.5 Once the search strategy has been developed and literature searching undertaken, a list of possible studies should be compiled. Each study must be assessed to determine whether it meets the inclusion criteria of the review. A log of ineligible studies should be maintained with the rationale for why studies were included or excluded. Having more than 1 reviewer assess all records retrieved by the search strategy increases the validity of the decision. The procedure for resolving disagreements between reviewers should be reported.

Critical appraisal

5.2.6 The quality of a study's overall design, its execution, and the validity of its results determine its relevance to the decision problem. Each study meeting the criteria for inclusion should be critically appraised. Whenever possible, the criteria for assessing published studies should be used to assess the validity of unpublished and part-published studies.

Treatment effect modifiers

5.2.7 Many factors can affect the overall estimate of relative treatment effects obtained from a systematic review. Some differences between studies occur by chance, others from differences in the characteristics of patients (such as age, sex, severity of disease, choice and measurement of outcomes), care setting, additional routine care and the year of the study. Such potential treatment effect modifiers should be identified before data analysis, either by a thorough review of the subject area or discussion with experts in the clinical discipline.

Pairwise meta-analysis

5.2.8 Synthesis of outcome data through meta-analysis is appropriate provided there are sufficient relevant and valid data using measures of outcome that are comparable.

5.2.9 The characteristics and possible limitations of the data (that is, population, intervention, setting, sample size and validity of the evidence) should be fully reported for each study included in the analysis and a forest plot included.

5.2.10 Statistical pooling of study results should be accompanied by an assessment of heterogeneity (that is, any variability in addition to that accounted for by chance) which can, to some extent, be taken into account using a random (as opposed to fixed) effects model. However, the degree of, and the reasons for, heterogeneity should be explored as fully as possible. Known clinical heterogeneity (for example, because of patient characteristics) may be explored by using subgroup analyses and meta-regression. When there is doubt about the relevance of a particular trial, a sensitivity analysis should exclude that study. If the risk of an event differs substantially between the control groups of the studies in a meta-analysis, an assessment of whether the measure of relative treatment effect is constant over different baseline risks should be carried out. This is especially important when the measure of relative treatment effect is to be used in an economic model and the baseline rate of events in the comparator arm of the model is very different to the corresponding rates in the studies in the meta-analysis.
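To illustrate the difference between fixed and random effects pooling, the sketch below uses inverse-variance weighting with the DerSimonian-Laird estimator of between-study variance. This estimator is an assumption made for the example; the text above does not prescribe a particular method. Effect estimates are taken to be on an additive scale (for example, log odds ratios), and the input values are hypothetical.

```python
import math


def pool(effects, variances):
    """Fixed and random effects pooled estimates with Cochran's Q and I-squared."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed effects estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I-squared: share of total variability beyond that expected by chance.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in variances]
    random_est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_random = math.sqrt(1.0 / sum(w_re))
    return {
        "fixed": fixed,
        "random": random_est,
        "se_random": se_random,
        "Q": q,
        "tau2": tau2,
        "I2": i2,
    }


# Hypothetical log odds ratios and their variances from three studies.
result = pool([-0.40, -0.10, -0.65], [0.04, 0.09, 0.06])
print(result)
```

When the estimated between-study variance is zero, the two pooled estimates coincide. A non-zero value widens the random effects standard error but does not explain the heterogeneity, which still needs to be explored.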

5.2.11 A group of related technologies might have similar but not necessarily identical effects, whether or not recognised as a 'class'. When the Institute is appraising a number of related technologies within a single appraisal, meta-analyses based on individual effects should be carried out. A class effect can be analysed as a sensitivity analysis, unless specified otherwise in the scope for the appraisal.

Indirect comparisons and network meta-analyses

5.2.12 Data from head-to-head RCTs should be presented in the reference-case analysis. When technologies are being compared that have not been evaluated within a single RCT, data from a series of pairwise head-to-head RCTs should be presented together with a network meta-analysis if appropriate. The network meta-analysis must be fully described and presented as additional to the reference-case analysis. The Appraisal Committee will take into account the additional uncertainty associated with the lack of direct evidence when considering estimates of relative effectiveness derived from indirect sources only. The principles of good practice for standard pairwise meta-analyses should also be followed in adjusted indirect treatment comparisons and network meta-analyses.

5.2.13 The term 'network meta-analysis' includes adjusted indirect comparisons, but also refers to more complex evidence analysis such as 'mixed treatment comparisons'. An 'adjusted indirect comparison' refers to the synthesis of data from trials in which the technologies of interest have not been compared directly with each other in head-to-head trials, but have been compared indirectly using a common comparator. Mixed treatment comparisons include both head-to-head trials of treatments of interest (both interventions and comparators) and trials that include 1 of the treatments of interest.
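A minimal sketch of an adjusted indirect comparison through a common comparator follows. It assumes the approach often attributed to Bucher and colleagues, in which the relative effects of A versus C and of B versus C (each estimated within randomised trials, on an additive scale such as the log odds ratio) are differenced and their variances summed. The function name and input values are hypothetical.

```python
import math


def adjusted_indirect_comparison(d_ac, se_ac, d_bc, se_bc, z=1.96):
    """Indirect estimate of A versus B using common comparator C.

    d_ac, d_bc: relative effects of A vs C and B vs C (additive scale).
    se_ac, se_bc: their standard errors.
    """
    d_ab = d_ac - d_bc
    # The two sets of trials are independent, so the variances add.
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    ci = (d_ab - z * se_ab, d_ab + z * se_ab)
    return d_ab, se_ab, ci


# Hypothetical pooled log odds ratios from two sets of head-to-head trials.
d_ab, se_ab, ci = adjusted_indirect_comparison(d_ac=-0.5, se_ac=0.3, d_bc=-0.2, se_bc=0.4)
print(d_ab, se_ab, ci)
```

Because only within-trial relative effects enter the calculation, randomisation is preserved. The standard error of the indirect estimate is always larger than that of either direct estimate, which reflects the additional uncertainty of indirect evidence.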

5.2.14 Ideally, the network meta-analysis should contain all treatments that have been identified either as an intervention or as appropriate comparators in the scope. Therefore, trials that compare at least 2 of the relevant (intervention or comparator) treatments should be incorporated, even if the trial includes comparators that are not relevant to the decision problem. The principles of good practice for conducting systematic reviews and meta-analyses should be carefully followed when conducting mixed and indirect treatment comparisons. In brief, a clear description of the methods of synthesis and the rationale for how RCTs are identified, selected and excluded is needed. The methods and results of the individual trials included in the network meta-analysis and a table of baseline characteristics for each trial must be documented. If there is doubt about the relevance of a particular trial or set of trials, sensitivity analysis should be presented in which these trials are excluded (or if absent from the base-case analysis, included).

5.2.15 The heterogeneity between results of pairwise comparisons and inconsistencies between the direct and indirect evidence on the technologies should be reported. If inconsistency within a network meta-analysis is found, then attempts should be made to explain and resolve these inconsistencies.

5.2.16 In all cases when evidence is combined using adjusted indirect comparisons or network meta-analysis frameworks, trial randomisation must be preserved, that is, it is not acceptable to compare results from single treatment arms from different randomised trials. If this type of comparison is presented, the data will be treated as observational in nature and associated with increased uncertainty.

5.2.17 Evidence from a network meta-analysis must be presented in both tabular form and in graphical formats such as forest plots. The direct and indirect components of the network meta-analysis should be clearly identified and the number of trials in each comparison stated. Results from pairwise meta-analyses using the direct comparisons should be presented alongside those based on the full network meta-analysis.

5.2.18 When sufficient relevant and valid data are not available for including in pairwise or network meta-analyses, the analysis may have to be restricted to a narrative overview that critically appraises individual studies and presents their results. In these circumstances, the Appraisal Committee will be particularly cautious when reviewing the results and in drawing conclusions about the relative clinical effectiveness of the treatment options.

5.3 Measuring and valuing health effects

5.3.1 For the cost-effectiveness analyses, health effects should be expressed in QALYs. For the reference case, the measurement of changes in health-related quality of life should be reported directly from patients and the utility of these changes should be based on public preferences using a choice-based method. The EQ-5D is the preferred measure of health-related quality of life in adults.

5.3.2 A QALY combines both quality of life and life expectancy into a single index. In calculating QALYs, each of the health states experienced within the time horizon of the model is given a utility reflecting the health-related quality of life associated with that health state. The duration of time spent in each health state is multiplied by the utility. Deriving the utility for a particular health state usually comprises 2 elements: measuring health-related quality of life in people who are in the relevant health state and valuing it according to preferences for that health state relative to other states (usually perfect health and death).
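The calculation described above can be sketched as follows. The health states, durations and utility values are hypothetical, and discounting is left out to keep the arithmetic visible.

```python
def total_qalys(health_states):
    """Sum of (years spent in a health state) x (utility of that state)."""
    return sum(years * utility for years, utility in health_states)


# Hypothetical pathway: (duration in years, utility) for each health state.
pathway = [
    (2.0, 0.8),  # stable disease
    (3.0, 0.5),  # progressed disease
]
print(total_qalys(pathway))  # 2.0 * 0.8 + 3.0 * 0.5
```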

5.3.3 Health-related quality of life, or changes in health-related quality of life, should be measured directly by patients. When it is not possible to obtain measurements of health-related quality of life directly from patients, data should be obtained from the person who acts as their carer in preference to healthcare professionals.

5.3.4 The valuation of health-related quality of life measured in patients (or by their carers) should be based on a valuation of public preferences from a representative sample of the UK population using a choice-based method. This valuation leads to the calculation of utility values.

5.3.5 Different methods used to measure health-related quality of life produce different utility values; therefore, results from different methods or instruments cannot always be compared. Given the need for consistency across appraisals, one measurement method, the EQ-5D, is preferred for the measurement of health-related quality of life in adults.

5.3.6 The EQ-5D is a standardised and validated generic instrument that is widely used and has been validated in many patient populations. The EQ-5D comprises 5 dimensions of health: mobility, ability to self-care, ability to undertake usual activities, pain and discomfort, and anxiety and depression. For each of these dimensions it has 3 levels of severity (no problems, some problems, severe problems). The system has been designed so that people can describe their own health-related quality of life using a standardised descriptive system. A set of preference values elicited from a large UK population study using a choice-based method of valuation (the time trade-off method) is available for the EQ-5D health state descriptions. This set of values should be applied to measurements of health-related quality of life to generate health-related utility values.
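The descriptive system can be illustrated with a short sketch. It assumes the common convention of writing a health state as a 5-digit profile, one digit per dimension in the order listed above, with levels coded 1 to 3. No preference values are included, because the UK valuation set itself is not reproduced in this section.

```python
from itertools import product

DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)
LEVELS = {1: "no problems", 2: "some problems", 3: "severe problems"}


def describe(profile: str) -> dict:
    """Translate a 5-digit profile such as '11223' into labelled levels."""
    if len(profile) != len(DIMENSIONS) or any(ch not in "123" for ch in profile):
        raise ValueError("profile must be 5 digits, each 1, 2 or 3")
    return {dim: LEVELS[int(ch)] for dim, ch in zip(DIMENSIONS, profile)}


# Every combination of 3 levels across 5 dimensions.
all_states = ["".join(map(str, combo)) for combo in product(LEVELS, repeat=len(DIMENSIONS))]

print(len(all_states))
print(describe("11223"))
```

A utility value would then be obtained by looking up each observed profile in the valuation set.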

5.3.7 In some circumstances adjustments to utility values, for example for age or comorbidities, may be needed.

5.3.8 If not available in the relevant clinical trials, EQ-5D data can be sourced from the literature. When obtained from the literature, the methods of identification of the data should be systematic and transparent. The justification for choosing a particular data set should be clearly explained. When more than 1 plausible set of EQ-5D data is available, sensitivity analyses should be carried out to show the impact of the alternative utility values.

5.3.9 When EQ-5D data are not available, these data can be estimated by mapping other health-related quality of life measures or health-related benefits observed in the relevant clinical trial(s) to EQ-5D. The mapping function chosen should be based on data sets containing both health-related quality of life measures. Its statistical properties should be fully described, its choice justified, and it should be adequately demonstrated how well the function fits the data. Sensitivity analyses to explore variation in the use of the mapping algorithms on the outputs should be presented.

5.3.10 In some circumstances the EQ-5D may not be the most appropriate measure. To make a case that the EQ-5D is inappropriate, qualitative empirical evidence on the lack of content validity for the EQ-5D should be provided, demonstrating that key dimensions of health are missing. This should be supported by evidence that shows that EQ-5D performs poorly on tests of construct validity and responsiveness in a particular patient population. This evidence should be derived from a synthesis of peer-reviewed literature. In these circumstances alternative health-related quality of life measures may be used and must be accompanied by a carefully detailed account of the methods used to generate the data, their validity, and how these methods affect the utility values.

5.3.11 When necessary, consideration should be given to alternative standardised and validated preference-based measures of health-related quality of life that have been designed specifically for use in children. The standard version of the EQ-5D has not been designed for use in children. An alternative version for children aged 7–12 years is available, but a validated UK valuation set is not yet available.

5.3.12 A new version of the EQ-5D, the EQ-5D-5L, has been developed in which there are 5 levels of severity (no problem, slight problems, moderate problems, severe problems and unable to or extreme problems) for each of the 5 dimensions of health (see section 5.3.6). The EQ-5D-5L may be used for reference-case analyses. The descriptive system for the EQ-5D-5L has been validated, but no valuation set to derive utilities currently exists. Until an acceptable valuation set for the EQ-5D-5L is available, the validated mapping function to derive utility values for the EQ-5D-5L from the existing EQ-5D (-3L) may be used (available from www.euroqol.org).

In August 2017, NICE issued a position statement on the use of the EQ‑5D‑5L valuation set. Companies and academic groups should refer to this statement.

5.4 Equity considerations in cost-effectiveness analysis

5.4.1 In the reference case, an additional QALY should receive the same weight regardless of any other characteristics of the people receiving the health benefit.

5.4.2 The estimation of QALYs, as defined in the reference case, implies a particular position regarding the comparison of health gained between individuals. Therefore, in the reference case, an additional QALY is of equal value regardless of other characteristics of the individuals, such as their socio-demographic characteristics, their age, or their level of health. The Committee has discretion to consider a different equity position, and may do so in certain circumstances and when instructed by the NICE Board (see section 6).

5.5 Evidence on resource use and costs

NHS and personal and social services costs

5.5.1 For the reference case, costs should relate to resources that are under the control of the NHS and personal and social services. These resources should be valued using the prices relevant to the NHS and personal and social services. Evidence should be presented to demonstrate that resource use and cost data have been identified systematically.

5.5.2 The public list prices for technologies (for example, pharmaceuticals or medical devices) should be used in the reference-case analysis. When there are nationally available price reductions, for example for medicines procured for use in secondary care through contracts negotiated by the NHS Commercial Medicines Unit, then the reduced price should be used in the reference-case analysis to best reflect the price relevant to the NHS. The Commercial Medicines Unit publishes information on the prices paid for some generic drugs by NHS trusts through its Electronic Marketing Information Tool (eMIT), which focuses on medicines in the National Generics Programme Framework for England. Analyses based on price reductions for the NHS will only be considered when the reduced prices are transparent and consistently available across the NHS, and if the period for which the specified price is available is guaranteed. When a reduced price is available through a patient access scheme that has been agreed with the Department of Health, the base-case analysis should include the costs associated with the scheme. The review date for the appraisal will be informed by the period of time over which the manufacturer or sponsor can guarantee any such pricing agreements.

5.5.3 For medicines that are predominantly prescribed in primary care, prices should be based on the Drug Tariff.

5.5.4 In the absence of a published list price and price agreed by a national institution (as may be the case for some devices), the price submitted by the manufacturer may be used, provided that it is nationally and publicly available.

5.5.5 Healthcare resource groups (HRGs) are a valuable source of information for estimating resource use. HRGs are standard groupings of clinically similar treatments that use common levels of healthcare resources. The national average unit cost of an HRG is reported as part of the annual mandatory collection of reference costs from all NHS organisations in England. The use of these costs can reduce the need for local micro-costing (costing of each individual component of care related to the use of a technology). Care must be taken to ensure that all relevant HRGs have been taken into account. For example, the cost of hospital admission for a serious condition may not account for time spent in critical care, which is captured and costed as a separate HRG.

5.5.6 Data based on HRGs may not be appropriate in all circumstances (for example, when the new technology and the comparator both fall under the same HRG, or when the mean cost does not reflect resource use in relation to the new technology under appraisal). In such cases, other sources of evidence, such as micro-costing studies, may be more appropriate. When cost data are taken from literature, the methods used to identify the sources should be defined. When several alternative sources are available, a justification for the costs chosen should be provided and discrepancies between the sources explained. When appropriate, sensitivity analysis should be used to assess the implications for results of using alternative data sources.

5.5.7 Costs related to the condition of interest and incurred in additional years of life gained as a result of treatment should be included in the reference-case analysis. Costs that are considered to be unrelated to the condition or technology of interest should be excluded.

5.5.8 If introduction of the technology requires changes in infrastructure, costs or savings should be included in the analysis.

5.5.9 When a group of related technologies is being appraised as part of a 'class' of treatments, an analysis using the individual unit costs specific to each technology should normally be presented in the reference case. Exceptionally, if there is a very wide range of technologies and costs to be considered, then analyses using the weighted mean cost and the highest and lowest cost estimates should be presented.
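For the exceptional case described above, the summary costs can be sketched as follows. The choice of weights (here, each technology's share of use) is an assumption made for the example, and the technology names and costs are hypothetical.

```python
def class_cost_summary(unit_costs, weights):
    """Weighted mean, lowest and highest unit cost across a class of technologies."""
    total_weight = sum(weights[name] for name in unit_costs)
    weighted_mean = sum(unit_costs[name] * weights[name] for name in unit_costs) / total_weight
    return {
        "weighted_mean": weighted_mean,
        "lowest": min(unit_costs.values()),
        "highest": max(unit_costs.values()),
    }


# Hypothetical annual cost per patient (GBP) and share of use for each technology.
unit_costs = {"X": 100.0, "Y": 200.0, "Z": 400.0}
shares = {"X": 0.5, "Y": 0.3, "Z": 0.2}

summary = class_cost_summary(unit_costs, shares)
print(summary)
```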

5.5.10 Value added tax (VAT) should be excluded from all economic evaluations, but included in calculation of the budgetary impact when the resources in question are liable for this tax.

Non-NHS and non-personal and social services costs

5.5.11 Some technologies may have a substantial impact on the costs (or cost savings) to government bodies other than the NHS. Exceptionally, these costs may be included if specifically agreed with the Department of Health, usually before referral of the topic. When non-reference-case analyses include these broader costs, explicit methods of valuation are required. In all cases, these costs should be reported separately from NHS and personal social services costs, and not included in the ICER.

5.5.12 Costs borne by patients may be included when they are reimbursed by the NHS or personal social services. When the rate of reimbursement varies between patients or geographical regions, such costs should be averaged across all patients. Where there are costs borne by patients that are not reimbursed by the NHS and personal social services, these may be presented separately. Productivity costs should be excluded.

5.5.13 When care by family members, friends or a partner might otherwise have been provided by the NHS or personal social services, it may be appropriate to consider the cost of the time spent providing this care, even when adopting an NHS or personal social services perspective. All analyses including the time spent by family members in providing care should be presented separately. A range of valuation methods exists to cost this type of care. Methods chosen should be clearly described and sensitivity analyses using other methods should be presented. Personal social service savings should also be incorporated.

5.6 Discounting

5.6.1 Cost-effectiveness results should reflect the present value of the stream of costs and benefits accruing over the time horizon of the analysis. For the reference case, the same annual discount rate should be used for both costs and benefits (currently 3.5%).

5.6.2 The specific discount rate varies across jurisdictions and over time. The Institute considers that it is usually appropriate to discount costs and health effects at the same annual rate of 3.5%, based on the recommendations of the UK Treasury for the discounting of costs.

5.6.3 Sensitivity analyses using rates of 1.5% for both costs and health effects may be presented alongside the reference-case analysis (see section 6.2.19).
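
The discounting described in this section can be sketched as follows. The sketch is illustrative only: the yearly streams of costs and QALYs are hypothetical, and the timing convention (values in year 0 are not discounted, values in year t are divided by (1 + rate) to the power t) is an assumption made for the example.

```python
def present_value(stream, rate):
    """Discount a yearly stream; stream[0] accrues in year 0 and is undiscounted."""
    return sum(value / (1.0 + rate) ** year for year, value in enumerate(stream))


# Hypothetical inputs: 1,000 pounds and 0.8 QALYs accruing each year for 10 years.
costs = [1000.0] * 10
qalys = [0.8] * 10

# 3.5% for the reference case, then 1.5% as the sensitivity analysis.
for rate in (0.035, 0.015):
    pv_costs = present_value(costs, rate)
    pv_qalys = present_value(qalys, rate)
    print(f"rate={rate:.1%}: discounted costs={pv_costs:.2f}, discounted QALYs={pv_qalys:.3f}")
```

The same rate is applied to both streams in each run, in line with the reference case.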

5.7 Modelling methods

5.7.1 Full documentation and justification of structural assumptions and data inputs should be provided. When there are alternative plausible assumptions and inputs, sensitivity analyses of their effects on model outputs should be undertaken.

5.7.2 Modelling provides an important framework for synthesising available evidence and generating estimates of clinical and cost effectiveness in a format relevant to the Appraisal Committee's decision-making process. Models are required for most technology appraisals. Situations when modelling is likely to be required include those when:

  • all the relevant evidence is not contained in a single trial

  • patients participating in trials do not represent the typical patients likely to use the technology within the NHS

  • intermediate outcome measures are used rather than effect on health-related quality of life and survival

  • relevant comparators have not been used or trials do not include evidence on relevant populations

  • clinical trial design includes crossover (treatment switching) that would not occur in clinical practice

  • costs and benefits of the technologies extend beyond the trial follow-up period.

5.7.3 Providing an all-embracing definition of what constitutes a high-quality model is not possible. In general, estimates of treatment effect should be based on the results of the systematic review, structural assumptions should be fully justified and data inputs should be clearly documented and justified in the context of a valid review of the alternatives.

5.7.4 The methods of quality assurance used in the development of the model should be detailed and the methods and results of model validation should be provided. In addition, the results from the analysis should be presented in a disaggregated format and should include a tabular presentation of information on estimates of life-years gained, mortality rates (at separate time points if appropriate) and the frequency of selected clinical events predicted by the model.

5.7.5 Clinical end points that reflect how a patient feels, functions, or how long a patient survives are regarded as more informative than surrogate end points (such as laboratory tests and imaging findings). When the use of 'final' clinical end points is not possible and 'surrogate' data on other outcomes are used to infer the effect of treatment on mortality and health-related quality of life, evidence in support of the surrogate-to-final end point outcome relationship must be provided together with an explanation of how the relationship is quantified for use in modelling. The usefulness of the surrogate end point for estimating QALYs will be greatest when there is strong evidence that it predicts health-related quality of life and/or survival. In all cases, the uncertainty associated with the relationship between the end point and health-related quality of life or survival should be explored and quantified.

5.7.6 Clinical trial data generated to estimate treatment effects may not sufficiently quantify the risk of some health outcomes or events for the population of interest or may not provide estimates over a sufficient duration for the economic analysis. The methods used to identify and critically appraise sources of data for economic models should be stated and the choice of particular data sets should be justified with reference to their suitability to the population of interest in the appraisal.

5.7.7 Modelling is usually required to extrapolate costs and health benefits over an extended time horizon. Assumptions used to extrapolate the impact of treatment over the relevant time horizon should have both external and internal validity and be reported transparently. The external validity of the extrapolation should be assessed by considering both clinical and biological plausibility of the inferred outcome as well as its coherence with external data sources such as historical cohort data sets or other relevant clinical trials. Internal validity should be explored and when statistical measures are used to assess the internal validity of alternative models of extrapolation based on their relative fit to the observed trial data, the limitations of these statistical measures should be documented. Alternative scenarios should also be routinely considered to compare the implications of different methods for extrapolation of the results. For example, for duration of treatment effects, scenarios might include when the treatment benefit in the extrapolated phase is: (i) nil; (ii) the same as during the treatment phase and continues at the same level; or (iii) diminishes in the long term.
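
The 3 scenarios for duration of treatment effect can be compared with a simple sketch such as the one below. It assumes, purely for illustration, a constant baseline hazard of death, a treatment effect expressed as a hazard ratio, a 2-year treatment phase and a linear loss of effect over 5 years in the diminishing scenario. None of these values or functional forms comes from this guide.

```python
import math


def hazard_ratio(t, scenario, hr_trial=0.7, trial_end=2.0, wane_years=5.0):
    """Hazard ratio applied at time t (years) under one extrapolation scenario."""
    if t < trial_end or scenario == "maintained":
        return hr_trial
    if scenario == "nil":
        return 1.0
    if scenario == "waning":
        # Linear return to no effect over wane_years (an illustrative choice).
        fraction = min((t - trial_end) / wane_years, 1.0)
        return hr_trial + (1.0 - hr_trial) * fraction
    raise ValueError(f"unknown scenario: {scenario}")


def life_years(scenario, baseline_hazard=0.10, horizon=30.0, step=0.01):
    """Undiscounted area under the survival curve, by simple numerical integration."""
    survival, total = 1.0, 0.0
    for i in range(int(round(horizon / step))):
        t = i * step
        total += survival * step
        survival *= math.exp(-baseline_hazard * hazard_ratio(t, scenario) * step)
    return total


for scenario in ("nil", "waning", "maintained"):
    print(f"{scenario}: {life_years(scenario):.2f} life-years")
```

Presenting the results of all 3 scenarios side by side shows how far the estimated benefit depends on the extrapolation assumption rather than on the observed trial data.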

5.7.8 In RCTs, participants randomised to the control group are sometimes allowed to switch treatment group and receive the active intervention. In these circumstances, when intention-to-treat analysis is considered inappropriate, statistical methods that adjust for treatment switching can also be presented. Simple adjustment methods such as censoring or excluding data from patients who cross over should be avoided because they are very susceptible to selection bias. The relative merits and limitations of the methods chosen to explore the impact of switching treatments should be explored, and the choice of method justified in relation to the specific characteristics of the data set in question. These characteristics include the mechanism of crossover used in the trial, the availability of data on baseline and time-dependent characteristics, and expectations around the treatment effect if the patients had remained on the treatment to which they were allocated.

5.8 Exploring uncertainty

5.8.1 It is important for the model to quantify the decision uncertainty associated with a technology (that is, the probability that a different decision would be reached if the true cost effectiveness of each technology could be ascertained before making the decision).

5.8.2 Models are subject to uncertainty around the structural assumptions used in the analysis. Examples of structural uncertainty may include how different states of health are categorised and how different pathways of care are represented. These structural assumptions should be clearly documented and the evidence and rationale to support them provided. The impact of structural uncertainty on estimates of cost effectiveness should be explored by separate analyses of a representative range of plausible scenarios.

5.8.3 Examples of when this type of scenario analysis should be conducted are:

  • when there is uncertainty about the most appropriate assumption to use for extrapolation of costs and outcomes beyond trial follow-up

  • when there is uncertainty about how the pathway of care is most appropriately represented in the analysis

  • when there may be economies of scale (for example, in appraisals of diagnostic technologies).

5.8.4 Uncertainty about the appropriateness of the methods used in the reference case can also be dealt with using sensitivity analysis, but these analyses must be presented separately.

5.8.5 A second type of uncertainty arises from the choice of data sources to provide values for the key parameters, such as different costs and utilities, estimates of relative effectiveness and their duration. The implications of different estimates of key parameters must be reflected in sensitivity analyses (for example, through the inclusion of alternative data sets). Inputs must be fully justified and uncertainty explored by sensitivity analysis using alternative input values.

5.8.6 The choice of data sources to include in an analysis may not be clear-cut. In such cases, the analysis should be re-run, using the alternative data source or excluding the study about which there is doubt, and the results reported separately. Examples of when this type of sensitivity analysis should be conducted are:

  • when alternative sets of plausible data on the health-related utility associated with the disease or intervention are available

  • when there is variability between hospitals in the cost of a particular resource or service, or the acquisition price of a particular technology

  • when there are doubts about the quality or relevance of a particular study in a meta-analysis or network meta-analysis.

5.8.7 A third source of uncertainty arises from parameter precision, once the most appropriate sources of information have been identified (that is, the uncertainty around the mean health and cost inputs in the model). Distributions should be assigned to characterise the uncertainty associated with the (precision of) mean parameter values. Probabilistic sensitivity analysis is preferred. This enables the uncertainty associated with parameters to be simultaneously reflected in the results of the model. In non-linear decision models, probabilistic methods provide the best estimates of mean costs and outcomes. The mean value, distribution around the mean, and the source and rationale for the supporting evidence should be clearly described for each parameter included in the model. The distributions chosen for probabilistic sensitivity analysis should not be arbitrarily chosen, but chosen to represent the available evidence on the parameter of interest, and their use should be justified. Formal elicitation methods are available if there is a lack of data to inform the mean value and associated distribution of a parameter. If there are alternative plausible distributions that could be used to represent uncertainty in parameter values, this should be explored by separate probabilistic analyses of these scenarios.
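
A minimal sketch of a probabilistic sensitivity analysis is given below. The model structure, the parameter names and every distribution parameter are hypothetical. The choice of beta distributions for probabilities and utilities, a log-normal distribution for the relative risk and a gamma distribution for costs is a common convention rather than a requirement stated here. The parameters are sampled independently for simplicity, which is itself an assumption that would need justification in a real analysis.

```python
import random
import statistics


def run_psa(n_draws=5000, seed=1):
    """Sample all parameters together; return (incremental cost, incremental QALYs) per draw."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        # Hypothetical distributions, sampled independently of one another.
        p_event_control = rng.betavariate(30, 70)        # probability of event: beta
        relative_risk = rng.lognormvariate(-0.36, 0.15)  # relative effect: log-normal
        cost_event = rng.gammavariate(25, 200)           # cost per event: gamma, mean 5,000
        qaly_loss_event = rng.betavariate(20, 80)        # QALY loss per event: beta
        cost_treatment = 1200.0                          # fixed acquisition cost

        p_event_treated = min(p_event_control * relative_risk, 1.0)
        events_avoided = p_event_control - p_event_treated
        d_cost = cost_treatment - events_avoided * cost_event
        d_qaly = events_avoided * qaly_loss_event
        results.append((d_cost, d_qaly))
    return results


draws = run_psa()
mean_cost = statistics.fmean(c for c, _ in draws)
mean_qaly = statistics.fmean(q for _, q in draws)
print(f"mean incremental cost: {mean_cost:.0f}")
print(f"mean incremental QALYs: {mean_qaly:.4f}")
print(f"ICER from mean values: {mean_cost / mean_qaly:.0f} per QALY")
```

The ICER is calculated from the mean incremental cost and mean incremental QALYs across draws, not as the mean of the per-draw ratios.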

5.8.8 Evidence about the extent of correlation between individual parameters should be carefully considered and reflected in the probabilistic analysis. Assumptions made about the correlations should be clearly presented.

5.8.9 The computational methods used to implement an appropriate model structure may occasionally present challenges in conducting probabilistic sensitivity analysis. The use of model structures that limit the feasibility of probabilistic sensitivity analysis should be clearly specified and justified. Models should always be fit for purpose, and should enable a thorough consideration of the decision uncertainty associated with the model structure and input parameters. The choice of a 'preferred' model structure or programming platform should not result in the failure to adequately characterise uncertainty.

5.8.10 Appropriate ways of presenting parameter uncertainty in cost-effectiveness data include confidence ellipses and scatter plots on the cost-effectiveness plane (when the comparison is restricted to 2 alternatives) and cost-effectiveness acceptability curves. The presentation of cost-effectiveness acceptability curves should include a representation and explanation of the cost-effectiveness acceptability frontier. Uncertainty should also be presented in tabular form. In addition to details of the expected mean results (costs, outcomes and ICERs), the probability that the treatment is cost effective at maximum acceptable ICERs of £20,000–£30,000 per QALY gained and the error probability (that the treatment is not cost effective) should also be presented, particularly when there are more than 2 alternatives.
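
The probability of cost effectiveness at a given maximum acceptable ICER can be sketched for the 2-alternative case as follows. The 5 probabilistic draws are hypothetical and far fewer than would be used in practice. Cost effectiveness in each draw is judged by whether incremental net monetary benefit (threshold multiplied by incremental QALYs, minus incremental cost) is positive, which is one common way of implementing the calculation.

```python
def prob_cost_effective(draws, threshold):
    """Share of draws in which incremental net monetary benefit is positive."""
    positive = sum(
        1 for d_cost, d_qaly in draws if threshold * d_qaly - d_cost > 0
    )
    return positive / len(draws)


# Hypothetical (incremental cost, incremental QALYs) draws against one comparator.
draws = [
    (5000.0, 0.30),
    (5500.0, 0.20),
    (4000.0, 0.25),
    (7000.0, 0.22),
    (5500.0, 0.15),
]

for threshold in (20000, 30000):
    p = prob_cost_effective(draws, threshold)
    print(f"threshold {threshold}: P(cost effective)={p:.2f}, error probability={1 - p:.2f}")
```

Repeating the calculation over a range of thresholds gives the points of a cost-effectiveness acceptability curve. With more than 2 alternatives, the net monetary benefit of every option would need to be compared within each draw.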

5.8.11 The use of univariate and best- or worst-case sensitivity analysis is an important way of identifying parameters that may have a substantial impact on the cost-effectiveness results and of explaining the key drivers of the model. However, such analyses become increasingly unhelpful in representing the combined effects of multiple sources of uncertainty as the number of parameters increases. The use of probabilistic sensitivity analysis can allow a more comprehensive characterisation of the parameter uncertainty associated with all input parameters.
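
A univariate sensitivity analysis of the kind described can be sketched as below. The toy model, its parameter names, base-case values and ranges are all hypothetical. Each parameter is varied in turn between its low and high value while the others are held at base case, and parameters are then ranked by the width of the resulting ICER range.

```python
BASE = {
    "drug_cost": 1200.0,
    "events_avoided": 0.09,
    "event_cost": 5000.0,
    "qaly_gain_per_event": 0.20,
}
RANGES = {
    "drug_cost": (1000.0, 1400.0),
    "events_avoided": (0.06, 0.12),
    "event_cost": (4000.0, 6000.0),
    "qaly_gain_per_event": (0.16, 0.24),
}


def icer(p):
    """Toy model: incremental cost divided by incremental QALYs."""
    d_cost = p["drug_cost"] - p["events_avoided"] * p["event_cost"]
    d_qaly = p["events_avoided"] * p["qaly_gain_per_event"]
    return d_cost / d_qaly


def one_way(base, ranges):
    """Vary one parameter at a time; return rows sorted by width of the ICER range."""
    rows = []
    for name, (low, high) in ranges.items():
        results = [icer({**base, name: value}) for value in (low, high)]
        rows.append((name, min(results), max(results)))
    return sorted(rows, key=lambda row: row[2] - row[1], reverse=True)


print(f"base-case ICER: {icer(BASE):.0f}")
for name, lowest, highest in one_way(BASE, RANGES):
    print(f"{name}: {lowest:.0f} to {highest:.0f}")
```

The ranking identifies the key drivers, but it does not show what happens when several parameters move together, which is the limitation noted above.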

5.9 Companion diagnostics

5.9.1 The use of a technology may be conditional on the presence or absence of a particular biomarker (for example a gene or a protein). If a diagnostic test to establish the presence or absence of this biomarker is carried out solely to support the treatment decision for the specific technology, the associated costs of the diagnostic test should be incorporated into the assessments of clinical and cost effectiveness. A sensitivity analysis should be provided without the cost of the diagnostic test. When appropriate, the diagnostic accuracy of the test for the particular biomarker of treatment efficacy should be examined and, when appropriate, incorporated in the economic evaluation.
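
The effect of including the cost of a companion diagnostic test can be sketched as follows. The sketch assumes that every candidate patient is tested, that only biomarker-positive patients are treated and that the test is perfectly accurate, so that the testing cost per treated patient equals the test cost divided by the proportion testing positive. These are simplifying assumptions for illustration, and all values are hypothetical.

```python
def icer_with_test(d_cost, d_qaly, test_cost, prevalence_positive, include_test=True):
    """ICER per treated patient, with testing cost spread over biomarker-positive patients."""
    if not 0.0 < prevalence_positive <= 1.0:
        raise ValueError("prevalence_positive must be in (0, 1]")
    test_cost_per_treated = test_cost / prevalence_positive if include_test else 0.0
    return (d_cost + test_cost_per_treated) / d_qaly


# Hypothetical inputs: 15,000 pounds and 0.6 QALYs gained per treated patient,
# a test costing 200 pounds, and 25% of tested patients being biomarker positive.
with_test = icer_with_test(15000.0, 0.6, 200.0, 0.25)
without_test = icer_with_test(15000.0, 0.6, 200.0, 0.25, include_test=False)
print(f"ICER including test cost: {with_test:.0f}")
print(f"ICER excluding test cost (sensitivity analysis): {without_test:.0f}")
```

Incorporating diagnostic accuracy would additionally require modelling the costs and outcomes of false-positive and false-negative results, which this sketch omits.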

5.9.2 The appraisal will take account of any requirements of the marketing authorisation, including tests to be completed and the definition of a positive test. In clinical practice in the NHS, a diagnostic test procedure other than that used in the clinical trials of the technology may be used. When appropriate, the possibility that using an alternative test (which may differ in diagnostic accuracy from that used in the clinical trials) may affect selection of the patient population for treatment and the cost effectiveness of the treatment will be highlighted in the appraisal guidance.

5.9.3 It is expected that assessments of multiple companion diagnostic test options will generally be undertaken in the NICE diagnostics assessment programme. For further information see the NICE diagnostics assessment programme manual.

5.10 Analysis of data for patient subgroups

5.10.1 For many technologies, the capacity to benefit from treatment will differ for patients with differing characteristics. This should be explored as part of the reference-case analysis by providing estimates of clinical and cost effectiveness separately for each relevant subgroup of patients. The characteristics of patients in the subgroup should be clearly defined and should preferably be identified on the basis of an expectation of differential clinical or cost effectiveness because of known, biologically plausible mechanisms, social characteristics or other clearly justified factors. When possible, potentially relevant subgroups will be identified at the scoping stage with consideration being given to the rationale for expecting a subgroup effect. However, this does not preclude the identification of subgroups later in the process; in particular, during the deliberations of the Appraisal Committee.

5.10.2 Given the Institute's focus on maximising health gain from limited resources, it is important to consider how clinical and cost effectiveness may differ because of differing characteristics of patient populations. Typically, the capacity to benefit from treatment will differ between patients, and this may also impact on the subsequent cost of care. There should be a clear justification and, if appropriate, biological plausibility for the definition of the patient subgroup and the expectation of a differential effect. Post hoc data 'dredging' in search of subgroup effects is to be avoided and will be viewed sceptically.

5.10.3 The estimate of the overall net treatment effect of an intervention is determined by the baseline risk of a particular condition or event and/or the relative effects of the technology compared with the relevant comparator treatment. The overall net treatment effect may also be determined by other features of the people comprising the population of interest. It is therefore likely that relevant subgroups may be identified in terms of differences in 1 or more contributors to absolute treatment effects.
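
The dependence of absolute treatment effect on baseline risk can be illustrated with a short sketch. It assumes that the relative effect is expressed as a relative risk that is constant across subgroups. That assumption, and the 2 baseline risks used, are hypothetical.

```python
def absolute_risk_reduction(baseline_risk, relative_risk):
    """Absolute reduction in event risk for a given baseline risk and relative risk."""
    return baseline_risk * (1.0 - relative_risk)


RELATIVE_RISK = 0.75  # hypothetical, assumed identical in both subgroups

for label, baseline in (("higher-risk subgroup", 0.20), ("lower-risk subgroup", 0.04)):
    arr = absolute_risk_reduction(baseline, RELATIVE_RISK)
    print(f"{label}: baseline risk={baseline:.2f}, absolute risk reduction={arr:.3f}")
```

With the same relative risk, the absolute benefit in the higher-risk subgroup is 5 times that in the lower-risk subgroup, which is why differences in baseline risk alone can define a relevant subgroup.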

5.10.4 For subgroups based on differences in baseline risk of specific health outcomes, systematic identification of data to quantify this is required. It is important that the methods for identifying appropriate baseline data for the purpose of subgroup analysis are provided in sufficient detail to enable replication and critical appraisal.

5.10.5 Care should be taken to specify how subgroup analyses are undertaken, including the choice of scale on which any effect modification is defined. The statistical precision of all subgroup estimates should be reflected in the analysis of parameter uncertainty. The characteristics of the patients associated with the subgroups presented should be clearly specified to allow the Appraisal Committee to judge the appropriateness of the analysis with regard to the decision problem.

5.10.6 The standard subgroup analyses performed in RCTs or systematic reviews seek to determine whether there are differences in relative treatment effects between subgroups (through the analysis of interactions between the effectiveness of the technology and patient characteristics). The possibility of differences emerging by chance, particularly when multiple subgroups are reported, is high and should be taken into account. Pre-specification of a particular subgroup in the study or review protocol, with a clear rationale for anticipating a difference in efficacy and a prediction of the direction of the effect, will increase the credibility of a subgroup analysis.
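
The risk of differences emerging by chance can be quantified with a simple sketch. It assumes that the subgroup tests are independent and that each is carried out at a 5% significance level. Both are simplifying assumptions, since subgroup tests in a real trial are often correlated.

```python
def chance_of_spurious_finding(n_subgroups, alpha=0.05):
    """Probability of at least 1 false-positive result across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_subgroups


for n in (1, 5, 10, 20):
    print(f"{n} subgroup tests: {chance_of_spurious_finding(n):.1%}")
```

Under these assumptions, the chance of at least 1 spurious subgroup effect is about 40% when 10 subgroups are tested, even if no true subgroup effect exists.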

5.10.7 In considering subgroup analyses, the Appraisal Committee will take specific note of the biological or clinical plausibility of a subgroup effect in addition to the strength of the evidence in favour of such an effect (for example, if it has a clear, pre-specified rationale and is consistent across studies). The evidence supporting biological or clinical plausibility for a subgroup effect should be fully documented, including details of statistical analysis.

5.10.8 Individual patient data are preferred, if available, for the estimation of subgroup-specific parameters. However, as for all evidence, the appropriateness of such data will always be assessed by considering factors such as the quality of the analysis, how representative the available evidence is to clinical practice and how relevant it is to the decision problem.

5.10.9 Consideration of subgroups based on differential cost may be appropriate in some circumstances; for example, if the cost of managing a particular complication of treatment is known to be different in a specific subgroup.

5.10.10 When considering subgroups, the Appraisal Committee pays particular attention to its legal obligations on equality and human rights.

5.10.11 Types of subgroups that are not considered relevant are those based solely on the following factors:

  • subgroups based solely on differential treatment costs for individuals according to their social characteristics

  • subgroups specified in relation to the costs of providing treatment in different geographical locations in the UK (for example, when the costs of facilities available for providing the technology vary according to location).

5.10.12 'Treatment continuation rules', whereby cost effectiveness is maximised based on continuing treatment only in those who achieve a specified 'response' within a given time, should not be analysed as a separate subgroup. Rather, the strategy involving the 'continuation rule' should be analysed as a separate scenario, by considering it as an additional treatment strategy alongside the base-case interventions and comparators. This enables the costs and health consequences of factors such as any additional monitoring associated with the continuation rule to be incorporated into the economic analysis. Additional considerations for continuation rules include:

  • the robustness and plausibility of the end point on which the rule is based

  • whether the 'response' criteria defined in the rule can be reasonably achieved

  • the appropriateness and robustness of the time at which response is measured

  • whether the rule can be incorporated into routine clinical practice

  • whether the rule is likely to predict those patients for whom the technology is particularly cost effective

  • considerations of fairness with regard to withdrawal of treatment from people whose condition does not respond to treatment.

5.11 Presentation of data and results

Presenting data

5.11.1 All parameters used to estimate clinical and cost effectiveness should be presented clearly in tabular form and include details of data sources. For continuous variables, mean values should be presented and used in the analyses. For all variables, measures of precision should be detailed. For probabilistic analyses, the distributions used to characterise the uncertainty in input parameters should be documented and justified. As much detail as possible on the data used in the analysis should be provided.

Presenting expected cost-effectiveness results

5.11.2 The expected value of each component of cost and expected total costs should be presented; expected QALYs for each option compared in the analysis should also be detailed in terms of their main contributing components. ICERs should be calculated as appropriate.

5.11.3 The main individual components comprising both costs and QALYs for the intervention and control treatment pathways should be tabulated. For QALYs this includes presenting the life-year component separately. The costs and QALYs associated with different stages of the disease should also be presented separately.
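
A disaggregated presentation of this kind, together with the resulting ICER, can be sketched as follows. The cost components, the disease stages and all values are hypothetical, and the sketch compares only 2 pathways.

```python
# Hypothetical expected costs (pounds) and QALYs per patient, by component.
intervention = {
    "costs": {"drug": 12000.0, "administration": 1500.0, "monitoring": 800.0, "adverse events": 400.0},
    "qalys": {"pre-progression": 2.10, "post-progression": 0.65},
}
comparator = {
    "costs": {"drug": 4000.0, "administration": 1000.0, "monitoring": 600.0, "adverse events": 300.0},
    "qalys": {"pre-progression": 1.80, "post-progression": 0.60},
}


def totals(pathway):
    """Return (total cost, total QALYs) for one treatment pathway."""
    return sum(pathway["costs"].values()), sum(pathway["qalys"].values())


def icer(new, old):
    """Incremental cost per QALY gained for new versus old."""
    cost_new, qaly_new = totals(new)
    cost_old, qaly_old = totals(old)
    return (cost_new - cost_old) / (qaly_new - qaly_old)


for kind in ("costs", "qalys"):
    for component in intervention[kind]:
        difference = intervention[kind][component] - comparator[kind][component]
        print(f"{kind} / {component}: increment {difference:.2f}")

print(f"ICER: {icer(intervention, comparator):.0f} per QALY gained")
```

Tabulating the increments by component shows which elements of cost and which stages of disease drive the overall result.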

5.12 Impact on the NHS

Implementation of NICE guidance

5.12.1 Information on the net impact of the implementation of the health technology on the NHS (and personal and social services, when appropriate) is required.

5.12.2 As outlined in more detail below, when possible, the information on NHS impact should include details on key epidemiological and clinical assumptions, resource units and costs with reference to a general England and Wales population, and patient or service base (for example, per 100,000 population, per average primary care trust or per ward).

Implementation or uptake and population health impact

5.12.3 Evidence-based estimates of the current baseline treatment rates and the expected appropriate implementation, uptake or treatment rates of the appraised and comparator technologies in the NHS should be supplied. In addition, an estimate of the resulting health impact (for example, QALYs or life-years gained) in a given population should ideally be attempted. These estimates should take account of the condition's epidemiology and the appropriate levels of access to diagnosis and treatment in the NHS. They should also highlight any key assumptions or uncertainties.
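
An estimate of uptake and population health impact for a standard population can be sketched as below. Every input (prevalence, the share of patients eligible, baseline and expected uptake, and QALY gain per patient treated) is a hypothetical value chosen for illustration, and the population base of 100,000 is one of the options mentioned above.

```python
def population_impact(
    population=100_000,
    prevalence=0.02,
    eligible_share=0.40,
    baseline_uptake=0.10,
    expected_uptake=0.35,
    qaly_gain_per_patient=0.25,
):
    """Eligible patients, additional patients treated and QALYs gained in a population."""
    eligible = population * prevalence * eligible_share
    additional_treated = eligible * (expected_uptake - baseline_uptake)
    return {
        "eligible": eligible,
        "additional_treated": additional_treated,
        "qalys_gained": additional_treated * qaly_gain_per_patient,
    }


impact = population_impact()
for key, value in impact.items():
    print(f"{key}: {value:.0f} per 100,000 population")
```

Listing the inputs as named parameters makes the key epidemiological and clinical assumptions explicit and easy to vary.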

Resource impact

5.12.4 Implementation of a new health technology will have direct implications for the provision of units of the appraised and comparator technologies (for example, doses of drugs or theatre hours) by the NHS. In addition, the technology may have a knock-on impact (increase or decrease) on other NHS and personal and social services resources, including alternative or avoided treatment and resources required to support the use of the new technology. These might include:

  • staff numbers and hours

  • training and education

  • support services (for example, laboratory tests)

  • service capacity or facilities (for example, hospital beds, clinic sessions, diagnostic services and residential home places).

5.12.5 Any likely constraints on the resources required to support the implementation of the appraised technology should be highlighted, and comment should be made on the impact this may have on the implementation timescale.

Costs

5.12.6 Estimates of net NHS (and personal and social services, when appropriate) costs of the expected resource impact should be provided to allow effective national and local financial planning. The costs should be disaggregated by appropriate generic organisational (for example, NHS, personal and social services, hospital or primary care) and budgetary categories (for example, drugs, staffing, consumables or capital). When possible, this should be to the same level and detail as that adopted in resource unit information. If savings are anticipated, the extent to which these finances can actually be realised should be specified. Supplied costs should also specify the inclusion or exclusion of VAT. The cost information should be based on published cost analyses or recognised publicly available databases or price lists.

5.12.7 If implementation of the technology could have substantial resource implications for other services, the effects on the submitted cost-effectiveness evidence for the technology should be explored.

5.12.8 The Institute produces costing tools to allow individual NHS organisations and local health economies to quickly assess the impact guidance will have on local budgets. Details of how the costing tools are developed are available in the Institute's document, Assessing cost impact: methods guide.



[1] The independent academic groups follow general guidelines for conducting systematic reviews published by the Centre for Reviews and Dissemination at the University of York (Systematic Reviews: CRD's guidance for undertaking reviews in health care).