Appendix E. Measurement of health benefits

This Appendix, taken with modifications from the Evaluation Report to the Appraisals Committee, provides some background information on the way in which health benefits are calculated. It does not form part of the guidance proper.

A1 Measuring benefits

A1.1 Measures of the benefit of treatment used in cost-effectiveness analyses can be based on 'natural' units, for example years of life gained, or on value-based measures, for example Quality Adjusted Life Years (QALYs). The number of QALYs gained by using a particular treatment is a measure of its benefit in terms of improvements in the quality of life of patients (including physical performance, pain, distress and psychological improvements as well as changes in survival) summed over a period of time. It therefore incorporates the value of changes in both morbidity and mortality, where these exist.

A1.2 In the particular case of MS, although there are natural units which capture specific aspects of the impact of MS, such as relapses avoided and delaying progression to wheelchair dependency, there is none which captures both the impact on relapses and the full impact of progression. These measures therefore ignore some of the established benefits of the beta interferons.

A1.3 Although the EDSS is imperfect as a 'natural' unit for capturing gains from delayed progression, it does provide a means to create a value-based measure of benefit. All of the studies that attempt to encompass the full effect of delayed progression have converted changes in EDSS into changes in QALYs. This requires an estimate of the utility (an adjustment for level of quality of life) to be applied to each EDSS level, based not on the disability alone but on all the associated morbidity.

A1.4 The literature and the submissions provide an alternative measure based on the EDSS, variously called 'Area Under the Curve', 'integrated area under the EDSS time curve' or 'disability burden unit'. This is calculated by multiplying the EDSS score by the time during which that score is observed, and summing over time. This measure is therefore very similar to the QALY, the difference being that each EDSS score is used at its face value rather than being weighted according to the relative utility of the corresponding health state.
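
The calculation described in A1.4, and its relationship to the QALY, can be sketched as follows. This is an illustration only: the patient trajectory and the utility weights are invented for the example and are not taken from the Evaluation Report, the submissions or any study, and the function names are hypothetical.

```python
# Illustrative sketch only. The trajectory and the utility weights below are
# invented for this example; they do not come from any study or submission.

def summed_edss(trajectory):
    """Area under the EDSS-time curve: sum of (EDSS score x years at that score)."""
    return sum(score * years for score, years in trajectory)


def qalys(trajectory, utility):
    """Same summation, but each EDSS level is first converted to a utility weight."""
    return sum(utility[score] * years for score, years in trajectory)


# Hypothetical patient: 2 years at EDSS 3, 3 years at EDSS 4, 5 years at EDSS 6.
trajectory = [(3, 2), (4, 3), (6, 5)]

# Hypothetical utility weights for each EDSS level (1 = full health, 0 = death).
utility = {3: 0.75, 4: 0.5, 6: 0.25}

edss_burden = summed_edss(trajectory)    # 3*2 + 4*3 + 6*5
qaly_total = qalys(trajectory, utility)  # 0.75*2 + 0.5*3 + 0.25*5

print(edss_burden, qaly_total)
```

The two functions differ only in the weight attached to each period of time. Note that the measures run in opposite directions: a higher summed EDSS indicates a greater burden of disability, whereas a higher QALY total indicates greater benefit.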

A1.5 This summed EDSS measure has a number of disadvantages. The numbers used in the EDSS itself are not cardinal numbers either by construction or by behaviour. (A "cardinal" number can be added, subtracted, multiplied or divided, and the result has a ready meaning.) The EDSS score is, by contrast, "ordinal": a higher score represents greater disability, but it does not imply, for example, that an EDSS score of 8 (restricted to bed or chair or perambulated in a wheelchair) is twice as disabled a state as an EDSS score of 4 (fully ambulatory and able to walk up to 500 metres without aid or rest). This means that the summed EDSS measure is also not cardinal. Its units are arbitrary, so a cost per summed EDSS score avoided is equally arbitrary. The utility scores used in calculating QALYs weight the underlying EDSS scores in ways designed to produce cardinal numbers having identifiable units. The summed EDSS score therefore shares any problems that the QALY has and has a number of others besides.
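
The consequence of summing an ordinal score can be illustrated with a small sketch. The two patient trajectories and the alternative numbering of the scale are hypothetical, chosen only to make the point; the alternative numbering is not a proposed replacement for the EDSS. Both numberings rank the three health states in the same order, yet they disagree about which patient carries the greater summed burden.

```python
# Illustrative sketch only: trajectories and the relabelled scale are invented.

def summed_score(trajectory, label):
    """Sum of (numeric label of the state x years spent in that state)."""
    return sum(label[score] * years for score, years in trajectory)


# Hypothetical patients, each followed for 10 years.
patient_a = [(4, 10)]          # 10 years at EDSS 4
patient_b = [(2, 5), (7, 5)]   # 5 years at EDSS 2, then 5 years at EDSS 7

# The original EDSS numbering, and an order-preserving relabelling of it.
original = {2: 2, 4: 4, 7: 7}
relabelled = {2: 2, 4: 6, 7: 8}  # same ordering of states, different spacing

a_original = summed_score(patient_a, original)      # 4*10
b_original = summed_score(patient_b, original)      # 2*5 + 7*5
a_relabelled = summed_score(patient_a, relabelled)  # 6*10
b_relabelled = summed_score(patient_b, relabelled)  # 2*5 + 8*5

print(a_original < b_original)      # patient A appears to have the smaller burden
print(a_relabelled < b_relabelled)  # under the relabelling the ranking reverses
```

Because an ordinal scale conveys only the order of the states, neither numbering is more correct than the other, which is why the result of the summation depends on an arbitrary choice.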

A2 The use of QALYs in MS

A2.1 Although all of the submissions to the Committee from the manufacturers report QALYs and cost-effectiveness ratios derived from them, some also make a number of criticisms of the approach. These include some unexplained assertions, but the following statements warrant further comment:

A2.2 QALYs discriminate against people with MS.

This appears to be based on two premises. The first is a mistaken belief that QALY measurement does not count transient improvements in quality of life; that is emphatically not the case. The second is a related argument that people with disabilities do not have the same potential to gain QALYs because of their lower underlying quality of life. However, this argument only applies, and then in theory only, to therapies that are lifesaving. It does not apply to interventions that improve quality of life – on the contrary, lower quality of life suggests a greater capacity to gain QALYs. Since the impact of therapies for MS is dominated by improvements in quality of life, this criticism does not apply.
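
The arithmetic behind the second premise can be sketched as follows. This is a simplified illustration under stated assumptions: the baseline utilities and the time horizon are invented, a life-extending therapy is assumed to add years at the patient's unchanged baseline utility, and a quality-improving therapy is assumed to be able to raise utility to at most 1 (full health). The function names are hypothetical.

```python
# Illustrative sketch only: utilities and time horizon are invented.

def gain_from_life_extension(baseline_utility, extra_years):
    """QALYs gained when extra years are lived at the baseline utility."""
    return baseline_utility * extra_years


def max_gain_from_quality_improvement(baseline_utility, years):
    """Largest possible QALY gain when utility is raised from baseline to 1."""
    return (1.0 - baseline_utility) * years


lower_baseline = 0.5    # hypothetical person with lower quality of life
higher_baseline = 0.75  # hypothetical person with higher quality of life
years = 4

# Life-extending therapy: the person with the lower baseline gains fewer QALYs.
extension_lower = gain_from_life_extension(lower_baseline, years)
extension_higher = gain_from_life_extension(higher_baseline, years)

# Quality-improving therapy: the person with the lower baseline can gain more.
quality_lower = max_gain_from_quality_improvement(lower_baseline, years)
quality_higher = max_gain_from_quality_improvement(higher_baseline, years)

print(extension_lower, extension_higher)
print(quality_lower, quality_higher)
```

Under these assumptions the disadvantage arises only for the life-extending case; for a therapy that improves quality of life, the lower baseline leaves more room for gain.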

A2.3 QALYs do not discriminate in favour of people with MS.

The QALY approach is egalitarian in considering any particular gain in quantity or quality of life as being of equal value regardless of the age, sex or other characteristics of the recipients. The suggestion made is that QALYs should be adjusted so that they are greater for those of working age. In other words, it is proposed that one should discriminate against young and old people, because they do not work or have dependants. Whilst there is some evidence that there are those who would support such discrimination, it is unclear how far it should be taken. A logical implication of the argument in favour of such discrimination is that QALYs should be weighted against individuals of working age who do not have dependants or who are unable to work. It might even imply the use of an individual weight based on the number of dependants and the size of income from employment.

A2.4 QALY gains are estimated using population-based utility values, which are inferior to those based on patient preferences.

The evidence provided by Parkin et al. (Journal of Neurology, Neurosurgery and Psychiatry, 2000; 68: 144-49) suggested that, despite differences in utility values for health states, estimates of QALY gains were not affected by the use of patient rather than population utilities. Moreover, there is an argument that societal-based estimates used consistently for all evaluations are more appropriate because they reflect wider values that are comparable over different therapies.

A2.5 QALY gains include average relapses and therefore do not take account of severe relapses.

This is not correct, since the calculation of an average includes both more severe and milder relapses as well as those of average severity. A larger sample of people with MS than has been studied to date, and thus one containing more relapses, might include a greater proportion of severe relapses and so might plausibly raise the average severity. However, it might equally include a smaller proportion of severe relapses and so lower the average severity. There is no evidence either way.

A2.6 The loss of utility due to relapses may be an underestimate because it is assessed after the event.

This may be true; there are methodological difficulties in obtaining quality of life data during relapses that are serious enough to require hospitalisation, which means that the claim is difficult to test. However, there is no evidence that the values are either too high or too low.