3 The manufacturer's submission

The Appraisal Committee (section 8) considered evidence submitted by the manufacturer of teriflunomide and a review of this submission by the Evidence Review Group (ERG; section 9).

Clinical effectiveness

3.1 The manufacturer provided clinical-effectiveness evidence, identified through systematic review, from:

  • 3 phase III randomised controlled clinical trials: TEMSO (n=1088, 108 weeks follow-up), TENERE (n=324, follow-up between 48 and 118 weeks) and TOWER (n=1169, follow-up between 48 and 154 weeks)

  • a phase II trial: Study 2001 (n=179, 36 weeks)

  • 2 extension studies: to Study 2001 (n=147, median 7.1 years follow-up) and to TEMSO (n=742).

    TEMSO, TOWER and Study 2001 compared the effectiveness of teriflunomide (7 mg or 14 mg once daily) with placebo. After completion of the core study for TEMSO and Study 2001, patients could enter the extension phases of the studies. Those who were originally randomised to teriflunomide continued their assigned treatment and those receiving placebo were re-allocated to teriflunomide 7 mg or 14 mg. TENERE compared the effectiveness of teriflunomide (7 mg or 14 mg once daily) with Rebif‑44 (interferon beta‑1a) 3 times a week. Each of the phase III multicentre trials included sites in the UK.

3.2 The inclusion criteria of TEMSO, TOWER and Study 2001 specified the number of previous relapses before study entry. For TEMSO and TOWER, this was at least 1 relapse in the previous year, or at least 2 in the previous 2 years. For Study 2001, this was 1 relapse in the previous year or 2 in the previous 3 years. The phase III trials included people with an Expanded Disability Status Scale (EDSS) score between 0 and 5.5, whereas Study 2001 and the extension studies had a range of between 0 and 6.0.

3.3 The primary outcome of TEMSO and TOWER was annualised relapse rate. The primary outcome of TENERE was time to failure (which included treatment failure and discontinuation), and the primary outcome of Study 2001 was combined unique active (new and persisting) lesions per MRI scan. The trial outcomes presented by the manufacturer included annualised relapse rate, severity of relapse (inferred from hospitalisation), disability (EDSS score, 3‑month sustained accumulation of disability [SAD], and 6‑month SAD), freedom from disease activity, mortality, adverse events and discontinuation rate. The manufacturer's meta-analysis of the placebo-controlled trials included 5 outcomes: annualised relapse rate, proportion of relapse-free patients, 3‑month SAD, all-cause discontinuations and discontinuations because of adverse events. The intention-to-treat populations were used for analyses of clinical trial data.

3.4 The manufacturer provided data from Study 2001, TEMSO, TOWER and a meta-analysis, which compared the 14 mg dose of teriflunomide with placebo. Data from TEMSO, TOWER and the meta-analysis showed that teriflunomide was associated with a statistically significant reduction in adjusted annualised relapse rate (adjusted for EDSS score at baseline and geographic region) compared with placebo:

  • TEMSO (teriflunomide 0.37 [95% confidence interval {CI} 0.31 to 0.44], placebo 0.54 [95% CI 0.47 to 0.62], 31.5% relative risk reduction, p<0.001)

  • TOWER trial (teriflunomide 0.32 [95% CI 0.27 to 0.38], placebo not stated, relative risk 0.637 [95% CI 0.512 to 0.793], p=0.0001)

  • meta-analysis (relative risk compared with placebo 0.66, 95% CI 0.59 to 0.75).

    Study 2001 showed that teriflunomide reduced the point estimate for the annualised relapse rate compared with placebo, but this was not statistically significant. TEMSO showed that statistically significantly fewer people receiving teriflunomide had 3‑month SAD than those receiving placebo (teriflunomide: 20.2% [95% CI 15.6 to 24.7], placebo: 27.3% [95% CI 22.3 to 32.3], hazard ratio [HR] 0.70 [95% CI 0.51 to 0.97]). TOWER showed lower rates of 3‑month SAD with teriflunomide than placebo at 48 weeks (teriflunomide: 7.8%, placebo: 14.2%) and at 132 weeks (teriflunomide: 15.8%, placebo: 21.0%). The meta-analysis of TEMSO and TOWER (Study 2001 was not included in this analysis) estimated a statistically significantly lower risk of 3‑month SAD for teriflunomide compared with placebo (HR 0.694, 95% CI 0.544 to 0.886). Teriflunomide did not statistically significantly reduce 6‑month SAD compared with placebo in TEMSO (HR 0.749; 95% CI 0.505 to 1.111) or TOWER (HR 0.843; 95% CI 0.533 to 1.334). No statistically significant differences in EDSS change from baseline were seen in the TEMSO and Study 2001 trials. Changes in EDSS from TOWER were provided as commercial in confidence by the manufacturer and therefore cannot be given here. The manufacturer also provided health-related quality-of-life data. Changes in fatigue and health-related quality of life (measured using SF‑36 and EQ‑5D) were not statistically significantly different between teriflunomide and placebo in the individual trials.
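The relative risk reduction reported for TEMSO in section 3.4 follows from the 2 adjusted annualised relapse rates. The short Python sketch below reproduces that arithmetic. The function name is illustrative and not taken from the submission. The published figure came from the manufacturer's adjusted model, so recomputing from rounded rates will not necessarily agree for every trial.

```python
def relative_risk_reduction(rate_treatment: float, rate_control: float) -> float:
    """Return 1 - (treatment rate / control rate) for two annualised relapse rates."""
    if rate_control <= 0:
        raise ValueError("control rate must be positive")
    return 1.0 - rate_treatment / rate_control


# Adjusted annualised relapse rates for TEMSO as quoted in section 3.4.
temso_rrr = relative_risk_reduction(rate_treatment=0.37, rate_control=0.54)
print(f"TEMSO relative risk reduction: {temso_rrr:.1%}")  # prints 31.5%
```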

3.5 The primary outcome of TENERE was time to failure, defined as confirmed relapse or treatment discontinuation with teriflunomide 14 mg compared with Rebif‑44. Of those receiving teriflunomide, 37.8% experienced failure compared with 42.4% in the Rebif‑44 group. For the adjusted annualised relapse rate, no statistically significant differences between teriflunomide and Rebif‑44 were reported in the TENERE study (0.259 compared with 0.216, respectively; p=0.59). The SAD data were provided as commercial in confidence and cannot be presented here. At week 48 the global satisfaction score on the Treatment Satisfaction Questionnaire for Medication was statistically significantly higher with teriflunomide than Rebif‑44 (higher score indicates better satisfaction; 68.818 compared with 60.975, p=0.0162).

3.6 The manufacturer did a mixed treatment comparison (MTC) that compared teriflunomide with each of the treatments in the decision problem (beta interferons, glatiramer acetate, natalizumab and fingolimod). The base-case MTC included 30 clinical trials, which recruited patients from the year 2000 onwards, at least 80% of whom had relapsing–remitting multiple sclerosis. A separate 'all years' analysis was also provided, which included all studies, including those that recruited patients before 2000. The manufacturer justified the year 2000 as an appropriate cut-off point because of changes in the diagnostic criteria used in multiple sclerosis trials, which coincided with a reduction in annualised relapse rates at diagnosis. After 2000, the McDonald criteria were used, which identify multiple sclerosis earlier than the previously used Poser criteria. The outcomes presented in the MTCs included annualised relapse rate, proportion of relapse-free patients, 3‑month SAD, all-cause discontinuation rate, and discontinuation rate because of adverse events. The MTCs used a Bayesian random effects model. The results from the base-case MTC (post-2000) and 'all years' MTC are discussed for each comparator separately in sections 3.7 to 3.9.

3.7 The manufacturer provided data from the base-case MTC (post-2000) on the clinical effectiveness of teriflunomide 14 mg compared with all the disease-modifying therapies including the interferons Rebif‑44, Betaferon (interferon beta‑1b) and Avonex (interferon beta‑1a), as well as glatiramer acetate (see section 3.9 for the natalizumab and fingolimod results).

  • For the annualised relapse rate, no statistically significant differences were seen between teriflunomide and Rebif‑44 (rate ratio 1.06, 95% CI 0.84 to 1.35), Betaferon (rate ratio 0.98, 95% CI 0.73 to 1.31), Avonex (rate ratio 0.86, 95% CI 0.69 to 1.05) or glatiramer acetate (rate ratio 1.05, 95% CI 0.83 to 1.31).

  • The base-case MTC (post-2000) also suggested no statistically significant difference in 3‑month SAD between teriflunomide and Rebif‑44 (HR 0.90, 95% CI 0.54 to 1.45), Betaferon (HR 0.58, 95% CI 0.30 to 1.12), Avonex (HR 0.77, 95% CI 0.50 to 1.24) or glatiramer acetate (HR 0.76, 95% CI 0.45 to 1.30).

  • The base-case MTC (post-2000) also suggested there was a statistically significantly greater rate of all-cause discontinuation with teriflunomide compared with Betaferon (odds ratio 2.10, 95% CI 1.22 to 3.50) and glatiramer acetate (odds ratio 1.50, 95% CI 1.02 to 2.23). There was no statistically significant difference in discontinuation rate between teriflunomide and Rebif‑44 (odds ratio 0.80, 95% CI 0.54 to 1.30) or Avonex (odds ratio 1.13, 95% CI 0.71 to 1.82).
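The statements about statistical significance in section 3.7 can be checked against the reported intervals. The sketch below assumes the conventional reading that a ratio estimate is statistically significant at the 5% level when its 95% interval excludes 1. The helper name is illustrative, and the intervals are the all-cause discontinuation odds ratios quoted above.

```python
def excludes_one(lower: float, upper: float) -> bool:
    """True when a ratio's interval lies entirely above or entirely below 1."""
    return lower > 1.0 or upper < 1.0


# All-cause discontinuation odds ratios (teriflunomide vs comparator), 95% CI, section 3.7.
discontinuation_ci = {
    "Betaferon": (1.22, 3.50),
    "glatiramer acetate": (1.02, 2.23),
    "Rebif-44": (0.54, 1.30),
    "Avonex": (0.71, 1.82),
}

significant = {
    name: excludes_one(lower, upper)
    for name, (lower, upper) in discontinuation_ci.items()
}
for name, flag in significant.items():
    print(f"{name}: {'significant' if flag else 'not significant'}")
```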

3.8 The manufacturer also provided data from the 'all years' MTC on the clinical effectiveness of teriflunomide 14 mg compared with all the disease-modifying therapies including the interferons Rebif‑44, Betaferon (interferon beta‑1b) and Avonex (interferon beta‑1a), and glatiramer acetate.

  • For the annualised relapse rate, no statistically significant differences were seen between teriflunomide and Rebif‑44 (rate ratio 1.04, 95% CI 0.84 to 1.28), Betaferon (rate ratio 0.97, 95% CI 0.80 to 1.18), Avonex (rate ratio 0.84, 95% CI 0.71 to 1.00) or glatiramer acetate (rate ratio 1.01, 95% CI 0.85 to 1.21).

  • The 'all years' MTC also showed no statistically significant difference in 3‑month SAD between teriflunomide and Rebif‑44 (HR 0.97, 95% CI 0.66 to 1.43), Betaferon (HR 0.75, 95% CI 0.49 to 1.24), Avonex (HR 0.80, 95% CI 0.51 to 1.23) or glatiramer acetate (HR 0.90, 95% CI 0.60 to 1.37).

  • The 'all years' MTC showed no statistically significant difference in all-cause discontinuation between teriflunomide and Rebif‑44 (odds ratio 0.74, 95% CI 0.51 to 1.10), Betaferon (odds ratio 1.56, 95% CI 0.95 to 2.49), Avonex (odds ratio 1.04, 95% CI 0.66 to 1.60) or glatiramer acetate (odds ratio 1.28, 95% CI 0.89 to 1.84).

3.9 The base-case MTC (post-2000) and 'all years' MTC were also used to compare the clinical effectiveness of teriflunomide with fingolimod and natalizumab for the whole active relapsing–remitting multiple sclerosis population. The base-case MTC (post-2000) suggested that teriflunomide was associated with a statistically significantly higher annualised relapse rate compared with fingolimod (rate ratio 1.45, 95% CI 1.17 to 1.80) and natalizumab (rate ratio 2.12, 95% CI 1.63 to 2.75). This was also seen with the 'all years' MTC. Both the base-case MTC (post-2000) and 'all years' MTC showed that there was no statistically significant difference in 3‑month SAD between teriflunomide and fingolimod or natalizumab. The manufacturer also presented data for the following outcomes: proportion of patients who were relapse-free; all-cause discontinuation; and discontinuation because of adverse events.

3.10 The manufacturer conducted 2 separate indirect comparisons of teriflunomide for the subgroups of patients in TEMSO with highly active relapsing–remitting multiple sclerosis (teriflunomide n=11, placebo n=10; approximately 2% of the trial population) and rapidly evolving severe relapsing–remitting multiple sclerosis (teriflunomide n=33, placebo n=39; approximately 7% of the trial population). The indirect comparisons used data from the fingolimod European public assessment report and the natalizumab manufacturer's submission to NICE. The outcomes presented included annualised relapse rate and 3‑month SAD. The 95% confidence intervals and probability (p) values were not provided. The detailed results of these analyses were provided but were marked as commercial in confidence and therefore cannot be presented here. The indirect treatment comparisons suggested that teriflunomide was associated with a lower annualised relapse rate and 3‑month SAD compared with fingolimod in highly active relapsing–remitting multiple sclerosis. They also suggested that teriflunomide was associated with a lower annualised relapse rate, but a higher 3‑month SAD, than natalizumab in rapidly evolving severe relapsing–remitting multiple sclerosis.

3.11 The manufacturer stated that almost all patients treated with teriflunomide reported at least 1 adverse event. However, for most of the events, the incidence was similar to placebo. Rates of discontinuation because of adverse events were higher for teriflunomide than for placebo. The manufacturer provided results from the base-case MTC (post-2000) and 'all years' MTC for all-cause discontinuations (see sections 3.7 and 3.8). Discontinuations because of adverse events were presented as academic in confidence and cannot be reported here. The base-case MTC (post-2000) and 'all years' MTC showed that there was no statistically significant difference in discontinuation because of adverse events between teriflunomide and Betaferon, Avonex, Rebif‑44 or glatiramer acetate. In addition, the manufacturer carried out a comparison of adverse events between teriflunomide and Rebif‑44, the results of which were provided as commercial in confidence and therefore cannot be presented here.

Cost effectiveness

3.12 The manufacturer submitted an economic model to evaluate the cost effectiveness of teriflunomide. In addition, it conducted a systematic literature review that identified 2 cost-effectiveness studies for relapsing–remitting multiple sclerosis to inform parameters used in the model.

3.13 The manufacturer's model used a multistate Markov approach. The model contained 20 health states that were defined by disability level (EDSS scores 0–9), and the type of multiple sclerosis (relapsing–remitting multiple sclerosis or secondary progressive multiple sclerosis). Patients with relapsing–remitting multiple sclerosis entered the model in relapsing–remitting multiple sclerosis states 0–7. In each cycle, patients could remain in the same state, progress to a worse state (patients could not regress to a better state), transfer to a secondary progressive multiple sclerosis health state, or die. Health states for secondary progressive multiple sclerosis were included to represent the clinical progression of relapsing–remitting multiple sclerosis. It was assumed that, when progressing from relapsing–remitting multiple sclerosis to a secondary progressive multiple sclerosis state, the patient's disease would also progress by 1 EDSS state. In addition, in each cycle patients could withdraw from treatment, stop treatment after reaching the EDSS limit for which a disease-modifying treatment is allowed (EDSS 6), or experience relapse and adverse events. The probability of death depended on the EDSS state, age and sex. The transition probabilities, discontinuation rates, relapse rates and adverse event rates throughout the model were based on data from the base-case MTC (post-2000) (treatment effect on progression, treatment effect on relapses, hospitalisation because of relapse, withdrawal, and adverse events), or taken from the literature (natural disease progression, demographic profile of patients entering the model, natural relapse rates, mortality). 
Treatment effects on disability and relapse were assumed to be constant over time, that is, there was no waning of treatment effect and, once patients stopped receiving treatment, they continued to benefit because they were at a better EDSS state than they would have been without the treatment, and the EDSS state determined disability, relapse and progression. The patients then followed the natural history of progression. In the base case, patients stopped treatment if their relapsing–remitting multiple sclerosis progressed to secondary progressive multiple sclerosis, or progressed to an EDSS state greater than 6. In the manufacturer's sensitivity analyses, treatment could be continued in secondary progressive multiple sclerosis but the treatment effect was reduced by 50% when the condition progressed to secondary progressive multiple sclerosis. It was assumed that withdrawal rates would not persist over the whole period of the model and therefore after 2 years the rate was estimated to decrease by 50% (based on clinical opinion). The cycle length was 1 year, and the time horizon was lifetime, assumed to be 50 years with a mean starting age of 39 years (based on the UK risk-sharing scheme cohort). The manufacturer stated that the analyses used an NHS and personal and social services perspective and applied a 3.5% discount rate on costs and health effects.
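The structure described in section 3.13 (annual cycles, a fixed time horizon and a 3.5% discount rate on costs and health effects) can be illustrated with a minimal cohort simulation. This is a sketch only and not the manufacturer's model. It uses 3 hypothetical states instead of the 20 EDSS-based states, every transition probability, utility and cost below is invented for illustration, and it assumes that the first cycle is undiscounted and that no half-cycle correction is applied.

```python
DISCOUNT_RATE = 0.035  # applied to costs and health effects (section 3.13)
HORIZON_YEARS = 50     # lifetime horizon assumed in the submission (section 3.13)


def run_cohort(start, transitions, utilities, costs,
               horizon=HORIZON_YEARS, rate=DISCOUNT_RATE):
    """Trace a cohort through annual cycles; return (discounted cost, discounted QALYs)."""
    n = len(start)
    for row in transitions:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each transition row must sum to 1")
    dist = list(start)
    total_cost = 0.0
    total_qalys = 0.0
    for year in range(horizon):
        # ASSUMPTION: year 0 is undiscounted; no half-cycle correction.
        factor = 1.0 / (1.0 + rate) ** year
        total_cost += factor * sum(p * c for p, c in zip(dist, costs))
        total_qalys += factor * sum(p * u for p, u in zip(dist, utilities))
        # Move the cohort forward by one annual cycle.
        dist = [sum(dist[i] * transitions[i][j] for i in range(n)) for j in range(n)]
    return total_cost, total_qalys


# HYPOTHETICAL states: mild disability, severe disability, dead.
# Rows are upper-triangular because patients cannot regress to a better state.
toy_transitions = [
    [0.90, 0.09, 0.01],
    [0.00, 0.97, 0.03],
    [0.00, 0.00, 1.00],
]
toy_cost, toy_qalys = run_cohort(
    start=[1.0, 0.0, 0.0],
    transitions=toy_transitions,
    utilities=[0.80, 0.40, 0.00],   # hypothetical utility values
    costs=[1000.0, 10000.0, 0.0],   # hypothetical annual costs in pounds
)
print(f"discounted cost: {toy_cost:.0f}, discounted QALYs: {toy_qalys:.2f}")
```

A treatment effect could be represented in such a sketch by scaling the progression probabilities of the treated cohort, and the incremental results would then be the differences between 2 runs.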

3.14 The manufacturer's base-case analyses compared teriflunomide with a blended comparator of Rebif‑22 (interferon beta‑1a [22 micrograms]), Rebif‑44, Avonex, Betaferon and glatiramer acetate. The blended comparator was calculated as the weighted average of the clinical efficacy and cost–utility inputs on the basis of UK market share data. The manufacturer also conducted a full incremental analysis, comparing teriflunomide with the individual treatments: glatiramer acetate, Rebif‑22, Rebif‑44, Avonex, Betaferon and aggregated Rebif. The possibility of receiving more than 1 treatment (treatment sequencing) was considered in scenario analyses (see section 3.19). The manufacturer provided separate analyses of teriflunomide compared with fingolimod and natalizumab for the people with relapsing–remitting multiple sclerosis, and for the subgroups with highly active relapsing–remitting multiple sclerosis and with rapidly evolving severe relapsing–remitting multiple sclerosis (see section 3.21).
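The blended comparator in section 3.14 is a market-share-weighted average of each input. The sketch below shows that calculation for a single input. The market shares and annual costs are hypothetical placeholders because the UK market share data used by the manufacturer are not given here, and the function name is illustrative.

```python
def blended_value(values, market_shares):
    """Weighted average of an input across treatments, weighted by market share."""
    total_share = sum(market_shares.values())
    if total_share <= 0:
        raise ValueError("market shares must sum to a positive number")
    weighted = sum(values[name] * share for name, share in market_shares.items())
    return weighted / total_share  # normalise in case shares do not sum to 1


# HYPOTHETICAL market shares and annual costs (pounds), for illustration only.
shares = {
    "Rebif-22": 0.10,
    "Rebif-44": 0.25,
    "Avonex": 0.25,
    "Betaferon": 0.15,
    "glatiramer acetate": 0.25,
}
annual_cost = {
    "Rebif-22": 8000.0,
    "Rebif-44": 10000.0,
    "Avonex": 8500.0,
    "Betaferon": 7000.0,
    "glatiramer acetate": 6500.0,
}
blended_cost = blended_value(annual_cost, shares)
print(f"blended annual cost: {blended_cost:.0f}")
```

The same function would be applied to each clinical efficacy and cost–utility input in turn.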

3.15 The model applied health state utility values to each of the EDSS states. The utility values in the manufacturer's model were taken from Orme et al. (2007), which was a UK survey of health-related quality of life (measured using EQ‑5D) in people with multiple sclerosis. The utility values ranged from 0.870 (EDSS 0) to a state valued as worse than death, −0.049 (EDSS 8) and −0.195 (EDSS 9), by the general population sample who provided values for the EQ‑5D. The secondary progressive multiple sclerosis health states were the values from the relapsing–remitting multiple sclerosis health states minus 0.045. The manufacturer collected EQ‑5D data in the TEMSO study but did not apply these data in the model on the basis that this study was an international study and may not be representative of the UK population. Disutility values were also applied to each EDSS state for relapse, caregiving and adverse events. The disutilities associated with relapse were estimated using a UK study (Orme et al.) and a US study (Prosser et al. 2003). The UK disutility value of relapse taken from Orme et al. was assumed to represent relapse without hospitalisation. The difference in utility seen between relapses with or without hospitalisation in the Prosser study was then used to estimate the disutility of relapse with hospitalisation (−0.0297, −0.0089 without hospitalisation). Disutility values taken from a study by Gani et al. (2008) were applied for caregivers and took into account the time spent caring for the patient (which was taken from Orme et al.). A different value was estimated for each EDSS state and ranged from 0 (EDSS 0) to −0.140 (EDSS 9). The disutility values for adverse events were taken from the published literature. 
A value was derived for each event and adjusted for time, according to the treatment, to estimate a treatment-specific annual disutility value; these included nausea (−0.0001), diarrhoea (−0.0004), hair thinning (−0.1140), fatigue (−0.0014), headache (−0.0002), immediate post-injection systemic reactions (−0.0001), arthralgia (−0.0034) and influenza-like symptoms (−0.0343 to −0.0114).
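The utility inputs in section 3.15 can be combined as in the sketch below. It uses the values quoted above (the 0.045 decrement for secondary progressive multiple sclerosis states and the 2 relapse disutilities). It assumes that disutilities are additive and that each relapse disutility is applied once per relapse in the year. The submission as summarised here does not state how the values were combined, so this is an illustration and not the manufacturer's method.

```python
SPMS_DECREMENT = 0.045             # SPMS state = RRMS state minus 0.045 (section 3.15)
RELAPSE_DISUTILITY = -0.0089       # relapse without hospitalisation (section 3.15)
RELAPSE_HOSP_DISUTILITY = -0.0297  # relapse with hospitalisation (section 3.15)


def annual_utility(rrms_utility, is_spms=False, relapses=0.0,
                   hospitalised_relapses=0.0, other_disutility=0.0):
    """Combine a health-state utility with disutilities (ASSUMED additive)."""
    utility = rrms_utility - (SPMS_DECREMENT if is_spms else 0.0)
    utility += relapses * RELAPSE_DISUTILITY
    utility += hospitalised_relapses * RELAPSE_HOSP_DISUTILITY
    utility += other_disutility  # for example caregiver or adverse event disutility
    return utility


# EDSS 0 and EDSS 9 utilities as quoted from Orme et al. in section 3.15.
edss0_spms = annual_utility(0.870, is_spms=True)
edss9_spms = annual_utility(-0.195, is_spms=True)
print(f"EDSS 0, SPMS: {edss0_spms:.3f}")
print(f"EDSS 9, SPMS: {edss9_spms:.3f}")
```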

3.16 The model used NHS reference costs and the Payment by Results tariff to estimate the costs of administration, monitoring and adverse effects associated with each treatment. The manufacturer assumed that teriflunomide was not associated with administration costs because it is an oral treatment. In addition, some costs were derived from the literature; health-state costs (including direct medical costs and direct non-medical costs) were derived from Tyas et al. (2007). These costs differed across the EDSS states and ranged from £336 (EDSS 0) to £19,704 (EDSS 9) for direct medical costs, and from £5335 (EDSS 0) to £20,811 (EDSS 8) and £12,915 (EDSS 9) for non-medical costs. The cost associated with relapse was sourced from Dee et al. (2012): £845 without hospitalisation and £6164 with hospitalisation. The resource use and costs applied in the model were validated by the manufacturer's clinical experts. Fingolimod is available to the NHS with a simple discount through a patient access scheme agreed with the Department of Health. However, the magnitude of this discount was not known by the manufacturer and therefore was not applied in the base-case analysis (but was explored in the sensitivity analysis, using a range of assumed discounts).

3.17 Teriflunomide dominated the blended comparator in the base case (incremental costs: −£5491; incremental quality-adjusted life years [QALYs]: 0.201), that is, it was less expensive and more effective. The cost-effectiveness acceptability curve provided by the manufacturer showed a 63% probability of teriflunomide being cost effective if the maximum acceptable incremental cost-effectiveness ratio (ICER) was £20,000 per QALY gained.

3.18 The manufacturer conducted one-way sensitivity analyses, which showed that the cost effectiveness of teriflunomide was most sensitive to the blended comparator hazard ratio for disability progression, the teriflunomide hazard ratio for disability progression, the blended comparator withdrawal rate, disease costs, the teriflunomide annual relapse rates, and the blended comparator annual relapse rates. For each of the analyses, teriflunomide continued to dominate the blended comparator, except when the hazard ratios for disability progression were varied. Teriflunomide was dominated by the blended comparator when the lower limit of the 95% confidence interval for the blended comparator disability progression hazard ratio was applied (that is, reducing the progression risk with the blended comparator). When applying the upper limit of the 95% confidence interval for the teriflunomide disability progression hazard ratio (that is, increasing the progression risk with teriflunomide), the ICER for teriflunomide compared with the blended comparator was £20,613 per QALY gained.

3.19 The manufacturer conducted scenario analyses that explored likely treatment sequences, based on clinical opinion. This analysis included a sequence of 2 treatments after teriflunomide or the blended comparator. Treatments that were included as second and third line were the blended comparator, fingolimod, natalizumab or best supportive care. As part of these analyses, the manufacturer applied 2 assumed patient access scheme prices for fingolimod (£11,000 and £13,000), as well as the list price. Teriflunomide dominated the blended comparator in all scenarios, irrespective of the size of patient access scheme discount for fingolimod. The manufacturer conducted further scenario analyses including using the 'all years' MTC for clinical data, using different sources of costs and utilities, and using the EDSS distribution, patient population and proportion of relapses from the clinical trials. Teriflunomide dominated the blended comparator for all scenarios.

3.20 The manufacturer also presented an incremental analysis in which teriflunomide was compared with the individual comparators (glatiramer acetate, Rebif‑22, Rebif‑44, Avonex and Betaferon). In the base case, teriflunomide dominated all the comparators. The manufacturer also conducted incremental analysis for the following scenarios: the 'all years' MTC data; the 'all years' MTC values without Bornstein et al. (1987; this study was excluded because it did not use EDSS); and the base-case MTC (post-2000) values including treatment in secondary progressive multiple sclerosis. Teriflunomide dominated each of the individual comparators for most of the scenarios, with the following exceptions: the 'all years' MTC (£86,866 per QALY gained for teriflunomide compared with glatiramer acetate [incremental costs: £3573; incremental QALYs: 0.041]); the 'all years' MTC without Bornstein et al. (£21,062 per QALY gained for teriflunomide compared with glatiramer acetate [incremental costs: £2641; incremental QALYs: 0.125], and £301,857 per QALY gained for Rebif‑22 compared with teriflunomide [incremental costs: £4130; incremental QALYs: 0.130]); and the base-case MTC (post-2000) with secondary progressive multiple sclerosis treatment (£105,604 per QALY gained for Rebif‑22 compared with teriflunomide [incremental costs: £11,709; incremental QALYs: 0.111]).

3.21 The manufacturer presented cost-effectiveness results for 2 subgroups: teriflunomide compared with fingolimod in the subgroup of people with highly active relapsing–remitting multiple sclerosis, and teriflunomide compared with natalizumab in the subgroup of people with rapidly evolving severe relapsing–remitting multiple sclerosis. For the first subgroup, teriflunomide dominated fingolimod when the fingolimod list price was used (incremental cost savings: £35,084; incremental QALYs: 0.746) and when it was assumed fingolimod cost £11,000 per year (incremental cost savings: £67,826; incremental QALYs: 0.725). For the second subgroup, teriflunomide was associated with an ICER of £63,107 (incremental cost savings: £30,133; incremental QALYs: −0.477) saved per QALY lost compared with natalizumab. The manufacturer stated that, because of limitations in the clinical data (see section 3.10), these analyses were not reliable.
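The cost-effectiveness results in sections 3.17 to 3.21 rest on incremental costs and incremental QALYs. The sketch below shows how such pairs are conventionally classified (dominant, dominated, cost per QALY gained, or saving per QALY lost) and how the ratio is calculated. The function and labels are illustrative. Because the inputs quoted in this document are rounded, the ratios it produces are close to, but not identical to, the published ICERs.

```python
def compare(delta_cost, delta_qalys):
    """Classify an intervention against a comparator; return (label, ratio or None)."""
    if delta_qalys == 0:
        raise ValueError("ratio undefined when incremental QALYs are zero")
    if delta_cost <= 0 and delta_qalys > 0:
        return ("dominant", None)       # no more expensive and more effective
    if delta_cost >= 0 and delta_qalys < 0:
        return ("dominated", None)      # no cheaper and less effective
    ratio = delta_cost / delta_qalys
    if delta_qalys > 0:
        return ("cost per QALY gained", ratio)
    return ("saving per QALY lost", ratio)  # cheaper but less effective


# Section 3.17: teriflunomide vs blended comparator, base case.
base_case = compare(delta_cost=-5491, delta_qalys=0.201)
# Section 3.20: teriflunomide vs glatiramer acetate, 'all years' MTC.
glatiramer_all_years = compare(delta_cost=3573, delta_qalys=0.041)
# Section 3.21: teriflunomide vs natalizumab, rapidly evolving severe subgroup.
natalizumab_subgroup = compare(delta_cost=-30133, delta_qalys=-0.477)

for name, (label, ratio) in {
    "base case": base_case,
    "glatiramer acetate, all years": glatiramer_all_years,
    "natalizumab subgroup": natalizumab_subgroup,
}.items():
    detail = "" if ratio is None else f" (about £{ratio:,.0f})"
    print(f"{name}: {label}{detail}")
```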

Evidence Review Group comments

3.22 The ERG reviewed the decision problem presented by the manufacturer, and commented that it was in line with the scope, except for the population. The ERG noted that secondary progressive multiple sclerosis and primary progressive multiple sclerosis populations were not presented in the manufacturer's submission because the marketing authorisation for teriflunomide was limited to relapsing–remitting multiple sclerosis.

3.23 The ERG considered the generalisability of the placebo-controlled clinical trials to UK clinical practice. It noted that although most of the patients in the trials had relapsing–remitting multiple sclerosis (at least 87%), the trials also included people with primary progressive multiple sclerosis and secondary progressive multiple sclerosis. The ERG noted that Study 2001 used the Poser rather than McDonald criteria to diagnose patients with multiple sclerosis, and stated that the McDonald criteria were more in keeping with current clinical practice. However, it concluded that overall, the differences were not large and that the trial populations can be considered generalisable to the UK population with active relapsing–remitting multiple sclerosis who would be receiving a disease-modifying therapy.

3.24 The ERG commented that all placebo-controlled clinical trials were short considering the generally long duration of multiple sclerosis and infrequency of relapses, and therefore may not adequately capture differences in relapse rates. Of particular note, Study 2001 lasted only 36 weeks. The ERG noted that the European Medicines Agency suggests that a trial duration of at least 2 years is needed to accurately assess relapses and disability progression. Furthermore, the ERG noted that quality-of-life and mortality data were limited to 2‑year follow-up and supplemented by longer-term extension studies, which were not placebo controlled and therefore did not account for the natural history of the disease.

3.25 The ERG noted that the TEMSO and TOWER trials reported 3‑month SAD and that the European Medicines Agency recommends the use of 6‑month SAD data. The ERG commented that 6‑month SAD would be preferable to 3‑month SAD because there remains a possibility of recovery from disability at 3 months. The ERG noted that the manufacturer provided evidence that a large proportion of patients in both groups of the trials did not have persistent disability (that is, their disability regressed). The ERG commented that meta-analysis of 6‑month SAD was not provided by the manufacturer.

3.26 The ERG commented that the random effects model chosen by the manufacturer for meta-analyses of the placebo-controlled trials may not have been appropriate because of the small number of studies (2 or 3 in each analysis). The ERG noted that there were some differences between Study 2001 and the phase III trials TEMSO and TOWER. It also noted that a higher proportion of patients in Study 2001 had received previous disease-modifying therapies compared with TEMSO and TOWER. It noted that Study 2001, as a proof of concept study, was small (61 patients per treatment arm) and just 36 weeks long, so assessment of relapse rates may not have been robust. Furthermore, it noted that EDSS scores were higher and more patients stopped treatment in the teriflunomide arm of Study 2001 than in the other trials. The ERG stated that these differences suggested that the studies were too heterogeneous to pool the results of Study 2001 with TOWER and TEMSO. It noted that Study 2001 was excluded from the 3‑month SAD meta-analysis because of the short duration of this trial. The ERG commented that it was questionable whether Study 2001 should have been included in the analyses for the other outcomes because of its short duration and the differences between the arms in previous treatment.

3.27 The ERG noted that the TENERE trial may not have been adequately powered to detect statistically significant differences in all investigated outcomes. It commented that because TENERE was not a double-blind trial, there may be bias in the evaluation of the primary outcome (which relies on patient-reported symptoms). The ERG also noted that there were some differences in patient baseline characteristics between the 2 treatment arms, which make the results of the trial difficult to interpret (details were provided as commercial in confidence and therefore cannot be presented here).

3.28 The ERG noted that the base-case MTC (post-2000) included all relevant comparators. Informal checks for consistency by the ERG did not identify major problems, but the ERG commented that comparison of Betaferon with placebo showed different 3‑month SAD results in the base-case MTC (post-2000) to those from the TENERE study. The ERG noted that the base-case MTC (post-2000) data showed that Betaferon was associated with a higher 3‑month SAD than placebo. The 3‑month SAD data from the TENERE study were provided as commercial in confidence and cannot be presented here. In addition, the ERG stated that the difference between the direct comparison and the base-case MTC (post-2000) was quite large for the effect of teriflunomide on 3‑month SAD compared with Rebif‑44. The ERG noted that this inconsistency may have contributed to the favourable results for teriflunomide compared with the beta interferons generated by the MTC (particularly for the base-case analysis). It also noted that the results of the 'all years' MTC were more consistent with the direct trial results. The ERG commented that the relative effect of teriflunomide on 3‑month SAD was a key driver in the economic model.

3.29 The ERG's major criticism of the manufacturer's MTC was that pre-2000 trials were excluded in the base-case analysis. It acknowledged the reasons given by the manufacturer (change in diagnostic criteria in 2000 from Poser to McDonald, and identification of patients earlier in the disease course). However, the ERG noted that the base-case MTC (post-2000) included 5 studies that had used the earlier Poser criteria. The ERG considered that a more appropriate approach would have been to conduct an 'all years' MTC with baseline relapse rate included as a covariate because it would have included all the trial data but would have accounted for any heterogeneity in baseline annualised relapse rates. The ERG noted that the impact of the 2000 cut-off date was that all but 1 of the placebo-controlled trials of beta interferons and glatiramer acetate were excluded. It commented that although the manufacturer's concerns about including older trials were justified to some extent, neither the base case nor the 'all years' analysis was optimal, and that omission of the placebo-controlled beta interferon trials from the base-case analysis reduced the reliability of the results.

3.30 The ERG reviewed the trials that were included in the MTC and noted that some were short in both the base-case and the 'all years' networks. For example, the network for the outcome of annualised relapse rate included 11 trials of less than or equal to 12 months' duration. The network for the outcome of 3‑month SAD included 3 trials of less than or equal to 12 months' duration. The ERG again commented that 12 months is a short duration for assessing infrequent events such as multiple sclerosis relapse or confirmed progression. However, the ERG did not re-run the MTC analyses after excluding these trials of shorter duration. It commented that it was unclear what impact this may have, especially considering outcomes such as relapse and SAD.

3.31 The ERG reviewed the evidence provided for the subgroups of people with highly active or rapidly evolving severe relapsing–remitting multiple sclerosis. It commented that these results were not reliable because of the very small number of patients in this subgroup from the TEMSO trial and the poor definition of these patients used in the TOWER trial. The ERG noted that the manufacturer's submission did not include a synthesis of adverse event data that could be readily checked against supporting tables. Furthermore, the relatively short duration of the placebo-controlled trials limited the assessment of any differences in mortality and less frequently reported adverse events. The ERG commented further that although a greater number of patients in the Rebif‑44 arm in the TENERE trial stopped treatment because of adverse events, this should be interpreted in the light of differences in baseline characteristics (the details of which were provided as commercial in confidence and therefore cannot be presented here). The ERG commented that the impact of this difference is unknown.

3.32 The ERG reviewed the manufacturer's economic model and systematic review. It commented that the manufacturer did a comprehensive, well-rounded systematic literature review and that the model was structurally similar to models used in previous NICE technology appraisals. During clarification, an error was identified in the manufacturer's model, which was corrected throughout the ERG analyses.

3.33 The ERG conducted some sensitivity analyses to determine the key areas of uncertainty in the manufacturer's model. It identified the following as having the most impact and conducted scenario analyses to explore them further:

  • the choice of comparator (see section 3.34)

  • the natural history and the rate of transition to secondary progressive multiple sclerosis (see section 3.35)

  • the rate of progression (see section 3.36)

  • the health-related quality of life associated with the more severe health states (see section 3.38).

3.34 The ERG regarded the use of a blended comparator in the base case of the manufacturer's model as inappropriate. The manufacturer calculated the blended comparator by using a weighted average of each individual treatment's outcomes as model inputs. The ERG considered this method inappropriate because costs and QALYs are correlated in the model, so the outcomes of the average treatment effects are not the same as the average outcomes of the treatments. To address this, the ERG weighted the costs and QALYs for each individual treatment, the results of which were provided as commercial in confidence by the manufacturer and therefore cannot be presented here. Overall, the ERG considered that the use of a blended comparator hides the effects of changes in the model because the different individual treatments may have different treatment effects compared with placebo.
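The distinction between averaging inputs and averaging outputs can be illustrated with a minimal sketch. The model below is a hypothetical toy, not the manufacturer's model: the function `toy_model`, the treatment shares, the hazard ratios and the annual costs are all illustrative assumptions chosen only to show that the two weighting approaches can give different totals when the model is non-linear.

```python
import math


def toy_model(hazard_ratio, annual_cost, baseline_hazard=0.10, years=20):
    """Hypothetical non-linear model returning (total cost, total QALYs).

    Each year, the proportion of patients still stable on treatment decays
    exponentially with the treated hazard; costs and QALYs both accrue in
    proportion to it, so the two outputs are correlated.
    """
    total_cost = 0.0
    total_qalys = 0.0
    for year in range(years):
        stable = math.exp(-baseline_hazard * hazard_ratio * year)
        total_qalys += stable
        total_cost += stable * annual_cost
    return total_cost, total_qalys


# Illustrative values only; none of these come from the appraisal.
TREATMENTS = [
    {"share": 0.5, "hazard_ratio": 0.6, "annual_cost": 8000.0},
    {"share": 0.5, "hazard_ratio": 1.0, "annual_cost": 6000.0},
]


def blend_inputs(treatments):
    """Average the inputs first, then run the model once."""
    hazard_ratio = sum(t["share"] * t["hazard_ratio"] for t in treatments)
    annual_cost = sum(t["share"] * t["annual_cost"] for t in treatments)
    return toy_model(hazard_ratio, annual_cost)


def blend_outputs(treatments):
    """Run the model per treatment, then weight the costs and QALYs."""
    total_cost = 0.0
    total_qalys = 0.0
    for t in treatments:
        cost, qalys = toy_model(t["hazard_ratio"], t["annual_cost"])
        total_cost += t["share"] * cost
        total_qalys += t["share"] * qalys
    return total_cost, total_qalys


cost_in, qalys_in = blend_inputs(TREATMENTS)
cost_out, qalys_out = blend_outputs(TREATMENTS)
print(f"inputs blended:  cost={cost_in:.0f}, QALYs={qalys_in:.3f}")
print(f"outputs blended: cost={cost_out:.0f}, QALYs={qalys_out:.3f}")
```

In this toy example the two approaches disagree on both costs and QALYs, which is the kind of discrepancy that weighting each treatment's costs and QALYs separately is intended to avoid.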

3.35 The ERG reviewed how disability progression was captured in the manufacturer's model. It noted that the model used the London Ontario data set (published in 1989) for predicting the initial distribution of EDSS and natural history progression of relapsing–remitting multiple sclerosis without treatment. The ERG stated that previous NICE technology appraisals have questioned the applicability of the London Ontario data set because of changes in multiple sclerosis care and because it did not collect data on patients whose condition improved to a better EDSS state over time. The ERG noted that a substantial proportion of patients in the TEMSO trial who experienced SAD later improved. It also considered that the initial EDSS states and transition probabilities were taken from a population with more severe disease than the population in which teriflunomide is expected to be used. The ERG therefore conducted analyses to explore each of the following:

  • using the initial EDSS distribution from the TEMSO and TOWER trials

  • using the TEMSO and TOWER data to estimate disability progression in 2 analyses: patients with active relapsing–remitting multiple sclerosis, and patients with secondary progressive multiple sclerosis

  • using alternative rates of conversion from relapsing–remitting multiple sclerosis to secondary progressive multiple sclerosis based on the London Ontario data set, calculated by the ERG.

    When the patient access scheme discount was applied, teriflunomide dominated Rebif and the blended comparator for each of these disability progression scenarios.

3.36 The ERG noted that the effect of treatment on disability progression was estimated from the manufacturer's base-case MTC (post-2000), and stated that these data were not robust because a large number of studies were excluded (by selecting only studies post-2000) and because of the heterogeneity across the included studies (see sections 3.28 to 3.30). The ERG highlighted the following concerns:

  • Betaferon was estimated to be less effective at slowing disability progression compared with best supportive care.

  • The estimate of 3‑month SAD for teriflunomide compared with Rebif‑44 from the base-case MTC (post-2000) appeared to be more favourable towards teriflunomide than the direct head-to-head evidence in TENERE.

  • The blended comparator masked treatment effects and subsequently favoured teriflunomide compared with each of the beta interferons individually.

3.37 The ERG conducted scenario analyses to explore the impact of different treatment effects. Firstly, it used the TENERE trial data, rather than MTC data, to estimate the relative treatment effect for teriflunomide compared with Rebif‑44. Secondly, it tested the assumption that there was no difference in treatment effect between teriflunomide and Rebif‑44. Finally, the 'all years' MTC data were used to estimate the relative treatment effect of teriflunomide. Applying the patient access scheme price in these exploratory analyses, teriflunomide dominated the blended comparator or Rebif‑44 in all scenarios.

3.38 The ERG commented on the utility values used in the model. It noted that the manufacturer's base case used values derived from a 2005 UK multiple sclerosis survey (Orme et al. 2007), which have been criticised in previous NICE technology appraisals because of the low response rates, selection bias, unrepresentative population and patient-reported level of severity. The ERG noted that the TEMSO trial had collected health-related quality-of-life data using EQ‑5D, although only for EDSS states 0–6. The ERG noted that the utility values from TEMSO were higher for all EDSS states than the estimates taken from Orme et al., which were the lowest values identified in the manufacturer's literature review. The ERG considered the utility values from TEMSO to be more applicable to the treatment population because TEMSO better reflected patients who are likely to receive teriflunomide as a treatment for active relapsing–remitting multiple sclerosis. The ERG therefore explored 4 scenarios using alternative utility values:

  • TEMSO data for EDSS states 0–6, and health-related quality-of-life data from Natalizumab for the treatment of adults with highly active relapsing–remitting multiple sclerosis (NICE technology appraisal guidance 127) for EDSS states 7–9

  • TEMSO data for EDSS states 0–6, and an average of 4 studies for states 7–9

  • an average of 4 studies used for all EDSS states

  • TEMSO data for EDSS states 0–6, and the differences between states 7–9 seen in Orme et al. used to calculate states 7–9 from the TEMSO data.

    Applying the patient access scheme price, teriflunomide dominated the blended comparator and Rebif‑44 in all scenarios. The ERG considered that the last scenario was the most representative of patients being treated with teriflunomide because it used utility data from the TEMSO trial (EDSS 0–6) for the baseline estimates of health-related quality of life and estimated utility differences between EDSS states 7–9 from a large UK-based survey (Orme et al.).

3.39 The ERG reviewed the costs included in the manufacturer's model. It commented that there was uncertainty surrounding which costs were included in some of the sources used, particularly the direct non-health costs for the EDSS states. Furthermore, it noted that one source of costs (Karampampa et al. 2012, used in sensitivity analyses) included informal care costs such as productivity losses of the working caregivers, and that these do not meet the NICE reference case. When the ERG investigated the impact of excluding non-health costs, teriflunomide still dominated the blended comparator and Rebif‑44 when the patient access scheme discount was included.

3.40 The ERG presented an exploratory analysis comprising all of its preferred parameters, as follows:

  • trial distribution of initial EDSS

  • trial estimates of natural history

  • ERG calculation of secondary progressive multiple sclerosis conversion

  • treatment effects from the 'all years' MTC

  • trial-based health-related quality-of-life data using the differences seen in the Orme et al. (2007) study to extrapolate the higher EDSS state values

  • exclusion of non-health costs.

    The resulting ICERs were similar to those in the manufacturer's base case, although the total QALYs were higher and the total costs lower for each intervention. The ERG noted that the increase in total QALYs was because the EDSS states were less severe at the start of treatment, the model allows for improvements in disability (EDSS), and because utility values were derived from the trials (for EDSS 0–6). The decrease in total costs was largely explained by the exclusion of non-health costs. The results of the ERG's probabilistic analysis suggested that teriflunomide is more effective and more costly than glatiramer acetate, resulting in an ICER of £107,148 per QALY gained. However, teriflunomide dominated Rebif‑44 and the blended comparator. Because of the uncertainty associated with the manufacturer's MTC, the ERG also presented its preferred analysis using the manufacturer's base-case MTC (post-2000), rather than the 'all years' MTC. As described in sections 3.28 to 3.30, by using the base-case MTC (post-2000) rather than direct trial results, Betaferon is less effective (in terms of 3‑month SAD) than placebo. In addition, the hazard ratios comparing teriflunomide with each of the comparators are lower in the base-case MTC (post-2000), and therefore more favourable to teriflunomide. The ERG's deterministic analysis resulted in an ICER of £6266 per QALY gained for teriflunomide compared with glatiramer acetate, and teriflunomide dominated all other comparators.
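The terms used in these results (an ICER, and one treatment "dominating" another) follow the standard pairwise comparison of incremental costs and incremental QALYs. A minimal sketch is given below; the function name `compare` and all numerical values are hypothetical and are not taken from the manufacturer's or the ERG's analyses.

```python
def compare(cost_new, qalys_new, cost_old, qalys_old):
    """Pairwise comparison of a new treatment against a comparator.

    Returns a (label, icer) tuple, where icer is None unless a ratio
    is meaningful.
    """
    d_cost = cost_new - cost_old
    d_qalys = qalys_new - qalys_old

    if d_cost == 0 and d_qalys == 0:
        return "equivalent", None
    if d_cost <= 0 and d_qalys >= 0:
        # No more costly and no less effective: the new treatment dominates.
        return "dominates", None
    if d_cost >= 0 and d_qalys <= 0:
        # No less costly and no more effective: the new treatment is dominated.
        return "dominated", None
    if d_cost > 0:
        # More costly and more effective: cost per QALY gained.
        return "icer", d_cost / d_qalys
    # Less costly and less effective: the ratio is savings per QALY lost,
    # and is interpreted differently from a cost per QALY gained.
    return "icer_less_costly_less_effective", d_cost / d_qalys


# Illustrative values only.
print(compare(12000.0, 5.0, 10000.0, 4.5))  # more costly, more effective
print(compare(9000.0, 5.0, 10000.0, 4.5))   # cheaper and more effective
print(compare(9000.0, 4.0, 10000.0, 4.5))   # cheaper but less effective
```

The last case is kept separate because a ratio calculated when a treatment is both cheaper and less effective cannot be read against a threshold in the same way as a cost per QALY gained.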

3.41 The ERG noted that the treatment became more cost effective as more patients stopped treatment (that is, higher withdrawal rates reduced the ICER), and suggested that this is counterintuitive. The ERG conducted exploratory analyses to test for logical consistency and external validation of the manufacturer's model. The ERG compared the change in QALYs, the change in costs and the ICERs, compared with treatment without a disease-modifying therapy, with those presented in previous NICE technology appraisals (Beta interferon and glatiramer acetate for the treatment of multiple sclerosis [NICE technology appraisal guidance 32], Natalizumab for the treatment of adults with highly active relapsing–remitting multiple sclerosis [NICE technology appraisal guidance 127], and Fingolimod for the treatment of highly active relapsing–remitting multiple sclerosis [NICE technology appraisal guidance 254]). It noted that the manufacturer's model estimated ICERs for the interferons and glatiramer acetate compared with treatment without disease-modifying therapy that were considerably higher than those presented for the UK risk-sharing scheme (Avonex: £175,918 per QALY gained; Betaferon: dominated by treatment without disease-modifying therapy; Rebif‑22: £82,098 per QALY gained; Rebif‑44: £79,310 per QALY gained; glatiramer acetate: £142,703 per QALY gained), and that the ICER for teriflunomide compared with treatment without disease-modifying therapy was substantially lower than these ICERs.

3.42 The ERG commented on the subgroup analyses that compared teriflunomide with fingolimod and natalizumab for highly active relapsing–remitting and rapidly evolving severe relapsing–remitting multiple sclerosis, respectively. It did not consider the subgroup analyses to be reliable because of the very small number of patients included in each of the teriflunomide groups, because the relative risks and hazard ratios were calculated from the specified subgroups combined with the natural history of the full relapsing–remitting multiple sclerosis population, because only results from the TEMSO trial were used to calculate the teriflunomide effects and because of the inadequate methodology used. The ERG also noted that, although the manufacturer stated that the patient population in the model was based on patients for whom beta interferons and glatiramer acetate were the appropriate comparators (that is, not in people with rapidly evolving severe or highly active relapsing–remitting multiple sclerosis), the manufacturer did not provide subgroup analyses that excluded people with rapidly evolving severe or highly active relapsing–remitting multiple sclerosis. The ERG used the manufacturer's corrected model, and assumed that teriflunomide in people with rapidly evolving severe or highly active relapsing–remitting multiple sclerosis has the same effectiveness as in the full active relapsing–remitting multiple sclerosis population, to calculate the ICERs for teriflunomide in the rapidly evolving severe and highly active relapsing–remitting multiple sclerosis populations (that is, compared with natalizumab and fingolimod respectively). The patient access scheme price for fingolimod was not applied. When the patient access scheme for teriflunomide was included, teriflunomide was associated with a lower cost than both natalizumab and fingolimod, but also with fewer QALYs.

Manufacturer's additional evidence

3.43 The manufacturer provided additional evidence, as requested in the appraisal consultation document, during the consultation. The manufacturer presented results of the 'all years' MTC, adjusted for baseline relapse rates. These data were similar to the 'all years' MTC (see section 3.8). These data compared the clinical effectiveness of teriflunomide 14 mg with all the disease-modifying therapies, including the interferons (Rebif‑44, Betaferon and Avonex), glatiramer acetate, fingolimod and natalizumab for the whole relapsing–remitting multiple sclerosis population.

  • For the annualised relapse rate, no statistically significant differences were seen between teriflunomide and Rebif‑44 (rate ratio 0.99, 95% CI 0.77 to 1.30), Betaferon (rate ratio 0.94, 95% CI 0.72 to 1.25), Avonex (rate ratio 0.78, 95% CI 0.61 to 1.01) or glatiramer acetate (rate ratio 0.99, 95% CI 0.79 to 1.24). Teriflunomide was associated with a statistically significantly higher relapse rate than fingolimod (rate ratio 1.41, 95% CI 1.14 to 1.77) and natalizumab (rate ratio 2.10, 95% CI 1.61 to 2.76).

  • The adjusted 'all years' MTC also showed no statistically significant difference in 3‑month SAD between teriflunomide and Rebif‑44 (HR 0.82, 95% CI 0.54 to 1.27), Betaferon (HR 0.59, 95% CI 0.34 to 1.03), Avonex (HR 0.72, 95% CI 0.46 to 1.16), glatiramer acetate (HR 0.76, 95% CI 0.49 to 1.20), fingolimod (HR 0.91, 95% CI 0.63 to 1.29) or natalizumab (HR 1.14, 95% CI 0.73 to 1.8).

  • The adjusted 'all years' MTC showed there was a statistically significantly greater rate of all-cause discontinuation with teriflunomide compared with Betaferon (odds ratio 1.93, 95% CI 1.19 to 3.04), glatiramer acetate (odds ratio 1.47, 95% CI 1.02 to 2.05) and fingolimod (odds ratio 1.52, 95% CI 1.01 to 2.17). There was no statistically significant difference in discontinuation rate between teriflunomide and Rebif‑44 (odds ratio 0.88, 95% CI 0.60 to 1.26), Avonex (odds ratio 1.18, 95% CI 0.74 to 1.84) or natalizumab (odds ratio 1.34, 95% CI 0.82 to 2.22).

    The manufacturer also presented data for the proportion of patients who were relapse-free. However, these data are marked as academic in confidence by the manufacturer and cannot be presented here.
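The statements about statistical significance above follow from whether each 95% confidence interval includes 1, the value of no difference for a rate ratio, hazard ratio or odds ratio. A minimal sketch of that check is shown below, using the annualised relapse rate ratios quoted in section 3.43; the function name `excludes_no_effect` is a hypothetical helper introduced only for illustration.

```python
def excludes_no_effect(lower, upper, null_value=1.0):
    """True when a confidence interval does not contain the null value."""
    return not (lower <= null_value <= upper)


# Rate ratios for teriflunomide versus each comparator, as quoted in
# section 3.43: (point estimate, lower 95% limit, upper 95% limit).
RELAPSE_RATE_RATIOS = {
    "Rebif-44": (0.99, 0.77, 1.30),
    "Betaferon": (0.94, 0.72, 1.25),
    "Avonex": (0.78, 0.61, 1.01),
    "glatiramer acetate": (0.99, 0.79, 1.24),
    "fingolimod": (1.41, 1.14, 1.77),
    "natalizumab": (2.10, 1.61, 2.76),
}

significant = {
    name
    for name, (_, lower, upper) in RELAPSE_RATE_RATIOS.items()
    if excludes_no_effect(lower, upper)
}
print(sorted(significant))
```

Only the intervals for fingolimod and natalizumab lie entirely above 1, which matches the comparisons reported as statistically significant for the annualised relapse rate.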

3.44 The manufacturer provided a revised cost-effectiveness base case as part of the additional evidence, which did all of the following:

  • used the 'all years' MTC adjusted for baseline relapse rates to estimate disease progression and withdrawal rates

  • used the natural history progression data from the placebo arms of the TOWER and TEMSO trials

  • used the baseline characteristics and initial EDSS distribution from the TOWER and TEMSO trials

  • used the ERG's amended calculation for SPMS conversion probabilities

  • excluded the direct non-medical costs

  • used the utilities seen in the TEMSO trial, using increments from the Orme et al. (2007) study for high EDSS states when trial data were not available

  • applied treatment waning whereby the treatment effect was 75% after 2 years and 50% after 5 years.

    The manufacturer presented a fully incremental analysis using the revised base case that showed teriflunomide dominated each of the beta interferons. Compared with glatiramer acetate, teriflunomide had an ICER of £13,234 per QALY gained.
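One way that the treatment waning assumption in the revised base case could be represented is sketched below. This is an illustration under stated assumptions, not the manufacturer's implementation: it assumes that the treatment effect is measured as the reduction in hazard (1 minus the hazard ratio), that the full effect applies before year 2, 75% applies from year 2 up to year 5, and 50% applies from year 5 onwards. The function names and the hazard ratio of 0.8 are hypothetical.

```python
def waning_multiplier(year):
    """Proportion of the full treatment effect retained in a given year.

    Assumed step function: 100% before year 2, 75% from year 2 up to
    year 5, and 50% from year 5 onwards. The exact boundaries are an
    assumption; the source states only '75% after 2 years and 50% after
    5 years'.
    """
    if year < 2:
        return 1.0
    if year < 5:
        return 0.75
    return 0.5


def waned_hazard_ratio(hazard_ratio, year):
    """Scale the hazard reduction (1 - HR) by the waning multiplier."""
    return 1.0 - waning_multiplier(year) * (1.0 - hazard_ratio)


# Illustrative hazard ratio only; not a value from the MTC.
for year in (0, 3, 10):
    print(year, round(waned_hazard_ratio(0.8, year), 3))
```

With a hypothetical hazard ratio of 0.8, the waned values move towards 1 (no effect) over time, so any benefit accrued in later model years is reduced relative to a model without waning.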

3.45 The additional evidence provided by the manufacturer also included sensitivity analyses for which the manufacturer presented pairwise comparisons with glatiramer acetate. In the revised base case (see section 3.44), treatment waning was included, and non-medical costs were excluded. The manufacturer presented sensitivity analyses to explore the impact of these. When treatment waning and non-medical costs were both excluded, the probabilistic ICER of teriflunomide compared with glatiramer acetate was £10,143 per QALY gained. When non-medical costs were included, teriflunomide dominated glatiramer acetate irrespective of whether treatment waning was or was not applied.

3.46 The manufacturer provided sensitivity analyses relating to treatment sequencing in the additional evidence. The manufacturer presented 7 different scenarios. The sequences that included teriflunomide dominated the sequences without teriflunomide in 5 of the 7 scenarios. These 5 sequences had teriflunomide replacing a line of treatment (for example, teriflunomide, fingolimod and best supportive care compared with Rebif‑44, fingolimod and best supportive care), or adding an additional treatment line (for example, teriflunomide, Rebif‑44 and fingolimod compared with Rebif‑44, fingolimod and best supportive care). The other 2 sequences included a comparison of teriflunomide, Rebif‑44 and glatiramer acetate with Rebif‑44, glatiramer acetate and best supportive care, which resulted in an ICER of £38,200 per QALY gained, and a comparison of Rebif‑44, teriflunomide and best supportive care with Rebif‑44, glatiramer acetate and best supportive care, which resulted in an ICER of £28,606 per QALY gained.

3.47 External validation of the model, using the parameters applied to the revised base case, was presented in the manufacturer's additional evidence. The resulting ICERs compared with best supportive care were: Avonex £210,570 per QALY gained, Betaferon £1,915,664 per QALY gained, Rebif‑22 £371,954 per QALY gained, Rebif‑44 £170,893 per QALY gained, and glatiramer acetate £98,785 per QALY gained. These were higher than those presented in NICE technology appraisal guidance 32, which were £48,085, £52,523, £58,817, £78,556 and £97,690 per QALY gained respectively. If treatment waning was excluded from the model (which was not included in NICE technology appraisal guidance 32), the corresponding ICERs for the manufacturer's base-case model, compared with best supportive care, were lower: Avonex £117,759 per QALY gained, Betaferon £131,825 per QALY gained, Rebif‑22 £65,486 per QALY gained, Rebif‑44 £79,027 per QALY gained, and glatiramer acetate £46,473 per QALY gained.

The ERG critique of the manufacturer's additional evidence

3.48 The ERG reviewed the additional evidence presented by the manufacturer and commented that the document submitted by the manufacturer largely reflected the amendments and corrections intended to address the Committee's considerations. Furthermore, the ERG noted that the meta-regression methods used to adjust the MTC for baseline relapse rates were acceptable. The ERG commented that the meta-regression carried out by the manufacturer resulted in a reduction in the effect size of Rebif‑44, Betaferon and glatiramer acetate for 3‑month SAD and discontinuations. The ERG commented that this was because the trials in the MTC with the largest baseline relapse rates were the placebo-controlled Rebif‑44, Betaferon and glatiramer acetate trials. The ERG noted that the adjusted 'all years' MTC had similar results to the 'all years' MTC and base-case (post-2000) MTC but that some of the point estimates favoured teriflunomide more than the base-case (post-2000) MTC.

3.49 The ERG reran the sensitivity analyses conducted by the manufacturer for treatment waning, inclusion of non-medical costs and treatment sequencing, and the results were similar to those presented by the manufacturer. The ERG explored the inclusion of some non-medical costs, using a cost midpoint from Karampampa et al. (2012). When these costs were applied, teriflunomide dominated the beta interferons in the probabilistic and deterministic analyses. In addition, teriflunomide dominated glatiramer acetate in the probabilistic analyses, and a deterministic ICER of £2729 per QALY gained was estimated for teriflunomide compared with glatiramer acetate.

3.50 The ERG conducted further exploratory analyses to show the key driver in the ICER difference between the ERG's re-estimation of the revised manufacturer's base case (£13,972 per QALY gained) and the ERG's previously preferred scenario (£107,148 per QALY gained), for teriflunomide compared with glatiramer acetate. The ERG noted that the only difference in the parameters applied to these analyses was the MTC (adjusted 'all years' MTC or 'all years' MTC, respectively), and inclusion of treatment waning. The ERG applied the disability progression rate from the 'all years' MTC rather than from the adjusted 'all years' MTC to the manufacturer's revised base case and this increased the ICER from £13,972 to £109,237 per QALY gained for teriflunomide compared with glatiramer acetate. The ERG applied the withdrawal rate from the 'all years' MTC rather than from the adjusted 'all years' MTC to the manufacturer's revised base case and this increased the ICER from £13,972 to £22,797 per QALY gained for teriflunomide compared with glatiramer acetate. The ERG explored the impact of having the same withdrawal rate for glatiramer acetate and teriflunomide in the manufacturer's revised base case and the estimated ICER was £32,971 per QALY gained for teriflunomide compared with glatiramer acetate.

3.51 The ERG noted that the treatment sequencing highlighted the difference in costs and effectiveness when teriflunomide is added to current treatment, rather than replacing an existing therapy. The ERG therefore compared teriflunomide with best supportive care to understand this impact. The ERG explored the impact of including or excluding treatment waning, and including, excluding or applying a midpoint for non-medical costs. The resulting ICERs were:

  • including treatment waning:

    • excluding non-medical costs, £64,032 per QALY gained

    • including non-medical costs, £50,743 per QALY gained

    • using a cost midpoint (see section 3.49), £50,602 per QALY gained.

  • excluding treatment waning:

    • excluding non-medical costs, £42,243 per QALY gained

    • including non-medical costs, £29,293 per QALY gained

    • using a cost midpoint (see section 3.49), £29,289 per QALY gained.

3.52 The ERG commented on the validation of the manufacturer's model. The ERG noted that the ICERs of the disease-modifying treatments compared with placebo presented by the manufacturer were higher than those in NICE technology appraisal guidance 32. The ERG commented that the manufacturer's model predicted slower progression but a lower health-related quality of life than seen from the UK risk-sharing scheme.

3.53 Full details of all the evidence are in the manufacturer's submission and the ERG report.
