4 Evidence and interpretation

The Appraisal Committee considered evidence from a number of sources.

4.1 Clinical effectiveness

4.1.1

The Assessment Group completed a systematic review of the efficacy of the technologies for the treatment of rheumatoid arthritis after the failure of a TNF inhibitor. Thirty-five studies met the criteria for inclusion. Five of these were randomised controlled trials (RCTs), one was a non-randomised controlled study, and 29 were included in the Assessment Report as uncontrolled studies (including two long-term RCT extensions). Three of the RCTs were subsequently excluded from the review by the Assessment Group because the studies included regimens outside of their marketing authorisation or the comparator was not considered relevant. A sixth RCT (the SUNRISE trial) was identified but not included in the systematic review because the trial was unpublished and data from the manufacturer were not provided in time for inclusion in the analyses. The two included RCTs compared rituximab with placebo (the REFLEX trial) and abatacept with placebo (the ATTAIN trial), with all people in these two RCTs also receiving ongoing methotrexate or other conventional DMARDs. This section summarises the outcomes for clinical effectiveness in terms of ACR 20, 50 and 70 response rate and improvement in DAS28 and/or HAQ score for the studies identified in the Assessment Group's systematic review.

Adalimumab

4.1.2

For adalimumab, the Assessment Group identified five uncontrolled studies with durations of follow-up ranging from 3 to 12 months. Four studies had small sample sizes ranging from 24 to 41. The fifth, a multicentre study, included 899 people. The Assessment Group did not pool the results because of substantial clinical and statistical heterogeneity between studies. Three studies reported ACR 20, 50 and 70 response rates ranging from 46% to 75%, 27% to 50% and 13% to 33% respectively. Four studies reported mean improvements in DAS28 score ranging from 1.30 to 1.90 when compared with pre-treatment values. Three studies reported mean improvements in HAQ ranging from 0.21 to 0.48 when compared with pre-treatment values. None of the studies assessed joint damage or quality of life.

Etanercept

4.1.3

For etanercept, the Assessment Group identified seven uncontrolled studies with durations of follow-up ranging from 3 months to over 9 months. Sample sizes ranged from 25 to 201. The results were not pooled because of substantial clinical and statistical heterogeneity between studies. Four studies reported ACR 20, 50 and 70 response rates ranging from 38% to 72%, 18% to 21% and 8% to 20% respectively. Four studies reported mean improvements in DAS28 score ranging from 0.47 to 1.80 when compared with pre-treatment values. Three studies reported mean improvements in HAQ score ranging from zero to 0.45 when compared with pre-treatment values. None of the studies assessed joint damage or quality of life.

Infliximab

4.1.4

For infliximab, the Assessment Group identified three uncontrolled studies, each with a small sample size ranging from 20 to 24. The Assessment Group could not determine the duration of follow-up in the studies. None of the studies reported ACR response criteria or quantitative results of changes in DAS28 and HAQ scores, or assessed joint damage or quality of life.

TNF inhibitors as a group

4.1.5

Eight studies were identified. None provided separate outcome data for individual TNF inhibitors. One non-randomised controlled study compared TNF inhibitors with rituximab (see section 4.1.10) and seven studies had no control group. The duration of follow-up in the studies ranged from 3 months to 4 years and sample sizes ranged from 70 to 818. One study reported ACR 20, 50 and 70 response rates of 46%, 26% and 7% respectively. A different study reported a mean improvement in HAQ score of 0.11 when compared with pre-treatment values. Three studies reported mean improvements in DAS28 score ranging from 0.88 to 1.00 when compared with pre-treatment values. No studies assessed joint damage or quality of life.

Rituximab

4.1.6

The Assessment Group identified one RCT (REFLEX, n=517), as well as the trial's long-term extension. The REFLEX trial compared rituximab with placebo (with ongoing methotrexate in both groups) in people who had had an inadequate response to one or more TNF inhibitors. The primary outcome in the REFLEX trial was ACR 20 response rate at 6 months. The REFLEX trial reported statistically significant differences favouring the rituximab group for ACR 20, 50 and 70 response rates. For the rituximab group the values were 51%, 27% and 12%, and for the placebo group these were 18%, 5% and 1% respectively (for all comparisons p<0.0001). A statistically significant difference in mean improvement in DAS28 was reported (1.9 in the rituximab group compared with 0.4 in the placebo group, p<0.0001). A statistically significant difference was also reported for mean improvement in HAQ score (0.40 in the rituximab group compared with 0.10 in the placebo group, p<0.0001). The long-term observational extension of the REFLEX trial included people who had received up to three courses of rituximab. This reported that people receiving further courses of rituximab had ACR responses comparable to those of people who received rituximab in the randomised phase of the RCT.

4.1.7

The SUNRISE trial (n=559) compared the safety and efficacy of one course of rituximab with two courses of rituximab. This trial was not included in the Assessment Group's analysis, but the results were later submitted by consultees. The study reported that of the people who had been randomised to a second course of rituximab at week 24, 54% had an ACR 20 at 48 weeks, whereas of those who had been randomised to placebo, 45% had an ACR 20 at 48 weeks (p=0.02). A statistically significant difference in mean improvement in DAS28 was also reported (1.9 in the rituximab retreatment group compared with 1.5 in the placebo retreatment group, p<0.01).

4.1.8

In addition to the REFLEX trial, the Assessment Group identified one non-randomised controlled study comparing TNF inhibitors with rituximab (see section 4.1.10), five uncontrolled studies and a pooled analysis combining data from the REFLEX trial, its long-term extension and other studies. Durations of follow-up in the uncontrolled studies ranged from 6 months to 1 year and sample sizes ranged from 20 to 158. The Assessment Group could not determine how many people the pooled analysis included. One study reported ACR 20, 50 and 70 response rates of 65%, 33% and 12% respectively. Another study reported a mean improvement in DAS of 1.61 when compared with pre-treatment values. None of the uncontrolled studies assessed improvement in HAQ score, joint damage or quality of life. The pooled analysis included people who had had up to five courses of rituximab treatment; this study showed clinical outcomes similar to those seen for rituximab in the REFLEX trial.

Abatacept

4.1.9

The Assessment Group identified one RCT (ATTAIN) and its long-term extension. The ATTAIN trial compared abatacept with placebo (with ongoing DMARDs in both groups) in people with an inadequate response to one or more TNF inhibitors. The co-primary outcomes in the ATTAIN trial were ACR 20 response rate and change in HAQ score at 6 months. The ATTAIN trial reported statistically significant differences in response rates for abatacept compared with placebo. The values were 50% compared with 20% (p<0.001) for ACR 20, 20% compared with 4% (p<0.001) for ACR 50, and 10% compared with 2% (p<0.01) for ACR 70. The ATTAIN trial also reported a statistically significant improvement in mean DAS28 score favouring the abatacept group (1.98 compared with 0.71, p<0.001). In addition, the ATTAIN trial reported a statistically significant difference in mean improvement in the HAQ score favouring the abatacept group (0.45 compared with 0.11, p<0.001). The long-term extension of the ATTAIN trial followed people for up to 5 years. This analysis showed that among people continuing on abatacept, clinical outcomes in terms of ACR response rates were comparable to those seen for abatacept in the randomised phase of the RCT. Further data were provided from a large prospective uncontrolled study (ARRIVE, n=1046); this study reported an improvement in DAS28 of 2.00 when compared with pre-treatment values.

Comparative effectiveness

4.1.10

The Assessment Group did not identify any randomised controlled trials directly comparing the five technologies, or trials comparing the technologies with other biological DMARDs or conventional DMARDs not previously tried by study participants. One non-randomised controlled study (n=318) compared switching from a TNF inhibitor to rituximab with switching to an alternative TNF inhibitor. When a switch occurred because the first TNF inhibitor was not effective, the mean change in DAS28 score was reported to be significantly higher in the rituximab group (mean change -1.34) compared with the group who received an alternative TNF inhibitor (mean change -0.93; p=0.03).

4.1.11

The Assessment Group conducted an adjusted indirect comparison of rituximab and abatacept using data from placebo-controlled trials that included similar populations. The analysis suggested no statistically significant differences in response rates between abatacept and rituximab for ACR 20 (relative risk 1.12, 95% confidence interval [CI] 0.68 to 1.84), ACR 50 (relative risk 1.00, 95% CI 0.33 to 2.98) and ACR 70 (relative risk 1.80, 95% CI 0.24 to 13.35).
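The adjusted indirect comparison described above can be illustrated with a minimal sketch of the Bucher method, which is one common way of performing such a comparison; this section does not state which method the Assessment Group used. The function name, the direction of the comparison (rituximab relative to abatacept) and the use of the rounded ACR 20 response rates from sections 4.1.6 and 4.1.9 are assumptions made for illustration.

```python
import math

def indirect_relative_risk(rr_a, rr_b, se_log_rr_a=None, se_log_rr_b=None, z=1.96):
    """Indirect relative risk of A versus B through a common comparator.

    rr_a, rr_b: relative risks of A and of B, each versus placebo.
    se_log_rr_a, se_log_rr_b: standard errors of the log relative risks
    (optional; only needed for the confidence interval).
    """
    log_rr = math.log(rr_a) - math.log(rr_b)
    if se_log_rr_a is None or se_log_rr_b is None:
        return math.exp(log_rr), None
    # The two trials are independent, so the variances of the log estimates add.
    se = math.sqrt(se_log_rr_a ** 2 + se_log_rr_b ** 2)
    interval = (math.exp(log_rr - z * se), math.exp(log_rr + z * se))
    return math.exp(log_rr), interval

# Rounded ACR 20 response rates from the text (active arm / placebo arm).
rr_rituximab = 0.51 / 0.18  # REFLEX
rr_abatacept = 0.50 / 0.20  # ATTAIN
estimate, _ = indirect_relative_risk(rr_rituximab, rr_abatacept)
print(round(estimate, 2))
```

With these rounded inputs the point estimate is about 1.13, close to but not identical to the reported 1.12. The confidence interval cannot be recomputed from this section because the numbers of people in each trial arm are not given here.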

Subgroup analyses

4.1.12

The Assessment Group identified evidence for adalimumab, etanercept and abatacept that compared response to treatment according to the reason for withdrawal of the first TNF inhibitor. The evidence compared primary non-responses (when the disease had never responded to treatment with a TNF inhibitor) with secondary non-responses (when there was a reduction in disease response to the first TNF inhibitor). The majority of analyses suggested no statistically significant differences in response according to reason for withdrawal of the first TNF inhibitor. However, two analyses of adalimumab showed statistically significantly higher response rates for ACR 20 (pooled estimate risk difference -0.20, 95% CI -0.37 to -0.02) and ACR 50 (pooled estimate risk difference -0.12, 95% CI -0.20 to -0.04) when the first TNF inhibitor had been withdrawn because of a secondary non-response. Data for abatacept from the ATTAIN long-term extension study also suggested that in a non-intention-to-treat analysis of people who had a change in HAQ greater than 0.3 at 6 months, the proportion whose disease responded was higher among people who had experienced a secondary non-response than among those who experienced a primary non-response (risk difference -0.15, 95% CI -0.28 to -0.02). However, registry data for the TNF inhibitors as a group suggested that, when measured as EULAR response rates at 3 months, and assuming missing observations represented non-responders, there were statistically significantly higher response rates among people who stopped their first TNF inhibitor because of primary non-response (risk difference 0.30, 95% CI 0.13 to 0.46). No specific data for infliximab and rituximab were identified.

4.1.13

Evidence for the influence of the presence of auto-antibodies (that is, rheumatoid factor and anti-CCP antibody status) on effectiveness was available only for rituximab, from the REFLEX trial. The trial reported no statistically significant differences in relative treatment effect by rheumatoid factor status. However, absolute response rates were lower in both the rituximab and the placebo groups for people who were rheumatoid factor-negative than those who were rheumatoid factor-positive. When participants were stratified according to both rheumatoid factor and anti-CCP antibody status, the data suggested a greater treatment response in people who were either rheumatoid factor-positive or anti-CCP antibody-positive than in those who were negative for both rheumatoid factor and anti-CCP. The Assessment Group noted that this retrospective analysis should be treated with caution.

4.2 Cost effectiveness

Published literature

4.2.1

The Assessment Group identified four published economic analyses – two of rituximab and two of abatacept – that met the criteria for inclusion in the systematic review. All four used a decision analytic model. Three of the studies carried out a cost–utility analysis and reported results in terms of costs per quality-adjusted life years (QALYs) gained. The remaining study (of abatacept) reported results in terms of costs per additional case of 'low disease activity state' gained (DAS28 up to 3.2) and costs per additional remission gained (DAS28 less than 2.6). The Assessment Group could not perform a direct comparison of the incremental cost-effectiveness ratios (ICERs) from these studies because of different specifications of the models, including treatment sequence, time horizon, perspective and country of origin.

Manufacturers' submissions

4.2.2

All five manufacturers provided economic analyses to support their submissions. One analysis (etanercept, Pfizer) was provided only as a narrative summary and not as a fully executable model. All submissions were based on cost–utility analyses run over a lifetime horizon and from the perspective of the healthcare provider. All but one submission (abatacept, Bristol-Myers Squibb) used conventional DMARDs as the base-case comparator.

Abbott Laboratories (adalimumab)

4.2.3

Abbott Laboratories developed a discrete-event simulation model and performed a cost–utility analysis of adalimumab, etanercept, infliximab, rituximab and abatacept (all of which were considered in combination with methotrexate), each in comparison with conventional DMARDs. The model also compared adalimumab with conventional DMARDs only, and the sequence of adalimumab followed by rituximab compared with each of the remaining biological DMARDs. The model simulated people with profiles based on the baseline characteristics of people in the British Society for Rheumatology Biologics Register. The model included adverse events. It also included a continuation rule using ACR 50 response to determine whether a person continued therapy after an initial trial period.

4.2.4

Response to treatment was based on ACR response rates mapped to a change in HAQ score. The manufacturer derived the ACR response rates from a mixed treatment comparison of 34 studies and assumed that the responses were equal across all TNF inhibitors (that is, that the response to adalimumab was not different from other TNF inhibitors). To map ACR to HAQ, the change in HAQ score associated with each ACR response category was calculated from adalimumab clinical trial data in which both HAQ and ACR were measured. The model assumed that when people discontinued treatment with a TNF inhibitor, the initial effect of treatment was lost. The model assumed that disease progressed while on treatment, at a constant annual rate that was higher on conventional DMARDs than on biological DMARDs. This was modelled in terms of an annual increase in HAQ score of 0.030 for biological DMARDs, 0.045 for conventional DMARDs and 0.060 for rescue therapy. To convert HAQ scores to EuroQol (EQ-5D) scores, the manufacturer used a non-linear mapping mechanism estimated using EQ-5D data collected in trials of tocilizumab (another biological DMARD).
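As a rough illustration of the constant-rate progression assumption, the sketch below projects a HAQ score forward using the annual increases quoted above. The function name, the starting score of 1.5 and the capping of scores to the 0 to 3 range of the HAQ scale are assumptions for illustration, not details taken from the manufacturer's model.

```python
# Annual HAQ increases quoted in the text for the manufacturer's model.
ANNUAL_HAQ_INCREASE = {
    "biological DMARD": 0.030,
    "conventional DMARD": 0.045,
    "rescue therapy": 0.060,
}

def project_haq(start_haq, treatment, years):
    """HAQ score after `years` on `treatment`, assuming linear progression."""
    haq = start_haq + ANNUAL_HAQ_INCREASE[treatment] * years
    # Assumption: keep the score within the 0 to 3 range of the HAQ scale.
    return min(3.0, max(0.0, haq))

for treatment in ANNUAL_HAQ_INCREASE:
    print(treatment, round(project_haq(1.5, treatment, 10), 2))
```

Over 10 years from the hypothetical starting score, the three rates give scores of 1.8, 1.95 and 2.1, which shows how a small difference in annual rate accumulates over a lifetime horizon.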

4.2.5

Costs included drug acquisition, administration, monitoring and hospitalisation (including joint replacement surgery). The cost of administering intravenous drugs was estimated to be £462 for each infusion, based on the Healthcare Resource Group 2007/08 tariff. The manufacturer of adalimumab assumed that administering subcutaneous drugs would require 3 hours of nurse training incurring a one-off cost of £129. The manufacturer assumed retreatment with rituximab would occur every 9 months. This assumption was tested in a sensitivity analysis using a retreatment schedule of every 6 months.
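The effect of the retreatment interval on administration costs can be sketched as follows. The £462 cost per infusion comes from the text; the number of infusions per course of rituximab is not stated in this section, so the value of 2 used here is an assumption, as are the function and parameter names.

```python
def annual_infusion_cost(interval_months, cost_per_infusion, infusions_per_course):
    """Average yearly administration cost for a drug given in repeated courses."""
    courses_per_year = 12 / interval_months
    return courses_per_year * infusions_per_course * cost_per_infusion

COST_PER_INFUSION = 462    # pounds, from the manufacturer's submission
INFUSIONS_PER_COURSE = 2   # assumption, not stated in this section

base_case = annual_infusion_cost(9, COST_PER_INFUSION, INFUSIONS_PER_COURSE)
sensitivity = annual_infusion_cost(6, COST_PER_INFUSION, INFUSIONS_PER_COURSE)
print(round(base_case), round(sensitivity))
```

Under these assumptions, shortening the retreatment interval from 9 to 6 months raises the yearly administration cost by half. Drug acquisition costs would scale in the same way but are not modelled here.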

4.2.6

The base-case analysis showed that for rituximab in comparison with conventional DMARDs, the incremental QALY gain was 1.375 at an incremental cost of £15,100, giving an ICER of £10,986 per QALY gained. When it was assumed that treatment with rituximab was repeated every 6 months instead of every 9 months, the ICER increased from £10,986 per QALY gained in the base case to £15,856 per QALY gained. The base-case analysis showed that for adalimumab or etanercept in comparison with conventional DMARDs, the QALY gain was 1.467 at an incremental cost of £23,423, giving an ICER of £15,962 per QALY gained. For infliximab and abatacept (each in comparison with conventional DMARDs), the QALY gains were 1.451 and 1.136 respectively. The incremental costs were £31,241 and £34,188 respectively, giving ICERs of £21,529 and £30,104 per QALY gained. From the data presented, the ICER for adalimumab or etanercept compared with rituximab, given every 9 months, could be calculated to be £83,230 per QALY gained. Infliximab was dominated by adalimumab and etanercept, and abatacept was dominated by rituximab. From the data presented, the ICER for adalimumab or etanercept in comparison with rituximab given every 6 months could be calculated to be £16,280 per QALY gained. Infliximab was dominated by adalimumab and etanercept, and abatacept was dominated by rituximab. Univariate sensitivity analyses (altering one variable at a time) suggested that the model was most sensitive to the HAQ score at the start of treatment, change in HAQ score while on treatment (that is, underlying disease progression), assumptions regarding the loss or maintenance of initial treatment effect upon stopping treatment, the way in which HAQ was mapped to EQ-5D and the dosing schedule of rituximab.
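The ICERs quoted above follow from dividing incremental cost by incremental QALY gain. The short sketch below recomputes the comparisons with conventional DMARDs from the figures in the text; because the published inputs are rounded, the results differ slightly from the manufacturer's reported ICERs. The function and variable names are illustrative.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio in pounds per QALY gained."""
    if incremental_qalys <= 0:
        raise ValueError("this sketch only handles a positive QALY gain")
    return incremental_cost / incremental_qalys

# (incremental cost in pounds, incremental QALYs), each versus conventional DMARDs.
versus_conventional_dmards = {
    "rituximab": (15_100, 1.375),
    "adalimumab or etanercept": (23_423, 1.467),
    "infliximab": (31_241, 1.451),
    "abatacept": (34_188, 1.136),
}

for drug, (cost, qalys) in versus_conventional_dmards.items():
    print(f"{drug}: £{icer(cost, qalys):,.0f} per QALY gained")
```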

Pfizer (etanercept)

4.2.7

Pfizer presented the results of a Markov model with a 6-month cycle. The model compared three strategies: treatment with two sequential TNF inhibitors, treatment with a TNF inhibitor followed by a conventional DMARD and treatment with a TNF inhibitor followed by rituximab. The manufacturer did not include abatacept in the economic analyses. Baseline patient characteristics reflected those of the patients in the TEMPO trial (an RCT of etanercept in people whose rheumatoid arthritis had had an inadequate response to conventional DMARDs). The model included serious adverse events. The base-case model included a continuation rule using HAQ score mapped from DAS28 to determine whether a person continued therapy after an initial trial period.

4.2.8

The manufacturer defined response to treatment in terms of mean change in HAQ score in people treated with a second TNF inhibitor after primary non-response, secondary loss of response or intolerance of their first TNF inhibitor. The data were taken from a trial of adalimumab; the values used were -0.44, -0.51 and -0.55 respectively. The manufacturer estimated the mean changes in HAQ score for those treated with rituximab (-0.40) from the REFLEX trial; all were unadjusted estimates of absolute treatment effect observed in the trial. The effect of conventional DMARDs following failure of a TNF inhibitor was assumed to be zero, based on data from the British Society for Rheumatology Biologics Register. The model included underlying disease progression while on treatment, modelled in terms of worsening HAQ score over time. The manufacturer assumed that while on treatment with biological DMARDs the HAQ score remained unchanged (that is, disease did not progress/worsen) but that while on conventional DMARDs the HAQ score changed at a rate of 0.075 per 6-month cycle from the first 6 months up to 3 years, and then 0.10 per 6-month cycle from year 3 onwards. The manufacturer used a linear mapping mechanism to convert HAQ scores to EQ-5D scores during each model cycle.

4.2.9

Costs included drug acquisition and administration, and costs associated with hospitalisation, outpatient visits, primary care visits, investigations and monitoring. The cost of administration was unclear. Rituximab was assumed to be provided once every 6 months.

4.2.10

The manufacturer presented the base-case results for a range of assumptions regarding changes in HAQ score for both TNF inhibitors and conventional DMARDs. Total differences in costs and QALYs were presented only for people who had switched from one TNF inhibitor to another because of an adverse event. The ICER for TNF inhibitors compared with conventional DMARDs was £15,294 per QALY gained when switching as a result of primary non-response and £14,501 per QALY gained when switching as a result of secondary loss of response. The ICER for TNF inhibitors compared with rituximab was £19,077 per QALY gained and £16,225 per QALY gained following switching for primary non-response and secondary loss of response respectively. The manufacturer presented no probabilistic sensitivity analysis results in the submission.

Schering-Plough (infliximab)

4.2.11

Schering-Plough developed a patient-level simulation model and performed a cost–utility analysis of infliximab compared with adalimumab, etanercept, rituximab, and abatacept, all of which were considered in combination with methotrexate. Comparisons were also made with conventional DMARDs only, and of the TNF inhibitors followed by rituximab. The model simulated people based on the characteristics at baseline of participants in the GO-AFTER trial (a trial of the TNF inhibitor golimumab in people with an inadequate response to a previous TNF inhibitor). The model did not include adverse events. The base-case model included a continuation rule using EULAR response to determine whether a person continued therapy after an initial trial period.

4.2.12

The manufacturer determined response to treatment by a multi-step process. First, baseline EULAR response rates from the British Society for Rheumatology Biologics Register were converted to baseline ACR response rates using an algorithm derived from the GO-AFTER trial. Treatment-specific ACR response rates were generated from the results of a mixed treatment comparison of RCTs of biological DMARDs. These ACR response rates were then converted back to EULAR response rates using the GO-AFTER algorithm. The EULAR response categories were then mapped to EQ-5D using algorithms derived from British Society for Rheumatology Biologics Register data that indirectly calculated EQ-5D from HAQ. In addition to the initial response to treatment, the model included underlying disease progression while on treatment, modelled using change in HAQ score over time. The manufacturer assumed that people treated with biological DMARDs experienced no progression in their disease whereas the condition deteriorated at a rate of 0.042 per year in people treated with conventional DMARDs.

4.2.13

Costs included drug acquisition and administration, monitoring and hospitalisation. It was assumed that in 63% of cases sharing of vials resulted in no wastage of unused infliximab. The cost of administering infused drugs was assumed to be £162.12. This was based on the cost given in the assessment report for NICE's technology appraisal guidance on adalimumab, etanercept and infliximab for the treatment of rheumatoid arthritis [2007], adjusted for inflation. That guidance has since been replaced by NICE's technology appraisal guidance on adalimumab, etanercept, infliximab, certolizumab pegol, golimumab, tocilizumab and abatacept for rheumatoid arthritis not previously treated with DMARDs or after conventional DMARDs only have failed. The manufacturer presented two analyses for rituximab: one assuming retreatment every 6 months and the other every 9 months.

4.2.14

The base-case analysis showed that for rituximab in comparison with conventional DMARDs, the incremental QALY gain was 0.65 at an incremental cost of £11,325, giving an ICER of £17,422 per QALY gained (assuming 9-month retreatment intervals). Assuming retreatment with rituximab every 6 months, the ICER for rituximab compared with conventional DMARDs was £27,161 per QALY gained. The QALY gains for adalimumab, etanercept and infliximab were 0.66, 0.62 and 0.65 respectively. The respective incremental costs were £23,129, £22,257 and £18,628, giving ICERs of £35,138, £35,898 and £28,661 per QALY gained compared with conventional DMARDs. The strategy comparing abatacept with conventional DMARDs resulted in an incremental QALY gain of 0.63 for an incremental cost of £28,205, producing an ICER of £44,795 per QALY gained. From the data presented, the ICER for adalimumab in comparison with rituximab given every 9 months could be calculated to be £1,186,230 per QALY gained. Etanercept and abatacept were dominated by rituximab, and infliximab was extendedly dominated by rituximab. When rituximab was assumed to be given every 6 months, the ICER for adalimumab compared with rituximab could be calculated to be £553,700 per QALY gained. Etanercept and abatacept were dominated by rituximab, and infliximab was extendedly dominated by rituximab.
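The dominance reasoning above can be sketched as a simple cost-effectiveness frontier. The costs and QALYs are the incremental figures from the text for the 9-month retreatment case, each expressed relative to conventional DMARDs (so conventional DMARDs sit at zero cost and zero QALYs). The function name and the tie-handling rule (a strategy with equal QALYs but higher cost is dropped) are assumptions. Because the published figures are rounded, the sketch cannot distinguish dominance from extended dominance for infliximab, so it reports only which strategies remain on the frontier.

```python
def efficiency_frontier(strategies):
    """Return [(name, ICER versus previous frontier strategy)], cheapest first."""
    ordered = sorted(strategies.items(), key=lambda kv: (kv[1][0], -kv[1][1]))
    # Step 1: drop strategies that cost more without adding QALYs.
    kept = []
    for name, (cost, qalys) in ordered:
        if not kept or qalys > kept[-1][2]:
            kept.append((name, cost, qalys))
    # Step 2: drop strategies subject to extended dominance, so that
    # ICERs increase along the frontier.
    frontier = []
    for item in kept:
        frontier.append(item)
        while len(frontier) >= 3:
            a, b, c = frontier[-3:]
            if (b[1] - a[1]) / (b[2] - a[2]) > (c[1] - b[1]) / (c[2] - b[2]):
                del frontier[-2]
            else:
                break
    result = [(frontier[0][0], None)]
    for prev, cur in zip(frontier, frontier[1:]):
        result.append((cur[0], (cur[1] - prev[1]) / (cur[2] - prev[2])))
    return result

strategies = {
    "conventional DMARDs": (0, 0.0),
    "rituximab": (11_325, 0.65),
    "adalimumab": (23_129, 0.66),
    "etanercept": (22_257, 0.62),
    "infliximab": (18_628, 0.65),
    "abatacept": (28_205, 0.63),
}
for name, ratio in efficiency_frontier(strategies):
    print(name, "reference" if ratio is None else f"£{ratio:,.0f} per QALY gained")
```

With these inputs the frontier consists of conventional DMARDs, rituximab and adalimumab, which matches the text. The adalimumab ICER comes out at roughly £1.18 million rather than the reported £1,186,230, again because the inputs are rounded.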

Roche Products (rituximab)

4.2.15

Roche Products developed a patient-level simulation model and performed a cost–utility analysis of rituximab compared with adalimumab, etanercept, infliximab and abatacept, all of which were followed by a sequence of conventional DMARDs. The manufacturer also compared rituximab with conventional DMARDs alone. The model simulated people whose profiles were based on baseline characteristics of participants in the REFLEX trial. The manufacturer did not include adverse events.

4.2.16

Response to treatment was defined in terms of ACR response rates mapped to a change in HAQ score. ACR response rates were derived from two sources: a mixed treatment comparison of RCTs of TNF inhibitors in people whose rheumatoid arthritis had had an inadequate response to conventional DMARDs, and an indirect comparison of the abatacept ATTAIN trial and the rituximab REFLEX trial. The manufacturer then adjusted (reduced by 30%) the results of the mixed treatment comparison to reflect a lower response to treatment observed in people who had had an inadequate response to a first TNF inhibitor. The manufacturer converted ACR to HAQ using an algorithm from data in the REFLEX trial. When people discontinued treatment, the manufacturer assumed that the initial effect of treatment was lost. The model included both initial response to treatment and underlying disease progression while on treatment, each modelled as changes in HAQ score. It was assumed that while a person was on a biological DMARD there was no change in HAQ score. People on conventional DMARDs experienced an increase (worsening) in HAQ score of 0.0225 per 6-month cycle and people receiving palliative care experienced an increase in HAQ score of 0.03 per 6-month cycle. The manufacturer mapped HAQ scores to EQ-5D scores using a non-linear mapping mechanism derived from data from the tocilizumab trials.

4.2.17

The costs included drug acquisition and administration, monitoring and hospitalisation. The cost of administration was assumed to be £162 per infusion and this included all premedication and monitoring costs. Subcutaneous drugs incurred monitoring and premedication costs of £1,268 per year and administration costs (£136 for etanercept and £68 for adalimumab) to allow for the 10% of people who will receive injections from a district nurse. Retreatment with rituximab was assumed to occur every 8.7 months. This assumption was tested in sensitivity analysis.

4.2.18

The base-case analysis showed that for rituximab compared with conventional DMARDs, the incremental QALY gain was 1.071 with an incremental cost of £5,685, giving an ICER of £5,311 per QALY gained. Assuming retreatment with rituximab every 6 months, the ICER for rituximab compared with conventional DMARDs increased from £5,311 to £10,876 per QALY gained. The strategies comparing rituximab with etanercept, infliximab or abatacept showed rituximab to be both more effective and less costly. The strategy comparing rituximab with adalimumab showed rituximab to be less effective and less costly, with a QALY loss of 0.044 and an incremental reduction in cost of £13,551, resulting in an ICER for adalimumab compared with rituximab of £310,771 per QALY gained.
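When one strategy is both cheaper and less effective than another, as with rituximab compared with adalimumab above, the ratio is usually reported for the more costly option. A minimal sketch of that bookkeeping is shown below; the function name and the labels A and B are hypothetical, and the inputs are the rounded figures from the text, so the result is close to but not exactly the reported £310,771.

```python
def compare(delta_cost, delta_qalys):
    """Describe strategy A relative to B from A's incremental cost and QALYs."""
    if delta_cost == 0 and delta_qalys == 0:
        return "no difference", None
    if delta_cost <= 0 and delta_qalys >= 0:
        return "A dominates B", None
    if delta_cost >= 0 and delta_qalys <= 0:
        return "A is dominated by B", None
    if delta_cost > 0:
        # A costs more and gains QALYs.
        return "ICER of A versus B", delta_cost / delta_qalys
    # A saves money but loses QALYs: report the ratio for B instead.
    return "ICER of B versus A", delta_cost / delta_qalys

# A = rituximab, B = adalimumab: QALY loss of 0.044, cost reduction of £13,551.
label, value = compare(-13_551, -0.044)
print(label, round(value))
```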

Bristol-Myers Squibb (abatacept)

4.2.19

Bristol-Myers Squibb developed a patient-level simulation model and performed a cost–utility analysis of abatacept compared with rituximab, both followed by infliximab. They also modelled abatacept compared with a basket of TNF inhibitors (reflecting the proportion of each drug's market share), both followed by another basket of TNF inhibitors. The model simulated people whose profiles were based on the baseline characteristics of participants in the ATTAIN trial. The model included adverse events for the first 6 months of treatment.

4.2.20

The manufacturer defined response to treatment in terms of mean change in HAQ score. Estimates for rituximab (0.38) and abatacept (0.42) were based on a mixed treatment comparison of the ATTAIN and REFLEX trials. Estimates for TNF inhibitors (0.21) were taken from an analysis of data from the British Society for Rheumatology Biologics Register completed by NICE's Decision Support Unit for NICE's technology appraisal guidance on adalimumab, etanercept and infliximab for the treatment of rheumatoid arthritis [2007]. The manufacturer assumed that people who discontinued treatment lost the initial effect of treatment. Underlying progression of disease while on treatment was modelled using HAQ score. For people treated with abatacept, disease was assumed to improve over time, with an annual improvement (HAQ score decrease) of 0.0729 in analyses compared with rituximab, and of 0.013 in analyses compared with TNF inhibitors. For the other biological treatments, the manufacturer assumed that disease worsened at an annual HAQ score increase of 0.012. A linear mapping mechanism was used to convert HAQ scores to Health Utilities Index Mark 3 scores during each model cycle.

4.2.21

Costs included drug acquisition and administration, monitoring, hospitalisation (including that for joint replacement surgery), outpatient visits and costs associated with adverse events. Different administration costs were used for the different drugs requiring intravenous infusion. For abatacept the cost per infusion was £141.83 based on the assessment report for NICE's technology appraisal guidance on adalimumab, etanercept and infliximab for the treatment of rheumatoid arthritis [2007] and adjusted for inflation to 2007/2008; for rituximab and infliximab the cost was £284.73 based on NHS reference costs. For subcutaneous treatments a one-off cost of £25.66 was incurred for training. Retreatment with rituximab occurred once every 6 months.

4.2.22

The base-case analysis showed that in comparison with the basket of TNF inhibitors, the QALY gain for abatacept was 0.47 at an incremental cost of £10,888, giving an ICER of £23,019 per QALY gained. The strategy comparing abatacept with rituximab resulted in an incremental QALY gain of 0.45 for an incremental cost of £9,238, producing an ICER of £20,438 per QALY gained. Sensitivity analyses showed that when it was assumed that there were no differences in the underlying progression of disease between the biological DMARDs (that is, a worsening of HAQ score of 0.012 per year was assumed for each biological treatment), the ICER for abatacept was £40,534 per QALY gained compared with rituximab, and £27,871 per QALY gained compared with TNF inhibitors.

The Birmingham Rheumatoid Arthritis Model

4.2.23

The Assessment Group carried out an independent economic analysis using the Birmingham Rheumatoid Arthritis Model. The model simulated people with rheumatoid arthritis based on the baseline characteristics of people in the British Society for Rheumatology Biologics Register. The model sampled people individually and compared each of the technologies (followed by a sequence of conventional DMARDs) with one another or with conventional DMARDs alone. It allowed for two stages of stopping treatment early. The first stage represented the possibility of stopping treatment after 6 weeks (assumed to be because of toxicity) and the second stage represented stopping treatment at between 6 and 24 weeks (assumed to be because of either toxicity or lack of efficacy). The model did not allow for stopping rituximab early because it was necessary to model the full costs of each cycle of treatment.

4.2.24

The Assessment Group modelled response to treatment using HAQ, with changes in HAQ scores calculated using a multiplier that represents a proportional change from a given baseline HAQ score. The respective HAQ multipliers for rituximab and abatacept were derived from the REFLEX and ATTAIN trials. The HAQ multipliers for adalimumab and etanercept were derived from uncontrolled studies. In the absence of data, the HAQ multiplier for infliximab was assumed to be the same as for etanercept. The Assessment Group assumed that when people discontinue treatment, they lose the initial effect of treatment. In addition to the initial response to treatment, the model assumed that underlying disease progresses during treatment. This was modelled by increases in HAQ score. In the base-case analysis, it was assumed that HAQ score remains constant for a person treated with a biological DMARD, but increases (worsens) for people treated with conventional DMARDs or palliative care. The annual HAQ score increase was 0.045 for conventional DMARDs and 0.06 for palliative care. The Assessment Group used a non-linear equation to convert HAQ scores to EQ-5D scores.
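The base-case progression assumptions in this paragraph can be illustrated with a short sketch. It is not the Assessment Group's model, which sampled people individually. The function name, the yearly time step and the clamping of scores to the 0 to 3 range of the HAQ are assumptions made here for illustration; only the three annual rates come from the text above.

```python
# Annual HAQ worsening by treatment state, taken from the base case described above.
ANNUAL_HAQ_INCREASE = {
    "biological_dmard": 0.0,      # HAQ assumed constant while on a biological DMARD
    "conventional_dmard": 0.045,
    "palliative_care": 0.06,
}

HAQ_MIN, HAQ_MAX = 0.0, 3.0  # the HAQ disability index is scored from 0 to 3


def haq_trajectory(start_haq, treatment, years):
    """Return the HAQ score at the end of each year.

    Assumes a constant yearly increment (a simplification for illustration).
    """
    rate = ANNUAL_HAQ_INCREASE[treatment]
    scores = []
    haq = start_haq
    for _ in range(years):
        # Apply one year of progression and keep the score within the HAQ range.
        haq = min(HAQ_MAX, max(HAQ_MIN, haq + rate))
        scores.append(round(haq, 3))
    return scores


if __name__ == "__main__":
    for state in ANNUAL_HAQ_INCREASE:
        print(state, haq_trajectory(1.5, state, 5))
```

Under these assumptions, a person starting at a HAQ score of 1.5 stays at 1.5 on a biological DMARD, while the score rises each year on conventional DMARDs or palliative care.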

4.2.25

Costs included drug acquisition and administration plus monitoring. The administration cost for drugs requiring infusion was assumed to be £141.83. Costs for hospitalisation and joint replacement were estimated using a cost per unit HAQ score. Retreatment with rituximab was assumed to occur every 8.7 months.

4.2.26

The base-case analysis showed that for rituximab compared with conventional DMARDs, the incremental QALY gain was 0.96 with an incremental cost of £20,400, giving an ICER of £21,100 per QALY gained. The QALY gains for adalimumab, etanercept and infliximab were 0.75, 0.67 and 0.67 respectively. The respective incremental costs were £25,800, £26,100 and £24,000, giving ICERs (rounded to the nearest £100) of £34,300, £38,900 and £36,100 per QALY gained. The strategy comparing abatacept with conventional DMARDs resulted in an incremental QALY gain of 1.15 for an incremental cost of £44,000, producing an ICER of £38,400 per QALY gained. Compared with the TNF inhibitors, rituximab was shown to be both less costly and more effective. The ICER for abatacept in comparison with rituximab was £130,600 per QALY gained. The strategies comparing abatacept with adalimumab, etanercept and infliximab resulted in ICERs of £46,400 per QALY gained, £37,800 per QALY gained and £41,700 per QALY gained respectively.
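The relationship between the figures in this paragraph can be sketched as follows. Each technology's incremental cost and QALY gain is reported against the same comparator (conventional DMARDs), so an approximate ICER between two technologies is the difference in incremental costs divided by the difference in incremental QALYs. The function name and data layout are assumptions for illustration. The inputs are the rounded published values, so the results will not exactly reproduce the ICERs reported above.

```python
# Incremental QALYs and costs (GBP) versus conventional DMARDs, as reported above (rounded).
VS_CONVENTIONAL = {
    "rituximab": (0.96, 20_400),
    "adalimumab": (0.75, 25_800),
    "etanercept": (0.67, 26_100),
    "infliximab": (0.67, 24_000),
    "abatacept": (1.15, 44_000),
}


def pairwise_icer(treatment, comparator):
    """Approximate ICER (GBP per QALY gained) of treatment versus comparator.

    A negative value with a positive QALY difference means the treatment is
    both less costly and more effective (it dominates the comparator).
    """
    qaly_t, cost_t = VS_CONVENTIONAL[treatment]
    qaly_c, cost_c = VS_CONVENTIONAL[comparator]
    delta_qaly = qaly_t - qaly_c
    delta_cost = cost_t - cost_c
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


if __name__ == "__main__":
    for comparator in ("rituximab", "adalimumab", "etanercept", "infliximab"):
        value = pairwise_icer("abatacept", comparator)
        print(f"abatacept vs {comparator}: about £{value:,.0f} per QALY gained")
```

With the rounded inputs, rituximab compared with each TNF inhibitor gives a negative cost difference and a positive QALY difference, which is consistent with the statement that rituximab was less costly and more effective.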

4.2.27

Scenario analyses were undertaken to explore the impact of varying single assumptions within the model. These included: the time on treatment with the various therapies; the rituximab treatment interval; the efficacy of conventional DMARDs after the failure of a TNF inhibitor; the change in HAQ score while on biological DMARDs; the proportion of people stopping treatment early; the inclusion of costs of adverse events and palliation; and assumptions related to the equation used to map HAQ score to EQ-5D scores. These analyses indicated that the results are subject to considerable uncertainty.

4.2.28

Assuming that there was underlying progression of disease modelled as an increase in HAQ score of 0.03 per year while on biological DMARDs increased the ICERs for the comparison with conventional DMARDs. The ICERs were £61,300 per QALY gained for adalimumab, £76,300 per QALY gained for etanercept, £68,900 per QALY gained for infliximab, £46,000 per QALY gained for rituximab and £63,300 per QALY gained for abatacept.

4.2.29

Assuming conventional DMARDs were no more effective than placebo reduced the base-case ICERs for the comparison with conventional DMARDs to £28,100 per QALY gained for adalimumab, £31,100 per QALY gained for etanercept, £28,800 per QALY gained for infliximab, £16,300 per QALY gained for rituximab and £32,100 per QALY gained for abatacept. An incremental analysis showed that rituximab was both less costly and more effective than each of the TNF inhibitors. The ICER for abatacept in comparison with rituximab was £158,000 per QALY gained.

4.2.30

Using the same proportion of people stopping early as was used in the Roche model (based on failure to achieve an ACR 20 response) reduced the base-case ICERs for the comparison with conventional DMARDs to £22,200 per QALY gained for adalimumab, £23,400 per QALY gained for etanercept, £26,200 per QALY gained for infliximab, £19,500 per QALY gained for rituximab, and £24,100 per QALY gained for abatacept. A detailed analysis of the costs and QALYs provided as an addendum by the Assessment Group showed that abatacept (with costs of £70,500 and 2.82 QALYs) was dominated by rituximab (with costs of £62,300 and 2.91 QALYs). Adalimumab, etanercept and infliximab were associated with lower costs than rituximab (£61,700, £60,300, and £62,300, respectively). However, they were also associated with fewer QALYs (2.56, 2.47 and 2.49, respectively). In an incremental analysis, rituximab had the lowest ICER, with the other treatments being either dominated or extendedly dominated.
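The terms 'dominated' and 'extendedly dominated' can be illustrated with a minimal sketch of an incremental analysis using the total costs and QALYs quoted in this paragraph. The class and function names are assumptions for illustration, and this is not the Assessment Group's analysis. In particular, totals for conventional DMARDs are not reported in this section, so they are omitted here and the lowest-cost technology acts as the reference point.

```python
from typing import List, NamedTuple


class Strategy(NamedTuple):
    name: str
    cost: float   # total cost in GBP
    qalys: float  # total QALYs


def efficiency_frontier(strategies: List[Strategy]) -> List[Strategy]:
    """Return the strategies left after removing dominated and extendedly dominated ones."""
    # Order by cost; among equal-cost options, put the one with more QALYs first.
    ordered = sorted(strategies, key=lambda s: (s.cost, -s.qalys))

    # Simple dominance: drop any option that gives no more QALYs than a
    # cheaper (or equally costly) option already kept.
    frontier: List[Strategy] = []
    for s in ordered:
        if not frontier or s.qalys > frontier[-1].qalys:
            frontier.append(s)

    # Extended dominance: ICERs must increase along the frontier. Drop any
    # option whose ICER exceeds that of the next, more effective option.
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            prev, cur, nxt = frontier[i - 1], frontier[i], frontier[i + 1]
            icer_cur = (cur.cost - prev.cost) / (cur.qalys - prev.qalys)
            icer_next = (nxt.cost - cur.cost) / (nxt.qalys - cur.qalys)
            if icer_cur > icer_next:
                del frontier[i]
                changed = True
                break
    return frontier


# Totals quoted in the paragraph above (scenario using the Roche early-stopping proportions).
REPORTED = [
    Strategy("abatacept", 70_500, 2.82),
    Strategy("rituximab", 62_300, 2.91),
    Strategy("adalimumab", 61_700, 2.56),
    Strategy("etanercept", 60_300, 2.47),
    Strategy("infliximab", 62_300, 2.49),
]

if __name__ == "__main__":
    print([s.name for s in efficiency_frontier(REPORTED)])
```

In this sketch abatacept and infliximab are removed by simple dominance and adalimumab by extended dominance. Etanercept remains only because it is the lowest-cost option in the absence of the conventional DMARD reference.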

4.2.31

Results suggested that the model was sensitive to changes in the equation converting HAQ to health-related quality of life; and the assumed time between treatments for comparisons involving rituximab. Assuming retreatment with rituximab every 6 months instead of every 8.7 months, the ICER for rituximab in comparison with conventional DMARDs increased from £21,100 to £32,600 per QALY gained.
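One reason the retreatment interval matters can be shown with simple arithmetic: a shorter interval means more courses of rituximab per year and therefore higher drug and administration costs. The sketch below only compares course frequency. The function name is an assumption, and no cost per course is used because none is given in this section.

```python
def courses_per_year(interval_months: float) -> float:
    """Average number of treatment courses per year for a given retreatment interval."""
    if interval_months <= 0:
        raise ValueError("interval must be positive")
    return 12.0 / interval_months


base_case = courses_per_year(8.7)   # Assessment Group base case
scenario = courses_per_year(6.0)    # scenario analysis described above
relative_increase = scenario / base_case - 1.0

if __name__ == "__main__":
    print(f"base case: {base_case:.2f} courses per year")
    print(f"scenario:  {scenario:.2f} courses per year")
    print(f"increase:  {relative_increase:.0%}")
```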

4.2.32

Results suggested that the model was not sensitive to changes in the cost parameters, including those associated with hospitalisation and joint replacement, palliative care, and adverse events. The base-case analysis assumed a cost of joint replacement and hospitalisation of £1,120 per HAQ score unit. The exclusion of these costs increased the ICER in comparison with conventional DMARDs by approximately £2,500 per QALY gained. The inclusion of additional drug costs of palliation (that is, a £420 start-up cost and a subsequent annual cost of approximately £1,000) reduced the ICERs in comparison with conventional DMARDs by approximately £1,000 per QALY gained.

4.3 Consideration of the evidence

4.3.1

The Appraisal Committee reviewed the data available on the clinical and cost effectiveness of adalimumab, etanercept, infliximab, rituximab and abatacept after the failure of a TNF inhibitor. The Committee considered evidence on the nature of rheumatoid arthritis and the value placed on the benefits of adalimumab, etanercept, infliximab, rituximab and abatacept by people with the condition, those who represent them and clinical specialists. It also took into account the effective use of NHS resources.

Clinical effectiveness

4.3.2

The Committee considered the current clinical management of people with rheumatoid arthritis. The Committee heard from clinical specialists that the pathway of care following the failure of treatment with a TNF inhibitor depends on the individual person's responses to therapies, the clinical experience of the physician and the person's preference. The Committee heard from patient experts that rheumatoid arthritis can have a severe impact on quality of life, and that fatigue, pain and depression are common among people with the disease. Patient experts reported that rheumatoid arthritis frequently affects people's ability to work, noting the considerable burden placed on the carers of people with the disease. The Committee heard that rheumatoid arthritis may not respond to a given treatment, or there may be a decline in response over time that requires a change in treatment. Clinical specialists and patient experts emphasised the importance of having the option of multiple treatments for people whose disease has not responded adequately to initial treatment with a TNF inhibitor.

4.3.3

The Committee heard from clinical specialists that rheumatoid arthritis is heterogeneous and that different people may respond differently to a given treatment. In addition, it is difficult to predict whether an individual's disease will respond to a given treatment. Experts stated that people for whom a TNF inhibitor had never produced a response may be less suitable for a second TNF inhibitor than people whose rheumatoid arthritis had previously responded, and that people with seronegative antibody status may be less suitable for treatment with rituximab than people with seropositive antibody status, although uncertainty surrounds this. The Committee therefore understood that response to treatments varies, and that it is not currently possible to target specific treatments to individuals because the response to any particular treatment cannot be predicted.

4.3.4

The Committee heard that the management of rheumatoid arthritis was changing in line with NICE guidelines for rheumatoid arthritis, and that more clinicians start DMARDs early and increase the dose of DMARDs quickly as required. The Committee heard from clinical specialists that, as a consequence of this accelerated approach to the use of DMARDs, physicians initiate treatment with TNF inhibitors sooner after diagnosis than they had done previously and therefore the characteristics of the people being treated with TNF inhibitors have changed over time. The Committee also heard from the clinical specialists that initiating treatment with TNF inhibitors earlier in the treatment pathway may increase people's potential to benefit from such treatment because of the reduced accumulation of irreversible joint damage. The Committee understood that these changes in the management of rheumatoid arthritis limited the generalisability of data from the British Society for Rheumatology Biologics Register, as it represents a cohort of people whose characteristics (including disease duration and number of previously received treatments) may not reflect those of the people currently seen in clinical practice for whom the first biological DMARD has failed.

4.3.5

The Committee considered the decision problem in the scope, noting comments from consultees that three technologies (certolizumab pegol, golimumab and tocilizumab) defined as comparators in the scope had not been included in the Assessment Report because marketing authorisations had not been obtained at the point at which the protocol was finalised. The Committee noted that golimumab and tocilizumab were subject to separate ongoing appraisals and were not yet in routine clinical use. It recognised that NICE technology appraisal guidance on certolizumab pegol for the treatment of rheumatoid arthritis (now replaced by NICE's technology appraisal guidance on adalimumab, etanercept, infliximab, certolizumab pegol, golimumab, tocilizumab and abatacept for rheumatoid arthritis not previously treated with DMARDs or after conventional DMARDs only have failed) did not include guidance on its use after the failure of a TNF inhibitor. The Committee heard that in the absence of NICE guidance, the use of certolizumab pegol in this patient group would be subject to local decision making. The Committee concluded that excluding certolizumab pegol, golimumab and tocilizumab from the Assessment Report was appropriate. The Committee noted, however, the availability of RCT data for golimumab (an alternative TNF inhibitor) after the failure of a TNF inhibitor, and agreed that consideration of these data would be relevant to the decision problem in the appraisal (section 4.3.7).

4.3.6

The Committee considered whether the TNF inhibitors could be considered as a group with respect to clinical effectiveness. The Committee was aware that each of the TNF inhibitors has a different mechanism of action. The Committee heard from clinical specialists that variations in the underlying mechanism of disease across people coupled with different mechanisms of action of the individual drugs can result in a variety of responses to treatment with TNF inhibitors. The Committee heard from patient experts that the technologies should be considered separately. The Committee heard from clinical specialists that data from the British Society for Rheumatology Biologics Register show no statistically significant difference in effect between TNF inhibitors, but that these data reflect the effectiveness of the first use of TNF inhibitors and not the effectiveness after the failure of a previous TNF inhibitor. Based on advice from the clinical specialists and patient experts that disease can respond differently to different TNF inhibitors, the Committee concluded that it may not be appropriate to assume that the TNF inhibitors form a homogeneous group with regard to clinical effectiveness. However, in the absence of evidence either way, it was not currently possible to distinguish with certainty between the TNF inhibitors in terms of their clinical effectiveness.

4.3.7

The Committee discussed the clinical effectiveness of a second TNF inhibitor after the failure of a first. It noted that the review of the evidence by the Assessment Group had identified no RCTs and that the majority of other studies were uncontrolled observational or registry datasets, some of which had examined the TNF inhibitors as a group. The Committee heard from clinical specialists that for conventional DMARDs, the proportion of people whose condition responded to each successive treatment was reduced as the number of treatments increased, and that the same was considered to hold true for biological DMARDs. The clinical specialists noted that failure of a first TNF inhibitor was associated with an increased risk of failure of a second TNF inhibitor, but that the proportion of people with a good response was comparable. The clinical specialists therefore considered that a second TNF inhibitor was clinically effective. The Committee noted the comment made in consultation that the GO-AFTER trial data for a different TNF inhibitor (golimumab) could be used to inform the relative treatment effect of the TNF inhibitors in comparison with placebo. It discussed data from this trial that showed a statistically significant benefit from treatment with the TNF inhibitor golimumab after the failure of a different TNF inhibitor when compared with placebo. However, while noting the availability of these data, the Committee was mindful of comments from the clinical specialists and patient experts that the TNF inhibitors should be considered separately (see section 4.3.6). Furthermore, the Committee agreed that it would not be appropriate to apply the results from a treatment not included in the appraisal to the drugs in the appraisal, but concluded that the results for golimumab could be seen as confirming a beneficial effect of TNF inhibitor treatment following failure of a first TNF inhibitor.

4.3.8

The Committee specifically considered the data from the British Society for Rheumatology Biologics Register. It heard from clinical specialists that the register reported an improvement in HAQ score of 0.11 among people receiving a second TNF inhibitor. The Committee also heard that this change did not differ from the average change in HAQ score (0.10) among people whose first TNF inhibitor had failed but who had continued to take it. The Committee heard from the clinical specialists that this change was smaller than the minimum value that is considered a clinically important difference within the context of a clinical trial (0.22) and within the context of an observational study (0.14). The Committee noted the limitations of the British Society for Rheumatology Biologics Register data. Considering these data in conjunction with the evidence discussed in section 4.3.7, the Committee concluded that although the studies suggest that a second TNF inhibitor is effective after the failure of a first, the absence of any rigorously controlled data meant that it could not quantify with certainty the relative effects of adalimumab, etanercept or infliximab in comparison with either conventional DMARDs or alternative biological DMARDs.

4.3.9

The Committee considered the evidence from the randomised controlled trials of rituximab (REFLEX trial) and of abatacept (ATTAIN trial). The Committee noted the results of the Assessment Group's indirect comparison of rituximab and abatacept based on these trials, which did not show statistically significant differences between the two treatments, and that the conclusions were similar to those from the indirect comparisons carried out by Roche and Bristol-Myers Squibb. The Committee concluded that both treatments had been shown to be clinically effective in comparison with placebo, but that one treatment had not been shown to be more effective than the other. The Committee concluded that the data for TNF inhibitors were insufficient to quantify with certainty the relative effect of rituximab and abatacept in comparison with adalimumab, etanercept or infliximab when used after the failure of the first TNF inhibitor.

4.3.10

The Committee considered specifically the evidence of clinical effectiveness for the subgroup of people defined by reason for withdrawal of the first TNF inhibitor. It heard from clinical specialists that they considered that people whose disease had not responded to the first TNF inhibitor (primary non-response) would be less likely to experience a response to a second TNF inhibitor in comparison with those whose disease had initially responded but who had later experienced diminishing benefit (secondary non-response). However, the Committee considered that the studies identified by the Assessment Group did not show a consistent difference in response (including less response, similar response and better response) between secondary non-response and primary non-response. The Committee concluded that although some evidence and clinical testimony suggested a difference in response by reason for withdrawal, there was currently insufficient evidence for the Committee to use this as a basis for making recommendations for this specific subgroup.

4.3.11

The Committee also considered subgroups based on the presence of auto-antibodies (rheumatoid factor and anti-CCP antibody status) and the impact of the presence of auto-antibodies on the clinical and cost effectiveness of rituximab. The Committee heard from the clinical specialists that the presence of auto-antibodies is not a consistent measure in that the same person may have a positive test for auto-antibodies in one instance and a negative test in another. The Committee was aware that a post-hoc analysis of the REFLEX study showed no statistically significant differences in relative effectiveness between subgroups defined by auto-antibody status. The Committee recognised that the REFLEX study and several other studies highlighted by consultees showed a lower absolute response rate in people who test seronegative compared with those who test seropositive. The Committee heard from clinical specialists that draft guidelines from the British Society for Rheumatology advise that people who test seropositive for either rheumatoid factor or anti-CCP may be more likely to respond than people who test seronegative for the two antibodies, and that this should be taken into account when considering rituximab. However, the clinical specialists considered that people who test seronegative may still respond to rituximab treatment. The Committee was not aware of data showing with certainty that adalimumab, etanercept, infliximab or abatacept would be more clinically effective than rituximab in this situation, and it noted that people who test seronegative are not excluded from the current marketing authorisation for rituximab. On balance, the Committee was not persuaded that there was currently sufficient evidence to conclude that rituximab treatment was inappropriate for people who test seronegative. Therefore, the Committee agreed not to make differential recommendations for a subgroup based on auto-antibody status.

4.3.12

The Committee noted that no studies had been identified that compared the biological DMARDs with a newly initiated conventional DMARD after the failure of a first TNF inhibitor. The Committee heard from clinical specialists that they considered that any treatment effect for conventional DMARDs in this situation would be very limited. The Committee was aware of evidence from the Behandel Strategieen (BeST) study, which investigated the effectiveness of different treatment sequences of biological and conventional DMARDs in people with early rheumatoid arthritis. The Committee heard from the Assessment Group that evidence from the BeST study was not appropriate in this instance, as it did not address the clinical effectiveness of individual DMARDs and the study population did not represent people with established rheumatoid arthritis. The Committee noted comments received in consultation stating that it would be appropriate to assume no effect of conventional DMARDs after failure of a TNF inhibitor, because data from the British Society for Rheumatology Biologics Register reported that people who stopped treatment with a TNF inhibitor showed no average change in HAQ score 12 months later. However, the Committee was mindful that these data were for people stopping treatment with a TNF inhibitor only and did not specifically measure the effect of treatment for people starting conventional DMARDs at that point. Overall, the Committee concluded that, on the basis of clinical opinion, the effect of conventional DMARDs in people for whom a TNF inhibitor had failed was likely to be small, but it did not accept that there would be no effect at all associated with therapy. In addition, the Committee agreed that the uncertainty about the effectiveness of DMARDs contributed to the difficulty in quantifying with certainty the relative effect of biological treatments compared with DMARDs.

4.3.13

In summary, the Committee noted that, apart from the randomised controlled trials of rituximab and abatacept, the available evidence on the effectiveness of treatment with the considered technologies after the failure of a TNF inhibitor was mainly derived from observational studies with short follow-up periods that included relatively small numbers of participants. The Committee noted that many of the studies lacked a comparison group, so it was not clear what would have happened had participants not received therapy. The Committee considered that shortcomings in the design of studies of the sequential use of TNF inhibitors could affect the validity of the results. It also considered that characteristics of the study participants, changes in clinical practice, and, in some instances, small participant numbers could affect the generalisability of the results. The Committee considered that there are significant limitations in the evidence base available for this appraisal and that the relative clinical effectiveness of TNF inhibitors after the failure of a first TNF inhibitor remains uncertain. The Committee acknowledged that some of the manufacturers had carried out mixed treatment or indirect comparisons. It noted the concerns of the Assessment Group that these comparisons did not increase the robustness of the results because of the inclusion of populations outside the scope, and because of the possible shortcomings related to dealing with heterogeneity between the included studies. The Committee was aware that the analyses did not consistently demonstrate a similar pattern of effect between the technologies. On balance, the Committee was not persuaded that for this appraisal the nature of the evidence available would allow mixed treatment or indirect comparisons to adequately address the underlying uncertainty in the effectiveness of these technologies. 
The Committee further noted that more research is needed, specifically using the DAS28 outcome measure (which forms the basis for the rules for continuing treatment in current NICE guidance on treatments for rheumatoid arthritis). The Committee heard from clinical specialists and manufacturers about ongoing research on the treatment of rheumatoid arthritis after the failure of a TNF inhibitor. The Committee heard that a current clinical trial of infliximab (RESTART) is being undertaken in patients with active rheumatoid arthritis in whom treatment with etanercept or adalimumab has failed. It noted that a preliminary analysis from this trial had been provided by the manufacturer of infliximab as commercial in confidence.

Cost effectiveness

4.3.14

The Committee examined the cost-effectiveness analysis of sequential use of TNF inhibitors performed by the Assessment Group and the manufacturers of the technologies. The Committee noted that all analyses modelled a sequence of treatments, which it considered appropriate for rheumatoid arthritis. The Committee noted, however, that there were differences in the sequences modelled. The Committee was aware that one of the models (from Pfizer) had not been provided as an executable file, and had not included abatacept. This limited the Committee's ability to use the model to inform decision making. The Committee recognised that one of the models (from Bristol-Myers Squibb) had not included a comparison with conventional DMARDs, which limited the comparability of the model with those of the other manufacturers and of the Assessment Group.

4.3.15

The Committee was presented with information about the costs used in the economic models. The Committee recognised that the costs of hospitalisation and joint replacement had been included in all of the manufacturers' models, and that these costs were derived from a range of data sources including the British Society for Rheumatology Biologics Register and the Norfolk Arthritis Register. The Committee was aware that the Birmingham Rheumatoid Arthritis Model included an assumed cost for joint replacement and hospitalisation. The Committee noted comments received during consultation that the costs of palliative care had been underestimated in the Birmingham Rheumatoid Arthritis Model. The Committee recognised that the Assessment Group had carried out a series of analyses examining the sensitivity of the ICERs to changes in the cost parameters, including the removal of joint replacement and hospitalisation costs, the addition of extra costs of palliation and inclusion of the costs of adverse events. These analyses showed that the ICERs were not very sensitive to changes in these costs, and that the ICERs were most sensitive to changes in the assumptions about the natural history of the disease, the efficacy of the treatments and the proportion of people stopping treatment early.

4.3.16

The Committee discussed the different sources of estimates of clinical effectiveness for the biological DMARDs that had been used in the economic modelling. The Committee noted that all models had used the REFLEX and ATTAIN trials to inform the estimates of rituximab and abatacept, but that sources varied for the estimates for TNF inhibitors and conventional DMARDs. It noted that some had included RCT data from populations outside the scope of the appraisal, uncontrolled observational studies or registry data. The Committee was aware that no head-to-head evidence existed that compared all the biological DMARDs, and as a result some models derived relative treatment effect from indirect comparisons. The Committee noted that these had included evidence from studies in which participants had not previously been treated with a TNF inhibitor. The Assessment Group reported that it considered the use of data from populations beyond the scope of the appraisal to complete an indirect comparison inappropriate because of the heterogeneity of the studies from which the data were taken. The Committee heard from the Assessment Group that it had modelled the rates of effectiveness for biological and conventional DMARDs as absolute rather than relative changes, even if from placebo-controlled randomised trials, because it considered that evidence did not allow for the completion of a mixed treatment or indirect comparison. The Committee noted that the use of non-randomised comparisons could affect the robustness of the results, but it accepted that the evidence base available for the sequential use of adalimumab, etanercept and infliximab did not currently allow for a robust analysis of the relative treatment effects.

4.3.17

The Committee considered the value of HAQ score as a measure of functional assessment. The Committee heard from clinical specialists that HAQ score was affected by both reversible and irreversible components of the disease process, and that longstanding disease lessens the potential for improvements in HAQ score because of irreversible damage. For this reason the Committee considered that HAQ score may not be an appropriate measure of clinical benefit in established disease. The clinical specialists and patient experts considered that treatment might benefit individuals in ways not captured by HAQ score (such as a reduction in inflammation). The Committee recognised that the HAQ may be subject to 'ceiling effects' (in certain circumstances, the score cannot worsen), and that it does not incorporate symptoms such as pain, fatigue and sleep disturbance. The Committee concluded that patients may derive benefits from the treatment that are not reflected in HAQ score because of irreversible joint damage.

4.3.18

The Committee discussed the range of methods used to model efficacy of the treatments, including responses in ACR categories mapped to changes in HAQ score, ACR response categories mapped to EULAR response and mean HAQ score change without the use of ACR response categories. The Committee was aware of the limitations of the HAQ score, including its insensitivity to small changes within the higher range of scores and its inability to capture meaningful improvements in pain and fatigue (see section 4.3.17). Following explanation from the Assessment Group, the Committee understood that HAQ multipliers represent a proportional change from a given baseline HAQ score. This means that for a baseline HAQ score of 2.00, the use of a HAQ multiplier of 0.25 translates into an HAQ improvement of 0.50, which results in a post-treatment HAQ score of 1.50. The Committee considered that the use of a multiplier to model changes in HAQ meant that absolute changes in the upper range of the HAQ scores were larger than those in the lower range, and that, using a HAQ multiplier, people with more severe disease would have larger absolute HAQ improvements than if the changes in HAQ score observed from the clinical studies were used directly. The Committee discussed comments received in consultation that the HAQ multiplier lacked face validity because the distribution of simulated changes in treatment-related HAQ score did not reflect the ranges observed in clinical trials. Bearing in mind these considerations, the Committee was not persuaded that the use of a HAQ multiplier was an unreasonable way to model changes in HAQ score. However, it agreed that alternative approaches should not be discounted and that it was appropriate to consider the cost-effectiveness analyses of the manufacturers who used alternative methods to calculate clinical effectiveness inputs.
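The worked example in this paragraph, and the contrast with applying an observed change directly, can be sketched as follows. The function names are assumptions for illustration. The multiplier of 0.25 and the baseline of 2.00 are the values from the example above. The fixed change of 0.50 used for comparison is a hypothetical value chosen so that the two approaches agree at a baseline of 2.00.

```python
def haq_after_multiplier(baseline_haq: float, multiplier: float) -> float:
    """Post-treatment HAQ when improvement is a proportion of the baseline score."""
    improvement = baseline_haq * multiplier
    return baseline_haq - improvement


def haq_after_fixed_change(baseline_haq: float, change: float) -> float:
    """Post-treatment HAQ when the same absolute improvement is applied to everyone.

    The score is not allowed to fall below 0, the minimum of the HAQ scale.
    """
    return max(0.0, baseline_haq - change)


if __name__ == "__main__":
    multiplier = 0.25     # value from the example in the text
    fixed_change = 0.50   # hypothetical absolute improvement, for comparison only
    for baseline in (1.0, 2.0, 3.0):
        by_multiplier = haq_after_multiplier(baseline, multiplier)
        by_fixed = haq_after_fixed_change(baseline, fixed_change)
        print(f"baseline {baseline:.2f}: multiplier -> {by_multiplier:.2f}, "
              f"fixed change -> {by_fixed:.2f}")
```

With the multiplier, the absolute improvement grows with the baseline score (0.25, 0.50 and 0.75 for baselines of 1.00, 2.00 and 3.00), which is the property the Committee discussed.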

4.3.19

The Committee discussed how the models had incorporated underlying progression of disease during treatment. The Committee noted that all but two analyses had been carried out assuming that disease did not progress in people receiving TNF inhibitors, rituximab and abatacept, but that disease did progress in people taking conventional DMARDs. The Committee was aware that for the biological DMARDs, the use of no progression assumed both no underlying deterioration of physical function and no reduction in response to treatment. The Committee noted that one of the analyses (from Bristol-Myers Squibb) had assumed that abatacept delayed progression more than the other biological DMARDs. It recognised that the values used came from two different sources: clinical trial data on abatacept and an analysis of natural history data not specific to biological DMARDs. The Committee was not persuaded that the evidence was sufficient to support an assumption that the different biological treatments differentially altered progression of disease. In conclusion, the Committee was persuaded that it was appropriate to assume that biological DMARDs delayed disease progression more than conventional DMARDs. However, the Committee was aware that people with rheumatoid arthritis normally experience a reduction in the response to treatment before stopping it (secondary loss of response). The Committee agreed to base its discussions on the ICERs that assumed no progression of disease for patients during treatment with the biological DMARDs, but was not persuaded that this assumption fully reflects the disease process. This is because people could experience some worsening of HAQ while on treatment, particularly in the period of time prior to stopping treatment because of secondary loss of response, in which case the ICERs assuming no progression of disease may overestimate the benefits of treatment.

4.3.20

The Committee noted that none of the economic models included health-related quality of life measured using a generic preference-based measure, but had mapped a disease-specific measure (HAQ or DAS) to a generic measure (EQ-5D). The Committee understood that in the case where DAS was mapped to EQ-5D, the algorithm used had been developed from EQ-5D data itself derived indirectly from HAQ data. The Committee noted that the mapping of HAQ to EQ-5D allowed for the symptoms of rheumatoid arthritis to cover a broad range of values on the quality-of-life scale, from excellent health to states worse than death. The Committee noted that mapping utilities was outside the reference case, but recognised it had been used in previous NICE technology appraisals of treatments for rheumatoid arthritis in the absence of directly-elicited EQ-5D data. The Committee heard from the Assessment Group that the Birmingham Rheumatoid Arthritis Model incorporated HAQ because it did not consider that DAS captured all aspects of disability that one would expect to correlate with health-related quality of life. The Committee heard from clinical specialists that evidence from the British Society for Rheumatology Biologics Register suggested that for more severe HAQ scores, mapping may underestimate the change in EQ-5D. However, the Committee was mindful that none of the models incorporated directly elicited EQ-5D data and all relied on mapping to inform estimates. The Committee noted that some of the manufacturers had mapped HAQ to EQ-5D using a linear function, while others had used a non-linear function. It heard from the Assessment Group that the use of a non-linear function places a greater value on changes at the lower end of the HAQ scale than at the upper end, but that this did not significantly change the estimated ICERs. 
The Committee concluded that mapping to EQ-5D had shortcomings, but in the absence of an alternative was an acceptable way to derive estimates of utility, and that the use of a non-linear function was not unreasonable.

4.3.21

The Committee discussed the time intervals between treatments with rituximab. The Committee was aware of the results of the REFLEX trial, in which the average time interval between treatments was 307 days, and the SPC for rituximab, which indicates treatment intervals of no less than 16 weeks. It heard from clinical specialists that they would offer to retreat a patient with rituximab before disease flared, that there was wide variation in time to retreatment with rituximab and that it would be reasonable to assume that treatment with rituximab would occur, on average, less frequently than every 6 months, with some people requiring an infusion less often than once a year. The Committee noted that the Birmingham Rheumatoid Arthritis Model modelled time to repeat treatment as 8.7 months in the base case, basing this estimate on Roche's submission. It noted that a similar time to retreatment had been assumed in a number of the other manufacturers' submissions. The Committee understood that recently published data from the SUNRISE trial indicate that two courses of rituximab given 6-monthly result in a statistically significantly higher ACR 20 response rate at 1 year than one course given per year. It noted the comments received in consultation from some of the manufacturers who, because of the newly available data from the SUNRISE trial, considered that their base-case analyses may overestimate the duration of time between rituximab retreatments. The Committee noted that in the Assessment Group's scenario analysis, the ICER for rituximab compared with conventional DMARDs was £21,100 per QALY gained when time to retreatment was 8.7 months, and that this increased to £32,600 per QALY gained when time to retreatment was 6 months. The Committee concluded that an 8.7-month retreatment interval is likely to overestimate the time between consecutive courses of rituximab.
However, on the basis of the clinical specialists' advice, the Committee considered that it was unlikely to be as frequent as every 6 months for every person receiving rituximab.
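To show how the retreatment interval feeds into treatment frequency, the following minimal sketch converts an interval in months into an average number of courses per year. It is simple arithmetic only and does not reproduce the ICERs from the Assessment Group's scenario analysis; the function name `courses_per_year` is an assumption made for illustration.

```python
def courses_per_year(interval_months: float) -> float:
    """Average number of rituximab courses per year for a given
    retreatment interval in months (hypothetical helper)."""
    return 12.0 / interval_months


base_case = courses_per_year(8.7)  # interval used in the base case
scenario = courses_per_year(6.0)   # interval used in the scenario analysis

print(f"8.7-month interval: {base_case:.2f} courses per year")
print(f"6-month interval:   {scenario:.2f} courses per year")
print(f"relative increase:  {scenario / base_case:.2f}x")
```

Moving from an 8.7-month to a 6-month interval raises the average from about 1.38 to 2 courses per year, that is, about 1.45 times as many courses, which is consistent in direction with the higher ICER reported for the shorter interval.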

4.3.22

The Committee considered the use of stopping and continuation rules in the economic models. The Committee noted that current NICE guidance on the first use of TNF inhibitors (NICE's technology appraisal guidance on adalimumab, etanercept and infliximab for the treatment of rheumatoid arthritis [2007]) recommends that TNF inhibitors should not be continued unless there is an adequate response at 6 months following initiation of therapy. An adequate response is defined as an improvement in DAS28 of 1.2 points or more. The Committee heard from the clinical specialists that data from the British Society for Rheumatology Biologics Register indicate that a number of people will continue treatment with a TNF inhibitor in the absence of such a response, indicating that clinicians currently do not adhere strictly to continuation rules. However, it also heard from the clinical specialists that although implementing continuation rules could be difficult, clinicians were increasingly following guidance on continuation rules. The Committee was aware that four of the manufacturers had submitted models that included continuation rules, each based on a different response criterion (that is, HAQ score, EULAR response, ACR 20 and ACR 50). The Committee understood that the Birmingham Rheumatoid Arthritis Model was not designed in a way that could incorporate continuation rules based on response. The Committee noted, however, the scenario analyses that included the proportions of people stopping treatment early that were used in the manufacturers' response-based models. These analyses lowered the ICERs for the TNF inhibitors and abatacept compared with conventional DMARDs by approximately £10,000 per QALY gained, to between £23,800 per QALY gained for adalimumab and £27,400 per QALY gained for infliximab. The Assessment Group explained that ICERs for rituximab did not change because the cost of rituximab treatment occurred at the start of each course of treatment.
The Committee understood that consultees considered the modelling of continuation rules appropriate, as it reflects existing NICE guidance for the treatment of rheumatoid arthritis. The Committee concluded that continuation rules should be considered in the estimation of cost effectiveness.

4.3.23

The Committee discussed the different ways in which the manufacturers and the Assessment Group had modelled the efficacy of conventional DMARDs. The Committee noted that the Birmingham Rheumatoid Arthritis Model assumed that the conventional DMARDs used after the failure of a TNF inhibitor were 50% as effective as when used in early rheumatoid arthritis. The Committee heard from the clinical specialists and from comments received in consultation that the assumption that conventional DMARDs used after the failure of a TNF inhibitor were 50% as effective as when used in early rheumatoid arthritis overestimated their effectiveness at this point in the treatment pathway. The Committee noted that the Assessment Group had completed a scenario analysis that assumed an efficacy of conventional DMARDs equal to that of placebo (reflecting the assumption used in the submission from Roche). This resulted in ICERs of approximately £16,000 per QALY gained for rituximab compared with conventional DMARDs, and between £28,000 and £32,000 per QALY gained for the other technologies compared with conventional DMARDs. The Assessment Group explained that the differences between the ICERs using different assumptions about DMARD efficacy were not larger because the Birmingham Rheumatoid Arthritis Model assumes that TNF inhibitors are added to a sequence rather than used as an alternative treatment. Therefore, the effects of the conventional DMARDs were observed in both the intervention and comparator sequences. The Committee discussed comments received on the effectiveness of conventional DMARDs. The Committee concluded that an analysis that assumed the effect of conventional DMARDs to be no more than that of placebo was not plausible, but accepted that the base-case assumption of a reduction of 50% was an underestimate of the reduction in effect of conventional DMARDs, and therefore overestimated the ICERs in the Assessment Group's base-case analysis.

4.3.24

The Committee considered the estimates of cost effectiveness for the use of rituximab after the failure of a TNF inhibitor. It recognised that in all but one of the economic models, rituximab had been associated with the lowest ICERs of the biological DMARDs compared with conventional DMARDs. Rituximab was also associated with the lowest ICERs of the biological DMARDs in the Assessment Group's scenario analysis that assumed a poorer response to conventional DMARDs than was assumed in the base case. In addition, the Committee considered it was appropriate to incorporate continuation rules, and noted the ICERs from a sensitivity analysis carried out by the Assessment Group that included the assumptions regarding continuation rules from the model submitted by Roche. This analysis also showed that rituximab had the most favourable ICER among the technologies, with the other drugs being either dominated or extendedly dominated. Taking these results into account, as well as the estimates from the manufacturers' economic models, the Committee considered that the most plausible ICER for rituximab compared with conventional DMARDs would be in the lower end of the range of £20,000 to £30,000 per QALY gained. The Committee concluded that rituximab could be considered a cost-effective use of NHS resources. The Committee recognised that rituximab treatment was not provided at regular intervals, but instead that people were retreated when treatment was required, which could mean some loss of response between infusions. However, the Committee noted the testimony from clinical specialists that they would aim to treat before disease flared. The Committee concluded that treatment could only be considered cost effective if an adequate response could be maintained following retreatment with a dosing interval of at least 6 months.

4.3.25

The Committee discussed the cost effectiveness of the TNF inhibitors. The Committee understood that in the absence of robust data on the clinical effectiveness of the TNF inhibitors, the ICERs were uncertain. The Committee noted that the analyses by Pfizer comparing the TNF inhibitors with rituximab produced ICERs of £19,077 per QALY gained (with primary non-response to the first TNF inhibitor) and £16,225 per QALY gained (with secondary loss of response). However, as Pfizer did not include an economic model in their submission, these results could not be validated. The Committee noted that most of the other models, including the Assessment Group's model, showed that in comparison with rituximab, either the ICERs for the TNF inhibitors were very high (above £80,000 per QALY gained) or the TNF inhibitors were dominated by rituximab (that is, rituximab was both more effective and less costly). In the Abbott model the ICER for TNF inhibitors compared with rituximab was £16,000 per QALY gained when rituximab was given every 6 months. However, the Committee did not accept that the retreatment interval with rituximab would on average be 6 months. The Committee was mindful of the differences in the analyses of clinical effectiveness used in the different economic models, and considered the comments from consultees on the Birmingham Rheumatoid Arthritis Model. The Committee considered the manufacturers' models and agreed that taking into account all data together did not alter the conclusion drawn from the Birmingham Rheumatoid Arthritis Model. The Committee was not persuaded that the current evidence available and the cost-effectiveness analyses presented could support a decision to recommend adalimumab, etanercept or infliximab as an alternative to rituximab after the failure of a previous TNF inhibitor as an appropriate use of NHS resources.
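The relationship between an ICER and dominance, as used in the paragraph above, can be sketched as follows. This is a generic illustration of the standard definition (incremental cost divided by incremental QALYs), not a reconstruction of any submitted model; the function name `compare_with_comparator` and all cost and QALY values are hypothetical and are not taken from the appraisal.

```python
from typing import Union


def compare_with_comparator(cost: float, qalys: float,
                            comparator_cost: float,
                            comparator_qalys: float) -> Union[float, str]:
    """Return the ICER (cost per QALY gained) of a treatment against a
    comparator, or a label when no meaningful ratio exists."""
    delta_cost = cost - comparator_cost
    delta_qalys = qalys - comparator_qalys
    if delta_cost > 0 and delta_qalys < 0:
        # The comparator is both more effective and less costly.
        return "dominated"
    if delta_cost < 0 and delta_qalys > 0:
        # The treatment is both more effective and less costly.
        return "dominant"
    if delta_qalys == 0:
        return "undefined"  # no QALY difference, so no ratio
    return delta_cost / delta_qalys


# Hypothetical inputs, chosen only to exercise each branch.
print(compare_with_comparator(30_000, 5.0, 20_000, 4.5))  # ratio case
print(compare_with_comparator(30_000, 4.0, 20_000, 4.5))  # dominated case
```

In the first call the treatment costs more and yields more QALYs, so a ratio is returned; in the second it costs more and yields fewer QALYs, so it is labelled as dominated and no ICER is reported.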

4.3.26

The Committee discussed the cost effectiveness of abatacept. The Committee considered that most of the economic models showed that in comparison with rituximab, the ICERs for abatacept were either very high (above £100,000 per QALY gained in the Assessment Group base case) or abatacept was dominated by rituximab (that is, rituximab was both more effective and less costly). The analysis by Bristol-Myers Squibb that produced an ICER of £20,438 per QALY gained assumed an improvement in HAQ of 0.013 per year during treatment with abatacept. When the same rate of HAQ score increase was assumed for abatacept as for the other biological DMARDs in the base case (a worsening of 0.012 per year), the ICER increased to £40,534 per QALY gained. The Committee therefore concluded that abatacept when used as an alternative to rituximab after the failure of a previous TNF inhibitor would not be a cost-effective use of NHS resources.

4.3.27

The Committee was aware that for some people rituximab treatment may not be suitable because of a contraindication to rituximab or methotrexate, or that rituximab or methotrexate may need to be withdrawn because of an adverse event. The Committee was mindful that it had not been presented with any clinical evidence regarding the use of adalimumab, etanercept, infliximab or abatacept for patients for whom rituximab was contraindicated or not tolerated, and that any estimates of their effectiveness in this population were subject to additional uncertainty. However, it acknowledged that for people unable to take rituximab or methotrexate because of adverse events or contraindications, the appropriate comparator was conventional DMARDs. It considered the ICERs in the Birmingham Rheumatoid Arthritis Model that compared the TNF inhibitors with conventional DMARDs, noting that with the addition of continuation rules, the ICERs were between £23,800 and £27,400 per QALY gained, and would be lower if a lower effect of conventional DMARDs was assumed. The Committee considered these ICERs in light of the respective ICERs from the manufacturers' models, which lay between £14,500 and £35,900 per QALY gained. The Committee concluded that, on balance, these ICERs were sufficiently low to compensate for the uncertainty about the effectiveness of these treatments to be accepted. The Committee considered the Assessment Group's ICER for abatacept (including the continuation rule) of £26,200 per QALY gained compared with conventional DMARDs and agreed that this would be lower if the model assumed a lower effect of conventional DMARDs. The Committee considered these ICERs and the respective ICERs for abatacept from the manufacturer's model. It noted that these lay between £21,500 and £44,700 per QALY gained. 
The Committee was persuaded that these data taken together could support a decision to recommend adalimumab, etanercept, infliximab or abatacept as treatment options to be used after the failure of a first TNF inhibitor for the treatment of people who cannot receive rituximab therapy because they have a contraindication to rituximab or who have had an adverse event on treatment with rituximab, as an appropriate use of NHS resources. The Committee recognised that for people who cannot receive rituximab therapy because they have a contraindication to methotrexate or where methotrexate is withdrawn because of an adverse event, this choice would be limited to adalimumab and etanercept, because infliximab and abatacept have marketing authorisations for use in combination with methotrexate.

4.3.28

The Committee was aware that for some people rituximab treatment may fail to provide an adequate response. The Committee was mindful that it had not been presented with any clinical or cost-effectiveness evidence regarding the use of adalimumab, etanercept, infliximab or abatacept for this group of people. The Committee considered that all estimates of relative clinical effectiveness were subject to uncertainty, and that in this group of patients the estimates were subject to further uncertainty because factors such as disease duration and prior treatment exposure may affect response to treatment (see section 4.3.7). Therefore, while accepting that the evidence submitted could be considered reflective of the group who had not received rituximab because of contraindications, or had not had an adequate trial because rituximab treatment was withdrawn because of an adverse event, the Committee was not persuaded that this evidence could be applied to the group of people for whom rituximab had failed to provide an adequate response. The Committee was therefore unable to make recommendations about the use of adalimumab, etanercept, infliximab or abatacept in people for whom rituximab has failed to provide an adequate response.

4.3.29

The Committee considered whether its recommendations were associated with any potential issues related to equality. The Committee was made aware that the use of the DAS28 would not be an appropriate tool for people with specific disabilities of the lower limbs and that the DAS44 would be a better tool to use for people with greater lower limb disease burden. The Committee agreed that it was important to allow clinicians to adjust the assessment of disease severity depending on the characteristics of the disease, and that the recommendations should reflect this. The Committee explored whether there were people, other than those for whom rituximab or methotrexate was contraindicated or who had an adverse event on treatment with rituximab or methotrexate, who were unable to have rituximab because of a specific additional disability or comorbidity. The Committee noted that no such group of people had been identified during scoping or in consultation. The Committee was aware that people with mobility problems or visual impairment may find travel to hospital onerous or inconvenient. However, the Committee concluded that it was not clear that travel to receive infusions one or two times per year was necessarily more onerous or inconvenient than the alternative of much more frequent injections. In any event, the Committee did not consider that the need to travel would make it impossible or unreasonably difficult for these people to obtain treatment with rituximab, and noted that they would need to travel to other hospital or healthcare appointments in relation to their condition. The Committee concluded that rituximab would still be the most appropriate treatment option, taking into account its cost-effectiveness data and the infrequent dosing interval, but that all reasonable steps should be taken to provide practical support and assistance to ensure access to treatment for this group of people.

4.3.30

The Committee noted a consultee's concerns about equity of access with other countries, but concluded that this concern did not pertain to any group protected by the equalities legislation, and that it would not be appropriate to address this as part of a technology appraisal. The Committee also noted a consultee's comment stating that the guidance may have a disproportional impact on patients who test seronegative, but agreed that its recommendations, not differentiating between groups of people, did not affect any group protected by the equalities legislation and that the issue of auto-antibody status had been addressed in detail.