3 Committee discussion
The appraisal committee considered evidence submitted by Bristol Myers Squibb, a review of this submission by the evidence review group (ERG), and responses from stakeholders. See the committee papers for full details of the evidence.
This review looks at data collected in the Cancer Drugs Fund to address uncertainties identified during the original appraisal. Further information about the original appraisal is in the committee papers. As a condition of the Cancer Drugs Fund funding and the managed access arrangement, the company was required to collect updated efficacy data from the CheckMate 214 study for people with intermediate- or poor-risk advanced or metastatic renal cell carcinoma. In addition, data was collected on the use of nivolumab with ipilimumab for intermediate- and poor-risk disease in the NHS through the Cancer Drugs Fund using the Systemic Anti-Cancer Therapy (SACT) dataset.
People with untreated intermediate- or poor-risk renal cell carcinoma would welcome a new treatment option
3.1 For intermediate- or poor-risk advanced renal cell carcinoma, tyrosine kinase inhibitors such as pazopanib, sunitinib, tivozanib and cabozantinib are current standard care in the NHS. They can cause adverse effects such as fatigue, hand and foot syndrome, and chronic diarrhoea, which can substantially affect quality of life. The committee agreed that people with intermediate- or poor-risk advanced renal cell carcinoma would welcome a new treatment option.
Sunitinib and pazopanib are the appropriate comparators, although other treatments are now routinely available
3.2 The committee was aware that the treatment pathway for untreated advanced renal cell carcinoma had changed since the original appraisal. The committee considered several pieces of NICE technology appraisal guidance on oral tyrosine kinase inhibitors, noting that NICE now recommends that people may be offered any of sunitinib, pazopanib, tivozanib or cabozantinib. Cabozantinib and tivozanib were not included in the original appraisal because of when these pieces of guidance were published. NICE recommended the combination of avelumab and axitinib for use within the Cancer Drugs Fund for this indication, but NICE does not consider this routine practice, so this combination cannot be considered a comparator. The NHS England clinical lead for the Cancer Drugs Fund reiterated a point he had made during the original appraisal: that nivolumab with ipilimumab is the first checkpoint inhibitor in untreated renal cell carcinoma and would likely displace tyrosine kinase inhibitors if recommended for routine use in the NHS. The committee concluded that pazopanib and sunitinib are the relevant comparators in this appraisal, but noted the potential shifting of lines of treatment in the treatment pathway.
3.3 The clinical experts in the original appraisal noted that, in practice, sunitinib and pazopanib are considered clinically equivalent. The committee recalled that previous appraisals also considered sunitinib and pazopanib to be clinically equivalent, and there was no new evidence to change this conclusion. The committee concluded that pazopanib and sunitinib can be considered clinically equivalent.
Updated CheckMate 214 data still shows that nivolumab with ipilimumab is more clinically effective than sunitinib
3.4 The main source of evidence came from CheckMate 214, an open-label randomised controlled trial, with sunitinib as the comparator. The co-primary endpoints of the trial were overall survival and progression-free survival, amended in the protocol by the company to include overall response rate. The trial stratified people by risk of death using a prognostic risk score, as defined by the International Metastatic Renal Cell Carcinoma Database Consortium scoring system. Risk level is determined using 6 risk factors including Karnofsky performance status score, time from original diagnosis, and levels of haemoglobin, serum calcium, neutrophils and platelets. The likelihood of survival is considered intermediate ('intermediate risk') when there are 1 or 2 risk factors, and poor ('poor risk') when there are 3 or more risk factors present. The population in the trial was wider than the population in the marketing authorisation, which limits treatment to people with disease that is intermediate or poor risk. The company stated that the trial had sufficient power to investigate clinical outcomes in a combined intermediate- or poor-risk group (n=847; 667 intermediate and 180 poor risk, respectively). The trial also included 249 people with favourable-risk disease, but this group was not included in the marketing authorisation. In the original appraisal, the company presented 2 interim data cuts with the most recent from August 2018, with a minimum of 30 months' follow up (referred to as the '30-month data cut'). For the review of this guidance, the company presented a further data cut from February 2021, reflecting a median follow up of 67.7 months and a minimum of 60 months' follow up (referred to as the '60-month data cut'). The updated evidence from the 60-month data cut showed improved overall survival that was consistent with the extrapolations in the original submission.
Median overall survival did not change from the 30-month to the 60-month data cut for sunitinib: 27 months (95% confidence interval [CI] 22 to 33 months for the 30-month data cut, and 22 to 34 months for the 60-month data cut). Median overall survival in the nivolumab with ipilimumab arm was not reached at 30 months (95% CI 36 months to not evaluable), but was 47 months at 60 months (95% CI 35 to 57). The hazard ratio between treatment arms for overall survival did not show a substantial change, from 0.66 (95% CI 0.54 to 0.80) at 30 months to 0.68 (95% CI 0.58 to 0.81) at 60 months, mostly associated with better than predicted survival of people in the sunitinib treatment arm. The committee concluded that the updated clinical evidence for nivolumab with ipilimumab closely matched the extrapolations from the original appraisal, showing that nivolumab with ipilimumab is more clinically effective than sunitinib.
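The risk grouping described in section 3.4 can be illustrated with a minimal sketch. It assumes that a count of 0 risk factors corresponds to favourable-risk disease, which is not stated explicitly above, and the function names are hypothetical.

```python
def imdc_risk_group(n_risk_factors: int) -> str:
    """Map a count of risk factors (0 to 6) to a risk group.

    Thresholds for intermediate (1 or 2) and poor (3 or more) follow
    section 3.4. Treating a count of 0 as favourable risk is an assumption.
    """
    if not 0 <= n_risk_factors <= 6:
        raise ValueError("the scoring system uses 6 risk factors, so the count must be 0 to 6")
    if n_risk_factors == 0:
        return "favourable"
    if n_risk_factors <= 2:
        return "intermediate"
    return "poor"


def within_marketing_authorisation(n_risk_factors: int) -> bool:
    """Only intermediate- or poor-risk disease is covered (section 3.4)."""
    return imdc_risk_group(n_risk_factors) in {"intermediate", "poor"}
```

For example, `imdc_risk_group(2)` gives the intermediate group, and `within_marketing_authorisation(0)` is false because favourable-risk disease is outside the marketing authorisation.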
3.5 Public Health England submitted data from the Systemic Anti-Cancer Therapy (SACT) dataset, including data from 814 people who had nivolumab with ipilimumab through the Cancer Drugs Fund during the period of April 2019 to November 2020. The SACT data had a median follow up of 10.8 months, ranging from 5 to 24.7 months. The committee noted that naively comparing the data from SACT with data from CheckMate 214 showed worse survival for people in the NHS compared with participants in the trial. The committee considered that differences in characteristics of the patients in the NHS and in the trial likely accounted for this, notably that the SACT data include a higher proportion of people with poor-risk disease (see section 3.7). The committee considered that the distribution of characteristics in the SACT dataset was likely to better represent people who had nivolumab with ipilimumab in NHS clinical practice. However, the SACT dataset provided no comparative evidence because it included only people who had nivolumab with ipilimumab. The committee concluded that the relative effect of nivolumab with ipilimumab compared with sunitinib from CheckMate 214 was the most appropriate source of evidence on the clinical efficacy of nivolumab with ipilimumab, and for economic modelling.
Nivolumab with ipilimumab appears to be more effective in poor-risk than in intermediate-risk disease
3.6 In the original appraisal, the company had presented estimates of relative effectiveness separated by risk status for the intermediate- and poor-risk subgroups. The committee visually inspected the Kaplan–Meier curves and concluded that the curves suggested that treatment was more effective in poor-risk disease than in intermediate-risk disease. The committee was aware this could represent poor-risk disease responding poorly to sunitinib rather than responding particularly well to nivolumab with ipilimumab. The ERG had requested CheckMate 214 outcomes from the 60-month data cut stratified by risk status, but the company had not provided this, noting that the trial was 'not powered' for outcomes by subgroups. The committee recognised that if the study had more participants, the trial may have shown significant effect modification by risk status. The company considered these estimates by subgroup to be confidential so they cannot be presented here. The committee was also aware that the marketing authorisation does not include favourable-risk disease. The ERG noted that in the SACT data, overall survival appears to be lower for people with poor-risk disease. Before the meeting, the clinical experts noted that a recent post hoc analysis from CheckMate 214 showed that people with sarcomatoid disease may have a particular benefit with nivolumab with ipilimumab over sunitinib at 42 months' follow up. A clinical expert noted that sarcomatoid histology has a higher tumour mutational burden and may benefit from immunotherapy to a greater extent. They also noted that these people are likely to have poor-risk disease. The committee noted that the company chose not to present any analyses stratified by risk status from the 60-month data cut. The committee nonetheless concluded that there is likely to be a difference in relative effectiveness by risk, but that it would have preferred to see outcomes by subgroup from the updated trial data cut.
The SACT dataset should inform the proportion of people with poor-risk disease in the economic model
3.7 CheckMate 214 included 21% of people with a high risk of death (poor-risk disease), and the clinical experts in the previous appraisal considered that this proportion was likely larger in NHS practice. If intermediate- and poor-risk disease respond differently to treatment (see section 3.6), the absolute treatment effect in the combined group would depend on the distribution of baseline risk, which in turn would affect cost effectiveness. The SACT data, which was expected to inform the true proportion of people with intermediate- and poor-risk renal cell carcinoma in the NHS, included 35% of people with poor-risk disease (with the remainder having intermediate-risk disease). The company considered that the SACT dataset included more people with poor-risk disease than it might otherwise have done because of the COVID-19 pandemic. It presented evidence from several audits reporting delayed referrals, consultant appointments and diagnosis during the pandemic, and considered that this explained the larger proportion of people with poor-risk disease in the SACT dataset. The ERG considered that the SACT dataset better represented NHS patients who would have nivolumab with ipilimumab than the trial. It noted there was limited evidence that the pandemic influenced risk levels in SACT, noting that 87% of the included people had an Eastern Cooperative Oncology Group performance status of 0 or 1. The NHS England clinical lead for the Cancer Drugs Fund did not consider that the pandemic affected the proportion of people with poor-risk disease in the NHS. The committee concluded that the SACT dataset should inform the proportion of people with poor-risk disease in the economic model. However, it would have preferred to see the proportions of each risk group from the SACT dataset used to weight the effectiveness estimates of each risk group using the CheckMate 214 trial outcomes.
It considered that this analysis would likely reduce the cost-effectiveness estimates because the incremental benefit over sunitinib or pazopanib for poor-risk disease is likely to be higher than for intermediate-risk disease.
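The weighting the committee would have preferred in section 3.7 can be sketched as follows. The subgroup incremental costs and QALYs are hypothetical placeholders in arbitrary units, because the real subgroup estimates are confidential; only the proportions of poor-risk disease (21% in CheckMate 214 and 35% in SACT) come from the text. The function name is also an assumption.

```python
def weighted_icer(prop_poor: float, subgroup: dict) -> float:
    """Combine subgroup results using the share of poor-risk disease.

    Incremental costs and QALYs are averaged separately across the
    subgroups, then divided to give the ICER for the combined group.
    """
    weights = {"poor": prop_poor, "intermediate": 1.0 - prop_poor}
    inc_cost = sum(weights[g] * subgroup[g]["inc_cost"] for g in weights)
    inc_qaly = sum(weights[g] * subgroup[g]["inc_qaly"] for g in weights)
    return inc_cost / inc_qaly


# HYPOTHETICAL subgroup results (arbitrary units), chosen only so that the
# incremental benefit is larger in poor-risk disease, as section 3.6 suggests.
SUBGROUP = {
    "intermediate": {"inc_cost": 100.0, "inc_qaly": 1.0},
    "poor": {"inc_cost": 100.0, "inc_qaly": 2.0},
}

icer_trial_mix = weighted_icer(0.21, SUBGROUP)  # CheckMate 214 proportion
icer_sact_mix = weighted_icer(0.35, SUBGROUP)   # SACT proportion
```

Under these assumed inputs, moving from the trial proportion to the SACT proportion lowers the combined ICER, which is the direction of effect the committee expected.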
Treatment crossover may favour sunitinib and would have a minimal effect on the cost-effectiveness results
3.8 In the original guidance, for the 30-month data cut, the company had amended the trial protocol to allow people randomised to sunitinib to switch ('crossover') to nivolumab with ipilimumab, and had not adjusted the trial results for this. While acknowledging that the crossover likely biased the hazard ratio towards 1 (that is, towards no difference between arms), the committee wished to see long-term survival predictions for nivolumab with ipilimumab based on further data collection from CheckMate 214, adjusted for treatment switching. In the company's new submission, it did not adjust for treatment switching because it considered few people had switched treatments. The ERG noted that the unadjusted results likely favoured the comparator. The committee would have preferred to see adjusted results, but acknowledged the likely impact on the cost-effectiveness estimates was minimal given the relatively small number of people switching treatment. The impact of adjusting for treatment switching is uncertain, but it may favour nivolumab with ipilimumab.
3.9 In the original appraisal, the committee concluded that second-line treatments in CheckMate 214 did not reflect NHS clinical practice. For example, some people randomised to nivolumab with ipilimumab in CheckMate 214 received immunotherapies again at later lines of treatment. The committee preferred an analysis that included both costs and benefits of treatments used in second-line treatment and beyond that reflected NHS clinical practice. During the original appraisal it considered that SACT data could inform this. In its new submission to NICE, the company used second-line and beyond treatment data from CheckMate 214 because of differences in follow up between the trial and SACT (minimum 60 months compared with minimum 5 months, respectively). It also provided a scenario using the proportions of second-line treatments from the SACT dataset. The ERG agreed that using treatments from the longer CheckMate 214 trial was appropriate. The clinical experts considered that the SACT treatments best matched NHS clinical practice. They noted that after sunitinib, people will often have either nivolumab or cabozantinib; whereas after nivolumab with ipilimumab, people will have a tyrosine kinase inhibitor – usually cabozantinib, but sometimes sunitinib, tivozanib, or lenvatinib with everolimus (see section 3.2). The committee noted that the NHS would not offer immunotherapy twice, and heard from the clinical experts that there is little evidence that a second round of immunotherapy works. The committee was concerned about using CheckMate 214 as a source of data for second-line and beyond treatments if any of these treatments not used in the NHS influenced survival outcomes. It considered that the true cost-effectiveness results may be somewhere between those based on trial data and those based on SACT data. 
It also noted that removing additional costs of nivolumab monotherapy after treatment with nivolumab with ipilimumab in the CheckMate 214 trial (because immunotherapy would likely not be offered twice) would reduce the cost-effectiveness estimates. The committee preferred to use evidence on effectiveness and costs from the same source, and concluded that it was appropriate to use CheckMate 214 data for second-line and beyond treatments.
3.10 In the original appraisal, the clinical experts explained that in their experience, nivolumab with ipilimumab is well tolerated and has a preferable adverse event profile compared with tyrosine kinase inhibitors. The committee acknowledged that nivolumab and ipilimumab are associated with some rare but unpleasant and potentially serious adverse events that are specific to immunotherapy. The clinical experts stated that clinicians are experienced in recognising and managing these serious adverse events. The committee maintained its conclusion that nivolumab with ipilimumab is well tolerated compared with tyrosine kinase inhibitors.
The company's model structure matches the committee's preferred assumptions from the original appraisal
3.11 The company used a partitioned survival model to estimate the cost effectiveness of nivolumab plus ipilimumab compared with sunitinib and pazopanib. The model included 6 health states: progression-free on treatment, progression-free off treatment, post-progression on treatment, post-progression off treatment, terminal care, and dead. The probability of being in a given health state was defined by the area under the curves for progression-free survival, overall survival, and their difference. The cycle length was 1 week and the time horizon was 40 years. The committee noted that the company's original economic model included an 'immunological effect' that resulted in people taking nivolumab with ipilimumab being effectively cured. It had previously concluded that this was not appropriately implemented and therefore the company did not explicitly include it in its resubmission; the ERG considered this approach appropriate. The committee concluded that the model structure was acceptable and closely matched its preferred assumptions from the original appraisal.
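The way a partitioned survival model derives state membership from survival curves can be shown with a minimal sketch. It is not the company's model: it collapses the 6 health states into 3 (splitting on and off treatment would need a time-on-treatment curve), and it uses exponential curves with hypothetical medians purely for illustration. Only the weekly cycle and the 40-year time horizon come from section 3.11, and 52 weeks per year is an approximation.

```python
import math

HORIZON_YEARS = 40
N_CYCLES = HORIZON_YEARS * 52  # weekly cycles, approximating a year as 52 weeks


def exp_survival(t_weeks: float, median_weeks: float) -> float:
    """Exponential survival curve with the given median (illustrative only)."""
    rate = math.log(2) / median_weeks
    return math.exp(-rate * t_weeks)


def state_occupancy(t_weeks: float, pfs_median: float, os_median: float) -> dict:
    """Proportion of the cohort in each state at time t.

    Progression-free is read from the PFS curve, post-progression is the
    gap between the OS and PFS curves, and dead is 1 minus OS.
    """
    os_ = exp_survival(t_weeks, os_median)
    pfs = min(exp_survival(t_weeks, pfs_median), os_)  # PFS cannot exceed OS
    return {
        "progression_free": pfs,
        "post_progression": os_ - pfs,
        "dead": 1.0 - os_,
    }


# HYPOTHETICAL medians in weeks, not taken from CheckMate 214.
trace = [state_occupancy(t, pfs_median=50, os_median=200) for t in range(N_CYCLES + 1)]
```

Summing each state's occupancy over the cycles gives the time spent in that state, to which costs and utilities are then attached.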
3.12 The committee recognised that the trial evidence did not span the whole time horizon of the model. The company explored the most appropriate hazard function to extrapolate overall survival for each of the treatments using the updated 60-month data from CheckMate 214. The committee originally concluded that both the company and the ERG's preferred extrapolations for overall survival were clinically plausible (log-normal and Kaplan–Meier with exponential extrapolation, respectively), but the absence of long-term data prevented it from determining which was most appropriate. It noted that the log-normal distribution predicted that a small proportion of people, not explicitly modelled as having been cured, would effectively be cured. Using the updated 60-month CheckMate 214 data, the company again considered that the log-normal curve was the most appropriate to extrapolate overall survival, based on goodness-of-fit to the data and clinical validation of predicted risk of death over time. The ERG was satisfied that the company used appropriate methods to select the model, but questioned the plausibility of its projections for overall survival in the long term. The ERG highlighted that a large proportion of gains in both life years and quality-adjusted life years (QALYs) for nivolumab plus ipilimumab compared with sunitinib in the model occurred in the extrapolated period. The ERG emphasised that a large proportion of these patients remained in the progression-free state. The committee noted that the company's scenario analyses using alternative extrapolations increased the cost-effectiveness estimates, demonstrating the importance of the choice of extrapolation method. The committee considered that the updated data supported the company's choice of the log-normal hazard function and that a proportion of people in CheckMate 214 would effectively be 'cured' with immunotherapy. 
The committee concluded that the extrapolations of overall survival were appropriate but, to explore uncertainty, it considered sensitivity analyses using other assumptions around extrapolating how the rate of death changes over time in its decision-making.
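Why the choice of extrapolation matters can be seen from how the rate of death behaves under each distribution. The sketch below compares a log-normal hazard, which eventually declines over time, with an exponential hazard, which is constant. The parameter values are hypothetical and are not the fitted values from either the company's or the ERG's analysis.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_hazard(t: float, mu: float, sigma: float) -> float:
    """Hazard h(t) = f(t) / S(t) for a log-normal survival time, t > 0."""
    z = (math.log(t) - mu) / sigma
    density = math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))
    survival = 1.0 - normal_cdf(z)
    return density / survival


def exponential_hazard(t: float, rate: float) -> float:
    """The exponential hazard is the same at every time point."""
    return rate


# HYPOTHETICAL parameters, with time in months.
MU, SIGMA, RATE = 4.0, 1.5, 0.015

early = lognormal_hazard(12.0, MU, SIGMA)
late = lognormal_hazard(120.0, MU, SIGMA)
```

With these assumed parameters the log-normal hazard at 120 months is lower than at 12 months, so a growing share of the cohort faces a low risk of death in the long term, whereas the exponential extrapolation keeps the risk fixed.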
The company and ERG assumed that death rates for nivolumab with ipilimumab and sunitinib or pazopanib would equalise at different times
3.13 In its updated base case, the company assumed that nivolumab with ipilimumab would lead to a lower death rate than sunitinib or pazopanib, until the point at which the curve extrapolating overall survival for nivolumab with ipilimumab equalled the general population mortality curve, approximately 21 years from the start of treatment. The ERG was concerned that this approach was not supported by the CheckMate 214 trial data, which showed higher death rates for nivolumab with ipilimumab than for sunitinib at several time points. The ERG considered that the annualised hazard rates for each of the treatments equalised at approximately 4.5 years. Moreover, the ERG noted that clinical advice suggests death rates decrease over time, which the CheckMate 214 trial did not show for nivolumab with ipilimumab. The ERG considered that second-line treatments may have equalised the hazards for death between treatments, notably because a high proportion of people treated with sunitinib then received nivolumab monotherapy as a second-line or later treatment. The ERG provided 2 scenarios in which the hazards equalised at 4.5 years: 1 in which the death rate for sunitinib or pazopanib was set to the rate for nivolumab with ipilimumab, and 1 in which the death rate for nivolumab with ipilimumab was set to the rate for sunitinib or pazopanib. Together, these showed a range of potential effects of equalised hazards for death.
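The structure of the 2 equalisation scenarios in section 3.13 can be sketched as below. This is an illustration of the general idea and not the ERG's method: the constant hazards are hypothetical, and only the 4.5-year cut-off and the direction of each scenario come from the text.

```python
import math


def survival_curve(hazard_fn, years: float, step: float = 0.5) -> list:
    """Survival at each time step, treating the hazard as constant within a step."""
    surv, cumulative = [1.0], 0.0
    for i in range(int(round(years / step))):
        cumulative += hazard_fn(i * step) * step
        surv.append(math.exp(-cumulative))
    return surv


def equalise(own_hazard, other_hazard, cutoff_years: float):
    """Use the arm's own hazard before the cut-off and the other arm's after it."""
    return lambda t: own_hazard(t) if t < cutoff_years else other_hazard(t)


# HYPOTHETICAL annual death hazards, for illustration only.
def hazard_nivo_ipi(t: float) -> float:
    return 0.15


def hazard_tki(t: float) -> float:
    return 0.25


CUTOFF = 4.5  # years, from the ERG scenarios

# Scenario A: comparator death rate set to the nivolumab with ipilimumab rate.
nivo_a = survival_curve(hazard_nivo_ipi, 40)
tki_a = survival_curve(equalise(hazard_tki, hazard_nivo_ipi, CUTOFF), 40)

# Scenario B: nivolumab with ipilimumab death rate set to the comparator rate.
nivo_b = survival_curve(equalise(hazard_nivo_ipi, hazard_tki, CUTOFF), 40)
tki_b = survival_curve(hazard_tki, 40)
```

In both scenarios the ratio of survival between arms stops changing after the cut-off, so any survival advantage is accrued only during the first 4.5 years and then carried forward.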
3.14 The company did not consider that the ERG's analysis of hazard rates used a recognised methodology, and countered the ERG's scenarios with its own evidence:
It submitted data on people in the CheckMate 214 trial who were alive after 5 years, demonstrating that a larger proportion of people treated with nivolumab with ipilimumab were progression free than those in the sunitinib arm. It considered that this was evidence of a sustained response. However, the ERG noted that some people in the sunitinib arm also had a sustained response for 5 years. The ERG further noted the high proportion of people in the sunitinib arm who later had nivolumab monotherapy.
The company considered that the increased hazard of death at the end of the observed trial period in the nivolumab with ipilimumab arm was an artefact. To support this conclusion, it submitted smoothed hazard plots based on datasets truncated at different months within the trial duration. The ERG considered that the smoothed hazard plots made it difficult to see changes in hazards that were occurring over time. It also felt that nivolumab monotherapy would be more likely to have a different effect after sunitinib treatment than other second-line treatments would have after nivolumab with ipilimumab, which could result in a convergence of hazards.
The death rates are likely to equalise somewhere between the company and ERG's base case assumptions
3.15 The committee considered there was substantial uncertainty in the rates of death at the end of the trial and in the extrapolations. It would have liked to see plots of the hazard ratio over time implied by the survival extrapolations used in the economic model, and further sensitivity analysis that assumed different effects of gradual convergence of the death hazards. It preferred the smoothed hazard plots to demonstrate hazards observed during the trial, but recognised that the ERG's analysis of convergence of death hazards at 4.5 years may represent the earliest plausible estimate of the time point of convergence. It concluded that the death hazards in the 2 arms were likely to equalise at some point between 4.5 years (the ERG scenarios) and 21 years (the company's base case). The committee concluded that it was appropriate to consider incremental cost-effectiveness ratios (ICERs) resulting from this range.
It is not appropriate to assume different utilities based on treatment arm for the entire time horizon, but this minimally impacts the ICERs
3.16 In the original appraisal, the committee considered that the quality of life estimates should reflect whether disease had progressed, which treatment a person had, and being on or off treatment. The company added utilities by progression status in its new model. The ERG noted that people in the new model were assumed to have different utilities depending on the treatment, even after stopping treatment, and for the remainder of the time horizon. The ERG considered this unjustified, noting that utility values are likely to equalise as people receive further treatments. The ERG provided a scenario using the same utility values for health states in both treatment arms over the modelled period. A patient expert noted that she did not feel very different on or off treatment or before or after progression, apart from when she experienced a side effect. A clinical expert noted that treatment-related adverse events may drive utility when on treatment. The committee considered it was appropriate to consider disutility associated with treatment when on treatment, but that having different utilities when off treatment, particularly if stopping treatment because of an adverse event, was not appropriate for the whole modelled time horizon. Based on the scenario provided by the ERG, the committee concluded that these different utility values are likely to have minimal impact on the ICERs.
The most likely cost-effectiveness estimate is within what NICE considers an acceptable use of NHS resources
3.17 NICE's guide to the methods of technology appraisal notes that above a most plausible ICER of £20,000 per QALY gained, judgements about the acceptability of a technology as an effective use of NHS resources will take into account the degree of certainty around the ICER. The committee concluded that the true ICERs for nivolumab with ipilimumab compared with sunitinib or pazopanib may lie between those of the company's base case and the ERG scenarios (see section 3.15). The cost-effectiveness results are commercial in confidence and cannot be reported here. The committee noted that most of the ICERs were towards the higher end of the range normally considered an acceptable use of NHS resources (£20,000 to £30,000 per QALY gained) but that there was some uncertainty around where the true ICER lies. It also noted that several preferred assumptions that had not been incorporated into the models were likely to decrease the ICERs:
increasing the proportion of people with poor-risk disease included in the model, reflecting the proportion in the SACT dataset (see section 3.7)
adjusting results for treatment crossover in CheckMate 214 from sunitinib to nivolumab with ipilimumab (see section 3.8)
removing additional costs of nivolumab monotherapy after nivolumab with ipilimumab, which does not represent clinical practice (see section 3.9).
Taking these factors into account, the committee concluded that nivolumab with ipilimumab was likely to be an acceptable use of NHS resources.
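For readers less familiar with the measure, an ICER is the incremental cost divided by the incremental QALYs. The sketch below shows that calculation and a comparison with the £20,000 to £30,000 per QALY gained range quoted in section 3.17. The inputs are hypothetical because the actual cost-effectiveness results are commercial in confidence, and the function names are assumptions.

```python
LOWER_THRESHOLD = 20_000.0  # pounds per QALY gained, from section 3.17
UPPER_THRESHOLD = 30_000.0  # pounds per QALY gained, from section 3.17


def icer(incremental_cost: float, incremental_qalys: float) -> float:
    """Incremental cost per QALY gained for a technology against a comparator."""
    if incremental_qalys <= 0:
        raise ValueError("this sketch only handles a positive QALY gain")
    return incremental_cost / incremental_qalys


def position_against_range(value: float) -> str:
    """Locate an ICER relative to the range normally considered acceptable."""
    if value < LOWER_THRESHOLD:
        return "below range"
    if value <= UPPER_THRESHOLD:
        return "within range"
    return "above range"


# HYPOTHETICAL inputs, not results from this appraisal.
example_icer = icer(incremental_cost=50_000.0, incremental_qalys=2.0)
example_position = position_against_range(example_icer)
```

Any change that raises the incremental QALYs or lowers the incremental cost, such as the 3 adjustments listed in section 3.17, moves the ICER down.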
3.19 The committee concluded that nivolumab with ipilimumab was more effective than treatments currently offered in the NHS for renal cell carcinoma and that the most plausible cost-effectiveness estimates were within what NICE considers an acceptable use of NHS resources. Therefore, nivolumab with ipilimumab is recommended for adults with untreated advanced renal cell carcinoma that is intermediate or poor risk as defined in the International Metastatic Renal Cell Carcinoma Database Consortium criteria.