3 The company's submission

The Appraisal Committee (section 7) considered evidence submitted by Novartis and a review of this submission by the Evidence Review Group (ERG; section 8).

Overview of clinical evidence

3.1 The company's systematic review identified 1 randomised controlled trial (RCT) of everolimus for preventing organ rejection after a liver transplant that it considered relevant to the decision problem: trial H2304. The company did not identify any non‑RCT evidence that was relevant to the decision problem.

3.2 H2304 was a 24‑month multicentre, open‑label randomised controlled trial that evaluated the efficacy and safety of everolimus in combination with reduced‑dose tacrolimus compared with standard‑dose tacrolimus. The trial was mostly conducted in the US with a limited number of UK patients. It included 719 people aged 18–70 years who had a primary liver transplant and had started an immunosuppressive regimen, containing tacrolimus and corticosteroids, 3–7 days after the transplant. Patients were randomised at 30 days (±5 days) after the transplant, to 1 of the following treatment arms:

  • Arm 1: everolimus with tacrolimus elimination, in which tacrolimus was completely withdrawn by the end of month 4 after the transplant. This treatment arm was stopped early due to a higher rate of acute rejection and treatment discontinuation and was excluded from further discussion in the company submission.

  • Arm 2: everolimus with reduced‑dose tacrolimus, in which everolimus was started at a daily dose of 2.0 mg. The dose was targeted to maintain a whole blood trough level of 3–8 ng/ml. After everolimus whole blood trough levels were confirmed to be in the target range, the dose of tacrolimus was tapered to achieve a target whole blood trough level of 3–5 ng/ml by 3 weeks after randomisation and continuing for the remainder of the study.

  • Arm 3: standard‑dose tacrolimus alone, in which tacrolimus trough levels were targeted to be maintained at 8–12 ng/ml until month 4 and then tapered to a target whole blood trough level of 6–10 ng/ml for the remainder of the study.

3.3 Prednisolone was taken at a minimum dose of 5 mg per day for at least 6 months. Before randomisation, 70% of people in both treatment groups were having mycophenolate mofetil but this was discontinued at randomisation according to the protocol. People having azathioprine or sirolimus were excluded from the study. Baseline characteristics appeared to be similar between the treatment arms. The proportion of people in the trial with hepatitis C virus was 31.8% in the everolimus arm and 31.3% in the standard‑dose tacrolimus arm. For hepatocellular carcinoma, the proportions were 17.1% and 14.4% respectively.

3.4 The inclusion criterion for baseline estimated glomerular filtration rate (eGFR, a measure of renal function) was ≥30 ml/min/1.73 m². Randomised patients had a mean eGFR of 81 ml/min/1.73 m², which, according to the company's clinical expert, is higher than the eGFR levels typically observed in patients in clinical practice in the UK (usually in the range of 50–65 ml/min/1.73 m² at the time of liver transplant).

3.5 The primary outcome was a composite of treated biopsy‑proven acute rejection (tBPAR), graft loss or death at 12 months after transplantation (excluding events before randomisation). This was presented as the Kaplan–Meier incidence rate, with the difference between treatment groups reported with a 97.5% confidence interval. Secondary outcomes included graft loss, death, number of acute graft rejections and change in renal function measured by eGFR. No patient‑related outcomes such as health‑related quality of life were measured in the trial.

3.6 Statistical analysis in H2304 was designed to show the non‑inferiority of everolimus with reduced‑dose tacrolimus compared with standard‑dose tacrolimus alone for the composite outcome of tBPAR, graft loss, or death at 12 months after transplantation. A pre‑determined non‑inferiority margin of 12% was used in the analysis for the primary outcome based on a p value of less than 0.001. Non‑inferiority was demonstrated if the upper limit of the 97.5% confidence interval for the difference between the 2 groups was below 12%. For the composite outcome of graft loss or death, a non‑inferiority margin of 10% was used.

ERG comments on the clinical evidence

3.7 The ERG considered that all studies relevant to the decision problem were included in the company's submission. The ERG noted that the clinical effectiveness of everolimus relied upon evidence drawn from the H2304 trial, which it considered was of good quality. However, it considered that the efficacy endpoints used in the trial might not be the most appropriate ones. Clinical opinion sought by the ERG explained that, although the number of acute rejections is a relevant endpoint, these are common and easily treated and long‑term survival is a more appropriate outcome for evaluating the effectiveness of immunosuppressive therapies. The ERG also noted the lack of a health questionnaire to directly capture patients' health‑related quality of life.

3.8 The ERG highlighted that the average whole blood trough levels of tacrolimus were higher than those initially planned for all arms of the trial and that the reduced‑dose tacrolimus group showed trough levels above 5 ng/ml throughout the 12 months. Clinical advisers to the ERG explained that a standard target blood level for a tacrolimus regimen in the UK is 6–8 ng/ml until month 1, just above 6 ng/ml until month 4 and between 5 and 6 ng/ml until the end of the first year. The ERG therefore considered that the reduced tacrolimus blood trough levels in H2304 were equivalent to the standard target blood trough levels of tacrolimus in UK practice.

3.9 The ERG considered that the company's overall approach to the statistical analysis of H2304 was generally sound. It highlighted that in non‑inferiority trials, the choice of the non‑inferiority margin is crucial. However, the ERG commented that it could not find a justification for the non‑inferiority margin used because the company did not explain its decision in the submission.

Clinical trial results

3.10 The company submission included results for the intention‑to‑treat (ITT) population from H2304 for 12 and 24 months follow‑up. For the primary composite efficacy endpoint of tBPAR, graft loss or death, everolimus with reduced‑dose tacrolimus was statistically non‑inferior to standard‑dose tacrolimus alone because the upper limit of the confidence interval [CI] was below the 12% non‑inferiority margin for this outcome and the p value was reported to be <0.001. At 12 months, the Kaplan–Meier survival probability was 93.3% compared with 90.3% (97.5% CI for the difference −8.7 to 2.6; p value for non‑inferiority <0.001). At 24 months the Kaplan–Meier survival probability was 89.7% compared with 87.5% (97.5% CI for the difference −8.8 to 4.4).
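The decision rule described for the primary endpoint can be illustrated with a minimal sketch. It assumes only what is stated above: non‑inferiority is concluded when the upper limit of the 97.5% confidence interval for the between‑group difference lies below the 12% margin. The function and variable names are hypothetical and are not taken from the company's analysis.

```python
def is_non_inferior(ci_upper: float, margin: float) -> bool:
    """Return True when the upper confidence limit lies strictly below the margin.

    Both values are expressed in percentage points of the between-group difference.
    """
    return ci_upper < margin


# Non-inferiority margin for the primary composite endpoint (tBPAR, graft loss or death)
PRIMARY_MARGIN = 12.0

# 97.5% confidence intervals for the difference reported in section 3.10
ci_12_months = (-8.7, 2.6)
ci_24_months = (-8.8, 4.4)

result_12 = is_non_inferior(ci_12_months[1], PRIMARY_MARGIN)
result_24 = is_non_inferior(ci_24_months[1], PRIMARY_MARGIN)
print(result_12, result_24)
```

Only the upper limit enters the rule, so the width of the interval matters for the conclusion only to the extent that it moves that limit towards the margin.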

3.11 For the composite outcome of graft loss or death, everolimus with reduced‑dose tacrolimus was statistically non‑inferior to standard‑dose tacrolimus alone (the upper limit of the confidence interval was below the 10% non‑inferiority margin for this outcome; the 12 month Kaplan–Meier survival probabilities are academic‑in‑confidence and cannot be presented). At 24 months after transplantation the Kaplan–Meier probabilities were 92.7% and 93.8%, respectively (97.5% CI for the difference −4.2 to 6.4).

3.12 There were statistically significantly fewer episodes of rejection at 12 months in the group randomised to everolimus with reduced‑dose tacrolimus compared with the standard‑dose tacrolimus group. At 12 months, the Kaplan–Meier tBPAR‑free probability was 96.3% in the everolimus group compared with 89.3% in the standard‑dose tacrolimus group (95% CI for the difference −11.6 to −2.5; p value for equivalence =0.003). At 24 months, the Kaplan–Meier probabilities were 93.9% and 86.7% respectively (95% CI for the difference −13.5 to −0.9; p value for equivalence =0.01).

3.13 Everolimus with reduced‑dose tacrolimus was associated with better preservation of renal function compared with standard‑dose tacrolimus alone at 12 and 24 months after liver transplantation. The difference in mean eGFR was 8.50 ml/min/1.73 m² (p<0.001, 97.5% CI 3.74 to 13.27) at month 12 and 6.66 ml/min/1.73 m² (p<0.0001, 97.5% CI 1.90 to 11.42) at month 24.

3.14 The company presented results for a number of predefined subgroup analyses including subgroups based on age, gender, family origin, eGFR, hepatitis C status, and cause of end‑stage liver disease. The company reported that the overall pattern of the event rates within subgroups was similar to that observed in the overall population.

ERG comments on the clinical trial results

3.15 The ERG noted that the composite primary outcome of the trial (tBPAR, graft loss and death) combined 2 outcomes in which the treatment effects of everolimus relative to the comparator worked in different directions. The ERG commented that while the results for the outcome of graft loss or death favoured the standard‑dose tacrolimus arm of the trial, the addition of tBPAR to the composite endpoint favoured everolimus with reduced‑dose tacrolimus. The ERG questioned the appropriateness of the composite endpoint.

3.16 The ERG highlighted that there were no statistically significant differences between the treatment groups in the rates of graft loss or death at 12 or 24 months after transplantation. However, there were statistically significantly fewer episodes of acute graft rejection in the everolimus group compared with the standard‑dose tacrolimus group. The ERG considered that the effectiveness of everolimus was largely dependent on the choice of clinical outcomes and whether these included acute rejection episodes or graft losses.

Adverse effects of treatment

3.17 The company did not identify any trials that reported adverse events for everolimus with reduced‑dose tacrolimus, apart from H2304. In H2304, lipid changes occurred more frequently in the everolimus group than in the standard‑dose tacrolimus group (at 24 months, 26.9% compared with 11.6%, risk difference 15.4 percentage points, 95% CI 8.5 to 22.2). At 24 months, the incidence of new onset diabetes mellitus was higher in the everolimus group (20.8% compared with 16.5%, risk difference 4.3 percentage points, 95% CI 2.6 to 11.2). The company did not report any statistically significant differences in the occurrence at 12 months of diarrhoea, headache, hypertension, wound healing, biliary leaks, new onset diabetes mellitus, infections or renal failure. There were fewer cases of renal failure in the everolimus group (15 compared with 21 in the standard‑dose tacrolimus group at 12 months) but no tests of significance were reported.

3.18 The company reported that 74.3% of people in H2304 tolerated everolimus with reduced‑dose tacrolimus up to month 12. No patients developed severe renal dysfunction (eGFR <30 ml/min/1.73 m²). The company commented that evidence from the network meta‑analysis indicated that the treatments were comparable in terms of safety.

Network meta‑analysis

3.19 In the absence of direct trial evidence, the company did a systematic review and network meta‑analysis. This estimated the relative effectiveness of everolimus with reduced‑dose tacrolimus for preventing organ rejection in people having a liver transplant in the maintenance phase, compared with:

  • mycophenolate mofetil (in combination with standard‑dose tacrolimus, reduced‑dose tacrolimus or standard‑dose ciclosporin)

  • azathioprine (in combination with standard‑dose tacrolimus or standard‑dose ciclosporin)

  • standard‑dose tacrolimus.

3.20 The company identified 22 RCTs that assessed the efficacy of these treatments, with or without corticosteroids. It reported that there was heterogeneity between studies that provided challenges to building a feasible network, such as: lack of reporting of characteristics that were potential treatment effect modifiers, variation in the definition of the tacrolimus and ciclosporin arms with respect to dosage, variations in definitions of outcomes, and variation in the duration or use of corticosteroid therapy. However, the company considered that the evidence was sufficient to create feasible networks for 13 of the 16 clinical endpoints extracted from the studies. These included overall survival, graft survival, tBPAR, and renal function.

3.21 The company also reported some discrepancies between H2304 and the other trials. For example, more patients in H2304 had diabetes and hypertension at baseline, and the standard‑dose tacrolimus group had better overall survival, graft survival and tBPAR‑free probabilities than the standard‑dose tacrolimus groups of the other trials. The company reported that, because H2304 was the only trial of everolimus with reduced‑dose tacrolimus, it was not possible to conduct subgroup analyses excluding this study. H2304 was therefore assumed to be comparable with the rest of the evidence.

3.22 The results of the network meta‑analysis were presented as a consistency model, in which direct and indirect evidence were assumed to be consistent for any 'closed loops' in the evidence network. An inconsistency model was also presented if data were available (that is, using direct evidence only). The company reported that all models were based on the NICE Decision Support Unit document 4 (inconsistency in networks of evidence based on randomised controlled trials, 2011) and that the parameters of the different models were estimated within a Bayesian framework using a Markov Chain Monte Carlo method as implemented in the WinBUGS/OpenBUGS software package. To assess heterogeneity in the treatment effects for a particular pair‑wise comparison caused by the treatment effect modifiers, both fixed effects and random effects were modelled.

3.23 For overall survival at 12 and 24 months after transplantation, the company reported that everolimus with reduced‑dose tacrolimus was expected to be comparable to all other treatments. It was ranked fifth and fourth of the interventions at 12 and 24 months, respectively. At 24 months, overall survival was 85.3% (95% credible interval 72.5 to 92.9), compared with:

  • mycophenolate mofetil with ciclosporin (overall survival 88.9%, 95% credible interval 43.6 to 98.8)

  • mycophenolate mofetil with standard‑dose tacrolimus (overall survival 88.4%, 95% credible interval 83.8 to 92.0)

  • standard‑dose tacrolimus (overall survival 87.4%, 95% credible interval 84.1 to 90.2)

  • azathioprine with standard‑dose tacrolimus (overall survival 85.8%, 95% credible interval 76.7 to 91.8).

3.24 For graft survival at 12 and 24 months after transplantation, the company reported that everolimus with reduced‑dose tacrolimus was expected to be comparable to all other treatments. It was ranked fifth and fourth of the interventions at 12 and 24 months, respectively. At 24 months, graft survival was 79.7% (95% credible interval 67.3 to 88.4), compared with:

  • mycophenolate mofetil with ciclosporin (graft survival 86.0%, 95% credible interval 49.4 to 97.5)

  • mycophenolate mofetil with standard‑dose tacrolimus (graft survival 85.3%, 95% credible interval 80.0 to 89.6)

  • standard‑dose tacrolimus (graft survival 82.6%, 95% credible interval 79.1 to 85.8)

  • azathioprine with standard‑dose tacrolimus (graft survival 80.8%, 95% credible interval 70.6 to 88.2).

3.25 For the outcome of being tBPAR‑free at 3, 6 and 12 months after transplantation, everolimus with reduced‑dose tacrolimus was ranked as the best therapeutic option of all the interventions. At 12 months, the absolute estimate was 89.5% (95% credible interval 82.3 to 94.4), compared with:

  • mycophenolate mofetil with standard‑dose tacrolimus (tBPAR‑free 83.4%, 95% credible interval 75.8 to 88.9)

  • mycophenolate mofetil with reduced‑dose tacrolimus (tBPAR‑free 80.6%, 95% credible interval 74.3 to 85.9)

  • standard‑dose tacrolimus (tBPAR‑free 76.8%, 95% credible interval 72.0 to 81.2)

  • azathioprine with standard‑dose tacrolimus (tBPAR‑free 75.6%, 95% credible interval 65.3 to 83.7)

  • azathioprine with ciclosporin (tBPAR‑free 72.3%, 95% credible interval 55.4 to 84.6).

3.26 For renal function at 12 months after transplantation (reported in the studies as eGFR or estimated creatinine clearance), azathioprine with ciclosporin led to the lowest decline, followed by everolimus with reduced‑dose tacrolimus. At 12 months, the absolute estimate for everolimus with reduced‑dose tacrolimus was −23.1 (95% credible interval −27.4 to −18.7), compared with:

  • azathioprine with ciclosporin (change in eGFR from baseline −14.5, 95% credible interval −24.2 to −4.9)

  • mycophenolate mofetil with reduced‑dose tacrolimus (change in eGFR from baseline −28.2, 95% credible interval −32.3 to −24.1)

  • standard‑dose tacrolimus (change in eGFR from baseline −31.6, 95% credible interval −32.3 to −30.9).

3.27 The company did scenario analyses that involved removing specific trials from the network to assess the impact on the results. However, it did not present any results in its submission.

ERG comments on the network meta‑analysis

3.28 The ERG found that many of the trials used in the network meta‑analysis had substantially different tacrolimus target whole‑blood trough levels to those used in UK clinical practice. Some studies maintained tacrolimus blood trough levels above 5 ng/ml in the reduced tacrolimus dose arm, but other studies maintained blood trough levels below 5 ng/ml in the standard‑dose tacrolimus arm. Therefore, no consistency was seen across studies with respect to target drug levels.

3.29 The ERG questioned the validity of the network meta‑analysis results for the renal outcomes because it considered that the allocation of the different studies' treatment groups to the reduced‑dose and standard‑dose tacrolimus categories was inconsistent and misleading. Because the standard‑dose tacrolimus connector across the network meta‑analysis studies was so heterogeneous, the ERG considered that the results of the network meta‑analysis were not robust.

3.30 The ERG found a significant limitation in the network meta‑analysis because the data included in the WinBUGS codes did not relate to the submission data and appeared to have been taken from either a different submission or a theoretical exercise. The ERG could not verify which data were used for the analysis of specific outcomes because of a lack of clarity and transparency in the company submission. The ERG stated that it was unclear which studies had been included in the analysis for the tBPAR outcome. The company highlighted that its submission provided network diagrams and data tables for the acute rejection outcome at various time points. However, the ERG commented that it was still unclear which studies had been included.

3.31 The ERG commented that the company's scenario analyses, which removed specific trials from the network to assess the impact on the results, lacked transparency and were not informative because no results were presented.

Cost effectiveness

3.32 The company did not identify any existing cost‑effectiveness analyses of everolimus with reduced‑dose tacrolimus that were relevant to the decision problem.

3.33 The company developed a patient‑simulation model that evaluated the cost‑effectiveness of everolimus with reduced‑dose tacrolimus, with or without corticosteroids (that is, people were assumed to have corticosteroids initially and then tapered off completely from 6 months onwards), compared with azathioprine or mycophenolate mofetil with standard‑dose tacrolimus for maintenance immunosuppressive therapy. The model considered a hypothetical cohort of patients who had a liver transplant for any reason. The model included a core hepatic rejection model and a renal sub‑model. The core model consisted of 6 health states: 'stable post‑transplant', 'acute rejection', 'acute steroid‑resistant rejection', 'severe chronic rejection (leading to graft loss)', 'mild chronic rejection', and 'hepatic‑graft‑related death'.

3.34 The company included the renal sub‑model to demonstrate the 'renal sparing' effect of everolimus with reduced‑dose tacrolimus. The sub‑model included 5 health states defined by stages of chronic kidney disease (CKD) as measured by eGFR. These ranged from no CKD (eGFR 90+) to CKD stage 5 (eGFR <15). It also included a renal‑related death state. Patients could also leave the model from natural (background) mortality.

3.35 The model had a lifetime time horizon (80 years) and a cycle length of 3 months. The company stated that this reflected expert opinion that most acute rejection occurs 3 months after transplantation. A half‑cycle correction was not applied. The model used discount rates of 3.5% for costs and QALYs and an NHS/personal and social services perspective.
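The time structure described above can be sketched as follows. This is a minimal illustration, not the company's implementation: it assumes that discounting is applied at the start of each cycle and that a half‑cycle correction, where used, averages state occupancy at the start and end of the cycle. All names are hypothetical.

```python
CYCLE_YEARS = 0.25      # 3-month cycle length
HORIZON_YEARS = 80      # lifetime time horizon used in the model
DISCOUNT_RATE = 0.035   # annual rate applied to both costs and QALYs


def n_cycles(horizon_years: float = HORIZON_YEARS,
             cycle_years: float = CYCLE_YEARS) -> int:
    """Number of model cycles in the time horizon."""
    return round(horizon_years / cycle_years)


def discount_factor(cycle_index: int,
                    rate: float = DISCOUNT_RATE,
                    cycle_years: float = CYCLE_YEARS) -> float:
    """Discount factor at the start of a 0-based cycle (assumed convention)."""
    return 1.0 / (1.0 + rate) ** (cycle_index * cycle_years)


def discounted_total(trace, value_per_cycle: float,
                     half_cycle: bool = False) -> float:
    """Sum discounted values over a trace of proportions alive.

    `trace` holds the proportion alive at each cycle boundary, so it has
    one more entry than there are cycles.
    """
    total = 0.0
    for i in range(len(trace) - 1):
        if half_cycle:
            occupancy = (trace[i] + trace[i + 1]) / 2.0  # average over the cycle
        else:
            occupancy = trace[i]                         # counted at cycle start
        total += occupancy * value_per_cycle * discount_factor(i)
    return total


toy_trace = [1.0, 0.5, 0.0]  # illustrative values only, not model output
uncorrected = discounted_total(toy_trace, 1.0)
corrected = discounted_total(toy_trace, 1.0, half_cycle=True)
print(n_cycles(), uncorrected, corrected)
```

With a declining trace, counting occupancy at the start of each cycle overstates time spent alive, which is why the corrected total is lower in this toy example.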

ERG comments on the model structure

3.36 The ERG considered that the company did not provide enough evidence to justify its approach of using a patient‑level simulation. The company reasoned that the use of a patient‑simulation model is appropriate when the patient flow is determined by the time since the last event or by the history of previous events. However, the ERG highlighted that only the severe chronic‑rejection state was affected by the time since the last event, and that the renal sub‑model transition probabilities were not time dependent and also did not depend on history of previous events. Based on assessment of patient heterogeneity, and the patient baseline characteristics simulated in the economic model, the ERG considered that a patient‑simulation model was not needed, and that a cohort state‑transition model would have been more appropriate.

3.37 The ERG considered that more emphasis should have been placed on the renal component of the economic model and also that more interaction between the 2 models should have been considered, perhaps within 1 broader model structure. The ERG stated that this was because immunosuppressive therapy after liver transplantation has an impact on renal functioning and renal functioning has an impact on graft survival.

3.38 The ERG found that the reporting of the model's structure and assumptions lacked clarity and that few justifications were provided for the assumptions used. In particular, the ERG questioned the clinical plausibility of the mild chronic‑rejection state (an asymptomatic state that patients could only move to 1 year after transplantation). The ERG's clinical expert adviser did not see a valid or justifiable reason for patients only to progress to this state 1 year after transplantation. The relevance of including this health state in the model was therefore not clear.

3.39 The ERG's clinical adviser stated that 3‑month cycles were too long to capture all the relevant events and that monthly cycles may have been more appropriate. The ERG also stated that because the cycle length of 3 months was relatively long, a half‑cycle correction should have been applied. The ERG highlighted that, although the impact of a half‑cycle correction on the model outcomes would not be significant, the company gave no justification for not applying it.

3.40 The ERG considered that the time horizon of 80 years was unnecessarily long given that the average starting age of people in the model was 54 years. The ERG highlighted that after 40 years (when the average age in the model was 94 years), 100% of patients would have died.

3.41 The ERG expressed concern about the number of simulations (10,000) and the lack of stability in the patient‑simulation model. The ERG explored this in its exploratory analyses (see sections 3.66 and 3.67).

Model parameters

3.42 In the core hepatic‑rejection model, disease progression from the stable post‑transplant state to the acute rejection state was determined by the immunosuppressive regimen (the treatment group). The probability of progression was derived from the probability of being free from treated biopsy proven acute rejection (tBPAR) at 3, 6 and 12 months for each treatment group, which was calculated from the network meta‑analysis. The transition probabilities from the stable post‑transplant state remained constant from the fifth cycle (starting at month 13) onwards. All other transition probabilities for the other health states were assumed to be constant and were therefore independent of the immunosuppressive regimen.
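One way to turn cumulative tBPAR‑free probabilities at 3, 6 and 12 months into probabilities for each 3‑month cycle is sketched below. The submission's actual derivation is not described here, so this is an assumption: each cycle probability is taken as the conditional probability of a first event given no earlier event, with a constant hazard assumed across the 2 cycles between months 6 and 12. The 3‑ and 6‑month inputs are hypothetical; only the 12‑month value (89.5%) comes from the network meta‑analysis results reported for everolimus with reduced‑dose tacrolimus.

```python
def per_cycle_probabilities(free_at_3: float, free_at_6: float, free_at_12: float):
    """Probability of a first tBPAR in each of the first four 3-month cycles.

    Inputs are cumulative probabilities of being tBPAR-free at 3, 6 and 12 months.
    """
    p_cycle_1 = 1.0 - free_at_3
    p_cycle_2 = 1.0 - free_at_6 / free_at_3
    # Months 6 to 12 cover two cycles; assume the same probability in each.
    p_cycle_3_4 = 1.0 - (free_at_12 / free_at_6) ** 0.5
    return [p_cycle_1, p_cycle_2, p_cycle_3_4, p_cycle_3_4]


# 0.95 and 0.92 are hypothetical; 0.895 is the 12-month estimate for everolimus
probs = per_cycle_probabilities(0.95, 0.92, 0.895)

# Check: applying the four cycle probabilities in turn recovers the 12-month value
still_free = 1.0
for p in probs:
    still_free *= 1.0 - p
print(probs, still_free)
```

The final loop is a consistency check: whatever inputs are used, the product of the per‑cycle survival terms should reproduce the cumulative 12‑month probability.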

3.43 For the renal sub‑model, transition probabilities for the first year were based upon the annual decrease in eGFR from baseline and were dependent on the treatment in year 1. The relative difference between treatments was calculated from the network meta‑analysis and used to derive the absolute eGFR decrease for each treatment. After year 1, the company assumed that renal function followed a natural progressive decline meaning that transition probabilities were assumed to be constant. These transition probabilities were not time dependent or treatment‑arm dependent and were based on underlying disease progression probabilities.

3.44 Health‑related quality‑of‑life data were not collected as part of H2304, so the company did a systematic literature review to identify health‑state utility values. The company identified 7 studies, 5 of which measured EQ‑5D in a UK population. All of the studies provided data for patients after transplantation, but none of the studies provided utility data specific to either acute or mild rejection, nor did they report disutility data specific to adverse events. Utility scores for the health states in the hepatic rejection model and for the renal sub‑model were based on Ratcliffe et al., 2002 and Neri et al., 2012 respectively, both UK studies using EQ‑5D. The study by Neri et al. assessed the relationship between health utility and renal function in people who had a kidney transplant.

3.45 Patients in the core hepatic model were assumed to have a stable health‑related quality of life over time, with most states assuming an asymptomatic state with a utility value of 0.58. Patients in the more severe state of graft loss (severe chronic rejection) experienced a decrease in utility to 0.53. In the renal sub‑model, patients' health‑related quality of life decreased in line with their symptoms until renal transplantation, from 0.83 in the 'no CKD' health state, to 0.64 for CKD stage 1 to 2, 0.58 for CKD stage 3, 0.49 for CKD stage 4 and 0.28 for CKD stage 5. The company used the 'minimum method' to take into account potential double counting of utility losses in simultaneous health states (for example, hepatic rejection and renal dysfunction). The minimum method takes the lowest of the individual health‑state utility values as the estimate of joint state utility.
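The minimum method can be illustrated with the utility values quoted above. This is a minimal sketch: the dictionary keys and the function name are hypothetical labels, and only 2 of the hepatic states are shown.

```python
# Utility values for the hepatic rejection model (section 3.45)
HEPATIC_UTILITY = {
    "stable post-transplant": 0.58,
    "severe chronic rejection": 0.53,
}

# Utility values for the renal sub-model (section 3.45)
RENAL_UTILITY = {
    "no CKD": 0.83,
    "CKD stage 1 to 2": 0.64,
    "CKD stage 3": 0.58,
    "CKD stage 4": 0.49,
    "CKD stage 5": 0.28,
}


def joint_utility(hepatic_state: str, renal_state: str) -> float:
    """Minimum method: the joint utility is the lower of the two state utilities."""
    return min(HEPATIC_UTILITY[hepatic_state], RENAL_UTILITY[renal_state])


print(joint_utility("stable post-transplant", "no CKD"))       # hepatic value applies
print(joint_utility("stable post-transplant", "CKD stage 4"))  # renal value applies
```

Under this method, renal function affects a patient's utility only once the renal utility falls below the hepatic utility, which for a stable post‑transplant patient means CKD stage 4 or 5.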

3.46 The company obtained estimates of resource use for the hepatic‑rejection model from the University Hospitals Birmingham Foundation Trust and these were validated by the company's clinical advisers. For the renal sub‑model, the company reported that NICE's guideline on identifying and managing chronic kidney disease and Kerr et al., 2012 were the 2 sources used to obtain resource use data.

3.47 The company estimated the occurrence of adverse events associated with the different treatment regimens based on the network meta‑analysis and the summary of product characteristics for each drug product. These included hypertension, diabetes mellitus, infections, tremor and insomnia. The expected cost per treatment regimen was calculated by multiplying the cost of treating each event by the probability of the event occurring. The disutility estimates for treatment‑related adverse events were estimated from the published literature where possible.

ERG comments on the model parameters

3.48 The ERG questioned the validity and applicability of the economic analysis for the NHS context, because blood trough levels of reduced‑dose tacrolimus in H2304 were above what would be considered as reduced levels in UK clinical practice (see section 3.8). It also highlighted that because the standard dose of tacrolimus across the network meta‑analysis studies was so heterogeneous, the network meta‑analysis results that informed the model were likely to lack robustness (see section 3.29).

3.49 The ERG identified structural errors in the formulae that allocated patients to different health states in the model, which it considered could have been avoided if a cohort state‑transition model had been used. The ERG could not correct these errors because of the computational burden needed to run the patient‑level simulation model, but it considered that the results were likely to be biased in favour of the everolimus treatment regimen.

3.50 The ERG commented that, because of the random generation process used to estimate the baseline eGFR levels, some patients started the renal sub‑model with negative levels of eGFR, which is not clinically plausible. The ERG explained that this meant these patients were immediately allocated to the CKD stage‑5 category, where patients with a negative eGFR level could stay for long periods of time (for example over 3 years) until they returned to CKD stages 1–2 following a transplant.
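The problem of implausible baseline values can be illustrated with a short sketch. The distribution used in the company's model is not stated, so a normal distribution is assumed here purely for illustration, with the mean eGFR of randomised patients in H2304 (81) and a hypothetical, deliberately wide standard deviation. The guard shown (redrawing until the value is non‑negative) is one possible remedy and is not taken from the submission or the ERG report.

```python
import random


def sample_baseline_egfr(rng: random.Random, mean: float, sd: float,
                         lower: float = 0.0, max_tries: int = 1000) -> float:
    """Draw from a normal distribution, redrawing any value below `lower`."""
    for _ in range(max_tries):
        value = rng.gauss(mean, sd)
        if value >= lower:
            return value
    raise RuntimeError("could not draw a plausible eGFR value")


rng = random.Random(2015)  # fixed seed so the illustration is repeatable
MEAN_EGFR = 81.0           # mean eGFR of randomised patients (section 3.4)
ASSUMED_SD = 40.0          # hypothetical value, chosen to make the problem visible

# Unguarded draws: some simulated patients receive a negative eGFR
naive = [rng.gauss(MEAN_EGFR, ASSUMED_SD) for _ in range(10_000)]

# Guarded draws: negative values are rejected and redrawn
guarded = [sample_baseline_egfr(rng, MEAN_EGFR, ASSUMED_SD) for _ in range(10_000)]

print(sum(v < 0 for v in naive), min(guarded))
```

Redrawing truncates the distribution and so shifts its mean slightly upwards; a distribution bounded at zero would avoid the problem without that side effect.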

3.51 The ERG raised some concerns about the utility values used in the model. For example, it noted that the company had assumed that quality of life in the acute‑rejection and mild chronic‑rejection health states was the same as in the stable post‑transplant state. However, clinical opinion sought by the ERG suggested that patients in the acute‑rejection and mild chronic‑rejection health states would require hospitalisation and that this would reduce their quality of life relative to the stable post‑transplant state. The ERG also considered that the utility value of 0.83 for the 'no CKD' health state was more likely to represent CKD stage 1 and that the utility value of 0.64 for CKD stages 1 to 2 was too low.

3.52 The ERG was generally satisfied with the sources used to obtain unit costs for the hepatic model. However, it highlighted that clinical practice varies across centres for patients having a liver transplant, whereas the company obtained resource use data from 1 centre only. It was therefore important to validate these data against different sources. In general, clinical opinion sought by the ERG disagreed with some of the resource data reported in the submission. The ERG commented that GP visits were unlikely to occur as frequently as described because most of these patients would be managed in secondary care. The ERG also highlighted that the number of tests required in some health states may have been underestimated. For example, in the stable post‑transplant state, people may also require a blood test to check immunosuppressive drug trough levels. The ERG also considered that the cost associated with the mild chronic‑rejection health state (£640) seemed too high for an asymptomatic condition that does not require treatment, and it was not clear why this was higher than the cost associated with the stable post‑transplant health state (£73).

3.53 The ERG noted that the company used the most expensive brand price (£1.61 per mg) for tacrolimus (Prograf) in the economic model with no apparent justification. The ERG suggested that a weighted average price of £1.30 per mg (based on the market share information) for tacrolimus would have been more appropriate.

3.54 The ERG was generally satisfied with the estimation of adverse events and the quality of life data used to reflect these in the core hepatic‑rejection model. However, it did not agree that the costs and utility losses associated with everolimus‑related adverse events should have been included for the 3 months in the first model cycle because everolimus therapy starts 1 month after surgery, and therefore the adverse events associated with the drug should have been considered only for 2 months.

Cost‑effectiveness results

3.55 Everolimus with reduced‑dose tacrolimus was more costly and resulted in more QALYs than the other treatment regimens. The deterministic base‑case incremental cost‑effectiveness ratio (ICER) estimated by the company was £110,797 per QALY gained compared against mycophenolate mofetil with standard‑dose tacrolimus (incremental costs £38,004, incremental QALYs 0.343), and £187,842 per QALY gained compared against azathioprine with standard‑dose tacrolimus (incremental costs £35,221, incremental QALYs 0.188). The ICER for the azathioprine treatment regimen compared with the mycophenolate mofetil regimen was £17,895 per QALY gained.
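
The ICERs in section 3.55 are the ratio of incremental costs to incremental QALYs. As a minimal sketch (the function name `icer` is illustrative and is not part of the company's model), the reported increments can be checked as follows. Because the published increments are rounded, the recomputed values match the reported ICERs only approximately.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio in pounds per QALY gained."""
    if incremental_qalys == 0:
        raise ValueError("ICER is undefined when incremental QALYs are zero")
    return incremental_cost / incremental_qalys

# Increments for everolimus with reduced-dose tacrolimus, as reported in 3.55.
vs_mmf = icer(38_004, 0.343)   # versus the mycophenolate mofetil regimen
vs_aza = icer(35_221, 0.188)   # versus the azathioprine regimen

# Implied increments for azathioprine versus mycophenolate mofetil,
# obtained by differencing the two comparisons above.
aza_vs_mmf = icer(38_004 - 35_221, 0.343 - 0.188)

print(round(vs_mmf), round(vs_aza), round(aza_vs_mmf))
```

All three recomputed values fall within 1% of the figures reported in 3.55, which suggests the reported pairwise ICERs are internally consistent given rounding.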

3.56 No deterministic sensitivity analyses were reported in the company's submission. The company ran a probabilistic sensitivity analysis using 1000 simulations for 1000 patients. The company reported that the results of the probabilistic sensitivity analysis were similar to the deterministic base‑case results. The probabilistic ICERs for the everolimus treatment regimen were £105,526 and £184,714 compared with the mycophenolate mofetil and azathioprine treatment regimens, respectively.

3.57 The company's probabilistic sensitivity analysis indicated that for all simulations the ICERs for the everolimus treatment regimen compared with the mycophenolate mofetil regimen were higher than £30,000 per QALY gained. Similarly, the majority of simulations (>99%) were above £30,000 per QALY gained for the comparison of the everolimus and azathioprine treatment regimens. The analysis indicated that everolimus with reduced‑dose tacrolimus was likely to be the most cost‑effective therapy only if the maximum acceptable ICER was over £200,000 per QALY gained.
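
The proportions quoted in 3.57 are counts of simulations that clear a given threshold. A minimal sketch, assuming each simulation yields a pair of incremental cost and incremental QALYs: a simulation is counted as cost effective at a maximum acceptable ICER when its net monetary benefit is positive. The draws below are hypothetical and are not taken from the company's model.

```python
def proportion_cost_effective(draws, max_acceptable_icer):
    """Share of simulations with positive net monetary benefit.

    draws: iterable of (incremental_cost, incremental_qalys) pairs.
    """
    draws = list(draws)
    favourable = sum(
        1 for cost, qalys in draws
        if max_acceptable_icer * qalys - cost > 0
    )
    return favourable / len(draws)

# Hypothetical draws, for illustration only.
example_draws = [(38_000, 0.34), (40_000, 0.30), (30_000, 0.40), (36_000, 0.10)]

at_30k = proportion_cost_effective(example_draws, 30_000)
at_200k = proportion_cost_effective(example_draws, 200_000)
```

Net monetary benefit is used here rather than the ratio itself because a ratio can be misleading when a draw has negative incremental QALYs.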

3.58 The company undertook a number of scenario analyses. In scenario 1, the company removed the mild chronic‑rejection state. This increased the ICER for the everolimus treatment regimen compared with the mycophenolate mofetil regimen to £227,528 per QALY gained. The azathioprine treatment regimen was dominated by the mycophenolate mofetil regimen.

3.59 In scenario 2, the company removed the opportunity for re‑transplant. This resulted in an ICER for everolimus in combination with reduced‑dose tacrolimus of £121,972 per QALY gained compared with the mycophenolate mofetil treatment regimen and £117,285 per QALY gained compared with the azathioprine treatment regimen.

3.60 In scenario 3, the company removed the renal sub‑model. This increased the ICER for everolimus in combination with reduced‑dose tacrolimus to £312,279 per QALY gained compared with the mycophenolate mofetil treatment regimen and to £374,832 per QALY gained compared with the azathioprine treatment regimen. The company stated that the large impact on the results reflected the benefit that the everolimus treatment regimen provides for patients through a renal‑sparing effect.

3.61 In scenario 4, the company reduced baseline eGFR from 81 ml/min/1.73 m2 to 60 ml/min/1.73 m2. This resulted in an ICER for everolimus in combination with reduced‑dose tacrolimus of £184,372 per QALY gained compared with the mycophenolate mofetil treatment regimen and £179,427 per QALY gained compared with the azathioprine treatment regimen. The company highlighted that the results of the scenario analyses demonstrated that the model was sensitive to changes in baseline eGFR.

ERG comments on the cost‑effectiveness results

3.62 The ERG noted a logical error in the model when it analysed data provided by the company on the average number of cycles spent in the different health states of the model. It noted that there was a total of 320 cycles in the model and that in the hepatic rejection model patients spent an average of 41 cycles in the everolimus with reduced‑dose tacrolimus group. In the remaining 279 cycles the patients were dead. However, in the renal model, patients only spent 31 cycles alive in the model, meaning that 10 cycles were 'missing' from the renal sub‑model. The ERG suggested that this logical error reflected a problem in the model formulae and/or structure.
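
The cycle counts in 3.62 can be reconciled with simple arithmetic. The sketch below assumes the 80‑year time horizon and 3‑month cycle length mentioned in 3.70; the variable names are illustrative.

```python
CYCLES_PER_YEAR = 4        # 3-month cycles
TIME_HORIZON_YEARS = 80    # horizon of the original model

total_cycles = TIME_HORIZON_YEARS * CYCLES_PER_YEAR

alive_hepatic = 41   # average cycles alive, hepatic-rejection model
alive_renal = 31     # average cycles alive, renal sub-model

# Cycles spent dead according to the hepatic-rejection model.
dead_cycles = total_cycles - alive_hepatic

# Both models follow the same patients, so time alive should agree;
# any difference is unaccounted for in the renal sub-model.
missing_cycles = alive_hepatic - alive_renal
```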

3.63 The ERG commented that it would be useful to understand what the key drivers of the economic model were, especially considering the high ICERs presented, but that this was not possible because the company did not undertake deterministic sensitivity analysis.

3.64 The ERG highlighted that the results of the probabilistic sensitivity analysis were based on a small number of simulations (1000) and were unlikely to have generated reliable estimates.

3.65 The ERG queried the results of the company's scenario analyses (see sections 3.58 to 3.61). The ERG could not find a plausible reason why removing the mild chronic‑rejection state in the company's scenario‑1 analysis increased the ICER for the everolimus treatment regimen compared with the mycophenolate mofetil regimen but decreased it compared with the azathioprine regimen. It also noted that the direction of change was inconsistent in scenario 2 (removing the re‑transplantation option from both models) and scenario 4 (decreasing the baseline eGFR level from 81 ml/min/1.73 m2 to 60 ml/min/1.73 m2), with the ICERs increasing compared with the mycophenolate mofetil treatment regimen and decreasing compared with the azathioprine treatment regimen.

ERG exploratory analyses

3.66 The ERG ran 2 iterations of the company's base‑case model using the same number of simulations and the same assumptions, to test the model's stability with regard to the ICER results. The ERG found considerable variation in the ICERs reported, especially for the everolimus treatment regimen compared with the mycophenolate mofetil regimen, with the ICERs ranging from £110,797 to £120,651 per QALY gained (nearly a 9% change). The ERG concluded that there was instability in the base‑case ICERs.
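
The "nearly 9%" figure in 3.66 is the relative change between the lowest and highest ICER across the 2 runs, which can be reproduced as a quick check:

```python
low, high = 110_797, 120_651   # ICERs (pounds per QALY gained) from the 2 runs

relative_change = (high - low) / low
print(f"{relative_change:.1%}")   # prints 8.9%
```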

3.67 To determine the cause of instability in the model results, the ERG 'fixed' the baseline characteristics of patients (by taking their mean values) in the simulation model instead of allowing these values to vary in each simulation according to a distribution. The aim of this analysis was to understand whether the variation in results was generated by the simulated patient characteristics or was attributable to other problems in the model. The results of this exercise generated ICERs for the everolimus treatment regimen compared with the azathioprine regimen that ranged from dominant to £797,558 per QALY gained. For the everolimus treatment regimen compared with the mycophenolate mofetil regimen, the ICERs ranged from £431,348 to £582,668 per QALY gained. The ERG therefore considered that the instability of the model could not be solved by fixing patient baseline characteristics as mean estimates.

3.68 The ERG lacked confidence in the ICERs presented by the company but it considered that it would not be helpful to run any additional analyses with different input values because of the instability of the results. The ERG stated that it could not therefore make any predictions regarding the true cost effectiveness of everolimus.

Additional evidence following consultation on the Appraisal Consultation Document

3.69 No comments from individual patients or professional groups were received during consultation. The company submitted additional evidence relating to the generalisability of the H2304 trial. It acknowledged that there is limited published evidence showing that tacrolimus trough levels of 5 ng/ml or below are achieved in clinical practice, following a liver transplant. The company presented a summary of target tacrolimus trough levels, and those achieved, from clinical trials identified in a 2012 systematic review. These data showed that in the 3 studies that included reduced‑dose tacrolimus in combination with mycophenolate mofetil, none of the studies achieved mean tacrolimus trough levels below 5 ng/ml. The company reported that no studies comparing reduced‑dose tacrolimus with azathioprine were identified in the systematic review. In addition, the company submitted evidence showing that most patients in H2304 who had everolimus with reduced‑dose tacrolimus had a mean tacrolimus trough level within or below the target range of 3–5 ng/ml by month 12.

3.70 The company submitted an updated model. The number of simulations in the base‑case and scenario analyses was increased from 10,000 to 40,000, which the company reported increased the stability of the cost‑effectiveness results. The updated model incorporated 11 input changes, including:

  • amending the renal‑efficacy data in the model. Instead of taking the estimated 12‑month decrease in renal function from the arm of the network meta‑analysis that looked at standard‑dose tacrolimus, it was updated to take it from the arm that looked at mycophenolate mofetil with reduced‑dose tacrolimus. The company acknowledged that, in clinical practice, the studies that informed the reduced‑dose tacrolimus arm would be considered to use a standard dose because trough levels were consistently higher than 7.5 ng/ml

  • using the average brand price for tacrolimus as calculated by the ERG, rather than the Prograf brand price

  • applying adverse‑event costs for everolimus for 2 months instead of 3 months in the first cycle, because treatment starts 30 days after a transplant

  • shortening the time horizon of the model from 80 years to 40 years

  • recalculating renal‑progression rates using the correct rate–probability conversion equation to generate the correct risk of progression to subsequent CKD stages per 3‑month cycle

  • correcting a number of errors relating to the calculation of transition probabilities

  • correcting the estimate of baseline eGFR levels so that no patients had negative levels

  • correcting an incorrect formula in the renal sub‑model that led to 10 'missing' cycles identified by the ERG.
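
The guidance does not state which rate–probability conversion equation the company used. A common approach, sketched below under the assumption of a constant rate over time, converts a probability to a rate and back so that it can be rescaled to the 3‑month cycle length. The function names and the example 12‑month probability are illustrative, not values from the model.

```python
import math

def probability_to_rate(probability, period):
    """Constant rate implied by an event probability over `period`."""
    return -math.log(1.0 - probability) / period

def rate_to_probability(rate, period):
    """Event probability over `period` for a constant rate."""
    return 1.0 - math.exp(-rate * period)

# Hypothetical 12-month progression probability (time measured in years).
annual_probability = 0.20
annual_rate = probability_to_rate(annual_probability, period=1.0)

# Probability per 3-month cycle; note this is not simply 0.20 / 4.
cycle_probability = rate_to_probability(annual_rate, period=0.25)
```

Compounding `cycle_probability` over 4 cycles recovers the 12‑month probability, which dividing the annual probability by 4 would not do.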

3.71 The company presented an updated base‑case analysis that included the 11 amendments to the economic model. This increased the base‑case ICER for everolimus with reduced‑dose tacrolimus when compared with using mycophenolate mofetil with standard‑dose tacrolimus from £110,797 to £176,604 per QALY gained. It also decreased the base‑case ICER for everolimus with reduced‑dose tacrolimus compared with using azathioprine with standard‑dose tacrolimus from £187,842 to £104,782 per QALY gained. The company reported that a second run of the updated model resulted in smaller variations in the ICERs than in the original base‑case results and that this was because of the increase in simulations from 10,000 to 40,000. The ICERs were slightly higher for everolimus compared with the mycophenolate mofetil regimen than for everolimus compared with the azathioprine regimen. The company explained that this is because mycophenolate mofetil with standard‑dose tacrolimus is associated with a slightly lower per‑cycle rate of acute rejection (0.6%) when compared with everolimus (2.5%) and azathioprine (1.6%) at 13 months' follow‑up and beyond. The company also reported that the mycophenolate mofetil regimen had an adverse‑event‑related disutility score (−0.011) that was more favourable than azathioprine (−0.015), but less favourable than everolimus (−0.009).

3.72 The company's updated model included a number of new scenario analyses, including:

  • reducing the baseline eGFR from 81 ml/min/1.73 m2 to 60 ml/min/1.73 m2, to be more reflective of the patient population in the UK. The ICER for everolimus with reduced‑dose tacrolimus was £197,404 per QALY gained compared with the mycophenolate mofetil treatment regimen and £107,618 per QALY gained compared with the azathioprine regimen

  • removing the mild chronic‑rejection state from the model. The ICER for everolimus with reduced‑dose tacrolimus was £172,893 per QALY gained compared with the mycophenolate mofetil treatment regimen and £95,794 per QALY gained compared with the azathioprine regimen

  • amending the utility values for acute rejection, acute steroid‑resistant rejection and mild‑chronic rejection from 0.58 to 0.56 so that they were lower than the stable post‑transplant health state. The ICER for everolimus with reduced‑dose tacrolimus was £171,116 per QALY gained compared with the mycophenolate mofetil treatment regimen and £103,858 per QALY gained compared with the azathioprine regimen.

3.73 The company also repeated a scenario analysis that the ERG had done to test the stability of the original model. The company fixed the baseline characteristics, which resulted in an ICER of £320,637 per QALY gained for everolimus with reduced‑dose tacrolimus compared against mycophenolate mofetil with standard‑dose tacrolimus. The corresponding ICER compared against azathioprine with standard‑dose tacrolimus was £161,462 per QALY gained. The company highlighted that this demonstrated that its updated model was more stable and produced smaller variations in the ICERs when the model was rerun.

3.74 Using the original model, the company explored the impact of reclassifying tacrolimus dosing on the acute‑rejection efficacy inputs in the model by re‑running the network meta‑analysis. Studies with treatment arms that included tacrolimus at trough levels less than 5 ng/ml were reclassified as 'reduced tacrolimus', while studies with treatment arms at trough levels of more than 5 ng/ml were reclassified as 'standard tacrolimus'. This reclassification of studies in the network meta‑analysis changed the probability of acute rejection for the 3 treatment regimens and resulted in ICERs for everolimus with reduced‑dose tacrolimus of £101,893 per QALY gained compared with the mycophenolate mofetil treatment regimen and £73,827 per QALY gained compared with the azathioprine regimen.

3.75 Full details of all the evidence are available.
