3 Committee discussion

The appraisal committee (section 5) considered evidence submitted by Janssen, a review of this submission by the evidence review group (ERG), the technical report developed through engagement with stakeholders, the responses to the appraisal consultation document and the ERG's review of the company's consultation responses. See the committee papers for full details of the evidence.

The appraisal committee was aware that none of the issues raised during the technical engagement stage had been fully resolved. Therefore, it considered all the feedback received from consultees and commentators, the ERG's report on the company's response to engagement and other issues that had not been consulted on during engagement.

Clinical need and current management

Living with moderately to severely active disease is physically and emotionally disabling

3.1 The patient expert explained that the experience of living with moderately to severely active ulcerative colitis varies on an individual level, but in their experience it is extremely challenging. They explained that, in the 5 years between initial diagnosis and the point at which they had surgery, they had only experienced about 18 months in total when their disease was not active. During periods of active disease, they never had fewer than 4 to 5 bowel movements per day. They experienced constant pain, sleep deprivation (caused by being awake in the night to go to the toilet) and depression. They also explained that using corticosteroids is associated with side effects and contributes to low mood. They commented that the effects of the disease and side effects of medication can be moderated, to an extent, through management strategies, such as avoiding social activities and mapping local toilets. However, this is an extreme burden. They explained that feeling out of control is an important and common issue for many people with moderately to severely active ulcerative colitis. The clinical experts said that the patient expert's comments reflect the experience of patients that they see in practice, and responses received during consultation from consultees also reflected these comments. The committee took account of comments submitted in writing by patient experts and research undertaken by the company, which highlighted the effects of the disease and current treatments, including surgery, on daily activities, relationships, self-esteem and body image. It concluded that living with moderately to severely active disease is physically and emotionally disabling.

Ustekinumab is an alternative to vedolizumab

3.2 The clinical experts recognised that NICE already recommends several treatment options for when conventional therapy or a biological agent cannot be tolerated, or the disease has responded inadequately or lost response to treatment. The clinical experts commented that all the current treatments are similarly effective. However, in current practice, most patients will be offered a tumour necrosis factor (TNF)‑alpha inhibitor first when conventional therapy has failed. This is because biosimilars are available in this class, which have a lower price. The clinical experts stated that the cheapest infliximab biosimilar is usually prescribed first. If a patient has a loss of response and has produced antibodies, they would be offered another TNF‑alpha inhibitor. If a patient has not produced antibodies and their disease has responded inadequately or lost response to treatment with 1 TNF‑alpha inhibitor, then vedolizumab or tofacitinib are considered. The clinical experts noted that a patient's choice of treatment is often influenced by the drugs' safety profiles, and that in their experience tofacitinib is sometimes more effective but associated with more severe side effects and is not often used in clinical practice. They explained that TNF‑alpha inhibitors are not appropriate for everyone, for example, people who have contraindications such as high risk of heart failure, or people who are at high risk of infection, including older people. The clinical experts explained that, for these people, vedolizumab would usually be offered instead. They stated that conventional therapy is not a suitable comparator for ustekinumab, because ustekinumab's place in the clinical pathway is likely to be the same as for vedolizumab. The committee noted that vedolizumab and ustekinumab have different mechanisms of action to TNF‑alpha inhibitors, and that ustekinumab and vedolizumab have similar adverse effect profiles. 
It also noted that ustekinumab may be preferred over vedolizumab because of its mode of administration (subcutaneous injection rather than intravenous infusion). Also, because vedolizumab acts mainly on the gut, ustekinumab could be advantageous because it acts on other manifestations of the disease (such as in the skin and joints). The committee noted that vedolizumab and tofacitinib are recommended for people who cannot have TNF‑alpha inhibitors. However, vedolizumab is the most relevant comparator because it has a similar safety profile to ustekinumab. The committee therefore concluded it is most appropriate to consider ustekinumab at the same place in the treatment pathway as vedolizumab.

There is an unmet need for new treatments that reduce the need for corticosteroids or surgery

3.3 The clinical and patient experts, and the consultation responses, agreed that there is an unmet need for new non-surgical treatment options because many people have an inadequate response to current therapies, or the therapies stop working. The only option for these people, other than surgery, is long-term corticosteroid use. This is associated with extreme side effects including mood changes such as irritability and depression, osteoporosis and cataracts. The patient and clinical experts, and comments from consultees, agreed that surgery can be an effective treatment for some patients, but it is avoided until it is the last available treatment option. Outcomes of surgery are variable; there can be a psychological impact, and abdominal scarring can significantly affect sexual and reproductive function. The patient expert also noted that ustekinumab's mode and frequency of administration during maintenance treatment may be more convenient than that of some other current treatments. The committee concluded that new medical treatment options would be welcome.

Clinical evidence

The UNIFI trial shows that ustekinumab is more effective than placebo at inducing and maintaining remission and response in all patients

3.4 UNIFI is a randomised, placebo-controlled trial of patients who had had an inadequate disease response to, or unacceptable side effects from, biological treatments (TNF‑alpha inhibitors or vedolizumab) or conventional non-biological therapy (corticosteroids or the immunomodulators azathioprine or mercaptopurine). It had an induction-phase study and a maintenance-phase study. There were 961 patients in the induction study, with outcomes measured at week 8 in the intention-to-treat (ITT) analyses and at week 16 in the non-ITT analyses. 523 patients with disease that had responded after 8 weeks of induction treatment with ustekinumab were entered into the ITT population of the maintenance study, and re-randomised to determine what maintenance treatment they would have (ustekinumab or placebo). The maintenance study also included 2 non-randomised populations: patients whose disease had responded to placebo during induction treatment (sample size and results not reported) and patients who had had more than 8 weeks of induction therapy with ustekinumab and were in response at week 16 (n=157; described by the company as 'delayed responders'). The company reported results for the ITT population for the following subgroups:

  • a 'biologic-failure' subgroup of people who had had at least 1 biological treatment (a TNF‑alpha inhibitor or vedolizumab) and either their disease did not respond or lost an initial response, or they could not tolerate it

  • a 'non-biologic failure' subgroup of people who had never had a biological treatment, which also included some people who had had biological treatments but had not had a documented 'biologic failure'.

    At the end of induction treatment, rates of clinical remission and response were statistically significantly higher in the ustekinumab 6 mg per kg and 130 mg groups than the placebo group. This was the case for both the non-biologic failure and biologic-failure subgroups, and for the overall ITT population. At week 44 of the maintenance phase, a statistically significantly greater proportion of patients who had had ustekinumab maintenance with either dose were in clinical remission than those who had had placebo. This was the case for both the non-biologic failure and biologic-failure subgroups, and for the overall ITT population. The committee noted that these subgroups were defined differently to those in the NICE scope, and that in many trials of ulcerative colitis therapies patients are classified based on biological-treatment exposure status rather than biological-treatment failure status. The committee heard that there was considerable overlap in the definitions, however, with 94% of patients in the UNIFI non-biologic failure subgroup having had no previous exposure to biological therapy. The committee concluded that UNIFI data are generalisable to the population who would be eligible to have treatment with ustekinumab in the NHS. It also concluded that the results demonstrated that ustekinumab is more effective than placebo at inducing and maintaining remission and response in all patients covered by the marketing authorisation.

Issues raised about UNIFI at technical engagement have been resolved and do not affect the interpretation of the trial results

3.5 The committee reviewed the following points raised in technical report issue 1:

  • The UNIFI clinical-response results reported in the company submission do not appear to match those in the New England Journal of Medicine (NEJM) trial report, published in September 2019.

  • It is not clear from the information in the company submission that blinding was maintained between induction week 8 and the maintenance phase, or that baseline characteristics of patients in the re-randomised groups were well balanced. Therefore, it is not possible to assess whether the study is at high risk of bias.

  • The results for placebo 'non-responders' who had 6 mg per kg ustekinumab intravenously at week 8 and were assessed at week 16 are not reported in the company submission.

    The committee considered a summary of the company's responses to these points, which consisted of further explanations and data. The committee agreed with the ERG that the company's response demonstrated that there are no important discrepancies between the company submission and the NEJM article, and that UNIFI is at low risk of bias. The committee considered new UNIFI data that the company provided for patients who had 6 mg per kg ustekinumab intravenously at week 8 and who were assessed at week 16. It agreed that the new data did not change the interpretation of the results for the ITT population in the induction study.

Indirect treatment comparisons

The exclusion of trials carried out in Asian countries from the network meta-analyses has little effect on the cost-effectiveness estimates

3.6 The company identified 5 trials from Asian countries in its systematic literature review. However, it decided to exclude these studies from the network meta-analyses (NMAs) that informed its cost-effectiveness analyses. The company tested the effect of its approach by doing sensitivity NMAs that included data from the Asian trials. The ERG identified some methodological problems with these sensitivity NMAs and feedback was sought on these points during technical engagement (see technical report issue 2). The company's response to technical engagement issue 2 resolved one, but not all, of the ERG's concerns about the sensitivity NMAs. The ERG explained that some of the company's inclusion and exclusion decisions about the sensitivity NMAs remained inappropriate. The ERG did, however, note that the Asian trials were relatively small. The overall effect on the results of the sensitivity NMAs was therefore likely to be low, and the main NMAs produced similar results to the sensitivity NMAs. The ERG concluded that excluding the Asian trials from the NMAs had little effect on the cost-effectiveness estimates. The committee noted that responses to technical engagement indicated that there was no clinical rationale for excluding the Asian trials, and that it would have been more appropriate for them to be included in the analyses that informed the economic model. Overall, it agreed with the ERG that this issue has little effect on the cost-effectiveness estimates and decided to further consider the NMAs that excluded the Asian trials.

The maintenance-phase NMAs are uncertain but provide more robust estimates of relative effectiveness than the company's unadjusted indirect treatment comparison

3.7 The committee agreed with the ERG that the company's induction-phase NMAs were methodologically robust and provided a suitable source of clinical data for the transition probabilities in the induction phase of the model. However, the committee noted that estimating the relative effectiveness of ustekinumab and its comparators in the maintenance phase by combining data from different trials was methodologically challenging, because of the lack of head-to-head trial data and differences in the trial designs. It was aware that the company had explored both the adjusted NMA and the unadjusted indirect comparison methods, and that the company's preference was to use the results of its unadjusted indirect treatment comparison (ITC) to inform the cost-effectiveness estimates. The committee was aware that the ERG considered the results of the company's unadjusted ITC to be unreliable and had therefore used results from the company's and its own NMAs to inform its exploratory analyses. The committee was aware that feedback had been sought at technical engagement (see technical report issues 4 and 5) to try to understand if any of the methods explored by the company and the ERG were more appropriate, or if other types of analyses should have been done. The committee reviewed the responses to the engagement issues 4 and 5 and noted the following:

  • No new data have been provided to support the assertion that heterogeneity in the placebo arms of the re-randomised maintenance-phase data is mainly caused by the continuing effects of induction treatment.

  • The company asserted that drug half-life is a cause of the continuing effects of induction treatment being observed during the maintenance phase. But evidence provided by a comparator company suggests there is no correlation between drug half-life and placebo-arm response rates.

  • The ERG and the company agreed that further analyses using existing data are unlikely to reduce the outstanding uncertainties.

    The clinical experts commented that multiple differences between the trials mean that they are not comparable. For example, the approaches to corticosteroid tapering varied. The committee considered the different approaches to combining the maintenance phase trial data. It agreed with the ERG that the unadjusted ITC methods preferred by the company are not recommended and the results of these analyses are not robust enough to inform decision making. At the first committee meeting the committee concluded that the company's 1‑year NMA conditional on response and the ERG's maintenance-only NMA both had limitations and the results were very uncertain. However, because no alternative data were available, the results provided the best available estimates of relative effectiveness. In response to the appraisal consultation document, the company updated its base case using its 1‑year NMA conditional on response, which the ERG had also used in its base case. The committee concluded that although the results of this NMA are highly uncertain, it was preferred to the ERG's maintenance-only NMA for providing estimates of relative effectiveness to inform the cost-effectiveness model.

The pooling of the standard and escalated-dose effects in the maintenance phase has little effect on the results

3.8 The committee noted that the company and the ERG had not agreed a preferred approach for pooling the efficacy effects of the standard and escalated doses during the maintenance phase (technical report issue 7). The committee concluded that this was a relatively minor issue compared with the other uncertainties in the maintenance analyses and did not have a major effect on decision making.

The company's economic model

The model is appropriate for decision making but additional health states in the model would have been preferable

3.9 The company estimated the cost effectiveness of ustekinumab using a model with a hybrid structure (the induction phase was modelled using a decision tree and the maintenance phase was modelled using a Markov structure). The company provided cost-effectiveness estimates for 2 subgroups defined by biological-treatment failure status, but not for the overall population. The committee noted that the ERG had used the same model for its base-case analyses, but with different assumptions including the proportions of patients experiencing response and remission after the failure of initial treatment. The committee considered the health state definitions, recalling the clinical experts' comments that many patients in the population of interest have chronically active disease that is controlled with the long-term use of corticosteroids. The committee noted that the company's 'active disease' health state definition (Mayo score between 6 and 12 points, 'remission or response without remission not achieved') did not necessarily apply to this group of patients. The company explained that the 'active disease' health state represents a mixed population of people with active ulcerative colitis that is controlled with corticosteroids and people with active ulcerative colitis who are experiencing an exacerbation. It is therefore difficult to identify the appropriate utility value for this health state in the model. The committee agreed that it would have preferred the inclusion of additional health states in the model to appropriately reflect the progression of the disease. The clinical experts commented that if patients taking long-term corticosteroids stop treatment, they are likely to start experiencing active disease again over time. 
On this basis, the committee concluded that although the model structure did not explicitly account for patients with disease that was being controlled through the long-term use of corticosteroids, and additional health states would have been more appropriate, the model could be used for decision making.
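The hybrid structure described in section 3.9 can be illustrated with a minimal sketch in Python. This is not the company's model. The 3 health states, the probabilities, the utility values and the function names are all hypothetical placeholders chosen for illustration, and the sketch omits costs, discounting, surgery states and treatment sequencing. It shows only the general idea: a decision tree allocates the cohort to health states at the end of induction, and a Markov structure then moves the cohort between states in each maintenance cycle while accumulating quality-adjusted life years (QALYs).

```python
# Illustrative sketch only: all states and numbers are hypothetical
# placeholders, not values from the appraisal or the company's model.

def induction_tree(p_remission, p_response):
    """Decision tree: split a cohort of 1.0 across states after induction."""
    p_active = 1.0 - p_remission - p_response
    return {"remission": p_remission, "response": p_response, "active": p_active}


def markov_cycle(dist, transitions):
    """Advance the cohort by one cycle using per-state transition probabilities."""
    new_dist = {state: 0.0 for state in dist}
    for source, share in dist.items():
        for target, prob in transitions[source].items():
            new_dist[target] += share * prob
    return new_dist


def run_model(p_remission, p_response, transitions, utilities, cycles, cycle_years):
    """Run induction then maintenance; return final distribution and undiscounted QALYs."""
    dist = induction_tree(p_remission, p_response)
    qalys = 0.0
    for _ in range(cycles):
        # QALYs accrued in this cycle: state occupancy weighted by utility
        qalys += sum(dist[s] * utilities[s] for s in dist) * cycle_years
        dist = markov_cycle(dist, transitions)
    return dist, qalys


# Hypothetical inputs (each row of transition probabilities sums to 1)
transitions = {
    "remission": {"remission": 0.90, "response": 0.05, "active": 0.05},
    "response": {"remission": 0.10, "response": 0.80, "active": 0.10},
    "active": {"remission": 0.00, "response": 0.00, "active": 1.00},
}
utilities = {"remission": 0.9, "response": 0.7, "active": 0.4}

final_dist, total_qalys = run_model(0.2, 0.3, transitions, utilities,
                                    cycles=1, cycle_years=1.0)
```

In this sketch the 'active' state is absorbing, which mirrors an assumption of 0% response and remission after treatment failure. A model with separate health states for corticosteroid-controlled disease, as the committee would have preferred, would add rows to the transition and utility tables.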

The assumption that 30% of patients have escalated doses of maintenance treatment is acceptable

3.10 The committee noted that, in response to technical engagement issue 6, the company had adjusted its base-case assumptions about dose escalation for patients having maintenance infliximab to reflect the ERG's preference. The ERG assumed that for all drugs included in the analysis, 30% of patients would have an escalated maintenance dose, even though the escalated dose for infliximab is not licensed in the UK. The committee noted that other responses to engagement indicated that off-label use of escalated-dose infliximab is common UK practice but that escalation rates vary between biological therapies. The clinical experts agreed that infliximab dose escalation is common practice but noted that the variation in escalation rates across treatments cited in the engagement responses was not realistic. The committee recognised that there was some uncertainty about this issue. It concluded that this was not a major driver of cost effectiveness, and it was willing to accept the company's revised assumption.

Response rates and remission rates are uncertain for patients with disease that does not respond or loses response to initial therapy

3.11 The committee noted that ulcerative colitis is not always a chronically active disease and many people with ulcerative colitis have ongoing periods of relapse and remission. The company and ERG base-case analyses used different assumptions for response rates and remission rates in patients whose disease did not respond or lost response to initial therapy. The committee noted that the responses to technical engagement issue 3 had not provided any additional clarity on this issue because the additional evidence provided by the company was of low quality. Comments from a patient organisation suggested that most patients continue to experience active disease until surgery or death, but this is not the same as assuming that no patients ever experience an improvement in symptoms. The company, the ERG and the clinical experts all acknowledged that there is limited evidence about the course of the disease after initial treatment failure. However, the clinical experts stated that for patients such as those in the UNIFI trial they would not expect many, if any, patients to experience an improvement in symptoms unless they were on corticosteroids. The committee considered that the ERG's assumption might be optimistic, but it agreed that there is likely to be a small number of people who improve without treatment. It concluded that it was not possible to estimate the rates of response and remission for patients with disease that did not respond or lost response to initial therapy, but that they were likely to be nearer the company's assumption of a 0% response rate.

Utility values in the economic model

The utility values are uncertain, and the choice of inputs has a large effect on the cost-effectiveness estimates

3.12 The committee noted that the company and the ERG both used the same utility values in their base cases, but that other sources of utility data are available. Their utility data came from a publication by Woehl et al. 2008. The committee noted that other utility values for remission, response without remission, and active ulcerative colitis health states based on EQ‑5D‑5L data were collected in UNIFI and therefore could have been used instead. It was also aware of other sources of utility data in this population (for example, Swinburn et al. 2012 and Vaizey et al. 2013), with values somewhere between the values from Woehl et al. and those collected in UNIFI, which had been used in scenario analyses in previous appraisals. The utility value for the 'active disease' state in Woehl et al. was considerably lower (0.41) than the equivalent value derived from the UNIFI EQ‑5D‑5L data. Because of this, the choice of utility data has a large effect on the cost-effectiveness estimates. The committee noted that the Woehl et al. 2008 data had been considered in all previous ulcerative colitis appraisals but that the reliability of the utility estimates had also been a source of controversy in all the previous appraisals. It noted that the Woehl et al. 2008 publication is only available as an abstract that includes little information about the study methodology or the characteristics of the patients it included. Therefore, it is difficult to assess whether the patients in Woehl et al. 2008 are representative of the population of interest and if the methodology is appropriate. The committee noted that the sample size of Woehl et al. 2008 is smaller than that in the UNIFI EQ‑5D analyses. It also noted that the ERG cited consistency with other appraisals as the only reason for choosing the Woehl et al. 2008 data over the UNIFI data, and that the ERG considered the UNIFI analyses to be well conducted. 
The committee acknowledged that there were limitations with the trial-based utility values. It noted that the UNIFI EQ‑5D data may be subject to placebo effects and that the length of time over which the data were collected was probably inadequate for estimating the real effect of the disease on health-related quality of life. The committee recalled the patient expert's description of the disease experience and decided that it was plausible that some of the effects on health-related quality of life (such as feeling out of control) might not have been captured in either the Woehl et al. 2008 or the UNIFI analyses. It agreed that all data sources (including Woehl et al. 2008, UNIFI, Swinburn et al. 2012 and Vaizey et al. 2013) had some strengths and some limitations, and it was not possible to determine which was most robust. The ERG explained that there was no basis to distinguish between the 3 published analyses of utility values in ulcerative colitis in terms of methodological or reporting quality, generalisability of the results or applicability to the current decision problem. The committee noted that the UNIFI analyses reported similar values to some of the published utility values in ulcerative colitis. It concluded that the UNIFI analyses were as appropriate as other available utility analyses and therefore considered both the UNIFI values and the Woehl et al. 2008 values in its decision making.

The economic model is unsuitable for modelling 'stopping rules'

3.13 The company proposed that it is appropriate to consider 'stopping rules' for ustekinumab, in line with NICE's guidance on infliximab, adalimumab and golimumab for ulcerative colitis and on vedolizumab for ulcerative colitis. Based on the company's updated base case, it presented analyses of 'stopping rules' with ustekinumab discontinuation at 1, 2, 3 or 5 years. The clinical experts explained that it is usual practice to review treatment every 12 months; if a person is in sustained remission it may be appropriate to stop treatment, but this is dependent on a variety of factors such as patient choice and the person's overall fitness. The clinical experts also explained that if the person relapsed following discontinuation, treatment would be restarted with either the same, or a different, treatment. The company confirmed that when people stop treatment in the economic model, they do not restart it. The committee agreed that the model does not reflect clinical practice, because it does not account for people who stop treatment but later relapse and restart treatment. The ERG highlighted that it is difficult to model 'stopping rules' and that these have not been modelled in economic analyses in previous ulcerative colitis appraisals, but they were considered qualitatively in those appraisals. The committee concluded that the model structure did not allow 'stopping rules' to be modelled to reflect clinical practice and therefore did not consider them further.

Cost-utility analysis should be used to determine cost effectiveness

3.14 In response to the appraisal consultation document, the company submitted an additional analysis comparing ustekinumab with vedolizumab. The ERG highlighted that this was not a full cost-comparison analysis as stated by the company. It also noted that this analysis did not account for the uncertainty in the clinical efficacy of ustekinumab compared with vedolizumab, which comes from the uncertainty in the maintenance-phase NMAs and the inability to scrutinise the methods of the trial that reports data for vedolizumab as a comparator. The committee agreed that the company's analysis was not appropriate for decision making. It concluded that it was appropriate to use a cost-utility analysis for decision making, and that irrelevant comparators should be excluded from the fully incremental analysis in order to obtain the relevant incremental cost-effectiveness ratios (ICERs).

Cost-effectiveness estimates

Ustekinumab is cost effective compared with vedolizumab in the cost-utility analyses

3.15 The committee considered the ERG's cost-effectiveness estimates, which incorporated the confidential comparator discounts. The committee noted that the ERG had presented a number of scenarios, which included the confidential patient access schemes and also the Commercial Medicines Unit prices for the comparators and for ustekinumab. Following consultation, the company's updated base case included the committee's preferred assumptions in the 1‑year conditional on response NMA (see section 3.7), assuming 0% response and remission rates for patients with disease that did not respond or lost response to initial therapy. The company also presented a scenario analysis using 1% response and remission rates for patients with disease that did not respond or lost response to initial therapy (see section 3.11). For both analyses the company used the Woehl et al. 2008 utility values. The ERG's analyses of the company's updated base case investigated further scenario analyses, using the various utility sources (Woehl et al. 2008, Swinburn et al. 2012, Vaizey et al. 2013 and UNIFI). The committee noted that the ICERs for ustekinumab compared with the lowest-cost TNF‑alpha inhibitor are above £30,000 per quality-adjusted life year (QALY) gained, when ustekinumab was not dominated (more expensive and less effective) or extendedly dominated (its ICER was higher than that of the next more effective option) by the other comparators for both the non-biologic failure and biologic failure groups. The committee accepted that TNF‑alpha inhibitors, conventional therapy and tofacitinib are not relevant comparators in this appraisal and agreed that vedolizumab is the only relevant comparator (see section 3.2). When ustekinumab is compared with vedolizumab, for all scenarios investigated and irrespective of the source of utilities, the ICERs are below £30,000 per QALY gained for both patient subgroups (failed conventional therapy with or without prior exposure to a biological). 
Therefore, despite the uncertainty around the maintenance NMA results and which utility value is most appropriate in this population, the committee agreed that ustekinumab is likely to be a cost-effective use of NHS resources in people who would otherwise have vedolizumab.
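The terms 'dominated' and 'extendedly dominated' used in section 3.15 can be illustrated with a minimal sketch of a fully incremental analysis in Python. This is a generic textbook-style procedure, not the ERG's or the company's analysis. The treatment names, costs and QALYs below are hypothetical placeholders (the actual figures are confidential), and the function name fully_incremental is an assumption made for this example. Irrelevant comparators would be left out of the input before the analysis is run.

```python
# Illustrative sketch only: options, costs and QALYs are hypothetical.

def fully_incremental(options):
    """options maps name -> (total cost, total QALYs).

    Returns [(name, icer)] for options on the efficiency frontier,
    ordered by cost; the cheapest option has an ICER of None.
    """
    # Order by cost; for equal cost, put the more effective option first
    ranked = sorted(options.items(), key=lambda kv: (kv[1][0], -kv[1][1]))

    # Dominance: drop any option that is no more effective than a cheaper one
    frontier = []
    for name, (cost, qalys) in ranked:
        if frontier and qalys <= frontier[-1][2]:
            continue
        frontier.append((name, cost, qalys))

    def icer(low, high):
        # Incremental cost divided by incremental QALYs
        return (high[1] - low[1]) / (high[2] - low[2])

    # Extended dominance: drop an option whose ICER is higher than
    # that of the next more effective option, then re-check
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            if icer(frontier[i - 1], frontier[i]) > icer(frontier[i], frontier[i + 1]):
                del frontier[i]
                changed = True
                break

    results = [(frontier[0][0], None)]
    for low, high in zip(frontier, frontier[1:]):
        results.append((high[0], icer(low, high)))
    return results


# Hypothetical example: C is dominated by B, and B is extendedly dominated by D
example = {
    "A": (10000.0, 5.0),
    "B": (20000.0, 5.5),
    "C": (25000.0, 5.4),
    "D": (30000.0, 6.5),
}
frontier_icers = fully_incremental(example)
```

In this hypothetical example only A and D remain on the frontier, so the single ICER reported is for D compared with A. That ICER would then be judged against the relevant threshold per QALY gained.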


  • National Institute for Health and Care Excellence (NICE)