4 Committee discussion

Clinical effectiveness

Knowing the likely course of the disease may help people with Crohn's disease and the NHS

4.1 The patient expert explained that having Crohn's disease can substantially affect the quality of life of the person and their family. Crohn's disease is a complex disease associated with symptoms that can be highly debilitating. Symptoms include abdominal pain, profound fatigue, weight loss and a constant urge to have a bowel movement, as well as extraintestinal manifestations, which can affect the joints, skin, bones, eyes, kidneys and liver. Recent research from the Secure Anonymised Information Linkage (SAIL) databank in Wales estimates that the prevalence of Crohn's disease is 1 in 271. Crohn's disease typically starts between the ages of 10 and 40, so most people face a lifetime of medication and repeated major surgery. Most are not eligible for help with the cost of prescriptions. Currently the extent of inflammation is monitored using endoscopic imaging and faecal calprotectin tests, but these do not predict disease progression or the likelihood of needing surgery in the future. People may not want invasive monitoring using colonoscopy because it is stressful to prepare for, has unpleasant side effects and may aggravate symptoms. The patient expert suggested that a test that predicts long-term disease course could help give people a better understanding and acceptance of their condition, and make planning review appointments more efficient. It could also help improve quality of life, reduce potential side effects from first-line treatments, allow more effective earlier drug treatment, and reduce demands on NHS services.

Studies on the prognostic ability of the tests are heterogeneous and have small sample sizes

4.2 The reviewed studies on prognostic ability had mixed populations, including people with ulcerative colitis. The number of people with Crohn's disease in each study was small, given the prevalence of the condition in the wider population. The committee noted that the small sample sizes could mean that the reviewed studies were underpowered to produce robust estimates of the prognostic ability of the tests. The committee also noted that there are other predictive studies for Crohn's disease with larger populations, showing that larger sample sizes are possible. The committee concluded that the heterogeneity in the population and the population size added substantial uncertainty to the interpretation of study results.

There is no standard definition of a high or low risk of a severe disease course

4.3 The reviewed studies used different measures to define a person as being at high or low risk of following a severe course of Crohn's disease. IBDX studies used poor outcomes, such as surgery and complications, as a proxy for a severe disease course (see sections 3.5 and 3.6), whereas the PredictSURE IBD study used the need for multiple treatment escalations (see section 3.7). This inconsistency is a source of additional uncertainty.

The accuracy of PredictSURE IBD and IBDX in predicting a severe disease course is uncertain

4.4 Little data was identified on the prognostic accuracy of the tests. Sensitivity, specificity and negative predictive value were only reported for the PredictSURE IBD test, and in only 1 study (Biasci 2019). The clinical expert said that, at the moment, severe disease course may be predicted by known risk factors such as age and smoking status. But there is no consensus on, or algorithm for, how these risk factors should be combined, and their predictive value is limited. The clinical expert also said that, based on the findings of the Biasci study, the PredictSURE IBD test appears to perform better than risk prediction based on clinical features or endoscopic findings, and therefore has the potential to be a useful test. The committee noted that it would help to understand if the tests can give a more accurate prognosis when used alongside clinical features rather than as a substitute. The committee concluded that overall, the evidence on the prognostic accuracy of PredictSURE IBD and IBDX is weak, and encouraged further research on their accuracy when used alongside clinical features (see section 5).
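As a reminder of what the accuracy measures named above represent, the following is a minimal sketch of how sensitivity, specificity and negative predictive value are derived from a 2x2 table of test result against observed disease course. The counts are hypothetical and chosen only for illustration; they are not taken from Biasci 2019 or any other study, and the names `ConfusionCounts` and `prognostic_accuracy` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 table: test result (high or low risk) against observed course."""
    true_pos: int   # classed high risk, followed a severe course
    false_pos: int  # classed high risk, did not follow a severe course
    false_neg: int  # classed low risk, followed a severe course
    true_neg: int   # classed low risk, did not follow a severe course


def prognostic_accuracy(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity and negative predictive value."""
    return {
        # proportion of severe-course cases the test classed as high risk
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        # proportion of non-severe cases the test classed as low risk
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        # proportion of low-risk results that were truly non-severe
        "npv": c.true_neg / (c.true_neg + c.false_neg),
    }


# Hypothetical counts, for illustration only
example = ConfusionCounts(true_pos=30, false_pos=10, false_neg=10, true_neg=50)
metrics = prognostic_accuracy(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With small samples, each of these proportions has a wide confidence interval, which is one reason estimates from a single small study are treated as uncertain.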

There is little evidence on how the tests affect treatment decisions

4.5 The proposed value of the tests is to categorise people with Crohn's disease according to their risk of following a severe disease course. People predicted to have a severe disease course could have top-down treatment, which may help control the disease early. This could lead to better outcomes, such as fewer flare-ups, prevent bowel damage and limit the need for surgery. The committee noted that there was currently no evidence on how the tests can help with decisions about personalised treatment plans. It concluded that it would help to have research on how the tests affect treatment decisions (see section 5). PROFILE, a randomised, multicentre, biomarker-stratified, open-label study, is ongoing in the UK, with results expected in 2022. This trial uses PredictSURE IBD to assign people to top-down or step-up treatment, and may help address this evidence gap.

There is no evidence on how the tests affect clinical outcomes

4.6 The committee considered that there was no evidence to show that using the prognostic tests to identify people at high risk of a severe disease course and help guide treatment improves clinical outcomes. The committee encouraged studies assessing how the tests affect clinical outcomes (see section 5).

Cost effectiveness

Corticosteroids are often used as a first-line treatment for adults with moderately active or severely active Crohn's disease

4.7 The committee noted that the treatment sequences modelled by the external assessment group (EAG) in the original base case may not reflect treatment in the NHS. Corticosteroid treatment was not included in the original model because all high-risk patients (in the top-down and step-up arms) were assumed to have initial induction treatment with corticosteroids before moving to the next treatment steps. Therefore, the impact of corticosteroid use would be the same in both arms. Because of this, the EAG excluded people in the Biasci study who did not escalate from steroids to immunomodulators or anti-tumour necrosis factor (TNF) treatment (see section 3.18). The committee heard that the recent consensus guidelines from the British Society of Gastroenterology (Lamb et al. 2020) recommended minimising steroid use because of toxicity and lack of efficacy, except in moderate to severe uncomplicated luminal Crohn's disease, for which systemic corticosteroids may have some benefit. However, the clinical experts said that corticosteroids are still used as a first-line treatment in adults with moderately active or severely active Crohn's disease, unless there is a contraindication. In a top-down treatment strategy, a shorter period of corticosteroids, or sometimes no corticosteroids, may be used before starting treatment with biologics. The committee considered a revised base-case model, which had treatment with a corticosteroid first in the step-up strategy. The committee concluded that adding the corticosteroid step better reflected current NHS practice.

Managing Crohn's disease is complex, and rapidly evolving because of new treatments and tests

4.8 Clinical experts explained that treatment of Crohn's disease varies across the NHS. Many treatments are already available and new drugs are entering the market. Treatments are often combined. For example, in the EAG model, 30% of people who had a TNF-alpha inhibitor, and 20% of people who had a biological treatment that was not an anti-TNF, also had an immunomodulator. This is because there is evidence that combination treatment reduces the chance of losing response to biologics because of immunogenicity. However, the clinical experts said there is no consensus on using monotherapy or combination therapy, and that it varies in clinical practice. Tests that monitor levels of biologics and the presence of antibodies to biologics are also being more widely used. These tests can guide a personalised treatment strategy to help maintain a treatment response for longer. The committee noted that an important study used to provide model inputs for the top-down and step-up treatment strategies (D'Haens et al. 2008) was over 10 years old. It used treatment strategies that do not reflect current practice in the NHS or the top-down treatment strategy that was included in the EAG model (see section 4.10). The committee concluded that variation in clinical practice, and the absence of more recent data comparing top-down and step-up strategies, make modelling of the treatment strategies difficult. This creates great uncertainty around the model structure.

It is uncertain what a top-down treatment strategy in the NHS would look like

4.9 Top-down treatment is not widely used in the NHS and so it is uncertain what the treatment pathway would look like. The clinical experts noted that de-escalation of biologics in Crohn's disease is often unsuccessful and therefore not often used because of a high risk of relapse. The company model included an immunomodulator step after biologics but the EAG base case did not (see section 4.14). The EAG explored this as a scenario analysis (see section 3.30). In addition, the biologics modelled as second and third line can also be used as first line. The committee noted that lack of clinical consensus about the top-down treatment strategy adds extra uncertainty into the model.

It is not certain if top-down treatment has clinical benefits over step-up treatment

4.10 Clinical experts explained that early rather than late treatment with biologics could improve outcomes for people likely to have a more severe disease course. The EAG noted that the evidence on the effectiveness of top-down (early treatment with biologics) compared with step-up (late treatment with biologics) in the model was from the D'Haens study. This showed that people who had early combined immunosuppression had a longer time to relapse than people who had conventional treatment. The hazard function (based on the assumption that time to relapse is a proxy for time to next treatment escalation) derived from D'Haens was applied only to the first step of the model (the anti-TNF compared with immunomodulator step). In the model, people in the top-down arm of the model remained on initial treatment for longer than those in the step-up arm of the model (anti-TNF compared with immunomodulator). This resulted in people in the first step of the top-down arm having a higher probability of having and maintaining remission, which is associated with lower costs and higher quality-adjusted life years (QALYs). Later treatment steps in both the top-down and step-up strategies were assumed to have the same time to treatment escalation as anti-TNF in the top-down arm. This assumption was made because there was no evidence either way. The early combined immunosuppression used in D'Haens differed from the top-down treatment sequence described by the clinical experts because people did not carry on having maintenance treatment with infliximab but were allowed infliximab as needed (see section 3.19). This might have underestimated the benefits of top-down treatment in the model. The EAG said that in the long term, top down may not have an advantage over step up because the 10‑year follow-up study of D'Haens (Hoekman 2018) showed no difference in hospitalisation, surgery and endoscopic remission between both strategies. 
The clinical experts considered that early treatment with biologics does make a difference, and that there is a trend towards using biologics earlier, but said that there is not much good-quality evidence generalisable to the NHS to support this. The EAG noted that the evidence available was heterogeneous. It noted that a Canadian Agency for Drugs and Technologies in Health review published in 2019 also found that it is not clear if early biologic therapy is more effective than conventional therapy for Crohn's disease in adults, because there are few studies and the ones that exist are heterogeneous. Registry data could have been useful. The committee concluded that more evidence is needed on the effectiveness of top-down compared with step-up strategies. This is because if there is no evidence of benefit, there is no clinical rationale for identifying people at high risk of a severe disease course and treating them using a top-down strategy.

Because of the lack of data and the need for many assumptions, the model results are not certain

4.11 The committee noted that interpreting the modelling was difficult because of the very weak data feeding into it. There was limited data on the prognostic accuracy of the tests (see section 4.4) and on the effectiveness of a top-down strategy compared with a step-up strategy (see section 4.10), and there was no information from studies on how these 2 steps would combine to affect clinical outcomes. The EAG explained that it had to make many assumptions to be able to link the evidence in the model. There was great variation in the results of the model. Because of the limited data and the assumptions that needed to be made, the cost effectiveness of the tests is highly uncertain.

The economic model lacks face validity because top-down treatment is associated with a QALY loss

4.12 In the EAG's base-case model, a standard care strategy of no testing and step-up treatment dominated the strategy of testing with PredictSURE IBD followed by top-down or step-up treatment depending on the test result. In the revised base case, in which a corticosteroid step was included in the step-up treatment strategy, the PredictSURE IBD arm was still dominated by the standard care arm. The clinical experts had previously explained that early treatment with biologics could improve outcomes for people with severe Crohn's disease (see section 4.10). The committee therefore considered that the base-case results from the economic model, which show that top-down treatment is associated with a QALY loss compared with step-up treatment, lack face validity. The EAG said that the QALY loss is because there are more treatment options in the step-up strategy, which has an immunomodulator step as the first-line treatment. The most recent revision of the model had both a corticosteroid step and an immunomodulator step at the start of the step-up strategy. In the model, the treatment steps are incorporated independently of each other because of a lack of evidence on response to the full treatment strategy or on how response to each treatment is correlated. The consequence of this is that people in the step-up treatment arm have the opportunity to respond, even if just temporarily, to corticosteroids and immunomodulators, and therefore take longer to exhaust all their treatment options. This results in people on step-up treatment spending more time in the health states of response or remission, and therefore gaining more QALYs compared with people in the top-down treatment arm. The committee concluded that the QALY difference in favour of step-up treatment is unexpected and might not be seen in a real-world setting.
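The mechanism the EAG described can be shown with a deliberately simplified sketch. It assumes that each treatment step contributes a mean time in response or remission independently of the other steps, so the expected time before all options are exhausted is the sum over steps. The step names follow the text, but every duration is hypothetical, and the EAG's model is more detailed and is not reproduced here.

```python
def expected_years_before_exhaustion(mean_years_per_step: dict) -> float:
    """Sum the mean time spent on each step, treating steps as independent."""
    return sum(mean_years_per_step.values())


# Hypothetical mean years in response or remission per step (illustration only)
step_up = {
    "corticosteroid": 0.5,
    "immunomodulator": 1.5,
    "anti-TNF": 3.0,
    "other biologic": 2.0,
}
top_down = {
    "anti-TNF": 4.0,  # assumed longer on first-line anti-TNF than in step-up
    "other biologic": 2.0,
}

years_step_up = expected_years_before_exhaustion(step_up)
years_top_down = expected_years_before_exhaustion(top_down)
print(f"step-up:  {years_step_up:.1f} years")
print(f"top-down: {years_top_down:.1f} years")
```

Under these assumed numbers, the extra early steps in the step-up sequence outweigh the longer time on first-line anti-TNF in the top-down sequence. If response to later treatments were correlated with response to earlier ones, the extra steps would add less, which is one reason the result might not hold in practice.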

Changing some of the key assumptions in the model leads to a QALY gain in the PredictSURE IBD arm, but ICERs are high

4.13 The EAG explained that some scenario and sensitivity analyses did result in a QALY gain for the PredictSURE IBD arm of the model. For example, in the one-way sensitivity analyses, using higher response and remission rates for biologics in the top-down treatment strategy, and using lower response and remission rates for biologics in the step-up treatment strategy, each resulted in a QALY gain for PredictSURE IBD. Other scenarios that resulted in a QALY gain for the PredictSURE IBD arm of the model were those in which:

  • after 2 years in remission with biologics, a proportion of people had mucosal healing and did not need more treatment escalations

  • some low-risk people were assumed to be misdiagnosed as high risk (see section 3.34) because they did not need any more treatment escalation

  • an additional immunomodulator step was included at the end of the top-down treatment strategy

  • it was assumed that all high-risk patients who have step-up treatment do not respond to an immunomodulator and therefore escalate to anti-TNF.

The incremental cost-effectiveness ratios (ICERs) from these sensitivity and scenario analyses were well above the range normally considered to be cost effective.
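For readers less familiar with the terms, the sketch below shows how dominance and an ICER are conventionally derived from the incremental costs and QALYs of a new strategy compared with standard care. All costs and QALYs are hypothetical, the function name is invented for this example, and the default threshold of 30,000 per QALY gained is an illustrative assumption rather than a figure taken from this section.

```python
def compare_with_standard_care(cost_new: float, qaly_new: float,
                               cost_std: float, qaly_std: float,
                               threshold: float = 30_000.0) -> str:
    """Classify a new strategy against standard care on costs and QALYs."""
    d_cost = cost_new - cost_std
    d_qaly = qaly_new - qaly_std
    if d_cost == 0 and d_qaly == 0:
        return "equivalent"
    if d_cost >= 0 and d_qaly <= 0:
        # no cheaper and no more effective, and worse on at least one
        return "dominated by standard care"
    if d_cost <= 0 and d_qaly >= 0:
        # no more costly and no less effective, and better on at least one
        return "dominates standard care"
    ratio = d_cost / d_qaly
    if d_qaly > 0:
        verdict = "within" if ratio <= threshold else "above"
        return f"ICER {ratio:,.0f} per QALY gained ({verdict} threshold)"
    # cheaper but less effective: the ratio is the saving per QALY lost
    return f"saves {ratio:,.0f} per QALY lost"


# Hypothetical base case: testing arm costs more and yields fewer QALYs
base_case = compare_with_standard_care(52_000, 14.90, 50_000, 15.00)
# Hypothetical scenario: a small QALY gain at an extra cost
scenario = compare_with_standard_care(52_000, 15.02, 50_000, 15.00)
print(base_case)
print(scenario)
```

A small QALY gain divided into a non-trivial extra cost gives a large ratio, which is how a scenario can show a QALY gain for the testing arm and still have an ICER well above the usual range.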

The EAG's model results are different from the company's model and the most relevant published economic model

4.14 The base-case probabilistic and deterministic results of the EAG's model produced QALYs in favour of standard care. This suggests that a no testing strategy with step-up treatment is better for people at high risk of a severe course of Crohn's disease than top-down treatment using the prognostic tool. This result was not consistent with the company's model and the model reported by Marchetti (2013), both of which reported that a top-down strategy is associated with more QALYs. The EAG noted that the difference between its model and the company's was that the treatment sequence modelled by the company had an immunomodulator as a last treatment step in the top-down arm. This was not modelled in the EAG's base case but as a scenario analysis. This scenario produced an ICER in favour of top-down treatment that was much higher than what NICE normally considers a cost-effective use of NHS resources (see section 3.30). The company's model also assumed a constant relative treatment effect, capped at 10 years, whereas the EAG's model assumed a diminishing relative treatment effect (see further details in the addendum to the diagnostic assessment report). Marchetti modelled a different treatment sequence (see section 3.11) to the EAG's, and a different time horizon (5 years compared with the EAG's 65 years). The EAG explored changing the time horizon in its model to 5 years, which showed a small QALY gain for the PredictSURE IBD arm (see section 3.37). The difference in the results was likely because of the uncertainties in the top-down treatment pathway and the effectiveness of top-down compared with step-up strategies.

Assuming that IBDX and PredictSURE IBD have the same prognostic ability is not appropriate

4.15 Only data on the prognostic ability of PredictSURE IBD was included in the base case. The EAG included IBDX in an exploratory analysis that assumed that the ability of IBDX to identify people at high or low risk was the same as PredictSURE IBD. The committee heard that the tools identify different markers and need different test samples. The committee also noted that there was 1 abstract (Lyons 2020), which compared both tools and showed that PredictSURE IBD predicted a shorter time to treatment escalation in people classed as high risk. IBDX did not predict a difference in time to treatment escalation between people positive for 2 or more markers and those positive for only 1 marker (see section 3.8). The committee concluded it was not appropriate to assume the tests had the same prognostic accuracy, and that more evidence is needed (see section 5).

Evidence from a different starting cohort that includes children and teenagers would be useful

4.16 The committee heard that the average age in the EAG's model was 35. It considered that the model might not reflect other age groups in which Crohn's disease is first diagnosed. For example, one peak is in teenagers and another is at around 60. A clinical expert noted that the treatment pathway for children or teenagers would be different from that for adults because children often follow a more severe disease course and may need enteral nutrition. The committee heard that modelling this population may need an entirely new model rather than an adaptation of the model built by the EAG for the adult population.

Modelling adverse events or varying the cost of surgery may not have a huge impact on the results

4.17 The EAG did not model adverse events, to keep the model simple. It predicted that, if it had modelled adverse events, top-down treatment would have been dominated by an even wider margin. The committee thought the cost of surgery might have been underestimated and that its impact on the model results was not clear. The EAG noted that, although it did not vary the costs of surgery, the number of surgical events modelled was very small, so it did not anticipate a significant difference in results.

  • National Institute for Health and Care Excellence (NICE)