3 Committee discussion
The appraisal committee considered evidence submitted by Jazz Pharmaceuticals, a review of this submission by the evidence review group (ERG), NICE's technical report, and responses from stakeholders. See the committee papers for full details of the evidence.
The appraisal committee was aware that 1 issue was resolved during the technical engagement stage. It agreed that a subgroup of people with a baseline Epworth Sleepiness Scale (ESS) score of more than 12 should be used in the modelling (see the technical report, issue 2).
The committee recognised that there were remaining areas of uncertainty associated with the analyses presented and took these into account in its decision making. It discussed the following issues: primary therapy adherence, treatment response, adjustment for the improvement in the control arm, health utility values, partner utilities, treatment discontinuation, adverse events and dosing splits (see the technical report, issues 1 to 9), which were outstanding after the technical engagement stage.
Excessive daytime sleepiness caused by obstructive sleep apnoea affects quality of life, but it is uncertain by how much
3.1 The patient expert explained that obstructive sleep apnoea can negatively affect people's physical and mental wellbeing. Because of excessive daytime sleepiness, aspects of daily life such as education, employment, maintaining a social life and the ability to drive, are all negatively affected. Symptoms of sleep apnoea such as snoring can disrupt partners' sleep, which may affect their quality of life as well. The clinical experts agreed that excessive daytime sleepiness negatively affects people's quality of life, but said it can be difficult to measure by how much. They also noted that obstructive sleep apnoea can be associated with an increased risk of high blood pressure or stroke. The committee concluded that excessive daytime sleepiness caused by obstructive sleep apnoea affects quality of life. However, it was uncertain how much reducing excessive daytime sleepiness would improve quality of life.
3.2 The clinical experts said that most people with excessive daytime sleepiness caused by obstructive sleep apnoea are referred to sleep clinics. Initial treatment includes lifestyle advice about weight loss. Mandibular devices are considered for people with mild symptomatic obstructive sleep apnoea. NICE's technology appraisal guidance on continuous positive airway pressure (CPAP) for obstructive sleep apnoea/hypopnoea syndrome (from now, TA139) recommends CPAP for adults with moderate or severe obstructive sleep apnoea. The patient expert explained that CPAP is usually well tolerated but is associated with some inconvenience or discomfort. Wearing a face mask connected to the CPAP machine can also restrict sleeping. The clinical experts also said that some people cannot tolerate CPAP because they can feel claustrophobic wearing a mask, which can be exacerbated by certain mental health issues. People with neurodegenerative conditions may also not tolerate CPAP. The clinical and patient experts said that some people using CPAP will still have residual excessive daytime sleepiness. They noted that solriamfetol would be welcomed as another potential treatment option for this group. The committee concluded CPAP is an appropriate comparator. But some people cannot tolerate it, so a comparison with standard care without a primary obstructive sleep apnoea therapy was also important.
3.3 Obstructive sleep apnoea is currently treated in sleep services commissioned by the relevant clinical commissioning group. The clinical experts noted that, if solriamfetol was recommended, the likely requirement for more monitoring of adherence to CPAP (see section 3.5) could put pressure on these services. In its evidence submission and economic model, the company assumed that solriamfetol would be administered in specialist sleep services only. The committee asked the clinical experts if there was a possibility that solriamfetol could be prescribed in primary care. The experts suggested that treatment would have to be started in the specialist sleep clinics but were uncertain if longer-term prescribing could move to primary care. In response to consultation, the company provided information from clinicians and pharmacists to support its claim that solriamfetol would be limited to secondary care. It noted that solriamfetol has a black triangle in its marketing authorisation, meaning additional monitoring is needed. This would likely severely limit the use of solriamfetol in primary care. The committee concluded that solriamfetol is likely to be limited to secondary care.
3.4 The main clinical-effectiveness evidence for solriamfetol came from TONES 3. This was a 12‑week, randomised, double-blind, placebo-controlled, multicentre trial. The intervention was solriamfetol in doses of 37.5 mg, 75 mg and 150 mg (the trial also included an unlicensed 300 mg dose). In both the intervention and comparator groups, around 70% of patients were using a primary obstructive sleep apnoea therapy, defined as either a prior effective surgical intervention or CPAP, at the start of the trial. These people were classified as adherent. The co-primary outcomes of the trial were change in ESS score and change in Maintenance of Wakefulness Test (MWT) result from baseline to week 12. The results showed a significant change in both ESS score and MWT from baseline to week 12 across all 3 licensed solriamfetol doses. The committee concluded that solriamfetol reduces excessive daytime sleepiness.
Solriamfetol is unlikely to affect adherence to a primary obstructive sleep apnoea therapy like CPAP
3.5 The patient expert and ERG said that some people with excessive daytime sleepiness may prefer to manage their symptoms with medicine instead of a primary therapy such as CPAP. This could lead to them using CPAP less and so a reduction in the combined benefits of CPAP and solriamfetol. The company included patient adherence to a primary therapy in its 3 trials (TONES 3, TONES 4 and TONES 5) as an exploratory end point. It also provided results from a peer-reviewed paper by Schweitzer et al. (2021), which showed no effect on primary therapy adherence in TONES 5 from baseline up to week 40. TONES 5 was an open-label trial assessing solriamfetol's long-term safety and efficacy for up to 52 weeks, and included patients who had completed another solriamfetol trial (including TONES 3). It included a 2‑week placebo-controlled randomised withdrawal phase. The ERG noted that the results of Schweitzer et al. were highly uncertain because of missing data and poor reporting. It said that the estimates were not reported separately for people classified as adherent or non-adherent at baseline. The clinical experts said that most sleep clinics can monitor CPAP machines remotely and that some people, such as heavy goods vehicle drivers, have their CPAP use monitored remotely regularly. The clinical experts acknowledged that, although people having solriamfetol alongside a primary therapy such as CPAP would have their use monitored, it may have to be more frequent. The committee noted the uncertainty in the TONES 5 data on adherence. It would have preferred to see more sensitivity analyses of the impact of missing data in Schweitzer et al., and a subgroup analysis stratified by adherence at baseline. In response to consultation, the company provided additional sensitivity analyses, which showed that people having solriamfetol would meet the standard definition of adherence, even in a 'worst case' scenario with respect to the missing data from Schweitzer et al. 
In these analyses, adherence was defined as CPAP use of 4 hours or more on 70% of nights. The committee concluded that adherence to a primary therapy like CPAP is unlikely to be affected by treatment with solriamfetol.
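The adherence definition used in these analyses can be expressed as a simple check. The following is a minimal sketch, not the company's analysis code: the function name and the sample usage record are hypothetical, and it assumes that usage is recorded in hours per night, with nights of no use recorded as 0.

```python
from typing import Sequence


def is_adherent(nightly_hours: Sequence[float],
                min_hours: float = 4.0,
                min_fraction: float = 0.70) -> bool:
    """Return True if CPAP was used for at least `min_hours` on at least
    `min_fraction` of the recorded nights."""
    if not nightly_hours:
        raise ValueError("at least one night of usage data is required")
    nights_meeting_threshold = sum(1 for h in nightly_hours if h >= min_hours)
    return nights_meeting_threshold / len(nightly_hours) >= min_fraction


# Hypothetical 10-night record: 7 nights reach 4 hours or more.
example_usage = [6.5, 5.0, 4.0, 0.0, 7.2, 3.5, 4.8, 6.0, 2.0, 5.5]
print(is_adherent(example_usage))
```

How missing nights are handled (dropped, or counted as no use) changes the result, which is why sensitivity analyses of missing data matter for this outcome.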
3.6 TONES 3 included people who adhered to a primary obstructive sleep apnoea therapy (standard care) and people who did not (see section 3.4). In the economic model presented at the first committee meeting, the company assumed that everyone entering the model had either solriamfetol with standard care (for example, CPAP) or standard care without solriamfetol. It presented a cost-effectiveness scenario analysis that included people from TONES 3 who did not adhere to standard care. The ERG noted that the baseline ESS score was worse in the non-adherent group than in the adherent group. This meant that the improvement in ESS score with solriamfetol treatment was greater in the non-adherent group, resulting in a lower incremental cost-effectiveness ratio (ICER) if the non-adherent group data was used. The committee felt that the company had not properly explained its methods for modelling the non-adherent group, or why people were not using their primary therapy. The committee recalled that the marketing authorisation for solriamfetol includes people who previously used a primary therapy but stopped. The committee also recalled that people with mental health or neurodegenerative conditions may struggle to use CPAP regularly (see section 3.2). It considered that recommendations restricting solriamfetol to use with CPAP could discriminate against this group. The committee concluded that it would like to see clinical- and cost-effectiveness evidence for people who were not using a primary therapy. In response to consultation, the company provided clinical- and cost-effectiveness evidence for people who were not using a primary therapy at baseline in TONES 3. The committee noted that the cost-effectiveness results were similar for people using a primary therapy at baseline and for people who were not. It concluded that the company's model of solriamfetol with and without standard care was suitable for decision making.
3.7 The clinical experts said that the definition of treatment response for obstructive sleep apnoea varies considerably in clinical practice. The company used the ESS in TONES 3 as a component of the co-primary end point (see section 3.4). The model it presented at the first committee meeting defined treatment response as an ESS score reduction of 3 or more points, based on clinical opinion. Advice to the ERG was that an ESS score reduction of 2 or more points was appropriate but clinicians would consider other factors when assessing treatment effectiveness. The clinical experts said that, although an ESS score reduction of 2 or more points may be appropriate, there is no consensus on what reduction can be considered clinically relevant and that it varies by individual. The ERG tested the ESS score reduction threshold in a scenario analysis, which showed that changing the threshold did not significantly affect the cost-effectiveness results. The committee acknowledged the uncertainty about the ESS but concluded that an ESS score reduction of 2 or more points was an appropriate criterion for treatment response. In response to consultation, the company updated its model to define treatment response as an ESS score reduction of 2 or more points. The committee accepted this for decision making.
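The response criterion, and the kind of threshold scenario the ERG tested, can be illustrated as follows. This is a hedged sketch only: the function names and patient scores are hypothetical and are not taken from TONES 3 or from the company's model.

```python
def ess_reduction(baseline_ess: int, follow_up_ess: int) -> int:
    """Points by which the ESS score fell (positive means less sleepiness)."""
    return baseline_ess - follow_up_ess


def is_responder(baseline_ess: int, follow_up_ess: int, threshold: int = 2) -> bool:
    """Treatment response: an ESS score reduction of `threshold` or more points."""
    return ess_reduction(baseline_ess, follow_up_ess) >= threshold


# Hypothetical (baseline, week 12) ESS scores for four patients.
patients = [(16, 13), (15, 14), (18, 16), (14, 15)]

# Compare the 2-point and 3-point definitions discussed above.
for threshold in (2, 3):
    responders = sum(is_responder(b, f, threshold) for b, f in patients)
    print(f"threshold {threshold}: {responders} of {len(patients)} respond")
```

Keeping the threshold as a parameter makes it straightforward to rerun a model under each definition.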
The company's Hawthorne effect scenario is an acceptable approach to account for the improvement in the control arm
3.8 In TONES 3, ESS score improved from baseline to week 12 in the control arm (placebo plus standard care). The company suggested that this was likely to be a 'true placebo' effect – that is, the effect would not continue in the real world for standard care plus placebo. However, the company acknowledged that the improvement in the TONES 3 control arm may also be because of observation bias – the Hawthorne effect (that is, patients reported an improvement in ESS score because they were being observed). Under this assumption, the size of treatment effect for both arms would be lower in the real world, but the relative difference between the arms would be maintained. The company adjusted for the Hawthorne effect by removing the improvement in ESS score observed in the control arm from both the standard care and solriamfetol with standard care groups in its model. However, the ERG considered that some of the improvement in the TONES 3 control arm could be because of regression to the mean. This is a tendency for extreme values to move closer to the mean when measures are repeated over time. The ERG preferred to use the raw unadjusted trial data for both the standard care and solriamfetol with standard care groups in the model, which it considered would reflect outcomes in clinical practice. During technical engagement and in response to consultation, the company presented evidence to suggest that the improvement in the TONES 3 control arm was unlikely to be because of regression to the mean. This included evidence from people transitioning from TONES 3 to TONES 5. Those who were already having solriamfetol showed a greater improvement in ESS score when treatment with solriamfetol was unblinded. The company also noted that the speed of improvement in the TONES 3 control arm was too fast to be regression to the mean, and that the baseline ESS scores in TONES 4 and TONES 5 were similar. 
This meant that neither baseline was a temporary extreme value, as would be expected with regression to the mean. The committee acknowledged that there may be some regression to the mean. In response to consultation, the company did sensitivity analyses, varying the relative contribution of each of the 3 potential mechanisms for the improvement in the TONES 3 control arm (regression to the mean, Hawthorne effect and true placebo). The company considered that assuming the control arm improvement was solely because of the Hawthorne effect was conservative. This is because the true placebo effect may also be relevant, given the possible psychological benefit of having placebo in the trial, which may be generalisable to routine practice. The ERG explained that there was not enough evidence to decide which mechanism was most relevant. It highlighted that there was uncertainty in the company's regression to the mean analyses. This was because the company assumed that people having solriamfetol whose symptoms did not respond to treatment had the same mean ESS score as the pooled standard care arm. The ERG was also concerned that attributing the control arm improvement to the true placebo effect would mean that the NHS would be paying for the benefit of placebo, if solriamfetol was recommended on this basis. The committee considered that the company's adjustment for the Hawthorne effect in its model was plausible. It felt that it was unlikely that regression to the mean was a major cause of the improvement in the TONES 3 control arm. It agreed with the ERG's concern about the true placebo effect. The committee concluded that it was reasonable to consider the 100% Hawthorne effect scenario in its decision making.
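The arithmetic of the Hawthorne adjustment can be sketched as below. This is an illustration under stated assumptions, not the company's model: the function name, the `hawthorne_share` parameter and the ESS changes are all hypothetical, and the sketch covers only the share attributed to the Hawthorne effect, not how regression to the mean or a true placebo effect would be handled.

```python
def hawthorne_adjust(control_change: float,
                     treatment_change: float,
                     hawthorne_share: float = 1.0) -> tuple[float, float]:
    """Remove the part of the control arm ESS change attributed to the
    Hawthorne effect from BOTH arms (changes are negative for improvement)."""
    if not 0.0 <= hawthorne_share <= 1.0:
        raise ValueError("hawthorne_share must be between 0 and 1")
    removed = hawthorne_share * control_change
    return control_change - removed, treatment_change - removed


# Hypothetical mean ESS changes from baseline to week 12.
control, treatment = -3.0, -7.5

adj_control, adj_treatment = hawthorne_adjust(control, treatment)  # 100% scenario
print(adj_control, adj_treatment)

# The absolute effect in each arm shrinks, but the difference between
# the arms is unchanged by the adjustment.
print(treatment - control, adj_treatment - adj_control)
```

Because the same amount is subtracted from both arms, the between-arm difference is preserved for any value of the share.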
EQ-5D may have limitations in assessing quality of life for people with excessive daytime sleepiness
3.9 EQ‑5D data from TONES 3 showed that people having solriamfetol had no improvement in quality of life from baseline to week 12. The clinical experts explained it is likely that the reduction in ESS score in TONES 3 would have some impact on quality of life, but it is difficult to determine the extent of improvement using standard quality-of-life measures such as the EQ‑5D. Higher ESS scores mean more excessive daytime sleepiness. The company explained that the EQ‑5D is insensitive to changes in quality of life for people with excessive daytime sleepiness caused by obstructive sleep apnoea. This is because it does not include a sleep domain and is unable to measure the impact of obstructive sleep apnoea on interpersonal relationships. The company suggested that the EQ‑5D data collected in TONES 3 did not accurately reflect the substantial quality-of-life burden of the disease. It also noted that the EQ‑5D results were inconsistent with the other TONES 3 outcome measures. The committee concluded that the EQ‑5D may have limitations in assessing quality of life for people with excessive daytime sleepiness.
3.10 The company used a mapping algorithm to estimate EQ‑5D values based on ESS scores using data from the National Health and Wellness Survey (NHWS). The ERG considered that the company's mapping approach using NHWS was appropriate given the lack of alternative data. However, the committee was concerned that, if EQ‑5D is truly insensitive to changes in quality of life for people with this condition (see section 3.9), then mapping ESS scores to the EQ‑5D would not be appropriate and an alternative quality-of-life measure should be used. In response to consultation, the company said that it did not consider that the EQ‑5D or SF‑36 data collected in the TONES trials would accurately reflect the burden of obstructive sleep apnoea on quality of life. It continued to use the NHWS mapping algorithm as its base case and did not provide alternative SF‑6D utilities. In its second meeting, the committee noted several concerns with the company's NHWS mapping approach. It understood that the NHWS data was collected online from people who self-reported experience of obstructive sleep apnoea, narcolepsy, or both, rather than people who had necessarily been formally diagnosed. This may limit how relevant the NHWS data was to NHS clinical practice. The design of the NHWS also did not allow analysis of changes in ESS score or EQ‑5D over time, which may have given a more reliable measure of how change in ESS score predicts change in utility. The ERG highlighted that the NHWS algorithm may have omitted important predictive variables relating to quality of life. The committee was aware that similar mapping algorithms, based on cross-sectional data that did not map change scores, have been used in NICE's technology appraisal guidance on nivolumab for advanced squamous non-small-cell lung cancer after chemotherapy and benralizumab for treating severe eosinophilic asthma. But it noted that, in the current appraisal, trial data and other sources for the mapping were available. 
The company said that the NHWS mapping algorithm used best methodological practice and that it should be considered as the base case. Despite this, the committee concluded that although the NHWS mapping approach might have advantages, it preferred a mapping based on the McDaid algorithm (see section 3.11) because it is likely to be less biased.
3.11 The company provided a scenario using the mapping algorithm from TA139, reported in McDaid et al. (2009). This used individual patient data measured both before and after treatment from 3 studies of people with sleep apnoea who attended sleep clinics. The committee acknowledged uncertainty because the McDaid algorithm was based on a smaller sample size than the NHWS. However, it noted that McDaid had been accepted by the committee in TA139, and that McDaid did not share some of the limitations of the NHWS data (see section 3.10). But the committee agreed that mapping should be considered a second-best option compared with using the available trial data. The company provided analyses suggesting that the EQ‑5D data from TONES 3 was inappropriate to use because there was a 'ceiling effect'. A large proportion of patients in TONES 3 had a baseline utility of 1 (the maximum utility value). This meant there was minimal room for utility scores to improve during the trial. The company also provided an analysis simulating what the utility gain might have been had different baseline utility values (from other CPAP studies) been used. The ERG had concerns about the rationale for the ceiling effect because it was unclear why other CPAP studies that also used EQ‑5D would not have had a similar ceiling effect. It noted that the baseline utilities in the studies provided by the company were also from a population who had not had CPAP and so did not necessarily align with the population for this appraisal. The company showed research (Feng et al. 2021) suggesting that EQ‑5D has a large ceiling effect and does not include aspects of quality of life such as energy and wellbeing. It argued that studies in people who had not had CPAP were more appropriate, because symptoms would not have been satisfactorily managed by CPAP in people who would have solriamfetol. 
The company also noted that the baseline EQ‑5D values in TONES 3 did not reflect the high baseline ESS scores, and that the trial was not long enough to capture changes in quality of life. The committee considered that the McDaid mapping would have some of the same issues as the EQ‑5D data from TONES 3, because it still used the EQ‑5D. It felt that it had not been presented with enough evidence to disregard the EQ‑5D data from TONES 3. So the committee considered that evidence directly from TONES 3 was a relevant source for consideration, despite uncertainty about the utility gain associated with ESS and the general limitations of using EQ‑5D. The company preferred assigning utility values based on both response status and treatment group because it considered that patients who had placebo in the trial would not be considered to 'respond' in practice. The committee did not agree with this approach, because there was no evidence provided for a treatment-related difference in quality of life that was not associated with ESS score. It agreed that health state utility values based on response status and independent of treatment group were preferred. The committee concluded that the quality-of-life benefits for solriamfetol from TONES 3 or from the utilities mapped using McDaid were the most acceptable sources for consideration in its decision making.
3.12 The committee considered that both TONES 3 and the McDaid algorithm provided equally plausible estimates of how much reducing excessive daytime sleepiness improves quality of life. It noted that the utility estimates differed widely, resulting in considerably different cost-effectiveness estimates depending on which utilities were used. The ERG explored 2 methods for averaging the TONES 3 and McDaid utilities. This would mean that reducing excessive daytime sleepiness would give some quality-of-life improvement in the model, though not as little as TONES 3 suggested or as much as McDaid alone suggested. The committee considered this approach would be the most appropriate way of producing an ICER for its decision making. The first method averaged the EQ‑5D utilities directly from TONES 3 with the utilities from McDaid. This approach assumed no relationship between ESS score and EQ‑5D in TONES 3. The second method averaged the coefficient of change in ESS score and change in EQ‑5D from TONES 3 and McDaid. The company considered that the methods used by the ERG were unconventional and lacked transparency. The company was also unclear why the 2 utility sources had been weighted equally in the ERG's analysis. The committee was not convinced there was enough evidence to prefer 1 source of utilities over the other. It recognised that both methods used novel techniques to determine a utility value between 2 different types of evidence. It noted that both methods had limitations, but the first method was preferable because it took into account any differences in the models used to calculate the utility values (such as covariates). The committee concluded that using the first method to average the utility values from TONES 3 and McDaid was appropriate.
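The first averaging method can be pictured as an equally weighted mean of the health state utilities from the 2 sources. The sketch below is an assumption about the general form of such a calculation, not the ERG's actual analysis: the function name, the health state labels and all utility values are hypothetical.

```python
def average_utilities(source_a: dict[str, float],
                      source_b: dict[str, float],
                      weight_a: float = 0.5) -> dict[str, float]:
    """Weighted mean of health state utilities from two sources.
    weight_a = 0.5 gives both sources equal weight."""
    if source_a.keys() != source_b.keys():
        raise ValueError("both sources must define the same health states")
    return {state: weight_a * source_a[state] + (1 - weight_a) * source_b[state]
            for state in source_a}


# Hypothetical utilities by response status (not trial values).
trial_utilities = {"responder": 0.82, "non_responder": 0.80}
mapped_utilities = {"responder": 0.86, "non_responder": 0.78}

averaged = average_utilities(trial_utilities, mapped_utilities)
gain = averaged["responder"] - averaged["non_responder"]
print(averaged, round(gain, 3))
```

With equal weights, the utility gain for responders falls midway between the gains implied by each source alone. The `weight_a` parameter shows where a different weighting of the 2 sources would enter.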
Partner utility values are an important consideration but there is not enough evidence to include them in the modelling
3.13 Paragraph 5.1.7 on perspective in NICE's guide to the methods of technology appraisal notes that the perspective on outcomes should include all direct health effects, whether for patients or for other people. The company included partner utility values as a scenario in its modelling. This was because of the substantial impact that symptoms of obstructive sleep apnoea and its treatment can have on partners. The clinical expert agreed that partner utility values should be considered because of the substantial impact on family members (see section 3.1). But the ERG was concerned about the methods the company used to estimate partner utility values because the time trade-off utility estimates may not be comparable to those from the EQ‑5D. In its first meeting, the committee considered that partner utility values are important, but it had not been presented with enough evidence to support their inclusion in the modelling. In response to consultation, the company did not provide additional evidence to support including partner utilities. So, the committee did not change its earlier conclusion.
3.14 The company model presented at the first committee meeting did not include any costs for serious adverse events. The company said this was because most adverse events in TONES 3 were mild or moderate in severity. For adverse events that led to treatment discontinuation, the company model included the cost of 1 GP consultation. The ERG highlighted that some of the serious adverse events related to solriamfetol in the 150 mg arm of TONES 5 led to hospitalisation. This included 1 stroke. The company argued that including the cost of stroke would not be appropriate because it can occur in the target patient population in the 'real world'. In its base case, the ERG included hospitalisation costs for serious adverse events in patients taking solriamfetol (including stroke). In response to consultation, the company provided scenario analyses using different hospitalisation costs, based on hospitalisation rates. Its revised base case used annualised hospitalisation rates from TONES 3 for both treatment arms. The ERG highlighted that the company base case included a higher rate of hospitalisation with standard care than with solriamfetol, which the ERG considered implausible. The ERG preferred to use hospitalisation rates from TONES 5. The committee was presented with 3 other scenarios:
Scenario 1: hospitalisation rates from TONES 5 irrespective of relationship to solriamfetol for the solriamfetol arm, with no hospitalisation costs for the standard care arm.
Scenario 2: hospitalisation rates from TONES 5 irrespective of relationship to solriamfetol for the solriamfetol arm, and hospitalisation rates based on Hospital Episode Statistics data for the standard care arm.
Scenario 3: treatment-related hospitalisation rates from TONES 5 for the solriamfetol arm, with no hospitalisation costs for the standard care arm.
The committee agreed with the ERG that the company base case was implausible. It considered that in scenarios 1 and 2, the hospitalisation rates for the solriamfetol arm were implausibly high. So, the committee concluded that the hospitalisation costs presented in scenario 3 were acceptable for decision making (the exact hospitalisation rates are commercial in confidence and cannot be reported here).
3.15 Solriamfetol is available in different doses, which vary in cost and effectiveness. The clinical experts explained that it is difficult to estimate the most likely dose split in NHS clinical practice. Results for the different solriamfetol doses were weighted, based on dose-splitting assumptions, to inform cost-effectiveness comparisons between solriamfetol and standard care. In the company's base case, it was assumed that the dose splits were 40%, 40% and 20% respectively for the 37.5 mg, 75 mg and 150 mg doses of solriamfetol. The ERG noted that this dose split was different to that reported in a US study of prescribing data, in which a greater proportion of patients had the 75 mg dose (the figures are commercial in confidence and cannot be reported here). The ERG preferred to use the dose split based on the US prescribing data, in the absence of UK-specific data on solriamfetol prescribing patterns. In response to consultation, the company updated its base case to include the ERG's preferred dose split. The ERG highlighted that the cost-effectiveness conclusions were not sensitive to dose-split assumptions. The committee concluded that the dose split based on US prescribing data was acceptable for decision making.
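Weighting per-dose results by a dose split is a weighted average. The sketch below uses the 40%, 40% and 20% split from the company's original base case. The function name and the per-dose ESS changes are hypothetical and are not TONES 3 results.

```python
import math


def dose_weighted(values_by_dose: dict[float, float],
                  split_by_dose: dict[float, float]) -> float:
    """Weighted average of a per-dose quantity (cost, ESS change and so on)
    using the proportion of patients assumed to have each dose."""
    if values_by_dose.keys() != split_by_dose.keys():
        raise ValueError("values and split must cover the same doses")
    if not math.isclose(sum(split_by_dose.values()), 1.0):
        raise ValueError("dose split proportions must sum to 1")
    return sum(split_by_dose[d] * values_by_dose[d] for d in values_by_dose)


# Dose split from the company's original base case (doses in mg).
company_split = {37.5: 0.40, 75.0: 0.40, 150.0: 0.20}

# Hypothetical mean ESS change for each dose.
ess_change = {37.5: -5.0, 75.0: -6.0, 150.0: -8.0}

print(round(dose_weighted(ess_change, company_split), 2))
```

Swapping in a different split, such as one based on prescribing data, only requires changing the `company_split` dictionary.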
Because of the uncertainty, an acceptable ICER is at the lower end of what NICE normally considers an acceptable use of NHS resources
3.16 NICE's guide to the methods of technology appraisal notes that judgements about the acceptability of a technology as an effective use of NHS resources will take into account the degree of certainty around the ICER. The committee will be more cautious about recommending a technology if it is less certain about the ICERs presented. The committee noted the high level of uncertainty, particularly around:
the adjustment for the improvement in the control arm (see section 3.8)
how the quality-of-life benefit of solriamfetol was measured (see sections 3.9 to 3.12).
So, the committee agreed that, because of the high level of uncertainty in the analyses, an acceptable ICER would be at the lower end of the range that NICE normally considers an acceptable use of NHS resources.
3.17 The committee considered the cost-effectiveness estimates for solriamfetol alone and with standard care compared with standard care alone. The cost-effectiveness results are commercial in confidence because they included the patient access scheme discount for solriamfetol. The committee preferred the following assumptions:
treatment response defined as an ESS score reduction of 2 or more points (see section 3.7)
applying the 100% Hawthorne effect scenario to account for the improvement in the control arm (see section 3.8)
utilities based on the average of the EQ‑5D data directly from TONES 3 and utilities mapped using McDaid (see section 3.12)
hospitalisation costs for solriamfetol included for treatment-related serious adverse events from TONES 5, with no hospitalisation costs for standard care (see section 3.14)
the ERG's preferred dose split, based on the US prescribing data (see section 3.15).
The ERG provided scenarios based on the committee's preferred assumptions. These used subgroup data based on use of CPAP at baseline from TONES 3. For people who can use CPAP, the ICER for solriamfetol with standard care compared with standard care alone was above the range the committee considered acceptable (see section 3.16). For people who cannot use CPAP, the ICER for solriamfetol alone compared with standard care was also above this range.
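For readers unfamiliar with the calculation, an ICER is the difference in costs divided by the difference in quality-adjusted life years (QALYs) between 2 options. The sketch below is purely illustrative: the actual results are commercial in confidence, so every cost, QALY and threshold value here is hypothetical, as are the function and variable names.

```python
def icer(cost_new: float, cost_old: float,
         qalys_new: float, qalys_old: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    incremental_cost = cost_new - cost_old
    incremental_qalys = qalys_new - qalys_old
    if incremental_qalys <= 0:
        # No QALY gain: a cost-per-QALY ratio is not meaningful here.
        raise ValueError("new option must gain QALYs for an ICER to be calculated")
    return incremental_cost / incremental_qalys


# Hypothetical lifetime costs (GBP) and QALYs for each option.
result = icer(cost_new=12_000, cost_old=7_000, qalys_new=8.25, qalys_old=8.0)

# Hypothetical maximum acceptable ICER, chosen only for illustration.
max_acceptable_icer = 15_000
print(result, result <= max_acceptable_icer)
```

A smaller modelled QALY gain raises the ICER sharply, which is why the choice of utility source had such a large effect on the cost-effectiveness estimates.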
Solriamfetol is not recommended for treating excessive daytime sleepiness caused by obstructive sleep apnoea
3.18 The committee recognised that excessive daytime sleepiness caused by obstructive sleep apnoea is a debilitating condition that negatively affects many aspects of daily life (see section 3.1). It acknowledged that solriamfetol alone and with standard care was more effective than standard care alone in reducing excessive daytime sleepiness, as measured by the ESS and MWT. It also acknowledged that partner utilities were not included in the modelling. But it recognised obstructive sleep apnoea may affect partners and took this into account in its decision making. However, the committee believed that substantial uncertainty remained in the company's analysis. It considered that the most plausible cost-effectiveness estimates for solriamfetol alone and with standard care compared with standard care alone were above the range considered an acceptable use of NHS resources, even after taking into account the other factors (see section 3.16). Therefore, it did not recommend solriamfetol for routine commissioning in the NHS.