3 Committee discussion

The evaluation committee considered evidence submitted by Merck Sharp & Dohme, a review of this submission by the external assessment group (EAG), and responses from stakeholders. See the committee papers for full details of the evidence.

The condition

Details of condition

3.1

Pulmonary arterial hypertension (PAH) is a rare, severe and progressive form of pulmonary hypertension caused by changes in the smaller branches of the arteries in the lungs. The condition causes the walls of the pulmonary arteries to become thick and stiff, narrowing the space for blood to pass through and increasing blood pressure. As the arteries are less able to stretch, the heart works harder to pump blood to the lungs. This damages the heart and makes it less efficient at pumping blood around the body and getting oxygen to the muscles.

Clinical management

Current treatment of PAH

3.3

To determine prognosis and treatment, people with PAH are assessed for severity and risk status at initial diagnosis and 3 to 6 months after. Risk status can be classified by 2 different systems:

the European Society of Cardiology/European Respiratory Society (ESC/ERS) classification or
the World Health Organization functional class (WHO FC).

The company submission highlighted that the current treatment pathway for PAH in the UK reflects the 2022 ESC/ERS guidelines. The guidelines use risk status (predictive of risk of death in 1 year) to guide treatment decisions, with the aim of achieving and maintaining low-risk status. Risk status is assessed at diagnosis and initially classified into low-, intermediate- or high-risk. At follow up after initial treatment, people with PAH are assessed and classified into ESC/ERS risk categories: low, intermediate–low, intermediate–high or high. People without cardiopulmonary comorbidities usually have the following treatments:
if low risk (WHO FC 1 or 2), continue on endothelin receptor antagonist (ERA) with phosphodiesterase type 5 inhibitor (PDE5i) dual therapy
if intermediate–low risk, continue on ERA with PDE5i dual therapy, plus selexipag as an add-on treatment
if intermediate–high risk or high risk (WHO FC 3 or 4), stop treatment with selexipag and treat with ERA with PDE5i and with a prostaglandin I2 (PGI2) analogue (for example, intravenous prostacyclin analogues [PCAs]).
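
The follow-up pathway listed above can be written as a simple lookup. This is only an illustrative encoding of the text for people without cardiopulmonary comorbidities, not clinical guidance. The names `USUAL_TREATMENT_AT_FOLLOW_UP` and `usual_treatment` are hypothetical.

```python
# Illustrative lookup only: mirrors the follow-up pathway described in the text.
# Keys use a plain hyphen; en dashes in the input are normalised below.
USUAL_TREATMENT_AT_FOLLOW_UP = {
    "low": ["ERA", "PDE5i"],
    "intermediate-low": ["ERA", "PDE5i", "selexipag"],
    "intermediate-high": ["ERA", "PDE5i", "PGI2 analogue"],
    "high": ["ERA", "PDE5i", "PGI2 analogue"],
}


def usual_treatment(risk_status: str) -> list[str]:
    """Return the usual combination for an ESC/ERS follow-up risk status."""
    key = risk_status.strip().lower().replace("–", "-")
    if key not in USUAL_TREATMENT_AT_FOLLOW_UP:
        raise ValueError(f"unknown ESC/ERS follow-up risk status: {risk_status!r}")
    return list(USUAL_TREATMENT_AT_FOLLOW_UP[key])


print(usual_treatment("intermediate–low"))
```

Selexipag appears only in the intermediate–low entry, which reflects the statement that it is stopped when a PGI2 analogue is started.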

At the first committee meeting, clinical experts agreed that this treatment pathway reflects UK clinical practice and that as the disease progresses current treatment is maintained and new treatments are added on, apart from selexipag (which is stopped when PCA is initiated). They noted that discussing individual preferences is important in determining treatment options. A clinical expert noted that treatment often aims to maintain PAH in lower risk states for as long as possible. Patient experts and clinical experts highlighted the unmet need for treatments for PAH, particularly in the intermediate- and high-risk groups. They also noted that current treatments do not always alleviate symptoms. They described the invasive nature and side effects of having intravenous PCA, which can be frustrating and distressing. Some people need to take their PCA treatment with them everywhere, which can risk infection. Patient experts stated that they would welcome new treatments because the need is so great. Clinical and patient experts also discussed the possibility of having access to pulmonary rehabilitation and other physical therapies. The patient expert explained that pulmonary rehabilitation can provide short-term relief and is an effective treatment alongside pharmacological treatments, but it is not always available. At the second committee meeting, following the company's clarification during draft guidance consultation, the committee acknowledged the company's positioning for sotatercept. This was to start the treatment in people with PAH at intermediate–low-risk status at follow up, but not to be restricted to this risk group only (see section 3.4). The committee concluded that people with PAH and healthcare professionals would welcome a new treatment option that would not be restricted to the intermediate–low-risk group.

Positioning of sotatercept

3.4

The marketing authorisation for sotatercept includes people with a WHO FC of 2 to 3. This corresponds to people at an ESC/ERS low-, intermediate–low- or intermediate–high-risk status. By the time of its first meeting, the committee understood that the company had positioned sotatercept for people who are intermediate–low risk at follow up. Current treatment for this risk group is ERA with PDE5i plus selexipag as an add-on treatment (see section 3.3). So, sotatercept is positioned as an alternative treatment to selexipag for the intermediate–low risk group at follow up. At the first committee meeting, clinical experts explained that people who are initially diagnosed with high-risk status and then with intermediate–low-risk status at follow up could also be considered for ERA with PDE5i plus sotatercept. The clinical experts confirmed that sotatercept is appropriate for people with an ESC/ERS intermediate–low-risk status. But they also noted that in clinical practice, sotatercept could be an option for treating PAH with a broader range of ESC/ERS risk statuses. This is because WHO FC 2 and 3, as indicated in the marketing authorisation for sotatercept, includes people from low risk to intermediate–high risk. The clinical expert explained that 38% of people with PAH are intermediate–high risk and 17% are high risk. The clinical experts stated that there is an unmet need for treatments for ESC/ERS high-risk PAH, including for people who cannot have intravenous PCAs. They were concerned that sotatercept would not be used in the intermediate–high-risk groups if recommended. The company stated that the marketing authorisation may be broadened in the future to cover WHO FC 4, which the committee understood includes those at an ESC/ERS high-risk status. The company explained that its positioning of sotatercept in ESC/ERS intermediate–low-risk PAH was based on the current marketing authorisation and available study data. 
The committee noted that the primary study, STELLAR (see section 3.5), included people with a WHO FC 2 and 3 and so would include some people with intermediate–high-risk status. The committee noted that the company's positioning of sotatercept for ESC/ERS intermediate–low-risk PAH was narrower than its marketing authorisation. It noted that this treatment could benefit other risk groups within the marketing authorisation. The committee also understood that there are other published studies, and potentially data from STELLAR, that may allow for an analysis comparing sotatercept and PCA in people at an ESC/ERS intermediate–high or high-risk status.

At the second committee meeting, the company clarified that its positioning is to start sotatercept in people with intermediate–low risk, but not to restrict its use to this group only. This meant that people would only start on sotatercept if they were at intermediate–low risk status at follow up. The company indicated that sotatercept is expected to be continued when a person's condition progresses to a higher risk status. But in the modelling, the company assumed that everyone who progresses to the high-risk state stops sotatercept and starts PCA, unless PCA is not suitable. The company explained that this was because there was no evidence in STELLAR in the high-risk group. There was no one in STELLAR in the high-risk state at baseline in the sotatercept arm. The company further explained that STELLAR did not generate evidence that compared adding sotatercept to background treatment with adding PCA to background treatment in the intermediate–high-risk group. This was because people at intermediate–high risk in STELLAR had already been on PCA at baseline as part of their background treatment. It noted that the ZENITH trial (which compared sotatercept plus background treatment with placebo plus background treatment) was done in people with higher risk status (WHO FC 3 and 4). But it explained that the trial was also not designed to compare adding sotatercept to dual therapy with adding PCA to dual therapy, because most patients on PCA in the trial had started PCA before being recruited. The EAG agreed with the company's rationale and approach to modelling the high-risk group. The patient expert emphasised the unmet need in the high-risk group. The clinical expert thought that the company's positioning was reasonable, but also highlighted the unmet need in the intermediate–high and high-risk groups. They explained that once started in clinical practice, sotatercept would not be stopped even if the condition progressed to high risk.
The committee acknowledged the unmet need in the higher risk groups. It noted that people already having sotatercept who progress from the intermediate–low risk to the higher risk states would continue to have sotatercept in practice. It was also aware that improvement or worsening of ESC/ERS risk status was not an outcome assessed in STELLAR. The STELLAR trial instead assessed changes in WHO FC, and the results from the trial showed that at 24 weeks, WHO FC in the sotatercept arm (163 people) was:

improved in about 29.4% (48 people)
maintained in about 63.8% (104 people)
worsened in about 4.3% (7 people).

Around 1.3% (2 out of 159) of people in the sotatercept arm moved from FC 3 to FC 4 at 24 weeks. This might correspond to high risk on ESC/ERS classification, but this was uncertain. The committee also noted that this data was based on 24-week follow up only. So there is uncertainty in how many people on sotatercept would transition to high risk in the longer term or in clinical practice.
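
As a quick arithmetic check, the reported percentages can be recomputed from the counts given in the text. The sketch below assumes that the 3 category percentages use the full sotatercept arm (163 people) as the denominator. It also assumes that the 159 in the FC 3 to FC 4 figure is the number of people with a week 24 WHO FC result, which matches the sum of the 3 counts.

```python
# Counts taken from the text; the choice of denominators is an assumption.
ARM_SIZE = 163
counts = {"improved": 48, "maintained": 104, "worsened": 7}

# People with a WHO FC result at week 24 (sum of the 3 categories).
with_data = sum(counts.values())

percentages = {label: 100 * n / ARM_SIZE for label, n in counts.items()}
for label, pct in percentages.items():
    print(f"{label}: {pct:.1f}% of the arm")

fc3_to_fc4 = 100 * 2 / with_data
print(f"people with data: {with_data}; FC 3 to FC 4: {fc3_to_fc4:.1f}%")
```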

The committee understood that treatments, including sotatercept, aim to maintain PAH in lower-risk states for as long as possible. The marketing authorisation for sotatercept covers WHO FC 2 to 3 only and there is a lack of evidence for the high-risk group. The committee also acknowledged the limitations of the available evidence for clinical benefit of starting sotatercept compared with starting PCA in the intermediate–high and high-risk groups. It concluded that the company's positioning of starting sotatercept in PAH in the intermediate–low risk group at follow up and continuing if progressing to intermediate–high risk was reasonable given the evidence available.

Clinical effectiveness

STELLAR

3.5

STELLAR (n=323) is a multicentre, double-blind, phase 3 randomised placebo-controlled trial in adults with WHO functional class 2 or 3 PAH. People had sotatercept (n=163) or placebo (n=160) as a subcutaneous injection every 21 days in addition to background mono, dual or triple therapy for PAH. The primary endpoint was the change from baseline at week 24 in the 6‑minute walk distance (6MWD). At 24 weeks the observed mean change from baseline was 40.3 metres with a standard deviation of 64.18 in the sotatercept group. This was higher than in the placebo group at the same point (-0.6 metres with a standard deviation of 69.54). The median treatment difference in 6MWD between the sotatercept and placebo groups was 40.4 metres (95% confidence interval 27.28 to 53.53, p<0.001). This trend continued up to week 84, although with a very small sample size. Clinical experts noted that the minimal clinically important difference in 6MWD was defined as 33 metres. At week 24, a significantly greater proportion of people had an improvement in 6MWD of at least 30 metres in the sotatercept arm than the placebo arm (54% compared with 22%). WHO FC improvement from baseline was a secondary outcome assessed in STELLAR. Evidence showed that at 24 weeks, more people on sotatercept (29.4%) improved on WHO FC compared with those on placebo (13.8%), and the difference was statistically significant (p<0.001). There was no statistically significant difference between sotatercept and placebo on maintenance of WHO FC from baseline (p=0.605). But fewer people on sotatercept had worsened WHO FC compared with those on placebo and the difference was statistically significant (p=0.013). The committee concluded that the evidence from STELLAR showed sotatercept plus background treatment was associated with greater improvement in 6MWD, and with more improvement and less worsening of WHO FC, compared with placebo plus background treatment at 24-week follow up.
But it also noted that the primary and secondary outcomes were reported at 24-week follow up and there is a lack of robust evidence on sotatercept's longer-term treatment effect.
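
To put the reported proportions on a common scale, the sketch below derives a crude risk difference and relative risk from the week 24 percentages quoted above. These are simple ratios of the published percentages, not the trial's own statistical analysis, and the variable names are illustrative.

```python
# (sotatercept, placebo) proportions at week 24, as reported in the text.
outcomes = {
    "WHO FC improved": (0.294, 0.138),
    "6MWD improved by at least 30 m": (0.54, 0.22),
}

results = {}
for name, (p_sot, p_pbo) in outcomes.items():
    results[name] = {
        "risk_difference": p_sot - p_pbo,  # absolute difference in proportions
        "relative_risk": p_sot / p_pbo,    # crude ratio of proportions
    }
    print(
        f"{name}: difference {100 * results[name]['risk_difference']:.1f} "
        f"percentage points, relative risk {results[name]['relative_risk']:.2f}"
    )
```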

SOTERIA

3.6

SOTERIA (n=426) is a single-arm, ongoing open-label extension study. It is evaluating the safety and efficacy of sotatercept in people with PAH previously treated with sotatercept. It included adults with PAH who had completed prior sotatercept trials (SPECTRA, PULSAR, ZENITH, HYPERION, STELLAR). People could enter SOTERIA if they had a clinical worsening event or completed the 24-week treatment period in the STELLAR trial. Data from SOTERIA is not directly included in the company's modelling. The committee noted evidence on people who had originally had sotatercept in STELLAR and enrolled in SOTERIA. This suggested that, among those still on sotatercept, WHO FC improved or was maintained over time in a relatively large proportion. This WHO FC improvement or maintenance was seen across all time points in SOTERIA when compared with STELLAR WHO FC status at both baseline and week 24 of STELLAR (the exact data is considered confidential so cannot be reported here). The company stated that evidence from the open-label extension showed that continued treatment with sotatercept leads to a persistent long-term treatment effect. The committee understood that the outcome assessed in SOTERIA was the proportion of people with improvement or maintenance of WHO FC relative to STELLAR at week 24 or baseline. But it noted that this did not answer the question of whether risk status improved over time relative to the comparators. The committee also noted that not all people enrolled from STELLAR were followed up in SOTERIA, with a lower sample size seen at later timepoints up to 2-year follow up. The committee concluded that the evidence from the SOTERIA open-label extension study suggested that the treatment effect (in terms of WHO FC improvement or maintenance) may continue in the longer term. But the evidence was very uncertain because of the study design and the number of people available at different follow-up timepoints.
The committee took this into account in its decision making.

Indirect treatment comparisons

3.7

At the first committee meeting, the company presented an indirect treatment comparison (ITC). It compared sotatercept (data from STELLAR) with selexipag (data from GRIPHON and TRACE). The outcomes were:

WHO FC improvement
WHO FC worsening
change in 6MWD and
change in N-terminal pro-B-type natriuretic peptide (NT-proBNP).

GRIPHON (phase 3) and TRACE (phase 4) are studies comparing selexipag plus background treatment with placebo plus background treatment. The results of the ITC are confidential and cannot be reported here. The EAG highlighted that STELLAR had a longer duration from diagnosis to randomisation than GRIPHON (so the STELLAR population had more stable disease and responded better) and STELLAR participants had more background treatment. The EAG also highlighted that there was a difference in the risk stratification used. STELLAR used both ESC/ERS risk stratification guidelines and WHO FC stratification, whereas GRIPHON only used WHO FC risk stratification, which is more common in trials but is also less sensitive. The company also attempted to do other ITCs, including one that compared sotatercept with intravenous PCA. But this ITC was not feasible because of the lack of a common comparator. The company explained that this is one of the considerations in positioning sotatercept for initiation in intermediate–low risk only (see section 3.4). At the first committee meeting, the committee noted that the ITC had not been adjusted for any potential treatment-effect modifiers. But it preferred this approach to the company's within-trial post-hoc analysis (see section 3.8). The committee suggested that the company explore a matching-adjusted indirect comparison (MAIC), so treatment-effect modifiers could be adjusted for. At its second meeting, the committee concluded that the company's MAIC submitted following draft guidance consultation (see section 3.9) was acceptable for decision making.
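
The role of the common comparator can be illustrated with an anchored indirect comparison on the log relative risk scale, often called the Bucher method. The text does not state which ITC method the company used, and the actual results are confidential, so this is a generic sketch. All inputs below are hypothetical and the function name `bucher` is illustrative.

```python
import math


def bucher(log_rr_a: float, se_a: float, log_rr_b: float, se_b: float):
    """Indirect relative risk of A versus B through a shared placebo arm.

    log_rr_a, se_a: log relative risk and standard error, A versus placebo.
    log_rr_b, se_b: log relative risk and standard error, B versus placebo.
    Returns the point estimate and an approximate 95% confidence interval.
    """
    log_rr = log_rr_a - log_rr_b             # placebo effect cancels out
    se = math.sqrt(se_a ** 2 + se_b ** 2)    # variances add across trials
    lower, upper = log_rr - 1.96 * se, log_rr + 1.96 * se
    return math.exp(log_rr), math.exp(lower), math.exp(upper)


# Hypothetical inputs, NOT trial results.
rr, lower, upper = bucher(math.log(0.40), 0.30, math.log(0.70), 0.20)
print(f"indirect relative risk {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Without a treatment arm shared by both trials there is nothing to cancel in the subtraction, which is why a comparison with no common comparator cannot be anchored in this way.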

Post-hoc analysis using STELLAR data

3.8

At the first committee meeting, the company also presented a post-hoc within-trial analysis of STELLAR, which assessed the change from baseline in ESC/ERS risk status at week 24. This analysis included people having the following treatments:

PDE5i with ERA in the sotatercept arm
PDE5i with ERA plus selexipag in the placebo arm.

The results showed the likelihood of improving or worsening risk status for people who had sotatercept compared with people who had selexipag as part of their background treatment. The results of the within-trial analysis are confidential and cannot be reported here. The EAG noted that this post-hoc analysis was not prespecified and considered it inappropriate. The EAG explained that a direct comparison between these 2 groups was not possible and that the results from this post-hoc analysis could be biased in favour of sotatercept because:
The placebo arm was already having PDE5i with ERAs plus selexipag as background treatment before the beginning of the trial. So, any treatment effect of selexipag that happened from the start of treatment to randomisation is unobservable.
Randomisation is broken because the analysis compares double background therapy with triple background therapy at baseline.

People having triple-combination therapy are having more intensive treatment, so are likely to have a worse prognosis. The company confirmed that at the start of the analysis people may have been having background treatment (including selexipag) for between 90 days and 8 years. The clinical and patient experts agreed that at 90 days people having background treatment are likely to be stable. The clinical expert stated that a large proportion of background treatment (including PDE5i with ERA plus selexipag) benefit is seen at initiation or soon after, and they would expect a therapeutic effect by 90 days. This is why assessments of clinical effectiveness are done around this time in clinical practice. The company stated that the 2 populations being compared were well balanced. The committee noted some differences between the sotatercept and the placebo arms, such as age, time from diagnosis and baseline risk status, and considered this a naive comparison. Because of the potential bias in the company's within-trial post-hoc analysis, at its first meeting the committee suggested propensity score matching (see NICE technical support document 17). This was to account for the differences in subgroups' baseline characteristics and how differences may impact the results. But the committee also acknowledged that this would not resolve the differences between when treatment began in both groups and how this impacts the observed treatment effect for selexipag. In its response to draft guidance consultation, the company stated that a propensity-score-matching analysis was explored but not feasible. This was because of the sample size that would be needed for a meaningful matching of potential treatment-effect modifiers across treatment arms. Within STELLAR, the number of people available for analysis was already low before any potential matching or adjustment. The EAG agreed that propensity score matching was not feasible in this sample. 
The committee acknowledged the potential bias in the within-trial analysis. It concluded that a matching-adjusted indirect comparison (see section 3.9) should be used in decision making.

MAIC between STELLAR and GRIPHON

3.9

In response to draft guidance consultation, the company did an anchored MAIC to address the differences in baseline characteristics between the key trials. It compared sotatercept (using data from STELLAR at 24 weeks) with selexipag (using data from GRIPHON at 26 weeks). The comparison was in a subpopulation of people with PAH who were having background monotherapy or dual therapy only. Covariates included WHO diagnostic group, WHO FC, age and 6MWD. The results of the MAIC suggested that sotatercept may be better than selexipag at reducing the worsening of WHO FC (the exact results of the MAIC are confidential so cannot be reported here). The EAG highlighted that matching only on some characteristics instead of all the potential modifiers may have biased the result of the MAIC. It also explained that restricting the analysis to people having background monotherapy or dual-combination therapy excluded a large proportion (61%) of the population in STELLAR. The company explained that it had adjusted for all key covariates for which there was data, based on company clinical expert advice. It added that adjusting for further modifiers would reduce the statistical power of the MAIC. It also clarified that in STELLAR, most people on background triple-combination therapy at baseline were at a higher risk status than the intermediate–low-risk population in which sotatercept would be started. At the second committee meeting, the clinical expert explained that in clinical practice, most people in the intermediate–low-risk state at baseline would likely be on monotherapy or dual therapy. They also said that most of the treatment modifiers that make up an intermediate–low-risk status had already been adjusted for in the MAIC. But the EAG highlighted that it was unclear whether all treatment-effect modifiers had been adjusted for. The clinical expert acknowledged that adjusting for more treatment-effect modifiers would reduce the sample size for analysis.
The clinical expert considered that the main treatment-effect modifiers had already been adjusted for. They also explained that most of the covariates that could have been adjusted for in the analysis were included in the assessment of risk status. WHO FC status was similar across the subpopulations. The committee thought there was uncertainty in whether potential treatment-effect modifiers had all been adjusted for in the MAIC. It also noted that retaining statistical power was not sufficient justification for not adjusting for all potential treatment-effect modifiers. So the impact of any potential treatment-effect modifier that had not been adjusted for was uncertain. It also thought that the extent to which the MAIC improved the validity of the ITC for selexipag was uncertain. The committee understood that the EAG also used the estimates from the MAIC in the model (see section 3.11 and section 3.12) despite the uncertainties. The EAG explained that this was because the subpopulations across the trials in the MAIC were more comparable with those in which sotatercept would be started. The committee concluded that the company's MAIC was acceptable for decision making. But it also noted that the uncertainties in the methods and the analyses of the MAIC were collectively substantial. It took these uncertainties into account in its decision making.
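
The trade-off between adjusting for more covariates and retaining sample size can be seen in how MAIC weights are usually built. The sketch below is a simplified method-of-moments weighting on a single covariate, solved with Newton's method. It is not the company's analysis: the ages and the target mean are hypothetical, and `maic_weights` and `effective_sample_size` are illustrative names.

```python
import math


def maic_weights(x, target_mean, iterations=50):
    """Weights w_i = exp(a * (x_i - target_mean)) whose weighted mean of x
    equals target_mean. Assumes target_mean lies inside the range of x."""
    centred = [xi - target_mean for xi in x]
    a = 0.0
    for _ in range(iterations):
        w = [math.exp(a * c) for c in centred]
        gradient = sum(wi * c for wi, c in zip(w, centred))
        hessian = sum(wi * c * c for wi, c in zip(w, centred))
        a -= gradient / hessian  # Newton step on the convex objective sum(w)
    return [math.exp(a * c) for c in centred]


def effective_sample_size(weights):
    """Approximate number of independent observations left after weighting."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical individual ages in the index trial and a hypothetical
# published mean age in the comparator trial.
ages = [38, 45, 51, 56, 62, 67]
weights = maic_weights(ages, target_mean=50.0)
weighted_mean = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
ess = effective_sample_size(weights)
print(f"weighted mean age {weighted_mean:.2f}, effective sample size {ess:.2f}")
```

The effective sample size falls below the number of people as the weights become more unequal. Each extra covariate to be matched tends to make the weights more unequal, which is the loss of sample size and power described above.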

Economic model

Modelling approach

3.10

The company presented a Markov cohort model based on previous models in PAH. To reflect changes in clinical management (see section 3.3) the company updated the model structure using the ESC/ERS risk states rather than WHO FC states. The risk states reflect increasing risks of mortality and have been incorporated in the model using hazard ratios relative to the low-risk state. The EAG acknowledged that the model captures the progression of current management of PAH. But it noted that the current commissioning guidelines for comparator treatments still refer to WHO FC rather than ESC/ERS risk strata. The EAG noted this misalignment between ESC/ERS and WHO FC created significant challenges for using robust estimates to inform the comparative effectiveness of sotatercept in the model. At the first committee meeting, the EAG also disagreed with the company's assumption of not allowing for clinical improvement after starting PGI2 in the intermediate–high-risk and high-risk states of the model. This was revised to allow for clinical improvement following draft guidance consultation (see section 3.16). The company modelled mortality using increasing ESC/ERS risk status based on Kaplan–Meier survival data from Rosenkranz et al. (2023), with survival data from the COMPERA PAH registry. The company used a parametric model with a gamma distribution, fitted to the low-risk health state, and applied a hazard ratio for non-low-risk health states. The company and the EAG acknowledged that the approach to modelling mortality was highly uncertain. At the second committee meeting, the company presented some additional mortality scenarios using the dependent mortality approach with a Gompertz extrapolation. But it maintained that a hazard ratio relative to low-risk approach was appropriate.
The committee acknowledged the uncertainty in the evidence informing the model (see section 3.5, section 3.6 and sections 3.9 to 3.13), as well as the misalignment between ESC/ERS and WHO FC in the model. But, in the absence of more suitable informing evidence, it concluded that the company's revised model structure was acceptable for decision making.
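
The general structure described here, risk states with mortality scaled by a hazard ratio relative to the low-risk state, can be sketched as a small cohort trace. This is a simplified illustration, not the company's model: it uses a constant baseline hazard instead of the fitted gamma distribution, allows only worsening transitions, and every number (cycle length, hazard, hazard ratios, progression probability) is hypothetical.

```python
import math

STATES = ["low", "intermediate-low", "intermediate-high", "high", "dead"]
CYCLE_YEARS = 0.25      # hypothetical cycle length
BASE_HAZARD = 0.02      # hypothetical annual death hazard in the low-risk state
HAZARD_RATIO = {        # hypothetical hazard ratios relative to low risk
    "low": 1.0,
    "intermediate-low": 2.0,
    "intermediate-high": 4.0,
    "high": 8.0,
}
P_WORSEN = 0.05         # hypothetical per-cycle probability of moving up one state


def death_probability(state):
    """Convert the scaled hazard into a per-cycle probability of death."""
    return 1.0 - math.exp(-BASE_HAZARD * HAZARD_RATIO[state] * CYCLE_YEARS)


def step(cohort):
    """Advance the cohort by one cycle: deaths first, then worsening."""
    alive_states = STATES[:-1]
    nxt = dict.fromkeys(STATES, 0.0)
    nxt["dead"] = cohort["dead"]
    for i, state in enumerate(alive_states):
        dying = cohort[state] * death_probability(state)
        alive = cohort[state] - dying
        is_last = i == len(alive_states) - 1
        worse = 0.0 if is_last else alive * P_WORSEN
        nxt["dead"] += dying
        nxt[state] += alive - worse
        if not is_last:
            nxt[alive_states[i + 1]] += worse
    return nxt


start = dict.fromkeys(STATES, 0.0)
start["intermediate-low"] = 1.0   # everyone starts at intermediate-low risk
trace = [start]
for _ in range(40):               # 40 cycles = 10 years at this cycle length
    trace.append(step(trace[-1]))

print({state: round(share, 3) for state, share in trace[-1].items()})
```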

Short-term transition probabilities for selexipag

3.11

At the first committee meeting, the company used the relative risk of ESC/ERS risk status improvement or deterioration from the STELLAR within-trial analysis to inform the selexipag transition probabilities in the model (see section 3.8). It applied these to the estimated transition probabilities for sotatercept in the model. The company explained that it used this approach because an ITC comparing change in ESC/ERS risk status from baseline was not possible because this outcome was not included in the ITC data. The EAG explained that the company's method would introduce bias towards sotatercept (see section 3.8). The EAG preferred using the ITC with GRIPHON and TRACE. This was because, although GRIPHON included WHO FC states, it would provide less biased estimates of the relative risk for improvement or deterioration. The committee acknowledged the different risk stratification systems used between STELLAR and GRIPHON and also that the WHO FC stratification is less sensitive than ESC/ERS. At the first committee meeting, the committee suggested that a MAIC could be used instead of this ITC. This could address the differences in baseline characteristics between STELLAR, GRIPHON and TRACE and would make the analysis more robust. After draft guidance consultation, the company revised its base case. To derive short-term transition probabilities for selexipag in the model it used the relative risk of WHO FC improvement or worsening estimated from the MAIC (see section 3.9). The EAG's updated approach was in line with the company's base case. But it noted that the extent to which these results were an improvement on the original ITC results used to inform the short-term transition probabilities at the first meeting was highly uncertain (see section 3.7). The committee recalled the substantial uncertainties in the evidence used to inform the MAIC and the limitations in its methods (see section 3.9).
Considering all the evidence and the uncertainties, it concluded that using the results from the MAIC to derive short-term transition probabilities for selexipag was appropriate for decision making. But it acknowledged there were uncertainties and took this into account in its decision making.
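
One way to picture how a relative risk turns sotatercept transition probabilities into selexipag ones is shown below. The exact calculation in the company's model is not described in the text, so this is a sketch under stated assumptions: the relative risks are expressed as sotatercept versus selexipag, and the leftover probability is assigned to staying in the same state. All values and the function name `comparator_row` are hypothetical.

```python
def comparator_row(p_improve, p_worsen, rr_improve, rr_worsen):
    """Derive comparator transition probabilities for one risk state.

    p_improve, p_worsen: per-cycle probabilities for sotatercept.
    rr_improve, rr_worsen: relative risks, sotatercept versus comparator,
    so the sotatercept probability is divided by the relative risk.
    """
    improve = min(1.0, p_improve / rr_improve)
    worsen = min(1.0 - improve, p_worsen / rr_worsen)  # keep the row within 1
    return {"improve": improve, "stay": 1.0 - improve - worsen, "worsen": worsen}


# Hypothetical inputs, NOT values from STELLAR, GRIPHON or the MAIC.
row = comparator_row(p_improve=0.20, p_worsen=0.05, rr_improve=2.0, rr_worsen=0.5)
print({k: round(v, 3) for k, v in row.items()})
```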

Long-term transition probabilities

3.12

The short-term relative risks (see section 3.11) are used to derive long-term transition probabilities for sotatercept and selexipag over the entire time horizon of the model. The company stated that the reapplication of the short-term relative risks over cycles in the long-term was appropriate. This was because the results from SOTERIA showed no treatment waning effect over time and so it would be possible to assume a constant treatment effect for sotatercept. The EAG stated that this method lacked clinical validity because it meant that almost no one in the selexipag arm remained in lower-risk states after a certain time. At the first committee meeting, the EAG suggested that the company's ITC with GRIPHON and TRACE would provide less biased results of the relative risks for risk status improvement or deterioration for selexipag (see section 3.7) than using the within-trial analysis. The EAG also highlighted that even with the ITC relative risks, the reapplication of the same transition probabilities may overestimate risk status deterioration in the selexipag arm. So, the EAG suggested reducing the magnitude of these effects when applying them to long-term transition probabilities. At the first committee meeting the EAG presented multiple scenario analyses that varied the relative risk reduction of treatment effects from 100% to 25%. The EAG presented modelled projections to represent the percentage survival and progression for each scenario analysis. The clinical expert at the first meeting stated that the projection using the ITC with GRIPHON and TRACE may be plausible (this information is considered commercial in confidence) and that the within-trial analysis projections were less plausible. But they agreed that the EAG mid-scenario projections, which assume a 50% reduction in the relative risk of risk status deterioration, are the most plausible.
The committee agreed that it would be reasonable to apply the EAG mid-scenario reduction of 50% to the relative treatment effects. But it requested analysis using alternative data sets to inform and validate the relative risk reduction.

After draft guidance consultation, the company revised its base case. To derive long-term transition probabilities for disease progression in the model, it used the relative risk of WHO FC improvement or worsening from the MAIC at 24 to 26 weeks. It did not apply any reduction to the relative risk of disease progression beyond 24 weeks in its base case. The company explained that a 50% reduction in the relative risk of risk status deterioration was arbitrary and implied treatment waning. But the data from the SOTERIA extension study (see section 3.6) did not show a reduction in sotatercept treatment effect. The EAG was concerned with applying a 24-week relative risk over a lifetime horizon in the model. This was because, by doing so, the model assumes that selexipag is no better than placebo in preventing risk status deterioration in the long term. Also, the proportion of people remaining progression free on selexipag in the model at 3 years is lower than what would be seen in practice (exact data is considered confidential so cannot be reported here). Clinical advice to the EAG indicated that around 50% of people could be expected to remain in the intermediate–low risk state at 3 years on selexipag. Evidence from GRIPHON showed that about 60% of people with PAH remained progression free on selexipag at 3-year follow up. This was based on a composite outcome of death or complications related to PAH (including deterioration on several measures of disease progression). So the EAG chose to calibrate the relative risks of clinical improvement or worsening applied in the model. This was done so the average hazard ratio (weighted by the proportion remaining at risk) would be equal to the estimated hazard ratio from the company's original ITC for death or clinical worsening (this exact value is confidential so cannot be reported here). 
This involved applying 63% of the estimated relative risks of improving or worsening WHO FC for sotatercept versus selexipag from the MAIC to derive long-term transition probabilities for disease progression. The EAG explained that it calibrated to the hazard ratio in the original ITC because this value was for clinical worsening or death. It explained that this was a suitable proxy for progression in the longer term in the updated model. The committee recalled its discussion on the lack of robust evidence on sotatercept's longer-term treatment effect (see section 3.5), and the uncertainties in the evidence from SOTERIA (see section 3.6). It thought that the clinical benefit of selexipag may be underestimated in the company's approach. It also thought that assuming that selexipag is no better than placebo in the long term is not clinically plausible or in line with the evidence available. It thought that an adjustment was needed. It also recalled the clinical expert advice from the first committee meeting in which a 50% reduction seemed the most plausible of the scenarios presented. It acknowledged the uncertainty in the EAG's approach of deriving long-term transition probabilities. This was because these were still derived using estimates from the MAIC, which was based on data from STELLAR at 24 weeks and GRIPHON at 26 weeks (see section 3.9). But it concluded that, on balance, the EAG's approach to modelling long-term transition probabilities (in which 63% of the estimated relative risk reductions from the MAIC are applied) was more appropriate for decision making. The committee also noted the substantial uncertainties in the evidence informing the transition probabilities of selexipag in the longer term in the model. It took this into account in its decision making.
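
The effect of keeping only part of a relative treatment effect can be sketched as follows. The text does not give the scale on which the EAG applied its 63%, so this example assumes a linear scaling of the relative risk reduction (1 minus the relative risk). The relative risk of 0.40 is hypothetical, because the MAIC results are confidential, and `attenuated_rr` is an illustrative name.

```python
def attenuated_rr(rr, fraction):
    """Relative risk after keeping only `fraction` of the reduction (1 - rr).

    fraction = 1.0 keeps the full short-term effect; fraction = 0.0 gives a
    relative risk of 1, meaning no difference between the treatments.
    """
    return 1.0 - fraction * (1.0 - rr)


RR_WORSENING = 0.40  # hypothetical short-term relative risk, NOT a MAIC result

scenarios = {
    fraction: attenuated_rr(RR_WORSENING, fraction)
    for fraction in (1.0, 0.63, 0.50, 0.25)
}
for fraction, rr in scenarios.items():
    print(f"{fraction:.0%} of the effect kept: relative risk {rr:.3f}")
```

With these hypothetical inputs, keeping 63% of the effect sits between the full short-term effect preferred by the company and the 50% mid-scenario discussed at the first meeting.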

Extension of time frame for risk status improvement

3.13

After draft guidance consultation, the company revised its base case so that the time for potential improvement in risk status from intermediate–low risk to low risk increased from 24 weeks to 108 weeks in both model arms. This was based on data from SOTERIA, in a subgroup of people who enrolled into SOTERIA from the sotatercept arm of STELLAR. The company stated that the evidence from SOTERIA showed that the treatment effect (in terms of WHO FC improvement or maintenance) persists into the long term with continued use of sotatercept. The EAG thought that extending the time frame for improvement was reasonable and also adopted it. The EAG explained that this approach resulted in sotatercept risk projections that were more in line with the data from SOTERIA. It also noted that this approach allowed for more optimistic extrapolated outcomes for selexipag, which the committee thought had been underestimated in the original model (see section 3.12). The committee recalled the data submitted by the company to support this extension. It was based on the proportion of people from STELLAR with either risk status improvement or maintenance relative to their risk status at week 24 of STELLAR (see section 3.5). It also noted that the company had not submitted any data from SOTERIA to support improvements alone between week 24 and week 108. It recalled previous clinical expert advice indicating that most improvement from treatments for PAH occurs in the short term. But it also recalled that the modelled clinical benefit of selexipag may be an underestimate (see section 3.12). Applying the extended window of improvement to both model arms resulted in longer-term projections for selexipag disease progression that were closer to what may be seen in practice. So, despite the uncertainty, it accepted a time frame of 108 weeks for improvement from the intermediate–low-risk to the low-risk state in the modelling.

Proportion of PGI2 analogues

3.14

In its base case at the first committee meeting, the company assumed that everyone would have PGI2 analogues when they progressed to:

intermediate–high-risk and high-risk states in the selexipag arm, and
high-risk state in the sotatercept arm.

It also assumed that everyone in the sotatercept arm who stopped sotatercept, on progression to the intermediate–high risk state, would have PGI2 analogues. The company stated that this was in line with the World Symposia guidelines on pulmonary hypertension. Clinical expert advice to the EAG suggested that intravenous PCA would not be suitable for everyone and not everyone would accept it. The patient experts agreed with this. The EAG also suggested that in the intermediate–high-risk to high-risk states it would be reasonable to assume that 85% would have PGI2 analogues, but this could be lower in real-world practice. The EAG assumed that 85% of people will have PGI2 analogues on progression to the intermediate–high-risk and high-risk states in the selexipag arm and the high-risk state in the sotatercept arm. For those who progress to intermediate–high-risk state in the sotatercept arm, it assumed that 39.9% have PGI2 analogues, which was informed by the figures in the STELLAR trial. The remainder are assumed to stay on their current treatment regimen. This estimate considered both intravenous and inhaled preparations of PCA. The EAG's approach, in which less than 100% of people who progressed had PGI2 analogues, was supported by the estimates provided by clinical and patient experts at the meeting. So, the committee concluded that the EAG's approach was the most appropriate and should be used in the model. After draft guidance consultation, the company updated its base case in line with committee preferences. In its updated base case, the proportion of people having PGI2 analogues was:
85% for the intermediate–high-risk and high-risk states on selexipag
39.9% for the intermediate–high-risk state, and 85% for the high-risk state, on sotatercept.

Starting PGI2 analogues in the sotatercept arm

3.15

In the company's model presented at the first committee meeting, people who progressed to an intermediate–high-risk state on sotatercept stayed on sotatercept but did not start PGI2 analogues (apart from a small proportion modelled to stop sotatercept after 24 weeks because of a lack of response). The company stated that this reflected clinical practice in the NHS for people who had progressed to intermediate–high-risk. In the model, the EAG preferred to add PGI2 analogues to 39.9% of the sotatercept arm on progression to the intermediate–high-risk state. This aligns with the proportion that start PGI2 analogues on progression to an intermediate–high-risk state in STELLAR. This was used to inform the transition probabilities from the intermediate–high-risk state and aligns better with clinical expert opinion provided to the EAG. Both the company and the EAG base cases assumed that people who had progressed to a high-risk state on sotatercept stopped sotatercept and started PGI2 analogue treatment in the model. This is because there is no data to inform the efficacy of sotatercept plus intravenous PCAs compared with background treatment for the high-risk group in the selexipag arm. The clinical experts confirmed it is likely that intravenous PCAs would be added to sotatercept without stopping sotatercept treatment in practice. For progression to the intermediate–high-risk state, the committee concluded that the EAG's approach of adding PGI2 analogues to 39.9% of the sotatercept arm on progression to the intermediate–high-risk state was the most appropriate. This was because it was supported by clinical trial evidence and expert opinion. For progression to the high-risk state, the committee acknowledged that the EAG and company base cases both included the assumption of stopping sotatercept upon progression to the high-risk state because of a lack of data. 
It highlighted that if usual practice is to continue sotatercept with intravenous PCAs, significant costs of sotatercept may not be captured in the model. The committee requested further scenario analyses, including exploration of other data sources on sotatercept used with PCAs. It discussed whether these analyses could be informed by ZENITH data. The company explained, after draft guidance consultation, that the available comparative data from ZENITH was extremely limited, which also affected the company's positioning (see section 3.4). The committee recalled its discussion on the positioning of sotatercept. It also recalled that data from STELLAR suggested that a small proportion of people on sotatercept may have progressed to high-risk status at 24 weeks, but this was uncertain (see section 3.5). The committee acknowledged the lack of available comparative evidence in the intermediate–high-risk and high-risk groups. It took this into account in its decision making.

Clinical improvement after starting PGI2

3.16

In the model, people move to the intermediate–high-risk and high-risk states on disease progression. On entering these states, either inhaled or intravenous PCA is started as a subsequent treatment. The possibility for improvements in risk status (backwards transitions) was switched off or not included in the model for subsequent intravenous PCA. The company stated that clinical improvement on intravenous PCA was not included in the selexipag arm because the evidence for its efficacy in the intermediate–low-risk group was uncertain. The costs of intravenous PCA and a utility decrement for administration of intravenous PCAs were included in the model. But the EAG noted that, in the model, clinical improvement was possible in the sotatercept arm after starting PGI2. At clarification before the first committee meeting, the EAG queried the difference in assumptions about improvement on intravenous PCA. To resolve this, the company removed the ability to improve risk status in the sotatercept arm. The company stated that it could not add this function in the selexipag arm because of time constraints. The patient experts had mixed experiences with intravenous PCAs and both discussed the high treatment burden. One patient expert explained that they had maintained a benefit over time when having treatment with intravenous PCAs. The clinical expert agreed that clinical improvement can be seen after starting intravenous PCAs. But they stated that the main benefit of intravenous PCAs is stability in the intermediate–high-risk or high-risk states. This contributes towards increasing exercise capacity, improving 6MWD and reducing mortality. At its first meeting, the committee felt that including the intravenous PCA utility decrement and costs but not the utility increments from intravenous PCA treatment was inappropriate. This also did not reflect clinical and patient experiences. 
So, the committee requested that the model structure reflect improvements in risk status after starting intravenous PCA.

Following draft guidance consultation, the company revised its base case. The revised base case included an intermediate–high risk tunnel state in the model that allowed for improvement in risk status in the first 12-week cycle after starting intravenous PCA. This was in line with clinical advice provided to the company, which indicated that most people having PCA improve within 12 weeks of starting treatment. The transition probabilities for risk status improvement in the intermediate–high risk state in both model arms were informed by the transitions in risk status between 12 and 24 weeks in STELLAR. In the sotatercept arm, for the proportion of people who were not part of the 39.9% of people starting PCA (see section 3.14), the transitions were restricted. This meant that any chance of improvement according to the transition probabilities was instead applied as maintenance of risk status. The EAG accepted the company's updated modelling approach but highlighted that the STELLAR cohort had already been on background treatments, including PCA, for months before being involved in the trial. So it thought that the transition probabilities from STELLAR were not representative of the effect of starting PCA after progression in the selexipag arm. The EAG noted that the proportion of people improving risk status on selexipag after starting PCA in the model differed from clinical advice provided to the EAG. That advice indicated that around 60% of people would be expected to improve after starting PCA. So the EAG preferred to use data from Roman et al. (2012) to inform the transition probabilities in the selexipag arm. The EAG explained that Roman et al. (2012) was a study that compared the cost effectiveness of starting 3 different types of PCA, which assumed that the modelled population was naive to PCA before starting PCA treatment. So, it was more representative of a cohort in which 85% of people in the selexipag arm would start PCA (see section 3.14). 
Although the sotatercept arm in the EAG's base case still used the transition probabilities from the sotatercept arm of STELLAR, the EAG also applied an adjustment. This meant that the probability of risk status improvement for people starting PCA is no worse than in the selexipag arm. At the second committee meeting, the company acknowledged the uncertainties in using STELLAR data to inform the transition probabilities for people in the selexipag arm starting PCA upon progression to the intermediate–high-risk state. The committee noted that the EAG's approach to transition probabilities after starting PCA in the selexipag arm used data based on the point at which PCA is introduced. It felt that this was more representative of the treatment effect of PCA that would be seen in clinical practice. It concluded that the EAG's source of transition probabilities for people starting PCA in the intermediate–high-risk state of the selexipag arm was appropriate for decision making.

Weight-based dosing of intravenous PCA

3.17

The company based the dosing of intravenous PCA preparations on a Canadian Drug Agency appraisal of selexipag. The company used target doses of epoprostenol at 50 ng/kg/minute and treprostinil at 30 ng/kg/minute. Clinical expert opinion to the EAG suggested these doses are higher than the doses used routinely in NHS practice. This suggests overcosting of epoprostenol, the most frequently used intravenous PCA preparation in the NHS, and potential bias in favour of sotatercept. The EAG also highlighted that the company's approach excluded US participants for weight-based dosing but included all other countries, including those with a lower average weight than the UK. At clarification, the company provided some scenario analysis with different average weights. This demonstrated a small increase in the incremental cost-effectiveness ratio (ICER). The EAG preferred to apply dosing for PGI2 analogues in line with the ESC/ERS guidelines. That is, epoprostenol 23 ng/kg/minute (midpoint of 16 to 30 ng/kg/minute) and treprostinil 42.5 ng/kg/minute (midpoint of 25 to 60 ng/kg/minute). The clinical experts confirmed that dosing based on ESC/ERS guidelines is appropriate. The committee acknowledged that the company has accounted for wastage in its modelling. The committee concluded that the EAG's approach was in line with clinical practice and should be used in the model. At the second committee meeting, the company presented updated doses of intravenous PCA. The expected dosage for the intermediate–high-risk state in the sotatercept arm was epoprostenol at 23 ng/kg/minute and treprostinil at 42.5 ng/kg/minute. But the expected dosage for the high-risk state in the sotatercept arm, and the intermediate–high-risk and high-risk states in the selexipag arm, was updated to epoprostenol at 35 ng/kg/minute and treprostinil at 70 ng/kg/minute. The company explained that the higher doses were based on clinical expert advice from a panel. 
It stated that people in the intermediate–high-risk state who have had treatments targeting the prostacyclin receptor (such as selexipag), can have higher target doses with fewer tolerance issues. Clinical advice to the EAG acknowledged that intravenous PCA may be better tolerated in those switching from selexipag. But it disagreed that a higher target dose would be appropriate. So the EAG preferred to retain its original dosing assumptions from the ESC/ERS guidelines. The committee thought that it had seen nothing in clinical guidelines to indicate that a person previously on therapies such as selexipag would have a higher dose of intravenous PCA. It maintained its previous conclusion that the EAG's approach is in line with clinical practice and should be used in the model.
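The midpoint doses preferred by the EAG, and how a weight-based infusion rate translates into a daily amount of drug, can be sketched as follows. The dose ranges are those quoted above from the ESC/ERS guidelines. The body weight of 70 kg and the function names are hypothetical and are used only for illustration; the sketch ignores wastage and vial sizes.

```python
MINUTES_PER_DAY = 60 * 24
NG_PER_MG = 1_000_000


def midpoint(low: float, high: float) -> float:
    """Midpoint of a target dose range, in ng/kg/minute."""
    return (low + high) / 2


def daily_dose_mg(rate_ng_per_kg_min: float, weight_kg: float) -> float:
    """Drug delivered over 24 hours by a continuous infusion, in mg."""
    return rate_ng_per_kg_min * weight_kg * MINUTES_PER_DAY / NG_PER_MG


epoprostenol_rate = midpoint(16, 30)   # 23.0 ng/kg/minute
treprostinil_rate = midpoint(25, 60)   # 42.5 ng/kg/minute

weight_kg = 70.0  # hypothetical body weight, not taken from the appraisal
print(epoprostenol_rate, round(daily_dose_mg(epoprostenol_rate, weight_kg), 4))
print(treprostinil_rate, round(daily_dose_mg(treprostinil_rate, weight_kg), 4))
```

Because the daily amount scales linearly with both the infusion rate and body weight, a higher assumed target dose or average weight raises the modelled cost of intravenous PCA in proportion.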

Utility in the model

3.18

The company included 3 additional utility increments and decrements in the model:

a decrement for intravenous PCA administration of -0.307
a decrement for hospitalisation of -0.105
an increment for carers of people in less severe risk states instead of a decrement for carers of people in high risk states.

The carer utility increment was accepted and included in the EAG base case. The EAG stated that a key concern of including a utility decrement for intravenous PCA administration was that no clinical improvement was included for intravenous PCAs in the model. The company noted that it did not use EQ-5D data from STELLAR because there were too few observations in some of the risk strata at 24 weeks (see section 3.16). For the decrement on hospitalisation, the company used a published utility decrement for hospitalisation within 30 days (McMurray et al. 2018). The EAG stated that, according to the literature, the utility decrement applied for hospitalisation might be an overestimate and might partially double count the impact of hospitalisation. It may also not have been adjusted for cycle length. The EAG explained that the McMurray et al. study also included a reduced utility decrement for hospitalisation within 90 days of -0.054. So, it averaged the 30-day and 90-day hospitalisation utility decrements over the 12-week model cycle, and applied the resulting reduced utility decrement of -0.071 in its base case. At its first meeting, the committee felt that the utility decrement for intravenous PCA administration should be removed if there is no possibility of clinical improvement after starting PGI2. In its response to draft guidance consultation, the company maintained inclusion of a disutility for intravenous PCA administration. But it also revised its model to include clinical improvement when starting intravenous PCAs (see section 3.16). The EAG thought this was reasonable. But it noted that the source of the disutility was a vignette study, which does not align with the NICE reference case for obtaining utility values. The committee thought that the size of the disutility for starting intravenous PCA may be an overestimate, because of how the disutility value was derived. 
It noted that the company had done a scenario analysis that reduced the disutility for intravenous PCA administration (the scenario disutility is confidential and cannot be reported here). This was to account for potential double counting of disutility from starting intravenous PCA and disutility from disease progression from the intermediate–low-risk to the intermediate–high-risk state. The committee thought that this disutility was more plausible and concluded that using the lower disutility value from the company scenario analysis was appropriate for decision making.

The committee concluded that:
EQ-5D data from STELLAR should be used to provide insight into overall improvement once intravenous PCA is started
the utility decrement associated with hospitalisation should be reduced from 0.105 to 0.071, applied for the duration of the cycle in which hospitalisation events occur (to align more closely with utility for hospitalisation between 0 and 90 days)
intravenous PCA disutility should be included, using the value from the company's scenario analysis
it was appropriate to include an increment for carers of people in less severe risk states.
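The reduced hospitalisation decrement in the conclusions above can be reproduced with a simple time-weighted average. This is a sketch under stated assumptions, not the EAG's documented calculation: it assumes the 30-day decrement (-0.105) applies to days 1 to 30, the 90-day decrement (-0.054) applies to days 31 to 90, and the average is taken over a 90-day window. The function name is hypothetical.

```python
def time_weighted_decrement(
    decrement_30_day: float = -0.105,
    decrement_90_day: float = -0.054,
    window_days: int = 90,
) -> float:
    """Average utility decrement over `window_days`.

    Assumption: the 30-day value applies to the first 30 days and the
    90-day value to the remaining days of the window.
    """
    early_days = 30
    later_days = window_days - early_days
    total = early_days * decrement_30_day + later_days * decrement_90_day
    return total / window_days


print(round(time_weighted_decrement(), 3))  # -0.071
```

Under these assumptions the result rounds to -0.071, which matches the value the EAG applied for the duration of the cycle in which a hospitalisation occurs.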

Severity

3.19

The committee considered the severity of the condition (the future health lost by people living with the condition and having standard care in the NHS). The committee may apply a greater weight to quality-adjusted life years (QALYs), called a severity modifier, if technologies are indicated for conditions with a high degree of severity. The company provided absolute and proportional QALY shortfall estimates in line with NICE's health technology evaluations manual. Based on the company submission, the EAG agreed that the case was met for application of a severity weighting of 1.2 on QALY gains. This was based on the absolute QALY shortfall versus the expected QALYs of the age- and sex-matched general population. The values for the QALY shortfall are confidential. The committee noted that in some scenarios presented, a 1.2 multiplier may not hold. These scenarios included aligning population age and sex with those reported in the UK National Audit of Pulmonary Hypertension (NAPH) and different methods of modelling mortality. The EAG explained that the company's approach to modelling mortality was reasonable but that the UK NAPH may be more representative of the population seen in clinical practice. At the second committee meeting, the company presented some additional mortality scenarios. But it maintained that its approach to modelling mortality in its original base case was still appropriate (see section 3.10). It noted that the criteria for a 1.2 severity weighting were still met in a company scenario in which NAPH baseline characteristics were used. But it highlighted that this cohort was not representative of the appraisal population and was broader in scope. The clinical expert explained that not all of the population in the UK NAPH cohort would be offered sotatercept, because some would have existing conditions that prevented this. 
The committee noted that the EAG base case still met the severity modifier of 1.2 when using more plausible estimates of progression-free survival for selexipag and baseline characteristics from STELLAR. It also preferred to retain consistency between the source of the efficacy data used in the modelling and the source of the baseline characteristics used to calculate QALY shortfall. So it concluded that a severity weight of 1.2 applied to the QALYs was appropriate, based on population baseline characteristics from STELLAR.
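How a severity weight follows from QALY shortfall can be sketched as below. The thresholds are those understood to be set out in NICE's health technology evaluations manual (a weight of 1.2 for an absolute shortfall of 12 to 18 or a proportional shortfall of 0.85 to 0.95, and 1.7 above those ranges). The QALY inputs are hypothetical, because the actual shortfall values are confidential, and the function names are illustrative only.

```python
def qaly_shortfall(general_population_qalys: float, condition_qalys: float):
    """Return (absolute, proportional) QALY shortfall."""
    absolute = general_population_qalys - condition_qalys
    proportional = absolute / general_population_qalys
    return absolute, proportional


def severity_weight(absolute: float, proportional: float) -> float:
    """QALY weight implied by the shortfall; the higher qualifying weight applies."""
    if absolute >= 18 or proportional >= 0.95:
        return 1.7
    if absolute >= 12 or proportional >= 0.85:
        return 1.2
    return 1.0


# Hypothetical inputs: expected remaining QALYs for the age- and sex-matched
# general population, and for people having standard care.
absolute, proportional = qaly_shortfall(16.0, 3.0)
print(absolute, round(proportional, 4), severity_weight(absolute, proportional))
```

Because the general-population QALYs depend on the age and sex of the modelled cohort, changing the source of baseline characteristics (for example, from STELLAR to the UK NAPH) can move the shortfall across a threshold, which is why the committee considered both.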

Cost-effectiveness estimates

Acceptable ICER

3.20

NICE's manual on health technology evaluations notes that, above a most plausible ICER of £25,000 per QALY gained, judgements about the acceptability of a technology as an effective use of NHS resources will take into account the degree of certainty around the ICER. The committee will be more cautious about recommending a technology if it is less certain about the ICERs presented. But it will also take into account other aspects including uncaptured health benefits. The committee noted the lack of evidence and high level of uncertainty in the evidence and modelling, specifically with the:

lack of robust evidence on the longer-term treatment effect of sotatercept (see section 3.5 and section 3.6)
methods used in the MAIC (see section 3.9)
estimates of sotatercept's treatment effect compared with selexipag based on the MAIC (see section 3.9)
misalignment between ESC/ERS and WHO FC in the model (see section 3.10)
transition probabilities used to determine relative risk of improvement or deterioration in the model, specifically:
- using estimates from the MAIC to inform the short-term transition probabilities (see section 3.11)
- deriving long-term transition probabilities based on the MAIC, which included trials at 24 to 26 weeks' follow up (see section 3.12)
time frame of the potential improvement from intermediate–low-risk status given the lack of robust evidence on the longer term treatment effect of sotatercept (see section 3.5, section 3.6 and section 3.13).

So, the committee concluded that an acceptable ICER would be towards the lower end of the range NICE considers a cost-effective use of NHS resources (£25,000 to £35,000 per QALY gained).
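The comparison of an ICER against this range can be sketched as follows. The sketch assumes the severity weight is applied as a multiplier on incremental QALYs (see section 3.19). All cost and QALY inputs are hypothetical, because the actual estimates are confidential, and the function name is illustrative only.

```python
def icer(incremental_cost: float, incremental_qalys: float,
         severity_weight: float = 1.0) -> float:
    """Incremental cost per (severity-weighted) QALY gained."""
    weighted_qalys = incremental_qalys * severity_weight
    if weighted_qalys <= 0:
        raise ValueError("ICER is not meaningful without a positive QALY gain")
    return incremental_cost / weighted_qalys


LOWER, UPPER = 25_000, 35_000  # range stated in this section, in pounds per QALY gained

# Hypothetical inputs for illustration only.
unweighted = icer(60_000, 2.0)
weighted = icer(60_000, 2.0, severity_weight=1.2)

print(round(unweighted), round(weighted))
print(LOWER <= weighted <= UPPER)
```

In this hypothetical case the weighting moves the estimate from the middle of the range to its lower end, which is where the committee concluded an acceptable ICER should sit given the uncertainty.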

Committee's preferred assumptions

3.21

The cost-effectiveness estimates used by the committee for decision making took into account all of the available confidential discounts, including those for comparators. The exact estimates are confidential and cannot be reported here. Both the company's and the EAG's base cases at the second committee meeting were above £25,000 per QALY gained regardless of the severity weighting applied. After comparing both the company's and the EAG's base cases, the committee preferred the following assumptions:

short-term transition probabilities for selexipag informed by application of WHO FC relative risks derived from the MAIC between STELLAR and GRIPHON (see section 3.11)
long-term transition probabilities for selexipag derived by applying 63% of the relative risk reduction of disease progression seen for sotatercept versus selexipag at 24 weeks, based on the MAIC between STELLAR and GRIPHON (see section 3.12)
a time frame for potential improvement from intermediate–low-risk to low-risk status of 108 weeks (see section 3.13)
85% (rather than 100%) of people discontinue selexipag and have PGI2 analogues after progressing to the intermediate–high-risk or high-risk state in the selexipag arm (see section 3.14)
85% of people discontinue sotatercept and have PGI2 analogues after progressing to the high-risk state in the sotatercept arm (see section 3.14)
PGI2 analogues added to sotatercept for 39.9% of people on progression to the intermediate–high-risk state on sotatercept (see section 3.15)
a model structure that can reflect improvements in risk status after starting PCA (see section 3.16)
transition probabilities for people starting PCA in the intermediate–high-risk state in the selexipag arm, sourced from Roman et al. (2012; see section 3.16)
dosage assumed to be 23 ng/kg/minute for epoprostenol and 42.5 ng/kg/minute for treprostinil, based on target maintenance dosages in ESC/ERS guidelines (see section 3.17)
a utility decrement associated with starting intravenous PCA using the value from the company's scenario analysis (see section 3.18)
a utility decrement associated with hospitalisation of 0.071, applied for the duration of the cycle in which hospitalisation events occur (see section 3.18)
hospitalisation QALY loss calculations adjusted to the 12-week cycle length of the model (to correct a minor calculation inconsistency)
healthcare resource use associated with PCA initiation adjusted to the 12-week cycle length of the model (to correct a minor calculation inconsistency)
a severity weighting of 1.2 applied to the QALYs (see section 3.19).

With the committee's preferred assumptions, the ICER was within the range NICE considers a cost-effective use of NHS resources.

Other considerations

Equality

3.22

The committee considered that some older people and people with PAH who are menstruating may not be considered for sotatercept because of an increased risk of bleeding and associated complications. The committee also acknowledged that this treatment requires additional specialist-centre visits. The clinical expert highlighted that, because of this, access to sotatercept could be affected by ability to travel, symptom burden, financial burden and mobility. The committee discussed that people with some disabilities may require additional at-home support to administer the treatment. Characteristics such as age, sex and disability are protected under the Equality Act 2010, but the committee agreed that its recommendations would not have a different impact on people protected by the equality legislation than on the wider population.

At the second committee meeting, the committee also acknowledged that the company's positioning of sotatercept means that people already in the intermediate–high-risk group and those in the high-risk group at any point would not be eligible for starting sotatercept. It also acknowledged that these groups may experience significant burden from use of PCA. This is because of side effects of PCA and the mobility required to administer PCA. The committee is aware that its remit is to evaluate the clinical and cost effectiveness of the technology in the population covered by the marketing authorisation. The committee noted that the marketing authorisation for sotatercept covers WHO FC 2 and 3. This corresponds to low-risk, intermediate–low-risk and intermediate–high-risk groups. There were no people at high risk in the sotatercept arm of STELLAR at baseline. A small proportion of people in the sotatercept arm may have progressed to high risk in STELLAR at 24-week follow up, but it was uncertain how many would in the longer term and in clinical practice (see section 3.5). There is also a lack of comparative evidence on starting sotatercept in the intermediate–high-risk group or high-risk group. The committee concluded that, in the absence of data, this was not an equality issue that is within its remit.

Uncaptured benefit

3.23

The committee considered whether there were any uncaptured benefits of sotatercept. It did not identify additional benefits of sotatercept not captured in the economic modelling. So the committee concluded that all additional benefits of sotatercept had already been taken into account.

Conclusion

3.24

The committee recognised the unmet need for people with PAH who may have a high symptom burden even with current treatments. The committee acknowledged that sotatercept's marketing authorisation includes people with low-, intermediate–low- and intermediate–high-risk PAH. The committee noted that the company positioned it for initiation in people with intermediate–low-risk status and that, in the model, it could be continued on progression to the intermediate–high-risk state. But the company did not provide evidence on cost effectiveness for initiation in the intermediate–high-risk and high-risk groups because of limitations in available data. The committee agreed with most of the EAG's preferred assumptions. But it acknowledged that the EAG's approach still had substantial uncertainty because of the evidence used. It also acknowledged that the clinical benefit of the comparator may have been underestimated. With the committee's preferred assumptions, the cost-effectiveness estimates for sotatercept were within the range NICE normally considers a cost-effective use of NHS resources. So, sotatercept can be used as an option to treat PAH in adults with World Health Organization functional class (WHO FC) 2 to 3, to improve exercise capacity, if:

it is started when the PAH is at intermediate–low risk at follow up after usual treatment for PAH
the PAH has progressed to intermediate–high risk, if treatment with sotatercept was started when the PAH was intermediate–low risk.
