3 Committee discussion

The evaluation committee considered evidence submitted by Madrigal Pharmaceuticals, a review of this submission by the external assessment group (EAG), and responses from stakeholders. See the committee papers for full details of the evidence.

The condition

Details of the condition

3.1

Metabolic dysfunction-associated steatohepatitis (MASH) is a subtype of metabolic dysfunction-associated steatotic liver disease (MASLD), a chronic and progressive disease that develops when excess fat builds up in the liver. MASH is an active form of MASLD, characterised by hepatocellular ballooning and inflammation of the liver lobes. MASH can lead to liver fibrosis (scarring of the liver tissue). Liver fibrosis increases the risk of major complications such as cirrhosis, liver cancer, the need for liver transplantation, and mortality. People with MASH typically have manifestations of metabolic syndrome, including obesity, type 2 diabetes, hypertension and dyslipidaemia. Disease progression is highly heterogeneous and depends on various factors, including genetics, lifestyles and comorbidities. Estimates of the prevalence of MASH vary widely. A study by Davidson et al. (2025) estimated the prevalence of MASH in England to be 26.6 per 100,000 people. But MASH is believed to be substantially underdiagnosed.

Effects on quality of life and unmet need

3.2

Patient experts explained that, while MASH is considered asymptomatic in early stages of disease, people can have non-specific symptoms such as fatigue that may be incorrectly attributed to other causes. As liver fibrosis associated with MASH progresses, common symptoms include poor sleep quality, sleep apnoea, lethargy, anxiety, depression and weight gain. There are currently no pharmacological treatments available for treating MASH. Managing MASH typically requires major diet, exercise and healthy living changes, which can be difficult to achieve and sustain. The patient experts explained that slowing or stopping progression to cirrhosis is important. With cirrhosis, people experience fatigue, weakness and appetite problems. They may also experience cognitive changes and become unable to do daily tasks. People must often attend many hospital appointments and family members often become caregivers, particularly as liver disease progresses. Patient experts explained that there is a need to improve the understanding of MASLD and MASH in primary care. This could lead to earlier diagnosis and referral to secondary care, and reduce the risk of progression. The committee concluded that there is an unmet need for pharmacological treatments that reduce progression to cirrhosis in MASH with moderate to advanced fibrosis.

Clinical management

Treatment pathway and positioning of resmetirom

3.3

Usual treatment for non-cirrhotic MASH with moderate to advanced liver fibrosis (consistent with fibrosis stages F2 to F3) is diet, exercise and healthy living changes. Clinical experts explained that diet, exercise and healthy living changes typically have a modest effect. This is because it is difficult to lose weight and even more difficult to sustain weight loss. People may also be having treatment for metabolic comorbidities, such as obesity, type 2 diabetes or dyslipidaemia. The company's proposed positioning of resmetirom is alongside diet and exercise (in line with the anticipated marketing authorisation and key clinical trial), and in addition to current pharmacological treatments for metabolic comorbidities. The company also proposed that resmetirom would be prescribed in secondary care. The clinical experts agreed that this was appropriate, and added that there was potential for a shared care arrangement with primary care once people were stabilised on treatment. They said that resmetirom would be used in parallel with treatments for metabolic-related comorbidities. The clinical experts explained that people are usually already having treatment for metabolic comorbidities before diagnosis of MASLD. The committee concluded that the company's positioning of resmetirom was appropriate.

Current and anticipated use of GLP-1 receptor agonists

3.4

Glucagon-like peptide-1 (GLP‑1) receptor agonists, such as semaglutide, are not relevant comparators for this evaluation because they are not licensed for treating MASH. But the EAG noted that GLP‑1 receptor agonists are increasingly used for obesity and for metabolic conditions such as type 2 diabetes and cardiovascular disease. It also noted that 14.2% of people in the main clinical trial for resmetirom (MAESTRO‑NASH) were receiving a GLP‑1 receptor agonist at baseline for treating type 2 diabetes. The committee noted that in clinical practice many people are taking a GLP‑1 receptor agonist to manage obesity and this population will increase in future, particularly as the body mass index (BMI) threshold for eligibility for starting a GLP‑1 receptor agonist is lowered over time. The EAG said it was unclear whether resmetirom would be prescribed for people who are already taking a GLP‑1 receptor agonist to treat a metabolic comorbidity. This was because its clinical advisers had explained that it may be unnecessary to use resmetirom if people lose weight or have reduced liver stiffness while taking a GLP‑1 receptor agonist. The company said that it would be clinically plausible to use resmetirom alongside a GLP‑1 receptor agonist because they have complementary mechanisms of action. It explained that, while resmetirom is a liver-directed treatment, GLP‑1 receptor agonists are systemic cardiometabolic treatments that may improve cardiometabolic risk factors. The clinical experts agreed with the company's interpretation. The clinical experts said that, in line with the eligibility criteria for MAESTRO‑NASH, the GLP‑1 receptor agonist dose would need to be stabilised for at least 6 months before resmetirom could be added. The committee concluded that resmetirom would be used alongside treatments for comorbidities, including GLP‑1 receptor agonists when appropriate.
It also concluded that it was not necessary to include the costs of GLP‑1 receptor agonists in the model because it did not expect GLP‑1 receptor agonist use to differ between treatment arms.

Clinical effectiveness

MAESTRO-NASH

3.5

The clinical evidence for resmetirom came from MAESTRO‑NASH, an ongoing double-blind, placebo-controlled randomised phase 3 trial in adults with biopsy-confirmed MASH and fibrosis stage 1B, 2 or 3. People were randomised to 1 of 3 treatment arms: 80 mg resmetirom, 100 mg resmetirom or placebo. The week 52 dual primary endpoints were:

MASH resolution without worsening of fibrosis and
fibrosis stage improvement without worsening of non-alcoholic fatty liver disease activity score (NAS).

The company presented results for the mITT‑LB‑W52 (n=955) population in the trial, which excluded 11 people who had delayed week 52 biopsies. Both primary endpoints occurred in significantly more of the people taking resmetirom than the people taking placebo. Compared with placebo, the percentage difference of people having MASH resolution with no worsening of fibrosis was:

16.4% (95% confidence interval [CI]: 11.0% to 21.8%) in the resmetirom 80 mg arm
20.7% (95% CI: 15.3% to 26.2%) in the resmetirom 100 mg arm.

Compared with placebo, the percentage difference of people having fibrosis improvement with no worsening of NAS was:

10.2% (95% CI: 4.8% to 15.7%) in the resmetirom 80 mg arm
11.8% (95% CI: 6.4% to 17.2%) in the resmetirom 100 mg arm.
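As a simple reading aid for the figures above, the sketch below checks two things for each reported difference: that the 95% CI excludes zero (consistent with the statement that both endpoints occurred in significantly more people taking resmetirom), and that the point estimate sits close to the midpoint of its interval. The function names are hypothetical, and the code only re-uses the percentages reported above. It does not reproduce the trial's statistical analysis.

```python
# Reported percentage differences versus placebo: (estimate, lower 95% CI, upper 95% CI)
reported = {
    "MASH resolution, 80 mg": (16.4, 11.0, 21.8),
    "MASH resolution, 100 mg": (20.7, 15.3, 26.2),
    "Fibrosis improvement, 80 mg": (10.2, 4.8, 15.7),
    "Fibrosis improvement, 100 mg": (11.8, 6.4, 17.2),
}


def excludes_zero(lower, upper):
    """True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


def midpoint(lower, upper):
    """Centre of the interval, for comparison with the point estimate."""
    return (lower + upper) / 2


for label, (estimate, lower, upper) in reported.items():
    print(label, excludes_zero(lower, upper), round(midpoint(lower, upper), 2), estimate)
```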

The month 54 primary endpoint is time to composite clinical event, including all-cause mortality, liver transplant, and significant hepatic events. Data for the month 54 primary endpoint will be available in 2028. The committee concluded that resmetirom is more effective than placebo for resolving MASH and improving fibrosis. But this is uncertain because only 1 year of follow-up is currently available.

Use of resmetirom in clinical practice

3.6

In MAESTRO‑NASH, liver biopsy was used to determine treatment eligibility and measure outcomes. But in clinical practice, these will be done using non-invasive tests. The clinical experts explained that a step-wise approach to non-invasive testing is recommended in most clinical guidelines, and that this would be used to determine treatment eligibility for resmetirom. Fibrosis risk is currently assessed in primary care using a non-invasive Fibrosis-4 (FIB-4) score based on simple blood tests to determine the risk of clinically significant liver fibrosis. The FIB-4 is a screening test with a high negative predictive value, which means it is accurate at identifying people that do not have moderate to advanced liver fibrosis. If the FIB-4 value shows high or intermediate risk of clinically significant fibrosis, a second test is used to confirm this, using Vibration-Controlled Transient Elastography (VCTE) or Enhanced Liver Fibrosis (ELF). VCTE is an imaging-based ultrasound test for measuring liver stiffness, and ELF is a blood test. The clinical experts said that repeating FIB-4 tests annually in primary care is an effective way of reducing missed cases of clinically significant fibrosis and monitoring disease progression. They said that in routine practice, biopsy is used only in cases of diagnostic uncertainty.
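As an illustration of the first step in this pathway, the sketch below computes a FIB-4 score using the commonly published formula: age multiplied by AST, divided by the product of platelet count and the square root of ALT. The function name, argument names and example values are hypothetical. The cut-offs that define low, intermediate or high risk are not stated here, so the sketch stops at the score itself.

```python
import math


def fib4_score(age_years, ast_u_per_l, alt_u_per_l, platelets_10e9_per_l):
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT)).

    Units assumed: age in years, AST and ALT in U/L,
    platelet count in 10^9 per litre.
    """
    inputs = (age_years, ast_u_per_l, alt_u_per_l, platelets_10e9_per_l)
    if min(inputs) <= 0:
        raise ValueError("all inputs must be positive")
    return (age_years * ast_u_per_l) / (platelets_10e9_per_l * math.sqrt(alt_u_per_l))


# Hypothetical example values, not taken from the evidence submission
example = fib4_score(age_years=55, ast_u_per_l=40, alt_u_per_l=36, platelets_10e9_per_l=200)
print(round(example, 2))  # 2200 / 1200, prints 1.83
```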

People would only be eligible for resmetirom in clinical practice if they have VCTE or ELF results that are consistent with liver fibrosis stages F2 or F3. The company proposed using a VCTE liver stiffness measurement (LSM) threshold of 10 to 19.9 kilopascal (kPa), or equivalent threshold for other non-invasive tests. This aligns with expert panel recommendations from Noureddin et al. (2024). The company said that VCTE performs well in identifying moderate to advanced fibrosis. It also explained that the proposed thresholds are consistent with fibrosis stages F2 and F3 as defined by biopsy, in line with the anticipated marketing authorisation for resmetirom. To exclude people with clinical evidence of cirrhosis, who would not be eligible for resmetirom, the company said an LSM above 20 kPa would be used (or an LSM above 15 kPa with a platelet count below 150,000 per microlitre, indicating clinically significant portal hypertension).

To evaluate resmetirom response annually, the company proposed using the following VCTE thresholds and stopping rules:

Improved response (consistent with F0 or F1): LSM 8 kPa or below, or 30% or greater improvement. Stop resmetirom treatment if LSM is 8 kPa or below after 2 assessments, 3 months apart. Repeat VCTE or ELF every 12 months.
Stable response (consistent with F2 or F3): LSM 10 to 20 kPa, or 30% or less improvement. Continue resmetirom treatment and repeat VCTE or ELF every 12 months.
Non-response (consistent with F4 compensated cirrhosis or worse): LSM greater than 20 kPa, or 30% or greater worsening (or LSM above 15 kPa with platelet count below 150,000 per microlitre, indicating clinically significant portal hypertension). Stop resmetirom treatment.
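The annual response rules above can be sketched as a small classification function. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the order in which the rules are checked (non-response first) is not specified by the company, and anything that meets neither the non-response nor the improved-response criteria is treated as a stable response. The confirmatory repeat test and the stopping decisions themselves are not modelled.

```python
def classify_vcte_response(baseline_lsm_kpa, current_lsm_kpa, platelets_per_ul=None):
    """Classify an annual VCTE result against the proposed thresholds.

    Assumption: percentage change is measured relative to the baseline LSM,
    with positive values meaning worsening (a stiffer liver).
    """
    if baseline_lsm_kpa <= 0:
        raise ValueError("baseline LSM must be positive")
    change = (current_lsm_kpa - baseline_lsm_kpa) / baseline_lsm_kpa

    # LSM above 15 kPa with platelets below 150,000 per microlitre
    portal_hypertension_flag = (
        platelets_per_ul is not None
        and current_lsm_kpa > 15
        and platelets_per_ul < 150_000
    )

    # Assumption: non-response rules take precedence over the others
    if current_lsm_kpa > 20 or change >= 0.30 or portal_hypertension_flag:
        return "non-response"
    if current_lsm_kpa <= 8 or change <= -0.30:
        return "improved response"
    return "stable response"


# Hypothetical examples
print(classify_vcte_response(12.0, 7.5))            # improved response
print(classify_vcte_response(12.0, 11.0))           # stable response
print(classify_vcte_response(14.0, 16.0, 120_000))  # non-response
```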

The EAG had concerns about the specific non-invasive test thresholds proposed by the company. It noted that a proportion of the MAESTRO‑NASH population did not meet the proposed VCTE thresholds (the company considers the exact proportion commercial in confidence, and it cannot be reported here). Its clinical advisers indicated that the number of people who would be eligible for treatment using the proposed non-invasive test thresholds may be large. This group is also likely to include people with low risk of disease, including people with F0 and F1 fibrosis status who are not eligible for treatment with resmetirom. The EAG was concerned that the proposed thresholds could lead to overtreatment, based on evidence from several large cohort studies that demonstrated low risk of cirrhosis in many people with LSM above 10 kPa. The EAG also raised concerns about the company's proposed stopping rules and the additional costs of implementing these.

First, the EAG noted responses from clinical experts and NHS England during the technical engagement stage that improvements in access to non-invasive fibrosis testing will be required to implement resmetirom in clinical practice. The cost of these improvements had not been included in the company's economic analysis. The EAG preferred to include the additional cost of 1 VCTE test and 1 consultation in the resmetirom arm to reflect these costs.

Second, the EAG agreed that stopping resmetirom treatment was appropriate upon progression to decompensated cirrhosis or liver cancer. But it was uncertain if it was appropriate to stop treatment upon progression to F4 compensated cirrhosis. This was because there was no fibrosis stage stopping rule in MAESTRO‑NASH and there is an ongoing clinical trial evaluating resmetirom in people with compensated cirrhosis. The EAG was also uncertain about whether it was appropriate to stop resmetirom treatment after fibrosis regression to F0 or F1. This was because the summary of product characteristics for resmetirom does not specify a stopping rule based on fibrosis regression. The EAG questioned the feasibility of implementing this stopping rule in clinical practice because non-invasive tests do not reliably distinguish between adjacent fibrosis stages. The EAG also noted that a repeat test after 3 months may be needed to confirm test results before stopping treatment after fibrosis regression or progression. So, it preferred to include the cost of an additional confirmatory VCTE test and 3 months of additional resmetirom treatment while test results are confirmed.

The committee heard from clinical experts that the proposed VCTE thresholds were appropriate. The clinical experts agreed that using a more conservative LSM threshold of 10 kPa to determine treatment eligibility was appropriate because it would reduce the risk of treating people with mild disease. The clinical experts also said that an LSM above 20 kPa was indicative of compensated cirrhosis. They said that it would be appropriate to stop treatment if disease progressed to compensated cirrhosis because there is not any established evidence on the efficacy of resmetirom treatment in this population. The clinical experts explained that it may be appropriate to stop treatment in some people if LSM dropped below 8 kPa because this was evidence of fibrosis regression. But other drivers of disease would also need to be considered. For example, if someone had fibrosis regression but there was evidence of continued inflammation or they still had cardiometabolic risk factors, it may be appropriate to continue resmetirom treatment. The clinical experts also explained that the coefficient of variation for VCTE was 30%, which means that at least a 30% change needs to be seen for the result to be outside of the margin of error. Because of this, the clinical experts agreed with the EAG that a repeat test should be done before stopping treatment and that this should be done 3 months after the first test. The clinical experts also said that tests should be done annually to measure fibrosis risk outcomes.

Based on the company, EAG and clinical expert comments, the committee concluded that:

Implementing resmetirom in clinical practice would require investment to expand non-invasive testing capacity and address geographical variation. The committee preferred the EAG approach of including the cost of an additional VCTE test and additional consultation in the resmetirom arm to start treatment. But the committee noted that this approach was unlikely to fully reflect the cost of improving non-invasive testing capacity.
The proposed VCTE thresholds for determining treatment eligibility were appropriate, but the committee was uncertain about the risk of overtreating people with mild disease.
It was appropriate to stop resmetirom treatment upon progression to F4 compensated cirrhosis (consistent with LSM above 20 kPa).
It may be appropriate to stop resmetirom treatment upon fibrosis regression to F0 or F1 (consistent with LSM 8 kPa or below), depending on other drivers of disease, such as obesity. However, this stopping rule was not included in the trial, and so the committee was uncertain about the proportion of people in clinical practice who would stop resmetirom treatment upon regression to F0 or F1.
Before stopping treatment a confirmatory VCTE test will be done 3 months after the first test, and resmetirom treatment will continue during this time.
All people who have a FIB-4 test result that shows high or intermediate risk of clinically significant fibrosis will have an annual VCTE or ELF to monitor disease progression.
It was uncertain if ELF tests would be used as an alternative to VCTE, and if the proposed ELF thresholds for initiating and stopping resmetirom treatment were appropriate.

Generalisability

3.7

The EAG noted that nearly all people in MAESTRO‑NASH had a liver biopsy at baseline to assess liver fibrosis stage for treatment eligibility, and again at 52 weeks to assess response to treatment. However, this is not required or desirable as a diagnostic test for MASH with fibrosis in UK clinical practice. The company said that liver biopsy was required by regulators for determining trial eligibility and for assessing outcomes in the trial. It said that the use of non-invasive tests in clinical practice does not undermine the external validity of the trial. But the EAG noted that a proportion of the trial population did not meet the VCTE criteria proposed by the company for treatment eligibility (see section 3.6). The EAG explained that this could have implications for the generalisability of the treatment effect observed in the trial to clinical practice. Because non-invasive tests cannot directly assess histological fibrosis stage (for example, they cannot reliably distinguish between F1 and F2), the EAG said that the use of non-invasive tests could lead to a proportion of people without fibrosis stage F2 or F3 having treatment in clinical practice. This could potentially result in the observed treatment effect of the trial being reduced in clinical practice. The EAG felt it was reassuring that the outcomes of a subgroup analysis provided by the company restricting the trial population to people who met the proposed non-invasive test criteria were similar to the outcomes of the whole trial population. But the EAG said that this subgroup analysis only partially addressed the uncertainty, because it could not account for people who were excluded from the trial but who would meet the non-invasive test thresholds for treatment in practice. So the EAG thought it remained unclear how the use of non-invasive tests would affect the generalisability of the observed treatment effect to clinical practice.
At the committee meeting, the clinical experts said that the use of non-invasive tests in clinical practice is a recent development but was unlikely to meaningfully undermine the generalisability of the clinical trial.

The EAG also noted that MAESTRO‑NASH only included people who had 3 or more cardiometabolic risk factors, such as obesity and type 2 diabetes. But the company anticipated that resmetirom would be prescribed to people with MASH and 1 or more cardiometabolic risk factors. The company noted that people referred to secondary care in the UK have several cardiometabolic risk factors. Evidence from the Schneider et al. (2024) analysis of the UK Biobank dataset showed that the metabolic characteristics and demographic profiles of people in MAESTRO‑NASH resemble those of an 'at-risk MASH' subgroup (people with MASH and stage F2 or higher fibrosis) in clinical practice. So the company considered the MAESTRO‑NASH population to be reflective of the population referred to secondary care. The EAG agreed that people who would be eligible for resmetirom are likely to have multiple cardiometabolic risk factors. It noted comments from clinical experts during the technical engagement stage that most people with MASH and moderate to advanced fibrosis have 3 or more cardiometabolic risk factors. But the EAG noted that the anticipated marketing authorisation does not restrict the population to people with 3 or more cardiometabolic risk factors. So, it said that this restriction in MAESTRO‑NASH could limit the generalisability of the trial evidence if some people have 1 or 2 cardiometabolic risk factors in clinical practice.

The committee considered the implications of the use of liver biopsy and requirement for 3 or more cardiometabolic risk factors in MAESTRO‑NASH for the generalisability of the clinical trial evidence. The committee noted the clinical expert comments that most people who would be eligible for resmetirom are likely to have multiple cardiometabolic risk factors. It also noted that this was supported by evidence from Schneider et al. So the committee felt that the requirement for 3 or more cardiometabolic risk factors was unlikely to affect the generalisability of the trial. It noted the clinical expert comments that the use of biopsy in the trial and non-invasive tests in clinical practice was unlikely to meaningfully undermine the generalisability of the trial. But the committee agreed with the EAG that there was uncertainty about how this would affect the generalisability of the trial. The committee concluded that there was uncertainty about the generalisability of MAESTRO‑NASH and it would take this into consideration in its decision making.

Reliance on surrogate endpoints

3.8

The EAG noted that the main drivers of the model (avoidance of advanced liver complications and improved survival) were not directly observed in MAESTRO‑NASH. Instead, these outcomes are inferred indirectly through intermediate changes in fibrosis stage, MASH resolution and risk factors. This is because MAESTRO‑NASH is ongoing and only 1 year of follow-up data for fibrosis improvement and MASH resolution is currently available. The EAG explained that, because fibrosis stage and MASH resolution have not been validated as surrogate outcomes, there is uncertainty in the long-term clinical outcomes of people who take resmetirom and the modelled clinical outcomes. The EAG said that, while fibrosis stage is associated with adverse liver-related outcomes, the current evidence demonstrates associations rather than causal relationships. It said there is no direct evidence that improvements in fibrosis stage lead to better clinical outcomes. No pharmacotherapy trials have shown that treatment-induced fibrosis regression reduces liver-related events or mortality. During the technical engagement stage, the company said that the US Food and Drug Administration and European Medicines Agency accepted fibrosis improvement and MASH resolution in MAESTRO‑NASH as surrogate endpoints for accelerated approval and conditional marketing authorisation, respectively. The company also explained that fibrosis stage is a well-established prognostic factor in MASH, with higher stages associated with increased risk of severe liver disease, major adverse liver outcomes and mortality. The company provided evidence from several studies demonstrating the relationship between fibrosis stage and liver-related morbidity and mortality. The clinical experts said that longitudinal studies consistently demonstrate a correlation between fibrosis stage and long-term outcomes. 
Although data shows an association between steatohepatitis grade and fibrosis stage, there is less evidence to support MASH resolution as a valid surrogate outcome for survival and risk of advanced liver complications. The committee recalled that the month 54 primary endpoint of MAESTRO‑NASH is time to composite clinical event, including all-cause mortality, liver transplant, and significant hepatic events. The committee noted that data for the month 54 primary endpoint will be available in 2028. The committee concluded that there was uncertainty about the validity of fibrosis improvement and MASH resolution as surrogate outcomes, and it would take account of this uncertainty in its decision making. But it concluded that this uncertainty could potentially be addressed in the future with data from the month 54 primary endpoint of MAESTRO‑NASH.

Economic model

Company's modelling approach

3.9

The company developed an individual patient simulation model in R. The model had an annual cycle length and compared resmetirom with standard care, which included diet and healthy living changes. Health states in the model were defined by MASH resolution status and liver fibrosis stage (F0 to F3). People could also progress to advanced liver disease (F4 compensated cirrhosis, decompensated cirrhosis, hepatocellular carcinoma and liver transplantation). At any point during the time horizon, people could experience comorbid events such as type 2 diabetes and cardiovascular disease, which were determined by liver fibrosis stage and individual patient characteristics (including age, sex, and BMI). The EAG thought that the company's model structure and approach were consistent with previous cost-effectiveness analyses. It said that the individual patient simulation approach provided flexibility by allowing liver and non-liver related outcomes to be modelled at the same time. The EAG recognised the advantages of this approach in the context of MASH, where treatments may have multisystemic effects. But it noted that the individual patient simulation approach substantially increased the model complexity. The EAG felt it was unclear whether this additional complexity was justified. This was because there was limited evidence to support the non-liver effects of resmetirom, and these effects had a relatively small contribution to the overall costs and quality-adjusted life years (QALYs). The committee concluded that the modelling approach was broadly suitable, but it was uncertain if the additional complexity of the individual patient simulation was justified. It was also uncertain if the health states defined by histological fibrosis stage were relevant to clinical practice, given that access to treatment and response would be determined by non-invasive tests. 
To explore this structural uncertainty, the committee requested a supplementary cohort state transition model with health states based on the non-invasive test thresholds that would be used in clinical practice (see section 3.11). The purpose of this request is to allow the committee to validate the outcomes predicted in the company's main model.

Validation of the company model

3.10

The EAG noted that, while the revised model submitted during technical engagement addressed a number of concerns it had raised and was an improvement because it was more transparent, it still contained some errors. It noted that although the model appeared to work correctly, it could not be certain if there were any unidentified errors because it contained thousands of lines of R code that the EAG was not able to fully check. The committee concluded that the company's updated model was a substantial improvement compared with the previous model. But it concluded that the complexity of the model meant it was uncertain whether there were any unidentified errors. The committee noted that the company had completed the TECH-VER checklist. The committee requested that the company completes the AdViSHE checklist during consultation. The committee also requested a supplementary cohort state transition model with health states based on the non-invasive test thresholds that would be used in clinical practice (see section 3.11). The purpose of this request is to allow the committee to validate the outcomes predicted in the company's main model.

Non-invasive test thresholds used as a proxy for fibrosis stage

3.11

The EAG noted that the health states in the company's model structure are based on liver fibrosis stage rather than response as measured by non-invasive tests in clinical practice. During the technical engagement stage, the company proposed non-invasive test thresholds for defining stable response, improved response and non-response in clinical practice (see section 3.6). The company argued that its proposed clinical algorithm for initiating treatment, evaluating response and stopping treatment based on non-invasive tests is consistent with the model structure based on histological fibrosis stage. It thought that the non-invasive test thresholds for:

improved response were consistent with fibrosis stages F0 and F1
stable response were consistent with fibrosis stages F2 and F3
non-response were consistent with F4 compensated cirrhosis or worse.

The EAG noted that the proposed non-invasive test thresholds were assumed to correspond directly to fibrosis stages. But non-invasive tests cannot reliably distinguish between intermediate fibrosis stages (for example between F2 and F3). It also noted that non-invasive tests cannot accurately measure MASH resolution. The EAG expressed concern that structural misalignment between the model structure and the way resmetirom would be used in clinical practice makes it unclear if the model's predictions (based on histological outcomes) are representative of outcomes in clinical practice using non-invasive tests. The committee noted that in clinical practice it would not be possible to distinguish between fibrosis stages F0 and F1, or between F2 and F3, using non-invasive tests and the 30% coefficient associated with the VCTE test. It also noted that healthcare professionals and people with the condition are unlikely to know if MASH has been resolved based on non-invasive tests. So, it was uncertain whether the company's model structure based on liver fibrosis stage and MASH resolution was relevant to clinical practice. The committee concluded that a supplementary cohort model would help to address the structural uncertainty and validate the outcomes predicted in the company's main model, and requested that this be provided during consultation. The cohort model should be based on the non-invasive test LSM thresholds proposed by the company and include the following 3 health states:

improved response (consistent with histological fibrosis stages F0 and F1)
stable response (consistent with histological fibrosis stages F2 and F3)
non-response (consistent with histological fibrosis stage F4 compensated cirrhosis and advanced liver complications, including liver cancer, liver transplant and decompensated cirrhosis).

Health-state occupancy plots should also be provided to validate the disease progression and survival predicted in the model (see section 3.21).
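To make the request concrete, the sketch below shows the general shape of a cohort state transition model with the 3 requested health states and the health-state occupancy trace that such plots would be based on. It is a minimal illustration only. The transition probabilities are hypothetical placeholders and are not taken from MAESTRO‑NASH or any other source, the annual cycle mirrors the company's main model, and the separate 'dead' state is an added assumption so that survival can be traced.

```python
STATES = ["improved response", "stable response", "non-response", "dead"]

# HYPOTHETICAL annual transition probabilities (each row sums to 1).
# Row = state at start of cycle, column = state at end of cycle.
TRANSITIONS = [
    [0.85, 0.12, 0.01, 0.02],  # from improved response
    [0.10, 0.80, 0.07, 0.03],  # from stable response
    [0.00, 0.05, 0.85, 0.10],  # from non-response
    [0.00, 0.00, 0.00, 1.00],  # dead is absorbing
]


def run_cohort(start_distribution, transitions, cycles):
    """Return the health-state occupancy trace, one row per annual cycle."""
    n_states = len(start_distribution)
    trace = [list(start_distribution)]
    for _ in range(cycles):
        previous = trace[-1]
        current = [
            sum(previous[i] * transitions[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        trace.append(current)
    return trace


# Assumption: everyone starts in 'stable response' (consistent with F2 or F3)
trace = run_cohort([0.0, 1.0, 0.0, 0.0], TRANSITIONS, cycles=5)
for year, occupancy in enumerate(trace):
    print(year, [round(share, 3) for share in occupancy])
```

Plotting each column of the trace against time gives the occupancy curves, which could then be compared with the proportions predicted by the individual patient simulation.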

Natural history and face validity of model progression rates

3.12

The company modelled disease progression by repeating 1-year transition probabilities derived from MAESTRO‑NASH for the entire time horizon. The company felt it was inappropriate to use data from the literature to inform disease progression in the model. This was because of issues it highlighted with the available studies. It said that pooled disease progression rates from the available natural history studies are not representative of the target population because they mix people with different risk profiles and baseline disease stages. The company noted that the authors of Singh et al. (2015) reported a limited ability to estimate fibrosis progression rates in people with intermediate or advanced fibrosis at baseline. It also said that the study may have underestimated the disease progression rates in people with advanced disease because they may not have had repeat histological examination. The company also had concerns about the reliability and applicability of Le et al. (2023), because this study found that disease progression slows with advancing fibrosis, which is inconsistent with the observations of rapid fibrosis progression in the MAESTRO‑NASH advanced fibrosis population. The company provided 4 scenario analyses that incorporated natural history evidence from Le et al. and Ng et al. (2022). It did this by deriving transition probabilities between fibrosis stages for standard care from the studies and applying a relative risk for resmetirom derived from MAESTRO‑NASH. The company said that these scenario analyses may be informative, but they do not reflect the costs and benefits of resmetirom because they come from aggregate data in heterogeneous MASH populations.
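The scenario approach described above (standard care transition probabilities from natural history studies, with a relative risk applied for resmetirom) can be sketched as follows. This is one possible way of doing it, not the company's implementation. All numbers are hypothetical, the relative risk is applied to progression only, and the probability removed from progression is assumed to be reassigned to remaining in the same stage.

```python
def apply_relative_risk(standard_care_row, relative_risk, progress_state, stay_state):
    """Scale the probability of progression by a relative risk.

    standard_care_row maps destination state -> annual probability.
    Assumption: the difference in progression probability is moved to
    'stay_state' so that the row still sums to 1.
    """
    treated = dict(standard_care_row)
    treated[progress_state] = standard_care_row[progress_state] * relative_risk
    treated[stay_state] += standard_care_row[progress_state] - treated[progress_state]
    if any(p < 0 or p > 1 for p in treated.values()):
        raise ValueError("adjusted probabilities fall outside 0 to 1")
    return treated


# HYPOTHETICAL annual probabilities for someone starting the year in F2
standard_care_f2 = {"F1": 0.05, "F2": 0.80, "F3": 0.15}

# HYPOTHETICAL relative risk of progression for resmetirom versus standard care
treated_f2 = apply_relative_risk(
    standard_care_f2, relative_risk=0.6, progress_state="F3", stay_state="F2"
)
print({state: round(p, 3) for state, p in treated_f2.items()})
```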

The EAG expressed concern about the company's approach to modelling disease progression. The EAG felt that the scenario analyses provided by the company were informative for assessing the face validity of the model predictions in the standard care arm. It compared the health-state occupancy in the 4 scenario analyses provided by the company and the company base case. The EAG noted that people in the standard care arm in the company base case spend substantially more time in the F0 or non-alcoholic fatty liver health states, and less time in F2 or F3, than in any of the scenario analyses. The EAG said this indicates that the company's model generates clinically implausible predictions of disease progression that may be too optimistic. The EAG acknowledged the limitations of Singh et al. and noted the company's concerns regarding the reliability of the natural history estimates in Le et al. It said that, while there may be anomalies in the calculations of Le et al., these do not invalidate the broad conclusions of the study. The EAG thought that Singh et al. and Le et al. indicate that progression to advanced liver disease typically occurs over a prolonged period of time. It also said that these studies do not provide support for the assumption that fibrosis progression accelerates at higher fibrosis stages. Although studies included in Le et al. represent a heterogeneous population, it was not clear that these populations were less representative of the target population than people recruited to MAESTRO‑NASH.

The committee was aware of the McPherson et al. (2015) and McPherson et al. (2017) studies, which were not considered for validation of the model's predicted progression rates. The committee thought that these studies may be relevant because they report results of sequential liver biopsies in people with MASLD who had treatment in the UK. The committee requested that data about disease progression from these studies be used to inform further scenario analyses.

The committee noted that the life-year predictions shown in the health-state occupancy table provided by the EAG were discounted. So the committee felt it was difficult to interpret the comparison of health-state occupancy in the company base case and the scenario analyses. The committee concluded that additional evidence is needed to validate disease progression in the standard-care arm of the model. So that it could assess the face validity of the disease progression rates predicted in the company's microsimulation model, the committee requested that health-state occupancy plots be presented for both arms of:

the company base case
the EAG base case
the 4 scenario analyses provided by the company
additional analyses incorporating data from McPherson et al. (2015) and McPherson et al. (2017).
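
To illustrate what a health-state occupancy comparison involves, and why discounted life years are harder to interpret than an undiscounted trace, the following is a minimal Python sketch of a cohort-level trace. It is not the company's microsimulation model. The health states, the transition probabilities, the starting distribution and the 3.5% discount rate are all hypothetical assumptions chosen for illustration, and no half-cycle correction is applied.

```python
# Hypothetical health states and annual transition probabilities.
# These values are illustrative assumptions, NOT taken from MAESTRO-NASH.
STATES = ["F2", "F3", "F4", "Dead"]
P = [
    [0.85, 0.10, 0.03, 0.02],  # from F2
    [0.05, 0.80, 0.12, 0.03],  # from F3
    [0.00, 0.00, 0.92, 0.08],  # from F4
    [0.00, 0.00, 0.00, 1.00],  # Dead is absorbing
]


def step(dist, matrix):
    """Apply one annual cycle: new[j] = sum_i dist[i] * matrix[i][j]."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def occupancy_trace(start, matrix, years):
    """Return the cohort distribution at years 0..years (one row per year)."""
    trace = [list(start)]
    for _ in range(years):
        trace.append(step(trace[-1], matrix))
    return trace


def state_years(trace, rate=0.0):
    """Sum occupancy per state over all cycles, discounted at `rate` per year."""
    totals = [0.0] * len(trace[0])
    for year, dist in enumerate(trace):
        weight = 1.0 / (1.0 + rate) ** year
        for j, share in enumerate(dist):
            totals[j] += share * weight
    return totals


# The same 1-year probabilities are repeated for the whole time horizon.
trace = occupancy_trace([0.5, 0.5, 0.0, 0.0], P, years=40)
undiscounted = state_years(trace)
discounted = state_years(trace, rate=0.035)  # assumed discount rate
for name, u, d in zip(STATES, undiscounted, discounted):
    print(f"{name}: {u:.2f} undiscounted vs {d:.2f} discounted years in state")
```

Plotting each column of `trace` against time gives an occupancy plot. Discounting shrinks later years more than earlier ones, so discounted totals understate time spent in states that are mainly reached late in the time horizon.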

The committee also noted that it was important to validate the model's predictions about survival. The committee requested that the survival estimates predicted in the model be compared with survival estimates derived from suitable sources in the literature. It requested that survival be validated for the whole population, and also by histological fibrosis stage and VCTE LSM threshold.

Population used for transition probabilities

3.13

To derive transition probabilities between fibrosis stages and MASH resolution status, the company preferred to use the PLB‑W52 F2/F3 (n=766) cohort of the MAESTRO‑NASH trial. This was a sample of the trial that excluded people with liver fibrosis stage F1B at baseline and people without complete biopsy records. Biopsy records were considered incomplete if people did not have readable biopsy slides at baseline and week 52, and if 2 clinicians did not agree on the assessment of the biopsy images. This meant that people were excluded from this sample if they discontinued the study for any reason. The company said that the PLB‑W52 F2/F3 cohort was necessary because it was only possible to derive transition probabilities when both baseline and week 52 biopsies were complete. It said that using the PLB‑W52 F2/F3 cohort avoided the need to assume that any missing data was equivalent to non-response. The company also considered that the PLB‑W52 dataset was representative of the randomised MAESTRO‑NASH population. This was because baseline characteristics and demographic profiles in the PLB‑W52 F2/F3 and mITT‑LB‑W52 (n=955) populations were comparable.

The EAG did not consider PLB‑W52 F2/F3 to be representative of the full trial population. This was because the company's analytical approach assumed that missing data were missing completely at random, but the EAG felt that this assumption was implausible. It said that the PLB‑W52 cohort may be biased by discontinuation, for example due to adverse events related to resmetirom. It noted that discontinuation rates were higher in the resmetirom arms than the placebo arms. The EAG said that the treatment effect estimates for the PLB‑W52 F2/F3 cohort were biased in favour of resmetirom. The EAG preferred to use the mITT‑LB‑W52 population to derive transition probabilities for MASH resolution status. This was because the mITT‑LB‑W52 population reflected the prespecified primary analysis and provided the least biased treatment effect. But it used the PLB‑W52 F2/F3 population to derive transition probabilities between fibrosis stages because this was not possible with the mITT‑LB‑W52 population.

During the technical engagement stage, the company said that the PLB‑W52 F2/F3 cohort was unlikely to be biased in favour of resmetirom. This was because missing data were not necessarily attributable to adverse events or loss of response. It explained that data may be missing for several plausible clinical reasons. For example, clinicians may have been reluctant to do a week 52 biopsy on some people if they felt there was a risk of breaking the trial blinding. The company also referred to prespecified sensitivity analyses it had provided using different approaches for handling missing data. The sensitivity analyses showed that the treatment effect was consistent regardless of the approach. The EAG acknowledged that the company had explored the robustness of the results using different imputation methods for missing data. But it said that the additional analyses did not resolve the concern that missing data may have influenced the magnitude of the treatment effect. The EAG reiterated its preference to use the mITT‑LB‑W52 population to derive transition probabilities in the model. The committee considered the company and EAG approaches. It thought that the assumption that missing data were missing completely at random was unlikely to be plausible. It also considered that the primary endpoint results for the PLB‑W52 F2/F3 cohort were biased in favour of resmetirom. But it noted that the mITT‑LB‑W52 analysis may be biased against resmetirom because it assumed that missing data were equivalent to non-response. The committee concluded that the mITT‑LB‑W52 population reflected the prespecified analysis of the trial and provided the least biased estimate of treatment effect. So, this population should be used for deriving transition probabilities for MASH resolution status.

Repeated application of MASH resolution transition probabilities

3.14

The company repeatedly applied the 1-year MASH resolution transition probabilities derived from MAESTRO‑NASH in both arms for the entire time horizon. The company considered it biologically and clinically plausible for the MASH resolution rates observed at year 1 to persist over time because of the ongoing activation of the thyroid hormone receptor beta pathway, which is likely to result in a continued biological and clinical effect. The EAG acknowledged the company's biological rationale but said this relied on several assumptions that were not empirically established. It also noted that pharmacodynamic data from MAESTRO‑NASH indicates that reductions in liver fat occur within 16 weeks, indicating an early biological effect. The EAG said this suggests that long-term treatment may lead to continued MASH resolution for people whose MASH responds to treatment early. But it does not suggest that people whose MASH does not respond to treatment early are likely to have MASH resolution at a constant annual rate. The EAG said it was possible that some additional people could have MASH resolution with longer exposure to resmetirom treatment, but it was not possible to establish the proportion of people who might have MASH resolution after year 1 with only 1 year of follow-up data. So, the EAG preferred a conservative assumption that MASH resolution only occurs in the first annual cycle of the model. The clinical experts said there was limited data to support or refute the assumption of the same MASH resolution probability throughout the time horizon. One clinical expert said that MASH resolution would tend to happen within the first year of treatment, but fibrosis improvement is expected to be slower. The committee felt it was implausible that the treatment effect observed in year 1 would continue over the time horizon. The committee concluded that MASH resolution should only be modelled in year 1, based on the evidence that was available.
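
The difference between the 2 assumptions can be illustrated with a minimal Python sketch. It compares the cumulative share of people with MASH resolution when a 1-year probability is repeated every cycle and when resolution is only possible in the first cycle. The probability used is a hypothetical value, not a trial result. The sketch also assumes that resolution is permanent and ignores death and fibrosis transitions, which the actual model would not.

```python
def cumulative_resolution(p_year1, years, repeat_every_year):
    """Share of the cohort whose MASH has resolved by the end of `years`.

    Simplifying assumptions: resolution is permanent (no relapse), and
    death and fibrosis transitions are ignored.
    """
    if years < 1:
        return 0.0
    if repeat_every_year:
        # Each year the same probability applies to those not yet resolved.
        return 1.0 - (1.0 - p_year1) ** years
    # Resolution is only possible in the first annual cycle.
    return p_year1


P_YEAR1 = 0.25  # hypothetical 1-year resolution probability
for years in (1, 5, 10):
    repeated = cumulative_resolution(P_YEAR1, years, repeat_every_year=True)
    year1_only = cumulative_resolution(P_YEAR1, years, repeat_every_year=False)
    print(f"year {years}: repeated {repeated:.3f}, year 1 only {year1_only:.3f}")
```

Under repeated application, almost everyone in this simplified cohort has MASH resolution within 10 years, whatever the early response pattern. This is the extrapolation that the EAG and the committee considered was not supported by 1 year of follow-up data.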

Transitions to advanced liver complications

3.15

Because of the limited trial follow-up data, transition probabilities to the advanced liver complications health states were derived from literature. These health states include decompensated cirrhosis, hepatocellular carcinoma and liver transplant. For the transitions from fibrosis stage F3 to decompensated cirrhosis and F4 compensated cirrhosis to decompensated cirrhosis, the company preferred to derive transition probabilities using data from a study by Davidson et al. (2025). It preferred to derive the transition probability from F3 to hepatocellular carcinoma using data from Vilar-Gomez et al. (2018). For the transitions from F4 compensated cirrhosis to hepatocellular carcinoma, decompensated cirrhosis to hepatocellular carcinoma and decompensated cirrhosis to liver transplant, the company preferred to derive transition probabilities using data from Shang et al. (2026). The EAG noted that the data used for deriving transition probabilities from Davidson et al. was unpublished and not reported in the manuscript. This meant it was not able to verify the data. It also noted that the F3 to decompensated cirrhosis transition probability derived from Davidson et al. was higher than values reported elsewhere in the literature. So for the transition from F3 to decompensated cirrhosis, the EAG preferred to use data from Vilar-Gomez et al. For consistency, it also preferred to use Vilar-Gomez et al. for the transition probability from F4 compensated cirrhosis to decompensated cirrhosis. For the transition probabilities informed by Shang et al., the EAG agreed that this study was an appropriate source. But it said that the company's calculations relied on events from table 4 of Shang et al. where the at-risk period was undefined. The EAG preferred to use data from table 3, which reports cumulative incidence over a period of 5 years.

The committee noted that the data from Davidson et al. was unpublished. It also noted that the F3 to decompensated cirrhosis transition probability using data from Davidson et al. seemed high. Because the data from Davidson et al. could not be verified, the committee preferred the EAG approach of using data from Vilar-Gomez et al. to inform the transitions from F3 to decompensated cirrhosis, F3 to hepatocellular carcinoma and F4 compensated cirrhosis to decompensated cirrhosis. The committee also agreed with the EAG approach of using data from table 3 of Shang et al. to inform the transition probabilities from F4 compensated cirrhosis to hepatocellular carcinoma, decompensated cirrhosis to hepatocellular carcinoma and decompensated cirrhosis to liver transplant.
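
A defined at-risk period matters because a cumulative incidence can only be converted to a per-cycle transition probability if the period it covers is known. The following minimal Python sketch shows one common conversion, which assumes a constant event rate over the period. The guidance does not state which conversion the company or the EAG used, so this is an assumption, and the 10% input is a hypothetical value rather than a figure from Shang et al.

```python
import math


def annual_probability(cumulative_incidence, years):
    """Convert a cumulative incidence over `years` to a 1-year probability.

    Assumes a constant event rate (exponential time to event) over the period.
    """
    if not 0.0 <= cumulative_incidence < 1.0:
        raise ValueError("cumulative incidence must be in [0, 1)")
    if years <= 0:
        raise ValueError("years must be positive")
    rate = -math.log(1.0 - cumulative_incidence) / years  # events per year
    return 1.0 - math.exp(-rate)


# Hypothetical example: 10% cumulative incidence over 5 years.
p = annual_probability(0.10, 5)
print(f"annual transition probability: {p:.4f}")
```

Without a stated at-risk period there is no value to pass as `years`, which is why event counts with an undefined follow-up time cannot be converted reliably.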

Health state utility values

3.16

The company accepted the EAG's recommendation to derive utility values for the F0 to F3 health states using data from MAESTRO‑NASH. It preferred to use a beta distribution and logit link function for its modelling approach based on statistical goodness of fit. The company also accepted the EAG's preference to use the utility value from Papatheodoridi et al. (2023) for the F4 compensated cirrhosis health state. But it said that the value from Papatheodoridi et al. was measured using the EQ-5D-5L questionnaire. So it crosswalked this value to EQ-5D-3L, in line with the NICE manual.

The EAG noted that the company's mean utility values were not fully consistent with disease progression because they did not always decrease with increasing severity. It said that this was clinically implausible and likely reflected the small sample size. The EAG preferred to model the F0 to F3 utility values independently of fibrosis stage. This was because it felt the evidence of differential utility based on fibrosis severity was weak, noting comments from clinical experts that most people with F0 to F3 disease were asymptomatic. The EAG preferred to use a Gaussian distribution with identity link for its modelling approach. For the F4 compensated cirrhosis utility value, the EAG said that Papatheodoridi et al. was an appropriate source. But it confirmed with the study authors that the EQ-5D-5L values in the paper had already been crosswalked to EQ-5D-3L (meaning the transformation had been double counted in the company base case). So it preferred to use the original utility value from the study (0.725).

During the committee meeting, the patient experts said that, while MASH with fibrosis is considered asymptomatic in early stages, people still experience symptoms, such as fatigue. They said that these symptoms can affect their ability to do daily tasks, but they may not be attributed to the disease because they are generic. The committee agreed with the EAG that the company's mean utility values were not consistent with disease progression. It noted the patient expert comments that people with fibrosis stages F0 to F3 may experience some symptoms that can affect their health-related quality of life. The committee concluded that it preferred the EAG approach of modelling the F0 to F3 utility values independently of fibrosis stage, because this appeared consistent with disease progression. Because the Papatheodoridi et al. F4 compensated cirrhosis utility value had already been mapped to EQ-5D-3L, it preferred the utility value from the study without crosswalking (0.725). But the committee noted that the difference in health-related quality of life between F3 and F4 compensated cirrhosis was implausibly large in the company and EAG base cases. So, to derive the appropriate F4 compensated cirrhosis utility value, the committee preferred to apply the utility decrement from F3 to F4 compensated cirrhosis (-0.052) from Papatheodoridi et al. to the F3 utility value derived from MAESTRO‑NASH.

MASH resolution utility increment

3.17

The company included a utility increment for MASH resolution. It said that the coefficient for MASH resolution was statistically significant in its preferred analysis. The EAG considered that the evidence for an independent utility benefit for MASH resolution was weak. It noted that data from MAESTRO‑NASH indicated a borderline statistically significant association between MASH resolution and EQ-5D utility, but this was sensitive to the choice of regression model. Despite this, the EAG included a MASH resolution increment in its base case.

The committee agreed with the EAG that the evidence for a MASH resolution increment was weak. But it noted comments from patient experts about the symptoms experienced by people with MASH. To help determine whether a utility increment for MASH resolution is appropriate, the committee requested the exploration of a variety of regression models with the following covariates:

fibrosis stage alone
MASH status alone
fibrosis stage and MASH status.

All regression models should be adjusted for age and sex. Both beta (with logit link) and Gaussian (with identity link) generalised linear models should be explored. The committee requested that measures of goodness of fit should be reported and that p-values be reported to at least 3 decimal places.
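
The requested set of analyses amounts to 3 covariate sets crossed with 2 model families, giving 6 regression models. The following minimal Python sketch enumerates those specifications and shows one way to report p-values to 3 decimal places. It does not fit any model. The variable names (`eq5d_utility`, `fibrosis_stage`, `mash_resolved`, `age`, `sex`) and the formula notation are hypothetical and would need to match the actual analysis dataset and software.

```python
from itertools import product

# Hypothetical column names; every model is adjusted for age and sex.
ADJUSTERS = ["age", "sex"]
COVARIATE_SETS = {
    "fibrosis stage alone": ["fibrosis_stage"],
    "MASH status alone": ["mash_resolved"],
    "fibrosis stage and MASH status": ["fibrosis_stage", "mash_resolved"],
}
# Model family mapped to its link function.
FAMILIES = {"beta": "logit", "gaussian": "identity"}


def build_specs(outcome="eq5d_utility"):
    """Return one specification per covariate set and model family."""
    specs = []
    for (label, covariates), (family, link) in product(
        COVARIATE_SETS.items(), FAMILIES.items()
    ):
        formula = f"{outcome} ~ " + " + ".join(covariates + ADJUSTERS)
        specs.append(
            {"label": label, "family": family, "link": link, "formula": formula}
        )
    return specs


def format_p_value(p, decimals=3):
    """Report a p-value to `decimals` places, flagging values below the floor."""
    floor = 10 ** -decimals
    return f"<{floor:.{decimals}f}" if p < floor else f"{p:.{decimals}f}"


specs = build_specs()
for spec in specs:
    print(f"{spec['family']} ({spec['link']} link): {spec['formula']}")
```

Each specification would then be fitted in the analyst's chosen statistical package, with goodness-of-fit measures reported alongside the formatted p-values so that the models can be compared.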

Non-liver treatment effects

3.18

Some non-liver treatment effects are incorporated into the company model, including changes to low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), BMI and cardiovascular disease risk.

For LDL‑C and HDL‑C, the company incorporated changes observed in MAESTRO‑NASH in the first cycle of the model. From cycle 2, LDL‑C and HDL‑C evolve according to expected natural progression as liver disease progresses. The company said that resmetirom was expected to have a beneficial effect on lipid metabolism because of its mechanism of action. It noted that reductions in all lipid endpoints were observed after 4 weeks and sustained at 23 weeks and 52 weeks of resmetirom treatment. The EAG said that the company approach assumes a permanent treatment-related difference in LDL-C and HDL-C despite no statistically significant difference in HDL-C observed in MAESTRO‑NASH. The EAG preferred to assume no treatment-related difference for LDL-C and HDL-C after year 1, but it was not able to implement this assumption in the model.

The company said that BMI evolves annually in the model, based on data derived from the NHS Digital population and combined with a treatment effect observed in the trial. The EAG said that the model imposes a persistent and increasing treatment effect on BMI. This results in continuous weight decline for all people in both arms of the model over the entire time horizon. The EAG felt this was not clinically plausible and inconsistent with evidence on the natural history of weight change in MASH populations. It also noted that there is no long-term or statistically significant evidence to support a persistent or differential weight-loss effect for resmetirom. The EAG preferred to assume there was no further treatment-related change in BMI after year 1.

The company modelled the risk of cardiovascular disease events using risk equations linked to fibrosis stage. The company considered its approach to be conservative because it limited cardiovascular risk increases mainly to people with advanced liver disease or ongoing inflammatory activity. The company said that evidence from Shen et al. (2026) demonstrates that increasing MASLD fibrosis severity was associated with higher risks of major adverse cardiovascular events. The EAG said there was limited evidence to support the existence of an independent effect of fibrosis stage on cardiovascular disease risk. Although observational studies have reported associations between fibrosis stage and higher rates of cardiovascular disease events, the EAG said that these associations are likely to reflect shared underlying cardiometabolic risk factors. The EAG felt it was not appropriate to model fibrosis stage as directly affecting cardiovascular event risk, but it was not able to implement this assumption in its base case.

During the committee meeting, the clinical experts said that the effect of resmetirom on lipids was biologically plausible and was likely to be sustained while on treatment. The clinical experts explained that the effect of diet and exercise on weight loss in people with MASH tended to be modest, and that weight loss was particularly difficult to sustain. The clinical experts also said that data on cardiovascular disease risk was limited, although available evidence showed a correlation between fibrosis stage and cardiovascular disease risk.

The committee noted that there was limited evidence to support a quantitative benefit for resmetirom on LDL-C, HDL-C, or BMI beyond year 1. It noted that a treatment effect on lipids was biologically plausible but there was no evidence of a long-term benefit. It also noted that the difference in HDL-C observed in the trial was not statistically significant. The committee noted the clinical expert comment that weight loss was difficult to sustain. It also noted there was limited evidence to support a causal relationship between fibrosis stage and cardiovascular disease event risk. The committee concluded that no treatment-related difference in LDL-C, HDL-C or BMI should be modelled after year 1. The committee also concluded that it was not appropriate to model fibrosis stage as directly affecting cardiovascular disease event risks.

Severity

3.19

NICE's methods on conditions with a high degree of severity did not apply.

Cost-effectiveness estimates

Company and EAG cost-effectiveness estimates

3.20

The incremental cost-effectiveness ratios (ICERs) for the comparison of resmetirom with standard care were £35,034 using the company's base case and £57,755 using the EAG's base case assumptions. For the model assumptions, the committee preferred to:

not include costs of GLP‑1 receptor agonists in the model
include additional eligibility assessment costs in the resmetirom arm (1 additional VCTE test and 1 additional consultation)
assume that people stop resmetirom treatment upon progression to F4 compensated cirrhosis (consistent with LSM above 20 kPa)
assume a confirmatory VCTE test and an additional consultation before stopping resmetirom treatment, and an additional 3 months of resmetirom treatment during this time
assume that people who have a FIB-4 test result showing high or intermediate risk of clinically significant fibrosis will have an annual VCTE or ELF to monitor disease progression
use the mITT‑LB‑W52 population for deriving transition probabilities for MASH resolution status
assume MASH resolution is only possible in the first year
model utility values independently of fibrosis stage
apply the F3 to F4 compensated cirrhosis utility decrement (-0.052) from Papatheodoridi et al. to the F3 utility value from MAESTRO‑NASH to derive the utility value for F4 compensated cirrhosis
derive transition probabilities for F3 to decompensated cirrhosis, F3 to hepatocellular carcinoma and F4 to decompensated cirrhosis using data from Vilar-Gomez et al.
derive transition probabilities for F4 to hepatocellular carcinoma, decompensated cirrhosis to hepatocellular carcinoma, and decompensated cirrhosis to liver transplant using data from Table 3 of Shang et al.
assume no treatment-related change in LDL-C, HDL-C or BMI after year 1
model cardiovascular disease event risk independently from fibrosis stage.

Uncertainties to explore further in the modelling

3.21

NICE's technology appraisal and highly specialised technologies guidance manual notes that, above a most plausible ICER of £25,000 per QALY gained, judgements about the acceptability of a technology as an effective use of NHS resources will take into account the degree of certainty around the ICER. The committee will be more cautious about recommending a technology if it is less certain about the ICERs presented. But it will also take into account other aspects including uncaptured health benefits. The committee noted the high level of uncertainty, specifically that:

there is only 1 year of follow-up data available from MAESTRO‑NASH (see section 3.5)
treatment eligibility and clinical outcomes were determined using liver biopsy in the trial, but these would be determined using non-invasive tests in clinical practice (see section 3.6)
it was uncertain if ELF tests would be used as an alternative to VCTE, and if the proposed ELF thresholds for initiating and stopping resmetirom treatment were appropriate (see section 3.6)
the proportion of people who regress to fibrosis stage F0 or F1 who would stop resmetirom treatment is uncertain, and there was no stopping rule based on fibrosis regression in the trial (see section 3.6)
the validity of fibrosis improvement and MASH resolution as surrogate outcomes for advanced liver complications and survival is uncertain (see section 3.8)
the model is complex and it was uncertain whether there are any unidentified errors (see section 3.10)
the health states in the model are based on histological fibrosis stages instead of the non-invasive test thresholds that would be used in clinical practice (see section 3.11)
it is uncertain if the disease progression rates and survival predicted in the model have face validity (see section 3.12)
it is uncertain if it is appropriate to model a utility benefit for MASH resolution (see section 3.17).

The committee requested further analyses to help it understand the extent of the uncertainty and its impact on the cost effectiveness and determine its preferred ICER threshold for decision making. The committee requested the following analyses:

A supplementary cohort model based on non-invasive test thresholds to validate the outcomes predicted in the company's main model. The cohort model should be based on the non-invasive test thresholds proposed by the company and include the following 3 health states: improved response, stable response, and non-response (see section 3.11).
The company is asked to complete the AdViSHE checklist during consultation to validate its economic model and any supplementary models (see section 3.10).
The provision of health-state occupancy plots to validate disease progression and survival predicted in the model. In addition to the natural history studies already used, the following studies should also be explored: McPherson et al. (2015) and McPherson et al. (2017). Survival estimates predicted in the model should be compared with survival estimates derived from suitable sources in the literature (see section 3.12).
The exploration of further regression models to help determine whether it is appropriate to include a utility benefit for MASH resolution (see section 3.17).

Other factors

Equality

3.22

The committee considered that some ethnic groups are more likely to develop MASH than others. Race is a protected characteristic under the Equality Act 2010. But because its recommendation does not restrict access to treatment for some people over others, the committee agreed this was not a potential equalities issue.

Stakeholders noted that liver fibrosis is linked to BMI and suggested that lower BMI thresholds should be used for people of South Asian, Chinese, other Asian, Middle Eastern, Black African or African Caribbean family backgrounds for treatment eligibility for people living with MASLD/MASH and obesity. The committee noted that eligibility for resmetirom would not be determined based on BMI. So it concluded that it was not necessary to adjust BMI thresholds for people from certain family backgrounds.

Conclusion

Recommendations

3.23

The committee noted the important uncertainties in the clinical- and cost-effectiveness evidence. Because of these uncertainties, it is not possible to determine the most likely cost-effectiveness estimates for resmetirom. So, resmetirom should not be used. The committee concluded that the company should provide additional information for consideration at the next evaluation meeting (see section 3.21).
