4 Evidence

The diagnostics advisory committee (section 7) considered evidence on EndoPredict (EP score and EPclin score), MammaPrint, Oncotype DX (with and without the recurrence score-pathology-clinical [RSPC] calculator), Prosigna and IHC4 or IHC4+C from several sources. Full details of all the evidence are in the committee papers.

Clinical effectiveness

4.1 Evidence on the following outcomes was of interest in the clinical effectiveness review:

  • Prognostic ability – the degree to which the test can accurately predict the risk of an outcome such as disease recurrence.

  • Prediction of relative treatment effect – the ability of the test to predict which patients have disease that will respond to chemotherapy. It can be assessed by considering whether the relative treatment effect of chemotherapy or no chemotherapy on patient outcomes differs according to the test score.

  • Clinical utility – the ability of the prospective use of the test to affect patient outcomes such as recurrence and survival compared with current practice.

  • Decision impact – how the test influences decision making in terms of which patients will be offered chemotherapy.

4.2 A total of 153 references were included in the review. Studies assessing prognostic ability and prediction of relative treatment effect were quality assessed using relevant criteria from the draft prediction model study risk of bias assessment tool (PROBAST). Clinical utility studies were quality assessed using the Cochrane risk of bias tool for randomised controlled trials (RCTs).

Prognostic ability

4.3 Studies providing information on prognostic ability were retrospective analyses of RCT data or routinely collected data. Most of the studies excluded patients who did not have a large enough tissue sample for testing, which leaves the evidence base at potential risk of spectrum bias, because patients with smaller tumours (who may be systematically different to those with large tumours) are likely to be under-represented. In many studies patients had chemotherapy, which could affect event rates and therefore potentially reduce the apparent prognostic performance of a test. In other studies, patients who had chemotherapy were excluded from analyses, which may also lead to spectrum bias. Therefore studies in which all patients had endocrine monotherapy were preferable.

4.4 Results for prognostic ability were generally presented as unadjusted or adjusted analyses. Unadjusted analyses look at differences in the event rates among low, intermediate and high‑risk groups without adjusting for clinical and pathological variables. Adjusted analyses show whether the test has prognostic value over clinical and pathological variables.

Distribution of patients across risk categories

4.5 Among studies of patients with lymph node (LN)-negative disease who had endocrine monotherapy, in each group around 70% to 80% had disease that was categorised as low or low/intermediate risk across all tests (11 studies). Most MammaPrint studies had mixed endocrine and chemotherapy use and mixed hormone receptor status, with or without mixed human epidermal growth factor receptor 2 (HER2) status, so results may not be comparable with results from other tests. In these studies 20% to 61% of patients had disease that was categorised as low risk (6 studies). Most IHC4 or IHC4+C studies used quartiles or tertiles to define risk groups, so these studies do not provide useful information on the distribution of patients across risk categories.

4.6 The proportion of patients with low and intermediate risk was generally much lower in groups with LN-positive disease than in groups with LN-negative disease who had endocrine monotherapy (7 LN-positive studies). For Oncotype DX, however, the proportion of patients with low and intermediate risk was only slightly lower in the LN-positive group than in the LN-negative group. Studies of MammaPrint in patients with LN-positive disease were all done in groups with mixed hormone receptor status and mixed or unknown HER2 status, so results may not be comparable with results from other tests. In these studies 38% to 41% of patients had disease that was categorised as low risk (2 studies).

Oncotype DX

4.7 There were 11 data sets that provided information on the prognostic ability of Oncotype DX: 7 reanalyses of RCT data and 4 retrospective studies of routinely collected data. All studies were validation studies, and in 4 RCTs patients had endocrine monotherapy. Three of the studies were done in East Asia and it is uncertain whether the commercial version of Oncotype DX was used in these studies. Also, they may not be generalisable to England because usual clinical practice may differ between countries enough to affect prognostic outcomes. In addition, it is possible that people of different ethnicities have different underlying risk profiles and natural history of disease.

4.8 Unadjusted analyses indicated that Oncotype DX had prognostic accuracy (there were statistically significant differences between low-risk and high-risk groups) across various recurrence outcomes, regardless of lymph node status. However, hazard ratios between the intermediate-risk group and the high or low-risk groups were not always statistically significant, particularly in the group with LN‑positive disease.

4.9 In adjusted analyses, Oncotype DX provided statistically significant additional prognostic information over most commonly used clinical and pathological variables (age, grade, size, nodal status), regardless of lymph node status. A bespoke analysis of TransATAC study data also showed that Oncotype DX provided additional prognostic information over clinical and pathological tools to assess risk.

MammaPrint

4.10 There were 10 data sets that provided information on the prognostic ability of MammaPrint: 1 reanalysis of RCT data and 9 retrospective studies of routinely collected data. In addition, a further 4 studies pooled data on specific patients from the same 10 data sets. All studies were validation studies, and in 2 studies patients had endocrine monotherapy. Most studies included some patients who were out of scope (with HER2-positive or hormone receptor-negative disease or both).

4.11 In 6 of 7 unadjusted analyses, MammaPrint had prognostic accuracy (there were statistically significant differences between low-risk and high-risk groups) for 10-year distant recurrence-free survival or interval, regardless of LN status.

4.12 In adjusted analyses, a pooled analysis of patients with LN‑negative and LN-positive disease showed that MammaPrint had statistically significant prognostic accuracy for 10-year distant recurrence-free survival after adjusting for clinical and pathological variables. In patients with LN-negative disease, MammaPrint had statistically significant prognostic accuracy for 10-year distant recurrence-free interval when adjusted for Adjuvant! Online, Nottingham Prognostic Index (NPI) or clinical and pathological variables. In patients with LN-positive disease, MammaPrint had borderline statistically significant prognostic accuracy for 10-year distant metastasis-free survival when adjusted for clinical and pathological variables.

Prosigna

4.13 There were 8 data sets that provided information on the prognostic ability of Prosigna: 6 reanalyses of RCT data and 3 retrospective analyses of 2 prospective cohort studies. All studies were validation studies, and in 5 studies patients had endocrine monotherapy. Some studies included some patients who were out of scope (with HER2-positive or hormone receptor-negative disease or both).

4.14 Prosigna had statistically significant prognostic accuracy for 10‑year distant recurrence-free survival and interval in all unadjusted analyses of patients with LN-negative and LN-positive disease.

4.15 In analyses adjusted for clinical and pathological variables or tools, Prosigna had prognostic accuracy for 10-year distant metastasis‑free survival and distant recurrence-free survival. In patients with LN-negative disease the results were statistically significant. In patients with LN-positive disease the results were statistically or borderline significant.

EndoPredict

4.16 There were 3 data sets that provided information on the prognostic ability of EndoPredict; all were reanalyses of RCT data. All studies were validation studies, and in 2 of the 3 studies patients had endocrine monotherapy.

4.17 In unadjusted analyses, EndoPredict (EPclin) had statistically significant prognostic accuracy for 10-year distant recurrence-free survival and interval in patients with LN-negative and LN-positive disease.

4.18 Adjusted analyses of TransATAC data show that EndoPredict (EPclin) had statistically significant increases in likelihood ratio for 10-year distant recurrence-free interval over clinical and pathological variables or tools, regardless of LN status.

IHC4 and IHC4+C

4.19 There were 12 data sets that provided information on the prognostic ability of IHC4 and IHC4+C: 6 reanalyses of RCT data and 6 reanalyses of routinely collected data. Most of the data related to the IHC4 score alone, without including clinical factors. One of the studies was based on the derivation cohort for IHC4, and therefore may have overestimated prognostic ability. The remaining studies were validation studies. Patients had endocrine monotherapy in only 2 studies, 1 of which was the derivation cohort study.

4.20 In unadjusted analyses, IHC4 had statistically significantly better prognostic performance in groups with high risk than in groups with low risk (defined by quartiles or tertiles), regardless of lymph node status. However, no studies reported survival or recurrence outcomes by risk group. Also, many used laboratory methods that differed from the derivation study methodology. In adjusted analyses, IHC4 had additional prognostic value over clinical and pathological factors in 3 studies, but patients had endocrine monotherapy in only 1 of these studies.

4.21 Data on IHC4+C came from the derivation cohort and 1 validation cohort. These studies showed that IHC4+C had prognostic value in unadjusted analyses. In adjusted analyses IHC4+C provided statistically significantly more information than the NPI in LN‑negative, but not LN-positive, disease.

Prediction of relative treatment effect

4.22 In addition to estimating the risk of recurrence, the ability of Oncotype DX and MammaPrint to predict which patients have disease that will respond to chemotherapy was explored in 7 data sets. The external assessment group (EAG) reviewed evidence in support of this.

Oncotype DX

4.23 In 5 data sets (2 reanalyses of RCT data and 3 observational studies) reported across 11 published references and 1 confidential manuscript, analyses assessed the ability of Oncotype DX to predict relative treatment effects for chemotherapy.

4.24 The 2 reanalyses of RCTs suggest that Oncotype DX may predict differences in relative treatment effects for chemotherapy. Hazard ratios for disease-free survival for patients having chemotherapy compared with those having no chemotherapy suggested that the greatest relative treatment effect was for patients in the Oncotype DX high-risk category. Unadjusted interaction tests between Oncotype DX risk group and relative treatment effects were mainly statistically significant. Adjusted interaction tests were statistically significant in an analysis of patients with HER2-negative, LN‑negative disease, but in patients with LN-positive disease the interaction test was not significant when hormone receptor status was adjusted for. However, the data for the population with LN‑negative disease came from the derivation cohort for Oncotype DX and may overestimate predictive performance.

4.25 Results from the 3 observational studies were mixed and at high risk from confounding. One reported a statistically significant interaction test but this was only adjusted for a limited number of factors. Two others reported hazard ratios for chemotherapy compared with no chemotherapy; 1 study in patients with intermediate Recurrence Score results, and another in patients with high Recurrence Score results. Both of these studies reported statistically non-significant results.

4.26 The RSPC algorithm incorporates Oncotype DX plus age, tumour size and grade. There was a non-significant interaction test result between relative chemotherapy treatment effects and RSPC risk group.

MammaPrint

4.27 Two studies reported the ability of MammaPrint to predict the relative treatment effects for chemotherapy. In a pooled analysis including patients with LN-negative and LN-positive disease, the effect of chemotherapy compared with no chemotherapy was statistically significant in the MammaPrint high-risk group but not in the low-risk group in unadjusted and adjusted analyses. Further, the interaction test for chemotherapy treatment and risk group was non-significant. In a pooled analysis of patients with LN-positive disease, there was a non-significant interaction between chemotherapy treatment and risk group.

Clinical utility

4.28 The EAG noted that the best evidence for clinical utility was an RCT of treatment guided by the test compared with treatment guided by the comparator. There were no clinical utility data available for EndoPredict, Prosigna or IHC4+C.

Oncotype DX

4.29 Five data sets, reported across 9 published references and 1 confidential manuscript, reported evidence on the clinical utility of Oncotype DX. These studies included the low-risk group from TAILORx. One further study did not meet the inclusion criteria (because of insufficient follow-up length), but presented subgroup data according to age, lymph node status and ethnicity, and was therefore discussed by the EAG. Studies generally reported different outcomes, making comparisons across studies difficult. All studies were judged to be of poor quality using the Cochrane risk of bias tool for RCTs.

4.30 In patients with LN-negative disease, using the test in clinical practice appeared to result in low rates of chemotherapy in patients with low risk (2% to 12%), with acceptable outcomes (distant recurrence-free survival, distant recurrence-free interval or invasive disease-free survival 96% to 99.6%). Rates of chemotherapy increased with increasing risk category, and were generally higher in patients with LN-positive disease. It was not possible to conclude whether patients in intermediate and high-risk categories had better outcomes as a result of using Oncotype DX to guide treatment because there were no comparator groups (patients who had treatment without Oncotype DX testing).

4.31 One study (TAILORx; Sparano et al. 2018) reporting evidence on clinical utility was published after completion of the diagnostics assessment report. This study was a prospective, partially randomised study in which patients with an Oncotype DX Recurrence Score result of 0 to 10 had endocrine therapy, patients with Recurrence Score results of 26 and above had endocrine therapy plus chemotherapy, and those with Recurrence Score results of 11 to 25 were randomised to have either endocrine therapy alone, or endocrine therapy plus chemotherapy. The cut‑offs in this study were different to the cut-offs recommended by the company (less than 18, 18 to 30 and greater than 30; see section 3.15). The 2018 publication focused on the results from patients in the intermediate-risk group who were randomised to treatment. It reported that across all patients with Recurrence Score results of 11 to 25, there were no clinically relevant or statistically significant differences between those who had endocrine therapy alone and those who had chemotherapy plus endocrine therapy. Results for the primary end point of 9-year invasive disease-free survival were 84.3% with chemotherapy and 83.3% without chemotherapy; an absolute difference of 1.0% (hazard ratio [HR] 1.08, 95% confidence interval [CI] 0.94 to 1.24, p=0.26). The upper limit of the confidence interval (1.24) was within the pre-specified non-inferiority margin (HR 1.322). Results for freedom from distant recurrence at 9 years were 95% with chemotherapy and 94.5% without chemotherapy; an absolute difference of 0.5% (HR 1.10, 95% CI 0.85 to 1.41, p=0.48). However, exploratory subgroup analyses suggested that chemotherapy may have an effect in some subgroups, such as those with Recurrence Score results of 21 to 25 and possibly Recurrence Score results of 16 to 20, particularly in people aged 50 or under. The EAG noted that no analysis was available for the subgroup of patients with Recurrence Score results of 11 to 25 and a modified Adjuvant! Online high risk score.

MammaPrint

4.32 Two studies reported evidence relating to the clinical utility of MammaPrint. MINDACT was a prospective, partially randomised study in which clinical risk was determined using a modified version of Adjuvant! Online. Patients whose risk classifications from MammaPrint and modified Adjuvant! Online disagreed were randomised to chemotherapy or no chemotherapy. Of patients included in the study, 88% had HR-positive disease and 90% had HER2-negative disease, so some patients were outside the scope for this assessment. For the group who were high risk with modified Adjuvant! Online and low risk with MammaPrint, 5-year distant metastasis-free survival was 95.9% with chemotherapy and 94.4% without chemotherapy, a non-statistically significant absolute difference of 1.5% (adjusted hazard ratio for distant metastasis or death with chemotherapy compared with no chemotherapy, 0.78; 95% CI 0.50 to 1.21; p=0.27). For the group who were low risk with modified Adjuvant! Online and high risk with MammaPrint, 5-year distant metastasis-free survival was 95.8% with chemotherapy and 95.0% without chemotherapy, a non-statistically significant absolute difference of 0.8% (adjusted hazard ratio for distant metastasis or death with chemotherapy compared with no chemotherapy, 1.17; 95% CI 0.59 to 2.28; p=0.66). The EAG judged MINDACT to be at low risk of bias in terms of randomisation, allocation concealment and reporting. However, no details of blinding were reported.

4.33 Results from the RASTER study suggested that distant recurrence‑free interval rates were sufficiently low in the MammaPrint low-risk group for these patients to avoid chemotherapy. The 5-year distant recurrence-free interval rate for LN-negative disease was 97.0% for patients with low risk (15% had chemotherapy) and 91.7% for patients with high risk (81% had chemotherapy). In addition, MammaPrint provided additional prognostic information over Adjuvant! Online and the NPI, but not over the NHS PREDICT tool. The EAG judged RASTER to be at high risk of bias using the Cochrane risk of bias tool for RCTs.

Comparison of the tests with each other

4.34 There were 6 studies that compared more than 1 test: 4 reanalyses of RCTs and 2 observational studies. Evidence shows that generally when a test placed more patients in a low-risk category than another test, the event-free survival in the low-risk group was reduced. Also, the tests generally performed differently in patients with LN-negative and LN-positive disease.

4.35 Thirteen studies reported data from microarray analyses on more than 1 test; however, these studies had methodological limitations. The comparability of test algorithms applied to microarray data with the commercial assays was unknown, so the generalisability of findings from microarray studies to the decision problem was uncertain. All the studies reported data on Oncotype DX and MammaPrint, and 2 also reported data on EndoPredict. No studies reported data on Prosigna or IHC4+C. The microarray studies generally supported the conclusions from studies using the commercial versions of the assays in suggesting that Oncotype DX, MammaPrint and EndoPredict can discriminate between patients with high and low risk regardless of LN status. In terms of additional prognostic performance of the tests over clinical and pathological variables, EndoPredict appeared to have the greatest benefit, followed by Oncotype DX and then MammaPrint. However, because of the methodological limitations, the EAG judged that these studies did not provide conclusive evidence of the superiority of 1 test over others.

4.36 The OPTIMA Prelim study, a UK-based feasibility phase of an RCT, analysed concordance between different tests. The study included Oncotype DX, MammaPrint, Prosigna and IHC4 plus 2 other tests. Out of the 4 in-scope tests, MammaPrint assigned the most patients to the low-risk category, but unlike the other 3 tests it does not have an intermediate category. When the low and intermediate categories were treated as 1 category for the 3 tests that have 3 risk groups, Oncotype DX assigned the most patients to this category, and MammaPrint the least. Kappa statistics indicated modest agreement between tests, ranging from 0.33 to 0.53. Also, across 5 tests in the study, only 39% of tumours were uniformly classified as either low/intermediate risk or high risk by all 5 tests. Of these, 31% were classified as low/intermediate risk by all tests and 8% were high risk by all tests. The study authors concluded that although the tests assigned similar proportions of patients to low/intermediate-risk and high-risk categories, test results for an individual patient could differ markedly depending on which test was used.
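The kappa statistic quoted above measures agreement between 2 tests beyond what would be expected by chance. The following is a minimal sketch of Cohen's kappa for a square agreement table. The counts in the example are hypothetical and are not taken from OPTIMA Prelim; the function name is also an illustrative assumption.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] is the number of tumours placed in category i by test A
    and in category j by test B.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of tumours on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    # Chance agreement: product of the marginal proportions, summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total**2
    return (observed - expected) / (1 - expected)


# Hypothetical counts (not study data), categories: low/intermediate, high.
example = [[60, 15],
           [10, 15]]
kappa = cohens_kappa(example)
```

With these illustrative counts the 2 tests agree on 75% of tumours, yet kappa is well below 1 because much of that agreement would be expected by chance.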

Decision impact

4.37 The review of decision impact focused on studies done in the UK or the rest of Europe:

  • Oncotype DX: 6 UK studies and 12 other European studies

  • EndoPredict: 1 UK study and 3 other European studies

  • IHC4+C: 1 UK study and 0 other European studies

  • Prosigna: 0 UK studies and 3 other European studies

  • MammaPrint: 0 UK studies and 8 other European studies.

4.38 The percentage of patients with any change in treatment recommendation or decision (either to or from chemotherapy) in UK studies was 29% to 49% across 4 Oncotype DX studies, 37% in 1 EndoPredict study and 27% in 1 IHC4+C study. Ranges across European (non-UK) studies were 5% to 70% for Oncotype DX, 38% to 41% for EndoPredict, 14% to 41% for Prosigna and 13% to 51% for MammaPrint.

4.39 The net change in the percentage of patients with a chemotherapy recommendation or decision (pre-test to post-test) among UK studies was a reduction of 8% to 23% across 4 Oncotype DX studies, an increase of 1% in 1 EndoPredict study, and a reduction of between 2% and 26% in 1 IHC4+C study. Net changes across European (non-UK) studies were a reduction of 0% to 64% for Oncotype DX, a reduction of 13% to 26% for EndoPredict, a reduction of 2% to an increase of 9% for Prosigna, and a reduction of 31% to an increase of 8% for MammaPrint.

Anxiety and health-related quality of life

4.40 There were 6 studies that reported outcomes relating to anxiety (including worry and distress) and health-related quality of life. The lack of a comparator in the studies made it difficult to tell whether changes in anxiety experienced with the use of tumour profiling tests would also have occurred if patients received a definitive decision based on clinical risk factors alone. Overall, evidence suggests that tumour profile testing may reduce anxiety in some patients in some contexts, but generally there was little effect on health-related quality of life.

Cost effectiveness

Review of economic evidence

4.41 The EAG reviewed existing studies investigating the cost effectiveness of tumour profiling tests to guide treatment decisions in people with early breast cancer, and also did a detailed critique of the economic models and analyses provided by Agendia (MammaPrint), Genomic Health (Oncotype DX), and the chief investigator of a UK decision impact study (EndoPredict).

4.42 From the review, 26 studies were identified that had been published since the original assessment for diagnostics guidance 10. The models reported in the studies assessed the cost effectiveness of tumour profiling tests across different countries including the UK, the US, Canada, Mexico, Japan, Austria, Germany, France and the Netherlands. Most studies compared Oncotype DX (18 studies), MammaPrint (8 studies) or EndoPredict (1 study) with comparators such as Adjuvant! Online, the St Gallen guidelines, standard practice or other conventional diagnostic tools. There was variation between the analyses in the populations evaluated, the disease type and other patient characteristics.

4.43 There was a high level of consistency in the general modelling approach and structure, and several studies were based on a previously published model. Most of the models used a Markov or hybrid decision tree–Markov approach, 2 studies used a partitioned survival approach and 1 study used a discrete event simulation approach. The time horizons ranged from 10 years to the patient's remaining lifetime, with cycle lengths ranging from 1 month to 1 year when reported. Most of the models that evaluated Oncotype DX assumed that the test could predict relative treatment effects for chemotherapy.

Economic evaluation

4.44 None of the models identified in the literature review included all of the tests identified in the scope. Therefore, the EAG developed a de novo economic model designed to assess the cost effectiveness of Oncotype DX, MammaPrint, Prosigna, IHC4+C and EndoPredict compared with current practice without the use of the tumour profiling tests. The model used a lifetime time horizon (42 years) from the perspective of the UK NHS and personal social services. All costs and health outcomes were discounted at a rate of 3.5% per year. Unit costs were valued at 2015/16 prices. The main source of evidence used to inform the analyses of Oncotype DX, Prosigna, IHC4+C and EndoPredict was a bespoke analysis of TransATAC provided by the study investigators. This was limited to UK data on patients with hormone receptor-positive, HER2‑negative disease with 0 to 3 positive lymph nodes to match the scope for this assessment. Because this study did not include MammaPrint, MINDACT was used as the basis for evaluating the cost effectiveness of MammaPrint. PREDICT scores were not available in either data set, and so this tool could not be considered as a comparator or used to determine different risk subgroups. Therefore, the comparator for Oncotype DX, Prosigna, IHC4+C and EndoPredict was current practice (various tools and algorithms), and the comparator for MammaPrint was a modified version of Adjuvant! Online.
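The 3.5% annual discounting described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the convention that the first year is undiscounted is an assumption that the source does not state.

```python
def discount_factor(years, rate=0.035):
    """Present-value weight for a cost or QALY accrued `years` from now."""
    return 1.0 / (1.0 + rate) ** years


def discounted_total(values_per_year, rate=0.035):
    """Sum a yearly stream of costs or QALYs after discounting.

    Assumption: values_per_year[0] accrues in the first year and is not discounted.
    """
    return sum(v * discount_factor(t, rate) for t, v in enumerate(values_per_year))


# Hypothetical stream: 1 QALY per year over the 42-year horizon.
lifetime_qalys = discounted_total([1.0] * 42)
```

Because later years carry progressively less weight, the discounted total over 42 years is substantially lower than the undiscounted 42 QALYs.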

Model structure

4.45 The hybrid decision tree–Markov model was based on the model previously developed by Ward et al. (2013). The decision tree component of the model classified patients in the current practice group (no test) and the tumour profiling test group as high, intermediate and low risk. For EndoPredict and MammaPrint, the intermediate-risk category was excluded because these tests provide results in terms of high and low risk only. In both the test group and the current practice group, the decision tree determined the probability that a patient would be in 1 of 6 groups: low risk, chemotherapy; low risk, no chemotherapy; intermediate risk, chemotherapy; intermediate risk, no chemotherapy; high risk, chemotherapy; and high risk, no chemotherapy. For EndoPredict and MammaPrint, 4 groups were used because there was no intermediate-risk category. Each group was linked to a Markov model which predicted lifetime quality-adjusted life years (QALYs) and costs according to the patient's risk of distant recurrence and whether or not they had chemotherapy.

4.46 Each Markov node included 4 health states: distant recurrence‑free; distant recurrence; long-term adverse events (acute myeloid leukaemia [AML]); and dead. Patients entered the model in the distant recurrence-free health state. A health-related quality of life decrement was applied during the first model cycle to account for health losses associated with short-term adverse events for patients having adjuvant chemotherapy. The treatment effect for adjuvant chemotherapy was modelled using a relative risk reduction for distant recurrence within each risk classification group. The benefit of the test was therefore captured in the model by changing the probability that patients with each test risk classification had adjuvant chemotherapy.
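The 4-state structure described above can be sketched as a simple cohort simulation. This is not the EAG's model. It assumes 6-month cycles (suggested by the 6-month probabilities quoted in this section), constant transition probabilities and additive competing risks. The recurrence and other-cause death probabilities in the example are hypothetical, and only the 0.098, 0.00025 and 0.53 values come from the text.

```python
STATES = ("recurrence_free", "distant_recurrence", "aml", "dead")


def transition_matrix(p_recur, p_aml, p_die_other, p_die_recur, p_die_aml):
    """Per-cycle transition probabilities; each row sums to 1."""
    return [
        # From distant recurrence-free: stay, recur, develop AML, or die.
        [1.0 - p_recur - p_aml - p_die_other, p_recur, p_aml, p_die_other],
        # From distant recurrence: stay or die.
        [0.0, 1.0 - p_die_recur, 0.0, p_die_recur],
        # From AML: stay or die.
        [0.0, 0.0, 1.0 - p_die_aml, p_die_aml],
        # Dead is an absorbing state.
        [0.0, 0.0, 0.0, 1.0],
    ]


def run_cohort(matrix, n_cycles):
    """Return the state occupancy at each cycle, starting recurrence-free."""
    cohort = [1.0, 0.0, 0.0, 0.0]
    trace = [cohort]
    for _ in range(n_cycles):
        cohort = [sum(cohort[i] * matrix[i][j] for i in range(4)) for j in range(4)]
        trace.append(cohort)
    return trace


matrix = transition_matrix(
    p_recur=0.01,        # hypothetical
    p_aml=0.00025,       # 6-month AML probability quoted in this section
    p_die_other=0.005,   # hypothetical
    p_die_recur=0.098,   # 6-month death probability after distant recurrence
    p_die_aml=0.53,      # 6-month death probability after AML
)
trace = run_cohort(matrix, n_cycles=84)  # 84 six-month cycles = 42 years
```

In a fuller model each of the decision tree groups would have its own matrix, with the recurrence probability set by risk category and chemotherapy use.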

Model inputs

4.47 The risk classification probabilities used in the model for Oncotype DX, Prosigna, IHC4+C and EndoPredict were from the bespoke data analysis of TransATAC, which only included postmenopausal women. For MammaPrint, they were from MINDACT.

4.48 The probability of developing distant metastases in each group and risk category was based on 10-year recurrence-free interval data from the bespoke data analysis of TransATAC for Oncotype DX, Prosigna, IHC4+C and EndoPredict. For MammaPrint the probability of developing distant metastases was based on an adjusted analysis of 5-year distant metastasis-free survival data from MINDACT. The model assumed that the risk of distant metastases between 10 and 15 years was halved, and after 15 years was 0.
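The time-dependent risk assumption above can be illustrated with a short sketch. It assumes that a 10-year risk is converted to a constant annual rate, that the halving between 10 and 15 years is applied to that rate, and that cycles last 6 months. None of these details are confirmed by the source, and the function name is hypothetical.

```python
import math


def cycle_probability(ten_year_risk, years_since_start, cycle_years=0.5):
    """Per-cycle probability of distant metastasis at a given time in the model."""
    # Assumption: constant rate over the first 10 years.
    annual_rate = -math.log(1.0 - ten_year_risk) / 10.0
    if years_since_start < 10:
        multiplier = 1.0
    elif years_since_start < 15:
        multiplier = 0.5   # risk halved between 10 and 15 years
    else:
        multiplier = 0.0   # no risk after 15 years
    return 1.0 - math.exp(-annual_rate * multiplier * cycle_years)


# Hypothetical 10-year risk of 20%.
early = cycle_probability(0.20, years_since_start=5)
middle = cycle_probability(0.20, years_since_start=12)
late = cycle_probability(0.20, years_since_start=20)
```

Compounding the early per-cycle probability over 20 six-month cycles recovers the 10-year risk that was supplied.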

4.49 The probability of having chemotherapy in the current practice group and in the tumour profiling test groups was taken from the sources in table 1.

Table 1 Source for post-test probability of having chemotherapy

Current practice group

| Population | Source | Proportion of patients having chemotherapy |
|---|---|---|
| LN-negative, NPI≤3.4 | NCRAS data set | 0.07 |
| LN-negative, NPI>3.4 | Genomic Health access scheme data set | 0.43 |
| LN-positive (1 to 3 nodes) | NCRAS data set | 0.63 |
| Overall population (MammaPrint) | Expert opinion | 0.47 |

Abbreviations: LN, lymph node; NCRAS, National Cancer Registration and Analysis Service; NPI, Nottingham Prognostic Index; UKBCG, UK breast cancer group.

The Genomic Health access scheme data set is based on the access scheme operated by NHS England and is a result of the research recommendation from NICE's original diagnostics guidance 10.

For the proportion of patients having chemotherapy, the low, intermediate and high risks are combined.

3-level tests (Oncotype DX, Prosigna and IHC4+C)

| Population | Source | Proportion of patients having chemotherapy (low risk) | Proportion of patients having chemotherapy (intermediate risk) | Proportion of patients having chemotherapy (high risk) |
|---|---|---|---|---|
| LN-negative, NPI≤3.4 | UKBCG survey data | 0.00 | 0.20 | 0.77 |
| LN-negative, NPI>3.4 | Genomic Health access scheme data set | 0.01 | 0.33 | 0.89 |
| LN-positive (1 to 3 nodes) | Loncaster et al. (2017) node-positive estimates | 0.08 | 0.63 | 0.83 |

Abbreviations: LN, lymph node; NPI, Nottingham Prognostic Index; UKBCG, UK breast cancer group.

2-level tests (EndoPredict and MammaPrint)

| Population | Source | Proportion of patients having chemotherapy (low risk) | Proportion of patients having chemotherapy (intermediate risk) | Proportion of patients having chemotherapy (high risk) |
|---|---|---|---|---|
| EndoPredict: all 3 subgroups | Bloomfield et al. (2017) study | 0.07 | Not applicable | 0.77 |
| MammaPrint: all subgroups | Bloomfield et al. (2017) study | 0.07 | Not applicable | 0.77 |

4.50 In the base-case analysis, the relative treatment effect for chemotherapy was assumed to be the same across all test risk groups, that is, all tests were assumed to be associated with prognostic benefit only. For Oncotype DX, Prosigna, IHC4+C and EndoPredict a 10-year relative risk of distant recurrence was estimated as 0.76 for chemotherapy compared with no chemotherapy (Early breast cancer trialists' collaborative group 2012), and was assumed to apply to the groups with LN-negative and LN-positive disease. For MammaPrint the 10-year relative risk of distant recurrence was estimated to be 0.77 (MINDACT) for chemotherapy compared with no chemotherapy.
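The base-case assumption can be illustrated by combining the relative risk of 0.76 with the probability of having chemotherapy. This is a simplified sketch rather than the EAG's calculation. The baseline 10-year risk of 0.20 is hypothetical, and the function name is an assumption. The chemotherapy probabilities of 0.43 and 0.89 are the table 1 values for LN-negative disease with NPI>3.4 (current practice and high-risk test result respectively).

```python
def expected_recurrence_risk(baseline_risk, p_chemo, relative_risk=0.76):
    """Average 10-year distant recurrence risk for a group of patients.

    baseline_risk: risk without chemotherapy.
    p_chemo: proportion of the group having chemotherapy.
    relative_risk: chemotherapy compared with no chemotherapy.
    """
    risk_with_chemo = baseline_risk * relative_risk
    return p_chemo * risk_with_chemo + (1.0 - p_chemo) * baseline_risk


# Hypothetical high-risk group with a 20% risk of recurrence without chemotherapy.
risk_current_practice = expected_recurrence_risk(0.20, p_chemo=0.43)
risk_with_test = expected_recurrence_risk(0.20, p_chemo=0.89)
```

Because the relative risk is the same in every risk group, any gain attributed to a test in this sketch comes only from shifting chemotherapy towards patients with a higher baseline risk.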

4.51 In sensitivity analyses the effect of assuming that Oncotype DX could predict relative treatment effects for chemotherapy was explored, based on the B20 study by Paik et al. (2006) and the SWOG-8814 study by Albain et al. (2010). For the group with LN‑negative disease, the 10-year relative risks of distant recurrence with chemotherapy compared with no chemotherapy were 1.31, 0.61 and 0.26 for the low, intermediate and high-risk categories respectively. For the group with LN-positive disease, the 10-year relative risks of relapse with chemotherapy compared with no chemotherapy were 1.02, 0.72 and 0.59 respectively. It is possible that the no-chemotherapy arm of B20 may have overestimated the difference in response rates between low and high-risk patients, because this arm was the derivation set for Oncotype DX. Therefore, additional sensitivity analyses in the group with LN‑negative disease explored the impact of varying the relative chemotherapy treatment effect between risk groups on the incremental cost-effectiveness ratios (ICERs). Hazard ratios were based on naive indirect comparisons of the chemotherapy arms from the B20 study and the no-chemotherapy arms from the B14 study (estimated hazard ratios for treatment effects with chemotherapy compared with no chemotherapy were 0.64, 0.75 and 0.35 for the low, intermediate and high-risk categories respectively), and the chemotherapy arms of the B20 study and the no-chemotherapy arms of the TransATAC study (hazard ratios for treatment effects with chemotherapy compared with no chemotherapy were 0.86, 0.88 and 0.49 for the low, intermediate and high-risk categories respectively).

4.52 Survival following distant recurrence was based on a median of 40.1 months from Thomas et al. (2009). From this, the 6-month probability of death following distant recurrence was estimated to be 0.098, assuming a constant rate. The rate of death following distant metastases was assumed to be the same across the different subgroups and across each test risk group.
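The conversion from a median survival time to a 6-month probability can be sketched as follows, assuming exponential survival (a constant rate), as stated above. The function name is introduced here for illustration. The calculation gives roughly 0.0985, close to the reported 0.098; the small difference may reflect rounding in the source.

```python
import math


def prob_from_median(median_months: float, cycle_months: float = 6.0) -> float:
    """Probability of the event within one cycle, given median time to event.

    Assumes a constant hazard, so rate = ln(2) / median.
    """
    rate_per_month = math.log(2) / median_months
    return 1.0 - math.exp(-rate_per_month * cycle_months)


# Median survival after distant recurrence: 40.1 months (Thomas et al. 2009).
p_death_6_months = prob_from_median(40.1)
print(round(p_death_6_months, 4))
```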

4.53 The model assumed that 10.5% of patients entering the distant recurrence health state had previously had local recurrence, based on de Bock et al. (2009). The 6-month probability of developing AML was estimated to be 0.00025, based on Wolff et al. (2015). Survival following the onset of AML was estimated to be approximately 8 months; assuming a constant event rate gave a 6‑month probability of death following AML of 0.53. Additional sensitivity analyses explored the effect of including congestive heart failure (average net lifetime QALY loss of 0.0385 and average net lifetime cost saving of £2 from Hall et al. 2017, using an excess congestive heart failure risk relative to that of the general population), permanent hair loss (disutility of 0.04495 from Nafees et al. 2008 applied to 15% of all patients having chemotherapy) and peripheral neuropathy (disutility of 0.02 from Shiroiwa et al. 2009 applied to 12% of all patients having chemotherapy) in the model.
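The 6-month probability of death following AML can be reproduced under one reading of the text. This is a sketch that assumes "approximately 8 months" is a mean survival time (the text does not say whether it is a mean or a median); the symbols $\lambda$ (constant monthly death rate) and $p_6$ (6-month probability of death) are notation introduced here.

```latex
\lambda = \frac{1}{8}\ \text{per month}, \qquad
p_{6} = 1 - e^{-6\lambda} = 1 - e^{-0.75} \approx 0.53
```

If 8 months were instead a median, the same formula with $\lambda = \ln 2 / 8$ would give about 0.41, so the mean reading is the one that matches the reported value.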

Costs

4.54 The costs of the tumour profiling tests were based on company prices (see table 2).

Table 2 Test prices

| Test | List price | Comments |
| --- | --- | --- |
| Oncotype DX | £2,580 | Tests carried out in Genomic Health laboratory in US. Cost includes sample handling and customer service. A commercial-in-confidence discounted test cost was used in the model. |
| Prosigna | £1,970 | Based on doing the test in an NHS laboratory. Includes the laboratory costs (£240), the Prosigna kit (£1,650) and the nCounter system (£194,600), and assumes 2,500 samples per lifetime of the nCounter system. Commercial-in-confidence discounted test costs were used in scenario analyses to account for the access proposal. |
| EndoPredict | £1,500 | Tests carried out in Myriad's laboratory in Munich. Commercial-in-confidence discounted test costs were used in scenario analyses to account for the access proposal. |
| IHC4 | £203 | The cost was based on 2014 prices. The total cost of the test (£198) was uplifted to current prices using the hospital and community health services indices. |
| MammaPrint | £2,326 | Converted from Euros to UK pounds sterling, assuming an exchange rate of 1 British pound to 1.15 Euros. |
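The Prosigna list price can be checked from the components given in table 2. The sketch below assumes the nCounter system cost is spread evenly over the 2,500 samples; that amortisation rule is an assumption made here, not something the text spells out. The result, about £1,968, is consistent with the listed £1,970 after rounding.

```python
# Components of the Prosigna price as given in table 2 (pounds sterling).
LAB_COST = 240.0
KIT_COST = 1650.0
NCOUNTER_SYSTEM_COST = 194600.0
SAMPLES_PER_SYSTEM_LIFETIME = 2500


def prosigna_cost_per_test() -> float:
    """Per-test cost, assuming even amortisation of the nCounter system."""
    equipment_per_test = NCOUNTER_SYSTEM_COST / SAMPLES_PER_SYSTEM_LIFETIME
    return LAB_COST + KIT_COST + equipment_per_test


print(round(prosigna_cost_per_test(), 2))
```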

4.55 The costs associated with adjuvant chemotherapy were from a previous costing analysis of the OPTIMA Prelim trial (Hall et al. 2017). The weighted mean cost of adjuvant chemotherapy acquisition, delivery and toxicity was estimated to be £3,145 per course.

4.56 All surviving patients had endocrine therapy for a period of between 5 and 8 years. Costs of endocrine therapy were taken from the British national formulary (2017). In addition, 30% of women with early breast cancer had 4 mg of bisphosphonates (zoledronic acid) by intravenous infusion every 6 months for up to 3 years, at a cost of £58.50, excluding administration.

4.57 All patients had 2 routine follow-up visits during the first year after surgery, with annual visits thereafter for 5 years. Patients were also assumed to have a routine annual mammogram for up to 5 years. The cost of a routine follow-up visit was estimated to be £162.84, and the cost of a mammogram was estimated to be £46.37.

4.58 Costs associated with treating local recurrence were taken from Karnon et al. (2007) and uplifted to current prices (£13,913). This was applied as a once-only cost on transition to the distant recurrence state. Costs associated with treating distant metastases were derived from Thomas et al. (2009), and included visits, drugs, pharmacy, hospital admission and intervention, imaging, radiotherapy, pathology and transport. Cost components specifically associated with terminal care were excluded. The 6-monthly cost of treating metastatic breast cancer was estimated to be £4,541.

Health-related quality of life

4.59 Health utilities were taken from published studies (see table 3).

Table 3 Health utilities applied in the base case

| Health state / event | Duration applied in model | Mean | Standard error | Source |
| --- | --- | --- | --- | --- |
| Recurrence-free | Indefinite | 0.824 | 0.002 | Lidgren et al. 2007 |
| Disutility distant metastases | Indefinite | 0.14 | 0.11 | Calculated from Lidgren et al. 2007 |
| Local recurrence | Once-only QALY loss applied on transition to distant recurrence state | −0.108 | 0.04 (assumed) | Campbell et al. 2011 |
| Chemotherapy AEs | Once-only QALY loss applied in first cycle | −0.038 | 0.004 | Campbell et al. 2011 |
| AML | Indefinite | 0.26 | 0.04 (assumed) | Younis et al. 2008 |

Abbreviations: AEs, adverse events; AML, acute myeloid leukaemia; QALY, quality-adjusted life year.

Base-case results

4.60 The following key assumptions were applied in the base-case analysis:

  • Clinicians interpreted each of the 3-level tests in the same way (for example, an Oncotype DX high-risk Recurrence Score result would lead to the same chemotherapy decision as a Prosigna high-risk score).

  • Clinicians interpreted each of the 2-level tests in the same way (for example, a MammaPrint high-risk score would lead to the same chemotherapy decision as an EndoPredict high-risk score).

  • The treatment effect for adjuvant chemotherapy was the same across all risk score categories for all tests.

  • The prognosis of patients with AML and the costs and QALYs accrued within the AML state were independent of whether they had previously developed distant metastases.

  • A disutility associated with adjuvant chemotherapy was applied once, during the first model cycle only (while the patient was having the regimen).

  • Costs associated with endocrine therapy, bisphosphonates, follow-up appointments and mammograms were assumed to differ according to time since model entry.

  • The model assumed that people entered at an age of around 60 years.

4.61 In the subgroup with LN-negative disease and a NPI of 3.4 or less, compared with current practice, the probabilistic model gave ICERs of:

  • £147,419 per QALY gained (EndoPredict)

  • £122,725 per QALY gained (Oncotype DX)

  • £91,028 per QALY gained (Prosigna)

  • £2,654 per QALY gained (IHC4+C).

4.62 In the subgroup with LN-negative disease and a NPI of more than 3.4, compared with current practice, the probabilistic model gave ICERs of:

  • £46,788 per QALY gained (EndoPredict)

  • £26,058 per QALY gained (Prosigna)

  • Oncotype DX was dominated by current practice (that is, it was more expensive and less effective)

  • IHC4+C was dominant over current practice (that is, it was less expensive and more effective).
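The terms "dominant", "dominated" and "ICER" used in these results follow from the signs of the incremental cost and incremental QALYs compared with current practice. The function below is a minimal sketch of that classification; its name and the example increments are hypothetical and are not outputs of the model.

```python
from typing import Optional, Tuple


def classify(delta_cost: float, delta_qaly: float) -> Tuple[str, Optional[float]]:
    """Classify a test against current practice.

    delta_cost and delta_qaly are increments (test minus current practice).
    Returns a label and, where one is defined, the ICER in pounds per QALY.
    """
    if delta_cost == 0 and delta_qaly == 0:
        return ("no difference", None)
    if delta_cost <= 0 and delta_qaly >= 0:
        return ("dominant", None)       # less expensive and more effective
    if delta_cost >= 0 and delta_qaly <= 0:
        return ("dominated", None)      # more expensive and less effective
    icer = delta_cost / delta_qaly
    if delta_qaly > 0:
        return ("per QALY gained", icer)
    return ("saved per QALY lost", icer)  # cheaper but less effective


# HYPOTHETICAL increments, for illustration only.
print(classify(500.0, 0.02))
print(classify(-300.0, 0.01))
print(classify(400.0, -0.01))
```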

4.63 In the population with LN-positive disease, compared with current practice, the probabilistic model gave ICERs of:

  • £28,731 per QALY gained (Prosigna)

  • £21,458 per QALY gained (EndoPredict)

  • Oncotype DX was dominated by current practice

  • IHC4+C was dominant over current practice.

4.64 In the overall MINDACT population, MammaPrint compared with modified Adjuvant! Online had an ICER of £131,482 per QALY gained. In the modified Adjuvant! Online high-risk subgroup, MammaPrint was dominated by current practice, and in the modified Adjuvant! Online low-risk subgroup, MammaPrint compared with current practice had an ICER of £414,202 per QALY gained.

4.65 The risk classification probabilities and the probability of having chemotherapy were combined in the model to estimate chemotherapy use with and without tumour profiling. The modelled chemotherapy use in the base case is shown in tables 4A to 4E.

Table 4A Modelled chemotherapy use with and without tumour profiling: Oncotype DX

| Test, subgroup compared with current practice | Chemotherapy use with tumour profiling | Chemotherapy use with no tumour profiling | Net change |
| --- | --- | --- | --- |
| LN0 NPI≤3.4 | 0.076 | 0.072 | 0.004 |
| LN0 NPI>3.4 | 0.273 | 0.430 | −0.157 |
| LN+ (1–3 nodes) | 0.337 | 0.627 | −0.290 |

Abbreviations: LN0, lymph node negative; LN+, lymph node positive; mAOL, modified Adjuvant! Online; NPI, Nottingham Prognostic Index.

Table 4B Modelled chemotherapy use with and without tumour profiling: IHC4+C

| Test, subgroup compared with current practice | Chemotherapy use with tumour profiling | Chemotherapy use with no tumour profiling | Net change |
| --- | --- | --- | --- |
| LN0 NPI≤3.4 | 0.030 | 0.072 | −0.042 |
| LN0 NPI>3.4 | 0.355 | 0.430 | −0.075 |
| LN+ (1–3 nodes) | 0.554 | 0.627 | −0.073 |

Table 4C Modelled chemotherapy use with and without tumour profiling: Prosigna

| Test, subgroup compared with current practice | Chemotherapy use with tumour profiling | Chemotherapy use with no tumour profiling | Net change |
| --- | --- | --- | --- |
| LN0 NPI≤3.4 | 0.075 | 0.072 | 0.003 |
| LN0 NPI>3.4 | 0.435 | 0.430 | 0.005 |
| LN+ (1–3 nodes) | 0.709 | 0.627 | 0.082 |

Table 4D Modelled chemotherapy use with and without tumour profiling: EndoPredict

| Test, subgroup compared with current practice | Chemotherapy use with tumour profiling | Chemotherapy use with no tumour profiling | Net change |
| --- | --- | --- | --- |
| LN0 NPI≤3.4 | 0.140 | 0.072 | 0.068 |
| LN0 NPI>3.4 | 0.438 | 0.430 | 0.008 |
| LN+ (1–3 nodes) | 0.603 | 0.627 | −0.024 |

Table 4E Modelled chemotherapy use with and without tumour profiling: MammaPrint

| Test, subgroup compared with current practice | Chemotherapy use with tumour profiling | Chemotherapy use with no tumour profiling | Net change |
| --- | --- | --- | --- |
| MINDACT overall population | 0.319 | 0.466 | −0.148 |
| mAOL high risk | 0.445 | 0.772 | −0.327 |
| mAOL low risk | 0.191 | 0.159 | 0.033 |
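The way risk classification and chemotherapy probabilities combine into the chemotherapy use figures, and how the net change column is obtained, can be sketched as below. The risk-group and chemotherapy probabilities are hypothetical illustration values, not model inputs; only the Oncotype DX values in `table_4a` come from the text. The exact weighting used in the model is not described here, so the weighted sum is an assumption.

```python
def chemo_use(risk_group_probs: dict, chemo_prob_by_group: dict) -> float:
    """Expected proportion having chemotherapy.

    Assumed form: sum over risk groups of P(group) * P(chemotherapy | group).
    """
    return sum(risk_group_probs[g] * chemo_prob_by_group[g] for g in risk_group_probs)


# HYPOTHETICAL inputs for a 3-level test (illustration only).
hypothetical_groups = {"low": 0.5, "intermediate": 0.3, "high": 0.2}
hypothetical_chemo = {"low": 0.0, "intermediate": 0.5, "high": 1.0}
with_test = chemo_use(hypothetical_groups, hypothetical_chemo)

# Chemotherapy use (with profiling, without profiling) for Oncotype DX, table 4A.
table_4a = {
    "LN0 NPI<=3.4": (0.076, 0.072),
    "LN0 NPI>3.4": (0.273, 0.430),
    "LN+ (1-3 nodes)": (0.337, 0.627),
}

# Net change = use with profiling minus use without profiling.
net_change = {k: round(w - wo, 3) for k, (w, wo) in table_4a.items()}
print(round(with_test, 3), net_change)
```

A negative net change means fewer patients are modelled to have chemotherapy when the test is used.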

Probabilistic sensitivity analyses

4.66 The cost-effectiveness planes from the probabilistic sensitivity analyses showed considerable uncertainty in the cost-effectiveness estimates.

4.67 In the subgroup with LN-negative disease and a NPI of 3.4 or less, the only test with a non-zero probability of producing more net benefit than current practice at maximum acceptable ICERs of £20,000 and £30,000 per QALY gained was IHC4+C.

4.68 In the subgroup with LN-negative disease and a NPI of more than 3.4, at a maximum acceptable ICER of £20,000 per QALY gained, IHC4+C had a probability of 0.69 of being cost effective compared with current practice. For EndoPredict, Oncotype DX and Prosigna, the probability that the test was cost effective compared with current practice at this threshold was 0.24 or less. In the same subgroup, at a maximum acceptable ICER of £30,000 per QALY gained, IHC4+C had a probability of 0.67 and Prosigna had a probability of 0.60 of being cost effective compared with current practice. Oncotype DX had a probability of 0.04 and EndoPredict had a probability of 0.26 of being cost effective compared with current practice.

4.69 In the subgroup with LN-positive disease, IHC4+C had probabilities of 0.95 and 0.94 of being cost effective compared with current practice at maximum acceptable ICERs of £20,000 and £30,000 per QALY gained respectively. In the same subgroup, the probabilities of EndoPredict producing more net benefit than current practice were 0.44 and 0.73, at maximum acceptable ICERs of £20,000 and £30,000 per QALY gained respectively. For Prosigna the probabilities were 0.24 and 0.55. In this subgroup Oncotype DX had very low probabilities of producing more net benefit than current practice at the same maximum acceptable ICERs (0.01 or lower).

4.70 In the overall MINDACT population and in the subgroups, the probability that MammaPrint would be cost effective compared with current practice at maximum acceptable ICERs of £20,000 and £30,000 per QALY gained was approximately 0.
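The probabilities reported in this section are conventionally computed as the share of probabilistic model draws in which a test has positive incremental net monetary benefit at the stated maximum acceptable ICER. The sketch below shows that calculation under this assumption; the draws are hypothetical hand-picked values, not outputs of the EAG's model, and the function name is introduced here.

```python
def prob_cost_effective(draws, threshold: float) -> float:
    """Share of draws with positive incremental net monetary benefit.

    draws: (incremental_cost, incremental_qaly) pairs versus current practice.
    threshold: maximum acceptable ICER, in pounds per QALY gained.
    """
    draws = list(draws)
    positive = sum(
        1 for d_cost, d_qaly in draws if threshold * d_qaly - d_cost > 0
    )
    return positive / len(draws)


# HYPOTHETICAL probabilistic draws (illustration only).
hypothetical_draws = [
    (1000.0, 0.06),
    (1100.0, 0.04),
    (-200.0, 0.01),
    (800.0, -0.02),
]

for threshold in (20_000, 30_000):
    print(threshold, prob_cost_effective(hypothetical_draws, threshold))
```

In this toy example the second draw is cost effective only at the higher threshold, which is how a test's probability can rise between £20,000 and £30,000 per QALY gained.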

Deterministic sensitivity analyses

4.71 The EAG did deterministic sensitivity analyses, testing a wide range of plausible values of key parameters.

4.72 Deterministic sensitivity analysis results for Oncotype DX compared with current practice were:

  • Subgroup with LN-negative disease and a NPI of 3.4 or less: ICERs remained over £34,000 per QALY gained across all analyses.

  • Subgroup with LN-negative disease and a NPI of more than 3.4: Oncotype DX was either dominated or had an ICER of more than £35,000 per QALY gained across almost all analyses. The only exception was when Oncotype DX was assumed to predict relative treatment effects for chemotherapy. In this analysis, Oncotype DX dominated current practice.

  • Population with LN-positive disease: Oncotype DX remained dominated across most analyses. The exceptions were when Oncotype DX was assumed to predict relative treatment effects for chemotherapy (it was dominant), and when the cost of chemotherapy was doubled (£3,700 saved per QALY lost).

4.73 Deterministic sensitivity analysis results for IHC4+C compared with current practice were:

  • Subgroup with LN-negative disease and a NPI of 3.4 or less: ICERs remained below £16,000 per QALY gained across all analyses, except when post-test chemotherapy probabilities were derived from Holt et al. (2011; £36,259 per QALY gained). Also, IHC4+C dominated current practice when the cost of chemotherapy was doubled.

  • Subgroup with LN-negative disease and a NPI of more than 3.4: IHC4+C dominated current practice or had an ICER below £6,000 per QALY gained across all scenarios.

  • Population with LN-positive disease: IHC4+C dominated current practice across all but 1 scenario. When the probability of having chemotherapy was based on the UK breast cancer group (UKBCG) survey the ICER was £1,929 per QALY gained.

4.74 Deterministic sensitivity analysis results for Prosigna compared with current practice were:

  • Subgroup with LN-negative disease and a NPI of 3.4 or less: ICERs were greater than £71,000 per QALY gained across all analyses.

  • Subgroup with LN-negative disease and a NPI of more than 3.4: ICERs were below £34,000 per QALY gained across all analyses.

  • Population with LN-positive disease: ICERs were below £38,000 per QALY gained across all analyses.

4.75 Deterministic sensitivity analysis results for EndoPredict compared with current practice were:

  • Subgroup with LN-negative disease and a NPI of 3.4 or less: ICERs remained greater than £91,000 per QALY gained across all analyses.

  • Subgroup with LN-negative disease and a NPI of more than 3.4: ICERs remained greater than £30,000 per QALY gained across all but 2 of the analyses. Exceptions were when the UKBCG survey was used to inform the probability of having chemotherapy (£25,250 per QALY gained), and when Cusumano et al. (2014) was used to inform the probability of having chemotherapy based on the EndoPredict test result (£26,689 per QALY gained).

  • Population with LN-positive disease: ICERs remained below £30,000 per QALY gained across all scenarios.

4.76 Deterministic sensitivity analysis results for MammaPrint compared with current practice were:

  • Overall MINDACT population: ICERs were estimated to be greater than £76,000 per QALY gained across all scenarios.

  • Modified Adjuvant! Online high-risk subgroup: MammaPrint was dominated by current practice across almost all scenarios.

  • Modified Adjuvant! Online low-risk subgroup: ICERs were greater than £161,000 per QALY gained across all analyses.

4.77 After consultation, the EAG did more deterministic sensitivity analyses varying the estimated relative risk of distant recurrence associated with chemotherapy, which was assumed to be 0.76 in the base case. Results showed that as the relative risk moved from 0.6 to 0.9, the tests became less cost effective.
