Guidance
4 Evidence
The diagnostics advisory committee (section 7) considered evidence on EndoPredict (EP score and EPclin score), MammaPrint, Oncotype DX (with and without the recurrence score-pathology-clinical [RSPC] calculator), Prosigna and IHC4 or IHC4+C from several sources. Full details of all the evidence are in the committee papers.
Clinical effectiveness
4.1 Evidence on the following outcomes was of interest in the clinical effectiveness review:

Prognostic ability – the degree to which the test can accurately predict the risk of an outcome such as disease recurrence.

Prediction of relative treatment effect – the ability of the test to predict which patients have disease that will respond to chemotherapy. It can be assessed by considering whether the relative treatment effect of chemotherapy or no chemotherapy on patient outcomes differs according to the test score.

Clinical utility – the ability of the prospective use of the test to affect patient outcomes such as recurrence and survival compared with current practice.

Decision impact – how the test influences decision making in terms of which patients will be offered chemotherapy.
4.2 A total of 153 references were included in the review. Studies assessing prognostic ability and prediction of relative treatment effect were quality assessed using relevant criteria from the draft prediction model study risk of bias assessment tool (PROBAST). Clinical utility studies were quality assessed using the Cochrane risk of bias tool for randomised controlled trials (RCTs).
Prognostic ability
4.3 Studies providing information on prognostic ability were retrospective analyses of RCT data or routinely collected data. Most of the studies excluded patients who did not have a large enough tissue sample for testing, which leaves the evidence base at potential risk of spectrum bias, because patients with smaller tumours (who may be systematically different to those with large tumours) are likely to be underrepresented. In many studies patients had chemotherapy, which could affect event rates and therefore potentially reduce the apparent prognostic performance of a test. In other studies, patients who had chemotherapy were excluded from analyses, which may also lead to spectrum bias. Therefore studies in which all patients had endocrine monotherapy were preferable.
4.4 Results for prognostic ability were generally presented as unadjusted or adjusted analyses. Unadjusted analyses look at differences in the event rates among low, intermediate and high‑risk groups without adjusting for clinical and pathological variables. Adjusted analyses show whether the test has prognostic value over clinical and pathological variables.
Distribution of patients across risk categories
4.5 Among studies of patients with lymph node (LN)-negative disease who had endocrine monotherapy, in each group around 70% to 80% had disease that was categorised as low or low/intermediate risk across all tests (11 studies). Most MammaPrint studies had mixed endocrine and chemotherapy use and mixed hormone receptor status, with or without mixed human epidermal growth factor receptor 2 (HER2) status, so results may not be comparable with results from other tests. In these studies 20% to 61% of patients had disease that was categorised as low risk (6 studies). Most IHC4 or IHC4+C studies used quartiles or tertiles to define risk groups, so these studies do not provide useful information on the distribution of patients across risk categories.
4.6 The proportion of patients with low and intermediate risk was generally much lower in groups with LN-positive disease than in groups with LN-negative disease who had endocrine monotherapy (7 LN-positive studies). For Oncotype DX, however, the proportion of patients with low and intermediate risk was only slightly lower in the LN-negative group than in the LN-positive group. Studies of MammaPrint in patients with LN-positive disease were all done in groups with mixed hormone receptor status and mixed or unknown HER2 status, so results may not be comparable with results from other tests. In these studies 38% to 41% of patients had disease that was categorised as low risk (2 studies).
Oncotype DX
4.7 There were 11 data sets that provided information on the prognostic ability of Oncotype DX: 7 reanalyses of RCT data and 4 retrospective studies of routinely collected data. All studies were validation studies, and in 4 RCTs patients had endocrine monotherapy. Three of the studies were done in East Asia and it is uncertain whether the commercial version of Oncotype DX was used in these studies. Also, they may not be generalisable to England because usual clinical practice may differ between countries enough to affect prognostic outcomes. In addition, it is possible that people of different ethnicities have different underlying risk profiles and natural history of disease.
4.8 Unadjusted analyses indicated that Oncotype DX had prognostic accuracy (there were statistically significant differences between low-risk and high-risk groups) across various recurrence outcomes, regardless of lymph node status. However, hazard ratios between the intermediate-risk group and the high-risk or low-risk groups were not always statistically significant, particularly in the group with LN-positive disease.
4.9 In adjusted analyses, Oncotype DX provided statistically significant additional prognostic information over most commonly used clinical and pathological variables (age, grade, size, nodal status), regardless of lymph node status. A bespoke analysis of TransATAC study data also showed that Oncotype DX provided additional prognostic information over clinical and pathological tools to assess risk.
MammaPrint
4.10 There were 10 data sets that provided information on the prognostic ability of MammaPrint: 1 reanalysis of RCT data and 9 retrospective studies of routinely collected data. In addition, a further 4 studies pooled data on specific patients from the same 10 data sets. All studies were validation studies, and in 2 studies patients had endocrine monotherapy. Most studies included some patients who were out of scope (with HER2-positive or hormone receptor-negative disease or both).
4.11 In 6 of 7 unadjusted analyses, MammaPrint had prognostic accuracy (there were statistically significant differences between low-risk and high-risk groups) for 10-year distant recurrence-free survival or interval, regardless of LN status.
4.12 In adjusted analyses, a pooled analysis of patients with LN-negative and LN-positive disease showed that MammaPrint had statistically significant prognostic accuracy for 10-year distant recurrence-free survival after adjusting for clinical and pathological variables. In patients with LN-negative disease, MammaPrint had statistically significant prognostic accuracy for 10-year distant recurrence-free interval when adjusted for Adjuvant! Online, Nottingham Prognostic Index (NPI) or clinical and pathological variables. In patients with LN-positive disease, MammaPrint had borderline statistically significant prognostic accuracy for 10-year distant metastasis-free survival when adjusted for clinical and pathological variables.
Prosigna
4.13 There were 8 data sets that provided information on the prognostic ability of Prosigna: 6 reanalyses of RCT data and 3 retrospective analyses of 2 prospective cohort studies. All studies were validation studies, and in 5 studies patients had endocrine monotherapy. Some studies included some patients who were out of scope (with HER2-positive or hormone receptor-negative disease or both).
4.14 Prosigna had statistically significant prognostic accuracy for 10-year distant recurrence-free survival and interval in all unadjusted analyses of patients with LN-negative and LN-positive disease.
4.15 In analyses adjusted for clinical and pathological variables or tools, Prosigna had prognostic accuracy for 10-year distant metastasis-free survival and distant recurrence-free survival. In patients with LN-negative disease the results were statistically significant. In patients with LN-positive disease the results were statistically significant or borderline significant.
EndoPredict
4.16 There were 3 data sets that provided information on the prognostic ability of EndoPredict; all were reanalyses of RCT data. All studies were validation studies, and in 2 of the 3 studies patients had endocrine monotherapy.
4.17 In unadjusted analyses, EndoPredict (EPclin) had statistically significant prognostic accuracy for 10-year distant recurrence-free survival and interval in patients with LN-negative and LN-positive disease.
4.18 Adjusted analyses of TransATAC data show that EndoPredict (EPclin) had statistically significant increases in likelihood ratio for 10-year distant recurrence-free interval over clinical and pathological variables or tools, regardless of LN status.
IHC4 and IHC4+C
4.19 There were 12 data sets that provided information on the prognostic ability of IHC4 and IHC4+C: 6 reanalyses of RCT data and 6 reanalyses of routinely collected data. Most of the data related to the IHC4 score alone, without including clinical factors. One of the studies was based on the derivation cohort for IHC4, and therefore may have overestimated prognostic ability. The remaining studies were validation studies. Patients had endocrine monotherapy in only 2 studies, 1 of which was the derivation cohort study.
4.20 In unadjusted analyses, IHC4 had statistically significantly better prognostic performance in groups with high risk than in groups with low risk (defined by quartiles or tertiles), regardless of lymph node status. However, no studies reported survival or recurrence outcomes by risk group. Also, many used laboratory methods that differed from the derivation study methodology. In adjusted analyses, IHC4 had additional prognostic value over clinical and pathological factors in 3 studies, but patients had endocrine monotherapy in only 1 of these studies.
4.21 Data on IHC4+C came from the derivation cohort and 1 validation cohort. These studies showed that IHC4+C had prognostic value in unadjusted analyses. In adjusted analyses IHC4+C provided statistically significantly more information than the NPI in LN-negative, but not LN-positive, disease.
Prediction of relative treatment effect
4.22 In addition to estimating the risk of recurrence, the ability of Oncotype DX and MammaPrint to predict which patients have disease that will respond to chemotherapy was explored in 7 data sets. The external assessment group (EAG) reviewed evidence in support of this.
Oncotype DX
4.23 In 5 data sets (2 reanalyses of RCT data and 3 observational studies) reported across 11 published references and 1 confidential manuscript, analyses assessed the ability of Oncotype DX to predict relative treatment effects for chemotherapy.
4.24 The 2 reanalyses of RCTs suggest that Oncotype DX may predict differences in relative treatment effects for chemotherapy. Hazard ratios for disease-free survival for patients having chemotherapy compared with those having no chemotherapy suggested that the greatest relative treatment effect was for patients in the Oncotype DX high-risk category. Unadjusted interaction tests between Oncotype DX risk group and relative treatment effects were mainly statistically significant. Adjusted interaction tests were statistically significant in an analysis of patients with HER2-negative, LN-negative disease, but in patients with LN-positive disease the interaction test was not significant when hormone receptor status was adjusted for. However, the data for the population with LN-negative disease came from the derivation cohort for Oncotype DX and may overestimate predictive performance.
4.25 Results from the 3 observational studies were mixed and at high risk from confounding. One reported a statistically significant interaction test but this was only adjusted for a limited number of factors. Two others reported hazard ratios for chemotherapy compared with no chemotherapy: 1 study in patients with intermediate Recurrence Score results, and another in patients with high Recurrence Score results. Both of these studies reported results that were not statistically significant.
4.26 The RSPC algorithm incorporates Oncotype DX plus age, tumour size and grade. There was a non-significant interaction test result between relative chemotherapy treatment effects and RSPC risk group.
MammaPrint
4.27 Two studies reported the ability of MammaPrint to predict the relative treatment effects for chemotherapy. In a pooled analysis including patients with LN-negative and LN-positive disease, the effect of chemotherapy compared with no chemotherapy was statistically significant in the MammaPrint high-risk group but not in the low-risk group in unadjusted and adjusted analyses. Further, the interaction test for chemotherapy treatment and risk group was non-significant. In a pooled analysis of patients with LN-positive disease, there was a non-significant interaction between chemotherapy treatment and risk group.
Clinical utility
4.28 The EAG noted that the best evidence for clinical utility was an RCT of treatment guided by the test compared with treatment guided by the comparator. There were no clinical utility data available for EndoPredict, Prosigna or IHC4+C.
Oncotype DX
4.29 Five data sets, reported across 9 published references and 1 confidential manuscript, reported evidence on the clinical utility of Oncotype DX. These studies included the low-risk group from TAILORx. One further study did not meet the inclusion criteria (because of insufficient follow-up length), but presented subgroup data according to age, lymph node status and ethnicity, and was therefore discussed by the EAG. Studies generally reported different outcomes, making comparisons across studies difficult. All studies were judged to be of poor quality using the Cochrane risk of bias tool for RCTs.
4.30 In patients with LN-negative disease, using the test in clinical practice appeared to result in low rates of chemotherapy in patients with low risk (2% to 12%), with acceptable outcomes (distant recurrence-free survival, distant recurrence-free interval or invasive disease-free survival 96% to 99.6%). Rates of chemotherapy increased with increasing risk category, and were generally higher in patients with LN-positive disease. It was not possible to conclude whether patients in intermediate-risk and high-risk categories had better outcomes as a result of using Oncotype DX to guide treatment because there were no comparator groups (patients who had treatment without Oncotype DX testing).
4.31 One study (TAILORx; Sparano et al. 2018) reporting evidence on clinical utility was published after completion of the diagnostics assessment report. This study was a prospective, partially randomised study in which patients with an Oncotype DX Recurrence Score result of 0 to 10 had endocrine therapy, patients with Recurrence Score results of 26 and above had endocrine therapy plus chemotherapy, and those with Recurrence Score results of 11 to 25 were randomised to have either endocrine therapy alone, or endocrine therapy plus chemotherapy. The cut-offs in this study were different to the cut-offs recommended by the company (less than 18, 18 to 30 and greater than 30; see section 3.15). The 2018 publication focused on the results from patients in the intermediate-risk group who were randomised to treatment. It reported that across all patients with Recurrence Score results of 11 to 25, there were no clinically relevant or statistically significant differences between those who had endocrine therapy alone and those who had chemotherapy plus endocrine therapy. Results for the primary end point of 9-year invasive disease-free survival were 84.3% with chemotherapy and 83.3% without chemotherapy; an absolute difference of 1.0% (hazard ratio [HR] 1.08, 95% confidence interval [CI] 0.94 to 1.24, p=0.26). The upper limit of the confidence interval was within the prespecified non-inferiority margin (HR 1.322). Results for freedom from distant recurrence at 9 years were 95% with chemotherapy and 94.5% without chemotherapy; an absolute difference of 0.5% (HR 1.10, 95% CI 0.85 to 1.41, p=0.48). However, exploratory subgroup analyses suggested that chemotherapy may have an effect in some subgroups, such as those with Recurrence Score results of 21 to 25 and possibly Recurrence Score results of 16 to 20, particularly in people aged 50 or under. The EAG noted that no analysis was available for the subgroup of patients with Recurrence Score results of 11 to 25 and a modified Adjuvant! Online high-risk score.
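The non-inferiority reasoning in the TAILORx results can be written as a short check. This is an illustrative sketch only, not the trial's statistical analysis: the function name is hypothetical, and the inputs are the survival percentages, upper confidence limit and margin reported in the paragraph above.

```python
def is_noninferior(hr_ci_upper: float, margin: float) -> bool:
    """Non-inferiority is met when the upper confidence limit of the
    hazard ratio lies below the prespecified margin."""
    return hr_ci_upper < margin


# 9-year invasive disease-free survival (%), as reported for TAILORx
idfs_with_chemo = 84.3
idfs_without_chemo = 83.3

# Absolute difference in percentage points, rounded to 1 decimal place
absolute_difference = round(idfs_with_chemo - idfs_without_chemo, 1)

# Upper 95% CI limit of the hazard ratio (1.24) against the margin (1.322)
noninferior = is_noninferior(hr_ci_upper=1.24, margin=1.322)

print(absolute_difference, noninferior)
```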
MammaPrint
4.32 Two studies reported evidence relating to the clinical utility of MammaPrint. MINDACT was a prospective, partially randomised study in which clinical risk was determined using a modified version of Adjuvant! Online. Patients whose risk classifications from MammaPrint and modified Adjuvant! Online disagreed were randomised to chemotherapy or no chemotherapy. Of patients included in the study, 88% had hormone receptor-positive disease and 90% had HER2-negative disease, so some patients were outside of the scope for this assessment. For the group who were high risk with modified Adjuvant! Online and low risk with MammaPrint, 5-year distant metastasis-free survival was 95.9% with chemotherapy and 94.4% without chemotherapy, an absolute difference of 1.5% that was not statistically significant (adjusted hazard ratio for distant metastasis or death with chemotherapy compared with no chemotherapy, 0.78; 95% CI 0.50 to 1.21; p=0.27). For the group who were low risk with modified Adjuvant! Online and high risk with MammaPrint, 5-year distant metastasis-free survival was 95.8% with chemotherapy and 95.0% without chemotherapy, an absolute difference of 0.8% that was not statistically significant (adjusted hazard ratio for distant metastasis or death with chemotherapy compared with no chemotherapy, 1.17; 95% CI 0.59 to 2.28; p=0.66). The EAG judged MINDACT to be at low risk of bias in terms of randomisation, allocation concealment and reporting. However, no details of blinding were reported.
4.33 Results from the RASTER study suggested that distant recurrence-free interval rates were sufficiently low in the MammaPrint low-risk group for these patients to avoid chemotherapy. The 5-year distant recurrence-free interval rate for LN-negative disease was 97.0% for patients with low risk (15% had chemotherapy) and 91.7% for patients with high risk (81% had chemotherapy). In addition, MammaPrint provided additional prognostic information over Adjuvant! Online and the NPI, but not over the NHS PREDICT tool. The EAG judged RASTER to be at high risk of bias using the Cochrane risk of bias tool for RCTs.
Comparison of the tests with each other
4.34 There were 6 studies that compared more than 1 test: 4 reanalyses of RCTs and 2 observational studies. Evidence shows that generally when a test placed more patients in a low-risk category than another test, the event-free survival in the low-risk group was reduced. Also, the tests generally performed differently in patients with LN-negative and LN-positive disease.
4.35 Thirteen studies reported data from microarray analyses on more than 1 test; however, these studies had methodological limitations. The comparability of test algorithms applied to microarray data with the commercial assays was unknown, so the generalisability of findings from microarray studies to the decision problem was uncertain. All the studies reported data on Oncotype DX and MammaPrint, and 2 also reported data on EndoPredict. No studies reported data on Prosigna or IHC4+C. The microarray studies generally supported the conclusions from studies using the commercial versions of the assays in suggesting that Oncotype DX, MammaPrint and EndoPredict can discriminate between patients with high and low risk regardless of LN status. In terms of additional prognostic performance of the tests over clinical and pathological variables, EndoPredict appeared to have the greatest benefit, followed by Oncotype DX and then MammaPrint. However, because of the methodological limitations, the EAG judged that these studies did not provide conclusive evidence of the superiority of 1 test over others.
4.36 The OPTIMA Prelim study, a UK-based feasibility phase of an RCT, analysed concordance between different tests. The study included Oncotype DX, MammaPrint, Prosigna and IHC4 plus 2 other tests. Out of the 4 in-scope tests, MammaPrint assigned the most patients to the low-risk category, but unlike the other 3 tests it does not have an intermediate category. When the low and intermediate categories were treated as 1 category for the 3 tests that have 3 risk groups, Oncotype DX assigned the most patients to this category, and MammaPrint the least. Kappa statistics indicated modest agreement between tests, ranging from 0.33 to 0.53. Also, across 5 tests in the study, only 39% of tumours were uniformly classified as either low/intermediate risk or high risk by all 5 tests. Of these, 31% were classified as low/intermediate risk by all tests and 8% were high risk by all tests. The study authors concluded that although the tests assigned similar proportions of patients to low/intermediate-risk and high-risk categories, test results for an individual patient could differ markedly depending on which test was used.
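For readers unfamiliar with the kappa statistic quoted above, the following minimal sketch shows how Cohen's kappa is calculated from a cross-classification of 2 tests. The counts are hypothetical and chosen only for illustration. They are not OPTIMA Prelim data, and the function name is an assumption.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] is the number of tumours placed in category i by
    test A and category j by test B.
    """
    size = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion on the diagonal
    observed = sum(table[i][i] for i in range(size)) / total
    # Agreement expected by chance, from the marginal proportions
    row_props = [sum(table[i]) / total for i in range(size)]
    col_props = [sum(table[i][j] for i in range(size)) / total
                 for j in range(size)]
    expected = sum(r * c for r, c in zip(row_props, col_props))
    return (observed - expected) / (1 - expected)


# Hypothetical counts: rows are test A (low/intermediate, high),
# columns are test B (low/intermediate, high)
hypothetical_counts = [[60, 15],
                       [10, 15]]
kappa = cohen_kappa(hypothetical_counts)
print(round(kappa, 3))
```

With these made-up counts the 2 tests agree on 75% of tumours, yet kappa is much lower because a large share of that agreement would be expected by chance alone.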
Decision impact
4.37 The review of decision impact focused on studies done in the UK or the rest of Europe:

Oncotype DX: 6 UK studies and 12 other European studies

EndoPredict: 1 UK study and 3 other European studies

IHC4+C: 1 UK study and 0 other European studies

Prosigna: 0 UK studies and 3 other European studies

MammaPrint: 0 UK studies and 8 other European studies.
4.38 The percentage of patients with any change in treatment recommendation or decision (either to or from chemotherapy) in UK studies was 29% to 49% across 4 Oncotype DX studies, 37% in 1 EndoPredict study and 27% in 1 IHC4+C study. Ranges across European (non-UK) studies were 5% to 70% for Oncotype DX, 38% to 41% for EndoPredict, 14% to 41% for Prosigna and 13% to 51% for MammaPrint.
4.39 The net change in the percentage of patients with a chemotherapy recommendation or decision (pre-test to post-test) among UK studies was a reduction of 8% to 23% across 4 Oncotype DX studies, an increase of 1% in 1 EndoPredict study, and a reduction of between 2% and 26% in 1 IHC4+C study. Net changes across European (non-UK) studies were a reduction of 0% to 64% for Oncotype DX, a reduction of 13% to 26% for EndoPredict, a reduction of 2% to an increase of 9% for Prosigna, and a reduction of 31% to an increase of 8% for MammaPrint.
Anxiety and health-related quality of life
4.40 There were 6 studies that reported outcomes relating to anxiety (including worry and distress) and health-related quality of life. The lack of a comparator in the studies made it difficult to tell whether changes in anxiety experienced with the use of tumour profiling tests would also have occurred if patients received a definitive decision based on clinical risk factors alone. Overall, evidence suggests that tumour profile testing may reduce anxiety in some patients in some contexts, but generally there was little effect on health-related quality of life.
Cost effectiveness
Review of economic evidence
4.41 The EAG reviewed existing studies investigating the cost effectiveness of tumour profiling tests to guide treatment decisions in people with early breast cancer, and also did a detailed critique of the economic models and analyses provided by Agendia (MammaPrint), Genomic Health (Oncotype DX), and the chief investigator of a UK decision impact study (EndoPredict).
4.42 From the review, 26 studies were identified that had been published since the original assessment for diagnostics guidance 10. The models reported in the studies assessed the cost effectiveness of tumour profiling tests across different countries including the UK, the US, Canada, Mexico, Japan, Austria, Germany, France and the Netherlands. Most studies compared Oncotype DX (18 studies), MammaPrint (8 studies) or EndoPredict (1 study) with comparators such as Adjuvant! Online, the St Gallen guidelines, standard practice or other conventional diagnostic tools. There was variation between the analyses in the populations evaluated, the disease type and other patient characteristics.
4.43 There was a high level of consistency in the general modelling approach and structure, and several studies were based on a previously published model. Most of the models used a Markov or hybrid decision tree–Markov approach, 2 studies used a partitioned survival approach and 1 study used a discrete event simulation approach. The time horizons ranged from 10 years to the patient's remaining lifetime, with cycle lengths ranging from 1 month to 1 year when reported. Most of the models that evaluated Oncotype DX assumed that the test could predict relative treatment effects for chemotherapy.
Economic evaluation
4.44 None of the models identified in the literature review included all of the tests identified in the scope. Therefore, the EAG developed a de novo economic model designed to assess the cost effectiveness of Oncotype DX, MammaPrint, Prosigna, IHC4+C and EndoPredict compared with current practice without the use of the tumour profiling tests. The model used a lifetime time horizon (42 years) from the perspective of the UK NHS and personal social services. All costs and health outcomes were discounted at a rate of 3.5% per year. Unit costs were valued at 2015/16 prices. The main source of evidence used to inform the analyses of Oncotype DX, Prosigna, IHC4+C and EndoPredict was a bespoke analysis of TransATAC provided by the study investigators. This was limited to UK data on patients with hormone receptor-positive, HER2-negative disease with 0 to 3 positive lymph nodes to match the scope for this assessment. Because this study did not include MammaPrint, MINDACT was used as the basis for evaluating the cost effectiveness of MammaPrint. PREDICT scores were not available in either data set, and so this tool could not be considered as a comparator or used to determine different risk subgroups. Therefore, the comparator for Oncotype DX, Prosigna, IHC4+C and EndoPredict was current practice (various tools and algorithms), and the comparator for MammaPrint was a modified version of Adjuvant! Online.
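The effect of discounting at 3.5% per year over a 42-year horizon can be illustrated with a minimal sketch. The function name and the example stream of 1 QALY per year are hypothetical. Leaving the first year undiscounted is also an assumption, because the timing convention used in the EAG's model is not stated here.

```python
def discounted_total(values_per_year, annual_rate=0.035):
    """Sum a yearly stream of costs or QALYs after discounting.

    Assumption: the value in year t (starting at t = 0) is divided
    by (1 + annual_rate) ** t, so the first year is undiscounted.
    """
    return sum(value / (1 + annual_rate) ** year
               for year, value in enumerate(values_per_year))


# Hypothetical stream: 1 QALY in each of 42 years
undiscounted = 42 * 1.0
discounted = discounted_total([1.0] * 42)
print(round(undiscounted, 2), round(discounted, 2))
```

Under these assumptions, health gains and costs that fall late in the horizon contribute far less to the total than those in the early years.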
Model structure
4.45 The hybrid decision tree–Markov model was based on the model previously developed by Ward et al. (2013). The decision tree component of the model classified patients in the current practice group (no test) and the tumour profiling test group as high, intermediate and low risk. For EndoPredict and MammaPrint, the intermediate-risk category was excluded because these tests provide results in terms of high and low risk only. In both the test group and the current practice group, the decision tree determined the probability that a patient would be in 1 of 6 groups: low risk, chemotherapy; low risk, no chemotherapy; intermediate risk, chemotherapy; intermediate risk, no chemotherapy; high risk, chemotherapy; and high risk, no chemotherapy. For EndoPredict and MammaPrint, 4 groups were used because there was no intermediate-risk category. Each group was linked to a Markov model which predicted lifetime quality-adjusted life years (QALYs) and costs according to the patient's risk of distant recurrence and whether or not they had chemotherapy.
4.46 Each Markov node included 4 health states: distant recurrence-free; distant recurrence; long-term adverse events (acute myeloid leukaemia [AML]); and dead. Patients entered the model in the distant recurrence-free health state. A health-related quality of life decrement was applied during the first model cycle to account for health losses associated with short-term adverse events for patients having adjuvant chemotherapy. The treatment effect for adjuvant chemotherapy was modelled using a relative risk reduction for distant recurrence within each risk classification group. The benefit of the test was therefore captured in the model by changing the probability that patients with each test risk classification had adjuvant chemotherapy.
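The 4-state structure described above can be sketched as a simple cohort simulation. This is not the EAG's model. The allowed transitions, the 6-month cycle length, and the recurrence and other-cause death probabilities are all assumptions made for illustration. The probabilities of developing AML (0.00025), dying after distant recurrence (0.098) and dying after AML (0.53) are the 6-month values reported in sections 4.52 and 4.53.

```python
def step(cohort, p_recur, p_aml, p_die_other, p_die_recur, p_die_aml):
    """Advance the cohort by one cycle.

    cohort = (recurrence_free, distant_recurrence, aml, dead), as
    proportions summing to 1. Assumed structure: patients leave the
    recurrence-free state to any other state, and leave distant
    recurrence or AML only by dying.
    """
    free, recur, aml, dead = cohort
    new_free = free * (1 - p_recur - p_aml - p_die_other)
    new_recur = recur * (1 - p_die_recur) + free * p_recur
    new_aml = aml * (1 - p_die_aml) + free * p_aml
    new_dead = (dead + free * p_die_other
                + recur * p_die_recur + aml * p_die_aml)
    return (new_free, new_recur, new_aml, new_dead)


def run(cycles, **probabilities):
    """Return the state occupancy at each cycle, starting recurrence-free."""
    cohort = (1.0, 0.0, 0.0, 0.0)
    trace = [cohort]
    for _ in range(cycles):
        cohort = step(cohort, **probabilities)
        trace.append(cohort)
    return trace


# 84 six-month cycles approximate a 42-year horizon
trace = run(
    84,
    p_recur=0.01,        # hypothetical 6-month recurrence probability
    p_aml=0.00025,       # reported in section 4.53
    p_die_other=0.005,   # hypothetical other-cause mortality
    p_die_recur=0.098,   # reported in section 4.52
    p_die_aml=0.53,      # reported in section 4.53
)
print([round(share, 3) for share in trace[-1]])
```

A full model would attach a cost and a utility to each state in each cycle, and would vary the recurrence probability by risk group and chemotherapy use.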
Model inputs
4.47 The risk classification probabilities used in the model for Oncotype DX, Prosigna, IHC4+C and EndoPredict were from the bespoke data analysis of TransATAC, which only included postmenopausal women. For MammaPrint, they were from MINDACT.
4.48 The probability of developing distant metastases in each group and risk category was based on 10-year recurrence-free interval data from the bespoke data analysis of TransATAC for Oncotype DX, Prosigna, IHC4+C and EndoPredict. For MammaPrint the probability of developing distant metastases was based on an adjusted analysis of 5-year distant metastasis-free survival data from MINDACT. The model assumed that the risk of distant metastases between 10 and 15 years was halved, and after 15 years was 0.
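One way to turn a 10-year risk into per-cycle probabilities that follow the assumption above (halved between 10 and 15 years, 0 after 15 years) is sketched below. The function names, the 6-month cycle length, the constant rate within the first 10 years and the 10% example risk are all assumptions. The source does not say whether the rate or the probability was halved, and this sketch halves the per-cycle probability.

```python
def cycle_probability(ten_year_risk, cycles_per_year=2):
    """Constant per-cycle probability that compounds to the 10-year risk."""
    cycles = 10 * cycles_per_year
    return 1 - (1 - ten_year_risk) ** (1 / cycles)


def recurrence_probability(years_since_start, ten_year_risk,
                           cycles_per_year=2):
    """Per-cycle probability of distant metastases at a given time."""
    base = cycle_probability(ten_year_risk, cycles_per_year)
    if years_since_start < 10:
        return base
    if years_since_start < 15:
        return base / 2   # risk halved between 10 and 15 years
    return 0.0            # no risk after 15 years


hypothetical_ten_year_risk = 0.10
for year in (0, 12, 20):
    print(year, recurrence_probability(year, hypothetical_ten_year_risk))
```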
4.49 The probability of having chemotherapy in the current practice group and in the tumour profiling test groups was taken from the sources in table 1.
Table 1 Source for post-test probability of having chemotherapy

| Population | Source | Proportion having chemotherapy: low risk | Proportion having chemotherapy: intermediate risk | Proportion having chemotherapy: high risk |
| --- | --- | --- | --- | --- |
| Current practice group | | | | |
| LN-negative, NPI≤3.4 | NCRAS data set | 0.07² | | |
| LN-negative, NPI>3.4 | Genomic Health access scheme data set¹ | 0.43² | | |
| LN-positive (1 to 3 nodes) | NCRAS data set | 0.63² | | |
| Overall population (MammaPrint) | Expert opinion | 0.47² | | |
| 3-level tests (Oncotype DX, Prosigna and IHC4+C) | | | | |
| LN-negative, NPI≤3.4 | UKBCG survey data | 0.00 | 0.20 | 0.77 |
| LN-negative, NPI>3.4 | Genomic Health access scheme data set | 0.01 | 0.33 | 0.89 |
| LN-positive (1 to 3 nodes) | Loncaster et al. (2017) node-positive estimates | 0.08 | 0.63 | 0.83 |
| 2-level tests (EndoPredict and MammaPrint) | | | | |
| EndoPredict: all 3 subgroups | Bloomfield et al. (2017) study | 0.07 | – | 0.77 |
| MammaPrint: all subgroups | Bloomfield et al. (2017) study | 0.07 | – | 0.77 |

Abbreviations: LN, lymph node; NCRAS, National Cancer Registration and Analysis Service; NPI, Nottingham Prognostic Index; UKBCG, UK breast cancer group.
¹ The Genomic Health access scheme data set is based on the access scheme operated by NHS England and is a result of the research recommendation from NICE's original diagnostics guidance 10.
² Low, intermediate and high risk combined (a single value applies across all risk categories).
4.50 In the base-case analysis, the relative treatment effect for chemotherapy was assumed to be the same across all test risk groups, that is, all tests were assumed to be associated with prognostic benefit only. For Oncotype DX, Prosigna, IHC4+C and EndoPredict a 10-year relative risk of distant recurrence was estimated as 0.76 for chemotherapy compared with no chemotherapy (Early breast cancer trialists' collaborative group 2012), and was assumed to apply to the groups with LN-negative and LN-positive disease. For MammaPrint the 10-year relative risk of distant recurrence was estimated to be 0.77 (MINDACT) for chemotherapy compared with no chemotherapy.
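The way a common relative risk combines with the probability of having chemotherapy can be shown with a short worked sketch. The 20% baseline risk without chemotherapy is hypothetical and the function name is an assumption. The relative risk (0.76) is the base-case value above, and the chemotherapy probability (0.77) is the high-risk value for 3-level tests in the LN-negative, NPI≤3.4 row of table 1.

```python
def expected_recurrence_risk(risk_without_chemo, p_chemo, relative_risk):
    """Average 10-year distant recurrence risk in a risk group.

    Patients who have chemotherapy carry the baseline risk multiplied
    by the relative risk; the remainder carry the baseline risk.
    """
    risk_with_chemo = risk_without_chemo * relative_risk
    return p_chemo * risk_with_chemo + (1 - p_chemo) * risk_without_chemo


hypothetical_baseline = 0.20  # assumed 10-year risk without chemotherapy
group_risk = expected_recurrence_risk(
    risk_without_chemo=hypothetical_baseline,
    p_chemo=0.77,        # table 1, high-risk category
    relative_risk=0.76,  # base-case relative risk
)
print(round(group_risk, 4))
```

Because the relative risk is the same in every risk group, a test changes outcomes in this sketch only by changing which patients have chemotherapy.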
4.51 In sensitivity analyses the effect of assuming that Oncotype DX could predict relative treatment effects for chemotherapy was explored, based on the B20 study by Paik et al. (2006) and the SWOG-8814 study by Albain et al. (2010). For the group with LN-negative disease, the 10-year relative risks of distant recurrence with chemotherapy compared with no chemotherapy were 1.31, 0.61 and 0.26 for the low-risk, intermediate-risk and high-risk categories respectively. For the group with LN-positive disease, the 10-year relative risks of relapse with chemotherapy compared with no chemotherapy were 1.02, 0.72 and 0.59 respectively. It is possible that the no-chemotherapy arm of B20 may have overestimated the difference in response rates between low-risk and high-risk patients, because this arm was the derivation set for Oncotype DX. Therefore, additional sensitivity analyses in the group with LN-negative disease explored the impact of varying the relative chemotherapy treatment effect between risk groups on the incremental cost-effectiveness ratios (ICERs). Hazard ratios were based on naive indirect comparisons of the chemotherapy arms from the B20 study and the no-chemotherapy arms from the B14 study (estimated hazard ratios for treatment effects with chemotherapy compared with no chemotherapy were 0.64, 0.75 and 0.35 for the low-risk, intermediate-risk and high-risk categories respectively), and the chemotherapy arms of the B20 study and the no-chemotherapy arms of the TransATAC study (hazard ratios for treatment effects with chemotherapy compared with no chemotherapy were 0.86, 0.88 and 0.49 for the low-risk, intermediate-risk and high-risk categories respectively).
4.52 Survival following distant recurrence was based on a median of 40.1 months from Thomas et al. (2009). From this, the 6month probability of death following distant recurrence was estimated to be 0.098, assuming a constant rate. The rate of death following distant metastases was assumed to be the same across the different subgroups and across each test risk group.
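The conversion from a median survival to a 6-month probability under a constant rate can be sketched as follows. The function name is an assumption made for illustration; the only inputs taken from the text are the median of 40.1 months and the 6-month cycle.

```python
import math


def prob_from_median(median_months: float, cycle_months: float = 6.0) -> float:
    """Per-cycle event probability implied by a median survival time,
    assuming a constant (exponential) hazard."""
    rate_per_month = math.log(2) / median_months  # hazard at which half have died by the median
    return 1.0 - math.exp(-rate_per_month * cycle_months)


p_death_distant = prob_from_median(40.1)
print(f"6-month probability of death: {p_death_distant:.4f}")
```

This gives a value of roughly 0.0985, in line with the 0.098 reported above.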
4.53 The model assumed that 10.5% of patients entering the distant recurrence health state had previously had local recurrence, based on de Bock et al. (2009). The 6month probability of developing AML was estimated to be 0.00025, based on Wolff et al. (2015). Survival following the onset of AML was estimated to be approximately 8 months; assuming a constant event rate gave a 6‑month probability of death following AML of 0.53. Additional sensitivity analyses explored the effect of including congestive heart failure (average net lifetime QALY loss of 0.0385 and average net lifetime cost saving of £2 from Hall et al. 2017, using an excess congestive heart failure risk relative to that of the general population), permanent hair loss (disutility of 0.04495 from Nafees et al. 2008 applied to 15% of all patients having chemotherapy) and peripheral neuropathy (disutility of 0.02 from Shiroiwa et al. 2009 applied to 12% of all patients having chemotherapy) in the model.
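As a rough check on the AML figure, the sketch below converts a survival time of 8 months into a 6-month probability of death under a constant rate. Whether the 8 months is a mean or a median is not stated in the text, so both readings are shown; the function names are hypothetical.

```python
import math


def prob_from_mean(mean_months: float, cycle_months: float = 6.0) -> float:
    """Per-cycle probability if the survival time is a MEAN (rate = 1 / mean)."""
    return 1.0 - math.exp(-cycle_months / mean_months)


def prob_from_median(median_months: float, cycle_months: float = 6.0) -> float:
    """Per-cycle probability if the survival time is a MEDIAN (rate = ln 2 / median)."""
    return 1.0 - math.exp(-math.log(2) * cycle_months / median_months)


p_if_mean = prob_from_mean(8.0)
p_if_median = prob_from_median(8.0)
print(f"8 months read as mean:   {p_if_mean:.3f}")
print(f"8 months read as median: {p_if_median:.3f}")
```

Only the mean reading comes out close to the reported 0.53, which suggests, but does not confirm, that the 8 months was treated as a mean survival.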
Table 2 Test prices
Test | List price | Comments
Oncotype DX | £2,580 | Tests carried out in Genomic Health laboratory in US. Cost includes sample handling and customer service. A commercial-in-confidence discounted test cost was used in the model.
Prosigna | £1,970 | Based on doing the test in an NHS laboratory. Includes the laboratory costs (£240), the Prosigna kit (£1,650) and the nCounter system (£194,600, spread over 2,500 samples per lifetime of the nCounter system). Commercial-in-confidence discounted test costs were used in scenario analyses to account for the access proposal.
EndoPredict | £1,500 | Tests carried out in Myriad's laboratory in Munich. Commercial-in-confidence discounted test costs were used in scenario analyses to account for the access proposal.
IHC4 | £203 | The cost was based on 2014 prices. The total cost of the test (£198) was uplifted to current prices using the hospital and community health services indices.
MammaPrint | £2,326 | Converted from Euros to UK pounds sterling, assuming an exchange rate of 1 British pound to 1.15 Euros.
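The arithmetic behind the comments in table 2 can be sketched as below. The component costs are those quoted in the table; the uplift factor and the Euro price are back-calculated here for illustration and are not stated in the source.

```python
# Prosigna: per-test cost built from the components quoted in table 2.
lab_cost = 240.0
kit_cost = 1_650.0
ncounter_price = 194_600.0
samples_per_lifetime = 2_500

platform_cost_per_test = ncounter_price / samples_per_lifetime
prosigna_per_test = lab_cost + kit_cost + platform_cost_per_test
print(f"Prosigna per test: £{prosigna_per_test:,.2f}")

# IHC4: uplift factor IMPLIED by the 2014 cost (£198) and the listed price (£203).
implied_uplift = 203.0 / 198.0
print(f"Implied IHC4 uplift factor: {implied_uplift:.4f}")

# MammaPrint: Euro price IMPLIED by the listed price and the stated exchange rate.
eur_per_gbp = 1.15
implied_mammaprint_eur = 2_326.0 * eur_per_gbp
print(f"Implied MammaPrint price: EUR {implied_mammaprint_eur:,.2f}")
```

The Prosigna total comes to about £1,968 per test, which agrees with the listed £1,970 once rounded.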
4.55 The costs associated with adjuvant chemotherapy were from a previous costing analysis of the OPTIMA Prelim trial (Hall et al. 2017). The weighted mean cost of adjuvant chemotherapy acquisition, delivery and toxicity was estimated to be £3,145 per course.
4.56 All surviving patients had endocrine therapy for a period of between 5 and 8 years. Costs of endocrine therapy were taken from the British national formulary (2017). In addition, 30% of women with early breast cancer had 4 mg of bisphosphonates (zoledronic acid) by intravenous infusion every 6 months for up to 3 years, at a cost of £58.50, excluding administration.
4.57 All patients had 2 routine follow-up visits during the first year after surgery, with annual visits thereafter for 5 years. Patients were also assumed to have a routine annual mammogram for up to 5 years. The cost of a routine follow-up visit was estimated to be £162.84, and the cost of a mammogram was estimated to be £46.37.
4.58 Costs associated with treating local recurrence were taken from Karnon et al. (2007) and uplifted to current prices (£13,913). This was applied as a once-only cost on transition to distant recurrence. Costs associated with treating distant metastases were derived from Thomas et al. (2009), and included visits, drugs, pharmacy, hospital admission and intervention, imaging, radiotherapy, pathology and transport. Cost components specifically associated with terminal care were excluded. The 6-monthly cost of treating metastatic breast cancer was estimated to be £4,541.
Table 3 Health utilities applied in the base case
Health state / event | Duration applied in model | Mean | Standard error | Source
Recurrence-free | Indefinite | 0.824 | 0.002 | Lidgren et al. 2007
Disutility distant metastases | Indefinite | 0.14 | 0.11 | Calculated from Lidgren et al. 2007
Local recurrence | Once-only QALY loss applied on transition to distant recurrence state | −0.108 | 0.04 (assumed) | Campbell et al. 2011
Chemotherapy AEs | Once-only QALY loss applied in first cycle | −0.038 | 0.004 | Campbell et al. 2011
AML | Indefinite | 0.26 | 0.04 (assumed) | Younis et al. 2008
Abbreviations: AEs, adverse events; AML, acute myeloid leukaemia; QALY, quality-adjusted life year.
Base-case results
4.60 The following key assumptions were applied in the base-case analysis:

Clinicians interpreted each of the 3-level tests in the same way (for example, an Oncotype DX high-risk Recurrence Score result would lead to the same chemotherapy decision as a Prosigna high-risk score).

Clinicians interpreted each of the 2-level tests in the same way (for example, a MammaPrint high-risk score would lead to the same chemotherapy decision as an EndoPredict high-risk score).

The treatment effect for adjuvant chemotherapy was the same across all risk score categories for all tests.

The prognosis of patients with AML and the costs and QALYs accrued within the AML state were independent of whether they had previously developed distant metastases.

A disutility associated with adjuvant chemotherapy was applied once during the first model cycle only (while the patient is taking the regimen).

Costs associated with endocrine therapy, bisphosphonates, follow-up appointments and mammograms were assumed to differ according to time since model entry.

The model assumed that people entered at an age of around 60 years.
4.61 In the subgroup with LN-negative disease and a NPI of 3.4 or less, compared with current practice, the probabilistic model gave ICERs of:

£147,419 per QALY gained (EndoPredict)

£122,725 per QALY gained (Oncotype DX)

£91,028 per QALY gained (Prosigna)

£2,654 per QALY gained (IHC4+C).
4.62 In the subgroup with LN-negative disease and a NPI of more than 3.4, compared with current practice, the probabilistic model gave ICERs of:

£46,788 per QALY gained (EndoPredict)

£26,058 per QALY gained (Prosigna)

Oncotype DX was dominated by current practice (that is, it was more expensive and less effective)

IHC4+C was dominant over current practice (that is, it was less expensive and more effective).
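The way an ICER, dominance and being dominated relate to incremental costs and QALYs can be sketched as follows. The function name and every numeric input are hypothetical and used only for illustration; none of them are model outputs.

```python
def describe_icer(delta_cost: float, delta_qaly: float) -> str:
    """Describe a test against current practice from incremental cost (£) and QALYs.

    delta_cost = cost with test - cost with current practice
    delta_qaly = QALYs with test - QALYs with current practice
    """
    if delta_qaly == 0:
        return "no QALY difference; ICER undefined"
    if delta_cost < 0 and delta_qaly > 0:
        return "dominant"    # less expensive and more effective
    if delta_cost > 0 and delta_qaly < 0:
        return "dominated"   # more expensive and less effective
    icer = delta_cost / delta_qaly
    if delta_qaly > 0:
        return f"£{icer:,.0f} per QALY gained"
    return f"£{icer:,.0f} saved per QALY lost"  # cheaper but less effective


# HYPOTHETICAL incremental results, one for each case.
print(describe_icer(500.0, 0.02))    # more costly, more effective
print(describe_icer(-300.0, 0.01))   # cheaper, more effective
print(describe_icer(400.0, -0.01))   # more costly, less effective
print(describe_icer(-370.0, -0.1))   # cheaper, less effective
```

No ICER is quoted for a dominant or dominated test because the ratio is not meaningful when costs and QALYs move in opposite directions.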
4.63 In the population with LN-positive disease, compared with current practice, the probabilistic model gave ICERs of:

£28,731 per QALY gained (Prosigna)

£21,458 per QALY gained (EndoPredict)

Oncotype DX was dominated by current practice

IHC4+C was dominant over current practice.
4.64 In the overall MINDACT population, MammaPrint compared with modified Adjuvant! Online had an ICER of £131,482 per QALY gained. In the modified Adjuvant! Online high-risk subgroup, MammaPrint was dominated by current practice, and in the modified Adjuvant! Online low-risk subgroup, MammaPrint compared with current practice had an ICER of £414,202 per QALY gained.
4.65 The risk classification probabilities and the probability of having chemotherapy were combined in the model to estimate chemotherapy use with and without tumour profiling. The modelled chemotherapy use in the base case is shown in table 4.
Table 4 Modelled chemotherapy use with and without tumour profiling
Test | Subgroup compared with current practice | Chemotherapy use: test | Chemotherapy use: no test | Net change
Oncotype DX | LN0 NPI≤3.4 | 0.076 | 0.072 | 0.004
Oncotype DX | LN0 NPI>3.4 | 0.273 | 0.430 | −0.157
Oncotype DX | LN+ (1–3 nodes) | 0.337 | 0.627 | −0.290
IHC4+C | LN0 NPI≤3.4 | 0.030 | 0.072 | −0.042
IHC4+C | LN0 NPI>3.4 | 0.355 | 0.430 | −0.075
IHC4+C | LN+ (1–3 nodes) | 0.554 | 0.627 | −0.073
Prosigna | LN0 NPI≤3.4 | 0.075 | 0.072 | 0.003
Prosigna | LN0 NPI>3.4 | 0.435 | 0.430 | 0.005
Prosigna | LN+ (1–3 nodes) | 0.709 | 0.627 | 0.082
EndoPredict | LN0 NPI≤3.4 | 0.140 | 0.072 | 0.068
EndoPredict | LN0 NPI>3.4 | 0.438 | 0.430 | 0.008
EndoPredict | LN+ (1–3 nodes) | 0.603 | 0.627 | −0.024
MammaPrint | MINDACT overall population | 0.319 | 0.466 | −0.148
MammaPrint | mAOL high risk | 0.445 | 0.772 | −0.327
MammaPrint | mAOL low risk | 0.191 | 0.159 | 0.033
Abbreviations: LN0, lymph node negative; LN+, lymph node positive; mAOL, modified Adjuvant! Online; NPI, Nottingham Prognostic Index.
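The net change column in table 4 is the chemotherapy use with the test minus the use without it. The minimal sketch below recomputes that column from the table values and flags rows where the recomputed figure differs from the published one in the third decimal place; the variable names are illustrative only.

```python
# (test, subgroup, use with test, use with no test, published net change)
rows = [
    ("Oncotype DX", "LN0 NPI<=3.4", 0.076, 0.072, 0.004),
    ("Oncotype DX", "LN0 NPI>3.4", 0.273, 0.430, -0.157),
    ("Oncotype DX", "LN+ (1-3 nodes)", 0.337, 0.627, -0.290),
    ("IHC4+C", "LN0 NPI<=3.4", 0.030, 0.072, -0.042),
    ("IHC4+C", "LN0 NPI>3.4", 0.355, 0.430, -0.075),
    ("IHC4+C", "LN+ (1-3 nodes)", 0.554, 0.627, -0.073),
    ("Prosigna", "LN0 NPI<=3.4", 0.075, 0.072, 0.003),
    ("Prosigna", "LN0 NPI>3.4", 0.435, 0.430, 0.005),
    ("Prosigna", "LN+ (1-3 nodes)", 0.709, 0.627, 0.082),
    ("EndoPredict", "LN0 NPI<=3.4", 0.140, 0.072, 0.068),
    ("EndoPredict", "LN0 NPI>3.4", 0.438, 0.430, 0.008),
    ("EndoPredict", "LN+ (1-3 nodes)", 0.603, 0.627, -0.024),
    ("MammaPrint", "MINDACT overall population", 0.319, 0.466, -0.148),
    ("MammaPrint", "mAOL high risk", 0.445, 0.772, -0.327),
    ("MammaPrint", "mAOL low risk", 0.191, 0.159, 0.033),
]

# Rows where recomputed and published net change differ by more than half a unit
# in the third decimal place.
mismatches = [
    (test, subgroup)
    for test, subgroup, with_test, no_test, published in rows
    if abs((with_test - no_test) - published) > 0.0005
]
max_gap = max(abs((w - n) - p) for _, _, w, n, p in rows)

print("Rows differing in the third decimal place:", mismatches)
print(f"Largest discrepancy: {max_gap:.4f}")
```

Two MammaPrint rows differ by 0.001, presumably because the published net change was calculated from unrounded inputs; all other rows agree exactly.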
Probabilistic sensitivity analyses
4.66 The cost-effectiveness planes from the probabilistic sensitivity analyses showed considerable uncertainty in the cost-effectiveness estimates.
4.67 In the subgroup with LN-negative disease and a NPI of 3.4 or less, the only test with a non-zero probability of producing more net benefit than current practice at maximum acceptable ICERs of £20,000 and £30,000 per QALY gained was IHC4+C.
4.68 In the subgroup with LN-negative disease and a NPI of more than 3.4, at a maximum acceptable ICER of £20,000 per QALY gained, IHC4+C had a probability of 0.69 of being cost effective compared with current practice. For EndoPredict, Oncotype DX and Prosigna, the probability that the test was cost effective compared with current practice at this threshold was 0.24 or less. In the same subgroup, at a maximum acceptable ICER of £30,000 per QALY gained, IHC4+C had a probability of 0.67 and Prosigna had a probability of 0.60 of being cost effective compared with current practice. Oncotype DX had a probability of 0.04 and EndoPredict had a probability of 0.26 of being cost effective compared with current practice.
4.69 In the subgroup with LN-positive disease, IHC4+C had probabilities of 0.95 and 0.94 of being cost effective compared with current practice at maximum acceptable ICERs of £20,000 and £30,000 per QALY gained respectively. In the same subgroup, the probabilities of EndoPredict producing more net benefit than current practice were 0.44 and 0.73, at maximum acceptable ICERs of £20,000 and £30,000 per QALY gained respectively. For Prosigna the probabilities were 0.24 and 0.55. In this subgroup Oncotype DX had very low probabilities of producing more net benefit than current practice at the same maximum acceptable ICERs (0.01 or lower).
4.70 In the overall MINDACT population and in the subgroups, the probability that MammaPrint would be cost effective compared with current practice at maximum acceptable ICERs of £20,000 and £30,000 per QALY gained was approximately 0.
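A probability of being cost effective at a given maximum acceptable ICER is commonly calculated as the share of probabilistic samples in which the test has positive incremental net monetary benefit. The sketch below shows that calculation under this assumption; the function name and the five samples are hypothetical and are not taken from the EAG's model.

```python
def prob_cost_effective(samples: list[tuple[float, float]], threshold: float) -> float:
    """Share of samples in which incremental net monetary benefit is positive.

    Each sample is (incremental cost in £, incremental QALYs) against current practice.
    Net monetary benefit = threshold * incremental QALYs - incremental cost.
    """
    wins = sum(1 for d_cost, d_qaly in samples if threshold * d_qaly - d_cost > 0)
    return wins / len(samples)


# HYPOTHETICAL probabilistic samples (a real analysis would use thousands).
samples = [
    (200.0, 0.015),
    (450.0, 0.010),
    (-100.0, 0.005),
    (300.0, -0.002),
    (500.0, 0.020),
]

p_20k = prob_cost_effective(samples, 20_000)
p_30k = prob_cost_effective(samples, 30_000)
print(f"Probability cost effective at £20,000 per QALY: {p_20k:.2f}")
print(f"Probability cost effective at £30,000 per QALY: {p_30k:.2f}")
```

With these illustrative samples the probability rises from 0.40 to 0.60 as the threshold increases, because more samples have QALY gains that are worth more than their extra cost.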
Deterministic sensitivity analyses
4.71 The EAG did deterministic sensitivity analyses, testing a wide range of plausible values of key parameters.
4.72 Deterministic sensitivity analysis results for Oncotype DX compared with current practice were:

Subgroup with LN-negative disease and a NPI of 3.4 or less: ICERs remained over £34,000 per QALY gained across all analyses.

Subgroup with LN-negative disease and a NPI of more than 3.4: Oncotype DX was either dominated or had an ICER of more than £35,000 per QALY gained across almost all analyses. The only exception was when Oncotype DX was assumed to predict relative treatment effects for chemotherapy. In this analysis, Oncotype DX dominated current practice.

Population with LN-positive disease: Oncotype DX remained dominated across most analyses. The exceptions were when Oncotype DX was assumed to predict relative treatment effects for chemotherapy (it was dominant), and when the cost of chemotherapy was doubled (£3,700 saved per QALY lost).
4.73 Deterministic sensitivity analysis results for IHC4+C compared with current practice were:

Subgroup with LN-negative disease and a NPI of 3.4 or less: ICERs remained below £16,000 per QALY gained across all analyses, except when post-test chemotherapy probabilities were derived from Holt et al. (2011; £36,259 per QALY gained). Also, IHC4+C dominated current practice when the cost of chemotherapy was doubled.

Subgroup with LN-negative disease and a NPI of more than 3.4: IHC4+C dominated current practice or had an ICER below £6,000 per QALY gained across all scenarios.

Population with LN-positive disease: IHC4+C dominated current practice across all but 1 scenario. When the probability of having chemotherapy was based on the UK breast cancer group (UKBCG) survey the ICER was £1,929 per QALY gained.
4.74 Deterministic sensitivity analysis results for Prosigna compared with current practice were:

Subgroup with LN-negative disease and a NPI of 3.4 or less: ICERs were greater than £71,000 per QALY gained across all analyses.

Subgroup with LN-negative disease and a NPI of more than 3.4: ICERs were below £34,000 per QALY gained across all analyses.

Population with LN-positive disease: ICERs were below £38,000 per QALY gained across all analyses.
4.75 Deterministic sensitivity analysis results for EndoPredict compared with current practice were:

Subgroup with LN-negative disease and a NPI of 3.4 or less: ICERs remained greater than £91,000 per QALY gained across all analyses.

Subgroup with LN-negative disease and a NPI of more than 3.4: ICERs remained greater than £30,000 per QALY gained across all but 2 of the analyses. Exceptions were when the UKBCG survey was used to inform the probability of having chemotherapy (£25,250 per QALY gained), and when Cusumano et al. (2014) was used to inform the probability of having chemotherapy based on the EndoPredict test result (£26,689 per QALY gained).

Population with LN-positive disease: ICERs remained below £30,000 per QALY gained across all scenarios.
4.76 Deterministic sensitivity analysis results for MammaPrint compared with current practice were:

Overall MINDACT population: ICERs were estimated to be greater than £76,000 per QALY gained across all scenarios.

Modified Adjuvant! Online high-risk subgroup: MammaPrint was dominated by current practice across almost all scenarios.

Modified Adjuvant! Online low-risk subgroup: ICERs were greater than £161,000 per QALY gained across all analyses.
4.77 After consultation, the EAG did more deterministic sensitivity analyses varying the estimated relative risk of distant recurrence associated with chemotherapy, which was assumed to be 0.76 in the base case. Results showed that as the relative risk moved from 0.6 to 0.9, the tests became less cost effective.