5 Committee discussion
5.1 The committee discussed current practice for diagnosing and managing preterm labour in women with intact membranes. The clinical experts explained that the incidence of birth before 37 weeks of pregnancy in the UK was around 8% and that an estimated 50,000 to 60,000 babies are born preterm each year. Women with suspected preterm labour have a clinical assessment and a fetal fibronectin (fFN) test is commonly used to help determine whether labour is established. Although NICE's guideline on preterm labour and birth recommends transvaginal ultrasound as the preferred diagnostic option, this is not available everywhere. Also, it needs healthcare professionals with appropriate training to perform and interpret the scan. The committee noted that the NICE guideline recommends that all women who present at less than 30 weeks of pregnancy have treatment based on clinical assessment alone, but heard that in practice many of these women also have an fFN test, or another test. The committee concluded that for women who present at 30 weeks or above, the most appropriate comparator was fFN at a threshold of 50 nanograms/millilitre (ng/ml). For women who present at less than 30 weeks, a treat-all management pathway and fFN testing should be considered to reflect variation in current practice.
5.2 The committee discussed the effect that suspected preterm labour can have on pregnant women and their partners. The patient experts explained that preterm labour is associated with substantial anxiety, particularly when a diagnosis is difficult to confirm. For example, women who have a false positive result might be transferred to a higher level hospital unnecessarily. The committee noted that understanding whether preterm labour is established is of considerable importance to pregnant women. It heard about the importance of communicating the risks and benefits associated with the different diagnostic options so that women are able to understand the test results. The possibility of false negative results should also be explained and women given reassurance that they can return to hospital if they feel that their symptoms have not resolved despite a negative test result.
5.3 The committee discussed the studies included in the diagnostic accuracy review. It noted that 16 studies were available for Actim Partus, 2 studies were available for fFN using thresholds of 10 ng/ml, 200 ng/ml and 500 ng/ml and 4 studies were available for PartoSure. It acknowledged that 7 studies had been submitted by 1 company after the diagnostics assessment report had been completed. The variation in estimates of diagnostic accuracy from the additional studies was similar to that seen in the studies assessed by the external assessment group (EAG), and so added further uncertainty to the results.
5.4 The committee noted that the studies included in the review were done outside of the UK. It understood that the management of preterm labour was likely to vary from country to country and that this had the potential to affect test accuracy. The EAG was unable to explore the likely effect of this variation because many studies did not provide details on how preterm labour was managed, particularly whether the delivery was spontaneous or induced for medical reasons. This variable had a direct effect on how the reference standards of birth within 48 hours or 7 days of testing were interpreted, and on determining true and false positive index test results. The committee concluded that because of shortcomings in how the results of the included studies were reported, it was not able to judge whether the results were generalisable to the NHS.
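The dependence of test accuracy on how the reference standard is interpreted can be illustrated with a minimal sketch. It is not the EAG's method. The `Case` records, the `count_induced` switch and all of the data are hypothetical and are included only to show how treating medically induced births as reference-standard events or non-events changes which index test results are counted as true or false positives.

```python
from collections import Counter
from typing import NamedTuple


class Case(NamedTuple):
    test_positive: bool   # index test result
    days_to_birth: int    # days from testing to birth
    spontaneous: bool     # False if delivery was induced for medical reasons


def classify(case, window_days=7, count_induced=True):
    """Label one result against the reference standard 'birth within window_days'."""
    delivered = case.days_to_birth <= window_days and (
        count_induced or case.spontaneous
    )
    if case.test_positive:
        return "TP" if delivered else "FP"
    return "FN" if delivered else "TN"


def accuracy(cases, **kwargs):
    """Return (sensitivity, specificity); assumes each denominator is non-zero."""
    n = Counter(classify(c, **kwargs) for c in cases)
    sensitivity = n["TP"] / (n["TP"] + n["FN"])
    specificity = n["TN"] / (n["TN"] + n["FP"])
    return sensitivity, specificity


# Hypothetical data, not taken from any study in the review.
cases = [
    Case(True, 3, True),     # positive test, spontaneous birth within 7 days
    Case(True, 5, False),    # positive test, induced birth within 7 days
    Case(True, 20, True),    # positive test, birth after 7 days
    Case(False, 2, True),    # negative test, spontaneous birth within 7 days
    Case(False, 30, True),   # negative test, birth after 7 days
    Case(False, 40, True),   # negative test, birth after 7 days
]

with_induced = accuracy(cases, count_induced=True)
spontaneous_only = accuracy(cases, count_induced=False)
print(with_induced, spontaneous_only)
```

In this hypothetical data set, the single induced birth moves from true positive to false positive when only spontaneous births count as events, so both sensitivity and specificity change although no test result has changed. Studies that do not report whether births were spontaneous or induced do not allow this distinction to be made.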
5.5 The committee considered the results of the EAG's diagnostic accuracy meta-analyses, and noted that the accuracy estimates differed substantially between studies. The studies included in the review recruited a wide range of women and the EAG raised concerns that there was substantial heterogeneity in a number of important patient characteristics, including gestational age, multiple pregnancy and history of preterm birth. The EAG explained that it did not consider the pooled diagnostic accuracy results to be reliable, because variables that may affect test accuracy such as the use of tocolytics, mode of delivery and gestational age were not reported in many of the studies. Therefore the EAG had not been able to explore which variables were driving the differences in test accuracy estimates between studies. More detailed reporting of the results would be needed for the EAG to have confidence in the pooled diagnostic accuracy results. The EAG further cautioned that the confidence intervals around the pooled estimates were unlikely to sufficiently characterise the uncertainty in the included studies. The committee concluded that there was substantial uncertainty in the pooled results. It also considered that there was a need for further robust diagnostic accuracy studies which aim to address the methodological limitations identified by the EAG (see section 5.14).
5.6 The committee questioned whether gestational age was likely to affect the accuracy of the tests. It noted that the EAG had not been able to do subgroup analyses by gestational age because insufficient data were available. The clinical experts explained that it was plausible that test accuracy could vary by gestational age, because the causes of preterm labour might be different at certain gestational ages. Although the biomarkers detected by each of the tests are thought to be associated with a common preterm labour biochemical signal, their expression may differ depending on the cause of preterm labour. For example, the biomarkers may perform differently in women with preterm labour caused by placental bleeding compared with preterm labour caused by ascending infection. The committee concluded that further evidence is needed on the effect of gestational age on test accuracy (see section 5.14).
5.7 The committee discussed how the test results would be used to guide clinical management. The EAG explained that it had not found any studies which reported the effect of the tests on decision-making, but noted that its search had been restricted to controlled study designs only. The clinical experts explained that biomarker test results are used as an aid to decision-making and are not intended to provide a final decision on the treatment pathway for a woman with symptoms of preterm labour. The importance given to biomarker test results varies in practice depending on the presenting symptoms and clinical history. The clinical experts noted that negative results are often interpreted with caution. They highlighted a study (Dutta et al. 2011) which looked at clinicians' compliance with test results and found that 20 to 30% of women with negative test results had corticosteroid treatment as if they were in preterm labour. The committee considered that it was uncertain how differences in test accuracy might translate to differences in patient outcomes in practice, particularly quality of life for the mother and child. It concluded that further research was needed to collect these data (see sections 5.15 and 5.16).
5.8 The committee noted that because the EAG considered the results of the meta-analyses to be unreliable, the EAG preferred to use data from studies comparing at least 2 of the biomarker tests in the same population to assess their relative accuracy in the economic model. The committee understood that this approach was taken to minimise bias in the accuracy estimates, which might arise because of differences in study design. It noted that no studies assessed all 3 biomarker tests in the same population. The EAG used diagnostic accuracy results from 2 studies (APOSTEL‑1 and Hadzi‑Lega et al. 2017) in the base-case analysis, although this was subsequently revised to include additional studies for PartoSure. The clinical experts noted that the studies were unlikely to be representative of women in NHS clinical practice. In APOSTEL‑1 there was a high proportion of women who would be considered high risk, with 23% of women having previous preterm delivery. The clinical experts explained that this would be lower in practice. Also, Hadzi‑Lega et al. included a small number of women (n=57) and was carried out in Macedonia, where the care pathway was likely to vary considerably compared with NHS clinical practice. The committee concluded that the women in the studies included in the model (APOSTEL‑1 and Hadzi‑Lega et al.) did not represent those seen in NHS clinical practice.
5.9 The committee was aware that the EAG's de novo model comprised a decision tree, which took account of both the neonatal care options available at a hospital (levels 1 to 3) and gestational age. The EAG explained that all tests included in the model were compared with no treatment. The committee considered the quality-adjusted life-year (QALY) payoffs that had been attached to each of the test outcomes. The EAG explained that because of a lack of clinical outcome data, equal QALYs had been assumed for true negative, false positive and false negative results. The patient and clinical experts noted that this approach did not adequately capture the outcomes that could arise from testing. False positive results may be associated with substantial anxiety and may also result in unnecessary treatment, particularly because the model assumed that clinical judgement did not influence the interpretation of test results. Also, although in practice women with false negative results are likely to return with ongoing symptoms, the patient experts said that sometimes these results reassure women and they may not return to hospital in time for effective treatment. This could severely affect the longer-term health of the child born preterm. Therefore the committee considered that the costs and longer-term health outcomes of the child were unlikely to have been adequately captured for false negative results. It concluded that, to capture the full effect of testing, future models should incorporate the effect of changes in both sensitivity and specificity.
5.10 The committee considered the costs used in the economic model. It noted stakeholder comments received during consultation on tocolytic costs used in the model. The EAG explained that it had assumed that atosiban would be used in a scenario analysis in which in utero transfers occurred, but was aware that nifedipine is the first-line tocolytic recommended in NICE's guideline on preterm labour and birth. The clinical experts stated that nifedipine was much less expensive than atosiban. The committee noted that although this cost did not affect the base-case results, if nifedipine had been used in the scenario analysis, the observed cost savings would have been lower. It also considered whether the economic model adequately captured the costs relating to adverse events and long-term health outcomes of the mother and child. The clinical experts explained that the cost of intraventricular haemorrhage in the economic model was likely to be an underestimate, at an average cost of £114,648 per child, and noted that it had been estimated using costing data for cerebral palsy. They also explained that they would expect the lifetime healthcare costs to be at least 10 times higher than those in the model, particularly for babies who were extremely preterm and more likely to have a severe form of intraventricular haemorrhage (grades 3 to 4). Also, the committee noted that the model did not account for costs relating to necrotising enterocolitis, which can be significant, and that it excluded costs of neonatal and maternal sepsis. The EAG explained that not all costs for long-term health events could be included because no data were available. The committee concluded that the model was likely to considerably underestimate the longer-term costs of preterm birth.
5.11 The committee discussed the economic model's results. It recalled its consideration of the limitations in the clinical data available for the biomarker tests. These included the lack of studies on clinical outcomes, poor reporting of studies, the heterogeneity and lack of head-to-head studies comparing all 3 tests (see sections 5.3 to 5.7). This led to many simplifying assumptions in the economic model, which the committee did not consider to be clinically plausible. It noted that probabilistic ICERs had not been presented, and that the fully incremental analyses appeared to contain errors. It therefore considered the available pairwise deterministic ICERs, but noted that probabilistic ICERs would have been preferred. Many of the deterministic ICERs for the tests compared with fetal fibronectin at a threshold of 50 ng/ml were in the south-west quadrant of the cost-effectiveness plane, that is, the index tests were cheaper and less effective than the comparator. The committee noted that the QALY loss in most comparisons was relatively small (−0.006). However, because of the limitations in the clinical data and the implementation of the model (see sections 5.8 and 5.9), it was not possible to determine the magnitude or direction of the health-related outcomes that might occur in practice. Also, it noted that the model's predicted cost savings may not be realised in practice (see section 5.10). The committee agreed that the degree of uncertainty in the current clinical evidence was too high for it to be able to use the ICERs for decision-making. It considered that the scope of any further revisions to the assumptions in the modelling would be limited without more robust clinical data. 
The committee concluded that without robust diagnostic accuracy and clinical outcome data, it was not able to recommend Actim Partus, quantitative fFN testing using the Rapid fFN 10Q Cassette Kit at thresholds other than 50 ng/ml and PartoSure for use in the NHS to diagnose preterm labour in women with intact membranes.
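The interpretation of ICERs in the south-west quadrant, as discussed in section 5.11, can be illustrated with a minimal sketch. It is not the EAG's model. The QALY difference of −0.006 is the value noted by the committee, but the cost difference of −£60 is a hypothetical figure chosen only for illustration, and the names `Comparison`, `quadrant` and `icer` are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    delta_cost: float   # index test cost minus comparator cost (GBP)
    delta_qaly: float   # index test QALYs minus comparator QALYs


def quadrant(c: Comparison) -> str:
    """Name the quadrant of the cost-effectiveness plane for one comparison."""
    if c.delta_cost == 0 or c.delta_qaly == 0:
        return "on an axis"
    north_south = "north" if c.delta_cost > 0 else "south"
    east_west = "east" if c.delta_qaly > 0 else "west"
    return f"{north_south}-{east_west}"


def icer(c: Comparison) -> float:
    """Incremental cost-effectiveness ratio: cost difference per QALY difference."""
    if c.delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return c.delta_cost / c.delta_qaly


# Hypothetical cost saving of GBP 60; QALY loss of 0.006 as noted in section 5.11.
example = Comparison(delta_cost=-60.0, delta_qaly=-0.006)
print(quadrant(example), round(icer(example)))
```

In the south-west quadrant the ICER is positive because both differences are negative, and it represents the saving per QALY lost rather than the cost per QALY gained. A higher ICER therefore favours the index test, which is the reverse of the interpretation in the north-east quadrant. This is why the quadrant needs to be reported alongside the ratio.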
5.12 The clinical experts explained that there are 2 ongoing UK studies looking at the use of biomarker tests for preterm labour in the NHS: QUIDs II and PETRA. QUIDs II plans to recruit over 2,000 women and includes Actim Partus, PartoSure and quantitative fFN testing. It is scheduled to complete by September 2018. PETRA plans to recruit over 1,000 women and includes quantitative fFN testing. It is scheduled to report by the end of 2018. The committee considered that the results of these studies could provide diagnostic accuracy data that are generalisable to NHS practice and additional data on patient-reported outcomes. Also, it noted that QUIDs II would provide comparative accuracy data for all 3 interventions from the same population, which should overcome some of the bias introduced to the analyses by indirect comparisons.
5.13 The committee questioned how reproducible the test results were in clinical practice. The companies explained that each test has a recommended sample collection protocol which should be followed, although the clinical experts commented that it was uncertain how strictly these were followed in practice. The EAG noted that the reproducibility of PartoSure in practice had been explored (Werlen et al. 2015), but that equivalent data were not available for Actim Partus or quantitative fFN testing. The committee encouraged the companies to do similar studies to show the reproducibility of Actim Partus and quantitative fFN testing using the Rapid fFN 10Q Cassette Kit.
5.14 The committee noted the need for further diagnostic accuracy studies to assess whether the accuracy of Actim Partus, PartoSure and quantitative fetal fibronectin testing using the Rapid fFN 10Q Cassette Kit differs by gestational age (see section 5.6). Studies should also aim to collect data that help to identify whether a birth occurred spontaneously or because of medical intervention, including whether the woman had tocolytics and the mode of delivery (see section 5.5).
5.15 The committee noted that there were no data on how the test results affect clinical decision-making. It considered that further studies should be done to address this uncertainty (see section 5.7). This could be incorporated into a clinical outcome study, or could be done as a standalone study with clinical experts being asked to provide a management plan for a clinical scenario both with and without knowledge of the biomarker test result.
5.16 The committee noted that further studies should be done to assess the effect of Actim Partus, PartoSure and quantitative fetal fibronectin testing using the Rapid fFN 10Q Cassette Kit on maternal and neonatal outcomes, including quality of life (see section 5.7). When possible, these studies should also collect data on resource use associated with preterm birth.