Clinical and technical evidence

A literature search was carried out for this briefing in accordance with the interim process and methods statement. This briefing includes the most relevant or best available published evidence relating to the clinical effectiveness of the technology. Further information about how the evidence for this briefing was selected is available on request by contacting mibs@nice.org.uk.

Published evidence

Four studies including 3,485,065 patient records are summarised in this briefing. Patient records were drawn from populations in Israel, the UK and the US. The most recent study (Birks et al. 2017) reports the results of a large UK study of 2,550,119 patient records. Table 2 summarises the clinical evidence as well as its strengths and limitations.
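As a quick arithmetic check, the per-study record counts reported in Table 3 can be summed to confirm the total quoted above. This is an illustrative sketch only: the variable names are assumptions, and the counts are those reported in this briefing.

```python
# Record counts per study, as reported in this briefing (Tables 2 and 3).
# The dictionary name and layout are illustrative assumptions.
records_per_study = {
    "Kinar et al. 2016": 779_654 + 25_613,  # Israeli dataset + UK dataset
    "Birks et al. 2017": 2_550_119,
    "Hornbrook et al. 2017": 17_095,
    "Kinar et al. 2017": 112_584,
}

total_records = sum(records_per_study.values())
print(f"{total_records:,}")  # 3,485,065
```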

Overall assessment of the evidence

There are 2 studies performed on data from GPs in the UK, which account for about 75% of the data in this briefing. Most of the evidence compares the diagnostic accuracy of ColonFlag with standard care, which may vary across country and region. The area under the receiver operator curve (AUC) and the odds ratio (OR) of having colorectal cancer are the most commonly reported outcomes. These showed that the overall performance of the algorithm was reasonably consistent across the different populations. The AUC was slightly lower in the most recent UK study (Birks et al. 2017), because the time interval before diagnosis was longer than in the other studies. In an age-matched, case-control design from the same study, the AUC was considerably lower than in the other studies, indicating that age is an important predictive factor. The reported ORs showed that ColonFlag is potentially useful for identifying people with a 10 to 30 times increased risk of colorectal cancer (CRC).
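To make the reported accuracy measures concrete, the sketch below derives sensitivity, specificity, PPV, NPV and an odds ratio from a 2x2 table of test result against diagnosis. It assumes the OR is the diagnostic odds ratio of that table; the individual studies may define or estimate it differently. The function name and the counts are hypothetical and are not taken from any of the studies.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table (flagged or not flagged versus CRC or no CRC)."""
    return {
        "sensitivity": tp / (tp + fn),        # proportion of cases that were flagged
        "specificity": tn / (tn + fp),        # proportion of non-cases not flagged
        "ppv": tp / (tp + fp),                # proportion of flagged people who are cases
        "npv": tn / (tn + fn),                # proportion of unflagged people who are non-cases
        "odds_ratio": (tp * tn) / (fp * fn),  # odds of CRC if flagged / odds if not flagged
    }

# Hypothetical counts chosen only for illustration.
metrics = diagnostic_metrics(tp=50, fp=500, fn=450, tn=99_000)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the specificity is about 99.5% and the odds ratio is 22, which shows how a high OR can coexist with a modest PPV when the condition is rare.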

All of the studies are retrospective and observational. There is no evidence available on the cost effectiveness, resource consequences or utility of ColonFlag.

Table 2 Summary of selected studies

Kinar et al. (2016)
Study size, design and location	A retrospective, observational study on registry data in Israel (n=779,654) and the UK (n=25,613). All people aged 40 or over within the Maccabi Healthcare Services who had CBC results from 2008 and 2009 were included. The UK dataset comprised a subset of an anonymised UK primary care database. This resulted in a cohort design for the Israeli dataset and a case-control design for the UK dataset; 80% of the Israeli cohort was used as a derivation dataset and the remaining 20% was used as a validation dataset.
Intervention and comparator(s)	ColonFlag compared with the standard of care and gFOBT.
Key outcomes	Israeli dataset: AUC=0.82±0.01. OR at a false-positive rate of 0.5% was 26±5 and the specificity at 50% sensitivity was 88±2%. UK dataset: AUC=0.81. OR at a false-positive rate of 0.5% was 40±6 and the specificity at 50% sensitivity was 94±1%. ColonFlag detected 48% more CRC cases than gFOBT in a dataset of 75,822 Israeli records.
Strengths and limitations	A large number of patient records were included across 2 populations; the Israeli cohort was randomised, the UK one was not. The reference standards used are not clear (and may vary over the dataset), other than the comparison between ColonFlag and gFOBT in the Israeli subset. No power calculation is reported.
Birks et al. (2017)
Study size, design and location	A retrospective, observational study on registry data from the Clinical Practice Research Datalink (CPRD) in the UK (n=2,550,119). People aged 40 or over with a CBC result from January 2000 to April 2015 were included. Following the methodology used in Kinar (2016), a primary analysis (n=2,225,249) and sensitivity analyses were performed. The sensitivity analyses included a cohort study, performed for people with CBCs taken during 2012 (n=600,273) and a case-control study, matching for age, sex and year of risk score (n=519,241).
Intervention and comparator(s)	ColonFlag compared with the standard of care.
Key outcomes	AUC=0.776 (95% CI 0.771 to 0.781) for CBCs taken in an 18–24 month interval before diagnosis. For the case-control group (age-matched), the AUC was 0.583 (95% CI 0.574 to 0.591). In the 2012 cohort, the PPV was 8.8% and the NPV was 99.6% at a specificity of 99.5%. At this cut‑off, the OR was found to be 26.5 (95% CI 23.3 to 30.2). The AUC in this cohort was 0.781 (95% CI 0.772 to 0.791).
Strengths and limitations	A very large study population within the NHS was used, including a large cohort group – allowing PPV and NPV to be calculated. A sensitivity analysis was performed showing that the results were robust to variations in randomly selected CBCs. Some of the individuals with no diagnosis during the study period may have been diagnosed outside the follow‑up interval. The reference standards may have varied among patients. No power calculation is reported.
Hornbrook et al. (2017)
Study size, design and location	A retrospective, observational study on registry data in the US (n=17,095). Data from people with CRC and CBC results before diagnosis (from 2000 to 2013; n=900) were included, along with data from 16,195 controls. The data were taken from the KPNW tumour registry.
Intervention and comparator(s)	ColonFlag compared with the standard of care.
Key outcomes	AUC=0.80±0.01. OR=34.7 (95% CI 28.9 to 40.4) for a specificity of 99%. ColonFlag was found to be more accurate at detecting right-sided CRCs than left-sided tumours.
Strengths and limitations	A smaller population was included than in the other studies and all records were drawn from within a single private healthcare service. This study uses a smaller age range (40 to 89 years) than the ColonFlag intended use (>40). Reference standards varied between records and may have included multiple tests (colonoscopies, flexible/rigid sigmoidoscopies, gFOBT and FIT). No power calculation is reported.
Kinar et al. (2017)
Study size, design and location	A retrospective, observational study on registry-based data in Israel (n=112,584) for all people aged between 50 and 75 years within the Maccabi Healthcare Services with CBC results from July to December 2007.
Intervention and comparator(s)	ColonFlag compared with the standard of care.
Key outcomes	ColonFlag risk scores were converted into percentiles. 3,337 individuals were within a 3‑percentile cut‑off and 1,094 within a 1‑percentile cut‑off. In the 3‑percentile group, the OR was 10.9 (95% CI 7.3 to 16.2) and sensitivity was 25%. In the 1‑percentile group, the OR was 21.8 (95% CI 13.8 to 34.2) and sensitivity was 17.3%. Anticoagulant treatments, treatments for other gastrointestinal diseases and presence of other cancers were found to be possible causes of false positives.
Strengths and limitations	A small population was used in comparison with the other studies. The age group used was different from that in the other studies and the population was all drawn from the same private healthcare service. There was no information given on the reference standard. No power calculation is reported.
Abbreviations: AUC, area under the receiver operator curve; CBC, complete blood count; CRC, colorectal cancer; gFOBT, guaiac faecal occult blood test; FIT, faecal immunochemical test; CI, confidence interval; KPNW, Kaiser Permanente Northwest region; OR, odds ratio; PPV, positive predictive value; NPV, negative predictive value.
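The percentile cut-offs reported for Kinar et al. (2017) in Table 2 correspond to flagging roughly the highest-scoring 3% (3,337 of 112,584) and 1% (1,094 of 112,584) of records. A minimal sketch of this kind of cut-off is shown below. It uses synthetic random scores rather than ColonFlag output, and the function name is hypothetical; how ColonFlag itself derives percentiles is not described in this briefing.

```python
import random

def flag_top_percent(scores, percent):
    """Return the indices of the records whose scores fall in the top `percent` of all scores."""
    n_flagged = round(len(scores) * percent / 100)
    # Rank record indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_flagged])

random.seed(0)
scores = [random.random() for _ in range(10_000)]  # synthetic risk scores

flagged_3 = flag_top_percent(scores, 3)  # 3-percentile cut-off
flagged_1 = flag_top_percent(scores, 1)  # 1-percentile cut-off
print(len(flagged_3), len(flagged_1))    # 300 100
```

The 1-percentile group is always a subset of the 3-percentile group, so tightening the cut-off trades sensitivity for a higher OR, as seen in the reported results.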

Table 3 Summary of findings

Author	Location	Number of patients	AUC	OR of colorectal cancer
Kinar et al. 2016	Israel and UK	805,267	Israel: 0.82±0.01 UK: 0.81	Israel: (at false-positive rate 0.5%) 26±5 UK: (at false-positive rate 0.5%) 40±6
Birks et al. 2017	UK	2,550,119	Primary analysis: 0.776 (95% CI 0.771 to 0.781) Case-control group (age-matched): 0.583 (95% CI 0.574 to 0.591) Cohort: 0.781 (95% CI 0.772 to 0.791)	26.5 (95% CI 23.3 to 30.2; for a specificity of 99.5%)
Hornbrook et al. 2017	US	17,095	0.80±0.01	34.7 (95% CI 28.9 to 40.4; for a specificity of 99%)
Kinar et al. 2017	Israel	112,584	–	3‑percentile group: 10.9 (95% CI 7.3 to 16.2) 1‑percentile group: 21.8 (95% CI 13.8 to 34.2)
Abbreviations: AUC, area under the receiver operator curve; CI, confidence interval; OR, odds ratio.
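When reading the AUC values in Table 3, it may help to recall that the AUC equals the probability that a randomly chosen case has a higher risk score than a randomly chosen control, so 0.5 indicates no discrimination and 1.0 indicates perfect discrimination. The sketch below computes the AUC in this pairwise way on a small set of made-up scores. It is not the method used by the studies, and the function name and data are hypothetical.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the share of case-control pairs in which the case scores higher (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Made-up risk scores for illustration only.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]

auc = auc_from_scores(cases, controls)
print(f"AUC = {auc:.3f}")
```

On this reading, the age-matched case-control AUC of 0.583 is only a little better than chance, whereas the values of about 0.78 to 0.82 indicate moderate discrimination.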

Recent and ongoing studies

Prediction of findings at screening colonoscopy using a machine learning algorithm based on complete blood counts (ColonFlag): Robert J Hilsden, Department of Medicine, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada. Status: submitted for publication.
Computer-assisted flagging of individuals at high risk of colon cancer in a large health maintenance organization using the ColonFlag Test: R. Goshen, Medial EarlySign; Varda Shalev, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel. Status: submitted for publication.
Validation of the model's performance in the detection of colorectal cancer and precancerous findings, based on cancer registry and colonoscopy data, Kaiser Permanente North California, US. Status: submitted for publication.