3 Committee discussion

The diagnostics advisory committee considered evidence from several sources, including an external assessment report and an overview of that report, on computer-aided detection (CAD) software with artificial intelligence (AI)‑derived algorithms for detecting and measuring lung nodules in CT scan images. Full details are in the project documents for this guidance.

Improving detection of lung cancer

3.1 Clinical experts explained that lung nodules may be challenging for healthcare professionals to detect because of their small size, varying shape, and how close they are to other structures such as blood vessels in the lung. This may mean that sometimes a nodule will be missed by a clinician reviewing CT scans. Although most lung nodules are benign, some may be cancerous or develop into lung cancer. Using CAD software to assist reviewing CT scans for lung nodules could improve the detection of lung nodules and improve early diagnosis of lung cancer.

Anxiety while having CT surveillance

3.2 A patient expert said that people who are told they have a lung nodule may experience anxiety, especially if the CT scan was done without expecting to find lung nodules. Not knowing whether or not the nodule is benign may further increase the anxiety. Having more information as soon as possible is important for people with lung nodules and their families. The patient expert explained that when CT surveillance is needed to understand the nature of the lung nodule, anxiety can be spread over a long period. If a follow-up scan shows no signs of malignancy, the anxiety may reduce, but will not disappear. Because CT surveillance involves multiple CT scans, people may be further concerned about the repeated radiation exposure.

Need for software for targeted lung cancer screening

3.3 The committee noted that using AI‑derived CAD software that automatically detects and measures lung nodules could be helpful for targeted lung cancer screening. Use of CAD software is included in the protocol for NHS England's Targeted Lung Health Check programme. The clinical experts explained that it would be useful to understand which technologies are clinically and cost effective in this setting.

Clinical effectiveness

Technologies with no published evidence

3.4 The committee considered the available evidence for each technology. It noted that the external assessment group (EAG) review found no relevant published evidence in any of the populations for 5 of the 13 technologies: JLD-01K, Lung AI, Lung Nodule AI, qCT‑Lung and SenseCare‑Lung Pro. The committee recommended more research on Lung AI, qCT‑Lung and SenseCare‑Lung Pro (see section 4). At the time of writing, JLD‑01K and Lung Nodule AI were not available in the UK.

Accuracy of detecting lung nodules in screening

3.5 There were 5 studies on 4 different software (ClearRead CT, InferRead CT Lung, Veolity and VUNO Med-LungCT AI) that compared the accuracy of CT scan review with and without CAD software to detect lung nodules and reported per-person results from a screening population. The committee considered that reporting accuracy results per person instead of per nodule was important. This is because many people have more than 1 lung nodule identified in their CT scan images. When there are multiple nodules, clinical decisions are made at a person level, based on the largest nodule. Per-nodule results would only show whether nodules were missed or wrongly detected, not whether people were wrongly identified as having nodules or whether people with nodules were missed. Nearly all of the studies found that CT scan review with software was more sensitive but less specific than review without the software. In practice, this would mean that more lung nodules and potentially more cancers would be detected, but also that more people would go on to have CT surveillance. The clinical experts noted that it was not clear whether this was the case for both solid and subsolid nodules. Subsolid nodules may be especially hard for the software to detect in the low-dose CT scans used for targeted screening. The committee also noted that because none of the studies looked at more than 1 software, a direct comparison between different software was not possible. The committee concluded that further research was needed on how using AI‑derived CAD software alongside clinician review of CT scan images affects the accuracy of detecting, measuring and assessing the growth of lung nodules.
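The difference between per-nodule and per-person accuracy described in section 3.5 can be illustrated with a minimal sketch. It is not taken from any of the studies: the `PersonScan` record, its field names and the example data are all hypothetical, and it assumes (as a simplification) that a person counts as flagged if the reader reports at least 1 nodule, whether that detection is true or false.

```python
from dataclasses import dataclass


@dataclass
class PersonScan:
    """Hypothetical per-person record; field names are illustrative only."""
    true_nodules: int    # nodules present according to the reference standard
    detected_true: int   # true nodules the reader found
    detected_false: int  # nodule-like structures wrongly reported as nodules


def per_nodule_sensitivity(scans):
    """Share of all true nodules that were detected, pooled across people."""
    total = sum(s.true_nodules for s in scans)
    found = sum(s.detected_true for s in scans)
    return found / total if total else float("nan")


def per_person_accuracy(scans):
    """Sensitivity and specificity with each person counted once.

    Simplifying assumption: a person is 'flagged' if the reader reports
    at least one nodule, whether that detection is true or false.
    """
    tp = fn = fp = tn = 0
    for s in scans:
        has_nodule = s.true_nodules > 0
        flagged = (s.detected_true + s.detected_false) > 0
        if has_nodule and flagged:
            tp += 1
        elif has_nodule:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up example data, not taken from any study.
scans = [
    PersonScan(3, 2, 0),  # one nodule missed, but the person is still flagged
    PersonScan(2, 1, 0),  # one nodule missed, person still flagged
    PersonScan(1, 0, 0),  # only nodule missed, so the person is missed
    PersonScan(0, 0, 1),  # false detection, person wrongly flagged
    PersonScan(0, 0, 0),  # correctly reported as clear
    PersonScan(0, 0, 0),  # correctly reported as clear
]

print(per_nodule_sensitivity(scans))  # 3 of the 6 true nodules were found
print(per_person_accuracy(scans))     # 2 of 3 people with nodules flagged, 2 of 3 without cleared
```

In this made-up data, half of the nodules are missed but only 1 of the 3 people with nodules is missed, which is why the two ways of reporting can give different pictures of the same reads.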

Generalisability of evidence to clinical practice outside of screening

3.6 Only 1 study (on InferRead CT Lung) looked at accuracy in a population with signs and symptoms that suggest lung cancer. The results were similar to those from studies in the screening setting, but the committee noted that because there was only 1 study, performance of the AI‑derived CAD software when not used as part of screening (from here on, 'outside of screening') was uncertain. It considered whether evidence from screening may be generalisable to routine clinical practice. It noted that the prevalence of lung nodules would be expected to differ between populations. The clinical experts explained that people who are referred for a chest CT scan outside of targeted lung cancer screening, because of signs or symptoms that suggest lung cancer or for reasons unrelated to suspicion of lung cancer, are more likely to have other underlying lung conditions (for example, asthma, chronic obstructive pulmonary disease or granulomatous lung diseases). How common these conditions are may also depend on age and family background. The underlying lung conditions may make it harder for the software to differentiate nodules, especially subsolid nodules, from other nodule-like structures in the lungs, so that these structures are falsely detected as nodules. The clinical experts also noted that, unlike targeted lung cancer screening scans, which are always low-dose CT scans without contrast, chest CT scans done for other reasons use a standard dose and may be done with contrast. Further, the healthcare professionals reviewing targeted screening scans are radiologists specialised in reviewing chest CT images for lung nodules, whereas in other settings, the level of specialisation and experience of the healthcare professionals reviewing the scan may vary. The committee concluded that the evidence from a screening population is unlikely to be generalisable to people who have a chest CT scan for other reasons.

Populations that could particularly benefit from the technologies

3.7 The committee considered groups of people that could particularly benefit from the software. It recognised that people with underlying lung conditions, and people whose family background means they are more likely to have subsolid nodules, may be at a higher risk of not having nodules detected and lung cancer missed. If using the software helped to improve detection, it would be particularly beneficial to these groups.

Time to read and report a CT scan

3.8 The EAG's review included 9 studies that looked at time to read and report a CT scan. All of the studies suggested that review was faster with AI‑derived CAD software than without. Some studies suggested that less experienced readers may save even more time than experienced readers. But the committee noted that the comparisons in the studies were done at least partly in laboratory-like conditions, rather than in routine clinical practice. The clinical experts explained that in targeted screening, scan review time is protected. But in routine clinical practice, reviewing scans may be less continuous because of interruptions. Using an additional tool could help with reviewing but could also slow it down. The time taken may also depend on how well the software integrates into the radiologists' workflow within the picture archiving and communication system (PACS) in which CT scan images are reviewed and reported. The EAG pointed out that people reporting scans may behave differently in research conditions than in clinical practice because they know that their decisions will not affect health outcomes. The committee concluded that it is uncertain whether using software would speed up reading and reporting a CT scan outside of targeted screening. There is also a need to confirm whether the suggested time advantage applies in the targeted screening setting.

Cost effectiveness

Exploratory model

3.9 The EAG built a health economic model to evaluate the cost effectiveness of AI‑derived CAD software for detecting and measuring lung nodules from CT scan images. The model captured targeted screening and clinical practice in which people might have a chest CT scan because of signs or symptoms that suggest lung cancer or for other unrelated reasons. Because there was not enough clinical-effectiveness evidence on any of the individual technologies, the model combined data from various sources to assess a hypothetical software. For data that was needed but not available, the EAG simulated model inputs using other sources of information. The committee concluded that because of the limitations in the data available for modelling, the model should be considered exploratory and the model results indicative only.

Software costs

3.10 The EAG used a software cost of £2 per scan in the model for targeted screening, and £3.34 per scan for chest CT scans because of signs or symptoms that suggest lung cancer or for other unrelated reasons. The model did not include set-up and maintenance fees, but the EAG did a sensitivity analysis on software costs ranging from £1.50 to £6 per scan for targeted screening and from £2.67 to £6 per scan outside of targeted screening. This analysis showed that the software cost was not particularly influential.

Influential model inputs

3.11 The committee recalled that data on accuracy (a model input that had a substantial effect on the model results) was especially limited in people having a chest CT scan because of signs or symptoms that suggest lung cancer or for other unrelated reasons. The clinical experts also pointed out that the prevalence of lung nodules, another key model input, for screening (50.9%) and for people with symptoms that suggest lung cancer (94.9%) was higher than they would expect. They stated that the prevalence in people with symptoms that suggest lung cancer was uncertain. The committee concluded that further research is needed to assess the prevalence of lung nodules in people who have a chest CT scan because of signs or symptoms that suggest lung cancer.
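To show why prevalence is an influential input, the following minimal sketch computes expected per-person outcomes for a cohort of 1,000 people at different prevalences. The prevalences 50.9% and 94.9% are the model values quoted in section 3.11. Everything else is an assumption made for illustration: the sensitivity and specificity values are hypothetical, the 30% prevalence is a hypothetical lower comparator, and the function is not the EAG's model.

```python
def expected_counts(n_people, prevalence, sensitivity, specificity):
    """Expected per-person outcomes for a cohort, given prevalence and accuracy."""
    with_nodules = n_people * prevalence
    without_nodules = n_people - with_nodules
    true_pos = with_nodules * sensitivity
    true_neg = without_nodules * specificity
    return {
        "true_positives": true_pos,
        "false_negatives": with_nodules - true_pos,
        "true_negatives": true_neg,
        "false_positives": without_nodules - true_neg,
    }


# ASSUMPTIONS: sensitivity and specificity are hypothetical, not study results.
SENSITIVITY = 0.90
SPECIFICITY = 0.80

# 0.509 and 0.949 are the model prevalences quoted in section 3.11;
# 0.30 is a hypothetical lower value included only for comparison.
for prevalence in (0.949, 0.509, 0.30):
    counts = expected_counts(1000, prevalence, SENSITIVITY, SPECIFICITY)
    print(prevalence, {k: round(v, 1) for k, v in counts.items()})
```

With accuracy held fixed, a lower prevalence means more people without nodules and so more people wrongly flagged per 1,000 scans. This is one reason an overestimated prevalence could affect the model results.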

Potential for cost effectiveness for targeted screening

3.12 In the EAG's model, for targeted screening, using CAD software alongside clinician review of CT scan images detected a larger number of people with nodules that needed follow up compared with clinician review alone. It also detected slightly more lung cancers. Using the software alongside clinician review was slightly less costly and more effective than clinician review alone. The committee recalled that because of a lack of evidence, the model results were only indicative. But it noted that changing different model assumptions did not change the direction of the model results (scan image reviews with software stayed slightly less costly and more effective than scan image reviews without the software). The committee concluded that the software had the potential to be cost effective when used alongside clinician review for targeted screening.
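A result that is both less costly and more effective is usually described in health economics as 'dominant'. The sketch below shows, under stated assumptions, how an incremental result can be classified on the cost-effectiveness plane. The function name and the example numbers are hypothetical and are not outputs of the EAG's model.

```python
def classify(delta_cost, delta_qaly):
    """Classify an intervention against its comparator.

    delta_cost: incremental cost (intervention minus comparator)
    delta_qaly: incremental QALYs (intervention minus comparator)
    """
    if delta_cost == 0 and delta_qaly == 0:
        return "equivalent"
    if delta_cost <= 0 and delta_qaly >= 0:
        return "dominant"    # no more costly and no less effective
    if delta_cost >= 0 and delta_qaly <= 0:
        return "dominated"   # no less costly and no more effective
    # Remaining cases are trade-offs, summarised by the
    # incremental cost-effectiveness ratio (ICER).
    icer = delta_cost / delta_qaly
    if delta_qaly > 0:
        return f"more costly, more effective: ICER {icer:.0f} per QALY gained"
    return f"less costly, less effective: ICER {icer:.0f} saved per QALY lost"


# Hypothetical numbers for illustration only.
print(classify(-5.0, 0.002))   # slightly less costly and more effective
print(classify(12.0, 0.001))   # a trade-off that needs an ICER
```

The direction of the result matters here: the committee's observation that software stayed less costly and more effective under different assumptions corresponds to the result staying in the 'dominant' category.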

Quality-adjusted life year loss outside of targeted screening

3.13 Based on the EAG's model, outside of targeted screening, using CAD software alongside clinician review of CT scan images was slightly less effective than clinician review alone. The clinical experts said that some quality-adjusted life year (QALY) loss in this setting could be expected. Because underlying conditions affecting the lungs are more common in this setting, the software may more often flag nodule-like structures as lung nodules. This could lead to more people unnecessarily being offered CT surveillance, which was associated with a disutility in the model to reflect patient anxiety (see section 3.2). The committee noted that, as in targeted screening, using the software could also lead to more cancers being detected. But outside of targeted screening, the clinical benefits of improved detection may not outweigh the disutility associated with more people having CT surveillance. The committee noted that changing some of the model assumptions affected the extent of the QALY loss, mostly reducing it. It recalled the weak clinical-effectiveness evidence in this setting (see section 3.6) and concluded that the QALY loss was uncertain but possible. It further concluded that because there was not enough evidence, it was not possible to say whether using the software outside of targeted screening could be cost effective. The committee recommended further research in these settings (see section 4).
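The trade-off described in section 3.13 can be written as a simple balance between QALYs gained from extra cancers detected and QALYs lost through the disutility of extra CT surveillance. The sketch below is a deliberately simplified illustration of that reasoning, not the EAG's model. All parameter names and values are hypothetical.

```python
def net_qaly_change(extra_cancers_detected, qaly_gain_per_cancer,
                    extra_people_in_surveillance, disutility_per_person):
    """Net QALY change from adding software, per cohort.

    Benefit: QALYs gained from additional cancers detected.
    Harm: QALYs lost through anxiety-related disutility of extra surveillance.
    """
    benefit = extra_cancers_detected * qaly_gain_per_cancer
    harm = extra_people_in_surveillance * disutility_per_person
    return benefit - harm


# ASSUMPTIONS: every number below is made up to show the two directions
# the balance can take; none comes from the guidance or the EAG's model.
few_false_detections = net_qaly_change(2, 0.5, 40, 0.01)
many_false_detections = net_qaly_change(1, 0.5, 80, 0.01)

print(few_false_detections)   # positive: detection benefit outweighs disutility
print(many_false_detections)  # negative: disutility outweighs detection benefit
```

In this sketch, the sign of the result depends on how many extra people enter surveillance for each extra cancer detected, which is why falsely detected nodule-like structures are central to the uncertainty in this setting.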

Conclusions

3.14 The committee recalled that the model results indicated that the AI‑derived CAD software had the potential to be cost effective in targeted screening. It acknowledged that the software could be helpful for targeted lung cancer screening. It noted that the limited evidence did not allow the committee to decide which technologies are the most clinically and cost effective in this setting. It considered the need for the software, the model results, and the robustness of the results to changes in model inputs and assumptions. It concluded that although there was not enough evidence to recommend the software for targeted lung cancer screening, centres may use the software alongside clinician review of CT scan images as part of targeted lung cancer screening if further evidence is generated. This is to make sure the potential benefits are realised in practice for people having screening and for clinicians using the software, and to allow comparisons between the different software (see section 4). The committee further recalled that outside of targeted screening, it is possible that using the software could lead to more nodule-like structures being falsely detected as lung nodules and more people unnecessarily having CT surveillance. The patient expert reminded the committee that having CT surveillance may lead to anxiety. The committee concluded that more research is needed before the software is used outside of targeted screening (see section 4). It further concluded that although there are published studies for some software, all the software needed more evidence.

Research considerations

3.15 The committee recommended more research on the AI‑derived CAD software technologies (see section 4). Studies should include groups of people similar to those seen in the NHS. Ideally, studies should compare more than 1 software. The committee noted that the DART project, an ongoing research project that collects data from NHS England's Targeted Lung Health Checks programme, may be a helpful data source for comparing different technologies for targeted screening. Auditing before and after introducing the software into a new centre, or a cluster randomised controlled trial, could help generate further evidence on the potential benefits of the software.