Expert comments

Comments on this technology were invited from clinical experts working in the field and relevant patient organisations. The comments received are individual opinions and do not represent NICE's view.

All experts were familiar with artificial intelligence (AI) for imaging, and 3 had used AI technologies for imaging before. Two experts were currently using AI technology: 1 to assess bone age in children and 1 in lung cancer screening. None had been involved in research and development of AI technologies. None of the experts believed that clinical AI software was in routine use in the NHS (except in some centres running the national lung cancer screening programme). The technology is relatively new and is still undergoing validation and development.

Level of innovation

Three experts stated that AI technology for chest imaging was a novel concept in the NHS. One expert noted that the technology could introduce a paradigm shift in UK radiology practice. Two experts noted that the technology could improve patient safety by reducing the risk of abnormalities being missed.

Two experts were aware of other competing technologies and highlighted variability in the field. An expert highlighted that many CT scanner vendors provide similar AI technologies embedded in their software packages. One expert noted that comparing the different technologies was challenging because the concept of 'deep learning' is broad, and a detailed description of each technology is needed for a comprehensive comparison. Performance of the technologies may also vary, with the results of nodule detection and measurement differing significantly across different AI software. Another expert highlighted that there is a broad range of AI technologies emerging in all radiological disciplines for both diagnostic and therapeutic purposes.

Potential patient impact

All 4 experts noted that the technology could improve diagnostic accuracy in image interpretation. One expert felt that AI should be mandatory for any CT imaging of the thorax to minimise the risk of missing early-stage lung cancer or lung metastases. For example, implementing automated lung nodule detection tools before a radiologist interprets the scans may reduce human error and so significantly reduce the number of delayed diagnoses of lung cancer or metastases. One expert suggested that using AI software may increase the abnormality detection rate, potentially through increased sensitivity, but at the cost of reduced specificity. Three experts suggested that the technology may help improve triaging or speed up time to diagnosis. However, 2 experts explained there may be limitations to this. One noted that from experience the time spent by radiologists to report each case may increase, because results from the AI analysis need to be taken into account in addition to the time taken to do routine reporting. Another expert cited evidence from a study showing that algorithms designed to help prioritise scans based on perceived abnormality may produce a trade-off, with other people waiting longer for their results (Annarumma et al. 2019). One expert noted that AI technology may decrease interobserver and intraobserver variability in diagnostic work. All experts felt that the technology has potential to change the current pathway or clinical outcomes by improving interpretation accuracy and reporting times. Earlier diagnosis may result in less invasive treatment or allow curative treatment (which may not be possible with a late diagnosis). However, 1 expert noted that identification of clinically insignificant disease may unnecessarily increase patient anxiety and demand on the CT scanning service. 
One expert suggested that although tuberculosis assessment AI is unlikely to add significant value to UK radiology practice, assessment of other pathologies such as pneumothorax, pleural effusion, and nodules would be useful.
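
The trade-off between sensitivity and specificity that one expert described in this section can be illustrated with a short sketch. It is not taken from any of the technologies in this briefing. The function name, the abnormality scores, the labels and the thresholds are all hypothetical and were made up for illustration only. The sketch assumes the software gives each scan a score and flags the scan as abnormal when the score reaches a chosen threshold.

```python
from typing import List, Tuple


def sensitivity_specificity(
    scores: List[float], labels: List[bool], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are flagged.

    Assumes at least one truly abnormal and one truly normal scan,
    otherwise the divisions below would fail.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y and s >= threshold)      # abnormal, flagged
    fn = sum(1 for s, y in pairs if y and s < threshold)       # abnormal, missed
    tn = sum(1 for s, y in pairs if not y and s < threshold)   # normal, not flagged
    fp = sum(1 for s, y in pairs if not y and s >= threshold)  # normal, flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores for 10 scans: the first 4 are truly abnormal.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05]
labels = [True, True, True, True, False, False, False, False, False, False]

strict = sensitivity_specificity(scores, labels, threshold=0.65)
lenient = sensitivity_specificity(scores, labels, threshold=0.35)

print("strict threshold: ", strict)
print("lenient threshold:", lenient)
```

With these made-up numbers, lowering the threshold flags every abnormal scan but also flags more normal scans, so sensitivity rises while specificity falls. Real performance would depend on the software and the population scanned.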

Potential system impact

Experts thought that improved accuracy and speed in interpreting radiographs and in radiology reporting would be the main system benefits of the technology. One expert noted that productivity may improve in settings such as lung cancer screening if the technology could provide an initial reading, avoiding the need for 2 reading radiologists. However, there is no evidence for this. Another expert advised that over-diagnosis was a risk with adopting AI for CT, estimating that around 70% of people referred for a chest CT present with a lung nodule, and only a very small minority of these nodules have clinical implications. One expert highlighted that radiologists and radiographers can feel guilt from missed diagnoses, and potential improvements in accuracy may help reduce these effects.

Expert opinions varied about the cost impact of AI technologies compared with standard care. One noted that installation costs would be significant and included software hosting and support, integration with existing software, and security. Another highlighted the need for additional specialised staff, and that there may be an increase in the time radiologists and radiographers spend on each case reported. Two experts thought that costs would increase (for example, because of increased need for scanners, radiography staff, radiologists, lung nurses and chest clinicians), but that these would be offset by savings in efficiency. Another expert stated that using AI technology could result in cost savings if the cost is offset by earlier disease diagnosis. One noted that improved accuracy would also help offset costs. One expert did not anticipate significant resource gains from these technologies. Finally, 1 expert noted that there were diverse opinions about whether the technology would enhance the function of radiology departments or potentially replace radiologists.

Experts noted the need for a robust IT structure before implementing AI software. At installation, software hosting and bandwidth may need capital investment, and the AI software would need to be effectively integrated into clinical systems (usually picture archiving and communication system [PACS] or electronic health record). Specific training may be needed to understand limitations and scope of the technology function. One expert suggested that training needs may be fairly modest, and depend on product design and radiologist or radiographer experience levels.

Experts highlighted a number of potential safety and regulatory issues. A fundamental safety issue is the validity of training data and its application to the local population. AI software is based on machine learning from large datasets and will therefore be influenced by the characteristics of the populations in those datasets. One expert described an example of algorithm training from people in China being unlikely to be fully applicable to a typical NHS population for assessing tuberculosis (because prevalence differs between these populations). One expert suggested that there may be data security issues if software companies want to use NHS patient data to develop their products. Another expert highlighted that there is significant variability in the methods used to quantify lung nodules, which needs to be accounted for in assessment and guidelines involving AI for volumetric quantification of lung nodules.

One expert described 2 potential cognitive biases that may affect interpretation of AI output. Firstly, in 'automation bias' clinical staff may be overconfident in the results of an automated system, which may bias clinical opinion negatively if the AI output is not accurate. Secondly, the expert noted the 'satisfaction of search' bias when a reader may carry out an incomplete assessment. AI technology may promote this error if the reader incorrectly believes the image has been analysed. In relation to this, another expert added that de-skilling of radiologists (and reporting radiographers) may be a factor after AI implementation.

Two experts mentioned discussions around regulation and legal responsibilities. One noted that the regulatory environment was complex (including the Medicines and Healthcare products Regulatory Agency and Care Quality Commission oversight) and that the current framework may not be adequate to keep up with the sector.

One expert noted that the evidence about AI technology is still limited. Further research into multicentre cohorts would be needed to understand risks associated with implementation in clinical practice.

General comments

One expert provided advice based on their own experiences with AI software. Firstly, when automated analysis is available, the technology may be met with considerable scepticism from some staff, who then request older 'manual' methods. Clinical engagement is vital in developing clinical AI technologies, and this must include training in the software scope and limitations for all clinical staff. Secondly, automated assessment may reveal areas of inadequate clinical practice, for example when a clinician has used an incorrect method for image interpretation. In such cases, the software may incorrectly be labelled inaccurate. Thirdly, implementation of clinical AI software will need significant involvement of NHS IT staff for hosting software, firewall configuration, software integration and troubleshooting. Storage of any metadata associated with the AI image analysis will need to be considered. Ideally these data should be stored long term in NHS systems. Metadata may be lost if stored in a vendor's software storage, or potentially at the end of a contract when transferring from one vendor to another.

Another expert noted that thoracic imaging scans can show numerous changes within the lung, mediastinum, chest wall or visualised portions of the neck and abdomen. They suggested the AI technologies may not be properly designed or tested for analysis of the chest wall, neck and upper abdomen.

Other considerations

All experts thought a large number of people would be eligible for chest AI technology in the NHS. One expert noted that over 5 million CT scans are done every year in the UK, and a large proportion of those would cover the thorax. AI software could be applied to all acute chest CT scans. For example, nodule volumetry could apply to all cases of lung cancer and lung metastatic disease.

All experts felt that, at least in the short to medium term, this technology would be an addition to current standard care. One expert suggested that in 7 to 10 years, this technology may begin to replace current standard care.

In terms of the practical aspects of the technology, 1 expert indicated that training may be needed to understand and use the technology appropriately. Another highlighted that outsourcing scans to be analysed by AI software may be of concern. One expert noted that the AI technology would need to be seamlessly integrated into software so that the AI interpretation appears in the same report box as the standard X-ray or CT report. The expert noted that technology that needed 'extra mouse clicks' may not be used.

All experts suggested a number of potential barriers to this technology being adopted. Factors included clinical governance and costs, difficulty integrating the technology into clinical pathways, and lack of IT capacity for integration (with potential issues of incompatibility between the AI technology and existing IT infrastructure). Clinical acceptance was mentioned as a significant barrier, but 1 expert noted that this may change over time with wider acceptance of AI technologies in different fields. Another suggested that the lack of robust clinical evidence on the broader impact of the technologies also affects the translation from research to clinical practice.

Two experts were not aware of further evidence for the technology included in this briefing. Other experts noted that there were various developing research projects for AI in thoracic imaging, including tools for quantitative analysis of diffuse interstitial lung, cardiovascular and pleural diseases.

All experts proposed potential research studies to address uncertainties in the evidence base. These included post-marketing studies of effectiveness to independently show the use of these technologies for NHS populations. One expert noted that the evidence base for the technologies in this briefing would benefit from more publications that were independent of the software companies and had formal peer review. One expert specified that in vivo research into the diagnostic accuracy and variability of measurement methods would be helpful. They suggested further assessments would benefit from being randomised, prospective, blinded, UK-based, controlled and well powered.

All experts felt that NICE guidance on chest AI technologies would be very useful or crucial. One expert suggested the techniques, limitations and scope of these technologies are poorly understood. Another expressed concerns about uncontrolled introduction of AI software. One noted that it may be too early in the life cycle of the technologies to produce a fully informed assessment.