3 Approach to evidence generation

This section presents an approach to generating evidence for artificial intelligence (AI) contouring. It considers how the approach will address the evidence gaps and highlights any strengths and weaknesses.

Most technologies do not have ongoing studies that will address the evidence gaps. The King's Technology Evaluation Centre is doing a study (due to complete by June 2024) with some of the technologies, which may address:

  • organ delineation and acceptability of the contour

  • time saving and resource use

  • performance in different anatomical sites.

3.1 Ongoing studies

Table 1 summarises the evidence gaps and ongoing studies that might address them.

Table 1 Summary of the evidence gaps and ongoing studies

| Evidence gaps and technologies | Time saving and resource use | Organ delineation and acceptability of the contour | Adverse effects of treatment | Performance in different anatomical sites and patient subgroups |
| --- | --- | --- | --- | --- |
| AI-Rad Companion Organs RT (Siemens Healthineers) | Evidence is available; ongoing study | Evidence is available; ongoing study | No relevant evidence identified | No relevant evidence identified |
| ART Plan (Thera-Panacea) | No relevant evidence identified; ongoing study | No relevant evidence identified; ongoing study | No relevant evidence identified | No relevant evidence identified; ongoing study |
| DLCExpert (Mirada Medical) | Evidence is available | Evidence is available | No relevant evidence identified | Limited available evidence |
| INTContour (Carina Medical) | No relevant evidence identified | Evidence is available | No relevant evidence identified | No relevant evidence identified |
| Limbus Contour (Limbus AI, AMG Medtech) | Evidence is available; ongoing study | Evidence is available; ongoing study | No relevant evidence identified | Evidence is available; ongoing study |
| MIM Contour Protege AI (MIM Software) | Evidence is available | Evidence is available | No relevant evidence identified | No relevant evidence identified |
| MRCAT Prostate plus Auto-contouring (Philips) | No relevant evidence identified | No relevant evidence identified | No relevant evidence identified | No relevant evidence identified |
| Mvision Segmentation Service (Mvision AI Oy, Xiel) | Evidence is available; ongoing study | Evidence is available; ongoing study | No relevant evidence identified | Evidence is available; ongoing study |
| RayStation (RaySearch) | Evidence is available; ongoing study | Evidence is available; ongoing study | No relevant evidence identified | Evidence is available; ongoing study |

Information about current evidence status is derived from the external assessment group's report; evidence that does not meet the scope and inclusion criteria is not included. AutoContour (Radformation) does not currently have any evidence that addresses the evidence gaps.

3.2 Data sources

Several data collections, each with different strengths and weaknesses, could potentially support evidence generation. NICE's real-world evidence framework provides detailed guidance on assessing the suitability of a real-world data source to answer a specific research question.

The Radiotherapy Data Set (RTDS) is the national standard for collecting radiotherapy data in the NHS. It currently collects data from all NHS Acute Trust providers of radiotherapy services in England. It would need to be modified to collect data addressing the evidence gaps, but this could take up to 2 years.

Local or regional data collections, such as the sub-national secure data environments, could be used to address the evidence gaps if they measure the outcomes specified in the evidence generation plan. Secure data environments are data storage and access platforms that bring together many sources of data, such as from primary and secondary care, to enable research and analysis. The sub-national secure data environments are designed to be agile and can be modified to suit the needs of new projects.

The quality and coverage of real-world data collections are of key importance when they are used to generate evidence. Active monitoring and follow-up through a central coordinating point is an effective and viable approach to ensuring good-quality data with high coverage.

3.3 Evidence collection plan

To address the evidence gaps, a before and after study is suggested. A before and after design allows comparison when there are differences between sites or departments, such as in processes, protocols or equipment. In a before and after study, data is collected and compared before and after implementing the AI contouring technologies in radiotherapy departments.

Data collection for a particular technology can be at a single centre or ideally across multiple centres.
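As an illustration only, the comparison at the heart of a before and after study could be sketched in Python as below. This is a minimal sketch, not a prescribed analysis method: the function name `summarise_before_after`, the choice of minutes as the unit, and the example values are all assumptions made for illustration and are not taken from any study.

```python
from statistics import mean, median


def summarise_before_after(before_minutes, after_minutes):
    """Compare contouring times (in minutes) from the two study periods.

    Hypothetical helper: both arguments are lists of per-case contouring
    times, one collected before and one after implementing the technology.
    """
    if not before_minutes or not after_minutes:
        raise ValueError("both periods need at least one observation")
    mean_before = mean(before_minutes)
    mean_after = mean(after_minutes)
    return {
        "mean_before": mean_before,
        "mean_after": mean_after,
        "median_before": median(before_minutes),
        "median_after": median(after_minutes),
        # A negative difference means less time was needed after implementation.
        "mean_difference": mean_after - mean_before,
        "relative_change": (mean_after - mean_before) / mean_before,
    }


# Made-up values for illustration only; these are not study data.
example = summarise_before_after([60, 50, 70], [30, 40, 35])
print(example)
```

A real analysis would also need to account for case mix, anatomical site and reviewer experience, which this sketch ignores.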

For qualitative outcomes, surveys or interviews could be used to assess people's experiences and views on the technologies' acceptability, performance and impact on productivity in routine clinical practice. Open-ended questions could be included to gather information on the potential of the technologies to improve current clinical practice.

3.4 Data to be collected

Outcome variables for data collection should include data for the technology and, where appropriate, the current standard of care (manual or atlas-based contouring). The following outcomes have been identified for collection through the suggested before and after studies:

Quantitative

Information to be collected before and after implementation:

  • Total time needed for contouring.

  • Average number of contours completed per hour per reviewer.

  • NHS band of the reviewer.

  • Characteristics of patients reviewed. For example, age, sex, ethnicity, height and weight or body mass index, and comorbidities that may make scans challenging to perform.

  • Adverse events and dosimetric analyses.

Information to be collected after implementation:

  • Acceptability of the contours, measured by a scale:

    • score 0: no edits needed

    • score 1: minor edits needed

    • score 2: moderate edits needed

    • score 3: major edits needed.

  • Training, implementation, and administrative costs.
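One possible way to record some of the quantitative items above, including the 0 to 3 acceptability scale, is sketched below in Python. The class and field names (`Acceptability`, `ContourRecord`, `reviewer_band`, `contouring_minutes`) and the example records are hypothetical and chosen only for illustration; they are not part of the RTDS or any existing data collection.

```python
from collections import Counter
from dataclasses import dataclass
from enum import IntEnum


class Acceptability(IntEnum):
    """Acceptability scale for AI-generated contours, as listed above."""
    NO_EDITS = 0
    MINOR_EDITS = 1
    MODERATE_EDITS = 2
    MAJOR_EDITS = 3


@dataclass
class ContourRecord:
    """Hypothetical record for one reviewed case after implementation."""
    reviewer_band: str          # NHS band of the reviewer
    contouring_minutes: float   # total time needed for contouring
    acceptability: Acceptability


def acceptability_distribution(records):
    """Return the proportion of records at each acceptability score."""
    counts = Counter(r.acceptability for r in records)
    total = len(records)
    return {
        level.name: (counts[level] / total if total else 0.0)
        for level in Acceptability
    }


# Made-up records for illustration only; these are not study data.
records = [
    ContourRecord("Band 7", 12.0, Acceptability.NO_EDITS),
    ContourRecord("Band 7", 20.0, Acceptability.MINOR_EDITS),
    ContourRecord("Band 8a", 35.0, Acceptability.MAJOR_EDITS),
    ContourRecord("Band 6", 15.0, Acceptability.MINOR_EDITS),
]
distribution = acceptability_distribution(records)
print(distribution)
```

Storing the score as an enumerated value rather than free text keeps the scale consistent across centres, which may make pooling data from multiple sites easier.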

Qualitative

  • Perceived impact on time to review and edit contours.

  • Information about other factors that may influence time saving in clinical practice.

  • Perceived ease of use.

  • Perceived acceptability of output and accuracy of contours.

  • Variability of contour accuracy in groups for whom contours may be more challenging to do.

  • How use of the technology may affect contouring skills.

  • Opinion on patient outcomes:

    • accurate organ delineation

    • accurate clinical target volume delineation

    • improvements in throughput

    • adverse events.

Information about the technologies

Information about how the technologies were developed and the effect of updates should also be collected. See the NICE evidence standards framework for guidance.

3.5 Evidence generation period

The evidence generation period will be 3 years to allow for setting up, implementation, data collection, analysis, and reporting.