3 Clinical evidence

Summary of clinical evidence

3.1 The key clinical outcomes for Virtual Touch Quantification (VTq) presented in the decision problem were:

  • correlation in assessment of stage of liver disease and stage of fibrosis using METAVIR score

  • sensitivity and specificity (using area under receiver operating characteristic [AUROC] curve) in assessment of liver fibrosis

  • use of antiviral drugs

  • quality of life measures

  • hospital bed usage and length of stay

  • need for liver biopsy

  • device‑related adverse events.

3.2 The company identified 23 published papers as suitable for full‑text review. No unpublished studies were identified. After review, the company excluded 12 papers that were conference abstracts with insufficient information. Of the remaining 11 papers presented as the clinical evidence for VTq, 10 reported case‑control observational studies and 1 was a meta‑analysis of 8 studies.

3.3 The External Assessment Centre reviewed the literature presented by the company in its submission. It considered that 8 of the 11 papers included by the company should be excluded from further assessment because they reported on studies with overlapping cohorts. The External Assessment Centre carried out a further literature search using revised search terms and found an additional 7 papers; in total, it considered 10 papers to be relevant to this evaluation.

3.4 Seven of the papers evaluated VTq in people with hepatitis C and 3 evaluated VTq in people with hepatitis B. Five studies compared VTq with transient elastography and liver biopsy and 5 compared it with liver biopsy only. Optimal cut‑off values for VTq measurements were calculated to classify fibrosis stages by METAVIR score. Most studies describe VTq as acoustic radiation force impulse (ARFI) imaging carried out on a Siemens Acuson S2000 ultrasound machine.
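To illustrate how a cut‑off value turns a continuous VTq shear‑wave velocity into a fibrosis classification, the following Python sketch computes sensitivity and specificity at a given cut‑off and then picks the cut‑off that maximises Youden's index. The selection criterion (Youden's index), the function names and the data are assumptions made for illustration only; the studies summarised here may have derived their cut‑offs differently, and the numbers below are invented rather than taken from any study.

```python
def sens_spec(velocities, diseased, cutoff):
    """Sensitivity and specificity when velocity >= cutoff is called positive."""
    pairs = list(zip(velocities, diseased))
    tp = sum(1 for v, d in pairs if d and v >= cutoff)
    fn = sum(1 for v, d in pairs if d and v < cutoff)
    tn = sum(1 for v, d in pairs if not d and v < cutoff)
    fp = sum(1 for v, d in pairs if not d and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(velocities, diseased):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    Youden's J = sensitivity + specificity - 1. Using J is an assumption
    of this sketch, not a method stated in the guidance.
    """
    best = None
    for c in sorted(set(velocities)):
        se, sp = sens_spec(velocities, diseased, c)
        j = se + sp - 1
        if best is None or j > best[0]:
            best = (j, c, se, sp)
    return best[1], best[2], best[3]


# Hypothetical shear-wave velocities (m/s) and biopsy results (True = F2 or higher).
velocities = [0.9, 1.0, 1.1, 1.2, 1.3, 1.5, 1.7, 1.9, 2.4, 2.8]
f2_or_more = [False, False, False, True, False, False, True, True, True, True]

cutoff, se, sp = best_cutoff(velocities, f2_or_more)
print(f"cut-off {cutoff} m/s: sensitivity {se:.2f}, specificity {sp:.2f}")
```

Lowering the cut‑off in this sketch raises sensitivity at the expense of specificity, which is the trade‑off that the reported cut‑off values balance.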

3.5 Chen et al. (2012) carried out a prospective observational study evaluating ARFI (VTq) to measure fibrosis in 127 people with chronic hepatitis C attending a liver centre in Taiwan. ARFI measurements were compared with liver biopsy and blood tests (FibroTest) for staging of fibrosis. Necro‑inflammatory activity was also measured. Histological fibrosis staging was done using METAVIR scoring by a pathologist blinded to the ARFI and FibroTest results. The Spearman correlation coefficient between ARFI and liver biopsy was 0.696 (p<0.001). The AUROC curve measures for ARFI were: 0.847 for F1 compared with F2–4 (95% confidence interval [CI] 0.779 to 0.914); 0.902 for F1–2 compared with F3–4 (95% CI 0.835 to 0.97); and 0.831 for F1–3 compared with F4 (95% CI 0.723 to 0.939). The authors reported that the degree of necro‑inflammatory activity artificially raised the severity of fibrosis detected by ARFI, but concluded that ARFI was a promising alternative technology to measure liver stiffness.
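The two statistics reported throughout these studies, the Spearman correlation coefficient and the AUROC, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and data are hypothetical, ties are handled with average ranks, and the AUROC is computed as the proportion of diseased/non‑diseased pairs in which the diseased person has the higher reading (ties counting as half). It does not reproduce the confidence interval calculations used in the papers.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def auroc(scores, positive):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, p in zip(scores, positive) if p]
    neg = [s for s, p in zip(scores, positive) if not p]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical velocities (m/s) and METAVIR stages from biopsy.
velocities = [0.9, 1.0, 1.1, 1.2, 1.3, 1.5, 1.7, 1.9, 2.4, 2.8]
stages = [0, 0, 1, 2, 1, 1, 2, 3, 4, 4]

print(spearman(velocities, stages))
print(auroc(velocities, [s >= 2 for s in stages]))  # F0-1 compared with F2-4
```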

3.6 The paper by Friedrich‑Rust et al. (2014) is a published abstract from a conference poster presentation reporting findings from a prospective international multicentre study. The study examined the use of ARFI (VTq) compared with transient elastography for fibrosis staging in 253 people with chronic hepatitis C, using liver biopsy as a reference method. Each person had ARFI, transient elastography and blood tests. The extent of fibrosis was staged from liver histology using METAVIR scoring by a single pathologist. The authors did an intention‑to‑diagnose analysis including 247 people and a per‑protocol analysis including 182 people. They reported that both ARFI and transient elastography correlated significantly with the histological fibrosis staging and that no statistically significant differences were found between ARFI and transient elastography for identifying fibrosis at stage F2 or higher in the per‑protocol analysis. The authors concluded that ARFI and transient elastography are comparable methods for non‑invasive fibrosis staging.

3.7 Friedrich‑Rust et al. (2013) report findings from a prospective, international multicentre study examining ARFI (VTq) used to assess liver fibrosis in people with chronic hepatitis B. In the study, 131 people attending hospitals in 3 European centres were recruited consecutively and tested with ARFI to assess the extent of fibrosis. Of the 131 people, 105 also had transient elastography (FibroScan). Liver biopsy was used as a reference method for histological assessment using METAVIR scoring, and blood tests were taken to confirm the diagnosis of chronic hepatitis B. Following exclusions because of invalid biopsy or ARFI measurements, data from 114 people were included in the final analysis. Of those, 92 also had transient elastography and were included in an intention‑to‑diagnose analysis. A per‑protocol analysis was done using data for 88 people who had valid ARFI and transient elastography measurements. Diagnostic accuracy was determined by AUROC curves. Both ARFI and transient elastography correlated significantly with liver biopsy results; the Spearman correlation coefficient was 0.415 (p<0.001) for ARFI and 0.556 (p<0.001) for transient elastography. The diagnostic accuracy of ARFI was 0.66 for mild fibrosis (F1), 0.73 for moderate fibrosis (F2), 0.94 for severe fibrosis (F3) and 0.97 for liver cirrhosis (F4). No statistically significant differences were found between ARFI and transient elastography in either the intention‑to‑diagnose or per‑protocol analyses.

3.8 Kuroda et al. (2010) carried out a prospective diagnostic accuracy study for ARFI (VTq) used in 70 people in Japan; 30 with chronic hepatitis C, 30 with liver cirrhosis and hepatitis C, and 10 healthy controls. The assessment of fibrosis by ARFI was compared with blood tests for serum markers of liver function. Liver biopsy for METAVIR staging was done for 19 patients. Mean shear‑wave velocity was 2.67±1.18 m/s in the liver cirrhosis group, 1.33±0.54 m/s in the chronic hepatitis C group and 0.99±0.21 m/s in the control group. The authors reported that shear‑wave velocity measured by ARFI was significantly higher in the liver cirrhosis group (p<0.001) than in the chronic hepatitis C group, and significantly higher in the chronic hepatitis C group than in the control group (p<0.0023). Mean shear‑wave velocity in each stage of fibrosis was: 1.09±0.22 m/s for F0–1; 1.24±0.52 m/s for F2; 1.61±0.79 m/s for F3; and 2.35±1.11 m/s for F4. ARFI measurements correlated significantly with fibrosis staging (r=0.9772, p=0.002) and all except 1 of the serum marker test results. ARFI showed better diagnostic accuracy for liver cirrhosis than the serum marker tests (AUROC: 0.930, no CI reported). The most appropriate cut‑off value for shear‑wave velocity was judged to be 1.59 m/s (sensitivity 95%, specificity 83%).

3.9 Liu et al. (2014) explored the diagnostic accuracy of ARFI (VTq) compared with transient elastography and a biochemical test that determines the aspartate aminotransferase‑to‑platelet ratio index (APRI), in 95 people with hepatitis B and 16 healthy volunteers. All 95 people with hepatitis had a liver biopsy to stage fibrosis using METAVIR scoring. The authors developed an optimal linear combination of the 3 intervention methods and evaluated its accuracy. Results were analysed for 108 people; 3 were excluded because of transient elastography failure. ARFI and transient elastography correlated strongly with histological staging (r=0.85, p<0.001 for ARFI; r=0.81, p<0.001 for transient elastography) and APRI correlated moderately (r=0.63, p<0.001). The AUROC curve results reported for ARFI were 0.91 for F2 and 0.96 for F4, and the results for transient elastography were 0.87 for F2 and 0.96 for F4. The authors compared the accuracy of the combined methods against their individual accuracy, and found that accuracy was superior when they were combined, particularly for diagnosis of moderate fibrosis (F2) and cirrhosis (F4).

3.10 Nishikawa et al. (2014) investigated the correlation between ARFI and fibrosis stage as well as other factors including BMI, hyaluronic acid blood level, gamma‑glutamyltranspeptidase level and inflammation. ARFI (VTq) was used in 108 people with chronic hepatitis C attending hospital in Japan. All patients had liver biopsy for histological staging of fibrosis using METAVIR scoring, by assessors blinded to the clinical data. The investigators carrying out ARFI and clinical tests were blinded to the histological data. Multiple regression analysis showed that ARFI correlated significantly with fibrosis stage (b=0.1865, p<0.0001) and hyaluronic acid levels (b=0.0008, p<0.0039) independently, in all patients. ARFI correlated significantly with BMI in F≤1 fibrosis, with gamma‑glutamyltranspeptidase level in F2 fibrosis, and with fibrosis stage and hyaluronic acid levels in F3 and F4 fibrosis, indicating that these factors could affect ARFI measurements. ARFI did not correlate with inflammation.

3.11 Rizzo et al. (2011) report findings from a study exploring the accuracy of ARFI (VTq) compared with transient elastography in people with chronic hepatitis C, using liver biopsy as a reference standard. In the study, 139 people were recruited consecutively from 2 hospitals in Italy and each had both ARFI and transient elastography as well as liver biopsy for histological staging. Fibrosis staging was done using METAVIR scoring by an assessor blinded to clinical data. No invalid measurements were reported for ARFI, but transient elastography measurements were invalid in 9 people (6.5%). Using pairwise AUROC analysis, ARFI was significantly more accurate than transient elastography for diagnosing moderate (or higher) and severe (or higher) fibrosis (F2: 0.86 compared with 0.78, p=0.024; F3: 0.94 compared with 0.83, p=0.002) but not for cirrhosis (F4: 0.89 compared with 0.80, p=0.09). Partial AUROC analysis showed that ARFI was statistically significantly more accurate than transient elastography for all stages of fibrosis.

3.12 An international multicentre study was carried out by Sporea et al. (2012a) in 10 centres across 5 countries in Europe and Asia. Liver biopsy (using METAVIR scoring) and ARFI (VTq) measurements of fibrosis were compared for 911 people with chronic hepatitis C. A subset of 400 people also had transient elastography and their results were compared with ARFI and biopsy. Diagnostic accuracy was assessed using AUROC curves. ARFI correlated significantly with liver biopsy staging (Spearman correlation coefficient r=0.654, p<0.0001). In the subgroup having transient elastography and ARFI, the overall correlation with liver biopsy staging was reported as being similar for both ARFI and transient elastography (r=0.689, p<0.001 and 0.728, p<0.001 respectively). The number of people with reliable measurements was significantly higher for ARFI (98.8%) compared with transient elastography (93.7%; p=0.003). The same reliability criteria were used for both ARFI and transient elastography, but different numbers of measurements were used; for ARFI, between 5 and 10 measures were used depending on the centre, and for transient elastography all centres used 10 measurements. The authors reported that ARFI was less effective than transient elastography in predicting liver cirrhosis (F4; AUROC 0.885 compared with 0.932; p=0.01). However, for moderate or severe fibrosis (F2–3), ARFI and transient elastography showed equivalent effectiveness. The authors also noted that the cut‑off levels for ARFI to determine fibrosis at stages F2 and F4 were different for European and Asian people. It is not clear how these subgroups were identified. The authors noted that more people in the Asian group either did not have fibrosis or had mild fibrosis (F1), and that people in the Asian group were older and had a lower mean BMI than those in the European group.

3.13 A study done in Japan by Yamada et al. (2014) evaluated the accuracy of ARFI (VTq) in assessing liver fibrosis in people with chronic hepatitis C as well as the association between ARFI and the response to antiviral therapy. Of the 124 people enrolled in the study, 94 had genotype 1 hepatitis C virus, 46 of whom had antiviral pegylated interferon and ribavirin combination therapy. Although not stated, it can be assumed that the remainder had genotype 2 hepatitis C virus, 15 of whom had antiviral therapy. Liver biopsy with histological analysis was used to determine fibrosis stage. Forty (30%) people were judged to have moderate (F2) fibrosis. ARFI was found to have a strong correlation with fibrosis stage (Pearson's r=0.764, p<0.001). People with the genotype 1 hepatitis C virus and less severe fibrosis (indicated by ARFI measurements of less than 1.40 m/s) showed a better response to treatment, indicating that ARFI could have some benefit in predicting response. ARFI could not predict treatment response in people with the genotype 2 hepatitis C virus.

3.14 Ye et al. (2012) assessed the performance of ARFI (VTq) to measure liver and spleen stiffness in 204 people with chronic hepatitis B and 60 healthy volunteers. Of those with hepatitis B, 66 had liver biopsy and 138 had been diagnosed previously as having cirrhosis. Histological staging using METAVIR scoring was done by an experienced pathologist for those people having biopsies. ARFI measurements showed good correlation with fibrosis stage using Spearman's correlation coefficient (r=0.87, p<0.001), and a high diagnostic accuracy for predicting severe fibrosis and cirrhosis using optimal measurement cut‑off values for each stage (AUROC curve F3=0.99; F4=0.97).

Evidence synthesis

3.15 The company included a brief synthesis of the clinical evidence in its submission and concluded that VTq and transient elastography have equivalent accuracy, although transient elastography may be slightly more accurate in diagnosing mild fibrosis (F1). The External Assessment Centre considered that this conclusion was plausible, but noted that the company did not carry out a meta‑analysis which would have provided a more definitive result.

3.16 The company provided an overall interpretation of the clinical evidence and concluded that VTq can be a good tool when used in clinical practice to diagnose moderate fibrosis (F2) and an excellent tool to diagnose severe fibrosis (F3) or cirrhosis (F4). The External Assessment Centre considered that this interpretation was reasonable and that the company's assessment of the strengths and weaknesses of the studies was fair.

3.17 As a result of its concerns about the studies selected by the company and the subsequent identification of additional clinical evidence, the External Assessment Centre did a meta‑analysis of the 10 studies it selected for inclusion. A random effects approach was used to calculate pooled outcome data for correlation coefficients (between either VTq or transient elastography and liver biopsy METAVIR scores), and for sensitivity, specificity and prevalence for each disease type (hepatitis B, C or a combination) using liver biopsy as the reference standard. Proportions were transformed using the logit function where necessary to overcome skewness, and values of 0 were transformed to 0.5 to allow pooling of the data. Results were back‑transformed to provide estimated pooled proportions and 95% confidence intervals. Nine outcome estimates were made from multiple studies and 6 from single studies. The pooled estimates for sensitivity, specificity and prevalence for each disease type (hepatitis B, C or a combination) are shown in table 1.
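The pooling steps described above (logit transformation, replacement of zero values by 0.5, random effects pooling and back‑transformation) can be sketched as follows. This is a hedged illustration, not the External Assessment Centre's actual analysis: it assumes the DerSimonian–Laird estimator for between‑study variance, assumes that the 0.5 replacement applies to zero cell counts, and uses invented study counts. All function names are hypothetical.

```python
import math


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance.

    Zero cell counts are replaced by 0.5 (assumed reading of the
    'values of 0 were transformed to 0.5' step).
    """
    a = events if events > 0 else 0.5
    b = (total - events) if (total - events) > 0 else 0.5
    return math.log(a / b), 1 / a + 1 / b


def pool_random_effects(studies):
    """Pool (events, total) pairs; return (estimate, lower, upper) as proportions.

    Uses DerSimonian-Laird random effects on the logit scale, which is an
    assumption of this sketch.
    """
    y, v = zip(*(logit_and_variance(e, n) for e, n in studies))
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance; zero when there is a single study or no excess heterogeneity.
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))

    def expit(t):
        return 1 / (1 + math.exp(-t))  # back-transform from the logit scale

    return expit(pooled), expit(pooled - 1.96 * se), expit(pooled + 1.96 * se)


# Hypothetical data: (true positives, people with the condition) per study.
studies = [(45, 60), (30, 35), (52, 70)]
estimate, lower, upper = pool_random_effects(studies)
print(f"pooled sensitivity {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Working on the logit scale keeps the back‑transformed confidence limits between 0 and 1, which is why the intervals in table 1 are not symmetrical around their point estimates.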

Table 1 Pooled estimates from External Assessment Centre's meta‑analysis with 95% confidence intervals for prevalence and diagnostic accuracy

| Hepatitis type | Measure (reference standard: liver biopsy) | VTq, F≥1 | VTq, F≥2 | VTq, F≥3 | VTq, F4 | Transient elastography, F≥1 | Transient elastography, F≥2 | Transient elastography, F≥3 | Transient elastography, F4 |
|---|---|---|---|---|---|---|---|---|---|
| B | No. of studies | | 2 | | 1 | | 1 | | 1 |
| B | Prevalence (95% CI) | | 0.43 (0.06–0.79) | | 0.27 (0.19–0.36) | | 0.61 (0.51–0.70) | | 0.27 (0.19–0.36) |
| B | Sensitivity % (95% CI) | | 70.02 (31.59–92.19) | | 93.1 (77.23–99.15) | | 81.8 (70.39–90.24) | | 88.1 (72.65–97.81) |
| B | Specificity % (95% CI) | | 87.01 (78.69–92.40) | | 76.83 (66.40–85.90) | | 71.24 (55.42–84.28) | | 86.67 (77.95–93.76) |
| C | No. of studies | 2 | 5 | 5 | 4 | | 1 | 1 | 1 |
| C | Prevalence (95% CI) | 0.91 (0.83–0.95) | 0.60 (0.48–0.71) | 0.40 (0.32–0.48) | 0.23 (0.18–0.29) | | 0.63 (0.54–0.71) | 0.39 (0.31–0.47) | 0.22 (0.15–0.29) |
| C | Sensitivity % (95% CI) | 69.82 (66.82–72.66) | 78.47 (70.04–85.03) | 85.76 (75.94–91.99) | 84.48 (79.78–88.24) | | 71.0 (60.57–80.46) | 77.0 (64.40–80.46) | 70.0 (50.60–85.27) |
| C | Specificity % (95% CI) | 80.95 (70.44–88.34) | 78.96 (73.49–83.55) | 84.36 (80.69–87.45) | 81.45 (75.43–86.27) | | 71.0 (56.92–82.87) | 85.0 (75.27–91.60) | 82.0 (73.09–88.42) |
| B and C | No. of studies | | 7 | | 5 | | 2 | | 2 |
| B and C | Prevalence (95% CI) | | 0.55 (0.42–0.67) | | 0.23 (0.18–0.29) | | 0.62 (0.53–0.70) | | 0.23 (0.14–0.36) |
| B and C | Sensitivity % (95% CI) | | 77.01 (68.88–83.52) | | 85.03 (80.59–88.60) | | 76.16 (63.89–85.22) | | 79.43 (55.69–92.22) |
| B and C | Specificity % (95% CI) | | 81.07 (75.83–85.39) | | 80.44 (75.57–84.54) | | 71.11 (61.17–79.36) | | 83.82 (77.81–88.45) |

3.18 The results of the meta‑analysis showed that the prevalence of moderate fibrosis (F2) in the combined hepatitis B and C population was lower with VTq (0.55) than with transient elastography (0.62). However, the techniques had similar prevalence estimates for cirrhosis (F4; 0.23 for VTq and 0.23 for transient elastography). Sensitivity and specificity values were similar for hepatitis B and C. VTq had slightly higher values for both sensitivity and specificity in diagnosing moderate fibrosis (F2; 77% and 81% respectively) than transient elastography (76% and 71% respectively) in people with hepatitis B and C. The sensitivity was higher than the specificity for identifying cirrhosis (F4) in the combined study population for VTq (85% and 80% respectively), but the opposite was found for transient elastography (79% and 84% respectively). The correlation coefficients for VTq and transient elastography were similar in the combined study population (0.68 to 0.69).

3.19 The External Assessment Centre noted that no adjustment could be made in the meta‑analysis for confounding variables such as patient characteristics, other than disease type, study design and location, because there was insufficient information in the papers.

Adverse events

3.20 In its submission, the company stated that no adverse events have been reported for VTq. None were identified in the literature, or from searches of the MHRA website and US Food and Drug Administration database: Manufacturer and User Device Facility Experience (MAUDE). The External Assessment Centre repeated the literature searches and found no adverse events relating to the use of VTq.

Committee considerations

3.21 The Committee considered that despite some limitations, the clinical evidence was sufficient to demonstrate that VTq has equivalent accuracy to transient elastography for diagnosing liver fibrosis. The Committee recognised that there was no evidence specifically evaluating the use of VTq for monitoring of liver fibrosis, but considered that the technology was likely to be useful for this purpose. It was advised by clinical experts that the use of image‑guided assessment of liver fibrosis in VTq allowed measurements to be taken from the same part of the liver at different times, which would be useful for monitoring changes in fibrosis relating to disease progression.

3.22 The Committee heard advice from clinical experts that VTq would be particularly useful in hospitals without access to transient elastography for determining the stage of fibrosis and informing the decision to start antiviral therapy, because those hospitals currently use liver biopsy as their primary method of diagnosis. The Committee considered that using VTq could avoid a considerable number of liver biopsies for both initial assessment and ongoing monitoring (particularly for people with hepatitis B), reducing the risks of morbidity and mortality.

3.23 For hospitals with access to transient elastography, the Committee was advised that using VTq could offer advantages by enabling image‑guided assessment of the liver. This assessment offers the opportunity to select areas from which to assess fibrosis and to identify associated pathologies, for example cirrhosis and hepatocellular carcinoma.

3.24 The Committee noted that no evidence was available on the use of VTq in children and that the company's instructions for use do not distinguish between adults and children. However, clinical experts advised the Committee that there was no reason to suppose that VTq would be any less effective in children than in adults, although the range of readings considered normal may differ between them. It heard from clinical experts that avoiding liver biopsies would be particularly beneficial for children, who might need a general anaesthetic and an overnight stay in hospital, as well as facing restrictions to their daily activities following each biopsy.

3.25 The Committee also noted a lack of evidence for improved quality of life as a consequence of using VTq, but it considered that reducing the need for liver biopsies would have a positive effect on quality of life.

3.26 The Committee noted comments made during consultation about the cut‑off values used with VTq. It considered that the interpretation of these values was a matter for clinical judgement by specialists, taking into account results of other tests and the clinical context. It also noted several comments made during consultation that suggested a variety of factors, including obesity and hepatic inflammation, may influence the readings from both VTq and transient elastography. The Committee considered it unclear from the clinical evidence how exactly these factors influence the accuracy of the tests.

  • National Institute for Health and Care Excellence (NICE)