X-ray vision: Using AI to maximize the value of radiographic images


 

FROM AACR: AI, DIAGNOSIS, AND IMAGING 2021

Artificial intelligence (AI) is expected to one day affect the entire continuum of cancer care – from screening and risk prediction to diagnosis, risk stratification, treatment selection, and follow-up, according to an expert in the field.

Dr. Alan P. Lyss, now retired, was a community-based medical oncologist and clinical researcher for more than 35 years, practicing in St. Louis.

Hugo J.W.L. Aerts, PhD, director of the AI in Medicine Program at Brigham and Women’s Hospital in Boston, described studies using AI for some of these purposes during a presentation at the AACR Virtual Special Conference: Artificial Intelligence, Diagnosis, and Imaging (Abstract IA-06).

In one study, Dr. Aerts and colleagues set out to determine whether a convolutional neural network (CNN) could extract prognostic information from chest radiographs. The researchers tested this theory using patients from two trials – the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial and the National Lung Screening Trial (NLST).

The team developed a CNN, called CXR-risk, and tested whether it could predict the longevity and prognosis of patients in the PLCO (n = 52,320) and NLST (n = 5,493) trials over a 12-year time period, based only on chest radiographs. No clinical information, demographics, radiographic interpretations, duration of follow-up, or censoring were provided to the deep-learning system.

CXR-risk output was stratified into five categories of radiographic risk scores for probability of death, from 0 (very low likelihood of mortality) to 1 (very high likelihood of mortality).
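
As a rough illustration of that stratification step, the sketch below maps a score between 0 and 1 onto five ordered categories. The cut points and the three middle labels are hypothetical, since the thresholds used by CXR-risk are not reported here.

```python
from bisect import bisect_right

# Hypothetical equal-width cut points; the thresholds CXR-risk actually used
# are not reported in this article.
CUT_POINTS = [0.2, 0.4, 0.6, 0.8]

# Only "very low" and "very high" are named in the article; the middle labels
# are assumptions made for illustration.
LABELS = ["very low", "low", "moderate", "high", "very high"]


def risk_category(score: float) -> str:
    """Map a radiographic risk score in [0, 1] to one of five categories."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    # bisect_right counts how many cut points are <= score, which is the
    # index of the category the score falls into.
    return LABELS[bisect_right(CUT_POINTS, score)]
```

Under these assumed cut points, a score of 0.05 would fall in the very-low-risk group and a score of 0.95 in the very-high-risk group.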

The investigators found a graded association between radiographic risk score and mortality. The very-high-risk group had mortality rates of 53.0% (PLCO) and 33.9% (NLST). In both trials, this was significantly higher than for the very-low-risk group. The unadjusted hazard ratio was 18.3 in the PLCO data set and 15.2 in the NLST data set (P < .001 for both).

This association was maintained after adjustment for radiologists’ findings (e.g., a lung nodule) and risk factors such as age, gender, and comorbid illnesses like diabetes. The adjusted hazard ratio (aHR) was 4.8 in the PLCO data set and 7.0 in the NLST data set (P < .001 for both).

In both data sets, individuals in the very-high-risk group were significantly more likely to die of lung cancer. The aHR was 11.1 in the PLCO data set and 8.4 in the NLST data set (P < .001 for both).

This might be expected for people who were interested in being screened for lung cancer. However, patients in the very-high-risk group were also more likely to die of cardiovascular illness (aHR, 3.6 for PLCO and 47.8 for NLST; P < .001 for both) and respiratory illness (aHR, 27.5 for PLCO and 31.9 for NLST; P ≤ .001 for both).

With this information, a clinician could order additional testing, use more aggressive surveillance measures, or both. An oncologist considering therapy for a patient with newly diagnosed cancer could plan treatment choices and stratification for adverse events more intelligently.

Using AI to predict the risk of lung cancer

In another study, Dr. Aerts and colleagues developed and validated a CNN called CXR-LC, which was based on CXR-risk. The goal of this study was to see if CXR-LC could predict long-term incident lung cancer using data available in the electronic health record (EHR), including chest radiographs, age, sex, and smoking status.

The CXR-LC model was developed using data from the PLCO trial (n = 41,856) and was validated in smokers from the PLCO trial (n = 5,615; 12-year follow-up) as well as heavy smokers from the NLST trial (n = 5,493; 6-year follow-up).

Results showed that CXR-LC was able to predict which patients were at highest risk for developing lung cancer.

CXR-LC had better discrimination for incident lung cancer than did Medicare eligibility in the PLCO data set (area under the curve [AUC], 0.755 vs. 0.634; P < .001). The performance of CXR-LC was similar to that of the PLCOM2012 risk score in both the PLCO data set (AUC, 0.755 vs. 0.751) and the NLST data set (AUC, 0.659 vs. 0.650).
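
For readers less familiar with the metric, the area under the curve can be read as the probability that a randomly chosen person who developed lung cancer received a higher risk score than a randomly chosen person who did not. The sketch below computes it that way from two lists of scores. The function name and the toy scores are made up for illustration and are not study data.

```python
def auc(case_scores, noncase_scores):
    """Area under the ROC curve as a pairwise win rate.

    Counts how often a case outscores a non-case across every possible
    case/non-case pair; ties count as half a win.
    """
    if not case_scores or not noncase_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


# Toy scores, invented for this example only.
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so a value of 0.755 means the higher score went to the person who developed cancer in roughly three of every four such pairs.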

When they were compared in screening populations of equal size, CXR-LC was more sensitive than Medicare eligibility criteria in the PLCO data set (74.9% vs. 63.8%; P = .012) and missed 30.7% fewer incident lung cancer diagnoses.
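
The "30.7% fewer" figure is consistent with the two sensitivities quoted above, on the assumption (not stated in the presentation) that it is the relative reduction in the share of cancers missed. A short check of that arithmetic:

```python
def relative_reduction_in_misses(sens_new: float, sens_ref: float) -> float:
    """Relative drop in the fraction of cases missed when moving from the
    reference rule (sens_ref) to the new rule (sens_new)."""
    missed_new = 1.0 - sens_new   # fraction of cancers the new rule misses
    missed_ref = 1.0 - sens_ref   # fraction the reference rule misses
    return (missed_ref - missed_new) / missed_ref


# Sensitivities reported for CXR-LC and Medicare eligibility in PLCO.
reduction = relative_reduction_in_misses(0.749, 0.638)
print(f"{reduction:.1%}")  # prints "30.7%"
```

In other words, Medicare eligibility missed about 36.2% of incident cancers and CXR-LC about 25.1%, and the gap between those two miss rates is roughly 30.7% of the larger one.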

AI as a substitute for specialized testing and consultation

In a third study, Dr. Aerts and colleagues used a CNN to predict cardiovascular risk by assessing coronary artery calcium (CAC) from clinically obtained, readily available CT scans.

Ordinarily, identifying CAC – an accurate predictor of cardiovascular events – requires specialized expertise (manual measurement and cardiologist interpretation), time (estimated at 20 minutes/scan), and equipment (ECG-gated cardiac CT scan and special software).

In this study, the researchers used a fully automated, end-to-end system with an analysis time of less than 2 seconds per scan.

The team trained and tuned their CNN using the Framingham Heart Study Offspring and Third Generation cohorts (n = 1,636), which included asymptomatic patients with high-quality, cardiac-gated CT scans for CAC quantification.

The researchers then tested the CNN on two asymptomatic and two symptomatic cohorts:

  • Asymptomatic Framingham Heart Study participants (n = 663) in whom the outcome measures were cardiovascular disease and death.
  • Asymptomatic NLST participants (n = 14,959) in whom the outcome measure was atherosclerotic cardiovascular death.
  • Symptomatic PROMISE study participants with stable chest pain (n = 4,021) in whom the outcome measures were all-cause mortality, MI, and hospitalization for unstable angina.
  • Symptomatic ROMICAT-II study patients with acute chest pain (n = 441) in whom the outcome measure was acute coronary syndrome at 28 days.

Among 5,521 subjects across all testing cohorts with cardiac-gated and nongated chest CT scans, the CNN and expert reader interpretations agreed on the CAC risk scores with a high level of concordance (kappa, 0.71; concordance rate, 0.79).
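
The two agreement measures quoted here answer slightly different questions. The concordance rate is the plain fraction of scans on which the two readings match, and kappa discounts that fraction by the agreement expected from chance alone. The sketch below computes both for paired category labels. It uses unweighted Cohen's kappa, which is an assumption because the variant used in the study is not specified here, and the labels are toy values.

```python
from collections import Counter


def concordance_and_kappa(reader_a, reader_b):
    """Return (concordance rate, unweighted Cohen's kappa) for paired labels."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(reader_a)
    # Observed agreement: share of items given the same label by both readers.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: probability both readers pick the same label if each
    # assigned labels independently at their own observed frequencies.
    freq_a, freq_b = Counter(reader_a), Counter(reader_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one label is ever used")
    return observed, (observed - expected) / (1.0 - expected)


# Toy labels, invented for this example only.
automated = ["low", "low", "high", "high"]
manual = ["low", "high", "high", "high"]
concordance, kappa = concordance_and_kappa(automated, manual)
```

With these toy labels the readers agree on three of four scans (concordance 0.75), half of which would be expected by chance, giving a kappa of 0.5.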

There was a very high Spearman’s correlation of 0.92 (P < .0001) and substantial agreement between automatically and manually calculated CAC risk groups, substantiating robust risk prediction for cardiovascular disease across multiple clinical scenarios.
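
Spearman's correlation measures how well the ordering of one set of values matches the ordering of another, without requiring a linear relationship. A minimal sketch follows, which ranks both lists (averaging ranks for ties) and then takes the ordinary correlation of the ranks. The helper names are hypothetical, and this is not the study's analysis code.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    mean_x = sum(rank_x) / len(rank_x)
    mean_y = sum(rank_y) / len(rank_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    spread_x = sum((a - mean_x) ** 2 for a in rank_x)
    spread_y = sum((b - mean_y) ** 2 for b in rank_y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("rho is undefined when one list is constant")
    return covariance / (spread_x * spread_y) ** 0.5
```

A value of 1 means the automated and manual scores order the scans identically, so the reported 0.92 indicates that the two methods rank patients in nearly the same order.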

Dr. Aerts commented that, among the NLST participants who had the highest risk of developing lung cancer, the risk of cardiovascular death was as high as the risk of death from lung cancer.
