Diagnostic accuracy of three computer-aided detection systems for detecting pulmonary tuberculosis on chest radiography when used for screening: analysis of an international, multicenter migrants screening study

Author/s: Sifrash Meseret Gelaw, Sandra V. Kik, Morten Ruhwald, Stefano Ongarello, Tesfa Semagne Egzertegegne, Olga Gorbacheva, Christopher Gilpin, Nina Marano, Scott Lee, Christina R. Phares, Victoria Medina, Bhaskar Amatya, Claudia M. Denkinger
Language: English
Publication Type: Scientific report (Journal)(External)

The aim of this study was to independently evaluate the diagnostic accuracy of three artificial intelligence (AI)-based computer-aided detection (CAD) systems for detecting pulmonary tuberculosis (TB) on chest X-ray (CXR) images from global migrant screening.

Retrospective clinical data and CXR images were collected from the International Organization for Migration (IOM) pre-migration health assessment TB screening global database for US-bound migrants. A total of 2,812 participants were included in the dataset, of whom 1,769 (62.9%) had accompanying microbiological test results. All CXRs were interpreted offline by three CAD systems (CAD4TB v6, Lunit INSIGHT v4.9.0, and qXR v2) and re-interpreted by two expert radiologists in a blinded fashion. Performance was evaluated using receiver operating characteristic (ROC) curves and estimates of sensitivity and specificity at different CAD thresholds, against both a microbiological and a radiological reference standard (MRS and RadRS, respectively).
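The threshold-based measures described above can be illustrated with a minimal sketch. It is not the study's analysis code: the function names and the toy scores and labels are hypothetical, and it assumes that a higher CAD score means TB is more likely and that a case is called positive when its score is at or above the threshold.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for every distinct score.

    Assumption: a case is called positive when score >= threshold.
    labels use 1 for reference-standard positive and 0 for negative.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a random
    positive case scores higher than a random negative case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(scores, labels, target_spec):
    """Best sensitivity among thresholds whose specificity meets the target."""
    ok = [sens for _, sens, spec in roc_points(scores, labels) if spec >= target_spec]
    return max(ok) if ok else 0.0


def specificity_at_sensitivity(scores, labels, target_sens):
    """Best specificity among thresholds whose sensitivity meets the target."""
    ok = [spec for _, sens, spec in roc_points(scores, labels) if sens >= target_sens]
    return max(ok) if ok else 0.0


# Hypothetical toy data: ten CAD scores with reference-standard labels.
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.60, 0.70, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

print(round(auc(toy_scores, toy_labels), 3))
print(sensitivity_at_specificity(toy_scores, toy_labels, 0.70))
print(specificity_at_sensitivity(toy_scores, toy_labels, 0.90))
```

Fixing one measure and reporting the other, as in this sketch, allows systems with different score scales to be compared at the same operating point. Confidence intervals, which the study reports, are not computed here.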

The area under the ROC curve (AUC) against MRS was highest for Lunit (0.85; 95% CI 0.83–0.87), followed by qXR (0.75; 95% CI 0.72–0.77) and then CAD4TB (0.71; 95% CI 0.68–0.73). At a set specificity of 70%, Lunit had the highest sensitivity (81.4%; 95% CI 77.9–84.6); at a set sensitivity of 90%, specificity was also highest for Lunit (54.5%; 95% CI 51.7–57.3). The sensitivity of the CAD systems was comparable to that of the expert radiologist (98.3%), as was their specificity (13.7%), with the exception of CAD4TB. Similar trends were observed when using RadRS.

In conclusion, the study demonstrated that the three CAD systems had broadly similar diagnostic accuracy for TB screening, and accuracy comparable to that of an expert radiologist. Across the reference standards, Lunit performed better than both qXR and CAD4TB against MRS, and better than qXR against RadRS. Overall, these findings suggest that CAD systems could be a useful tool for TB screening programs in remote, high-TB-prevalence settings where access to expert radiologists may be limited.
