Clinical 2025-03-31
Vision-Language Models for Tuberculosis Diagnosis
A multimodal approach that combines chest X-ray imaging with clinical data to improve tuberculosis diagnostic accuracy, with reported gains in early-stage TB identification and differential diagnosis.
5C Network Research Team · arXiv · DOI: 10.48550/arXiv.2503.14538
Key Findings
- Vision-language models significantly outperform image-only classifiers for TB diagnosis by incorporating clinical context
- Multimodal framework improves early-stage TB identification where subtle radiographic signs are easily missed
- Differential diagnosis accuracy improves when the model reasons jointly over imaging and patient history
- Companion study on chronic TB diagnostics extends the framework to long-term treatment monitoring
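
As a rough illustration of how clinical context might be paired with a chest X-ray as input to a vision-language model, the sketch below renders structured patient data as a text prompt. The `ClinicalContext` class, its fields (`age`, `symptoms`, `history`), the `build_prompt` helper, and the prompt wording are all hypothetical and are not taken from the paper. The model call itself is omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClinicalContext:
    """Hypothetical container for clinical data accompanying an X-ray."""
    age: int
    symptoms: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)


def build_prompt(ctx: ClinicalContext) -> str:
    """Render clinical context as text to send alongside the image input."""
    # Fall back to an explicit marker so empty fields are not silently dropped.
    symptoms = ", ".join(ctx.symptoms) if ctx.symptoms else "none reported"
    history = ", ".join(ctx.history) if ctx.history else "none reported"
    return (
        f"Patient age: {ctx.age}. "
        f"Symptoms: {symptoms}. "
        f"History: {history}. "
        "Assess the attached chest X-ray for findings consistent with "
        "tuberculosis and list differential diagnoses."
    )


if __name__ == "__main__":
    ctx = ClinicalContext(age=34, symptoms=["cough", "night sweats"])
    print(build_prompt(ctx))
```

An image-only classifier would see just the radiograph. In this sketch the same image would be sent together with the prompt text, so the model can take symptoms and history into account.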