From Slices to Reports
The State of AI in Cross-Sectional Medical Imaging
Kalyan Sivasailam
CEO & Founder, 5C Network
Abstract
Cross-sectional medical imaging — CT and MRI — forms the diagnostic backbone of modern medicine, with over 100 million CT scans performed annually in the US alone. Despite remarkable progress, a fundamental disconnect persists: research has advanced to native 3D volumetric understanding (0.87 AUROC across 366 findings), while production systems remain confined to 2D slice-based processing.
No deployed system generates complete radiology reports from raw imaging volumes. This survey presents the first comprehensive taxonomy of AI approaches spanning the full pipeline from DICOM ingestion to report generation, evaluating 20+ models and identifying three critical gaps.
Published · 5C Network Research
TL;DR: Despite 1,200+ FDA-cleared radiology AI devices, no deployed system can generate a complete radiology report from raw CT or MRI volumes. Research has achieved 3D volumetric understanding (0.87 AUROC across 366 findings), but production systems remain stuck on 2D single-finding detection. This survey identifies three critical gaps and proposes a hybrid architecture combining segmentation grounding with vision-language models as a path toward the first clinically viable AI radiologist system.
Why This Paper Matters
Radiology AI is one of the most funded and regulated segments of healthcare technology. According to the FDA's AI/ML device database, over 1,200 AI/ML-enabled medical devices have been cleared, and 71.5% of all FDA AI clearances are in radiology — more than any other medical specialty combined. Yet something fundamental is missing.
Not a single deployed system can take a raw CT or MRI volume and produce a complete radiology report. The technology that exists today detects individual findings — a nodule here, a hemorrhage there — but the cognitive work of radiology is interpretation, synthesis, and structured reporting. That gap between detection and diagnosis is where patients wait.
Meanwhile, a parallel disconnect has emerged: research has leapt ahead of production. Academic models now process full 3D volumes and understand hundreds of findings simultaneously. But the systems hospitals can actually buy remain stuck on 2D, single-task detection.
100M+
CT scans per year in the US
30%
Global radiologist shortage (WHO)
24hr+
Average report turnaround at many institutions
Sources: WHO Global Health Observatory; FDA AI/ML Device Database (2024); IMV Medical Information Division CT Market Outlook Report
Three Critical Gaps
Our analysis identifies three structural disconnects that explain why radiology AI hasn't delivered on its promise.
The 3D Processing Gap
Research processes full 3D volumes — Pillar-0 (2025) achieves 0.87 AUROC across 366 findings. Production processes 2D slices one at a time. The academic world has solved volumetric understanding; the commercial world hasn't shipped it.
Key finding
Of 1,200+ FDA-cleared devices, fewer than 5% use true 3D volumetric processing.
The MRI Data Desert
CT has 25,000+ volumes with paired reports (via CT-RATE). MRI has essentially zero large-scale public datasets with paired radiology reports. MRI foundation models are years behind their CT counterparts.
Key finding
Zero large-scale public MRI datasets exist with paired radiology reports — the highest-impact data challenge in the field.
The Report Generation Chasm
No deployed system generates complete radiology reports from raw imaging volumes. The most-funded company in "report generation" operates on text alone — converting dictated findings to impressions without analyzing any images.
Key finding
The market's leading "report generation" system processes zero images. It rewrites text that radiologists already dictated.
The Hybrid Architecture
The paper proposes a unified architecture that bridges these gaps — combining segmentation grounding with vision-language model flexibility.
Anatomy Track
3D segmentation models identify and measure anatomical structures — organs, vessels, lesions — with sub-millimeter precision.
Vision Track
Vision-language models and 3D encoders process imaging volumes for pathology detection across hundreds of findings simultaneously.
Cross-Validation Fusion
Findings from both tracks are merged, conflicts flagged, and confidence scores calibrated before report generation.
Report Composition
RAG-augmented multi-agent system generates structured radiology reports grounded in validated findings and institutional templates.
This architecture reflects how 5C Network's Bionic AI Suite operates — combining computer vision, structured reporting, and multi-agent quality control. Read more about our approach to Generalised Medical AI and Hybrid Intelligence.
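The cross-validation fusion step can be illustrated with a small sketch. This is not the Bionic AI Suite's implementation; the `Finding` record, field names, and the fixed confidence boost are assumptions made for this example. The idea it shows is the one described above: vision-track findings that the anatomy track corroborates gain calibrated confidence, while unsupported vision-only findings are routed to a conflict queue rather than straight into the report.

```python
from dataclasses import dataclass

# Hypothetical finding record; fields are illustrative, not a real schema.
@dataclass
class Finding:
    label: str         # e.g. "pulmonary nodule"
    confidence: float  # model-reported probability in [0, 1]
    source: str        # "anatomy" (segmentation), "vision" (VLM), or "both"

def fuse_findings(anatomy, vision, boost=0.1):
    """Merge the two tracks: corroborated findings gain confidence,
    vision-only findings are flagged for review (hallucination guard)."""
    anatomy_labels = {f.label for f in anatomy}
    vision_labels = {f.label for f in vision}
    fused, flagged = [], []
    for f in vision:
        if f.label in anatomy_labels:
            # Cross-validated by segmentation: boost, capped at 1.0.
            fused.append(Finding(f.label, min(1.0, f.confidence + boost), "both"))
        else:
            flagged.append(f)  # no segmentation support -> conflict queue
    # Anatomy-only findings pass through: they are measurement-grounded.
    fused += [f for f in anatomy if f.label not in vision_labels]
    return fused, flagged
```

A real system would match findings by anatomical location rather than exact label strings, and would learn the calibration instead of applying a constant boost; the control flow, however, is the same.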
By the Numbers
1,200+
FDA-cleared radiology AI devices
71.5%
of all FDA AI/ML clearances are radiology
0.87
AUROC — state of the art for 3D understanding
8-15%
Hallucination rate in current VLMs
142d
Median FDA 510(k) clearance timeline
Sources: FDA AI/ML Device Database (2024); Pillar-0 (2025); CT-RATE (2024). Full citations in the paper.
Open Problems
The paper identifies seven open research challenges. Here are the most impactful.
The MRI Data Problem
The most impactful data challenge in the field. Building MRI foundation models requires large-scale datasets with paired radiology reports — and none exist publicly. Solving this would unlock an entire modality for AI.
Longitudinal Reasoning
Radiologists routinely compare current studies with prior imaging — "the nodule has increased from 4mm to 7mm since the prior study." No current AI system handles temporal reasoning across multiple imaging volumes.
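The interval comparison itself is simple to state in code; the unsolved part is locating and matching the same lesion across two volumes. A minimal sketch of just the comparison step, using the nodule example above, with an assumed 20% change threshold and hypothetical function and variable names:

```python
from datetime import date

def compare_measurement(label, prior_mm, current_mm, prior_date, current_date):
    """Describe interval change for one matched lesion and flag large changes.
    The 20% threshold is an assumption for this sketch, not a clinical rule."""
    delta = current_mm - prior_mm
    pct = delta / prior_mm * 100
    days = (current_date - prior_date).days
    status = "increased" if delta > 0 else "decreased" if delta < 0 else "stable"
    sentence = (f"The {label} has {status} from {prior_mm} mm to {current_mm} mm "
                f"over {days} days ({pct:+.0f}%).")
    return sentence, abs(pct) >= 20  # flag significant interval change

sentence, flagged = compare_measurement("nodule", 4, 7, date(2024, 1, 10), date(2024, 7, 10))
```

Everything upstream of this function, registering the two studies, matching lesions, and extracting comparable measurements from 3D volumes, is exactly the temporal reasoning that no current system performs.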
Hallucination Safety
Current vision-language models hallucinate findings in 8-15% of generated reports. For critical findings — pulmonary embolism, stroke, pneumothorax — this rate must approach zero. Solving this requires new architectures, not just better prompting.
Read the Complete Survey
"From Slices to Reports: The State of AI in Cross-Sectional Medical Imaging"
by Kalyan Sivasailam · No email required. No paywall.