
Docket #: S24-168

A Generalist Foundation Model for Medicine (MedFM) Achieving Human Level Performance Across Multiple Disciplines

Stanford researchers have developed MedFM, a multimodal foundation model designed to classify and interpret medical images with expert-level performance across a wide range of modalities, including X-rays, MRIs, CT scans, histopathology, and clinical photos. MedFM also enables image-based search and diagnostic support through natural language and visual queries.

Training AI models to interpret medical images typically requires large labeled datasets and expert annotation, which are often unavailable for rare diseases or emerging infections. Existing AI models either lack domain-specific knowledge or depend on large-scale, hand-curated data to match expert performance.
MedFM addresses these challenges by automatically curating a large-scale, high-quality medical image-text dataset from open-access literature using a coordinated system of AI agents. A vision-language model is then fine-tuned on this dataset to understand both medical imagery and clinical context. At inference time, MedFM boosts diagnostic accuracy through a zero-shot reasoning mechanism that simulates expert analysis by generating disease-relevant prompts. In clinical testing, MedFM outperforms existing models in outbreak scenarios such as mpox and surpasses human accuracy on NEJM image quizzes. A companion app further enables real-time patient image search and diagnostic ranking based on image similarity and extracted textual findings.
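
As a rough illustration of the zero-shot step described above, the sketch below scores one image against several candidate diagnoses by comparing its embedding with the averaged embeddings of disease-relevant prompts. This is a generic CLIP-style prompt-ensemble pattern, not MedFM's actual implementation, which this listing does not detail. The function names, the toy embedding vectors, and the temperature value of 100 are all assumptions made for illustration; in practice the embeddings would come from the fine-tuned image and text encoders.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.clip(norm, 1e-12, None)


def zero_shot_scores(image_emb, prompt_embs_by_class, temperature=100.0):
    """Return a probability per class for a single image embedding.

    image_emb: 1-D vector from a (hypothetical) image encoder.
    prompt_embs_by_class: dict mapping a class name to a 2-D array, one row
        per disease-relevant prompt embedded by a (hypothetical) text encoder.
    temperature: assumed scaling factor applied before the softmax.
    """
    image_emb = l2_normalize(np.asarray(image_emb, dtype=float))
    names = list(prompt_embs_by_class)
    sims = []
    for name in names:
        prompts = l2_normalize(np.asarray(prompt_embs_by_class[name], dtype=float))
        # Average the prompt embeddings into one prototype per class.
        prototype = l2_normalize(prompts.mean(axis=0))
        sims.append(float(image_emb @ prototype))
    logits = np.array(sims) * temperature
    # Numerically stable softmax over the candidate classes.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return dict(zip(names, probs))


# Toy 3-dimensional embeddings, invented purely for this example.
toy_image = [1.0, 0.0, 0.0]
toy_prompts = {
    "mpox": [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]],
    "chickenpox": [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]],
}
scores = zero_shot_scores(toy_image, toy_prompts)
best = max(scores, key=scores.get)
```

Averaging several prompts per class tends to make the class prototype less sensitive to the wording of any single prompt, which is one plausible reason for generating multiple disease-relevant prompts at inference time.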

Applications

  • Rare disease identification
  • Automated medical image classification
  • Clinical decision support systems

Advantages

  • Zero-shot inference
  • Expert-level accuracy
  • Multimodal capability
