Docket #: S24-448
AI Model Aids Clinical Decision-Making and Precision Medicine Predictions
Stanford researchers have developed a vision-language foundation model, named "MUSK", designed to leverage large-scale, unlabeled, unpaired image and text data. MUSK shows strong performance in outcome prediction, including melanoma relapse prediction, pan-cancer prognosis prediction, and immunotherapy response prediction in lung and gastro-esophageal cancers.
Clinical decision-making is driven by multimodal data, including clinical notes and pathologic characteristics. Artificial intelligence approaches that can effectively integrate multimodal data hold significant promise to advance clinical care. Foundation AI models in particular represent a new frontier of medical AI R&D. These models are pretrained on massive, diverse datasets and can be applied to numerous downstream tasks with minimal or no further training. A major hurdle for the development of multimodal AI models has been the scarcity of well-annotated datasets, especially in the clinical setting. Recent efforts to develop vision-language foundation models for medicine have fallen short because of insufficient data scale, inefficient training approaches (i.e., contrastive learning), and a focus on relatively simple tasks such as image classification or retrieval.
To advance the clinical relevance and impact of foundation AI models, Stanford researchers developed MUSK. MUSK can be used for a wide range of downstream tasks, including image and text retrieval, visual question answering, image classification, and molecular biomarker prediction. MUSK outperforms comparable state-of-the-art foundation models on multiple benchmarks. Beyond supporting the development of more reliable and robust foundation AI models, the method can be used to improve the prediction of treatment response and patient outcomes for precision medicine. MUSK has commercial potential as the basis for clinical diagnostic tests, as a provided service, and more.
Stage of research
In vitro data
Applications
- Image and text retrieval
- Visual question answering
- Image classification
- Molecular biomarker prediction
- Development of more reliable and robust foundation AI models
- Clinical diagnostic tests or services
- Oncology and precision medicine
Advantages
- Superior performance over state-of-the-art foundation models
- Strong performance for predicting clinical outcomes
- Custom-designed foundation model
Publications
- Xiang, J., Wang, X., Zhang, X. et al. A vision–language foundation model for precision oncology. Nature (2025). https://doi.org/10.1038/s41586-024-08378-w
Similar Technologies
- Imaging features for treatment selection, disease monitoring, and outcome prediction (S22-156)
- Improved cfDNA methylation profiling through correction of misrepaired jagged-ends (S23-034)
- Quantification of Antigen Molecules Without Calibrators Using Dynamic Flow Cytometry (S15-009)