About me

I am currently a PhD student at the University of Edinburgh, supervised by Prof. Timothy Hospedales and Dr. Yongxin Yang, and funded by the UKRI CDT in Biomedical AI. I obtained my BSc in Computer Science from Tongji University in 2021.

I am broadly interested in machine learning and its applications in biomedicine, especially multi-modal learning and large vision-language models. Feel free to drop me an email about potential collaborations!

👉 I am actively looking for a research intern position this year. Shoot me an email if you think I am a good fit! (Flexible on location and timing.)

Research Interests

  • Machine Learning: Multimodal Learning, Algorithmic Fairness, Self-supervised Learning.
  • AI4Healthcare: Computational Biology, Medical Imaging.

News

[01/2024] Giving a talk on Fool Your (V)LLMs at the BMVA Vision-Language Symposium!
[11/2023] Invited talk on MEDFAIR at the FAIMI workshop!
[02/2023] Meta-Omnium is accepted to CVPR’23!
[01/2023] MEDFAIR is accepted to ICLR’23 as spotlight!

Publications / Preprints

Check my Google Scholar.

Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models. [website][paper][code]
Yongshuo Zong, Ondrej Bohdal, Tingyang Yu, Yongxin Yang, Timothy Hospedales.
arXiv 2024.

Fool Your Large (Vision and) Language Model With Embarrassingly Simple Permutations.
Yongshuo Zong, Tingyang Yu, Bingchen Zhao, Ruchika Chavhan, Timothy Hospedales.
arXiv 2023.

What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models. [paper][code]
Letian Zhang, Xiaotong Zhai, Zhongkai Zhao, Xin Wen, Yongshuo Zong, Bingchen Zhao.
arXiv 2023.

Self-Supervised Multimodal Learning: A Survey. [paper][Github]
Yongshuo Zong, Oisin Mac Aodha, Timothy Hospedales.
arXiv 2023.

Meta Omnium: A Benchmark for General-Purpose Learning-to-learn. [website][paper][code]
Ondrej Bohdal, Yinbing Tian, Yongshuo Zong, Ruchika Chavhan, Da Li, Henry Gouk, Li Guo, Timothy Hospedales.
Computer Vision and Pattern Recognition (CVPR 2023).

MEDFAIR: Benchmarking Fairness for Medical Imaging. [paper][code][website][docs]
Yongshuo Zong, Yongxin Yang, Timothy Hospedales.
International Conference on Learning Representations (ICLR 2023 Spotlight).

conST: An Interpretable Multi-modal Contrastive Learning Framework for Spatial Transcriptomics. [paper][code]
Yongshuo Zong, Tingyang Yu, Xuesong Wang, Yixuan Wang, Zhihang Hu, Yu Li.
bioRxiv 2022.

scMinerva: A GCN-Featured Interpretable Framework for Single-cell Multi-omics Integration with Random Walk on Heterogeneous Graph. [paper][code]
Tingyang Yu, Yongshuo Zong, Yixuan Wang, Xuesong Wang, Yu Li.
bioRxiv 2022.