publications

Selected publications and preprints.

2024

  1. Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
    Yongshuo Zong, Ondrej Bohdal, Tingyang Yu, Yongxin Yang, and Timothy Hospedales
    ICML, 2024
    TL;DR: VLLM fine-tuning breaks LLM safety, but our VLGuard can fix this.
  2. Fool your (vision and) language model with embarrassingly simple permutations
    Yongshuo Zong, Tingyang Yu, Bingchen Zhao, Ruchika Chavhan, and Timothy Hospedales
    ICML, 2024
TL;DR: (V)LLM multiple-choice question answering is not robust to permutations of the answer options.
  3. What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models
    Letian Zhang, Xiaotong Zhai, Zhongkai Zhao, Yongshuo Zong, Xin Wen, and 1 more author
    CVPR, 2024
    TL;DR: Vision large language models do not understand counterfactual conditions well.
  4. VL-ICL Bench: The Devil in the Details of Benchmarking Multimodal In-Context Learning
    Yongshuo Zong, Ondrej Bohdal, and Timothy Hospedales
    arXiv preprint, 2024
    TL;DR: VL-ICL Bench is a better multimodal ICL benchmark than VQA and captioning.
  5. Self-supervised multimodal learning: A survey
    Yongshuo Zong, Oisin Mac Aodha, and Timothy Hospedales
    IEEE T-PAMI, 2024
    TL;DR: Systematic review of self-supervised multimodal learning methods.

2023

  1. Meta Omnium: A benchmark for general-purpose learning-to-learn
    Ondrej Bohdal, Yinbing Tian, Yongshuo Zong, Ruchika Chavhan, Da Li, and 3 more authors
    CVPR, 2023
    TL;DR: A framework for evaluating meta-learners across various vision tasks consistently.
  2. MEDFAIR: benchmarking fairness for medical imaging
    Yongshuo Zong, Yongxin Yang, and Timothy Hospedales
    ICLR, 2023
TL;DR: We develop a fairness benchmark for medical imaging and find that state-of-the-art bias mitigation algorithms do not significantly outperform ERM.

2022

  1. conST: an interpretable multi-modal contrastive learning framework for spatial transcriptomics
    Yongshuo Zong, Tingyang Yu, Xuesong Wang, Yixuan Wang, Zhihang Hu, and 1 more author
    bioRxiv preprint, 2022
    TL;DR: A contrastive SSL method for spatial transcriptomics representation learning.