Hi 👋, I’m Lang Gao (/læŋ ɡaʊ/), an undergraduate student in Computer Science and Technology at Huazhong University of Science and Technology (HUST), expected to graduate in July 2025.

I am currently a research assistant at MBZUAI. It is a nice place for research.

I am actively seeking PhD opportunities. If you have any relevant openings or suggestions, please feel free to contact me. I would be excited to discuss potential collaborations.

💡 Research Interests

  • Mechanistic Interpretability of AI: My current research goal is to explain why large foundation models suffer from issues such as hallucinations and vulnerabilities by investigating their abnormal internal behaviors and structures, and to propose data-efficient solutions to these problems.
  • Reliable Application of AI: I am also highly interested and experienced in the reliable application of large foundation models (such as Large Language Models and Vision-Language Models), particularly in the biomedical domain.

📖 Education

09 / 2021 - 07 / 2025: B.E. (expected), Huazhong University of Science and Technology (HUST)

Skills

I have the necessary theoretical foundation and skills in AI/NLP research, including proficiency in deep learning frameworks (PyTorch, TensorFlow), training and evaluation techniques, and large-scale data management.

I am also familiar with the architectures of large foundation models such as GPT, Llama, and LLaVA. I enjoy manipulating activations and neurons within these models and observing how the changes affect their outputs.

📝 Publications

🧑‍🔬 Interpretable AI

Shaping the Safety Boundaries: Understanding and Defending Against Jailbreaks in Large Language Models

Lang Gao, Xiangliang Zhang, Preslav Nakov, and Xiuying Chen

“Try to interpret common mechanisms of diverse LLM jailbreak attacks in the activation space and propose an efficient defense method.”

Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets

Wei Liu, Zhongyu Niu, Lang Gao, Zhiying Deng, Jun Wang, Haozhao Wang, Zhigang Zeng, and Ruixuan Li

“An interpretable, causal learning paradigm that simultaneously avoids spurious correlations in data and traditional self-interpretable models.”

👨‍🔧 Reliable Application of AI

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Yunfei Xie*, Ce Zhou*, Lang Gao*, Juncheng Wu*, Xianhang Li, Hong-Yu Zhou, Sheng Liu, Lei Xing, James Zou, Cihang Xie, and Yuyin Zhou (*: co-first authors)

Toolkit & Code

“A comprehensive, large-scale multimodal dataset for medical vision-language models.”

VulDetectBench: Evaluating the Deep Capability of Vulnerability Detection with Large Language Models

Yu Liu*, Lang Gao*, Mingxin Yang*, Yu Xie, Ping Chen, Xiaojin Zhang, and Wei Chen (*: co-first authors)

“A novel, comprehensive benchmark specifically designed to assess the code vulnerability detection capabilities of LLMs.”

Toolkit & Code

💼 Experience

  • [10 / 2024 - now] MBZUAI, Research Intern (Supervisor: Prof. Xiuying Chen, topic: Mechanistic Interpretability of LLMs)
  • [07 / 2024 - 10 / 2024] University of Notre Dame, Research Intern (Supervisor: Prof. Xiangliang Zhang, topic: LLMs for Bayesian Optimization)
  • [01 / 2024 - 06 / 2024] UC Santa Cruz, Research Intern (Supervisor: Prof. Yuyin Zhou, topic: Vision-Language Models for healthcare)
  • [10 / 2023 - 12 / 2023] HUST (Supervisor: Prof. Ruixuan Li, topic: Interpretable deep learning frameworks)

🏆 Honors and Awards

  • 🥇 National First Prize, RAICOM Robotics Developer Contest - CAIR Engineering Competition National Finals, 2024
  • 🥈 National Second Prize, 15th China College Students’ Service Outsourcing Innovation and Entrepreneurship Competition, 2024
  • 🥈 National Second Prize, The 5th Integrated Circuit EDA Design Elite Challenge (Deep Learning Track), 2023
  • 🥉 National Third Prize, The 5th Global Campus Artificial Intelligence Algorithm Elite Competition, 2023
  • 🥉 National Third Prize, iFlytek Developer Competition, NLP Track, 2023

📚 Resources

Insights

  • Book: Interpretability in Deep Learning [Link]
  • Book: Interpretable Machine Learning [Link]
  • Book: Trustworthy Machine Learning [Link]
  • Book: 大语言模型 (The Chinese Book for Large Language Models) [Link]
  • Article: The Bitter Lesson [Link]

Blogs

  • [05/24] [Chinese] National Undergraduate Innovation Project Documentation. [Link]
  • [03/24] [Chinese] Negative Transfer. [Link]
  • [03/24] [Chinese] Mixture of Experts Explained. [Link]
  • [01/24] [Chinese] EMNLP2020 Tutorial Notes (Topic: Explainable AI). [Link]

📜 References

You can find my full CV and an English Transcript here (Latest update: March 1st, 2025).