Hi 👋, I’m Lang Gao (/læŋ ɡaʊ/), an undergraduate student in Computer Science and Technology at Huazhong University of Science and Technology (HUST), expected to graduate in July 2025.

I am currently a research associate and an incoming PhD student at MBZUAI, a great place for research. I’m fortunate to be supervised by Prof. Xiuying Chen, an outstanding rising star and a truly supportive mentor.

💡 What I want to do

  • Mechanistic Interpretability: Interpret the behaviors of LLMs, empirically or theoretically, and provide empirically or theoretically interpretable approaches to enhance them.
  • Reliable Application of AI (secondary): Explore the reliable application of machine learning models, particularly in biomedical domains.
    • Although I am not particularly drawn to trend-driven application work, I do not reject it: I see it as a platform to extend interpretability and demonstrate its usefulness (Useful XAI).

I’m always happy to connect with anyone interested in interpretability. It’s a field full of different sparks, and I’m eager to learn from new perspectives. Feel free to reach out!

⚙️ Skills

  • Deep learning frameworks such as Transformers and PyTorch.
  • Mechanistic Interpretability toolkits: NNsight, TransformerLens, SAELens.

📝 Publications

🧑‍🔬 Mechanistic Interpretability

Shaping the Safety Boundaries: Understanding and Defending Against Jailbreaks in Large Language Models

Lang Gao, Xiangliang Zhang, Preslav Nakov, and Xiuying Chen

“Interprets the common mechanisms of diverse LLM jailbreak attacks in activation space and proposes an efficient defense method.”

Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets

Wei Liu, Zhongyu Niu, Lang Gao, Zhiying Deng, Jun Wang, Haozhao Wang, Ruixuan Li

Code

“An interpretable, causal learning paradigm that avoids spurious correlations arising both from the data and from traditional self-interpretable models.”

👨‍🔧 Applications

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Yunfei Xie*, Ce Zhou*, Lang Gao*, Juncheng Wu*, Xianhang Li, Hong-Yu Zhou, Sheng Liu, Lei Xing, James Zou, Cihang Xie, and Yuyin Zhou (*: co-first authors)

Toolkit & Code

“A comprehensive, large-scale multimodal dataset for medical vision-language models.”

VulDetectBench: Evaluating the Deep Capability of Vulnerability Detection with Large Language Models

Yu Liu*, Lang Gao*, Mingxin Yang*, Yu Xie, Ping Chen, Xiaojin Zhang, and Wei Chen (*: co-first authors)

“A novel, comprehensive benchmark specifically designed to assess the code vulnerability detection capabilities of LLMs.”

Toolkit & Code

🧐 Service

  • 2025, Reviewer: ARR.

💼 Experiences

  • [10 / 2024 - now] MBZUAI, Research Intern (Supervisor: Prof. Xiuying Chen, topic: Mechanistic Interpretability of LLMs)
  • [07 / 2024 - 10 / 2024] University of Notre Dame, Research Intern (Supervisor: Prof. Xiangliang Zhang, topic: LLMs for Bayesian Optimization)
  • [01 / 2024 - 06 / 2024] UC Santa Cruz, Research Intern (Supervisor: Prof. Yuyin Zhou, topic: Vision-language models for healthcare)
  • [10 / 2023 - 12 / 2023] HUST (Supervisor: Prof. Ruixuan Li, topic: Interpretable deep learning frameworks)

    💬 I am deeply grateful to all the mentors and collaborators who have guided and supported me along the way. Your encouragement, trust, and inspiration have made all the difference in my journey.

📖 Education

09 / 2021 - 07 / 2025: B.E. (expected), Huazhong University of Science and Technology (HUST)

🧩 Miscellaneous

📚 Resources

Insights

  • Book: Interpretability in Deep Learning [Link]
  • Book: Interpretable Machine Learning [Link]
  • Book: Trustworthy Machine Learning [Link]
  • Book: 大语言模型 (Large Language Models, a Chinese-language book) [Link]
  • Article: The Bitter Lesson [Link]
  • Article: The Urgency of Interpretability [Link]

Blogs

  • [05/24] [Chinese] National Undergraduate Innovation Project Documentation. [Link]
  • [03/24] [Chinese] Negative Transfer. [Link]
  • [03/24] [Chinese] Mixture of Experts Explained. [Link]
  • [01/24] [Chinese] EMNLP2020 Tutorial Notes (Topic: Explainable AI). [Link]

Other Stuff

I also like photography. Sometimes I take good photos by accident, so I might upload a few here someday, along with some unnecessary commentary. Feel free to pretend you’re looking forward to it. 🙃