Hi 👋, I’m Lang Gao (/læŋ ɡaʊ/), an undergraduate student in Computer Science and Technology at Huazhong University of Science and Technology (HUST), expected to graduate in July 2025.

I will join MBZUAI as a research assistant this autumn, focusing on the explainability of medical large language models.

I am currently seeking PhD opportunities. If you have any relevant opportunities or suggestions, please feel free to contact me. I would be glad to discuss potential collaborations.

💡 Research Interests

  • Large Language Models (LLMs)
    • Applications in various domains: LLMs for healthcare, security, and scientific research.
    • Data-Centric Solutions: constructing high-quality benchmarks and datasets to evaluate and improve LLMs on various tasks.
    • Multimodal Large Language Models (MLLMs).
  • Trustworthy and Explainable AI (XAI)
    • Building self-explaining deep-learning models and workflows that provide faithful and trustworthy predictions.

📖 Education

2021.09 - now: B.E. (expected), Huazhong University of Science and Technology (HUST)

Academic Performance

GPA: 4.28/5.00 (or 3.70/4.00 according to WES)

| Course | Result |
| --- | --- |
| Calculus | 97 |
| Software Engineering | 97 |
| Algorithmic Design & Analysis | 97 |
| Advanced Programming Language | 94 |
| Computer Vision | 94 |
| Principles of Imperative Computation | 94 |
| Operating System | 91 |
| Machine Learning | 91 |

Skills

  • Deep Learning Frameworks: Proficient in PyTorch and TensorFlow
  • Large Language Models: Proficient in prompt engineering (Chain-of-Thought, In-Context Learning, and few-shot learning) and fine-tuning techniques (PEFT, full-parameter training, and large-scale distributed training on server clusters), as well as DeepSpeed and Transformers
  • Strong Data Management and Processing Skills: deduplication, cleaning, formatting, and statistical analysis
  • Programming Languages: Proficient in Python, C, and C++; experienced with Linux environments

📝 Publications


[NeurIPS 2024] MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Yunfei Xie*, Ce Zhou*, Lang Gao*, Juncheng Wu*, Xianhang Li, Hong-Yu Zhou, Sheng Liu, Lei Xing, James Zou, Cihang Xie, and Yuyin Zhou (*: first co-authors)

Toolkit & Code


“A comprehensive, large-scale multimodal dataset for medical vision-language models.”


[NeurIPS 2024] VulDetectBench: Evaluating the Deep Capability of Vulnerability Detection with Large Language Models

Yu Liu*, Lang Gao*, Mingxin Yang*, Yu Xie, Ping Chen, Xiaojin Zhang, and Wei Chen (*: first co-authors)

“A comprehensive benchmark specifically designed to assess the code vulnerability detection capabilities of LLMs.”

Toolkit & Code

[NeurIPS 2024] Attacking for Inspection and Instruction: The Risk of Spurious Correlations in Even Clean Datasets

Wei Liu, Zhiying Deng, Zhongyu Niu, Lang Gao, Jun Wang, Haozhao Wang, and Ruixuan Li

“An improved interpretable causal model architecture that simultaneously avoids spurious correlations inherent in the data and those caused by insufficient training in traditional self-interpretable models.”

💼 Experience

  • [2024.07 - 2024.09] University of Notre Dame, Research Intern (Supervisor: Prof. Xiangliang Zhang; topic: LLMs for Bayesian optimization)
  • [2024.01 - 2024.06] UC Santa Cruz, Research Intern (Supervisor: Prof. Yuyin Zhou; topic: vision-language models for healthcare)
  • [2023.10 - 2023.12] HUST (Supervisor: Prof. Ruixuan Li; topic: interpretable deep learning frameworks)

🏆 Honors and Awards

  • National First Prize, RAICOM Robotics Developer Contest - CAIR Engineering Competition National Finals, 2024
  • National Second Prize, 15th China College Students’ Service Outsourcing Innovation and Entrepreneurship Competition, 2024
  • National Second Prize, The 5th Integrated Circuit EDA Design Elite Challenge (Deep Learning Track), 2023
  • National Third Prize, The 5th Global Campus Artificial Intelligence Algorithm Elite Competition, 2023
  • National Third Prize, iFlytek Developer Competition, NLP Track, 2023

📜 References

You can find my full CV and an English transcript here (latest update: Aug 14th).