Zhihui Chen 陈致晖
Ph.D. Student in Artificial Intelligence, National University of Singapore

I am currently a first-year PhD student at the National University of Singapore, supervised by Prof. Mengling Feng.

Reviewer for NeurIPS, IJCAI, and ACM TIST, with over 10 papers reviewed.

Research Interests: Trustworthy and Agentic LLMs • Multimodal Intelligence in Healthcare

Looking for a Summer 2026 internship! 🙏



Education
  • National University of Singapore
    Ph.D. in Artificial Intelligence
    Jan. 2025 - present
  • The University of Hong Kong
    M.Sc. in Artificial Intelligence
    Sep. 2022 - Jul. 2024
  • The Chinese University of Hong Kong, Shenzhen
    B.Sc. in Statistics, Data Science Stream
    Sep. 2018 - May 2022
Experience
  • StepFun AI Intelligent Technology
    LLM Research Intern
    Jan. 2025 - Jun. 2025
  • The Chinese University of Hong Kong
    Research Staff
    Sep. 2024 - Jan. 2025
Honors & Awards
  • Full Ph.D. Scholarship, National University of Singapore
    2025
  • Outstanding College Graduate, CUHK-Shenzhen Harmonia College
    2022
  • Undergraduate Research Excellence Award, CUHK-Shenzhen
    2021
Selected Publications (view all)
DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains

Zhihui Chen, Kai He, Yucheng Huang, Yunxiao Zhu, Mengling Feng

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025

Detecting LLM-generated text in specialized and high-stakes domains like medicine and law is crucial for combating misinformation and ensuring authenticity. We propose DivScore, a zero-shot detection framework that uses normalized entropy-based scoring and domain knowledge distillation to robustly identify LLM-generated text in specialized domains. Experiments show that DivScore consistently outperforms state-of-the-art detectors, with 14.4% higher AUROC and 64.0% higher recall at a 0.1% false-positive-rate threshold.

Med-Banana-50K: A Large-Scale Cross-Modality Dataset for Medical Image Editing

Zhihui Chen, et al.

arXiv preprint 2025

Recent advances in multimodal large language models have enabled remarkable medical image editing capabilities. However, the research community's progress remains constrained by the absence of large-scale, high-quality, and openly accessible datasets built specifically for medical image editing under strict anatomical and clinical constraints. We introduce Med-Banana-50K, a comprehensive 50K-image dataset for instruction-based medical image editing spanning three modalities (chest X-ray, brain MRI, fundus photography) and 23 disease types. Our dataset is constructed by leveraging Gemini-2.5-Flash-Image to generate bidirectional edits (lesion addition and removal) from real medical images. What distinguishes Med-Banana-50K from general-domain editing datasets is our systematic approach to medical quality control: we employ LLM-as-Judge with a medically grounded rubric and history-aware iterative refinement of up to five rounds.

All publications
News
2025
[Dataset] We released Med-Banana-50K, a large-scale medical image editing dataset! 🎉
Nov 01
[EMNLP] One first-author paper accepted to EMNLP 2025. Finally! 🎉
Aug 20
[NUS] Started my PhD journey at NUS! Ready to debug my life and my code simultaneously 🐛
Jan 15
[Award] Received full PhD scholarship from NUS. Time to invest in more coffee ☕
Jan 01
2024
[HKU] Graduated from HKU with M.Sc. in AI. One step closer to teaching robots to take over the world (responsibly) 🤖
Jul 15