LLM Post-Training & Agentic AI Research
Zhihui Chen (陈致晖)
Ph.D. Student in Artificial Intelligence, National University of Singapore

I research post-training for large language models, covering reward design, preference alignment (DPO/RLHF/RLVR), and verifier-in-the-loop evaluation, together with multimodal agent harnesses. My work spans RL post-training pipelines, agent trajectory and preference data, and open data infrastructure for alignment research, with high-stakes clinical AI as the primary application domain.

Post-Training & Alignment Research
Multimodal Agents · Clinical AI Applications
ByteDance (Singapore): Multimodal Agent RL Research Intern, 2026
StepFun: Speech-LLM Research Intern, 2025 · MiniMax Cowork Team Fellowship

I am a PhD student in Artificial Intelligence at the National University of Singapore, supervised by Prof. Mengling Feng. My research centers on post-training methods for large language models — reward design, preference alignment (DPO/RLHF/RLVR), verifier-in-the-loop evaluation, and agent harness design. Healthcare is my primary application domain, not the boundary of my research.

Research Output: ACL 2026 & EMNLP 2025 Main; 3 open datasets with 100K+ Hugging Face downloads; preference-alignment data flywheels linking annotation, reward signals, and iterative post-training loops.

Industry: ByteDance (Singapore) Multimodal Agent RL Research Intern (2026); StepFun Speech-LLM Research Intern (2025); MiniMax Cowork Team Fellowship.

Academic Service: Reviewer for NeurIPS 2026, IJCAI, IEEE Transactions on Affective Computing, ACM TIST; ACL 2026 Main Session Chair.

Curriculum Vitae
Publication Highlights
2
top-tier NLP main conference papers
ACL 2026 and EMNLP 2025 Main
88K+
preference-alignment training pairs
DPO/RLHF-ready data across MedForge and Med-Banana pipelines
1K+
agent downloads in week 1
Smart Word Agent: ReAct harness, Kimi-K2 core, single-file deployment
ByteDance + StepFun
LLM lab research internships
Multimodal Agent RL (ByteDance) and Speech-LLM data (StepFun)
MedForge
Interpretable medical deepfake detection

A data-and-model framework that detects, localizes, and explains medically plausible image forgeries using the MedForge-90K benchmark, expert-aligned reasoning, and a Localize-then-Analyze detector.

ACL 2026 Main · 19 lesion types · 10 deepfake models
MiniMax Cowork Team Fellowship
Medical foundation model development and clinically verifiable Agent workflows

A compute-supported project connecting MiniMax's long-context, multimodal, and Agent capabilities with medical foundation model development, data flywheels, and verifiable decision paths.

MiniMax Cowork Team · USD 4,500 compute grant · Medical Foundation Models
Smart Word Agent
ReAct agent harness for document workflows

A production ReAct pipeline with Kimi-K2 as the core LLM: multi-document context, tool-calling over document APIs, and zero-dependency single-file deployment. More than 1,000 downloads in week one.

ReAct Harness · Kimi-K2 Core LLM · 1K downloads / week 1 · 100+ GitHub stars
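
The ReAct pattern named above alternates model reasoning with tool calls until the model emits a final answer. The following is a minimal sketch of such a loop, not Smart Word Agent's actual implementation: the `scripted_llm` stand-in (used here in place of Kimi-K2), the `word_count` tool, and the `Action:` / `Final Answer:` text protocol are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict

# Hypothetical tool registry; a real harness would wrap document APIs here.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}

# Assumed text protocol: "Action: tool_name[argument]" or "Final Answer: ...".
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]", re.DOTALL)
FINAL_RE = re.compile(r"Final Answer:\s*(.*)", re.DOTALL)


def react_loop(llm: Callable[[str], str], question: str, max_steps: int = 5) -> str:
    """Alternate model output and tool observations until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            transcript += "Observation: could not parse an action\n"
            continue
        name, arg = action.group(1), action.group(2)
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"unknown tool {name!r}"
        # Feed the tool result back so the next model call can use it.
        transcript += f"Observation: {observation}\n"
    return "no answer within step budget"


def scripted_llm(transcript: str) -> str:
    """Stand-in for a real model: call the tool once, then answer from the observation."""
    observations = re.findall(r"Observation: (.*)", transcript)
    if not observations:
        return "Thought: I should count the words.\nAction: word_count[the quick brown fox]"
    return f"Thought: I have the count.\nFinal Answer: {observations[-1]}"


answer = react_loop(scripted_llm, "How many words are in 'the quick brown fox'?")
```

Swapping `scripted_llm` for a call to a hosted model would leave the loop unchanged, which is the main appeal of keeping the harness independent of the model behind it.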
Research Value
Post-Training & Alignment

Preference data and RL loops for multimodal LLMs

MedForge and Med-Banana produce structured preference pairs and failure logs that feed DPO/RLHF post-training pipelines, while Forgery-aware GSPO grounds model reasoning in visual evidence. The data flywheel turns evaluation findings into targeted training signal.

ACL 2026 Main · GSPO · 88K alignment-ready pairs
Verifier & Eval Design

Evidence-grounded evaluation under domain shift

DivScore demonstrates zero-shot verifier design: a normalized-entropy scorer with theoretical guarantees that generalizes across medical and legal domains without domain-specific labels.

EMNLP 2025 Main · +14.4% AUROC · +64.0% recall
Agents & Systems

Agent harness engineering and serving

Smart Word Agent (ReAct, Kimi-K2 core) and an autonomous LangChain trading agent demonstrate end-to-end harness design, tool-calling, and orchestration; quantized serving work targets inference efficiency.

1K+ downloads / week 1 · 100+ GitHub stars
Selected Papers & Open Research Assets
MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning

Zhihui Chen, Kai He, Qingyuan Lei, Bin Pu, Jian Zhang, Yuling Xu, Mengling Feng# (# corresponding author)

Annual Meeting of the Association for Computational Linguistics (ACL) 2026, Main Conference

As generative models improve, medical deepfakes that implant or remove lesions while staying visually plausible pose growing risks to clinical safety and the integrity of medical evidence. Most prior work reduces detection to binary real-vs-fake scoring with little insight into where manipulation occurs or why. We present MedForge, an interpretable framework that introduces MedForge-90K—the first large-scale explainable medical deepfake dataset spanning CT, MRI, and X-ray, covering 19 lesion types with forgeries from 10 state-of-the-art deepfake models, each paired with expert-guided localization and clinical-grade explanations—and MedForge-Reasoner, a detector trained with a Localize-then-Analyze chain-of-thought paradigm and Forgery-aware GSPO reinforcement learning. MedForge-Reasoner achieves state-of-the-art detection while producing localized, verifiable medical rationales.

DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains

Zhihui Chen, Kai He, Yucheng Huang, Yunxiao Zhu, Mengling Feng

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025, Main Conference

Detecting LLM-generated text in specialized and high-stakes domains like medicine and law is crucial for combating misinformation and ensuring authenticity. We propose DivScore, a zero-shot detection framework using normalized entropy-based scoring and domain knowledge distillation to robustly identify LLM-generated text in specialized domains. Experiments show that DivScore consistently outperforms state-of-the-art detectors, with 14.4% higher AUROC and 64.0% higher recall at a 0.1% false positive rate threshold.
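
Normalized entropy rescales the Shannon entropy of a probability distribution by its maximum possible value, the log of the number of outcomes, so that scores fall in [0, 1] regardless of vocabulary size. The sketch below computes only that generic quantity; it is not DivScore's actual scoring function, and the function name and toy distributions are illustrative assumptions.

```python
import math
from typing import Sequence


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of `probs` divided by log(len(probs)), giving a value in [0, 1]."""
    if len(probs) < 2:
        raise ValueError("need at least two outcomes")
    # Zero-probability outcomes contribute nothing to the entropy sum.
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


uniform = normalized_entropy([0.25, 0.25, 0.25, 0.25])  # maximally uncertain
peaked = normalized_entropy([0.97, 0.01, 0.01, 0.01])   # confident prediction
certain = normalized_entropy([1.0, 0.0, 0.0, 0.0])      # no uncertainty
```

The uniform case sits at the top of the range and the one-hot case at the bottom, with the peaked distribution in between.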

Med-Banana-50K: A Large-Scale Cross-Modality Dataset for Medical Image Editing

Zhihui Chen, et al.

arXiv preprint 2025

Recent advances in multimodal large language models have enabled remarkable medical image editing capabilities. However, the research community's progress remains constrained by the absence of large-scale, high-quality, and openly accessible datasets built specifically for medical image editing with strict anatomical and clinical constraints. We introduce Med-Banana-50K, a comprehensive dataset for instruction-based medical image editing spanning chest X-ray, brain MRI, and fundus photography across 23 disease types. The public release includes 50,635 successful bidirectional edits and 37,822 failed attempts with full conversation logs, enabling evaluation, preference learning, and alignment research for medically grounded image editing systems.
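
One way success and failure logs of this kind can feed preference learning is to pair a successful edit (chosen) with a failed attempt (rejected) for the same instruction, using the prompt/chosen/rejected layout common in DPO training. The sketch below assumes hypothetical record fields (`instruction`, `output`, `success`) and toy values; it does not reflect the dataset's actual schema or the authors' pairing procedure.

```python
from collections import defaultdict
from typing import Dict, List


def build_preference_pairs(logs: List[dict]) -> List[Dict[str, str]]:
    """Pair each successful attempt with each failed attempt for the same instruction."""
    by_prompt = defaultdict(lambda: {"ok": [], "bad": []})
    for rec in logs:
        bucket = "ok" if rec["success"] else "bad"
        by_prompt[rec["instruction"]][bucket].append(rec["output"])

    pairs = []
    for prompt, group in by_prompt.items():
        for chosen in group["ok"]:
            for rejected in group["bad"]:
                pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


# Toy records with hypothetical field names and values.
toy_logs = [
    {"instruction": "add a nodule to the left lung", "output": "edit_a", "success": True},
    {"instruction": "add a nodule to the left lung", "output": "edit_b", "success": False},
    {"instruction": "remove the lesion", "output": "edit_c", "success": True},
]
pairs = build_preference_pairs(toy_logs)
```

Instructions with only successes or only failures yield no pair under this scheme, so the number of usable pairs depends on how often both outcomes exist for the same prompt.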

All publications
Background
Education
  • National University of Singapore
    Ph.D. in Artificial Intelligence
    Jan. 2025 - present
  • The University of Hong Kong
    M.Sc. in Artificial Intelligence
    Sep. 2022 - Jul. 2024
  • The Chinese University of Hong Kong, Shenzhen
    B.Sc. in Statistics, Data Science Stream
    Sep. 2018 - May 2022
Experience
  • ByteDance (Singapore)
    Reward design, eval loops, trajectory & preference data for online/offline RL
    Multimodal Agent RL Research Intern
    Jun. 2026 - Nov. 2026
  • StepFun (阶跃星辰)
    Pre-training data system & governance for ASR/TTS/real-time speech; metadata schema
    Speech-LLM Research Intern
    Jan. 2025 - May 2025
  • AQUMON (Hong Kong)
    LangChain agent orchestration; Futu OpenAPI real-time data & automated execution
    Quant Trading Agent Intern
    May 2024 - Jul. 2024
  • NUS Business School
    Curriculum design; tutorials on PyTorch, HuggingFace, pre-training, SFT, RLHF
    Teaching Assistant, "Generative AI and LLM"
    Feb. 2024 - Apr. 2024
Honors & Awards
  • MiniMax Cowork Team Fellowship, USD 4,500 compute grant
    2026
  • Full Ph.D. Scholarship, National University of Singapore
    2025
  • Outstanding College Graduate, CUHK-Shenzhen Harmonia College
    2022
  • Undergraduate Research Excellence Award, CUHK-Shenzhen
    2021
Skills & Tech Stack
Post-Training & Alignment
  • RLHF / RLVR pipeline design
  • DPO and reward modeling / reward design
  • Preference & trajectory data, data flywheels
  • Verifier and eval-loop design
  • SFT curriculum (PyTorch / HuggingFace)
Agents & Orchestration
  • ReAct agent harness (multi-step, multi-doc)
  • LangChain orchestration & tool-calling
  • Structured output & failure attribution
  • Kimi-K2 / Qwen3-VL / GPT integration
Serving & Systems
  • llama.cpp · GGUF quantization
  • vLLM / SGLang
  • Docker · FastAPI serving
  • Whisper deployment (Large-v2)
Foundations
  • Python · PyTorch · HuggingFace Transformers
  • R · C++ · MATLAB · MySQL
  • Linux · Git
  • English (TOEFL 106, GRE 320) · Mandarin · Cantonese
Academic Service
  • Session Chair, ACL 2026 Main
    2026
  • Reviewer, NeurIPS 2026
    2026
  • Reviewer, IJCAI 2026
    2026
  • Reviewer, IEEE Transactions on Affective Computing
    2026
  • Reviewer, IEEE Transactions on Consumer Electronics
    2026
  • Reviewer, ACM TIST
    2026
  • Reviewer, ACL 2026
    2026
Recent Updates
News
2026
[ACL 2026] Honored to serve as a session chair—glad to help the conference run smoothly.
Jun 08
[MiniMax] Selected for the MiniMax Cowork Team Fellowship to support medical foundation model development and clinically verifiable Agent workflows.
Apr 18
[TikTok] Joining TikTok in 2026 as a PhD Intern on multimodal Agent post-training, focusing on training data, reward design, and evaluation loops.
Apr 17