MedForge
A data-and-model framework that detects medically plausible image forgeries through the MedForge-90K benchmark, localized evidence, expert-aligned reasoning, and a Localize-then-Analyze detector.
I research post-training for large language models (reward design, preference alignment with DPO/RLHF/RLVR, and verifier-in-the-loop evaluation) together with multimodal agent harnesses. My work spans RL post-training pipelines, agent trajectory and preference data, and open data infrastructure for alignment research, with high-stakes clinical AI as the primary application domain.
I am a PhD student in Artificial Intelligence at the National University of Singapore, supervised by Prof. Mengling Feng. My research centers on post-training methods for large language models — reward design, preference alignment (DPO/RLHF/RLVR), verifier-in-the-loop evaluation, and agent harness design. Healthcare is my primary application domain, not the boundary of my research.
Research Output: ACL 2026 & EMNLP 2025 Main; 3 open datasets with 100K+ Hugging Face downloads; preference-alignment data flywheels linking annotation, reward signals, and iterative post-training loops.
Industry: ByteDance (Singapore) Multimodal Agent RL Research Intern (2026); StepFun Speech-LLM Research Intern (2025); MiniMax Cowork Team Fellowship.
Academic Service: Reviewer for NeurIPS 2026, IJCAI, IEEE Affective Computing, ACM TIST; ACL 2026 Main Session Chair.
A compute-supported project connecting MiniMax long-context, multimodal, and Agent capabilities with medical foundation model development, data flywheels, and verifiable decision paths.
A production ReAct pipeline with Kimi-K2 as the core LLM: multi-document context, tool-calling over document APIs, and zero-dependency single-file deployment. More than 1,000 downloads in the first week after launch.
MedForge and Med-Banana produce structured preference pairs and failure logs that feed DPO/RLHF post-training pipelines, with Forgery-aware GSPO grounding model reasoning in visual evidence. The data flywheel turns evaluation findings into targeted training signals.
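As a rough illustration of how success and failure logs can become preference pairs, the sketch below groups attempts by instruction and pairs each successful output with each failed one. The record fields (`prompt`, `output`, `success`) and the function name are hypothetical and are not taken from the MedForge or Med-Banana releases; the `prompt`/`chosen`/`rejected` layout is simply a common convention for DPO-style training data.

```python
from collections import defaultdict
from itertools import product


def build_preference_pairs(records):
    """Pair every successful output (chosen) with every failed output
    (rejected) that shares the same prompt.

    Each record is assumed to be a dict with hypothetical keys:
    'prompt' (str), 'output' (str), 'success' (bool).
    """
    by_prompt = defaultdict(lambda: {"ok": [], "fail": []})
    for r in records:
        bucket = "ok" if r["success"] else "fail"
        by_prompt[r["prompt"]][bucket].append(r["output"])

    pairs = []
    for prompt, group in by_prompt.items():
        # Prompts with only successes or only failures yield no pairs.
        for chosen, rejected in product(group["ok"], group["fail"]):
            pairs.append(
                {"prompt": prompt, "chosen": chosen, "rejected": rejected}
            )
    return pairs
```

Prompts that have only successes or only failures contribute no pairs under this scheme; whether such cases should instead be paired across prompts is a design choice this sketch does not address.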
DivScore demonstrates zero-shot verifier design: a normalized-entropy scorer with theoretical guarantees that generalizes across medical and legal domains without domain-specific labels.
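For readers unfamiliar with the term, normalized entropy in its generic form is the Shannon entropy of a distribution divided by its maximum possible value, log(n), which maps it into [0, 1]. The sketch below shows only that generic quantity, averaged over per-token distributions; it is not the DivScore scoring rule, and the function names are illustrative assumptions.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a discrete distribution divided by log(n).

    Returns a value in [0, 1]: 0 for a one-hot distribution,
    1 for a uniform one. `probs` is assumed to sum to 1.
    """
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def mean_normalized_entropy(token_dists):
    """Average normalized entropy over a sequence of per-token
    next-token distributions (a hypothetical text-level summary)."""
    if not token_dists:
        raise ValueError("token_dists must be non-empty")
    return sum(normalized_entropy(d) for d in token_dists) / len(token_dists)
```

A uniform distribution scores 1 and a one-hot distribution scores 0, so the average gives a length-independent summary of how uncertain a model was over a passage.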
Smart Word Agent (ReAct, Kimi-K2 core) and an autonomous LangChain trading agent demonstrate end-to-end harness design, tool-calling, and orchestration; quantized serving work targets inference efficiency.

Zhihui Chen, Kai He, Qingyuan Lei, Bin Pu, Jian Zhang, Yuling Xu, Mengling Feng# (# corresponding author)
Annual Meeting of the Association for Computational Linguistics (ACL) 2026, Main Conference
As generative models improve, medical deepfakes that implant or remove lesions while staying visually plausible pose growing risks to clinical safety and the integrity of medical evidence. Most prior work reduces detection to binary real-vs-fake scoring with little insight into where manipulation occurs or why. We present MedForge, an interpretable framework that introduces MedForge-90K—the first large-scale explainable medical deepfake dataset spanning CT, MRI, and X-ray, covering 19 lesion types with forgeries from 10 state-of-the-art deepfake models, each paired with expert-guided localization and clinical-grade explanations—and MedForge-Reasoner, a detector trained with a Localize-then-Analyze chain-of-thought paradigm and Forgery-aware GSPO reinforcement learning. MedForge-Reasoner achieves state-of-the-art detection while producing localized, verifiable medical rationales.

Zhihui Chen, Kai He, Yucheng Huang, Yunxiao Zhu, Mengling Feng
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025, Main Conference
Detecting LLM-generated text in specialized and high-stakes domains like medicine and law is crucial for combating misinformation and ensuring authenticity. We propose DivScore, a zero-shot detection framework using normalized entropy-based scoring and domain knowledge distillation to robustly identify LLM-generated text in specialized domains. Experiments show that DivScore consistently outperforms state-of-the-art detectors, with 14.4% higher AUROC and 64.0% higher recall at 0.1% false positive rate threshold.
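The metric "recall at a fixed false positive rate" can be made concrete with a short sketch: pick the loosest threshold that keeps the share of flagged human-written texts within the budget, then measure how many LLM-generated texts exceed it. The function name and score convention (higher means more likely LLM-generated) are assumptions for illustration, not the paper's evaluation code.

```python
def recall_at_fpr(pos_scores, neg_scores, max_fpr):
    """Recall on positives at the loosest threshold whose false
    positive rate on negatives does not exceed `max_fpr`.

    Assumed convention: a sample is flagged when score > threshold,
    and higher scores mean "more likely LLM-generated".
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    neg_sorted = sorted(neg_scores, reverse=True)
    # Number of negatives we may flag without exceeding the budget.
    allowed_fp = int(max_fpr * len(neg_sorted))
    if allowed_fp < len(neg_sorted):
        # Strict '>' means at most `allowed_fp` negatives are flagged.
        threshold = neg_sorted[allowed_fp]
    else:
        threshold = float("-inf")
    flagged = sum(1 for s in pos_scores if s > threshold)
    return flagged / len(pos_scores)
```

At a budget as tight as 0.1%, at least 1,000 negative samples are needed before even a single false positive is permitted, which is why this metric is demanding for detectors.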

Zhihui Chen, et al.
arXiv preprint 2025
Recent advances in multimodal large language models have enabled remarkable medical image editing capabilities. However, the research community's progress remains constrained by the absence of large-scale, high-quality, and openly accessible datasets built specifically for medical image editing with strict anatomical and clinical constraints. We introduce Med-Banana-50K, a comprehensive dataset for instruction-based medical image editing spanning chest X-ray, brain MRI, and fundus photography across 23 disease types. The public release includes 50,635 successful bidirectional edits and 37,822 failed attempts with full conversation logs, enabling evaluation, preference learning, and alignment research for medically grounded image editing systems.