Showcase - Zhihui Chen

MiniMax Cowork Team Fellowship: Medical Foundation Model Development and Clinically Verifiable Agent Workflows

A compute-supported project from MiniMax, focused on turning long-context, multimodal, and Agent capabilities into medical foundation model development and clinically verifiable workflows.

Project focus

Medical foundation model development: long-context, multimodal, and Agent capabilities for healthcare workflows
Data flywheel: evaluation findings are converted into targeted cases, feedback signals, and iterative improvement loops
Clinically verifiable Agent workflows: outputs are structured around traceable evidence, review checkpoints, and reproducible decision paths

Grant support

Program	MiniMax Cowork Team Fellowship
Support	USD 4,500 compute grant
Direction	Medical foundation model development and clinically verifiable Agent workflows

Highlights

MiniMax Cowork Team Compute Grant Medical Foundation Models

MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning

A data-and-model framework for trustworthy medical deepfake detection, built around evidence-grounded reasoning, forgery localization, and a public demo stack.

What is included

ACL 2026 main conference paper on interpretable medical forgery detection
MedForge-90K dataset: 30K real images, 30K lesion implant forgeries, and 30K lesion removal forgeries
MedForge-Reasoner: a Qwen3-VL based detector using a Localize-then-Analyze reasoning pipeline
Interactive demo for medical image deepfake detection and reasoning visualization

Why it matters

Moves beyond black-box real/fake prediction to localized, evidence-grounded explanations
Targets realistic lesion implantation and removal risks in chest X-ray, brain MRI, and fundus images
Combines dataset, model, and demo into a single research artifact instead of a paper-only release

Public resources

Asset	Details
Paper	ACL 2026 Main Conference
Dataset	MedForge-90K, covering CT, MRI, and X-ray with 19 lesion types
Model	MedForge-Reasoner on Hugging Face
Demo	Online detector Space for interactive testing

Read Paper Try Demo Model Dataset Project Page

Highlights

ACL 2026 Demo + Model + Dataset

Med-Banana-50K: Large-Scale Medical Image Editing Dataset

An open medical image editing dataset with 50,635 successful edits and 37,822 failed attempts across three modalities and 23 disease types.

Key Features

50,635 successful edits across 3 medical imaging modalities
Chest X-ray: 12 pathology types (Pneumothorax, Pleural Effusion, etc.)
Brain MRI: 4 tumor types (Glioma, Meningioma, Pituitary)
Fundus photography: 7 disease types (Diabetic Retinopathy, Glaucoma, etc.)
Bidirectional editing: lesion addition and removal
LLM-as-Judge quality control with medically grounded rubric
37,822 failed attempts with full conversation logs for preference learning and alignment research

Dataset Statistics

Modality	Task	Diseases	Success	Failed
Chest X-ray	Add	12	9,854	7,971
Chest X-ray	Remove	12	10,667	4,750
Brain MRI	Add	4	4,536	8,630
Brain MRI	Remove	4	4,355	6,949
Fundus	Add	7	18,505	3,162
Fundus	Remove	7	2,718	6,360
Total		23+	50,635	37,822

📄 Read Paper 💻 View Code 📦 Dataset on Hugging Face

Open asset: Dataset, code, and paper are publicly available for medically grounded image editing research.

DivScore: Zero-Shot LLM Detection in Specialized Domains

A zero-shot detection framework for identifying LLM-generated text in specialized domains like medicine and law, using normalized entropy-based scoring and domain knowledge distillation.

Key Innovations

Zero-shot detection: No training data required for new domains
Normalized entropy scoring: Robust metric for specialized text
Domain knowledge distillation: Leverages domain-specific patterns
Cross-domain robustness: Tested on medical, legal, and financial texts

Performance Highlights

Metric	Improvement
AUROC	+14.4% vs. SOTA
Recall @ 0.1% FPR	+64.0% vs. SOTA
Zero-shot Capability	No training needed

Applications

Detecting AI-generated medical content to combat misinformation
Verifying authenticity of legal documents and contracts
Ensuring integrity in academic and scientific publishing
Quality control for financial reports and analysis

📄 Read Paper 💻 View Code

Published at

EMNLP 2025 Main Conference

Legal ASR Service: Whisper Large-v2 Deployment with Docker + FastAPI

A GPU-accelerated legal-domain speech-to-text service delivered for Haiwen & Partners LLP (HK), built on Whisper Large-v2 and packaged as a production serving stack.

Serving stack

Whisper Large-v2 with GPU acceleration for legal-domain audio
Docker + FastAPI serving pipeline for reproducible deployment
7.8% average WER on the delivered legal transcription workload

Highlights

Whisper Large-v2 Docker + FastAPI WER 7.8%

Quant Trading Agent: Autonomous LangChain Agent for HK Equities

An autonomous trading agent built at AQUMON on a LangChain architecture, orchestrating market analysis, signal generation, decision-making, and execution monitoring into a single closed loop for programmatic Hong Kong equity trading.

What it does

LangChain orchestration connecting market understanding, signal generation, strategy decision, and execution monitoring
Futu OpenAPI integration for real-time market-data streaming and automated order execution
Strategy validation loop to speed up iteration and deployment of programmatic strategies

Why it matters

Demonstrates agent orchestration and tool-calling against a real, latency-sensitive external API
End-to-end loop from perception to action, the same shape as agentic post-training environments

Highlights

LangChain Agent Futu OpenAPI AQUMON (HK)

Smart Word Agent: End-to-End ReAct Agent Harness for Document Workflows

A production agent harness that parses, edits, and re-formats documents from natural-language instructions, built on the ReAct paradigm with Kimi-K2 as the core LLM. Designed as a self-contained, shippable agent rather than a notebook demo.

Harness design

ReAct loop with tool-calling over document APIs: parsing, bulk formatting, table manipulation, attachment reasoning
Multi-document context with Kimi-K2 long-context handling for complex multi-section documents
Stateless agent loop with streaming responses for low-latency interaction
Zero-dependency shipping: packaged as a single-file portable executable for end users

Traction

1,000 downloads in the first week after release
100+ GitHub stars as an open-source project

GitHub

Highlights

ReAct Harness Kimi-K2 Core LLM 1K downloads / week 1

Projects 项目展示

MiniMax Cowork Team Fellowship: Medical Foundation Model Development and Clinically Verifiable Agent Workflows

Project focus

Grant support

MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning

What is included

Why it matters

Public resources

Med-Banana-50K: Large-Scale Medical Image Editing Dataset

Key Features

Dataset Statistics

DivScore: Zero-Shot LLM Detection in Specialized Domains

Key Innovations

Performance Highlights

Applications

Legal ASR Service: Whisper Large-v2 Deployment with Docker + FastAPI

Serving stack

Quant Trading Agent: Autonomous LangChain Agent for HK Equities

What it does

Why it matters

Smart Word Agent: End-to-End ReAct Agent Harness for Document Workflows

Harness design

Traction