Xudong Wu
吴煦东
Ph.D. Student in Reinforcement Learning
The University of Hong Kong
About Me
I am a Ph.D. student (starting Fall 2025) at The University of Hong Kong, where I focus on the theoretical foundations and algorithmic development of reinforcement learning (RL) for Embodied AI and Large Language Models (LLMs).
I completed my undergraduate studies in Mathematics and Statistics at The University of Edinburgh, graduating with First-Class Honours. My academic background encompasses statistical learning theory, optimization, and applied probability, with a strong emphasis on machine learning methodologies.
Before that, I studied Information and Computing Science at Dalian University of Technology, where I built a solid foundation in mathematical modeling, computational methods, and probability theory.
Selected Research
Highlights from my research experience.
Dynamic Self-Rewarding for Medical Large Language Models
Developed a dynamic self-rewarding framework for aligning medical LLMs without human-annotated supervision. Integrated a two-tier judge system with ChatGPT-4o and ran multi-round Direct Preference Optimization (DPO) for adaptive reward modeling on domain-specific medical datasets.
A Comparative Study of Simulation-Based Inference Algorithms
Benchmarked three SBI algorithms (BayesFlow, SNL, and Affine Flow Matching (AFM)) on synthetic and real-world inference tasks. Demonstrated AFM's superiority in capturing spatial structure in high-dimensional Poisson–CAR disease mapping models.
Education
The University of Hong Kong
Ph.D. Student
Research: Reinforcement Learning, LLMs, Embodied AI
Advisor: Prof. Jiayu Chen · Co-advisors: Prof. Vaneet Aggarwal (Purdue), Prof. Wenjie Huang (HKU)
University of Edinburgh
BSc (Hons) in Mathematics and Statistics
First-Class Honours (Equivalent to 4.0/4.0 GPA)
Dalian University of Technology
BSc in Information and Computing Science
Average Score: 89.9/100