Xudong Wu

吴煦东

Incoming PhD in Reinforcement Learning @ University of Hong Kong (Fall 2025)

BSc Hons in Mathematics and Statistics @ University of Edinburgh

BSc in Information and Computing Science @ Dalian University of Technology


Email: X.Wu-97 [at] sms.ed.ac.uk (Primary) | xudongwu02 [at] gmail.com




About Me

I am an incoming Ph.D. student (Fall 2025) at The University of Hong Kong (HKU), where I will be working on theoretical and algorithmic advancements in reinforcement learning (RL), bandit algorithms, policy optimization, and multi-agent systems. My research focuses on the mathematical foundations and computational methodologies for sequential decision-making, with applications in AI and large-scale learning systems.

I am currently completing my undergraduate studies in Mathematics and Statistics at The University of Edinburgh. My academic background includes statistical learning theory, optimization, and applied probability, with a strong focus on machine learning methodologies. I am on track to achieve a first-class degree.

Prior to this, I studied Information and Computing Science at Dalian University of Technology, where I built a strong foundation in mathematical modeling, computational methods, and probability theory. My academic performance placed me in the top 5% of my cohort, with an average score of 89.2/100.

My current research interests build upon my previous experience in statistical modeling, Bayesian inference, and machine learning frameworks.

Education

The University of Hong Kong (HKU), Hong Kong SAR (Starting Sep. 2025)

  • Ph.D. in Reinforcement Learning and Optimization
  • Research areas: Theoretical RL, bandit algorithms, policy gradient algorithms, and RLHF

University of Edinburgh, Edinburgh, UK (Sep. 2023 – Present)

  • BSc (Hons) in Mathematics and Statistics
  • First-Class Honours expected (equivalent to 4.0/4.0 GPA)

Dalian University of Technology, Dalian, China (Sep. 2021 – Jun. 2023)

  • BSc in Information and Computing Science
  • Average score: 89.2/100, Rank: 10/195

Research Experience

University of Edinburgh, Edinburgh, UK (Jan. 18, 2025 - Present)

  • Position: Research Collaborator
  • Project: Self-Rewarding Model Replication & Qwen-7B Fine-Tuning
  • Research Focus:
    • Replicating the self-rewarding mechanism in reinforcement learning from human feedback (RLHF)
    • Fine-tuning the Qwen-7B model on an expert consultation database to enhance task-specific reasoning
    • Evaluating model performance across multiple NLP benchmarks (e.g., MMLU, TruthfulQA, GSM8K)
    • Exploring reward modeling techniques to improve alignment and generalization in large-scale language models
    • Investigating the trade-offs between self-generated rewards and human-labeled rewards in fine-tuning

University of Edinburgh, Edinburgh, UK (Aug. 2024 - May 2025)

  • Position: Honours Dissertation / Final Year Project
  • Supervisor: Dr. Amanda Lenzi
  • Research Focus:
    • Comparative analysis of simulation-based inference (SBI) algorithms
    • Evaluating neural network-based Bayesian parameter estimation in stochastic models
    • Applying amortized inference methods to reinforcement learning settings
    • Using Bayesian neural networks to improve sample efficiency in sequential decision-making

University of California, Irvine, USA (Jun. 2024 - Sep. 2024)

  • Position: Summer Research Assistant
  • Supervisor: Prof. Chen Li (IEEE Fellow)
  • Project: Optimizing Texera, a machine learning-based data analysis workflow platform
  • Responsibilities:
    • Integrated AI-driven automation for workflow optimization, enabling seamless machine learning pipeline execution
    • Developed an automated report generation system that converts data analysis workflows into structured insights
    • Enhanced Texera’s data cleaning and visualization capabilities to improve model interpretability

University of Edinburgh, Edinburgh, UK (Feb. 2024 - May 2024)

  • Position: Research Assistant in Uncertainty Economics
  • Research Focus:
    • Analyzed the economic impact of uncertainty during the COVID-19 pandemic using stochastic models
    • Implemented Monte Carlo simulations and Bayesian inference techniques for probabilistic estimation
    • Developed computational tools for visualizing uncertainty in economic forecasting

University of Edinburgh, Edinburgh, UK (Sep. 2023 - Dec. 2023)

  • Position: Project Leader in Differential Equations
  • Research Focus:
    • Developed a system of ordinary differential equations (ODEs) to model bacterial infections and antibiotic treatments
    • Used Fourier series analysis and Laplace transforms to predict bacterial resistance patterns
    • Optimized drug treatment schedules using numerical simulations

Dalian University of Technology, Dalian, China (Dec. 2022)

  • Position: Research Project in Mathematical Modeling
  • Research Focus:
    • Developed a population dynamics model incorporating fertility rate adjustments
    • Improved predictive accuracy by incorporating age-stratified birth rate variations
    • Performed parameter sensitivity analysis to optimize demographic forecasting

Academic Background

University of Edinburgh, Edinburgh, UK (Sep. 2023 - Present)

  • BSc in Mathematics and Statistics, First-Class Honours (Expected)
  • Mathematical & Optimization Courses:
    • Numerical Ordinary Differential Equations – Stability & convergence analysis, applied to RL optimization
    • Honours Differential Equations – Applied in stochastic dynamic programming & Bellman equations
    • Honours Complex Variables – Conformal mapping & contour integration, useful for stochastic control
    • Honours Analysis – Functional analysis and variational methods, applicable to policy optimization
  • Statistics & Probabilistic Modelling:
    • Financial Mathematics – Stochastic processes, Ito calculus, and Black-Scholes model
    • Stochastic Modelling – Markov chains, Poisson processes, Brownian motion, key concepts for RL
    • Statistical Computing – Monte Carlo methods, MCMC, and Bayesian inference
    • Statistical Methodology – Hypothesis testing, regression, and statistical decision theory
  • Programming & Computational Methods:
    • Applied Statistics – Hands-on experience with real-world data, Bayesian updating in ML
    • Python, C++, MATLAB – Numerical computing, algorithm development for RL applications

Dalian University of Technology, Dalian, China (Sep. 2021 - Jun. 2023)

  • BSc in Information and Computing Science, GPA: 3.92/4.0 (Top 5%)
  • Mathematical & Theoretical Foundations:
    • Mathematical Analysis (1-3) – Real & complex analysis, measure theory fundamentals
    • Ordinary Differential Equations – Theoretical and numerical solutions to dynamic systems
    • Abstract Algebra – Group theory, rings, and fields, applicable in cryptography & optimization
    • Complex Function Theory – Applications in function approximation & conformal mapping
  • Probability & Statistical Modelling:
    • Probability & Mathematical Statistics – Statistical inference, Bayesian learning, decision theory
    • Real Analysis – Measure theory and Lebesgue integration, foundation for stochastic processes
  • Computational & Algorithmic Skills:
    • C++ Programming – Algorithm design, object-oriented programming
    • Python Programming – Data structures, numerical computation, scientific computing
    • Mathematical Modelling – Applied techniques in RL-based decision-making and optimization

Honors and Awards

  • First-Class Scholarship - Dalian University of Technology, 2022-2023 (Top 5%)
  • Dual Degree Student Scholarship - University of Edinburgh, 2022-2023
  • International Study Scholarship - Dalian University of Technology, 2022-2023

Last update: 18 Feb 2025