Supervision of student projects:

Are you a year 3 student interested in a project? Please read the information on student projects (year 3).

For undergraduate and postgraduate taught students I offer thesis projects in areas related to statistics and data science.

Typically, the topic of the thesis is to study a recently proposed statistical or machine learning method and to apply it to data. The focus may vary from theory to computational experiments to applied analysis, depending on the interests of the student.

I also supervise long-term projects of postgraduate research students (PhD) - see current group members.

Current project students:

  1. Jack Hodgkinson (2023-24, BSc) - medical image analysis.
  2. Shumeng Lou (2023-24, MMath) - manifold learning.
  3. Xiaorong Gai (2023-24, MSc Data Science) - dimensionality reduction.

Recently supervised theses (completed):

BSc:

  1. Lucia Poyatos Baena (2023) - A comprehensive comparison of statistical and machine learning methods for medical image data analysis.
  2. Yuqi Jing (2023) - Deep generative modelling: a comparative review of normalising flows and diffusion models.
  3. Alice-Gabriela Stratula (2023) - Bayesian nonparametric clustering with Dirichlet process mixture models.
  4. Gaia Tortora (2022) - A comparative analysis of recommender systems models: K-nearest neighbour, singular value decomposition and alternating least squares.
  5. Julia Kaczmarczyk (2022) - Topic modelling using latent Dirichlet allocation.
  6. Peijie Zeng (2022) - Overview of statistical approaches in electroencephalography signal processing.
  7. Ivan Dewerpe (2021) - Non-linear dimensionality reduction methods: Uniform Manifold Approximation and Projection (UMAP) versus t-Distributed Stochastic Neighbor Embedding (t-SNE).
  8. Wenlin Chen (2020) - Variational auto-encoders with application to unsupervised representation learning.
  9. Wei Wang (2020) - Interpretable factor models and variational autoencoders.
  10. Konstantin Siroki (2019) - Advanced methods for image classification tasks.
  11. Yifan Yu (2019) - The classification performance of neural networks with images.

MMath:

  1. Maros Botond (2019-2020) - An introductory approach to reinforcement learning.
  2. Niall Garner (2018-2019) - Theoretical and empirical analysis of tree-based ensemble methods.

MSc Actuarial Science:

  1. Mudong Liu (2023) - Analysis of car insurance claims data using generalised linear models and neural networks.
  2. Jiyuan Huang (2023) - Car insurance claim prediction based on generalized linear models and random forest.
  3. Haoyu Zhai (2023, MSc) - Comparison of generalised linear models and neural networks for analysing insurance claim data.

MSc Statistics:

  1. Ashwag Alsedran (2022) - Nonlinear dimension reduction for data visualization: comparing PCA and t-SNE.
  2. Yue Tang (2022) - Comparison and analysis of statistical recommendation algorithms.
  3. Jiani Wu (2022) - Latent Dirichlet allocation for topic modelling.
  4. Yue Liu (2021) - Uniform manifold approximation and projection for dimensionality reduction.
  5. Zhen Huang (2021) - Comparative study on UMAP and other modern dimensionality reduction techniques.
  6. Zhipei Qin (2021) - Dimensionality reduction methods: t-distributed stochastic neighbor embedding versus principal component analysis.
  7. Tinglu Liu (2021) - Data visualisation and discovering low-dimensional subspaces using t-distributed stochastic neighbour embedding.
  8. Zejie Shu (2021) - Dimensionality reduction using t-distributed stochastic neighbor embedding.
  9. Mengyi Liao (2020) - Comparison study of the maximal information coefficient for measuring nonlinear dependence.
  10. Ruixin Song (2020) - Comparison of mutual information and maximal information coefficient for measuring nonlinear association.
  11. Jingyi Ren (2020) - Measuring nonlinear multivariate association by distance correlation.
  12. Huiwen Zheng (2019) - Investigation of statistical variable selection methods implemented in the MXM R package.
  13. Guoquing Sun (2019) - Feature selection via conditional likelihood.
  14. Chunli Zhou (2019) - Statistics based on distance correlation.
  15. Daoyu Zhu (2019) - Statistical analysis based on distance: an overview over energy statistics.
  16. Heeyeong Jung (2018) - Empirical Bayes approaches for estimating false discovery rates.
  17. Yanhong Liu (2018) - Entropy approach to information integration using linear and log-linear pools.
  18. Nan Ma (2018) - Empirical study of the maximal information coefficient and additional measures of dependence.
  19. Yuhao Tang (2018) - Investigating non-linear relationships with mutual information.

MSc Bioinformatics:

  1. Scott Ward (2017) - A study of latent factor models for analysing high dimensional omics data.