Learning outcomes

This course is an introduction to some modern methods in statistics, with a particular focus on so-called "high-dimensional" data. At the end of the course, the student will have mastered the tools needed to analyse data for which "classical" methods (such as those seen in the multivariate statistics course) are no longer valid, or are not even available. The course will be divided into two parts, each relying on different mathematical tools. The topics for the two parts will be chosen, in consultation with the students, from the following list: (a) classification, (b) the study of high-dimensional (large p, small n) random vectors, (c) functional data analysis, (d) statistical depth.
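As a small illustration of why classical multivariate tools can become unavailable in the "large p, small n" setting, the sketch below simulates more variables than observations and checks that the sample covariance matrix is rank-deficient, so it cannot be inverted as many classical procedures require. The sample sizes, the seed and the Gaussian data are arbitrary choices made for this example, not part of the course material.

```python
import numpy as np

# Illustrative sizes only: fewer observations (n) than variables (p).
n, p = 20, 50
rng = np.random.default_rng(0)
X = rng.standard_normal((n, p))  # n observations of a p-dimensional vector

# Sample covariance matrix (p x p); columns of X are the variables.
S = np.cov(X, rowvar=False)

# After centring, the rank of S is at most n - 1, which is below p here,
# so S is singular and has no ordinary inverse.
rank = np.linalg.matrix_rank(S)
print(f"p = {p}, rank of sample covariance = {rank}")
```

Any method that relies on inverting this matrix, such as classical linear discriminant analysis, would need to be adapted in this regime.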

Goals

The objective of the course is to cover a wide range of methods for classification, non-parametric analysis, and the analysis of high-dimensional and functional data. At the end of the course, the student will be able to (i) recognize when and why the use of such methods is necessary; (ii) implement these methods on simulated and real data sets; (iii) justify their theoretical foundations.

Content

An outline table of contents for each of the potential strands is given below:
(a) Classification: discriminant analysis, tree and forest classification, logistic regression, neural networks.
(b) High-dimensional multivariate analysis: variable selection, sparsity, regression (such as ridge regression or LASSO), principal component analysis, classification (supervised or not), etc.
(c) Functional data analysis: probability distributions on functional spaces, functional regression, principal component analysis, functional time series.
(d) Statistical depth: notions of positional depths, implementation, general depths, scatter, depth-based classification.
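To give a flavour of strand (b), here is a minimal sketch of ridge regression in closed form, applied to simulated data with more variables than observations. The function name `ridge_fit`, the penalty value and the simulated design are hypothetical choices for illustration; the course may present the method differently.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge estimate: solve (X'X + lam * I) beta = X'y for beta."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Simulated example with p > n (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n, p = 30, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 2.0  # sparse signal: only the first five coefficients are non-zero
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# X'X has rank at most n < p, so ordinary least squares has no unique solution;
# adding lam * I with lam > 0 makes the linear system uniquely solvable.
beta_hat = ridge_fit(X, y, lam=1.0)
print(beta_hat[:5])
```

The penalty `lam` trades bias against variance; in practice it would typically be chosen by cross-validation rather than fixed in advance.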

Assessment method

The course will be assessed through a detailed student project, which will then be defended orally. The project will consist either of the analysis of a dataset (provided by the student or supplied, as desired), demonstrating technical, theoretical and practical mastery of the course material, or of the technical presentation of a research paper on the subject. The oral defence of the project will be public, after which the student will be asked various questions to test their understanding of the material. An oral exam will test the understanding of the course concepts. The final mark will be the arithmetic average of the two marks if both are greater than 10, and the lower of the two otherwise.
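The rule for the final mark can be sketched as follows. This is one reading of the rule, assuming that the condition applies to both marks and that "greater than 10" is strict; the function name `final_mark` is hypothetical, and the marking scale is not stated in this description.

```python
def final_mark(mark_a: float, mark_b: float) -> float:
    """Average the two marks if both exceed 10, otherwise return the lower one.

    Assumption: the threshold is strict, so a mark of exactly 10 does not
    qualify for averaging.
    """
    if mark_a > 10 and mark_b > 10:
        return (mark_a + mark_b) / 2
    return min(mark_a, mark_b)

print(final_mark(12, 14))  # both above 10: average
print(final_mark(8, 14))   # one mark not above 10: the lower mark
```

Under this reading, a strong result on one component cannot compensate for a mark of 10 or below on the other.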

Sources, references and any support material

All sources and references will be available on Webcampus. These will include the slides used in the course and their associated videos, the exercise sessions and their solutions, etc.

Language of instruction

English