Abstract

In recent decades, the volume of data generated worldwide has grown exponentially, significantly accelerating advancements in machine learning. This explosion of data has led to an increased need for effective data exploration techniques, giving rise to a specialized field known as dimensionality reduction. Dimensionality reduction methods are used to transform high-dimensional data into a low-dimensional space (typically 2D or 3D), so that it can be easily visualized and understood by humans. Algorithms such as Principal Component Analysis (PCA), Multidimensional Scaling (MDS), and t-distributed Stochastic Neighbor Embedding (t-SNE) have become essential tools for visualizing complex datasets. These techniques play a critical role in exploratory data analysis and in interpreting complex models like Convolutional Neural Networks (CNNs). Despite their widespread adoption, dimensionality reduction techniques, particularly non-linear ones, often lack interpretability. This opacity makes it difficult for users to understand the meaning of the visualizations or the rationale behind specific low-dimensional representations. In contrast, the field of supervised machine learning has seen significant progress in explainable AI (XAI), which aims to clarify model decisions, especially in high-stakes scenarios. While many post-hoc explanation tools have been developed to interpret the outputs of supervised models, there is still a notable gap in methods for explaining the results of dimensionality reduction techniques. 

This research investigates how post-hoc explanation techniques can be integrated into dimensionality reduction algorithms to improve user understanding of the resulting visualizations. Specifically, it explores how interpretability methods originally developed for supervised learning can be adapted to explain the behavior of non-linear dimensionality reduction algorithms. It also examines whether integrating post-hoc explanations can enhance the overall effectiveness of data exploration. As these tools are intended for end users, we design and evaluate an interactive system that incorporates explanatory mechanisms. We argue that combining interpretability with interactivity significantly improves users' understanding of embeddings produced by non-linear dimensionality reduction techniques. In this research, we propose enhancements to an existing post-hoc explanation method that adapts LIME to t-SNE. We introduce a globally-local framework for fast and scalable explanations of t-SNE embeddings. Furthermore, we present a new approach that adapts saliency-map-based explanations to locally interpret the results of non-linear dimensionality reduction. Lastly, we introduce our interactive tool, Insight-SNE, which integrates our gradient-based explanation method and enables users to explore low-dimensional embeddings through direct interaction with the explanations.
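
To give a concrete flavour of the gradient-based explanations mentioned above, the minimal sketch below computes per-feature saliency for a single point of a two-dimensional embedding. It assumes a differentiable surrogate mapping g (a small, untrained PyTorch network standing in for a parametric approximation of the t-SNE mapping); the surrogate, the function name embedding_saliency, and the feature count are illustrative assumptions, not the actual implementation used in Insight-SNE.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    d = 50  # number of input features (illustrative)

    # Hypothetical differentiable surrogate g: R^d -> R^2, standing in for a
    # parametric approximation of the t-SNE mapping (untrained here, for brevity).
    g = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 2))

    def embedding_saliency(g, x):
        # Absolute gradient of each embedding coordinate w.r.t. each input feature.
        # Returns a (2, d) tensor: row k explains embedding coordinate k of x.
        x = x.clone().detach().requires_grad_(True)
        y = g(x)  # 2-D embedding of the single sample x
        rows = []
        for k in range(y.numel()):  # one backward pass per embedding coordinate
            grad_k, = torch.autograd.grad(y[k], x, retain_graph=True)
            rows.append(grad_k.abs())
        return torch.stack(rows)

    x = torch.randn(d)
    print(embedding_saliency(g, x).shape)  # torch.Size([2, 50])

Each row of the returned matrix indicates how sensitive one embedding coordinate is to each input feature, which is the kind of local, per-point information an interactive tool can surface when the user selects a point in the embedding.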

Jury

  • Prof. Wim Vanhoof - University of Namur, Belgium
  • Prof. Benoit Frénay - University of Namur, Belgium
  • Prof. Bruno Dumas - University of Namur, Belgium
  • Prof. John Lee - University of Louvain, Belgium
  • Prof. Luis Galarraga - University of Rennes, France

The public defense will be followed by a reception.

Registration is required.