Abstract: Over the last decade, the field of neuroscience has seen the emergence of machine learning methods for the analysis of neuroimaging data. Unlike univariate methods, which consider voxels one by one, these techniques analyse relationships between several voxels and can detect multivariate patterns. In the context of neurodegenerative diseases, such as Alzheimer’s disease (AD), they can be used to design a diagnosis system and to find in neuroimages the patterns responsible for the disease. The context of the work presented here is thus the field of pattern recognition with neuroimaging. Our objective is to explore the possibilities that tree ensemble methods, such as Random Forests, offer in this domain in general, and in AD research in particular. These methods are well suited to the needs of this domain, as they combine very good predictive performance with interpretable results in the form of variable importance scores. Our contributions include both methodological developments around tree ensemble methods and applications of these methods to real datasets. The methodological part of the thesis focuses on the analysis and improvement of Random Forest variable importance scores for neuroimaging problems. Typical datasets in this domain are of very high dimensionality (hundreds of thousands of voxels) and contain comparatively few samples (tens or hundreds of patients). Our first contribution is a theoretical and empirical analysis of how importance scores behave in such extreme settings, depending on the method parameters. We then propose several improvements to importance scores in such settings that take advantage of either the spatial structure between the features or a pre-defined partitioning of these features into groups. Finally, we address an issue with Random Forest importance scores: finding a threshold that separates truly relevant variables from irrelevant ones. 
For this purpose, we adapt several statistical methods proposed in the bioinformatics literature. These methods are ...