Unsupervised Machine Learning for Exploratory Data Analysis in Imaging Mass Spectrometry

Unsupervised machine learning methods for exploratory data analysis in imaging mass spectrometry can be broadly divided into three main categories: **factorization methods**, **clustering methods**, and **manifold learning or non-linear dimensionality reduction techniques**.

In this review paper, we give an extensive overview of the wide range of unsupervised machine learning methods that have been applied to the analysis of Mass Spectrometry Imaging (MSI) data. Unlike many other molecular imaging technologies, MSI does not require prior tagging of molecular targets and can measure large numbers of ions concurrently in a single experiment. While this makes the technology particularly suited for exploratory analysis, it also leads to very large and complex datasets (gigabytes up to terabytes of raw data for a single experiment), making automated computational analysis indispensable.

Unsupervised machine learning methods are primarily targeted at exploring the content of the data and extracting its underlying trends in a mostly unbiased way. They are often the first step in gaining insight into an MSI dataset. A wide array of techniques has been used in the unsupervised analysis of MSI data, which can broadly be divided into three main categories: factorization methods, clustering methods, and manifold learning or non-linear dimensionality reduction techniques. In this work, we discuss the various machine learning methods in each category and provide a theoretical basis for each method, along with its specific use cases in MSI applications.
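To make the first two categories concrete, the following is a minimal sketch rather than a method taken from the review. It assumes an MSI dataset can be arranged as a pixels-by-m/z intensity matrix, and it uses a synthetic two-region dataset whose sizes and noise level are invented for illustration. The helper names (`make_synthetic_msi`, `pca_factorize`, `kmeans`) are hypothetical. Principal component analysis stands in for the factorization methods and plain k-means for the clustering methods. Manifold learning is left out because it usually relies on a dedicated library.

```python
import numpy as np


def make_synthetic_msi(height=20, width=30, n_mz=50, seed=0):
    """Build a toy MSI cube: two spatial regions, each with its own spectrum."""
    rng = np.random.default_rng(seed)
    spectra = rng.random((2, n_mz))          # one reference spectrum per region
    labels = np.zeros((height, width), dtype=int)
    labels[:, width // 2:] = 1               # right half of the image is region 1
    noise = 0.05 * rng.standard_normal((height, width, n_mz))
    return spectra[labels] + noise, labels


def pca_factorize(X, n_components=3):
    """Factorization: PCA via SVD of the mean-centered pixel-by-m/z matrix."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # one row per pixel
    loadings = Vt[:n_components]                      # one pseudo-spectrum per component
    return scores, loadings


def kmeans(X, k=2, n_iter=50, seed=0):
    """Clustering: basic Lloyd's algorithm with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every pixel to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        new_centers = np.array([
            X[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign, centers


cube, true_labels = make_synthetic_msi()
h, w, n_mz = cube.shape
X = cube.reshape(h * w, n_mz)                 # unfold the cube: pixels x m/z bins

scores, loadings = pca_factorize(X)
score_images = scores.T.reshape(-1, h, w)     # fold each component back into an image

assign, _ = kmeans(X, k=2)
segmentation = assign.reshape(h, w)           # cluster label per pixel
```

In this sketch, each row of `loadings` can be read as a pseudo-spectrum and the matching slice of `score_images` as its spatial distribution, while `segmentation` assigns every pixel to exactly one cluster. Real MSI data would normally need preprocessing such as normalization and peak picking before either step.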

This review aims to be an entry point for both (i) analytical chemists and mass spectrometry experts who want to explore computational techniques, and (ii) computer scientists and data mining specialists who want to enter the MSI field.

Publication details

Unsupervised Machine Learning for Exploratory Data Analysis in Imaging Mass Spectrometry

Nico Verbeeck (1, 2, 3), Richard M. Caprioli (4, 5, 6, 7, 8), Raf Van de Plas (1, 4, 5)

Mass Spectrometry Reviews 39:245-291, 2020


Author affiliations:

  1. Delft Center for Systems and Control, Delft University of Technology ‐ TU Delft, Delft, The Netherlands
  2. Aspect Analytics NV, Genk, Belgium
  3. STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
  4. Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN
  5. Department of Biochemistry, Vanderbilt University, Nashville, TN
  6. Department of Chemistry, Vanderbilt University, Nashville, TN
  7. Department of Pharmacology, Vanderbilt University, Nashville, TN
  8. Department of Medicine, Vanderbilt University, Nashville, TN