Bioinformatic approaches to processing and annotation of high-resolution mass spectrometry data

Ralf J.M. Weber*, Mark R. Viant

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter


Biology has become an increasingly data-rich subject. The growing volume of biological data harvested using high-throughput “omics” technologies (e.g., genomics, transcriptomics, proteomics, and metabolomics) has necessitated the development of new and improved bioinformatic methods for data processing, analysis, and storage (27). Metabolomics refers to the study of endogenous, low-molecular-weight metabolites and is used to examine biological systems ranging from single cells to tissues, organs, or even whole organisms (11). Mass spectrometry (MS) has proven to be an indispensable analytical tool to measure and study metabolites (11). This chapter presents several methods to collect, process, and annotate high-resolution (HR) MS data. Data processing and putative metabolite identification are the main focus of this chapter; data collection is included to provide context for the reader (Figure 7.1; for chemometric methods, see Chapter 12). Fourier transform (FT) MS, represented by Fourier transform ion cyclotron resonance (FT-ICR) MS and Orbitrap technologies, has become an important method for the measurement of metabolites (11). Of all commercial MS platforms, FT-ICR enables mass measurements with the highest mass accuracy (i.e., <1 part per million [ppm]) and ultra-high resolution (i.e., ≥100,000 m/∆m, full width at half maximum). However, not even these specifications fulfill the requirements to measure and identify the complete metabolome (i.e., the collection of all metabolites within a cell). Southam et al. (21) successfully developed a spectral stitching approach, named selective ion monitoring (SIM) stitching, that improves MS specifications further by increasing the dynamic range (and hence the metabolome coverage) without degrading the mass accuracy (see section 2.2.1: Selective Ion Monitoring Stitching).
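The two figures of merit quoted above, mass accuracy in ppm and resolution as m/∆m at full width at half maximum, follow from simple ratios. The following is a minimal sketch of those two calculations, not code from the chapter; the function names and the example values in the comments are illustrative assumptions.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (ppm).

    A positive value means the measured m/z is higher than the theoretical one.
    """
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def resolving_power(mz: float, fwhm: float) -> float:
    """Resolution expressed as m/(delta m), where delta m is the peak's
    full width at half maximum (FWHM), in the same units as mz."""
    return mz / fwhm


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the arithmetic:
    # a peak measured 0.0001 m/z units above a theoretical m/z of 200.
    print(ppm_error(200.0001, 200.0))
    # A peak at m/z 400 with an FWHM of 0.004 m/z units.
    print(resolving_power(400.0, 0.004))
```

With these hypothetical inputs, the first peak lies about 0.5 ppm from its theoretical value and the second corresponds to a resolution of about 100,000, which are the orders of magnitude the text associates with FT-ICR instruments.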
Raw data recorded using HR FTMS in combination with the SIM stitching approach consists of multiple preprocessed files and associated transient (i.e., time-domain) files (see section 2.3: From Raw Spectral Data to Processed Mass Spectra). Each preprocessed file, usually produced by a commercial mass spectrometer, corresponds to a specific sample and includes detailed information regarding the MS experiment that has been performed (e.g., acquisition parameters, external calibration, and mass spectral data). Several commercial (e.g., Xcalibur) and open-source software packages (e.g., MZmine 2 (18), XCMS (20), and Midas (9)) are available to the user to process the preprocessed files or to convert them into more general formats (e.g., mzXML (15) or a peak list, i.e., mass-to-charge ratio [m/z] values and associated intensities).
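A peak list, as described above, is simply a set of m/z values with associated intensities. The sketch below assumes a hypothetical plain-text layout (one peak per line, m/z then intensity, separated by whitespace, with optional `#` comment lines); it is not the export format of any of the packages named above. It also shows how a peak might be looked up within a ppm tolerance, which is the basic operation behind putative annotation.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def parse_peak_list(text: str) -> List[Peak]:
    """Parse a whitespace-separated peak list (assumed layout: m/z intensity)."""
    peaks: List[Peak] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        mz_str, intensity_str = line.split()[:2]
        peaks.append((float(mz_str), float(intensity_str)))
    return sorted(peaks)  # sorted by m/z, then intensity


def peaks_within_ppm(peaks: List[Peak], target_mz: float,
                     tol_ppm: float = 1.0) -> List[Peak]:
    """Return the peaks whose m/z lies within tol_ppm of target_mz."""
    window = target_mz * tol_ppm * 1e-6  # tolerance converted to m/z units
    return [p for p in peaks if abs(p[0] - target_mz) <= window]


if __name__ == "__main__":
    # Hypothetical peak list; the values are illustrative only.
    example = """
    # m/z        intensity
    100.0000     1500
    200.0001     3200
    200.0100     50
    """
    peaks = parse_peak_list(example)
    print(peaks_within_ppm(peaks, target_mz=200.0, tol_ppm=1.0))
```

The tolerance is expressed in ppm rather than in absolute m/z units so that the matching window scales with mass, in line with how mass accuracy is reported for these instruments.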

Original language: English
Title of host publication: Methodologies for Metabolomics
Subtitle of host publication: Experimental Strategies and Techniques
Publisher: Cambridge University Press
Number of pages: 15
ISBN (Electronic): 9780511996634
ISBN (Print): 9780521765909
Publication status: Published - 1 Jan 2010

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)

