A peaklet-based generic strategy for the untargeted analysis of comprehensive two-dimensional gas chromatography mass spectrometry data sets
Comprehensive two-dimensional gas chromatography mass spectrometry (GC × GC-MS) is a well-established key technology in analytical chemistry and is increasingly used in untargeted metabolomics. However, automated processing of large GC × GC-MS data sets remains a major bottleneck in untargeted, large-scale metabolomics. For this reason, we introduce a novel peaklet-based alignment strategy. The algorithm performs an untargeted, deterministic alignment by applying a density-based clustering procedure within a time-constrained similarity matrix. By exploiting the minimal 1D and 2D retention time shifts between peak modulations, the alignment is performed without peak merging, which also eliminates the need for linear or nonlinear retention time correction procedures. The approach is validated in detail using data from urine samples of a large human metabolomics study. The data were acquired on a Shimadzu GCMS-QP2010 Ultra GC × GC-qMS system and comprise 512 runs, including 312 study samples and 178 quality control sample injections, measured over a period of 22 days. The final result table consisted of 313 analytes, each detectable in at least 75 percent of the study samples. In summary, we present an automated, reliable and fully transparent workflow for the analysis of large GC × GC-qMS metabolomics data sets.
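To make the alignment idea more concrete, the following minimal Python sketch illustrates one way a density-based clustering within a time-constrained similarity matrix, followed by a 75 percent detection filter, could look. It is not the published implementation. The cosine similarity measure, the DBSCAN-style clustering rule, the tolerance and threshold values, the peaklet record layout (`sample`, `rt1`, `rt2`, `spectrum`) and all function names are assumptions chosen purely for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def similarity_matrix(peaklets, tol_rt1, tol_rt2):
    """Spectral similarity, left at 0 outside the retention time window."""
    n = len(peaklets)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            p, q = peaklets[i], peaklets[j]
            if (abs(p["rt1"] - q["rt1"]) <= tol_rt1
                    and abs(p["rt2"] - q["rt2"]) <= tol_rt2):
                sim[i][j] = sim[j][i] = cosine(p["spectrum"], q["spectrum"])
    return sim

def density_cluster(sim, min_sim, min_pts):
    """DBSCAN-style clustering; one label per peaklet, -1 marks noise.
    Peaklets are visited in index order, so the result is deterministic."""
    n = len(sim)
    neighbors = [[j for j in range(n) if sim[i][j] >= min_sim] for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # not dense enough to start a cluster
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop(0)
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border member
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # core peaklet: keep expanding
    return labels

def frequent_clusters(peaklets, labels, n_samples, min_fraction=0.75):
    """Keep clusters detected in at least min_fraction of the samples."""
    samples = {}
    for p, lab in zip(peaklets, labels):
        if lab != -1:
            samples.setdefault(lab, set()).add(p["sample"])
    return sorted(lab for lab, s in samples.items()
                  if len(s) / n_samples >= min_fraction)

# Toy data (invented): compound A in four samples, compound B in two,
# plus one A-like spectrum far outside the retention time window.
peaklets = [
    {"sample": "s0", "rt1": 10.0, "rt2": 1.00, "spectrum": [10, 0, 5]},
    {"sample": "s1", "rt1": 10.1, "rt2": 1.02, "spectrum": [9, 0, 5]},
    {"sample": "s2", "rt1": 9.9, "rt2": 0.98, "spectrum": [11, 1, 5]},
    {"sample": "s3", "rt1": 10.0, "rt2": 1.01, "spectrum": [10, 0, 4]},
    {"sample": "s0", "rt1": 20.0, "rt2": 2.00, "spectrum": [0, 8, 1]},
    {"sample": "s1", "rt1": 20.1, "rt2": 2.03, "spectrum": [0, 9, 1]},
    {"sample": "s2", "rt1": 30.0, "rt2": 1.00, "spectrum": [10, 0, 5]},
]
sim = similarity_matrix(peaklets, tol_rt1=0.5, tol_rt2=0.1)
labels = density_cluster(sim, min_sim=0.9, min_pts=2)
kept = frequent_clusters(peaklets, labels, n_samples=4)
print(labels, kept)
```

In this toy example the time constraint keeps the A-like peaklet at a first-dimension retention time of 30.0 out of the compound A cluster despite its matching spectrum, and the detection filter discards the compound B cluster because it appears in only two of the four samples. No retention time correction step is involved; the grouping relies only on the windowed similarities.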
Use and reproduction:
All rights reserved