Summary of Study ST001430

This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR000918. The data can be accessed directly via its Project DOI: 10.21228/M81H58. This work is supported by NIH grant U2C-DK119886.

See: https://www.metabolomicsworkbench.org/about/howtocite.php

This study contains a large results data set, which is not included in the mwTab file. It is only available for download via FTP as data file(s).


Study ID	ST001430
Study Title	Metabolic dynamics and prediction of gestational age and time to delivery in pregnant women
Study Summary	Metabolism during pregnancy is a constantly changing yet precisely programmed process, the failure of which may have devastating consequences for the fetus. To capture in high resolution the sequence of metabolic events underlying normal human pregnancy, we carried out an untargeted metabolome investigation on 784 weekly blood samples collected from 30 Danish pregnant women. The study revealed extensive metabolome alterations over the course of normal pregnancy: of 9,651 detected metabolic features, 4,995 were significantly changed (FDR < 0.05). Many metabolic changes were timed precisely according to pregnancy progression, so that the overall metabolic profile demonstrated a highly choreographed pattern. Using machine-learning methods, we were able to build a linear model with five metabolites (four steroids and one phospholipid) that predicts gestational age with high accuracy (Pearson correlation coefficient, R = 0.95).
Institute	Stanford University
Laboratory	Snyder lab
Last Name	Liang
First Name	Liang
Address	Alway M339, 300 Pasteur Drive, Palo Alto, California, 94305, USA
Email	liangtro@stanford.edu
Phone	+1 8167852490
Submit Date	2019-08-30
Raw Data Available	Yes
Raw Data File Type(s)	mzXML
Analysis Type Detail	LC-MS
Release Date	2020-07-24
Release Version	1


Combined analysis:

Analysis ID	AN002391	AN002392
Analysis type	MS	MS
Chromatography type	Reversed phase	Reversed phase
Chromatography system	Thermo Dionex Ultimate 3000	Thermo Dionex Ultimate 3000
Column	Agilent Zorbax Eclipse Plus C18 (100 x 2.1mm, 1.8 um)	Agilent Zorbax Eclipse Plus C18 (100 x 2.1mm, 1.8 um)
MS Type	ESI	ESI
MS instrument type	Orbitrap	Orbitrap
MS instrument name	Thermo Q Exactive Plus Orbitrap	Thermo Q Exactive Plus Orbitrap
Ion Mode	POSITIVE	NEGATIVE
Units	peak area	peak area

MS:

MS ID:	MS002233
Analysis ID:	AN002391
Instrument Name:	Thermo Q Exactive Plus Orbitrap
Instrument Type:	Orbitrap
MS Type:	ESI
MS Comments:	MS acquisition. Metabolic extracts were analyzed by reversed-phase liquid chromatography (RPLC)-mass spectrometry (MS) in both positive and negative ionization modes. Thermo Q Exactive Hybrid Quadrupole-Orbitrap Plus and Q Exactive mass spectrometers (Xcalibur, Thermo Scientific, San Jose, CA, USA) were operated in full MS-scan mode for data acquisition (acquisition from m/z 500 to 2,000) with a scan rate of approximately 4 Hz and a resolution set at 30,000 (at m/z 400). The MS/MS spectra of the QC sample were acquired for the top 10 parent ions under two fragmentation energies (25 NCE and 50 NCE). The resulting mass spectra were exported into Progenesis QI software (Nonlinear Dynamics, Durham, NC, USA) for further processing. Section 1: Metabolomics Data Processing. Metabolomic features were extracted with a unique mass/charge ratio and retention time, then aligned and quantified with the Progenesis QI software (Nonlinear Dynamics, Durham, NC, USA, http://www.nonlinear.com/progenesis/qi/). Peak deconvolution was performed under default settings in Progenesis QI (source: Cell 181, 1680–1692.e1–e5, June 25, 2020). Acquired data were processed using an analysis pipeline written in R (https://www.R-project.org). Progenesis QI output was then processed by removing all metabolites that were quantified in less than 30% of the samples or had a median intensity of less than twofold signal over the noise threshold (S/N < 2). The noise threshold was estimated by using the median signal across all the blank runs (if no quantitation was reported in any of the blank runs, the feature was also included in the analysis, as it likely had good S/N characteristics). Then the data were log-transformed and normalized. For each run, the median of all features was centered to correct for variation in the sample amount.
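The filtering and per-run centering steps described above can be illustrated with a minimal Python sketch. This is not the authors' R pipeline. The function names, the dict-of-lists layout, the use of `None` for unquantified values, the base-2 logarithm, and taking the median over quantified values only are all assumptions made for illustration; peak areas are assumed to be strictly positive.

```python
import math
from statistics import median


def filter_features(intensities, blanks, min_detect=0.30, min_sn=2.0):
    """Keep features seen in enough samples with adequate signal over noise.

    intensities: dict feature -> list of sample peak areas (None = not quantified)
    blanks:      dict feature -> list of blank-run peak areas (None = not quantified)
    """
    kept = {}
    for feature, values in intensities.items():
        observed = [v for v in values if v is not None]
        if len(observed) < min_detect * len(values):
            continue  # quantified in fewer than 30% of the samples
        blank_obs = [b for b in blanks.get(feature, []) if b is not None]
        if blank_obs:
            noise = median(blank_obs)  # noise threshold from the blank runs
            if median(observed) < min_sn * noise:
                continue  # median intensity below twofold the noise threshold
        # No quantitation in any blank run: the feature is kept.
        kept[feature] = values
    return kept


def log_and_center_runs(intensities):
    """Log2-transform (assumed base), then subtract each run's median."""
    if not intensities:
        return {}
    features = list(intensities)
    n_runs = len(intensities[features[0]])
    logged = {
        f: [math.log2(v) if v is not None else None for v in intensities[f]]
        for f in features
    }
    for run in range(n_runs):
        column = [logged[f][run] for f in features if logged[f][run] is not None]
        if not column:
            continue
        shift = median(column)
        for f in features:
            if logged[f][run] is not None:
                logged[f][run] -= shift
    return logged
```

After centering, every run has a median log-abundance of zero, which is what removes differences in the amount of sample loaded.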
Then for each analyte, a linear correction was applied per batch to correct for any linear decrease or increase in abundance during the acquisition of a batch. In short, for each analyte and each batch, a linear model was fitted with the log-abundance of the analyte as the dependent variable and the acquisition number [run order (randomized)] as the independent variable. The model prediction was interpreted as an underlying drift in mass spectrometric sensitivity and subtracted from the analyte level to yield within-batch normalized abundances. Finally, for each analyte, the abundances were median centered by batch to correct for sensitivity differences between batches. The positive- and negative-mode features were then concatenated for downstream analysis. In total, 9,651 features were included in the final analysis. In addition, samples with more than 50% of their values missing were removed (one sample in total). The remaining missing values were imputed from the 10 nearest neighbors using the k-nearest neighbor algorithm (Altman, 1992). Note that Discovery and Test Set 1 were normalized together, while samples of Test Set 2 were normalized independently. We applied principal component analysis (PCA) to examine the overall distribution of the sample data (with all 9,651 features) and check the run quality. The gestational ages (based on first-trimester ultrasound measurements) were superimposed to facilitate the analysis. The vast majority of the samples separated into pre- and postpartum groups in the PCA space defined by the two components that explained the most variance (PC1 and PC2, Figure 1B), while two samples from the same subject (the last two in her collection, before and after childbirth) displayed irregular behavior in PCA and unsupervised clustering analysis. These two samples were treated as outliers and excluded from further analysis.
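The within-batch drift correction described above can be sketched as follows, for a single analyte. This is an illustrative Python version under stated assumptions, not the authors' R code: the function names and the `batches` layout are hypothetical, and an ordinary least-squares fit stands in for whatever linear-model routine was actually used.

```python
from statistics import mean, median


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        return 0.0, y_bar  # a single run order: no slope can be estimated
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def correct_batch(run_order, log_abundance):
    """Subtract the fitted drift, then median-center the batch."""
    slope, intercept = fit_line(run_order, log_abundance)
    residuals = [
        y - (intercept + slope * x) for x, y in zip(run_order, log_abundance)
    ]
    center = median(residuals)
    return [r - center for r in residuals]


def correct_analyte(batches):
    """batches: dict batch_id -> (run_order list, log_abundance list)."""
    return {b: correct_batch(x, y) for b, (x, y) in batches.items()}
```

Because the fit is done separately per analyte and per batch, a sensitivity trend in one batch does not influence the correction applied to another.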
We also performed partial least-squares discriminant analysis (PLS-DA) according to the categories of gestational age, using the mixOmics package.
Ion Mode:	POSITIVE
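
The sample removal and k-nearest-neighbor imputation mentioned in the MS comments above can be sketched as below. The record does not say which implementation was used or along which axis neighbors were searched, so this Python sketch assumes that neighbors are rows of a matrix, that distance is the root-mean-square difference over values both rows share, and that missing entries are stored as `None`. All names are hypothetical.

```python
import math


def drop_sparse_rows(rows, max_missing=0.5):
    """Remove rows with more than max_missing of their values missing."""
    return [r for r in rows if sum(v is None for v in r) <= max_missing * len(r)]


def knn_impute(rows, k=10):
    """Fill None entries with the column mean over the k nearest rows."""

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # nothing in common: never used as a neighbor
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            continue
        # Rank all other rows by distance to the current row.
        ranked = sorted(
            (distance(row, other), n) for n, other in enumerate(rows) if n != i
        )
        for j in missing:
            donors = [
                rows[n][j]
                for d, n in ranked
                if d < math.inf and rows[n][j] is not None
            ][:k]
            if donors:
                filled[i][j] = sum(donors) / len(donors)
    return filled
```

Donor values are always read from the original matrix, so an imputed value never feeds into another imputation.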

MS ID:	MS002234
Analysis ID:	AN002392
Instrument Name:	Thermo Q Exactive Plus Orbitrap
Instrument Type:	Orbitrap
MS Type:	ESI
MS Comments:	MS acquisition. Metabolic extracts were analyzed by reversed-phase liquid chromatography (RPLC)-mass spectrometry (MS) in both positive and negative ionization modes. Thermo Q Exactive Hybrid Quadrupole-Orbitrap Plus and Q Exactive mass spectrometers (Xcalibur, Thermo Scientific, San Jose, CA, USA) were operated in full MS-scan mode for data acquisition (acquisition from m/z 500 to 2,000) with a scan rate of approximately 4 Hz and a resolution set at 30,000 (at m/z 400). The MS/MS spectra of the QC sample were acquired for the top 10 parent ions under two fragmentation energies (25 NCE and 50 NCE). The resulting mass spectra were exported into Progenesis QI software (Nonlinear Dynamics, Durham, NC, USA) for further processing. Section 1: Metabolomics Data Processing. Metabolomic features were extracted with a unique mass/charge ratio and retention time, then aligned and quantified with the Progenesis QI software (Nonlinear Dynamics, Durham, NC, USA, http://www.nonlinear.com/progenesis/qi/). Peak deconvolution was performed under default settings in Progenesis QI. Acquired data were processed using an analysis pipeline written in R (https://www.R-project.org). Progenesis QI output was then processed by removing all metabolites that were quantified in less than 30% of the samples or had a median intensity of less than twofold signal over the noise threshold (S/N < 2). The noise threshold was estimated by using the median signal across all the blank runs (if no quantitation was reported in any of the blank runs, the feature was also included in the analysis, as it likely had good S/N characteristics). Then the data were log-transformed and normalized. For each run, the median of all features was centered to correct for variation in the sample amount.
Then for each analyte, a linear correction was applied per batch to correct for any linear decrease or increase in abundance during the acquisition of a batch. In short, for each analyte and each batch, a linear model was fitted with the log-abundance of the analyte as the dependent variable and the acquisition number [run order (randomized)] as the independent variable. The model prediction was interpreted as an underlying drift in mass spectrometric sensitivity and subtracted from the analyte level to yield within-batch normalized abundances. Finally, for each analyte, the abundances were median centered by batch to correct for sensitivity differences between batches. The positive- and negative-mode features were then concatenated for downstream analysis. In total, 9,651 features were included in the final analysis. In addition, samples with more than 50% of their values missing were removed (one sample in total). The remaining missing values were imputed from the 10 nearest neighbors using the k-nearest neighbor algorithm (Altman, 1992). Note that Discovery and Test Set 1 were normalized together, while samples of Test Set 2 were normalized independently. We applied principal component analysis (PCA) to examine the overall distribution of the sample data (with all 9,651 features) and check the run quality. The gestational ages (based on first-trimester ultrasound measurements) were superimposed to facilitate the analysis. The vast majority of the samples separated into pre- and postpartum groups in the PCA space defined by the two components that explained the most variance (PC1 and PC2, Figure 1B), while two samples from the same subject (the last two in her collection, before and after childbirth) displayed irregular behavior in PCA and unsupervised clustering analysis. These two samples were treated as outliers and excluded from further analysis.
We also performed partial least-squares discriminant analysis (PLS-DA) according to the categories of gestational age, using the mixOmics package.
Ion Mode:	NEGATIVE