 ---
-title: "Exploring and Analyzing LC-MS data with Spectra and xcms"
+title: "Exploring and Analyzing LC-MS Data with Spectra and xcms"
 author:
   - name: "Philippine Louail, Johannes Rainer"
     affiliation: "Eurac Research, Bolzano, Italy; [email protected] github: jorainer"
@@ -210,8 +210,8 @@ spectra(data)

 ```

-The new version of *xcms* uses thus the more modern and flexible infrastructure
-for MS data analysis provided by the `r Biocpkg("Spectra")` package. While it is
+From version 4 on, *xcms* uses the more modern and flexible infrastructure for
+MS data analysis provided by the `r Biocpkg("Spectra")` package. While it is
 still possible and supported to use *xcms* together with the `r
 Biocpkg("MSnbase")` package, users are advised to switch to this new
 infrastructure as it provides more flexibility and a higher performance. Also,
@@ -252,6 +252,10 @@ fromFile(data) |>
     table()
 ```

+Such basic data summaries can be helpful for an initial quality assessment
+to potentially identify problematic data files with e.g. an unexpectedly low
+number of spectra.
+
 Besides the peak data (*m/z* and intensity values) also additional spectra
 variables (metadata) are available in a `Spectra` object. These can be listed
 using the `spectraVariables` function that we call on our example MS data below.
@@ -278,10 +282,7 @@ spectra(data) |>
     table()
 ```

-The present data set contains thus 1,862 spectra, all from MS level 1. Such
-basic data summaries can be helpful for a first initial quality assessment to
-potentially identify problematic data files with e.g. a unexpected low number of
-spectra.
+The present data set thus contains 1,862 spectra, all from MS level 1.

 We could also check the number of peaks per spectrum in the different data
 files. The number of peaks per spectrum can be extracted with the `lengths`
@@ -736,20 +737,21 @@ fls <- basename(fls)
 data <- readMsExperiment(fls, sampleData = pd)
 ```

-This, or similar, code would allow to create scripts to batch-perform an R-based
-centroiding.
+Thus, with a few lines of R code we performed MS data centroiding in R, which
+possibly gives us more, and better, control over the process and would also
+allow (parallel) batch processing.



 # Preprocessing of LC-MS data

 Preprocessing of (untargeted) LC-MS data aims at detecting and quantifying the
 signal from ions generated from all molecules present in a sample. It consists
-of the following 3 steps: chromatographic peak detection, alignment (also
-called retention time correction) and correspondence (also called peak
-grouping). The resulting matrix of feature abundances can then be used as an
-input in downstream analyses including data normalization, identification of
-features of interest and annotation of features to metabolites.
+of the following 3 steps: chromatographic peak detection, retention time
+alignment and correspondence (also called peak grouping). The resulting matrix
+of feature abundances can then be used as an input in downstream analyses
+including data normalization, identification of features of interest and
+annotation of features to metabolites.


 ## Chromatographic peak detection
@@ -891,10 +893,11 @@ plot(srn)
 We can observe some scattering of the data points around an *m/z* of 105.05 in
 the lower panel of the above plot. This scattering also decreases with
 increasing signal intensity (as for many MS instruments the precision of the
-signal increases with the intensity). To investigate the observed differences in
-*m/z* values for the signal of serine we below first subset the data to the
-first file and then restrict the *m/z* range further to values between 106.045
-and 106.055.
+signal increases with the intensity). To quantify the observed differences in
+*m/z* values for the signal of serine we restrict the data to a *bona fide*
+region with signal for the serine ion. Below we first subset the data to the
+first file and then restrict the *m/z* range to values between 106.045 and
+106.055.

 ```{r}
 #' Reduce the data set to signal of the [M+H]+ ion of serine
@@ -1054,14 +1057,15 @@ observed above (see also the documentation of the `refineChromPeaks` function
 for all possible refinement options).

 To fuse the wrongly split peaks in the second row, we use the
-`MergeNeighboringPeaksParam` algorithm and configure it to merge all
-chromatographic peaks with a similar *m/z* that are less than 8 seconds apart
-from each other on the retention time axis (parameter `expandRt = 4`; the
-distance tail to head of the peaks evaluated for merging should thus be less
-than `2 * expandRt`) and for which the signal (intensity) between the two peaks
-is higher than 75% of the smaller apex intensity of the two peaks (parameter
-`minProp = 0.75`). We below apply these settings on the EICs and evaluate the
-result of this post-processing.
+`MergeNeighboringPeaksParam` algorithm that merges chromatographic peaks that
+overlap in the *m/z* and retention time dimensions and for which the signal
+between them is higher than a certain value. We specify `expandRt = 4` to expand
+the retention time width of each peak by 4 seconds on each side and set
+`minProp = 0.75`. All chromatographic peaks with a tail-to-head distance in the
+retention time dimension of less than `2 * expandRt` and for which the
+intensity between them is higher than 75% of the lower (apex) intensity of the
+two peaks are thus merged. Below we apply these settings on the EICs and
+evaluate the result of this post-processing.

 ```{r}
 #' Define the setting for the peak refinement
@@ -1085,7 +1089,7 @@ data <- refineChromPeaks(data, param = mpp)
 ```


-## Alignment
+## Retention time alignment

 While chromatography helps to better discriminate between analytes it is also
 affected by variances that lead to shifts in retention times between measurement