Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible
Figure 3
Examples of overdispersion in microbiome data.
Common-Scale Variance versus Mean for Microbiome Data. Each point in each panel represents a different OTU's mean/variance estimate for a biological replicate and study. The data in this figure come from the Global Patterns survey [48] and the Long-Term Dietary Patterns study [75], with results from additional studies included in Protocol S1. (Right) Variance versus mean abundance for rarefied counts. (Left) Common-scale variances and common-scale means, estimated according to Equations 6 and 7 from Anders and Huber [13], implemented in the DESeq package (Text S1). The dashed gray line denotes the σ² = μ case (Poisson; φ = 0). The cyan curve denotes the fitted variance estimate using DESeq [13], with method = ‘pooled’, sharingMode = ‘fit-only’, fitType = ‘local’.
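As a rough illustration of what the dashed σ² = μ line separates, the following minimal sketch simulates counts and compares per-OTU variance to the mean. It is not the DESeq estimator used for the figure. It assumes the Negative Binomial variance form σ² = μ + φμ² and uses a simple method-of-moments estimate of φ. The function names (`mean_variance`, `moment_dispersion`), the simulated means, and the value φ = 0.5 are hypothetical choices made for the example, not values from the studies.

```python
import numpy as np


def mean_variance(counts):
    """Per-OTU mean and sample variance; rows are OTUs, columns are replicates."""
    counts = np.asarray(counts, dtype=float)
    return counts.mean(axis=1), counts.var(axis=1, ddof=1)


def moment_dispersion(mean, variance):
    """Method-of-moments phi, assuming variance = mean + phi * mean**2.

    This is an illustrative estimator, not the one implemented in DESeq.
    OTUs with zero mean get NaN because phi is undefined there.
    """
    mean = np.asarray(mean, dtype=float)
    variance = np.asarray(variance, dtype=float)
    phi = np.full(mean.shape, np.nan)
    positive = mean > 0
    phi[positive] = (variance[positive] - mean[positive]) / mean[positive] ** 2
    return phi


# Simulated data (assumed values, chosen only for illustration).
rng = np.random.default_rng(0)
n_otus, n_reps = 200, 50
true_mean = rng.uniform(5, 500, size=n_otus)
true_phi = 0.5

# Poisson counts: variance equals the mean, so phi should be near 0.
poisson_counts = rng.poisson(true_mean[:, None], size=(n_otus, n_reps))

# Negative Binomial counts as a gamma-Poisson mixture:
# gamma shape 1/phi and scale mean*phi give mean mu and variance phi*mu^2,
# so the mixed counts have variance mu + phi*mu^2.
rates = rng.gamma(
    shape=1.0 / true_phi,
    scale=true_mean[:, None] * true_phi,
    size=(n_otus, n_reps),
)
nb_counts = rng.poisson(rates)

poisson_mean, poisson_var = mean_variance(poisson_counts)
nb_mean, nb_var = mean_variance(nb_counts)

poisson_phi = moment_dispersion(poisson_mean, poisson_var)
nb_phi = moment_dispersion(nb_mean, nb_var)

print("median phi, Poisson counts:", np.median(poisson_phi))
print("median phi, overdispersed counts:", np.median(nb_phi))
print("fraction of overdispersed OTUs above the variance = mean line:",
      np.mean(nb_var > nb_mean))
```

On a log-log plot of variance against mean, the Poisson counts would scatter around the dashed line, while the overdispersed counts would sit above it and diverge further as the mean grows, which is the pattern the caption describes.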