This repository contains code that applies the EP-means clustering algorithm to the distributions of individual CpGs in DNA methylation (DNAm) data.
DNAm values at individual CpGs often do not follow a Gaussian distribution, which can be problematic for the linear regression models commonly used in epigenome-wide association studies (EWAS). Here we map the actual distribution patterns of CpGs across two cohorts and two arrays (450K and EPICv1). We then evaluate how these patterns relate to biological and technical factors.
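
To give a feel for what clustering CpGs by distribution shape means, here is a minimal, illustrative sketch in Python. It is not the code in this repository and is not a faithful EP-means implementation: it assumes each CpG can be summarised by a vector of empirical quantiles, approximates the 1-D earth mover's distance as the mean absolute gap between quantile vectors, and uses a simple k-means-style loop with averaged quantile vectors as centroids. All function names (`quantile_profile`, `emd`, `cluster_distributions`) and the simulated beta values are hypothetical.

```python
import random


def quantile_profile(values, n_quantiles=20):
    """Summarise one CpG's beta values as evenly spaced empirical quantiles."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[round(i * last / (n_quantiles - 1))] for i in range(n_quantiles)]


def emd(p, q):
    """Approximate 1-D earth mover's distance between two quantile profiles."""
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)


def cluster_distributions(profiles, k, n_iter=20):
    """k-means-style clustering of quantile profiles (simplified stand-in)."""
    # Deterministic farthest-point initialisation, starting from the first profile.
    centroids = [profiles[0]]
    while len(centroids) < k:
        centroids.append(max(profiles, key=lambda p: min(emd(p, c) for c in centroids)))

    labels = None
    for _ in range(n_iter):
        # Assign each profile to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: emd(p, centroids[c])) for p in profiles]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update each centroid as the pointwise mean of its members' quantiles.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(col) for col in zip(*members)]
    return labels, centroids


def clamp(x):
    """Keep simulated beta values inside [0, 1]."""
    return min(1.0, max(0.0, x))


# Simulated data (assumption): 10 unimodal CpGs and 10 bimodal CpGs, 200 samples each.
rng = random.Random(0)
unimodal = [[clamp(rng.gauss(0.5, 0.05)) for _ in range(200)] for _ in range(10)]
bimodal = [
    [clamp(rng.gauss(0.1, 0.03)) for _ in range(100)]
    + [clamp(rng.gauss(0.9, 0.03)) for _ in range(100)]
    for _ in range(10)
]

profiles = [quantile_profile(cpg) for cpg in unimodal + bimodal]
labels, centroids = cluster_distributions(profiles, k=2)
print(labels)
```

On this toy data the unimodal and bimodal CpGs separate cleanly because their quantile profiles differ far more between groups than within them. Real array data will be noisier, and the number of clusters and the centroid rule would need to follow the actual EP-means procedure used in this repository.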