Predicting gene expression using DNA methylation in two human populations
A peer-reviewed article of this Preprint also exists.
Abstract
Background. DNA methylation, an important epigenetic mark, is well known for its regulatory role in gene expression, especially negative regulation in the promoter region. However, its correlation with gene expression at the population level has not been well studied. In particular, it is unclear whether the genome-wide DNA methylation profile of an individual can predict that individual's gene expression profile. Previous studies were mostly limited to association analyses between methylation at single CpG sites and gene expression. It is not known whether DNA methylation of a gene has enough predictive power to serve as a surrogate for gene expression in existing human study cohorts that have DNA samples but not RNA samples.
Results. We studied two human population datasets, adipose tissue from the Multiple Tissue Human Expression Resource (MuTHER) project and peripheral blood mononuclear cells (PBMC) from individuals with asthma and normal controls, predicting gene expression from the methylation of all CpG sites in the gene region. Three prediction models were investigated: single linear regression, multiple linear regression, and least absolute shrinkage and selection operator (LASSO) penalized regression. LASSO regression performed best among these methods. However, even with LASSO regression, the prediction R² was very small for the majority of genes, and only about one thousand genes had a prediction R² greater than 0.1. GO term and pathway analyses showed that these more predictable genes are enriched for immune and defense genes.
Conclusion. In human populations, DNA methylation of CpG sites in the gene region has weak predictive power for gene expression. The relatively more predictable genes tend to be defense and immune genes.
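The per-gene LASSO prediction summarized in the Results can be illustrated with a minimal sketch. It is not the authors' MethylXcan code: the data are simulated, every function and variable name is hypothetical, the LASSO is fitted with a plain cyclic coordinate descent, and "prediction R²" is assumed here to mean the squared Pearson correlation between predicted and observed expression in held-out individuals, a definition the abstract does not spell out.

```python
import random

def soft_threshold(z, gamma):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_fit(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]  # centered CpG columns
    resid = [yi - y_mean for yi in y]                             # residual at beta = 0
    col_ss = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # a constant CpG site carries no information
            # Covariance of site j with the partial residual (site j's own fit added back).
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new_bj
    intercept = y_mean - sum(b * m for b, m in zip(beta, col_mean))
    return intercept, beta

def predict(X, intercept, beta):
    return [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]

def prediction_r2(y_true, y_pred):
    """Squared Pearson correlation (assumed definition); 0.0 if either vector is constant."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    vt = sum((a - mt) ** 2 for a in y_true)
    vp = sum((b - mp) ** 2 for b in y_pred)
    if vt == 0.0 or vp == 0.0:
        return 0.0
    return cov * cov / (vt * vp)

def simulate(n, p, effects, noise_sd, rng):
    """Simulated methylation beta values in [0, 1] and one gene's expression."""
    X = [[rng.random() for _ in range(p)] for _ in range(n)]
    y = [sum(e * row[j] for j, e in effects.items()) + rng.gauss(0.0, noise_sd)
         for row in X]
    return X, y

# Hypothetical gene: 20 CpG sites, two of which affect expression (site 0 negatively).
rng = random.Random(0)
X, y = simulate(n=200, p=20, effects={0: -2.0, 5: 1.0}, noise_sd=0.3, rng=rng)
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

intercept, beta = lasso_fit(X_train, y_train, lam=0.01)
r2 = prediction_r2(y_test, predict(X_test, intercept, beta))
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("CpG sites kept by LASSO:", selected)
print("held-out prediction R2:", round(r2, 3))
```

The simulated signal here is deliberately strong, so the held-out R² is far higher than the values the study reports for most real genes. In practice the penalty `lam` would be chosen by cross-validation for each gene rather than fixed.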
Cite this as
2018. Predicting gene expression using DNA methylation in two human populations. PeerJ Preprints 6:e27055v1 https://doi.org/10.7287/peerj.preprints.27055v1
Author comment
This is a submission to PeerJ for review.
Additional Information
Competing Interests
Xiangqin Cui is an Academic Editor for PeerJ.
Author Contributions
Huan Zhong analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.
Soyeon Kim analyzed the data, authored or reviewed drafts of the paper, approved the final draft.
Degui Zhi conceived and designed the experiments, authored or reviewed drafts of the paper, approved the final draft.
Xiangqin Cui conceived and designed the experiments, authored or reviewed drafts of the paper, approved the final draft.
Data Deposition
The following information was supplied regarding data availability:
The data used in the study are publicly available, and our code is at https://github.com/dorothyzh/MethylXcan
Funding
DZ was partially supported by NIH Grant R01 HG008115; XC was partially supported by NIH 2P60AR048095. HZ was supported by Hong Kong Baptist University’s strategic development fund SDF15-1012-P04 to Y.X. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.