Efficient Dimension-reduction Technique for the Joint Analysis of Correlated Phenotypes

Maxime Turgeon, Karim Oualkacha, Antonio Ciampi, Golsa Dehghan, Brent Zanke, Celia Greenwood, Aurélie Labbe

Background: Over the past decade, researchers have pointed out several limitations of the genome-wide association paradigm. In particular, in the presence of pleiotropy and correlated phenotypes, testing the association between genomic loci and phenotypes in a series of pairwise analyses that ignore the correlation can lead to low power. Principal components of explained variance (PCEV), formerly known as principal components of heritability (PCH) in the context of genotype data, has been proposed as a dimension-reduction technique that is similar in spirit to principal components analysis (PCA) but takes the genetic association into account when defining the principal components. However, with high-dimensional phenotypes, PCEV may be unstable, and current high-dimensional methods require parameters that are computationally expensive to tune.

Methods: Under the assumption that the variables can be partitioned into independent subsets, we present an efficient scheme for testing genetic associations that has good power and requires no tuning parameters. We investigate the sensitivity of our approach to violations of the assumption that the subsets are independent, and we also present a real-data application.
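To make the idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a single covariate and the usual formulation of PCEV as the linear combination of phenotypes that maximises the ratio of variance explained by the covariate to residual variance, which leads to a generalised eigenvalue problem. The block-wise step (one component per block, then PCEV applied again to those components) is an assumed reading of the partition-based scheme. The function names `pcev`, `block_pcev` and `r_squared`, and the simulated data, are hypothetical.

```python
import numpy as np

def pcev(Y, x):
    """Sketch of PCEV for one covariate x (n,) and phenotypes Y (n, p).

    Returns the weight vector and the resulting component Y @ w.
    Assumes n is comfortably larger than p so the residual
    cross-product matrix is positive definite.
    """
    n = Y.shape[0]
    X = np.column_stack([np.ones(n), x])           # intercept + covariate
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)      # column-wise least squares
    fitted = X @ B
    resid = Y - fitted
    fit_c = fitted - Y.mean(axis=0)                # centred fitted values
    V_model = fit_c.T @ fit_c                      # variance explained by x
    V_resid = resid.T @ resid                      # residual variance
    # Solve V_model w = lambda * V_resid w by whitening with a Cholesky factor.
    L = np.linalg.cholesky(V_resid)
    L_inv = np.linalg.inv(L)
    M = L_inv @ V_model @ L_inv.T
    _, evecs = np.linalg.eigh(M)                   # eigenvalues in ascending order
    w = L_inv.T @ evecs[:, -1]                     # leading generalised eigenvector
    return w, Y @ w

def block_pcev(Y, x, blocks):
    """Assumed block scheme: PCEV within each block, then PCEV on the results."""
    comps = np.column_stack([pcev(Y[:, idx], x)[1] for idx in blocks])
    return pcev(comps, x)[1]

def r_squared(c, x):
    """Squared correlation between a component and the covariate."""
    return np.corrcoef(c, x)[0, 1] ** 2

# Simulated example (hypothetical data): 200 samples, 40 phenotypes, 4 blocks.
rng = np.random.default_rng(0)
n, p = 200, 40
x = rng.binomial(2, 0.3, size=n).astype(float)     # genotype coded 0/1/2
Y = rng.normal(size=(n, p))
Y[:, :5] += 0.3 * x[:, None]                       # weak effect on 5 phenotypes
blocks = [np.arange(i, i + 10) for i in range(0, p, 10)]
comp = block_pcev(Y, x, blocks)
```

In this sketch, no matrix larger than a single block is ever inverted, which is why the partition keeps the computation tractable when the number of phenotypes is large. By construction, the squared correlation between `comp` and `x` is at least as large as that of any single phenotype.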

Results: We show that our proposed method is computationally faster than current competitors and is only weakly sensitive to violations of the independence assumption. Moreover, under certain scenarios, it has higher power than principal components regression to detect a true association.

Conclusions: Our approach can successfully and efficiently be used by researchers for joint analyses of high-dimensional, correlated phenotypes.

Key words: Dimension reduction; methylation; pleiotropy; high-dimensional data; multivariate phenotypes.