Rare-Variant Association Analysis and Meta-Analysis

With the development of sequencing technologies, sequencing-based association studies are increasingly being conducted to identify rare variants associated with complex traits. We developed methods for rare-variant association analysis, including the SKAT and SKAT-O tests. These methods are widely cited and have become a de facto standard in the field. We have extended them to meta-analysis, multiple phenotypes, family-based designs, longitudinal studies, and gene-environment (GxE) interaction tests.

Biobank Data and Phenome-Wide Association Studies 


Phenome-wide association studies (PheWAS) use electronic health records (EHRs) to derive phenotypes for thousands of disease statuses and carry out genetic association analysis for every phenotype. We have developed fast and accurate computational methods for binary phenotypes (fastSPA). We have since extended the method to adjust for sample relatedness even when the sample size is very large (SAIGE).

High-Dimensional Data Analysis

Principal component analysis (PCA) is a powerful tool for exploring the characteristics of high-dimensional data. In genome-wide association studies (GWAS), it is widely used to adjust for the confounding effect of population stratification. We have developed practical tools for GWAS and studied the theoretical properties of PCA in high-dimensional settings. We are extending these results to other high-dimensional methods, including surrogate variable analysis and partial least squares.
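As a rough illustration of how PCA can adjust for population stratification, the following minimal sketch simulates two populations with different allele frequencies and a phenotype that depends only on population membership, then compares a SNP effect estimate with and without the top principal component as a covariate. It is not one of our released tools: the function names (top_pcs, snp_effect), the simulation settings, and the use of ordinary least squares are all assumptions chosen for brevity.

```python
import numpy as np

def top_pcs(genotypes, k):
    """Top-k principal component scores of a samples-by-SNPs matrix (hypothetical helper)."""
    centered = genotypes - genotypes.mean(axis=0)
    scale = centered.std(axis=0)
    scale[scale == 0] = 1.0          # guard against monomorphic SNPs
    standardized = centered / scale
    u, s, _ = np.linalg.svd(standardized, full_matrices=False)
    return u[:, :k] * s[:k]

def snp_effect(y, snp, covariates=None):
    """OLS coefficient of one SNP, with an intercept and optional covariates (hypothetical helper)."""
    columns = [np.ones(len(y)), snp]
    if covariates is not None:
        columns.extend(covariates.T)
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

# Simulated data (assumed settings): two populations of 200 samples, 500 SNPs.
rng = np.random.default_rng(0)
n_per_pop, n_snps = 200, 500
population = np.repeat([0, 1], n_per_pop)
freq0 = rng.uniform(0.1, 0.9, n_snps)
freq1 = np.clip(freq0 + rng.normal(0, 0.15, n_snps), 0.05, 0.95)
freqs = np.where(population[:, None] == 0, freq0, freq1)
genotypes = rng.binomial(2, freqs).astype(float)

# The phenotype depends on population only; no SNP has a true effect.
phenotype = 2.0 * population + rng.normal(0, 1, 2 * n_per_pop)

# Test the SNP whose allele frequency differs most between populations.
test_idx = int(np.argmax(np.abs(freq1 - freq0)))
snp = genotypes[:, test_idx]

pcs = top_pcs(genotypes, k=1)
pc1_corr = np.corrcoef(pcs[:, 0], population)[0, 1]
unadjusted = snp_effect(phenotype, snp)
adjusted = snp_effect(phenotype, snp, covariates=pcs)

print(f"|corr(PC1, population)| = {abs(pc1_corr):.3f}")
print(f"unadjusted SNP effect   = {unadjusted:.3f}")
print(f"PC-adjusted SNP effect  = {adjusted:.3f}")
```

In this simulation the first principal component tracks population membership, so including it as a covariate should shrink the spurious SNP effect toward zero. Real GWAS pipelines typically use several components and models suited to the phenotype type.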

Our research is supported by the following grants

Brain Pool Plus (BP+) Program, National Research Foundation of Korea (NRF)

NIH R01 (HG008773, 2016-2021), Statistical and computational methods for rare variant association analysis

NIH R01 (HL142023, 2018-2020, MPI with Drs. Cristen Willer and Xiang Zhou), Integrative analysis to uncover biology of blood lipids & coronary heart disease
