Methods to impute missing data are routinely used to increase power in genome-wide association studies. There are two broad classes of imputation methods. The first class imputes genotypes at the untyped variants, given those at the typed variants, and then performs a statistical test of association at the imputed variants. The second class, summary statistic imputation (SSI), directly imputes association statistics at the untyped variants, given the association statistics observed at the typed variants. The second class is appealing because it tends to be computationally efficient and requires only the summary statistics from a study, whereas the first class requires access to individual-level data that can be difficult to obtain. The statistical properties of these two classes of imputation methods are not fully understood. In this study, we show that the two classes of imputation methods yield association statistics with similar distributions for sufficiently large sample sizes. Using this relationship, we can understand the effect of the imputation method on power. We show that a commonly used approach to SSI, which we term SSI with variance reweighting, generally leads to a loss in power. In contrast, our proposed method for SSI, which does not perform variance reweighting, fully accounts for imputation uncertainty while achieving better power.
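To make the idea of summary statistic imputation concrete, here is a minimal sketch of the standard Gaussian conditional-mean approach, in which z-scores at untyped variants are predicted from z-scores at typed variants using a linkage disequilibrium (LD) correlation matrix from a reference panel. It is not the authors' implementation. The function names, the `ridge` parameter, and the toy numbers are hypothetical, and the rescaling shown in `reweight` is only assumed to correspond to what the abstract calls variance reweighting, since the abstract does not define the term.

```python
import numpy as np


def impute_z(z_typed, ld_tt, ld_ut, ridge=0.0):
    """Conditional-mean imputation of z-scores at untyped variants.

    z_typed : (t,) observed z-scores at typed variants
    ld_tt   : (t, t) LD correlation matrix among typed variants
    ld_ut   : (u, t) LD correlations between untyped and typed variants
    ridge   : hypothetical regularization added to the diagonal of ld_tt
    """
    t = ld_tt.shape[0]
    regularized = ld_tt + ridge * np.eye(t)
    # weights = ld_ut @ inv(regularized), computed with a linear solve
    weights = np.linalg.solve(regularized, ld_ut.T).T
    z_imputed = weights @ z_typed
    # Null variance of each imputed statistic: diag(W @ ld_tt @ W.T)
    var_imputed = np.einsum("ij,jk,ik->i", weights, ld_tt, weights)
    return z_imputed, var_imputed


def reweight(z_imputed, var_imputed):
    """Rescale imputed z-scores to unit null variance (assumed 'variance reweighting')."""
    return z_imputed / np.sqrt(var_imputed)


# Toy example: one typed and one untyped variant with LD correlation 0.8.
z_typed = np.array([3.0])
ld_tt = np.array([[1.0]])
ld_ut = np.array([[0.8]])

z_imp, var_imp = impute_z(z_typed, ld_tt, ld_ut)
z_rw = reweight(z_imp, var_imp)
```

In this toy case the unscaled imputed statistic is shrunk toward zero in proportion to the LD between the two variants, while the rescaled version is inflated back to unit null variance. The abstract's claim concerns which of these two treatments gives better power; the sketch only illustrates the mechanics.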
This article was published in the following journal.
Name: Journal of computational biology : a journal of computational molecular cell biology
Although genome-wide association studies (GWAS) have deepened our understanding of the genetic architecture of complex traits, the mechanistic links that underlie how genetic variants cause complex tr...
Sleep is an essential physiological process that protects our physical and mental health. However, the causality of the association between sleep and coronary heart disease (CHD) is unknown. Mendelian...
Whole-genome regression methods represent a key framework for genome-wide prediction, cross-validation studies, and association analysis. The bWGR offers a compendium of Bayesian methods with various...
Genetic risk prediction is an important problem in human genetics, and accurate prediction can facilitate disease prevention and treatment. Calculating polygenic risk score (PRS) has become widely use...
Linkage disequilibrium SCore regression (LDSC) has become a popular approach to estimate confounding bias, heritability, and genetic correlation using only genome-wide association study (GWAS) test st...
This is an observational study to identify genetic risks for neonatal diseases, necrotizing enterocolitis (NEC) using genome-wide association study (GWAS) and enterotype investigation. We ...
The aim of this study was to elucidate genetic susceptibility of patients with nontuberculous mycobacterial lung disease using genome-wide association study.
Performing a phenome-wide association study (PheWAS) identifying clinical diagnoses associated with a polygenic predictor of Thyroid stimulating hormone (TSH) levels identified by a previo...
In the last decade, investigators from the Department of Cancer Epidemiology and Genetics (National Cancer Institute, USA) have conducted genome-wide association (GWAS) studies of renal ce...
Some of the liver transplantation recipients experience postoperative acute kidney injury due to various causes including genetic factors. Prevention of postoperative acute kidney injury i...
An analysis comparing the allele frequencies of all available (or a whole GENOME representative set of) polymorphic markers in unrelated patients with a specific symptom or disease condition, and those of healthy controls to identify markers associated with a specific disease or condition.
Used for general articles concerning statistics of births, deaths, marriages, etc.
A center in the PUBLIC HEALTH SERVICE which is primarily concerned with the collection, analysis, and dissemination of health statistics on vital events and health activities to reflect the health status of people, health needs, and health resources.
Component of the NATIONAL INSTITUTES OF HEALTH. It conducts and supports research into the mapping of the human genome and other organism genomes. The National Center for Human Genome Research was established in 1989 and re-named the National Human Genome Research Institute in 1997.
Techniques to determine the entire sequence of the GENOME of an organism or individual.
Bioinformatics is the application of computer software and hardware to the management of biological data to create useful information. Computers are used to gather, store, analyze and integrate biological and genetic information which can then be applied...