H3AGWAS: a portable workflow for genome wide association studies | BMC Bioinformatics


  • Uffelmann E, Huang QQ, Munung NS, de Vries J, Okada Y, Martin AR, et al. Genome-wide association studies. Nat Rev Methods Primers. 2021;1(1):1–21.

  • Marees AT, de Kluiver H, Stringer S, Vorspan F, Curis E, Marie-Claire C, et al. A tutorial on conducting genomewide association studies: quality control and statistical analysis. Int J Methods Psychiatr Res. 2018;27(2):e1608.

  • Anderson CA, Pettersson FH, Clarke GM, Cardon LR, Morris AP, Zondervan KT. Data quality control in genetic case-control association studies. Nat Protoc. 2010;5(9):1564–73.

  • Laurie CC, Doheny KF, Mirel DB, Pugh EW, Bierut LJ, Bhangale T, et al. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet Epidemiol. 2010;34(6):591–602.

  • Adam Y, Samtal C, Brandenburg J, Falola O, Adebiyi E. Performing post-genome-wide association study analysis: overview, challenges and recommendations. F1000Research. 2021;10:1002.

  • Mulder NJ, Adebiyi E, Alami R, Benkahla A, Brandful J, Doumbia S, et al. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa. Genome Res. 2016;26(2):271–7.

  • Baichoo S, Souilmi Y, Panji S, Botha G, Meintjes A, Bendou H, et al. Developing reproducible bioinformatics analysis workflows for heterogeneous computing environments to support African genomics. BMC Bioinform. 2018;19(457):1–9.


  • Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017;35(4):316–9.

  • Van Rossum G, Drake FL. Python 3 reference manual. Scotts Valley: CreateSpace; 2009.


  • R Core Team. R: a language and environment for statistical computing. Vienna, Austria; 2020. https://www.R-project.org/.

  • Lippert C, Listgarten J, Liu Y, Kadie CM, Davidson RI, Heckerman D. FaST linear mixed models for genome-wide association studies. Nat Methods. 2011;8(10):833–5.

  • Zhou X, Stephens M. Genome-wide efficient mixed-model analysis for association studies. Nat Genet. 2012;44(7):821–4.

  • Loh PR, Kichaev G, Gazal S, Schoech AP, Price AL. Mixed-model association for biobank-scale datasets. Nat Genet. 2018;50(7):906–8.

  • Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet. 2011;88(1):76–82.

  • Jiang L, Zheng Z, Qi T, Kemper KE, Wray NR, Visscher PM, et al. A resource-efficient tool for mixed model association analysis of large-scale data. Nat Genet. 2019;51(12):1749–55.

  • Zhou W, Nielsen JB, Fritsche LG, Dey R, Gabrielsen ME, Wolford BN, et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat Genet. 2018;50(9):1335–41.

  • Mbatchou J, Barnard L, Backman J, Marcketta A, Kosmicki JA, Ziyatdinov A, et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat Genet. 2021;53(7):1097–103.

  • Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4(1):1–16.

  • Yang J, Ferreira T, Morris AP, Medland SE, Genetic Investigation of ANthropometric Traits (GIANT) Consortium, DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet. 2012;44(4):369–75, S1–3.

  • Han B, Eskin E. Random-effects model aimed at discovering associations in meta-analysis of genome-wide association studies. Am J Hum Genet. 2011;88(5):586–98.

  • Mägi R, Morris AP. GWAMA: software for genome-wide association meta-analysis. BMC Bioinform. 2010;11:288.

  • Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics (Oxford, England). 2010;26(17):2190–1.

  • Turley P, Walters RK, Maghzian O, Okbay A, Lee JJ, Fontana MA, et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat Genet. 2018;50(2):229–37.

  • Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, Day FR, Loh PR, et al. An atlas of genetic correlations across human diseases and traits. Nat Genet. 2015;47(11):1236–41.

  • Günther T, Gawenda I, Schmid KJ. phenosim—a software to simulate phenotypes for testing in genome-wide association studies. BMC Bioinform. 2011;12:265.

  • Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics (Oxford, England). 2010;26(18):2336–7.

  • Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164–e164.

  • Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. GigaScience. 2021;10(2):giab008.

  • Wigginton JE, Cutler DJ, Abecasis GR. A note on exact tests of Hardy–Weinberg equilibrium. Am J Hum Genet. 2005;76(5):887–93.

  • Zhao H, Sun Z, Wang J, Huang H, Kocher JP, Wang L. CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics (Oxford, England). 2014;30(7):1006–7.

  • Loh PR, Bhatia G, Gusev A, Finucane HK, Bulik-Sullivan BK, Pollack SJ, et al. Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat Genet. 2015;47(12):1385–92.

  • Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010;42(7):565–9.

  • Zhou X. A unified framework for variance component estimation with summary statistics in genome-wide association studies. Ann Appl Stat. 2017;11(4):2027–51.

  • Auton A, Abecasis GR, Altshuler DM, Durbin RM, Abecasis GR, Bentley DR, et al. A global reference for human genetic variation. Nature. 2015;526(7571):68–74.

  • Chen W, Larrabee BR, Ovsyannikova IG, Kennedy RB, Haralambieva IH, Poland GA, et al. Fine mapping causal variants with an approximate Bayesian method using marginal test statistics. Genetics. 2015;200(3):719–36.

  • Han B, Eskin E. Interpreting meta-analyses of genome-wide association studies. PLOS Genet. 2012;8(3):e1002555. https://doi.org/10.1371/journal.pgen.1002555.

  • Buniello A, MacArthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47(D1):D1005–12.

  • Kumuthini J, Zass L, Panji S, Salifu SP, Kayondo JK, Nembaware V, et al. The H3ABioNet helpdesk: an online bioinformatics resource, enhancing Africa’s capacity for genomics research. BMC Bioinform. 2019;20(1):1–7.

  • Kurtzer GM, Sochat V, Bauer MW. Singularity: scientific containers for mobility of compute. PLoS ONE. 2017;12(5):e0177459. https://doi.org/10.1371/journal.pone.0177459.

  • Ramsay M, Crowther N, Tambo E, Agongo G, Baloyi V, Dikotope S, et al. H3Africa AWI-Gen Collaborative Centre: a resource to study the interplay between genomic and environmental risk factors for cardiometabolic diseases in four sub-Saharan African countries. Global Health Epidemiol Genom. 2016;1:e20.

  • Loh PR, Danecek P, Palamara PF, Fuchsberger C, Reshef Y, Finucane H, et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat Genet. 2016;48(11):1443–8.

  • Choudhury A, Brandenburg JT, Chikowore T, Sengupta D, Boua PR, Crowther NJ, et al. Meta-analysis of sub-Saharan African studies provides insights into genetic architecture of lipid traits. Nat Commun. 2022;13(1):2578.

  • Band G, Marchini J. BGEN: a binary file format for imputed genotype and haplotype data. 2018. https://doi.org/10.1101/308296v2.

  • Kässens JC, Wienbrandt L, Ellinghaus D. BIGwas: single-command quality control and association testing for multi-cohort and biobank-scale GWAS/PheWAS data. GigaScience. 2021;10(6):giab047. https://doi.org/10.1093/gigascience/giab047.

  • Gogarten SM, Bhangale T, Conomos MP, Laurie CA, McHugh CP, Painter I, et al. GWASTools: an R/Bioconductor package for quality control and analysis of genome-wide association studies. Bioinformatics (Oxford, England). 2012;28(24):3329–31.

  • Meyer HV. HannahVMeyer/plinkQC: plinkQC version 0.2.3. Zenodo; 2019. https://zenodo.org/record/3373798.

  • Ellingson SR, Fardo DW. Automated quality control for genome wide association studies. F1000Research. 2016;5.

  • Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics. 2007;23(19):2633–5. https://doi.org/10.1093/bioinformatics/btm308.

  • Wang J, Huang D, Zhou Y, Yao H, Liu H, Zhai S, et al. CAUSALdb: a database for disease/trait causal variants identified using summary statistics of genome-wide association studies. Nucleic Acids Res. 2019;48(D1):D807–16. https://doi.org/10.1093/nar/gkz1026.

  • Schaid DJ, Chen W, Larson NB. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet. 2018;19(8):491–504.

  • Watanabe K, Taskesen E, van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nat Commun. 2017;8(1):1826.

  • Watanabe K, Umićević Mirkov M, de Leeuw CA, van den Heuvel MP, Posthuma D. Genetic mapping of cell type specificity for complex traits. Nat Commun. 2019;10(1):3222.

  • Peat G, Jones W, Nuhn M, Marugán JC, Newell W, Dunham I, et al. The open targets post-GWAS analysis pipeline. Bioinformatics. 2020;36(9):2936–7. https://doi.org/10.1093/bioinformatics/btaa020.

  • Song Z, Gurinovich A, Federico A, Monti S, Sebastiani P. nf-gwas-pipeline: a nextflow genome-wide association study pipeline. J Open Source Softw. 2021;6(59):2957. https://doi.org/10.21105/joss.02957.


