The purpose of this analysis is to find associations between variants in the PE and PPE genes of Mycobacterium tuberculosis and resistance to various antibiotics.
This analysis was run by Lindsay V. Clark at Research Scientific Computing, Seattle Children's Research Institute, in collaboration with Christoph Grundner and Vishant Boradia.
This repo uses the VCF generated in a separate repo: https://github.com/RSC-RP/2024-12_grundner_genotyping_Mtub
Run the workbooks in this order:
- relatedness_lineages_2024-12-06.qmd (normalize indels, estimate relatedness among isolates, identify lineages)
- association_setup_2024-12-16.qmd (compile phenotypes and format data for PySEER)
- association_all_pe_ppe_2025-05-15.qmd (filter variants, agglomerate genotypes to amino acid and promoter level, run GWAS with PySEER and process output)
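
The run order above could be scripted along these lines. This is only a sketch: it assumes the Quarto command-line tool is installed, that the workbooks sit in the current working directory, and that they can be rendered non-interactively, none of which is stated in this README. The function names `render_commands` and `render_all` are hypothetical and are not part of this repo.

```python
import shutil
import subprocess

# Workbook file names, in the order listed above.
WORKBOOKS = [
    "relatedness_lineages_2024-12-06.qmd",
    "association_setup_2024-12-16.qmd",
    "association_all_pe_ppe_2025-05-15.qmd",
]


def render_commands(workbooks=WORKBOOKS):
    """Build one `quarto render` command per workbook, preserving order."""
    return [["quarto", "render", workbook] for workbook in workbooks]


def render_all(workbooks=WORKBOOKS):
    """Render each workbook in turn, stopping at the first failure."""
    # Assumption: the Quarto CLI is available on PATH.
    if shutil.which("quarto") is None:
        raise RuntimeError("Quarto CLI not found on PATH")
    for command in render_commands(workbooks):
        # check=True raises CalledProcessError if a workbook fails,
        # so later workbooks never run on incomplete upstream output.
        subprocess.run(command, check=True)
```

Calling `render_all()` from the directory that holds the workbooks would render them one after another. Stopping at the first failure matters here because each workbook depends on output from the one before it.
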
Some of the workbooks call scripts, which are in the scripts directory.