Roth Lab Computational Algorithm VARITY is Best at Pinpointing Disease-Causing Gene Variants
Research Highlights
By Soha Usmani
Missense variants, gene mutations that change one amino acid into another, are a notable and medically relevant type of genomic variant. A significant barrier to determining an individual's risk for genetic disorders is ascribing disease risk to specific variants. This is because over 99% of missense variants in the human population are rare, meaning they have a minor allele frequency (MAF), defined as the frequency of the second most common allele in a population, below 0.5%, and 90% are extremely rare, with a MAF below 10⁻⁶. Far less evidence is available on the pathogenicity of rare variants than of common ones, so better computational methods are needed to infer disease risk. In a new study, a team from the lab of department faculty member Prof. Fritz Roth, led by PhD student Yingzhou (Joe) Wu, set out to better predict pathogenicity by creating VARITY, a computational algorithm optimized for rare and extremely rare missense variants.
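To make the rarity categories concrete, here is a minimal Python sketch of the MAF definition given above. It assumes that 0.5% and 10⁻⁶ act as upper bounds for "rare" and "extremely rare"; the function names and the toy allele list are hypothetical illustrations, not part of VARITY.

```python
from collections import Counter

RARE_MAF = 0.005            # 0.5%, assumed upper bound for "rare"
EXTREMELY_RARE_MAF = 1e-6   # assumed upper bound for "extremely rare"


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele among the observed calls."""
    counts = Counter(alleles).most_common()
    if len(counts) < 2:
        return 0.0  # only one allele observed at this site
    return counts[1][1] / len(alleles)


def rarity_class(maf):
    """Map a MAF onto the three categories used in the article."""
    if maf < EXTREMELY_RARE_MAF:
        return "extremely rare"
    if maf < RARE_MAF:
        return "rare"
    return "common"


# Toy data: 1,000 sampled chromosomes, 2 of which carry the alternate allele.
calls = ["A"] * 998 + ["G"] * 2
maf = minor_allele_frequency(calls)
print(maf, rarity_class(maf))
```

Real population frequencies come from very large reference cohorts, since a MAF below 10⁻⁶ cannot be resolved from a sample of this size.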
To build VARITY, the team extracted all missense variants from roughly 18,000 genes and identified around 4,000 disease-associated proteins. They used variants, and properties ('features') of those variants, from many databases to train a machine-learning algorithm to classify variants. Although many sources of variant annotation were used, the model was optimized for performance on rare or extremely rare variants with high-quality pathogenicity annotations from ClinVar. After training, the researchers analyzed their model and found that features such as conservation scores, differences in physicochemical properties between the missense and wild-type amino acids, and solvent-accessible surface area contributed most to predicting variant outcomes.
Most importantly, the Roth lab found that VARITY outperforms other computational methods at pinpointing rare pathogenic variants, identifying 12-13% more pathogenic variants than the others. Indeed, when tested on de novo missense mutations for neurodevelopmental disorders, VARITY was more sensitive (had higher recall) than all the other algorithms at a stringent threshold where 90% of predictions were correct. It also surpassed other methods when tested on rare ClinVar variants that had not been used to train the model. Future studies could improve VARITY's performance by adding features such as inheritance pattern (e.g., dominant or recessive) and mechanism (gain or loss of function). Alongside further research into computational predictors, this model will help boost the accuracy of clinical genetic testing and give further insight into genetic disorders and their mechanisms.
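The comparison "recall at a threshold where 90% of predictions were correct" can be illustrated with a short Python sketch. It ranks variants by predicted score and reports the highest recall reached at any score cutoff whose precision is at least 90%. The function name and the toy labels and scores are hypothetical; this is not the study's evaluation code.

```python
def recall_at_precision(labels, scores, min_precision=0.9):
    """Highest recall at any score cutoff whose precision >= min_precision.

    labels: 1 for pathogenic, 0 for benign.
    scores: predicted pathogenicity, higher means more likely pathogenic.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one pathogenic (positive) label")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best_recall = 0.0
    true_pos = 0
    for i, (score, label) in enumerate(ranked):
        true_pos += label
        # Evaluate only where the score changes, so tied scores stay together.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        precision = true_pos / (i + 1)  # i + 1 variants are called pathogenic
        if precision >= min_precision:
            best_recall = max(best_recall, true_pos / total_pos)
    return best_recall


# Made-up example: six variants, four of them truly pathogenic.
toy_labels = [1, 1, 1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
print(recall_at_precision(toy_labels, toy_scores))  # 0.75
```

In this toy case the top three calls are all correct, so three of the four pathogenic variants are recovered before precision drops below 90%. A more sensitive predictor recovers a larger share at the same precision.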
Readers can browse and download the calculated features of missense variants via the VARITY web portal. Learn more about Dr. Roth's work on missense variants on U of T News and his lab website.