Sep 30, 2025

A study from the Yuen Lab reveals repeated DNA sequences play an active role in gene regulation

Photo: Dr. Sasha Mitina (left) and Dr. Ryan Yuen (right)
By Marcia Iglesias

A new study from the Department of Molecular Genetics at the University of Toronto has found that stretches of repeated DNA, long considered “junk DNA,” are more involved in regulating genes than previously thought. The findings reveal that it is not only the size of these repeats that matters, but also their sequence and position in the genome, factors often overlooked in genetic research.

These DNA elements, called short tandem repeats (STRs), consist of sequences of one to six “letters” copied many times. They make up about seven percent of the human genome. STRs have traditionally been studied in the context of disease, since abnormal expansions of certain repeats are known to cause conditions such as Huntington disease and fragile X syndrome.

The research, carried out by the University of Toronto's Department of Molecular Genetics together with The Hospital for Sick Children, shows that STRs affect gene regulation. The findings were published in Genome Biology.

The study, led by postdoctoral researcher Dr. Sasha Mitina in the lab of Professor Ryan Yuen, analyzed genome data from more than 3,000 people across global populations. The team found that about seven percent of STRs vary in sequence between individuals. These variations were not randomly distributed. Many were located near splice junctions of genes important for brain development and neuron function, and in regions associated with mobile elements, such as Alu sequences.

“These tandem repeat sequence changes, mostly perceived as disease-causing, are actually common in the general population,” said Mitina. “Even more surprising, these changes affect neuronal genes, suggesting they may help shape brain development and function—not just drive disease.”

The study also revealed population-specific patterns. For instance, certain STR variants were more frequent in people of African ancestry, while others were distinct in East Asian or South Asian groups. These findings suggest that STR variability may contribute to human phenotypic diversity and may have evolutionary significance.

By showing that repeat sequence context can influence splicing and regulatory activity, the study highlights the potential of STRs as active elements in the genome that can adjust gene expression.

“By considering the context of repeats, we no longer see them as stretches of junk DNA, but as active elements that can disrupt splicing, alter or create new regulatory motifs,” said Mitina.

The study also highlights the importance of long-read sequencing technologies. Unlike older short-read methods, which often miss repetitive regions, long-read platforms can capture the full complexity of STRs and reveal how they shape gene regulation. These approaches will enable researchers to explore STR variability more thoroughly, hopefully contributing to both understanding normal biology and assessing disease risk.

By showing that repeat context influences splicing and regulatory processes, the study challenges the view of STRs as “junk DNA” and instead positions them as active elements in genome regulation. “More research is needed to understand the exact mechanisms, but this is a necessary and exciting next step,” said Mitina.

Funding Acknowledgement:

This work from the Yuen Lab was supported by the Canadian Institutes of Health Research (CIHR) and Genome Canada through the Ontario Genomics Institute, with additional support from the Hospital for Sick Children Foundation, the McLaughlin Centre, and the University of Toronto McLaughlin Centre’s Genome Engineering and Disease Modelling Initiative.