May 23, 2025

Chromosomal Kiss and Tell: Unravelling the 3D Genome Through Machine Learning


Dr. Philipp Maass’s lab has developed a groundbreaking interdisciplinary approach for determining the spatial organization of the genome.

By Mahek Khatri, Rohan Khan and Nithya Gopalakrishnan

This interview was created as an assignment in the MMG3001Y (Advanced Human Genetics) course of the MHSc in Medical Genomics program under the guidance of course instructor Dr. Kinjal Desai.

The human genome is essential to life, yet only a small portion of DNA encodes proteins. The majority of the genome is non-coding and helps regulate protein expression, dictating when genes are turned on or off. From transcription to gene-regulatory elements like enhancers and chromosomal interactions, the non-coding genome is complex and can have significant implications for protein expression in our cells. Understanding these mechanisms is key to uncovering the impact of the genome on our daily lives.

At the forefront of this research is Dr. Philipp Maass, a senior scientist in the Genetic and Genome Biology Program at the SickKids Research Institute. His research at SickKids is focused on functional regions of the non-coding genome, such as chromosomal interactions and long non-coding RNAs (lncRNAs), and their impact on development and disease mechanisms. In particular, he has led the discovery and understanding of inter-chromosomal contacts (ICCs), trans interactions between different chromosomes that were once thought to be rare or insignificant [1].

From Mendelian Genetics to Computational Biology: An Intercontinental Journey

Dr. Maass found his passion for genetics early on in life, long before he came to be a leading researcher at SickKids studying 3D genome organization. It was in high school that he became fascinated by the intricacies of gene regulation, from enhancer elements to chromatin looping. He recalls asking fundamental questions: “How [is genome] organization maintained? How are genes regulated by [a] three-dimensional architecture? How is chromatin organized, and how are the genes transcribed?” These questions about the three-dimensional structure of the nucleus sparked a curiosity that would shape his scientific career.

Dr. Maass’s passion for genetics and genome organization led him to the Max Delbrück Center for Molecular Medicine in Berlin. During his time in Berlin, genomic research was focused on understanding disease mechanisms, and researchers were studying Mendelian disorders with specific disease mechanisms in families. Dr. Maass, however, found himself drawn to broader questions about gene regulation, and his curiosity soon pushed him to explore how large-scale genome organization influences biological function.

This quest led him to Harvard University, an experience that broadened his scientific perspective. “At Harvard it’s 24/7 science… you go for a beer in the evening… and it’s a ball of science,” he recalls. The research at Harvard was aimed at understanding basic processes like gene and genome regulation and lncRNA biology, all of which aligned perfectly with Dr. Maass’s interests.

Here, he was introduced to interdisciplinary approaches to address biology. Specifically, he learnt to implement bioinformatics, machine learning, and statistical approaches to do genetic and genomic research. This computational knowledge became a cornerstone of his recent publications, allowing him to analyze genome structure in previously unattainable ways.

Dr. Maass’s journey from Europe to North America, from classical genetics to cutting-edge computational biology, has played a significant role in shaping his scientific outlook. He emphasizes that, “for science, it’s really important that you do interdisciplinary or multidisciplinary science [and] get an understanding of what other people are doing.” Such global exposure to research, together with an awareness of what other scientists in the community are working on, can give us a deeper understanding of our own field.

Challenging Genome Organization Models

Early in Dr. Maass’s academic journey, many scientists dismissed the idea that chromosomes could physically interact. Instead, the first models of genome organization suggested that each chromosome occupied its own space within the nucleus. But Dr. Maass was determined to investigate genome architecture further. He was drawn to the challenge of understanding chromosome organization and gene regulation, and how the resulting transcriptional programs dictate crucial biological processes such as development and tissue maintenance.

A pivotal moment in Dr. Maass’s research came during his PhD, when he found evidence of ICCs, a discovery that challenged assumptions in the field. He recalls that he “came across one amazing example where [he] found a contact between different chromosomes.” At the time, however, many geneticists were skeptical. “They did not believe that chromosomes were interacting,” he explains. Despite this skepticism, Dr. Maass’s findings suggested that ICCs exist. “This ‘chromosomal kissing,’ which people are now starting to accept, is really happening in a non-random manner,” he notes. Since then, it has been found that these interactions play a critical role in gene regulation and disease mechanisms, reinforcing the idea that our genome is far more deterministically organized than previously believed. To assess disease mechanisms and ICCs, we need a strong understanding of human DNA topology, that is, the spatial organization and three-dimensional structure of the genome.

Human DNA is condensed into 23 pairs of chromosomes inside the nucleus. These chromosomes interact at specific contact sites, the ICCs (Figure 1). These interactions form part of the 3D genome, follow consistent patterns, and contribute to genetic functions such as genome organization and gene regulation. For example, certain ICCs organize active genes in regions of the nucleus where transcriptional activity is high, making it easier for the cell to produce proteins from these genes. Despite their potential importance, chromosomal interactions have been difficult to detect and study because of analytical and computational challenges.

Identifying ICCs Using Machine Learning

Hi-C (High-throughput Chromosome Conformation Capture) is a method in which DNA fragments that are physically close in three-dimensional space are sequenced together. This, alongside imaging of chromosomes in individual cells, can reveal how chromosomes are spatially organized within the nucleus and how they interact. When reflecting on his research, Dr. Maass again stresses how essential interdisciplinary approaches are, especially within systems biology. “In terms of imaging, seeing is believing,” he explained. “The good thing about imaging is you go from cell to cell [to see variability, although] it is a laborious approach.” At the University of Toronto, Dr. Maass’s team has combined computational and biological methods to analyze the entire Spatially Interacting Genomic Architecture (Signature) and to generate a new data science pipeline. The Signature algorithm applied machine learning to analyze Hi-C datasets (Figure 1) and ultimately identified over 40,000 ICCs across 53 different human cell types, highlighting that they are not rare or random and play a key role in genome topology.


Figure 1: Inter-chromosomal contacts (ICCs) between different chromosomes within the nucleus. On the left, individual arms of chromosome A “kiss” two points (red) on chromosome B, whilst blue sections depict non-interacting regions. On the right is a graphical representation of Signature’s output, specifically identifying interacting (red) and non-interacting (blue) regions across two chromosomes as z scores. Figure adapted from Mokhtaridoost et al. (2024).
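To make the z scores in Figure 1 more concrete, here is a minimal sketch of how a table of contact counts between two chromosomes could be standardized so that unusually frequent contacts stand out. This is not the Signature algorithm, which the article only describes as a machine learning approach applied to Hi-C data. The function name `contact_z_scores`, the toy counts, and the cutoff of 1.5 are all assumptions made purely for illustration.

```python
from statistics import fmean, pstdev


def contact_z_scores(counts):
    """Standardize every cell of a contact-count table against the table's
    overall mean and (population) standard deviation."""
    flat = [c for row in counts for c in row]
    mean = fmean(flat)
    spread = pstdev(flat)
    if spread == 0:
        # All counts identical: no bin pair stands out.
        return [[0.0 for _ in row] for row in counts]
    return [[(c - mean) / spread for c in row] for row in counts]


# Hypothetical counts (not real Hi-C data): rows are bins on chromosome A,
# columns are bins on chromosome B.
toy_counts = [
    [2, 3, 2, 1],
    [3, 40, 2, 2],
    [1, 2, 3, 35],
]

z = contact_z_scores(toy_counts)

# Illustrative cutoff only; a real analysis would need a statistically
# justified threshold and proper normalization of the raw counts.
CUTOFF = 1.5
candidate_bins = [
    (i, j)
    for i, row in enumerate(z)
    for j, score in enumerate(row)
    if score > CUTOFF
]
print(candidate_bins)
```

In this toy table, the two bin pairs with far more contacts than the background are the ones that would be drawn in red; everything else would be drawn in blue.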

By analyzing 62 different Hi-C datasets with Signature, Dr. Maass and his team showed that while some ICCs are consistent across different cell types, others are cell-specific. Beyond this, ICCs are commonly enriched in gene-dense regions with high gene expression and transcription factor binding. This further indicates that ICCs are not just structural features but are also functional and may help define a cell’s identity. For example, in neuronal cells, these contacts may cluster genes responsible for specialized neuronal functions into regulatory hubs, influencing their expression. However, these same ICCs may not exist in muscle cells, where those genes are not essential. This suggests that the non-coding genome is important in cell differentiation, and that ICCs help orchestrate cell-type-specific transcriptional programs, ensuring that genes interact in ways that support each cell’s specialized role.

Another important finding from Signature was the confirmation of Rabl’s configuration in human cells. Rabl’s configuration, proposed by Carl Rabl in 1885, is a structural feature in which the telomeres and centromeres of chromosomes cluster at opposite ends of the nucleus. This configuration has been observed in yeast, plants, and other mammals, but until now it was unclear whether it occurred in the human genome. Signature analysis demonstrated that ICCs align with this spatial organization, suggesting that genome organization in humans follows principles that have been conserved across species [6]. This conservation lends credence to the theory that ICCs are important in genomic regulation.

The ability of Signature to analyze ICCs is groundbreaking within the field, but its development was not always seamless. The most significant challenge to visualizing ICCs in a three-dimensional genome topology came in the form of a lack of available computing power. Analyzing every possible combination of ICCs required both a strong computing cluster and an abundance of time, a frustrating caveat to being able to process the copious amount of Hi-C data the models were trained on. Dr. Maass recalls that early iterations of Signature required over “400 hours for one job [and it would] crash because [the servers ran] out of memory”. Workflow and computation optimization had to be refined to efficiently process this large-scale data.
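A rough back-of-the-envelope calculation shows why examining every combination is so demanding. The sketch below counts the distinct chromosome pairs and then the bin-to-bin comparisons they imply. The chromosome list (22 autosomes plus X) and the figure of 2,000 bins per chromosome are assumptions chosen only to illustrate the scale; they are not taken from the Signature publication.

```python
from itertools import combinations

# Assumption: 22 autosomes plus X. Append "chrY" to include it.
chromosomes = [f"chr{i}" for i in range(1, 23)] + ["chrX"]


def chromosome_pairs(names):
    """Lazily yield each unordered pair of distinct chromosomes once."""
    yield from combinations(names, 2)


pair_count = sum(1 for _ in chromosome_pairs(chromosomes))
print(pair_count)  # 253

# Hypothetical resolution: pretend every chromosome is split into 2,000 bins.
BINS_PER_CHROMOSOME = 2_000
bin_pairs_per_chromosome_pair = BINS_PER_CHROMOSOME * BINS_PER_CHROMOSOME
total_bin_pairs = pair_count * bin_pairs_per_chromosome_pair
print(f"{total_bin_pairs:,}")  # 1,012,000,000
```

Even under these simplified assumptions, a single dataset implies roughly a billion candidate bin pairs. Yielding chromosome pairs one at a time, rather than holding every pair's contact table in memory at once, is one generic way to keep memory use bounded.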

Then came the breakthrough. Dr. Maass recollects the day when Signature was able to successfully identify ICCs. “I remember the day when my postdoc, Daniella, was showing me on her screen just a plotted result… It showed that Signature can determine these trans-contacts.” Although this plot never made it into a publication, it was the first indication that Signature was working, proving that this machine learning approach had potential. The seeds of Signature had finally sprouted, slowly growing into today’s robust and scalable version.

The Future of Chromosomal Contacts and Implications for Genetics and Medicine

Despite some initial hurdles, Signature’s publication opens up a host of new possibilities for clinical genomic interpretation. Dr. Maass emphasizes that understanding genome organization in healthy cells is a necessary step in recognizing the basis for clinical phenotypes across a wide variety of conditions, from a ‘simple’ trisomy to a cancer genome. In the future, he hopes to see spatial genome topology applied to identify therapeutic targets or to improve prognosis. A potential future direction could be assessing cancer genetics, with genomic targets of interest including the oncogene MYC or the tumour suppressor TP53. Signature has the potential to redefine our understanding of the genomic basis of several conditions, including but not limited to cancer, and it remains to be seen how exactly the program can aid in diagnostics and therapeutics.

In terms of future directions, Dr. Maass highlights three ongoing projects, ranging from domestic to international partnerships. The first uses oligo-painting to image the genomic spatial gradient, in collaboration with Dr. Eric Joyce’s lab at the University of Pennsylvania. The second is with Dr. Ana Pombo’s lab at the Max Delbrück Center for Molecular Medicine in Berlin, and aims to characterize genomic organization in brain disorders. Integrating genome topology studies into the construction of a clinical phenotype is a thrilling future direction for genomics. Lastly, Dr. Maass was sure to call attention to Dr. Artem Babaian of the University of Toronto’s Donnelly Centre, a leading expert in ultra-massive cloud computing, whose work demonstrates how technological advances can be used to compute high-dimensional datasets of 3D genome topology. Across all three partnerships, it is clear that Signature has potential in many different contexts.

In addition to developing Signature, Dr. Maass leads his lab and teaches. He is fueled by his passion for science and collaboration with his team, pictured below. Reflecting on his favourite part of research and lab work, he left us with some advice for future scientists. “I think it’s great to validate some ideas that other people had more than 100 years ago, like when we talk about the Rabl configuration,” he explained. This love for validation clearly culminated in Signature, and he is careful to highlight his excitement about how advancements in technology can help confirm these long-standing structural ideas. The new generation of scientists in medical genomics is essential to furthering this work and should be supported. “When you’re interested in pursuing a scientific academic career, let me tell you, it’s not easy. But if you really want to do it, then do it,” Dr. Maass encourages. “At the end of the day, research is about finding new things and validating them. That’s the fun part.” With such passionate researchers working in tandem with ever-evolving technological capabilities and computational tools, the future of medical genomics is looking bright.