Connetquot senior publishes breakthrough epilepsy research

Congratulations to Muhammad Omer Latif, a senior at Connetquot High School, on the publication of his research paper on arXiv, the preprint repository operated by Cornell University!
His groundbreaking work, titled "A Deep Learning Pipeline Integrating GPT-2 XL and GPU Acceleration for Investigating Gene Expression Patterns in Epilepsy," marks a major step forward in biomedical research. It’s one of the first to successfully adapt a language-based AI model to analyze the "genetic language" of neurological disease.
Epilepsy is a chronic brain condition that causes repeated seizures and affects around 50 million people worldwide. To better understand this complex disorder, researchers are turning to genetics, focusing specifically on how genes in the brain behave. But the datasets involved are so large and complicated that analyzing them has become a major challenge.
To address this, Latif and his team developed a new method that uses artificial intelligence to study gene activity in epilepsy. Their approach applies GPT-2 XL, a powerful AI model originally designed to understand human language, to genetic sequences, treating DNA like a language of its own. Combined with NVIDIA’s cutting-edge GPU hardware, the method allows fast, large-scale analysis of genomic data.
The method was tested on two epilepsy datasets and showed promising results. Findings included a possible link between the ketogenic diet and reduced brain inflammation, as well as improved gene activity in a zebrafish model of epilepsy.
“The field of genomics in neurological diseases, such as epilepsy, generates vast mountains of data that overwhelm traditional analysis methods,” said Latif. “My inspiration was the immediate aspiration to close that gap. The challenge was not just to observe this ‘hot topic’ intersection of AI and life science, but to engineer a solution that could turn data into therapeutic insight.”
Muhammad Omer Latif worked closely with Zhihua Dong of Brookhaven National Laboratory, whose expertise in high-performance computing was key to the project’s success. He also acknowledged his mentor, Dr. David Dakota Blair of Computational Data Sciences, whose deep knowledge of bioinformatics was the guiding force behind the project. Additionally, Hayat Ullah (Florida Atlantic University) and Muhammad Ali Shafique (Kansas State University) played essential roles in shaping the paper’s scientific narrative and presentation.
This work was completed through Brookhaven National Laboratory’s High School Research Program and stands as an inspiring example of what young researchers can achieve at the cutting edge of science and technology.