Biomedical and Health Informatics research banner

Our Research

Our research focuses on developing artificial intelligence (AI) methods to analyze heterogeneous biomedical big data for translational applications. This ongoing work brings together two branches of AI, knowledge representation and reasoning and machine learning, to characterize brain network dynamics and analyze electronic health record (EHR) data.

Knowledge representation and reasoning involves development of knowledge models or ontologies. We have led the development of new methods to use ontology engineering principles across multiple stages of machine learning workflows, including feature engineering and model validation. This involves the development of deep neural network (DNN) models and the use of classical machine learning algorithms such as support vector machines (SVM) for integrative analysis of multi-modal brain connectivity data in neurological disorders such as epilepsy and Parkinson's Disease. To address the challenges of data quality and scientific reproducibility, we have led the development of a provenance metadata framework called ProvCaRe using ontology engineering and natural language processing techniques.


Research Interests

Epilepsy seizure networks; Structural connectivity networks derived from MRI; Functional connectivity networks derived from EEG; Provenance metadata; Ontology engineering; Data integration; High performance computing

News

Opportunities

CWRU Students

We welcome interest from current CWRU undergraduate and graduate students who are interested in working at the intersection of biomedical research and computer science (primarily artificial intelligence research). Please contact Dr. Sahoo at sss124@case.edu.

Projects and Resources

Brain Connectivity in Neurological Disorders



We study the underlying mechanisms that influence the generation and progression of abnormal electrophysiological signals in epilepsy, a serious neurological disorder that affects more than 50 million individuals worldwide with debilitating seizures. Our research uses high-resolution signal data recorded using intracranial EEG with multiple contacts. However, this approach involves querying and analyzing large volumes of multi-modal data. To address this challenge, we incorporate Big Data analytics techniques, including the development of new data models that are compatible with large-scale data analysis techniques such as parallel and distributed computing. We have developed flexible analysis workflows with multiple measures of statistical correlation that quantitatively assess the strength of the connections among the brain regions active during a seizure event. More information is available on the project page: NIC Workflow Website
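As a rough illustration of how a correlation measure can quantify connection strength between recording contacts, the sketch below builds a pairwise connectivity matrix from toy signals. It is not the NIC workflow itself: the choice of Pearson correlation, the function names, and the contact labels are all assumptions made for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat signal: correlation is undefined, reported as 0 here
    return cov / math.sqrt(var_x * var_y)

def connectivity_matrix(channels):
    """Map each pair of contacts to the absolute correlation of their signals."""
    names = list(channels)
    return {a: {b: abs(pearson(channels[a], channels[b])) for b in names}
            for a in names}

# Hypothetical toy signals; real intracranial EEG has far more samples and contacts.
channels = {
    "contact_A": [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0],
    "contact_B": [0.0, 2.0, 0.0, -2.0, 0.0, 2.0, 0.0, -2.0],
    "contact_C": [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0],
}
matrix = connectivity_matrix(channels)
```

In this toy data, contact_A and contact_B are scaled copies of each other and are strongly connected, while contact_C is a quarter-cycle shifted copy and shows no linear correlation with contact_A. Each pair is independent of the others, which is one reason this kind of computation is compatible with parallel and distributed execution.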


MaTiLDA



As an extension of these data-processing workflows, we have also developed MaTiLDA, an integrated web platform for analyzing abnormal electrophysiological signals using topological data analysis (TDA) and machine learning algorithms. MaTiLDA features a graphical user interface that enables users to apply TDA and machine learning algorithms to datasets from neurophysiological recordings without requiring substantial domain knowledge in mathematics or computing. More information is available on the project page: MaTiLDA
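To give a flavor of what topological data analysis computes, the sketch below derives the simplest topological summary of a network: the scales at which connected components merge as the distance threshold grows (zero-dimensional persistence of a Vietoris-Rips filtration). This is an illustrative example only and does not describe MaTiLDA's internal pipeline; the function name and the toy distance matrix (for instance, 1 minus absolute correlation) are assumptions.

```python
def zero_dim_persistence(dist):
    """Death scales of connected components in a Vietoris-Rips filtration.

    dist is a symmetric square matrix (list of lists) of pairwise distances.
    Every node is born at scale 0; a component dies when it merges into another.
    Returns the n-1 finite death values in increasing order.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process edges from shortest to longest, as in Kruskal's algorithm.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    deaths = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge at scale d
            deaths.append(d)
    return deaths

# Hypothetical distances among four brain regions: two tightly coupled pairs.
dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
deaths = zero_dim_persistence(dist)
```

The large gap between the early merges (0.1 and 0.2) and the final merge (0.9) reflects two well-separated clusters. Summaries of this kind can then serve as input features for a machine learning model.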

Epilepsy and Seizure Ontology



Standardized terminologies and structured knowledge representation are critical for advancing research, improving clinical decision-making, and ensuring high-quality patient care in epilepsy and seizure disorders. To support this need, the Epilepsy and Seizure Ontology (EpSO) was developed, offering a formal, domain-specific framework that captures comprehensive knowledge about epilepsy, seizures, and related clinical features. EpSO integrates clinical, research, and genetic information by harmonizing terminology and defining relationships, enabling seamless data sharing and interoperability across systems. Researchers and clinicians can leverage EpSO to annotate data, ensure consistent interpretation, and facilitate integrative analyses that drive progress in both clinical practice and translational research.

Epilepsy and Seizure Ontology

Dravet Syndrome AI Platform

Ontology-based clinical decision-support system



Our primary objective is to create a clinical decision support system (CDSS) that mirrors the movement disorders clinical workflow for diagnosing Parkinson’s disease (PD) using the International Parkinson and Movement Disorder Society criteria (MDS-PD). The MDS-PD criteria allow highly sensitive and specific diagnosis of PD, but they are inherently complex to apply manually with pen and paper and are not currently supported in electronic health record systems. This highlights the need for a CDSS that implements these criteria and supports clinicians and researchers alike in clinical care and research. Our modular approach to creating ORMIS-PD consists of three steps: first, building the data entry module to capture the patient data needed for the MDS-PD criteria; then, building the knowledge base module to model the algorithm of the MDS-PD criteria; and finally, building the data analytics module, which applies the algorithm to the captured data to classify the patient into one of the three levels of diagnostic classification of the MDS-PD criteria. ORMIS-PD Website
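The three-module structure described above can be sketched in a few lines. This is a simplified illustration and not the ORMIS-PD implementation: the class and function names are hypothetical, the rules follow a condensed reading of the published MDS-PD criteria (individual criteria are reduced to counts), and the three output labels are assumed to be clinically established PD, clinically probable PD, and criteria not met.

```python
from dataclasses import dataclass

# Data entry module (hypothetical schema): the inputs the rules below need.
@dataclass
class PatientRecord:
    bradykinesia: bool
    rest_tremor: bool
    rigidity: bool
    absolute_exclusions: int   # number of absolute exclusion criteria present
    supportive_criteria: int   # number of supportive criteria present
    red_flags: int             # number of red flags present

# Knowledge base module: a condensed, assumed encoding of the criteria.
def has_parkinsonism(p):
    """Bradykinesia combined with rest tremor, rigidity, or both."""
    return p.bradykinesia and (p.rest_tremor or p.rigidity)

# Data analytics module: apply the rules to one captured record.
def classify(p):
    if not has_parkinsonism(p) or p.absolute_exclusions > 0:
        return "criteria not met"
    if p.red_flags == 0 and p.supportive_criteria >= 2:
        return "clinically established PD"
    # Up to two red flags are allowed if each is offset by a supportive criterion.
    if p.red_flags <= 2 and p.supportive_criteria >= p.red_flags:
        return "clinically probable PD"
    return "criteria not met"

established = classify(PatientRecord(True, True, False, 0, 2, 0))
probable = classify(PatientRecord(True, False, True, 0, 1, 1))
not_met = classify(PatientRecord(True, True, True, 0, 3, 3))
```

Keeping the rules in a separate function from the record schema mirrors the modular design: the knowledge base can be revised when criteria are updated without touching data capture.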


Publications

2026

Rose J, Hussain F, Sahoo SS, Lyytinen K, Song S, Menegay HJ, Lanese R, Sarangadharan Geetha A, Liu L, Tsui J, Beno N, Burus T, Koroukian SM. The Population Cancer Assessment and Surveillance Engine (PopCASE): an Emerging Population Cancer Data Platform. J Registry Management. 2026. In press.

2025

Iyer, V., Zurlo, I., Golnari, P., Prantzalos, K., Lobb, B. M., Boyd, J., ... & Gupta, D. K. (2025). Demonstration of a Prototype Clinical Decision Support System for Diagnosing Parkinson’s Disease. Parkinsonism & Related Disorders, 134.

Huang, C., Bodner, D., Schumacher, F., Sahoo, S., & Wilfredwu, C. H. (2025). IP06-32 Contemporary Data Redefines Kidney Stone Incidence: Emerging Age-Related Patterns. Journal of Urology, 213(5S), e325.

Zweber, C., Cholerton, B., Ryan, A., Zabetian, C., Miller, R., Iyer, V., ... & Gupta, D. K. (2025). Modeling the Heterogeneity and Trajectories of Cognitive Dysfunction in Parkinson’s Disease Using Partially Ordered Set Models. medRxiv, 2025-07.

Gupta, D. K., Golnari, P., Prantzalos, K., Zurlo, I., Iyer, V., Lobb, B. M., ... & Sahoo, S. S. (2025). CDS-PD: A Novel Clinical Decision Support Platform for Parkinson’s Disease. medRxiv, 2025-07.

Golnari, P., Prantzalos, K., Upadhyaya, D., Buchhalter, J., & Sahoo, S. S. (2025). Human in the Loop: Embedding Medical Expert Input in Large Language Models for Clinical Applications. Studies in health technology and informatics, 329, 658-662.

Upadhyaya, D.P., Cakir, G.B., Stefano, R., Shaikh, A., Albert, J., Sahoo, S.S.*, and Ghasia, F.*, 2025. A Multi-Head Attention Deep Learning Algorithm to Detect Amblyopia using Fixation Eye Movements. Ophthalmology Science (American Academy of Ophthalmology). *Co-corresponding authors.

Turner, M. D., Appaji, A., Ar Rakib, N., Golnari, P., Rajasekar, A. K., KV, A. R., ... & Turner, J. A. (2025). Large language models can extract metadata for annotation of human neuroimaging publications. Frontiers in Neuroinformatics, 19, 1609077.

Golnari P, Prantzalos K, Hood V, Meskis MA, Isom LL, Wilcox K, Parent JM, Lal D, Lhatoo SD, Goodkin HP, Wirrell EC, Knupp KG, Patel M, Loeb JA, Sullivan JE, Harte-Hargrove L, Fureman BE, Buchhalter J, Sahoo SS. Ontology accelerates few-shot learning capability of large language model: A study in extraction of drug efficacy in a rare pediatric epilepsy. Int J Med Inform. 2025 Sep;201:105942. doi: 10.1016/j.ijmedinf.2025.105942. Epub 2025 Apr 21. PMID: 40311258.

Prantzalos, K., Golnari, P., Upadhyaya, D., Thyagaraj, S., Fernandez Baca-Vaca, G., Luders, H., Sahoo, S.S. Standardized Epilepsy Data Collection and Analysis Leveraging the Four-Dimensional Epilepsy Classification (4D-EC) Framework. MEDINFO 2025. (Accepted)

2024

Upadhyaya, D.P., Shaikh, A.G., Cakir, G.B., Prantzalos, K., Golnari, P., Ghasia, F.F. & Sahoo, S.S. 2024, A 360° view for large language models: Early detection of amblyopia in children using multi-view eye movement recordings. in J. Finkelstein, R. Moskovitch & E. Parimbelli (eds), Artificial intelligence in medicine. AIME 2024. Lecture notes in computer science, vol. 14845, Springer, Cham, pp. 165–175.

Sahoo, S.S., Plasek, J.M., Xu, H., Uzuner, Ö., Cohen, T., Yetisgen, M., Liu, H., Meystre, S. and Wang, Y., 2024. Large language models for biomedicine: foundations, opportunities, challenges, and best practices. Journal of the American Medical Informatics Association, p.ocae074.

Upadhyaya, D.P., Shaikh, A., Prantzalos, K., Golnari, P., Ghasia, F.F. & Sahoo, S.S. 2024, Helios: a platform for early childhood amblyopia detection using fixation eye movements. AMIA 2024 Annual Symposium, November 9–13, San Francisco, CA, American Medical Informatics Association (Poster), (Accepted).

Upadhyaya, D.P., Prantzalos, K., Golnari, P., Shaikh, A.G., Sivagnanam, S., Majumdar, A., Ghasia, F.F. & Sahoo, S.S. 2025, Explainable artificial intelligence (XAI) in the era of large language models: applying an XAI framework in pediatric ophthalmology diagnosis using the Gemini model. AMIA 2025 Informatics Summit, March 10–13, Pittsburgh, PA, American Medical Informatics Association, (Accepted).

Prantzalos, K., Upadhyaya, D.P., Golnari, P., Fernandez-BacaVaca, G., Aispuro, G.P., Salehizadeh, S., Thyagaraj, S., Gurski, N. & Sahoo, S.S. 2024, Neural mosaics: detecting aberrant brain interactions using algebraic topology and generative artificial intelligence. AMIA 2024 Annual Symposium, November 9–13, San Francisco, CA, American Medical Informatics Association, (Accepted).

Upadhyaya, D.P., Tarabichi, Y., Prantzalos, K., Ayub, S., Kaelber, D.C. and Sahoo, S.S., 2024. Machine learning interpretability methods to characterize the importance of hematologic biomarkers in prognosticating patients with suspected infection. Computers in Biology and Medicine, 183, p.109251. doi: 10.1016/j.compbiomed.2024.109251

Sivagnanam, S., Yeu, S., Lin, K., Sakai, S., Garzon, F., Yoshimoto, K., Prantzalos, K., Upadhyaya, D.P., Majumdar, A., Sahoo, S.S. and Lytton, W.W., 2024. Towards building a trustworthy pipeline integrating Neuroscience Gateway and Open Science Chain. Database, 2024, p.baae023.

Sanchez, E., Upadhyaya, D.P., Cakir, G.B., Shaikh, A., Stefano, R., Sahoo, S. and Ghasia, F., 2024. Machine Learning, Artificial Intelligence and Eye Movements: Utility in Detection of Amblyopia. Investigative Ophthalmology & Visual Science, 65(7), pp.4301-4301.

2023

Upadhyaya, D.P., Prantzalos, K., Thyagaraj, S., Shafiabadi, N., Fernandez-BacaVaca, G., Sivagnanam, S., Majumdar, A. and Sahoo, S.S., 2023. Machine Learning Interpretability Methods to Characterize Brain Network Dynamics in Epilepsy. medRxiv 2023.06.25.23291874; doi: https://doi.org/10.1101/2023.06.25.23291874

Sahoo, S.S., Turner, M.D., Wang, L., Ambite, J.L., Appaji, A., Rajasekar, A., Lander, H.M., Wang, Y. and Turner, J.A., 2023. NeuroBridge ontology: computable provenance metadata to give the long tail of neuroimaging data a FAIR chance for secondary use. Frontiers in Neuroinformatics, 17, p.1216443.

Wang, X., Wang, Y., Ambite, J.L., Appaji, A., Lander, H., Moore, S.M., Rajasekar, A.K., Turner, J.A., Turner, M.D., Wang, L. and Sahoo, S.S., 2023, April. Enabling scientific reproducibility through FAIR data management: An ontology-driven deep learning approach in the NeuroBridge Project. In AMIA Annual Symposium Proceedings (Vol. 2022, p. 1135).

Prantzalos, K., Upadhyaya, D., Shafiabadi, N., Fernandez-BacaVaca, G., Gurski, N., Yoshimoto, K., Sivagnanam, S., Majumdar, A. and Sahoo, S.S., 2023. MaTiLDA: an integrated machine learning and topological data analysis platform for brain network dynamics. In Proceedings of the 29th Pacific Symposium on Biocomputing (PSB) 2024 (pp. 65-80). Preprint: medRxiv 2023.06.08.23290830; doi: https://doi.org/10.1101/2023.06.08.23290830

Wang, L., Ambite, J.L., Appaji, A., Bijsterbosch, J., Dockes, J., Herrick, R., Kogan, A., Lander, H., Marcus, D., Moore, S.M. and Poline, J.B., 2023. NeuroBridge: a prototype platform for discovery of the long-tail neuroimaging data. Frontiers in neuroinformatics, 17, p.1215261.

2022

Turner, J.A., Turner, M.D., Appaji, A., Rajasekar, A.K., Wang, L. & Sahoo, S.S. 2022, NeuroBridge ontology development for shared neuroimaging datasets. International Neuroinformatics Coordinating Facility (INCF) Assembly, 2022.

Spilsbury, J.C., Hernandez, E., Kiley, K., Gillerlane Hinkes, E., Prasanna, S., Shafiabadi, N., Rao, P. and Sahoo, S.S., 2022. Social Service Workers’ Use of Social Media to Obtain Client Information: Current Practices and Perspectives on a Potential Informatics Platform. Journal of social service research, 48(6), pp.739-752.

Sahoo, S.S., Kobow, K., Zhang, J., Buchhalter, J., Dayyani, M., Upadhyaya, D.P., Prantzalos, K., Bhattacharjee, M., Blumcke, I., Wiebe, S. and Lhatoo, S.D., 2022. Ontology-based feature engineering in machine learning workflows for heterogeneous epilepsy patient records. Scientific reports, 12(1), p.19430.

Lander, H., Rajasekar, A., Wang, Y., Watson, M., Sahoo, S., Turner, J., Poline, J-B. & Wang, L. 2022, Linking NeuroBridge and NeuroQuery with deep semantic matching. International Neuroinformatics Coordinating Facility (INCF) Assembly, 2022 (poster).

Education


Biomedical & Health Informatics Doctoral Program

The Biomedical & Health Informatics (BHI) doctoral program trains researchers in biomedicine, population health, and clinical care. Program trainees acquire a core set of skills spanning computing, biostatistics, and biomedical research through a combination of course work and participation in research studies in the Population and Quantitative Health Sciences (PQHS) department. The doctoral program is designed for students to acquire skills in three areas of concentration: Data Analytics, with a focus on statistics and data wrangling; Biomedical Health, with a focus on systems biology and clinical and health issues; and Computational and System Design, with a focus on knowledge representation, information retrieval, and Big Data.

PQHS 416: Introduction to Computing in Biomedical Health Informatics

“PQHS 416 introduces students to computational techniques and concepts that underpin biomedical and health informatics data management and analysis. In particular, the course will focus on the three topics of: (1) Biomedical terminologies and formal logic used in building knowledge models such as ontologies; (2) Natural language processing (NLP), and (3) Big Data technologies, including components of Hadoop stack and Apache Spark. This is a lecture-based course that relies on both materials covered in class and out-of-class readings of published literature. Students will be assigned reading assignments, homework exercise assignments and they are expected to complete homework assignment for each class. The students will be involved in a team project and they will be expected to prepare a project report at the end of the semester.”

Our Team

Satya Sahoo, PhD

Professor of Medical Informatics

Dipak Upadhyaya, MPH

PhD Candidate

Katrina Prantzalos, MS

PhD Candidate

Pedram Golnari, MD

PhD Student

Leonora Lipson

Undergraduate Researcher

Deep Desai

Undergraduate Researcher

Grace Wilding

Undergraduate Researcher

Our Alumni


1. Nasim Shafiabadi, MD

Research Fellow


2. Jianzhe Zhang

MS (First employer: ByteDance)


3. Arthur Gershon

PhD (Status: Post-Doctoral Scholar)


4. Catherine Jayapandian

PhD (Status: Post-Doctoral Scholar)


5. Priya Ramesh

MS (First employer: CoverMyMeds)


6. Xinting Hong

MS


7. Pramith Devulapalli

BS (Status: PhD at Purdue University)


8. Vimig Socrates

BS, MS (Status: PhD at Yale University)


9. Meng Zhao

MS (First employer: IBM Explorys)


10. Li Wang

MS


11. Chien-Hung Chen

BS


12. Chang Liu

MS (First employer: Microsoft Corporation)


13. Annan Wei

MS (First employer: Google Inc)


14. Manu Bulusu

BS (Undergraduate Researcher)


15. Pranav Nampoothiripad

BS (Undergraduate Researcher)


16. Srikeerthi Sevugan

BS (Undergraduate Researcher)


17. Tom Kupferer, MS

PhD Student


Funding Agencies