Our research focuses on developing artificial intelligence (AI) methods to analyze heterogeneous biomedical big data for translational applications. This ongoing work brings together two branches of AI, knowledge representation and reasoning and machine learning, to characterize brain network dynamics and electronic health record (EHR) data.

Knowledge representation and reasoning involves the development of knowledge models, or ontologies. We have led the development of new methods that apply ontology engineering principles across multiple stages of machine learning workflows, including feature engineering and model validation. This work includes the development of deep neural network (DNN) models and the use of classical machine learning algorithms such as support vector machines (SVM) for integrative analysis of multi-modal brain connectivity data in neurological disorders such as epilepsy and Parkinson's disease. To address the challenges of data quality and scientific reproducibility, we have led the development of a provenance metadata framework called ProvCaRe using ontology engineering and natural language processing techniques.

Epilepsy seizure networks; Structural connectivity networks derived from MRI; Functional connectivity networks derived from EEG; Provenance metadata; Ontology engineering; Data integration; High performance computing


22nd International Conference on Artificial Intelligence in Medicine (AIME 2024), Salt Lake City, Utah, USA, July 9-12

This premier conference will bring together global experts to explore the latest trends, research, and practical applications of AI in medicine. At AIME 2024, our lab presents research in medical diagnostics, led by Dipak Upadhyaya, demonstrating the efficacy of large language models in healthcare applications.


We welcome current CWRU undergraduate and graduate students who are interested in working at the intersection of biomedical research and computer science (primarily artificial intelligence research). Please contact Dr. Sahoo at sss124@case.edu.

Brain Connectivity in Neurological Disorders

We study the underlying mechanisms that influence the generation and progression of abnormal electrophysiological signals in epilepsy, a serious neurological disorder that affects more than 50 million individuals worldwide with debilitating seizures. Our research uses high-resolution signal data recorded using intracranial EEG with multiple contacts. However, this approach involves querying and analyzing large volumes of multi-modal data. To address this challenge, we incorporate Big Data analytics techniques, including the development of new data models that are compatible with large-scale data analysis techniques such as parallel and distributed computing. We have developed flexible analysis workflows with multiple measures of statistical correlation that quantitatively assess the strength of the connections among the brain regions active during a seizure event. More information is available on the project page: NIC Workflow Website
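As a minimal sketch of what a correlation-based connectivity measure can look like, the example below computes the absolute Pearson correlation between every pair of recording channels. Pearson correlation is only one possible measure and is chosen here as an assumption; the channel names, the sample values, and the functions `pearson` and `connectivity_matrix` are hypothetical and are not taken from the NIC workflow.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a flat signal carries no correlation information
    return cov / (norm_x * norm_y)


def connectivity_matrix(channels):
    """Map each pair of channel names to |r|, a simple connection strength."""
    names = sorted(channels)
    return {
        a: {b: abs(pearson(channels[a], channels[b])) for b in names}
        for a in names
    }


# Hypothetical toy recordings: one short signal per electrode contact.
channels = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],    # scaled copy of A
    "D": [1.0, -1.0, 1.0, -1.0],  # oscillation weakly related to A
}
matrix = connectivity_matrix(channels)
print(matrix["A"]["B"], matrix["A"]["D"])
```

In this toy input, channels A and B are perfectly correlated while A and D are only weakly related, so the matrix separates a strong connection from a weak one. A real workflow would compute such matrices per time window and distribute the channel pairs across workers.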


As an extension of these data-processing workflows, we have also developed MaTiLDA, an integrated web platform for analyzing abnormal electrophysiological signals using topological data analysis (TDA) and machine learning algorithms. MaTiLDA features a graphical user interface that enables users to apply these methods to datasets from neurophysiological recordings without requiring substantial domain knowledge in mathematics or computing. More information is available on the MaTiLDA project page.
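To give a flavor of what TDA extracts from a signal, the sketch below computes 0-dimensional persistence pairs for the sublevel sets of a one-dimensional signal: each local minimum is "born" at its value and "dies" when its valley merges into a deeper one. This is a generic textbook construction offered under the assumption that it is a useful illustration; it is not a description of the features MaTiLDA computes, and the function name `sublevel_persistence` and the toy signal are hypothetical.

```python
import math


def sublevel_persistence(signal):
    """Return (birth, death) pairs of 0-dimensional sublevel-set persistence."""
    n = len(signal)
    order = sorted(range(n), key=lambda i: (signal[i], i))
    parent = [None] * n  # None marks samples not yet added to the filtration
    birth = {}           # component root -> value of its lowest sample

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:  # add samples from lowest to highest value
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the higher birth value dies.
                old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                if signal[i] > birth[young]:  # skip zero-persistence pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    if n:
        pairs.append((min(signal), math.inf))  # the global minimum never dies
    return sorted(pairs, key=lambda p: p[1] - p[0], reverse=True)


# Hypothetical toy signal with three local minima (0, 1, and 0.5).
diagram = sublevel_persistence([0, 2, 1, 3, 0.5, 4])
print(diagram)
```

Long-lived pairs correspond to prominent valleys in the signal and short-lived pairs to small fluctuations, which is why persistence values are often used as input features for machine learning models.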

Provenance Metadata for Scientific Reproducibility

Scientific reproducibility is key to scientific progress because it allows the research community to build on validated results, protect patients from potentially harmful trial drugs derived from incorrect results, and reduce the waste of valuable resources. To support reproducibility in the biomedical research domain, we are developing the Provenance for Clinical and Healthcare Research (ProvCaRe) framework using World Wide Web Consortium (W3C) PROV specifications, including the PROV Ontology (PROV-O). In the ProvCaRe project, we are extending PROV-O to create a formal model of the provenance information that is necessary for scientific reproducibility in biomedical research. The ProvCaRe framework aims to model, extract, and analyze provenance information. It is built around the S3 Model, which extends the PROV specifications to model provenance metadata describing the Study Method, Study Tools, and Study Data of a research study. We have developed a provenance-specific text processing pipeline that uses the ProvCaRe ontology to identify and extract provenance metadata from published literature describing biomedical research studies. The ProvCaRe knowledge repository contains provenance "triples" extracted from published research studies, which users can query and explore using "hypothesis-based search". Coming Soon
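The idea of provenance "triples" can be sketched with a few subject-predicate-object statements and a pattern-matching query. In the example below, `prov:Activity` and `prov:used` are terms from PROV-O, while the `ex:` identifiers, the class names `ex:StudyMethod`, `ex:StudyTool`, and `ex:StudyData`, the property `ex:hasMethod`, and the helper functions are hypothetical stand-ins for the S3 Model rather than the actual ProvCaRe ontology or repository interface. The study described is invented for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Invented example study; every ex: name is a placeholder.
TRIPLES = [
    Triple("ex:study1", "rdf:type", "prov:Activity"),
    Triple("ex:study1", "ex:hasMethod", "ex:cohortDesign"),
    Triple("ex:cohortDesign", "rdf:type", "ex:StudyMethod"),
    Triple("ex:study1", "prov:used", "ex:scoringSoftware"),
    Triple("ex:scoringSoftware", "rdf:type", "ex:StudyTool"),
    Triple("ex:study1", "prov:used", "ex:recordings"),
    Triple("ex:recordings", "rdf:type", "ex:StudyData"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every non-None pattern field."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def used_of_type(triples, study, s3_type):
    """Resources that a study used and that are typed with the given class."""
    used = {t.obj for t in match(triples, subject=study, predicate="prov:used")}
    typed = {t.subject for t in match(triples, predicate="rdf:type", obj=s3_type)}
    return sorted(used & typed)


print(used_of_type(TRIPLES, "ex:study1", "ex:StudyTool"))
```

A production repository would store such statements in an RDF triple store and query them with SPARQL; the in-memory list here only illustrates the shape of the data.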

Ontology-based clinical decision-support system

Our primary objective is to create a clinical decision support system (CDSS) that mirrors the clinical workflow in movement disorders for diagnosing Parkinson’s disease (PD) using the International Parkinson and Movement Disorder Society criteria (MDS-PD). The MDS-PD criteria allow highly sensitive and specific diagnosis of PD, but they are inherently complex to apply manually with pen and paper and are not currently supported in electronic health record systems. This highlights the need for a CDSS that implements these criteria and supports clinicians and researchers alike in clinical care and research. Our modular approach to creating the CDSS, ORMIS-PD, consists of three steps: first, building the data entry module to capture the patient data needed for the MDS-PD criteria; then, building the knowledge base module to model the algorithm of the MDS-PD criteria; and finally, building the data analytics module, which applies the algorithm to the captured data to classify the patient into one of the three levels of diagnostic classification of the MDS-PD criteria. ORMIS-PD Website
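The three-module design described above can be sketched as a record type (data entry), a rule set (knowledge base), and a classification function (data analytics). The rules below are a simplified reading of the published MDS-PD criteria and should be treated as assumptions: the thresholds, the three category labels, and every name (`PatientRecord`, `RULES`, `classify`) are illustrative and are not taken from ORMIS-PD.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Data entry module: the counts the criteria need (hypothetical fields)."""
    has_parkinsonism: bool
    absolute_exclusions: int
    red_flags: int
    supportive_criteria: int

    def __post_init__(self):
        counts = (self.absolute_exclusions, self.red_flags, self.supportive_criteria)
        if any(c < 0 for c in counts):
            raise ValueError("criterion counts cannot be negative")


# Knowledge base module: rules kept as data, checked in order (assumed thresholds).
RULES = [
    ("Clinically established PD",
     lambda r: r.red_flags == 0 and r.supportive_criteria >= 2),
    ("Clinically probable PD",
     lambda r: r.red_flags <= 2 and r.supportive_criteria >= r.red_flags),
]
NOT_MET = "Criteria not met"


def classify(record):
    """Data analytics module: apply the rules to one captured record."""
    if not record.has_parkinsonism or record.absolute_exclusions > 0:
        return NOT_MET
    for label, rule in RULES:
        if rule(record):
            return label
    return NOT_MET


print(classify(PatientRecord(True, 0, 1, 1)))
```

Keeping the rules as data separate from the classification function mirrors the modular split in the text: the knowledge base can be revised without touching data capture or the analytics code.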



Sheth, A.P., York, W.S., Thomas, C., Nagarajan, M., Miller, J.A., Kochut, K., Sahoo, S.S. and Yi, X., 2004. Semantic Web technology in support of Bioinformatics for Glycan Expression.


Biomedical & Health Informatics Doctoral Program

The Biomedical & Health Informatics (BHI) doctoral program trains researchers in biomedicine, population health, and clinical care. Program trainees will acquire a core set of skills spanning computing, biostatistics, and biomedical research through a combination of course work and participation in the study in the Population and Quantitative Health Sciences (PQHS) department. The doctoral program is designed for students to acquire skills in the three areas of concentration: Data Analytics with a focus on statistics and data wrangling, Biomedical Health with a focus on systems biology, clinical, and health issues and Computational and System Design with a focus on knowledge representation, information retrieval, and Big Data.

PQHS 416: Introduction to Computing in Biomedical Health Informatics

“PQHS 416 introduces students to computational techniques and concepts that underpin biomedical and health informatics data management and analysis. In particular, the course will focus on the three topics of: (1) Biomedical terminologies and formal logic used in building knowledge models such as ontologies; (2) Natural language processing (NLP), and (3) Big Data technologies, including components of Hadoop stack and Apache Spark. This is a lecture-based course that relies on both materials covered in class and out-of-class readings of published literature. Students will be assigned reading assignments, homework exercise assignments and they are expected to complete homework assignment for each class. The students will be involved in a team project and they will be expected to prepare a project report at the end of the semester.”

Our Team

Satya Sahoo, PhD

Assoc. Prof. of Medical Informatics

Katrina Prantzalos, MS

PhD Candidate

Pedram Golnari, MD

PhD Student

Nasim Shafiabadi, MD

Research Fellow

Dipak Upadhyaya, MPH

PhD Candidate

Pranav Nampoothiripad

Undergraduate Researcher

Keerthi Sevugan

Undergraduate Researcher

Leonora Lipson

Undergraduate Researcher

Our Alumni

1. Jianzhe Zhang

MS (First employer: ByteDance)

2. Arthur Gershon

PhD (Status: Post-Doctoral Scholar)

3. Catherine Jayapandian

PhD (Status: Post-Doctoral Scholar)

4. Priya Ramesh

MS (First employer: CoverMyMeds)

5. Xinting Hong


6. Pramith Devulapalli

BS (Status: PhD at Purdue University)

7. Vimig Socrates

BS, MS (Status: PhD at Yale University)

8. Meng Zhao

MS (First employer: IBM Explorys)

9. Li Wang


10. Chien-Hung Chen


11. Chang Liu

MS (First employer: Microsoft Corporation)

12. Annan Wei

MS (First employer: Google Inc)

Funding Agencies

National Institute of Biomedical Imaging and Bioengineering

National Institute on Drug Abuse

Department of Defense, Congressionally Directed Medical Research Programs

Dravet Syndrome Foundation

U.S. Department of Veterans Affairs

