Our research focuses on developing artificial intelligence (AI) methods to analyze heterogeneous biomedical big data for translational applications. This ongoing work brings together two branches of AI, knowledge representation and reasoning and machine learning, to characterize brain network dynamics and electronic health records (EHR) data.
Knowledge representation and reasoning involves development of knowledge models or ontologies. We have led the development of new methods to use ontology engineering principles across multiple stages of machine learning workflows, including feature engineering and model validation. This involves the development of deep neural network (DNN) models and the use of classical machine learning algorithms such as support vector machines (SVM) for integrative analysis of multi-modal brain connectivity data in neurological disorders such as epilepsy and Parkinson's Disease. To address the challenges of data quality and scientific reproducibility, we have led the development of a provenance metadata framework called ProvCaRe using ontology engineering and natural language processing techniques.
Epilepsy seizure networks; Structural connectivity networks derived from MRI; Functional connectivity networks derived from EEG; Provenance metadata; Ontology engineering; Data integration; High performance computing
This annual event is the world’s premier meeting for the research and practice of biomedical and health informatics. The symposium convenes informaticians from around the world to share research and insights for leveraging health information and cutting-edge technologies to improve human health. At AMIA 2024, our lab, led by Katrina Prantzalos and Dipak Upadhyaya, showcases research in medical diagnostics, highlighting the effectiveness of large language models in healthcare applications.
The AMIA 2025 Informatics Summit is a premier gathering of researchers, clinicians, industry experts, and policymakers dedicated to advancing health informatics and data-driven healthcare. At the summit, our lab—led by Dipak Upadhyaya—showcases research demonstrating how explainable AI principles, when integrated with large language models, can enhance diagnostic accuracy and transparency in pediatric ophthalmology.
A 1-year Google Cloud Research resource award for developing a new tuning approach for Foundational AI models, such as large language models.
Leonora Lipson, an undergraduate researcher, presented her poster titled "Edge-Aware Signal Coupling Graph Neural Networks for Seizure Detection" at the Spring Intersections Symposium at Case Western Reserve University. Mentored by Professor Satya Sahoo and Katrina Prantzalos, her interdisciplinary project leverages advanced graph neural network architectures to enhance seizure detection accuracy using neurophysiological data. The work, co-authored with experts in biomedical informatics and neurology, exemplifies the impactful research being conducted by CWRU undergraduates.
We welcome current CWRU undergraduate and graduate students who are interested in working at the intersection of biomedical research and computer science (primarily artificial intelligence research). Please contact Dr. Sahoo at sss124@case.edu.
Upadhyaya, D.P., Cakir, G.B., Stefano, R., Shaikh, A., Albert, J., Sahoo, S.S.*, and Ghasia, F.*, 2025. A Multi-Head Attention Deep Learning Algorithm to Detect Amblyopia Using Fixation Eye Movements. Ophthalmology Science, 2025, Published by the American Academy of Ophthalmology. *Co-corresponding author.
Golnari, P., Prantzalos, K., Hood, V., Meskis, M.A., Isom, L.L., Wilcox, K., Parent, J.M., Lal, D., Lhatoo, S.D., Goodkin, H.P., Wirrell, E.C., Knupp, K.G., Patel, M., Loeb, J.A., Sullivan, J.E., Harte-Hargrove, L., Fureman, B.E., Buchhalter, J., Sahoo, S.S., 2025. Ontology Accelerates Few-Shot Learning Capability of Large Language Model: A Study in Extraction of Drug Efficacy in a Rare Pediatric Epilepsy. International Journal of Medical Informatics, 2025 (Accepted).
Golnari, P., Prantzalos, K., Upadhyaya D., Buchhalter, J., Sahoo S.S., Human in the Loop: Embedding Medical Expert Input in Large Language Models for Clinical Applications. MEDINFO 2025. (Accepted).
Prantzalos, K., Golnari, P., Upadhyaya D., Thyagaraj S., Fernandez Baca-Vaca, G., Luders H., Sahoo S.S., Standardized Epilepsy Data Collection and Analysis Leveraging the Four-Dimensional Epilepsy Classification (4D-EC) Framework. MEDINFO 2025. (Accepted).
Iyer, V., Zurlo, I., Golnari, P., Prantzalos, K., Lobb, B. M., Boyd, J., Hiller, A.L., Sahoo, S.S. & Gupta, D. K. 2025. Demonstration of a Prototype Clinical Decision Support System for Diagnosing Parkinson’s Disease. Parkinsonism & Related Disorders (Poster).
Sivagnanam, S., Yeu, S., Lin, K., Sakai, S., Garzon, F., Yoshimoto, K., Prantzalos, K., Upadhyaya, D.P., Majumdar, A., Sahoo, S.S. and Lytton, W.W., 2024. Towards building a trustworthy pipeline integrating Neuroscience Gateway and Open Science Chain. Database, 2024, p.baae023.
Upadhyaya, D.P., Prantzalos, K., Golnari, P., Shaikh, A.G., Sivagnanam, S., Majumdar, A., Ghasia, F.F. & Sahoo, S.S. 2025, Explainable artificial intelligence (XAI) in the era of large language models: applying an XAI framework in pediatric ophthalmology diagnosis using the Gemini model. AMIA 2025 Informatics Summit, March 10–13, Pittsburgh, PA, American Medical Informatics Association, (Accepted).
Prantzalos, K., Upadhyaya, D.P., Golnari, P., Fernandez-BacaVaca, G., Aispuro, G.P., Salehizadeh, S., Thyagaraj, S., Gurski, N. & Sahoo, S.S. 2024, Neural mosaics: detecting aberrant brain interactions using algebraic topology and generative artificial intelligence. AMIA 2024 Annual Symposium, November 9–13, San Francisco, CA, American Medical Informatics Association, (Accepted).
Turner MD, Golnari P, Rakib NA, Rathnam A, Appaji A, Rajasekar A, Sahoo SS, Wang Y, Wang Y, Turner JA, 2024. Benchmarks for Methods and Study Data Extraction from Human Neuroscience Publications. INCF Neuroinformatics Assembly, Austin, TX, 2024.
Sanchez, E., Upadhyaya, D.P., Cakir, G.B., Shaikh, A., Stefano, R., Sahoo, S. and Ghasia, F., 2024. Machine Learning, Artificial Intelligence and Eye Movements: Utility in Detection of Amblyopia. Investigative Ophthalmology & Visual Science, 65(7), pp.4301-4301.
Sahoo, S.S., Plasek, J.M., Xu, H., Uzuner, Ö., Cohen, T., Yetisgen, M., Liu, H., Meystre, S. and Wang, Y., 2024. Large language models for biomedicine: foundations, opportunities, challenges, and best practices. Journal of the American Medical Informatics Association, p.ocae074.
Upadhyaya, D.P., Tarabichi, Y., Prantzalos, K., Ayub, S., Kaelber, D.C. and Sahoo, S.S., 2024. Machine learning interpretability methods to characterize the importance of hematologic biomarkers in prognosticating patients with suspected infection. Computers in Biology and Medicine, 183, p.109251.
Upadhyaya, D.P., Shaikh, A.G., Cakir, G.B., Prantzalos, K., Golnari, P., Ghasia, F.F. & Sahoo, S.S. 2024, A 360° view for large language models: Early detection of amblyopia in children using multi-view eye movement recordings. in J. Finkelstein, R. Moskovitch & E. Parimbelli (eds), Artificial intelligence in medicine. AIME 2024. Lecture notes in computer science, vol. 14845, Springer, Cham, pp. 165–175.
Upadhyaya, D.P., Shaikh, A., Prantzalos, K., Golnari, P., Ghasia, F.F. & Sahoo, S.S. 2024, Helios: a platform for early childhood amblyopia detection using fixation eye movements. AMIA 2024 Annual Symposium, November 9–13, San Francisco, CA, American Medical Informatics Association (Poster), (Accepted).
Prantzalos, K., Upadhyaya, D., Shafiabadi, N., Fernandez-BacaVaca, G., Gurski, N., Yoshimoto, K., Sivagnanam, S., Majumdar, A. and Sahoo, S.S., 2023. MaTiLDA: an integrated machine learning and topological data analysis platform for brain network dynamics. In PACIFIC SYMPOSIUM ON BIOCOMPUTING 2024 (pp. 65-80).
Sahoo, S.S., Turner, M.D., Wang, L., Ambite, J.L., Appaji, A., Rajasekar, A., Lander, H.M., Wang, Y. and Turner, J.A., 2023. NeuroBridge ontology: computable provenance metadata to give the long tail of neuroimaging data a FAIR chance for secondary use. Frontiers in Neuroinformatics, 17, p.1216443.
Upadhyaya, D.P., Prantzalos, K., Thyagaraj, S., Shafiabadi, N., Fernandez-BacaVaca, G., Sivagnanam, S., Majumdar, A. and Sahoo, S.S., 2023. Machine Learning Interpretability Methods to Characterize Brain Network Dynamics in Epilepsy. medRxiv.
Wang, X., Wang, Y., Ambite, J.L., Appaji, A., Lander, H., Moore, S.M., Rajasekar, A.K., Turner, J.A., Turner, M.D., Wang, L. and Sahoo, S.S., 2023, April. Enabling scientific reproducibility through FAIR data management: An ontology-driven deep learning approach in the NeuroBridge Project. In AMIA Annual Symposium Proceedings (Vol. 2022, p. 1135).
Wang, L., Ambite, J.L., Appaji, A., Bijsterbosch, J., Dockes, J., Herrick, R., Kogan, A., Lander, H., Marcus, D., Moore, S.M. and Poline, J.B., 2023. NeuroBridge: a prototype platform for discovery of the long-tail neuroimaging data. Frontiers in neuroinformatics, 17, p.1215261.
Sahoo, S.S., Kobow, K., Zhang, J., Buchhalter, J., Dayyani, M., Upadhyaya, D.P., Prantzalos, K., Bhattacharjee, M., Blumcke, I., Wiebe, S. and Lhatoo, S.D., 2022. Ontology-based feature engineering in machine learning workflows for heterogeneous epilepsy patient records. Scientific reports, 12(1), p.19430.
Spilsbury, J.C., Hernandez, E., Kiley, K., Gillerlane Hinkes, E., Prasanna, S., Shafiabadi, N., Rao, P. and Sahoo, S.S., 2022. Social Service Workers’ Use of Social Media to Obtain Client Information: Current Practices and Perspectives on a Potential Informatics Platform. Journal of social service research, 48(6), pp.739-752.
Turner, J.A., Turner, M.D., Appaji, A., Rajasekar, A.K., Wang, L. & Sahoo, S.S. 2022, NeuroBridge ontology development for shared neuroimaging datasets. International Neuroinformatics Coordinating Facility (INCF) Assembly, 2022.
Lander, H., Rajasekar, A., Wang, Y., Watson, M., Sahoo, S., Turner, J., Poline, J-B. & Wang, L. 2022, Linking NeuroBridge and NeuroQuery with deep semantic matching. Neuroinformatics Assembly.
Zhang, J., Bauman, R., Shafiabadi, N., Gurski, N., Fernandez-BacaVaca, G. and Sahoo, S.S., 2022, February. Characterizing brain network dynamics using persistent homology in patients with refractory epilepsy. In AMIA Annual Symposium Proceedings (Vol. 2021, p. 1244).
Gupta, D.K., Marano, M., Aurora, R., Boyd, J. and Sahoo, S.S., 2020. Movement disorders ontology for clinically oriented and clinicians-driven data mining of multi-center cohorts in Parkinson’s disease. medRxiv, pp.2020-11.
Prantzalos, K., Zhang, J., Shafiabadi, N., Fernandez-BacaVaca, G. and Sahoo, S.S., 2022, February. Epilepsy-Connect: An Integrated Knowledgebase for Characterizing Alterations in Consciousness State of Pharmacoresistant Epilepsy Patients. In AMIA Annual Symposium Proceedings (Vol. 2021, p. 1019).
Carr, S.J., Gershon, A., Shafiabadi, N., Lhatoo, S.D., Tatsuoka, C. and Sahoo, S.S., 2021. An integrative approach to study structural and functional network connectivity in epilepsy using imaging and signal data. Frontiers in integrative neuroscience, 14, p.491403.
Liu, C., Kim, M., Rueschman, M. and Sahoo, S.S., 2020. ProvCaRe: A Large-Scale Semantic Provenance Resource for Scientific Reproducibility. In Provenance in Data Science: From Data Models to Context-Aware Knowledge Graphs (pp. 59-73). Cham: Springer International Publishing.
Sahoo, S.S., Gershon, A., Nassim, S., Kaushik, G., Curtis, T., Lhatoo, S.D. and Fernandez-BacaVaca, G., 2020. NeuroIntegrative Connectivity (NIC) informatics tool for brain functional connectivity network analysis in cohort studies. In AMIA Annual Symposium Proceedings (Vol. 2020, p. 1090). American Medical Informatics Association.
Lhatoo, S.D., Bernasconi, N., Blumcke, I., Braun, K., Buchhalter, J., Denaxas, S., Galanopoulou, A., Josephson, C., Kobow, K., Lowenstein, D. and Ryvlin, P., 2020. Big data in epilepsy: clinical and research considerations. Report from the Epilepsy Big Data Task Force of the International League Against Epilepsy. Epilepsia, 61(9), pp.1869-1883.
Sahoo, S.S., Valdez, J., Rueschman, M. and Kim, M., 2019. Semantic Provenance Graph for Reproducibility of Biomedical Research Studies: Generating and Analyzing Graph Structures from Published Literature. In MEDINFO 2019: Health and Wellbeing e-Networks for All (pp. 328-332). IOS Press.
Hong, X., Liu, C., Momotaz, H., Cassidy, K., Sajatovic, M. and Sahoo, S.S., 2020, March. Enhancing multi-center patient cohort studies in the managing epilepsy well (MEW) network: integrated data integration and statistical analysis. In AMIA Annual Symposium Proceedings (Vol. 2019, p. 1071).
Sahoo, S.S., Valdez, J., Kim, M., Rueschman, M. and Redline, S., 2019. ProvCaRe: characterizing scientific reproducibility of biomedical research studies using semantic provenance metadata. International journal of medical informatics, 121, pp.10-18.
Gershon, A., Devulapalli, P., Zonjy, B., Ghosh, K., Tatsuoka, C. and Sahoo, S.S., 2019. Computing functional brain connectivity in neurological disorders: efficient processing and retrieval of electrophysiological signal data. AMIA Summits on Translational Science Proceedings, 2019, p.107.
Socrates, V., Gershon, A.L. and Sahoo, S.S., 2019, August. Computation of Brain Functional Connectivity Network Measures in Epilepsy: A Web-Based Platform for EEG Signal Data Processing and Analysis. In MedInfo (pp. 1590-1591).
Yang, S., Ghosh, K., Sakaie, K., Sahoo, S.S., Carr, S.J.A. and Tatsuoka, C., 2019. A simplified crossing fiber model in diffusion weighted imaging. Frontiers in neuroscience, 13, p.492.
Biomedical & Health Informatics Doctoral Program | PQHS 416: Introduction to Computing in Biomedical Health Informatics |
---|---|
The Biomedical & Health Informatics (BHI) doctoral program trains researchers in biomedicine, population health, and clinical care. Program trainees will acquire a core set of skills spanning computing, biostatistics, and biomedical research through a combination of course work and participation in the study in the Population and Quantitative Health Sciences (PQHS) department. The doctoral program is designed for students to acquire skills in the three areas of concentration: Data Analytics with a focus on statistics and data wrangling, Biomedical Health with a focus on systems biology, clinical, and health issues and Computational and System Design with a focus on knowledge representation, information retrieval, and Big Data. | “PQHS 416 introduces students to computational techniques and concepts that underpin biomedical and health informatics data management and analysis. In particular, the course will focus on the three topics of: (1) Biomedical terminologies and formal logic used in building knowledge models such as ontologies; (2) Natural language processing (NLP), and (3) Big Data technologies, including components of Hadoop stack and Apache Spark. This is a lecture-based course that relies on both materials covered in class and out-of-class readings of published literature. Students will be assigned reading assignments, homework exercise assignments and they are expected to complete homework assignment for each class. The students will be involved in a team project and they will be expected to prepare a project report at the end of the semester.” |
Research Fellow
Undergraduate Researcher
Undergraduate Researcher
Undergraduate Researcher
MS (First employer: ByteDance)
PhD (Status: Post-Doctoral Scholar)
PhD (Status: Post-Doctoral Scholar)
MS (First employer: CoverMyMeds)
MS
BS (Status: PhD at Purdue University)
BS, MS (Status: PhD at Yale University)
MS (First employer: IBM Explorys)
MS
BS
MS (First employer: Microsoft Corporation)
MS (First employer: Google Inc)
BS (Undergraduate Researcher)
BS (Undergraduate Researcher)