Our Research


Our research focuses on developing artificial intelligence (AI) methods to analyze heterogeneous biomedical big data for translational applications. This ongoing work brings together two branches of AI, knowledge representation and reasoning and machine learning, to characterize brain network dynamics and electronic health record (EHR) data.

Knowledge representation and reasoning involves development of knowledge models or ontologies. We have led the development of new methods to use ontology engineering principles across multiple stages of machine learning workflows, including feature engineering and model validation. This involves the development of deep neural network (DNN) models and the use of classical machine learning algorithms such as support vector machines (SVM) for integrative analysis of multi-modal brain connectivity data in neurological disorders such as epilepsy and Parkinson's Disease. To address the challenges of data quality and scientific reproducibility, we have led the development of a provenance metadata framework called ProvCaRe using ontology engineering and natural language processing techniques.


Research Interests

Epilepsy seizure networks; Structural connectivity networks derived from MRI; Functional connectivity networks derived from EEG; Provenance metadata; Ontology engineering; Data integration; High performance computing

News

AMIA 2024 Annual Symposium, San Francisco, California, USA, November 9 - 13, 2024

This annual event is the world’s premier meeting for the research and practice of biomedical and health informatics. The symposium convenes informaticians from around the world to share research and insights for leveraging health information and cutting-edge technologies to improve human health. At AMIA 2024, our lab, led by Katrina Prantzalos and Dipak Upadhyaya, showcases research in medical diagnostics, highlighting the effectiveness of large language models in healthcare applications.


AMIA 2025 Informatics Summit, Pittsburgh, Pennsylvania, USA, March 10 - 13, 2025

The AMIA 2025 Informatics Summit is a premier gathering of researchers, clinicians, industry experts, and policymakers dedicated to advancing health informatics and data-driven healthcare. At the summit, our lab, led by Dipak Upadhyaya, showcases research demonstrating how explainable AI principles, when integrated with large language models, can enhance diagnostic accuracy and transparency in pediatric ophthalmology.


Opportunities

CWRU Students

We welcome current CWRU undergraduate and graduate students who are interested in working at the intersection of biomedical research and computer science (primarily artificial intelligence research). Please contact Dr. Sahoo at sss124@case.edu.


Projects and Resources

Brain Connectivity in Neurological Disorders



We study the underlying mechanisms that influence the generation and progression of abnormal electrophysiological signals in epilepsy, a serious neurological disorder that affects more than 50 million individuals worldwide with debilitating seizures. Our research uses high-resolution signal data recorded using intracranial EEG with multiple contacts. However, this approach involves querying and analyzing large volumes of multi-modal data. To address this challenge, we incorporate Big Data analytics techniques, including the development of new data models that are compatible with large-scale data analysis techniques such as parallel and distributed computing. We have developed flexible analysis workflows with multiple measures of statistical correlation that quantitatively assess the strength of the connections among the brain regions active during a seizure event. More information is available on the project page. NIC Workflow Website
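
As a rough illustration of how a statistical correlation measure can quantify connection strength between recording contacts, the sketch below computes the absolute Pearson correlation for every pair of channels. It is a minimal example under stated assumptions, not the NIC workflow itself: the channel names (`LA1`, `LA2`, `RH1`), the toy signal values, and the helper functions `pearson` and `connectivity_matrix` are all hypothetical, and Pearson correlation stands in for whichever measures the workflow actually applies.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def connectivity_matrix(signals):
    """Map each unordered channel pair to the absolute correlation |r|."""
    names = sorted(signals)
    return {
        (a, b): abs(pearson(signals[a], signals[b]))
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical three-contact recording (toy values, not real EEG data).
signals = {
    "LA1": [0.0, 1.0, 2.0, 3.0, 4.0],
    "LA2": [0.1, 1.1, 1.9, 3.2, 3.9],
    "RH1": [2.0, -1.0, 0.5, 1.5, -0.5],
}

matrix = connectivity_matrix(signals)
strongest = max(matrix, key=matrix.get)  # channel pair with the largest |r|
```

Because each channel pair is computed independently, a calculation of this shape can in principle be split across workers, which is one reason pairwise measures suit parallel and distributed processing.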


MaTiLDA



As an extension of these data-processing workflows, we have also developed MaTiLDA as an integrated web platform for analyzing abnormal electrophysiological signals using topological data analysis (TDA) and machine learning algorithms. MaTiLDA features a graphical user interface that enables users to apply topological data analysis and machine learning algorithms to analyze datasets from neurophysiological recordings without requiring substantial domain knowledge in mathematics or computing. More information on MaTiLDA can be found here. MaTiLDA
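
To give a flavor of what topological data analysis computes, the sketch below derives the 0-dimensional persistent homology (the merge times of connected components) of a small network from a distance matrix. It is a generic illustration and not MaTiLDA's implementation: the function name `h0_persistence`, the toy distances, and the idea of using a value such as one minus the absolute correlation as the distance between brain regions are assumptions made for this example.

```python
def h0_persistence(dist):
    """Death times of connected components in a Vietoris-Rips filtration.

    `dist` is a symmetric distance matrix (list of lists). Every point is
    born at scale 0; a component dies at the edge length that merges it
    into another component.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process edges from shortest to longest, as the filtration scale grows.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    deaths = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(weight)
    return deaths  # n - 1 finite bars; one component persists forever


# Hypothetical distances among four regions forming two tight pairs.
dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
deaths = h0_persistence(dist)
```

Here the two short bars reflect the tightly coupled pairs, and the long bar reflects the scale at which the two pairs finally join; features of this kind can then be passed to a machine learning model.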

Provenance Metadata for Scientific Reproducibility



Scientific reproducibility is key to scientific progress as it allows the research community to build on validated results, protect patients from potentially harmful trial drugs derived from incorrect results, and reduce wastage of valuable resources. To address this challenge in the biomedical research domain, we are developing the Provenance for Clinical and Healthcare Research (ProvCaRe) framework using World Wide Web Consortium (W3C) PROV specifications, including the PROV Ontology (PROV-O). In the ProvCaRe project, we are extending PROV-O to create a formal model of the provenance information that is necessary for scientific reproducibility in biomedical research. The ProvCaRe framework aims to model, extract, and analyze provenance information. It consists of the S3 Model, which extends the PROV specifications to model provenance metadata describing the Study Method, Study Tools, and Study Data of a research study. We have developed a provenance-specific text processing pipeline that uses the ProvCaRe ontology to identify and extract provenance metadata from published literature describing biomedical research studies. The ProvCaRe knowledge repository contains provenance "triples" extracted from published research studies that users can query and explore using "hypothesis-based search". Coming Soon
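
The idea of provenance "triples" can be sketched with a few subject-predicate-object statements and a simple pattern match. This is a minimal illustration, not the ProvCaRe repository or its hypothesis-based search: the `prov:` predicates are properties defined by W3C PROV-O, but every `ex:` identifier, the `Triple` type, and the `match` helper are hypothetical names introduced here.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical provenance statements about one study. The "ex:" identifiers
# are invented for illustration; the "prov:" terms come from W3C PROV-O.
triples = [
    Triple("ex:sleep_study", "rdf:type", "prov:Activity"),
    Triple("ex:sleep_study", "prov:used", "ex:polysomnography_recordings"),
    Triple("ex:sleep_study", "prov:wasAssociatedWith", "ex:scoring_software"),
    Triple("ex:apnea_index", "prov:wasGeneratedBy", "ex:sleep_study"),
    Triple("ex:apnea_index", "prov:wasDerivedFrom", "ex:polysomnography_recordings"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Which data did the study use?
inputs = [t.obj for t in match(triples, subject="ex:sleep_study", predicate="prov:used")]
```

In this toy setting, the study's data, tools, and method each appear as nodes linked by typed relations, so a question about how a result was produced becomes a pattern over those relations.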

Ontology-based clinical decision-support system



Our primary objective is to create a clinical decision support system (CDSS) that mirrors the movement disorders clinical workflow for diagnosing Parkinson’s disease (PD) using the International Parkinson and Movement Disorder Society criteria (MDS-PD). The MDS-PD criteria allow highly sensitive and specific diagnosis of PD but are inherently complex to apply using a manual pen-and-paper approach and are not currently supported in electronic health record systems. This highlights the need for a CDSS that enables implementation of these criteria and supports clinicians and researchers alike in clinical care and research. Our modular approach to creating ORMIS-PD consists of three steps: first, building the data entry module to capture the patient data needed for the MDS-PD criteria; second, building the knowledge base module to model the algorithm of the MDS-PD criteria; and finally, building the data analytics module to apply the algorithm to the captured data and classify the patient into one of the three levels of diagnostic classification of the MDS-PD criteria. ORMIS-PD Website
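
The three-module structure described above can be sketched in a few lines. This is an illustrative outline only, not the ORMIS-PD implementation and not suitable for clinical use: the field names, the class `PatientRecord`, the functions `classify` and `classify_cohort`, the class labels, and the thresholds are assumptions based on a simplified reading of the published MDS-PD criteria (absolute exclusion criteria, red flags, and supportive criteria).

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Data entry module: fields captured for one patient (hypothetical names)."""
    has_parkinsonism: bool
    absolute_exclusions: int
    red_flags: int
    supportive_criteria: int


def classify(record: PatientRecord) -> str:
    """Knowledge base module: a simplified summary of the decision rules."""
    if not record.has_parkinsonism or record.absolute_exclusions > 0:
        return "not PD"
    if record.red_flags == 0 and record.supportive_criteria >= 2:
        return "clinically established PD"
    # Assumed rule: at most two red flags, each offset by a supportive criterion.
    if record.red_flags <= 2 and record.supportive_criteria >= record.red_flags:
        return "clinically probable PD"
    return "not PD"


def classify_cohort(records):
    """Data analytics module: apply the rule set to all captured records."""
    return [classify(r) for r in records]


# Hypothetical cohort of three patients.
cohort = [
    PatientRecord(True, 0, 0, 2),
    PatientRecord(True, 0, 1, 1),
    PatientRecord(True, 1, 0, 3),
]
labels = classify_cohort(cohort)
```

Keeping the rules in one function separate from data capture and cohort analysis mirrors the modular design: the criteria can be revised without touching the other two modules.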


Publications


2025

Upadhyaya, D.P., Cakir, G.B., Stefano, R., Shaikh, A., Albert, J., Sahoo, S.S.*, and Ghasia, F.*, 2025. A Multi-Head Attention Deep Learning Algorithm to Detect Amblyopia using Fixation Eye Movements. Ophthalmology Science, 2025, Published by the American Academy of Ophthalmology. *Co-corresponding author.


Golnari, P., Prantzalos, K., Upadhyaya, D., Buchhalter, J. and Sahoo, S.S., 2025. Human in the Loop: Embedding Medical Expert Input in Large Language Models for Clinical Applications. MEDINFO 2025. (Accepted).


Prantzalos, K., Golnari, P., Upadhyaya, D., Thyagaraj, S., Fernandez Baca-Vaca, G., Luders, H. and Sahoo, S.S., 2025. Standardized Epilepsy Data Collection and Analysis Leveraging the Four-Dimensional Epilepsy Classification (4D-EC) Framework. MEDINFO 2025. (Accepted).



2024

Sahoo, S.S., Plasek, J.M., Xu, H., Uzuner, Ö., Cohen, T., Yetisgen, M., Liu, H., Meystre, S. and Wang, Y., 2024. Large language models for biomedicine: foundations, opportunities, challenges, and best practices. Journal of the American Medical Informatics Association, p.ocae074.


Prantzalos, K., Upadhyaya, D.P., Golnari, P., Fernandez-BacaVaca, G., Aispuro, G.P., Salehizadeh, S., Thyagaraj, S., Gurski, N. & Sahoo, S.S. 2024, Neural mosaics: detecting aberrant brain interactions using algebraic topology and generative artificial intelligence. AMIA 2024 Annual Symposium, November 9–13, San Francisco, CA, American Medical Informatics Association, (Accepted).


Upadhyaya, D.P., Tarabichi, Y., Prantzalos, K., Ayub, S., Kaelber, D.C. and Sahoo, S.S., 2024. Machine learning interpretability methods to characterize the importance of hematologic biomarkers in prognosticating patients with suspected infection. Computers in Biology and Medicine, 183, p.109251.


Upadhyaya, D.P., Prantzalos, K., Golnari, P., Shaikh, A.G., Sivagnanam, S., Majumdar, A., Ghasia, F.F. & Sahoo, S.S. 2025, Explainable artificial intelligence (XAI) in the era of large language models: applying an XAI framework in pediatric ophthalmology diagnosis using the Gemini model. AMIA 2025 Informatics Summit, March 10–13, Pittsburgh, PA, American Medical Informatics Association, (Accepted).


Sivagnanam, S., Yeu, S., Lin, K., Sakai, S., Garzon, F., Yoshimoto, K., Prantzalos, K., Upadhyaya, D.P., Majumdar, A., Sahoo, S.S. and Lytton, W.W., 2024. Towards building a trustworthy pipeline integrating Neuroscience Gateway and Open Science Chain. Database, 2024, p.baae023.


Turner MD, Golnari P, Rakib NA, Rathnam A, Appaji A, Rajasekar A, Sahoo SS, Wang Y, Wang Y, Turner JA, 2024. Benchmarks for Methods and Study Data Extraction from Human Neuroscience Publications. INCF Neuroinformatics Assembly, Austin, TX; 2024.


Sanchez, E., Upadhyaya, D.P., Cakir, G.B., Shaikh, A., Stefano, R., Sahoo, S. and Ghasia, F., 2024. Machine Learning, Artificial Intelligence and Eye Movements: Utility in Detection of Amblyopia. Investigative Ophthalmology & Visual Science, 65(7), pp.4301-4301.


Upadhyaya, D.P., Shaikh, A.G., Cakir, G.B., Prantzalos, K., Golnari, P., Ghasia, F.F. & Sahoo, S.S. 2024, A 360° view for large language models: Early detection of amblyopia in children using multi-view eye movement recordings. in J. Finkelstein, R. Moskovitch & E. Parimbelli (eds), Artificial intelligence in medicine. AIME 2024. Lecture notes in computer science, vol. 14845, Springer, Cham, pp. 165–175.


Upadhyaya, D.P., Shaikh, A., Prantzalos, K., Golnari, P., Ghasia, F.F. & Sahoo, S.S. 2024, Helios: a platform for early childhood amblyopia detection using fixation eye movements. AMIA 2024 Annual Symposium, November 9–13, San Francisco, CA, American Medical Informatics Association (Poster), (Accepted).



2023

Prantzalos, K., Upadhyaya, D., Shafiabadi, N., Fernandez-BacaVaca, G., Gurski, N., Yoshimoto, K., Sivagnanam, S., Majumdar, A. and Sahoo, S.S., 2023. MaTiLDA: an integrated machine learning and topological data analysis platform for brain network dynamics. In PACIFIC SYMPOSIUM ON BIOCOMPUTING 2024 (pp. 65-80).


Sahoo, S.S., Turner, M.D., Wang, L., Ambite, J.L., Appaji, A., Rajasekar, A., Lander, H.M., Wang, Y. and Turner, J.A., 2023. NeuroBridge ontology: computable provenance metadata to give the long tail of neuroimaging data a FAIR chance for secondary use. Frontiers in Neuroinformatics, 17, p.1216443.


Upadhyaya, D.P., Prantzalos, K., Thyagaraj, S., Shafiabadi, N., Fernandez-BacaVaca, G., Sivagnanam, S., Majumdar, A. and Sahoo, S.S., 2023. Machine Learning Interpretability Methods to Characterize Brain Network Dynamics in Epilepsy. medRxiv.


Wang, L., Ambite, J.L., Appaji, A., Bijsterbosch, J., Dockes, J., Herrick, R., Kogan, A., Lander, H., Marcus, D., Moore, S.M. and Poline, J.B., 2023. NeuroBridge: a prototype platform for discovery of the long-tail neuroimaging data. Frontiers in neuroinformatics, 17, p.1215261.



2022

Wang, X., Wang, Y., Ambite, J.L., Appaji, A., Lander, H., Moore, S.M., Rajasekar, A.K., Turner, J.A., Turner, M.D., Wang, L. and Sahoo, S.S., 2023, April. Enabling scientific reproducibility through FAIR data management: An ontology-driven deep learning approach in the NeuroBridge Project. In AMIA Annual Symposium Proceedings (Vol. 2022, p. 1135).


Spilsbury, J.C., Hernandez, E., Kiley, K., Gillerlane Hinkes, E., Prasanna, S., Shafiabadi, N., Rao, P. and Sahoo, S.S., 2022. Social Service Workers’ Use of Social Media to Obtain Client Information: Current Practices and Perspectives on a Potential Informatics Platform. Journal of social service research, 48(6), pp.739-752.


Sahoo, S.S., Kobow, K., Zhang, J., Buchhalter, J., Dayyani, M., Upadhyaya, D.P., Prantzalos, K., Bhattacharjee, M., Blumcke, I., Wiebe, S. and Lhatoo, S.D., 2022. Ontology-based feature engineering in machine learning workflows for heterogeneous epilepsy patient records. Scientific reports, 12(1), p.19430.


Turner, J.A., Turner, M.D., Appaji, A., Rajasekar, A.K., Wang, L. & Sahoo, S.S. 2022, NeuroBridge ontology development for shared neuroimaging datasets. International Neuroinformatics Coordinating Facility (INCF) Assembly, 2022.


Lander, H., Rajasekar, A., Wang, Y., Watson, M., Sahoo, S., Turner, J., Poline, J-B. & Wang, L. 2022, Linking NeuroBridge and NeuroQuery with deep semantic matching. Neuroinformatics Assembly.



2021

Prantzalos, K., Zhang, J., Shafiabadi, N., Fernandez-BacaVaca, G. and Sahoo, S.S., 2022, February. Epilepsy-Connect: An Integrated Knowledgebase for Characterizing Alterations in Consciousness State of Pharmacoresistant Epilepsy Patients. In AMIA Annual Symposium Proceedings (Vol. 2021, p. 1019).


Gupta, D.K., Marano, M., Aurora, R., Boyd, J. and Sahoo, S.S., 2020. Movement disorders ontology for clinically oriented and clinicians-driven data mining of multi-center cohorts in Parkinson’s disease. medRxiv, pp.2020-11.


Zhang, J., Bauman, R., Shafiabadi, N., Gurski, N., Fernandez-BacaVaca, G. and Sahoo, S.S., 2022, February. Characterizing brain network dynamics using persistent homology in patients with refractory epilepsy. In AMIA Annual Symposium Proceedings (Vol. 2021, p. 1244).



2020

Carr, S.J., Gershon, A., Shafiabadi, N., Lhatoo, S.D., Tatsuoka, C. and Sahoo, S.S., 2021. An integrative approach to study structural and functional network connectivity in epilepsy using imaging and signal data. Frontiers in integrative neuroscience, 14, p.491403.


Lhatoo, S.D., Bernasconi, N., Blumcke, I., Braun, K., Buchhalter, J., Denaxas, S., Galanopoulou, A., Josephson, C., Kobow, K., Lowenstein, D. and Ryvlin, P., 2020. Big data in epilepsy: clinical and research considerations. Report from the Epilepsy Big Data Task Force of the International League Against Epilepsy. Epilepsia, 61(9), pp.1869-1883.


Liu, C., Kim, M., Rueschman, M. and Sahoo, S.S., 2020. ProvCaRe: A Large-Scale Semantic Provenance Resource for Scientific Reproducibility. In Provenance in Data Science: From Data Models to Context-Aware Knowledge Graphs (pp. 59-73). Cham: Springer International Publishing.


Sahoo, S.S., Gershon, A., Nassim, S., Kaushik, G., Curtis, T., Lhatoo, S.D. and Fernandez-BacaVaca, G., 2020. NeuroIntegrative Connectivity (NIC) informatics tool for brain functional connectivity network analysis in cohort studies. In AMIA Annual Symposium Proceedings (Vol. 2020, p. 1090). American Medical Informatics Association.



2019

Sahoo, S.S., Valdez, J., Rueschman, M. and Kim, M., 2019. Semantic Provenance Graph for Reproducibility of Biomedical Research Studies: Generating and Analyzing Graph Structures from Published Literature. In MEDINFO 2019: Health and Wellbeing e-Networks for All (pp. 328-332). IOS Press.


Yang, S., Ghosh, K., Sakaie, K., Sahoo, S.S., Carr, S.J.A. and Tatsuoka, C., 2019. A simplified crossing fiber model in diffusion weighted imaging. Frontiers in neuroscience, 13, p.492.


Hong, X., Liu, C., Momotaz, H., Cassidy, K., Sajatovic, M. and Sahoo, S.S., 2020, March. Enhancing multi-center patient cohort studies in the managing epilepsy well (MEW) network: integrated data integration and statistical analysis. In AMIA Annual Symposium Proceedings (Vol. 2019, p. 1071).


Sahoo, S.S., Valdez, J., Kim, M., Rueschman, M. and Redline, S., 2019. ProvCaRe: characterizing scientific reproducibility of biomedical research studies using semantic provenance metadata. International journal of medical informatics, 121, pp.10-18.


Gershon, A., Devulapalli, P., Zonjy, B., Ghosh, K., Tatsuoka, C. and Sahoo, S.S., 2019. Computing functional brain connectivity in neurological disorders: efficient processing and retrieval of electrophysiological signal data. AMIA Summits on Translational Science Proceedings, 2019, p.107.


Socrates, V., Gershon, A.L. and Sahoo, S.S., 2019, August. Computation of Brain Functional Connectivity Network Measures in Epilepsy: A Web-Based Platform for EEG Signal Data Processing and Analysis. In MedInfo (pp. 1590-1591).



2018

Valdez, J., Kim, M., Rueschman, M., Redline, S. and Sahoo, S.S., 2018. Classification of provenance triples for scientific reproducibility: A comparative evaluation of deep learning models in the ProvCaRe project. In Provenance and Annotation of Data and Processes: 7th International Provenance and Annotation Workshop, IPAW 2018, London, UK, July 9-10, 2018, Proceedings (pp. 30-41). Springer International Publishing.


Gershon, A., Lhatoo, S.D., Tatsuoka, C., Ghosh, K., Loparo, K. and Sahoo, S.S., 2018. Scalable Signal Data Processing for Measuring Functional Connectivity in Epilepsy Neurological Disorder. In Signal Processing and Machine Learning for Biomedical Big Data (pp. 259-269). CRC Press. (Book)



2017

Sajatovic, M., Tatsuoka, C., Welter, E., Friedman, D., Spruill, T.M., Stoll, S., Sahoo, S.S., Bukach, A., Bamps, Y.A., Valdez, J. and Jobst, B.C., 2017. Correlates of quality of life among individuals with epilepsy enrolled in self-management research: from the US Centers for Disease Control and Prevention Managing Epilepsy Well Network. Epilepsy & Behavior, 69, pp.177-180.


Valdez, J., Kim, M., Rueschman, M., Socrates, V., Redline, S. and Sahoo, S.S., 2017. ProvCaRe semantic provenance knowledgebase: evaluating scientific reproducibility of research studies. In AMIA Annual Symposium Proceedings (Vol. 2017, p. 1705). American Medical Informatics Association.


Valdez, J., Rueschman, M., Kim, M., Arabyarmohammadi, S., Redline, S. and Sahoo, S.S., 2017. An extensible ontology modeling approach using post coordinated expressions for semantic provenance in biomedical research. In On the Move to Meaningful Internet Systems. OTM 2017 Conferences: Confederated International Conferences: CoopIS, C&TC, and ODBASE 2017, Rhodes, Greece, October 23-27, 2017, Proceedings, Part II (pp. 337-352). Springer International Publishing.


Gershon, A.L., Zonjy, B., Tatsuoka, C., Ghosh, K. and Sahoo, S.S., 2017. A Flexible Computational Neuroinformatics Workflow for Computing Functional Networks in Epilepsy Neurological Disorder. In AMIA.



2016

Sahoo, S.S., Wei, A., Tatsuoka, C., Ghosh, K. and Lhatoo, S.D., 2016. Processing neurology clinical data for knowledge discovery: scalable data flows using distributed computing. Machine Learning for Health Informatics: State-of-the-Art and Future Challenges, pp.303-318.


Sahoo, S.S., Wei, A., Valdez, J., Wang, L., Zonjy, B., Tatsuoka, C., Loparo, K.A. and Lhatoo, S.D., 2016. NeuroPigPen: a scalable toolkit for processing electrophysiological signal data in neuroscience applications using apache pig. Frontiers in neuroinformatics, 10, p.18.


Dean, D.A., Goldberger, A.L., Mueller, R., Kim, M., Rueschman, M., Mobley, D., Sahoo, S.S., Jayapandian, C.P., Cui, L., Morrical, M.G. and Surovec, S., 2016. Scaling up scientific discovery in sleep medicine: the national sleep research resource. Sleep, 39(5), pp.1151-1164.


Sahoo, S.S., Zhang, G.Q., Bamps, Y., Fraser, R., Stoll, S., Lhatoo, S.D., Tatsuoka, C., Sams, J., Welter, E. and Sajatovic, M., 2016. Managing information well: Toward an ontology-driven informatics platform for data sharing and secondary use in epilepsy self-management research centers. Health informatics journal, 22(3), pp.548-561.


Yang, S., Tatsuoka, C., Ghosh, K., Lacuey-Lecumberri, N., Lhatoo, S.D. and Sahoo, S.S., 2016. Comparative evaluation for brain structural connectivity approaches: towards integrative neuroinformatics tool for epilepsy clinical research. AMIA Summits on Translational Science Proceedings, 2016, p.446.


Sahoo, S.S., Ramesh, P., Welter, E., Bukach, A., Valdez, J., Tatsuoka, C., Bamps, Y., Stoll, S., Jobst, B.C. and Sajatovic, M., 2016. Insight: An ontology-based integrated database and analysis platform for epilepsy self-management research. International journal of medical informatics, 94, pp.21-30.


Valdez, J., Rueschman, M., Kim, M., Redline, S. and Sahoo, S.S., 2016. An ontology-enabled natural language processing pipeline for provenance metadata extraction from biomedical text (short paper). In On the Move to Meaningful Internet Systems: OTM 2016 Conferences: Confederated International Conferences: CoopIS, C&TC, and ODBASE 2016, Rhodes, Greece, October 24-28, 2016, Proceedings (pp. 699-708). Springer International Publishing.


Sahoo, S.S., Valdez, J. and Rueschman, M., 2016. Scientific reproducibility in biomedical research: provenance metadata ontology for semantic annotation of study description. In AMIA Annual Symposium Proceedings (Vol. 2016, p. 1070). American Medical Informatics Association.



2015

Jayapandian, C., Wei, A., Ramesh, P., Zonjy, B., Lhatoo, S.D., Loparo, K., Zhang, G.Q. and Sahoo, S.S., 2015. A scalable neuroinformatics data flow for electrophysiological signals using MapReduce. Frontiers in neuroinformatics, 9, p.4.


Ramesh, P., Wei, A., Welter, E., Bamps, Y., Stoll, S., Bukach, A., Sajatovic, M. and Sahoo, S.S., 2015, November. Insight: Semantic provenance and analysis platform for multi-center neurology healthcare research. In 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 731-736). IEEE.


LaFrance Jr, W.C., Ranieri, R., Bamps, Y., Stoll, S., Sahoo, S.S., Welter, E., Sams, J., Tatsuoka, C. and Sajatovic, M., 2015. Comparison of common data elements from the Managing Epilepsy Well (MEW) Network integrated database and a well-characterized sample with nonepileptic seizures. Epilepsy & Behavior, 45, pp.136-141.


Sahoo SS, Rueschman M, Valdez J, Hsu W, Lhatoo SD, Redline S. Provenance Analysis over Biomedical Big Data Using PROV: Towards Effective Secondary Data Analysis Across Multiple Studies. NIH Big Data to Knowledge (BD2K) Meeting, Bethesda MD. Nov 12-13, 201 (Poster)


Sahoo SS, Rao P. Provenance Analysis and RDF Query Processing: W3C PROV for Data Quality and Trust. In the 14th International Semantic Web Conference (ISWC 2015), Bethlehem, PA, 2015. (Tutorial, to appear)



2014

Zhang, G.Q., Cui, L., Lhatoo, S., Schuele, S.U. and Sahoo, S.S., 2014. MEDCIS: multi-modality epilepsy data capture and integration system. In AMIA Annual Symposium Proceedings (Vol. 2014, p. 1248). American Medical Informatics Association.


Sahoo, S.S., Tao, S., Parchman, A., Luo, Z., Cui, L., Mergler, P., Lanese, R., Barnholtz-Sloan, J.S., Meropol, N.J. and Zhang, G.Q., 2014. Trial prospector: matching patients with cancer research studies using an automated and scalable approach. Cancer informatics, 13, pp.CIN-S19454.


Sahoo SS, McIntyre C, Lhatoo SD. A Match Made in Cloud? Meeting the Requirements of the Next Generation Neuroscience Research Using Configurable Cloud Infrastructure. National Science Foundation (NSF) Cloud Workshop, Dec 11-12, 2014.


Cui, L., Sahoo, S.S., Lhatoo, S.D., Garg, G., Rai, P., Bozorgi, A. and Zhang, G.Q., 2014. Complex epilepsy phenotype extraction from narrative clinical discharge summaries. Journal of biomedical informatics, 51, pp.272-279.


Jayapandian, C., Chen, C.H., Dabir, A., Lhatoo, S., Zhang, G.Q. and Sahoo, S.S., 2014. Domain ontology as conceptual model for big data management: application in biomedical informatics. In Conceptual Modeling: 33rd International Conference, ER 2014, Atlanta, GA, USA, October 27-29, 2014. Proceedings 33 (pp. 144-157). Springer International Publishing.



2013

Lebo, T., Sahoo, S., McGuinness, D., Belhajjame, K., Cheney, J., Corsar, D., Garijo, D., Soiland-Reyes, S., Zednik, S. and Zhao, J., 2013. Prov-o: The prov ontology. W3C recommendation, 30.


Sahoo, S.S., Jayapandian, C., Garg, G., Kaffashi, F., Chung, S., Bozorgi, A., Chen, C.H., Loparo, K., Lhatoo, S.D. and Zhang, G.Q., 2014. Heart beats in the cloud: distributed analysis of electrophysiological ‘Big Data’ using cloud computing for epilepsy clinical research. Journal of the American Medical Informatics Association, 21(2), pp.263-271.


Sahoo, S.S., Lhatoo, S.D., Gupta, D.K., Cui, L., Zhao, M., Jayapandian, C., Bozorgi, A. and Zhang, G.Q., 2014. Epilepsy and seizure ontology: towards an epilepsy informatics infrastructure for clinical research and patient care. Journal of the American Medical Informatics Association, 21(1), pp.82-89.


Cui, L., Mueller, R., Sahoo, S. and Zhang, G.Q., 2013, September. Querying complex federated clinical data using ontological mapping and subsumption reasoning. In 2013 IEEE International Conference on Healthcare Informatics (pp. 351-360). IEEE.


Parchman AJ, Zhang GQ, Mergler P, Barnholtz-Sloan J, Lanese R, Miller DW, Opper C, Sahoo SS, Tao S, Teagno J, Warfe J, Meropol NJ. Trial prospector: An automated clinical trials eligibility matching program. Proceedings of the American Society of Clinical


Bozorgi, A., Chung, S., Kaffashi, F., Loparo, K.A., Sahoo, S., Zhang, G.Q., Kaiboriboon, K. and Lhatoo, S.D., 2013. Significant postictal hypotension: Expanding the spectrum of seizure‐induced autonomic dysregulation. Epilepsia, 54(9), pp.e127-e130.


Jayapandian, C.P., Chen, C.H., Bozorgi, A., Lhatoo, S.D., Zhang, G.Q. and Sahoo, S.S., 2013. Cloudwave: distributed processing of “Big Data” from electrophysiological recordings for epilepsy clinical research using Hadoop. In AMIA Annual Symposium Proceedings (Vol. 2013, p. 691). American Medical Informatics Association.


Jayapandian, C.P., Chen, C.H., Bozorgi, A., Lhatoo, S.D., Zhang, G.Q. and Sahoo, S.S., 2013. Electrophysiological signal analysis and visualization using cloudwave for epilepsy clinical research. Studies in health technology and informatics, 192, p.817.


Sahoo, S.S., Zhang, G.Q. and Lhatoo, S.D., 2013. Epilepsy informatics and an ontology‐driven infrastructure for large database research and patient care in epilepsy. Epilepsia, 54(8), pp.1335-1341.


Asiaee, A.H., Doshi, P., Minning, T., Sahoo, S., Parikh, P., Sheth, A. and Tarleton, R.L., 2013. From questions to effective answers: On the utility of knowledge-driven querying systems for life sciences data. In Data Integration in the Life Sciences: 9th International Conference, DILS 2013, Montreal, QC, Canada, July 11-12, 2013. Proceedings 9 (pp. 38-45). Springer Berlin Heidelberg.



2012

Parikh, P.P., Zheng, J., Logan-Klumpler, F., Stoeckert, C.J., Louis, C., Topalis, P., Protasio, A.V., Sheth, A.P., Carrington, M., Berriman, M. and Sahoo, S.S., 2012. The Ontology for Parasite Lifecycle (OPL): towards a consistent vocabulary of lifecycle stages in parasitic organisms. Journal of biomedical semantics, 3(1), pp.1-13.


Zhang, G.Q., Luo, L., Ogbuji, C., Joslyn, C., Mejino, J. and Sahoo, S.S., 2012. An analysis of multi-type relational interactions in FMA using graph motifs with disjointness constraints. In AMIA Annual Symposium Proceedings (Vol. 2012, p. 1060). American Medical Informatics Association.


Cui, L., Bozorgi, A., Lhatoo, S.D., Zhang, G.Q. and Sahoo, S.S., 2012. EpiDEA: extracting structured epilepsy and seizure information from patient discharge summaries for cohort identification. In AMIA Annual Symposium Proceedings (Vol. 2012, p. 1191). American Medical Informatics Association.


Parikh, P.P., Minning, T.A., Nguyen, V., Lalithsena, S., Asiaee, A.H., Sahoo, S.S., Doshi, P., Tarleton, R. and Sheth, A.P., 2012. A semantic problem solving environment for integrative parasite research: Identification of intervention targets for Trypanosoma cruzi. PLoS neglected tropical diseases, 6(1), p.e1458.


Sahoo, S.S., Zhao, M., Luo, L., Bozorgi, A., Gupta, D., Lhatoo, S.D. and Zhang, G.Q., 2012. OPIC: ontology-driven patient information capturing system for epilepsy. In AMIA Annual Symposium Proceedings (Vol. 2012, p. 799). American Medical Informatics Association.


Jayapandian, C.P., Zhao, M., Ewing, R.M., Zhang, G.Q. and Sahoo, S.S., 2012. A semantic proteomics dashboard (SemPoD) for data management in translational research. BMC systems biology, 6, pp.1-13.


Zhang, G.Q., Sahoo, S.S. and Lhatoo, S.D., 2012. From classification to epilepsy ontology and informatics. Epilepsia, 53, pp.28-32.


Teagno, J., Kiefer, R.C., Pathak, J., Zhang, G.Q. and Sahoo, S.S., 2012. A Distributed Semantic Web Approach for Cohort Identification. In AMIA.



2011

Sahoo, S.S., Ogbuji, C., Luo, L., Dong, X., Cui, L., Redline, S.S. and Zhang, G.Q., 2011. Midas: automatic extraction of a common domain of discourse in sleep medicine for multi-center data integration. In AMIA Annual Symposium Proceedings (Vol. 2011, p. 1196). American Medical Informatics Association.


Zhang GQ, Mueller R, Jonhson N, Arabandi S, Sahoo SS, Redline S. Online Exploration of Case-control Study Designs in VISAGE. AMIA Clinical Research Informatics Summit (CRI), 2011.


Sahoo, S.S., Nguyen, V., Bodenreider, O., Parikh, P., Minning, T. and Sheth, A.P., 2011. A unified framework for managing provenance information in translational research. BMC bioinformatics, 12(1), pp.1-18.


Zhao, J., Sahoo, S.S., Missier, P., Sheth, A. and Goble, C., 2010. Extending semantic provenance into the web of data. IEEE Internet Computing, 15(1), pp.40-48.


Sahoo, S.S., 2011. Towards Desiderata for Provenance Ontologies in Biomedicine. In ICBO.


Mueller R, Sahoo SS, Dong X, Redline S, Arabandi S, Luo L, Zhang GQ. Mapping multi-institution data sources to domain ontology for data federation: the PhysioMIMI approach. AMIA Clinical Research Informatics Summit (CRI), 2011.



2010

Sahoo SS, Bodenreider O, Hitzler P, Sheth AP, Thirunarayan K. Provenance Context Entity (PaCE): Scalable provenance tracking for scientific RDF data. The 22nd International Conference on Scientific and Statistical Database Management (SSDBM), 2010. pp. 46


Barga R, Simmhan Y, Chinthaka-Withana E, Sahoo SS, Jackson J, Araujo N. Provenance for Scientific Workflows Towards Reproducible Research. IEEE Data Engineering Bulletin, 2010. Vol. 33(3). pp. 50-58.


Missier P, Sahoo SS, Zhao J, Goble C, Sheth A. Janus: from workflows to semantic provenance and linked open data. The 3rd International Provenance and Annotation Workshop (IPAW), Lecture Notes in Computer Science, Vol. 6378/2010, 2010. pp. 129-141.


Deus H, Zhao J, Sahoo SS, Samwald M, Prud’hommeaux E, Miller M, Marshall MS, Cheung K. Provenance of Microarray Experiments for a Better Understanding of Experiment Results. The 2nd International Workshop on Role of Semantic Web in Provenance Management (


Patni H, Sahoo SS, Henson C, Sheth A. Provenance Aware Linked Sensor Data. The 2nd International Workshop on Trust and Privacy on the Social and Semantic Web, co-located with ESWC, 2010.


Sahoo SS, Groth P, Hartig O, Miles S, Coppens S, Myers J, Gil Y, Moreau L, Zhao J, Panzer M, Garijo D. Provenance Vocabulary Mappings. W3C Provenance Incubator Group Report, 2010.



2009

Sahoo SS, Weatherly DB, Mutharaju R, Anantharam P, Sheth AP, Tarleton RL. Ontology-driven Provenance Management in eScience: an Application in Parasite Research. The 8th International Conference on Ontologies, DataBases, and Applications of Semantics, (OD


Sahoo SS, Sheth A. Provenir ontology: Towards a Framework for eScience Provenance Management. Microsoft eScience Workshop, 2009. http://cci.case.edu/cci/images/7/7a/Framework_for_eScience_Provenance_Management_CR.pdf


Sahoo SS, Halb W, Hellmann S, Idehen K, Thibodeau Jr. T, Auer S, Sequeda J, Ezzat A. A Survey of Current Approaches for Mapping of Relational Databases to RDF. W3C RDB2RDF Incubator Group Report, 2009. http://cci.case.edu/cci/images/0/04/RDB2RDF_SurveyRep



2008

Valerio MD, Sahoo SS, Barga RS, Jackson JJ. Capturing Workflow Event Data for Monitoring, Performance Analysis, and Management of Scientific Workflows. SWBES08, co-located with the 4th IEEE International Conference on eScience, 2008. pp. 626-33. http://cc


Sahoo SS, Sheth AP, Henson C. Semantic Provenance for eScience: ‘Meaningful’ Metadata to Manage the Deluge of Scientific Data. IEEE Internet Computing, Web-Scale Workflow Track, M.B. Blake and M. Huhns (Eds.), 2008. Vol. 12(4). pp.46-54. (Featured in Asso


Sahoo SS, Bodenreider O, Rutter JL, Skinner KJ, Sheth AP. An ontology-driven semantic mash-up of gene and biological pathway information: Application to the domain of nicotine dependence. Journal of Biomedical Informatics (Special Issue: Semantic Mashup o


Sheth A, Henson C, Sahoo SS. Semantic Sensor Web. IEEE Internet Computing, 2008. Vol. 12(4). pp. 78-83. http://cci.case.edu/cci/images/2/2a/SHS08-IC-Column-SSW.pdf



2007

Sahoo, S.S., Zeng, K., Bodenreider, O. and Sheth, A., 2007. From “glycosyltransferase” to “congenital muscular dystrophy”: Integrating knowledge from NCBI Entrez Gene and the Gene Ontology. Studies in health technology and informatics, 129(Pt 2), p.1260.


Sahoo, S.S., 2007. Integrating gene and pathway information about nicotine dependence. Communications (NLM/NIH).


Sahoo, S.S., Sheth, A., Hunter, B. and York, W.S., 2007. SemBOWSER-Semantic Biological Web Services Registry. Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences, pp.317-340.


Sahoo, S.S., Bodenreider, O., Zeng, K. and Sheth, A.P., 2007. An experiment in integrating large biomedical knowledge resources with RDF: Application to associating genotype and phenotype information.



2006

Sahoo, S.S., Thomas, C., Sheth, A., York, W.S. and Tartir, S., 2006, May. Knowledge modeling and its application in life sciences: a tale of two ontologies. In Proceedings of the 15th international conference on World Wide Web (pp. 317-326).


Sahoo, S.S. and Sheth, A., 2006. Bioinformatics applications of Web Services, Web Processes and role of Semantics. In Semantic Web Services, Processes and Applications (pp. 305-322). Boston, MA: Springer US.


Sahoo, S.S., Converting biological information to the W3C Resource Description Framework (RDF): Experience with Entrez Gene Report Lister Hill National Center for Biomedical.


Sahoo, S.S., Bodenreider, O., Zeng, K. and Sheth, A.P., 2006. Adapting resources to the semantic web: experience with Entrez Gene. In Workshop on semantic web health care & life sciences at ISWC.



2005

Boanerges, A.M., Christian, H.W., Satya, S.S., Amit, S.I. and Budak, A., 2005. Template based semantic similarity for security applications. Technical Report, LSDIS Lab, Computer Science Department.


Sahoo, S.S., Sheth, A.P., York, W.S. and Miller, J.A., 2005, May. Semantic Web Services for N-glycosylation Process. In International Symposium on Web Services for Computational Biology and Bioinformatics, VBI, Blacksburg, VA.


Sahoo, S.S., Thomas, C., Sheth, A., Henson, C. and York, W.S., 2005. GLYDE—an expressive XML standard for the representation of glycan structure. Carbohydrate research, 340(18), pp.2802-2807.


Alvarez-Manilla, G., Atwood III, J., Guo, Y., Warren, N.L., Orlando, R. and Pierce, M., 2006. Tools for glycoproteomic analysis: size exclusion chromatography facilitates identification of tryptic glycopeptides with N-linked glycosylation sites. Journal of proteome research, 5(3), pp.701-708.


Atwood III, J.A., Sahoo, S.S., Alvarez‐Manilla, G., Weatherly, D.B., Kolli, K., Orlando, R. and York, W.S., 2005. Simple modification of a protein database for mass spectral identification of N‐linked glycopeptides. Rapid Communications in Mass Spectrometry: An International Journal Devoted to the Rapid Dissemination of Up‐to‐the‐Minute Research in Mass Spectrometry, 19(21), pp.3002-3006.



2004

Sheth, A.P., York, W.S., Thomas, C., Nagarajan, M., Miller, J.A., Kochut, K., Sahoo, S.S. and Yi, X., 2004. Semantic Web technology in support of Bioinformatics for Glycan Expression.


York, W.S., Sheth, A., Kochut, K., Miller, J.A., Thomas, C., Gomadam, K., Yi, X., Sahoo, S. and Nagarajan, M., 2004. Semantic Integration of Glycomics Data and Information. In Proceedings of the Human Disease Glycomics/Proteome Initiative 1st Workshop (pp. 1-1).



Education


Biomedical & Health Informatics Doctoral Program

The Biomedical & Health Informatics (BHI) doctoral program trains researchers in biomedicine, population health, and clinical care. Program trainees will acquire a core set of skills spanning computing, biostatistics, and biomedical research through a combination of course work and participation in research in the Population and Quantitative Health Sciences (PQHS) department. The doctoral program is designed for students to acquire skills in three areas of concentration: Data Analytics, with a focus on statistics and data wrangling; Biomedical Health, with a focus on systems biology, clinical, and health issues; and Computational and System Design, with a focus on knowledge representation, information retrieval, and Big Data.

PQHS 416: Introduction to Computing in Biomedical Health Informatics

“PQHS 416 introduces students to computational techniques and concepts that underpin biomedical and health informatics data management and analysis. In particular, the course will focus on three topics: (1) biomedical terminologies and formal logic used in building knowledge models such as ontologies; (2) natural language processing (NLP); and (3) Big Data technologies, including components of the Hadoop stack and Apache Spark. This is a lecture-based course that relies on both materials covered in class and out-of-class readings of published literature. Students will be assigned readings and homework exercises, and they are expected to complete a homework assignment for each class. Students will also take part in a team project and prepare a project report at the end of the semester.”

Our Team


Pedram Golnari, MD

PhD Student

Nasim Shafiabadi, MD

Research Fellow

Satya Sahoo, PhD

Professor of Medical Informatics

Leonora Lipson

Undergraduate Researcher

Dipak Upadhyaya, MPH

PhD Candidate

Katrina Prantzalos, MS

PhD Candidate

Manu Bulusu

Undergraduate Researcher

Deep Desai

Undergraduate Researcher


Our Alumni


1. Jianzhe Zhang

MS (First employer: ByteDance)


2. Arthur Gershon

PhD (Status: Post-Doctoral Scholar)


3. Catherine Jayapandian

PhD (Status: Post-Doctoral Scholar)


4. Priya Ramesh

MS (First employer: CoverMyMeds)


5. Xinting Hong

MS


6. Pramith Devulapalli

BS (Status: PhD at Purdue University)


7. Vimig Socrates

BS, MS (Status: PhD at Yale University)


8. Meng Zhao

MS (First employer: IBM Explorys)


9. Li Wang

MS


10. Chien-Hung Chen

BS


11. Chang Liu

MS (First employer: Microsoft Corporation)


12. Annan Wei

MS (First employer: Google Inc)


13. Srikeerthi Sevugan

BS (Undergraduate Researcher)


14. Pranav Nampoothiripad

BS (Undergraduate Researcher)


Funding Agencies


National Institute of Biomedical Imaging and Bioengineering

National Institute on Drug Abuse

Department of Defense, Congressionally Directed Medical Research Programs

Dravet Syndrome Foundation

U.S. Department of Veterans Affairs

National Cancer Institute



© 2023 Case Western Reserve University
10900 Euclid Ave. Cleveland, Ohio 44106 216.368.2000
Department of Population and Quantitative Health Sciences
Phone Number: 216-368-3286
Mailing Address: 2103 Cornell Road, Iris S. & Bert L. Wolstein Research Building, Cleveland, OH 44106-7291