Our research focuses on developing artificial intelligence (AI) methods to analyze heterogeneous biomedical big data for translational applications. This ongoing work brings together two branches of AI, knowledge representation and reasoning and machine learning, to characterize brain network dynamics and electronic health record (EHR) data.
Knowledge representation and reasoning involves the development of knowledge models, or ontologies. We have led the development of new methods that apply ontology engineering principles across multiple stages of machine learning workflows, including feature engineering and model validation. This work includes the development of deep neural network (DNN) models and the use of classical machine learning algorithms such as support vector machines (SVM) for integrative analysis of multi-modal brain connectivity data in neurological disorders such as epilepsy and Parkinson's disease. To address the challenges of data quality and scientific reproducibility, we have led the development of a provenance metadata framework called ProvCaRe using ontology engineering and natural language processing techniques.
Epilepsy seizure networks; Structural connectivity networks derived from MRI; Functional connectivity networks derived from EEG; Provenance metadata; Ontology engineering; Data integration; High performance computing
We do not currently have any job openings, but as soon as we do, we will post the open position here on our lab website.
Upadhyaya DP, Prantzalos K, Thyagaraj S, Shafiabadi N, Fernandez-BacaVaca G, Sivagnanam S, Majumdar A, Sahoo SS. Machine Learning Interpretability Methods to Characterize Brain Network Dynamics in Epilepsy. medRxiv 2023.06.25.23291874; doi: https://doi.org/10.1101/2023.06.25.23291874, 2023.
Upadhyaya DP, Tarabichi Y, Prantzalos K, Ayub S, Kaelber DC, Sahoo SS. Characterizing the Importance of Hematologic Biomarkers in Screening for Severe Sepsis using Machine Learning Interpretability Methods. medRxiv 2023.05.30.23290757; doi: https://doi.org/10.1101/2023.05.30.23290757, 2023.
Wang L, Ambite JL, Appaji AM, Bijsterbosch J, Dockès J, Herrick R, Kogan A, Lander HM, Lenzini P, Marcus D, Moore SM, Poline J-B, Rajasekar A, Sahoo SS, Turner MD, Wang X, Wang Y, Turner JA. NeuroBridge: A Prototype Platform for Discovery of The Long-Tail Neuroimaging Data. Frontiers in Neuroinformatics (accepted), 2023.
Prantzalos K, Upadhyaya DP, Shafiabadi N, Gurski N, Fernandez-BacaVaca G, Yoshimoto K, Sivagnanam S, Majumdar A, Sahoo SS. MaTiLDA: An Integrated Machine Learning and Topological Data Analysis Platform for Brain Network Dynamics. Pacific Symposium on Biocomputing (accepted), 2023.
Sahoo SS, Turner MD, Wang L, Ambite JL, Appaji AM, Rajasekar A, Lander HM, Wang Y, Turner JA. NeuroBridge ontology: Computable provenance metadata to give the long tail of neuroimaging data a FAIR chance for secondary use. Frontiers in Neuroinformatics, 2023.
Turner JA, Turner MD, Appaji A, Rajasekar AK, Wang L, Sahoo SS. NeuroBridge ontology development for shared neuroimaging datasets. International Neuroinformatics Coordinating Facility (INCF) Assembly, 2022 (abstract), 2022.
Lander H, Rajasekar AK, Wang Y, Watson M, Sahoo SS, Turner J, Poline J-B, Wang L. Linking NeuroBridge and NeuroQuery with Deep Semantic Matching. International Neuroinformatics Coordinating Facility (INCF) Assembly, 2022 (poster), 2022.
Sahoo SS, Kobow K, Zhang J, Buchhalter J, Dayyani M, Upadhyaya DP, Prantzalos K, Bhattacharjee M, Blumcke I, Wiebe S, Lhatoo SD. Ontology-based feature engineering in machine learning workflows for heterogeneous epilepsy patient records. Scientific Reports, 2022.
Wang X, Wang Y, Ambite J-L, Appaji A, Lander H, Moore S, Rajasekar AK, Turner JA, Turner MD, Wang L, Sahoo SS. Enabling Scientific Reproducibility through FAIR Data Management: An ontology-driven deep learning approach in the NeuroBridge Project. AMIA Annual Symposium Proceedings, 2022.
Spilsbury JC, Hernandez E, Kiley K, Gillerlane EH, Prasanna S, Shafiabadi N, Rao P, Sahoo SS. Social service workers’ use of social media to obtain client information: Current practices and perspectives on a potential informatics platform. Journal of Social Service Research (accepted), 2022.
Gupta DK, Prantzalos K, Hiller AL, Lobb BM, Chan K, Boyd J, Sahoo SS. Ontology-based, Real-time, Machine learning Informatics System for Parkinson Disease (ORMIS-PD). International Congress of Parkinson’s Disease and Movement Disorders 2022 (poster), 2022.
Gupta DK, Marano M, Aurora R, Boyd J, Sahoo SS. Movement disorders ontology for clinically-oriented and clinicians-driven data mining of multi-center cohorts in Parkinson's disease (Poster). AMIA Annual Symposium Proceedings, 2021.
Prantzalos K, Zhang J, Shafiabadi N, Fernandez-BacaVaca G, Sahoo SS. Epilepsy-Connect: An Integrated Knowledgebase for Characterizing Alterations in Consciousness State of Pharmacoresistant Epilepsy Patients. AMIA Annual Symposium Proceedings, 2021.
Zhang J, Bauman R, Shafiabadi N, Gurski N, Fernandez-BacaVaca G, Sahoo SS. Characterizing Brain Network Dynamics using Persistent Homology in Patients with Refractory Epilepsy. AMIA Annual Symposium Proceedings, 2021.
Lhatoo SD, Bernasconi N, Blumcke I, Braun K, Buchhalter J, Denaxas S, Galanopoulou A, Josephson C, Kobow K, Lowenstein D, Ryvlin P, Schulze-Bonhage A, Sahoo SS, Thom M, Thurman D, Worrell G, Zhang GQ, Wiebe S. Big Data in Epilepsy: Clinical and Research Considerations. Report from the Epilepsy Big Data Task Force of the International League Against Epilepsy, Epilepsia, 2020.
Carr SJ, Gershon A, Shafiabadi N, Lhatoo SD, Tatsuoka C, Sahoo SS. An Integrative Approach to Study Structural and Functional Network Connectivity in Epilepsy using Imaging and Signal Data. Frontiers in Integrative Neuroscience, 2020.
Liu C, Kim M, Rueschman M, Sahoo SS. ProvCaRe: A Large-Scale Semantic Provenance Resource for Scientific Reproducibility, in Knowledge Graphs and RDF Data Provenance: AI Actions with Machine-Interpretable Data. Springer book series on Advanced Information & Knowledge Processing, 2020.
Sahoo SS, Gershon A, Shafiabadi N, Ghosh K, Tatsuoka C, Lhatoo SD, Fernandez-BacaVaca G. NeuroIntegrative Connectivity (NIC) Informatics Tool for Brain Functional Connectivity Network Analysis in Cohort Studies. AMIA Annual Symposium Proceedings 2020, 2020.
Hong X, Liu C, Momotaz H, Cassidy K, Sajatovic M, Sahoo SS. Enhancing Multi-Center Patient Cohort Studies in the Managing Epilepsy Well (MEW) Network: Integrated Data Integration and Statistical Analysis. AMIA Annual Symposium Proceedings, 2019.
Sahoo SS, Valdez J, Rueschman M, Kim M. Semantic Provenance Graph for Reproducibility of Biomedical Research Studies: Generating and Analyzing Graph Structures from Published Literature. International Medical Informatics Association (IMIA), MedInfo, 2019.
Sahoo SS, Valdez J, Kim M, Rueschman M, Redline S. ProvCaRe: Characterizing scientific reproducibility of biomedical research studies using semantic provenance metadata. International Journal of Medical Informatics, 2019.
Gershon A, Devulapalli P, Zonjy B, Ghosh K, Tatsuoka C, Sahoo SS. Computing Functional Brain Connectivity in Neurological Disorders: Efficient Processing and Retrieval of Electrophysiological Signal Data. AMIA Joint Summits 2019, 2019.
Yang S, Ghosh K, Sakaie K, Sahoo SS, Carr S, Tatsuoka C. A Simplified Crossing Fiber Model in Diffusion Weighted Imaging. Frontiers in Neuroscience, 2019.
Socrates V, Gershon A, Sahoo SS. Computation of Brain Functional Connectivity Network Measures in Epilepsy: A Web-based Platform for EEG Signal Data Processing and Analysis (Poster). International Medical Informatics Association (IMIA), MedInfo, 2019.
Gershon AL, Lhatoo SD, Tatsuoka C, Ghosh K, Loparo K, Sahoo SS. Scalable Signal Data Processing for Measuring Functional Connectivity in Epilepsy Neurological Disorder. Biomedical Signal Processing in Big Data, Ervin Sejdic, Tiago Falk (Eds), 2018.
Valdez J, Kim M, Rueschman M, Redline S, Sahoo SS. Classification of Provenance Triples for Scientific Reproducibility: A Comparative Evaluation of Deep Learning Models in the ProvCaRe Project. International Provenance Annotation Workshop (IPAW) 2018 Proceedings, Springer, 2018.
Valdez J, Kim M, Rueschman M, Socrates V, Redline S, Sahoo SS. ProvCaRe Semantic Provenance Knowledgebase: Evaluating Scientific Reproducibility of Research Studies (Finalist for Distinguished Paper Award). American Medical Informatics Association (AMIA) Annual Symposium, 2017.
Sajatovic M, Tatsuoka C, Welter E, Friedman D, Spruill TM, Stoll S, Sahoo SS, Bukach A, Bamps YA, Valdez J, Jobst BC. Correlates of quality of life among individuals with epilepsy enrolled in self-management research: From the US Centers for Disease Control and Prevention Managing Epilepsy Well Network. Epilepsy & Behavior, 2017.
Gershon AL, Zonjy B, Tatsuoka C, Ghosh K, Lhatoo SD, Sahoo SS. A Flexible Computational Neuroinformatics Workflow for Computing Functional Networks in Epilepsy Neurological Disorder (Abstract). American Medical Informatics Association (AMIA) Annual Symposium, Washington DC, 2017.
Valdez J, Rueschman M, Kim M, Arabyarmohammadi S, Redline S, Sahoo SS. An Extensible Ontology Modeling Approach Using Post Coordinated Expressions for Semantic Provenance in Biomedical Research. The 16th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE), Rhodes, Greece, 2017.
Gershon AL, Lhatoo SD, Tatsuoka C, Ghosh K, Loparo K, Sahoo SS. Scalable Signal Data Processing for Measuring Functional Connectivity in Epilepsy Neurological Disorder. Biomedical Signal Processing in Big Data, Ervin Sejdic, Tiago Falk (Eds), 2017.
Sahoo SS, Valdez J, Rueschman M. Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description. American Medical Informatics Association (AMIA) Annual Symposium, 2016.
Valdez J, Rueschman M, Kim M, Redline S, Sahoo SS. An Ontology-Enabled Natural Language Processing Pipeline for Provenance Metadata Extraction from Biomedical Text. 15th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE), 2016.
Sahoo SS, Ramesh P, Welter E, Bukach A, Valdez J, Tatsuoka C, Bamps Y, Stoll S, Jobst BC, Sajatovic M. Insight: An Ontology-based Integrated Database and Analysis Platform for Epilepsy Self-Management Research. International Journal of Medical Informatics, 2016.
Sahoo SS, Wei A, Valdez J, Wang L, Zonjy B, Tatsuoka C, Loparo KA, Lhatoo SD. NeuroPigPen: a Scalable Toolkit for Processing Electrophysiological Signal Data in Neuroscience Applications using Apache Pig. Frontiers in Neuroinformatics, 2016.
Sahoo SS, Wei A, Tatsuoka C, Ghosh K, Lhatoo SD. Processing Neurology Clinical Data for Knowledge Discovery: Scalable Data Flows Using Distributed Computing (Book Chapter), 2016.
Dean DA, Goldberger AL, Mueller R, Kim M, Rueschman M, Mobley D, Sahoo SS, Jayapandian C, Cui L, Morrical MG, Surovec S, Zhang GQ, Redline S. Scaling up Scientific Discovery in Sleep Medicine. The National Sleep Research Resource, 2016.
LaFrance Jr. WC, Ranieri R, Bamps Y, Stoll S, Sahoo SS, Welter E, Sams J, Tatsuoka C, Sajatovic M. Comparison of common data elements from the Managing Epilepsy Well (MEW) Network integrated database and a well-characterized sample with nonepileptic seizures. Epilepsy & Behavior, 2015.
Yang S, Tatsuoka C, Ghosh K, Lacuey-Lecumberri N, Lhatoo SD, Sahoo SS. Comparative Evaluation for Brain Structural Connectivity Approaches: Towards Integrative Neuroinformatics Tool for Epilepsy Clinical Research (Nominated for the Best Student Paper Award). AMIA 2016 Joint Summits on Translational Science, 2015.
Ramesh P, Wei A, Sams J, Welter E, Lhatoo S, Sajatovic M, Sahoo SS. Insight: Semantic Provenance and Analysis Platform for Multi-center Neurology Healthcare Research. Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2015.
Sahoo SS, Rao P. Provenance Analysis and RDF Query Processing: W3C PROV for Data Quality and Trust. In the 14th International Semantic Web Conference (ISWC 2015), Bethlehem, PA, 2015.
Jayapandian C, Wei A, Ramesh P, Zonjy B, Lhatoo SD, Loparo K, Zhang GQ, Sahoo SS. A Scalable Neuroinformatics Data Flow for Electrophysiological Signals using MapReduce. Frontiers in Neuroinformatics, 2015.
Sahoo SS, Zhang GQ, Bamps Y, Fraser R, Stoll S, Lhatoo SD, Tatsuoka C, Welter E, Sajatovic M. Managing Information Well: Toward an Ontology-driven Informatics Platform for Data Sharing and Secondary Use in Epilepsy Self-Management Research Centers. Health Informatics Journal, 2015.
Sahoo SS, Rueschman M, Valdez J, Hsu W, Lhatoo SD, Redline S. Provenance Analysis over Biomedical Big Data Using PROV: Towards Effective Secondary Data Analysis Across Multiple Studies (Poster). NIH Big Data to Knowledge (BD2K) Meeting, Bethesda MD, 2015.
Cui L, Sahoo SS, Lhatoo SD, Garg G, Rai P, Bozorgi A, Zhang GQ. Complex Epilepsy Phenotype Extraction from Narrative Clinical Discharge Summaries. Journal of Biomedical Informatics, 2014.
Jayapandian CP, Chen CH, Dabir A, Zhang GQ, Lhatoo SD, Sahoo SS. Domain Ontology As Conceptual Model for Big Data Management: Application in Biomedical Informatics. Proceedings of the 33rd International Conference on Conceptual Modeling (ER 2014), 2014.
Zhang GQ, Cui L, Lhatoo SD, Schuele SU, Sahoo SS. MEDCIS: Multi-Modality Epilepsy Data Capture and Integration System. American Medical Informatics Association (AMIA) Annual Symposium, 2014.
Sahoo SS, Tao S, Parchman A, Luo Z, Cui L, Mergler P, Lanese R, Barnholtz-Sloan JS, Meropol NJ, Zhang GQ. Trial Prospector: Matching Patients with Cancer Research Studies using an Automated and Scalable Approach. Journal of Cancer Informatics, 2014.
Sahoo SS, McIntyre C, Lhatoo SD. A Match Made in Cloud? Meeting the Requirements of the Next Generation Neuroscience Research Using Configurable Cloud Infrastructure. National Science Foundation (NSF) Cloud Workshop, 2014.
Sahoo SS, Jayapandian C, Garg G, Kaffashi F, Chung S, Bozorgi A, Chen CH, Loparo K, Lhatoo SD, Zhang GQ. Heartbeats in the Cloud: Distributed Analysis of Electrophysiological “Big Data” using Cloud Computing for Epilepsy Clinical Research. Journal of the American Medical Informatics Association (JAMIA), special issue on Big Data in Healthcare and Biomedical Research, 2013.
Parchman AJ, Zhang GQ, Mergler P, Barnholtz-Sloan J, Lanese R, Miller DW, Opper C, Sahoo SS, Tao S, Teagno J, Warfe J, Meropol NJ. Trial prospector: An automated clinical trials eligibility matching program. Proceedings of the American Society of Clinical Oncology (ASCO) Annual Meeting, 2013.
Asiaee AH, Doshi P, Minning T, Sahoo SS, Parikh P, Sheth A, Tarleton RL. From Questions to Effective Answers: On the Utility of Knowledge-Driven Querying Systems for Life Sciences Data. The 9th International Conference on Data Integration in the Life Sciences (DILS), 2013.
Jayapandian CP, Chen CH, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. Electrophysiological Signal Analysis and Visualization using Cloudwave for Epilepsy Clinical Research. The 14th World Congress on Medical and Health Informatics (MedInfo), Stud Health Technol Inform, 2013.
Sahoo SS, Zhang GQ, Lhatoo SD. Epilepsy Informatics and an Ontology-driven Infrastructure for Large Database Research and Patient Care in Epilepsy. Review Paper, Epilepsia, 2013.
Lebo T, Sahoo SS, McGuinness D. (eds.). PROV-O: The PROV Ontology. W3C Recommendation, 2013.
Cui L, Mueller R, Sahoo SS, Zhang GQ. Querying Complex Federated Clinical Data Using Ontological Mapping and Subsumption Reasoning. IEEE International Conference on Healthcare Informatics 2013 (ICHI 2013), 2013.
Bozorgi A, Chung S, Kaffashi F, Loparo KA, Sahoo SS, Zhang GQ, Kaiboriboon K, Lhatoo SD. Significant postictal hypotension: expanding the spectrum of seizure-induced autonomic dysregulation. Epilepsia, 2013.
Sahoo SS, Lhatoo SD, Gupta DK, Cui L, Zhao M, Jayapandian C, Bozorgi A, Zhang GQ. Epilepsy and Seizure Ontology: Towards an Epilepsy Informatics Infrastructure for Clinical Research and Patient Care. Journal of the American Medical Informatics Association (JAMIA), 2013.
Jayapandian CP, Chen CH, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. Cloudwave: Distributed Processing of “Big Data” from Electrophysiological Recordings for Epilepsy Clinical Research Using Hadoop. American Medical Informatics Association (AMIA) Annual Symposium, 2013.
Jayapandian C, Ewing R, Zhang GQ, Sahoo SS. A Semantic Proteomics Dashboard (SemPoD) for Proteomics Data Management in Translational Research. AMIA Clinical Research Informatics Summit (CRI), 2012.
Jayapandian C, Zhao M, Ewing R, Zhang GQ, Sahoo SS. A semantic proteomics dashboard (SemPoD) for data management in translational research. BMC Systems Biology, 2012.
Parikh PP, Zheng J, Logan-Klumper F, Stoeckert Jr. CJ, Louis C, Topalis P, Protasio AV, Sheth AP, Carrington M, Berriman M, Sahoo SS. The Ontology for Parasite Lifecycle (OPL): Towards a Consistent Vocabulary of Lifecycle Stages in Parasitic Organisms. Journal of Biomedical Semantics (JBMS), 2012.
Zhang GQ, Sahoo SS, Lhatoo SD. From Classification to Epilepsy Ontology and Informatics. Epilepsia, 2012.
Parikh PP, Minning TA, Nguyen V, Lalithsena S, Asiaee AH, Sahoo SS, Doshi P, Tarleton R, Sheth AP. A Semantic Problem Solving Environment for Integrative Parasite Research: Identification of Intervention Targets for Trypanosoma cruzi. PLoS Neglected Tropical Diseases, 2012.
Sahoo SS, Zhao M, Luo L, Bozorgi A, Gupta D, Lhatoo SD, Zhang GQ. OPIC: Ontology-driven Patient Information Capturing System for Epilepsy. Proceedings of the American Medical Informatics Association (AMIA) Annual Symposium, 2012.
Cui L, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. EpiDEA: Extracting Structured Epilepsy and Seizure Information from Patient Discharge Summaries for Cohort Identification. American Medical Informatics Association (AMIA) Annual Symposium, 2012.
Zhang GQ, Luo L, Ogbuji C, Joslyn C, Mejino J, Sahoo SS. An Analysis of Multi-type Relational Interactions in FMA Using Graph Motifs. American Medical Informatics Association (AMIA) Annual Symposium, 2012.
Teagno J, Kiefer RC, Pathak J, Zhang GQ, Sahoo SS. A Distributed Semantic Web Approach for Cohort Identification. Proceedings of the American Medical Informatics Association (AMIA) Annual Symposium, 2012.
Biomedical & Health Informatics Doctoral Program | PQHS 416: Introduction to Computing in Biomedical Health Informatics |
---|---|
The Biomedical & Health Informatics (BHI) doctoral program trains researchers in biomedicine, population health, and clinical care. Program trainees acquire a core set of skills spanning computing, biostatistics, and biomedical research through a combination of course work and participation in research in the Population and Quantitative Health Sciences (PQHS) department. The doctoral program is designed for students to acquire skills in three areas of concentration: Data Analytics, with a focus on statistics and data wrangling; Biomedical Health, with a focus on systems biology, clinical, and health issues; and Computational and System Design, with a focus on knowledge representation, information retrieval, and Big Data. | “PQHS 416 introduces students to computational techniques and concepts that underpin biomedical and health informatics data management and analysis. In particular, the course will focus on the three topics of: (1) Biomedical terminologies and formal logic used in building knowledge models such as ontologies; (2) Natural language processing (NLP), and (3) Big Data technologies, including components of Hadoop stack and Apache Spark. This is a lecture-based course that relies on both materials covered in class and out-of-class readings of published literature. Students will be assigned reading assignments, homework exercise assignments and they are expected to complete homework assignment for each class. The students will be involved in a team project and they will be expected to prepare a project report at the end of the semester.” |
Research Fellow
Undergraduate Researcher
Undergraduate Researcher
MS (First employer: ByteDance)
PhD (Status: Post-Doctoral Scholar)
PhD (Status: Post-Doctoral Scholar)
MS (First employer: CoverMyMeds)
MS
BS (Status: PhD at Purdue University)
BS, MS (Status: PhD at Yale University)
MS (First employer: IBM Explorys)
MS
BS
MS (First employer: Microsoft Corporation)
MS (First employer: Google Inc)