Drug-drug interactions (DDIs) are a significant source of morbidity and adverse drug events (ADEs), particularly in situations of polypharmacy and complex medication regimens. While rules-based software integrated in electronic health records (EHRs) has demonstrated proficiency in identifying DDIs present in medication regimens, large language model (LLM) based identification requires thorough benchmarking and performance evaluation using high-quality datasets for safe use. The purpose of this study was to develop a series of benchmarking experiments for LLM performance in identification and management of DDIs using a purpose-built, clinician-annotated dataset of clinically relevant DDIs.
The use of multiple medications increases the risk of harmful drug-drug interactions (DDIs). Conventional DDI screening databases vary in coverage and often trigger low-relevance alerts, contributing to alert fatigue. Large language models (LLMs) have emerged as potential tools for DDI identification; however, their performance compared to established databases using real-world patient data remains under-explored.
Drug-drug interactions (DDIs) are an important cause of adverse drug reactions (ADRs). Could large language models (LLMs) serve as valuable tools for pharmacovigilance specialists in detecting DDIs that lead to ADR notifications?
2025 · Bioinformatics · Oxford University Press · added 2026-04-21
Motivation: Rare diseases affect over 300 million people worldwide and are often caused by genetic variants. While variant detection has become cost-effective, interpreting these variants, particularly collecting literature-based evidence like ACMG/AMP PM3, remains complex and time-consuming. Results: We present AutoPM3, a method that automates PM3 evidence extraction from the literature using open-source large language models (LLMs). AutoPM3 combines a Text2SQL-based variant extractor and a retrieval-augmented generation (RAG) module, enhanced by a variant-specific retriever and fine-tuned LLM, to separately process tables and text. We curated PM3-Bench, a dataset of 1027 variant-publication
2024 · Scientific Data · Nature · added 2026-04-21
11,571 — — NER
2008 SCAI33 1,206 — — NER
2012 ADE39 300 case reports 5,063 drugs — 6,821 drug adverse effects 279 drug dosage RE
2013 DDI43 1,025, including texts from DrugBank and 18,502 drugs — 5,028 drug-drug interactions RE
2015 CHEMDNER34 84,355 chemicals — — NER
2016 BC5CDR 1,500 articles 15,935 chemicals 12,850 diseases 4,409 MeSH chemically induced diseases NER, NEN, RE
2017 N-ary drug-gene-mutation 35 — — — 137,469 drug–gene 3,192 drug–mutation RE
2017 40 ChemProt 32,514 chemicals 30,922 genes
2024 · Bioinformatics · Oxford University Press · added 2026-04-21
Motivation: Thousands of genomes are publicly available; however, most genes in those genomes have poorly defined functions. This is partly due to a gap between previously published, experimentally characterized protein activities and activities deposited in databases. This activity deposition is bottlenecked by the time-consuming biocuration process. The emergence of large language models presents an opportunity to speed up the text-mining of protein activities for biocuration. Results: We developed FuncFetch, a workflow that integrates NCBI E-Utilities, OpenAI’s GPT-4, and Zotero, to screen thousands of manu
Background: Drug-drug interactions (DDIs) are a significant source of morbidity and adverse drug events (ADEs), particularly in situations of polypharmacy and complex medication regimens. While rules-based software integrated in electronic health records (EHRs) has demonstrated proficiency in identifying DDIs present in medication regimens, large language model (LLM) based identification requires thorough benchmarking and performance evaluation using high-quality datasets for safe use. The purpose of this study was to develop a series of