📚 Article Archive

Machine-Learning Adaptivity, DFT-Level Accuracy, and Semi-Empirical Quantum-Chemistry Speed with Neural-Network Extended Tight-Binding

Yufan Xia, Joshua Soon, Albert Thie, Giuseppe Maria Junior Barca · 2025 · added 2026-04-22

no PDF DOI: 10.26434/chemrxiv-2025-chlcc-v2 📎 SI

density functional theory machine learning neural networks quantum chemistry semi empirical methods tight binding xtb

From Detection to Prediction: Advances in m6A Methylation

2025 · International journal of molecular sciences · MDPI · added 2026-04-21

Academic Editor: Sabrina Venditti. Received: 22 May 2025. Revised: 5 July 2025.

N6-methyladenosine (m6A) is the most common and thoroughly investigated RNA modification and plays essential roles in regulating gene expression by influencing RNA stability, translation efficiency, alternative splicing, and nuclear export. The rapid development of high-throughput sequencing approaches, including miCLIP and MeRIP-seq, has profoundly transformed epitranscriptomics research.

📄 PDF DOI: 10.3390/ijms26146701

alternative splicing bioinformatics cancer deep learning detection gene expression regulation high-throughput sequencing m6a

MPIDNN-GPPI: multi-protein language model

2025 · Li et al. BMC Genomics · BioMed Central · added 2026-04-21

Predicting protein‒protein interactions (PPIs) plays a crucial role in understanding biological processes. Although biological experimental methods can identify PPIs, they are costly, time-consuming, labor-intensive, and often lack stability. In contrast, computational approaches for PPI prediction, particularly deep learning methods, can efficiently learn representations from protein sequences. However, the generalizability, robustness, and stability of computational PPI prediction models still need improvement, especially for species with limited verified PPI

📄 PDF DOI: 10.1186/s12864-025-12228-y

ablation experiments bioinformatics computational biology deep learning deep neural network generalized protein-protein interaction prediction machine learning multi-head attention mechanism

Recent advances in deep learning for protein-protein interaction: a review

2025 · Cui et al. BioData Mining · BioMed Central · added 2026-04-21

Deep learning, a cornerstone of artificial intelligence, is driving rapid advancements in computational biology. Protein-protein interactions (PPIs) are fundamental regulators of biological functions. With the inclusion of deep learning in PPI research, the field is undergoing transformative changes. Therefore, there is an urgent need for a comprehensive review and assessment of recent developments to improve analytical methods and open up a wider range of biomedical applications. This review meticulously assesses deep learning progress in PPI prediction from 2021

📄 PDF DOI: 10.1186/s13040-025-00457-6

artificial intelligence autoencoders bert bioinformatics cnns computational biology deep learning esm

Large Language Model-Enhanced Drug Repositioning Knowledge Extraction via Long Chain-of-Thought: Development and Evaluation Study.

Hongyu Kang, Jiao Li, Li Hou, Xiaowei Xu, Si Zheng, Qin Li · 2025 · JMIR medical informatics · added 2026-04-20

BACKGROUND: Drug repositioning is a pivotal strategy in pharmaceutical research, offering accelerated and cost-effective therapeutic discovery. However, biomedical information relevant to drug repositioning is often complex, dispersed, and underutilized due to limitations in traditional extraction methods, such as reliance on annotated data and poor generalizability. Large language models (LLMs) show promise but face challenges such as hallucinations and interpretability issues. OBJECTIVE: This study proposed long chain-of-thought for drug repositioning knowledge extraction (LCoDR-KE), a lightweight and domain-specific framework to enhance LLMs' accuracy and adaptability in extracting structured biomedical knowledge for drug repositioning. METHODS: A domain-specific schema defined 11 entities (eg, drug, disease) and 18 relationships (eg, treats, is biomarker of). Following the established schema architecture, we constructed automatic annotation based on 10,000 PubMed abstracts via chain-of-thought prompt engineering. A total of 1000 expert-validated abstracts were curated into a drug repositioning corpus, a high-quality specialized corpus, while the remaining entries were allocated for model training purposes. Then, the proposed LCoDR-KE framework combined supervised fine-tuning of the Qwen2.5-7B-Instruct model with reinforcement learning and dual-reward mechanisms. Performance was evaluated against state-of-the-art models (eg, conditional random fields, Bidirectional Encoder Representations From Transformers, BioBERT, Qwen2.5, DeepSeek-R1, OpenBioLLM-70B, and model variants) using precision, recall, and F1-score. In addition, the convergence of the training method was assessed by analyzing performance progression across iteration steps. RESULTS: LCoDR-KE achieved an entity F1 of 81.46% (eg, drug 95.83%, disease 90.52%) and triplet F1 of 69.04%, outperforming traditional models and rivaling larger LLMs (DeepSeek-R1: entity F1=84.64%, triplet F1=69.02%). 
Ablation studies confirmed the contributions of supervised fine-tuning (8.61% and 20.70% F1 drop if removed) and reinforcement learning (6.09% and 14.09% F1 drop if removed). The training process demonstrated stable convergence, validated through iterative performance monitoring. Qualitative analysis of the model's chain-of-thought outputs showed that LCoDR-KE performed structured and schema-aware reasoning by validating entity types, rejecting incompatible relations, enforcing constraints, and generating compliant JSON. Error analysis revealed 4 main types of mistakes and challenges for further improvement. CONCLUSIONS: LCoDR-KE enhances LLMs' domain-specific adaptability for drug repositioning by offering an open-source drug repositioning corpus and a long chain-of-thought framework based on a lightweight LLM. This framework supports drug discovery and knowledge reasoning while providing scalable, interpretable solutions applicable to broader biomedical knowledge extraction tasks.

no PDF DOI: 10.2196/77837 📎 SI

biomedical informatics chain-of-thought prompt engineering drug repositioning knowledge extraction large language model machine learning medicinal chemistry natural language processing

Finding the dark matter: Large language model‐based enzyme kinetic data extractor and its validation

2025 · Protein Science · Wiley · added 2026-04-21

Despite the vast number of enzymatic kinetic measurements reported across decades of biochemical literature, the majority of relational enzyme kinetic data (linking amino acid sequence, substrate identity, kinetic parameters, and assay conditions) remains uncollected and inaccessible in structured form. This constitutes a significant portion of the “dark matter” of enzymology. Unlocking these hidden data through automated extraction offers an opportunity to expand enzyme dataset diversity and size, critical

📄 PDF DOI: 10.1002/pro.70251

amino acid sequence automated extraction benchmarking bioinformatics data curation data extraction data mining data science

Navigating structure-based drug discovery with emerging innovations in physics- and knowledge-based approaches

2025 · npj Drug Discovery · Nature · added 2026-04-21

Structure-based drug design is rapidly evolving, driven by advances in both physics-based and knowledge-based methods. These computational approaches are increasingly integrated across all stages of drug discovery. Despite remarkable progress, challenges remain in achieving accuracy, generalizability, computational efficiency, and chemical synthesizability. In this review, we provide a critical overview of advances, strengths, and limitations of recent methods. We also discuss synergies between the two concepts that hold promise for future advancements towards their practical applicability.

📄 PDF DOI: 10.1038/s44386-025-00031-4 📎 SI

biological targets computational approaches drug discovery drug repurposing lead optimization machine learning medicinal chemistry qsar

Transformer-based models for ADR detection: cross-drug validation and benchmarking against large language models

2025 · Therapeutic advances in drug safety · SAGE Publications · added 2026-04-21

Background: Adverse drug reactions (ADRs) are harmful side effects of medications. Social media provides real-time, patient-generated data, though its unstructured format presents challenges. Natural language processing and transfer learning offer promising solutions. Objective: This study aimed to evaluate whether transformer-based models fine-tuned on a general ADR dataset can effectively classify ADRs from tweets related to glucagon-like peptide-1 (GLP-1) receptor agonists and to benchmark their performance against state-of-the-art large language models (LLMs).

📄 PDF DOI: 10.1177/20420986251405082

adr detection adverse drug reactions artificial intelligence bert bioinformatics fine-tuning glp-1 receptor agonists gpt-2

Machine learning-based drug-drug interaction prediction: a critical review of models, limitations, and data challenges

2025 · Frontiers in pharmacology · Frontiers · added 2026-04-21

Background/Objectives: New computational methods, based on statistical, machine learning, and deep learning techniques using drug-related entities (e.g., genes, protein bindings), help reduce the costs of in-vitro experiments through drug-drug interaction prediction (DDIp). This review examines recent advances in DDIp. It presents an in-depth review of the state-of-the-art studies relating to semi-supervised, supervised, self-supervised learning, and other techniques such as graph-based learning and matrix factorization methods for predicting DDIs. Not all possible interactions between drugs are known, and accurately predicting interactions is even more difficult due to the complex nature of drug-drug interactions (DDIs). Methods: Of the 49 papers published in Web of Science in the last 6 years, 24 papers were considered relevant based on information presented in their titles and abstracts. The included articles focus specifically on predicting DDIs using a type of machine learning algorithm. Excluded articles focused on drug discovery, drug repurposing, molecular representation, or the extraction of biomedical interactions. The methodology, results, limitations, and future research directions were studied for each paper. Common challenges, limitations, and future research directions were analyzed. Results and conclusion: The main limitations are class imbalance, poor performance on new drugs, limited explainability, and the need for additional data sources.

📄 PDF DOI: 10.3389/fphar.2025.1632775

bioinformatics computational biology deep learning drug interaction prediction drug-drug interaction prediction graph-based learning machine learning matrix factorization

AI-driven pharmacovigilance: Enhancing adverse drug reaction detection with deep learning and NLP

2025 · MethodsX 15 · Elsevier · added 2026-04-21

In the healthcare industry, the ever-increasing volume of clinical trial data presents challenges for ensuring drug safety and detecting adverse drug reactions (ADRs). This study aims to address the challenge of accurately detecting Serious Adverse Events (SAEs) in pharmacovigilance, a critical component in ensuring drug safety during and after clinical trials. The key problem lies in the underreporting and delayed detection of Adverse Drug Reactions (ADRs) due to the heterogeneous nature of medical data, class imbalance, and the limited scope of traditional monitoring techniques. This study proposes a hybrid AI-driven framework that integrates structured (e.g., patient demographics, lab results) and unstructured data (e.g., clinical notes) to detect ADRs using advanced deep learning and NLP methods. The objective is to outperform traditional signal detection methods and provide interpretable predictions to aid clinicians in real-time. By leveraging advanced Machine Learning (ML) and Deep Learning (DL) techniques, including Random Forests, Gradient Boosting Machines, and Convolutional Neural Networks (CNNs), our model aims to identify potential ADRs across different patient subgroups. Through meticulous feature engineering and the application of techniques to address data imbalance, our model demonstrates improved accuracy and interpretability in predicting ADRs. The CNN model achieved an accuracy of 85 %, outperforming traditional models, such as Logistic Regression (78 %) and Support Vector Machines (80 %). These findings suggest that specific demographic and clinical factors significantly influence the likelihood of adverse reactions, offering valuable insights for targeted monitoring and risk mitigation strategies[11]. 
This research underscores the potential of predictive modeling to enhance pharmacovigilance efforts and ensure safer clinical trial outcomes.
• The research methodology includes a comparison of supervised learning algorithms, such as Logistic Regression, Random Forest, Gradient Boost, CNN, and genetic algorithms, to identify patterns and anomalies in clinical trial data. BERT and GPT were also employed to provide the functionality of textual interactions over medical data.
• Performance metrics such as accuracy, precision, recall, and F1-score were systematically applied to evaluate each model's performance. Among the models tested, the CNN model with BERT achieved the highest accuracy, providing valuable insights into the potential of deep learning for enhancing pharmacovigilance practices.
• These findings suggest that including diverse clinical data, when supplied to advanced ML and NLP techniques, can significantly improve the detection of ADRs, leading to better alignment with the fundamental principles of Good Clinical Practice (GCP).

📄 PDF DOI: 10.1016/j.mex.2025.103460 📎 SI

adverse drug reaction detection adverse drug reactions artificial intelligence bert clinical notes clinical trial data clinical trial data processing convolutional neural networks

A Systematic Review of Drug-Related Interactions Utilizing Deep

2025 · ACS Omega · ACS Publications · added 2026-04-21

Computational drug discovery is essential for screening potential treatments and reducing the costs and time associated with proposing or combining drugs for disease management. Despite the extensive research conducted in this field, it remains an emerging area, particularly with the advent of machine learning, deep learning, and large language models (LLMs). This systematic review examines the integration of machine learning and deep learning techniques in drug discovery, concentrating on three critical areas: drug-drug interactions (DDIs), drug-target interactions (DTIs), and adverse drug reactions (ADRs). The review analyzes over 100 papers published between 2020 and 2025, categorizing the methods into deep learning, machine learning, graph learning, and hybrid models. It highlights the transformative impact of natural language processing (NLP) and LLMs in extracting meaningful insights from biomedical literature and chemical data. Furthermore, this work introduces key databases and data sets widely utilized in drug discovery. Additionally, this review identifies gaps in the existing research, such as the lack of comprehensive studies that simultaneously address DDI, DTI, and ADR extraction, and it proposes a more holistic approach to fill these gaps. The paper concludes by thoroughly evaluating various models, underscoring their performance metrics.

📄 PDF DOI: 10.1021/acsomega.5c04997

bioinformatic techniques bioinformatics biological target biological testing clinical trials computational drug discovery computational modeling deep learning

GeNius: an ultrafast drug–target interaction inference method based on graph neural networks

2024 · Bioinformatics · Oxford University Press · added 2026-04-21

Motivation: Drug–target interaction (DTI) prediction is a relevant but challenging task in the drug repurposing field. In-silico approaches have drawn particular attention as they can reduce associated costs and time commitment of traditional methodologies. Yet, current state-of-the-art methods present several limitations: existing DTI prediction approaches are computationally expensive, thereby hindering the ability to use large networks and exploit available datasets; and the generalization to unseen datasets of DTI prediction methods remains unexplored, which could

📄 PDF DOI: 10.1093/bioinformatics/btad774

bioinformatics bioinorganic data mining drug drug repurposing drug-target interaction prediction graph neural networks in-silico

MDTips: a multimodal-data-based drug–target interaction prediction system fusing knowledge, gene expression profile, and structural data

2023 · Bioinformatics · Oxford University Press · added 2026-04-21

Motivation: Screening new drug–target interactions (DTIs) by traditional experimental methods is costly and time-consuming. Recent advances in knowledge graphs, chemical linear notations, and genomic data enable researchers to develop computational DTI models, which play a pivotal role in drug repurposing and discovery. However, a multimodal fusion DTI model that integrates available heterogeneous data into a unified framework is still needed. Results: We developed MDTips, a multimodal-data-based DTI prediction system, by fusing the knowledge graphs, gene expression profiles, and

📄 PDF DOI: 10.1093/bioinformatics/btad411

bioinformatics computational modeling deep learning drug drug discovery drug repurposing drug-target interaction gene expression profile

Computational analyses of mechanism of action (MoA): data, methods and integration

2022 · RSC Chemical Biology · Royal Society of Chemistry · added 2026-04-21

This review summarises different data, data resources and methods for computational mechanism of action (MoA) analysis, and highlights some case studies where integration of data types and methods enabled MoA elucidation on the systems-level.

📄 PDF DOI: 10.1039/d1cb00069a

bioinformatics computational analysis connectivity mapping drug discovery machine learning mechanism of action medicinal chemistry multi-omics integration

Computational and Structural Biotechnology Journal 19 (2021) 4538–4558

2021 · Computational and Structural Biotechnology Journal · Elsevier · added 2026-04-21

Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In the last years, the approach used in this search presents an important component in computer science with the skyrocketing of machine learning techniques due to its democratization. With the objectives set by the Precision Medicine initiative and the new challenges generated, it is necessary to establish robust, standard and reproducible computational methodologies to achieve the objectives set. Currently, predictive models based on Machine Learning have gained great importance in the step prior to preclinical studies. This stage manages to drastically reduce costs and research times in the discovery of new drugs. This review article focuses on how these new methodologies are being used in recent years of research. Analyzing the state of the art in this field will give us an idea of where cheminformatics will be developed in the short term, the limitations it presents and the positive results it has achieved. This review will focus mainly on the methods used to model the molecular data, as well as the biological problems addressed and the Machine Learning algorithms used for drug discovery in recent years.

📄 PDF DOI: 10.1016/j.csbj.2021.08.011 📎 SI

cheminformatics computational methodologies deep learning drug discovery machine learning molecular descriptors precision medicine qsar

FullMeSH: improving large-scale MeSH indexing with full text.

Dai S, You R, Lu Z, Huang X, Mamitsuka H, Zhu S · 2020 · Bioinformatics · Oxford University Press · added 2026-04-20

With the rapidly growing biomedical literature, automatically indexing biomedical articles by Medical Subject Heading (MeSH), namely MeSH indexing, has become increasingly important for facilitating hypothesis generation and knowledge discovery. Over the past years, many large-scale MeSH indexing approaches have been proposed, such as Medical Text Indexer, MeSHLabeler, DeepMeSH and MeSHProbeNet. However, the performance of these methods is hampered by using limited information, i.e. only the title and abstract of biomedical articles.

📄 PDF DOI: 10.1093/bioinformatics/btz756 📎 SI

bioinformatics biomedical literature information retrieval machine learning medical subject heading mesh indexing natural language processing text analysis

Drug-drug interaction identification using large language models

added 2026-04-20

Background: Drug-drug interactions (DDIs) are a significant source of morbidity and adverse drug events (ADEs), particularly in situations of polypharmacy and complex medication regimens. While rules-based software integrated in electronic health records (EHRs) has demonstrated proficiency in identifying DDIs present in medication regimens, large language model (LLM) based identification requires thorough benchmarking and performance evaluation using high-quality datasets for safe use. The purpose of this study was to develop a series of

📄 PDF DOI: 10.64898/2025.12.03.25341549 📎 SI

adverse drug reactions drug interactions drug safety assessment drug-drug interaction identification large language models machine learning medication safety natural language processing
