📚 Article Archive

The BOS-Lig Dataset: Accurate Ligand Charges from a Consensus Approach for 66,810 Experimentally Synthesized Ligands

Michel, Roland G. St. · 2026 · arXiv · added 2026-04-22

Understanding ligand properties is essential for computational high-throughput screening of transition metal complexes. However, ligand properties such as net charge and other information such as their application area are often absent or inconsistently recorded in crystallographic datasets. Here, we construct a ligand dataset from 126,985 mononuclear transition metal complexes curated from the Cambridge Structural Database. Using an iterative charge-balancing workflow that combines complex charges, metal oxidation states, and consensus across crystallographic observations, we confidently assign net charges to 66,810 ligands among 94,581 identified unique ligand structures to curate the Boston Open-Shell Ligand (BOS-Lig) dataset. The workflow assigns ligand charges in homoleptic complexes first and then iteratively propagates these assignments across heteroleptic environments, allowing charges to be inferred even when direct charge information is unavailable. We analyze cases where simple heuristics such as the octet rule would have failed and introduce a purity metric to identify when our charge assignments may be incorrect. Each ligand is also classified in terms of its metal coordinating atoms and whether there are multiple variants (i.e., hemilability). We then link complexes to their associated journal abstracts and apply a topic-modeling workflow to link 25,146 ligands with functional application areas spanning reactivity, redox chemistry, biological chemistry, and photophysical chemistry. Together, we provide an experimentally grounded dataset of ligand chemical space that connects charge and functional application as a foundation for computational screening and data-driven ligand design.
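The charge-balancing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tuple layout `(complex_charge, oxidation_state, ligands)`, and the definition of purity as the share of observations that agree with the majority charge are all assumptions made for illustration. The sketch relies only on the balance complex charge = metal oxidation state + sum of ligand charges.

```python
from collections import Counter, defaultdict


def solve_unknown(total, ox_state, ligands, known):
    """Solve for the one unassigned ligand type in a complex, if exactly one exists."""
    counts = Counter(ligands)
    unknown = [lig for lig in counts if lig not in known]
    if len(unknown) != 1:
        return None  # nothing to solve, or too many unknowns for one equation
    lig = unknown[0]
    assigned = sum(known[l] * n for l, n in counts.items() if l in known)
    remainder = total - ox_state - assigned
    if remainder % counts[lig] != 0:
        return None  # non-integer charge: treat the record as inconsistent
    return lig, remainder // counts[lig]


def assign_charges(complexes):
    """Iteratively assign ligand charges by consensus (hypothetical helper).

    On the first pass nothing is known, so only homoleptic complexes
    (a single ligand type) can be solved. Later passes propagate those
    assignments into heteroleptic complexes with one remaining unknown.
    """
    known, purity = {}, {}
    while True:
        votes = defaultdict(Counter)
        for total, ox_state, ligands in complexes:
            result = solve_unknown(total, ox_state, ligands, known)
            if result is not None:
                lig, charge = result
                votes[lig][charge] += 1
        if not votes:
            break  # no new ligand could be assigned in this pass
        for lig, counter in votes.items():
            charge, n_agree = counter.most_common(1)[0]
            known[lig] = charge
            # Assumed purity definition: fraction of votes matching the consensus.
            purity[lig] = n_agree / sum(counter.values())
    return known, purity


# Toy records: (complex charge, metal oxidation state, ligand labels)
complexes = [
    (-2, 2, ["Cl"] * 4),               # homoleptic tetrachlorido complex
    (3, 3, ["NH3"] * 6),               # homoleptic hexaammine complex
    (1, 3, ["NH3"] * 4 + ["CO3"]),     # heteroleptic, solved on the second pass
]
charges, purity = assign_charges(complexes)
```

In this toy input the chloride and ammine charges come from the homoleptic records, and the carbonate charge is then inferred from the heteroleptic record. A purity below 1.0 would flag a ligand whose observations disagree, which is the kind of case the abstract says warrants scrutiny.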

no PDF

bos-lig · charge balancing · charge-balancing workflow · chemistry · computational chemistry · crystallographic datasets · crystallography · dataset

A Systematic Review of Drug-Related Interactions Utilizing Deep

2025 · ACS Omega · ACS Publications · added 2026-04-21

Computational drug discovery is essential for screening potential treatments and reducing the costs and time associated with proposing or combining drugs for disease management. Despite the extensive research conducted in this field, it remains an emerging area, particularly with the advent of machine learning, deep learning, and large language models (LLMs). This systematic review examines the integration of machine learning and deep learning techniques in drug discovery, concentrating on three critical areas: drug-drug interactions (DDIs), drug-target interactions (DTIs), and adverse drug reactions (ADRs). The review analyzes over 100 papers published between 2020 and 2025, categorizing the methods into deep learning, machine learning, graph learning, and hybrid models. It highlights the transformative impact of natural language processing (NLP) and LLMs in extracting meaningful insights from biomedical literature and chemical data. Furthermore, this work introduces key databases and data sets widely utilized in drug discovery. Additionally, this review identifies gaps in the existing research, such as the lack of comprehensive studies that simultaneously address DDI, DTI, and ADR extraction, and it proposes a more holistic approach to fill these gaps. The paper concludes by thoroughly evaluating various models, underscoring their performance metrics.

📄 PDF DOI: 10.1021/acsomega.5c04997

bioinformatic techniques · bioinformatics · biological target · biological testing · clinical trials · computational drug discovery · computational modeling · deep learning
