ChEMBL is a large-scale, open-access, FAIR database of bioactive molecules with drug-like properties. ChEMBL 35 contains 17,500 approved drugs and drugs that are progressing through the clinical development pipeline. Drug curation has been an integral part of the core offering of the ChEMBL database since its inception. This paper is a reference guide that presents the principles behind how the ChEMBL drug data have been curated, so that data users can better understand the nature of the data. The drug data include information on names, synonyms and trade names, chemical structure or biological sequence, data sources, indications, mechanisms, warnings, and drug properties such as maximum phase of development, type of molecule, prodrug status and first approval. The integration of the drug data within a bioactivity resource enables wide use of the data set in drug discovery, AI and machine learning.
Barbara Zdrazil · 2025 · Journal of Cheminformatics · BioMed Central · added 2026-04-20
In October 2024 we celebrated the 15th anniversary of the first launch of ChEMBL, Europe’s most impactful, open-access drug discovery database, hosted by EMBL’s European Bioinformatics Institute (EMBL-EBI). This is a good moment to reflect on ChEMBL’s history, the role that ChEMBL plays in Cheminformatics and Drug Discovery, and the innovations accelerated using data extracted from it. The review closes by discussing current challenges and possible directions that need to be taken to guarantee that ChEMBL continues to be the pioneering resource for highly curated, open bioactivity data on the European continent and beyond.
ChEMBL (https://www.ebi.ac.uk/chembl/) is a manually curated, high-quality, large-scale, open, FAIR and Global Core Biodata Resource of bioactive molecules with drug-like properties, previously described in the 2012, 2014, 2017 and 2019 Nucleic Acids Research Database Issues. Since its introduction in 2009, ChEMBL's content has changed dramatically in size and in the diversity of its data types. Through the incorporation of multiple new datasets from depositors since the 2019 update, ChEMBL now contains slightly more deposited bioactivity data than data extracted from the literature. In collaboration with the EUbOPEN consortium, chemical probe data are now regularly deposited into ChEMBL. Release 27 made curated data available for compounds screened for potential anti-SARS-CoV-2 activity in several large-scale drug repurposing screens. In addition, new patent bioactivity data have been added to the latest ChEMBL releases, and various new features have been incorporated, including a Natural Product likeness score, updated flags for Natural Products, a new flag for Chemical Probes, and the initial annotation of the action type for ∼270 000 bioactivity measurements.