2024 · Scientific Data · Nature · added 2026-04-21
11,571 — — NER
2008 SCAI33 1,206 — — NER
2012 ADE39 300 case reports 5,063 drugs — 6,821 drug adverse effects 279 drug dosage RE
2013 DDI43 1,025, including texts from DrugBank and 18,502 drugs — 5,028 drug-drug interactions RE
2015 CHEMDNER34 84,355 chemicals — — NER
2016 BC5CDR 1,500 articles 15,935 chemicals 12,850 diseases 4,409 MeSH chemically induced diseases NER, NEN, RE
2017 N-ary drug-gene-mutation 35 — — — 137,469 drug–gene 3,192 drug–mutation RE
2017 40 ChemProt 32,514 chemicals 30,922 genes
As the biomedical literature grows rapidly, automatically indexing biomedical articles with Medical Subject Headings (MeSH), known as MeSH indexing, has become increasingly important for hypothesis generation and knowledge discovery. In recent years, many large-scale MeSH indexing approaches have been proposed, including Medical Text Indexer, MeSHLabeler, DeepMeSH and MeSHProbeNet. However, the performance of these methods is limited because they use only the title and abstract of each article.
The US National Library of Medicine (NLM) uses Medical Subject Headings (MeSH) (see Note 1) to index almost all 24 million citations in MEDLINE, which greatly facilitates biomedical information retrieval and text mining. Large-scale automatic MeSH indexing is challenging on two sides: the MeSH side and the citation side. On the MeSH side, each citation is annotated with only 12 MeSH terms on average, out of roughly 28,000. On the citation side, all existing methods, including NLM's Medical Text Indexer (MTI), represent text as a bag of words, which cannot capture semantic and context-dependent information well. To address these two challenges, we developed MeSHLabeler and DeepMeSH. Using a "learning to rank" (LTR) framework, MeSHLabeler integrates multiple types of information to address the challenge on the MeSH side, while DeepMeSH integrates deep semantic representations to address the challenge on the citation side. MeSHLabeler took first place in both the BioASQ2 and BioASQ3 challenges, and DeepMeSH took first place in both BioASQ4 and BioASQ5. DeepMeSH is available at http://datamining-iip.fudan.edu.cn/deepmesh.
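To make the "learning to rank" idea concrete, the sketch below ranks candidate MeSH terms for a citation by combining several evidence scores per term. It is an illustration only and not the MeSHLabeler implementation: the three features (a nearest-neighbour score, a per-term classifier score, a title-match flag), the toy values, and the function names `score`, `train_pairwise` and `rank_terms` are all assumptions, and a simple pairwise perceptron stands in for whatever ranking model the authors actually used.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: candidate MeSH term -> evidence scores
# [knn_score, classifier_score, title_match]. All values are made up.
Features = Dict[str, List[float]]


def score(weights: List[float], feats: List[float]) -> float:
    """Linear combination of the evidence scores for one candidate term."""
    return sum(w * f for w, f in zip(weights, feats))


def train_pairwise(
    citations: List[Tuple[Features, Set[str]]],
    n_features: int,
    epochs: int = 20,
    lr: float = 0.1,
) -> List[float]:
    """Pairwise perceptron: push each gold term above each non-gold candidate."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for feats, gold in citations:
            positives = sorted(feats.keys() & gold)
            negatives = sorted(feats.keys() - gold)
            for pos in positives:
                for neg in negatives:
                    # Update only when the pair is ordered wrongly (or tied).
                    if score(weights, feats[pos]) <= score(weights, feats[neg]):
                        for i in range(n_features):
                            weights[i] += lr * (feats[pos][i] - feats[neg][i])
    return weights


def rank_terms(weights: List[float], feats: Features, top_k: int) -> List[str]:
    """Return the top_k candidate terms by combined score, best first."""
    ranked = sorted(feats, key=lambda t: score(weights, feats[t]), reverse=True)
    return ranked[:top_k]


# Toy training set: two citations with candidate terms and gold annotations.
train = [
    (
        {"Humans": [0.9, 0.8, 0.0], "Neoplasms": [0.2, 0.7, 1.0], "Mice": [0.6, 0.1, 0.0]},
        {"Humans", "Neoplasms"},
    ),
    (
        {"Humans": [0.8, 0.9, 0.0], "Diabetes Mellitus": [0.1, 0.6, 1.0], "Rats": [0.7, 0.2, 0.0]},
        {"Humans", "Diabetes Mellitus"},
    ),
]
weights = train_pairwise(train, n_features=3)

# Unseen citation: rank its candidates and keep the two best.
new_citation = {"Humans": [0.85, 0.85, 0.0], "Asthma": [0.15, 0.65, 1.0], "Dogs": [0.65, 0.15, 0.0]}
print(rank_terms(weights, new_citation, top_k=2))
```

In a realistic setting the cut-off would not be fixed: since each citation carries about 12 terms on average out of roughly 28,000, a system also has to decide how many of the top-ranked candidates to keep per citation.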