The scientific program includes talks for the individual tracks, a panel on mining adverse drug reactions, a keynote talk, and flash talks for selected posters. The detailed agenda is shown below.
Monday, November 8, 2021
2:30-2:40 pm UTC (9:30-9:40 am EST): Opening remarks
2:40-3:55 pm UTC (9:40-10:55 am EST): NLM-Chem Track: Full text Chemical Identification and Indexing in PubMed articles (Track 2)
Chair: Zhiyong Lu
Overview of the NLM-CHEM track - Full-text Chemical Identification and Indexing in PubMed articles (Rezarta Islamaj, Robert Leaman)
Chemical detection and indexing in PubMed full text articles using deep learning and rule-based methods (João Figueira Silva)
Improving Tagging Consistency and Entity Coverage for Chemical Identification in Full-text Articles (Hyunjae Kim)
A BERT-Based Hybrid System for Chemical Identification and Indexing in Full-Text Articles (Arslan Erdengasileng)
Chemical Identification and Indexing in PubMed Articles via BERT and Text-to-Text Based Approaches (Virginia Adams)
Automatic extraction of medication names in tweets (Track 3)
Chair: Davy Weissenbacher
BioCreative VII – Track 3: Automatic Extraction of Medication Names in Tweets (Davy Weissenbacher)
NCU-IISR/AS-GIS: Detecting Medication Names in Imbalanced Twitter Data with Pretrained Extractive QA Model and Data-Centric Approach (Yu Zhang)
BCH-NLP at BioCreative VII Track 3 - medications detection in tweets using transformer networks and multi-task learning (Dongfang Xu)
Tuesday, November 9, 2021
2:00-2:10 pm UTC (9:00-9:10 am EST): Opening remarks
2:10-3:55 pm UTC (9:10-10:55 am EST): DrugProt: Text mining drug/chemical-protein interactions (Track 1)
Chair: Antonio Miranda-Escalada
Overview of DrugProt BioCreative VII track: quality evaluation and large scale text mining of drug-gene/protein relations (Martin Krallinger, Antonio Miranda-Escalada)
Using Knowledge Base to Refine Data Augmentation for Biomedical Relation Extraction (WonJin Yoon)
Extracting Drug-Protein Interaction using an Ensemble of Biomedical Pre-trained Language Models through Sequence Labeling and Text Classification Techniques (Ling Luo)
Text Mining Drug-Protein Interactions using an Ensemble of BERT, Sentence BERT and T5 models (Xin Sui)
Humboldt @ DrugProt: Chemical-Protein Relation Extraction with Pretrained Transformers and Entity Descriptions (Leon Weber)
Does constituency analysis enhance domain-specific pre-trained BERT models for relation extraction? (Anfu Tang)
Text Mining Drug/Chemical-Protein Interactions using an Ensemble of BERT and T5 Based Models (Virginia Adams)
CU-UD: text-mining drug and chemical-protein interactions with ensembles of BERT-based models (Mehmet Efruz Karabulut)
TTI-COIN at BioCreative VII Track 1 (Naoki Iinuma/Masaki Asada)
A Multi-Task Transfer Learning-based method for Extracting Drug-Protein Interactions (Ed-drissiya El-allaly)
lasigeBioTM at BioCreative VII Track 1: Text mining drug and chemical-protein interactions using biomedical ontologies (Diana Sousa)
Identifying Drug/chemical-protein Interactions in Biomedical Literature using the BERT-based Ensemble Learning Approach for the BioCreative 2021 DrugProt Track (Tzu-Yi Li)
Catalytic DS at BioCreative VII: DrugProt Track (Dennis Mehay)
Claim Detection in Biomedical Twitter Posts as a Prerequisite for Fact-Checking (Amelie Wührl)
Visual Exploration of Randomized Clinical Trials for COVID-19 (Abel Correa Dias)
COVID-SEE: The Scientific Evidence Explorer for COVID-19 Related Research (Karin Verspoor)
Long Covid: A Comprehensive Collection of Articles on Post-COVID Conditions (Robert Leaman)
Automated topic prediction of LitCovid using BioBERT (Vangala G Saipradeep)
A Survey of Relation Extraction Techniques Using Hybrid Classical and State of the Art Methods (Onur Kara)
Automatic Extraction of Medication Names in Tweets as Named Entity Recognition (Carole Anderson)
PubMedBERT-based Classifier with Data Augmentation Strategy for Detecting Medication Mentions in Tweets (Qing Hang)
Extraction of Medication Names from Twitter Using Augmentation and an Ensemble of Language Models (Igor Kulev)
Recognizing Chemical Entity in Biomedical Literature using a BERT-based Ensemble Learning Methods for the BioCreative 2021 NLM-Chem Track (Yu Wen Chiu)
Fine-tuning transformers for automatic chemical entity identification in PubMed articles (Robert Bevan)
PolyU CBS-NLP at BioCreative-VII LitCovid Task: Ensemble Learning for COVID-19 Multilabel Classification (Jinghang Gu)
Multi-label topic classification for COVID-19 literature annotation using an ensemble model based on PubMedBERT (Shubo Tian)
RobertNLP at the BioCreative VII - LitCovid track: Neural Document Classification Using SciBERT (Annemarie Friedrich)
TTI-COIN at BioCreative VII Track 2 (Tomoki Tsujimura)
Chemical–protein relation extraction in PubMed abstracts using BERT and neural networks (Rui Antunes)
R-BERT-CNN: Drug-target interactions extraction from biomedical literature (Jehad Aldahdooh)
5:00-6:15 pm UTC (12:00-1:15 pm EST): Panel: Challenges in mining adverse drug reactions
The BioCreative organizers have convened this panel to explore the possibility of a future BioCreative evaluation on mining adverse drug reactions (ADRs). The panel will explore challenges of mining ADRs, focusing on applications (e.g., post-market surveillance, early warning from tracking social media, predictive models of toxic endpoints for chemicals and drugs, pre-clinical and clinical research) and data sources (including their limitations and accessibility).
Chairs: Martin Krallinger, Lynette Hirschman
Panelists:
Dr. Martin Krallinger (Chair)
CDR Monica Muñoz, FDA CDER
Prof. Özlem Uzuner, George Mason University
Dr. Raul Rodriguez-Esteban, Roche Pharmaceutics
Prof. Graciela Gonzalez-Hernandez, U Pennsylvania Medical School
Wednesday, November 10, 2021
2:30-2:40 pm UTC (9:30-9:40 am EST): Opening remarks
2:40-3:55 pm UTC (9:40-10:55 am EST): LitCovid track: Multi-label topic classification for COVID-19 literature annotation (Track 5)
Chair: Rezarta Islamaj
BioCreative VII LitCovid Track: multi-label topic classification for COVID-19 literature annotation (Qingyu Chen)
Multi-label topic classification for COVID-19 literature with Bioformer (Fang Li)
Multi-label topic classification for COVID-19 literature annotation: A BioBERT-based feature enhancement approach (Wentai Tang)
BERT-based bagging-stacking for multi-topic classification (Loïc Rakotoson)
Multi-label Topic Classification for COVID-19 Literature Annotation using the BERT-based Ensemble Learning Approach for the BioCreative 2021 LitCovid Track (Sheng-Jie Lin)
Note for BioCreative participants: To register for a track, please use the Google form. Do not use the "Team page" tab, as it is non-functional.
BioCreative VII Challenge and Workshop CFP
The workshop will take place on November 8-10, 2021. This workshop will be virtual.
BioCreative (Critical Assessment of Information Extraction in Biology) is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. BioCreative has been an invaluable source for advancing state-of-the-art text mining methods by providing reference datasets and a collegial environment to develop and evaluate these methods in both shared and interactive modes.
The sudden spread of COVID-19 has put unexpected pressure on the biomedical community to quickly identify potential treatments by repurposing existing drugs or identifying new chemicals with anti-SARS-CoV-2 activity. Thus, BioCreative VII will focus on the detection of chemicals, drugs and related substances with three tracks: Track 1 (DrugProt) focuses on the detection of interactions between chemicals/drugs/substances and genes/proteins in abstracts, Track 2 (NLM-Chem) focuses on detecting chemical names and their MeSH encoding in full-length articles, and Track 3 (Medications in Tweets) focuses on extracting medication mentions from social media.
In addition, COVID-19 has triggered the development of multiple text mining tools to support ongoing research efforts, and these tools await community feedback. Thus, we are offering an interactive track, Track 4, to provide an environment in which users can review tools and give feedback on their utility and usability.
We further offer Track 5, the LitCovid track on multi-label topic classification for COVID-19 literature annotation, calling for innovative text mining tools to support the curation of COVID-19 literature in LitCovid, a literature database of COVID-19-related papers in PubMed.
Here are more details about the tracks. Click on a track number to access the track-specific page:
Track 1- DrugProt: Text mining drug/chemical-protein interactions
Organizers: Martin Krallinger, Alfonso Valencia
DrugProt will explore recognition of chemical-protein entity relations from abstracts. The aim of the DrugProt track is to promote the development and evaluation of systems that are able to automatically detect relations between chemical compounds/drugs and genes/proteins. We have therefore generated a manually annotated corpus, the DrugProt corpus, where domain experts have exhaustively labeled: (a) all chemical and gene mentions, and (b) all binary relationships between them corresponding to a specific set of biologically relevant relation types (DrugProt relation classes).
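As a rough illustration of the annotation structure described above, the following sketch models entity mentions and binary chemical-gene relations, and enumerates the candidate pairs a relation classifier would have to label. The class names, field names, example sentence, and the `INHIBITOR` and `NONE` labels are assumptions made for illustration; they are not the official DrugProt file format or class list.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Mention:
    """One annotated entity mention (hypothetical structure)."""
    mention_id: str
    kind: str   # "CHEMICAL" or "GENE"
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    text: str


@dataclass(frozen=True)
class Relation:
    """A binary relation between one chemical and one gene mention."""
    relation_type: str
    chemical_id: str
    gene_id: str


def candidate_pairs(mentions):
    """Return every (chemical, gene) mention pair found in one abstract."""
    chemicals = [m for m in mentions if m.kind == "CHEMICAL"]
    genes = [m for m in mentions if m.kind == "GENE"]
    return list(product(chemicals, genes))


def label_pairs(pairs, gold_relations):
    """Attach the gold relation type to each pair, or "NONE" if unannotated."""
    gold = {(r.chemical_id, r.gene_id): r.relation_type for r in gold_relations}
    return [
        (chem.text, gene.text, gold.get((chem.mention_id, gene.mention_id), "NONE"))
        for chem, gene in pairs
    ]


# Made-up example sentence and annotations, for illustration only.
abstract = "Aspirin inhibits COX-1 and COX-2."
mentions = [
    Mention("T1", "CHEMICAL", 0, 7, "Aspirin"),
    Mention("T2", "GENE", 17, 22, "COX-1"),
    Mention("T3", "GENE", 27, 32, "COX-2"),
]
gold_relations = [Relation("INHIBITOR", "T1", "T2")]

pairs = candidate_pairs(mentions)
labeled = label_pairs(pairs, gold_relations)
```

Framing the task as classification over candidate pairs is only one possible design; the participating systems listed in the agenda also mention sequence labeling and text-to-text approaches.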
Track 2- NLM-Chem Track: Full text Chemical Identification and Indexing in PubMed articles
Organizers: Rezarta Islamaj, Robert Leaman, and Zhiyong Lu, National Library of Medicine (NLM)
Current chemical concept recognition tools have demonstrated significantly lower performance in full-text articles than in abstracts. Improving automated full-text chemical concept recognition can substantially accelerate manual indexing and curation and advance downstream NLP tasks such as relevant article retrieval. The NLM-CHEM task will consist of two sub-tasks, focusing on (1) identifying chemicals in full-text articles (i.e., named entity recognition and normalization) and (2) ranking chemical concepts for full-text document indexing. The task will use the recently released NLM-CHEM corpus, consisting of 150 full-text articles, with ~5,000 unique chemical names mapped to ~2,000 MeSH identifiers.
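To make the two sub-tasks concrete, here is a minimal sketch that uses a toy dictionary lookup for recognition and normalization, followed by a frequency-based ranking for indexing. The dictionary approach, the `LEXICON` contents, the placeholder identifiers, and the sample text are all assumptions for illustration; they are not real MeSH identifiers and not the method used by the organizers or any participating team.

```python
import re
from collections import Counter

# Toy lexicon: surface form -> placeholder concept identifier.
# These identifiers are made up for illustration; they are NOT real MeSH IDs.
LEXICON = {
    "aspirin": "CONCEPT-A",
    "acetylsalicylic acid": "CONCEPT-A",
    "ibuprofen": "CONCEPT-B",
}

# Longest names first, so multi-word names win over shorter overlapping ones.
_NAMES = sorted(LEXICON, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in _NAMES) + r")\b",
    re.IGNORECASE,
)


def find_chemicals(text):
    """Sub-task 1 (toy version): return (start, end, surface, concept_id) tuples."""
    return [
        (m.start(), m.end(), m.group(0), LEXICON[m.group(0).lower()])
        for m in _PATTERN.finditer(text)
    ]


def rank_concepts(found):
    """Sub-task 2 (toy version): rank concepts by how often they are mentioned."""
    counts = Counter(concept_id for _, _, _, concept_id in found)
    return counts.most_common()


# Made-up text standing in for a full-text article.
full_text = (
    "Aspirin (acetylsalicylic acid) was compared with ibuprofen. "
    "Aspirin reduced fever faster."
)
found = find_chemicals(full_text)
ranking = rank_concepts(found)
```

Mention frequency is only a naive proxy for indexing relevance; a real system would likely also weigh where in the article a chemical appears.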
Track 3- Automatic extraction of medication names in tweets
Organizers: Graciela Gonzalez-Hernandez, Davy Weissenbacher, Ivan Flores, Karen O’Connor
The goal of this task is to extract the spans that mention a medication or dietary supplement in tweets. The dataset consists of all tweets posted by 212 Twitter users during their pregnancy. This data represents the natural and highly imbalanced distribution of drug mentions on Twitter, with only approximately 0.2% of the tweets mentioning a medication. Training and evaluating a sequence labeler on this dataset will closely model the detection of drugs in tweets in practice. Click here for more information.
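The practical consequence of the 0.2% positive rate can be shown with a short calculation. The sketch below simplifies the span-extraction task to a tweet-level yes/no decision and assumes a hypothetical corpus of 100,000 tweets (the actual corpus size is not stated here). A system that never predicts a medication scores very high accuracy while finding nothing, which is why precision, recall, and F1 on the positive class are more informative for data like this.

```python
def scores(y_true, y_pred):
    """Compute accuracy plus precision, recall and F1 for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical corpus size; only the 0.2% rate comes from the track description.
n_tweets = 100_000
n_positive = n_tweets * 2 // 1000  # 0.2% of the tweets mention a medication

y_true = [1] * n_positive + [0] * (n_tweets - n_positive)
always_negative = [0] * n_tweets  # trivial baseline: never predict a medication

baseline = scores(y_true, always_negative)
# Accuracy is 0.998, yet recall and F1 are both 0.0.
```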
Track 4- COVID-19 text mining tool interactive demo
Organizers: Cecilia Arighi, Andrew Chatr-Aryamontri, Lynette Hirschman, Martin Krallinger, Karen Ross, Tonia Korves
The COVID-19 text mining tool interactive demo track is a demonstration task, and will focus on tools specifically developed to support COVID-19 research efforts. Similar to previous interactive tasks (e.g., PMID:27589961), tools will be reviewed by the research community, which will provide feedback on effectiveness and usability. The goal of this task is to foster interaction between system developers and potential users to advance the development of text mining tools that are useful for the research community. Participating teams will present a web-based system that can address some task(s) of their choice. Users will be recruited to review the system and provide feedback via a user questionnaire. More information here.
Track 5- LitCovid track: Multi-label topic classification for COVID-19 literature annotation
Organizers: Qingyu Chen, Alexis Allot, Rezarta Islamaj, Robert Leaman, and Zhiyong Lu, National Library of Medicine (NLM)
The number of COVID-19-related articles in the literature is growing by about 10,000 articles per month. LitCovid, a literature database of COVID-19-related papers in PubMed, has accumulated more than 100,000 articles, with millions of accesses each month by users worldwide. LitCovid is updated daily, and this rapid growth significantly increases the burden of manual curation. In particular, annotating each article with up to eight possible topics, e.g., Treatment and Diagnosis, has been a bottleneck in the LitCovid curation pipeline. Increasing the accuracy of automated topic prediction in COVID-19-related literature would be a timely improvement beneficial to curators and researchers worldwide. The LitCovid track calls for a community effort to tackle automated topic annotation for COVID-19 literature. The task will use ~60K articles in LitCovid with manually reviewed topics.
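Because an article can carry several topics at once, this is a multi-label problem rather than a single-label one. The sketch below shows one common way to represent such labels (multi-hot vectors) and to score predictions with a micro-averaged F1. Only "Treatment" and "Diagnosis" come from the description above; `TopicC` and `TopicD` are placeholder names, and the choice of micro-averaged F1 is an assumption, not necessarily the track's official metric.

```python
# Only "Treatment" and "Diagnosis" are named in the track description;
# "TopicC" and "TopicD" are hypothetical placeholders for other topics.
TOPICS = ["Treatment", "Diagnosis", "TopicC", "TopicD"]


def to_multi_hot(labels, topics=TOPICS):
    """Encode a set of topic labels as a 0/1 vector aligned with `topics`."""
    label_set = set(labels)
    unknown = label_set - set(topics)
    if unknown:
        raise ValueError(f"unknown topics: {sorted(unknown)}")
    return [1 if topic in label_set else 0 for topic in topics]


def micro_f1(gold_rows, pred_rows):
    """Micro-averaged F1: pool every (article, topic) decision, then score."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_rows, pred_rows):
        for g, p in zip(gold, pred):
            if g and p:
                tp += 1
            elif p:
                fp += 1
            elif g:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Two made-up articles: gold annotations and a system's predictions.
gold = [to_multi_hot(["Treatment", "Diagnosis"]), to_multi_hot(["TopicC"])]
pred = [to_multi_hot(["Treatment"]), to_multi_hot(["TopicC", "TopicD"])]

score = micro_f1(gold, pred)  # 2 true positives, 1 false positive, 1 false negative
```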
PUBLICATION
The BioCreative VII Proceedings will host all the submissions from participating teams and will be freely available by the time of the workshop.
In addition, we are happy to announce that the journal Database will host the BioCreative VII special issue for work that has passed its peer-review process. Invitations to submit will be sent after the workshop.
TEAM REGISTRATION
Teams can participate in one or more of these tracks. Team registration will continue until final commitment is requested by the individual tracks.
To register a team, go to the Registration form. If you have restrictions accessing Google Forms, please send an e-mail to BiocreativeChallenge@gmail.com.
Note: The BioCreative site has a Team page link; please ignore it, as it is non-functional. Registration is done via Google Forms this time.
BIOCREATIVE ORGANIZING COMMITTEE
Cecilia Arighi, University of Delaware, USA
Andrew Chatr-Aryamontri, University of Montreal, Canada
Rezarta Dogan, National Center for Biotechnology Information (NCBI), NIH, USA
Graciela Gonzalez-Hernandez, University of Pennsylvania, USA
Lynette Hirschman, MITRE Corporation, USA
Martin Krallinger, Barcelona Supercomputing Center, Spain
Robert Leaman, National Center for Biotechnology Information (NCBI), NIH, USA
Zhiyong Lu, National Center for Biotechnology Information (NCBI), NIH, USA
Karen Ross, Georgetown University Medical School, USA
Alfonso Valencia, Barcelona Supercomputing Center, Spain
Davy Weissenbacher, University of Pennsylvania, USA