Adapters for
The UKP Sentential Argument Mining Corpus includes 25,492 sentences over eight controversial topics. Each sentence was annotated via crowdsourcing as either a supporting argument, an attacking... Argument Mining
Cosmos QA is a large-scale dataset of 35,600 problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. Common Sense Reasoning
To investigate question answering with prior knowledge, we present CommonsenseQA: a challenging new dataset for commonsense question answering. To capture common sense beyond associations, we... Common Sense Reasoning
HellaSwag is a new benchmark for commonsense NLI. Common Sense Reasoning
Social IQa is the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple-choice questions for probing emotional and social intelligence... Common Sense Reasoning
WinoGrande is a new collection of 44k problems, inspired by Winograd Schema Challenge (Levesque, Davis, and Morgenstern 2011), but adjusted to improve the scale and robustness against the... Common Sense Reasoning
Arabic Dialect Detection classifies text into Modern Standard Arabic (MSA), Egyptian, Maghrebi (northwest Africa), Gulf, and Levantine. Dialect Detection
The Corpus of Linguistic Acceptability (CoLA) in its full form consists of 10657 sentences from 23 linguistics publications, expertly annotated for acceptability (grammaticality) by their original... Linguistic Acceptability
The Multi-Sentence Reading Comprehension dataset (MultiRC, Khashabi et al., 2018) is a true/false question-answering task. Each example consists of a context paragraph, a question about that... Machine Reading Comprehension
RACE is a large-scale reading comprehension dataset with more than 28,000 passages and nearly 100,000 questions. The dataset is collected from English examinations in China, which are designed for... Machine Reading Comprehension
The CoNLL 2003 shared task is language-independent named entity recognition. The dataset contains English and German training, development, and test sets. The English data is from the Reuters corpus and... Named Entity Recognition
The CommitmentBank is a corpus of naturally occurring discourses whose final sentence contains a clause-embedding predicate under an entailment canceling operator. Natural Language Inference
Multi-Genre Natural Language Inference is a large-scale, crowdsourced entailment classification task. Given a pair of sentences, the goal is to predict whether the second sentence is an... Natural Language Inference
Question Natural Language Inference is a version of SQuAD which has been converted to a binary classification task. The positive examples are (question, sentence) pairs which do contain... Natural Language Inference
Recognizing Textual Entailment is a binary entailment task similar to MNLI, but with much less training data. Natural Language Inference
The SciTail dataset is an entailment dataset created from multiple-choice science exams and web sentences. Each question and the correct answer choice are converted into an assertive statement to... Natural Language Inference
The SICK relatedness (SICK-R) task trains a linear model to output a score from 1 to 5 indicating the relatedness of two sentences. The same dataset (SICK-E) can be treated as a three-class... Natural Language Inference
English Web Treebank was developed by the Linguistic Data Consortium (LDC) with funding through a gift from Google Inc. It consists of over 250,000 words of English weblogs, newsgroups, email,... Part-Of-Speech Tagging
CoNLL-2003 shared task: language-independent named entity recognition is a corpus consisting of Reuters news stories from between August 1996 and August 1997. Each line of the corpus has... Phrase Chunking
We build a reading comprehension dataset, BoolQ, of naturally occurring yes/no questions, and show that they are unexpectedly challenging. Question Answering
Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is... Question Answering
SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0,... Question Answering
Microsoft Research Paraphrase Corpus consists of sentence pairs automatically extracted from online news sources, with human annotations for whether the sentences in the pair are semantically equivalent. Semantic Textual Similarity
Quora Question Pairs is a binary classification task where the goal is to determine if two questions asked on Quora are semantically equivalent. Semantic Textual Similarity
StackExchange QA similarity determines whether two questions or a question-answer pair in StackExchange forums are related or not (e.g., to find duplicate questions or relevant answers). Semantic Textual Similarity
The Semantic Textual Similarity Benchmark is a collection of sentence pairs drawn from news headlines and other sources. They were annotated with a score from 1 to 5 denoting how similar the two... Semantic Textual Similarity
This dataset was released as part of SemEval 2020, Task 9 on Sentiment Analysis in Code Mixed Social Media (Twitter) text. It tags positive, neutral and negative sentiment. There are 17,000... Sentiment Analysis
The IMDB dataset has 50K movie reviews for natural language processing or text analytics. This is a dataset for binary sentiment classification containing substantially more data than previous... Sentiment Analysis
Movie Review Dataset. This is a dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. This data was first used in Bo Pang and Lillian Lee,... Sentiment Analysis
The Stanford Sentiment Treebank is a binary single-sentence classification task consisting of sentences extracted from movie reviews with human annotations of their sentiment. Sentiment Analysis
Add your task to AdapterHub, it's super awesome!