The UKP Sentential Argument Mining Corpus includes 25,492 sentences over eight controversial topics. Each sentence was annotated via crowdsourcing as either a supporting argument, an attacking... Argument Mining
COPA (Gordon et al., 2012) is a causal commonsense reasoning task whose goal is to select the more plausible of two alternative sentences, given a premise sentence. Common Sense Reasoning
Cosmos QA is a large-scale dataset of 35,600 problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. Common Sense Reasoning
To investigate question answering with prior knowledge, we present CommonsenseQA: a challenging new dataset for commonsense question answering. To capture common sense beyond associations, we... Common Sense Reasoning
HellaSwag is a new benchmark for commonsense NLI. Common Sense Reasoning
Social IQa is the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple choice questions for probing emotional and social intelligence... Common Sense Reasoning
WinoGrande is a new collection of 44k problems, inspired by the Winograd Schema Challenge (Levesque, Davis, and Morgenstern 2011), but adjusted to improve the scale and robustness against the... Common Sense Reasoning
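COPA and the other multiple-choice sets above share a similar premise-plus-alternatives layout. A minimal sketch, assuming the Hugging Face `datasets` library (not part of this catalog), that loads COPA and prints one example:

```python
# Minimal sketch: inspect COPA's premise/alternatives layout with Hugging Face `datasets`.
from datasets import load_dataset

copa = load_dataset("super_glue", "copa", split="validation")
example = copa[0]
print(example["premise"])
print("choice1:", example["choice1"])
print("choice2:", example["choice2"])
print("question type:", example["question"])  # "cause" or "effect"
print("gold alternative:", example["label"])  # 0 -> choice1, 1 -> choice2
```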
Dependency parsing on Universal Dependencies English EWT. UD English EWT is a Gold Standard Universal Dependencies Corpus for English, built over the source material of the English Web Treebank... Dependency Parsing
Dependency relation classification on Universal Dependencies English EWT. UD English EWT is a Gold Standard Universal Dependencies Corpus for English, built over the source material of the English... Dependency Relation Classification
Arabic Dialect Detection classifies text into Modern Standard Arabic (MSA), Egyptian, Maghrebi (northwest Africa), Gulf, and Levantine. Dialect Detection
Grammarly's Yahoo Answers Formality Corpus (GYAFC) is the largest dataset for formality classification, containing a total of 110K informal / formal sentence pairs. This is a... Formality Classification
The First Certificate in English (FCE) dataset in sequence labeling format, converted by Rei and Yannakoudakis (2016), where each token in a sentence is labelled either as correct or as incorrect.... Grammatical Error Detection
The model is trained to learn the structure of poems in the English language. Language Modeling
The Corpus of Linguistic Acceptability (CoLA) in its full form consists of 10,657 sentences from 23 linguistics publications, expertly annotated for acceptability (grammaticality) by their original... Linguistic Acceptability
The Multi-Sentence Reading Comprehension dataset (MultiRC, Khashabi et al., 2018) is a true/false question-answering task. Each example consists of a context paragraph, a question about that... Machine Reading Comprehension
RACE is a large-scale reading comprehension dataset with more than 28,000 passages and nearly 100,000 questions. The dataset is collected from English examinations in China, which are designed for... Machine Reading Comprehension
ReCoRD is a reading comprehension task requiring commonsense reasoning. The context paragraphs and queries are generated from curated CNN/Daily Mail news articles. The task itself is formulated... Machine Reading Comprehension
The English-Romanian translation dataset from the shared task of the First Conference on Machine Translation (WMT16). Machine Translation
Enhances phrase-level cross-lingual entity alignment in language models; suitable for knowledge graph tasks. Multilingual Knowledge Integration
Enhances sentence-level cross-lingual entity alignment in language models; suitable for language modeling tasks. Multilingual Knowledge Integration
Enhances phrase-level factual triple knowledge in language models; suitable for knowledge graph tasks. Multilingual Knowledge Integration
Enhances sentence-level factual triple knowledge in language models; suitable for language modeling tasks. Multilingual Knowledge Integration
The CoNLL 2003 shared task is language-independent named entity recognition. The dataset contains English and German training, development, and test sets. The English data is from the Reuters corpus and... Named Entity Recognition
The MIT Movie Corpus contains movie-related queries tagged in BIO format. We use the larger trivia10k13 corpus, which contains more complex examples. Named Entity Recognition
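A rough sketch of how a pre-trained CoNLL-2003 NER adapter could be applied with the `adapters` package (the adapter identifier below is an assumption; check AdapterHub for the exact name):

```python
# Rough sketch: tag one sentence with a CoNLL-2003 NER adapter.
import torch
from transformers import AutoTokenizer
from adapters import AutoAdapterModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
ner = model.load_adapter("AdapterHub/bert-base-uncased-pf-conll2003")  # assumed identifier
model.set_active_adapters(ner)

inputs = tokenizer("John Smith works for Reuters in London .", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits   # shape: (1, sequence_length, num_labels)
print(logits.argmax(-1))              # per-token label indices in the BIO scheme
```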
The CommitmentBank is a corpus of naturally occurring discourses whose final sentence contains a clause-embedding predicate under an entailment canceling operator. Natural Language Inference
The Adversarial Natural Language Inference (ANLI) dataset is a new large-scale NLI benchmark. The dataset is collected via an... Natural Language Inference
Multi-Genre Natural Language Inference is a large-scale, crowdsourced entailment classification task. Given a pair of sentences, the goal is to predict whether the second sentence is an... Natural Language Inference
Question Natural Language Inference is a version of SQuAD which has been converted to a binary classification task. The positive examples are (question, sentence) pairs which do contain... Natural Language Inference
Recognizing Textual Entailment is a binary entailment task similar to MNLI, but with much less training data. Natural Language Inference
The SciTail dataset is an entailment dataset created from multiple-choice science exams and web sentences. Each question and the correct answer choice are converted into an assertive statement to... Natural Language Inference
The SICK relatedness (SICK-R) task trains a linear model to output a score from 1 to 5 indicating the relatedness of two sentences. The same dataset (SICK-E) can be treated as a three-class... Natural Language Inference
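All of the NLI tasks above reduce to classifying a premise/hypothesis pair. A rough sketch using an MNLI adapter with the `adapters` package (the adapter identifier is an assumption; check AdapterHub for the exact name):

```python
# Rough sketch: three-way NLI prediction with an MNLI adapter.
import torch
from transformers import AutoTokenizer
from adapters import AutoAdapterModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
nli = model.load_adapter("AdapterHub/bert-base-uncased-pf-mnli")  # assumed identifier
model.set_active_adapters(nli)

pair = tokenizer(
    "A soccer game with multiple males playing.",  # premise
    "Some men are playing a sport.",               # hypothesis
    return_tensors="pt",
)
with torch.no_grad():
    probs = model(**pair).logits.softmax(-1)       # three NLI classes
print(probs)
```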
POS-tagging using the part-of-speech tags provided by the English subset of the CoNLL 2003 shared task dataset for NER. Part-Of-Speech Tagging
English Web Treebank was developed by the Linguistic Data Consortium (LDC) with funding through a gift from Google Inc. It consists of over 250,000 words of English weblogs, newsgroups, email,... Part-Of-Speech Tagging
POS-tagging on Universal Dependencies English EWT. UD English EWT is a Gold Standard Universal Dependencies Corpus for English, built over the source material of the English Web Treebank... Part-Of-Speech Tagging
Text chunking was the shared task for CoNLL-2000. This data consists of the same partitions of the Wall Street Journal corpus (WSJ) as the widely used data for noun phrase chunking: sections 15-18... Phrase Chunking
The CoNLL-2003 shared task on language-independent named entity recognition is a corpus consisting of Reuters news stories published between August 1996 and August 1997. Each line of the corpus has... Phrase Chunking
The WMT21 shared task on quality estimation. Training language pairs: high-resource English--German (En-De) and English--Chinese (En-Zh), medium-resource Russian--English (Ru-En), Romanian--English... Quality Estimation
BoolQ is a reading comprehension dataset of naturally occurring yes/no questions, which turn out to be unexpectedly challenging. Question Answering
ComplexQuestions is a QA dataset originally created for querying knowledge bases. The questions are obtained from Google web queries and don't provide supporting context paragraphs in the original... Question Answering
NarrativeQA is a dataset of stories and corresponding questions designed to test reading comprehension. Question Answering
Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is... Question Answering
SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0,... Question Answering
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs. DROP is a crowdsourced,... Question Answering
WikiHop is open-domain and based on Wikipedia articles; the goal is to recover Wikidata information by hopping through documents. Question Answering
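SQuAD-style extractive QA predicts an answer span inside the context. A rough sketch with a SQuAD adapter and the `adapters` package (the adapter identifier and the example text are assumptions):

```python
# Rough sketch: extract an answer span with a SQuAD-style QA adapter.
import torch
from transformers import AutoTokenizer
from adapters import AutoAdapterModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
qa = model.load_adapter("AdapterHub/bert-base-uncased-pf-squad")  # assumed identifier
model.set_active_adapters(qa)

question = "Who wrote the questions?"
context = "SQuAD consists of questions posed by crowdworkers on a set of Wikipedia articles."
inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
print(tokenizer.decode(inputs["input_ids"][0][start:end + 1]))  # predicted answer span
```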
The Parallel Meaning Bank (PMB), developed at the University of Groningen and building upon the Groningen Meaning Bank, comprises sentences and texts in raw and tokenised format, syntactic... Semantic Tagging
Microsoft Research Paraphrase Corpus consists of sentence pairs automatically extracted from online news sources, with human annotations for whether the sentences in the pair are semantically equivalent. Semantic Textual Similarity
Quora Question Pairs is a binary classification task where the goal is to determine if two questions asked on Quora are semantically equivalent. Semantic Textual Similarity
StackExchange QA similarity determines whether two questions or a question-answer pair in StackExchange forums are related or not (e.g., to find duplicate questions or relevant answers). Semantic Textual Similarity
The Semantic Textual Similarity Benchmark is a collection of sentence pairs drawn from news headlines and other sources. They were annotated with a score from 1 to 5 denoting how similar the two... Semantic Textual Similarity
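A minimal sketch, assuming the Hugging Face `datasets` library, that loads the STS Benchmark via GLUE and prints one scored pair:

```python
# Minimal sketch: inspect one STS Benchmark pair and its graded similarity score.
from datasets import load_dataset

stsb = load_dataset("glue", "stsb", split="validation")
pair = stsb[0]
print(pair["sentence1"])
print(pair["sentence2"])
print("similarity score:", pair["label"])  # graded score; higher means more similar
```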
This dataset was released as part of SemEval-2020 Task 9 on Sentiment Analysis in Code-Mixed Social Media (Twitter) text. It tags positive, neutral and negative sentiment. There are 17,000... Sentiment Analysis
The IMDB dataset contains 50K movie reviews for natural language processing or text analytics. This is a dataset for binary sentiment classification containing substantially more data than previous... Sentiment Analysis
Movie Review Dataset. This is a dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. This data was first used in Bo Pang and Lillian Lee,... Sentiment Analysis
The Stanford Sentiment Treebank is a binary single-sentence classification task consisting of sentences extracted from movie reviews with human annotations of their sentiment. Sentiment Analysis
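Single-sentence sentiment classification follows the same pattern as the pair tasks above, just without a second text segment. A rough sketch with an SST-2 adapter (the adapter identifier is an assumption):

```python
# Rough sketch: binary sentiment prediction with an SST-2 adapter.
import torch
from transformers import AutoTokenizer
from adapters import AutoAdapterModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
sst = model.load_adapter("AdapterHub/bert-base-uncased-pf-sst2")  # assumed identifier
model.set_active_adapters(sst)

inputs = tokenizer("A warm, funny, engaging film.", return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(-1)   # two sentiment classes
print(probs)
```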
CNN/DailyMail summarization dataset. Summarization
Extreme Summarization (XSum) Dataset. Summarization
NER on Wikipedia Documents. Wiki Ann NER
The Word-in-Context dataset is a word sense disambiguation task in the form of a binary classification task. Each dataset example consists of two sentences and a polysemous word that appears in... Word Sense Disambiguation
Add your task to AdapterHub; it's super awesome!
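A minimal sketch of how a new task adapter could be set up, trained, and saved with the `adapters` package; the task name is a placeholder and the training loop is omitted:

```python
# Minimal sketch: add, train, and save a new task adapter for your own dataset.
from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("my_task")                            # placeholder name for the new adapter
model.add_classification_head("my_task", num_labels=2)  # task-specific prediction head
model.train_adapter("my_task")                          # freeze the base model, train only the adapter

# ... train with adapters.AdapterTrainer or a regular PyTorch loop ...

model.save_adapter("./my_task_adapter", "my_task")      # saved weights can be shared on AdapterHub
```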