Description

Adapter for facebook/bart-large using the Houlsby architecture, trained on the XSum dataset for 10 epochs with early stopping and a learning rate of 1e-4.

Usage

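# Requires adapter-transformers (its transformers fork provides AdapterConfig and load_adapter)
from transformers import AdapterConfig, BartForConditionalGeneration
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
# Select the Houlsby configuration with the non-linearity and reduction factor used by this adapter
config = AdapterConfig.load("houlsby", non_linearity="swish", reduction_factor=2)
model.load_adapter("sum/xsum@ukp", config=config)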

Properties

Pre-trained model: facebook/bart-large
Adapter type:
Prediction head: Yes
Task: Summarization
Dataset: XSum

Architecture

Name: houlsby
Non-linearity: swish
Reduction factor: 2
{
    "ln_after": false,
    "ln_before": false,
    "mh_adapter": true,
    "output_adapter": true,
    "adapter_residual_before_ln": false,
    "non_linearity": "swish",
    "original_ln_after": true,
    "original_ln_before": false,
    "reduction_factor": 16,
    "residual_before_ln": true
}
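
To make the reduction factor concrete, here is a minimal sketch of how it maps to the bottleneck width and parameter count of a single adapter module. It assumes a hidden size of 1024 (the d_model of facebook/bart-large) and a plain down-projection, swish, up-projection bottleneck with biases and no extra layer norms, matching "ln_before": false and "ln_after": false in the configuration above. The helper names are hypothetical and not part of any library, and how many such modules are inserted per layer depends on the library implementation. Both the value 2 (from Usage and the property list) and the value 16 (from the JSON) are evaluated.

```python
import math

HIDDEN_SIZE = 1024  # assumption: d_model of facebook/bart-large


def bottleneck_size(hidden_size: int, reduction_factor: int) -> int:
    """Width of the adapter bottleneck: hidden size divided by the reduction factor."""
    return hidden_size // reduction_factor


def swish(x: float) -> float:
    """Swish non-linearity, x * sigmoid(x), applied between the two projections."""
    return x / (1.0 + math.exp(-x))


def adapter_param_count(hidden_size: int, reduction_factor: int) -> int:
    """Parameters of one bottleneck adapter module (weights plus biases)."""
    b = bottleneck_size(hidden_size, reduction_factor)
    down = hidden_size * b + b            # down-projection weight and bias
    up = b * hidden_size + hidden_size    # up-projection weight and bias
    return down + up


if __name__ == "__main__":
    for rf in (2, 16):
        print(rf, bottleneck_size(HIDDEN_SIZE, rf), adapter_param_count(HIDDEN_SIZE, rf))
```

Under these assumptions, a reduction factor of 2 gives a 512-dimensional bottleneck, while 16 gives a 64-dimensional one, so the smaller factor yields a considerably larger adapter.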

Author

Name: Clifton Poth

Versions

Identifier  Comment  Score
1           DEFAULT  20.9

Citations

Architecture
@misc{houlsby2019parameterefficient,
  title={Parameter-Efficient Transfer Learning for NLP},
  author={Neil Houlsby and Andrei Giurgiu and Stanislaw Jastrzebski and Bruna Morrone and Quentin de Laroussilhe and Andrea Gesmundo and Mona Attariyan and Sylvain Gelly},
  year={2019},
  eprint={1902.00751},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
Task
@InProceedings{xsum-emnlp,
  author =      "Shashi Narayan and Shay B. Cohen and Mirella Lapata",
  title =       "Don't Give Me the Details, Just the Summary! {T}opic-Aware Convolutional Neural Networks for Extreme Summarization",
  booktitle =   "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
  year =        "2018",
  address =     "Brussels, Belgium",
}