AdapterHub - ukp/facebook-bart-large_sum_cnn_dailymail

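# Assumption: the `adapters` package is installed; adapter-transformers exposes the same classes via `transformers.adapters`.
from adapters import AutoAdapterModel, AdapterConfig

model = AutoAdapterModel.from_pretrained("facebook/bart-large")
config = AdapterConfig.load("houlsby", non_linearity="swish", reduction_factor=2)
model.load_adapter("sum/cnn_dailymail@ukp", config=config)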

Description

Adapter for bart-large using the Houlsby architecture, trained on the CNN/DailyMail dataset for 10 epochs with early stopping and a learning rate of 1e-4.

Properties

Pre-trained model

facebook/bart-large

Adapter type

Task

Prediction Head

Yes

Task

Summarization

Dataset

CNN/DailyMail

Architecture

Name

houlsby

Non-linearity

swish

Reduction factor

2

Configuration

{
  "ln_after": false,
  "ln_before": false,
  "mh_adapter": true,
  "output_adapter": true,
  "adapter_residual_before_ln": false,
  "non_linearity": "swish",
  "original_ln_after": true,
  "original_ln_before": false,
  "reduction_factor": 2,
  "residual_before_ln": true
}
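
To make the reduction factor concrete, the following minimal sketch derives the adapter's bottleneck width and the parameter count of a single bottleneck module from the configuration above. It assumes that facebook/bart-large has a hidden size of 1024 and that each adapter module is one down-projection plus one up-projection with biases. The helper names (`bottleneck_size`, `bottleneck_params`, `modules_per_layer`) are hypothetical and not part of any library.

```python
import json

# Subset of the configuration shown above (only the fields used here).
ADAPTER_CONFIG = json.loads("""
{
  "mh_adapter": true,
  "output_adapter": true,
  "non_linearity": "swish",
  "reduction_factor": 2
}
""")

# Assumption: facebook/bart-large uses a hidden size (d_model) of 1024.
HIDDEN_SIZE = 1024


def bottleneck_size(hidden_size: int, reduction_factor: int) -> int:
    """Bottleneck width: hidden size divided by the reduction factor."""
    return hidden_size // reduction_factor


def bottleneck_params(hidden_size: int, bottleneck: int) -> int:
    """Weights and biases of one down-projection plus one up-projection."""
    down = hidden_size * bottleneck + bottleneck
    up = bottleneck * hidden_size + hidden_size
    return down + up


def modules_per_layer(config: dict) -> int:
    """Count the adapter positions enabled in each transformer layer."""
    return int(config["mh_adapter"]) + int(config["output_adapter"])


bottleneck = bottleneck_size(HIDDEN_SIZE, ADAPTER_CONFIG["reduction_factor"])
print(bottleneck)                                  # 512
print(bottleneck_params(HIDDEN_SIZE, bottleneck))  # 1050112
print(modules_per_layer(ADAPTER_CONFIG))           # 2
```

With both `mh_adapter` and `output_adapter` enabled, each layer carries two such modules (one after the attention block, one after the feed-forward block), which is what distinguishes the Houlsby layout from single-adapter variants.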

Author

Name

Clifton Poth

E-Mail

calpt@mail.de

Web

https://adapterhub.ml

GitHub

calpt

Twitter

@clifapt

Versions

Identifier	Comment	Score
1 (default)		20.51

Citations

Architecture

@misc{houlsby2019parameterefficient,
  title={Parameter-Efficient Transfer Learning for NLP},
  author={Neil Houlsby and Andrei Giurgiu and Stanislaw Jastrzebski and Bruna Morrone and Quentin de Laroussilhe and Andrea Gesmundo and Mona Attariyan and Sylvain Gelly},
  year={2019},
  eprint={1902.00751},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

Task

@inproceedings{Hermann2015TeachingMT,
  title={Teaching Machines to Read and Comprehend},
  author={K. Hermann and Tom{\'a}s Kocisk{\'y} and Edward Grefenstette and Lasse Espeholt and W. Kay and Mustafa Suleyman and P. Blunsom},
  booktitle={NIPS},
  year={2015}
}
