Description

Adapter AdapterHub/xmod-base-en_XX for AdapterHub/xmod-base

An adapter for the AdapterHub/xmod-base model that was trained on the en/cc100 dataset.

This adapter was created for usage with the Adapters library.

Usage

First, install adapters:

pip install -U adapters

Now, the adapter can be loaded and activated like this:

from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("AdapterHub/xmod-base")
adapter_name = model.load_adapter("AdapterHub/xmod-base-en_XX", source="hf", set_active=True)
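
Once the adapter is active, the model can be used to encode text. The following is a minimal sketch and not part of the original card. It assumes the XLM-R tokenizer (`xlm-roberta-base`), which the original X-MOD model card says the model reuses, and it assumes that with no prediction head attached the forward pass returns the base model's hidden states. The helper name `encode_sentence` is hypothetical.

```python
def encode_sentence(text: str):
    """Return the last hidden states for `text` using the English X-MOD adapter."""
    # Imported lazily so the helper can be defined before the heavy
    # dependencies (torch, transformers, adapters) are loaded.
    import torch
    from adapters import AutoAdapterModel
    from transformers import AutoTokenizer

    # Assumption: X-MOD reuses the XLM-R tokenizer.
    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

    # Same loading steps as in the usage snippet above.
    model = AutoAdapterModel.from_pretrained("AdapterHub/xmod-base")
    model.load_adapter("AdapterHub/xmod-base-en_XX", source="hf", set_active=True)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Assumption: with no prediction head, the first output is the last
    # hidden state, shaped (batch, sequence_length, hidden_size).
    return outputs[0]
```

Calling `encode_sentence("Hello, world!")` downloads the model and adapter weights on first use, so it needs network access and the `torch`, `transformers`, and `adapters` packages installed.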

Architecture & Training

This adapter was extracted from the original model checkpoint facebook/xmod-base to allow loading it independently via the Adapters library. For more information on architecture and training, please refer to the original model card.

Evaluation results

Citation

Lifting the Curse of Multilinguality by Pre-training Modular Transformers (Pfeiffer et al., 2022)

@inproceedings{pfeiffer-etal-2022-lifting,
    title = "Lifting the Curse of Multilinguality by Pre-training Modular Transformers",
    author = "Pfeiffer, Jonas  and
      Goyal, Naman  and
      Lin, Xi  and
      Li, Xian  and
      Cross, James  and
      Riedel, Sebastian  and
      Artetxe, Mikel",
    booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.naacl-main.255",
    doi = "10.18653/v1/2022.naacl-main.255",
    pages = "3479--3495"
}

Properties

Pre-trained model: AdapterHub/xmod-base
Adapter type: (not specified)
Prediction Head: No
Task: English
Dataset: en/cc100

Architecture

{
  "adapter_residual_before_ln": false,
  "cross_adapter": false,
  "factorized_phm_W": true,
  "factorized_phm_rule": false,
  "hypercomplex_nonlinearity": "glorot-uniform",
  "init_weights": "bert",
  "inv_adapter": null,
  "inv_adapter_reduction_factor": null,
  "is_parallel": false,
  "learn_phm": true,
  "leave_out": [],
  "ln_after": false,
  "ln_before": false,
  "mh_adapter": false,
  "non_linearity": "gelu",
  "original_ln_after": false,
  "original_ln_before": true,
  "output_adapter": true,
  "phm_bias": true,
  "phm_c_init": "normal",
  "phm_dim": 4,
  "phm_init_range": 0.0001,
  "phm_layer": false,
  "phm_rank": 1,
  "reduction_factor": 2,
  "residual_before_ln": false,
  "scaling": 1.0,
  "shared_W_phm": false,
  "shared_phm_rule": true,
  "use_gating": false
}
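
To make the configuration easier to read, the sketch below derives a few facts from the bottleneck-related fields: where the adapter sits in each layer and how wide its bottleneck is. It is an illustration only. The hidden size of 768 is an assumption (it is not stated in this card), the helper `summarize_adapter` is hypothetical, and the parameter count covers only one down-projection and one up-projection with biases.

```python
import json

# Subset of the adapter configuration listed above.
ADAPTER_CONFIG = json.loads("""
{
  "mh_adapter": false,
  "output_adapter": true,
  "is_parallel": false,
  "non_linearity": "gelu",
  "reduction_factor": 2
}
""")

# Assumption: hidden size of the base model; not given in this card.
ASSUMED_HIDDEN_SIZE = 768


def summarize_adapter(config: dict, hidden_size: int) -> dict:
    """Derive readable facts from a bottleneck adapter configuration."""
    positions = []
    if config["mh_adapter"]:
        positions.append("after multi-head attention")
    if config["output_adapter"]:
        positions.append("after feed-forward output")

    # reduction_factor is the ratio between hidden size and bottleneck size.
    bottleneck = int(hidden_size // config["reduction_factor"])

    # Down-projection (hidden -> bottleneck) and up-projection
    # (bottleneck -> hidden), each with a bias vector.
    params = (hidden_size * bottleneck + bottleneck) + (bottleneck * hidden_size + hidden_size)

    return {
        "positions": positions,
        "mode": "parallel" if config["is_parallel"] else "sequential",
        "non_linearity": config["non_linearity"],
        "bottleneck_size": bottleneck,
        "projection_params_per_adapter_layer": params,
    }


summary = summarize_adapter(ADAPTER_CONFIG, ASSUMED_HIDDEN_SIZE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under the assumed hidden size, a reduction factor of 2 gives a bottleneck of 384 units, placed sequentially after the feed-forward output of each layer.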
