Pre-trained model:
Adapter for distilbert-base-uncased using the Houlsby architecture, trained on the QQP dataset for 15 epochs with early stopping and a learning rate of 1e-4.
Adapter for distilbert-base-uncased using the Pfeiffer architecture, trained on the QQP dataset for 15 epochs with early stopping and a learning rate of 1e-4.
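
Both entries mention 15 epochs with early stopping at a learning rate of 1e-4. The sketch below illustrates what such a schedule might look like; it is not the authors' training script. The patience value and the `evaluate_epoch` callback are hypothetical, since the description does not say which stopping criterion or validation metric was used.

```python
from typing import Callable, Tuple

MAX_EPOCHS = 15        # taken from the adapter description
LEARNING_RATE = 1e-4   # taken from the adapter description; would be passed to the optimizer
PATIENCE = 3           # assumption: the description does not state a patience value


def train_with_early_stopping(
    evaluate_epoch: Callable[[int], float],
    max_epochs: int = MAX_EPOCHS,
    patience: int = PATIENCE,
) -> Tuple[int, float]:
    """Run up to max_epochs and stop once the validation score has not
    improved for `patience` consecutive epochs.

    `evaluate_epoch` is a hypothetical callback that trains for one epoch
    and returns a validation score (higher is better).
    Returns (best_epoch, best_score).
    """
    best_epoch, best_score = 0, float("-inf")
    for epoch in range(1, max_epochs + 1):
        score = evaluate_epoch(epoch)
        if score > best_score:
            # New best: remember it and keep training.
            best_epoch, best_score = epoch, score
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop early.
            break
    return best_epoch, best_score
```

With early stopping, 15 acts as an upper bound on the number of epochs; a run may finish earlier if the validation score plateaus.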