nli-deberta-v3-large

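nli-deberta-v3-large Project Overview

Background

nli-deberta-v3-large is a cross-encoder model for natural language inference (NLI). It is based on Microsoft's deberta-v3-large and was trained with the Cross-Encoder class from SentenceTransformers. NLI is the task of determining how two sentences relate to each other; the relation is one of contradiction, entailment, or neutral.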

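Training Data

The model was trained on two large datasets, SNLI and MultiNLI. Both consist of sentence pairs annotated with the relation between them. For each input sentence pair, the model outputs three scores, one for each label: contradiction, entailment, and neutral.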
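To make the link between the three scores and the three labels concrete, here is a minimal sketch in plain Python. It assumes the label order used in the usage examples of this document (contradiction, entailment, neutral). The logits are made-up values for illustration, not real model output, and the helper names `softmax` and `decode` are hypothetical.

```python
import math

# Label order as used in the usage examples of this document.
LABELS = ["contradiction", "entailment", "neutral"]

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical logits for one sentence pair (illustrative only).
example_logits = [-2.0, 4.0, 0.5]
label, prob = decode(example_logits)
print(label, round(prob, 3))
```

Taking the argmax alone is enough to pick a label; the softmax step is only needed when a probability-like confidence value is wanted.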

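Performance

The model reaches the following accuracy on the evaluation sets:

Accuracy on the SNLI test set: 92.20%.
Accuracy on the MNLI mismatched set: 90.49%.

With accuracy above 90% on both sets, the model performs strongly on standard natural language inference benchmarks.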
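For reference, accuracy here means the share of sentence pairs whose predicted label matches the annotated one. The sketch below shows that calculation on a tiny made-up sample. The `accuracy` helper and the sample labels are hypothetical, and this is not the evaluation script used to produce the figures above.

```python
def accuracy(predicted, gold):
    """Percentage of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)

# Made-up labels for four sentence pairs (illustrative only).
predicted = ["entailment", "neutral", "contradiction", "neutral"]
gold = ["entailment", "neutral", "contradiction", "entailment"]
print(f"{accuracy(predicted, gold):.2f}%")
```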

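Usage

The pre-trained model can be used in two ways:

Using the SentenceTransformers library

The SentenceTransformers library makes it easy to load and use the model. The following example predicts the relation for each of the given sentence pairs: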

from sentence_transformers import CrossEncoder
model = CrossEncoder('cross-encoder/nli-deberta-v3-large')
scores = model.predict([('A man is eating pizza', 'A man eats something'), ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.')])

# Convert scores to labels
label_mapping = ['contradiction', 'entailment', 'neutral']
labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]

Using the Transformers library

The model can also be used directly with the Transformers library, without SentenceTransformers:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/nli-deberta-v3-large')
tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-v3-large')

features = tokenizer(['A man is eating pizza', 'A black race car starts up in front of a crowd of people.'], ['A man eats something', 'A man is driving down a lonely road.'], padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    scores = model(**features).logits
    label_mapping = ['contradiction', 'entailment', 'neutral']
    labels = [label_mapping[score_max] for score_max in scores.argmax(dim=1)]
    print(labels)
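Note that the two examples take their input in different shapes: `CrossEncoder.predict` receives a list of (premise, hypothesis) tuples, while the tokenizer call above receives one list of first sentences and one list of second sentences. The following sketch converts between the two. The `split_pairs` helper is a hypothetical name introduced here for illustration.

```python
# Pairs in the (premise, hypothesis) form used with CrossEncoder.predict.
pairs = [
    ("A man is eating pizza", "A man eats something"),
    ("A black race car starts up in front of a crowd of people.",
     "A man is driving down a lonely road."),
]

def split_pairs(pairs):
    """Turn [(premise, hypothesis), ...] into two parallel lists."""
    if not pairs:
        return [], []
    premises, hypotheses = zip(*pairs)
    return list(premises), list(hypotheses)

premises, hypotheses = split_pairs(pairs)
print(premises)
print(hypotheses)
```

The two resulting lists can then be passed as the first two positional arguments of the tokenizer, as in the example above.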

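Zero-Shot Classification

The model can also be used for zero-shot classification, that is, classifying text into categories it was not specifically trained on. The following is an example: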

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-deberta-v3-large')

sent = "Apple just announced the newest iPhone X"
candidate_labels = ["technology", "sports", "politics"]
res = classifier(sent, candidate_labels)
print(res)
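As a rough picture of what the pipeline does with an NLI model, the sketch below turns each candidate label into a hypothesis using the pipeline's default template, "This example is {}.", and then normalizes the entailment scores across labels, which is how the single-label mode behaves. The helpers `build_pairs` and `rank_labels` are hypothetical names, and the entailment logits are made-up values standing in for real model output.

```python
import math

# Default hypothesis template of the Transformers zero-shot pipeline.
HYPOTHESIS_TEMPLATE = "This example is {}."

def build_pairs(text, candidate_labels, template=HYPOTHESIS_TEMPLATE):
    """Pair the input text (premise) with one hypothesis per candidate label."""
    return [(text, template.format(label)) for label in candidate_labels]

def rank_labels(candidate_labels, entailment_logits):
    """Softmax the entailment logits across labels and sort by score, highest first."""
    peak = max(entailment_logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in entailment_logits]
    total = sum(exps)
    scored = [(label, e / total) for label, e in zip(candidate_labels, exps)]
    return sorted(scored, key=lambda item: item[1], reverse=True)

labels = ["technology", "sports", "politics"]
pairs = build_pairs("Apple just announced the newest iPhone X", labels)

# Hypothetical entailment logits, one per pair; a real run would take them from the model.
ranked = rank_labels(labels, [3.2, -1.5, -0.7])
print(pairs[0][1])
print(ranked[0][0])
```

A different template can be supplied to the real pipeline through its `hypothesis_template` argument when the default wording does not suit the labels.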