deberta-v3-large-zeroshot-v1

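Introduction to the deberta-v3-large-zeroshot-v1 project

Model description

deberta-v3-large-zeroshot-v1 is a model for zero-shot classification, designed for use with the Hugging Face pipeline. It is reported to perform better than other existing zero-shot models, in particular the other models published on the Hugging Face Hub. At its core, it handles one universal task: given a text, decide whether a hypothesis is "true" or "not true". In natural language inference (NLI) this is called "entailment" versus "not entailment". Because this task format is so general, any classification task can be reformulated into it.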
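
As a rough illustration of how a classification task can be recast in this format, the sketch below pairs one text with one hypothesis per candidate label. The helper name `build_nli_pairs` is hypothetical and not part of any library. The template "This example is {}." is the default `hypothesis_template` of the Hugging Face zero-shot pipeline. The model would then judge each pair as entailment or not entailment.

```python
from typing import List, Tuple


def build_nli_pairs(
    text: str,
    candidate_labels: List[str],
    hypothesis_template: str = "This example is {}.",
) -> List[Tuple[str, str]]:
    """Hypothetical helper: turn one text and N labels into N (premise, hypothesis) pairs."""
    return [(text, hypothesis_template.format(label)) for label in candidate_labels]


pairs = build_nli_pairs(
    "Angela Merkel is a politician in Germany and leader of the CDU",
    ["politics", "economy"],
)
for premise, hypothesis in pairs:
    # Each pair is one entailment / not-entailment question for the model.
    print(f"premise={premise!r} | hypothesis={hypothesis!r}")
```

A different template, such as "This text is about {}.", can be passed in when it reads more naturally for the labels at hand.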

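Training data

The model was trained on 27 tasks and 310 classes, all reformatted into the universal task format described above. The data comprises:

26 classification tasks with about 400,000 texts, for example 'amazonpolarity', 'imdb', 'appreviews', 'yelpreviews', and 'rottentomatoes'.
Five NLI datasets with about 885,000 texts in total: "MNLI", "ANLI", "fever", "wanli", and "ling".

Note that, unlike other NLI models, this model predicts only two classes ("entailment" and "not entailment") rather than the usual three (entailment/neutral/contradiction).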
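
To make the two-class output more concrete, here is a minimal sketch of how per-label entailment logits could be turned into label scores. It follows the general approach of zero-shot pipelines as an assumption and does not reproduce the library's code. The logit values are made up for illustration, and the position of the entailment class (`entailment_idx=0`) is also an assumption that should be checked against the model configuration.

```python
import math
from typing import List


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def single_label_scores(logits_per_label: List[List[float]], entailment_idx: int = 0) -> List[float]:
    """Labels compete: softmax over the entailment logits of all labels."""
    return softmax([pair[entailment_idx] for pair in logits_per_label])


def multi_label_scores(logits_per_label: List[List[float]], entailment_idx: int = 0) -> List[float]:
    """Labels are independent: softmax within each (entailment, not entailment) pair."""
    return [softmax(pair)[entailment_idx] for pair in logits_per_label]


# Illustrative logits only, one [entailment, not_entailment] pair per candidate label.
labels = ["politics", "economy", "entertainment"]
logits = [[3.0, -2.0], [-1.0, 1.5], [-2.5, 2.0]]

single = single_label_scores(logits)
multi = multi_label_scores(logits)
for label, s, m in zip(labels, single, multi):
    print(f"{label}: single-label={s:.3f}, multi-label={m:.3f}")
```

In the single-label case the scores sum to 1 across labels. In the multi-label case each score is an independent probability, so several labels can score high at once.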

How to use the model

Simple zero-shot classification example

The following code runs zero-shot classification with the model:

from transformers import pipeline

# Load the zero-shot classification pipeline with this model
classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-large-zeroshot-v1")
sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
# multi_label=False makes the labels compete, so the scores sum to 1
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
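
The pipeline returns a dictionary with the keys "sequence", "labels", and "scores". A small post-processing step, sketched below, picks the best label and rejects low-confidence predictions. The helper `top_label` and its `threshold` parameter are hypothetical, and the scores in `example_output` are placeholder values, not real model output.

```python
from typing import Optional


def top_label(output: dict, threshold: float = 0.5) -> Optional[str]:
    """Hypothetical helper: return the highest-scoring label, or None if below threshold."""
    labels, scores = output["labels"], output["scores"]
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    if scores[best_idx] < threshold:
        return None
    return labels[best_idx]


# Placeholder dictionary shaped like the pipeline output; the scores are invented.
example_output = {
    "sequence": "Angela Merkel is a politician in Germany and leader of the CDU",
    "labels": ["politics", "economy", "entertainment", "environment"],
    "scores": [0.90, 0.05, 0.03, 0.02],
}

print(top_label(example_output))
print(top_label(example_output, threshold=0.95))
```

Returning None for weak predictions is one possible design choice. A suitable threshold depends on the task and is best chosen on a small labeled sample.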

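Data and training details

For transparency and reproducibility, the data preparation, training, and evaluation code for this model is fully open source and available on GitHub.

Limitations and bias

The model is capable, but it can only be applied to text classification tasks. It may also carry biases. To understand where these may come from, consult the original DeBERTa paper and the literature on the training datasets.

License

The DeBERTa-v3 base model is released under the MIT license. The datasets used for fine-tuning come under their own, differing licenses. Details are listed in the accompanying spreadsheet.

Citation

If you use this model for academic research or project development, please cite it as follows: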

@article{laurer_less_2023,
	title = {Less {Annotating}, {More} {Classifying}: {Addressing} the {Data} {Scarcity} {Issue} of {Supervised} {Machine} {Learning} with {Deep} {Transfer} {Learning} and {BERT}-{NLI}},
	issn = {1047-1987, 1476-4989},
	shorttitle = {Less {Annotating}, {More} {Classifying}},
	url = {https://www.cambridge.org/core/product/identifier/S1047198723000207/type/journal_article},
	doi = {10.1017/pan.2023.20},
	language = {en},
	urldate = {2023-06-20},
	journal = {Political Analysis},
	author = {Laurer, Moritz and Van Atteveldt, Wouter and Casas, Andreu and Welbers, Kasper},
	month = jun,
	year = {2023},
	pages = {1--33},
}