Advances-in-Label-Noise-Learning

Latest research advances and practical techniques in label-noise learning

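This project comprehensively summarizes the latest research in label-noise learning, including papers, code, software tools, competitions, and tutorials. It covers active topics such as group-distributional robustness and label distribution shift, and provides real-world noisy datasets and a simulation framework. For researchers and engineers working on label-noise learning, it is a valuable knowledge base.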

Tags: machine learning, noisy labels, datasets, data cleaning, deep learning, GitHub, open-source project

Learning-with-Noisy-Labels

A curated list of most recent papers & codes in Learning with Noisy Labels

Some recent works on group-distributional robustness and label distribution shift are also included.

Public Software

Docta-AI: An advanced data-centric AI platform that detects and rectifies issues in any data format (i.e., label error detection). [Website]

Competition

1st Learning and Mining with Noisy Labels Challenge (IJCAI 2022)[Website]

Tutorial

A Hands-on Tutorial for Learning with Noisy Labels (IJCAI 2023)[Website][GitHub]

Benchmarks & Leaderboard

Real-world noisy-label benchmarks:

| Dataset | Leaderboard Link | Website | Paper |
| --- | --- | --- | --- |
| CIFAR-10N | [Leaderboard] | [Website] | [Paper] |
| CIFAR-100N | [Leaderboard] | [Website] | [Paper] |
| Red Stanford Cars | N/A | [Website] | [Paper] |
| Red Mini-ImageNet | N/A | [Website] | [Paper] |
| Animal-10N | [Leaderboard] | [Website] | [Paper] |
| Food-101N | N/A | [Website] | [Paper] |
| Clothing1M | [Leaderboard] | [Website] | [Paper] |

Simulation of label noise: An Instance-Dependent Simulation Framework for Learning with Label Noise. [Paper]
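For readers new to simulated label noise, the following is a minimal sketch of the simplest setting: class-dependent (symmetric) noise injected through a noise transition matrix. It is not the instance-dependent framework from the paper above. The function names `symmetric_transition_matrix` and `inject_label_noise`, the 10-class setup, and the 0.4 noise rate are all illustrative assumptions.

```python
import random


def symmetric_transition_matrix(num_classes, noise_rate):
    """Build T where T[i][j] = P(noisy label = j | clean label = i).

    Assumption: symmetric noise, i.e. the flipped mass is spread
    uniformly over the other classes.
    """
    off_diag = noise_rate / (num_classes - 1)
    return [
        [1.0 - noise_rate if i == j else off_diag for j in range(num_classes)]
        for i in range(num_classes)
    ]


def inject_label_noise(clean_labels, transition, seed=0):
    """Sample one noisy label per clean label from the matching row of T."""
    rng = random.Random(seed)  # local RNG so results are reproducible
    classes = list(range(len(transition)))
    return [
        rng.choices(classes, weights=transition[y], k=1)[0]
        for y in clean_labels
    ]


if __name__ == "__main__":
    # Hypothetical clean labels: 10 balanced classes, 10,000 samples.
    clean = [i % 10 for i in range(10000)]
    T = symmetric_transition_matrix(num_classes=10, noise_rate=0.4)
    noisy = inject_label_noise(clean, T, seed=0)
    flipped = sum(c != n for c, n in zip(clean, noisy)) / len(clean)
    print(f"empirical noise rate: {flipped:.3f}")
```

Because each row of the matrix depends only on the clean class, this kind of synthetic noise ignores the input features. Real-world benchmarks such as those in the table above, and instance-dependent simulators, aim to capture noise that does depend on the individual example.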

This repo focuses on papers published after 2019; for earlier works, please refer to https://github.com/subeeshvasu/Awesome-Learning-with-Label-Noise.

Papers & Code in 2023


KDD 2023

  • [UCSC REAL Lab] To Aggregate or Not? Learning with Separate Noisy Labels. [Paper]
  • DyGen: Learning from Noisy Labels via Dynamics-Enhanced Generative Modeling. [Paper][Code]
  • Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction. [Paper]
  • Neural-Hidden-CRF: A Robust Weakly-Supervised Sequence Labeler. [Paper][Code]
  • Complementary Classifier Induced Partial Label Learning. [Paper][Code]
  • Partial-label Learning with Mixed Closed-Set and Open-Set Out-of-Candidate Examples. [Paper]
  • Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers. [Paper][Code]

NeurIPS 2023

  • The Pursuit of Human Labeling: A New Perspective on Unsupervised Learning. [Paper][Code]
  • AQuA: A Benchmarking Tool for Label Quality Assessment. [Paper]
  • Efficient Testable Learning of Halfspaces with Adversarial Label Noise. [Paper]
  • Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data. [Paper][Code]
  • Robust Data Pruning under Label Noise via Maximizing Re-labeling Accuracy. [Paper]
  • Subclass-Dominant Label Noise: A Counterexample for the Success of Early Stopping. [Paper][Code]
  • Label Correction of Crowdsourced Noisy Annotations with an Instance-Dependent Noise Transition Model. [Paper]
  • Scale-teaching: Robust Multi-scale Training for Time Series Classification with Noisy Labels. [Paper][Code]
  • SoTTA: Robust Test-Time Adaptation on Noisy Data Streams. [Paper][Code]
  • Active Negative Loss Functions for Learning with Noisy Labels. [Paper][Code]
  • Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels. [Paper][Code]
  • Training shallow ReLU networks on noisy data using hinge loss: when do we overfit and is it benign? [Paper]
  • CSOT: Curriculum and Structure-Aware Optimal Transport for Learning with Noisy Labels. [Paper][Code]
  • Deep Insights into Noisy Pseudo Labeling on Graph Data. [Paper]
  • ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections. [Paper][Code]
  • ALIM: Adjusting Label Importance Mechanism for Noisy Partial Label Learning. [Paper][Code]
  • Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo Labeling and Multi-scale Feature Grouping. [Paper][Code]
  • Label Poisoning is All You Need. [Paper][Code]
  • SLaM: Student-Label Mixing for Distillation with Unlabeled Examples. [Paper]
  • IPMix: Label-Preserving Data Augmentation Method for Training Robust Classifiers. [Paper]
  • HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text. [Paper][Code]

ICML 2023

  • [UCSC REAL Lab] Identifiability of Label Noise Transition Matrix. [Paper]
  • Which is Better for Learning with Noisy Labels: The Semi-supervised Method or Modeling Label Noise? [Paper]
  • Mitigating Memorization of Noisy Labels by Clipping the Model Prediction. [Paper][Code]
  • CrossSplit: Mitigating Label Noise Memorization through Data Splitting. [Paper][Code]
  • Understanding Self-Distillation in the Presence of Label Noise. [Paper]
  • Random Classification Noise does not defeat All Convex Potential Boosters Irrespective of Model Choice. [Paper]
  • Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach. [Paper]
  • Delving into Noisy Label Detection with Clean Data. [Paper]
  • When does Privileged information Explain Away Label Noise? [Paper]
  • Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale From A New Perspective. [Paper][Code]
  • Promises and Pitfalls of Threshold-based Auto-labeling. [Paper]
  • Accelerating Exploration with Unlabeled Prior Data. [Paper]

CVPR 2023

  • Twin Contrastive Learning with Noisy Labels. [Paper][Code]
  • Exploring High-Quality Pseudo Masks for Weakly Supervised Instance Segmentation. [Paper][Code]
  • HandsOff: Labeled Dataset Generation with No Additional Human Annotations. [Paper][Code]
  • Learning from Noisy Labels with Decoupled Meta Label Purifier. [Paper][Code]
  • DISC: Learning from Noisy Labels via Dynamic Instance-Specific Selection and Correction. [Paper][Code]
  • Leveraging Inter-Rater Agreement for Classification in the Presence of Noisy Labels. [Paper]
  • Fine-Grained Classification with Noisy Labels. [Paper]
  • Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies. [Paper][Code]
  • MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-supervised Object Detection. [Paper][Code]
  • OT-Filter: An Optimal Transport Filter for Learning With Noisy Labels. [Paper]
  • Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection. [Paper][Code]
  • Semi-Supervised 2D Human Pose Estimation Driven by Position Inconsistency Pseudo Label Correction Module. [Paper][Code]
  • Learning with Noisy labels via Self-supervised
