moco

MoCo: Momentum Contrast for Unsupervised Visual Representation Learning

This is a PyTorch implementation of the MoCo paper:

@Article{he2019moco,
  author  = {Kaiming He and Haoqi Fan and Yuxin Wu and Saining Xie and Ross Girshick},
  title   = {Momentum Contrast for Unsupervised Visual Representation Learning},
  journal = {arXiv preprint arXiv:1911.05722},
  year    = {2019},
}

It also includes an implementation of the MoCo v2 paper:

@Article{chen2020mocov2,
  author  = {Xinlei Chen and Haoqi Fan and Ross Girshick and Kaiming He},
  title   = {Improved Baselines with Momentum Contrastive Learning},
  journal = {arXiv preprint arXiv:2003.04297},
  year    = {2020},
}

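Preparation

Install PyTorch and the ImageNet dataset following the official PyTorch ImageNet training code.

This repo aims to make minimal modifications to that code. Check the modifications with: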

diff main_moco.py <(curl https://raw.githubusercontent.com/pytorch/examples/master/imagenet/main.py)
diff main_lincls.py <(curl https://raw.githubusercontent.com/pytorch/examples/master/imagenet/main.py)

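Unsupervised Training

This implementation only supports multi-GPU, DistributedDataParallel training, which is faster and simpler; single-GPU or DataParallel training is not supported.

To do unsupervised pre-training of a ResNet-50 model on ImageNet on an 8-GPU machine, run: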

python main_moco.py \
  -a resnet50 \
  --lr 0.03 \
  --batch-size 256 \
  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 \
  [your imagenet-folder with train and val folders]

This script uses all the default hyper-parameters as described in the MoCo v1 paper. To run MoCo v2, set --mlp --moco-t 0.2 --aug-plus --cos.
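As a convenience, the MoCo v2 flags can be appended to the pre-training command shown above. The sketch below only assembles that command line as a string and does not launch training. The helper name `moco_pretrain_command` and the placeholder dataset path are hypothetical and are not part of this repo.

```python
import shlex

# Flags that switch main_moco.py from the MoCo v1 defaults to MoCo v2,
# as listed in this README.
MOCO_V2_FLAGS = ["--mlp", "--moco-t", "0.2", "--aug-plus", "--cos"]


def moco_pretrain_command(data_dir, v2=False):
    """Return the 8-GPU pre-training command from this README as one string.

    Illustrative helper only; it is not part of the repo.
    """
    args = [
        "python", "main_moco.py",
        "-a", "resnet50",
        "--lr", "0.03",
        "--batch-size", "256",
    ]
    if v2:
        args += MOCO_V2_FLAGS
    args += [
        "--dist-url", "tcp://localhost:10001",
        "--multiprocessing-distributed",
        "--world-size", "1",
        "--rank", "0",
        data_dir,  # placeholder: ImageNet folder with train and val folders
    ]
    return shlex.join(args)  # shlex.join requires Python 3.8+


print(moco_pretrain_command("/path/to/imagenet", v2=True))
```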

Note: for 4-GPU training, we recommend following the linear lr scaling recipe: --lr 0.015 --batch-size 128 with 4 GPUs. We got similar results using this setting.
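The linear scaling recipe can be written out as a small calculation. This is a minimal sketch that takes the 8-GPU setting from this README (--lr 0.03, --batch-size 256) as the baseline; the function name `scaled_settings` is hypothetical. Only the 4-GPU setting is reported here, so values for other GPU counts are an extrapolation and have not been validated.

```python
BASE_LR = 0.03          # learning rate used in this README for 8 GPUs
BASE_BATCH_SIZE = 256   # total batch size used in this README for 8 GPUs
BASE_GPUS = 8


def scaled_settings(num_gpus):
    """Scale the total batch size and the lr linearly with the GPU count."""
    batch_size = BASE_BATCH_SIZE * num_gpus // BASE_GPUS
    lr = BASE_LR * batch_size / BASE_BATCH_SIZE
    return lr, batch_size


lr, batch_size = scaled_settings(4)
print(f"--lr {lr:g} --batch-size {batch_size}")  # --lr 0.015 --batch-size 128
```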

Linear Classification

With a pre-trained model, to train a supervised linear classifier on frozen features/weights on an 8-GPU machine, run:

python main_lincls.py \
  -a resnet50 \
  --lr 30.0 \
  --batch-size 256 \
  --pretrained [your checkpoint path]/checkpoint_0199.pth.tar \
  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 \
  [your imagenet-folder with train and val folders]

Linear classification results on ImageNet using this repo with 8 NVIDIA V100 GPUs:

<table><tbody>   <th valign="bottom"></th> <th valign="bottom">pre-train<br/>epochs</th> <th valign="bottom">pre-train<br/>time</th> <th valign="bottom">MoCo v1<br/>top-1 acc. (%)</th> <th valign="bottom">MoCo v2<br/>top-1 acc. (%)</th>  <tr><td align="left">ResNet-50</td> <td align="center">200</td> <td align="center">53 hours</td> <td align="center">60.8±0.2</td> <td align="center">67.5±0.1</td> </tr> </tbody></table>

Here we run 5 trials (of pre-training and linear classification) and report mean±std: the 5 results of MoCo v1 are {60.6, 60.6, 60.7, 60.9, 61.1}, and of MoCo v2 are {67.7, 67.6, 67.4, 67.6, 67.3}.
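The mean±std figures in the table can be recomputed from the five trial results listed above. The sketch below assumes the population standard deviation (`statistics.pstdev`); the README does not say which estimator was used, but this choice reproduces both reported values after rounding to one decimal place. With the sample standard deviation, the MoCo v2 spread would round to 0.2 instead.

```python
import statistics

# Top-1 accuracies of the 5 trials, copied from this README.
trials = {
    "MoCo v1": [60.6, 60.6, 60.7, 60.9, 61.1],
    "MoCo v2": [67.7, 67.6, 67.4, 67.6, 67.3],
}


def summarize(accs):
    """Format mean±std, using the population std (an assumption)."""
    mean = statistics.mean(accs)
    std = statistics.pstdev(accs)
    return f"{mean:.1f}±{std:.1f}"


for name, accs in trials.items():
    print(name, summarize(accs))
# MoCo v1 60.8±0.2
# MoCo v2 67.5±0.1
```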