Channel Pruning for Accelerating Very Deep Neural Networks


ICCV 2017, by Yihui He, Xiangyu Zhang and Jian Sun

Please have a look at our newer work on compressing deep models:

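AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV'18), which combines channel pruning and reinforcement learning to further accelerate CNNs. Code and models are available!
AddressNet: Shift-Based Primitives for Efficient Convolutional Neural Networks (WACV'19). We propose a family of efficient networks based on shift operations.
MoBiNet: A Mobile Binary Network for Image Classification (WACV'20). Binarized MobileNets.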

In this repository, we release code for the following models:

Model	Speed-up	Accuracy
https://github.com/yihui-he/channel-pruning/releases/tag/channel_pruning_5x	5x	88.1 (Top-5), 67.8 (Top-1)
https://github.com/yihui-he/channel-pruning/releases/tag/VGG-16_3C4x	4x	89.9 (Top-5), 70.6 (Top-1)
https://github.com/yihui-he/channel-pruning/releases/tag/ResNet-50-2X	2x	90.8 (Top-5), 72.3 (Top-1)
https://github.com/yihui-he/channel-pruning/releases/tag/faster-RCNN-2X4X	2x	36.7 (AP@.50:.05:.95)
https://github.com/yihui-he/channel-pruning/releases/tag/faster-RCNN-2X4X	4x	35.1 (AP@.50:.05:.95)

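The 3C method combines spatial decomposition (Speeding up Convolutional Neural Networks with Low Rank Expansions) and channel decomposition (Accelerating Very Deep Convolutional Networks for Classification and Detection), as mentioned in Section 4.1.2.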

Citation

If you find the code useful in your research, please consider citing:

@InProceedings{He_2017_ICCV,
author = {He, Yihui and Zhang, Xiangyu and Sun, Jian},
title = {Channel Pruning for Accelerating Very Deep Neural Networks},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}

Requirements

Python3 packages you might not have: scipy, sklearn, easydict. Install them with sudo pip3 install.
For finetuning with a batch size of 128: 4 GPUs (~11G of memory)
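
To see which of the packages above are missing before running anything, a small check like the following may help. It is only an illustration: the helper name `missing_packages` is hypothetical and not part of this repository, and it assumes the names listed above are the import names.

```python
import importlib.util

# Import names taken from the requirements above. The pip distribution name
# can differ from the import name (for example, sklearn is provided by
# the scikit-learn distribution).
REQUIRED = ["scipy", "sklearn", "easydict"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Running the script prints either the missing names or a confirmation, so you only install what is actually absent.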

Installation (sufficient for the demo)

Clone the repository

# Make sure to clone with --recursive
 git clone --recursive https://github.com/yihui-he/channel-pruning.git

Build my Caffe fork (which supports bicubic interpolation and resizing the image's shorter side to 256 before cropping to 224x224)

cd caffe

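 # If you are familiar with Caffe and have all the requirements installed, simply run:
 make all -j8 && make pycaffe
 # Or follow the Caffe installation instructions:
 # http://caffe.berkeleyvision.org/installation.html

 # If you already had Caffe installed before, you might need to add pycaffe to PYTHONPATH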
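
If Python picks up a different Caffe installation, one option is to put this fork's Python bindings first on the module search path at the top of your script. The sketch below assumes the submodule was built in ./caffe and that pycaffe lives in its python/ subdirectory, as in stock Caffe; the helper name `prepend_pycaffe` is hypothetical.

```python
import os
import sys


def prepend_pycaffe(caffe_root="caffe"):
    """Put <caffe_root>/python at the front of sys.path and return that path."""
    pycaffe_dir = os.path.abspath(os.path.join(caffe_root, "python"))
    # Remove any existing entry so the directory is not listed twice.
    if pycaffe_dir in sys.path:
        sys.path.remove(pycaffe_dir)
    sys.path.insert(0, pycaffe_dir)
    return pycaffe_dir


if __name__ == "__main__":
    print("pycaffe directory:", prepend_pycaffe())
    # After this call, `import caffe` should resolve to the fork built above,
    # provided `make pycaffe` succeeded.
```

Setting PYTHONPATH in the shell achieves the same thing for every script; the in-script variant only affects the current process.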

Download the ImageNet classification dataset http://www.image-net.org/download-images
Specify the imagenet source path in temp/vgg.prototxt (lines 12 and 36)
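
Editing the two source entries by hand works fine. If you prefer to script it, something like the following could do the substitution. This is a sketch under assumptions: it presumes each ImageData layer stores its list file in a `source: "..."` field (the usual Caffe convention), the function name `set_sources` and the example paths are made up, and the replacements must be given in the order the fields appear in the file.

```python
import re

# Matches a whole-word `source` key followed by a quoted value.
SOURCE_FIELD = re.compile(r'(\bsource\s*:\s*)"[^"]*"')


def set_sources(prototxt_text, new_sources):
    """Replace successive source: "..." fields with the given paths, in order."""
    remaining = list(new_sources)

    def substitute(match):
        if not remaining:
            return match.group(0)  # no replacement left: keep the field as-is
        return '{}"{}"'.format(match.group(1), remaining.pop(0))

    return SOURCE_FIELD.sub(substitute, prototxt_text)


if __name__ == "__main__":
    example = 'image_data_param {\n  source: "/old/train.txt"\n}\n'
    print(set_sources(example, ["/data/imagenet/train.txt"]))
```

To apply it, read temp/vgg.prototxt into a string, call the function with your two list files, and write the result back after checking it.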

Channel Pruning

For fast testing, you can directly download the pruned models; see the pruned-models download section below.

Download the original VGG-16 model http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel and move it to temp/vgg.caffemodel (or create a softlink instead)

Start channel pruning

python3 train.py -action c3 -caffe [GPU0]
# or use ./run.sh python3 train.py -action c3 -caffe [GPU0] for logging
# replace [GPU0] with an actual GPU device such as 0, 1 or 2

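Combine some factorized layers for further compression, and calculate the acceleration ratio. Replace the ImageData layer in temp/cb_3c_3C4x_mem_bn_vgg.prototxt with the corresponding part of temp/vgg.prototxt, then run:

./combine.sh | xargs ./calflop.sh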
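
For intuition about what an acceleration ratio measures, the sketch below counts multiply-accumulates for convolution layers before and after pruning and takes their quotient. It is not the logic of calflop.sh, which is not shown here; the function names and the layer shapes in the example are illustrative assumptions.

```python
def conv_macs(c_in, c_out, kernel, out_h, out_w):
    """Multiply-accumulate count of one dense (non-grouped) convolution layer."""
    return c_in * c_out * kernel * kernel * out_h * out_w


def acceleration_ratio(original_layers, pruned_layers):
    """Ratio of total MACs before pruning to total MACs after pruning."""
    before = sum(conv_macs(*layer) for layer in original_layers)
    after = sum(conv_macs(*layer) for layer in pruned_layers)
    return before / after


if __name__ == "__main__":
    # Hypothetical two-layer example: (c_in, c_out, kernel, out_h, out_w).
    original = [(64, 128, 3, 112, 112), (128, 128, 3, 112, 112)]
    # Halving the channels between the two layers shrinks the first layer's
    # outputs and the second layer's inputs.
    pruned = [(64, 64, 3, 112, 112), (64, 128, 3, 112, 112)]
    print("acceleration ratio: {:.2f}x".format(acceleration_ratio(original, pruned)))
```

Note that removing channels from one feature map reduces the cost of both the layer that produces it and the layer that consumes it, which is why the example changes two tuples.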

Finetuning

caffe train -solver temp/solver.prototxt -weights temp/cb_3c_vgg.caffemodel -gpu [GPU0,GPU1,GPU2,GPU3]
# replace [GPU0,GPU1,GPU2,GPU3] with actual GPU devices such as 0,1,2,3
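
In Caffe's multi-GPU training, each GPU processes the batch size given in the network definition, so the effective batch size is that value times the number of GPUs. The sketch below works out the per-GPU value and formats the -gpu argument. It assumes the 128 in the requirements refers to the effective batch size; the helper names are hypothetical, and whether the provided prototxt is already configured this way is not stated here.

```python
def per_gpu_batch_size(effective_batch, gpu_ids):
    """Batch size per GPU so that all GPUs together process
    `effective_batch` images per iteration."""
    n = len(gpu_ids)
    if n == 0:
        raise ValueError("at least one GPU is required")
    if effective_batch % n != 0:
        raise ValueError("effective batch size must divide evenly across GPUs")
    return effective_batch // n


def gpu_flag(gpu_ids):
    """Format GPU indices as a comma-separated list for the -gpu option."""
    return ",".join(str(g) for g in gpu_ids)


if __name__ == "__main__":
    gpus = [0, 1, 2, 3]
    print("-gpu", gpu_flag(gpus))
    print("per-GPU batch size:", per_gpu_batch_size(128, gpus))
```

If you train on fewer GPUs, the same arithmetic tells you how much larger the per-GPU batch would have to be to keep the effective size unchanged, memory permitting.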

Testing

Although testing is done during finetuning, you can test at any time with:

caffe test -model path/to/prototxt -weights path/to/caffemodel -iterations 5000 -gpu [GPU0]
# replace [GPU0] with an actual GPU device such as 0, 1 or 2
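
The -iterations value has to cover the whole validation set: Caffe evaluates one batch per test iteration, so the count is the number of images divided by the test batch size. The sketch assumes the 50,000-image ILSVRC 2012 validation set and a test batch size of 10, which would give the 5000 used above; check the batch size in your own prototxt. The function name is hypothetical.

```python
import math


def iterations_needed(num_images, batch_size):
    """Number of test iterations needed to see every image at least once."""
    if batch_size <= 0:
        raise ValueError("batch size must be positive")
    return math.ceil(num_images / batch_size)


if __name__ == "__main__":
    # Assumed values: 50,000 validation images, test batch size 10.
    print(iterations_needed(50000, 10))
```

If the batch size does not divide the number of images evenly, the last iteration wraps around and re-reads some images, which slightly skews the reported accuracy.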

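Pruned models (for download)

For fast testing, you can directly download pruned models from the release page: VGG-16 3C 4X, VGG-16 5X, ResNet-50 2X. Alternatively, use the Baidu drive download link.

Test with: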

caffe test -model channel_pruning_VGG-16_3C4x.prototxt -weights channel_pruning_VGG-16_3C4x.caffemodel -iterations 5000 -gpu [GPU0]
# replace [GPU0] with an actual GPU device such as 0, 1 or 2