
Fine-Tuning the GoogLeNet Model

Updated: 2023-12-02 21:47:58


Assuming you are trying to do image classification, these are the steps for finetuning a model:


1. Replacing the classification layer

The original classification layer, "loss3/classifier", outputs predictions for 1000 classes (its num_output is set to 1000). You'll need to replace it with a new layer with the appropriate num_output (a prototxt sketch follows the list below). To replace the classification layer:

  1. Change the layer's name (so that when you read the original weights from the caffemodel file, there is no conflict with the weights of this layer).
  2. Change num_output to the correct number of output classes you are trying to predict.
  3. Note that you need to change ALL classification layers. Usually there is only one, but GoogLeNet happens to have three: "loss1/classifier", "loss2/classifier" and "loss3/classifier".
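
As a minimal sketch (not from the original post), this is roughly what the replaced layer could look like in train_val.prototxt. The layer name "loss3/classifier_ft" and num_output: 20 are hypothetical placeholders; keep the bottom blob exactly as it appears in your copy of the BVLC GoogLeNet prototxt:

layer {
  name: "loss3/classifier_ft"   # renamed so the original 1000-class weights are not loaded
  type: "InnerProduct"
  bottom: "pool5/drop_7x7_s1"   # same bottom as the original "loss3/classifier"
  top: "loss3/classifier_ft"
  inner_product_param {
    num_output: 20              # hypothetical: set to your number of classes
  }
}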


2. Data

You need to make a new training dataset with the new labels you want to finetune to. See, for example, this post on how to make an lmdb dataset.
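
To connect the new lmdb to the network, the TRAIN-phase Data layer in train_val.prototxt would point at it. A minimal sketch, with a placeholder path and batch size (the crop size and mean values follow the usual BVLC GoogLeNet settings):

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true
    crop_size: 224                     # GoogLeNet expects 224x224 crops
    mean_value: 104                    # BGR channel means, as in the BVLC reference model
    mean_value: 117
    mean_value: 123
  }
  data_param {
    source: "/path/to/new_train_lmdb" # placeholder: your new lmdb with the new labels
    batch_size: 32                    # placeholder: adjust to your GPU memory
    backend: LMDB
  }
}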


3. How extensive a finetuning you want?

When finetuning a model, you can train ALL the model's weights, or choose to fix some weights (usually the filters of the lower/deeper layers) and train only the weights of the top-most layers. This choice is up to you, and it usually depends on the amount of training data available (the more examples you have, the more weights you can afford to finetune).
Each layer that holds trainable parameters has param { lr_mult: XX }. This coefficient determines how susceptible these weights are to SGD updates. Setting param { lr_mult: 0 } means you FIX the weights of this layer and they will not be changed during the training process.
Edit your train_val.prototxt accordingly.
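
For instance, to freeze the first convolution layer, the param blocks might look like this (the layer definition matches the BVLC GoogLeNet prototxt; the lr_mult values are what you would edit):

layer {
  name: "conv1/7x7_s2"
  type: "Convolution"
  bottom: "data"
  top: "conv1/7x7_s2"
  param { lr_mult: 0 }   # lr_mult: 0 fixes the filter weights
  param { lr_mult: 0 }   # lr_mult: 0 fixes the biases
  convolution_param {
    num_output: 64
    kernel_size: 7
    pad: 3
    stride: 2
  }
}

A layer you do want to train keeps non-zero multipliers, typically param { lr_mult: 1 } for the weights and param { lr_mult: 2 } for the biases.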


4. Run caffe

Run caffe train, but supply it with the caffemodel weights as initial weights:

~$ $CAFFE_ROOT/build/tools/caffe train -solver /path/to/solver.prototxt -weights /path/to/orig_googlenet_weights.caffemodel