Mean and std in Normalize

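I ran into a problem during my experiments. The image preprocessing pipeline includes a normalize step, which normalizes each image with a mean and a std, and I found that the mean and std I had computed myself were wrong. (I had known about this for a while but was too lazy to fix it, since it takes time.) I did not pay attention to it at first, but now that the model has hit a bottleneck, I started wondering whether these two parameters affect the model's final performance, so I googled it. Here is what I found: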

Pretrained models

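Training a model from scratch is still hard, so transfer learning remains an important way to speed things up. In PyTorch, the models in torchvision's models module are mostly pretrained on ImageNet. To reuse the existing weights, replace the final fc or classifier layer with a Linear layer sized for your task.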

That raises the question of how to choose the mean and std when using pretrained models. After some googling, I found an answer from an expert that made it all click:

It’s the same means and stds the model was trained with on ImageNet. For most of the models from torchvision, according to pretrained_models_pytorch:

means = [0.485, 0.456, 0.406]
stds = [0.229, 0.224, 0.225]

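If we use a pretrained model, its weights were learned on inputs normalized with the ImageNet statistics (mean and std). To make the best use of those weights, the input should follow the same distribution, so in this case we should use the ImageNet statistics, which should help transfer learning more.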

Models from scratch

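If you train a model from scratch, I think the mean and std should still be computed from the data of the current task. That said, the same expert notes that these values can be chosen fairly arbitrarily: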

This only matters if you are using a pretrained model. Also, means and stds of the dataset are calculated from the raw images without data augmentation as far as I know. And the values can be chosen quite arbitrarily as you can see from inceptionv3.

The quote also mentions how the mean and std are computed: on the raw images, before any preprocessing or data augmentation.
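A rough sketch of how such per-channel statistics could be computed, assuming the raw images are available as H x W x 3 uint8 arrays. The function name `channel_mean_std` and the random stand-in images are my own illustration, not something from the linked issue. Pixels are scaled to [0, 1] first so the result is comparable to the ImageNet values above.

```python
import numpy as np

def channel_mean_std(images):
    """Per-channel mean and std over all pixels of H x W x 3 uint8 images."""
    total = np.zeros(3, dtype=np.float64)     # running sum per channel
    total_sq = np.zeros(3, dtype=np.float64)  # running sum of squares per channel
    n_pixels = 0
    for img in images:
        # Flatten to (num_pixels, 3) and scale to [0, 1].
        pixels = img.reshape(-1, 3).astype(np.float64) / 255.0
        total += pixels.sum(axis=0)
        total_sq += (pixels ** 2).sum(axis=0)
        n_pixels += pixels.shape[0]
    mean = total / n_pixels
    # Population std via E[x^2] - E[x]^2, accumulated across all images.
    std = np.sqrt(total_sq / n_pixels - mean ** 2)
    return mean, std

# Random stand-ins for raw, un-augmented images of different sizes.
rng = np.random.default_rng(0)
fake_images = [
    rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8)
    for h, w in [(32, 32), (48, 40), (20, 64)]
]
mean, std = channel_mean_std(fake_images)
```

Accumulating sums this way avoids loading the whole dataset into memory at once, and weights every pixel equally even when image sizes differ.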

Reference

https://github.com/DeepVoltaire/AutoAugment/issues/4