I’ve loaded CIFAR-10 from the datasets (that worked very nicely, thank you!)
I want to use an untrained ResNet on the data.
PL has a pretrained ResNet-50 component, but I would like to re-run a double descent experiment using the ResNet-18 from this paper:
- Nakkiran, Preetum, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, and Ilya Sutskever. ‘Deep Double Descent: Where Bigger Models and More Data Hurt’. arXiv:1912.02292 [cs, stat], 4 December 2019. http://arxiv.org/abs/1912.02292.
(so that I can see double descent clearly in PL. I think I have produced it with another model, but it’s nice to reproduce known results sometimes.)
In particular, according to the description of the ResNets in Appendix B (B.1 Models):
> ResNets. We define a family of ResNet18s of increasing size as follows. We follow the Preactivation ResNet18 architecture of He et al. (2016), using 4 ResNet blocks, each consisting of two BatchNorm-ReLU-Convolution layers. The layer widths for the 4 blocks are [k, 2k, 4k, 8k] for varying k ∈ N and the strides are [1, 2, 2, 2]. The standard ResNet18 corresponds to k = 64 convolutional channels in the first layer. The scaling of model size with k is shown in Figure 13b. Our implementation is adapted from https://github.com/kuangliu/pytorch-cifar.
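To make the target concrete, here’s a minimal sketch of what I understand that family to be, adapted from kuangliu’s PreActResNet. The class names and the `make_resnet18k` helper are my own, and I’ve followed the kuangliu implementation in going straight from the last block to pooling without a final BatchNorm-ReLU, so treat this as a starting point rather than the paper’s exact code:

```python
import torch.nn as nn
import torch.nn.functional as F


class PreActBlock(nn.Module):
    """Pre-activation residual block: two BatchNorm-ReLU-Conv layers."""

    def __init__(self, in_planes, planes, stride=1):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3,
                               stride=1, padding=1, bias=False)
        # 1x1 conv on the shortcut only when the shape changes
        self.shortcut = None
        if stride != 1 or in_planes != planes:
            self.shortcut = nn.Conv2d(in_planes, planes, kernel_size=1,
                                      stride=stride, bias=False)

    def forward(self, x):
        out = F.relu(self.bn1(x))
        shortcut = self.shortcut(out) if self.shortcut is not None else x
        out = self.conv1(out)
        out = self.conv2(F.relu(self.bn2(out)))
        return out + shortcut


class PreActResNet18(nn.Module):
    """4 stages of widths [k, 2k, 4k, 8k] with strides [1, 2, 2, 2],
    two blocks per stage, as described in Appendix B.1."""

    def __init__(self, k=64, num_classes=10):
        super().__init__()
        widths = [k, 2 * k, 4 * k, 8 * k]
        strides = [1, 2, 2, 2]
        self.conv1 = nn.Conv2d(3, k, kernel_size=3,
                               stride=1, padding=1, bias=False)
        blocks = []
        in_planes = k
        for w, s in zip(widths, strides):
            # only the first block of each stage may downsample
            blocks.append(PreActBlock(in_planes, w, stride=s))
            blocks.append(PreActBlock(w, w, stride=1))
            in_planes = w
        self.blocks = nn.Sequential(*blocks)
        self.linear = nn.Linear(8 * k, num_classes)

    def forward(self, x):
        out = self.blocks(self.conv1(x))
        out = F.adaptive_avg_pool2d(out, 1).flatten(1)
        return self.linear(out)


def make_resnet18k(k, num_classes=10):
    """k = 64 recovers the standard ResNet-18; smaller k gives narrower models."""
    return PreActResNet18(k=k, num_classes=num_classes)
```

Sweeping k from 1 up to 64 should then trace out the model-size axis of the double descent curve, with k = 64 recovering the standard ResNet-18.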
All input gratefully received - pure PL components or custom.
I think this is the original paper where ResNet-18 was introduced, but the image below only shows the 34-layer version (obviously):
- He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. ‘Deep Residual Learning for Image Recognition’. arXiv:1512.03385 [cs], 10 December 2015. http://arxiv.org/abs/1512.03385.