I don't think it's explicitly stated anywhere that the ImageNet example is supposed to be an exact reimplementation of the Krizhevsky 2012 architecture, but if it is, then the order of the local response normalization (LRN) and max pool layers in Caffe's implementation seems to be backwards.
Caffe's network uses conv -> max pool -> LRN:
https://github.com/BVLC/caffe/blob/master/examples/imagenet/imagenet_train.prototxt#L48
The text of the paper suggests that he used conv -> LRN -> max pool:
"Response-normalization layers follow the first and second convolutional layers. Max-pooling layers, of the kind described in Section 3.4, follow both response-normalization layers as well as the fifth convolutional layer."
Either ordering seems to get good results, but if you are reimplementing a paper that says it used Krizhevsky's architecture, it might be worthwhile to make sure your implementation matches his paper.
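
To make the difference concrete, here is a minimal NumPy sketch that applies both orderings to the same random activations and compares the outputs. It is only an illustration, not Caffe or cuda-convnet code. The function names `lrn` and `max_pool` are hypothetical. The hyperparameters (k = 2, n = 5, alpha = 1e-4, beta = 0.75, 3x3 pooling with stride 2) are assumptions based on my reading of the paper; Caffe's LRN layer may scale alpha differently.

```python
import numpy as np

def lrn(x, size=5, alpha=1e-4, beta=0.75, k=2.0):
    """Cross-channel local response normalization on a (C, H, W) array.

    Assumed form: b_i = a_i / (k + alpha * sum_j a_j^2) ** beta,
    where j runs over up to `size` neighbouring channels centred on i.
    """
    channels = x.shape[0]
    squared = x ** 2
    half = size // 2
    out = np.empty_like(x)
    for i in range(channels):
        lo, hi = max(0, i - half), min(channels, i + half + 1)
        scale = k + alpha * squared[lo:hi].sum(axis=0)
        out[i] = x[i] / scale ** beta
    return out

def max_pool(x, kernel=3, stride=2):
    """Overlapping max pooling on a (C, H, W) array, no padding."""
    channels, height, width = x.shape
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    out = np.empty((channels, out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[:, i, j] = x[:, r:r + kernel, c:c + kernel].max(axis=(1, 2))
    return out

# Random non-negative activations standing in for a post-ReLU conv output.
rng = np.random.default_rng(0)
x = np.maximum(rng.normal(scale=50.0, size=(8, 9, 9)), 0.0)

pool_then_lrn = lrn(max_pool(x))   # ordering in the Caffe example
lrn_then_pool = max_pool(lrn(x))   # ordering described in the paper

max_abs_diff = float(np.abs(pool_then_lrn - lrn_then_pool).max())
print("shapes:", pool_then_lrn.shape, lrn_then_pool.shape)
print("max abs difference:", max_abs_diff)
```

The two orderings produce the same output shape but different values, so they are not interchangeable numerically, even if both train to similar accuracy.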