| | 1st place model | 2nd place model | 3rd place model | 4th place model | Matsuoka et al. (2018) |
---|---|---|---|---|---|
CNN architecture | PyramidNet | WideResNet | Shake-shake ResNet26 | MobileNetV2 | LeNet |
Number of parameters | 7.6 M | 4.3 M | 3.0 M | 11.6 M | 0.60 M |
Preprocessing | Binarization | – | Downsampling (32 × 32) | Upsampling (96 × 96) | – |
Oversampling (Data augmentation) | Vertical flip, Horizontal flip, Cutout, Random shift | Cropping, Random rotation, Random erasing | Random crop + Padding, Horizontal flip | Random rotation, Mixup | – |
Undersampling | Hard negative mining | Random sampling | Random sampling | Random sampling | Random sampling |
Ensemble learning | 5 models (Different hard negative ratio) | – | – | 5 models (Different learning rate/preprocessing) | 10 models (Different negative samples) |
TTA | Same as training phase | 5 Crop × 4 Rotation | 10 Crop + Padding | – | – |
Others | – | Focal loss | RReLU | – | – |
Time for training | 10 days | 1 day | 4 days | 5 days | 15 h |
Time for test | 10 h | 1 h | 1 h | 5 h | 4 h |
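
The "Others" row lists focal loss for the 2nd place model. As a rough illustration of what that loss does, here is a minimal sketch of the standard binary focal loss, which down-weights examples the classifier already gets right. The function names and the values `gamma=2.0` and `alpha=0.25` are assumptions for illustration (they are the commonly cited defaults for focal loss); the table does not state which settings the 2nd place model used.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy for one sample.

    p: predicted probability of the positive class, y: label (0 or 1).
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p if y == 1 else 1.0 - p)


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one sample (illustrative defaults, not from the table)."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of confidently correct samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently rejected negative contributes far less than a misclassified one.
easy_negative = focal_loss(0.05, 0)
hard_negative = focal_loss(0.60, 0)
```

With `gamma=0` the modulating factor disappears and the expression reduces to cross-entropy scaled by `alpha_t`, which makes the sketch easy to sanity-check against a plain cross-entropy implementation.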