ImageNet
This is not a complete article: This is a draft, a work in progress that is intended to be published into an article, which may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.
Compute Canada provides a copy of the ImageNet dataset on the Graham cluster, stored in the /datashare space. For the time being, the dataset is available only on Graham, and you must opt in to access it by confirming that you have registered for an ImageNet license:
By selecting this service you acknowledge that you have registered with the owner of the data (at http://image-net.org/download) and have agreed to ImageNet's terms of use (https://image-net.org/download.php).
This dataset is provided as is and will be updated only when image-net.org publishes new releases. If you need data from challenges other than the ones provided, please contact our technical support with the subject "ImageNet dataset".
Request access through the opt-in service
Please visit this registration page to request access by acknowledging that you have registered with the ImageNet providers and that you agree with their terms and conditions.
Available versions
The ImageNet directory in /datashare contains several versions of the ImageNet dataset:
- Full dataset (ImageNet-21k): the Fall 2011 and Winter 2021 releases of the full dataset can be found in fall11_whole and winter21_whole respectively. The full dataset contains 14,197,122 images divided into 21,841 classes.
- ImageNet Large Scale Visual Recognition Challenge (ILSVRC): the 2012 and 2017 versions can be found in ILSVRC2012 and ILSVRC2017 respectively. The training set contains 1,281,167 images, with between 732 and 1,300 images for each of the 1,000 classes (synsets). The validation set contains 50,000 images (50 per synset), and the test set contains 100,000 images. The ILSVRC datasets are the most commonly used versions of ImageNet.
- Tiny ImageNet: a reduced dataset of 100,000 images in 200 classes, downsampled to 64×64 color images. It can be found in the directory tiny-imagenet-200.
- Downsampled: downsampled versions of ImageNet are also provided in /datashare/ImageNet/DownSampled, at 8x8, 16x16, 32x32 and 64x64 resolution. The numbers of training images, synsets, validation images and test images are unchanged from the original ILSVRC datasets.
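As a starting point for exploring the downsampled versions, the sketch below lists the arrays stored in an .npz archive without loading any image data. It relies only on the fact that an .npz file is a zip archive with one .npy member per array. That the DownSampled directories actually hold *.npz files, and the specific directory used in the example, are assumptions, not something this page confirms; `list_npz_arrays` is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def list_npz_arrays(npz_path):
    """Return {array_name: uncompressed_size_in_bytes} for one .npz archive.

    An .npz file is a zip archive holding one .npy member per array, so the
    array names can be listed without reading any array data.
    """
    with zipfile.ZipFile(npz_path) as archive:
        return {
            Path(info.filename).stem: info.file_size
            for info in archive.infolist()
        }


if __name__ == "__main__":
    # Assumed location and file extension; adjust after inspecting the directory.
    root = Path("/datashare/ImageNet/DownSampled/Imagenet32_val_npz")
    if root.is_dir():
        for npz_file in sorted(root.glob("*.npz")):
            print(npz_file.name, list_npz_arrays(npz_file))
```

Once the array names are known, the data itself can be loaded with `numpy.load`, which returns the same names through its `files` attribute.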
Location and contents
The files are located in /datashare/ImageNet/, which contains:
├── DownSampled
│   ├── Imagenet16_train_npz
│   ├── Imagenet16_val_npz
│   ├── Imagenet32_train_npz
│   ├── Imagenet32_val_npz
│   ├── Imagenet64_train_part1_npz
│   ├── Imagenet64_train_part2_npz
│   ├── Imagenet64_val_npz
│   ├── Imagenet8_train_npz
│   └── Imagenet8_val_npz
├── fall11_whole
│   ├── n00004475
│   ├── n00005787
│   ├── n00006024
│   ├── n00006484
│   ├── n00007846
│   ├── n00015388
│   ├── n00017222
│   ├── n00021265
│   .
│   .
│   .
│   ├── n15102359
│   ├── n15102455
│   ├── n15102894
│   └── tars.tar.bz2
├── ILSVRC2012
│   ├── ILSVRC2012_devkit_t12
│   ├── ILSVRC2012_devkit_t3
│   ├── ILSVRC2012_img_test_patch_v10102019.tar
│   ├── ILSVRC2012_img_test_v10102019.tar
│   ├── ILSVRC2012_img_train_t3.tar
│   ├── ILSVRC2012_img_train.tar
│   ├── ILSVRC2012_img_val.tar
│   ├── ILSVRC2012.md5
│   ├── test
│   ├── train
│   ├── train_T3
│   └── validation
├── ILSVRC2017
│   ├── Annotations
│   ├── Data
│   ├── devkit
│   ├── ILSVRC
│   ├── ILSVRC2017_DET.tar.gz
│   ├── ILSVRC2017_DET_test_new.tar.gz
│   ├── ILSVRC2017_devkit.tar.gz
│   └── ImageSets
├── tiny-imagenet-200
│   ├── test
│   ├── train
│   ├── val
│   ├── wnids.txt
│   └── words.txt
└── winter21_whole
    ├── n00004475
    ├── n00007846
    ├── n00017222
    ├── n00288384
    ├── n00324978
    ├── n00433458
    ├── n00433661
    .
    .
    .
    ├── n15092751
    ├── n15102359
    ├── n15102894
    └── tars
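The per-class figures quoted for ILSVRC2012 (1,000 synsets, 732 to 1,300 training images each) can be checked against the directory tree with a short script. This is a minimal sketch that assumes ILSVRC2012/train holds one subdirectory per synset with the image files directly inside; the page does not confirm that layout, and `count_images_per_synset` is a hypothetical helper name.

```python
from pathlib import Path


def count_images_per_synset(split_dir):
    """Map each synset directory name (e.g. 'n00004475') to its file count.

    Only immediate subdirectories of split_dir are treated as synsets;
    loose files at the top level are ignored.
    """
    counts = {}
    for synset_dir in sorted(Path(split_dir).iterdir()):
        if synset_dir.is_dir():
            counts[synset_dir.name] = sum(
                1 for entry in synset_dir.iterdir() if entry.is_file()
            )
    return counts


if __name__ == "__main__":
    # Assumed layout: one subdirectory per synset under train/.
    train_dir = Path("/datashare/ImageNet/ILSVRC2012/train")
    if train_dir.is_dir():
        counts = count_images_per_synset(train_dir)
        if counts:
            print(len(counts), "synsets,", sum(counts.values()), "files")
            print("min/max per synset:", min(counts.values()), max(counts.values()))
```

Listing more than a million files over a network mount can take a while, so it is better to run this once and save the result than to repeat it in every job.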
This is an NFS3 mount!!!
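The ImageNet copy on Graham is served through an NFS3 mount, so you might have issues accessing the files if you belong to more than 16 groups in Compute Canada. This is most likely because NFS3 with its default AUTH_SYS authentication sends at most 16 group IDs with each request, so memberships beyond that number are not seen by the server.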
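To see whether this limit may affect you, count the groups attached to your session. The sketch below uses `os.getgroups()` from the Python standard library (Unix only); the threshold of 16 is taken from the text above, and `exceeds_group_limit` is a hypothetical helper name.

```python
import os

# Limit stated above for the NFS3 mount on Graham.
NFS_GROUP_LIMIT = 16


def exceeds_group_limit(group_ids, limit=NFS_GROUP_LIMIT):
    """Return True if the number of distinct group IDs is above the limit."""
    return len(set(group_ids)) > limit


if __name__ == "__main__":
    groups = os.getgroups()  # supplementary group IDs of the current process
    print(f"You belong to {len(set(groups))} groups.")
    if exceeds_group_limit(groups):
        print("This is more than 16: access to /datashare/ImageNet may fail.")
```

Running `id -G | wc -w` in a shell on Graham gives the same count.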