thingsvision - extracting features from deep neural nets.

thingsvision is a Python package that lets you easily extract image representations from many state-of-the-art neural networks for computer vision. In a nutshell, you point thingsvision at a directory of images and tell it which neural network you are interested in. thingsvision then gives you the representation of that network for each image, so you end up with one feature vector per image, which you can use for further analyses. For brevity, we use the term features when we mean “image representations”.
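For illustration, here is a minimal sketch of that workflow, modelled on the Getting Started examples (the chosen model, the extracted module, and the paths are placeholders; check the docs for the exact API of the version you have installed):

    import torch
    from thingsvision import get_extractor
    from thingsvision.utils.data import ImageDataset, DataLoader
    from thingsvision.utils.storing import save_features

    device = 'cuda' if torch.cuda.is_available() else 'cpu'

    # load a pretrained model, e.g. AlexNet from torchvision
    extractor = get_extractor(
        model_name='alexnet',
        source='torchvision',
        device=device,
        pretrained=True,
    )

    # wrap the image directory in a dataset and batch it
    dataset = ImageDataset(
        root='path/to/image/directory',
        out_path='path/to/output/directory',
        backend=extractor.get_backend(),
        transforms=extractor.get_transformations(),
    )
    batches = DataLoader(
        dataset=dataset,
        batch_size=32,
        backend=extractor.get_backend(),
    )

    # extract one (flattened) feature vector per image from a chosen module
    features = extractor.extract_features(
        batches=batches,
        module_name='features.10',
        flatten_acts=True,
    )
    save_features(features, out_path='path/to/output/directory', file_format='npy')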

To get started, please check out the Getting Started page, or try our Colab notebooks for PyTorch and TensorFlow.

Model collection

Neural networks come from different sources. With thingsvision, you can extract image representations from all models provided by:

  • torchvision
  • Keras
  • timm
  • ssl (self-supervised learning models)
    • simclr-rn50, mocov2-rn50, barlowtwins-rn50, pirl-rn50
    • jigsaw-rn50, rotnet-rn50, swav-rn50, vicreg-rn50
    • dino-rn50, dino-xcit-{small/medium}-{12/24}-p{8/16}
    • dino-vit-{tiny/small/base}-p{8/16}, dinov2-vit-{small/base/large/giant}-p14
    • mae-vit-{base/large}-p16, mae-vit-huge-p14
  • OpenCLIP models (CLIP trained on LAION-{400M/2B/5B})
  • CLIP models (CLIP trained on WiT)
  • a few custom models (Alexnet, VGG-16, Resnet50, and Inception_v3) trained on Ecoset rather than ImageNet, as well as one Alexnet model pretrained on ImageNet and fine-tuned on SalObjSub
  • each of the many CORnet versions
  • Harmonization models (see Harmonization repo). The default variant is ViT_B16. Other available models are ResNet50, VGG16, EfficientNetB0, tiny_ConvNeXT, tiny_MaxViT, and LeViT_small
  • DreamSim models (see DreamSim repo). The default variant is open_clip_vitb32. Other available models are clip_vitb32, dino_vitb16, and an ensemble. See the docs for more information

Note that you have to pass the respective model name as a string. For example, if you want to use VGG16 from torchvision, use vgg16 as the model name, whereas for VGG16 from TensorFlow/Keras, use the model name VGG16. You further have to specify the model source by setting the source parameter (e.g., timm, torchvision, keras), as shown in the sketch below.
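As a short sketch (assuming the get_extractor helper from the Getting Started page), the same architecture is requested differently depending on the source:

    from thingsvision import get_extractor

    # VGG16 from torchvision: lowercase model name, source='torchvision'
    torchvision_extractor = get_extractor(
        model_name='vgg16',
        source='torchvision',
        device='cpu',
        pretrained=True,
    )

    # VGG16 from TensorFlow/Keras: capitalized model name, source='keras'
    keras_extractor = get_extractor(
        model_name='VGG16',
        source='keras',
        device='cpu',
        pretrained=True,
    )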

We provide more details on the model collection in the Available Models page.

Using thingsvision with the THINGS dataset:

If you happen to use the THINGS image database, make sure to correctly unzip all zip files (sorted from A to Z), store all object directories in the parent directory ./images/ (e.g., ./images/object_xy/), and place the things_concepts.tsv file in the ./data/ folder. Running bash get_files.sh does the latter for you. The images, however, must be downloaded from the Main subfolder of the THINGS database; the download is around 5GB.

  • Go to https://osf.io/jum2f/files/
  • Select the Main folder and click the “Download as zip” button (top right).
  • Unzip each contained object_images_*.zip file using the password (check the description.txt file for details). For example:

       for fn in object_images_*.zip; do unzip -P the_password "$fn"; done
    

Citation

If you use this GitHub repository (or any modules associated with it), we would greatly appreciate it if you cited our paper as follows:

@article{Muttenthaler_2021,
	author = {Muttenthaler, Lukas and Hebart, Martin N.},
	title = {THINGSvision: A Python Toolbox for Streamlining the Extraction of Activations From Deep Neural Networks},
	journal = {Frontiers in Neuroinformatics},
	volume = {15},
	pages = {45},
	year = {2021},
	url = {https://www.frontiersin.org/article/10.3389/fninf.2021.679838},
	doi = {10.3389/fninf.2021.679838},
	issn = {1662-5196},
}