
A walk in the black-box: 3D visualization of large neural networks in virtual reality



Christoph Linse et al.

Neural Comput Appl. doi: 10.1007/s00521-022-07608-4. Online ahead of print.

Abstract

Within the last decade, Deep Learning has become a tool for solving challenging problems such as image recognition. Still, Convolutional Neural Networks (CNNs) are considered black boxes, which are difficult for humans to understand. Hence, there is an urge to visualize CNN architectures, their internal processes, and what they actually learn. Previously, virtual reality has been successfully applied to display small CNNs in immersive 3D environments. In this work, we address the problem of how to feasibly render large-scale CNNs, thereby enabling the visualization of popular architectures with tens of thousands of feature maps and branches in the computational graph in 3D. Our software “DeepVisionVR” enables the user to freely walk through the layered network, pick up and place images, move and scale layers for better readability, perform feature visualization, and export the results. We also provide a novel PyTorch module to dynamically link PyTorch with Unity, which gives developers and researchers a convenient interface to visualize their own architectures. The visualization is created directly from the PyTorch class that defines the model used for training and testing. This approach allows full access to the network’s internals and direct control over what exactly is visualized. In a use-case study, we apply the module to analyze models with different generalization abilities in order to understand how networks memorize images. We train two recent architectures, CovidResNet and CovidDenseNet, on the Caltech101 and the SARS-CoV-2 datasets and find that bad generalization is driven by high-frequency features and the susceptibility to specific pixel arrangements, leading to implications for the practical application of CNNs. The code is available on GitHub: https://github.com/Criscraft/DeepVisionVR.
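The abstract states that the visualization is created directly from the PyTorch model class, giving full access to the network's internals. As a minimal sketch of how per-layer feature maps can be extracted from an arbitrary model for such a visualization (this illustrates the general forward-hook mechanism only; it is not the actual DeepVisionVR interface, and the model choice and names are assumptions):

    import torch
    import torchvision.models as models

    # Collect per-layer feature maps via forward hooks. This sketches the
    # general idea of exposing a network's internals for rendering; it is
    # not the DeepVisionVR API itself.
    activations = {}

    def make_hook(name):
        def hook(module, inputs, output):
            # Detach so stored tensors do not keep the autograd graph alive.
            activations[name] = output.detach().cpu()
        return hook

    model = models.resnet18(weights=None).eval()
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Conv2d):
            module.register_forward_hook(make_hook(name))

    # One forward pass fills `activations` with one tensor per conv layer,
    # ready to be serialized and sent to a renderer such as Unity.
    with torch.no_grad():
        model(torch.randn(1, 3, 224, 224))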


Keywords:

Deep convolutional neural network visualization; Explainable artificial intelligence; Human-understandable AI systems; Virtual reality.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures



Fig. 1

DeepVisionVR architecture


Fig. 2



Representation of CovidResNet in 3D space. Each 2D panel shows the feature maps (channels) of a specific layer. Negative activations are colored blue, zero activations black and positive activations white
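As a sketch of the blue/black/white color mapping the caption describes (the paper does not give the exact normalization here, so the per-map max-abs scaling below is an assumption):

    import torch

    def activation_to_rgb(act):
        # Map a 2D activation map to RGB: negative -> blue, zero -> black,
        # positive -> white. Per-map max-abs normalization is illustrative,
        # not necessarily the paper's scheme.
        scale = act.abs().max().clamp(min=1e-8)
        norm = act / scale                 # values in [-1, 1]
        pos = norm.clamp(min=0.0)          # positive part -> gray to white
        neg = (-norm).clamp(min=0.0)       # negative part -> blue
        return torch.stack([pos, pos, pos + neg], dim=-1).clamp(0.0, 1.0)

    rgb = activation_to_rgb(torch.randn(7, 7))  # shape (7, 7, 3)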


Fig. 3



Left: dataset panel from which the user can pick images. The software randomly draws images from a provided PyTorch dataset class. Center: user interface and statistics for a specific network layer. Right: the feature visualization generates input images that maximize the mean activation of one specific channel. Each color image corresponds to one generated image for that channel
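The feature visualization in the right panel is activation maximization: an input image is optimized so that the mean activation of one chosen channel becomes maximal. A minimal sketch of that procedure (the target layer, step count, and learning rate are illustrative assumptions, not the paper's settings):

    import torch
    import torchvision.models as models

    model = models.resnet18(weights=None).eval()
    target_layer, channel = model.layer3, 5

    captured = {}
    target_layer.register_forward_hook(
        lambda module, inputs, output: captured.update(act=output))

    # Start from noise and ascend the gradient of the channel's mean activation.
    img = torch.randn(1, 3, 224, 224, requires_grad=True)
    optimizer = torch.optim.Adam([img], lr=0.05)

    for _ in range(256):
        optimizer.zero_grad()
        model(img)
        loss = -captured["act"][0, channel].mean()  # minimize the negative
        loss.backward()
        optimizer.step()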


Fig. 4



Representation of different architectures. Left: ResNet basic block. Right: Dense block. Bottom: Inception block
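To make the block structures in the figure concrete, here is the standard ResNet basic block (two 3x3 convolutions plus a skip connection, following He et al.); this is a generic sketch, not code from DeepVisionVR:

    import torch
    import torch.nn as nn

    class BasicBlock(nn.Module):
        # Two 3x3 convolutions with batch norm; the input is added back
        # onto the output before the final activation (skip connection).
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return self.relu(out + x)  # skip connection merges the branch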


Fig. 5



Three training strategies for obtaining models with different levels of generalization ability


Fig. 6



Example images from the Caltech101 dataset for the classes crocodile head, panda, pyramid, rooster, schooner, Snoopy, sunflower, wild cat and Yin and Yang


Fig. 7



First training strategy: activations of the last convolutional layer


Fig. 8



Second training strategy: activations of the last convolutional layer. Before training, all labels in the training set were shuffled


Fig. 9



Third training strategy: activations of the last convolutional layer. The training set was copied nine times, and each copy received white noise and random labels. The original dataset with its original labels is contained exactly once
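A minimal sketch of how the corrupted copies used in the second and third strategies could be constructed with a generic PyTorch Dataset wrapper (the noise level and seeding are assumptions, not the paper's exact recipe):

    import random
    import torch
    from torch.utils.data import Dataset, ConcatDataset

    class NoisyRandomLabelCopy(Dataset):
        # Wrap a dataset: add white noise to each image and assign a random
        # label. The noise standard deviation is an illustrative assumption.
        def __init__(self, base, num_classes, noise_std=0.1, seed=0):
            self.base = base
            self.noise_std = noise_std
            rng = random.Random(seed)
            self.labels = [rng.randrange(num_classes) for _ in range(len(base))]

        def __len__(self):
            return len(self.base)

        def __getitem__(self, idx):
            img, _ = self.base[idx]
            noisy = img + self.noise_std * torch.randn_like(img)
            return noisy, self.labels[idx]

    # Third strategy (sketch): the original data once plus nine corrupted copies.
    def build_third_strategy_dataset(base, num_classes):
        copies = [NoisyRandomLabelCopy(base, num_classes, seed=i) for i in range(9)]
        return ConcatDataset([base] + copies)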



