Lecture 1: Saliency Maps

This chapter covers results from the foundational paper "Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps," available on arXiv.

Lecture

Introduction to Saliency Maps

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### What is the goal of this paper?

- [ ] Make neural networks faster.
- [x] Visualize convolutional neural networks.
  > Correct.
- [ ] Visualize transformers.

### The focus of the proposed approach will be on

1. [x] Gradient of the output w.r.t. the input.
   > Correct.
1. [ ] Gradient of the input w.r.t. the output.
1. [ ] Accuracy of the model.
1. [ ] Speed of training of the model.

Problem Definition and Model Visualization

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### An earlier approach solved the visualization problem by performing gradient search

- [ ] in the output space to maximize the input.
- [x] in the input space to maximize the output.
  > Correct.
- [ ] in the input space to maximize the loss function.
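
Gradient search in the input space can be written as an optimization over the image while the trained weights stay fixed. The objective below follows the regularized form used in the paper covered in this chapter, where $S_c(I)$ is the score of class $c$ for image $I$ and $\lambda$ is a regularization weight. The step size $\eta$ and the plain gradient-ascent update are illustrative notation, not taken from the source.

```latex
I^{*} = \arg\max_{I} \; S_c(I) - \lambda \lVert I \rVert_2^2,
\qquad
I \leftarrow I + \eta \, \frac{\partial}{\partial I}
  \Big( S_c(I) - \lambda \lVert I \rVert_2^2 \Big)
```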

Model Visualization via Optimization

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The paper's implementation of visualization starts with an input that is

- [ ] a random image.
- [x] all zeros.
  > Correct.
- [ ] a random image of the same class as the input.
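
The procedure can be sketched in a few lines of NumPy: start from an all-zero image and repeatedly step along the gradient of the regularized class score. This is a minimal sketch, not the Colab example. The toy two-layer model, its random weights, the image size, and the names `class_score`, `score_gradient`, `visualize_class`, and `objective` are all hypothetical stand-ins for a trained ConvNet.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 4, 4                       # toy "image" size (assumption)
n_hidden, n_classes = 8, 3

# Hypothetical stand-in for a trained network: fixed random weights.
W1 = 0.3 * rng.standard_normal((n_hidden, H * W))
b1 = 0.1 * rng.standard_normal(n_hidden)
W2 = rng.standard_normal((n_classes, n_hidden))

def class_score(image, c):
    """Unnormalized class score S_c(I) of the toy model."""
    h = np.tanh(W1 @ image.ravel() + b1)
    return float(W2[c] @ h)

def score_gradient(image, c):
    """Analytic gradient dS_c/dI, returned in the shape of the image."""
    h = np.tanh(W1 @ image.ravel() + b1)
    return (W1.T @ ((1.0 - h ** 2) * W2[c])).reshape(image.shape)

def objective(image, c, lam=0.05):
    """Regularized objective S_c(I) - lam * ||I||^2."""
    return class_score(image, c) - lam * float(np.sum(image ** 2))

def visualize_class(c, lam=0.05, step=0.02, n_steps=500):
    """Gradient ascent on the regularized objective, starting from zeros."""
    image = np.zeros((H, W))      # all-zero starting image
    for _ in range(n_steps):
        grad = score_gradient(image, c) - 2.0 * lam * image
        image = image + step * grad
    return image

class_image = visualize_class(c=1)
```

With a real model, the analytic gradient would come from backpropagation, and the weights would stay frozen in the same way. Only the image is updated.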

Saliency Maps

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The saliency map uses the

- [ ] second derivative of the output with respect to the input.
- [x] first derivative of the output with respect to the input.
  > Correct.
- [ ] integral of the output with respect to the input.
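
A saliency map built from the first derivative can be sketched as follows: take the gradient of the class score with respect to every input value, then keep the largest absolute value across colour channels at each pixel. In practice the gradient comes from a single backpropagation pass. Central finite differences are used here only to keep the sketch free of deep learning dependencies. The toy score function, its weights, and the names `numerical_gradient`, `saliency_map`, and `make_score_fn` are hypothetical.

```python
import numpy as np

def numerical_gradient(score_fn, image, eps=1e-4):
    """Central-difference estimate of d score / d image (first derivative)."""
    grad = np.zeros_like(image, dtype=float)
    for idx in np.ndindex(image.shape):
        bump = np.zeros_like(image, dtype=float)
        bump[idx] = eps
        grad[idx] = (score_fn(image + bump) - score_fn(image - bump)) / (2 * eps)
    return grad

def saliency_map(score_fn, image):
    """M[i, j] = max over channels of |d score / d image[i, j, channel]|."""
    grad = numerical_gradient(score_fn, image)
    return np.abs(grad).max(axis=-1)

# Hypothetical model: one weight tensor per class, followed by tanh.
rng = np.random.default_rng(0)
height, width, channels, n_classes = 5, 5, 3, 2
class_weights = rng.standard_normal((n_classes, height, width, channels))

def make_score_fn(c):
    """Return the toy score function for class c."""
    return lambda img: float(np.tanh(np.sum(class_weights[c] * img)))

image = 0.1 * rng.standard_normal((height, width, channels))

# One saliency map per class for the same model and the same input.
maps = [saliency_map(make_score_fn(c), image) for c in range(n_classes)]
```

Each entry of `maps` is a single-channel array with the spatial size of the input. Changing the class index changes the map even though the model and the input stay the same.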

Results of Saliency Maps

Localization using Saliency Maps

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### Localization using saliency maps

- [ ] does not use graphs.
- [x] employed a graph theory based algorithm developed earlier.
  > Correct.
- [ ] is not possible.
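
One way to connect a saliency map to a graph-based segmentation algorithm is to threshold the map into foreground and background seed pixels. The sketch below covers only that seeding step and a bounding box around the foreground seeds. It deliberately omits the graph-cut segmentation itself. The quantile thresholds are illustrative values that should be checked against the paper, and the function names and the synthetic saliency map are hypothetical.

```python
import numpy as np

def saliency_seeds(saliency, fg_quantile=0.95, bg_quantile=0.30):
    """Split a saliency map into foreground and background seed masks.

    The quantile values are illustrative assumptions, not verified settings.
    """
    fg_thresh = np.quantile(saliency, fg_quantile)
    bg_thresh = np.quantile(saliency, bg_quantile)
    return saliency >= fg_thresh, saliency <= bg_thresh

def bounding_box(mask):
    """Tightest (row_min, row_max, col_min, col_max) box around True pixels."""
    rows, cols = np.nonzero(mask)
    return int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max())

# Synthetic saliency map: low-level noise plus a bright 3x3 "object".
rng = np.random.default_rng(0)
saliency = 0.1 * rng.random((10, 10))
saliency[4:7, 3:6] += 1.0

foreground, background = saliency_seeds(saliency)
box = bounding_box(foreground)   # box around the most salient seed pixels
```

In a full pipeline, a segmentation algorithm would propagate these seeds to the rest of the object before the bounding box is computed. Here the box covers only the seed pixels.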

Conclusions

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### A saliency map is

- [ ] specific to a model and an input but does not depend on an assigned input class.
- [x] specific to a model, an input, and an assigned input class.
  > Correct.
- [ ] specific to an input but does not depend on the model or the input class.
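
The three dependencies are visible in the definition itself. Near a given image $I_0$, the class score is approximated by a first-order Taylor expansion, and the saliency map is read off from the resulting weights. The notation below is a sketch: $k$ indexes colour channels and $h(i,j,k)$ is the position in $w$ of pixel $(i,j)$ in channel $k$. The symbol $k$ is chosen here to avoid a clash with the class index $c$.

```latex
S_c(I) \approx w^{\top} I + b,
\qquad
w = \left. \frac{\partial S_c}{\partial I} \right|_{I_0},
\qquad
M_{ij} = \max_{k} \left| w_{h(i,j,k)} \right|
```

The score function $S$ comes from the model, the derivative is evaluated at the input $I_0$, and the subscript $c$ selects the class, so changing any one of the three changes $M$.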

Code and Assignment: Visualization

A minimal working example for visualizing an input given a class label is presented in the Colab document. Here are a few illustrations generated by this minimal example:

Assignment 1: Visualize Input for Given Class (Due February 6, 2023)

Using any input benchmark of your choice such as ImageNet and any model of your choice such as ResNet101, write a Jupyter notebook to visualize the input for a given class (60 points). Use an optimization different from the one presented in the minimal working example above.

Code and Assignment: Attributions

A minimal working example for creating an attribution similar to the saliency map of an input given a class label is presented in the Colab document. Here are a few illustrations generated by this minimal example:

Assignment 2: Creating Saliency Maps for Models (Due February 20, 2023)

Using any input benchmark of your choice such as ImageNet and any model of your choice such as ResNet101, write a Jupyter notebook to identify components of an input that cause it to be predicted as a given label (40 points). Do not copy code from the minimal working example above.

FAQs

These assignments will be evaluated as graduate assignments, and there is no single correct answer that is expected. A variety of answers can equally satisfy the requirements in the above assignments.

Q1. I was wondering if you want the notebook to look like the example one, such as including load model, load input, transformation, and visualization?

A1. The code does not have to look like the example notebook. You can arrange the workflow in the code as you see fit.

Q2. If we do, for loading input do we load it from http://sumitkumarjha.com/ as well or pick a picture from ImageNet? For transformation, can I do something like the example?

A2. You can use any inputs, such as pictures from the ImageNet, MNIST, CIFAR10, SVHN, or RESISC45 benchmark data sets. You should attempt to use different transformations and different inputs, if possible. You can even use non-image inputs if that makes things interesting, for example by applying a Fast Fourier Transform to audio or EM signals from different benchmark data sets.

Q3. For benchmark, I didn’t see one being used in the example notebook, I was wondering if you can explain more about that?

A3. We would like the inputs to come from a publicly available benchmark data set such as ImageNet so that trained models may be easily available for your use.

[Figure panels: Corn | Dining Table | Red Wine | Pineapple]

[Figure panels: Image | Attribution]