Fast Axiomatic Attributions

This chapter covers results from the paper *Fast Axiomatic Attribution for Neural Networks*, available at this arXiv link.


Lecture

The lecture slides are available here.

Introduction

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The paper focuses on a class of neural networks called

- [ ] fast neural networks.
- [x] nonnegatively homogeneous DNNs.
  > Correct.
- [ ] axiomatic neural networks.
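In the paper's terminology, a DNN $F$ is nonnegatively homogeneous (an X-DNN) when scaling its input by any nonnegative factor scales its output by the same factor:

```latex
F(\alpha x) = \alpha\, F(x) \qquad \text{for all } \alpha \ge 0
```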

Prior Work

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The paper notes that training using attribution priors

- [ ] leads to a 30X reduction in training time.
- [x] leads to a 30X increase in training time.
  > Correct.
- [ ] has no impact on the training time for neural networks.

Integrated Gradients and X-DNNs

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The paper shows that integrated gradients for a nonnegatively homogeneous DNN can be computed

- [ ] without any forward or backward pass.
- [x] with a single forward/backward pass.
  > Correct.
- [ ] with about n = 50 forward/backward passes.
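The reason a single pass suffices: for an X-DNN with a zero baseline, the integrated-gradients integral collapses to the elementwise product of the input with its gradient. Below is a minimal PyTorch sketch of this shortcut; `model` and `target_class` are illustrative placeholders, not code from the paper:

```python
import torch

def xdnn_integrated_gradients(model, x, target_class):
    """IG for an X-DNN with a zero baseline: exactly input * gradient,
    so one forward and one backward pass replace the usual n-step
    Riemann-sum approximation."""
    x = x.clone().requires_grad_(True)
    score = model(x)[..., target_class].sum()   # single forward pass
    (grad,) = torch.autograd.grad(score, x)     # single backward pass
    return x.detach() * grad                    # elementwise attribution
```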

Constructing X-DNNs

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The paper recursively defines a neural network using weight matrices and bias terms. For a deep neural network to be a nonnegatively homogeneous DNN, or X-DNN, the bias terms should be

- [ ] greater than 0.
- [x] 0.
  > Correct.
- [ ] 1.
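To see why the bias terms must vanish, write out the recursive definition and apply a one-step induction. The derivation below is a sketch assuming ReLU-style activations $\phi$ that are positively homogeneous, i.e., $\phi(\alpha u) = \alpha\,\phi(u)$ for $\alpha \ge 0$:

```latex
z^{(0)} = x, \qquad
z^{(l)} = \phi\!\big(W^{(l)} z^{(l-1)} + b^{(l)}\big), \qquad l = 1, \dots, L

\text{Setting } b^{(l)} = 0:\quad
z^{(l)}(\alpha x) = \phi\!\big(W^{(l)}\, \alpha\, z^{(l-1)}(x)\big)
= \alpha\, z^{(l)}(x)
\;\Longrightarrow\; F(\alpha x) = \alpha\, F(x)
```

A nonzero bias breaks this, since $W^{(l)}(\alpha z) + b^{(l)} \ne \alpha\,\big(W^{(l)} z + b^{(l)}\big)$ in general.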

Results and Conclusions

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The paper shows that the following neural nets can be transformed into X-DNNs:

- [ ] ResNets but not AlexNets or VGGs.
- [x] ResNets, AlexNets, and VGGs.
  > Correct.
- [ ] AlexNets and VGGs but not ResNets.
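As a starting point for such a transformation, the sketch below zeroes and freezes the additive biases of all Linear and Conv2d layers in a torchvision model. This is only a partial, illustrative recipe: normalization layers introduce further shift terms that need separate treatment.

```python
import torch.nn as nn
from torchvision import models

def zero_and_freeze_biases(model: nn.Module) -> nn.Module:
    """Zero every Linear/Conv2d bias and freeze it, so the remaining
    layers (ReLU, pooling, dropout) preserve nonnegative homogeneity.
    Normalization layers, if present, need separate handling."""
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)) and module.bias is not None:
            nn.init.zeros_(module.bias)
            module.bias.requires_grad_(False)
    return model

# Example: AlexNet with all additive biases removed; fine-tuning is
# still needed to recover accuracy after the change.
x_alexnet = zero_and_freeze_biases(models.alexnet(weights="DEFAULT"))
```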

Code and Assignment: X-DNNs

A minimal working example that builds a simple X-DNN for MNIST is presented in this Colab document. The same code without the modification essential for making the network nonnegatively homogeneous (removing the bias terms) is presented in this Colab document. The plots below compare how the outputs of the two models change (Y-axis) as the input is scaled from 10% to 100% (X-axis). The X-DNN's output scales linearly with the input, while the ordinary DNN's does not.

[Plots: DNN Output vs. Input Scaling Ratio (left); X-DNN Output vs. Input Scaling Ratio (right)]
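For readers who want to reproduce the scaling check without opening the notebooks, here is a minimal sketch of the same idea; the architecture is illustrative and deliberately different from the Colab code:

```python
import torch
import torch.nn as nn

# X-DNN sketch for MNIST-sized inputs: bias=False everywhere and ReLU
# activations make the network nonnegatively homogeneous, i.e.,
# model(a * x) == a * model(x) for any a >= 0.
xdnn = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128, bias=False),
    nn.ReLU(),
    nn.Linear(128, 10, bias=False),
)

# Scale the input from 10% to 100% and watch one logit: for the X-DNN
# the curve is exactly linear; with bias=True it would not be.
x = torch.rand(1, 1, 28, 28)
with torch.no_grad():
    for k in range(1, 11):
        scale = 0.1 * k
        logit = xdnn(scale * x)[0, 0].item()
        print(f"scale={scale:.1f}  logit[0]={logit:+.4f}")
```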

Assignment 5: Creating X-DNNs and IG Attributions for X-DNNs (Due April 17, 2023)

Using an input benchmark such as FashionMNIST or CIFAR-10, write a Jupyter notebook that implements an X-DNN (a nonnegatively homogeneous deep neural network) of your choice (40 points). Do not copy code from the minimal working example above.

  1. Using your X-DNN implementation, implement integrated gradients (IG) attribution using the single forward/backward-pass approach shown in the paper. Demonstrate your approach on 10 different input examples. (40 points)

  2. Using your X-DNN implementation, implement the saliency-map attribution approach covered earlier in the course and qualitatively compare the results with the IG approach implemented in (1) above. Show your results on 10 different input examples. (20 points)

  3. Suggest an approach for learning X-DNNs based on the ResNet-18 and ResNet-50 models. Implement and evaluate your approach on at least 10 examples from ImageNet or a similarly complex, high-dimensional data set. (optional extra credit: 100 points)

FAQs

These assignments will be evaluated as graduate assignments; there is no single correct answer, and a variety of solutions can satisfy the requirements equally well.

Q1. Do we have to use the MNIST benchmark?

A1. You can use ImageNet, CIFAR-10, or any benchmark data set of your choice. You can even use a data set of another modality, such as text, speech, or EM signals.