Course Home Page

Self-Attention Attribution for Transformers

This chapter covers results from the paper *Self-Attention Attribution: Interpreting Information Interactions Inside Transformer*, available at this arXiv link (2021 version).


Lecture

The lecture slides are available here.

Introduction to Transformers and Self-Attention

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### For an input of n tokens, what is the dimension of the attention matrix?

- [ ] n x 1.
- [x] n x n.
> Correct.
- [ ] 1 x n.

### Attention scores are not enough because

1. [x] they are too dense.
> Correct.
1. [ ] they are too sparse.
1. [ ] they are randomly generated.
1. [ ] they do not depend on the input.
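As a reminder of why the attention matrix is n x n, here is a minimal NumPy sketch of single-head self-attention. The sizes, random weights, and variable names are illustrative only, not taken from the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy setup: n = 4 tokens, hidden size d = 8 (illustrative values).
rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.standard_normal((n, d))      # token embeddings
W_q = rng.standard_normal((d, d))    # query projection
W_k = rng.standard_normal((d, d))    # key projection

Q, K = X @ W_q, X @ W_k
A = softmax(Q @ K.T / np.sqrt(d))    # attention matrix

print(A.shape)  # (4, 4): one score per (query token, key token) pair
```

Each of the n query tokens attends to all n key tokens, so the matrix has one row per query and one column per key, and each row sums to 1.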

Attributions using IG and Attention

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### Attributions are computed using a discrete approximation of the IG-style integral in this paper. The suggested number of samples (n) for the discrete approximation in the paper is:

- [ ] n = 3.
- [x] n = 20.
> Correct.
- [ ] n = 1000 or more.
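To make the discrete approximation concrete, here is a sketch of the Riemann-sum form of the attribution, Attr(A) = A ⊙ (1/m) Σₖ ∂F((k/m)A)/∂A with m = 20 steps. Instead of a real transformer, it uses a toy objective F(A) = Σ W ⊙ A² whose gradient is known in closed form, so the approximation can be checked against the exact integral; everything here is a made-up stand-in, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 20                       # n tokens; m = 20 summation steps, as in the paper
A = rng.random((n, n))             # stand-in attention matrix for one head
W = rng.standard_normal((n, n))    # weights of a toy objective F (hypothetical)

def grad_F(A_scaled):
    """Gradient of the toy objective F(A) = sum(W * A**2), i.e. 2 * W * A."""
    return 2.0 * W * A_scaled

# Riemann approximation of Attr(A) = A * integral_0^1 dF(alpha*A)/dA dalpha.
attr = A * sum(grad_F(k / m * A) for k in range(1, m + 1)) / m

exact = A * W * A                  # closed-form value of the integral for this toy F
print(np.max(np.abs(attr - exact)))  # small: m = 20 already tracks the integral well
```

In the paper the gradient comes from backpropagation through the full model rather than a closed form, but the summation structure is the same.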

Experiments using Attributions

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### The Pearson correlation between heads for similar tasks in the paper is

- [ ] close to 0.
- [x] close to 1.
> Correct.
- [ ] close to -1.
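This kind of comparison can be sketched in a few lines: gather one importance score per head, flatten across layers, and correlate the resulting vectors between tasks. The scores below are randomly generated stand-ins, not values from the paper:

```python
import numpy as np

# Hypothetical per-head importance scores, flattened over 12 layers x 12 heads
# (BERT-base shape). All numbers are made up for illustration.
rng = np.random.default_rng(2)
task_a = rng.random(12 * 12)
task_b = task_a + 0.05 * rng.standard_normal(144)  # "similar" task: near-identical scores
task_c = rng.random(144)                           # unrelated task: independent scores

r_similar = np.corrcoef(task_a, task_b)[0, 1]
r_unrelated = np.corrcoef(task_a, task_c)[0, 1]
print(f"similar tasks:   r = {r_similar:.2f}")     # close to 1
print(f"unrelated tasks: r = {r_unrelated:.2f}")   # close to 0
```

A correlation near 1 between the head-importance vectors of two tasks indicates that the same heads matter for both, which is the pattern the paper reports for similar tasks.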

Conclusions

---
primaryColor: steelblue
shuffleQuestions: false
shuffleAnswers: true
---

### Self-attention attribution, as presented in this paper,

- [ ] does not rely on attention.
- [x] makes the self-attention mechanism more explainable.
> Correct.
- [ ] is independent of the input to the transformer.

Code and Assignment

There are no programming assignments for this lecture.