I-GOS, a black-box saliency (attribution) map approach to explain deep networks.
I-GOS (Integrated-Gradient Optimized Saliency) is a recent visualization method for DNN predictions developed by the XAI team at Oregon State University. Collaborators: Zhongang Qi, Saeed Khorram, and Fuxin Li.
TL;DR: I-GOS uses the integrated gradient to solve the masking optimization that generates saliency maps. Using the integrated gradient as the descent direction helps the optimization avoid poor local optima. I-GOS significantly outperforms the baselines on quantitative metrics (at all resolutions) while being 3–10× faster than other perturbation-based methods. DEMO | PAPER
In the past couple of years, the performance of AI systems has consistently improved in domains such as image classification. Fast forward from 2012 to now: more sophisticated and efficient neural network architectures have pushed the top-5 error rate on ImageNet from 17% (AlexNet, a model with 60 million parameters) to only 2.5% (EfficientNet, a model with 480 million parameters). These achievements show how “smart” current AI systems have become, and everything looks set for the robot apocalypse. But not exactly (yet)!
What do you see in the image below? (If you are familiar with adversarial examples, you can skip the next two paragraphs, though I know skipping the panda images is hard!)
If your answer is “Panda”, congrats! You have solved one of the 14 million images in ImageNet. Now, let’s feed this image to a state-of-the-art network:
We can see that the network is very confident (99.3%) about its prediction and everything seems fine. But can we rely on this confidence? Adversarial machine learning has shown how easily these networks can be fooled and that their confidence score is a fragile basis for trusting them: by adding a small, carefully designed perturbation (adversarial noise) to the “Panda” image, the network’s prediction shifts completely to “Goldfish”, again with high confidence (95.5%). This is clearly not the case for humans, at least we hope.
This clearly shows the need for explanations of the decisions of AI systems, especially in domains where they constantly make critical decisions, such as self-driving cars and medical diagnosis. In computer vision, computing saliency maps from a deep network is a popular approach for visualizing and explaining its decisions. However, heatmaps that do not correlate with the network's actual behavior may mislead humans, so it is crucial that a heatmap provides a faithful explanation of the underlying deep network.
There are two main approaches to generate saliency maps:
- One-step backpropagation methods use the gradient to visualize the explanation, which makes them fast to compute. However, these methods have several disadvantages: some are independent of the network weights and perform (partial) image recovery (Guided BackProp), some produce a diffuse heatmap that does not align with human interpretation (Integrated Gradient), and they only reflect infinitesimal changes in the prediction, which for deep networks (highly non-linear functions) are not necessarily indicative of changes large enough to alter the final prediction.
- Perturbation-based (masking) methods first perturb parts of the input and then run a forward pass to see which parts are most important for preserving the network's final decision. This makes them more intuitive to humans; nevertheless, the optimization problem of finding the mask is highly memory- and time-consuming (RISE).
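The perturbation idea can be sketched in its simplest form: replace one part of the input at a time with a baseline value, run a forward pass, and record how much the score drops. This is only a toy occlusion loop, not RISE or I-GOS; `toy_model`, `weights`, and the flat three-"pixel" image are hypothetical stand-ins for a real network and image.

```python
def occlusion_saliency(image, baseline, model):
    """Score drop when each pixel alone is replaced by its baseline value."""
    full_score = model(image)
    drops = []
    for i in range(len(image)):
        perturbed = list(image)
        perturbed[i] = baseline[i]          # perturb one part of the input
        drops.append(full_score - model(perturbed))  # one forward pass per part
    return drops


# Hypothetical stand-in for a network's class score: a fixed weighted sum.
weights = [0.7, 0.2, 0.1]


def toy_model(x):
    return sum(w * v for w, v in zip(weights, x))


image = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
drops = occlusion_saliency(image, baseline, toy_model)
most_important = drops.index(max(drops))
```

Even this naive version needs one forward pass per perturbed part, which hints at why perturbation-based methods become expensive on real images.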
I-GOS: Integrated-Gradient Optimized Saliency
How does I-GOS do it? In I-GOS (Integrated-Gradient Optimized Saliency), we take a causal approach to explaining the prediction of a deep network: we optimize a (blurred) mask over the image that locates the smallest and smoothest non-adversarial regions (the heatmap) whose removal maximally decreases the prediction score. To put it more technically, we have the following masking optimization problem to solve:
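In symbols, a masking problem of this kind can be sketched as below. The notation is assumed here for illustration and is not taken from this post: f_c is the network's score for the target class c, I is the input image, Ĩ is a blurred baseline version of it, M is a mask with values in [0, 1] (1 keeps a pixel, 0 replaces it with the baseline), TV is a total-variation smoothness penalty, and λ1, λ2 are trade-off weights. The paper remains the reference for the exact formulation.

```latex
\min_{M \in [0,1]^{h \times w}} \;
  f_c\big(\Phi(I, M)\big)
  + \lambda_1 \,\lVert \mathbf{1} - M \rVert_1
  + \lambda_2 \,\mathrm{TV}(M),
\qquad
\Phi(I, M) = I \odot M + \tilde{I} \odot (\mathbf{1} - M)
```

The first term pushes the class score down, the second keeps the deleted region small, and the third keeps the mask smooth.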
Instead of solving the optimization above with plain gradient descent, I-GOS proposes to use the integrated gradient as the descent direction, to avoid falling easily into a local optimum and to get closer to the global optimum. For this purpose, the integrated gradient uses a baseline image (a zero-valued or highly blurred image) that gives a near-zero output score (the unconstrained global optimum of the objective above). This naturally provides a better direction than the plain gradient in that it points more directly toward the global optimum of the objective function. The following figure shows one such scenario:
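To make the idea of averaging gradients along the line to the baseline concrete, here is a minimal sketch on a toy score function. Everything in it is an assumption for illustration: `score` is a sigmoid of a weighted sum standing in for the network's class score, and `w`, `m`, and `baseline` are hypothetical values, not the setup from the paper.

```python
import math


def score(m, w):
    """Toy stand-in for the class score: sigmoid of a weighted sum."""
    return 1.0 / (1.0 + math.exp(-sum(wi * mi for wi, mi in zip(w, m))))


def score_grad(m, w):
    """Analytic gradient of the toy score with respect to m."""
    s = score(m, w)
    return [s * (1.0 - s) * wi for wi in w]


def integrated_gradient(m, baseline, w, steps=20):
    """Average the gradient at evenly spaced points on the line from baseline to m."""
    total = [0.0] * len(m)
    for s in range(1, steps + 1):
        alpha = s / steps
        point = [b + alpha * (mi - b) for mi, b in zip(m, baseline)]
        total = [t + g for t, g in zip(total, score_grad(point, w))]
    return [t / steps for t in total]


w = [4.0, -1.0, 0.5]
m = [1.0, 1.0, 1.0]         # starting point: nothing masked yet
baseline = [0.0, 0.0, 0.0]  # point with a much lower score

plain = score_grad(m, w)                # gradient at the starting point only
ig = integrated_gradient(m, baseline, w)  # averaged along the path to the baseline
```

In this toy, the score is nearly saturated at the starting point, so the plain gradient is small, while the averaged gradient also sees the steeper region closer to the baseline and is noticeably larger in magnitude.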
“A” is the initial score of the deep network for a target class c and is the starting point of the optimization. The constraints on the mask optimization are indicated by a black dotted line. Plain gradient descent leads to the local optimum “C”, while using the average of the gradients along the line from “A” to the baseline “B” gives a better direction and hypothetically finds a better solution: the area enclosed by the white dashed line and the black dotted one.
How is I-GOS in terms of speed? I-GOS uses backtracking line search to update the step size during optimization. Together with the descent direction from the integrated gradient, this mitigates a major hurdle of perturbation-based saliency methods: time-consuming optimization. I-GOS tends to be 3–10× faster than other perturbation-based methods. For additional details on the optimization, please refer to the paper.
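For readers unfamiliar with backtracking line search, the sketch below shows the textbook version with the Armijo sufficient-decrease condition on a toy quadratic. It is an illustration of the general technique, not the exact condition used in the paper; `backtracking_step`, `objective`, and the constants `shrink` and `c` are assumptions chosen for the example.

```python
def backtracking_step(objective, x, direction, grad,
                      step=1.0, shrink=0.5, c=1e-4, max_tries=30):
    """Shrink the step until the Armijo sufficient-decrease condition holds."""
    f0 = objective(x)
    # Directional derivative along `direction`; negative for a descent direction.
    slope = sum(g * d for g, d in zip(grad, direction))
    for _ in range(max_tries):
        candidate = [xi + step * di for xi, di in zip(x, direction)]
        if objective(candidate) <= f0 + c * step * slope:
            return candidate, step
        step *= shrink
    return x, 0.0  # no acceptable step found; stay where we are


def objective(x):
    """Toy objective: a simple quadratic bowl."""
    return sum(xi * xi for xi in x)


x = [3.0, -4.0]
grad = [2.0 * xi for xi in x]
direction = [-g for g in grad]  # steepest descent; I-GOS would use the integrated gradient here

new_x, step = backtracking_step(objective, x, direction, grad)
```

Here the initial step of 1.0 overshoots to a point with the same objective value, so it is rejected and halved once before being accepted. The benefit is that a large step is tried first and only reduced when it fails, which avoids hand-tuning a small fixed learning rate.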
So how good is this, quantitatively? For the quantitative evaluation of I-GOS, the “deletion” and “insertion” metrics are used. First, all the values in the mask are sorted in descending order. Then, for the deletion metric, starting from the original image, patches of pixels are gradually deleted (blurred) from the image in that sorted order to observe how sharply the prediction score of the deep network drops. The same goes for the insertion metric, except that it starts from a baseline image, and patches of pixels are gradually inserted into (de-blurred in) the image to observe how fast the prediction score increases. In both cases, the area under the curve (AUC) is calculated as the evaluation score: for deletion, a lower AUC is better, and for insertion, a higher AUC is better. I-GOS (at all mask resolutions) significantly outperforms the other baselines in terms of these causal metrics:
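The two metrics described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the "image" is a flat list of four values, the baseline is all zeros rather than a blurred image, and `toy_model` is a hypothetical weighted sum standing in for the network.

```python
def auc(curve):
    """Trapezoidal area under a curve sampled at evenly spaced fractions of [0, 1]."""
    n = len(curve) - 1
    return sum((curve[i] + curve[i + 1]) / 2.0 for i in range(n)) / n


def deletion_insertion(image, baseline, saliency, model, patch=1):
    """Return (deletion AUC, insertion AUC) for a given saliency ranking."""
    # Most salient pixels first.
    order = sorted(range(len(image)), key=lambda i: saliency[i], reverse=True)
    deleted, inserted = list(image), list(baseline)
    del_curve, ins_curve = [model(deleted)], [model(inserted)]
    for start in range(0, len(order), patch):
        for i in order[start:start + patch]:
            deleted[i] = baseline[i]   # remove a salient pixel from the image
            inserted[i] = image[i]     # restore a salient pixel into the baseline
        del_curve.append(model(deleted))
        ins_curve.append(model(inserted))
    return auc(del_curve), auc(ins_curve)


weights = [0.6, 0.3, 0.1, 0.0]


def toy_model(x):
    return sum(w * v for w, v in zip(weights, x))


image = [1.0, 1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0, 0.0]

good_del, good_ins = deletion_insertion(image, baseline, weights, toy_model)
bad_del, bad_ins = deletion_insertion(image, baseline, weights[::-1], toy_model)
```

A saliency map that ranks the truly important pixels first gives a lower deletion AUC and a higher insertion AUC than a reversed ranking, which is exactly the ordering the metrics are meant to reward.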
For more evaluation results, details about the experiments, and a discussion of the evaluation metrics, I encourage interested readers to refer to the paper.
How about finally showing some visual saliency maps?! This part should be more fun to follow. Below is a visual comparison of I-GOS with two other well-known saliency methods, Grad-CAM and RISE, which tend to generate coarse saliency maps, while I-GOS saliency maps are more local and abstract.
Further, we have a few examples of deletion and insertion using the I-GOS-generated saliency maps that give some quite interesting (and surprising!) insights into how DNNs make a prediction:
It can be noted that:
- For the “Eft” (top), blurring only 6.1% of the pixels drops the confidence from 99.6% to only 14%. On the other hand, inserting only 1.5% of the pixels back into the baseline (part of the legs and eyes) makes the model 97% confident that the image is an “Eft”.
- For the “Analog clock” (middle) image, inserting only 0.8% of the pixels (only the tip of the long hand) makes the model even more confident about its prediction than it is on the original image.
- In the “Pomeranian” (bottom) example, one can observe that by retaining only 2.3% of the pixels (the face), the network predicts the correct class with 82.9% confidence.
Failure cases? We have found that the saliency maps optimized by I-GOS do not work well, and sometimes fail, for targets with low initial prediction confidence (less than 0.01) or when the deep model makes a wrong prediction. As an example:
Wait, how about the “Panda” image from the beginning? Right, we have not forgotten about it, either. In the figure below, we can observe that the saliency maps generated for natural and adversarial examples are completely different. Also, for the natural image, I-GOS often leads to high classification confidence by inserting a small portion of the pixels while for the adversarial image, almost the entire image needs to be inserted for the model to predict the adversarial category (right-most column):
Note that we are NOT presenting I-GOS as a defense against adversarial attacks, and that specific attacks may be designed to target the salient regions in the image. However, these figures show that the I-GOS heatmap and the insertion metric are robust against those full-image attacks and are not performing mere image reconstruction.