Linbo Liu, Trong Nghia Hoang, et al.
ICLR 2022
Recent studies have found that deep learning systems are vulnerable to adversarial examples; e.g., visually unrecognizable adversarial images can easily be crafted to cause misclassification. The robustness of neural networks has been studied extensively in the context of adversary detection, which relies on a metric that exhibits strong discriminative power between natural and adversarial examples. In this paper, we propose to characterize adversarial subspaces through the lens of mutual information (MI), approximated by conditional generation methods. We use MI as an information-theoretic metric to strengthen existing defenses and improve the performance of adversary detection. Experimental results on the MagNet defense demonstrate that our proposed MI detector can strengthen its robustness against powerful adversarial attacks.
Yong Xie, Dakuo Wang, et al.
NAACL 2022
Pratik Vaishnavi, Kevin Eykholt, et al.
USENIX Security 2022
Minhao Cheng, Jinfeng Yi, et al.
AAAI 2020