COMPUTER SCIENCE TECHNICAL REPORT ABSTRACTS

CMU-CS-22-104
Computer Science Department
School of Computer Science, Carnegie Mellon University

Identifying, Analyzing, and Addressing Weaknesses
in Deep Networks
Foundations for Conceptually Sound Neural Networks

Klas Leino

Ph.D. Thesis

May 2022

CMU-CS-22-104.pdf

Keywords: Machine Learning, Artificial Intelligence, Security, Privacy, Robustness, Transparency, Neural Networks

Deep neural networks have seen great success in many domains, with the ability to perform complex human tasks such as image recognition, text translation, and medical diagnosis; however, despite their remarkable abilities, neural networks have several peculiar shortcomings and vulnerabilities. Many of these weaknesses relate to a lack of conceptual soundness in the features encoded and used by the network; that is, the features the network learns to use may represent concepts that are not appropriate for the task at hand, even when they apparently allow the network to perform well on previously unseen validation data. This thesis examines the problems that arise in deep networks when they are not sufficiently conceptually sound, and provides steps towards improving the conceptual soundness of modern networks.

The first contribution of this thesis is a general, axiomatically justified framework for explaining neural network behavior, which serves as a powerful tool for assessing conceptual soundness. This work takes the unique perspective that to accurately assess the conceptual soundness of a model, an explanation must provide a faithful account of its behavior. By contrast, the literature has often attempted to justify explanations based on their appeal to human intuition; however, this begs the question, as it assumes the model captured conceptually sound human intuition in the first place.

To the contrary, a large body of prior work provides conclusive evidence that conceptual soundness is not the norm in standard deep networks: adversarial examples (small, semantically meaningless input perturbations that cause erroneous behavior) are ubiquitous in such networks and violate the tenets of conceptual soundness. The second part of this thesis addresses this issue by contributing a state-of-the-art method for training neural networks with provable guarantees against a common class of adversarial examples.

Finally, we demonstrate that robustness to malicious input perturbations is only the first step, with contributions uncovering several orthogonal weaknesses and vulnerabilities relating to the conceptual soundness of deep networks.

189 pages

Thesis Committee:
Matt Fredrickson (Chair)
Anupam Datta
J. Zico Kolter
Corina Păsăreanu (CMU/NASA Ames)
Kamalika Chaudhuri (University of California, San Diego/Meta AI)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science
