Rearchitecting Classification Frameworks For Increased Robustness


Neural networks generalize well over natural inputs but are vulnerable to adversarial inputs. Existing defenses against adversarial inputs have largely been detached from real-world settings, and they come at a cost to accuracy. Fortunately, some invariances of an object are its salient features; breaking them necessarily changes how the object is perceived. We find that applying these invariances to the classification task makes robustness and accuracy feasible together. Two questions follow: how can these invariances be extracted and modeled, and how can a classification paradigm be designed that leverages them to improve the robustness-accuracy trade-off? The remainder of the paper discusses solutions to these questions.

arXiv Preprint