Software failures can have deadly consequences, yet building software that is both reliable and high-performing remains an unsolved problem. This is especially true for learning-enabled AI systems. State-of-the-art machine learning models commonly behave incorrectly on unexpected or adversarial inputs, and AI systems deployed in the real world can violate norms of privacy and fairness that we expect human decision-makers to follow. At the same time, it is usually impossible to reason about the correctness of machine learning models using traditional software development and debugging techniques. Overcoming these challenges through a synthesis of ideas from formal methods, probabilistic reasoning, and machine learning is a central objective of our research.
One strand of our work on this topic develops methods for analyzing the safety and robustness of machine learning models [IEEE S&P 2018; PLDI 2019]; the other develops techniques that integrate such analysis into the learning loop of intelligent systems [NeurIPS 2020]. Open challenges include scaling these methods to larger models, reducing the burden of formal specification needed for safety assurance, finding effective tradeoffs between safety and performance, and discovering algorithms that combine symbolic and statistical methods for safety assurance.
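To make the idea of robustness analysis concrete, the following is a minimal sketch of interval bound propagation, one standard technique in this space. It soundly over-approximates a network's outputs over an L-infinity ball of inputs; the tiny two-layer ReLU network, its weights, and the perturbation radius are illustrative assumptions, not taken from the cited papers.

```python
# Interval bound propagation (IBP) sketch: push an input interval
# [lo, hi] through affine and ReLU layers, yielding sound (if loose)
# bounds on every output the network can produce on that interval.
# The network below is a toy 2-2-1 example with made-up weights.

def affine_bounds(lo, hi, W, b):
    """Propagate interval bounds through y = W x + b.

    For each weight, the lower bound uses the input endpoint that
    minimizes the product, and the upper bound uses the other endpoint.
    """
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        l = bias + sum(w * (lo[i] if w >= 0 else hi[i]) for i, w in enumerate(row))
        h = bias + sum(w * (hi[i] if w >= 0 else lo[i]) for i, w in enumerate(row))
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi

def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each endpoint directly."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]

# Toy network: 2 inputs -> 2 hidden ReLU units -> 1 output.
W1, b1 = [[1.0, -1.0], [0.5, 2.0]], [0.0, -1.0]
W2, b2 = [[1.0, 1.0]], [0.0]

# Input x = (0.5, 0.5) with an L-infinity perturbation of radius 0.1.
eps = 0.1
lo = [0.5 - eps, 0.5 - eps]
hi = [0.5 + eps, 0.5 + eps]

lo, hi = affine_bounds(lo, hi, W1, b1)
lo, hi = relu_bounds(lo, hi)
lo, hi = affine_bounds(lo, hi, W2, b2)

# Any output the network produces on the perturbation ball lies in
# [lo[0], hi[0]]; if a safety property holds on this interval, it is
# certified for every perturbed input at once.
print(lo, hi)
```

Loose bounds like these are exactly why scaling such analyses to larger models, and trading precision against cost, remain open challenges.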