[Originally published on PL Perspectives, 4/15/2020]
In a previous post, I talked about the potential of program synthesis as an approach to discovering truths about the natural world. As it happens, a team that Armando Solar-Lezama leads and that I am a part of recently won a US NSF Expeditions in Computing grant to work on this topic. The Expeditions program aims to fund “ambitious, fundamental research agendas that promise to deﬁne the future of computing and information.” The program has played an important role in PL research. The ExCAPE expedition enabled major progress in program synthesis: the problem of automatically discovering programs from specifications. The ongoing DeepSpec expedition is making the dream of formally verified software ever closer to reality. I am optimistic that our expedition will continue this tradition and produce results that are deeply grounded in PL while also being of fundamental interest to science and the broader society.
My last post primarily focused on the application dimension of the project. In this post, I will talk about the central technical idea that we aim to pursue: the data- and specification-driven synthesis of neurosymbolic programs, or programs in which traditional code is melded together with neural modules.
Bridging Program Synthesis and Deep Learning
Program synthesis is a hot topic of research in the PL and formal methods (FM) communities. However, it is worth noting that machine learning is also a form of program synthesis, because a computable function learned from data is nothing but a program. Specifically, deep learning techniques can be seen to synthesize parameters for a particular class of low-level programs (neural networks) using a specific synthesis algorithm (stochastic gradient descent).
PL/FM-style synthesis and deep learning have different advantages. Regarding program representations, neural networks can represent arbitrarily complex patterns in data, but this representation is also inscrutable to humans. In contrast, the programs that PL/FM approaches generate are high-level and interpretable. As for synthesis algorithms, gradient-based parameter synthesis scales to extremely high-dimensional input and parameter spaces. On the other hand, synthesis techniques in PL/FM can use automated reasoning and the structure of the programming language to decompose problems and prune large parts of the search space.
The vision of our project is to bring together these two kinds of approaches. We imagine neurosymbolic program representations that blend together neural networks and compositional, high-level code. We also imagine neurosymbolic program synthesis techniques that simultaneously synthesize the neural and symbolic parts of such programs, using a combination of deep learning and symbolic methods from PL/FM.
Neurosymbolic program representations
A neurosymbolic program is like a normal program in a compositional, high-level language, except it is also allowed to invoke a set of neural networks as library routines. The inputs and outputs of these neural networks may themselves be constructed using neurosymbolic programs.
Because they contain high-level, compositional elements, such programs are more human-interpretable, and more easily analyzed using symbolic tools from PL/FM. Also, while learning these programs, we can go by more than data; we can also provide a strong inductive bias for the learner using constraints that encode rich domain knowledge. Doing so can lead to more reliable and generalizable learning.
Let’s now look at a few examples of neurosymbolic programs.
Neurosymbolic Program Synthesis
Having discussed neurosymbolic program representations, we now turn to the problem of devising algorithms that produce them. The neurosymbolic synthesis problem generalizes classical machine learning as well as classical program synthesis. In the former context, one normally discovers models (programs) from noisy input-output examples that approximately describe a program’s behavior. In the latter, the specification that directs a synthesizer is a formal constraint on the syntax and semantics of a program. In the synthesis of neurosymbolic programs, the specification could include all of these things: noisy input-output data, constraints on the architectures of the neural modules, skeletal syntax (e.g., with “holes”) for the symbolic elements, and even formal semantic requirements such as the robustness of the overall program to perturbations to the inputs.
It seems natural that algorithms for solving this synthesis problem would use a combination of gradient-based learning and symbolic search. Initial explorations of such combinations appear promising. For example, among the papers I mentioned above, Valkov et al. use a type-directed, top-down search through a space of programs to generate neurosymbolic programs in which certain parameters are missing, then uses gradient descent to learn values for those parameters. The Ellis et al. paper used the Sketch program synthesis system to generate programs operating over neurally generated features. The Cheung et al. paper used Sketch to generate symbolic features as well. The Verma et al. paper combined oracle-guided program synthesis with a constrained optimization technique called mirror descent to find optimal neurosymbolic controllers. These methods barely scrape the surface of what’s possible in this space; much is left for future research to discover.
Neurosymbolic program synthesis brings together two very different approaches to automating programming: the PL/FM approach of generating composable, human-interpretable code that can be plugged into a human-driven software engineering process, and the machine learning approach of discovering blackbox code that represents patterns not easily described in language. The problem has many pieces, including:
- how to design the DSLs that limit the symbolic parts of programs and the architectures for the neural parts,
- how to model uncertainty,
- how to effectively combine the combinatorial search over symbolic program components with the gradient-based optimization of neural network parameters,
- how to decide when to deploy a specific synthesis algorithm,
- how to ensure that any synthesized program satisfies worst-case or statistical guarantees.
I expect that many exciting results on these topics will come out in the next few years, and I look forward to sharing some of these results through this blog!