Programmers often leverage data structure
libraries that provide useful and reusable
abstractions. Modular verification of programs
that make use of these libraries naturally rely
on specifications that capture important
properties about how the library expects these
data structures to be accessed and manipulated.
However, these specifications are often missing
or incomplete, making it hard for clients to be
confident they are using the library safely.
When library source code is also unavailable, as
is often the case, the challenge to infer
meaningful specifications is further
exacerbated. In this paper, we present a novel
data-driven abductive inference mechanism that
infers specifications for library methods
sufficient to enable verification of the
library's clients. Our technique combines a
data-driven learning-based framework to
postulate candidate specifications, and uses
SMT-provided counterexamples to refine these
candidates, taking special care to prevent
generating specifications that overfit to
sampled tests. The resulting specifications
form a minimal set of requirements on the
behavior of library implementations that ensures
safety of a particular client program. Our
solution thus provides a new multi-abduction
procedure for precise specification inference of
data structure libraries guided by client-side
verification tasks. Experimental results on a
wide range of realistic OCaml data structure
programs demonstrate the effectiveness of the
approach.
Recipient of a Distinguished Artifact Award at OOPSLA 2021