Given a single witness to a fault in a program
(in the form of a buggy input), we often wish to
discover related inputs that can also trigger
the same fault. This kind of error
generalization is important to help document API
misuse, better localize faults, provide crucial
detail in bug reports, and facilitate
data-driven program analyses, verification, and
inference techniques that require both
meaningful positive and negative inputs to a
program. Error generalization is particularly
challenging, however, when the identified fault
occurs in blackbox components whose source code
is either unavailable or too complex to
understand or effectively analyze. To facilitate
error generalization in such contexts, we
present a generative learning-based mechanism
that synthesizes error-producing test generators
for a program under test given one or more known
buggy inputs. Our learned test generators are
input perturbations, functions implemented as
sequential compositions of datatype operations
that transform one erroneous input into
another. These perturbations can thus be used to
generate additional error-producing inputs from
some initial set of buggy inputs. Our results
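To make the idea of a perturbation concrete, the following is a minimal sketch (not the paper's implementation; all names and operations here are illustrative assumptions) of how a perturbation could be expressed as a sequential composition of datatype operations mapping one buggy input to another:

```python
# Illustrative sketch: a perturbation as a sequential composition of
# list-datatype operations. Names (insert_at, drop_at, compose) are
# hypothetical, not from the paper.

def insert_at(xs, i, v):
    """Return a copy of xs with v inserted at index i."""
    return xs[:i] + [v] + xs[i:]

def drop_at(xs, i):
    """Return a copy of xs with the element at index i removed."""
    return xs[:i] + xs[i + 1:]

def compose(*ops):
    """Apply the given operations in sequence to an input."""
    def perturb(x):
        for op in ops:
            x = op(x)
        return x
    return perturb

# Suppose a blackbox component faults on unsorted lists, and [3, 1, 2]
# is a known buggy input. A learned perturbation might prepend an
# element and drop the last one, yielding another unsorted input.
perturbation = compose(
    lambda xs: insert_at(xs, 0, 9),        # [3, 1, 2] -> [9, 3, 1, 2]
    lambda xs: drop_at(xs, len(xs) - 1),   # [9, 3, 1, 2] -> [9, 3, 1]
)

new_buggy = perturbation([3, 1, 2])  # [9, 3, 1], still unsorted
```

Applying such a learned perturbation repeatedly to a small seed set of buggy inputs would grow a larger collection of error-producing inputs without any access to the component's source.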
demonstrate that perturbation learning can
effectively and systematically generalize from a
small set of known errors in the presence of
blackbox components, providing significant
benefits to data-driven analysis and
verification tools.