The code responsible for serializing and
deserializing un- trusted external data is a
vital component of any software that
communicates with the outside world, as any bugs
in these components can compromise the entire
system. This is particularly true for verified
systems which rely on trusted code to process
external data, as any defects in the parsing
code can invalidate any formal proofs about the
system. One way to reduce the trusted code base
of these systems is to use interface generators
like Protocol Buffer and ASN.1 to generate
serializers and deserializers from data
descriptors. Of course, these generators are not
immune to bugs.
In this work, we formally verify a compiler for
a realistic subset of the popular Protocol Buffer
serialization format using the Coq proof
assistant, proving once and for all the
correctness of every generated serializer and
deserializer. One of the challenges we had to
overcome was the extreme flexibility of the
Protocol Buffer format: the same source data can
be encoded in an infinite number of ways, and the
deserializer must faithfully recover the original
source value from each. We have validated our
verified system using the official conformance
tests.