Countless devices around the world are connected by networks and communicate via network protocols. Like any other software, protocol implementations suffer from bugs, many of which cause only silent data corruption rather than crashes. Hence, existing automated bug-finding techniques that focus on memory safety, such as fuzzing, can hardly detect them. In this work, we propose a static differential analysis called ParDiff to find protocol implementation bugs, especially those that hide in message parsers but do not cause crashes. Our key observation is that a network protocol often has multiple implementations, and any semantic discrepancy between them may indicate a bug. However, different implementations are often written in disparate styles (e.g., using different data structures or different control structures), which makes it challenging to directly compare two implementations, even of the same protocol.

To exploit this observation and effectively compare multiple protocol implementations, ParDiff (1) automatically extracts finite state machines that represent protocol format specifications from programs, and (2) leverages bisimulation and SMT solvers to find fine-grained semantic inconsistencies between them. We have extensively evaluated our approach on 14 network protocols, each with two different implementations. The results show that ParDiff discovers bugs in protocol parsers with higher precision than both differential symbolic execution and differential fuzzing tools. To date, we have detected 41 logical bugs, 25 of which have been confirmed by developers. The baseline DPIFuzz detects only 3 of them within the same time budget, and only 25 of them even at 720× the time cost.
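
To make the comparison step concrete, the following toy sketch contrasts two hand-written parser state machines for a hypothetical two-byte message (a type byte followed by a length byte). It is not ParDiff's algorithm: the machines are written by hand rather than extracted from code, a breadth-first search over the product of the two machines stands in for bisimulation, and brute-force enumeration of byte values stands in for SMT queries. All names (`strict`, `lenient`, `find_discrepancy`) are assumptions made for illustration.

```python
from collections import deque

DEAD = "<dead>"  # sink state: the parser has rejected the input


def step(fsm, state, byte):
    """Follow the first transition whose guard admits `byte`, else reject."""
    for guard, nxt in fsm["trans"].get(state, []):
        if byte in guard:
            return nxt
    return DEAD


def find_discrepancy(a, b):
    """Return a shortest input on which exactly one machine accepts, or None."""
    start = (a["start"], b["start"])
    queue = deque([(start, b"")])
    seen = {start}
    while queue:
        (sa, sb), prefix = queue.popleft()
        if (sa in a["accept"]) != (sb in b["accept"]):
            return prefix  # witness input that separates the two parsers
        for byte in range(256):
            nxt = (step(a, sa, byte), step(b, sb, byte))
            if nxt == (DEAD, DEAD) or nxt in seen:
                continue  # both reject, or this state pair was already explored
            seen.add(nxt)
            queue.append((nxt, prefix + bytes([byte])))
    return None


# Hypothetical format: type byte in {1, 2}, then a length byte.
# `strict` enforces length < 64; `lenient` forgets the bounds check.
strict = {
    "start": "type",
    "accept": {"done"},
    "trans": {"type": [({1, 2}, "len")], "len": [(range(0, 64), "done")]},
}
lenient = {
    "start": "type",
    "accept": {"done"},
    "trans": {"type": [({1, 2}, "len")], "len": [(range(0, 256), "done")]},
}

witness = find_discrepancy(strict, lenient)
print(witness)  # a message accepted by only one of the two parsers
```

The returned witness is a message that one parser accepts and the other rejects, which is the kind of non-crashing inconsistency a differential analysis reports. A solver-based comparison would reason about the guards symbolically instead of enumerating all 256 byte values per transition.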