Foundationally Verified Data Plane Programming

Foundationally Verified

Data Plane Programming

Qinshi Wang

A Dissertation

Presented to the Faculty

of Princeton University

in Candidacy for the Degree

of Doctor of Philosophy

Recommended for Acceptance

by the Department of

Computer Science

Adviser: Prof. Andrew W. Appel

September 2023

Abstract

P4 is a major standardized programming language for programming and specifying

the network data plane. P4 is widely used in a variety of network functionalities, in-

cluding monitoring, traﬃc management, forwarding, and security. Recently, stateful

applications have been emerging in this area, as supported by programmable hard-

ware. Typical stateful applications include network telemetry (heavy hitters, dis-

tributed denial-of-service (DDoS) detection, performance monitoring), middleboxes

(ﬁrewalls, network address translation (NAT), load balancers, intrusion detection),

and distributed services (in-network caching, lock management, conﬂict detection).

Their complexity and rich properties are beyond the ability of existing P4 veriﬁers.

In this thesis, we propose Veriﬁable P4: a new framework for P4 program veriﬁca-

tion based on interactive theorem proving that is (1) capable of proving multi-packet

properties, (2) modular in terms of the structure of P4 programs, and (3) foundation-

ally sound with respect to a mechanized formal semantics of P4. In order to achieve

these goals, we built (1) a mechanized formal semantics of P4 more comprehensive

and convenient than existing formal semantics, (2) a set of program logic rules that

are proven sound, and (3) an interactive veriﬁcation system based on the program

logic and Coq tactic mechanism. We veriﬁed a stateful ﬁrewall fully implemented in

P4 using a sliding-window Bloom ﬁlter with Veriﬁable P4 and evaluated its utility.

Acknowledgements

It was a long and challenging journey to pursue a Ph.D. degree, and it is my

fortune to have received support from so many people.

First, I would like to express my gratitude my advisor, Andrew W. Appel, who led

me to the fantastic ﬁeld of software veriﬁcation, and provided great guidance during

my Ph.D. studies. In Spring 2018, when I was looking for a research direction that

is both theoretically interesting and practically useful, I took his course on theorem

proving and programming languages and fell in love with this ﬁeld. As my advisor,

he is a smart guide on research, a patient mentor on writing, and an invaluable

cornerstone in my academic journey.

I appreciate my thesis committee, Nate Foster, David Walker, Jennifer Rexford,

and Zachary Kincaid. Their insightful feedback and constructive criticism have sig-

niﬁcantly enhanced my research and the quality of this thesis.

I am grateful to Qinxiang Cao, who has been a great model of success and mentor

for me in both competitions and research since my high school days.

I would like to acknowledge my collaborators, Andrew W. Appel, Lennart

Beringer, Qinxiang Cao, Joshua Cohen, Ryan Doenges, Mengying Pan, Rudy Pe-

terson, Vilhelm Sj¨oberg, and Shengyi Wang. It was my honor and pleasure to have

worked with them.

I am thankful to my friends, Tingting Cai, Xiaoqi Chen, Jiaxin Guan, Guanhua

He, Zhongtian He, Junnan Hu, Renzhi Jing, Changshuo Liu, Shang Mu, Yuqi Nie,

Liqun Peng, Zhiyi Ren, Weiyi Tang, Chenggong Wang, Lianyong Wang, Gaoyuan

Wu, Yantao Wu, Zhelun Wu, Dazhi Xi, Zhaojian Xu, Weikun Yang, Conghao Yi,

Qiang Zhang, Wenda Zhang, Zidong Zhang, Yuyan Zhao, and Yuqing Zhu, who have

made my life brighter and more colorful.

I would like to express my gratitude to my family friends Lan Bin and Fred Ling,

who have oﬀered guidance and care as if I were their nephew.

Finally, I wish to express my most profound appreciation to my parents, Qian

Xu and Weibing Wang. Their unwavering love, boundless support, endless patience,

and constant belief in my dreams have been the cornerstone of my academic jour-

ney. Especially when the COVID-19 pandemic brought unprecedented challenges that

destabilized my life, their guidance and reassurance were a beacon of hope, enabling

me to persevere and continue pursuing my goals with determination in those trying

times.

This material is based upon work supported by the Defense Advanced Research

Projects Agency (DARPA) under Contract No. HR001120C0160.

Contents

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1 Introduction 12

1.1 Introduction to P4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

1.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

1.2.1 Early P4 Veriﬁers . . . . . . . . . . . . . . . . . . . . . . . . . 22

1.2.2 Aquila . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

1.2.3 Π4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

1.2.4 Petr4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

1.3 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2 P4 Semantics 28

2.1 P4light and the front end . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.2 Instantiation phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.2.1 Function lookup . . . . . . . . . . . . . . . . . . . . . . . . . . 37

2.2.2 Abstract methods . . . . . . . . . . . . . . . . . . . . . . . . . 38

2.3 Execution phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

2.3.1 Uninitialized bits . . . . . . . . . . . . . . . . . . . . . . . . . 40

2.3.2 Nondeterministic semantics . . . . . . . . . . . . . . . . . . . 41

2.3.3 Semantic rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

2.4 Comparison with Petr4 . . . . . . . . . . . . . . . . . . . . . . . . . . 50

2.5 Architecture speciﬁcation . . . . . . . . . . . . . . . . . . . . . . . . . 57

2.6 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

3 Program Logic 60

3.1 Hierarchical extern predicates . . . . . . . . . . . . . . . . . . . . . . 62

3.2 Program logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

3.2.1 Program logic rules . . . . . . . . . . . . . . . . . . . . . . . . 72

4 Tactic-Based Veriﬁer 75

4.1 Walkthrough . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

4.2 Tactics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

4.3 Automated proof of MOD-clauses . . . . . . . . . . . . . . . . . . . . 83

5 Application Demonstration: Sliding-Window Bloom Filter 86

5.1 Sliding-window Bloom ﬁlter . . . . . . . . . . . . . . . . . . . . . . . 87

5.1.1 P4 implementation . . . . . . . . . . . . . . . . . . . . . . . . 90

5.2 Veriﬁcation organization . . . . . . . . . . . . . . . . . . . . . . . . . 95

5.2.1 Axiomatic interface of SBF . . . . . . . . . . . . . . . . . . . 97

5.3 Veriﬁcation of sliding-window Bloom ﬁlter . . . . . . . . . . . . . . . 99

5.3.1 Concrete functional model . . . . . . . . . . . . . . . . . . . . 99

5.3.2 Function speciﬁcations . . . . . . . . . . . . . . . . . . . . . . 101

5.3.3 Veriﬁcation of abstract methods . . . . . . . . . . . . . . . . . 102

5.3.4 Abstract functional model . . . . . . . . . . . . . . . . . . . . 104

5.3.5 Reﬁnement proof . . . . . . . . . . . . . . . . . . . . . . . . . 106

5.4 Veriﬁcation of stateful ﬁrewall . . . . . . . . . . . . . . . . . . . . . . 109

5.4.1 Verifying the P4 program of stateful ﬁrewall . . . . . . . . . . 110

5.4.2 Switch model . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

5.4.3 Flow property proof . . . . . . . . . . . . . . . . . . . . . . . 111

6 Conclusion 113

6.1 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

A Programs Omitted in the Main Text 116

A.1 P4 implementation of sliding-window Bloom ﬁlter and stateful ﬁrewall 116

A.2 Concrete functional model . . . . . . . . . . . . . . . . . . . . . . . . 130

Bibliography 145

List of Tables

3.1 Truth table of bitwise abstract interpretation . . . . . . . . . . . . . . 71

List of Figures

1.1 Very Simple Switch (VSS) architecture . . . . . . . . . . . . . . . . . 16

1.2 Architecture header ﬁle of VSS . . . . . . . . . . . . . . . . . . . . . 17

2.1 Example of instantiation . . . . . . . . . . . . . . . . . . . . . . . . . 32

2.2 Pseudocode of instantiation . . . . . . . . . . . . . . . . . . . . . . . 35

2.3 Semantics rules for function lookup . . . . . . . . . . . . . . . . . . . 38

2.4 An example of abstract method . . . . . . . . . . . . . . . . . . . . . 39

2.5 L-value evaluation, read and write rules of P4light . . . . . . . . . . . 45

2.6 Semantics rules for argument evaluation . . . . . . . . . . . . . . . . 46

2.7 Semantics rules for statements . . . . . . . . . . . . . . . . . . . . . . 47

2.8 Semantics rules for function call expressions . . . . . . . . . . . . . . 48

2.9 Semantics rules for functions . . . . . . . . . . . . . . . . . . . . . . . 49

2.10 Petr4’s Core P4 semantic rules for objects . . . . . . . . . . . . . . . 51

2.11 Comparison with Petr4 on location manipulation . . . . . . . . . . . 53

2.12 P4 program in Example 2.1 . . . . . . . . . . . . . . . . . . . . . . . 54

3.1 Simple form of function speciﬁcation . . . . . . . . . . . . . . . . . . 61

3.2 Inference rules for well-formedness . . . . . . . . . . . . . . . . . . . . 65

3.3 Function speciﬁcation . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

3.4 Sample program logic rules . . . . . . . . . . . . . . . . . . . . . . . . 72

4.1 An example P4 control block in V1Model . . . . . . . . . . . . . . . . 77

4.2 Function speciﬁcation of Increment . . . . . . . . . . . . . . . . . . . 77

4.3 Proof script of Increment . . . . . . . . . . . . . . . . . . . . . . . . . 79

5.1 Panes in a sliding window Bloom ﬁlter . . . . . . . . . . . . . . . . . 89

5.2 Behavior of execute in RegisterAction . . . . . . . . . . . . . . . . . . 91

5.3 Overview. Rounded rectangles represent blocks of deﬁnitions; paral-

lelograms represent blocks of proofs. A solid line indicates that a block

depends on the implementation of another block; a dashed line indi-

cates that a block depends only on the interface of another block. A

proof block usually establishes the relationship between two deﬁnition

block and therefore depends on these two blocks. The proof body is ir-

relevant to its usage, so proof blocks are always depended with dashed

lines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

5.4 Axiomatic interface of SBF . . . . . . . . . . . . . . . . . . . . . . . . 98

5.5 Function Speciﬁcations with concrete functional model . . . . . . . . 101

5.6 Function speciﬁcations in RegisterAction . . . . . . . . . . . . . . . . 103

5.7 Function Speciﬁcations relating the abstract functional model . . . . 108

5.8 Switch model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

Chapter 1

Introduction

Nowadays, computer networking is indispensable almost everywhere. Network hosts

are connected through network switches. Traditionally, network switches are ﬁxed-

function devices, whose behavior is speciﬁed and implemented by their manufactur-

ers. In recent years, people have become interested in programming languages for

switches, for two reasons: ﬁrst, in order to secure the network, it is worth specifying

and reasoning about networks, including the behavior of switches and links; second,

programmable switches are prevalent in scenarios demanding ﬂexibility, especially

since programmable switches have evolved to be as fast as ﬁxed-function switches, we

need a programming language for them.

P4 [5, 12] is the most widespread programming language for specifying and pro-

gramming switches. [22] But P4 is a low-level language, and P4 programs are often

contorted to ﬁt within the constraints of resources in a particular target architecture,

which is designed to process billions of packets per second. So the correctness of

these programs has become a concern. To address that concern, there are several

veriﬁcation tools for P4 programs (see Section 1.2).

In many classic P4 applications, processing a packet does not typically change

the state of the switch. But recently there are new applications for programmable

data planes, in which each packet changes the state of the switch and aﬀects how

the following packets are processed. These applications include network telemetry

systems (SketchLib [31], BeauCoup [9], FlowRadar [27]), network functions (SilkRoad

[30]), and distributed services (NetCache [24], NetLock [46]). Misconﬁguration in such

programs may lead to serious network failures. But the existing veriﬁcation tools

cannot reason about the packet-to-packet state changes of these stateful programs.

For example, consider a stateful ﬁrewall that protects the internal network from

unsolicited traﬃc. External packets should be permitted to enter through the ﬁrewall

only if they are responses to recent outgoing requests to the same IP address. The

stateful ﬁrewall remembers recent outgoing packet headers. We need to verify a prop-

erty about multiple packets: no valid incoming responses are blocked. A small rate of

false positives is tolerated, i.e., allowing incoming packets that are not responses.

The multi-packet property that we want to verify can be formally written as

follows. Let T be the valid response time window, h be the list of historical packets,

p be the current packet and r be the action on p (forwarding or dropping it). We

want that for every integer i, we have

p.dir = in ∧ h[i].dir = out ∧ h[i].dst = p.src ∧

h[i].src = p.dst ∧ p.t − h[i].t ≤ T =⇒ r = forward.

To verify such programs, we need stateful reasoning. None of the existing veri-

ﬁers can properly characterize the way the switch’s state changes per packet, either

because they don’t handle state at all, or because their speciﬁcation languages are

too weak to properly relate the pre-state to the post-state after processing a packet

(see Section 1.2); and none of existing veriﬁers can reason about such multi-packet

properties.

Arbitrary multi-packet properties cannot be proved fully automatically.

So we

embed our veriﬁer in an interactive proof assistant (Coq)—this allows a very general-

purpose logic in which almost any kind of mathematics can be expressed. Inevitably,

in such general math one cannot always get 100% proof automation. To minimize

proof workload, we use Coq’s programmability to automate where possible (matching

P4 to functional models) and we use interactive proof where necessary (proving multi-

packet properties from per-packet state changes).

Modular proofs clarify protocols between modules and make modules reusable.

In order to modularize P4 programs with stateful objects, we propose a hierarchical

representation of states used in semantics, speciﬁcation, and veriﬁcation that im-

proves modularity; unlike some previous P4 semantics, we enforce a phase distinction

between instantiation (that populates this hierarchy) and run-time packet processing.

All programs, including veriﬁcation tools, can have bugs. P4 veriﬁers with bugs

can “verify” something that isn’t true. The correctness concern of P4 veriﬁers has

not been fully addressed in the previous work (see Section 1.2). Our veriﬁer has a

once-and-for-all machine-checked soundness guarantee: if it can prove a property of

your program, then your program behaves that way in the P4 semantics.

Contributions We have built Veriﬁable P4, which is a P4 veriﬁcation system in

Coq that supports very rich speciﬁcations, especially for stateful programs. The veri-

ﬁcation system also makes modular veriﬁcation possible, and the veriﬁcation result is

foundational. We have applied Veriﬁable P4 to a stateful ﬁrewall purely implemented

in P4 using a sliding-window Bloom ﬁlter, and proved the multi-packet property that

solicited packets are never dropped. In particular, we make the following technical

contributions:

P4 programs does not have loops or recursions for a single packet, so it might be possible to

automatically reason about single-packet processing. But lifting from single-packet processing to

multi-packet processing involves induction, which is known to be diﬃcult.

1. We revisit the formal semantics in Petr4 [18], and propose a phase distinction

between instantiation (compile-time allocation) and execution (run-time packet

processing). This phase distinction makes the semantics clearer and easier to

reason about.

2. We built a mechanized semantics for P4 in Coq, guided by the P4

Language

Speciﬁcation [12].

3. We propose a hierarchy for states, semantics, speciﬁcation, and veriﬁcation that

improves modular veriﬁcation. We deﬁne hierarchical predicates to specify state

of extern objects.

4. We demonstrate the program logic and veriﬁcation tool by verifying a high-

performance implementation of a stateful ﬁrewall that uses a sliding-window

Bloom ﬁlter.

1.1 Introduction to P4

P4 is a standardized programming language for both programming programmable

network switches and specifying nonprogrammable network switches. This section

gives a brief introduction to readers not familiar with P4, by giving a walkthrough

example and pointing out some key aspects of P4.

P4 runs on diﬀerent hardware/software platforms, each of which is called a P4

target. A P4 architecture is the interface that describes the programming model

of a target. A target may support multiple architectures, which makes the target

compatible with programs written for diﬀerent architectures. An architecture may

be implemented by multiple targets, too.

Figure 1.1 illustrates the Very Simple Switch (VSS) architecture (adapted from the

Speciﬁcation [12]), which is a P4 architecture for introductory purposes. It has

three P4-programmable components: the parser, the match-action pipeline, and the

Parser

Match-action

pipeline

Deparser

input

packet

input

headers

payload

output

headers

output

packet

Control plane

tables

Figure 1.1: Very Simple Switch (VSS) architecture

deparser. The remaining parts of the switch are nonprogrammable. When a packet

is processed by a VSS switch, it is ﬁrst parsed by the programmable parser, and the

headers are passed to the match-action pipeline while the rest of the packet, called

the payload, is directly passed to the deparser. The match-action pipeline produces

output headers by performing operations determined by the P4 program and tables.

These tables can be dynamically updated by the control plane. The deparser then

reassembles the output headers with the payload to form the output packet.

Figure 1.2 is the architecture header ﬁle of the VSS architecture, which should

be included by every P4 program on VSS. For a real architecture, such a ﬁle should

be provided by the manufacturer. It deﬁnes the interface

of each P4-programmable

block, namely Parser, Pipe, and Deparser. Each P4-programmable block is an instance,

which will be introduced later, but its entrance can be viewed as a P4 function,

and

its function signature has to match the corresponding interface. These interfaces do

not have return values. In fact, although there are some other functions in P4 that

have return values, a diﬀerent value-passing mechanism, called copy-in/copy-out, is

consistently used instead of return values.

They are called parser types/control types in the P4 speciﬁcation.

We use a more general deﬁnition of “function” than the P4 Speciﬁcation. Any entity callable

at runtime is regarded as a function.

#include <core.p4> // P4’s standard core library

struct std meta t {

bit<9> ingress port;

bit<9> egress port;

}

parser Parser<H>(packet in pkt, out H hdrs, inout std meta t std meta);

control Pipe<H>(inout H hdrs, inout std meta t std meta);

control Deparser<H>(packet out pkt, in H hdrs);

package VSS<H>(Parser<H> p, Pipe<H> pipe, Deparser<H> dp);

extern Checksum16 {

Checksum16(); // constructor

void clear(); // reset checksum

void update<T>(in T data); // add data to checksum

void remove<T>(in T data); // remove data from existing checksum

bit<16> get(); // get the checksum for the data added since last clear

}

Figure 1.2: Architecture header ﬁle of VSS

In the copy-in/copy-out mechanism, function parameters are usually directed,

including in, out, inout. in parameters are normal parameters passed from the caller

to the function; out parameters are similar to return values and passed from the

function to the caller; inout parameters serve the role of both in and out parameters.

In addition, there are directionless parameters, such as pkt in the interfaces Parser

and Deparser. Directionless parameters are evaluated at compile time, so they must

be compile-time known values. Unlike directional parameters, they can be objects,

such as packet in and packet out, which represent the input packet and the output

packet, respectively. Package interface VSS deﬁnes that a P4 program for a VSS

switch consists of a Parser, a Pipe, and a Deparser.

The architecture header ﬁle also contains the interface of extern objects and func-

tions available in the architecture. While the expressiveness of P4 is limited, each

target hardware provides various additional functionalities, which are modeled as

extern objects and functions in P4. For example, VSS supports an extern object,

Checksum16. The user can create an instance of Checksum16 and use its methods to

calculate the checksum of a header. packet in and packet out are also extern objects,

which are deﬁned in core.p4. The library core.p4 is a standard ﬁle. It deﬁnes standard

types of P4, and should be included by every P4 program.

The architecture can impose diﬀerent restrictions on each programmable block,

tailored to the hardware capabilities. For example, in VSS, Checksum16 can only be

used in the parser and the deparser, and tables can only be deﬁned in the match-

action pipeline. Although these restrictions are not described in the header ﬁle, the

compiler may reject programs that violate such restrictions. Also, the hardware has

limited resources, so programs may be rejected by the compiler because it cannot ﬁnd

a layout to ﬁt the program in the hardware.

In the following paragraphs, we show an example P4 program on the VSS archi-

tecture that performs the standard IPv4 forwarding, oﬀering a concrete example for

the fundamental concepts and structure of P4 programs.

1 #include <vss.p4>

3 header Ethernet h {

4 EthernetAddress dstAddr;

5 EthernetAddress srcAddr;

6 bit<16> etherType;

7 }

9 struct hdrs t {

10 Ethernet h ethernet;

11 IPv4 h ip;

12 }

The P4 program starts by including the architecture description and deﬁning

header types. Ethernet h is the Ethernet header, which consists of three ﬁelds. A

header in P4 is like a struct,

except that it includes an extra bit indicating its

struct in P4 is similar to struct in C.

validity. The ﬁelds of a header are meaningful only when the header is valid. We

omit the deﬁnition of IPv4 header (IPv4 h). The type hdrs t is a struct that includes

the headers to manipulate for a packet.

13 parser MyParser(packet in pkt, out hdrs t hdrs, inout std meta t std meta) {

14 Checksum16() ck; // instantiate checksum unit

16 state start {

17 pkt.extract(hdrs.ethernet);

18 transition select(p.ethernet.etherType) {

19 0x0800: parse ipv4;

20 }

21 }

23 state parse ipv4 {

24 b.extract(p.ip);

25 ck.clear();

26 ck.update(p.ip);

27 // Verify that packet checksum is zero

28 verify(ck.get() == 16w0, error.IPv4ChecksumError);

29 transition accept;

30 }

31 }

Then comes the parser declaration MyParser. A P4 parser is a state machine in

which each state executes a program block and transitions to another state. It begins

from the state called start. In this example, the parser extracts the Ethernet header

from the packet, and examines the EtherType. It transitions to the parse ipv4 state

if the EtherType is 0x0800. Otherwise, the transition fails and the packet is rejected.

In parse ipv4, it extracts the IPv4 header from the packet and veriﬁes its checksum.

The headers as result of parsing are passed out through the out parameter hdrs.

32 control MyPipe(inout hdrs t hdrs, inout std meta t std meta) {

33 IPv4Address nextHop;

34 action forward(IPv4Address nextHop, bit<9> port) {

35 nextHop = nextHop;

36 std meta.egress port = port;

37 hdrs.ipv4.ttl = hdrs.ipv4.ttl − 1;

38 }

40 action drop() {

41 std meta.egress port = DROP PORT;

42 }

44 table routing table {

45 key = {

46 hdrs.ipv4.dstAddr: lpm;

47 }

48 actions = {

49 forward();

50 drop();

51 }

52 default action = drop();

53 }

55 apply {

56 if (hdrs.ip.ttl ≤ 1)

57 drop();

58 routing table.apply();

59 dmac table.apply();

60 smac table.apply();

61 }

62 }

We then have a control declaration MyPipe, which programs the match-action

pipeline block. It analyzes the input headers and determines the packet’s forwarding

behavior. While it is simple in this example, this block is where the core logic is

implemented in more sophisticated applications.

The match-action pipeline consists of match-action tables. Each table contains

some entries, and invoking a table involves ﬁnding the entry that matches the keys,

and executing the action speciﬁed by the entry. The entries may be either maintained

dynamically by the control plane, or hardcoded in the P4 program as constants. In

this example, routing table is the conventional IPv4 routing table, whose entries are

maintained by the control plane, so the entries are not in the P4 program and may be

dynamically changed by the control plane between packets. The table examines the

destination address in the IPv4 header (as speciﬁed on line 46), and executes either

the action forward or the action drop. The parameters of the action forward do not

have direction annotations, which means they are given by the matched entry when

the table invokes forward. At the end of MyPipe is the apply block (line 55), which

is the main body of MyPipe. It also invokes tables dmac table and smac table, whose

deﬁnitions in MyPipe are omitted.

63 control MyDeparser(packet out pkt, in hdrs t hdrs) {

64 Checksum16() ck;

65 apply {

66 pkt.emit(hdrs.ethernet);

67 if (hdrs.ip.isValid()) {

68 ck.clear(); // prepare checksum unit

69 hdrs.ip.hdrChecksum = 16w0;

70 ck.update(p.ip); // compute new checksum according to RFC 791

71 hdrs.ip.hdrChecksum = ck.get();

72 }

73 pkt.emit(p.ip);

74 }

75 }

We then have a control declaration MyDeparser, i.e. the deparser block. It takes

the headers determined by the match-action pipeline to construct the output packet.

In this block, the function pkt.emit inserts a header into the packet.

76 VSS(MyParser(), MyPipe(), MyDeparser()) main;

The code blocks MyParser, MyPipe, and MyDeparser are declarations, they need

to be instantiated before being invoked. For example, the expression “MyParser()”

creates an instance of MyParser, and this instance is used in the instantiation of the

main package “VSS(...) main”, which represents the switch in the P4 program. So the

switch can invoke MyParser as its Parser block.

A parser or control declaration can be instantiated multiple times, so the instanti-

ation mechanism allows the code to be reused modularly to create more complicated

programs. In order to understand instantiation, we can compare to object-oriented

programming (OOP). Parser or control declarations are similar to classes in OOP,

and instances are similar to objects. In this thesis, we use classes to refer to parser

or control declarations, and use objects to refer to instances. We will discuss more

details about instantiation in Section 2.2.

In summary, the aspects in which P4 diﬀers from conventional imperative pro-

gramming languages are: instantiation, copy-in/copy-out mechanism, parsers, tables,

extern objects and functions.

We have given a brief introduction for readers unfamiliar with the P4 language

to understand the content of this thesis smoothly. For more details about the P4

language, we refer to the P4 Speciﬁcation [12]. For more applications of P4, we refer

to a comprehensive survey paper [22].

1.2 Related Work

In this section, we summarize existing works on P4 program correctness. We will

make more detailed comparisons of our results to other works in speciﬁc chapters of

the thesis from the programming language and veriﬁcation perspective.

1.2.1 Early P4 Veriﬁers

Bugs in P4 programs have been of concern since the invention of the language. Three

ﬁrst generation veriﬁers were built in 2018, including p4v [28], Vera [40], and ASSERT-

P4 [33]. These three veriﬁers are generally similar: they are all automatic and mono-

lithic tools to check properties less complex than full functional correctness for stateful

programs.

For example, if a P4 program accesses a ﬁeld of an invalid header or drops a

packet due to a bug mishandling a packet ﬁeld, these veriﬁers can identify the bug

and generate a counterexample. However, these veriﬁers are not capable to express

or verify properties that involve multiple packets, which are fundamental properties

of stateful applications.

p4v is an automatic veriﬁer for P4

It veriﬁes relatively simple properties,

including general safety properties (e.g. header validity), architectural properties (e.g.

parser roundtripping, which means the composition of deparsing and reparsing is the

identity function), and simple application-speciﬁc properties that are expressible by

Boolean logic and linear integer arithmetic. p4v translates P4 into guarded command

language [17]. The translation procedure unrolls P4 parsers into straight-line code and

rejects programs that contains unproductive cycles that do not extract any headers.

From guarded command language, p4v generates veriﬁcation conditions and checks

them using Z3 [15].

The interesting technique is the treatment of tables that are ﬁlled by the control

plane. The entries of these tables are not given in the P4 program. Instead, the

control plane can modify them to change the packet processing policy, so we cannot

determine the action invoked by the table. To address this problem, p4v uses a

control plane interface to characterize the possibilities of each table, and includes all

the possible execution traces for the veriﬁcation. Vera also uses this technique, where

it is called symbolic entries.

Vera is also an automatic veriﬁer for P4

programs. It translates a P4 program

into the SEFL language in Symnet [41], which is a symbolic execution system for net-

work systems. It then uses Symnet to examine all possible execution paths of the P4

is the predecessor language to P4

program. Vera automatically detects common mistakes and hazards, including im-

plicit drops, manipulating dropped packets, invalid header access, out-of-bounds ar-

ray accesses, and arithmetic overﬂows. Packets may be recirculated in the switch, i.e.

the P4 program may send a packet back to some earlier stages to process the packet

again. To reason about packet recirculation, Vera detects loops without progress, and

supports speciﬁcations in a subset of computation tree logic (CTL).

ASSERT-P4 is a veriﬁcation tool for P4

, which translates the P4 program into

a C program together with the speciﬁcation to analyze using KLEE [6]. However, as

we will show in Section 2.3, the semantics of C and the semantics of P4 are diﬀerent,

making this encoding inaccurate and probably causing false reports of bugs.

These three veriﬁer have made their contributions to the correctness of P4 pro-

grams, but they do not handle stateful objects well. Although p4v supports stateful

objects, we did not ﬁnd any evidence that p4v can relate initial and ﬁnal states. For

example, it cannot express that the ﬁnal state of a register is obtained by modifying

a particular position from the initial state. Vera [40] uses an extremely expensive

encoding that is proportional to the size of registers (impractical when the regis-

ter contains an entire hash table). ASSERT-P4 does not claim to support stateful

objects.

1.2.2 Aquila

Aquila [42] supports a more convenient assertion language, multi-pipeline control,

more time-eﬃcient veriﬁcation, and bug localization when the veriﬁcation claims a

bug. But Aquila oversimpliﬁes registers into ﬁelds without indexes, so it could not

verify programs such as the stateful ﬁrewall. Aquila also reduces the risk of bugs in the

veriﬁer by translation validation—checking whether its intermediate representation is

equivalent to the counterpart generated by Gauntlet [38] (which is a tool for ﬁnding

bugs in P4 compilers). But that does not address other software bugs, e.g. the bugs

in manipulating assertions, especially when we need a rich and modular assertion

language.

1.2.3 Π4

Π4 [20] is a research language and dependent reﬁnement type system designed for a

subset of P4. This subset involves headers, parsers, deparsers, but not extern objects

and persistent state. The type system uses reﬁnement types to represent constraints

on values. For example, {y : τ | 0 ≤ y.a < N} represents that variable a is in [0, N).

A dependent type of the form (x : τ

) → τ

is used to deﬁne a type for a program

segment, which means when starting with a state x that satisﬁes type τ

, the program

segment is safe and resulting state satisﬁes τ

. In particular, x may be referred to in

to connect the initial and ﬁnal states, e.g. τ

being {y : τ | y.a = x.a} means a’s

value stays the same.

Type checking in this system is similar to a veriﬁcation task. Reﬁnement types

are assertions, and dependent types for program segments are function contracts. Π4

also uses an SMT solver to check these types. So Π4 is similar to a veriﬁer. Compar-

ing with previous veriﬁers, Π4 is compositional: once a program segment satisﬁes a

function contract, one can use the contract to verify the whole program that uses the

program segment. For example, this allows a device vendor to specify a ﬁxed-function

component such that it can be compositionally veriﬁed with user-deﬁned components,

and the vendor can modify the ﬁxed-function component without breaking the veri-

ﬁcation, as long as the function contract of the ﬁxed-function component still holds.

Π4 does not consider stateful objects, e.g. registers, so it cannot be used to reason

about stateful behaviour. Also, it does not have scopes to keep variables private in

modules.

1.2.4 Petr4

Petr4 [18] is a study of P4’s formal semantics, which gives us an important reference.

But Petr4’s semantics does not have a machine-checkable formalization, and mixes

instantiation and execution, which means it “instantiates at runtime” and has to de-

ﬁne each control instance as a closure. This makes the semantics less straightforward

and makes it much more challenging to prove the program logic and the type system

sound. We improve this with a phase distinction between instantiation (Section 2.2)

and execution. We will compare our formalization with Petr4 in much more detail in

Section 2.4. We also identify and ﬁx some bugs in Petr4 during the formalization in

Coq.

Summary Although there are several previous P4 veriﬁers, none of them address

the problem of writing and verifying nontrivial stateful programs, such as programs

based on data structures (sliding-window Bloom ﬁlter, count-min sketch, etc.).

1.3 Organization

It is assumed that the reader has basic knowledge about formal operational semantics

and program logic (e.g. Hoare logic). Readers may refer to textbooks [37, 36] for an

introduction to these concepts.

The rest of the thesis is organized as follows. Chapter 2 presents our formalization

and mechanization of P4’s operational semantics (by the author and Mengying Pan).

Chapter 3 shows the design of the Veriﬁable P4 program logic and its soundness (by

the author, with assistance of Shengyi Wang). Chapter 4 describes the veriﬁcation

system based on the program logic and implemented using Coq’s Ltac (by the author,

with assistance of Mengying Pan and Shengyi Wang). Chapter 5 demonstrates an

end-to-end veriﬁed stateful ﬁrewall that combines the P4 program veriﬁcation using

the veriﬁer and property proof in Coq (by the author, Shengyi Wang and Lennart

Beringer). Chapter 6 concludes the thesis and discusses future work directions.

We described much of this work in a paper published in the 2023 Conference on

Interactive Theorem Proving [45] in a short version.

Chapter 2

P4 Semantics

The term “foundationally veriﬁed” means that the correctness or desired properties

of a program is veriﬁed with respect to its formal semantics. So the ﬁrst step towards

foundationally verifying P4 programs is to deﬁne the operational semantics of the

P4 language. The authority document of P4 semantics is the P4

Speciﬁcation [12]

written by the P4 Committee. But it is a document written in natural language and

pseudocode, instead of formal language. In order to clear the jungle, Doenges et al.

made an important initiative with Petr4 [18], a pen-and-paper formalization of P4’s

operational semantics. But Petr4 is not implemented in a proof assistant, so it does

not support mechanized proofs. Petr4 semantics also left some unresolved ambiguities

and unsupported P4 features.

In this chapter, we describe how we built a mechanized formalization of P4 se-

mantics by inheriting and improving Petr4. We introduce a phase distinction in P4’s

semantics that separates it into the instantiation phase (Section 2.2) and the exe-

cution phase (Section 2.3). This phase distinction mimics the actual deployment of

P4 programs: the compiler evaluates the P4 program itself, allocates hardware re-

sources, and produces a hardware program, and the hardware processes the packets

without higher-order computation and resource allocation. The instantiation phase

and the execution phase correspond to the compiler and the hardware, respectively.

Section 2.4 compares with Petr4’s semantics and explains the beneﬁt of this approach.

P4 does not have C-style undeﬁned behavior that permits completely arbitrary

results. Instead, it produces unspeciﬁed values when reading from uninitialized ﬁelds.

However, an easily overlooked point in P4’s semantics is that reading an uninitialized

ﬁeld twice may yield diﬀerent values [12, Section 8.23]. We implemented this feature

in the execution phase (Section 2.3).

P4 is a language for programming and specifying a variety of hardware and soft-

ware targets. A P4 architecture is a model for a set of compatible targets. The inter-

nal language constructs of P4 are independent from architectures, but P4 supports

architecture-speciﬁc extern objects and methods, and the switch’s global behavior

depends on the switch model. Section 2.5 shows how to write a formal architecture

speciﬁcation and how our operational semantics is linked to an architecture speciﬁca-

tion. An actual target may reject some programs during compilation due to resource

constraints. Our operational semantics does not specify any of these constraints but

what the program means when it can compile.

2.1 P4light and the front end

The formalization of P4 semantics is based on an abstract syntax tree (AST). But

which AST should we use in order to deﬁne formal semantics and verify programs

smoothly? According to the experience from CompCert Clight [3] and VST-Floyd [7]

on the C language, it is nicer to elaborate, annotate, and transform the AST before

deﬁning the formal semantics and verifying programs. So we designed P4light, an

AST of P4 such that

• each expression node is annotated with its type and implicit type casts are made

explicit;

• each name is annotated with a locator (see below) to distinguish the same names

in diﬀerent scopes;

• side-eﬀect expressions do not nest as subexpressions;

• it is still close to P4 source code and each P4 light program is a legal P4 program.

We adapted the Petr4 front end (including the typechecker) to parse, elaborate,

and typecheck P4 source program into P4light. Two additional transformation passes

are applied to the AST after typechecking in order to make it easier to deﬁne the

semantics and analyze the program. The ﬁrst pass extracts side eﬀects from subex-

pressions such that every every expression with side eﬀects (e.g. function call) must

appear directly on the right hand side of an assignment. For example, a = f(b) + c is

transformed into t1 = f(b); a = t1 + c. So when analyzing the program, we can treat

function calls as a kind of statement and do not have side eﬀects when evaluating

expressions.

The second pass adds locators to names in a P4 program. Names in a P4 program

have multiple layers of scopes and the same identiﬁer may not refer to the same thing.

So a locator is added to each appearance of names to distinguish them. The syntax

of locators is

Locator ::= glob p | inst p,

where p is a path, for names deﬁned in the global scope and instance scopes (or say

class scopes), respectively. Theoretically, one could use identiﬁers instead of paths in

locators to do the same thing, but using paths makes it easier to track the location

where each name is deﬁned. With locators, there is no need for the semantics to

maintain an environment mapping names to locations. (see Section 2.4)

This point simpliﬁes the development of operational semantics and program logic, and improves

the interaction experience of our veriﬁer. But P4 always evaluates expressions from left to right,

not like C, whose evaluation order is unspeciﬁed. So this transformation does not add restrictions

to the semantics.

2.2 Instantiation phase

P4 programs are often compiled and executed on programmable hardware (in other

cases, compiled to C code, etc.). Such hardware is similar to FPGAs in the way

that each computation unit is only used for one particular computation step once

the program is loaded into the hardware. Hardware resources cannot be dynamically

allocated during execution; programs cannot have loops, either. These constraints

are reﬂected in the language design of P4. P4 has reusable modules (parser and

control declarations), but the language only allows these modules to be instantiated

with “compile-time known” arguments, so that the compiler can create a copy of

the code for each usage and statically allocate hardware resources. For example,

Figure 2.1 shows a sliding-window Bloom ﬁlter

implemented modularly in P4. Each

sliding-window Bloom ﬁlter instance has 4 panes and each pane has 3 rows. Each row

must have a persistent state that survives between packets, which is implemented as

a “register” in the Toﬁno architecture

[23], the architecture used in this program.

Each of these registers needs to be allocated to a unique location in the hardware.

This is similar to hardware description languages, for example, modules in Verilog

[34], which can be reused by statically allocating copies on hardware, too. But to

my best knowledge, the instantiation phase has not been described in the literature

about Verilog’s formal semantics, nor in discussions of P4 semantics.

In order to have a clear and human-readable semantics and make it easy to ana-

lyze programs’ behavior, we separate the semantics of P4 programs into two phases:

the ﬁrst is the instantiation phase, which simulates what the compiler should do;

the second is the execution phase, which simulates the hardware processing of each

In Chapter 5, we will explain what is a sliding-window Bloom ﬁlter, give the full version of

Figure 2.1 and demonstrate how to verify this P4 program using Coq and our Veriﬁable P4. Here

we only focus on its modular structure.

Toﬁno is a series of high-performance P4-programmable switch chips developed by Intel. With

its performance and ﬂexibility, it is widely used in data centers.

package Switch(...); /∗ Prototype of the main package ∗/

control Row(...) { /∗ A Bloom ﬁlter row ∗/

}

control Pane(...) { /∗ A simple Bloom ﬁlter ∗/

Row() row 1; Row() row 2; Row() row 3; ...

}

control SBFilter(...) { /∗ A sliding−window Bloom ﬁlter ∗/

/∗ Registers for timing ∗/

Pane() pane 1; Pane() pane 2; Pane() pane 3; Pane() pane 4; ...

}

Switch(SBFilter()) main; /∗ SBFilter is instantiated here ∗/

Figure 2.1: Example of instantiation

packet. The main beneﬁt of introducing this phase distinction is to separate two

kinds of computation. The instantiation phase needs to handle higher-order objects,

but they are fully determined by the P4 program and independent from the packets

and the switch conﬁguration from the control plane. The execution phase processes

the packets without higher-order objects and object reallocation.

Similar design choices have also been discussed for “instantiation” in other pro-

gramming languages, such as the ML language. ML compilers handle modules and

functors either by closure passing [29] or by defunctorization [8] (i.e. instantiation at

ﬁrst during compilation). The former is similar to Petr4’s approach and the latter is

similar to ours. The semantics of ML is usually deﬁned by closure passing. It seems

to be because the ﬁrst a few ML compilers used closure passing and because the

instantiation phase seems to be more complicated for ML than P4. The advantage of

using closure passing in compilers is that a functor in ML can be compiled just once

into machine code that serves for all of its instances. But program analysis is easier

after instantiation in ML [8], which is the same as we observe for P4.

Although P4 compilers create a copy of code for each instance, it is not an ideal

approach for the formal semantics, because that means we cannot analyze shared

properties for all the instances of the same class. So, instead, we store the class

name for each instance as the link to the corresponding code. Meanwhile, the formal

semantics should be simple and independent from any hardware architecture, so the

objects (i.e. instances) should be still addressed by names unrelated to hardware

resource allocation. The control plane names described in the P4 Speciﬁcation [12,

Section 18.3] are perfect for distinguishing objects. The control plane names are fully

qualiﬁed names (paths) allocated as follows.

1. The control plane names of an object instantiated at the top level are just its

name. For example, the control plane name of Switch(SBFilter()) main is main.

2. An object instantiated by nameless instantiation (i.e. directly used as a construc-

tor argument when instantiating another object) gets its control plane name by

appending the name of the corresponding formal parameter after the control

plane name of the object to which the parameter is passed, separated by a dot.

For example, the control plane name of SBFilter() in Switch(SBFilter()) main is

main.ig, as ig is the formal parameter’s name.

3. An object instantiated inside a parser/control block gets its control plane name

by appending its local name after the control plane name of the parser/control

block, separated by a dot. For example, if the control plane name of the SBFilter

instance is main.ig, then the control plane name of Pane() pane 1 in this instance

is main.ig.pane 1.

4. P4 allows using @name annotation to overwrite the local names when generating

control plane names as above. This feature is not yet supported in this work.

This allocation guarantees the control plane names are distinct, because the local

names are distinct (including parameters and local deﬁnitions). These names are

determined at compile time and can be used to describe how the switch or the control

plane handles these objects (e.g. the control plane may modify registers and table

entries). Also, this allocation forms a hierarchy in which all the objects instantiated

in an instance (e.g. an SBFilter) have the same preﬁx, and the suﬃx of these objects

are the same between diﬀerent instances of the same parser/control block. This

beneﬁt will be discussed in more detail in Chapter 3.

Technically, the instantiation phase produces a global environment Γ that does

not change during the execution phase. Γ consists of six parts:

• Γ

func

, for function deﬁnitions (including all callable objects, e.g. parsers, control

blocks and tables),

• Γ

typ

, for type deﬁnitions,

• Γ

senum

, for values of serializable enumeration types looked up by member names,

• Γ

inst

, for class information of instances and references to other instances,

• Γ

const

, for values of constants which may diﬀer between instances, and,

• Γ

ext

, for static information of extern objects.

func

, Γ

typ

, and Γ

senum

are obtained from a single pass through the program, which

is simple Γ

inst

is the main product of the instantiation phase, while Γ

const

and Γ

ext

are

byproducts during this procedure. Another byproduct is the initial state of extern

objects before processing any packets, denoted by s

init

. Γ

inst

is a partial map Path ⇀

Ident × Path that is used to look up object references. For local name bar in an

object at path foo, Γ

inst

(foo.bar) will be used. Entry p 7→ (n, q) in Γ

inst

means the

object referred by p belongs to the class named n and its actual location is q. Because

q is already the actual location, Γ

inst

always has q 7→ (n, q).

global Γ

inst

, Γ

const

, Γ

ext

, s

init

, decl env := []

procedure instantiate(p, e, decl) :=

inst name := decl.name

class name := decl.class name

inst

:= p · inst name

:= e

for each param in decl.params

param

:= p

inst

· (param.name)

if param is an instantiation

instantiate(p

inst

, e

, declaration form of param)

v := (param.class name, p

param

)

else // param is not an instantiation

// param may evaluate to either a value or an object reference

v := evaluate(e

, param)

e := e[param.name 7→ v]

if v is a value

const

:= Γ

const

param

7→ v]

else

inst

:= Γ

inst

param

7→ v]

body := decl

env[class name]

inst

:= Γ

inst

7→ (class name, p

inst

)]

if class name is an extern object class

inv

, v

init

) := construct extern(class name, e)

ext

:= Γ

ext

inst

7→ v

inv

]

init

:= s

init

inst

7→ v

init

]

else // class

name is a parser/control block

for each decl

′

in body

if decl

′

is an instantiation then

instantiate(p

inst

, e, decl

′

)

update e, Γ

const

, Γ

inst

(similar to evaluating parameters)

procedure instantiate prog(prog) :=

e := []

for each decl in prog

if decl is a class then

decl env := decl env(decl.name 7→ decl)

else if decl is an instantiation then

instantiate(ε, e, decl)

e := e[decl.name 7→ (value of the instance)]

Figure 2.2: Pseudocode of instantiation

Figure 2.2 shows the pseudocode of the instantiation phase. The operator “·”

(centered dot) is used for concatenating paths. All global variables are initialized as

empty maps. decl env stores declarations of classes. These classes must be declared at

the top level, so there is no need to consider naming scope, and it is ﬁne to use a global

decl env in the pseudocode. But in the Coq implementation, in order to rule out circu-

lar instantiation and guarantee that the instantiation phase terminates, declarations

are stored into decl env as closures: decl env := decl env[decl.name 7→ (decl env, decl)].

The corresponding decl env is used when instantiating the body of each declaration,

so every recursive call of instantiate decreases on decl env. The function instantiate

takes a declaration decl that instantiates an object and the declaration appears in an

object at path p. Local environment e is only used in the instantiation phase. The

path for the newly instantiated object will be p

inst

. The ﬁrst loop evaluates each pa-

rameter of decl. If a parameter is an instantiation expression, it will be converted to

a declaration named by the name of the corresponding formal parameter. Then this

declaration is instantiated recursively under p

inst

, and it becomes an object of class

param.class name at path p

param

. If the parameter is not an instantiation, it will be

evaluated while the names will be looked up in e

. The result v will be stored in e and

in Γ

const

or Γ

inst

depending on whether it is an value or a reference. After evaluating

and instantiating the parameters, the body of decl will be instantiated. If it is an

extern object, the architecture-speciﬁc function construct extern is called to produce

the static value (e.g. the size and width of a register) and the initial value. Otherwise,

decl is a parser/control block. In this case, for each decl

′

that is an instantiation, call

instantiate recursively.

The procedure instantiate prog instantiates the whole program. It inserts class

deﬁnitions into decl env and calls instantiate with an empty path for each top level

instantiation.

2.2.1 Function lookup

The instantiation phase makes the resolution of program constructs independent from

runtime data. Besides the global environment Γ, we only need to know the path of the

current control/parser instance, denoted as p. We will show the detailed resolution

methods in Section 2.3. Here we show the most important part: function lookup.

The judgment of function lookup is of the form

Γ, p ⊢ exp ⇓

lookup

′

, p

func

where exp is the function expression, and (p

′

, p

func

) is the result. In the result, p

′

either ⋆ or a path: If p

′

is ⋆, the function exp is executed in the current control/parser

instance, such as calling a table or an action deﬁned in the same control block; If p

′

is a path, the function exp is executed in the instance at p

′

, such as calling a global

action/function or another control/parser instance. The diﬀerence is that the callee

can access class-scope variables of the caller if and only if p

′

= ⋆. The other part of

the lookup result, p

func

, is the path to look up the function body of exp in Γ

func

Figure 2.3 presents the semantic rules for function lookup. The ﬁrst two rules

handle the case that the function expression is an identiﬁer, and the last three rules

handle the case that the function expression is a member expression (expression with

a dot). We distinguish the origin of identiﬁers using locators, which annotate the

identiﬁers in the form n@loc. E-LGlob is used for global actions and functions. E-

LInst is used for local actions and parser states, which are modeled as functions. The

premise Γ

inst

(p) = (n

class

, p) indicates that the current instance p is an instance of class

class

. E-LTable is used for calling apply method of tables. We use kind(n

@(inst p

′

))

to denote the determination of whether the expression is a table, using the type

annotation generated by the front end (Section 2.1). Tables can access control block’s

local variables, so the ﬁrst lookup result is ⋆. In the last two rules, E-LMemGlob

Γ, p ⊢ n@(glob p

′

) ⇓

lookup

(ε, p

′

)

E-LGlob

inst

(p) = (n

class

, p)

Γ, p ⊢ n@(inst p

′

) ⇓

lookup

(⋆, n

class

′

)

E-LInst

kind(n

@(inst p

′

)) = table Γ

inst

(p) = (n

class

, p)

Γ, p ⊢ n

@(inst p

′

).n

⇓

lookup

(⋆, n

class

′

)

E-LTable

kind(n

@(glob p

)) ̸= table Γ

inst

) = (n

class

, p

)

Γ, p

⊢ n

@(glob p

).n

⇓

lookup

, n

class

)

E-LMemGlob

kind(n

@(inst p

)) ̸= table Γ

inst

) = (n

class

, p

)

Γ, p

⊢ n

@(inst p

).n

⇓

lookup

, n

class

)

E-LMemInst

Figure 2.3: Semantics rules for function lookup

and E-LMemInst, the method name n

is prepended with the class name n

class

which determined by the locator annotated to n

The function lookup is fully deterministic and implemented as a function in the

mechanized semantics. This function is used in the same way in the program logic

(Section 3.2), making it an important beneﬁt of the instantiation phase.

2.2.2 Abstract methods

An important feature that has not been formalized previously (including Petr4’s im-

plementation) is abstract methods. Formalizing abstract methods is necessary for

verifying stateful programs on Toﬁno. An abstract method is a short P4 program

segment provided to an extern object to customize its behavior. The extern object’s

internal logic may call these abstract methods. The syntax of abstract methods is

RegisterAction<bit<32>, bit<16>, bit<32>>(reg) regact = {

void apply(inout bit<32> value, out bit<32> rv) {

rv = value;

if (value == N − 1) {

value = 0;

}

else {

value = value + 1;

}

};

...

regact.execute(0);

Figure 2.4: An example of abstract method

similar to object-oriented languages. The extern object classes deﬁned by the architec-

ture may declare abstract methods and the abstract methods must be implemented

for each instance during instantiation.

Abstract methods are necessary in P4 for at least one reason: on Toﬁno, each

(unless recirculating, which is costly). So any read-then-write operation (such as in-

crementing a register cell) must be done in a single operation. The abstract method

provides an interface that allows the programmer to specify how the value to write

is computed from the original value. Figure 2.4 shows an example that uses abstract

method to read and then update a register in the Toﬁno architecture. reg is a register

and regact deﬁnes an action that modiﬁes reg’s value. apply is an abstract method,

whose prototype is deﬁned in the extern type RegisterAction and whose implemen-

tation for regact is as shown in Figure 2.4. Later, the P4 program may use, e.g.,

regact.execute(0) to update the 0-th cell’s value. Inside the execute method, the ab-

stract method apply will be invoked and the old value of the 0-th cell will be passed

in as value while value after the function call will be written into the 0-th cell.

The P4 speciﬁcation only allows abstract methods to “use the supplied arguments

or refer to values that are in the top-level scope” [12]. Since it is independent from

any local variable, the semantics of an abstract method can be represented by a

relation of initial extern state, list of input arguments, ﬁnal extern state, list of output

arguments, and signal. This relation is determined during instantiation from the

semantics of normal functions and statements and stored in Γ

ext

as a constant data

of the extern object.

The Toﬁno architecture allows more general forms of abstract methods. For ex-

ample, abstract methods with @synchronous annotation may access local variables.

This feature is nonstandard P4, in that it has not been accepted by the P4 Lan-

guage Consortium. The formalization of more general abstract methods will require

investigation in more applications.

2.3 Execution phase

The instantiation phase generates the static global environment Γ. Then the execu-

tion phase executes the program according to Γ. Because P4 programs do not have

loops and recursions, it is not necessary to consider nonterminating programs. So the

execution phase is deﬁned as a big-step operational semantics. The basic expressions

and statements are treated similarly as in Petr4. The most important diﬀerence is

the treatment of uninitialized bits.

2.3.1 Uninitialized bits

Unlike C, where reading an uninitialized variable may cause undeﬁned behavior, the

P4 Speciﬁcation states such read yields an unspeciﬁed value, including reading an

uninitialized variable and reading a ﬁeld of an invalid header [12, Section 8.23]. Also,

bit<8> x, y, z;

y = x;

z = x; /∗ z may diﬀer from y ∗/

reading twice may yield diﬀerent values. The following program gives an example.

So it is not enough to characterize P4’s behavior by assigning an arbitrary value

when a variable is uninitialized. Further, P4 supports bit access, e.g. x[5:2] = y,

so each bit is virtually a ﬁeld and has its own initialized-or-not status. Therefore

in the operational semantics, we consider that each bit in storage can be either 0,

1, or uninitialized, represented by 0, 1, ⊥. A value with such three-valued bits is

called a storable value. When a variable is declared without initialization, a storable

value ﬁlled with ⊥ is stored, unless the P4 Speciﬁcation speciﬁes something diﬀerent

in particular. For example, the validity bit of a uninitialized header is 0 (invalid),

not ⊥. On the other hand, the result of any expression evaluation, including single-

variable expressions, is a normal value without uninitialized bits. So storable values

are ﬁrst converted to normal values nondeterministically when used in an expression,

including being used as operands of arithmetic operations or the right-hand side of

an assignment and being passed as arguments. When storing an evaluation result, it

is converted to a storable value without uninitialized bits. For example,

bit<8> x, y, z, w;

y = x; /∗ The storable value is converted to normal value before writing into y ∗/

z = y;

w = y; /∗ w and z must be the same ∗/

2.3.2 Nondeterministic semantics

Because storable values need to be converted to normal values nondeterministically,

P4’s semantics is nondeterministic. Consider an abstract semantic judgment a ⇓ b,

where a is the input and b is the output. For example, a may be a program statement

and the initial state while b may be the ﬁnal state of executing the statement. In

a nondeterministic semantics, a ⇓ b means a possible result of executing a is b, and

a ⇓ b

, a ⇓ b

may be all valid judgments. If a ⇓ b does not hold for any b,

a cannot be executed at all, instead of undeﬁned behavior. Correctness of compiling

source program a into target program a

′

is that any behavior of a

′

is a valid behavior

of a: ∀b, a

′

⇓ b =⇒ a ⇓ b.

When proving program a satisﬁes some property P , the

formalization is ∀b. a ⇓ b =⇒ P (b), which reads as every possible result b satisﬁes

P .

2.3.3 Semantic rules

The syntax of values and l-values is as follows:

Value := BasicValue (e.g. signed/unsigned integers and Booleans)

| List(Ident × Value) (structs)

| Bool × List(Ident × Value) (headers; Boolean is the validity bit)

| · · · (other values not used in this thesis)

Lvalue := Path (local variable)

| Lvalue × Ident (ﬁeld of a struct, header, or union)

| Lvalue × Int × Int (bit-slice)

| · · · (other l-values not used in this thesis)

Values include basic values, structs, and headers, but unlike previous work [18], we do

not use closures. L-values are assignable variables or ﬁelds, including local variables,

ﬁelds, and bit-slices. There are some more kinds of values and l-values, but they are

The target program does not need to exhibit every possibility of the source program. It only

needs to be within the possible execution of the source program.

not interesting enough to be covered in the thesis. The program state, usually denoted

by s, consists of two parts: a stack frame for local variables within a control/parser

block, and an extern state for extern objects.

StackFrame := Path ⇀ Value

ExternState := Path ⇀ ExternObject

State := StackFrame × ExternState

extern objects are as deﬁned by the architecture. A table whose entries can be con-

ﬁgured by the control plane is also considered as an extern object, because they are

accessible from the outside of the P4 program.

Let Γ be the global static environment generated in Section 2.2. As P4 program

statements are mostly inside classes (i.e. parsers and control blocks) and the instan-

tiation phase does not duplicate the program for each instance, we need to know

in which instance the current statement is executed in order to correctly interpret

names. So we use a path p to indicate the path of the object that the program is

currently in (p is an empty path if not in any object). Names are handled using p

and locators (generated in Section 2.1). Local variables are always deﬁned in classes,

therefore a name referring to a local variable always has a locator of the form inst p

′

they are looked up in the stack frame using p

′

. A new empty stack frame will be

used when calling a method in a diﬀerent instance, since stack frames are not shared

between diﬀerent instances, and the old stack frame will be reused when returning

from the call, so scope of local variables is resolved. A name referring to an instance

may have a locator of the form glob p

′

or inst p

′

. In the ﬁrst case, it will be looked

up in Γ

inst

using p

′

, and in the second case, it will be looked up in Γ

inst

using p · p

′

because p

′

is the relative path from the current path p.

We adopt the following notations in the semantic rules. We use italic font for vari-

ables, such as exp for an expression and stmt for a statement, and use sans serif font

for constants, such as normal and return for signals. We use [v]

to denote convert-

ing a normal value v to a storable value, and use [v]

to denote nondeterministically

converting a storable value to a normal value. Overline is used to indicate a list of

items, such as v.

The big-step semantic judgments are written as six auxiliary judgments

s ⊢ lv ⇓

read

v (l-value read, 6 rules)

s ⊢ lv := v ⇓

write

′

(l-value write, 9 rules)

Γ, p, s ⊢ exp ⇓ v (expression, 18 rules)

Γ, p, s ⊢ exp ⇓ (lv, sig) (l-expression, 5 rules)

Γ, p, s ⊢ (d, exp) ⇓ (v, lv ) (argument list, 5 rules)

Γ, p ⊢ exp ⇓

lookup

′

, p

func

) (function lookup, 5 rules)

and three main judgments

Γ, p, s ⊢ stmt ⇓ (s

′

, sig) (statement, 16 rules)

Γ, p, s ⊢ exp ⇓ (s

′

, sig) (call-expression, 2 rules)

Γ, p, s ⊢ (f, v

) ⇓ (s

′

, v

out

, sig) (function, 3 rules)

For example, the judgment Γ, p, s ⊢ e ⇓ v reads as “in global environment Γ, with

object path p, in state s, the P4 expression e evaluates to value v.” Judgment

Γ, p, s ⊢ stmt ⇓ (s

′

, sig) reads as “for Γ and p, from state s, the execution of the P4

statement stmt results in state s

′

and signal sig.” Signal is used to mark control ﬂow

in the conventional way to handle return and exit statements versus normal control

ﬂow. Call-expressions are treated as a separated judgment for convenience. They are

Γ, p, s ⊢ x@(inst p

′

) ⇓ (p

′

, normal)

E-LVar

Γ, p, s ⊢ exp ⇓ (lv, sig)

Γ, p, s ⊢ exp.n ⇓ ((lv, n), sig)

E-LMember

local

(p) = v

local

, s

extern

) ⊢ p ⇓

read

E-RVar

s ⊢ lv ⇓

read

[. . . , (n, v

′

), . . . ]

s ⊢ (lv, n) ⇓

read

′

E-RStruct

s ⊢ lv ⇓

read

(valid := , [. . . , (n, v

′

), . . . ])

s ⊢ (lv, n) ⇓

read

′

E-RHeader

local

, s

extern

) ⊢ p := v ⇓

write

local

[p 7→ v], s

extern

)

E-WVar

s ⊢ lv ⇓

read

[. . . , (n, ), . . . ] s ⊢ lv := [. . . , (n, v), . . . ] ⇓

write

′

s ⊢ (lv, n) := v ⇓

write

′

E-WStruct

s ⊢ lv ⇓

read

(valid := 1, [. . . , (n, ), . . . ])

s ⊢ lv := (valid := 1, [. . . , (n, v), . . . ]) ⇓

write

′

s ⊢ (lv, n) := v ⇓

write

′

E-WHeader1

s ⊢ lv ⇓

read

(valid := b, [. . . , (n, ), . . . ]) b ∈ {0, ⊥}

s ⊢ (lv, n) := v ⇓

write

E-WHeader0

Figure 2.5: L-value evaluation, read and write rules of P4light

used not only in statements with function calls but also in tables. Tables are treated

as functions that ﬁrst evaluate keys and match them with a table entry, followed by

constructing and executing a call-expression from the matched table entry.

Figure 2.5 shows selected rules for evaluating, reading and writing l-values. Rules

E-LVar and E-LMember evaluate l-expressions based on locators. Rules E-RVar

and E-WVar are the base cases in which the l-value is just a path of a local variable,

so it is directly read from or write to s

local

. E-RStruct (E-RHeader, resp.) reads

a ﬁeld from a struct (header, resp.). E-WStruct writes to a ﬁeld of a struct. E-

Γ, p, s ⊢ exp ⇓ v

Γ, p, s ⊢ (in, exp) ⇓ ([v]

, )

E-ArgIn

Γ, p, s ⊢ exp ⇓ (lv, normal)

Γ, p, s ⊢ (out, exp) ⇓ ( , lv)

E-ArgOut

Γ, p, s ⊢ exp ⇓ (lv, normal) s ⊢ lv ⇓

read

Γ, p, s ⊢ (out, exp) ⇓ ([[v]

]

, lv)

E-ArgInOut

Figure 2.6: Semantics rules for argument evaluation

WHeader1 and E-WHeader0 deﬁne the semantics of writing to a header ﬁeld.

The write operation only takes eﬀect when the validity bit of the header is 1 (i.e. the

header is valid), in accordance with the P4 Speciﬁcation [12, Section 8.23].

It is worth noticing that the rule E-RHeader ignores the header’s validity bit. If

the validity bit is 1, it reads the ﬁeld as expected. If the validity bit is 0 or ⊥, it still

works because we designed the semantics to maintain an invariant that all the bits

of the header’s ﬁelds are ⊥ if the validity bit is 0 or ⊥. So reading from an invalid

header according to the rule E-RHeader will yield ⊥s, precisely following the P4

Speciﬁcation.

Figure 2.6 displays the rules for argument evaluation. Each argument is evaluated

to a pair of a storable value and an l-value. Either part of the pair might be empty,

denoted by “ ”. Rules are presented for a single argument, and its lifting to multiple

arguments is conventional. Handling of nonnormal signals in l-value evaluation is

omitted. The rule E-ArgIn evaluates an in parameter to the storable value corre-

sponding to v, and the l-value part is empty. Similarly, the rule E-ArgOut evaluates

an out parameter, and the result is only an l-value. The rule E-ArgInOut evaluates

an inout parameter. The semantic rules for argument evaluation is essentially the

same as Petr4’s copy-in and copy-out rules, except handling nondeterministic bits.

But we prefer calling them “argument evaluation”, because copy-in and copy-out

happen during argument passing, not argument evaluation. (see E-CallFunc rules

and E-Internal)

Γ, p, s ⊢ stmt

⇓ (s

′

, normal) Γ, p, s

′

⊢ stmt

⇓ (s

′′

, sig)

Γ, p, s ⊢ stmt

; stmt

⇓ (s

′′

, sig)

E-Seq

Γ, p, s ⊢ stmt

⇓ (s

′

, sig) sig ̸= normal

Γ, p, s ⊢ stmt

; stmt

⇓ (s

′

, sig)

E-Seq2

Γ, p, s ⊢ exp ⇓ true Γ, p, s ⊢ stmt

⇓ (s

′

, sig)

Γ, p, s ⊢ if (exp) stmt

else stmt

⇓ (s

′

, sig)

E-IfT

Γ, p, s ⊢ exp ⇓ false Γ, p, s ⊢ stmt

⇓ (s

′

, sig)

Γ, p, s ⊢ if (exp) stmt

else stmt

⇓ (s

′

, sig)

E-IfF

Γ, p, s ⊢ exp ⇓ v

Γ, p, s ⊢ return exp ⇓ (s, return v)

E-Return

kind(exp

) = expr Γ, p, s ⊢ exp

⇓ (lv, normal)

Γ, p, s ⊢ exp

⇓ v s ⊢ lv := [v]

⇓

write

′

Γ, p, s ⊢ exp

:= exp

⇓ (s

′

, normal)

E-Assign

Γ, p, s ⊢ exp

⇓ (lv, normal)

Γ, p, s ⊢ exp

(exp

) ⇓ (s

′

, return v) s

′

⊢ lv := [v]

⇓ s

′′

Γ, p, s ⊢ exp

:= exp

(exp

) ⇓ (s

′′

, normal)

E-AssignCall

Figure 2.7: Semantics rules for statements

The selected semantics rules for statements are shown in Figure 2.7. These rules

are mostly standard. In the rule E-Seq, the execution continues to stmt

if the signal

from stmt

is normal. In the rule E-Seq2, stmt

is skipped as stmt

has changed

the normal control ﬂow: e.g., the signal will be “return v” if stmt

executes a return

statement. The rules E-IfT and E-IfF handle if-statements. The rule E-Return

handles return statements. The rule E-Assign handles simple assignment statements

without function calls. The rule E-AssignCall handles statements with function

calls.

Figure 2.8 shows the rules for function call expressions. E-CallBuiltin handles

built-in function calls, e.g. h.setValid(), where h is a header. Normal function calls are

kind(exp

.n) = builtin

Γ, p, s ⊢ exp

⇓ (lv, normal) d = dirs(exp

.n)

Γ, p, s ⊢ (d, exp

) ⇓ (v, lv

′

) Γ, p, s ⊢ (lv, n, v) ⇓

builtin

′

, sig)

Γ, p, s ⊢ exp

.n(exp

) ⇓ (s

′

, sig)

E-CallBuiltin

kind(exp

) = func Γ, p ⊢ exp

⇓

lookup

(⋆, p

func

)

f = Γ

func

) d = dirs(exp

) Γ, p, s ⊢ (d, exp

) ⇓ (v, lv)

Γ, p, s ⊢ (f, v) ⇓ (s

′

, v

′

, sig) s

′

⊢ lv := v

′

⇓

write

′′

Γ, p, s ⊢ exp

(exp

) ⇓ (s

′′

, sig)

E-CallFunc1

kind(exp

) = func Γ, p ⊢ exp

⇓

lookup

′

, p

func

)

f = Γ

func

) d = dirs(exp

) Γ, p, s ⊢ (d, exp

) ⇓ (v, lv)

Γ, p

′

, ([], s

extern

) ⊢ (f, v) ⇓ (( , s

′

extern

), v

′

, sig)

local

, s

′

extern

) ⊢ lv := v

′

⇓

write

′′

Γ, p, (s

local

, s

extern

) ⊢ exp

(exp

) ⇓ (s

′′

, sig)

E-CallFunc2

Figure 2.8: Semantics rules for function call expressions

handled by E-CallFunc1 and E-CallFunc2. These two rules are applied based

on the function lookup result in Section 2.2.1, which are Γ, p ⊢ exp

⇓

lookup

(⋆, p

func

)

and Γ, p ⊢ exp

⇓

lookup

′

, p

func

), respectively. In E-CallFunc1, the function body

is executed in the same path p and the same state s. In E-CallFunc2, the function

body is executed in path p

′

and local variables are cleared in the state, and the

original local variables, stored in s

local

, are unchanged until copying the out parameters

as (s

local

, s

′

extern

) ⊢ lv := v

′

⇓

write

′′

. In both E-CallFunc1 and E-CallFunc2,

direction information of the arguments, d, is extracted from the type annotation of

the function expression exp

. We assume the argument evaluation result, (v, lv),

is automatically split into two lists, v and lv, such that v contains in and inout

parameters, and lv contains out and inout parameters.

Figure 2.9 shows the rules for functions. E-Internal is for normal P4 functions.

At the beginning, the in parameters are “copied in” using l-value write. Because the

parameter names are annotated with locators, there no concern of scope. At the end,

s ⊢ p

:= v

⇓

write

′

Γ, p, s

′

⊢ stmt ⇓ (s

′′

, return v) s

′′

⊢ p

out

⇓

read

out

Γ, p, s ⊢ (internal (p

, p

out

, stmt), v

) ⇓ (s

′

, v

out

, return v)

E-Internal

Γ, p, s ⊢ key ⇓ v C(p · n) = entry (v, kind, entry) ⇓

match

′

(exp

)

action = [. . . , n

′

(exp

), . . . ] Γ, p, s ⊢ n

′

(exp

, exp

) ⇓ (s

′

, return null)

Γ, p, s ⊢ (table (n, key, kind, action), []) ⇓ (s

′

, [], null)

E-Table

Figure 2.9: Semantics rules for functions

the out parameters are “copied out” as l-value read. E-Table is the more interesting

rule, which is for tables. The function body of a table essentially has four ﬁelds:

• n, the local name of the table;

• key, the keys, which is a list of expressions;

• kind, a list of match kinds of the same length as key;

• action, a list actions pending to be called based on table matching result.

Table evaluation ﬁrst evaluates key into values, v. The condition C(p · n) = entry

means retrieving table entries from control plane information, denoted by C, using

the table’s fully qualiﬁed name p · n. Then, (v, kind, entry) ⇓

match

′

(exp

) indicates

the procedure of ﬁnding the matching entry n

′

(exp

) of v in entry. This procedure is

primarily determined by the architecture, because the architecture may deﬁne custom

match kinds. Then, the rule ﬁnds the corresponding action n

′

(exp

) in action. The

arguments exp

and exp

are combined together in order to execute the action.

In this presentation of the E-Table rule, certain technical details have been

omitted. First, a table has its return value that enables the retrieval of the matched

action after the table.apply(). The expression of E-Table ignores this complexity

and returns null. Second, instead of having entries determined by the control plane,

there is another kind of table whose entries are hardcoded in the P4 program. These

tables do not need the step C(p · n) = entry.

2.4 Comparison with Petr4

This section compares our semantics with Petr4’s [18] semantics and explains our

design choices. Petr4 was the ﬁrst formal semantics for P4. It gave two representations

of the P4 semantics: one is called Core P4, for a subset of P4 with formal semantics

and pen-and-paper soundness proof; the other is the interpreter, which is an OCaml

program and supports almost all P4 features such as parsers and extern functions and

objects.

Figure 2.10 shows the semantic rules handling objects in Petr4’s Core P4 seman-

tics. It is adapted for better presentation: ﬁrst, it is simpliﬁed by removing support

for the features that are not related to the comparison; second, Petr4’s Core P4 se-

mantics does not include extern objects, so we extracted the semantic rules for extern

objects from Petr4’s interpreter and added to Figure 2.10.

Petr4’s Core P4 semantics uses closures intensively to handle function-like objects

with local scopes in P4, so it is in a more standard way of deﬁning semantics. But since

P4’s scopes are statically allocated, we ﬁnd that deﬁning the semantics based on the

phase distinction is more convenient for program reasoning, as we will demonstrate

in Example 2.1.

In Petr4 semantics, statements and declarations are treated similarly without

phase distinction, so here we consider both kinds as statements. The form of semantic

judgment is

⟨C, ∆, σ, ϵ, stmt⟩ ⇓ ⟨∆

′

, σ

′

, ϵ

′

, sig⟩,

where σ is a global storage that maps locations to values, ϵ is a local environment

that maps currently visible names to locations. Locations are allocated dynamically.

The judgment ℓ fresh is used to obtain a previously not used location ℓ.

The states of extern objects are persistent between packets, unless the switch

speciﬁcation resets them outside of the P4 program. So in order to make sure extern

E-CtrlDecl

ℓ fresh val = cclos(ϵ, ctrl(d x : τ)(x

: τ

) {decl stmt})

⟨C, ∆, σ, ϵ, ctrl X(d x : τ)(x

: τ

) {decl stmt}⟩ ⇓ ⟨∆, σ[ℓ 7→ val], ϵ[X 7→ ℓ], normal⟩

E-CtrlInst

⟨C, ∆, σ, ϵ, X⟩ ⇓ ⟨σ

, cclos(ϵ

, ctrl(d x : τ )(x

: τ

) {decl stmt})⟩

ϵ(path) = p ⟨C, ∆, σ

, ϵ, exp⟩ ⇓ ⟨σ

, val

⟩

ℓ

, ℓ fresh val = clos(ϵ

[path 7→ p.x][x

7→ ℓ

], d x : τ , {}, {decl stmt})

⟨C, ∆, σ, ϵ, X(exp) x⟩ ⇓ ⟨∆, σ

[ℓ

7→ val

][ℓ 7→ val], ϵ[x 7→ ℓ], normal⟩

E-ExtnInst1

⟨C, ∆, σ, ϵ, X⟩ ⇓ ⟨σ

, extern Y ⟩ ϵ(path) = p

(p.x 7→ ) /∈ σ

⟨C, ∆, σ

, ϵ, exp⟩ ⇓ ⟨σ

, val

⟩ ⟨C, Y, val

⟩ ⇓

extern

val

⟨C, ∆, σ, ϵ, X(exp) x⟩ ⇓ ⟨∆, σ

[p.x 7→ val], ϵ[x 7→ p.x], normal⟩

E-ExtnInst2

⟨C, ∆, σ, ϵ, X⟩ ⇓ ⟨σ

, extern Y ⟩ ϵ(path) = p (p.x 7→ ) ∈ σ

⟨C, ∆, σ, ϵ, X(exp) x⟩ ⇓ ⟨∆, σ

, ϵ[x 7→ p.x], normal⟩

Figure 2.10: Petr4’s Core P4 semantic rules for objects (simpliﬁed for demonstration)

objects are always allocated to the same locations, their control plane names are

used as their locations, which are computed rather than arbitrarily allocated. These

names, if computed correctly, are exactly the paths that we use in our semantics (see

Section 2.2). Together with internal locations allocated during execution, they make

up locations in this version of Petr4 semantics. In order to maintain these names

without introducing a new context in the semantic judgment, a special ﬁeld called

path is added to the environment ϵ (as shown in Figure 2.10).

Rule E-CtrlDecl means, when evaluating a control declaration, the semantics

allocates a fresh location ℓ and stores the declaration into ℓ as a closure with the

current local environment ϵ. Rule E-CtrlInst evaluates a control instantiation. It

ﬁrst retrieves the closure from the environment and the storage, followed by ﬁnd-

ing p, which is the path of the current environment, and evaluating the constructor

arguments exp.

Fresh locations ℓ

are allocated for the constructor arguments, and fresh location

ℓ is for the control instance. Then the closure val representing the control instance

is created by binding the parameter names to locations ℓ

and path to p.x, which is

the path of the control instance, respectively. Finally, the constructor arguments are

stored in fresh locations ℓ

and val is stored in ℓ. In our semantics, these two rules are

handled by the instantiation phase without dynamic allocation or closure passing.

For extern object instantiation, Petr4 semantics needs to consider whether the

object has already been initialized. If it has not been initialized, the rule ExtnInst1

is applied. In ExtnInst1, ⟨C, ∆, σ, ϵ, X⟩ ⇓ ⟨σ

, extern Y ⟩ means evaluating the

name X gives an extern object class Y . It ﬁnds the path p of the current environment

and tests p.x is uninitialized. Then ⟨C, Y,

val

⟩ ⇓

extern

val means according to the

architecture, initializing an extern object of class Y with parameters val

gives val. In

this rule, the location for the new instance is given by p.x instead of allocating a fresh

one. Otherwise, the rule ExtnInst2 will be applied. In ExtnInst2, p.x is already

in σ so already initialized, therefore the value from previous packet processing should

be used without reinitialization. In our semantics, extern objects are allocated in the

instantiation phase, so there is no need to consider whether an object is initialized or

not.

Another comparison between Petr4 and our semantics is on location manipulation,

as shown in Figure 2.11. The two E-VarAssign rules handle exactly the same

assignment statement whose left hand formula is just a variable.

For assignment

x := exp, Petr4 semantics needs to ﬁrst looks up in the environment ϵ to ﬁnd ℓ

before writing into it. In contrast, our semantics is facilitated with the locator, and

They are special cases of the complete assignment rules. For example, E-VarAssign in this

thesis is derived from E-Assign, E-LVar, and E-WVar. These simple rules are more accessible

for comparison.

E-VarAssign (Petr4)

ϵ(x) = ℓ ⟨C, ∆, σ, ϵ, exp⟩ ⇓ ⟨val⟩

⟨C, ∆, σ, ϵ, x := exp⟩ ⇓ ⟨∆, σ[ℓ 7→ val], ϵ, normal⟩

E-VarAssign (This thesis)

Γ, p, (s

local

, s

extern

) ⊢ exp ⇓ val

Γ, p, (s

local

, s

extern

) ⊢ x@(inst p

′

) := exp ⇓ ((s

local

′

7→ val], s

extern

), normal)

Figure 2.11: Comparison with Petr4 on location manipulation

the assignment statement becomes x@(inst p

′

) := exp.

So the result is directly

written in p

′

. The same look-up procedure also occurs when evaluating variables in

the expression exp. This signiﬁcantly reduces the work of designing and proving the

program logic. The following example illustrates how our design facilitates reasoning

about program execution.

Example 2.1. Figure 2.12 shows an example P4 program written in a simpliﬁed

version of the V1Model architecture, and whose ingress block gets input a and outputs

In our semantics, the program in Example 2.1 ﬁrst goes through the instantiation

phase. We get the objects

main.ig 7→ (ctrl MyIngress, main.ig)

main.ig.c1 7→ (ctrl C, main.ig.c1)

main.ig.c2 7→ (ctrl C, main.ig.c2)

main.ig.c1.reg 7→ (register : [8w0])

main.ig.c2.reg 7→ (register : [8w0])

Then the switch starts processing packets. In this procedure, the object structure does

not change—only the values in registers may change and be kept between packets.

Variables can only be local, so their locators must be inst p.

control C(out bit<8> x) {

apply {

bit<8> y;

reg.read(y, 0);

y = y + 1;

reg.write(0, y);

x = y;

}

control MyIngress(in bit<8> a, out bit<8> x) {

C() c1;

C() c2;

apply {

if(a == 0) {

c1.apply(x);

}

c2.apply(x);

}

Main(MyIngress()) main;

Figure 2.12: P4 program in Example 2.1

In contrast, Petr4’s semantics uses a special ﬁeld cnt with initial value 0 in σ to

count the location to allocate. It ﬁrst loads control deﬁnitions into the storage

ϵ = [C 7→ 0; MyIngress 7→ 1]

σ = [cnt 7→ 2; 0 7→ (body of C); 1 7→ (body of MyIngress)].

Petr4’s semantics does not support nameless instantiation MyIngress(), but we can

follow the implementation of the interpreter to see that, after the instantiation of

main, the program environment and storage will be

ϵ = [C 7→ 0; MyIngress 7→ 1; main 7→ 3]

= [C 7→ 0; path 7→ main.ig]

σ = [cnt 7→ 4; 0, 1 7→ . . . ; 2 7→ (ϵ

, body of MyIngress); 3 7→ package[ig 7→ 2]].

Then the MyIngress control block is executed with actual packet data. It knows

that the closure stored at location 2 should be executed, by looking up ϵ(main) = 3,

σ(3)(ig) = 2. As it is inside MyIngress, ϵ

is used. It starts with instantiating two

instances of C, namely c1 and c2. The rule E-CtrlInst is used and there is

= [C 7→ 0; path 7→ main.ig; c1 7→ 4; c2 7→ 5]

= [path 7→ main.ig.c1]

= [path 7→ main.ig.c2]

σ = [cnt 7→ 6; 0, 1, 2, 3 7→ . . . ; 4 7→ (ϵ

, body of C); 5 7→ (ϵ

, body of C)].

Then the program’s behavior depends on the value of a. If a = 0, then c1 is called

and a fresh location is allocated for x in C. (x is not copied in because it is an out

parameter.

= [path 7→ main.ig.c1; x 7→ 6]

σ = [cnt 7→ 7; 0, 1, 2, 3, 4, 5 7→ . . . ; 6 7→

Then reg let instantiated and initialized from rule E-ExtnInstInit.

= [path 7→ main.ig.c1; x 7→ 6; reg 7→ main.ig.c1.reg]

σ = [cnt 7→ 7; 0, 1, 2, 3, 4, 5, 6 7→ . . . ; main.ig.c1.reg 7→ [0]].

Then, the rest of C is executed:

= [path 7→ main.ig.c1; x 7→ 6; reg 7→ main.ig.c1.reg; y 7→ 7]

σ = [cnt 7→ 8; 0, 1, 2, 3, 4, 5 7→ . . . ; 6 7→ 1; 7 7→ 1; main.ig.c1.reg 7→ [1]].

Then x will be copied out and c2 will be executed. In c2, x and y will be allocated

at 8 and 9 respectively. But this allocation depends on the actual data. The case

discussed above is a equals 0. If a is not equal to 0, c1 will not be executed and c2

will start with cnt 7→ 6. x and y will be allocated at 6 and 7 instead.

From this example, we can see that our semantics is much easier to manipulate.

This advantage will be more apparent in the program logic (Chapter 3) and the

veriﬁcation (Chapter 5). It is also important for proving correctness of a P4 compiler,

which is beyond the scope of this thesis. First, instantiation and execution (pure

data processing) are mixed and executed intermittently in Petr4’s semantics. In

the design of the P4 language, the restrictions are designed to force instantiation to

be done statically and the cost of implementing the program on actual hardware is

determined only by instantiation. So it is better to point out this idea in its formal

semantics.

The second beneﬁt is that locations are statically determined and easy to be sep-

arated in our semantics. Petr4’s semantics allocates fresh locations during execution.

In practice, a counter is used to track the next location to allocate. However, that

means we need to reason about this counter, even as the behavior of the program

is totally irrelevant to the counter. And the locations are, in other words, dynami-

cally allocated pointers, so we need to consider that the locations are nonoverlapping

and constants are never modiﬁed when deﬁning the semantics of the program logic

judgments and proving soundness of the logic.

Program analyzers also need to be careful to avoid name capturing. The usage of

locators in our semantics simpliﬁes the variable disambiguation procedure. Context

does not change within a control/parser object, so it does not need to be maintained

in the type system, soundness proof, and the analyzer.

Function call interface Finally, there is another diﬀerence between Petr4 and our

semantics that is noteworthy but perpendicular with the instantiation-versus-closure

discussion. Function parameters are passed as a list of values in our semantics rather

than a named mapping as in Petr4, because the names of formal parameters are

internal for the callee, so the caller should not refer to these names for copying-

in and copying-out. In particular, control/parser instances may be parameterized

with other control/parser instances as constructor parameters. These host instances

can call these parameter instances or pass them to other instances as constructor

parameters. In either case, only the type signatures of the callees are relevant, not

the names. In modular reasoning, we want to reason about the caller with only the

behavior of the callee but not the implementation, so we want to remove the names

of formal parameters in the semantics of function call.

2.5 Architecture speciﬁcation

Architecture-speciﬁc components are not part of P4’s core semantics, but they are

very important in order to formalize programs’ behavior. The formal speciﬁcation

of a P4 architecture consists of three parts: the extern objects and methods, table

match kinds, and the switch model. The P4 semantics is parameterized by such an

architecture speciﬁcation.

The speciﬁcation of extern objects and methods consists of an initialization

function for each class of objects and an execution relation for each method. The

initialization function is used in the instantiation phase. Its input is a 5-tuple

(Γ

ext

, e

extern

, type, p, (p

param

| v

param

)), where Γ

ext

and e

extern

are the static environ-

ment and the dynamic state for extern objects in the ongoing instantiation phase,

respectively. The list of type parameters is denoted as type. The path p indicates

where the extern object is instantiated. The last element, (p

param

| v

param

), is a mixed

list of paths and values, which is the constructor parameters The paths in the list

represent parameters that are objects. The output of the initialization function is a

pair (Γ

′

ext

, e

′

extern

The execution relation is used in the execution phase. It is a relation on an 8-tuple

(Γ

ext

, e

extern

, p, type, v

param

, e

′

extern

, v

′

param

, sig),

where Γ

ext

is the static extern environment, p is the path of the extern object, e

extern

extern state before the call, type and v

param

are type and in parameters, e

′

extern

is the

extern state after the call, v

′

param

is out parameters, sig is the signal.

An architecture may deﬁne custom match kinds, so their interpretation needs to

be speciﬁed by the architecture. Technically, it deﬁnes a function that takes a list

of values of keys with their match kinds and a list of table entries and returns the

matched table entry.

The switch model is the behavior of the whole switch. P4 programmable hard-

ware consists of P4 programmable components and ﬁxed-function components, and

a P4 program deﬁnes the programmable components. But still it is necessary to

deﬁne other components and the connection between them. So in an architecture

speciﬁcation, there is a inductive relation that deﬁnes how the architecture execute a

packet. This relation invokes the P4 semantics for the P4 programmable components.

Currently, it only processes one packet at a time without queueing and concurrency.

2.6 Implementation

This section brieﬂy describes the implementation of the semantics in this chapter.

P4light and front end The P4light syntax and the front end consist of four parts:

P4light syntax, parser (including lexer) and typechecker (including elaborator), two

transformation passes (as described in Section 2.1), and pretty printer. The P4light

syntax is written in Coq and extracted to OCaml to connect with OCaml code. The

parser and typechecker are written in OCaml. The two transformation passes are

written in Coq and extracted to OCaml, so it is possible to prove their correctness in

the future. The pretty printer is an OCaml program that prints the P4light AST as

Coq source code, in order to handle P4 programs in Coq.

Instantiation The instantiation phase is implemented as a computable function

in Coq. One may use Coq’s built-in Compute mechanism to generate the global

environment.

Execution Normal values and storable values are implemented as a data type pa-

rameterized by diﬀerent bit representations (bool and option bool, respectively), This

makes some proofs applicable for both representations. The execution phase judg-

ments are deﬁned as (mutually) inductive relations.

Artifact The artifact from this chapter is distributed via a GitHub repository [43].

Chapter 3

Program Logic

Program logics, especially Hoare logic and separation logic, are widely used for rea-

soning about imperative programs in languages such as C and Java.

In this chapter,

we introduce our “Veriﬁable P4” program logic for semi-modular veriﬁcation of P4

control blocks. The basis of Veriﬁable P4 logic is Hoare logic with function calls [25].

It uses modiﬁcation sets, as in Dafny [26] and ACSL [14], to track heap-like shared

state between functions. We propose hierarchical extern predicates that provide a bet-

ter encapsulation of state representation and modiﬁcation sets in P4 control blocks,

utilizing the hierarchical structure of the state as designed in Chapter 2. The syntax

and semantics of function speciﬁcations and Hoare triples are based on hierarchical

extern predicates, and then we derive program logic rules from the operational se-

mantics. All the program logic rules are proven sound with respect to the operational

semantics in the Coq proof assistant.

Veriﬁable P4 logic requires each function to have a function speciﬁcation in the

same way as many other program logics. Because P4 does not support recursion, the

functions can be proved in the order from the lowest level to the highest level. So an

Program logics are not only the standard method for proving functional correctness of imperative

programs, but also the logical basis to justify static analysis algorithms.

PATH p, MOD m M

PRE (ARG

⃗

P , MEM

⃗

Q, EXT

⃗

POST (RET v, ARG

⃗

′

, MEM

⃗

′

, EXT

⃗

′

)

Figure 3.1: Simple form of function speciﬁcation

environment of function speciﬁcations is not necessary in the program logic judgments.

and step-indexing [2] is not needed in the semantics of function speciﬁcations.

A simple form of function speciﬁcation is shown in Figure 3.1. We postpone

detailed explanation until Section 3.2, and only provide an overview to motivate hi-

erarchical predicate. In Figure 3.1, p is the path where the function is executed (the

fully qualiﬁed name of the object), which is the same as in the semantics. MOD m M

indicates the sets of local variables and extern objects modiﬁed in the function. De-

scribing modiﬁed local variables is necessary, because P4 programs have local variables

visible in diﬀerent functions (actions) in the same class (e.g. a control block, see the

example in Section 3.2). The rest of the function speciﬁcation is a precondition and

a postcondition. RET v describes the return value. The three remaining parts ARG,

MEM, and EXT are descriptions of in and out arguments, local variables, and extern

objects, respectively.

The extern objects are crucial for stateful applications, and their state is shared

around the whole program. In a function speciﬁcation, the state of extern objects is

described by EXT

⃗

R, where

⃗

R is a list of hierarchical predicates for extern objects as

deﬁned in Section 3.1. This design allows us to meet two requirements at the same

time: (1) to avoid the diﬃculty of automating proofs in separation logic, and use

modiﬁcation set instead to manipulate the predicates, and (2) to still encapsulate the

representation and modiﬁcation of extern objects inside a control block, which makes

them invisible from outside in the speciﬁcation, just as they are invisible from outside

in the program. In Section 3.2, we describe the form of function speciﬁcations and

program logic rules of our P4 program logic in detail.

Veriﬁable P4 program logic does not handle packet parsers in P4. Parsers have

a diﬀerent control ﬂow and they are stateless, so it is presumably more suitable for

a diﬀerent approach of veriﬁcation. Leapfrog [19] is an automatic veriﬁer for P4

parser equivalence, and it may be extended to more general automatic veriﬁcation

and veriﬁcation with soundness proof in the future. So we only focus on control blocks

in the program logic for P4.

3.1 Hierarchical extern predicates

Modularity is just as important in veriﬁcation as in programming. It saves eﬀort and

enables reusable components. To achieve modularity, we follow the approach used

by most modular veriﬁcation frameworks, which is using function speciﬁcations, each

of which consists of a precondition and a postcondition, to describe a function, in

particular, a control block. As the extern objects in the control block are persistent,

their states need to be maintained in function speciﬁcations. However, these extern

objects are invisible from outside of the control block, so we want to encapsulate which

extern objects are used and how they represent the state of the control instance in a

predicate, so that the client of the control block (the code that calls the control block)

can use it modularly (decoupled from its implementation details) and conveniently

(without mentioning each extern object).

Suppose we have a predicate P without its deﬁnition and P holds before executing

a program block c. After executing c, whether P still holds or not is unclear, because

the objects that P depends on may be modiﬁed by c. One traditional solution is

separation logic, but the complexity and expressive power of separation logic are not

needed for P4, a statically instantiated language without pointers. Another tradi-

tional solution is using modiﬁcation sets and dependent sets. That is, we characterize

c with its modiﬁcation set, denoted with M, which is the set of paths of the extern

objects that c may modify, and characterize P with its dependent set, denoted with

D, namely the set of paths of the extern objects that P may depend on. If M and

D are disjoint, then P still holds after executing c. But, in this approach, D exposes

the names of objects used internally and breaks modularity.

To address these issues, we introduce hierarchical predicates for extern objects

based on P4’s hierarchical state structure and paths. We also refer to them as extern

predicates in certain contexts. For paths p and q, let p ⊒ q denote that p is a preﬁx

of q.

According to the allocation of control plane names, if the path of a control

instance is p, then each extern object allocated inside the control instance has path q

such that q ⊑ p. When using a predicate P to describe the persistent state in control

instance p, we set the dependent set of P as {p}, and interpret it as P may depend

on extern objects with path q such that q ⊑ p. In general, predicate P and dependent

set D should satisfy that P only depends on the objects with path q such that there

exists p ∈ D such that q ⊑ p. In other words, for any states s and s

′

(∀pq. p ∈ D ∧ q ⊑ p =⇒ s(q) = s

′

(q)) =⇒ (P (s) ⇐⇒ P (s

′

)). (3.1)

The modiﬁcation set M is interpreted in the same way: program c may only modify

extern objects with path q that there exists p ∈ M such that q ⊑ p. If P does not

depend on any objects q modiﬁed by c, i.e. there does not exist p, q, r such that

p ∈ D ∧ r ⊑ p ∧ q ∈ M ∧ r ⊑ q, (3.2)

then P still holds after executing c. To simplify (3.2), let

p ∥ q := ∀r. ¬(r ⊑ p ∧ r ⊑ q), (3.3)

Here preﬁx is considered in terms of lists of identiﬁers. For example, a.b is a preﬁx of a.b.c, but

not a preﬁx of a.bb. And p is also a preﬁx of itself.

p ⊑ q, p ⊏ q, and p ⊐ q are deﬁned accordingly.

which read as paths p and q are disjoint. In other words, p ∥ q means one cannot

get the same r by appending to p and q, so p and q must branch somewhere. Thus,

p ∥ q has an alternative deﬁnition that there exist path r as the common preﬁx and

identiﬁers a and b such that

a ̸= b ∧ p ⊑ r.a ∧ q ⊑ r.b. (3.4)

This gives a method to eﬃciently test disjointness. And we deﬁne disjointness between

sets as

D ∥ E := ∀p ∈ D. ∀q ∈ E. p ∥ q,

and (3.2) can be rewritten as D ∥ M.

When handling hierarchical predicates, it is preferable for the dependent sets to

be automatically inferred from the predicates instead of ﬁgured out manually. So we

deﬁne a syntax of hierarchical predicates, denoted by P :

P ::= P

pure

| p 7→ v | P

∧ P

| P

∨ P

| ¬P

| ∃x.P (x) | ∀x.P (x) | wrap

(P )

pure

::= pure predicate

p ::= path

v ::= object value

D ::= path set.

It includes atoms (pure predicates and singletons), normal logical combinators, and

a wrapper wrap

to replace the dependent set with set D that dominates the original

dependent set. The logical combinators and quantiﬁers are interpreted as usual.

The wrapper wrap

does not change the interpretation of the predicate but only the

dependent set. Each hierarchical predicate P has its dependent set δ(P ) determined

WF(P

pure

) WF(p 7→ v)

WF(P

) WF(P

)

WF(P

∧ P

)

WF(P

) WF(P

)

WF(P

∨ P

)

WF(P )

WF(¬P )

WF(P (x)) δ(P (x)) does not depend on x

WF(∃x.P (x))

WF(P (x)) δ(P (x)) does not depend on x

WF(∀x.P (x))

WF(P ) δ(P ) ⊑ D

WF(wrap

(P ))

Figure 3.2: Inference rules for well-formedness

by its syntax:

δ(P

pure

) = ∅ δ(p 7→ v) = {p}

δ(P

∧ P

) = δ(P

) ∪ δ(P

) δ(P

∨ P

) = δ(P

) ∪ δ(P

)

δ(¬P ) = δ(P ) δ(∃x.P (x)) = δ(P (x))

δ(∀x.P (x)) = δ(P (x)) δ(wrap

(P )) = D

Predicate P and its syntactic dependent set δ(P ) might not satisfy (3.1). We say P

is well-formed if P and δ(P ) satisfy (3.1), i.e.

wf(P ) := (∀pq. p ∈ δ(P ) ∧ q ⊑ p =⇒ s(q) = s

′

(q)) =⇒ (P (s) ⇐⇒ P (s

′

)). (3.5)

In order to derive wf(P ) from the syntax of P , we introduce a set of inference

rules for syntactic well-formedness WF(P ) as shown in Figure 3.2. The ﬁrst ﬁve rules

do not have side conditions. The rules for quantiﬁed predicates require that δ(P (x))

does not depend on the quantiﬁed variable x. As the dependent set is calculated

syntactically, this means x either does not appear in the path p in p 7→ v, or the

occurrences in paths are wrapped by wrap

where D does not depend on x. The

rule for wrap

requires that the original dependent set δ(P ) is covered by the new

dependent set D, denoted by δ(P ) ⊑ D. It is deﬁned as

E ⊑ D := ∀p ∈ E. ∃q ∈ D. p ⊑ q.

Theorem 3.1 (Soundness). For any predicate P , if WF(P ) is derivable from the

inference rules in Figure 3.2, then we have wf(P ) as deﬁned in (3.5).

Proof. The proof is by induction on the inference tree. Most of the rules are straight-

forward, except the last rule. Assume we have δ(P ) ⊑ D and WF(P ). The condition

δ(P ) ⊑ D means

(∀pq. p ∈ δ(P ) ∧ q ⊑ p =⇒ s(q) = s

′

(q)) =⇒ (P (s) ⇐⇒ P (s

′

)).

The condition WF(P ) implies wf(P ) by induction. We need to prove wf(wrap

(P )),

which is

(∀pq. p ∈ D ∧ q ⊑ p =⇒ s(q) = s

′

(q)) =⇒ (P (s) ⇐⇒ P (s

′

)),

as wrap

does not change the interpretation. So it is enough to show

(∀pq. p ∈ D∧q ⊑ p =⇒ s(q) = s

′

(q)) =⇒ (∀pq. p ∈ δ(P )∧q ⊑ p =⇒ s(q) = s

′

(q))

We only need to show, for every path q,

(∃p. p ∈ δ(P ) ∧ q ⊑ p) =⇒ (∃r. r ∈ D ∧ q ⊑ r) (3.6)

Let p be as in the ﬁrst existential quantiﬁer, so p ∈ δ(P ) and q ⊑ p. By the deﬁnition

of δ(P ) ⊑ D, there exists a path r ∈ D such that p ⊑ r. By the deﬁnition of ⊑, it is

transitive, so q ⊑ r. So we have (3.6).

Recall the program in Figure 2.1. In Section 5.3, we are going to relate the

P4 program with a functional program that serves as its functional model, in order

to prove properties about the program. So we use hierarchical predicates to relate

P4 state with model values. The predicate row repr(p, r) describes that a Row at p

represents a row model value r:

row repr(p, r) := wrap

{p}

(p.reg 7→ r).

Each Pane has three rows, so its representation predicate pane

repr is the conjunction

of row repr on each P4 instance row i and its corresponding model value a[i]:

pane repr(p, a) := wrap

{p}

i=1,2,3

row repr(p.row i, a[i])

Similarly, a ﬁlter (the name SBFilter is reserved for later use) consists of four panes

and auxiliary registers clear index and timer, and the predicate ﬁlter repr is deﬁned as

ﬁlter repr(p, f) := wrap

{p}

p.clear index 7→ · · · ∧ p.timer 7→ . . .

∧

i=1,2,3,4

pane repr(p.pane i, f.a[i])

It is easy to check well-formedness of these predicates. The dependent set of all these

predicates are {p} for their respective p’s. So the client can handle them without

knowing their internal implementation.

A commonly used technique to prove correctness of programs is layer reﬁnement.

That is, a low-level functional model (like above) is related to a high-level functional

model by a simulation relation f R f

′

, where f is a value from the low-level model

and f

′

is a value from the high-level model. They we prove operations preserve the

simulation relation, then we only need to reason about the high-level model instead

of the low-level model. When we want to describe the P4 state of a SBFilter with a

high-level functional model, we can deﬁne predicate using the quantiﬁer in the syntax:

SBFilter repr(p, f

′

) := ∃f. ﬁlter repr(p, f) ∧ f R f

′

So the client can use the high-level functional model directly.

3.2 Program logic

Based on hierarchical predicates, we give our Veriﬁable P4 program logic that is

useful and proved sound with respect to the operational semantics. As mentioned

above, it is a Hoare logic with dependent set and modiﬁcation set. Given a P4

program, the global environment Γ is directly calculated, so we treat Γ as a known

constant. The form of function speciﬁcation is as shown in Figure 3.3. First, there

are three layers of quantiﬁed variables: WITH ⃗x, WITH ⃗y, and EX ⃗z. All the three

layers of quantiﬁed variables can be dependently typed in Coq’s type system, so they

can include propositions. The outer layer WITH ⃗x provides universally quantiﬁed

variables ⃗x whose scope is the whole speciﬁcation. PATH p indicates “when executing

in an instance at path p”. There is a common pattern, for a P4 class Foo,

WITH p ( : Γ(p) = Foo), PATH p, . . .

This pattern gives a speciﬁcation for every instance p of class Foo, where Γ(p) is an

abuse of notation for “getting the class name of instance p in Γ”. Although it is a

nice form of speciﬁcation, we have not yet established a method to assist its proof

in our veriﬁer (Chapter 4), so it is currently only used in speciﬁcations of extern

methods, whose correctness is directly proved using the semantics without using the

veriﬁer. For internal functions, we use a pragmatic approach that is setting p to be

Full form of function speciﬁcation:

WITH ⃗x, PATH p, MOD m M

WITH ⃗y,

PRE (ARG

⃗

P , MEM

⃗

Q, EXT

⃗

POST (EX ⃗z, RET v, ARG

⃗

′

, MEM

⃗

′

, EXT

⃗

′

)

Quantiﬁer-free form of function speciﬁcation:

PATH p, MOD m M

PRE (ARG

⃗

P , MEM

⃗

Q, EXT

⃗

POST (RET v, ARG

⃗

′

, MEM

⃗

′

, EXT

⃗

′

)

Figure 3.3: Function speciﬁcation

the path of each instance, respectively, and running the proof script multiple times

(see Chapter 5).

WITH ⃗y and EX ⃗z are standard quantiﬁers in function speciﬁcations, as used in

VST [7], VeriFast, and Dafny. WITH ⃗y means variables ⃗y are shared between the

precondition and the postcondition. They can be any value in a particular function

call, so if the precondition is satisﬁed with certain ⃗y, the postcondition is satisﬁed

with the same ⃗y. EX ⃗z are existentially quantiﬁed for the postcondition.

In the C language, we do not need to mention local variables in function speciﬁca-

tions, since each function call creates a new stack frame. But P4 has local variables

visible in diﬀerent functions (actions) in the same class, for example,

control Ingress {

bit<8> x;

action incr() { x = x + 1; }

apply { x = 0; incr(); }

}

The variable x is visible in both the apply block and the incr action. As deﬁned

in our semantics, a P4 function call only creates a new stack frame if it calls to a

diﬀerent instance (which often means a diﬀerent control block, but not always). So P4

functions that are called in the same control block need to describe local variables that

the functions use in their preconditions and postconditions, specify which variables

they modify.

For clarity, we explain the rest of the function speciﬁcation in the quantiﬁer-free

form as shown in the second half of Figure 3.3. The MOD-clause MOD m M speciﬁes

the variable modiﬁcation set m and the object modiﬁcation set M. m is either a set of

variables (indicated with paths) or “∗”, which means any variable can be modiﬁed. M

is the modiﬁcation set as described in Section 3.1. These two sets are placed outside

of quantiﬁed variables ⃗y and ⃗z, so that their validity can be proved and applied

automatically without supplying ⃗y and ⃗z, which depends on symbolic program data

and sometimes requires human eﬀort to ﬁgure out.

The precondition of the form ARG

⃗

P , MEM

⃗

Q, EXT

⃗

R, where

⃗

P is a list of values

corresponding to in parameters (including inout parameters) of the function,

⃗

Q is a

list of pairs of paths and values that describes local variables,

⃗

R is a list of well-formed

hierarchical predicates. The postcondition is similar, but with an additional return

value v.

Bitwise abstract interpretation All values used in preconditions and postcondi-

tions, including v and values in

⃗

P and

⃗

Q, are bitwise abstract values. The same applies

for assertions in Hoare triples (see below). Recall that, in Section 2.3, we mentioned

uninitialized variables in P4 need to be treated explicitly, so variables are stored using

storable values whose bits can take values within 0, 1, and ⊥ (uninitialized).

During veriﬁcation, we have not only uninitialized bits but also unknown bits, for

example, when uninitialized bits are nondeterministically converted to 0/1 during pa-

rameter copy-in/copy-out. Although these bits are “initialized”, it is counterintuitive

Abstract value

Storable value

0 1 ⊥

0 T F F

1 F T F

⊥ T T T

Table 3.1: Truth table of bitwise abstract interpretation

for programmers to keep track of their “initialization” and reason about nondetermin-

istically converted values, so our experience shows most P4 programs do not rely on

such quirky behavior. If the program logic directly reﬂects initialized and uninitial-

ized bits, the veriﬁcation using the program logic becomes inconvenient. For example,

consider that a struct value with two single-bit ﬁelds {|x := 0; y := ⊥|} is passed into

a function as an argument. According to the semantics of argument evaluation, this

value is determinized to ∃b. {|x := 0; y := b|}. Although the programmer is happy

to treat y as uninitialized, it need to be handled explicitly in such a program logic.

It is inconvenient: (1) it requires more often manipulation of quantiﬁers; (2) all the

assertions need to identify whether each ﬁeld is initialized or not.

In order to resolve the inconvenience and simplify veriﬁcation, we apply abstract

interpretation [13] on a simple abstract domain, namely bitwise abstract values. The

construction of bitwise abstract values coincides with storable values, i.e. each bit

takes values within 0, 1, and ⊥, but ⊥ is interpreted as “don’t care”, so it can

correspond with any bit value, as shown in Table 3.1. This is also used to encode

a struct with some of the ﬁelds unspeciﬁed. This abstract interpretation is usually

unnoticeable by the veriﬁer user.

Hoare triple In Veriﬁable P4 program logic, a Hoare triple is of the form

{X}stmt{Y, Z}, where Y is the normal postcondition and Z is the return postcondi-

tion. This method of manipulating control ﬂow signals in Hoare logic is the same as

ret assert in VST [1, Chapter 24], so we skip detailed discussion, and only consider the

Simpliﬁed rule for assignment:

Γ, p,

⃗

P ⊢ exp ⇓ v

Γ, p ⊢ {MEM

⃗

P , EXT

⃗

Q}x@(inst p

′

) := exp{MEM

⃗

P [p

′

7→ v], EXT

⃗

Simpliﬁed function call rule (when p = q):

Γ, p ⊢ f() ⇓ (q, func) Γ, func satisﬁes PATH q, MOD m M, PRE X

, POST Y

p = q X ⊢ X

(X, Y

, m, M) ⇓

merge

Γ, p ⊢ {X}f(); {Y }

Figure 3.4: Sample program logic rules

form of {X}stmt{Y }. The assertions X and Y are of the form “MEM

⃗

P , EXT

⃗

Q”,

where the meaning of MEM and EXT is the same as in function speciﬁcations.

3.2.1 Program logic rules

Figure 3.4 shows a few simpliﬁed program logic rules, which we use to explain the

interesting aspects of the program logic.

The ﬁrst rule is for assignment, which corresponds to the semantic rule in Fig-

ure 2.12. When the precondition is (MEM

⃗

P , EXT

⃗

Q), we want to ﬁnd the post-

condition of executing x@(inst p

′

) := exp, where inst p

′

is the locator. The premise

Γ, p,

⃗

P ⊢ exp ⇓ v means the expression exp is evaluated to v if the variables are as

described in

⃗

P . (We omit the conversions between normal values and storable values

that use three-valued bits.) Then the post condition is simply obtained by setting

′

7→ v in

⃗

P . Although we do not present the soundness proof here, one can see that if

is almost straightforward because the modiﬁcation of s

local

in the semantics is directly

reﬂected in the logic rule. In comparison, using the Petr4 semantic rule in Figure 2.12

will be much more complicated to design a program logic and prove its soundness,

because it is necessary to track location allocation and prove their distinctness.

The second rule in Figure 3.4 is for function calls. For simplicity, we present a

special case that function f has no parameters or return value, its speciﬁcation is

given in quantiﬁer-free form, and p = q (explained below).

The most interesting part is the last premise, (X, Y

, m, M) ⇓

merge

Y . It merges

the assertion at the call site with the assertion from the function, using the disjointness

we established in Section 3.1. It is deﬁned as

((MEM

⃗

, EXT

⃗

), (MEM

⃗

, EXT

⃗

), m, M) ⇓

merge

((MEM ﬁlter(

⃗

, m) +

⃗

, EXT ﬁlter(

⃗

, M) +

⃗

where ﬁlter(

⃗

, m) +

⃗

means to remove local variables in

⃗

that appear in m since

they are modiﬁed (remove everything in

⃗

if m is ∗) and put together with

⃗

and ﬁlter(

⃗

, M) +

⃗

means to remove every hierarchical predicates Q in

⃗

that

δ(Q) ∥ M (the dependent set of Q is disjoint from M) and put together with

⃗

The meaning of the rest premises in the second rule is as follows. The rule ﬁrst

looks up f in Γ and ﬁnds out f is to execute function body func in path q. (We

ignore the detailed function resolution procedure here.) The second premise states

that func satisﬁes its function speciﬁcation. We ignore the ARG part in X

and Y

since there are no parameters. The third premise is p = q, which means the function

body func is executed in the same stack frame where f is called, as deﬁned in the

semantics. In the general case, if p ̸= q, the function body func is executed in a new

stack frame, then the local variable assertions in Y

and the modiﬁed local variables

in m do not aﬀect the postcondition Y . Then the rule needs the precondition X to

imply function precondition X

The complete program logic rules are provided as source code in Coq. We proved

all these rules are sound with respect to the operational semantics in Coq. The result

that we proved can be expressed as the following theorem.

Theorem 3.2 (Soundness). The semantics of the judgment Γ, p ⊢ {X}stmt{Y } is

for every states s and s

′

, and signal sig, if we have Γ, p, s ⊢ stmt ⇓ (s

′

, sig), and have

X(s), then sig = return and Y (s

′

). All Veriﬁable P4 program logic rules are proved

in Coq with respect to the operational semantics and semantics of Hoare triples.

Chapter 4

Tactic-Based Veriﬁer

The previous chapter presented a sound program logic for P4 programs. But verifying

a program, i.e., proving each function speciﬁcation, by directly applying those rules

is still a tough and laborious process, which involves picking appropriate rules, in-

stantiating variables, simplifying terms, and resolving entailments between assertions.

In this section, we describe our “Veriﬁable P4” veriﬁer based on Coq’s Ltac tactic

programming that helps users verify P4 control blocks. The general design of the

tactic-based veriﬁer is similar to VST-Floyd [7], which is a veriﬁer for C programs. It

automatically applies programming logic rules according to the P4 program syntax. It

also includes automatic simpliﬁcation and resolution of assertions for both eﬃciency

and readability of intermediate steps. Our veriﬁer gives a practical way to construct

foundational proofs showing that P4 programs implement functional models; but the

amount of work to verify a program depends on how much the natural interpretation

of the program diﬀers from the mathematical language used in its speciﬁcation.

We ﬁrst give a simple example as a walkthrough in Section 4.1. Then we in-

troduce the core tactics provided by the veriﬁcation system and their mechanism

in Section 4.2. After that, we show the automated resolution of MOD-clauses in

Section 4.3.

Proof assistant Here we give some additional preliminaries for the readers who are

not familiar with proof assistants. The typical workﬂow of interactive theorem prov-

ing in a proof assistant like Coq is as follows. The user inputs the theorem/lemma

statement they want to prove, which becomes the initial proof goal in the proof as-

sistant. The proof assistant displays the current proof goal(s), and the user inputs a

command, which is called a tactic, to either resolve the proof goal or reduce it into

one or more easier goals, based on their knowledge about the proof. This procedure

iterates until there are no remaining proof goals, at which point the lemma is proved.

The advantage of interactive theorem proving is it can prove complicated mathemat-

ical properties that are highly unlikely to become automatically provable in the near

future.

A tactic is a program used to construct a proof. Tactics can vary from performing

a simple proof step to automatically searching for a proof using advanced algorithms.

In practice, users can summarize common patterns in proofs and create new reusable

tactics. Program veriﬁcation proofs often contain many common patterns, making

them an ideal target for building a tactic system. For example, VST-Floyd is a

tactic system for C program veriﬁcation using separation logic. Coq provides a tactic

language, Ltac [16], for building custom tactics.

4.1 Walkthrough

In this section, we give a simple example of program veriﬁcation using our Veriﬁable

P4 veriﬁer. A real, large-scale application of proving a sliding-window Bloom ﬁlter

will be presented in Chapter 5.

Consider a P4 control block in the V1Model architecture as shown in Figure 4.1.

The ﬁrst two lines list the function prototypes of read and write. The program to

verify is in the Increment control block, which maintains a register array reg with a

// Function prototypes

void read(out T result, in bit<32> index);

void write(in bit<32> index, in T value);

// Prorgam to verify

control Increment(out bit<8> x) {

apply {

ℓ

reg.read(x, 0);

ℓ

x = x + 1;

ℓ

reg.write(0, x);

}

Figure 4.1: An example P4 control block in V1Model

Deﬁnition p : path := [”incr”].

Deﬁnition Increment spec : func spec :=

WITH,

PATH p

MOD ∗ [p]

WITH (x : Z),

PRE

(ARG [], MEM [], EXT [p.reg 7→ (ObjRegister [P4Bit 8 x])])

POST

(RET Null, ARG [P4Bit 8 (x + 1)], MEM [],

EXT [p.reg 7→ (ObjRegister [P4Bit 8 (x + 1)])])

Figure 4.2: Function speciﬁcation of Increment

single cell of type bit<8> (8-bit unsigned integer), referred as reg[0]. Line ℓ

loads

the value of reg[0] to x. Line ℓ

increases x by 1 (modulo 2

). Line ℓ

writes x back to

reg[0]. Overall, this function increases the value in reg by 1 and passes the new value

out as out parameter x.

Figure 4.2 shows the function speciﬁcation written in Coq (simpliﬁed for pre-

sentation). The value in the register reg before calling apply is represented by a

mathematical integer (x : Z) in the WITH-clause. The state to track in the function

speciﬁcation is simple: the precondition has neither in arguments (ARG) nor local

variables (MEM), and the EXT part contains a singleton extern predicate saying the

of x (x does not need to be between 0 and 255); the postcondition contains return

value Null (as the control block does not have a return value), out argument x + 1

(ARG), no assertion on local variables (MEM), and a singleton extern predicate saying

the new value of the register p.reg is x + 1.

The proof of Increment spec using the veriﬁer is a step-by-step forward symbolic ex-

ecution, as in VST-Floyd [7]. Roughly speaking, the user ﬁrst applies the start function

tactic to reduce from Increment spec to a Hoare triple across the function body, of

the form

}ℓ

; ℓ

{Y }.

Then the user uses the step

and step call tactics to infer the postcondition of each

statement. It would be ideal if the inferred postcondition is the strongest postcondi-

tion: it is indeed the strongest for assignment statements, but it is not the strongest

for function calls. Because function calls are reasoned using function speciﬁcations,

which contain quantiﬁers, the strongest postcondition of a function call also includes

quantiﬁers. Those quantiﬁers are cumbersome in entailment resolution and human

interaction, so an instance of quantiﬁer variables is inferred to generate a postcondi-

tion, which is not the strongest but enough to prove the program in most cases. Let

be the postcondition of executing ℓ

from X

. Then the tactic applies program

logic rules to reduce the goal to

}ℓ

; ℓ

{Y }.

These tactics are called forward and forward call in VST. We call them step instead, since “for-

ward” is a commonly used term in networking.

Proof.

start function.

step. (∗ initialization ∗)

step call register read body.

{ entailer. } (∗ function precondition ∗)

{ simpl. lia. } (∗ resolve arithmetic conditions ∗)

step.

step call register write body.

{ entailer. } (∗ function precondition ∗)

{ simpl. lia. } (∗ resolve arithmetic conditions ∗)

step. (∗ empty statement ∗)

entailer.

Qed.

Figure 4.3: Proof script of Increment

The next two steps reduce the proof goal to {X

}ℓ

{Y } and then to X

=⇒ Y

in a similar way. Finally, the user uses the entailer tactic to resolve the entailment

=⇒ Y .

Figure 4.3 shows a more concrete version of the Coq proof script to prove

Increment spec. In addition to the aforementioned start function, step, step call,

and entailer tactics, Figure 4.3 uses step two more times to handle an implicit

initialization statement at the beginning and an empty statement at the end, respec-

tively. Also, after applying step call with a function body proof register read body or

ﬁed. The variables in the WITH-clauses in register read body and register write body,

i.e. quantiﬁed variables, are also inferred by entailer. Some arithmetic proof goals are

also needed to prove after function calls, which are used to ensure the index to access

the register is in bounds. They can be proved easily by Coq’s built-in automated

tactics simpl and lia, which are not in the interest of this thesis.

The step call tactic requires the user to provide the proof of the callee func-

tion, such as register read body and register write body. In this example, since

proved them by hand directly with respect to the operational semantics in our

library. These methods are a ﬁxed set for each architecture (such as V1Model and

Toﬁno), so we can provide as a library. In the case of P4 functions (control blocks,

tables, and actions), the user should ﬁrst specify and prove the callee function in the

same way as the example. P4 language does not allow recursive functions, so the user

can prove the functions in order from the lowest layer to the highest. The step call

tactic requires the user to provide the proof instead of automatically searching in

a database. This is because functions, especially control blocks, may have separate

speciﬁcations for diﬀerent cases, and automatic search may not ﬁnd the proof for the

desired speciﬁcation.

4.2 Tactics

In this section, we give detailed explanation of the structure of the tactic system, func-

tionality of tactics, and their implementation. Besides start function, step, and entailer

mentioned in the previous section, there are two more groups of tactics: predicate

manipulating tactics that perform necessary transformations during the step-by-step

proof, and table tactics that handles P4 tables.

start function tactic (for normal functions) The start function for normal func-

tions is used at the beginning of each normal function proof, including actions and

control blocks. It reduces a function speciﬁcation to a Hoare triple. It ﬁrst resolves

the MOD-clause as Section 4.3 will show. Then it simulates the copy-in procedure

by converting the in-argument assertions in the precondition into local variable as-

sertions. It also infers the postcondition of Hoare triple to prove from the function

postcondition by generating the weakest precondition of the copy-out procedure.

step tactics There are a three diﬀerent step tactics used depending on the next

program statement:

• step: Execute the next program statement, including built-in function calls but

not other function calls.

• step if/step if Y : This tactic handles if-statements. Y is the postcondition of

the if-statement, which can be omitted if there are no other statements after

the if-statement. step if makes one subgoal for the then-clause and one for the

else-clause.

• step call lem ⃗v: This tactic processes a call to a function whose speciﬁcation

is proved by the lemma lem. ⃗v is an optional list of Coq terms to instantiate

the second WITH-clause in the function speciﬁcation proved in lem. If some

of the WITH-variables are not instantiated (including that ⃗v is not provided,

⃗v is shorter than the list of variables, and some terms in ⃗v are ﬁlled with “ ”),

the uninstantiated variables are ﬁlled with uniﬁcation variables, which means

leaving them as values to be determined. These values are often inferable from

the values in the current assertion and function arguments, and automatically

ﬁlled by the following entailer tactic. So the user only needs to provide them

manually in some special cases.

All the step tactics include simplifying expressions with best eﬀort.

Predicate manipulating tactics During the proof of a function body, the user

sometimes needs to transform the precondition in the Hoare triple before proceeding

to the next step. For example, suppose the precondition contains a encapsulated

hierarchical predicate, e.g. wrap

(P ∧ Q ∧ R), and the next step is a function call

that needs to match the contents inside wrap. Then she wants unpack the predicate

into three single predicates P , Q and R using the normalize EXT tactic below before

calling step call. The list of predicate manipulating tactics is as follows.

• normalize EXT: Flatten the hierarchical predicates by breaking wrap and con-

junctions, and then move the members to the top level of the EXT part as much

as possible.

• Intros: Move an existentially quantiﬁed variable from the precondition to the

context.

• Intros prop: Move a pure proposition (P

pure

) from the precondition to the con-

text.

• P4assert P : This tactic is used when a pure proposition P is needed for manual

transformation (e.g. rewriting with an equality). After applying P4assert P ,

the user ﬁrst proves the precondition implies P , then P will be available in the

context for the remaining proof.

entailer tactic The entailer tactic is used to resolve or simplify assertion en-

tailments, mostly in the form of MEM

⃗

Q, EXT

⃗

R =⇒ MEM

⃗

′

, EXT

⃗

′

ARG

⃗

P , MEM

⃗

Q, EXT

⃗

R =⇒ ARG

⃗

′

, MEM

⃗

′

, EXT

⃗

′

. These goals appear at

the end of a function, a block, or at a function call (where we need to prove that

the current precondition satisﬁes the function precondition). This tactic is safe in

most cases: safe means it does not turn a provable goal into any unprovable goals.

The exception is the case where a pure proposition needs to be proved from the

precondition ﬁrst, before reducing the entailment.

The reduction of entailment in this tactic is done by matching the corresponding

parts in the precondition and the postcondition. Function arguments are matched by

their positions. Local variables are matched by their names. Both function arguments

and local variables are reduced to the order relation (inclusion) in the abstract domain,

and resolve automatically if possible. Singleton extern predicates, which are of the

form p 7→ v, i.e. the value of the extern object p has value v, are matched with any

singleton with the same p. Other extern predicates, if they are not directly resolved,

are not be able to be matched. All the resolution steps include instantiation of

uniﬁcation variables, which are often introduced by postconditions with existential

variables and by function calls for which the user has not (manually) provided an

instantiation for the WITH-clause.

There are some other tactics for resolving entailments. The Exists tactic instan-

tiates existential quantiﬁers in the postcondition. Predicate manipulating tactics

mentioned above are also applicable on entailments.

Table tactics There are two tactics for tables:

• start function (for tables): This tactic is used at the beginning of proofs of tables,

which are considered as a kind of function. It only supports the case that table

entries are known. It evaluates table key expressions and matches them against

the entries. It reduces the proof goal into each entry’s action call.

• table

action lem: After splitting a table into the cases for each entry, table action

lem is used on each goal to apply the function body proof lem of the action called

by the table in this case and reduce the proof goal into two entailments, before

and after the function call, respectively.

Summary With these ﬁve groups of tactics handling P4 execution, users can easily

verify P4 programs without paying further attention to P4 features. They only need to

focus on normal Coq proof speciﬁc to their theory and algorithm, such as arithmetic,

data structures, and packet forwarding protocols.

4.3 Automated proof of MOD-clauses

In a function speciﬁcation, we need to have a MOD-clause of the form MOD m M to

indicate modiﬁed local variables and extern objects, so that the function speciﬁcation

can compose with the assertion at the call site. When the start function tactic reduces

a function speciﬁcation to a Hoare triple, it ﬁrst proves the MOD-clause. P4 is

a simple language and the phase distinction between instantiation and execution

has eliminated higher-order runtime behavior, so the MOD-clause should be proved

automatically.

To prove a function only modiﬁes m and M, we reason on the syntactic structure of

the function body.

The breakdown proof is done by applying proven-sound deduction

rules similar to program logic rules, e.g.

Γ, p ⊢ stmt

MOD m M Γ, p ⊢ stmt

MOD m M

Γ, p ⊢ stmt

; stmt

MOD m M .

Simple statements without function calls are also proved by applying such rules.

Function call statements can be proved either by stepping into the function or by

using its function speciﬁcation. Function speciﬁcations designed in Chapter 3 has the

MOD-clauses outside of quantiﬁed variables, except the top-level quantiﬁed variables

(⃗x), which are expected to be inferable. So instantiating other quantiﬁed variables

is not required, making it easy to automatically apply the MOD-clauses in function

speciﬁcations.

The proof search is based on Coq’s built-in eauto search engine, so we do not

have to build and optimize a new one. The deduction rules are designed such that

each step has at most one applicable rule, so the proof search does not backtrack and

is reasonably eﬃcient. The eauto search engine also allows the user to add proved

function speciﬁcations into the rules, so the caller does not need to step in to the

It is possible that a variable is ﬁrst modiﬁed and then modiﬁed back. In this case, these syntax-

directed proof rules are incomplete. However, it is uncommon in practical programming, and if it

occurs, it is no harm to add them to the MOD-clause and maintain their value in the precondition

and the postcondition.

function. In the large-scale example presented in Chapter 5, the proof search takes

at most about 3 seconds for a function.

Discussion Another approach to prove MOD-clauses automatically is proof by re-

ﬂection, which is the method of proving a property by writing a testing program for

the property and proving its correctness. So the proof is simply by evaluating the

program. We did not use proof by reﬂection for MOD-clauses because proof by re-

ﬂection is usually considered more heavyweight. It requires more eﬀort to implement

and modify. We refer the reader to a fantastic book [10] for more information about

proof by reﬂection.

Chapter 5

Application Demonstration:

Sliding-Window Bloom Filter

In this section, we present an application case of our veriﬁcation framework to demon-

strate its utility. Using this example, we show our Veriﬁable P4 is capable of mod-

ularly verifying P4 programs at a larger scale, and proving stateful properties of

processing a sequence of packets (called ﬂow properties).

Consider a switch that connects an internal network with an untrusted external

network. Adversaries from the external network may perform distributed denial-of-

service (DDoS) attacks on internal hosts. Although the switch is capable of handling

the huge traﬃc, the internal hosts cannot, because they are based on CPUs. One

policy for protecting internal hosts is to allow incoming traﬃc recently requested by

internal hosts and block most unsolicited incoming traﬃc. This is a kind of stateful

ﬁrewall, which can be implemented based on a sliding-window Bloom ﬁlter (SBF).

SBF is an approximate data structure for maintaining a set whose elements expire

after a ﬁxed amount of time. We take a P4 implementation of such a stateful ﬁrewall

on the Toﬁno architecture with a single pipeline as a case study on verifying stateful

P4 programs. It is acceptable that the stateful ﬁrewall allows a small portion of

unsolicited traﬃc that internal hosts can handle, but it is necessary to prove that the

stateful ﬁrewall does not block any benign traﬃc. We formalize this property as a

formula on the history of packets and give an end-to-end veriﬁcation.

5.1 Sliding-window Bloom ﬁlter

Programmable switches have limited resources. Therefore sketching data structures

are used to implement features with tiny space and time (usually in terms of hardware

pipeline stages). The trade-oﬀ is tolerating errors with a low probability.

A famous example of sketching data structures is the Bloom ﬁlter [4], which

approximately supports querying existence of elements in a set. It supports two

operations: (1) inserting an element, and (2) querying an element and reporting

“likely in the set” (positive) or “deﬁnitely not in the set” (negative).

Beyond the Bloom ﬁlter, sliding-window Bloom ﬁlter (SBF) is the task of main-

taining elements in a set, and automatically removing each element approximately T

time units after its most recent insertion. For ﬁxed time interval T and tolerance pa-

rameter δ, the data structure provides two operations: (1) inserting an element with

a timestamp (timestamps of operations must be nondecreasing), and (2) querying an

element with a timestamp and reporting “likely inserted within time (1 + δ)T ” (pos-

itive) or “deﬁnitely not inserted within time T ” (negative). A classic algorithm for

sliding-window Bloom ﬁlter is using multiple Bloom ﬁlters periodically, which we will

show below. Although there is a more eﬃcient algorithm [32], that algorithm cannot

be implemented on the Toﬁno architecture (nor on other hardware P4 architectures).

We choose the sliding-window Bloom ﬁlter on the Toﬁno architecture generated

by the CatQL code generator [35] as the veriﬁcation target. In the following sections,

we will prove that the P4 program always returns a positive result if the element is in

the set. This is called no-false-negative property. On the other hand, it is also desired

that the probability of returning a positive result when the element is not in the

set (low-false-positive-rate property). This property requires a probabilistic model of

elements and hash functions used in the Bloom ﬁlter, so we leave it as future work.

But the no-false-negative proof includes useful steps for the low-false-positive-rate

property.

A sliding-window Bloom ﬁlter (SBF) consists of k Bloom ﬁlters, called panes.

Each pane has r rows, each associated with a hash function h

. The hash function

array (h

, h

, . . . ) is the same for all the panes.

Each row is a hash table of S bit-

value slots initially ﬁlled with 0s.

Inserting x into a Bloom ﬁlter is to set the h

(x)-th

slot to 1 for each row i. Querying x in a Bloom ﬁlter is to take conjunction over the

(x)-th slot of all rows i. If x is already inserted, then querying x must be 1 as the

(x)-th slot is set to 1 for all i. So if the result is 0, it is “deﬁnitely not in the set”.

One can expect the false-positive rate is low: If γ is the density of 1s in each row,

an ideal analysis of hash functions gives the probability of getting 1 on query of a

not-in-the-set x is γ

An SBF maintains time by dividing it into pieces of the same size, called time

panes, as shown in Figure 5.1. Each time pane is associated with an SBF pane, which

is allocated circularly. In this example, there are k = 4 SBF panes. At any moment,

there are k−1 working panes (denoted by solid boxes) and 1 clearing pane (denoted by

a dashed box). If the present time is in pane3 as shown in the ﬁrst row of Figure 5.1,

pane1 through pane3 are working panes, and new elements will be inserted into the

latest working pane, pane3. When querying, all working panes are queried and the

overall result is their disjunction. So a query is guaranteed no-false-negative, if the

element has been inserted between the beginning of the earliest working pane and

One may choose to use diﬀerent hash functions for each pane, for example, in order to prevent

some exploitation of the hash functions. These are not considered in this thesis.

The implementation of panes (Bloom ﬁlters) is slightly diﬀerent from the standard implemen-

tation, which uses a single hash table for all hash functions. This diﬀerence is because the Toﬁno

architecture only allows one update per register (the underlying implementation of hash table) per

packet. This diﬀerence does not aﬀect the essence of the Bloom ﬁlter.

time

pane

pane1 pane2 pane3pane4

pane2 pane3 pane4

present

pane2 pane3 pane4 pane1

present

pane1

query window

Figure 5.1: Panes in a sliding window Bloom ﬁlter

the present time, which is the query window shown in the ﬁgure. When the present

time moves to the next pane, as shown in the second row, pane4 is ﬁnished clearing

and reused for new elements. Pane1 is outdated and going to be cleared, and the left

endpoint of the query window becomes the beginning of pane2.

To clear a pane means to set it to all zeros. This cannot be done instantaneously

on Toﬁno (or similar architectures). Instead, with each packet processed, one more

bit of the clearing pane will be set to zero. This is called the clearing duty. So we

require an assumption in our veriﬁcation result: there are always at least s packets

in each time pane to serve the clearing duty, where s is the number of slots in each

row. We can guarantee this assumption using the Toﬁno switch’s packet generator, a

component just before the P4 pipeline that can be conﬁgured to insert extra packets

at the desired rate.

Which elements are maintained by this data structure? As shown in Figure 5.1,

the length of the query window varies between (k − 2) and (k − 1) time panes. So

the aforementioned time T is the lower bound of query window size, i.e. (k − 2) time

panes. If an element is inserted within time T , it is guaranteed no-false-negative.

Rest of the chapter Next, we will show some key points in the P4 program of

SBF, which provides further illustration of P4 programming, and explains the tricks

used to implement the algorithm eﬃciently that makes the veriﬁcation challenging.

Although the P4 program is clumsy, the functional model (Section 5.3.1) and the

axiomatic interface (Section 5.2.1) to characterize SBF are much easier to read and

reason about.

5.1.1 P4 implementation

The P4 program of SBF that we verify is as follows. This program is generated by

the CatQL code generator. This program is the extended version of Figure 2.1. The

program has been signiﬁcantly simpliﬁed for presentation,

but it unavoidably remains

very complicated, as it was crafted by CatQL to maximize the eﬃcient utilization of

hardware resources. P4 language features, such as registers, actions and tables, also

add complexity to the program.

This program demonstrates the complexity of P4 programs. Nonetheless, it follows

the natural modular structure of SBF, which contains three layers from bottom to

top: row, pane, and SBF. Each control block corresponds to a layer. So we can verify

it modularly.

1 control Row(in bit<8> api, in bit<18> index, out bit<8> res) {

2 Register<bit<8>, bit<18>>(262144, 0) reg;

The program starts with the P4 implementation of a Bloom ﬁlter row. It has

two in parameters: api takes value from NOOP (no operation), INSERT, QUERY, and

CLEAR; it represents the operation to take on the row; index represents the index

of the slot to modify or query, which is a hash value of the key. Row has one out

The full version is available in the appendix.

U execute(in I index) {

U rv;

T value = reg.read(index);

apply(value, rv);

reg.write(index, value);

return rv;

}

Figure 5.2: Behavior of execute in RegisterAction

parameter, namely res, which is used to pass out the query result. Line 2 deﬁnes reg,

which is a Toﬁno register with 2

= 262144 slots, initialized with zeros.

The P4 implementation is a bit more verbose than in conventional imperative pro-

gramming languages, because the P4 implementation is closely related to hardware.

3 RegisterAction<bit<8>, bit<18>, bit<8>>(reg) regact insert = {

4 void apply(inout bit<8> value, out bit<8> rv) {

5 value = 1;

6 rv = 1;

7 }

8 };

9 action act insert() {

10 res = regact insert.execute(index);

11 }

12 action act query() { ... } /∗ Similar to regact insert and act insert ∗/

13 action act clear() { ... } /∗ Similar to regact insert and act insert ∗/

Line 3 deﬁnes a register action regact insert on register reg, which is used to up-

date reg when api is INSERT. It contains an abstract method (c.f. Section 2.2.2)

apply, which is called by the register action itself when its execute method is called

on line 10. The behavior of execute is illustrated in Figure 5.2. In particular,

regact insert.execute(index) sets the index-th slot to 1 and returns 1. Line 9 wraps

the register action as an action. The register actions are deﬁned and wrapped as

actions similarly for QUERY and INSERT.

14 table tbl row {

15 key = { api : ternary; }

16 actions = {

17 NoAction(); act insert(); act query(); act clear();

18 }

19 const entries = {

20 NOOP : NoAction();

21 INSERT : act insert();

22 QUERY : act query();

23 CLEAR : act clear();

24 }

25 }

26 apply {

27 tbl row.apply();

28 }

29 }

Then we deﬁne a match-action table tbl row, which corresponds to a hardware

match-action table. This table is responsible for calling the appropriate action based

on the value of api. The control block’s apply block, which is its entrance, is just

calling tbl row.

30 control Pane(inout pane md t pane md) {

31 Row() row 1; Row() row 2; Row() row 3;

32 apply {

33 row 1.apply(pane md.api, pane md.index 1, pane md.res 1);

34 row 2.apply(pane md.api, pane md.index 2, pane md.res 2);

35 row 3.apply(pane md.api, pane md.index 3, pane md.res 3);

36 }

37 }

The implementation of a pane is simple. It just dispatches the operation to each

of its rows, and returns the result from each row as-is.

38 control SBFilter(in bit<64> key, in bit<8> api, in bit<48> tstamp,

39 out bit<8> res) {

40 sbf md t sbf md; /∗ temporary variables ∗/

41 Register<bit<18>, bit<1>>(1, 0) reg clear index;

42 Register<pair<bit<16>, bit<16>>, bit<1>>(1, {0, 0}) reg timer;

43 Pane() pane 1; Pane() pane 2; Pane() pane 3; Pane() pane 4;

The control block for an SBF has three in parameters, key, api, and tstamp, which

represent the key, the operation, and the timestamp, respectively. The operation can

take value from INSERT, QUERY and CLEAR. All the operations involve serving the

clearing duty, and the CLEAR operation does not do anything else. The out parameter

res returns the query result. Line 40 deﬁnes a struct to use as temporary variables.

Lines 41, 42 & 43 deﬁne the stateful objects that an SBF maintains. Among them,

reg clear index is an 18-bit integer that increments by one in each step and wraps back

to 0 after reaching 2

− 1. It determines the position to be cleared in each row of the

clearing pane. reg timer maintains the time information according to the timestamp

of each packet generated by the switch, in nanoseconds, to control which panes are

being used and when to enter a new time pane. pane 1, . . . , pane 4 are 4 instances

of Pane to implement the panes of the SBF.

68 apply {

69 act hash index 1(); /∗ caculate hash functions ∗/

70 act hash index 2();

71 act hash index 3();

72 tbl clear index.apply(); /∗ increment clear index ∗/

73 tbl timer.apply(); /∗ update timer ∗/

74 tbl set panes.apply(); /∗ generate operation for each pane ∗/

75 pane 1.apply(sbf md.pane 1); /∗ invoke each pane ∗/

76 pane 2.apply(sbf md.pane 2);

77 pane 3.apply(sbf md.pane 3);

78 pane 4.apply(sbf md.pane 4);

79 tbl merge panes.apply(); /∗ merge query results of panes ∗/

80 }

81 }

We temporarily skip some lines of code and show the apply block of SBFilter ﬁrst,

which is the main body of SBFilter. It ﬁrst calculates the three hash functions used

by the SBF, followed by calling tbl clear index to read and increment reg clear index.

Lines 73 & 74 update reg timer and use it to generate an operation for each pane,

respectively. This part involves some tricks to eﬃciently maintain the timer, which

will be explained below. Then it invokes each pane to process their operation, followed

by merging the query result using a match-action table tbl merge panes.

44 RegisterAction<...>(reg timer) regact timer signal 0 = {

45 void apply(inout pair<bit<16>, bit<16>> val, out bit<16> rv) {

46 if (val.lo != 0) { /∗ increase hi by 1 when setting lo back to 0 ∗/

47 if (val.hi == 4∗PT−1) /∗ test if wrapping back to 0 ∗/

48 { val.lo = 0; val.hi = 0; }

49 else

50 { val.lo = 0; val.hi = (val.hi + 1); }

51 }

52 else

53 { val.lo = val.lo; val.hi = val.hi; }

54 rv = val.hi;

55 }

56 };

57 RegisterAction<...>(reg timer) regact timer signal 1 = { ... }

58 table tbl timer { ... } /∗ select the regact based on the 21st bit of tstamp ∗/

59 table tbl set panes { /∗ assign operations to panes based on timer ∗/

60 const entries = {

61 (INSERT, 0∗PT .. 1∗PT−1) : set api(CLEAR, NOOP, NOOP, INSERT);

62 (INSERT, 1∗PT .. 2∗PT−1) : set api(INSERT, CLEAR, NOOP, NOOP);

63 (INSERT, 2∗PT .. 3∗PT−1) : set api(NOOP, INSERT, CLEAR, NOOP);

64 (INSERT, 3∗PT .. 4∗PT−1) : set api(NOOP, NOOP, INSERT, CLEAR);

65 ...

66 }

67 }

Lines 44–67 brieﬂy show the tricks of maintaining the timer. In order to measure

the time elapsed eﬃciently, the program counts the number of rounds that the 21

bit (the bit representing 2

) of the timestamp switches from 0 to 1 and back to 0. We

assume that the packet ﬂow is dense enough that the timestamps are continuous up

to the 21

bit. This ensures the timestamp does not leap without being noticed by

the timer. This assumption is easily satisﬁed by a practical ﬂow, as it only requires

one packet in every 2

ns ≈ 2 ms. And it can be further ensured using the packet

generator. Such a round is called a “tick-tock”, so each tick-tock is 2 · 2

ns.

Let PT (pane time) be the number of tick-tocks in each time pane. The SBF panes

are used circularly, so its period is k ·PT tick-tocks. The register reg timer keeps track

of the position in this period using two integers, lo and hi, which are the 21

bit of the

previous timestamp, and the number of tick-tocks modulo k · PT. Line 44 shows the

bit of the current timestamp is

0. Another register action regact timer signal 1 is used when the 21

bit is 1. These

two actions are selected by table tbl timer. Then table tbl set panes uses the value of hi

to generate operations of panes. This manipulation of timestamps reduces the usage

of arithmetic operators and uses more tables. This is a common way to eﬃciently

utilize resources in P4 programming. But it increases the complexity of veriﬁcation.

5.2 Veriﬁcation organization

To provide navigation for the remaining of this chapter, we begin by presenting an

overview of the the whole veriﬁcation process in Figure 5.3. We ﬁrst compiled the

P4 source program into P4light AST using our front end (Section 2.1). We then

generated the global environment using the instantiation phase program. The rest of

the veriﬁcation process is separated in two parts shown in gray regions: veriﬁcation

of the sliding-window Bloom ﬁlter (SBF) and veriﬁcation of the ﬁrewall (FW) as a

client of the SBF. We will illustrate them in detail in Sections 5.3 & 5.4, respectively.

The veriﬁcation of the SBF concerns the implementation of the SBF, in particular,

the SBF instance and its subordinate objects. The veriﬁcation of the ﬁrewall concerns

the client code that calls the SBF, the model of other programmable blocks (parsers

and deparsers), and the Toﬁno switch model.

We design a narrow interface between these two parts, which only consists of the

interface of the abstract functional model of SBF (but not the implementation) and

two proof blocks: (1) the Veriﬁable P4 proof of that the SBF P4 program satisﬁes

the abstract function speciﬁcation, and (2) the proof of that the abstract SBF model

satisﬁes the axiomatic interface of SBF. The statements of the latter are shown in

subsumption

proof

Con. SBF Model

Abs. SBF Model

Refinement Proof

Con. SBF Func.

Spec.

SBF Objects

P4 Proof

Abs. SBF Func.

Spec.

P4 Proof

FW Model

SBF Axiom. Interf.

FW Func. Spec.

P4 Proof

P4 Program

FW Objects

Depends on contents

Depends on interfaces

Main Theorem

P4 AST

compile

Global Environment

Switch Model

Flow Property

Firewall VerificationSBF Verification

Figure 5.3: Overview. Rounded rectangles represent blocks of deﬁnitions; parallelograms represent blocks of proofs. A solid

line indicates that a block depends on the implementation of another block; a dashed line indicates that a block depends only

on the interface of another block. A proof block usually establishes the relationship between two deﬁnition block and therefore

depends on these two blocks. The proof body is irrelevant to its usage, so proof blocks are always depended with dashed lines.

Section 5.2.1. This narrow interface allows modular veriﬁcation: a data structure

and its client are veriﬁed separately; the data structure proofs can be used for a

diﬀerent client, and the client proofs can be plugged with a diﬀerent data structure

implementation that satisﬁes the interface.

One veriﬁcation methodology used for SBF is layer reﬁnement, which is similar

to the technique in CertiKOS [21] and veriﬁed FEC [11]. We implement two layers of

functional model, called “concrete functional model” and “abstract functional model.”

The concrete functional model is deﬁned closely related to the P4 program, so it is

convenient to prove that the P4 program implements the functional model. However,

the shortcoming of the concrete functional model is that it is too much involved with

the tricks used in the P4 program, making it more diﬃcult to prove desired properties

about the functional model. So we deﬁne the abstract functional model of SBF in

a more elegant way. We prove the reﬁnement relation between these two functional

models, and then prove the abstract function speciﬁcation through subsumption of

speciﬁcations. We then prove the axioms in the axiomatic interface, with the beneﬁt

of the abstract functional model.

The next part is the veriﬁcation of the ﬁrewall. We ﬁrst prove P4 program of

the ﬁrewall as a client of the SBF. For the behavior of the switch, we focus on

the part related to the ﬁrewall build a switch model that assumes the behavior of

other components (parser and deparser). With the switch model, we prove that the

stateful ﬁrewall guarantees “no-false-negative,” in other words, the response packets

to internal hosts are always allowed by the stateful ﬁrewall.

5.2.1 Axiomatic interface of SBF

Figure 5.4 shows the axioms to characterize SBF. There are three operations,

SBFinsert, SBFquery, and SBFclear, deﬁned in the abstract SBF model. SBFinsert

If state f is OK until (at least) deadline t, and if you insert IP address h at time t

and then look up h at time t

′

no more than T seconds later, it will be present.

QueryInsertSame : ∀ f t t’ h, ok until f t → t ≤ t’ ≤ t+T →

SBFquery (SBFinsert f (t, h)) (t’, h) = Some true.

If you could ﬁnd IP address h

′

in the state, and you insert (perhaps diﬀerent)

IP address h, then h

′

is still in there.

QueryInsertOther : ∀ f t t’ h h’, SBFquery f (t’, h’) = Some true →

ok until f t → t ≤ t’ → SBFquery (SBFinsert f (t, h)) (t’, h’) = Some true.

Doing a clear-step won’t aﬀect any query results.

QueryClear : ∀ f t t’ h, ok until f t → t ≤ t’ →

SBFquery f (t’, h) = SBFquery (SBFclear f t) (t’, h).

If state f is OK until deadline t, you can extend its

deadline by up to 100 microseconds (T /C) by [insert].

OkInsert : ∀ f t t’ h, ok until f t → t ≤ t’ ≤ t+T/C →

ok until (SBFinsert f (t,h)) t’.

If state f is OK until deadline t, you can extend its

deadline by up to 100 microseconds (T /C) by [clear].

OkClear : ∀ f t t’, ok until f t → t ≤ t’ ≤ t+T/C → ok until (SBFclear f t) t’.

The initial state is OK until its preset deadline.

OkEmpty : ∀ t, ok until (SBFempty t) t.

Figure 5.4: Axiomatic interface of SBF

inserts a record to the SBF with a timestamp, and serves the clear duty of one slot.

SBFclear only serves the clear duty of one slot. SBFquery queries the SBF with a

timestamp without modiﬁcation. In order to characterize the requirement of serving

the clear duty, we add a predicate ok until f t to test whether f is in good shape

without clearing before t. The meaning of the axioms is explained in the ﬁgure.

These axioms are enough to prove the “no-false-negative” property. (see Section 5.4)

5.3 Veriﬁcation of sliding-window Bloom ﬁlter

This section shows the veriﬁcation process of the SBF module using Coq and Veriﬁable

P4.

5.3.1 Concrete functional model

The ﬁrst step of the veriﬁcation is deﬁning a functional model of the P4 program in

the Coq proof assistant. Because we are going to deﬁne a more abstract functional

model as an intermediate step of the proof, we name this as the concrete functional

model, as it is closer to the original program.

1 Parameter (num slots num rows num panes pane tick tocks : Z).

2 Let cycle tick tocks := pane tick tocks ∗ num panes.

4 (∗ a.[i] and a.[i := x] are get and set for lists, respectively. ∗)

5 (∗ map2 applies a binary function on each pair of

6 corresponding elements of two lists and results in a single list. ∗)

7 (∗ fold andb is logical ”and” over a list of Booleans. ∗)

9 Deﬁnition row := list bool.

11 Deﬁnition row insert (r : row) (i : Z) : row := r.[i := true].

13 Deﬁnition pane := list row.

15 Deﬁnition pane insert (f : pane) (is : list Z) : pane :=

16 map2 row insert f is.

The code above is the ﬁrst part of the concrete functional model. At the be-

ginning, the dimension and time constants of the SBF are deﬁned as parameters in

the functional model. These constants are hardcoded in the P4 program, as the P4

language is not ﬂexible enough to make all of them parameters. With these constants

parametrized, all the proofs about the functional model itself are for any parameters.

And we expect to be able to replay the same Veriﬁable P4 proof scripts, which con-

nects the program with the concrete functional model, for programs with diﬀerent

values of the constants.

The functional model is layered in the same way as the P4 program. Types row

and pane represent the persistent states of rows and panes, respectively. The function

row insert deﬁnes the operation of inserting an element into a row, with the element

is represented by its hash value for the row. The function pane insert deﬁnes the

operation of inserting an element into a pane, with the element is represented by a

list of hash values. The QUERY and CLEAR operations are deﬁned similarly in the

functional model.

18 Record ﬁlter := mk ﬁlter {

19 ﬁl panes : list pane;

20 ﬁl clear index : Z;

21 ﬁl timer : Z ∗ bool;

22 }.

24 Deﬁnition update timer (t : Z ∗ bool) (tick : bool) : Z ∗ bool :=

25 if tick

26 then (fst t, true)

27 else if snd t

28 then if (fst t =? cycle tick tocks − 1)

29 then (0, false)

30 else (fst t + 1, false)

31 else t.

33 Deﬁnition get clear pane (t : Z ∗ bool) : Z :=

34 fst t / pane tick tocks.

36 Deﬁnition get insert pane (cp : Z) : Z :=

37 if (cp =? 0) then num panes − 1 else cp − 1.

39 Deﬁnition ﬁlter insert (f : ﬁlter) (tick : bool) (h : list Z) : ﬁlter :=

40 ... (∗ unpack f, calling update timer, get clear pane, get insert pane ∗)

41 let new panes := panes.[cp := pane clear panes[cp] (Zrepeat clear index num rows)]

42 .[ip := pane insert panes[ip] h] in

43 mk ﬁlter new panes new clear index new timer.

100

Function speciﬁcation of Row.apply() for the INSERT operation:

PATH p

MOD ∗ [p]

WITH (r : row) (i : Z) ( : 0 ≤ i < num slots),

PRE (ARG [P4Bit 8 INSERT; P4Bit index w i], MEM [ ], EXT [row repr p r])

POST (RET Null, ARG [P4Bit 8 1], MEM [ ], EXT [row repr p (row insert r i)])

Function speciﬁcation of action act insert() in Row:

PATH p

MOD [[”rw”]] [p]

WITH (r : row) (i : Z) ( : 0 ≤ i < num slots),

PRE (ARG [ ], MEM [([”index”], P4Bit index w i)], EXT [row repr p r])

POST (RET Null, ARG [ ], MEM [([”rw”], P4Bit 8 1)],

EXT [row repr p (row insert r i)])

Figure 5.5: Function Speciﬁcations with concrete functional model

The code above is the SBF layer of the functional model. Type ﬁlter repre-

sents the persistent state of an SBF. Because the dimension and time constants are

parametrized, we use division and modulo operations to deﬁne the manipulation of

the timer. The function ﬁlter insert deﬁnes the operation of inserting an element into

an SBF, with the element is represented by a list of hash values. Variables cp and

ip represent the indices of the panes to clear and to insert, respectively. The QUERY

and CLEAR operations are deﬁned similarly.

5.3.2 Function speciﬁcations

The function speciﬁcations link the functional model with the P4 program state.

For example, Figure 5.5 shows the speciﬁcation of the control block Row’s apply func-

tion when the operation is INSERT, and the speciﬁcation of the action act insert() are

as shown in Figure 5.5. The hierarchical predicate row repr, as deﬁned in Section 3.1,

is used to link the functional model value with the state of extern objects in P4. Be-

cause each layer supports diﬀerent operations, we write a function speciﬁcation for

each operation and prove them separately, instead of organizing diﬀerent operations

101

into a single function speciﬁcation. The speciﬁcation for other operations and other

layers are deﬁned similarly.

5.3.3 Veriﬁcation of abstract methods

Abstract methods are an important feature in P4, whose semantics we deﬁned in

Section 2.2.2. For example, when P4 is compiled to hardware in architectures such

as Toﬁno, there is a signiﬁcant limitation in the way that the P4 program accesses its

state: only one access (to any particular piece of state) per packet. To work around

that limitation, P4 allows that single access to be a read-modify-write expressed as a

“register action,” in which the custom modiﬁcation is written by the P4 programmer

as an abstract method.

In this section, we show how to verify P4 extern objects with abstract methods, by

demonstrating Toﬁno’s register actions. Register actions are widely used in stateful

programs in Toﬁno, such as the SBF (see Section 5.1.1). A register action contains

an abstract method, apply, that speciﬁes how the register is updated. We have built a

systematic way to modularly verify register actions with respect to functional models,

and built automatic tactics to facilitate the veriﬁcation procedure. We hope this

approach is general enough for all extern objects with abstract methods.

An abstract method is a code block that does not have access to any variables

beyond the parameters passed into the the abstract method. So, in a naive approach,

an abstract method can be modeled as a pure function from values to values. For

example, when the abstract method has one in parameter and one out parameter, we

can model its behavior as a function f : Val → Val. Its function speciﬁcation can be

written as below.

PATH p, MOD ∗ [ ]

WITH (v : Val),

PRE (ARG [v], MEM [ ], EXT [ ])

POST (RET Null, ARG [f v], MEM [ ], EXT [ ])

102

Function speciﬁcation of abstract method apply in RegisterAction:

PATH p, MOD ∗ [ ]

WITH (v : A) (H

: P v),

PRE (ARG [r v], MEM [ ], EXT [ ])

POST (RET Null, ARG [r (f v), g v], MEM [ ], EXT [ ]).

Function speciﬁcation of method execute in RegisterAction:

PATH p, MOD ∗ [p

reg

]

WITH (reg : list Val) (i : Z) (v : A) (H

: P v) (H

′

: reg[i] = r v),

PRE (ARG [P4Bit w i], MEM [ ], EXT [p

reg

7→ (ObjRegister reg)])

POST (RET (g v), ARG [ ], MEM [ ],

EXT [p

reg

7→ (ObjRegister reg[i := r (f v)])]).

Figure 5.6: Function speciﬁcations in RegisterAction

However, this form is not convenient to apply in actual veriﬁcation tasks. Be-

cause Val contains all kinds of values, including signed and unsigned integers, structs,

headers, etc., the model function f needs to be deﬁned for all the cases, and the func-

tion speciﬁcation needs to be proved for all the cases. But in fact, these parameters

are not only typed, but sometimes also have more restrictive constraints from the

program design. In order to characterize and utilize those constraints, we allow the

parameters to be represented by a custom type A instead of Val. We use a predicate

P : A → Prop to characterize valid inputs, and use r : A → Val to describe the

representation of A as a value. This approach gives more convenience and ﬂexibility

for treating abstract methods.

For example, consider the abstract method in the register action mentioned in Sec-

tion 2.2.2. Figure 5.6 ﬁrst shows the general form of function speciﬁcation of abstract

method apply, where functions r, f and predicate P are as above, and function g re-

lates the input with the out parameter rv for the return value of execute. Our library

provides a general speciﬁcation of execute parametrized by the apply speciﬁcation.

Once the user has proved the abstract method satisﬁes a speciﬁcation in this form,

the speciﬁcation of function execute provided in our library can be specialized for the

instance. The specialized form is given in the second part of Figure 5.6, where the

103

parameters A, r, P, f, g are from the apply method of the instance. The specialization

procedure is supported by the fully automatic tactic build execute body, so the user

does not have to handle the details.

Abstract methods are often simple. For abstract methods that do not have con-

straints P on their parameters and do not have branches in the function body, we

have implemented an automatic tactic to infer and prove its function speciﬁcation.

For example, a simple abstract method used in SBF is

void apply(inout bit<32> val, out bit<32> rv) {

rv = val;

val = (val + 32w1);

}

By automatic inference, we get

A = Z P (a) = true r(a) = P4Bit 32 a

f(a) = a + 1 g(a) = P4Bit 32 a.

5.3.4 Abstract functional model

The concrete functional model is coarse and close to the P4 implementation, therefore

it is not convenient for reasoning about its own properties, such as the axioms shown

in Figure 5.4.

Record SBFilter core : Type := {

switch

: Z; t

last

: Z; n

clear

: Z; p : list (list H)

Deﬁnition SBFilter = option SBFilter core.

From a mathematician’s perspective, the sliding-window Bloom ﬁlter should be

rather deﬁned as above. The elements in each pane should not be represented by

hash tables. Instead, as a mathematical model, we should use the set of elements

104

to represent a pane. And an SBF should not be represented by physical panes used

periodically, but by a list of working panes sorted from the oldest to the newest

(denoted by p). For the clearing pane, we only need to count the number of slots

that have been cleared (denoted by n

clear

). To keep track of time, we use a timestamp

of unlimited integer instead of 64-bit integer, and only keep track of two things: the

time when we will switch to the next pane (t

switch

), which is also the upper bound of

the current query window, and the last timestamp we have seen (t

last

). Also, the SBF

might fall into a corrupted state if the clear duty is not properly completed, or if the

packets are not dense enough for the timer to work. So we deﬁne SBF as an option

type, and use None to represent the corrupted state. In summary, we deﬁne the

representation of an SBF as SBFilter as above, where H denotes the type of elements.

Deﬁnition SBFrefresh (f : SBFilter) (t : Z) : SBFilter :=

match f with

| Some (t

switch

, t

last

, n

clear

, p) ⇒

assert(t ∈ [t

last

, t

last

+ t

tick

]);

if (t > t

switch

) then

assert(n

clear

≤ num slots);

let p

′

:= p[1..(num panes − 1)] ++ [[]] in

Some ((t

switch

+ t

pane

), t

last

, 0, p

′

)

else

Some f

| None ⇒ None

end.

Deﬁnition SBFinsert (f : SBFilter) (t, h) : SBFilter :=

match SBFrefresh f t with

| Some (t

switch

, t

last

, n

clear

, p) ⇒

′

:= p.[num panes − 2 := p.[num panes − 2] ++ [h]]

Some (t

switch

, t, n

clear

+ 1, p

′

)

| None ⇒ None

end.

Then we can deﬁne function SBFinsert that inserts an element to the SBF, which

is used in the SBF axioms in Figure 5.4. We ﬁrst deﬁne function SBFrefresh that

refreshes the SBF up to time t, which includes two parts. The ﬁrst part is checking

105

the time interval between the last and the current timestamps such that tracking

time by tick is accurate. The second part is, when t > t

switch

, removing the ﬁrst

pane (i.e. the oldest pane) from p and inserting an empty pane (corresponding to

the fully-cleared pane) at the end, to form p

′

. In SBFrefresh, we use assert to denote

checking a property and returning None if it fails.

In function SBFinsert, after using SBFrefresh to refresh the SBF up to date, we

insert element h into the latest pane, update t

last

with t, and increment n

clear

. The

other operations SBFquery and SBFclear are deﬁned similarly using SBFrefresh. The

abstract operation SBFquery only returns the query result without modifying the SBF

in any ways including serving the clear duty.

5.3.5 Reﬁnement proof

In order to prove that the concrete functional model reﬁnes the abstract functional

model, we ﬁrst deﬁne a simulation relation R between them. For a concrete SBF f

and an abstract SBF f

′

, we say f R f

′

if either f

′

is the corrupted state None, or the

following four conditions are satisﬁed:

1. The working panes of the concrete SBF simulate the working panes in p

′

in the

abstract SBF. The correspondence between concrete panes and abstract panes

is determined by i

, which is the index of the clearing pane. i

is determined

by the timer record in the concrete SBF. The panes in p

′

are compared with

, . . . , p

num panes−1

, p

, . . . , p

−1

], respectively, where p

, . . . , p

num panes−1

are

panes in the concrete SBF in physical order. The simulation relation between a

concrete pane and an abstract pane is that the hash tables in the concrete pane

is the same as the result of inserting the elements in the corresponding abstract

pane.

106

2. The clearing pane in the concrete SBF is properly cleared with respect to n

clear

in the abstract SBF. That includes two cases: If n

clear

≥ num slots, i.e. the

clearing pane is fully cleared, then all slots of the concrete clearing pane are

0; If n

clear

< num slots, then the n

clear

slots before the next slot to clear in the

concrete are 0, counting from the end if reaching the beginning of the hash

table.

3. The timer of the concrete SBF simulates the timer of the abstract SBF, which

contains the following two conditions:

(a) The time to switch panes in the abstract SBF, t

switch

, is aligned with a tick-

tock in the concrete SBF. This is because the concrete SBF keeps track of

time by examining a particular bit. Each time the bit ﬂips is called a tick,

and a cycle of ﬂipping twice is called a tick-tock. The concrete SBF always

switches panes at the end of a tick-tock, and the abstract SBF must align

with it. Mathematically, that is (2t

tick

) divides t

switch

(b) The time interval between the last timestamp t

last

and the time to switch

panes t

switch

matches with the concrete SBF. The concrete SBF tracks

time by counting the number of ticks with modulus, and switches panes

when the number reaches certain points. Let n be the number of ticks

from now till pane switching in the concrete SBF. For example, n = 1

if the current tick is the last one before pane switching. If t

last

is at the

beginning of the current tick, we have t

switch

− t

last

= n · t

tick

. Since t

last

can be anywhere in the current tick, the condition needs to be satisﬁed is

switch

− t

last

∈



(n − 1) · t

tick

, n · t

tick



4. The numbers used in the concrete SBF are in their designated range, respec-

tively: ﬁl clear index must fall in [0, num slots). The ﬁrst element of ﬁl timer

must fall in [0, cycle tick tocks). The second element of ﬁl timer must be 0 or 1.

107

PATH p

MOD ∗ [p]

WITH (h : H) (t : Z) (f : SBFilter),

PRE (ARG [h; P4Bit 8 INSERT; P4Bit 48 t; P4Bit 8 1], MEM [ ],

EXT [SBFilter repr p f])

POST (RET Null, ARG [P4Bit 8 1], MEM [ ],

EXT [SBFﬁlter repr p (ﬁlter insert f (t, h))])

Figure 5.7: Function Speciﬁcations relating the abstract functional model

This simulation relation allows us to prove that the three operations (insert, query,

and clear) preserve the simulation relation. For example, the preservation of the insert

operation is as follows.

Lemma 5.1. For any concrete SBF f , abstract SBF f

′

, timestamp t, and element

h, if f R f

′

, then inserting (t, h) preserves the simulation relation, i.e.



ﬁlter insert f (⌊t/t

tick

⌋ mod 2) (hash(h))





SBFinsert f

′

(t, h)



The lemmas for the other two operations, query and clear, are deﬁned similarly.

We proved these three lemmas in the Coq proof assistant, using normal interactive

proof techniques and an automatic solver reasoning about arrays [44].

We utilize this abstraction in the function speciﬁcations of SBFilter to encapsulate

SBF’s implementation details. As mentioned in Section 3.1, we deﬁne hierarchical

predicate

SBFilter repr(p, f

′

) := ∃f. ﬁlter repr(p, f) ∧ f R f

′

to relate the abstract model with P4 state. Then we can deﬁne function speciﬁcation

relating P4 program state to the abstract functional model. For example, Figure 5.7

shows the function speciﬁcation in the case that the operation is INSERT. This speciﬁ-

cation hides all the implementation details of the SBF. As long as the implementation

satisﬁes the function speciﬁcation and the SBF axioms, we can prove the client has

108

desire property. We prove the abstract function speciﬁcations through the concrete

speciﬁcations and the reﬁnement lemma (Lemma 5.1).

Finally, we deﬁne the operator ok until and prove the axioms of SBF (Figure 5.4).

For an abstract SBF f and timestamp t, the predicate ok until indicates whether f is

fresh enough if the next operation is at time t. The predicate ok until is deﬁned as, f

and t satisﬁes ok until if the following conditions hold:

1. The adjacent operations are close enough, i.e. 0 ≤ t − t

last

≤ T/C.

2. The SBF is well-formed. This includes (1) t

last

< t

switch

; (2) |p| = num panes−1;

(3) If the time interval between each pair of consecutive operations afterwards

is not more than (T/C), including the interval between the next operation and

last

, then the clear duty will be fulﬁlled at or before t

switch

. We write it as the

formula

⌊t

switch

− 1 − t

switch

/(T/C)⌋ + n

clear

>= num slots.

We proved that the abstract functional model satisﬁes the SBF axioms in Coq.

5.4 Veriﬁcation of stateful ﬁrewall

In the previous section, we veriﬁed the implementation of a sliding-window Bloom

ﬁlter. In this section, we show how we verify the stateful ﬁrewall implemented as

a client of the sliding-window Bloom ﬁlter, to ensure it satisﬁes the desired ﬂow

property.

Recall that the task of this stateful ﬁrewall is to ﬁlter and block unsolicited traﬃc

from the external network into the internal network, in order to prevent DDoS attack

against internal hosts, whose capacity is far less than the switch. We want to show

109

the “no-false-negative” property: the stateful ﬁrewall allows each packet that is the

response of a recent outgoing packet. We express this property as

p.dir = in ∧ h[i].dir = out ∧ h[i].dst = p.src

∧ h[i].src = p.dst ∧ p.t − h[i].t ≤ T

=⇒ r = forward, (5.1)

where h is the history as a list of packets, p is the current packet, and r is the reaction

of the switch for the current packet.

5.4.1 Verifying the P4 program of stateful ﬁrewall

The stateful ﬁrewall is implemented as a control block, based on the sliding-window

Bloom ﬁlter, which we have veriﬁed in Section 5.3. The control block reads the

packet’s headers parsed by the parser. It determines whether the packet is outgoing

(from internal to external) or incoming (from external to internal), by testing whether

the source IP address has the preﬁx of the internal network. For an outgoing packet,

it inserts the pair (p.dst, p.src) into the SBF. For an incoming packet, it queries where

the pair (p.src, p.dst) is in the SBF, and it forwards the packet only if the SBF reports

“positive”. Otherwise, it drops the packet.

Similar to Section 5.3, we build a functional model of this control block, based on

the abstract functional model of SBF. Then we prove the control block satisﬁes the

functional model using our Veriﬁable P4 framework. This program is much simpler

than the SBF, and so is the proof.

5.4.2 Switch model

The stateful ﬁrewall runs on a switch of the Toﬁno architecture, which also takes part

in the program’s behavior. So we need a model of the switch’s behavior to reason

about the program. In Section 2.5, we discussed architecture speciﬁcation.

110

In this application, instead of adopting our full model of Toﬁno, we adopt a

single-pipeline idealized model for a that axiomatizes the behavior of other pro-

grammable components, so we can focus on the SBF and stateful ﬁrewall. The other

programmable components, the parser and the deparser, have very diﬀerent nature

than the SBF and stateful ﬁrewall. The main complexity of the SBF and stateful ﬁre-

wall is the persistent state, while the parser and the deparser are stateless and their

complexity is about packet structure. We leave veriﬁcation of parsers and deparsers

as a separate future problem. For example, Leapfrog [19] is a veriﬁer for P4 parsers

that we could perhaps integrate with Veriﬁable P4.

In this idealized switch model, we only consider the IP header of each packet, and

treat the rest as payload. In an IP header, we only care about the destination and

source addresses. So we represent a packet as ((dst, src), payload). The input packet

also comes with a timestamp t. The result is represented as either None to indicate

the packet is dropped or Some ((dst, src), payload) to indicate the forwarded packet.

Let Γ be the global environment of the P4 program, p be the path of the control

instance, and func be the function body of the control’s apply block. The behavior of

the switch is shown in Figure 5.8. The judgment ⇓

packet

describes the processing of a

single packet. The judgment ⇓

ﬂow

describes the processing of a ﬂow, i.e. a sequence

of packets. For simplicity, we use encode to denote encoding the timestamp and the

header into a list of P4 values as arguments. And decode denotes decoding the drop

ﬂag and the header from the arguments.

5.4.3 Flow property proof

Given the execution relation of a ﬂow packets, the ﬂow property that we want to

prove is described as the following theorem.

Theorem 5.1. Let the initial state s

extern

be empty. For any historical packets h,

current packet p, assume we have the following conditions:

111

Γ, p, ([], s

extern

) ⊢ (func, encode(t, (dst, src))) ⇓ ((s

′

local

, s

′

extern

), v

out

, return Null)

decode(v

out

).drop = true

extern

, (t, (dst, src), payload)) ⇓

packet

′

extern

, None)

Γ, p, ([], s

extern

) ⊢ (func, encode(t, (dst, src))) ⇓ ((s

′

local

, s

′

extern

), v

out

, return Null)

decode(v

out

).drop = false

extern

, (t, (dst, src), payload)) ⇓

packet

′

extern

, Some (decode(v

out

).hdr, payload))

extern

, []) ⇓

ﬂow

extern

, [])

extern

, p

) ⇓

packet

′

extern

, r

) (s

′

extern

, p) ⇓

ﬂow

′′

extern

, r)

extern

, p

:: p) ⇓

ﬂow

′′

extern

, r

:: r)

Figure 5.8: Switch model

1. Let r be the ﬂow processing result of the current packet p, i.e.

extern

, h + +[p]) ⇓

ﬂow

′

extern

, r

′

+ +[r]).

2. The ﬂow is dense. That means for every i such that 0 ≤ i < |h|,

0 ≤ h[i + 1].t − h[i].t ≤ T /C,

where we consider h[|h|] = p.

Then we have (5.1).

We proved this theorem in Coq, based the proof of the SBF program, the proof

of SBF axioms, ﬁrewall program proof, and the switch model. The core of this part

of proof is a mathematical proof separate from the P4 program. Lennart Beringer,

who was expert in Coq but not very knowledgeable about P4 or Veriﬁable P4 at the

time, completed this mathematical proof in two days. [45]

112

Chapter 6

Conclusion

The foundation and correctness of programs have long been areas of concern. For

the P4 programming language, we built Veriﬁable P4, which contains the following

components. We improved the previous Petr4 operational semantics by introduc-

ing the instantiation phase separate from the execution phase. We mechanized the

operational semantics in the Coq proof assistant, guided by the P4

Speciﬁcation

(Chapter 2). We deﬁned hierarchical predicates for extern objects, and built a Hoare

logic for P4, which we proved sound with respect to the operational semantics (Chap-

ter 3). We built a tactic system using Coq’s Ltac language to facilitate the veriﬁcation

process (Chapter 4).

To evaluate the utility of Veriﬁable P4, we applied Veriﬁable P4 on a sophisti-

cated stateful program. The program that we veriﬁed is a stateful ﬁrewall purely

implemented on the data plane. It maintains its internal state using a sliding-window

Bloom ﬁlter, which is a sketching data structure. Using Veriﬁable P4, we proved its

functional property about processing a ﬂow of packets. The proof is fully checked by

Coq and modularly reusable.

In conclusion, we have built a mechanized operational semantics and veriﬁcation

system for foundationally and modularly verifying stateful P4 programs. On one

113

hand, we hope this work introduces a new veriﬁcation methodology to the P4 com-

munity. One the other hand, we anticipate this work will bring more applications for

the software veriﬁcation community.

6.1 Future work

There are several directions for further research in this area.

First, it is beneﬁcial to support more P4 features and more automatic veriﬁcation.

Currently, Veriﬁable P4 only weakly supports features such as parsers and tables

with control plane entries. So veriﬁcation involving those features requires signiﬁcant

manual manipulation. In order to enhance automation, the approaches to explore

include migrating the techniques from other P4 veriﬁers, and looking for new methods

for stateful programs.

Second, we plan to achieve modular veriﬁcation of instantiation. A single class in

P4 may be instantiated to multiple instances. Also, the same class can be instantiated

diﬀerently in diﬀerent P4 programs. In the semantics and the program logic, we

designed the instantiation phase to characterize this behavior, and allowed the path

of the instance to be a variable in a function speciﬁcation. But in the tactic-based

veriﬁer, we still need to support such variables in the veriﬁcation procedure, so that

each class only needs to be veriﬁed once and applied to all the instances.

Third, we are interested in modeling and verifying the behavior of the whole

switch, not only its P4 match-action pipeline. Since P4 programs only deﬁne the

programmable components, the rest of the switch is deﬁned by the switch model.

In a real switch, such as Toﬁno [23], the full switch model is rather complicated,

with features such as packet recirculation and resubmission, traﬃc manager, packet

generator, and digest to the controller. We plan to build a language to deﬁne and

verify the behavior of the switch.

114

Fourth, it is worth considering programming tools on a level higher than P4. Lucid

[39] and CatQL [35] are high-level tools that generate P4 programs. We plan to deﬁne

operational semantics of the languages in these tools, verify the code generators or

secure them by compilation validation, and build veriﬁcation tools for these high-level

languages.

Fifth, we would like to explore more applications of our veriﬁcation system. On

one hand, that will produce more secured data plane programs. On the other hand,

that will further evaluate the veriﬁcation system and lead to directions of future

improvements.

115

Appendix A

Programs Omitted in the Main

Text

A.1 P4 implementation of sliding-window Bloom

ﬁlter and stateful ﬁrewall

This is the complete version of the program that was excerpted in Section 5.1.1.

1 #deﬁne NOOP 0

2 #deﬁne CLEAR 1

3 #deﬁne INSERT 2

4 #deﬁne QUERY 3

5 #deﬁne INSQUERY 4

6 #deﬁne UPDATE 5

7 #deﬁne UPDQUERY 6

8 #deﬁne DONTCARE 0

9 #deﬁne QDEFAULT 0

11 #include <core.p4>

12 #include <tna.p4>

13 #include "common/headers.p4"

14 #include "common/util.p4"

116

17 typedef bit<8> api t;

19 typedef bit<16> window t;

21 typedef bit<4> pred t;

23 typedef bit<18> bf2 index t;

25 typedef bit<8> bf2 value t;

27 typedef bit<64> bf2 key t;

29 struct bf2 win md t {

30 api t api;

31 bf2 index t index 1;

32 bf2 index t index 2;

33 bf2 index t index 3;

34 bf2 value t rw 1;

35 bf2 value t rw 2;

36 bf2 value t rw 3;

37 }

39 struct bf2 ds md t {

40 window t clear window;

41 bf2 index t clear index 1;

42 bf2 index t hash index 1;

43 bf2 index t hash index 2;

44 bf2 index t hash index 3;

45 bf2 win md t win 1;

46 bf2 win md t win 2;

47 bf2 win md t win 3;

48 bf2 win md t win 4;

49 }

51 struct metadata t {

52 bf2 key t bf2 key;

53 api t bf2 api;

54 bit<8> solicited;

55 }

57 struct window pair t {

58 window t lo;

59 window t hi;

60 }

117

62 parser EtherIPTCPUDPParser(packet in pkt, out header t hdr) {

63 state start {

64 transition parse ethernet;

65 }

66 state parse ethernet {

67 pkt.extract(hdr.ethernet);

68 transition select(hdr.ethernet.ether type) {

69 ETHERTYPE IPV4 : parse ipv4;

70 : reject;

71 }

72 }

73 state parse ipv4 {

74 pkt.extract(hdr.ipv4);

75 transition select(hdr.ipv4.protocol) {

76 IP PROTOCOLS TCP : parse tcp;

77 IP PROTOCOLS UDP : parse udp;

78 : accept;

79 }

80 }

81 state parse tcp {

82 pkt.extract(hdr.tcp);

83 transition accept;

84 }

85 state parse udp {

86 pkt.extract(hdr.udp);

87 transition accept;

88 }

89 }

91 parser SwitchIngressParser(packet in pkt,

92 out header t hdr,

93 out metadata t ig md,

94 out ingress intrinsic metadata t ig intr md) {

95 ToﬁnoIngressParser() toﬁno parser;

96 EtherIPTCPUDPParser() layer4 parser;

97 state start {

98 toﬁno parser.apply(pkt, ig intr md);

99 layer4 parser.apply(pkt, hdr);

100 transition accept;

101 }

102 }

103

104 control Bf2BloomFilterRow(in api t api,

105 in bf2 index t index,

106 out bf2 value t rw) {

118

107 Register<bf2 value t, bf2 index t>(32w262144, 8w0) reg row;

108 RegisterAction<bf2 value t, bf2 index t, bf2 value t>(reg row) regact insert = {

109 void apply(inout bf2 value t value, out bf2 value t rv) {

110 value = 8w1;

111 rv = 8w1;

112 }

113 };

114 action act insert() {

115 rw = regact insert.execute(index);

116 }

117 RegisterAction<bf2 value t, bf2 index t, bf2 value t>(reg row) regact query = {

118 void apply(inout bf2 value t value, out bf2 value t rv) {

119 rv = value;

120 }

121 };

122 action act query() {

123 rw = regact query.execute(index);

124 }

125 RegisterAction<bf2 value t, bf2 index t, bf2 value t>(reg row) regact clear = {

126 void apply(inout bf2 value t value, out bf2 value t rv) {

127 value = 8w0;

128 rv = 8w0;

129 }

130 };

131 action act clear() {

132 rw = regact clear.execute(index);

133 }

134 table tbl bloom {

135 key = {

136 api : ternary;

137 }

138 actions = {

139 act insert();

140 act query();

141 act clear();

142 .NoAction();

143 }

144 const entries = {

145 INSERT : act insert();

146 QUERY : act query();

147 CLEAR : act clear();

148 : .NoAction();

149 }

150 default action = .NoAction();

151 size = 4;

119

152 }

153 apply {

154 tbl bloom.apply();

155 }

156 }

157

158 control Bf2BloomFilterWin(inout bf2 win md t win md) {

159 Bf2BloomFilterRow() row 1;

160 Bf2BloomFilterRow() row 2;

161 Bf2BloomFilterRow() row 3;

162 apply {

163 row 1.apply(win md.api, win md.index 1, win md.rw 1);

164 row 2.apply(win md.api, win md.index 2, win md.rw 2);

165 row 3.apply(win md.api, win md.index 3, win md.rw 3);

166 }

167 }

168

169 control Bf2BloomFilter(in bf2 key t ds key,

170 in api t api,

171 in bit<48> ingress mac tstamp,

172 inout bf2 value t query res) {

173 bf2 ds md t ds md;

174 CRCPolynomial<bit<32>>(32w79764919, true, false, false, 32w0, 32

w4294967295) poly idx 1;

175 Hash<bit<32>>(HashAlgorithm t.CUSTOM, poly idx 1) hash idx 1;

176 action act hash index 1() {

177 ds md.hash index 1 = hash idx 1.get(ds key)[17:0];

178 }

179 table tbl hash index 1 {

180 actions = {

181 act hash index 1();

182 }

183 default action = act hash index 1();

184 size = 1;

185 }

186 CRCPolynomial<bit<32>>(32w517762881, true, false, false, 32w0, 32

w4294967295) poly idx 2;

187 Hash<bit<32>>(HashAlgorithm t.CUSTOM, poly idx 2) hash idx 2;

188 action act hash index 2() {

189 ds md.hash index 2 = hash idx 2.get(ds key)[17:0];

190 }

191 table tbl hash index 2 {

192 actions = {

193 act hash index 2();

194 }

120

195 default action = act hash index 2();

196 size = 1;

197 }

198 CRCPolynomial<bit<32>>(32w2821953579, true, false, false, 32w0, 32

w4294967295) poly idx 3;

199 Hash<bit<32>>(HashAlgorithm t.CUSTOM, poly idx 3) hash idx 3;

200 action act hash index 3() {

201 ds md.hash index 3 = hash idx 3.get(ds key)[17:0];

202 }

203 table tbl hash index 3 {

204 actions = {

205 act hash index 3();

206 }

207 default action = act hash index 3();

208 size = 1;

209 }

210 Register<bit<32>, bit<1>>(32w1, 32w0) reg clear index;

211 RegisterAction<bit<32>, bit<1>, bit<32>>(reg clear index) regact clear index

= {

212 void apply(inout bit<32> val, out bit<32> rv) {

213 rv = val;

214 val = (val + 32w1);

215 }

216 };

217 action act clear index() {

218 ds md.clear index 1 = regact clear index.execute(1w0)[17:0];

219 }

220 table tbl clear index {

221 actions = {

222 act clear index();

223 }

224 default action = act clear index();

225 size = 1;

226 }

227 Register<window pair t, bit<1>>(32w1, {16w0, 16w0}) reg clear window;

228 RegisterAction<window pair t, bit<1>, window t>(reg clear window)

regact clear window signal 0 = {

229 void apply(inout window pair t val, out window t rv) {

230 bool ﬂip = (val.lo != 16w0);

231 bool wrap = (val.hi == 16w28135);

232 if (ﬂip)

233 {

234 if (wrap)

235 {

236 val.lo = 16w0;

121

237 val.hi = 16w0;

238 }

239 else

240 {

241 val.lo = 16w0;

242 val.hi = (val.hi + 16w1);

243 }

244 }

245 else

246 {

247 val.lo = val.lo;

248 val.hi = val.hi;

249 }

250 rv = val.hi;

251 }

252 };

253 RegisterAction<window pair t, bit<1>, window t>(reg clear window)

regact clear window signal 1 = {

254 void apply(inout window pair t val, out window t rv) {

255 if ((val.lo != 16w1))

256 {

257 val.lo = 16w1;

258 }

259 rv = val.hi;

260 }

261 };

262 action act clear window signal 0() {

263 ds md.clear window = regact clear window signal 0.execute(1w0);

264 }

265 action act clear window signal 1() {

266 ds md.clear window = regact clear window signal 1.execute(1w0);

267 }

268 table tbl clear window {

269 key = {

270 ingress mac tstamp : ternary;

271 }

272 actions = {

273 act clear window signal 0();

274 act clear window signal 1();

275 }

276 const entries = {

277 48w0 &&& 48w2097152 : act clear window signal 0();

278 : act clear window signal 1();

279 }

280 default action = act clear window signal 1();

122

281 size = 2;

282 }

283 action act set clear win 1(bit<8> api 1,

284 bit<8> api 2,

285 bit<8> api 3,

286 bit<8> api 4) {

287 ds md.win 1.index 1 = ds md.clear index 1;

288 ds md.win 1.index 2 = ds md.clear index 1;

289 ds md.win 1.index 3 = ds md.clear index 1;

290 ds md.win 2.index 1 = ds md.hash index 1;

291 ds md.win 2.index 2 = ds md.hash index 2;

292 ds md.win 2.index 3 = ds md.hash index 3;

293 ds md.win 3.index 1 = ds md.hash index 1;

294 ds md.win 3.index 2 = ds md.hash index 2;

295 ds md.win 3.index 3 = ds md.hash index 3;

296 ds md.win 4.index 1 = ds md.hash index 1;

297 ds md.win 4.index 2 = ds md.hash index 2;

298 ds md.win 4.index 3 = ds md.hash index 3;

299 ds md.win 1.api = api 1;

300 ds md.win 2.api = api 2;

301 ds md.win 3.api = api 3;

302 ds md.win 4.api = api 4;

303 }

304 action act set clear win 2(bit<8> api 1,

305 bit<8> api 2,

306 bit<8> api 3,

307 bit<8> api 4) {

308 ds md.win 1.index 1 = ds md.hash index 1;

309 ds md.win 1.index 2 = ds md.hash index 2;

310 ds md.win 1.index 3 = ds md.hash index 3;

311 ds md.win 2.index 1 = ds md.clear index 1;

312 ds md.win 2.index 2 = ds md.clear index 1;

313 ds md.win 2.index 3 = ds md.clear index 1;

314 ds md.win 3.index 1 = ds md.hash index 1;

315 ds md.win 3.index 2 = ds md.hash index 2;

316 ds md.win 3.index 3 = ds md.hash index 3;

317 ds md.win 4.index 1 = ds md.hash index 1;

318 ds md.win 4.index 2 = ds md.hash index 2;

319 ds md.win 4.index 3 = ds md.hash index 3;

320 ds md.win 1.api = api 1;

321 ds md.win 2.api = api 2;

322 ds md.win 3.api = api 3;

323 ds md.win 4.api = api 4;

324 }

325 action act set clear win 3(bit<8> api 1,

123

326 bit<8> api 2,

327 bit<8> api 3,

328 bit<8> api 4) {

329 ds md.win 1.index 1 = ds md.hash index 1;

330 ds md.win 1.index 2 = ds md.hash index 2;

331 ds md.win 1.index 3 = ds md.hash index 3;

332 ds md.win 2.index 1 = ds md.hash index 1;

333 ds md.win 2.index 2 = ds md.hash index 2;

334 ds md.win 2.index 3 = ds md.hash index 3;

335 ds md.win 3.index 1 = ds md.clear index 1;

336 ds md.win 3.index 2 = ds md.clear index 1;

337 ds md.win 3.index 3 = ds md.clear index 1;

338 ds md.win 4.index 1 = ds md.hash index 1;

339 ds md.win 4.index 2 = ds md.hash index 2;

340 ds md.win 4.index 3 = ds md.hash index 3;

341 ds md.win 1.api = api 1;

342 ds md.win 2.api = api 2;

343 ds md.win 3.api = api 3;

344 ds md.win 4.api = api 4;

345 }

346 action act set clear win 4(bit<8> api 1,

347 bit<8> api 2,

348 bit<8> api 3,

349 bit<8> api 4) {

350 ds md.win 1.index 1 = ds md.hash index 1;

351 ds md.win 1.index 2 = ds md.hash index 2;

352 ds md.win 1.index 3 = ds md.hash index 3;

353 ds md.win 2.index 1 = ds md.hash index 1;

354 ds md.win 2.index 2 = ds md.hash index 2;

355 ds md.win 2.index 3 = ds md.hash index 3;

356 ds md.win 3.index 1 = ds md.hash index 1;

357 ds md.win 3.index 2 = ds md.hash index 2;

358 ds md.win 3.index 3 = ds md.hash index 3;

359 ds md.win 4.index 1 = ds md.clear index 1;

360 ds md.win 4.index 2 = ds md.clear index 1;

361 ds md.win 4.index 3 = ds md.clear index 1;

362 ds md.win 1.api = api 1;

363 ds md.win 2.api = api 2;

364 ds md.win 3.api = api 3;

365 ds md.win 4.api = api 4;

366 }

367 table tbl set win {

368 key = {

369 api : ternary;

370 ds md.clear window : range;

124

371 }

372 actions = {

373 act set clear win 1();

374 act set clear win 2();

375 act set clear win 3();

376 act set clear win 4();

377 .NoAction();

378 }

379 const entries = {

380 (INSERT, 16w0 .. 16w7033) : act set clear win 1(CLEAR,

381 NOOP,

382 NOOP,

383 INSERT);

384 (INSERT, 16w7034 .. 16w14067) : act set clear win 2(INSERT,

385 CLEAR,

386 NOOP,

387 NOOP);

388 (INSERT, 16w14068 .. 16w21101) : act set clear win 3(NOOP,

389 INSERT,

390 CLEAR,

391 NOOP);

392 (INSERT, 16w21102 .. 16w28135) : act set clear win 4(NOOP,

393 NOOP,

394 INSERT,

395 CLEAR);

396 (QUERY, 16w0 .. 16w7033) : act set clear win 1(CLEAR,

397 QUERY,

398 QUERY,

399 QUERY);

400 (QUERY, 16w7034 .. 16w14067) : act set clear win 2(QUERY,

401 CLEAR,

402 QUERY,

403 QUERY);

404 (QUERY, 16w14068 .. 16w21101) : act set clear win 3(QUERY,

405 QUERY,

406 CLEAR,

407 QUERY);

408 (QUERY, 16w21102 .. 16w28135) : act set clear win 4(QUERY,

409 QUERY,

410 QUERY,

411 CLEAR);

412 (CLEAR, 16w0 .. 16w7033) : act set clear win 1(CLEAR,

413 NOOP,

414 NOOP,

415 NOOP);

125

416 (CLEAR, 16w7034 .. 16w14067) : act set clear win 2(NOOP,

417 CLEAR,

418 NOOP,

419 NOOP);

420 (CLEAR, 16w14068 .. 16w21101) : act set clear win 3(NOOP,

421 NOOP,

422 CLEAR,

423 NOOP);

424 (CLEAR, 16w21102 .. 16w28135) : act set clear win 4(NOOP,

425 NOOP,

426 NOOP,

427 CLEAR);

428 ( , ) : .NoAction();

429 }

430 default action = .NoAction();

431 size = 13;

432 }

433 Bf2BloomFilterWin() win 1;

434 Bf2BloomFilterWin() win 2;

435 Bf2BloomFilterWin() win 3;

436 Bf2BloomFilterWin() win 4;

437 action act merge wins() {

438 query res = 8w1;

439 }

440 action act merge default() {

441 query res = 8w0;

442 }

443 table tbl merge wins {

444 key = {

445 api : ternary;

446 ds md.win 1.rw 1 : ternary;

447 ds md.win 1.rw 2 : ternary;

448 ds md.win 1.rw 3 : ternary;

449 ds md.win 2.rw 1 : ternary;

450 ds md.win 2.rw 2 : ternary;

451 ds md.win 2.rw 3 : ternary;

452 ds md.win 3.rw 1 : ternary;

453 ds md.win 3.rw 2 : ternary;

454 ds md.win 3.rw 3 : ternary;

455 ds md.win 4.rw 1 : ternary;

456 ds md.win 4.rw 2 : ternary;

457 ds md.win 4.rw 3 : ternary;

458 }

459 actions = {

460 act merge wins();

126

461 act merge default();

462 .NoAction();

463 }

464 const entries = {

465 (QUERY, 8w1, 8w1, 8w1, , , , , , , , , ) : act merge wins();

466 (QUERY, , , , 8w1, 8w1, 8w1, , , , , , ) : act merge wins();

467 (QUERY, , , , , , , 8w1, 8w1, 8w1, , , ) : act merge wins();

468 (QUERY, , , , , , , , , , 8w1, 8w1, 8w1) : act merge wins();

469 (QUERY, , , , , , , , , , , , ) : act merge default();

470 ( , , , , , , , , , , , , ) : .NoAction();

471 }

472 default action = .NoAction();

473 size = 6;

474 }

475 apply {

476 tbl hash index 1.apply();

477 tbl hash index 2.apply();

478 tbl hash index 3.apply();

479 tbl clear index.apply();

480 tbl clear window.apply();

481 tbl set win.apply();

482 win 1.apply(ds md.win 1);

483 win 2.apply(ds md.win 2);

484 win 3.apply(ds md.win 3);

485 win 4.apply(ds md.win 4);

486 tbl merge wins.apply();

487 }

488 }

489

490 control SwitchIngress(inout header t hdr,

491 inout metadata t ig md,

492 in ingress intrinsic metadata t ig intr md,

493 in ingress intrinsic metadata from parser t ig intr prsr md,

494 inout ingress intrinsic metadata for deparser t ig intr dprsr md,

495 inout ingress intrinsic metadata for tm t ig intr tm md)

496 {

497 action act for tbl 1 action 0() {

498 ig md.solicited = 1;

499 }

500 table tbl for stmt 1 {

501 actions = {

502 act for tbl 1 action 0();

503 }

504 default action = act for tbl 1 action 0();

505 size = 1;

127

506 }

507 action bf2 act set insert key(bit<8> api) {

508 ig md.bf2 api = api;

509 ig md.bf2 key = (hdr.ipv4.dst addr ++ hdr.ipv4.src addr);

510 }

511 action bf2 act set query key(bit<8> api) {

512 ig md.bf2 api = api;

513 ig md.bf2 key = (hdr.ipv4.src addr ++ hdr.ipv4.dst addr);

514 }

515 action bf2 act set clear key() {

516 ig md.bf2 api = CLEAR;

517 }

518 table bf2 tbl set key {

519 key = {

520 hdr.ipv4.src addr : ternary;

521 }

522 actions = {

523 bf2 act set insert key();

524 bf2 act set query key();

525 bf2 act set clear key();

526 }

527 const entries = {

528 2154823680 &&& 4294901760 : bf2 act set insert key(INSERT);

529 : bf2 act set query key(QUERY);

530 }

531 default action = bf2 act set clear key();

532 size = 2;

533 }

534 Bf2BloomFilter() bf2 ds;

535 action act for tbl 3 action 0() {

536 ig intr dprsr md.drop ctl = 1;

537 }

538 action act for tbl 3 action 1() {

539 ig intr dprsr md.drop ctl = 0;

540 }

541 table tbl for stmt 3 {

542 key = {

543 ig md.solicited : ternary;

544 }

545 actions = {

546 act for tbl 3 action 0();

547 act for tbl 3 action 1();

548 }

549 const entries = {

550 0 : act for tbl 3 action 0();

128

551 : act for tbl 3 action 1();

552 }

553 default action = act for tbl 3 action 1();

554 size = 2;

555 }

556 apply {

557 tbl for stmt 1.apply();

558 bf2 tbl set key.apply();

559 bf2 ds.apply(ig md.bf2 key,

560 ig md.bf2 api,

561 ig intr md.ingress mac tstamp,

562 ig md.solicited);

563 tbl for stmt 3.apply();

564 }

565 }

566

567 control SwitchIngressDeparser(packet out pkt,

568 inout header t hdr,

569 in metadata t ig md,

570 in ingress intrinsic metadata for deparser t

ig intr dprsr md)

571 {

572 apply {

573 pkt.emit(hdr);

574 }

575 }

576

577 parser SwitchEgressParser(packet in pkt,

578 out header t hdr,

579 out metadata t eg md,

580 out egress intrinsic metadata t eg intr md) {

581 ToﬁnoEgressParser() toﬁno parser;

582 EtherIPTCPUDPParser() layer4 parser;

583 state start {

584 toﬁno parser.apply(pkt, eg intr md);

585 layer4 parser.apply(pkt, hdr);

586 transition accept;

587 }

588 }

589

590 control SwitchEgress(inout header t hdr,

591 inout metadata t eg md,

592 in egress intrinsic metadata t eg intr md,

593 in egress intrinsic metadata from parser t eg intr from prsr,

129

594 inout egress intrinsic metadata for deparser t

eg intr md for dprsr,

595 inout egress intrinsic metadata for output port t

eg intr md for oport)

596 {

597 apply { }

598 }

599

600 control SwitchEgressDeparser(packet out pkt,

601 inout header t hdr,

602 in metadata t eg md,

603 in egress intrinsic metadata for deparser t

eg intr dprsr md)

604 {

605 apply {

606 pkt.emit(hdr);

607 }

608 }

609

610 Pipeline(SwitchIngressParser(), SwitchIngress(), SwitchIngressDeparser(),

SwitchEgressParser(), SwitchEgress(), SwitchEgressDeparser()) pipe;

611

612 Switch(pipe) main;

A.2 Concrete functional model

This is the complete version of the program that was excerpted in Section 5.3.1.

1 #deﬁne NOOP 0

2 #deﬁne CLEAR 1

3 #deﬁne INSERT 2

4 #deﬁne QUERY 3

5 #deﬁne INSQUERY 4

6 #deﬁne UPDATE 5

7 #deﬁne UPDQUERY 6

8 #deﬁne DONTCARE 0

9 #deﬁne QDEFAULT 0

11 #include <core.p4>

12 #include <tna.p4>

13 #include ”common/headers.p4”

130

14 #include ”common/util.p4”

17 typedef bit<8> api t;

19 typedef bit<16> window t;

21 typedef bit<4> pred t;

23 typedef bit<18> bf2 index t;

25 typedef bit<8> bf2 value t;

27 typedef bit<64> bf2 key t;

29 struct bf2 win md t {

30 api t api;

31 bf2 index t index 1;

32 bf2 index t index 2;

33 bf2 index t index 3;

34 bf2 value t rw 1;

35 bf2 value t rw 2;

36 bf2 value t rw 3;

37 }

39 struct bf2 ds md t {

40 window t clear window;

41 bf2 index t clear index 1;

42 bf2 index t hash index 1;

43 bf2 index t hash index 2;

44 bf2 index t hash index 3;

45 bf2 win md t win 1;

46 bf2 win md t win 2;

47 bf2 win md t win 3;

48 bf2 win md t win 4;

49 }

51 struct metadata t {

52 bf2 key t bf2 key;

53 api t bf2 api;

54 bit<8> solicited;

55 }

57 struct window pair t {

58 window t lo;

131

59 window t hi;

60 }

62 parser EtherIPTCPUDPParser(packet in pkt, out header t hdr) {

63 state start {

64 transition parse ethernet;

65 }

66 state parse ethernet {

67 pkt.extract(hdr.ethernet);

68 transition select(hdr.ethernet.ether type) {

69 ETHERTYPE IPV4 : parse ipv4;

70 : reject;

71 }

72 }

73 state parse ipv4 {

74 pkt.extract(hdr.ipv4);

75 transition select(hdr.ipv4.protocol) {

76 IP PROTOCOLS TCP : parse tcp;

77 IP PROTOCOLS UDP : parse udp;

78 : accept;

79 }

80 }

81 state parse tcp {

82 pkt.extract(hdr.tcp);

83 transition accept;

84 }

85 state parse udp {

86 pkt.extract(hdr.udp);

87 transition accept;

88 }

89 }

91 parser SwitchIngressParser(packet in pkt,

92 out header t hdr,

93 out metadata t ig md,

94 out ingress intrinsic metadata t ig intr md) {

95 ToﬁnoIngressParser() toﬁno parser;

96 EtherIPTCPUDPParser() layer4 parser;

97 state start {

98 toﬁno parser.apply(pkt, ig intr md);

99 layer4 parser.apply(pkt, hdr);

100 transition accept;

101 }

102 }

103

132

104 control Bf2BloomFilterRow(in api t api,

105 in bf2 index t index,

106 out bf2 value t rw) {

107 Register<bf2 value t, bf2 index t>(32w262144, 8w0) reg row;

108 RegisterAction<bf2 value t, bf2 index t, bf2 value t>(reg row) regact insert = {

109 void apply(inout bf2 value t value, out bf2 value t rv) {

110 value = 8w1;

111 rv = 8w1;

112 }

113 };

114 action act insert() {

115 rw = regact insert.execute(index);

116 }

117 RegisterAction<bf2 value t, bf2 index t, bf2 value t>(reg row) regact query = {

118 void apply(inout bf2 value t value, out bf2 value t rv) {

119 rv = value;

120 }

121 };

122 action act query() {

123 rw = regact query.execute(index);

124 }

125 RegisterAction<bf2 value t, bf2 index t, bf2 value t>(reg row) regact clear = {

126 void apply(inout bf2 value t value, out bf2 value t rv) {

127 value = 8w0;

128 rv = 8w0;

129 }

130 };

131 action act clear() {

132 rw = regact clear.execute(index);

133 }

134 table tbl bloom {

135 key = {

136 api : ternary;

137 }

138 actions = {

139 act insert();

140 act query();

141 act clear();

142 .NoAction();

143 }

144 const entries = {

145 INSERT : act insert();

146 QUERY : act query();

147 CLEAR : act clear();

148 : .NoAction();

133

149 }

150 default action = .NoAction();

151 size = 4;

152 }

153 apply {

154 tbl bloom.apply();

155 }

156 }

157

158 control Bf2BloomFilterWin(inout bf2 win md t win md) {

159 Bf2BloomFilterRow() row 1;

160 Bf2BloomFilterRow() row 2;

161 Bf2BloomFilterRow() row 3;

162 apply {

163 row 1.apply(win md.api, win md.index 1, win md.rw 1);

164 row 2.apply(win md.api, win md.index 2, win md.rw 2);

165 row 3.apply(win md.api, win md.index 3, win md.rw 3);

166 }

167 }

168

169 control Bf2BloomFilter(in bf2

key t ds key,

170 in api t api,

171 in bit<48> ingress mac tstamp,

172 inout bf2 value t query res) {

173 bf2 ds md t ds md;

174 CRCPolynomial<bit<32>>(32w79764919, true, false, false, 32w0, 32w4294967295

) poly idx 1;

175 Hash<bit<32>>(HashAlgorithm t.CUSTOM, poly idx 1) hash idx 1;

176 action act hash index 1() {

177 ds md.hash index 1 = hash idx 1.get(ds key)[17:0];

178 }

179 table tbl hash index 1 {

180 actions = {

181 act hash index 1();

182 }

183 default action = act hash index 1();

184 size = 1;

185 }

186 CRCPolynomial<bit<32>>(32w517762881, true, false, false, 32w0, 32

w4294967295) poly idx 2;

187 Hash<bit<32>>(HashAlgorithm t.CUSTOM, poly idx 2) hash idx 2;

188 action act hash index 2() {

189 ds md.hash index 2 = hash idx 2.get(ds key)[17:0];

190 }

191 table tbl hash index 2 {

134

192 actions = {

193 act hash index 2();

194 }

195 default action = act hash index 2();

196 size = 1;

197 }

198 CRCPolynomial<bit<32>>(32w2821953579, true, false, false, 32w0, 32

w4294967295) poly idx 3;

199 Hash<bit<32>>(HashAlgorithm t.CUSTOM, poly idx 3) hash idx 3;

200 action act hash index 3() {

201 ds md.hash index 3 = hash idx 3.get(ds key)[17:0];

202 }

203 table tbl hash index 3 {

204 actions = {

205 act hash index 3();

206 }

207 default action = act hash index 3();

208 size = 1;

209 }

210 Register<bit<32>, bit<1>>(32w1, 32w0) reg clear index;

211 RegisterAction<bit<32>, bit<1>, bit<32>>(reg clear index) regact clear index =

{

212 void apply(inout bit<32> val, out bit<32> rv) {

213 rv = val;

214 val = (val + 32w1);

215 }

216 };

217 action act clear index() {

218 ds md.clear index 1 = regact clear index.execute(1w0)[17:0];

219 }

220 table tbl clear index {

221 actions = {

222 act clear index();

223 }

224 default action = act clear index();

225 size = 1;

226 }

227 Register<window pair t, bit<1>>(32w1, {16w0, 16w0}) reg clear window;

228 RegisterAction<window pair t, bit<1>, window t>(reg clear window)

regact clear window signal 0 = {

229 void apply(inout window pair t val, out window t rv) {

230 bool ﬂip = (val.lo != 16w0);

231 bool wrap = (val.hi == 16w28135);

232 if (ﬂip)

233 {

135

234 if (wrap)

235 {

236 val.lo = 16w0;

237 val.hi = 16w0;

238 }

239 else

240 {

241 val.lo = 16w0;

242 val.hi = (val.hi + 16w1);

243 }

244 }

245 else

246 {

247 val.lo = val.lo;

248 val.hi = val.hi;

249 }

250 rv = val.hi;

251 }

252 };

253 RegisterAction<window pair t, bit<1>, window t>(reg clear window)

regact clear window signal 1 = {

254 void apply(inout window pair t val, out window t rv) {

255 if ((val.lo != 16w1))

256 {

257 val.lo = 16w1;

258 }

259 rv = val.hi;

260 }

261 };

262 action act clear window signal 0() {

263 ds md.clear window = regact clear window signal 0.execute(1w0);

264 }

265 action act clear window signal 1() {

266 ds md.clear window = regact clear window signal 1.execute(1w0);

267 }

268 table tbl clear window {

269 key = {

270 ingress mac tstamp : ternary;

271 }

272 actions = {

273 act clear window signal 0();

274 act clear window signal 1();

275 }

276 const entries = {

277 48w0 &&& 48w2097152 : act clear window signal 0();

136

278 : act clear window signal 1();

279 }

280 default action = act clear window signal 1();

281 size = 2;

282 }

283 action act set clear win 1(bit<8> api 1,

284 bit<8> api 2,

285 bit<8> api 3,

286 bit<8> api 4) {

287 ds md.win 1.index 1 = ds md.clear index 1;

288 ds md.win 1.index 2 = ds md.clear index 1;

289 ds md.win 1.index 3 = ds md.clear index 1;

290 ds md.win 2.index 1 = ds md.hash index 1;

291 ds md.win 2.index 2 = ds md.hash index 2;

292 ds md.win 2.index 3 = ds md.hash index 3;

293 ds md.win 3.index 1 = ds md.hash index 1;

294 ds md.win 3.index 2 = ds md.hash index 2;

295 ds md.win 3.index 3 = ds md.hash index 3;

296 ds md.win 4.index 1 = ds md.hash index 1;

297 ds md.win 4.index 2 = ds md.hash index 2;

298 ds md.win 4.index 3 = ds md.hash index 3;

299 ds md.win 1.api = api 1;

300 ds md.win 2.api = api 2;

301 ds md.win 3.api = api 3;

302 ds md.win 4.api = api 4;

303 }

304 action act set clear win 2(bit<8> api 1,

305 bit<8> api 2,

306 bit<8> api 3,

307 bit<8> api 4) {

308 ds md.win 1.index 1 = ds md.hash index 1;

309 ds md.win 1.index 2 = ds md.hash index 2;

310 ds md.win 1.index 3 = ds md.hash index 3;

311 ds md.win 2.index 1 = ds md.clear index 1;

312 ds md.win 2.index 2 = ds md.clear index 1;

313 ds md.win 2.index 3 = ds md.clear index 1;

314 ds md.win 3.index 1 = ds md.hash index 1;

315 ds md.win 3.index 2 = ds md.hash index 2;

316 ds md.win 3.index 3 = ds md.hash index 3;

317 ds md.win 4.index 1 = ds md.hash index 1;

318 ds md.win 4.index 2 = ds md.hash index 2;

319 ds md.win 4.index 3 = ds md.hash index 3;

320 ds md.win 1.api = api 1;

321 ds md.win 2.api = api 2;

322 ds md.win 3.api = api 3;

137

323 ds md.win 4.api = api 4;

324 }

325 action act set clear win 3(bit<8> api 1,

326 bit<8> api 2,

327 bit<8> api 3,

328 bit<8> api 4) {

329 ds md.win 1.index 1 = ds md.hash index 1;

330 ds md.win 1.index 2 = ds md.hash index 2;

331 ds md.win 1.index 3 = ds md.hash index 3;

332 ds md.win 2.index 1 = ds md.hash index 1;

333 ds md.win 2.index 2 = ds md.hash index 2;

334 ds md.win 2.index 3 = ds md.hash index 3;

335 ds md.win 3.index 1 = ds md.clear index 1;

336 ds md.win 3.index 2 = ds md.clear index 1;

337 ds md.win 3.index 3 = ds md.clear index 1;

338 ds md.win 4.index 1 = ds md.hash index 1;

339 ds md.win 4.index 2 = ds md.hash index 2;

340 ds md.win 4.index 3 = ds md.hash index 3;

341 ds md.win 1.api = api 1;

342 ds md.win 2.api = api 2;

343 ds md.win 3.api = api 3;

344 ds md.win 4.api = api 4;

345 }

346 action act set clear win 4(bit<8> api 1,

347 bit<8> api 2,

348 bit<8> api 3,

349 bit<8> api 4) {

350 ds md.win 1.index 1 = ds md.hash index 1;

351 ds md.win 1.index 2 = ds md.hash index 2;

352 ds md.win 1.index 3 = ds md.hash index 3;

353 ds md.win 2.index 1 = ds md.hash index 1;

354 ds md.win 2.index 2 = ds md.hash index 2;

355 ds md.win 2.index 3 = ds md.hash index 3;

356 ds md.win 3.index 1 = ds md.hash index 1;

357 ds md.win 3.index 2 = ds md.hash index 2;

358 ds md.win 3.index 3 = ds md.hash index 3;

359 ds md.win 4.index 1 = ds md.clear index 1;

360 ds md.win 4.index 2 = ds md.clear index 1;

361 ds md.win 4.index 3 = ds md.clear index 1;

362 ds md.win 1.api = api 1;

363 ds md.win 2.api = api 2;

364 ds md.win 3.api = api 3;

365 ds md.win 4.api = api 4;

366 }

367 table tbl set win {

138

368 key = {

369 api : ternary;

370 ds md.clear window : range;

371 }

372 actions = {

373 act set clear win 1();

374 act set clear win 2();

375 act set clear win 3();

376 act set clear win 4();

377 .NoAction();

378 }

379 const entries = {

380 (INSERT, 16w0 .. 16w7033) : act set clear win 1(CLEAR,

381 NOOP,

382 NOOP,

383 INSERT);

384 (INSERT, 16w7034 .. 16w14067) : act set clear win 2(INSERT,

385 CLEAR,

386 NOOP,

387 NOOP);

388 (INSERT, 16w14068 .. 16w21101) : act

set clear win 3(NOOP,

389 INSERT,

390 CLEAR,

391 NOOP);

392 (INSERT, 16w21102 .. 16w28135) : act set clear win 4(NOOP,

393 NOOP,

394 INSERT,

395 CLEAR);

396 (QUERY, 16w0 .. 16w7033) : act set clear win 1(CLEAR,

397 QUERY,

398 QUERY,

399 QUERY);

400 (QUERY, 16w7034 .. 16w14067) : act set clear win 2(QUERY,

401 CLEAR,

402 QUERY,

403 QUERY);

404 (QUERY, 16w14068 .. 16w21101) : act set clear win 3(QUERY,

405 QUERY,

406 CLEAR,

407 QUERY);

408 (QUERY, 16w21102 .. 16w28135) : act set clear win 4(QUERY,

409 QUERY,

410 QUERY,

411 CLEAR);

412 (CLEAR, 16w0 .. 16w7033) : act set clear win 1(CLEAR,

139

413 NOOP,

414 NOOP,

415 NOOP);

416 (CLEAR, 16w7034 .. 16w14067) : act set clear win 2(NOOP,

417 CLEAR,

418 NOOP,

419 NOOP);

420 (CLEAR, 16w14068 .. 16w21101) : act set clear win 3(NOOP,

421 NOOP,

422 CLEAR,

423 NOOP);

424 (CLEAR, 16w21102 .. 16w28135) : act set clear win 4(NOOP,

425 NOOP,

426 NOOP,

427 CLEAR);

428 ( , ) : .NoAction();

429 }

430 default action = .NoAction();

431 size = 13;

432 }

433 Bf2BloomFilterWin() win 1;

434 Bf2BloomFilterWin() win 2;

435 Bf2BloomFilterWin() win 3;

436 Bf2BloomFilterWin() win 4;

437 action act merge wins() {

438 query res = 8w1;

439 }

440 action act merge default() {

441 query res = 8w0;

442 }

443 table tbl merge wins {

444 key = {

445 api : ternary;

446 ds md.win 1.rw 1 : ternary;

447 ds md.win 1.rw 2 : ternary;

448 ds md.win 1.rw 3 : ternary;

449 ds md.win 2.rw 1 : ternary;

450 ds md.win 2.rw 2 : ternary;

451 ds md.win 2.rw 3 : ternary;

452 ds md.win 3.rw 1 : ternary;

453 ds md.win 3.rw 2 : ternary;

454 ds md.win 3.rw 3 : ternary;

455 ds md.win 4.rw 1 : ternary;

456 ds md.win 4.rw 2 : ternary;

457 ds md.win 4.rw 3 : ternary;

140

458 }

459 actions = {

460 act merge wins();

461 act merge default();

462 .NoAction();

463 }

464 const entries = {

465 (QUERY, 8w1, 8w1, 8w1, , , , , , , , , ) : act merge wins();

466 (QUERY, , , , 8w1, 8w1, 8w1, , , , , , ) : act merge wins();

467 (QUERY, , , , , , , 8w1, 8w1, 8w1, , , ) : act merge wins();

468 (QUERY, , , , , , , , , , 8w1, 8w1, 8w1) : act merge wins();

469 (QUERY, , , , , , , , , , , , ) : act merge default();

470 ( , , , , , , , , , , , , ) : .NoAction();

471 }

472 default action = .NoAction();

473 size = 6;

474 }

475 apply {

476 tbl hash index 1.apply();

477 tbl hash index 2.apply();

478 tbl hash index 3.apply();

479 tbl clear index.apply();

480 tbl clear window.apply();

481 tbl set win.apply();

482 win 1.apply(ds md.win 1);

483 win 2.apply(ds md.win 2);

484 win 3.apply(ds md.win 3);

485 win 4.apply(ds md.win 4);

486 tbl merge wins.apply();

487 }

488 }

489

490 control SwitchIngress(inout header t hdr,

491 inout metadata t ig md,

492 in ingress intrinsic metadata t ig intr md,

493 in ingress intrinsic metadata from parser t ig intr prsr md,

494 inout ingress intrinsic metadata for deparser t ig intr dprsr md,

495 inout ingress intrinsic metadata for tm t ig intr tm md)

496 {

497 action act for tbl 1 action 0() {

498 ig md.solicited = 1;

499 }

500 table tbl for stmt 1 {

501 actions = {

502 act for tbl 1 action 0();

141

503 }

504 default action = act for tbl 1 action 0();

505 size = 1;

506 }

507 action bf2 act set insert key(bit<8> api) {

508 ig md.bf2 api = api;

509 ig md.bf2 key = (hdr.ipv4.dst addr ++ hdr.ipv4.src addr);

510 }

511 action bf2 act set query key(bit<8> api) {

512 ig md.bf2 api = api;

513 ig md.bf2 key = (hdr.ipv4.src addr ++ hdr.ipv4.dst addr);

514 }

515 action bf2 act set clear key() {

516 ig md.bf2 api = CLEAR;

517 }

518 table bf2 tbl set key {

519 key = {

520 hdr.ipv4.src addr : ternary;

521 }

522 actions = {

523 bf2 act set insert key();

524 bf2 act set query key();

525 bf2 act set clear key();

526 }

527 const entries = {

528 2154823680 &&& 4294901760 : bf2 act set insert key(INSERT);

529 : bf2 act set query key(QUERY);

530 }

531 default action = bf2 act set clear key();

532 size = 2;

533 }

534 Bf2BloomFilter() bf2 ds;

535 action act for tbl 3 action 0() {

536 ig intr dprsr md.drop ctl = 1;

537 }

538 action act for tbl 3 action 1() {

539 ig intr dprsr md.drop ctl = 0;

540 }

541 table tbl for stmt 3 {

542 key = {

543 ig md.solicited : ternary;

544 }

545 actions = {

546 act for tbl 3 action 0();

547 act for tbl 3 action 1();

142

548 }

549 const entries = {

550 0 : act for tbl 3 action 0();

551 : act for tbl 3 action 1();

552 }

553 default action = act for tbl 3 action 1();

554 size = 2;

555 }

556 apply {

557 tbl for stmt 1.apply();

558 bf2 tbl set key.apply();

559 bf2 ds.apply(ig md.bf2 key,

560 ig md.bf2 api,

561 ig intr md.ingress mac tstamp,

562 ig md.solicited);

563 tbl for stmt 3.apply();

564 }

565 }

566

567 control SwitchIngressDeparser(packet out pkt,

568 inout header t hdr,

569 in metadata t ig md,

570 in ingress intrinsic metadata for deparser t

ig intr dprsr md)

571 {

572 apply {

573 pkt.emit(hdr);

574 }

575 }

576

577 parser SwitchEgressParser(packet in pkt,

578 out header t hdr,

579 out metadata t eg md,

580 out egress intrinsic metadata t eg intr md) {

581 ToﬁnoEgressParser() toﬁno parser;

582 EtherIPTCPUDPParser() layer4 parser;

583 state start {

584 toﬁno parser.apply(pkt, eg intr md);

585 layer4 parser.apply(pkt, hdr);

586 transition accept;

587 }

588 }

589

590 control SwitchEgress(inout header t hdr,

591 inout metadata t eg md,

143

592 in egress intrinsic metadata t eg intr md,

593 in egress intrinsic metadata from parser t eg intr from prsr,

594 inout egress intrinsic metadata for deparser t

eg intr md for dprsr,

595 inout egress intrinsic metadata for output port t

eg intr md for oport)

596 {

597 apply { }

598 }

599

600 control SwitchEgressDeparser(packet out pkt,

601 inout header t hdr,

602 in metadata t eg md,

603 in egress intrinsic metadata for deparser t

eg intr dprsr md)

604 {

605 apply {

606 pkt.emit(hdr);

607 }

608 }

609

610 Pipeline(SwitchIngressParser(), SwitchIngress(), SwitchIngressDeparser(),

SwitchEgressParser(), SwitchEgress(), SwitchEgressDeparser()) pipe;

611

612 Switch(pipe) main;

144

Bibliography

[1] Andrew W. Appel, Robert Dockins, Aquinas Hobor, Lennart Beringer, Josiah

Dodds, Gordon Stewart, Sandrine Blazy, and Xavier Leroy. Program logics for

certiﬁed compilers. Cambridge University Press, 2014.

[2] Andrew W. Appel and David McAllester. An indexed model of recursive types

for foundational proof-carrying code. ACM Transactions on Programming Lan-

guages and Systems (TOPLAS), 23(5):657–683, 2001.

[3] Sandrine Blazy and Xavier Leroy. Mechanized semantics for the Clight subset

of the C language. Journal of Automated Reasoning, 43(3):263–288, 2009.

[4] Burton H. Bloom. Space/time trade-oﬀs in hash coding with allowable errors.

Communications of the ACM, 13(7):422–426, 1970.

[5] Pat Bosshart, Dan Daly, Glen Gibb, Martin Izzard, Nick McKeown, Jennifer

Rexford, Cole Schlesinger, Dan Talayco, Amin Vahdat, George Varghese, et al.

P4: Programming protocol-independent packet processors. ACM SIGCOMM

Computer Communication Review, 44(3):87–95, 2014.

[6] Cristian Cadar, Daniel Dunbar, and Dawson R. Engler. KLEE: unassisted and

automatic generation of high-coverage tests for complex systems programs. In

Richard Draves and Robbert van Renesse, editors, 8th USENIX Symposium

on Operating Systems Design and Implementation, OSDI 2008, December 8-

10, 2008, San Diego, California, USA, Proceedings, pages 209–224. USENIX

Association, 2008.

[7] Qinxiang Cao, Lennart Beringer, Samuel Gruetter, Josiah Dodds, and An-

drew W. Appel. VST-Floyd: A separation logic tool to verify correctness of

C programs. Journal of Automated Reasoning, 61:367–422, 2018.

[8] Henry Cejtin, Suresh Jagannathan, and Stephen Weeks. Flow-directed closure

conversion for typed languages. In Programming Languages and Systems: 9th

European Symposium on Programming, pages 56–71. Springer, 2000.

[9] Xiaoqi Chen, Shir Landau-Feibish, Mark Braverman, and Jennifer Rexford.

Beaucoup: Answering many network traﬃc queries, one memory update at a

time. In Proceedings of the Annual Conference of the ACM Special Interest Group

145

on Data Communication on the Applications, Technologies, Architectures, and

Protocols for Computer Communication, SIGCOMM ’20, page 226–239, New

York, NY, USA, 2020. Association for Computing Machinery.

[10] Adam Chlipala. Certiﬁed Programming with Dependent Types. MIT Press, 2013.

[11] Joshua M Cohen, Qinshi Wang, and Andrew W. Appel. Veriﬁed erasure correc-

tion in Coq with MathComp and VST. In Computer Aided Veriﬁcation: 34th

International Conference, CAV 2022, pages 272–292. Springer, 2022.

[12] The P4 Language Consortium. P4

language speciﬁcation, version 1.2.3. Tech-

nical report, July 2022.

[13] Patrick Cousot and Radhia Cousot. Abstract interpretation: a uniﬁed lattice

model for static analysis of programs by construction or approximation of ﬁx-

points. In Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Prin-

ciples of Programming Languages, pages 238–252, 1977.

[14] Pascal Cuoq, Florent Kirchner, Nikolai Kosmatov, Virgile Prevosto, Julien Sig-

noles, and Boris Yakobowski. Frama-C: A software analysis perspective. In Soft-

ware Engineering and Formal Methods: 10th International Conference, SEFM

2012, pages 233–247. Springer, 2012.

[15] Leonardo De Moura and Nikolaj Bjørner. Z3: An eﬃcient SMT solver. In Inter-

national conference on Tools and Algorithms for the Construction and Analysis

of Systems, pages 337–340. Springer, 2008.

[16] David Delahaye. A tactic language for the system Coq. In International Confer-

ence on Logic for Programming and Automated Reasoning (LPAR), pages 85–95.

Springer, 2000.

[17] Edsger W. Dijkstra. Guarded commands, nondeterminacy and formal derivation

of programs. Communications of the ACM, 18(8):453–457, 1975.

[18] Ryan Doenges, Mina Tahmasbi Arashloo, Santiago Bautista, Alexander Chang,

Newton Ni, Samwise Parkinson, Rudy Peterson, Alaia Solko-Breslin, Amanda

Xu, and Nate Foster. Petr4: formal foundations for P4 data planes. Proceedings

of the ACM on Programming Languages, 5(POPL), 2021.

[19] Ryan Doenges, Tobias Kapp´e, John Sarracino, Nate Foster, and Greg Mor-

risett. Leapfrog: certiﬁed equivalence for protocol parsers. In Proceedings of

the 43rd ACM SIGPLAN International Conference on Programming Language

Design and Implementation, pages 950–965, 2022.

[20] Matthias Eichholz, Eric Hayden Campbell, Matthias Krebs, Nate Foster, and

Mira Mezini. Dependently-typed data plane programming. Proceedings of the

ACM on Programming Languages, 6(POPL):1–28, 2022.

146

[21] Ronghui Gu, J´er´emie Koenig, Tahina Ramananandro, Zhong Shao, Xiongnan

Wu, Shu-Chun Weng, Haozhong Zhang, and Yu Guo. Deep speciﬁcations and

certiﬁed abstraction layers. ACM SIGPLAN Notices, 50(1):595–608, 2015.

[22] Frederik Hauser, Marco H¨aberle, Daniel Merling, Steﬀen Lindner, Vladimir

Gurevich, Florian Zeiger, Reinhard Frank, and Michael Menth. A survey on

data plane programming with P4: Fundamentals, advances, and applied research.

Journal of Network and Computer Applications, 212:103561, 2023.

[23] Intel. Intel® Toﬁno™ programmable ethernet switch ASIC.

https://www.intel.com/content/www/us/en/products/network-io/

programmable-ethernet-switch/tofino-series.html. Accessed: 2023-

01-18.

[24] Xin Jin, Xiaozhou Li, Haoyu Zhang, Robert Soul´e, Jeongkeun Lee, Nate Foster,

Changhoon Kim, and Ion Stoica. Netcache: Balancing key-value stores with fast

in-network caching. In Proceedings of the 26th Symposium on Operating Systems

Principles, SOSP ’17, page 121–136, New York, NY, USA, 2017. Association for

Computing Machinery.

[25] Thomas Kleymann. Hoare logic and VDM: Machine-checked soundness and com-

pleteness proofs. PhD thesis, University of Edinburgh, 1998.

[26] K Rustan M Leino. Dafny: An automatic program veriﬁer for functional cor-

rectness. In Logic for Programming, Artiﬁcial Intelligence, and Reasoning: 16th

International Conference, LPAR-16, pages 348–370. Springer, 2010.

[27] Yuliang Li, Rui Miao, Changhoon Kim, and Minlan Yu. Flowradar: A bet-

ter netﬂow for data centers. In Proceedings of the 13th Usenix Conference on

Networked Systems Design and Implementation, NSDI’16, page 311–324, USA,

2016. USENIX Association.

[28] Jed Liu, William Hallahan, Cole Schlesinger, Milad Sharif, Jeongkeun Lee,

Robert Soul´e, Han Wang, C˘alin Ca¸scaval, Nick McKeown, and Nate Foster.

p4v: Practical veriﬁcation for programmable data planes. In Proceedings of the

2018 Conference of the ACM Special Interest Group on data communication,

pages 490–503, 2018.

[29] David MacQueen. An implementation of standard ML modules. In Proceedings of

the 1988 ACM Conference on LISP and Functional Programming, LFP ’88, pages

212–223, New York, NY, USA, 1988. Association for Computing Machinery.

[30] Rui Miao, Hongyi Zeng, Changhoon Kim, Jeongkeun Lee, and Minlan Yu.

Silkroad: Making stateful layer-4 load balancing fast and cheap using switch-

ing asics. In Proceedings of the Conference of the ACM Special Interest Group

on Data Communication, SIGCOMM ’17, page 15–28, New York, NY, USA,

2017. Association for Computing Machinery.

147

[31] Hun Namkung, Zaoxing Liu, Daehyeok Kim, Vyas Sekar, and Peter Steenkiste.

SketchLib: Enabling eﬃcient sketch-based monitoring on programmable

switches. In 19th USENIX Symposium on Networked Systems Design and Im-

plementation (NSDI 22), pages 743–759, Renton, WA, April 2022. USENIX As-

sociation.

[32] Moni Naor and Eylon Yogev. Tight bounds for sliding bloom ﬁlters. Algorithmica,

73(4):652–672, 2015.

[33] Miguel Neves, Lucas Freire, Alberto Schaeﬀer-Filho, and Marinho Barcellos. Ver-

iﬁcation of P4 programs in feasible time using assertions. In Proceedings of the

14th International Conference on Emerging Networking EXperiments and Tech-

nologies, pages 73–85, 2018.

[34] Samir Palnitkar. Verilog HDL: a guide to digital design and synthesis. Prentice

Hall Professional, 2003. pages 30-33.

[35] Mengying Pan. CtrlApp: A query-based language for control applications on

data plane. https://github.com/MollyDream/CtrlApp.

[36] Benjamin C. Pierce, Arthur Azevedo de Amorim, Chris Casinghino, Marco

Gaboardi, Michael Greenberg, C˘at˘alin Hrit¸cu, Vilhelm Sj¨oberg, Andrew Tol-

mach, and Brent Yorgey. Programming Language Foundations, volume 2

of Software Foundations. Electronic textbook, 2022. Version 6.2, http:

//softwarefoundations.cis.upenn.edu.

[37] Benjamin C. Pierce, Arthur Azevedo de Amorim, Chris Casinghino, Marco

Gaboardi, Michael Greenberg, C˘at˘alin Hrit¸cu, Vilhelm Sj¨oberg, and Brent

Yorgey. Logical Foundations, volume 1 of Software Foundations. Electronic

textbook, 2022. Version 6.2, http://softwarefoundations.cis.upenn.edu.

[38] Fabian Ruﬀy, Tao Wang, and Anirudh Sivaraman. Gauntlet: Finding bugs in

compilers for programmable packet processing. In 14th USENIX Symposium on

Operating Systems Design and Implementation (OSDI 20), pages 683–699, 2020.

[39] John Sonchack, Devon Loehr, Jennifer Rexford, and David Walker. Lucid: A lan-

guage for control in the data plane. In Proceedings of the 2021 ACM SIGCOMM

2021 Conference, pages 731–747, 2021.

[40] Radu Stoenescu, Dragos Dumitrescu, Matei Popovici, Lorina Negreanu, and

Costin Raiciu. Debugging P4 programs with Vera. In Proceedings of the 2018

Conference of the ACM Special Interest Group on Data Communication, pages

518–532, 2018.

[41] Radu Stoenescu, Matei Popovici, Lorina Negreanu, and Costin Raiciu. SymNet:

Scalable symbolic execution for modern networks. In Proceedings of the 2016

ACM SIGCOMM Conference, pages 314–327, 2016.

148

[42] Bingchuan Tian, Jiaqi Gao, Mengqi Liu, Ennan Zhai, Yanqing Chen, Yu Zhou,

Li Dai, Feng Yan, Mengjing Ma, Ming Tang, Chen Tian, and Minlan Yu. Aquila:

a practically usable veriﬁcation system for production-scale programmable data

planes. In Proceedings of the 2021 ACM SIGCOMM 2021 Conference, pages

17–32, 2021.

[43] Veriﬁed Network Toolchain. petr4/coq/lib/P4light at Feb2023. https:

//github.com/verified-network-toolchain/petr4/tree/Feb2023/coq/

lib/P4light. Accessed: 2023-03-07.

[44] Qinshi Wang and Andrew W. Appel. A solver for arrays with concatenation.

Journal of Automated Reasoning, 67(1):4, 2023.

[45] Qinshi Wang, Mengying Pan, Shengyi Wang, Ryan Doenges, Lennart Beringer,

and Andrew W. Appel. Foundational veriﬁcation of stateful P4 packet pro-

cessing. In 14th International Conference on Interactive Theorem Proving (ITP

2023), 2023.

[46] Zhuolong Yu, Yiwen Zhang, Vladimir Braverman, Mosharaf Chowdhury, and Xin

Jin. Netlock: Fast, centralized lock management using programmable switches.

In Proceedings of the Annual Conference of the ACM Special Interest Group

on Data Communication on the Applications, Technologies, Architectures, and

Protocols for Computer Communication, SIGCOMM ’20, page 126–138, New

York, NY, USA, 2020. Association for Computing Machinery.

149