Snabb Reference Manual

Mike Pall

Luke Gorrie

Max Rottenkolber

Andy Wingo

Diego Pino Garcia

Cosmin Apreutesei

Diego Pino

Nikolay Nikolaev

Katerina Barone-Adesi

Jessica Tallon

Asumu Takikawa

Alexander Gall

Hans Huebner

Javier Guerra

Nicola Larosa

Marcel Wiget

Pete Bristow

Adrian Perez de Castro

Adrián Pérez de Castro

Antonio Nikishaev

Domen Kožar

Alexander Altshuler

Rolf Sommerhalder

Pete Kazmier

Timo Buhrmester

Mikhail Nazarov

Justin Cormack

Kristian Larsson

Nicola ‘tekNico’ Larosa

Felix Geißler

Christian Graf

Carlos Alberto Lopez Perez

R. Matthew Emerson

Jeff Loughridge

Kacper Wysocki

Adrian Perez

Vladimir Fedin

Vincenzo Maffione

Tomas Korcak

Tim Upthegrove

Tim LaBerge

Stevan Markovic

Simon Leinen

Ryan Hartlage

Peter Cawley

Kristian Kielhofner

Jon Olsson

Jianbo Liu

Jay Fenton

James Cunningham

Hui Xiang

Gernot Nusshall

Edward Hope-Morley

Anshul Makkar

Andy Chong

Alex Kordic

Alexandr Kostrikov

Alexander Spyridakis

Version f9b5363, Fri Jun 2 10:58:51 2017 +0000

Introduction
Snabb API
- App
- Config (core.config)
- Engine (core.app)
- Link (core.link)
- Packet (core.packet)
- Memory (core.memory)
- Shared Memory (core.shm)
  - Counter (core.counter)
  - Histogram (core.histogram)
- Lib (core.lib)
- Multiprocess operation (core.worker)
- Main
Basic Apps (apps.basic.basic_apps)
- Source
- Join
- Split
- Sink
- Tee
- Repeater
- Truncate
- Sample
Intel 82599 Ethernet Controller Apps
- Intel10G (apps.intel.intel_app)
- LoadGen (apps.intel.loadgen)
  - Configuration
  - Performance
Intel i210 / i350 / 82599 Ethernet Controller apps (apps.intel_mp.intel_mp)
- Caveats
- Configuration
Solarflare Ethernet Controller Apps
- Solarflare (apps.solarflare.solarflare)
  - Configuration
RateLimiter App (apps.rate_limiter.rate_limiter)
- Configuration
- Performance
PcapFilter App (apps.packet_filter.pcap_filter)
- Configuration
- Special Counters
IPv6 Apps
- Nd_light (apps.ipv6.nd_light)
  - Configuration
  - Special Counters
- SimpleKeyedTunnel (apps.keyed_ipv6_tunnel.tunnel)
  - Configuration
  - Special Counters
VhostUser App (apps.vhost.vhost_user)
- Configuration
VirtioNet App (apps.virtio_net.virtio_net)
- Configuration
PcapReader and PcapWriter Apps (apps.pcap.pcap)
- Configuration
RawSocket App (apps.socket.raw)
- Configuration
UnixSocket App (apps.socket.unix)
- Configuration
Tap app (apps.tap.tap)
- Configuration
VLAN Apps
- Tagger (apps.vlan.vlan)
  - Configuration
- Untagger (apps.vlan.vlan)
  - Configuration
- VlanMux (apps.vlan.vlan)
Bridge Apps
- Configuration
- Flooding bridge (apps.bridge.flooding)
  - Configuration
- Learning bridge (apps.bridge.learning)
  - Configuration
IPsec Apps
- AES128gcm (apps.ipsec.esp)
  - Configuration
Test Apps
- Match (apps.test.match)
  - Configuration
- Synth (apps.test.synth)
  - Configuration
SnabbWall Apps
- L7Spy (apps.wall.l7spy)
- Filter (apps.wall.filter)
- Scanner (apps.wall.scanner)
  - Subclassing
  - NdpiScanner (apps.wall.scanner.ndpi)
- Utilities
  - SouthAndNorth (apps.wall.util)
Libraries
- IP checksum (lib.checksum)
- Ctable (lib.ctable)
- PMU (lib.pmu)
- Hardware
  - PCI (lib.hardware.pci)
  - Register (lib.hardware.register)
- Protocols
- IPsec
  - Encapsulating Security Payload (lib.ipsec.esp)
- Snabb NFV
- Watchdog (lib.watchdog.watchdog)
Snabblab
- Guidelines
- Servers
- Get started
- Using the lab
- Questions
- Thanks

Note: This reference manual is a draft. The API defined in this document is not guaranteed to be stable or complete and future versions of Snabb will introduce backwards incompatible changes. With that being said, discrepancies between this document and the actual Snabb Switch implementation are considered to be bugs. Please report them in order to help improve this document.

Introduction

Snabb is an extensible, virtualized, Ethernet networking toolkit. With Snabb you can implement networking applications using the Lua language. Snabb includes all the tools you need to quickly realize your network designs, and it's really fast too! Furthermore, Snabb is extensible and encourages you to grow the ecosystem to match your requirements.

[Figure: Architecture]

The Snabb Core forms a runtime environment (engine) which executes your design. A design is simply a Lua script used to drive the Snabb stack; you can think of it as your top-level “main” routine.

In order to add functionality to the Snabb stack you can load modules into the Snabb engine. These can be Lua modules as well as native code objects. We differentiate between two classes of modules, namely libraries and Apps. Libraries are simple collections of program utilities to be used in your designs, apps or other libraries, just as you might expect. Apps, on the other hand, are code objects that implement a specific interface, which is used by the Snabb engine to organize an App Network.

[Figure: Network]

Usually, a Snabb design will create a series of apps, interconnect these in a desired way using links and finally pass the resulting app network on to the Snabb engine. The engine’s job is to:

Pump traffic through the app network
Keep the app network running (e.g. restart failed apps)
Report on the network status

Snabb API

The core modules defined below can be loaded using Lua’s require. For example:

local config = require("core.config")

local c = config.new()
...

App

An app is an isolated implementation of a specific networking function. For example, a switch, a router, or a packet filter.

Apps receive packets on input ports, perform some processing, and transmit packets on output ports. Each app has zero or more input and output ports. For example, a packet filter may have one input and one output port, while a packet recorder may have only an input port. Every app must implement the interface below. Methods which may be left unimplemented are marked as “optional”.

— Method myapp:new arg

Required. Create an instance of the app with a given argument arg. The method myapp:new must return an instance of the app. The handling of arg is up to the app, but using core.config’s parse_app_arg to parse arg is encouraged.

— Field myapp.input

— Field myapp.output

Tables of named input and output links. These tables are initialized by the engine for use in processing and are read-only.

— Field myapp.appname

Name of the app. Read-only.

— Field myapp.shm

Can be set to a specification for core.shm.create_frame. When set, this field will be initialized to a frame of shared memory objects by the engine.

— Field myapp.config

Can be set to a specification for core.lib.parse. When set, the specification will be used to validate the app’s arg when it is configured using config.app.

— Method myapp:link

Optional. Called any time the app’s links may have been changed (including on start-up). Guaranteed to be called before pull and push are called with new links.

— Method myapp:pull

Optional. Pull packets into the network.

For example: Pull packets from a network adapter into the app network by transmitting them to output ports.

— Method myapp:push

Optional. Push packets through the system.

For example: Move packets from input ports to output ports or to a network adapter.

— Method myapp:reconfig arg

Optional. Reconfigure the app with a new arg. If this method is not implemented the app instance is discarded and a new instance is created.

— Method myapp:report

Optional. Print a report of the current app status.

— Method myapp:stop

Optional. Stop the app and release associated external resources.

— Field myapp.zone

Optional. Name of the LuaJIT profiling zone used for this app (descriptive string). The default is the module name.
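The shape of the app interface can be illustrated with a small sketch. So that it runs in plain Lua outside of Snabb, links are modelled here as simple arrays and packets as strings; the helper functions empty, receive and transmit, the Filter app and its min_length parameter are all hypothetical stand-ins, and a real app would use the core.link and core.packet functions instead.

```lua
-- Stand-ins for core.link (assumption: real apps call link.empty,
-- link.receive and link.transmit on links provided by the engine).
local function empty(l) return #l == 0 end
local function receive(l) return table.remove(l, 1) end
local function transmit(l, p) l[#l + 1] = p end

-- A hypothetical app that forwards only packets of a minimum length.
local Filter = {}
Filter.__index = Filter

-- Required: return a new instance of the app configured by arg.
function Filter:new(arg)
   return setmetatable({min_length = arg.min_length, dropped = 0}, Filter)
end

-- Optional: move packets from input ports to output ports.
function Filter:push()
   local input, output = self.input.rx, self.output.tx
   while not empty(input) do
      local p = receive(input)
      if #p >= self.min_length then
         transmit(output, p)
      else
         -- A real app would release the packet with packet.free here.
         self.dropped = self.dropped + 1
      end
   end
end

-- In Snabb the engine creates the instance and fills in input/output.
local app = Filter:new({min_length = 3})
app.input, app.output = {rx = {"ab", "abcd"}}, {tx = {}}
app:push()
```

After the call to push, the only packet left on the modelled output link is "abcd".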

Config (core.config)

A config is a description of a packet-processing network. The network is a directed graph. Nodes in the graph are apps that each process packets in a specific way. Each app has a set of named input and output ports—often called rx and tx. Edges of the graph are unidirectional links that carry packets from an output port to an input port.

The config is a purely passive data structure. Creating and manipulating a config object does not immediately affect operation. The config has to be activated using engine.configure.

— Function config.new

Creates and returns a new empty configuration.

— Function config.app config, name, class, arg

Adds an app of class with arg to the config where it will be assigned to name.

Example:

config.app(c, "nic", Intel82599, {pciaddr = "0000:00:00.0"})

— Function config.link config, linkspec

Add a link defined by linkspec to the config config. Linkspec must be a string of the format

app_name1.output_port->app_name2.input_port

where app_name1 and app_name2 are names of apps in config and output_port and input_port are valid output and input ports of the referenced apps respectively.

Example:

config.link(c, "nic1.tx->nic2.rx")
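To make the linkspec format concrete, the following plain-Lua sketch splits a linkspec into its four components. It is not the parser used by core.config; the function name parse_linkspec, the restriction of names to letters, digits and underscores, and the tolerance for whitespace around the arrow are assumptions of this sketch.

```lua
-- Split a linkspec into its components, or return nil if it is malformed.
-- Assumption: app and port names consist of letters, digits and "_".
local function parse_linkspec(spec)
   local from_app, from_port, to_app, to_port =
      spec:match("^%s*([%w_]+)%.([%w_]+)%s*%->%s*([%w_]+)%.([%w_]+)%s*$")
   if not from_app then return nil end
   return {from_app = from_app, from_port = from_port,
           to_app = to_app, to_port = to_port}
end

local l = parse_linkspec("nic1.tx->nic2.rx")
-- l.from_app == "nic1", l.from_port == "tx"
-- l.to_app   == "nic2", l.to_port   == "rx"
```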

Engine (core.app)

The engine executes a config by initializing apps, creating links, and driving the flow of execution. The engine also performs profiling and reporting functions. It can be reconfigured during runtime. Within Snabb Switch scripts the core.app module is bound to the global engine variable.

— Function engine.configure config

Configure the engine to use a new config config. You can safely call this method many times to incrementally update the running app network. The engine updates the app network as follows:

Apps that did not exist in the old configuration are started.
Apps that do not exist in the new configuration are stopped. (The app stop() method is called if defined.)
Apps with unchanged configurations are preserved.
Apps with changed configurations are updated by calling their reconfig() method. If the reconfig() method is not implemented then the old instance is stopped and a new one is started.
Links with unchanged endpoints are preserved.
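The update rules above can be modelled as a comparison of two configurations. The sketch below is plain Lua and is not the engine's implementation: the plan and equal functions and the representation of a configuration as a table mapping app names to {class, arg} are hypothetical, and treating a changed class as a restart is an assumption.

```lua
-- Structural equality for plain values and nested tables.
local function equal(x, y)
   if type(x) ~= type(y) then return false end
   if type(x) ~= "table" then return x == y end
   for k, v in pairs(x) do
      if not equal(v, y[k]) then return false end
   end
   for k in pairs(y) do
      if x[k] == nil then return false end
   end
   return true
end

-- old and new map app names to {class = <table>, arg = <value>}.
-- Returns a table mapping app names to "start", "stop", "keep",
-- "reconfig" or "restart".
local function plan(old, new)
   local actions = {}
   for name in pairs(old) do
      if new[name] == nil then actions[name] = "stop" end
   end
   for name, app in pairs(new) do
      local prev = old[name]
      if prev == nil then
         actions[name] = "start"
      elseif prev.class == app.class and equal(prev.arg, app.arg) then
         actions[name] = "keep"
      elseif prev.class == app.class and app.class.reconfig then
         actions[name] = "reconfig"
      else
         actions[name] = "restart"   -- stop the old instance, start a new one
      end
   end
   return actions
end
```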

— Function engine.main options

Run the Snabb engine. Options is a table of key/value pairs. The following keys are recognized:

duration - Duration in seconds to run the engine for (as a floating point number). If this is set you cannot supply done.
done - A function to be called repeatedly by engine.main until it returns true. Once it returns true the engine will be stopped and engine.main will return. If this is set you cannot supply duration.
report - A table which configures the report printed before engine.main() returns. The keys showlinks and showapps can be set to boolean values to force or suppress link and app reporting individually. By default engine.main() will report on links but not on apps.
measure_latency - By default, the breathe() loop is instrumented to record the latency distribution of running the app graph. This information can be processed by the snabb top program. Passing measure_latency=false in the options will disable this instrumentation.
no_report - A boolean value. If true no final report will be printed.
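Putting the config and engine functions together, a minimal design might look like the sketch below. It has to be run as a Snabb design, since it relies on the Snabb runtime and the global engine variable. The app names "source" and "sink" and the port names tx and rx are arbitrary choices made for this example.

```lua
-- A minimal design: generate packets and discard them for one second.
local config = require("core.config")
local basic_apps = require("apps.basic.basic_apps")

local c = config.new()
config.app(c, "source", basic_apps.Source)
config.app(c, "sink", basic_apps.Sink)
config.link(c, "source.tx->sink.rx")

engine.configure(c)
engine.main({duration = 1, report = {showlinks = true}})
```

Because duration is supplied, the done option must not be given in the same call.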

— Function engine.now

Returns monotonic time in seconds as a floating point number. Suitable for timers.

— Variable engine.busywait

If set to true then the engine polls continuously for new packets to process. This consumes 100% CPU and makes processing latency less vulnerable to kernel scheduling behavior which can cause pauses of more than one millisecond.

Default: false

— Variable engine.Hz

Frequency at which to poll for new input packets. The default value is ‘false’ which means to adjust dynamically up to 100us during low traffic. The value can be overridden with a constant integer saying how many times per second to poll.

This setting is not used when engine.busywait is true.

Link (core.link)

A link is a ring buffer used to store packets between apps. Links can be treated either like arrays—accessing their internal structure directly—or as streams of packets by using their API functions.

— Function link.empty link

Predicate used to test if a link is empty. Returns true if link is empty and false otherwise.

— Function link.full link

Predicate used to test if a link is full. Returns true if link is full and false otherwise.

— Function link.nreadable link

Returns the number of packets on link.

— Function link.nwriteable link

Returns the remaining number of packets that fit onto link.

— Function link.receive link

Returns the next available packet (and advances the read cursor) on link. If the link is empty an error is signaled.

— Function link.front link

Return the next available packet without advancing the read cursor on link. If the link is empty, nil is returned.

— Function link.transmit link, packet

Transmits packet onto link. If the link is full packet is dropped (and the drop counter increased).

— Function link.stats link

Returns a structure holding ring statistics for the link:

txbytes, rxbytes: Counts of transferred bytes.
txpackets, rxpackets: Counts of transferred packets.
txdrop: Count of packets dropped due to ring overflow.
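The documented link semantics can be modelled with a small ring buffer in plain Lua. This is an illustrative model only, not the core.link implementation: the Ring type and its capacity parameter are hypothetical, and packets are represented by arbitrary Lua values.

```lua
local Ring = {}
Ring.__index = Ring

-- capacity is a hypothetical parameter of this model.
function Ring.new(capacity)
   return setmetatable({
      slots = {}, capacity = capacity,
      read = 0, write = 0,            -- cursors only ever increase
      stats = {txpackets = 0, rxpackets = 0, txdrop = 0},
   }, Ring)
end

function Ring:nreadable()  return self.write - self.read end
function Ring:nwriteable() return self.capacity - self:nreadable() end
function Ring:empty()      return self:nreadable() == 0 end
function Ring:full()       return self:nreadable() == self.capacity end

-- Store p at the write cursor, or count a drop if the ring is full.
function Ring:transmit(p)
   if self:full() then
      self.stats.txdrop = self.stats.txdrop + 1
      return
   end
   self.slots[self.write % self.capacity] = p
   self.write = self.write + 1
   self.stats.txpackets = self.stats.txpackets + 1
end

-- Peek at the next packet without advancing the read cursor.
function Ring:front()
   if self:empty() then return nil end
   return self.slots[self.read % self.capacity]
end

-- Return the next packet and advance the read cursor.
function Ring:receive()
   if self:empty() then error("receive on empty link") end
   local p = self.slots[self.read % self.capacity]
   self.read = self.read + 1
   self.stats.rxpackets = self.stats.rxpackets + 1
   return p
end
```

A typical consumer loops with `while not r:empty() do local p = r:receive() ... end`, which never triggers the error case.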

Packet (core.packet)

A packet is an FFI object of type struct packet representing a network packet that is currently being processed. The life cycle of a packet is managed explicitly: packets are allocated and freed using packet.allocate and packet.free. When a packet is received using link.receive, its ownership is acquired by the calling app. The app must then either transfer ownership of the packet to another app by calling link.transmit on it, or free it using packet.free. Apps may only use packets they own, i.e. packets that have not been transmitted or freed. The number of allocatable packets is limited by the size of the underlying “freelist”, i.e. a pool of unused packet objects from which packets are allocated and to which they are returned when freed.

— Type struct packet

struct packet {
    uint16_t length;
    uint8_t  data[packet.max_payload];
};

— Constant packet.max_payload

The maximum payload length of a packet.

— Function packet.allocate

Returns a new empty packet. An error is raised if there are no packets left on the freelist. Initially the length of the allocated packet is 0, and its data is uninitialized garbage.

— Function packet.free packet

Frees packet and puts it back onto the freelist.

— Function packet.clone packet

Returns an exact copy of packet.

— Function packet.resize packet, length

Sets the payload length of packet, truncating or extending its payload. In the latter case the contents of the extended area at the end of the payload are filled with zeros.

— Function packet.append packet, pointer, length

Appends length bytes starting at pointer to the end of packet. An error is raised if there is not enough space in packet to accommodate length additional bytes.

— Function packet.prepend packet, pointer, length

Prepends length bytes starting at pointer to the front of packet, taking ownership of the packet and returning a new packet. An error is raised if there is not enough space in packet to accommodate length additional bytes.

— Function packet.shiftleft packet, length

Take ownership of packet, truncate it by length bytes from the front, and return a new packet. Length must be less than or equal to length of packet.

— Function packet.shiftright packet, length

Take ownership of packet, move the packet payload to the right by length bytes (growing packet by length), and return a new packet. The sum of length and length of packet must be less than or equal to packet.max_payload.

— Function packet.from_pointer pointer, length

Allocate packet and fill it with length bytes from pointer.

— Function packet.from_string string

Allocate packet and fill it with the contents of string.

— Function packet.clone_to_memory pointer, packet

Creates an exact copy of packet at the memory pointed to by pointer. Pointer must point to a packet.packet_t.

Memory (core.memory)

Snabb allocates special DMA memory that can be accessed directly by network cards. The important characteristic of DMA memory is being located in contiguous physical memory at a stable address.

— Function memory.dma_alloc bytes, [alignment]

Returns a pointer to bytes of new DMA memory.

Optionally a specific alignment requirement can be provided (in bytes). The default alignment is 128.

— Function memory.virtual_to_physical pointer

Returns the physical address (uint64_t) of the DMA memory at pointer.

— Variable memory.huge_page_size

Size of a single huge page in bytes. Read-only.

Shared Memory (core.shm)

This module facilitates creation and management of named shared memory objects. Objects can be created using shm.create similar to ffi.new, except that separate calls to shm.open for the same name will each return a new mapping of the same shared memory. Different processes can share memory by mapping an object with the same name (and type). Each process can map any object any number of times.

Mappings are deleted on process termination or with an explicit shm.unmap. Names are unlinked from objects that are no longer needed using shm.unlink. Object memory is freed when the name is unlinked and all mappings have been deleted.

Names can be fully qualified or abbreviated to be within the current process. Here are examples of names and how they are resolved where <pid> is the PID of this process:

Local: foo/bar ⇒ /var/run/snabb/<pid>/foo/bar
Fully qualified: /1234/foo/bar ⇒ /var/run/snabb/1234/foo/bar

Behind the scenes the objects are backed by files on ram disk (/var/run/snabb/<pid>) and accessed with the equivalent of POSIX shared memory (shm_overview(7)).
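The name resolution rules can be written down as a short plain-Lua function. This is a sketch of the mapping shown in the examples above, not the code used by core.shm; the resolve function and its explicit pid parameter are hypothetical.

```lua
local root = "/var/run/snabb"

-- Resolve a shared memory object name to a filesystem path.
-- Names starting with "/" are fully qualified; all other names are
-- local to the process identified by pid.
local function resolve(name, pid)
   if name:sub(1, 1) == "/" then
      return root .. name
   else
      return ("%s/%d/%s"):format(root, pid, name)
   end
end

local local_path = resolve("foo/bar", 4242)        -- /var/run/snabb/4242/foo/bar
local qualified  = resolve("/1234/foo/bar", 4242)  -- /var/run/snabb/1234/foo/bar
```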

The practical limit on the number of objects that can be mapped will depend on the operating system limit for memory mappings. On Linux the default limit is 65,530 mappings:

$ sysctl vm.max_map_count
vm.max_map_count = 65530

— Function shm.create name, type

Creates and maps a shared object of type into memory via a hierarchical name. Returns a pointer to the mapped object.

— Function shm.open name, type, [readonly]

Maps an existing shared object of type into memory via a hierarchical name. If readonly is non-nil the shared object is mapped in read-only mode. Readonly defaults to nil. Fails if the shared object does not already exist. Returns a pointer to the mapped object.

— Function shm.alias new-path, existing-path

Create an alias (symbolic link) for an object.

— Function shm.exists name

Returns a true value if shared object by name exists.

— Function shm.unmap pointer

Deletes the memory mapping for pointer.

— Function shm.unlink path

Unlinks the subtree of objects designated by path from the filesystem.

— Function shm.children path

Returns an array of objects in the directory designated by path.

— Function shm.register type, module

Registers an abstract shared memory object type implemented by module in shm.types. Module must provide the following functions:

create name, …
open name

and can optionally provide the function:

delete name

The module’s type variable must be bound to type. To register a new type a module might invoke shm.register like so:

type = shm.register('mytype', getfenv())
-- Now the following holds true:
--   shm.types[type] == getfenv()

— Variable shm.types

A table that maps types to modules. See shm.register.

— Function shm.create_frame path, specification

Creates and returns a shared memory frame by specification under path. A frame is a table of mapped (possibly abstract) shared memory objects. Specification must be of the form:

{ <name> = {<module>, ...},
  ... }

Module must implement an abstract type registered with shm.register, and is followed by additional initialization arguments to its create function. Example usage:

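local ffi = require("ffi")
local C = ffi.C  -- C.get_unix_time is used below
local shm = require("core.shm")
local counter = require("core.counter")
-- Create counters foo/bar/{dtime,rxpackets,txpackets}.counter
local f = shm.create_frame(
   "foo/bar",
   {dtime     = {counter, C.get_unix_time()},
    rxpackets = {counter},
    txpackets = {counter}})
counter.add(f.rxpackets)  -- increment rxpackets by 1
counter.read(f.dtime)     -- read the value dtime was initialized to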

— Function shm.open_frame path

Opens and returns the shared memory frame under path for reading.

— Function shm.delete_frame frame

Deletes/unmaps a shared memory frame. The frame directory is unlinked if frame was created by shm.create_frame.

Counter (core.counter)

Double-buffered shared memory counters. Counters are 64-bit unsigned values. Registered with core.shm as type counter.

— Function counter.create name, [initval]

Creates and returns a counter by name, initialized to initval. Initval defaults to 0.

— Function counter.open name

Opens and returns the counter by name for reading.

— Function counter.delete name

Deletes and unmaps the counter by name.

— Function counter.commit

Commits buffered counter values to public shared memory.

— Function counter.set counter, value

Sets counter to value.

— Function counter.add counter, [value]

Increments counter by value. Value defaults to 1.

— Function counter.read counter

Returns the value of counter.

Histogram (core.histogram)

Shared memory histogram with logarithmic buckets. Registered with core.shm as type histogram.

— Function histogram.new min, max

Returns a new histogram, with buckets covering the range from min to max. The range between min and max will be divided logarithmically.
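As an illustration of what a logarithmic division of the range means, the plain-Lua sketch below maps a measurement to a bucket index. It is not the bucket layout used by core.histogram: the bucket_index function, the nbuckets parameter and the clamping of out-of-range values to the first and last bucket are assumptions of this sketch.

```lua
-- Map measurement x to a bucket index in 0 .. nbuckets-1, dividing the
-- range from min to max logarithmically (min and max must be positive).
local function bucket_index(min, max, nbuckets, x)
   if x < min then return 0 end                 -- clamp underflow
   if x >= max then return nbuckets - 1 end     -- clamp overflow
   local fraction = math.log(x / min) / math.log(max / min)
   return math.floor(fraction * nbuckets)
end

-- With min=1, max=1000 and 3 buckets, each bucket spans one decade:
-- roughly [1,10), [10,100) and [100,1000).
local a = bucket_index(1, 1000, 3, 5)     -- first bucket
local b = bucket_index(1, 1000, 3, 50)    -- second bucket
local c = bucket_index(1, 1000, 3, 500)   -- third bucket
```

Each bucket covers the same ratio between its upper and lower bound rather than the same absolute width, which is useful for latency measurements that span several orders of magnitude.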

— Function histogram.create name, min, max

Creates and returns a histogram as in histogram.new by name. If the file exists already, it will be cleared.

— Function histogram.open name

Opens and returns histogram by name for reading.

— Method histogram:add measurement

Adds measurement to histogram.

— Method histogram:iterate prev

When used as for count, lo, hi in histogram:iterate(), visits all buckets in histogram in order from lowest to highest. Count is the number of samples recorded in that bucket, and lo and hi are the lower and upper bounds of the bucket. Note that count is an unsigned 64-bit integer; to get it as a Lua number, use tonumber.

If prev is given, it should be a snapshot of the previous version of the histogram. In that case, the count values will be returned as a difference between their values in histogram and their values in prev.

— Method histogram:snapshot [dest]

Copies out the contents of histogram into the histogram dest and returns dest. If dest is not given, the result will be a fresh histogram.

— Method histogram:clear

Clears the buckets of histogram.

— Method histogram:wrap_thunk thunk, now

Returns a closure that wraps thunk, measuring and recording the difference between calls to now before and after thunk into histogram.

Lib (core.lib)

The core.lib module contains miscellaneous utilities.

— Function lib.equal x, y

Predicate to test if x and y are structurally similar (isomorphic).

— Function lib.can_open filename, mode

Predicate to test if file at filename can be successfully opened with mode.

— Function lib.can_read filename

Predicate to test if file at filename can be successfully opened for reading.

— Function lib.can_write filename

Predicate to test if file at filename can be successfully opened for writing.

— Function lib.readcmd command, what

Runs Unix shell command and returns what of its output. What must be a valid argument to file:read.

— Function lib.readfile filename, what

Reads and returns what from file at filename. What must be a valid argument to file:read.

— Function lib.writefile filename, value

Writes value to file at filename using file:write. Returns the value returned by file:write.

— Function lib.readlink filename

Returns the true name of symbolic link at filename.

— Function lib.dirname filename

Returns the dirname(3) of filename.

— Function lib.basename filename

Returns the basename(3) of filename.

— Function lib.firstfile directory

Returns the filename of the first file in directory.

— Function lib.firstline filename

Returns the first line of file at filename as a string.

— Function lib.files_in_directory directory

Returns an array of filenames in directory.

— Function lib.load_string string

Evaluates and returns the value of the Lua expression in string.

— Function lib.load_conf filename

Evaluates and returns the value of the Lua expression in file at filename.

— Function lib.store_conf filename, value

Writes value to file at filename as a Lua expression. Supports tables, strings and everything that can be readably printed using print.

— Function lib.bits bitset, basevalue

Returns a bitmask using the values of bitset as indexes, combined (bitwise OR) with basevalue if given. The keys of bitset are ignored (and can be used as comments).

Example:

bits({RESET=0,ENABLE=4}, 123) => 1<<0 | 1<<4 | 123

— Function lib.bitset value, n

Predicate to test if bit number n of value is set.
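The behaviour of lib.bits and lib.bitset can be modelled in plain Lua without a bit operations library. The sketch below is a model for non-negative integers only and is not the core.lib implementation; it uses arithmetic in place of bitwise operators so that it runs on any Lua version.

```lua
-- True if bit number n (counting from 0) of value is set.
local function bitset(value, n)
   return math.floor(value / 2^n) % 2 == 1
end

-- OR together one bit per value in set, starting from base (default 0).
local function bits(set, base)
   local sum = base or 0
   for _, n in pairs(set) do
      -- Adding 2^n only when the bit is clear is equivalent to a bitwise OR.
      if not bitset(sum, n) then sum = sum + 2^n end
   end
   return sum
end

local mask = bits({RESET = 0, ENABLE = 4})          -- equals 17
local same = bits({RESET = 0, ENABLE = 4}, 123)     -- equals 123
```

The second result is unchanged because bits 0 and 4 are already set in 123.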

— Function lib.bitfield size, struct, member, offset, nbits, value

Combined accessor and setter function for bit ranges of integers in cdata structs. Sets nbits (number of bits) starting from offset to value. If value is not given the current value is returned.

Size may be one of 8, 16 or 32 depending on the bit size of the integer being set or read.

Struct must be a pointer to a cdata object and member must be the literal name of a member of struct.

Example:

local struct_t = ffi.typeof[[struct { uint16_t flags; }]]
-- Assuming `s' is an instance of `struct_t', set bits 4-7 to 0xF:
lib.bitfield(16, s, 'flags', 4, 4, 0xf)
-- Get the value:
lib.bitfield(16, s, 'flags', 4, 4) -- => 0xF

— Function string:split pattern

Returns an iterator over the string split by pattern. Pattern must be a valid argument to string:gmatch.

Example:

for word, sep in ("foo!bar!baz"):split("(!)") do
    print(word, sep)
end

> foo   !
> bar   !
> baz   nil

— Function lib.hexdump string

Returns hexadecimal string for bytes in string.

— Function lib.hexundump hexstring

Returns byte string for hexstring.

— Function lib.comma_value n

Returns a string for decimal number n with magnitudes separated by commas. Example:

comma_value(1000000) => "1,000,000"

— Function lib.random_data length

Returns a string of length bytes of random data.

— Function lib.bounds_checked type, base, offset, size

Returns a table that acts as a bounds checked wrapper around a C array of type and size starting at base plus offset. Type must be a ctype and the caller must ensure that the allocated memory region at base/offset is at least sizeof(type)*size bytes long.

— Function lib.throttle seconds

Return a closure that returns true at most once during any seconds (a floating point value) time interval, otherwise false.

— Function lib.timeout seconds

Returns a closure that returns true if seconds (a floating point value) have elapsed since it was created, otherwise false.
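The closures returned by lib.throttle and lib.timeout can be sketched in plain Lua as follows. This is a model of the documented behaviour rather than the core.lib code: the extra now parameter is a hypothetical addition that lets the caller supply the clock, which makes the sketch testable without real time passing.

```lua
-- Returns a closure that reports whether seconds have elapsed since creation.
local function timeout(seconds, now)
   local deadline = now() + seconds
   return function () return now() >= deadline end
end

-- Returns a closure that returns true at most once per seconds interval.
local function throttle(seconds, now)
   local deadline = now()
   return function ()
      if now() >= deadline then
         deadline = now() + seconds
         return true
      end
      return false
   end
end

-- Example with a manually advanced clock.
local t = 0
local function clock() return t end
local expired = timeout(5, clock)
local first = expired()    -- false: no time has passed yet
t = 5
local second = expired()   -- true: five seconds have elapsed
```

A throttle closure is typically used to guard periodic work inside a loop, for example `if ready() then report() end`.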

— Function lib.waitfor condition

Blocks until the function condition returns a true value.

— Function lib.waitfor2 name, condition, attempts, interval

Repeatedly calls the function condition in interval (milliseconds). If condition returns a true value waitfor2 returns. If condition does not return a true value after attempts waitfor2 raises an error identified by name.

— Function lib.yesno flag

Returns the string "yes" if flag is a true value and "no" otherwise.

— Function lib.align value, size

Return the next integer that is a multiple of size starting from value.
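A worked example of the alignment arithmetic, as a plain-Lua sketch for non-negative integers (a model of the documented behaviour, not necessarily the code in core.lib):

```lua
-- Round value up to the next multiple of size (value itself if aligned).
local function align(value, size)
   local remainder = value % size
   if remainder == 0 then return value end
   return value + size - remainder
end

local a = align(13, 8)   -- 16
local b = align(16, 8)   -- 16, already a multiple of 8
```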

— Function lib.csum pointer, length

Computes and returns the “IP checksum” of length bytes starting at pointer.

— Function lib.update_csum pointer, length, checksum

Returns checksum updated by length bytes starting at pointer. The default of checksum is 0LL.

— Function lib.finish_csum checksum

Returns the finalized checksum.
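The relationship between the three checksum functions can be illustrated with a plain-Lua model of the Internet checksum (the 16-bit ones' complement sum described in RFC 1071). The sketch operates on Lua strings instead of pointers, and it is not the core.lib implementation; the function names mirror the ones above only for readability.

```lua
-- Add the 16-bit big-endian words of data to the running sum.
-- Only the final chunk may have an odd length; it is zero-padded.
local function update_csum(data, sum)
   sum = sum or 0
   local len = #data
   for i = 1, len - 1, 2 do
      local hi, lo = data:byte(i, i + 1)
      sum = sum + hi * 256 + lo
   end
   if len % 2 == 1 then
      sum = sum + data:byte(len) * 256
   end
   return sum
end

-- Fold the carries back into 16 bits and take the ones' complement.
local function finish_csum(sum)
   while sum > 0xffff do
      sum = (sum % 0x10000) + math.floor(sum / 0x10000)
   end
   return 0xffff - sum
end

-- Checksum of a single buffer.
local function csum(data)
   return finish_csum(update_csum(data))
end

-- Helper for the example: convert a hex string to a byte string.
local function fromhex(h)
   return (h:gsub("%x%x", function (b) return string.char(tonumber(b, 16)) end))
end

-- An IPv4 header with its checksum field set to zero.
local header = fromhex("450000730000400040110000c0a80001c0a800c7")
local checksum = csum(header)   -- 0xb861
```

Splitting the computation into update and finish steps allows a checksum to be accumulated over several non-contiguous buffers before it is finalized.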

— Function lib.malloc etype

Returns a pointer to newly allocated DMA memory for etype.

— Function lib.deepcopy object

Returns a copy of object. Supports tables as well as ctypes.

— Function lib.array_copy array

Returns a copy of array. Array must not be a “sparse array”.

— Function lib.htonl n

— Function lib.htons n

Host to network byte order conversion functions for 32 and 16 bit integers n respectively. Unsigned.

— Function lib.ntohl n

— Function lib.ntohs n

Network to host byte order conversion functions for 32 and 16 bit integers n respectively. Unsigned.

— Function lib.parse arg, config

Validates arg against the specification in config, and returns a fresh table containing the parameters in arg and any omitted optional parameters with their default values. Given arg, a table of parameters or nil, assert that from config all of the required keys are present, fill in any missing values for optional keys, and error if any unknown keys are found. Config has the following format:

config := { key = {[required=boolean], [default=value]}, ... }

Each key is optional unless required is set to a true value, and its default value defaults to nil.

Example:

lib.parse({foo=42, bar=43}, {foo={required=true}, bar={}, baz={default=44}})
  => {foo=42, bar=43, baz=44}
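The validation rules can be sketched in a few lines of plain Lua. This is a model of the behaviour described above, written for illustration; it is not the code in core.lib and the error messages are made up.

```lua
-- Validate arg against config and return a fresh table with defaults applied.
local function parse(arg, config)
   local ret = {}
   for k, v in pairs(arg or {}) do
      -- Every supplied key must be declared in config.
      assert(config[k], "unrecognized parameter: " .. tostring(k))
      ret[k] = v
   end
   for k, o in pairs(config) do
      if ret[k] == nil then
         -- Missing keys are an error if required, else take the default.
         assert(not o.required, "missing required parameter: " .. tostring(k))
         ret[k] = o.default
      end
   end
   return ret
end

local spec = {foo = {required = true}, bar = {}, baz = {default = 44}}
local result = parse({foo = 42, bar = 43}, spec)
-- result.foo == 42, result.bar == 43, result.baz == 44
```

Calling parse with an unknown key, or without foo, raises an error instead of returning a table.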

Multiprocess operation (core.worker)

Snabb can operate as a group of cooperating processes. The main process is the initial one that you start directly. The optional worker processes are children spawned when the main process calls the core.worker module.

[Figure: Multiprocessing]

Each worker is a complete Snabb process. They can define app networks, run the engine, and do everything else that ordinary Snabb processes do. The exact behavior of each worker is determined by a Lua expression provided upon creation.

Groups of Snabb processes each have the following special properties:

Group termination: Terminating the main process automatically terminates all of the workers. This works for all process termination scenarios including kill -9.
Shared DMA memory: DMA memory pointers obtained with memory.dma_alloc() are usable by all processes in the group. This means that you can share DMA memory pointers between processes, for example via shm shared memory objects, and reference them from any process. (The memory is automatically mapped at the expected address via a SEGV signal handler.)
PCI device shutdown: For each PCI device opened by a process within the group, bus mastering (DMA) is disabled upon termination before any DMA memory is returned to the kernel. This prevents “dangling” DMA requests from corrupting memory that has been freed and reused.

The core.worker API functions are available in the main process only:

— Function worker.start name, luacode

Start a named worker process. The worker starts with a completely fresh Snabb process image (fork()+execve()) and then executes the string luacode as a Lua source code expression.

Example:

worker.start("myworker", [[
   print("hello world, from a Snabb worker process!")
   print("could configure and run the engine now...")
]])

— Function worker.stop name

Stop a named worker process. The worker is abruptly killed.

Example:

worker.stop("myworker")

— Function worker.status

Return a table summarizing the status of all workers. The table key is the worker name and the value is a table with pid and alive attributes.

Example:

for w, s in pairs(worker.status()) do
   print(("  worker %s: pid=%s alive=%s"):format(
         w, s.pid, s.alive))
end

Output:

worker w3: pid=21949 alive=true
worker w1: pid=21947 alive=true
worker w2: pid=21948 alive=true

Main

Snabb designs can be run either from the command line:

snabb <snabb-arg>* <design> <design-arg>*

or as executable scripts that start with an interpreter line:

#!/usr/bin/env snabb <snabb-arg>*
...

The main module provides an interface for running Snabb scripts. It exposes various operating system functions to scripts.

— Field main.parameters

A list of command-line arguments to the running script. Read-only.

— Function main.exit status

Cleanly exits the process with status.

Basic Apps (apps.basic.basic_apps)

The module apps.basic.basic_apps provides apps with general functionality for use in your app networks.

Source

The Source app is a synthetic packet generator. On each breath it fills each attached output link with new packets. It accepts a number as its configuration argument which is the byte size of the generated packets. By default, each packet is 60 bytes long. The packet data is initialized with zero bytes.

[Figure: Source]

Join

The Join app joins together packets from N input links onto one output link. On each breath it outputs as many packets as possible from the inputs onto the output.

Split

The Split app splits packets from multiple inputs across multiple outputs. On each breath it transfers as many packets as possible from the input links to the output links.

Sink

The Sink app receives all packets from any number of input links and discards them. This can be handy in combination with a Source.

Tee

The Tee app receives all packets from any number of input links and transfers each received packet to all output links. It can be used to merge and/or duplicate packet streams.

Repeater

The Repeater app collects all packets received from the input link and repeatedly transfers the accumulated packets to the output link. The packets are transmitted in the order they were received.

Truncate

The Truncate app sends all packets received from the input to the output link and truncates or zero pads each packet to a given length. It accepts a number as its configuration argument which is the length of the truncated or padded packets.

Sample

The Sample app forwards every nth packet from the input link to the output link, and drops all other packets. It accepts a number as its configuration argument which is n.
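
The basic apps are typically combined into small test networks. The following is a minimal sketch, assuming the config and engine APIs from core.config and core.app and that each app uses links named input and output; the app names and sizes are arbitrary. It only runs inside Snabb.

```lua
local config = require("core.config")
local engine = require("core.app")
local basic_apps = require("apps.basic.basic_apps")

local c = config.new()
config.app(c, "source", basic_apps.Source, 128)     -- 128-byte packets
config.app(c, "sample", basic_apps.Sample, 10)      -- forward every 10th packet
config.app(c, "truncate", basic_apps.Truncate, 64)  -- cut packets to 64 bytes
config.app(c, "sink", basic_apps.Sink)              -- discard everything

config.link(c, "source.output -> sample.input")
config.link(c, "sample.output -> truncate.input")
config.link(c, "truncate.output -> sink.input")

engine.configure(c)
-- Run for one second and print per-link statistics afterwards.
engine.main({duration = 1, report = {showlinks = true}})
```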

Intel 82599 Ethernet Controller Apps

Intel10G (apps.intel.intel_app)

The Intel10G app drives one port of an Intel 82599 Ethernet controller. Packets taken from the rx port are transmitted onto the network. Packets received from the network are put on the tx port.

— Method Intel10G.dev:get_rxstats

Returns a table with the following keys:

counter_id - Counter id
packets - Number of packets received
dropped - Number of packets dropped
bytes - Total bytes received

— Method Intel10G.dev:get_txstats

Returns a table with the following keys:

counter_id - Counter id
packets - Number of packets sent
bytes - Total bytes sent

Configuration

The Intel10G app accepts a table as its configuration argument. The following keys are defined:

— Key pciaddr

Required. The PCI address of the NIC as a string.

— Key macaddr

Optional. The MAC address to use as a string. The default is a wild-card (i.e. accept all packets).

— Key vlan

Optional. A twelve bit integer (0-4095). If set, incoming packets from other VLANs are dropped and outgoing packets are tagged with a VLAN header.

— Key vmdq

Optional. Boolean, defaults to false. Enables interface virtualization, which allows multiple Intel10G apps per port. If enabled, macaddr must be specified.

— Key mirror

Optional. A table. If set, this app will receive copies of all selected packets on the physical port. The selection is configured by setting keys of the mirror table. Either mirror.pool or mirror.port may be set.

If mirror.pool is true all pools defined on this physical port are mirrored. If mirror.pool is an array of pool numbers then the specified pools are mirrored.

If mirror.port is one of “in”, “out” or “inout” all incoming and/or outgoing packets on the port are mirrored respectively. Note that this does not include internal traffic which does not enter or exit through the physical port.

— Key rxcounter

— Key txcounter

Optional. Four bit integers (0-15). If set, incoming/outgoing packets will be counted in the selected statistics counter respectively. Multiple apps can share a counter. To retrieve counter statistics use Intel10G.dev:get_rxstats() and Intel10G.dev:get_txstats().

— Key rate_limit

Optional. Number. Limits the transmit rate to the given number of Mbit/s. Default is 0 which means no limit. Only applies to outgoing traffic.

— Key priority

Optional. Floating point number. Weight for the round-robin algorithm used to arbitrate transmission when rate_limit is not set or adds up to more than the line rate of the physical port. Default is 1.0 (scaled to the geometric middle of the scale which goes from 1/128 to 128). The absolute value is not relevant, instead only the ratio between competing apps controls their respective bandwidths. Only applies to outgoing traffic.

For example, if two apps without rate_limit set have the same priority, both get the same output bandwidth. If the priorities are 3.0/1.0, the output bandwidth is split 75%/25%. Likewise, 1.0/0.333 or 1.5/0.5 yield the same result.
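
The ratio rule can be checked with a few lines of plain Lua. This is only a sketch of the arithmetic, assuming that every app is competing for the full port bandwidth; the helper bandwidth_shares is made up for illustration and is not part of the Intel10G API.

```lua
-- Hypothetical helper: fraction of the output bandwidth each app receives
-- when all apps compete for the port. Not part of the Intel10G API.
local function bandwidth_shares (priorities)
   local total = 0
   for _, priority in pairs(priorities) do
      total = total + priority
   end
   local shares = {}
   for name, priority in pairs(priorities) do
      shares[name] = priority / total
   end
   return shares
end

local shares = bandwidth_shares({a = 3.0, b = 1.0})
print(shares.a, shares.b)   --> 0.75    0.25
```

Calling the helper with {a = 1.5, b = 0.5} gives the same split, since only the ratio between the priorities matters.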

Note that even a low-priority app can use the whole line rate unless other (higher priority) apps are using up the available bandwidth.
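
As a sketch of how the keys above combine, the following tables describe three virtualized apps sharing one physical port. The PCI address, MAC addresses and VLAN numbers are placeholders; each table would be passed as the configuration argument of one Intel10G app.

```lua
-- Three VMDq pools on the same physical port (all values are placeholders).
local customer_a = {
   pciaddr = "0000:01:00.0",
   vmdq = true,                    -- needed to share the port
   macaddr = "52:54:00:00:00:01",  -- mandatory when vmdq is enabled
   vlan = 100,
   rate_limit = 1000,              -- Mbit/s, outgoing traffic only
   rxcounter = 1,                  -- statistics counters in the range 0-15
   txcounter = 1,
}
local customer_b = {
   pciaddr = "0000:01:00.0",
   vmdq = true,
   macaddr = "52:54:00:00:00:02",
   vlan = 200,
   priority = 3.0,                 -- three times the default weight of 1.0
}
-- A pool that receives copies of all packets entering the physical port.
local monitor = {
   pciaddr = "0000:01:00.0",
   vmdq = true,
   macaddr = "52:54:00:00:00:03",
   mirror = {port = "in"},
}
```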

Performance

The Intel10G app can transmit and receive at approximately 10 Mpps per processor core.

Hardware limits

Each physical Intel 82599 port supports the use of up to:

64 pools (virtualized Intel10G app instances)
127 MAC addresses (see the macaddr configuration option)
64 VLANs (see the vlan configuration option)
4 mirror pools (see the mirror configuration option)

LoadGen (apps.intel.loadgen)

LoadGen is a load generator app based on the Intel 82599 Ethernet controller. It reads up to 32,000 packets from the input port and transmits them repeatedly onto the network. All incoming packets are dropped.

Configuration

The LoadGen app accepts a string as its configuration argument. The given string denotes the PCI address of the NIC to use.

Performance

The LoadGen app can transmit at line-rate (14 Mpps) without significant CPU usage.

Intel i210 / i350 / 82599 Ethernet Controller apps (apps.intel_mp.intel_mp)

The intel_mp.Intel app provides drivers for Intel i210/i350/82599 based network cards. The driver exposes multiple receive and transmit queues that can be attached to separate instances of the app on different processes.

The links are named input and output.

Caveats

If attaching multiple processes to a single NIC, performance appears better with engine.busywait = false.

The intel_mp.Intel app can drive an Intel 82599 NIC at 14 million pps.

Configuration

— Key pciaddr

Required. The PCI address of the NIC as a string.

— Key ndesc

Optional. Number of DMA descriptors to use, i.e. the size of the DMA transmit and receive queues. Must be a multiple of 128. The default is not specified here but is assumed to be broadly applicable.

— Key rxq

Optional. The receive queue to attach to, numbered from 0.

— Key txq

Optional. The transmit queue to attach to, numbered from 0.

— Key rsskey

Optional. The rsskey is a 32 bit integer that seeds the hash used to distribute packets across queues. If the packet flow passes through multiple levels of RSS-enabled Snabb devices, using a different rsskey at each level will help packet distribution.

— Key wait_for_link

Optional. Boolean that indicates whether new should block until the link is up. The default is false.

— Key linkup_wait

Optional. Number of seconds new waits for the device to come up. The default is 120.

— Key mtu

Optional. The maximum packet length sent or received, excluding the trailing 4 byte CRC. The default is 9014.

— Key master_stats

Optional. Boolean indicating whether to elect an arbitrary app (the master) to collect device statistics. The default is true.

— Key run_stats

Optional. Boolean indicating whether this app instance should collect device statistics. At most one instance per physical NIC should do so (conflicts with master_stats). This incurs a small but detectable run time performance hit. The default is false.
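
Since each process attaches to its own pair of queues, the per-process configurations typically differ only in rxq and txq. A minimal sketch in plain Lua, using a hypothetical helper queue_config and a placeholder PCI address:

```lua
-- Hypothetical helper: build the intel_mp.Intel configuration for the
-- process that owns queue number `queue` (queues are numbered from 0).
local function queue_config (pciaddr, queue)
   return {
      pciaddr = pciaddr,
      rxq = queue,
      txq = queue,
      ndesc = 2048,          -- must be a multiple of 128
      wait_for_link = true,  -- block in new until the link is up
      linkup_wait = 30,      -- seconds new waits for the device to come up
   }
end

-- One configuration table per process, here for four queues.
local configs = {}
for queue = 0, 3 do
   configs[queue] = queue_config("0000:01:00.0", queue)
end
```

Each table would be passed as the configuration argument of an intel_mp.Intel app in the corresponding process. The queue numbers have to stay within the hardware limits of the chipset in use.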

RSS hashing methods

RSS will distribute packets based on as many of the fields below as are present in the packet:

Source / Destination IP address
Source / Destination TCP ports
Source / Destination UDP ports

Default RSS Queue

Packets that are not IPv4 or IPv6 will be delivered to receive queue 0.

Hardware limits

Each chipset supports a differing number of receive / transmit queues:

Intel82599 supports 16 receive and 16 transmit queues, 0-15
Intel1g i350 supports 8 receive and 8 transmit queues, 0-7
Intel1g i210 supports 4 receive and 4 transmit queues, 0-3

Solarflare Ethernet Controller Apps

Solarflare (apps.solarflare.solarflare)

The Solarflare app drives one port of a Solarflare SFN7 Ethernet controller. Multiple instances of the Solarflare app can be instantiated on the same PCI device. Packets received from the network will be dispatched between apps based on destination MAC address and VLAN. Packets taken from the rx port are transmitted onto the network. Packets received from the network are put on the tx port.

The Solarflare app requires OpenOnload version 201502 to be installed and the sfc module to be loaded.

Configuration

The Solarflare app accepts a table as its configuration argument. The following keys are defined:

— Key pciaddr

Required. The PCI address of the NIC as a string.

— Key macaddr

Optional. The MAC address to use as a string. The default is a wild-card (i.e. accept all packets).

— Key vlan

Optional. A twelve bit integer (0-4095). If set, incoming packets from other VLANs are dropped and outgoing packets are tagged with a VLAN header.

RateLimiter App (apps.rate_limiter.rate_limiter)

The RateLimiter app implements a token bucket algorithm with a single bucket and drops non-conforming packets. It receives packets on the input port and transmits conforming packets to the output port.

— Method RateLimiter:snapshot

Returns throughput statistics in the form of a table with the following fields:

rx - Number of packets received
tx - Number of packets transmitted
time - Current time in nanoseconds

Configuration

The RateLimiter app accepts a table as its configuration argument. The following keys are defined:

— Key rate

Required. Rate in bytes per second to which throughput should be limited.

— Key bucket_capacity

Required. Bucket capacity in bytes. Should be equal to or greater than rate. Otherwise the effective rate may be limited.

— Key initial_capacity

Optional. Initial bucket capacity in bytes. Defaults to bucket_capacity.
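
To illustrate how the three keys interact, here is a minimal token bucket sketch in plain Lua. It is not the RateLimiter source code: the TokenBucket table and its methods are hypothetical, and time is passed in explicitly to keep the example self-contained.

```lua
-- Minimal token bucket sketch (hypothetical, not the RateLimiter source).
local TokenBucket = {}
TokenBucket.__index = TokenBucket

function TokenBucket.new (conf)
   return setmetatable({
      rate = conf.rate,                  -- bytes per second
      capacity = conf.bucket_capacity,   -- bytes
      tokens = conf.initial_capacity or conf.bucket_capacity,
   }, TokenBucket)
end

-- Add tokens for the elapsed time, never exceeding the bucket capacity.
function TokenBucket:refill (elapsed_seconds)
   self.tokens = math.min(self.capacity,
                          self.tokens + self.rate * elapsed_seconds)
end

-- Consume tokens and return true if a packet of `length` bytes conforms.
function TokenBucket:conforms (length)
   if length <= self.tokens then
      self.tokens = self.tokens - length
      return true
   end
   return false
end

local bucket = TokenBucket.new({rate = 1000, bucket_capacity = 1500})
print(bucket:conforms(1000))   --> true   (1500 -> 500 tokens)
print(bucket:conforms(1000))   --> false  (only 500 tokens left)
bucket:refill(0.5)             -- half a second adds 500 tokens
print(bucket:conforms(1000))   --> true
```

The sketch also shows why bucket_capacity should not be smaller than rate: refill caps the token count at the capacity, so a small bucket cannot accumulate a full second's worth of tokens.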

Performance

The RateLimiter app is able to process more than 20 Mpps per CPU core. Refer to its selftest for details.

PcapFilter App (apps.packet_filter.pcap_filter)

The PcapFilter app receives packets on the input port and transmits conforming packets to the output port. In order to conform, a packet must match the pcap-filter expression of the PcapFilter instance and/or belong to a sanctioned connection. For a connection to be sanctioned it must be tracked in a state table by a PcapFilter app using the same state table. All PcapFilter apps share a global namespace of state table identifiers. Multiple PcapFilter apps (e.g. for inbound and outbound traffic) can refer to the same connection by sharing a state table identifier.

Configuration

The PcapFilter app accepts a table as its configuration argument. The following keys are available:

— Key filter

Required. A string containing a pcap-filter expression.

— Key state_table

Optional. A string naming a state table. If set, packets passing any rule will be tracked in the specified state table and any packet that belongs to a tracked connection in the specified state table will be allowed to pass.
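
A typical use of a shared state table is a pair of filters for the two traffic directions. The following sketch assumes that the module exports the app as PcapFilter and uses the config API from core.config; the app names, filter expressions and the table name "web" are arbitrary. It only runs inside Snabb.

```lua
local config = require("core.config")
local PcapFilter = require("apps.packet_filter.pcap_filter").PcapFilter

local c = config.new()
-- Outbound: let web requests pass and track them in the "web" state table.
config.app(c, "outbound", PcapFilter,
           {filter = "tcp dst port 80", state_table = "web"})
-- Inbound: only ICMP matches the filter itself, but packets that belong to
-- a connection tracked in "web" are allowed to pass as well.
config.app(c, "inbound", PcapFilter,
           {filter = "icmp", state_table = "web"})
```

The input and output ports of both apps still have to be linked to the rest of the app network.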

Special Counters

— Key sessions_established

Total number of sessions established.

IPv6 Apps

Nd_light (apps.ipv6.nd_light)

The nd_light app implements a small subset of IPv6 neighbor discovery (RFC4861). It has two duplex ports, north and south. The south port attaches to a network on which neighbor discovery (ND) must be performed. The north port attaches to an app that processes IPv6 packets (including full ethernet frames). Packets transmitted to the north port must be wrapped in full Ethernet frames (which may be empty).

The nd_light app replies to neighbor solicitations for which it is configured as a target and performs rudimentary address resolution for its configured next-hop address. If address resolution succeeds, the Ethernet headers of packets from the north port will be overwritten with headers containing the discovered destination address and the configured source address before they are transmitted over the south port. All packets from the north port are discarded as long as ND has not yet succeeded. Packets received from the south port are transmitted to the north port unaltered.

Configuration

The nd_light app accepts a table as its configuration argument. The following keys are defined:

— Key local_mac

Required. Local MAC address as a string or in binary representation.

— Key local_ip

Required. Local IPv6 address as a string or in binary representation.

— Key next_hop

Required. IPv6 address of next hop as a string or in binary representation.

— Key delay

Optional. Neighbor solicitation retransmission delay in milliseconds. Default is 1,000 ms.

— Key retrans

Optional. Number of neighbor solicitation retransmissions. Default is unlimited retransmissions.

Special Counters

— Key ns_checksum_errors

Neighbor solicitation requests dropped due to invalid ICMP checksum.

— Key ns_target_address_errors

Neighbor solicitation requests dropped due to invalid target address.

— Key na_duplicate_errors

Neighbor advertisement requests dropped because next-hop is already resolved.

— Key na_target_address_errors

Neighbor advertisement requests dropped due to invalid target address.

— Key nd_protocol_errors

Neighbor discovery requests dropped due to protocol errors (invalid IPv6 hop-limit or invalid neighbor solicitation request options).

SimpleKeyedTunnel (apps.keyed_ipv6_tunnel.tunnel)

The SimpleKeyedTunnel app implements “a simple L2 Ethernet over IPv6 tunnel encapsulation” as described in Keyed IPv6 Tunnel. It has two duplex ports, encapsulated and decapsulated. Packets transmitted on the decapsulated input port will be encapsulated and put on the encapsulated output port. Packets transmitted on the encapsulated input port will be decapsulated and put on the decapsulated output port.

Configuration

The SimpleKeyedTunnel app accepts a table as its configuration argument. The following keys are defined:

— Key local_address

Required. Local IPv6 address as a string.

— Key remote_address

Required. Remote IPv6 address as a string.

— Key local_cookie

Required. Local cookie, 8 bytes encoded in a hexadecimal string.

— Key remote_cookie

Required. Remote cookie, 8 bytes encoded in a hexadecimal string.

— Key local_session

Optional. Unsigned integer, 32 bit. If set, the session_id field of the L2TPv3 header will be overwritten with this value.

— Key hop_limit

Optional. Unsigned integer. Sets the hop limit. Default is 64.

— Key default_gateway_MAC

Optional. Destination MAC as a string. Not required if overwritten by an app such as nd_light.
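
The tunnel is commonly paired with nd_light so that the destination MAC address is resolved automatically. The following sketch assumes that the modules export the apps as SimpleKeyedTunnel and nd_light and uses the config API from core.config; all addresses and cookies are placeholders. It only runs inside Snabb.

```lua
local config = require("core.config")
local nd_light = require("apps.ipv6.nd_light").nd_light
local SimpleKeyedTunnel =
   require("apps.keyed_ipv6_tunnel.tunnel").SimpleKeyedTunnel

local c = config.new()
config.app(c, "tunnel", SimpleKeyedTunnel, {
   local_address  = "2001:db8::1",
   remote_address = "2001:db8::2",
   local_cookie   = "0011223344556677",   -- 8 bytes as 16 hex digits
   remote_cookie  = "8899aabbccddeeff",
   hop_limit      = 64,
})
config.app(c, "nd", nd_light, {
   local_mac = "02:00:00:00:00:01",
   local_ip  = "2001:db8::1",
   next_hop  = "2001:db8::2",
   delay     = 1000,   -- milliseconds between solicitations
})
-- Duplex connection between the encapsulated side and nd_light's north port.
config.link(c, "tunnel.encapsulated -> nd.north")
config.link(c, "nd.north -> tunnel.encapsulated")
```

The decapsulated port of the tunnel and the south port of nd_light are left unconnected here; they would be linked to the local Ethernet segment and to the network-facing app respectively.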

Special Counters

— Key length_errors

Ingress packets dropped due to invalid length (packet too short).

— Key protocol_errors

Ingress packets dropped due to unrecognized IPv6 protocol ID.

— Key cookie_errors

Ingress packets dropped due to wrong cookie value.

— Key remote_address_errors

Ingress packets dropped due to wrong remote IPv6 endpoint address.

— Key local_address_errors

Ingress packets dropped due to wrong local IPv6 endpoint address.

VhostUser App (apps.vhost.vhost_user)

The VhostUser app implements portions of the Virtio protocol for virtual ethernet I/O interfaces. In particular, VhostUser supports the virtio vring data structure for packet I/O in shared memory (DMA) and the Linux vhost API for creating vrings attached to tuntap devices.

With VhostUser, SnabbSwitch can be used as a virtual ethernet interface by QEMU virtual machines. When connected via a UNIX socket, packets can be sent to the virtual machine by transmitting them on the rx port and packets sent by the virtual machine will arrive on the tx port.

Configuration

The VhostUser app accepts a table as its configuration argument. The following keys are defined:

— Key socket_path

Optional. A string denoting the path to the UNIX socket to connect on. If it is not given, all incoming packets will be dropped.

— Key is_server

Optional. Listen and accept an incoming connection on socket_path instead of connecting to it.

VirtioNet App (apps.virtio_net.virtio_net)

The VirtioNet app implements a subset of the driver part of the virtio-net specification. It can connect to a virtio-net device from within a QEMU virtual machine. Packets can be sent out of the virtual machine by transmitting them on the rx port, and packets sent to the virtual machine will arrive on the tx port.

Configuration

The VirtioNet app accepts a table as its configuration argument. The following keys are defined:

— Key pciaddr

Required. The PCI address of the virtio-net device.

— Key use_checksum

Optional. Boolean value to enable the checksum offloading pre-calculations applied on IPv4/IPv6 TCP and UDP packets.

PcapReader and PcapWriter Apps (apps.pcap.pcap)

The PcapReader and PcapWriter apps can be used to inject and log raw packet data into and out of the app network using the Libpcap File Format. PcapReader reads raw packets from a PCAP file and transmits them on its output port while PcapWriter writes packets received on its input port to a PCAP file.

Configuration

Both PcapReader and PcapWriter expect a filename string as their configuration arguments to read from and write to respectively. PcapWriter will alternatively accept an array as its configuration argument, with the first element being the filename and the second element being a mode argument to io.open.
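
As a sketch, the two apps can be connected directly to copy one capture file into another. The file names are placeholders and the config and engine APIs from core.config and core.app are assumed; the example only runs inside Snabb.

```lua
local config = require("core.config")
local engine = require("core.app")
local pcap = require("apps.pcap.pcap")

local c = config.new()
config.app(c, "reader", pcap.PcapReader, "input.pcap")
-- PcapWriter alternatively accepts {filename, mode}, where mode is
-- a mode argument to io.open.
config.app(c, "writer", pcap.PcapWriter, "output.pcap")
config.link(c, "reader.output -> writer.input")

engine.configure(c)
engine.main({duration = 1})
```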

RawSocket App (apps.socket.raw)

The RawSocket app is a bridge between Linux network interfaces (eth0, lo, etc.) and a Snabb app network. Packets taken from the rx port are transmitted over the selected interface. Packets received on the interface are put on the tx port.

Configuration

The RawSocket app accepts a string as its configuration argument. The string denotes the interface to bridge to.

UnixSocket App (apps.socket.unix)

The UnixSocket app provides I/O for a named Unix socket.

Configuration

The UnixSocket app takes a string argument which denotes the Unix socket file name to open, or a table with the fields:

filename - the Unix socket file name to open.
listen - if true, listen for incoming connections on the socket rather than connecting to the socket in client mode.
mode - can be “stream” or “packet” (the default is “stream”): the difference is that in packet mode, the packets are not split or merged (in both modes packets arrive in order).

NOTE: The socket is not opened until the first call to push() or pull(). If connection is lost, the socket will be re-opened on the next call to push() or pull().

Tap app (apps.tap.tap)

The Tap app is used to interact with a Linux tap device. Packets transmitted on the input port will be sent over the tap device, and packets that arrive on the tap device can be received on the output port.

Configuration

The Tap app accepts a string that identifies an existing tap interface.

The Tap device can be configured using standard Linux tools:

ip tuntap add Tap345 mode tap
ip link set up dev Tap345
ip link set address 02:01:02:03:04:08 dev Tap345

VLAN Apps

There are three VLAN related apps, Tagger, Untagger and VlanMux. The Tagger and Untagger apps add or remove a VLAN tag whereas the VlanMux app can multiplex and demultiplex packets to different output ports based on tag.

Tagger (apps.vlan.vlan)

The Tagger app adds a VLAN tag, with the configured value, to packets received on its input port and transmits them on its output port.

Configuration

— Key tag

Required. VLAN tag to add or remove from the packet.

Untagger (apps.vlan.vlan)

The Untagger app checks packets received on its input port for a VLAN tag, removes it if it matches with the configured VLAN tag and transmits them on its output port. Packets with other VLAN tags than the configured tag will be dropped.

Configuration

— Key tag

Required. VLAN tag to add or remove from the packet.

VlanMux (apps.vlan.vlan)

Despite its name, the VlanMux app acts both as a multiplexer and as a demultiplexer. As a multiplexer, it receives packets from multiple input ports, adds a VLAN tag and transmits them on a single output port. As a demultiplexer, it receives packets on its trunk port and distributes them over multiple output ports based on the VLAN tag of each received packet.

Packets received on its trunk input port with Ethernet type 0x8100 are inspected for the VLAN tag and transmitted on an output port vlanX where X is the VLAN tag parsed from the packet. If no such output port exists the packet is dropped. Received packets with an Ethernet type other than 0x8100 are transmitted on its native output port.

Packets received on its native input port are transmitted verbatim on its trunk output port.

Packets received on input ports named vlanX, where X is a VLAN tag, will have the VLAN tag X added and then be transmitted on its trunk output port.

There is no configuration for the VlanMux app; simply connect it to your other apps and it will base its actions on the names of the ports.
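
The port naming scheme can be sketched with a small test network in which tagged packets are demultiplexed by VlanMux. The app names and the tag value are arbitrary, the config API from core.config is assumed, and the example only runs inside Snabb.

```lua
local config = require("core.config")
local basic_apps = require("apps.basic.basic_apps")
local vlan = require("apps.vlan.vlan")

local c = config.new()
config.app(c, "source", basic_apps.Source)
config.app(c, "tagger", vlan.Tagger, {tag = 100})
config.app(c, "mux", vlan.VlanMux)
config.app(c, "vlan100_sink", basic_apps.Sink)
config.app(c, "native_sink", basic_apps.Sink)

-- Tagged packets enter the VlanMux app through its trunk port...
config.link(c, "source.output -> tagger.input")
config.link(c, "tagger.output -> mux.trunk")
-- ...and leave on the output port named after their VLAN tag.
config.link(c, "mux.vlan100 -> vlan100_sink.input")
-- Packets without a VLAN tag would leave on the native port instead.
config.link(c, "mux.native -> native_sink.input")
```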

Bridge Apps

A bridge app implements a basic Ethernet bridge with split-horizon semantics. It has an arbitrary number of ports. For each input port there must exist an output port with the same name. Each port name is a member of at most one split-horizon group. If it is not a member of a split-horizon group, the port is also called a free port. Packets arriving on a free input port may be forwarded to all other output ports. Packets arriving on an input port that belongs to a split-horizon group are never forwarded to any output port belonging to the same split-horizon group. There are two bridge implementations available: apps.bridge.flooding and apps.bridge.learning.

Configuration

A bridge app accepts a table as its configuration argument. The following keys are defined:

— Key ports

Optional. An array of free port names. The default is no free ports.

— Key split_horizon_groups

Optional. A table mapping split-horizon groups to arrays of port names. The default is no split-horizon groups.

— Key config

Optional. The configuration of the actual bridge implementation.

Flooding bridge (apps.bridge.flooding)

The flooding bridge app implements the simplest possible bridge, which floods a packet arriving on an input port to all output ports within its scope according to the split-horizon topology.

Configuration

The flooding bridge app ignores the config key of its configuration.

Learning bridge (apps.bridge.learning)

The learning bridge app implements a learning bridge using a custom hash table to store the set of MAC source addresses of packets arriving on each input port. When a packet is received it is forwarded to all output ports whose corresponding input ports match the packet’s destination MAC address. When no input port matches, the packet is flooded to all output ports. Multicast MAC addresses are always flooded to all output ports associated with the input port. The scoping rules according to the split-horizon topology apply unchanged.

Configuration

The learning bridge app accepts a table as the value of the config key of its configuration. The following keys are defined:

— Key mac_table

Optional. This is a table that defines the characteristics of the MAC table. The following keys are defined:

— Key size

Optional. The number of MAC addresses to be stored in the table. Default is 256. The size of the table is increased automatically if this limit is reached or if an overflow in one of the hash buckets occurs. This value is capped by resize_max.

— Key timeout

Optional. Timeout for learned MAC addresses in seconds. Default is 60.

— Key verbose

Optional. A boolean value. If true, statistics about table usage are logged during each timeout interval. Default is false.

— Key copy_on_resize

Optional. A boolean value. If true, the contents of the table are copied to the newly allocated table after a resize operation. Default is true.

— Key resize_max

Optional. An upper bound for the size of the table. Default is 65536.
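
Putting the generic bridge keys and the learning bridge keys together, a configuration table could look like the following sketch. The port and group names are placeholders chosen for the example.

```lua
-- Configuration for a learning bridge with one free port and one
-- split-horizon group (all names are placeholders).
local bridge_config = {
   ports = {"uplink"},                         -- free ports
   split_horizon_groups = {
      customers = {"customer1", "customer2"},  -- never forwarded to each other
   },
   config = {                                  -- ignored by the flooding bridge
      mac_table = {
         size = 1024,         -- initial number of MAC addresses
         timeout = 120,       -- seconds until a learned address expires
         resize_max = 16384,  -- upper bound for automatic resizing
      },
   },
}
```

With this table, packets arriving on customer1 can reach uplink but never customer2, while packets arriving on uplink may be forwarded to both customer ports.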

IPsec Apps

AES128gcm (apps.ipsec.esp)

The AES128gcm app implements ESP in transport mode using the AES-GCM-128 cipher. It encrypts packets received on its decapsulated port and transmits them on its encapsulated port, and vice-versa. Packets arriving on the decapsulated port must have an IPv6 header, and packets arriving on the encapsulated port must have an IPv6 header followed by an ESP header, otherwise they will be discarded.

References:

lib.ipsec.esp

Configuration

The AES128gcm app accepts a table as its configuration argument. The following keys are defined:

— Key spi

Required. A 32 bit integer denoting the “Security Parameters Index” as specified in RFC 4303.

— Key transmit_key

Required. Hexadecimal string of 32 digits (two digits for each byte) that denotes a 128-bit AES key as specified in RFC 4106 used for the encryption of outgoing packets.

— Key transmit_salt

Required. Hexadecimal string of eight digits (two digits for each byte) that denotes four bytes of salt as specified in RFC 4106 used for the encryption of outgoing packets.

— Key receive_key

Required. Hexadecimal string of 32 digits (two digits for each byte) that denotes a 128-bit AES key as specified in RFC 4106 used for the decryption of incoming packets.

— Key receive_salt

Required. Hexadecimal string of eight digits (two digits for each byte) that denotes four bytes of salt as specified in RFC 4106 used for the decryption of incoming packets.

— Key receive_window

Optional. Minimum width of the window in which out of order packets are accepted as specified in RFC 4303. The default is 128.

— Key resync_threshold

Optional. Number of consecutive packets allowed to fail decapsulation before attempting “Re-synchronization” as specified in RFC 4303. The default is 1024.

— Key resync_attempts

Optional. Number of attempts to re-synchronize a packet that triggered “Re-synchronization” as specified in RFC 4303. The default is 8.

— Key auditing

Optional. A boolean value indicating whether to enable or disable “Auditing” as specified in RFC 4303. The default is nil (no auditing).
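
The key and salt formats are easy to get wrong, so it can help to check them before handing the table to the app. The following plain Lua sketch uses made-up example values and a hypothetical helper is_hex; it is not part of the AES128gcm API.

```lua
-- Example values only; do not use published keys in a real deployment.
local esp_config = {
   spi = 0x100,
   transmit_key  = "00112233445566778899aabbccddeeff",  -- 32 hex digits
   transmit_salt = "00112233",                          -- 8 hex digits
   receive_key   = "ffeeddccbbaa99887766554433221100",
   receive_salt  = "33221100",
   receive_window = 128,
}

-- Hypothetical check: `s` consists of exactly `digits` hexadecimal digits.
local function is_hex (s, digits)
   return #s == digits and s:match("^%x+$") ~= nil
end

assert(is_hex(esp_config.transmit_key, 32), "transmit_key: need 32 hex digits")
assert(is_hex(esp_config.transmit_salt, 8), "transmit_salt: need 8 hex digits")
assert(is_hex(esp_config.receive_key, 32), "receive_key: need 32 hex digits")
assert(is_hex(esp_config.receive_salt, 8), "receive_salt: need 8 hex digits")
```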

Test Apps

Match (apps.test.match)

The Match app compares packets received on its input port rx with those received on the reference input port comparator, and reports mismatches as well as packets from comparator that were not matched.

— Method Match:errors

Returns the recorded errors as an array of strings.

Configuration

The Match app accepts a table as its configuration argument. The following keys are defined:

— Key fuzzy

Optional. If this key is true packets from rx that do not match the next packet from comparator are ignored. The default is false.

— Key modest

Optional. If this key is true unmatched packets from comparator are ignored if at least one packet from rx was successfully matched. The default is false.

Synth (apps.test.synth)

The Synth app generates synthetic packets with Ethernet headers and alternating payload sizes. On each breath it fills each attached output link with new packets.

Configuration

The Synth app accepts a table as its configuration argument. The following keys are defined:

— Key src

— Key dst

Source and destination MAC addresses in human readable form. The default is "00:00:00:00:00:00".

— Key sizes

An array of numbers designating the packet payload sizes. The default is {64}.

SnabbWall Apps

L7Spy (apps.wall.l7spy)

The L7Spy app is a Snabb app that scans packets passing through it using an instance of the Scanner class. The scanner instance may be shared among several L7Spy instances or with a L7Fw app for filtering.

— Method L7Spy:new config

Construct a new L7Spy app instance based on a given configuration table. The table may contain the following key:

scanner (optional): Either a string identifying the kind of scanner to construct (currently only "ndpi" is accepted) or an existing scanner instance.

Filter (apps.wall.filter)

The L7Fw app implements a stateful firewall by querying the scanner state collected by a L7Spy app. It then filters packets based on a given set of rules.

— Method L7Fw:new config

Construct a new L7Fw app instance based on a given configuration table. The table may contain the following keys:

scanner: A Scanner instance shared with an L7Spy instance. The metadata in this scanner is used for packet filtering.
rules: A table mapping protocol names (as strings) to firewall actions. The accepted actions are "accept", "reject", "drop", or a pfmatch expression. The pfmatch expression may use the variable flow_count (as an arithmetic expression) to refer to the number of packets in a given protocol flow, and may call the accept, reject, or drop methods.
local_ipv4 (optional): An IPv4 address that identifies the host running the firewall. This is used as the source address in ICMPv4 or TCP reject responses.
local_ipv6 (optional): An IPv6 address that identifies the host running the firewall. This is used as the source address in ICMPv6 or TCP reject responses.
local_macaddr (optional): A MAC address that identifies the host running the firewall. This is used for the source address in ethernet frames for reject responses.
logging (optional): A log level parameter that can be set to “on” or “off”. When set to “on”, it will report dropped/rejected packets to the system log.

Scanner (apps.wall.scanner)

Scanner objects are responsible for:

Identifying traffic flows.
Analyzing the contents of packets to determine which application they belong to.
Keeping enough state to be able to enumerate the identified traffic flows, and identify the application they belong to.

The class is not meant to be instantiated directly, but to be used as the basis for concrete implementations (e.g. NdpiScanner). It provides one function that subclasses can use:

— Method Scanner:extract_packet_info packet

Extracts fields from the headers of an IPv4 or IPv6 packet. The returned values are:

A key object (more on this below) which uniquely identifies a traffic flow.
The offset to the packet payload content.
The source_address of the packet, as an array of bytes.
The source_port, for UDP and TCP packets.
The destination_address of the packet, as an array of bytes.
The destination_port, for UDP and TCP packets.

Key objects contain some of the returned information in a compact FFI representation, and can be used as an aid to uniquely identify a flow of packets. They provide the following attributes:

:eth_type(): Method which returns the type of the Ethernet frame payload, either ETH_TYPE_IPv4 or ETH_TYPE_IPv6.
:hash(): Method which returns an integer calculated by hashing all the other values in the key object.
.vlan_id: VLAN identifier. Zero for no VLAN tags.
.ip_proto: The IP protocol.
.lo_addr and .hi_addr: IP addresses (either v4 or v6).
.lo_port and .hi_port: For TCP and UDP, the ports as big-endian (network) integers.

This method can be very useful to implement scanners using backends which do not implement their own flow classification.

Subclassing

All the Scanner implementations conform to the Scanner base API.

— Method Scanner:scan_packet packet, time

Scans a packet.

The time parameter is used to know at which time (in seconds from the Epoch) packet has been received for processing. A suitable value can be obtained using engine.now().

— Method Scanner:get_flow packet

Obtains the traffic flow for a given packet. If the packet is determined to not match any of the detected flows, nil is returned. The returned flow object has at least the following fields:

protocol: The L7 protocol for the flow. A user-visible string can be obtained by passing this value to Scanner:protocol_name().
packets: Number of packets scanned which belong to the traffic flow.
last_seen: Last time (in seconds from the Epoch) at which a packet belonging to the flow has been scanned.

— Method Scanner:flows

Returns an iterator over all the traffic flows detected by the scanner. The returned value is suitable to be used in a for-loop:

for flow in my_scanner:flows() do
   -- Do something with "flow".
end

— Method Scanner:protocol_name protocol

Given a protocol identifier, returns a user-friendly name as a string. Typically the protocol is obtained from flow objects returned by Scanner:get_flow().
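
The methods above combine naturally into a small reporting routine. The following is a sketch in plain Lua: the helper describe_flows is hypothetical, and it assumes that the flows yielded by Scanner:flows() carry the same protocol, packets and last_seen fields as those returned by Scanner:get_flow().

```lua
-- Hypothetical helper: describe every flow known to `scanner` as a string.
local function describe_flows (scanner)
   local lines = {}
   for flow in scanner:flows() do
      lines[#lines + 1] = ("%s: packets=%s last_seen=%s"):format(
         scanner:protocol_name(flow.protocol), flow.packets, flow.last_seen)
   end
   return lines
end
```

Any object that conforms to the Scanner base API can be passed in, for example the scanner instance shared between an L7Spy and an L7Fw app.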

NdpiScanner (apps.wall.scanner.ndpi)

NdpiScanner uses the nDPI library (via the ljndpi FFI binding) to scan packets and determine L7 traffic flows. The nDPI library (libndpi.so) must be available in the host system. Versions 1.7 and 1.8 are supported.

— Method NdpiScanner:new ticks_per_second

Creates a new scanner, with a ticks_per_second resolution.

Utilities

The apps.wall.util module contains miscellaneous utilities.

— Function util.ipv4_addr_cmp a, b

Compares two IPv4 addresses a and b. The returned value follows the same convention as for C.memcmp(): zero if both addresses are equal, or an integer value with the same sign as the sign of the difference between the first pair of bytes that differ in a and b.

— Function util.ipv6_addr_cmp a, b

Compares two IPv6 addresses a and b. The returned value follows the same convention as for C.memcmp(): zero if both addresses are equal, or an integer value with the same sign as the sign of the difference between the first pair of bytes that differ in a and b.
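The memcmp()-style return convention can be illustrated with a small byte-wise comparison. This is only a sketch of the convention: bytes_cmp is a hypothetical helper that works on Lua strings, whereas the util functions operate on binary addresses.

```lua
-- Compare the first n bytes of strings a and b, memcmp()-style.
local function bytes_cmp(a, b, n)
   for i = 1, n do
      local diff = a:byte(i) - b:byte(i)
      -- The sign of the first non-zero difference decides the result.
      if diff ~= 0 then return diff end
   end
   return 0
end

local addr1 = string.char(192, 168, 0, 1)
local addr2 = string.char(192, 168, 1, 1)
print(bytes_cmp(addr1, addr1, 4) == 0, bytes_cmp(addr1, addr2, 4) < 0)
```

A result of this kind can be used directly to keep addresses in sorted order.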

SouthAndNorth (apps.wall.util)

The SouthAndNorth application is not meant to be used directly, but rather as a building block for more complex applications which need two duplex ports (south and north) which forward packets between them, optionally doing some intermediate processing.

Packets arriving at the north port are passed to the :on_southbound_packet() method (which can be overridden in a subclass) and forwarded to the south port. Conversely, packets arriving at the south port are passed to the :on_northbound_packet() method and forwarded to the north port.

SouthAndNorth

The value returned by :on_southbound_packet() and :on_northbound_packet() determines what will be done to the packet being processed:

Returning false discards the packet: the packet will not be forwarded, and packet.free() will be called on it.
Returning a different packet replaces the packet: the packet originally being processed is discarded, packet.free() called on it, and the returned packet is forwarded.
Returning the same packet being handled will forward it. Returning nil achieves the same effect.
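The three cases above can be modelled as a small decision function. This is a sketch of the rules only, not the SouthAndNorth code: apply_result is a hypothetical helper, and packets are represented by plain Lua tables.

```lua
-- Hypothetical model of the decision made after calling a handler.
-- Returns the packet to forward (or nil) and the packet to free (or nil).
local function apply_result(pkt, result)
   if result == false then
      return nil, pkt        -- discard: nothing is forwarded, pkt is freed
   elseif result == nil or result == pkt then
      return pkt, nil        -- forward the packet unchanged
   else
      return result, pkt     -- replace: forward result, free the original
   end
end

local original, replacement = { length = 64 }, { length = 32 }
local to_forward, to_free = apply_result(original, replacement)
print(to_forward == replacement, to_free == original)
```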

Example

The following snippet defines an application derived from SouthAndNorth which silently discards packets bigger than a certain size, and keeps a count of how many packets have been discarded and forwarded:


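-- Setting SouthAndNorth as metatable "inherits" from it.
DiscardBigPackets = setmetatable({},
   require("apps.wall.util").SouthAndNorth)
-- Instances created by :new() look up their methods in DiscardBigPackets.
DiscardBigPackets.__index = DiscardBigPackets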

function DiscardBigPackets:new (max_length)
   return setmetatable({
      max_packet_length = max_length,
      discarded_packets = 0,
      forwarded_packets = 0,
   }, self)
end

function DiscardBigPackets:on_northbound_packet (pkt)
   if pkt.length > self.max_packet_length then
      self.discarded_packets = self.discarded_packets + 1
      return false
   end
   self.forwarded_packets = self.forwarded_packets + 1
end

-- Apply the same logic for packets in the other direction.
DiscardBigPackets.on_southbound_packet =
   DiscardBigPackets.on_northbound_packet

Libraries

IP checksum (lib.checksum)

The checksum module provides an optimized ones-complement checksum routine.

— Function ipsum pointer, length, initial

Return the ones-complement checksum for the given region of memory.

pointer is a pointer to an array of data to be checksummed and length is the size of that data in bytes. initial is an unsigned 16-bit number in host byte order which is used as the starting value of the accumulator. The result is the IP checksum over the data in host byte order.

The initial argument can be used to verify a checksum or to calculate the checksum in an incremental manner over chunks of memory. The synopsis to check whether the checksum over a block of data is equal to a given value is the following:

if ipsum(pointer, length, value) == 0 then
  -- checksum correct
else
  -- checksum incorrect
end

To chain the calculation of checksums over multiple blocks of data together to obtain the overall checksum, one needs to pass the one’s complement of the checksum of one block as initial value to the call of ipsum() for the following block, e.g.

local sum1 = ipsum(data1, length1, 0)
local total_sum = ipsum(data2, length2, bit.bnot(sum1))

This function takes advantage of SIMD hardware when available.
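The verification and chaining idioms can be checked against a plain Lua model of the ones-complement sum. This is a sketch under stated assumptions, not the optimized routine: checksum is a hypothetical stand-in that takes a Lua string instead of a pointer and length, combines bytes as big-endian 16-bit words, and (for chaining) assumes every block except the last has an even length.

```lua
-- Reference model of a ones-complement checksum over a Lua string.
-- `initial` seeds the accumulator, like the third argument of ipsum().
local function checksum(data, initial)
   local sum = initial or 0
   for i = 1, #data, 2 do
      local hi, lo = data:byte(i), data:byte(i + 1) or 0
      sum = sum + hi * 256 + lo
   end
   -- Fold the carries back into the low 16 bits.
   while sum > 0xFFFF do
      sum = sum % 0x10000 + math.floor(sum / 0x10000)
   end
   return 0xFFFF - sum   -- 16-bit ones-complement of the folded sum
end

local block1 = string.char(0x45, 0x00, 0x00, 0x1c)
local block2 = string.char(0xc0, 0xa8, 0x00, 0x01)

-- Verification: seeding with the expected checksum yields zero.
local expected = checksum(block1 .. block2, 0)
print(checksum(block1 .. block2, expected) == 0)

-- Chaining: seed the second call with the complement of the first result.
local sum1 = checksum(block1, 0)
print(checksum(block2, 0xFFFF - sum1) == expected)
```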

Ctable (lib.ctable)

A ctable is a hash table whose keys and values are instances of FFI data types. In Lua parlance, an FFI value is a “cdata” value, hence the name “ctable”.

A ctable is parameterized for the specific types for its keys and values. This allows for the table to be stored in an efficient manner. Adding an entry to a ctable will copy the value into the table. Logically, the table “owns” the value. Lookup can either return a pointer to the value in the table, or copy the value into a user-supplied buffer, depending on what is most convenient for the user.

As an implementation detail, the table is stored as an open-addressed robin-hood hash table with linear probing. This means that to look up a key in the table, we take its hash value (using a user-supplied hash function), map that hash value to an index into the table by scaling the hash to the table size, and then scan forward in the table until we find an entry whose hash value is greater than or equal to the hash in question. Each entry stores its hash value, and empty entries have a hash of 0xFFFFFFFF. If the entry’s hash matches and the entry’s key is equal to the one we are looking for, then we have our match. If the entry’s hash is greater than our hash, then we have a failure. Hash collisions are possible as well of course; in that case we continue scanning forward.

The distance travelled while scanning for the matching hash is known as the displacement. The table measures its maximum displacement, for a number of purposes, but you might be interested to know that a maximum displacement for a table with 2 million entries and a 40% load factor is around 8 or 9. Smaller tables will have smaller maximum displacements.
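The lookup procedure and the notion of displacement can be sketched with a toy table in plain Lua. This is a simplified model, not the lib.ctable implementation: build, lookup and home_index are hypothetical names, entries are Lua tables rather than FFI structs, and the table is laid out in one pass by sorting on the hash instead of inserting entries one at a time.

```lua
local EMPTY = 0xFFFFFFFF   -- hash value stored in empty slots

-- Scale a hash in [0, 0xFFFFFFFF) to a home index in [0, size).
local function home_index(hash, size)
   return math.floor(hash * size / 0x100000000)
end

-- Lay out entries in hash order, each one at or after its home index.
local function build(entries, size)
   table.sort(entries, function (a, b) return a.hash < b.hash end)
   local slots, next_free = {}, 0
   for _, entry in ipairs(entries) do
      local index = math.max(home_index(entry.hash, size), next_free)
      slots[index] = entry
      next_free = index + 1
   end
   -- Fill the gaps, plus one trailing slot, with empty entries.
   for index = 0, math.max(size, next_free) do
      slots[index] = slots[index] or { hash = EMPTY }
   end
   return { slots = slots, size = size }
end

-- Return the matching entry and its displacement, or nil if not found.
local function lookup(tab, hash, key)
   local home = home_index(hash, tab.size)
   local index = home
   -- Skip entries whose hash is smaller than the one we want.
   while tab.slots[index].hash < hash do index = index + 1 end
   while tab.slots[index].hash == hash do
      if tab.slots[index].key == key then
         return tab.slots[index], index - home
      end
      index = index + 1   -- hash collision: keep scanning
   end
   return nil             -- reached a greater hash or an empty slot
end

local tab = build({
   { hash = 0x10000000, key = "a", value = 1 },
   { hash = 0x18000000, key = "b", value = 2 },
   { hash = 0x20000000, key = "c", value = 3 },
   { hash = 0x20000000, key = "d", value = 4 },
   { hash = 0xF0000000, key = "e", value = 5 },
}, 8)
local entry, displacement = lookup(tab, 0x18000000, "b")
print(entry.value, displacement)
```

Because empty slots carry the largest possible hash, the scan needs no separate emptiness test: an empty slot ends the search in the same way as an entry with a greater hash.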

The ctable has two lookup interfaces. One will perform the lookup as described above, scanning through the hash table in place. The other will fetch all entries within the maximum displacement into a buffer, then do a branchless binary search over that buffer. This second streaming lookup can also fetch entries for multiple keys in one go. This can amortize the cost of a round-trip to RAM, in the case where you expect to miss cache for every lookup.

To create a ctable, first create a parameters table specifying the key and value types, along with any other options. Then call ctable.new on those parameters. For example:

local ctable = require('lib.ctable')
local ffi = require('ffi')
-- Expected number of entries; 2 million is only an example value.
local occupancy = 2e6
local params = {
   key_type = ffi.typeof('uint32_t'),
   value_type = ffi.typeof('int32_t[6]'),
   hash_fn = ctable.hash_32,
   max_occupancy_rate = 0.4,
   initial_size = math.ceil(occupancy / 0.4)
}
local ctab = ctable.new(params)

— Function ctable.new parameters

Create a new ctable. parameters is a table of key/value pairs. The following keys are required:

key_type: An FFI type (LuaJIT “ctype”) for keys in this table.
value_type: An FFI type (LuaJIT “ctype”) for values in this table.

Hash values are unsigned 32-bit integers in the range [0, 0xFFFFFFFF). That is to say, 0xFFFFFFFF is the only unsigned 32-bit integer that is not a valid hash value. The hash_fn must return a hash value in the correct range.

Optional entries that may be present in the parameters table include:

hash_fn: A function that takes a key and returns a hash value. If not given, defaults to the result of calling compute_hash_fn on the key type.
initial_size: The initial size of the hash table, including free space. Defaults to 8 slots.
max_occupancy_rate: The maximum ratio of occupancy/size, where occupancy denotes the number of entries in the table, and size is the total table size including free entries. Trying to add an entry to a “full” table will cause the table to grow in size by a factor of

Defaults to 0.9, for a 90% maximum occupancy ratio.

min_occupancy_rate: Minimum ratio of occupancy/size. Removing an entry from an “empty” table will shrink the table.

— Function ctable.load stream, parameters

Load a ctable that was previously saved out to a binary format. parameters are as for ctable.new. stream should be an object that has a :read_ptr(ctype) method, which returns a pointer to an embedded instance of ctype in the stream, advancing the stream over the object; and :read_array(ctype, count) which is the same but reading count instances of ctype instead of just one.

Methods

Users interact with a ctable through methods. In these method descriptions, the object on the left-hand-side of the method invocation should be a ctable.

— Method :resize size

Resize the ctable to have size total entries, including empty space.

— Method :insert hash, key, value, updates_allowed

An internal helper method that does the bulk of updates to the hash table. hash is the hash of key. This method takes the hash as an explicit parameter because it is used when resizing the table, and that way we avoid calling the hash function in that case. key and value are FFI values for the key and the value, of course.

updates_allowed is an optional parameter. If not present or false, then the :insert method will raise an error if the key is already present in the table. If updates_allowed is the string "required", then an error will be raised if key is not already in the table. Any other true value allows updates but does not require them. An update will replace the existing entry in the table.
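The three behaviours selected by updates_allowed can be modelled with an ordinary Lua table. This is a sketch of the rules as described, not the ctable code: add is a hypothetical function and the error messages are made up for illustration.

```lua
-- Hypothetical model of the updates_allowed rules, using a Lua table.
local function add(tab, key, value, updates_allowed)
   local present = tab[key] ~= nil
   if present and not updates_allowed then
      error("key is already present in table")
   elseif not present and updates_allowed == "required" then
      error("key is not present in table")
   end
   tab[key] = value   -- insert a new entry or replace the existing one
end

local tab = {}
add(tab, "k", 1)                        -- plain insert
add(tab, "k", 2, true)                  -- update allowed
add(tab, "k", 3, "required")            -- update required, key exists
print(pcall(add, tab, "k", 4))          -- fails: updates not allowed
print(pcall(add, tab, "x", 1, "required")) -- fails: key is missing
```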

Returns the index of the inserted entry.

— Method :add key, value, updates_allowed

Add an entry to the ctable, returning the index of the added entry. See the documentation for :insert for a description of the parameters.

— Method :update key, value

Update the entry in a ctable with the key key to have the new value value. Throw an error if key is not present in the table.

— Method :lookup_ptr key

Look up key in the table, and if found return a pointer to the entry. Return nil if the value is not found.

An entry pointer has three fields: the hash value, which must not be modified; the key itself; and the value. Access them as usual in Lua:

local ptr = ctab:lookup_ptr(key)
if ptr then print(ptr.value) end

Note that pointers are only valid until the next modification of a table.

— Method :lookup_and_copy key, entry

Look up key in the table, and if found, copy that entry into entry and return true. Otherwise return false.

— Method :remove_ptr entry

Remove an entry from a ctable. entry should be a pointer that points into the table. Note that pointers are only valid until the next modification of a table.

— Method :remove key, missing_allowed

Remove an entry from a ctable, keyed by key.

Return true if we actually do find a value and remove it. Otherwise if no entry is found in the table and missing_allowed is true, then return false. Otherwise raise an error.

— Method :save stream

Save a ctable to a byte sink. stream should be an object that has a :write_ptr(ctype) method, which writes an instance of a struct type out to a stream, and :write_array(ctype, count) which is the same but writing count instances of ctype instead of just one.

— Method :selfcheck

Run an expensive internal diagnostic to verify that the table’s internal invariants are fulfilled.

— Method :dump

Print out the entries in a table. Can be expensive if the table is large.

— Method :iterate

Return an iterator for use in a for-in loop. For example:

for entry in ctab:iterate() do
   print(entry.key, entry.value)
end

Streaming interface

As mentioned earlier, batching multiple lookups can amortize the cost of a round-trip to RAM. To do this, first prepare a LookupStreamer for the batch size that you need. You will have to experiment to find the batch size that works best for your table’s entry sizes; for reference, for 32-byte entries a 32-wide lookup seems to be optimum.

-- Stream in 32 lookups at once.
local stride = 32
local streamer = ctab:make_lookup_streamer(stride)

Wiring up streaming lookup in a packet-processing network is a bit of a chore currently, as you have to maintain separate queues of lookup keys and packets, assuming that each lookup maps to a packet. Let’s make a little helper:

local lookups = {
   queue = ffi.new("struct packet * [?]", stride),
   queue_len = 0,
   streamer = streamer
}

local function flush(lookups)
   if lookups.queue_len > 0 then
      -- Here is the magic!
      lookups.streamer:stream()
      for i = 0, lookups.queue_len - 1 do
         local pkt = lookups.queue[i]
         if lookups.streamer:is_found(i) then
            local val = lookups.streamer.entries[i].value
            --- Do something cool here!
         end
      end
      lookups.queue_len = 0
   end
end

local function enqueue(lookups, pkt, key)
   local n = lookups.queue_len
   lookups.streamer.entries[n].key = key
   lookups.queue[n] = pkt
   n = n + 1
   -- Record the new length before flushing, so that flush() also
   -- processes the entry that was just added.
   lookups.queue_len = n
   if n == stride then
      flush(lookups)
   end
end

Then as you see packets, you enqueue them via enqueue, extracting out the key from the packet in some way and passing that value as the argument. When enqueue detects that the queue is full, it will flush it, performing the lookups in parallel and processing the results.

Hash functions

Any hash function will do, as long as it produces values in the [0, 0xFFFFFFFF) range. In practice we include some functions for hashing byte sequences of some common small lengths.
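A custom hash function has to avoid the reserved value 0xFFFFFFFF. The following sketch shows one way to enforce this. Both valid_hash and hash_bytes are hypothetical helpers, and the string hash is a deliberately simple one chosen for illustration, not one of the functions provided by the module.

```lua
-- Hypothetical wrapper: fold any non-negative integer into the valid
-- hash range [0, 0xFFFFFFFF).
local function valid_hash(h)
   h = h % 0x100000000
   if h == 0xFFFFFFFF then h = 0 end   -- 0xFFFFFFFF marks empty entries
   return h
end

-- A simple (not high-quality) hash over the bytes of a Lua string.
local function hash_bytes(str)
   local h = 0
   for i = 1, #str do
      h = (h * 31 + str:byte(i)) % 0x100000000
   end
   return valid_hash(h)
end

print(hash_bytes("abc"))
```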

— Function ctable.hash_32 number

Hash a 32-bit integer. As a hash_fn parameter, this will only work if your key type’s Lua representation is a Lua number. For example, use hash_32 on ffi.typeof('uint32_t'), but use hashv_32 on ffi.typeof('uint8_t[4]').

— Function ctable.hashv_32 ptr

Hash the first 32 bits of a byte sequence.

— Function ctable.hashv_48 ptr

Hash the first 48 bits of a byte sequence.

— Function ctable.hashv_64 ptr

Hash the first 64 bits of a byte sequence.

— Function ctable.compute_hash_fn ctype

Return a hashv_-like hash function over the bytes in instances of ctype. Note that the same reservations apply as for hash_32 above.

PMU (lib.pmu)

The CPU’s PMU (Performance Monitoring Unit) collects information about specific performance events such as cache misses, branch mispredictions, and utilization of internal CPU resources like execution units. This module provides an API for counting events with the PMU.

Hundreds of low-level counters are available. The exact list depends on CPU model. See pmu_cpu.lua for our definitions.

High-level interface

— Function is_available

If the PMU hardware is available then return true. Otherwise return two values: false and a string briefly explaining why. (Cooperation from the Linux kernel is required to access the PMU.)

— Function profile function [event_list] [aux]

Call function, return the result, and print a human-readable report of the performance events that were counted during execution.

— Function measure function [event_list]

Call function and return two values: the result and a table of performance event counter tallies.

Low-level interface

— Function setup event_list

Setup the hardware performance counters to track a given list of events (in addition to the built-in fixed-function counters).

Each event is a Lua string pattern. This could be a full event name:

mem_load_uops_retired.l1_hit

or a more general pattern that matches several counters:

mem_load.*l._hit
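Since events are selected with Lua string patterns, their effect can be previewed with string.find. This is a sketch only: sample_events is a small hand-written list rather than the contents of pmu_cpu.lua, matching is a hypothetical helper, and it is an assumption that patterns are matched unanchored.

```lua
-- A small hand-written sample of event names (not the full list).
local sample_events = {
   "mem_load_uops_retired.l1_hit",
   "mem_load_uops_retired.l2_hit",
   "mem_load_uops_retired.l1_miss",
   "uops_issued.any",
}

-- Return the sample events matched by a Lua string pattern.
local function matching(pattern)
   local found = {}
   for _, name in ipairs(sample_events) do
      if name:find(pattern) then found[#found + 1] = name end
   end
   return found
end

for _, name in ipairs(matching("mem_load.*l._hit")) do print(name) end
```

Note that "." in a Lua pattern matches any character, so a full event name such as uops_issued.any is itself a valid pattern.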

Return the number of overflowed counters that could not be tracked due to hardware constraints. These will be the last counters in the list.

Example:

setup({"uops_issued.any",
       "uops_retired.all",
       "br_inst_retired.conditional",
       "br_misp_retired.all_branches"}) => 0

— Function new_counter_set

Return a counter_set object that can be used for accumulating events. The counter_set will be valid only until the next call to setup().

— Function switch_to counter_set

Switch to a new set of counters to accumulate events in. Has the side-effect of committing the current accumulators to the previous record.

If counter_set is nil then do not accumulate events.

— Function to_table counter_set

Return a table containing the values accumulated in counter_set.

Example:

to_table(cs) =>
  {
   -- Fixed-function counters
   instructions                 = 133973703,
   cycles                       = 663011188,
   ref-cycles                   = 664029720,
   -- General purpose counters selected with setup()
   uops_issued.any              = 106860997,
   uops_retired.all             = 106844204,
   br_inst_retired.conditional  =  26702830,
   br_misp_retired.all_branches =       419
  }

— Function report counter_set [aux]

Print a textual report on the values accumulated in a counter set. Optionally include auxiliary application-level counters. The ratio of each event to each auxiliary counter is also reported.

Example:

report(my_counter_set, {packet = 26700000, breath = 208593})

prints output approximately like:

EVENT                                   TOTAL     /packet     /breath
instructions                      133,973,703       5.000     642.000
cycles                            663,011,188      24.000    3178.000
ref-cycles                        664,029,720      24.000    3183.000
uops_issued.any                   106,860,997       4.000     512.000
uops_retired.all                  106,844,204       4.000     512.000
br_inst_retired.conditional        26,702,830       1.000     128.000
br_misp_retired.all_branches              419       0.000       0.000
packet                             26,700,000       1.000     128.000
breath                                208,593       0.008       1.000

Hardware

PCI (lib.hardware.pci)

The lib.hardware.pci module provides functions that abstract common operations on PCI devices on Linux. In order to drive a PCI device using Direct memory access (DMA) one must:

Unbind the PCI device using pci.unbind_device_from_linux.
Enable PCI bus mastering for device using pci.set_bus_master in order to enable DMA.
Memory map PCI device configuration space using pci.map_pci_memory.
Control the PCI device by manipulating the memory referenced by the pointer returned by pci.map_pci_memory.
Disable PCI bus master for device using pci.set_bus_master.
Unmap PCI device configuration space using pci.close_pci_resource.

The correct ordering of these steps is absolutely critical.
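The ordering can be captured in a helper that performs the setup steps, runs a caller-supplied function, and then tears down in the documented order. This is a sketch, not part of the module: with_device is a hypothetical helper, the pci module is passed in as an argument (in Snabb it would be require("lib.hardware.pci")) so that the example can be exercised against a stub, and it uses the map_pci_memory_unlocked variant described below.

```lua
-- Sketch: run fn(pointer) with the device mapped, then tear down.
local function with_device(pci, pciaddress, n, fn)
   pci.unbind_device_from_linux(pciaddress)
   pci.set_bus_master(pciaddress, true)       -- enable DMA
   local pointer, fd = pci.map_pci_memory_unlocked(pciaddress, n)
   -- Protect the call so that teardown happens even if fn fails.
   local ok, err = pcall(fn, pointer)
   pci.set_bus_master(pciaddress, false)      -- disable DMA
   pci.close_pci_resource(fd, pointer)
   if not ok then error(err, 0) end
end

-- Stub module that records the order of calls, for demonstration only.
local calls = {}
local function record(name, ...)
   local results = { ... }
   return function ()
      calls[#calls + 1] = name
      return results[1], results[2]
   end
end
local stub = {
   unbind_device_from_linux = record("unbind"),
   set_bus_master = record("bus_master"),
   map_pci_memory_unlocked = record("map", "pointer", 3),
   close_pci_resource = record("close"),
}
with_device(stub, "01:00.0", 0, function () calls[#calls + 1] = "use" end)
print(table.concat(calls, " "))
```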

— Variable pci.devices

An array of supported hardware devices. Must be populated by calling pci.scan_devices. Each entry is a table as returned by pci.device_info.

— Function pci.canonical pciaddress

Returns the canonical representation of a PCI address. The canonical representation is preferred internally in Snabb and for presenting to users. It shortens addresses with leading zeros like this: 0000:01:00.0 becomes 01:00.0.

— Function pci.qualified pciaddress

Returns the fully qualified representation of a PCI address. Fully qualified addresses have the form 0000:01:00.0 and so this function undoes any abbreviation in the canonical representation.
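The two representations differ only in the leading 0000: domain prefix. The conversion can be sketched with Lua string patterns. These are re-implementations written from the descriptions above, not the module's code, and they only handle the 0000 domain shown in the examples.

```lua
-- Strip a leading "0000:" domain, if present.
local function canonical(address)
   return (address:gsub("^0000:", ""))
end

-- Add the "0000:" domain to an abbreviated bus:device.function address.
local function qualified(address)
   if address:match("^%x%x:%x%x%.%x$") then
      return "0000:" .. address
   end
   return address
end

print(canonical("0000:01:00.0"), qualified("01:00.0"))
```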

— Function pci.scan_devices

Scans for available PCI devices and populates the pci.devices table.

— Function pci.device_info pciaddress

Returns a table containing information about the PCI device identified by pciaddress. The table has the following keys:

pciaddress—String denoting the PCI address of the device. E.g. "0000:83:00.1".
vendor—Identification string e.g. "0x8086" for Intel.
device—Identification string e.g. "0x10fb" for 82599 chip.
interface—Name of Linux interface using this device e.g. "eth0".
status—String denoting the Linux operational status, or nil if not known.
driver—String denoting the Lua module that supports this hardware e.g. "apps.intel.intel10g".
usable—String denoting if the device was suitable to use when scanned. One of "yes" or "no".

— Function pci.which_driver vendor, model

Returns the module name for a suitable device driver (if available) for a device of model from vendor.

— Function pci.unbind_device_from_linux pciaddress

Forces Linux to unbind the device identified by pciaddress from any kernel drivers.

— Function pci.set_bus_master pciaddress, enable

Enables or disables PCI bus mastering for device identified by pciaddress depending on whether enable is a true or a false value. PCI bus mastering must be enabled in order to perform DMA on the PCI device.

— Function pci.map_pci_memory_unlocked pciaddress, n

— Function pci.map_pci_memory_locked pciaddress, n

Memory maps configuration space n of the PCI device identified by pciaddress. Returns a pointer to the memory mapped region and a file descriptor of the opened sysfs resource file. PCI bus mastering must be enabled on the device identified by pciaddress before calling this function. The two variants differ in whether the underlying memory mapped file is exclusively flocked or not.

— Function pci.close_pci_resource file_descriptor, pointer

Closes memory mapped file_descriptor of sysfs resource file and unmaps it from pointer as returned by pci.map_pci_memory.

Register (lib.hardware.register)

The lib.hardware.register module provides an abstraction for hardware device registers. This abstraction can be used to declaratively specify and conveniently manipulate structured memory regions via DMA. The functions register.define and register.define_array construct Register objects based on a register description string. The resulting Register objects can be used to manipulate the defined registers using the methods Register:read, Register:write, Register:set, Register:clr, Register:wait and Register:reset (exact set depends on the register mode).

A register description is a string with one Register object definition per line. A Register object definition must be expressed using the following grammar:

Register   ::= Name Offset Indexing Mode Longname
Name       ::= <identifier>
Indexing   ::= "-"
           ::= "+" OffsetStep "*" Min ".." Max
Mode       ::= "RO" | "RW" | "RC" | "RCR" | "RW64" | "RO64" | "RC64" | "RCR64"
Longname   ::= <string>
Offset ::= OffsetStep ::= Min ::= Max ::= <number>

A Register object definition is made up of the following properties:

Name—A string to be used to refer to the Register object. Must be a valid Lua identifier, e.g. "foo", "foo_bar", "FOO" etc.
Offset—Integer specifying the offset from the base pointer (as supplied to register.define and register.define_array).
Indexing—Optional. Three integers specifying the offset step as well as minimum and maximum indexes in bytes.
Mode—One of "RO", "RW", "RC", "RCR", "RO64", "RW64", "RC64", "RCR64", standing for read-only, read-write and counter modes in 32-bit and 64-bit variants respectively. Counter mode ("RC") is for counter registers that clear back to zero when read; "RCR" is for counters that wrap.
Longname—A string describing the register (used for self-documentation).

For instance, the following Register object definition defines a register range “TXDCTL” in read-write mode starting at offset 0x06028 with 128 registers each of length 0x40.

TXDCTL 0x06028 +0x40*0..127 RW Transmit Descriptor Control

The next example defines a singular register “TPT” in counter mode located at offset 0x01428.

TPT 0x01428 - RC Total Packets Transmitted
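To make the grammar concrete, here is a sketch of a parser for a single definition line. It is an illustration written from the grammar above, not the module's parser: parse_register and register_offset are hypothetical names, and register_offset assumes that indexes count from zero.

```lua
-- Parse one Register object definition into a Lua table.
local function parse_register(line)
   local name, offset, indexing, mode, longname =
      line:match("^(%S+)%s+(%S+)%s+(%S+)%s+(%S+)%s+(.-)%s*$")
   assert(name, "malformed register definition: " .. line)
   local reg = { name = name, offset = tonumber(offset),
                 mode = mode, longname = longname }
   if indexing ~= "-" then
      -- Indexing has the form +OffsetStep*Min..Max
      local step, min, max = indexing:match("^%+(%w+)%*(%d+)%.%.(%d+)$")
      reg.step, reg.min, reg.max =
         tonumber(step), tonumber(min), tonumber(max)
   end
   return reg
end

-- Byte offset of the register with index i (assumed to start at 0).
local function register_offset(reg, i)
   return reg.offset + (reg.step or 0) * (i or 0)
end

local txdctl = parse_register(
   "TXDCTL 0x06028 +0x40*0..127 RW Transmit Descriptor Control")
local tpt = parse_register("TPT 0x01428 - RC Total Packets Transmitted")
print(txdctl.name, txdctl.step, txdctl.max, tpt.mode)
```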

— Function register.define description, table, base_pointer, n

Creates Register objects for description relative to base_pointer. The resulting Register objects will become named entries in table using the names defined in description. If an entry in description defines an indexing range then n specifies the index of the register within that range. n defaults to 0.

— Function register.define_array description, table, base_pointer

Creates Register objects for description relative to base_pointer. The resulting Register objects will become named entries in table using the names defined in description. If an entry in description defines an indexing range, an array of Register objects will be created instead of a singular Register object.

— Function register.dump table

Prints a pretty-printed register dump of a table of registers.

— Method Register:read

Returns the value of register. For convenience register objects can be called without arguments instead of calling Register:read. E.g. reg:read() is equivalent to reg().

If register is in counter mode it is assumed that the register will be reset to zero upon reading. The read value is added to a register accumulator and the sum of all reads is returned.

— Method Register:write value

Sets the value of register to value. Only available on registers in read-write mode. For convenience register objects can be called with an argument instead of calling Register:write. E.g. reg:write(value) is equivalent to reg(value).

— Method Register:set bitmask

Sets bits of register according to bitmask. Only available on registers in read-write mode.

— Method Register:clr bitmask

Clears bits of register according to bitmask. Only available on registers in read-write mode.

— Method Register:bits offset, length, bits

Get or set length bits at offset in register. Sets length bits at offset in register to bits if bits is supplied. Returns length bits at offset in register otherwise. Setting is only available on registers in read-write mode.
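The get and set behaviour can be modelled on a plain number. This is a sketch, not the Register method: bits here is a hypothetical free function that takes the current register value as its first argument and returns either the extracted field or the updated value, using arithmetic instead of a bit library.

```lua
-- Get or set `length` bits at bit `offset` of `value`.
local function bits(value, offset, length, new_bits)
   local shift, span = 2^offset, 2^length
   local current = math.floor(value / shift) % span
   if new_bits == nil then
      return current                    -- getter: extract the field
   end
   -- Setter: replace the old field contents with the new ones.
   return value + (new_bits % span - current) * shift
end

print(bits(0xF0, 4, 4), bits(0xF0, 0, 4, 0x5))
```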

— Method Register:byte offset, byte

Get or set byte at offset in register. Sets byte at offset in register to byte if byte is supplied. Returns byte at offset in register otherwise. Setting is only available on registers in read-write mode.

— Method Register:wait bitmask, value

Blocks until applying bitmask to the register equals value. If value is not supplied blocks until all bits in the mask are set instead. Only available on registers in read-write and read-only modes.

— Method Register:reset

Reset the register accumulator to 0. Only available on registers in counter mode.
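The interplay between clear-on-read hardware counters, the accumulator and Register:reset can be sketched with a toy model. This is an illustration only: new_counter is a hypothetical constructor and the hardware register is simulated by a Lua table with a value field.

```lua
-- Toy model of a counter-mode register backed by simulated hardware.
local function new_counter(hardware)
   local counter = { acc = 0 }
   function counter:read()
      -- Reading the hardware register returns its value and clears it.
      self.acc = self.acc + hardware.value
      hardware.value = 0
      return self.acc   -- the sum of all reads so far
   end
   function counter:reset()
      self.acc = 0
   end
   return counter
end

local hardware = { value = 10 }
local counter = new_counter(hardware)
local first = counter:read()
hardware.value = 5          -- five more events counted by the hardware
local second = counter:read()
print(first, second)
```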

— Method Register:print

Prints the register state to standard output.

Protocols

Protocol Header (lib.protocol.header)

The lib.protocol.header module contains the base class from which the supported protocol classes are derived. It defines generic methods on all protocol subclasses.

— Method header:new_from_mem memory, length

Creates and returns a header object by “overlaying” the respective header structure over length bytes of memory.

— Method header:header

Returns the raw header as a cdata object.

— Method header:sizeof

Returns the byte size of header.

— Method header:eq header

Generic equality predicate. Returns true if header is equal to self and false otherwise.

— Method header:copy destination, relocate

Copies the header to destination. The caller must ensure that there is enough space at destination. If relocate is a true value, destination is promoted to be the active storage for the header.

— Method header:clone

Returns a copy of the header object.

— Method header:upper_layer

Returns the protocol class that can handle the “upper layer protocol” or nil if the protocol is not supported or the protocol has no upper layer.

For instance, on an Ethernet header object this method might return an IPv4 or IPv6 header class.

Ethernet (lib.protocol.ethernet)

The lib.protocol.ethernet module contains a class for representing Ethernet headers. The ethernet protocol class supports two upper layer protocols: lib.protocol.ipv4 and lib.protocol.ipv6.

— Method ethernet:new config

Returns a new Ethernet header for config. config must be a table which may contain the following keys:

dst - Destination MAC (binary representation). Default is 00:00:00:00:00:00.
src - Source MAC (binary representation). Default is 00:00:00:00:00:00.
type - Either 0x0800 or 0x86dd for IPv4 or IPv6 respectively. Default is 0x0.

— Method ethernet:src mac

— Method ethernet:dst mac

— Method ethernet:type type

Combined accessor and setter methods. These methods set the values of the source, destination and type fields of an Ethernet header. If no argument is given the current value is returned.

Example:

local eth = ethernet:new({src = ethernet:pton("00:00:00:00:00:00"),
                          dst = ethernet:pton("00:00:00:00:00:00"),
                          type = 0x86dd})
eth:dst(ethernet:pton("54:52:00:01:00:00"))
ethernet:ntop(eth:dst()) => "54:52:00:01:00:00"

— Method ethernet:src_eq mac

— Method ethernet:dst_eq mac

Predicate methods to test if mac is equal to the source or destination address respectively.

— Method ethernet:swap

Swaps the values of the source and destination fields.

— Function ethernet:pton string

Returns the binary representation of MAC address denoted by string.

— Function ethernet:ntop mac

Returns the string representation of mac address.

— Function ethernet:is_mcast mac

Returns a true value if mac address denotes a Multicast address.

— Function ethernet:is_bcast mac

Returns a true value if mac address denotes a Broadcast address.

— Function ethernet:ipv6_mcast ip

Returns the MAC address for IPv6 multicast ip as defined by RFC2464, section 7.
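The address helpers can be approximated in plain Lua to show what the binary representation looks like. This is a sketch under stated assumptions, not the module's code: binary addresses are modelled as Lua strings (6 bytes for a MAC, 16 bytes for an IPv6 address) rather than FFI arrays, and all four functions are hypothetical re-implementations. The multicast mapping follows RFC 2464, section 7: the octets 33:33 followed by the last four bytes of the IPv6 address.

```lua
-- Parse "aa:bb:cc:dd:ee:ff" into a 6-byte string.
local function pton(str)
   local bytes = {}
   for octet in str:gmatch("%x%x") do
      bytes[#bytes + 1] = string.char(tonumber(octet, 16))
   end
   assert(#bytes == 6, "invalid MAC address: " .. str)
   return table.concat(bytes)
end

-- Format a 6-byte string as "aa:bb:cc:dd:ee:ff".
local function ntop(mac)
   local octets = {}
   for i = 1, 6 do octets[i] = string.format("%02x", mac:byte(i)) end
   return table.concat(octets, ":")
end

-- Multicast addresses have the lowest bit of the first octet set.
local function is_mcast(mac)
   return mac:byte(1) % 2 == 1
end

-- 33:33 followed by the last four bytes of the IPv6 address.
local function ipv6_mcast(ip)
   return string.char(0x33, 0x33) .. ip:sub(13, 16)
end

-- ff02::1, the all-nodes multicast address, as 16 raw bytes.
local all_nodes = string.char(0xff, 0x02) .. string.rep("\0", 13)
   .. string.char(0x01)
print(ntop(ipv6_mcast(all_nodes)))
```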

IPv4 (lib.protocol.ipv4)

The lib.protocol.ipv4 module contains a class for representing IPv4 headers. The ipv4 protocol class supports four upper layer protocols: lib.protocol.tcp, lib.protocol.udp, lib.protocol.gre and lib.protocol.icmp.header.

— Method ipv4:new config

Returns a new IPv4 header for config. config must be a table which may contain the following keys:

dst - Destination IPv4 address (binary representation). Default is 0.0.0.0.
src - Source IPv4 address (binary representation). Default is 0.0.0.0.
protocol - The upper layer protocol, can be 6 (TCP), 17 (UDP), 47 (GRE) or 58 (ICMP). Default is 255.
dscp - “Differentiated Services Code Point” field (6 bit unsigned integer). Default is 0.
ecn - “Explicit Congestion Notification” field (2 bit unsigned integer). Default is 0.
id - “Identification” field (16 bit unsigned integer). Default is 0.
flags - “Don’t Fragment (DF)” and “More Fragments (MF)” fields (3 bit unsigned integer). Default is 0.
frag_off - “Fragment Offset” field (13 bit unsigned integer). Default is 0.
ttl - “Time To Live” field (8 bit unsigned integer). Default is 0.
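The flags and frag_off fields share a single 16-bit word in the IPv4 header, which is why their widths add up to 16 bits. The following sketch shows the packing arithmetic. pack_frag and unpack_frag are hypothetical helpers, not part of the ipv4 class, and the word is treated as a plain number without regard to byte order.

```lua
-- Pack "flags" (3 bits) and "frag_off" (13 bits) into a 16-bit word.
local function pack_frag(flags, frag_off)
   return (flags % 8) * 0x2000 + (frag_off % 0x2000)
end

-- Split a 16-bit word back into flags and fragment offset.
local function unpack_frag(word)
   return math.floor(word / 0x2000), word % 0x2000
end

-- A flags value of 2 (binary 010) sets the "Don't Fragment" bit.
print(string.format("0x%04x", pack_frag(2, 0)))
```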

— Method ipv4:dst ip

— Method ipv4:src ip

— Method ipv4:protocol protocol

— Method ipv4:dscp dscp

— Method ipv4:ecn ecn

— Method ipv4:id id

— Method ipv4:flags flags

— Method ipv4:frag_off frag_off

— Method ipv4:ttl ttl

Combined accessor and setter methods. These methods set the values of the instance fields (see new) of an IPv4 header. If no argument is given the current value is returned.

— Method ipv4:version version

Combined accessor and setter method for the “Version” field (4 bit unsigned integer). Defaults to 4 (set automatically by new). Sets the “Version” field to version. If no argument is given the current value is returned.

— Method ipv4:ihl ihl

Combined accessor and setter method for the “Internet Header Length” field (4 bit unsigned integer). Set automatically by new. Sets the “Internet Header Length” field to ihl. If no argument is given the current value is returned.

— Method ipv4:total_length length

Combined accessor and setter method for the “Total Length” field (16 bit unsigned integer). Defaults to header length (set automatically by new). Sets the “Total Length” field to length. If no argument is given the current value is returned.

— Method ipv4:checksum

Computes and sets the IPv4 header checksum. It is called automatically by new but must be called again after the header is changed.

— Method ipv4:dst_eq ip

— Method ipv4:src_eq ip

Predicate methods to test if ip is equal to the source or destination address respectively.

— Function ipv4:pton string

Returns the binary representation of IPv4 address denoted by string.

— Function ipv4:ntop ip

Returns the string representation of ip address.

IPv6 (lib.protocol.ipv6)

The lib.protocol.ipv6 module contains a class for representing IPv6 headers. The ipv6 protocol class supports four upper layer protocols: lib.protocol.tcp, lib.protocol.udp, lib.protocol.gre and lib.protocol.icmp.header.

— Method ipv6:new config

Returns a new IPv6 header for config. Config must be a table which may contain the following keys:

dst - Destination IPv6 address (binary representation). Default is 0::0.
src - Source IPv6 address (binary representation). Default is 0::0.
traffic_class - “Traffic Class” field (8 bit unsigned integer). Default is 0.
flow_label - “Flow Label” field (20 bit unsigned integer). Default is 0.
next_header - “Next Header” field (8 bit unsigned integer). Default is 0.
hop_limit - “Hop Limit” field (8 bit unsigned integer). Default is 0.

— Method ipv6:dst ip

— Method ipv6:src ip

— Method ipv6:traffic_class traffic_class

— Method ipv6:flow_label flow_label

— Method ipv6:next_header next_header

— Method ipv6:hop_limit hop_limit

Combined accessor and setter methods. These methods set the values of the instance fields (see new) of an IPv6 header. If no argument is given the current value is returned.

— Method ipv6:version version

Combined accessor and setter method for the “Version” field (4 bit unsigned integer). Defaults to 6 (set automatically by new). Sets the “Version” field to version. If no argument is given the current value is returned.

— Method ipv6:dscp dscp

Combined accessor and setter method for the “Differentiated Services Code Point” field (6 bit unsigned integer). Default is 0. This is a sub-field of the “Traffic Class” field. Sets the “Differentiated Services Code Point” field to dscp. If no argument is given the current value is returned.

— Method ipv6:ecn ecn

Combined accessor and setter method for the “Explicit Congestion Notification” field (2 bit unsigned integer). Default is 0. This is a sub-field of the “Traffic Class” field. Sets the “Explicit Congestion Notification” field to ecn. If no argument is given the current value is returned.
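The relationship between the “Traffic Class” field and its two sub-fields can be sketched in plain Lua. The layout used here (DSCP in the upper six bits, ECN in the lower two bits) follows RFC 2474 and RFC 3168; the helper names are hypothetical and not part of lib.protocol.ipv6.

```lua
-- Split an 8-bit Traffic Class value into its sub-fields.
-- Hypothetical helpers for illustration, not part of the Snabb API.
function split_traffic_class (traffic_class)
   local dscp = math.floor(traffic_class / 4)  -- upper 6 bits
   local ecn  = traffic_class % 4              -- lower 2 bits
   return dscp, ecn
end

-- Combine DSCP and ECN back into a Traffic Class value.
function join_traffic_class (dscp, ecn)
   return dscp * 4 + ecn
end

local dscp, ecn = split_traffic_class(0xb8)
print(dscp, ecn)  -- DSCP 46 (Expedited Forwarding), ECN 0
```

In other words, setting dscp or ecn on a header is expected to change the value reported by traffic_class, and vice versa.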

— Method ipv6:payload_length length

Combined accessor and setter method for the “Payload Length” field (16 bit unsigned integer). Default is 0. Sets the “Payload Length” field to length. If no argument is given the current value is returned.

— Method ipv6:dst_eq ip

— Method ipv6:src_eq ip

Predicate methods to test if ip is equal to the source or destination addresses individually.

— Function ipv6:pton string

Returns the binary representation of IPv6 address denoted by string.

— Function ipv6:ntop ip

Returns the string representation of ip address.

— Function ipv6:solicited_node_mcast ip

Returns the solicited-node multicast address from the given unicast ip.
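As a rough illustration of what a solicited-node multicast address is, the following plain-Lua sketch builds one according to RFC 4291: the prefix ff02::1:ff00:0/104 followed by the low-order 24 bits of the unicast address. It works on a Lua table of 16 byte values, which is an assumption made for readability; the Snabb function operates on the binary address representation.

```lua
-- Build the solicited-node multicast address for a unicast address.
-- addr is a table of 16 byte values (an assumption for this sketch).
function solicited_node_mcast (addr)
   assert(#addr == 16, "expected 16 address bytes")
   local mcast = {}
   for i = 1, 16 do mcast[i] = 0 end
   mcast[1], mcast[2] = 0xff, 0x02     -- ff02::/16
   mcast[12], mcast[13] = 0x01, 0xff   -- ...:1:ffXX:XXXX
   for i = 14, 16 do
      mcast[i] = addr[i]               -- low-order 24 bits of the unicast address
   end
   return mcast
end

-- 4037::1:800:200e:8c6c (the example address used in RFC 4291)
local unicast = { 0x40, 0x37, 0, 0, 0, 0, 0, 0,
                  0x00, 0x01, 0x08, 0x00, 0x20, 0x0e, 0x8c, 0x6c }
local mcast = solicited_node_mcast(unicast)
print(string.format("ff02::1:ff%02x:%02x%02x",
                    mcast[14], mcast[15], mcast[16]))  -- ff02::1:ff0e:8c6c
```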

TCP (lib.protocol.tcp)

The lib.protocol.tcp module contains a class for representing TCP headers.

— Method tcp:new config

Returns a new TCP header for config. Config must be a table which may contain the following keys:

src_port - “Source Port Number” field (16 bit unsigned integer). Default is 0.
dst_port - “Destination Port Number” field (16 bit unsigned integer). Default is 0.
seq_num - “Sequence Number” field (32 bit unsigned integer). Default is 0.
ack_num - “Acknowledgement Number” field (32 bit unsigned integer). Default is 0.
window_size - “Window Size” field (16 bit unsigned integer). Default is 0.
offset - “Data Offset” field (4 bit unsigned integer). Default is 0.
ns - “NS” flag (1 bit). Default is 0.
cwr - “CWR” flag (1 bit). Default is 0.
ece - “ECE” flag (1 bit). Default is 0.
urg - “URG” flag (1 bit). Default is 0.
ack - “ACK” flag (1 bit). Default is 0.
psh - “PSH” flag (1 bit). Default is 0.
rst - “RST” flag (1 bit). Default is 0.
syn - “SYN” flag (1 bit). Default is 0.
fin - “FIN” flag (1 bit). Default is 0.

— Method tcp:src_port port

— Method tcp:dst_port port

— Method tcp:seq_num seq_num

— Method tcp:ack_num ack_num

— Method tcp:window_size window_size

— Method tcp:offset offset

— Method tcp:ns ns

— Method tcp:cwr cwr

— Method tcp:ece ece

— Method tcp:urg urg

— Method tcp:ack ack

— Method tcp:psh psh

— Method tcp:rst rst

— Method tcp:syn syn

— Method tcp:fin fin

Combined accessor and setter methods. These methods set the values of the instance fields (see new) of a TCP header. If no argument is given the current value is returned.

— Method tcp:flags flags

Combined accessor and setter method for the TCP header flags (NS, CWR, ECE, URG, ACK, PSH, RST, SYN and FIN). Sets the header’s flags according to flags (9 bit unsigned integer). If no argument is given the current flags are returned.
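The 9 bit flags value can be pictured as follows. This plain-Lua sketch assumes the standard TCP header bit order, with NS as the most significant and FIN as the least significant of the nine bits; the helpers pack_flags and has_flag are hypothetical and only illustrate the encoding.

```lua
-- Bit values of the nine TCP flags, assuming the standard header layout.
local flag_bits = {
   ns  = 0x100, cwr = 0x080, ece = 0x040,
   urg = 0x020, ack = 0x010, psh = 0x008,
   rst = 0x004, syn = 0x002, fin = 0x001
}

-- Combine a set of flag names, e.g. { syn = true }, into one 9 bit value.
function pack_flags (set)
   local value = 0
   for name, bit in pairs(flag_bits) do
      if set[name] then value = value + bit end
   end
   return value
end

-- Test whether the flag called name is set in a 9 bit flags value.
function has_flag (value, name)
   return math.floor(value / flag_bits[name]) % 2 == 1
end

local syn_ack = pack_flags({ syn = true, ack = true })
print(string.format("0x%03x", syn_ack))  -- 0x012
```

A value built this way would be what is passed to tcp:flags, while the individual methods such as tcp:syn and tcp:ack address one bit each.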

— Method tcp:checksum payload, length, ip

Computes and sets the “Checksum” field for length bytes of payload and optionally ip. If no argument is given the current value of the “Checksum” field is returned.

UDP (lib.protocol.udp)

The lib.protocol.udp module contains a class for representing UDP headers.

— Method udp:new config

Returns a new UDP header for config. Config must be a table which may contain the following keys:

src_port - “Source Port Number” field (16 bit unsigned integer). Default is 0.
dst_port - “Destination Port Number” field (16 bit unsigned integer). Default is 0.

— Method udp:src_port port

— Method udp:dst_port port

Combined accessor and setter methods for the source and destination port fields. Sets the source or destination port individually. Returns the current port if called without arguments.

— Method udp:length length

Combined accessor and setter method for the “Length” field. Sets the “Length” field to length (a 16 bit unsigned integer). If no argument is given the current value of the “Length” field is returned. Default is 8 (the UDP header length).

— Method udp:checksum payload, length, ip

Computes and sets the “Checksum” field for length bytes of payload and optionally ip. If no argument is given the current value of the “Checksum” field is returned.

GRE (lib.protocol.gre)

The lib.protocol.gre module contains a class for representing GRE headers. The gre protocol class only supports the checksum and key extensions and the lib.protocol.ethernet upper layer protocol.

— Method gre:new config

Returns a new GRE header for config. Config must be a table which may contain the following keys:

protocol - Upper layer protocol. May be 0x6558 (Ethernet). Default is nil.
checksum - Set to true to enable checksumming. Default is false.
key - 32 bit unsigned integer. Enables keying if supplied. Default is nil.

— Method gre:checksum payload, length

Combined accessor and setter method for the checksum field. Computes and sets the checksum field for length bytes of payload. If no argument is given the current checksum is returned. Returns nil if checksumming is disabled.

— Method gre:checksum_check payload, length

Predicate to verify length bytes of payload against the header checksum. Returns nil if checksumming is disabled.

— Method gre:key key

Combined accessor and setter method for the key field. Sets the key field to key. If no argument is given the current key is returned. Returns nil if keying is disabled.

— Method gre:protocol protocol

Combined accessor and setter method for the upper layer protocol. Sets the upper layer protocol to protocol. If no argument is given the current upper layer protocol is returned.

ICMP (lib.protocol.icmp.header)

The lib.protocol.icmp.header module contains a class for representing ICMP headers. The icmp protocol class currently supports two upper layer protocols: lib.protocol.icmp.nd.ns and lib.protocol.icmp.nd.na. These upper layer protocols implement the headers necessary to perform “Neighbor Discovery”.

— Method icmp:new type, code

Returns a new ICMP header of type which may be either 135 or 136 for lib.protocol.icmp.nd.ns or lib.protocol.icmp.nd.na respectively. Optionally code can be supplied to set the “Code” field for the type.

— Method icmp:type type

— Method icmp:code code

Combined accessor and setter methods. These methods set the values of the instance fields (see new) of an ICMP header. If no argument is given the current value is returned.

— Method icmp:checksum payload, length, ipv6

Computes and sets the “Checksum” field for length bytes of payload. If the lower protocol layer is lib.protocol.ipv6 then ipv6 must be set to a true value.

— Method icmp:checksum_check payload, length, ipv6

Predicate to test if the header’s “Checksum” field matches length bytes of payload. If the lower protocol layer is lib.protocol.ipv6 then ipv6 must be set to a true value.

Neighbor Solicitation (lib.protocol.icmp.nd.ns)

— Method ns:new target

Returns a new Neighbor Solicitation header. Target is the IP address used for the “Target Address” field.

— Method ns:target target

Combined accessor and setter method for the “Target Address” field. Sets the “Target Address” field to target. If no argument is given the current value is returned.

— Method ns:target_eq target

Predicate to test if the header’s value in the “Target Address” field is equivalent to target.

Neighbor Advertisement (lib.protocol.icmp.nd.na)

— Method na:new target, router, solicited, override

Returns a new Neighbor Advertisement header. Target is the IP address used for the “Target Address” field. Router, solicited and override can be boolean values to set the “Router”, “Solicited” and “Override” flags respectively. The default for the flags is 0.

— Method na:target target

— Method na:router router

— Method na:solicited solicited

— Method na:override override

Combined accessor and setter methods. These methods set the values of the instance fields (see new) of a Neighbor Advertisement header. If no argument is given the current value is returned.

— Method na:target_eq target

Predicate to test if the header’s value in the “Target Address” field is equivalent to target.

Both Neighbor Solicitation and Advertisement (lib.protocol.icmp.nd.ns and lib.protocol.icmp.nd.na) headers implement an options method for parsing TLV Options contained in their payloads.

Example:

-- Parse datagram with ICMP/NA packet
local na = dgram:parse()
-- Parse TLV Options
local options = na:options(dgram:payload())

— Method nd:options payload, length

Parses and returns an array of TLV Options (see lib.protocol.icmp.nd.options.tlv) from length bytes of payload.

TLV Option (lib.protocol.icmp.nd.options.tlv)

The lib.protocol.icmp.nd.options.tlv module contains a class for representing TLV Options. Currently only two types of options are implemented: “Source Link-Layer Address” ("src_ll_addr") and “Target Link-Layer Address” ("tgt_ll_address"). Both are represented by the lladdr class (see lib.protocol.icmp.nd.options.lladdr).

— Method tlv:new type, data

Returns a new TLV Option object for data of type. Type may be either 1 for “Source Link-Layer Address” or 2 for “Target Link-Layer Address”. Data must be a lladdr object.

— Method tlv:name

Returns a string denoting the type of the option. Either "src_ll_addr" for “Source Link-Layer Address” or "tgt_ll_address" for “Target Link-Layer Address”.

— Method tlv:length

Returns the size of the TLV Option in units of 8 bytes.

— Method tlv:type type

Combined accessor and setter method. Sets the type field (see new) to type. If no argument is given the current value of the type field is returned.

— Method tlv:option

Returns an object of the class denoted by the type field. Currently that only includes lladdr instances.

Link-Layer Address Option (lib.protocol.icmp.nd.options.lladdr)

The lib.protocol.icmp.nd.options.lladdr module contains a class for representing Link-Layer Address Options.

— Method lladdr:new address

Returns a new Link-Layer Address Option object for address, a MAC address in binary representation.

— Method lladdr:name

Returns the string "ll_addr".

— Method lladdr:addr address

Combined accessor and setter method. Sets the address field (see new) to address. If no argument is given the current value of the address field is returned.

Datagram (lib.protocol.datagram)

The lib.protocol.datagram module provides basic mechanisms for parsing, building and manipulating a hierarchy of protocol headers and the associated payload contained in a data packet. In particular, it supports:

Parsing and in-place manipulation of protocol headers in a received packet
In-place decapsulation by removing leading protocol headers
Adding headers to an existing packet
Creation of a new packet
Appending payload to a packet

It mediates between packets as defined in core.packet and protocol classes which are defined as classes derived from the protocol header base class in the lib.protocol.header module.

The contents of a datagram instance are logically divided into three areas: The payload, parsed headers and pushed headers. The datagram payload is a sequence of bytes either inherited from the packet given to datagram:new or appended using datagram:payload. The headers in the payload can be parsed using datagram:parse_match, which will shrink the payload by the header. Finally, synthetic headers can be prepended to the datagram using datagram:push. To get the whole datagram as a packet use datagram:packet.

[Figure: Datagram]

A datagram can be used in two modes of operation, called “immediate commit” and “delayed commit”. In immediate commit mode, the push and pop methods immediately modify the underlying packet. However, this can be undesirable.

Even though the manipulations are relatively fast, because SIMD instructions are used to move and copy data when possible, performance-aware applications usually try to avoid as many of them as possible. This creates a conflict if the caller performs operations to push or parse a sequence of protocol headers in immediate commit mode.

This problem can be avoided by using delayed commit mode. In this mode, the push methods add the data to a separate buffer as intermediate storage. The buffer is prepended to the actual packet in a single operation by calling datagram:commit.

The pop methods are made light-weight in delayed commit mode as well by keeping track of an additional offset that indicates where the actual packet starts in the packet buffer. Each call to one of the pop methods simply increases the offset by the size of the popped piece of data. The accumulated actions will be applied as a single operation by datagram:commit.

The push and pop methods can be freely mixed in delayed commit mode.

Due to the destructive nature of these methods in immediate commit mode, they cannot be applied when the parse stack is not empty, because moving the data in the packet buffer will invalidate the parsed headers. The push and pop methods will raise an error in that case.

The buffer used in delayed commit mode has a fixed size of 512 bytes. This limits the size of data that can be pushed in a single operation. A sequence of push/commit operations can be used to push an arbitrary amount of data in chunks of up to 512 bytes.
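The bookkeeping behind delayed commit mode can be pictured with a small toy model in plain Lua. It is only a sketch under simplifying assumptions: the packet is modelled as a Lua string, and all names (new_datagram, push_raw, pop_raw, commit) are local to this example rather than the methods of lib.protocol.datagram.

```lua
-- Toy model of delayed commit. Illustration only, not Snabb's implementation.
local PUSH_BUFFER_SIZE = 512  -- fixed size of the intermediate push buffer

function new_datagram (packet)
   return { packet = packet, pushed = "", pop_offset = 0 }
end

-- Prepend data to the intermediate buffer; the packet itself is untouched.
function push_raw (d, data)
   assert(#d.pushed + #data <= PUSH_BUFFER_SIZE, "push buffer overflow")
   d.pushed = data .. d.pushed
end

-- Remember that length more bytes are to be dropped from the packet's front.
function pop_raw (d, length)
   d.pop_offset = d.pop_offset + length
end

-- Apply all accumulated pops and pushes to the packet in one step.
function commit (d)
   d.packet = d.pushed .. d.packet:sub(d.pop_offset + 1)
   d.pushed, d.pop_offset = "", 0
end

local d = new_datagram("ETHIPv6payload")
pop_raw(d, 3)          -- drop the 3 byte "ETH" prefix
push_raw(d, "GRE")
push_raw(d, "OUTER")
commit(d)
print(d.packet)        -- OUTERGREIPv6payload
```

The point of the model is that the packet data is moved once, at commit time, regardless of how many push and pop operations were accumulated before.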

— Method datagram:new packet, protocol, options

Creates a datagram for packet or from scratch if packet is nil. Protocol will be used by parse_match to parse the packet payload. If protocol is not nil it is set as the initial upper layer protocol. If options is not nil it must be a table that selects configurable properties of the class. Currently, the only option is the selection of immediate or delayed commit mode by setting the key delayed_commit to false or true, respectively. The default is immediate commit mode.

— Method datagram:push header

Prepends header to the front of the datagram. This method destructively modifies the underlying packet in immediate commit mode and raises an error if the parse stack is not empty.

In delayed commit mode, header is prepended to an intermediate buffer.

— Method datagram:push_raw data, length

This method behaves like the datagram:push method for an arbitrary chunk of memory of length length located at the address pointed to by data.

— Method datagram:parse_match protocol, check

Attempts to parse the next header in the datagram, thereby removing it from the payload. Returns a header instance of class protocol on success. If protocol is nil the current upper layer protocol as set by datagram:new or previous calls to parse_match is used.

If neither protocol nor the upper layer protocol is set or the constructor of the protocol class returns nil, the parsing operation has failed and parse_match returns nil. The datagram remains unchanged.

If the protocol class instance has been created successfully, it is passed as single argument to the anonymous function check.

If check returns a false value, the parsing has failed and parse_match returns nil. The packet remains unchanged.

If check is not supplied or if it returned a true value, the parsing has succeeded and the current upper layer protocol of the datagram is set to the value returned by header:upper_layer.

— Method datagram:parse protocols_and_checks

A wrapper around parse_match that allows parsing of a sequence of headers with a single method call.

If protocols_and_checks is a sequence of protocol class and check function pairs, parse_match is called for each pair. Returns the header object of the last header parsed or nil if any of the calls to parse_match return nil.

If called with a nil argument, this method is equivalent to parse_match called without arguments.

— Method datagram:parse_n n

A wrapper around parse_match that parses the next n protocol headers using the current upper layer protocol and subsequent values of header:upper_layer. It returns the last header object or nil if less than n headers could be parsed successfully.

— Method datagram:unparse n

Undoes the last n calls to parse_match on the datagram, i.e. prepends n parsed headers back to the payload. The sequence of parsed headers can be obtained by calling stack.

— Method datagram:pop n

Removes the leading n parsed headers from the datagram. Note that headers added via push can not be removed using pop. The caller has to ensure that the datagram contains at least n headers that were parsed using parse_match. The sequence of parsed headers can be obtained by calling stack. This method destructively modifies the underlying packet in immediate commit mode and raises an error if the parse stack is not empty.

In delayed commit mode, the packet is not modified and the parse stack remains valid.

For instance, let d be a datagram with an Ethernet header followed by an IPv6 header. Assuming we have parsed both headers using d:parse_n(2), we could call d:pop(1) to decapsulate the IPv6 packet from its Ethernet header.

— Method datagram:pop_raw length, ulp

Removes length bytes from the beginning of the datagram. If ulp is given it is set as the current upper layer protocol. This method destructively modifies the underlying packet in immediate commit mode and raises an error if the parse stack is not empty.

In delayed commit mode, the packet is not modified and the parse stack remains valid.

— Method datagram:stack

Returns the parsed header objects as a sequence.

— Method datagram:packet

Returns a packet (see core.packet) containing the datagram (including pushed headers).

— Method datagram:payload pointer, length

Combined payload accessor and setter method. Returns a pointer to the datagram payload and its byte size.

If pointer and length are supplied then length bytes starting from pointer are appended to the datagram’s payload.

— Method datagram:data

Returns data and length of the underlying packet.

— Method datagram:commit

If called in delayed commit mode, the operations accumulated by the push and pop methods since the creation of the datagram or the last invocation of datagram:commit are committed to the underlying packet. An error is raised if the parse stack is not empty.

The method can be safely called in immediate commit mode.

IPsec

Encapsulating Security Payload (lib.ipsec.esp)

The lib.ipsec.esp module contains two classes, esp_v6_encrypt and esp_v6_decrypt, which implement packet encryption and decryption with IPsec ESP using the AES-GCM-128 cipher in IPv6 transport mode. Packets are encrypted with the key and salt provided to the classes’ constructors. These classes do not implement any key exchange protocol.

The encrypt class accepts IPv6 packets and inserts a new ESP header between the outer IPv6 header and the inner protocol header (e.g. TCP, UDP, L2TPv3) and also encrypts the contents of the inner protocol header. The decrypt class does the reverse: it decrypts the inner protocol header and removes the ESP protocol header.

[Figure: ESP-Transport]

References:

IPsec Wikipedia page.
RFC 4303 on IPsec ESP.
RFC 4106 on using AES-GCM with IPsec ESP.
LISP Data-Plane Confidentiality example of a software layer above these apps that includes key exchange.

— Method esp_v6_encrypt:new config

— Method esp_v6_decrypt:new config

Returns a new encryption/decryption context respectively. Config must be a table with the following keys:

mode - Encryption mode (string). The only accepted value is the string "aes-128-gcm".
spi - A 32 bit integer denoting the “Security Parameters Index” as specified in RFC 4303.
key - Hexadecimal string of 32 digits (two digits for each byte) that denotes a 128-bit AES key as specified in RFC 4106.
salt - Hexadecimal string of eight digits (two digits for each byte) that denotes four bytes of salt as specified in RFC 4106.
window_size - Optional. Minimum width of the window in which out of order packets are accepted as specified in RFC 4303. The default is 128. (esp_v6_decrypt only.)
resync_threshold - Optional. Number of consecutive packets allowed to fail decapsulation before attempting “Re-synchronization” as specified in RFC 4303. The default is 1024. (esp_v6_decrypt only.)
resync_attempts - Optional. Number of attempts to re-synchronize a packet that triggered “Re-synchronization” as specified in RFC 4303. The default is 8. (esp_v6_decrypt only.)
auditing - Optional. A boolean value indicating whether to enable or disable “Auditing” as specified in RFC 4303. The default is nil (no auditing). (esp_v6_decrypt only.)
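The format requirements for the required keys above can be expressed as a small plain-Lua check. This is a sketch for illustration: is_hex and check_esp_config are hypothetical helpers, not part of lib.ipsec.esp, and treating spi as an unsigned value in the range 0 to 2^32-1 is an assumption.

```lua
-- True if s is a string of exactly digits hexadecimal digits.
function is_hex (s, digits)
   return type(s) == "string" and #s == digits and s:match("^%x+$") ~= nil
end

-- Check the required keys of an ESP configuration table.
-- Hypothetical helper; raises an error describing the first problem found.
function check_esp_config (conf)
   assert(conf.mode == "aes-128-gcm", "mode must be \"aes-128-gcm\"")
   assert(type(conf.spi) == "number" and conf.spi % 1 == 0
             and conf.spi >= 0 and conf.spi <= 0xffffffff,
          "spi must be a 32 bit integer")
   assert(is_hex(conf.key, 32), "key must be 32 hexadecimal digits")
   assert(is_hex(conf.salt, 8), "salt must be 8 hexadecimal digits")
   return true
end

-- Example values are made up and must not be used as real key material.
local conf = { mode = "aes-128-gcm",
               spi  = 0x100,
               key  = "00112233445566778899aabbccddeeff",
               salt = "00112233" }
print(check_esp_config(conf))  -- true
```

A table that passes this check has the shape expected by esp_v6_encrypt:new and esp_v6_decrypt:new as described above; the optional keys are not checked here.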

— Method esp_v6_encrypt:encapsulate packet

Encapsulates packet and encrypts its payload. Returns true on success and false otherwise.

— Method esp_v6_decrypt:decapsulate packet

Decapsulates packet and decrypts its payload. Returns true on success and false otherwise.

Snabb NFV

NFV config (program.snabbnfv.nfvconfig)

The program.snabbnfv.nfvconfig module implements a Network Functions Virtualization component based on Snabb. It introduces a simple configuration file format to describe NFV configurations which it then compiles to app networks. This NFV component is compatible with OpenStack Neutron.

[Figure: NFV]

— Function nfvconfig.load file, pci_address, socket_path

Loads the NFV configuration from file and compiles an app network using pci_address and socket_path for the underlying NIC driver and VhostUser apps. Returns the resulting engine configuration.

NFV Configuration Format

The configuration file format understood by program.snabbnfv.nfvconfig is based on Lua expressions. At the top level, a configuration file returns a list of NFV ports:

return { <port-1>, ..., <port-n> }

Each port is defined by a range of properties which correspond to the configuration parameters of the underlying apps (NIC driver, VhostUser, PcapFilter, RateLimiter, nd_light and SimpleKeyedTunnel):

port := { port_id        = <id>,          -- A unique string
          mac_address    = <mac-address>, -- MAC address as a string
          vlan           = <vlan-id>,     -- ..
          ingress_filter = <filter>,       -- A pcap-filter(7) expression
          egress_filter  = <filter>,       -- ..
          tunnel         = <tunnel-conf>,
          crypto         = <crypto-conf>,
          rx_police      = <n>,           -- Allowed input rate in Gbps
          tx_police      = <n> }          -- Allowed output rate in Gbps

The tunnel section deviates a little from SimpleKeyedTunnel’s terminology:

tunnel := { type          = "L2TPv3",     -- The only type (for now)
            local_cookie  = <cookie>,     -- As for SimpleKeyedTunnel
            remote_cookie = <cookie>,     -- ..
            next_hop      = <ip-address>, -- Gateway IP
            local_ip      = <ip-address>, -- ~ `local_address'
            remote_ip     = <ip-address>, -- ~ `remote_address'
            session       = <32bit-int> } -- ~ `session_id'

The crypto section allows configuration of traffic encryption based on apps.ipsec.esp:

crypto := { type          = "esp-aes-128-gcm", -- The only type (for now)
            spi           = <spi>,             -- As for AES128gcm
            transmit_key  = <key>,
            transmit_salt = <salt>,
            receive_key   = <key>,
            receive_salt  = <salt>,
            auditing      = <boolean> }
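Putting the pieces together, a complete configuration file might look like the following sketch. All values are made up for illustration. It assumes that properties a port does not need (here the tunnel section, and the filters or policing on some ports) can simply be omitted, and that the crypto keys and salts use the same hexadecimal string format as described for lib.ipsec.esp.

```lua
-- Hypothetical NFV configuration file with two ports.
return {
   { port_id        = "A",
     mac_address    = "52:54:00:00:00:01",
     vlan           = 100,
     ingress_filter = "ip6 and tcp dst port 443",
     rx_police      = 5,    -- Gbps
     tx_police      = 5 },  -- Gbps
   { port_id        = "B",
     mac_address    = "52:54:00:00:00:02",
     vlan           = 100,
     egress_filter  = "icmp6",
     crypto = { type          = "esp-aes-128-gcm",
                spi           = 0x100,
                -- Example key material only, never reuse it.
                transmit_key  = "00112233445566778899aabbccddeeff",
                transmit_salt = "00112233",
                receive_key   = "ffeeddccbbaa99887766554433221100",
                receive_salt  = "33221100" } },
}
```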

snabbnfv traffic

The snabbnfv traffic program loads and runs an NFV configuration using program.snabbnfv.nfvconfig. It can be invoked like so:

./snabb snabbnfv traffic <file> <pci-address> <socket-path>

snabbnfv traffic runs the loaded configuration indefinitely and automatically reloads the configuration file if it changes (at most once every second).

snabbnfv neutron2snabb

The snabbnfv neutron2snabb program converts Neutron database CSV dumps to the format used by program.snabbnfv.nfvconfig. For more info see Snabb NFV Architecture. It can be invoked like so:

./snabb snabbnfv neutron2snabb <csv-directory> <output-directory> [<hostname>]

snabbnfv neutron2snabb reads the Neutron database CSV dumps in csv-directory and translates them into one program.snabbnfv.nfvconfig configuration file per physical network. If hostname is given, it overrides the hostname provided by hostname(1).

Watchdog (lib.watchdog.watchdog)

The lib.watchdog.watchdog module implements a per-thread watchdog. Its purpose is to watch and kill processes which fail to call the watchdog periodically (e.g. because they hang).

It does so by using alarm(3) and ualarm(3) to have the OS send a SIGALRM to the process after a specified timeout. Because the process does not handle the signal it will be killed and exit with status 142.

— Function watchdog.set milliseconds

Sets the watchdog timeout to milliseconds. Values for milliseconds greater than 1,000 are rounded up to the next full second. For example:

watchdog.set(1100) == watchdog.set(2000)
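The rounding rule can be written down as a small plain-Lua function. This is only a sketch of the behaviour described above; effective_timeout_ms is a hypothetical helper and not part of lib.watchdog.watchdog.

```lua
-- Effective timeout in milliseconds for a requested timeout.
-- Hypothetical helper illustrating the documented rounding rule.
function effective_timeout_ms (milliseconds)
   if milliseconds > 1000 then
      -- Above one second only whole seconds are available: round up.
      return math.ceil(milliseconds / 1000) * 1000
   end
   return milliseconds
end

print(effective_timeout_ms(1100) == effective_timeout_ms(2000))  -- true
```

Timeouts of up to one second are kept as requested, which matches the use of ualarm(3) for short timeouts and alarm(3) for longer ones.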

— Function watchdog.reset

Starts the timeout if the watchdog has not yet been started and resets the timeout otherwise. If the timeout is reached the process will be killed.

— Function watchdog.stop

Disables the timeout.

Snabblab

Servers devoted to the Snabb project and usable by all known developers.

Want to be a known developer? Sure! Just edit the user account list with your user and send a pull request. No fuss.

Guidelines

Feel at home. These servers are here for you to play with and enjoy.
Please run Snabb processes like this: sudo lock ./snabb .... The lock command will automatically wait if somebody else is running a Snabb process on the same machine and that helps us avoid conflicts for access to hardware resources.
Tell luke@snabb.co your email address(es) to get an invitation to the Lab Slack.
Don’t keep precious data on the servers. We might want to reinstall them at short notice.

Servers

Name	Purpose	SSH	Xeon model	NICs
lugano-1	General use	lugano-1.snabb.co	E3 1650v3	2 x 10G (82599), 4 x 10G (X710), 2 x 40G (XL710)
lugano-2	General use	lugano-2.snabb.co	E3 1650v3	2 x 10G (82599), 4 x 10G (X710), 2 x 40G (XL710)
lugano-3	General use	lugano-3.snabb.co	E3 1650v3	2 x 10G (82599), 2 x 100G (ConnectX-4)
lugano-4	General use	lugano-4.snabb.co	E3 1650v3	2 x 10G (82599), 2 x 100G (ConnectX-4)
davos	Continuous Integration tests & driver development	lab1.snabb.co port 2000	2x E5 2603	Diverse 10G/40G: Intel, SolarFlare, Mellanox, Chelsio, Broadcom. Installed upon request.
grindelwald	Snabb NFV testing	lab1.snabb.co port 2010	2x E5 2697v2	12 x 10G (Intel 82599)
interlaken	Haswell/AVX2 testing	lab1.snabb.co port 2030	2x E5 2620v3	12 x 10G (Intel 82599)

Get started

You are welcome to play, test, and develop on the lugano-1 .. lugano-4 servers. Once your account is added you can connect like this:

$ ssh user@lugano-1.snabb.co

and check the PCI devices and their addresses with lspci.

Certain cards (82599 and ConnectX-4) are cabled to themselves. That is, dual-port cards have their ports connected to each other. Certain other cards (X710/XL710) are currently not cabled. If you have special cabling needs then please open an issue on the snabblab-nixos repository.

Using the lab

All servers run the latest stable version of the NixOS Linux distribution.

To quickly install a package:

$ nox <search string>

For other operations such as uninstalling a package, refer to man nix-env.

Questions

If you have any questions or trouble, ask on the #lab channel or open an issue.

Thanks

We are grateful to Silicom for their sponsorship in the form of discounted network cards for chur and to Netgate for giving us jura. Thanks gang!