API Reference for SNF and usage model.
Data Structures | |
struct | snf_ifaddrs |
struct | snf_recv_req |
struct | snf_ring_qinfo |
struct | snf_ring_portinfo |
struct | snf_ring_stats |
Modules | |
Receive-Side Scaling (RSS) | |
Open flags for process-sharing, port aggregation and packet duplication | |
Packet injection | |
Packet reflect to netdev (kernel stack) | |
Macros | |
#define | SNF_VERSION_API 8 |
Typedefs | |
typedef struct snf_handle * | snf_handle_t |
opaque snf device handle | |
typedef struct snf_ring * | snf_ring_t |
opaque snf ring handle | |
Enumerations | |
enum | snf_link_state { SNF_LINK_DOWN = 0, SNF_LINK_UP = 1 } |
enum | snf_timesource_state { SNF_TIMESOURCE_LOCAL = 0, SNF_TIMESOURCE_EXT_UNSYNCED, SNF_TIMESOURCE_EXT_SYNCED, SNF_TIMESOURCE_EXT_FAILED, SNF_TIMESOURCE_ARISTA_ACTIVE, SNF_TIMESOURCE_PPS } |
Functions | |
int | snf_init (uint16_t api_version) |
Initialize Sniffer Library with api_version == SNF_VERSION_API. | |
int | snf_set_app_id (int32_t id) |
Set the application ID. | |
int | snf_getifaddrs (struct snf_ifaddrs **ifaddrs_o) |
void | snf_freeifaddrs (struct snf_ifaddrs *ifaddrs) |
int | snf_getportmask_valid (uint32_t *mask_o, int *cnt_o) |
int | snf_getportmask_linkup (uint32_t *mask_o, int *cnt_o) |
int | snf_open (uint32_t portnum, int num_rings, const struct snf_rss_params *rss_params, int64_t dataring_sz, int flags, snf_handle_t *devhandle) |
Open device for single or multi-ring operation. | |
int | snf_open_defaults (uint32_t portnum, snf_handle_t *devhandle) |
Open device for single or multi-ring operation. | |
int | snf_start (snf_handle_t devhandle) |
int | snf_stop (snf_handle_t devhandle) |
int | snf_get_link_state (snf_handle_t devhandle, enum snf_link_state *state) |
int | snf_get_timesource_state (snf_handle_t devhandle, enum snf_timesource_state *state) |
int | snf_get_link_speed (snf_handle_t devhandle, uint64_t *speed) |
int | snf_close (snf_handle_t devhandle) |
Close port. | |
int | snf_ring_open (snf_handle_t devhandle, snf_ring_t *ringh) |
int | snf_ring_open_id (snf_handle_t devhandle, int ring_id, snf_ring_t *ringh) |
int | snf_ring_close (snf_ring_t ringh) |
int | snf_ring_recv (snf_ring_t ringh, int timeout_ms, struct snf_recv_req *recv_req) |
Receive next packet from a receive ring. | |
int | snf_ring_portinfo_count (snf_ring_t ring, int *count) |
For aggregated rings, return the number of physical subrings. If the ring is not aggregated, the count is set to 1. | |
int | snf_ring_portinfo (snf_ring_t ring, struct snf_ring_portinfo *portinfo) |
Returns information for the ring. For aggregated rings, returns information for each of the physical rings. It is up to the user to make sure they have allocated enough memory to hold the information for all the physical rings in an aggregated ring. | |
int | snf_ring_recv_qinfo (snf_ring_t ring, struct snf_ring_qinfo *) |
Return queue information from ring. | |
int | snf_ring_recv_many (snf_ring_t ring, int timeout_ms, struct snf_recv_req *req_vector, int nreq_in, int *nreq_out, struct snf_ring_qinfo *qinfo) |
Receive and borrow many packets at once. | |
int | snf_ring_return_many (snf_ring_t ring, uint32_t data_qlen, struct snf_ring_qinfo *qinfo) |
Return packet space to receive ring. | |
int | snf_ring_getstats (snf_ring_t ringh, struct snf_ring_stats *stats) |
Get statistics from a receive ring. | |
The Sniffer API model is best summarized in the following steps:
The API promotes a processing model mindful of the multiple cores available on today's systems by enabling packets to be received through multiple receive rings. This feature appears in many 10 Gbit Ethernet drivers as Receive-Side Scaling (RSS), but instead of being a feature internal to the driver, the Sniffer API presents the rings (or queues) as a first-order interface to the user. Instead of internally maintaining multiple rings as part of the driver, the Sniffer API allows users to directly create rings and retain better control over how rings are allocated on a typical multicore system. The flexibility over ring allocation is balanced with a receive function with much stricter semantics, but only to pursue these goals:
The number of requested receive rings is part of the function call to open a device for packet capture. While users retain control over how many rings are to be allocated by the API, the API best matches processing models that allocate one ring per core, where each core both captures packets from Sniffer and processes/analyzes them in place. In-place processing refers to an effort to contain the packet capture and analysis within a CPU core's memory domain, which includes all levels of caches and the RAM closest to the core. Since today's multicore platforms are increasingly complex in the non-uniformity of memory access (NUMA), it is best to minimize the number of memory accesses across memory domains and to maximize the amount of work that can be done by one core within its own memory domain. As such, Sniffer assumes that, for best performance, users recognize the importance of opening rings and maintain strong ties between each ring's single consuming thread and the core where it is scheduled.
When multiple rings are used, users can specify how packets should be partitioned based on a set of RSS flags. These flags instruct how Sniffer should hash incoming packets such that per-flow affinity is maintained in the same ring if so desired. This effectively allows each ring to virtualize the underlying network interface by ensuring that packets within the same flow are deterministically delivered to the same receive ring. Users should not need to reconstruct the ordering of flows even though incoming data is being split across many receive rings.
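To make per-flow affinity concrete, the following is a minimal sketch of how a flow could be mapped deterministically to a ring index. It is not the hash Sniffer uses (that is selected through the RSS flags); struct flow_key and flow_to_ring are hypothetical names, and the mixing constant is an arbitrary choice for illustration.

```c
#include <stdint.h>

/* Hypothetical flow description; not part of the SNF API. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* Map a flow to a ring index in [0, num_rings). num_rings must be > 0.
 * XOR-ing source with destination makes the result symmetric, so both
 * directions of a connection map to the same ring. */
static unsigned flow_to_ring(const struct flow_key *k, unsigned num_rings)
{
    uint32_t h = k->src_ip ^ k->dst_ip;
    h ^= (uint32_t)(k->src_port ^ k->dst_port);
    h *= 2654435761u;   /* multiplicative mixing, arbitrary constant */
    h ^= h >> 16;
    return h % num_rings;
}
```

Because the index depends only on the flow's addresses and ports, every packet of a flow reaches the same consumer, which is why per-flow ordering does not need to be reconstructed across rings.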
When opening a device with one or multiple rings, users can also specify the combined amount of memory that can be consumed by all rings to store packet data. If left unspecified, the implementation will default to choosing a relatively conservative amount of memory, assuming that consumers can process incoming packets at line rate. Resizing the ring should be considered only to address specific buffering concerns and when multiple rings are not possible. It is possible that larger rings are necessary to mitigate the non-real-time behavior of some of the supported operating systems. Larger data rings will only serve as temporary relief for users that cannot consume incoming data at line rate. If the application does not return packets to the data ring as fast as the packets are coming in, the receive ring will eventually overflow and packets will be dropped.
The API tries to minimize software overhead in as many areas as can be addressed directly by the library. This includes duplicating some internal data structures to prevent false sharing between multiple ring consumers. More important, however, is the explicit decision not to provide any locking for the main receive function (snf_ring_recv). Even light forms of locking can impact processing rates when incoming data rates are roughly 15 Mpps (one packet roughly every 67 nanoseconds). Instead, the API promotes the use of multiple rings to improve concurrency in packet processing and leave more breathing room for per-packet analysis on each consuming core.
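The packet rate quoted above follows from Ethernet framing arithmetic. The sketch below works it out, assuming the standard 20 bytes of per-frame wire overhead (preamble and start delimiter plus inter-frame gap); the function names are illustrative and not part of the SNF API.

```c
#include <stdint.h>

/* Per-frame overhead on the wire that is not part of the frame itself:
 * 8 bytes of preamble/start delimiter plus a 12-byte inter-frame gap. */
#define ETH_WIRE_OVERHEAD_BYTES 20

/* Maximum packets per second for a given link speed and frame size.
 * frame_bytes includes the 4-byte FCS (64 for a minimum-size frame). */
static double max_packets_per_sec(uint64_t link_bps, unsigned frame_bytes)
{
    double bits_per_frame = 8.0 * (frame_bytes + ETH_WIRE_OVERHEAD_BYTES);
    return (double)link_bps / bits_per_frame;
}

/* Time budget per packet, in nanoseconds, at that rate. */
static double ns_per_packet(uint64_t link_bps, unsigned frame_bytes)
{
    return 1e9 / max_packets_per_sec(link_bps, frame_bytes);
}
```

For a 10 Gbit/s link carrying minimum-size 64-byte frames this gives about 14.88 Mpps, or a budget of roughly 67 ns per packet on a single consumer.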
Users should consult the API reference for information on thread safety, as each function extensively documents how it can be used in threaded environments. While most API functions provide strong thread safety guarantees, the main receive function (snf_ring_recv) is specifically not thread-safe for the software overhead reasons explained in API Software Overhead. The expectation is that multiple threads would each open their own receive rings and independently consume packets on their own rings. Specifically, the Sniffer implementation assumes that each successive call to snf_ring_recv on a ring means that the packet returned by the previous receive on that ring has been consumed. This should not come as a surprise for users that have been exposed to the serialization present in libpcap-type interfaces. However, this may not match up well with users who expect to separate their computing resources (i.e. cores) between capture and analysis. These users can always resort to using a single receive ring.
Each packet carries a timestamp of its arrival as seen from the NIC. The 64-bit nanoseconds-since-epoch timestamp is returned to the user in the receive call. It should be straightforward for the user to convert nanoseconds to a struct timeval if needed. There are three timestamping modes: Host, Timesource (Hardware) and Arista. In Host timestamping mode, the timestamp returned to the user in the receive call is normalized to host time in nanoseconds, and Sniffer internally ensures that the NIC and host clocks remain synchronized at a regular interval. The frequency of the NIC's clock is 2MHz plus/minus 100ppm. Timesource (Hardware) mode is available with 10G-PCIE2-8C2-2S-SYNC adapters when connected to an IRIG-B time source. Arista mode is available when the NIC port is connected to an Arista 71xx series switch port with "FCS append mode" timestamping enabled and the mode is also enabled in Sniffer10G with the MYRI_ARISTA_ENABLE_TIMESTAMPING=1 environment variable.
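The conversion from a 64-bit nanosecond timestamp to a struct timeval can be sketched as follows. The helper name ns_to_timeval is hypothetical and not part of the SNF API; sub-microsecond precision is truncated because struct timeval only holds microseconds.

```c
#include <stdint.h>
#include <sys/time.h>

/* Split a nanoseconds-since-epoch value into seconds and microseconds. */
static struct timeval ns_to_timeval(uint64_t ns)
{
    struct timeval tv;
    tv.tv_sec  = (time_t)(ns / 1000000000ULL);
    tv.tv_usec = (suseconds_t)((ns % 1000000000ULL) / 1000ULL);
    return tv;
}
```

Applications that need the full nanosecond resolution may prefer to keep the original 64-bit value alongside the converted one.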
To use host memory efficiently, data rings are allocated when the device is opened as a pool of 4KB pages. At open time, users can specify the number of data rings that are allocated as well as the total data_ring_size that can be allocated for all rings. Internally, the Sniffer library requires an additional 1/32nd of the data_ring_size for each ring and some additional inter-ring synchronization memory. In summary, Sniffer allocates at most an additional 10% of the data_ring_size internally.
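For capacity planning, the 10% figure can be turned into a simple upper bound. This is only a sketch based on that summary sentence; it does not model the per-ring 1/32nd term or the synchronization memory separately, and total_mem_upper_bound is a hypothetical helper name.

```c
#include <stdint.h>

/* Upper bound, in bytes, on the memory consumed for a requested
 * data_ring_size, using the "at most 10% extra" summary. */
static uint64_t total_mem_upper_bound(uint64_t data_ring_size)
{
    uint64_t extra = (data_ring_size + 9) / 10;   /* 10%, rounded up */
    return data_ring_size + extra;
}
```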
At any time, irrespective of whether the underlying device is enabled for Sniffer, users can configure a Sniffer-capable device as a regular Ethernet device (typically via ifconfig). Unless the device is opened for capture, both send and receive functionality work as expected from any Ethernet driver. When the device is opened for capture, packets usually destined to the Ethernet driver are delivered to a Sniffer receive ring instead. However, users can still rely on the Ethernet driver's raw sockets interface to send packets (a sample raw socket send enabled with Sniffer receive calls is provided as a test).
#define SNF_VERSION_API 8 |
SNF API version number (16 bits). The least significant byte increases for minor backwards-compatible changes in the API. The most significant byte increases for incompatible changes in the API.
0x0008: Add link speed support.
0x0007: Add Multiple Application support and snf_set_app_id() function.
0x0006: Internal driver/library API changes.
0x0005: Internal driver/library API changes.
0x0004: Add more injection support and aggregate port opens.
0x0003: Add injection support and 3 send counters in statistics.
0x0002: Add nic_bytes_recv counter to stats to help users calculate the amount of bandwidth that is actually going through the NIC port.
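The byte layout of the version number can be decoded as sketched below. The helper names are hypothetical, and treating "same most significant byte" as the compatibility test is an interpretation of the description above rather than a documented library rule.

```c
#include <stdint.h>

/* Most significant byte: increases for incompatible API changes. */
static unsigned api_major(uint16_t version) { return version >> 8; }

/* Least significant byte: increases for backwards-compatible changes. */
static unsigned api_minor(uint16_t version) { return version & 0xffu; }

/* Nonzero when two versions share the same major byte. */
static int api_same_major(uint16_t a, uint16_t b)
{
    return api_major(a) == api_major(b);
}
```

With SNF_VERSION_API defined as 8, the major byte is 0 and the minor byte is 8.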
typedef struct snf_handle* snf_handle_t |
opaque snf device handle
Opaque snf handle, allocated at snf_open time when a device can be successfully opened.
typedef struct snf_ring* snf_ring_t |
opaque snf ring handle
Opaque snf ring handle, allocated at snf_ring_open time when a ring can be successfully opened. The ring itself can represent a single physical ring or an aggregate of physical rings.
enum snf_link_state |
Link state enumeration, returned by snf_get_link_state
enum snf_timesource_state |
Timesource state (for -SYNC NICs), returned by snf_get_timesource_state
int snf_close | ( | snf_handle_t | devhandle | ) |
Close port.
This function can be called once all opened rings (if any) are closed through snf_ring_close. Once a port is determined to be closable, it is implicitly stopped, as if a call had previously been made to snf_stop.
0 | Successful. |
EBUSY | Some rings are still opened and the port cannot be closed (yet). |
void snf_freeifaddrs | ( | struct snf_ifaddrs * | ifaddrs | ) |
Free the library-allocated list returned by snf_getifaddrs.
ifaddrs | Pointer to ifaddrs allocated via snf_getifaddrs |
int snf_get_link_speed | ( | snf_handle_t | devhandle, |
uint64_t * | speed | ||
) |
Get link speed on opened handle
devhandle | Device handle |
speed | Returns speed in bits-per-second for the link |
int snf_get_link_state | ( | snf_handle_t | devhandle, |
enum snf_link_state * | state | ||
) |
Get link status on opened handle
devhandle | Device handle |
state | Returns one of SNF_LINK_DOWN or SNF_LINK_UP |
int snf_get_timesource_state | ( | snf_handle_t | devhandle, |
enum snf_timesource_state * | state | ||
) |
Get Timesource information from opened handle
devhandle | Device handle |
state | Returns one of snf_timesource_state |
int snf_getifaddrs | ( | struct snf_ifaddrs ** | ifaddrs_o | ) |
Get a list of Sniffer-capable ethernet devices.
ifaddrs_o | Library-allocated list of Sniffer-capable devices |
int snf_getportmask_linkup | ( | uint32_t * | mask_o, |
int * | cnt_o | ||
) |
Get a mask of all Sniffer-capable ports that have their link state set to UP
The least significant bit represents port 0.
Similar to snf_getportmask_valid except that only ports with an active link are set in the mask.
mask_o | bitmask set at output |
cnt_o | Number of bits set in bitmask |
0 | Successful. |
ENODEV | Error obtaining port information |
int snf_getportmask_valid | ( | uint32_t * | mask_o, |
int * | cnt_o | ||
) |
Get a mask of all Sniffer-capable ports.
The least significant bit represents port 0.
mask_o | bitmask set at output |
cnt_o | Number of bits set in bitmask |
0 | Successful. |
ENODEV | Error obtaining port information |
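Since the least significant bit represents port 0, a returned mask can be expanded into individual port numbers as sketched here. The function portmask_to_ports is a hypothetical helper, not part of the SNF API; it operates on a plain uint32_t such as the one written to mask_o.

```c
#include <stdint.h>

/* Fill ports[] with the port numbers whose bits are set in mask.
 * Returns the number of entries written (at most max_ports). */
static int portmask_to_ports(uint32_t mask, int *ports, int max_ports)
{
    int n = 0;
    for (int bit = 0; bit < 32 && n < max_ports; bit++) {
        if (mask & (UINT32_C(1) << bit))
            ports[n++] = bit;
    }
    return n;
}
```

When max_ports is at least 32, the return value matches the count reported through cnt_o for the same mask.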
int snf_init | ( | uint16_t | api_version | ) |
Initialize Sniffer Library with api_version == SNF_VERSION_API.
Initializes the sniffer library.
api_version | Must always be SNF_VERSION_API |
int snf_open | ( | uint32_t | portnum, |
int | num_rings, | ||
const struct snf_rss_params * | rss_params, | ||
int64_t | dataring_sz, | ||
int | flags, | ||
snf_handle_t * | devhandle | ||
) |
Open device for single or multi-ring operation.
Opens a port for sniffing and allocates a device handle.
portnum | Port numbers can be interpreted as integers for a specific port number or as a mask when SNF_F_AGGREGATE_PORTMASK is specified in flags. Port information can be obtained through snf_getifaddrs and active/valid masks are available with snf_getportmask_valid and snf_getportmask_linkup. As a special case, if portnum -1 is passed, the library will internally open a portmask as if snf_getportmask_valid was called. |
num_rings | Number of rings to allocate for the receive-side scaling feature, which determines how many different threads can open their own ring via snf_ring_open(). If set to zero or less, the default value is used unless SNF_NUM_RINGS is set in the environment. |
rss_params | Points to a user-initialized structure that selects the RSS mechanism to apply to each incoming packet. This parameter is only meaningful if more than one ring is to be opened. By default, if users pass a NULL value, the implementation will select its own mechanism to divide incoming packets across rings. RSS parameters are documented in Receive-Side Scaling (RSS). |
dataring_sz | Represents the total amount of memory to be used to store incoming packet data for all rings to be opened. If the value is set to 0 or less than 0, the library tries to choose a sensible default unless SNF_DATARING_SIZE is set in the environment. The value can be specified in megabytes (if it is less than 1048576) or is otherwise considered to be in bytes. In either case, the library may slightly adjust the user's request to satisfy alignment requirements (typically 2MB boundaries). |
flags | A mask of flags documented in Open flags for process-sharing, port aggregation and packet duplication. |
devhandle | Device handle allocated if the call is successful |
0 | Successful. The port is opened and a valid devhandle is allocated (see remarks) |
EBUSY | Device is already opened |
EINVAL | Invalid argument passed, most probably num_rings (if not, check syslog) |
E2BIG | Driver could not allocate requested dataring_sz (check syslog) |
ENOMEM | Either library or driver did not have enough memory to allocate handle descriptors (but not data ring). |
ENODEV | Device portnum can't be opened |
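The megabytes-or-bytes rule for dataring_sz can be sketched as below. Both helpers are hypothetical and only mirror the parameter description: a megabyte is assumed to be 1048576 bytes, and rounding up (rather than down) to the 2MB boundary is also an assumption, since the description only says the library may adjust the request.

```c
#include <stdint.h>

#define MIB INT64_C(1048576)

/* Interpret a dataring_sz request in bytes. Values below 1048576 are
 * taken as megabytes, larger values as bytes. Returns 0 for requests
 * of zero or less, meaning "let the library pick its default". */
static int64_t dataring_request_bytes(int64_t dataring_sz)
{
    if (dataring_sz <= 0)
        return 0;
    if (dataring_sz < MIB)
        return dataring_sz * MIB;
    return dataring_sz;
}

/* Round a byte count up to a 2MB boundary (assumed rounding direction). */
static int64_t align_up_2mb(int64_t bytes)
{
    const int64_t align = 2 * MIB;
    return (bytes + align - 1) / align * align;
}
```

Under this reading, a request of 256 and a request of 268435456 describe the same amount of memory.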
int snf_open_defaults | ( | uint32_t | portnum, |
snf_handle_t * | devhandle | ||
) |
Open device for single or multi-ring operation.
Opens a port for sniffing and allocates a device handle using system defaults.
This function is a simplified version of snf_open and ensures that the resulting device is opened according to system defaults. Since the number of rings and flags can be set by module parameters, some installations may prefer to control device-level parameters in a system-wide configuration and keep the library calls simple.
This call is equivalent to
portnum | Ports are numbered from 0 to N-1 where 'N' is the number of Myricom ports available on the system. snf_getifaddrs() may be a useful utility to retrieve the port number by interface name or mac address if there are multiple |
devhandle | Device handle allocated if the call is successful |
int snf_ring_close | ( | snf_ring_t | ringh | ) |
Close a ring
This function is used to inform the underlying device that no further calls to snf_ring_recv will be made. If the device is not subsequently closed (snf_close), all packets that would have been delivered to this ring are dropped. Also, by calling this function, users confirm that all packet processing for packets obtained on this ring via snf_ring_recv is complete.
ringh | Ring handle |
0 | Successful. |
int snf_ring_getstats | ( | snf_ring_t | ringh, |
struct snf_ring_stats * | stats | ||
) |
Get statistics from a receive ring.
ringh | Ring handle |
stats | User-provided pointer to a statistics structure snf_ring_stats, filled in by the library. |
int snf_ring_open | ( | snf_handle_t | devhandle, |
snf_ring_t * | ringh | ||
) |
Opens the next available ring
devhandle | Device handle, obtained from a successful call to snf_open |
ringh | Ring handle allocated if the call is successful. |
0 | Successful. The ring is opened and ringh contains the ring handle. |
EBUSY | Too many rings already opened |
int snf_ring_open_id | ( | snf_handle_t | devhandle, |
int | ring_id, | ||
snf_ring_t * | ringh | ||
) |
Opens a ring from an opened port.
devhandle | Device handle, obtained from a successful call to snf_open |
ring_id | Ring number to open, from 0 to num_rings - 1. If the value is -1, this function behaves as if snf_ring_open was called. |
ringh | Ring handle allocated if the call is successful. |
0 | Successful. The ring is opened and ringh contains the ring handle. |
EBUSY | If ring_id == -1, too many rings are already opened. If ring_id >= 0, that ring is already opened. |
int snf_ring_portinfo | ( | snf_ring_t | ring, |
struct snf_ring_portinfo * | portinfo | ||
) |
Returns information for the ring. For aggregated rings, returns information for each of the physical rings. It is up to the user to make sure they have allocated enough memory to hold the information for all the physical rings in an aggregated ring.
ring | Ring handle (from snf_ring_open) |
portinfo | Pointer to memory allocated by the user that will be filled in with the information. |
int snf_ring_recv | ( | snf_ring_t | ringh, |
int | timeout_ms, | ||
struct snf_recv_req * | recv_req | ||
) |
Receive next packet from a receive ring.
This function is used to return the next available packet in a receive ring. The function can block indefinitely, for a specific timeout or be used as a non-blocking call with a timeout of 0.
ringh | Ring handle (from snf_ring_open) |
timeout_ms | Receive timeout to control how the function blocks for the next packet. If the value is less than 0, the function can block indefinitely. If the value is 0, the function is guaranteed to never enter a blocking state and returns EAGAIN unless there is a packet waiting. If the value is greater than 0, the caller indicates a desired wait time in milliseconds. With a non-zero wait time, the function only blocks if there are no outstanding packets. If the timeout expires before a packet can be received, the function returns EAGAIN (and not ETIMEDOUT). In all cases, users should expect that the function may return EINTR as the result of signal delivery. |
recv_req | Receive Packet structure, only updated when the function returns 0 for a successful packet receive (snf_recv_req) |
0 | Successful packet delivery, recv_req is updated with packet information. |
EINTR | The call was interrupted by a signal handler |
EAGAIN | No packets available (only when timeout is >= 0). |
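A receive loop typically branches on these three return values. The sketch below shows one way to classify them without depending on the SNF headers; the enum and function names are hypothetical, and treating any other value as a failure is an assumption.

```c
#include <errno.h>

/* Hypothetical actions a receive loop might take. */
enum recv_action {
    RECV_PROCESS,       /* 0: recv_req holds a packet */
    RECV_POLL_AGAIN,    /* EAGAIN: no packet yet, or the timeout expired */
    RECV_CHECK_SIGNAL,  /* EINTR: interrupted, check for a shutdown request */
    RECV_FAIL           /* anything else */
};

/* Classify the value returned by a receive call. */
static enum recv_action classify_recv_rc(int rc)
{
    switch (rc) {
    case 0:      return RECV_PROCESS;
    case EAGAIN: return RECV_POLL_AGAIN;
    case EINTR:  return RECV_CHECK_SIGNAL;
    default:     return RECV_FAIL;
    }
}
```

Note that an expired timeout is reported as EAGAIN, so a loop that wants to distinguish "timed out" from "nothing pending" has to track elapsed time itself.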
int snf_ring_recv_many | ( | snf_ring_t | ring, |
int | timeout_ms, | ||
struct snf_recv_req * | req_vector, | ||
int | nreq_in, | ||
int * | nreq_out, | ||
struct snf_ring_qinfo * | qinfo | ||
) |
Receive and borrow many packets at once.
This function allows callers to receive one or more packets per call. Contrary to snf_ring_recv, this function assumes that callers will split the functionality to receive packets (or borrow them) and the functionality to return packets through snf_ring_return_many.
ring | Ring handle (from snf_ring_open) |
timeout_ms | Receive timeout to control how the function blocks for the next packet. See complete documentation in snf_ring_recv. |
req_vector | Vector of receive packet structures provided by the user and only updated when a packet is received. |
nreq_in | Number of receive packet structures provided in req_vector. No more than nreq_in packets can be received. |
nreq_out | Output value for the number of packets actually received and updated in req_vector |
qinfo | If non-NULL, the qinfo structure is updated before the function returns 0 or EAGAIN (the structure is not updated for other error conditions). See the snf_ring_return_many documentation for examples. |
int snf_ring_return_many | ( | snf_ring_t | ring, |
uint32_t | data_qlen, | ||
struct snf_ring_qinfo * | qinfo | ||
) |
Return packet space to receive ring.
Under the borrow-many-return-many receive model, it is up to the user to return space in the receive ring. The user achieves this by accumulating packet lengths from the length_data parameter from each packet received and returning the space through this function call.
ring | Ring handle (from snf_ring_open) |
data_qlen | Amount of data returned by previously consumed packets. As a special case, if the value -1 is provided, all data previously borrowed through snf_ring_recv_many will be returned. |
qinfo | If non-NULL, the qinfo structure is updated before the function returns. |
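The accumulation step of the borrow-many-return-many model can be sketched as follows. To keep the example independent of the SNF headers, struct borrowed_pkt is a hypothetical stand-in that carries only the length_data value of each received packet; total_data_qlen is likewise an illustrative name.

```c
#include <stdint.h>

/* Hypothetical stand-in holding the length_data value of one packet. */
struct borrowed_pkt {
    uint32_t length_data;
};

/* Sum the ring space occupied by n consumed packets. */
static uint32_t total_data_qlen(const struct borrowed_pkt *pkts, int n)
{
    uint32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += pkts[i].length_data;
    return sum;
}
```

The resulting sum is the kind of value that would be passed as data_qlen once the corresponding packets have been fully processed.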
int snf_set_app_id | ( | int32_t | id | ) |
Set the application ID.
Sets the application ID.
The user may set the application ID after the call to snf_init, but before snf_open. When the application ID is set, Sniffer duplicates receive packets to multiple applications. Each application must have a unique ID. Then, each application may utilize a different number of rings. The application can be a process with multiple rings and threads. In this case all rings have the same ID. Or, multiple processes may share the same application ID.
The user may store the application ID in the environment variable SNF_APP_ID instead of calling this function. Both actions have the same effect, but if both are used, SNF_APP_ID overrides the ID set via snf_set_app_id.
The user may not run a mix of processes with valid application IDs (not -1) and processes with no IDs (-1). Either all processes have valid IDs or none of them do.
id | A 32-bit signed integer representing the application ID. A valid ID is any value except -1. -1 is reserved and represents "no ID". |
0 | Successful |
EINVAL | snf_init has not been called or id is -1. |
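The precedence between the environment variable and the function call can be sketched as below. This is an illustration of the documented override rule only: resolve_app_id is a hypothetical helper, and parsing the variable as a decimal integer (and ignoring unparsable values) is an assumption about behavior the description does not specify.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Return the ID expected to take effect: the environment value when it
 * parses as a 32-bit decimal integer, otherwise the ID given by the call.
 * env_value is typically the result of getenv("SNF_APP_ID"). */
static int32_t resolve_app_id(const char *env_value, int32_t id_from_call)
{
    if (env_value != NULL && *env_value != '\0') {
        char *end = NULL;
        errno = 0;
        long v = strtol(env_value, &end, 10);
        if (errno == 0 && *end == '\0' && v >= INT32_MIN && v <= INT32_MAX)
            return (int32_t)v;
    }
    return id_from_call;
}
```

A process could log the value of resolve_app_id(getenv("SNF_APP_ID"), id) at startup to make the effective ID visible when several applications share a port.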
int snf_start | ( | snf_handle_t | devhandle | ) |
int snf_stop | ( | snf_handle_t | devhandle | ) |
Stop packet capture on a port. This function should be used carefully in multi-process mode, as a single stop command stops packet capture on all rings. It is usually best to simply call snf_ring_close on a ring to stop capture on that ring.
devhandle | Device handle |