Priority Flow Control – PFC and Lossless Traffic

Table of Contents

Design Notes

CONFIG_DB Tables

Configure PFC

Asymmetric PFC

TC (Traffic Class) to PG (Priority Group)

Design Notes

Priority flow control (PFC) provides an enhancement to the existing pause mechanism in Ethernet. The current Ethernet pause option stops all traffic on a link. PFC creates eight separate virtual links on the physical link and allows any of these links to be paused and restarted independently, enabling the network to create a lossless class of service for an individual virtual link.
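
To make the "eight virtual links" idea concrete, the sketch below builds the bytes of a PFC frame as laid out in IEEE 802.1Qbb: a MAC Control frame (EtherType 0x8808, opcode 0x0101) carrying a priority-enable vector and eight per-priority pause timers. The function name `build_pfc_frame` and its arguments are illustrative only; nothing here is part of SONiC, which leaves frame generation to the switch ASIC.

```python
import struct

PFC_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PFC_OPCODE = 0x0101


def build_pfc_frame(src_mac: bytes, pause_quanta: dict) -> bytes:
    """Return a PFC frame (without FCS) for the given priorities.

    pause_quanta maps a priority (0-7) to a pause time in quanta (0-65535).
    A time of 0 asks the peer to resume that priority.
    """
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    enable_vector = 0
    timers = [0] * 8
    for prio, quanta in pause_quanta.items():
        if not 0 <= prio <= 7:
            raise ValueError(f"invalid priority {prio}")
        if not 0 <= quanta <= 0xFFFF:
            raise ValueError(f"invalid pause time {quanta}")
        enable_vector |= 1 << prio  # mark this priority's timer as valid
        timers[prio] = quanta
    payload = struct.pack("!HH8H", PFC_OPCODE, enable_vector, *timers)
    header = PFC_DST_MAC + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    # Pad to the 60-byte minimum Ethernet frame size (FCS excluded).
    return (header + payload).ljust(60, b"\x00")


# Pause priority 3 for the maximum time; all other priorities keep flowing.
frame = build_pfc_frame(bytes(6), {3: 0xFFFF})
```

Only the bit for priority 3 is set in the enable vector, so the peer pauses that one class while the other seven are unaffected.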

PFC oversees:

  • Mapping of an incoming packet to a priority according to its DSCP value or IEEE priority.
  • Mapping of the priority to an egress queue.
  • Sending a PFC frame for a lossless priority if it runs out of headroom.
  • Pausing the egress queue on receiving a PFC frame.

With PFC enabled, the switch can forward packets of the lossless priorities without drops.

These operations depend on the mappings among queues, priorities, and DSCP values, which are defined below.
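
The chain of lookups described above can be sketched as plain dictionary lookups. The TC-to-PG values are the AZURE defaults shown later in this document; the DSCP values, the identity TC-to-queue map, the fallback to traffic class 0, and the helper name `classify` are assumptions made for illustration only.

```python
# Hypothetical DSCP entries; real DSCP_TO_TC_MAP contents are platform-specific.
DSCP_TO_TC = {"3": "3", "4": "4", "8": "0"}
# Assumed identity mapping from traffic class to egress queue.
TC_TO_QUEUE = {str(tc): str(tc) for tc in range(8)}
# AZURE defaults for TC_TO_PRIORITY_GROUP_MAP, as listed in this document.
TC_TO_PG = {"0": "0", "1": "0", "2": "0", "3": "3",
            "4": "4", "5": "0", "6": "0", "7": "7"}
# Priorities listed in pfc_enable for the port.
PFC_ENABLE = {"3", "4"}


def classify(dscp: int) -> dict:
    """Follow a packet from its DSCP value to traffic class, queue, and PG."""
    # Assumption: an unmapped DSCP value falls back to traffic class 0.
    tc = DSCP_TO_TC.get(str(dscp), "0")
    queue = TC_TO_QUEUE[tc]
    pg = TC_TO_PG[tc]
    return {
        "tc": int(tc),
        "queue": int(queue),
        "pg": int(pg),
        # Treated as lossless when the PG number is a PFC-enabled priority.
        "lossless": pg in PFC_ENABLE,
    }


print(classify(3))
print(classify(8))
```

A packet marked DSCP 3 ends up in a lossless priority group, so congestion on it can trigger a PFC frame; a packet marked DSCP 8 shares the lossy group 0.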

Example models and SONiC version:

  • Aurora 830, Aurora 721, Aurora 621, Aurora 221
  • Netberg SONiC: sonic-broadcom-202311-n0

Restrictions:

  • Aurora 221 does not support PFC.

CONFIG_DB Tables

PFC uses the following tables in CONFIG_DB:

 

Table: Description

  • TC_TO_QUEUE_MAP: In the egress direction, maps the traffic class to an egress queue.
  • DSCP_TO_TC_MAP: In the ingress direction, maps the DSCP field of an IP packet to a traffic class. Together with TC_TO_QUEUE_MAP, this table maps a packet from its DSCP field to an egress queue. If the egress queue is congested, the back-pressure mechanism is triggered and a PFC frame can be sent. The priority in the PFC frame is determined by the TC_TO_PRIORITY_GROUP_MAP table.
  • MAP_PFC_PRIORITY_TO_QUEUE: Maps the priority in received PFC frames to an egress queue so that the switch knows which egress queue to resume or pause.
  • PFC_PRIORITY_TO_PRIORITY_GROUP_MAP: Maps the PFC priority to a priority group.
  • TC_TO_PRIORITY_GROUP_MAP: Maps the traffic class to a priority group and enables the switch to set the PFC priority in xon/xoff frames in order to resume or pause the corresponding egress queue at the link peer.
  • PORT_QOS_MAP: Defines the mappings adopted by a port, including:
      – tc_to_queue_map, default value is TC_TO_QUEUE_MAP/AZURE
      – dscp_to_tc_map, default value is DSCP_TO_TC_MAP/AZURE
      – pfc_to_queue_map, default value is MAP_PFC_PRIORITY_TO_QUEUE/AZURE
      – pfc_to_pg_map, default value is PFC_PRIORITY_TO_PRIORITY_GROUP_MAP/AZURE
      – tc_to_pg_map, default value is TC_TO_PRIORITY_GROUP_MAP/AZURE
      – pfc_enable, specifies the queues on which PFC is enabled
      – pfcwd_sw_enable, specifies the queue(s) on which the PFC watchdog is enabled
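
Because each PORT_QOS_MAP field names an entry in one of the tables above, a quick consistency check can catch a port that points at a map that does not exist. The sketch below assumes the configuration has been loaded into a Python dict and that references are plain names such as "AZURE", as in the samples in this document; `find_dangling_refs` is a hypothetical helper, not a SONiC utility.

```python
# PORT_QOS_MAP field -> CONFIG_DB table holding the referenced map.
FIELD_TO_TABLE = {
    "tc_to_queue_map": "TC_TO_QUEUE_MAP",
    "dscp_to_tc_map": "DSCP_TO_TC_MAP",
    "pfc_to_queue_map": "MAP_PFC_PRIORITY_TO_QUEUE",
    "pfc_to_pg_map": "PFC_PRIORITY_TO_PRIORITY_GROUP_MAP",
    "tc_to_pg_map": "TC_TO_PRIORITY_GROUP_MAP",
}


def find_dangling_refs(config: dict) -> list:
    """Return (port, field, name) for every reference to a missing map."""
    problems = []
    for port, fields in config.get("PORT_QOS_MAP", {}).items():
        for field, table in FIELD_TO_TABLE.items():
            name = fields.get(field)
            # A field that is absent is not an error; a name that does not
            # exist in its table is.
            if name is not None and name not in config.get(table, {}):
                problems.append((port, field, name))
    return problems


sample = {
    "TC_TO_QUEUE_MAP": {"AZURE": {}},
    "TC_TO_PRIORITY_GROUP_MAP": {"AZURE": {}},
    "PORT_QOS_MAP": {
        "Ethernet0": {"tc_to_queue_map": "AZURE", "tc_to_pg_map": "AZURE"},
        "Ethernet1": {"tc_to_queue_map": "AZURE", "tc_to_pg_map": "DEFAULT"},
    },
}
print(find_dangling_refs(sample))
```

In this sample, Ethernet1 refers to a "DEFAULT" TC-to-PG map that has not been defined, so it is the only entry reported.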

Configure PFC

All the mapping tables are simple integer-to-integer mappings.

Load the buffer configuration as described in the Buffer Settings article. It contains default QoS profiles with all the necessary mappings.

Check the Buffer Pool settings:

admin@sonic:~$ show buffer configuration
Pool: egress_lossless_pool
----  --------
mode  dynamic
size  16189824
type  egress
----  --------
Pool: egress_lossy_pool
----  --------
mode  dynamic
size  16189824
type  egress
----  --------
Pool: ingress_lossless_pool
----  --------
mode  dynamic
size  2940941
type  ingress
xoff  14704704
----  --------
Pool: ingress_lossy_pool
----  --------
mode  dynamic
size  15031475
type  ingress
----  --------
Profile: egress_lossless_profile
----------  --------------------
dynamic_th  1
pool        egress_lossless_pool
size        0
----------  --------------------
Profile: egress_lossy_profile
----------  -----------------
dynamic_th  1
pool        egress_lossy_pool
size        0
----------  -----------------
Profile: ingress_lossless_profile
----------  ---------------------
dynamic_th  1
pool        ingress_lossless_pool
size        0
xoff        244706
xon         1746919
xon_offset  18432
----------  ---------------------
Profile: ingress_lossy_profile
----------  ------------------
dynamic_th  1
pool        ingress_lossy_pool
size        0
----------  ------------------

Check the Priority Groups in /etc/sonic/config_db.json:

"BUFFER_PG": {
    "Ethernet0,Ethernet1,Ethernet10,Ethernet11,Ethernet12,Ethernet13,...,Ethernet68,Ethernet7,Ethernet8,Ethernet9|0": {
        "profile": "ingress_lossy_profile"
    },
    "Ethernet0,Ethernet1,Ethernet10,Ethernet11,Ethernet12,Ethernet13,...,Ethernet68,Ethernet7,Ethernet8,Ethernet9|1": {
        "profile": "ingress_lossy_profile"
    },
    "Ethernet0,Ethernet1,Ethernet10,Ethernet11,Ethernet12,Ethernet13,...,Ethernet68,Ethernet7,Ethernet8,Ethernet9|2": {
        "profile": "ingress_lossy_profile"
    },
    "Ethernet0,Ethernet1,Ethernet10,Ethernet11,Ethernet12,Ethernet13,...,Ethernet68,Ethernet7,Ethernet8,Ethernet9|3": {
        "profile": "ingress_lossless_profile"
    },
    "Ethernet0,Ethernet1,Ethernet10,Ethernet11,Ethernet12,Ethernet13,...,Ethernet68,Ethernet7,Ethernet8,Ethernet9|4": {
        "profile": "ingress_lossless_profile"
    },
    "Ethernet0,Ethernet1,Ethernet10,Ethernet11,Ethernet12,Ethernet13,...,Ethernet68,Ethernet7,Ethernet8,Ethernet9|5": {
        "profile": "ingress_lossy_profile"
    },
    "Ethernet0,Ethernet1,Ethernet10,Ethernet11,Ethernet12,Ethernet13,...,Ethernet68,Ethernet7,Ethernet8,Ethernet9|6": {
        "profile": "ingress_lossy_profile"
    },
    "Ethernet0,Ethernet1,Ethernet10,Ethernet11,Ethernet12,Ethernet13,...,Ethernet68,Ethernet7,Ethernet8,Ethernet9|7": {
        "profile": "ingress_lossy_profile"
    }
},

Note: The buffer profile must be the same for all ports’ PG (queue). Any port missing from the PG (queue) binding configuration may lead to unexpected behavior.
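
One way to spot a port that is missing from the PG binding is to expand the comma-separated BUFFER_PG keys and compare them with the full port list. This is a minimal sketch that assumes keys of the form "port,port,...|pg" as shown above, and also accepts a PG range such as "3-4"; `ports_missing_pg` and `parse_pgs` are hypothetical helper names.

```python
def parse_pgs(text: str) -> range:
    """Turn "3" or "3-4" into the range of PG numbers it covers."""
    low, _, high = text.partition("-")
    return range(int(low), int(high or low) + 1)


def ports_missing_pg(buffer_pg: dict, all_ports: list, pgs=range(8)) -> dict:
    """Return {pg: [ports without a BUFFER_PG entry for that pg]}."""
    covered = {pg: set() for pg in pgs}
    for key in buffer_pg:
        ports, _, pg_text = key.rpartition("|")
        for pg in parse_pgs(pg_text):
            if pg in covered:
                covered[pg].update(p.strip() for p in ports.split(","))
    missing = {}
    for pg, have in covered.items():
        absent = sorted(set(all_ports) - have)
        if absent:
            missing[pg] = absent
    return missing


sample_pg = {
    "Ethernet0,Ethernet1|0": {"profile": "ingress_lossy_profile"},
    "Ethernet0|3": {"profile": "ingress_lossless_profile"},
}
print(ports_missing_pg(sample_pg, ["Ethernet0", "Ethernet1"], pgs=[0, 3]))
```

Here Ethernet1 has no binding for PG 3, which is the situation the note warns about.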

PFC will be enabled on the interfaces with lossless PGs.

"PORT_QOS_MAP": {
    "Ethernet0": {
        "dscp_to_tc_map": "AZURE",
        "pfc_enable": "3,4",
        "pfc_to_queue_map": "AZURE",
        "pfcwd_sw_enable": "3,4",
        "tc_to_pg_map": "AZURE",
        "tc_to_queue_map": "AZURE"
    },
    "Ethernet1": {
        "dscp_to_tc_map": "AZURE",
        "pfc_enable": "3,4",
        "pfc_to_queue_map": "AZURE",
        "pfcwd_sw_enable": "3,4",
        "tc_to_pg_map": "AZURE",
        "tc_to_queue_map": "AZURE"
    },
            ...
    "Ethernet64": {
        "dscp_to_tc_map": "AZURE",
        "pfc_enable": "3,4",
        "pfc_to_queue_map": "AZURE",
        "pfcwd_sw_enable": "3,4",
        "tc_to_pg_map": "AZURE",
        "tc_to_queue_map": "AZURE"
    },
    "Ethernet68": {
        "dscp_to_tc_map": "AZURE",
        "pfc_enable": "3,4",
        "pfc_to_queue_map": "AZURE",
        "pfcwd_sw_enable": "3,4",
        "tc_to_pg_map": "AZURE",
        "tc_to_queue_map": "AZURE"
    },
    "global": {
        "dscp_to_tc_map": "AZURE"
    }
},

  • pfc_enable: specifies the queues on which PFC is enabled.
  • pfcwd_sw_enable: specifies the queue(s) on which the PFC watchdog is enabled.

Check the lossless priorities on each interface:

admin@sonic:~$ show pfc priority
Interface    Lossless priorities
-----------  ---------------------
Ethernet0    3,4
Ethernet1    3,4
Ethernet2    3,4
...
Ethernet64   3,4
Ethernet68   3,4
global       N/A
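
Since PFC is meant to be enabled on the priorities that have lossless PGs, it can be useful to cross-check pfc_enable against BUFFER_PG. The sketch below assumes one PG number per BUFFER_PG key, a lossless profile named ingress_lossless_profile as in the samples above, and that PG numbers and PFC priorities line up one to one (true for priorities 3 and 4 in the AZURE maps). The helper names are hypothetical.

```python
def lossless_pgs_by_port(buffer_pg: dict,
                         lossless_profile: str = "ingress_lossless_profile") -> dict:
    """Collect, per port, the PGs bound to the lossless ingress profile."""
    result = {}
    for key, entry in buffer_pg.items():
        if entry.get("profile") != lossless_profile:
            continue
        ports, _, pg = key.rpartition("|")
        for port in ports.split(","):
            result.setdefault(port.strip(), set()).add(int(pg))
    return result


def pfc_mismatches(config: dict) -> dict:
    """Return {port: (pfc_enable list, lossless PG list)} where they differ."""
    lossless = lossless_pgs_by_port(config.get("BUFFER_PG", {}))
    mismatches = {}
    for port, fields in config.get("PORT_QOS_MAP", {}).items():
        if port == "global":
            continue  # the global entry carries no per-port PFC settings
        enabled = {int(p) for p in fields.get("pfc_enable", "").split(",") if p.strip()}
        expected = lossless.get(port, set())
        if enabled != expected:
            mismatches[port] = (sorted(enabled), sorted(expected))
    return mismatches


sample_cfg = {
    "BUFFER_PG": {
        "Ethernet0,Ethernet1|3": {"profile": "ingress_lossless_profile"},
        "Ethernet0,Ethernet1|4": {"profile": "ingress_lossless_profile"},
        "Ethernet0,Ethernet1|0": {"profile": "ingress_lossy_profile"},
    },
    "PORT_QOS_MAP": {
        "Ethernet0": {"pfc_enable": "3,4"},
        "Ethernet1": {"pfc_enable": "4"},
        "global": {"dscp_to_tc_map": "AZURE"},
    },
}
print(pfc_mismatches(sample_cfg))
```

Ethernet1 is reported because PG 3 is bound to the lossless profile while priority 3 is not PFC-enabled on that port.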

Now you can manage PFC priorities.

Syntax:

config interface pfc priority <interface_name> [0|1|2|3|4|5|6|7] [on|off]

Example:

admin@sonic:~$ sudo config interface pfc priority Ethernet0 3 off
Interface      Lossless priorities
-----------  ---------------------
Ethernet0                        4
admin@sonic:~$ sudo config interface pfc priority Ethernet0 3 on
Interface    Lossless priorities
-----------  ---------------------
Ethernet0    3,4
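
The effect of the command on the pfc_enable value can be modeled as adding or removing one number from a comma-separated list. This is only an illustration of the resulting value, not the CLI's actual implementation; `set_pfc_priority` is a hypothetical helper.

```python
def set_pfc_priority(pfc_enable: str, priority: int, on: bool) -> str:
    """Return the pfc_enable string after switching one priority on or off."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in the range 0-7")
    enabled = {int(p) for p in pfc_enable.split(",") if p.strip()}
    if on:
        enabled.add(priority)
    else:
        enabled.discard(priority)
    # Keep the list sorted so the value stays stable across edits.
    return ",".join(str(p) for p in sorted(enabled))


print(set_pfc_priority("3,4", 3, on=False))  # 4
print(set_pfc_priority("4", 3, on=True))     # 3,4
```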

Asymmetric PFC

Asymmetric PFC changes the standard PFC behavior in two ways:

  • The interface should handle incoming pause frames on all priorities, not just on lossless.
  • The interface should only generate pause frames on priorities that are configured to be lossless.
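
The two rules above can be summarized as a small function that returns, for a port, the priorities on which received pause frames are honored and the priorities on which pause frames are generated. This is a conceptual sketch; the name `pfc_behavior` and the returned keys are made up for illustration.

```python
ALL_PRIORITIES = set(range(8))


def pfc_behavior(lossless, asymmetric: bool) -> dict:
    """Describe receive and transmit pause handling for one port."""
    lossless = set(lossless)
    return {
        # Asymmetric mode honors incoming pause frames on every priority;
        # standard mode only on the lossless ones.
        "honor_rx_pause": sorted(ALL_PRIORITIES if asymmetric else lossless),
        # Pause frames are generated only for lossless priorities in both modes.
        "generate_tx_pause": sorted(lossless),
    }


print(pfc_behavior({3, 4}, asymmetric=False))
print(pfc_behavior({3, 4}, asymmetric=True))
```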

Syntax:

config interface pfc asymmetric <interface_name> [on|off]

Example:

admin@sonic:~$ sudo config interface pfc asymmetric Ethernet0 on

Check the status:

admin@sonic:~$ show pfc asymmetric
Interface    Asymmetric
-----------  ------------
Ethernet0    on
...
admin@sonic:~$ show interfaces status
  Interface        Lanes    Speed    MTU    FEC          Alias    Vlan    Oper    Admin             Type    Asym PFC
-----------  -----------  -------  -----  -----  -------------  ------  ------  -------  ---------------  ----------
  Ethernet0            1      10G   9100    N/A    Eth0(Port0)  routed    down       up   SFP/SFP+/SFP28          on

TC (Traffic Class) to PG (Priority Group)

This mapping table has two purposes: determining the priority group (PG) for buffering and the priority in the PFC PAUSE frame.

Defaults (may vary between platforms):

"TC_TO_PRIORITY_GROUP_MAP": {
    "AZURE": {
        "0": "0",
        "1": "0",
        "2": "0",
        "3": "3",
        "4": "4",
        "5": "0",
        "6": "0",
        "7": "7"
    }
},
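
Inverting the default map shows which traffic classes share a priority group. The sketch below uses the AZURE values listed above; `tcs_per_pg` is a hypothetical helper.

```python
from collections import defaultdict

AZURE = {"0": "0", "1": "0", "2": "0", "3": "3",
         "4": "4", "5": "0", "6": "0", "7": "7"}


def tcs_per_pg(tc_to_pg: dict) -> dict:
    """Group traffic classes by the priority group they map to."""
    groups = defaultdict(list)
    for tc, pg in sorted(tc_to_pg.items(), key=lambda item: int(item[0])):
        groups[int(pg)].append(int(tc))
    return dict(groups)


print(tcs_per_pg(AZURE))
# {0: [0, 1, 2, 5, 6], 3: [3], 4: [4], 7: [7]}
```

Traffic classes 3 and 4 each get a dedicated priority group, while 0, 1, 2, 5, and 6 all share group 0.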

Users can create custom maps under different names in this section of config_db.json.

Example:

"TC_TO_PRIORITY_GROUP_MAP": {
    "AZURE": {
        "0": "0",
        "1": "0",
        "2": "0",
        "3": "3",
        "4": "4",
        "5": "0",
        "6": "0",
        "7": "7"
    },
    "DEFAULT": {
        "0": "0",
        "1": "1",
        "2": "2",
        "3": "3",
        "4": "4",
        "5": "5",
        "6": "6",
        "7": "7"
    }
},

Each interface has a field to bind the mapping in the PORT_QOS_MAP section.

"PORT_QOS_MAP": {
    "Ethernet0": {
        "dscp_to_tc_map": "AZURE",
        "pfc_enable": "3,4",
        "pfc_to_queue_map": "AZURE",
        "pfcwd_sw_enable": "3,4",
        "tc_to_pg_map": "AZURE",
        "tc_to_queue_map": "AZURE"
    },
    "Ethernet1": {
        "dscp_to_tc_map": "AZURE",
        "pfc_enable": "3,4",
        "pfc_to_queue_map": "AZURE",
        "pfcwd_sw_enable": "3,4",
        "tc_to_pg_map": "DEFAULT",
        "tc_to_queue_map": "AZURE"
    },
    ...
},
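
The same edit can be scripted instead of made by hand. The sketch below works on a dict loaded from a copy of config_db.json, adds a custom TC-to-PG map, and binds it to the chosen ports. The helper name `bind_custom_tc_to_pg` is hypothetical, and the commented-out file handling at the end is an assumption about how one might use it.

```python
import json


def bind_custom_tc_to_pg(config: dict, map_name: str, mapping: dict, ports: list) -> dict:
    """Add a TC_TO_PRIORITY_GROUP_MAP entry and bind it to the given ports."""
    for tc, pg in mapping.items():
        if not (0 <= int(tc) <= 7 and 0 <= int(pg) <= 7):
            raise ValueError(f"TC and PG must be in the range 0-7: {tc} -> {pg}")
    table = config.setdefault("TC_TO_PRIORITY_GROUP_MAP", {})
    # CONFIG_DB stores both keys and values as strings.
    table[map_name] = {str(tc): str(pg) for tc, pg in mapping.items()}
    qos = config.setdefault("PORT_QOS_MAP", {})
    for port in ports:
        qos.setdefault(port, {})["tc_to_pg_map"] = map_name
    return config


cfg = {"PORT_QOS_MAP": {"Ethernet1": {"tc_to_pg_map": "AZURE"}}}
identity = {tc: tc for tc in range(8)}
bind_custom_tc_to_pg(cfg, "DEFAULT", identity, ["Ethernet1"])
print(json.dumps(cfg["PORT_QOS_MAP"], indent=4))

# Possible usage against a working copy of the file (path is an example):
# with open("config_db.copy.json") as f:
#     cfg = json.load(f)
# bind_custom_tc_to_pg(cfg, "DEFAULT", identity, ["Ethernet1"])
# with open("config_db.copy.json", "w") as f:
#     json.dump(cfg, f, indent=4)
```

Working on a copy and reviewing the result before replacing the live file keeps a malformed edit from reaching the switch.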

After editing the config_db.json file, reload the configuration:

admin@sonic:~$ sudo config reload -y