I am currently working with a Linux traffic-shaping node. The rule set has grown to about 2500 hosts, each identified by its MAC address. The filter configuration is "basic", meaning that on average about 1250 rules must be tested before a packet is classified into the correct class. At the line rates we're currently seeing, this uses too much CPU on the host, and packets are being dropped.
I would like to move from the linear linked-list ruleset to a 6-level hashtable lookup (one level for each byte of the MAC address).
I am completely unsure how to achieve this. Most documentation on this feature is confusing to me, and as far as I can tell, all of it is based on IP address hashing.
I'm currently matching packets on their L2 header values with a filter. Example (for egress/upload):
MAC: 52:54:00:12:34:56
tc filter add dev <dev> protocol ip parent 1:0 prio 1 u32 match u16 0x0800 at -2 match u16 0x3456 0xffff at -4 match u32 0x52540012 0xffffffff at -8 flowid 1:50
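For reference, the match values in a filter like the one above can be derived mechanically from the MAC address. The following is a minimal sketch in POSIX sh; mac_to_match is a hypothetical helper name and not part of tc. It only prints the match clauses and does not touch any interface.

```shell
# Hypothetical helper: print the u32 match clauses for a source MAC address.
# Relative to the start of the IP header, the source MAC occupies bytes -8..-3.
mac_to_match() (
    # 52:54:00:12:34:56 -> 525400123456 (colons stripped, lowercased)
    hex=$(printf '%s' "$1" | tr -d ':' | tr 'A-F' 'a-f')
    hi=$(printf '%s' "$hex" | cut -c1-8)    # first four bytes, matched as u32 at -8
    lo=$(printf '%s' "$hex" | cut -c9-12)   # last two bytes, matched as u16 at -4
    printf 'match u16 0x%s 0xffff at -4 match u32 0x%s 0xffffffff at -8\n' "$lo" "$hi"
)

mac_to_match 52:54:00:12:34:56
```

The printed clauses can be pasted after "u32" in the filter command, followed by the flowid.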
Are there any resources I could follow that explain the hashtable setup better?
Let's say 52:54:00:12:34:56 goes to 1:50 and 52:54:00:12:37:56 goes to 1:51.
For a 256-way split of the rules, start with one hash table; let's call it 2:
# tc filter add dev eth1 parent 1:0 prio 1 handle 2: protocol ip u32 divisor 256
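Next, define which byte is hashed: an offset and a bitmask together select a single byte.
Offsets are counted from the start of the IP header. The 2-byte ethertype (0x0800) sits at -2, and the 6-byte (48-bit) source MAC address sits directly before it, at -8 through -3. The hashkey reads a 4-byte word, so "at -6" with mask 0x000000ff selects the last byte of that word, which is the byte at -3, the last byte of the MAC address. If your tc version rejects a hashkey offset that is not a multiple of 4, "hashkey mask 0x00ff0000 at -4" should select the same byte.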
# tc filter add dev <dev> protocol ip parent 1:0 prio 1 u32 ht 800:: \
match u16 0x0800 at -2 \
hashkey mask 0x000000ff at -6 \
link 2:
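To make the bucket selection concrete, here is a minimal sketch in POSIX sh that mirrors the hashkey above: it takes the 4-byte word that starts at -6 (MAC bytes 3 to 6), applies the mask, and prints the resulting bucket in hex. mac_to_bucket is a hypothetical name, and the sketch assumes the kernel uses the masked value directly as the bucket index for a divisor of 256.

```shell
# Hypothetical helper: compute the ht 2: bucket for a MAC address, mirroring
# "hashkey mask 0x000000ff at -6". The word at -6 holds MAC bytes 3..6.
mac_to_bucket() (
    hex=$(printf '%s' "$1" | tr -d ':')
    word=$(printf '%s' "$hex" | cut -c5-12)      # MAC bytes 3..6 as 8 hex digits
    printf '%x\n' $(( 0x$word & 0x000000ff ))    # masked value = bucket (hex)
)

mac_to_bucket 52:54:00:12:34:56
mac_to_bucket 52:54:00:12:37:56
```

Both example addresses end in 56, so both land in the same bucket and are told apart by the full MAC match inside it.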
Note that my example matches the IPv4 ethertype right away in the hashing rule and does not repeat it in the later rules. I do not think this matters; in any case, it only needs to be checked once.
Then insert the per-host rules, putting the (hexadecimal) value of the hashed byte into the ht handle as the bucket number:
# tc filter add dev <dev> protocol ip parent 1:0 prio 1 u32 \
ht 2:56: \
match u16 0x3456 0xffff at -4 match u32 0x52540012 0xffffffff at -8 \
flowid 1:50
# tc filter add dev <dev> protocol ip parent 1:0 prio 1 u32 \
ht 2:56: \
match u16 0x3756 0xffff at -4 match u32 0x52540012 0xffffffff at -8 \
flowid 1:51
... and many more MAC address rules
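With a couple of thousand hosts, the per-host rules are easier to generate than to type. The following is a minimal sketch in POSIX sh, assuming a list of "MAC flowid" pairs on standard input; gen_rules and the DEV default are hypothetical. It only prints the tc commands so they can be reviewed before being run.

```shell
# Hypothetical generator: reads "MAC flowid" pairs and prints (does not run)
# one tc command per host, placed in the bucket named by the last MAC byte.
DEV=${DEV:-eth1}   # assumed interface name, override via the environment

gen_rules() {
    while read -r mac flowid; do
        [ -n "$mac" ] || continue                       # skip blank lines
        hex=$(printf '%s' "$mac" | tr -d ':' | tr 'A-F' 'a-f')
        hi=$(printf '%s' "$hex" | cut -c1-8)            # u32 at -8
        lo=$(printf '%s' "$hex" | cut -c9-12)           # u16 at -4
        bucket=$(printf '%s' "$hex" | cut -c11-12)      # last byte = bucket
        printf 'tc filter add dev %s protocol ip parent 1:0 prio 1 u32 ht 2:%s: match u16 0x%s 0xffff at -4 match u32 0x%s 0xffffffff at -8 flowid %s\n' \
            "$DEV" "$bucket" "$lo" "$hi" "$flowid"
    done
}

gen_rules <<'EOF'
52:54:00:12:34:56 1:50
52:54:00:12:37:56 1:51
EOF
```

Piping the output to a file and then to sh (as root) would apply the rules; the hash table and the hashing rule still have to be created first.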
Flowchart:
1:0 =hash last MAC byte=> ht 2:56 =sequential MAC filter=> flowid 1:50
Of course, adding another hashtable layer for the second-to-last byte (or more bytes) is possible and would further reduce the number of rules traversed. But you probably only have a handful of MAC addresses sharing the same last byte, so why bother going to 6 levels?
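As a rough back-of-the-envelope check, assuming the last bytes of the 2500 MAC addresses are spread evenly over all 256 buckets (an assumption, real deployments may be skewed):

```shell
# Rough estimate, assuming last bytes are spread evenly over all 256 buckets.
hosts=2500
buckets=256
per_bucket=$(( (hosts + buckets - 1) / buckets ))   # ceiling of 2500/256
echo "linear list: about $(( hosts / 2 )) rules tested per packet on average"
echo "one hash level: about $per_bucket rules per bucket"
```

That is roughly 10 rules per bucket instead of 1250 tests on average, which is why a single level is likely enough here.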