- 27 Oct, 2016 3 commits
-
-
Johannes Berg authored
Now genl_register_family() is the only thing (other than the users themselves, perhaps, but I didn't find any doing that) writing to the family struct. In all families that I found, genl_register_family() is only called from __init functions (some indirectly, in which case I've add __init annotations to clarifly things), so all can actually be marked __ro_after_init. This protects the data structure from accidental corruption. Signed-off-by:
Johannes Berg <johannes.berg@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
Instead of providing macros/inline functions to initialize the families, make all users initialize them statically and get rid of the macros. This reduces the kernel code size by about 1.6k on x86-64 (with allyesconfig). Signed-off-by:
Johannes Berg <johannes.berg@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
Static family IDs have never really been used, the only use case was the workaround I introduced for those users that assumed their family ID was also their multicast group ID. Additionally, because static family IDs would never be reserved by the generic netlink code, using a relatively low ID would only work for built-in families that can be registered immediately after generic netlink is started, which is basically only the control family (apart from the workaround code, which I also had to add code for so it would reserve those IDs) Thus, anything other than GENL_ID_GENERATE is flawed and luckily not used except in the cases I mentioned. Move those workarounds into a few lines of code, and then get rid of GENL_ID_GENERATE entirely, making it more robust. Signed-off-by:
Johannes Berg <johannes.berg@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 20 Oct, 2016 1 commit
-
-
Sabrina Dubroca authored
Currently, GRO can do unlimited recursion through the gro_receive handlers. This was fixed for tunneling protocols by limiting tunnel GRO to one level with encap_mark, but both VLAN and TEB still have this problem. Thus, the kernel is vulnerable to a stack overflow, if we receive a packet composed entirely of VLAN headers. This patch adds a recursion counter to the GRO layer to prevent stack overflow. When a gro_receive function hits the recursion limit, GRO is aborted for this skb and it is processed normally. This recursion counter is put in the GRO CB, but could be turned into a percpu counter if we run out of space in the CB. Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report. Fixes: CVE-2016-7039 Fixes: 9b174d88 ("net: Add Transparent Ethernet Bridging GRO support.") Fixes: 66e5133f ("vlan: Add GRO support for non hardware accelerated vlan") Signed-off-by:
Sabrina Dubroca <sd@queasysnail.net> Reviewed-by:
Jiri Benc <jbenc@redhat.com> Acked-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by:
Tom Herbert <tom@herbertland.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 01 Sep, 2016 1 commit
-
-
stephen hemminger authored
Signed-off-by:
Stephen Hemminger <stephen@networkplumber.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 08 Jun, 2016 1 commit
-
-
Tom Herbert authored
This patch implements direct encapsulation of IPv4 and IPv6 packets in UDP. This is done a version "1" of GUE and as explained in I-D draft-ietf-nvo3-gue-03. Changes here are only in the receive path, fou with IPxIPx already supports the transmit side. Both the normal receive path and GRO path are modified to check for GUE version and check for IP version in the case that GUE version is "1". Tested: IPIP with direct GUE encap 1 TCP_STREAM 4530 Mbps 200 TCP_RR 1297625 tps 135/232/444 90/95/99% latencies IP4IP6 with direct GUE encap 1 TCP_STREAM 4903 Mbps 200 TCP_RR 1184481 tps 149/253/473 90/95/99% latencies IP6IP6 direct GUE encap 1 TCP_STREAM 5146 Mbps 200 TCP_RR 1202879 tps 146/251/472 90/95/99% latencies SIT with direct GUE encap 1 TCP_STREAM 6111 Mbps 200 TCP_RR 1250337 tps 139/241/467 90/95/99% latencies Signed-off-by:
Tom Herbert <tom@herbertland.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 20 May, 2016 3 commits
-
-
Tom Herbert authored
This patch adds receive path support for IPv6 with fou. - Add address family to fou structure for open sockets. This supports AF_INET and AF_INET6. Lookups for fou ports are performed on both the port number and family. - In fou and gue receive adjust tot_len in IPv4 header or payload_len based on address family. - Allow AF_INET6 in FOU_ATTR_AF netlink attribute. Signed-off-by:
Tom Herbert <tom@herbertland.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Create __fou_build_header and __gue_build_header. These implement the protocol generic parts of building the fou and gue header. fou_build_header and gue_build_header implement the IPv4 specific functions and call the __*_build_header functions. Signed-off-by:
Tom Herbert <tom@herbertland.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Use helper function to set up UDP tunnel related information for a fou socket. Signed-off-by:
Tom Herbert <tom@herbertland.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 06 May, 2016 2 commits
-
-
Jarno Rajahalme authored
UDP tunnel segmentation code relies on the inner offsets being set for an UDP tunnel GSO packet, but the inner *_complete() functions will set the inner offsets only if 'encapsulation' is set before calling them. Currently, udp_gro_complete() sets 'encapsulation' only after the inner *_complete() functions are done. This causes the inner offsets having invalid values after udp_gro_complete() returns, which in turn will make it impossible to properly segment the packet in case it needs to be forwarded, which would be visible to the user either as invalid packets being sent or as packet loss. This patch fixes this by setting skb's 'encapsulation' in udp_gro_complete() before calling into the inner complete functions, and by making each possible UDP tunnel gro_complete() callback set the inner_mac_header to the beginning of the tunnel payload. Signed-off-by:
Jarno Rajahalme <jarno@ovn.org> Reviewed-by:
Alexander Duyck <aduyck@mirantis.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jarno Rajahalme authored
The setting of the UDP tunnel GSO type is already performed by udp[46]_gro_complete(). Signed-off-by:
Jarno Rajahalme <jarno@ovn.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 16 Apr, 2016 1 commit
-
-
Alexander Duyck authored
This patch updates the IP tunnel core function iptunnel_handle_offloads so that we return an int and do not free the skb inside the function. This actually allows us to clean up several paths in several tunnels so that we can free the skb at one point in the path without having to have a secondary path if we are supporting tunnel offloads. In addition it should resolve some double-free issues I have found in the tunnels paths as I believe it is possible for us to end up triggering such an event in the case of fou or gue. Signed-off-by:
Alexander Duyck <aduyck@mirantis.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 07 Apr, 2016 2 commits
-
-
Alexander Duyck authored
This patch fixes an issue I found in which we were dropping frames if we had enabled checksums on GRE headers that were encapsulated by either FOU or GUE. Without this patch I was barely able to get 1 Gb/s of throughput. With this patch applied I am now at least getting around 6 Gb/s. The issue is due to the fact that with FOU or GUE applied we do not provide a transport offset pointing to the GRE header, nor do we offload it in software as the GRE header is completely skipped by GSO and treated like a VXLAN or GENEVE type header. As such we need to prevent the stack from generating it and also prevent GRE from generating it via any interface we create. Fixes: c3483384 ("gro: Allow tunnel stacking in the case of FOU/GUE") Signed-off-by:
Alexander Duyck <aduyck@mirantis.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Adapt gue_gro_receive, gue_gro_complete to take a socket argument. Don't set udp_offloads any more. Signed-off-by:
Tom Herbert <tom@herbertland.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 30 Mar, 2016 1 commit
-
-
Alexander Duyck authored
This patch should fix the issues seen with a recent fix to prevent tunnel-in-tunnel frames from being generated with GRO. The fix itself is correct for now as long as we do not add any devices that support NETIF_F_GSO_GRE_CSUM. When such a device is added it could have the potential to mess things up due to the fact that the outer transport header points to the outer UDP header and not the GRE header as would be expected. Fixes: fac8e0f5 ("tunnels: Don't apply GRO to multiple layers of encapsulation.") Signed-off-by:
Alexander Duyck <aduyck@mirantis.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 20 Mar, 2016 1 commit
-
-
Jesse Gross authored
If a packet is either locally encapsulated or processed through GRO it is marked with the offloads that it requires. However, when it is decapsulated these tunnel offload indications are not removed. This means that if we receive an encapsulated TCP packet, aggregate it with GRO, decapsulate, and retransmit the resulting frame on a NIC that does not support encapsulation, we won't be able to take advantage of hardware offloads even though it is just a simple TCP packet at this point. This fixes the problem by stripping off encapsulation offload indications when packets are decapsulated. The performance impacts of this bug are significant. In a test where a Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated, and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a result of avoiding unnecessary segmentation at the VM tap interface. Reported-by:
Ramu Ramamurthy <sramamur@linux.vnet.ibm.com> Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE") Signed-off-by:
Jesse Gross <jesse@kernel.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 13 Mar, 2016 1 commit
-
-
Alexander Duyck authored
This patch updates the GRO handlers for GRE, VXLAN, GENEVE, and FOU so that we do not clear the flush bit until after we have called the next level GRO handler. Previously this was being cleared before parsing through the list of frames, however this resulted in several paths where either the bit needed to be reset but wasn't as in the case of FOU, or cases where it was being set as in GENEVE. By just deferring the clearing of the bit until after the next level protocol has been parsed we can avoid any unnecessary bit twiddling and avoid bugs. Signed-off-by:
Alexander Duyck <aduyck@mirantis.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 12 Feb, 2016 2 commits
-
-
Edward Cree authored
All users now pass false, so we can remove it, and remove the code that was conditional upon it. Signed-off-by:
Edward Cree <ecree@solarflare.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Signed-off-by:
Edward Cree <ecree@solarflare.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 10 Jan, 2016 1 commit
-
-
Hannes Frederic Sowa authored
udp tunnel offloads tend to aggregate datagrams based on inner headers. gro engine gets notified by tunnel implementations about possible offloads. The match is solely based on the port number. Imagine a tunnel bound to port 53, the offloading will look into all DNS packets and tries to aggregate them based on the inner data found within. This could lead to data corruption and malformed DNS packets. While this patch minimizes the problem and helps an administrator to find the issue by querying ip tunnel/fou, a better way would be to match on the specific destination ip address so if a user space socket is bound to the same address it will conflict. Cc: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 17 Dec, 2015 1 commit
-
-
Hannes Frederic Sowa authored
fou->udp_offloads is managed by RCU. As it is actually included inside the fou sockets, we cannot let the memory go out of scope before a grace period. We either can synchronize_rcu or switch over to kfree_rcu to manage the sockets. kfree_rcu seems appropriate as it is used by vxlan and geneve. Fixes: 23461551 ("fou: Support for foo-over-udp RX path") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 29 Aug, 2015 1 commit
-
-
Jiri Benc authored
fou does not really support IPv6 encapsulation. After an UDP socket is created in fou_create, the encap_rcv callback is set either to fou_udp_recv or to gue_udp_recv. Both of those unconditionally assume that the received packet has an IPv4 header and access the data at network_header as it was an IPv4 header. This leads to IPv6 flow label being interpreted as IP packet length, etc. Disallow fou tunnel to be configured as IPv6 until real IPv6 support is added to fou. CC: Tom Herbert <tom@herbertland.com> Signed-off-by:
Jiri Benc <jbenc@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 23 Aug, 2015 2 commits
-
-
Tom Herbert authored
Do WARN_ON_ONCE instead of WARN_ON in gue_gro_receive when the offload callcaks are bad (either don't exist or gro_receive is not specified). Signed-off-by:
Tom Herbert <tom@herbertland.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
The remote checksum offload GRO did not consider the case that frag0 might be in use. This patch fixes that by accessing headers using the skb_gro functions and not saving offsets relative to skb->head. Signed-off-by:
Tom Herbert <tom@herbertland.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 16 Apr, 2015 1 commit
-
-
WANG Cong authored
Fixes: 7a6c8c34 ("fou: implement FOU_CMD_GET") Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 13 Apr, 2015 5 commits
-
-
WANG Cong authored
Cc: Tom Herbert <tom@herbertland.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
WANG Cong authored
Also convert the spinlock to a mutex. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
WANG Cong authored
udp_config.local_udp_port is be16. And iproute2 passes network order for FOU_ATTR_PORT. This doesn't fix any bug, just for consistency. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
WANG Cong authored
Not a big deal, just for corretness. Cc: Tom Herbert <tom@herbertland.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
WANG Cong authored
This fixes the following harmless warning: ./ip/ip fou del port 7777 [ 122.907516] udp_del_offload: didn't find offload for port 7777 Cc: Tom Herbert <tom@herbertland.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 08 Apr, 2015 1 commit
-
-
Andi Kleen authored
const __read_mostly is a senseless combination. If something is already const it cannot be __read_mostly. Remove the bogus __read_mostly in the fou driver. This fixes section conflicts with LTO. Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 11 Feb, 2015 3 commits
-
-
Tom Herbert authored
Change remote checksum handling to set checksum partial as default behavior. Added an iflink parameter to configure not using checksum partial (calling csum_partial to update checksum). Signed-off-by:
Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
This patch adds infrastructure so that remote checksum offload can set CHECKSUM_PARTIAL instead of calling csum_partial and writing the modfied checksum field. Add skb_remcsum_adjust_partial function to set an skb for using CHECKSUM_PARTIAL with remote checksum offload. Changed skb_remcsum_process and skb_gro_remcsum_process to take a boolean argument to indicate if checksum partial can be set or the checksum needs to be modified using the normal algorithm. Signed-off-by:
Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Remote checksum offload processing is currently the same for both the GRO and non-GRO path. When the remote checksum offload option is encountered, the checksum field referred to is modified in the packet. So in the GRO case, the packet is modified in the GRO path and then the operation is skipped when the packet goes through the normal path based on skb->remcsum_offload. There is a problem in that the packet may be modified in the GRO path, but then forwarded off host still containing the remote checksum option. A remote host will again perform RCO but now the checksum verification will fail since GRO RCO already modified the checksum. To fix this, we ensure that GRO restores a packet to it's original state before returning. In this model, when GRO processes a remote checksum option it still changes the checksum per the algorithm but on return from lower layer processing the checksum is restored to its original value. In this patch we add define gro_remcsum structure which is passed to skb_gro_remcsum_process to save offset and delta for the checksum being changed. After lower layer processing, skb_gro_remcsum_cleanup is called to restore the checksum before returning from GRO. Signed-off-by:
Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 04 Feb, 2015 1 commit
-
-
Tom Herbert authored
This patch adds skb_remcsum_process and skb_gro_remcsum_process to perform the appropriate adjustments to the skb when receiving remote checksum offload. Updated vxlan and gue to use these functions. Tested: Ran TCP_RR and TCP_STREAM netperf for VXLAN and GUE, did not see any change in performance. Signed-off-by:
Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 14 Jan, 2015 1 commit
-
-
Tom Herbert authored
This patch introduces udp_offload_callbacks which has the same GRO functions (but not a GSO function) as offload_callbacks, except there is an argument to a udp_offload struct passed to gro_receive and gro_complete functions. This additional argument can be used to retrieve the per port structure of the encapsulation for use in gro processing (mostly by doing container_of on the structure). Signed-off-by:
Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 06 Jan, 2015 1 commit
-
-
Tom Herbert authored
Move convert_csum from udp_sock to inet_sock. This allows the possibility that we can use convert checksum for different types of sockets and also allows convert checksum to be enabled from inet layer (what we'll want to do when enabling IP_CHECKSUM cmsg). Signed-off-by:
Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 26 Nov, 2014 1 commit
-
-
Tom Herbert authored
Change remote checksum offload to call remcsum_adjust. This also eliminates the optimization to skip an IP header as part of the adjustment (really does not seem to be much of a win). Signed-off-by:
Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 13 Nov, 2014 1 commit
-
-
Thomas Graf authored
net/ipv4/fou.c: In function ‘ip_tunnel_encap_del_fou_ops’: net/ipv4/fou.c:861:1: warning: no return statement in function returning non-void [-Wreturn-type] Fixes: a8c5f90f ("ip_tunnel: Ops registration for secondary encap (fou, gue)") Signed-off-by:
Thomas Graf <tgraf@suug.ch> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 12 Nov, 2014 1 commit
-
-
Tom Herbert authored
Instead of calling fou and gue functions directly from ip_tunnel use ops for these that were previously registered. This patch adds the logic to add and remove encapsulation operations for ip_tunnel, and modified fou (and gue) to register with ip_tunnels. This patch also addresses a circular dependency between ip_tunnel and fou that was causing link errors when CONFIG_NET_IP_TUNNEL=y and CONFIG_NET_FOU=m. References to fou an gue have been removed from ip_tunnel.c Reported-by:
Randy Dunlap <rdunlap@infradead.org> Signed-off-by:
Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-