- 24 Jan, 2011 1 commit
-
-
Michał Mirosław authored
Quoting Ben Hutchings: we presumably won't be defining features that can only be enabled on 64-bit architectures. Occurences found by `grep -r` on net/, drivers/net, include/ [ Move features and vlan_features next to each other in struct netdev, as per Eric Dumazet's suggestion -DaveM ] Signed-off-by:
Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 10 Nov, 2010 1 commit
-
-
Eric Dumazet authored
Robin Holt tried to boot a 16TB machine and found some limits were reached : sysctl_tcp_mem[2], sysctl_udp_mem[2] We can switch infrastructure to use long "instead" of "int", now atomic_long_t primitives are available for free. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Reported-by:
Robin Holt <holt@sgi.com> Reviewed-by:
Robin Holt <holt@sgi.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 21 Oct, 2010 1 commit
-
-
Balazs Scheidler authored
Just like with IPv4, we need access to the UDP hash table to look up local sockets, but instead of exporting the global udp_table, export a lookup function. Signed-off-by:
Balazs Scheidler <bazsi@balabit.hu> Signed-off-by:
KOVACS Krisztian <hidden@balabit.hu> Signed-off-by:
Patrick McHardy <kaber@trash.net>
-
- 09 Sep, 2010 1 commit
-
-
Eric Dumazet authored
commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation) added a secondary hash on UDP, hashed on (local addr, local port). Problem is that following sequence : fd = socket(...) connect(fd, &remote, ...) not only selects remote end point (address and port), but also sets local address, while UDP stack stored in secondary hash table the socket while its local address was INADDR_ANY (or ipv6 equivalent) Sequence is : - autobind() : choose a random local port, insert socket in hash tables [while local address is INADDR_ANY] - connect() : set remote address and port, change local address to IP given by a route lookup. When an incoming UDP frame comes, if more than 10 sockets are found in primary hash table, we switch to secondary table, and fail to find socket because its local address changed. One solution to this problem is to rehash datagram socket if needed. We add a new rehash(struct socket *) method in "struct proto", and implement this method for UDP v4 & v6, using a common helper. This rehashing only takes care of secondary hash table, since primary hash (based on local port only) is not changed. Reported-by:
Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Tested-by:
Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 13 Jul, 2010 1 commit
-
-
Changli Gao authored
remove useless blanks. Signed-off-by:
Changli Gao <xiaosuo@gmail.com> ---- include/net/inet_common.h | 55 ++++------- include/net/tcp.h | 222 +++++++++++++++++----------------------------- include/net/udp.h | 38 +++---- 3 files changed, 123 insertions(+), 192 deletions(-) Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 11 Nov, 2009 1 commit
-
-
Eric Dumazet authored
UDP bind() can be O(N^2) in some pathological cases. Thanks to secondary hash tables, we can make it O(N) Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 09 Nov, 2009 2 commits
-
-
Eric Dumazet authored
Extends udp_table to contain a secondary hash table. socket anchor for this second hash is free, because UDP doesnt use skc_bind_node : We define an union to hold both skc_bind_node & a new hlist_nulls_node udp_portaddr_node udp_lib_get_port() inserts sockets into second hash chain (additional cost of one atomic op) udp_lib_unhash() deletes socket from second hash chain (additional cost of one atomic op) Note : No spinlock lockdep annotation is needed, because lock for the secondary hash chain is always get after lock for primary hash chain. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Adds a counter in udp_hslot to keep an accurate count of sockets present in chain. This will permit to upcoming UDP lookup algo to chose the shortest chain when secondary hash is added. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 08 Oct, 2009 1 commit
-
-
Eric Dumazet authored
UDP_HTABLE_SIZE was initialy defined to 128, which is a bit small for several setups. 4000 active UDP sockets -> 32 sockets per chain in average. An incoming frame has to lookup all sockets to find best match, so long chains hurt latency. Instead of a fixed size hash table that cant be perfect for every needs, let UDP stack choose its table size at boot time like tcp/ip route, using alloc_large_system_hash() helper Add an optional boot parameter, uhash_entries=x so that an admin can force a size between 256 and 65536 if needed, like thash_entries and rhash_entries. dmesg logs two new lines : [ 0.647039] UDP hash table entries: 512 (order: 0, 4096 bytes) [ 0.647099] UDP Lite hash table entries: 512 (order: 0, 4096 bytes) Maximal size on 64bit arches would be 65536 slots, ie 1 MBytes for non debugging spinlocks. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 30 Sep, 2009 1 commit
-
-
David S. Miller authored
This provides safety against negative optlen at the type level instead of depending upon (sometimes non-trivial) checks against this sprinkled all over the the place, in each and every implementation. Based upon work done by Arjan van de Ven and feedback from Linus Torvalds. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 12 Jul, 2009 1 commit
-
-
Sridhar Samudrala authored
- validate and forward GSO UDP/IPv4 packets from untrusted sources. - do software UFO if the outgoing device doesn't support UFO. Signed-off-by:
Sridhar Samudrala <sri@us.ibm.com> Acked-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 11 Apr, 2009 1 commit
-
-
Vlad Yasevich authored
Commit b2f5e7cd (ipv6: Fix conflict resolutions during ipv6 binding) introduced a regression where time-wait sockets were not treated correctly. This resulted in the following: BUG: unable to handle kernel NULL pointer dereference at 0000000000000062 IP: [<ffffffff805d7d61>] ipv4_rcv_saddr_equal+0x61/0x70 ... Call Trace: [<ffffffffa033847b>] ipv6_rcv_saddr_equal+0x1bb/0x250 [ipv6] [<ffffffffa03505a8>] inet6_csk_bind_conflict+0x88/0xd0 [ipv6] [<ffffffff805bb18e>] inet_csk_get_port+0x1ee/0x400 [<ffffffffa0319b7f>] inet6_bind+0x1cf/0x3a0 [ipv6] [<ffffffff8056d17c>] ? sockfd_lookup_light+0x3c/0xd0 [<ffffffff8056ed49>] sys_bind+0x89/0x100 [<ffffffff80613ea2>] ? trace_hardirqs_on_thunk+0x3a/0x3c [<ffffffff8020bf9b>] system_call_fastpath+0x16/0x1b Tested-by:
Brian Haley <brian.haley@hp.com> Tested-by:
Ed Tomlinson <edt@aei.ca> Signed-off-by:
Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 25 Mar, 2009 1 commit
-
-
Vlad Yasevich authored
The ipv6 version of bind_conflict code calls ipv6_rcv_saddr_equal() which at times wrongly identified intersections between addresses. It particularly broke down under a few instances and caused erroneous bind conflicts. Signed-off-by:
Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 17 Nov, 2008 1 commit
-
-
Eric Dumazet authored
This is a straightforward patch, using hlist_nulls infrastructure. RCUification already done on UDP two weeks ago. Using hlist_nulls permits us to avoid some memory barriers, both at lookup time and delete time. Patch is large because it adds new macros to include/net/sock.h. These macros will be used by TCP & DCCP in next patch. Signed-off-by:
Eric Dumazet <dada1@cosmosbay.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 29 Oct, 2008 1 commit
-
-
Eric Dumazet authored
UDP sockets are hashed in a 128 slots hash table. This hash table is protected by *one* rwlock. This rwlock is readlocked each time an incoming UDP message is handled. This rwlock is writelocked each time a socket must be inserted in hash table (bind time), or deleted from this table (close time) This is not scalable on SMP machines : 1) Even in read mode, lock() and unlock() are atomic operations and must dirty a contended cache line, shared by all cpus. 2) A writer might be starved if many readers are 'in flight'. This can happen on a machine with some NIC receiving many UDP messages. User process can be delayed a long time at socket creation/dismantle time. This patch prepares RCU migration, by introducing 'struct udp_table and struct udp_hslot', and using one spinlock per chain, to reduce contention on central rwlock. Introducing one spinlock per chain reduces latencies, for port randomization on heavily loaded UDP servers. This also speedup bindings to specific ports. udp_lib_unhash() was uninlined, becoming to big. Some cleanups were done to ease review of following patch (RCUification of UDP Unicast lookups) Signed-off-by:
Eric Dumazet <dada1@cosmosbay.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 07 Oct, 2008 2 commits
-
-
Denis V. Lunev authored
Signed-off-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Denis V. Lunev authored
Signed-off-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 01 Oct, 2008 1 commit
-
-
KOVACS Krisztian authored
The iptables tproxy code has to be able to do UDP socket hash lookups, so we have to provide an exported lookup function for this purpose. Signed-off-by:
KOVACS Krisztian <hidden@sch.bme.hu> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 18 Jul, 2008 2 commits
-
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
Similar to... ouch, I repeat myself. Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 06 Jul, 2008 4 commits
-
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Acked-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
As simple as the patch #1 in this set. Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Acked-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
Two special cases here - one is rxrpc - I put init_net there explicitly, since we haven't touched this part yet. The second place is in __udp4_lib_rcv - we already have a struct net there, but I have to move its initialization above to make it ready at the "drop" label. Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Acked-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
Nothing special - all the places already have a struct sock at hands, so use the sock_net() net. Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Acked-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 13 Jun, 2008 1 commit
-
-
Richard Kennedy authored
reorder udp_iter_state to remove padding on 64bit builds shrinks from 24 to 16 bytes, moving to a smaller slab when CONFIG_NET_NS is undefined & seq_net_private = {} Signed-off-by:
Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 04 Jun, 2008 1 commit
-
-
Denis V. Lunev authored
IPv6 UDP sockets wth IPv4 mapped address use udp_sendmsg to send the data actually. In this case ip_flush_pending_frames should be called instead of ip6_flush_pending_frames. Signed-off-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
-
- 01 Apr, 2008 1 commit
-
-
Pavel Emelyanov authored
This counter is about to become per-proto-and-per-net, so we'll need two arguments to determine which cell in this "table" to work with. All the places, but proc already pass proper net to it - proc will be tuned a bit later. Some indentation with spaces in proc files is done to keep the file coding style consistent. Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 29 Mar, 2008 4 commits
-
-
Denis V. Lunev authored
Move it to udp_seq_afinfo->seq_fops as should be. Signed-off-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Denis V. Lunev authored
No need to have separate never-used variable. Signed-off-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Denis V. Lunev authored
No need to create seq_operations for each instance of 'netstat'. Signed-off-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Denis V. Lunev authored
Signed-off-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 22 Mar, 2008 1 commit
-
-
Pavel Emelyanov authored
After this we have only udp_lib_get_port to get the port and two stubs for ipv4 and ipv6. No difference in udp and udplite except for initialized h.udp_hash member. I tried to find a graceful way to drop the only difference between udp_v4_get_port and udp_v6_get_port (i.e. the rcv_saddr comparison routine), but adding one more callback on the struct proto didn't appear such :( Maybe later. Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 21 Mar, 2008 2 commits
-
-
Daniel Lezcano authored
The proc init/exit functions take a new network namespace parameter in order to register/unregister /proc/net/udp6 for a namespace. Signed-off-by:
Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Daniel Lezcano authored
This patch makes the common udp proc functions to take care of which socket they should show taking into account the namespace it belongs. Signed-off-by:
Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 28 Jan, 2008 3 commits
-
-
Eric Dumazet authored
1) Cleanups (all functions are prefixed by sock_prot_inuse) sock_prot_inc_use(prot) -> sock_prot_inuse_add(prot,-1) sock_prot_dec_use(prot) -> sock_prot_inuse_add(prot,-1) sock_prot_inuse() -> sock_prot_inuse_get() New functions : sock_prot_inuse_init() and sock_prot_inuse_free() to abstract pcounter use. 2) if CONFIG_PROC_FS=n, we can zap 'inuse' member from "struct proto", since nobody wants to read the inuse value. This saves 1372 bytes on i386/SMP and some cpu cycles. Signed-off-by:
Eric Dumazet <dada1@cosmosbay.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Hideo Aoki authored
Signed-off-by:
Takahiro Yasui <tyasui@redhat.com> Signed-off-by:
Hideo Aoki <haoki@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
The previous move of the the UDP inDatagrams counter caused the counting of encapsulated packets, SUNRPC data (as opposed to call) packets and RXRPC packets to go missing. This patch restores all of these. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 07 Jun, 2007 1 commit
-
-
David S. Miller authored
This reverts changesets: 6aaf47fa b7b5f487 de34ed91 fc038410 There are still some correctness issues recently discovered which do not have a known fix that doesn't involve doing a full hash table scan on port bind. So revert for now. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 11 May, 2007 1 commit
-
-
David S. Miller authored
__udp_lib_port_inuse() cannot make direct references to inet_sk(sk)->rcv_saddr as that is ipv4 specific state and this code is used by ipv6 too. Use an operations vector to solve this, and this also paves the way for ipv6 support for non-wild saddr hashing in UDP. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 26 Apr, 2007 1 commit
-
-
Herbert Xu authored
When a transmitted packet is looped back directly, CHECKSUM_PARTIAL maps to the semantics of CHECKSUM_UNNECESSARY. Therefore we should treat it as such in the stack. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-