1. 11 Feb, 2012 40 commits
    • Linux 2.6.27.60 · 5a85ebb7
      Willy Tarreau authored
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • dm: do not forward ioctls from logical volumes to the underlying device · 8eca6dc4
      Paolo Bonzini authored
      commit ec8013be upstream.
      
      A logical volume can map to just part of the underlying physical volume.
      In this case, it must be treated like a partition.
      
      Based on a patch from Alasdair G Kergon.
      
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: dm-devel@redhat.com
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backport to 2.6.32 - drop change to drivers/md/dm-flakey.c]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • block: fail SCSI passthrough ioctls on partition devices · 4d6fe88a
      Paolo Bonzini authored
      commit 0bfc96cb upstream.
      
      [ Changes with respect to 3.3: return -ENOTTY from scsi_verify_blk_ioctl
        and -ENOIOCTLCMD from sd_compat_ioctl. ]
      
      Linux allows executing the SG_IO ioctl on a partition or LVM volume, and
      will pass the command to the underlying block device.  This is
      well-known, but it is also a large security problem when (via Unix
      permissions, ACLs, SELinux or a combination thereof) a program or user
      needs to be granted access only to part of the disk.
      
      This patch lets partitions forward a small set of harmless ioctls;
      others are logged with printk so that we can see which ioctls are
      actually sent.  In my tests only CDROM_GET_CAPABILITY actually occurred.
      Of course it was being sent to a (partition on a) hard disk, so it would
      have failed with ENOTTY and the patch isn't changing anything in
      practice.  Still, I'm treating it specially to avoid spamming the logs.
      
      In principle, this restriction should include programs running with
      CAP_SYS_RAWIO.  If for example I let a program access /dev/sda2 and
      /dev/sdb, it still should not be able to read/write outside the
      boundaries of /dev/sda2 independent of the capabilities.  However, for
      now programs with CAP_SYS_RAWIO will still be allowed to send the
      ioctls.  Their actions will still be logged.
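
      The policy described above can be sketched in userspace C. This is
      only an illustration, not the kernel code: the command numbers and
      the demo_verify_ioctl() helper are hypothetical, and the -ENOTTY
      return value follows the backport note in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command numbers; the real ioctl constants live in kernel headers. */
enum demo_cmd {
    DEMO_HARMLESS_QUERY  = 1,
    DEMO_GET_VERSION     = 2,
    DEMO_RAW_PASSTHROUGH = 3
};

/* Commands that a partition is allowed to forward to the whole device. */
static const int demo_partition_whitelist[] = {
    DEMO_HARMLESS_QUERY,
    DEMO_GET_VERSION
};

/* Returns 0 if the command may be forwarded, -ENOTTY otherwise. */
static int demo_verify_ioctl(bool is_partition, bool has_rawio, int cmd)
{
    size_t i;
    size_t n = sizeof(demo_partition_whitelist) / sizeof(demo_partition_whitelist[0]);

    assert(cmd > 0);    /* demo command numbers start at 1 */

    if (!is_partition)
        return 0;       /* whole devices are not restricted */

    for (i = 0; i < n; i++)
        if (demo_partition_whitelist[i] == cmd)
            return 0;   /* harmless command, forward it */

    /* The real patch logs the command here. Privileged callers are
     * still let through for now, as the message explains. */
    return has_rawio ? 0 : -ENOTTY;
}
```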
      
      This patch does not affect the non-libata IDE driver.  That driver
      however already tests for bd != bd->bd_contains before issuing some
      ioctl; it could be restricted further to forbid these ioctls even for
      programs running with CAP_SYS_ADMIN/CAP_SYS_RAWIO.
      
      Cc: linux-scsi@vger.kernel.org
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: James Bottomley <JBottomley@parallels.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      [ Make it also print the command name when warning - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backport to 2.6.32 - ENOIOCTLCMD does not get converted to
       ENOTTY, so we must return ENOTTY directly]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • block: add and use scsi_blk_cmd_ioctl · 7d064959
      Paolo Bonzini authored
      commit 577ebb37 upstream.
      
      Introduce a wrapper around scsi_cmd_ioctl that takes a block device.
      
      The function will then be enhanced to detect partition block devices
      and, in that case, subject the ioctls to whitelisting.
      
      Cc: linux-scsi@vger.kernel.org
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: James Bottomley <JBottomley@parallels.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      [bwh: Backport to 2.6.32 - adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      [wt: slightly changed the interface to match 2.6.27's scsi_cmd_ioctl()
           which still needs the file pointer but has no mode parameter].
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • i8k: Avoid lahf in 64-bit code · d172827c
      Luca Tettamanti authored
      commit bc1f419c upstream.
      
      i8k uses lahf to read the flag register in 64-bit code; early x86-64
      CPUs, however, lack this instruction and we get an invalid opcode
      exception at runtime.
      Use pushf to load the flag register into the stack instead.
      Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
      Reported-by: Jeff Rickman <jrickman@myamigos.us>
      Tested-by: Jeff Rickman <jrickman@myamigos.us>
      Tested-by: Harry G McGavran Jr <w5pny@arrl.net>
      Cc: Massimo Dal Zotto <dz@debian.org>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • kbuild: Fix passing -Wno-* options to gcc 4.4+ · 1e1df1cd
      Michal Marek authored
      commit 8417da6f upstream.
      
      Starting with 4.4, gcc will happily accept -Wno-<anything> in the
      cc-option test and complain later when compiling a file that has some
      other warning. This rather unexpected behavior is intentional as per
      http://gcc.gnu.org/PR28322, so work around it by testing for support of
      the opposite option (without the no-). Introduce a new Makefile function
      cc-disable-warning that does this and update two uses of cc-option in
      the toplevel Makefile.
      Reported-and-tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Michal Marek <mmarek@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • kbuild: Disable -Wunused-but-set-variable for gcc 4.6.0 · da0acbfb
      Dave Jones authored
      commit af0e5d56 upstream.
      
      Disable the new -Wunused-but-set-variable that was added in gcc 4.6.0.
      It produces more false positives than useful warnings.
      
      This can still be enabled using W=1
      [gregkh - No it can not for 2.6.32, but we don't care]
      Signed-off-by: Dave Jones <davej@redhat.com>
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Tested-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Michal Marek <mmarek@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • Fix gcc 4.5.1 miscompiling drivers/char/i8k.c (again) · 89a58f0b
      Jim Bos authored
      commit 22d3243d upstream.
      
      The fix in commit 6b4e81db ("i8k: Tell gcc that *regs gets
      clobbered"), which added "+m (*regs)" to work around gcc miscompiling
      i8k.c, caused register pressure problems and a build failure.
      
      Changing the 'asm' statement to 'asm volatile' instead should prevent
      that and works around the gcc bug as well, so we can remove the "+m".
      
      [ Background on the gcc bug: a memory clobber fails to mark the function
        the asm resides in as non-pure (aka "__attribute__((const))"), so if
        the function does nothing else that triggers the non-pure logic, gcc
        will think that that function has no side effects at all. As a result,
        callers will be mis-compiled.
      
        Adding the "+m" made gcc see that it's not a pure function, and so
        does "asm volatile". The problem was never really the need to mark
        "*regs" as changed, since the memory clobber did that part - the
        problem was just a bug in the gcc "pure" function analysis  - Linus ]
      Signed-off-by: Jim Bos <jim876@xs4all.nl>
      Acked-by: Jakub Jelinek <jakub@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • i8k: Tell gcc that *regs gets clobbered · ac7024a2
      Jim Bos authored
      commit 6b4e81db upstream.
      
      A more recent GCC caused the i8k driver to stop working. On Slackware,
      the compiler was upgraded from gcc-4.4.4 to gcc-4.5.1, after which the
      driver either didn't load or gave totally nonsensical output.

      As it turned out, the asm(..) statement did not declare that it
      modifies the *regs variable.
      
      Credits to Andi Kleen and Andreas Schwab for providing the fix.
      Signed-off-by: Jim Bos <jim876@xs4all.nl>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • x86: Fix mmap random address range · ad6e2b74
      Ludwig Nussel authored
      commit 9af0c7a6 upstream.
      
      On x86_32 casting the unsigned int result of get_random_int() to
      long may result in a negative value.  On x86_32 the range of
      mmap_rnd() therefore was -255 to 255.  The 32bit mode on x86_64
      used 0 to 255 as intended.
      
      The bug was introduced by 675a0813 ("x86: unify mmap_{32|64}.c")
      in January 2008.
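
      The sign problem can be reproduced in plain C. The sketch below is
      an illustration, not the kernel code: int32_t stands in for the
      32-bit long of x86_32, the function names are made up, and the
      conversion of large unsigned values to int32_t is assumed to wrap
      as it does with gcc.

```c
#include <assert.h>
#include <stdint.h>

/* Models the buggy path: the random value is converted to a signed
 * 32-bit type before the modulo, so the remainder can be negative. */
static int32_t demo_mmap_rnd_buggy(uint32_t random)
{
    int32_t as_long = (int32_t)random;  /* values >= 2^31 become negative */

    return as_long % 256;               /* C truncates toward zero: -255..255 */
}

/* Models the fixed path: the modulo is done on the unsigned value. */
static int32_t demo_mmap_rnd_fixed(uint32_t random)
{
    int32_t rnd = (int32_t)(random % 256u);

    assert(rnd >= 0 && rnd <= 255);     /* the intended range */
    return rnd;
}
```

      With these definitions, an input of 0x80000001 yields -255 in the
      buggy variant and 1 in the fixed one.
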
      Signed-off-by: Ludwig Nussel <ludwig.nussel@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: harvey.harrison@gmail.com
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Harvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/201111152246.pAFMklOB028527@wpaz5.hot.corp.google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • net/ipv4: Check for mistakenly passed in non-IPv4 address · 93050f52
      Marcus Meissner authored
      [ Upstream commit d0733d2e ]
      
      Check against mistakenly passing in IPv6 addresses (which would result
      in an INADDR_ANY bind) or similar incompatible sockaddrs.
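
      A userspace sketch of this kind of check follows. It is an
      illustration only: demo_check_bind_addr() is a hypothetical helper,
      and the -EAFNOSUPPORT error code is an assumption rather than
      something stated in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Rejects sockaddrs that are too short or not tagged as IPv4.
 * Returns 0 when the address may be used for an IPv4 bind. */
static int demo_check_bind_addr(const struct sockaddr *sa, size_t len)
{
    assert(sa != NULL);

    if (len < sizeof(struct sockaddr_in))
        return -EINVAL;         /* too short to hold an IPv4 address */

    if (sa->sa_family != AF_INET)
        return -EAFNOSUPPORT;   /* e.g. an IPv6 sockaddr passed by mistake */

    return 0;
}
```
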
      Signed-off-by: Marcus Meissner <meissner@suse.de>
      Cc: Reinhard Max <max@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • Fix time() inconsistencies caused by intermediate xtime_cache values being read · af79350f
      john stultz authored
      
      Currently with 2.6.32-longterm, it's possible for time() to occasionally
      return values one second earlier than the previous time() call.
      
      This happens because update_xtime_cache() does:
      	xtime_cache = xtime;
      	timespec_add_ns(&xtime_cache, nsec);
      
      It's possible that xtime is 1sec,999msecs, and nsec is 1ms, resulting in
      an xtime_cache that is 2sec,0ms.
      
      get_seconds() (which is used by sys_time()) does not take the
      xtime_lock, which is ok as the xtime.tv_sec value is a long and can be
      atomically read safely.
      
      The problem occurs on the next call to update_xtime_cache() if xtime has
      not increased:
      	/* This sets xtime_cache back to 1sec, 999msec */
      	xtime_cache = xtime;
      	/* get_seconds() called here sees a 1 second inconsistency */
      	timespec_add_ns(&xtime_cache, nsec);
      
      In order to resolve this, we could add locking to get_seconds(), but it
      needs to be lock free, as it is called from the machine check handler,
      opening a possible deadlock.
      
      So instead, this patch introduces an intermediate value for the
      calculations, so that we only assign xtime_cache once with the correct
      time, using ACCESS_ONCE to make sure the compiler doesn't optimize out
      any intermediate values.
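
      The idea of computing into an intermediate value can be sketched in
      userspace C as follows. This is not the kernel patch: the demo_*
      names are hypothetical, and a volatile store stands in for the
      kernel's ACCESS_ONCE.

```c
#include <assert.h>

#define DEMO_NSEC_PER_SEC 1000000000L

struct demo_timespec {
    long tv_sec;
    long tv_nsec;
};

static struct demo_timespec xtime;        /* authoritative time */
static struct demo_timespec xtime_cache;  /* read without a lock */

/* Adds ns nanoseconds to ts and normalizes tv_nsec into [0, 1s). */
static void demo_add_ns(struct demo_timespec *ts, long ns)
{
    assert(ns >= 0);
    ts->tv_sec += ns / DEMO_NSEC_PER_SEC;
    ts->tv_nsec += ns % DEMO_NSEC_PER_SEC;
    if (ts->tv_nsec >= DEMO_NSEC_PER_SEC) {
        ts->tv_nsec -= DEMO_NSEC_PER_SEC;
        ts->tv_sec++;
    }
}

/* Computes the new cache value in a private copy and publishes it with
 * one assignment, so readers never see xtime_cache step backwards. */
static void demo_update_xtime_cache(long nsec)
{
    struct demo_timespec ts = xtime;    /* intermediate value */

    demo_add_ns(&ts, nsec);
    *(volatile struct demo_timespec *)&xtime_cache = ts;
}
```

      With xtime at 1sec,999msec and nsec at 1ms, repeated calls leave
      xtime_cache.tv_sec at 2 every time; it is never reset to 1 in
      between.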
      
      The xtime_cache manipulations were removed with 2.6.35, so that kernel
      and later do not need this change.
      
      In 2.6.33 and 2.6.34 the logarithmic accumulation should make it so
      xtime is updated each tick, so it is unlikely that two updates to
      xtime_cache could occur while the difference between xtime and
      xtime_cache crosses the second boundary. However, the paranoid might
      want to pull this into 2.6.33/34-longterm just to be sure.
      
      Thanks to Stephen for helping finally narrow down the root cause and
      many hours of help with testing and validation. Also thanks to Max,
      Andi, Eric and Paul for review of earlier attempts and helping clarify
      what is possible with regard to out of order execution.
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: John Stultz <johnstul@us.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • af_packet: prevent information leak · 6e4e5889
      Eric Dumazet authored
      [ Upstream commit 13fcb7bd ]
      
      In 2.6.27, commit 393e52e3 (packet: deliver VLAN TCI to userspace)
      added a small information leak.

      Add a padding field and make sure it's zeroed before copy to user.
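
      The pattern can be sketched in userspace C. The struct and helper
      below are hypothetical stand-ins, not the kernel's definitions; they
      only show an explicit padding member that is cleared before the
      structure is handed out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record with an explicit padding member, so that no
 * compiler-inserted, uninitialized bytes remain in the structure. */
struct demo_auxdata {
    uint32_t status;
    uint16_t vlan_tci;
    uint16_t padding;
};

/* Fills every member, including the padding, before the record is
 * copied to an untrusted reader. */
static void demo_fill_auxdata(struct demo_auxdata *aux,
                              uint32_t status, uint16_t vlan_tci)
{
    assert(aux != NULL);
    aux->status = status;
    aux->vlan_tci = vlan_tci;
    aux->padding = 0;   /* never leak stale stack contents */
}
```
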
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • MAINTAINERS: stable: Update address · b9ce0b27
      Joe Perches authored
      commit bc7a2f3a upstream.
      
      The old address hasn't worked since the great intrusion of August 2011.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • SCSI: scsi_lib: fix potential NULL dereference · 65d2e980
      Jiri Slaby authored
      commit 03b14708 upstream.
      
      Stanse found a potential NULL dereference in scsi_kill_request.
      
      Instead of triggering BUG() in 'if (unlikely(cmd == NULL))' branch,
      the kernel will Oops earlier on cmd dereference.
      
      Move the dereferences after the if.
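
      The reordering can be sketched in userspace C. The types and the
      demo_kill_request() function are hypothetical; the sketch returns -1
      where the kernel code would hit BUG().

```c
#include <assert.h>
#include <stddef.h>

struct demo_device {
    int id;
};

struct demo_cmd {
    struct demo_device *device;
};

/* Returns the device id, or -1 when there is no command. */
static int demo_kill_request(const struct demo_cmd *cmd)
{
    const struct demo_device *dev;

    if (cmd == NULL)
        return -1;          /* the check comes first */

    dev = cmd->device;      /* dereference only after the check */
    assert(dev != NULL);
    return dev->id;
}
```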
      
      [ WT: starget is not set in 2.6.27 ]
      Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit · 67cd6ea4
      Jiri Olsa authored
      commit 26afb7c6 upstream.
      
      As reported in BZ #30352:

        https://bugzilla.kernel.org/show_bug.cgi?id=30352

      there's a kernel bug related to reading the last allowed page on x86_64.
      
      The _copy_to_user() and _copy_from_user() functions use the following
      check for address limit:
      
        if (buf + size >= limit)
      	fail();
      
      while it should be more permissive:
      
        if (buf + size > limit)
      	fail();
      
      That's because size is the number of bytes read from or written to
      buf, starting at and including the buf address itself.
      So the copy function never touches the limit address,
      even if "buf + size == limit".
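
      The two comparisons can be put side by side in a small userspace
      sketch. The function names are hypothetical, and address wraparound
      is deliberately left out of this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old check: also rejects a buffer that ends exactly at the limit. */
static bool demo_copy_allowed_old(uint64_t buf, uint64_t size, uint64_t limit)
{
    assert(buf <= UINT64_MAX - size);   /* sketch ignores wraparound */
    return !(buf + size >= limit);
}

/* Fixed check: the last byte touched is buf + size - 1, so a buffer
 * ending exactly at the limit is still fine. */
static bool demo_copy_allowed_fixed(uint64_t buf, uint64_t size, uint64_t limit)
{
    assert(buf <= UINT64_MAX - size);   /* sketch ignores wraparound */
    return !(buf + size > limit);
}
```

      For a 4096-byte buffer at 0x7fffffffe000 and a limit of
      0x7ffffffff000 (a value chosen here so that the buffer ends exactly
      at the limit), the old check refuses the copy and the fixed check
      allows it.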
      
      The following program fails to use the last page as a buffer
      due to the wrong limit check:
      
       #include <sys/mman.h>
       #include <sys/socket.h>
       #include <assert.h>
      
       #define PAGE_SIZE       (4096)
       #define LAST_PAGE       ((void*)(0x7fffffffe000))
      
       int main()
       {
              int fds[2], err;
              void * ptr = mmap(LAST_PAGE, PAGE_SIZE, PROT_READ | PROT_WRITE,
                                MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
              assert(ptr == LAST_PAGE);
              err = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
              assert(err == 0);
              err = send(fds[0], ptr, PAGE_SIZE, 0);
              perror("send");
              assert(err == PAGE_SIZE);
              err = recv(fds[1], ptr, PAGE_SIZE, MSG_WAITALL);
              perror("recv");
              assert(err == PAGE_SIZE);
              return 0;
       }
      
      The other place checking the addr limit is the access_ok() function,
      which is working properly. There's just a misleading comment
      for the __range_not_ok() macro - which this patch fixes as well.
      
      The last page of the user-space address range is a guard page, and
      Brian Gerst observed that the guard page itself exists due to an
      erratum on K8 CPUs (#121 Sequential Execution Across Non-Canonical
      Boundary Causes Processor Hang).
      
      However, the test code is using the last valid page before the guard page.
      The bug is that the last byte before the guard page can't be read
      because of the off-by-one error. The guard page is left in place.
      
      This bug would normally not show up because the last page is
      part of the process stack and never accessed via syscalls.
      
      [WT: in 2.6.27 use include/asm-x86/uaccess.h]
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Brian Gerst <brgerst@gmail.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1305210630-7136-1-git-send-email-jolsa@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • block: add proper state guards to __elv_next_request · 636121a6
      James Bottomley authored
      commit 0a58e077 upstream.
      
      blk_cleanup_queue() calls elevator_exit() and after this, we can't
      touch the elevator without oopsing.  __elv_next_request() must check
      for this state because in the refcounted queue model, we can still
      call it after blk_cleanup_queue() has been called.
      
      This was reported as causing an oops attributable to scsi.
      
      [WT: in 2.6.27, __elv_next_request() is in elevator.c]
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • bonding: Ensure that we unshare skbs prior to calling pskb_may_pull · 3870ae18
      Neil Horman authored
      commit b3053251 upstream.
      
      Recently reported oops:
      
      kernel BUG at net/core/skbuff.c:813!
      invalid opcode: 0000 [#1] SMP
      last sysfs file: /sys/devices/virtual/net/bond0/broadcast
      CPU 8
      Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
      ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
      i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
      ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
      scsi_transport_sas dm_mod [last unloaded: microcode]
      
      Pid: 0, comm: swapper Not tainted 2.6.32-71.el6.x86_64 #1 BladeCenter HS22
      -[7870AC1]-
      RIP: 0010:[<ffffffff81405b16>]  [<ffffffff81405b16>]
      pskb_expand_head+0x36/0x1e0
      RSP: 0018:ffff880028303b70  EFLAGS: 00010202
      RAX: 0000000000000002 RBX: ffff880c6458ec80 RCX: 0000000000000020
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880c6458ec80
      RBP: ffff880028303bc0 R08: ffffffff818a6180 R09: ffff880c6458ed64
      R10: ffff880c622b36c0 R11: 0000000000000400 R12: 0000000000000000
      R13: 0000000000000180 R14: ffff880c622b3000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 00000038653452a4 CR3: 0000000001001000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffff8806649c2000, task ffff880c64f16ab0)
      Stack:
       ffff880028303bc0 ffffffff8104fff9 000000000000001c 0000000100000000
      <0> ffff880000047d80 ffff880c6458ec80 000000000000001c ffff880c6223da00
      <0> ffff880c622b3000 0000000000000000 ffff880028303c10 ffffffff81407f7a
      Call Trace:
      <IRQ>
       [<ffffffff8104fff9>] ? __wake_up_common+0x59/0x90
       [<ffffffff81407f7a>] __pskb_pull_tail+0x2aa/0x360
       [<ffffffffa0244530>] bond_arp_rcv+0x2c0/0x2e0 [bonding]
       [<ffffffff814a0857>] ? packet_rcv+0x377/0x440
       [<ffffffff8140f21b>] netif_receive_skb+0x2db/0x670
       [<ffffffff8140f788>] napi_skb_finish+0x58/0x70
       [<ffffffff8140fc89>] napi_gro_receive+0x39/0x50
       [<ffffffffa01286eb>] ixgbe_clean_rx_irq+0x35b/0x900 [ixgbe]
       [<ffffffffa01290f6>] ixgbe_clean_rxtx_many+0x136/0x240 [ixgbe]
       [<ffffffff8140fe53>] net_rx_action+0x103/0x210
       [<ffffffff81073bd7>] __do_softirq+0xb7/0x1e0
       [<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170
       [<ffffffff810142cc>] call_softirq+0x1c/0x30
       [<ffffffff81015f35>] do_softirq+0x65/0xa0
       [<ffffffff810739d5>] irq_exit+0x85/0x90
       [<ffffffff814cf915>] do_IRQ+0x75/0xf0
       [<ffffffff81013ad3>] ret_from_intr+0x0/0x11
       <EOI>
       [<ffffffff8101bc01>] ? mwait_idle+0x71/0xd0
       [<ffffffff814cd80a>] ? atomic_notifier_call_chain+0x1a/0x20
       [<ffffffff81011e96>] cpu_idle+0xb6/0x110
       [<ffffffff814c17c8>] start_secondary+0x1fc/0x23f
      
      This resulted from the bonding driver registering packet handlers via
      dev_add_pack and then trying to call pskb_may_pull. If another packet handler
      (like for AF_PACKET sockets) gets called first, the delivered skb will have a
      user count > 1, which causes pskb_may_pull to BUG halt when it does its
      skb_shared check.  Fix this by calling skb_share_check prior to the may_pull
      call sites in the bonding driver to clone the skb when needed.  Tested
      successfully by myself and the reporter.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • bonding: correctly process non-linear skbs · 1ec33da4
      Andy Gospodarek authored
      commit ab12811c upstream.
      
      It was recently brought to my attention that 802.3ad mode bonds would no
      longer form when using some network hardware after a driver update.
      After snooping around I realized that the particular hardware was using
      page-based skbs and found that skb->data did not contain a valid LACPDU
      as it was not stored there.  That explained the inability to form an
      802.3ad-based bond.  For balance-alb mode bonds this was also an issue
      as ARPs would not be properly processed.
      
      This patch fixes the issue in my tests and should be applied to 2.6.36
      and as far back as anyone cares to add it to stable.
      
      Thanks to Alexander Duyck <alexander.h.duyck@intel.com> and Jesse
      Brandeburg <jesse.brandeburg@intel.com> for the suggestions on this one.
      Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
      CC: Alexander Duyck <alexander.h.duyck@intel.com>
      CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • sym53c8xx: Fix NULL pointer dereference in slave_destroy · 64494d9f
      Stratos Psomadakis authored
      commit cced5041 upstream.
      
      sym53c8xx_slave_destroy unconditionally assumes that sym53c8xx_slave_alloc has
      successfully allocated a sym_lcb. This can lead to a NULL pointer dereference
      (exposed by commit 4e6c82b3).
      Signed-off-by: Stratos Psomadakis <psomas@gentoo.org>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • nfsd: Fix oops when parsing a 0 length export · e9a7f323
      Sasha Levin authored
      commit b2ea70af upstream.
      
      expkey_parse() oopses when handling a 0 length export. This is easily
      triggerable from usermode by writing 0 bytes into
      '/proc/[proc id]/net/rpc/nfsd.fh/channel'.
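
      A userspace sketch of a length guard for this kind of input follows.
      It is an assumption about the shape of the check, not a quote of the
      kernel fix: demo_check_message() is hypothetical and simply refuses
      to look at mesg[mlen - 1] when mlen is 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Accepts only non-empty messages that end in a newline.
 * Returns 0 on success and -EINVAL otherwise. */
static int demo_check_message(const char *mesg, int mlen)
{
    assert(mesg != NULL);

    /* Test the length first, so that mesg[-1] is never read. */
    if (mlen < 1 || mesg[mlen - 1] != '\n')
        return -EINVAL;

    return 0;
}
```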
      
      Below is the log:
      
      [ 1402.286893] BUG: unable to handle kernel paging request at ffff880077c49fff
      [ 1402.287632] IP: [<ffffffff812b4b99>] expkey_parse+0x28/0x2e1
      [ 1402.287632] PGD 2206063 PUD 1fdfd067 PMD 1ffbc067 PTE 8000000077c49160
      [ 1402.287632] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      [ 1402.287632] CPU 1
      [ 1402.287632] Pid: 20198, comm: trinity Not tainted 3.2.0-rc2-sasha-00058-gc65cd37 #6
      [ 1402.287632] RIP: 0010:[<ffffffff812b4b99>]  [<ffffffff812b4b99>] expkey_parse+0x28/0x2e1
      [ 1402.287632] RSP: 0018:ffff880077f0fd68  EFLAGS: 00010292
      [ 1402.287632] RAX: ffff880077c49fff RBX: 00000000ffffffea RCX: 0000000001043400
      [ 1402.287632] RDX: 0000000000000000 RSI: ffff880077c4a000 RDI: ffffffff82283de0
      [ 1402.287632] RBP: ffff880077f0fe18 R08: 0000000000000001 R09: ffff880000000000
      [ 1402.287632] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880077c4a000
      [ 1402.287632] R13: ffffffff82283de0 R14: 0000000001043400 R15: ffffffff82283de0
      [ 1402.287632] FS:  00007f25fec3f700(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000
      [ 1402.287632] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 1402.287632] CR2: ffff880077c49fff CR3: 0000000077e1d000 CR4: 00000000000406e0
      [ 1402.287632] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1402.287632] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 1402.287632] Process trinity (pid: 20198, threadinfo ffff880077f0e000, task ffff880077db17b0)
      [ 1402.287632] Stack:
      [ 1402.287632]  ffff880077db17b0 ffff880077c4a000 ffff880077f0fdb8 ffffffff810b411e
      [ 1402.287632]  ffff880000000000 ffff880077db17b0 ffff880077c4a000 ffffffff82283de0
      [ 1402.287632]  0000000001043400 ffffffff82283de0 ffff880077f0fde8 ffffffff81111f63
      [ 1402.287632] Call Trace:
      [ 1402.287632]  [<ffffffff810b411e>] ? lock_release+0x1af/0x1bc
      [ 1402.287632]  [<ffffffff81111f63>] ? might_fault+0x97/0x9e
      [ 1402.287632]  [<ffffffff81111f1a>] ? might_fault+0x4e/0x9e
      [ 1402.287632]  [<ffffffff81a8bcf2>] cache_do_downcall+0x3e/0x4f
      [ 1402.287632]  [<ffffffff81a8c950>] cache_write.clone.16+0xbb/0x130
      [ 1402.287632]  [<ffffffff81a8c9df>] ? cache_write_pipefs+0x1a/0x1a
      [ 1402.287632]  [<ffffffff81a8c9f8>] cache_write_procfs+0x19/0x1b
      [ 1402.287632]  [<ffffffff8118dc54>] proc_reg_write+0x8e/0xad
      [ 1402.287632]  [<ffffffff8113fe81>] vfs_write+0xaa/0xfd
      [ 1402.287632]  [<ffffffff8114142d>] ? fget_light+0x35/0x9e
      [ 1402.287632]  [<ffffffff8113ff8b>] sys_write+0x48/0x6f
      [ 1402.287632]  [<ffffffff81bbdb92>] system_call_fastpath+0x16/0x1b
      [ 1402.287632] Code: c0 c9 c3 55 48 63 d2 48 89 e5 48 8d 44 32 ff 41 57 41 56 41 55 41 54 53 bb ea ff ff ff 48 81 ec 88 00 00 00 48 89 b5 58 ff ff ff
      [ 1402.287632]  38 0a 0f 85 89 02 00 00 c6 00 00 48 8b 3d 44 4a e5 01 48 85
      [ 1402.287632] RIP  [<ffffffff812b4b99>] expkey_parse+0x28/0x2e1
      [ 1402.287632]  RSP <ffff880077f0fd68>
      [ 1402.287632] CR2: ffff880077c49fff
      [ 1402.287632] ---[ end trace 368ef53ff773a5e3 ]---
      
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: linux-nfs@vger.kernel.org
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
    • svcrpc: fix double-free on shutdown of nfsd after changing pool mode · b57fdc83
      J. Bruce Fields authored
      commit 61c8504c upstream.
      
      The pool_to and to_pool fields of the global svc_pool_map are freed on
      shutdown, but are initialized in nfsd startup only in the
      SVC_POOL_PERCPU and SVC_POOL_PERNODE cases.
      
      They *are* initialized to zero on kernel startup.  So as long as you use
      only SVC_POOL_GLOBAL (the default), this will never be a problem.
      
      You're also OK if you only ever use SVC_POOL_PERCPU or SVC_POOL_PERNODE.
      
      However, the following sequence of events leads to a double-free:
      
      	1. set SVC_POOL_PERCPU or SVC_POOL_PERNODE
      	2. start nfsd: both fields are initialized.
      	3. shutdown nfsd: both fields are freed.
      	4. set SVC_POOL_GLOBAL
      	5. start nfsd: the fields are left untouched.
      	6. shutdown nfsd: now we try to free them again.
      
      Step 4 is actually unnecessary, since (for some bizarre reason), nfsd
      automatically resets the pool mode to SVC_POOL_GLOBAL on shutdown.
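The sequence above can be replayed in a small user-space model. This is only a sketch in Python, not the kernel code: `Heap`, `PoolMap`, `replay` and the `clear_on_free` switch are hypothetical names, and clearing the fields after freeing them is one assumed remedy rather than a description of the actual patch.

```python
GLOBAL, PERCPU, PERNODE = "global", "percpu", "pernode"


class DoubleFree(Exception):
    """Raised when the toy heap is asked to free the same block twice."""


class Heap:
    """Toy allocator that tracks which handles are currently live."""

    def __init__(self):
        self._next = 1
        self._live = set()

    def alloc(self):
        handle = self._next
        self._next += 1
        self._live.add(handle)
        return handle

    def free(self, handle):
        if handle is None:  # freeing a NULL pointer is a no-op
            return
        if handle not in self._live:
            raise DoubleFree(handle)
        self._live.remove(handle)


class PoolMap:
    """Hypothetical stand-in for the global svc_pool_map."""

    def __init__(self, heap, clear_on_free):
        self.heap = heap
        self.clear_on_free = clear_on_free  # assumed remedy, see lead-in
        self.mode = GLOBAL
        self.to_pool = None  # zero-initialized at "kernel startup"
        self.pool_to = None

    def start(self):
        # Only the per-CPU and per-node modes allocate the two arrays.
        if self.mode in (PERCPU, PERNODE):
            self.to_pool = self.heap.alloc()
            self.pool_to = self.heap.alloc()

    def shutdown(self):
        self.heap.free(self.to_pool)
        self.heap.free(self.pool_to)
        if self.clear_on_free:
            self.to_pool = None
            self.pool_to = None
        self.mode = GLOBAL  # nfsd resets the pool mode on shutdown


def replay(clear_on_free):
    """Replay steps 1 to 6; return True if a double free was detected."""
    m = PoolMap(Heap(), clear_on_free)
    m.mode = PERCPU  # step 1
    m.start()        # step 2
    m.shutdown()     # step 3 (the mode reset makes step 4 implicit)
    m.start()        # step 5: fields are left untouched
    try:
        m.shutdown() # step 6
    except DoubleFree:
        return True
    return False
```

Calling `replay(False)` models the stale pointers surviving the first shutdown, while `replay(True)` models resetting them so that the second shutdown frees nothing.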
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      b57fdc83
    • Richard Weinberger's avatar
      UBI: fix nameless volumes handling · 2e9633af
      Richard Weinberger authored
      commit 4a59c797 upstream.
      
      Currently it's possible to create a volume without a name, e.g.:
      ubimkvol -n 32 -s 2MiB -t static /dev/ubi0 -N ""
      
      After that vtbl_check() will always fail because it does not permit
      empty strings.
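The mismatch between the creation path and the table check can be illustrated with a short Python model. It is a sketch under assumptions: `name_is_acceptable`, `create_volume` and the `validate` flag are hypothetical, the 127-character limit is assumed, and only `vtbl_check` borrows its name from the message above.

```python
UBI_VOL_NAME_MAX = 127  # assumed upper bound on the name length


def name_is_acceptable(name):
    """Creation-time check: reject empty and over-long names."""
    return 0 < len(name) <= UBI_VOL_NAME_MAX


def vtbl_check(names):
    """Model of the table check: it does not permit empty names."""
    return all(len(n) > 0 for n in names)


def create_volume(table, name, validate):
    """Append a volume name to the table, optionally validating it first."""
    if validate and not name_is_acceptable(name):
        return False
    table.append(name)
    return True
```

Without validation an empty name enters the table and every later `vtbl_check` fails; with validation the request is refused up front and the table stays consistent.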
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      2e9633af
    • Pavel Hofman's avatar
      ALSA: ice1724 - Check for ac97 to avoid kernel oops · 2533c024
      Pavel Hofman authored
      commit e7848163 upstream.
      
      Cards with identical PCI IDs but no AC97 config in EEPROM do not have
      the ac97 field initialized. We must check for this case to avoid a kernel oops.
      Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      2533c024
    • Moger, Babu's avatar
      SCSI: scsi_dh: check queuedata pointer before proceeding further · d6cca235
      Moger, Babu authored
      commit a18a920c upstream.
      
      This patch validates the sdev pointer in scsi_dh_activate() before proceeding further.
      
      Without this check we might see the panic shown below. I have seen this
      panic multiple times.
      
      Call trace:
      
       #0 [ffff88007d647b50] machine_kexec at ffffffff81020902
       #1 [ffff88007d647ba0] crash_kexec at ffffffff810875b0
       #2 [ffff88007d647c70] oops_end at ffffffff8139c650
       #3 [ffff88007d647c90] __bad_area_nosemaphore at ffffffff8102dd15
       #4 [ffff88007d647d50] page_fault at ffffffff8139b8cf
          [exception RIP: scsi_dh_activate+0x82]
          RIP: ffffffffa0041922  RSP: ffff88007d647e00  RFLAGS: 00010046
          RAX: 0000000000000000  RBX: 0000000000000000  RCX: 00000000000093c5
          RDX: 00000000000093c5  RSI: ffffffffa02e6640  RDI: ffff88007cc88988
          RBP: 000000000000000f   R8: ffff88007d646000   R9: 0000000000000000
          R10: ffff880082293790  R11: 00000000ffffffff  R12: ffff88007cc88988
          R13: 0000000000000000  R14: 0000000000000286  R15: ffff880037b845e0
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
       #5 [ffff88007d647e38] run_workqueue at ffffffff81060268
       #6 [ffff88007d647e78] worker_thread at ffffffff81060386
       #7 [ffff88007d647ee8] kthread at ffffffff81064436
       #8 [ffff88007d647f48] kernel_thread at ffffffff81003fba
      Signed-off-by: Babu Moger <babu.moger@netapp.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      d6cca235
    • Huajun Li's avatar
      usb: usb-storage doesn't support dynamic id currently, the patch disables the... · 3c890fae
      Huajun Li authored
      usb: usb-storage doesn't support dynamic id currently, the patch disables the feature to fix an oops
      
      commit 1a3a026b upstream.
      
      Echo the vendor and product numbers of a non-usb-storage device to the
      usb-storage driver's new_id, then plug the device into the host, and you
      will see the following oops. The root cause is that usb_stor_probe1()
      refers to an invalid id entry when given a dynamic id, so just disable
      the feature.
      
      [ 3105.018012] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ 3105.018062] CPU 0
      [ 3105.018075] Modules linked in: usb_storage usb_libusual bluetooth
      dm_crypt binfmt_misc snd_hda_codec_analog snd_hda_intel snd_hda_codec
      snd_hwdep hp_wmi ppdev sparse_keymap snd_pcm snd_seq_midi snd_rawmidi
      snd_seq_midi_event snd_seq snd_timer snd_seq_device psmouse snd
      serio_raw tpm_infineon soundcore i915 snd_page_alloc tpm_tis
      parport_pc tpm tpm_bios drm_kms_helper drm i2c_algo_bit video lp
      parport usbhid hid sg sr_mod sd_mod ehci_hcd uhci_hcd usbcore e1000e
      usb_common floppy
      [ 3105.018408]
      [ 3105.018419] Pid: 189, comm: khubd Tainted: G          I  3.2.0-rc7+
      #29 Hewlett-Packard HP Compaq dc7800p Convertible Minitower/0AACh
      [ 3105.018481] RIP: 0010:[<ffffffffa045830d>]  [<ffffffffa045830d>]
      usb_stor_probe1+0x2fd/0xc20 [usb_storage]
      [ 3105.018536] RSP: 0018:ffff880056a3d830  EFLAGS: 00010286
      [ 3105.018562] RAX: ffff880065f4e648 RBX: ffff88006bb28000 RCX: 0000000000000000
      [ 3105.018597] RDX: ffff88006f23c7b0 RSI: 0000000000000001 RDI: 0000000000000206
      [ 3105.018632] RBP: ffff880056a3d900 R08: 0000000000000000 R09: ffff880067365000
      [ 3105.018665] R10: 00000000000002ac R11: 0000000000000010 R12: ffff6000b41a7340
      [ 3105.018698] R13: ffff880065f4ef60 R14: ffff88006bb28b88 R15: ffff88006f23d270
      [ 3105.018733] FS:  0000000000000000(0000) GS:ffff88007a200000(0000)
      knlGS:0000000000000000
      [ 3105.018773] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 3105.018801] CR2: 00007fc99c8c4650 CR3: 0000000001e05000 CR4: 00000000000006f0
      [ 3105.018835] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 3105.018870] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 3105.018906] Process khubd (pid: 189, threadinfo ffff880056a3c000,
      task ffff88005677a400)
      [ 3105.018945] Stack:
      [ 3105.018959]  0000000000000000 0000000000000000 ffff880056a3d8d0
      0000000000000002
      [ 3105.019011]  0000000000000000 ffff880056a3d918 ffff880000000000
      0000000000000002
      [ 3105.019058]  ffff880056a3d8d0 0000000000000012 ffff880056a3d8d0
      0000000000000006
      [ 3105.019105] Call Trace:
      [ 3105.019128]  [<ffffffffa0458cd4>] storage_probe+0xa4/0xe0 [usb_storage]
      [ 3105.019173]  [<ffffffffa0097822>] usb_probe_interface+0x172/0x330 [usbcore]
      [ 3105.019211]  [<ffffffff815fda67>] driver_probe_device+0x257/0x3b0
      [ 3105.019243]  [<ffffffff815fdd43>] __device_attach+0x73/0x90
      [ 3105.019272]  [<ffffffff815fdcd0>] ? __driver_attach+0x110/0x110
      [ 3105.019303]  [<ffffffff815fb93c>] bus_for_each_drv+0x9c/0xf0
      [ 3105.019334]  [<ffffffff815fd6c7>] device_attach+0xf7/0x120
      [ 3105.019364]  [<ffffffff815fc905>] bus_probe_device+0x45/0x80
      [ 3105.019396]  [<ffffffff815f98a6>] device_add+0x876/0x990
      [ 3105.019434]  [<ffffffffa0094e42>] usb_set_configuration+0x822/0x9e0 [usbcore]
      [ 3105.019479]  [<ffffffffa00a3492>] generic_probe+0x62/0xf0 [usbcore]
      [ 3105.019518]  [<ffffffffa0097a46>] usb_probe_device+0x66/0xb0 [usbcore]
      [ 3105.019555]  [<ffffffff815fda67>] driver_probe_device+0x257/0x3b0
      [ 3105.019589]  [<ffffffff815fdd43>] __device_attach+0x73/0x90
      [ 3105.019617]  [<ffffffff815fdcd0>] ? __driver_attach+0x110/0x110
      [ 3105.019648]  [<ffffffff815fb93c>] bus_for_each_drv+0x9c/0xf0
      [ 3105.019680]  [<ffffffff815fd6c7>] device_attach+0xf7/0x120
      [ 3105.019709]  [<ffffffff815fc905>] bus_probe_device+0x45/0x80
      [ 3105.021040] usb usb6: usb auto-resume
      [ 3105.021045] usb usb6: wakeup_rh
      [ 3105.024849]  [<ffffffff815f98a6>] device_add+0x876/0x990
      [ 3105.025086]  [<ffffffffa0088987>] usb_new_device+0x1e7/0x2b0 [usbcore]
      [ 3105.025086]  [<ffffffffa008a4d7>] hub_thread+0xb27/0x1ec0 [usbcore]
      [ 3105.025086]  [<ffffffff810d5200>] ? wake_up_bit+0x50/0x50
      [ 3105.025086]  [<ffffffffa00899b0>] ? usb_remote_wakeup+0xa0/0xa0 [usbcore]
      [ 3105.025086]  [<ffffffff810d49b8>] kthread+0xd8/0xf0
      [ 3105.025086]  [<ffffffff81939884>] kernel_thread_helper+0x4/0x10
      [ 3105.025086]  [<ffffffff8192a8c0>] ? _raw_spin_unlock_irq+0x50/0x80
      [ 3105.025086]  [<ffffffff8192b1b4>] ? retint_restore_args+0x13/0x13
      [ 3105.025086]  [<ffffffff810d48e0>] ? __init_kthread_worker+0x80/0x80
      [ 3105.025086]  [<ffffffff81939880>] ? gs_change+0x13/0x13
      [ 3105.025086] Code: 00 48 83 05 cd ad 00 00 01 48 83 05 cd ad 00 00
      01 4c 8b ab 30 0c 00 00 48 8b 50 08 48 83 c0 30 48 89 45 a0 4c 89 a3
      40 0c 00 00 <41> 0f b6 44 24 10 48 89 55 a8 3c ff 0f 84 b8 04 00 00 48
      83 05
      [ 3105.025086] RIP  [<ffffffffa045830d>] usb_stor_probe1+0x2fd/0xc20
      [usb_storage]
      [ 3105.025086]  RSP <ffff880056a3d830>
      [ 3105.060037] hub 6-0:1.0: hub_resume
      [ 3105.062616] usb usb5: usb auto-resume
      [ 3105.064317] ehci_hcd 0000:00:1d.7: resume root hub
      [ 3105.094809] ---[ end trace a7919e7f17c0a727 ]---
      [ 3105.130069] hub 5-0:1.0: hub_resume
      [ 3105.132131] usb usb4: usb auto-resume
      [ 3105.132136] usb usb4: wakeup_rh
      [ 3105.180059] hub 4-0:1.0: hub_resume
      [ 3106.290052] usb usb6: suspend_rh (auto-stop)
      [ 3106.290077] usb usb4: suspend_rh (auto-stop)
      Signed-off-by: Huajun Li <huajun.li.lee@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      3c890fae
    • Benjamin Herrenschmidt's avatar
      offb: Fix bug in calculating requested vram size · ffee6c2e
      Benjamin Herrenschmidt authored
      commit c055fe07 upstream.
      
      We used to try to request 8 times more vram than needed, which would
      fail if the card's BAR is too small (observed with qemu & kvm).
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      ffee6c2e
    • Benjamin Herrenschmidt's avatar
      offb: Fix setting of the pseudo-palette for >8bpp · c377a00d
      Benjamin Herrenschmidt authored
      commit 1bb0b7d2 upstream.
      
      When using a >8bpp framebuffer, offb advertises truecolor, not directcolor,
      and doesn't touch the color map even if it has a corresponding access method
      for the real hardware.
      
      Thus it needs to set the pseudo-palette with all 3 components of the color,
      like other truecolor framebuffers, not with copies of the color index like
      a directcolor framebuffer would do.
      
      This went unnoticed for a long time because it's pretty hard to get offb
      to kick in with anything but 8bpp (old BootX under MacOS will do that and
      qemu does it).
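The difference between the two ways of filling a pseudo-palette entry can be shown with a few lines of Python. This is a sketch, not the driver code: it assumes a 32bpp layout with 8-bit red, green and blue fields at bit offsets 16, 8 and 0, and 16-bit color components as input; both function names are hypothetical.

```python
def truecolor_entry(red, green, blue):
    """Pack all three 16-bit components into one pixel value (assumed xRGB8888)."""
    r, g, b = red >> 8, green >> 8, blue >> 8  # keep the top 8 bits of each
    return (r << 16) | (g << 8) | b


def index_copy_entry(regno):
    """Directcolor-style entry: the color index copied into every field."""
    return (regno << 16) | (regno << 8) | regno
```

For palette register 1 with a pure blue color, `truecolor_entry(0, 0, 0xaaaa)` yields a blue pixel value, whereas `index_copy_entry(1)` yields a near-black grey, which is meaningful only if the hardware color map is also programmed.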
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      c377a00d
    • Andrea Arcangeli's avatar
      ext4: avoid hangs in ext4_da_should_update_i_disksize() · ea242bf2
      Andrea Arcangeli authored
      commit ea51d132 upstream.
      
      If the pte mapping in generic_perform_write() is unmapped between
      iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
      "copied" parameter to ->end_write can be zero. ext4 couldn't cope with
      it with delayed allocations enabled. This skips the i_disksize
      enlargement logic if copied is zero and no new data was appended to
      the inode.
      
       gdb> bt
       #0  0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\
       08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
       #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
       xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
       #2  0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\
       ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
       #3  generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\
       os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
       #4  0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\
       xffff88001e26be40) at mm/filemap.c:2600
       #5  0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\
       zed out>, pos=<value optimized out>) at mm/filemap.c:2632
       #6  0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\
       t fs/ext4/file.c:136
       #7  0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \
       ppos=0xffff88001e26bf48) at fs/read_write.c:406
       #8  0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\
       000, pos=0xffff88001e26bf48) at fs/read_write.c:435
       #9  0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\
       4000) at fs/read_write.c:487
       #10 <signal handler called>
       #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
       #12 0x0000000000000000 in ?? ()
       gdb> print offset
       $22 = 0xffffffffffffffff
       gdb> print idx
       $23 = 0xffffffff
       gdb> print inode->i_blkbits
       $24 = 0xc
       gdb> up
       #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
       xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
       2512                    if (ext4_da_should_update_i_disksize(page, end)) {
       gdb> print start
       $25 = 0x0
       gdb> print end
       $26 = 0xffffffffffffffff
       gdb> print pos
       $27 = 0x108000
       gdb> print new_i_size
       $28 = 0x108000
       gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
       $29 = 0xd9000
       gdb> down
       2467            for (i = 0; i < idx; i++)
       gdb> print i
       $30 = 0xd44acbee
      
      This is 100% reproducible with some autonuma development code tuned in
      a very aggressive manner (not the normal way, even for knumad) which makes
      "exotic" changes to the ptes. It wouldn't normally trigger, but I don't
      see why it can't happen normally if the page is added to swap cache in
      between the two faults, leading to "copied" being zero (which then
      hangs in ext4). So it should be fixed. It is especially possible with lumpy
      reclaim (albeit disabled if compaction is enabled), as that would
      ignore the young bits in the ptes.
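The values in the gdb session follow from unsigned wraparound, which the Python sketch below reproduces. It rests on assumptions read off the trace rather than the source: a 4096-byte page, `end` computed as `start + copied - 1` in 64-bit unsigned arithmetic, and `idx` held in a 32-bit unsigned variable. The function names and the guard in `should_check_disksize` are hypothetical illustrations of "skip the logic when copied is zero".

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1
PAGE_CACHE_SIZE = 4096  # assumed page size


def write_end_range(pos, copied):
    """Return (start, end) within the page, with 64-bit unsigned wraparound."""
    start = pos & (PAGE_CACHE_SIZE - 1)
    end = (start + copied - 1) & MASK64  # copied == 0 wraps to 2**64 - 1
    return start, end


def buffer_index(offset, blkbits):
    """Number of buffers to skip, truncated to an assumed 32-bit unsigned int."""
    return (offset >> blkbits) & MASK32


def should_check_disksize(copied, new_i_size, i_disksize):
    """Assumed guard: only look at the buffers when data was actually copied."""
    return copied != 0 and new_i_size > i_disksize
```

With `pos=0x108000` and `copied=0`, `write_end_range` returns an `end` of `0xffffffffffffffff`, and `buffer_index` on that offset with `blkbits=0xc` gives `0xffffffff`, matching `$26` and `$23` in the trace. A loop bounded by that index over a page's handful of buffers is consistent with the hang described.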
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      ea242bf2
    • Robert Richter's avatar
      oprofile, x86: Fix nmi-unsafe callgraph support · d57c4ca8
      Robert Richter authored
      commit a0e3e702 upstream.
      
      Backport for stable kernel v2.6.32.y to v2.6.36.y.
      
      Current oprofile's x86 callgraph support may trigger page faults
      throwing the BUG_ON(in_nmi()) message below. This patch fixes this by
      using the same nmi-safe copy-from-user code as in perf.
      
      ------------[ cut here ]------------
      kernel BUG at .../arch/x86/kernel/traps.c:436!
      invalid opcode: 0000 [#1] SMP
      last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:07:00.0/0000:08:04.0/net/eth0/broadcast
      CPU 5
      Modules linked in:
      
      Pid: 8611, comm: opcontrol Not tainted 2.6.39-00007-gfe47ae7f #1 Advanced Micro Device Anaheim/Anaheim
      RIP: 0010:[<ffffffff813e8e35>]  [<ffffffff813e8e35>] do_nmi+0x22/0x1ee
      RSP: 0000:ffff88042fd47f28  EFLAGS: 00010002
      RAX: ffff88042c0a7fd8 RBX: 0000000000000001 RCX: 00000000c0000101
      RDX: 00000000ffff8804 RSI: ffffffffffffffff RDI: ffff88042fd47f58
      RBP: ffff88042fd47f48 R08: 0000000000000004 R09: 0000000000001484
      R10: 0000000000000001 R11: 0000000000000000 R12: ffff88042fd47f58
      R13: 0000000000000000 R14: ffff88042fd47d98 R15: 0000000000000020
      FS:  00007fca25e56700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000074 CR3: 000000042d28b000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process opcontrol (pid: 8611, threadinfo ffff88042c0a6000, task ffff88042c532310)
      Stack:
       0000000000000000 0000000000000001 ffff88042c0a7fd8 0000000000000000
       ffff88042fd47de8 ffffffff813e897a 0000000000000020 ffff88042fd47d98
       0000000000000000 ffff88042c0a7fd8 ffff88042fd47de8 0000000000000074
      Call Trace:
       <NMI>
       [<ffffffff813e897a>] nmi+0x1a/0x20
       [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771
       <<EOE>>
      Code: ff 59 5b 41 5c 41 5d c9 c3 55 65 48 8b 04 25 88 b5 00 00 48 89 e5 41 55 41 54 49 89 fc 53 48 83 ec 08 f6 80 47 e0 ff ff 04 74 04 <0f> 0b eb fe 81 80 44 e0 ff ff 00 00 01 04 65 ff 04 25 c4 0f 01
      RIP  [<ffffffff813e8e35>] do_nmi+0x22/0x1ee
       RSP <ffff88042fd47f28>
      ---[ end trace ed6752185092104b ]---
      Kernel panic - not syncing: Fatal exception in interrupt
      Pid: 8611, comm: opcontrol Tainted: G      D     2.6.39-00007-gfe47ae7f #1
      Call Trace:
       <NMI>  [<ffffffff813e5e0a>] panic+0x8c/0x188
       [<ffffffff813e915c>] oops_end+0x81/0x8e
       [<ffffffff8100403d>] die+0x55/0x5e
       [<ffffffff813e8c45>] do_trap+0x11c/0x12b
       [<ffffffff810023c8>] do_invalid_op+0x91/0x9a
       [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee
       [<ffffffff8131e6fa>] ? oprofile_add_sample+0x83/0x95
       [<ffffffff81321670>] ? op_amd_check_ctrs+0x4f/0x2cf
       [<ffffffff813ee4d5>] invalid_op+0x15/0x20
       [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee
       [<ffffffff813e8e7a>] ? do_nmi+0x67/0x1ee
       [<ffffffff813e897a>] nmi+0x1a/0x20
       [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771
       <<EOE>>
      
      Cc: John Lumby <johnlumby@hotmail.com>
      Cc: Maynard Johnson <maynardj@us.ibm.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      d57c4ca8
    • Xiao Guangrong's avatar
      export __get_user_pages_fast() function · 40a10c27
      Xiao Guangrong authored
      commit 45888a0c upstream.
      
      Backport for stable kernel v2.6.32.y to v2.6.36.y.
      
      Needed for next patch:
      
       oprofile, x86: Fix nmi-unsafe callgraph support
      
      This function is used by KVM to pin a process's page in atomic context.
      
      Define it as a 'weak' function so that architectures that do not support it still build.
      Acked-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      40a10c27
    • Peter Zijlstra's avatar
      x86, mm: Add __get_user_pages_fast() · 8669a4b6
      Peter Zijlstra authored
      
      Introduce a gup_fast() variant which is usable from IRQ/NMI context.
      
      [ WT: this one is only needed for next patch ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Nick Piggin <npiggin@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      8669a4b6
    • Phillip Lougher's avatar
      hfs: fix hfs_find_init() sb->ext_tree NULL ptr oops · 3db7e32f
      Phillip Lougher authored
      commit 434a964d upstream.
      
      Clement Lecigne reports a filesystem which causes a kernel oops in
      hfs_find_init() trying to dereference sb->ext_tree which is NULL.
      
      This proves to be because the filesystem has a corrupted MDB extent
      record, where the extents file does not fit into the first three extents
      in the file record (the first blocks).
      
      In hfs_get_block() when looking up the blocks for the extent file
      (HFS_EXT_CNID), it fails the first blocks special case, and falls
      through to the extent code (which ultimately calls hfs_find_init())
      which is in the process of being initialised.
      
      HFS avoids this scenario by always having the extents B-tree fit
      into the first blocks (the extents B-tree can't have overflow extents).
      
      The fix is to check at mount time that the B-tree fits into the first
      blocks, i.e. fail if HFS_I(inode)->alloc_blocks >=
      HFS_I(inode)->first_blocks.
      
      Note, the existing commit 47f365eb ("hfs: fix oops on mount with
      corrupted btree extent records") becomes subsumed into this as a special
      case, but only for the extents B-tree (HFS_EXT_CNID). It is perfectly
      acceptable for the catalog B-tree file to grow beyond three extents,
      with the remaining extent descriptors in the extents overflow.
      
      [WT: patch edited - 47f365eb was missing from 2.6.27.x]
      
      This fixes CVE-2011-2203
      Reported-by: Clement LECIGNE <clement.lecigne@netasq.com>
      Signed-off-by: Phillip Lougher <plougher@redhat.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Moritz Mühlenhoff <jmm@inutil.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      3db7e32f
    • Linus Torvalds's avatar
      Make TASKSTATS require root access · 52556d4b
      Linus Torvalds authored
      commit 1a51410a upstream.
      
      Ok, this isn't optimal, since it means that 'iotop' needs admin
      capabilities, and we may have to work on this some more.  But at the
      same time it is very much not acceptable to let anybody just read
      anybody else's IO statistics quite at this level.
      
      Use of the GENL_ADMIN_PERM suggested by Johannes Berg as an alternative
      to checking the capabilities by hand.
      Reported-by: Vasiliy Kulikov <segoon@openwall.com>
      Cc: Johannes Berg <johannes.berg@intel.com>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Moritz Mühlenhoff <jmm@inutil.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      52556d4b
    • Eryu Guan's avatar
      jbd/jbd2: validate sb->s_first in journal_get_superblock() · 75016c6e
      Eryu Guan authored
      commit 8762202d upstream.
      
      I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when
      mounting an fsfuzzed ext3 image. It turns out that the corrupted ext3
      image has s_first = 0 in the journal superblock. The 0 is passed to
      journal->j_head in journal_reset(), then to blocknr in
      cleanup_journal_tail(), where the J_ASSERT finally fails.
      
      So validate s_first after reading journal superblock from disk in
      journal_get_superblock() to ensure s_first is valid.
      
      The following script could reproduce it:
      
      fstype=ext3
      blocksize=1024
      img=$fstype.img
      offset=0
      found=0
      magic="c0 3b 39 98"
      
      dd if=/dev/zero of=$img bs=1M count=8
      mkfs -t $fstype -b $blocksize -F $img
      filesize=`stat -c %s $img`
      while [ $offset -lt $filesize ]
      do
              if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then
                      echo "Found journal: $offset"
                      found=1
                      break
              fi
              offset=`echo "$offset+$blocksize" | bc`
      done
      
      if [ $found -ne 1 ];then
              echo "Magic \"$magic\" not found"
              exit 1
      fi
      
      dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1
      
      mkdir -p ./mnt
      mount -o loop $img ./mnt
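
What the script does to the image, and the kind of check that catches it, can be modelled in user space. The Python sketch below assumes the journal superblock starts with six big-endian 32-bit fields (magic, block type, sequence, block size, maximum length, first block), which places the low byte of s_first at byte 23, the byte the script zeroes. The helper names, the sample field values and the upper bound in `s_first_is_valid` are assumptions for illustration, not the kernel's code.

```python
import struct

JOURNAL_MAGIC = 0xC03B3998  # the "c0 3b 39 98" bytes the script searches for


def parse_journal_sb(buf):
    """Decode the first six big-endian 32-bit fields of a journal superblock."""
    magic, blocktype, sequence, blocksize, maxlen, first = struct.unpack(
        ">6I", bytes(buf[:24])
    )
    return {
        "magic": magic,
        "blocktype": blocktype,
        "sequence": sequence,
        "blocksize": blocksize,
        "maxlen": maxlen,
        "first": first,
    }


def s_first_is_valid(sb):
    """Reject s_first == 0; the upper bound is an assumed extra sanity check."""
    return 0 < sb["first"] < sb["maxlen"]


def make_sample_sb():
    """Build an illustrative superblock header with s_first = 1."""
    return bytearray(struct.pack(">6I", JOURNAL_MAGIC, 4, 1, 1024, 1024, 1))


def corrupt_like_script(buf):
    """Zero byte 23, as the dd command in the script does relative to the magic."""
    buf[23] = 0
    return buf
```

Parsing `make_sample_sb()` gives a superblock that passes `s_first_is_valid`, while the same buffer after `corrupt_like_script` has `first == 0` and is rejected before the value can reach any later assertion.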
      
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Moritz Mühlenhoff <jmm@inutil.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      75016c6e
    • Robert Richter's avatar
      oprofile, x86: Fix crash when unloading module (nmi timer mode) · 13ca84e1
      Robert Richter authored
      commit 97f7f818 upstream.
      
      If oprofile uses the nmi timer interrupt there is a crash while
      unloading the module. The bug can be triggered with oprofile built as
      a module and the kernel parameter nolapic set. This patch fixes this.
      
      oprofile: using NMI timer interrupt.
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      IP: [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
      PGD 42dbca067 PUD 41da6a067 PMD 0
      Oops: 0002 [#1] PREEMPT SMP
      CPU 5
      Modules linked in: oprofile(-) [last unloaded: oprofile]
      
      Pid: 2518, comm: modprobe Not tainted 3.1.0-rc7-00019-gb2fb49d #19 Advanced Micro Device Anaheim/Anaheim
      RIP: 0010:[<ffffffff8123c226>]  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
      RSP: 0018:ffff88041ef71e98  EFLAGS: 00010296
      RAX: 0000000000000000 RBX: ffffffffa0017100 RCX: dead000000200200
      RDX: 0000000000000000 RSI: dead000000100100 RDI: ffffffff8178c620
      RBP: ffff88041ef71ea8 R08: 0000000000000001 R09: 0000000000000082
      R10: 0000000000000000 R11: ffff88041ef71de8 R12: 0000000000000080
      R13: fffffffffffffff5 R14: 0000000000000001 R15: 0000000000610210
      FS:  00007fc902f20700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000008 CR3: 000000041cdb6000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process modprobe (pid: 2518, threadinfo ffff88041ef70000, task ffff88041d348040)
      Stack:
       ffff88041ef71eb8 ffffffffa0017790 ffff88041ef71eb8 ffffffffa0013532
       ffff88041ef71ec8 ffffffffa00132d6 ffff88041ef71ed8 ffffffffa00159b2
       ffff88041ef71f78 ffffffff81073115 656c69666f72706f 0000000000610200
      Call Trace:
       [<ffffffffa0013532>] op_nmi_exit+0x15/0x17 [oprofile]
       [<ffffffffa00132d6>] oprofile_arch_exit+0xe/0x10 [oprofile]
       [<ffffffffa00159b2>] oprofile_exit+0x1e/0x20 [oprofile]
       [<ffffffff81073115>] sys_delete_module+0x1c3/0x22f
       [<ffffffff811bf09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
       [<ffffffff8148070b>] system_call_fastpath+0x16/0x1b
      Code: 20 c6 78 81 e8 c5 cc 23 00 48 8b 13 48 8b 43 08 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de 48 c7 c7 20 c6 78 81
       89 42 08 48 89 10 48 89 33 48 89 4b 08 e8 a6 c0 23 00 5a 5b
      RIP  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
       RSP <ffff88041ef71e98>
      CR2: 0000000000000008
      ---[ end trace 43a541a52956b7b0 ]---
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      13ca84e1
    • Hannes Reinecke's avatar
      SCSI: Silencing 'killing requests for dead queue' · dcac16cc
      Hannes Reinecke authored
      commit 74571813 upstream.
      
      When we tear down a device we try to flush all outstanding
      commands in scsi_free_queue(). However, the check in
      scsi_request_fn() is imperfect, as it only signals that
      we _might start_ aborting commands, not that we've actually
      aborted some.
      So move the printk inside the scsi_kill_request() function;
      this will also give us a hint about which commands are aborted.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      dcac16cc
    • Andrew Worsley's avatar
      USB: Fix Corruption issue in USB ftdi driver ftdi_sio.c · 68a59259
      Andrew Worsley authored
      commit b1ffb4c8 upstream.
      
      Fix for ftdi_set_termios() glitching output
      
      ftdi_set_termios() is constantly setting the baud rate, data bits and parity
      unnecessarily on every call. When called while characters are being
      transmitted, this can cause the FTDI chip to corrupt the serial port bit stream
      output by stalling the output for half a bit during the output of a character.
      The simple fix is to skip this setting if the baud rate/data bits/parity are
      unchanged.
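The "skip if unchanged" idea can be sketched independently of the driver. The Python model below is illustrative only: `LineSettings`, `FakeChip` and `set_termios` are hypothetical names, and the chip is reduced to a counter of how often it gets reprogrammed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineSettings:
    """The three settings named in the message above."""
    baud: int
    data_bits: int
    parity: str


class FakeChip:
    """Stand-in for the hardware; counts every reprogramming request."""

    def __init__(self):
        self.reprogram_count = 0

    def program(self, settings):
        self.reprogram_count += 1


def set_termios(chip, old, new):
    """Reprogram the chip only when the line settings actually changed."""
    if old is not None and old == new:
        return False  # nothing changed: leave the transmitter alone
    chip.program(new)
    return True
```

Repeated calls with identical settings then leave `reprogram_count` untouched, so a transfer in progress is not disturbed by redundant requests.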
      Signed-off-by: Andrew Worsley <amworsley@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      68a59259
    • Dan Carpenter's avatar
      hfs: add sanity check for file name length · 9f5e4da2
      Dan Carpenter authored
      commit bc5b8a90 upstream.
      
      On a corrupted file system the ->len field could be wrong leading to
      a buffer overflow.
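A length sanity check of this kind can be sketched as follows. The Python model assumes a Pascal-style on-disk name (one length byte followed by the characters) and a 31-character limit; `HFS_NAMELEN` as a constant value and the `read_name` helper are assumptions for illustration. Python slicing cannot overflow a buffer, so the clamp here only demonstrates the logic a C copy loop would need.

```python
HFS_NAMELEN = 31  # assumed maximum file name length


def read_name(raw):
    """Return the name bytes, never trusting a length above HFS_NAMELEN."""
    length = raw[0]
    if length > HFS_NAMELEN:  # corrupted ->len field
        length = HFS_NAMELEN
    return raw[1:1 + length]
```

A record claiming 200 characters is therefore truncated to 31, while well-formed names pass through unchanged.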
      Reported-and-acked-by: Clement LECIGNE <clement.lecigne@netasq.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      9f5e4da2
    • Bart Van Assche's avatar
      Make scsi_free_queue() kill pending SCSI commands · 41503f12
      Bart Van Assche authored
      commit 3308511c upstream.
      
      Make sure that SCSI device removal via scsi_remove_host() does finish
      all pending SCSI commands. Currently that's not the case and hence
      removal of a SCSI host during I/O can cause a deadlock. See also
      "blkdev_issue_discard() hangs forever if underlying storage device is
      removed" (http://bugzilla.kernel.org/show_bug.cgi?id=40472). See also
      http://lkml.org/lkml/2011/8/27/6.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      41503f12