- 09 Sep, 2009 40 commits
-
-
Christoph Hellwig authored
backport of upstream commit 54e34621 Currently inode_init_always calls into ->destroy_inode if the additional initialization fails. That's not only counter-intuitive because inode_init_always did not allocate the inode structure, but in case of XFS it's actively harmful as ->destroy_inode might delete the inode from a radix-tree that has never been added. This in turn might end up deleting the inode for the same inum that has been instanciated by another process and cause lots of cause subtile problems. Also in the case of re-initializing a reclaimable inode in XFS it would free an inode we still want to keep alive. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Eric Sandeen <sandeen@sandeen.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Frans Pop authored
commit 2a908002 upstream. If the BIOS reports an invalid throttling state (which seems to be fairly common after system boot), a reset is done to state T0. Because of a check in acpi_processor_get_throttling_ptc(), the reset never actually gets executed, which results in the error reoccurring on every access of for example /proc/acpi/processor/CPU0/throttling. Add a 'force' option to acpi_processor_set_throttling() to ensure the reset really takes effect. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389 This patch, together with the next one, fixes a regression introduced in 2.6.30, listed on the regression list. They have been available for 2.5 months now in bugzilla, but have not been picked up, despite various reminders and without any reason given. Google shows that numerous people are hitting this issue. The issue is in itself relatively minor, but the bug in the code is clear. The patches have been in all my kernels and today testing has shown that throttling works correctly with the patches applied when the system overheats (http://bugzilla.kernel.org/show_bug.cgi?id=13918#c14 ). Signed-off-by:
Frans Pop <elendil@planet.nl> Acked-by:
Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Sunil Mushran authored
commit e7432675 upstream. In a non-sparse extend, we correctly allocate (and zero) the clusters between the old_i_size and pos, but we don't zero the portions of the cluster we're writing to outside of pos<->len. It handles clustersize > pagesize and blocksize < pagesize. [Cleaned up by Joel Becker.] Signed-off-by:
Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by:
Joel Becker <joel.becker@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jeremy Fitzhardinge authored
commit 2cb07860 upstream. If we've logically disabled apics, don't probe the PCI space for the AMD extended APIC ID. [ Impact: prevent boot crash under Xen. ] Signed-off-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reported-by:
Bastian Blank <bastian@waldi.eu.org> Signed-off-by:
H. Peter Anvin <hpa@zytor.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Fenghua Yu authored
commit 51b89f7a upstream. In commit 160c1d8e , dma_ops->dma_supported = iommu_dma_supported; This dma_ops->dma_supported is first called in platform_dma_init() during kernel boot. Then dma_ops->dma_supported will be called recursively in iommu_dma_supported. Kernel can not boot because kernel can not get out of iommu_dma_supported until it runs out of stack memory. Signed-off-by:
Fenghua Yu <fenghua.yu@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Linus Torvalds authored
commit 0083fc2c upstream. Ulrich Drepper correctly points out that there is generally padding in the structure on 64-bit hosts, and that copying the structure from kernel to user space can leak information from the kernel stack in those padding bytes. Avoid the whole issue by just copying the three members one by one instead, which also means that the function also can avoid the need for a stack frame. This also happens to match how we copy the new structure from user space, so it all even makes sense. [ The obvious solution of adding a memset() generates horrid code, gcc does really stupid things. ] Reported-by:
Ulrich Drepper <drepper@redhat.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kashyap, Desai authored
commit 388ce4be upstream. Moving the setting and clearing of the mutex's to _config_request. There was a mutex deadlock when diag reset is called from inside _config_request, so diag reset was moved to outside the mutexs. Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by:
Eric Moore <Eric.moore@lsi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kashyap, Desai authored
commit fcfe6392 upstream. Fix another ocurring when the system resumes. This oops was due to driver setting the pci drvdata to NULL on the prior hibernation. Becuase it was set to NULL, upon resmume we assume the pci drvdata is non-zero, and we oops. To fix the ooops, we don't set pci drvdata to NULL at hibernation time. Signed-off-by:
Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kashyap, Desai authored
commit e4750c98 upstream. Fix oops ocurring at hibernation time. This oops was due to the firmware fault watchdog timer still running after we freed resources. To fix the issue we need to terminate the watchdog timer at hibernation time. Signed-off-by:
Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kashyap, Desai authored
commit 6bd4e1e4 upstream. This restriction is introduced just to avoid loop of config_request. Retry must be limited so we have restricted config request to maximum 2 times. Signed-off-by:
Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kashyap, Desai authored
commit be9e8cd7 upstream. Inhibit 0x3117 loginfos - during cable pull, there are too many printks going to the syslog, this is have impact on how fast the interrupt routine can handle keeping up with command completions; this was the root cause to the config pages timeouts. Signed-off-by:
Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kashyap, Desai authored
commit 62727a7b upstream. When a volume is activated, the driver will recieve a pair of ir config change events to remove the foreign volume, then add the native. In the process of the removal event, the hidden raid componet is removed from the parent.When the disks is added back, the adding of the port fails becuase there is no instance of the device in its parent. To fix this issue, the driver needs to call mpt2sas_transport_update_links() prior to calling _scsih_add_device. In addition, we added sanity checks on volume add and removal to ignore events for foreign volumes. Signed-off-by:
Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kashyap, Desai authored
commit 20f5895d upstream. Kernel panic is seen because driver did not tear down the port which should be dnoe using mpt2sas_transport_port_remove(). without this fix When expander is added back we would oops inside sas_port_add_phy. Signed-off-by:
Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kashyap, Desai authored
commit 15052c9e upstream. Kernel panic is seen because of enclosure_handle received from FW is zero. Check is introduced before calling mpt2sas_config_get_enclosure_pg0. Signed-off-by:
Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tejun Heo authored
commit 7831387b upstream. OCZ Vertex SSD can't do HPA and not in a usual way. It reports HPA, allows unlocking but then fails all IOs which fall in the unlocked area. Quirk it so that HPA unlocking is not used for the device. Reported by Daniel Perup in bnc#522414. https://bugzilla.novell.com/show_bug.cgi?id=522414 Signed-off-by:
Tejun Heo <tj@kernel.org> Reported-by:
Daniel Perup <probe@spray.se> Signed-off-by:
Jeff Garzik <jgarzik@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Reinette Chatre authored
commit 99f1b015 upstream. Do all key clearing except sending sommands to device when rfkill enabled. When rfkill enabled the interface is brought down and will be brought back up correctly after rfkill is enabled again. Same change is not needed for iwl3945 as it ignores return code when sending key clearing command to device. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=13742 Signed-off-by:
Reinette Chatre <reinette.chatre@intel.com> Tested-by:
Frans Pop <elendil@planet.nl> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Stanislaw Gruszka authored
(Not needed upstream, due to the major rewrite in 2.6.31) Due to rfkill and iwlwifi mishmash of SW / HW killswitch representation, we have race conditions which make unable turn wifi radio on, after enable and disable again killswitch. I can observe this problem on my laptop with iwl3945 device. In rfkill core HW switch and SW switch are separate 'states'. Device can be only in one of 3 states: RFKILL_STATE_SOFT_BLOCKED, RFKILL_STATE_UNBLOCKED, RFKILL_STATE_HARD_BLOCKED. Whereas in iwlwifi driver we have separate bits STATUS_RF_KILL_HW and STATUS_RF_KILL_SW for HW and SW switches - radio can be turned on, only if both bits are cleared. In this particular race conditions, radio can not be turned on if in driver STATUS_RF_KILL_SW bit is set, and rfkill core is in state RFKILL_STATE_HARD_BLOCKED, because rfkill core is unable to call rfkill->toggle_radio(). This situation can be entered in case: - killswitch is turned on - rfkill core 'see' button change first and move to RFKILL_STATE_SOFT_BLOCKED also call ->toggle_radio() and STATE_RF_KILL_SW in driver is set - iwl3945 get info about button from hardware to set STATUS_RF_KILL_HW bit and force rfkill to move to RFKILL_STATE_HARD_BLOCKED - killsiwtch is turend off - driver clear STATUS_RF_KILL_HW - rfkill core is unable to clear STATUS_RF_KILL_SW in driver Additionally call to rfkill_epo() when STATUS_RF_KILL_HW in driver is set cause move to the same situation. In 2.6.31 this problem is fixed due to _total_ rewrite of rfkill subsystem. This is a quite small fix for 2.6.30.x in iwlwifi driver. We are changing internal rfkill state to always have below relations true: STATUS_RF_KILL_HW=1 STATUS_RF_KILL_SW=1 <-> RFKILL_STATUS_SOFT_BLOCKED STATUS_RF_KILL_HW=0 STATUS_RF_KILL_SW=1 <-> RFKILL_STATUS_SOFT_BLOCKED STATUS_RF_KILL_HW=1 STATUS_RF_KILL_SW=0 <-> RFKILL_STATUS_HARD_BLOCKED STATUS_RF_KILL_HW=0 STATUS_RF_KILL_SW=0 <-> RFKILL_STATUS_UNBLOCKED Signed-off-by:
Stanislaw Gruszka <sgruszka@redhat.com> Acked-by:
Reinette Chatre <reinette.chatre@intel.com> Acked-by:
John W. Linville <linville@tuxdriver.com>
-
Jan Kiszka authored
commit e125e7b6 upstream. So far, KVM copied the emulated_msrs (only MSR_IA32_MISC_ENABLE) to a wrong address in user space due to broken pointer arithmetic. This caused subtle corruption up there (missing MSR_IA32_MISC_ENABLE had probably no practical relevance). Moreover, the size check for the user-provided kvm_msr_list forgot about emulated MSRs. Signed-off-by:
Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Michael S. Tsirkin authored
(cherry picked from commit 5116d8f6 ) kvm_notify_acked_irq does not check irq type, so that it sometimes interprets msi vector as irq. As a result, ack notifiers are not called, which typially hangs the guest. The fix is to track and check irq type. Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit 53a27b39 ) Otherwise the host can spend too long traversing an rmap chain, which happens under a spinlock. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit 025dbbf3 ) kvm_mmu_change_mmu_pages mishandles the case where n_alloc_mmu_pages is smaller then n_free_mmu_pages, by not checking if the result of the subtraction is negative. Its a valid condition which can happen if a large number of pages has been recently freed. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit 4b656b12) If a migrated vcpu matches the asid_generation value of the target pcpu, there will be no TLB flush via TLB_CONTROL_FLUSH_ALL_ASID. The check for vcpu.cpu in pre_svm_run is meaningless since svm_vcpu_load already updated it on schedule in. Such vcpu will VMRUN with stale TLB entries. Based on original patch from Joerg Roedel (http://patchwork.kernel.org/patch/10021/ ) Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Acked-by:
Joerg Roedel <joerg.roedel@amd.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit d6289b93 ) Do not allow invalid memory types in MTRR/PAT (generating a #GP otherwise). Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit 8d753f36 ) MTRR, PAT, MCE, and MCA are all supported (to some extent) but not reported. Vista requires these features, so if userspace relies on kernel cpuid reporting, it loses support for Vista. Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Amit Shah authored
(cherry picked from commit 9e699624) In commit 7fe29e0f we ignored the reads to the P6 EVNTSEL MSRs. That fixed crashes on Intel machines. Ignore the reads to K7 EVNTSEL MSRs as well to fix this on AMD hosts. This fixes Kaspersky antivirus crashing Windows guests on AMD hosts. Signed-off-by:
Amit Shah <amit.shah@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Amit Shah authored
(cherry picked from commit 7fe29e0f ) We ignore writes to the performance counters and performance event selector registers already. Kaspersky antivirus reads the eventsel MSR causing it to crash with the current behaviour. Return 0 as data when the eventsel registers are read to stop the crash. Signed-off-by:
Amit Shah <amit.shah@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit 9645bb56 ) A pte that is shadowed when the guest EFER.NXE=1 is not valid when EFER.NXE=0; if bit 63 is set, the pte should cause a fault, and since the shadow EFER always has NX enabled, this won't happen. Fix by using a different shadow page table for different EFER.NXE bits. This allows vcpus to run correctly with different values of EFER.NXE, and for transitions on this bit to be handled correctly without requiring a full flush. Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Glauber Costa authored
(cherry picked from commit 310b5d30) We currently unblock shadow interrupt state when we skip an instruction, but failing to do so when we actually emulate one. This blocks interrupts in key instruction blocks, in particular sti; hlt; sequences If the instruction emulated is an sti, we have to block shadow interrupts. The same goes for mov ss. pop ss also needs it, but we don't currently emulate it. Without this patch, I cannot boot gpxe option roms at vmx machines. This is described at https://bugzilla.redhat.com/show_bug.cgi?id=494469 Signed-off-by:
Glauber Costa <glommer@redhat.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Gleb Natapov <gleb@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Glauber Costa authored
This patch introduces set/get_interrupt_shadow(), that does exactly what the name suggests. It also replaces open code that explicitly does it with the now existent functions. It differs slightly from upstream, because upstream merged it after gleb's interrupt rework, that we don't ship. Just for reference, upstream changelog is (2809f5d2 ): This patch replaces drop_interrupt_shadow with the more general set_interrupt_shadow, that can either drop or raise it, depending on its parameter. It also adds ->get_interrupt_shadow() for future use. Signed-off-by:
Glauber Costa <glommer@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Gleb Natapov authored
(cherry picked from commit f00be0ca ) free_mmu_pages() should only undo what alloc_mmu_pages() does. Free mmu pages from the generic VM destruction function, kvm_destroy_vm(). Signed-off-by:
Gleb Natapov <gleb@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit 7c8a83b7 ) kvm_handle_hva, called by MMU notifiers, manipulates mmu data only with the protection of mmu_lock. Update kvm_mmu_change_mmu_pages callers to take mmu_lock, thus protecting against kvm_handle_hva. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit 8986ecc0 ) Verify the cr3 address stored in vcpu->arch.cr3 points to an existant memslot. If not, inject a triple fault. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit b43b1901 ) kvm_handle_hva relies on mmu_lock protection to safely access the memslot structures. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Takashi Iwai authored
commit a3f730af upstream. This patch fixes the wrong headphone output routing for MacBookPro 3,1/4,1 quirk with ALC889A codec, which caused the silent headphone output. Also, this gives the individual Headphone and Speaker volume controls. Reference: kernel bug#14078 http://bugzilla.kernel.org/show_bug.cgi?id=14078 Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Trond Myklebust authored
commit 2574cc9f upstream. This patch fixes the bug that was reported in http://bugzilla.kernel.org/show_bug.cgi?id=14053 If we're in the case where we need to force a reencode and then resend of the RPC request, due to xprt_transmit failing with a networking error, then we _must_ retransmit the entire request. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Costantino Leandro authored
commit f3d83e24 upstream. Summary: Kernel panic arise when stack protection is enabled, since strncat will add a null terminating byte '\0'; So in functions like this one (wmi_query_block): char wc[4]="WC"; .... strncat(method, block->object_id, 2); ... the length of wc should be n+1 (wc[5]) or stack protection fault will arise. This is not noticeable when stack protection is disabled,but , isn't good either. Config used: [CONFIG_CC_STACKPROTECTOR_ALL=y, CONFIG_CC_STACKPROTECTOR=y] Panic Trace ------------ .... stack-protector: kernel stack corrupted in : fa7b182c 2.6.30-rc8-obelisco-generic call_trace: [<c04a6c40>] ? panic+0x45/0xd9 [<c012925d>] ? __stack_chk_fail+0x1c/0x40 [<fa7b182c>] ? wmi_query_block+0x15a/0x162 [wmi] [<fa7b182c>] ? wmi_query_block+0x15a/0x162 [wmi] [<fa7e7000>] ? acer_wmi_init+0x00/0x61a [acer_wmi] [<fa7e7135>] ? acer_wmi_init+0x135/0x61a [acer_wmi] [<c0101159>] ? do_one_initcall+0x50+0x126 Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13514 Signed-off-by:
Costantino Leandro <lcostantino@gmail.com> Signed-off-by:
Carlos Corbacho <carlos@strangeworlds.co.uk> Cc: Len Brown <len.brown@intel.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Oleg Nesterov authored
commit 4ab6c083 upstream. Spotted by Hiroshi Shimamoto who also provided the test-case below. copy_process() uses signal->count as a reference counter, but it is not. This test case #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #include <stdio.h> #include <errno.h> #include <pthread.h> void *null_thread(void *p) { for (;;) sleep(1); return NULL; } void *exec_thread(void *p) { execl("/bin/true", "/bin/true", NULL); return null_thread(p); } int main(int argc, char **argv) { for (;;) { pid_t pid; int ret, status; pid = fork(); if (pid < 0) break; if (!pid) { pthread_t tid; pthread_create(&tid, NULL, exec_thread, NULL); for (;;) pthread_create(&tid, NULL, null_thread, NULL); } do { ret = waitpid(pid, &status, 0); } while (ret == -1 && errno == EINTR); } return 0; } quickly creates an unkillable task. If copy_process(CLONE_THREAD) races with de_thread() copy_signal()->atomic(signal->count) breaks the signal->notify_count logic, and the execing thread can hang forever in kernel space. Change copy_process() to increment count/live only when we know for sure we can't fail. In this case the forked thread will take care of its reference to signal correctly. If copy_process() fails, check CLONE_THREAD flag. If it it set - do nothing, the counters were not changed and current belongs to the same thread group. If it is not set, ->signal must be released in any case (and ->count must be == 1), the forked child is the only thread in the thread group. We need more cleanups here, in particular signal->count should not be used by de_thread/__exit_signal at all. This patch only fixes the bug. Reported-by:
Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Tested-by:
Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Acked-by:
Roland McGrath <roland@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Clemens Ladisch authored
commit b1ddaf68 upstream. snd_interval_list() expected a sorted list but did not document this, so there are drivers that give it an unsorted list. To fix this, change the algorithm to work with any list. This fixes the "Slave PCM not usable" error with USB devices that have multiple alternate settings with sample rates in decreasing order, such as the Philips Askey VC010 WebCam. http://bugzilla.kernel.org/show_bug.cgi?id=14028 Reported-and-tested-by:
Andrzej <adkadk@gmail.com> Signed-off-by:
Clemens Ladisch <clemens@ladisch.de> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Ingo Molnar authored
commit 4a683bf9 upstream. One of my testboxes triggered this nasty stack overflow crash during SCSI probing: [ 5.874004] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 5.875004] device: 'sda': device_add [ 5.878004] BUG: unable to handle kernel NULL pointer dereference at 00000a0c [ 5.878004] IP: [<b1008321>] print_context_stack+0x81/0x110 [ 5.878004] *pde = 00000000 [ 5.878004] Thread overran stack, or stack corrupted [ 5.878004] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 5.878004] last sysfs file: [ 5.878004] [ 5.878004] Pid: 1, comm: swapper Not tainted (2.6.31-rc6-tip-01272-g9919e28-dirty #5685) [ 5.878004] EIP: 0060:[<b1008321>] EFLAGS: 00010083 CPU: 0 [ 5.878004] EIP is at print_context_stack+0x81/0x110 [ 5.878004] EAX: cf8a3000 EBX: cf8a3fe4 ECX: 00000049 EDX: 00000000 [ 5.878004] ESI: b1cfce84 EDI: 00000000 EBP: cf8a3018 ESP: cf8a2ff4 [ 5.878004] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 5.878004] Process swapper (pid: 1, ti=cf8a2000 task=cf8a8000 task.ti=cf8a3000) [ 5.878004] Stack: [ 5.878004] b1004867 fffff000 cf8a3ffc [ 5.878004] Call Trace: [ 5.878004] [<b1004867>] ? kernel_thread_helper+0x7/0x10 [ 5.878004] BUG: unable to handle kernel NULL pointer dereference at 00000a0c [ 5.878004] IP: [<b1008321>] print_context_stack+0x81/0x110 [ 5.878004] *pde = 00000000 [ 5.878004] Thread overran stack, or stack corrupted [ 5.878004] Oops: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC The oops did not reveal any more details about the real stack that we have and the system got into an infinite loop of recursive pagefaults. So i booted with CONFIG_STACK_TRACER=y and the 'stacktrace' boot parameter. The box did not crash (timings/conditions probably changed a tiny bit to trigger the catastrophic crash), but the /debug/tracing/stack_trace file was rather revealing: Depth Size Location (72 entries) ----- ---- -------- 0) 3704 52 __change_page_attr+0xb8/0x290 1) 3652 24 __change_page_attr_set_clr+0x43/0x90 2) 3628 60 kernel_map_pages+0x108/0x120 3) 3568 40 prep_new_page+0x7d/0x130 4) 3528 84 get_page_from_freelist+0x106/0x420 5) 3444 116 __alloc_pages_nodemask+0xd7/0x550 6) 3328 36 allocate_slab+0xb1/0x100 7) 3292 36 new_slab+0x1c/0x160 8) 3256 36 __slab_alloc+0x133/0x2b0 9) 3220 4 kmem_cache_alloc+0x1bb/0x1d0 10) 3216 108 create_object+0x28/0x250 11) 3108 40 kmemleak_alloc+0x81/0xc0 12) 3068 24 kmem_cache_alloc+0x162/0x1d0 13) 3044 52 scsi_pool_alloc_command+0x29/0x70 14) 2992 20 scsi_host_alloc_command+0x22/0x70 15) 2972 24 __scsi_get_command+0x1b/0x90 16) 2948 28 scsi_get_command+0x35/0x90 17) 2920 24 scsi_setup_blk_pc_cmnd+0xd4/0x100 18) 2896 128 sd_prep_fn+0x332/0xa70 19) 2768 36 blk_peek_request+0xe7/0x1d0 20) 2732 56 scsi_request_fn+0x54/0x520 21) 2676 12 __generic_unplug_device+0x2b/0x40 22) 2664 24 blk_execute_rq_nowait+0x59/0x80 23) 2640 172 blk_execute_rq+0x6b/0xb0 24) 2468 32 scsi_execute+0xe0/0x140 25) 2436 64 scsi_execute_req+0x152/0x160 26) 2372 60 scsi_vpd_inquiry+0x6c/0x90 27) 2312 44 scsi_get_vpd_page+0x112/0x160 28) 2268 52 sd_revalidate_disk+0x1df/0x320 29) 2216 92 rescan_partitions+0x98/0x330 30) 2124 52 __blkdev_get+0x309/0x350 31) 2072 8 blkdev_get+0xf/0x20 32) 2064 44 register_disk+0xff/0x120 33) 2020 36 add_disk+0x6e/0xb0 34) 1984 44 sd_probe_async+0xfb/0x1d0 35) 1940 44 __async_schedule+0xf4/0x1b0 36) 1896 8 async_schedule+0x12/0x20 37) 1888 60 sd_probe+0x305/0x360 38) 1828 44 really_probe+0x63/0x170 39) 1784 36 driver_probe_device+0x5d/0x60 40) 1748 16 __device_attach+0x49/0x50 41) 1732 32 bus_for_each_drv+0x5b/0x80 42) 1700 24 device_attach+0x6b/0x70 43) 1676 16 bus_attach_device+0x47/0x60 44) 1660 76 device_add+0x33d/0x400 45) 1584 52 scsi_sysfs_add_sdev+0x6a/0x2c0 46) 1532 108 scsi_add_lun+0x44b/0x460 47) 1424 116 scsi_probe_and_add_lun+0x182/0x4e0 48) 1308 36 __scsi_add_device+0xd9/0xe0 49) 1272 44 ata_scsi_scan_host+0x10b/0x190 50) 1228 24 async_port_probe+0x96/0xd0 51) 1204 44 __async_schedule+0xf4/0x1b0 52) 1160 8 async_schedule+0x12/0x20 53) 1152 48 ata_host_register+0x171/0x1d0 54) 1104 60 ata_pci_sff_activate_host+0xf3/0x230 55) 1044 44 ata_pci_sff_init_one+0xea/0x100 56) 1000 48 amd_init_one+0xb2/0x190 57) 952 8 local_pci_probe+0x13/0x20 58) 944 32 pci_device_probe+0x68/0x90 59) 912 44 really_probe+0x63/0x170 60) 868 36 driver_probe_device+0x5d/0x60 61) 832 20 __driver_attach+0x89/0xa0 62) 812 32 bus_for_each_dev+0x5b/0x80 63) 780 12 driver_attach+0x1e/0x20 64) 768 72 bus_add_driver+0x14b/0x2d0 65) 696 36 driver_register+0x6e/0x150 66) 660 20 __pci_register_driver+0x53/0xc0 67) 640 8 amd_init+0x14/0x16 68) 632 572 do_one_initcall+0x2b/0x1d0 69) 60 12 do_basic_setup+0x56/0x6a 70) 48 20 kernel_init+0x84/0xce 71) 28 28 kernel_thread_helper+0x7/0x10 There's a lot of fat functions on that stack trace, but the largest of all is do_one_initcall(). This is due to the boot trace entry variables being on the stack. Fixing this is relatively easy, initcalls are fundamentally serialized, so we can move the local variables to file scope. Note that this large stack footprint was present for a couple of months already - what pushed my system over the edge was the addition of kmemleak to the call-chain: 6) 3328 36 allocate_slab+0xb1/0x100 7) 3292 36 new_slab+0x1c/0x160 8) 3256 36 __slab_alloc+0x133/0x2b0 9) 3220 4 kmem_cache_alloc+0x1bb/0x1d0 10) 3216 108 create_object+0x28/0x250 11) 3108 40 kmemleak_alloc+0x81/0xc0 12) 3068 24 kmem_cache_alloc+0x162/0x1d0 13) 3044 52 scsi_pool_alloc_command+0x29/0x70 This pushes the total to ~3800 bytes, only a tiny bit more was needed to corrupt the on-kernel-stack thread_info. The fix reduces the stack footprint from 572 bytes to 28 bytes. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <new-submission> Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Mimi Zohar authored
commit 6777d773 upstream. vfs_read() offset is defined as loff_t, but kernel_read() offset is only defined as unsigned long. Redefine kernel_read() offset as loff_t. Signed-off-by:
Mimi Zohar <zohar@us.ibm.com> Signed-off-by:
James Morris <jmorris@namei.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-