- 09 Sep, 2009 33 commits
-
-
Eric Dumazet authored
commit 3d392475 upstream. atalk_getname() can leak 8 bytes of kernel memory to user Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit 7c8a83b7 ) kvm_handle_hva, called by MMU notifiers, manipulates mmu data only with the protection of mmu_lock. Update kvm_mmu_change_mmu_pages callers to take mmu_lock, thus protecting against kvm_handle_hva. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit 8986ecc0 ) Verify the cr3 address stored in vcpu->arch.cr3 points to an existant memslot. If not, inject a triple fault. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Izik Eidus authored
(cherry picked from commit e244584f ) When slot is already allocated and being asked to be tracked we need to break the large pages. This code flush the mmu when someone ask a slot to start dirty bit tracking. Signed-off-by:
Izik Eidus <ieidus@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Gleb Natapov authored
(cherry picked from commit f00be0ca ) free_mmu_pages() should only undo what alloc_mmu_pages() does. Free mmu pages from the generic VM destruction function, kvm_destroy_vm(). Signed-off-by:
Gleb Natapov <gleb@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit a2edf57f ) The processor is documented to reload the PDPTRs while in PAE mode if any of the CR4 bits PSE, PGE, or PAE change. Linux relies on this behaviour when zapping the low mappings of PAE kernels during boot. The code already handled changes to CR4.PAE; augment it to also notice changes to PSE and PGE. This triggered while booting an F11 PAE kernel; the futex initialization code runs before any CR3 reloads and writes to a NULL pointer; the futex subsystem ended up uninitialized, killing PI futexes and pulseaudio which uses them. Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit a8cd0244 ) The paravirt tlb flush may be used not only to flush TLBs, but also to reload the four page-directory-pointer-table entries, as it is used as a replacement for reloading CR3. Change the code to do the entire CR3 reloading dance instead of simply flushing the TLB. Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit e3c7cb6a ) IF a guest tries to use vmx instructions, inject a #UD to let it know the instruction is not implemented, rather than crashing. This prevents guest userspace from crashing the guest kernel. Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit e286e86e ) Some processors don't have EFER; don't oops if userspace wants us to read EFER when we check NX. Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit 99f85a28 ) KVM optimizes guest port 80 accesses by passthing them through to the host. Some AMD machines die on port 80 writes, allowing the guest to hard-lock the host. Remove the port passthrough to avoid the problem. Reported-by:
Piotr Jaroszyński <p.jaroszynski@gmail.com> Tested-by:
Piotr Jaroszyński <p.jaroszynski@gmail.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit 16175a79 ) vmx_set_msr() does not allow i386 guests to touch EFER, but they can still do so through the default: label in the switch. If they set EFER_LME, they can oops the host. Fix by having EFER access through the normal channel (which will check for EFER_LME) even on i386. Reported-and-tested-by:
Benjamin Gilbert <bgilbert@cs.cmu.edu> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Glauber Costa authored
(cherry picked from commit 7d8fece6 ) One of vcpu_setup responsibilities is to do mmu initialization. However, in case we fail in kvm_arch_vcpu_reset, before we get the chance to init mmu. OTOH, vcpu_destroy will attempt to destroy mmu, triggering a bug. Keeping track of whether or not mmu is initialized would unnecessarily complicate things. Rather, we just make return, making sure any needed uninitialization is done before we return, in case we fail. Signed-off-by:
Glauber Costa <glommer@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Sheng Yang authored
(cherry picked from commit 928d4bf7 ) There is a potential issue that, when guest using pagetable without vmexit when EPT enabled, guest would use PAT/PCD/PWT bits to index PAT msr for it's memory, which would be inconsistent with host side and would cause host MCE due to inconsistent cache attribute. The patch set IGMT bit in EPT entry to ignore guest PAT and use WB as default memory type to protect host (notice that all memory mapped by KVM should be WB). Signed-off-by:
Sheng Yang <sheng@linux.intel.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit c41ef344 ) The page fault path can use two rmap_desc structures, if: - walk_addr's dirty pte update allocates one rmap_desc. - mmu_lock is dropped, sptes are zapped resulting in rmap_desc being freed. - fetch->mmu_set_spte allocates another rmap_desc. Increase to 4 for safety. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Marcelo Tosatti authored
(cherry picked from commit 29415c37 ) The vcpu thread can be preempted after the guest_debug_pre() callback, resulting in invalid debug registers on the new vcpu. Move it inside the non-preemptable section. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Joerg Roedel authored
(cherry picked from commit a89c1ad2 ) Currently KVM implements MC0-MC4_MISC read support. When booting Linux this results in KVM warnings in the kernel log when the guest tries to read MC5_MISC. Fix this warnings with this patch. Signed-off-by:
Joerg Roedel <joerg.roedel@amd.com> Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Hansen authored
(cherry picked from commit 6ad18fba ) We're in a hot path. We can't use kmalloc() because it might impact performance. So, we just stick the buffer that we need into the kvm_vcpu_arch structure. This is used very often, so it is not really a waste. We also have to move the buffer structure's definition to the arch-specific x86 kvm header. Signed-off-by:
Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Hansen authored
(cherry picked from commit b772ff36 ) [sheng: fix KVM_GET_LAPIC using wrong size] Signed-off-by:
Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by:
Sheng Yang <sheng.yang@intel.com> Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Hansen authored
(cherry picked from commit fa3795a7 ) Signed-off-by:
Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Hansen authored
(cherry picked from commit f0d66275 ) On my machine with gcc 3.4, kvm uses ~2k of stack in a few select functions. This is mostly because gcc fails to notice that the different case: statements could have their stack usage combined. It overflows very nicely if interrupts happen during one of these large uses. This patch uses two methods for reducing stack usage. 1. dynamically allocate large objects instead of putting on the stack. 2. Use a union{} member for all of the case variables. This tricks gcc into combining them all into a single stack allocation. (There's also a comment on this) Signed-off-by:
Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit 3201b5d9 ) The accessed bit was accidentally turned on in a random flag word, rather than, the spte itself, which was lucky, since it used the non-EPT compatible PT_ACCESSED_MASK. Fix by turning the bit on in the spte and changing it to use the portable accessed mask. Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit 171d595d ) Otherwise, the cpu may allow writes to the tracked pages, and we lose some display bits or fail to migrate correctly. Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit 2245a28f ) It was generally safe due to slots_lock being held for write, but it wasn't very nice. Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit d657c733 ) This is esoteric and only needed to break COW on MAP_SHARED mappings. Since KVM no longer does these sorts of mappings, breaking COW on them is no longer necessary. Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit acee3c04 ) There is no reason to share internal memory slots with fork()ed instances. Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit f4bbd9aa ) Real mode segments to not reference the GDT or LDT; they simply compute base = selector * 16. Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit a16b20da ) This is more emulation friendly, if not 100% correct. Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Avi Kivity authored
(cherry picked from commit 5706be0d ) Real mode cs is a data segment, not a code segment. Signed-off-by:
Avi Kivity <avi@qumranet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Trond Myklebust authored
commit 2574cc9f upstream. This patch fixes the bug that was reported in http://bugzilla.kernel.org/show_bug.cgi?id=14053 If we're in the case where we need to force a reencode and then resend of the RPC request, due to xprt_transmit failing with a networking error, then we _must_ retransmit the entire request. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Clemens Ladisch authored
commit b1ddaf68 upstream. snd_interval_list() expected a sorted list but did not document this, so there are drivers that give it an unsorted list. To fix this, change the algorithm to work with any list. This fixes the "Slave PCM not usable" error with USB devices that have multiple alternate settings with sample rates in decreasing order, such as the Philips Askey VC010 WebCam. http://bugzilla.kernel.org/show_bug.cgi?id=14028 Reported-and-tested-by:
Andrzej <adkadk@gmail.com> Signed-off-by:
Clemens Ladisch <clemens@ladisch.de> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Hannes Hering authored
commit 357eb46d upstream. This patch fixes the napi list handling when an ehea interface is shut down to avoid corruption of the napi list. Signed-off-by:
Hannes Hering <hering2@de.ibm.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Oleg Nesterov authored
commit 4ab6c083 upstream. Spotted by Hiroshi Shimamoto who also provided the test-case below. copy_process() uses signal->count as a reference counter, but it is not. This test case #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #include <stdio.h> #include <errno.h> #include <pthread.h> void *null_thread(void *p) { for (;;) sleep(1); return NULL; } void *exec_thread(void *p) { execl("/bin/true", "/bin/true", NULL); return null_thread(p); } int main(int argc, char **argv) { for (;;) { pid_t pid; int ret, status; pid = fork(); if (pid < 0) break; if (!pid) { pthread_t tid; pthread_create(&tid, NULL, exec_thread, NULL); for (;;) pthread_create(&tid, NULL, null_thread, NULL); } do { ret = waitpid(pid, &status, 0); } while (ret == -1 && errno == EINTR); } return 0; } quickly creates an unkillable task. If copy_process(CLONE_THREAD) races with de_thread() copy_signal()->atomic(signal->count) breaks the signal->notify_count logic, and the execing thread can hang forever in kernel space. Change copy_process() to increment count/live only when we know for sure we can't fail. In this case the forked thread will take care of its reference to signal correctly. If copy_process() fails, check CLONE_THREAD flag. If it it set - do nothing, the counters were not changed and current belongs to the same thread group. If it is not set, ->signal must be released in any case (and ->count must be == 1), the forked child is the only thread in the thread group. We need more cleanups here, in particular signal->count should not be used by de_thread/__exit_signal at all. This patch only fixes the bug. Reported-by:
Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Tested-by:
Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Acked-by:
Roland McGrath <roland@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Takashi Iwai authored
commit a3f730af upstream. This patch fixes the wrong headphone output routing for MacBookPro 3,1/4,1 quirk with ALC889A codec, which caused the silent headphone output. Also, this gives the individual Headphone and Speaker volume controls. Reference: kernel bug#14078 http://bugzilla.kernel.org/show_bug.cgi?id=14078 Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
- 17 Aug, 2009 2 commits
-
-
Greg Kroah-Hartman authored
-
Greg Kroah-Hartman authored
This reverts commit 9ac36642 . This ioctl is not present in the 2.6.27 tree. I incorrectly added this patch to this tree. Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
- 16 Aug, 2009 5 commits
-
-
Greg Kroah-Hartman authored
-
Trond Myklebust authored
commit 1ae88b2e upstream. We can't call nfs_readdata_release()/nfs_writedata_release() without first initialising and referencing args.context. Doing so inside nfs_direct_read_schedule_segment()/nfs_direct_write_schedule_segment() causes an Oops. We should rather be calling nfs_readdata_free()/nfs_writedata_free() in those cases. Looking at the O_DIRECT code, the "struct nfs_direct_req" is already referencing the nfs_open_context for us. Since the readdata and writedata structures carry a reference to that, we can simplify things by getting rid of the extra nfs_open_context references, so that we can replace all instances of nfs_readdata_release()/nfs_writedata_release(). Reported-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Linus Torvalds authored
commit e6949583 upstream. kernel_sendpage() does the proper default case handling for when the socket doesn't have a native sendpage implementation. Now, arguably this might be something that we could instead solve by just specifying that all protocols should do it themselves at the protocol level, but we really only care about the common protocols. Does anybody really care about sendpage on something like Appletalk? Not likely. Acked-by:
David S. Miller <davem@davemloft.net> Acked-by:
Julien TINNES <julien@cr0.org> Acked-by:
Tavis Ormandy <taviso@sdf.lonestar.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Oleg Nesterov authored
commit 00f89d21 upstream. mm_for_maps() takes ->mmap_sem after security checks, this looks strange and obfuscates the locking rules. Move this lock to its single caller, m_start(). Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Acked-by:
Serge Hallyn <serue@us.ibm.com> Signed-off-by:
James Morris <jmorris@namei.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Oleg Nesterov authored
commit 13f0feaf upstream. It would be nice to kill __ptrace_may_access(). It requires task_lock(), but this lock is only needed to read mm->flags in the middle. Convert mm_for_maps() to use ptrace_may_access(), this also simplifies the code a little bit. Also, we do not need to take ->mmap_sem in advance. In fact I think mm_for_maps() should not play with ->mmap_sem at all, the caller should take this lock. With or without this patch, without ->cred_guard_mutex held we can race with exec() and get the new ->mm but check old creds. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Reviewed-by:
Serge Hallyn <serue@us.ibm.com> Signed-off-by:
James Morris <jmorris@namei.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-