1. 02 Jan, 2014 1 commit
  2. 14 Aug, 2013 1 commit
  3. 29 Mar, 2013 1 commit
  4. 15 Mar, 2013 1 commit
    • mm/fremap.c: fix possible oops on error path · a2362d24
      Michel Lespinasse authored
      The vm_flags handling introduced in 6d7825b1 ("mm/fremap.c: fix oops on
      error path") is supposed to avoid a compiler warning about uninitialized
      vm_flags without changing the generated code.
      
      However I am concerned that this is going to be very brittle, and may
      fail with some compiler versions.  The failure could be either of:
      
      - compiler could actually load vma->vm_flags before checking for the
        !vma condition, thus reintroducing the oops
      
      - compiler could optimize out the !vma check, since the pointer just got
        dereferenced shortly before (so the compiler knows it can't be NULL!)
      
      I propose reversing this part of the change and initializing vm_flags to 0
      just to avoid the bogus uninitialized use warning.
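
      For illustration, a condensed sketch of the pattern in question (not the
      verbatim fremap.c code; the function wrapper is hypothetical):

       static long fremap_error_path_sketch(struct mm_struct *mm,
                                            unsigned long start)
       {
               unsigned long vm_flags = 0;     /* only to silence the bogus
                                                * "may be used uninitialized"
                                                * warning */
               struct vm_area_struct *vma;

               vma = find_vma(mm, start);
               if (!vma)                       /* no dereference before this */
                       goto out;
               vm_flags = vma->vm_flags;       /* loaded only after the check */
       out:
               return (vm_flags & VM_LOCKED) ? -EAGAIN : 0;    /* illustrative */
       }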
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Tommi Rantala <tt.rantala@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a2362d24
  5. 13 Mar, 2013 1 commit
  6. 24 Feb, 2013 4 commits
    • mm: introduce VM_POPULATE flag to better deal with racy userspace programs · 18693050
      Michel Lespinasse authored
      
      The mm_populate() code populates user mappings without constantly
      holding the mmap_sem.  This makes it susceptible to racy userspace
      programs: the user mappings may change while mm_populate() is running,
      and in this case mm_populate() may end up populating the new mapping
      instead of the old one.

      In order to reduce the possibility of userspace getting surprised by
      this behavior, this change introduces the VM_POPULATE vma flag which
      gets set on vmas we want mm_populate() to work on.  This way
      mm_populate() may still end up populating the new mapping after such a
      race, but only if the new mapping is also one that the user has
      requested (using MAP_SHARED, MAP_LOCKED or mlock) to be populated.
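
      A hedged sketch of the resulting check (the loop shape is illustrative;
      the VM_POPULATE test is the point of this change):

       static int mm_populate_sketch(struct mm_struct *mm,
                                     unsigned long start, unsigned long end)
       {
               struct vm_area_struct *vma;

               for (vma = find_vma(mm, start); vma && vma->vm_start < end;
                    vma = vma->vm_next) {
                       if (!(vma->vm_flags & VM_POPULATE))
                               continue;       /* a racing replacement the
                                                * user never asked to populate */
                       /* ... fault in this vma's pages ... */
               }
               return 0;
       }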
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Tested-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Greg Ungerer <gregungerer@westnet.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      18693050
    • mm: remove flags argument to mmap_region · c22c0d63
      Michel Lespinasse authored
      
      After the MAP_POPULATE handling has been moved to mmap_region() call
      sites, the only remaining use of the flags argument is to pass the
      MAP_NORESERVE flag.  This can be just as easily handled by
      do_mmap_pgoff(), so do that and remove the mmap_region() flags
      parameter.
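
      A condensed sketch of where the decision lands (surrounding do_mmap_pgoff
      logic elided; the wrapper name is illustrative):

       static unsigned long do_mmap_pgoff_tail_sketch(struct file *file,
                       unsigned long addr, unsigned long len,
                       unsigned long flags, vm_flags_t vm_flags,
                       unsigned long pgoff)
       {
               /* honor MAP_NORESERVE only when overcommit isn't disabled */
               if ((flags & MAP_NORESERVE) &&
                   sysctl_overcommit_memory != OVERCOMMIT_NEVER)
                       vm_flags |= VM_NORESERVE;

               return mmap_region(file, addr, len, vm_flags, pgoff);
       }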
      
      [akpm@linux-foundation.org: remove double parens]
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Tested-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Greg Ungerer <gregungerer@westnet.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c22c0d63
    • mm: use mm_populate() for blocking remap_file_pages() · a1ea9549
      Michel Lespinasse authored
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Tested-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Greg Ungerer <gregungerer@westnet.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a1ea9549
    • mm: remap_file_pages() fixes · 940e7da5
      Michel Lespinasse authored
      
      We have many vma manipulation functions that are fast in the typical
      case, but can optionally be instructed to populate an unbounded number
      of ptes within the region they work on:
      
       - mmap with MAP_POPULATE or MAP_LOCKED flags;
       - remap_file_pages() with MAP_NONBLOCK not set or when working on a
         VM_LOCKED vma;
       - mmap_region() and all its wrappers when mlock(MCL_FUTURE) is in
         effect;
       - brk() when mlock(MCL_FUTURE) is in effect.
      
      Current code handles these pte operations locally, while the
      surrounding code has to hold the mmap_sem write side since it's
      manipulating vmas.  This means we're doing an unbounded amount of pte
      population work with mmap_sem held, and this causes problems as Andy
      Lutomirski reported (we've hit this at Google as well, though it's not
      entirely clear why people keep trying to use mlock(MCL_FUTURE) in the
      first place).
      
      I propose introducing a new mm_populate() function to do this pte
      population work after the mmap_sem has been released.  mm_populate()
      does need to acquire the mmap_sem read side, but critically, it doesn't
      need to hold it continuously for the entire duration of the operation -
      it can drop it whenever things take too long (such as when hitting disk
      for a file read) and re-acquire it later on.
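
      A hedged sketch of that locking shape, using the names from this message
      (populate_chunk() is a hypothetical stand-in for the per-range worker):

       static int mm_populate_locking_sketch(unsigned long start,
                                             unsigned long len)
       {
               struct mm_struct *mm = current->mm;
               unsigned long end = start + len, nstart, nend;
               int locked = 0, ret = 0;

               for (nstart = start; nstart < end; nstart = nend) {
                       if (!locked) {
                               locked = 1;
                               down_read(&mm->mmap_sem);  /* read side only */
                       }
                       /* the worker may drop mmap_sem while blocking on disk
                        * (clearing 'locked'); we re-take it next iteration */
                       ret = populate_chunk(mm, nstart, end, &nend, &locked);
                       if (ret < 0)
                               break;
               }
               if (locked)
                       up_read(&mm->mmap_sem);
               return ret;
       }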
      
      The following patches are included:

      - Patch 1 fixes some issues I noticed while working on the existing code.
        If needed, it could potentially go in before the rest of the patches.
      
      - Patch 2 introduces the new mm_populate() function and changes
        mmap_region() call sites to use it after they drop mmap_sem. This is
        inspired by Andy Lutomirski's proposal and is built as an extension
        of the work I had previously done for mlock() and mlockall() around
        v2.6.38-rc1. I had tried doing something similar at the time but had
        given up as there were so many do_mmap() call sites; the recent cleanups
        by Linus and Viro are a tremendous help here.
      
      - Patches 3-5 convert some of the less-obvious places doing unbounded
        pte populates to the new mm_populate() mechanism.
      
      - Patches 6-7 are code cleanups that are made possible by the
        mm_populate() work. In particular, they remove more code than the
        entire patch series added, which should be a good thing :)
      
      - Patch 8 is optional to this entire series. It only helps to deal more
        nicely with racy userspace programs that might modify their mappings
        while we're trying to populate them. It adds a new VM_POPULATE flag
        on the mappings we do want to populate, so that if userspace replaces
        them with mappings it doesn't want populated, mm_populate() won't
        populate those replacement mappings.
      
      This patch:
      
      Assorted small fixes. The first two are quite small:
      
      - Move check for vma->vm_private_data && !(vma->vm_flags & VM_NONLINEAR)
        within existing if (!(vma->vm_flags & VM_NONLINEAR)) block.
        Purely cosmetic.
      
      - In the VM_LOCKED case, when dropping PG_Mlocked for the over-mapped
        range, make sure we own the mmap_sem write lock around the
        munlock_vma_pages_range call as this manipulates the vma's vm_flags.
      
      The last fix requires a longer explanation. remap_file_pages() can do its
      work either through VM_NONLINEAR manipulation or by creating extra vmas.
      These two cases were inconsistent with each other (and ultimately, both
      wrong) as to exactly when they faulted in the newly mapped file pages:
      
      - In the VM_NONLINEAR case, new file pages would be populated if
        the MAP_NONBLOCK flag wasn't passed. If MAP_NONBLOCK was passed,
        new file pages wouldn't be populated even if the vma is already
        marked as VM_LOCKED.
      
      - In the linear (emulated) case, the work is passed to the mmap_region()
        function which would populate the pages if the vma is marked as
        VM_LOCKED, and would not otherwise - regardless of the value of the
        MAP_NONBLOCK flag, because MAP_POPULATE wasn't being passed to
        mmap_region().
      
      The desired behavior is that we want the pages to be populated and locked
      if the vma is marked as VM_LOCKED, or to be populated if the MAP_NONBLOCK
      flag is not passed to remap_file_pages().
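
      A minimal sketch of that unified rule (helper name illustrative):

       static bool remap_should_populate(struct vm_area_struct *vma, int flags)
       {
               if (vma->vm_flags & VM_LOCKED)
                       return true;            /* populate and mlock */
               return !(flags & MAP_NONBLOCK); /* populate only */
       }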
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Tested-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Greg Ungerer <gregungerer@westnet.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      940e7da5
  7. 19 Oct, 2012 1 commit
  8. 09 Oct, 2012 2 commits
    • mm: replace vma prio_tree with an interval tree · 6b2dbba8
      Michel Lespinasse authored
      
      Implement an interval tree as a replacement for the VMA prio_tree.  The
      algorithms are similar to lib/interval_tree.c; however that code can't be
      directly reused as the interval endpoints are not explicitly stored in the
      VMA.  So instead, the common algorithm is moved into a template and the
      details (node type, how to get interval endpoints from the node, etc) are
      filled in using the C preprocessor.
      
      Once the interval tree functions are available, using them as a
      replacement to the VMA prio tree is a relatively simple, mechanical job.
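
      A sketch of the instantiation, approximating mm/interval_tree.c from this
      series (INTERVAL_TREE_DEFINE comes from the generic template added
      alongside):

       #include <linux/interval_tree_generic.h>

       /* endpoints aren't stored in the vma, so derive them per node: */
       static inline unsigned long vma_start_pgoff(struct vm_area_struct *v)
       {
               return v->vm_pgoff;
       }

       static inline unsigned long vma_last_pgoff(struct vm_area_struct *v)
       {
               return v->vm_pgoff + vma_pages(v) - 1;
       }

       /* expands into vma_interval_tree_insert()/remove()/iter_first()/... */
       INTERVAL_TREE_DEFINE(struct vm_area_struct, shared.linear.rb,
                            unsigned long, shared.linear.rb_subtree_last,
                            vma_start_pgoff, vma_last_pgoff, /* attr */,
                            vma_interval_tree)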
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6b2dbba8
    • mm: kill vma flag VM_CAN_NONLINEAR · 0b173bc4
      Konstantin Khlebnikov authored
      
      Move actual pte filling for non-linear file mappings into the new special
      vma operation: ->remap_pages().
      
      Filesystems must implement this method to get non-linear mapping support;
      a filesystem that uses filemap_fault() can simply use
      generic_file_remap_pages().
      
      Now device drivers can implement this method and obtain nonlinear vma support.
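
      A sketch of the wiring for a filemap_fault-based filesystem (the ops
      struct name is illustrative; the callbacks are the real ones):

       static const struct vm_operations_struct example_file_vm_ops = {
               .fault          = filemap_fault,
               .remap_pages    = generic_file_remap_pages,     /* new method */
       };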
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>	#arch/tile
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0b173bc4
  9. 27 Sep, 2012 1 commit
  10. 31 Oct, 2011 1 commit
  11. 26 May, 2011 1 commit
  12. 25 May, 2011 1 commit
  13. 25 Sep, 2010 1 commit
  14. 24 Sep, 2010 1 commit
    • fremap: get rid of broken 'end' variable · e92b05de
      Linus Torvalds authored
      
      Thomas Pollet points out that the 'end' variable is broken.  It was
      computed based on start/size before they were page-aligned, and as such
      doesn't actually match any of the other actions we take.  The overflow
      test on end was also redundant, since we had already tested it with the
      properly aligned version.
      
      So just get rid of it entirely.  The one remaining use for that broken
      variable can just use 'start+size' like all the other cases already did.
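
      A condensed sketch of the argument checking after the fix (the rest of
      sys_remap_file_pages elided; function wrapper hypothetical):

       static long remap_args_sketch(struct vm_area_struct *vma,
                                     unsigned long start, unsigned long size)
       {
               /* align first, then test only the aligned values */
               start &= PAGE_MASK;
               size &= PAGE_MASK;
               if (start + size <= start)      /* overflow, checked once */
                       return -EINVAL;
               /* the one former user of 'end' just uses start + size */
               if (start < vma->vm_start || start + size > vma->vm_end)
                       return -EINVAL;
               return 0;
       }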
      Reported-by: Thomas Pollet <thomas.pollet@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e92b05de
  15. 06 Mar, 2010 1 commit
  16. 10 Feb, 2009 1 commit
    • Do not account for the address space used by hugetlbfs using VM_ACCOUNT · 5a6fe125
      Mel Gorman authored
      When overcommit is disabled, the core VM accounts for pages used by anonymous
      shared, private mappings and special mappings. It keeps track of VMAs that
      should be accounted for with VM_ACCOUNT and VMAs that never had a reserve
      with VM_NORESERVE.
      
      Overcommit for hugetlbfs is much riskier than overcommit for base pages
      due to contiguity requirements. It avoids overcommitting on both shared
      and private mappings using reservation counters that are checked and
      updated during mmap(). This ensures (within limits) that hugepages exist
      in the future when faults occur; otherwise it is too easy for
      applications to be SIGKILLed.

      As hugetlbfs makes its own reservations in a different unit from the base
      page size, VM_ACCOUNT should never be set. Even if the units were
      correct, we would double account for the usage in the core VM and
      hugetlbfs. VM_NORESERVE may be set because an application can request no
      reserves be made for hugetlbfs at the risk of getting killed later.
      
      With commit fc8744ad, VM_NORESERVE and VM_ACCOUNT are getting
      unconditionally set for hugetlbfs-backed mappings. This breaks the
      accounting for both the core VM and hugetlbfs: it can trigger an OOM
      storm when hugepage pools are too small, and lockups and corrupted
      counters otherwise. This patch brings hugetlbfs more in line with how the
      core VM treats VM_NORESERVE, but prevents VM_ACCOUNT from being set.
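
      A hedged sketch of the resulting flag handling (condensed; its placement
      in the real mmap path is elided and the helper is hypothetical):

       static unsigned long hugetlb_vm_flags_sketch(struct file *file,
                       unsigned long map_flags, unsigned long vm_flags)
       {
               if (file && is_file_hugepages(file)) {
                       /* hugetlbfs reserves in hugepage units on its own,
                        * so the core VM must never account these mappings */
                       vm_flags &= ~VM_ACCOUNT;
                       /* VM_NORESERVE only when the app explicitly opted in */
                       if (map_flags & MAP_NORESERVE)
                               vm_flags |= VM_NORESERVE;
                       else
                               vm_flags &= ~VM_NORESERVE;
               }
               return vm_flags;
       }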
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5a6fe125
  17. 14 Jan, 2009 1 commit
  18. 06 Jan, 2009 1 commit
  19. 20 Oct, 2008 1 commit
  20. 28 Jul, 2008 1 commit
    • mmu-notifiers: core · cddb8a5c
      Andrea Arcangeli authored
      
      With KVM/GRU/XPMEM there isn't just the primary CPU MMU pointing to
      pages.  There are secondary MMUs (with secondary sptes and secondary
      tlbs) too.  sptes in the kvm case are shadow pagetables, but when I say
      spte in mmu-notifier context, I mean "secondary pte".  In the GRU case
      there's no actual secondary pte and there's only a secondary tlb, because
      the GRU secondary MMU has no knowledge about sptes and every secondary
      tlb miss event in the MMU always generates a page fault that has to be
      resolved by the CPU (this is not the case of KVM, where a secondary tlb
      miss will walk sptes in hardware and will refill the secondary tlb
      transparently to software if the corresponding spte is present).  The
      same way zap_page_range has to invalidate the pte before freeing the
      page, the spte (and secondary tlb) must also be invalidated before any
      page is freed and reused.
      
      Currently we take a page_count pin on every page mapped by sptes, but
      that means the pages can't be swapped whenever they're mapped by any
      spte, because they're part of the guest working set.  Furthermore a spte
      unmap event can immediately lead to a page being freed when the pin is
      released (thus requiring the same complex and relatively slow tlb_gather
      smp-safe logic we have in zap_page_range, which can be avoided completely
      if the spte unmap event doesn't require an unpin of the page previously
      mapped in the secondary MMU).
      
      The mmu notifiers allow kvm/GRU/XPMEM to attach to the tsk->mm and know
      when the VM is swapping or freeing or doing anything on the primary MMU,
      so that the secondary MMU code can drop sptes before the pages are freed,
      avoiding all page pinning and allowing 100% reliable swapping of guest
      physical address space.  Furthermore it avoids requiring the code that
      tears down secondary MMU mappings to implement tlb_gather-like logic as
      in zap_page_range, which would require many IPIs to flush other cpu tlbs
      for each fixed number of sptes unmapped.
      
      To give an example: if what happens on the primary MMU is a protection
      downgrade (from writeable to wrprotect), the secondary MMU mappings will
      be invalidated, and the next secondary-mmu page fault will call
      get_user_pages; if it calls get_user_pages with write=1, that triggers a
      do_wp_page, and it'll re-establish an updated spte or secondary-tlb
      mapping on the copied page.  Or it will set up a readonly spte or
      readonly tlb mapping if it's a guest read, calling get_user_pages with
      write=0.  This is just an example.
      
      This allows mapping any page pointed to by any pte (and in turn visible
      in the primary CPU MMU) into a secondary MMU (be it a pure tlb like the
      GRU, or a full MMU with both sptes and a secondary tlb like the
      shadow-pagetable layer in kvm), or a remote DMA in software like XPMEM
      (hence the need to schedule in XPMEM code to send the invalidate to the
      remote node, while there is no need to schedule in kvm/gru as it's an
      immediate event like invalidating a primary-mmu pte).
      
      At least for KVM without this patch it's impossible to swap guests
      reliably.  And having this feature and removing the page pin allows
      several other optimizations that simplify life considerably.
      
      Dependencies:
      
      1) mm_take_all_locks() to register the mmu notifier when the whole VM
         isn't doing anything with "mm".  This allows mmu notifier users to
         keep track of whether the VM is in the middle of the
         invalidate_range_begin/end critical section with an atomic counter
         increased in range_begin and decreased in range_end (a sketch of this
         counter pattern follows this list).  No secondary MMU page fault is
         allowed to map any spte or secondary tlb reference while the VM is in
         the middle of range_begin/end, as any page returned by get_user_pages
         in that critical section could later immediately be freed without any
         further ->invalidate_page notification (invalidate_range_begin/end
         works on ranges and ->invalidate_page isn't called immediately before
         freeing the page).  To stop all page freeing and pagetable overwrites,
         the mmap_sem must be taken in write mode and all other
         anon_vma/i_mmap locks must be taken too.
      
      2) It'd be a waste to add branches in the VM if nobody could possibly
         run KVM/GRU/XPMEM on the kernel, so mmu notifiers will only be
         enabled if CONFIG_KVM=m/y.  In the current kernel kvm won't yet take
         advantage of mmu notifiers, but this already allows compiling a KVM
         external module against a kernel with mmu notifiers enabled, and from
         the next pull from kvm.git we'll start using them.  And GRU/XPMEM
         will also be able to continue development by enabling KVM=m in their
         config, until they submit all GRU/XPMEM GPLv2 code to the mainline
         kernel.  Then they can also enable MMU_NOTIFIERS in the same way KVM
         does it (even if KVM=n).  This guarantees nobody selects
         MMU_NOTIFIER=y if KVM and GRU and XPMEM are all =n.
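
      A hedged sketch of the counter pattern from dependency 1, written
      against the callback names as merged (invalidate_range_start/end; the
      text above says range_begin/end). The kvm_sketch structure and its
      fields are illustrative:

       struct kvm_sketch {
               struct mmu_notifier mmu_notifier;
               atomic_t mmu_notifier_count;
       };

       static void sketch_invalidate_range_start(struct mmu_notifier *mn,
                       struct mm_struct *mm,
                       unsigned long start, unsigned long end)
       {
               struct kvm_sketch *kvm =
                       container_of(mn, struct kvm_sketch, mmu_notifier);

               atomic_inc(&kvm->mmu_notifier_count);  /* faults must wait */
               /* ... drop sptes / flush secondary tlbs for [start, end) ... */
       }

       static void sketch_invalidate_range_end(struct mmu_notifier *mn,
                       struct mm_struct *mm,
                       unsigned long start, unsigned long end)
       {
               struct kvm_sketch *kvm =
                       container_of(mn, struct kvm_sketch, mmu_notifier);

               atomic_dec(&kvm->mmu_notifier_count);  /* faults may map again */
       }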
      
      The mmu_notifier_register call can fail because mm_take_all_locks may be
      interrupted by a signal and return -EINTR.  Because mmu_notifier_register
      is used when a driver starts up, a failure can be gracefully handled.
      Here is an example of the change applied to kvm to register the mmu
      notifiers.  Usually when a driver starts up, other allocations are
      required anyway and -ENOMEM failure paths exist already.
      
       struct  kvm *kvm_arch_create_vm(void)
       {
              struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL);
      +       int err;
      
              if (!kvm)
                      return ERR_PTR(-ENOMEM);
      
              INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
      
      +       kvm->arch.mmu_notifier.ops = &kvm_mmu_notifier_ops;
      +       err = mmu_notifier_register(&kvm->arch.mmu_notifier, current->mm);
      +       if (err) {
      +               kfree(kvm);
      +               return ERR_PTR(err);
      +       }
      +
              return kvm;
       }
      
      mmu_notifier_unregister returns void and it's reliable.
      
      The patch also adds a few needed but missing includes that would
      otherwise prevent the kernel from compiling after these changes on
      non-x86 archs (x86 didn't need them by luck).
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix mm/filemap_xip.c build]
      [akpm@linux-foundation.org: fix mm/mmu_notifier.c build]
      Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Kanoj Sarcar <kanojsarcar@yahoo.com>
      Cc: Roland Dreier <rdreier@cisco.com>
      Cc: Steve Wise <swise@opengridcomputing.com>
      Cc: Avi Kivity <avi@qumranet.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Anthony Liguori <aliguori@us.ibm.com>
      Cc: Chris Wright <chrisw@redhat.com>
      Cc: Marcelo Tosatti <marcelo@kvack.org>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: Izik Eidus <izike@qumranet.com>
      Cc: Anthony Liguori <aliguori@us.ibm.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cddb8a5c
  21. 20 Mar, 2008 1 commit
  22. 05 Feb, 2008 1 commit
  23. 17 Oct, 2007 2 commits
  24. 08 Oct, 2007 1 commit
  25. 19 Jul, 2007 3 commits
    • only allow nonlinear vmas for ram backed filesystems · 3ee6dafc
      Miklos Szeredi authored
      
      page_mkclean() doesn't re-protect ptes for non-linear mappings, so a later
      re-dirty through such a mapping will not generate a fault, PG_dirty will
      not reflect the dirty state and the dirty count will be skewed.  This
      implies that msync() is also currently broken for nonlinear mappings.
      
      The easiest solution is to emulate remap_file_pages on non-linear
      mappings with a simple mmap() for non-ram-backed filesystems.
      Applications continue to work (albeit slower), as long as the number of
      remappings remains below the maximum vma count.
      
      However all currently known real uses of non-linear mappings are for ram
      backed filesystems, which this patch doesn't affect.
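
      A condensed sketch of the emulation idea (prot/flags derivation and
      error handling elided or illustrative; the caller holds mmap_sem just as
      sys_remap_file_pages does):

       static unsigned long remap_emulate_sketch(struct vm_area_struct *vma,
                       unsigned long start, unsigned long size,
                       unsigned long pgoff)
       {
               struct file *file = vma->vm_file;
               unsigned long addr;

               get_file(file);
               /* a plain linear mapping at the requested file offset; each
                * remapping may create one more vma toward the map limit */
               addr = do_mmap(file, start, size, PROT_READ | PROT_WRITE,
                              MAP_FIXED | MAP_SHARED, pgoff << PAGE_SHIFT);
               fput(file);
               return addr;
       }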
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3ee6dafc
    • mm: fault feedback #1 · d0217ac0
      Nick Piggin authored
      
      Change ->fault prototype.  We now return an int, which contains a
      VM_FAULT_xxx code in the low byte and a FAULT_RET_xxx code in the next
      byte.  The FAULT_RET_ code tells the VM whether a page was found, whether
      it has been locked, and potentially other things.  This is not quite the
      way he wanted it yet, but that's changed in the next patch (which
      requires changes to arch code).
      
      This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
      that a page is locked, which requires filemap_nopage to go away (because
      we can no longer remain backward compatible without that flag), but we
      were going to do that anyway.
      
      struct fault_data is renamed to struct vm_fault as Linus asked.  address
      is now a void __user * that we should firmly encourage drivers not to
      use without a really good reason.
      
      The page is now returned via a page pointer in the vm_fault struct.
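
      A sketch of the structure as described above (the field set follows this
      description; comments are ours):

       struct vm_fault {
               unsigned int flags;             /* FAULT_FLAG_xxx */
               pgoff_t pgoff;                  /* precomputed by the VM,
                                                * based on the vma */
               void __user *virtual_address;   /* drivers: avoid without a
                                                * really good reason */
               struct page *page;              /* set by ->fault on success */
       };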
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d0217ac0
    • mm: merge populate and nopage into fault (fixes nonlinear) · 54cb8821
      Nick Piggin authored
      
      Nonlinear mappings are (AFAIKS) simply a virtual memory concept that
      encodes the virtual address -> file offset differently from linear
      mappings.

      ->populate is a layering violation because the filesystem/pagecache code
      shouldn't need to know anything about the virtual memory mapping.  The
      hitch here is that the ->nopage handler didn't pass down enough
      information (ie. pgoff).  But it is more logical to pass pgoff rather
      than have the ->nopage function calculate it itself anyway (because
      that's a similar layering violation).
      
      Having the populate handler install the pte itself is likewise a nasty thing
      to be doing.
      
      This patch introduces a new fault handler that replaces ->nopage and
      ->populate and (later) ->nopfn.  Most of the old mechanism is still in
      place, so there is a lot of duplication; nice cleanups become possible
      once everyone switches over.
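
      An illustrative ->fault handler under the unified scheme (shown with the
      int-returning prototype the "fault feedback" follow-ups settle on;
      my_lookup_page() is hypothetical):

       static int example_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
       {
               /* pgoff arrives precomputed in vmf: no per-driver arithmetic,
                * and the same path serves linear and nonlinear faults */
               struct page *page = my_lookup_page(vma->vm_file, vmf->pgoff);

               if (!page)
                       return VM_FAULT_SIGBUS;
               vmf->page = page;       /* the VM installs the pte itself */
               return 0;
       }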
      
      The rationale for doing this in the first place is that nonlinear mappings are
      subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
      to duplicate the synchronisation logic rather than just consolidate the two.
      
      After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
      pagecache.  Seems like a fringe functionality anyway.
      
      NOPAGE_REFAULT is removed.  This should be implemented with ->fault, and no
      users have hit mainline yet.
      
      [akpm@linux-foundation.org: cleanup]
      [randy.dunlap@oracle.com: doc. fixes for readahead]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      54cb8821
  26. 22 Dec, 2006 1 commit
  27. 07 Dec, 2006 1 commit
  28. 01 Oct, 2006 1 commit
  29. 26 Sep, 2006 1 commit
  30. 23 Jun, 2006 1 commit
  31. 29 Nov, 2005 1 commit
  32. 28 Nov, 2005 1 commit
    • mm: re-architect the VM_UNPAGED logic · 6aab341e
      Linus Torvalds authored
      
      This replaces the (in my opinion horrible) VM_UNPAGED logic with very
      explicit support for a "remapped page range" aka VM_PFNMAP.  It allows a
      VM area to contain an arbitrary range of page table entries that the VM
      never touches, and never considers to be normal pages.
      
      Any user of "remap_pfn_range()" automatically gets this new
      functionality, and doesn't even have to mark the pages reserved or
      indeed mark them any other way.  It just works.  As a side effect, doing
      mmap() on /dev/mem works for arbitrary ranges.
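
      A sketch of a driver mmap handler that picks this up automatically
      (device_phys_base is a hypothetical physical base address):

       static int example_mmap(struct file *filp, struct vm_area_struct *vma)
       {
               unsigned long pfn = device_phys_base >> PAGE_SHIFT;

               /* installs raw VM_PFNMAP ptes the VM never treats as normal
                * pages; no reserved-page marking needed anymore */
               return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
                                      vma->vm_end - vma->vm_start,
                                      vma->vm_page_prot);
       }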
      
      Sparc update from David in the next commit.
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      6aab341e
  33. 22 Nov, 2005 1 commit
    • [PATCH] unpaged: VM_NONLINEAR VM_RESERVED · 101d2be7
      Hugh Dickins authored
      
      There's one peculiar use of VM_RESERVED which the previous patch left
      behind: because VM_NONLINEAR's try_to_unmap_cluster uses vm_private_data
      as a swapout cursor, but should never meet VM_RESERVED vmas, it was a way
      of extending VM_NONLINEAR to VM_RESERVED vmas using vm_private_data for
      some other purpose.  But that's an empty set - they don't have the
      populate function required.  So just throw away those VM_RESERVED tests.

      But one more interesting test in rmap.c has to go too: otherwise
      try_to_unmap_one would want to swap out an anonymous page from a
      VM_RESERVED or VM_UNPAGED area.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      101d2be7