1. 13 Jan, 2012 1 commit
    • Mel Gorman's avatar
      mm: compaction: introduce sync-light migration for use by compaction · a6bc32b8
      Mel Gorman authored
      
      This patch adds a lightweight sync migrate operation MIGRATE_SYNC_LIGHT
      mode that avoids writing back pages to backing storage.  Async compaction
      maps to MIGRATE_ASYNC while sync compaction maps to MIGRATE_SYNC_LIGHT.
      For other migrate_pages users such as memory hotplug, MIGRATE_SYNC is
      used.
      
      This avoids sync compaction stalling for an excessive length of time,
      particularly when copying files to a USB stick where there might be a
      large number of dirty pages backed by a filesystem that does not support
      ->writepages.
      
      [aarcange@redhat.com: This patch is heavily based on Andrea's work]
      [akpm@linux-foundation.org: fix fs/nfs/write.c build]
      [akpm@linux-foundation.org: fix fs/btrfs/disk-io.c build]
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Reviewed-by: default avatarRik van Riel <riel@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Andy Isaacson <adi@hexapodia.org>
      Cc: Nai Xia <nai.xia@gmail.com>
      Cc: Johannes Weiner <jweiner@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a6bc32b8
  2. 31 Oct, 2011 1 commit
  3. 26 Jul, 2011 1 commit
    • Daniel Kiper's avatar
      mm: extend memory hotplug API to allow memory hotplug in virtual machines · 9d0ad8ca
      Daniel Kiper authored
      
      This patch contains online_page_callback and apropriate functions for
      registering/unregistering online page callbacks.  It allows to do some
      machine specific tasks during online page stage which is required to
      implement memory hotplug in virtual machines.  Currently this patch is
      required by latest memory hotplug support for Xen balloon driver patch
      which will be posted soon.
      
      Additionally, originial online_page() function was splited into
      following functions doing "atomic" operations:
      
        - __online_page_set_limits() - set new limits for memory management code,
        - __online_page_increment_counters() - increment totalram_pages and totalhigh_pages,
        - __online_page_free() - free page to allocator.
      
      It was done to:
        - not duplicate existing code,
        - ease hotplug code devolpment by usage of well defined interface,
        - avoid stupid bugs which are unavoidable when the same code
          (by design) is developed in many places.
      
      [akpm@linux-foundation.org: use explicit indirect-call syntax]
      Signed-off-by: default avatarDaniel Kiper <dkiper@net-space.pl>
      Reviewed-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9d0ad8ca
  4. 23 Jun, 2011 2 commits
  5. 16 Jun, 2011 1 commit
    • KAMEZAWA Hiroyuki's avatar
      mm/memory_hotplug.c: fix building of node hotplug zonelist · 959ecc48
      KAMEZAWA Hiroyuki authored
      
      During memory hotplug we refresh zonelists when we online a page in a new
      zone.  It means that the node's zonelist is not initialized until pages
      are onlined.  So for example, "nid" passed by MEM_GOING_ONLINE notifier
      will point to NODE_DATA(nid) which has no zone fallback list.  Moreover,
      if we hot-add cpu-only nodes, alloc_pages() will do no fallback.
      
      This patch makes a zonelist when a new pgdata is available.
      
      Note: in production, at fujitsu, memory should be onlined before cpu
            and our server didn't have any memory-less nodes and had no problems.
      
            But recent changes in MEM_GOING_ONLINE+page_cgroup
            will access not initialized zonelist of node.
            Anyway, there are memory-less node and we need some care.
      Signed-off-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      959ecc48
  6. 25 May, 2011 4 commits
  7. 14 Apr, 2011 1 commit
  8. 31 Mar, 2011 1 commit
  9. 14 Jan, 2011 3 commits
  10. 11 Jan, 2011 1 commit
  11. 02 Dec, 2010 1 commit
  12. 26 Oct, 2010 6 commits
  13. 19 Oct, 2010 1 commit
  14. 10 Sep, 2010 1 commit
  15. 25 May, 2010 3 commits
    • Haicheng Li's avatar
      mem-hotplug: fix potential race while building zonelist for new populated zone · 4eaf3f64
      Haicheng Li authored
      
      Add global mutex zonelists_mutex to fix the possible race:
      
           CPU0                                  CPU1                    CPU2
      (1) zone->present_pages += online_pages;
      (2)                                       build_all_zonelists();
      (3)                                                               alloc_page();
      (4)                                                               free_page();
      (5) build_all_zonelists();
      (6)   __build_all_zonelists();
      (7)     zone->pageset = alloc_percpu();
      
      In step (3,4), zone->pageset still points to boot_pageset, so bad
      things may happen if 2+ nodes are in this state. Even if only 1 node
      is accessing the boot_pageset, (3) may still consume too much memory
      to fail the memory allocations in step (7).
      
      Besides, atomic operation ensures alloc_percpu() in step (7) will never fail
      since there is a new fresh memory block added in step(6).
      
      [haicheng.li@linux.intel.com: hold zonelists_mutex when build_all_zonelists]
      Signed-off-by: default avatarHaicheng Li <haicheng.li@linux.intel.com>
      Signed-off-by: default avatarWu Fengguang <fengguang.wu@intel.com>
      Reviewed-by: default avatarAndi Kleen <andi.kleen@intel.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4eaf3f64
    • Haicheng Li's avatar
      mem-hotplug: avoid multiple zones sharing same boot strapping boot_pageset · 1f522509
      Haicheng Li authored
      
      For each new populated zone of hotadded node, need to update its pagesets
      with dynamically allocated per_cpu_pageset struct for all possible CPUs:
      
          1) Detach zone->pageset from the shared boot_pageset
             at end of __build_all_zonelists().
      
          2) Use mutex to protect zone->pageset when it's still
             shared in onlined_pages()
      
      Otherwises, multiple zones of different nodes would share same boot strapping
      boot_pageset for same CPU, which will finally cause below kernel panic:
      
        ------------[ cut here ]------------
        kernel BUG at mm/page_alloc.c:1239!
        invalid opcode: 0000 [#1] SMP
        ...
        Call Trace:
         [<ffffffff811300c1>] __alloc_pages_nodemask+0x131/0x7b0
         [<ffffffff81162e67>] alloc_pages_current+0x87/0xd0
         [<ffffffff81128407>] __page_cache_alloc+0x67/0x70
         [<ffffffff811325f0>] __do_page_cache_readahead+0x120/0x260
         [<ffffffff81132751>] ra_submit+0x21/0x30
         [<ffffffff811329c6>] ondemand_readahead+0x166/0x2c0
         [<ffffffff81132ba0>] page_cache_async_readahead+0x80/0xa0
         [<ffffffff8112a0e4>] generic_file_aio_read+0x364/0x670
         [<ffffffff81266cfa>] nfs_file_read+0xca/0x130
         [<ffffffff8117b20a>] do_sync_read+0xfa/0x140
         [<ffffffff8117bf75>] vfs_read+0xb5/0x1a0
         [<ffffffff8117c151>] sys_read+0x51/0x80
         [<ffffffff8103c032>] system_call_fastpath+0x16/0x1b
        RIP  [<ffffffff8112ff13>] get_page_from_freelist+0x883/0x900
         RSP <ffff88000d1e78a8>
        ---[ end trace 4bda28328b9990db ]
      
      [akpm@linux-foundation.org: merge fix]
      Signed-off-by: default avatarHaicheng Li <haicheng.li@linux.intel.com>
      Signed-off-by: default avatarWu Fengguang <fengguang.wu@intel.com>
      Reviewed-by: default avatarAndi Kleen <andi.kleen@intel.com>
      Reviewed-by: default avatarChristoph Lameter <cl@linux-foundation.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1f522509
    • minskey guo's avatar
      cpu/mem hotplug: enable CPUs online before local memory online · cf23422b
      minskey guo authored
      
      Enable users to online CPUs even if the CPUs belongs to a numa node which
      doesn't have onlined local memory.
      
      The zonlists(pg_data_t.node_zonelists[]) of a numa node are created either
      in system boot/init period, or at the time of local memory online.  For a
      numa node without onlined local memory, its zonelists are not initialized
      at present.  As a result, any memory allocation operations executed by
      CPUs within this node will fail.  In fact, an out-of-memory error is
      triggered when attempt to online CPUs before memory comes to online.
      
      This patch tries to create zonelists for such numa nodes, so that the
      memory allocation for this node can be fallback'ed to other nodes.
      
      [akpm@linux-foundation.org: remove unneeded export]
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: minskey guo<chaohong.guo@intel.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cf23422b
  16. 12 Mar, 2010 1 commit
    • Wu Fengguang's avatar
      mm: introduce dump_page() and print symbolic flag names · 718a3821
      Wu Fengguang authored
      
      - introduce dump_page() to print the page info for debugging some error
        condition.
      
      - convert three mm users: bad_page(), print_bad_pte() and memory offline
        failure.
      
      - print an extra field: the symbolic names of page->flags
      
      Example dump_page() output:
      
      [  157.521694] page:ffffea0000a7cba8 count:2 mapcount:1 mapping:ffff88001c901791 index:0x147
      [  157.525570] page flags: 0x100000000100068(uptodate|lru|active|swapbacked)
      Signed-off-by: default avatarWu Fengguang <fengguang.wu@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alex Chiang <achiang@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Mel Gorman <mel@linux.vnet.ibm.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      718a3821
  17. 06 Mar, 2010 1 commit
  18. 15 Dec, 2009 5 commits
    • Rakib Mullick's avatar
      mm: fix section mismatch in memory_hotplug.c · 23ce932a
      Rakib Mullick authored
      
      __free_pages_bootmem() is a __meminit function - which has been called
      from put_pages_bootmem thus causes a section mismatch warning.
      
       We were warned by the following warning:
      
        LD      mm/built-in.o
      WARNING: mm/built-in.o(.text+0x26b22): Section mismatch in reference
      from the function put_page_bootmem() to the function
      .meminit.text:__free_pages_bootmem()
      The function put_page_bootmem() references
      the function __meminit __free_pages_bootmem().
      This is often because put_page_bootmem lacks a __meminit
      annotation or the annotation of __free_pages_bootmem is wrong.
      Signed-off-by: default avatarRakib Mullick <rakib.mullick@gmail.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      23ce932a
    • Andrew Morton's avatar
      mm: memory_hotplug: make offline_pages() static · b4e655a4
      Andrew Morton authored
      
      It has no references outside memory_hotplug.c.
      
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b4e655a4
    • Hugh Dickins's avatar
      ksm: memory hotremove migration only · 62b61f61
      Hugh Dickins authored
      
      The previous patch enables page migration of ksm pages, but that soon gets
      into trouble: not surprising, since we're using the ksm page lock to lock
      operations on its stable_node, but page migration switches the page whose
      lock is to be used for that.  Another layer of locking would fix it, but
      do we need that yet?
      
      Do we actually need page migration of ksm pages?  Yes, memory hotremove
      needs to offline sections of memory: and since we stopped allocating ksm
      pages with GFP_HIGHUSER, they will tend to be GFP_HIGHUSER_MOVABLE
      candidates for migration.
      
      But KSM is currently unconscious of NUMA issues, happily merging pages
      from different NUMA nodes: at present the rule must be, not to use
      MADV_MERGEABLE where you care about NUMA.  So no, NUMA page migration of
      ksm pages does not make sense yet.
      
      So, to complete support for ksm swapping we need to make hotremove safe.
      ksm_memory_callback() take ksm_thread_mutex when MEM_GOING_OFFLINE and
      release it when MEM_OFFLINE or MEM_CANCEL_OFFLINE.  But if mapped pages
      are freed before migration reaches them, stable_nodes may be left still
      pointing to struct pages which have been removed from the system: the
      stable_node needs to identify a page by pfn rather than page pointer, then
      it can safely prune them when MEM_OFFLINE.
      
      And make NUMA migration skip PageKsm pages where it skips PageReserved.
      But it's only when we reach unmap_and_move() that the page lock is taken
      and we can be sure that raised pagecount has prevented a PageAnon from
      being upgraded: so add offlining arg to migrate_pages(), to migrate ksm
      page when offlining (has sufficient locking) but reject it otherwise.
      Signed-off-by: default avatarHugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Izik Eidus <ieidus@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Chris Wright <chrisw@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      62b61f61
    • David Rientjes's avatar
      mm: clear node in N_HIGH_MEMORY and stop kswapd when all memory is offlined · 8fe23e05
      David Rientjes authored
      
      When memory is hot-removed, its node must be cleared in N_HIGH_MEMORY if
      there are no present pages left.
      
      In such a situation, kswapd must also be stopped since it has nothing left
      to do.
      Signed-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Signed-off-by: default avatarLee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Nishanth Aravamudan <nacc@us.ibm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: Eric Whitney <eric.whitney@hp.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8fe23e05
    • KOSAKI Motohiro's avatar
      mm: move inc_zone_page_state(NR_ISOLATED) to just isolated place · 6d9c285a
      KOSAKI Motohiro authored
      
      Christoph pointed out inc_zone_page_state(NR_ISOLATED) should be placed
      in right after isolate_page().
      
      This patch does it.
      Reviewed-by: default avatarChristoph Lameter <cl@linux-foundation.org>
      Signed-off-by: default avatarKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6d9c285a
  19. 18 Nov, 2009 2 commits
    • Andi Kleen's avatar
      mm: allow memory hotplug and hibernation in the same kernel · 6ad696d2
      Andi Kleen authored
      
      Allow memory hotplug and hibernation in the same kernel
      
      Memory hotplug and hibernation were exclusive in Kconfig.  This is
      obviously a problem for distribution kernels who want to support both in
      the same image.
      
      After some discussions with Rafael and others the only problem is with
      parallel memory hotadd or removal while a hibernation operation is in
      process.  It was also working for s390 before.
      
      This patch removes the Kconfig level exclusion, and simply makes the
      memory add / remove functions grab the pm_mutex to exclude against
      hibernation.
      
      Fixes a regression - old kernels didn't exclude memory hotadd and
      hibernation.
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Acked-by: default avatarRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6ad696d2
    • Hidetoshi Seto's avatar
      mm/memory_hotplug: fix section mismatch · e1319331
      Hidetoshi Seto authored
      
      With CONFIG_MEMORY_HOTPLUG I got following warning:
      
      WARNING: vmlinux.o(.text+0x1276b0): Section mismatch in reference from
      the function hotadd_new_pgdat() to the function
      .meminit.text:free_area_init_node()
      The function hotadd_new_pgdat() references
      the function __meminit free_area_init_node().
      This is often because hotadd_new_pgdat lacks a __meminit
      annotation or the annotation of free_area_init_node is wrong.
      
      Use __ref to fix this.
      Signed-off-by: default avatarHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e1319331
  20. 23 Sep, 2009 1 commit
    • KAMEZAWA Hiroyuki's avatar
      walk system ram range · 908eedc6
      KAMEZAWA Hiroyuki authored
      
      Originally, walk_memory_resource() was introduced to traverse all memory
      of "System RAM" for detecting memory hotplug/unplug range.  For doing so,
      flags of IORESOUCE_MEM|IORESOURCE_BUSY was used and this was enough for
      memory hotplug.
      
      But for using other purpose, /proc/kcore, this may includes some firmware
      area marked as IORESOURCE_BUSY | IORESOUCE_MEM.  This patch makes the
      check strict to find out busy "System RAM".
      
      Note: PPC64 keeps their own walk_memory_resouce(), which walk through
      ppc64's lmb informaton.  Because old kclist_add() is called per lmb, this
      patch makes no difference in behavior, finally.
      
      And this patch removes CONFIG_MEMORY_HOTPLUG check from this function.
      Because pfn_valid() just show "there is memmap or not* and cannot be used
      for "there is physical memory or not", this function is useful in generic
      to scan physical memory range.
      Signed-off-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmi...
      908eedc6
  21. 22 Sep, 2009 2 commits