1. 08 Apr, 2022 1 commit
  2. 16 Oct, 2020 4 commits
    • binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot · a07279c9
      Jann Horn authored
      
      In both binfmt_elf and binfmt_elf_fdpic, use a new helper
      dump_vma_snapshot() to take a snapshot of the VMA list (including the gate
      VMA, if we have one) while protected by the mmap_lock, and then use that
      snapshot instead of walking the VMA list without locking.
      
      An alternative approach would be to keep the mmap_lock held across the
      entire core dumping operation; however, keeping the mmap_lock locked while
      we may be blocked for an unbounded amount of time (e.g.  because we're
      dumping to a FUSE filesystem or so) isn't really optimal; the mmap_lock
      blocks things like the ->release handler of userfaultfd, and we don't
      really want critical system daemons to grind to a halt just because
      someone "gifted" them SCM_RIGHTS to an eternally-locked userfaultfd, or
      something like that.
      
      Since both the normal ELF code and the FDPIC ELF code need this
      functionality (and if any other binfmt wants to add coredump support in
      the future, they'd probably need it, too), implement this with a common
      helper in fs/coredump.c.
      
      A downside of this approach is that we now need a bigger amount of kernel
      memory per userspace VMA in the normal ELF case, and that we need O(n)
      kernel memory in the FDPIC ELF case at all; but 40 bytes per VMA shouldn't
      be terribly bad.
      
      There currently is a data race between stack expansion and anything that
      reads ->vm_start or ->vm_end under the mmap_lock held in read mode; to
      mitigate that for core dumping, take the mmap_lock in write mode when
      taking a snapshot of the VMA hierarchy.  (If we only took the mmap_lock in
      read mode, we could end up with a corrupted core dump if someone does
      get_user_pages_remote() concurrently.  Not really a major problem, but
      taking the mmap_lock either way works here, so we might as well avoid the
      issue.) (This doesn't do anything about the existing data races with stack
      expansion in other mm code.)
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Link: http://lkml.kernel.org/r/20200827114932.3572699-6-jannh@google.com
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
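
The snapshot idea in this commit can be illustrated with a small userspace analogue. This is only a sketch under stated assumptions: `struct region`, `struct region_meta`, `struct address_space` and `snapshot_regions()` are hypothetical names, a pthread mutex stands in for the mmap_lock, and none of this is the kernel's dump_vma_snapshot() code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel structures; names are illustrative only. */
struct region {                 /* plays the role of a VMA */
    unsigned long start, end;
    struct region *next;
};

struct region_meta {            /* plays the role of per-VMA snapshot metadata */
    unsigned long start, end;
};

struct address_space {
    pthread_mutex_t lock;       /* plays the role of the mmap_lock */
    struct region *head;
};

/*
 * Copy the region list into a freshly allocated array while holding the lock.
 * Returns 0 on success and -1 on allocation failure. The caller owns *out and
 * may walk it after the lock is dropped, even while blocked on slow I/O.
 */
int snapshot_regions(struct address_space *as, struct region_meta **out,
                     size_t *count)
{
    size_t n = 0, i = 0;
    struct region *r;
    struct region_meta *snap;

    assert(as && out && count);

    pthread_mutex_lock(&as->lock);
    for (r = as->head; r; r = r->next)
        n++;
    /* Memory cost is O(n): one small record per region. */
    snap = calloc(n ? n : 1, sizeof(*snap));
    if (!snap) {
        pthread_mutex_unlock(&as->lock);
        return -1;
    }
    for (r = as->head; r; r = r->next, i++) {
        snap[i].start = r->start;
        snap[i].end = r->end;
    }
    pthread_mutex_unlock(&as->lock);

    *out = snap;
    *count = n;
    return 0;
}
```

The lock is held only for the two list walks, so a writer that modifies the list afterwards cannot affect the copy that the slow dumping phase reads.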
    • coredump: rework elf/elf_fdpic vma_dump_size() into common helper · 429a22e7
      Jann Horn authored
      
      At the moment, the binfmt_elf and binfmt_elf_fdpic code have slightly
      different code to figure out which VMAs should be dumped, and if so,
      whether the dump should contain the entire VMA or just its first page.
      
      Eliminate duplicate code by reworking the binfmt_elf version into a
      generic core dumping helper in coredump.c.
      
      As part of that, change the heuristic for detecting executable/library
      header pages to check whether the inode is executable instead of looking
      at the file contents.
      
      This is less problematic in terms of locking because it lets us avoid
      get_user() under the mmap_sem.  (And arguably it looks nicer and makes
      more sense in generic code.)
      
      Adjust a little bit based on the binfmt_elf_fdpic version: ->anon_vma is
      only meaningful under CONFIG_MMU, otherwise we have to assume that the VMA
      has been written to.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Link: http://lkml.kernel.org/r/20200827114932.3572699-5-jannh@google.com
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
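
The "entire VMA, first page, or nothing" decision described in this commit might be sketched roughly as follows. This is a deliberately simplified illustration: `struct mapping_info`, `sketch_dump_size()` and the 4096-byte page size are assumptions, and the real helper considers further conditions (for example the per-process coredump filter) that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumed page size for this sketch */

/* Hypothetical, simplified description of one mapping. */
struct mapping_info {
    unsigned long start, end;   /* [start, end) virtual address range */
    unsigned long file_offset;  /* offset into the backing file, in bytes */
    bool file_backed;           /* false for anonymous memory */
    bool modified;              /* private pages were written to */
    bool inode_executable;      /* backing inode carries an execute bit */
};

/* Number of bytes of this mapping that the sketch would put in the dump. */
unsigned long sketch_dump_size(const struct mapping_info *m)
{
    assert(m && m->end >= m->start);

    /* Anonymous or modified memory cannot be recovered from a file. */
    if (!m->file_backed || m->modified)
        return m->end - m->start;

    /* Likely an executable or library header: keep only the first page. */
    if (m->file_offset == 0 && m->inode_executable)
        return SKETCH_PAGE_SIZE;

    /* Unmodified file data can be re-read from the file itself. */
    return 0;
}
```

Checking a property of the inode rather than reading the first bytes of the mapping means the decision needs no access to user memory.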
    • coredump: refactor page range dumping into common helper · afc63a97
      Jann Horn authored
      
      Both fs/binfmt_elf.c and fs/binfmt_elf_fdpic.c need to dump ranges of
      pages into the coredump file.  Extract that logic into a common helper.
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Link: http://lkml.kernel.org/r/20200827114932.3572699-4-jannh@google.com
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
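
A shared "dump a range of pages" helper of the kind this commit extracts could look roughly like the userspace sketch below. All names (`page_lookup_fn`, `sketch_dump_range()`, `struct single_page`, `lookup_single()`) are hypothetical, the page size is assumed to be 4096 bytes, and absent pages are written as zeros here for simplicity; the kernel helper may handle them differently.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumed page size for this sketch */

/* Hypothetical callback: returns the page's bytes, or NULL if it is absent. */
typedef const unsigned char *(*page_lookup_fn)(unsigned long addr, void *ctx);

/*
 * Write [start, start + len) to `out` one page at a time.
 * Absent pages are emitted as zeros so that file offsets stay aligned.
 * Returns 1 on success and 0 on a write error.
 */
int sketch_dump_range(FILE *out, unsigned long start, unsigned long len,
                      page_lookup_fn lookup, void *ctx)
{
    static const unsigned char zero_page[SKETCH_PAGE_SIZE];
    unsigned long addr;

    assert(out && lookup);
    assert(start % SKETCH_PAGE_SIZE == 0 && len % SKETCH_PAGE_SIZE == 0);

    for (addr = start; addr < start + len; addr += SKETCH_PAGE_SIZE) {
        const unsigned char *page = lookup(addr, ctx);

        if (fwrite(page ? page : zero_page, 1, SKETCH_PAGE_SIZE, out)
            != SKETCH_PAGE_SIZE)
            return 0;
    }
    return 1;
}

/* Example lookup for demonstration: exactly one resident page. */
struct single_page {
    unsigned long addr;
    const unsigned char *data;
};

const unsigned char *lookup_single(unsigned long addr, void *ctx)
{
    const struct single_page *sp = ctx;

    return addr == sp->addr ? sp->data : NULL;
}
```

Both callers then only need to supply the address range, which is what lets two binfmt implementations share one loop.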
    • binfmt_elf_fdpic: stop using dump_emit() on user pointers on !MMU · 8f942eea
      Jann Horn authored
      Patch series "Fix ELF / FDPIC ELF core dumping, and use mmap_lock properly in there", v5.
      
      At the moment, we have that rather ugly mmget_still_valid() helper to work
      around <https://crbug.com/project-zero/1790>: ELF core dumping doesn't
      take the mmap_sem while traversing the task's VMAs, and if anything (like
      userfaultfd) then remotely messes with the VMA tree, fireworks ensue.  So
      at the moment we use mmget_still_valid() to bail out in any writers that
      might be operating on a remote mm's VMAs.
      
      With this series, I'm trying to get rid of the need for that as cleanly as
      possible.  ("cleanly" meaning "avoid holding the mmap_lock across
      unbounded sleeps".)
      
      Patches 1, 2, 3 and 4 are relatively unrelated cleanups in the core
      dumping code.
      
      Patches 5 and 6 implement the main change: Instead of repeatedly accessing
      the VMA list with sleeps in between, we snapshot it at the start with
      proper locking, and then later we just use our copy of the VMA list.  This
      ensures that the kernel won't crash, that VMA metadata in the coredump is
      consistent even in the presence of concurrent modifications, and that any
      virtual addresses that aren't being concurrently modified have their
      contents show up in the core dump properly.
      
      The disadvantage of this approach is that we need a bit more memory during
      core dumping for storing metadata about all VMAs.
      
      At the end of the series, patch 7 removes the old workaround for this
      issue (mmget_still_valid()).
      
      I have tested:
      
       - Creating a simple core dump on X86-64 still works.
       - The created coredump on X86-64 opens in GDB and looks plausible.
       - X86-64 core dumps contain the first page for executable mappings at
         offset 0, and don't contain the first page for non-executable file
         mappings or executable mappings at offset !=0.
       - NOMMU 32-bit ARM can still generate plausible-looking core dumps
         through the FDPIC implementation. (I can't test this with GDB because
         GDB is missing some structure definition for nommu ARM, but I've
         poked around in the hexdump and it looked decent.)
      
      This patch (of 7):
      
      dump_emit() is for kernel pointers, and VMAs describe userspace memory.
      Let's be tidy here and avoid accessing userspace pointers under KERNEL_DS,
      even if it probably doesn't matter much on !MMU systems - especially given
      that it looks like we can just use the same get_dump_page() as on MMU if
      we move it out of the CONFIG_MMU block.
      
      One small change we have to make in get_dump_page() is to use
      __get_user_pages_locked() instead of __get_user_pages(), since the latter
      doesn't exist on nommu.  On mmu builds, __get_user_pages_locked() will
      just call __get_user_pages() for us.
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Link: http://lkml.kernel.org/r/20200827114932.3572699-1-jannh@google.com
      Link: http://lkml.kernel.org/r/20200827114932.3572699-2-jannh@google.com
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 07 Aug, 2020 1 commit
    • mm: remove unneeded includes of <asm/pgalloc.h> · ca15ca40
      Mike Rapoport authored
      
      Patch series "mm: cleanup usage of <asm/pgalloc.h>"
      
      Most architectures have very similar versions of pXd_alloc_one() and
      pXd_free_one() for intermediate levels of page table.  These patches add
      generic versions of these functions in <asm-generic/pgalloc.h> and enable
      use of the generic functions where appropriate.
      
      In addition, functions declared and defined in <asm/pgalloc.h> headers are
      used mostly by core mm and early mm initialization in arch and there is no
      actual reason to have the <asm/pgalloc.h> included all over the place.
      The first patch in this series removes unneeded includes of
      <asm/pgalloc.h>
      
      In the end it didn't work out as neatly as I hoped and moving
      pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
      unnecessary changes to arches that have custom page table allocations, so
      I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
      to mm/.
      
      This patch (of 8):
      
      In most cases <asm/pgalloc.h> header is required only for allocations of
      page table memory.  Most of the .c files that include that header do not
      use symbols declared in <asm/pgalloc.h> and do not require that header.
      
      As for the other header files that used to include <asm/pgalloc.h>, it is
      possible to move that include into the .c file that actually uses symbols
      from <asm/pgalloc.h> and drop the include from the header file.
      
      The process was somewhat automated using
      
      	sed -i -E '/[<"]asm\/pgalloc\.h/d' \
                      $(grep -L -w -f /tmp/xx \
                              $(git grep -E -l '[<"]asm/pgalloc\.h'))
      
      where /tmp/xx contains all the symbols defined in
      arch/*/include/asm/pgalloc.h.
      
      [rppt@linux.ibm.com: fix powerpc warning]
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Pekka Enberg <penberg@kernel.org>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
      Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 27 Jul, 2020 6 commits
  5. 03 Jun, 2020 1 commit
  6. 27 May, 2020 1 commit
  7. 21 May, 2020 1 commit
    • exec: Generic execfd support · b8a61c9e
      Eric W. Biederman authored
      Most of the support for passing the file descriptor of an executable
      to an interpreter already lives in the generic code and in binfmt_elf.
      Rework the fields in binfmt_elf that deal with executable file
      descriptor passing to make executable file descriptor passing a first
      class concept.
      
      Move the fd_install from binfmt_misc into begin_new_exec after the new
      creds have been installed.  This means that accessing the file through
      /proc/<pid>/fd/N is able to see the creds for the new executable
      before allowing access to the new executable's files.
      
      Performing the install of the executable's file descriptor after
      the point of no return also means that nothing special needs to
      be done on error.  The exiting of the process will close all
      of its open files.
      
      Move the would_dump from binfmt_misc into begin_new_exec right
      after would_dump is called on the bprm->file.  This makes it
      obvious this case exists and that no nesting of bprm->file is
      currently supported.
      
      In binfmt_misc the movement of fd_install into generic code means
      that its special error exit path is no longer needed.
      
      Link: https://lkml.kernel.org/r/87y2poyd91.fsf_-_@x220.int.ebiederm.org
      
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  8. 07 May, 2020 3 commits
  9. 05 May, 2020 1 commit
  10. 15 Nov, 2019 1 commit
    • y2038: elfcore: Use __kernel_old_timeval for process times · e2bb80d5
      Arnd Bergmann authored
      
      We store elapsed time for a crashed process in struct elf_prstatus using
      'timeval' structures. Once glibc starts using 64-bit time_t, this becomes
      incompatible with the kernel's idea of timeval since the structure layout
      no longer matches on 32-bit architectures.
      
      This changes the definition of the elf_prstatus structure to use
      __kernel_old_timeval instead, which is hardcoded to the currently used
      binary layout. There is no risk of overflow in y2038 though, because
      the time values are all relative times, and can store up to 68 years
      of process elapsed time.
      
      There is a risk of applications breaking at build time when they
      use the new kernel headers and expect the type to be exactly 'timeval'
      rather than a structure that has the same fields as before. Those
      applications have to be modified to deal with 64-bit time_t anyway.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
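
The "68 years" figure in this commit follows from the range of a signed 32-bit seconds field. A small worked check, where `struct sketch_old_timeval32` and `sketch_max_years_32bit()` are hypothetical names and the struct only approximates what a 32-bit ABI stores:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-layout pair, similar to what a 32-bit ABI stores. */
struct sketch_old_timeval32 {
    int32_t tv_sec;
    int32_t tv_usec;
};

/* The layout is pinned: two 32-bit fields, no matter how wide time_t is. */
static_assert(sizeof(struct sketch_old_timeval32) == 8,
              "two 32-bit fields must occupy 8 bytes");

/* Whole 365-day years that a signed 32-bit second counter can represent. */
long sketch_max_years_32bit(void)
{
    const long seconds_per_year = 365L * 24 * 60 * 60;  /* 31,536,000 */

    return (long)(INT32_MAX / seconds_per_year);         /* evaluates to 68 */
}
```

Because the stored values are elapsed process times rather than wall-clock dates, a process would have to accumulate roughly 68 years of CPU time before the field overflowed.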
  11. 30 May, 2019 1 commit
  12. 12 Jun, 2018 1 commit
    • treewide: kmalloc() -> kmalloc_array() · 6da2ec56
      Kees Cook authored
      The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
      patch replaces cases of:
      
              kmalloc(a * b, gfp)
      
      with:

              kmalloc_array(a, b, gfp)
      
      as well as handling cases of:
      
              kmalloc(a * b * c, gfp)
      
      with:
      
              kmalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kmalloc_array(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kmalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The tools/ directory was manually excluded, since it has its own
      implementation of kmalloc().
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kmalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kmalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
      ...
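
The reason for preferring a two-factor allocation call is that the multiplication can be checked for overflow before any memory is requested. A userspace sketch of that idea, where `sketch_alloc_array()` is a hypothetical analogue built on malloc() and not the kernel function:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for n elements of `size` bytes each.
 * Returns NULL instead of a too-small buffer if n * size would overflow.
 * Precondition for this sketch: the element size is nonzero.
 */
void *sketch_alloc_array(size_t n, size_t size)
{
    assert(size != 0);

    if (n > SIZE_MAX / size)
        return NULL;            /* n * size does not fit in size_t */
    return malloc(n * size);
}
```

With an open-coded `malloc(a * b)`, an overflowing product silently wraps around and yields a buffer smaller than the caller expects; the checked form turns that case into an ordinary allocation failure.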
  13. 11 Apr, 2018 1 commit
  14. 12 Oct, 2017 1 commit
  15. 10 Sep, 2017 2 commits
  16. 04 Sep, 2017 1 commit
  17. 01 Aug, 2017 1 commit
    • binfmt: Introduce secureexec flag · c425e189
      Kees Cook authored
      
      The bprm_secureexec hook can be moved earlier. Right now, it is called
      during create_elf_tables(), via load_binary(), via search_binary_handler(),
      via exec_binprm(). Nearly all (see exception below) state used by
      bprm_secureexec is created during the bprm_set_creds hook, called from
      prepare_binprm().
      
      For all LSMs (except commoncaps described next), only the first execution
      of bprm_set_creds takes any effect (they all check bprm->called_set_creds
      which prepare_binprm() sets after the first call to the bprm_set_creds
      hook).  However, all these LSMs also only do anything with bprm_secureexec
      when they detected a secure state during their first run of bprm_set_creds.
      Therefore, it is functionally identical to move the detection into
      bprm_set_creds, since the results from secureexec here only need to be
      based on the first call to the LSM's bprm_set_creds hook.
      
      The single exception is that the commoncaps secureexec hook also examines
      euid/uid and egid/gid differences which are controlled by bprm_fill_uid(),
      via prepare_binprm(), which can be called multiple times (e.g.
      binfmt_script, binfmt_misc), and may clear the euid/egid for the final
      load (i.e. the script interpreter). However, while commoncaps specifically
      ignores bprm->cred_prepared, and runs its bprm_set_creds hook each time
      prepare_binprm() may get called, it needs to base the secureexec decision
      on the final call to bprm_set_creds. As a result, it will need special
      handling.
      
      To begin this refactoring, this adds the secureexec flag to the bprm
      struct, and calls the secureexec hook during setup_new_exec(). This is
      safe since all the cred work is finished (and past the point of no return).
      This explicit call will be removed in later patches once the hook has been
      removed.
      
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: John Johansen <john.johansen@canonical.com>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
  18. 02 Mar, 2017 3 commits
  19. 01 Feb, 2017 3 commits
    • fs/binfmt: Convert obsolete cputime type to nsecs · cd19c364
      Frederic Weisbecker authored
      
      Use the new nsec based cputime accessors as part of the whole cputime
      conversion from cputime_t to nsecs.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Link: http://lkml.kernel.org/r/1485832191-26889-12-git-send-email-fweisbec@gmail.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
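
Once CPU time is kept in nanoseconds, filling a seconds/microseconds pair for a core dump header is a plain division. The sketch below shows that arithmetic; `struct sketch_timeval` and `sketch_ns_to_timeval()` are hypothetical names chosen for illustration and are not claimed to match the accessors this commit uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical seconds/microseconds pair, in the style of a timeval. */
struct sketch_timeval {
    long tv_sec;
    long tv_usec;
};

/* Split a nanosecond count into whole seconds and leftover microseconds. */
struct sketch_timeval sketch_ns_to_timeval(uint64_t nsec)
{
    struct sketch_timeval tv;

    tv.tv_sec = (long)(nsec / 1000000000ULL);
    tv.tv_usec = (long)((nsec % 1000000000ULL) / 1000ULL);
    assert(tv.tv_usec >= 0 && tv.tv_usec < 1000000);
    return tv;
}
```

For example, 1,500,000,000 ns becomes 1 second and 500,000 microseconds; any remainder below one microsecond is truncated.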
    • sched/cputime: Convert task/group cputime to nsecs · 5613fda9
      Frederic Weisbecker authored
      
      Now that most cputime readers use the transition API which returns the
      task cputime in old style cputime_t, we can safely store the cputime in
      nsecs. This will eventually make cputime statistics less opaque and more
      granular. Back and forth conversions between cputime_t and nsecs in order
      to deal with cputime_t random granularity won't be needed anymore.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Link: http://lkml.kernel.org/r/1485832191-26889-8-git-send-email-fweisbec@gmail.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/cputime: Introduce special task_cputime_t() API to return old-typed cputime · a1cecf2b
      Frederic Weisbecker authored
      
      This API returns a task's cputime in cputime_t in order to ease the
      conversion of cputime internals to use nsecs units instead. Blindly
      converting all cputime readers to use this API now will later let us
      convert more smoothly and step by step all these places to use the
      new nsec based cputime.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Link: http://lkml.kernel.org/r/1485832191-26889-7-git-send-email-fweisbec@gmail.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  20. 24 Dec, 2016 1 commit
  21. 25 Jul, 2016 1 commit
  22. 08 Jun, 2016 1 commit
  23. 12 May, 2016 1 commit
  24. 04 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely that it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion whether a
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patc...
  25. 09 Nov, 2015 1 commit
    • fs/binfmt_elf_fdpic.c: provide NOMMU loader for regular ELF binaries · 1bde925d
      Rich Felker authored
      
      The ELF binary loader in binfmt_elf.c requires an MMU, making it
      impossible to use regular ELF binaries on NOMMU archs.  However, the FDPIC
      ELF loader in binfmt_elf_fdpic.c is fully capable as a loader for plain
      ELF, which requires constant displacements between LOAD segments, since it
      already supports FDPIC ELF files flagged as needing constant displacement.
      
      This patch adjusts the FDPIC ELF loader to accept non-FDPIC ELF files on
      NOMMU archs.  They are treated identically to FDPIC ELF files with the
      constant-displacement flag bit set, except for personality, which must
      match the ABI of the program being loaded; the PER_LINUX_FDPIC personality
      controls how the kernel interprets function pointers passed to sigaction.
      
      Files that do not set a stack size requirement explicitly are given a
      default stack size (matching the amount of committed stack the normal ELF
      loader for MMU archs would give them) rather than being rejected; this is
      necessary because plain ELF files generally do not declare stack
      requirements in their program headers.
      
      Only ET_DYN (PIE) format ELF files are supported, since loading at a fixed
      virtual address is not possible on NOMMU.
      
      This patch was developed and tested on J2 (SH2-compatible) but should
      be usable immediately on all archs where binfmt_elf_fdpic is
      available. Moreover, by providing dummy definitions of the
      elf_check_fdpic() and elf_check_const_displacement() macros for archs
      which lack an FDPIC ABI, it should be possible to enable building of
      binfmt_elf_fdpic on all other NOMMU archs and thereby give them ELF
      binary support, but I have not yet tested this.
      
      The motivation for using binfmt_elf_fdpic.c rather than adapting
      binfmt_elf.c to NOMMU is that the former already has all the necessary
      code to work properly on NOMMU and has already received widespread
      real-world use and testing. I hope this is not controversial.
      
      I'm not really happy with having to unset the FDPIC_FUNCPTRS
      personality bit when loading non-FDPIC ELF. This bit should really
      reset automatically on execve, since otherwise, executing non-ELF
      binaries (e.g. bFLT) from an FDPIC process will leave the personality
      in the wrong state and severely break signal handling. But that's a
      separate, existing bug and I don't know the right place to fix it.
      Signed-off-by: Rich Felker <dalias@libc.org>
      Acked-by: Greg Ungerer <gerg@uclinux.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Oleg Endo <oleg.endo@t-online.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
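
Two of the rules described in this commit, accepting only ET_DYN for plain ELF files and falling back to a default stack size, can be sketched as below. `struct sketch_elf_info`, both function names and the 128 KiB default are assumptions made for illustration; only the ET_EXEC/ET_DYN values come from the ELF specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* e_type values as defined by the ELF specification. */
enum { SKETCH_ET_EXEC = 2, SKETCH_ET_DYN = 3 };

/* Placeholder default; the kernel's actual value is not given in the text. */
#define SKETCH_DEFAULT_STACK (128UL * 1024)

/* Hypothetical summary of the fields this sketch cares about. */
struct sketch_elf_info {
    uint16_t e_type;               /* ET_EXEC, ET_DYN, ... */
    unsigned long declared_stack;  /* 0 if the file declares no stack size */
};

/* Plain ELF on NOMMU: only position-independent (ET_DYN) files can load. */
bool sketch_plain_elf_ok_on_nommu(const struct sketch_elf_info *e)
{
    assert(e);
    return e->e_type == SKETCH_ET_DYN;
}

/* Use the declared stack size if present, otherwise fall back to a default. */
unsigned long sketch_stack_size(const struct sketch_elf_info *e)
{
    assert(e);
    return e->declared_stack ? e->declared_stack : SKETCH_DEFAULT_STACK;
}
```

An ET_EXEC file is linked for a fixed virtual address, which a system without an MMU cannot guarantee to provide, so rejecting it up front is the only safe option.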