1. 17 Apr, 2020 40 commits
    • XArray: Fix xas_pause for large multi-index entries · d849c610
      Matthew Wilcox (Oracle) authored
      commit c36d451a upstream.
      
      Inspired by the recent Coverity report, I looked for other places where
      the offset wasn't being converted to an unsigned long before being
      shifted, and I found one in xas_pause() when the entry being paused is
      of order >32.
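
      A minimal user-space sketch of this class of bug, not the XArray code
      itself. The helper names are hypothetical, and the fixed-width types are
      assumptions: uint32_t stands in for the narrow offset and uint64_t for a
      64-bit unsigned long. Shifting by the full width of the narrow type or
      more is undefined in C, so the narrow helper only accepts smaller shifts.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy pattern: the shift is evaluated in 32 bits, so the high bits
 * are lost before the result is widened to 64 bits. */
static uint64_t shift_narrow(uint32_t offset, unsigned int shift)
{
    assert(shift < 32);
    return (uint32_t)(offset << shift);
}

/* Fixed pattern: widen first, then shift in 64 bits. */
static uint64_t shift_wide(uint32_t offset, unsigned int shift)
{
    assert(shift < 64);
    return (uint64_t)offset << shift;
}
```

      With offset 3 and shift 31, the narrow helper drops the top bit while
      the wide helper keeps it.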
      
      Fixes: b803b428 ("xarray: Add XArray iterators")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm clone metadata: Fix return type of dm_clone_nr_of_hydrated_regions() · 54aabddc
      Nikos Tsironis authored
      commit 81d5553d upstream.
      
      dm_clone_nr_of_hydrated_regions() returns the number of regions that
      have been hydrated so far. In order to do so it employs bitmap_weight().
      
      Until now, the return type of dm_clone_nr_of_hydrated_regions() was
      unsigned long.
      
      Because bitmap_weight() returns an int, in case BITS_PER_LONG == 64 and
      the return value of bitmap_weight() is 2^31 (the maximum allowed number
      of regions for a device), the result is sign extended from 32 bits to 64
      bits and an incorrect value is displayed, in the status output of
      dm-clone, as the number of hydrated regions.
      
      Fix this by having dm_clone_nr_of_hydrated_regions() return an unsigned
      int.
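
      The sign extension described above can be reproduced in user space.
      This is only a sketch: the function names are hypothetical, int32_t
      models the int returned by bitmap_weight(), and uint64_t models a
      64-bit unsigned long.

```c
#include <stdint.h>

/* Models an int-returning weight function whose true count is 2^31:
 * that count does not fit in a signed 32-bit int and appears as INT32_MIN. */
static int32_t weight_as_int(void)
{
    return INT32_MIN;
}

/* Old return type: the negative int is sign extended to 64 bits. */
static uint64_t hydrated_old(void)
{
    return (uint64_t)weight_as_int();
}

/* New return type: the value is taken as unsigned 32-bit before widening. */
static uint64_t hydrated_new(void)
{
    return (uint32_t)weight_as_int();
}
```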
      
      Fixes: 7431b783 ("dm: add clone target")
      Cc: stable@vger.kernel.org # v5.4+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm clone: Add missing casts to prevent overflows and data corruption · 6709d665
      Nikos Tsironis authored
      commit 9fc06ff5 upstream.
      
      Add missing casts when converting from regions to sectors.
      
      In case BITS_PER_LONG == 32, the lack of the appropriate casts can lead
      to overflows and miscalculation of the device sector.
      
      As a result, we could end up discarding and/or copying the wrong parts
      of the device, thus corrupting the device's data.
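
      A hedged illustration of why the cast matters, not the dm-clone code.
      The helper names are hypothetical; uint32_t models a 32-bit unsigned
      long and uint64_t models a 64-bit sector number.

```c
#include <stdint.h>

/* Without the cast the multiplication is done in 32 bits and wraps
 * before the result is widened. */
static uint64_t region_to_sector_unsafe(uint32_t region, uint32_t region_size)
{
    return (uint32_t)(region * region_size);
}

/* With the cast the multiplication is done in 64 bits. */
static uint64_t region_to_sector(uint32_t region, uint32_t region_size)
{
    return (uint64_t)region * region_size;
}
```

      For region 0x20000000 and a region size of 8 sectors, the unsafe
      helper wraps to sector 0, which is the kind of miscalculation that
      would direct a copy or discard at the wrong part of the device.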
      
      Fixes: 7431b783 ("dm: add clone target")
      Cc: stable@vger.kernel.org # v5.4+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm clone: Add overflow check for number of regions · c159c51a
      Nikos Tsironis authored
      commit cd481c12 upstream.
      
      Add overflow check for clone->nr_regions variable, which holds the
      number of regions of the target.
      
      The overflow can occur with sufficiently large devices, if BITS_PER_LONG
      == 32. E.g., if the region size is 8 sectors (4K), the overflow would
      occur for device sizes > 34359738360 sectors (~16TB).
      
      This could result in multiple device sectors wrongly mapping to the same
      region number, due to the truncation from 64 bits to 32 bits, which
      would lead to data corruption.
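
      One way such a check could look, as a user-space sketch under stated
      assumptions: compute_nr_regions() is a hypothetical helper, and
      uint32_t models the 32-bit unsigned long that holds the region count.

```c
#include <stdint.h>

/* Returns 0 and stores the region count, or -1 if the count would not
 * fit in 32 bits. */
static int compute_nr_regions(uint64_t dev_sectors, uint32_t region_size,
                              uint32_t *nr_regions)
{
    uint64_t n;

    if (region_size == 0)
        return -1;

    /* Round up in 64 bits without risking overflow in the addition. */
    n = dev_sectors / region_size + (dev_sectors % region_size != 0);

    if (n > UINT32_MAX)
        return -1;

    *nr_regions = (uint32_t)n;
    return 0;
}
```

      With 8-sector regions, 34359738360 sectors is exactly 4294967295
      regions and still fits; one sector more needs 4294967296 regions and
      is rejected.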
      
      Fixes: 7431b783 ("dm: add clone target")
      Cc: stable@vger.kernel.org # v5.4+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm clone: Fix handling of partial region discards · d5780838
      Nikos Tsironis authored
      commit 4b514290 upstream.
      
      There is a bug in the way dm-clone handles discards, which can lead to
      discarding the wrong blocks or trying to discard blocks beyond the end
      of the device.
      
      This could lead to data corruption, if the destination device indeed
      discards the underlying blocks, i.e., if the discard operation results
      in the original contents of a block to be lost.
      
      The root of the problem is the code that calculates the range of regions
      covered by a discard request and decides which regions to discard.
      
      Since dm-clone handles the device in units of regions, we don't discard
      parts of a region, only whole regions.
      
      The range is calculated as:
      
          rs = dm_sector_div_up(bio->bi_iter.bi_sector, clone->region_size);
          re = bio_end_sector(bio) >> clone->region_shift;
      
      , where 'rs' is the first region to discard and (re - rs) is the number
      of regions to discard.
      
      The bug manifests when we try to discard part of a single region, i.e.,
      when we try to discard a block with size < region_size, and the discard
      request both starts at an offset with respect to the beginning of that
      region and ends before the end of the region.
      
      The root cause is the following comparison:
      
        if (rs == re)
          // skip discard and complete original bio immediately
      
      , which doesn't take into account that 'rs' might be greater than 're'.
      
      Thus, we then issue a discard request for the wrong blocks, instead of
      skipping the discard altogether.
      
      Fix the check to also take into account the above case, so we don't end
      up discarding the wrong blocks.
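
      The range calculation and the corrected comparison can be sketched in
      user space as follows. This is an illustration, not the dm-clone
      implementation: discard_range() and struct region_range are
      hypothetical, and region_shift is assumed to be log2 of the region
      size in sectors.

```c
#include <stdint.h>

struct region_range {
    uint64_t first; /* first whole region to discard */
    uint64_t count; /* number of whole regions to discard */
};

static struct region_range discard_range(uint64_t start, uint64_t end,
                                         unsigned int region_shift)
{
    uint64_t region_size = (uint64_t)1 << region_shift;
    uint64_t rs = (start + region_size - 1) >> region_shift; /* round up */
    uint64_t re = end >> region_shift;                        /* round down */
    struct region_range r = { rs, 0 };

    /* rs can exceed re for a discard inside one region, so test >=
     * rather than ==; otherwise re - rs would wrap around. */
    if (rs >= re)
        return r;

    r.count = re - rs;
    return r;
}
```

      For 8-sector regions, a discard of sectors 2 to 6 gives rs = 1 and
      re = 0, so no whole region is covered and nothing is discarded.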
      
      Also, add some range checks to dm_clone_set_region_hydrated() and
      dm_clone_cond_set_range(), which update dm-clone's region bitmap.
      
      Note that the aforementioned bug doesn't cause invalid memory accesses,
      because dm_clone_is_range_hydrated() returns True for this case, so the
      checks are just precautionary.
      
      Fixes: 7431b783 ("dm: add clone target")
      Cc: stable@vger.kernel.org # v5.4+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm zoned: remove duplicate nr_rnd_zones increase in dmz_init_zone() · 016e73f7
      Bob Liu authored
      commit b8fdd090 upstream.
      
      zmd->nr_rnd_zones was increased twice by mistake. The other place it
      is increased in dmz_init_zone() is the only one needed:
      
                      zmd->nr_useable_zones++;
                      if (dmz_is_rnd(zone)) {
                              zmd->nr_rnd_zones++;
                                   ^^^
      Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target")
      Cc: stable@vger.kernel.org
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm verity fec: fix memory leak in verity_fec_dtr · e3dd9eb8
      commit 75fa6019 upstream.
      
      Fix the kmemleak report below, detected in verity_fec_ctr. output_pool is
      allocated for each dm-verity-fec device. But it is not freed when
      dm-table for the verity target is removed. Hence free the output
      mempool in destructor function verity_fec_dtr.
      
      unreferenced object 0xffffffffa574d000 (size 4096):
        comm "init", pid 1667, jiffies 4294894890 (age 307.168s)
        hex dump (first 32 bytes):
          8e 36 00 98 66 a8 0b 9b 00 00 00 00 00 00 00 00  .6..f...........
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<0000000060e82407>] __kmalloc+0x2b4/0x340
          [<00000000dd99488f>] mempool_kmalloc+0x18/0x20
          [<000000002560172b>] mempool_init_node+0x98/0x118
          [<000000006c3574d2>] mempool_init+0x14/0x20
          [<0000000008cb266e>] verity_fec_ctr+0x388/0x3b0
          [<000000000887261b>] verity_ctr+0x87c/0x8d0
          [<000000002b1e1c62>] dm_table_add_target+0x174/0x348
          [<000000002ad89eda>] table_load+0xe4/0x328
          [<000000001f06f5e9>] dm_ctl_ioctl+0x3b4/0x5a0
          [<00000000bee5fbb7>] do_vfs_ioctl+0x5dc/0x928
          [<00000000b475b8f5>] __arm64_sys_ioctl+0x70/0x98
          [<000000005361e2e8>] el0_svc_common+0xa0/0x158
          [<000000001374818f>] el0_svc_handler+0x6c/0x88
          [<000000003364e9f4>] el0_svc+0x8/0xc
          [<000000009d84cec9>] 0xffffffffffffffff
      
      Fixes: a739ff3f ("dm verity: add support for forward error correction")
      Depends-on: 6f1c819c ("dm: convert to bioset_init()/mempool_init()")
      Cc: stable@vger.kernel.org
      Signed-off-by: Harshini Shetty <harshini.x.shetty@sony.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm integrity: fix a crash with unusually large tag size · 31cc25c6
      Mikulas Patocka authored
      commit b93b6643 upstream.
      
      If the user specifies tag size larger than HASH_MAX_DIGESTSIZE,
      there's a crash in integrity_metadata().
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm writecache: add cond_resched to avoid CPU hangs · 323f56d3
      Mikulas Patocka authored
      commit 1edaa447 upstream.
      
      Initializing a dm-writecache device can take a long time when the
      persistent memory device is large.  Add cond_resched() to a few loops
      to avoid warnings that the CPU is stuck.
      
      Cc: stable@vger.kernel.org # v4.18+
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm, memcg: do not high throttle allocators based on wraparound · cbb99658
      Jakub Kicinski authored
      commit 9b8b1754 upstream.
      
      If a cgroup violates its memory.high constraints, we may end up unduly
      penalising it.  For example, for the following hierarchy:
      
        A:   max high, 20 usage
        A/B: 9 high, 10 usage
        A/C: max high, 10 usage
      
      We would end up doing the following calculation when computing the
      high delay for A/B:
      
        A/B: 10 - 9 = 1...
        A:   20 - PAGE_COUNTER_MAX = 21, so set max_overage to 21.
      
      This gets worse with higher disparities in usage in the parent.
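
      The wraparound can be reproduced with plain unsigned arithmetic. In
      this sketch the function names are hypothetical and an all-ones
      value stands in for "max"; that sentinel is an assumption of the
      example and need not match the kernel's PAGE_COUNTER_MAX.

```c
#include <stdint.h>

/* Unchecked subtraction: wraps around when the limit exceeds the usage. */
static uint64_t overage_unchecked(uint64_t usage, uint64_t high)
{
    return usage - high;
}

/* Guarded version: a cgroup at or below its limit has no overage. */
static uint64_t overage(uint64_t usage, uint64_t high)
{
    if (usage <= high)
        return 0;
    return usage - high;
}
```

      With usage 20 and the all-ones limit, the unchecked version yields
      21 while the guarded version yields 0, so the ancestor no longer
      outweighs the real overage of 1 in A/B.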
      
      I have no idea how this disappeared from the final version of the patch,
      but it is certainly Not Good(tm).  This wasn't obvious in testing because,
      for a simple cgroup hierarchy with only one child, the result is usually
      roughly the same.  It's only in more complex hierarchies that things go
      really awry (although still, the effects are limited to at most 2
      seconds in schedule_timeout_killable).
      
      [chris@chrisdown.name: changelog]
      Fixes: e26733e0 ("mm, memcg: throttle allocators based on ancestral memory.high")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Chris Down <chris@chrisdown.name>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org> [5.4.x]
      Link: http://lkml.kernel.org/r/20200331152424.GA1019937@chrisdown.name
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: dts: allwinner: h5: Fix PMU compatible · 1be6329e
      Maxime Ripard authored
      commit 4ae7a3c3 upstream.
      
      The commit c35a516a ("arm64: dts: allwinner: H5: Add PMU node")
      introduced support for the PMU found on the Allwinner H5. However, the
      binding only allows for a single compatible, while the patch was adding
      two.
      
      Make sure we follow the binding.
      
      Fixes: c35a516a ("arm64: dts: allwinner: H5: Add PMU node")
      Signed-off-by: Maxime Ripard <maxime@cerno.tech>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sched/core: Remove duplicate assignment in sched_tick_remote() · 42f43f29
      Scott Wood authored
      commit 82e0516c upstream.
      
      A redundant "curr = rq->curr" was added; remove it.
      
      Fixes: ebc0f83c ("timers/nohz: Update NOHZ load in remote tick")
      Signed-off-by: Scott Wood <swood@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/1580776558-12882-1-git-send-email-swood@redhat.com
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: dts: allwinner: h6: Fix PMU compatible · f366f670
      Maxime Ripard authored
      commit 4c7eeb9a upstream.
      
      The commit 7aa9b9eb ("arm64: dts: allwinner: H6: Add PMU mode")
      introduced support for the PMU found on the Allwinner H6. However, the
      binding only allows for a single compatible, while the patch was adding
      two.
      
      Make sure we follow the binding.
      
      Fixes: 7aa9b9eb ("arm64: dts: allwinner: H6: Add PMU mode")
      Signed-off-by: Maxime Ripard <maxime@cerno.tech>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: qualcomm: rmnet: Allow configuration updates to existing devices · d0aa115a
      Subash Abhinov Kasiviswanathan authored
      commit 2abb5792 upstream.
      
      This allows the changelink operation to succeed if the mux_id was
      specified as an argument. Note that the mux_id must match the
      existing mux_id of the rmnet device or should be an unused mux_id.
      
      Fixes: 1dc49e9d ("net: rmnet: do not allow to change mux id if mux id is duplicated")
      Reported-and-tested-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
      Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tools: gpio: Fix out-of-tree build regression · 6182908d
      Anssi Hannula authored
      commit 82f04bfe upstream.
      
      Commit 0161a94e ("tools: gpio: Correctly add make dependencies for
      gpio_utils") added a make rule for gpio-utils-in.o but used $(output)
      instead of the correct $(OUTPUT) for the output directory, breaking
      out-of-tree build (O=xx) with the following error:
      
        No rule to make target 'out/tools/gpio/gpio-utils-in.o', needed by 'out/tools/gpio/lsgpio-in.o'.  Stop.
      
      Fix that.
      
      Fixes: 0161a94e ("tools: gpio: Correctly add make dependencies for gpio_utils")
      Cc: <stable@vger.kernel.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
      Link: https://lore.kernel.org/r/20200325103154.32235-1-anssi.hannula@bitwise.fi
      Reviewed-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug · 4dbaa2e9
      Sreekanth Reddy authored
      commit cc41f11a upstream.
      
      Generic protection fault type kernel panic is observed when user performs
      soft (ordered) HBA unplug operation while IOs are running on drives
      connected to HBA.
      
      When user performs ordered HBA removal operation, the kernel calls PCI
      device's .remove() call back function where driver is flushing out all the
      outstanding SCSI IO commands with DID_NO_CONNECT host byte and also unmaps
      sg buffers allocated for these IO commands.
      
      However, in the ordered HBA removal case (unlike a real HBA hot removal),
      the HBA device is still alive, so the HBA hardware keeps performing DMA
      operations to those buffers in system memory, which were already unmapped
      while flushing out the outstanding SCSI IO commands. This leads to a
      kernel panic.
      
      Don't flush out the outstanding IOs from .remove() path in case of ordered
      removal since HBA will be still alive in this case and it can complete the
      outstanding IOs. Flush out the outstanding IOs only in case of 'physical
      HBA hot unplug' where there won't be any communication with the HBA.
      
      During shutdown also it is possible that HBA hardware can perform DMA
      operations on those outstanding IO buffers which are completed with
      DID_NO_CONNECT by the driver from .shutdown(). So same above fix is applied
      in shutdown path as well.
      
      It is safe to drop the outstanding commands when HBA is inaccessible such
      as when permanent PCI failure happens, when HBA is in non-operational
      state, or when someone does a real HBA hot unplug operation. Since driver
      knows that HBA is inaccessible during these cases, it is safe to drop the
      outstanding commands instead of waiting for SCSI error recovery to kick in
      and clear these outstanding commands.
      
      Link: https://lore.kernel.org/r/1585302763-23007-1-git-send-email-sreekanth.reddy@broadcom.com
      Fixes: c666d3be ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload")
      Cc: stable@vger.kernel.org #v4.14.174+
      Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • io_uring: honor original task RLIMIT_FSIZE · 04a9f660
      Jens Axboe authored
      commit 4ed734b0 upstream.
      
      With the previous fixes for number of files open checking, I added some
      debug code to see if we had other spots where we're checking rlimit()
      against the async io-wq workers. The only one I found was file size
      checking, which we should also honor.
      
      During write and fallocate prep, store the max file size and override
      that for the current task if we're in io-wq worker context.
      
      Cc: stable@vger.kernel.org # 5.1+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: mxs-dcp - fix scatterlist linearization for hash · ce599494
      Rosioru Dragos authored
      commit fa03481b upstream.
      
      The incorrect traversal of the scatterlist during the linearization phase
      led to computing the hash value of the wrong input buffer. The new
      implementation uses scatterwalk_map_and_copy() to address this issue.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 15b59e7c ("crypto: mxs - Add Freescale MXS DCP driver")
      Signed-off-by: Rosioru Dragos <dragos.rosioru@nxp.com>
      Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: rng - Fix a refcounting bug in crypto_rng_reset() · fac8cc0f
      Dan Carpenter authored
      commit eed74b3e upstream.
      
      We need to decrement this refcounter on these error paths.
      
      Fixes: f7d76e05 ("crypto: user - fix use_after_free of struct xxx_request")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • time/namespace: Add max_time_namespaces ucount · b8ba5056
      Dmitry Safonov authored
      commit eeec26d5 upstream.
      
      Michael noticed that userns limit for number of time namespaces is missing.
      
      Furthermore, the time namespace introduced UCOUNT_TIME_NAMESPACES but
      didn't add a matching member to user_table[]. That would make the array's
      initialisation an out-of-bounds write, but by luck the user_table array
      has a spare empty member (all accesses to the array are limited by
      UCOUNT_COUNTS), so it silently reuses the last free member.
      
      This fixes a user-visible regression: because of the missing
      UCOUNT_ENTRY(), max_inotify_instances has limited the maximum number of
      namespaces instead of the number of inotify instances.
      
      Fixes: 769071ac ("ns: Introduce Time Namespace")
      Reported-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Andrei Vagin <avagin@gmail.com>
      Acked-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: stable@kernel.org
      Link: https://lkml.kernel.org/r/20200406171342.128733-1-dima@arista.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • time/namespace: Fix time_for_children symlink · 2f476bbf
      Michael Kerrisk (man-pages) authored
      commit b801f1e2 upstream.
      
      Looking at the contents of the /proc/PID/ns/time_for_children symlink shows
      an anomaly:
      
      $ ls -l /proc/self/ns/* |awk '{print $9, $10, $11}'
      ...
      /proc/self/ns/pid -> pid:[4026531836]
      /proc/self/ns/pid_for_children -> pid:[4026531836]
      /proc/self/ns/time -> time:[4026531834]
      /proc/self/ns/time_for_children -> time_for_children:[4026531834]
      /proc/self/ns/user -> user:[4026531837]
      ...
      
      The reference for 'time_for_children' should be a 'time' namespace, just as
      the reference for 'pid_for_children' is a 'pid' namespace.  In other words,
      the above time_for_children link should read:
      
      /proc/self/ns/time_for_children -> time:[4026531834]
      
      Fixes: 769071ac ("ns: Introduce Time Namespace")
      Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Dmitry Safonov <dima@arista.com>
      Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
      Acked-by: Andrei Vagin <avagin@gmail.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/a2418c48-ed80-3afe-116e-6611cb799557@gmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • remoteproc: Fix NULL pointer dereference in rproc_virtio_notify · 438d3d80
      Nikita Shubin authored
      commit 791c13b7 upstream.
      
      Undefined rproc_ops .kick method in remoteproc driver will result in
      "Unable to handle kernel NULL pointer dereference" in rproc_virtio_notify,
      after firmware loading if:
      
       1) .kick method wasn't defined in driver
       2) resource_table exists in firmware and has "Virtio device entry" defined
      
      Let's refuse to register an rproc-induced virtio device if no kick method was
      defined for rproc.
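
      The idea of refusing registration can be sketched with simplified
      stand-ins. Everything prefixed demo_ is hypothetical and only loosely
      modelled on the remoteproc structures; the -EINVAL return value is an
      assumption of this example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the remoteproc structures. */
struct demo_rproc_ops {
    void (*kick)(int vqid);
};

struct demo_rproc {
    const struct demo_rproc_ops *ops;
};

/* Refuse to register a virtio device when no kick method exists, so a
 * later notify can never call through a NULL function pointer. */
static int demo_add_virtio_dev(const struct demo_rproc *rproc)
{
    if (!rproc->ops || !rproc->ops->kick)
        return -EINVAL;
    return 0;
}

/* A trivial kick implementation for drivers that do provide one. */
static void demo_kick(int vqid)
{
    (void)vqid;
}
```

      Checking once at registration time keeps the notify path free of a
      per-call NULL test.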
      
      [   13.180049][  T415] 8<--- cut here ---
      [   13.190558][  T415] Unable to handle kernel NULL pointer dereference at virtual address 00000000
      [   13.212544][  T415] pgd = (ptrval)
      [   13.217052][  T415] [00000000] *pgd=00000000
      [   13.224692][  T415] Internal error: Oops: 80000005 [#1] PREEMPT SMP ARM
      [   13.231318][  T415] Modules linked in: rpmsg_char imx_rproc virtio_rpmsg_bus rpmsg_core [last unloaded: imx_rproc]
      [   13.241687][  T415] CPU: 0 PID: 415 Comm: unload-load.sh Not tainted 5.5.2-00002-g707df13bbbdd #6
      [   13.250561][  T415] Hardware name: Freescale i.MX7 Dual (Device Tree)
      [   13.257009][  T415] PC is at 0x0
      [   13.260249][  T415] LR is at rproc_virtio_notify+0x2c/0x54
      [   13.265738][  T415] pc : [<00000000>]    lr : [<8050f6b0>]    psr: 60010113
      [   13.272702][  T415] sp : b8d47c48  ip : 00000001  fp : bc04de00
      [   13.278625][  T415] r10: bc04c000  r9 : 00000cc0  r8 : b8d46000
      [   13.284548][  T415] r7 : 00000000  r6 : b898f200  r5 : 00000000  r4 : b8a29800
      [   13.291773][  T415] r3 : 00000000  r2 : 990a3ad4  r1 : 00000000  r0 : b8a29800
      [   13.299000][  T415] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      [   13.306833][  T415] Control: 10c5387d  Table: b8b4806a  DAC: 00000051
      [   13.313278][  T415] Process unload-load.sh (pid: 415, stack limit = 0x(ptrval))
      [   13.320591][  T415] Stack: (0xb8d47c48 to 0xb8d48000)
      [   13.325651][  T415] 7c40:                   b895b680 00000001 b898f200 803c6430 b895bc80 7f00ae18
      [   13.334531][  T415] 7c60: 00000035 00000000 00000000 b9393200 80b3ed80 00004000 b9393268 bbf5a9a2
      [   13.343410][  T415] 7c80: 00000e00 00000200 00000000 7f00aff0 7f00a014 b895b680 b895b800 990a3ad4
      [   13.352290][  T415] 7ca0: 00000001 b898f210 b898f200 00000000 00000000 7f00e000 00000001 00000000
      [   13.361170][  T415] 7cc0: 00000000 803c62e0 80b2169c 802a0924 b898f210 00000000 00000000 b898f210
      [   13.370049][  T415] 7ce0: 80b9ba44 00000000 80b9ba48 00000000 7f00e000 00000008 80b2169c 80400114
      [   13.378929][  T415] 7d00: 80b2169c 8061fd64 b898f210 7f00e000 80400744 b8d46000 80b21634 80b21634
      [   13.387809][  T415] 7d20: 80b2169c 80400614 80b21634 80400718 7f00e000 00000000 b8d47d7c 80400744
      [   13.396689][  T415] 7d40: b8d46000 80b21634 80b21634 803fe338 b898f254 b80fe76c b8d32e38 990a3ad4
      [   13.405569][  T415] 7d60: fffffff3 b898f210 b8d46000 00000001 b898f254 803ffe7c 80857a90 b898f210
      [   13.414449][  T415] 7d80: 00000001 990a3ad4 b8d46000 b898f210 b898f210 80b17aec b8a29c20 803ff0a4
      [   13.423328][  T415] 7da0: b898f210 00000000 b8d46000 803fb8e0 b898f200 00000000 80b17aec b898f210
      [   13.432209][  T415] 7dc0: b8a29c20 990a3ad4 b895b900 b898f200 8050fb7c 80b17aec b898f210 b8a29c20
      [   13.441088][  T415] 7de0: b8a29800 b895b900 b8a29a04 803c5ec0 b8a29c00 b898f200 b8a29a20 00000007
      [   13.449968][  T415] 7e00: b8a29c20 8050fd78 b8a29800 00000000 b8a29a20 b8a29c04 b8a29820 b8a299d0
      [   13.458848][  T415] 7e20: b895b900 8050e5a4 b8a29800 b8a299d8 b8d46000 b8a299e0 b8a29820 b8a299d0
      [   13.467728][  T415] 7e40: b895b900 8050e008 000041ed 00000000 b8b8c440 b8a299d8 b8a299e0 b8a299d8
      [   13.476608][  T415] 7e60: b8b8c440 990a3ad4 00000000 b8a29820 b8b8c400 00000006 b8a29800 b895b880
      [   13.485487][  T415] 7e80: b8d47f78 00000000 00000000 8050f4b4 00000006 b895b890 b8b8c400 008fbea0
      [   13.494367][  T415] 7ea0: b895b880 8029f530 00000000 00000000 b8d46000 00000006 b8d46000 008fbea0
      [   13.503246][  T415] 7ec0: 8029f434 00000000 b8d46000 00000000 00000000 8021e2e4 0000000a 8061fd0c
      [   13.512125][  T415] 7ee0: 0000000a b8af0c00 0000000a b8af0c40 00000001 b8af0c40 00000000 8061f910
      [   13.521005][  T415] 7f00: 0000000a 80240af4 00000002 b8d46000 00000000 8061fd0c 00000002 80232d7c
      [   13.529884][  T415] 7f20: 00000000 b8d46000 00000000 990a3ad4 00000000 00000006 b8a62d80 008fbea0
      [   13.538764][  T415] 7f40: b8d47f78 00000000 b8d46000 00000000 00000000 802210c0 b88f2900 00000000
      [   13.547644][  T415] 7f60: b8a62d80 b8a62d80 b8d46000 00000006 008fbea0 80221320 00000000 00000000
      [   13.556524][  T415] 7f80: b8af0c00 990a3ad4 0000006c 008fbea0 76f1cda0 00000004 80101204 00000004
      [   13.565403][  T415] 7fa0: 00000000 80101000 0000006c 008fbea0 00000001 008fbea0 00000006 00000000
      [   13.574283][  T415] 7fc0: 0000006c 008fbea0 76f1cda0 00000004 00000006 00000006 00000000 00000000
      [   13.583162][  T415] 7fe0: 00000004 7ebaf7d0 76eb4c0b 76e3f206 600d0030 00000001 00000000 00000000
      [   13.592056][  T415] [<8050f6b0>] (rproc_virtio_notify) from [<803c6430>] (virtqueue_notify+0x1c/0x34)
      [   13.601298][  T415] [<803c6430>] (virtqueue_notify) from [<7f00ae18>] (rpmsg_probe+0x280/0x380 [virtio_rpmsg_bus])
      [   13.611663][  T415] [<7f00ae18>] (rpmsg_probe [virtio_rpmsg_bus]) from [<803c62e0>] (virtio_dev_probe+0x1f8/0x2c4)
      [   13.622022][  T415] [<803c62e0>] (virtio_dev_probe) from [<80400114>] (really_probe+0x200/0x450)
      [   13.630817][  T415] [<80400114>] (really_probe) from [<80400614>] (driver_probe_device+0x16c/0x1ac)
      [   13.639873][  T415] [<80400614>] (driver_probe_device) from [<803fe338>] (bus_for_each_drv+0x84/0xc8)
      [   13.649102][  T415] [<803fe338>] (bus_for_each_drv) from [<803ffe7c>] (__device_attach+0xd4/0x164)
      [   13.658069][  T415] [<803ffe7c>] (__device_attach) from [<803ff0a4>] (bus_probe_device+0x84/0x8c)
      [   13.666950][  T415] [<803ff0a4>] (bus_probe_device) from [<803fb8e0>] (device_add+0x444/0x768)
      [   13.675572][  T415] [<803fb8e0>] (device_add) from [<803c5ec0>] (register_virtio_device+0xa4/0xfc)
      [   13.684541][  T415] [<803c5ec0>] (register_virtio_device) from [<8050fd78>] (rproc_add_virtio_dev+0xcc/0x1b8)
      [   13.694466][  T415] [<8050fd78>] (rproc_add_virtio_dev) from [<8050e5a4>] (rproc_start+0x148/0x200)
      [   13.703521][  T415] [<8050e5a4>] (rproc_start) from [<8050e008>] (rproc_boot+0x384/0x5c0)
      [   13.711708][  T415] [<8050e008>] (rproc_boot) from [<8050f4b4>] (state_store+0x3c/0xc8)
      [   13.719723][  T415] [<8050f4b4>] (state_store) from [<8029f530>] (kernfs_fop_write+0xfc/0x214)
      [   13.728348][  T415] [<8029f530>] (kernfs_fop_write) from [<8021e2e4>] (__vfs_write+0x30/0x1cc)
      [   13.736971][  T415] [<8021e2e4>] (__vfs_write) from [<802210c0>] (vfs_write+0xac/0x17c)
      [   13.744985][  T415] [<802210c0>] (vfs_write) from [<80221320>] (ksys_write+0x64/0xe4)
      [   13.752825][  T415] [<80221320>] (ksys_write) from [<80101000>] (ret_fast_syscall+0x0/0x54)
      [   13.761178][  T415] Exception stack(0xb8d47fa8 to 0xb8d47ff0)
      [   13.766932][  T415] 7fa0:                   0000006c 008fbea0 00000001 008fbea0 00000006 00000000
      [   13.775811][  T415] 7fc0: 0000006c 008fbea0 76f1cda0 00000004 00000006 00000006 00000000 00000000
      [   13.784687][  T415] 7fe0: 00000004 7ebaf7d0 76eb4c0b 76e3f206
      [   13.790442][  T415] Code: bad PC value
      [   13.839214][  T415] ---[ end trace 1fe21ecfc9f28852 ]---
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Signed-off-by: Nikita Shubin <NShubin@topcon.com>
      Fixes: 7a186941 ("remoteproc: remove the single rpmsg vdev limitation")
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20200306072452.24743-1-NShubin@topcon.com
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      438d3d80
    • remoteproc: qcom_q6v5_mss: Reload the mba region on coredump · 174e9da4
      Sibi Sankar authored
      commit d96f2571 upstream.
      
      On secure devices after a wdog/fatal interrupt, the mba region has to be
      refreshed in order to prevent the following errors during mba load.
      
      Err Logs:
      remoteproc remoteproc2: stopped remote processor 4080000.remoteproc
      qcom-q6v5-mss 4080000.remoteproc: PBL returned unexpected status -284031232
      qcom-q6v5-mss 4080000.remoteproc: PBL returned unexpected status -284031232
      ....
      qcom-q6v5-mss 4080000.remoteproc: PBL returned unexpected status -284031232
      qcom-q6v5-mss 4080000.remoteproc: MBA booted, loading mpss
      
      Fixes: 7dd8ade2 ("remoteproc: qcom: q6v5-mss: Add custom dump function for modem")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
      Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Link: https://lore.kernel.org/r/20200304194729.27979-4-sibis@codeaurora.org
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      174e9da4
    • remoteproc: qcom_q6v5_mss: Don't reassign mpss region on shutdown · 34883e39
      Bjorn Andersson authored
      commit 900fc60d upstream.
      
      Trying to reclaim mpss memory while the mba is not running causes the
      system to crash on devices with security fuses blown, so leave it
      assigned to the remote on shutdown and recover it on a subsequent boot.
      
      Fixes: 6c5a9dc2 ("remoteproc: qcom: Make secure world call for mem ownership switch")
      Cc: stable@vger.kernel.org
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
      Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Link: https://lore.kernel.org/r/20200304194729.27979-2-sibis@codeaurora.org
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      34883e39
    • btrfs: use nofs allocations for running delayed items · 98cbca53
      Josef Bacik authored
      commit 351cbf6e upstream.
      
      Zygo reported the following lockdep splat while testing the balance
      patches:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      5.6.0-c6f0579d496a+ #53 Not tainted
      ------------------------------------------------------
      kswapd0/1133 is trying to acquire lock:
      ffff888092f622c0 (&delayed_node->mutex){+.+.}, at: __btrfs_release_delayed_node+0x7c/0x5b0
      
      but task is already holding lock:
      ffffffff8fc5f860 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (fs_reclaim){+.+.}:
             fs_reclaim_acquire.part.91+0x29/0x30
             fs_reclaim_acquire+0x19/0x20
             kmem_cache_alloc_trace+0x32/0x740
             add_block_entry+0x45/0x260
             btrfs_ref_tree_mod+0x6e2/0x8b0
             btrfs_alloc_tree_block+0x789/0x880
             alloc_tree_block_no_bg_flush+0xc6/0xf0
             __btrfs_cow_block+0x270/0x940
             btrfs_cow_block+0x1ba/0x3a0
             btrfs_search_slot+0x999/0x1030
             btrfs_insert_empty_items+0x81/0xe0
             btrfs_insert_delayed_items+0x128/0x7d0
             __btrfs_run_delayed_items+0xf4/0x2a0
             btrfs_run_delayed_items+0x13/0x20
             btrfs_commit_transaction+0x5cc/0x1390
             insert_balance_item.isra.39+0x6b2/0x6e0
             btrfs_balance+0x72d/0x18d0
             btrfs_ioctl_balance+0x3de/0x4c0
             btrfs_ioctl+0x30ab/0x44a0
             ksys_ioctl+0xa1/0xe0
             __x64_sys_ioctl+0x43/0x50
             do_syscall_64+0x77/0x2c0
             entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      -> #0 (&delayed_node->mutex){+.+.}:
             __lock_acquire+0x197e/0x2550
             lock_acquire+0x103/0x220
             __mutex_lock+0x13d/0xce0
             mutex_lock_nested+0x1b/0x20
             __btrfs_release_delayed_node+0x7c/0x5b0
             btrfs_remove_delayed_node+0x49/0x50
             btrfs_evict_inode+0x6fc/0x900
             evict+0x19a/0x2c0
             dispose_list+0xa0/0xe0
             prune_icache_sb+0xbd/0xf0
             super_cache_scan+0x1b5/0x250
             do_shrink_slab+0x1f6/0x530
             shrink_slab+0x32e/0x410
             shrink_node+0x2a5/0xba0
             balance_pgdat+0x4bd/0x8a0
             kswapd+0x35a/0x800
             kthread+0x1e9/0x210
             ret_from_fork+0x3a/0x50
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(fs_reclaim);
                                     lock(&delayed_node->mutex);
                                     lock(fs_reclaim);
        lock(&delayed_node->mutex);
      
       *** DEADLOCK ***
      
      3 locks held by kswapd0/1133:
       #0: ffffffff8fc5f860 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
       #1: ffffffff8fc380d8 (shrinker_rwsem){++++}, at: shrink_slab+0x1e8/0x410
       #2: ffff8881e0e6c0e8 (&type->s_umount_key#42){++++}, at: trylock_super+0x1b/0x70
      
      stack backtrace:
      CPU: 2 PID: 1133 Comm: kswapd0 Not tainted 5.6.0-c6f0579d496a+ #53
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
      Call Trace:
       dump_stack+0xc1/0x11a
       print_circular_bug.isra.38.cold.57+0x145/0x14a
       check_noncircular+0x2a9/0x2f0
       ? print_circular_bug.isra.38+0x130/0x130
       ? stack_trace_consume_entry+0x90/0x90
       ? save_trace+0x3cc/0x420
       __lock_acquire+0x197e/0x2550
       ? btrfs_inode_clear_file_extent_range+0x9b/0xb0
       ? register_lock_class+0x960/0x960
       lock_acquire+0x103/0x220
       ? __btrfs_release_delayed_node+0x7c/0x5b0
       __mutex_lock+0x13d/0xce0
       ? __btrfs_release_delayed_node+0x7c/0x5b0
       ? __asan_loadN+0xf/0x20
       ? pvclock_clocksource_read+0xeb/0x190
       ? __btrfs_release_delayed_node+0x7c/0x5b0
       ? mutex_lock_io_nested+0xc20/0xc20
       ? __kasan_check_read+0x11/0x20
       ? check_chain_key+0x1e6/0x2e0
       mutex_lock_nested+0x1b/0x20
       ? mutex_lock_nested+0x1b/0x20
       __btrfs_release_delayed_node+0x7c/0x5b0
       btrfs_remove_delayed_node+0x49/0x50
       btrfs_evict_inode+0x6fc/0x900
       ? btrfs_setattr+0x840/0x840
       ? do_raw_spin_unlock+0xa8/0x140
       evict+0x19a/0x2c0
       dispose_list+0xa0/0xe0
       prune_icache_sb+0xbd/0xf0
       ? invalidate_inodes+0x310/0x310
       super_cache_scan+0x1b5/0x250
       do_shrink_slab+0x1f6/0x530
       shrink_slab+0x32e/0x410
       ? do_shrink_slab+0x530/0x530
       ? do_shrink_slab+0x530/0x530
       ? __kasan_check_read+0x11/0x20
       ? mem_cgroup_protected+0x13d/0x260
       shrink_node+0x2a5/0xba0
       balance_pgdat+0x4bd/0x8a0
       ? mem_cgroup_shrink_node+0x490/0x490
       ? _raw_spin_unlock_irq+0x27/0x40
       ? finish_task_switch+0xce/0x390
       ? rcu_read_lock_bh_held+0xb0/0xb0
       kswapd+0x35a/0x800
       ? _raw_spin_unlock_irqrestore+0x4c/0x60
       ? balance_pgdat+0x8a0/0x8a0
       ? finish_wait+0x110/0x110
       ? __kasan_check_read+0x11/0x20
       ? __kthread_parkme+0xc6/0xe0
       ? balance_pgdat+0x8a0/0x8a0
       kthread+0x1e9/0x210
       ? kthread_create_worker_on_cpu+0xc0/0xc0
       ret_from_fork+0x3a/0x50
      
      This is because we hold the delayed node's mutex while doing tree
      operations, which can allocate memory and enter filesystem reclaim.
      Fix this by wrapping the searches in a NOFS allocation scope.
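
      The splat above comes down to two lock orderings that together form a
      cycle. The following is a toy model in Python, not the kernel's lockdep
      implementation; LockGraph, record(), has_cycle() and build() are
      hypothetical names used only for illustration. It records "held, then
      acquired" edges and shows that the cycle disappears once allocations
      made under the mutex no longer enter filesystem reclaim:

```python
class LockGraph:
    """Toy lock-order graph: an edge a -> b means b was acquired while a was held."""

    def __init__(self):
        self.edges = {}

    def record(self, held, acquired):
        self.edges.setdefault(held, set()).add(acquired)

    def has_cycle(self):
        visiting, done = set(), set()

        def visit(node):
            if node in done:
                return False
            if node in visiting:
                return True  # reached a node that is still on the DFS stack
            visiting.add(node)
            found = any(visit(nxt) for nxt in self.edges.get(node, ()))
            visiting.discard(node)
            done.add(node)
            return found

        return any(visit(node) for node in list(self.edges))


def build(nofs_scope):
    graph = LockGraph()
    # kswapd: holds fs_reclaim, then inode eviction takes the delayed node mutex.
    graph.record("fs_reclaim", "delayed_node->mutex")
    if not nofs_scope:
        # Running delayed items: holds the mutex, then an allocation made
        # during the tree search may enter filesystem reclaim.
        graph.record("delayed_node->mutex", "fs_reclaim")
    return graph
```

      With these definitions, build(nofs_scope=False).has_cycle() reports the
      circular dependency and build(nofs_scope=True).has_cycle() does not.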
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      98cbca53
    • btrfs: fix missing semaphore unlock in btrfs_sync_file · d91eefd1
      Robbie Ko authored
      commit 6ff06729 upstream.
      
      Ordered ops are started twice in sync file, once outside of inode mutex
      and once inside, taking the dio semaphore. There was one error path
      missing the semaphore unlock.
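
      The class of bug described here, an early return that skips the
      unlock, can be sketched as follows. This is an illustrative Python
      model only: threading.Lock stands in for the dio semaphore, and
      sync_file_buggy(), sync_file_fixed() and start_ordered_range are
      hypothetical names, not the btrfs code.

```python
import threading


def sync_file_buggy(dio_sem, start_ordered_range):
    dio_sem.acquire()
    ret = start_ordered_range()
    if ret:
        return ret  # bug: error path returns with the lock still held
    dio_sem.release()
    return 0


def sync_file_fixed(dio_sem, start_ordered_range):
    dio_sem.acquire()
    try:
        # Any return from inside this block, including the error return,
        # goes through the finally clause below.
        return start_ordered_range()
    finally:
        dio_sem.release()


def failing_start():
    return -5  # stand-in for an -EIO style error code
```

      After sync_file_buggy() returns an error the lock remains held, so the
      next caller would block forever; sync_file_fixed() releases it on every
      path.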
      
      Fixes: aab15e8e ("Btrfs: fix rare chances for data loss when doing a fast fsync")
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: Robbie Ko <robbieko@synology.com>
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      [ add changelog ]
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d91eefd1
    • btrfs: unset reloc control if we fail to recover · 9b8fda76
      Josef Bacik authored
      commit fb2d83ee upstream.
      
      If we fail to load an fs root, or fail to start a transaction we can
      bail without unsetting the reloc control, which leads to problems later
      when we free the reloc control but still have it attached to the file
      system.
      
      In the normal path we'll end up calling unset_reloc_control() twice, but
      all it does is set fs_info->reloc_control = NULL, and we can only have
      one balance at a time so it's not racy.
      
      CC: stable@vger.kernel.org # 5.4+
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b8fda76
    • btrfs: fix missing file extent item for hole after ranged fsync · 49656752
      Filipe Manana authored
      commit 95418ed1 upstream.
      
      When doing a fast fsync for a range that starts at an offset greater than
      zero, we can end up with a log that, when replayed, causes the respective
      inode to miss a file extent item representing a hole if we are not using the
      NO_HOLES feature. This is because for fast fsyncs we don't log any extents
      that cover a range different from the one requested in the fsync.
      
      Example scenario to trigger it:
      
        $ mkfs.btrfs -O ^no-holes -f /dev/sdd
        $ mount /dev/sdd /mnt
      
        # Create a file with a single 256K extent and fsync it to clear the full
        # sync bit in the inode - we want the msync below to trigger a fast fsync.
        $ xfs_io -f -c "pwrite -S 0xab 0 256K" -c "fsync" /mnt/foo
      
        # Force a transaction commit and wipe out the log tree.
        $ sync
      
        # Dirty 768K of data, increasing the file size to 1Mb, and flush only
        # the range from 256K to 512K without updating the log tree
        # (sync_file_range() does not trigger fsync, it only starts writeback
        # and waits for it to finish).
      
        $ xfs_io -c "pwrite -S 0xcd 256K 768K" /mnt/foo
        $ xfs_io -c "sync_range -abw 256K 256K" /mnt/foo
      
        # Now dirty the range from 768K to 1M again and sync that range.
        $ xfs_io -c "mmap -w 768K 256K"        \
                 -c "mwrite -S 0xef 768K 256K" \
                 -c "msync -s 768K 256K"       \
                 -c "munmap"                   \
                 /mnt/foo
      
        <power fail>
      
        # Mount to replay the log.
        $ mount /dev/sdd /mnt
        $ umount /mnt
      
        $ btrfs check /dev/sdd
        Opening filesystem to check...
        Checking filesystem on /dev/sdd
        UUID: 482fb574-b288-478e-a190-a9c44a78fca6
        [1/7] checking root items
        [2/7] checking extents
        [3/7] checking free space cache
        [4/7] checking fs roots
        root 5 inode 257 errors 100, file extent discount
        Found file extent holes:
             start: 262144, len: 524288
        ERROR: errors found in fs roots
        found 720896 bytes used, error(s) found
        total csum bytes: 512
        total tree bytes: 131072
        total fs tree bytes: 32768
        total extent tree bytes: 16384
        btree space waste bytes: 123514
        file data blocks allocated: 589824
          referenced 589824
      
      Fix this issue by setting the range to full (0 to LLONG_MAX) when the
      NO_HOLES feature is not enabled. This results in extra work being done
      but it gives the guarantee we don't end up with missing holes after
      replaying the log.
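
      The range selection described above can be sketched as below. This is
      an illustrative Python model, not the btrfs code; fsync_log_range() and
      its parameters are hypothetical names. The last lines check the
      arithmetic of the hole reported by btrfs check in the example: it
      starts at 256K and runs up to the 768K offset where the msync'ed range
      begins.

```python
LLONG_MAX = 2**63 - 1
KIB = 1024


def fsync_log_range(start, end, no_holes):
    """Return the byte range to log for an fsync of [start, end]."""
    if no_holes:
        # With NO_HOLES, holes need no file extent items, so the
        # requested range is enough.
        return start, end
    # Without NO_HOLES, log the full range so that replay cannot leave
    # a gap without a file extent item.
    return 0, LLONG_MAX


# Hole reported in the example output: start 262144, len 524288.
hole_start, hole_len = 262144, 524288
hole_end = hole_start + hole_len
```

      For the msync in the example, fsync_log_range(768 * KIB, 1024 * KIB - 1,
      no_holes=False) widens the logged range to (0, LLONG_MAX), at the cost
      of the extra work the commit message mentions.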
      
      CC: stable@vger.kernel.org # 4.19+
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      49656752
    • btrfs: drop block from cache on error in relocation · 69e45524
      Josef Bacik authored
      commit 8e19c973 upstream.
      
      If we have an error while building the backref tree in relocation we'll
      process all the pending edges and then free the node.  However if we
      integrated some edges into the cache we'll lose our link to those edges
      by simply freeing this node, which means we'll leak memory and
      references to any roots that we've found.
      
      Instead we need to use remove_backref_node(), which walks through all of
      the edges that are still linked to this node, frees them, and drops any
      root references we may be holding.
      
      CC: stable@vger.kernel.org # 4.9+
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      69e45524
    • btrfs: set update the uuid generation as soon as possible · 706cd9e4
      Josef Bacik authored
      commit 75ec1db8 upstream.
      
      In my EIO stress testing I noticed I was getting forced to rescan the
      uuid tree pretty often, which was weird.  This is because my error
      injection stuff would sometimes inject an error after log replay but
      before we loaded the UUID tree.  If log replay committed the transaction
      it wouldn't have updated the uuid tree generation, but the tree was
      valid and didn't change, so there's no reason to not update the
      generation here.
      
      Fix this by setting the BTRFS_FS_UPDATE_UUID_TREE_GEN bit immediately
      after reading all the fs roots if the uuid tree generation matches the
      fs generation.  Then any transaction commits that happen during mount
      won't screw up our uuid tree state, forcing us to do needless uuid
      rescans.
      
      Fixes: 70f80175 ("Btrfs: check UUID tree during mount if required")
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      706cd9e4
    • btrfs: reloc: clean dirty subvols if we fail to start a transaction · b4882169
      Josef Bacik authored
      commit 6217b0fa upstream.
      
      If we do merge_reloc_roots() we could insert a few roots onto the dirty
      subvol roots list, where we hold a ref on them.  If we fail to start the
      transaction we need to run clean_dirty_subvols() in order to clean up the
      refs.
      
      CC: stable@vger.kernel.org # 5.4+
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b4882169
    • Btrfs: fix crash during unmount due to race with delayed inode workers · 3dd4bb59
      Filipe Manana authored
      commit f0cc2cd7 upstream.
      
      During unmount we can have a job from the delayed inode items work queue
      still running, which can lead to at least two bad things:
      
      1) A crash, because the worker can try to create a transaction just
         after the fs roots were freed;
      
      2) A transaction leak, because the worker can create a transaction
         before the fs roots are freed and just after we committed the last
         transaction and after we stopped the transaction kthread.
      
      A stack trace example of the crash:
      
       [79011.691214] kernel BUG at lib/radix-tree.c:982!
       [79011.692056] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
       [79011.693180] CPU: 3 PID: 1394 Comm: kworker/u8:2 Tainted: G        W         5.6.0-rc2-btrfs-next-54 #2
       (...)
       [79011.696789] Workqueue: btrfs-delayed-meta btrfs_work_helper [btrfs]
       [79011.697904] RIP: 0010:radix_tree_tag_set+0xe7/0x170
       (...)
       [79011.702014] RSP: 0018:ffffb3c84a317ca0 EFLAGS: 00010293
       [79011.702949] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
       [79011.704202] RDX: ffffb3c84a317cb0 RSI: ffffb3c84a317ca8 RDI: ffff8db3931340a0
       [79011.705463] RBP: 0000000000000005 R08: 0000000000000005 R09: ffffffff974629d0
       [79011.706756] R10: ffffb3c84a317bc0 R11: 0000000000000001 R12: ffff8db393134000
       [79011.708010] R13: ffff8db3931340a0 R14: ffff8db393134068 R15: 0000000000000001
       [79011.709270] FS:  0000000000000000(0000) GS:ffff8db3b6a00000(0000) knlGS:0000000000000000
       [79011.710699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [79011.711710] CR2: 00007f22c2a0a000 CR3: 0000000232ad4005 CR4: 00000000003606e0
       [79011.712958] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       [79011.714205] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       [79011.715448] Call Trace:
       [79011.715925]  record_root_in_trans+0x72/0xf0 [btrfs]
       [79011.716819]  btrfs_record_root_in_trans+0x4b/0x70 [btrfs]
       [79011.717925]  start_transaction+0xdd/0x5c0 [btrfs]
       [79011.718829]  btrfs_async_run_delayed_root+0x17e/0x2b0 [btrfs]
       [79011.719915]  btrfs_work_helper+0xaa/0x720 [btrfs]
       [79011.720773]  process_one_work+0x26d/0x6a0
       [79011.721497]  worker_thread+0x4f/0x3e0
       [79011.722153]  ? process_one_work+0x6a0/0x6a0
       [79011.722901]  kthread+0x103/0x140
       [79011.723481]  ? kthread_create_worker_on_cpu+0x70/0x70
       [79011.724379]  ret_from_fork+0x3a/0x50
       (...)
      
      The following diagram shows a sequence of steps that lead to the crash
      during unmount of the filesystem:
      
              CPU 1                                             CPU 2                                CPU 3
      
       btrfs_punch_hole()
         btrfs_btree_balance_dirty()
           btrfs_balance_delayed_items()
             --> sees
                 fs_info->delayed_root->items
                 with value 200, which is greater
                 than
                 BTRFS_DELAYED_BACKGROUND (128)
                 and smaller than
                 BTRFS_DELAYED_WRITEBACK (512)
             btrfs_wq_run_delayed_node()
               --> queues a job for
                   fs_info->delayed_workers to run
                   btrfs_async_run_delayed_root()
      
                                                                                                  btrfs_async_run_delayed_root()
                                                                                                    --> job queued by CPU 1
      
                                                                                                    --> starts picking and running
                                                                                                        delayed nodes from the
                                                                                                        prepare_list list
      
                                                       close_ctree()
      
                                                         btrfs_delete_unused_bgs()
      
                                                         btrfs_commit_super()
      
                                                           btrfs_join_transaction()
                                                             --> gets transaction N
      
                                                           btrfs_commit_transaction(N)
                                                             --> set transaction state
                                                               to TRANS_STATE_COMMIT_START
      
                                                                                                   btrfs_first_prepared_delayed_node()
                                                                                                     --> picks delayed node X through
                                                                                                          the prepare_list list
      
                                                             btrfs_run_delayed_items()
      
                                                               btrfs_first_delayed_node()
                                                                 --> also picks delayed node X
                                                                     but through the node_list
                                                                     list
      
                                                               __btrfs_commit_inode_delayed_items()
                                                                  --> runs all delayed items from
                                                                      this node and drops the
                                                                      node's item count to 0
                                                                      through call to
                                                                      btrfs_release_delayed_inode()
      
                                                               --> finishes running any remaining
                                                                   delayed nodes
      
                                                             --> finishes transaction commit
      
                                                         --> stops cleaner and transaction threads
      
                                                         btrfs_free_fs_roots()
                                                           --> frees all roots and removes them
                                                               from the radix tree
                                                               fs_info->fs_roots_radix
      
                                                                                                   btrfs_join_transaction()
                                                                                                     start_transaction()
                                                                                                       btrfs_record_root_in_trans()
                                                                                                         record_root_in_trans()
                                                                                                           radix_tree_tag_set()
                                                                                                             --> crashes because
                                                                                                                 the root is not in
                                                                                                                 the radix tree
                                                                                                                 anymore
      
      If the worker is able to call btrfs_join_transaction() before the unmount
      task frees the fs roots, we end up leaking a transaction and all its
      resources, since after the call to btrfs_commit_super() and stopping the
      transaction kthread, we don't expect to have any transaction open anymore.
      
      When this situation happens the worker has a delayed node that has no
      more items to run, since the task calling btrfs_run_delayed_items(),
      which is doing a transaction commit, picks the same node and runs all
      its items first.
      
      We can not wait for the worker to complete when running delayed items
      through btrfs_run_delayed_items(), because we call that function in
      several phases of a transaction commit, and that could cause a deadlock
      because the worker calls btrfs_join_transaction() and the task doing the
      transaction commit may have already set the transaction state to
      TRANS_STATE_COMMIT_DOING.
      
      Also it's not possible to get into a situation where only some of the
      items of a delayed node are added to the fs/subvolume tree in the current
      transaction and the remaining ones in the next transaction, because when
      running the items of a delayed inode we lock its mutex, effectively
      waiting for the worker if the worker is running the items of the delayed
      node already.
      
      Since this can only cause issues when unmounting a filesystem, fix it in
      a simple way by waiting for any jobs on the delayed workers queue before
      calling btrfs_commit_super() at close_ctree(). This works because at this
      point no one can call btrfs_btree_balance_dirty() or
      btrfs_balance_delayed_items(), and if we end up waiting for any worker to
      complete, btrfs_commit_super() will commit the transaction created by the
      worker.
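
      The ordering of the fix, draining the worker queue before tearing
      down the state that workers depend on, can be sketched with a small
      Python model. This is an illustration only, not the btrfs code:
      run_unmount() is a hypothetical name, the queue stands in for
      fs_info->delayed_workers, and the dict stands in for the radix tree of
      fs roots.

```python
import queue
import threading
import time


def run_unmount(num_jobs=8):
    roots = {"fs_root": object()}  # stand-in for fs_info->fs_roots_radix
    jobs = queue.Queue()           # stand-in for fs_info->delayed_workers
    errors = []

    def worker():
        while True:
            job = jobs.get()
            try:
                if job is None:
                    return  # shutdown marker
                time.sleep(0.001)  # pretend to run delayed items
                if "fs_root" not in roots:
                    # In the kernel this would be the crash in
                    # radix_tree_tag_set().
                    errors.append(job)
            finally:
                jobs.task_done()

    thread = threading.Thread(target=worker)
    thread.start()
    for i in range(num_jobs):
        jobs.put(i)

    jobs.join()    # the fix: wait for queued and running jobs first
    roots.clear()  # only then free the roots
    jobs.put(None)
    thread.join()
    return errors
```

      Because jobs.join() returns only after every queued job has finished,
      no job can observe the roots after they are freed, and run_unmount()
      returns an empty error list.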
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3dd4bb59
    • btrfs: fix btrfs_calc_reclaim_metadata_size calculation · b8c026e1
      Josef Bacik authored
      commit fa121a26 upstream.
      
      I noticed while running my snapshot torture test that we were getting a
      lot of metadata chunks allocated with very little actually used.
      Digging into this we would commit the transaction, still not have enough
      space, and then force a chunk allocation.
      
      I noticed that we were barely flushing any delalloc at all, despite the
      fact that we had around 13GiB of outstanding delalloc reservations.  It
      turns out this is because of our btrfs_calc_reclaim_metadata_size()
      calculation.  It _only_ takes into account the outstanding ticket sizes,
      which isn't the whole story.  In this particular workload we're slowly
      filling up the disk, which means our overcommit space will suddenly
      become a lot less, and our outstanding reservations will be well more
      than what we can handle.  However we are only flushing based on our
      ticket size, which is much less than we need to actually reclaim.
      
      So fix btrfs_calc_reclaim_metadata_size() to take into account the
      overage in the case that we've gotten less available space suddenly.
      This makes it so we attempt to reclaim a lot more delalloc space, which
      allows us to make our reservations and we no longer are allocating a
      bunch of needless metadata chunks.
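
      The idea can be sketched as below. This is a simplified Python model
      under stated assumptions, not the kernel function: reclaim_size() and
      all of its parameters are hypothetical names, and the overage is
      assumed to be the amount by which used space exceeds the total space
      plus whatever can still be overcommitted.

```python
def reclaim_size(ticket_bytes, used, total_bytes, overcommit_avail):
    """Bytes to try to reclaim: pending tickets plus any overage."""
    # Old behaviour: only the outstanding ticket sizes were counted.
    to_reclaim = sum(ticket_bytes)

    # Assumed model of the fix: if available space shrank so that we are
    # now over the limit, add the overage so flushing targets it as well.
    limit = total_bytes + overcommit_avail
    if used > limit:
        to_reclaim += used - limit
    return to_reclaim
```

      In this model, small tickets no longer hide a large overage: with
      tickets of 1 and 2 units, 100 units used and a limit of 90, the target
      becomes 13 units rather than 3.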
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b8c026e1
    • btrfs: Don't submit any btree write bio if the fs has errors · cc3c1509
      Qu Wenruo authored
      commit b3ff8f1d upstream.
      
      [BUG]
      There is a fuzzed image which could cause a KASAN report at unmount time.
      
        BUG: KASAN: use-after-free in btrfs_queue_work+0x2c1/0x390
        Read of size 8 at addr ffff888067cf6848 by task umount/1922
      
        CPU: 0 PID: 1922 Comm: umount Tainted: G        W         5.0.21 #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
        Call Trace:
         dump_stack+0x5b/0x8b
         print_address_description+0x70/0x280
         kasan_report+0x13a/0x19b
         btrfs_queue_work+0x2c1/0x390
         btrfs_wq_submit_bio+0x1cd/0x240
         btree_submit_bio_hook+0x18c/0x2a0
         submit_one_bio+0x1be/0x320
         flush_write_bio.isra.41+0x2c/0x70
         btree_write_cache_pages+0x3bb/0x7f0
         do_writepages+0x5c/0x130
         __writeback_single_inode+0xa3/0x9a0
         writeback_single_inode+0x23d/0x390
         write_inode_now+0x1b5/0x280
         iput+0x2ef/0x600
         close_ctree+0x341/0x750
         generic_shutdown_super+0x126/0x370
         kill_anon_super+0x31/0x50
         btrfs_kill_super+0x36/0x2b0
         deactivate_locked_super+0x80/0xc0
         deactivate_super+0x13c/0x150
         cleanup_mnt+0x9a/0x130
         task_work_run+0x11a/0x1b0
         exit_to_usermode_loop+0x107/0x130
         do_syscall_64+0x1e5/0x280
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      [CAUSE]
      The fuzzed image has a completely screwed up extent tree:
      
        leaf 29421568 gen 8 total ptrs 6 free space 3587 owner EXTENT_TREE
        refs 2 lock (w:0 r:0 bw:0 br:0 sw:0 sr:0) lock_owner 0 current 5938
                item 0 key (12587008 168 4096) itemoff 3942 itemsize 53
                        extent refs 1 gen 9 flags 1
                        ref#0: extent data backref root 5 objectid 259 offset 0 count 1
                item 1 key (12591104 168 8192) itemoff 3889 itemsize 53
                        extent refs 1 gen 9 flags 1
                        ref#0: extent data backref root 5 objectid 271 offset 0 count 1
                item 2 key (12599296 168 4096) itemoff 3836 itemsize 53
                        extent refs 1 gen 9 flags 1
                        ref#0: extent data backref root 5 objectid 259 offset 4096 count 1
                item 3 key (29360128 169 0) itemoff 3803 itemsize 33
                        extent refs 1 gen 9 flags 2
                        ref#0: tree block backref root 5
                item 4 key (29368320 169 1) itemoff 3770 itemsize 33
                        extent refs 1 gen 9 flags 2
                        ref#0: tree block backref root 5
                item 5 key (29372416 169 0) itemoff 3737 itemsize 33
                        extent refs 1 gen 9 flags 2
                        ref#0: tree block backref root 5
      
      Note that leaf 29421568 doesn't have its backref in the extent tree.
      Thus extent allocator can re-allocate leaf 29421568 for other trees.
      
      In short, the bug is caused by:
      
      - Existing tree block gets allocated to log tree
        This got its generation bumped.
      
      - Log tree balance cleaned dirty bit of offending tree block
        It will not be written back to disk, thus no WRITTEN flag.
      
      - Original owner of the tree block gets COWed
        Since the tree block has higher transid, no WRITTEN flag, it's reused,
        and not traced by transaction::dirty_pages.
      
      - Transaction aborted
        Tree blocks get cleaned according to transaction::dirty_pages. But the
        offending tree block is not recorded at all.
      
      - Filesystem unmount
        All pages are assumed to be clean, so all workqueues are destroyed
        and iput(btree_inode) is called.
        But the offending tree block is still dirty, which triggers writeback
        and causes a use-after-free bug.
      
      The detailed sequence looks like this:
      
      - Initial status
        eb: 29421568, header=WRITTEN bflags_dirty=0, page_dirty=0, gen=8,
            not traced by any dirty extent_io_tree.
      
      - New tree block is allocated
        Since there is no backref for 29421568, it's re-allocated as new tree
        block.
        Keep in mind that tree block 29421568 is still referenced by the
        extent tree.
      
      - Tree block 29421568 is filled for log tree
        eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9 << (gen bumped)
            traced by btrfs_root::dirty_log_pages
      
      - Some log tree operations
        Since the fs is using node size 4096, the log tree can easily go a
        level higher.
      
      - Log tree needs balance
        Tree block 29421568 gets all its content pushed to right, thus now
        it is empty, and we don't need it.
        btrfs_clean_tree_block() from __push_leaf_right() gets called.
      
        eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9
            traced by btrfs_root::dirty_log_pages
      
      - Log tree write back
        btree_write_cache_pages() goes through the dirty page ranges, but since
        the page of tree block 29421568 has already been cleaned, it's not
        written back to disk. Thus it doesn't have the WRITTEN bit set.
        But the ranges in dirty_log_pages are cleared.
      
        eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9
            not traced by any dirty extent_io_tree.
      
      - Extent tree update when committing transaction
        Since tree block 29421568 has transid equal to running trans, and has
        no WRITTEN bit, should_cow_block() will use it directly without adding
        it to btrfs_transaction::dirty_pages.
      
        eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9
            not traced by any dirty extent_io_tree.
      
        At this stage, we're doomed. We have a dirty eb not tracked by any
        extent io tree.
      
      - Transaction gets aborted due to corrupted extent tree
        Btrfs cleans up dirty pages according to transaction::dirty_pages and
        btrfs_root::dirty_log_pages.
        But since tree block 29421568 is tracked by neither of them, it's
        still dirty.
      
        eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9
            not traced by any dirty extent_io_tree.
      
      - Filesystem unmount
        Since all cleanup is assumed to be done, all workqueues are destroyed.
        Then iput(btree_inode) is called, expecting no dirty pages.
        But tree block 29421568 is still dirty, thus triggering writeback.
        Since all workqueues are already freed, we cause a use-after-free.
      
      This shows that log tree blocks combined with a bad extent tree can
      cause wild dirty pages.
      
      [FIX]
      To fix the problem, don't submit any btree write bio if the filesystem
      has any error.  This is the last safety net, just in case other cleanup
      hasn't caught it.
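
      The guard can be pictured with a small user-space model. It is a sketch
      under assumptions, not the btrfs code: struct fs_sketch, its fields and
      submit_btree_write_sketch() are hypothetical names, and the "use after
      free" outcome is only reported as an enum value here.

```c
#include <stdbool.h>

/* Hypothetical model of the state that matters at unmount time. */
struct fs_sketch {
	bool has_error;		/* the filesystem hit an error or aborted a transaction */
	bool workers_alive;	/* false once the workqueues have been destroyed */
	int submitted;		/* number of write bios handed to the workers */
};

enum submit_result {
	SUBMIT_OK,
	SUBMIT_SKIPPED,		/* write dropped because the fs has an error */
	SUBMIT_USE_AFTER_FREE	/* write would touch freed workqueues */
};

/* Last safety net: refuse to submit btree writes once the fs has an error. */
enum submit_result submit_btree_write_sketch(struct fs_sketch *fs)
{
	if (fs->has_error)
		return SUBMIT_SKIPPED;

	if (!fs->workers_alive)
		return SUBMIT_USE_AFTER_FREE;

	fs->submitted++;
	return SUBMIT_OK;
}
```

      In the scenario above the transaction was aborted before unmount, so
      the error check fires before the destroyed workqueues are ever touched.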
      
      Link: https://github.com/bobfuzzer/CVE/tree/master/CVE-2019-19377
      
      
      CC: stable@vger.kernel.org # 5.4+
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc3c1509
    • Tvrtko Ursulin's avatar
      drm/i915/gen12: Disable preemption timeout · 36e25289
      Tvrtko Ursulin authored
      commit 07bcfd12 upstream.
      
      Allow super long OpenCL workloads which cannot be preempted within
      the default timeout to run out of the box.
      
      v2:
       * Make it stick out more and apply only to RCS. (Chris)
      
      v3:
       * Mention platform override in kconfig. (Joonas)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Michal Mrozek <michal.mrozek@intel.com>
      Cc: <stable@vger.kernel.org> # v5.6+
      Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: Michal Mrozek <Michal.mrozek@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200312115748.29970-1-tvrtko.ursulin@linux.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      36e25289
    • Piotr Sroka's avatar
      mtd: rawnand: cadence: reinit completion before executing a new command · 9d63d513
      Piotr Sroka authored
      commit 0d7d6c81 upstream.
      
      Reinitialize the completion object before executing a CDMA command to
      make sure the 'done' flag is reset.
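
      Why a stale 'done' flag matters can be shown with a tiny user-space
      model of a completion. This is a sketch only: the *_sketch names are
      hypothetical and merely mimic the counting behaviour of the kernel's
      completion API, not the Cadence driver itself.

```c
#include <stdbool.h>

/* Hypothetical model: 'done' counts completions that were not consumed yet. */
struct completion_sketch {
	unsigned int done;
};

void reinit_completion_sketch(struct completion_sketch *c)
{
	c->done = 0;
}

void complete_sketch(struct completion_sketch *c)
{
	c->done++;
}

/* Non-blocking wait: succeeds only if a completion is pending. */
bool try_wait_sketch(struct completion_sketch *c)
{
	if (!c->done)
		return false;
	c->done--;
	return true;
}

/*
 * Issue one command and report whether the waiter believes it finished.
 * 'hw_done' says whether the (modelled) hardware really completed it.
 */
bool command_finished_sketch(struct completion_sketch *c, bool reinit_first,
			     bool hw_done)
{
	if (reinit_first)
		reinit_completion_sketch(c);
	if (hw_done)
		complete_sketch(c);
	return try_wait_sketch(c);
}
```

      If a previous command left 'done' set, for example because its
      interrupt arrived after a timeout, the next wait returns immediately
      unless the completion is reinitialized first.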
      
      Fixes: ec4ba01e ("mtd: rawnand: Add new Cadence NAND driver to MTD subsystem")
      Cc: stable@vger.kernel.org
      Signed-off-by: Piotr Sroka <piotrs@cadence.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/1581328530-29966-4-git-send-email-piotrs@cadence.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9d63d513
    • Piotr Sroka's avatar
      mtd: rawnand: cadence: change bad block marker size · 26453f49
      Piotr Sroka authored
      commit 9bf1903b upstream.
      
      Increase the bad block marker size from one byte to two bytes.
      The bad block marker is handled by the skip bytes feature of HPNFC,
      and the controller expects this value to be an even number.
      
      Fixes: ec4ba01e ("mtd: rawnand: Add new Cadence NAND driver to MTD subsystem")
      Cc: stable@vger.kernel.org
      Signed-off-by: Piotr Sroka <piotrs@cadence.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/1581328530-29966-3-git-send-email-piotrs@cadence.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      26453f49
    • Piotr Sroka's avatar
      mtd: rawnand: cadence: fix the calculation of the available OOB size · a30a4dab
      Piotr Sroka authored
      commit e4578af0 upstream.
      
      The value of cdns_chip->sector_count is not known at the moment
      of the derivation of ecc_size, leading to a zero value. Fix
      this by assigning ecc_size later in the code.
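
      The ordering problem can be reproduced in a few lines. The sketch below
      is an assumption-laden illustration: struct chip_sketch, its fields and
      the formula ecc_size = sector_count * ecc_bytes are hypothetical
      stand-ins, not the driver's actual code.

```c
/* Hypothetical, simplified chip description. */
struct chip_sketch {
	unsigned int page_size;		/* bytes per page */
	unsigned int sector_size;	/* bytes per ECC sector */
	unsigned int ecc_bytes;		/* ECC bytes per sector */
	unsigned int oob_size;		/* total OOB bytes per page */
	unsigned int sector_count;	/* 0 until it has been derived */
};

/* Buggy order: ecc_size is derived while sector_count is still zero. */
unsigned int avail_oob_buggy_sketch(struct chip_sketch *c)
{
	unsigned int ecc_size = c->sector_count * c->ecc_bytes;

	c->sector_count = c->page_size / c->sector_size;
	return c->oob_size - ecc_size;
}

/* Fixed order: derive sector_count first, then assign ecc_size. */
unsigned int avail_oob_fixed_sketch(struct chip_sketch *c)
{
	unsigned int ecc_size;

	c->sector_count = c->page_size / c->sector_size;
	ecc_size = c->sector_count * c->ecc_bytes;
	return c->oob_size - ecc_size;
}
```

      With a zero ecc_size the buggy variant reports the whole OOB area as
      available, which overstates the space left for user data.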
      
      Fixes: ec4ba01e ("mtd: rawnand: Add new Cadence NAND driver to MTD subsystem")
      Cc: stable@vger.kernel.org
      Signed-off-by: Piotr Sroka <piotrs@cadence.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/1581328530-29966-2-git-send-email-piotrs@cadence.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a30a4dab
    • Frieder Schrempf's avatar
      mtd: spinand: Do not erase the block before writing a bad block marker · 834a686e
      Frieder Schrempf authored
      commit b645ad39 upstream.
      
      Currently, when marking a block as bad, we use spinand_erase_op() to erase
      the block before writing the marker to the OOB area. Doing so without
      waiting for the operation to finish can lead to the marking failing
      silently and no bad block marker being written to the flash.
      
      In fact we don't need to do an erase at all before writing the BBM.
      The ECC is disabled for raw accesses to the OOB data and we don't
      need to work around any issues with chips reporting ECC errors as it
      is known to be the case for raw NAND.
      
      Fixes: 7529df46 ("mtd: nand: Add core infrastructure to support SPI NANDs")
      Cc: stable@vger.kernel.org
      Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
      Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20200218100432.32433-4-frieder.schrempf@kontron.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      834a686e
    • Frieder Schrempf's avatar
      mtd: spinand: Stop using spinand->oobbuf for buffering bad block markers · 1aeeb0e8
      Frieder Schrempf authored
      commit 21489375 upstream.
      
      For reading and writing the bad block markers, spinand->oobbuf is
      currently used as a buffer for the marker bytes. During the
      underlying read and write operations to actually get/set the content
      of the OOB area, the content of spinand->oobbuf is reused and changed
      by accessing it through spinand->oobbuf and/or spinand->databuf.
      
      This is a flaw in the original design of the SPI NAND core and at the
      latest from 13c15e07 ("mtd: spinand: Handle the case where
      PROGRAM LOAD does not reset the cache") on, it results in not having
      the bad block marker written at all, as the spinand->oobbuf is
      cleared to 0xff after setting the marker bytes to zero.
      
      To fix it, we now just store the two bytes for the marker on the
      stack and let the read/write operations copy it from/to the page
      buffer later.
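
      The aliasing problem and the stack-based fix can be modelled in user
      space. Everything here is a sketch under assumptions: struct
      spinand_sketch, write_oob_sketch() and the markbad helpers are
      hypothetical and only imitate a core that resets its OOB buffer to 0xff
      before copying in the caller's data.

```c
#include <stdint.h>
#include <string.h>

#define OOB_SIZE_SKETCH 64

/* Hypothetical model: one shared OOB buffer plus the flash OOB area. */
struct spinand_sketch {
	uint8_t oobbuf[OOB_SIZE_SKETCH];
	uint8_t flash_oob[OOB_SIZE_SKETCH];
};

/*
 * Models the write path: the shared buffer is reset to 0xff, the caller's
 * bytes are copied in, and the buffer is then programmed to flash.
 * memmove() is used because 'src' may alias the shared buffer.
 */
static void write_oob_sketch(struct spinand_sketch *s, const uint8_t *src,
			     size_t len)
{
	memset(s->oobbuf, 0xff, sizeof(s->oobbuf));
	memmove(s->oobbuf, src, len);
	memcpy(s->flash_oob, s->oobbuf, sizeof(s->flash_oob));
}

/* Buggy variant: the marker lives in the shared buffer and gets wiped. */
void markbad_buggy_sketch(struct spinand_sketch *s)
{
	memset(s->oobbuf, 0x00, 2);
	write_oob_sketch(s, s->oobbuf, 2);
}

/* Fixed variant: the two marker bytes are kept on the stack. */
void markbad_fixed_sketch(struct spinand_sketch *s)
{
	uint8_t marker[2] = { 0x00, 0x00 };

	write_oob_sketch(s, marker, sizeof(marker));
}
```

      In the buggy variant the source pointer aliases the buffer that was
      just reset, so 0xff is programmed and the block never appears as bad.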
      
      Fixes: 7529df46 ("mtd: nand: Add core infrastructure to support SPI NANDs")
      Cc: stable@vger.kernel.org
      Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
      Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20200218100432.32433-2-frieder.schrempf@kontron.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1aeeb0e8