1. 12 Jul, 2022 26 commits
    • Input: goodix - add a goodix.h header file · f5b1c6d5
      Hans de Goede authored
      [ Upstream commit a2233cb7 ]
      
      Add a goodix.h header file, and move the register definitions,
      and struct declarations there and add prototypes for various
      helper functions.
      
      This is a preparation patch for adding support for controllers
      without flash, which need to have their firmware uploaded and
      need some other special handling too.
      
      Since MAINTAINERS needs updating because of this change anyways,
      also add myself as co-maintainer.
      Reviewed-by: Bastien Nocera <hadess@hadess.net>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Link: https://lore.kernel.org/r/20210920150643.155872-3-hdegoede@redhat.com
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Input: goodix - change goodix_i2c_write() len parameter type to int · 1354ceb1
      Hans de Goede authored
      [ Upstream commit 31ae0102 ]
      
      Change the type of the goodix_i2c_write() len parameter from 'unsigned'
      to 'int' to avoid bare use of 'unsigned'. Changing it to 'int' also makes
      the goodix_i2c_write() prototype consistent with goodix_i2c_read().
      Reviewed-by: Bastien Nocera <hadess@hadess.net>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Link: https://lore.kernel.org/r/20210920150643.155872-2-hdegoede@redhat.com
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Input: cpcap-pwrbutton - handle errors from platform_get_irq() · 8d1d6b29
      Tang Bin authored
      [ Upstream commit 58ae4004 ]
      
      The function cpcap_power_button_probe() does not perform
      sufficient error checking after executing platform_get_irq(),
      thus fix it.
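
The kind of check described above can be sketched in user-space C. This is only an illustration, not the driver code: fake_platform_get_irq() and fake_probe() are hypothetical stand-ins, and the IRQ number 42 is arbitrary. The one fact relied on is that platform_get_irq() returns a negative errno value on failure.

```c
#include <errno.h>

/* Hypothetical stand-in for platform_get_irq(): returns an IRQ number
 * on success or a negative errno value on failure. */
static int fake_platform_get_irq(int have_irq)
{
	return have_irq ? 42 : -ENXIO;
}

/* Hypothetical probe routine: propagate the error instead of treating
 * a negative return value as if it were a valid IRQ number. */
static int fake_probe(int have_irq, int *irq_out)
{
	int irq = fake_platform_get_irq(have_irq);

	if (irq < 0)
		return irq;

	*irq_out = irq;
	return 0;
}
```

On failure the caller sees the original error code and *irq_out is left untouched.
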
      Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
      Link: https://lore.kernel.org/r/20210802121740.8700-1-tangbin@cmss.chinamobile.com
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: fix warning when freeing leaf after subvolume creation failure · 48f8f198
      Filipe Manana authored
      [ Upstream commit 212a58fd ]
      
      When creating a subvolume, at ioctl.c:create_subvol(), if we fail to
      insert the root item for the new subvolume into the root tree, we can
      trigger the following warning:
      
      [78961.741046] WARNING: CPU: 0 PID: 4079814 at fs/btrfs/extent-tree.c:3357 btrfs_free_tree_block+0x2af/0x310 [btrfs]
      [78961.743344] Modules linked in:
      [78961.749440]  dm_snapshot dm_thin_pool (...)
      [78961.773648] CPU: 0 PID: 4079814 Comm: fsstress Not tainted 5.16.0-rc4-btrfs-next-108 #1
      [78961.775198] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
      [78961.777266] RIP: 0010:btrfs_free_tree_block+0x2af/0x310 [btrfs]
      [78961.778398] Code: 17 00 48 85 (...)
      [78961.781067] RSP: 0018:ffffaa4001657b28 EFLAGS: 00010202
      [78961.781877] RAX: 0000000000000213 RBX: ffff897f8a796910 RCX: 0000000000000000
      [78961.782780] RDX: 0000000000000000 RSI: 0000000011004000 RDI: 00000000ffffffff
      [78961.783764] RBP: ffff8981f490e800 R08: 0000000000000001 R09: 0000000000000000
      [78961.784740] R10: 0000000000000000 R11: 0000000000000001 R12: ffff897fc963fcc8
      [78961.785665] R13: 0000000000000001 R14: ffff898063548000 R15: ffff898063548000
      [78961.786620] FS:  00007f31283c6b80(0000) GS:ffff8982ace00000(0000) knlGS:0000000000000000
      [78961.787717] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [78961.788598] CR2: 00007f31285c3000 CR3: 000000023fcc8003 CR4: 0000000000370ef0
      [78961.789568] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [78961.790585] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [78961.791684] Call Trace:
      [78961.792082]  <TASK>
      [78961.792359]  create_subvol+0x5d1/0x9a0 [btrfs]
      [78961.793054]  btrfs_mksubvol+0x447/0x4c0 [btrfs]
      [78961.794009]  ? preempt_count_add+0x49/0xa0
      [78961.794705]  __btrfs_ioctl_snap_create+0x123/0x190 [btrfs]
      [78961.795712]  ? _copy_from_user+0x66/0xa0
      [78961.796382]  btrfs_ioctl_snap_create_v2+0xbb/0x140 [btrfs]
      [78961.797392]  btrfs_ioctl+0xd1e/0x35c0 [btrfs]
      [78961.798172]  ? __slab_free+0x10a/0x360
      [78961.798820]  ? rcu_read_lock_sched_held+0x12/0x60
      [78961.799664]  ? lock_release+0x223/0x4a0
      [78961.800321]  ? lock_acquired+0x19f/0x420
      [78961.800992]  ? rcu_read_lock_sched_held+0x12/0x60
      [78961.801796]  ? trace_hardirqs_on+0x1b/0xe0
      [78961.802495]  ? _raw_spin_unlock_irqrestore+0x3e/0x60
      [78961.803358]  ? kmem_cache_free+0x321/0x3c0
      [78961.804071]  ? __x64_sys_ioctl+0x83/0xb0
      [78961.804711]  __x64_sys_ioctl+0x83/0xb0
      [78961.805348]  do_syscall_64+0x3b/0xc0
      [78961.805969]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [78961.806830] RIP: 0033:0x7f31284bc957
      [78961.807517] Code: 3c 1c 48 f7 d8 (...)
      
      This is because we are calling btrfs_free_tree_block() on an extent
      buffer that is dirty. Fix that by cleaning the extent buffer, with
      btrfs_clean_tree_block(), before freeing it.
      
      This was triggered by test case generic/475 from fstests.
      
      Fixes: 67addf29 ("btrfs: fix metadata extent leak after failure to create subvolume")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: fix invalid delayed ref after subvolume creation failure · 9bc53f5a
      Filipe Manana authored
      [ Upstream commit 7a163608 ]
      
      When creating a subvolume, at ioctl.c:create_subvol(), if we fail to
      insert the new root's root item into the root tree, we are freeing the
      metadata extent we reserved for the new root to prevent a metadata
      extent leak, as we don't abort the transaction at that point (since
      there is nothing at that point that is irreversible).
      
      However we allocated the metadata extent for the new root which we are
      creating for the new subvolume, so its delayed reference refers to the
      ID of this new root. But when we free the metadata extent we pass the
      root of the subvolume where the new subvolume is located to
      btrfs_free_tree_block() - this is incorrect because this will generate
      a delayed reference that refers to the ID of the parent subvolume's root,
      and not to the ID of the new root.
      
      This results in a failure when running delayed references that leads to
      a transaction abort and a trace like the following:
      
      [3868.738042] RIP: 0010:__btrfs_free_extent+0x709/0x950 [btrfs]
      [3868.739857] Code: 68 0f 85 e6 fb ff (...)
      [3868.742963] RSP: 0018:ffffb0e9045cf910 EFLAGS: 00010246
      [3868.743908] RAX: 00000000fffffffe RBX: 00000000fffffffe RCX: 0000000000000002
      [3868.745312] RDX: 00000000fffffffe RSI: 0000000000000002 RDI: ffff90b0cd793b88
      [3868.746643] RBP: 000000000e5d8000 R08: 0000000000000000 R09: ffff90b0cd793b88
      [3868.747979] R10: 0000000000000002 R11: 00014ded97944d68 R12: 0000000000000000
      [3868.749373] R13: ffff90b09afe4a28 R14: 0000000000000000 R15: ffff90b0cd793b88
      [3868.750725] FS:  00007f281c4a8b80(0000) GS:ffff90b3ada00000(0000) knlGS:0000000000000000
      [3868.752275] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [3868.753515] CR2: 00007f281c6a5000 CR3: 0000000108a42006 CR4: 0000000000370ee0
      [3868.754869] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [3868.756228] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [3868.757803] Call Trace:
      [3868.758281]  <TASK>
      [3868.758655]  ? btrfs_merge_delayed_refs+0x178/0x1c0 [btrfs]
      [3868.759827]  __btrfs_run_delayed_refs+0x2b1/0x1250 [btrfs]
      [3868.761047]  btrfs_run_delayed_refs+0x86/0x210 [btrfs]
      [3868.762069]  ? lock_acquired+0x19f/0x420
      [3868.762829]  btrfs_commit_transaction+0x69/0xb20 [btrfs]
      [3868.763860]  ? _raw_spin_unlock+0x29/0x40
      [3868.764614]  ? btrfs_block_rsv_release+0x1c2/0x1e0 [btrfs]
      [3868.765870]  create_subvol+0x1d8/0x9a0 [btrfs]
      [3868.766766]  btrfs_mksubvol+0x447/0x4c0 [btrfs]
      [3868.767669]  ? preempt_count_add+0x49/0xa0
      [3868.768444]  __btrfs_ioctl_snap_create+0x123/0x190 [btrfs]
      [3868.769639]  ? _copy_from_user+0x66/0xa0
      [3868.770391]  btrfs_ioctl_snap_create_v2+0xbb/0x140 [btrfs]
      [3868.771495]  btrfs_ioctl+0xd1e/0x35c0 [btrfs]
      [3868.772364]  ? __slab_free+0x10a/0x360
      [3868.773198]  ? rcu_read_lock_sched_held+0x12/0x60
      [3868.774121]  ? lock_release+0x223/0x4a0
      [3868.774863]  ? lock_acquired+0x19f/0x420
      [3868.775634]  ? rcu_read_lock_sched_held+0x12/0x60
      [3868.776530]  ? trace_hardirqs_on+0x1b/0xe0
      [3868.777373]  ? _raw_spin_unlock_irqrestore+0x3e/0x60
      [3868.778280]  ? kmem_cache_free+0x321/0x3c0
      [3868.779011]  ? __x64_sys_ioctl+0x83/0xb0
      [3868.779718]  __x64_sys_ioctl+0x83/0xb0
      [3868.780387]  do_syscall_64+0x3b/0xc0
      [3868.781059]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [3868.781953] RIP: 0033:0x7f281c59e957
      [3868.782585] Code: 3c 1c 48 f7 d8 4c (...)
      [3868.785867] RSP: 002b:00007ffe1f83e2b8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
      [3868.787198] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f281c59e957
      [3868.788450] RDX: 00007ffe1f83e2c0 RSI: 0000000050009418 RDI: 0000000000000003
      [3868.789748] RBP: 00007ffe1f83f300 R08: 0000000000000000 R09: 00007ffe1f83fe36
      [3868.791214] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
      [3868.792468] R13: 0000000000000003 R14: 00007ffe1f83e2c0 R15: 00000000000003cc
      [3868.793765]  </TASK>
      [3868.794037] irq event stamp: 0
      [3868.794548] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
      [3868.795670] hardirqs last disabled at (0): [<ffffffff98294214>] copy_process+0x934/0x2040
      [3868.797086] softirqs last  enabled at (0): [<ffffffff98294214>] copy_process+0x934/0x2040
      [3868.798309] softirqs last disabled at (0): [<0000000000000000>] 0x0
      [3868.799284] ---[ end trace be24c7002fe27747 ]---
      [3868.799928] BTRFS info (device dm-0): leaf 241188864 gen 1268 total ptrs 214 free space 469 owner 2
      [3868.801133] BTRFS info (device dm-0): refs 2 lock_owner 225627 current 225627
      [3868.802056]  item 0 key (237436928 169 0) itemoff 16250 itemsize 33
      [3868.802863]          extent refs 1 gen 1265 flags 2
      [3868.803447]          ref#0: tree block backref root 1610
      (...)
      [3869.064354]  item 114 key (241008640 169 0) itemoff 12488 itemsize 33
      [3869.065421]          extent refs 1 gen 1268 flags 2
      [3869.066115]          ref#0: tree block backref root 1689
      (...)
      [3869.403834] BTRFS error (device dm-0): unable to find ref byte nr 241008640 parent 0 root 1622  owner 0 offset 0
      [3869.405641] BTRFS: error (device dm-0) in __btrfs_free_extent:3076: errno=-2 No such entry
      [3869.407138] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2159: errno=-2 No such entry
      
      Fix this by passing the new subvolume's root ID to btrfs_free_tree_block().
      This requires changing the root argument of btrfs_free_tree_block() from
      struct btrfs_root * to a u64, since at this point during the subvolume
      creation we have not yet created the struct btrfs_root for the new
      subvolume, and btrfs_free_tree_block() only needs a root ID and nothing
      else from a struct btrfs_root.
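
Why the root ID matters can be shown with a toy model in user-space C. It is a sketch under simplifying assumptions, not btrfs code: struct toy_backref and toy_drop_ref() are hypothetical, and a single extent is assumed to be referenced by exactly one root. Dropping the reference with a different root ID than the one recorded fails with -ENOENT, which matches the errno=-2 seen in the trace above.

```c
#include <errno.h>
#include <stdint.h>

/* Toy backref: one tree block referenced by exactly one root. */
struct toy_backref {
	uint64_t bytenr;   /* logical address of the tree block */
	uint64_t root_id;  /* ID of the root that owns the reference */
	int refs;          /* remaining reference count */
};

/* Drop one reference. The lookup is keyed on both the block address
 * and the root ID, so a mismatched root ID finds nothing. */
static int toy_drop_ref(struct toy_backref *ref, uint64_t bytenr,
			uint64_t root_id)
{
	if (ref->bytenr != bytenr || ref->root_id != root_id || ref->refs == 0)
		return -ENOENT;

	ref->refs--;
	return 0;
}
```

With the values from the log, a backref recorded for root 1689 cannot be found when the drop is issued on behalf of root 1622.
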
      
      This was triggered by test case generic/475 from fstests.
      
      Fixes: 67addf29 ("btrfs: fix metadata extent leak after failure to create subvolume")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: add additional parameters to btrfs_init_tree_ref/btrfs_init_data_ref · 66182050
      Nikolay Borisov authored
      [ Upstream commit f42c5da6 ]
      
      In order to make 'real_root' used only in ref-verify it's required to
      have the necessary context to perform the same checks that this member
      is used for. So add 'mod_root' which will contain the root on behalf of
      which a delayed ref was created and a 'skip_group' parameter which
      will contain callsite-specific override of skip_qgroup.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: rename btrfs_alloc_chunk to btrfs_create_chunk · bb5c2471
      Nikolay Borisov authored
      [ Upstream commit f6f39f7a ]
      
      The user-facing function used to allocate new chunks is
      btrfs_chunk_alloc. Unfortunately, there is yet another similar-sounding
      function, btrfs_alloc_chunk. This creates confusion, especially since
      the latter function can be considered "private" in the sense that it
      implements the first stage of chunk creation and as such is called by
      btrfs_chunk_alloc.

      To avoid the awkwardness that comes with having similarly named
      functions with distinctly different purposes, rename btrfs_alloc_chunk
      to btrfs_create_chunk, given that the main purpose of this function is
      to orchestrate the whole process of allocating a chunk: reserving space
      on devices, deciding on the characteristics of the stripe size and
      creating the in-memory structures.
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • netfilter: nf_tables: stricter validation of element data · c1784d20
      Pablo Neira Ayuso authored
      commit 7e6bc1f6 upstream.
      
      Make sure element data type and length do not mismatch the one specified
      by the set declaration.
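
The idea of the check can be sketched as follows. This is a simplified user-space illustration, not the nf_tables code: the toy_* types and toy_validate_elem_data() are hypothetical names, and only the rule stated above (type and length must agree with the set declaration) is modeled.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical data kinds a set can declare for its elements. */
enum toy_data_type { TOY_DATA_VALUE, TOY_DATA_VERDICT };

/* What the set declaration promises about element data. */
struct toy_set {
	enum toy_data_type dtype;
	size_t dlen;
};

/* What a new element actually carries. */
struct toy_elem_data {
	enum toy_data_type type;
	size_t len;
};

/* Reject element data whose type or length disagrees with the set. */
static int toy_validate_elem_data(const struct toy_set *set,
				  const struct toy_elem_data *data)
{
	if (data->type != set->dtype)
		return -EINVAL;
	if (data->len != set->dlen)
		return -EINVAL;
	return 0;
}
```
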
      
      Fixes: 7d740264 ("netfilter: nf_tables: variable sized set element keys / data")
      Reported-by: Hugues ANGUELKOV <hanguelkov@randorisec.fr>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: nft_set_pipapo: release elements in clone from abort path · 5ccecafc
      Pablo Neira Ayuso authored
      commit 9827a0e6 upstream.
      
      New elements that reside in the clone are not released in case that the
      transaction is aborted.
      
      [16302.231754] ------------[ cut here ]------------
      [16302.231756] WARNING: CPU: 0 PID: 100509 at net/netfilter/nf_tables_api.c:1864 nf_tables_chain_destroy+0x26/0x127 [nf_tables]
      [...]
      [16302.231882] CPU: 0 PID: 100509 Comm: nft Tainted: G        W         5.19.0-rc3+ #155
      [...]
      [16302.231887] RIP: 0010:nf_tables_chain_destroy+0x26/0x127 [nf_tables]
      [16302.231899] Code: f3 fe ff ff 41 55 41 54 55 53 48 8b 6f 10 48 89 fb 48 c7 c7 82 96 d9 a0 8b 55 50 48 8b 75 58 e8 de f5 92 e0 83 7d 50 00 74 09 <0f> 0b 5b 5d 41 5c 41 5d c3 4c 8b 65 00 48 8b 7d 08 49 39 fc 74 05
      [...]
      [16302.231917] Call Trace:
      [16302.231919]  <TASK>
      [16302.231921]  __nf_tables_abort.cold+0x23/0x28 [nf_tables]
      [16302.231934]  nf_tables_abort+0x30/0x50 [nf_tables]
      [16302.231946]  nfnetlink_rcv_batch+0x41a/0x840 [nfnetlink]
      [16302.231952]  ? __nla_validate_parse+0x48/0x190
      [16302.231959]  nfnetlink_rcv+0x110/0x129 [nfnetlink]
      [16302.231963]  netlink_unicast+0x211/0x340
      [16302.231969]  netlink_sendmsg+0x21e/0x460
      
      Add nft_set_pipapo_match_destroy() helper function to release the
      elements in the lookup tables.
      
      Stefano Brivio says: "We additionally look for elements pointers in the
      cloned matching data if priv->dirty is set, because that means that
      cloned data might point to additional elements we did not commit to the
      working copy yet (such as the abort path case, but perhaps not limited
      to it)."
      
      Fixes: 3c4287f6 ("nf_tables: Add set type for arbitrary concatenation of ranges")
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: rose: fix UAF bug caused by rose_t0timer_expiry · 75e9009e
      Duoming Zhou authored
      commit 148ca045 upstream.
      
      There are UAF bugs caused by rose_t0timer_expiry(). The
      root cause is that del_timer() could not stop the timer
      handler that is running and there is no synchronization.
      One of the race conditions is shown below:
      
          (thread 1)             |        (thread 2)
                                 | rose_device_event
                                 |   rose_rt_device_down
                                 |     rose_remove_neigh
      rose_t0timer_expiry        |       rose_stop_t0timer(rose_neigh)
        ...                      |         del_timer(&neigh->t0timer)
                                 |         kfree(rose_neigh) //[1]FREE
        neigh->dce_mode //[2]USE |
      
      The rose_neigh is deallocated in position [1] and used in
      position [2].
      
      The crash trace triggered by POC is like below:
      
      BUG: KASAN: use-after-free in expire_timers+0x144/0x320
      Write of size 8 at addr ffff888009b19658 by task swapper/0/0
      ...
      Call Trace:
       <IRQ>
       dump_stack_lvl+0xbf/0xee
       print_address_description+0x7b/0x440
       print_report+0x101/0x230
       ? expire_timers+0x144/0x320
       kasan_report+0xed/0x120
       ? expire_timers+0x144/0x320
       expire_timers+0x144/0x320
       __run_timers+0x3ff/0x4d0
       run_timer_softirq+0x41/0x80
       __do_softirq+0x233/0x544
       ...
      
      This patch changes rose_stop_ftimer() and rose_stop_t0timer()
      in rose_remove_neigh() to del_timer_sync() in order that the
      timer handler could be finished before the resources such as
      rose_neigh and so on are deallocated. As a result, the UAF
      bugs could be mitigated.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Link: https://lore.kernel.org/r/20220705125610.77971-1-duoming@zju.edu.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usbnet: fix memory leak in error case · db89582f
      Oliver Neukum authored
      commit b55a21b7 upstream.
      
      usbnet_write_cmd_async() mixed up which buffers
      need to be freed in which error case.
      
      v2: add Fixes tag
      v3: fix uninitialized buf pointer
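
The general shape of such an error path can be sketched in user-space C. This is not the usbnet code: toy_write_cmd_async(), the fail_step switch and the counting allocator are hypothetical, added only so that leaks become observable. The points illustrated are that each label frees exactly what was allocated before the failing step, and that buf starts out as NULL so freeing it is safe when no payload was copied.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

static int toy_live_allocs;	/* allocations not yet freed */

/* Counting allocator; 'fail' simulates an allocation failure. */
static void *toy_alloc(size_t n, int fail)
{
	void *p = fail ? NULL : malloc(n);

	if (p)
		toy_live_allocs++;
	return p;
}

static void toy_free(void *p)
{
	if (p) {
		toy_live_allocs--;
		free(p);
	}
}

/* fail_step: 0 = none, 1 = urb, 2 = buf, 3 = req, 4 = submit fails. */
static int toy_write_cmd_async(const void *data, size_t size, int fail_step)
{
	void *urb, *buf = NULL, *req;	/* buf must start out as NULL */
	int err = -ENOMEM;

	urb = toy_alloc(64, fail_step == 1);
	if (!urb)
		goto fail;

	if (data) {
		buf = toy_alloc(size, fail_step == 2);
		if (!buf)
			goto fail_free_urb;
		memcpy(buf, data, size);
	}

	req = toy_alloc(8, fail_step == 3);
	if (!req)
		goto fail_free_buf;

	if (fail_step == 4) {
		err = -EIO;
		goto fail_free_req;
	}

	/* Success: a real driver would hand ownership to the completion
	 * handler; the sketch simply releases everything here. */
	toy_free(req);
	toy_free(buf);
	toy_free(urb);
	return 0;

fail_free_req:
	toy_free(req);
fail_free_buf:
	toy_free(buf);	/* NULL when no payload was passed */
fail_free_urb:
	toy_free(urb);
fail:
	return err;
}
```

Whatever step fails, toy_live_allocs is back at zero when the function returns.
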
      
      Fixes: 877bd862 ("usbnet: introduce usbnet 3 command helpers")
      Signed-off-by: Oliver Neukum <oneukum@suse.com>
      Link: https://lore.kernel.org/r/20220705125351.17309-1-oneukum@suse.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf: Fix insufficient bounds propagation from adjust_scalar_min_max_vals · a7de8d43
      Daniel Borkmann authored
      commit 3844d153 upstream.
      
      Kuee reported a corner case where the tnum becomes constant after the call
      to __reg_bound_offset(), but the register's bounds are not, that is, its
      min bounds are still not equal to the register's max bounds.
      
      This in turn allows to leak pointers through turning a pointer register as
      is into an unknown scalar via adjust_ptr_min_max_vals().
      
      Before:
      
        func#0 @0
        0: R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        0: (b7) r0 = 1                        ; R0_w=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0))
        1: (b7) r3 = 0                        ; R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0))
        2: (87) r3 = -r3                      ; R3_w=scalar()
        3: (87) r3 = -r3                      ; R3_w=scalar()
        4: (47) r3 |= 32767                   ; R3_w=scalar(smin=-9223372036854743041,umin=32767,var_off=(0x7fff; 0xffffffffffff8000),s32_min=-2147450881)
        5: (75) if r3 s>= 0x0 goto pc+1       ; R3_w=scalar(umin=9223372036854808575,var_off=(0x8000000000007fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767)
        6: (95) exit
      
        from 5 to 7: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        7: (d5) if r3 s<= 0x8000 goto pc+1    ; R3=scalar(umin=32769,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767)
        8: (95) exit
      
        from 7 to 9: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=32768,var_off=(0x7fff; 0x8000)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        9: (07) r3 += -32767                  ; R3_w=scalar(imm=0,umax=1,var_off=(0x0; 0x0))  <--- [*]
        10: (95) exit
      
      What can be seen here is that R3=scalar(umin=32767,umax=32768,var_off=(0x7fff;
      0x8000)) after the operation R3 += -32767 results in a 'malformed' constant, that
      is, R3_w=scalar(imm=0,umax=1,var_off=(0x0; 0x0)). Intersecting with var_off has
      not been done at that point via __update_reg_bounds(), which would have improved
      the umax to be equal to umin.
      
      Refactor the tnum <> min/max bounds information flow into a reg_bounds_sync()
      helper and use it consistently everywhere. After the fix, bounds have been
      corrected to R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0)) and thus the register
      is regarded as a 'proper' constant scalar of 0.
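
The intersection of unsigned bounds with var_off can be illustrated with a small user-space sketch. It covers only the unsigned 64-bit part of the idea and is not the verifier code: struct toy_tnum, struct toy_reg and toy_update_u64_bounds() are hypothetical names, and the signed and 32-bit bounds that the real helper also has to keep in sync are left out.

```c
#include <stdint.h>

/* Tristate number: 'value' holds the known bits, 'mask' the unknown ones. */
struct toy_tnum {
	uint64_t value;
	uint64_t mask;
};

struct toy_reg {
	uint64_t umin;
	uint64_t umax;
	struct toy_tnum var_off;
};

/* Tighten umin/umax using what var_off knows about the bits:
 * the smallest possible value has all unknown bits clear,
 * the largest possible value has all unknown bits set. */
static void toy_update_u64_bounds(struct toy_reg *reg)
{
	uint64_t lo = reg->var_off.value;
	uint64_t hi = reg->var_off.value | reg->var_off.mask;

	if (reg->umin < lo)
		reg->umin = lo;
	if (reg->umax > hi)
		reg->umax = hi;
}
```

Applied to the 'malformed' state from the trace (umax=1 with var_off=(0x0; 0x0)), the sketch lowers umax to 0, so the bounds agree with the constant var_off.
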
      
      After:
      
        func#0 @0
        0: R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        0: (b7) r0 = 1                        ; R0_w=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0))
        1: (b7) r3 = 0                        ; R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0))
        2: (87) r3 = -r3                      ; R3_w=scalar()
        3: (87) r3 = -r3                      ; R3_w=scalar()
        4: (47) r3 |= 32767                   ; R3_w=scalar(smin=-9223372036854743041,umin=32767,var_off=(0x7fff; 0xffffffffffff8000),s32_min=-2147450881)
        5: (75) if r3 s>= 0x0 goto pc+1       ; R3_w=scalar(umin=9223372036854808575,var_off=(0x8000000000007fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767)
        6: (95) exit
      
        from 5 to 7: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        7: (d5) if r3 s<= 0x8000 goto pc+1    ; R3=scalar(umin=32769,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767)
        8: (95) exit
      
        from 7 to 9: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=32768,var_off=(0x7fff; 0x8000)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        9: (07) r3 += -32767                  ; R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0))  <--- [*]
        10: (95) exit
      
      Fixes: b03c9f9f ("bpf/verifier: track signed and unsigned min/max values")
      Reported-by: Kuee K1r0a <liulin063@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20220701124727.11153-2-daniel@iogearbox.net
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf: Fix incorrect verifier simulation around jmp32's jeq/jne · a703cbdd
      Daniel Borkmann authored
      commit a12ca627 upstream.
      
      Kuee reported a quirk in the jmp32's jeq/jne simulation, namely that the
      register value does not match expectations for the fall-through path. For
      example:
      
      Before fix:
      
        0: R1=ctx(off=0,imm=0) R10=fp0
        0: (b7) r2 = 0                        ; R2_w=P0
        1: (b7) r6 = 563                      ; R6_w=P563
        2: (87) r2 = -r2                      ; R2_w=Pscalar()
        3: (87) r2 = -r2                      ; R2_w=Pscalar()
        4: (4c) w2 |= w6                      ; R2_w=Pscalar(umin=563,umax=4294967295,var_off=(0x233; 0xfffffdcc),s32_min=-2147483085) R6_w=P563
        5: (56) if w2 != 0x8 goto pc+1        ; R2_w=P571  <--- [*]
        6: (95) exit
        R0 !read_ok
      
      After fix:
      
        0: R1=ctx(off=0,imm=0) R10=fp0
        0: (b7) r2 = 0                        ; R2_w=P0
        1: (b7) r6 = 563                      ; R6_w=P563
        2: (87) r2 = -r2                      ; R2_w=Pscalar()
        3: (87) r2 = -r2                      ; R2_w=Pscalar()
        4: (4c) w2 |= w6                      ; R2_w=Pscalar(umin=563,umax=4294967295,var_off=(0x233; 0xfffffdcc),s32_min=-2147483085) R6_w=P563
        5: (56) if w2 != 0x8 goto pc+1        ; R2_w=P8  <--- [*]
        6: (95) exit
        R0 !read_ok
      
      As can be seen on line 5, for the branch fall-through path in R2 [*], given
      that the condition w2 != 0x8 is false, the verifier should conclude that
      r2 = 8, as the upper 32 bits are known to be zero. However, the verifier
      incorrectly concludes that r2 = 571, which is far off.

      The problem is that it only marks {false,true}_reg as known in the switch for JE/NE
      case, but at the end of the function, it uses {false,true}_{64,32}off to
      update {false,true}_reg->var_off and they still hold the prior value of
      {false,true}_reg->var_off before it got marked as known. The subsequent
      __reg_combine_32_into_64() then propagates this old var_off and derives new
      bounds. The information between min/max bounds on {false,true}_reg from
      setting the register to known const combined with the {false,true}_reg->var_off
      based on the old information then derives wrong register data.
      
      Fix it by detangling the BPF_JEQ/BPF_JNE cases and updating relevant
      {false,true}_{64,32}off tnums along with the register marking to known
      constant.
      
      Fixes: 3f50f132 ("bpf: Verifier, do explicit ALU32 bounds tracking")
      Reported-by: Kuee K1r0a <liulin063@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20220701124727.11153-1-daniel@iogearbox.net
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • can: mcp251xfd: mcp251xfd_regmap_crc_read(): update workaround broken CRC on TBC register · f7c9b38c
      Thomas Kopp authored
      commit e3d4ee7d upstream.
      
      The mcp251xfd compatible chips have an erratum ([1], [2]), where the
      received CRC doesn't match the calculated CRC. In commit
      c7eb923c ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work
      around broken CRC on TBC register") the following workaround was
      implemented:
      
      - If a CRC read error on the TBC register is detected and the first
        byte is 0x00 or 0x80, the most significant bit of the first byte is
        flipped and the CRC is calculated again.
      - If the CRC now matches, the _original_ data is passed to the reader.
        For now we assume transferred data was OK.
      
      New investigations and simulations indicate that the CRC sent by the
      device is calculated on correct data, and the data is incorrectly
      received by the SPI host controller.
      
      Use flipped instead of original data and update workaround description
      in mcp251xfd_regmap_crc_read().
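
The updated workaround can be sketched in user-space C as below. This is an illustration only, not the driver code: toy_crc16() is a stand-in checksum (a plain bitwise CRC-16 with polynomial 0x1021) that is not claimed to match the chip's CRC, and toy_check_tbc_read() is a hypothetical helper. What it shows is the decision described above: if the CRC only matches after flipping the most significant bit of the first byte, the flipped data is the one handed on.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in checksum: bitwise CRC-16, polynomial 0x1021, init 0xffff.
 * Chosen for the sketch only; not the CRC used by the device. */
static uint16_t toy_crc16(const uint8_t *data, size_t len)
{
	uint16_t crc = 0xffff;

	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)((uint16_t)data[i] << 8);
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 0x8000) ?
				(uint16_t)((crc << 1) ^ 0x1021) :
				(uint16_t)(crc << 1);
	}
	return crc;
}

/* Returns 0 if the data is accepted (possibly after correction in
 * place), -1 if the CRC mismatch is genuine. */
static int toy_check_tbc_read(uint8_t *data, size_t len, uint16_t crc_rx)
{
	if (toy_crc16(data, len) == crc_rx)
		return 0;

	if (len > 0 && (data[0] == 0x00 || data[0] == 0x80)) {
		data[0] ^= 0x80;	/* flip the most significant bit */
		if (toy_crc16(data, len) == crc_rx)
			return 0;	/* keep the flipped data */
		data[0] ^= 0x80;	/* restore: CRC error is genuine */
	}
	return -1;
}
```
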
      
      [1] mcp2517fd: DS80000792C: "Incorrect CRC for certain READ_CRC commands"
      [2] mcp2518fd: DS80000789C: "Incorrect CRC for certain READ_CRC commands"
      
      Link: https://lore.kernel.org/all/DM4PR11MB53901D49578FE265B239E55AFB7C9@DM4PR11MB5390.namprd11.prod.outlook.com
      Fixes: c7eb923c ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register")
      Cc: stable@vger.kernel.org
      Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com>
      [mkl: split into 2 patches, update patch description and documentation]
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • can: mcp251xfd: mcp251xfd_regmap_crc_read(): improve workaround handling for mcp2517fd · 0cab3fb9
      Thomas Kopp authored
      commit 406cc9cd upstream.
      
      The mcp251xfd compatible chips have an erratum ([1], [2]), where the
      received CRC doesn't match the calculated CRC. In commit
      c7eb923c ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work
      around broken CRC on TBC register") the following workaround was
      implemented:
      
      - If a CRC read error on the TBC register is detected and the first
        byte is 0x00 or 0x80, the most significant bit of the first byte is
        flipped and the CRC is calculated again.
      - If the CRC now matches, the _original_ data is passed to the reader.
        For now we assume transferred data was OK.
      
      Measurements on the mcp2517fd show that the workaround is applicable
      not only if the lowest byte is 0x00 or 0x80, but also if the 3 least
      significant bits are set.
      
      Update check on 1st data byte and workaround description accordingly.
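
The check described above can be sketched in a few lines. This is only an illustration, not the driver code: `crc16()` is a stand-in CRC (polynomial 0x8005, initial value 0xFFFF, both assumed here), the helper names are hypothetical, and the `0xF8` mask is one reading of "also if the 3 least significant bits are set", namely that the low three bits of the first byte are ignored when deciding whether the workaround applies. Which copy of the data is handed to the reader is left out on purpose.

```python
def crc16(data: bytes, poly: int = 0x8005, init: int = 0xFFFF) -> int:
    """Bitwise MSB-first CRC-16; a stand-in for the CRC used on the bus."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def first_byte_qualifies(first: int) -> bool:
    """Assumed check: ignore the 3 least significant bits of the first byte."""
    return (first & 0xF8) in (0x00, 0x80)


def crc_ok_with_workaround(data: bytes, crc_rx: int) -> bool:
    """Return True if the CRC matches directly or after flipping the MSB."""
    if crc16(data) == crc_rx:
        return True
    if not data or not first_byte_qualifies(data[0]):
        return False
    # Flip the most significant bit of the first byte and recalculate.
    flipped = bytes([data[0] ^ 0x80]) + data[1:]
    return crc16(flipped) == crc_rx
```

With this reading, a first byte of 0x05 or 0x87 qualifies for the retry, while 0x10 does not.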
      
      [1] mcp2517fd: DS80000792C: "Incorrect CRC for certain READ_CRC commands"
      [2] mcp2518fd: DS80000789C: "Incorrect CRC for certain READ_CRC commands"
      
      Link: https://lore.kernel.org/all/DM4PR11MB53901D49578FE265B239E55AFB7C9@DM4PR11MB5390.namprd11.prod.outlook.com
      Fixes: c7eb923c ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register")
      Cc: stable@vger.kernel.org
      Reported-by: Pavel Modilaynen <pavel.modilaynen@volvocars.com>
      Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com>
      [mkl: split into 2 patches, update patch description and documentation]
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • can: m_can: m_can_{read_fifo,echo_tx_event}(): shift timestamp to full 32 bits · c7333f79
      Marc Kleine-Budde authored
      commit 4c333369 upstream.
      
      In commit 1be37d3b ("can: m_can: fix periph RX path: use
      rx-offload to ensure skbs are sent from softirq context") the RX path
      for peripheral devices was switched to RX-offload.
      
      Received CAN frames are pushed to RX-offload together with a
      timestamp. RX-offload is designed to handle overflows of the timestamp
      correctly, if 32 bit timestamps are provided.
      
      The timestamps of m_can core are only 16 bits wide. So this patch
      shifts them to full 32 bit before passing them to RX-offload.
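
Why the shift matters can be shown with a small model. The sketch below assumes that timestamps are ordered with a wraparound-aware 32-bit comparison (the sign of the 32-bit difference); the function names are hypothetical and this is not the RX-offload code itself.

```python
MASK32 = 0xFFFFFFFF


def widen_timestamp(ts16: int) -> int:
    """Move a 16-bit hardware timestamp into the upper half of a 32-bit value."""
    return (ts16 & 0xFFFF) << 16


def is_before(a: int, b: int) -> bool:
    """Wraparound-aware 32-bit ordering: True if a is earlier than b."""
    diff = (a - b) & MASK32
    return diff >= 0x80000000  # difference is negative as a signed 32-bit value


# The 16-bit counter wrapped between these two frames.
older, newer = 0xFFF0, 0x0010

# Unshifted, the wrap is invisible to 32-bit arithmetic and the order is wrong.
unshifted_ok = is_before(older, newer)

# Shifted to the full 32 bits, the 16-bit wrap becomes a 32-bit wrap.
shifted_ok = is_before(widen_timestamp(older), widen_timestamp(newer))
```

Here `unshifted_ok` is False while `shifted_ok` is True: once the value occupies the top 16 bits, an overflow of the hardware counter coincides with an overflow of the 32-bit value, which the comparison handles.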
      
      Link: https://lore.kernel.org/all/20220612211410.4081390-1-mkl@pengutronix.de
      Fixes: 1be37d3b ("can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context")
      Cc: <stable@vger.kernel.org> # 5.13
      Cc: Torin Cooper-Bennun <torin@maxiluxsystems.com>
      Reviewed-by: Chandrasekar Ramakrishnan <rcsekar@samsung.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • can: m_can: m_can_chip_config(): actually enable internal timestamping · f4d90e9c
      Marc Kleine-Budde authored
      commit 5b12933d upstream.
      
      In commit df06fd67 ("can: m_can: m_can_chip_config(): enable and
      configure internal timestamps") the timestamping in the m_can core
      was supposed to be enabled. In peripheral mode, the RX'ed CAN frames,
      TX complete frames and error events are sorted by the timestamp.
      
      The above mentioned commit however forgot to enable the timestamping.
      Add the missing bits to enable the timestamp counter to the write of
      the Timestamp Counter Configuration register.
      
      Link: https://lore.kernel.org/all/20220612212708.4081756-1-mkl@pengutronix.de
      Fixes: df06fd67 ("can: m_can: m_can_chip_config(): enable and configure internal timestamps")
      Cc: <stable@vger.kernel.org> # 5.13
      Cc: Torin Cooper-Bennun <torin@maxiluxsystems.com>
      Reviewed-by: Chandrasekar Ramakrishnan <rcsekar@samsung.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • can: gs_usb: gs_usb_open/close(): fix memory leak · 0e60230b
      Rhett Aultman authored
      commit 2bda24ef upstream.
      
      The gs_usb driver appears to suffer from a malady common to many USB
      CAN adapter drivers in that it performs usb_alloc_coherent() to
      allocate a number of USB request blocks (URBs) for RX, and then later
      relies on usb_kill_anchored_urbs() to free them, but this doesn't
      actually free them. As a result, this may be leaking DMA memory that's
      been used by the driver.
      
      This commit is an adaptation of the techniques found in the esd_usb2
      driver where a similar design pattern led to a memory leak. It
      explicitly frees the RX URBs and their DMA memory via a call to
      usb_free_coherent(). Since the RX URBs were allocated in the
      gs_can_open(), we remove them in gs_can_close() rather than in the
      disconnect function as was done in esd_usb2.
      
      For more information, see commit 928150fa ("can: esd_usb2: fix memory
      leak").
      
      Link: https://lore.kernel.org/all/alpine.DEB.2.22.394.2206031547001.1630869@thelappy
      Fixes: d08e973a ("can: gs_usb: Added support for the GS_USB CAN devices")
      Cc: stable@vger.kernel.org
      Signed-off-by: Rhett Aultman <rhett.aultman@samsara.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • can: grcan: grcan_probe(): remove extra of_node_get() · 8cfa1a33
      Liang He authored
      commit 562fed94 upstream.
      
      In grcan_probe(), of_find_node_by_path() has already increased the
      refcount. There is no need to call of_node_get() again, so remove it.
      
      Link: https://lore.kernel.org/all/20220619070257.4067022-1-windhl@126.com
      Fixes: 1e93ed26 ("can: grcan: grcan_probe(): fix broken system id check for errata workaround needs")
      Cc: stable@vger.kernel.org # v5.18
      Cc: Andreas Larsson <andreas@gaisler.com>
      Signed-off-by: Liang He <windhl@126.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • can: bcm: use call_rcu() instead of costly synchronize_rcu() · f34f2a18
      Oliver Hartkopp authored
      commit f1b4e32a upstream.
      
      In commit d5f9023f ("can: bcm: delay release of struct bcm_op
      after synchronize_rcu()") Thadeu Lima de Souza Cascardo introduced two
      synchronize_rcu() calls in bcm_release() (only once at socket close)
      and in bcm_delete_rx_op() (called on removal of each single bcm_op).
      
      Unfortunately this slow removal of the bcm_op's affects user space
      applications like cansniffer where the modification of a filter
      removes 2048 bcm_op's which blocks the cansniffer application for
      40(!) seconds.
      
      In commit 181d4447 ("can: gw: use call_rcu() instead of costly
      synchronize_rcu()") Eric Dumazet replaced the synchronize_rcu() calls
      with several call_rcu()'s to safely remove the data structures after
      the removal of CAN ID subscriptions with can_rx_unregister() calls.
      
      This patch adopts Eric's approach for can-bcm, which should be
      applicable since the removal of tasklet_kill() in bcm_remove_op() and
      the introduction of the HRTIMER_MODE_SOFT timer handling in Linux 5.4.
      
      Fixes: d5f9023f ("can: bcm: delay release of struct bcm_op after synchronize_rcu()") # >= 5.4
      Link: https://lore.kernel.org/all/20220520183239.19111-1-socketcan@hartkopp.net
      
      
      Cc: stable@vger.kernel.org
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Norbert Slusarek <nslusarek@gmx.net>
      Cc: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: cs46xx: Fix missing snd_card_free() call at probe error · 51aab37a
      Takashi Iwai authored
      commit c5e58c45 upstream.
      
      The previous cleanup with devres may lead to an incorrect release
      order in the probe error handling due to the nature of devres.  Until
      the card is registered, snd_card_free() has to be called first to
      release the resources properly when the driver manages and releases
      them via card->private_free().
      
      This patch fixes it by calling snd_card_free() manually on the error
      from the probe callback.
      
      Fixes: 5bff69b3 ("ALSA: cs46xx: Allocate resources with device-managed APIs")
      Cc: <stable@vger.kernel.org>
      Reported-and-tested-by: Jan Engelhardt <jengelh@inai.de>
      Link: https://lore.kernel.org/r/p2p1s96o-746-74p4-s95-61qo1p7782pn@vanv.qr
      Link: https://lore.kernel.org/r/20220705152336.350-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: hda/realtek: Add quirk for Clevo L140PU · f768f3ca
      Tim Crawford authored
      commit 11bea269 upstream.
      
      Fixes headset detection on Clevo L140PU.
      Signed-off-by: Tim Crawford <tcrawford@system76.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20220624144109.3957-1-tcrawford@system76.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: usb-audio: Workarounds for Behringer UMC 204/404 HD · f62c53c6
      Takashi Iwai authored
      commit ae8b1631 upstream.
      
      Both Behringer UMC 202 HD and 404 HD need explicit quirks to enable
      the implicit feedback mode and to start the playback stream first.
      The former seems to fix the stuttering and the latter is required for
      the playback-only case.
      
      Note that the "clock source 41 is not valid" error message still
      appears even after this fix, but it should appear only once at probe.
      The reason for the error is still unknown, but it seems to be mostly
      harmless as it's a one-off error: the driver retries the clock
      setup and it succeeds afterwards.
      
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215934
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20220624101132.14528-1-tiwai@suse.de
      
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "selftests/bpf: Add test for bpf_timer overwriting crash" · e63b94b8
      Po-Hsu Lin authored
      This reverts commit b0028e1c which is
      commit a7e75016 upstream.
      
      It will break the bpf self-tests build with:
      progs/timer_crash.c:8:19: error: field has incomplete type 'struct bpf_timer'
              struct bpf_timer timer;
                               ^
      /home/ubuntu/linux/tools/testing/selftests/bpf/tools/include/bpf/bpf_helper_defs.h:39:8:
      note: forward declaration of 'struct bpf_timer'
      struct bpf_timer;
             ^
      1 error generated.
      
      This test can only be built with 5.17 and newer kernels.
      Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm/filemap: fix UAF in find_lock_entries · 066a5b67
      Liu Shixin authored
      Release refcount after xas_set to fix UAF which may cause panic like this:
      
       page:ffffea000491fa40 refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x1247e9
       head:ffffea000491fa00 order:3 compound_mapcount:0 compound_pincount:0
       memcg:ffff888104f91091
       flags: 0x2fffff80010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
      ...
      page dumped because: VM_BUG_ON_PAGE(PageTail(page))
       ------------[ cut here ]------------
       kernel BUG at include/linux/page-flags.h:632!
       invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
       CPU: 1 PID: 7642 Comm: sh Not tainted 5.15.51-dirty #26
      ...
       Call Trace:
        <TASK>
        __invalidate_mapping_pages+0xe7/0x540
        drop_pagecache_sb+0x159/0x320
        iterate_supers+0x120/0x240
        drop_caches_sysctl_handler+0xaa/0xe0
        proc_sys_call_handler+0x2b4/0x480
        new_sync_write+0x3d6/0x5c0
        vfs_write+0x446/0x7a0
        ksys_write+0x105/0x210
        do_syscall_64+0x35/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7f52b5733130
      ...
      
      This problem has been fixed on mainline by patch 6b24ca4a ("mm: Use
      multi-index entries in the page cache") since it deletes the related code.
      
      Fixes: 5c211ba2 ("mm: add and use find_lock_entries")
      Signed-off-by: Liu Shixin <liushixin2@huawei.com>
      Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm/slub: add missing TID updates on slab deactivation · 0515cc9b
      Jann Horn authored
      commit eeaa345e upstream.
      
      The fastpath in slab_alloc_node() assumes that c->slab is stable as long as
      the TID stays the same. However, two places in __slab_alloc() currently
      don't update the TID when deactivating the CPU slab.
      
      If multiple operations race the right way, this could lead to an object
      getting lost; or, in an even more unlikely situation, it could even lead to
      an object being freed onto the wrong slab's freelist, messing up the
      `inuse` counter and eventually causing a page to be freed to the page
      allocator while it still contains slab objects.
      
      (I haven't actually tested these cases though, this is just based on
      looking at the code. Writing testcases for this stuff seems like it'd be
      a pain...)
      
      The race leading to state inconsistency is (all operations on the same CPU
      and kmem_cache):
      
       - task A: begin do_slab_free():
          - read TID
          - read pcpu freelist (==NULL)
          - check `slab == c->slab` (true)
       - [PREEMPT A->B]
       - task B: begin slab_alloc_node():
          - fastpath fails (`c->freelist` is NULL)
          - enter __slab_alloc()
          - slub_get_cpu_ptr() (disables preemption)
          - enter ___slab_alloc()
          - take local_lock_irqsave()
          - read c->freelist as NULL
          - get_freelist() returns NULL
          - write `c->slab = NULL`
          - drop local_unlock_irqrestore()
          - goto new_slab
          - slub_percpu_partial() is NULL
          - get_partial() returns NULL
          - slub_put_cpu_ptr() (enables preemption)
       - [PREEMPT B->A]
       - task A: finish do_slab_free():
          - this_cpu_cmpxchg_double() succeeds
          - [CORRUPT STATE: c->slab==NULL, c->freelist!=NULL]
      
      From there, the object on c->freelist will get lost if task B is allowed to
      continue from here: It will proceed to the retry_load_slab label,
      set c->slab, then jump to load_freelist, which clobbers c->freelist.
      
      But if we instead continue as follows, we get worse corruption:
      
       - task A: run __slab_free() on object from other struct slab:
          - CPU_PARTIAL_FREE case (slab was on no list, is now on pcpu partial)
       - task A: run slab_alloc_node() with NUMA node constraint:
          - fastpath fails (c->slab is NULL)
          - call __slab_alloc()
          - slub_get_cpu_ptr() (disables preemption)
          - enter ___slab_alloc()
          - c->slab is NULL: goto new_slab
          - slub_percpu_partial() is non-NULL
          - set c->slab to slub_percpu_partial(c)
          - [CORRUPT STATE: c->slab points to slab-1, c->freelist has objects
            from slab-2]
          - goto redo
          - node_match() fails
          - goto deactivate_slab
          - existing c->freelist is passed into deactivate_slab()
          - inuse count of slab-1 is decremented to account for object from
            slab-2
      
      At this point, the inuse count of slab-1 is 1 lower than it should be.
      This means that if we free all allocated objects in slab-1 except for one,
      SLUB will think that slab-1 is completely unused, and may free its page,
      leading to use-after-free.
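
The first race above (task A's free racing with task B's slab deactivation) can be replayed with a toy model. This is a sketch under simplifying assumptions, not SLUB code: `CpuSlab` is a hypothetical stand-in for the per-CPU state, the TID is bumped by 1 instead of the kernel's step value, and the preemption points are replayed sequentially.

```python
class CpuSlab:
    """Toy per-CPU slab state: (freelist, tid) is updated as one unit."""

    def __init__(self):
        self.slab = "slab-1"
        self.freelist = None
        self.tid = 0

    def cmpxchg_double(self, old_freelist, old_tid, new_freelist):
        """Succeed only if both freelist and TID are unchanged since the read."""
        if self.freelist == old_freelist and self.tid == old_tid:
            self.freelist = new_freelist
            self.tid = old_tid + 1
            return True
        return False

    def deactivate(self, bump_tid):
        """Slow path gives up the CPU slab; the fix is to also advance the TID."""
        self.slab = None
        self.freelist = None
        if bump_tid:
            self.tid += 1


def replay_race(bump_tid):
    c = CpuSlab()
    # Task A: begin the fastpath free, snapshot TID and freelist.
    tid, freelist = c.tid, c.freelist
    slab_matched = (c.slab == "slab-1")
    # [PREEMPT A->B] Task B: allocation slow path deactivates the CPU slab.
    c.deactivate(bump_tid)
    # [PREEMPT B->A] Task A: finish the free using the stale snapshot.
    succeeded = slab_matched and c.cmpxchg_double(freelist, tid, "object")
    return succeeded, c.slab, c.freelist


without_fix = replay_race(bump_tid=False)
with_fix = replay_race(bump_tid=True)
```

Without the TID update the stale compare-and-exchange succeeds and leaves `slab` as None with a non-empty freelist, which is the corrupt state from the list above. With the update the exchange fails, so task A would have to retry against the current state.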
      
      Fixes: c17dda40 ("slub: Separate out kmem_cache_cpu processing from deactivate_slab")
      Fixes: 03e404af ("slub: fast release on full slab")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jann Horn <jannh@google.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Link: https://lore.kernel.org/r/20220608182205.2945720-1-jannh@google.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 07 Jul, 2022 14 commits