  1. Mar 29, 2022
    • ax25: Fix UAF bugs in ax25 timers · 82e31755
      Duoming Zhou authored
      
      There are race conditions that may lead to UAF bugs in
      ax25_heartbeat_expiry(), ax25_t1timer_expiry(), ax25_t2timer_expiry(),
      ax25_t3timer_expiry() and ax25_idletimer_expiry(), when we call
      ax25_release() to deallocate ax25_dev.
      
      One of the UAF bugs caused by ax25_release() is shown below:
      
            (Thread 1)                    |      (Thread 2)
      ax25_dev_device_up() //(1)          |
      ...                                 | ax25_kill_by_device()
      ax25_bind()          //(2)          |
      ax25_connect()                      | ...
       ax25_std_establish_data_link()     |
        ax25_start_t1timer()              | ax25_dev_device_down() //(3)
         mod_timer(&ax25->t1timer,..)     |
                                          | ax25_release()
         (wait a time)                    |  ...
                                          |  ax25_dev_put(ax25_dev) //(4)FREE
         ax25_t1timer_expiry()            |
          ax25->ax25_dev->values[..] //USE|  ...
           ...                            |
      
      We increase the refcount of ax25_dev in position (1) and (2), and
      decrease the refcount of ax25_dev in position (3) and (4).
      The ax25_dev will be freed in position (4) and be used in
      ax25_t1timer_expiry().
      
      The fail log is shown below:
      ==============================================================
      
      [  106.116942] BUG: KASAN: use-after-free in ax25_t1timer_expiry+0x1c/0x60
      [  106.116942] Read of size 8 at addr ffff88800bda9028 by task swapper/0/0
      [  106.116942] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-06123-g0905eec574
      [  106.116942] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-14
      [  106.116942] Call Trace:
      ...
      [  106.116942]  ax25_t1timer_expiry+0x1c/0x60
      [  106.116942]  call_timer_fn+0x122/0x3d0
      [  106.116942]  __run_timers.part.0+0x3f6/0x520
      [  106.116942]  run_timer_softirq+0x4f/0xb0
      [  106.116942]  __do_softirq+0x1c2/0x651
      ...
      
      This patch adds del_timer_sync() in ax25_release(), which ensures
      that all timers have stopped before we deallocate ax25_dev.
      
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • ax25: fix UAF bug in ax25_send_control() · 5352a761
      Duoming Zhou authored
      There are UAF bugs in ax25_send_control(), when we call ax25_release()
      to deallocate ax25_dev. The possible race condition is shown below:
      
            (Thread 1)              |     (Thread 2)
      ax25_dev_device_up() //(1)    |
                                    | ax25_kill_by_device()
      ax25_bind()          //(2)    |
      ax25_connect()                | ...
       ax25->state = AX25_STATE_1   |
       ...                          | ax25_dev_device_down() //(3)
      
            (Thread 3)
      ax25_release()                |
       ax25_dev_put()  //(4) FREE   |
       case AX25_STATE_1:           |
        ax25_send_control()         |
         alloc_skb()       //USE    |
      
      The refcount of ax25_dev increases in position (1) and (2), and
      decreases in position (3) and (4). The ax25_dev will be freed
      before dereference sites in ax25_send_control().
      
      The following is part of the report:
      
      [  102.297448] BUG: KASAN: use-after-free in ax25_send_control+0x33/0x210
      [  102.297448] Read of size 8 at addr ffff888009e6e408 by task ax25_close/602
      [  102.297448] Call Trace:
      [  102.303751]  ax25_send_control+0x33/0x210
      [  102.303751]  ax25_release+0x356/0x450
      [  102.305431]  __sock_release+0x6d/0x120
      [  102.305431]  sock_close+0xf/0x20
      [  102.305431]  __fput+0x11f/0x420
      [  102.305431]  task_work_run+0x86/0xd0
      [  102.307130]  get_signal+0x1075/0x1220
      [  102.308253]  arch_do_signal_or_restart+0x1df/0xc00
      [  102.308253]  exit_to_user_mode_prepare+0x150/0x1e0
      [  102.308253]  syscall_exit_to_user_mode+0x19/0x50
      [  102.308253]  do_syscall_64+0x48/0x90
      [  102.308253]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  102.308253] RIP: 0033:0x405ae7
      
      This patch defers the free operation of ax25_dev and net_device after
      all corresponding dereference sites in ax25_release() to avoid UAF.
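
      The ordering argument above can be sketched in user space. This is a
      minimal model, not the kernel code: struct dev_model, dev_put(),
      dev_use() and both release_*() functions are hypothetical stand-ins,
      and "freed" is only a marker so that the model never touches released
      memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for ax25_dev. "freed" is only a marker, so the model
 * can count a use-after-free without touching released memory. */
struct dev_model {
    int refcnt;
    bool freed;
    int uaf_count;
};

static void dev_put(struct dev_model *d)
{
    assert(d->refcnt > 0);
    if (--d->refcnt == 0)
        d->freed = true;
}

/* Stands in for a dereference site such as ax25_send_control(). */
static void dev_use(struct dev_model *d)
{
    if (d->freed)
        d->uaf_count++;
}

/* Racy order: the last reference is dropped, then the device is used. */
int release_put_first(struct dev_model *d)
{
    dev_put(d);
    dev_use(d);
    return d->uaf_count;
}

/* Deferred order: every use happens before the final put. */
int release_put_last(struct dev_model *d)
{
    dev_use(d);
    dev_put(d);
    return d->uaf_count;
}
```

      With a single remaining reference, release_put_first() records one
      use after the free, while release_put_last() records none.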
      
      Fixes: 9fd75b66 ("ax25: Fix refcount leaks caused by ax25_cb_del()")
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • openvswitch: Fixed nd target mask field in the flow dump. · f19c4445
      Martin Varghese authored
      IPv6 nd target mask was not getting populated in flow dump.
      
      In the function __ovs_nla_put_key the icmp code mask field was checked
      instead of icmp code key field to classify the flow as neighbour discovery.
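
      A small sketch of why the classification has to look at the key. It
      assumes that the same routine serializes either the key or the mask
      (called "output" here); struct icmpv6_fields and both emit_nd_*()
      helpers are hypothetical, and this is not the code of
      __ovs_nla_put_key(). The values 135 and 136 are the ICMPv6 neighbour
      solicitation and advertisement types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ND_NEIGHBOUR_SOLICITATION  135   /* ICMPv6 type values from RFC 4861 */
#define ND_NEIGHBOUR_ADVERTISEMENT 136

struct icmpv6_fields {
    uint8_t type;
    uint8_t code;
};

static bool is_nd_type(uint8_t type)
{
    return type == ND_NEIGHBOUR_SOLICITATION ||
           type == ND_NEIGHBOUR_ADVERTISEMENT;
}

/* Classifies on whatever is being serialized. When "output" is the mask,
 * an exact-match mask such as 0xff is not an ND type, so nd() is skipped. */
bool emit_nd_by_output(const struct icmpv6_fields *key,
                       const struct icmpv6_fields *output)
{
    assert(key && output);
    return is_nd_type(output->type);
}

/* Classifies on the key, whichever of key or mask is being serialized. */
bool emit_nd_by_key(const struct icmpv6_fields *key,
                    const struct icmpv6_fields *output)
{
    assert(key && output);
    return is_nd_type(key->type);
}
```

      For a key of type 135 dumped together with an exact-match mask, only
      emit_nd_by_key() reports the flow as neighbour discovery.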
      
      ufid:bdfbe3e5-60c2-43b0-a5ff-dfcac1c37328, recirc_id(0),dp_hash(0/0),
      skb_priority(0/0),in_port(ovs-nm1),skb_mark(0/0),ct_state(0/0),
      ct_zone(0/0),ct_mark(0/0),ct_label(0/0),
      eth(src=00:00:00:00:00:00/00:00:00:00:00:00,
      dst=00:00:00:00:00:00/00:00:00:00:00:00),
      eth_type(0x86dd),
      ipv6(src=::/::,dst=::/::,label=0/0,proto=58,tclass=0/0,hlimit=0/0,frag=no),
      icmpv6(type=135,code=0),
      nd(target=2001::2/::,
      sll=00:00:00:00:00:00/00:00:00:00:00:00,
      tll=00:00:00:00:00:00/00:00:00:00:00:00),
      packets:10, bytes:860, used:0.504s, dp:ovs, actions:ovs-nm2
      
      Fixes: e6445719 (openvswitch: Restructure datapath.c and flow.c)
      Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
      Link: https://lore.kernel.org/r/20220328054148.3057-1-martinvarghesenokia@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  2. Mar 28, 2022
  3. Mar 26, 2022
  4. Mar 25, 2022
    • llc: only change llc->dev when bind() succeeds · 2d327a79
      Eric Dumazet authored
      My latest patch, attempting to fix the refcount leak in a minimal
      way, turned out to add a new bug.
      
      Whenever the bind operation fails before we attempt to grab
      a reference count on a device, we might release the device refcount
      of a prior successful bind() operation.
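
      The rule in the subject line can be sketched as follows. This is a
      user-space model under stated assumptions, not llc_ui_bind() itself:
      struct dev_model, struct sock_model, enum bind_fault and bind_model()
      are hypothetical names. The candidate device stays in a local pointer
      until the bind is known to succeed, and the error path only undoes
      what this call took.

```c
#include <assert.h>
#include <stddef.h>

struct dev_model { int refcnt; };
struct sock_model { struct dev_model *dev; };

enum bind_fault { BIND_OK, FAIL_BEFORE_HOLD, FAIL_AFTER_HOLD };

int bind_model(struct sock_model *sk, struct dev_model *candidate,
               enum bind_fault fault)
{
    struct dev_model *dev = NULL;   /* local until the bind succeeds */

    assert(sk && candidate);

    if (fault == FAIL_BEFORE_HOLD)
        goto out_err;               /* dev is still NULL: nothing to release */

    dev = candidate;
    dev->refcnt++;                  /* grab the reference */

    if (fault == FAIL_AFTER_HOLD)
        goto out_err;

    sk->dev = dev;                  /* publish only on success */
    return 0;

out_err:
    if (dev)
        dev->refcnt--;              /* undo only what this call took */
    return -1;
}

/* Usage: one successful bind followed by two failing ones. Returns 1 when
 * the first device is still attached and still holds exactly one reference. */
int failed_bind_keeps_first_device(void)
{
    struct dev_model first = { 0 }, second = { 0 };
    struct sock_model sk = { NULL };

    if (bind_model(&sk, &first, BIND_OK) != 0)
        return 0;
    if (bind_model(&sk, &second, FAIL_BEFORE_HOLD) == 0)
        return 0;
    if (bind_model(&sk, &second, FAIL_AFTER_HOLD) == 0)
        return 0;

    return sk.dev == &first && first.refcnt == 1 && second.refcnt == 0;
}
```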
      
      syzbot was not happy about this [1].
      
      Note to stable teams:
      
      Make sure commit b37a4668 ("netdevice: add the case if dev is NULL")
      is already present in your trees.
      
      [1]
      general protection fault, probably for non-canonical address 0xdffffc0000000070: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000380-0x0000000000000387]
      CPU: 1 PID: 3590 Comm: syz-executor361 Tainted: G        W         5.17.0-syzkaller-04796-g169e77764adc #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:llc_ui_connect+0x400/0xcb0 net/llc/af_llc.c:500
      Code: 80 3c 02 00 0f 85 fc 07 00 00 4c 8b a5 38 05 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d bc 24 80 03 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 a9 07 00 00 49 8b b4 24 80 03 00 00 4c 89 f2 48
      RSP: 0018:ffffc900038cfcc0 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff8880756eb600 RCX: 0000000000000000
      RDX: 0000000000000070 RSI: ffffc900038cfe3e RDI: 0000000000000380
      RBP: ffff888015ee5000 R08: 0000000000000001 R09: ffff888015ee5535
      R10: ffffed1002bdcaa6 R11: 0000000000000000 R12: 0000000000000000
      R13: ffffc900038cfe37 R14: ffffc900038cfe38 R15: ffff888015ee5012
      FS:  0000555555acd300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000280 CR3: 0000000077db6000 CR4: 00000000003506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       __sys_connect_file+0x155/0x1a0 net/socket.c:1900
       __sys_connect+0x161/0x190 net/socket.c:1917
       __do_sys_connect net/socket.c:1927 [inline]
       __se_sys_connect net/socket.c:1924 [inline]
       __x64_sys_connect+0x6f/0xb0 net/socket.c:1924
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f016acb90b9
      Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007ffd417947f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f016acb90b9
      RDX: 0000000000000010 RSI: 0000000020000140 RDI: 0000000000000003
      RBP: 00007f016ac7d0a0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f016ac7d130
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:llc_ui_connect+0x400/0xcb0 net/llc/af_llc.c:500
      
      Fixes: 764f4eb6 ("llc: fix netdevice reference leaks in llc_ui_bind()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: 赵子轩 <beraphin@gmail.com>
      Cc: Stoyan Manolov <smanolov@suse.de>
      Link: https://lore.kernel.org/r/20220325035827.360418-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. Mar 24, 2022
  6. Mar 23, 2022
  7. Mar 22, 2022
    • fs: allocate inode by using alloc_inode_sb() · fd60b288
      Muchun Song authored
      The inode allocation is supposed to use alloc_inode_sb(), so convert
      kmem_cache_alloc() of all filesystems to alloc_inode_sb().
      
      Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Acked-by: Theodore Ts'o <tytso@mit.edu>		[ext4]
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Alex Shi <alexs@kernel.org>
      Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Cc: Chao Yu <chao@kernel.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Fam Zheng <fam.zheng@bytedance.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kari Argillander <kari.argillander@gmail.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • net: bridge: mst: prevent NULL deref in br_mst_info_size() · cde3fc24
      Eric Dumazet authored
      Call br_mst_info_size() only if vg pointer is not NULL.
      
      general protection fault, probably for non-canonical address 0xdffffc0000000058: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x00000000000002c0-0x00000000000002c7]
      CPU: 0 PID: 975 Comm: syz-executor.0 Tainted: G        W         5.17.0-next-20220321-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:br_mst_info_size+0x97/0x270 net/bridge/br_mst.c:242
      Code: 00 00 31 c0 e8 ba 10 53 f9 31 c0 b9 40 00 00 00 4c 8d 6c 24 30 4c 89 ef f3 48 ab 48 8d 83 c0 02 00 00 48 89 04 24 48 c1 e8 03 <80> 3c 28 00 0f 85 ae 01 00 00 48 8b 83 c0 02 00 00 41 bf 04 00 00
      RSP: 0018:ffffc900153770a8 EFLAGS: 00010202
      RAX: 0000000000000058 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000040000 RSI: ffffffff88259876 RDI: ffffc900153772d8
      RBP: dffffc0000000000 R08: 0000000000000000 R09: ffffffff8db68957
      R10: ffffffff881f737b R11: 0000000000000000 R12: 0000000000000000
      R13: ffffc900153770d8 R14: 00000000000002a0 R15: 00000000ffffffff
      FS:  00007f18bbb6f700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020001a80 CR3: 000000001a7d9000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 00000000000000d8 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       br_get_link_af_size_filtered+0x6e9/0xc00 net/bridge/br_netlink.c:123
       rtnl_link_get_af_size net/core/rtnetlink.c:598 [inline]
       if_nlmsg_size+0x40c/0xa50 net/core/rtnetlink.c:1040
       rtnl_calcit.isra.0+0x25f/0x460 net/core/rtnetlink.c:3780
       rtnetlink_rcv_msg+0xa65/0xb80 net/core/rtnetlink.c:5937
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2496
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1921
       sock_sendmsg_nosec net/socket.c:705 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:725
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2413
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2467
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f18baa89049
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f18bbb6f168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f18bab9bf60 RCX: 00007f18baa89049
      RDX: 0000000000000000 RSI: 0000000020001a80 RDI: 0000000000000004
      RBP: 00007f18baae308d R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 00007ffeedb2be2f R14: 00007f18bbb6f300 R15: 0000000000022000
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:br_mst_info_size+0x97/0x270 net/bridge/br_mst.c:242
      Code: 00 00 31 c0 e8 ba 10 53 f9 31 c0 b9 40 00 00 00 4c 8d 6c 24 30 4c 89 ef f3 48 ab 48 8d 83 c0 02 00 00 48 89 04 24 48 c1 e8 03 <80> 3c 28 00 0f 85 ae 01 00 00 48 8b 83 c0 02 00 00 41 bf 04 00 00
      RSP: 0018:ffffc900153770a8 EFLAGS: 00010202
      RAX: 0000000000000058 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000040000 RSI: ffffffff88259876 RDI: ffffc900153772d8
      RBP: dffffc0000000000 R08: 0000000000000000 R09: ffffffff8db68957
      R10: ffffffff881f737b R11: 0000000000000000 R12: 0000000000000000
      R13: ffffc900153770d8 R14: 00000000000002a0 R15: 00000000ffffffff
      FS:  00007f18bbb6f700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b2ca22000 CR3: 000000001a7d9000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 00000000000000d8 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: 122c2948 ("net: bridge: mst: Support setting and reporting MST port states")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tobias Waldekranz <tobias@waldekranz.com>
      Cc: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20220322012314.795187-1-eric.dumazet@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • tipc: fix the timer expires after interval 100ms · 6a7d8cff
      Hoang Le authored
      In the timer callback function tipc_sk_timeout(), we try to
      reschedule another timeout to retransmit a setup request if the
      destination link is congested. But we pass the wrong timeout value,
      msecs_to_jiffies(100) instead of (jiffies + msecs_to_jiffies(100)),
      so the timer expires immediately rather than after the 100 ms
      interval described originally.
      
      In this commit we correct the timeout value in sk_reset_timer()
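
      The difference between a relative delay and an absolute expiry can be
      shown with a small user-space model. The tick rate HZ_MODEL and all
      *_model names are assumptions made for this sketch; time_reached()
      follows the wrap-safe comparison idea of the kernel's time_after_eq().

```c
#include <assert.h>
#include <stdbool.h>

#define HZ_MODEL 250UL   /* assumed tick rate, for illustration only */

/* Simplified conversion that rounds up to whole ticks. */
unsigned long msecs_to_jiffies_model(unsigned int ms)
{
    assert(ms <= 1000000u);   /* keep ms * HZ_MODEL far from overflow */
    return (ms * HZ_MODEL + 999UL) / 1000UL;
}

/* Wrap-safe "now is at or after expires". */
bool time_reached(unsigned long now, unsigned long expires)
{
    return (long)(now - expires) >= 0;
}

/* A bare delay used as an absolute expiry: it lies in the past as soon
 * as the tick counter has advanced beyond the delay itself. */
unsigned long expiry_delay_only(unsigned long now)
{
    (void)now;
    return msecs_to_jiffies_model(100);
}

/* Absolute expiry: current tick count plus the delay. */
unsigned long expiry_absolute(unsigned long now)
{
    return now + msecs_to_jiffies_model(100);
}
```

      With the counter at 1000000 ticks, the delay-only expiry is already
      reached, while the absolute one is reached 25 model ticks later.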
      
      Fixes: 67879274 ("tipc: buffer overflow handling in listener socket")
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
      Link: https://lore.kernel.org/r/20220321042229.314288-1-hoang.h.le@dektech.com.au
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: dsa: fix panic on shutdown if multi-chip tree failed to probe · 8fd36358
      Vladimir Oltean authored
      DSA probing is atypical because a tree of devices must probe all at
      once, so out of N switches which call dsa_tree_setup_routing_table()
      during probe, for (N - 1) of them, "complete" will return false and they
      will exit probing early. The Nth switch will set up the whole tree on
      their behalf.
      
      The implication is that for (N - 1) switches, the driver binds to the
      device successfully, without doing anything. When the driver is bound,
      the ->shutdown() method may run. But if the Nth switch has failed to
      initialize the tree, there is nothing to do for the (N - 1) driver
      instances, since the slave devices have not been created, etc. Moreover,
      dsa_switch_shutdown() expects that the calling @ds has been in fact
      initialized, so it jumps at dereferencing the various data structures,
      which is incorrect.
      
      Avoid the ensuing NULL pointer dereferences by simply checking whether
      the Nth switch has previously set "ds->setup = true" for the switch
      which is currently shutting down. The entire setup is serialized under
      dsa2_mutex which we already hold.
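
      The guard described above amounts to an early return. In this sketch
      struct switch_model, struct tree_model and switch_shutdown_model() are
      hypothetical, and the locking under dsa2_mutex is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct tree_model { int ports_closed; };

struct switch_model {
    bool setup;                 /* set once the whole tree was initialized */
    struct tree_model *tree;    /* may be NULL when setup never happened */
};

/* Returns true if teardown work was actually performed. */
bool switch_shutdown_model(struct switch_model *ds)
{
    assert(ds != NULL);

    if (!ds->setup)
        return false;           /* bound but never set up: nothing to undo */

    ds->tree->ports_closed++;   /* safe: setup implies the tree exists */
    return true;
}
```

      A switch that bound successfully but was never set up is skipped
      without touching its NULL tree pointer.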
      
      Fixes: 0650bf52 ("net: dsa: be compatible with masters which unregister on shutdown")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220318195443.275026-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • openvswitch: always update flow key after nat · 60b44ca6
      Aaron Conole authored
      During NAT, a tuple collision may occur.  When this happens, openvswitch
      will make a second pass through NAT which will perform additional packet
      modification.  This will update the skb data, but not the flow key that
      OVS uses.  This means that future flow lookups, and packet matches will
      have incorrect data.  This has been supported since
      5d50aa83 ("openvswitch: support asymmetric conntrack").
      
      That commit failed to properly update the sw_flow_key attributes, since
      it only called the ovs_ct_nat_update_key once, rather than each time
      ovs_ct_nat_execute was called.  As these two operations are linked, the
      ovs_ct_nat_execute() function should always make sure that the
      sw_flow_key is updated after a successful call through NAT infrastructure.
      
      Fixes: 5d50aa83 ("openvswitch: support asymmetric conntrack")
      Cc: Dumitru Ceara <dceara@redhat.com>
      Cc: Numan Siddique <nusiddiq@redhat.com>
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Acked-by: Eelco Chaudron <echaudro@redhat.com>
      Link: https://lore.kernel.org/r/20220318124319.3056455-1-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  8. Mar 21, 2022
    • net/tls: optimize judgement processes in tls_set_device_offload() · b1a6f56b
      Ziyang Xuan authored
      
      When TLS TX/RX offload is configured via setsockopt(), HW offload is
      tried first. The check for whether the netdevice supports
      NETIF_F_HW_TLS_TX currently happens late in tls_set_device_offload(),
      after some memory allocations have already been done. If the netdevice
      does not support NETIF_F_HW_TLS_TX, that memory must be released before
      returning an error, which is redundant work.

      Move the NETIF_F_HW_TLS_TX check forward, and move the
      start_marker_record and offload_ctx memory allocations back slightly.
      This gives a simpler error-handling path.
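
      The reordering can be illustrated with a check-before-allocate sketch.
      FEATURE_TLS_TX_MODEL, set_offload_model() and the two 64-byte buffers
      are hypothetical stand-ins; only the ordering is the point, not the
      real tls_set_device_offload() logic.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define FEATURE_TLS_TX_MODEL 0x1u   /* hypothetical capability bit */

/* Counts successful allocations in *allocs_made so callers can observe
 * that the unsupported case returns before anything was allocated. */
int set_offload_model(unsigned int features, int *allocs_made)
{
    void *ctx, *rec;

    assert(allocs_made != NULL);

    /* Capability check first: no cleanup is needed on this path. */
    if (!(features & FEATURE_TLS_TX_MODEL))
        return -EOPNOTSUPP;

    ctx = malloc(64);
    if (!ctx)
        return -ENOMEM;
    (*allocs_made)++;

    rec = malloc(64);
    if (!rec) {
        free(ctx);
        return -ENOMEM;
    }
    (*allocs_made)++;

    /* A real implementation would hand these over; the model frees them. */
    free(rec);
    free(ctx);
    return 0;
}
```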
      
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/tls: remove unnecessary jump instructions in do_tls_setsockopt_conf() · 1ddcbfbf
      Ziyang Xuan authored
      
      Avoid using "goto" jump instruction unconditionally when we
      can return directly. Remove unnecessary jump instructions in
      do_tls_setsockopt_conf().
      
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: ensure PMTU updates are processed during fastopen · ed0c99dc
      Jakub Kicinski authored
      tp->rx_opt.mss_clamp is not populated yet during TFO send, so we
      raise it to the local MSS. tp->mss_cache is not updated, however:
      
      tcp_v6_connect():
        tp->rx_opt.mss_clamp = IPV6_MIN_MTU - headers;
        tcp_connect():
           tcp_connect_init():
             tp->mss_cache = min(mtu, tp->rx_opt.mss_clamp)
           tcp_send_syn_data():
             tp->rx_opt.mss_clamp = tp->advmss
      
      After recent fixes to ICMPv6 PTB handling we started dropping
      PMTU updates higher than tp->mss_cache. Because of the stale
      tp->mss_cache value PMTU updates during TFO are always dropped.
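
      A worked model of the stale value, with simplifications that are
      assumptions of this sketch: MSS is taken as MTU minus a fixed 60 bytes
      of IPv6 and TCP headers, the packet-too-big filter is reduced to "drop
      anything that would not lower the cached MSS", and the resync step is
      only one plausible repair. All *_model names and the 1500/1400 byte
      MTUs are hypothetical.

```c
#include <assert.h>

#define IPV6_MIN_MTU_MODEL 1280u
#define HEADERS_MODEL      60u    /* 40-byte IPv6 + 20-byte TCP, no options */

struct tp_model {
    unsigned int mss_clamp;
    unsigned int mss_cache;
    unsigned int advmss;
};

static unsigned int min_u(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Follows the call sequence quoted in the commit message. */
void tfo_connect_model(struct tp_model *tp, unsigned int mtu, int resync)
{
    assert(mtu > HEADERS_MODEL);

    tp->mss_clamp = IPV6_MIN_MTU_MODEL - HEADERS_MODEL;      /* tcp_v6_connect() */
    tp->mss_cache = min_u(mtu - HEADERS_MODEL, tp->mss_clamp); /* tcp_connect_init() */
    tp->mss_clamp = tp->advmss;                              /* tcp_send_syn_data() */

    if (resync)   /* assumed repair: recompute the cache after raising the clamp */
        tp->mss_cache = min_u(mtu - HEADERS_MODEL, tp->mss_clamp);
}

/* Simplified filter: drop updates that would not lower the cached MSS. */
int ptb_is_dropped(const struct tp_model *tp, unsigned int ptb_mtu)
{
    return ptb_mtu - HEADERS_MODEL >= tp->mss_cache;
}

/* Usage: 1500-byte local MTU, then a packet-too-big report of 1400 bytes. */
int ptb_dropped_after_tfo(int resync)
{
    struct tp_model tp = { .advmss = 1500u - HEADERS_MODEL };

    tfo_connect_model(&tp, 1500u, resync);
    return ptb_is_dropped(&tp, 1400u);
}
```

      Without the resync the cache stays at 1220, so the 1400-byte report
      (MSS 1340) is dropped; with the cache at 1440 it is processed.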
      
      Thanks to Wei for helping zero in on the problem and the fix!
      
      Fixes: c7bb4b89 ("ipv6: tcp: drop silly ICMPv6 packet too big messages")
      Reported-by: Andre Nash <alnash@fb.com>
      Reported-by: Neil Spring <ntspring@fb.com>
      Reviewed-by: Wei Wang <weiwan@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220321165957.1769954-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: Revert the softirq will run annotation in ____napi_schedule(). · 351bdbb6
      Sebastian Andrzej Siewior authored
      The lockdep annotation lockdep_assert_softirq_will_run() expects that
      either hard or soft interrupts are disabled because both guarantee that
      the "raised" soft-interrupts will be processed once the context is left.

      This triggers in flush_smp_call_function_from_idle(), but in this case it
      explicitly calls do_softirq() in case of pending softirqs.
      
      Revert the "softirq will run" annotation in ____napi_schedule() and move
      the check back to __netif_rx() as it was. Keep the IRQ-off assert in
      ____napi_schedule() because this is always required.
      
      Fixes: fbd9a2ce ("net: Add lockdep asserts to ____napi_schedule().")
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Link: https://lore.kernel.org/r/YjhD3ZKWysyw8rc6@linutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • devlink: hold the instance lock during eswitch_mode callbacks · 14e426bf
      Jakub Kicinski authored
      
      Make the devlink core hold the instance lock during eswitch_mode
      callbacks. Cheat in case of mlx5 (see the cover letter).
      
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: add explicitly locked flavor of the rate node APIs · 8879b32a
      Jakub Kicinski authored
      
      We'll need an explicitly locked rate node API for netdevsim
      to switch eswitch mode setting to locked.
      
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlabel: fix out-of-bounds memory accesses · f22881de
      Wang Yufen authored
      
      In calipso_map_cat_ntoh(), in the for loop, if the return value of
      netlbl_bitmap_walk() is equal to (net_clen_bits - 1), when
      netlbl_bitmap_walk() is called next time, out-of-bounds memory accesses
      of bitmap[byte_offset] occurs.
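
      The failure mode is a walk that starts at an offset equal to the
      bitmap length. A bounds-checked walk can be sketched like this;
      bitmap_walk_model() is a hypothetical user-space function, written
      MSB-first as an assumption, and is not the kernel's
      netlbl_bitmap_walk().

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first bit at or after "offset" whose value
 * matches "state" (zero or non-zero), or -1 if there is none.
 * bitmap_len is given in bits. */
int bitmap_walk_model(const unsigned char *bitmap, unsigned int bitmap_len,
                      unsigned int offset, int state)
{
    unsigned int bit;

    assert(bitmap != NULL);

    /* Starting at or past the end must not read bitmap[offset / 8]. */
    if (offset >= bitmap_len)
        return -1;

    for (bit = offset; bit < bitmap_len; bit++) {
        unsigned char mask = (unsigned char)(0x80u >> (bit % 8));
        int is_set = (bitmap[bit / 8] & mask) != 0;

        if (is_set == (state != 0))
            return (int)bit;
    }
    return -1;
}
```

      For a one-byte bitmap whose only set bit is the last one, the first
      walk returns 7 and the follow-up walk from offset 8 returns -1
      without touching a second byte.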
      
      The bug was found during fuzzing. The following is the fuzzing report:
       BUG: KASAN: slab-out-of-bounds in netlbl_bitmap_walk+0x3c/0xd0
       Read of size 1 at addr ffffff8107bf6f70 by task err_OH/252
      
       CPU: 7 PID: 252 Comm: err_OH Not tainted 5.17.0-rc7+ #17
       Hardware name: linux,dummy-virt (DT)
       Call trace:
        dump_backtrace+0x21c/0x230
        show_stack+0x1c/0x60
        dump_stack_lvl+0x64/0x7c
        print_address_description.constprop.0+0x70/0x2d0
        __kasan_report+0x158/0x16c
        kasan_report+0x74/0x120
        __asan_load1+0x80/0xa0
        netlbl_bitmap_walk+0x3c/0xd0
        calipso_opt_getattr+0x1a8/0x230
        calipso_sock_getattr+0x218/0x340
        calipso_sock_getattr+0x44/0x60
        netlbl_sock_getattr+0x44/0x80
        selinux_netlbl_socket_setsockopt+0x138/0x170
        selinux_socket_setsockopt+0x4c/0x60
        security_socket_setsockopt+0x4c/0x90
        __sys_setsockopt+0xbc/0x2b0
        __arm64_sys_setsockopt+0x6c/0x84
        invoke_syscall+0x64/0x190
        el0_svc_common.constprop.0+0x88/0x200
        do_el0_svc+0x88/0xa0
        el0_svc+0x128/0x1b0
        el0t_64_sync_handler+0x9c/0x120
        el0t_64_sync+0x16c/0x170
      
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Wang Yufen <wangyufen@huawei.com>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ax25: Fix NULL pointer dereferences in ax25 timers · fc6d01ff
      Duoming Zhou authored
      The previous commit 7ec02f5a ("ax25: fix NPD bug in ax25_disconnect")
      moved ax25_disconnect() into lock_sock() in order to prevent NPD bugs. But
      there are race conditions that may lead to null pointer dereferences in
      ax25_heartbeat_expiry(), ax25_t1timer_expiry(), ax25_t2timer_expiry(),
      ax25_t3timer_expiry() and ax25_idletimer_expiry(), when we use
      ax25_kill_by_device() to detach the ax25 device.
      
      One of the race conditions that cause null pointer dereferences can be
      shown as below:
      
            (Thread 1)                    |      (Thread 2)
      ax25_connect()                      |
       ax25_std_establish_data_link()     |
        ax25_start_t1timer()              |
         mod_timer(&ax25->t1timer,..)     |
                                          | ax25_kill_by_device()
         (wait a time)                    |  ...
                                          |  s->ax25_dev = NULL; //(1)
         ax25_t1timer_expiry()            |
          ax25->ax25_dev->values[..] //(2)|  ...
           ...                            |
      
      We set ax25_cb->ax25_dev to NULL in position (1) and dereference
      the null pointer in position (2).
      
      The corresponding fail log is shown below:
      ===============================================================
      BUG: kernel NULL pointer dereference, address: 0000000000000050
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc6-00794-g45690b7d0
      RIP: 0010:ax25_t1timer_expiry+0x12/0x40
      ...
      Call Trace:
       call_timer_fn+0x21/0x120
       __run_timers.part.0+0x1ca/0x250
       run_timer_softirq+0x2c/0x60
       __do_softirq+0xef/0x2f3
       irq_exit_rcu+0xb6/0x100
       sysvec_apic_timer_interrupt+0xa2/0xd0
      ...
      
      This patch moves ax25_disconnect() before s->ax25_dev = NULL
      and uses del_timer_sync() to delete timers in ax25_disconnect().
      If ax25_disconnect() is called by ax25_kill_by_device() or
      ax25->ax25_dev is NULL, the reason passed to ax25_disconnect() is
      ENETUNREACH, and it waits for all timers to stop before we
      set s->ax25_dev to NULL in ax25_kill_by_device().
      
      Fixes: 7ec02f5a ("ax25: fix NPD bug in ax25_disconnect")
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ax25: Fix refcount leaks caused by ax25_cb_del() · 9fd75b66
      Duoming Zhou authored
      The previous commit d01ffb9e ("ax25: add refcount in ax25_dev to
      avoid UAF bugs") and commit feef318c ("ax25: fix UAF bugs of
      net_device caused by rebinding operation") increase the refcounts of
      ax25_dev and net_device in ax25_bind() and decrease the matching refcounts
      in ax25_kill_by_device() in order to prevent UAF bugs, but there are
      reference count leaks.
      
      The root cause of refcount leaks is shown below:
      
           (Thread 1)                      |      (Thread 2)
      ax25_bind()                          |
       ...                                 |
       ax25_addr_ax25dev()                 |
        ax25_dev_hold()   //(1)            |
        ...                                |
       dev_hold_track()   //(2)            |
       ...                                 | ax25_destroy_socket()
                                           |  ax25_cb_del()
                                           |   ...
                                           |   hlist_del_init() //(3)
                                           |
                                           |
           (Thread 3)                      |
      ax25_kill_by_device()                |
       ...                                 |
       ax25_for_each(s, &ax25_list) {      |
        if (s->ax25_dev == ax25_dev) //(4) |
         ...                               |
      
      Firstly, we use ax25_bind() to increase the refcount of ax25_dev in
      position (1) and increase the refcount of net_device in position (2).
      Then, we use ax25_cb_del() invoked by ax25_destroy_socket() to delete
      ax25_cb in hlist in position (3) before calling ax25_kill_by_device().
      Finally, the decrements of refcounts in ax25_kill_by_device() will not
      be executed, because no s->ax25_dev equals ax25_dev in position (4).

      This patch adds decrements of refcounts in ax25_release() and uses
      lock_sock() for synchronization. If the refcounts are decreased in
      ax25_release(), the decrements in ax25_kill_by_device() will not be
      executed, and vice versa.
      
      Fixes: d01ffb9e ("ax25: add refcount in ax25_dev to avoid UAF bugs")
      Fixes: 87563a04 ("ax25: fix reference count leaks of ax25_dev")
      Fixes: feef318c ("ax25: fix UAF bugs of net_device caused by rebinding operation")
      Reported-by: Thomas Osterried <thomas@osterried.de>
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. Mar 20, 2022
  10. Mar 19, 2022