Commits · 9b9311af4e8657be85bb1d083c531cfb6afb0d9c · Vaishnav Achath / Linux

Aug 06, 2021

David S. Miller authored 3 years ago

Vladimir Oltean says:

====================
Always flood multicast to the DSA CPU port

Discussing with Qingfang, it became obvious that DSA is not prepared to
disable multicast flooding towards the CPU port under any circumstance
right now, and this in fact breaks traffic quite blatantly.

This series is a revert done in reverse chronological order. These
should be propagated to stable trees up to commit a8b659e7

 ("net:
dsa: act as passthrough for bridge port flags") which is in v5.12.
For older kernels, that commit blocks further backporting, so I need to
send a modified version of patch 3 separately to Greg after these go
into "net".

v1->v2: delete unused b53_set_mrouter function prototype
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

9b9311af

net: dsa: don't disable multicast flooding to the CPU even without an IGMP querier · c73c5708

Vladimir Oltean authored 3 years ago

Commit 08cc83cc ("net: dsa: add support for BRIDGE_MROUTER
attribute") added an option for users to turn off multicast flooding
towards the CPU if they turn off the IGMP querier on a bridge which
already has enslaved ports (echo 0 > /sys/class/net/br0/bridge/multicast_router).

And commit a8b659e7 ("net: dsa: act as passthrough for bridge port flags")
simply papered over that issue, because it moved the decision to flood
the CPU with multicast (or not) from the DSA core down to individual drivers,
instead of taking a more radical position then.

The truth is that disabling multicast flooding to the CPU is simply
something we are not prepared to do now, if at all. Some reasons:

- ICMP6 neighbor solicitation messages are unregistered multicast
  packets as far as the bridge is concerned. So if we stop flooding
  multicast, the outside world cannot ping the bridge device's IPv6
  link-local address.

- There might be foreign interfaces bridged with our DSA switch ports
  (sending a packet towards the host does not necessarily equal
  termination, but maybe software forwarding). So if there is no one
  interested in that multicast traffic in the local network stack, that
  doesn't mean nobody is.

- PTP over L4 (IPv4, IPv6) is multicast, but is unregistered as far as
  the bridge is concerned. This should reach the CPU port.

- The switch driver might not do FDB partitioning. And since we don't
  even bother to do more fine-grained flood disabling (such as "disable
  flooding _from_port_N_ towards the CPU port" as opposed to "disable
  flooding _from_any_port_ towards the CPU port"), this breaks standalone
  ports, or even multiple bridges where one has an IGMP querier and one
  doesn't.

Reverting the logic makes all of the above work.

Fixes: a8b659e7 ("net: dsa: act as passthrough for bridge port flags")
Fixes: 08cc83cc

 ("net: dsa: add support for BRIDGE_MROUTER attribute")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c73c5708

net: dsa: mt7530: remove the .port_set_mrouter implementation · cbbf09b5

Vladimir Oltean authored 3 years ago

DSA's idea of optimizing out multicast flooding to the CPU port leaves
quite a few holes open, so it should be reverted.

The mt7530 driver is the only new driver which added a .port_set_mrouter
implementation after the reorg from commit a8b659e7 ("net: dsa: act
as passthrough for bridge port flags"), so it needs to be reverted
separately so that the other revert commit can go a bit further down the
git history.

Fixes: 5a30833b

 ("net: dsa: mt7530: support MDB and bridge flag operations")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cbbf09b5

net: dsa: stop syncing the bridge mcast_router attribute at join time · 7df4e744

Vladimir Oltean authored 3 years ago

Qingfang points out that when a bridge with the default settings is
created and a port joins it:

ip link add br0 type bridge
ip link set swp0 master br0

DSA calls br_multicast_router() on the bridge to see if the br0 device
is a multicast router port, and if it is, it enables multicast flooding
to the CPU port, otherwise it disables it.

If we look through the multicast_router_show() sysfs or at the
IFLA_BR_MCAST_ROUTER netlink attribute, we see that the default mrouter
attribute for the bridge device is "1" (MDB_RTR_TYPE_TEMP_QUERY).

However, br_multicast_router() will return "0" (MDB_RTR_TYPE_DISABLED),
because an mrouter port in the MDB_RTR_TYPE_TEMP_QUERY state may not be
actually _active_ until it receives an actual IGMP query. So, the
br_multicast_router() function should really have been called
br_multicast_router_active() perhaps.

When/if an IGMP query is received, the bridge device will transition via
br_multicast_mark_router() into the active state until the
ip4_mc_router_timer expires after an multicast_querier_interval.

Of course, this does not happen if the bridge is created with an
mcast_router attribute of "2" (MDB_RTR_TYPE_PERM).

The point is that in lack of any IGMP query messages, and in the default
bridge configuration, unregistered multicast packets will not be able to
reach the CPU port through flooding, and this breaks many use cases
(most obviously, IPv6 ND, with its ICMP6 neighbor solicitation multicast
messages).

Leave the multicast flooding setting towards the CPU port down to a driver
level decision.

Fixes: 010e269f

 ("net: dsa: sync up switchdev objects and port attributes when joining the bridge")
Reported-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7df4e744

net: ethernet: ti: am65-cpsw: use napi_complete_done() in TX completion · 3bacbe04

Grygorii Strashko authored 3 years ago

This patch enables support for hard irqs deferral feature from Eric Dumazet
[1] for TI K3 CPSW driver by using napi_complete_done() in TX completion
path.

Depending on gro_flush_timeout and napi_defer_hard_irqs at gives up to 30%
CPU utilization reduction:

gro_flush_timeout=50000
napi_defer_hard_irqs=2

netperf -l 10 -H 192.168.1.1  -t UDP_STREAM -c -C -- -m 1470
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET
Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB

before:
212992    1470   10.00      809632      0      952.0     42.98    14.792
212992           10.00      809630             952.0     50.66    8.719

after:
212992    1470   10.00      813686      0      956.8     32.14    11.009
212992           10.00      813686             956.8     50.05    8.570

[1] https://lore.kernel.org/netdev/20200422161329.56026-1-edumazet@google.com/



Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3bacbe04

net: ti: am65-cpsw-nuss: fix RX IRQ state after .ndo_stop() · 47bfc4d1

Vignesh Raghavendra authored 3 years ago

On TI K3 am64x platform the issue with RX IRQ is observed - it's become
disabled forever after .ndo_stop(). The K3 CPSW driver manipulates RX IRQ
by using standard Linux enable_irq()/disable_irq_nosync() API as there is
no IRQ enable/disable options in CPSW HW itself, as result during
.ndo_stop() following sequence happens

  phy_stop()
  teardown TX/RX channels
  wait for TX tdown complete
  napi_disable(TX)
  clean up TX channels

  (a)

  napi_disable(RX)

At point (a) it's not possible to predict if RX IRQ was triggered or not.
if RX IRQ was triggered then it also not possible to definitely say if RX
NAPI was run or only scheduled and immediately canceled by
napi_disable(RX). Actually the last case causes RX IRQ to be permanently
disabled.

Another observed issue is that RX IRQ enable counter become unbalanced if
(gro_flush_timeout =! 0) while (napi_defer_hard_irqs == 0):

Unbalanced enable for IRQ 44
WARNING: CPU: 0 PID: 10 at ../kernel/irq/manage.c:776 __enable_irq+0x38/0x80
__enable_irq+0x38/0x80
enable_irq+0x54/0xb0
am65_cpsw_nuss_rx_poll+0x2f4/0x368
__napi_poll+0x34/0x1b8
net_rx_action+0xe4/0x220
_stext+0x11c/0x284
run_ksoftirqd+0x4c/0x60

To avoid above issues introduce flag indicating if RX was actually disabled
before enabling it in am65_cpsw_nuss_rx_poll() and restore RX IRQ state in
.ndo_open()

Fixes: 4f7cce27

 ("net: ethernet: ti: am65-cpsw: add support for am64x cpsw3g")
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

47bfc4d1

Merge branch 'ptp-ocp-fixes' · 370cb73a

David S. Miller authored 3 years ago


Jonathan Lemon says:

====================
ptp: ocp: assorted fixes.

Assorted fixes for the ocp timecard.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

370cb73a

ptp: ocp: Remove pending_image indicator from devlink · 8ef8ccbc

Jonathan Lemon authored 3 years ago

After writing an image blob to the flash memory, a reboot is required
to reload the FPGA. There is no versioning prsent in the FPGA image
file, so only a running version is available. The 'stored version'
was set to 'pending' in order to indicate a reboot was needed.

This isn't reliable, as the module could be unloaded/loaded, losing
the "reboot needed" indicator. Also, the devlink 'stored version'
information is designed to refer to the actual image version.

Unfortunately, there is no method to determine the flash image version
other than booting it, so remove the devlink stored version setting.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ef8ccbc

ptp: ocp: Rename version string shown by devlink. · 1a052da9

Jonathan Lemon authored 3 years ago


The TimeCard has two FPGA images in the flash: the actual firmware,
and a manufacturing fallback version which is intended to act as a
loader in case the flash update failed.

Name these "fw" and "loader", which are reflected in devlink:

    [root@timecard drv]# devlink dev info
    pci/0000:04:00.0:
      driver ptp_ocp
      serial_number fc:c2:3d:2e:d7:c0
      versions:
          running:
            fw 5

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1a052da9

ptp: ocp: Use 'gnss' naming instead of 'gps' · ef0cfb34

Jonathan Lemon authored 3 years ago


GPS is not the only available positioning system.  Use the generic
naming of "GNSS" instead.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ef0cfb34

ptp: ocp: Remove devlink health and unused parameters. · 37a156ba

Jonathan Lemon authored 3 years ago


"devlink health" was used as a way to monitor the GNSS signal
status.  This isn't really the intended use, and the same
functionality can be achived by monitoring the status file.

Remove the devlink heath support entirely, and also remove the
currently unused devlink parameters.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

37a156ba

ptp: ocp: Add the mapping for the external PPS registers. · 0d43d4f2

Jonathan Lemon authored 3 years ago


There are two PPS blocks: one handles the external PPS signal output,
with the other handling the PPS signal input to the internal clock.
Add controls for the external PPS block.

Rename the fields so they match their function.

Add cable_delay to the register definitions.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0d43d4f2

ptp: ocp: Fix the error handling path for the class device. · d12f23fa

Jonathan Lemon authored 3 years ago


Move the put_device() call to the error handling path, so the
device is released after the .release callback, avoiding a
use-after-free.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d12f23fa

ethtool: return error from ethnl_ops_begin if dev is NULL · 596690e9

Heiner Kallweit authored 3 years ago

Julian reported that after d43c65b0 Coverity complains about a
missing check whether dev is NULL in ethnl_ops_complete().
There doesn't seem to be any valid case where dev could be NULL when
calling ethnl_ops_begin(), therefore return an error if dev is NULL.

Fixes: d43c65b0

 ("ethtool: runtime-resume netdev parent in ethnl_ops_begin")
Reported-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

596690e9

netdevsim: Protect both reload_down and reload_up paths · 5c0418ed

Leon Romanovsky authored 3 years ago

Don't progress with adding and deleting ports as long as devlink
reload is running.

Fixes: 23809a72

 ("netdevsim: Forbid devlink reload when adding or deleting ports")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5c0418ed

Merge branch 'cpsw-emac-skb_put_padto' · a5516053

David S. Miller authored 3 years ago

Grygorii Strashko says:

====================
net: ethernet: ti: cpsw/emac: switch to use skb_put_padto()

Now frame padding in TI TI CPSW/EMAC is implemented in a bit of entangled way as
frame SKB padded in drivers (without skb->len) while frame length fixed in CPDMA.
Things became even more confusing hence CPSW switcdev driver need to perform min
TX frame length correction in switch mode [1].

To avoid further confusion, make xmit path more clear and linear, and avoid
updating CPDMA configuration interface for min TX frame length correction
(which is not CPDMA job in general) this series switches TI CPSW/EMAC
drivers to skb_put_padto() instead of skb_padto() in their xmit path, so
skb->len also got updated properly and then removes TX frame length
fixup from CPDMA code.

[1] https://patchwork.kernel.org/project/netdevbpf/patch/20210611132732.10690-1-grygorii.strashko@ti.com/


====================

Signed-off-by: David S. Miller <davem@davemloft.net>

a5516053

net: ethernet: ti: davinci_cpdma: drop frame padding · 9ffc513f

Grygorii Strashko authored 3 years ago


Hence all users of davinci_cpdma switched to skb_put_padto() the frame
padding can be removed from it.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9ffc513f

net: ethernet: ti: davinci_emac: switch to use skb_put_padto() · 61e7a22d

Grygorii Strashko authored 3 years ago


Use skb_put_padto() instead of skb_padto() so skb->len also got updated, as
preparation for further removing frame padding from cpdma.
It also makes xmit path more clear and linear.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

61e7a22d

net: ethernet: ti: cpsw: switch to use skb_put_padto() · 1f88d5d5

Grygorii Strashko authored 3 years ago


Use skb_put_padto() instead of skb_padto() so skb->len also got updated, as
preparation for further removing frame padding from cpdma.
It also makes xmit path more clear and linear.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1f88d5d5

Aug 05, 2021

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 0ca8d3ca

Jakub Kicinski authored 3 years ago

Build failure in drivers/net/wwan/mhi_wwan_mbim.c:
add missing parameter (0, assuming we don't want buffer pre-alloc).

Conflict in drivers/net/dsa/sja1105/sja1105_main.c between:
  589918df ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
  0fac6aa0 ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode")

Follow the instructions from the commit message of the former commit
- removed the if conditions. When looking at commit 589918df

 ("net:
dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
note that the mask_iotag fields get removed by the following patch.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

0ca8d3ca

Merge tag 'net-5.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 902e7f37

Linus Torvalds authored 3 years ago

Pull networking fixes from Jakub Kicinski:
 "Including fixes from ipsec.

  Current release - regressions:

   - sched: taprio: fix init procedure to avoid inf loop when dumping

   - sctp: move the active_key update after sh_keys is added

  Current release - new code bugs:

   - sparx5: fix build with old GCC & bitmask on 32-bit targets

  Previous releases - regressions:

   - xfrm: redo the PREEMPT_RT RCU vs hash_resize_mutex deadlock fix

   - xfrm: fixes for the compat netlink attribute translator

   - phy: micrel: Fix detection of ksz87xx switch

  Previous releases - always broken:

   - gro: set inner transport header offset in tcp/udp GRO hook to avoid
     crashes when such packets reach GSO

   - vsock: handle VIRTIO_VSOCK_OP_CREDIT_REQUEST, as required by spec

   - dsa: sja1105: fix static FDB entries on SJA1105P/Q/R/S and SJA1110

   - bridge: validate the NUD_PERMANENT bit when adding an extern_learn
     FDB entry

   - usb: lan78xx: don't modify phy_device state concurrently

   - usb: pegasus: check for errors of IO routines"

* tag 'net-5.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (48 commits)
  net: vxge: fix use-after-free in vxge_device_unregister
  net: fec: fix use-after-free in fec_drv_remove
  net: pegasus: fix uninit-value in get_interrupt_interval
  net: ethernet: ti: am65-cpsw: fix crash in am65_cpsw_port_offload_fwd_mark_update()
  bnx2x: fix an error code in bnx2x_nic_load()
  net: wwan: iosm: fix recursive lock acquire in unregister
  net: wwan: iosm: correct data protocol mask bit
  net: wwan: iosm: endianness type correction
  net: wwan: iosm: fix lkp buildbot warning
  net: usb: lan78xx: don't modify phy_device state concurrently
  docs: networking: netdevsim rules
  net: usb: pegasus: Remove the changelog and DRIVER_VERSION.
  net: usb: pegasus: Check the return value of get_geristers() and friends;
  net/prestera: Fix devlink groups leakage in error flow
  net: sched: fix lockdep_set_class() typo error for sch->seqlock
  net: dsa: qca: ar9331: reorder MDIO write sequence
  VSOCK: handle VIRTIO_VSOCK_OP_CREDIT_REQUEST
  mptcp: drop unused rcu member in mptcp_pm_addr_entry
  net: ipv6: fix returned variable type in ip6_skb_dst_mtu
  nfp: update ethtool reporting of pauseframe control
  ...

902e7f37

Bluetooth: defer cleanup of resources in hci_unregister_dev() · e0448092

Tetsuo Handa authored 3 years ago

syzbot is hitting might_sleep() warning at hci_sock_dev_event() due to
calling lock_sock() with rw spinlock held [1].

It seems that history of this locking problem is a trial and error.

Commit b40df574 ("[PATCH] bluetooth: fix socket locking in
hci_sock_dev_event()") in 2.6.21-rc4 changed bh_lock_sock() to
lock_sock() as an attempt to fix lockdep warning.

Then, commit 4ce61d1c ("[BLUETOOTH]: Fix locking in
hci_sock_dev_event().") in 2.6.22-rc2 changed lock_sock() to
local_bh_disable() + bh_lock_sock_nested() as an attempt to fix the
sleep in atomic context warning.

Then, commit 4b5dd696 ("Bluetooth: Remove local_bh_disable() from
hci_sock.c") in 3.3-rc1 removed local_bh_disable().

Then, commit e305509e ("Bluetooth: use correct lock to prevent UAF
of hdev object") in 5.13-rc5 again changed bh_lock_sock_nested() to
lock_sock() as an attempt to fix CVE-2021-3573.

This difficulty comes from current implementation that
hci_sock_dev_event(HCI_DEV_UNREG) is responsible for dropping all
references from sockets because hci_unregister_dev() immediately
reclaims resources as soon as returning from
hci_sock_dev_event(HCI_DEV_UNREG).

But the history suggests that hci_sock_dev_event(HCI_DEV_UNREG) was not
doing what it should do.

Therefore, instead of trying to detach sockets from device, let's accept
not detaching sockets from device at hci_sock_dev_event(HCI_DEV_UNREG),
by moving actual cleanup of resources from hci_unregister_dev() to
hci_cleanup_dev() which is called by bt_host_release() when all
references to this unregistered device (which is a kobject) are gone.

Since hci_sock_dev_event(HCI_DEV_UNREG) no longer resets
hci_pi(sk)->hdev, we need to check whether this device was unregistered
and return an error based on HCI_UNREGISTER flag.  There might be subtle
behavioral difference in "monitor the hdev" functionality; please report
if you found something went wrong due to this patch.

Link: https://syzkaller.appspot.com/bug?extid=a5df189917e79d5e59c9

 [1]
Reported-by: syzbot <syzbot+a5df189917e79d5e59c9@syzkaller.appspotmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: e305509e

 ("Bluetooth: use correct lock to prevent UAF of hdev object")
Acked-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e0448092

Merge tag 'selinux-pr-20210805' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 0b53abfc

Linus Torvalds authored 3 years ago

Pull selinux fix from Paul Moore:
 "One small SELinux fix for a problem where an error code was not being
  propagated back up to userspace when a bogus SELinux policy is loaded
  into the kernel"

* tag 'selinux-pr-20210805' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: correct the return value when loads initial sids

0b53abfc

Merge branch 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · 6209049e

Linus Torvalds authored 3 years ago

Pull ucounts fix from Eric Biederman:
 "Fix a subtle locking versus reference counting bug in the ucount
  changes, found by syzbot"

* 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: Fix race condition between alloc_ucounts and put_ucounts

6209049e

Merge tag 'trace-v5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 3c3e9027

Linus Torvalds authored 3 years ago

Pull tracing fixes from Steven Rostedt:
 "Various tracing fixes:

   - Fix NULL pointer dereference caused by an error path

   - Give histogram calculation fields a size, otherwise it breaks
     synthetic creation based on them.

   - Reject strings being used for number calculations.

   - Fix recordmcount.pl warning on llvm building RISC-V allmodconfig

   - Fix the draw_functrace.py script to handle the new trace output

   - Fix warning of smp_processor_id() in preemptible code"

* tag 'trace-v5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Quiet smp_processor_id() use in preemptable warning in hwlat
  scripts/tracing: fix the bug that can't parse raw_trace_func
  scripts/recordmcount.pl: Remove check_objcopy() and $can_use_local
  tracing: Reject string operand in the histogram expression
  tracing / histogram: Give calculation hist_fields a size
  tracing: Fix NULL pointer dereference in start_creating

3c3e9027

Merge tag 's390-5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 130951bb

Linus Torvalds authored 3 years ago

Pull s390 fixes from Heiko Carstens:

 - fix zstd build for -march=z900 (undefined reference to __clzdi2)

 - add missing .got.plts to vdso linker scripts to fix kpatch build
   errors

 - update defconfigs

* tag 's390-5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: update defconfigs
  s390/boot: fix zstd build for -march=z900
  s390/vdso: add .got.plt in vdso linker script

130951bb

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 97fcc07b

Linus Torvalds authored 3 years ago

Pull kvm fixes from Paolo Bonzini:
 "Mostly bugfixes; plus, support for XMM arguments to Hyper-V hypercalls
  now obeys KVM_CAP_HYPERV_ENFORCE_CPUID.

  Both the XMM arguments feature and KVM_CAP_HYPERV_ENFORCE_CPUID are
  new in 5.14, and each did not know of the other"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/mmu: Fix per-cpu counter corruption on 32-bit builds
  KVM: selftests: fix hyperv_clock test
  KVM: SVM: improve the code readability for ASID management
  KVM: SVM: Fix off-by-one indexing when nullifying last used SEV VMCB
  KVM: Do not leak memory for duplicate debugfs directories
  KVM: selftests: Test access to XMM fast hypercalls
  KVM: x86: hyper-v: Check if guest is allowed to use XMM registers for hypercall input
  KVM: x86: Introduce trace_kvm_hv_hypercall_done()
  KVM: x86: hyper-v: Check access to hypercall before reading XMM registers
  KVM: x86: accept userspace interrupt only if no event is injected

97fcc07b

Merge branch 'pcmcia-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux · 611ffd8a

Linus Torvalds authored 3 years ago

Pull pcmcia fix from Dominik Brodowski:
 "Zheyu Ma found and fixed a null pointer dereference bug in the device
  driver for the i82092 card reader"

* 'pcmcia-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux:
  pcmcia: i82092: fix a null pointer dereference bug

611ffd8a

pipe: increase minimum default pipe size to 2 pages · 46c4c9d1

Alex Xu (Hello71) authored 3 years ago

This program always prints 4096 and hangs before the patch, and always
prints 8192 and exits successfully after:

  int main()
  {
      int pipefd[2];
      for (int i = 0; i < 1025; i++)
          if (pipe(pipefd) == -1)
              return 1;
      size_t bufsz = fcntl(pipefd[1], F_GETPIPE_SZ);
      printf("%zd\n", bufsz);
      char *buf = calloc(bufsz, 1);
      write(pipefd[1], buf, bufsz);
      read(pipefd[0], buf, bufsz-1);
      write(pipefd[1], buf, 1);
  }

Note that you may need to increase your RLIMIT_NOFILE before running the
program.

Fixes: 759c0114 ("pipe: limit the per-user amount of pages allocated in pipes")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/lkml/1628086770.5rn8p04n6j.none@localhost/
Link: https://lore.kernel.org/lkml/1628127094.lxxn016tj7.none@localhost/


Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

46c4c9d1

Merge branch 'net-fix-use-after-free-bugs' · 6bb5318c

Jakub Kicinski authored 3 years ago

Pavel Skripkin says:

====================
net: fix use-after-free bugs

I've added new checker to smatch yesterday. It warns about using
netdev_priv() pointer after free_{netdev,candev}() call. I hope, it will
get into next smatch release.

Some of the reported bugs are fixed and upstreamed already, but Dan ran new
smatch with allmodconfig and found 2 more. Big thanks to Dan for doing it,
because I totally forgot to do it.
====================

Link: https://lore.kernel.org/r/cover.1628091954.git.paskripkin@gmail.com


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

6bb5318c

net: vxge: fix use-after-free in vxge_device_unregister · 942e560a

Pavel Skripkin authored 3 years ago

Smatch says:
drivers/net/ethernet/neterion/vxge/vxge-main.c:3518 vxge_device_unregister() error: Using vdev after free_{netdev,candev}(dev);
drivers/net/ethernet/neterion/vxge/vxge-main.c:3518 vxge_device_unregister() error: Using vdev after free_{netdev,candev}(dev);
drivers/net/ethernet/neterion/vxge/vxge-main.c:3520 vxge_device_unregister() error: Using vdev after free_{netdev,candev}(dev);
drivers/net/ethernet/neterion/vxge/vxge-main.c:3520 vxge_device_unregister() error: Using vdev after free_{netdev,candev}(dev);

Since vdev pointer is netdev private data accessing it after free_netdev()
call can cause use-after-free bug. Fix it by moving free_netdev() call at
the end of the function

Fixes: 6cca2003

 ("vxge: cleanup probe error paths")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

942e560a

net: fec: fix use-after-free in fec_drv_remove · 44712965

Pavel Skripkin authored 3 years ago


Smatch says:
	drivers/net/ethernet/freescale/fec_main.c:3994 fec_drv_remove() error: Using fep after free_{netdev,candev}(ndev);
	drivers/net/ethernet/freescale/fec_main.c:3995 fec_drv_remove() error: Using fep after free_{netdev,candev}(ndev);

Since fep pointer is netdev private data, accessing it after free_netdev()
call can cause use-after-free bug. Fix it by moving free_netdev() call at
the end of the function

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: a31eda65

 ("net: fec: fix clock count mis-match")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

44712965

net: pegasus: fix uninit-value in get_interrupt_interval · af35fc37

Pavel Skripkin authored 3 years ago


Syzbot reported uninit value pegasus_probe(). The problem was in missing
error handling.

get_interrupt_interval() internally calls read_eprom_word() which can
fail in some cases. For example: failed to receive usb control message.
These cases should be handled to prevent uninit value bug, since
read_eprom_word() will not initialize passed stack variable in case of
internal failure.

Fail log:

BUG: KMSAN: uninit-value in get_interrupt_interval drivers/net/usb/pegasus.c:746 [inline]
BUG: KMSAN: uninit-value in pegasus_probe+0x10e7/0x4080 drivers/net/usb/pegasus.c:1152
CPU: 1 PID: 825 Comm: kworker/1:1 Not tainted 5.12.0-rc6-syzkaller #0
...
Workqueue: usb_hub_wq hub_event
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x24c/0x2e0 lib/dump_stack.c:120
 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
 __msan_warning+0x5c/0xa0 mm/kmsan/kmsan_instr.c:197
 get_interrupt_interval drivers/net/usb/pegasus.c:746 [inline]
 pegasus_probe+0x10e7/0x4080 drivers/net/usb/pegasus.c:1152
....

Local variable ----data.i@pegasus_probe created at:
 get_interrupt_interval drivers/net/usb/pegasus.c:1151 [inline]
 pegasus_probe+0xe57/0x4080 drivers/net/usb/pegasus.c:1152
 get_interrupt_interval drivers/net/usb/pegasus.c:1151 [inline]
 pegasus_probe+0xe57/0x4080 drivers/net/usb/pegasus.c:1152

Reported-and-tested-by:  <syzbot+02c9f70f3afae308464a@syzkaller.appspotmail.com>
Fixes: 1da177e4

 ("Linux-2.6.12-rc2")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Link: https://lore.kernel.org/r/20210804143005.439-1-paskripkin@gmail.com


Signed-off-by: Jakub Kicinski <kuba@kernel.org>

af35fc37

tracing: Quiet smp_processor_id() use in preemptable warning in hwlat · 51397dc6

Steven Rostedt (VMware) authored 3 years ago

The hardware latency detector (hwlat) has a mode that it runs one thread
across CPUs. The logic to move from the currently running CPU to the next
one in the list does a smp_processor_id() to find where it currently is.
Unfortunately, it's done with preemption enabled, and this triggers a
warning for using smp_processor_id() in a preempt enabled section.

As it is only using smp_processor_id() to get information on where it
currently is in order to simply move it to the next CPU, it doesn't really
care if it got moved in the mean time. It will simply balance out later if
such a case arises.

Switch smp_processor_id() to raw_smp_processor_id() to quiet that warning.

Link: https://lkml.kernel.org/r/20210804141848.79edadc0@oasis.local.home



Acked-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Fixes: 8fa826b7

 ("trace/hwlat: Implement the mode config option")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

51397dc6

net: ethernet: ti: am65-cpsw: fix crash in am65_cpsw_port_offload_fwd_mark_update() · ae03d189

Grygorii Strashko authored 3 years ago

The am65_cpsw_port_offload_fwd_mark_update() causes NULL exception crash
when there is at least one disabled port and any other port added to the
bridge first time.

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000858
pc : am65_cpsw_port_offload_fwd_mark_update+0x54/0x68
lr : am65_cpsw_netdevice_event+0x8c/0xf0
Call trace:
am65_cpsw_port_offload_fwd_mark_update+0x54/0x68
notifier_call_chain+0x54/0x98
raw_notifier_call_chain+0x14/0x20
call_netdevice_notifiers_info+0x34/0x78
__netdev_upper_dev_link+0x1c8/0x290
netdev_master_upper_dev_link+0x1c/0x28
br_add_if+0x3f0/0x6d0 [bridge]

Fix it by adding proper check for port->ndev != NULL.

Fixes: 2934db9b

 ("net: ti: am65-cpsw-nuss: Add netdevice notifiers")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ae03d189

bnx2x: fix an error code in bnx2x_nic_load() · fb653827

Dan Carpenter authored 3 years ago

Set the error code if bnx2x_alloc_fw_stats_mem() fails.  The current
code returns success.

Fixes: ad5afc89

 ("bnx2x: Separate VF and PF logic")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fb653827

netdevsim: Forbid devlink reload when adding or deleting ports · 23809a72

Leon Romanovsky authored 3 years ago


In order to remove complexity in devlink core related to
devlink_reload_enable/disable, let's rewrite new_port/del_port
logic to rely on internal to netdevsim lcok.

We should protect only reload_down flow because it destroys nsim_dev,
which is needed for nsim_dev_port_add/nsim_dev_port_del to hold
port_list_lock.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23809a72

net: dsa: tag_sja1105: optionally build as module when switch driver is module if PTP is enabled · f8b17a0b

Vladimir Oltean authored 3 years ago

TX timestamps are sent by SJA1110 as Ethernet packets containing
metadata, so they are received by the tagging driver but must be
processed by the switch driver - the one that is stateful since it
keeps the TX timestamp queue.

This means that there is an sja1110_process_meta_tstamp() symbol
exported by the switch driver which is called by the tagging driver.

There is a shim definition for that function when the switch driver is
not compiled, which does nothing, but that shim is not effective when
the tagging protocol driver is built-in and the switch driver is a
module, because built-in code cannot call symbols exported by modules.

So add an optional dependency between the tagger and the switch driver,
if PTP support is enabled in the switch driver. If PTP is not enabled,
sja1110_process_meta_tstamp() will translate into the shim "do nothing
with these meta frames" function.

Fixes: 566b18c8

 ("net: dsa: sja1105: implement TX timestamping for SJA1110")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f8b17a0b

netdevice: add the case if dev is NULL · b37a4668

Yajun Deng authored 3 years ago


Add the case if dev is NULL in dev_{put, hold}, so the caller doesn't
need to care whether dev is NULL or not.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>

b37a4668

net: Remove redundant if statements · 1160dfa1

Yajun Deng authored 3 years ago


The 'if (dev)' statement already move into dev_{put , hold}, so remove
redundant if statements.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>

1160dfa1

Admin message