Forum | Documentation | Website | Blog

Skip to content
Snippets Groups Projects
  1. Jan 30, 2021
    • Vladimir Oltean's avatar
      net: dsa: felix: convert to the new .change_tag_protocol DSA API · adb3dccf
      Vladimir Oltean authored
      
      In expectation of the new tag_ocelot_8021q tagger implementation, we
      need to be able to do runtime switchover between one tagger and another.
      So we must structure the existing code for the current NPI-based tagger
      in a certain way.
      
      We move the felix_npi_port_init function in expectation of the future
      driver configuration necessary for tag_ocelot_8021q: we would like to
      not have the NPI-related bits interspersed with the tag_8021q bits.
      
      The conversion from this:
      
      	ocelot_write_rix(ocelot,
      			 ANA_PGID_PGID_PGID(GENMASK(ocelot->num_phys_ports, 0)),
      			 ANA_PGID_PGID, PGID_UC);
      
      to this:
      
      	cpu_flood = ANA_PGID_PGID_PGID(BIT(ocelot->num_phys_ports));
      	ocelot_rmw_rix(ocelot, cpu_flood, cpu_flood, ANA_PGID_PGID, PGID_UC);
      
      is perhaps non-trivial, but is nonetheless non-functional. The PGID_UC
      (replicator for unknown unicast) is already configured out of hardware
      reset to flood to all ports except ocelot->num_phys_ports (the CPU port
      module). All we change is that we use a read-modify-write to only add
      the CPU port module to the unknown unicast replicator, as opposed to
      doing a full write to the register.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      adb3dccf
    • Vladimir Oltean's avatar
      net: dsa: allow changing the tag protocol via the "tagging" device attribute · 53da0eba
      Vladimir Oltean authored
      Currently DSA exposes the following sysfs:
      $ cat /sys/class/net/eno2/dsa/tagging
      ocelot
      
      which is a read-only device attribute, introduced in the kernel as
      commit 98cdb480
      
       ("net: dsa: Expose tagging protocol to user-space"),
      and used by libpcap since its commit 993db3800d7d ("Add support for DSA
      link-layer types").
      
      It would be nice if we could extend this device attribute by making it
      writable:
      $ echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
      
      This is useful with DSA switches that can make use of more than one
      tagging protocol. It may be useful in dsa_loop in the future too, to
      perform offline testing of various taggers, or for changing between dsa
      and edsa on Marvell switches, if that is desirable.
      
      In terms of implementation, drivers can support this feature by
      implementing .change_tag_protocol, which should always leave the switch
      in a consistent state: either with the new protocol if things went well,
      or with the old one if something failed. Teardown of the old protocol,
      if necessary, must be handled by the driver.
      
      Some things remain as before:
      - The .get_tag_protocol is currently only called at probe time, to load
        the initial tagging protocol driver. Nonetheless, new drivers should
        report the tagging protocol in current use now.
      - The driver should manage by itself the initial setup of tagging
        protocol, no later than the .setup() method, as well as destroying
        resources used by the last tagger in use, no earlier than the
        .teardown() method.
      
      For multi-switch DSA trees, error handling is a bit more complicated,
      since e.g. the 5th out of 7 switches may fail to change the tag
      protocol. When that happens, a revert to the original tag protocol is
      attempted, but that may fail too, leaving the tree in an inconsistent
      state despite each individual switch implementing .change_tag_protocol
      transactionally. Since the intersection between drivers that implement
      .change_tag_protocol and drivers that support D in DSA is currently the
      empty set, the possibility for this error to happen is ignored for now.
      
      Testing:
      
      $ insmod mscc_felix.ko
      [   79.549784] mscc_felix 0000:00:00.5: Adding to iommu group 14
      [   79.565712] mscc_felix 0000:00:00.5: Failed to register DSA switch: -517
      $ insmod tag_ocelot.ko
      $ rmmod mscc_felix.ko
      $ insmod mscc_felix.ko
      [   97.261724] libphy: VSC9959 internal MDIO bus: probed
      [   97.267363] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 0
      [   97.274998] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 1
      [   97.282561] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 2
      [   97.289700] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 3
      [   97.599163] mscc_felix 0000:00:00.5 swp0 (uninitialized): PHY [0000:00:00.3:10] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
      [   97.862034] mscc_felix 0000:00:00.5 swp1 (uninitialized): PHY [0000:00:00.3:11] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
      [   97.950731] mscc_felix 0000:00:00.5 swp0: configuring for inband/qsgmii link mode
      [   97.964278] 8021q: adding VLAN 0 to HW filter on device swp0
      [   98.146161] mscc_felix 0000:00:00.5 swp2 (uninitialized): PHY [0000:00:00.3:12] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
      [   98.238649] mscc_felix 0000:00:00.5 swp1: configuring for inband/qsgmii link mode
      [   98.251845] 8021q: adding VLAN 0 to HW filter on device swp1
      [   98.433916] mscc_felix 0000:00:00.5 swp3 (uninitialized): PHY [0000:00:00.3:13] driver [Microsemi GE VSC8514 SyncE] (irq=POLL)
      [   98.485542] mscc_felix 0000:00:00.5: configuring for fixed/internal link mode
      [   98.503584] mscc_felix 0000:00:00.5: Link is Up - 2.5Gbps/Full - flow control rx/tx
      [   98.527948] device eno2 entered promiscuous mode
      [   98.544755] DSA: tree 0 setup
      
      $ ping 10.0.0.1
      PING 10.0.0.1 (10.0.0.1): 56 data bytes
      64 bytes from 10.0.0.1: seq=0 ttl=64 time=2.337 ms
      64 bytes from 10.0.0.1: seq=1 ttl=64 time=0.754 ms
      ^C
       -  10.0.0.1 ping statistics  -
      2 packets transmitted, 2 packets received, 0% packet loss
      round-trip min/avg/max = 0.754/1.545/2.337 ms
      
      $ cat /sys/class/net/eno2/dsa/tagging
      ocelot
      $ cat ./test_ocelot_8021q.sh
              #!/bin/bash
      
              ip link set swp0 down
              ip link set swp1 down
              ip link set swp2 down
              ip link set swp3 down
              ip link set swp5 down
              ip link set eno2 down
              echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
              ip link set eno2 up
              ip link set swp0 up
              ip link set swp1 up
              ip link set swp2 up
              ip link set swp3 up
              ip link set swp5 up
      $ ./test_ocelot_8021q.sh
      ./test_ocelot_8021q.sh: line 9: echo: write error: Protocol not available
      $ rmmod tag_ocelot.ko
      rmmod: can't unload module 'tag_ocelot': Resource temporarily unavailable
      $ insmod tag_ocelot_8021q.ko
      $ ./test_ocelot_8021q.sh
      $ cat /sys/class/net/eno2/dsa/tagging
      ocelot-8021q
      $ rmmod tag_ocelot.ko
      $ rmmod tag_ocelot_8021q.ko
      rmmod: can't unload module 'tag_ocelot_8021q': Resource temporarily unavailable
      $ ping 10.0.0.1
      PING 10.0.0.1 (10.0.0.1): 56 data bytes
      64 bytes from 10.0.0.1: seq=0 ttl=64 time=0.953 ms
      64 bytes from 10.0.0.1: seq=1 ttl=64 time=0.787 ms
      64 bytes from 10.0.0.1: seq=2 ttl=64 time=0.771 ms
      $ rmmod mscc_felix.ko
      [  645.544426] mscc_felix 0000:00:00.5: Link is Down
      [  645.838608] DSA: tree 0 torn down
      $ rmmod tag_ocelot_8021q.ko
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      53da0eba
    • Vladimir Oltean's avatar
      net: dsa: keep a copy of the tagging protocol in the DSA switch tree · 357f203b
      Vladimir Oltean authored
      
      Cascading DSA switches can be done multiple ways. There is the brute
      force approach / tag stacking, where one upstream switch, located
      between leaf switches and the host Ethernet controller, will just
      happily transport the DSA header of those leaf switches as payload.
      For this kind of setups, DSA works without any special kind of treatment
      compared to a single switch - they just aren't aware of each other.
      Then there's the approach where the upstream switch understands the tags
      it transports from its leaves below, as it doesn't push a tag of its own,
      but it routes based on the source port & switch id information present
      in that tag (as opposed to DMAC & VID) and it strips the tag when
      egressing a front-facing port. Currently only Marvell implements the
      latter, and Marvell DSA trees contain only Marvell switches.
      
      So it is safe to say that DSA trees already have a single tag protocol
      shared by all switches, and in fact this is what makes the switches able
      to understand each other. This fact is also implied by the fact that
      currently, the tagging protocol is reported as part of a sysfs installed
      on the DSA master and not per port, so it must be the same for all the
      ports connected to that DSA master regardless of the switch that they
      belong to.
      
      It's time to make this official and enforce it (yes, this also means we
      won't have any "switch understands tag to some extent but is not able to
      speak it" hardware oddities that we'll support in the future).
      
      This is needed due to the imminent introduction of the dsa_switch_ops::
      change_tag_protocol driver API. When that is introduced, we'll have
      to notify switches of the tagging protocol that they're configured to
      use. Currently the tag_ops structure pointer is held only for CPU ports.
      But there are switches which don't have CPU ports and nonetheless still
      need to be configured. These would be Marvell leaf switches whose
      upstream port is just a DSA link. How do we inform these of their
      tagging protocol setup/deletion?
      
      One answer to the above would be: iterate through the DSA switch tree's
      ports once, list the CPU ports, get their tag_ops, then iterate again
      now that we have it, and notify everybody of that tag_ops. But what to
      do if conflicts appear between one cpu_dp->tag_ops and another? There's
      no escaping the fact that conflict resolution needs to be done, so we
      can be upfront about it.
      
      Ease our work and just keep the master copy of the tag_ops inside the
      struct dsa_switch_tree. Reference counting is now moved to be per-tree
      too, instead of per-CPU port.
      
      There are many places in the data path that access master->dsa_ptr->tag_ops
      and we would introduce unnecessary performance penalty going through yet
      another indirection, so keep those right where they are.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      357f203b
    • Vladimir Oltean's avatar
      net: dsa: document the existing switch tree notifiers and add a new one · 886f8e26
      Vladimir Oltean authored
      The existence of dsa_broadcast has generated some confusion in the past:
      https://www.mail-archive.com/netdev@vger.kernel.org/msg365042.html
      
      
      
      So let's document the existing dsa_port_notify and dsa_broadcast
      functions and explain when each of them should be used.
      
      Also, in fact, the in-between function has always been there but was
      lacking a name, and is the main reason for this patch: dsa_tree_notify.
      Refactor dsa_broadcast to use it.
      
      This patch also moves dsa_broadcast (a top-level function) to dsa2.c,
      where it really belonged in the first place, but had no companion so it
      stood with dsa_port_notify.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      886f8e26
    • Vladimir Oltean's avatar
      net: mscc: ocelot: don't use NPI tag prefix for the CPU port module · cacea62f
      Vladimir Oltean authored
      Context: Ocelot switches put the injection/extraction frame header in
      front of the Ethernet header. When used in NPI mode, a DSA master would
      see junk instead of the destination MAC address, and it would most
      likely drop the packets. So the Ocelot frame header can have an optional
      prefix, which is just "ff:ff:ff:ff:ff:fe > ff:ff:ff:ff:ff:ff" padding
      put before the actual tag (still before the real Ethernet header) such
      that the DSA master thinks it's looking at a broadcast frame with a
      strange EtherType.
      
      Unfortunately, a lesson learned in commit 69df578c
      
       ("net: mscc:
      ocelot: eliminate confusion between CPU and NPI port") seems to have
      been forgotten in the meanwhile.
      
      The CPU port module and the NPI port have independent settings for the
      length of the tag prefix. However, the driver is using the same variable
      to program both of them.
      
      There is no reason really to use any tag prefix with the CPU port
      module, since that is not connected to any Ethernet port. So this patch
      makes the inj_prefix and xtr_prefix variables apply only to the NPI
      port (which the switchdev ocelot_vsc7514 driver does not use).
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      cacea62f
    • Vladimir Oltean's avatar
      net: mscc: ocelot: reapply bridge forwarding mask on bonding join/leave · 9b521250
      Vladimir Oltean authored
      
      Applying the bridge forwarding mask currently is done only on the STP
      state changes for any port. But it depends on both STP state changes,
      and bonding interface state changes. Export the bit that recalculates
      the forwarding mask so that it could be reused, and call it when a port
      starts and stops offloading a bonding interface.
      
      Now that the logic is split into a separate function, we can rename "p"
      into "port", since the "port" variable was already taken in
      ocelot_bridge_stp_state_set. Also, we can rename "i" into "lag", to make
      it more clear what is it that we're iterating through.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarAlexandre Belloni <alexandre.belloni@bootlin.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      9b521250
    • Vladimir Oltean's avatar
      net: mscc: ocelot: store a namespaced VCAP filter ID · 50c6cc5b
      Vladimir Oltean authored
      
      We will be adding some private VCAP filters that should not interfere in
      any way with the filters added using tc-flower. So we need to allocate
      some IDs which will not be used by tc.
      
      Currently ocelot uses an u32 id derived from the flow cookie, which in
      itself is an unsigned long. This is a problem in itself, since on 64 bit
      systems, sizeof(unsigned long)=8, so the driver is already truncating
      these.
      
      Create a struct ocelot_vcap_id which contains the full unsigned long
      cookie from tc, as well as a boolean that is supposed to namespace the
      filters added by tc with the ones that aren't.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      50c6cc5b
    • Vladimir Oltean's avatar
      net: mscc: ocelot: export VCAP structures to include/soc/mscc · 0e9bb4e9
      Vladimir Oltean authored
      
      The Felix driver will need to preinstall some VCAP filters for its
      tag_8021q implementation (outside of the tc-flower offload logic), so
      these need to be exported to the common includes.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0e9bb4e9
    • Vladimir Oltean's avatar
      net: dsa: tag_8021q: add helpers to deduce whether a VLAN ID is RX or TX VLAN · 9c7caf28
      Vladimir Oltean authored
      
      The sja1105 implementation can be blind about this, but the felix driver
      doesn't do exactly what it's being told, so it needs to know whether it
      is a TX or an RX VLAN, so it can install the appropriate type of TCAM
      rule.
      
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      9c7caf28
    • Ronak Doshi's avatar
      vmxnet3: Remove buf_info from device accessible structures · de1da8bc
      Ronak Doshi authored
      buf_info structures in RX & TX queues are private driver data that
      do not need to be visible to the device.  Although there is physical
      address and length in the queue descriptor that points to these
      structures, their layout is not standardized, and device never looks
      at them.
      
      So lets allocate these structures in non-DMA-able memory, and fill
      physical address as all-ones and length as zero in the queue
      descriptor.
      
      That should alleviate worries brought by Martin Radev in
      https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20210104/022829.html
      
      
      that malicious vmxnet3 device could subvert SVM/TDX guarantees.
      
      Signed-off-by: default avatarPetr Vandrovec <petr@vmware.com>
      Signed-off-by: default avatarRonak Doshi <doshir@vmware.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      de1da8bc
    • Kurt Kanzenbach's avatar
      net: dsa: hellcreek: Add missing TAPRIO dependency · 6c13d75b
      Kurt Kanzenbach authored
      Add missing dependency to TAPRIO to avoid build failures such as:
      
      |ERROR: modpost: "taprio_offload_get" [drivers/net/dsa/hirschmann/hellcreek_sw.ko] undefined!
      |ERROR: modpost: "taprio_offload_free" [drivers/net/dsa/hirschmann/hellcreek_sw.ko] undefined!
      
      Fixes: 24dfc6eb
      
       ("net: dsa: hellcreek: Add TAPRIO offloading support")
      Reported-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarKurt Kanzenbach <kurt@linutronix.de>
      Acked-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
      Link: https://lore.kernel.org/r/20210128163338.22665-1-kurt@linutronix.de
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      6c13d75b
  2. Jan 29, 2021