  1. Nov 15, 2019
    • Merge branch 'net-stmmac-CPU-Performance-Improvements' · 43da44c8
      David S. Miller authored
      
      Jose Abreu says:
      
      ====================
      net: stmmac: CPU Performance Improvements
      
      CPU Performance improvements for stmmac. Please check below for results
      before and after the series.
      
      Patch 1/7 allows RX Interrupt on Completion to be disabled so that only the
      RX HW Watchdog is used.
      
      Patch 2/7 sets up the default RX coalesce settings instead of using the
      minimum value.
      
      Patch 3/7 and 4/7 remove the unneeded computations for RX Flow Control
      activation/de-activation in some cases.
      
      Patch 5/7 tunes up the default coalesce settings.
      
      Patch 6/7 reworks the TX coalesce timer activation logic.
      
      Patch 7/7 removes the now unneeded TBU interrupt.
      
      NetPerf UDP Results:
      --------------------
      
      Socket  Message  Elapsed      Messages                   CPU      Service
      Size    Size     Time         Okay Errors   Throughput   Util     Demand
      bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB
      --- XGMAC@2.5G: Before
      212992    1400   10.00     2100620      0     2351.7     36.69    5.112
      212992           10.00     2100539            2351.6     26.18    3.648
      --- XGMAC@2.5G: After
      212992    1400   10.00     2108972      0     2361.5     21.73    3.015
      212992           10.00     2097038            2348.1     19.21    2.666
      
      --- GMAC5@1G: Before
      212992    1400   10.00      786000      0      880.2     34.71    12.923
      212992           10.00      786000             880.2     23.42    8.719
      --- GMAC5@1G: After
      212992    1400   10.00      842648      0      943.7     14.12    4.903
      212992           10.00      842648             943.7     12.73    4.418
      
      Perf TCP Results on RX Path:
      ----------------------------
      --- XGMAC@2.5G: Before
      22.51%  swapper          [stmmac]           [k] dwxgmac2_dma_interrupt
      10.82%  swapper          [stmmac]           [k] dwxgmac2_host_mtl_irq_status
       5.21%  swapper          [stmmac]           [k] dwxgmac2_host_irq_status
       4.67%  swapper          [stmmac]           [k] dwxgmac3_safety_feat_irq_status
       3.63%  swapper          [kernel.kallsyms]  [k] stack_trace_consume_entry
       2.74%  iperf3           [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
       2.52%  swapper          [kernel.kallsyms]  [k] update_stack_state
       1.94%  ksoftirqd/0      [stmmac]           [k] dwxgmac2_dma_interrupt
       1.45%  iperf3           [kernel.kallsyms]  [k] queued_spin_lock_slowpath
       1.26%  swapper          [kernel.kallsyms]  [k] create_object
      --- XGMAC@2.5G: After
       7.43%  swapper          [kernel.kallsyms]   [k] stack_trace_consume_entry
       5.86%  swapper          [stmmac]            [k] dwxgmac2_dma_interrupt
       5.68%  swapper          [kernel.kallsyms]   [k] update_stack_state
       4.71%  iperf3           [kernel.kallsyms]   [k] copy_user_enhanced_fast_string
       2.88%  swapper          [kernel.kallsyms]   [k] create_object
       2.69%  swapper          [stmmac]            [k] dwxgmac2_host_mtl_irq_status
       2.61%  swapper          [stmmac]            [k] stmmac_napi_poll_rx
       2.52%  swapper          [kernel.kallsyms]   [k] unwind_next_frame.part.4
       1.48%  swapper          [kernel.kallsyms]   [k] unwind_get_return_address
       1.38%  swapper          [kernel.kallsyms]   [k] arch_stack_walk
      
      --- GMAC5@1G: Before
      31.29%  swapper          [stmmac]           [k] dwmac4_dma_interrupt
      14.57%  swapper          [stmmac]           [k] dwmac4_irq_mtl_status
      10.66%  swapper          [stmmac]           [k] dwmac4_irq_status
       1.97%  swapper          [kernel.kallsyms]  [k] stack_trace_consume_entry
       1.73%  iperf3           [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
       1.59%  swapper          [kernel.kallsyms]  [k] update_stack_state
       1.15%  iperf3           [kernel.kallsyms]  [k] do_syscall_64
       1.01%  ksoftirqd/0      [stmmac]           [k] dwmac4_dma_interrupt
       0.89%  swapper          [kernel.kallsyms]  [k] __default_send_IPI_dest_field
       0.75%  swapper          [stmmac]           [k] stmmac_napi_poll_rx
      --- GMAC5@1G: After
       6.70%  swapper          [kernel.kallsyms]   [k] stack_trace_consume_entry
       5.79%  swapper          [stmmac]            [k] dwmac4_dma_interrupt
       5.29%  swapper          [kernel.kallsyms]   [k] update_stack_state
       3.52%  iperf3           [kernel.kallsyms]   [k] copy_user_enhanced_fast_string
       2.83%  swapper          [stmmac]            [k] dwmac4_irq_mtl_status
       2.62%  swapper          [kernel.kallsyms]   [k] create_object
       2.46%  swapper          [stmmac]            [k] stmmac_napi_poll_rx
       2.32%  swapper          [kernel.kallsyms]   [k] unwind_next_frame.part.4
       2.19%  swapper          [stmmac]            [k] dwmac4_irq_status
       1.39%  swapper          [kernel.kallsyms]   [k] unwind_get_return_address
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: xgmac: Do not enable TBU interrupt · 8d07a793
      Jose Abreu authored
      
      Now that TX Coalesce has been rewritten we no longer need this
      additional interrupt enabled. This reduces CPU usage.
      
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Rework TX Coalesce logic · c2837423
      Jose Abreu authored
      
      Coalesce logic currently increments the number of packets and sets the
      IC bit when the coalesced packets have passed a given limit. This does
      not reflect very well what coalesce was meant for, as we can have a large
      number of packets that are coalesced and then a single one, sent later
      on, that has the IC bit.
      
      Rework the logic so that it coalesces only upon a limit of packets and
      sets the IC bit for a large number of packets.
      
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
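The coalescing decision described in the "Rework TX Coalesce logic" commit can be sketched roughly as follows. This is a minimal illustration and not the driver's code: `struct tx_coalesce`, `tx_should_set_ic` and the field names are hypothetical, and the exact rule stmmac applies may differ. The idea shown is that an interrupt is requested once per `coal_frames` queued frames, or immediately for a single burst larger than that limit, and never when the limit is zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue state; not the driver's real structure. */
struct tx_coalesce {
    uint32_t coal_frames;   /* request an interrupt every N frames; 0 = never */
    uint32_t count_frames;  /* running count of frames queued so far */
};

/*
 * Account for a burst of `packets` frames and report whether the last
 * descriptor of the burst should carry the Interrupt on Completion bit.
 */
bool tx_should_set_ic(struct tx_coalesce *q, uint32_t packets)
{
    q->count_frames += packets;

    if (q->coal_frames == 0)
        return false;   /* frame-based interrupts disabled (assumed: a timer cleans up) */
    if (packets > q->coal_frames)
        return true;    /* one burst alone exceeds the limit */

    /* True when this burst made the running count cross a multiple of N. */
    return (q->count_frames % q->coal_frames) < packets;
}
```

With `coal_frames = 4` and frames queued one at a time, only every fourth call returns true, so the IC bit no longer ends up on a lone packet sent long after the coalesced ones.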
    • net: stmmac: Tune-up default coalesce settings · da202451
      Jose Abreu authored
      
      Tune up the default coalesce settings for optimal values. This gives the
      best performance in most of the use-cases.
      
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: xgmac: Remove uneeded computation for RFA/RFD · 52f96cd1
      Jose Abreu authored
      
      RFA and RFD should not be dependent on FIFO size. In fact, the more FIFO
      space we have, the later we can activate Flow Control. Let's use
      hard-coded values for RFA and RFD for all FIFO sizes with the exception
      of 4k, which is a special case.
      
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: gmac4+: Remove uneeded computation for RFA/RFD · 854248e5
      Jose Abreu authored
      
      RFA and RFD should not be dependent on FIFO size. In fact, the more FIFO
      space we have, the later we can activate Flow Control. Let's use
      hard-coded values for RFA and RFD for all FIFO sizes with the exception
      of 4k, which is a special case.
      
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
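The approach in the two RFA/RFD commits (fixed thresholds for every FIFO size, with 4k handled separately) might look like the sketch below. Everything here is an assumption made for illustration: `struct rx_fc_thresholds`, `rx_fc_pick` and the numeric threshold codes are hypothetical placeholders, not values confirmed by the commit messages. Consult the driver source for the real encodings.

```c
#include <stdint.h>

/* Hypothetical pair of register field values for RX flow control. */
struct rx_fc_thresholds {
    uint8_t rfa;  /* threshold code for activating flow control */
    uint8_t rfd;  /* threshold code for de-activating flow control */
};

/*
 * Pick thresholds without deriving them from the FIFO size.
 * The codes below are placeholders chosen for illustration only.
 */
struct rx_fc_thresholds rx_fc_pick(uint32_t fifo_bytes)
{
    struct rx_fc_thresholds t;

    switch (fifo_bytes) {
    case 4096:
        /* Special case: a 4k FIFO is too small for the default codes. */
        t.rfd = 0x03;
        t.rfa = 0x01;
        break;
    default:
        /* Every other FIFO size shares the same hard-coded codes. */
        t.rfd = 0x07;
        t.rfa = 0x04;
        break;
    }
    return t;
}
```

Because the default branch ignores `fifo_bytes`, a 16k and a 64k FIFO get identical settings, which is the point of removing the per-size computation.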
    • net: stmmac: Setup a default RX Coalesce value instead of the minimum · 4e4337cc
      Jose Abreu authored
      
      For performance reasons, sometimes using the minimum RX Coalesce value
      is not optimal. Let's set up a default value that is optimal in most of
      the use cases.
      
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Do not set RX IC bit if RX Coalesce is zero · 09146abe
      Jose Abreu authored
      
      We may only want to use the RX Watchdog, so let's check whether the RX
      Coalesce settings are non-zero and only set the RX Interrupt on
      Completion bit if they are.
      
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
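A minimal sketch of the check described in the "Do not set RX IC bit if RX Coalesce is zero" commit, under stated assumptions: `struct rx_coalesce` and `rx_should_set_ic` are hypothetical names, and the every-Nth-descriptor policy for the non-zero case is an illustration rather than the driver's exact behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue RX coalesce state. */
struct rx_coalesce {
    uint32_t coal_frames;   /* 0 means "rely on the RX watchdog only" */
    uint32_t count_frames;  /* descriptors refilled since the last IC bit */
};

/* Decide whether the descriptor being refilled should request an interrupt. */
bool rx_should_set_ic(struct rx_coalesce *q)
{
    if (q->coal_frames == 0)
        return false;   /* watchdog-only mode: never set the IC bit */

    if (++q->count_frames < q->coal_frames)
        return false;   /* still coalescing */

    q->count_frames = 0;
    return true;        /* every coal_frames-th descriptor interrupts */
}
```

With `coal_frames` set to zero the function never requests a per-descriptor interrupt, leaving the hardware watchdog as the only source of RX interrupts.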
    • mlxsw: spectrum_router: Allocate discard adjacency entry when needed · 983db619
      Ido Schimmel authored
      Commit 0c3cbbf9 ("mlxsw: Add specific trap for packets routed via
      invalid nexthops") allocated an adjacency entry during driver
      initialization whose purpose is to discard packets hitting the route
      pointing to it.
      
      These adjacency entries are allocated from a resource called KVD linear
      (KVDL). There are situations in which the user can decide to set the
      size of this resource (via devlink-resource) to 0, in which case the
      driver will not be able to load.
      
      Therefore, instead of pre-allocating this adjacency entry, simply
      allocate it only when needed. A variable indicating the validity of the
      entry is added and is used to ensure it is only allocated and written
      once and that it is freed after all the routes were flushed.
      
      Fixes: 0c3cbbf9 ("mlxsw: Add specific trap for packets routed via invalid nexthops")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
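The lazy-allocation pattern described in the mlxsw commit (allocate the discard entry on first use, track its validity, free it once at teardown) can be illustrated with the sketch below. It is not the driver's code: `struct kvdl_pool`, `struct router`, `router_discard_get` and `router_discard_put` are hypothetical stand-ins, and the pool is a toy counter rather than the real KVDL resource.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the KVD linear resource: a pool of entries. */
struct kvdl_pool {
    uint32_t size;  /* may legitimately be configured to 0 */
    uint32_t used;
};

/* Hypothetical router state holding the lazily allocated discard entry. */
struct router {
    struct kvdl_pool *pool;
    uint32_t discard_index;
    bool discard_valid;  /* true once the entry has been allocated */
};

static int kvdl_alloc(struct kvdl_pool *p, uint32_t *index)
{
    if (p->used >= p->size)
        return -1;  /* no space left in the pool */
    *index = p->used++;
    return 0;
}

static void kvdl_free(struct kvdl_pool *p)
{
    if (p->used > 0)
        p->used--;
}

/* Return the discard entry, allocating it only on first use. */
int router_discard_get(struct router *r, uint32_t *index)
{
    if (!r->discard_valid) {
        int err = kvdl_alloc(r->pool, &r->discard_index);

        if (err)
            return err;
        r->discard_valid = true;
    }
    *index = r->discard_index;
    return 0;
}

/* Release the entry at teardown; a no-op if it was never allocated. */
void router_discard_put(struct router *r)
{
    if (!r->discard_valid)
        return;
    kvdl_free(r->pool);
    r->discard_valid = false;
}
```

In this sketch a pool of size 0 no longer prevents initialization, because nothing is allocated until a route actually needs the discard entry. The failure, if any, is reported to that caller instead.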
    • net/tls: Fix unused function warning · d6649d78
      YueHaibing authored
      
      If PROC_FS is not set, gcc warns:
      
      net/tls/tls_proc.c:23:12: warning:
       'tls_statistics_seq_show' defined but not used [-Wunused-function]
      
      Use #ifdef to guard this.
      
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
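The kind of guard mentioned in the net/tls commit can be sketched as below. This is a user-space illustration under assumptions, not the kernel change itself: `stats_seq_show` and `stats_proc_init` are hypothetical names, and only the `CONFIG_PROC_FS` macro corresponds to the option named in the commit. A static function whose only caller is compiled out triggers `-Wunused-function`, so the definition is placed under the same `#ifdef` as its use.

```c
#include <stdio.h>

#ifdef CONFIG_PROC_FS
/* Only referenced from the procfs path below, so guard the definition too. */
static int stats_seq_show(FILE *out)
{
    fprintf(out, "example_counter 0\n");
    return 0;
}
#endif

/* Set up statistics reporting; does nothing when procfs support is absent. */
int stats_proc_init(void)
{
#ifdef CONFIG_PROC_FS
    return stats_seq_show(stdout);
#else
    return 0;
#endif
}
```

Compiling with and without `-DCONFIG_PROC_FS` under `-Wall` should produce no unused-function warning in either configuration.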
  2. Nov 14, 2019