  1. Apr 02, 2022
  2. Mar 22, 2022
    • NUMA balancing: optimize page placement for memory tiering system · c574bbe9
      Huang Ying authored
      With the advent of various new memory types, some machines will have
      multiple types of memory, e.g. DRAM and PMEM (persistent memory).  The
      memory subsystem of these machines can be called a memory tiering system,
      because the performance of the different types of memory is usually
      different.
      
      In such a system, because memory access patterns change over time,
      some pages in the slow memory may become hot globally.  So in this
      patch, the NUMA balancing mechanism is enhanced to optimize page
      placement among the different memory types dynamically, according to
      how hot or cold the pages are.
      
      In a typical memory tiering system, there are CPUs, fast memory and slow
      memory in each physical NUMA node.  The CPUs and the fast memory will be
      put in one logical node (called fast memory node), while the slow memory
      will be put in another (faked) logical node (called slow memory node).
      That is, the fast memory is regarded as local while the slow memory is
      regarded as remote.  So it's possible for the recently accessed pages in
      the slow memory node to be promoted to the fast memory node via the
      existing NUMA balancing mechanism.
      
      The original NUMA balancing mechanism stops migrating pages if the
      free memory of the target node falls below the high watermark.  This
      is a reasonable policy if there's only one memory type.  But it makes
      the original NUMA balancing mechanism almost useless for optimizing
      page placement among different memory types.  Details are as follows.
      
      It's common for the working-set size of the workload to be
      larger than the size of the fast memory nodes.  Otherwise, it's
      unnecessary to use the slow memory at all.  So there are almost never
      enough free pages in the fast memory nodes, and the globally hot
      pages in the slow memory node cannot be promoted to the fast memory
      node.  To solve the issue, we have 2 choices as follows,
      
      a. Ignore the free pages watermark checking when promoting hot pages
         from the slow memory node to the fast memory node.  This will
         create some memory pressure in the fast memory node and thus trigger
         memory reclaim, so that the cold pages in the fast memory
         node are demoted to the slow memory node.
      
      b. Define a new watermark called wmark_promo which is higher than
         wmark_high, and have kswapd reclaiming pages until free pages reach
         such watermark.  The scenario is as follows: when we want to promote
         hot-pages from a slow memory to a fast memory, but fast memory's free
         pages would go lower than high watermark with such promotion, we wake
         up kswapd with wmark_promo watermark in order to demote cold pages and
         free us up some space.  So, next time we want to promote hot-pages we
         might have a chance of doing so.
      
      The choice "a" may create high memory pressure in the fast memory node.
      If the memory pressure of the workload is high, the memory pressure
      may become so high that the memory allocation latency of the workload
      is influenced, e.g.  the direct reclaiming may be triggered.
      
      The choice "b" works much better in this respect.  If the memory
      pressure of the workload is high, the hot pages promotion will stop
      earlier because its allocation watermark is higher than that of the
      normal memory allocation.  So in this patch, choice "b" is implemented.
      A new zone watermark (WMARK_PROMO) is added, which is larger than the
      high watermark and can be controlled via watermark_scale_factor.
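The decision described for choice "b" can be modelled in a few lines of user-space C. This is only a sketch of the policy as the commit message states it, not the kernel code: `struct zone_model`, `promotion_fits()` and `kswapd_reclaim_target()` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Simplified stand-in for per-zone state; NOT the kernel's struct zone. */
struct zone_model {
    unsigned long free_pages;
    unsigned long wmark_high;
    unsigned long wmark_promo;  /* assumed to be larger than wmark_high */
};

/* True if promoting nr pages keeps free pages at or above the high watermark. */
static bool promotion_fits(const struct zone_model *z, unsigned long nr)
{
    return z->free_pages >= nr && z->free_pages - nr >= z->wmark_high;
}

/*
 * When a promotion does not fit, kswapd is woken and reclaims (demotes cold
 * pages) until free pages reach wmark_promo.  This returns how many pages
 * are still missing to reach that target.
 */
static unsigned long kswapd_reclaim_target(const struct zone_model *z)
{
    return z->free_pages >= z->wmark_promo ? 0 : z->wmark_promo - z->free_pages;
}
```

Because the promotion target sits above the high watermark, promotion backs off before normal allocations start to feel the pressure, which is the property the commit message relies on.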
      
      In addition to the original page placement optimization among sockets,
      the NUMA balancing mechanism is extended to optimize page
      placement according to hot/cold among different memory types.  So the
      sysctl user space interface (numa_balancing) is extended in a backward
      compatible way as follows, so that users can enable/disable each
      functionality individually.
      
      The sysctl is converted from a Boolean value to a bit field.  The
      definition of the flags is,
      
      - 0: NUMA_BALANCING_DISABLED
      - 1: NUMA_BALANCING_NORMAL
      - 2: NUMA_BALANCING_MEMORY_TIERING
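A minimal sketch of how such a bit field can be tested, assuming the three values listed above. The helper names `numa_normal_enabled()` and `numa_tiering_enabled()` are hypothetical and exist only for this illustration. Note that the old Boolean values 0 and 1 keep their meaning, which is what makes the change backward compatible.

```c
#include <stdbool.h>

/* Flag values as listed in the commit message. */
#define NUMA_BALANCING_DISABLED        0x0
#define NUMA_BALANCING_NORMAL          0x1
#define NUMA_BALANCING_MEMORY_TIERING  0x2

/* Hypothetical helper: is the original cross-socket balancing enabled? */
static bool numa_normal_enabled(unsigned int mode)
{
    return (mode & NUMA_BALANCING_NORMAL) != 0;
}

/* Hypothetical helper: is hot/cold placement across memory tiers enabled? */
static bool numa_tiering_enabled(unsigned int mode)
{
    return (mode & NUMA_BALANCING_MEMORY_TIERING) != 0;
}
```

With this encoding, a value of 3 would enable both behaviours, while 2 would enable only the memory tiering placement.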
      
      We have tested the patch with the pmbench memory accessing benchmark
      with an 80:20 read/write ratio and a Gauss access address
      distribution on a 2 socket Intel server with Optane DC Persistent
      Memory Model.  The test results show that the pmbench score can
      improve by up to 95.9%.
      
      Thanks to Andrew Morton for helping fix the document format error.
      
      Link: https://lkml.kernel.org/r/20220221084529.1052339-3-ying.huang@intel.com
      
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
      Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Reviewed-by: Yang Shi <shy828301@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Cc: Wei Xu <weixugc@google.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: zhongjiang-ali <zhongjiang-ali@linux.alibaba.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Feng Tang <feng.tang@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cma: factor out minimum alignment requirement · e16faf26
      David Hildenbrand authored
      Patch series "mm: enforce pageblock_order < MAX_ORDER".
      
      Having pageblock_order >= MAX_ORDER seems to be able to happen in corner
      cases and some parts of the kernel are not prepared for it.
      
      For example, Aneesh has shown [1] that such kernels can be compiled on
      ppc64 with 64k base pages by setting FORCE_MAX_ZONEORDER=8, which will
      run into a WARN_ON_ONCE(order >= MAX_ORDER) in compaction code right
      during boot.
      
      We can get pageblock_order >= MAX_ORDER when the default hugetlb size is
      bigger than the maximum allocation granularity of the buddy, in which
      case we are no longer talking about huge pages but instead gigantic
      pages.
      
      Having pageblock_order >= MAX_ORDER can only make alloc_contig_range()
      of such gigantic pages more likely to succeed.
      
      Reliable use of gigantic pages requires either boot-time allocation or
      CMA; there is no need to overcomplicate some places in the kernel to
      optimize for corner cases that are broken in other areas of the kernel.
      
      This patch (of 2):
      
      Let's enforce pageblock_order < MAX_ORDER and simplify.
      
      Especially patch #1 can be regarded as a cleanup before:
      	[PATCH v5 0/6] Use pageblock_order for cma and alloc_contig_range
      	alignment. [2]
      
      [1] https://lkml.kernel.org/r/87r189a2ks.fsf@linux.ibm.com
      [2] https://lkml.kernel.org/r/20220211164135.1803616-1-zi.yan@sent.com
      
      Link: https://lkml.kernel.org/r/20220214174132.219303-2-david@redhat.com
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Zi Yan <ziy@nvidia.com>
      Acked-by: Rob Herring <robh@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: John Garry via iommu <iommu@lists.linux-foundation.org>
      
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sched/numa: Fix boot crash on arm64 systems · ab31c7fd
      Huang, Ying authored
      Qian Cai reported a boot crash on arm64 systems, caused by:
      
        0fb3978b ("sched/numa: Fix NUMA topology for systems with CPU-less nodes")
      
      The bug is that node_state() must be supplied a valid node_states[] array index,
      but in task_numa_placement() the max_nid search can fail with NUMA_NO_NODE,
      which is not a valid index.
      
      Fix it by checking that max_nid is a valid index.
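The class of bug can be illustrated with a small user-space model: a sentinel such as NUMA_NO_NODE (-1 in the kernel) must be rejected before it is used as an array index. The table `node_has_cpu[]`, the constant `MAX_NODES` and the helper name are hypothetical stand-ins for this sketch, not the kernel's node_states[] machinery.

```c
#include <stdbool.h>

#define NUMA_NO_NODE (-1)   /* sentinel meaning "no node found" */
#define MAX_NODES    4      /* hypothetical size for this sketch */

/* Hypothetical per-node state table, indexed by node id. */
static const bool node_has_cpu[MAX_NODES] = { true, false, true, false };

/* Look up node state only after confirming nid is a valid index. */
static bool node_is_valid_cpu_node(int nid)
{
    if (nid == NUMA_NO_NODE)
        return false;               /* the search found nothing */
    if (nid < 0 || nid >= MAX_NODES)
        return false;               /* out of range for the table */
    return node_has_cpu[nid];
}
```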
      
      [ mingo: Added changelog. ]
      
      Fixes: 0fb3978b ("sched/numa: Fix NUMA topology for systems with CPU-less nodes")
      Reported-by: Qian Cai <quic_qiancai@quicinc.com>
      Tested-by: Qian Cai <quic_qiancai@quicinc.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. Mar 21, 2022
    • mm: Add DEFINE_PAGE_VMA_WALK and DEFINE_FOLIO_VMA_WALK · eed05e54
      Matthew Wilcox (Oracle) authored
      
      Instead of declaring a struct page_vma_mapped_walk directly,
      use these helpers to allow us to transition to a PFN approach in the
      following patches.
      
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    • mm/truncate: Inline invalidate_complete_page() into its one caller · 1b8ddbee
      Matthew Wilcox (Oracle) authored
      
      invalidate_inode_page() is the only caller of invalidate_complete_page()
      and inlining it reveals that the first check is unnecessary (because we
      hold the page locked, and we just retrieved the mapping from the page).
      Actually, it does make a difference, in that tail pages no longer fail
      at this check, so it's now possible to remove a tail page from a mapping.
      
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • watch_queue: Actually free the watch · 3d8dcf27
      David Howells authored
      free_watch() does everything barring actually freeing the watch object.  Fix
      this by adding the missing kfree.
      
      kmemleak produces a report something like the following.  Note that as an
      address can be seen in the first word, the watch would appear to have gone
      through call_rcu().
      
      BUG: memory leak
      unreferenced object 0xffff88810ce4a200 (size 96):
        comm "syz-executor352", pid 3605, jiffies 4294947473 (age 13.720s)
        hex dump (first 32 bytes):
          e0 82 48 0d 81 88 ff ff 00 00 00 00 00 00 00 00  ..H.............
          80 a2 e4 0c 81 88 ff ff 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff8214e6cc>] kmalloc include/linux/slab.h:581 [inline]
          [<ffffffff8214e6cc>] kzalloc include/linux/slab.h:714 [inline]
          [<ffffffff8214e6cc>] keyctl_watch_key+0xec/0x2e0 security/keys/keyctl.c:1800
          [<ffffffff8214ec84>] __do_sys_keyctl+0x3c4/0x490 security/keys/keyctl.c:2016
          [<ffffffff84493a25>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
          [<ffffffff84493a25>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
          [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-and-tested-by: <syzbot+6e2de48f06cdb2884bfc@syzkaller.appspotmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • watch_queue: Fix NULL dereference in error cleanup · a635415a
      David Howells authored
      In watch_queue_set_size(), the error cleanup code doesn't take account of
      the fact that __free_page() can't handle a NULL pointer when trying to free
      up buffer pages that did get allocated.
      
      Fix this by only calling __free_page() on the pages actually allocated.
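The cleanup pattern can be sketched in user space as follows. This is an illustration of "unwind only what was allocated", not the kernel patch itself: `alloc_buffers()` and its `fail_at` parameter (used to simulate an allocation failure) are hypothetical. In user space free(NULL) happens to be harmless, but the loop is written so that it never visits a slot that was not successfully allocated, which is what a function like __free_page() requires.

```c
#include <stdlib.h>

/*
 * Allocate nr buffers into bufs[].  The allocation at index fail_at is
 * forced to fail to simulate running out of memory.  Returns 0 on success,
 * -1 on failure after releasing only the buffers that were allocated.
 */
static int alloc_buffers(void **bufs, size_t nr, size_t fail_at)
{
    size_t i;

    for (i = 0; i < nr; i++) {
        bufs[i] = (i == fail_at) ? NULL : malloc(64);
        if (!bufs[i])
            goto error;
    }
    return 0;

error:
    /* Walk back from the failing index; slots >= i were never filled. */
    while (i > 0) {
        i--;
        free(bufs[i]);
        bufs[i] = NULL;
    }
    return -1;
}
```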
      
      Without the fix, this can lead to something like the following:
      
      BUG: KASAN: null-ptr-deref in __free_pages+0x1f/0x1b0 mm/page_alloc.c:5473
      Read of size 4 at addr 0000000000000034 by task syz-executor168/3599
      ...
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       __kasan_report mm/kasan/report.c:446 [inline]
       kasan_report.cold+0x66/0xdf mm/kasan/report.c:459
       check_region_inline mm/kasan/generic.c:183 [inline]
       kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
       instrument_atomic_read include/linux/instrumented.h:71 [inline]
       atomic_read include/linux/atomic/atomic-instrumented.h:27 [inline]
       page_ref_count include/linux/page_ref.h:67 [inline]
       put_page_testzero include/linux/mm.h:717 [inline]
       __free_pages+0x1f/0x1b0 mm/page_alloc.c:5473
       watch_queue_set_size+0x499/0x630 kernel/watch_queue.c:275
       pipe_ioctl+0xac/0x2b0 fs/pipe.c:632
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:874 [inline]
       __se_sys_ioctl fs/ioctl.c:860 [inline]
       __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-and-tested-by: <syzbot+d55757faa9b80590767b@syzkaller.appspotmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
  4. Mar 20, 2022
    • tracing: Have type enum modifications copy the strings · 795301d3
      Steven Rostedt (Google) authored
      When an enum is used in the visible parts of a trace event that is
      exported to user space, user space applications like perf and
      trace-cmd do not have a way to know what the value of the enum is. To
      solve this, at boot up (or module load) the printk formats are modified to
      replace each enum with its numeric value in the string output.
      
      Array fields of the event are defined by [<nr-elements>] in the type
      portion of the format file so that the user space parsers can correctly
      parse the array into the appropriate size chunks. But in some trace
      events, an enum is used in defining the size of the array, which once
      again breaks the parsing of user space tooling.
      
      This was solved the same way as the print formats were, but it modified
      the type strings of the trace event. This caused crashes on some
      architectures because the type string, as opposed to the print string,
      is a const string value. This was not detected on x86, as it appears
      that const strings are still writable there (at least at boot up), but
      on other architectures this is not the case, and writing to a const
      string will cause a kernel fault.
      
      To fix this, use kstrdup() to copy the type before modifying it. If the
      trace event is for the core kernel there's no need to free it because the
      string will be in use for as long as the machine is online. For
      modules, create a linked list to store all the strings allocated for
      modules and, when the module is removed, free them.
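The copy-before-modify idea can be sketched in user-space C. Here malloc() stands in for the kernel's kstrdup(), and `replace_enum_in_type()` is a hypothetical helper written for this illustration; the type string and enum name used in the usage note are likewise only examples. The const input is never written to; the substitution happens in a fresh heap copy that the caller owns.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of type in which the first occurrence of
 * name is replaced by the decimal form of value.  The const input string
 * is left untouched.  Returns NULL if name is absent or allocation fails.
 * The caller frees the result.
 */
static char *replace_enum_in_type(const char *type, const char *name, int value)
{
    const char *hit = strstr(type, name);
    const char *rest;
    char num[16];
    size_t prefix, len;
    char *copy;

    if (!hit)
        return NULL;

    rest = hit + strlen(name);
    snprintf(num, sizeof(num), "%d", value);
    prefix = (size_t)(hit - type);
    len = prefix + strlen(num) + strlen(rest) + 1;

    copy = malloc(len);
    if (!copy)
        return NULL;

    memcpy(copy, type, prefix);
    snprintf(copy + prefix, len - prefix, "%s%s", num, rest);
    return copy;
}
```

For instance, calling it with "char[TASK_COMM_LEN]", "TASK_COMM_LEN" and 16 yields a new string "char[16]" while the original literal stays read-only.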
      
      Link: https://lore.kernel.org/all/yt9dr1706b4i.fsf@linux.ibm.com/
      Link: https://lkml.kernel.org/r/20220318153432.3984b871@gandalf.local.home
      
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Sven Schnelle <svens@linux.ibm.com>
      Reported-by: Sven Schnelle <svens@linux.ibm.com>
      Fixes: b3bc8547 ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
  5. Mar 17, 2022
  6. Mar 15, 2022
  7. Mar 11, 2022
  8. Mar 10, 2022
  9. Mar 09, 2022
    • ftrace: Fix some W=1 warnings in kernel doc comments · 78cbc651
      Jiapeng Chong authored
      Clean up the following clang-w1 warning:
      
      kernel/trace/ftrace.c:7827: warning: Function parameter or member 'ops'
      not described in 'unregister_ftrace_function'.
      
      kernel/trace/ftrace.c:7805: warning: Function parameter or member 'ops'
      not described in 'register_ftrace_function'.
      
      Link: https://lkml.kernel.org/r/20220307004303.26399-1-jiapeng.chong@linux.alibaba.com
      
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • tracing/osnoise: Force quiescent states while tracing · caf4c86b
      Nicolas Saenz Julienne authored
      At the moment, running osnoise on a nohz_full CPU or at an uncontested
      FIFO priority on a PREEMPT_RCU kernel might have the side effect of
      extending grace periods too much. This will entice RCU to force a
      context switch on the wayward CPU to end the grace period, all while
      introducing unwarranted noise into the tracer. This behaviour is
      unavoidable as overly extending grace periods might exhaust the system's
      memory.
      
      This exact problem is what extended quiescent states (EQS) were
      created for; rcu_momentary_dyntick_idle() emulates them by
      performing a zero duration EQS. So let's make use of it.
      
      In the common case rcu_momentary_dyntick_idle() is fairly inexpensive:
      atomically incrementing a local per-CPU counter and doing a store. So it
      shouldn't affect osnoise's measurements (which have a 1us granularity),
      and we'll call it unconditionally.
      
      The uncommon case involves calling rcu_momentary_dyntick_idle() after
      having the osnoise process:
      
       - Receive an expedited quiescent state IPI with preemption disabled or
         during an RCU critical section. (activates rdp->cpu_no_qs.b.exp
         code-path).
      
       - Be preempted within an RCU critical section and have the
         subsequent outermost rcu_read_unlock() called with interrupts
         disabled. (t->rcu_read_unlock_special.b.blocked code-path).
      
      Neither of those is possible at the moment, and both are unlikely to be
      in the future given osnoise's loop design. On top of this, the noise
      generated by the situations described above is unavoidable, and if not
      exposed by rcu_momentary_dyntick_idle() will eventually be seen in
      subsequent rcu_read_unlock() calls or schedule operations.
      
      Link: https://lkml.kernel.org/r/20220307180740.577607-1-nsaenzju@redhat.com
      
      Cc: stable@vger.kernel.org
      Fixes: bce29ac9 ("trace: Add osnoise tracer")
      Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
      Acked-by: Paul E. McKenney <paulmck@kernel.org>
      Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • tracing/osnoise: Do not unregister events twice · f0cfe17b
      Daniel Bristot de Oliveira authored
      Nicolas reported that using:
      
       # trace-cmd record -e all -M 10 -p osnoise --poll
      
      Resulted in the following kernel warning:
      
       ------------[ cut here ]------------
       WARNING: CPU: 0 PID: 1217 at kernel/tracepoint.c:404 tracepoint_probe_unregister+0x280/0x370
       [...]
       CPU: 0 PID: 1217 Comm: trace-cmd Not tainted 5.17.0-rc6-next-20220307-nico+ #19
       RIP: 0010:tracepoint_probe_unregister+0x280/0x370
       [...]
       CR2: 00007ff919b29497 CR3: 0000000109da4005 CR4: 0000000000170ef0
       Call Trace:
        <TASK>
        osnoise_workload_stop+0x36/0x90
        tracing_set_tracer+0x108/0x260
        tracing_set_trace_write+0x94/0xd0
        ? __check_object_size.part.0+0x10a/0x150
        ? selinux_file_permission+0x104/0x150
        vfs_write+0xb5/0x290
        ksys_write+0x5f/0xe0
        do_syscall_64+0x3b/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7ff919a18127
       [...]
       ---[ end trace 0000000000000000 ]---
      
      The warning complains about an attempt to unregister an
      unregistered tracepoint.
      
      This happens with trace-cmd because it first stops tracing and
      then switches the tracer to nop, which is equivalent to:
      
        # cd /sys/kernel/tracing/
        # echo osnoise > current_tracer
        # echo 0 > tracing_on
        # echo nop > current_tracer
      
      The osnoise tracer stops the workload when no trace instance
      is actually collecting data. This can be caused either by
      disabling tracing or by disabling the tracer itself.
      
      To avoid unregistering events twice, use the existing
      trace_osnoise_callback_enabled variable to check if the events
      (and the workload) are actually active before trying to
      deactivate them.
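The guard can be modelled in user space with a single flag. In this sketch, `callbacks_enabled` plays the role the commit message assigns to trace_osnoise_callback_enabled, and `unregister_calls` merely counts how often the teardown actually runs; both names and the two functions are hypothetical and do not reproduce the kernel code.

```c
#include <stdbool.h>

/* Stand-in for trace_osnoise_callback_enabled. */
static bool callbacks_enabled;

/* Counts real teardowns; a second stop must not increment it. */
static int unregister_calls;

static void workload_start(void)
{
    callbacks_enabled = true;
}

static void workload_stop(void)
{
    /* Already stopped (e.g. tracing_on was set to 0 first): do nothing. */
    if (!callbacks_enabled)
        return;

    callbacks_enabled = false;
    unregister_calls++;     /* stands in for unregistering the events */
}
```

Stopping twice in a row, as happens when tracing is disabled and the tracer is then switched to nop, results in only one teardown.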
      
      Link: https://lore.kernel.org/all/c898d1911f7f9303b7e14726e7cc9678fbfb4a0e.camel@redhat.com/
      Link: https://lkml.kernel.org/r/938765e17d5a781c2df429a98f0b2e7cc317b022.1646823913.git.bristot@kernel.org
      
      Cc: stable@vger.kernel.org
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Fixes: 2fac8d64 ("tracing/osnoise: Allow multiple instances of the same tracer")
      Reported-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
  10. Mar 08, 2022
  11. Mar 07, 2022
    • tick/rcu: Stop allowing RCU_SOFTIRQ in idle · 0345691b
      Frederic Weisbecker authored
      
      RCU_SOFTIRQ used to be special in that it could be raised on purpose
      within the idle path to prevent the tick from stopping. Some code still
      suppresses unnecessary warnings related to this specific behaviour
      while entering dynticks-idle mode.
      
      However the nohz layout has changed quite a bit in ten years, and the
      removal of CONFIG_RCU_FAST_NO_HZ has been the final straw for this
      safe-conduct. Now the RCU_SOFTIRQ vector is expected to be raised from
      sane places.
      
      A remaining corner case is still tolerated, though: when the vector is
      invoked in the fragile hotplug path.
      
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Paul Menzel <pmenzel@molgen.mpg.de>
    • tick/rcu: Remove obsolete rcu_needs_cpu() parameters · 29845399
      Frederic Weisbecker authored
      
      With the removal of CONFIG_RCU_FAST_NO_HZ, the parameters in
      rcu_needs_cpu() are not necessary anymore. Simply remove them.
      
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Paul Menzel <pmenzel@molgen.mpg.de>