  1. Mar 11, 2022
  2. Mar 08, 2022
  3. Mar 01, 2022
     • KVM: Drop kvm_reload_remote_mmus(), open code request in x86 users · 2f6f66cc
      Sean Christopherson authored
      
      Remove the generic kvm_reload_remote_mmus() and open code its
      functionality into the two x86 callers.  x86 is (obviously) the only
      architecture that uses the hook, and is also the only architecture that
      uses KVM_REQ_MMU_RELOAD in a way that's consistent with the name.  That
       will change in a future patch: when zapping a single shadow page, x86
       doesn't actually _need_ to reload all vCPUs' MMUs; only the MMUs whose
       root is being zapped need to be reloaded.
      
      s390 also uses KVM_REQ_MMU_RELOAD, but for a slightly different purpose.
      
      Drop the generic code in anticipation of implementing s390 and x86 arch
      specific requests, which will allow dropping KVM_REQ_MMU_RELOAD entirely.
      
      Opportunistically reword the x86 TDP MMU comment to avoid making
      references to functions (and requests!) when possible, and to remove the
      rather ambiguous "this".
      
      No functional change intended.
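       
       As a rough illustration (not the literal diff), the generic helper was
       essentially a thin wrapper around the existing request API, so open
       coding it amounts to issuing the request directly at the two x86 call
       sites; kvm_make_all_cpus_request() and KVM_REQ_MMU_RELOAD are existing
       KVM primitives, but the exact call sites are not reproduced here:
       
           /* Before: generic helper, used only by x86. */
           void kvm_reload_remote_mmus(struct kvm *kvm)
           {
                   kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD);
           }
       
           /* After: each x86 caller makes the request itself. */
           kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD);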
      
      Cc: Ben Gardon <bgardon@google.com>
       Signed-off-by: Sean Christopherson <seanjc@google.com>
       Reviewed-by: Ben Gardon <bgardon@google.com>
       Message-Id: <20220225182248.3812651-4-seanjc@google.com>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  4. Feb 25, 2022
     • KVM: Move VM's worker kthreads back to the original cgroup before exiting. · e45cce30
      Vipin Sharma authored
      
       VM worker kthreads can linger in the VM process's cgroup for some time
      after KVM terminates the VM process.
      
      KVM terminates the worker kthreads by calling kthread_stop() which waits
      on the 'exited' completion, triggered by exit_mm(), via mm_release(), in
       do_exit() during the kthread's exit.  However, these kthreads are
       removed from the cgroup by cgroup_exit(), which happens after
       exit_mm(). Therefore, a VM process can terminate between the
       exit_mm() and cgroup_exit() calls, leaving only the worker kthreads
       in the cgroup.
      
      Moving worker kthreads back to the original cgroup (kthreadd_task's
      cgroup) makes sure that the cgroup is empty as soon as the main VM
      process is terminated.
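       
       A minimal sketch of the idea, assuming the in-kernel
       cgroup_attach_task_all() helper is used to re-attach the worker to
       kthreadd's cgroup on the kthread's exit path; the error handling shown
       is illustrative, not quoted from the patch:
       
           /* Leave the VM's cgroup before the worker kthread exits. */
           ret = cgroup_attach_task_all(kthreadd_task, current);
           if (ret)
                   pr_warn("%s: failed to move kthread back to its original cgroup: %d\n",
                           __func__, ret);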
      
       Signed-off-by: Vipin Sharma <vipinsh@google.com>
       Suggested-by: Sean Christopherson <seanjc@google.com>
       Message-Id: <20220222054848.563321-1-vipinsh@google.com>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  5. Feb 17, 2022
     • KVM: Fix lockdep false negative during host resume · 4cb9a998
      Wanpeng Li authored
      
      I saw the below splatting after the host suspended and resumed.
      
         WARNING: CPU: 0 PID: 2943 at kvm/arch/x86/kvm/../../../virt/kvm/kvm_main.c:5531 kvm_resume+0x2c/0x30 [kvm]
         CPU: 0 PID: 2943 Comm: step_after_susp Tainted: G        W IOE     5.17.0-rc3+ #4
         RIP: 0010:kvm_resume+0x2c/0x30 [kvm]
         Call Trace:
          <TASK>
          syscore_resume+0x90/0x340
          suspend_devices_and_enter+0xaee/0xe90
          pm_suspend.cold+0x36b/0x3c2
          state_store+0x82/0xf0
          kernfs_fop_write_iter+0x1b6/0x260
          new_sync_write+0x258/0x370
          vfs_write+0x33f/0x510
          ksys_write+0xc9/0x160
          do_syscall_64+0x3b/0xc0
          entry_SYSCALL_64_after_hwframe+0x44/0xae
      
       lockdep_is_held() can return -1 when lockdep is disabled, which triggers
       this warning. Let's use lockdep_assert_not_held(), which detects
       incorrect calls made while holding a lock and also avoids false
       negatives when lockdep is disabled.
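       
       A hedged sketch of the before/after shape of the check in kvm_resume();
       the assumption that the lock in question is kvm_count_lock comes from
       the function named in the splat, and the lines are not quoted verbatim
       from the patch:
       
           /*
            * Before: lockdep_is_held() returns -1 ("unknown") when lockdep
            * is disabled, so the WARN can fire spuriously on resume.
            */
           WARN_ON(lockdep_is_held(&kvm_count_lock));
       
           /*
            * After: a no-op when lockdep is disabled, but still catches
            * callers that actually hold the lock.
            */
           lockdep_assert_not_held(&kvm_count_lock);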
      
       Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
       Message-Id: <1644920142-81249-1-git-send-email-wanpengli@tencent.com>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. Feb 10, 2022
  7. Jan 28, 2022
     • KVM: eventfd: Fix false positive RCU usage warning · 6a0c6170
      Hou Wenlong authored
      
      Fix the following false positive warning:
       =============================
       WARNING: suspicious RCU usage
       5.16.0-rc4+ #57 Not tainted
       -----------------------------
       arch/x86/kvm/../../../virt/kvm/eventfd.c:484 RCU-list traversed in non-reader section!!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 2, debug_locks = 1
       3 locks held by fc_vcpu 0/330:
        #0: ffff8884835fc0b0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x88/0x6f0 [kvm]
        #1: ffffc90004c0bb68 (&kvm->srcu){....}-{0:0}, at: vcpu_enter_guest+0x600/0x1860 [kvm]
        #2: ffffc90004c0c1d0 (&kvm->irq_srcu){....}-{0:0}, at: kvm_notify_acked_irq+0x36/0x180 [kvm]
      
       stack backtrace:
       CPU: 26 PID: 330 Comm: fc_vcpu 0 Not tainted 5.16.0-rc4+
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
       Call Trace:
        <TASK>
        dump_stack_lvl+0x44/0x57
        kvm_notify_acked_gsi+0x6b/0x70 [kvm]
        kvm_notify_acked_irq+0x8d/0x180 [kvm]
        kvm_ioapic_update_eoi+0x92/0x240 [kvm]
        kvm_apic_set_eoi_accelerated+0x2a/0xe0 [kvm]
        handle_apic_eoi_induced+0x3d/0x60 [kvm_intel]
        vmx_handle_exit+0x19c/0x6a0 [kvm_intel]
        vcpu_enter_guest+0x66e/0x1860 [kvm]
        kvm_arch_vcpu_ioctl_run+0x438/0x7f0 [kvm]
        kvm_vcpu_ioctl+0x38a/0x6f0 [kvm]
        __x64_sys_ioctl+0x89/0xc0
        do_syscall_64+0x3a/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
       Since kvm_unregister_irq_ack_notifier() does synchronize_srcu(&kvm->irq_srcu),
       kvm->irq_ack_notifier_list is protected by kvm->irq_srcu. In fact, the
       kvm->irq_srcu SRCU read lock is held in kvm_notify_acked_irq(), making this
       a false positive warning. So use hlist_for_each_entry_srcu() instead of
       hlist_for_each_entry_rcu().
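       
       A sketch of the resulting traversal in kvm_notify_acked_gsi(), assuming
       the standard four-argument hlist_for_each_entry_srcu() form where the
       last argument is the lockdep condition; the surrounding code is not
       reproduced verbatim:
       
           struct kvm_irq_ack_notifier *kian;
       
           hlist_for_each_entry_srcu(kian, &kvm->irq_ack_notifier_list, link,
                                     srcu_read_lock_held(&kvm->irq_srcu))
                   if (kian->gsi == gsi)
                           kian->irq_acked(kian);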
      
       Reviewed-by: Sean Christopherson <seanjc@google.com>
       Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com>
       Message-Id: <f98bac4f5052bad2c26df9ad50f7019e40434512.1643265976.git.houwenlong.hwl@antgroup.com>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  8. Jan 26, 2022
  9. Jan 24, 2022
  10. Jan 19, 2022
     • KVM: Move x86 VMX's posted interrupt list_head to vcpu_vmx · 12a8eee5
      Sean Christopherson authored
      
      Move the seemingly generic block_vcpu_list from kvm_vcpu to vcpu_vmx, and
      rename the list and all associated variables to clarify that it tracks
       the set of vCPUs that need to be poked on a posted interrupt to the wakeup
      vector.  The list is not used to track _all_ vCPUs that are blocking, and
      the term "blocked" can be misleading as it may refer to a blocking
       condition in the host or the guest, whereas the PI wakeup case is
      specifically for the vCPUs that are actively blocking from within the
      guest.
      
      No functional change intended.
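       
       For illustration, the moved field ends up looking roughly like the
       fragment below; the post-rename name shown (pi_wakeup_list) is an
       assumption, and existing members of struct vcpu_vmx are elided:
       
           struct vcpu_vmx {
                   /* ... existing members elided ... */
       
                   /*
                    * Entry on the per-CPU list of vCPUs that must be poked
                    * at the posted-interrupt wakeup vector; previously the
                    * generic-looking block_vcpu_list in struct kvm_vcpu.
                    */
                   struct list_head pi_wakeup_list;
           };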
      
       Signed-off-by: Sean Christopherson <seanjc@google.com>
       Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
       Message-Id: <20211208015236.1616697-7-seanjc@google.com>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
     • KVM: Drop unused kvm_vcpu.pre_pcpu field · e6eec09b
      Sean Christopherson authored
      
      Remove kvm_vcpu.pre_pcpu as it no longer has any users.  No functional
      change intended.
      
       Signed-off-by: Sean Christopherson <seanjc@google.com>
       Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
       Message-Id: <20211208015236.1616697-6-seanjc@google.com>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
     • KVM: avoid warning on s390 in mark_page_dirty · e09fccb5
      Christian Borntraeger authored
      
      Avoid warnings on s390 like
      [ 1801.980931] CPU: 12 PID: 117600 Comm: kworker/12:0 Tainted: G            E     5.17.0-20220113.rc0.git0.32ce2abb03cf.300.fc35.s390x+next #1
      [ 1801.980938] Workqueue: events irqfd_inject [kvm]
      [...]
      [ 1801.981057] Call Trace:
      [ 1801.981060]  [<000003ff805f0f5c>] mark_page_dirty_in_slot+0xa4/0xb0 [kvm]
      [ 1801.981083]  [<000003ff8060e9fe>] adapter_indicators_set+0xde/0x268 [kvm]
      [ 1801.981104]  [<000003ff80613c24>] set_adapter_int+0x64/0xd8 [kvm]
      [ 1801.981124]  [<000003ff805fb9aa>] kvm_set_irq+0xc2/0x130 [kvm]
      [ 1801.981144]  [<000003ff805f8d86>] irqfd_inject+0x76/0xa0 [kvm]
      [ 1801.981164]  [<0000000175e56906>] process_one_work+0x1fe/0x470
      [ 1801.981173]  [<0000000175e570a4>] worker_thread+0x64/0x498
      [ 1801.981176]  [<0000000175e5ef2c>] kthread+0x10c/0x110
      [ 1801.981180]  [<0000000175de73c8>] __ret_from_fork+0x40/0x58
      [ 1801.981185]  [<000000017698440a>] ret_from_fork+0xa/0x40
      
       when writing to a guest from an irqfd worker, as long as we do not have
       the dirty ring.
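       
       A hedged sketch of one way to achieve this, assuming the vCPU check
       added by commit 2efd61a6 is simply compiled out on architectures
       without dirty-ring support (not quoted verbatim from the patch):
       
           #ifdef CONFIG_HAVE_KVM_DIRTY_RING
                   /* Only meaningful when dirty-ring tracking can be used. */
                   if (WARN_ON_ONCE(!vcpu) || WARN_ON_ONCE(vcpu->kvm != kvm))
                           return;
           #endif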
      
       Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
       Reluctantly-acked-by: David Woodhouse <dwmw@amazon.co.uk>
       Message-Id: <20220113122924.740496-1-borntraeger@linux.ibm.com>
       Fixes: 2efd61a6 ("KVM: Warn if mark_page_dirty() is called without an active vCPU")
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  11. Jan 07, 2022
     • KVM: Reinstate gfn_to_pfn_cache with invalidation support · 982ed0de
      David Woodhouse authored
      This can be used in two modes. There is an atomic mode where the cached
      mapping is accessed while holding the rwlock, and a mode where the
      physical address is used by a vCPU in guest mode.
      
      For the latter case, an invalidation will wake the vCPU with the new
      KVM_REQ_GPC_INVALIDATE, and the architecture will need to refresh any
      caches it still needs to access before entering guest mode again.
      
      Only one vCPU can be targeted by the wake requests; it's simple enough
      to make it wake all vCPUs or even a mask but I don't see a use case for
      that additional complexity right now.
      
      Invalidation happens from the invalidate_range_start MMU notifier, which
      needs to be able to sleep in order to wake the vCPU and wait for it.
      
      This means that revalidation potentially needs to "wait" for the MMU
      operation to complete and the invalidate_range_end notifier to be
      invoked. Like the vCPU when it takes a page fault in that period, we
      just spin — fixing that in a future...
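       
       A hedged sketch of the atomic-mode access pattern described above; the
       kvm_gfn_to_pfn_cache_check()/kvm_gfn_to_pfn_cache_refresh() names
       follow this series, but the parameter lists and the gpc/gpa/len
       variables shown are illustrative rather than quoted:
       
           read_lock(&gpc->lock);
           while (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpa, len)) {
                   read_unlock(&gpc->lock);
       
                   /* May block until a concurrent MMU invalidation finishes. */
                   if (kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len))
                           return;
       
                   read_lock(&gpc->lock);
           }
           /* Access the cached mapping via gpc->khva under the rwlock. */
           read_unlock(&gpc->lock);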
     • KVM: Warn if mark_page_dirty() is called without an active vCPU · 2efd61a6
      David Woodhouse authored
      The various kvm_write_guest() and mark_page_dirty() functions must only
       ever be called in the context of an active vCPU, because if dirty ring
       tracking is enabled, kvm_get_running_vcpu() may return NULL and
       kvm_dirty_ring_get() will then oops when it dereferences that NULL vcpu.
      
       This oops was reported by "butt3rflyh4ck" <butterflyhuangxx@gmail.com> in
       https://lore.kernel.org/kvm/CAFcO6XOmoS7EacN_n6v4Txk7xL7iqRa2gABg3F7E3Naf5uG94g@mail.gmail.com/
       
       That actual bug will be fixed under separate cover, but this warning
       should help to prevent new ones from being added.
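       
       As a sketch, the guard added to mark_page_dirty_in_slot() is roughly of
       this shape; kvm_get_running_vcpu() and WARN_ON_ONCE() are existing
       kernel APIs, but the lines are not quoted verbatim from the patch:
       
           struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
       
           /* Dirty-ring tracking requires an active vCPU on this CPU. */
           if (WARN_ON_ONCE(!vcpu) || WARN_ON_ONCE(vcpu->kvm != kvm))
                   return;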
      
       Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
       Message-Id: <20211210163625.2886-2-dwmw2@infradead.org>
       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  12. Dec 09, 2021
  13. Dec 08, 2021