I have done all the patches, but I have not disabled hyperthreading. I have waited and listened. One person I know and trust has found that now when he buys three servers, he has to buy four. And that is due to disable of HypterThreading, and all the security patches we have been hit with. I cannot easily buy additional servers so I am not going to disable hypterthreading so I need to get rid of that darn message.
We need to work in Advanced System Settings on each host.
So type uservars into the filter bar, and than scroll down a bit and you will see the parameter – as you see above.
Then use the Edit button, and filter again and scroll again.
You can see above that I have changed the 0 to 1 which is the surpress value. Hit OK, and refresh the browser.
And the message is gone.
Now we know how to fix this message manually, I was going to do it via PowerShell. So much easier and do it to all the hosts in the cluster at the same time. However, I won’t be doing that right now. I have one server that doesn’t see any datastores right now. The host suggests the 10 GiB card is not working. But I think it is. So will wrestle with that, then figure out PowerShell, and add it to below. Sorry about the delay.
Bu yazımda sizlere bir warning hakkında bilgi vereceğim. ESXi host’unuzu update ettiğinizde aşağıdaki hata ile karşılaşabilrsiniz. Bu hata L1 Terminal Fault – VMM ‘den kaynaklanmaktadır. CVE-2018-3646‘da bulunan güvenlik açıklarını kapatmak için VMSA-2018-0020‘de bulunan patch’leri yükledikten sonra karşınıza gelir.
XXX esx.problem.hyperthreading.unmitigated.formatonhost not found XXX
Bu hatanın sebebi ile ile ilgili size kısa bilgiler vereceğim. Belirtmiş olduğum bu uyarıyı ESXi host update‘i yaptıktan sonra karşılaşabilirsiniz. Bu uyarının sebebi ise; VMSA-2018-0020 de belirtilen patch’den kaynaklanmaktadır. Bu konu ile ilgili ayrıca bir makale yazacağım. Orada detaylarını anlatıyor olacağım.
Önemli: Aşağıda belirtmiş olduğum işlemleri yaptığınızda HT’nin işlevselliğini kaybedersiniz. Bundan dolayı virtual machine’lere daha fazla kaynak vermeniz gerekebilir. Tabi bu durum ilerleyen patch’ler ile düzelecek diye belirtiliyor. Bundan dolayı Hyperthreading‘i kapatma gibi bir hatada bulunmayın.
Ek olarak Foreshadow – L1 Terminal Fault ile ilgili Tolgahan’ın yazmış olduğu makaleyi buradaki linkten inceleyebilirsiniz.
Bu sorunun isterseniz vSphere Web Client/Host Client isterseniz de CLI üzerinden çözebilirsiniz.
Host Client üzerinden ESXi Side-Channel-Aware-Scheduler’i enable duruma getirmek için;
Bunun için ilk olarak ESXi host’a login oluyoruz. Daha sonrasında Manage > System > Advanced Settings bölümüne giriş yapıyoruz. Sağ üstte yer alan search bölümüne aşağıdaki kelimeyi yazıyoruz.
İlgili key girdisini bulduktan sonra hemen üstünde yer alan edit option bölümüne giriş yapıyoruz.
Default olarak bu değer False olarak geliyor. Bunu True olarak işaretliyor ve Save butonuna basıyoruz. İşlemin geçerli olması için ESXi host’u reboot etmeniz gerekmektedir.
CLI üzerinden ESXi Side-Channel-Aware-Scheduler’i enable etmek için;
İlk olarak ESXi host’a SSH ile bağlanıyoruz. SSH ile bağlandıktan sonra aşağıdaki komutu çalıştırıyoruz.
Gördüğünüz gibi FALSE olarak gözükmektedir. Bunu aşağıdaki komut ile enable duruma getiriyoruz.
Komutu çalıştırdıktan sonra ESXi host’u reboot etmeniz gerekmektedir.
Reboot işlemini tamamladıktan sonra tekrar aşağıdaki komutu çalıştırıp değerlerin TRUE olduğunu görebilirsiniz.
Bu sorunu aşağıdaki ESXi sürümlerinde görebilirsiniz.
- VMware vSphere ESXi 5.5
- VMware vSphere ESXi 6.0
- VMware vSphere ESXi 6.5
- VMware ESXi 6.7
Konu ile ilgili aşağıdaki KB’yi inceleyebilirsiniz.
Upgraded one of our ESXi hosts with the latest patches released today that are aimed at fixing the L1 Terminal Fault issues. After that the host started giving this warning: esx.problem.hyperthreading.unmitigated . No idea what it’s supposed to mean!
Went to Configure > Settings > Advanced System Settings and searched for anything with “hyperthread” in it. Found VMkernel.Boot.hyperthreadingMitigation , which was set to “false” but sounded suspiciously similar to the warning I had. Changed it to “true”, rebooted the host, and Googled on this setting to come across this KB article. It’s a good read but here’s some excerpts if you are interested in only the highlights:
Like Meltdown, Rogue System Register Read, and “Lazy FP state restore”, the “L1 Terminal Fault” vulnerability can occur when affected Intel microprocessors speculate beyond an unpermitted data access. By continuing the speculation in these cases, the affected Intel microprocessors expose a new side-channel for attack. (Note, however, that architectural correctness is still provided as the speculative operations will be later nullified at instruction retirement.)
CVE-2018-3646 is one of these Intel microprocessor vulnerabilities and impacts hypervisors. It may allow a malicious VM running on a given CPU core to effectively infer contents of the hypervisor’s or another VM’s privileged information residing at the same time in the same core’s L1 Data cache. Because current Intel processors share the physically-addressed L1 Data Cache across both logical processors of a Hyperthreading (HT) enabled core, indiscriminate simultaneous scheduling of software threads on both logical processors creates the potential for further information leakage. CVE-2018-3646 has two currently known attack vectors which will be referred to here as “Sequential-Context” and “Concurrent-Context.” Both attack vectors must be addressed to mitigate CVE-2018-3646..
Attack Vector Summary
- Sequential-context attack vector: a malicious VM can potentially infer recently accessed L1 data of a previous context (hypervisor thread or other VM thread) on either logical processor of a processor core.
- Concurrent-context attack vector: a malicious VM can potentially infer recently accessed L1 data of a concurrently executing context (hypervisor thread or other VM thread) on the other logical processor of the hyperthreading-enabled processor core.
Mitigation Summary
- Mitigation of the Sequential-Context attack vector is achieved by vSphere updates and patches. This mitigation is enabled by default and does not impose a significant performance impact. Please see resolution section for details.
- Mitigation of the Concurrent-context attack vector requires enablement of a new feature known as the ESXi Side-Channel-Aware Scheduler. The initial version of this feature will only schedule the hypervisor and VMs on one logical processor of an Intel Hyperthreading-enabled core. This feature may impose a non-trivial performance impact and is not enabled by default.
So that’s what the warning was about. To enable the ESXi Side Channel Aware scheduler we need to set the key above to “true”. More excerpts:
The Concurrent-context attack vector is mitigated through enablement of the ESXi Side-Channel-Aware Scheduler which is included in the updates and patches listed in VMSA-2018-0020. This scheduler is not enabled by default. Enablement of this scheduler may impose a non-trivial performance impact on applications running in a vSphere environment. The goal of the Planning Phase is to understand if your current environment has sufficient CPU capacity to enable the scheduler without operational impact.
The following list summarizes potential problem areas after enabling the ESXi Side-Channel-Aware Scheduler:
- VMs configured with vCPUs greater than the physical cores available on the ESXi host
- VMs configured with custom affinity or NUMA settings
- VMs with latency-sensitive configuration
- ESXi hosts with Average CPU Usage greater than 70%
- Hosts with custom CPU resource management options enabled
- HA Clusters where a rolling upgrade will increase Average CPU Usage above 100%
Note: It may be necessary to acquire additional hardware, or rebalance existing workloads, before enablement of the ESXi Side-Channel-Aware Scheduler. Organizations can choose not to enable the ESXi Side-Channel-Aware Scheduler after performing a risk assessment and accepting the risk posed by the Concurrent-context attack vector. This is NOT RECOMMENDED and VMware cannot make this decision on behalf of an organization.
So to fix the second issue we need to enable the new scheduler. That can have a performance hit, so best to enable it manually so you are aware and can keep an eye on the load and performance hits. Also, if you are not in a shared environment and don’t care, you don’t need to enable it either. Makes sense.
That warning message could have been a bit more verbose though! 🙂
Источник: