In recent days, a customer reported an anomaly on an HPE SimpliVity cluster hosting instant clone Horizon VDIs. In detail:
vSphere with seven hosts present, two were always at 98% CPU utilization and 90% RAM utilization.
Continuous vMotion generated by the VM DRS to and from those two HOSTS.
After a careful analysis, we identified that there were no problems at the vSphere infrastructure level. The issue was due to a Simplivity feature called IWO.
By disabling IWO and keeping DRS active (Full automatic) I have an optimal balance of CPU and RAM load between hosts at the expense of a slight increase in I/O trip times
Scenario – Even VM Load Distribution
I want even VM load across my cluster in terms of CPU and memory. Data locality and I/O performance are not top priorities. Most applications are CPU and memory intensive, and adding 1ms to 2ms to I/O trip times will not impact application performance.
In this scenario, IWO can be disabled thus ensuring no DRS affinity rules are populated into vCenter server. Suppressing DRS affinity rules will allow VMware DRS or allow you to directly distribute VMs across the cluster as desired to ensure all VMs are adequately resourced in terms of CPU and memory. The ‘Data Access Not Optimized’ alarm can be suppressed within vCenter server.
If you see such an error on the Cluster object of a vSAN (in my case it appeared on two vSAN clusters managed by the same vCenter)
vSphere DRS functionality was impacted due to an unhealthy state vSphere Cluster Service …….
an unhealthy state of the Service cluster
Errors such as the following in the EAM log. vCenter LOG
EAM.log:
2023-01-26T13:16:39.996Z | INFO | vim-monitor | VcListener.java | 131 | Retrying in 10 sec.
2023-01-26T13:16:41.432Z | ERROR | vlsi | DispatcherImpl.java | 468 | Internal server error during dispatch
com.vmware.vim.binding.eam.fault.EamServiceNotInitialized: EAM is still loading from database. Please try again later.
at com.vmware.eam.vmomi.EAMInitRequestFilter.handleBody(EAMInitRequestFilter.java:57) ~[eam-server.jar:?]
at com.vmware.vim.vmomi.server.impl.DispatcherImpl$SingleRequestDispatcher.handleBody(DispatcherImpl.java:373) [vlsi-server.jar:?]
at com.vmware.vim.vmomi.server.impl.DispatcherImpl$SingleRequestDispatcher.dispatch(DispatcherImpl.java:290) [vlsi-server.jar:?]
at com.vmware.vim.vmomi.server.impl.DispatcherImpl.dispatch(DispatcherImpl.java:246) [vlsi-server.jar:?]
at com.vmware.vim.vmomi.server.http.impl.CorrelationDispatcherTask.run(CorrelationDispatcherTask.java:58) [vlsi-server.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_345]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_345]
at java.lang.Thread.run(Thread.java:750) [?:1.8.0_345]
2023-01-26T13:16:50.007Z | INFO | vim-monitor | ExtensionSessionRenewer.java | 190 | [Retry:Login:com.vmware.vim.eam:b55a7f93b59f0f7e] Re-login to vCenter because method: currentTime of managed object: null::ServiceInstance:ServiceInstance failed due to expired client session: null
2023-01-26T13:16:50.007Z | INFO | vim-monitor | OpId.java | 37 | [vim:loginExtensionByCertificate:913aec585658e328] created from [Retry:Login:com.vmware.vim.eam:b55a7f93b59f0f7e]
2023-01-26T13:16:51.440Z | ERROR | vlsi | DispatcherImpl.java | 468 | Internal server error during dispatch
com.vmware.vim.binding.eam.fault.EamServiceNotInitialized: EAM is still loading from database. Please try again later.
And you see the lack of vCLS VMs in the two vSANs
To resolve the anomaly you must proceed as follows:
vCenter Snapshots and Backup
Log in to the vCenter Server Appliance using SSH.
Run this command to enable access the Bash shell:
shell.set --enabled true
Type shell and press Enter.
Run this command to retrieve the vpxd-extension solution user certificate and key:
Note: If this produces the error “Hostname mismatch, certificate is not valid for ‘localhost'”, change ‘localhost’ to the FQDN or IP of the vCenter. The process is checking this value against the SAN entries of the certificate.
Note: The default user and domain is Administrator@vsphere.local. If this was changed during configuration, change the domain to match your environment. When prompted, type in the Administrator@domain.local password.
Restart EAM and start the rest of the services with these commands: