Configure ControlUp for VMware Horizon Instant Clone VDI monitoring

In this guide, we will analyze how to configure ControlUP COP (ControlUP on-Premise) to monitor a VMware Horizon 2309 infrastructure with Instant Clone Desktop Pools (we will not cover the installation part of the product)

The following steps are required:

  • Control UP COP Server Component Installation (Optionally use an external SQL instance or SQL EXPRESS present in the Server component installation)
  • Installing the Control UP Console (Can also be installed on the same server)
  • Installing Agent Control Up on the GoldImage
  • Horizon Infrastructure Inventory
  • VirtualMachine Inventory (For this step we can also implement an automatism)

Requirements for the server part:


COP Server
COP Server Console Machine
Machine Windows Server Windows Server orWindows
Operating System Windows Server supported versions:2022,2019,2016 Windows Server supported versions:2022,2019,2016
OR Windows 11, 10
CPU* 2 CPUs 2 CPUs
Memory* 8 GB RAM 8 GB RAM
Disk Space* 10 GB 10 GB
Required Software & Permissions
  • .NET Framework 4.8 or later
  • PowerShell 5.x or later
.NET Framework 4.5 or later

Requirements for Part DB:

MSSQL Versions (Standard, Enterprise, or Express) Maximum Database Size Collation
2022,2019,2017,2016,2014 10 GB SQL_Latin1_General_CP1_CI_AS

Requirements for the VDI part:


ControlUp Agent
ControlUp Agent
Machine No server installation necessary. Deployed onto Windows machines that are monitored by ControlUp(Linux monitored via API).
Operating system Windows Server supported versions:
202220192016 (Core or Full)ORWindows 11, 10
Required installed software .NET 4.5 or later
TCP PORT 40705

A Service Account to access the Horizon infrastructure:

The Read-Only role is sufficient for all monitoring purposes. If you want to perform built-in Horizon actions, then the service account needs the following permissions:

  • Enable Farm and Desktop Pools
  • Manage Machine
  • Manage Sessions
  • Manage Global Sessions (Cloud Pod architecture only)

So what is needed is:

Download the version of ControlUP COP from the VMware site

Log in to the customer portal and in the product area under Desktop & End-User Computing

A screenshot of a computer

Description automatically generated

Log in to OEM Addons

A screenshot of a computer

Description automatically generated

Download the on-premise version

Perform the basic installation

Once the COP version is installed and the console is installed, log in to our ControlUP installation

A screenshot of a computer

Description automatically generated

How to install the agent on the GoldImage:

  1. The agent MSI file is on the downloaded file zip from VMware Portal
  2. Open the Real-Time Console and go to Agent Settings and copy your Agents Authentication Key. The key is used to connect the Agent to your ControlUp environment.

A screenshot of a computer

Description automatically generated

  1. Run the installation of the MSI package on the machine where you want to install the Agent.
  2. During the installation, paste the authentication key that you copied from the Real-Time Console.

A screenshot of a computer

Description automatically generated

  1. Complete the installation. The Agent is installed on the machine and the machine can be monitored from the Real-Time Console.
  2. Take the snapshot
  3. Deploy the new master image on Desktop Pool

Now from the ControlUp Management console, we are able to:

  • Connect our Vmware Horizon infrastructure
  • Connect the instant clone machine

Add Horizon infrastructure:

A screenshot of a computer

Description automatically generated

Add the infrastructure info

A screenshot of a computer

Description automatically generated

Click on OK

A screen shot of a computer

Description automatically generated

Add the pod to the console

A screenshot of a computer

Description automatically generated

Now on the left panel, we have our Horizon infrastructure added.

A screenshot of a computer

Description automatically generated

To monitor correctly our instant clone (after adding the agent) we need to discover the VM like a Machine

A screenshot of a computer

Description automatically generated

Search with the partial name of the VDI machines

A screenshot of a computer

Description automatically generated

Select cancel

A screenshot of a computer error message

Description automatically generated

We are VM on the left control panel in black status

A screenshot of a computer

Description automatically generated

After a few seconds the VDI VM Goes to Green

A screenshot of a computer

Description automatically generated

Auto connect state must be enabled (this function is important when the instant clone VDI is removed and recreated).

A screenshot of a computer

Description automatically generated

Now we can monitoring the Instant-Clone VDI

Check the VDI logon duration

Now we can manage and control the infrastructure, for example, to check the logon duration

A screenshot of a computer

Description automatically generated

A screenshot of a computer

Description automatically generated

What happens when VDI instant clones are regenerated?

If a user disconnects from his VDI of the instant clone type, it is destroyed and recreated, on the ControlUp side this is put in the Red -> Yellow state until it returns to Green

When recreating

A screenshot of a computer

Description automatically generated

After recess

A screenshot of a computer

Description automatically generated

Dynamic inventory

For a dynamic inventory of VDI, we can use Synchronization with Universal Sync Script (I’ll talk about this in a future post)

EUC Synchronization with Universal Sync Script (controlup.com)

After installation, we can schedule or start manually the script to sync my ControlUP with my EUC infrastructure.

References:

How to Deploy the Agent on Your Master Image for PVS/MCS/Linked/Instant Clones (controlup.com)

EUC Synchronization with Universal Sync Script (controlup.com)

ControlUp On-Premises

Configure ControlUp for VMware Horizon Instant Clone VDI monitoring

vSphere Distributed Switch health check

For us VMware systems engineers who every day find ourselves “dialoguing” with those who manage the network ecosystem, we can only find the vSphere Distributed Switch health check function useful.

  1. What these checks allow us to highlight:

These are some of the common configuration errors that health check identifies:

  • Mismatched VLAN trunks between a vSphere distributed switch and a physical switch.
  • Mismatched MTU settings between physical network adapters, distributed switches, and physical switch ports.
  • Mismatched virtual switch teaming policies for the physical switch port-channel settings.

The network health check in vSphere monitors the following three network parameters at regular intervals:

  • VLAN: Checks whether vSphere distributed switch VLAN settings match trunk port configuration on the adjacent physical switch ports.
  • MTU: Checks whether the physical access switch port MTU setting based on per VLAN matches the vSphere distributed switch MTU setting.
  • Network adapter teaming: Checks whether the physical access switch ports EtherChannel setting matches the distributed switch distributed port group IP Hash teaming policy settings.
  1. How to activate:

Access the network section of our vCenter

Select the vDS on which we want to activate health checks

A screenshot of a computer

Description automatically generated

And enable the check that interests us:

A screenshot of a computer

Description automatically generated

  1. Where to check the outcome of the checks?

Wait a few minutes and already first feedback we can have it on ESXi hosts using the vDS in question, where if there are problems the classic red dot will be displayed

A screenshot of a computer

Description automatically generated

For more details, access the network section of our vCenter and select the vDS in question

A screenshot of a computer

Description automatically generated

And we can see that on the vmnic0 and vmnic3 of the first host, there are vLANs of which we have a Portgroup but which are not proposed correctly on all the ports of the switches to which we have attested our hosts. Then we have to have the configuration verified by our colleagues in the network.

  1. How to turn it off:

Repeat the enabling steps but this time select disable.

  1. Risks in activating it (we always consider activating it for a short time)

Depending on the options that you select, the vSphere Distributed Switch Health Check can generate a significant number of MAC addresses for testing teaming policy, MTU size, vLAN configuration, resulting in extra network traffic.
Ensure the number of MAC addresses to be generated by the health check will be less than the size of the physical switch(es) MAC table. Otherwise, there is a risk that the switches will run out of memory, with subsequent network connectivity failures. After you disable vSphere Distributed Switch Health Check, the generated MAC addresses age out of your physical network environment according to your network policy.

More info:

vDS Health Check reports unsupported VLANs for MTU and VLAN (2140503) (vmware.com)

Enabling vSphere Distributed Switch health check in the vSphere Web Client (2032878) (vmware.com)

vSphere Distributed Switch health check

Enable copy and paste between Guest Operating System and Remote Console

Copy and paste operations between the guest operating system and remote console are deactivated by default. 

To enable it:

  • Browse to the virtual machine in the vSphere Client inventory
  • Right-click the virtual machine and click Edit Settings.
  • Select Advanced Parameters.
  • Add or edit the following parameters.

    isolation.tools.copy.disable False
    isolation.tools.paste.disable False
    isolation.tools.setGUIOptions.enable True
    These options override any settings made in the guest operating system’s VMware Tools control panel.
  • Click OK.
  • (Optional) If you made changes to the configuration parameters, restart the virtual machine.

Enable copy and paste between Guest Operating System and Remote Console

Remove an instant clone desktop pool in a deleting state

If we are removing a desktop pool from a Horizon infrastructure and we find ourselves in a situation that remains in a deleting state:

Graphical user interface, text, application

Description automatically generated

We can force the removal as follows:

  • Remove any VDI VMs still in your vSphere infrastructure.
  • Remove VM template, replication and parent.

In my example, we have the following situation

Graphical user interface, text, application

Description automatically generated

To remove them we use the tool iccleanup.cmd, we find te command on the connection servers by launching the following command to access:

iccleanup.cmd -vc <ome of vcenter> -uid < admin user of vcenter>

We enter the account password

Run with the list command the list of service VMs to be deleted

Text

Description automatically generated

In my case, they are the VMs indicated with ID 2 and 3

We start from 2 and first launch the unprotect indicating with -I the number 2 (unprotect -I 2) and confirm by writing unprotect

Text

Description automatically generated

Then we delete with the question delete -I 2 and confirm by writing  delete

Graphical user interface, text

Description automatically generated

Let’s go back by writing Back

Relaunch the List command and verify that Index has taken the other chain of system VMs to be deleted

Text

Description automatically generated with medium confidence

Ha was taken as index 2

We review the operations of unprotect and delete once again for index 2

At the end of the vCenter  (they are service VM from other pools that should not be deleted)

Text

Description automatically generated

Already in this case, we may have deleted the DesktopPool that was in a deleting state.

If the Pool in question is still present and in a deleting state, we proceed to access the ADSI Edit console and modify the ADAM DB by deleting the references left to the pool in deleting state (in my case ICTPM)

  • Remove Desktop Pool from ADAM Database

To connect follow this KB:

Connecting to the Horizon View Local ADAM Database (vmware.com)

remove the pool from Adam as follows.

Graphical user interface, text, application

Description automatically generated

Graphical user interface, text

Description automatically generated

Remove an instant clone desktop pool in a deleting state

vSphere DRS functionality was impacted due to an unhealthy state vSphere Cluster Service

If you see such an error on the Cluster object of a vSAN (in my case it appeared on two vSAN clusters managed by the same vCenter)

vSphere DRS functionality was impacted due to an unhealthy state vSphere Cluster Service …….

an unhealthy state of the Service cluster

Graphical user interface, text, application, email

Description automatically generated

Errors such as the following in the EAM log. vCenter LOG

EAM.log:

2023-01-26T13:16:39.996Z |  INFO | vim-monitor | VcListener.java | 131 | Retrying in 10 sec.
2023-01-26T13:16:41.432Z | ERROR | vlsi | DispatcherImpl.java | 468 | Internal server error during dispatch
com.vmware.vim.binding.eam.fault.EamServiceNotInitialized: EAM is still loading from database. Please try again later.
        at com.vmware.eam.vmomi.EAMInitRequestFilter.handleBody(EAMInitRequestFilter.java:57) ~[eam-server.jar:?]
        at com.vmware.vim.vmomi.server.impl.DispatcherImpl$SingleRequestDispatcher.handleBody(DispatcherImpl.java:373) [vlsi-server.jar:?]
        at com.vmware.vim.vmomi.server.impl.DispatcherImpl$SingleRequestDispatcher.dispatch(DispatcherImpl.java:290) [vlsi-server.jar:?]
        at com.vmware.vim.vmomi.server.impl.DispatcherImpl.dispatch(DispatcherImpl.java:246) [vlsi-server.jar:?]
        at com.vmware.vim.vmomi.server.http.impl.CorrelationDispatcherTask.run(CorrelationDispatcherTask.java:58) [vlsi-server.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_345]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_345]
        at java.lang.Thread.run(Thread.java:750) [?:1.8.0_345]
2023-01-26T13:16:50.007Z |  INFO | vim-monitor | ExtensionSessionRenewer.java | 190 | [Retry:Login:com.vmware.vim.eam:b55a7f93b59f0f7e] Re-login to vCenter because method: currentTime of managed object: null::ServiceInstance:ServiceInstance failed due to expired client session: null
2023-01-26T13:16:50.007Z |  INFO | vim-monitor | OpId.java | 37 | [vim:loginExtensionByCertificate:913aec585658e328] created from [Retry:Login:com.vmware.vim.eam:b55a7f93b59f0f7e]
2023-01-26T13:16:51.440Z | ERROR | vlsi | DispatcherImpl.java | 468 | Internal server error during dispatch
com.vmware.vim.binding.eam.fault.EamServiceNotInitialized: EAM is still loading from database. Please try again later.


And you see the lack of vCLS VMs in the two vSANs

To resolve the anomaly you must proceed as follows:

  • vCenter Snapshots and Backup
  • Log in to the vCenter Server Appliance using SSH.
  • Run this command to enable access the Bash shell:

shell.set --enabled true

  • Type shell and press Enter.
  • Run this command to retrieve the vpxd-extension solution user certificate and key:

mkdir /certificate

/usr/lib/vmware-vmafd/bin/vecs-cli entry getcert --store vpxd-extension --alias vpxd-extension --output /certificate/vpxd-extension.crt

/usr/lib/vmware-vmafd/bin/vecs-cli entry getkey --store vpxd-extension --alias vpxd-extension --output /certificate/vpxd-extension.key

  • Run this command to update the extension’s certificate with vCenter Server.

python /usr/lib/vmware-vpx/scripts/updateExtensionCertInVC.py -e com.vmware.vim.eam -c /certificate/vpxd-extension.crt -k /certificate/vpxd-extension.key -s localhost -u "Administrator@domain.local"

Note: If this produces the error “Hostname mismatch, certificate is not valid for ‘localhost'”, change ‘localhost’ to the FQDN or IP of the vCenter. The process is checking this value against the SAN entries of the certificate.

Note: The default user and domain is Administrator@vsphere.local. If this was changed during configuration, change the domain to match your environment. When prompted, type in the Administrator@domain.local password.

  • Restart EAM and start the rest of the services with these commands:

service-control --stop vmware-eam

service-control --start --all

vSphere DRS functionality was impacted due to an unhealthy state vSphere Cluster Service

Horizon Instant Clone -VM Replica and Template in inaccessible state

In the various maintenance activities of a Horizon infrastructure, it can happen to find VMs of the instant clone chain in an inaccessible state. (Caused by issues on hosts or vCenter such as sudden shutdowns without properly maintaining Horizon Desktop Pools.)

Image that contains text

Auto-generated description

In the VMware Horizon solution there is a tool, from the command line, that allows the cleaning of these objects.

The tool is present in the directory

C:\Program Files\VMware\VMware View\Server\tools\bin>

of one of the connection servers of the Horizon infrastructure

The command is iccleanup.cmd

The first step is to connect to the vCenter in question by launching the following command

iccleanup.cmd -vc <fqdn of vCenter> -uid <administrative user>

Once you have entered the password you will have the possibility to list the VMs of the instant clone infrastructures implemented on that vCenter with the LIST command

Image that contains text

Auto-generated description

Or delete objects in an inaccessible state, for example:

With the delete –index 2 command

Image that contains text

Auto-generated description

After the completion of the cleaning, the situation will be as follows:

Image that contains text

Auto-generated description

Horizon Instant Clone -VM Replica and Template in inaccessible state

Script to see Datastore Permission

Last day in the VMware Community I saw a request for:

“I have AD group like mydomain\mygroup.

This group have access for many datastores.

How i can use powercli to get full list of datastores which the group can manage?”

I made this PowerCLI script:

$cred = Get-Credential
Connect-ViServer <vcenter-FQDN>; -Credential $cred
$datastores = Get-Datastore | Select Name
$groupAD = "domain\group"
$report = @()
foreach ($datastore in $datastores) {
  $report +=  Get-VIPermission
| Where-Object {($_.Entity.Name -Like $datastore.Name) -and ($_.Principal -eq $groupAD)} |Select Principal,Role,@{n='Datastore';E={$datastore.Name}},@{n='Entity';E={$_.Entity.Name}},@{N='Entity Type';E={$_.EntityId.Split('-')[0]}},@{N='vCenter';E={$_.Uid.Split('@:')[1]}}
}
$report | Export-Csv <path\csvfile> -NoTypeInformation

Script to see Datastore Permission

Copy-VMGuestFile

I needed to copy a folder to virtual machine on the DMZ network segment.
The firewall rule blocks any access to VM, well I used the Copy-VMGuestFile Powercli Command.

Connect-VIServer <vcenter server FQDN or IP>

$vm = Get-VM -Name <VM target>

Get-Item “<Source Path>” | Copy-VMGuestFile -Destination “<Destination Path on VM>” -VM $vm -LocalToGuest -GuestUser <User VM Guest> -GuestPassword <Password User VM Guest>

After the first tentative I receive this error:

Copy-VMGuestFile : 09/12/2021 11:08:40 Copy-VMGuestFile The request was aborted: The request was cancelled.
At line:1 char:27

Probably it is a time out error and I try to change the WebOperation Timeout Seconds.

PS C:\Windows\system32> Set-PowerCLIConfiguration -WebOperationTimeoutSeconds -1

Scope ProxyPolicy DefaultVIServerMode InvalidCertificateAction DisplayDeprecationWarnings WebOperationTimeout
Seconds
—– ———– ——————- ———————— ————————– ——————-
Session UseSystemProxy Multiple Unset True -1
User Multiple
AllUsers -1

After the change the error it is resolved

Copy-VMGuestFile

LDAP Identity source and vCenter

Whenever we installed a new vCenter the activity always included integration with Active Directory and normally IWA (Integrated Windows Authentication) was used.
Since vSphere 7.0 version this possibility has been deprecated
so it is good to start with the integration of the vCenter with Active Directory via LDAP.
In our case, we will use LDAPS which uses a certificate

For first the step we need to create the certificate:

  • Use SSH to vCenter connection

On shell use this command

openssl s_client -connect <DC FQDN>:636 -showcerts

Copy the certificate output with  —–BEGIN CERTIFICATE—– and —–END CERTIFICATE—–

Past on Notepad and save with .crt extension

Now we will go to configure the Identity Sources on vCenter:

  • Login as Single Sign-On Administrator to vCenter
  • Navigate to Menu > Administration > Single Sign-On Configuration
  • In the Identity Provider tab, open Identity Sources
  • Click ADD
  • Select Active Directory over LDAP or OpenLDAP, depending on your directory type.

Fill out the remaining fields as follows:
Identity Source Name: Label
Base DN for users: The Distinguished Name (DN) of the starting point for directory server searches. Example: “DC=pollaio,DC=lan”.
Base DN for groups: The Distinguished Name (DN) of the starting point for directory server searches.
Domain name: Your domain name. Example: “pollaio.lan”
Domain alias: Your NetBIOS name. Example: “pollaio.lan”
Username: Domain user with at least browse privileges. Example: “pollaio\administrator”.
Connect to:  “ldaps://<DC FQDN>”.

  • Click Browse next to SSL Certificate
  • Select the .cer file created in before step
Now we are ready to login to the vCenter with domain user (remember to assign the correct permission to domain group or user group)

If you want check the correct use of SSL certificate on the authentication to Active Directory with LDAP connection check the websso.log:

LDAP Identity source and vCenter