The other day I was scaling out a VIDM cluster for a customer. After the scale out I re-registered it back with the vRA 8.1 instance that the customer was running. At the time the latest patch (Patch 3) was installed and vRA 8 was running in clustered HA mode.
The step ‘Initialize vRA’ in vRSLCM took a long time and eventually failed with an error:
Issue – Error Code: LCMVRAVACONFIG590003
Cluster initialization failed on vRA.
vRA va initialize failed on : xxxx-xxxx-xxxx,
Please login to the vRA and check /var/log/deploy.log file for more information on failure. If its a vRA HA and to reset the load balancer, set the ‘isLBRetry’ to ‘true’ and enter a valid load balancer hostname.
So I checked ‘/var/log/deploy.log’ on the vRA node that was failing and noticed the following message:
Error checking helm version -s --tiller-connection-timeout 5: cannot connect to Tiller
This indicated to me that Tiller was probably not running on this node. Digging deeper, it turns out Tiller actually only runs on one vRA node. If you then run the ‘deploy.sh’ command on a vRA node that does not have Tiller running, you end up with this error.
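As a quick sanity check, you can probe Tiller connectivity from each vRA node with the same Helm 2 call that appears in the log message above. This is just a sketch (the echo messages are mine for illustration); it assumes the Helm 2 client is on the PATH, as it is on a vRA 8 node:

```shell
#!/bin/sh
# Probe Tiller the same way deploy.sh does (flag and timeout taken from deploy.log).
# Run this on each vRA node to find the one where Tiller is actually reachable.
if helm version -s --tiller-connection-timeout 5 >/dev/null 2>&1; then
    echo "Tiller reachable from this node"
else
    echo "Tiller NOT reachable - run deploy.sh from the node where Tiller runs"
fi
```

The node that reports Tiller as reachable is the one where ‘deploy.sh’ will succeed without the KB fix below.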
The solution is straightforward and described in this VMware KB: https://kb.vmware.com/s/article/81815
The package essentially installs a proxy for Tiller in the vRA 8 environment and also ensures it is configured in the YAML and boot scripts/files.
After applying the steps in the KB I retried the failed re-register VIDM request in vRSLCM with the default inputs and it succeeded.
Hope this helped! Have a nice day!