Data Platform intermittent error: 'The system failed to connect to the User Console service.' and 'The remote server returned an error: (401) Unauthorized.'
I have seen this on multiple Data Platform deployments across different customers.
Whether running a Normalization or a Catalog update, sometimes the process will fail, with the Activity Monitor and the BDNA.log showing 'The system failed to connect to the User Console service. Please check the environment and retry.'
In the User Console UX.log, the corresponding error is 'The remote server returned an error: (401) Unauthorized.'
This issue appears to be linked to Java caching DNS lookups. On service startup the User Console does a lookup for the Active Directory server, and the IP address it gets back is the one it uses for all authentication queries until the service is restarted.
The obvious problem here is that Active Directory domain controllers go up and down. AD is designed to be fault tolerant and to allow for this, and applications need to cater for this inevitable scenario. If the User Console service cannot reach the cached IP address, the process fails. Additionally, any user access to the User Console will also fail.
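A solution here is to shorten the time to live (TTL) of the DNS entries the User Console caches. On the User Console server, edit the file <install drive>\Program Files\BDNA\User Console\Tools\Java\lib\security\java.security
and set the DNS cache TTL property (in a standard Java install this is networkaddress.cache.ttl) to something low. I've set it to 15 seconds. The file is read when the JVM starts, so restart the User Console service for the change to take effect.
With that setting, a successful DNS lookup is cached for at most 15 seconds, after which the User Console resolves the Active Directory server again instead of reusing a stale IP address.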
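If you want to check what TTL a JVM actually sees, something like the following minimal sketch may help. It assumes the setting in question is the standard Java security property networkaddress.cache.ttl. The class and method names (DnsCacheTtl, describeTtl, setTtl) are made up for illustration and are not part of Data Platform. Setting the property in code is shown only for comparison; for the User Console the change belongs in the java.security file described above.

```java
import java.security.Security;

// Hypothetical helper, not part of Data Platform: reports and sets the JVM's
// DNS cache TTL via the standard "networkaddress.cache.ttl" security property.
class DnsCacheTtl {
    // Assumption: this is the property that controls caching of successful lookups.
    static final String TTL_PROPERTY = "networkaddress.cache.ttl";

    // Returns the configured TTL as text, or a note that the property is unset.
    static String describeTtl() {
        String value = Security.getProperty(TTL_PROPERTY);
        return value == null ? "(not set; JVM default applies)" : value + " seconds";
    }

    // Sets the TTL in seconds for this JVM only. This generally needs to happen
    // before the first DNS lookup, because the JVM reads the cache policy once.
    static void setTtl(int seconds) {
        Security.setProperty(TTL_PROPERTY, Integer.toString(seconds));
    }

    public static void main(String[] args) {
        System.out.println("Configured TTL: " + describeTtl());
        setTtl(15);
        System.out.println("After override: " + describeTtl());
    }
}
```

Run with the same Java installation the User Console uses (the Tools\Java folder above), the first line of output should reflect whatever is in that installation's java.security file.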