Earlier this week we started getting reports of the following error on our production Power BI Report Server (PBIRS) instance for any reports that were using a live connection back to one of our Analysis Services models.
All it basically said was “an unexpected error occurred. Please try again later”. And if you looked at the details it showed the message above saying that “We could not connect to the Analysis Services server”.
Some other fun facts about this issue are:
- It was not affecting all users (at least in our environment)
- There had been no recent updates or configuration changes on PBIRS, SQL or SSAS
- The same issue appeared to be affecting our test environment but not our development environment.
- It was happening for both pbix and paginated reports
- It was affecting reports that sourced data from SQL Server as well as SSAS
- When running a profiler trace against SSAS listening for the session events I could see some sessions with usernames and others where the user was listed as “ANONYMOUS USER”
Putting all of the above evidence together point to some sort of Kerberos issue. In our environment our SQL and SSAS instances are on different servers to our PBIRS instance we had to configure Kerberos with constrained delegation to allow for the user credentials to flow from client > to PBIRS > to SSAS.
Kerberos can be a little fiddly to setup initially, but once it’s configured I’ve never had an issue with it until the one this week. When we initially configured Kerberos I had distilled the generic instructions from the Microsoft documentation down into a series of specific steps for our production environment listing the specific
setspn commands that needed to be run and how the constrained delegation settings needed to be configured and saved these in a document. This made it very easy to cross check that all these configurations were still correct.
At this point we knew that the Kerberos configuration appeared to be unchanged, but we still had an issue where it worked for some users and not for others.
So the next thing I started doing was to get some of our users to run the
klist command from a command prompt. This command lists all the Kerberos tickets that are cached on the current machine. This lead us to find a pattern where everyone that was getting the error were getting their kerberos tickets from a specific list of domain controllers while people that did not have the issue were getting theirs from a different list of domain controllers. Our IT admins then noticed that all the failing servers were in a group which had had security patches applied earlier that day.
The specific patch that had been applied was KB5007206 and after raising a support ticket with Microsoft they were able to point us to the following hotfix (KB5008602) for this update which fixed an issue when you were using Kerberos authentication with constrained delegation and protocol transition (which is exactly how Kerberos needs to be configured when using PBIRS or SSRS or SSRS in Sharepoint integrated mode for people with on-prem Sharepoint environments).
Hopefully this post will help anyone else that runs into this issue.