One of my customers had a problem logging in to Oracle Data Integrator (ODI) Studio. Their ODI implementation is configured to use external authentication (Microsoft Active Directory). The configuration was done years ago. No one has modified it since; in fact, most people do not even remember how it was configured. Everything was fine until they started to get the “ODI-10192: Unable to retrieve user GUID” error.
They said they got the first error about one month ago, 2 days before I started visiting them to help with ETL performance issues. They asked me if I had encountered a similar problem. In addition, they said the “Find ODI object” text box had also disappeared! So they couldn’t search for objects, and if you’re an ODI user, you know how useful that feature is!
Although I was not called there to solve this problem, there was a delay in creating my database users, so I spent my 2 days working on it. The interesting point is that some users could log in to ODI without any error (but Find ODI object was broken for all users).
External authentication relies on LDAP servers controlled by enterprise security admins, OS updates are controlled by Windows admins, the repository database is handled by DBAs, and you’re there to help the Business Intelligence team. So where would you look first? All of us suspected that something had changed at the LDAP or OS level. So I searched My Oracle Support, checked the LDAP settings, verified the connection between the ODI machine and the LDAP server, contacted the LDAP admins, re-created the wallet file (which is used for connecting to the LDAP server), wrote some PowerShell scripts to test retrieving the users’ GUIDs, checked the Java version, checked when the OS patches were applied, compared the users who could log in with those who could not, and even tried to trace the Java code. Unfortunately, I found nothing.
Meanwhile my customer raised a service request, and an endless exchange of messages started between Oracle Support and my customer. I had to work on other stuff, so I stopped working on the ODI problem, but I was still copied on the emails about it.
Yesterday, when I visited my customer, I had some free time while waiting for others to complete a task. So I decided to work on the ODI problem and compare the users again. This time I connected to the repository database, checked the users table (snp_user), and noticed an empty row in it. I really do not know how this empty row was created. Maybe the ODI system crashed while someone was creating a new user or cloning an existing one. Anyway, I deleted the empty row and “voilà”: the problem was solved. Not only the login problem but also the Find ODI object problem.
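A check for this kind of empty row can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses an in-memory SQLite table as a stand-in for the real repository database, the helper name `find_blank_user_rows` is hypothetical, and the column names (`i_wuser`, `wuser_name`) are assumptions, so verify them against your own snp_user table before adapting the query.

```python
import sqlite3


def find_blank_user_rows(conn, table="snp_user", name_col="wuser_name"):
    """Return rows whose user-name column is NULL or only whitespace."""
    # Identifiers cannot be bound as parameters, so pass only trusted names.
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {name_col} IS NULL OR TRIM({name_col}) = ''"
    )
    return conn.execute(query).fetchall()


# Stand-in for the repository: an in-memory table with made-up rows.
# Column names are assumptions, not taken from a real ODI schema dump.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snp_user (i_wuser INTEGER, wuser_name TEXT)")
conn.executemany(
    "INSERT INTO snp_user VALUES (?, ?)",
    [(1, "SUPERVISOR"), (2, "ALICE"), (None, None), (3, "BOB")],
)

blank_rows = find_blank_user_rows(conn)
print(blank_rows)
```

Against a real repository, it would be safer to run the equivalent SELECT first, look at what it returns, and take a backup before deleting anything.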
In the end, the problem took almost one month to solve. I do not blame Oracle Support, because they had no way to check the content of the snp_user table. When I think about how we could have found and solved the problem faster, only one solution comes to mind: if this product were open source, we could have easily traced the problem and solved it in hours.