Monday 8 March 2010

wsadmin and WAS commands hanging

Over the last week we had all sorts of problems getting any commands to work, even though we were running them as root. I first noticed it whenever I tried to get into a wsadmin session: it would just hang.

There were no error messages and nothing obvious to go on. We then discovered that every command that ends up running Java under the covers had the same issue: startServer.sh, stopManager.sh, serverStatus.sh and pretty much all the supplied WAS scripts.

We spent an age looking around setupCmdLine and checking whether the OSGi bundles were causing an issue. We also found a fix in fp29 that seemed to be in a similar area, but that didn't resolve it. Just as we were about to log a call with IBM, we decided to take a javacore of the processes while they were hung (why we didn't do this sooner I have no idea!)

We took several javacores, 30 seconds apart. Although there were no blocked threads, the main thread in each javacore appeared to be looking up the local host name:

at java/net/Inet6AddressImpl.getLocalHostName(Native Method)
at java/net/InetAddress.getLocalHost(InetAddress.java:1463)

These same entries were in each of the javacores, so it appeared there was a problem getting the local host name. Once we spotted this, it didn't take long to find out there was an issue contacting our DNS servers. We removed /etc/resolv.conf while we looked into this, so that WAS would go back to using the hosts file on the server, and everything then jumped back into life.
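
If you want to check for the same symptom outside of WAS, a small standalone test along the following lines might help. This is only a rough sketch, not anything supplied by IBM: the class name LocalHostLookupCheck and the checkLookup method are made up for illustration, and the 5 second timeout is an arbitrary choice. It makes the same InetAddress.getLocalHost() call that showed up in our javacores, but on a background thread, so the test itself reports a stall rather than hanging.

```java
import java.net.InetAddress;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class, named for illustration only.
public class LocalHostLookupCheck {

    // Runs InetAddress.getLocalHost() on a daemon thread and waits up to
    // timeoutSeconds for it, returning a short description of what happened.
    static String checkLookup(long timeoutSeconds) throws InterruptedException {
        // Daemon thread so a stuck lookup cannot keep the JVM alive.
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "localhost-lookup");
            t.setDaemon(true);
            return t;
        });
        Callable<InetAddress> task = InetAddress::getLocalHost;
        long start = System.nanoTime();
        Future<InetAddress> lookup = pool.submit(task);
        try {
            InetAddress addr = lookup.get(timeoutSeconds, TimeUnit.SECONDS);
            long ms = (System.nanoTime() - start) / 1000000L;
            return "resolved " + addr + " in " + ms + " ms";
        } catch (TimeoutException e) {
            return "lookup still blocked after " + timeoutSeconds + " s";
        } catch (ExecutionException e) {
            // Typically an UnknownHostException from the resolver.
            return "lookup failed: " + e.getCause();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // 5 seconds is an assumed threshold; adjust to taste.
        System.out.println(checkLookup(5));
    }
}
```

On a healthy server I would expect the lookup to come back almost immediately. If it reports that the lookup is still blocked, name resolution is worth a look before you start digging through the WAS scripts.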

*Added 19th Mar 2010

I have just been directed to this page from IBM, which might well have resolved my problem. If you can't simply turn off DNS, then this might be the preferred option:

IBM link swg21170467
