Friday 23 April 2010

CONM6009E: The database is unable to get a connection to the database from DataSource

Another day, another test, another error!

We were running a stress test through our WAS systems, which connect to an Oracle database on AIX. Under a large number of concurrent requests, the following error appeared in our WAS logs:

CONM6009E: The database is unable to get a connection to the database from DataSource

We assumed at first that we had not sized our connection pools correctly. We turned on PMI, checked the size of the connection pools and found we weren't hitting the connection pool limits. We then checked the Oracle database, which looked healthy but had logged a message stating that the maximum user processes limit had been reached.

So on the DB server we ran the following:

lsattr -EH -l sys0 | grep -i maxuproc

which resulted in the following:

maxuproc 1024 Maximum number of PROCESSES allowed per user True

1024 was less than the total number of connection pool threads we had set in WAS. A quick chat with a friendly AIX administrator to increase this setting then resolved the issue.
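The check above can be scripted so that the limit is compared with the number of processes the database owner is actually running. This is only a rough sketch: the function names are made up for illustration, the 90% threshold is arbitrary, and "oracle" as the owning user is an assumption. The chdev line in the comments is the usual AIX way to raise the limit; it needs root, and the value shown is just an example.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from the original post.

# Read `lsattr -E -l sys0` output on stdin and print the maxuproc value.
parse_maxuproc() {
    awk '$1 == "maxuproc" { print $2 }'
}

# Return 0 (true) when the process count is at or above 90% of the limit.
near_limit() {
    count=$1
    limit=$2
    [ $((count * 100)) -ge $((limit * 90)) ]
}

# Example usage on the AIX database server (assumes the instance runs as "oracle"):
#   limit=$(lsattr -E -l sys0 | parse_maxuproc)
#   count=$(ps -u oracle | wc -l)    # includes one header line
#   near_limit "$count" "$limit" && echo "oracle: $count of $limit processes"
# Raising the limit (as root; 2048 is only an example value):
#   chdev -l sys0 -a maxuproc=2048
```

Running something like this during the stress test would show the process count climbing towards the limit before the CONM6009E errors start.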

Thursday 15 April 2010

Testing WAS app without creating a session

Since writing a post (here) on in-memory session count, I have been doing endless amounts of work on sessions, tracing them to see how the reaper script works as well as how frequently it runs.

One of the big issues we were facing was the number of in-memory sessions we were creating. Due to memory limitations and an app that was creating large sessions, we have a limited number of sessions available, so understanding the ins and outs of session management has been useful.

In front of our IHS and WAS servers we had a load balancer that was firing a request through to the front screen of the logon to see if the application was up and running. Getting the load balancer to test a static page on the web servers wasn't sufficient for our requirements. However, the LB requests were frequent and each one allocated a session when it accessed the front page, so we would often end up with overflowed sessions.

Instead of hitting the app front page we tried hitting a simple JSP within the app, but WAS would still create a session for that request even though nothing in the application explicitly asked for one. After a bit of digging I found a line of code I could add to a JSP to stop this:

<%@page session="false" %>

This also means the stats I was producing in my previous post were more accurate and did not include the LB requests in the session count!
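To confirm that a page with session="false" really is no longer allocating a session, one option is to look at the response headers and check that no session cookie is being set. The sketch below assumes the default session cookie name of JSESSIONID; the function name and the URL are placeholders, not anything from our environment.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the HTTP headers on stdin set a JSESSIONID cookie.
sets_session_cookie() {
    grep -i '^set-cookie:' | grep -q 'JSESSIONID='
}

# Example usage (the URL is a placeholder):
#   curl -s -o /dev/null -D - http://webserver.example.com/app/health.jsp \
#       | sets_session_cookie && echo "a session is still being created"
```

If the check still reports a cookie, something else on the request path (a filter or an include, for example) may be creating the session.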

Friday 9 April 2010

javax.net.ssl.SSLHandshakeException: bad certificate

We have been doing some testing on WAS recently where our app makes a call to a 3rd party which hosts some static images. In our test environments though we were getting a "bad certificate" error.

Our key stores and trust stores all appeared to contain the valid certs that we thought were required. Unfortunately, even when we turned on tracing in WAS we couldn't see which certificate was causing the issue.

Due to firewalls and proxies, we couldn't hit the URL directly from a PC, so we couldn't check it manually. To see which certificates were being served, we used the openssl command, which listed the certs served by the target site we were trying to hit:

/usr/linux/bin/openssl s_client -connect www.ourtargethost.com:443 -showcerts

This showed the certificate chain, and the verify errors highlighted what the problem with the certs was:

CONNECTED(00000003)
depth=0 /C=GB/ST=Somewhere/L=Warrington/O=My company Ltd/OU=HS4/CN=www.ourtargethost.com
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 /C=GB/ST=Somewhere/L=Warrington/O=My company Ltd/OU=HS4/CN=www.ourtargethost.com
verify error:num=27:certificate not trusted
verify return:1
depth=0 /C=GB/ST=Somewhere/L=Warrington/O=My company Ltd/OU=HS4/CN=www.ourtargethost.com
verify error:num=21:unable to verify the first certificate
verify return:1
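
When the chain is long, the verify errors can get lost in the output. A small filter like the one below pulls out each distinct error; this is only a sketch, the function name is made up, and it assumes openssl is on the PATH rather than at the /usr/linux/bin location used above. Error 20 generally means the issuer of the served certificate could not be found locally, so printing the issuer of the first certificate is a reasonable next step.

```shell
#!/bin/sh
# Hypothetical helper: list each distinct "verify error" line from s_client output on stdin.
verify_errors() {
    grep '^verify error:' | sort -u
}

# Example usage (s_client writes the verify messages to stderr, hence 2>&1):
#   openssl s_client -connect www.ourtargethost.com:443 -showcerts </dev/null 2>&1 \
#       | verify_errors
# Print the subject and issuer of the first certificate served:
#   openssl s_client -connect www.ourtargethost.com:443 </dev/null 2>/dev/null \
#       | openssl x509 -noout -subject -issuer
```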