CodeSOD: No Lifeguard in this Connection Pool
Michael was debugging a system where the J2EE service would just... crash. It was running on a Windows host set to restart the service when this happened, so the outages were brief, but it was a serious problem.
He eventually tracked down the problem to a class called JdbcSchema, which is about 1,100 lines of homebrew database management code. It does "fun" things, like create extra tables in the database to track "foreign key metadata".
Or, for real fun, it has a getTransIsolation function, which returns the transaction isolation your query should use. If you're thinking to yourself, "doesn't the transaction isolation depend on the specific query and workload?" you'd be right- this function changes what transaction isolation all queries are supposed to use based on the time of day. Presumably, this was to support batch processing at night, but it's a bonkers way of going about it.
None of that, however, is why the application crashed.
protected Connection createConnection( final boolean canKickOut, final boolean forceConnection, final boolean forCounters ) { final boolean canGetSpId = ( !url.startsWith( "jdbc:odbc:" ) ); String spid = null; Connection result = null; try { boolean isClosed = true; final int connsSize; synchronized( this ) { if( connections == null ) { connections = new LinkedList(); } while( retrieveConnectionFromPool( result, isClosed ) ) { result = (Connection)connections.get( 0 ); if( canGetSpId ) { spid = getConnectionId( result ); } isClosed = ( spid == null ); connections.remove( 0 ); } connsSize = connections.size(); } if( ( result == null || isClosed ) && connsSize == 0 ) { final int maxConnects = getMaxConnects(); // check to see if we can add another connection if( totalConnections < maxConnects ) { LOG.info( "CREATE CONNECTION: " + url ); result = DriverManager.getDriver( url ).connect( url, props ); result.setAutoCommit( true ); if( canGetSpId ) { spid = getConnectionId( result ); } } // if we cant add another connection, // then send an email and restart the system else { result = handleMaxConnectionsReached( maxConnects ); } } final int isolation = getTransIsolation( result ); if( isolation > Connection.TRANSACTION_NONE ) { result.setTransactionIsolation( isolation ); } } catch( final SQLException exc ) { throw new LibException( exc ); } incrementConnections( result, spid, forCounters ); return result; }
Now, this function isn't what was causing the crash, but there are a few highlights here that are worth noting.
First, note that the block which gets a connection is synchronized- they at least put a little thought into multithreading.
Then note that retrieveConnectionFromPool function- you'll note that the function doesn't retrieve a connection from a pool. It returns true or false- whether this specific connection can be pulled from the pool.
And yes, they implemented their own connection pooling, despite Java already having a perfectly good solution for it.
Regardless, once they've gotten a connection, they're going to compare totalConnections against maxConnects. At this point, it's worth noting that we've left the synchronized block. incrementConnections, down at the bottom of this function, is a synchronized method, so changing the count is threadsafe, but checking the count is not. Yes, there are loads of concurrency problems when the system is under load.
But if you've been reading ahead, you know where the suspicious part is:
// if we cant add another connection, // then send an email and restart the system else { result = handleMaxConnectionsReached( maxConnects ); }
"send an email and restart the system"? Uh oh.
private Connection handleMaxConnectionsReached( final int maxConnects ) { LOG.fatal( "Max connections reached", new RuntimeException() ); // restart system System.exit( 0 ); return null; }
So when there are too many database connections, we just... quit. We rely on the host OS to restart our service, because a System.exit( 0 ) just terminates us. Even better, we exit with status code 0- a successful exit. It's not an error, just a normal part of our operation.
Extra bonus point for the function signature: it returns a Connection, which will never happen, but clearly the original developer dreamed of someday coming up with a solution to expand the connection pool.
It's also worth noting that the maxConnects value is arbitrary, varies between servers, and has no basis in reality. Someone just picked a config number, and didn't think about anything else.
[Advertisement] Utilize BuildMaster to release your software with confidence, at the pace your business demands. Download today!