CodeSOD: Curious Queries about Performance
Andreas's employer recently upgraded their SQL Server databases to the latest version. During that upgrade, something went wrong, and performance dropped off a cliff. Applications which used to be merely slow to taking hours to query the data they required. Clearly, something was wrong in the upgrade process, but Andreas wasn't a DBA, so that wasn't specifically Andreas's problem. Instead, there were plenty of badly performing blocks of code- maybe now was the time to try and optimize them.
That's where this block of Visual Basic comes from.
Function Is_BookingCorr(CustomerID, Section, DocumentType, Spezification)Dim rst As ADODB.recordsetrst.Open "Protocol", CurrentProject.Connection, adOpenStatic, adLockOptimisticWith rst .Filter = "[CustomerID]='" & CustomerID & "'" .Filter = .Filter & " and [Section]='" & Section & "'" .Filter = .Filter & " and [DocumentType]='" & DocumentType & "'" .Filter = .Filter & " and [Spezification]='" & Spezification & "'" Is_BookingCorr = .RecordCount > 0End WithEnd Function
This is an interesting WTF, in part because it doesn't look like much of a WTF. The ADODB.recordset is one of our tools for querying a database in VB. So here, they open a database connection for a table called Protocol, set some filter properties, and then count how many records come back, which is the return value of the function (because VB lets you set the function name equal to the return value). Which, the obvious problem is: why not just use SELECT COUNT(*) as the query, instead of doing it on the client side? Instead of passing the name of the table in, you can just as easily pass a query in. Or use a ADODB.command object instead.
But that's the obvious problem. If you look at that code, you'll notice that there's nothing in the code that actually says "run this query". That's because the Filter property automatically updates the dataset when you modify the value. Even worse, the ADODB.recordset type is from a different era- an era where your resultset objects were backed by open cursors on the database side. This means that, every time they modify the filter field, a command is sent back to the database to update the cursor to hold the results of the new query.
And it gets even worse. ADODB.recordset requires you to specify how that cursor is managed. In this case, they chose adOpenStatic, a cursor mode that copies the entire result set into memory- the purpose is to allow you to write programs that may take a long time to process data without fear that other transactions will alter the data that you're seeing. But this isn't a long-running process, it's just a count, and the correct cursor for something like this would be the default: adOpenForwardOnly, especially when combined with an actual COUNT(*) query, which would be an atomic operation in the database.
They also chose an incorrect record locking option, but that at least is mostly harmless in this case- the adLockOptimistic only locks records when you try and update them, which doesn't happen here. But the correct value would be adLockReadOnly in this case.
In short, the developer writing this code made a series of choices about how to execute these queries. And each and every one of those choices was wrong. It's the worst possible way to run this query, and was not just performing badly on its own, but consuming resources in the database and making everything worse for every other operation too.
Converting this to a `SELECT COUNT(*) FROM Protocol WHERE..." didn't solve the performance problems created by the upgrade, but it certainly helped.
The one good thing we can say about the original is that the Filter property won't allow a SQL injection attack to happen.
[Advertisement] BuildMaster allows you to create a self-service release management platform that allows different teams to manage their applications. Explore how!