Article 1MSGK CodeSOD: OhgodnoSQL

CodeSOD: OhgodnoSQL

by
Remy Porter
from The Daily WTF on (#1MSGK)

How about those NoSQL databases, huh? There's nothing more trendy than a NoSQL database, and while they lack many of the features that make a traditional RDBMS desirable (like, um" guaranteeing writes?) , they compensate by being more scalable and easier to integrate into an application.

Chuck D's company made a big deal out of migrating their data to a more "modern", "JSON-based" solution. Chuck wasn't involved in that project, but after the result went live, he got roped in to diagnose a problem: the migration of data from the old to the new database created duplicate records. Many duplicates. So he took a look at the migration script, and found piles of code that looked like this:

 UPDATE DataItemSet SET Content = '{"id":116,"type":"Plan for Today", "title":"Initech Retirement Fund", "learnItemLink":"","content":"Re-balance fund portfolio."}' WHERE Name = 'OPERATION' and Content like '{"id":116,%'

If reading that line doesn't cause you to break out into hives, take a closer look at the schema.

Id Content Unread Type Name
79 {"id":9,"title":"Initech Facilities Revision","type":"CR05","img":"images/initechfac",content:"{"id:"55, "title":"Billing Code"" "} 0 Global_Customer CUSTOMERRESOURCE
102 {"id":94,"title":"Initech Facilities Construction","type":"CR05","img":"images/initechfac",content:"{"id:"55, "title":"Billing Code"" "} 0 Global_Customer CUSTOMERRESOURCE

The Content column holds a string of text, and that string of text is expected to be JSON, and they often need to filter by the content of the column, which means they have lots of queries with WHERE clauses like: WHERE Content LIKE '%"title":"Initech"%', which rank as one of the slowest possible queries you can run in SQL. The ID field in the Content column has no relationship to the autonumbered ID field that actually is on the database table. The name column sometimes contains IDs, many of the fields in the JSON content (like the img and learnItemLink fields) are actually foreign key references to other tables in the database. There's a type column on the record, which seems to control scope, but shouldn't be confused with the type field on the JSON document, which may or may not mean anything.

After many weeks of combing through the migration scripts, it turned out not to be a problem with the scripts at all. One of their customers started a process of simplifying their billing codes, which meant billing codes were constantly changing. Instead of only changing the codes that were" well, actually being changed, the web app which provided that interface created new records for every billing code.

otter-icon.png [Advertisement] Otter allows you to easily create and configure 1,000's of servers, all while maintaining ease-of-use, and granular visibility down to a single server. Find out more and download today! TheDailyWtf?d=yIl2AUoC8zADlLU7tsa1fA
External Content
Source RSS or Atom Feed
Feed Location http://syndication.thedailywtf.com/TheDailyWtf
Feed Title The Daily WTF
Feed Link http://thedailywtf.com/
Reply 0 comments