Article 54HJ4 CodeSOD: Sort Yourself Out

CodeSOD: Sort Yourself Out

by
Remy Porter
from The Daily WTF on (#54HJ4)

Object-Relational-Mappers (ORMs) are a subject of pain, pleasure, and flamewars. On one hand, they make it trivially easy to write basic persistence logic, as long as it stays basic. But they do this by concealing the broader powers of relational databases, which means that an ORM is a leaky abstraction. Used incautiously or inappropriately, and they stop making your life easy, and make it much, much harder.

That's bad, unless you're Tenesha's co-worker, because you apparently want to suffer.

In addition to new products, Tenesha's team works on a legacy Ruby-on-Rails application. It's an ecommerce tool, and thus it handles everything from inventory to order management to billing to taxes.

Taxes can get tricky. Each country may have national tax rates. Individual cities may have different ones. In their database, they have a TaxesRate table which tracks the country name, the city name, the region code name, and the tax rate.

You'll note that the actual database is storing names, which is problematic when you need to handle localizations. Is it Spain or Espana? The ugly hack to fix this is to have a lookup file in YAML, like so:

i18n_keys: es-ca: "Canary Islands" es: "Spain" eu: "Inside European Union" us: "United States" ar: "Argentina" la: "Latin America" as-oc: "Asia & Oceania" row: "Rest of the world"

Those are just the countries, but there were similar structures for regions, cities, and so on.

The YAML file took on more importance when management decided that the sort order of the tax codes within a region needed to be a specific order. They didn't want it to be sorted alphabetically, or by date added, or by number of orders, or anything: they had a specific order they wanted.

So Tenesha's co-worker had a bright idea: they could store the lookup keys in the YAML file in the order specified. It meant they didn't have to add or manage a sort_order field in the database, which sounded easier to them, and would be easier to implement, right?

Well, no. There's no easy way to tell an SQL order-by clause to sort in an arbitrary order. But our intrepid programmer was using an ORM, so they didn't need to think about little details like connecting to a database" or worrying about round trips" or is this at all efficient".

So they implemented it this way:

 # Order locations by their I18n registers to make it easier to reorder def self.order_by_location(regions) codes = I18n.t("quotation.selectable_taxes_rate_locations").keys.map{ |k| k.to_s } regions_ordered = [] codes.each do |code| regions_ordered.push(regions.where(region_code: code)) end # Insert the codes that are not listed at the end regions_ordered.push(regions.where("region_code NOT IN (?)", codes)).flatten end

This is called like so:

# NOTE: TaxesRate.all_rates returns all records with unique region codes,# ignoring cities; something like `TaxesRate.distinct(:region_code)`.regions = order_by_location(TaxesRate.all_rates)

We should be thankful that they didn't find a way to make this execute N2 queries, but as it is, it needs to execute N+1 queries.

First, we pull the rate locations from our internationalization YAML file. Then, for each region code, we run a query to fetch the tax rate for that one region code. This is one query for each code. Based on the internationalization file, it's just the codes for one country, but that can still be a large number. Finally, we run one final query to fetch all the other regions that aren't in our list.

This fetches the tax code for all regions, sorted based on the sort order in the localization file (which does mean each locale could have a different sort order, a feature no one requested).

Tenesha summarizes it:

So many things done wrong; in summary:
* Country names stored with different localization on the same database, instead of storing country codes.
* Using redundant data for storing region codes for different cities.
* Hard-coding a new front-end feature using localization keys order.
* Performing N+1 queries to retrieve well known data.

Now, this was a legacy application, so when Tenesha and her team went to management suggesting that they fix this terrible approach, the answer was Nope!" It was a legacy product, and was only going to get new features and critical bug fixes.

Tenesha scored a minor victory: she did convince them to let her rewrite the method so that it fetched the data from the database and then sorted using the Array#index method, which still wasn't great, but was far better than hundreds of database round trips.

proget-icon.png [Advertisement] Ensure your software is built only once and then deployed consistently across environments, by packaging your applications and components. Learn how today! TheDailyWtf?d=yIl2AUoC8zA72O9udO_xeU
External Content
Source RSS or Atom Feed
Feed Location http://syndication.thedailywtf.com/TheDailyWtf
Feed Title The Daily WTF
Feed Link http://thedailywtf.com/
Reply 0 comments