CodeSOD: Enumerated Science

by Remy Porter
from The Daily WTF on (#6QN18)

As frequently discussed here, there are scientists who end up writing a fair bit of code, but they're not software engineers. This means that while the code frequently solves the problem in front of them, it often has issues.

Nancy works in a lab with a slew of data scientists, and the code she has to handle gets... problematic.

    index = 0
    for index, fname in enumerate(img_list):
        data = np.load(img_list[index])
        img = data[0][:,:]
        img_title = 'img'+str(index).zfill(4)+'.jpg'
        cv2.imwrite(img_title, img)
        index = index + 1

This code takes a bunch of image data which had been serialized out as NumPy arrays (i.e., just raw data). It reads each file with np.load, then writes it back out using cv2.imwrite, converting it to an image file.

In this tight little snippet, nearly everything is wrong.

We start with an enumerate(img_list). In Python, this returns an iterator of tuples: an index, and the actual value. Our for loop splits each tuple back out into an index and an fname variable.
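As a minimal sketch of what enumerate hands back (the file names below are invented for illustration, not taken from Nancy's lab):

```python
# Hypothetical file names, standing in for the real img_list.
img_list = ["scan_a.npy", "scan_b.npy", "scan_c.npy"]

# enumerate() is lazy: it yields (index, value) tuples one at a time.
# Wrapping it in list() just makes the pairs visible.
pairs = list(enumerate(img_list))

# The for loop unpacks each tuple into two names.
seen = []
for index, fname in enumerate(img_list):
    # fname is already the same object as img_list[index],
    # so a second lookup by index adds nothing.
    seen.append(fname == img_list[index])
```

Here pairs holds one (index, name) tuple per file, and every entry in seen is True.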

We forget the fname variable exists, and get the filename again by doing an array indexing operation (img_list[index]).

We then slice the loaded array to extract just the dimensions containing image data, and that's fine. Then we generate a filename based on the index, even though we already HAVE a good filename that we've opted not to use (in fname). That's not a WTF, it just annoys me: my usual path for these kinds of things is to read source data from one directory and dump converted data into a different directory with the same names. That's just me.
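One way the "same names, different directory" approach might look, sketched with pathlib. The helper name output_path_for and the directory and file names are placeholders made up for this example; nothing here comes from Nancy's codebase.

```python
from pathlib import Path

def output_path_for(fname, out_dir, suffix=".jpg"):
    """Keep the source file's base name, but put it in out_dir with a new suffix.

    Hypothetical helper for illustration only.
    """
    return Path(out_dir) / Path(fname).with_suffix(suffix).name

# Made-up input path and output directory.
target = output_path_for("raw/sample_0042.npy", "converted")
```

With those inputs, target points at sample_0042.jpg inside the converted directory. Passing str(target) to cv2.imwrite is the safe choice, since that function is documented as taking a filename string.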

Then, for fun, we increment index. This, fortunately, doesn't break the loop, because we're setting index at the top of the loop, and thus the increment is overwritten.
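A quick way to check that claim, using a stand-in list rather than the real file names:

```python
items = ["a", "b", "c"]  # stand-in data, not the real img_list

observed = []
for index, item in enumerate(items):
    observed.append(index)
    # This rebinds the local name only; enumerate keeps its own counter,
    # so the next iteration overwrites index with the correct value.
    index = index + 1

# After the loop, index still holds the value from the final manual increment.
final_index = index
```

The loop sees 0, 1, 2 as expected, and the only lasting effect of the manual increment is that index ends up as 3 once the loop finishes.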

This also tells us some history of the code. It originally didn't use enumerate, and just incremented the index manually. Someone mentioned, "You should use enumerate, it's more Pythonic," and so they added the enumerate call, without understanding what about it was more Pythonic.

As scientist-written code goes, I wouldn't call this the worst I've seen. It annoys me, but at least I understand it, and it wouldn't be a huge effort to clean up.
