CodeSOD: The Pythonic Wheel Reinvention

by
Remy Porter
from The Daily WTF on (#3EQJW)

Starting with Java, a robust built-in class library is practically a default feature of modern programming languages. Why struggle with OS-specific behaviors, write your own code, or manage a third-party library to handle problems like accessing files or network resources?

One common class of WTF is the developer who steadfastly refuses to use it. They inevitably reinvent the wheel as a triangle with no axle. Another is the developer who is simply ignorant of what the language offers, and is too lazy to Google it. They don't know what a wheel is, so they invent a coffee-table instead.

My personal favorite, though, is the rare person who knows about the class library and uses the class library to reinvent methods which exist in the class library. They've seen a wheel, they know what a wheel is for, and they still insist on inventing a coffee-table.

Anneke sends us one such method.

The method in question is called thus:

    if output_exists("/some/path.dat"):
        do_something()

I want to stress, this is the only use of this method. The purpose is to check if a file containing output from a different process exists. If you're familiar with Python, you might be thinking, "Wait, isn't that just os.path.exists?"

Of course not.

    def output_exists(full_path):
        path = os.path.dirname(full_path) + "/*"
        filename2=full_path.split('/')[-1]
        filename = '%s' % filename2
        files = glob.glob(path)
        back = []
        for f in re.findall(filename, " ".join(files)):
            back.append(os.path.join(os.path.dirname(full_path), f))
        return back

Now, in general, most of your directory-tree manipulating functions live in the os.path package, and you can see os.path.dirname used. That splits off the directory-only part. Then they throw a glob on it. I could, at this point, bring up the importance of os.path.join for that sort of operation, but why bother?
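One concrete reason os.path.join matters for that sort of operation can be sketched as follows. The bare filename is a hypothetical input, not something from Anneke's codebase, and the variable names are made up for illustration.

```python
import os.path

# Hypothetical call with a bare filename and no directory part.
full_path = "path.dat"

directory = os.path.dirname(full_path)  # "" for a bare filename

concat_pattern = directory + "/*"              # what the method builds
joined_pattern = os.path.join(directory, "*")  # what os.path.join builds

print(repr(concat_pattern))  # '/*', which globs the filesystem root
print(repr(joined_pattern))  # '*', which globs the current directory
```

With string concatenation, a path that has no directory component quietly turns into a search of the root directory.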

They knew enough to use os.path.dirname to get the directory portion of the path, but not os.path.split which can pick off the file portion of the path. The "Pythonic" way of writing that line would be (path, filename) = os.path.split(full_path). Wait, I misspoke: the "Pythonic" way would be to not write any part of this method.
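For reference, a minimal sketch of what os.path.split returns, using the example path from the call shown earlier:

```python
import os.path

full_path = "/some/path.dat"

# os.path.split returns a (head, tail) pair:
# the directory part and the final path component.
path, filename = os.path.split(full_path)

print(path)      # /some
print(filename)  # path.dat
```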

'%s' % filename2 is Python's version of printf, and I cannot for the life of me guess why it's being done here. A misguided attempt at doing an strcpy-type operation?
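A quick sketch of why that line accomplishes nothing when the input is already a string (the sample value is an assumption for illustration):

```python
filename2 = "path.dat"

# %-formatting a str with '%s' yields an equal string. Python strings are
# immutable, so there is nothing a defensive "copy" would protect against.
filename = '%s' % filename2

print(filename == filename2)  # True
```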

glob.glob isn't just the best method name in anything, it also does a filesystem search using globs, so files contains a list of all files in that directory.

" ".join(files) is the Python idiom for joining an array, so we turn the list of files into a single string and search it using re.findall, which uses a regex for searching. Note that they're using the filename for the regex, and they haven't placed any guards around it, so if the input file is "foo.c", and the directory contains "foo.cpp", this will think that's fine.

And then last but not least, it returns the array of matches, relying on the fact that an empty array in Python is false.
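The false positive and the truthiness trick can both be seen in a small sketch. The directory listing here is hypothetical and stands in for whatever glob.glob would return.

```python
import os.path
import re

# Hypothetical directory listing: foo.cpp exists, foo.c does not.
files = ["/some/foo.cpp", "/some/bar.dat"]
filename = "foo.c"

# What the method does: an unanchored regex search over the joined names.
loose = re.findall(filename, " ".join(files))
print(loose)        # ['foo.c'], matched inside "foo.cpp"
print(bool(loose))  # True, so the caller believes foo.c exists

# An exact comparison on the basename finds nothing,
# and an empty list is falsy.
exact = [f for f in files if os.path.basename(f) == filename]
print(exact)        # []
print(bool(exact))  # False
```

Because the "." in the filename is also a regex wildcard, a name like "fooXc" would match as well.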

To write this code required at least some familiarity with three different major packages in the class library: os.path, glob, and re. Just one more ounce of familiarity with os.path would have replaced the entire thing with a simple call to os.path.exists. Which is what Anneke did.
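A sketch of how os.path.exists behaves, not Anneke's actual change. The temporary directory and file names are assumptions so the example can run anywhere.

```python
import os
import tempfile

# Throwaway directory with one file standing in for the other process's output.
with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "path.dat")
    missing = os.path.join(tmp, "other.dat")
    open(present, "w").close()

    found_present = os.path.exists(present)
    found_missing = os.path.exists(missing)

print(found_present)  # True
print(found_missing)  # False
```

At the call site, that presumably reduces to testing os.path.exists("/some/path.dat") directly in the if statement.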
