Coded Smorgasbord: Finding a Path

by
Remy Porter
from The Daily WTF on (#1D8DD)

Readers of TDWTF know all too well that dates are hard. Strings are also hard. You know what else is hard? File paths.

As with dates and strings, most languages these days have libraries to simplify parsing file paths. For example, in Python, you can use the os.path module to pull out the directory, the file name, and the extension without too much effort.
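For reference, here is a minimal sketch of what that looks like. The path is a made-up example, not one from Chris's codebase.

```python
import os.path

# Hypothetical path, used only for illustration.
path = "/data/intFiles/sampleA/reads.fastq"

# split() separates the directory from the final component.
directory, filename = os.path.split(path)
# splitext() separates the extension (dot included) from the rest.
stem, extension = os.path.splitext(filename)

print(directory)  # /data/intFiles/sampleA
print(filename)   # reads.fastq
print(stem)       # reads
print(extension)  # .fastq
```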

As Chris discovered, though, some people like that effort. Why use things like os.path when you've got Python's super-powered slice operator for splitting the string apart:

samp = file[file.find('intFiles')+9:].split("/")[0]
fName = file.split("/")[-1]

The second line is the one that pulls off the file name (split into a list, grab the last element), and it'll work fine, if not efficiently.

By the same token, the preceding line isn't wrong, it's just ugly. It finds the starting point of the string "intFiles" in the file path, jumps just past it (the eight characters of "intFiles" plus the slash that follows, hence the +9), and then splits on slashes, grabbing the first item. And frankly, what would we rather the original developer had used? A built-in function?
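One possible built-in route is sketched below using pathlib. It assumes that "intFiles" is a whole directory name in the path, which the original substring search does not require. The function name and sample path are hypothetical. A side benefit is that a missing "intFiles" raises an error, whereas str.find returns -1 and the original would then quietly slice from index 8.

```python
from pathlib import PurePosixPath


def sample_and_filename(file, marker="intFiles"):
    """Return (the path component right after `marker`, the final component)."""
    parts = PurePosixPath(file).parts
    # tuple.index raises ValueError if the marker directory is absent.
    idx = parts.index(marker)
    # parts[idx + 1] raises IndexError if nothing follows the marker.
    return parts[idx + 1], parts[-1]


# Hypothetical path, used only for illustration.
samp, fName = sample_and_filename("/data/intFiles/sampleA/reads.fastq")
print(samp)   # sampleA
print(fName)  # reads.fastq
```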

It's ugly, but not the worst sin. What if we started with a language that was already ugly? What about Objective-C? Objective-C, especially when used on MacOS or iOS, is an ugly beast that mixes high-level abstractions with low-level APIs, and stubbornly insists on using Smalltalk-style message passing for calling methods of objects.

Paul's teammate needed to find the file name from the path. There's a lovely built-in method called lastPathComponent that makes this easy, but again, Paul's teammate wasn't interested in easy.

+ (__strong NSString*) getFilename: (NSString*) file
{
    NSString* Result = @"";
    @try {
        if ( [mbUtilis hasContent: file ] ) {
            Result = file;
            for ( NSInteger i = [file length] -1; i >= 0; i--) {
                UniChar c = [file characterAtIndex: i];
                if ( c == '/' ) {
                    if ( i != [file length] -1 ) {
                        Result = [file substringFromIndex: i +1];
                    }
                    break;
                }
            }
        }
    }
    @catch (NSException *e) {
        [mbApi logException: @"mbUtilis.getFilename" exception: e];
    }
    @finally {
        // TODO
    }
    return Result;
}

This, at least, does a reverse search, walking the string backwards until it finds a slash. It then takes everything after that slash and stuffs it into Result. The part that gets me isn't the path manipulation. Why not just return the result from inside the for loop? Why a break? Why save return Result for the end of the function if you're not going to do anything with it in between?
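To make the early-return point concrete, here is a rough Python transliteration of the same reverse scan, with the break replaced by returns. It assumes hasContent simply means "not nil and not empty", which the snippet doesn't show, and it keeps the original's quirk of handing back the whole string when the path ends in a slash.

```python
def get_filename(path):
    """Reverse-scan for the last '/' and return whatever follows it."""
    # Assumption: hasContent means "not nil and not empty".
    if not path:
        return ""
    last = len(path) - 1
    for i in range(last, -1, -1):
        if path[i] == "/":
            # A trailing slash leaves the input untouched, as in the original.
            return path[i + 1:] if i != last else path
    # No slash at all: the whole string is the file name.
    return path


print(get_filename("/var/log/app.log"))  # app.log
print(get_filename("app.log"))           # app.log
print(get_filename("/var/log/"))         # /var/log/
```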

Those two examples are both clear cases of ignorance-of-built-ins, and they're a little sloppy and perhaps a bit more cryptic than necessary, but hey, at least they're not using regular expressions.

Speaking of regular expressions, normally, we think of a person's name as an arbitrary string. There are too many edge cases, too many unexpected characters, too many cultural differences to have a simple rule that says, "This is what a valid name looks like".

Stefan's co-worker didn't like that line of thought. They wrote a set of regexes that gradually grew to account for more and more exceptions.

public final static Pattern Lastname = Pattern.compile("^[\\pL]+[\\.\\']?((\\-| )*[,\\pL]+[\\.]?)*$");
public final static Pattern Firstname = Pattern.compile("^[\\pL]+[\\.]?((\\-| )[\\pL]+[\\.]?)*$");
public final static Pattern AcademicTitle = Pattern.compile("^[\\pL]+[\\.]?((\\-| )[\\pL]+[\\.]?)*$");
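To get a feel for what a rule like this rejects, here is an approximate Python port of the Firstname pattern. Python's standard re module has no \pL, so the sketch substitutes [^\W\d_] as a rough stand-in for "any letter", and it uses fullmatch in place of the ^ and $ anchors. The sample names are invented for illustration.

```python
import re

# Approximation: [^\W\d_] is "a word character that is not a digit or
# underscore", which is close to, but not identical to, Java's \pL.
LETTER = r"[^\W\d_]"
FIRSTNAME = re.compile(rf"{LETTER}+\.?((-| ){LETTER}+\.?)*")


def is_valid_firstname(name):
    """True if the whole string fits the approximated Firstname pattern."""
    return FIRSTNAME.fullmatch(name) is not None


# Invented sample names, not taken from the customer data.
for name in ["Mary-Jane", "José", "D'Arcy", "Mary  Jane"]:
    print(name, is_valid_firstname(name))
```

Under this approximation the first two names pass, while the apostrophe in "D'Arcy" and the doubled space in "Mary  Jane" are enough to fail a record.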

These validations were used to process a bulk upload of customer data. When a single record failed validation, it wouldn't be saved, and all of the following records would fail silently. This created a number of messes.
