A Backup Pipeline
"Um... can you come take a look at something for me?"
Pat looked up from happily hacking away at some new features to see Milton hovering at the edge of the cubicle.
"I think I messed up," Milton added.
One of their company's main internal systems was a data processing pipeline. "Pipeline" might be a bit generous, as in practice, it was actually a webwork of shell scripts and Python that pulled data in from files, did "stuff" to that data, and dropped the results into other files, which could then be read in by different scripts. In actual use, this usually meant people grabbed the latest version of the scripts from source control, modified and tweaked how they executed to answer one specific data-related question, and if that particular process seemed like it might have value, they'd clean the code up a bit and then try and merge it back up into source control. Otherwise, if they didn't think they'd need it again, they'd just reset back to HEAD.
Some folks, though, like Milton, mostly kept their own copy of all the scripts. Or in Milton's case, multiple copies. Milton knew the data processing pipeline better than anyone, but the vast majority of that knowledge was locked up in his personal scripting library.
"I was thinking I should probably try and contribute changes back upstream," Milton said. "So, like, I've got a script that's called by a script, which is called by a script, and it depends on having a bunch of shell variables created, like $SETUP_DIR."
Pat nodded along.
"So I wanted to refactor that into an argument, so other people could use it. And I did... but I forgot to change the calling scripts to pass the argument before I tried to test it."
Specifically, Milton's script had a line like this:
#!/bin/sh
rm -rf $SETUP_DIR/*/
Which he refactored into a line like this:
#!/bin/sh
rm -rf $1/*/
Shell scripts don't care whether these variables exist. Milton had an environment which always guaranteed $SETUP_DIR existed. But $1 is the first argument, and if you don't pass one, it expands to an empty string. So Milton's new script, when executed with no arguments, expanded to rm -rf /*/, deleting everything his account had access to.
Mostly that meant lots of failed attempts to delete files he didn't have the rights to. It also meant his home directory went away, along with his entire packrat pile of spaghettified scripts that were absolutely impossible to reconstruct, as they'd never been placed in source control.
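For the curious, here is a minimal sketch of how the refactored script might have defended itself. This is not Milton's code: the function name clean_setup_dir and its error messages are made up for illustration. It only assumes a POSIX shell, where ${1:?message} aborts if the first argument is unset or empty.

```shell
#!/bin/sh
# Hypothetical hardened version of the refactored script. The name
# clean_setup_dir and the messages are illustrative, not from the original.
set -u  # treat unset variables as errors instead of empty strings

clean_setup_dir() {
    # ${1:?message} aborts with the message when $1 is unset or empty,
    # so the rm below can never expand to "rm -rf /*/".
    dir="${1:?usage: clean_setup_dir SETUP_DIR}"

    # Refuse the filesystem root and anything that is not a directory.
    if [ "$dir" = "/" ] || [ ! -d "$dir" ]; then
        echo "refusing to clean: $dir" >&2
        return 1
    fi

    # Quote the variable, but leave the glob unquoted so it still expands.
    # As in the original, "*/" matches subdirectories only.
    rm -rf -- "$dir"/*/
}
```

Called as clean_setup_dir "$SETUP_DIR", this removes the subdirectories of the given directory, just as the original line did. Called with no argument, it stops before rm ever runs.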
"There's a way to fix this, right?" Milton asked.
"I mean, sure. You can restore from the last backup you took," Pat said.
While all the Windows boxes ran an automated backup tool that was installed by default, none of the Linux boxes were so configured. The support team took the stance that if you were technical enough to be running Linux and writing shell scripts, you were technical enough to set up your own backup solution. There was a SAN available to everyone for exactly that purpose.
"Oh, I... never set up a backup," Milton whispered. "Well... at least I didn't push?"
Pat wondered if Milton was learning the right lesson from this mistake.