Spam Filtering
Recently, Soylent News discussed adding more labels to the moderation system. Although opinions on "Disagree" and "Factually Incorrect" may still be varied, nearly everyone supported the addition of a "Spam" label.
For Pipedot, we've gone ahead and added the latter. Moderating a comment as "Spam" will decrease its score by one and flag it for further review by an editor. This way, normal users can greatly help the editors identify junk comments.
Once an editor marks a comment as spam, the message will be "hidden" one step deeper than the normal "Hide Threshold" slider setting. However, comments are never deleted. If you want to continue to see all comments, including the spam, click the "Show Junk Comments" checkbox on your profile settings page. Similar to the current blue (new) and gray (seen) rendering, the title bar of junk comments will be colored red to easily differentiate them from the good stuff.
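The two-stage flow described above (a user flags, an editor confirms) can be sketched roughly as follows. This is only an illustration, not Pipedot's actual code: the `Comment` fields, the function names, and the reading of "one step deeper" as "hidden unless the reader opts in to junk" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    # All field names here are hypothetical, chosen for the sketch.
    score: int = 1
    flagged_for_review: bool = False  # set when a user moderates as "Spam"
    junk: bool = False                # set only when an editor confirms
    seen: bool = False


def moderate_as_spam(comment: Comment) -> None:
    """A normal user's "Spam" moderation: minus one, plus a flag for editors."""
    comment.score -= 1
    comment.flagged_for_review = True


def editor_mark_junk(comment: Comment) -> None:
    """An editor confirms the flag; the comment is kept, never deleted."""
    comment.junk = True


def title_bar_color(comment: Comment) -> str:
    """Red for junk, otherwise the existing gray (seen) / blue (new) scheme."""
    if comment.junk:
        return "red"
    return "gray" if comment.seen else "blue"


def is_visible(comment: Comment, hide_threshold: int, show_junk: bool) -> bool:
    """Assumed rule: junk ignores the slider and needs the profile checkbox."""
    if comment.junk:
        return show_junk
    return comment.score >= hide_threshold
```

Note that a user's "Spam" moderation alone never turns the title bar red in this sketch; only the editor's confirmation does.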
Getting back to the regexps, it's hard to say what (if anything) would work for Pipedot without a good overview of the crap being submitted. One general technique that does seem like it would work well for typical forum spam (including your example) is to trigger on excessive use of certain punctuation marks, particularly in subjects: commas and hyphens seem well liked by many forum spammers, and the one in your example put four in there. Ideally you'd probably also want to require that multiple rules match before a post goes into the moderation queue, or even use a basic scoring system like SpamAssassin et al. do, but based on the comments above that's probably overkill, at least at present. Ultimately, though, it's still an arms race, and the spammers will adapt as soon as they realise they are being blocked; sometimes you just have to go for the easy stuff and accept that the rest might need manual handling later.
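For what it's worth, here is a rough sketch of the punctuation trigger combined with a basic weighted score. Everything in it is made up for illustration: the rule names, the weights, the thresholds, and the two extra rules (link count and repeated exclamation marks) are assumptions, not anything Pipedot or SpamAssassin actually uses.

```python
import re

# Each rule is (name, weight, test). Weights and cut-offs are guesses.
RULES = [
    # Four or more commas/hyphens in the subject line.
    ("subject_punctuation", 2,
     lambda subject, body: len(re.findall(r"[,-]", subject)) >= 4),
    # Three or more links in the body.
    ("many_links", 2,
     lambda subject, body: len(re.findall(r"https?://", body)) >= 3),
    # A run of three or more exclamation marks anywhere.
    ("shouting", 1,
     lambda subject, body: re.search(r"!{3,}", subject + " " + body) is not None),
]


def spam_score(subject, body):
    """Return the total weight and the names of the rules that matched."""
    hits = [(name, weight) for name, weight, test in RULES if test(subject, body)]
    return sum(weight for _, weight in hits), [name for name, _ in hits]


def needs_review(subject, body, threshold=3):
    """Queue a post for moderation only when the combined score is high enough."""
    score, _ = spam_score(subject, body)
    return score >= threshold
```

With a threshold of 3, no single rule is enough on its own, so a legitimate subject that happens to contain four commas is not queued unless something else also matches.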