WordPress: Hint for False-Positive Spam

Paper spam?! Another little piece of code for my blog, a solution that may not be perfect but is helpful…

Many have encountered this problem already: A comment was erroneously marked as spam by the spam filter, be it the well-known Akismet (like here), another plugin, or just an overly strict blacklist. Such false positives happen on my blog about once every three weeks on average.

The experienced blogger or blog commenter knows that (s)he just has to wait until the blog owner (hopefully) manually approves the comment. The inexperienced commenter, however, may be at a complete loss when the page reloads at its top and the comment is nowhere to be found even after scrolling down; some may try to comment again, some may never return.

A helpful hint might prevent this, but (unlike for comments in moderation) WordPress doesn’t offer such a message. The normal loop that outputs the comments has no way of knowing which comment belongs to the current visitor: the part of the URL that specifies this, #comment-123, which also makes the browser scroll to the right position [1], is not part of the request that’s sent to the server but stays inside the browser.
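To make this concrete, here is a small JavaScript sketch that splits a comment permalink into the part that goes into the HTTP request and the fragment that stays in the browser. The example.com URL is made up for illustration; the code only relies on the standard URL class that browsers and Node.js provide.

```javascript
// Hypothetical permalink as the browser sees it after the comment redirect.
const permalink = new URL("https://example.com/2010/05/some-post/?lang=en#comment-123");

// What the browser puts into the request: path and query string only.
const requestTarget = permalink.pathname + permalink.search;

// What never leaves the browser: the fragment.
const fragment = permalink.hash;

console.log(requestTarget); // /2010/05/some-post/?lang=en
console.log(fragment);      // #comment-123
```

Whatever the server-side comment loop can look at corresponds to requestTarget; the comment number only exists in fragment.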

Now one might add the comment number as a query parameter to the URL during the redirect after the comment posting (and check for it in the output loop), but I somehow don’t like this method: who knows whether this would leave several such URLs floating around the search engines. I found two other solutions I’d like to show you here:

Solution 1: Spam comment from the same IP address?

The first idea: During comment output, check whether there’s a spammed comment from the past two minutes from the same IP address that the current request is coming from. In a function for the theme’s functions.php, it looks like this:

function ag_spammed_comment ($gotcomments) {
    global $wpdb, $post;
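    // Find spam comments (not trackbacks) on this post that were sent from the
    // requesting IP address within the last 120 seconds. $wpdb->prepare()
    // escapes the post ID and the IP address before they reach the query.
    $spamcom = $wpdb->get_results ($wpdb->prepare ("
        SELECT * FROM $wpdb->comments
        WHERE comment_post_ID = %d
          AND comment_author_IP = %s
          AND comment_approved = 'spam'
          AND comment_type = ''
          AND TIME_TO_SEC(TIMEDIFF(NOW(),comment_date))<120",
        $post->ID, $_SERVER['REMOTE_ADDR']));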
    if ($spamcom) {
        if (!$gotcomments) echo '<ol class="commentlist">';
        foreach ($spamcom as $sc) {
            echo '<li id="comment-'.$sc->comment_ID.'" class="comment caughtasspam">'.
            '<strong>Apparently, the automatic spam filter marked your comment as spam.</strong><br/>'.
            'If this was a mistake, please be patient until the comment is approved manually.'.
            '</li>'."\n";
        }
        if (!$gotcomments) echo '</ol>';
    }
}

Call this from comments.php with ag_spammed_comment (true); (wrapped in <?php ?>) after the output of the existing comments, and with false instead of true in the branch for a post that has no comments yet. Of course you only need this distinction if you want to include (and style) the message in the ol/ul comment list; if you use a separate div block, you can do without it.

Then of course style the .caughtasspam class in your style.css accordingly (e.g. with a red frame).

However, there’s a little…

Problem: The cache

If you’re using a cache plugin such as WP Super Cache that temporarily stores generated pages, there’s a catch: such a plugin, quite reasonably, doesn’t invalidate the affected page’s cache when a comment is marked as spam. It keeps delivering the same old page, and the code above never gets a chance to print its hint.

One solution: Modify the plugin such that spam comments (not spam trackbacks) do delete the cached page. For WP Super Cache, this can be done in wp-cache-phase2.php in function wp_cache_get_postid_from_comment where after

} elseif ( $comment['comment_approved'] == 'spam' ) { 

you replace these two lines:

if ( isset( $GLOBALS[ 'wp_super_cache_debug' ] ) && $GLOBALS[ 'wp_super_cache_debug' ] ) wp_cache_debug( "Spam comment. Don't delete any cache files.", 4 );
return $postid;

with these:

//--ag: for false-positive message
if ( $comment['comment_type'] == '' ) {
    if ( isset( $GLOBALS[ 'wp_super_cache_debug' ] ) && $GLOBALS[ 'wp_super_cache_debug' ] ) wp_cache_debug( "Spam comment. But update cache for post $postid to allow for false-positive message.", 4 );
    return wp_cache_post_change($postid);
} else {
    if ( isset( $GLOBALS[ 'wp_super_cache_debug' ] ) && $GLOBALS[ 'wp_super_cache_debug' ] ) wp_cache_debug( "Spam trackback. Don't delete any cache files.", 4 );
    return $postid;
}

Then the hint works, though at the expense of some performance if real spammers flood a page with spam comments while it is also frequently viewed by visitors. I guess that won’t happen very often on most blogs.

The bigger disadvantage in my opinion, however, is that you thus have (another?) plugin where you have to be careful to copy the changes to the new version during an update. Since I’d like to skip such tasks, I’m using another way:

Solution 2: JavaScript (with jQuery)

Of course this solution won’t work if the commenter has disabled JavaScript in the browser, a disadvantage I’m willing to accept in the hope that this combination is rare enough. On the plus side, unlike solution 1, it also works if someone is using proxy servers that change on every request.

There are even two variations of this solution, the first of which I’ll only outline briefly. You use the comment_post_redirect filter (which is called in wp-comments-post.php) to modify the redirection’s target URL in case of a spam comment, such that #comment-123 is replaced with something like #spammed. The theme contains a prepared block that is originally set to display: none (or left empty), and JavaScript displays (or fills) it if #spammed is part of the URL, which is something that JavaScript, unlike the server, does have access to.
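The browser side of variation 1 could look roughly like the following sketch. It assumes that the redirect filter has already replaced the fragment with #spammed and that the theme contains a hidden block with the made-up ID spamnotice; both names are placeholders of mine, and the server-side filter itself is not shown.

```javascript
// Assumed marker that the server-side redirect filter puts into the URL.
var SPAM_MARKER = "#spammed";

// True if the fragment of the current URL is exactly the spam marker.
function isSpamMarker(hash) {
    return hash === SPAM_MARKER;
}

// Reveal the prepared block; "spamnotice" is a hypothetical element ID.
// Returns true if the block was found and shown.
function showSpamNotice(doc, hash) {
    if (!isSpamMarker(hash)) return false;
    var box = doc.getElementById("spamnotice");
    if (!box) return false;
    box.style.display = "block";
    return true;
}

// Only run automatically inside a browser.
if (typeof document !== "undefined") {
    showSpamNotice(document, window.location.hash);
}
```

The script would have to sit after the prepared block in the page (or run once the document is ready), since otherwise getElementById finds nothing.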

Variation 2, which I’m using here, works without such a filter and just checks whether there’s an element with the ID comment-123 on the (completely loaded) page at all. If not, the message is inserted (via JavaScript to avoid it being indexed by search engines):

<div id="spammedhint" class="comment caughtasspam" style="display:none;"></div>
<script type="text/javascript">
<!--
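var theHash = window.location.hash;
// Only react to fragments of the exact form #comment-<number>; this also keeps
// arbitrary fragment text from being passed to jQuery() as a selector.
if (/^#comment-\d+$/.test(theHash)) {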
    if (jQuery(theHash).length==0) {
        jQuery(document).ready(function() {
            jQuery("#spammedhint").html("<strong>Apparently, the automatic spam filter marked your comment as spam.</strong><br/>"+
            "If this was a mistake, please be patient until the comment is approved manually.").fadeIn();
            var targetOfs = jQuery("#spammedhint").offset().top;
            jQuery("html,body").animate({scrollTop: targetOfs-20}, 500);
        });
    }
}
//-->
</script>

The presence check is done with if (jQuery(theHash).length==0), since jQuery() always returns an object, which means that if (jQuery(theHash)), which one might think of first, is always true. The last jQuery line then scrolls to the message block (or rather, a little above it) within 500 ms; I sometimes had problems with very short times, maybe because the document wasn’t completely ready after all and the browser jumped back to another position. (Maybe that only happens on a reload, though, which is often used during testing.)
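The truthiness pitfall can be reproduced without jQuery. In the sketch below, a plain object with length 0 stands in for an empty jQuery result (an assumption for illustration, not jQuery’s actual class), and the helper commentIsMissing is a hypothetical name that shows the same presence check with the native getElementById, which returns null when nothing matches.

```javascript
// Stand-in for an empty jQuery result: an object is truthy even when "empty".
var emptyResult = { length: 0 };
console.log(Boolean(emptyResult));     // true, so "if (result)" always passes
console.log(emptyResult.length === 0); // true, the check that actually works

// Hypothetical helper: the same presence test with the native DOM API.
// "doc" is anything with getElementById, e.g. the browser's document.
function commentIsMissing(doc, hash) {
    var id = hash.replace(/^#/, "");
    return doc.getElementById(id) === null;
}
```

In a browser, commentIsMissing(document, window.location.hash) would play the role of the jQuery length check.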

I added this HTML/JS code (not wrapped in <?php ?>, of course) to comments.php right after <?php if ('open' == $post->comment_status) : ?>, i.e. directly before the output of the input fields.

Now this method has a side effect. On the one hand, the message will unfortunately (but probably very rarely) also appear when someone follows a link with a wrong or nonsensical comment number. On the other hand, it fortunately also appears when someone bookmarked a comment that went through at first but was marked as spam later.

This also allows you to test the function easily: I prepared such a link here. :) You can also write a new comment and include “diesisteinspamtest” (German for “thisisaspamtest”) in it, since I added this “word” to the blacklist. But don’t overdo it, since I have to approve these comments…

So if you don’t use a cache plugin, you can easily use the first solution; otherwise you have to weigh the pros and cons. As I said, I chose solution 2.

Any opinions, criticism, ideas, problems, questions…?

  1. If you have a flawed theme that doesn’t add this ID to the comments, leaving all commenters staring at the top of the page after commenting, now’s finally the time to fix this…
