The fight against comment spam


By Dylan Bushell-Embling
Wednesday, 11 June, 2014


“My cousin’s boyfriend’s aunt makes $3000 an hour working from home ...” We’ve all seen spam messages pop up in the comment section of popular websites, and some of us have wondered where they came from and what nefarious purpose they serve.

A new report from Imperva indicates that a relatively small number of attack sources generate the majority of the world’s comment spam, and that early detection and blocking of a spammer can mitigate the majority of malicious activity.

The report, based on an analysis of comment spam traffic from around 60 applications over a two-week period, suggests that just 17% of comment spammers generate the majority of comment spam. Around 80% of comment spam traffic is generated by 28% of attackers.
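As a rough illustration of how a concentration figure of this kind can be computed, the sketch below ranks sources by request volume and finds the smallest fraction of them that accounts for a given share of traffic. The function name and the request counts are invented for the example; they are not Imperva's data or method.

```python
def share_of_sources_for(traffic_share: float, requests_per_source: list[int]) -> float:
    """Smallest fraction of sources that together produce `traffic_share` of all requests."""
    ranked = sorted(requests_per_source, reverse=True)  # busiest sources first
    target = traffic_share * sum(ranked)
    running = 0
    for count, requests in enumerate(ranked, start=1):
        running += requests
        if running >= target:
            return count / len(ranked)
    return 1.0  # only reached when the list is empty


# Hypothetical request counts for ten spam sources (made-up numbers).
toy_counts = [500, 320, 90, 40, 20, 12, 8, 5, 3, 2]
fraction = share_of_sources_for(0.8, toy_counts)
print(f"{fraction:.0%} of sources generate 80% of the traffic")
```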

Attackers make use of comment spam for various reasons, the most prevalent being search engine optimisation (SEO) - using links embedded within comments to increase the page rank of a promoted site, which may then be used for advertisement or malware distribution.

Attackers have also been known to use comment spam for purposes including click fraud and direct advertising.

According to the report, a comment spam attack typically follows three basic stages. The first is target acquisition, or URL harvesting - finding suitably popular and vulnerable sites to post comments on. URL quality is measured by the site’s search engine ranking, the difficulty of posting comments and other factors.

Next comes posting the comments on the chosen page, and then verifying that the comments were published. Successful comment spammers achieve the volume required to profit by automating these three stages.

To generate comments, attackers use programs such as Comment Blaster, which is able to automatically generate relevant comments based on entered keywords such as “music”. To get around spam blockers designed to detect duplicate messages, attackers use a method known as ‘spintax’, which involves creating comments with multiple possible variations that have the same basic message. The report gives the example of substituting “reading” with “studying” and “interesting” with “enriching”.
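To see why exact-duplicate filters struggle with spintax, it helps to look at how quickly one template multiplies. The sketch below assumes the commonly seen brace-and-pipe notation (`{a|b}`), which the report summary above does not spell out, and handles only flat, non-nested groups. The function name `expand_spintax` is made up for the example.

```python
import itertools
import re

# Matches one flat spin group such as {reading|studying}.
SPIN_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_spintax(template: str) -> list[str]:
    """Return every variant of a flat (non-nested) spintax template."""
    # Splitting on a pattern with one capture group alternates literal text
    # (even positions) with the contents of each spin group (odd positions).
    parts = SPIN_GROUP.split(template)
    choices = [part.split("|") if i % 2 else [part] for i, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


variants = expand_spintax("I enjoyed {reading|studying} this {interesting|enriching} post.")
for variant in variants:
    print(variant)
```

Two groups of two words already yield four distinct strings, so a filter that compares comments character for character sees four unrelated messages.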

Comment spamming tools also typically offer the attacker the ability to automatically submit comments on many different URLs at once. The more sophisticated tools can get around protection mechanisms such as CAPTCHA forms or user authentication.

Some tools also offer the ability to provide feedback specifying whether or not a spam comment was successfully posted to an individual site.

But just as comment spam tools are getting more sophisticated, so too are the tools website operators are using to defend against the nuisance.

One popular mitigation technology in use today is content inspection. This technique involves checking each posted comment against a predefined set of rules, such as whether it uses logical sentences related to the subject at hand. But this technology can produce false positives, blocking legitimate comments.
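A minimal sketch of rule-based content inspection might look like the following. The two rules (a cap on links and a crude topical-overlap check), the thresholds and all names are assumptions chosen for illustration, not rules taken from the report.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)
WORD_PATTERN = re.compile(r"[a-z']+")


def inspect_comment(comment: str, topic_words: set[str], max_links: int = 1) -> list[str]:
    """Return the names of the rules a comment breaks (an empty list means it passes)."""
    broken = []
    # Rule 1 (hypothetical): too many links suggests link-building spam.
    if len(URL_PATTERN.findall(comment)) > max_links:
        broken.append("too_many_links")
    # Rule 2 (hypothetical): the comment shares no words with the page's topic.
    words = set(WORD_PATTERN.findall(comment.lower()))
    if not words & topic_words:
        broken.append("off_topic")
    return broken


topic = {"spam", "comment", "captcha", "seo"}
spam = "Make $3000 an hour! http://example.com/a http://example.com/b"
legit = "Great write-up, thanks for sharing."

print(inspect_comment(spam, topic))
print(inspect_comment(legit, topic))
```

The second comment is genuine but still trips the topical rule, which is the kind of false positive the technique is prone to.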

Another mitigation technique judges whether a comment is spam by the reputation of its poster, for example whether previous traffic from that source was classed as comment spam. Website operators have used crowdsourcing to set up online repositories for checking a comment source’s reputation, the most popular being www.projecthoneypot.org and www.stopforumspam.com/. Imperva said its research had found these repositories to be “rather reliable”.
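The idea can be sketched with a small in-memory reputation store. This is a local stand-in, not the interface of either repository named above; the class name, the threshold of three reports and the IP addresses (taken from ranges reserved for documentation) are all assumptions.

```python
from collections import Counter


class SourceReputation:
    """Tracks how often each source has been flagged for comment spam."""

    def __init__(self, block_threshold: int = 3):
        # Hypothetical policy: block a source after this many spam reports.
        self.block_threshold = block_threshold
        self.spam_reports = Counter()

    def report_spam(self, source_ip: str) -> None:
        self.spam_reports[source_ip] += 1

    def is_blocked(self, source_ip: str) -> bool:
        # A Counter returns 0 for sources it has never seen.
        return self.spam_reports[source_ip] >= self.block_threshold


reputation = SourceReputation()
for _ in range(3):
    reputation.report_spam("203.0.113.7")

print(reputation.is_blocked("203.0.113.7"))
print(reputation.is_blocked("198.51.100.4"))
```

Because a small share of sources produces most of the traffic, a store like this only needs to catch a handful of addresses early to cut volume sharply.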

Another popular tactic involves removing a spammer’s motivation to target a webpage. By setting the rel attribute of the HTML anchor element that defines a hyperlink to the nofollow value - which tells a search engine’s indexing algorithm not to follow the link or credit it to the linked site - website owners can eliminate the SEO benefits of comment spam.
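In practice this means the site renders every link submitted in a comment with `rel="nofollow"` on the anchor. The helper below is a minimal sketch of that step; the function name is invented, and a real site would also validate the URL scheme before output.

```python
from html import escape


def render_comment_link(url: str, text: str) -> str:
    """Render a user-submitted link so search engines do not credit it to the target."""
    # Escape both values so the comment cannot break out of the attribute or tag.
    return f'<a href="{escape(url, quote=True)}" rel="nofollow">{escape(text)}</a>'


link = render_comment_link("http://example.com/?a=1&b=2", "my site")
print(link)
```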

Imperva Vice President for Asia-Pacific and Japan Stree Naidu said that detecting a comment spammer early using these tools can mitigate most of the potential impact of an attack. “Quickly identifying the source of an attack and blocking comments from the source can greatly limit the attack’s effectiveness and minimise its impact on your website,” he said.

Image courtesy of Bryan Kerr under CC

