Search engine files patent application for methods of detecting link spam.
Ever since webmasters caught on to how Google uses the number of incoming links to a web site in its ranking algorithms, they have tried to game the system. From selling links to creating link farms, SEO has focused much of its attention on link building over the years. As a result, the search engines have had to counter by trying to separate good links from bad.
Yahoo has filed a patent for discovering abnormal link structures and demoting the rank of web pages based on these abnormal incoming links.
Titled “Detection of Undesirable Web Pages”, patent application 20100094868 (pdf) was filed in October 2008 and published today.
The patent describes a statistical method of determining when links pointing to a web page have been artificially generated. The method establishes a normal range for a page's incoming links across a number of factors, and then looks for patterns that do not conform to the natural change in links over time:
As the value of the normalized entropy metric associated with a set of inlinks referencing the destination page approaches an outer limit of an acceptable range (e.g., 0 or 1), the likelihood that the set of inlinks to the destination web page is “unnatural” increases. In other words, there exists an inference that some of the inlinks among the set have been created for the purpose of artificial promotion of the destination web page rather than based on the genuine interests from a diverse set of independent users.
Some of the factors considered include:
-IP Address of link source
-Top level domain of links
-Language of each link (e.g. English, French, German)
-Autonomous system (i.e. a networked system of computing devices)
-Anchor text of links
-PageRank of incoming links
-Link age-attenuation weightings
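To make the entropy idea concrete, here is a minimal sketch of how a normalized entropy score could be computed for one factor, such as the top level domain of each inlink. The patent excerpt above does not give a formula, so the normalization used here (Shannon entropy divided by the log of the number of distinct values seen) is an assumption, as are the function names and the 0.1 and 0.9 cutoffs, which are purely illustrative.

```python
import math
from collections import Counter


def normalized_entropy(values):
    """Shannon entropy of the value distribution, scaled to the range [0, 1].

    Assumption: normalize by log(number of distinct values). The patent
    may define its metric differently.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        # No inlinks, or every inlink shares one value: no diversity at all.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def looks_unnatural(values, low=0.1, high=0.9):
    """Flag a set of inlinks whose score nears either end of the range.

    The low/high thresholds are hypothetical, chosen only for illustration.
    """
    score = normalized_entropy(values)
    return score <= low or score >= high


# Hypothetical inlink data: the top level domain of each linking page.
mostly_one_tld = ["com"] * 99 + ["org"]
mixed_tlds = ["com"] * 60 + ["org"] * 25 + ["net"] * 10 + ["de"] * 5

print(normalized_entropy(mostly_one_tld), looks_unnatural(mostly_one_tld))
print(normalized_entropy(mixed_tlds), looks_unnatural(mixed_tlds))
```

The same function could be run once per factor in the list above (IP address, language, anchor text and so on), with each score checked against its own acceptable range.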
Of course, many search engine optimization experts already try to skirt these measures by distributing links in more natural-looking ways. Further, I’d be surprised if some of Yahoo’s competitors weren’t using some of these same techniques before the patent application was filed.