Yahoo Files Patent App for Discovering SEO Link Spam

Search engine files patent application for methods of detecting link spam.

Ever since people caught on to how Google used the number of incoming links to a web site in its ranking algorithms, people have tried to game the system. From selling links to creating link farms, SEO has focused much of its attention on link building over the years. As a result, the search engines have had to counter this by trying to separate good links from bad.

Yahoo has filed a patent application for discovering abnormal link structures and demoting the rank of web pages based on these abnormal incoming links.

Titled “Detection of Undesirable Web Pages”, patent application 20100094868 (pdf) was filed in October 2008 and published today.

The patent describes a statistical method of determining when links pointing to a web page have been artificially generated. The method determines a normal range of links across a number of factors, and then looks for patterns that do not conform to the natural change in links over time:

As the value of the normalized entropy metric associated with a set of inlinks referencing the destination page approaches an outer limit of an acceptable range (e.g., 0 or 1), the likelihood that the set of inlinks to the destination web page is “unnatural” increases. In other words, there exists an inference that some of the inlinks among the set have been created for the purpose of artificial promotion of the destination web page rather than based on the genuine interests from a diverse set of independent users.
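
To make the quoted idea a bit more concrete, here is a minimal Python sketch of a normalized entropy score for one attribute of a page's inlinks (top-level domain is used as the example). This is not the formula from the patent application: the function name `normalized_entropy`, the choice to normalize by the log of the number of inlinks, and the sample data are all assumptions made for illustration.

```python
import math
from collections import Counter


def normalized_entropy(values):
    """Shannon entropy of the attribute values, scaled into [0, 1].

    Assumption (not taken from the patent text): the entropy is divided by
    log(number of inlinks), so 0 means every inlink shares one value and
    1 means every inlink has a different value.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one inlink")
    if len(counts) == 1:
        # All inlinks share the same value (this also covers a single inlink).
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(total)


# Hypothetical inlink sets, described only by top-level domain.
same_tld = ["info"] * 8                                   # every link from one TLD
mixed_tld = ["com", "com", "com", "org", "net", "com", "de", "com"]
all_different = ["com", "org", "net", "de", "fr", "uk", "jp", "ca"]

scores = {
    "same_tld": normalized_entropy(same_tld),             # 0.0, one extreme
    "mixed_tld": normalized_entropy(mixed_tld),           # somewhere in between
    "all_different": normalized_entropy(all_different),   # close to 1, other extreme
}
```

Under these assumptions, a score that sits very near 0 or very near 1 is the kind of value the quote treats as a hint of "unnatural" linking, while a score in the middle of the range looks more like organic variety. Where the acceptable range actually begins and ends is not something this sketch tries to guess.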

Some of the factors considered include:
- IP address of link source
- Top level domain of links
- Language of each link (e.g. English, French, German)
- Autonomous system (i.e. a networked system of computing devices)
- Anchor text of links
- PageRank of incoming links
- Link age-attenuation weightings

Of course, many search engine optimization experts already try to skirt these measures by spreading links about in more natural ways. Further, I’d be surprised if some of Yahoo’s competitors weren’t using some of these same tactics before the patent application was filed.


  1. Jeremy Leader says

    One interesting question about this patent is, how would Yahoo find out if anyone were infringing on it? My impression back when I worked at Overture and then Yahoo was that the patent attorneys typically wouldn’t file applications for technology that ran solely on the server. The idea was that filing the application tells competitors (and people looking to game the system) your secrets, and if it’s server-side technology, you have no way of catching infringers and enforcing the patent. Only technology that’s visible to the outside world was worth patenting in their view. That would include inventions with a significant client-side component, or things like business processes that are visible to your customers or suppliers, but not things like this.

    I suppose if there were something compelling the revelation of the invention (e.g. if information were subpoenaed in a court case), they might file for a patent, as a poor defense that’s better than none.

  2. says

    This will certainly become interesting. As Jeremy said, it’s pretty much impossible for them to analyze what exactly competitors do, and I’d bet some Google mechanisms aren’t much different. Curious if Google will react in any way or if they’ll just sit and laugh, since Yahoo has now “discovered” what they have already been doing for years 😉
