Study says typosquatting is big money. But there may be a conflict of interest.
A new study suggests that Google may earn $497 million a year from typosquatters targeting the 100,000 most popular .com sites on the web.
Although I question some of the logic in the report and its appendices, and it is tainted by one of the authors being involved in a lawsuit against Google for typosquatting, the data is still fascinating.
First, the claim of $497 million in revenue from typosquatting each year. It rests on some assumptions that are weak at best:
-The study reviewed the top 3,264 .COM web sites, and found that 0.7% of their traffic was from typos. It then extrapolated this to the top 100,000 .com sites, assuming the same share of typo traffic for those sites as well. The truth is, typosquatters rarely target these less popular sites, meaning that less of their traffic is typosquatted.
-The authors concluded that domain revenue per search is the same as that of searches at Google. This is based on a very narrow Efficient Frontier case study.
On the other hand, the authors may have under-counted traffic in other ways. For example, the report doesn’t take into consideration Google and Yahoo’s error redirect services, in which ISPs and computer makers hijack typos and send them to pages full of ads. It also doesn’t take into consideration typos such as .cm.
But the report is certainly fascinating. The authors found the most popular Google advertiser IDs on typo domains. They feature common names such as GoDaddy (probably through auto-parked pages on customers’ domains) and Sedo.
For nameservers with over 100,000 domains, the report found that dnsnameserver.org (FirstLook) had the highest ratio of typo domains at 4.75%, followed by trellian.com at 4.47% and hitfarm.com at 3.76%. When you look at nameservers with over 25,000 domains, the top two are 365.com at 8.89% and ParkLogic.com at 8.19%.
Of course, not all typos are used inappropriately. CitizenHawk shows up as having 32% of its domains as typos, but the company is holding these on behalf of the trademark holders.
The report also found a number of extraordinarily “clean” nameservers. One of those is Michael Berkens’ MostWantedDomains.com.
Another interesting finding is the domains that are most typosquatted by competitors to send traffic to their sites. It names the competitor that is receiving the traffic, too.
One final point. The authors found their crawlers blocked after hitting domain parking companies’ nameservers excessively. They suggest the parking companies were trying to thwart the authors’ efforts. My guess is their click fraud and DDoS protection systems just kicked in.
At any rate, the research is worth reviewing.
[Via New Scientist]
See, this point discredited the entire article:
“One final point. The authors found their crawlers blocked after hitting domain parking companies’ nameservers excessively. They suggest the parking companies were trying to thwart the authors’ efforts. My guess is their click fraud and DDoS protection systems just kicked in.”
You are right. They are just systems that keep server expenses at 50k instead of 500k when robots are pounding and crawling excessively.
Completely biased analysis.
@ Matt – huh? I don’t understand what you just wrote.
I think he is on to something.
Parking companies block repeating requests to stop crawlers from wasting server resources.
The guy that wrote the article is biased. He wrote that 500 million a year is made in typos, yet most companies have around 2% typo usage according to his document. That would put the industry worth at 25 billion, which is wrong.
I meant the guy that published the report**
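The 25 billion figure above appears to come from dividing the report's roughly 500 million in annual typo revenue by a 2% typo share. A minimal Python sketch of that back-of-envelope step, assuming (as the comment does, not the report) that revenue scales in direct proportion to the share of domains that are typos; the function name is made up for illustration:

```python
def implied_industry_revenue(typo_revenue: float, typo_share: float) -> float:
    """Scale typo revenue up to a whole-industry figure.

    Assumes revenue is proportional to the share of domains that are typos.
    That is the commenter's premise, not a claim made in the report.
    """
    if not 0 < typo_share <= 1:
        raise ValueError("typo_share must be in (0, 1]")
    return typo_revenue / typo_share


# Figures quoted in the comment: about $500M a year and about 2% typo usage.
estimate = implied_industry_revenue(500e6, 0.02)
print(f"${estimate / 1e9:.0f} billion")
```

The result only means something if typo domains earn about as much per domain as everything else, which is exactly the premise under dispute.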
@ Matt – I gotcha
“The truth is, typosquatters rarely target these less popular sites, meaning that less of their traffic is typosquatted.”
They assumed a long tail when there is really a short tail.
I have met Ben Edelman a few times in the past. He’s a very smart guy, and I’m not saying his numbers are flawed. But he is just using an automated system that guesstimates how bad typo-squatting really is.
I think the numbers are much worse than most people think. Yahoo does block about 95% of trademarks today, but this paper wasn’t based on trademarks, just the top domains. Not all have trademarks.
Who owns hitfarm again?
Where in the report does it mention mostwanteddomains? Didn’t see or find that anywhere…
@ dch – http://www.benedelman.org/typosquatting/bottom25kns.html
hitfarm – Kevin Ham
@Donny
The problem really is that, if you think about it logically, a trademark is just a name on a piece of paper on some government desk or in a government cabinet.
See, Donny, you and I both know the industry overall only has about 5% or less of its revenue from truly type-in generics, and I’m not talking about generics like Single.com which are really typos of Singles.com.
The rest is really search engines (WhyPark is probably made up of most of it), typos (DS, Skenzo, and most other parking companies), and expired site traffic which mostly Y! companies are monetizing.
Think about it logically. Expired site traffic is still a visitor looking to get to some website. He/she is pretty much tricked into clicking advertisements the same way as with typos. You just can’t sugar coat it otherwise you are really just fooling yourself and no one else around you.
On top of that, if you really think about things logically, you’ll notice there is no real difference between typing a trademark term into an address bar and typing it into the search box on the Google.com or Yahoo.com page. It’s just a TECHNICAL difference. Nobody is forcing users to click the ads on either the search results pages on G or Y or the parked pages.
On top of that, the biggest cybersquatters are usually the ones that say they are against cybersquatting. Mainly because they are too stupid and biased to realize it. I’ll give you a few prime examples: Verizon with the 404 pages and Microsoft with the default search engine traffic/DNS error traffic. Both of them are making more than any parking company. Perhaps not Verizon, but I am sure Microsoft, with the DNS error monetization and address bar default search engine monetization, is able to monetize more trademark terms than parking companies.
On top of this, I talked to an ISP recently that did 404 traffic, and they literally believed that their 404 traffic was helpful to users, and that the parking companies were crooked for monetizing parked domains. I just didn’t know what to think about the guy, other than the fact that he is just fooling himself.
You just can’t sugar coat it Donny. All traffic is really created equal. Expired traffic is no different than typos other than the fact that some person or company decided to file a piece of paper with the government.
But this Ben guy has his figures completely off. We all know DS has more than 3% of domains that are typos.
But his revenue figures are also very inflated and he assumes. You know what they say about assuming.
Remember that Ben’s calculations were only based on the top x domains based on the Alexa rating. Sure DS probably has more than 3% TMs. I’m sure so do you. I think his calculations are a little high for us, because we do actively block TMs with multiple systems today.
I won’t comment on nxd traffic at the ISP level, because I think they are all full of it.
But I do think that Google makes about $497M a year on trademarks, I just don’t think it’s based on domains, I think it’s also based on nxd and other traffic as well.
Donny – I found it interesting that with the data they had on Parked, they only showed two or three supposed typos.
@Teddy: We assumed a long tail because the data supports it. For example, discoveramerica.com is rated the 100,000th most popular site according to Alexa. We found ten typo domains corresponding to discoveramerica.com. Of course, this is less than we found for more popular sites, but that is to be expected given that the site is less popular. Consequently, I think it is fair to include the longer tail in our estimations of typo traffic.
@ Tyler – but didn’t you still assume the same percentage of typo traffic for the smaller web sites?
@Andrew: I think you misread the data we presented. For parked.com, we found its name server resolving 13,993 typo domains, around 2.5% of all domains being resolved by parked.com’s name server.
@ Tyler – yes, but in the “example” typos section of your appendix, you showed a bunch for some of the companies and only a few for others.
@Andrew (post 17): Yes, that’s correct. We assume that the typo sites receive 0.7% of the popular site’s traffic. While it is true that the most popular sites encounter more typo domains, it is likely that on the less popular sites the closest typos are the ones registered, while the more distant typos that receive fewer visitors are not.
It’s also true that most of the traffic and revenue comes from the more popular sites: the 3,264 .com domains we studied in greater detail in our paper account for 1/3 of the total estimated traffic and revenue for the top 100,000 sites. That said, we can’t write off the contributions from the tail.
@ Tyler – I understand your thinking, I just don’t think the tail traffic is the same percentage as the major typos.
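One way to see how much this disagreement matters is to vary the tail assumption. A rough Python sketch, assuming only the figures quoted in this thread ($497 million total, with the 3,264 closely studied sites contributing about one third); `tail_factor` is a hypothetical knob, not something from the report, that scales the typo share of the remaining sites relative to the 0.7% assumed for the studied ones:

```python
TOTAL_ESTIMATE = 497e6   # annual revenue estimate quoted from the study
HEAD_FRACTION = 1 / 3    # share attributed to the 3,264 studied sites


def adjusted_estimate(tail_factor: float) -> float:
    """Re-scale the tail's contribution; 1.0 reproduces the quoted total."""
    if tail_factor < 0:
        raise ValueError("tail_factor must be non-negative")
    head = TOTAL_ESTIMATE * HEAD_FRACTION
    tail = TOTAL_ESTIMATE * (1 - HEAD_FRACTION) * tail_factor
    return head + tail


# Compare a few illustrative tail assumptions (the values are arbitrary).
for factor in (1.0, 0.5, 0.25, 0.0):
    millions = adjusted_estimate(factor) / 1e6
    print(f"tail at {factor:.0%} of head rate: ${millions:.0f}M")
```

Even with no tail contribution at all, the studied sites alone would account for roughly $166 million under these figures, so the tail assumption changes the size of the number more than the overall conclusion.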
Tyler,
Why would you use Alexa to measure when it’s probably the worst service of the big players (Compete, Quantcast, Alexa)?
Also, I am uncertain about your conclusion that Google is best suited to stop it. I would view it as akin to the ISPs’ role. Companies like CitizenHawk are *legitimately* monetizing trademark names (if I recall correctly, the deal is that they get domains for holders in exchange for the right to monetize them). Also, typosquatting and trademark infringement aren’t synonymous; you (at least Ben) are lawyers and know this.
Furthermore, your suggested solutions really wouldn’t work in practice. The value is the traffic; it will simply be arbitraged onto PPC another way or sent to other forms of monetization. Trying to cut them off will simply drive that traffic ‘underground’ and redirect it in other ways.