This is part 5 in a series examining the geography of new domain endings (nTLDs). We began by identifying the most active registrant nations, asking why some countries are over/under-represented. Next, we focused on GEO suffixes like .LONDON and .IRISH, measuring just how local they are (in terms of ownership) and how popular compared to non-GEOs. Now it’s time to cross borders, leave behind place names, and consider a cultural aspect of geography that follows us around the planet: Language.
We can’t understand nTLD distribution without reference to language. .IMMOBILIEN is at home in a cluster of German-speaking nations. Spanish long ago outgrew the confines of Spain; so an extension like .VIAJES travels too, following Columbus. Chinese is spoken in China, of course – but also in Hong Kong and Taiwan. Therefore we’d expect similar patterns of nTLD adoption among those Chinese-speaking countries. Meanwhile, a multilingual Switzerland will presumably be a blend of French, German, and Italian nTLDs. As for the lingua franca, English keyword suffixes could show up anywhere; but we’d expect to see them most in the UK and former British colonies – the USA, Canada, South Africa, Australia, New Zealand, and India.
Really, the nTLD program is all about language. Its aim: To carve up the world into fragmentary name spaces based not only on topic but on vocabulary. Since most nTLDs are meaningful keywords or abbreviations, they’re constrained by or biased toward some language’s geography. .EARTH and .GLOBAL, despite the grand planetary canvas they describe, disclose an English perspective – just like .PLUMBING or .DOCTOR. And that fact does imply that their center of gravity will reside in English-speaking countries.
Anybody who has scanned a list of nTLDs will have noticed the prevalence of English terms – reflecting the lopsided priorities of registry applicants, their own national origins, or perhaps the scale of pent-up market demand in the USA where .COM crowding is felt most keenly. Yet, among nTLDs, more than 40 languages are represented.
So how many nTLDs are English or French or German or whatever? At first glance, that seems a straightforward question. But enumerating suffixes based on language is, in fact, a very tricky problem. Try assigning 1 language to each nTLD, and the process quickly falls apart. In the previous example, .GLOBAL and .DOCTOR don’t exclusively belong to any 1 language. .PLUMBING is uniquely English, and .MOI is surely solely French. But read the list .INTERNATIONAL, .SCIENCE, .CONSTRUCTION, .BOUTIQUE, .EXPERT, .RESTAURANT, .TENNIS, .BIBLE, .DIRECT, .PROTECTION, and .POKER, and your first impression will be English or French depending on context and your mother tongue.
Languages share words, thanks to common origin or regular commerce between them. But inconsistent spellings can severely limit nTLD market size. A French or Italian audience might accept .POKER, but Spanish or Portuguese speakers would expect .POQUER. There goes South America! .FOOTBALL fails to account for the Spanish .FUTBOL, which itself won’t cover “futebol” in Brazil.
Some nTLDs can be labeled with 6 or more languages. .HOTEL is English, yes … and French … and Spanish … and Italian … and Portuguese … and German … and so forth. The word is shared internationally, but it isn’t truly global because plenty of languages do rely on their own native term for hotel. The Arabic word is utterly unrelated, for instance.
Certainly some suffixes can be used globally, irrespective of the audience’s language. Legacy gTLDs such as .COM and .NET and .INFO are treated in that way. .XYZ and .OOO aspire to the same kind of “language-less-ness”. Whatever their actual adoption, in theory such non-keyword TLDs are quite versatile. Arguably, some more meaningful suffixes – e.g. .CLUB, .VIP, and .TOP – achieve a similar global stature because they’re widely recognized loan words. Good for them! Bad for us, though, if we’re trying to categorize nTLDs based on language!
Some of you will remember the first nTLD ever to be released, .شبكة. It’s Arabic. Yet, in another sense, so too are .ARAB, .HALAL, and .ISLAM – all proposed nTLDs. Those transliterations reach beyond Arabic-speaking countries, anywhere Muslims and/or Arabs live – i.e. everywhere. We could as easily call these terms English as Arabic. Not to mention other languages. And does it make sense, really, to classify .شبكة and .HALAL both as “Arabic” in the same sense? Or, for that matter, to treat both .WANG and .在线 as Chinese?
As another example, take .RED. Did the registry intend an English color (as with .PINK) or the Spanish word for “network” (rather like .NET)? Does their assumption or goal even matter anyway? After all, we consumers may repurpose their product, as we have done with ccTLDs such as .ME, .IO, and .TV. Sí, .SOY may have been meant as Spanish “I am”; but one man’s existential echo of Descartes is another man’s bean paste. Ultimately, meaning is in the eye of the beholder. If you stare hard enough, almost any TLD begins to look Chinese.
Abbreviations pose similar challenges. .NGO is English; but .ONG, which has the very same meaning, is Spanish / Italian / French / Portuguese / Romanian. .GMBH is a German abbreviation. .LTDA (like English’s .LTD for “limited”) is Spanish and Portuguese … but not Italian. And .IMMO manages to function in German / Italian / French … but not in Spanish, whose spelling deviates by introducing an “N”: “inmobiliario” as opposed to “immobilien” / “immobili” / “immobilier”.
Getting a headache yet? Good luck labeling .DESI, a term used throughout the multifarious societies of India, Pakistan, and Bangladesh. India alone recognizes 22 official languages! Place names are no less problematic. .PARIS spells Paris in French, English, and other languages besides. But .MOSCOW isn’t Russian; it’s English. The Cyrillic IDN .МОСКВА has a different pronunciation (“Moskva”). And the city would be called Mosca, Moscou, or Moscú in Italian / French / Spanish. There is no consistency at all regarding which words languages do or don’t share. For instance, .BERLIN requires no translation moving from German to English, whereas .WIEN must be rewritten as Vienna.
It’s crucial to acknowledge and discuss all these challenges up front, because I’m about to count nTLDs based on language, and readers must take the following data with a grain of salt. Here goes:
Language | TLDs | TLDs (Unique) | TLDs > 100 | TLDs (Unique) > 100 |
---|---|---|---|---|
English | 530 | 407 | 390 | 293 |
French | 79 | 12 | 64 | 10 |
Chinese | 70 | 69 | 28 | 22 |
Spanish | 66 | 9 | 50 | 7 |
Portuguese | 46 | 3 | 33 | 2 |
Italian | 40 | 1 | 29 | 0 |
German | 39 | 20 | 32 | 18 |
Arabic | 36 | 28 | 3 | 3 |
Japanese | 16 | 10 | 14 | 4 |
Russian | 15 | 15 | 6 | 6 |
Korean | 5 | 4 | 3 | 2 |
Nepali | 4 | 4 | 1 | 1 |
Dutch | 4 | 3 | 3 | 2 |
Farsi | 3 | 3 | 0 | 0 |
Tamil | 3 | 3 | 0 | 0 |
Hindi | 2 | 1 | 2 | 1 |
Turkish | 2 | 0 | 2 | 0 |
Thai | 2 | 2 | 0 | 0 |
Bengali | 2 | 2 | 0 | 0 |
Urdu | 2 | 2 | 0 | 0 |
Kurdish | 1 | 1 | 1 | 1 |
Tatar | 1 | 1 | 1 | 1 |
Basque | 1 | 1 | 1 | 1 |
Breton | 1 | 1 | 1 | 1 |
Welsh | 1 | 1 | 1 | 1 |
Frisian | 1 | 1 | 1 | 1 |
Afrikaans | 1 | 0 | 1 | 0 |
Romanian | 1 | 0 | 1 | 0 |
Hebrew | 1 | 1 | 0 | 0 |
Armenian | 1 | 1 | 0 | 0 |
Bulgarian | 1 | 1 | 0 | 0 |
Georgian | 1 | 1 | 0 | 0 |
Greek | 1 | 1 | 0 | 0 |
Gujarati | 1 | 1 | 0 | 0 |
Kazakh | 1 | 1 | 0 | 0 |
Punjabi | 1 | 1 | 0 | 0 |
Sinhala | 1 | 1 | 0 | 0 |
Telugu | 1 | 1 | 0 | 0 |
Finnish | 1 | 0 | 0 | 0 |
Hungarian | 1 | 0 | 0 | 0 |
Swedish | 1 | 0 | 0 | 0 |
Setting aside Dot Brands such as .LOREAL, .MARRIOTT, and .AARP, there are 761 nTLDs to be looked at. Of those, I’ve marked 2 as “language-less”: .XYZ and .OOO. To all the others, I’ve assigned 1 or more languages. Admittedly, this process reflects my own ignorance and laziness. Apart from English, I only know Spanish and Arabic. Beyond those 3 languages, I made some attempt to identify keywords that might fit French, Italian, Portuguese, and German. Since those were the ONLY languages I inspected when deciding whether a given nTLD is meaningful in more than 1 language, it’s quite likely that most other languages – e.g. Chinese, Dutch, Swedish, Turkish, etc. – have been under-counted.
Consequently, I’m confident that at least 145 nTLDs are meaningful in multiple languages. The remaining 614 suffixes seem – at this juncture – to be unique to a single language. But, of course, if I were to continue looking for instances of these terms in Greek or Bengali or Norwegian dictionaries, some would turn out to span more than 1 language after all. So the 614 number would shrink, and the 145 count would grow. Here’s the best way to interpret these numbers: (1) more than 145 multilingual nTLDs; (2) fewer than 614 monolingual nTLDs.
You’ll notice a column marked “TLDs (Unique)”. This gives a count of nTLDs that appear to be uniquely 1 language. So, by way of example, there are 530 English nTLDs. No more than 407 of these are strictly English. Another 123 English nTLDs (or more) are meaningful in at least 1 other language. Meanwhile, at most 12 nTLDs are uniquely French, though as many as 79 (if not more) can be interpreted as French.
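The difference between the "TLDs" and "TLDs (Unique)" columns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LABELS` sample and the `tally` helper are hypothetical, and the language sets are limited to examples mentioned earlier in this article rather than the full labeled list of 761 nTLDs.

```python
from collections import Counter

# Hypothetical sample: each nTLD maps to the set of languages it was labeled with.
# The language sets follow examples given in the article; the full list is not reproduced.
LABELS = {
    "plumbing": {"English"},
    "moi": {"French"},
    "gmbh": {"German"},
    "hotel": {"English", "French", "Spanish", "Italian", "Portuguese", "German"},
    "ong": {"Spanish", "Italian", "French", "Portuguese", "Romanian"},
    "immo": {"German", "Italian", "French"},
    "xyz": set(),  # marked "language-less"
}


def tally(labels):
    """Return (total, unique) counters keyed by language.

    total[lang]  counts every nTLD labeled with lang ("TLDs" column).
    unique[lang] counts nTLDs labeled with lang and nothing else ("TLDs (Unique)").
    """
    total, unique = Counter(), Counter()
    for langs in labels.values():
        total.update(langs)
        if len(langs) == 1:
            unique.update(langs)
    return total, unique


total, unique = tally(LABELS)
multilingual = sum(1 for langs in LABELS.values() if len(langs) > 1)
monolingual = sum(1 for langs in LABELS.values() if len(langs) == 1)

# Print rows in the same shape as the table: Language | TLDs | TLDs (Unique) |
for lang in sorted(total, key=lambda name: (-total[name], name)):
    print(f"{lang} | {total[lang]} | {unique[lang]} |")
print(f"multilingual: {multilingual}, monolingual: {monolingual}")
```

Because a multilingual suffix is counted once per language, the "TLDs" column sums to more than the number of nTLDs, while the "TLDs (Unique)" column sums to exactly the number of monolingual suffixes.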
The rightmost 2 columns limit attention to nTLDs in which at least 100 domains have been registered, in effect excluding suffixes that have yet to be publicly released. As you can see, only 3 out of 36 Arabic nTLDs have made it this far, whereas for German the fraction is 32 / 39. Even allowing for multilingual nTLDs and stalled rollouts, the preponderance of English keywords is clear as day. Depending on which column we look at, English outnumbers the 2nd most common nTLD language (French or Chinese) by a factor of anywhere from 5.9 to 13.3.
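For anyone who wants to check that arithmetic, here is a minimal sketch in Python. The figures are copied from the table above; only the rows that can rank 1st or 2nd in some column (plus Spanish, German, and Arabic for the rollout fractions) are included, and the names `ROWS`, `english_lead`, and `released_share` are illustrative, not part of any original tooling.

```python
# Columns: TLDs, TLDs (Unique), TLDs > 100, TLDs (Unique) > 100
COLUMNS = ("TLDs", "TLDs (Unique)", "TLDs > 100", "TLDs (Unique) > 100")

# Subset of the table; the omitted languages never reach 2nd place in any column.
ROWS = {
    "English": (530, 407, 390, 293),
    "French": (79, 12, 64, 10),
    "Chinese": (70, 69, 28, 22),
    "Spanish": (66, 9, 50, 7),
    "German": (39, 20, 32, 18),
    "Arabic": (36, 28, 3, 3),
}


def english_lead(rows, col):
    """Return (runner_up_language, ratio of the top count to the runner-up count)."""
    ranked = sorted(rows.items(), key=lambda item: item[1][col], reverse=True)
    (_, top), (runner_up, second) = ranked[0], ranked[1]
    return runner_up, top[col] / second[col]


def released_share(rows, lang):
    """Fraction of a language's nTLDs that have at least 100 registrations."""
    all_tlds, _, over_100, _ = rows[lang]
    return over_100 / all_tlds


for col, name in enumerate(COLUMNS):
    runner_up, ratio = english_lead(ROWS, col)
    print(f"{name}: English vs. {runner_up} = {ratio:.1f}x")

for lang in ("Arabic", "German"):
    print(f"{lang}: {released_share(ROWS, lang):.0%} past 100 registrations")
```

The low end of the range comes from the "TLDs (Unique)" column, where Chinese is the runner-up, and the high end from "TLDs (Unique) > 100", where Chinese is again 2nd but with far fewer released suffixes.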
Imperfect though my labeling is, this provides a rough idea of the role played by language within the nTLD program. In my next article, I’ll be delving into statistics based on this framework. So if you notice any languages that I’ve under-counted – i.e. nTLDs I’ve missed – please say so. This is a work in progress.
Eric Lyon says
This is an excellent write-up that clearly distinguishes the different target regions of new gTLDs. Many people fail to realize when first investing in new gTLDs that there are some speed bumps and brick walls to maneuver around in order to find the ideal end-users (unlike older gTLDs).
Joseph Peterson says
Thanks, Eric.
Definitely, the nTLDs are challenging to get a grip on – even for professionals – because there are so many of them. Not only that, they all rely on different pricing models. Some are restricted in various ways. They bump into one another a lot too. So it can get really complicated.
Even so, little by little, people are figuring out when and where to use the nTLDs. And there ARE good places to use many of them.
Kate says
New extensions cause unnecessary fragmentation of the Internet.
Not only are most of them niche, but they are usually tied to a language, most often English.
Doubly challenging.
This is market segmentation pushed to the extreme.
All the while most ccTLDs are still underutilized.
Not to mention redundant/overlapping TLDs, singular-plural variants etc.
Good luck.
168 says
Great info Joseph. Appreciate your efforts for the benefit of all.
It would be great if this was posted on NP for users to add to the perspective from a local point of view. For example, India has 22 languages: which is used most in business? Which English words are commonly used?
The same goes for the common business languages of Asia, Europe, South America, Africa, etc.
Cheers
Joseph Peterson says
@168,
That’s an interesting idea. NamePros could, theoretically at least, be a good tool for crowd-sourcing knowledge of all those languages. Domainers are a global community, and NamePros has very diverse membership.
168 says
Thanks,
Most definitely a good time to build a database.
solidairesdumonde says
Thank you for such a thorough assessment. I was wondering what Google’s stance is regarding those TLDs: whether some would be given particular weight on certain versions of Google (e.g. .fr ranks primarily in google.fr, .co.uk in the UK, etc.) or whether they’re all neutral like a .com or .net. While, as you pointed out, some are self-explanatory, you can’t help but wonder if Google will bother dealing with them on a case-by-case basis, for instance making sure that .BERLIN is for google.de, etc.
Your other study (https://domainnamewire.com/2017/01/31/analyzing-new-top-level-domain-registrations-country-part-3-geos/) seems to imply those geo domains rank better in their respective Google domains, but I was wondering if Google confirmed that they gave them particular weight.
Joseph Peterson says
@solidairesdumonde,
My assumption is that Google will eventually – if they don’t already – treat some GEO nTLDs in the same way as they treat most ccTLDs. That is to say, .BERLIN website rankings (like .DE website rankings) will be biased upward inside Germany and downward outside Germany.
Of course, what Google does is up to Google. But the ultimate goal is to show RELEVANT results. And .BERLIN sites are more relevant to people in Germany, for obvious reasons. So I’d be surprised if Google doesn’t catch up to this fact sooner or later.
“Your other study seems to imply those geo domains rank better”
No. To be clear, I didn’t discuss rankings in the sense of SERPs. I was only analyzing domain registration volume and ranking TLDs within each country based on that.
Joseph Peterson says
P.S. It may take Google longer to incorporate nTLD keyword language into its ranking algorithms.
In cases where language is clear-cut and geographically constrained, this ought to be considered as an input for SERP rankings. That’s only my opinion, of course; and Google doesn’t listen to me. But it seems obvious that a Chinese-language TLD ought to be favored inside China, and a Dutch keyword suffix ought to be less relevant outside the Netherlands.