Only 50% Of Twitter Messages Are In English, Study Says

Paris-based Semiocast, which helps brands understand and interact with real-time Web services, has performed a semantic and quantitative study of Twitter based on an analysis of 2.8 million tweets.

Turns out roughly half the tweets posted on the micro-sharing service are in English, down from about two-thirds in the first half of 2009 (a relative drop of about 25%), even though the company is based in the U.S. and has more users and momentum in English-speaking countries than anywhere else on the planet. The analysis further showed that the top 5 languages used on Twitter are English, Japanese, Portuguese, Malay and Spanish.

Semiocast says the study was conducted on messages gathered over a period of 48 hours, from February 8 to February 10, with the sole aim of determining which languages were most often used on Twitter. The messages were processed with the company’s own analysis tools, which it says can identify the language of short messages in some 41 languages, including Greek, Hebrew, Chinese, Korean and Tamil.

English is still the most used language on Twitter, with 50% of messages, although Semiocast says this is a far cry from the two-thirds share it registered for English in the first half of 2009. Semiocast also forecasts that English’s share will grow thinner in the future, as Twitter becomes more internationalized (i.e. becomes available in more languages) and its pervasiveness spreads to Asia and Latin America.
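The gap between the two-thirds and 50% figures can be read two ways, and a quick calculation shows which one matches the 25% decline quoted for English. This is only a sketch: the variable names are made up for illustration, and it treats both shares as exact, which the study does not claim.

```python
from fractions import Fraction

# Shares reported by Semiocast, treated as exact for illustration only.
share_2009 = Fraction(2, 3)  # English share, first half of 2009
share_2010 = Fraction(1, 2)  # English share in the February sample

# Drop in percentage points (difference between the two shares).
point_drop = (share_2009 - share_2010) * 100

# Relative drop (difference as a fraction of the earlier share).
relative_drop = (share_2009 - share_2010) / share_2009 * 100

print(f"Percentage-point drop: {float(point_drop):.1f}")
print(f"Relative drop: {float(relative_drop):.1f}%")
```

In other words, the 25% figure lines up with the relative decline, while the share itself fell by roughly 17 percentage points.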

Japanese comes in second with 14% of messages. This isn’t all that surprising; Twitter has been addressing that market for almost two years now. The third most used language is Portuguese with 9% of all messages, mirroring the success of social networks in Brazil.

The rapid adoption of Twitter in Malaysia and Indonesia, where Twitter has partnerships with two mobile telcos in place, shows in the rankings as well. Malay, counting both Bahasa Malaysia and Bahasa Indonesia, is now the fourth most used language on Twitter, with 6% of messages. Spanish comes in fifth with 4% of all messages.

Ranks six through eight are occupied by major European languages, namely Italian, Dutch and German, each accounting for about 1% to 2% of total messages. French represents a little less than 1% of total messages.

For your reference: Twitter earlier this week claimed that it was seeing about 50 million tweets per day now, so an analysis of fewer than 3 million messages gathered over two days may not be super representative. Nevertheless, it only takes a peek at Twitter’s public timeline to see that there are lots of people using the service in a language other than English.
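To put the sample size in perspective, here is a rough back-of-the-envelope calculation. It assumes Twitter’s claimed 50 million tweets per day held steady across the 48-hour collection window, which neither Twitter nor Semiocast has confirmed; the variable names are purely illustrative.

```python
# Figures quoted in the article; the daily volume is Twitter's own claim.
sample_size = 2_800_000       # tweets analyzed by Semiocast
tweets_per_day = 50_000_000   # assumed constant over the sample window
days = 2                      # 48-hour collection period

# Estimated number of tweets posted while the sample was gathered.
estimated_total = tweets_per_day * days

# Fraction of all tweets in that window that the study covered.
coverage = sample_size / estimated_total

print(f"Estimated tweets in window: {estimated_total:,}")
print(f"Share covered by the study: {coverage:.1%}")
```

Under that assumption, the study looked at a bit under 3% of the tweets posted during those two days.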