A mysterious yet intriguing project from Russia has come across our inbox. It is a search-engine optimization analysis tool for Websites called TheRarestWords. For any given URL, like Microsoft’s or Techcrunch’s, it shows you the rarest keywords on the homepage (i.e., the ones most likely to give your site some search-engine juice), other sites with related keywords, and a list of categories the site would fit under based on those keywords. For Microsoft, some of the rare keywords it identifies are “silverlight,” “biztalk,” “onecare,” “skydrive,” “popfly,” “ballmer,” and “ozzie.” You can try your site by going to http://therarestwords.com/YOURSITE.com.

TheRarestWords then tries to tap into crowd intelligence by letting anyone add a 100-character definition for each keyword, which could give it a semantic edge in trying to categorize each site. The definitions could be gamed pretty easily, but this looks to be just a Web project at this point. It could also be used to create a Wiki dictionary like Lingoz or Wiktionary, but that does not seem to be the focus of the project.

The developer is a mysterious Russian who does not want to give out his name. You can find more info on his blog and on this forum post. Mircea Goia from MyTestBox dug into it for us and reports:

The author and sole founder, who is from Russia and wants to keep a low profile for now, says it is just a hobby that was started in December 2007, and he calls it a “linguistic experiment”. Their spider (called TheRarestParser/0.2a) started scouting the internet in May and extracted words from many websites. It looked at which ones are used most often on those websites and which ones are rarely used, or not at all. For now it extracts only the words from the first page of a domain. It doesn’t go deeper than that; however, the spider managed to index 20 million words from many domains.
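To make the idea concrete, here is a minimal sketch of how rare keywords could be picked out of a set of homepages. It is not the author’s code: the post only says the spider compares which words are used often and which rarely, so ranking by the number of pages a word appears on is our assumption, and the names `tokenize`, `rarest_words` and `PAGES`, along with the sample text, are made up for illustration.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def rarest_words(pages: Dict[str, str], site: str, top_n: int = 5) -> List[str]:
    """Return the words on `site` that appear on the fewest pages overall.

    Assumption: rarity is measured as document frequency (how many
    homepages contain the word). The real tool may weigh words differently.
    """
    doc_freq: Counter = Counter()
    for text in pages.values():
        # Count each word once per page, however often it repeats there.
        doc_freq.update(set(tokenize(text)))

    site_words = set(tokenize(pages[site]))
    # Rarest first; ties are broken alphabetically for a stable result.
    ranked = sorted(site_words, key=lambda w: (doc_freq[w], w))
    return ranked[:top_n]


# Hypothetical homepage text, for illustration only.
PAGES = {
    "microsoft.com": "Windows software Silverlight BizTalk download",
    "example.org": "software download news",
    "news.example": "news windows download",
}

print(rarest_words(PAGES, "microsoft.com", top_n=2))
# ['biztalk', 'silverlight']
```

Words that show up on only one homepage float to the top, which matches the kind of results the tool returns for Microsoft.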
The author wants to implement new options like:
* Trend spotting (which words are gaining popularity, like “django”; which are still strong, like “python”; and which are losing ground, like “perl”)
* Help with SEO for mom-and-pop kinds of business sites (it could be useful from this standpoint, the author says)
* Auto-categorization of your sites against a big list of categories (actually, at this time it has already been implemented, but …
Can a user-defined dictionary be done better than Wikipedia’s Wiktionary? Babylon, a maker of popular for-pay translation/dictionary desktop software, certainly thinks so, and they are launching Lingoz to prove it. Lingoz is a collaborative, online dictionary where users are encouraged to participate by contributing terms and definitions, as well as by voting, commenting and aggregating words into helpful glossaries.

Considered a modest Israeli success story, Babylon has been around since 1997 and has sold 1.6 million licenses in over 160 countries. As the company’s first pure Web play, Lingoz is being kicked off with a substantial base of 4.5M terms in 8 languages, leveraging the vast 9M-definition database Babylon has amassed over its 10 years of operation. An additional 42 languages will be rolled out in the coming months.

Back to Wiktionary for a moment. The editorial back-and-forth process that works so well for encyclopedic entries on Wikipedia seems less successful when applied to defining dictionary terms, a task better suited to voting on multiple versions of a definition. Cognizant of Wiktionary’s shortcomings, Lingoz is being launched with a sensible set of social/UGC features: Terms can be submitted or requested. Voting on content quality is performed with a simple thumbs-up/down. Users can also define brand-new glossaries themselves, or request ones to be created. Glossaries may prove quite sticky, as there is a virtually infinite number of potential themes that can be built out (think Web 2.0 terms, ’60s Hollywood actresses, etc., although a good starting point might be an actual definition for Web 2.0, which does not yet exist on the site).

The main competition Lingoz faces is from Answers.com, which is, ironically, another Israeli company. Answers.com doesn’t embrace UGC yet. If Lingoz can become the Wikipedia of online dictionaries, perhaps one day it will give Answers.com a run for its money.
That would especially be true if Lingoz could attract substantial Google traffic. As Google’s default “definition” provider, Answers.com is especially vulnerable to any changes in referrals from Google. (For instance, a recent Google search algorithm tweak reduced its traffic by 28%.) How do you define opportunity?